Definition
The maximum number of tokens an LLM can process in a single request, encompassing the prompt, retrieved document chunks in RAG, and the agent's conversation history. It is the hard working-memory boundary that determines how much external knowledge the model can 'read' at once before information must be truncated or dropped.
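For a concrete sense of how this budget is enforced, the sketch below assembles one request and drops the oldest conversation turns once the limit is reached. The 8,192-token window, the 1,024-token output reserve, and the whitespace-based `count_tokens` helper are illustrative placeholders, not any particular model's API.

```python
# Minimal sketch of a context-window budget check for one request.
# The limits and the tokenizer below are stand-ins, not a specific model's values.

def count_tokens(text: str) -> int:
    # Crude whitespace approximation; a real system would use the model's tokenizer.
    return len(text.split())

CONTEXT_WINDOW = 8_192       # hard per-request limit, in tokens (example value)
RESERVED_FOR_OUTPUT = 1_024  # leave room for the model's reply

def fit_request(system_prompt: str, retrieved_chunks: list[str],
                history: list[str], user_message: str) -> list[str]:
    """Assemble a request, dropping the oldest history turns once the budget is exceeded."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT
    fixed = [system_prompt, *retrieved_chunks, user_message]
    used = sum(count_tokens(part) for part in fixed)

    kept_history: list[str] = []
    # Walk history from newest to oldest; stop when the window is full.
    for turn in reversed(history):
        cost = count_tokens(turn)
        if used + cost > budget:
            break  # older turns are truncated; this is the information that gets 'lost'
        kept_history.append(turn)
        used += cost

    kept_history.reverse()
    return [system_prompt, *kept_history, *retrieved_chunks, user_message]
```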
- Tokens (Prerequisite)
- Lost-in-the-Middle Phenomenon (Constraint)
- Chunking Strategy (Component; see the sketch after this list)
- Needle in a Haystack (NIAH) (Evaluation Metric)
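A rough sketch of the kind of chunking strategy referenced above: documents are split into overlapping, fixed-size pieces so each retrieved chunk can share the context window with the prompt, other chunks, and the conversation history. The 512-token chunk size, 64-token overlap, and whitespace tokenization are arbitrary assumptions for illustration.

```python
def chunk_document(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split a document into overlapping, fixed-size token chunks sized to fit
    comfortably inside a context-window budget alongside the rest of the request."""
    tokens = text.split()  # whitespace stand-in for real tokenization
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break
    return chunks
```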
Disambiguation
Not to be confused with the total size of a vector database: the context window is the model's 'RAM', while the vector store is its 'hard drive'.
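One way to see the distinction: the vector store can hold an arbitrarily large corpus, but each request only 'loads' the few top-ranked chunks that fit within the window budget. The sketch below assumes the chunks are already sorted by relevance and uses the same whitespace token approximation as above.

```python
# The vector database (the 'hard drive') can hold far more than one request can carry.
# Per request, only the best-ranked chunks that fit the token budget are 'loaded'
# into the context window (the 'RAM').

def pack_retrieved_chunks(ranked_chunks: list[str], budget_tokens: int) -> list[str]:
    """Keep the best-ranked chunks until the context-window budget is spent."""
    packed, used = [], 0
    for chunk in ranked_chunks:       # assumed already sorted by relevance
        cost = len(chunk.split())     # whitespace token approximation
        if used + cost > budget_tokens:
            continue                  # this knowledge stays on the 'hard drive' for now
        packed.append(chunk)
        used += cost
    return packed
```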
Visual Analog
A workbench with a fixed surface area where every tool and document must fit simultaneously to be used.