Definition
Input tokens are the discrete units of text (single characters up to sub-words) that a tokenizer produces and an LLM's transformer layers consume. In RAG pipelines, managing input tokens involves a direct trade-off: supplying more retrieved context can improve accuracy, but it increases latency and API cost as the prompt approaches the model's context window limit.
- Context Window (Constraint)
- Tokenizer (Prerequisite)
- Prompt Engineering (Optimization Method)
- Output Tokens (Complementary)
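The context-budget trade-off described in the definition can be sketched as a greedy packing step. This is a minimal illustration, not a production implementation: `count_tokens` here approximates tokens by whitespace splitting, whereas a real pipeline would use the model's own tokenizer (for example, tiktoken for OpenAI models), and the function names are hypothetical.

```python
def count_tokens(text: str) -> int:
    """Rough token estimate; swap in the model's real tokenizer."""
    return len(text.split())

def pack_context(question: str, docs: list[str],
                 context_limit: int, reserved_output: int) -> list[str]:
    """Greedily add retrieved docs until the prompt would exceed the
    context window minus the tokens reserved for the model's output."""
    budget = context_limit - reserved_output - count_tokens(question)
    chosen, used = [], 0
    for doc in docs:  # docs assumed sorted by retrieval relevance
        cost = count_tokens(doc)
        if used + cost > budget:
            break  # stop before overflowing the context window
        chosen.append(doc)
        used += cost
    return chosen
```

Because retrieval usually returns documents in relevance order, stopping at the first document that would overflow the budget keeps the highest-ranked context while leaving room for the generated output tokens.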
Disambiguation
Refers to the tokenized representation of the prompt and retrieved documents, not the raw character count or the resulting generated text.
Visual Analog
Individual Scrabble tiles being fed into a conveyor belt for a machine to read and analyze.