SmartFAQs.ai

Input Tokens


Definition

Input tokens are the discrete units of text, typically sub-word pieces produced by a tokenizer, that an LLM's transformer layers process when computing attention over the prompt. In RAG pipelines, managing input tokens involves a direct trade-off: including more retrieved context can improve answer accuracy, but it increases latency and API cost as the prompt size approaches the model's context window limit.
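The trade-off above can be sketched as a simple token-budgeting step: pack retrieved chunks into the prompt until the context window (minus headroom reserved for the model's output) is spent. This is a minimal sketch; the ~4-characters-per-token heuristic is a rough English-text approximation, and a real pipeline would count tokens with the model's actual tokenizer (e.g. tiktoken). The function names and parameter values are illustrative, not from any specific library.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Swap in the model's real tokenizer (e.g. tiktoken) in production.
    return max(1, len(text) // 4)

def fit_context(chunks, system_prompt, question,
                context_window=8192, reserve_for_output=1024):
    # Pack retrieved chunks into the prompt in order until the token
    # budget is exhausted, leaving headroom for the generated output.
    budget = context_window - reserve_for_output
    used = estimate_tokens(system_prompt) + estimate_tokens(question)
    selected = []
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # next chunk would overflow the context window
        selected.append(chunk)
        used += cost
    return selected, used
```

Dropping lower-ranked chunks first (as the in-order loop does) is one common policy; alternatives include summarizing overflow chunks or re-ranking before packing.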

Disambiguation

Refers to the tokenized representation of the prompt and retrieved documents, not the raw character count and not the model's generated output (output tokens).

Visual Metaphor

"Individual Scrabble tiles fed onto a conveyor belt for a machine to read and analyze."

Key Tools
- tiktoken
- Hugging Face Tokenizers
- LangChain (TokenTextSplitter)
- SentencePiece