SmartFAQs.ai
Concept

Output Tokens

Definition

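Output tokens are the discrete units of text (whole words, subwords, or single characters, each mapped to an integer ID) that a Large Language Model (LLM) generates during inference. Because the model emits them one at a time and providers commonly bill them separately from prompt tokens, often at a higher rate, the output token count is a major driver of API cost and of 'Time to Last Token' latency in RAG pipelines.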
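
The link between output length and 'Time to Last Token' can be sketched with a simple linear model. This is only an illustration: the function name, the steady decode rate, and the figures used (0.5 s to first token, 50 tokens per second) are assumptions, not measurements from any particular model or provider.

```python
def time_to_last_token(output_tokens: int, ttft_s: float, tokens_per_s: float) -> float:
    """Estimate the seconds until the final output token arrives.

    Assumed model: the first token arrives after ttft_s seconds, and every
    remaining token arrives at a constant decode rate of tokens_per_s.
    """
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + (output_tokens - 1) / tokens_per_s


# Hypothetical figures: compare a short answer with a long one.
for n in (51, 501):
    seconds = time_to_last_token(n, ttft_s=0.5, tokens_per_s=50.0)
    print(f"{n} output tokens -> about {seconds:.1f} s to last token")
```

Under these assumptions the latency grows linearly with the number of output tokens, which is why capping response length is a common way to bound response time.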

Disambiguation

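Distinct from 'Input Tokens', which make up the prompt (in a RAG pipeline, the user query plus the retrieved context). Output tokens are the generated response, and their count directly affects how fast the answer feels to the user.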
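
To make the input/output split concrete, here is a minimal cost sketch. The `Pricing` class, the `request_cost` helper, and both per-million-token prices are hypothetical values chosen for illustration; real rates vary by provider and model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical prices in dollars per one million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing):
    """Return (input_cost, output_cost, total_cost) in dollars for one request."""
    input_cost = input_tokens * pricing.input_per_million / 1_000_000
    output_cost = output_tokens * pricing.output_per_million / 1_000_000
    return input_cost, output_cost, input_cost + output_cost


# Assumed prices: output tokens billed at four times the input rate.
ASSUMED_PRICING = Pricing(input_per_million=1.0, output_per_million=4.0)

# A RAG-style request: a long prompt (query plus context) and a shorter answer.
in_cost, out_cost, total = request_cost(2_000, 500, ASSUMED_PRICING)
print(f"input: ${in_cost:.4f}  output: ${out_cost:.4f}  total: ${total:.4f}")
```

With these assumed prices, 500 output tokens cost as much as the 2,000 input tokens, even though the prompt is four times longer.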

Visual Metaphor

"A ticker tape machine printing a message character-by-character; the longer the tape, the more time and paper consumed."

Key Tools
tiktoken, SentencePiece, vLLM, OpenAI API, Hugging Face Tokenizers