Definition
The inference-time sequence of operations where an LLM consumes a prompt, retrieved context, and conversation history to synthesize a grounded response. It encompasses prompt template application, model invocation, and the integration of output guardrails or formatting logic.
Disambiguation
Focuses on the response-synthesis phase, not the data-ingestion or vector-search indexing phases.
Visual Analog
A professional chef (the LLM) assembling a meal using a specific recipe (the prompt) and fresh ingredients delivered from a pantry (the retrieved context).
Related Concepts
- Prompt Template (Component)
- Context Window (Prerequisite)
- Retrieval-Augmented Generation (RAG) (Parent Framework)
- Output Guardrails (Component)
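The three stages named in the definition (prompt template application, model invocation, output guardrails) can be sketched as a short pipeline. This is a minimal illustration, not any particular library's API: `call_model` is a stand-in for a real LLM call, and every name here is hypothetical.

```python
# Illustrative sketch of the generation pipeline's three stages.
# `call_model` is a stand-in for a real LLM API; all names are hypothetical.

PROMPT_TEMPLATE = (
    "Answer using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Conversation so far:\n{history}\n\n"
    "Question: {question}\nAnswer:"
)

def call_model(prompt: str) -> str:
    """Stand-in for the model-invocation step (no real LLM behind it)."""
    return f"  Grounded answer (built from a {len(prompt)}-char prompt).  "

def apply_guardrails(text: str) -> str:
    """Example output guardrail: strip whitespace and cap the length."""
    return text.strip()[:500]

def generate(question: str, retrieved_context: str, history: list[str]) -> str:
    # 1. Prompt template application: merge query, retrieved context, history.
    prompt = PROMPT_TEMPLATE.format(
        context=retrieved_context,
        history="\n".join(history) or "(none)",
        question=question,
    )
    # 2. Model invocation.
    raw = call_model(prompt)
    # 3. Output guardrails / formatting before the response leaves the pipeline.
    return apply_guardrails(raw)
```

A caller would invoke it as `generate("What is RAG?", "RAG grounds answers in retrieved documents.", [])`; note that retrieval itself happens upstream and is passed in, matching the disambiguation above.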