RAG

Definition

Retrieval-Augmented Generation (RAG) is an architectural pattern that improves Large Language Model (LLM) output by retrieving relevant information from an authoritative external knowledge base and supplying it to the model as context before it generates a response. It offers a middle ground between the high cost of model fine-tuning and the hallucination risk of zero-shot inference, at the price of added system latency and orchestration complexity.
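The retrieve-then-generate flow in the definition can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the in-memory DOCUMENTS list stands in for a vector database, the bag-of-words embed function stands in for a learned embedding model, and generate is a hypothetical callable representing whatever LLM client is used. The sample documents are invented for the example.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base standing in for a vector database.
DOCUMENTS = [
    "The warranty period for the X100 router is 24 months.",
    "The X100 router supports Wi-Fi 6 and has four Ethernet ports.",
    "Refunds are processed within 5 business days of approval.",
]


def embed(text):
    """Toy embedding: term counts (an assumption; real systems use a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def answer(query, generate, documents=DOCUMENTS, k=2):
    """Retrieve first, then generate. `generate` maps a prompt string to text."""
    return generate(build_prompt(query, retrieve(query, documents, k)))
```

Calling answer(question, generate=my_llm_call) with a real client in place of the hypothetical my_llm_call completes the loop. Updating DOCUMENTS changes what the model sees on the next query without touching its weights.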

Disambiguation

RAG retrieves context dynamically at query time and leaves the model unchanged, whereas fine-tuning statically modifies the model's weights.

Visual Metaphor

"An open-book exam where the student (LLM) uses a searchable library (Vector Database) to find specific facts before writing an answer."

Key Tools

LangChain
LlamaIndex
Pinecone
Weaviate
FAISS
ChromaDB

Related Connections

Vector Database (Component)
Embeddings (Prerequisite)
Semantic Search (Component)
Context Window (Constraint)
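The context window is listed as a constraint because retrieved passages must fit in the prompt alongside the question. One simple way to respect that limit is to pack ranked passages greedily under a token budget. The sketch below is illustrative only: estimate_tokens uses a whitespace word count as a rough stand-in for a real tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(text):
    """Rough token estimate by word count (an assumption; real tokenizers differ)."""
    return len(text.split())


def pack_context(ranked_passages, max_tokens):
    """Greedily keep passages, in rank order, while they fit in the budget."""
    selected, used = [], 0
    for passage in ranked_passages:
        cost = estimate_tokens(passage)
        if used + cost > max_tokens:
            # Skip a passage that would overflow; a shorter, lower-ranked one may still fit.
            continue
        selected.append(passage)
        used += cost
    return selected, used
```

Skipping rather than stopping at the first overflow favors filling the budget. Stopping early would instead favor strict rank order; which is better depends on the application.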
