Retrieval-Augmented Generation (RAG)

An architectural pattern that improves Large Language Model outputs by retrieving relevant document snippets from an external knowledge base and adding them to the prompt before the generation phase. It trades higher inference latency and added retrieval-logic complexity for fewer hallucinations and the ability to draw on real-time or private data without retraining.
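The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the `DOCUMENTS` list, the bag-of-words `embed` function, and the stubbed `generate` call are all hypothetical stand-ins for a real knowledge base, embedding model, and LLM.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base standing in for a vector database.
DOCUMENTS = [
    "The support desk is open from 9am to 5pm on weekdays.",
    "Refunds are processed within 14 days of the return request.",
    "The API rate limit is 100 requests per minute per key.",
]


def embed(text):
    """Toy embedding (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Retrieval step: return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, snippets):
    """Augmentation step: place the retrieved snippets in front of the question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def generate(prompt):
    """Generation step, stubbed: a real system would send the prompt to an LLM."""
    return f"[model output for a prompt of {len(prompt)} characters]"


def answer(query, documents=DOCUMENTS, k=1):
    """Full pipeline: retrieve, augment the prompt, then generate."""
    snippets = retrieve(query, documents, k)
    return generate(build_prompt(query, snippets))
```

Calling `answer("What is the API rate limit?")` retrieves the rate-limit snippet and passes it to the stubbed generator inside the prompt. The model's weights are never touched; only the prompt changes, which is why new or private data can be used without retraining.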

Definition

Disambiguation

RAG is an 'open-book' process: it supplies external text at inference time and leaves the model's weights unchanged. Fine-tuning is 'studying': it updates the model's internal weights during training.

Visual Metaphor

An open-book exam where a student (LLM) researches specific facts in a textbook (Vector Database) to answer a question instead of relying on memory.
