Definition
A lexical retrieval technique in RAG pipelines that indexes documents by their literal terms and retrieves them by exact keyword match, ranked with term statistics (typically BM25). It offers high precision for specific entities, acronyms, and identifiers that semantic vector models may fail to capture.
Conceptual Overview
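Lexical retrieval builds an index over the literal terms in each document and scores candidates by how well those terms match the query. BM25, the usual ranking function, weights each match by the term's frequency in the document, its rarity across the corpus (inverse document frequency), and the document's length. Because matching is literal, the method is precise for specific entities, acronyms, and identifiers, but it lacks semantic nuance: a synonym or paraphrase of a query term does not match. For that reason it is usually paired with a vector retriever in hybrid search, where it covers 'out-of-vocabulary' terms and niche technical jargon that an embedding model represents poorly.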
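As a rough illustration of the scoring described above, the following is a minimal BM25 sketch in plain Python. The function names (tokenize, bm25_scores), the sample documents, the identifier ERR-4012, and the parameter defaults k1=1.5 and b=0.75 are all assumptions made for this example; production systems normally rely on a search engine or library rather than hand-written scoring.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep hyphens/underscores inside tokens,
    so identifiers such as 'ERR-4012' survive as a single term."""
    return re.findall(r"[a-z0-9][a-z0-9_\-]*", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document (hypothetical helper)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue  # literal matching: absent terms contribute nothing
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b controls length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    "Error code ERR-4012 indicates a failed TLS handshake.",
    "Connection problems are often caused by certificate issues.",
    "Restart the service after rotating credentials.",
]
scores = bm25_scores("ERR-4012", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Only the first document contains the literal identifier, so it is the only one with a nonzero score. The second document is topically related, yet it scores zero because it shares no term with the query, which is the gap a vector retriever is meant to fill in hybrid search.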
Disambiguation
Keyword-based literal matching (lexical retrieval) vs. meaning-based semantic matching (vector retrieval).
Visual Analog
The index at the back of a thick textbook that lists every exact page where a specific name or technical term appears.
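The textbook analogy corresponds closely to an inverted index, the data structure that typically backs lexical retrieval. The sketch below is an assumption-laden toy: build_inverted_index and the sample pages are hypothetical names and data chosen for illustration, and a real index would also store term frequencies and positions.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it,
    like a book index mapping a term to its page numbers."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


pages = [
    "BM25 ranks documents by term statistics.",
    "Dense retrievers embed queries and documents.",
    "Hybrid search combines BM25 with dense retrieval.",
]
index = build_inverted_index(pages)

lookup = index.get("bm25", [])        # [0, 2]
missing = index.get("semantic", [])   # [] because the word never appears
```

A lookup is a direct dictionary access on the exact term, which is why this kind of retrieval is fast and precise for names and identifiers but returns nothing for a word that does not literally occur in the corpus.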