Definition
The process of querying a vector database to return the 'K' document chunks whose embeddings lie closest to the query embedding in a high-dimensional space, with proximity serving as a proxy for relevance. A higher 'K' gives the LLM more context but also admits less relevant chunks (noise) and raises computational cost, whereas a lower 'K' risks missing critical information needed for an accurate response.
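The retrieval step described above can be sketched as a brute-force search. This is a minimal illustration, not how any particular vector database implements it: it assumes cosine similarity as the proximity measure, non-zero vectors, and toy 3-dimensional embeddings, and the names `cosine_similarity` and `top_k_chunks` are hypothetical. Production systems typically use approximate nearest-neighbor indexes instead of scoring every chunk.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, k):
    """Return (chunk_index, similarity) pairs for the k chunks nearest the query.

    Brute force: every chunk is scored, then the best k are kept.
    If k exceeds the number of chunks, all chunks are returned.
    """
    scored = [
        (cosine_similarity(query_vec, vec), index)
        for index, vec in enumerate(chunk_vecs)
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(index, score) for score, index in scored[:k]]


# Toy embeddings (assumed values, for illustration only).
chunks = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.2, 0.0]

results = top_k_chunks(query, chunks, k=2)
for index, score in results:
    print(f"chunk {index}: similarity {score:.3f}")
```

Raising `k` in this sketch returns more chunks, but the later entries have progressively lower similarity scores, which is the noise side of the trade-off.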
- Cosine Similarity (Prerequisite)
- Vector Embedding (Component)
- Reranking (Post-processing Step)
- Context Window (Constraint)
Disambiguation
Refers to retrieving the 'K' nearest documents from a database, not to top-k token sampling during generation.
Visual Analog
A high-powered magnet pulling the 'K' closest metal shards out of a large pile of sand.