Definition
A query preprocessing technique that augments a user’s query with semantically related terms to increase retrieval recall by bridging vocabulary gaps between the query and the indexed document corpus. Trade-off: expansion improves the chance of finding relevant documents (recall), but if the added synonyms are too broad it introduces noise or 'semantic drift', which lowers precision and fills the context window with irrelevant passages.
- Query Rewriting (parent process)
- Recall (primary optimization metric)
- BM25 (algorithm commonly paired with expansion for hybrid search)
- Semantic Drift (potential failure mode)
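The definition and trade-off can be illustrated with a minimal sketch. It assumes a small hand-written synonym table and a toy term-overlap matcher standing in for a real lexical scorer such as BM25; the names `SYNONYMS`, `expand_query`, `retrieve`, and `CORPUS` are hypothetical and exist only for this example.

```python
# Hypothetical synonym table (an assumption for illustration). A real system
# might draw expansions from a thesaurus, query logs, or a language model.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "repair": ["fix", "maintenance"],
}


def expand_query(query, synonyms=SYNONYMS, max_per_term=2):
    """Return the original query terms followed by up to max_per_term synonyms each."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for syn in synonyms.get(term, [])[:max_per_term]:
            if syn not in expanded:
                expanded.append(syn)
    return expanded


def retrieve(terms, corpus):
    """Toy lexical match: indices of documents sharing at least one term with the query."""
    term_set = set(terms)
    return [i for i, doc in enumerate(corpus) if term_set & set(doc.lower().split())]


CORPUS = [
    "automobile maintenance schedule",  # relevant, but shares no words with "car repair"
    "car repair manual",                # relevant, matches the original query
    "vehicle registration fees",        # not relevant, matched only via a broad synonym
    "bicycle tire pressure guide",      # not relevant, never matched
]

baseline_hits = retrieve("car repair".split(), CORPUS)
expanded_hits = retrieve(expand_query("car repair"), CORPUS)

print("baseline:", baseline_hits)
print("expanded:", expanded_hits)
```

In this toy corpus the unexpanded query misses the "automobile maintenance" document, while the expanded query finds it (higher recall) but also pulls in "vehicle registration fees" (lower precision). Lowering `max_per_term` or pruning broad terms such as "vehicle" is one way to limit that drift.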
Disambiguation
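Query expansion is explicit, text-based augmentation of the query itself, as opposed to the implicit matching of related meanings that vector-based similarity search performs.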
Visual Analog
A flashlight with a wide-angle lens attachment that illuminates peripheral objects related to the central focus.