Definition
HyDE (Hypothetical Document Embeddings) is a retrieval-augmentation technique that uses an LLM to generate a 'fake' or hypothetical response to a user query, which is then embedded and used to search a vector database. By converting a short query into a document-like representation, it improves retrieval accuracy by matching document-to-document semantics, though it introduces a trade-off of increased latency and API costs for the initial generation step.
A retrieval orchestration pattern for query transformation, not a specific vector embedding model like ADA or BERT.
"A police sketch artist creating a 'best-guess' drawing of a suspect to match against a database of actual photographs."
- Dense Retrieval(Underlying Method)
- Query Expansion(Architectural Precursor)
- Vector Similarity Search(Component)
- Zero-Shot Learning(Operational Framework)
Conceptual Overview
HyDE (Hypothetical Document Embeddings) is a retrieval-augmentation technique that uses an LLM to generate a 'fake' or hypothetical response to a user query, which is then embedded and used to search a vector database. By converting a short query into a document-like representation, it improves retrieval accuracy by matching document-to-document semantics, though it introduces a trade-off of increased latency and API costs for the initial generation step.
Disambiguation
A retrieval orchestration pattern for query transformation, not a specific vector embedding model like ADA or BERT.
Visual Analog
A police sketch artist creating a 'best-guess' drawing of a suspect to match against a database of actual photographs.