Conceptual Overview
Multi-Query RAG is a retrieval optimization technique in which an LLM generates several rephrasings of the user's query and a vector search is run for each one, usually concurrently. The per-query results are then typically merged and deduplicated before being passed to the generator. Because embedding similarity is sensitive to wording, a single query can miss documents that are semantically relevant but phrased differently from the original prompt; searching with multiple variants improves recall.
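The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the corpus, the bag-of-words `embed` function (standing in for a real embedding model), and the `generate_variants` stub (standing in for an LLM call) are all hypothetical and exist only to make the example self-contained.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import math

# Toy corpus (assumption); a real system would store embedded chunks in a vector store.
DOCS = {
    "d1": "reset your password from the account settings page",
    "d2": "recover login credentials using the email link",
    "d3": "update billing details and payment method",
}

def embed(text):
    # Stand-in for an embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, k=1):
    # Rank documents by similarity to one query; drop zero-similarity hits.
    q = embed(query)
    scored = [(cosine(q, embed(text)), doc_id) for doc_id, text in DOCS.items()]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

def generate_variants(query):
    # Hypothetical stub: a real pipeline would prompt an LLM for rephrasings.
    return [query, "recover login credentials", "reset account password"]

def multi_query_retrieve(query, k=1):
    variants = generate_variants(query)
    # Run one search per variant concurrently; map() preserves variant order.
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda v: search(v, k), variants))
    # Merge and deduplicate, keeping first-seen order.
    merged = []
    for results in result_lists:
        for doc_id in results:
            if doc_id not in merged:
                merged.append(doc_id)
    return merged
```

With this toy data, `search("I forgot my password")` finds only `d1`, because `d2` shares no words with the query, while `multi_query_retrieve("I forgot my password")` also surfaces `d2` through the rephrased variant. Merging here is a simple ordered union; a production system might instead re-rank the combined results.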
Disambiguation
Multi-Query RAG is distinct from Multi-Modal RAG, which handles different data types, and from Multi-Hop RAG, which performs iterative, sequential lookups.
Visual Analog
A fisherman casting five different nets into a lake simultaneously to increase the chances of catching a specific school of fish, rather than relying on a single cast.