Definition
Branched RAG is an advanced retrieval architecture where a single user query is processed through multiple parallel execution paths—each potentially employing different retrieval strategies, indices, or prompts—before being synthesized into a final answer. While it significantly improves recall and handles multi-faceted queries better than linear pipelines, it introduces higher latency and increased computational costs due to parallel LLM and retriever calls.
Unlike simple Multi-Query RAG which only varies the search string, Branched RAG varies the structural logic and data sources of the pipeline itself.
"A river delta that splits into multiple distinct channels to navigate complex terrain, eventually merging back into a single body of water at the ocean."
- Query Decomposition(Prerequisite)
- Ensemble Retrieval(Component)
- Routing(Component)
- Reciprocal Rank Fusion (RRF)(Component)
Conceptual Overview
Branched RAG is an advanced retrieval architecture where a single user query is processed through multiple parallel execution paths—each potentially employing different retrieval strategies, indices, or prompts—before being synthesized into a final answer. While it significantly improves recall and handles multi-faceted queries better than linear pipelines, it introduces higher latency and increased computational costs due to parallel LLM and retriever calls.
Disambiguation
Unlike simple Multi-Query RAG which only varies the search string, Branched RAG varies the structural logic and data sources of the pipeline itself.
Visual Analog
A river delta that splits into multiple distinct channels to navigate complex terrain, eventually merging back into a single body of water at the ocean.