Definition
The architectural process of partitioning a large vector index into shards distributed across a cluster of nodes, enabling horizontal scaling and parallel similarity search. A coordinator node broadcasts each query to all shards, collects each shard's local top-K results, and merges and re-ranks them into a global top-K result set.
Concerns sharding the index data and executing the search concurrently across shards; it does not refer to simple load balancing of requests or to distributed model training.
"A city-wide scavenger hunt where a coordinator sends teams to different neighborhoods simultaneously, then aggregates the best items found in each area into a final collection."
- Vector Sharding (Prerequisite)
- Horizontal Scaling (Architectural Goal)
- Query Latency (Performance Trade-off)
- Re-ranking (Downstream Component)
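The scatter-gather flow described in the definition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the shards are in-memory dictionaries on a single machine, threads stand in for remote nodes, each shard does a brute-force cosine-similarity scan, and the names `search_shard` and `distributed_search` are hypothetical.

```python
import heapq
import math
from concurrent.futures import ThreadPoolExecutor


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_shard(shard, query, k):
    """Local search on one shard: return its top-K hits as (score, vector_id) pairs.

    Assumption: a shard is a dict of vector_id -> vector, scanned exhaustively.
    A real shard would typically use an approximate nearest-neighbor index.
    """
    scored = ((cosine_similarity(vec, query), vec_id) for vec_id, vec in shard.items())
    return heapq.nlargest(k, scored)


def distributed_search(shards, query, k):
    """Coordinator: broadcast the query, gather local top-K lists, merge to a global top-K."""
    # Scatter: every shard is searched concurrently (threads stand in for nodes).
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        local_results = list(pool.map(lambda s: search_shard(s, query, k), shards))
    # Gather and merge: re-rank the union of local hits by score.
    merged = (hit for hits in local_results for hit in hits)
    return heapq.nlargest(k, merged)


# Toy data: three shards holding five 2-D vectors in total.
shards = [
    {"a": [1.0, 0.0], "b": [0.0, 1.0]},
    {"c": [0.9, 0.1], "d": [-1.0, 0.0]},
    {"e": [0.7, 0.7]},
]
top = distributed_search(shards, query=[1.0, 0.0], k=2)
print([vec_id for _, vec_id in top])
```

Because every shard returns its own K best hits, the merged list contains the true global top-K whenever the per-shard search is exact. With approximate per-shard indexes the merged result is only as accurate as the local searches, and the coordinator's response time is bounded by the slowest shard, which is the query-latency trade-off listed above.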