Definition
Product Quantization (PQ) is a lossy compression technique that splits each high-dimensional vector into several lower-dimensional subvectors, so that the original space is treated as a Cartesian product of subspaces. Each subspace has its own codebook of centroids, and every subvector is replaced by the index of its nearest centroid, which enables memory-efficient approximate nearest neighbor (ANN) search. In RAG pipelines, this lets an index of millions of embeddings fit in RAM, trading some retrieval precision for a much smaller memory footprint and lower search latency.
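The split, quantize, and reconstruct steps in this definition can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a production index: the function names (train_codebooks, encode, decode), the toy sizes (32 dimensions, 4 subspaces, 16 centroids), and the plain Lloyd-style k-means loop are all hypothetical choices made for this example.

```python
import numpy as np


def train_codebooks(X, m, k, n_iter=10, seed=0):
    """Fit one k-means codebook per subspace (plain Lloyd iterations)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    assert d % m == 0, "dimension must be divisible by the number of subspaces"
    ds = d // m  # dimensionality of each subvector
    codebooks = np.empty((m, k, ds), dtype=X.dtype)
    for j in range(m):
        sub = X[:, j * ds:(j + 1) * ds]
        # Initialise centroids from k distinct training subvectors.
        centroids = sub[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(n_iter):
            # Squared distances from every subvector to every centroid: (n, k).
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
            assign = dists.argmin(axis=1)
            for c in range(k):
                members = sub[assign == c]
                if len(members):  # an empty cluster keeps its old centroid
                    centroids[c] = members.mean(axis=0)
        codebooks[j] = centroids
    return codebooks


def encode(X, codebooks):
    """Replace each subvector with the index of its nearest centroid."""
    m, k, ds = codebooks.shape
    codes = np.empty((X.shape[0], m), dtype=np.uint8)  # valid while k <= 256
    for j in range(m):
        sub = X[:, j * ds:(j + 1) * ds]
        dists = ((sub[:, None, :] - codebooks[j][None, :, :]) ** 2).sum(axis=-1)
        codes[:, j] = dists.argmin(axis=1)
    return codes


def decode(codes, codebooks):
    """Rebuild approximate vectors by concatenating the selected centroids."""
    m = codebooks.shape[0]
    return np.hstack([codebooks[j][codes[:, j]] for j in range(m)])


# Toy data standing in for embeddings (assumed sizes, random values).
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 32)).astype(np.float32)

codebooks = train_codebooks(X, m=4, k=16)
codes = encode(X, codebooks)
X_hat = decode(codes, codebooks)
print(codes.shape, codes.dtype, X_hat.shape)
```

Each stored vector shrinks from 32 float32 values to 4 one-byte codes here, and decode shows where the loss comes from: the reconstruction is only ever a concatenation of centroids. Libraries built for this purpose typically compute distances directly against the codes instead of decoding first.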
Related Concepts
- Vector Embedding (Prerequisite)
- Approximate Nearest Neighbor (ANN) (Context)
- Centroids (Component)
- Inverted File Index (IVF) (Complementary Technique)
Disambiguation
Here "product" refers to the Cartesian product of subspaces used to decompose vectors for compact storage, not to retail product management.
Visual Analog
Breaking a complex 1,000-piece puzzle into 10 smaller sections and replacing each section with its closest match from a standard catalog of 256 pre-printed templates.
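The numbers in the analogy translate directly into a storage estimate. The sketch below assumes, purely for illustration, that each puzzle piece corresponds to one float32 dimension of a 1,000-dimensional vector; the helper name pq_code_bytes is hypothetical.

```python
import math


def pq_code_bytes(num_sections, templates_per_section):
    """Bytes needed to store one template index per section."""
    bits_per_section = math.ceil(math.log2(templates_per_section))
    return math.ceil(num_sections * bits_per_section / 8)


raw_bytes = 1000 * 4                 # assumed: 1,000 float32 values per vector
code_bytes = pq_code_bytes(10, 256)  # 10 sections, 256 templates each
print(raw_bytes, code_bytes, raw_bytes / code_bytes)
# prints: 4000 10 400.0
```

A catalog of 256 templates needs exactly 8 bits per section, which is why 256 is a common codebook size: each section fits in one byte. The estimate covers only the per-vector codes; the catalogs themselves are stored once and shared by every vector.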