Definition
The mathematical process of scaling embedding vectors to a unit length (magnitude of 1) so that retrieval is based purely on the angular direction of vectors. In RAG pipelines, this ensures that Cosine Similarity can be computed as a simple Dot Product, preventing long documents or high-frequency term clusters from disproportionately influencing retrieval scores.
Distinguish from 'Token Truncation' or 'Text Cleaning'; this refers to vector magnitude in latent space.
"A Compass Needle: regardless of how long the needle is physically, only the direction it points matters for navigation."
- Cosine Similarity(Prerequisite)
- L2 Norm(Component)
- Dot Product(Component)
- Inner Product Search(Component)
Conceptual Overview
The mathematical process of scaling embedding vectors to a unit length (magnitude of 1) so that retrieval is based purely on the angular direction of vectors. In RAG pipelines, this ensures that Cosine Similarity can be computed as a simple Dot Product, preventing long documents or high-frequency term clusters from disproportionately influencing retrieval scores.
Disambiguation
Distinguish from 'Token Truncation' or 'Text Cleaning'; this refers to vector magnitude in latent space.
Visual Analog
A Compass Needle: regardless of how long the needle is physically, only the direction it points matters for navigation.