Definition
Semantic similarity is a metric used in RAG pipelines to quantify the conceptual relationship between two text segments by calculating the distance between their vector embeddings in a high-dimensional latent space. It allows AI agents to retrieve information based on intent and meaning rather than exact keyword overlap, though it faces trade-offs where high similarity does not always guarantee factual relevance or logical alignment.
Distinct from lexical similarity (keyword matching), which only looks for identical characters.
"A multi-dimensional star map where stars (data points) located in the same constellation represent similar concepts, regardless of their 'names'."
- Vector Embeddings(Prerequisite)
- Cosine Similarity(Mathematical Mechanism)
- Semantic Search(Application Case)
- Bi-Encoders(Architectural Component)
Conceptual Overview
Semantic similarity is a metric used in RAG pipelines to quantify the conceptual relationship between two text segments by calculating the distance between their vector embeddings in a high-dimensional latent space. It allows AI agents to retrieve information based on intent and meaning rather than exact keyword overlap, though it faces trade-offs where high similarity does not always guarantee factual relevance or logical alignment.
Disambiguation
Distinct from lexical similarity (keyword matching), which only looks for identical characters.
Visual Analog
A multi-dimensional star map where stars (data points) located in the same constellation represent similar concepts, regardless of their 'names'.