Definition
A metric used in RAG evaluation that quantifies the lexical overlap between an LLM-generated response and a human-provided ground truth by computing n-gram precision: the fraction of n-grams in the response that also appear in the ground truth. It is cheap to compute, but it measures surface-level similarity, not semantic accuracy or reasoning quality.
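As a hedged illustration of the n-gram precision mentioned in the definition, one common formulation is the clipped precision used by BLEU-style metrics. The entry does not say which variant it has in mind, so the clipping step and all symbols below are assumptions chosen for illustration:

```latex
p_n = \frac{\sum_{g \in G_n(c)} \min\bigl(\mathrm{count}_c(g),\ \mathrm{count}_r(g)\bigr)}
           {\sum_{g \in G_n(c)} \mathrm{count}_c(g)}
```

Here c is the generated response, r is the ground truth, G_n(c) is the set of distinct n-grams in c, and count_x(g) is the number of times the n-gram g occurs in x. The min term caps the credit for a repeated n-gram at the number of times it appears in the ground truth.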
- N-gram (Prerequisite)
- Ground Truth (Component)
- ROUGE Score (Alternative)
- BERTScore (Semantic Alternative)
Disambiguation
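Measures exact word-sequence matches, not semantic meaning or factual correctness: a correct paraphrase can score low, while a factually wrong answer that reuses the wording of the ground truth can score high.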
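The gap between lexical overlap and correctness can be seen in a minimal sketch. It assumes whitespace tokenization, lowercasing, and BLEU-style clipped counts, none of which are specified by this entry. The function names and example sentences are hypothetical and are not taken from any particular library.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous n-token window as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Assumption: counts are clipped, so an n-gram repeated in the
    candidate earns credit at most as often as it appears in the
    reference.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        # Candidate is too short to contain any n-gram of this size.
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


ground_truth = "Paris is the capital of France"
correct_paraphrase = "France's capital city is Paris"
wrong_answer = "Lyon is the capital of France"

# The correct paraphrase shares no bigram with the ground truth.
print(ngram_precision(correct_paraphrase, ground_truth, n=2))  # 0.0
# The wrong answer reuses four of its five bigrams.
print(ngram_precision(wrong_answer, ground_truth, n=2))  # 0.8
```

In this toy case the factually wrong answer outscores the correct paraphrase, which is why semantic alternatives such as BERTScore are listed alongside this metric.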
Visual Analog
A Stencil Overlay: checking how many words in the generated text perfectly align with the cutouts of a reference template.