ROUGE

ROUGE

ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a metric suite used to evaluate the quality of LLM-generated summaries or RAG responses by measuring n-gram overlap against a reference ground truth. While computationally efficient, it presents an architectural trade-off by prioritizing lexical recall over semantic meaning, meaning it can penalize accurate responses that use different phrasing than the reference.

Definition

Disambiguation

Refers to n-gram overlap metrics for text evaluation, not the color or cosmetic term.

Visual Metaphor

"A Translucent Stencil: Placing a reference template over a generated answer to see how much of the original 'shape' of the text shows through the gaps."

Conceptual Overview

Disambiguation

Refers to n-gram overlap metrics for text evaluation, not the color or cosmetic term.

Visual Analog

A Translucent Stencil: Placing a reference template over a generated answer to see how much of the original 'shape' of the text shows through the gaps.

Definition

Conceptual Overview

Disambiguation

Visual Analog

Related Articles