Definition
The ROC (Receiver Operating Characteristic) curve is a plot that shows the diagnostic ability of a binary classifier, such as a relevance scorer or a guardrail model in a RAG pipeline, by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) across a range of decision thresholds.
- AUC (Area Under the Curve) (Aggregate Metric)
- Decision Threshold (Prerequisite)
- Precision-Recall Curve (Alternative Evaluation Framework)
- Retrieval Recall (Component)
Conceptual Overview
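Each point on the curve corresponds to one decision threshold. The vertical axis shows the True Positive Rate, the fraction of genuinely positive items (for example, relevant chunks) that the classifier accepts; the horizontal axis shows the False Positive Rate, the fraction of negative items it wrongly accepts. Lowering the threshold moves the operating point up and to the right. A curve that hugs the top-left corner indicates a strong classifier, while the diagonal corresponds to random guessing.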
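As a minimal sketch of how such a curve could be built, the Python below sweeps every distinct score as a threshold and records one (FPR, TPR) point per threshold, then estimates AUC with the trapezoidal rule. The function names `roc_points` and `auc`, and the scores and labels, are hypothetical and chosen only for illustration; they do not come from any particular RAG library.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) pairs, one per distinct threshold, strictest first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    # A threshold above every score accepts nothing: FPR = TPR = 0.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # An item is accepted when its score is at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under consecutive (FPR, TPR) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical relevance scores for six retrieved chunks (1 = relevant).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
print(curve)
print(round(area, 4))
```

The sketch recomputes the counts for every threshold, which is quadratic in the number of items but keeps the logic easy to follow; a production evaluation would more likely rely on an established metrics library.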
Disambiguation
Here the term refers to how the sensitivity of a retrieval relevance scorer or classifier changes with its decision threshold; it is not a statement about statistical population distributions.
Visual Analog
A radio tuner's sensitivity dial: turning up the sensitivity captures more of the intended signal (relevant data) but also lets in more audible static (false positives).
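The dial analogy can be illustrated with two settings of a single threshold. In the sketch below, the helper `rates` and the guardrail scores are hypothetical; the point is only that loosening the threshold raises the True Positive Rate and the False Positive Rate together.

```python
def rates(scores, labels, threshold):
    """Return (TPR, FPR) when items scoring at or above threshold are accepted."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


# Hypothetical guardrail scores (1 = should be flagged).
scores = [0.95, 0.85, 0.65, 0.55, 0.35, 0.15]
labels = [1, 1, 0, 1, 0, 0]

strict = rates(scores, labels, threshold=0.8)  # dial turned down: less signal, no static
loose = rates(scores, labels, threshold=0.3)   # dial turned up: all signal, more static

print("strict (TPR, FPR):", strict)
print("loose  (TPR, FPR):", loose)
```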