Definition
F1@K is a retrieval metric: the harmonic mean of Precision@K (the fraction of the top K results that are relevant) and Recall@K (the fraction of all relevant documents that appear in the top K). Because the harmonic mean is pulled toward the smaller of the two values, F1@K is high only when both are high. In RAG pipelines, it is used to tune the retrieval window size K, trading the cost of LLM context window usage against the risk of missing pertinent information.
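As a minimal sketch of the definition, F1@K for a single query can be computed as 2 · P@K · R@K / (P@K + R@K). The function name `f1_at_k`, its parameters, and the toy document IDs below are hypothetical and chosen only for illustration. The sketch assumes Precision@K divides by K even when fewer than K results are returned, and it scores a query as 0 when there are no relevant documents or no hits; other conventions exist.

```python
from typing import Hashable, Sequence, Set


def f1_at_k(ranked_ids: Sequence[Hashable], relevant_ids: Set[Hashable], k: int) -> float:
    """F1@K for one query: harmonic mean of Precision@K and Recall@K."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if not relevant_ids:
        return 0.0  # assumption: no relevant documents means a score of 0

    # Count distinct relevant documents among the top K results.
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    if hits == 0:
        return 0.0  # avoids 0/0 in the harmonic mean

    precision = hits / k                 # assumption: denominator is K itself
    recall = hits / len(relevant_ids)
    return 2 * precision * recall / (precision + recall)


# Toy data (illustrative only): one ranked result list and its ground truth.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}

# Sweep K to see where precision and recall balance best for this query.
scores = {k: f1_at_k(ranked, relevant, k) for k in range(1, len(ranked) + 1)}
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"F1@{k} = {score:.3f}")
print("best K:", best_k)
```

On this toy query the score peaks at K = 4: going to K = 5 adds an irrelevant document, which lowers precision without improving recall. In practice the sweep would be averaged over many queries before choosing a window size.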
Disambiguation
F1@K measures retrieval quality at a specific rank threshold K; it does not measure the linguistic accuracy of the generated answer.
Visual Analog
A gold miner’s sieve: it measures both the percentage of the stream's gold flakes that were captured (recall) and the purity of the contents (precision) for that specific bucket size.