Definition
BM25 parameters ($k_1$ and $b$) are hyperparameters that control term frequency saturation and document length normalization in the BM25 probabilistic ranking function. In RAG pipelines they tune sparse retrieval to balance keyword density against chunk length. A higher $k_1$ lets repeated terms keep adding to the score for longer before it saturates, while a higher $b$ (between 0 and 1) penalizes chunks longer than the corpus average more strongly, favoring concise, relevant snippets. Common starting points are $k_1$ between 1.2 and 2.0 and $b = 0.75$.
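For reference, a commonly cited form of the scoring function shows where each parameter enters. The symbols $f(q_i, D)$ (frequency of query term $q_i$ in chunk $D$), $|D|$ (chunk length in tokens), and $\text{avgdl}$ (average chunk length in the corpus) are standard notation assumed here, not taken from this entry, and individual search engines may use slightly different variants:

$$
\text{score}(D, Q) = \sum_{q_i \in Q} \text{IDF}(q_i) \cdot \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}
$$

With $b = 0$ the length term reduces to 1 and chunk length is ignored; with $b = 1$ the term frequency is fully normalized by relative length. As $f(q_i, D)$ grows, the fraction approaches $k_1 + 1$, which is the saturation ceiling.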
- Sparse Retrieval (Prerequisite)
- Term Frequency-Inverse Document Frequency (TF-IDF) (Predecessor)
- Hybrid Search (Component)
- Reciprocal Rank Fusion (RRF) (Related)
Disambiguation
Refers to the tunable parameters of the Okapi BM25 ('Best Matching 25') ranking function, not to a 'Top 25' set of retrieved results.
Visual Analog
A 'Saturation Valve' ($k_1$) that limits how much extra credit a repeated word gets, and a 'Length Scale' ($b$) that weighs a heavy book against a light postcard.
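The valve and the scale can be checked numerically. The following is a minimal sketch, assuming the standard BM25 term-frequency component with the IDF factor left out; the function name `bm25_tf_component` and the example lengths are hypothetical and chosen only for illustration, not taken from any particular library.

```python
def bm25_tf_component(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Term-frequency part of a BM25 score for one term (IDF omitted).

    Hypothetical helper for illustration; real engines add an IDF factor
    and may use slightly different variants of this formula.
    """
    if tf <= 0:
        return 0.0
    # b controls how strongly length relative to the average is penalized.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # k1 controls how quickly repeated occurrences stop adding credit;
    # the result can never exceed k1 + 1.
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


if __name__ == "__main__":
    # Saturation valve: same average-length chunk, growing term counts.
    for k1 in (0.5, 1.2, 2.0):
        weights = [bm25_tf_component(tf, 100, 100, k1=k1) for tf in (1, 2, 5, 20)]
        print(f"k1={k1}: " + ", ".join(f"{w:.3f}" for w in weights))

    # Length scale: same term count, short chunk versus long chunk.
    for b in (0.0, 0.75, 1.0):
        short = bm25_tf_component(3, 50, 100, b=b)
        long_ = bm25_tf_component(3, 500, 100, b=b)
        print(f"b={b}: short={short:.3f}, long={long_:.3f}")
```

With `b=0.0` the short and long chunks score the same, which is the 'Length Scale' switched off. Raising `k1` lifts the ceiling that repeated terms can reach, which is the 'Saturation Valve' opened wider.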