be right back · interactive explainer

BM25, one knob at a time

BM25 scores how relevant a document is to a query, one term at a time. Each term's contribution combines three ideas: rarity (IDF), term frequency with diminishing returns, and a length penalty. Drag the knobs and watch the score, and the curves, respond.

tl;dr

Matching a rare term is worth more than a common one (IDF). Seeing a term more times in a doc helps, but with diminishing returns controlled by k₁. And a match in a short, focused doc counts for more than the same match in a long rambly one — the length penalty, controlled by b. DuckDB's defaults are k₁=1.2, b=0.75.
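Written out, the per-term score described above looks roughly like this. The saturating fraction is the standard BM25 form. The IDF line is an assumption: it is the Lucene-style variant, and this page does not say which variant (or log base) its score display uses.

```latex
\[
\mathrm{score}(t, d) \;=\;
\mathrm{IDF}(t) \cdot
\frac{tf\,(k_1 + 1)}{tf + k_1\left(1 - b + b\,\dfrac{|d|}{\mathrm{avgdl}}\right)}
\]
\[
\mathrm{IDF}(t) \;=\; \ln\!\left(1 + \frac{N - n + 0.5}{n + 0.5}\right)
\qquad \text{(assumed variant)}
\]
```

Three things fall out of the fraction. As tf grows it approaches k₁ + 1, which is the saturation ceiling. With b = 0 the bracket collapses to 1 and length drops out. With |d| = avgdl the bracket is also 1, so an average-length doc is neither penalized nor rewarded. Note that the length penalty sits inside the denominator of the tf factor rather than acting as a separate multiplier.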

Knobs
tf (term frequency in this doc): 3
How many times the query term appears in the document.
|d| (this doc's length, in tokens): 90
Longer docs get penalized relative to the corpus average.
avgdl (average doc length): 120
The corpus-wide average chunk length.
n (docs containing the term): 40
The term's document frequency (df). Lower = rarer = higher IDF.
N (total docs in corpus): 1000
Total number of chunks. (BRB: ~245K.)
k₁ (tf saturation): 1.2
Lower = saturates faster (the 2nd mention barely helps). DuckDB default 1.2.
b (length normalization): 0.75
0 = ignore length entirely; 1 = full length penalty. DuckDB default 0.75.
Score for this term
single-term BM25 contribution (a full query sums this over every term)
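
For readers who want to reproduce the number outside the page, here is a minimal sketch of the single-term score using the slider values above. The function name `bm25_term_score` is hypothetical, and the IDF variant (Lucene-style, natural log) is an assumption, so the result may differ from the page's display or from DuckDB's output by a constant factor.

```python
import math

def bm25_term_score(tf, dl, avgdl, n, N, k1=1.2, b=0.75):
    """Single-term BM25 contribution for one document (illustrative sketch)."""
    # Rarity: assumed Lucene-style IDF; other BM25 variants differ here.
    idf = math.log(1.0 + (N - n + 0.5) / (n + 0.5))
    # Length normalization: exactly 1.0 when dl == avgdl, larger for longer docs.
    length_norm = 1.0 - b + b * (dl / avgdl)
    # Saturating term-frequency factor, bounded above by k1 + 1.
    tf_factor = tf * (k1 + 1.0) / (tf + k1 * length_norm)
    return idf * tf_factor

# Slider values shown on the page: tf=3, |d|=90, avgdl=120, n=40, N=1000.
score = bm25_term_score(tf=3, dl=90, avgdl=120, n=40, N=1000)
print(f"{score:.3f}")
```

A full query score would call this once per query term and add up the results.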

1. Watch the two curves move

The left curve holds everything fixed except tf and sweeps it, showing the saturation shape that k₁ controls. The right curve sweeps |d| instead, showing the length penalty that b controls. The green marker is where your current sliders sit.
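
The two sweeps can be sketched in a few lines. This is an illustration under stated assumptions, not the page's own plotting code: `term_score` is a hypothetical helper, the IDF variant is assumed (Lucene-style), and the sweep ranges are arbitrary choices.

```python
import math

def term_score(tf, dl, avgdl=120, n=40, N=1000, k1=1.2, b=0.75):
    """Single-term BM25 score; defaults mirror the slider values on the page."""
    idf = math.log(1.0 + (N - n + 0.5) / (n + 0.5))  # assumed IDF variant
    return idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * dl / avgdl))

# Left curve: sweep tf with the doc length held at 90 tokens.
tf_curve = [(tf, term_score(tf, dl=90)) for tf in range(0, 21)]

# Right curve: sweep doc length with tf held at 3.
dl_curve = [(dl, term_score(3, dl=dl)) for dl in range(10, 401, 10)]

# Saturation: the gain from each extra mention shrinks as tf grows.
tf_gains = [hi - lo for (_, lo), (_, hi) in zip(tf_curve, tf_curve[1:])]

for tf, s in tf_curve[:6]:
    print(f"tf={tf:2d}  score={s:.3f}")
for dl, s in dl_curve[::8]:
    print(f"|d|={dl:3d}  score={s:.3f}")
```

Lowering `k1` makes `tf_gains` fall off faster, and setting `b=0` flattens `dl_curve` into a horizontal line, which matches what the sliders do to the plots.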

[Chart: score vs tf (term frequency), showing saturation]
[Chart: score vs |d| (doc length), showing the length penalty]