be right back · interactive explainer

BM25, one knob at a time

BM25 scores how relevant a document is to a query; this page looks at the contribution of a single query term. That contribution is built from three ideas: rarity (IDF), term frequency with diminishing returns, and a length penalty. Drag the knobs and watch the score and the curves respond.
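
Written out with the knob names used on this page, one common form of the single-term score looks like the formula below. The IDF variant (with the +1 inside the logarithm, as in Lucene) is an assumption, since the page does not say which variant it plots; other variants drop the +1. Note that the length penalty enters through the denominator of the term-frequency factor rather than as a separate multiplier.

$$
\mathrm{score}(t, d) =
\underbrace{\ln\!\left(1 + \frac{N - n + 0.5}{n + 0.5}\right)}_{\text{IDF}}
\cdot
\frac{tf \cdot (k_1 + 1)}{tf + k_1 \left(1 - b + b \cdot \frac{|d|}{avgdl}\right)}
$$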

tl;dr

Matching a rare term is worth more than a common one (IDF). Seeing a term more times in a doc helps, but with diminishing returns controlled by k₁. And a match in a short, focused doc counts for more than the same match in a long rambly one — the length penalty, controlled by b. DuckDB's defaults are k₁=1.2, b=0.75.

Knobs

tf (term frequency in this doc): 3

How many times the query term appears in the document.

|d| (this doc's length, in tokens): 90

Docs longer than the corpus average get penalized; shorter ones get a boost.

avgdl (average doc length): 120

The corpus-wide average chunk length.

n (docs containing the term): 40

The term's document frequency (df). Lower = rarer = higher IDF.

N (total docs in corpus): 1000

Total number of chunks. (BRB: ~245K.)

k₁ (tf saturation): 1.2

Lower = saturates faster (the 2nd mention barely helps). DuckDB default 1.2.

b (length normalization): 0.75

0 = ignore length entirely; 1 = full length penalty. DuckDB default 0.75.
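
To see what each knob does on its own, here is a minimal Python sketch that computes the three pieces separately at the slider values listed above. The function names are made up for illustration, and the IDF variant (Lucene-style, with +1 inside the log) is an assumption rather than something this page specifies.

```python
import math

def idf(n: int, N: int) -> float:
    """Rarity of the term (assumed Lucene-style variant, never negative)."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))

def length_norm(doc_len: float, avgdl: float, b: float) -> float:
    """1.0 for an average-length doc; b=0 pins it at 1.0 for every doc."""
    return 1 - b + b * doc_len / avgdl

def tf_saturation(tf: float, k1: float, norm: float) -> float:
    """Grows with tf but levels off, approaching k1 + 1."""
    return tf * (k1 + 1) / (tf + k1 * norm)

# Slider values from the knobs above.
tf, doc_len, avgdl, n, N, k1, b = 3, 90, 120, 40, 1000, 1.2, 0.75

norm = length_norm(doc_len, avgdl, b)
print(f"idf           = {idf(n, N):.3f}")
print(f"length norm   = {norm:.4f}")
print(f"tf saturation = {tf_saturation(tf, k1, norm):.3f}")

# Rarer term -> higher IDF.
print(idf(4, N) > idf(40, N) > idf(400, N))

# Lower k1 -> the 2nd mention adds less (average-length doc, norm = 1).
for k in (0.2, 1.2, 3.0):
    gain = tf_saturation(2, k, 1.0) - tf_saturation(1, k, 1.0)
    print(f"k1={k}: gain from 2nd mention = {gain:.3f}")
```

Because this doc (90 tokens) is shorter than average (120), its length norm comes out below 1, which shrinks the denominator and nudges the score up.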

Score for this term

single-term BM25 contribution (a full query sums this over every term)
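
As a rough sketch of how the single-term contribution rolls up into a full query score, the Python below sums it over every query term. The helper names, the two-term query, and its per-term statistics are hypothetical, and the IDF variant (Lucene-style) is assumed rather than taken from this page.

```python
import math

def bm25_term(tf, doc_len, avgdl, n, N, k1=1.2, b=0.75):
    """Single-term BM25 contribution (assumed Lucene-style IDF)."""
    idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
    norm = 1 - b + b * doc_len / avgdl
    return idf * tf * (k1 + 1) / (tf + k1 * norm)

def bm25_query(term_stats, doc_len, avgdl, N, k1=1.2, b=0.75):
    """Sum the per-term contributions; term_stats maps term -> (tf, n)."""
    return sum(
        bm25_term(tf, doc_len, avgdl, n, N, k1, b)
        for tf, n in term_stats.values()
    )

# Hypothetical two-term query against one 90-token doc in a 1000-doc corpus.
stats = {"saturation": (3, 40), "the": (7, 950)}
for term, (tf, n) in stats.items():
    print(term, round(bm25_term(tf, 90, 120, n, 1000), 3))
print("total", round(bm25_query(stats, 90, 120, 1000), 3))
```

Even with more occurrences, the very common term adds little to the total, because its IDF is close to zero.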

1. Watch the two curves move

The left curve holds everything fixed except tf and sweeps it; this is the saturation shape that k₁ controls. The right curve sweeps |d| instead, showing the length penalty that b controls. The green marker shows where your current sliders sit.

Left plot: score vs tf (term frequency), showing saturation.

Right plot: score vs |d| (doc length), showing the length penalty.
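
The data behind two curves like these can be approximated with a short Python sketch: hold the sliders at their current values and sweep one input at a time. The sweep ranges and the `bm25_term` helper are illustrative choices, and the IDF variant (Lucene-style) is an assumption, so the exact numbers may differ from what the page plots.

```python
import math

def bm25_term(tf, doc_len, avgdl, n, N, k1=1.2, b=0.75):
    """Single-term BM25 contribution (assumed Lucene-style IDF)."""
    idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
    norm = 1 - b + b * doc_len / avgdl
    return idf * tf * (k1 + 1) / (tf + k1 * norm)

# Current slider positions.
base = dict(tf=3, doc_len=90, avgdl=120, n=40, N=1000, k1=1.2, b=0.75)

# Saturation curve: sweep tf, hold everything else fixed.
tf_curve = [(tf, bm25_term(**{**base, "tf": tf})) for tf in range(0, 21)]

# Length-penalty curve: sweep doc length, hold everything else fixed.
len_curve = [(d, bm25_term(**{**base, "doc_len": d})) for d in range(10, 501, 10)]

for tf, score in tf_curve[:6]:
    print(f"tf={tf:2d}  score={score:.3f}")
for d, score in len_curve[::10]:
    print(f"|d|={d:3d}  score={score:.3f}")
```

The tf sweep rises quickly and then flattens, while the length sweep falls steadily as the doc grows past the corpus average.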