be right back · interactive explainer

Sliding-window chunking

Before messages can be embedded, they're grouped into overlapping chunks with two knobs: window (how many messages per chunk) and stride (how far the window jumps each step). Drag them and watch the overlap appear and disappear.

tl;dr

A chunk is a window of consecutive messages. We slide the window forward by stride messages each step. When stride < window, consecutive chunks share window - stride messages. That overlap is deliberate, so a message near a boundary still has its surrounding context in at least one chunk. stride = window tiles the conversation with no overlap; stride > window leaves gaps, so the messages between windows land in no chunk at all (bad). Production uses window=8, stride=4, which is 50% overlap.
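The sliding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the production code: the names chunk_spans and chunk_messages are hypothetical, and the tail rule (shorten the last window and stop once it reaches the final message) is an assumption the explainer does not spell out.

```python
from typing import List, Sequence, Tuple


def chunk_spans(n_messages: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index spans, one per chunk."""
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be positive")
    spans = []
    for start in range(0, n_messages, stride):
        end = min(start + window, n_messages)
        spans.append((start, end))
        # Assumed tail rule: stop as soon as a window reaches the last message.
        if end == n_messages:
            break
    return spans


def chunk_messages(messages: Sequence[str], window: int = 8, stride: int = 4) -> List[List[str]]:
    """Slice the conversation into chunks using the spans above."""
    return [list(messages[s:e]) for s, e in chunk_spans(len(messages), window, stride)]
```

With 20 messages and window=8, stride=4, this yields the spans (0, 8), (4, 12), (8, 16), (12, 20). With window=2, stride=3 on 10 messages, messages 2, 5 and 8 fall into no span.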

1. The knobs

window (messages per chunk): 8
How many consecutive messages each chunk contains.
stride (step between chunks): 4
How far the window advances before cutting the next chunk.
Readouts: chunks produced · overlap per step · overlap ratio · messages covered
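The four readouts can be derived from the two knobs and the conversation length. The sketch below is one plausible way to compute them; window_stats and its dictionary keys are hypothetical names, and the tail rule (shorten the last window and stop once it reaches the final message) is an assumption, so the chunk count may differ from what the live page shows.

```python
from typing import Dict, Union


def window_stats(n_messages: int, window: int, stride: int) -> Dict[str, Union[int, float]]:
    """Compute chunk count, overlap per step, overlap ratio, and coverage."""
    covered = set()
    chunks = 0
    for start in range(0, n_messages, stride):
        end = min(start + window, n_messages)
        covered.update(range(start, end))
        chunks += 1
        if end == n_messages:  # assumed tail rule: stop at the last message
            break
    # Consecutive windows share window - stride messages, never fewer than zero.
    overlap = max(window - stride, 0)
    return {
        "chunks": chunks,
        "overlap_per_step": overlap,
        "overlap_ratio": overlap / window,
        "messages_covered": len(covered),
    }
```

For window=8, stride=4 the overlap per step is 4 and the ratio is 0.5, matching the 50% figure in the tl;dr.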

2. Which chunk covers which message

Each row is one chunk; each column is one message in the conversation. A filled cell means that chunk contains that message. The thin strip underneath colors each message by how many chunks cover it.

Legend: in chunk · overlap (2+ chunks) · single chunk · gap (0 chunks)
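The grid and the strip underneath it can be reproduced as a boolean matrix plus per-column counts. This is a rough sketch under the same assumed tail rule (the last window is shortened and iteration stops at the final message); coverage_grid and classify are hypothetical names chosen for illustration.

```python
from typing import List, Tuple


def coverage_grid(n_messages: int, window: int, stride: int) -> Tuple[List[List[bool]], List[int]]:
    """Return (rows, counts): one boolean row per chunk, one count per message."""
    rows = []
    for start in range(0, n_messages, stride):
        end = min(start + window, n_messages)
        # A True cell means this chunk contains that message.
        rows.append([start <= i < end for i in range(n_messages)])
        if end == n_messages:  # assumed tail rule
            break
    counts = [sum(col) for col in zip(*rows)] if rows else [0] * n_messages
    return rows, counts


def classify(count: int) -> str:
    """Map a coverage count to the legend categories."""
    if count == 0:
        return "gap"
    return "single" if count == 1 else "overlap"
```

For 12 messages with window=8, stride=4, the counts are 1 for messages 0 to 3, 2 for messages 4 to 7, and 1 for messages 8 to 11.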

3. The conversation, annotated

The same result on the actual messages. Rows highlighted green sit in the overlap zone (covered by more than one chunk); rows in red were skipped entirely. The chips on the right list which chunks each message landed in.
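The per-message chips amount to a reverse lookup from each message to the chunks that contain it. A minimal sketch, assuming zero-based chunk numbering and the same tail rule as before (shorten the last window, stop at the final message); annotate is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def annotate(messages: Sequence[str], window: int = 8, stride: int = 4) -> List[Tuple[str, List[int]]]:
    """Pair each message with the list of chunk indices that include it."""
    membership: List[List[int]] = [[] for _ in messages]
    chunk_id = 0
    for start in range(0, len(messages), stride):
        end = min(start + window, len(messages))
        for i in range(start, end):
            membership[i].append(chunk_id)
        chunk_id += 1
        if end == len(messages):  # assumed tail rule
            break
    return list(zip(messages, membership))
```

A message with an empty list was skipped (a red row), and one with two or more indices sits in the overlap zone (a green row).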