
Range Split and Merge Operations

A range grows past 512 MB. How does the control plane split it without dropping a single query?
The split orchestrator picks the median key by data volume, not the key-space midpoint, so each new range holds roughly 256 MB.
Why median by volume? The key-space midpoint halves the range by key ordering, but if the distribution is skewed, one half might still hold 90% of the data.
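A minimal sketch of volume-based split-key selection may help make the contrast concrete. It assumes the range can report a per-key size in key order; the function names and the sample sizes (in MB) are hypothetical and are not taken from any real system.

```python
from typing import List, Tuple


def pick_split_key(entries: List[Tuple[str, int]]) -> str:
    """Return the first key of the right-hand range.

    entries: (key, size) pairs sorted by key. The boundary that leaves the
    two sides closest in total size is chosen.
    """
    if len(entries) < 2:
        raise ValueError("need at least two keys to split a range")
    total = sum(size for _, size in entries)
    left = 0
    best_key, best_gap = None, None
    # Try every boundary between consecutive keys and keep the most balanced.
    for (_, size), (next_key, _) in zip(entries, entries[1:]):
        left += size
        gap = abs(total - 2 * left)  # |right side - left side|
        if best_gap is None or gap < best_gap:
            best_key, best_gap = next_key, gap
    return best_key


def left_size(entries: List[Tuple[str, int]], split_key: str) -> int:
    """Total size of all keys that sort before split_key."""
    return sum(size for key, size in entries if key < split_key)


# Hypothetical skewed range: 512 MB total, most of it in the first few keys.
entries = [("a", 180), ("b", 70), ("c", 60), ("d", 50),
           ("e", 50), ("f", 40), ("g", 32), ("h", 30)]

by_volume = pick_split_key(entries)            # "c": 250 MB left, 262 MB right
by_midpoint = entries[len(entries) // 2][0]    # "e": 360 MB left, 152 MB right
```

With this sample data the key-space midpoint leaves one side more than twice the size of the other, while the volume-based choice lands both sides near 256 MB.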
The split is a two-phase operation: (1) the old range's leader creates the two new ranges locally and populates metadata for both; (2) the control plane commits the split to the Raft metadata store in a single atomic write. While the split is in progress, in-flight queries continue to be served by the old range.
After the metadata swap, the old range's epoch is incremented, and query routers redirect to the new ranges. CockroachDB processes ~100 splits per hour per cluster at steady state.
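The two phases and the metadata swap can be sketched roughly as follows. This is an illustration under stated assumptions, not a real control plane: `MetadataStore` is an in-memory stand-in for the Raft-backed store, the atomic write is modeled as a single reference swap, and the epoch bump is modeled by giving both new descriptors the old epoch plus one. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class RangeDesc:
    range_id: int
    start: str  # inclusive
    end: str    # exclusive
    epoch: int


class MetadataStore:
    """In-memory stand-in for the Raft-backed metadata store (assumption)."""

    def __init__(self, ranges: Iterable[RangeDesc]) -> None:
        self.ranges: Dict[int, RangeDesc] = {r.range_id: r for r in ranges}

    def lookup(self, key: str) -> RangeDesc:
        """What a query router would do: find the range that owns the key."""
        for r in self.ranges.values():
            if r.start <= key < r.end:
                return r
        raise KeyError(key)

    def commit_split(self, old: RangeDesc, left: RangeDesc, right: RangeDesc) -> None:
        """Phase 2: replace the old descriptor with both new ones in one step."""
        if self.ranges.get(old.range_id) != old:
            raise RuntimeError("stale split: descriptor changed since prepare")
        updated = dict(self.ranges)
        del updated[old.range_id]
        updated[left.range_id] = left
        updated[right.range_id] = right
        self.ranges = updated  # single swap models the atomic write


def prepare_split(old: RangeDesc, split_key: str, next_id: int) -> Tuple[RangeDesc, RangeDesc]:
    """Phase 1: build both new descriptors locally; nothing is visible yet."""
    left = RangeDesc(next_id, old.start, split_key, old.epoch + 1)
    right = RangeDesc(next_id + 1, split_key, old.end, old.epoch + 1)
    return left, right


store = MetadataStore([RangeDesc(1, "a", "z", epoch=5)])
old = store.lookup("m")
left, right = prepare_split(old, "k", next_id=2)

served_during_split = store.lookup("m").range_id  # still the old range
store.commit_split(old, left, right)
served_after_commit = store.lookup("m").range_id  # now the right-hand range
```

In this sketch a router holding a cached descriptor with epoch 5 would see epoch 6 on its next lookup and know to refresh its routing entry.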
Merge is the reverse: two adjacent ranges whose combined size falls below 128 MB are merged to reduce metadata overhead.
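The merge condition can be expressed as a small predicate. This is a sketch only: the 128 MB threshold comes from the text above, while the function name, the byte-based units, and the adjacency check on range boundaries are assumptions made for illustration.

```python
MB = 1024 * 1024
MERGE_THRESHOLD_BYTES = 128 * MB  # combined-size threshold from the text


def should_merge(left_end: str, right_start: str,
                 left_bytes: int, right_bytes: int) -> bool:
    """True when two ranges are adjacent and their combined size is small."""
    adjacent = left_end == right_start  # left's exclusive end meets right's start
    return adjacent and (left_bytes + right_bytes) < MERGE_THRESHOLD_BYTES


small_neighbors = should_merge("k", "k", 40 * MB, 60 * MB)   # adjacent, 100 MB total
too_large = should_merge("k", "k", 100 * MB, 100 * MB)       # adjacent, 200 MB total
not_adjacent = should_merge("k", "m", 40 * MB, 60 * MB)      # gap between ranges
```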
Why it matters in interviews
This is the KEY INSIGHT of the topic. Interviewers ask: how do you split without downtime? Explaining the median-key selection, two-phase operation, and in-flight query handling separates senior candidates from those who only know the theory.
Related concepts