Range Split and Merge Operations
A range grows past 512 MB. How does the control plane split it without dropping a single query?
“The split orchestrator picks the median key by data volume (not the key-space midpoint) so each new range holds roughly 256 MB.”
Why median by volume? The key-space midpoint halves the range by key ordering, but if the data distribution is skewed, one half might still hold 90% of the data.
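A minimal sketch of picking a split key by data volume follows. It assumes the range can report per-key (or per-block) sizes in key order; the function name `pick_split_key` and the input shape are hypothetical, chosen only for illustration, and are not taken from any real system's API.

```python
from itertools import accumulate
from typing import List, Tuple


def pick_split_key(entries: List[Tuple[str, int]]) -> str:
    """Return the first key of the right-hand range.

    `entries` is a list of (key, size_bytes) pairs sorted by key.
    The chosen boundary is the one that makes the two halves as
    close in byte volume as possible.
    """
    if len(entries) < 2:
        raise ValueError("need at least two entries to split")

    total = sum(size for _, size in entries)
    prefix = list(accumulate(size for _, size in entries))

    # Boundary i puts entries[:i] on the left and entries[i:] on the right.
    # |2 * left - total| equals |left - right|, the byte imbalance.
    best = min(
        range(1, len(entries)),
        key=lambda i: abs(2 * prefix[i - 1] - total),
    )
    return entries[best][0]


# Skewed data: most bytes live under the first two keys.
sizes = [("a", 40), ("b", 40), ("c", 10), ("x", 5), ("y", 5)]
split_key = pick_split_key(sizes)
```

With this input the boundary lands at `"b"` (40 bytes left, 60 right). Splitting at the middle entry `"c"` would instead leave 80 bytes on one side and 20 on the other.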
The split is a two-phase operation: (1) the old range leader creates the two new ranges locally and populates metadata for both; (2) the control plane commits the split to the Raft metadata store in a single atomic write. During the split, in-flight queries continue to be served by the old range.
After the metadata swap, the old range's epoch is incremented, and query routers redirect to the new ranges. CockroachDB processes ~100 splits per hour per cluster at steady state.
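The two phases and the epoch bump can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: `RangeDesc`, `MetadataStore`, `prepare_split`, and `commit_split` are hypothetical names, and a single dictionary reassignment stands in for the atomic Raft write.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class RangeDesc:
    range_id: int
    start: str  # inclusive
    end: str    # exclusive
    epoch: int = 0


class MetadataStore:
    """Toy stand-in for the Raft-backed metadata store."""

    def __init__(self, initial: RangeDesc) -> None:
        self.ranges: Dict[int, RangeDesc] = {initial.range_id: initial}
        # range_id -> epoch after retirement; routers holding an older
        # epoch would know to refresh their routing information.
        self.retired: Dict[int, int] = {}

    def lookup(self, key: str) -> RangeDesc:
        for desc in self.ranges.values():
            if desc.start <= key < desc.end:
                return desc
        raise KeyError(key)

    def commit_split(self, old: RangeDesc, left: RangeDesc, right: RangeDesc) -> None:
        # Phase 2: build the new mapping off to the side, then swap it in
        # with one assignment, modelling the single atomic write.
        updated = {k: v for k, v in self.ranges.items() if k != old.range_id}
        updated[left.range_id] = left
        updated[right.range_id] = right
        self.retired[old.range_id] = old.epoch + 1
        self.ranges = updated


def prepare_split(old: RangeDesc, split_key: str,
                  left_id: int, right_id: int) -> Tuple[RangeDesc, RangeDesc]:
    # Phase 1: descriptors are created locally and are not yet visible.
    if not (old.start < split_key < old.end):
        raise ValueError("split key must fall strictly inside the range")
    return (RangeDesc(left_id, old.start, split_key),
            RangeDesc(right_id, split_key, old.end))


store = MetadataStore(RangeDesc(1, "a", "z"))
old = store.lookup("m")
left, right = prepare_split(old, "k", left_id=2, right_id=3)
served_before = store.lookup("m").range_id  # old range still serves
store.commit_split(old, left, right)
served_after = store.lookup("m").range_id   # routed to the right-hand range
```

The point of the sketch is the visibility boundary: nothing created in phase 1 affects lookups until the commit replaces the mapping in one step.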
Merge is the reverse: two adjacent ranges whose combined size falls below 128 MB are merged to reduce metadata overhead.
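The merge condition can be expressed as a small predicate. This is a sketch under stated assumptions: `should_merge` and its parameters are hypothetical, and "128 MB" is interpreted here as 128 MiB.

```python
# Assumption: the 128 MB threshold is treated as 128 MiB.
MERGE_THRESHOLD_BYTES = 128 * 1024 * 1024


def should_merge(left_end: str, right_start: str,
                 left_bytes: int, right_bytes: int) -> bool:
    """True when two ranges are adjacent and small enough to merge."""
    adjacent = left_end == right_start
    return adjacent and (left_bytes + right_bytes) < MERGE_THRESHOLD_BYTES


MIB = 1024 * 1024
small_neighbors = should_merge("k", "k", 60 * MIB, 60 * MIB)
too_large = should_merge("k", "k", 70 * MIB, 70 * MIB)
not_adjacent = should_merge("k", "m", 10 * MIB, 10 * MIB)
```

Note that the merge threshold sits well below the roughly 256 MB size of a freshly split range, so a range that has just been split is not immediately eligible to merge back.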