Node Addition and Removal

A new cache node joins a 100-node cluster. With modular hashing, every key is rehashed with hash(key) % 101 instead of hash(key) % 100, so at 500M keys roughly 99% remap to a different node.
With the ring, the new node's 256 vnodes each steal a portion of their clockwise neighbor's arc. Total keys transferred: 500M / 101 ≈ 5M keys, roughly 1% of the total data.
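The gap between the two schemes can be checked with a small simulation. The sketch below is illustrative only: the helper names, the md5-based hash, and the 20,000-key sample are assumptions chosen to keep it fast, not part of any particular cache implementation.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable 64-bit hash; md5 is used only for its even distribution."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes=256):
    """Sorted (position, node) pairs, one entry per virtual node."""
    return sorted((h(f"{node}#{v}"), node) for node in nodes for v in range(vnodes))


def ring_owner(positions, ring, key):
    """The first vnode clockwise from the key's position owns the key."""
    i = bisect.bisect_right(positions, h(key)) % len(ring)  # wrap past the end
    return ring[i][1]


def moved_fraction_modular(keys, old_n, new_n):
    """Fraction of keys whose node index changes under hash(key) % n."""
    return sum(h(k) % old_n != h(k) % new_n for k in keys) / len(keys)


def moved_fraction_ring(keys, old_nodes, new_nodes, vnodes=256):
    """Fraction of keys whose owner changes between two ring layouts."""
    old_ring = build_ring(old_nodes, vnodes)
    new_ring = build_ring(new_nodes, vnodes)
    old_pos = [p for p, _ in old_ring]
    new_pos = [p for p, _ in new_ring]
    moved = sum(
        ring_owner(old_pos, old_ring, k) != ring_owner(new_pos, new_ring, k)
        for k in keys
    )
    return moved / len(keys)


keys = [f"key-{i}" for i in range(20_000)]       # small sample, not 500M
nodes = [f"node-{i}" for i in range(100)]

modular_moved = moved_fraction_modular(keys, 100, 101)
ring_moved = moved_fraction_ring(keys, nodes, nodes + ["node-100"])
print(f"modular: {modular_moved:.1%} moved, ring: {ring_moved:.1%} moved")
```

With evenly distributed hashes, the modular figure should land near 99% and the ring figure near 1%, matching the estimates above. The exact ring number varies slightly with vnode placement.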
At 1 KB per value, that is a 5 GB transfer, which takes ~40 seconds at 1 Gbps. We use a two-phase approach: (1) the new node copies data from donors in the background while the old owner keeps serving, (2) once the new node has caught up, the ring coordinator atomically updates the topology.
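The transfer estimate is easy to re-derive. The snippet below is a back-of-envelope check that assumes decimal units (1 KB = 1,000 bytes), counts only value bytes, and treats the 1 Gbps link as fully available; the constant names are illustrative.

```python
TOTAL_KEYS = 500_000_000
NODES_AFTER_JOIN = 101
VALUE_BYTES = 1_000                  # 1 KB per value (decimal, assumption)
LINK_BITS_PER_SEC = 1_000_000_000    # 1 Gbps, assumed fully available

keys_moved = TOTAL_KEYS / NODES_AFTER_JOIN      # new node's share of the ring
bytes_moved = keys_moved * VALUE_BYTES          # ignores key and protocol overhead
seconds = bytes_moved * 8 / LINK_BITS_PER_SEC   # bytes -> bits, then divide by rate

print(f"{keys_moved / 1e6:.2f}M keys, {bytes_moved / 1e9:.2f} GB, {seconds:.1f} s")
# prints: 4.95M keys, 4.95 GB, 39.6 s
```

Rounding 4.95 GB up to 5 GB gives the ~40 second figure. Real transfers will be slower once key bytes, serialization, and competing traffic are included.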
Node removal is the reverse: with RF=3, the successor already holds a replica, so zero data transfer is needed for reads.
Why not stop the world? At 115K ops/sec, 40 seconds of downtime means 4.6M failed lookups. Trade-off: the two-phase approach introduces a brief window where two nodes claim the same key range, resolved via epoch versioning.
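One way to picture how epoch versioning settles the overlap is the toy model below. It is a sketch under stated assumptions, not a description of any specific system: the Topology, Coordinator, CacheNode, and Redirect names are hypothetical, a single key range stands in for the whole ring, and the background copy is collapsed into one dictionary update.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Topology:
    epoch: int   # bumped on every ring change
    owner: str   # owner of the single key range modeled here


class Redirect(Exception):
    """Raised when a node cannot safely serve a request for this range."""


@dataclass
class Coordinator:
    topology: Topology

    def commit(self, new_owner: str) -> Topology:
        # Phase 2: one atomic swap; the higher epoch supersedes the old claim.
        self.topology = Topology(self.topology.epoch + 1, new_owner)
        return self.topology


@dataclass
class CacheNode:
    name: str
    topology: Topology
    data: dict = field(default_factory=dict)

    def get(self, key: str, request_epoch: int) -> str:
        if request_epoch > self.topology.epoch:
            # The caller has seen a newer ring, so this node's claim may be stale.
            raise Redirect(f"{self.name} is behind epoch {request_epoch}")
        if self.topology.owner != self.name:
            raise Redirect(f"{self.name} does not own this range")
        return self.data[key]


def bootstrap(coordinator: Coordinator, donor: CacheNode, joiner: CacheNode) -> None:
    # Phase 1: copy in the background; the donor keeps serving throughout.
    joiner.data.update(donor.data)
    # Phase 2: the coordinator flips ownership; here the joiner learns it first.
    joiner.topology = coordinator.commit(joiner.name)


start = Topology(epoch=1, owner="node-a")
coordinator = Coordinator(start)
donor = CacheNode("node-a", start, {"user:42": "alice"})
joiner = CacheNode("node-b", start)

bootstrap(coordinator, donor, joiner)

# The donor still holds the epoch-1 view, so both nodes claim the range.
new_epoch = coordinator.topology.epoch
served_by_joiner = joiner.get("user:42", new_epoch)
try:
    donor.get("user:42", new_epoch)
    donor_refused = False
except Redirect:
    donor_refused = True

print(served_by_joiner, donor_refused)
```

In this model a request stamped with the newer epoch is served by the joiner and refused by the lagging donor, so the higher epoch wins. A client still on the old epoch can be served by the donor until the donor refreshes, which is the brief window described above.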
Why it matters in interviews
Interviewers ask 'what happens when you add a node at 3 AM?' Explaining the two-phase bootstrap and deriving 5 GB in 40 seconds shows you think about operational reality.