STANDARDwalkthrough
Node Addition and Removal
A new cache node joins the cluster. With modular hashing, every key is rehashed: instead of .
With the ring, the new node's 256 vnodes each steal a portion of the clockwise neighbor's arc. Total keys transferred: keys, roughly 1% of total data.
“At 500M keys, 99% remap.”
At 1 KB per value, that is a 5 GB transfer at 1 Gbps in ~40 seconds. We use a two-phase approach: (1) the new node copies data from donors in the background while the old owner keeps serving, (2) once caught up, the ring coordinator atomically updates topology.
Node removal is the reverse: with RF=3, the successor already holds a replica, so zero data transfer is needed for reads. Why not stop the world?
At 115K ops/sec, 40 seconds of downtime means 4.6M failed lookups. Trade-off: two-phase introduces a brief window where two nodes claim the same key range, resolved via epoch versioning.