Replication on the Ring

A key hashes to position P on the ring. The first vnode clockwise from P owns the key, and we replicate it to the next RF-1 distinct physical nodes clockwise.

Why distinct physical nodes? Because two consecutive vnodes might belong to the same server.

“With RF=3, the key lives on 3 different servers.”

Blindly replicating to the next 2 ring positions could place all 3 copies on one machine, and if that machine fails, all replicas are lost. We walk clockwise, skipping vnodes on already-selected physical nodes, until we have RF distinct hosts.
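The walk described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular system's implementation: the function name `replicas_for`, the ring layout (a sorted list of `(position, physical_node)` pairs), and the toy positions are all assumptions made for the example.

```python
import bisect


def replicas_for(ring, key_pos, rf=3):
    """Pick up to rf distinct physical nodes, walking clockwise from key_pos.

    ring is a list of (position, physical_node) pairs sorted by position
    (a hypothetical layout chosen for this sketch).
    """
    positions = [pos for pos, _ in ring]
    distinct_hosts = len({node for _, node in ring})
    want = min(rf, distinct_hosts)  # cannot place more copies than hosts
    if want <= 0:
        return []
    # First vnode at or after the key's position; the modulo below wraps around.
    start = bisect.bisect_left(positions, key_pos)
    chosen = []
    for step in range(len(ring)):
        _, node = ring[(start + step) % len(ring)]
        if node in chosen:
            continue  # vnode belongs to an already-selected physical node
        chosen.append(node)
        if len(chosen) == want:
            break
    return chosen


# Three consecutive vnodes belong to server A.
ring = [(10, "A"), (20, "A"), (30, "A"), (40, "B"), (50, "C")]

naive = [node for _, node in ring[:3]]   # next 3 ring positions for a key at 5
print(naive)                             # ['A', 'A', 'A']
print(replicas_for(ring, key_pos=5))     # ['A', 'B', 'C']
print(replicas_for(ring, key_pos=45))    # ['C', 'A', 'B'] (wraps past the end)
```

The naive selection puts every copy on server A, while the skipping walk returns three different hosts. If the cluster has fewer than RF physical nodes, this sketch returns as many distinct hosts as exist rather than repeating one.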

For rack-aware placement, also skip nodes in a rack that already holds a replica. Cassandra and Amazon's Dynamo both use this clockwise-walk strategy.
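A rack-aware variant of the walk might look like the sketch below. The names `rack_aware_replicas` and `rack_of`, and the ring layout of sorted `(position, physical_node)` pairs, are hypothetical. The fallback that reuses racks when there are fewer racks than RF is an assumption of this sketch, not something the text above specifies.

```python
import bisect


def rack_aware_replicas(ring, rack_of, key_pos, rf=3):
    """Walk clockwise, preferring one replica per rack.

    ring:    list of (position, physical_node), sorted by position.
    rack_of: dict mapping physical_node -> rack name (hypothetical input).
    """
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_left(positions, key_pos)

    # Distinct physical nodes in clockwise order, starting at the key.
    walk = []
    for step in range(len(ring)):
        _, node = ring[(start + step) % len(ring)]
        if node not in walk:
            walk.append(node)

    chosen, used_racks = [], set()
    # First pass: skip any node whose rack already holds a replica.
    for node in walk:
        if len(chosen) == rf:
            break
        if rack_of[node] not in used_racks:
            chosen.append(node)
            used_racks.add(rack_of[node])

    # Fallback (assumption): with fewer racks than rf, reuse racks
    # but still insist on distinct physical nodes.
    for node in walk:
        if len(chosen) >= rf:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


ring = [(10, "A"), (20, "B"), (30, "A"), (40, "C"), (50, "D")]

three_racks = {"A": "r1", "B": "r1", "C": "r2", "D": "r3"}
print(rack_aware_replicas(ring, three_racks, key_pos=5))  # ['A', 'C', 'D']

two_racks = {"A": "r1", "B": "r1", "C": "r2", "D": "r2"}
print(rack_aware_replicas(ring, two_racks, key_pos=5))    # ['A', 'C', 'B']
```

In the first call, B is skipped because it shares rack r1 with A. In the second, only two racks exist, so the fallback pass adds B as the third copy.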

Per-node storage with RF=3: $500\text{M} \times 1\text{ KB} / 100 = 5\text{ GB}$ raw, and $5 \times 3 = 15\text{ GB}$ replicated. On node failure, the successor already holds the data: zero cache misses reach the database.

Trade-off: RF=3 triples storage cost and amplifies writes threefold, to 345K total node writes/sec.
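The capacity arithmetic can be checked with a short Python sketch. The function names are hypothetical, and decimal units (1 GB = 1,000,000 KB) are assumed to match the figures above. The 115K client writes/sec input is not stated in this section; it is inferred here by dividing 345K by RF.

```python
def per_node_storage_gb(keys, value_kb, nodes, rf):
    """Return (raw_gb, replicated_gb) per node, using 1 GB = 1,000,000 KB."""
    raw_gb = keys * value_kb / 1_000_000 / nodes
    return raw_gb, raw_gb * rf


def total_node_writes(client_writes_per_sec, rf):
    """Every client write is applied on rf nodes."""
    return client_writes_per_sec * rf


raw, replicated = per_node_storage_gb(keys=500_000_000, value_kb=1, nodes=100, rf=3)
print(raw, replicated)                 # 5.0 15.0

# 115_000 is an inferred input (345K / 3), not a figure from the text.
print(total_node_writes(115_000, 3))   # 345000
```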

Why it matters in interviews

Knowing that the clockwise walk must skip vnodes on already-selected physical nodes is what separates senior candidates. Interviewers push on 'what if two vnodes are on the same machine?' to test understanding of replica placement.
