Replica Placement Policies
RF=3 means three copies, but placing all three replicas on machines in the same rack defeats the purpose: a single rack failure (power supply, top-of-rack switch) loses every copy at once.
The control plane enforces placement constraints: spread replicas across distinct racks, availability zones, and regions. When the rebalancer or split orchestrator creates a new replica, it queries the placement constraint solver: find a node in a different rack/zone than existing replicas, with sufficient disk space, and below the CPU threshold.
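The constraint check described above can be sketched as a simple filter-and-rank pass over candidate nodes. This is a minimal illustration, not any real system's API; the names (`Node`, `pick_placement`) and thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    node_id: str
    rack: str
    zone: str
    free_disk_gb: float
    cpu_util: float  # fraction in [0.0, 1.0]

def pick_placement(candidates: List[Node],
                   existing_replicas: List[Node],
                   min_disk_gb: float = 50.0,
                   cpu_threshold: float = 0.8) -> Optional[Node]:
    """Return a node in a different rack and zone than the existing
    replicas, with enough free disk and CPU below threshold, or None."""
    used_racks = {n.rack for n in existing_replicas}
    used_zones = {n.zone for n in existing_replicas}
    eligible = [
        n for n in candidates
        if n.rack not in used_racks
        and n.zone not in used_zones
        and n.free_disk_gb >= min_disk_gb
        and n.cpu_util < cpu_threshold
    ]
    # Among eligible nodes, prefer the least CPU-loaded one.
    return min(eligible, key=lambda n: n.cpu_util, default=None)
```

A real solver would also handle the case where no node satisfies every constraint (e.g. fall back to rack diversity only), but the hard-filter-then-rank shape is the core idea.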
“We chose rack-aware placement (not latency-optimized placement) because correlated failures at the rack level are common in production.”
Google Spanner places replicas across continents for disaster recovery and uses Paxos to maintain consistency across 50-100ms round-trip latencies. The trade-off is concrete: cross-region replication adds 50-100ms of latency per write, because the consensus leader (Paxos in Spanner's case, Raft in Raft-based systems) must wait for a majority of replicas to acknowledge.
For read-heavy workloads, this is acceptable because reads can be served from the local region via follower reads with bounded staleness.
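The follower-read routing decision amounts to comparing the local replica's replication lag against the staleness bound. Here is a minimal sketch of that decision; the function name, the dict-based replica representation, and the 5-second default bound are all illustrative assumptions, not any system's actual interface.

```python
import time

def route_read(leader: dict, local_follower: dict,
               max_staleness_s: float = 5.0,
               now: float = None) -> str:
    """Return the node to serve a read from.

    Serve from the local follower if its replication lag is within the
    staleness bound; otherwise pay the cross-region cost and read from
    the leader for freshness.
    """
    now = time.time() if now is None else now
    lag = now - local_follower["last_applied_ts"]
    if lag <= max_staleness_s:
        return local_follower["node_id"]  # fast local read, bounded staleness
    return leader["node_id"]              # cross-region read, fully fresh
```

The key property is that the client never observes data older than `max_staleness_s`, while the common case avoids the 50-100ms cross-region round trip entirely.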
Related concepts