
Bounded Load Consistent Hashing

How do we handle load imbalance without detecting individual hot keys? Google published bounded-load consistent hashing in 2017: cap each node's load at (1 + ε) × average_load.
With ε = 0.25, no node exceeds 125% of average load. This requires real-time load tracking, e.g., via Redis INCR/DECR per request or a gossip protocol.
When a key hashes to an overloaded node, the ring walk continues clockwise to the next node below the cap.
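The cap check and clockwise walk can be sketched as follows. This is a minimal in-process illustration, not Google's implementation: the class name, virtual-node count, and hash choice are illustrative, and a real deployment would track load externally (e.g., the Redis counters mentioned above).

```python
import hashlib
import math
from bisect import bisect_right

class BoundedLoadRing:
    """Sketch of consistent hashing with bounded loads (illustrative)."""

    def __init__(self, nodes, epsilon=0.25, vnodes=100):
        self.epsilon = epsilon
        self.load = {n: 0 for n in nodes}
        # Virtual nodes smooth out the ring, as in plain consistent hashing.
        self.ring = sorted(
            (self._hash(f"{n}#{i}"), n)
            for n in nodes for i in range(vnodes)
        )
        self.points = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def _cap(self):
        # Capacity cap = ceil((1 + epsilon) * average load), counting the
        # incoming request, so the cap is at least 1 even on an idle ring.
        total = sum(self.load.values()) + 1
        return math.ceil((1 + self.epsilon) * total / len(self.load))

    def assign(self, key):
        """Hash the key onto the ring, then walk clockwise to the first
        node whose load is still below the cap."""
        cap = self._cap()
        idx = bisect_right(self.points, self._hash(key)) % len(self.ring)
        for step in range(len(self.ring)):
            node = self.ring[(idx + step) % len(self.ring)][1]
            if self.load[node] + 1 <= cap:
                self.load[node] += 1
                return node
        raise RuntimeError("no node below cap")  # unreachable: cap >= 1

    def release(self, node):
        # Call when the request completes (mirrors Redis DECR in the text).
        self.load[node] -= 1
```

Because the cap is recomputed from the live total, the structure is reactive: the same key can land on different nodes depending on current load, which is exactly the source of the hit-ratio trade-off described below.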
Vimeo adopted it for their CDN cache layer, reducing miss rates by 50% during spikes. Why not just use hot key splitting?
Because bounded load handles all keys uniformly without per-key metadata. It is reactive (responds to actual load) rather than proactive.
Trade-off: the key-to-node mapping changes dynamically with load, which typically reduces the cache hit ratio by 2-5% compared to a static ring assignment.
Why it matters in interviews
Mentioning bounded load as an alternative to hot key splitting demonstrates depth. Interviewers value knowing the trade-off: uniform key handling at the cost of 2-5% hit ratio reduction.