
Cache Stampede

A celebrity tweets a short URL at 2 AM. 50,000 users click it simultaneously. The Redis entry for that URL expired 3 seconds ago.
The database buckles, queries back up, timeouts cascade, and now every short URL in that shard is unreachable. This is a cache stampede: a popular key expires and the resulting flood of concurrent cache misses overwhelms the backing data store.
All 50,000 requests miss the cache at the same instant and slam into MySQL, which can handle maybe 10,000 simple lookups per second on a single node.
We mitigate this with three layered defenses, each addressing a different failure vector. First, request coalescing (also called single-flight): only one request fetches from the database while the other 49,999 wait for that single result.
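The single-flight idea can be sketched with a per-key wait/notify structure. This is a minimal illustrative sketch, not a production library: the class name, the `fetch` callback, and the threading-based design are all assumptions, and error propagation to waiters is omitted for brevity.

```python
import threading

class SingleFlight:
    """Coalesce concurrent fetches for the same key: the first caller
    (the leader) does the real work; later callers for the same key
    wait on an event and share the leader's result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> (done_event, result_box)

    def do(self, key, fetch):
        with self._lock:
            entry = self._in_flight.get(key)
            if entry is None:
                # No fetch in flight for this key: we become the leader.
                entry = (threading.Event(), {})
                self._in_flight[key] = entry
                leader = True
            else:
                leader = False
        event, box = entry
        if leader:
            try:
                box["value"] = fetch(key)  # the single database hit
            finally:
                with self._lock:
                    del self._in_flight[key]
                event.set()  # wake every waiter
            return box["value"]
        event.wait()  # follower: block until the leader finishes
        return box["value"]
```

Under a simultaneous burst, every request that arrives while the leader's fetch is still running shares that one result, so the database sees roughly one query per expired key rather than one per request.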
This prevents duplicate work but still allows one database hit per expired key. Second, stale-while-revalidate: we serve the expired cached value immediately while a background thread refreshes it, so requests for already-cached keys never pay cache-miss latency (only a truly cold key blocks on the database).
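A stale-while-revalidate cache can be sketched as follows. This is an illustrative in-process sketch under stated assumptions: the class name, the `fetch` callback, the TTL values, and the use of a daemon refresh thread are all choices made for the example, not a prescribed implementation.

```python
import threading
import time

class SWRCache:
    """Stale-while-revalidate: serve the cached value even past its TTL,
    and refresh it in the background so callers don't wait on a miss."""

    def __init__(self, fetch, ttl=60.0):
        self._fetch = fetch          # callback that hits the database
        self._ttl = ttl              # freshness window in seconds
        self._lock = threading.Lock()
        self._data = {}              # key -> (value, fresh_until)
        self._refreshing = set()     # keys with a refresh in flight

    def get(self, key):
        now = time.monotonic()
        with self._lock:
            entry = self._data.get(key)
            if entry is not None:
                value, fresh_until = entry
                if now < fresh_until:
                    return value     # fresh hit
                if key not in self._refreshing:
                    # Stale hit: start exactly one background refresh.
                    self._refreshing.add(key)
                    threading.Thread(
                        target=self._refresh, args=(key,), daemon=True
                    ).start()
                return value         # serve stale immediately
        # Cold miss: the only case where a caller waits on the database.
        value = self._fetch(key)
        with self._lock:
            self._data[key] = (value, time.monotonic() + self._ttl)
        return value

    def _refresh(self, key):
        try:
            value = self._fetch(key)
            with self._lock:
                self._data[key] = (value, time.monotonic() + self._ttl)
        finally:
            with self._lock:
                self._refreshing.discard(key)
```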
Trade-off: we briefly serve stale data (up to one TTL window old) in exchange for zero cache-miss latency. Third, TTL jitter: we randomize each key's expiration by plus or minus 10%, preventing thousands of keys from expiring at the same second.
Without jitter, batch-created URLs all expire simultaneously, creating a coordinated stampede even without celebrity traffic. Implication: with all three mitigations, the database never sees more than one concurrent read per expired key, turning a 50,000x amplification into a 1x load.
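The jitter calculation itself is one line. A minimal sketch, assuming a base TTL of one hour and a ±10% spread (both values are illustrative):

```python
import random

def jittered_ttl(base_ttl=3600.0, jitter=0.10):
    """Randomize each key's TTL by up to ±10% so keys created together
    (e.g. batch-imported URLs) don't all expire in the same second."""
    return base_ttl * random.uniform(1.0 - jitter, 1.0 + jitter)
```

Each key gets a TTL drawn uniformly from [3240, 3960] seconds, spreading a batch's expirations across a 12-minute window instead of a single instant.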
What if the interviewer asks: why not set TTLs to infinity? Because stale data accumulates indefinitely, and Redis memory is finite.
We need expiration to reclaim cache space for active URLs.

Formula & trade-offs

Formula
\text{effective TTL} = \text{base TTL} \pm 10\%
Why it matters in interviews
Cache stampedes are among the most common causes of database outages in read-heavy systems. Naming request coalescing, stale-while-revalidate, and TTL jitter as distinct mitigations, each solving a different failure vector, shows we understand production failure patterns rather than just textbook caching theory.