STANDARDwalkthrough

Two-Stage Aggregation: Surviving the Viral Ad

4 of 8
3 related
Keyed aggregation has a built-in assassin: the hot key. The stream partitions work by ad_id so each ad's counter lives on exactly one worker: correct, and fatal the moment one ad goes viral.
The fix is the oldest trick in distributed computing, wearing streaming clothes: two-stage aggregation: MapReduce's combiner, alive and well. Stage one runs keyed by (ad_id, random_salt): the salt (0-15) fans the hot ad across 16 workers, each maintaining a partial count for its slice and emitting partials every second or so. Stage two runs keyed by plain ad_id: it receives at most 16 partial updates per ad per second: regardless of whether the ad takes ten clicks or a million: and folds them into the true counter. The math is the argument: stage two's per-ad load is bounded by salt_count x emit_frequency, not by click volume: the viral ad costs stage two 16 messages a second, same as a sleepy one.
A Super Bowl spot drawing 100K clicks/sec routes every one of those events to the single worker owning that key, while its 199 peers idle: the pipeline's throughput collapses to one machine's capacity, and repartitioning does not help because the skew IS the key.
Costs, stated honestly: a second shuffle hop adds ~a second of latency; partial emissions multiply message volume for cold keys (a one-click ad now sends one partial: same cost: the overhead only pays off on hot keys, which is why adaptive implementations salt only keys detected hot); and counts are associative, which is the property that makes the whole trick legal: partial sums merge losslessly. That last point is the interview depth: two-stage works for sums, counts, maxes: and needs care for distinct-count (HyperLogLog sketches merge; raw distinct sets explode) or averages (ship (sum, count) pairs, never averages of averages).
What if the interviewer asks: why not just cache the hot ad's counter in the workers? Caching reads scales reads: this is a write hot spot, and writes must serialize somewhere: salting is how you make "somewhere" be sixteen places.
Why it matters in interviews
The hot-key melt is the streaming interview's favorite ambush, and salt -> partial -> merge is the canonical escape. Knowing which aggregates merge losslessly (and that HLL sketches restore mergeability for distinct-counts) turns a memorized fix into transferable competence.
Related concepts