TRICKYwalkthrough

The Budget Race: Where Latency Is Money

6 of 8
3 related
An advertiser sets a {1,000 daily budget}. Clicks cost 2.
This is the budget race, and it converts aggregation latency from a freshness metric into a dollar amount: if spend counters lag by 10 seconds, a viral campaign over-delivers by up to $10,000: ten times the budget: and the platform eats it, because billing an advertiser past their cap is a lawsuit. The design response is a dedicated fast lane distinct from both analytics paths: a budget service maintaining per-campaign spend counters updated from stage-two aggregates within 1-2 seconds, consulted by ad-serving on every impression decision.
The campaign goes viral at 500 clicks/sec: the budget exhausts in exactly one second: and every click served after exhaustion but before the system notices is money someone loses.
Its properties invert the analytics paths': it may be slightly wrong (a duplicate inflates spend by a click: fine, err toward stopping early) but must be fast and must fail closed for money: if the budget service is unreachable, high-spend campaigns pause rather than run blind: the asymmetry being that under-delivery costs goodwill while over-delivery costs cash. Even with a 2-second lane, physics leaves a residual: a truly viral campaign over-delivers by ~1,000 clicks: so the system layers predictive throttling: when spend velocity says the budget will exhaust within the lag horizon, serving begins probabilistic pacing early: deliberately slowing the final dollars: which advertisers rarely mind because smooth pacing beats a budget that detonates at 9:01 AM anyway.
The remaining overage is a published tolerance (industry-standard: platforms absorb overage beyond the cap), and finance sees it as a line item: an SLO priced in dollars per incident, the clearest example in this course of a latency budget with a literal invoice. What if the interviewer asks: why not check the budget synchronously per click?
Ad serving decides in <10ms across millions of QPS: a synchronous counter would be the hottest key on the platform: the fast lane IS the synchronous check, kept answerable by keeping it near-real-time.
Why it matters in interviews
The budget race translates "aggregation lag" into dollars per second: the most concrete stakes any streaming interview offers. Fail-closed-for-money, predictive pacing, and overage-as-priced-SLO show product judgment layered on the systems answer.
Related concepts