Timing Wheels and the In-Memory Window

2 of 8

3 related

Once a scheduler instance has leased a bucket and loaded ~700 MB of upcoming tasks, it needs an in-memory structure that answers "what fires THIS second" millions of times without breaking a sweat. Option one: a min-heap keyed by fire time: pop the root whenever its time arrives.

Option two, the classic at scale: the timing wheel. Picture a clock face with one slot per second (say 300 slots for a 5-minute window): each slot holds a plain list of the tasks firing in that second.

“Simple, correct, O(log n) per operation: perfectly fine for hundreds of thousands of timers, and many production systems stop here.”

Insertion is O(1): compute the slot index, append. Firing is O(1) per task: every tick, grab the current slot's list, fire everything, advance the pointer.

No comparisons, no heap rebalancing: just array indexing, which is why Kafka's purgatory and Netty's timers are timing wheels. For windows longer than the wheel, hierarchical wheels cascade: an hours wheel feeds a minutes wheel feeds the seconds wheel, tasks migrating inward as their time approaches: the same digits-of-a-clock idea that makes the structure feel inevitable once seen.

The trade-offs are honest: a wheel's granularity is fixed (a 1-second wheel cannot fire at 100ms precision: fine for us, our SLO is seconds), empty slots still tick (trivial cost), and the wheel is volatile: which is exactly why it is only a serving structure: the durable truth stays in the bucket store, and a crashed instance's replacement reloads the bucket and replays, using firing checkpoints to skip what was already dispatched. Choosing between heap and wheel in an interview: name the load.

At 11.6K average firings/sec with million-task top-of-minute spikes, O(1) insertion under a burst of reschedules is worth having; below ~100K timers, the heap's simplicity wins. What if the interviewer asks: why keep it in memory at all?

Because firing is latency-sensitive (the never-early, barely-late contract) and the alternative: hitting storage per firing: rebuilds the polling problem the buckets just solved.

Why it matters in interviews

The timing wheel is a classic structure interviewers love to hear derived, not name-dropped: slots, O(1) insert, tick-and-fire, hierarchical cascade. Pairing it with the heap alternative and a load-based decision rule shows judgment over memorization.

Related concepts

← PreviousTime-Bucket Partitioning: Finding What Fires Now Next →At-Least-Once Firing with Idempotent Handlers