Priority Queue Isolation

It is Sunday evening and marketing launches a campaign to 100 million recipients. At the same moment, a user tries to log in to their bank app and waits for an OTP.

If they share one queue, the OTP sits behind 100 million promotional pushes and arrives in 20 minutes, long after the login screen timed out. How do we guarantee the OTP arrives in under 5 seconds no matter what marketing is doing?

“Both messages enter the same notification pipeline.”

Three options.

Option one: one shared queue with priority fields. Workers read the priority field and process urgent messages first. Sounds right, fails in practice: Kafka partitions are consumed in order, so a priority field buried inside 100M queued messages does nothing until a consumer reaches it.
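The failure of option one can be made concrete with a minimal sketch. It models a single partition as a Python list read strictly in offset order; the message shape, the scaled-down backlog, and the throughput figure are illustrative assumptions, not values from a real system.

```python
def consume_in_order(partition):
    """Yield (offset, message) strictly in offset order, as a log consumer does."""
    for offset, message in enumerate(partition):
        yield offset, message


def offsets_read_before_urgent(partition):
    """Count the messages a consumer must read before it first sees priority 0."""
    for offset, message in consume_in_order(partition):
        if message["priority"] == 0:
            return offset
    return None


def wait_seconds(messages_ahead, throughput_per_second):
    """Rough queueing delay: everything ahead must be consumed first."""
    return messages_ahead / throughput_per_second


# Scaled-down backlog (assumption): 100_000 promos stand in for the 100M campaign.
BACKLOG = 100_000
partition = [{"type": "promo", "priority": 2} for _ in range(BACKLOG)]
partition.append({"type": "otp", "priority": 0})  # the OTP arrives last

ahead = offsets_read_before_urgent(partition)
print(f"messages read before the OTP is even visible: {ahead}")

# Hypothetical consumer throughput of 100_000 msg/s against the full 100M backlog.
print(f"estimated wait: {wait_seconds(100_000_000, 100_000):.0f} s")
```

The priority field is present on every message, but the consumer cannot act on it until it reaches the OTP's offset, so the wait is set by the backlog size alone.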

Option two: bigger worker pool. Scale consumers until the backlog drains fast. During a campaign burst you would need 30x capacity that sits idle the rest of the week, and a second concurrent campaign breaks it again.

Option three: physically separate queues per priority tier, each with its own dedicated worker pool.
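As a rough in-process analogy for option three, the sketch below gives each lane its own queue and its own worker thread, so a flooded bulk lane cannot delay the urgent one. It uses Python's standard `queue` and `threading` modules in place of Kafka topics and consumer groups; the lane names, handlers, and the 2 ms send delay are assumptions made for illustration.

```python
import queue
import threading
import time


def start_lane(name, handler, workers=1):
    """Create one queue plus a worker pool that reads only from that queue."""
    lane = queue.Queue()

    def run():
        while True:
            message = lane.get()
            try:
                handler(message)
            finally:
                lane.task_done()

    for i in range(workers):
        threading.Thread(target=run, name=f"{name}-{i}", daemon=True).start()
    return lane


delivered = []  # delivery order across both lanes


def send_bulk(message):
    time.sleep(0.002)  # stand-in for a slow provider call (assumption)
    delivered.append(message)


def send_urgent(message):
    delivered.append(message)


bulk_lane = start_lane("bulk", send_bulk)
urgent_lane = start_lane("urgent", send_urgent)

# Flood the bulk lane first, then enqueue a single OTP on the urgent lane.
for i in range(100):
    bulk_lane.put(f"promo-{i}")
urgent_lane.put("otp-1")

urgent_lane.join()  # returns as soon as the OTP has been handled
bulk_pending_when_otp_done = bulk_lane.qsize()
print(f"bulk messages still queued when the OTP was delivered: {bulk_pending_when_otp_done}")

bulk_lane.join()  # let the campaign finish draining
```

The urgent worker never calls `get()` on the bulk queue, which is the property the separate topic and dedicated consumer group provide at production scale.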

P0 transactional (OTP, security alerts, ride arriving): 50M/day, its own Kafka topic, workers never touch anything else, p99 under 5 seconds.

P1 engagement (mentions, direct messages): 3B/day, target 30 seconds.

P2 marketing and digests: 7B/day, batched, allowed to lag minutes.

We chose physical isolation.
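One way to write the tiers down is a small routing table that maps every notification type to a tier and every tier to its own topic. This is a sketch only: the topic names, type identifiers, and the `topic_for` helper are hypothetical, and the latency targets are the ones stated above. The lookup deliberately raises on any type that nobody has classified, rather than silently falling back to a default lane.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    topic: str                         # hypothetical topic name
    latency_target_s: Optional[float]  # None means "allowed to lag minutes"


TIERS = {
    "P0": Tier("P0", "notifications.p0.transactional", 5.0),
    "P1": Tier("P1", "notifications.p1.engagement", 30.0),
    "P2": Tier("P2", "notifications.p2.marketing", None),
}

# Every notification type must be classified up front (type names are illustrative).
TYPE_TO_TIER = {
    "otp": "P0",
    "security_alert": "P0",
    "ride_arriving": "P0",
    "mention": "P1",
    "direct_message": "P1",
    "marketing_campaign": "P2",
    "digest": "P2",
}


def topic_for(notification_type):
    """Return the topic for a notification type, refusing unclassified types."""
    try:
        tier_name = TYPE_TO_TIER[notification_type]
    except KeyError:
        raise ValueError(
            f"unclassified notification type: {notification_type!r}"
        ) from None
    return TIERS[tier_name].topic


print(topic_for("otp"))
print(topic_for("digest"))
```

Failing loudly on an unknown type forces the classification decision at development time instead of leaving a new notification to land in whichever lane happens to be the default.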

The campaign can back up the P2 topic by 40 minutes and the P0 lane stays empty. The trade-off: three topics and three consumer groups to operate, and engineers must classify every notification type up front. Misclassify a payment receipt as P2 and it queues behind campaigns.

What if the interviewer asks: why not preemption instead of isolation?

Preempting mid-stream in a log-based queue means re-queueing work you already read. Isolation makes starvation structurally impossible instead of operationally managed.

Why it matters in interviews

This is the single most common failure probed in notification interviews: head-of-line blocking of transactional messages behind bulk sends. Proposing physically isolated priority tiers with per-tier worker pools and latency targets (5s / 30s / minutes) shows systems maturity, not just queue vocabulary.
