TRICKY walkthrough
At-Least-Once Firing with Idempotent Handlers
A scheduler instance loads its bucket, fires 40,000 tasks, and dies mid-window. Its lease expires; a replacement takes over the bucket.
“What happens to the 40,000 already fired, and the 3.4 million not yet fired?”
This is the delivery-semantics question wearing a clock, and the answer mirrors the notification topic: exactly-once firing is unachievable across a crash boundary (the instance cannot atomically fire into Kafka AND record the checkpoint), so the system promises at-least-once and makes duplicates harmless. The mechanics, in order:
1. Fire = enqueue. "Firing" a task means publishing an execution message to the work queue (Kafka), never executing inline. The scheduler is a dispatcher, not a worker, so its per-task cost stays in the microseconds.
2. Checkpoint after enqueue. The instance records firing progress (a per-bucket high-water mark plus a small recent-ID set) only AFTER the enqueue succeeds. A crash between enqueue and checkpoint therefore means the replacement re-fires a small suffix: duplicates, not losses.
3. Replay on takeover. The new lease holder reloads the bucket, skips everything below the checkpoint, and re-fires the uncertain tail. In the scenario above, the 40,000 already fired are skipped up to the checkpoint, and the 3.4 million not yet fired are fired as usual.
4. Idempotency downstream. Every execution message carries a deterministic key, hash(task_id, scheduled_fire_time), and workers dedupe exactly the way the notification gateways did (Redis SETNX with TTL), so the re-fired suffix executes once as far as the world can tell.
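The enqueue, checkpoint, and replay steps can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: a Python list stands in for the Kafka work queue, a dict stands in for durable checkpoint storage, and run_instance, checkpoint_every, and crash_after are hypothetical names that do not come from any real scheduler. The recent-ID set is left out to keep the sketch short.

```python
from collections import Counter


def run_instance(bucket, queue, checkpoint, checkpoint_every=3, crash_after=None):
    """Fire tasks from the checkpoint onward; return False if the instance 'crashed'."""
    # Replay rule: skip everything at or below the high-water mark.
    for i in range(checkpoint["high_water"] + 1, len(bucket)):
        queue.append(bucket[i])                  # fire = enqueue (list stands in for Kafka)
        if i == crash_after:
            return False                         # simulated crash: enqueued, not checkpointed
        # Checkpoint only AFTER the enqueue succeeded, batched every few tasks.
        if (i + 1) % checkpoint_every == 0 or i == len(bucket) - 1:
            checkpoint["high_water"] = i
    return True


bucket = [f"t{i}" for i in range(10)]
queue = []                                       # shared work queue
checkpoint = {"high_water": -1}                  # durable per-bucket progress

run_instance(bucket, queue, checkpoint, crash_after=4)   # original instance dies mid-window
run_instance(bucket, queue, checkpoint)                  # replacement takes over the bucket

counts = Counter(queue)
duplicates = sorted(t for t, n in counts.items() if n > 1)
lost = sorted(set(bucket) - set(queue))
print("duplicates:", duplicates)   # duplicates: ['t3', 't4']
print("lost:", lost)               # lost: []
```

The crash lands after t4 is enqueued but before the next checkpoint, so the replacement resumes from the last recorded mark (t2) and re-fires t3 and t4. Nothing is lost; the duplicated suffix is what the downstream idempotency key has to absorb.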
The recurring-task subtlety: the key includes the scheduled fire time, so Tuesday's run and Wednesday's run of the same cron are different idempotency keys, while two firings of Tuesday's run are the same. And the fencing rule from the ID topic applies verbatim: a lease holder that cannot renew stops firing before expiry, so two instances never fire the same bucket concurrently.
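A minimal sketch of the key and the worker-side dedupe, under stated assumptions: SHA-256 is one possible choice for the hash (the text does not name one), and DedupeStore is a hypothetical in-memory stand-in for the Redis set-if-absent-with-TTL behaviour, not a Redis client. The task ID, dates, and TTL are made up for illustration.

```python
import hashlib
from datetime import datetime, timezone


def idempotency_key(task_id: str, scheduled_fire_time: datetime) -> str:
    """Deterministic key: same task and same scheduled slot give the same key."""
    raw = f"{task_id}|{scheduled_fire_time.isoformat()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class DedupeStore:
    """In-memory stand-in for a set-if-absent with TTL (assumption, not Redis itself)."""

    def __init__(self):
        self._expires_at = {}

    def set_if_absent(self, key, ttl_seconds, now):
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry > now:
            return False                      # key still live: this is a duplicate
        self._expires_at[key] = now + ttl_seconds
        return True                           # first sighting of this key


def handle(store, executed, task_id, scheduled_fire_time, now):
    """Worker entry point: execute only if the key has not been seen."""
    key = idempotency_key(task_id, scheduled_fire_time)
    if store.set_if_absent(key, ttl_seconds=86_400, now=now):
        executed.append((task_id, scheduled_fire_time))
        return True
    return False


tuesday = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
wednesday = datetime(2024, 1, 3, 9, 0, tzinfo=timezone.utc)

store, executed = DedupeStore(), []
results = [
    handle(store, executed, "cron-42", tuesday, now=0.0),     # Tuesday's run
    handle(store, executed, "cron-42", tuesday, now=5.0),     # re-fired after a crash
    handle(store, executed, "cron-42", wednesday, now=10.0),  # Wednesday's run
]
print(results)   # [True, False, True]
```

Because the scheduled fire time is part of the key, the re-fired Tuesday message is rejected while Wednesday's run of the same cron goes through.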
What if the interviewer asks: why not at-most-once (checkpoint before enqueue)? Because silently dropping a payment retry or a medication reminder is strictly worse than running it twice against an idempotent handler. It is the same loud-versus-wrong choice that every topic in this course keeps making.
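The difference between the two orderings can be seen in a toy comparison. This is a sketch under assumptions, not a real scheduler: the function fire and its parameters are hypothetical, the checkpoint is a single in-memory index, and the crash is simulated at exactly the point between the two steps.

```python
def fire(tasks, checkpoint_first, crash_at):
    """Run once with a crash at index crash_at, then replay from the checkpoint."""
    queue = []          # stand-in for the work queue
    next_index = 0      # stand-in for the durable checkpoint

    def attempt(crash_index):
        nonlocal next_index
        while next_index < len(tasks):
            i = next_index
            if checkpoint_first:
                next_index = i + 1        # at-most-once: checkpoint, then enqueue
                if i == crash_index:
                    return                # crash between the two steps
                queue.append(tasks[i])
            else:
                queue.append(tasks[i])    # at-least-once: enqueue, then checkpoint
                if i == crash_index:
                    return                # crash between the two steps
                next_index = i + 1

    attempt(crash_at)   # original instance dies
    attempt(None)       # replacement resumes from the checkpoint
    return queue


tasks = ["a", "b", "c", "d"]
at_most_once = fire(tasks, checkpoint_first=True, crash_at=2)
at_least_once = fire(tasks, checkpoint_first=False, crash_at=2)
print(at_most_once)    # ['a', 'b', 'd']
print(at_least_once)   # ['a', 'b', 'c', 'c', 'd']
```

With checkpoint-first, task c vanishes without any signal. With enqueue-first, c appears twice, which an idempotent handler can absorb.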
Related concepts