
Worker ID Assignment Without Collisions

Ten bits of worker ID keep 1,024 machines from colliding, but only if no two machines ever hold the same worker ID at the same time. Assigning those IDs is the one place this "coordination-free" design needs coordination; it is simply moved off the hot path to startup.
The wrong ways first, because they ship constantly. Hardcoded config: someone copy-pastes a deployment manifest and two pods boot with worker ID 7; every millisecond they overlap can produce duplicate IDs. Derive from IP/MAC: works until DHCP reassigns an address, containers share a host network, or two VMs are cloned from the same image.
The production-grade ways. Option one: ZooKeeper/etcd ephemeral leases.
On boot, a worker atomically claims an unused ID znode with a session lease; if the worker dies, the lease expires and the ID returns to the pool. The subtle danger is reuse-too-fast: worker 7 GC-pauses, its lease expires, a new node claims 7, then old-7 wakes and keeps generating as 7.
The fix is fencing: a worker that cannot renew its lease must stop generating BEFORE the lease expires (generation deadline = lease expiry minus a safety margin), the same fencing discipline as the lease-based splits in our control-plane topic.
Option two: Kubernetes StatefulSet ordinals. Pod-N takes worker ID N; the orchestrator guarantees at most one pod-N exists, and you inherit that guarantee for free, which makes this the pragmatic modern choice.
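The StatefulSet-ordinal option can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: it assumes the pod name follows the StatefulSet convention `<name>-<ordinal>`, and the function name `worker_id_from_pod_name` and the example name `idgen` are hypothetical.

```python
import re

WORKER_BITS = 10
MAX_WORKERS = 1 << WORKER_BITS  # 1,024 worker IDs: 0..1023


def worker_id_from_pod_name(pod_name: str) -> int:
    """Use the trailing StatefulSet ordinal of the pod name as the worker ID."""
    match = re.search(r"-(\d+)$", pod_name)
    if match is None:
        # Refuse to start rather than guess an ID: a wrong guess means collisions.
        raise ValueError(f"no StatefulSet ordinal in pod name {pod_name!r}")
    ordinal = int(match.group(1))
    if ordinal >= MAX_WORKERS:
        # Scaling past 1,024 replicas would overflow the 10 worker bits.
        raise ValueError(f"ordinal {ordinal} does not fit in {WORKER_BITS} worker bits")
    return ordinal
```

In a pod, the name would typically come from the hostname (for example `socket.gethostname()`), so `idgen-7` yields worker ID 7. Failing hard on a missing or oversized ordinal is deliberate: a generator that cannot prove its ID is unique should not generate at all.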
Option three: registration in a database with a heartbeat column, which means simpler infrastructure but coarser failure handling.
The numbers worth saying: lease TTL ~10s, renewal every 3s, generation deadline at TTL minus 2s; a worker that loses ZooKeeper keeps generating only until its deadline, then parks.
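The fencing deadline with those numbers can be sketched as follows. This is a minimal sketch under stated assumptions: the class `FencedLease` and its methods are hypothetical, the actual renewal call to ZooKeeper/etcd is left out, and deadlines are measured on a local monotonic clock from the moment the renewal request was sent, which is conservative because the coordinator cannot start the TTL before it receives the request.

```python
import time
from typing import Optional

LEASE_TTL_S = 10.0     # lease TTL from the text (~10s)
SAFETY_MARGIN_S = 2.0  # generation deadline = TTL minus 2s


class FencedLease:
    """Tracks the local deadline after which this worker must stop generating."""

    def __init__(self, worker_id: int, request_sent_at: float):
        self.worker_id = worker_id
        self.deadline = request_sent_at + LEASE_TTL_S - SAFETY_MARGIN_S

    def on_renewed(self, request_sent_at: float) -> None:
        # Call only after the coordinator acknowledged the renewal.
        new_deadline = request_sent_at + LEASE_TTL_S - SAFETY_MARGIN_S
        self.deadline = max(self.deadline, new_deadline)

    def may_generate(self, now: Optional[float] = None) -> bool:
        # Check on EVERY ID request: a worker waking from a long pause
        # sees an expired deadline and parks instead of generating.
        if now is None:
            now = time.monotonic()
        return now < self.deadline


lease = FencedLease(worker_id=7, request_sent_at=100.0)
print(lease.may_generate(now=107.9))  # inside the 8s generation window
print(lease.may_generate(now=108.0))  # deadline reached: park
lease.on_renewed(request_sent_at=103.0)
print(lease.may_generate(now=108.0))  # renewal pushed the deadline to 111.0
```

With a renewal every 3s and a deadline at 8s, a worker gets two renewal attempts (at 3s and 6s) before it has to park, and the lease itself is only free for reuse 2s after the old holder has gone quiet.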
And startup at scale: 1,024 IDs is a hard cap, so autoscaling groups need ID-pool monitoring; running out of worker IDs is a quiet way to block scale-out.
What if the interviewer asks: why not just use more worker bits? Every bit added is a bit taken from sequence or timestamp, which takes you back to the 64-bit budget.
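The bit-budget trade-off is easy to put in numbers. The sketch below assumes the classic Snowflake-style split of 1 unused sign bit, 41 bits of millisecond timestamp, 10 worker bits, and 12 sequence bits; only the 10 worker bits and the 64-bit total come from the text above, and the helper `budget` is hypothetical.

```python
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365


def budget(timestamp_bits: int, worker_bits: int, sequence_bits: int) -> dict:
    """Capacity implied by one way of splitting the 63 usable bits."""
    if 1 + timestamp_bits + worker_bits + sequence_bits != 64:
        raise ValueError("fields plus the sign bit must total 64 bits")
    return {
        "workers": 1 << worker_bits,
        "ids_per_ms_per_worker": 1 << sequence_bits,
        "years_of_timestamps": (1 << timestamp_bits) / MS_PER_YEAR,
    }


baseline = budget(41, 10, 12)         # assumed classic layout
from_sequence = budget(41, 12, 10)    # 2 extra worker bits taken from sequence
from_timestamp = budget(39, 12, 12)   # 2 extra worker bits taken from timestamp

for name, b in [("baseline", baseline),
                ("from_sequence", from_sequence),
                ("from_timestamp", from_timestamp)]:
    print(name, b)
```

Under these assumptions, two extra worker bits quadruple the pool to 4,096 IDs, but they either cut per-worker throughput from 4,096 to 1,024 IDs per millisecond or shrink the usable timestamp range from roughly 70 years to roughly 17.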
Why it matters in interviews
Worker assignment is where candidates hand-wave ("use ZooKeeper") and interviewers dig. Knowing the lease + fencing deadline mechanics, and the StatefulSet-ordinal shortcut, shows you have thought about the GC-pause double-worker scenario, the topic's version of split-brain.