STANDARDwalkthrough
Instagram ID Generation
At 200M uploads/day (~2,300 writes/sec), we need globally unique, time-sortable IDs without a coordination bottleneck. We chose in-database PL/pgSQL generation (not a centralized service like Twitter's Snowflake) because it eliminates a network hop and a single point of failure.
Each PostgreSQL shard generates IDs using a local sequence, so there is zero cross-shard coordination. Because the timestamp occupies the most significant bits, IDs are naturally time-sortable: a numeric comparison tells you which photo came first without querying a created_at column.
“The 64-bit ID layout: 41 bits for millisecond timestamp since a custom epoch (January 1, 2011, good for 69 years), 13 bits for a logical shard ID (supporting 8,192 shards), and 10 bits for an auto-incrementing sequence (1,024 IDs per millisecond per shard).”
Trade-off: we accept coupling to PostgreSQL. If we ever migrated away from Postgres, we would need to rewrite the ID generator.
Snowflake avoids this coupling but requires deploying and operating a separate high-availability service. What if the interviewer asks: what happens when the sequence overflows 1,024 per millisecond on a single shard?
The shard stalls until the next millisecond. At peak load, 2,300 writes/sec across 8,192 shards means each shard averages 0.28 writes/ms, far below the 1,024 ceiling.
Related concepts