URL Shortener (VERY COMMON)
The URL shortener comes up in almost every system design round. It is how Bitly serves 10 billion redirects per month with sub-50ms latency. You will design a counter-based ID generator, a Redis cache layer for 289K reads per second, and a sharded MySQL backend that grows at roughly 5 TB per year.
- Design a counter-based ID generator with range pre-allocation
- Build a cache-first read path handling 867K peak RPS
- Avoid the cache stampede that kills most URL shortener designs
Visual Solutions
Step-by-step animated walkthroughs with capacity estimation, API design, database schema, and failure modes built in.
Cheat Sheet
Key concepts, trade-offs, and quick-reference notes for URL Shortener. Everything you need at a glance.
Anti-Patterns
Common design mistakes candidates make. Wrong approaches vs correct approaches for each trap.
Failure Modes
What breaks in production, how to detect it, and how to fix it. Detection metrics, mitigations, and severity ratings.
Start simple. Build to staff-level.
“I would design a URL shortener for 500M DAU using counter-based ID generation with range pre-allocation. Each app server grabs 10,000 IDs from ZooKeeper and increments locally with zero coordination. Base62 encoding gives us 62^7 = 3.5 trillion codes, enough for 96 years at 100M URLs per day. The 10:1 read-to-write ratio (289K vs 29K RPS) makes this cache-first. Redis absorbs 95%+ of reads. Each record averages 131 bytes, so storage grows at 5 TB per year. The urls table is sharded by short_code hash across 16 shards. Click analytics flow through Kafka so redirects stay under 50ms at p99.”
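A minimal sketch of the range pre-allocation step described above. The coordination service (ZooKeeper in the answer) is abstracted behind a hypothetical fetch_range callable; the name and interface are assumptions for illustration:

```python
import threading

class RangeAllocatingIdGenerator:
    """Issues unique IDs from pre-allocated ranges.

    fetch_range(block_size) stands in for the coordination service
    (e.g. a ZooKeeper counter): it must atomically reserve the next
    block of IDs and return the block's starting value.
    """

    def __init__(self, fetch_range, block_size: int = 10_000):
        self._fetch_range = fetch_range
        self._block_size = block_size
        self._next_id = 0
        self._range_end = 0  # exclusive end of the current local block
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            if self._next_id >= self._range_end:
                # One coordination round trip buys 10,000 local IDs.
                start = self._fetch_range(self._block_size)
                self._next_id = start
                self._range_end = start + self._block_size
            issued = self._next_id
            self._next_id += 1
            return issued

# Stand-in allocator: a real deployment would hit ZooKeeper here.
_counter = 0
def fetch_range(size: int) -> int:
    global _counter
    start = _counter
    _counter += size
    return start

gen = RangeAllocatingIdGenerator(fetch_range)
print(gen.next_id(), gen.next_id())  # 0 1
```

Only one ID in every 10,000 pays the coordination round trip; the rest are lock-protected local increments, which is the "zero coordination" claim in the answer.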
Base62 Encoding (EASY · Core Feature Design)
We chose 7 characters because 62^7 ≈ 3.5 trillion codes. At 100M/day, that lasts 96 years before we need to extend the code length.
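A sketch of the encoding itself. The alphabet ordering is an assumption; any fixed permutation of [0-9a-zA-Z] works as long as encode and decode agree:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int, width: int = 7) -> str:
    """Turn a counter value into a fixed-width short code."""
    if n < 0:
        raise ValueError("counter values are non-negative")
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    code = "".join(reversed(chars)) or ALPHABET[0]
    return code.rjust(width, ALPHABET[0])

# 62**7 = 3,521,614,606,208 codes; at 100M/day that is
# 3_521_614_606_208 / 100_000_000 / 365 ≈ 96 years of headroom.
print(encode_base62(3_521_614_606_207))  # "ZZZZZZZ", the last 7-char code
```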
301 vs 302 Redirects (EASY · What is it?)
We default to 302 (not 301) because we need analytics. 301 cuts server load by 60%, but the browser caches the redirect and we lose click visibility.
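A sketch of the redirect handler under these assumptions: Flask for routing, and an in-memory dict standing in for the cache-first lookup:

```python
from flask import Flask, redirect

app = Flask(__name__)
URLS = {"abc1234": "https://example.com/some/very/long/path"}  # stand-in store

@app.route("/<short_code>")
def follow(short_code: str):
    long_url = URLS[short_code]  # cache-first lookup in the real design
    # publish_click(short_code) -> Kafka would go here, off the hot path.
    # 302: the browser asks us again on every click, so analytics see each one.
    # 301: the browser caches the mapping and repeat clicks never reach us.
    return redirect(long_url, code=302)
```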
Consistent Hashing (STANDARD · High Level Design)
We chose consistent hashing (not modular hashing) for Redis because when a node dies, only 1/N of the keys need to remap. With modular hashing, losing one node remaps nearly every key and the whole cache misses at once.
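A compact ring sketch with virtual nodes. The md5-based hash and 100 vnodes per node are illustrative choices, not prescribed by the design:

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring with virtual nodes for smoother balance."""

    def __init__(self, nodes, vnodes: int = 100):
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # First vnode clockwise of the key's position, wrapping past the end.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]

ring = HashRing(["redis-1", "redis-2", "redis-3"])
print(ring.node_for("abc1234"))
# Dropping redis-2 moves only the keys that landed on its vnodes (~1/3);
# with modular hashing, hash(key) % 2 would reshuffle almost every key.
```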
Rate Limiting (STANDARD · System APIs)
We cap at 100 URLs/min/user via a sliding window in Redis (not a fixed window, which allows bursts at window boundaries). Without it, one bot can burn through our counter range in hours.
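A sliding-window sketch over a Redis sorted set, assuming the redis-py client; one sorted set per user holds a timestamped entry per request:

```python
import time
import uuid

import redis  # assumes the redis-py client

r = redis.Redis()

def allow_shorten(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    """Sliding-window check: at most `limit` creates per rolling minute."""
    key = f"rl:shorten:{user_id}"
    now = time.time()
    pipe = r.pipeline()
    pipe.zremrangebyscore(key, 0, now - window_s)  # evict entries older than 60s
    pipe.zadd(key, {uuid.uuid4().hex: now})        # record this request
    pipe.zcard(key)                                # size of the rolling window
    pipe.expire(key, window_s)                     # clean up idle users
    _, _, count, _ = pipe.execute()
    return count <= limit
```

Because the window slides with the clock, a user cannot stack 100 requests at 0:59 and another 100 at 1:01 the way a fixed per-minute window would allow.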
Read Replica (STANDARD · High Level Design)
We run 3 replicas that could split 289K RPS to ~96K each. But we chose replicas for redundancy, not throughput, because Redis already handles 95% of reads.
Write-Ahead Log (WAL) (STANDARD · Fault Tolerance)
We persist each counter range to a WAL before acknowledging it. If the counter service crashes, it skips ahead to the next unissued range on restart instead of reissuing IDs that would cause collisions.
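A file-backed sketch of the idea, assuming one "start size" line per issued range; recovery resumes after the last logged range:

```python
import os

class CounterWAL:
    """Logs each reserved range durably before the range is handed out."""

    def __init__(self, path: str):
        self._file = open(path, "a", encoding="ascii")

    def log_range(self, start: int, size: int) -> None:
        self._file.write(f"{start} {size}\n")
        self._file.flush()
        os.fsync(self._file.fileno())  # range is on disk before any ID leaves

    @staticmethod
    def recover_next_start(path: str) -> int:
        """On restart, resume after the last logged range, never inside it."""
        if not os.path.exists(path):
            return 0
        with open(path, encoding="ascii") as f:
            tokens = f.read().split()
        if not tokens:
            return 0
        start, size = int(tokens[-2]), int(tokens[-1])
        return start + size  # skip the whole range: safe, wastes at most one block
```

Skipping the whole range trades at most one block of 10,000 IDs per crash for a guarantee that no ID is ever issued twice.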
Cache Stampede (TRICKY · Fault Tolerance)
One expired key, 50K concurrent cache misses, database on fire. We chose request coalescing so one thread refills the cache while the rest wait, instead of all 50K hitting MySQL.
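A single-process coalescing sketch using threading primitives; in production this would sit in front of the Redis/MySQL path, but the waiting logic is the same:

```python
import threading

class CoalescingCache:
    """At most one loader per missing key; other callers wait for its result."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}
        self._inflight = {}  # key -> Event signalled when the refill lands
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            event = self._inflight.get(key)
            if event is None:
                event = threading.Event()
                self._inflight[key] = event
                is_loader = True   # this caller pays the single DB hit
            else:
                is_loader = False
        if is_loader:
            value = self._load(key)
            with self._lock:
                self._cache[key] = value
                del self._inflight[key]
            event.set()
            return value
        event.wait()  # the other 49,999 callers park here instead of on MySQL
        return self._cache[key]
```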
Bloom Filter (TRICKY · Core Feature Design)
We chose a Bloom filter (not a hash set) for dedup because 12 GB answers 'is this URL already shortened?' for 10B URLs. A hash set would need ~80 GB. Eliminates 85% of unnecessary DB lookups on writes.
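The 12 GB figure falls out of the standard Bloom filter sizing formula; a quick check at 10B URLs and an assumed 1% false-positive rate:

```python
import math

def bloom_sizing(n: int, p: float) -> tuple[int, int]:
    """Optimal bit count m and hash count k for n items at false-positive rate p."""
    m = math.ceil(-n * math.log(p) / math.log(2) ** 2)
    k = round(m / n * math.log(2))
    return m, k

m, k = bloom_sizing(10_000_000_000, 0.01)
print(f"{m / 8 / 1e9:.1f} GB, {k} hash functions")  # ≈ 12.0 GB, 7 hashes
```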