Visual Solution

Design a Complete URL Shortener

Design a URL shortening service like bit.ly from scratch. Cover requirements gathering, API design, encoding strategy, database schema, caching, scaling to 20K QPS reads, analytics, and monitoring. Walk through the complete architecture.

Solution Path (target: 25 min)
Design a complete URL shortener from requirements to production architecture. The key is recognizing the 100:1 read:write ratio, choosing Base62 for collision-free codes, caching aggressively, and sharding the database with consistent hashing.
01 Requirements and Estimation (1/4)
Start by clarifying requirements: 500M new URLs/month, 100:1 read:write ratio, 5-year retention. This gives us ~200 writes/sec, ~20K reads/sec, and ~15TB storage over 5 years.
200 writes/sec, 20K reads/sec, 15TB storage (5 years). Read-heavy system.
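The back-of-envelope numbers above can be checked with a few lines of arithmetic. This is a sketch; the ~500-byte record size (short code + long URL + metadata) is an assumption, not from the requirements.

```python
# Capacity estimation for 500M new URLs/month, 100:1 read:write, 5-year retention.
SECONDS_PER_MONTH = 30 * 24 * 3600          # ~2.6M seconds

new_urls_per_month = 500_000_000
writes_per_sec = new_urls_per_month / SECONDS_PER_MONTH   # ~193, round to ~200
reads_per_sec = 100 * writes_per_sec                      # 100:1 ratio -> ~19K, round to ~20K

records_over_5_years = new_urls_per_month * 12 * 5        # 30 billion rows
BYTES_PER_RECORD = 500                                    # assumed average row size
storage_bytes = records_over_5_years * BYTES_PER_RECORD   # 15 TB
```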
02 High-Level Architecture (2/4)
The architecture follows the request path: Client -> Load Balancer -> Stateless App Servers -> Redis Cache (cache-aside) -> Sharded Database. The write path encodes a counter to Base62. The read path checks cache first, falls back to DB.
LB -> App Servers -> Redis Cache -> Sharded DB. Base62 encoding. Cache-aside pattern.
03 Key Design Decisions (3/4) [KEY INSIGHT]
Three critical decisions: (1) Base62 counter for zero-collision short codes, (2) Cache-aside with Redis for 80%+ hit rate (Zipf distribution), (3) Hash-based sharding with consistent hashing for horizontal DB scaling. Analytics via async Kafka pipeline.
Base62 encoding + Redis cache-aside + consistent hash sharding + async analytics
04 Complete Architecture (4/4)
The final architecture handles 20K reads/sec with Redis absorbing 80%+ of traffic. The DB is sharded across multiple nodes using consistent hashing. A separate analytics pipeline (Kafka -> consumer -> analytics DB) tracks clicks asynchronously without impacting redirect latency.
Complete URL shortener: scalable to 20K+ QPS, 15TB+ storage, sub-10ms redirects with caching.
Your 3-minute elevator pitch
I'd design this as a read-heavy system with Base62-encoded short codes, a Redis cache-aside layer handling 80%+ of reads, sharded Postgres for storage, and async analytics via Kafka. Key numbers: 200 writes/sec, 20K reads/sec, 15TB over 5 years.
Concepts from this question (6 concepts unlocked)

Base62 Encoding

EASY

An encoding scheme using 62 characters [a-zA-Z0-9] to convert numeric IDs into compact alphanumeric strings. Each character represents a digit in base-62.

62^n \text{ unique codes for } n \text{ characters} \implies 62^7 \approx 3.5 \times 10^{12}

Base62 maps auto-incrementing IDs to short, URL-safe strings with zero collision probability. It is the preferred encoding for most URL shortener designs.
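The counter-to-code mapping can be sketched as an encode/decode pair. This is a minimal illustration; the alphabet ordering (digits, then lowercase, then uppercase) is an assumption, since any fixed ordering of the 62 characters works.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe characters (ordering is an assumed convention)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62

def encode(n: int) -> str:
    """Map a non-negative integer ID to its Base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def decode(code: str) -> int:
    """Inverse mapping: recover the numeric ID from a Base62 string."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Because each counter value maps to a distinct string, collisions are impossible by construction, unlike hash-based schemes that must detect and retry on collision.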

Practice (3 Qs): url-q3, url-q5, url-q10

Read-Heavy Caching (Cache-Aside)

STANDARD

A caching pattern where the application checks the cache first, falls back to the database on a miss, and populates the cache before returning. Ideal for read-heavy workloads.

\text{Hit rate} \geq 80\% \implies \text{DB load reduced by } 5\times

URL shorteners have a 100:1 read:write ratio. Caching the top 20% of URLs serves 80% of traffic (Zipf distribution), dramatically reducing database load.
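The read path above can be sketched as a single function. The dicts here are stand-ins for Redis and the sharded database; a real implementation would use Redis GET/SET with a TTL on the cached entry.

```python
cache = {}  # stand-in for Redis
database = {"abc123": "https://example.com/very/long/path"}  # stand-in for the sharded DB

def get_url(code: str) -> str:
    """Cache-aside read: check cache, fall back to DB on a miss, populate cache."""
    if code in cache:
        return cache[code]        # cache hit: no database round-trip
    url = database[code]          # cache miss: read from the database
    cache[code] = url             # populate the cache before returning
    return url
```

With a Zipf-distributed workload, the hot minority of codes stays resident in the cache, which is why an 80%+ hit rate is achievable without caching the full keyspace.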

Practice (2 Qs): url-q8, url-q10

Database Sharding

TRICKY

Horizontal partitioning of data across multiple database instances. Each shard holds a subset of rows, determined by a shard key (e.g., hash of short_code).

\text{Shard} = \text{hash}(\texttt{short\_code}) \bmod N

A single database cannot hold 15TB+ of URL data. Sharding by short_code hash distributes reads and writes evenly across shards.
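The shard-routing formula can be sketched directly. A stable hash (here MD5, an arbitrary choice) is essential: Python's built-in `hash()` is salted per process and would route the same key to different shards on different servers.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count for illustration

def shard_for(short_code: str) -> int:
    """Shard = hash(short_code) mod N, using a process-independent hash."""
    digest = hashlib.md5(short_code.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS
```

Note that plain mod-N remaps most keys whenever N changes; consistent hashing, as recommended in the design above, bounds that remapping to roughly 1/N of the keys when a shard is added or removed.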

Practice (2 Qs): url-q9, url-q10

TTL and Expiration

STANDARD

Time-to-live (TTL) sets a maximum lifetime for a record. Expired URLs are cleaned up by a background job, freeing short codes for reuse and bounding storage growth.

Without TTL, storage grows without bound. A 2-year default TTL with background cleanup keeps the system healthy and reclaims short codes over time.
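The background cleanup job can be sketched as a periodic sweep over expired rows. The in-memory dict stands in for the URL table; a real job would run a batched `DELETE ... WHERE expires_at <= now` per shard.

```python
import time

TWO_YEARS = 2 * 365 * 24 * 3600  # default TTL in seconds

# stand-in table: short_code -> (long_url, expires_at as epoch seconds)
table = {}

def insert(code: str, url: str, ttl: int = TWO_YEARS) -> None:
    """Store a URL with an expiry timestamp derived from its TTL."""
    table[code] = (url, time.time() + ttl)

def cleanup(now: float) -> int:
    """Background job: delete expired rows, freeing their codes for reuse."""
    expired = [c for c, (_, expires_at) in table.items() if expires_at <= now]
    for code in expired:
        del table[code]
    return len(expired)
```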

Practice (2 Qs): url-q6, url-q10

Horizontal Scaling

STANDARD

Adding more machines (scale out) rather than upgrading a single machine (scale up). Stateless application servers can be scaled horizontally behind a load balancer.

\text{Servers needed} = \left\lceil \frac{\text{QPS}}{\text{QPS per server}} \right\rceil

At 20K read QPS, a single server is insufficient. Stateless app servers behind a load balancer scale linearly with traffic.
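Applying the formula above: assuming a single app server sustains ~1K QPS (an illustrative figure, not from the requirements), the fleet size falls out directly.

```python
import math

read_qps = 20_000
qps_per_server = 1_000  # assumed per-server capacity

servers_needed = math.ceil(read_qps / qps_per_server)  # 20 servers before headroom
```

In practice you would provision extra headroom for traffic spikes and rolling deploys, but the linear relationship is the point: doubling traffic means doubling servers.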

Practice (2 Qs): url-q9, url-q10

CDN Edge Caching

STANDARD

Caching redirect responses at CDN edge nodes closest to the user. Reduces latency and offloads traffic from origin servers for globally popular URLs.

For a global URL shortener, edge caching can serve redirects in under 10ms for popular URLs without hitting the origin server at all.
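What the origin must emit for edge caching to work is a redirect response with an explicit `Cache-Control` header. A minimal sketch, with the 5-minute `max-age` as an assumed tuning value:

```python
def redirect_response(long_url: str, max_age: int = 300) -> dict:
    """Build a redirect response that CDN edge nodes are allowed to cache."""
    return {
        # 302 (not 301) keeps the redirect re-checkable, so click analytics
        # still see traffic once the edge cache entry expires.
        "status": 302,
        "headers": {
            "Location": long_url,
            "Cache-Control": f"public, max-age={max_age}",
        },
    }
```

The `max-age` trades freshness for offload: a longer value serves more redirects entirely from the edge, but delays propagation of URL updates or deletions.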