URL Shortener Anti-Patterns

Common design mistakes candidates make. Learn what goes wrong and how to avoid each trap in your interview.

Using MD5/SHA for Short Codes

Very Common | CONCEPT

Generating short codes by hashing the long URL with MD5 or SHA-256, then truncating to 7 characters.

Why: MD5 feels like the 'standard' approach for generating unique strings. Candidates default to what they know from other contexts.

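WRONG: Hash the long URL with MD5 (128-bit output), take first 7 characters as short code. Truncation discards most of the hash, so distinct URLs can map to the same code, and every write then needs a collision check and retry.
RIGHT: Use an auto-incrementing counter + Base62 encoding. Each counter value maps to exactly one short code, so there are no collisions to detect or retry. Simpler and faster.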
See pattern: Collision Handling
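
The counter + Base62 approach can be sketched as below. This is a minimal illustration, not a prescribed implementation: the alphabet ordering and the names `encode_base62` and `decode_base62` are assumptions, and the counter itself (for example a database sequence) is left out.

```python
import string

# 0-9, A-Z, a-z: 62 symbols. The ordering is an arbitrary choice for this sketch.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Convert a non-negative counter value into its Base62 short code."""
    if n < 0:
        raise ValueError("counter value must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62: recover the counter value from a short code."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Because the mapping is a bijection between integers and strings, two different counter values can never yield the same code. Seven characters cover 62^7 (about 3.5 trillion) IDs.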

Ignoring Read/Write Ratio

Very Common | CONCEPT

Designing the system as if reads and writes are equally frequent, missing the 100:1 ratio.

Why: Candidates jump straight to write-path design (URL creation) without considering that redirects dominate traffic.

WRONG: Optimize the write path with complex distributed ID generation, while leaving the read path with direct DB lookups.
RIGHT: Recognize the 100:1 read:write ratio first. Prioritize read-path optimization: aggressive caching (Redis), CDN for popular URLs, and read replicas.
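
A quick back-of-envelope calculation makes the imbalance concrete. The sketch below only applies the 100:1 ratio from the text; the function name `estimate_qps` and any daily write volume you plug in are assumptions to be replaced with the numbers agreed in the interview.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_qps(new_urls_per_day: int, read_write_ratio: int = 100) -> tuple[float, float]:
    """Return (writes_per_second, reads_per_second) as averages over a day."""
    writes_per_second = new_urls_per_day / SECONDS_PER_DAY
    reads_per_second = writes_per_second * read_write_ratio
    return writes_per_second, reads_per_second
```

Whatever the write volume is, the redirect path sees two orders of magnitude more traffic, which is why it deserves the design attention first.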

Single Point of Failure Database

Common | CONCEPT

Using a single database instance with no replication or failover strategy.

Why: Candidates forget to discuss high availability. A single DB works fine in development but fails in production at scale.

WRONG: Single MySQL instance stores all URLs. If it goes down, the entire service is unavailable.
RIGHT: Use primary-replica replication for reads. Add automatic failover. Consider multi-region deployment for global latency.
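
One small piece of this, routing reads to replicas and falling back to the primary when they are unreachable, might look like the sketch below. It is illustrative only: `read_with_fallback` and the `Lookup` callable type are hypothetical, unreachable nodes are assumed to raise `ConnectionError`, and real failover (promoting a replica to primary) is normally handled by the database or an orchestration layer rather than application code.

```python
from typing import Callable, Optional, Sequence

# A lookup takes a short code and returns the long URL, or None if unknown.
Lookup = Callable[[str], Optional[str]]


def read_with_fallback(code: str, replicas: Sequence[Lookup], primary: Lookup) -> Optional[str]:
    """Try each replica in order; use the primary only if every replica is unreachable."""
    for lookup in replicas:
        try:
            return lookup(code)
        except ConnectionError:
            continue  # this replica is down, try the next one
    return primary(code)
```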

No Cache Layer

Common | TIME WASTE

Every redirect request hits the database directly, even for the most popular URLs.

Why: Candidates design only the database lookup path (in effect, the cache-miss case) and forget that roughly 80% of traffic goes to 20% of URLs.

WRONG: GET /:code -> DB lookup -> redirect. Every single request queries the database.
RIGHT: Add Redis/Memcached as a cache-aside layer. Check cache first, fall back to DB on miss, populate cache on read. 80%+ hit rate is achievable.
See pattern: Cache Strategy Selection
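
The cache-aside flow described above can be sketched as follows. To keep the example runnable, plain dictionaries stand in for both Redis/Memcached and the database; the class name `CacheAsideResolver` and the `db_reads` counter are assumptions added for illustration, and eviction and TTLs are omitted.

```python
from typing import Optional


class CacheAsideResolver:
    """Resolve short codes with a cache-aside read path."""

    def __init__(self, db: dict[str, str]):
        self.db = db                      # stand-in for the URL database
        self.cache: dict[str, str] = {}   # stand-in for Redis/Memcached
        self.db_reads = 0                 # counts how often the DB was hit

    def resolve(self, code: str) -> Optional[str]:
        # 1. Check the cache first.
        url = self.cache.get(code)
        if url is not None:
            return url
        # 2. Cache miss: fall back to the database.
        self.db_reads += 1
        url = self.db.get(code)
        # 3. Populate the cache so the next read for this code skips the DB.
        if url is not None:
            self.cache[code] = url
        return url
```

Repeated lookups of a popular code touch the database only once; unknown codes are not cached in this sketch, so they reach the database every time.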

Synchronous Analytics Writes

Common | TIME WASTE

Logging click analytics synchronously in the redirect path, adding latency to every redirect.

Why: It seems natural to log the click in the same request handler that performs the redirect.

WRONG: On redirect: write click event to analytics DB, wait for acknowledgment, then send 301/302 response.
RIGHT: Fire-and-forget: publish click event to Kafka/SQS asynchronously. The redirect response returns immediately. A separate consumer processes analytics.
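
The shape of the fire-and-forget path can be sketched with an in-process queue standing in for Kafka/SQS. All names here (`click_events`, `handle_redirect`, `analytics_consumer`, `start_consumer`) are hypothetical, the `processed` list stands in for the analytics store, and a production system would publish to a real broker instead.

```python
import queue
import threading

# In-process stand-in for Kafka/SQS. Bounded so a stalled consumer cannot exhaust memory.
click_events: queue.Queue = queue.Queue(maxsize=10_000)
processed: list = []  # stand-in for the analytics store


def handle_redirect(code: str, url_for_code: dict) -> tuple:
    """Return the redirect immediately; the click event is only enqueued."""
    url = url_for_code[code]
    try:
        click_events.put_nowait({"code": code})
    except queue.Full:
        pass  # drop the event rather than delay the redirect
    return 302, url


def analytics_consumer() -> None:
    """Drain events in the background until a None sentinel arrives."""
    while True:
        event = click_events.get()
        if event is None:
            break
        processed.append(event)


def start_consumer() -> threading.Thread:
    """Start the consumer on a daemon thread and return the thread handle."""
    worker = threading.Thread(target=analytics_consumer, daemon=True)
    worker.start()
    return worker
```

The redirect handler never waits on the analytics store. The trade-off in this sketch is that events are dropped when the queue is full, which is usually acceptable for click counts but should be stated explicitly.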

No TTL or Expiration

Common | DOMAIN

URLs live forever with no cleanup mechanism, causing unbounded storage growth.

Why: It is easy to skip the 'what happens to old URLs' question. The system works fine initially, but storage grows forever.

WRONG: Every URL is permanent. After 5 years, you have 30 billion records with no way to reclaim space.
RIGHT: Add an optional expires_at field. Run a background cleanup job to delete expired URLs and free their short codes. Default TTL of 2 years if not specified.
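
A minimal sketch of the expiry logic, assuming an in-memory mapping from short code to `expires_at` in place of a real table. The helper names `compute_expires_at` and `purge_expired` are hypothetical; the 2-year default comes from the text above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_TTL = timedelta(days=2 * 365)  # the 2-year default, approximated as 730 days


def compute_expires_at(created_at: datetime, ttl: Optional[timedelta] = None) -> datetime:
    """Apply the caller's TTL if given, otherwise the default."""
    return created_at + (ttl if ttl is not None else DEFAULT_TTL)


def purge_expired(records: dict[str, datetime], now: datetime) -> list[str]:
    """Delete expired entries in place and return the freed short codes."""
    expired = [code for code, expires_at in records.items() if expires_at <= now]
    for code in expired:
        del records[code]
    return expired
```

In a real deployment the cleanup job would run in batches against the database, and the redirect path would also check `expires_at` so that a URL stops resolving even before the job has deleted it.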

Not Handling Custom Aliases

Occasional | CASE MISS

Only supporting auto-generated short codes, ignoring the common requirement for vanity URLs.

Why: Candidates focus on the auto-generation algorithm and forget that users often want custom aliases (e.g., bit.ly/my-brand).

WRONG: API only accepts long URL, always auto-generates the short code. No way for users to choose their own.
RIGHT: Add an optional 'customAlias' field to the create API. Check uniqueness against existing codes. Validate format (alphanumeric plus hyphens, as in my-brand; reasonable length).
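
The validation and uniqueness check could look like the sketch below. The exact rule is an assumption (3 to 30 letters, digits, or hyphens, with hyphens allowed so that an alias like my-brand passes), as are the names `ALIAS_PATTERN`, `AliasError`, and `reserve_alias`; a set stands in for the table of existing codes.

```python
import re

# Hypothetical format rule: 3-30 characters, letters, digits, or hyphens.
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9-]{3,30}")


class AliasError(ValueError):
    """Raised when a custom alias is malformed or already taken."""


def reserve_alias(alias: str, taken: set[str]) -> str:
    """Validate the alias, check uniqueness, and record it as taken."""
    if not ALIAS_PATTERN.fullmatch(alias):
        raise AliasError("alias must be 3-30 letters, digits, or hyphens")
    if alias in taken:
        raise AliasError("alias is already in use")
    taken.add(alias)
    return alias
```

With a real database, the check-then-insert step is racy under concurrent requests, so the uniqueness guarantee should come from a unique constraint on the code column rather than from application logic alone.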

Ignoring Analytics Requirements

Occasional | CASE MISS

Building only the shorten/redirect functionality without any click tracking or reporting.

Why: Analytics seems like a 'nice to have' rather than a core feature. But interviewers expect it.

WRONG: Only implement POST /shorten and GET /:code. No tracking of who clicked, when, from where.
RIGHT: Track click count, timestamp, referrer, user agent, and geo-location. Store in a separate analytics store (not the main DB). Expose GET /api/v1/stats/:code endpoint.
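
As a rough illustration of what the stats endpoint might return, the sketch below models a click event with the fields listed above and aggregates them per code. The `ClickEvent` record, the `stats_for` function, and the response shape are all assumptions; a list stands in for the separate analytics store.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ClickEvent:
    code: str
    timestamp: datetime
    referrer: str
    user_agent: str
    country: str  # coarse geo-location, e.g. derived from the client IP


def stats_for(code: str, events: list[ClickEvent]) -> dict:
    """Aggregate the events for one short code into a stats payload."""
    hits = [e for e in events if e.code == code]
    return {
        "code": code,
        "clicks": len(hits),
        "by_country": dict(Counter(e.country for e in hits)),
        "by_referrer": dict(Counter(e.referrer for e in hits)),
    }
```

At scale these aggregates would typically be precomputed by the analytics consumer rather than scanned per request.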