News Feed Anti-Patterns
Common design mistakes candidates make. Learn what goes wrong and how to avoid each trap in your interview.
Pure Fanout-on-Write for ALL Users
We chose a hybrid model (not pure push) because pushing every tweet to every follower's cache without a celebrity threshold causes catastrophic write amplification for high-follower accounts. Trade-off: the hybrid model adds read-time merge complexity, but it caps write amplification at 10K cache writes per tweet.
Why: Fanout-on-write is elegant and makes reads instant. Candidates pick it because the read path is straightforward: fetch the cache. They test with small follower counts (100-500) and it works fine. Then they forget about users with 10M+ followers. One tweet from a celebrity triggers 10M cache writes, which takes minutes to drain and delays every other user's fanout in the queue.
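A minimal sketch of the threshold check, using in-memory stand-ins for the follower graph and timeline caches (the structures and names are illustrative, not a production design):

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000           # caps pushes per tweet
followers = defaultdict(set)           # author_id -> {follower_id}
timelines = defaultdict(list)          # follower_id -> [tweet_id], newest first
celebrity_posts = defaultdict(list)    # author_id -> [tweet_id], merged at read time

def fan_out(author_id: int, tweet_id: int) -> None:
    if len(followers[author_id]) >= CELEBRITY_THRESHOLD:
        # Pull path: record once; followers merge this in at read time.
        celebrity_posts[author_id].append(tweet_id)
    else:
        # Push path: write amplification is bounded at under 10K cache writes.
        for follower_id in followers[author_id]:
            timelines[follower_id].insert(0, tweet_id)
```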
Pure Fanout-on-Read for ALL Users
We chose fanout-on-write for the majority of users (not pure pull) because building the timeline on every read by querying each followed user's tweets turns a GET into a scatter-gather across hundreds of sources. Trade-off: we accept write amplification for normal users to keep reads at O(1): a single cache fetch.
Why: Fanout-on-read avoids write amplification entirely. Candidates choose it because the write path is a single INSERT. But at read time, a user following 500 accounts requires 500 queries to fetch recent tweets, then a merge-sort. Even with caching, cold timelines take 500+ ms to build. At 300M timeline reads/day, that is 150 billion sub-queries per day.
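For contrast, a sketch of what the pure-pull read path must do on every GET, assuming each followee's tweets have already been fetched as a newest-first list (the data shape is hypothetical):

```python
import heapq
from itertools import islice

def build_timeline(followee_tweets: dict[int, list[tuple[int, int]]],
                   limit: int = 50) -> list[tuple[int, int]]:
    # followee_tweets: author_id -> [(timestamp, tweet_id)], newest first.
    # Following 500 accounts means 500 fetches before this merge even starts.
    merged = heapq.merge(*followee_tweets.values(), reverse=True)
    return list(islice(merged, limit))
```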
Full Tweet Objects in Timeline Cache
We chose to store tweet IDs only (not full objects) in the timeline cache because full objects take 125x the memory of bare IDs. Trade-off: we pay one extra round-trip (~2ms) for batch hydration on read, but we cut cache memory by 125x.
Why: It feels faster: skip the hydration step and serve tweets directly from the timeline cache. But a tweet object is ~1KB. With 800 tweets per user and 200M users, that is 160 TB of cache. Storing 8-byte IDs: 1.28 TB. The 125x difference is the gap between a 160-node Redis cluster and a 2-node cluster.
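A sketch of the ID-only read path with a single batched hydration round-trip, using redis-py; the key names (timeline:{user}, tweet:{id}) are illustrative:

```python
import json
import redis

r = redis.Redis()

def read_timeline(user_id: int, limit: int = 50) -> list[dict]:
    # Step 1: fetch the 8-byte IDs from the pre-built timeline (cheap).
    tweet_ids = r.lrange(f"timeline:{user_id}", 0, limit - 1)
    # Step 2: hydrate every tweet in one batched MGET (~2ms round-trip).
    blobs = r.mget([f"tweet:{tid.decode()}" for tid in tweet_ids])
    return [json.loads(b) for b in blobs if b is not None]
```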
Auto-Increment IDs Instead of Snowflake
We chose Snowflake IDs (not AUTO_INCREMENT) because auto-increment creates a single-point bottleneck across shards and loses time-based sorting. Trade-off: Snowflake IDs leak creation time from the embedded timestamp, which may be a privacy concern.
Why: AUTO_INCREMENT is the default. It works on a single MySQL server. But with 64 shards, each shard generates its own sequence, so two tweets created on different shards can both get ID 1,000,001. You need a central coordinator (a single point of failure) or odd/even offset tricks (which break if you add shards). Meanwhile, you lose the ability to sort by ID and get chronological order for free.
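A sketch of the Snowflake bit layout: 41 bits of milliseconds since a custom epoch, 10 worker bits, 12 per-millisecond sequence bits, so every shard mints unique IDs with no coordination and IDs sort chronologically:

```python
import time

EPOCH_MS = 1_288_834_974_657  # Twitter's custom epoch (Nov 2010)

def snowflake_id(worker_id: int, sequence: int) -> int:
    ms = int(time.time() * 1000) - EPOCH_MS
    # Timestamp in the high bits means numeric order == chronological order.
    return (ms << 22) | ((worker_id & 0x3FF) << 12) | (sequence & 0xFFF)
```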
No Cache for Timeline
We chose to pre-build timelines in Redis (not query the database on every read) because 300M daily reads would crush the database. Trade-off: we accept the write amplification cost and memory overhead of maintaining 200M timeline caches to keep reads at sub-10ms.
Why: Candidates sometimes skip caching, assuming the database can handle the load. But building a timeline from MySQL requires joining the follows table with the tweets table, filtering by 500 followee IDs, sorting by timestamp, and paginating. That is a multi-join query across shards. At 300M reads/day (3,472 QPS average, 10K+ peak), this query runs against hot tables with billions of rows.
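With pre-built timelines, the read path collapses to one Redis call. A sketch assuming timelines live in sorted sets scored by tweet timestamp (key name illustrative):

```python
import redis

r = redis.Redis()

def read_timeline_page(user_id: int, offset: int = 0, limit: int = 50):
    # One sorted-set range read replaces the cross-shard multi-join
    # against billion-row tables on every request.
    return r.zrevrange(f"timeline:{user_id}", offset, offset + limit - 1)
```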
Sharding by tweet_id Instead of user_id
We chose to shard by user_id (not tweet_id) because the most common access pattern is 'get all tweets by user X'. Trade-off: we accept uneven shard sizes (celebrity shards are larger), mitigated with read replicas.
Why: Hash-based sharding on tweet_id gives perfect data distribution. Each shard holds exactly 1/N of all tweets. But the most common access pattern is 'get all tweets by user X' (profile page, fanout reads). With tweet_id sharding, user X's tweets are scattered across all 64 shards. Every profile view fans out to all 64 shards, waits for the slowest one, and merges results.
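A sketch of the routing difference under simple hash-mod placement across the 64 shards (function names are illustrative):

```python
NUM_SHARDS = 64

def shard_for_user(user_id: int) -> int:
    # user_id sharding: "get all tweets by user X" hits exactly one shard.
    return user_id % NUM_SHARDS

def shards_to_query_under_tweet_id_sharding() -> list[int]:
    # tweet_id sharding: the same query must scatter-gather across every
    # shard and wait for the slowest one before merging.
    return list(range(NUM_SHARDS))
```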
Synchronous Fanout Blocking POST Response
We chose async fanout via Kafka (not synchronous writes in the POST handler) because synchronous fanout makes API response time proportional to follower count. Trade-off: followers see the tweet 2-5 seconds later, but the author gets an instant response.
Why: It is the straightforward implementation: inside the POST handler, loop through followers and write to each cache. With 200 followers, that is 200 Redis writes at 0.5ms each = 100ms added to the API response. Seems fine. But a user with 5,000 followers now waits 2.5 seconds for their tweet to post. The HTTP connection times out at 30 seconds for users with 60K+ followers.
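A sketch of the async handoff using kafka-python; the topic name and message shape are assumptions:

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def post_tweet(author_id: int, tweet_id: int) -> dict:
    # One enqueue regardless of follower count; fanout workers consume
    # the topic and write follower caches off the request path.
    producer.send("fanout", {"author_id": author_id, "tweet_id": tweet_id})
    return {"status": 201, "tweet_id": tweet_id}  # returns in milliseconds
```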
No Rate Limiting on Tweet Creation
We rate-limit tweet creation at the API gateway (not leaving it unlimited) because each tweet triggers fanout to all followers. A bot posting 1,000 tweets/minute with 10K followers generates 10 million cache writes per minute. Trade-off: legitimate power users hit the rate limit occasionally, but we protect the fanout queue from flood attacks.
Why: Rate limiting feels like a separate concern, so candidates skip it in the core design. But each tweet triggers a fanout to all followers. A bot posting 1,000 tweets/minute with 10K followers generates 10 million cache writes per minute. That is ~167K extra writes/sec, enough to spike the fanout queue latency for all users.
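A sketch of a fixed-window limiter at the gateway using an atomic Redis counter; the 5-tweets-per-minute cap is an illustrative number:

```python
import redis

r = redis.Redis()
TWEETS_PER_MINUTE = 5

def allow_tweet(user_id: int) -> bool:
    key = f"ratelimit:tweet:{user_id}"
    count = r.incr(key)          # atomic, safe under concurrent requests
    if count == 1:
        r.expire(key, 60)        # the window opens on the first tweet
    return count <= TWEETS_PER_MINUTE
```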