Twitter / News Feed
VERY COMMON
News feed design comes up at every FAANG company. This design is how Twitter delivers 500 million tweets per day to 200 million users with under 5 seconds of delivery latency. You will solve the fanout problem, handle the celebrity edge case that broke Twitter in 2012, and design a timeline cache that fits 200 million users in 1.28 TB of Redis.
- Design a hybrid fanout that handles both 200-follower and 50M-follower accounts
- Size a timeline cache for 200M users at 1.28 TB
- Avoid the celebrity fanout storm that crashes write pipelines
Visual Solutions
Step-by-step animated walkthroughs with capacity estimation, API design, database schema, and failure modes built in.
Cheat Sheet
Key concepts, trade-offs, and quick-reference notes for News Feed. Everything you need at a glance.
Anti-Patterns
Common design mistakes candidates make. Wrong approaches vs correct approaches for each trap.
Failure Modes
What breaks in production, how to detect it, and how to fix it. Detection metrics, mitigations, and severity ratings.
Start simple. Build to staff-level.
“Twitter feed for 200M DAU processing 500M tweets per day. Hybrid fanout: users under 10K followers get fanout-on-write, pushing tweet IDs to follower timeline caches in Redis. Celebrities above 10K use fanout-on-read, merged at request time. Each tweet is 1KB with a Snowflake ID (41-bit timestamp, 10-bit machine, 12-bit sequence) for time-sorted ordering. Timeline cache holds 800 tweet IDs per user, 6.4 KB per user, 1.28 TB total. Fanout flows through Kafka so the POST returns in under 100ms. Feed ranking combines recency with engagement signals.”
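To make that answer concrete, here is a minimal sketch of the write path in Python with kafka-python. The `tweet-fanout` topic, `save_tweet`, and `next_snowflake_id` are illustrative names, not Twitter's actual code; an ID generator is sketched under Snowflake ID Generation below.

```python
import json
from kafka import KafkaProducer  # kafka-python; broker address is an assumption

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def post_tweet(user_id: int, text: str) -> int:
    """Persist the tweet and enqueue fanout so the POST returns in <100ms."""
    tweet_id = next_snowflake_id()       # hypothetical helper; sketched below
    save_tweet(tweet_id, user_id, text)  # hypothetical durable write to MySQL
    # Fanout is asynchronous: the request never waits for N cache writes.
    producer.send("tweet-fanout", {"tweet_id": tweet_id, "author_id": user_id})
    return tweet_id
```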
Fanout-on-Write (Push Model)
EASY
We chose to push the tweet ID into every follower's timeline cache at write time because it makes reads a single cache lookup. The cost we accept: N cache writes per tweet, where N = follower count.
Core Feature Design
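A minimal sketch of this push fanout in Python with redis-py, assuming Snowflake tweet IDs; `get_follower_ids` is a hypothetical social-graph lookup, and a real deployment would shard timelines across a Redis cluster.

```python
import redis

r = redis.Redis()   # assumed single node; production shards timelines by user
TIMELINE_LEN = 800  # IDs kept per user: 800 x 8 bytes = 6.4 KB

def fanout_on_write(tweet_id: int, author_id: int) -> None:
    """One tweet -> N sorted-set writes, N = follower count."""
    score = tweet_id >> 22  # 41-bit timestamp portion of the Snowflake ID
    for follower_id in get_follower_ids(author_id):  # hypothetical graph lookup
        key = f"timeline:{follower_id}"
        r.zadd(key, {tweet_id: score})
        # Keep only the newest 800 entries so memory stays bounded.
        r.zremrangebyrank(key, 0, -(TIMELINE_LEN + 1))
```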
Fanout-on-Read (Pull Model)
EASY
We chose pull-at-read-time for celebrities (not push) because a 30M-follower account would trigger 30M cache writes per tweet. Reads cost one extra query per followed celebrity, but we avoid massive write amplification.
Core Feature Design
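A sketch of the read-time merge under the same redis-py assumptions; `get_followed_celebrities` and `get_recent_tweet_ids` are hypothetical lookups against the social graph and a per-author tweet store.

```python
import heapq
import redis

r = redis.Redis()

def read_timeline(user_id: int, limit: int = 50) -> list[int]:
    """Merge the precomputed timeline with celebrity tweets pulled at read time."""
    cached = [int(t) for t in r.zrevrange(f"timeline:{user_id}", 0, limit - 1)]
    pulled: list[int] = []
    for celeb_id in get_followed_celebrities(user_id):        # hypothetical lookup
        pulled.extend(get_recent_tweet_ids(celeb_id, limit))  # hypothetical query
    # Snowflake IDs are time-ordered, so the largest IDs are the newest tweets.
    return heapq.nlargest(limit, cached + pulled)
```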
Snowflake ID Generation
STANDARD
We chose Snowflake (not AUTO_INCREMENT or UUID): a 64-bit ID with 41 bits timestamp, 10 bits machine, 12 bits sequence. It generates up to 4,096 IDs per millisecond per machine (about 4M IDs/sec) with zero coordination and gives us ORDER BY id = ORDER BY time for free.
High Level Design
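The bit layout translates directly into code. A self-contained sketch of the standard Snowflake scheme (the 2010 custom epoch is the one Twitter published; clock-skew handling is omitted):

```python
import threading
import time

EPOCH_MS = 1288834974657  # Twitter's published custom epoch (Nov 4, 2010)

class Snowflake:
    """64-bit ID: 41 bits timestamp | 10 bits machine | 12 bits sequence."""

    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024  # 10 bits -> 1,024 machines
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12 bits -> 4,096/ms
                if self.sequence == 0:                       # exhausted this ms
                    while now <= self.last_ms:               # spin to the next ms
                        now = int(time.time() * 1000) - EPOCH_MS
            else:
                self.sequence = 0
            self.last_ms = now
            return (now << 22) | (self.machine_id << 12) | self.sequence
```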
Hybrid Fanout Model
STANDARD
We chose a threshold of 10K followers to split push vs pull. Below 10K: fanout-on-write for sub-10ms reads. Above 10K: fanout-on-read to cap write amplification. Trade-off: celebrity tweets appear 1-2 seconds later.
Core Feature Design
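The threshold check itself is a one-liner. A sketch of the Kafka consumer side of the write path, reusing `fanout_on_write` from the push sketch above and assuming a `followers:{id}` Redis SET per account:

```python
import redis

r = redis.Redis()
CELEBRITY_THRESHOLD = 10_000  # the push/pull split from the design above

def handle_fanout_event(tweet_id: int, author_id: int) -> None:
    """Kafka consumer callback: push for normal accounts, pull for celebrities."""
    follower_count = r.scard(f"followers:{author_id}")  # Redis SET cardinality
    if follower_count < CELEBRITY_THRESHOLD:
        fanout_on_write(tweet_id, author_id)  # <10K bounded cache writes
    else:
        # No per-follower writes; readers merge this author's recent tweets at
        # read time, which is why a celebrity tweet can surface 1-2s later.
        pass
```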
Timeline Cache (Redis Sorted Set)
STANDARD
We chose Redis sorted sets (not Memcached or MySQL) to store 800 tweet IDs per user because ZREVRANGE gives us chronological pages in sub-10ms. 6.4 KB per user, 1.28 TB for 200M users.
High Level Design
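Reading a page is a single sorted-set call. A sketch, with the capacity arithmetic restated in the docstring:

```python
import redis

r = redis.Redis()

def timeline_page(user_id: int, start: int = 0, count: int = 20) -> list[int]:
    """One ZREVRANGE call returns a newest-first page in sub-10ms.

    Capacity check: 800 IDs x 8 bytes = 6.4 KB per user,
    and 6.4 KB x 200M users = 1.28 TB across the Redis cluster.
    """
    ids = r.zrevrange(f"timeline:{user_id}", start, start + count - 1)
    return [int(t) for t in ids]
```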
Write Amplification
STANDARD
We accept the cost: 1 tweet from a user with 200 followers = 200 cache writes. At 500M tweets/day and ~200 average followers, that is 100 billion cache writes/day. This is the dominant bottleneck, which is why we cap fanout at 10K followers.
Fault Tolerance
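The same numbers as a back-of-envelope script, showing where the 100 billion figure comes from (the 200-follower average is the assumption doing the work):

```python
# Back-of-envelope check of the fanout write load, using the numbers above.
TWEETS_PER_DAY = 500_000_000
AVG_FOLLOWERS = 200  # assumed average; the real distribution is heavily skewed

writes_per_day = TWEETS_PER_DAY * AVG_FOLLOWERS  # 100,000,000,000
writes_per_sec = writes_per_day / 86_400         # ~1.16M sustained writes/sec

print(f"{writes_per_day:,} cache writes/day, ~{writes_per_sec:,.0f} writes/sec")
```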
Social Graph (Redis + MySQL)
EASY
We chose two stores (not one) because neither alone handles both speed and durability. Redis SETs for membership checks and fast follower lookups; a MySQL follows table for durability and analytics. Synced via Kafka events.
Database Schema
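A sketch of the dual-store write path under the same redis-py/kafka-python assumptions; the `follow-events` topic and the follows table columns are illustrative.

```python
import json
import redis
from kafka import KafkaProducer

r = redis.Redis()
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def follow(follower_id: int, followee_id: int) -> None:
    """Fast path writes Redis; a Kafka event syncs the durable MySQL table."""
    r.sadd(f"followers:{followee_id}", follower_id)
    r.sadd(f"following:{follower_id}", followee_id)
    producer.send("follow-events", {"follower": follower_id, "followee": followee_id})

def is_following(follower_id: int, followee_id: int) -> bool:
    """O(1) membership check against the Redis SET."""
    return bool(r.sismember(f"followers:{followee_id}", follower_id))

# A consumer replays each event into MySQL for durability and analytics, e.g.:
#   INSERT INTO follows (follower_id, followee_id, created_at) VALUES (%s, %s, NOW());
```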
Cursor-Based Pagination
TRICKY
We chose cursor-based pagination (not OFFSET) using the last seen Snowflake ID as the cursor. Query cost stays flat regardless of page depth, unlike OFFSET pagination, which degrades linearly. Trade-off: we cannot jump to page N directly.
System APIs
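Against the MySQL tweets store, the same idea is one WHERE clause. A sketch assuming a DB-API connection and a tweets table keyed by Snowflake ID:

```python
PAGE_SQL = """
    SELECT id, author_id, body
    FROM tweets
    WHERE id < %s        -- cursor = last Snowflake ID the client saw
    ORDER BY id DESC     -- Snowflake: ORDER BY id = ORDER BY time
    LIMIT %s
"""

def fetch_page(conn, cursor_id: int, count: int = 20):
    """Index seek to the cursor, then read `count` rows: flat cost at any depth.

    OFFSET N must scan and discard N rows first, so it degrades linearly.
    """
    with conn.cursor() as cur:
        cur.execute(PAGE_SQL, (cursor_id, count))
        return cur.fetchall()
```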