News Feed Generation (Hybrid Fanout)

How do we build a feed for 500M daily users without melting our write infrastructure or starving readers? We chose fanout-on-write as the default path (rather than pure fanout-on-read) because 95% of users have under 10K followers, which keeps the per-post push cost affordable.
When a user uploads a photo, we push only the photo_id (8 bytes) into every follower's timeline cache.
We push metadata pointers, not image bytes, because duplicating a 4KB photo object across 500 follower caches costs 2MB per upload, versus about 4KB in total for 500 pointers of 8 bytes each. At read time, the feed service hydrates each pointer into a full photo object with CDN URLs, captions, and like counts.
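The push-then-hydrate flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production design: the in-memory `followers`, `timelines`, and `photos` structures, the `TIMELINE_LIMIT` cap, and the `cdn.example.com` URL are all hypothetical stand-ins, and "4KB" is taken as 4,000 bytes so the arithmetic matches the figures in the text.

```python
from collections import defaultdict, deque

PHOTO_ID_BYTES = 8          # pointer size stated in the text
PHOTO_OBJECT_BYTES = 4_000  # the "4KB" photo object, taken as 4,000 bytes
TIMELINE_LIMIT = 500        # assumed cap on cached ids per user

# Hypothetical in-memory stand-ins for the follower graph,
# the per-user timeline cache, and the photo metadata store.
followers = {"alice": ["bob", "carol"]}
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))
photos = {}


def upload_photo(author, photo_id, caption):
    """Store the photo once, then push only its id to each follower."""
    photos[photo_id] = {
        "photo_id": photo_id,
        "author": author,
        "caption": caption,
        "cdn_url": f"https://cdn.example.com/{photo_id}.jpg",  # placeholder URL
        "likes": 0,
    }
    for follower in followers.get(author, []):
        timelines[follower].appendleft(photo_id)  # newest first


def read_feed(user):
    """Hydrate cached ids into full photo objects at read time."""
    return [photos[pid] for pid in timelines[user] if pid in photos]


def fanout_bytes(follower_count, bytes_per_entry):
    """Cache bytes written for one upload across all followers."""
    return follower_count * bytes_per_entry
```

With these definitions, `fanout_bytes(500, PHOTO_OBJECT_BYTES)` gives 2,000,000 bytes while `fanout_bytes(500, PHOTO_ID_BYTES)` gives 4,000 bytes, which is the 2MB versus 4KB gap described above.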
For celebrities beyond a follower-count threshold, we switch to fanout-on-read: their posts are merged into feeds at request time.
Why not fanout-on-read for everyone? Because pulling and merging from hundreds of followees on every feed open adds 200-300ms of latency versus pre-built cache reads under 5ms. Trade-off: we accept write amplification for normal users (one write per follower) in exchange for sub-50ms feed reads.
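One way to picture the request-time merge for celebrity posts is a k-way merge between the pre-built timeline and a short list of recent posts per followed celebrity. The sketch below assumes photo ids are time-ordered 64-bit integers (larger means newer), which the text does not state; `build_feed` and its parameters are hypothetical names.

```python
import heapq
from itertools import islice


def build_feed(cached_ids, celebrity_posts, followed_celebrities, limit=20):
    """Merge the pre-built timeline with celebrity posts at read time.

    cached_ids: photo ids pushed by fanout-on-write, sorted newest first.
    celebrity_posts: dict mapping a celebrity to their recent photo ids,
        each list sorted newest first.
    Assumes ids are time-ordered, so sorting by id sorts by recency.
    """
    streams = [cached_ids]
    for celebrity in followed_celebrities:
        streams.append(celebrity_posts.get(celebrity, []))
    # heapq.merge lazily merges already-sorted inputs; reverse=True
    # expects each input in descending order and yields descending output.
    merged = heapq.merge(*streams, reverse=True)
    return list(islice(merged, limit))
```

Because only the handful of celebrity accounts a user follows are pulled at read time, the merge touches a few short lists rather than hundreds of followees, which is why the hybrid path stays close to the cached read latency.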
The celebrity threshold is tuned based on write queue depth.
What if the interviewer asks: why not fanout-on-write for celebrities too? Because a single post from an account with 100M followers would generate 100M cache writes, flooding the write pipeline for minutes and delaying feed updates for all other users.
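The routing decision and the "minutes" claim can be checked with rough arithmetic. Everything numeric here is an assumption for illustration: the 10,000-follower starting threshold, the 500,000 writes/second pipeline throughput, and the high and low water marks are not given in the text, and `adjust_threshold` is only one hypothetical way to tune the threshold from queue depth.

```python
CELEBRITY_THRESHOLD = 10_000         # assumed starting threshold
ASSUMED_WRITES_PER_SECOND = 500_000  # hypothetical pipeline throughput


def choose_fanout(follower_count, threshold=CELEBRITY_THRESHOLD):
    """Pick the delivery path for an author's posts."""
    if follower_count > threshold:
        return "fanout-on-read"
    return "fanout-on-write"


def drain_seconds(cache_writes, writes_per_second=ASSUMED_WRITES_PER_SECOND):
    """Seconds the pipeline needs to absorb one post's cache writes."""
    return cache_writes / writes_per_second


def adjust_threshold(threshold, queue_depth,
                     high_water=1_000_000, low_water=100_000,
                     floor=1_000, ceiling=1_000_000):
    """Hypothetical tuning rule: halve the threshold when the write
    queue is backed up, double it when the queue is nearly empty."""
    if queue_depth > high_water:
        return max(floor, threshold // 2)
    if queue_depth < low_water:
        return min(ceiling, threshold * 2)
    return threshold
```

Under these assumed numbers, `drain_seconds(100_000_000)` is 200 seconds, a bit over three minutes during which every other user's feed updates would queue behind one celebrity post. A 500-follower account, by contrast, drains in about a millisecond.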
Why it matters in interviews
Interviewers want you to justify the hybrid split, not merely describe it. Explaining why we push photo metadata pointers instead of full objects, and why celebrities get a different path, shows that you reason about write amplification trade-offs rather than reciting a pattern.
Related concepts