Standard walkthrough
Fanout-on-Write
How do we make timeline reads instant? The constraint: if we compute timelines at read time, every feed open triggers merges across hundreds of followed accounts, adding 200-500ms of latency.
We solve this by doing all the work at write time: fanout-on-write pre-computes every follower's timeline the moment a post is published.
When a user with 1,000 followers publishes a post, our system performs 1,000 cache writes, one into each follower's timeline cache. The result: a single O(1) cache lookup returns the entire timeline, with no merging, no sorting, and no fan-in from multiple sources.
We chose fanout-on-write (not fanout-on-read) for normal users because our non-functional requirement is sub-10ms timeline reads, and pre-computation is the only way to achieve that at 200 million Daily Active Users (DAU). Trade-off: we gave up cheap writes in exchange for instant reads.
Every post triggers O(N) cache insertions, where N is the follower count. For a user with 10,000 followers, that is 10,000 Redis LPUSH operations per post.
Implication: write amplification scales linearly with follower count, which works for the 99% of users with fewer than 10,000 followers but becomes catastrophic for celebrities. What if the interviewer asks: 'How do we handle a write spike when many users post simultaneously?' We absorb spikes with a fanout queue (Kafka or SQS) between the post service and the cache writers, decoupling ingestion speed from fanout throughput.
The queue lets us process fanout asynchronously without blocking the poster's publish response.
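The decoupling can be sketched with Python's standard-library `queue.Queue` standing in for Kafka/SQS and a worker thread playing the cache-writer consumer (names and data are illustrative, not the real service):

```python
import queue
import threading
from collections import defaultdict

fanout_queue = queue.Queue()        # stand-in for Kafka/SQS
timelines = defaultdict(list)       # stand-in for the Redis timeline cache
followers = {"alice": ["bob", "carol"]}

def publish(author: str, post_id: str) -> None:
    # Publish returns immediately: enqueue one fanout job instead of
    # blocking the poster on N cache writes.
    fanout_queue.put((author, post_id))

def fanout_worker() -> None:
    # The consumer drains jobs at its own pace, so ingestion speed is
    # decoupled from fanout throughput.
    while True:
        author, post_id = fanout_queue.get()
        for follower_id in followers.get(author, []):
            timelines[follower_id].insert(0, post_id)
        fanout_queue.task_done()

threading.Thread(target=fanout_worker, daemon=True).start()
publish("alice", "post:1")          # returns before any cache write happens
fanout_queue.join()                 # demo only: wait for fanout to finish
print(timelines["bob"])             # → ['post:1']
```

During a spike, `fanout_queue` simply grows while publish latency stays flat; the cache writers catch up as throughput allows.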
Related concepts