
Hybrid Fanout

What happens when a celebrity with 50 million followers posts? With pure fanout-on-write, that one post triggers 50 million cache writes. A single publish operation of that size would take minutes to propagate and would spike our Redis write throughput past capacity.
This is the celebrity problem, and neither pure fanout strategy solves it alone: pure fanout-on-read avoids the write burst but makes every feed load assemble the timeline at request time. Instead, we split users by follower count using a configurable threshold.
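To make the "minutes to propagate" claim concrete, here is a back-of-the-envelope estimate. The sustained write rate of 200,000 writes per second and the 500-follower "typical" account are assumptions chosen for illustration only; neither figure comes from the text.

```python
def fanout_seconds(follower_count, writes_per_second):
    """Time to push one post into every follower cache at a sustained write rate."""
    return follower_count / writes_per_second


# Hypothetical cluster throughput, not a figure from the text.
ASSUMED_WRITES_PER_SECOND = 200_000

celebrity = fanout_seconds(50_000_000, ASSUMED_WRITES_PER_SECOND)
typical = fanout_seconds(500, ASSUMED_WRITES_PER_SECOND)

print(f"celebrity post: {celebrity:.0f} s (~{celebrity / 60:.1f} min)")
print(f"typical post:   {typical * 1000:.1f} ms")
```

Under these assumptions the celebrity post occupies the whole write path for several minutes, while a typical post finishes in a few milliseconds.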
Users with fewer than 10,000 followers use fanout-on-write: their posts are pushed into follower caches immediately, keeping reads fast for the 99% of posts from normal accounts. Users with 10,000 or more followers use fanout-on-read: their posts stay in their own post store and are merged into timelines at request time.
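The routing rule above can be sketched as follows. This is a minimal in-memory illustration, not the production design: `FeedStore`, its attribute names, and the cache size cap are hypothetical, plain dictionaries stand in for Redis and the post store, and it assumes every post is also kept in the author's own post store.

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 10_000  # tunable threshold from the text
TIMELINE_CACHE_SIZE = 800     # hypothetical cap on entries per follower cache


class FeedStore:
    """In-memory stand-in for the follower graph, timeline caches, and post store."""

    def __init__(self, threshold=CELEBRITY_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)    # author_id -> set of follower ids
        self.timeline_cache = defaultdict(
            lambda: deque(maxlen=TIMELINE_CACHE_SIZE)
        )                                    # follower_id -> newest-first posts
        self.post_store = defaultdict(list)  # author_id -> [(timestamp, post_id)]

    def publish(self, author_id, post_id, timestamp):
        """Store the post, then route it through the matching fanout path."""
        # Assumption: every post is recorded in the author's own post store.
        self.post_store[author_id].append((timestamp, post_id))

        follower_ids = self.followers[author_id]
        if len(follower_ids) < self.threshold:
            # Fanout-on-write: push into each follower's pre-built cache.
            for follower_id in follower_ids:
                self.timeline_cache[follower_id].appendleft((timestamp, post_id))
            return "fanout_on_write"

        # Fanout-on-read: no cache writes; the post is merged at request time.
        return "fanout_on_read"
```

The write cost of `publish` is bounded by the threshold: no single post can trigger more than `threshold - 1` cache writes.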
When a user opens their feed, our system fetches the pre-built cache (containing posts from non-celebrity followees) and merges in recent posts from any celebrities they follow. This merge runs on dedicated timeline mixer servers that combine the two sources in under 50ms.
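The read-time merge can be sketched as a k-way merge of sorted lists. This assumes each source is already sorted newest first as `(timestamp, post_id)` tuples; `build_feed` and its parameters are hypothetical names, and the real timeline mixer would fetch these lists from the cache and the celebrity post stores.

```python
import heapq
from itertools import islice


def build_feed(cached_timeline, celebrity_posts_by_author, limit=20):
    """Merge a pre-built timeline with celebrity posts fetched at read time.

    Every input list holds (timestamp, post_id) tuples sorted newest first.
    Returns the post ids of the newest `limit` posts across all sources.
    """
    sources = [cached_timeline, *celebrity_posts_by_author.values()]
    # heapq.merge streams the sorted inputs lazily, so only `limit`
    # items are pulled even when the sources are long.
    merged = heapq.merge(*sources, key=lambda post: post[0], reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


cached = [(105, "a2"), (100, "a1")]
celebrity_posts = {"star": [(110, "s2"), (102, "s1")], "idol": [(103, "i1")]}
print(build_feed(cached, celebrity_posts))
```

The merge cost grows with the number of celebrities a user follows, since each one adds another source to combine.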
We chose hybrid fanout (not pure fanout-on-write) because the write amplification for celebrity accounts would require provisioning Redis clusters 10x beyond normal capacity for infrequent spikes. Trade-off: we gave up uniform read latency in exchange for bounded write costs.
Users who follow many celebrities see slightly higher read latency (30-50ms vs sub-10ms) due to the merge step. The threshold (10K followers) is tunable based on write capacity.
Implication: we need a follower count lookup on every publish to route it through the correct fanout path, adding one Redis GET per post. Some systems use engagement velocity instead of raw follower count to decide the strategy.
What if the interviewer asks: 'How do we handle a user crossing the threshold?' We re-classify users in a background job that runs hourly, not in the hot path. The transition does not require migrating existing cache entries because old entries expire naturally via TTL.
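A minimal sketch of that background re-classification pass is shown below. It assumes each account has a stored strategy label that the publish path reads; the function name, the `"write"`/`"read"` labels, and the dictionary inputs are hypothetical stand-ins for whatever the follower service and classification store actually expose.

```python
def reclassify(follower_counts, current_strategy, threshold=10_000):
    """Return the accounts whose fanout strategy should change.

    follower_counts:  author_id -> current follower count
    current_strategy: author_id -> "write" or "read" (stored classification)
    Accounts with no stored classification are treated as "write".
    """
    changes = {}
    for author_id, count in follower_counts.items():
        wanted = "write" if count < threshold else "read"
        if current_strategy.get(author_id, "write") != wanted:
            changes[author_id] = wanted
    return changes


counts = {"alice": 9_500, "bob": 10_200, "star": 50_000_000}
stored = {"alice": "write", "bob": "write", "star": "read"}
print(reclassify(counts, stored))
```

Only the stored label changes here. Nothing touches existing cache entries, which matches the point above that old entries simply age out via TTL.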
Why it matters in interviews
This is the answer interviewers expect. Explaining the celebrity problem with concrete numbers (50M writes per post) and then presenting the hybrid threshold shows we understand why neither pure approach works alone. Naming the timeline mixer as a distinct component elevates the design.
Related concepts