Group Chat Fanout

A user sends a message to a 500-member group. We need to deliver it to 499 other members.
At 231K messages/sec with 10% being group messages and average group size 20, that is 231K × 0.1 × 20 = 462K fan-out deliveries per second on top of 1:1 messages. We chose a group message queue pattern: the chat service writes the message once to Cassandra (partitioned by group conversation_id), then publishes one event per online member to the routing layer.
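The fan-out arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and parameter names are made up for illustration, and the inputs are the figures quoted in the text.

```python
def fanout_deliveries_per_sec(total_msgs_per_sec, group_fraction, avg_group_size):
    """Delivery events per second generated by group messages alone."""
    # Share of traffic that targets a group rather than a single user.
    group_msgs_per_sec = total_msgs_per_sec * group_fraction
    # Each group message is counted as one delivery per member, as in the text.
    return group_msgs_per_sec * avg_group_size


# Figures from the text: 231K msgs/sec, 10% group traffic, average group size 20.
deliveries = fanout_deliveries_per_sec(231_000, 0.10, 20)
print(f"{deliveries:,.0f} fan-out deliveries/sec")
```

Changing `avg_group_size` shows how quickly the delivery rate grows when groups get larger while the message rate stays the same.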
The naive approach sends 499 individual messages through the routing layer.
Why not write 499 copies to storage? Because the message body is identical for all members.
We store it once and fan out only the delivery notification (message_id + conversation_id, ~16 bytes each). The fan-out service queries the conversation_members table in MySQL to get the member list, checks each member's presence in Redis, and for online members, looks up their gateway server and pushes the message.
For offline members, it queues a push notification with a 5-second delivery target. We cap group size at 500 members because fan-out cost scales linearly with group size.
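The store-once, deliver-many flow described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: plain dictionaries and lists stand in for Cassandra, the MySQL conversation_members table, Redis presence, the gateway servers, and the push notification queue, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

MAX_GROUP_SIZE = 500  # group size cap stated in the text


@dataclass
class FanoutService:
    # In-memory stand-ins (assumptions) for the stores named in the text.
    members: dict    # conversation_id -> list of user_ids (MySQL conversation_members)
    presence: dict   # user_id -> gateway id, present only for online users (Redis)
    message_store: dict = field(default_factory=dict)   # Cassandra stand-in
    gateway_pushes: list = field(default_factory=list)  # deliveries to online members
    push_queue: list = field(default_factory=list)      # push notifications for offline members

    def send_group_message(self, conversation_id, message_id, sender_id, body):
        member_ids = self.members[conversation_id]
        if len(member_ids) > MAX_GROUP_SIZE:
            raise ValueError("group exceeds the 500-member cap")

        # Write the body exactly once, keyed by conversation and message.
        self.message_store[(conversation_id, message_id)] = body

        # Fan out only a small notification, never a copy of the body.
        notification = (message_id, conversation_id)
        for user_id in member_ids:
            if user_id == sender_id:
                continue  # the sender does not receive their own message
            gateway = self.presence.get(user_id)
            if gateway is not None:
                self.gateway_pushes.append((gateway, user_id, notification))
            else:
                self.push_queue.append((user_id, notification))


svc = FanoutService(
    members={"g1": ["alice", "bob", "carol"]},
    presence={"bob": "gateway-7"},  # only bob is online
)
svc.send_group_message("g1", "m1", "alice", "hello")
print(len(svc.message_store), len(svc.gateway_pushes), len(svc.push_queue))
```

In this toy run the body is stored once, the online member gets a gateway delivery, and the offline member gets a queued push notification. A real service would also need retries, ordering guarantees, and batching of the presence lookups, none of which is modeled here.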
A 10,000-member group would generate 10K delivery events per message, and at 100 messages/minute, that is 10K × 100 = 1M events/minute from one group. Discord solves this with channel-based pub/sub (members subscribe to channels, not individual delivery), but for groups of up to 500 members, direct delivery is simpler and more reliable.
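To make the linear scaling concrete, the per-group event rate can be computed for the capped size and for the 10,000-member case. This is a sketch with a hypothetical helper name. It counts one delivery event per member per message, the same simplification the text uses.

```python
def delivery_events_per_minute(group_size, messages_per_minute):
    """Delivery events one group generates per minute (one event per member per message)."""
    return group_size * messages_per_minute


# Compare the 500-member cap with a 10,000-member group at 100 messages/minute.
for size in (500, 10_000):
    events = delivery_events_per_minute(size, 100)
    print(f"{size:>6} members -> {events:,} events/minute")
```

The cost grows in direct proportion to group size, which is why the cap bounds the worst-case load a single busy group can place on the routing layer.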
Trade-off: our approach works for WhatsApp-style groups (up to 500) but not for Discord-style servers (thousands of members). We accept this scope limit.
Why it matters in interviews
Interviewers push on group size limits. Explaining the fan-out math (462K deliveries/sec) and why you cap at 500 members shows you understand the linear scaling cost. Mentioning store-once, deliver-many proves you separate storage from delivery.