Chat System Cheat Sheet
Key concepts, trade-offs, and quick-reference notes for your interview prep.
Per-Message Storage Breakdown
#1💡 150 bytes per message, 3 TB/day, 1.1 PB/year raw, 3.3 PB/year at RF=3 (replication factor 3)
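The figures on this card follow from one multiplication chain, sketched below. The daily message count is not stated on the card; 20 billion/day is an assumption derived from 3 TB/day divided by 150 bytes, and decimal units (1 TB = 10^12 bytes) are assumed as well.

```python
# Back-of-envelope storage estimate using the figures from this card.
BYTES_PER_MESSAGE = 150            # from the card
MESSAGES_PER_DAY = 20 * 10**9      # assumption: implied by 3 TB/day / 150 bytes
REPLICATION_FACTOR = 3             # RF=3, from the card

TB = 10**12  # decimal terabyte (assumed)
PB = 10**15  # decimal petabyte (assumed)


def storage_estimate(bytes_per_msg, msgs_per_day, rf):
    """Return daily, yearly, and replicated yearly storage."""
    daily_bytes = bytes_per_msg * msgs_per_day
    yearly_bytes = daily_bytes * 365
    return {
        "daily_tb": daily_bytes / TB,
        "yearly_pb": yearly_bytes / PB,
        "replicated_pb": yearly_bytes * rf / PB,
    }


est = storage_estimate(BYTES_PER_MESSAGE, MESSAGES_PER_DAY, REPLICATION_FACTOR)
print(est)
```

Rounded to one decimal place, the yearly and replicated values match the 1.1 PB and 3.3 PB quoted on the card.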
WebSocket Server Sizing
#2💡 50K connections/server, 10K servers, 500 MB RAM per server, raise the open-file limit (ulimit -n) to 65536
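A quick sanity check of how these numbers relate. The per-connection memory budget and the file-descriptor headroom are derived here, not stated on the card, and assume the 500 MB is spent entirely on connection state.

```python
# Sizing check for the WebSocket tier, using the figures from this card.
CONNECTIONS_PER_SERVER = 50_000
SERVERS = 10_000
RAM_PER_SERVER_BYTES = 500 * 10**6   # 500 MB, decimal units assumed
FD_LIMIT = 65_536                    # ulimit -n value from the card

# Total concurrent connections the fleet can hold.
total_connections = CONNECTIONS_PER_SERVER * SERVERS

# Memory budget per connection if all 500 MB goes to connection state (assumption).
bytes_per_connection = RAM_PER_SERVER_BYTES // CONNECTIONS_PER_SERVER

# Descriptors left over for logs, upstream sockets, and so on.
fd_headroom = FD_LIMIT - CONNECTIONS_PER_SERVER

print(total_connections, bytes_per_connection, fd_headroom)
```

This gives 500M total connections, about 10 KB per connection, and roughly 15K spare file descriptors per server.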
Per-Conversation Sequence Numbers
#3💡 Per-conversation (not global), Redis INCR, assigned before Kafka publish
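A minimal sketch of the ordering idea: one counter per conversation, incremented before the message is published. `FakeRedis`, the `seq:<conversation_id>` key format, and the list standing in for Kafka are all hypothetical; only the INCR semantics (a missing key starts at 0 and the call returns the incremented value) mirror real Redis.

```python
from collections import defaultdict


class FakeRedis:
    """In-memory stand-in for Redis; only INCR is modelled."""

    def __init__(self):
        self._counters = defaultdict(int)

    def incr(self, key):
        # Like Redis INCR: missing keys start at 0, returns the new value.
        self._counters[key] += 1
        return self._counters[key]


def send_message(store, log, conversation_id, body):
    """Assign the next per-conversation seq, then 'publish' the event."""
    seq = store.incr(f"seq:{conversation_id}")  # hypothetical key format
    event = {"conversation_id": conversation_id, "seq": seq, "body": body}
    log.append(event)  # stand-in for the Kafka publish
    return event


store, log = FakeRedis(), []
first = send_message(store, log, "alice-bob", "hi")
second = send_message(store, log, "alice-bob", "you there?")
other = send_message(store, log, "carol-dave", "hello")
print(first["seq"], second["seq"], other["seq"])
```

Each conversation gets its own gap-free sequence, so no single global counter sits on the hot path.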
Idempotency Keys for Dedup
#4💡 Client-generated UUID, Redis SET with 24h TTL, ~800 GB dedup cache across 15 nodes
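One way to picture the dedup check is the sketch below. `DedupCache` is a hypothetical in-memory stand-in; using the set-if-absent form (Redis `SET key value NX EX 86400`) is an assumption, since the card only says "SET with 24h TTL".

```python
import time
import uuid


class DedupCache:
    """In-memory stand-in for a Redis set-if-absent with expiry."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._expiry = {}          # idempotency key -> expiry timestamp
        self._ttl = ttl_seconds
        self._clock = clock

    def first_time(self, key):
        """Return True if the key is new (or expired) and record it."""
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False           # duplicate within the TTL window
        self._expiry[key] = now + self._ttl
        return True


cache = DedupCache()
idempotency_key = str(uuid.uuid4())   # generated once by the client, reused on retry

accepted = cache.first_time(idempotency_key)   # original send
retried = cache.first_time(idempotency_key)    # client retry after a timeout
print(accepted, retried)
```

The retry carries the same UUID, so the server can acknowledge it without storing or delivering the message a second time.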
Lazy Presence (Not Eager Broadcast)
#5💡 Lazy, not eager. 16.7M heartbeats/sec. 60s TTL. Only push to active viewers.
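A rough sketch of lazy presence follows. The 30-second heartbeat interval and the 500M online users are assumptions: they are the values that reproduce the 16.7M heartbeats/sec figure, and 500M matches 50K connections on each of 10K servers. `PresenceStore` is a hypothetical in-memory stand-in for a TTL-keyed store.

```python
import time

HEARTBEAT_INTERVAL_S = 30   # assumption: 500M users / 30 s ≈ 16.7M heartbeats/sec
PRESENCE_TTL_S = 60         # from the card


def heartbeats_per_second(online_users, interval_s=HEARTBEAT_INTERVAL_S):
    """Average heartbeat write rate for a given number of online users."""
    return online_users / interval_s


class PresenceStore:
    """Records the last heartbeat; status is computed only when someone asks."""

    def __init__(self, ttl_s=PRESENCE_TTL_S, clock=time.monotonic):
        self._last_seen = {}
        self._ttl = ttl_s
        self._clock = clock

    def heartbeat(self, user_id):
        # Just a write; nothing is broadcast to the user's contacts.
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id):
        # Called lazily, e.g. when a viewer opens a chat with this user.
        seen = self._last_seen.get(user_id)
        return seen is not None and self._clock() - seen < self._ttl


rate_millions = round(heartbeats_per_second(500_000_000) / 1e6, 1)
print(rate_millions)
```

With a TTL of twice the assumed heartbeat interval, a user has to miss two consecutive heartbeats before being shown as offline.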
Connection Registry (Redis)
#6💡 conn:user_id -> server_id, 90s TTL, one ~1ms lookup vs broadcasting to all 10K servers
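The routing benefit can be sketched as follows. `ConnectionRegistry` is a hypothetical dict-backed stand-in for Redis, and the 90s TTL is noted only in a comment rather than implemented.

```python
class ConnectionRegistry:
    """Maps conn:<user_id> to the WebSocket server holding that user's socket."""

    def __init__(self):
        self._map = {}

    def register(self, user_id, server_id):
        # In Redis this entry would carry a 90s TTL and be refreshed while connected.
        self._map[f"conn:{user_id}"] = server_id

    def lookup(self, user_id):
        return self._map.get(f"conn:{user_id}")


def targeted_delivery(registry, user_id):
    """Servers to contact when the registry is used: at most one."""
    server = registry.lookup(user_id)
    return [server] if server is not None else []


def broadcast_delivery(all_servers):
    """Servers to contact without a registry: every one of them."""
    return list(all_servers)


all_servers = [f"ws-{i}" for i in range(10_000)]
registry = ConnectionRegistry()
registry.register("u1", "ws-42")

print(len(targeted_delivery(registry, "u1")), len(broadcast_delivery(all_servers)))
```

An empty result from `targeted_delivery` means the recipient is offline, so there is nothing to push in real time.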
Kafka as Message Buffer
#7💡 Kafka decouples delivery from persistence. Ack in 12ms, not 60ms.
Catch-Up Sync on Reconnection
#8💡 Client sends last_seen_seq per conversation. Cassandra range query. Single-partition read.
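A sketch of the catch-up read, assuming messages are partitioned by conversation and ordered by sequence number. `ConversationLog` is a hypothetical stand-in for one Cassandra partition, and the CQL in the comment uses an assumed table and column layout.

```python
from bisect import bisect_right


class ConversationLog:
    """One partition: a single conversation's messages, ordered by seq."""

    def __init__(self):
        self._seqs = []
        self._bodies = []

    def append(self, seq, body):
        # Assumes seqs arrive in increasing order, as per-conversation counters give.
        self._seqs.append(seq)
        self._bodies.append(body)

    def after(self, last_seen_seq, limit=100):
        # Roughly: SELECT ... WHERE conversation_id = ? AND seq > ? LIMIT ?
        # (hypothetical schema)
        start = bisect_right(self._seqs, last_seen_seq)
        end = start + limit
        return list(zip(self._seqs[start:end], self._bodies[start:end]))


def catch_up(logs, last_seen):
    """Return missed messages per conversation for a reconnecting client."""
    return {
        conv_id: logs[conv_id].after(seq)
        for conv_id, seq in last_seen.items()
        if conv_id in logs
    }


log = ConversationLog()
for seq, body in enumerate(["a", "b", "c", "d"], start=1):
    log.append(seq, body)

missed = catch_up({"alice-bob": log}, {"alice-bob": 2})
print(missed)
```

Because the client reports a position per conversation, each read touches only that conversation's partition and returns only the messages after that position.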
Group Chat Fan-Out Limits
#9💡 500 member cap. Store once, fan out delivery notifications. 462K deliveries/sec from groups.
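"Store once, fan out delivery notifications" can be pictured as below. The dict used as a message store and the shape of the notification records are hypothetical; skipping the sender is also an assumption.

```python
MAX_GROUP_MEMBERS = 500  # cap from the card


def fan_out(message_store, members, sender_id, message_id, body):
    """Store the body once, then emit one small notification per recipient."""
    if len(members) > MAX_GROUP_MEMBERS:
        raise ValueError(f"group exceeds {MAX_GROUP_MEMBERS} members")

    # A single write of the payload, regardless of group size.
    message_store[message_id] = body

    # Notifications reference the message by id instead of copying the body.
    return [
        {"recipient": member, "message_id": message_id}
        for member in members
        if member != sender_id
    ]


store = {}
members = [f"user-{i}" for i in range(5)]
notifications = fan_out(store, members, "user-0", "msg-1", "standup in 5")
print(len(store), len(notifications))
```

The cap bounds the worst-case number of notifications per message at 499, which keeps the fan-out work per send predictable.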
E2E Encryption Trade-Offs
#10💡 Signal Protocol. Server is blind relay. No server-side search or moderation.