Autocomplete Cheat Sheet
Key concepts, trade-offs, and quick-reference notes for your interview prep.
Precompute the Answer, Not the Data
#1💡 k is a product constant (the UI shows 8-10). Storing the top 100 per prefix multiplies the index roughly 10x for rows no user ever sees.
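A minimal sketch of what "precompute the answer" can look like, assuming the index is a plain map from prefix to its top-k completions ranked by raw query count. The function name `build_prefix_index`, the sample counts, and count-only ranking are illustrative assumptions, not the design's actual scoring.

```python
from collections import defaultdict
import heapq

K = 10  # assumed display size; the source says the UI shows 8-10


def build_prefix_index(query_counts, k=K):
    """Map every prefix to its k highest-count completions (hypothetical helper)."""
    candidates = defaultdict(list)
    for query, count in query_counts.items():
        # Every prefix of the query is a key that could surface it.
        for end in range(1, len(query) + 1):
            candidates[query[:end]].append((count, query))
    # Keep only the k rows the UI can actually show; the rest is never stored.
    return {
        prefix: [query for _, query in heapq.nlargest(k, rows)]
        for prefix, rows in candidates.items()
    }


# Made-up counts, for illustration only.
counts = {"weather": 900, "weather today": 700, "web": 400, "west": 50}
index = build_prefix_index(counts, k=2)
print(index["we"])   # the two highest-count completions for "we"
print(index["wes"])  # only one candidate exists for "wes"
```

Serving is then a single lookup per prefix; the ranking work happens at build time, and the stored row count per prefix is capped at k.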
Client Contract: The 60-70% QPS Cut
#2💡 Debouncing does not slow users: the final prefix (the one they stop on) is the one that renders.
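To make the debounce claim concrete, here is a small simulation, assuming a trailing-edge debounce with a 150 ms quiet window (the window length and the keystroke timings are assumptions, not values from the source). A prefix is sent only if no further keystroke arrives within the window.

```python
def debounced_requests(keystrokes, wait_ms=150):
    """Return the prefixes a trailing-edge debounce would actually send.

    keystrokes: list of (timestamp_ms, prefix) tuples in time order.
    wait_ms: assumed quiet window before a request fires.
    """
    sent = []
    for i, (ts, prefix) in enumerate(keystrokes):
        is_last = i == len(keystrokes) - 1
        # Fire only if the user paused long enough, or stopped typing entirely.
        if is_last or keystrokes[i + 1][0] - ts >= wait_ms:
            sent.append(prefix)
    return sent


# Hypothetical typing trace: a short pause after "wea", then a burst to the end.
typing = [(0, "w"), (80, "we"), (160, "wea"), (400, "weat"),
          (470, "weath"), (540, "weathe"), (610, "weather")]
sent = debounced_requests(typing)
print(sent)
```

In this trace only the prefix at the pause and the final prefix go out. The prefix the user stops on is always sent, which is why debouncing does not change what they end up seeing.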
Zipf Head: Three Caches Before the Service
#3💡 No Redis between edge and service: the index is already in RAM on the serving nodes; a cache there adds a hop, not speed.
Index Math: 10M Queries to 80 GB
#4💡 Replication is for throughput, not durability: the immutable snapshot in object storage is the durable copy.
Build -> Ship -> Swap: Immutable Snapshots
#5💡 Never mutate the live index in place: real-time updates belong to the overlay, not the index.
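One way to picture the swap step, as a sketch only: the serving node holds a reference to a read-only snapshot and replaces the reference, never the contents. The class name `SnapshotServer` and the dict-based snapshot are assumptions for illustration; a real node would load the snapshot from object storage.

```python
import threading


class SnapshotServer:
    """Serves lookups from an immutable snapshot; updates arrive as whole-snapshot swaps."""

    def __init__(self, snapshot):
        self._snapshot = snapshot      # treated as read-only after construction
        self._lock = threading.Lock()  # serializes swaps; readers take no lock

    def lookup(self, prefix):
        # Take one reference up front so a concurrent swap cannot change
        # which snapshot this request reads from.
        snapshot = self._snapshot
        return snapshot.get(prefix, [])

    def swap(self, new_snapshot):
        """Point at the new snapshot and return the old one for rollback."""
        with self._lock:
            old, self._snapshot = self._snapshot, new_snapshot
        return old


server = SnapshotServer({"we": ["weather"]})
previous = server.swap({"we": ["web"]})
print(server.lookup("we"), previous["we"])
```

Keeping the previous snapshot around makes rollback another reference swap rather than a rebuild.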
Trending Overlay: Add, Never Reorder
#6💡 Why not 5-minute full rebuilds? They waste compute on the 99.99% of entries that did not change, and they leave no window for a human to catch a poisoned build.
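A sketch of "add, never reorder", under the assumption that trending items take a small number of reserved slots at the tail while the snapshot's own ranking stays untouched. The function name `merge_with_overlay`, the slot count, and the tail placement are hypothetical choices, not taken from the source.

```python
def merge_with_overlay(base, trending, k=10, max_trending=2):
    """Add trending suggestions without changing the relative order of base results.

    base: ranked suggestions from the immutable snapshot.
    trending: overlay candidates for the same prefix.
    max_trending: assumed cap on slots the overlay may occupy.
    """
    # Skip trending items the snapshot already serves; cap the rest.
    fresh = [q for q in trending if q not in base][:max_trending]
    # Trim only the tail of the base list to make room.
    kept = base[: k - len(fresh)]
    return kept + fresh


merged = merge_with_overlay(
    base=["a1", "a2", "a3", "a4"],
    trending=["a2", "t1", "t2", "t3"],
    k=4,
)
print(merged)
```

Because the overlay can only append into capped slots, the worst a bad overlay entry can do is occupy one of those slots; it cannot displace the head of the list.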
Trending Manipulation Defenses
#7💡 Design for an adversary: the overlay is the single most manipulable surface in the system.
Scoring and the Safety Gates
#8💡 Posture is minutes-to-mitigate + days-to-clean, not the fantasy of prevention. Say that out loud.
Capacity: Keystrokes to Origin QPS
#9💡 Autocomplete traffic ~= 2-3x search traffic even after debouncing. Derive it, do not hand-wave it.
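A back-of-the-envelope version of that derivation. Every input below is an assumption chosen for illustration, except the debounce cut, which uses the midpoint of the 60-70% range stated earlier in this sheet; substitute your own numbers in the interview.

```python
def autocomplete_load(search_qps, keystrokes_per_search, debounce_cut, cache_hit_ratio):
    """Derive autocomplete request rates from search traffic (all inputs assumed)."""
    raw_qps = search_qps * keystrokes_per_search   # one request per keystroke
    sent_qps = raw_qps * (1 - debounce_cut)        # after client-side debouncing
    origin_qps = sent_qps * (1 - cache_hit_ratio)  # after the caches in front of the service
    return {
        "raw_qps": raw_qps,
        "sent_qps": sent_qps,
        "origin_qps": origin_qps,
        "ratio_to_search": sent_qps / search_qps,
    }


load = autocomplete_load(
    search_qps=10_000,        # hypothetical search traffic
    keystrokes_per_search=8,  # assumed average keystrokes before a search is submitted
    debounce_cut=0.65,        # midpoint of the 60-70% cut
    cache_hit_ratio=0.80,     # assumed combined hit ratio of the cache layers
)
print(load)
```

With these inputs the post-debounce rate comes out at about 2.8x search traffic, inside the 2-3x range, and the origin sees about a fifth of that.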
The Metrics: Latency Is Table Stakes, CTR Is Truth
#10💡 A perfectly fast autocomplete serving last week's world is broken. Index age is an availability metric.
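A sketch of treating index age as a health signal, assuming each snapshot carries its build timestamp and the node reports unhealthy past a threshold. The 36-hour threshold and both function names are hypothetical.

```python
import time

MAX_INDEX_AGE_SECONDS = 36 * 3600  # assumed alert threshold, not from the source


def index_age_seconds(built_at_epoch, now=None):
    """Seconds since the serving snapshot was built."""
    now = time.time() if now is None else now
    return max(0.0, now - built_at_epoch)


def is_stale(built_at_epoch, max_age_seconds=MAX_INDEX_AGE_SECONDS, now=None):
    """True when the snapshot is older than the allowed age, however fast it serves."""
    return index_age_seconds(built_at_epoch, now) > max_age_seconds


week = 7 * 24 * 3600
print(is_stale(built_at_epoch=0, now=week))     # a week-old snapshot
print(is_stale(built_at_epoch=0, now=60 * 60))  # a one-hour-old snapshot
```

Exporting the age alongside latency lets a stalled build pipeline page someone even while p99 looks healthy.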