Search Autocomplete
COMMONAutocomplete design is asked at Google, Amazon, LinkedIn, and Meta because it hides a complete distributed system behind the most familiar UI element on the internet. It is how a suggest box answers 700K requests per second in under 100ms: faster than the next keystroke falls. You will abolish query-time ranking by precomputing every prefix's top-10 (200M entries, 80 GB of sharded RAM), delete 60-70% of traffic with a three-rule client contract, layer trending freshness over immutable hourly snapshots, and treat suggestion quality as the safety system it really is.
- Abolish query-time ranking: precompute top-10 per prefix (10M queries -> 200M entries -> 80 GB)
- Cut 60-70% of QPS with the client contract: debounce, 2-char minimum, sequence-numbered discard
- Layer freshness: immutable snapshot builds + a capped trending overlay that adds but never reorders
Visual Solutions
Step-by-step animated walkthroughs with capacity estimation, API design, database schema, and failure modes built in.
Cheat Sheet
Key concepts, trade-offs, and quick-reference notes for Autocomplete. Everything you need at a glance.
Anti-Patterns
Common design mistakes candidates make. Wrong approaches vs correct approaches for each trap.
Failure Modes
What breaks in production, how to detect it, and how to fix it. Detection metrics, mitigations, and severity ratings.
Start simple. Build to staff-level.
“I would design autocomplete for 500M DAU serving 700K requests per second in under 100ms. The key move: abolish query-time ranking by precomputing every prefix's top-10 offline: 10M corpus queries become 200M prefix entries, about 80 GB of hash-sharded RAM, so serving is one hash get in microseconds. A client contract (150ms debounce, 2-character minimum, sequence-numbered stale discard) deletes 60-70% of keystrokes before any server, and Zipf-aligned edge caching absorbs two-thirds of the rest: origin sees about 200K/sec. Freshness layers hourly immutable snapshots (canaried pointer-swap promotion) with a megabytes-sized decaying trending overlay capped at 2-3 marked slots. Quality is a safety system: CTR-weighted scoring, build-time filters, and a 60-second kill switch.”
Precomputed Top-K
TRICKYStore the answer, not the data: 200M prefix entries, one hash get, microseconds
Core Autocomplete DesignTrie vs Hash Map
STANDARDPointer-chasing vs string duplication: hash for exact-prefix speed, trie for compression
Core Autocomplete DesignZipf and Cache Layers
STANDARDTop 100K prefixes ~ half of traffic: browser, edge, and hot-map absorb it before origin
High Level System DesignImmutable Snapshots
STANDARDBuild, ship, warm, swap: promotion is a pointer; rollback is the same pointer
Database SchemaTrending Overlay
TRICKYBatch owns truth, the overlay owns the last 5 minutes: add, never reorder
Replication and Fault TolerancePrefix Sharding
STANDARDHash the full prefix (alphabetic ranges melt on Zipf); replicas are throughput, not durability
High Level System DesignThe Client Contract
EASYDebounce + 2-char min + sequence numbers: 60-70% of QPS deleted before any server
Scale and NumbersQuality and Safety Gates
STANDARDCTR-weighted scoring, build-time filters, 60-second kill switch: minutes to mitigate
Monitoring and Complete System