Search Autocomplete

COMMON

Autocomplete design is asked at Google, Amazon, LinkedIn, and Meta because it hides a complete distributed system behind the most familiar UI element on the internet. It is how a suggest box answers 700K requests per second in under 100ms: faster than the next keystroke falls. You will abolish query-time ranking by precomputing every prefix's top-10 (200M entries, 80 GB of sharded RAM), delete 60-70% of traffic with a three-rule client contract, layer trending freshness over immutable hourly snapshots, and treat suggestion quality as the safety system it really is.

Abolish query-time ranking: precompute top-10 per prefix (10M queries -> 200M entries -> 80 GB)
Cut 60-70% of QPS with the client contract: debounce, 2-char minimum, sequence-numbered discard
Layer freshness: immutable snapshot builds + a capped trending overlay that adds but never reorders

GoogleAmazonLinkedInMetaUberAirbnb

Concepts

Deep dives

Cheat Items

Quick ref

▶

Visual Solutions

Step-by-step animated walkthroughs with capacity estimation, API design, database schema, and failure modes built in.

AnimatedWatch solutions →

📋

Cheat Sheet

Key concepts, trade-offs, and quick-reference notes for Autocomplete. Everything you need at a glance.

Quick referenceView cheat sheet →

⚠

Anti-Patterns

Common design mistakes candidates make. Wrong approaches vs correct approaches for each trap.

8 anti-patternsLearn pitfalls →

🔥

Failure Modes

What breaks in production, how to detect it, and how to fix it. Detection metrics, mitigations, and severity ratings.

5 failure modesStudy failures →

Difficulty Ladder

Start simple. Build to staff-level.

Level 1

Junior / Basics

Core concepts, single-service design, straightforward requirements

Level 2

Mid-Level Interview

Multi-service architecture, trade-off discussions, standard scaling

Level 3

Senior / Deep Dive

Complex distributed systems, failure modes, consistency guarantees

Level 4

Staff+ / FAANG Hard

Planet-scale design, novel architectures, cross-cutting concerns

Elevator Pitch3-minute interview summary

“I would design autocomplete for 500M DAU serving 700K requests per second in under 100ms. The key move: abolish query-time ranking by precomputing every prefix's top-10 offline: 10M corpus queries become 200M prefix entries, about 80 GB of hash-sharded RAM, so serving is one hash get in microseconds. A client contract (150ms debounce, 2-character minimum, sequence-numbered stale discard) deletes 60-70% of keystrokes before any server, and Zipf-aligned edge caching absorbs two-thirds of the rest: origin sees about 200K/sec. Freshness layers hourly immutable snapshots (canaried pointer-swap promotion) with a megabytes-sized decaying trending overlay capped at 2-3 marked slots. Quality is a safety system: CTR-weighted scoring, build-time filters, and a 60-second kill switch.”

Concepts Unlocked8 concepts in this topic