Whiteboard ScaleAutocompleteAnti-Patterns
Anti-Patterns

Autocomplete Anti-Patterns

Common design mistakes candidates make. Learn what goes wrong and how to avoid each trap in your interview.

SELECT ... WHERE query LIKE 'prefix%' per Keystroke

Very CommonFORMULA

Serving suggestions straight from the search database with a LIKE query on every keystroke.

Why: It works in the demo with 10K rows, ships in a sprint, and the database is already there.

WRONG: Every keystroke becomes an index range-scan plus sort on the production database. At 100K keystrokes/sec the search DB spends itself on suggestions and the actual search product degrades with it.
RIGHT: A dedicated precomputed prefix index in RAM: one hash get per request, built offline from logs. The database never sees a keystroke.

Walking the Trie and Ranking at Query Time

Very CommonFORMULA

Storing raw completions in a trie and computing top-k (subtree walk + heap) on every request.

Why: It is the textbook data-structure answer, and it benchmarks fine on narrow prefixes in development.

WRONG: The prefix "a" owns hundreds of thousands of descendants. Walking and heap-ranking them takes hundreds of ms of CPU: per keystroke: exactly on the hottest prefixes where latency matters most.
RIGHT: Precompute top-10 at every node/prefix at build time. Query time reads a cached answer: O(1) relative to corpus size. The trie walk happens once per build, not once per keystroke.

One Request per Keystroke

Very CommonFORMULA

The client fires a suggest request on every input event with no debounce, threshold, or cancellation.

Why: The naive onChange handler is one line; the waste is invisible on the developer's machine and network.

WRONG: A user typing "weather" sends 7 requests; 5 are for prefixes they typed through in 300ms. Fleet capacity is provisioned 3x higher than needed, and out-of-order responses flash wrong suggestions.
RIGHT: The client contract: 150-200ms debounce, 2-character minimum, sequence-numbered responses with stale discard and AbortController cancellation. 60-70% of volume never leaves the device.

Sharding by Alphabetic Range

CommonFORMULA

Partitioning the prefix space a-c / d-f / ... across shards because it looks orderly.

Why: Range sharding feels natural for strings, and the skew only shows under production traffic distributions.

WRONG: Prefix popularity is Zipf-skewed by leading letter: the a-c shard takes an order of magnitude more traffic than x-z. One shard melts during every traffic peak while others idle at 5%.
RIGHT: Hash the full prefix onto the ring (consistent hashing). Exact-key lookups need no range locality, and hashing smooths hot and cold prefixes statistically across the fleet.

Mutating the Live Index In Place

CommonFORMULA

Applying score updates, insertions, and deletions directly to the serving index as events arrive.

Why: Real-time freshness sounds strictly better than batch, and the concurrency cost is invisible until scale.

WRONG: An 80 GB structure under 200K reads/sec now needs locks or epochs for every write; a pipeline bug or manipulation spike writes garbage into the live index with no rollback except a rebuild you no longer have.
RIGHT: Immutable snapshots with atomic pointer swap for the batch layer; a small, decaying, capped overlay for real-time. Freshness AND an undo button.

Rendering Whatever Response Arrives Last

CommonFORMULA

The UI displays each suggest response as it lands, with no sequencing between request and response.

Why: On fast office networks responses arrive in order; the bug needs real-world latency variance to reproduce.

WRONG: The response for "we" (slow, fat prefix) lands after the response for "wea". The user sees suggestions regress mid-typing: flickering, wrong, occasionally embarrassing.
RIGHT: Sequence number per request; render only if seq > last rendered; cancel in-flight requests on new input. Monotonic rendering: the browser edition of a distributed-systems invariant.

Frequency Is the Only Score

CommonFORMULA

Promoting whatever users type most, with no decay, no CTR signal, and no safety filtering.

Why: Frequency is the easiest signal to compute, and the failure modes are product/legal, not technical: invisible to the engineering demo.

WRONG: Offensive queries, typos, and coordinated campaigns rank first because raw volume says so. A person's name completes to an accusation. Screenshots circulate; legal calls; the feature gets turned off.
RIGHT: Time-decayed, CTR-weighted, deduplicated scoring; build-time blocklist + policy classes + PII scrubbing; serve-time kill switch for live incidents; full auditability of why any suggestion appeared.

Trusting the Trending Signal

OccasionalFORMULA

Letting raw short-window velocity write directly into user-visible suggestions.

Why: Breaking-news freshness demos brilliantly, and nobody red-teams the counter until someone else does.

WRONG: A botnet sustains one query for ten minutes; it outranks legitimate suggestions globally. The overlay becomes a free billboard for spam, scams, or coordinated harassment.
RIGHT: Per-source contribution caps, velocity-vs-baseline (not absolute counts), an N-minute sanity delay, a hard 2-3 slot merge cap that can add but never reorder, and the blocklist on the overlay path with zero exceptions.