TRICKYwalkthrough
Precompute the Answer: Top-K per Prefix
A user types "we" and expects "weather", "weather tomorrow", "wednesday" to appear before their finger reaches the next key: the whole round trip has perhaps 100 milliseconds, and the suggestion computation gets under 10ms of it. The naive design walks a data structure at query time: find the prefix node in a trie, explore every descendant, rank the completions by score, return the top 10.
Query-time ranking is dead on arrival. The fix is the topic's defining move: precompute the answer, not the data.
“Do the math on "a": that subtree contains hundreds of thousands of completions; walking and heap-ranking them takes hundreds of milliseconds of CPU: per keystroke, times 700K requests per second.”
For every prefix a user could type, store the final top-10 list as a value: "we" maps directly to [weather, weekend, wednesday, ...]. A lookup is one hash-map get (or one trie descent with a cached top-k at every node): O(1) relative to corpus size, microseconds of CPU.
The cost moves to storage and build time, where it is cheap: 10M distinct suggestion-worthy queries at an average 20 characters yield roughly 200M prefix entries; each holds 10 suggestions at ~40 bytes: about 80 GB, shardable RAM, rebuilt offline. The deeper principle generalizes across system design: when read latency is the constraint and the query set is enumerable, move work from read time to write/build time.
The feed topic made the same trade (fanout-on-write), and the KV store's bloom filters are the same instinct. The trade-offs: storage amplification (every query appears under ~20 prefixes), and staleness: a precomputed answer is only as fresh as its last build, which is exactly why the trending overlay exists as a second layer.
What if the interviewer asks: why top-10 and not top-100? The UI shows 8-10; storing more multiplies the index for suggestions no one can see: k is a product constant, not a tuning knob.
Related concepts