STANDARDwalkthrough
Geocoding and Place Search
A user types "coffee near me" into the Google Maps search bar. The system must interpret this as a category search ("coffee"), apply a geospatial filter ("near me" means within 2 km of the user's current location), rank results by relevance, rating, and distance, and return the top 10 results in under 200ms.
Each POI record contains name, address, category tags, coordinates, rating, and hours, averaging 500 bytes. Total storage: .
“This requires two core operations. Forward geocoding converts a text address into coordinates: "1600 Pennsylvania Ave, Washington DC" becomes . Reverse geocoding does the opposite: a tap on the map at returns "Empire State Building, New York." We store 200 million Points of Interest (POIs) in an Elasticsearch cluster.”
For text search, we tokenize the user's input and run a fuzzy match against place names using Elasticsearch's BM25 scoring. But raw text matching is not enough.
A search for "coffee" from San Francisco should not return results in Tokyo. We apply a geospatial boost: results within 2 km of the user's location receive a 10x relevance multiplier, results within 10 km get 3x, and anything beyond gets no boost.
Elasticsearch's geo_distance scoring function handles this natively. For autocomplete, we use Elasticsearch's completion suggester, which builds an in-memory finite state transducer (FST) for prefix matching.
As the user types "star", the FST returns "Starbucks" within 50ms. The FST for 200M POIs consumes about 8 GB of RAM.
The alternative to Elasticsearch is a custom trie with geospatial indexing (R-tree), which would be faster but requires building and maintaining the entire search infrastructure from scratch. Trade-off: Elasticsearch gives us full-text search, fuzzy matching, geospatial queries, and autocomplete out of the box, but it consumes more memory and has higher tail latency (p99 around 150ms) compared to a hand-tuned solution.
Google Places API handles billions of geocoding requests daily using a proprietary search engine, but the architecture principles are the same. What if the interviewer asks: how do we handle misspellings like "Starbuks"?
Elasticsearch's fuzzy query with edit distance 2 matches "Starbuks" to "Starbucks" with minimal latency impact.
Related concepts