Whiteboard ScaleTopicsGoogle Maps

Google Maps / Navigation

COMMON

Google Maps design is asked at Google, Apple, Uber, and Mapbox because it tests map tile rendering, graph routing algorithms, and real-time traffic estimation in one problem. It is how Google Maps serves 300 million daily users with sub-500ms route computation across 500 million road nodes. You will design a quadtree tile pyramid serving 521K tiles per second via CDN, a Contraction Hierarchies routing engine that is 100x faster than Dijkstra, and a GPS probe pipeline ingesting 4 million updates per second for 97%+ ETA accuracy.

  • Design Contraction Hierarchies routing that is 100x faster than Dijkstra on 500M nodes
  • Build a GPS probe pipeline ingesting 4M updates/sec for real-time traffic estimation
  • Serve 521K map tiles/sec via a quadtree tile pyramid with 98% CDN cache hit ratio
GoogleAppleUberMapboxHereWaze
8
Concepts
Deep dives
10
Cheat Items
Quick ref
Elevator Pitch3-minute interview summary

I would design a navigation platform for 300M DAU serving 521K tile requests per second from a pre-rendered catalog of 500 TB vector tiles across 22 zoom levels. CDN absorbs 98% of tile traffic. The routing engine uses Contraction Hierarchies on a 500M-node road graph, computing shortest paths in under 10ms (100x faster than Dijkstra). Live traffic from 4 million GPS probes per second updates 50M road segment speeds every 60 seconds in Redis. ETA prediction combines CH routing with an ML model trained on historical speed patterns, achieving 97%+ accuracy. Search across 200M POIs uses Elasticsearch with geospatial filtering. Directions serve 11,600 requests per second.

Concepts Unlocked8 concepts in this topic

Map Tile Rendering and Zoom Levels

STANDARD

Why pre-render tiles and not render on demand? Because on-demand rendering at 521K tiles/sec would need 26,000 GPUs. Pre-render into a quadtree pyramid with 4z4^z tiles per zoom level.

High Level System Design

Vector Tiles vs Raster Tiles

STANDARD

Why vector tiles and not raster PNG? Because vector tiles are 3x smaller, support client-side rotation, and decouple data from styling. Style changes do not require re-rendering 500 TB of tiles.

High Level System Design

Shortest Path with Contraction Hierarchies

TRICKY

Why CH and not Dijkstra? Because Dijkstra explores millions of nodes in 500ms. CH preprocessing adds shortcut edges, enabling bidirectional upward search across ~1,000 nodes in <10ms.

Core Routing Design

Road Network Graph and Partitioning

STANDARD

Why fit the graph in memory? Because CH routing needs sub-millisecond adjacency lookups. 500M nodes x 80B + 1.2B edges x 120B = 100 GB per server, replicated across 230 servers.

Database Schema

Live Traffic from GPS Probes

TRICKY

Why HMM map matching before speed aggregation? Because raw GPS has 5 to 15m error. A probe on an overpass attributed to the surface road below corrupts traffic for both roads.

Core Routing Design

ETA Prediction with ML

TRICKY

Why not distance over speed limit? Because real speeds vary by time, weather, and signals. Two-stage model: CH routing + GBDT correction with 15 features achieves 97%+ accuracy.

Monitoring and Complete System

Geocoding and Place Search

STANDARD

Why Elasticsearch and not SQL LIKE? Because LIKE '%coffee%' on 200M rows triggers a full table scan. Elasticsearch delivers sub-200ms BM25 search with geospatial filtering.

High Level System Design

Location Service and Map Matching

STANDARD

Why map matching and not raw GPS? Because 15m accuracy cannot distinguish an overpass from the road below. HMM with Viterbi decoding finds the most probable road sequence.

Replication and Fault Tolerance