TRICKYwalkthrough
Shortest Path with Contraction Hierarchies
A user in San Francisco requests driving directions to New York City. The road network has 500 million intersections and over 1 billion road segments.
How do we find the shortest path across a continent in under 10 milliseconds? We have three candidates.
“Running Dijkstra's algorithm from scratch would visit millions of nodes, taking 500ms to 2 seconds, far too slow when millions of users request routes simultaneously.”
Option one: Dijkstra's algorithm. It explores nodes in order of distance from the source.
Correct, but it fans out in all directions like a ripple, visiting millions of irrelevant nodes. For a cross-country route, it explores roughly 5 million nodes.
At 100ns per node lookup in memory, that is 500ms. Option two: A* search.
It adds a heuristic (straight-line distance to the destination) so the search fans out preferentially toward the target. This is 2 to 3x faster than Dijkstra, but for a 4,500 km route, it still explores over a million nodes and takes 200ms.
Option three: Contraction Hierarchies (CH). We pre-process the graph offline by repeatedly removing ("contracting") the least important nodes and adding shortcut edges that preserve shortest-path distances.
A small residential intersection gets contracted early; a highway interchange persists until the end. After pre-processing, a bidirectional search from source and destination meets in the middle, exploring only about 1,000 nodes in the high-importance "highway" layer.
Query time: under 10ms. That is 100x faster than Dijkstra.
Pre-processing takes about 30 minutes for the full US road graph. Google Maps, OSRM, and GraphHopper all use CH variants in production.
The trade-off is update cost. When a road closes or traffic changes edge weights, we cannot simply update one edge.
The shortcut edges that bypass the closed road are now invalid. We handle this with a real-time edge weight overlay: the CH structure uses static distances for pre-processing, and a lightweight overlay applies live traffic penalties at query time.
Full re-contraction runs nightly; partial re-contraction handles major topology changes within minutes. What if the interviewer asks: why not just use A* with better heuristics?
Because no admissible heuristic can bridge the gap. A* with Euclidean distance is already near-optimal for the heuristic, yet it still explores 1M+ nodes.
CH changes the problem from "search smarter" to "pre-compute a hierarchical shortcut network."
Related concepts