ETA Estimation
The displayed ETA is the single most important number in the rider experience. Get it wrong by 5 minutes and users lose trust.
We chose a three-layer pipeline (not Dijkstra alone, not ML alone) because each layer corrects a different class of error. Trade-off: we accepted higher compute cost (three evaluation stages per query) in exchange for cutting average ETA error from 25% to under 5%.
“The constraint: a static road graph gives 25% average error because it ignores traffic, construction, and time-of-day patterns.”
Layer 1: Dijkstra on the road graph. The road network is a weighted directed graph where edges are road segments and weights are traversal times.
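As a rough illustration of Layer 1, the sketch below runs Dijkstra over a small adjacency-list graph whose edge weights are traversal times in seconds. The graph layout, the node names, and the function name shortest_travel_time are assumptions made for this example, not the production data model.

```python
import heapq


def shortest_travel_time(graph, source, target):
    """Return the minimum travel time in seconds from source to target.

    graph maps each node to a list of (neighbor, seconds) pairs, one pair
    per directed road segment. Returns None if target is unreachable.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == target:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale heap entry; a faster path was already found
        for neighbor, seconds in graph.get(node, ()):
            candidate = elapsed + seconds
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return None


# Hypothetical four-intersection network; weights are seconds per segment.
ROADS = {
    "A": [("B", 120.0), ("C", 300.0)],
    "B": [("C", 60.0), ("D", 400.0)],
    "C": [("D", 90.0)],
    "D": [],
}

print(shortest_travel_time(ROADS, "A", "D"))  # 270.0, via A -> B -> C -> D
```

With static weights this yields only the baseline estimate; the later layers adjust the weights and the result.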
Layer 2: real-time traffic weights. Every active driver reports its speed every 3 seconds. We aggregate these reports into per-road-segment speed estimates, updated every 30 seconds. For example, when 50 drivers on a road segment average 15 km/h, that observed speed overrides the static speed limit of 60 km/h.
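The aggregation step could look roughly like the sketch below, which averages the speed reports collected in one 30-second window and converts the result into an edge weight. The report format, the min_reports threshold, and the 1 km/h speed floor are assumptions added for illustration; the text does not specify how sparse or stalled segments are handled.

```python
from collections import defaultdict


def aggregate_segment_speeds(reports, min_reports=1):
    """Average speed reports per road segment for one aggregation window.

    reports is an iterable of (segment_id, speed_kmh) pairs. Segments with
    fewer than min_reports samples are omitted (hypothetical threshold).
    """
    totals = defaultdict(lambda: [0.0, 0])  # segment_id -> [speed sum, count]
    for segment_id, speed_kmh in reports:
        totals[segment_id][0] += speed_kmh
        totals[segment_id][1] += 1
    return {
        segment_id: speed_sum / count
        for segment_id, (speed_sum, count) in totals.items()
        if count >= min_reports
    }


def traversal_seconds(length_km, segment_id, live_speeds, speed_limit_kmh):
    """Edge weight in seconds: live speed if available, else the speed limit."""
    speed_kmh = live_speeds.get(segment_id, speed_limit_kmh)
    speed_kmh = max(speed_kmh, 1.0)  # assumed floor to avoid division by zero
    return length_km * 3600.0 / speed_kmh


# 50 drivers averaging 15 km/h on a 1 km segment with a 60 km/h limit.
window = [("seg-1", 15.0)] * 50
live = aggregate_segment_speeds(window)
print(traversal_seconds(1.0, "seg-1", live, 60.0))  # 240.0 seconds, not 60.0
print(traversal_seconds(1.0, "seg-2", live, 60.0))  # 60.0, no live data
```

The resulting weights would replace the static ones on the road graph before the shortest-path search runs.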
Layer 3: ML correction. A gradient-boosted model trained on billions of historical trips corrects systematic biases: construction zones the map does not know about, traffic lights not modeled in the graph, school zones that slow traffic at 3 PM.
The fallback when the ML model fails: straight-line distance divided by average city speed (25 km/h). This overestimates by 40% on average but is better than showing nothing.
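A minimal sketch of the fallback path, under stated assumptions: straight-line distance is computed with the haversine formula (the text does not say which distance formula is used), and correct is a hypothetical callable standing in for the gradient-boosted model. Only the 25 km/h average city speed comes from the text.

```python
import math

AVERAGE_CITY_SPEED_KMH = 25.0  # average city speed stated in the text
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fallback_eta_seconds(origin, destination):
    """Straight-line distance divided by the average city speed."""
    distance_km = haversine_km(*origin, *destination)
    return distance_km / AVERAGE_CITY_SPEED_KMH * 3600.0


def eta_seconds(origin, destination, graph_eta_seconds, correct):
    """Apply the ML correction; use the fallback if the model call fails.

    correct is a hypothetical callable that takes the graph-based ETA in
    seconds and returns the corrected ETA in seconds.
    """
    try:
        return correct(graph_eta_seconds)
    except Exception:
        return fallback_eta_seconds(origin, destination)
```

For two points one degree of latitude apart (about 111 km), fallback_eta_seconds returns roughly 16,000 seconds. Catching every exception is a deliberate simplification here; a real service would likely distinguish timeouts from bad inputs and log the failure.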
Related concepts