
Ride Sharing System Design Walkthrough

Complete design walkthrough with animated diagrams, capacity math, API design, schema, and failure modes.

Ride-sharing platform for 1M DAU riders and 500K drivers, processing 167K GPS updates per second through Kafka into an H3 hexagonal geo index in Redis. We chose H3 (not QuadTree) because cell assignment is a pure function with no write contention at 167K/sec. Dispatch ranks candidates by ETA, direction, and rating with optimistic locking to prevent double-booking. WebSocket tracking during active rides at 1.3 MB/sec egress. Trip state in MySQL sharded by rider_id for ACID fare calculation. Location history in Cassandra with 30-day TTL.
1. What is Ride Sharing?

Uber connects 5.4 billion trips per year across 10,000 cities in 72 countries. The system sounds straightforward, but the real challenge is a geo-spatial matching problem at massive write throughput.
We need to accomplish three things in seconds: locate nearby drivers from 500K active drivers broadcasting GPS every 3 seconds, match the optimal driver by ETA, direction, and rating, and track both parties in real time for the trip duration. 500K drivers at 3-second intervals means 167K location writes per second into a spatial index that must answer 'who is nearby?' in under 100ms. This is not a database problem.
The entire working set fits in 16.5 MB, but the write throughput far exceeds what a traditional disk-backed database handles comfortably. We need an in-memory spatial index fed by a streaming pipeline.
  • We chose H3 hexagonal grid (not QuadTree) to index 500K driver locations because H3 cell assignment has zero write contention at 167K updates/sec
  • We buffer 167K location updates/sec through Kafka before writing to Redis, decoupling ingestion from indexing
  • We rank dispatch candidates by ETA, direction, rating, and acceptance rate (not nearest distance alone)
  • We use WebSocket for active ride tracking (not polling) to reduce request volume by 99%
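The first bullet's claim — cell assignment is a pure function, so 167K concurrent writers never contend on it — can be sketched with a toy square grid standing in for H3. Real H3 uses hexagonal cells via the `h3` library and k-ring neighbor lookups; the cell function, bucket layout, and 8-neighbor scan below are illustrative stand-ins, not Uber's implementation.

```python
from collections import defaultdict

CELL_DEG = 0.01  # ~1 km grid near the equator; toy stand-in for an H3 resolution

def cell_id(lat: float, lng: float) -> tuple[int, int]:
    # Pure function of the coordinates: every worker computes the same
    # cell with no coordination, so writers never contend on assignment.
    return (int(lat // CELL_DEG), int(lng // CELL_DEG))

# In production each cell bucket is a Redis structure; here a plain dict.
# (This sketch omits removing a driver from its previous cell.)
index: dict[tuple[int, int], set[str]] = defaultdict(set)

def update_driver(driver_id: str, lat: float, lng: float) -> None:
    index[cell_id(lat, lng)].add(driver_id)

def nearby(lat: float, lng: float) -> set[str]:
    # Scan the rider's cell plus its 8 neighbors (H3 would use a k-ring).
    r, c = cell_id(lat, lng)
    found: set[str] = set()
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            found |= index.get((r + dr, c + dc), set())
    return found

update_driver("d1", 37.7749, -122.4194)
update_driver("d2", 37.7755, -122.4190)
print(nearby(37.7750, -122.4192))  # both drivers found
```

Because `cell_id` needs no shared state, the Kafka consumers that drain the 167K updates/sec stream can compute it independently and issue simple per-cell writes to Redis.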