HLD Problem

Design Uber/Lyft

Design a ride-sharing platform with real-time location tracking, driver matching, ETA calculation, surge pricing, and payments.

45 min read · Hard

1. Requirements Gathering

Functional Requirements
  • Riders request rides with pickup/dropoff
  • Match riders with nearby drivers
  • Real-time location tracking
  • ETA calculation for pickup and trip
  • Fare estimation before ride
  • Surge/dynamic pricing
  • Payment processing
  • Rating system for drivers/riders
  • Trip history
Non-Functional Requirements
  • High availability (99.99%)
  • Low latency matching (< 1 min)
  • Real-time location updates (every 3-5s)
  • Strong consistency for payments
  • Handle millions of concurrent rides
  • Global scale (100+ countries)
  • Accurate ETA (< 10% error)

2. Capacity Estimation

Scale Numbers

  • Monthly Riders: 100M+
  • Active Drivers: 5M+
  • Trips/Day: 20M
  • Concurrent Rides: 1M+

Location Update Traffic

  • Active drivers sending location: 5M drivers
  • Update frequency: every 4 seconds
  • Location updates per second: ~1.25M/sec
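The arithmetic behind these figures can be checked with a few lines of Python. This is only a back-of-envelope sketch using the numbers stated above; the average trip rate is derived here from the 20M trips/day figure and is not a number given in the original estimate.

```python
# Back-of-envelope capacity check using the figures from this section.
ACTIVE_DRIVERS = 5_000_000     # drivers sending location
UPDATE_INTERVAL_S = 4          # one GPS update every 4 seconds
TRIPS_PER_DAY = 20_000_000
SECONDS_PER_DAY = 86_400

# Each driver contributes 1/4 of an update per second.
updates_per_sec = ACTIVE_DRIVERS / UPDATE_INTERVAL_S

# Average trip rate over a day (peak traffic will be higher).
trips_per_sec = TRIPS_PER_DAY / SECONDS_PER_DAY

print(f"location updates/sec: {updates_per_sec:,.0f}")
print(f"average trips/sec:    {trips_per_sec:,.0f}")
```

The location write path is several thousand times busier than the trip request path, which is why the location service gets its own storage and scaling strategy.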

3. High-Level Architecture

System Architecture

Clients
  • Rider App
  • Driver App
↓ WebSocket + REST
Gateway layer
  • API Gateway: Auth, Rate Limit
  • WebSocket Server: Real-time updates
Services
  • Ride Service: Trip lifecycle
  • Matching Service: Driver-rider match
  • Location Service: GPS tracking
  • Pricing Service: Fare, surge
  • ETA Service: Time estimates
  • User Service: Auth, profiles
  • Payment Service: Transactions
  • Notification: Push, SMS
Data stores
  • PostgreSQL: Users, trips
  • Redis/Geo: Driver locations
  • Kafka: Event streaming
  • Cassandra: Location history

4. Core Components Deep Dive

4.1 Real-Time Location Tracking

Drivers send GPS coordinates every 3-5 seconds, so the system must store each driver's latest position and answer "who is near this point?" queries efficiently.
Storage Options:
  • Redis with Geo - GEOADD, GEORADIUS (deprecated since Redis 6.2 in favor of GEOSEARCH)
  • Quadtree - spatial indexing
  • Google S2 - cell-based geo indexing
Query Pattern:
GEOADD drivers:active -122.4194 37.7749 driver:123
GEORADIUS drivers:active -122.4194 37.7749 5 km COUNT 10
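To make the add-then-radius-query pattern concrete without a running Redis server, here is a toy in-memory stand-in. It is a sketch only: `DriverIndex`, `haversine_km`, and the extra driver IDs and coordinates are hypothetical names and values chosen for illustration, and the linear scan is not how Redis Geo, a quadtree, or S2 would do the lookup.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

class DriverIndex:
    """Toy index: a dict plus a linear scan, for illustration only."""

    def __init__(self):
        self._positions = {}  # driver_id -> (lon, lat)

    def add(self, lon, lat, driver_id):
        # Loosely analogous to GEOADD: store or overwrite the latest position.
        self._positions[driver_id] = (lon, lat)

    def radius(self, lon, lat, radius_km, count):
        # Loosely analogous to a radius query with COUNT, nearest first.
        hits = []
        for driver_id, (dlon, dlat) in self._positions.items():
            dist = haversine_km(lon, lat, dlon, dlat)
            if dist <= radius_km:
                hits.append((dist, driver_id))
        hits.sort()
        return [driver_id for _, driver_id in hits[:count]]

index = DriverIndex()
index.add(-122.4194, 37.7749, "driver:123")  # at the query point
index.add(-122.4094, 37.7849, "driver:456")  # roughly 1.4 km away
index.add(-122.2712, 37.8044, "driver:789")  # well outside the 5 km radius
nearby = index.radius(-122.4194, 37.7749, 5, 10)
print(nearby)
```

A real deployment would replace the scan with one of the storage options listed above so that query cost does not grow with the total number of drivers.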

4.2 Driver Matching Algorithm

1. Find nearby drivers: Query geo-index for drivers within 5km radius
2. Filter available: Remove drivers on trips, offline, or at capacity
3. Calculate ETA: Get driving time from each driver to pickup
4. Rank drivers: Score by ETA, rating, acceptance rate
5. Send request: Offer trip to top driver, 15s timeout
6. Fallback: If declined/timeout, try next driver
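The filter, rank, offer, and fallback steps can be sketched as follows. The scoring weights, the `Candidate` fields, and the `offer` callback are all hypothetical; in particular `offer` stands in for a push to the driver app that resolves to accept, decline, or timeout, and the source does not specify how the three ranking signals are combined.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    driver_id: str
    available: bool
    eta_min: float          # driving time to the pickup point
    rating: float           # 1.0 to 5.0
    acceptance_rate: float  # 0.0 to 1.0

def score(c):
    # Hypothetical weights: shorter ETA is better, higher rating and
    # acceptance rate are better.
    return -1.0 * c.eta_min + 0.5 * c.rating + 2.0 * c.acceptance_rate

def match(candidates, offer):
    """Offer the trip in ranked order; return the first driver who accepts."""
    ranked = sorted((c for c in candidates if c.available),
                    key=score, reverse=True)
    for c in ranked:
        # offer() returns True on accept, False on decline or timeout.
        if offer(c.driver_id):
            return c.driver_id
    return None  # nobody accepted; the caller may widen the radius or retry

candidates = [
    Candidate("d1", True, 3.0, 4.9, 0.95),
    Candidate("d2", False, 1.0, 5.0, 1.00),  # on a trip, filtered out
    Candidate("d3", True, 6.0, 4.7, 0.80),
]
# Simulate d1 letting the offer time out and d3 accepting.
responses = {"d1": False, "d3": True}
matched = match(candidates, lambda driver_id: responses[driver_id])
print(matched)
```

Offering to one driver at a time keeps the flow simple but adds up to one timeout per declined offer, so the number of fallbacks directly affects the matching latency target.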

4.3 ETA Calculation

Inputs
  • Source and destination coordinates
  • Current traffic conditions
  • Historical traffic patterns
  • Road type (highway vs local)
  • Time of day, day of week
Approach
  • Graph-based routing (Dijkstra, A*)
  • ML models on historical data
  • Real-time traffic integration
  • Segment the city into zones
  • Pre-compute zone-to-zone ETAs
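As a minimal illustration of the graph-based routing bullet, the sketch below runs Dijkstra over a tiny road graph whose edge weights are travel times in minutes. The graph, node names, and weights are made up for the example; a production ETA service would feed in live and historical traffic rather than fixed weights.

```python
import heapq

def shortest_travel_time(graph, source, target):
    """Dijkstra: minimum total travel time from source to target, or None."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == target:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, minutes in graph.get(node, []):
            candidate = elapsed + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return None  # target is unreachable from source

# Hypothetical directed road segments: node -> [(neighbor, minutes)].
roads = {
    "A": [("B", 4.0), ("C", 2.0)],
    "B": [("D", 5.0)],
    "C": [("B", 1.0), ("D", 8.0)],
    "D": [],
}
eta = shortest_travel_time(roads, "A", "D")
print(eta)
```

The same function could populate a zone-to-zone table offline by treating zone centroids as sources and targets, which trades freshness for lookup speed.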

4.4 Surge/Dynamic Pricing

When demand exceeds supply, increase prices to balance the market.
ratio = demand_count / supply_count
if ratio > 2.0: surge = 2.5x
elif ratio > 1.5: surge = 1.5x
elif ratio > 1.2: surge = 1.2x
else: surge = 1.0x
Calculate per geo-zone (S2 cell or hexagon). Update every 2-5 minutes.
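The tiered rule above can be written as a small function. The thresholds and multipliers are the ones given in this section; the handling of a zone with zero available drivers is an assumption added here to avoid a division by zero and is not specified in the source.

```python
def surge_multiplier(demand_count, supply_count):
    """Map a zone's demand/supply ratio to a tiered surge multiplier."""
    if supply_count == 0:
        # Assumption: with no drivers, apply the top tier if anyone is
        # requesting, otherwise no surge.
        return 2.5 if demand_count > 0 else 1.0
    ratio = demand_count / supply_count
    # Check the highest tier first so each ratio lands in exactly one tier.
    if ratio > 2.0:
        return 2.5
    if ratio > 1.5:
        return 1.5
    if ratio > 1.2:
        return 1.2
    return 1.0

# Hypothetical per-zone counts: zone_id -> (open requests, available drivers).
zones = {"zone-a": (30, 10), "zone-b": (18, 10), "zone-c": (10, 10)}
multipliers = {zone: surge_multiplier(d, s) for zone, (d, s) in zones.items()}
print(multipliers)
```

Recomputing this map on the 2-5 minute cadence mentioned above keeps prices from oscillating with every individual request.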

5. Ride Lifecycle

  • REQUESTED: Rider requests ride
  • MATCHING: Finding driver
  • ACCEPTED: Driver accepts
  • ARRIVING: Driver en route
  • ARRIVED: At pickup
  • IN_TRIP: Trip in progress
  • COMPLETED: At destination
State Machine
Each ride is a state machine. State transitions trigger events: notifications, payment holds, ETA recalculations. Use Kafka for event sourcing.
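A minimal sketch of such a state machine is shown below. It encodes only the happy-path states listed in this section; cancellation and failure states are left out because the source does not list them. The `events` list is a stand-in for an append-only event log such as a Kafka topic, and the `Ride` class is a hypothetical name.

```python
from enum import Enum

class RideState(Enum):
    REQUESTED = "REQUESTED"
    MATCHING = "MATCHING"
    ACCEPTED = "ACCEPTED"
    ARRIVING = "ARRIVING"
    ARRIVED = "ARRIVED"
    IN_TRIP = "IN_TRIP"
    COMPLETED = "COMPLETED"

# Allowed transitions: each state may only advance to the next one.
TRANSITIONS = {
    RideState.REQUESTED: {RideState.MATCHING},
    RideState.MATCHING: {RideState.ACCEPTED},
    RideState.ACCEPTED: {RideState.ARRIVING},
    RideState.ARRIVING: {RideState.ARRIVED},
    RideState.ARRIVED: {RideState.IN_TRIP},
    RideState.IN_TRIP: {RideState.COMPLETED},
    RideState.COMPLETED: set(),
}

class Ride:
    def __init__(self, ride_id):
        self.ride_id = ride_id
        self.state = RideState.REQUESTED
        self.events = []  # stand-in for an append-only event log

    def transition(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}")
        # Record the event before changing state so the log can rebuild it.
        self.events.append((self.state.value, new_state.value))
        self.state = new_state

ride = Ride("ride-1")
for nxt in [RideState.MATCHING, RideState.ACCEPTED, RideState.ARRIVING,
            RideState.ARRIVED, RideState.IN_TRIP, RideState.COMPLETED]:
    ride.transition(nxt)
print(ride.state.value, len(ride.events))
```

Rejecting illegal transitions in one place means downstream consumers of the event log (notifications, payment holds, ETA recalculation) never see an out-of-order lifecycle.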

6. Scaling Strategies

Location Service
  • Shard by geo-region (city/zone)
  • Use Redis cluster for hot data
  • Archive to Cassandra for analytics
  • Separate read/write paths
Matching Service
  • Run matching per city/zone
  • Stateless workers, scale horizontally
  • Cache driver availability
  • Priority queue for requests
Real-Time Updates
  • WebSocket connections pooled
  • Use message broker (Kafka)
  • Fan-out via pub/sub
  • Fallback to polling
Payment Processing
  • Synchronous payment hold at ride booking
  • Asynchronous final settlement after the trip
  • Idempotency for retries
  • Strong consistency required
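The idempotency bullet can be illustrated with a toy in-memory service: a retried request that carries the same key returns the stored result instead of charging again. `PaymentService`, its fields, and the key format are hypothetical; a real system would persist the key in a strongly consistent store and make the check-and-insert atomic (for example with a unique constraint), which this single-process sketch does not attempt.

```python
class PaymentService:
    """Toy in-memory payment service illustrating idempotency keys."""

    def __init__(self):
        self._results = {}     # idempotency_key -> stored charge result
        self.charges_made = 0  # how many real charges were executed

    def charge(self, idempotency_key, rider_id, amount_cents):
        # A retry with a known key returns the original result unchanged.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {
            "charge_id": f"ch_{self.charges_made}",
            "rider_id": rider_id,
            "amount_cents": amount_cents,
            "status": "captured",
        }
        self._results[idempotency_key] = result
        return result

payments = PaymentService()
first = payments.charge("ride-1:settle", "rider-42", 1850)
# The client timed out and retries with the same key.
retry = payments.charge("ride-1:settle", "rider-42", 1850)
print(first == retry, payments.charges_made)
```

Deriving the key from the ride and the operation, as in the hypothetical `ride-1:settle`, lets any retrying component reconstruct it without shared state.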

7. Key Takeaways

1. Geo-spatial indexing - Redis Geo or Quadtree for driver locations.
2. Real-time via WebSocket - drivers push location every 3-5 seconds.
3. Surge pricing - balance supply/demand per geo-zone.
4. State machine for ride lifecycle with event sourcing.
5. Shard by geography - each city operates semi-independently.
6. Strong consistency for payments, eventual for analytics.