HLD Problem
Design Uber/Lyft
Design a ride-sharing platform with real-time location tracking, driver matching, ETA calculation, surge pricing, and payments.
45 min read · Hard
1. Requirements Gathering
Functional Requirements
- Riders request rides with pickup/dropoff
- Match riders with nearby drivers
- Real-time location tracking
- ETA calculation for pickup and trip
- Fare estimation before ride
- Surge/dynamic pricing
- Payment processing
- Rating system for drivers/riders
- Trip history
Non-Functional Requirements
- High availability (99.99%)
- Low latency matching (< 1 min)
- Real-time location updates (every 3-5s)
- Strong consistency for payments
- Handle millions of concurrent rides
- Global scale (100+ countries)
- Accurate ETA (< 10% error)
2. Capacity Estimation
Scale Numbers
- Monthly Riders: 100M+
- Active Drivers: 5M+
- Trips/Day: 20M
- Concurrent Rides: 1M+
Location Update Traffic
- Active drivers sending location: 5M drivers
- Update frequency: Every 4 seconds
- Location updates per second: ~1.25M/sec
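The write rate above follows directly from the driver count and update interval. The short calculation below reproduces it and, as a rough illustration only, estimates ingress bandwidth. The 100-byte payload size is an assumption made here, not a figure from the estimate above.

```python
# Back-of-the-envelope location write load, using the figures above.
ACTIVE_DRIVERS = 5_000_000
UPDATE_INTERVAL_S = 4


def updates_per_second(drivers: int, interval_s: float) -> float:
    """Average write rate if every driver reports once per interval."""
    return drivers / interval_s


def bandwidth_mb_per_second(rate: float, payload_bytes: int) -> float:
    """Ingress bandwidth in MB/sec for a given update rate and payload size."""
    return rate * payload_bytes / 1_000_000


rate = updates_per_second(ACTIVE_DRIVERS, UPDATE_INTERVAL_S)
# ASSUMPTION: ~100 bytes per update (driver id, lat, lon, timestamp, heading).
ingress = bandwidth_mb_per_second(rate, payload_bytes=100)
print(f"{rate:,.0f} updates/sec, ~{ingress:,.0f} MB/sec ingress")
# prints "1,250,000 updates/sec, ~125 MB/sec ingress"
```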
3. High-Level Architecture
System Architecture
- Rider App
- Driver App
↓ WebSocket + REST
- API Gateway: Auth, Rate Limit
- WebSocket Server: Real-time updates
↓
- Ride Service: Trip lifecycle
- Matching Service: Driver-rider match
- Location Service: GPS tracking
- Pricing Service: Fare, surge
- ETA Service: Time estimates
- User Service: Auth, profiles
- Payment Service: Transactions
- Notification: Push, SMS
↓
- PostgreSQL: Users, trips
- Redis/Geo: Driver locations
- Kafka: Event streaming
- Cassandra: Location history
4. Core Components Deep Dive
4.1 Real-Time Location Tracking
Drivers send GPS coordinates every 3-5 seconds, so the location store must absorb a very high write rate while still answering "which drivers are near this point?" queries quickly.
Storage Options:
- Redis Geo - GEOADD, GEORADIUS (deprecated since Redis 6.2 in favor of GEOSEARCH)
- Quadtree - spatial indexing
- Google S2 - cell-based geo indexing
Query Pattern:
GEOADD drivers:active -122.4194 37.7749 driver:123
GEORADIUS drivers:active -122.4194 37.7749 5 km COUNT 10
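To make the semantics of the query pattern concrete, here is a minimal in-memory sketch of "add a driver position" and "find up to N drivers within a radius, nearest first". It is a teaching stand-in, not how Redis works internally (Redis stores geohash-encoded scores in a sorted set), and the `DriverIndex` class and its method names are hypothetical. The linear scan is only acceptable at toy scale; a real index exists precisely to avoid it.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two (lon, lat) points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class DriverIndex:
    """Hypothetical toy stand-in for a geo set: member -> (lon, lat)."""

    def __init__(self):
        self._positions = {}

    def add(self, lon, lat, member):
        # The latest update overwrites the previous position.
        self._positions[member] = (lon, lat)

    def radius(self, lon, lat, radius_km, count):
        """Return up to `count` (member, distance_km) pairs, nearest first."""
        hits = []
        for member, (mlon, mlat) in self._positions.items():
            d = haversine_km(lon, lat, mlon, mlat)
            if d <= radius_km:
                hits.append((d, member))
        hits.sort()
        return [(member, round(d, 2)) for d, member in hits[:count]]


index = DriverIndex()
index.add(-122.4194, 37.7749, "driver:123")  # at the pickup point
index.add(-122.4094, 37.7849, "driver:456")  # roughly 1.4 km away
index.add(-122.2711, 37.8044, "driver:789")  # across the bay, outside 5 km
nearby = index.radius(-122.4194, 37.7749, radius_km=5, count=10)
print(nearby)
```

Only the first two drivers fall inside the 5 km radius, and they come back ordered by distance.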
4.2 Driver Matching Algorithm
1. Find nearby drivers: Query geo-index for drivers within 5 km radius
2. Filter available: Remove drivers on trips, offline, or at capacity
3. Calculate ETA: Get driving time from each driver to pickup
4. Rank drivers: Score by ETA, rating, acceptance rate
5. Send request: Offer trip to top driver, 15s timeout
6. Fallback: If declined/timeout, try next driver
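The filter, rank, offer, and fallback steps can be sketched as a small function. This is an illustration under assumptions, not a production matcher: the `Candidate` fields, the scoring weights, and the `max_offers` limit are all hypothetical, and the 15-second timeout is assumed to live inside the `offer` callback (it returns `True` only if the driver accepts in time).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Candidate:
    driver_id: str
    available: bool          # False if on a trip, offline, or at capacity
    eta_min: float           # driving time to pickup, in minutes
    rating: float            # 1.0 to 5.0
    acceptance_rate: float   # 0.0 to 1.0


def score(c: Candidate) -> float:
    """Lower is better. ASSUMPTION: these weights are illustrative only."""
    return c.eta_min - 0.5 * c.rating - 2.0 * c.acceptance_rate


def match(candidates: Iterable[Candidate],
          offer: Callable[[str], bool],
          max_offers: int = 3) -> Optional[str]:
    """Offer the trip to ranked drivers one at a time until one accepts."""
    ranked = sorted((c for c in candidates if c.available), key=score)
    for c in ranked[:max_offers]:
        if offer(c.driver_id):  # decline or timeout returns False
            return c.driver_id
    return None  # caller may widen the radius and retry


candidates = [
    Candidate("driver:a", True, 3.0, 4.8, 0.90),
    Candidate("driver:b", False, 1.0, 5.0, 1.00),  # closest, but on a trip
    Candidate("driver:c", True, 4.0, 4.9, 0.95),
]
offered = []


def fake_offer(driver_id: str) -> bool:
    """Test double: driver:a declines, everyone else accepts."""
    offered.append(driver_id)
    return driver_id != "driver:a"


print(match(candidates, fake_offer))  # prints "driver:c"
```

Offering sequentially keeps the logic simple and avoids two drivers accepting the same trip, at the cost of up to one timeout per declined offer.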
4.3 ETA Calculation
Inputs
- Source and destination coordinates
- Current traffic conditions
- Historical traffic patterns
- Road type (highway vs local)
- Time of day, day of week
Approach
- Graph-based routing (Dijkstra, A*)
- ML models on historical data
- Real-time traffic integration
- Segment the city into zones
- Pre-compute zone-to-zone ETAs
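As a minimal sketch of the graph-based routing idea, the function below runs Dijkstra's algorithm over a road graph whose edge weights are travel times in seconds. The graph, node names, and weights are made up for illustration; in practice the weights would come from the real-time and historical traffic inputs listed above, and the same function could be run offline between zone centroids to pre-compute zone-to-zone ETAs.

```python
import heapq


def shortest_travel_time(graph, source, target):
    """Dijkstra over {node: [(neighbor, seconds), ...]}.

    Returns the minimum travel time in seconds, or None if unreachable.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == target:
            return t
        if t > best.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for nxt, seconds in graph.get(node, []):
            cand = t + seconds
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(heap, (cand, nxt))
    return None


# Hypothetical road segments; weights already include current traffic.
ROADS = {
    "A": [("B", 120), ("C", 300)],
    "B": [("C", 60), ("D", 400)],
    "C": [("D", 90)],
    "D": [],
}

print(shortest_travel_time(ROADS, "A", "D"))  # prints 270.0 (A -> B -> C -> D)
```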
4.4 Surge/Dynamic Pricing
When demand exceeds supply, increase prices to balance the market.
ratio = demand_count / supply_count
if ratio > 2.0: surge = 2.5x
elif ratio > 1.5: surge = 1.5x
elif ratio > 1.2: surge = 1.2x
else: surge = 1.0x
Calculate per geo-zone (S2 cell or hexagon). Update every 2-5 minutes.
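The tiered rule can be written as a small function applied per zone. The tier thresholds come from the rule above; the handling of zero supply (treated as the top tier when there is any demand) and the zone names are assumptions added here, since the original rule does not say what happens when no drivers are available.

```python
def surge_multiplier(demand_count: int, supply_count: int) -> float:
    """Map a zone's demand/supply ratio to a surge tier."""
    if supply_count <= 0:
        # ASSUMPTION: no drivers but some demand maps to the top tier.
        return 2.5 if demand_count > 0 else 1.0
    ratio = demand_count / supply_count
    if ratio > 2.0:
        return 2.5
    if ratio > 1.5:
        return 1.5
    if ratio > 1.2:
        return 1.2
    return 1.0


# Hypothetical per-zone snapshot: zone id -> (open requests, available drivers).
zones = {
    "zone:downtown": (30, 10),
    "zone:airport": (16, 10),
    "zone:suburb": (8, 10),
}
surge_by_zone = {z: surge_multiplier(d, s) for z, (d, s) in zones.items()}
print(surge_by_zone)
# prints {'zone:downtown': 2.5, 'zone:airport': 1.5, 'zone:suburb': 1.0}
```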
5. Ride Lifecycle
REQUESTED: Rider requests ride
→ MATCHING: Finding driver
→ ACCEPTED: Driver accepts
→ ARRIVING: Driver en route
→ ARRIVED: At pickup
→ IN_TRIP: Trip in progress
→ COMPLETED: At destination
State Machine
Each ride is a state machine. State transitions trigger events: notifications, payment holds, ETA recalculations. Use Kafka for event sourcing.
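A minimal sketch of the ride state machine follows. It covers only the seven states listed above (cancellation and other side paths are left out because the lifecycle above does not define them). The `publish` callback is a stand-in for an event producer such as a Kafka client, and the event field names are hypothetical.

```python
from enum import Enum


class RideState(Enum):
    REQUESTED = "REQUESTED"
    MATCHING = "MATCHING"
    ACCEPTED = "ACCEPTED"
    ARRIVING = "ARRIVING"
    ARRIVED = "ARRIVED"
    IN_TRIP = "IN_TRIP"
    COMPLETED = "COMPLETED"


# Allowed transitions: the linear flow described in the lifecycle.
TRANSITIONS = {
    RideState.REQUESTED: {RideState.MATCHING},
    RideState.MATCHING: {RideState.ACCEPTED},
    RideState.ACCEPTED: {RideState.ARRIVING},
    RideState.ARRIVING: {RideState.ARRIVED},
    RideState.ARRIVED: {RideState.IN_TRIP},
    RideState.IN_TRIP: {RideState.COMPLETED},
    RideState.COMPLETED: set(),
}


class Ride:
    def __init__(self, ride_id, publish):
        self.ride_id = ride_id
        self.state = RideState.REQUESTED
        self._publish = publish  # stand-in for an event-log producer

    def transition(self, new_state: RideState) -> None:
        """Apply a valid transition and emit one event describing it."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        old, self.state = self.state, new_state
        self._publish({"ride_id": self.ride_id,
                       "from": old.name, "to": new_state.name})


events = []  # an in-memory list plays the role of the event stream
ride = Ride("ride:1", events.append)
ride.transition(RideState.MATCHING)
ride.transition(RideState.ACCEPTED)
print(ride.state.name, len(events))  # prints "ACCEPTED 2"
```

Because every transition emits exactly one event, downstream consumers (notifications, payment holds, ETA recalculation) can react to the stream without querying the ride record.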
6. Scaling Strategies
Location Service
- Shard by geo-region (city/zone)
- Use Redis cluster for hot data
- Archive to Cassandra for analytics
- Separate read/write paths
Matching Service
- Run matching per city/zone
- Stateless workers, scale horizontally
- Cache driver availability
- Priority queue for requests
Real-Time Updates
- WebSocket connections pooled
- Use message broker (Kafka)
- Fan-out via pub/sub
- Fallback to polling
Payment Processing
- Synchronous payment check at ride booking
- Asynchronous final settlement after the trip
- Idempotency keys so retries never double-charge
- Strong consistency required
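The idempotency point can be illustrated with a small sketch: the client sends the same key on every retry of a charge, and the service returns the stored result instead of charging again. The `PaymentService` class, its fields, and the key format are hypothetical. A real implementation would need the check-and-store step to be atomic and durable (for example, a unique constraint in the transactional database), which this in-memory version does not provide.

```python
class PaymentService:
    """Toy in-memory payment service keyed by idempotency key."""

    def __init__(self):
        self._results = {}     # idempotency_key -> stored charge result
        self.charges_made = 0  # counts real charges, for demonstration

    def charge(self, idempotency_key: str, rider_id: str, amount_cents: int) -> dict:
        # A retry with a known key returns the original result unchanged.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {
            "charge_id": f"ch_{self.charges_made}",
            "rider_id": rider_id,
            "amount_cents": amount_cents,
            "status": "captured",
        }
        self._results[idempotency_key] = result
        return result


payments = PaymentService()
first = payments.charge("ride:1:final", "rider:42", 1850)
retry = payments.charge("ride:1:final", "rider:42", 1850)  # e.g. after a timeout
print(first == retry, payments.charges_made)  # prints "True 1"
```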
7. Key Takeaways
1. Geo-spatial indexing - Redis Geo or Quadtree for driver locations.
2. Real-time via WebSocket - drivers push location every 3-5 seconds.
3. Surge pricing - balance supply/demand per geo-zone.
4. State machine for ride lifecycle with event sourcing.
5. Shard by geography - each city operates semi-independently.
6. Strong consistency for payments, eventual for analytics.