HLD Problem

Design Notification System

Design a scalable notification system supporting push notifications, SMS, email, and in-app messages with prioritization, rate limiting, and personalization.

35 min read · Medium

1. Requirements Gathering

Functional Requirements

•Send push notifications (iOS, Android, Web)
•Send SMS messages
•Send emails (transactional & marketing)
•In-app notifications
•User notification preferences
•Template management
•Scheduling (send later)
•Notification history
•Analytics and tracking

Non-Functional Requirements

•High availability (99.99%)
•Low latency for critical notifications (< 1s)
•At-least-once delivery guarantee
•Handle 10M+ notifications/minute
•Rate limiting per user/channel
•Soft real-time (eventual delivery)
•Cost optimization

2. Capacity Estimation

Scale Numbers

•Notifications/day: 500M
•Peak/minute: 10M
•Users: 100M
•Channels: 4 (push, in-app, email, SMS)

Traffic Breakdown by Channel

•Push Notifications: 60% (~300M/day)
•In-App: 25% (~125M/day)
•Email: 12% (~60M/day)
•SMS: 3% (~15M/day)
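The scale numbers imply an average rate and a peak-to-average ratio that are worth making explicit, since the peak is what the queue and workers must be sized for. The short calculation below uses only the figures stated above; the variable names are illustrative.

```python
# Back-of-envelope rates derived from the stated scale numbers.
DAILY_NOTIFICATIONS = 500_000_000
PEAK_PER_MINUTE = 10_000_000
SECONDS_PER_DAY = 24 * 60 * 60

avg_per_second = DAILY_NOTIFICATIONS / SECONDS_PER_DAY
peak_per_second = PEAK_PER_MINUTE / 60
peak_to_avg = peak_per_second / avg_per_second

# Channel shares from the traffic breakdown.
CHANNEL_SHARE = {"push": 0.60, "in_app": 0.25, "email": 0.12, "sms": 0.03}
daily_by_channel = {
    channel: round(DAILY_NOTIFICATIONS * share)
    for channel, share in CHANNEL_SHARE.items()
}

print(f"average: {avg_per_second:,.0f}/s")
print(f"peak:    {peak_per_second:,.0f}/s ({peak_to_avg:.1f}x average)")
print(daily_by_channel)
```

The average works out to roughly 5,800 notifications per second, while the stated peak is about 167,000 per second, or 28.8 times the average. That gap is the main argument for buffering through a queue rather than provisioning workers for the peak.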

3. High-Level Architecture

System Architecture

Sources: User Actions, Scheduled Jobs, System Events, External APIs

↓

Notification API: REST/gRPC endpoints

↓

Message Queue (Kafka): decouples producers and consumers

↓

Processing:
•Notification Service: orchestration
•Preference Service: user settings
•Template Service: message templates
•Rate Limiter: throttling

↓

Channel Workers:
•Push Worker: APNS, FCM
•SMS Worker: Twilio, SNS
•Email Worker: SES, SendGrid
•In-App Worker: WebSocket

↓

Storage:
•PostgreSQL: users, prefs
•Redis: rate limits, cache
•Cassandra: notification logs
•Templates, media

4. Core Components Deep Dive

4.1 Notification Flow

1. Trigger received: an event triggers a notification (order placed, message received).
2. Publish to queue: the notification request is added to a Kafka topic.
3. Check preferences: query the user's preferences to see which channels are enabled.
4. Rate limiting: check whether the user has hit their notification limits.
5. Render template: inject dynamic data into the message template.
6. Route to channels: send the rendered message to the appropriate channel workers.
7. Deliver: each worker sends to its external provider (APNS, Twilio, etc.).
8. Log result: store the delivery status for analytics.

4.2 Priority Handling

Critical (SLA: < 30 seconds)
•OTP codes
•Security alerts
•Payment confirmations

High (SLA: < 1 minute)
•Order updates
•Direct messages
•Mentions

Normal (SLA: < 5 minutes)
•Marketing
•Recommendations
•Weekly digests

Separate Queues by Priority

Use a dedicated Kafka topic or SQS queue for each priority level. The critical queue gets more consumer instances, so a backlog of marketing traffic never delays an OTP code.
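A minimal sketch of priority-based routing follows. The topic names and consumer counts are hypothetical placeholders, and the in-memory deques stand in for Kafka topics or SQS queues; unknown priorities fall back to the normal queue, which is an assumption rather than something the design specifies.

```python
from collections import deque

# Hypothetical topic names and consumer counts; real values come from load tests.
TOPICS = {
    "critical": {"topic": "notifications.critical", "consumers": 12},
    "high": {"topic": "notifications.high", "consumers": 6},
    "normal": {"topic": "notifications.normal", "consumers": 2},
}

# In-memory stand-ins for the per-priority topics.
queues = {priority: deque() for priority in TOPICS}

def publish(notification: dict) -> str:
    """Append the notification to its priority queue and return the topic name."""
    priority = notification.get("priority", "normal")
    if priority not in queues:
        priority = "normal"  # assumption: unknown priorities are treated as normal
    queues[priority].append(notification)
    return TOPICS[priority]["topic"]
```

Because each priority has its own consumers, isolation comes from capacity allocation rather than from reordering messages inside a single queue.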

4.3 Rate Limiting

Per User Limits:

Push: Max 5/hour, 20/day
SMS: Max 3/hour, 5/day
Email: Max 10/day

Global Limits:

SMS provider rate limits
Email sending reputation
Push provider quotas

Use Redis with a sliding-window algorithm for distributed rate limiting.
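The sliding-window check can be illustrated with an in-process sliding-window log. This sketch keeps timestamps in Python deques instead of Redis; in Redis the same idea is commonly built on a sorted set per user and channel (`ZREMRANGEBYSCORE` to evict old entries, `ZCARD` to count, `ZADD` to record). The class name and the `clock` parameter are illustrative assumptions; the limits are the per-user limits listed above.

```python
import time
from collections import defaultdict, deque

# Per-user limits from the text: (max events, window in seconds).
LIMITS = {
    "push": [(5, 3600), (20, 86400)],
    "sms": [(3, 3600), (5, 86400)],
    "email": [(10, 86400)],
}

class SlidingWindowLimiter:
    def __init__(self, limits=LIMITS, clock=time.time):
        self.limits = limits
        self.clock = clock
        self.events = defaultdict(deque)  # (user_id, channel) -> send timestamps

    def allow(self, user_id, channel):
        """Return True and record the send if every window still has room."""
        now = self.clock()
        rules = self.limits.get(channel, [])
        log = self.events[(user_id, channel)]
        # Evict timestamps older than the longest window for this channel.
        longest = max((window for _, window in rules), default=0)
        while log and log[0] <= now - longest:
            log.popleft()
        for max_events, window in rules:
            recent = sum(1 for t in log if t > now - window)
            if recent >= max_events:
                return False
        log.append(now)
        return True
```

With a fake clock, three SMS sends in the same hour are allowed and the fourth is rejected; an hour later sends resume until the daily limit of five is reached. A distributed version also needs the check and the record to be atomic, for example via a Lua script or a transaction.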

5. Channel Deep Dives

Push Notifications

Providers:

APNS (iOS)
FCM (Android/Web)

Flow:

Get device token from user record
Build platform-specific payload
Send to APNS/FCM
Handle delivery receipts

Challenges:

Token refresh
Silent vs alert
Payload size limits
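Building the platform-specific payload and checking its size might look like the sketch below. The payload shapes follow the general structure of the APNs `aps` dictionary and the FCM HTTP v1 `message` object; the function name, the `silent` flag, and the 4096-byte limit are assumptions to verify against current provider documentation.

```python
import json

MAX_PAYLOAD_BYTES = 4096  # assumed limit; confirm against current APNs/FCM docs

def build_push_payload(platform, token, title, body, data=None, silent=False):
    """Return a provider-specific payload dict, or raise if it is too large."""
    data = data or {}
    if platform == "ios":
        # APNs: the device token goes in the request path, not the JSON body.
        if silent:
            aps = {"content-available": 1}
        else:
            aps = {"alert": {"title": title, "body": body}}
        payload = {"aps": aps, **data}
    elif platform in ("android", "web"):
        # FCM data values must be strings; a silent push is a data-only message.
        message = {"token": token, "data": {k: str(v) for k, v in data.items()}}
        if not silent:
            message["notification"] = {"title": title, "body": body}
        payload = {"message": message}
    else:
        raise ValueError(f"unsupported platform: {platform}")
    size = len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    if size > MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload is {size} bytes, limit is {MAX_PAYLOAD_BYTES}")
    return payload
```

Token refresh is handled outside this function: when a provider reports a token as invalid or unregistered, the worker should delete it from the user record so later sends skip it.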

SMS

Providers:

Twilio
AWS SNS
Plivo

Flow:

Validate phone number format
Select provider based on region/cost
Send via provider API
Handle delivery status webhooks

Challenges:

Cost (expensive)
Country regulations
Carrier filtering
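The validation and provider-selection steps can be sketched as follows. The regular expression checks E.164 format only (a leading `+`, then up to 15 digits); the cost table is entirely hypothetical and does not reflect real provider pricing, and `select_provider` and `unhealthy` are illustrative names.

```python
import re

# E.164: "+" followed by a non-zero digit and up to 14 more digits.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")

# Hypothetical per-message costs (USD) keyed by country-code prefix; not real pricing.
PROVIDER_COSTS = {
    "twilio": {"+1": 0.008, "+44": 0.040, "+91": 0.030},
    "sns": {"+1": 0.007, "+44": 0.045},
    "plivo": {"+1": 0.009, "+91": 0.020},
}

def select_provider(phone, unhealthy=frozenset()):
    """Pick the cheapest healthy provider that serves the number's region."""
    if not E164.match(phone):
        raise ValueError(f"not an E.164 number: {phone!r}")
    candidates = []
    for name, costs in PROVIDER_COSTS.items():
        if name in unhealthy:
            continue
        for prefix, cost in costs.items():
            if phone.startswith(prefix):
                candidates.append((cost, name))
    if not candidates:
        raise LookupError(f"no provider available for {phone}")
    return min(candidates)[1]
```

Passing the set of currently failing providers as `unhealthy` lets the same function serve as the failover path when one provider is degraded.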

Email

Providers:

AWS SES
SendGrid
Mailgun

Flow:

Build HTML/text email from template
Add tracking pixels and links
Send via email provider
Handle bounces and complaints

Challenges:

Spam filtering
Sender reputation
Bounce handling
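A small sketch of the build step and of bounce handling, using only the Python standard library. The templates, the `build_email` and `handle_feedback` functions, the event shape, and the in-memory suppression set are all hypothetical; real providers deliver bounce and complaint events in their own formats.

```python
import html
from email.message import EmailMessage
from string import Template

TEXT = Template("Hi $name, your order $order_id has shipped.")
HTML = Template("<p>Hi $name, your order <b>$order_id</b> has shipped.</p>")

def build_email(sender, recipient, subject, data):
    """Build a multipart/alternative message with plain-text and HTML bodies."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(TEXT.substitute(data))
    # Escape user-supplied values before injecting them into HTML.
    escaped = {key: html.escape(str(value)) for key, value in data.items()}
    msg.add_alternative(HTML.substitute(escaped), subtype="html")
    return msg

SUPPRESSED = set()  # stands in for a shared suppression list

def handle_feedback(event):
    """Stop mailing an address after a hard bounce or a spam complaint."""
    if event["type"] in ("hard_bounce", "complaint"):
        SUPPRESSED.add(event["email"].lower())

def can_send(recipient):
    return recipient.lower() not in SUPPRESSED
```

Checking `can_send` before every send is what protects sender reputation: repeatedly mailing addresses that bounced or complained is a common cause of spam filtering.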

6. API Design

POST /api/v1/notifications/send

Send a notification

{
  "user_id": "12345",
  "template_id": "order_shipped",
  "channels": ["push", "email"],
  "data": {
    "order_id": "ORD-789",
    "tracking_url": "..."
  },
  "priority": "high"
}
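Server-side validation of this request body could look like the sketch below. The allowed channel and priority names, the `normal` default priority, and the `SendRequest` and `parse_send_request` names are assumptions made for illustration; the source only shows the example payload.

```python
import json
from dataclasses import dataclass, field

ALLOWED_CHANNELS = {"push", "sms", "email", "in_app"}   # assumed channel names
ALLOWED_PRIORITIES = {"critical", "high", "normal"}

@dataclass(frozen=True)
class SendRequest:
    user_id: str
    template_id: str
    channels: tuple
    data: dict = field(default_factory=dict)
    priority: str = "normal"  # assumed default when the field is omitted

def parse_send_request(raw: str) -> SendRequest:
    """Parse and validate the JSON body of the send endpoint."""
    body = json.loads(raw)
    for key in ("user_id", "template_id", "channels"):
        if key not in body:
            raise ValueError(f"missing field: {key}")
    channels = tuple(body["channels"])
    unknown = set(channels) - ALLOWED_CHANNELS
    if not channels or unknown:
        raise ValueError(f"invalid channels: {sorted(unknown) or 'empty list'}")
    priority = body.get("priority", "normal")
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"invalid priority: {priority}")
    return SendRequest(
        user_id=str(body["user_id"]),
        template_id=body["template_id"],
        channels=channels,
        data=dict(body.get("data", {})),
        priority=priority,
    )
```

Rejecting malformed requests at the API keeps bad messages out of the queue, where they would otherwise fail repeatedly in the workers.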

POST /api/v1/notifications/bulk

Send to multiple users (async)

GET /api/v1/users/:id/preferences

Get user notification preferences

PUT /api/v1/users/:id/preferences

Update notification preferences

7. Scaling Strategies

Horizontal Scaling

Add workers per channel independently
Kafka partitions for parallelism
Auto-scale based on queue depth
Regional deployments for latency
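Two of these points reduce to short calculations: how many workers a given queue depth calls for, and which partition a user's messages land on. The sketch below is illustrative only. `desired_workers` and `partition_for` are hypothetical helpers, the throughput figures are made up, and CRC32 is used as a simple stand-in for whatever hash the queue client actually applies to keys.

```python
import math
import zlib

def desired_workers(queue_depth, per_worker_rate, drain_seconds,
                    min_workers=1, max_workers=100):
    """Workers needed to drain the backlog within drain_seconds.

    per_worker_rate is messages per second per worker (measured, not assumed).
    """
    needed = math.ceil(queue_depth / (per_worker_rate * drain_seconds))
    return max(min_workers, min(max_workers, needed))

def partition_for(user_id: str, num_partitions: int) -> int:
    """Map a user to a stable partition so that user's messages stay ordered."""
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions

# A backlog of 120,000 messages, 200 msg/s per worker, drained in 60 s.
workers = desired_workers(120_000, per_worker_rate=200, drain_seconds=60)
```

With Kafka, consumers in one group cannot usefully exceed the partition count, so `max_workers` should be set no higher than the number of partitions on the topic.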

Reliability

Idempotency keys prevent duplicates
Dead letter queues for failures
Retry with exponential backoff
Circuit breakers for providers
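Idempotency keys, retries with backoff, and a dead letter queue fit together as sketched below. This is a single-process illustration: the `Deliverer` class and its parameters are hypothetical, the `done` set stands in for a shared store such as Redis, and the backoff uses full jitter, which is one common variant rather than something the design prescribes.

```python
import random
import time

def backoff_delays(attempts, base=0.5, cap=30.0):
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**n)]."""
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]

class Deliverer:
    def __init__(self, send, max_attempts=4, sleep=time.sleep):
        self.send = send
        self.max_attempts = max_attempts
        self.sleep = sleep
        self.done = set()         # completed idempotency keys (shared store in practice)
        self.dead_letters = []    # stands in for a dead letter queue

    def deliver(self, idempotency_key, message):
        if idempotency_key in self.done:
            return "duplicate"    # already delivered; drop the redelivery
        delays = backoff_delays(self.max_attempts - 1)
        for attempt in range(self.max_attempts):
            try:
                self.send(message)
            except Exception as exc:
                if attempt == self.max_attempts - 1:
                    self.dead_letters.append((idempotency_key, message, str(exc)))
                    return "dead_lettered"
                self.sleep(delays[attempt])
            else:
                self.done.add(idempotency_key)
                return "sent"
```

A crash between a successful send and recording the key can still produce a duplicate, so this gives at-least-once delivery with best-effort deduplication; passing the same key to providers that support idempotent requests narrows that gap.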

Cost Optimization

Batch SMS/email where possible
Use cheapest provider per region
Aggregate low-priority notifications
Time-shift non-urgent sends
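Aggregating low-priority notifications can be as simple as grouping pending items by user and sending one digest each. The sketch below assumes a hypothetical pending-item shape (`user_id` and `text` keys) and a made-up digest layout.

```python
from collections import defaultdict

def build_digests(pending):
    """Group pending low-priority notifications into one digest body per user."""
    by_user = defaultdict(list)
    for item in pending:
        by_user[item["user_id"]].append(item["text"])
    return {
        user: f"Updates ({len(texts)}):\n" + "\n".join(f"- {t}" for t in texts)
        for user, texts in by_user.items()
    }

pending = [
    {"user_id": "1", "text": "New recommendation: hiking boots"},
    {"user_id": "2", "text": "Your weekly summary is ready"},
    {"user_id": "1", "text": "Price drop on a saved item"},
]
digests = build_digests(pending)
```

Here three pending notifications become two sends. The saving matters most on channels billed per message, and digests also count once against per-user rate limits.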

Observability

Track delivery rates per channel
Alert on high failure rates
End-to-end latency tracking
Provider health monitoring
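Per-channel delivery rates and failure alerts can be sketched with simple counters. The `DeliveryStats` class, the 5% threshold, and the 100-sample minimum are illustrative assumptions; a production system would export these counts to a metrics backend and alert over time windows rather than lifetime totals.

```python
from collections import defaultdict

class DeliveryStats:
    def __init__(self, failure_threshold=0.05, min_samples=100):
        self.failure_threshold = failure_threshold
        self.min_samples = min_samples
        self.counts = defaultdict(lambda: {"ok": 0, "failed": 0})

    def record(self, channel, ok):
        self.counts[channel]["ok" if ok else "failed"] += 1

    def failure_rate(self, channel):
        c = self.counts[channel]
        total = c["ok"] + c["failed"]
        return c["failed"] / total if total else 0.0

    def alerts(self):
        """Channels with enough samples whose failure rate exceeds the threshold."""
        flagged = []
        for channel, c in self.counts.items():
            total = c["ok"] + c["failed"]
            if total >= self.min_samples and c["failed"] / total > self.failure_threshold:
                flagged.append(channel)
        return sorted(flagged)
```

The minimum sample count avoids paging on a channel that has sent two messages and failed one.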

8. Key Takeaways

1. A message queue decouples notification triggers from delivery.

2. Priority queues ensure critical notifications are delivered first.

3. Rate limiting prevents notification fatigue and respects provider limits.

4. Per-channel workers allow independent scaling and provider abstraction.

5. Retries provide at-least-once delivery; idempotency keys let workers detect and drop the duplicates that retries produce.

6. User preferences enforce opt-outs and channel selections.