Module 3 β€” Async Processing

Message Queues

Decouple services, handle traffic spikes, and build resilient systems with asynchronous messaging.

1. What is a Message Queue?

A Message Queue is a form of asynchronous service-to-service communication. Messages are stored in a queue until they are processed and deleted. Each message is processed only once, by a single consumer.
Simple Analogy
Think of a restaurant kitchen:
🙋 Waiter (Producer): takes the order, puts the ticket on the rail.
📋 Order Rail (Queue): holds tickets in order; the waiter doesn't wait for the cooking.
👨‍🍳 Chef (Consumer): picks up tickets, cooks the food at their own pace.
The waiter doesn't stand there waiting for the chef. They go take more orders. This is asynchronous processing.

Core Components

📤 Producer → publish → Message Queue → consume → 📥 Consumer
Producer
Creates and sends messages to the queue. Doesn't care who processes them or when.
Queue (Broker)
Stores messages durably. Maintains order. Delivers to consumers.
Consumer
Receives and processes messages. Acknowledges completion.
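A minimal in-process sketch of the three roles, using Python's standard queue and threading modules. A real broker (RabbitMQ, SQS, Kafka) plays the queue role across processes and machines, but the producer/queue/consumer contract is the same:

import queue
import threading

broker = queue.Queue()   # plays the "queue" role: holds messages until they are consumed

def producer():
    # Creates and publishes messages; doesn't know or care who consumes them.
    for order_id in ("ORD-1", "ORD-2", "ORD-3"):
        broker.put({"type": "order.created", "order_id": order_id})

def consumer():
    # Receives and processes messages at its own pace, then acknowledges.
    while True:
        message = broker.get()      # blocks until a message is available
        print("processing", message["order_id"])
        broker.task_done()          # in-process equivalent of an ACK

threading.Thread(target=consumer, daemon=True).start()
producer()
broker.join()                       # wait until every message has been acknowledged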

2. Why Use Message Queues?

🔗 Decoupling
Problem: Service A calls Service B directly. If B is down, A fails too.
Solution: A publishes to queue. B reads when ready. They don't know about each other.
Example: Order service publishes 'OrderCreated'. Email, inventory, analytics services each consume independently.
📈 Load Leveling (Traffic Spikes)
Problem: Flash sale: 10,000 orders/second. Your server handles 100/second. Crash!
Solution: Queue absorbs the spike. Consumers process at sustainable rate.
Example: Black Friday: queue up orders, process over next few hours. No dropped orders.
🛡️ Resilience
Problem: Database down for 5 minutes. All requests during that time lost.
Solution: Messages wait in queue until DB recovers. Nothing lost.
Example: Payment service down? Payment messages queue up, process when it's back.
⚡ Async Processing
Problem: User clicks 'Submit'. Waits 30 seconds while you send emails, update inventory, etc.
Solution: Return immediately. Do heavy work asynchronously via queue.
Example: Submit order → 'Order received!' → Email, shipping, etc. happen in the background.

3. Anatomy of a Message

Typical Message Structure

{
  // HEADERS (metadata)
  "message_id": "msg-uuid-12345",
  "timestamp": "2024-01-15T10:30:00Z",
  "type": "order.created",
  "version": "1.0",
  "correlation_id": "req-uuid-67890",  // trace through system
  "retry_count": 0,
  
  // BODY (payload)
  "payload": {
    "order_id": "ORD-123",
    "user_id": "USR-456",
    "items": [
      {"product_id": "PROD-1", "quantity": 2, "price": 29.99}
    ],
    "total": 59.98
  }
}
Headers (Metadata)
  • message_id - unique identifier
  • timestamp - when created
  • type - what kind of event
  • correlation_id - for tracing
  • retry_count - how many attempts
Body (Payload)
  • The actual data being sent
  • Keep it small (< 256KB typically)
  • Include everything the consumer needs
  • Version your schema
Best Practice

Include enough data in the message so the consumer doesn't need to call back to the producer. But don't include huge blobs; store them elsewhere (S3) and include a reference.
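As a sketch of that practice, the helper below builds a message with the headers described above and a payload that carries an S3 reference instead of the file itself (the event type, field names, and URL are illustrative):

import json
import uuid
from datetime import datetime, timezone

def build_message(msg_type, payload, correlation_id):
    # Headers carry metadata; the payload carries only what consumers need.
    return {
        "message_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "type": msg_type,
        "version": "1.0",
        "correlation_id": correlation_id,
        "retry_count": 0,
        "payload": payload,
    }

# Reference the large file instead of embedding it in the message body.
message = build_message(
    "report.generated",
    {"report_id": "RPT-42", "file_url": "s3://reports-bucket/RPT-42.pdf"},
    correlation_id="req-uuid-67890",
)
print(json.dumps(message, indent=2))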

4. Delivery Guarantees

How many times will your message be delivered? This is crucial to understand:

0️⃣ At-Most-Once
Message delivered 0 or 1 time. Fire and forget.
How: Send message, don't wait for ack. If lost, it's lost.
+ Fastest
+ Simplest
+ No duplicates
- Can lose messages
- Not suitable for critical data
Best for: Metrics, logs, analytics where losing a few is OK
1️⃣ At-Least-Once
Message delivered 1 or more times. Guaranteed delivery but may duplicate.
How: Consumer acks after processing. If no ack, broker retries.
+ No message loss
+ Most common choice
- May have duplicates
- Consumer must be idempotent
Best for: Orders, payments (with idempotency keys)
✓ Exactly-Once
Message delivered exactly 1 time. The holy grail.
How: Distributed transactions, deduplication, idempotency combined.
+ Perfect delivery
+ No duplicates, no loss
- Very hard to achieve
- Performance overhead
- Often impossible across systems
Best for: Critical financial transactions (often use at-least-once + idempotency instead)

Visual: Why Duplicates Happen

1. Consumer receives the message and processes it. Order created in database ✓
2. Consumer crashes before sending the ACK. Network failure, server restart, etc.
3. Broker thinks the message wasn't processed. No ACK was received, so it redelivers the message.
4. Consumer processes the same message again. Duplicate order! Unless the consumer is idempotent.
Critical Rule

Always design consumers to be idempotent. Processing the same message twice should have the same effect as processing it once. Use unique IDs to detect duplicates.
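A minimal sketch of an idempotent consumer: it records the message_id of every message it has handled and skips redeliveries. In production the set would be a database table or Redis key with a unique constraint, and create_order stands in for your real business logic:

processed_ids = set()   # in production: a DB table or Redis set with a unique constraint

def create_order(payload):
    print("order created:", payload["order_id"])    # illustrative business logic

def handle_order_created(message):
    # Idempotent consumer: a redelivered message is detected and skipped, so
    # processing it twice has the same effect as processing it once.
    msg_id = message["message_id"]
    if msg_id in processed_ids:
        return "duplicate - skipped"
    create_order(message["payload"])
    processed_ids.add(msg_id)       # record only after successful processing
    return "processed"

msg = {"message_id": "msg-uuid-12345", "payload": {"order_id": "ORD-123"}}
print(handle_order_created(msg))    # processed
print(handle_order_created(msg))    # duplicate - skipped (redelivery is harmless)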

5. Queue vs Topic (Pub/Sub)

Queue (Point-to-Point)

Producer → Queue → C1 ✓, C2 ✗
  • Each message goes to ONE consumer
  • Consumers compete for messages
  • Message deleted after processing
  • Good for: task distribution, work queues
Example: Email sending tasks, image processing jobs

Topic (Publish/Subscribe)

Producer → Topic → Sub1 ✓, Sub2 ✓, Sub3 ✓
  • Each message goes to ALL subscribers
  • Subscribers get independent copies
  • Good for: events, notifications, fanout
Example: OrderCreated → Email, Inventory, Analytics all receive it
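A toy in-process sketch of the difference, again using Python's queue module: point-to-point delivery hands each message to whichever consumer grabs it first, while pub/sub (modeled here as one queue per subscriber) gives every subscriber its own copy. Real brokers implement the fanout with exchanges (RabbitMQ), topics (Kafka), or SNS in front of SQS.

import queue

# Queue (point-to-point): consumers compete; each message is delivered to ONE of them.
work_queue = queue.Queue()
work_queue.put("send email for ORD-123")
# Whichever worker calls work_queue.get() first wins; the others never see it.

# Topic (pub/sub), modeled as one queue per subscriber: every subscriber gets a copy.
subscribers = {
    "email": queue.Queue(),
    "inventory": queue.Queue(),
    "analytics": queue.Queue(),
}

def publish(event):
    for sub_queue in subscribers.values():
        sub_queue.put(event)        # independent copy for each subscriber

publish({"type": "order.created", "order_id": "ORD-123"})
print(subscribers["email"].get(), subscribers["analytics"].get())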

6. Popular Message Queue Systems

RabbitMQ (Traditional Message Broker; protocol: AMQP)
Key Features
  • Flexible routing
  • Multiple exchange types
  • Plugin ecosystem
  • Management UI
Throughput: ~20-50K msgs/sec per node
Persistence: Durable queues, message persistence to disk
Best for: Complex routing, enterprise integration, when you need flexibility
Apache Kafka (Distributed Streaming Platform; protocol: custom over TCP)
Key Features
  • High throughput
  • Message replay
  • Partitioning
  • Exactly-once (within Kafka)
Throughput: ~100K-1M msgs/sec per broker
Persistence: Append-only log, configurable retention (days/size)
Best for: High-volume event streaming, log aggregation, real-time analytics
Amazon SQS (Managed Queue Service; protocol: HTTP/REST)
Key Features
  • Fully managed
  • Auto-scaling
  • Dead letter queues
  • FIFO queues option
Throughput: Nearly unlimited (scales automatically)
Persistence: Up to 14 days retention, replicated across AZs
Best for: AWS-native apps, simple queuing needs, no ops overhead
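A minimal SQS sketch using boto3, assuming AWS credentials are configured and the queue already exists (the queue URL below is a placeholder). Deleting the message after processing is SQS's equivalent of an ACK; an undeleted message reappears after the visibility timeout, which is what gives you at-least-once delivery.

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"   # placeholder

# Producer: publish a message.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": "ORD-123"}')

# Consumer: long-poll, process, then delete (the ACK, in SQS terms).
response = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10
)
for msg in response.get("Messages", []):
    print("processing", msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])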
Redis Streams (In-Memory Stream; protocol: Redis Protocol)
Key Features
  • Consumer groups
  • Low latency
  • Simple to set up
  • Persistence optional
Throughput: ~100K+ msgs/sec (in-memory)
Persistence: RDB/AOF persistence, but primarily in-memory
Best for: When you already use Redis, need low latency
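A minimal Redis Streams sketch with redis-py, assuming a Redis server on localhost; consumer groups give you competing consumers plus explicit ACKs, much like a work queue:

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Producer: append an entry to the stream.
r.xadd("orders", {"type": "order.created", "order_id": "ORD-123"})

# One-time setup: create a consumer group (mkstream creates the stream if missing).
try:
    r.xgroup_create("orders", "order-workers", id="0", mkstream=True)
except redis.ResponseError:
    pass    # group already exists

# Consumer: read new entries for this group, process, then ACK.
entries = r.xreadgroup("order-workers", "worker-1", {"orders": ">"}, count=10, block=1000)
for stream, messages in entries:
    for msg_id, fields in messages:
        print("processing", fields)
        r.xack("orders", "order-workers", msg_id)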

Comparison Table

Feature         | RabbitMQ        | Kafka           | SQS
Throughput      | Medium          | Very High       | High (auto-scale)
Message Replay  | No              | Yes             | No
Ordering        | Per queue       | Per partition   | FIFO queues only
Operations      | Self-managed    | Complex         | Fully managed
Best For        | Complex routing | Event streaming | Simple tasks on AWS

7. Common Patterns

Work Queue (Task Distribution)
Distribute tasks among multiple workers for parallel processing
Producer → Queue → Worker 1 / Worker 2 / Worker 3
Example: Image resizing: upload → queue → multiple workers resize in parallel
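A sketch of the work-queue pattern in-process: several worker threads pull from one shared queue, so each job is handled by exactly one worker and the work spreads out in parallel. A real setup would run the workers as separate processes or machines against a broker.

import queue
import threading

jobs = queue.Queue()

def worker(name):
    # Workers compete for jobs; a given job goes to exactly one of them.
    while True:
        image = jobs.get()
        print(f"{name} resizing {image}")
        jobs.task_done()

for n in range(3):
    threading.Thread(target=worker, args=(f"worker-{n}",), daemon=True).start()

for i in range(9):
    jobs.put(f"photo-{i}.jpg")      # producer enqueues resize jobs

jobs.join()                         # block until every job has been processed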
Fanout (Broadcast)
Send same event to multiple services
Order Service → Topic → Email / Inventory / Analytics
Example: OrderCreated event → email confirmation, inventory update, analytics tracking
Request-Reply
Synchronous-style communication over async infrastructure
Client → Request Q → Server → Reply Q → Client
Example: RPC over queues when you need request-reply semantics but want queue benefits
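A sketch of request-reply over queues: the client tags its request with a correlation_id, the server echoes the same id on the reply queue, and the client matches the reply to the request it sent (shown in-process for brevity).

import queue
import threading
import uuid

request_q = queue.Queue()
reply_q = queue.Queue()

def server():
    # Read one request, do the work, reply with the same correlation_id.
    req = request_q.get()
    reply_q.put({"correlation_id": req["correlation_id"], "result": req["a"] + req["b"]})

threading.Thread(target=server, daemon=True).start()

# Client: send the request, then wait for the matching reply.
corr_id = str(uuid.uuid4())
request_q.put({"correlation_id": corr_id, "a": 2, "b": 3})
reply = reply_q.get()
assert reply["correlation_id"] == corr_id
print("result:", reply["result"])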
Dead Letter Queue (DLQ)
Handle messages that fail repeatedly
Main Queue → Consumer → fails 3x → DLQ
Example: After 3 retries, move to DLQ for manual inspection. Don't block the queue.
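A sketch of the retry-then-DLQ flow: a failing message is requeued until it hits the retry limit, then parked on a dead letter queue so it stops blocking the main queue (the always-failing handler simulates a broken downstream service).

import queue

main_q = queue.Queue()
dlq = queue.Queue()
MAX_RETRIES = 3

def process(message):
    raise RuntimeError("downstream service unavailable")   # simulate a failing handler

def consume_once():
    message = main_q.get()
    try:
        process(message)
    except Exception:
        message["retry_count"] += 1
        if message["retry_count"] >= MAX_RETRIES:
            dlq.put(message)        # park for manual inspection; don't block the queue
        else:
            main_q.put(message)     # requeue for another attempt

main_q.put({"order_id": "ORD-123", "retry_count": 0})
while not main_q.empty():
    consume_once()
print("messages in DLQ:", dlq.qsize())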

8. Real-World Example: E-commerce Order

Without Message Queue (Synchronous)

# All in one request - slow and fragile!
def place_order(order):
    save_to_database(order)       # 50ms
    charge_payment(order)         # 500ms - if fails, everything fails
    send_email(order)             # 200ms
    update_inventory(order)       # 100ms
    notify_warehouse(order)       # 150ms
    update_analytics(order)       # 100ms
    return "Order placed"         # Total: 1100ms - user waits!
Problems: User waits 1+ second. If email service is slow, entire order is slow. If analytics fails, order fails.

With Message Queue (Asynchronous)

# Fast response, reliable processing
def place_order(order):
    save_to_database(order)                    # 50ms
    charge_payment(order)                      # 500ms - critical, keep sync
    publish("order.created", order)            # 5ms
    return "Order placed"                      # Total: 555ms

# Separate consumers (run independently)
@consumer("order.created")
def send_confirmation_email(order):
    send_email(order)

@consumer("order.created")  
def update_inventory(order):
    decrement_stock(order)

@consumer("order.created")
def notify_warehouse(order):
    create_shipping_label(order)
Benefits: 2x faster response. Email service down? Orders still work. Each service scales independently.

9. Best Practices

Make Consumers Idempotent
Same message processed twice = same result
if Order.exists(message.order_id): return # Skip duplicate
Keep Messages Small
Don't put large blobs in messages
{'file_url': 's3://bucket/file.pdf'}  # Not the file itself
Use Dead Letter Queues
Catch failed messages for debugging
after 3 retries → move to DLQ → alert team
Set Message TTL
Don't process stale messages
if message.age > 24h: discard()
Monitor Queue Depth
Alert if queue grows too large
if queue.length > 10000: page_oncall()
Include Correlation IDs
Trace messages through the system
{'correlation_id': 'req-123', ...}

10. Key Takeaways

1. Message queue = async communication. Producer sends, consumer processes later.
2. Benefits: decoupling, load leveling, resilience, async processing.
3. Delivery guarantees: at-most-once, at-least-once (most common), exactly-once (hard).
4. Queue = one consumer gets the message. Topic = all subscribers get a copy.
5. Popular tools: RabbitMQ (routing), Kafka (streaming), SQS (managed).
6. Always make consumers idempotent. Use a DLQ for failed messages.