Module 3 — Asynchronous Processing

Retry Strategies

Handle transient failures gracefully with smart retry logic that balances persistence with system health.

1. The Persistent Caller Analogy

Simple Analogy
You call a friend but they don't answer. Do you:
A) Give up immediately
B) Call 1000 times per second until they answer (annoying!)
C) Wait a bit, try again, wait longer, try again...

Option C is exponential backoff—the smart approach!

A retry strategy defines how a system handles failed operations: when to retry, how long to wait between attempts, and when to give up. A good strategy prevents thundering herds while still maximizing the chance of eventual success.

2. Types of Failures

Transient (Retry-able)

  • Network timeout
  • Service temporarily unavailable (503)
  • Database connection dropped
  • Rate limited (429)

Permanent (Don't Retry)

  • Invalid input (400)
  • Not found (404)
  • Unauthorized (401, 403)
  • Business logic error

Rule of Thumb

Retry 5xx errors and timeouts. Don't retry 4xx errors (except 429)—they'll fail again.
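
A minimal sketch of this rule as a classifier. The error shape (error.status for an HTTP status, error.code for network-level failures) is an assumption; adapt it to whatever your HTTP client actually throws.

function isRetryable(error) {
  // Network-level failures (no response at all) are usually transient
  if (error.code === 'ETIMEDOUT' || error.code === 'ECONNRESET') return true;

  const status = error.status;
  // 429 Too Many Requests: retry, ideally honoring any Retry-After header
  if (status === 429) return true;
  // 5xx: the server failed; retrying may succeed once it recovers
  if (status >= 500 && status <= 599) return true;

  // Everything else (400, 401, 403, 404, business logic errors) will fail again
  return false;
}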

3. Retry Strategies

1. Immediate Retry

Delays: 0s → 0s → 0s → 0s (no delay between retries)

Never do this! Creates thundering herd when service recovers.

2. Fixed Delay

Delays: 1s → 1s → 1s → 1s (same 1s delay every time)

Simple but not adaptive. May be too aggressive or too slow.

3. Exponential Backoff

Delays: 1s → 2s → 4s → 8s (doubles each time)

Gives overwhelmed services time to recover. Industry standard.

4. Exponential Backoff + Jitter

Delays: 1.2s → 2.8s → 3.5s → 7.1s (exponential plus a random offset)

Best approach. Jitter prevents synchronized retries from many clients.
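
The difference is easiest to see when each strategy is written as a pure delay function. A small sketch (names are illustrative, delays in milliseconds, attempt starting at 0):

// Only the last two are recommended in practice
const immediateRetry     = () => 0;                            // 1. no delay (thundering herd)
const fixedDelay         = () => 1000;                         // 2. same 1s every time
const exponentialBackoff = (attempt) => 1000 * 2 ** attempt;   // 3. 1s, 2s, 4s, 8s...
const backoffWithJitter  = (attempt) =>
  exponentialBackoff(attempt) * (0.5 + Math.random());         // 4. spread out over time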

4. The Jitter Problem

Without jitter, all clients retry at exactly the same time, creating traffic spikes:

Without Jitter

10:00:00 - 1000 clients fail
10:00:01 - 1000 clients retry (spike!)
10:00:03 - 1000 clients retry (spike!)
10:00:07 - 1000 clients retry (spike!)

With Jitter

10:00:00 - 1000 clients fail
10:00:01-02 - clients retry, spread across the window
10:00:03-05 - clients retry, spread across the window
10:00:07-12 - clients retry, spread across the window

Exponential Backoff with Jitter
function getRetryDelay(attempt, baseDelay = 1000) {
  // Exponential: 1s, 2s, 4s, 8s, 16s...
  const exponentialDelay = baseDelay * Math.pow(2, attempt);

  // Cap at 30 seconds
  const cappedDelay = Math.min(exponentialDelay, 30000);

  // Add jitter: randomize between 50% and 150% of the delay
  const jittered = cappedDelay * (0.5 + Math.random());

  // Keep the final delay under the cap
  return Math.min(jittered, 30000);
}

// Small helper: resolve after ms milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Usage
async function requestWithRetries(maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await makeRequest();                  // the call being protected
    } catch (error) {
      if (!isRetryable(error)) throw error;        // permanent failure: don't retry
      await sleep(getRetryDelay(attempt));         // back off before the next attempt
    }
  }
  throw new Error('Max retries exceeded');
}

5. Circuit Breaker Pattern

After too many failures, stop trying temporarily. Prevents wasting resources on a dead service.

CLOSED (normal operation)
  → failures exceed threshold →
OPEN (fail fast, no calls)
  → timeout expires →
HALF-OPEN (test with one call)
  → success →
CLOSED (back to normal)

Circuit Breaker Benefits

Fail fast instead of waiting for timeouts. Give downstream services time to recover. Prevent cascade failures across your system.
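
The state machine above is small enough to sketch directly. This is a minimal illustration, not a production implementation; the threshold, timeout, and the call() wrapper are assumptions you would tune for your system.

class CircuitBreaker {
  constructor(fn, { failureThreshold = 5, resetTimeout = 30000 } = {}) {
    this.fn = fn;                       // the protected downstream call
    this.failureThreshold = failureThreshold;
    this.resetTimeout = resetTimeout;
    this.failures = 0;
    this.state = 'CLOSED';
    this.openedAt = 0;
  }

  async call(...args) {
    if (this.state === 'OPEN') {
      // Once the timeout expires, let one test call through (HALF-OPEN)
      if (Date.now() - this.openedAt >= this.resetTimeout) {
        this.state = 'HALF-OPEN';
      } else {
        throw new Error('Circuit open: failing fast');
      }
    }
    try {
      const result = await this.fn(...args);
      this.failures = 0;                // success closes the circuit
      this.state = 'CLOSED';
      return result;
    } catch (error) {
      this.failures++;
      // A failed test call, or too many failures, opens the circuit
      if (this.state === 'HALF-OPEN' || this.failures >= this.failureThreshold) {
        this.state = 'OPEN';
        this.openedAt = Date.now();
      }
      throw error;
    }
  }
}

// Usage: const breaker = new CircuitBreaker(makeRequest);
//        await breaker.call();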

6. Dead Letter Queues

After max retries, don't lose the message! Send to a Dead Letter Queue for investigation.

Message → Process (fail) → Retry 1 (fail) → Retry 2 (fail) → DLQ

After 3 failures → Dead Letter Queue for manual review
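
A sketch of that flow in code. processMessage and deadLetterQueue.send are hypothetical stand-ins for your queue client; getRetryDelay and sleep are the helpers defined earlier.

async function handleMessage(message, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await processMessage(message);          // success: we're done
    } catch (error) {
      if (attempt < maxRetries - 1) {
        await sleep(getRetryDelay(attempt));          // back off before the next try
      }
    }
  }
  // All retries exhausted: park the message for manual review, don't drop it
  await deadLetterQueue.send({ message, failedAt: new Date().toISOString() });
}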

7. Key Takeaways

1. Only retry transient failures—5xx, timeouts, rate limits. Not 4xx.
2. Exponential backoff gives services time to recover: 1s → 2s → 4s → 8s.
3. Add jitter to prevent synchronized retry storms from many clients.
4. Set max retries (3-5 typically) and a max delay cap (30s).
5. Circuit breaker pattern: stop retrying after repeated failures.
6. Dead Letter Queue: capture failed messages for debugging, don't lose them.