Retries & Timeouts
Handle transient failures gracefully, without making things worse.
1. The Phone Call Analogy
Retries automatically repeat failed requests for transient errors. Timeouts limit how long you wait, preventing indefinite hangs.
2. When to Retry
Retry These
- ✓ Network timeout
- ✓ 503 Service Unavailable
- ✓ 429 Too Many Requests
- ✓ Connection refused
- ✓ DNS resolution failed
Don't Retry These
- ✗ 400 Bad Request
- ✗ 401 Unauthorized
- ✗ 403 Forbidden
- ✗ 404 Not Found
- ✗ 422 Validation Error
Rule of thumb: Retry 5xx errors (server issues) and 429 (rate limiting). Don't retry other 4xx errors (client issues): the request itself is wrong and will fail again if sent unchanged.
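The two lists above can be turned into a small predicate. The following is a minimal sketch, not a definitive implementation: `shouldRetry` and the shape of its `outcome` argument are hypothetical names chosen for illustration. It treats 429 as the one retryable 4xx status, matching the "Retry These" list.

```javascript
// Hypothetical helper: decides whether a failed attempt is worth retrying.
// `outcome` is an assumed shape: { status?: number, networkError?: boolean }.
function shouldRetry(outcome) {
  // Timeouts, refused connections, and DNS failures never produce a status code.
  if (outcome.networkError) return true;

  const status = outcome.status;
  // 429 means "slow down", and 5xx means the server had a problem.
  return status === 429 || (status >= 500 && status <= 599);
}

console.log(shouldRetry({ status: 503 }));
console.log(shouldRetry({ status: 404 }));
```

Keeping this decision in one function makes it easy to adjust later, for example if a particular API documents additional statuses as transient.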
3. Retry Strategies
Immediate Retry
Wait 0, retry
Problem: Can overwhelm an already struggling server
When: Almost never. Maybe for a local cache miss.
Fixed Delay
Wait 1s, retry
Problem: All clients retry at the same time (thundering herd)
When: Simple cases, low traffic
Exponential Backoff
Wait 1s, 2s, 4s, 8s...
Problem: Can get very long waits
When: Standard approach. Use with max retries.
Exponential + Jitter
Wait (1s, 2s, 4s...) + random(0-1s)
Problem: Slightly more complex
When: Best practice. Prevents synchronized retries.
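The four strategies differ only in how the wait before each attempt is computed. As a rough sketch, the function below returns that wait in milliseconds. The name `retryDelay`, the strategy strings, the injectable `random` parameter, and the 30-second cap (`MAX_DELAY_MS`) are all assumptions made for this example; the cap is one way to handle the "very long waits" problem noted above.

```javascript
const BASE_DELAY_MS = 1000;  // the 1s base used in the strategies above
const MAX_JITTER_MS = 1000;  // random(0-1s)
const MAX_DELAY_MS = 30000;  // assumed cap, not taken from the text

// Returns the wait before retry number `attempt` (0-based), in milliseconds.
// `random` is injectable so the jittered result can be made deterministic.
function retryDelay(strategy, attempt, random = Math.random) {
  const exponential = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  switch (strategy) {
    case "immediate":
      return 0;
    case "fixed":
      return BASE_DELAY_MS;
    case "exponential":
      return exponential;
    case "exponential-jitter":
      return exponential + random() * MAX_JITTER_MS;
    default:
      throw new RangeError(`Unknown strategy: ${strategy}`);
  }
}

// Exponential waits for the first four retries: 1000, 2000, 4000, 8000
console.log([0, 1, 2, 3].map((n) => retryDelay("exponential", n)));
```

With jitter, two clients that fail at the same moment will usually compute different waits, which is what breaks up synchronized retries.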
4. Timeout Types
Connection Timeout
How long to wait to establish a TCP connection
Typical: 1-5 seconds
Read Timeout
How long to wait for a response after the request is sent
Typical: 5-30 seconds (depends on operation)
Request Timeout
Total time for entire request (connect + read)
Typical: 10-60 seconds
Idle Timeout
How long to keep a connection open when unused
Typical: 30-120 seconds
No Timeout = Danger
Without timeouts, a slow downstream service can hang your entire application. Always set timeouts.
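A request timeout (the total budget for one call) can be sketched as a race between the work and a timer. `withTimeout` and `TimeoutError` are hypothetical helper names introduced for this example. Note the limitation: this sketch only stops waiting. It does not cancel the underlying work, so real code would also pass an abort signal where the API supports one.

```javascript
// Hypothetical error type so callers can tell timeouts from other failures.
class TimeoutError extends Error {
  constructor(label, ms) {
    super(`${label} timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Rejects with TimeoutError if `promise` has not settled within `ms`.
// The underlying operation keeps running; only the wait is bounded.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(label, ms)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Usage: a simulated slow call that takes 200 ms against a 50 ms budget.
const slowCall = new Promise((resolve) => setTimeout(resolve, 200, "done"));
withTimeout(slowCall, 50, "slow call").catch((err) => console.log(err.message));
```

A separate `TimeoutError` type lets retry logic treat a timeout as transient while still surfacing other errors immediately.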
5. Implementation Example
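// Resolves after `ms` milliseconds; used to wait between attempts.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithRetry(url, options = {}) {
  const maxRetries = 3;
  const baseDelay = 1000; // 1 second

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    // Abort the request if it takes longer than 5 seconds.
    const controller = new AbortController();
    const timeout = setTimeout(() => controller.abort(), 5000);

    try {
      const response = await fetch(url, {
        ...options,
        signal: controller.signal
      });
      clearTimeout(timeout);

      // 429 and 5xx are treated as transient; throwing sends them to the retry path.
      if (response.status === 429 || response.status >= 500) {
        throw new Error(`Retryable: ${response.status}`);
      }

      return response; // Success, or a non-retryable status such as 404
    } catch (error) {
      clearTimeout(timeout); // Also clear the timer when fetch itself fails
      if (attempt === maxRetries) throw error; // Out of attempts

      // Exponential backoff with jitter: 1s, 2s, 4s, plus up to 1s of randomness
      const delay = baseDelay * Math.pow(2, attempt);
      const jitter = Math.random() * 1000;
      await sleep(delay + jitter);
    }
  }
}
6. The Retry Storm Problem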
Solution: Exponential backoff + jitter spreads retries over time. Also consider circuit breakers to stop retrying entirely when downstream is unhealthy.
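To make the circuit breaker idea concrete, here is a deliberately small sketch. The class name, the option names (`failureThreshold`, `resetAfterMs`, `now`), and their defaults are assumptions for illustration, not the API of any particular library. After a run of consecutive failures the breaker "opens" and callers skip the request entirely until a cool-down has passed.

```javascript
// Minimal circuit breaker sketch (hypothetical API).
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetAfterMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.resetAfterMs = resetAfterMs;         // cool-down before a trial request
    this.now = now;                           // injectable clock, useful in tests
    this.failures = 0;
    this.openedAt = null;                     // null means the circuit is closed
  }

  // True if a request may be sent: circuit closed, or cool-down elapsed.
  canRequest() {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.resetAfterMs;
  }

  // A success closes the circuit and resets the failure count.
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failure at or past the threshold opens the circuit (or re-opens it).
  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}

// Usage: check before calling, then report the result.
const breaker = new CircuitBreaker({ failureThreshold: 2 });
breaker.recordFailure();
breaker.recordFailure();
console.log(breaker.canRequest()); // false until the cool-down has passed
```

Combined with backoff and jitter, this means clients slow down while the server is struggling and stop sending requests altogether once it is clearly unhealthy.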
7. Key Takeaways
Quiz
1. GET /users returns 404. What should you do?
2. Best retry strategy to prevent thundering herd?