Module 5 — Architecture

Circuit Breaker Pattern

Prevent cascading failures by failing fast when a service is struggling.

1. The Electrical Analogy

Simple Analogy
Like a circuit breaker in your house:
  • Normal: Power flows freely. Everything works.
  • Overload: Too much current detected.
  • Tripped: Breaker cuts power to prevent fire. You flip it back later.

Without it: overload → house fire. With it: breaker trips → minor inconvenience.

2. Why Do We Need It?

When a downstream service fails and there is no circuit breaker, your service keeps trying → threads pile up → memory is exhausted → your service crashes → the services calling you crash → cascading failure.

Cascading Failure Without Circuit Breaker

API Gateway → Service A → Service B → Service C (DOWN)

1. Service C goes down.
2. Service B keeps retrying C; its threads block and memory fills.
3. Service B slows down, then crashes.
4. Service A crashes, then the Gateway crashes. Total outage.
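
A back-of-the-envelope estimate shows why step 2 escalates so quickly. By Little's law, the number of calls in flight is roughly the arrival rate multiplied by the time each call is held. The sketch below plugs in illustrative numbers (100 requests per second, a 30-second client timeout, a pool of 200 threads). These figures and the class name `ThreadPileUpEstimate` are assumptions made up for the example, not measurements.

```java
// Rough estimate of how fast a thread pool fills when a downstream call hangs.
// All numbers are illustrative assumptions, not measurements.
final class ThreadPileUpEstimate {

    /** Little's law: calls in flight = arrival rate x time each call is held. */
    static long blockedThreads(long requestsPerSecond, long secondsHeldPerCall) {
        return requestsPerSecond * secondsHeldPerCall;
    }

    /**
     * Seconds until a pool of the given size is fully occupied, assuming every
     * call hangs for longer than that (true while the timeout is large).
     */
    static double secondsUntilPoolExhausted(int poolSize, long requestsPerSecond) {
        return (double) poolSize / requestsPerSecond;
    }

    public static void main(String[] args) {
        long rate = 100;       // requests per second reaching Service B
        long timeoutSec = 30;  // each call to the dead Service C hangs this long
        int poolSize = 200;    // worker threads available in Service B

        System.out.println("Threads wanted at steady state: "
                + blockedThreads(rate, timeoutSec));
        System.out.println("Seconds until pool is exhausted: "
                + secondsUntilPoolExhausted(poolSize, rate));
    }
}
```

Under these assumptions the service would want 3000 threads but only has 200, so the pool is saturated about two seconds after Service C goes down. Failing fast removes the 30-second hold time from the equation.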

3. The Three States

A circuit breaker moves between three states based on success/failure rates:

CLOSED

Normal operation. Requests flow through.

  • Counts failures
  • All requests pass through
  • If failures exceed threshold → OPEN

OPEN

Fail fast. No requests sent downstream.

  • Return error immediately
  • Don't even try downstream
  • After timeout → HALF-OPEN

HALF-OPEN

Testing. Allow limited requests.

  • Let a few requests through
  • If they succeed → CLOSED
  • If a request fails → OPEN again

State Machine

  • CLOSED → OPEN: failures > threshold
  • OPEN → HALF-OPEN: timeout expires
  • HALF-OPEN → CLOSED: test succeeds
  • HALF-OPEN → OPEN: test fails
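
The transitions above can be sketched as a small state machine. This is a minimal, single-threaded illustration, not how any particular library implements the pattern. The class name `SimpleCircuitBreaker`, the choice to count consecutive failures, opening when the count reaches the threshold, and closing after a single successful trial call are all simplifying assumptions. The current time is passed in as a parameter so the behavior is easy to follow and test.

```java
import java.util.function.Supplier;

// Minimal, single-threaded sketch of the three-state machine.
// Names and rules are illustrative assumptions; production libraries add
// sliding windows, thread safety, and metrics.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;    // consecutive failures before opening
    private final long openTimeoutMillis;  // how long to stay OPEN
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.openTimeoutMillis = openTimeoutMillis;
    }

    State state() {
        return state;
    }

    /** Runs the action if the breaker allows it; nowMillis is the caller's clock. */
    <T> T call(Supplier<T> action, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < openTimeoutMillis) {
                // OPEN: fail fast without touching the downstream service
                throw new IllegalStateException("circuit open: failing fast");
            }
            state = State.HALF_OPEN; // timeout expired: allow a trial request
        }
        try {
            T result = action.get();
            // Success: in this sketch one success is enough to close the circuit
            state = State.CLOSED;
            failures = 0;
            return result;
        } catch (RuntimeException e) {
            failures++;
            // A failed trial, or reaching the threshold, (re)opens the circuit
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;
                openedAt = nowMillis;
            }
            throw e;
        }
    }
}
```

With `new SimpleCircuitBreaker(3, 1000)`, three failing calls move the breaker to OPEN, calls during the next second are rejected immediately, and the first call after that runs as a HALF-OPEN trial.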

4. Interactive Demo

[Interactive widget: Circuit Breaker Simulator. Toggle the downstream service, send requests, and watch the circuit state. It starts closed with a failure threshold of 3.]

5. Configuration Parameters

failureThreshold
Number of failures before opening the circuit. Typical: 5
💡 Too low = opens too often. Too high = doesn't protect.

successThreshold
Consecutive successes in half-open needed to close the circuit. Typical: 3
💡 Higher = more confidence before closing.

timeout
How long to stay open before trying half-open. Typical: 30s
💡 Give downstream time to recover.

failureRateThreshold
Percentage of failures that triggers open. Typical: 50%
💡 Better than count alone for variable traffic.

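// Example: Resilience4j configuration
// Requires the resilience4j-circuitbreaker and vavr (for Try) dependencies.
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.vavr.control.Try;
import java.time.Duration;
import java.util.function.Supplier;

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .failureRateThreshold(50)                         // Open if at least 50% of recorded calls fail
    .waitDurationInOpenState(Duration.ofSeconds(30))  // Stay open 30s before moving to half-open
    .permittedNumberOfCallsInHalfOpenState(3)         // Trial calls allowed in half-open
    .minimumNumberOfCalls(10)                         // Need 10 calls before calculating the rate
    .slidingWindowSize(100)                           // Rate is computed over the last 100 calls
    .build();

CircuitBreaker cb = CircuitBreaker.of("paymentService", config);

// Usage: wrap the remote call so the breaker records its outcome
Supplier<String> decoratedSupplier = CircuitBreaker
    .decorateSupplier(cb, () -> paymentService.charge());

// Try captures either the result or the exception
// (CallNotPermittedException while the circuit is open)
Try<String> result = Try.ofSupplier(decoratedSupplier);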
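
To make the interplay of a failure-rate threshold, a minimum call count, and a sliding window concrete, here is a small count-based window in plain Java. It is a sketch of the idea only, not Resilience4j's internals. The class `FailureRateWindow` and its method names are hypothetical, and the choice to open when the rate is at or above the threshold is an assumption of this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative count-based sliding window; not a library implementation.
final class FailureRateWindow {
    private final int windowSize;           // how many recent calls to remember
    private final int minimumCalls;         // calls needed before the rate counts
    private final double thresholdPercent;  // failure rate that should open the circuit
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failure
    private int failureCount = 0;

    FailureRateWindow(int windowSize, int minimumCalls, double thresholdPercent) {
        this.windowSize = windowSize;
        this.minimumCalls = minimumCalls;
        this.thresholdPercent = thresholdPercent;
    }

    /** Records one call outcome and evicts the oldest once the window is full. */
    void record(boolean failed) {
        outcomes.addLast(failed);
        if (failed) {
            failureCount++;
        }
        if (outcomes.size() > windowSize) {
            boolean evictedWasFailure = outcomes.removeFirst();
            if (evictedWasFailure) {
                failureCount--;
            }
        }
    }

    /** Failure rate over the calls currently in the window, in percent. */
    double failureRatePercent() {
        return outcomes.isEmpty() ? 0.0 : 100.0 * failureCount / outcomes.size();
    }

    /** True once enough calls were seen and the rate reaches the threshold. */
    boolean shouldOpen() {
        return outcomes.size() >= minimumCalls
                && failureRatePercent() >= thresholdPercent;
    }
}
```

The minimum call count is what keeps a quiet service from tripping on its first failed request: with `new FailureRateWindow(100, 10, 50)`, four failures in a row give a 100% failure rate, yet `shouldOpen()` stays false until ten calls have been recorded.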

6. What to Do When Circuit Opens

When the circuit is open, return a fallback instead of a raw error wherever a sensible one exists:

Return Cached Data
  • Example: Product catalog from Redis
  • When: Read operations, slightly stale OK

Return Default Value
  • Example: Empty recommendations list
  • When: Non-critical features

Queue for Later
  • Example: Email notifications
  • When: Async operations

Graceful Degradation
  • Example: Show basic UI without personalization
  • When: Feature can be disabled

Don't Do This

Don't return misleading data. If the payment service is down, don't say "payment successful" based on cached data. Be honest about degraded functionality.
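
A fallback can be as small as a wrapper that catches the failure and supplies a safe default. The sketch below uses plain Java and the "empty recommendations list" case from above. The names `Fallbacks`, `withFallback`, and `CircuitOpenException` are hypothetical, invented for this illustration. Real libraries throw their own exception types when the circuit is open.

```java
import java.util.List;
import java.util.function.Supplier;

// Illustrative fallback helper; all names here are hypothetical.
final class Fallbacks {

    /** Stand-in for "the breaker rejected the call"; libraries define their own type. */
    static final class CircuitOpenException extends RuntimeException {
        CircuitOpenException(String message) {
            super(message);
        }
    }

    /** Runs the protected call; on any runtime failure, returns the fallback value. */
    static <T> T withFallback(Supplier<T> protectedCall, Supplier<T> fallback) {
        try {
            return protectedCall.get();
        } catch (RuntimeException e) {
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        // Simulates a recommendations call rejected by an open circuit.
        Supplier<List<String>> recommendations = () -> {
            throw new CircuitOpenException("recommendations circuit is open");
        };
        // Non-critical feature: an empty list is an honest default.
        Supplier<List<String>> emptyDefault = () -> List.of();

        List<String> shown = withFallback(recommendations, emptyDefault);
        System.out.println("Recommendations shown: " + shown);

        // For payments, do NOT wrap the call like this to fake success.
        // Let the failure surface and tell the user the payment did not go through.
    }
}
```

The design choice is that the caller, not the breaker, decides what an acceptable substitute is. That keeps operations like payments, which have no honest substitute, from being given one by accident.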

7. Real-World Implementation

Library/Tool | Language                | Features
-------------|-------------------------|------------------------------------------
Resilience4j | Java                    | CB, rate limiting, retry, bulkhead
Hystrix      | Java (maintenance mode) | Netflix's original; use Resilience4j now
Polly        | .NET                    | CB, retry, timeout, bulkhead
opossum      | Node.js                 | Simple CB implementation
gobreaker    | Go                      | Sony's implementation
Istio        | Service mesh            | CB at infrastructure level

8. Key Takeaways

1. A circuit breaker prevents cascading failures by failing fast.
2. Three states: CLOSED (normal), OPEN (failing fast), HALF-OPEN (testing).
3. Configure carefully: failure threshold, timeout, success threshold.
4. Always provide fallback behavior: cached data, defaults, graceful degradation.
5. Use established libraries: Resilience4j (Java), Polly (.NET), Istio (mesh).
6. In interviews: mention alongside retry, timeout, and bulkhead patterns.