Circuit Breaker

Circuit breakers stop calling failing dependencies and give systems time to recover, protecting latency and preventing cascading failures.

Circuit Breakers Prevent Cascading Failures

A circuit breaker stops calls to a failing dependency and gives it time to recover. Without one, your system keeps sending traffic into the failure, which adds load to the struggling dependency, lets queues grow, and makes recovery harder.

States: Closed, Open, Half-Open

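  • Closed: calls flow normally while the breaker tracks failures
  • Open: calls are rejected immediately (or routed to a fallback) without reaching the dependency
  • Half-open: a limited number of probe calls test whether the dependency has recovered; success closes the breaker, failure reopens it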
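
The three states above can be sketched as a small state machine. This is a minimal, single-threaded illustration rather than a production implementation: the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are assumptions made for this example, and the breaker trips on consecutive failures only.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the dependency."""


class CircuitBreaker:
    # All parameter names and defaults here are illustrative assumptions.
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before probing
        self.clock = clock                          # injectable so tests can control time
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast: the dependency is never called.
                raise CircuitOpenError("circuit is open")
            # Cool-down elapsed: let this call through as a probe.
            self.state = State.HALF_OPEN
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        # Any success (including a successful probe) closes the breaker.
        self.failures = 0
        self.state = State.CLOSED

    def _on_failure(self):
        self.failures += 1
        # A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A caller would wrap each dependency call, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` is a hypothetical client function, and treat `CircuitOpenError` as the signal to fail fast or use a fallback. A concurrent service would also need locking and an explicit cap on simultaneous probes, which this sketch omits.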

What Signals Should Trip the Breaker

  • High error rate (5xx, timeouts)
  • Rising latency (p95/p99)
  • Connection pool exhaustion
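
One way to turn these signals into a trip decision is a rolling window over recent call outcomes. The sketch below is an assumption-laden illustration: the class name `RollingWindow`, the window size, and every threshold are hypothetical values chosen for the example, and a connection pool exhaustion error is assumed to be recorded as an ordinary failed call.

```python
from collections import deque


class RollingWindow:
    """Tracks recent call outcomes and decides whether the breaker should trip."""

    # Sizes and thresholds are illustrative assumptions, not recommendations.
    def __init__(self, size=20, min_calls=10, max_error_rate=0.5, max_p95_seconds=1.0):
        self.outcomes = deque(maxlen=size)  # (succeeded, latency_seconds); oldest dropped
        self.min_calls = min_calls          # avoid tripping on a tiny sample
        self.max_error_rate = max_error_rate
        self.max_p95_seconds = max_p95_seconds

    def record(self, succeeded, latency_seconds):
        # Timeouts, 5xx responses, and pool-exhaustion errors count as failures.
        self.outcomes.append((succeeded, latency_seconds))

    def error_rate(self):
        if not self.outcomes:
            return 0.0
        failures = sum(1 for ok, _ in self.outcomes if not ok)
        return failures / len(self.outcomes)

    def p95_latency(self):
        if not self.outcomes:
            return 0.0
        latencies = sorted(latency for _, latency in self.outcomes)
        # Nearest-rank percentile, using integer math for ceil(0.95 * n).
        rank = (95 * len(latencies) + 99) // 100
        return latencies[rank - 1]

    def should_trip(self):
        if len(self.outcomes) < self.min_calls:
            return False
        return (
            self.error_rate() >= self.max_error_rate
            or self.p95_latency() >= self.max_p95_seconds
        )
```

Requiring a minimum number of calls before tripping keeps one or two early failures on a low-traffic path from opening the breaker.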

Fast Failure Is a Feature

When a dependency is down, failing fast protects the rest of your system. It preserves resources for critical paths and avoids long queueing delays. Users may see errors, but the system stays responsive and recoverable.
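
A rough calculation shows why failing fast preserves resources. By Little's law, the average number of calls in flight equals the arrival rate multiplied by the time each call takes. The traffic figures below are made-up numbers for illustration only.

```python
def in_flight(arrival_rate_per_s, seconds_per_call):
    """Little's law: average concurrent calls = arrival rate x time per call."""
    return arrival_rate_per_s * seconds_per_call


# Hypothetical workload: 200 calls per second to a dependency that is down.
waiting_on_timeouts = in_flight(200, 5.0)    # each call blocks for a 5 s timeout
rejected_by_breaker = in_flight(200, 0.001)  # each call is rejected in about 1 ms

print(waiting_on_timeouts, rejected_by_breaker)
```

Under these assumed numbers, roughly a thousand threads or connections sit blocked on timeouts without a breaker, while an open breaker keeps the figure well below one.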

Fallbacks and Degradation

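Circuit breakers are often paired with fallbacks: return cached data, partial results, or a degraded response. A fallback must be explicitly safe for the flow that uses it and must not silently violate correctness in critical flows. For example, slightly stale cached data may be acceptable on a read-only page but not where the result drives a payment or an inventory decision.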
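
A cached-data fallback might look like the sketch below. The function name `get_with_fallback`, the `allow_stale` flag, and the plain dict used as a cache are all assumptions for this example. The point it tries to show is that degradation is explicit: the caller is told when the value is stale, and critical flows can opt out of the fallback entirely.

```python
def get_with_fallback(fetch, cache, key, allow_stale=True):
    """Return (value, degraded). `degraded` is True when the value came from cache.

    `fetch` is any callable that may raise, for example a dependency call
    wrapped by a circuit breaker that raises while the breaker is open.
    """
    try:
        value = fetch(key)
    except Exception:
        # Critical flows pass allow_stale=False so the error is never hidden.
        if allow_stale and key in cache:
            return cache[key], True
        raise
    cache[key] = value  # remember the last good value for later fallbacks
    return value, False
```

A caller can then render a "may be out of date" notice whenever `degraded` is true instead of presenting stale data as fresh.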

Operational Considerations

  • Log every breaker state change (open, half-open, and close events)
  • Expose metrics: open duration, reject counts, probe success rate
  • Separate breakers per dependency and sometimes per endpoint
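
The points above could be wired up roughly as follows. This is a sketch under stated assumptions: `BreakerMetrics`, `metrics_for`, the counter names, and the string state labels are hypothetical, and a real service would export these values to its metrics system rather than keep them in memory.

```python
import logging
import time
from collections import Counter

logger = logging.getLogger("circuit_breaker")


class BreakerMetrics:
    """In-memory metrics for one breaker; all names here are illustrative."""

    def __init__(self, name, clock=time.monotonic):
        self.name = name
        self.clock = clock
        self.counters = Counter()
        self.opened_at = None
        self.total_open_seconds = 0.0

    def on_state_change(self, old, new):
        # Log every transition so on-call can reconstruct the timeline.
        logger.warning("breaker %s: %s -> %s", self.name, old, new)
        self.counters[f"transition_to_{new}"] += 1
        now = self.clock()
        if new == "open":
            self.opened_at = now
        elif old == "open" and self.opened_at is not None:
            self.total_open_seconds += now - self.opened_at
            self.opened_at = None

    def on_reject(self):
        self.counters["rejected"] += 1

    def on_probe(self, succeeded):
        self.counters["probe_success" if succeeded else "probe_failure"] += 1

    def probe_success_rate(self):
        total = self.counters["probe_success"] + self.counters["probe_failure"]
        return self.counters["probe_success"] / total if total else None


def metrics_for(registry, dependency, endpoint=None):
    """One metrics object per dependency, or per (dependency, endpoint) pair."""
    key = (dependency, endpoint)
    if key not in registry:
        name = dependency if endpoint is None else f"{dependency}:{endpoint}"
        registry[key] = BreakerMetrics(name)
    return registry[key]
```

Keying the registry by dependency and optional endpoint keeps one slow endpoint from tripping the breaker for every call to the same service.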

Production-First Takeaway

Circuit breakers turn repeated failure into controlled behavior. Trip on meaningful signals, fail fast, probe recovery carefully, and instrument breaker behavior so on-call can see what is happening.