Circuit Breaker

Our rate limiter runs on Redis. The constraint: Redis is now a single point of failure.
If Redis goes down at 3 AM, every request to every endpoint calls a dead Redis node, times out after 500 ms, and returns a 500 error.
The component meant to protect our system becomes the component that brings it down. The circuit breaker in our design monitors the health of the rate limiter itself and prevents this cascading failure.
A circuit breaker has three states: closed, open, and half-open. In the closed state, requests flow through the rate limiter normally. When Redis failures exceed a threshold (say 5 failures in 10 seconds), the circuit opens and stops calling Redis entirely.
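To make the threshold concrete, here is a minimal sketch of a sliding-window failure counter. The class name FailureWindow and its parameters are our own illustrative choices, not part of any library, and the sketch treats reaching 5 failures within 10 seconds as the trip condition.

```python
import time
from collections import deque


class FailureWindow:
    """Counts Redis failures inside a sliding time window (illustrative sketch)."""

    def __init__(self, threshold=5, window_seconds=10.0, clock=time.monotonic):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.clock = clock            # injectable so the window can be tested
        self._failures = deque()      # timestamps of recent failures, oldest first

    def record_failure(self):
        now = self.clock()
        self._failures.append(now)
        self._evict(now)

    def should_open(self):
        """True once the number of recent failures reaches the threshold."""
        self._evict(self.clock())
        return len(self._failures) >= self.threshold

    def _evict(self, now):
        # Drop failures that have aged out of the window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
```

Using a monotonic clock avoids false trips when the wall clock is adjusted. Old failures age out on their own, so a handful of errors spread over several minutes never opens the circuit.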
Now we face a design decision. Fail-open means we allow all requests through without rate limiting, accepting the risk of overload but keeping the service alive. Fail-closed means we reject all requests, protecting downstream services but causing a total outage for users. We chose fail-open because a brief period without rate limiting is less catastrophic than a complete service outage.
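The two policies differ only in what we return when the limiter cannot answer. The sketch below assumes a hypothetical check_limit callable that raises ConnectionError or TimeoutError when Redis is unreachable. A real Redis client raises its own exception types, which we would catch instead.

```python
from enum import Enum


class FailurePolicy(Enum):
    FAIL_OPEN = "fail_open"      # allow traffic when the limiter is unavailable
    FAIL_CLOSED = "fail_closed"  # reject traffic when the limiter is unavailable


def is_allowed(check_limit, key, policy=FailurePolicy.FAIL_OPEN):
    """Return True if the request may proceed.

    check_limit is a hypothetical callable: it returns True when `key` is
    under its limit and raises ConnectionError or TimeoutError when Redis is down.
    """
    try:
        return check_limit(key)
    except (ConnectionError, TimeoutError):
        # The limiter is unavailable, so the policy decides the outcome.
        return policy is FailurePolicy.FAIL_OPEN
```

With the default policy a Redis outage lets every request through. Passing FailurePolicy.FAIL_CLOSED turns the same outage into a rejection.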
Trade-off: we gave up abuse protection during Redis downtime in exchange for service availability. After a cooldown period (typically 30 seconds), the circuit enters a half-open state, sending a small percentage of requests to Redis to test recovery.
If those succeed, the circuit closes and normal rate limiting resumes. Netflix's Hystrix library popularized this pattern, and AWS API Gateway implements circuit breakers around its internal rate limiting infrastructure.
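The open, half-open, and closed transitions can be sketched as a small state machine. This is an illustration under simplifying assumptions, not how Hystrix or any gateway implements it: the names are ours, trip() stands in for whatever failure counting opens the circuit, a 10% probe ratio is an arbitrary choice, and a single successful probe closes the circuit.

```python
import random
import time


class CircuitBreaker:
    """Illustrative three-state breaker; names and defaults are assumptions."""

    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(self, cooldown_seconds=30.0, probe_ratio=0.1,
                 clock=time.monotonic, rng=random.random):
        self.cooldown_seconds = cooldown_seconds
        self.probe_ratio = probe_ratio  # share of requests probing Redis when half-open
        self.clock = clock
        self.rng = rng
        self.state = self.CLOSED
        self._opened_at = None

    def trip(self):
        """Open the circuit, e.g. when the failure threshold is hit."""
        self.state = self.OPEN
        self._opened_at = self.clock()

    def should_call_redis(self):
        if self.state == self.OPEN:
            if self.clock() - self._opened_at < self.cooldown_seconds:
                return False                 # still cooling down: skip Redis
            self.state = self.HALF_OPEN      # cooldown elapsed: start probing
        if self.state == self.HALF_OPEN:
            return self.rng() < self.probe_ratio
        return True                          # closed: normal operation

    def record_success(self):
        if self.state == self.HALF_OPEN:
            self.state = self.CLOSED         # probe succeeded: resume rate limiting

    def record_failure(self):
        if self.state == self.HALF_OPEN:
            self.trip()                      # probe failed: reopen, restart cooldown
```

A production breaker would likely require several consecutive successful probes before closing, so that one lucky request does not send full traffic back to a Redis node that is still recovering.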
What if the interviewer asks: 'When would you choose fail-closed?' We choose fail-closed when the downstream service cannot tolerate any overload at all, such as a billing service where unbounded requests could trigger millions of dollars in erroneous charges.
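One way to express that answer in a design is to choose the failure policy per route instead of globally. The route prefixes below are hypothetical and only illustrate the idea.

```python
# Hypothetical route prefixes; the first matching prefix wins.
FAILURE_POLICY_BY_PREFIX = [
    ("/billing/", "fail_closed"),  # cannot tolerate unbounded requests
    ("/", "fail_open"),            # everything else favors availability
]


def failure_policy_for(path):
    """Return the failure policy for a request path (illustrative sketch)."""
    for prefix, policy in FAILURE_POLICY_BY_PREFIX:
        if path.startswith(prefix):
            return policy
    return "fail_open"  # default when no prefix matches
```

Ordering the list from most to least specific keeps the billing rule from being shadowed by the catch-all.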
Why it matters in interviews
The circuit breaker question tests whether our rate limiter design handles its own failure. Articulating the fail-open vs fail-closed trade-off and the half-open recovery mechanism shows we think about resilience, not the happy path alone.
Related concepts