---
checkId: check.servicegraph.circuitbreaker
plugin: stellaops.doctor.servicegraph
severity: warn
tags: [servicegraph, resilience, circuit-breaker]
---

# Circuit Breaker Status

## What It Checks

Reads `Resilience:Enabled` or `HttpClient:Resilience:Enabled` and, when enabled, validates `BreakDurationSeconds`, `FailureThreshold`, and `SamplingDurationSeconds`. The check reports info when resilience is not configured, warns when `BreakDurationSeconds < 5` or `FailureThreshold < 2`, and passes otherwise.

## Why It Matters

Circuit breakers protect external dependencies from retry storms. Badly tuned thresholds either trip too aggressively or never trip at all when a downstream service is failing.

## Common Causes

- Resilience policies were never enabled on outgoing HTTP clients
- Thresholds were copied from a benchmark profile into production
- Multiple services use different resilience defaults, making failure behavior unpredictable

## How to Fix

### Docker Compose

```yaml
services:
  doctor-web:
    environment:
      Resilience__Enabled: "true"
      Resilience__CircuitBreaker__BreakDurationSeconds: "30"
      Resilience__CircuitBreaker__FailureThreshold: "5"
      Resilience__CircuitBreaker__SamplingDurationSeconds: "60"
```

### Bare Metal / systemd

Keep breaker settings in the same configuration source used for HTTP client registration so the service and Doctor observe the same values.

### Kubernetes / Helm

Standardize resilience values across backend-facing workloads instead of relying on per-pod overrides.

## Verification

```bash
stella doctor --check check.servicegraph.circuitbreaker
```

## Related Checks

- `check.servicegraph.backend` - breaker policy protects this path when the backend degrades
- `check.servicegraph.timeouts` - timeout settings and breaker settings should be tuned together
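One way to standardize values across Helm-managed workloads, sketched as a hypothetical values-file fragment: a single shared environment block referenced by each backend-facing workload via a YAML anchor, so no pod carries its own override. The key names mirror the Compose example above; the `resilienceEnv` name and the `doctor-web`/`doctor-worker` workload names are illustrative assumptions about your chart layout.

```yaml
# Shared resilience settings, defined once (anchor name is hypothetical)
resilienceEnv: &resilienceEnv
  Resilience__Enabled: "true"
  Resilience__CircuitBreaker__BreakDurationSeconds: "30"
  Resilience__CircuitBreaker__FailureThreshold: "5"
  Resilience__CircuitBreaker__SamplingDurationSeconds: "60"

# Each backend-facing workload references the same block
doctor-web:
  env: *resilienceEnv
doctor-worker:
  env: *resilienceEnv
```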
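## Appendix: Check Logic Sketch

The pass/warn/info rules from "What It Checks" can be sketched as follows. This is an illustrative Python sketch, not the actual `stellaops.doctor.servicegraph` implementation; the function name `check_circuit_breaker` and the flat `Section:Key` config keys are assumptions modeled on the settings named above.

```python
def check_circuit_breaker(config: dict) -> str:
    """Return 'info', 'warn', or 'pass' per the rules in 'What It Checks'.

    `config` is assumed to be a flat mapping of colon-delimited keys to
    string values, mirroring the settings this check reads.
    """
    enabled = (
        config.get("Resilience:Enabled")
        or config.get("HttpClient:Resilience:Enabled")
    )
    if not enabled:
        # Resilience not configured at all: informational, not a failure.
        return "info"

    break_duration = float(
        config.get("Resilience:CircuitBreaker:BreakDurationSeconds", 0)
    )
    failure_threshold = int(
        config.get("Resilience:CircuitBreaker:FailureThreshold", 0)
    )

    if break_duration < 5 or failure_threshold < 2:
        # Breaker would reopen almost immediately, or trip on a single failure.
        return "warn"
    return "pass"
```

For example, an empty config yields `info`, while the Docker Compose values above (`BreakDurationSeconds=30`, `FailureThreshold=5`) yield `pass`.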