# Runbook: Policy Engine - OPA Process Crashed

> **Sprint:** SPRINT_20260117_029_DOCS_runbook_coverage
> **Task:** RUN-003 - Policy Engine Runbooks

## Metadata

| Field | Value |
|-------|-------|
| **Component** | Policy Engine |
| **Severity** | Critical |
| **On-call scope** | Platform team |
| **Last updated** | 2026-01-17 |
| **Doctor check** | `check.policy.opa-health` |

---

## Symptoms

- [ ] Policy evaluations failing with "OPA unavailable" error
- [ ] Alert `PolicyOPACrashed` firing
- [ ] OPA process exited unexpectedly
- [ ] Error: "connection refused" when connecting to OPA
- [ ] Metric `policy_opa_restarts_total` increasing

---

## Impact

| Impact Type | Description |
|-------------|-------------|
| **User-facing** | All policy evaluations fail; gate decisions blocked |
| **Data integrity** | No data loss; decisions delayed until OPA recovers |
| **SLA impact** | Gate latency SLO violated; release pipeline blocked |

---

## Diagnosis

### Quick checks

1. **Check Doctor diagnostics:**
   ```bash
   stella doctor --check check.policy.opa-health
   ```

2. **Check OPA process status:**
   ```bash
   stella policy status
   ```
   Look for: OPA process state, restart count

3. **Check OPA logs for crash reason:**
   ```bash
   stella policy opa logs --last 30m --level error
   ```

### Deep diagnosis

1. **Check OPA memory usage before crash:**
   ```bash
   stella policy stats --opa-metrics
   ```
   Problem if: Memory usage near limit before crash

2. **Check for problematic policy:**
   ```bash
   stella policy list --last-error
   ```
   Look for: Policies that caused evaluation errors

3. **Check OPA configuration:**
   ```bash
   stella policy opa config show
   ```
   Look for: Invalid configuration, missing bundles

4. **Check for infinite loops in Rego:**
   ```bash
   stella policy analyze --detect-loops
   ```

---

## Resolution

### Immediate mitigation

1. **Restart OPA process:**
   ```bash
   stella policy opa restart
   ```

2. **If OPA keeps crashing, start in safe mode:**
   ```bash
   stella policy opa start --safe-mode
   ```
   Note: Safe mode disables custom policies

3. **Enable fail-open temporarily (if allowed by policy):**
   ```bash
   stella policy config set failopen true
   stella policy reload
   ```
   **Warning:** Only use if compliance allows fail-open mode

### Root cause fix

**If OOM killed:**

1. Increase OPA memory limit:
   ```bash
   stella policy opa config set memory_limit 2Gi
   stella policy opa restart
   ```

2. Enable garbage collection tuning:
   ```bash
   stella policy opa config set gc_min_heap_size 256Mi
   stella policy opa config set gc_max_heap_size 1Gi
   ```

**If policy caused crash:**

1. Identify problematic policy:
   ```bash
   stella policy list --status error
   ```

2. Disable the problematic policy:
   ```bash
   stella policy disable <policy-id>
   stella policy reload
   ```

3. Fix and re-enable:
   ```bash
   stella policy validate --file <policy-file>
   stella policy update --file <policy-file>
   stella policy enable <policy-id>
   ```

**If bundle loading failed:**

1. Check bundle integrity:
   ```bash
   stella policy bundle verify
   ```

2. Rebuild bundle:
   ```bash
   stella policy bundle build --output bundle.tar.gz
   stella policy bundle load bundle.tar.gz
   ```

**If configuration issue:**

1. Reset to default configuration:
   ```bash
   stella policy opa config reset
   ```

2. Reconfigure with validated settings:
   ```bash
   stella policy opa config set workers 4
   stella policy opa config set decision_log true
   stella policy opa restart
   ```

### Verification

```bash
# Check OPA is running
stella policy status

# Check OPA health
stella policy opa health

# Test policy evaluation
stella policy evaluate --test

# Check no crashes in recent logs
stella policy opa logs --level error --last 30m

# Monitor stability
stella policy stats --watch
```

---

## Prevention

- [ ] **Resources:** Set appropriate memory limits based on policy complexity
- [ ] **Validation:** Validate all policies before deployment
- [ ] **Monitoring:** Alert on OPA restart count > 2 in 10 minutes
- [ ] **Testing:** Load test policies before production deployment

---

## Related Resources

- **Architecture:** `docs/modules/policy/architecture.md`
- **Related runbooks:** `policy-evaluation-slow.md`, `policy-compilation-failed.md`
- **Doctor check:** `src/Doctor/__Plugins/StellaOps.Doctor.Plugin.Policy/`
- **OPA documentation:** https://www.openpolicyagent.org/docs/latest/
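
---

## Appendix: Scripted health wait (sketch)

After a restart, the health check in the Verification section may need a few attempts before it passes. The helper below is a minimal sketch of polling `stella policy opa health` with a bounded number of retries. It rests on one assumption this runbook does not confirm: that the command exits non-zero while OPA is unhealthy. The `wait_for_opa` function and the `STELLA` override variable are hypothetical names introduced for illustration only.

```shell
# Sketch only. ASSUMPTION: `stella policy opa health` exits non-zero
# while OPA is unhealthy. Confirm this before relying on the script.

# Hypothetical override so a wrapper or stub can stand in for the CLI.
STELLA="${STELLA:-stella}"

# wait_for_opa [attempts] [delay_seconds]
# Polls the health check until it passes or the attempts run out.
wait_for_opa() {
  attempts="${1:-10}"
  delay="${2:-3}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$STELLA" policy opa health >/dev/null 2>&1; then
      echo "OPA healthy after $i check(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "OPA still unhealthy after $attempts check(s)" >&2
  return 1
}
```

One possible use is `stella policy opa restart && wait_for_opa 10 3 && stella policy evaluate --test`, which stops at the first failing step. If the helper gives up, continue with the safe-mode step under Immediate mitigation.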