# Runbook: Release Orchestrator - Gate Evaluation Timeout

> **Sprint:** SPRINT_20260117_029_DOCS_runbook_coverage
> **Task:** RUN-004 - Release Orchestrator Runbooks

## Metadata

| Field | Value |
|-------|-------|
| **Component** | Release Orchestrator |
| **Severity** | High |
| **On-call scope** | Platform team |
| **Last updated** | 2026-01-17 |
| **Doctor check** | `check.orchestrator.gate-timeout` |

---

## Symptoms

- [ ] Promotion gates timing out before completing evaluation
- [ ] Alert `OrchestratorGateTimeout` firing
- [ ] Error: "gate evaluation timeout exceeded"
- [ ] Promotion stuck waiting for gate response
- [ ] Metric `orchestrator_gate_timeout_total` increasing

---

## Impact

| Impact Type | Description |
|-------------|-------------|
| **User-facing** | Promotions delayed or blocked; release pipeline stalled |
| **Data integrity** | No data loss; promotion can be retried |
| **SLA impact** | Release SLO violated if timeout persists |

---

## Diagnosis

### Quick checks

1. **Check Doctor diagnostics:**
   ```bash
   stella doctor --check check.orchestrator.gate-timeout
   ```

2. **Identify timed-out gates:**
   ```bash
   stella promotion gates --status timeout
   ```

3. **Check gate service health:**
   ```bash
   stella orch gate-services status
   ```

### Deep diagnosis

1. **Check specific gate latency:**
   ```bash
   stella orch gate stats --gate --last 1h
   ```
   Look for: P95 latency, timeout rate

2. **Check external service connectivity:**
   ```bash
   stella orch connectivity --gate
   ```

3. **Check gate evaluation logs:**
   ```bash
   stella orch logs --gate --promotion
   ```
   Look for: Slow queries, external API delays

4. **Check policy engine latency (for policy gates):**
   ```bash
   stella policy stats --last 10m
   ```

---

## Resolution

### Immediate mitigation

1. **Increase timeout for specific gate:**
   ```bash
   stella orch config set gates..timeout 5m
   stella orch reload
   ```

2. **Skip the timed-out gate (requires approval):**
   ```bash
   stella promotion gate skip \
     --reason "External service timeout - approved by "
   ```

3. **Retry the promotion:**
   ```bash
   stella promotion retry
   ```

### Root cause fix

**If external service is slow:**

1. Configure gate retry with backoff:
   ```bash
   stella orch config set gates..retries 3
   stella orch config set gates..retry_backoff 5s
   ```

2. Enable gate result caching:
   ```bash
   stella orch config set gates..cache_ttl 5m
   ```

3. Configure circuit breaker:
   ```bash
   stella orch config set gates..circuit_breaker.enabled true
   stella orch config set gates..circuit_breaker.threshold 5
   ```

**If policy evaluation is slow:**

1. Optimize policy (see `policy-evaluation-slow.md` runbook)

2. Increase policy worker count:
   ```bash
   stella policy config set opa.workers 4
   ```

**If evidence retrieval is slow:**

1. Enable evidence pre-fetching:
   ```bash
   stella orch config set gates.evidence_prefetch true
   ```

2. Increase evidence cache:
   ```bash
   stella orch config set evidence.cache_size 1000
   stella orch config set evidence.cache_ttl 10m
   ```

### Verification

```bash
# Retry promotion
stella promotion retry

# Monitor gate evaluation
stella promotion gates --watch

# Check gate latency improved
stella orch gate stats --gate --last 10m

# Verify no timeouts
stella orch logs --filter "timeout" --last 30m
```

---

## Prevention

- [ ] **Timeouts:** Set appropriate timeouts based on gate SLAs (default: 2m)
- [ ] **Monitoring:** Alert on gate P95 latency > 1m
- [ ] **Caching:** Enable caching for slow gates
- [ ] **Circuit breakers:** Enable circuit breakers for external service gates

---

## Related Resources

- **Architecture:** `docs/modules/release-orchestrator/gates.md`
- **Related runbooks:** `orchestrator-promotion-stuck.md`, `policy-evaluation-slow.md`
- **Dashboard:** Grafana > Stella Ops > Gate Latency
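
For operators who want to reason about what the `retries` and `retry_backoff` settings under "Root cause fix" imply, the following is a minimal sketch of retry with a fixed backoff. It is not the orchestrator's implementation: `retry_with_backoff` is a hypothetical helper, and it assumes that `retries` counts attempts made after the first one and that the backoff is a constant delay in seconds. The orchestrator may count or delay differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the stella CLI.
# Usage: retry_with_backoff <retries> <backoff_seconds> <command> [args...]
# Runs the command once, then up to <retries> more times on failure,
# sleeping <backoff_seconds> between attempts (assumed fixed delay).
retry_with_backoff() {
  local retries="$1" backoff="$2"
  shift 2
  local failures=0
  while true; do
    # Success on any attempt ends the loop immediately.
    if "$@"; then
      return 0
    fi
    failures=$((failures + 1))
    # Stop once the initial attempt plus all retries have failed.
    if [ "$failures" -gt "$retries" ]; then
      echo "giving up after $failures attempts" >&2
      return 1
    fi
    echo "attempt $failures failed; retrying in ${backoff}s" >&2
    sleep "$backoff"
  done
}
```

With values matching the runbook (`retries 3`, `retry_backoff 5s`), a call such as `retry_with_backoff 3 5 curl -fsS "$GATE_HEALTH_URL"` would make at most four attempts and add up to 15 seconds of waiting, which should fit inside the gate timeout. `GATE_HEALTH_URL` is an illustrative variable, not something the runbook defines.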