| checkId | plugin | severity | tags |
|---|---|---|---|
| check.release.promotion.gates | stellaops.doctor.release | warn | |
Promotion Gate Health
What It Checks
Queries the Release Orchestrator at /api/v1/promotion-gates and validates each promotion gate's dependencies:
- Policy availability: if a gate requires policy pass, verifies that all required policies are loaded in the policy engine (queries OPA at /v1/policies).
- Attestor availability: if a gate requires attestations, verifies the attestor service is reachable at its health endpoint.
- Approval configuration: if a gate requires approval, verifies that at least one approver is configured.
- Severity: missing policies or missing approvers escalate to fail; attestor unavailability is a warning.
Evidence collected: gate_count, gates_with_policy, gates_with_attestation, gates_with_approval, issue_count, issues.
The check requires ReleaseOrchestrator:Url or Release:Orchestrator:Url to be configured.
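The rules above can be sketched as a small evaluation function. This is a minimal illustration, not the plugin's actual code: the `Gate` shape, the `evaluate_gates` name, and the `"pass"` status label are assumptions made for the example, and only the evidence keys and the fail/warn rules come from this article.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical gate shape; the real API response schema is not shown in this article.
@dataclass
class Gate:
    name: str
    required_policies: List[str] = field(default_factory=list)
    requires_attestation: bool = False
    requires_approval: bool = False
    approvers: List[str] = field(default_factory=list)

# "pass" is an assumed label for "no issues"; "warn" and "fail" come from the article.
SEVERITY_ORDER = {"pass": 0, "warn": 1, "fail": 2}

def evaluate_gates(gates, loaded_policies, attestor_healthy):
    """Return (severity, evidence) for a list of gates."""
    severity = "pass"
    issues = []

    def escalate(level):
        # Keep the worst severity seen so far.
        nonlocal severity
        if SEVERITY_ORDER[level] > SEVERITY_ORDER[severity]:
            severity = level

    for gate in gates:
        missing = [p for p in gate.required_policies if p not in loaded_policies]
        if missing:
            issues.append(f"{gate.name}: missing policies {', '.join(missing)}")
            escalate("fail")
        if gate.requires_attestation and not attestor_healthy:
            issues.append(f"{gate.name}: attestor unavailable")
            escalate("warn")
        if gate.requires_approval and not gate.approvers:
            issues.append(f"{gate.name}: no approvers configured")
            escalate("fail")

    evidence = {
        "gate_count": len(gates),
        "gates_with_policy": sum(1 for g in gates if g.required_policies),
        "gates_with_attestation": sum(1 for g in gates if g.requires_attestation),
        "gates_with_approval": sum(1 for g in gates if g.requires_approval),
        "issue_count": len(issues),
        "issues": issues,
    }
    return severity, evidence
```

Under these assumptions, a gate that only depends on an unreachable attestor yields `warn`, while any missing policy or empty approver list raises the overall result to `fail`.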
Why It Matters
Promotion gates enforce the security and compliance requirements for environment promotions. If a gate references a policy that does not exist in the policy engine, releases will fail at promotion time with a cryptic error. If the attestor is down, attestation-gated promotions will block. If approval is required but no approvers are configured, releases will wait indefinitely. These issues are best caught proactively, not during a time-critical production deployment.
Common Causes
- Required policies not loaded or compiled in the policy engine
- Attestor service unavailable (crashed, misconfigured, network issue)
- Approval workflow misconfigured (approvers removed, role changes)
- Environment was deleted but its promotion gate configuration remains
- Policy engine URL misconfigured so policy lookup fails
- Policy was renamed but gate still references the old name
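Several of these causes come down to a gate pointing at something that no longer exists. The sketch below shows one way to tell them apart offline, assuming you have already exported the gate references, the loaded policy names, and the environment names. The function name, the input shapes, and the finding labels are hypothetical. The rename hint uses Python's `difflib.get_close_matches`, which is only a fuzzy-matching heuristic and not something the Doctor check is known to do.

```python
import difflib

def diagnose_references(gate_refs, loaded_policies, known_environments):
    """Classify dangling gate references.

    gate_refs maps a gate name to (environment, [policy names]); this shape
    is an assumption for the example, not the orchestrator's schema.
    Returns a list of (gate, finding, reference, suggestion) tuples.
    """
    candidates = sorted(loaded_policies)  # sorted for deterministic suggestions
    findings = []
    for gate, (environment, policies) in sorted(gate_refs.items()):
        if environment not in known_environments:
            # Environment was deleted but the gate configuration remains.
            findings.append((gate, "orphaned_gate", environment, None))
        for policy in policies:
            if policy in loaded_policies:
                continue
            close = difflib.get_close_matches(policy, candidates, n=1, cutoff=0.6)
            if close:
                # A near-identical loaded name hints at a rename.
                findings.append((gate, "possibly_renamed", policy, close[0]))
            else:
                findings.append((gate, "missing_policy", policy, None))
    return findings
```

A `possibly_renamed` finding is a prompt to check the gate configuration by hand. The 0.6 cutoff is the library default and may need tuning for short policy names.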
How to Fix
Docker Compose
# List all promotion gates and their status
stella release gates list
# Check policy engine for required policies
stella policy list
# Verify attestor health
curl -s http://localhost:5090/health
# Configure approvers for a gate
stella release gates configure <gate-id> --approvers <user1>,<user2>
Set in the Compose file:
services:
  orchestrator:
    environment:
      ReleaseOrchestrator__DefaultApprovers: "admin,release-manager"
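The `__` in the variable name above looks like the .NET-style convention in which a double underscore in an environment variable stands for the `:` separator in a configuration key. Assuming that convention applies here, the sketch below shows how such variables would map onto the keys this article mentions. The helper names, the preference order between the two URL keys, and the comma-splitting of approvers are all assumptions for illustration.

```python
def env_to_config(environ):
    """Map FOO__Bar style variable names to FOO:Bar keys (assumed convention)."""
    return {name.replace("__", ":"): value for name, value in environ.items()}

def orchestrator_url(config):
    """Return the first non-empty orchestrator URL, or None.

    The order of preference is an assumption; the article only says that
    either key may be configured.
    """
    for key in ("ReleaseOrchestrator:Url", "Release:Orchestrator:Url"):
        value = config.get(key, "").strip()
        if value:
            return value
    return None

def default_approvers(config):
    """Split the comma-separated approver list, dropping empty entries."""
    raw = config.get("ReleaseOrchestrator:DefaultApprovers", "")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

For example, passing `dict(os.environ)` from inside the orchestrator container to `env_to_config` would show whether the approver list and URL are visible under the expected keys.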
Bare Metal / systemd
# List gates
stella release gates list
# Verify policy engine is running
sudo systemctl status stellaops-policy-engine
# Verify attestor is running
sudo systemctl status stellaops-attestor
# Reload policies
stella policy compile --all
Kubernetes / Helm
# Check policy engine pods
kubectl get pods -l app=stellaops-policy-engine
# Check attestor pods
kubectl get pods -l app=stellaops-attestor
# Verify gate configuration
kubectl exec -it <orchestrator-pod> -- stella release gates list
Set in Helm values.yaml:
releaseOrchestrator:
  promotionGates:
    defaultApprovers:
      - admin
      - release-manager
policy:
  engineUrl: "http://stellaops-policy-engine:8181"
attestor:
  url: "http://stellaops-attestor:5090"
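Because a misconfigured policy engine URL is one of the listed causes, it can help to sanity-check the URL values before deploying. The following is a minimal sketch using only the Python standard library. The `url_problems` helper is hypothetical and checks syntax only; it does not contact the service.

```python
from urllib.parse import urlsplit

def url_problems(url):
    """Return a list of syntactic problems found in a service URL."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parts.hostname:
        problems.append("missing host")
    try:
        # Accessing .port raises ValueError for non-numeric or out-of-range ports.
        parts.port
    except ValueError:
        problems.append("port is not a valid number")
    return problems
```

An empty list means the value is well-formed, for example `url_problems("http://stellaops-policy-engine:8181")`. A value without a scheme, such as `stellaops-policy-engine:8181`, is reported as a problem.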
Verification
stella doctor run --check check.release.promotion.gates
Related Checks
- check.policy.engine -- policy engine health affects gate policy checks
- check.release.active -- gate failures cause active releases to get stuck
- check.release.configuration -- workflow configuration defines which gates are used