---
checkId: check.release.environment.readiness
plugin: stellaops.doctor.release
severity: warn
tags: [release, environment, readiness, deployment]
---
# Environment Readiness

## What It Checks

Queries the Release Orchestrator at `/api/v1/environments` and evaluates the health and readiness of all configured target environments:

- **Reachability**: environments must respond to health checks.
- **Health status**: environments must report as healthy.
- **Health check freshness**: warn if the last health check data is older than 1 hour.
- **Production priority**: production environment issues escalate to fail severity; non-production issues are warnings.

Evidence collected: `environment_count`, `dev_environments`, `staging_environments`, `prod_environments`, `unreachable_count`, `unhealthy_count`, `unreachable_environments`, `unhealthy_environments`, `stale_health_check_count`.

The check requires `ReleaseOrchestrator:Url` or `Release:Orchestrator:Url` to be configured.

## Why It Matters

Environments are the deployment targets in the release pipeline. An unreachable or unhealthy environment will cause any release targeting it to fail, blocking the promotion chain. Production environment issues are critical because they can indicate that the currently deployed version is also impacted. Stale health data means the system is operating on outdated information, which can lead to deploying to an environment that is actually down.
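
The severity rules listed under What It Checks can be sketched as a small shell function. This is an illustration only, not the plugin's implementation: `classify_env`, its argument order, the `prod` type label, and `STALE_AFTER_SECONDS` are hypothetical names, and the sketch assumes that stale health data stays a warning even for production, which the description above does not state.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the severity rules; not the actual check code.

# "Older than 1 hour" from the check description, in seconds.
STALE_AFTER_SECONDS=3600

# classify_env <type> <reachable> <healthy> <health_age_seconds>
# Prints one of: pass, warn, fail.
classify_env() {
  local env_type="$1" reachable="$2" healthy="$3" age="$4"

  if [[ "$reachable" != "true" || "$healthy" != "true" ]]; then
    # Production issues escalate to fail; non-production issues are warnings.
    if [[ "$env_type" == "prod" ]]; then
      echo "fail"
    else
      echo "warn"
    fi
    return
  fi

  # Reachable and healthy, but the health data may be stale.
  # Assumption: staleness is a warning regardless of environment type.
  if (( age > STALE_AFTER_SECONDS )); then
    echo "warn"
    return
  fi

  echo "pass"
}

classify_env prod    true  true 120    # prints "pass"
classify_env staging false true 120    # prints "warn"
classify_env prod    false true 120    # prints "fail"
classify_env dev     true  true 7200   # prints "warn"
```

Keeping the reachability and health test ahead of the freshness test means a down environment is never reported as merely stale.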

## Common Causes

- Environment agent not responding (crashed, network partition)
- Network connectivity issue between the orchestrator and target environment
- Container runtime issue in the target environment (Docker daemon down)
- Resource exhaustion (disk full, memory pressure) on the target host
- Dev/staging environment intentionally powered down
- Health check scheduler not running, producing stale data
- Intermittent environment agent connectivity, causing stale health reports

## How to Fix

### Docker Compose

```bash
# Ping the unreachable environment
stella env ping

# View environment agent logs
stella env logs

# Check environment health details
stella env health

# Refresh health data for all environments
stella env health --refresh-all
```

### Bare Metal / systemd

```bash
# Check the environment agent service
# (<target-host> is a placeholder for the host running the agent)
ssh <target-host> "systemctl status stellaops-agent"

# Test network connectivity
stella env ping

# View agent logs on the target host
ssh <target-host> "journalctl -u stellaops-agent --since '1 hour ago'"

# Restart agent if needed
ssh <target-host> "systemctl restart stellaops-agent"
```

### Kubernetes / Helm

```bash
# Check agent pods in the target cluster
# (<target-context> is a placeholder for the cluster's kubeconfig context)
kubectl --context <target-context> get pods -l app=stellaops-agent

# View agent logs
kubectl --context <target-context> logs -l app=stellaops-agent --tail=200

# Check node resource availability
kubectl --context <target-context> top nodes
```

## Verification

```
stella doctor run --check check.release.environment.readiness
```

## Related Checks

- `check.release.active` -- unreachable environments cause active releases to get stuck
- `check.release.rollback.readiness` -- environment health affects rollback capability
- `check.release.promotion.gates` -- environments must be reachable for gate checks to pass