| checkId | plugin | severity | tags |
|---|---|---|---|
| check.release.environment.readiness | stellaops.doctor.release | warn | |
Environment Readiness
What It Checks
Queries the Release Orchestrator at /api/v1/environments and evaluates the health and readiness of all configured target environments:
- Reachability: environments must respond to health checks.
- Health status: environments must report as healthy.
- Health check freshness: warn if the last health check data is older than 1 hour.
- Production priority: production environment issues escalate to fail severity; non-production issues are warnings.
Evidence collected: environment_count, dev_environments, staging_environments, prod_environments, unreachable_count, unhealthy_count, unreachable_environments, unhealthy_environments, stale_health_check_count.
The check requires ReleaseOrchestrator:Url or Release:Orchestrator:Url to be configured.
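The severity rules above can be sketched as a small shell function. This is an illustration only, not the Doctor plugin's actual implementation: the function name `classify_readiness` and the pipe-delimited input format are hypothetical, and treating stale data as a warning even for production is an assumption, since the rules above do not say whether staleness escalates.

```shell
# Sketch only: classify_readiness and its input format are hypothetical.
# Reads one environment per line on stdin:
#   name|type|reachable|healthy|age_seconds
# Prints pass, warn, or fail.
classify_readiness() {
  result="pass"
  while IFS='|' read -r name type reachable healthy age; do
    [ -z "$name" ] && continue
    if [ "$reachable" != "true" ] || [ "$healthy" != "true" ]; then
      if [ "$type" = "prod" ]; then
        # Production issues escalate to fail.
        result="fail"
      elif [ "$result" != "fail" ]; then
        # Non-production issues are warnings.
        result="warn"
      fi
    elif [ "$age" -gt 3600 ] && [ "$result" = "pass" ]; then
      # Health data older than 1 hour (3600 s) is stale.
      # Assumption: staleness alone is a warning, even for production.
      result="warn"
    fi
  done
  echo "$result"
}
```

For example, `printf 'dev|dev|false|true|60\n' | classify_readiness` reports a warning, while the same line with type `prod` reports a failure.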
Why It Matters
Environments are the deployment targets in the release pipeline. An unreachable or unhealthy environment will cause any release targeting it to fail, blocking the promotion chain. Production environment issues are critical because they can indicate that the currently deployed version is also impacted. Stale health data means the system is operating on outdated information, which can lead to deploying to an environment that is actually down.
Common Causes
- Environment agent not responding (crashed, network partition)
- Network connectivity issue between the orchestrator and target environment
- Container runtime issue in the target environment (Docker daemon down)
- Resource exhaustion (disk full, memory pressure) on the target host
- Dev/staging environment intentionally powered down
- Health check scheduler not running, producing stale data
- Intermittent connectivity from the environment agent, causing stale health reports
How to Fix
Docker Compose
# Ping the unreachable environment
stella env ping <environment-name>
# View environment agent logs
stella env logs <environment-name>
# Check environment health details
stella env health <environment-name>
# Refresh health data for all environments
stella env health --refresh-all
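When several environments appear in the unreachable_environments evidence, the ping step can be looped. The helper below is a sketch: `ping_environments` is a hypothetical name, not part of the stella CLI, and it assumes that `stella env ping` exits non-zero when the environment does not respond, which this article does not confirm.

```shell
# Sketch: ping_environments is a hypothetical helper, not a stella command.
# Assumption: `stella env ping` exits non-zero when the ping fails.
ping_environments() {
  failed=""
  for env_name in "$@"; do
    if ! stella env ping "$env_name" >/dev/null 2>&1; then
      # Collect failing names, space-separated.
      failed="${failed:+$failed }$env_name"
    fi
  done
  # Print the environments that did not respond (empty if all responded).
  echo "$failed"
  # Succeed only when every environment responded.
  [ -z "$failed" ]
}
```

Call it with the environment names from the check's evidence, for example `ping_environments <environment-name> <environment-name>`.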
Bare Metal / systemd
# Check the environment agent service
ssh <environment-host> "systemctl status stellaops-agent"
# Test network connectivity
stella env ping <environment-name>
# View agent logs on the target host
ssh <environment-host> "journalctl -u stellaops-agent --since '1 hour ago'"
# Restart agent if needed
ssh <environment-host> "systemctl restart stellaops-agent"
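The status check and restart above can be combined so the agent is restarted only when systemd reports it inactive. This is a minimal sketch: `ensure_agent_running` is a hypothetical helper name, and it assumes the SSH user may restart the service without extra privileges, as in the commands above. It relies on `systemctl is-active --quiet`, which returns a non-zero status when the unit is not active.

```shell
# Sketch: ensure_agent_running is a hypothetical helper.
# Restarts stellaops-agent on the given host only when it is not active.
ensure_agent_running() {
  host="$1"
  if ssh "$host" "systemctl is-active --quiet stellaops-agent"; then
    echo "running"
  else
    # Assumption: the SSH user is allowed to restart the service.
    ssh "$host" "systemctl restart stellaops-agent" && echo "restarted"
  fi
}
```

Run it as `ensure_agent_running <environment-host>`; it prints `running` when nothing needed to change and `restarted` after a successful restart.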
Kubernetes / Helm
# Check agent pods in the target cluster
kubectl --context <target-cluster> get pods -l app=stellaops-agent
# View agent logs
kubectl --context <target-cluster> logs -l app=stellaops-agent --tail=200
# Check node resource availability
kubectl --context <target-cluster> top nodes
Verification
stella doctor run --check check.release.environment.readiness
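Health data may take a moment to refresh after a fix, so it can help to re-run the check a few times before concluding that the fix failed. The loop below is a sketch under one unconfirmed assumption: that `stella doctor run` exits non-zero while the check is not passing. The helper name `wait_for_check` and its default retry values are hypothetical.

```shell
# Sketch: wait_for_check is a hypothetical helper.
# Assumption: `stella doctor run` exits non-zero while the check does not pass.
# Usage: wait_for_check <check-id> [attempts] [delay-seconds]
wait_for_check() {
  check_id="$1"
  attempts="${2:-5}"
  delay="${3:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if stella doctor run --check "$check_id"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep between attempts, but not after the last one.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_check check.release.environment.readiness 5 30` tries up to five times, 30 seconds apart.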
Related Checks
- check.release.active -- unreachable environments cause active releases to get stuck
- check.release.rollback.readiness -- environment health affects rollback capability
- check.release.promotion.gates -- environments must be reachable for gate checks to pass