---
checkId: check.environment.connectivity
plugin: stellaops.doctor.environment
severity: warn
tags: [environment, connectivity, agent, network]
---
# Environment Connectivity

## What It Checks
Retrieves the list of environments from the Release Orchestrator (`/api/v1/environments`), then probes each environment agent's `/health` endpoint. For each agent the check measures:

- **Reachability** -- whether the health endpoint returns a success status code
- **Latency** -- warns if the response takes more than 500 ms
- **TLS certificate validity** -- warns if the agent's TLS certificate expires within 30 days
- **Authentication** -- detects 401/403 responses, which indicate credential issues

If any agent is unreachable, the check fails. High latency or an expiring certificate produces a warning.

## Why It Matters
Environment agents are the control surface through which Stella Ops manages deployments, collects telemetry, and enforces policy. An unreachable agent means the platform cannot deploy to, monitor, or roll back services in that environment. TLS certificate expiry causes hard connectivity failures with no graceful degradation. High latency slows deployment pipelines and can cause timeouts in approval workflows.
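
The pass/warn/fail rules listed under What It Checks can be read as a small decision function. The following is a minimal sketch, not the check's actual implementation: `classify_probe` is a hypothetical helper, and both the order of the tests and the choice to report 401/403 as a failure are assumptions. Only the 500 ms and 30-day thresholds come from the text above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Stella Ops): classify one agent probe.
# Arguments: HTTP status code, latency in ms, days until TLS cert expiry.
classify_probe() {
  local status="$1" latency_ms="$2" cert_days="$3"

  # 401/403 point at credential problems (assumed to count as a failure).
  if [ "$status" -eq 401 ] || [ "$status" -eq 403 ]; then
    echo "fail: authentication"
    return
  fi

  # Any other non-2xx status is treated as unreachable. curl reports
  # 000 as the status code when it could not connect at all.
  if [ "$status" -lt 200 ] || [ "$status" -ge 300 ]; then
    echo "fail: unreachable"
    return
  fi

  # Thresholds from the text: more than 500 ms, or expiry within 30 days.
  if [ "$latency_ms" -gt 500 ]; then
    echo "warn: latency"
    return
  fi
  if [ "$cert_days" -le 30 ]; then
    echo "warn: certificate"
    return
  fi

  echo "pass"
}

classify_probe 200 120 90   # prints "pass"
classify_probe 200 750 90   # prints "warn: latency"
classify_probe 403 40 90    # prints "fail: authentication"
```

When probing by hand, `curl -s -o /dev/null -w '%{http_code} %{time_total}'` prints the status code and the total request time in seconds, which can be converted to milliseconds and passed to a function like this one.
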
## Common Causes
- Environment agent service is stopped or crashed
- Firewall rule change blocking the agent port
- Network partition between Stella Ops control plane and target environment
- TLS certificate not renewed before expiry
- Agent authentication credentials rotated without updating Stella Ops configuration
- DNS resolution failure for the agent hostname

## How to Fix

### Docker Compose
```bash
# Check if the environment agent container is running
docker ps --filter "name=environment-agent"

# View agent logs for errors
docker logs stellaops-environment-agent --tail 100

# Restart the agent
docker compose -f docker-compose.stella-ops.yml restart environment-agent

# If TLS cert is expiring, replace the certificate files
# mounted into the agent container and restart
cp /path/to/new/cert.pem devops/compose/certs/agent.pem
cp /path/to/new/key.pem devops/compose/certs/agent-key.pem
docker compose -f docker-compose.stella-ops.yml restart environment-agent
```

### Bare Metal / systemd
```bash
# Check agent service status
sudo systemctl status stellaops-environment-agent

# View logs
sudo journalctl -u stellaops-environment-agent --since "1 hour ago"

# Restart agent
sudo systemctl restart stellaops-environment-agent

# Renew TLS certificate
sudo cp /path/to/new/cert.pem /etc/stellaops/certs/agent.pem
sudo cp /path/to/new/key.pem /etc/stellaops/certs/agent-key.pem
sudo systemctl restart stellaops-environment-agent

# Test network connectivity from control plane
curl -v https://:/health
```

### Kubernetes / Helm
```bash
# Check agent pod status
kubectl get pods -n stellaops -l app=environment-agent

# View agent logs
kubectl logs -n stellaops -l app=environment-agent --tail=100

# Restart agent pods
kubectl rollout restart deployment/environment-agent -n stellaops

# Renew TLS certificate via cert-manager or manual secret update
kubectl create secret tls agent-tls \
  --cert=/path/to/cert.pem \
  --key=/path/to/key.pem \
  -n stellaops --dry-run=client -o yaml | kubectl apply -f -

# Check network policies
kubectl get networkpolicies -n stellaops
```

## Verification
```bash
stella doctor run --check check.environment.connectivity
```

## Related Checks
- `check.environment.deployments` - checks health of services deployed via agents
- `check.environment.network.policy` - verifies network policies that may block agent connectivity
- `check.environment.secrets` - agent credentials may need rotation