--- checkId: check.agent.certificate.expiry plugin: stellaops.doctor.agent severity: fail tags: [agent, certificate, security, quick] --- # Agent Certificate Expiry ## What It Checks Inspects the `CertificateExpiresAt` field on every non-revoked, non-inactive agent and classifies each into one of four buckets: 1. **Expired** -- `CertificateExpiresAt` is in the past. Result: **Fail**. 2. **Critical** -- certificate expires within **1 day** (24 hours). Result: **Fail**. 3. **Warning** -- certificate expires within **7 days**. Result: **Warn**. 4. **Healthy** -- certificate has more than 7 days remaining. Result: **Pass**. The check short-circuits to the most severe bucket found. Evidence includes per-agent names with time-since-expiry or time-until-expiry, plus counts of `TotalActive`, `Expired`, `Critical`, and `Warning` agents. Agents whose `CertificateExpiresAt` is null or default are silently skipped (certificate info not available). If no active agents exist the check is skipped entirely. ## Why It Matters Agent mTLS certificates authenticate the agent to the orchestrator. An expired certificate causes the agent to fail heartbeats, reject task assignments, and drop out of the fleet. In production this means deployments and scans silently stop being dispatched to that agent, potentially leaving environments unserviced. ## Common Causes - Certificate auto-renewal is disabled on the agent - Agent was offline when renewal was due (missed the renewal window) - Certificate authority is unreachable from the agent host - Agent bootstrap was incomplete (certificate provisioned but auto-renewal not configured) - Certificate renewal threshold not yet reached (warning-level) - Certificate authority rate limiting prevented renewal (critical-level) ## How to Fix ### Docker Compose ```bash # Check certificate expiry for agent containers docker compose -f devops/compose/docker-compose.stella-ops.yml exec agent \ stella agent health --show-cert # Force certificate renewal docker compose -f devops/compose/docker-compose.stella-ops.yml exec agent \ stella agent renew-cert --force # Verify auto-renewal configuration docker compose -f devops/compose/docker-compose.stella-ops.yml exec agent \ stella agent config show | grep auto_renew ``` ### Bare Metal / systemd ```bash # Force certificate renewal on an affected agent stella agent renew-cert --agent-id --force # If agent is unreachable, re-bootstrap stella agent bootstrap --name --env # Verify auto-renewal is enabled stella agent config --agent-id | grep auto_renew # Check agent logs for renewal failures stella agent logs --agent-id --level warn ``` ### Kubernetes / Helm ```bash # Check cert expiry across agent pods kubectl exec -it deploy/stellaops-agent -n stellaops -- \ stella agent health --show-cert # Force renewal via pod exec kubectl exec -it deploy/stellaops-agent -n stellaops -- \ stella agent renew-cert --force # If using cert-manager, check Certificate resource kubectl get certificate -n stellaops kubectl describe certificate stellaops-agent-tls -n stellaops ``` ## Verification ``` stella doctor run --check check.agent.certificate.expiry ``` ## Related Checks - `check.agent.certificate.validity` -- verifies certificate chain of trust (not just expiry) - `check.agent.heartbeat.freshness` -- expired certs cause heartbeat failures - `check.agent.stale` -- agents with expired certs often show as stale