| checkId | plugin | severity | tags |
|---|---|---|---|
| check.docker.daemon | stellaops.doctor.docker | fail | |
Docker Daemon
What It Checks
Validates that the Docker daemon is running and responsive. The check connects to the daemon (using the Docker:Host configuration value or the platform default) and performs two operations:
- Ping: Sends a ping request to verify the daemon is alive (with a timeout configurable via Docker:TimeoutSeconds, default 10 seconds).
- Version: Retrieves version information to confirm the daemon is fully operational.
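The two probes can be approximated from a shell with curl against the Docker Engine API. This is a minimal sketch, not the plugin's implementation: it assumes the Linux default socket, and DOCKER_SOCK, TIMEOUT_SECONDS, and the function names are illustrative, not Stella Ops settings.

```shell
#!/usr/bin/env bash
# Illustrative variables (assumptions, not Stella Ops configuration keys).
DOCKER_SOCK="${DOCKER_SOCK:-/var/run/docker.sock}"
TIMEOUT_SECONDS="${TIMEOUT_SECONDS:-10}"

# Ping: the Engine API answers GET /_ping with the body "OK" when the daemon is alive.
ping_daemon() {
  local body
  body="$(curl --silent --fail --max-time "$TIMEOUT_SECONDS" \
    --unix-socket "$DOCKER_SOCK" http://localhost/_ping)" || return 1
  [ "$body" = "OK" ]
}

# Version: GET /version returns JSON that includes Version, ApiVersion, Os, Arch, and KernelVersion.
daemon_version() {
  curl --silent --fail --max-time "$TIMEOUT_SECONDS" \
    --unix-socket "$DOCKER_SOCK" http://localhost/version
}

# Run both probes in order and print a one-line result.
probe_daemon() {
  if ! ping_daemon; then
    echo "fail: daemon did not answer ping"
    return 1
  fi
  if ! daemon_version >/dev/null; then
    echo "fail: ping succeeded but version request failed"
    return 1
  fi
  echo "pass"
}
```

Calling `probe_daemon` on a host prints `pass` only when both requests succeed, which mirrors the order of operations described above.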
Evidence collected on success: host address, Docker version, API version, OS, architecture, and kernel version.
On failure, the check distinguishes between:
- DockerApiException: The daemon is running but returned an error (reports status code and response body).
- Connection failure: Cannot connect to the daemon at all (Docker not installed, not running, or socket inaccessible).
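One way to reproduce this distinction from a shell is to inspect the HTTP status that curl reports: its `%{http_code}` variable is `000` when no response was received at all. The helper below is a sketch with hypothetical function names; the plugin may classify errors differently.

```shell
#!/usr/bin/env bash
# Assumed Linux default socket; override DOCKER_SOCK for other setups.
DOCKER_SOCK="${DOCKER_SOCK:-/var/run/docker.sock}"

# Map an HTTP status code to one of the two failure classes (or "ok").
classify_http_code() {
  case "$1" in
    000|"") echo "connection-failure" ;;  # no response: not installed, not running, or socket inaccessible
    2??)    echo "ok" ;;
    *)      echo "api-error" ;;           # daemon answered, but with an error status
  esac
}

# Print only the status code of a ping request; the response body is discarded.
ping_status_code() {
  curl --silent --output /dev/null --write-out '%{http_code}' --max-time 10 \
    --unix-socket "$DOCKER_SOCK" http://localhost/_ping
}

classify_daemon() {
  classify_http_code "$(ping_status_code)"
}
```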
Default Docker host:
- Linux: unix:///var/run/docker.sock
- Windows: npipe://./pipe/docker_engine
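The fallback from a configured host to the platform default can be sketched as below. The function names are hypothetical, and treating every non-Windows platform like Linux is an assumption made for this example, since the text above only lists Linux and Windows.

```shell
#!/usr/bin/env bash
# Return the default Docker host for an OS name (defaults to `uname -s`).
default_docker_host() {
  case "${1:-$(uname -s)}" in
    Windows*|MINGW*|MSYS*|CYGWIN*) echo "npipe://./pipe/docker_engine" ;;
    *)                             echo "unix:///var/run/docker.sock" ;;  # assumption: non-Windows behaves like Linux
  esac
}

# Prefer an explicitly configured host ($1); otherwise use the platform default for OS ($2).
resolve_docker_host() {
  local configured="$1" os="$2"
  if [ -n "$configured" ]; then
    echo "$configured"
  else
    default_docker_host "$os"
  fi
}
```

For example, `resolve_docker_host "" Linux` prints the Unix socket address, while passing a non-empty first argument returns that value unchanged.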
Why It Matters
The Docker daemon is the core runtime for all Stella Ops containers. If the daemon is down:
- No containers can start, stop, or restart.
- Health checks for all containerized services fail.
- Image pulls and builds are impossible.
- Docker Compose operations fail entirely.
- The entire Stella Ops platform is offline in container-based deployments.
Common Causes
- Docker daemon is not running or not accessible
- Docker is not installed on the host
- Docker service crashed or was stopped
- Docker daemon returned an error response (resource exhaustion, configuration error)
- Timeout connecting to the daemon (overloaded host, slow disk)
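A quick triage for the first few causes is to check whether the docker binary is present and whether the socket exists and is accessible. The sketch below assumes a Unix socket path; the function names and result strings are illustrative only.

```shell
#!/usr/bin/env bash
# True when a docker CLI is on PATH (a hint, not proof, that Docker is installed).
docker_installed() {
  command -v docker >/dev/null 2>&1
}

# Describe the state of a socket path such as /var/run/docker.sock.
diagnose_socket() {
  local sock="$1"
  if [ ! -e "$sock" ]; then
    echo "missing"            # daemon not running or Docker not installed
  elif [ ! -S "$sock" ]; then
    echo "not-a-socket"       # path exists but is not a Unix socket
  elif [ ! -r "$sock" ] || [ ! -w "$sock" ]; then
    echo "permission-denied"  # current user cannot read and write the socket
  else
    echo "accessible"
  fi
}
```

Running `diagnose_socket /var/run/docker.sock` narrows down whether the problem is a missing daemon or a permissions issue before looking at daemon logs.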
How to Fix
Docker Compose
Check and restart the Docker daemon:
# Check daemon status
sudo systemctl status docker
# Start the daemon
sudo systemctl start docker
# Enable auto-start on boot
sudo systemctl enable docker
# Verify
docker info
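The status, start, and enable steps above can be combined into a small idempotent helper. This is a sketch that assumes a systemd host with a unit named docker; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Start the docker unit only if it is not already active, and enable it at boot.
ensure_docker_running() {
  if systemctl is-active --quiet docker; then
    echo "already running"
    return 0
  fi
  # `enable --now` both enables the unit at boot and starts it immediately.
  if sudo systemctl enable --now docker; then
    echo "started"
  else
    echo "failed to start"
    return 1
  fi
}
```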
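If Docker is not installed:
# Install Docker with the official convenience script
curl -fsSL https://get.docker.com | sh
# Allow the current user to run docker without sudo (takes effect after logging out and back in)
sudo usermod -aG docker "$USER"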
Bare Metal / systemd
# Check status
sudo systemctl status docker
# View daemon logs
sudo journalctl -u docker --since "10 minutes ago"
# Restart the daemon
sudo systemctl restart docker
# Verify connectivity
docker version
docker info
If the daemon crashes repeatedly, check for resource exhaustion:
# Check disk space (Docker requires space for images/containers)
df -h /var/lib/docker
# Check memory
free -h
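# Clean up unused Docker resources (removes stopped containers, unused networks, build cache, and every image not used by a container)
docker system prune -a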
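To turn the disk check into something scriptable, the usage percentage can be parsed from `df -P` and compared against a threshold. This is a sketch: the 90 percent threshold in the usage note and the function names are assumptions for illustration, not values used by the Doctor check.

```shell
#!/usr/bin/env bash
# Print the used-space percentage (without the % sign) for the filesystem holding $1.
disk_used_percent() {
  # -P forces the POSIX single-line format; column 5 is the capacity, e.g. "42%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report whether the directory's filesystem is at or above the threshold percentage.
check_docker_disk() {
  local dir="$1" threshold="$2" used
  if [ ! -d "$dir" ]; then
    echo "missing"
    return 1
  fi
  used="$(disk_used_percent "$dir")"
  if [ "$used" -ge "$threshold" ]; then
    echo "low-space"
  else
    echo "ok"
  fi
}
```

For example, `check_docker_disk /var/lib/docker 90` prints `low-space` when the filesystem is at least 90 percent full.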
Kubernetes / Helm
On Kubernetes nodes, the container runtime (containerd or CRI-O) replaces the Docker daemon. Check the runtime:
# Check containerd status
sudo systemctl status containerd
# Check CRI-O status
sudo systemctl status crio
# Restart if needed
sudo systemctl restart containerd
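When it is not known which runtime a node uses, the two status checks can be folded into one probe. The sketch below assumes the systemd unit names containerd and crio shown above; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Print the first active container runtime unit, or "none" if neither is active.
active_runtime() {
  local unit
  for unit in containerd crio; do
    if systemctl is-active --quiet "$unit"; then
      echo "$unit"
      return 0
    fi
  done
  echo "none"
  return 1
}
```

The printed unit name can then be passed to `sudo systemctl restart` if a restart is needed.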
For Docker Desktop (development):
# Restart Docker Desktop
# macOS: killall Docker && open -a Docker
# Windows: Restart-Service docker
Verification
stella doctor run --check check.docker.daemon
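Right after a restart the daemon may need a few seconds before it answers, so it can help to wait for `docker info` to succeed before re-running the check. The retry helper below is a generic sketch; its name and the attempt and delay values in the usage comment are illustrative.

```shell
#!/usr/bin/env bash
# Retry a command up to $1 times, sleeping $2 seconds between attempts.
wait_for() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: wait up to about 20 seconds for the daemon, then run the check.
# wait_for 10 2 docker info && stella doctor run --check check.docker.daemon
```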
Related Checks
- check.docker.socket: verifies the Docker socket exists and has correct permissions
- check.docker.apiversion: verifies the Docker API version is compatible
- check.docker.storage: verifies Docker storage is healthy (requires running daemon)
- check.docker.network: verifies Docker networks are configured (requires running daemon)