---
checkId: check.docker.daemon
plugin: stellaops.doctor.docker
severity: fail
tags: [docker, daemon, container]
---

# Docker Daemon

## What It Checks

Validates that the Docker daemon is running and responsive. The check connects to the Docker daemon (using `Docker:Host` configuration or the platform default) and performs two operations:

1. **Ping**: Sends a ping request to verify the daemon is alive (with a configurable timeout, default 10 seconds via `Docker:TimeoutSeconds`).
2. **Version**: Retrieves version information to confirm the daemon is fully operational.

Evidence collected on success: host address, Docker version, API version, OS, architecture, and kernel version.

On failure, the check distinguishes between:

- **DockerApiException**: The daemon is running but returned an error (reports status code and response body).
- **Connection failure**: Cannot connect to the daemon at all (Docker not installed, not running, or socket inaccessible).

Default Docker host:

- **Linux**: `unix:///var/run/docker.sock`
- **Windows**: `npipe://./pipe/docker_engine`

## Why It Matters

The Docker daemon is the core runtime for all Stella Ops containers. If the daemon is down:

- No containers can start, stop, or restart.
- Health checks for all containerized services fail.
- Image pulls and builds are impossible.
- Docker Compose operations fail entirely.
- The entire Stella Ops platform is offline in container-based deployments.
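
The ping-then-version sequence described under What It Checks can be approximated by hand with `curl` against the Docker Engine API's `/_ping` and `/version` endpoints. This is a rough sketch, not the plugin's implementation: the function names `classify_probe`, `probe_endpoint`, and `probe_daemon` are hypothetical, and it assumes the Linux default socket and a `curl` build with `--unix-socket` support.

```bash
#!/usr/bin/env bash
# Sketch only: approximates the check's two probes from the command line.
# Function names are illustrative and not part of Stella Ops.

# Map a curl exit code and an HTTP status to the outcomes described above.
classify_probe() {
  local curl_exit="$1" http_status="$2"
  if [ "$curl_exit" -ne 0 ]; then
    echo "connection-failure"   # daemon could not be reached at all
  elif [ "$http_status" -ge 200 ] && [ "$http_status" -lt 300 ]; then
    echo "ok"
  else
    echo "api-error"            # daemon answered, but with an error status
  fi
}

# Probe one Engine API path over a Unix socket and print the classification.
probe_endpoint() {
  local socket="$1" endpoint="$2" timeout="${3:-10}" status rc
  status=$(curl --silent --output /dev/null --write-out '%{http_code}' \
    --max-time "$timeout" --unix-socket "$socket" "http://localhost${endpoint}")
  rc=$?
  classify_probe "$rc" "$status"
}

# Ping first, then version; stop at the first probe that is not "ok".
probe_daemon() {
  local socket="${1:-/var/run/docker.sock}" timeout="${2:-10}" endpoint result
  for endpoint in /_ping /version; do
    result=$(probe_endpoint "$socket" "$endpoint" "$timeout")
    if [ "$result" != "ok" ]; then
      echo "${endpoint}: ${result}"
      return 1
    fi
  done
  echo "ok"
}
```

Running `probe_daemon` with no arguments targets `/var/run/docker.sock` with a 10-second timeout per request; `probe_daemon /var/run/docker.sock 5` shortens the timeout. The sketch covers only the Unix socket case.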

## Common Causes

- Docker daemon is not running or not accessible
- Docker is not installed on the host
- Docker service crashed or was stopped
- Docker daemon returned an error response (resource exhaustion, configuration error)
- Timeout connecting to the daemon (overloaded host, slow disk)

## How to Fix

### Docker Compose

Check and restart the Docker daemon:

```bash
# Check daemon status
sudo systemctl status docker

# Start the daemon
sudo systemctl start docker

# Enable auto-start on boot
sudo systemctl enable docker

# Verify
docker info
```

If Docker is not installed:

```bash
# Install Docker with the convenience script
curl -fsSL https://get.docker.com | sh
# Allow the current user to run docker without sudo (takes effect after re-login)
sudo usermod -aG docker "$USER"
```

### Bare Metal / systemd

```bash
# Check status
sudo systemctl status docker

# View daemon logs
sudo journalctl -u docker --since "10 minutes ago"

# Restart the daemon
sudo systemctl restart docker

# Verify connectivity
docker version
docker info
```

If the daemon crashes repeatedly, check for resource exhaustion:

```bash
# Check disk space (Docker requires space for images/containers)
df -h /var/lib/docker

# Check memory
free -h

# Clean up Docker resources (removes stopped containers, unused networks,
# all unused images, and build cache)
docker system prune -a
```

### Kubernetes / Helm

On Kubernetes nodes, the container runtime (containerd/CRI-O) replaces the Docker daemon.
Check the runtime:

```bash
# Check containerd status
sudo systemctl status containerd

# Check CRI-O status
sudo systemctl status crio

# Restart if needed
sudo systemctl restart containerd
```

For Docker Desktop (development):

```bash
# Restart Docker Desktop
# macOS: killall Docker && open -a Docker
# Windows: Restart-Service docker
```

## Verification

```
stella doctor run --check check.docker.daemon
```

## Related Checks

- `check.docker.socket`: verifies the Docker socket exists and has correct permissions
- `check.docker.apiversion`: verifies the Docker API version is compatible
- `check.docker.storage`: verifies Docker storage is healthy (requires running daemon)
- `check.docker.network`: verifies Docker networks are configured (requires running daemon)