| checkId | plugin | severity | tags |
|---|---|---|---|
| check.docker.storage | stellaops.doctor.docker | warn | |
Docker Storage
What It Checks
Validates Docker storage driver and disk space usage. The check connects to the Docker daemon and retrieves system information, then inspects:
| Condition | Result |
|---|---|
| Storage driver is not overlay2, btrfs, or zfs | warn (non-recommended driver) |
| Free disk space on Docker root partition < 10 GB (configurable via Docker:MinFreeSpaceGb) | warn |
| Disk usage > 85% (configurable via Docker:MaxStorageUsagePercent) | warn |
The check reads the Docker root directory (typically /var/lib/docker) and queries drive info for that partition. On platforms where disk info is unavailable, the check still validates the storage driver.
Evidence collected includes: storage driver, Docker root directory, total space, free space, usage percentage, and whether the driver is recommended.
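The conditions in the table can be approximated from the command line. The following is a minimal sketch, not the plugin's implementation: `evaluate_storage` is a hypothetical helper, the `pass` result for the no-warning case is an assumption, and all values are treated as whole numbers. The live portion relies on `docker info --format` and POSIX `df -Pk`, and only runs when a local Docker daemon is reachable.

```shell
#!/bin/sh
# Hypothetical helper mirroring the documented conditions (illustration only).
# Usage: evaluate_storage DRIVER FREE_GB USAGE_PERCENT [MIN_FREE_GB] [MAX_USAGE_PERCENT]
evaluate_storage() {
  es_driver=$1
  es_free_gb=$2
  es_usage_pct=$3
  es_min_free_gb=${4:-10}     # default taken from Docker:MinFreeSpaceGb
  es_max_usage_pct=${5:-85}   # default taken from Docker:MaxStorageUsagePercent
  es_result=pass              # assumption: no warning means "pass"

  case "$es_driver" in
    overlay2|btrfs|zfs) ;;
    *) es_result=warn; echo "non-recommended driver: $es_driver" >&2 ;;
  esac
  if [ "$es_free_gb" -lt "$es_min_free_gb" ]; then
    es_result=warn; echo "free space below ${es_min_free_gb} GB" >&2
  fi
  if [ "$es_usage_pct" -gt "$es_max_usage_pct" ]; then
    es_result=warn; echo "disk usage above ${es_max_usage_pct}%" >&2
  fi
  echo "$es_result"
}

# Live evaluation, skipped when Docker or its root directory is not available locally.
if command -v docker >/dev/null 2>&1 \
   && root=$(docker info --format '{{.DockerRootDir}}' 2>/dev/null) \
   && [ -d "$root" ]; then
  driver=$(docker info --format '{{.Driver}}')
  # df -Pk columns: filesystem, 1K-blocks, used, available, capacity, mount point.
  set -- $(df -Pk "$root" | awk 'NR==2 {gsub("%","",$5); print int($4/1048576), $5}')
  evaluate_storage "$driver" "$1" "$2"
fi
```

Reasons for a warning go to standard error so that standard output carries only the result word, which keeps the helper easy to use in scripts.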
Why It Matters
Docker storage issues are a leading cause of container deployment failures:
- Non-recommended storage drivers (e.g., vfs, devicemapper) have performance and reliability problems. overlay2 is the recommended driver for most workloads.
- Low disk space prevents image pulls, container creation, and volume writes. Docker images and layers consume significant space.
- High disk usage can cause container crashes, database corruption, and evidence write failures.
The Docker root directory often shares a partition with the OS, so storage exhaustion affects the entire host.
Common Causes
- Storage driver is not overlay2, btrfs, or zfs (e.g., using legacy devicemapper or vfs)
- Low disk space on the Docker root partition (less than 10 GB free)
- Disk usage exceeds 85% threshold
- Unused images, containers, and volumes consuming space
- Large build caches not pruned
How to Fix
Docker Compose
Check and clean Docker storage:
# Check disk usage
docker system df
# Detailed disk usage
docker system df -v
# Prune unused data (images, containers, networks, build cache)
docker system prune -a
# Prune volumes too (WARNING: removes data volumes)
docker system prune -a --volumes
# Check storage driver
docker info | grep "Storage Driver"
Configure storage thresholds:
environment:
  Docker__MinFreeSpaceGb: "10"
  Docker__MaxStorageUsagePercent: "85"
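The `Docker__` names in the environment block and the `Docker:` names in the check description appear to refer to the same settings. Assuming the service follows the .NET configuration convention, in which a double underscore in an environment variable name stands for the `:` section separator, the mapping can be sketched as follows. `to_config_key` is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert an environment variable name to a configuration key,
# assuming "__" maps to ":" (the .NET environment-variable convention).
to_config_key() {
  printf '%s\n' "$1" | sed 's/__/:/g'
}

to_config_key Docker__MinFreeSpaceGb   # prints Docker:MinFreeSpaceGb
```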
Bare Metal / systemd
Switch to overlay2 storage driver if not already using it:
# Check current driver
docker info | grep "Storage Driver"
# Configure overlay2 in /etc/docker/daemon.json
{
  "storage-driver": "overlay2"
}
# Restart Docker (WARNING: may require re-pulling images)
sudo systemctl restart docker
Free up disk space:
# Find large Docker directories
du -sh /var/lib/docker/*
# Clean unused resources
docker system prune -a
# Set up automatic weekly cleanup via cron (appends to root's crontab instead of replacing it)
(sudo crontab -l 2>/dev/null; echo "0 2 * * 0 docker system prune -f --filter 'until=168h'") | sudo crontab -
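A scheduled prune can also be made conditional, so that it only runs when the Docker root partition is actually above the usage threshold. This is a sketch under assumptions: the helper names are hypothetical, the 85% default simply mirrors the check's documented threshold, and the usage figure comes from POSIX `df -Pk`.

```shell
#!/bin/sh
# Hypothetical helpers for a threshold-triggered cleanup (illustration only).

# Print the usage percentage (whole number) of the partition holding the given path.
usage_percent() {
  df -Pk "$1" | awk 'NR==2 {gsub("%","",$5); print $5}'
}

# Succeed when USAGE is strictly above MAX (default 85).
needs_cleanup() {
  [ "$1" -gt "${2:-85}" ]
}

# Prune only when the partition holding ROOT is above the threshold.
# Usage: prune_if_needed [ROOT] [MAX_USAGE_PERCENT]
prune_if_needed() {
  pin_root=${1:-/var/lib/docker}
  pin_used=$(usage_percent "$pin_root")
  [ -n "$pin_used" ] || return 1   # df could not report on the path
  if needs_cleanup "$pin_used" "${2:-85}"; then
    docker system prune -f --filter 'until=168h'
  else
    echo "usage ${pin_used}% is within threshold, skipping prune"
  fi
}
```

A cron script could source these functions and call `prune_if_needed "$(docker info --format '{{.DockerRootDir}}')" 85`, which avoids discarding cached layers on hosts that still have plenty of headroom.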
Kubernetes / Helm
Monitor node disk usage:
# Check node disk pressure
kubectl describe node <node> | grep -A 5 "Conditions"
# Check for DiskPressure condition
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[?(@.type=="DiskPressure")]}{.status}{"\n"}{end}{end}'
Configure kubelet garbage collection thresholds:
# In kubelet config
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
evictionHard:
  nodefs.available: "10%"
  imagefs.available: "15%"
Verification
stella doctor run --check check.docker.storage
Related Checks
- check.core.env.diskspace: checks general disk space (not Docker-specific)
- check.docker.daemon: daemon must be running to query storage info