| checkId | plugin | severity | tags |
|---|---|---|---|
| check.metrics.prometheus.scrape | stellaops.doctor.observability | warn | |
Prometheus Scrape
What It Checks
Verifies that the application metrics endpoint is accessible for Prometheus scraping. The check:
- Reads Metrics:Path (default /metrics), Metrics:Port (default 8080), and Metrics:Host (default localhost).
- Sends a GET request to http://{host}:{port}{path} with a 5-second timeout.
- Counts the number of Prometheus-formatted metric lines in the response.
- Passes if the endpoint returns a successful response with metrics.
- Warns on non-success status codes, timeouts, or connection failures.
The check only runs when Metrics:Enabled is set to true.
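The steps above can be sketched in Python roughly as follows. This is an illustrative sketch, not the Doctor plugin's actual implementation: the function names (`build_metrics_url`, `count_metric_lines`, `run_check`), the flat config dictionary, and the `skip`/`pass`/`warn` result strings are all hypothetical. Two further assumptions are that every non-blank, non-comment line counts as a metric line, and that a successful response containing no metrics is reported as a warning.

```python
import urllib.error
import urllib.request

# Defaults taken from the article; the flat dictionary layout is hypothetical.
DEFAULTS = {
    "Metrics:Path": "/metrics",
    "Metrics:Port": "8080",
    "Metrics:Host": "localhost",
}


def build_metrics_url(config):
    """Build http://{host}:{port}{path}, falling back to the documented defaults."""
    merged = {**DEFAULTS, **config}
    return f"http://{merged['Metrics:Host']}:{merged['Metrics:Port']}{merged['Metrics:Path']}"


def count_metric_lines(body):
    """Count non-blank lines that are not '#' comments (HELP/TYPE lines)."""
    return sum(
        1
        for line in body.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )


def run_check(config, timeout=5.0):
    """Return (status, detail), where status is 'skip', 'pass', or 'warn'."""
    if str(config.get("Metrics:Enabled", "false")).lower() != "true":
        return ("skip", "Metrics:Enabled is not true")
    url = build_metrics_url(config)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # Non-success status codes surface as HTTPError.
        return ("warn", f"HTTP {exc.code} from {url}")
    except (urllib.error.URLError, OSError) as exc:
        # Timeouts and connection failures end up here.
        return ("warn", f"cannot reach {url}: {exc}")
    count = count_metric_lines(body)
    if count == 0:
        # Assumption: an empty but successful response is a warning.
        return ("warn", f"no metrics found at {url}")
    return ("pass", f"{count} metric lines at {url}")
```

Calling `run_check({"Metrics:Enabled": "true"})` probes `http://localhost:8080/metrics` with the default 5-second timeout.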
Why It Matters
Prometheus metrics provide real-time visibility into service health, request latencies, error rates, and resource utilization. Without a scrapeable metrics endpoint, alerting rules cannot fire, dashboards go blank, and capacity planning has no data.
Common Causes
- Metrics endpoint not enabled in configuration
- Wrong port configured
- Service not running on the expected port
- Authentication required but not configured for Prometheus
- Firewall blocking the metrics port
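As a rough triage aid, the symptom of a failed probe can be mapped back to the causes listed above. The mapping below is a heuristic sketch and not part of the Doctor check: the function name `likely_causes` and the outcome labels `refused`, `timeout`, and `http` are hypothetical, and real failures may not separate this cleanly (for example, a firewall may reject a connection instead of silently dropping it).

```python
def likely_causes(outcome, status_code=None):
    """Map a failed probe to candidate causes from the list above.

    outcome is one of the hypothetical labels 'refused', 'timeout', or 'http';
    status_code is only consulted when outcome is 'http'.
    """
    if outcome == "refused":
        # Nothing is listening at the address that was probed.
        return ["Service not running on the expected port", "Wrong port configured"]
    if outcome == "timeout":
        # Packets are probably being dropped before they reach the service.
        return ["Firewall blocking the metrics port"]
    if outcome == "http" and status_code in (401, 403):
        return ["Authentication required but not configured for Prometheus"]
    if outcome == "http" and status_code == 404:
        # The service answers, but nothing is mapped at the metrics path.
        return ["Metrics endpoint not enabled in configuration"]
    # No heuristic applies.
    return []
```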
How to Fix
Docker Compose
environment:
  Metrics__Enabled: "true"
  Metrics__Path: "/metrics"
  Metrics__Port: "8080"
# Test metrics endpoint
docker exec <platform-container> curl -s http://localhost:8080/metrics | head -5
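The `Metrics__*` environment variables correspond to the `Metrics:*` keys the check reads, assuming the service follows the .NET configuration convention in which a double underscore in an environment variable name stands in for the `:` separator. The article does not state this explicitly, so treat the mapping below as an illustration of that convention; the helper name `env_to_config_key` is hypothetical.

```python
def env_to_config_key(name):
    """Translate an environment variable name into a colon-separated config key.

    Assumes the double-underscore convention used by .NET configuration.
    """
    return name.replace("__", ":")


def env_to_config(environment):
    """Apply the same translation to every key of an environment mapping."""
    return {env_to_config_key(key): value for key, value in environment.items()}
```

For example, `env_to_config({"Metrics__Port": "8080"})` yields `{"Metrics:Port": "8080"}`.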
Bare Metal / systemd
Edit appsettings.json:
{
  "Metrics": {
    "Enabled": true,
    "Path": "/metrics",
    "Port": 8080
  }
}
# Verify metrics are exposed
curl -s http://localhost:8080/metrics | head -5
# Check port binding
netstat -an | grep 8080
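Where `netstat` is unavailable, a similar question can be asked from the client side by attempting a TCP connection. The sketch below is a minimal example using only Python's standard library; the function name `port_accepts_connections` is hypothetical. It tests reachability, not the binding itself, so a `False` result does not distinguish a stopped service from a blocked port.

```python
import socket


def port_accepts_connections(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_accepts_connections("localhost", 8080)` returns `True` only if something is accepting connections on the default metrics port.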
Kubernetes / Helm
metrics:
  enabled: true
  port: 8080
  path: "/metrics"
  serviceMonitor:
    enabled: true
Add Prometheus annotations to the pod:
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8080"
  prometheus.io/path: "/metrics"
Verification
stella doctor run --check check.metrics.prometheus.scrape
Related Checks
- check.telemetry.otlp.endpoint: verifies OTLP collector endpoint reachability
- check.logs.directory.writable: verifies log directory is writable