master c58a236d70 Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-27 12:28:00 +02:00

Prometheus Scrape

What It Checks

Verifies that the application metrics endpoint is accessible for Prometheus scraping. The check:

- Reads Metrics:Path (default /metrics), Metrics:Port (default 8080), and Metrics:Host (default localhost).
- Sends a GET request to http://{host}:{port}{path} with a 5-second timeout.
- Counts the number of Prometheus-formatted metric lines in the response.
- Passes if the endpoint returns a successful response with metrics.
- Warns on non-success status codes, timeouts, or connection failures.

The check only runs when Metrics:Enabled is set to true.
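
The steps above can be approximated from a shell, which is useful for reproducing the check's result by hand. This is a minimal sketch, not the Doctor plugin's implementation. The function names and the METRICS_* environment variables are hypothetical. Two rules are assumptions about the check's behavior: a "metric line" is any non-blank line that does not start with #, and a response with zero metric lines is a warning.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count sample lines in Prometheus text exposition
# read from stdin. Assumption: blank lines and '#' lines (HELP/TYPE/comments)
# are not counted as metrics.
count_metric_lines() {
  # grep -c exits 1 when the count is 0, so "|| true" keeps the helper's status at 0.
  grep -cvE '^[[:space:]]*(#|$)' || true
}

# Hypothetical helper: mimic the check against a live endpoint.
# Defaults mirror the documented ones; the env var names are illustrative only.
probe_metrics() {
  local host="${METRICS_HOST:-localhost}"
  local port="${METRICS_PORT:-8080}"
  local path="${METRICS_PATH:-/metrics}"
  local body count

  # --fail makes curl exit non-zero on HTTP status 400 or above;
  # --max-time caps the whole request at 5 seconds.
  if ! body="$(curl --silent --fail --max-time 5 "http://${host}:${port}${path}")"; then
    echo "WARN: endpoint unreachable, timed out, or returned an error status"
    return 1
  fi

  count="$(printf '%s\n' "$body" | count_metric_lines)"
  if [ "$count" -gt 0 ]; then
    echo "PASS: ${count} metric lines"
  else
    # Assumption: a successful response with no metrics is treated as a warning.
    echo "WARN: endpoint responded but exposed no metrics"
    return 1
  fi
}
```

Running `probe_metrics` with no variables set targets http://localhost:8080/metrics, matching the documented defaults. The sketch does not model the Metrics:Enabled gate.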

Why It Matters

Prometheus metrics provide real-time visibility into service health, request latencies, error rates, and resource utilization. Without a scrapeable metrics endpoint, alerting rules cannot fire, dashboards go blank, and capacity planning has no data.

Common Causes

- Metrics endpoint not enabled in configuration
- Wrong port configured
- Service not running on the expected port
- Authentication required but not configured for Prometheus
- Firewall blocking the metrics port
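
Each cause above tends to surface as a different symptom when the endpoint is probed with curl. The sketch below maps a curl exit code and an HTTP status to the most likely cause from the list. The function name is hypothetical and the mapping is a heuristic, not the check's own diagnostic logic. The exit codes used (6, 7, 28) are curl's documented codes for a host that could not be resolved, a failed connection, and a timeout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: suggest a likely cause from a curl exit code ($1)
# and the HTTP status code ($2, "000" when no response was received).
likely_cause() {
  local curl_exit="$1" http_status="$2"

  # Transport-level failures: no HTTP response was received at all.
  case "$curl_exit" in
    6)  echo "host name did not resolve: check Metrics:Host"; return ;;
    7)  echo "connection failed: service not running, or wrong port configured"; return ;;
    28) echo "timed out: a firewall may be dropping traffic to the metrics port"; return ;;
  esac

  # An HTTP response arrived; interpret its status code.
  case "$http_status" in
    2??)     echo "endpoint reachable" ;;
    401|403) echo "authentication required but not configured for Prometheus" ;;
    404)     echo "metrics endpoint not enabled, or wrong path" ;;
    *)       echo "unexpected response: inspect service logs" ;;
  esac
}

# Usage against a live endpoint (URL is the documented default):
#   status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 http://localhost:8080/metrics)"
#   likely_cause "$?" "$status"
```

A firewall that rejects rather than drops packets produces exit code 7 instead of 28, so treat the output as a starting point for investigation.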

How to Fix

Docker Compose

environment:
  Metrics__Enabled: "true"
  Metrics__Path: "/metrics"
  Metrics__Port: "8080"

# Test metrics endpoint
docker exec <platform-container> curl -s http://localhost:8080/metrics | head -5

Bare Metal / systemd

Edit appsettings.json:

{
  "Metrics": {
    "Enabled": true,
    "Path": "/metrics",
    "Port": 8080
  }
}

# Verify metrics are exposed
curl -s http://localhost:8080/metrics | head -5

# Check port binding (on systems without netstat, use: ss -ltn | grep 8080)
netstat -an | grep 8080

Kubernetes / Helm

metrics:
  enabled: true
  port: 8080
  path: "/metrics"
  serviceMonitor:
    enabled: true

Add Prometheus annotations to the pod:

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8080"
  prometheus.io/path: "/metrics"

Verification

stella doctor run --check check.metrics.prometheus.scrape

Related Checks

check.telemetry.otlp.endpoint: verifies that the OTLP collector endpoint is reachable
check.logs.directory.writable: verifies that the log directory is writable
