checkId: check.compliance.evidence-rate
plugin: stellaops.doctor.compliance
severity: fail
tags: compliance, evidence, attestation

Evidence Generation Rate

What It Checks

Monitors evidence generation success rate by querying the Evidence Locker at /api/v1/evidence/metrics. The check computes the success rate as (totalGenerated - failed) / totalGenerated over the last 24 hours and compares it against two thresholds:

| Condition | Result |
|---|---|
| Evidence Locker unreachable | Warn |
| Success rate < 95% | Fail |
| Success rate >= 95% and < 99% | Warn |
| Success rate >= 99% | Pass |
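
The threshold mapping above can be sketched as a small function. This is an illustrative sketch only, not the check's actual implementation: the function name, the constants, and the use of `None` to represent an unreachable Evidence Locker are assumptions made for the example.

```python
from typing import Optional

# Thresholds in percent, taken from the table above.
FAIL_BELOW = 95.0  # below this the check fails
WARN_BELOW = 99.0  # at or above FAIL_BELOW but below this, the check warns


def classify(success_rate: Optional[float]) -> str:
    """Map a success rate (percent) to a check result.

    None stands in for "Evidence Locker unreachable" (an assumption of this sketch).
    """
    if success_rate is None:
        return "warn"
    if success_rate < FAIL_BELOW:
        return "fail"
    if success_rate < WARN_BELOW:
        return "warn"
    return "pass"
```

Note that exactly 95% lands in the Warn band and exactly 99% passes, since the Fail condition is a strict "less than" and the Pass condition is "greater than or equal".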

Evidence collected: success_rate, total_generated_24h, failed_24h, pending_24h, avg_generation_time_ms.

The check only runs when EvidenceLocker:Url or Services:EvidenceLocker:Url is configured. It uses a 10-second HTTP timeout. If no evidence has been generated (totalGenerated == 0), the success rate defaults to 100%.
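
The gating, timeout, and zero-evidence default described above might look roughly like the following. This is a hedged sketch, not the plugin's code: the helper names, the flat dictionary used for configuration, and the precedence between the two configuration keys are all assumptions. The rate is expressed in percent so it can be compared directly with the 95% and 99% thresholds.

```python
import json
import urllib.request
from typing import Mapping, Optional

# Configuration keys named in the article; checking them in this order is an assumption.
CONFIG_KEYS = ("EvidenceLocker:Url", "Services:EvidenceLocker:Url")
HTTP_TIMEOUT_SECONDS = 10


def resolve_url(config: Mapping[str, str]) -> Optional[str]:
    """Return the first configured Evidence Locker URL, or None if the check should be skipped."""
    for key in CONFIG_KEYS:
        value = config.get(key)
        if value:
            return value
    return None


def success_rate(total_generated: int, failed: int) -> float:
    """Success rate in percent; defaults to 100 when no evidence was generated."""
    if total_generated == 0:
        return 100.0
    return 100.0 * (total_generated - failed) / total_generated


def fetch_metrics(base_url: str) -> dict:
    """GET the metrics endpoint with a 10-second timeout.

    Connection failures and timeouts surface as OSError subclasses,
    which a caller would translate into a Warn result.
    """
    url = base_url.rstrip("/") + "/api/v1/evidence/metrics"
    with urllib.request.urlopen(url, timeout=HTTP_TIMEOUT_SECONDS) as response:
        return json.load(response)
```

For example, 1000 records with 10 failures gives a rate of 99.0, which passes, while 200 records with 10 failures gives 95.0, which warns.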

Why It Matters

Evidence generation is a critical path in the release pipeline. Every release decision, scan result, and policy evaluation produces evidence that feeds compliance audits and attestation chains. A dropping success rate means evidence records are being lost, which creates gaps in the audit trail. Below 95%, the system is losing more than 1 in 20 evidence records, making compliance reporting unreliable and potentially invalidating release approvals that lack supporting evidence.

Common Causes

  • Evidence generation service failures (internal errors, OOM)
  • Database connectivity issues preventing evidence persistence
  • Signing key unavailable, blocking signed evidence creation
  • Storage quota exceeded on the evidence backend
  • Intermittent failures due to high load or resource contention

How to Fix

Docker Compose

# Check evidence locker logs for errors
docker compose logs evidence-locker --since 1h | grep -i error

# Verify signing keys
docker compose exec evidence-locker stella evidence keys status

# Check database connectivity
docker compose exec evidence-locker stella evidence db check

# Check storage capacity
docker compose exec evidence-locker df -h /data/evidence

# If storage is full, preview a cleanup of evidence older than 90 days (dry run), or expand the volume
docker compose exec evidence-locker stella evidence cleanup --older-than 90d --dry-run

Bare Metal / systemd

# Check service logs
journalctl -u stellaops-evidence-locker --since "1 hour ago" | grep -i error

# Verify signing keys
stella evidence keys status

# Check database connectivity
stella evidence db check

# Check storage usage
df -h /var/lib/stellaops/evidence

# Restart the service once the underlying issue is resolved
sudo systemctl restart stellaops-evidence-locker

Kubernetes / Helm

# Check evidence locker pod logs
kubectl logs deploy/stellaops-evidence-locker --since=1h | grep -i error

# Verify signing keys
kubectl exec deploy/stellaops-evidence-locker -- stella evidence keys status

# Check persistent volume usage
kubectl exec deploy/stellaops-evidence-locker -- df -h /data/evidence

# Check for OOMKilled pods
kubectl get events --field-selector reason=OOMKilled -n stellaops

Verification

stella doctor run --check check.compliance.evidence-rate

Related Checks

  • check.compliance.attestation-signing: signing key health affects evidence generation
  • check.compliance.evidence-integrity: integrity of generated evidence
  • check.compliance.provenance-completeness: provenance depends on evidence generation
  • check.compliance.audit-readiness: overall audit readiness depends on evidence availability