Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
master
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions

View File

@@ -0,0 +1,65 @@
---
checkId: check.timestamp.tsa.response-time
plugin: stellaops.doctor.timestamping
severity: warn
tags: [timestamping, tsa, latency, performance]
---
# TSA Response Time
## What It Checks
Measures TSA endpoint response times against configurable thresholds. The check:
- Probes each configured TSA endpoint and measures round-trip latency.
- Compares latency against warning threshold (default 5000ms) and critical threshold (default 30000ms).
- Fails if any endpoint exceeds the critical latency threshold.
- Warns if any endpoint exceeds the warning threshold.
- Passes if all endpoints respond within acceptable latency.
- Reports degraded if no endpoints are configured.
## Why It Matters
High TSA latency slows down the evidence generation pipeline. Every release artifact that needs a timestamp will be delayed by slow TSA responses. In high-throughput environments, TSA latency can become a bottleneck that blocks the entire release pipeline.
## Common Causes
- TSA server under heavy load
- Network latency to remote TSA endpoints
- Firewall or proxy adding latency
- TSA provider experiencing service degradation
## How to Fix
### Docker Compose
Consider adding a geographically closer TSA endpoint or a local TSA:
```yaml
environment:
Timestamping__WarnLatencyMs: "5000"
Timestamping__CriticalLatencyMs: "30000"
```
### Bare Metal / systemd
```bash
# Test TSA latency manually
time curl -s -o /dev/null https://freetsa.org/tsr
# Add a faster TSA endpoint
stella tsa add --name "LocalTSA" --url "https://tsa.internal.example.com/tsr"
```
### Kubernetes / Helm
```yaml
timestamping:
warnLatencyMs: 5000
criticalLatencyMs: 30000
```
Consider deploying a local TSA proxy or cache to reduce latency.
## Verification
```
stella doctor run --check check.timestamp.tsa.response-time
```
## Related Checks
- `check.timestamp.tsa.reachable` — verifies TSA endpoints are reachable
- `check.timestamp.tsa.valid-response` — verifies valid RFC-3161 responses
- `check.timestamp.tsa.failover-ready` — confirms failover readiness