Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
master
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions

@@ -0,0 +1,110 @@
---
checkId: check.attestation.keymaterial
plugin: stellaops.doctor.attestor
severity: warn
tags: [attestation, signing, security, expiration]
---
# Signing Key Expiration
## What It Checks
Monitors the expiration timeline of attestation signing keys. The check reads the signing mode from configuration and, for modes that use expiring keys (file, kms, certificate), retrieves key information and classifies each key:
1. **Expired** -- key has already expired (`daysUntilExpiry < 0`). Result: **Fail** with a list of expired key IDs.
2. **Critical** -- key expires within **7 days**. Result: **Fail** with key IDs and days remaining.
3. **Warning** -- key expires within **30 days**. Result: **Warn** with key IDs and days remaining.
4. **Healthy** -- all keys have more than 30 days until expiration. Result: **Pass** with key count and per-key expiry dates (up to 5 keys shown).
For **keyless** signing mode, the check returns **Skip** because keyless signing does not use expiring key material.
If no signing keys are found, the check returns **Skip** with a note that no file-based or certificate-based keys were found.
Evidence collected: `ExpiredKeys` (list of IDs), `CriticalKeys` (ID + days), `WarningKeys` (ID + days), `TotalKeys`, `HealthyKeys`, per-key entries showing `Key:<id>` with expiry date and days remaining.
Thresholds:
- Warning: 30 days (`WarningDays`)
- Critical: 7 days (`CriticalDays`)
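
The classification rules above can be sketched as follows. This is a minimal illustration, not the plugin's actual implementation: `KeyInfo`, `classify_keys`, and the result dictionary shape are hypothetical names, and the sketch assumes that the thresholds are inclusive, that the worst bucket determines the overall result, and that `HealthyKeys` is a count. The per-key `Key:<id>` evidence entries are omitted for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

WARNING_DAYS = 30   # WarningDays threshold from this article
CRITICAL_DAYS = 7   # CriticalDays threshold from this article


@dataclass
class KeyInfo:
    """Hypothetical stand-in for whatever key metadata the check retrieves."""
    key_id: str
    expires_on: date


def classify_keys(mode: str, keys: List[KeyInfo], today: date) -> Dict:
    """Bucket keys by days until expiry and derive an overall status."""
    # Keyless signing has no expiring key material, so there is nothing to check.
    if mode == "keyless":
        return {"status": "Skip", "evidence": {}}
    # No keys found: skip rather than fail.
    if not keys:
        return {"status": "Skip", "evidence": {}}

    expired, critical, warning, healthy = [], [], [], []
    for key in keys:
        days = (key.expires_on - today).days
        if days < 0:
            expired.append(key.key_id)
        elif days <= CRITICAL_DAYS:      # assumption: boundary is inclusive
            critical.append((key.key_id, days))
        elif days <= WARNING_DAYS:       # assumption: boundary is inclusive
            warning.append((key.key_id, days))
        else:
            healthy.append(key.key_id)

    evidence = {
        "ExpiredKeys": expired,
        "CriticalKeys": critical,
        "WarningKeys": warning,
        "TotalKeys": len(keys),
        "HealthyKeys": len(healthy),     # assumption: reported as a count
    }

    # Assumption: the most severe non-empty bucket decides the result.
    if expired or critical:
        status = "Fail"
    elif warning:
        status = "Warn"
    else:
        status = "Pass"
    return {"status": status, "evidence": evidence}


today = date(2026, 3, 27)
keys = [
    KeyInfo("signing-a", date(2026, 4, 1)),    # 5 days left
    KeyInfo("signing-b", date(2026, 12, 31)),  # well beyond 30 days
]
result = classify_keys("file", keys, today)
print(result["status"])                    # Fail
print(result["evidence"]["CriticalKeys"])  # [('signing-a', 5)]
```

A single key inside the 7-day window is enough to fail the whole check in this sketch, which matches the intent of treating imminent expiry as an outage risk rather than a warning.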
## Why It Matters
Expired signing keys make it impossible to create new attestations, blocking the release pipeline at policy gates that require signed artifacts. Keys approaching expiration should be rotated proactively to ensure overlap between old and new keys, allowing verifiers to accept signatures from both during the transition period. Without monitoring, key expiration causes a sudden, hard outage.
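
To make the overlap requirement concrete: if verifiers should be able to accept both the old and the new key for a given number of days, the new key has to be in place at least that many days before the old key expires. The arithmetic is sketched below. `latest_rotation_date` is a hypothetical helper for planning purposes only and says nothing about how `stella keys rotate --overlap-days` works internally.

```python
from datetime import date, timedelta


def latest_rotation_date(expires_on: date, overlap_days: int) -> date:
    """Latest day to introduce the new key while keeping the full overlap."""
    if overlap_days < 0:
        raise ValueError("overlap_days must be non-negative")
    return expires_on - timedelta(days=overlap_days)


# A key expiring on 2026-04-30 with a 14-day overlap must be rotated
# no later than 2026-04-16.
print(latest_rotation_date(date(2026, 4, 30), 14))  # 2026-04-16
```

With the 30-day warning threshold, a 14-day overlap leaves roughly 16 days between the first warning and the last day on which rotation still preserves the full overlap.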
## Common Causes
- Keys were not rotated before expiration (manual process forgotten)
- Scheduled rotation job failed (permissions, connectivity)
- Key expiration not monitored (no alerting configured)
- Normal lifecycle -- keys approaching the warning threshold (plan rotation)
- Rotation reminders not configured
## How to Fix
### Docker Compose
```bash
# Check key status
docker compose -f devops/compose/docker-compose.stella-ops.yml exec attestor \
  stella keys status
# Rotate expired or critical keys
docker compose -f devops/compose/docker-compose.stella-ops.yml exec attestor \
  stella keys rotate <key-id>
# Set up expiration monitoring
docker compose -f devops/compose/docker-compose.stella-ops.yml exec attestor \
  stella notify channels add --type email --event key.expiring --threshold-days 30
```
### Bare Metal / systemd
```bash
# Rotate expired keys immediately
stella keys rotate <expired-key-id>
# Set up key expiration monitoring
stella notify channels add --type email --event key.expiring --threshold-days 30
# Rotate critical keys immediately, keeping an overlap period
stella keys rotate <critical-key-id> --overlap-days 7
# Plan rotation for warning-level keys (dry run first)
stella keys rotate <warning-key-id> --dry-run
# Execute rotation with overlap period
stella keys rotate <warning-key-id> --overlap-days 14
# Review all key status
stella keys status
```
### Kubernetes / Helm
```bash
# Check key status
kubectl exec -it deploy/stellaops-attestor -n stellaops -- \
  stella keys status
# Rotate keys
kubectl exec -it deploy/stellaops-attestor -n stellaops -- \
  stella keys rotate <key-id> --overlap-days 14
# Configure automatic key rotation in Helm values
# attestor:
#   signing:
#     autoRotate: true
#     rotationBeforeDays: 30
#     overlapDays: 14
helm upgrade stellaops stellaops/stellaops -f values.yaml
```
## Verification
```bash
stella doctor run --check check.attestation.keymaterial
```
## Related Checks
- `check.attestation.cosign.keymaterial` -- verifies key material availability (existence, not expiration)
- `check.auth.signing-key` -- auth signing key health (separate from attestation keys)
- `check.attestation.rekor.verification.job` -- expired keys cause verification failures