Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions

@@ -0,0 +1,181 @@
# Sprint 20260326_001 — Doctor Health Checks Documentation
## Topic & Scope
- Document every Doctor health check (99 checks across 16 plugins) with precise, actionable remediation.
- Each check must have: what it tests, why it matters, exact fix steps, Docker compose specifics, and verification.
- Fix false-positive checks that fail on default Docker compose installations.
- Working directory: `docs/modules/doctor/`, `src/Doctor/__Plugins/`
- Expected evidence: docs, improved check messages, tests.
## Dependencies & Concurrency
- No upstream sprint dependencies. Work can be parallelized by plugin.
- Builds on the 4 check code fixes already applied (RequiredSettings, EnvironmentVariables, SecretsConfiguration, DockerSocket).
## Documentation Prerequisites
- `docs/modules/doctor/architecture.md` — existing Doctor architecture overview
- `docs/modules/doctor/registry-checks.md` — existing check registry reference
- `devops/compose/docker-compose.stella-ops.yml` — the reference deployment
## Delivery Tracker
### DOC-001 - Create check reference index
Status: TODO
Dependency: none
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/README.md` with a master table of all 99 checks
- Columns: Check ID, Plugin, Category, Severity, Summary, Docker Compose Status (Pass/Warn/Fail/N/A)
- Group by plugin (Core, Security, Docker, Agent, Attestor, Auth, etc.)
- Include a quick-reference severity legend
Completion criteria:
- [ ] All 99 checks listed with correct metadata
- [ ] Docker Compose Status column filled from actual test run
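
A minimal sketch of how the master table could be rendered from check metadata, assuming the column set and status values listed in the task description. The `CheckRow` type, the `render_index` helper, and any sample values passed to them are hypothetical and are not taken from the Doctor code.

```python
from dataclasses import dataclass

# Column headers as listed in the DOC-001 task description.
COLUMNS = ["Check ID", "Plugin", "Category", "Severity", "Summary", "Docker Compose Status"]
VALID_STATUSES = {"Pass", "Warn", "Fail", "N/A"}


@dataclass(frozen=True)
class CheckRow:
    """One row of the index table (hypothetical shape)."""
    check_id: str
    plugin: str
    category: str
    severity: str
    summary: str
    compose_status: str


def render_index(rows: list[CheckRow]) -> str:
    """Render a Markdown table, sorted so rows of one plugin sit together."""
    for row in rows:
        if row.compose_status not in VALID_STATUSES:
            raise ValueError(f"{row.check_id}: invalid status {row.compose_status!r}")
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "| " + " | ".join("---" for _ in COLUMNS) + " |",
    ]
    for row in sorted(rows, key=lambda r: (r.plugin, r.check_id)):
        cells = [row.check_id, row.plugin, row.category,
                 row.severity, row.summary, row.compose_status]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

Rejecting unknown status values up front keeps the Docker Compose Status column limited to the four values the task allows.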
### DOC-002 - Core Plugin checks documentation (9 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/core.md`
- Document each check:
- **check.core.config.required**: Which settings are checked, key variants (`:` in configuration keys vs `__` in environment variable names), compose env var names, and how to add missing settings
- **check.core.env.variables**: Which env vars are checked, why `ASPNETCORE_ENVIRONMENT` may not be set in compose, and when that is acceptable
- **check.core.health.endpoint**: Health endpoint configuration
- **check.core.memory**: Memory threshold configuration
- **check.core.startup.time**: Expected startup time ranges
- Each remaining core check
- For each check: Symptom → Root Cause → Fix → Verify
Completion criteria:
- [ ] Each check has: description, what it tests, severity, fix steps, Docker compose notes, verification command
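
The key variants mentioned for check.core.config.required can be illustrated with a small sketch. In .NET configuration, hierarchical keys use `:`, while environment variables use `__` as the separator, which is the form a compose file needs. The helper names and any sample keys below are hypothetical; the real check may match keys differently.

```python
def to_env_var(config_key: str) -> str:
    """Convert a colon-separated configuration key to its env var form."""
    return config_key.replace(":", "__")


def to_config_key(env_var: str) -> str:
    """Convert a double-underscore env var name back to colon form."""
    return env_var.replace("__", ":")


def is_setting_present(config_key: str, environ: dict[str, str]) -> bool:
    """Return True if the setting is supplied under either key variant.

    Assumption: keys are compared case-insensitively and blank values
    count as missing.
    """
    wanted = to_config_key(config_key).lower()
    return any(
        to_config_key(name).lower() == wanted and value.strip() != ""
        for name, value in environ.items()
    )
```

Normalizing both variants to one form before comparing is one way to avoid a false positive when a setting is supplied only through a compose environment variable.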
### DOC-003 - Security Plugin checks documentation
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/security.md`
- Document: check.security.secrets, check.security.tls, check.security.cors, check.security.headers
- Include: which keys are considered "secrets" vs DSNs, vault provider configuration, development vs production guidance
Completion criteria:
- [ ] Each check documented with fix steps and Docker compose notes
### DOC-004 - Docker Plugin checks documentation
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/docker.md`
- Document: check.docker.socket, check.docker.daemon, check.docker.images
- Include: container-vs-host detection, socket mount instructions, Windows named pipe notes
Completion criteria:
- [ ] Each check documented with container-aware behavior explained
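
A rough sketch of container-aware socket detection, under stated assumptions. The `/.dockerenv` marker file and the `DOTNET_RUNNING_IN_CONTAINER` variable are common heuristics rather than guarantees, and the outcome policy in `socket_check_outcome` (report N/A instead of Fail when the socket is simply not mounted into a container) is an assumption about the desired behavior, not the actual check logic. All function names are hypothetical.

```python
import os
import platform

UNIX_SOCKET = "/var/run/docker.sock"
WINDOWS_PIPE = r"\\.\pipe\docker_engine"


def running_in_container(environ=None, path_exists=os.path.exists) -> bool:
    """Heuristic: env var set by official .NET images, or Docker's marker file."""
    environ = os.environ if environ is None else environ
    if environ.get("DOTNET_RUNNING_IN_CONTAINER", "").lower() in ("1", "true"):
        return True
    return path_exists("/.dockerenv")


def docker_endpoint(system=None) -> str:
    """Pick the default Docker endpoint: named pipe on Windows, socket elsewhere."""
    system = platform.system() if system is None else system
    return WINDOWS_PIPE if system == "Windows" else UNIX_SOCKET


def socket_check_outcome(in_container: bool, endpoint_exists: bool) -> str:
    """Assumed policy: a missing socket is only a failure on the host."""
    if endpoint_exists:
        return "Pass"
    # Inside a container the socket is absent unless it was mounted, e.g. with
    # a compose volume entry "/var/run/docker.sock:/var/run/docker.sock".
    return "N/A" if in_container else "Fail"
```

Passing `environ` and `path_exists` as parameters keeps the heuristic testable without a real container.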
### DOC-005 - Agent Plugin checks documentation (11 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/agent.md`
- Document all 11 agent checks: capacity, certificates, cluster health/quorum, heartbeat, resources, versions, stale detection, task failure rate, task backlog
Completion criteria:
- [ ] Each check documented with thresholds, configuration options, fix steps
### DOC-006 - Attestor Plugin checks documentation (6 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/attestor.md`
- Document: cosign key material, clock skew, Rekor connectivity/verification, signing key expiration, transparency log consistency
Completion criteria:
- [ ] Each check documented including air-gap/offline scenarios
### DOC-007 - Auth Plugin checks documentation (4 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/auth.md`
- Document: auth configuration, OIDC provider connectivity, signing key health, token service health
Completion criteria:
- [ ] Each check documented with OIDC troubleshooting steps
### DOC-008 - Remaining plugins documentation
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create one doc per remaining plugin:
- `docs/modules/doctor/checks/binary-analysis.md` (6 checks)
- `docs/modules/doctor/checks/compliance.md` (7 checks)
- `docs/modules/doctor/checks/crypto.md` (6 checks)
- `docs/modules/doctor/checks/environment.md` (6 checks)
- `docs/modules/doctor/checks/evidence-locker.md` (4 checks)
- `docs/modules/doctor/checks/observability.md` (4 checks)
- `docs/modules/doctor/checks/notify.md` (9 checks)
- `docs/modules/doctor/checks/operations.md` (3 checks)
- `docs/modules/doctor/checks/policy.md` (1 check)
- `docs/modules/doctor/checks/postgres.md` (3 checks)
- `docs/modules/doctor/checks/release.md` (6 checks)
- `docs/modules/doctor/checks/scanner.md` (7 checks)
- `docs/modules/doctor/checks/storage.md` (3 checks)
- `docs/modules/doctor/checks/timestamping.md` (9 checks)
- `docs/modules/doctor/checks/vex.md` (3 checks)
Completion criteria:
- [ ] Every check across all 16 plugins documented
### DOC-009 - Improve check remediation messages in code
Status: TODO
Dependency: DOC-002 through DOC-008
Owners: Developer
Task description:
- For each check, update the `WithRemediation()` steps to include:
- Exact commands (not vague "configure X")
- Docker compose env var names (using `__` separator)
- File paths relative to the compose directory
- Link to the documentation page (e.g., "See docs/modules/doctor/checks/core.md")
- Update `WithCauses()` to be specific, not generic
Completion criteria:
- [ ] All 99 checks have precise, copy-pasteable remediation steps
- [ ] No check reports a generic "configure X" without specifying how
- [ ] Default Docker compose installations pass every check that is expected to pass
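
The completion criteria above could be spot-checked with a small lint over remediation steps. This is only a sketch: the definition of "actionable" used here (a step must contain an inline command or identifier, a `__`-style env var, or a file path) and the function name `lint_remediation` are assumptions, not part of the Doctor code.

```python
import re

# A step counts as concrete if it contains at least one of these tokens.
CONCRETE_TOKEN = re.compile(
    r"`[^`]+`"          # inline command or identifier in backticks
    r"|\b\w+__\w+\b"    # env var name using the __ separator
    r"|\S+/\S+\.\w+"    # file path with an extension
)
DOC_LINK_PREFIX = "docs/modules/doctor/checks/"


def lint_remediation(steps: list[str]) -> list[str]:
    """Return a list of problems found in one check's remediation steps."""
    problems = []
    for index, step in enumerate(steps, start=1):
        if not CONCRETE_TOKEN.search(step):
            problems.append(f"step {index} is not actionable: {step!r}")
    if not any(DOC_LINK_PREFIX in step for step in steps):
        problems.append("no link to a check documentation page")
    return problems
```

A step such as "Configure the database" would be flagged, while a step that names an env var or links to `docs/modules/doctor/checks/core.md` would not.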
### DOC-010 - Docker compose default pass baseline
Status: TODO
Dependency: DOC-009
Owners: QA / Test Automation
Task description:
- Run all 99 Doctor checks against a fresh `docker compose up` installation
- Document which checks MUST pass, which are expected warnings, which are N/A
- Create `docs/modules/doctor/compose-baseline.md` with the expected results
- Add any remaining code fixes for false positives
Completion criteria:
- [ ] Baseline document created
- [ ] Zero false-positive FAILs on fresh Docker compose install
- [ ] All WARN checks documented as expected or fixed
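
One way to enforce the baseline once it exists is to compare observed results against it and report anything worse than expected. The sketch below assumes both sides are simple mappings from check ID to status; the function name and the status ranking are hypothetical.

```python
# Higher rank means a worse outcome; Pass and N/A are treated as equally fine.
STATUS_RANK = {"N/A": 0, "Pass": 0, "Warn": 1, "Fail": 2}


def compare_to_baseline(baseline: dict[str, str], results: dict[str, str]) -> list[str]:
    """List checks that are missing, unknown, or worse than the baseline allows."""
    deviations = []
    for check_id in sorted(set(baseline) | set(results)):
        if check_id not in baseline:
            deviations.append(f"{check_id}: not in baseline")
        elif check_id not in results:
            deviations.append(f"{check_id}: no result reported")
        elif STATUS_RANK[results[check_id]] > STATUS_RANK[baseline[check_id]]:
            deviations.append(
                f"{check_id}: expected {baseline[check_id]}, got {results[check_id]}"
            )
    return deviations
```

An empty return value would correspond to the "zero false-positive FAILs" criterion, provided the baseline itself records every expected warning.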
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-03-26 | Sprint created. 4 code fixes applied (RequiredSettings, EnvironmentVariables, SecretsConfiguration, DockerSocket). | Planning |
## Decisions & Risks
- Risk: 99 checks is a large documentation surface. Mitigation: parallelize by plugin.
- Decision: Each plugin gets its own doc file for maintainability.
- Decision: Remediation messages in code should link to docs, not duplicate full instructions.
## Next Checkpoints
- DOC-001 (index): 1 day
- DOC-002 through DOC-008 (all plugin docs): 3-5 days
- DOC-009 (code remediation improvements): 2 days
- DOC-010 (baseline): 1 day