doctor: complete runtime check documentation sprint

Signed-off-by: master <>
master
2026-03-31 23:26:24 +03:00
parent 404d50bcb7
commit 152c1b1357
54 changed files with 2210 additions and 258 deletions


@@ -1,181 +0,0 @@
# Sprint 20260326_001 — Doctor Health Checks Documentation
## Topic & Scope
- Document every Doctor health check (99 checks across 16 plugins) with precise, actionable remediation.
- Each check must have: what it tests, why it matters, exact fix steps, Docker compose specifics, and verification.
- Fix false-positive checks that fail on default Docker compose installations.
- Working directory: `docs/modules/doctor/`, `src/Doctor/__Plugins/`
- Expected evidence: docs, improved check messages, tests.
## Dependencies & Concurrency
- No upstream sprint dependencies. Work can be parallelized by plugin.
- Builds on the 4 check code fixes already applied (RequiredSettings, EnvironmentVariables, SecretsConfiguration, DockerSocket).
## Documentation Prerequisites
- `docs/modules/doctor/architecture.md` — existing Doctor architecture overview
- `docs/modules/doctor/registry-checks.md` — existing check registry reference
- `devops/compose/docker-compose.stella-ops.yml` — the reference deployment
## Delivery Tracker
### DOC-001 - Create check reference index
Status: TODO
Dependency: none
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/README.md` with a master table of all 99 checks
- Columns: Check ID, Plugin, Category, Severity, Summary, Docker Compose Status (Pass/Warn/Fail/N/A)
- Group by plugin (Core, Security, Docker, Agent, Attestor, Auth, etc.)
- Include quick-reference severity legend
Completion criteria:
- [ ] All 99 checks listed with correct metadata
- [ ] Docker Compose Status column filled from actual test run
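The master table described in DOC-001 could be generated rather than hand-maintained. The following is a minimal sketch, assuming check metadata is available as a list of dictionaries keyed by the column names above. The `render_index` helper and the sample rows are hypothetical, and the Category, Severity, and Summary values are placeholders, not real check data.

```python
from collections import defaultdict

# Column order taken from the DOC-001 task description.
COLUMNS = ["Check ID", "Plugin", "Category", "Severity", "Summary", "Docker Compose Status"]


def render_index(checks):
    """Render one Markdown table per plugin from a list of check dicts."""
    by_plugin = defaultdict(list)
    for check in checks:
        by_plugin[check["Plugin"]].append(check)

    lines = []
    for plugin in sorted(by_plugin):
        lines.append(f"## {plugin}")
        lines.append("| " + " | ".join(COLUMNS) + " |")
        lines.append("| " + " | ".join("---" for _ in COLUMNS) + " |")
        # Sort rows by check ID so regenerated tables diff cleanly.
        for check in sorted(by_plugin[plugin], key=lambda c: c["Check ID"]):
            lines.append("| " + " | ".join(check[col] for col in COLUMNS) + " |")
        lines.append("")
    return "\n".join(lines)


# Hypothetical sample rows: only the check IDs and plugin names come from this sprint.
SAMPLE = [
    {"Check ID": "check.docker.socket", "Plugin": "Docker", "Category": "placeholder",
     "Severity": "placeholder", "Summary": "placeholder", "Docker Compose Status": "N/A"},
    {"Check ID": "check.core.memory", "Plugin": "Core", "Category": "placeholder",
     "Severity": "placeholder", "Summary": "placeholder", "Docker Compose Status": "N/A"},
]

if __name__ == "__main__":
    print(render_index(SAMPLE))
```

Generating the table from one metadata source would keep the 99-row index consistent with the per-plugin pages, but where that metadata lives is left open here.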
### DOC-002 - Core Plugin checks documentation (9 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/core.md`
- Document each check:
  - **check.core.config.required**: What settings are checked, key variants (colon vs `__`), compose env var names, how to add missing settings
  - **check.core.env.variables**: Which env vars are checked, why `ASPNETCORE_ENVIRONMENT` may not be set in compose, when this is OK
  - **check.core.health.endpoint**: Health endpoint configuration
  - **check.core.memory**: Memory threshold configuration
  - **check.core.startup.time**: Expected startup time ranges
  - Each remaining core check
- For each check: Symptom → Root Cause → Fix → Verify
Completion criteria:
- [ ] Each check has: description, what it tests, severity, fix steps, Docker compose notes, verification command
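The colon vs `__` key variants mentioned for **check.core.config.required** follow the .NET configuration convention: a hierarchical key such as `Section:Key` is supplied through the environment as `Section__Key`, and keys are matched case-insensitively. The sketch below illustrates that mapping. The helper names and the sample keys are hypothetical and are not taken from the Doctor code.

```python
def to_env_var(config_key: str) -> str:
    """Convert a colon-separated configuration key to its env var spelling."""
    return config_key.replace(":", "__")


def from_env_var(env_name: str) -> str:
    """Convert a double-underscore env var name back to a colon-separated key."""
    return env_name.replace("__", ":")


def find_missing(required_keys, environ):
    """Return required keys that have no matching env var (case-insensitive)."""
    present = {from_env_var(name).lower() for name in environ}
    return [key for key in required_keys if key.lower() not in present]


if __name__ == "__main__":
    # Hypothetical key, used only to show the spelling change.
    print(to_env_var("ConnectionStrings:Default"))  # ConnectionStrings__Default
```

A documentation page could show both spellings side by side, so readers can copy the `__` form directly into the compose file.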
### DOC-003 - Security Plugin checks documentation
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/security.md`
- Document: `check.security.secrets`, `check.security.tls`, `check.security.cors`, `check.security.headers`
- Include: which keys are considered "secrets" vs DSNs, vault provider configuration, development vs production guidance
Completion criteria:
- [ ] Each check documented with fix steps and Docker compose notes
### DOC-004 - Docker Plugin checks documentation
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/docker.md`
- Document: `check.docker.socket`, `check.docker.daemon`, `check.docker.images`
- Include: container-vs-host detection, socket mount instructions, Windows named pipe notes
Completion criteria:
- [ ] Each check documented with container-aware behavior explained
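To make the "container-vs-host detection" note concrete, the sketch below shows two commonly used heuristics: the `/.dockerenv` marker file and container runtime names in `/proc/1/cgroup`. This is an assumption about how such detection can work, not a description of what the DockerSocket check actually does. The function name is hypothetical, and the cgroup heuristic may find no markers on cgroup v2 hosts.

```python
from pathlib import Path


def running_in_container(root: Path = Path("/")) -> bool:
    """Best-effort guess at whether the process runs inside a container.

    `root` is a parameter only so the heuristics can be exercised against a
    temporary directory. In normal use it stays at "/".
    """
    # Heuristic 1: Docker creates this marker file in the container root.
    if (root / ".dockerenv").exists():
        return True

    # Heuristic 2: cgroup paths for PID 1 often name the container runtime.
    try:
        text = (root / "proc" / "1" / "cgroup").read_text()
    except OSError:
        return False
    return any(marker in text for marker in ("docker", "containerd", "kubepods"))


if __name__ == "__main__":
    print("container" if running_in_container() else "host")
```

Whichever heuristic the plugin uses, the documentation should state it explicitly so that operators understand why a missing socket is reported differently inside a container.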
### DOC-005 - Agent Plugin checks documentation (11 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/agent.md`
- Document all 11 agent checks: capacity, certificates, cluster health/quorum, heartbeat, resources, versions, stale detection, task failure rate, task backlog
Completion criteria:
- [ ] Each check documented with thresholds, configuration options, fix steps
### DOC-006 - Attestor Plugin checks documentation (6 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/attestor.md`
- Document: cosign key material, clock skew, Rekor connectivity/verification, signing key expiration, transparency log consistency
Completion criteria:
- [ ] Each check documented including air-gap/offline scenarios
### DOC-007 - Auth Plugin checks documentation (4 checks)
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create `docs/modules/doctor/checks/auth.md`
- Document: auth configuration, OIDC provider connectivity, signing key health, token service health
Completion criteria:
- [ ] Each check documented with OIDC troubleshooting steps
### DOC-008 - Remaining plugins documentation
Status: TODO
Dependency: DOC-001
Owners: Documentation author
Task description:
- Create one doc per remaining plugin:
  - `docs/modules/doctor/checks/binary-analysis.md` (6 checks)
  - `docs/modules/doctor/checks/compliance.md` (7 checks)
  - `docs/modules/doctor/checks/crypto.md` (6 checks)
  - `docs/modules/doctor/checks/environment.md` (6 checks)
  - `docs/modules/doctor/checks/evidence-locker.md` (4 checks)
  - `docs/modules/doctor/checks/observability.md` (4 checks)
  - `docs/modules/doctor/checks/notify.md` (9 checks)
  - `docs/modules/doctor/checks/operations.md` (3 checks)
  - `docs/modules/doctor/checks/policy.md` (1 check)
  - `docs/modules/doctor/checks/postgres.md` (3 checks)
  - `docs/modules/doctor/checks/release.md` (6 checks)
  - `docs/modules/doctor/checks/scanner.md` (7 checks)
  - `docs/modules/doctor/checks/storage.md` (3 checks)
  - `docs/modules/doctor/checks/timestamping.md` (9 checks)
  - `docs/modules/doctor/checks/vex.md` (3 checks)
Completion criteria:
- [ ] Every check across all 16 plugins documented
### DOC-009 - Improve check remediation messages in code
Status: TODO
Dependency: DOC-002 through DOC-008
Owners: Developer
Task description:
- For each check, update the `WithRemediation()` steps to include:
  - Exact commands (not vague "configure X")
  - Docker compose env var names (using `__` separator)
  - File paths relative to the compose directory
  - Link to the documentation page (e.g., "See docs/modules/doctor/checks/core.md")
- Update `WithCauses()` to be specific, not generic
Completion criteria:
- [ ] All 99 checks have precise, copy-pasteable remediation steps
- [ ] No check reports a generic "configure X" without specifying how
- [ ] Docker compose installations pass all checks that should pass
### DOC-010 - Docker compose default pass baseline
Status: TODO
Dependency: DOC-009
Owners: QA / Test Automation
Task description:
- Run all 99 Doctor checks against a fresh `docker compose up` installation
- Document which checks MUST pass, which are expected warnings, which are N/A
- Create `docs/modules/doctor/compose-baseline.md` with the expected results
- Add any remaining code fixes for false positives
Completion criteria:
- [ ] Baseline document created
- [ ] Zero false-positive FAILs on fresh Docker compose install
- [ ] All WARN checks documented as expected or fixed
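The baseline comparison in DOC-010 can be sketched as a diff between the documented expectations and an actual run. The example assumes both are available as mappings from check ID to status (Pass/Warn/Fail/N/A). The function name, the data shapes, and the sample statuses are hypothetical.

```python
def compare_to_baseline(expected, actual):
    """Return checks whose actual status deviates from the documented baseline."""
    deviations = {}
    for check_id, status in actual.items():
        want = expected.get(check_id)
        if want is None:
            # The check ran but the baseline document does not mention it.
            deviations[check_id] = f"undocumented (got {status})"
        elif want != status:
            deviations[check_id] = f"expected {want}, got {status}"
    return deviations


if __name__ == "__main__":
    # Hypothetical statuses, used only to show the output shape.
    expected = {"check.core.config.required": "Pass", "check.docker.socket": "N/A"}
    actual = {"check.core.config.required": "Fail", "check.docker.socket": "N/A"}
    for check_id, reason in compare_to_baseline(expected, actual).items():
        print(f"{check_id}: {reason}")
```

An empty result would correspond to the "zero false-positive FAILs" criterion, provided the baseline document itself lists no unexpected failures.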
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-03-26 | Sprint created. 4 code fixes applied (RequiredSettings, EnvironmentVariables, SecretsConfiguration, DockerSocket). | Planning |
## Decisions & Risks
- Risk: 99 checks is a large documentation surface. Mitigation: parallelize the work by plugin.
- Decision: Each plugin gets its own doc file for maintainability.
- Decision: Remediation messages in code should link to docs, not duplicate full instructions.
## Next Checkpoints
- DOC-001 (index): 1 day
- DOC-002 through DOC-008 (all plugin docs): 3-5 days
- DOC-009 (code remediation improvements): 2 days
- DOC-010 (baseline): 1 day