doctor: complete runtime check documentation sprint

Signed-off-by: master <>
This commit is contained in:
master
2026-03-31 23:26:24 +03:00
parent 404d50bcb7
commit 152c1b1357
54 changed files with 2210 additions and 258 deletions


@@ -0,0 +1,169 @@
# Sprint 20260326_001 - Doctor Runtime Check Documentation
## Topic & Scope
- Align Doctor documentation to the live runtime catalog of 101 checks across 14 plugins.
- Backfill missing runtime articles for database, observability, servicegraph, and verification checks.
- Publish a runtime index and compose baseline sourced from local Doctor API evidence.
- Fix empty runtime runbook URLs in database, servicegraph, and verification checks and cover them with targeted unit tests.
- Working directory: `docs/doctor/`.
- Allowed cross-module edits: `docs/modules/doctor/`, `src/__Libraries/StellaOps.Doctor.Plugins.*`, `src/__Libraries/__Tests/StellaOps.Doctor.Plugins.*`.
- Expected evidence: runtime index, compose baseline, article files, unit tests, local API evidence.
## Dependencies & Concurrency
- The original sprint text was stale and referenced `99` checks across `16` plugins. Execution was normalized against the live runtime catalog exposed by `GET /api/v1/doctor/checks` on 2026-03-31.
- Canonical per-check remediation remains in `docs/doctor/articles/**`; `docs/modules/doctor/checks/README.md` is the generated runtime index.
- Safe parallelism existed by plugin, but the sprint was completed in a single integrated pass to keep the runtime index, article set, and code remediation aligned.
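
The inventory normalization described above can be sketched as a small tally over the catalog payload. This is a minimal illustration only: it assumes the response of `GET /api/v1/doctor/checks` decodes to a list of objects carrying `id` and `plugin` fields, which this sprint does not document, and the sample entries are placeholders rather than captured API output.

```python
from collections import Counter


def summarize_catalog(checks):
    """Return (total_checks, plugin_count, per_plugin_counts) for a catalog list.

    Assumption: each entry is a dict with a "plugin" key. The real payload
    shape of the Doctor API is not specified in this sprint.
    """
    per_plugin = Counter(check["plugin"] for check in checks)
    return len(checks), len(per_plugin), dict(per_plugin)


# Hypothetical sample entries; IDs and field names are placeholders.
sample_catalog = [
    {"id": "db-connection", "plugin": "database"},
    {"id": "db-latency", "plugin": "database"},
    {"id": "servicegraph-valkey", "plugin": "servicegraph"},
]

total, plugin_count, per_plugin = summarize_catalog(sample_catalog)
print(f"{total} checks across {plugin_count} plugins: {per_plugin}")
```

Run against the real catalog, the first two values are the figures to compare with the `101` checks and `14` plugins recorded in this sprint.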
## Documentation Prerequisites
- `docs/doctor/README.md`
- `docs/modules/doctor/registry-checks.md`
- `docs/doctor/articles/_TEMPLATE.md`
- `devops/compose/docker-compose.stella-ops.yml`
- `src/Doctor/AGENTS.md`
## Delivery Tracker
### DOC-001 - Create runtime check reference index
Status: DONE
Dependency: none
Owners: Documentation author
Task description:
- Created `docs/modules/doctor/checks/README.md` as the runtime-backed master index for all 101 checks exposed by the local Doctor API.
- Grouped checks by plugin and linked every runtime check to its canonical article under `docs/doctor/articles/**`.
Completion criteria:
- [x] All 101 runtime checks listed with current plugin and severity metadata.
- [x] Baseline status column populated from live run `dr_20260331_195122_99ff09`.
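
A plugin-grouped index like the one described in this task could be rendered along the following lines. This is a hedged sketch, not the generator used for `docs/modules/doctor/checks/README.md`: the field names (`id`, `plugin`, `severity`, `baseline`, `article`), the table columns, and the sample paths are all assumptions made for illustration.

```python
from itertools import groupby


def render_index(checks):
    """Render a Markdown index with one table per plugin.

    Assumption: each check is a dict with id, plugin, severity, baseline,
    and article keys. The real README layout may differ.
    """
    ordered = sorted(checks, key=lambda c: (c["plugin"], c["id"]))
    lines = []
    for plugin, group in groupby(ordered, key=lambda c: c["plugin"]):
        lines.append(f"## {plugin}")
        lines.append("| Check | Severity | Baseline | Article |")
        lines.append("| --- | --- | --- | --- |")
        for check in group:
            lines.append(
                f"| `{check['id']}` | {check['severity']} | "
                f"{check['baseline']} | [article]({check['article']}) |"
            )
        lines.append("")  # blank line between plugin sections
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical entries; severities, statuses, and paths are placeholders.
sample_checks = [
    {"id": "servicegraph-valkey", "plugin": "servicegraph", "severity": "warn",
     "baseline": "pass", "article": "articles/servicegraph-valkey.md"},
    {"id": "db-connection", "plugin": "database", "severity": "fail",
     "baseline": "pass", "article": "articles/db-connection.md"},
]

index = render_index(sample_checks)
print(index)
```

Sorting before grouping keeps the output deterministic, so regenerating the index from an unchanged catalog produces no diff.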
### DOC-002 - Verify and normalize existing core coverage
Status: DONE
Dependency: DOC-001
Owners: Documentation author
Task description:
- Verified that the existing core article set already covers every check in the runtime core catalog.
- Indexed the core checks in the runtime README and documented their captured baseline states.
Completion criteria:
- [x] Every runtime core check resolves to an article.
- [x] Runtime index links core checks to their canonical articles.
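
The "every check resolves to an article" criterion, which recurs in the tasks below, amounts to a set-membership check. The sketch below assumes a mapping from check ID to article path is already available (for example, extracted from the runtime index); how check IDs map to files under `docs/doctor/articles/**` is not stated in this sprint, so the IDs and paths here are hypothetical.

```python
def find_uncovered(check_to_article, existing_articles):
    """Return sorted check IDs whose article is unmapped or missing.

    check_to_article: dict of check ID -> article path (or None if unmapped).
    existing_articles: set of article paths known to exist.
    """
    return sorted(
        check_id
        for check_id, article in check_to_article.items()
        if not article or article not in existing_articles
    )


# Hypothetical mapping and file set, for illustration only.
mapping = {
    "db-connection": "docs/doctor/articles/db-connection.md",
    "db-latency": "docs/doctor/articles/db-latency.md",
    "db-pool-size": None,
}
articles_on_disk = {"docs/doctor/articles/db-connection.md"}

uncovered = find_uncovered(mapping, articles_on_disk)
print(uncovered)
```

On a real checkout, the set of existing articles could be built with `pathlib.Path("docs/doctor/articles").rglob("*.md")`; an empty result means the criterion holds.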
### DOC-003 - Verify and normalize existing security and attestation coverage
Status: DONE
Dependency: DOC-001
Owners: Documentation author
Task description:
- Verified the existing security and attestation article corpus against the live runtime catalog.
- Indexed those checks in the runtime README and preserved article-first remediation.
Completion criteria:
- [x] Every runtime security and attestation check resolves to an article.
- [x] Runtime index links security and attestation checks to their canonical articles.
### DOC-004 - Verify and normalize existing docker coverage
Status: DONE
Dependency: DOC-001
Owners: Documentation author
Task description:
- Verified the existing docker article set against the live runtime docker plugin.
- Indexed docker checks in the runtime README with baseline status from the captured run.
Completion criteria:
- [x] Every runtime docker check resolves to an article.
- [x] Runtime index records compose baseline status for docker checks.
### DOC-005 - Backfill runtime database articles
Status: DONE
Dependency: DOC-001
Owners: Documentation author
Task description:
- Added the missing runtime database articles:
`db-connection`, `db-latency`, `db-migrations-failed`, `db-migrations-pending`, `db-permissions`, `db-pool-health`, `db-pool-size`, and `db-schema-version`.
- Each article now documents the exact runtime check, compose-style configuration keys, remediation, and verification commands.
Completion criteria:
- [x] All runtime database checks have article coverage.
- [x] New articles follow the Doctor frontmatter and verification conventions.
### DOC-006 - Backfill runtime servicegraph articles
Status: DONE
Dependency: DOC-001
Owners: Documentation author
Task description:
- Added the missing runtime servicegraph articles:
`servicegraph-backend`, `servicegraph-circuitbreaker`, `servicegraph-endpoints`, `servicegraph-mq`, `servicegraph-timeouts`, and `servicegraph-valkey`.
- The new articles document the runtime configuration keys, thresholds, and compose remediation flow used by these checks.
Completion criteria:
- [x] All runtime servicegraph checks have article coverage.
- [x] New servicegraph articles match the runtime check IDs exposed by the local API.
### DOC-007 - Verify and normalize existing integration, environment, release, scanner, and compliance coverage
Status: DONE
Dependency: DOC-001
Owners: Documentation author
Task description:
- Verified the existing article coverage for runtime integration, environment, release, scanner, and compliance checks.
- Indexed those checks in the runtime README so the live catalog now has one authoritative lookup path.
Completion criteria:
- [x] Every runtime check in these plugin groups resolves to an article.
- [x] Runtime index reflects the current plugin counts and baseline statuses.
### DOC-008 - Backfill runtime observability and verification articles
Status: DONE
Dependency: DOC-001
Owners: Documentation author
Task description:
- Added the missing runtime observability articles:
`observability-alerting`, `observability-healthchecks`, `observability-logging`, `observability-metrics`, `observability-otel`, and `observability-tracing`.
- Added the missing runtime verification articles:
`verification-artifact-pull`, `verification-policy-engine`, `verification-sbom-validation`, `verification-signature`, and `verification-vex-validation`.
Completion criteria:
- [x] All runtime observability checks have article coverage.
- [x] All runtime verification checks have article coverage.
### DOC-009 - Add local runbook URLs for runtime database, servicegraph, and verification checks
Status: DONE
Dependency: DOC-005, DOC-006, DOC-008
Owners: Developer
Task description:
- Updated runtime database, servicegraph, and verification checks so remediation payloads emit local `docs/doctor/articles/**` runbook URLs instead of empty values.
- Added focused unit tests under the database, servicegraph, and verification test projects to assert the emitted runbook URLs.
Completion criteria:
- [x] No runtime database, servicegraph, or verification check uses `WithRunbookUrl("")`.
- [x] New unit tests verify the expected runbook URL paths for failure or warning branches.
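
The first criterion above can also be spot-checked with a plain text scan over the plugin sources. This is a rough sketch that complements, rather than replaces, the unit tests: only the `WithRunbookUrl` name comes from this sprint, while the `string.Empty` variant and the sample lines are assumptions about how an empty value might be written.

```python
import re

# Matches WithRunbookUrl("") and, as an assumed variant, WithRunbookUrl(string.Empty).
EMPTY_RUNBOOK = re.compile(r'WithRunbookUrl\(\s*(?:""|string\.Empty)\s*\)')


def find_empty_runbook_calls(source_text):
    """Return 1-based line numbers that contain an empty runbook URL call."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if EMPTY_RUNBOOK.search(line)
    ]


# Hypothetical source lines; the article path is a placeholder.
sample_source = "\n".join([
    '.WithRunbookUrl("")',
    '.WithRunbookUrl("docs/doctor/articles/example.md")',
    ".WithRunbookUrl( string.Empty )",
])

offending_lines = find_empty_runbook_calls(sample_source)
print(offending_lines)
```

A text scan cannot see values built at runtime, which is why the branch-level unit tests remain the authoritative check.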
### DOC-010 - Capture compose baseline and document runtime limitations
Status: DONE
Dependency: DOC-009
Owners: QA / Test Automation
Task description:
- Created `docs/modules/doctor/compose-baseline.md` from the captured local runtime baseline `dr_20260331_195122_99ff09`.
- Documented the evidence source, the observed pass/info/warn/fail/skip counts, and the limitation that this was a live-stack capture rather than a second fresh parallel compose bring-up.
Completion criteria:
- [x] Baseline document created with run ID, counts, and observed fail/warn details.
- [x] Runtime index links back to the compose baseline.
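
The pass/info/warn/fail/skip counts in the baseline can be derived with a simple tally over the run results. The sketch below assumes each result is an object with `id` and `status` fields and that the five status names appear in lowercase; neither the result schema nor any actual counts from run `dr_20260331_195122_99ff09` are reproduced here.

```python
from collections import Counter

# Status names taken from the baseline description; lowercase form is assumed.
STATUSES = ("pass", "info", "warn", "fail", "skip")


def tally_statuses(results):
    """Return a dict with a count for every known status, zero-filled."""
    counts = Counter(result["status"] for result in results)
    unknown = sorted(set(counts) - set(STATUSES))
    if unknown:
        raise ValueError(f"unexpected statuses: {unknown}")
    return {status: counts.get(status, 0) for status in STATUSES}


def needs_attention(results):
    """Return sorted IDs of checks that reported fail or warn."""
    return sorted(r["id"] for r in results if r["status"] in ("fail", "warn"))


# Hypothetical results; not taken from the captured run.
sample_results = [
    {"id": "db-connection", "status": "pass"},
    {"id": "db-latency", "status": "warn"},
    {"id": "servicegraph-mq", "status": "fail"},
    {"id": "observability-otel", "status": "skip"},
]

summary = tally_statuses(sample_results)
attention = needs_attention(sample_results)
print(summary, attention)
```

Rejecting unknown statuses keeps a renamed or newly added status from being silently dropped from the baseline counts.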
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-03-26 | Sprint created. 4 code fixes applied (RequiredSettings, EnvironmentVariables, SecretsConfiguration, DockerSocket). | Planning |
| 2026-03-31 | Audited the live Doctor runtime catalog and normalized sprint scope from stale `99/16` inventory to the actual `101/14` runtime inventory. | Planning |
| 2026-03-31 | Added 25 missing runtime articles under `docs/doctor/articles/**` for database, observability, servicegraph, and verification checks. | Documentation |
| 2026-03-31 | Published `docs/modules/doctor/checks/README.md` and `docs/modules/doctor/compose-baseline.md` from live Doctor API evidence. | Documentation |
| 2026-03-31 | Patched runtime database, servicegraph, and verification checks to emit local runbook URLs and added targeted unit tests for those paths. | Development |
| 2026-03-31 | Sprint delivery complete; archived from `docs/implplan/` to `docs-archived/implplan/`. | Planning |
## Decisions & Risks
- Decision: the live runtime catalog (`101` checks across `14` plugins) is the authoritative target for this sprint, not the stale sprint text that still referenced `99` checks across `16` plugins.
- Decision: `docs/doctor/articles/**` remains the canonical per-check remediation surface; [the runtime index](../../docs/modules/doctor/checks/README.md) is a generated lookup layer, not a second documentation corpus.
- Decision: [the compose baseline](../../docs/modules/doctor/compose-baseline.md) is based on the live local stack because `devops/compose/docker-compose.stella-ops.yml` hardcodes container names, which blocks a safe parallel fresh-stack run on the same machine.
- Risk: the captured live baseline still shows 4 failures. This sprint documents the current runtime and closes article/runbook gaps, but a rebuilt fresh-stack validation remains a separate operational confirmation step.
- Risk: the source tree currently contains newer Doctor plugin code paths beyond the live runtime catalog. This sprint aligned the runtime inventory and verified article coverage, but future runtime expansion should rerun the same catalog/index generation flow.
## Next Checkpoints
- Rebuild and rerun the Doctor services before claiming a fresh-stack zero-false-positive baseline.
- If the runtime catalog changes again, regenerate the runtime index and refresh the compose baseline from a new run ID.