Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions

@@ -0,0 +1,115 @@
---
checkId: check.release.configuration
plugin: stellaops.doctor.release
severity: warn
tags: [release, configuration, workflow, validation]
---
# Release Configuration
## What It Checks
Queries the Release Orchestrator at `/api/v1/workflows` and validates all release workflow definitions:
- **Empty stages**: fail if a workflow has no stages defined.
- **Invalid transitions**: fail if a stage references a next stage that does not exist in the workflow.
- **Unreachable stages**: warn if a stage has no incoming transitions and is not the entry point (first stage).
- **Missing environment mapping**: fail if a stage has no target environment assigned.
- **No workflows**: warn if no release workflows are configured at all.
Evidence collected: `workflow_count`, `active_workflow_count`, `total_stages`, `validation_error_count`, `errors`, `workflow_names`.
The check requires `ReleaseOrchestrator:Url` or `Release:Orchestrator:Url` to be configured.
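The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the Doctor plugin's actual implementation: it assumes workflows are available as dictionaries shaped like the JSON example under "Bare Metal / systemd" (`name`, `isActive`, `stages`, `environmentId`, `nextStages`), and the function name `validate_workflows` and the extra `warnings` key are hypothetical.

```python
from typing import Any


def validate_workflows(workflows: list[dict[str, Any]]) -> dict[str, Any]:
    """Apply the documented rules to workflow dicts (field names are assumptions)."""
    errors: list[str] = []    # conditions the article lists as "fail"
    warnings: list[str] = []  # conditions the article lists as "warn"

    if not workflows:
        warnings.append("no release workflows configured")

    for wf in workflows:
        name = wf.get("name", "<unnamed>")
        stages = wf.get("stages", [])
        if not stages:
            errors.append(f"{name}: workflow has no stages")
            continue

        stage_names = {s["name"] for s in stages}
        targeted: set[str] = set()  # stages that have an incoming transition
        for stage in stages:
            if not stage.get("environmentId"):
                errors.append(f"{name}/{stage['name']}: no target environment")
            for nxt in stage.get("nextStages", []):
                if nxt in stage_names:
                    targeted.add(nxt)
                else:
                    errors.append(f"{name}/{stage['name']}: unknown next stage '{nxt}'")

        # The first stage is the entry point; every other stage needs an incoming transition.
        for stage in stages[1:]:
            if stage["name"] not in targeted:
                warnings.append(f"{name}/{stage['name']}: unreachable stage")

    # Keys mirror the evidence fields listed above; "warnings" is added for illustration.
    return {
        "workflow_count": len(workflows),
        "active_workflow_count": sum(1 for wf in workflows if wf.get("isActive")),
        "total_stages": sum(len(wf.get("stages", [])) for wf in workflows),
        "validation_error_count": len(errors),
        "errors": errors,
        "warnings": warnings,
        "workflow_names": [wf.get("name", "<unnamed>") for wf in workflows],
    }
```

Keeping failures and warnings in separate lists follows the article's own split between "fail" and "warn" conditions, so a caller could derive an overall severity from whichever list is non-empty.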
## Why It Matters
Release workflows define the promotion path from development through production. Configuration errors in workflows are silent until a release attempts to use the broken path, at which point the release fails mid-flight. An invalid stage transition creates a dead end in the pipeline. A missing environment mapping means the orchestrator does not know where to deploy. These issues should be caught at configuration time, not during a production release.
## Common Causes
- Workflow configuration incomplete (created but not finished)
- Stage transition misconfigured after adding or removing stages
- Environment deleted from the system but workflow not updated to reflect the change
- Copy-paste errors when duplicating workflows
- Stages added but not connected to any transition path
## How to Fix
### Docker Compose
```bash
# List all workflows and their validation status
stella release workflow list
# View details of a specific workflow
stella release workflow show <workflow-id>
# Edit workflow to fix configuration
stella release workflow edit <workflow-id>
# Create a new workflow from scratch
stella release workflow create --name "standard" --stages dev,staging,prod
```
### Bare Metal / systemd
```bash
# List workflows
stella release workflow list
# Validate a specific workflow
stella release workflow validate <workflow-id>
# Fix workflow configuration
stella release workflow edit <workflow-id>
```
Edit the workflow configuration file directly if needed:
```json
{
  "workflows": [
    {
      "name": "standard",
      "isActive": true,
      "stages": [
        { "name": "dev", "environmentId": "<dev-env-id>", "nextStages": ["staging"] },
        { "name": "staging", "environmentId": "<staging-env-id>", "nextStages": ["prod"] },
        { "name": "prod", "environmentId": "<prod-env-id>", "nextStages": [] }
      ]
    }
  ]
}
```
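After a manual edit, it can help to print the promotion paths a workflow actually describes before re-running the check. The sketch below is illustrative only: it assumes the JSON shape shown above, the helper name `promotion_paths` is hypothetical, and the configuration is inlined as a string because the location of the configuration file is not stated here.

```python
import json


def promotion_paths(workflow: dict) -> list[list[str]]:
    """Return every stage sequence starting at the first stage of the workflow."""
    next_by_stage = {s["name"]: s.get("nextStages", []) for s in workflow.get("stages", [])}
    if not next_by_stage:
        return []
    entry = workflow["stages"][0]["name"]
    paths: list[list[str]] = []

    def walk(name: str, trail: list[str]) -> None:
        if name not in next_by_stage:
            paths.append(trail + [f"{name} (missing)"])  # transition to an undefined stage
        elif name in trail:
            paths.append(trail + [f"{name} (cycle)"])    # stage revisited on the same path
        elif not next_by_stage[name]:
            paths.append(trail + [name])                 # terminal stage ends the path
        else:
            for nxt in next_by_stage[name]:
                walk(nxt, trail + [name])

    walk(entry, [])
    return paths


config_text = """
{"workflows": [{"name": "standard", "isActive": true, "stages": [
  {"name": "dev", "environmentId": "dev-env", "nextStages": ["staging"]},
  {"name": "staging", "environmentId": "staging-env", "nextStages": ["prod"]},
  {"name": "prod", "environmentId": "prod-env", "nextStages": []}
]}]}
"""

for wf in json.loads(config_text)["workflows"]:
    for path in promotion_paths(wf):
        print(f"{wf['name']}: {' -> '.join(path)}")  # standard: dev -> staging -> prod
```

A path ending in `(missing)` corresponds to the invalid-transition failure described earlier, and a stage that never appears in any printed path is one the check would report as unreachable.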
### Kubernetes / Helm
```bash
# Check workflow configuration in ConfigMap
kubectl get configmap stellaops-release-workflows -o yaml
# Validate workflows
kubectl exec -it <orchestrator-pod> -- stella release workflow list
```
Set in Helm `values.yaml`:
```yaml
releaseOrchestrator:
  workflows:
    - name: standard
      isActive: true
      stages:
        - name: dev
          environmentRef: dev
          nextStages: [staging]
        - name: staging
          environmentRef: staging
          nextStages: [prod]
        - name: prod
          environmentRef: prod
          nextStages: []
```
## Verification
```bash
stella doctor run --check check.release.configuration
```
## Related Checks
- `check.release.active` -- active releases fail when workflow configuration is broken
- `check.release.environment.readiness` -- workflows reference environments that must exist and be healthy
- `check.release.promotion.gates` -- gates are associated with workflow stage transitions