Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules (Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment, EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release, Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation, Authority, Core, Cryptography, Database, Docker, Integration, Notify, Observability, Security, ServiceGraph, Sources, Verification). Each check now emits structured remediation metadata (severity, category, runbook links, and fix suggestions) consumed by the Doctor dashboard remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions
--- a/docs/doctor/articles/scanner/witness-graph.md
+++ b/docs/doctor/articles/scanner/witness-graph.md
@@ -0,0 +1,108 @@
+---
+checkId: check.scanner.witness.graph
+plugin: stellaops.doctor.scanner
+severity: warn
+tags: [scanner, witness, graph, reachability, evidence]
+---
+# Witness Graph Health
+
+## What It Checks
+Queries the Scanner service at `/api/v1/witness/stats` and evaluates witness graph construction health:
+
+- **Construction failures**: fail if failure rate exceeds 10% of total constructions.
+- **Incomplete graphs**: warn if any graphs are incomplete (missing nodes or edges).
+- **Consistency errors**: warn if any consistency errors are detected (orphaned nodes, version mismatches).
+
+Evidence collected: `total_constructed`, `construction_failures`, `failure_rate`, `incomplete_graphs`, `avg_nodes_per_graph`, `avg_edges_per_graph`, `avg_completeness`, `consistency_errors`.
+
+The check requires `Scanner:Url` or `Services:Scanner:Url` to be configured.
+
+## Why It Matters
+Witness graphs are the evidence artifacts that prove how a vulnerability reachability verdict was reached. They record the call chain from application entry point to vulnerable function. Without intact witness graphs, reachability findings lack provenance, attestation of scan results is weakened, and auditors cannot verify that "unreachable" verdicts are legitimate. Incomplete or inconsistent graphs can cause incorrect reachability conclusions.
+
+## Common Causes
+- Missing SBOM input (SBOM generation failed for the artifact)
+- Parser error on specific artifact types or ecosystems
+- Cyclic dependency that causes infinite traversal
+- Resource exhaustion during graph construction on large projects
+- Partial SBOM data (some dependencies resolved, others missing)
+- Missing transitive dependencies in the dependency tree
+- Version mismatch between SBOM and slice data
+- Orphaned nodes from stale cache entries
+
+## How to Fix
+
+### Docker Compose
+```bash
+# View recent construction failures
+docker compose -f docker-compose.stella-ops.yml logs scanner | grep -iE "witness.*fail|graph.*error"
+
+# Rebuild failed graphs
+stella scanner witness rebuild --failed
+
+# Check SBOM pipeline health (witness graphs depend on SBOMs)
+stella doctor run --check check.scanner.sbom
+```
+
+```yaml
+services:
+  scanner:
+    environment:
+      Scanner__WitnessGraph__MaxDepth: "50"
+      Scanner__WitnessGraph__TimeoutMs: "30000"
+      Scanner__WitnessGraph__ConsistencyCheckEnabled: "true"
+```
+
+### Bare Metal / systemd
+```bash
+# View construction errors
+sudo journalctl -u stellaops-scanner --since "1 hour ago" | grep -i witness
+
+# Rebuild failed graphs
+stella scanner witness rebuild --failed
+
+# View graph statistics
+stella scanner witness stats
+```
+
+Edit `/etc/stellaops/scanner/appsettings.json`:
+
+```json
+{
+  "WitnessGraph": {
+    "MaxDepth": 50,
+    "TimeoutMs": 30000,
+    "ConsistencyCheckEnabled": true
+  }
+}
+```
+
+### Kubernetes / Helm
+```bash
+# Check scanner logs for witness graph issues
+kubectl logs -l app=stellaops-scanner --tail=200 | grep -i witness
+
+# Rebuild failed graphs
+kubectl exec -it <scanner-pod> -- stella scanner witness rebuild --failed
+```
+
+Set in Helm `values.yaml`:
+
+```yaml
+scanner:
+  witnessGraph:
+    maxDepth: 50
+    timeoutMs: 30000
+    consistencyCheckEnabled: true
+```
+
+## Verification
+```bash
+stella doctor run --check check.scanner.witness.graph
+```
+
+## Related Checks
+- `check.scanner.sbom` -- witness graphs are constructed from SBOM data
+- `check.scanner.reachability` -- reachability verdicts depend on witness graph integrity
+- `check.scanner.slice.cache` -- stale cache entries can cause consistency errors
+- `check.scanner.resources` -- resource exhaustion causes construction failures
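
The thresholds listed under "What It Checks" in the article above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the check's actual implementation. The names `WitnessStats` and `evaluate` are hypothetical. Computing the rate as `construction_failures / total_constructed`, treating zero constructions as a 0% rate, and returning `pass` when nothing triggers are all assumptions. The service may instead report `failure_rate` directly.

```python
from dataclasses import dataclass

# From the article: fail when the failure rate exceeds 10% of total constructions.
FAILURE_RATE_THRESHOLD = 0.10


@dataclass
class WitnessStats:
    """Subset of the evidence fields named in the article (hypothetical container)."""
    total_constructed: int
    construction_failures: int
    incomplete_graphs: int
    consistency_errors: int


def evaluate(stats: WitnessStats) -> tuple[str, list[str]]:
    """Return (status, reasons) where status is 'fail', 'warn', or 'pass'."""
    # Assumption: no constructions means there is no failure rate to report.
    if stats.total_constructed > 0:
        failure_rate = stats.construction_failures / stats.total_constructed
    else:
        failure_rate = 0.0

    # The failure condition takes priority over the warning conditions.
    if failure_rate > FAILURE_RATE_THRESHOLD:
        return "fail", [f"failure rate {failure_rate:.1%} exceeds 10%"]

    reasons = []
    if stats.incomplete_graphs > 0:
        reasons.append(f"{stats.incomplete_graphs} incomplete graph(s)")
    if stats.consistency_errors > 0:
        reasons.append(f"{stats.consistency_errors} consistency error(s)")

    return ("warn" if reasons else "pass"), reasons


if __name__ == "__main__":
    print(evaluate(WitnessStats(200, 30, 0, 0)))
    print(evaluate(WitnessStats(200, 5, 3, 1)))
```

Under this reading, a rate of exactly 10% does not fail, because the article says "exceeds". If the real check treats the boundary differently, the comparison is the only line that would change.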