
master c58a236d70 Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-27 12:28:00 +02:00



Witness Graph Health

What It Checks

Queries the Scanner service at /api/v1/witness/stats and evaluates witness graph construction health:

Construction failures: fail if construction failures exceed 10% of total constructions.
Incomplete graphs: warn if any graphs are incomplete (missing nodes or edges).
Consistency errors: warn if any consistency errors are detected (orphaned nodes, version mismatches).

Evidence collected: total_constructed, construction_failures, failure_rate, incomplete_graphs, avg_nodes_per_graph, avg_edges_per_graph, avg_completeness, consistency_errors.
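The three rules above can be sketched as a small evaluation function. This is a minimal illustration, not the plugin's actual implementation: the field names come from the evidence list, but the `WitnessStats` type, the `evaluate` function, the choice to compute the rate from the raw counts, and the rule that a failure short-circuits the warnings are all assumptions.

```python
from dataclasses import dataclass

# From the text: fail when failures exceed 10% of total constructions.
FAILURE_RATE_THRESHOLD = 0.10


@dataclass
class WitnessStats:
    """Hypothetical container for a subset of the evidence fields."""
    total_constructed: int
    construction_failures: int
    incomplete_graphs: int
    consistency_errors: int


def evaluate(stats: WitnessStats):
    """Return (status, reasons), where status is 'fail', 'warn', or 'pass'."""
    # Assumption: total_constructed is the denominator of the failure rate.
    if stats.total_constructed > 0:
        rate = stats.construction_failures / stats.total_constructed
    else:
        rate = 0.0

    if rate > FAILURE_RATE_THRESHOLD:
        # Assumption: a failure takes precedence over any warnings.
        return "fail", [f"failure rate {rate:.1%} exceeds 10%"]

    reasons = []
    if stats.incomplete_graphs > 0:
        reasons.append(f"{stats.incomplete_graphs} incomplete graph(s)")
    if stats.consistency_errors > 0:
        reasons.append(f"{stats.consistency_errors} consistency error(s)")
    return ("warn" if reasons else "pass"), reasons
```

Note that the threshold is strict in this sketch: exactly 10% passes, and only a rate above 10% fails, matching the word "exceeds".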

The check requires Scanner:Url or Services:Scanner:Url to be configured.
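As a rough sketch of how the endpoint URL could be derived from either configuration key: the key names and the stats path come from this article, while the flat dictionary shape, the `resolve_stats_url` helper, and the lookup order (`Scanner:Url` first) are assumptions made for illustration. What the real check does when neither key is set is not specified here, so the sketch simply returns `None`.

```python
from urllib.parse import urljoin

STATS_PATH = "api/v1/witness/stats"
# Assumption: Scanner:Url is consulted before Services:Scanner:Url.
URL_KEYS = ("Scanner:Url", "Services:Scanner:Url")


def resolve_stats_url(config):
    """Return the witness stats URL, or None if no Scanner URL is configured.

    `config` is a hypothetical flat mapping of configuration keys to values.
    """
    for key in URL_KEYS:
        base = (config.get(key) or "").strip()
        if base:
            # Normalise to exactly one trailing slash so urljoin appends
            # the path instead of replacing the last path segment.
            return urljoin(base.rstrip("/") + "/", STATS_PATH)
    return None
```

For example, `resolve_stats_url({"Scanner:Url": "http://scanner.example:8080"})` yields `http://scanner.example:8080/api/v1/witness/stats` (the host name is a placeholder).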

Why It Matters

Witness graphs are the evidence artifacts that prove how a vulnerability reachability verdict was reached. They record the call chain from application entry point to vulnerable function. Without intact witness graphs, reachability findings lack provenance, attestation of scan results is weakened, and auditors cannot verify that "unreachable" verdicts are legitimate. Incomplete or inconsistent graphs can cause incorrect reachability conclusions.
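To make the provenance argument concrete, the sketch below models a witness graph as a plain call graph and recovers the call chain from an entry point to a vulnerable function. The adjacency-dictionary representation, the function names, and `find_call_chain` are hypothetical; the Scanner's real graph format is not described in this article.

```python
from collections import deque


def find_call_chain(edges, entry, target):
    """Breadth-first search for the shortest call chain from entry to target.

    `edges` maps a caller to a list of callees. Returns the chain as a list
    of function names, or None when no path exists in the recorded graph.
    """
    parents = {entry: None}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        if node == target:
            chain = []
            while node is not None:
                chain.append(node)
                node = parents[node]
            return chain[::-1]
        for callee in edges.get(node, ()):
            if callee not in parents:  # visit each function once
                parents[callee] = node
                queue.append(callee)
    return None


# Hypothetical call graph for illustration only.
complete = {
    "main": ["handle_request"],
    "handle_request": ["parse_input", "log"],
    "parse_input": ["vulnerable_fn"],
}
# The same graph with one edge missing, as in an incomplete witness graph.
incomplete = {
    "main": ["handle_request"],
    "handle_request": ["log"],
    "parse_input": ["vulnerable_fn"],
}
```

With the complete graph the chain `main -> handle_request -> parse_input -> vulnerable_fn` is found; with the incomplete one the search returns `None`, which would read as "unreachable" even though the real program can reach the function. This is the kind of incorrect conclusion the check is meant to surface.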

Common Causes

Missing SBOM input (SBOM generation failed for the artifact)
Parser error on specific artifact types or ecosystems
Cyclical dependency detected causing infinite traversal
Resource exhaustion during graph construction on large projects
Partial SBOM data (some dependencies resolved, others missing)
Missing transitive dependencies in the dependency tree
Version mismatch between SBOM and slice data
Orphaned nodes from stale cache entries
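Two of these causes, cyclic dependencies and orphaned nodes, can be illustrated with a bounded traversal. The sketch assumes a graph stored as an adjacency dictionary and interprets a depth limit as the maximum number of edges followed from the entry node; the default of 50 mirrors the `MaxDepth` value in the configuration examples, but whether the Scanner applies the setting this way is an assumption. Both function names are hypothetical.

```python
from collections import deque


def traverse(edges, entry, max_depth=50):
    """Breadth-first traversal that terminates on cyclic graphs.

    Returns (reached, truncated): the set of nodes reached within
    max_depth edges of entry, and whether the depth limit cut off
    at least one unvisited node.
    """
    reached = {entry}
    truncated = False
    queue = deque([(entry, 0)])
    while queue:
        node, depth = queue.popleft()
        for callee in edges.get(node, ()):
            if callee in reached:
                continue  # already visited: this is what breaks cycles
            if depth + 1 > max_depth:
                truncated = True  # node exists but lies beyond the limit
                continue
            reached.add(callee)
            queue.append((callee, depth + 1))
    return reached, truncated


def orphaned_nodes(nodes, edges, entry, max_depth=50):
    """Nodes present in the graph but not reached from entry within max_depth."""
    reached, _ = traverse(edges, entry, max_depth)
    return set(nodes) - reached
```

The visited set guarantees termination on a cycle such as `a -> b -> c -> a`, and any node left over after the traversal (for example one restored from a stale cache entry) shows up as orphaned. If the traversal was truncated, nodes beyond the depth limit are also reported, so the `truncated` flag should be checked before treating the result as a consistency error.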

How to Fix

Docker Compose

# View recent construction failures
docker compose -f docker-compose.stella-ops.yml logs scanner | grep -i "witness.*fail\|graph.*error"

# Rebuild failed graphs
stella scanner witness rebuild --failed

# Check SBOM pipeline health (witness graphs depend on SBOMs)
stella doctor run --check check.scanner.sbom

Set in docker-compose.stella-ops.yml:

services:
  scanner:
    environment:
      Scanner__WitnessGraph__MaxDepth: "50"
      Scanner__WitnessGraph__TimeoutMs: "30000"
      Scanner__WitnessGraph__ConsistencyCheckEnabled: "true"

Bare Metal / systemd

# View construction errors
sudo journalctl -u stellaops-scanner --since "1 hour ago" | grep -i witness

# Rebuild failed graphs
stella scanner witness rebuild --failed

# View graph statistics
stella scanner witness stats

Edit /etc/stellaops/scanner/appsettings.json:

{
  "WitnessGraph": {
    "MaxDepth": 50,
    "TimeoutMs": 30000,
    "ConsistencyCheckEnabled": true
  }
}

Kubernetes / Helm

# Check scanner logs for witness graph issues
kubectl logs -l app=stellaops-scanner --tail=200 | grep -i witness

# Rebuild failed graphs
kubectl exec -it <scanner-pod> -- stella scanner witness rebuild --failed

Set in Helm values.yaml:

scanner:
  witnessGraph:
    maxDepth: 50
    timeoutMs: 30000
    consistencyCheckEnabled: true

Verification

stella doctor run --check check.scanner.witness.graph

Related Checks

check.scanner.sbom -- witness graphs are constructed from SBOM data
check.scanner.reachability -- reachability verdicts depend on witness graph integrity
check.scanner.slice.cache -- stale cache entries can cause consistency errors
check.scanner.resources -- resource exhaustion causes construction failures
