Evidence-Gated AI Explanations (product advisory, 30 Dec 2025)

Here's a simple, high-signal pattern you can drop into your security product: gate AI remediation/explanations behind an “Evidence Coverage” badge, and hide suggestions when coverage is weak.


What this solves (plain English)

AI advice is only trustworthy when it's grounded in real evidence. If your scan only sees half the picture, AI “fixes” become noise. A visible coverage badge makes this explicit and keeps the UI quiet until you've got enough facts.


What “Evidence Coverage” means

Score = % of the verdict's required facts that are present, e.g., do we have:

  • Reachability (is the vulnerable code/path actually callable in this artifact/runtime?)
  • VEX (vendor/product statements: affected/not-affected/under-investigation)
  • Runtime (telemetry, process trees, loaded libs, eBPF hooks)
  • Exploit signals (known exploits, KEV, EPSS tier, in-the-wild intel)
  • Patch/backport proof (distro backports, symbols, diff/BuildID match)
  • Provenance (in-toto/DSSE attestations, signer trust)
  • Environment match (kernel/os/distro/package set parity)
  • Differential context (did this change since last release?)

Each fact bucket contributes weighted points → a 0–100% Coverage score.
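
For concreteness, here is how that weighted sum might look; a minimal TypeScript sketch, assuming each bucket carries a present flag and a weight (the bucket shape mirrors the data model further down, but none of these names are a fixed API):

// Earned weight over total weight, scaled to 0-100 and rounded.
type Bucket = { id: string; present: boolean; weight: number };

function coverageScore(buckets: Bucket[]): number {
  const total = buckets.reduce((sum, b) => sum + b.weight, 0);
  const earned = buckets.reduce((sum, b) => sum + (b.present ? b.weight : 0), 0);
  return total > 0 ? Math.round((100 * earned) / total) : 0;
}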


UX rule of thumb

  • <60%: Hide AI suggestions by default. Show a muted badge “Coverage 41% — add sources to unlock guidance.”
  • 60–79%: Collapse AI panel; allow manual expand with a caution label. Every sentence shows its citations.
  • ≥80%: Show AI remediation by default with a green badge and inline evidence chips.
  • 100%: Add a subtle “High confidence” ribbon + “export proof” link.

Minimal UI components

  • A small badge next to each finding: Coverage 82% (click → drawer); see the sketch after this list.
  • Drawer tabs: Sources, Why we think it's reachable, Counter-evidence, Gaps.
  • “Fill the gaps” callouts (e.g., “Attach VEX”, “Enable runtime sensor”, “Upload SBOM”).
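
As a starting point for the badge, a minimal React/TypeScript sketch (prop and class names here are hypothetical; the drawer wiring is left out):

// Badge tone tracks the coverage thresholds from the UX rule of thumb.
function CoverageBadge({ score, onClick }: { score: number; onClick: () => void }) {
  const tone = score >= 80 ? "green" : score >= 60 ? "caution" : "muted";
  return (
    <button className={`coverage-badge ${tone}`} onClick={onClick}>
      Coverage {score}%{score === 100 ? " · High confidence" : ""}
    </button>
  );
}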

Copy you can reuse

  • Collapsed state (low coverage): “We're missing runtime or VEX evidence. Add one source to unlock tailored remediation.”
  • Expanded (medium): “Guidance shown with caution. 3/5 evidence buckets present. See gaps →”

Data model (lean)

coverage:
  score: 0-100
  buckets:
    - id: reachability      # call graph, symbol, entrypoints
      present: true
      weight: 0.22
      evidence_refs: [e1,e7]
    - id: vex               # product/vendor statements
      present: false
      weight: 0.18
      evidence_refs: []
    - id: runtime
      present: true
      weight: 0.20
      evidence_refs: [e3]
    - id: exploit_signals
      present: true
      weight: 0.15
      evidence_refs: [e6]
    - id: patch_backport
      present: false
      weight: 0.15
      evidence_refs: []
    - id: provenance
      present: true
      weight: 0.10
      evidence_refs: [e9]
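
With these example weights (they sum to 1.0 across all six buckets), the present buckets contribute 0.22 + 0.20 + 0.15 + 0.10 = 0.67, so this finding scores 67: squarely in the 60–79% “expand with caution” band.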

Policy in one line (ship this as a guard)

if coverage.score < 60: hide_ai()
elif coverage.score < 80: show_ai(collapsed=True, label="limited evidence")
else: show_ai(collapsed=False, label="evidence-backed")
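
Applied to the example coverage block above (score 67), the guard takes the middle branch: the AI panel renders collapsed with the “limited evidence” label.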

What the AI must output (when shown)

  • Step-by-step remediation with per-step citations to the evidence drawer (a structured-output sketch follows this list).
  • Why this is safe (mentions backports, ABI risk, service impact).
  • Counterfactual: “If VEX says Not Affected → do X instead.”
  • Residual risk and rollback plan.
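
One way to make the citation requirement enforceable rather than aspirational is to demand structured output from the model. A hypothetical TypeScript shape (field names are illustrative only):

// Every step must carry citations that resolve to evidence_refs
// in the coverage drawer, e.g. ["e3", "e6"].
type RemediationStep = { text: string; citations: string[] };

type AiGuidance = {
  steps: RemediationStep[];
  safetyRationale: string;   // backports, ABI risk, service impact
  counterfactual?: string;   // e.g. what to do if VEX says not-affected
  residualRisk: string;
  rollbackPlan: string;
};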

How to reach ≥80% more often

  • Auto-request missing inputs (“Upload maintainer VEX” / “Turn on runtime for 24h”); a gap-ranking sketch follows this list.
  • Fetch distro backport diffs and symbol maps to close the patch/backport bucket.
  • Merge SBOM + call-graph + eBPF to strengthen reachability.
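
A tiny helper can drive those auto-requests by ranking the missing buckets by the points each would add; a sketch using the same hypothetical bucket shape as earlier:

// Ask for the highest-value missing evidence first.
type Bucket = { id: string; present: boolean; weight: number };

function topGaps(buckets: Bucket[], n = 3): Bucket[] {
  return buckets
    .filter((b) => !b.present)
    .sort((a, b) => b.weight - a.weight)
    .slice(0, n);
}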

If you want, I can draft a drop-in React component (Badge + Drawer) and a tiny scoring service (C#/.NET 10) that plugs into your verdict pipeline.