Here’s a compact, practical way to add an explanation graph that traces every vulnerability verdict back to raw evidence—so auditors can verify results without trusting an LLM.
What it is (in one line)
A small, immutable graph that connects a verdict → to reasoning steps → to raw evidence (source scan records, binary symbol/build‑ID matches, external advisories/feeds), with cryptographic hashes so anyone can replay/verify it.
Minimal data model (vendor‑neutral)
{
  "explanationGraph": {
    "scanId": "uuid",
    "artifact": {
      "purl": "pkg:docker/redis@7.2.4",
      "digest": "sha256:…",
      "buildId": "elf:abcd…|pe:…|macho:…"
    },
    "verdicts": [
      {
        "verdictId": "uuid",
        "cve": "CVE-2024-XXXX",
        "status": "affected|not_affected|under_investigation",
        "policy": "vex.lattice:v1",
        "reasoning": [
          {"stepId":"s1","type":"callgraph.reachable","evidenceRef":"e1"},
          {"stepId":"s2","type":"version.match","evidenceRef":"e2"},
          {"stepId":"s3","type":"vendor.vex.override","evidenceRef":"e3"}
        ],
        "provenance": {
          "scanner": "StellaOps.Scanner@1.3.0",
          "rulesHash": "sha256:…",
          "time": "2025-11-25T12:34:56Z",
          "attestation": "dsse:…"
        }
      }
    ],
    "evidence": [
      {
        "evidenceId":"e1",
        "kind":"binary.callgraph",
        "hash":"sha256:…",
        "summary":"main -> libssl!EVP_* path present",
        "blobPointer":"ipfs://… | file://… | s3://…"
      },
      {
        "evidenceId":"e2",
        "kind":"source.scan",
        "hash":"sha256:…",
        "summary":"Detected libssl 3.0.14 via SONAME + build‑id",
        "blobPointer":"…"
      },
      {
        "evidenceId":"e3",
        "kind":"external.feed",
        "hash":"sha256:…",
        "summary":"Vendor VEX: CVE not reachable when FIPS mode enabled",
        "blobPointer":"…",
        "externalRef":{"type":"advisory","id":"VEX-ACME-2025-001","url":"…"}
      }
    ]
  }
}
How it works (flow)
- Collect raw artifacts: scanner findings, binary symbol matches (Build‑ID / PDB / dSYM), SBOM components, external feeds (NVD, vendor VEX).
- Normalize to evidence nodes (immutable blobs with content hash + pointer).
- Reason via small, deterministic rules (your lattice/policy). Each rule emits a reasoning step that points to evidence.
- Emit a verdict with status + full chain of steps.
- Seal with DSSE/Sigstore (or your offline signer) so the whole graph is replayable.
Why this helps (auditable AI)
- No black box: every “affected/not affected” claim links to verifiable bytes.
- Deterministic: same inputs + rules = same verdict (hashes prove it).
- Reproducible for clients/regulators: export graph + blobs, they replay locally.
- LLM‑optional: you can add LLM explanations as non‑authoritative annotations; the verdict remains policy‑driven.
C# drop‑in (Stella Ops style)
public record EvidenceNode(
    string EvidenceId, string Kind, string Hash, string Summary, string BlobPointer,
    ExternalRef? ExternalRef = null);
// Mirrors the externalRef object in the JSON model above.
public record ExternalRef(string Type, string Id, string Url);
public record ReasoningStep(string StepId, string Type, string EvidenceRef);
public record Verdict(
    string VerdictId, string Cve, string Status, string Policy,
    IReadOnlyList<ReasoningStep> Reasoning, Provenance Provenance);
public record Provenance(string Scanner, string RulesHash, DateTimeOffset Time, string Attestation);
public record ExplanationGraph(
    Guid ScanId, Artifact Artifact,
    IReadOnlyList<Verdict> Verdicts, IReadOnlyList<EvidenceNode> Evidence);
public record Artifact(string Purl, string Digest, string BuildId);
- Persist as immutable documents (Mongo collection explanations).
- Store large evidence blobs in object storage; keep hash + blobPointer in Mongo.
- Sign the serialized graph (DSSE) and store the signature alongside.
UI (compact “trace” panel)
- Top line: CVE → Status chip (Affected / Not affected / Under investigation).
- Three tabs: Evidence, Reasoning, Provenance.
- One‑click export: “Download Replay Bundle (.zip)” → JSON graph + evidence blobs + verify script.
- Badge: “Deterministic ✓” when rulesHash + inputs resolve to prior signature.
Ops & replay
- Bundle a tiny CLI: stellaops-explain verify graph.json --evidence ./blobs/
- Verification checks: all hashes match, DSSE signature valid, rulesHash known, verdict derivable from steps.
Where to start (1‑week sprint)
- Day 1–2: Model + Mongo collections + signer service.
- Day 3: Scanner adapters emit EvidenceNode records; policy engine emits ReasoningStep.
- Day 4: Verdict assembly + DSSE signing + export bundle.
- Day 5: Minimal UI trace panel + CLI verifier.
If you want, I can generate the Mongo schemas, a DSSE signing helper, and the React/Angular trace panel stub next.
Here’s a concrete implementation plan you can hand to your developers so they’re not guessing what to build.
I’ll break it down by phases, and inside each phase I’ll call out owner, deliverables, and acceptance criteria.
Phase 0 – Scope & decisions (½ day)
Goal: Lock in the “rules of the game” so nobody bikesheds later.
Decisions to confirm (write in a short ADR):
- Canonical representation & hashing
  - Format for hashing: canonical JSON (stable property ordering, UTF‑8, no whitespace).
  - Algorithm: SHA‑256 for:
    - the ExplanationGraph document
    - each EvidenceNode
  - Hash scope:
    - evidence.hash = hash of the raw evidence blob (or canonical subset if huge)
    - graphHash = hash of the entire explanation graph document (minus signature).
- Signing
  - Format: DSSE envelope (payloadType = "stellaops/explanation-graph@v1").
  - Key management: use existing offline signing key or Sigstore‑style keyless if already in org.
  - Signature attached as:
    - provenance.attestation field inside each verdict and
    - stored in a separate explanation_signatures collection or S3 path for replay.
- Storage
  - Metadata: MongoDB collection explanation_graphs.
  - Evidence blobs:
    - S3 (or compatible) bucket stella-explanations/ with layout: evidence/{evidenceId} or evidence/{hash}.
- ID formats
  - scanId: UUID (string).
  - verdictId, evidenceId, stepId: UUID (string).
  - buildId: reuse existing convention (elf:<buildid>, pe:<guid>, macho:<uuid>).
Deliverable: 1–2 page ADR in repo (/docs/adr/000-explanation-graph.md).
Phase 1 – Domain model & persistence (backend)
Owner: Backend
1.1. Define core C# domain models
Place in StellaOps.Explanations project or equivalent:
public record ArtifactRef(
    string Purl,
    string Digest,
    string BuildId);
public record ExternalRef(
    string Type,        // "advisory", "vex", "nvd", etc.
    string Id,
    string Url);
public record EvidenceNode(
    string EvidenceId,
    string Kind,        // "binary.callgraph", "source.scan", "external.feed", ...
    string Hash,        // sha256 of blob
    string Summary,
    string BlobPointer, // s3://..., file://..., ipfs://...
    ExternalRef? ExternalRef = null);
public record ReasoningStep(
    string StepId,
    string Type,        // "callgraph.reachable", "version.match", ...
    string EvidenceRef); // EvidenceId
public record Provenance(
    string Scanner,
    string RulesHash,   // hash of rules/policy bundle used
    DateTimeOffset Time,
    string Attestation); // DSSE envelope (base64 or JSON)
public record Verdict(
    string VerdictId,
    string Cve,
    string Status,      // "affected", "not_affected", "under_investigation"
    string Policy,      // e.g. "vex.lattice:v1"
    IReadOnlyList<ReasoningStep> Reasoning,
    Provenance Provenance);
public record ExplanationGraph(
    Guid ScanId,
    ArtifactRef Artifact,
    IReadOnlyList<Verdict> Verdicts,
    IReadOnlyList<EvidenceNode> Evidence,
    string GraphHash);  // sha256 of canonical JSON
1.2. MongoDB schema
Collection: explanation_graphs
Document shape:
{
  "_id": "scanId:artifactDigest", // composite key or just ObjectId + separate fields
  "scanId": "uuid",
  "artifact": {
    "purl": "pkg:docker/redis@7.2.4",
    "digest": "sha256:...",
    "buildId": "elf:abcd..."
  },
  "verdicts": [ /* Verdict[] */ ],
  "evidence": [ /* EvidenceNode[] */ ],
  "graphHash": "sha256:..."
}
Indexes:
- { scanId: 1 }
- { "artifact.digest": 1 }
- { "verdicts.cve": 1, "artifact.digest": 1 } (compound)
- Optional: TTL or archiving mechanism if you don’t want to keep these forever.
Acceptance criteria:
- You can serialize/deserialize ExplanationGraph to Mongo without loss.
- Indexes exist and queries by scanId, artifact.digest, and (digest + CVE) are efficient.
Phase 2 – Evidence ingestion plumbing
Goal: Make every relevant raw fact show up as an EvidenceNode.
Owner: Backend scanner team
2.1. Evidence factory service
Create IEvidenceService:
public interface IEvidenceService
{
    Task<EvidenceNode> StoreBinaryCallgraphAsync(
        Guid scanId,
        ArtifactRef artifact,
        byte[] callgraphBytes,
        string summary,
        ExternalRef? externalRef = null);
    Task<EvidenceNode> StoreSourceScanAsync(
        Guid scanId,
        ArtifactRef artifact,
        byte[] scanResultJson,
        string summary);
    Task<EvidenceNode> StoreExternalFeedAsync(
        Guid scanId,
        ExternalRef externalRef,
        byte[] rawPayload,
        string summary);
}
Implementation tasks:
- Hash computation
  - Compute SHA‑256 over raw bytes.
  - Prefer a helper: public static string Sha256Hex(ReadOnlySpan<byte> data) { ... }
- Blob storage
  - S3 key format, e.g.: explanations/{scanId}/{evidenceId}.
  - BlobPointer string = s3://stella-explanations/explanations/{scanId}/{evidenceId}.
- EvidenceNode creation
  - Generate evidenceId = Guid.NewGuid().ToString("N").
  - Populate kind, hash, summary, blobPointer, externalRef.
- Graph assembly contract
  - Evidence service does not write to Mongo.
  - It only uploads blobs and returns EvidenceNode objects.
  - The ExplanationGraphBuilder (next phase) collects them.
Acceptance criteria:
- Given a callgraph binary, a corresponding EvidenceNode is returned with:
  - hash matching the blob (verified in tests),
  - blob present in S3,
  - summary populated.
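The hash computation and EvidenceNode creation tasks in this phase could be sketched roughly as follows. This is a minimal illustration, not the service itself: it assumes .NET 6 or later, leaves the S3 upload to the caller, and uses hypothetical names (EvidenceNodeSketch, EvidenceSketch) plus a trimmed copy of the record without ExternalRef. The "sha256:" prefix on the hash is an assumption taken from the JSON examples.

```csharp
using System;
using System.Security.Cryptography;

// Trimmed, hypothetical copy of the Phase 1 EvidenceNode record (ExternalRef omitted).
public record EvidenceNodeSketch(
    string EvidenceId, string Kind, string Hash, string Summary, string BlobPointer);

public static class EvidenceSketch
{
    // Lowercase hex SHA-256 of the raw bytes.
    public static string Sha256Hex(ReadOnlySpan<byte> data)
    {
        Span<byte> digest = stackalloc byte[32];
        SHA256.HashData(data, digest);
        return Convert.ToHexString(digest).ToLowerInvariant();
    }

    // Builds the node only; uploading the blob to object storage is the caller's job.
    public static EvidenceNodeSketch Create(
        Guid scanId, string kind, ReadOnlySpan<byte> blob, string summary)
    {
        var evidenceId = Guid.NewGuid().ToString("N");
        return new EvidenceNodeSketch(
            evidenceId,
            kind,
            $"sha256:{Sha256Hex(blob)}", // prefix is an assumption, see lead-in
            summary,
            $"s3://stella-explanations/explanations/{scanId}/{evidenceId}");
    }
}
```

Because the hash is taken over the raw bytes before anything else happens, the same blob always yields the same hash regardless of the randomly generated evidenceId.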
Phase 3 – Reasoning & policy integration
Goal: Instrument your existing VEX / lattice policy engine to emit deterministic reasoning steps instead of just a boolean status.
Owner: Policy / rules engine team
3.1. Expose rule evaluation trace
Assume you already have something like:
VulnerabilityStatus Evaluate(ArtifactRef artifact, string cve, Findings findings);
Extend it to:
public sealed class RuleEvaluationTrace
{
    public string StepType { get; init; }        // e.g. "version.match"
    public string RuleId { get; init; }          // "rule:openssl:versionFromElf"
    public string Description { get; init; }     // human-readable explanation
    public string EvidenceKind { get; init; }    // to match with EvidenceService
    public object EvidencePayload { get; init; } // callgraph bytes, json, etc.
}
public sealed class EvaluationResult
{
    public string Status { get; init; } // "affected", etc.
    public IReadOnlyList<RuleEvaluationTrace> Trace { get; init; }
}
New API:
EvaluationResult EvaluateWithTrace(
    ArtifactRef artifact, string cve, Findings findings);
3.2. From trace to ReasoningStep + EvidenceNode
Create ExplanationGraphBuilder:
public interface IExplanationGraphBuilder
{
    Task<ExplanationGraph> BuildAsync(
        Guid scanId,
        ArtifactRef artifact,
        IReadOnlyList<CveFinding> cveFindings,
        string scannerName);
}
Internal algorithm for each CveFinding:
- Call EvaluateWithTrace(artifact, cve, finding) to get EvaluationResult.
- For each RuleEvaluationTrace:
  - Use EvidenceService with the appropriate method based on EvidenceKind.
  - Get back an EvidenceNode with evidenceId.
  - Create ReasoningStep:
    - StepId = Guid.NewGuid().ToString("N")
    - Type = trace.StepType
    - EvidenceRef = evidenceNode.EvidenceId
- Assemble Verdict (named arguments must match the PascalCase record parameters):
var verdict = new Verdict(
    VerdictId: Guid.NewGuid().ToString("N"),
    Cve: finding.Cve,
    Status: result.Status,
    Policy: "vex.lattice:v1",
    Reasoning: steps,
    Provenance: new Provenance(
        Scanner: scannerName,
        RulesHash: rulesBundleHash,
        Time: DateTimeOffset.UtcNow,
        Attestation: "" // set in Phase 4
    )
);
- Collect:
  - all EvidenceNodes (dedupe by hash to avoid duplicates, and point each step's EvidenceRef at the node that is kept).
  - all Verdicts.
Acceptance criteria:
- Given deterministic inputs (scan + rules bundle hash), repeated runs produce:
  - same sequence of ReasoningStep types,
  - same set of EvidenceNode.hash values,
  - same status.
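The "dedupe by hash" part of the collect step has one subtlety: once a duplicate EvidenceNode is dropped, any ReasoningStep that referenced it must be re-pointed at the node that was kept. A possible sketch, using trimmed copies of the records and a hypothetical helper name (EvidenceDedupSketch), might look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copies of the Phase 1 records, just enough for the example.
public record EvidenceNode(string EvidenceId, string Kind, string Hash, string Summary, string BlobPointer);
public record ReasoningStep(string StepId, string Type, string EvidenceRef);

public static class EvidenceDedupSketch
{
    // Keeps the first node seen for each hash and rewrites step references accordingly.
    public static (IReadOnlyList<EvidenceNode> Evidence, IReadOnlyList<ReasoningStep> Steps) Dedupe(
        IEnumerable<EvidenceNode> evidence, IEnumerable<ReasoningStep> steps)
    {
        var kept = new List<EvidenceNode>();
        var keptByHash = new Dictionary<string, EvidenceNode>(StringComparer.Ordinal);
        var keptIdByOriginalId = new Dictionary<string, string>(StringComparer.Ordinal);

        foreach (var node in evidence)
        {
            if (!keptByHash.TryGetValue(node.Hash, out var survivor))
            {
                survivor = node;
                keptByHash.Add(node.Hash, node);
                kept.Add(node);
            }
            keptIdByOriginalId[node.EvidenceId] = survivor.EvidenceId;
        }

        // Throws KeyNotFoundException if a step references unknown evidence,
        // which is arguably the right failure mode for a broken graph.
        var remapped = steps
            .Select(s => s with { EvidenceRef = keptIdByOriginalId[s.EvidenceRef] })
            .ToList();

        return (kept, remapped);
    }
}
```

Keeping the first occurrence (rather than an arbitrary one) preserves input order, which helps when comparing repeated runs.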
Phase 4 – Graph hashing & DSSE signing
Owner: Security / platform
4.1. Canonical JSON for hash
Implement:
public static class ExplanationGraphSerializer
{
    public static string ToCanonicalJson(ExplanationGraph graph)
    {
        // no graphHash, no attestation in this step
        throw new NotImplementedException(); // stub: canonical serialization goes here
    }
}
Key requirements:
- Consistent property ordering (e.g. alphabetical).
- No extra whitespace.
- UTF‑8 encoding.
- Primitive formatting options fixed (e.g. date as ISO 8601 with Z).
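One way to meet the ordering and whitespace requirements above is to canonicalize at the JSON level rather than in the object serializer. The sketch below assumes .NET 6 or later with System.Text.Json and a hypothetical class name (CanonicalJsonSketch). It sorts object properties by ordinal name and writes compact output; it does not normalize number formatting or string escaping, so it is not a full RFC 8785 (JCS) implementation.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.Json;

public static class CanonicalJsonSketch
{
    // Re-emits JSON with object properties sorted (ordinal) and no insignificant whitespace.
    public static string Canonicalize(string json)
    {
        using var doc = JsonDocument.Parse(json);
        using var buffer = new MemoryStream();
        using (var writer = new Utf8JsonWriter(buffer)) // default options: not indented
        {
            Write(doc.RootElement, writer);
        } // disposing the writer flushes it into the buffer
        return Encoding.UTF8.GetString(buffer.ToArray());
    }

    private static void Write(JsonElement element, Utf8JsonWriter writer)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                writer.WriteStartObject();
                foreach (var property in element.EnumerateObject()
                             .OrderBy(p => p.Name, StringComparer.Ordinal))
                {
                    writer.WritePropertyName(property.Name);
                    Write(property.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonValueKind.Array:
                // Array order is significant and is preserved as-is.
                writer.WriteStartArray();
                foreach (var item in element.EnumerateArray())
                    Write(item, writer);
                writer.WriteEndArray();
                break;
            default:
                // Strings, numbers, booleans, null.
                element.WriteTo(writer);
                break;
        }
    }
}
```

Under this approach, ToCanonicalJson could serialize the graph with any stable serializer settings and pass the result through Canonicalize, so property order in the C# records stops mattering.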
4.2. Hash and sign
Before persisting:
var graphWithoutHash = graph with { GraphHash = "" };
var canonicalJson = ExplanationGraphSerializer.ToCanonicalJson(graphWithoutHash);
var graphHash = Sha256Hex(Encoding.UTF8.GetBytes(canonicalJson));
// sign DSSE envelope
var envelope = dsseSigner.Sign(
    payloadType: "stellaops/explanation-graph@v1",
    payload: Encoding.UTF8.GetBytes(canonicalJson)
);
// attach
var signedVerdicts = graph.Verdicts
    .Select(v => v with
    {
        Provenance = v.Provenance with { Attestation = envelope.ToJson() }
    })
    .ToList();
var finalGraph = graph with
{
    GraphHash = $"sha256:{graphHash}",
    Verdicts = signedVerdicts
};
Then write finalGraph to Mongo.
Acceptance criteria:
- Recomputing graphHash from Mongo document (zeroing graphHash and attestation) matches stored value.
- Verifying DSSE signature with the public key succeeds.
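For context on what a DSSE signer does under the hood: DSSE does not sign the payload directly but a "pre-authentication encoding" (PAE) that binds the payload type to the payload bytes. The sketch below shows PAE plus sign/verify with an ECDSA key. The class name DsseSketch and the choice of ECDSA with SHA-256 are assumptions for illustration; the dsseSigner used above may work differently, and building the JSON envelope is left out.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class DsseSketch
{
    // PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body,
    // where LEN is the byte length written as ASCII decimal.
    public static byte[] Pae(string payloadType, byte[] payload)
    {
        var typeBytes = Encoding.UTF8.GetBytes(payloadType);
        var header = Encoding.UTF8.GetBytes(
            $"DSSEv1 {typeBytes.Length} {payloadType} {payload.Length} ");
        var result = new byte[header.Length + payload.Length];
        header.CopyTo(result, 0);
        payload.CopyTo(result, header.Length);
        return result;
    }

    // Signs the PAE bytes, not the raw payload.
    public static byte[] Sign(ECDsa key, string payloadType, byte[] payload) =>
        key.SignData(Pae(payloadType, payload), HashAlgorithmName.SHA256);

    // Verification recomputes the PAE from the envelope's payloadType and payload.
    public static bool Verify(ECDsa key, string payloadType, byte[] payload, byte[] signature) =>
        key.VerifyData(Pae(payloadType, payload), signature, HashAlgorithmName.SHA256);
}
```

Because the payload type is part of the signed bytes, a signature over an explanation graph cannot be replayed as a signature over a different document type with the same content.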
Phase 5 – Backend APIs & export bundle
Owner: Backend / API
5.1. Read APIs
Add endpoints (REST-ish):
- Get graph for scan-artifact
  - GET /explanations/scans/{scanId}/artifacts/{digest}
  - Returns entire ExplanationGraph JSON.
- Get single verdict
  - GET /explanations/scans/{scanId}/artifacts/{digest}/cves/{cve}
  - Returns Verdict + its subset of EvidenceNodes.
- Search by CVE
  - GET /explanations/search?cve=CVE-2024-XXXX&digest=sha256:...
  - Returns list of (scanId, artifact, verdictId).
5.2. Export replay bundle
POST /explanations/{scanId}/{digest}/export
Implementation:
- Create a temporary directory.
- Write:
  - graph.json → ExplanationGraph as stored.
  - signature.json → DSSE envelope alone (optional).
  - Evidence blobs:
    - For each EvidenceNode: download from S3 and store as evidence/{evidenceId}.
- Zip the folder: explanation-{scanId}-{shortDigest}.zip.
- Stream as download.
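The bundle layout above could be produced with something like the following sketch. It assumes the evidence blobs have already been downloaded into memory (the S3 calls are omitted), skips the optional signature.json, and uses a hypothetical class name (ReplayBundleSketch).

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class ReplayBundleSketch
{
    // Writes graph.json plus evidence/{evidenceId} files into a zip at zipPath.
    // Note: ZipFile.CreateFromDirectory throws if zipPath already exists.
    public static void Write(
        string zipPath, string graphJson, IReadOnlyDictionary<string, byte[]> blobsByEvidenceId)
    {
        var tempDir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        var evidenceDir = Path.Combine(tempDir, "evidence");
        Directory.CreateDirectory(evidenceDir); // also creates tempDir
        try
        {
            File.WriteAllText(Path.Combine(tempDir, "graph.json"), graphJson);
            foreach (var (evidenceId, blob) in blobsByEvidenceId)
                File.WriteAllBytes(Path.Combine(evidenceDir, evidenceId), blob);

            ZipFile.CreateFromDirectory(tempDir, zipPath);
        }
        finally
        {
            // Clean up the staging directory whether or not zipping succeeded.
            Directory.Delete(tempDir, recursive: true);
        }
    }
}
```

For large evidence sets, streaming entries straight into a ZipArchive would avoid the staging directory; the temp-directory version is shown because it matches the steps listed above.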
5.3. CLI verifier
Small .NET / Go CLI:
Commands:
stellaops-explain verify graph.json --evidence ./evidence
Verification steps:
- Load graph.json, parse to ExplanationGraph.
- Strip graphHash & attestation, re‑serialize canonical JSON.
- Recompute SHA‑256 and compare to graphHash.
- Verify DSSE envelope with public key.
- For each EvidenceNode:
  - Read file ./evidence/{evidenceId}.
  - Recompute hash and compare with evidence.hash.
Exit with non‑zero code if anything fails; print a short summary.
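The per-evidence check in the verification steps could look roughly like this. It is a sketch only: it covers evidence hashes, not the graph hash or the DSSE signature, and it assumes hashes are stored as "sha256:" followed by lowercase hex. EvidenceVerifierSketch is a hypothetical name, and a real CLI should also reject evidence IDs containing path separators before touching the file system.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class EvidenceVerifierSketch
{
    // Returns the IDs of evidence entries whose file is missing or whose hash does not match.
    public static List<string> FindMismatches(
        IEnumerable<(string EvidenceId, string Hash)> evidence, string evidenceDir)
    {
        var failures = new List<string>();
        foreach (var (evidenceId, expectedHash) in evidence)
        {
            var path = Path.Combine(evidenceDir, evidenceId);
            if (!File.Exists(path))
            {
                failures.Add(evidenceId);
                continue;
            }

            var digest = SHA256.HashData(File.ReadAllBytes(path));
            var actualHash = "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
            if (!string.Equals(actualHash, expectedHash, StringComparison.Ordinal))
                failures.Add(evidenceId);
        }
        return failures;
    }
}
```

A CLI built on this would exit non-zero whenever the returned list is non-empty and print the offending IDs in its summary.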
Acceptance criteria:
- Export bundle round‑trips: verify passes on an exported zip.
- APIs documented in OpenAPI / Swagger.
Phase 6 – UI: Explanation trace panel
Owner: Frontend
6.1. API integration
New calls in frontend client:
- GET /explanations/scans/{scanId}/artifacts/{digest}
- Optionally GET /explanations/.../cves/{cve} if you want lazy loading per CVE.
6.2. Component UX
On the “vulnerability detail” view:
- Add “Explanation” tab with three sections plus an export action:
  - Verdict summary
    - Badge: Affected / Not affected / Under investigation.
    - Text: Derived using policy {policy}, rules hash {rulesHash[..8]}.
  - Reasoning timeline
    - Vertical list of ReasoningSteps:
      - Icon per type (e.g. “flow” icon for callgraph.reachable).
      - Title = Type (humanized).
      - Click to expand underlying EvidenceNode.summary.
      - Optional “View raw evidence” link (downloads blob via S3 signed URL).
  - Provenance
    - Show:
      - scanner
      - rulesHash
      - time
      - “Attested ✓” if DSSE verifies on the backend (or pre‑computed).
  - Export
    - Button: “Download replay bundle (.zip)”
    - Calls export endpoint and triggers browser download.
Acceptance criteria:
- For any CVE in UI, a user can:
  - See why it is (not) affected in at most 2 clicks.
  - Download a replay bundle via the UI.
Phase 7 – Testing strategy
Owner: QA + all devs
7.1. Unit tests
- EvidenceService:
  - Hash matches blob contents.
  - BlobPointer formats are as expected.
- ExplanationGraphBuilder:
  - Given fixed test input (with generated IDs and timestamps fixed or injected), the resulting graph JSON matches golden file.
- Serializer:
  - Canonical JSON is stable under property reordering in the code.
7.2. Integration tests
- End‑to‑end fake scan:
  - Simulate scanner output + rules.
  - Build graph → persist → fetch via API.
  - Run CLI verify on exported bundle in CI.
7.3. Security tests
- Signature tampering:
  - Modify graph.json in exported bundle; verify must fail.
- Evidence tampering:
  - Modify an evidence file; verify must fail.
Phase 8 – Rollout
Owner: PM / Tech lead
- Feature flag
  - Start with explanation graph generation behind a flag for:
    - subset of scanners,
    - subset of tenants.
- Backfill (optional)
  - If useful, run a one‑off job that:
    - Takes recent scans,
    - Rebuilds explanation graphs,
    - Stores them in Mongo.
- Docs
  - Short doc page for customers:
    - “What is an Explanation Graph?”
    - “How to verify it with the CLI?”
Developer checklist (TL;DR)
You can literally drop this into Jira as epics/tasks:
- Backend
  - Implement domain models (ExplanationGraph, Verdict, EvidenceNode, etc.).
  - Implement IEvidenceService + S3 integration.
  - Extend policy engine to EvaluateWithTrace.
  - Implement ExplanationGraphBuilder.
  - Implement canonical serializer, hashing, DSSE signing.
  - Implement Mongo persistence + indexes.
  - Implement REST APIs + export ZIP.
- Frontend
  - Wire new APIs into the vulnerability detail view.
  - Build Explanation tab (Summary / Reasoning / Provenance).
  - Implement “Download replay bundle” button.
- Tools
  - Implement stellaops-explain verify CLI.
  - Add CI test that runs verify against a sample bundle.
- QA
  - Golden‑file tests for graphs.
  - Signature & evidence tampering tests.
  - UI functional tests on explanations.
If you’d like, as a next step I can turn this into:
- a concrete OpenAPI spec for the new endpoints, and/or
- a sample stellaops-explain verify CLI skeleton (C# or Go).