Explainability Layer for Vulnerability Verdicts

Here's a compact, practical way to add an explanation graph that traces every vulnerability verdict back to raw evidence, so auditors can verify results without trusting an LLM.


What it is (in one line)

A small, immutable graph that connects a verdict → reasoning steps → raw evidence (source scan records, binary symbol/buildID matches, external advisories/feeds), with cryptographic hashes so anyone can replay/verify it.


Minimal data model (vendor-neutral)

{
  "explanationGraph": {
    "scanId": "uuid",
    "artifact": {
      "purl": "pkg:docker/redis@7.2.4",
      "digest": "sha256:…",
      "buildId": "elf:abcd…|pe:…|macho:…"
    },
    "verdicts": [
      {
        "verdictId": "uuid",
        "cve": "CVE-2024-XXXX",
        "status": "affected|not_affected|under_investigation",
        "policy": "vex/lattice:v1",
        "reasoning": [
          {"stepId":"s1","type":"callgraph.reachable","evidenceRef":"e1"},
          {"stepId":"s2","type":"version.match","evidenceRef":"e2"},
          {"stepId":"s3","type":"vendor.vex.override","evidenceRef":"e3"}
        ],
        "provenance": {
          "scanner": "StellaOps.Scanner@1.3.0",
          "rulesHash": "sha256:…", 
          "time": "2025-11-25T12:34:56Z",
          "attestation": "dsse:…"
        }
      }
    ],
    "evidence": [
      {
        "evidenceId":"e1",
        "kind":"binary.callgraph",
        "hash":"sha256:…",
        "summary":"main -> libssl!EVP_* path present",
        "blobPointer":"ipfs://… | file://… | s3://…"
      },
      {
        "evidenceId":"e2",
        "kind":"source.scan",
        "hash":"sha256:…",
        "summary":"Detected libssl 3.0.14 via SONAME + buildid",
        "blobPointer":"…"
      },
      {
        "evidenceId":"e3",
        "kind":"external.feed",
        "hash":"sha256:…",
        "summary":"Vendor VEX: CVE not reachable when FIPS mode enabled",
        "blobPointer":"…",
        "externalRef":{"type":"advisory","id":"VEX-ACME-2025-001","url":"…"}
      }
    ]
  }
}

How it works (flow)

  • Collect raw artifacts: scanner findings, binary symbol matches (BuildID / PDB / dSYM), SBOM components, external feeds (NVD, vendor VEX).
  • Normalize to evidence nodes (immutable blobs with content hash + pointer).
  • Reason via small, deterministic rules (your lattice/policy). Each rule emits a reasoning step that points to evidence.
  • Emit a verdict with status + full chain of steps.
  • Seal with DSSE/Sigstore (or your offline signer) so the whole graph is replayable.

Why this helps (auditable AI)

  • No black box: every “affected/not affected” claim links to verifiable bytes.
  • Deterministic: same inputs + rules = same verdict (hashes prove it).
  • Reproducible for clients/regulators: export graph + blobs, they replay locally.
  • LLM-optional: you can add LLM explanations as non-authoritative annotations; the verdict remains policy-driven.

C# drop-in (StellaOps style)

public record EvidenceNode(
    string EvidenceId, string Kind, string Hash, string Summary, string BlobPointer,
    ExternalRef? ExternalRef = null);

public record ReasoningStep(string StepId, string Type, string EvidenceRef);

public record Verdict(
    string VerdictId, string Cve, string Status, string Policy,
    IReadOnlyList<ReasoningStep> Reasoning, Provenance Provenance);

public record Provenance(string Scanner, string RulesHash, DateTimeOffset Time, string Attestation);

public record ExplanationGraph(
    Guid ScanId, Artifact Artifact,
    IReadOnlyList<Verdict> Verdicts, IReadOnlyList<EvidenceNode> Evidence);

public record Artifact(string Purl, string Digest, string BuildId);

  • Persist as immutable documents (Mongo collection explanations).
  • Store large evidence blobs in object storage; keep hash + blobPointer in Mongo.
  • Sign the serialized graph (DSSE) and store the signature alongside.

UI (compact “trace” panel)

  • Top line: CVE → Status chip (Affected / Not affected / Needs review).
  • Three tabs: Evidence, Reasoning, Provenance.
  • One-click export: “Download Replay Bundle (.zip)” → JSON graph + evidence blobs + verify script.
  • Badge: “Deterministic ✓” when rulesHash + inputs resolve to prior signature.

Ops & replay

  • Bundle a tiny CLI: stellaops-explain verify graph.json --evidence ./blobs/.
  • Verification checks: all hashes match, DSSE signature valid, rulesHash known, verdict derivable from steps.

Where to start (1-week sprint)

  • Day 1-2: Model + Mongo collections + signer service.
  • Day 3: Scanner adapters emit EvidenceNode records; policy engine emits ReasoningStep.
  • Day 4: Verdict assembly + DSSE signing + export bundle.
  • Day 5: Minimal UI trace panel + CLI verifier.

If you want, I can generate the Mongo schemas, a DSSE signing helper, and the React/Angular trace panel stub next.

Here's a concrete implementation plan you can hand to your developers so they're not guessing what to build.

I'll break it down by phases, and inside each phase I'll call out the owner, deliverables, and acceptance criteria.


Phase 0: Scope & decisions (½ day)

Goal: Lock in the “rules of the game” so nobody bikesheds later.

Decisions to confirm (write in a short ADR):

  1. Canonical representation & hashing

    • Format for hashing: canonical JSON (stable property ordering, UTF-8, no whitespace).

    • Algorithm: SHA-256 for:

      • ExplanationGraph document
      • each EvidenceNode
    • Hash scope:

      • evidence.hash = hash of the raw evidence blob (or canonical subset if huge)
      • graphHash = hash of the entire explanation graph document (minus signature).
  2. Signing

    • Format: DSSE envelope (payloadType = "stellaops/explanation-graph@v1").

    • Key management: use the existing offline signing key, or Sigstore-style keyless if already in the org.

    • Signature attached as:

      • provenance.attestation field inside each verdict and
      • stored in a separate explanation_signatures collection or S3 path for replay.
  3. Storage

    • Metadata: MongoDB collection explanation_graphs.

    • Evidence blobs:

      • S3 (or compatible) bucket stella-explanations/ with layout:

        • evidence/{evidenceId} or evidence/{hash}.
  4. ID formats

    • scanId: UUID (string).
    • verdictId, evidenceId, stepId: UUID (string).
    • buildId: reuse existing convention (elf:<buildid>, pe:<guid>, macho:<uuid>).

Deliverable: 1-2 page ADR in the repo (/docs/adr/000-explanation-graph.md).


Phase 1: Domain model & persistence (backend)

Owner: Backend

1.1. Define core C# domain models

Place in StellaOps.Explanations project or equivalent:

public record ArtifactRef(
    string Purl,
    string Digest,
    string BuildId);

public record ExternalRef(
    string Type,     // "advisory", "vex", "nvd", etc.
    string Id,
    string Url);

public record EvidenceNode(
    string EvidenceId,
    string Kind,         // "binary.callgraph", "source.scan", "external.feed", ...
    string Hash,         // sha256 of blob
    string Summary,
    string BlobPointer,  // s3://..., file://..., ipfs://...
    ExternalRef? ExternalRef = null);

public record ReasoningStep(
    string StepId,
    string Type,         // "callgraph.reachable", "version.match", ...
    string EvidenceRef); // EvidenceId

public record Provenance(
    string Scanner,
    string RulesHash,    // hash of rules/policy bundle used
    DateTimeOffset Time,
    string Attestation); // DSSE envelope (base64 or JSON)

public record Verdict(
    string VerdictId,
    string Cve,
    string Status,       // "affected", "not_affected", "under_investigation"
    string Policy,       // e.g. "vex.lattice:v1"
    IReadOnlyList<ReasoningStep> Reasoning,
    Provenance Provenance);

public record ExplanationGraph(
    Guid ScanId,
    ArtifactRef Artifact,
    IReadOnlyList<Verdict> Verdicts,
    IReadOnlyList<EvidenceNode> Evidence,
    string GraphHash);   // sha256 of canonical JSON

1.2. MongoDB schema

Collection: explanation_graphs

Document shape:

{
  "_id": "scanId:artifactDigest",   // composite key or just ObjectId + separate fields
  "scanId": "uuid",
  "artifact": {
    "purl": "pkg:docker/redis@7.2.4",
    "digest": "sha256:...",
    "buildId": "elf:abcd..."
  },
  "verdicts": [ /* Verdict[] */ ],
  "evidence": [ /* EvidenceNode[] */ ],
  "graphHash": "sha256:..."
}

Indexes:

  • { scanId: 1 }
  • { "artifact.digest": 1 }
  • { "verdicts.cve": 1, "artifact.digest": 1 } (compound)
  • Optional: TTL or archiving mechanism if you don't want to keep these forever.
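
With the MongoDB C# driver, creating these indexes is a few lines. A sketch, where ExplanationGraphDocument stands in for however you map the document shape above:

using MongoDB.Driver;

// Assumes ExplanationGraphDocument mirrors the document shape above.
var collection = database.GetCollection<ExplanationGraphDocument>("explanation_graphs");

await collection.Indexes.CreateManyAsync(new[]
{
    new CreateIndexModel<ExplanationGraphDocument>(
        Builders<ExplanationGraphDocument>.IndexKeys.Ascending("scanId")),
    new CreateIndexModel<ExplanationGraphDocument>(
        Builders<ExplanationGraphDocument>.IndexKeys.Ascending("artifact.digest")),
    new CreateIndexModel<ExplanationGraphDocument>(
        Builders<ExplanationGraphDocument>.IndexKeys
            .Ascending("verdicts.cve")
            .Ascending("artifact.digest"))
});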

Acceptance criteria:

  • You can serialize/deserialize ExplanationGraph to Mongo without loss.
  • Indexes exist and queries by scanId, artifact.digest, and (digest + CVE) are efficient.

Phase 2: Evidence ingestion plumbing

Goal: Make every relevant raw fact show up as an EvidenceNode.

Owner: Backend scanner team

2.1. Evidence factory service

Create IEvidenceService:

public interface IEvidenceService
{
    Task<EvidenceNode> StoreBinaryCallgraphAsync(
        Guid scanId,
        ArtifactRef artifact,
        byte[] callgraphBytes,
        string summary,
        ExternalRef? externalRef = null);

    Task<EvidenceNode> StoreSourceScanAsync(
        Guid scanId,
        ArtifactRef artifact,
        byte[] scanResultJson,
        string summary);

    Task<EvidenceNode> StoreExternalFeedAsync(
        Guid scanId,
        ExternalRef externalRef,
        byte[] rawPayload,
        string summary);
}

Implementation tasks:

  1. Hash computation

    • Compute SHA-256 over the raw bytes.
    • Prefer a small shared helper, e.g.:

      // requires: using System.Security.Cryptography;
      public static string Sha256Hex(ReadOnlySpan<byte> data) =>
          Convert.ToHexString(SHA256.HashData(data)).ToLowerInvariant();

  2. Blob storage

    • S3 key format, e.g.: explanations/{scanId}/{evidenceId}.
    • BlobPointer string = s3://stella-explanations/explanations/{scanId}/{evidenceId}.
  3. EvidenceNode creation

    • Generate evidenceId = Guid.NewGuid().ToString("N").
    • Populate kind, hash, summary, blobPointer, externalRef.
  4. Graph assembly contract

    • Evidence service does not write to Mongo.
    • It only uploads blobs and returns EvidenceNode objects.
    • The ExplanationGraphBuilder (next phase) collects them.
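
Putting the tasks above together, one method of the service could look roughly like this. IBlobStore and its UploadAsync signature are assumptions standing in for your S3 client; the other interface methods follow the same pattern:

public sealed class EvidenceService
{
    private readonly IBlobStore _blobs;   // hypothetical wrapper over the S3 client

    public EvidenceService(IBlobStore blobs) => _blobs = blobs;

    public async Task<EvidenceNode> StoreBinaryCallgraphAsync(
        Guid scanId, ArtifactRef artifact, byte[] callgraphBytes,
        string summary, ExternalRef? externalRef = null)
    {
        var evidenceId = Guid.NewGuid().ToString("N");
        var hash = $"sha256:{Sha256Hex(callgraphBytes)}";

        // Upload the raw blob; the returned pointer is what goes into the graph.
        var key = $"explanations/{scanId}/{evidenceId}";
        var blobPointer = await _blobs.UploadAsync("stella-explanations", key, callgraphBytes);

        // No Mongo write here: the ExplanationGraphBuilder collects the returned node.
        return new EvidenceNode(evidenceId, "binary.callgraph", hash, summary, blobPointer, externalRef);
    }

    // StoreSourceScanAsync / StoreExternalFeedAsync follow the same pattern with different kinds.
}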

Acceptance criteria:

  • Given a callgraph binary, a corresponding EvidenceNode is returned with:

    • hash matching the blob (verified in tests),
    • blob present in S3,
    • summary populated.

Phase 3: Reasoning & policy integration

Goal: Instrument your existing VEX / lattice policy engine to emit deterministic reasoning steps instead of just a boolean status.

Owner: Policy / rules engine team

3.1. Expose rule evaluation trace

Assume you already have something like:

VulnerabilityStatus Evaluate(ArtifactRef artifact, string cve, Findings findings);

Extend it to:

public sealed class RuleEvaluationTrace
{
    public string StepType { get; init; }          // e.g. "version.match"
    public string RuleId { get; init; }            // "rule:openssl:versionFromElf"
    public string Description { get; init; }       // human-readable explanation
    public string EvidenceKind { get; init; }      // to match with EvidenceService
    public object EvidencePayload { get; init; }   // callgraph bytes, json, etc.
}

public sealed class EvaluationResult
{
    public string Status { get; init; }            // "affected", etc.
    public IReadOnlyList<RuleEvaluationTrace> Trace { get; init; }
}

New API:

EvaluationResult EvaluateWithTrace(
    ArtifactRef artifact, string cve, Findings findings);
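
For example, an existing version-match rule could surface its decision like this (the rule name and the shape of Findings here are illustrative, not your actual API):

// Illustrative rule emitting a trace entry alongside its decision.
public RuleEvaluationTrace EvaluateVersionMatch(ArtifactRef artifact, Findings findings)
{
    var detected = findings.DetectedComponents
        .First(c => c.Name == "openssl");              // assumed Findings shape

    return new RuleEvaluationTrace
    {
        StepType = "version.match",
        RuleId = "rule:openssl:versionFromElf",
        Description = $"Detected openssl {detected.Version} via SONAME + build-id",
        EvidenceKind = "source.scan",
        EvidencePayload = detected                     // serialized and stored by the EvidenceService
    };
}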

3.2. From trace to ReasoningStep + EvidenceNode

Create ExplanationGraphBuilder:

public interface IExplanationGraphBuilder
{
    Task<ExplanationGraph> BuildAsync(
        Guid scanId,
        ArtifactRef artifact,
        IReadOnlyList<CveFinding> cveFindings,
        string scannerName);
}

Internal algorithm for each CveFinding:

  1. Call EvaluateWithTrace(artifact, cve, finding) to get EvaluationResult.

  2. For each RuleEvaluationTrace:

    • Use EvidenceService with appropriate method based on EvidenceKind.

    • Get back an EvidenceNode with evidenceId.

    • Create ReasoningStep:

      • StepId = Guid.NewGuid().ToString("N")
      • Type = trace.StepType
      • EvidenceRef = evidenceNode.EvidenceId
  3. Assemble Verdict:

var verdict = new Verdict(
    VerdictId: Guid.NewGuid().ToString("N"),
    Cve: finding.Cve,
    Status: result.Status,
    Policy: "vex.lattice:v1",
    Reasoning: steps,
    Provenance: new Provenance(
        Scanner: scannerName,
        RulesHash: rulesBundleHash,
        Time: DateTimeOffset.UtcNow,
        Attestation: "" // set in Phase 4
    )
);
  4. Collect:

    • all EvidenceNodes (dedupe by hash to avoid duplicates).
    • all Verdicts.
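
A condensed sketch of that loop inside the builder (the _policy and _evidenceService fields, the StoreForKindAsync dispatch helper, and BuildVerdict are assumptions; error handling omitted):

public async Task<ExplanationGraph> BuildAsync(
    Guid scanId, ArtifactRef artifact,
    IReadOnlyList<CveFinding> cveFindings, string scannerName)
{
    var evidenceByHash = new Dictionary<string, EvidenceNode>();
    var verdicts = new List<Verdict>();

    foreach (var finding in cveFindings)
    {
        var result = _policy.EvaluateWithTrace(artifact, finding.Cve, finding.Findings);
        var steps = new List<ReasoningStep>();

        foreach (var trace in result.Trace)
        {
            // Store the raw payload as an evidence blob; reuse the first node seen for a given hash.
            var node = await _evidenceService.StoreForKindAsync(scanId, artifact, trace); // assumed dispatch helper
            node = evidenceByHash.TryAdd(node.Hash, node) ? node : evidenceByHash[node.Hash];
            steps.Add(new ReasoningStep(Guid.NewGuid().ToString("N"), trace.StepType, node.EvidenceId));
        }

        verdicts.Add(BuildVerdict(finding, result, steps, scannerName)); // wraps the Verdict construction shown above
    }

    return new ExplanationGraph(scanId, artifact, verdicts, evidenceByHash.Values.ToList(),
                                GraphHash: ""); // hash + attestation are filled in during Phase 4
}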

Acceptance criteria:

  • Given deterministic inputs (scan + rules bundle hash), repeated runs produce:

    • same sequence of ReasoningStep types,
    • same set of EvidenceNode.hash values,
    • same status.

Phase 4: Graph hashing & DSSE signing

Owner: Security / platform

4.1. Canonical JSON for hash

Implement:

public static class ExplanationGraphSerializer
{
    public static string ToCanonicalJson(ExplanationGraph graph)
    {
        // no graphHash, no attestation in this step
    }
}

Key requirements:

  • Consistent property ordering (e.g. alphabetical).
  • No extra whitespace.
  • UTF-8 encoding.
  • Primitive formatting options fixed (e.g. date as ISO 8601 with Z).
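
A minimal sketch of such a serializer on top of System.Text.Json (sorting via JsonNode is one way to satisfy the requirements above, not necessarily the final implementation):

using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.Json;
using System.Text.Json.Nodes;

public static class ExplanationGraphSerializer
{
    public static string ToCanonicalJson(ExplanationGraph graph)
    {
        // Clear the fields that must not be part of the hashed payload.
        var stripped = graph with
        {
            GraphHash = "",
            Verdicts = graph.Verdicts
                .Select(v => v with { Provenance = v.Provenance with { Attestation = "" } })
                .ToList()
        };

        // Note: a custom DateTimeOffset converter may be needed to force "Z" instead of "+00:00".
        var node = JsonSerializer.SerializeToNode(stripped,
            new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });

        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream)) // default writer: no indentation
        {
            WriteSorted(node, writer);
        }
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    // Recursively re-emit the tree with object properties in ordinal (alphabetical) order.
    private static void WriteSorted(JsonNode? node, Utf8JsonWriter writer)
    {
        switch (node)
        {
            case JsonObject obj:
                writer.WriteStartObject();
                foreach (var kv in obj.OrderBy(p => p.Key, StringComparer.Ordinal))
                {
                    writer.WritePropertyName(kv.Key);
                    WriteSorted(kv.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonArray arr:
                writer.WriteStartArray();
                foreach (var item in arr) WriteSorted(item, writer);
                writer.WriteEndArray();
                break;
            case null:
                writer.WriteNullValue();
                break;
            default:
                node.WriteTo(writer); // strings, numbers, booleans
                break;
        }
    }
}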

4.2. Hash and sign

Before persisting:

var graphWithoutHash = graph with { GraphHash = "" };
var canonicalJson = ExplanationGraphSerializer.ToCanonicalJson(graphWithoutHash);
var graphHash = Sha256Hex(Encoding.UTF8.GetBytes(canonicalJson));

// sign DSSE envelope
var envelope = dsseSigner.Sign(
    payloadType: "stellaops/explanation-graph@v1",
    payload: Encoding.UTF8.GetBytes(canonicalJson)
);

// attach
var signedVerdicts = graph.Verdicts
    .Select(v => v with
    {
        Provenance = v.Provenance with { Attestation = envelope.ToJson() }
    })
    .ToList();

var finalGraph = graph with
{
    GraphHash = $"sha256:{graphHash}",
    Verdicts = signedVerdicts
};

Then write finalGraph to Mongo.
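
If you don't already have a dsseSigner to inject, a bare-bones envelope builder could look like this. ECDSA P-256 and the keyid handling are placeholders, and it returns the envelope JSON string directly (so envelope.ToJson() above would simply be the returned value); the PAE encoding follows the DSSE spec:

using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

public sealed class DsseSigner
{
    private readonly ECDsa _key;      // placeholder: load from your offline signing key / KMS
    private readonly string _keyId;

    public DsseSigner(ECDsa key, string keyId) => (_key, _keyId) = (key, keyId);

    public string Sign(string payloadType, byte[] payload)
    {
        // DSSE PAE: "DSSEv1" SP len(type) SP type SP len(payload) SP payload
        var pae = Encoding.UTF8.GetBytes(
                $"DSSEv1 {Encoding.UTF8.GetByteCount(payloadType)} {payloadType} {payload.Length} ")
            .Concat(payload).ToArray();

        var signature = _key.SignData(pae, HashAlgorithmName.SHA256);

        // Standard DSSE envelope JSON.
        return JsonSerializer.Serialize(new
        {
            payload = Convert.ToBase64String(payload),
            payloadType,
            signatures = new[] { new { keyid = _keyId, sig = Convert.ToBase64String(signature) } }
        });
    }
}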

Acceptance criteria:

  • Recomputing graphHash from Mongo document (zeroing graphHash and attestation) matches stored value.
  • Verifying DSSE signature with the public key succeeds.

Phase 5: Backend APIs & export bundle

Owner: Backend / API

5.1. Read APIs

Add endpoints (REST-ish):

  1. Get graph for scan-artifact

    GET /explanations/scans/{scanId}/artifacts/{digest}

    • Returns entire ExplanationGraph JSON.
  2. Get single verdict

    GET /explanations/scans/{scanId}/artifacts/{digest}/cves/{cve}

    • Returns Verdict + its subset of EvidenceNodes.
  3. Search by CVE

    GET /explanations/search?cve=CVE-2024-XXXX&digest=sha256:...

    • Returns list of (scanId, artifact, verdictId).
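
With ASP.NET Core minimal APIs, the first endpoint can be wired roughly like this (IExplanationGraphStore and its FindAsync method are assumptions for whatever repository fronts Mongo):

app.MapGet("/explanations/scans/{scanId:guid}/artifacts/{digest}",
    async (Guid scanId, string digest, IExplanationGraphStore store) =>
    {
        // Lookup by the indexed (scanId, artifact.digest) pair.
        var graph = await store.FindAsync(scanId, digest);
        return graph is null ? Results.NotFound() : Results.Ok(graph);
    });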

5.2. Export replay bundle

POST /explanations/{scanId}/{digest}/export

Implementation:

  • Create a temporary directory.

  • Write:

    • graph.json → ExplanationGraph as stored.

    • signature.json → DSSE envelope alone (optional).

    • Evidence blobs:

      • For each EvidenceNode:

        • Download from S3 and store as evidence/{evidenceId}.
  • Zip the folder: explanation-{scanId}-{shortDigest}.zip.

  • Stream as download.
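
The zip assembly itself stays small with System.IO.Compression. A sketch, where blobStore.DownloadAsync and jsonOptions are placeholders for your S3 client and serializer settings:

using System.IO.Compression;
using System.Text.Json;

var workDir = Directory.CreateTempSubdirectory().FullName;        // .NET 7+; otherwise Path.GetTempPath() + a GUID
Directory.CreateDirectory(Path.Combine(workDir, "evidence"));

// graph.json exactly as stored (hash/signature must keep verifying after export).
await File.WriteAllTextAsync(Path.Combine(workDir, "graph.json"),
    JsonSerializer.Serialize(graph, jsonOptions));

foreach (var node in graph.Evidence)
{
    var bytes = await blobStore.DownloadAsync(node.BlobPointer);  // placeholder S3 download
    await File.WriteAllBytesAsync(Path.Combine(workDir, "evidence", node.EvidenceId), bytes);
}

var zipPath = Path.Combine(Path.GetTempPath(), $"explanation-{graph.ScanId}-{shortDigest}.zip");
ZipFile.CreateFromDirectory(workDir, zipPath);
// Stream zipPath back to the caller, then delete workDir.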

5.3. CLI verifier

Small .NET / Go CLI:

Commands:

stellaops-explain verify graph.json --evidence ./evidence

Verification steps:

  1. Load graph.json, parse to ExplanationGraph.

  2. Strip graphHash & attestation, reserialize canonical JSON.

  3. Recompute SHA-256 and compare to graphHash.

  4. Verify DSSE envelope with public key.

  5. For each EvidenceNode:

    • Read file ./evidence/{evidenceId}.
    • Recompute hash and compare with evidence.hash.

Exit with a non-zero code if anything fails; print a short summary.
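
A condensed sketch of the verifier core (it reuses ExplanationGraphSerializer and Sha256Hex from earlier; DsseVerifier is an assumed helper mirroring the signer, and key distribution is up to you):

static int Verify(string graphPath, string evidenceDir, ECDsa publicKey)
{
    var json = File.ReadAllText(graphPath);
    var graph = JsonSerializer.Deserialize<ExplanationGraph>(
        json, new JsonSerializerOptions { PropertyNameCaseInsensitive = true })!;

    // 1) Recompute the graph hash over the canonical form (graphHash/attestation stripped inside).
    var canonical = ExplanationGraphSerializer.ToCanonicalJson(graph);
    var recomputed = $"sha256:{Sha256Hex(Encoding.UTF8.GetBytes(canonical))}";
    if (recomputed != graph.GraphHash)
        return Fail($"graph hash mismatch: {recomputed} != {graph.GraphHash}");

    // 2) Verify the DSSE envelope carried in the verdict provenance (same envelope on every verdict).
    var envelopeJson = graph.Verdicts[0].Provenance.Attestation;
    if (!DsseVerifier.Verify(envelopeJson, publicKey))                 // assumed helper mirroring DsseSigner
        return Fail("DSSE signature invalid");

    // 3) Re-hash every evidence blob in the bundle.
    foreach (var node in graph.Evidence)
    {
        var bytes = File.ReadAllBytes(Path.Combine(evidenceDir, node.EvidenceId));
        if ($"sha256:{Sha256Hex(bytes)}" != node.Hash)
            return Fail($"evidence hash mismatch for {node.EvidenceId}");
    }

    Console.WriteLine("OK: graph hash, signature and evidence hashes all verified");
    return 0;

    static int Fail(string message) { Console.Error.WriteLine($"FAIL: {message}"); return 1; }
}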

Acceptance criteria:

  • Export bundle round-trips: verify passes on an exported zip.
  • APIs documented in OpenAPI / Swagger.

Phase 6: UI explanation trace panel

Owner: Frontend

6.1. API integration

New calls in frontend client:

  • GET /explanations/scans/{scanId}/artifacts/{digest}
  • Optionally GET /explanations/.../cves/{cve} if you want lazy loading per CVE.

6.2. Component UX

On the “vulnerability detail” view:

  • Add an “Explanation” tab with three sections plus an export action:
  1. Verdict summary

    • Badge: Affected / Not affected / Under investigation.
    • Text: Derived using policy {policy}, rules hash {rulesHash[..8]}.
  2. Reasoning timeline

    • Vertical list of ReasoningSteps:

      • Icon per type (e.g. “flow” icon for callgraph.reachable).
      • Title = Type (humanized).
      • Click to expand underlying EvidenceNode.summary.
      • Optional “View raw evidence” link (downloads blob via S3 signed URL).
  3. Provenance

    • Show:

      • scanner
      • rulesHash
      • time
      • “Attested ✓” if DSSE verifies on the backend (or precomputed).
  4. Export

    • Button: “Download replay bundle (.zip)”
    • Calls export endpoint and triggers browser download.

Acceptance criteria:

  • For any CVE in UI, a user can:

    • See why it is (not) affected in at most 2 clicks.
    • Download a replay bundle via the UI.

Phase 7: Testing strategy

Owner: QA + all devs

7.1. Unit tests

  • EvidenceService:

    • Hash matches blob contents.
    • BlobPointer formats are as expected.
  • ExplanationGraphBuilder:

    • Given a fixed test input, the resulting graph JSON matches a golden file.
  • Serializer:

    • Canonical JSON is stable under property reordering in the code.

7.2. Integration tests

  • End-to-end fake scan:

    • Simulate scanner output + rules.
    • Build graph → persist → fetch via API.
    • Run CLI verify on exported bundle in CI.

7.3. Security tests

  • Signature tampering:

    • Modify graph.json in exported bundle; verify must fail.
  • Evidence tampering:

    • Modify an evidence file; verify must fail.
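
For example, the evidence-tampering case as an xUnit test (TestBundles and BundleVerifier are hypothetical helpers wrapping the export and verify logic):

using System.IO;
using System.Linq;
using Xunit;

public class TamperingTests
{
    [Fact]
    public void Verify_fails_when_an_evidence_blob_is_modified()
    {
        // Export a known-good bundle into a temp directory (hypothetical fixture).
        var bundleDir = TestBundles.ExportSampleBundle();

        // Flip one byte in the first evidence blob.
        var evidenceFile = Directory.GetFiles(Path.Combine(bundleDir, "evidence")).First();
        var bytes = File.ReadAllBytes(evidenceFile);
        bytes[0] ^= 0xFF;
        File.WriteAllBytes(evidenceFile, bytes);

        // The verifier must reject the bundle (hash mismatch on that evidence node).
        var exitCode = BundleVerifier.Verify(
            Path.Combine(bundleDir, "graph.json"),
            Path.Combine(bundleDir, "evidence"));

        Assert.NotEqual(0, exitCode);
    }
}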

Phase 8: Rollout

Owner: PM / Tech lead

  1. Feature flag

    • Start with explanation graph generation behind a flag for:

      • subset of scanners,
      • subset of tenants.
  2. Backfill (optional)

    • If useful, run a one-off job that:

      • Takes recent scans,
      • Rebuilds explanation graphs,
      • Stores them in Mongo.
  3. Docs

    • Short doc page for customers:

      • “What is an Explanation Graph?”
      • “How to verify it with the CLI?”

Developer checklist (TL;DR)

You can literally drop this into Jira as epics/tasks:

  1. Backend

    • Implement domain models (ExplanationGraph, Verdict, EvidenceNode, etc.).
    • Implement IEvidenceService + S3 integration.
    • Extend policy engine to EvaluateWithTrace.
    • Implement ExplanationGraphBuilder.
    • Implement canonical serializer, hashing, DSSE signing.
    • Implement Mongo persistence + indexes.
    • Implement REST APIs + export ZIP.
  2. Frontend

    • Wire new APIs into the vulnerability detail view.
    • Build Explanation tab (Summary / Reasoning / Provenance).
    • Implement “Download replay bundle” button.
  3. Tools

    • Implement stellaops-explain verify CLI.
    • Add CI test that runs verify against a sample bundle.
  4. QA

    • Golden-file tests for graphs.
    • Signature & evidence tampering tests.
    • UI functional tests on explanations.

If you'd like, as a next step I can turn this into:

  • concrete OpenAPI spec for the new endpoints, and/or
  • a sample stellaops-explain verify CLI skeleton (C# or Go).