# Verifiable Proof Spine: Receipts + Benchmarks

Compiled: 2025-12-01 (UTC)

## Why this matters
Move from “trust the scanner” to “prove every verdict.” Each finding and every “not affected” claim must carry cryptographic, replayable evidence that anyone can verify offline or online.

## Differentiators to build in
- **Graph Revision ID on every verdict**: stable Merkle root over SBOM nodes/edges, policies, feeds, scan params, and tool versions. Any data change → new graph hash → new revisioned verdicts; surface the ID in UI/API.
- **Machine-verifiable receipts (DSSE)**: emit a DSSE-wrapped in-toto statement per verdict (predicate `stellaops.dev/verdict@v1`) including graphRevisionId, artifact digests, rule id/version, inputs, and timestamps; sign with Authority keys (offline mode supported) and keep receipts queryable/exportable; mirror to Rekor-compatible ledger when online.
- **Reachability evidence**: attach call-stack slices (entry→sink, symbols, file:line) for code-level cases; for binaries, include symbol presence proofs (bitmap/offsets) hashed and referenced from DSSE payloads.
- **Deterministic replay manifests**: publish `replay.manifest.json` with inputs, feeds, rule/tool/container digests so auditors can recompute the same graph hash and verdicts offline.

## Benchmarks to publish (headline KPIs)
- **False-positive reduction vs baseline scanners**: run public corpus across 3–4 popular scanners; label ground truth once; report mean and p95 FP reduction.
- **Proof coverage**: percentage of findings/VEX items carrying valid DSSE receipts; break out reachable vs unreachable and “not affected.”
- **Triage time saved**: analyst minutes from alert to final disposition with receipts visible vs hidden; publish p50/p95 deltas.
- **Determinism stability**: re-run identical scans across nodes; publish % identical graph hashes and explain drift causes when different.

## Minimal implementation plan (week-by-week)
- **Week 1 – Primitives**: add Graph Revision ID generator in scanner.webservice (Merkle over normalized SBOM+edges+policies+toolVersions); define `VerdictReceipt` schema (protobuf/JSON) and DSSE envelope types.
- **Week 2 – Signing + storage**: wire DSSE signing via Authority with offline key support/rotation; persist receipts in `Receipts` table keyed by (graphRevisionId, verdictId); enable JSONL export and ledger mirror.
- **Week 3 – Reachability proofs**: capture call-stack slices in reachability engine; serialize and hash; add binary symbol proof module (ELF/PE bitmap + digest) and reference from receipts.
- **Week 4 – Replay + UX**: emit replay.manifest per scan (inputs, tool digests); UI shows “Verified” badge, graph hash, signature issuer, and one-click “Copy receipt”; API: `GET /verdicts/{id}/receipt`, `GET /graphs/{rev}/replay`.
- **Week 5 – Benchmarks harness**: create `bench/` fixtures and runner with baseline scanner adapters, ground-truth labels, metrics export for FP%, proof coverage, triage time capture hooks.

## Developer guardrails (non-negotiable)
- **No receipt, no ship**: any surfaced verdict must carry a DSSE receipt; fail closed otherwise.
- **Schema freeze windows**: changes to rule inputs or policy logic must bump rule version and therefore graph hash.
- **Replay-first CI**: PRs touching scanning/rules must pass a replay test that reproduces prior graph hashes on gold fixtures.
- **Clock safety**: use monotonic time for receipts; include UTC wall-time separately.

## What to show buyers/auditors
- Audit kit: sample container + receipts + replay manifest + one command to reproduce the same graph hash.
- One-page benchmark readout: FP reduction, proof coverage, triage time saved (p50/p95), corpus description.

## Optional follow-ons
- Provide DSSE predicate schema, Postgres DDL for `Receipts` and `Graphs`, and a minimal .NET verification CLI (`stellaops-verify`) that replays a manifest and validates signatures.