diff --git a/docs/product-advisories/unprocessed/21-Dec-2025 - Designing Explainable Triage Workflows.md b/docs/product-advisories/unprocessed/21-Dec-2025 - Designing Explainable Triage Workflows.md new file mode 100644 index 000000000..4e43a4e3f --- /dev/null +++ b/docs/product-advisories/unprocessed/21-Dec-2025 - Designing Explainable Triage Workflows.md @@ -0,0 +1,154 @@ +Below are operating guidelines for Product and Development Managers to deliver a “vulnerability-first + reachability + multi-analyzer + single built-in attested verdict” capability as a coherent, top-of-market feature set. + +## 1) Product north star and non-negotiables + +**North star:** Every vulnerability finding must resolve to a **policy-backed, reachability-informed, runtime-corroborated verdict** that is **exportable as one signed attestation attached to the built artifact**. + +**Non-negotiables** + +* **Vulnerability-first UX:** Users start from a CVE/finding and immediately see applicability, reachability, runtime corroboration, and policy rationale. +* **Single canonical verdict artifact:** One built-in, signed verdict attestation per subject (OCI digest), replayable (“same inputs → same output”). +* **Deterministic evidence:** Evidence objects are content-hashed and versioned (feeds, policies, analyzers, graph snapshots). +* **Unknowns are first-class:** “Unknown reachability/runtime/config” is not hidden; it is budgeted and policy-controlled. + +## 2) Scope: what “reachability” means across analyzers + +PMs must define reachability per layer and force consistent semantics: + +1. **Source reachability** + + * Entry points → call graph → vulnerable function/symbol (proof subgraph stored). +2. **Language dependency reachability** + + * Resolved dependency graph + vulnerable component mapping + (where feasible) call-path to vulnerable code. +3. **OS dependency applicability** + + * Installed package inventory + file ownership + linkage/usage hints (where available). +4. **Binary mapping reachability** + + * Build-ID / symbol tables / imports + (optional) DWARF/source map; fallback heuristics are explicitly labeled. +5. **Runtime corroboration (eBPF / runtime sensors)** + + * Execution facts: library loads, syscalls, network exposure, process ancestry; mapped to a “supports/contradicts/unknown” posture for the finding. + +**Manager rule:** Any analyzer that cannot produce a proof object must emit an explicit “UNKNOWN with reason code,” never a silent “not reachable.” + +## 3) The decision model: a strict, explainable merge into one verdict + +Adopt a small fixed set of verdicts and require all teams to use them: + +* `AFFECTED`, `NOT_AFFECTED`, `MITIGATED`, `NEEDS_REVIEW` + +Each verdict must carry: + +* **Reason steps** (policy/lattice merge trace) +* **Confidence score** (bounded; explainable inputs) +* **Counterfactuals** (“what would flip this verdict”) +* **Evidence pointers** (hashes to proof objects) + +**PM guidance on precedence:** Do not hardcode “vendor > distro > internal.” Require a policy-defined merge (lattice semantics) where evidence quality and freshness influence trust. + +## 4) Built-in attestation as the primary deliverable + +**Deliverable:** An OCI-attached DSSE/in-toto style attestation called (example) `stella.verdict.v1`. 
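As a non-normative sketch of the envelope shape, assuming a standard in-toto statement inside a DSSE envelope: all digests, names, and the predicate body below are placeholders, and the "Minimum contents" list that follows remains the normative field set.

```python
import base64
import json

# In-toto statement carrying the verdict predicate. Digests, names, and the
# predicate body are placeholders; the "Minimum contents" list below is the
# normative field set.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{"name": "registry.example.com/app", "digest": {"sha256": "..."}}],
    "predicateType": "stella.verdict.v1",
    "predicate": {"verdict": "NOT_AFFECTED", "policy_bundle_hash": "sha256:..."},
}

# DSSE envelope: base64 of the canonical payload bytes plus detached signatures.
envelope = {
    "payloadType": "application/vnd.in-toto+json",
    "payload": base64.b64encode(
        json.dumps(statement, sort_keys=True, separators=(",", ":")).encode()
    ).decode(),
    "signatures": [{"keyid": "<signer-key-id>", "sig": "<base64-signature>"}],
}
```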
+ +Minimum contents: + +* Subject: image digest(s) +* Inputs: feed snapshot IDs, analyzer versions/digests, policy bundle hash, time window, environment tags +* Per-CVE records: component, installed version, fixed version, verdict, confidence, reason steps +* Proof pointers: reachability subgraph hash, runtime fact hashes, config/exposure facts hash +* Replay manifest: “verify this verdict” command + inputs hash + +**Acceptance criterion:** A third party can validate signature and replay deterministically using exported inputs, obtaining byte-identical verdict output. + +## 5) UX requirements (vulnerability-first, proof-linked) + +PMs must enforce these UX invariants: + +* Finding row shows: Verdict chip + confidence + “why” one-liner + proof badges (Reachability / Runtime / Policy / Provenance). +* Click-through yields: + + * Policy explanation (human-readable steps) + * Evidence graph (hashes, issuers, timestamps, signature status) + * Reachability mini-map (stored subgraph) + * Runtime corroboration timeline (windowed) + * Export: “Audit pack” (verdict + proofs + inputs) + +**Rule:** Any displayed claim must link to a proof node or be explicitly marked “operator note.” + +## 6) Engineering execution rules (to keep this shippable) + +**Modular contracts** + +* Each analyzer outputs into a shared internal schema (typed nodes/edges + content hashes). +* Evidence objects are immutable; updates create new objects (versioned snapshots). + +**Performance strategy** + +* Vulnerability-first query plan: build “vulnerable element set” per CVE, then run targeted reachability; avoid whole-program graphs unless needed. +* Progressive fidelity: fast heuristic → deeper proof when requested; verdict must reflect confidence accordingly. + +**Determinism** + +* Pin all feeds/policies/analyzer images by digest. +* Canonical serialization for graphs and verdicts. +* Stable hashing rules documented and tested. + +## 7) Release gates and KPIs (what managers track weekly) + +**Quality KPIs** + +* % findings with non-UNKNOWN reachability +* % findings with runtime corroboration available (where sensor deployed) +* False-positive reduction vs baseline (measured via developer confirmations / triage outcomes) +* “Explainability completeness”: % verdicts with reason steps + at least one proof pointer +* Replay success rate: % attestations replaying deterministically in CI + +**Operational KPIs** + +* Median time to first verdict per image +* Cache hit rate for graphs/proofs +* Storage growth per scan (evidence size budgets) + +**Policy KPIs** + +* Unknown budget breaches by environment (prod/dev) +* Override/exception volume and aging + +## 8) Roadmap sequencing (recommended) + +1. **Phase 1: Single attested verdict + OS/lang SCA applicability** + + * Deterministic inputs, verdict schema, signature, OCI attach, basic policy steps. +2. **Phase 2: Source reachability proofs (top languages)** + + * Store subgraphs; introduce confidence + counterfactuals. +3. **Phase 3: Binary mapping fallback** + + * Build-ID/symbol-based reachability + explicit “heuristic” labeling. +4. **Phase 4: Runtime corroboration (eBPF) integration** + + * Evidence facts + time-window model + correlation to findings. +5. **Phase 5: Full lattice merge + Trust Algebra Studio** + + * Operator-defined semantics; evidence quality weighting; vendor trust scoring. 
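Phase 1's "deterministic inputs" goal is, in practice, a serialization discipline (see the determinism rules in section 6): hash canonical bytes, never in-memory structures. A minimal sketch of the idea, not the production scheme:

```python
import hashlib
import json

def canonical_bytes(obj) -> bytes:
    # Sorted keys, no whitespace, UTF-8: the same logical object always
    # serializes to the same bytes. A production scheme would follow a full
    # canonicalization spec (e.g., JCS) rather than bare json.dumps.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def content_hash(obj) -> str:
    return "sha256:" + hashlib.sha256(canonical_bytes(obj)).hexdigest()

# Construction order must not matter: same verdict, same hash.
a = {"verdict": "AFFECTED", "cve": "CVE-2025-0001"}
b = {"cve": "CVE-2025-0001", "verdict": "AFFECTED"}
assert content_hash(a) == content_hash(b)
```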
+ +## 9) Risk management rules (preempt common failure modes) + +* **Overclaiming:** Never present “not affected” without an evidence-backed rationale; otherwise use `NEEDS_REVIEW` with a clear missing-evidence reason. +* **Evidence sprawl:** Enforce evidence budgets (per-scan size caps) and retention tiers; “audit pack export” must remain complete even when the platform prunes caches. +* **Runtime ambiguity:** Runtime corroboration is supportive, not absolute; map to “observed/supports/contradicts/unknown” rather than binary. +* **Policy drift:** Policy bundles are versioned and pinned into attestations; changes must produce new signed verdicts (delta verdicts). + +## 10) Definition of done for the feature + +A release is “done” only if: + +* A build produces an OCI artifact with an attached **signed verdict attestation**. +* Each verdict is **explainable** (reason steps + proof pointers). +* Reachability evidence is **stored as a reproducible subgraph** (or explicitly UNKNOWN with reason). +* Replay verification reproduces the same verdict with pinned inputs. +* UX starts from vulnerabilities and links directly to proofs and audit export. + +If you want, I can turn these guidelines into: (1) a manager-ready checklist per sprint, and (2) a concrete “verdict attestation” JSON schema with canonical hashing/serialization rules. diff --git a/docs/product-advisories/unprocessed/21-Dec-2025 - How Top Scanners Shape Evidence‑First UX.md b/docs/product-advisories/unprocessed/21-Dec-2025 - How Top Scanners Shape Evidence‑First UX.md new file mode 100644 index 000000000..a8da8e8c1 --- /dev/null +++ b/docs/product-advisories/unprocessed/21-Dec-2025 - How Top Scanners Shape Evidence‑First UX.md @@ -0,0 +1,556 @@ +## Guidelines for Product and Development Managers: Signed, Replayable Risk Verdicts + +### Purpose + +Signed, replayable risk verdicts are the Stella Ops mechanism for producing a **cryptographically verifiable, audit‑ready decision** about an artifact (container image, VM image, filesystem snapshot, SBOM, etc.) that can be **recomputed later to the same result** using the same inputs (“time-travel replay”). + +This capability is not “scan output with a signature.” It is a **decision artifact** that becomes the unit of governance in CI/CD, registry admission, and audits. + +--- + +# 1) Shared definitions and non-negotiables + +## 1.1 Definitions + +**Risk verdict** +A structured decision: *Pass / Fail / Warn / Needs‑Review* (or similar), produced by a deterministic evaluator under a specific policy and knowledge state. + +**Signed** +The verdict is wrapped in a tamper‑evident envelope (e.g., DSSE/in‑toto statement) and signed using an organization-approved trust model (key-based, keyless, or offline CA). + +**Replayable** +Given the same: + +* target artifact identity +* SBOM (or derivation method) +* vulnerability and advisory knowledge state +* VEX inputs +* policy bundle +* evaluator version + …Stella Ops can **re-evaluate and reproduce the same verdict** and provide evidence equivalence. + +> Critical nuance: replayability is about *result equivalence*. Byte‑for‑byte equality is ideal but not always required if signatures/metadata necessarily vary. If byte‑for‑byte is a goal, you must strictly control timestamps, ordering, and serialization. + +--- + +## 1.2 Non-negotiables (what must be true in v1) + +1. **Verdicts are bound to immutable artifact identity** + + * Container image: digest (sha256:…) + * SBOM: content digest + * File tree: merkle root digest, or equivalent + +2. 
**Verdicts are deterministic** + + * No “current time” dependence in scoring + * No non-deterministic ordering of findings + * No implicit network calls during evaluation + +3. **Verdicts are explainable** + + * Every deny/block decision must cite the policy clause and evidence pointers that triggered it. + +4. **Verdicts are verifiable** + + * Independent verification toolchain exists (CLI/library) that validates signature and checks referenced evidence integrity. + +5. **Knowledge state is pinned** + + * The verdict references a “knowledge snapshot” (vuln feeds, advisories, VEX set) by digest/ID, not “latest.” + +--- + +## 1.3 Explicit non-goals (avoid scope traps) + +* Building a full CNAPP runtime protection product as part of verdicting. +* Implementing “all possible attestation standards.” Pick one canonical representation; support others via adapters. +* Solving global revocation and key lifecycle for every ecosystem on day one; define a minimum viable trust model per deployment mode. + +--- + +# 2) Product Management Guidelines + +## 2.1 Position the verdict as the primary product artifact + +**PM rule:** if a workflow does not end in a verdict artifact, it is not part of this moat. + +Examples: + +* CI pipeline step produces `VERDICT.attestation` attached to the OCI artifact. +* Registry admission checks for a valid verdict attestation meeting policy. +* Audit export bundles the verdict plus referenced evidence. + +**Avoid:** “scan reports” as the goal. Reports are views; the verdict is the object. + +--- + +## 2.2 Define the core personas and success outcomes + +Minimum personas: + +1. **Release/Platform Engineering** + + * Needs automated gates, reproducibility, and low friction. +2. **Security Engineering / AppSec** + + * Needs evidence, explainability, and exception workflows. +3. **Audit / Compliance** + + * Needs replay, provenance, and a defensible trail. + +Define “first value” for each: + +* Release engineer: gate merges/releases without re-running scans. +* Security engineer: investigate a deny decision with evidence pointers in minutes. +* Auditor: replay a verdict months later using the same knowledge snapshot. + +--- + +## 2.3 Product requirements (expressed as “shall” statements) + +### 2.3.1 Verdict content requirements + +A verdict SHALL contain: + +* **Subject**: immutable artifact reference (digest, type, locator) +* **Decision**: pass/fail/warn/etc. 
+* **Policy binding**: policy bundle ID + version + digest +* **Knowledge snapshot binding**: snapshot IDs/digests for vuln feed and VEX set +* **Evaluator binding**: evaluator name/version + schema version +* **Rationale summary**: stable short explanation (human-readable) +* **Findings references**: pointers to detailed findings/evidence (content-addressed) +* **Unknowns state**: explicit unknown counts and categories + +### 2.3.2 Replay requirements + +The product SHALL support: + +* Re-evaluating the same subject under the same policy+knowledge snapshot +* Proving equivalence of inputs used in the original verdict +* Producing a “replay report” that states: + + * replay succeeded and matched + * or replay failed and why (e.g., missing evidence, policy changed) + +### 2.3.3 UX requirements + +UI/UX SHALL: + +* Show verdict status clearly (Pass/Fail/…) +* Display: + + * policy clause(s) responsible + * top evidence pointers + * knowledge snapshot ID + * signature trust status (who signed, chain validity) +* Provide “Replay” as an action (even if replay happens offline, the UX must guide it) + +--- + +## 2.4 Product taxonomy: separate “verdicts” from “evaluations” from “attestations” + +This is where many products get confused. Your terminology must remain strict: + +* **Evaluation**: internal computation that produces decision + findings. +* **Verdict**: the stable, canonical decision payload (the thing being signed). +* **Attestation**: the signed envelope binding the verdict to cryptographic identity. + +PMs must enforce this vocabulary in PRDs, UI labels, and docs. + +--- + +## 2.5 Policy model guidelines for verdicting + +Verdicting depends on policy discipline. + +PM rules: + +* Policy must be **versioned** and **content-addressed**. +* Policies must be **pure functions** of declared inputs: + + * SBOM graph + * VEX claims + * vulnerability data + * reachability evidence (if present) + * environment assertions (if present) +* Policies must produce: + + * a decision + * plus a minimal explanation graph (policy rule ID → evidence IDs) + +Avoid “freeform scripts” early. You need determinism and auditability. + +--- + +## 2.6 Exceptions are part of the verdict product, not an afterthought + +PM requirement: + +* Exceptions must be first-class objects with: + + * scope (exact artifact/component range) + * owner + * justification + * expiry + * required evidence (optional but strongly recommended) + +And verdict logic must: + +* record that an exception was applied +* include exception IDs in the verdict evidence graph +* make exception usage visible in UI and audit pack exports + +--- + +## 2.7 Success metrics (PM-owned) + +Choose metrics that reflect the moat: + +* **Replay success rate**: % of verdicts that can be replayed after N days. +* **Policy determinism incidents**: number of non-deterministic evaluation bugs. +* **Audit cycle time**: time to satisfy an audit evidence request for a release. +* **Noise**: # of manual suppressions/overrides per 100 releases (should drop). +* **Gate adoption**: % of releases gated by verdict attestations (not reports). + +--- + +# 3) Development Management Guidelines + +## 3.1 Architecture principles (engineering tenets) + +### Tenet A: Determinism-first evaluation + +Engineering SHALL ensure evaluation is deterministic across: + +* OS and architecture differences (as much as feasible) +* concurrency scheduling +* non-ordered data structures + +Practical rules: + +* Never iterate over maps/hashes without sorting keys. 
+* Canonicalize output ordering (findings sorted by stable tuple: (component_id, cve_id, path, rule_id)). +* Keep “generated at” timestamps out of the signed payload; if needed, place them in an unsigned wrapper or separate metadata field excluded from signature. + +### Tenet B: Content-address everything + +All significant inputs/outputs should have content digests: + +* SBOM digest +* policy digest +* knowledge snapshot digest +* evidence bundle digest +* verdict digest + +This makes replay and integrity checks possible. + +### Tenet C: No hidden network + +During evaluation, the engine must not fetch “latest” anything. +Network is allowed only in: + +* snapshot acquisition phase +* artifact retrieval phase +* attestation publication phase + …and each must be explicitly logged and pinned. + +--- + +## 3.2 Canonical verdict schema and serialization rules + +**Engineering guideline:** pick a canonical serialization and stick to it. + +Options: + +* Canonical JSON (JCS or equivalent) +* CBOR with deterministic encoding + +Rules: + +* Define a **schema version** and strict validation. +* Make field names stable; avoid “optional” fields that appear/disappear nondeterministically. +* Ensure numeric formatting is stable (no float drift; prefer integers or rational representation). +* Always include empty arrays if required for stability, or exclude consistently by schema rule. + +--- + +## 3.3 Suggested verdict payload (illustrative) + +This is not a mandate—use it as a baseline structure. + +```json +{ + "schema_version": "1.0", + "subject": { + "type": "oci-image", + "name": "registry.example.com/app/service", + "digest": "sha256:…", + "platform": "linux/amd64" + }, + "evaluation": { + "evaluator": "stella-eval", + "evaluator_version": "0.9.0", + "policy": { + "id": "prod-default", + "version": "2025.12.1", + "digest": "sha256:…" + }, + "knowledge_snapshot": { + "vuln_db_digest": "sha256:…", + "advisory_digest": "sha256:…", + "vex_set_digest": "sha256:…" + } + }, + "decision": { + "status": "fail", + "score": 87, + "reasons": [ + { "rule_id": "RISK.CRITICAL.REACHABLE", "evidence_ref": "sha256:…" } + ], + "unknowns": { + "unknown_reachable": 2, + "unknown_unreachable": 0 + } + }, + "evidence": { + "sbom_digest": "sha256:…", + "finding_bundle_digest": "sha256:…", + "inputs_manifest_digest": "sha256:…" + } +} +``` + +Then wrap this payload in your chosen attestation envelope and sign it. + +--- + +## 3.4 Attestation format and storage guidelines + +Development managers must enforce a consistent publishing model: + +1. **Envelope** + + * Prefer DSSE/in-toto style envelope because it: + + * standardizes signing + * supports multiple signature schemes + * is widely adopted in supply chain ecosystems + +2. **Attachment** + + * OCI artifacts should carry verdicts as referrers/attachments to the subject digest (preferred). + * For non-OCI targets, store in an internal ledger keyed by the subject digest/ID. + +3. **Verification** + + * Provide: + + * `stella verify ` → checks signature and integrity references + * `stella replay ` → re-run evaluation from snapshots and compare + +4. 
**Transparency / logs** + + * Optional in v1, but plan for: + + * transparency log (public or private) to strengthen auditability + * offline alternatives for air-gapped customers + +--- + +## 3.5 Knowledge snapshot engineering requirements + +A “snapshot” must be an immutable bundle, ideally content-addressed: + +Snapshot includes: + +* vulnerability database at a specific point +* advisory sources (OS distro advisories) +* VEX statement set(s) +* any enrichment signals that influence scoring + +Rules: + +* Snapshot resolution must be explicit: “use snapshot digest X” +* Must support export/import for air-gapped deployments +* Must record source provenance and ingestion timestamps (timestamps may be excluded from signed payload if they cause nondeterminism; store them in snapshot metadata) + +--- + +## 3.6 Replay engine requirements + +Replay is not “re-run scan and hope it matches.” + +Replay must: + +* retrieve the exact subject (or confirm it via digest) +* retrieve the exact SBOM (or deterministically re-generate it from the subject in a defined way) +* load exact policy bundle by digest +* load exact knowledge snapshot by digest +* run evaluator version pinned in verdict (or enforce a compatibility mapping) +* produce: + + * verdict-equivalence result + * a delta explanation if mismatch occurs + +Engineering rule: replay must fail loudly and specifically when inputs are missing. + +--- + +## 3.7 Testing strategy (required) + +Deterministic systems require “golden” testing. + +Minimum tests: + +1. **Golden verdict tests** + + * Fixed artifact + fixed snapshots + fixed policy + * Expected verdict output must match exactly + +2. **Cross-platform determinism tests** + + * Run same evaluation on different machines/containers and compare outputs + +3. **Mutation tests for determinism** + + * Randomize ordering of internal collections; output should remain unchanged + +4. **Replay regression tests** + + * Store verdict + snapshots and replay after code changes to ensure compatibility guarantees hold + +--- + +## 3.8 Versioning and backward compatibility guidelines + +This is essential to prevent “replay breaks after upgrades.” + +Rules: + +* **Verdict schema version** changes must be rare and carefully managed. +* Maintain a compatibility matrix: + + * evaluator vX can replay verdict schema vY +* If you must evolve logic, do so by: + + * bumping evaluator version + * preserving older evaluators in a compatibility mode (containerized evaluators are often easiest) + +--- + +## 3.9 Security and key management guidelines + +Development managers must ensure: + +* Signing keys are managed via: + + * KMS/HSM (enterprise) + * keyless (OIDC-based) where acceptable + * offline keys for air-gapped + +* Verification trust policy is explicit: + + * which identities are trusted to sign verdicts + * which policies are accepted + * whether transparency is required + * how to handle revocation/rotation + +* Separate “can sign” from “can publish” + + * Signing should be restricted; publishing may be broader. 
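To close out the engineering guidelines, a minimal sketch of the golden verdict test from section 3.7. The `stella_eval` module, its `evaluate` signature, and the fixture layout are assumptions for illustration, not the real API:

```python
import json
from pathlib import Path

from stella_eval import evaluate  # assumed entry point, not a real module

GOLDEN = Path("tests/golden")  # illustrative fixture layout

def test_golden_verdict():
    # Fixed artifact + fixed policy + fixed knowledge snapshot must produce a
    # byte-identical verdict payload, release after release.
    inputs = json.loads((GOLDEN / "inputs.json").read_text())
    expected = (GOLDEN / "verdict.json").read_bytes()
    actual = evaluate(
        subject=inputs["subject_digest"],
        policy=inputs["policy_digest"],
        snapshot=inputs["knowledge_snapshot_digest"],
    )
    assert actual == expected, "verdict drifted from golden output"
```

The mutation and cross-platform tests from section 3.7 reuse the same fixtures: shuffle internal collection order or change the host, and the assertion must still hold.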
+ +--- + +# 4) Operational workflow requirements (cross-functional) + +## 4.1 CI gate flow + +* Build artifact +* Produce SBOM deterministically (or record SBOM digest if generated elsewhere) +* Evaluate → produce verdict payload +* Sign verdict → publish attestation attached to artifact +* Gate decision uses verification of: + + * signature validity + * policy compliance + * snapshot integrity + +## 4.2 Registry / admission flow + +* Admission controller checks for a valid, trusted verdict attestation +* Optionally requires: + + * verdict not older than X snapshot age (this is policy) + * no expired exceptions + * replay not required (replay is for audits; admission is fast-path) + +## 4.3 Audit flow + +* Export “audit pack”: + + * verdict + signature chain + * policy bundle + * knowledge snapshot + * referenced evidence bundles +* Auditor (or internal team) runs `verify` and optionally `replay` + +--- + +# 5) Common failure modes to avoid + +1. **Signing “findings” instead of a decision** + + * Leads to unbounded payload growth and weak governance semantics. + +2. **Using “latest” feeds during evaluation** + + * Breaks replayability immediately. + +3. **Embedding timestamps in signed payload** + + * Eliminates deterministic byte-level reproducibility. + +4. **Letting the UI become the source of truth** + + * The verdict artifact must be the authority; UI is a view. + +5. **No clear separation between: evidence store, snapshot store, verdict store** + + * Creates coupling and makes offline operations painful. + +--- + +# 6) Definition of Done checklist (use this to gate release) + +A feature increment for signed, replayable verdicts is “done” only if: + +* [ ] Verdict binds to immutable subject digest +* [ ] Verdict includes policy digest/version and knowledge snapshot digests +* [ ] Verdict is signed and verifiable via CLI +* [ ] Verification works offline (given exported artifacts) +* [ ] Replay works with stored snapshots and produces match/mismatch output with reasons +* [ ] Determinism tests pass (golden + mutation + cross-platform) +* [ ] UI displays signer identity, policy, snapshot IDs, and rule→evidence links +* [ ] Exceptions (if implemented) are recorded in verdict and enforced deterministically + +--- + +## Optional: Recommended implementation sequence (keeps risk down) + +1. Canonical verdict schema + deterministic evaluator skeleton +2. Signing + verification CLI +3. Snapshot bundle format + pinned evaluation +4. Replay tool + golden tests +5. OCI attachment publishing + registry/admission integration +6. Evidence bundles + UI explainability +7. Exceptions + audit pack export + +--- + +If you want this turned into a formal internal PRD template, I can format it as: + +* “Product requirements” (MUST/SHOULD/COULD) +* “Engineering requirements” (interfaces + invariants + test plan) +* “Security model” (trust roots, signing identities, verification policy) +* “Acceptance criteria” for an MVP and for GA diff --git a/docs/product-advisories/unprocessed/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md b/docs/product-advisories/unprocessed/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md new file mode 100644 index 000000000..9aa050260 --- /dev/null +++ b/docs/product-advisories/unprocessed/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md @@ -0,0 +1,783 @@ +Below is a practical, production-grade architecture for building a **vulnerable binaries database**. 
I’m going to be explicit about what “such a database” can mean, because there are two materially different products: + +1. **Known-build catalog**: “These exact shipped binaries (Build-ID / hash) are affected or fixed for CVE X.” +2. **Binary fingerprint DB**: “Even if the binary is unpackaged / self-built, we can match vulnerable code patterns.” + +You want both. The first gets you breadth fast; the second is the moat. + +--- + +## 1) Core principle: treat “binary identity” as the primary key + +For Linux ELF: + +* Primary: `ELF Build-ID` (from `.note.gnu.build-id`) +* Fallback: `sha256(file_bytes)` +* Add: `sha256(.text)` and/or BLAKE3 for speed + +This creates a stable identity that survives “package metadata lies.” + +**BinaryKey = build_id || file_sha256** + +--- + +## 2) High-level system diagram + +``` + ┌──────────────────────────┐ + │ Vulnerability Intel │ + │ OSV/NVD + distro advis. │ + └───────────┬──────────────┘ + │ normalize + v + ┌──────────────────────────┐ + │ Vuln Knowledge Store │ + │ CVE↔pkg ranges, patches │ + └───────────┬──────────────┘ + │ + │ +┌───────────────────────v─────────────────────────┐ +│ Repo Snapshotter (per distro/arch/date) │ +│ - mirrors metadata + packages (+ debuginfo) │ +│ - verifies signatures │ +│ - emits signed snapshot manifest │ +└───────────┬───────────────────────────┬─────────┘ + │ │ + │ packages │ debuginfo/sources + v v +┌──────────────────────────┐ ┌──────────────────────────┐ +│ Package Unpacker │ │ Source/Buildinfo Mapper │ +│ - extract files │ │ - pkg→source commit/patch │ +└───────────┬──────────────┘ └───────────┬──────────────┘ + │ binaries │ + v │ +┌──────────────────────────┐ │ +│ Binary Feature Extractor │ │ +│ - Build-ID, hashes │ │ +│ - dyn deps, symbols │ │ +│ - function boundaries (opt)│ │ +└───────────┬──────────────┘ │ + │ │ + v v +┌──────────────────────────────────────────────────┐ +│ Vulnerable Binary Classifier │ +│ Tier A: pkg/version range │ +│ Tier B: Build-ID→known shipped build │ +│ Tier C: code fingerprints (function/CFG hashes) │ +└───────────┬───────────────────────────┬──────────┘ + │ │ + v v +┌──────────────────────────┐ ┌──────────────────────────┐ +│ Vulnerable Binary DB │ │ Evidence/Attestation DB │ +│ (indexed by BinaryKey) │ │ (signed proofs, snapshots)│ +└───────────┬──────────────┘ └───────────┬──────────────┘ + │ publish signed snapshot │ + v v + Clients/Scanners Explainable VEX outputs +``` + +--- + +## 3) Data stores you actually need + +### A) Relational store (Postgres) + +Use this for *indexes and joins*. + +Key tables: + +**`binary_identity`** + +* `binary_key` (build_id or file_sha256) PK +* `build_id` (nullable) +* `file_sha256`, `text_sha256` +* `arch`, `osabi`, `type` (ET_DYN/EXEC), `stripped` +* `first_seen_snapshot`, `last_seen_snapshot` + +**`binary_package_map`** + +* `binary_key` +* `distro`, `pkg_name`, `pkg_version_release`, `arch` +* `file_path_in_pkg`, `snapshot_id` + +**`snapshot_manifest`** + +* `snapshot_id` +* `distro`, `arch`, `timestamp` +* `repo_metadata_digests`, `signing_key_id`, `dsse_envelope_ref` + +**`cve_package_ranges`** + +* `cve_id`, `ecosystem` (deb/rpm/apk), `pkg_name` +* `vulnerable_ranges`, `fixed_ranges` +* `advisory_ref`, `snapshot_id` + +**`binary_vuln_assertion`** + +* `binary_key`, `cve_id` +* `status` ∈ {affected, not_affected, fixed, unknown} +* `method` ∈ {range_match, buildid_catalog, fingerprint_match} +* `confidence` (0–1) +* `evidence_ref` (points to signed evidence) + +### B) Object store (S3/MinIO) + +Do not bloat Postgres with large blobs. 
+ +Store: + +* extracted symbol lists, string tables +* function hash maps +* disassembly snippets for matched functions (small) +* DSSE envelopes / attestations +* optional: debug info extracts (or references to where they can be fetched) + +### C) Optional search index (OpenSearch/Elastic) + +If you want fast “find all binaries exporting `SSL_read`” style queries, index symbols/strings. + +--- + +## 4) Building the database: pipelines + +### Pipeline 1: Distro repo snapshots → Known-build catalog (breadth) + +This is your fastest route to a “binaries DB.” + +**Step 1 — Snapshot** + +* Mirror repo metadata + packages for (distro, release, arch). +* Verify signatures (APT Release.gpg, RPM signatures, APK signatures). +* Emit **signed snapshot manifest** (DSSE) listing digests of everything mirrored. + +**Step 2 — Extract binaries** +For each package: + +* unpack (deb/rpm/apk) +* select ELF files (EXEC + shared libs) +* compute Build-ID, file hash, `.text` hash +* store identity + `binary_package_map` + +**Step 3 — Assign CVE status (Tier A + Tier B)** + +* Ingest distro advisories and/or OSV mappings into `cve_package_ranges` +* For each `binary_package_map`, apply range checks +* Create `binary_vuln_assertion` entries: + + * `method=range_match` (coarse) +* If you have a Build-ID mapping to exact shipped builds, you can tag: + + * `method=buildid_catalog` (stronger than pure version) + +This yields a database where a scanner can do: + +* “Given Build-ID, tell me all CVEs per the distro snapshot.” + +This already reduces noise because the primary key is the **binary**. + +--- + +### Pipeline 2: Patch-aware classification (backports handled) + +To handle “version says vulnerable but backport fixed” you must incorporate patch provenance. + +**Step 1 — Build provenance mapping** +Per ecosystem: + +* Debian/Ubuntu: parse `Sources`, changelogs, (ideally) `.buildinfo`, patch series. +* RPM distros: SRPM + changelog + patch list. +* Alpine: APKBUILD + patches. + +**Step 2 — CVE ↔ patch linkage** +From advisories and patch metadata, store: + +* “CVE fixed by patch set P in build B of pkg V-R” + +**Step 3 — Apply to binaries** +Instead of version-only, decide: + +* if the **specific build** includes the patch +* mark as `fixed` even if upstream version looks vulnerable + +This is still not “binary-only,” but it’s much closer to truth for distros. + +--- + +### Pipeline 3: Binary fingerprint factory (the moat) + +This is where you become independent of packaging claims. + +You build fingerprints at the **function/CFG level** for high-impact CVEs. + +#### 3.1 Select targets + +You cannot fingerprint everything. Start with: + +* top shared libs (openssl, glibc, zlib, expat, libxml2, curl, sqlite, ncurses, etc.) +* CVEs that are exploited in the wild / high-severity +* CVEs where distros backport heavily (version logic is unreliable) + +#### 3.2 Identify “changed functions” from the fix + +Input: upstream commit/patch or distro patch. + +Process: + +* diff the patch +* extract affected files + functions (tree-sitter/ctags + diff hunks) +* list candidate functions and key basic blocks + +#### 3.3 Build vulnerable + fixed reference binaries + +For each (arch, toolchain profile): + +* compile “known vulnerable” and “known fixed” +* ensure reproducibility: record compiler version, flags, link mode +* store provenance (DSSE) for these reference builds + +#### 3.4 Extract robust fingerprints + +Avoid raw byte signatures (they break across compilers). 
+ +Better fingerprint types, from weakest to strongest: + +* **symbol-level**: function name + versioned symbol + library SONAME +* **function normalized hash**: + + * disassemble function + * normalize: + + * strip addresses/relocs + * bucket registers + * normalize immediates (where safe) + * hash instruction sequence or basic-block sequence +* **basic-block multiset hash**: + + * build a set/multiset of block hashes; order-independent +* **lightweight CFG hash**: + + * nodes: block hashes + * edges: control flow + * hash canonical representation + +Store fingerprints like: + +**`vuln_fingerprint`** + +* `cve_id` +* `component` (openssl/libssl) +* `arch` +* `fp_type` (func_norm_hash, bb_multiset, cfg_hash) +* `fp_value` +* `function_hint` (name if present; else pattern) +* `confidence`, `notes` +* `evidence_ref` (points to reference builds + patch) + +#### 3.5 Validate fingerprints at scale + +This is non-negotiable. + +Validation loop: + +* Test against: + + * known vulnerable builds (must match) + * known fixed builds (must not match) + * large “benign corpus” (estimate false positives) +* Maintain: + + * precision/recall metrics per fingerprint + * confidence score + +Only promote fingerprints to “production” when validation passes thresholds. + +--- + +## 5) Query-time logic (how scanners use the DB) + +Given a target binary, the scanner computes: + +* `binary_key` +* basic features (arch, SONAME, symbols) +* optional function hashes (for targeted libs) + +Then it queries in this precedence order: + +1. **Exact match**: `binary_key` exists with explicit assertion (strong) +2. **Build catalog**: Build-ID→known distro build→CVE mapping (strong) +3. **Fingerprint match**: function/CFG hashes hit (strong, binary-only) +4. **Fallback**: package range matching (weakest) + +Return result as a signed VEX with evidence references. + +--- + +## 6) Update model: “sealed knowledge snapshots” + +To make this auditable and customer-friendly: + +* Every repo snapshot is immutable and signed. +* Every fingerprint bundle is versioned and signed. +* Every “vulnerable binaries DB release” is a signed manifest pointing to: + + * which repo snapshots were used + * which advisory snapshots were used + * which fingerprint sets were included + +This lets you prove: + +* what you knew +* when you knew it +* exactly which data drove the verdict + +--- + +## 7) Scaling and cost control + +Without control, fingerprinting explodes. Use these constraints: + +* Only disassemble/hash functions for: + + * libraries in your “hot set” + * binaries whose package indicates relevance to a targeted CVE family +* Deduplicate aggressively: + + * identical `.text_sha256` ⇒ reuse extracted functions + * identical Build-ID across paths ⇒ reuse features +* Incremental snapshots: + + * process only new/changed packages per snapshot + * store “already processed digest” cache (Valkey) + +--- + +## 8) Security and trust boundaries + +A vulnerable binary DB is itself a high-value target. Hardening must be part of architecture: + +* Verify upstream repo signatures before ingestion. +* Run unpacking/extraction in sandboxes (namespaces/seccomp) because packages can be hostile. +* Sign: + + * snapshot manifests + * fingerprint sets + * DB releases +* Keep signing keys in an HSM/KMS. +* Maintain provenance chain: input digests → output digests. 
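For concreteness, a minimal sketch of the basic-block multiset hash from Pipeline 3 (step 3.4). It normalizes textual disassembly, which is a simplification; a production fingerprinter would normalize at the instruction level:

```python
import hashlib
import re

HEX_IMM = re.compile(r"\b0x[0-9a-fA-F]+\b")

def normalize_block(instructions: list[str]) -> bytes:
    # Toy normalization: mask hex immediates so addresses do not leak into
    # the hash. A real normalizer also buckets registers and drops relocations.
    return "\n".join(HEX_IMM.sub("IMM", insn) for insn in instructions).encode()

def bb_multiset_hash(blocks: list[list[str]]) -> str:
    # Hash each basic block, then hash the sorted list of block hashes.
    # Sorting makes the result order-independent, so compiler block-layout
    # changes do not break the match; duplicates are kept (multiset).
    block_hashes = sorted(
        hashlib.sha256(normalize_block(b)).hexdigest() for b in blocks
    )
    return "bb_multiset:" + hashlib.sha256("\n".join(block_hashes).encode()).hexdigest()
```

Order-independence is the point: the same function compiled with different block layout still yields the same fingerprint, while a patched basic block changes it.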
+ +--- + +## 9) Minimal viable build plan (no wishful thinking) + +### MVP 1: Known-build binary catalog (fast, broad) + +* Repo snapshot + Build-ID extraction +* CVE ranges + advisory snapshots +* `binary_key → (pkg@ver, CVEs)` mapping +* Signed snapshot releases + +### MVP 2: Patch-aware backport handling (precision jump) + +* Source/patch mapping +* Build-specific “fixed-by-backport” logic + +### MVP 3: Fingerprints for top components (moat) + +* Fingerprint factory for ~20–50 “most reused” libs +* Validation corpus + confidence scoring + +--- + +## 10) What you can claim (and what you should not) + +You can defensibly claim: + +* “We can identify vulnerable **binaries** (not just packages) with verifiable evidence.” + +You should not claim (until you have it): + +* “We can detect every vulnerable binary on earth” + No one can do that without unacceptable false positives/negatives. + +--- + +If you want this aligned tightly to your Stella Ops services split (scanner webservice lattice algorithms + proof-of-integrity graph), I can map the above into: + +* concrete microservices, +* queue/job boundaries, +* and the exact API surfaces (`/snapshots`, `/features`, `/fingerprints`, `/match`, `/vex`). +To code **MVP 2 (patch-aware backport handling)** without any human triage, you need one thing in your database that most scanners do not maintain: + +> A normalized, distro-specific index: +> **(distro, release, source-package, CVE) → fixed-in distro version (or not-affected / wontfix / unknown), with evidence and confidence.** + +Then every scan decision becomes a deterministic version comparison against that index, not against upstream semver. + +Below is a concrete, buildable approach (pipelines + data model + code skeletons) that stays fully automated. + +--- + +## 1) What MVP2 computes + +### Output table you must build + +**`cve_fix_index`** + +* `distro` (e.g., debian, ubuntu, rhel, alpine) +* `release` (e.g., bookworm, jammy, 9, 3.19) +* `source_pkg` (not binary subpackage) +* `cve_id` +* `state` ∈ {`fixed`, `vulnerable`, `not_affected`, `wontfix`, `unknown`} +* `fixed_version` (nullable; distro version string, including revision) +* `method` ∈ {`security_feed`, `changelog`, `patch_header`, `upstream_patch_match`} +* `confidence` (float) +* `evidence` (JSON: references to advisory entry, changelog lines, patch names + digests) +* `snapshot_id` (your sealed snapshot identifier) + +### Why “source package”? + +Security trackers and patch sets are tracked at the **source** level (e.g., `openssl`), while runtime installs are often **binary subpackages** (e.g., `libssl3`). You need a stable join: +`binary_pkg -> source_pkg`. + +--- + +## 2) No-human signals, in strict priority order + +You can do this with **zero manual** work by using a tiered resolver: + +### Tier 1 — Structured distro security feed (highest precision) + +This is the authoritative “backport-aware” answer because it encodes: + +* “fixed in 1.1.1n-0ubuntu2.4” (even if upstream says “fixed in 1.1.1o”) +* “not affected” cases +* sometimes arch-specific applicability + +Your ingestor just parses and normalizes it. + +### Tier 2 — Source package changelog CVE mentions + +If a feed entry is missing/late, parse source changelog: + +* Debian/Ubuntu: `debian/changelog` +* RPM: `%changelog` in `.spec` +* Alpine: `secfixes` in `APKBUILD` (often present) + +This is surprisingly effective because maintainers often include “CVE-XXXX-YYYY” in the entry that introduced the fix. 
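Whichever tier fires, it should emit one record shape. A minimal sketch mirroring the `cve_fix_index` columns above (a plain dataclass; the version string in the example is made up):

```python
from dataclasses import dataclass, field

@dataclass
class FixRecord:
    # One row of cve_fix_evidence / cve_fix_index (columns listed above).
    distro: str            # e.g. "debian"
    release: str           # e.g. "bookworm"
    source_pkg: str        # source package, never the binary subpackage
    cve_id: str
    state: str             # fixed | vulnerable | not_affected | wontfix | unknown
    method: str            # security_feed | changelog | patch_header | upstream_patch_match
    confidence: float
    snapshot_id: str
    fixed_version: str | None = None
    evidence: dict = field(default_factory=dict)

# A Tier 2 changelog hit might emit (version string is illustrative):
rec = FixRecord(
    distro="debian", release="bookworm", source_pkg="openssl",
    cve_id="CVE-2025-0001", state="fixed", method="changelog",
    confidence=0.80, snapshot_id="snap-2025-12-21",
    fixed_version="3.0.15-1~deb12u1",
)
```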
+ +### Tier 3 — Patch metadata (DEP-3 headers / patch filenames) + +Parse patches shipped with the source package: + +* Debian: `debian/patches/*` + `debian/patches/series` +* RPM: patch files listed in spec / SRPM +* Alpine: `patches/*.patch` in the aport + +Search patch headers and filenames for CVE IDs, store patch hashes. + +### Tier 4 — Upstream patch equivalence (optional in MVP2, strong) + +If you can map CVE→upstream fix commit (OSV often helps), you can match canonicalized patch hunks against distro patches. + +MVP2 can ship without Tier 4; Tier 1+2 already eliminates most backport false positives. + +--- + +## 3) Architecture: the “Fix Index Builder” job + +### Inputs + +* Your sealed repo snapshot: Packages + Sources (or SRPM/aports) +* Distro security feed snapshot (OVAL/JSON/errata tracker) for same release +* (Optional) OSV/NVD upstream ranges for fallback only + +### Processing graph + +1. **Build `binary_pkg → source_pkg` map** from repo metadata +2. **Ingest security feed** → produce `FixRecord(method=security_feed, confidence=0.95)` +3. **For source packages in snapshot**: + + * unpack source + * parse changelog for CVE mentions → `FixRecord(method=changelog, confidence=0.75–0.85)` + * parse patch headers → `FixRecord(method=patch_header, confidence=0.80–0.90)` +4. **Merge** records into a single best record per key (distro, release, source_pkg, cve) +5. Store into `cve_fix_index` with evidence +6. Sign the resulting snapshot manifest + +--- + +## 4) Merge logic (no human, deterministic) + +You need a deterministic rule for conflicts. + +Recommended (conservative but still precision-improving): + +1. If any record says `not_affected` with confidence ≥ 0.9 → choose `not_affected` +2. Else if any record says `fixed` with confidence ≥ 0.9 → choose `fixed` and `fixed_version = max_fixed_version_among_high_conf` +3. Else if any record says `fixed` at all → choose `fixed` with best available `fixed_version` +4. Else if any says `wontfix` → choose `wontfix` +5. Else `unknown` + +Additionally: + +* Keep *all* evidence records in `evidence` so you can explain and audit. + +--- + +## 5) Version comparison: do not reinvent it + +Backport handling lives or dies on correct version ordering. + +### Practical approach (recommended for ingestion + server-side decisioning) + +Use official tooling in containerized workers: + +* Debian/Ubuntu: `dpkg --compare-versions` +* RPM distros: `rpmdev-vercmp` or `rpm` library +* Alpine: `apk version -t` + +This is reliable and avoids subtle comparator bugs. + +If you must do it in-process, use well-tested libraries per ecosystem (but containerized official tools are the most robust). + +--- + +## 6) Concrete code: Debian/Ubuntu changelog + patch parsing + +This example shows **Tier 2 + Tier 3** inference for a single unpacked source tree. You would wrap this inside your snapshot processing loop. + +### 6.1 CVE extractor + +```python +import re +from pathlib import Path +from hashlib import sha256 + +CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,7}\b") + +def extract_cves(text: str) -> set[str]: + return set(CVE_RE.findall(text or "")) +``` + +### 6.2 Parse the *top* debian/changelog entry (for this version) + +This works well because when you unpack a `.dsc` for version `V`, the top entry is for `V`. 
+ +```python +def parse_debian_changelog_top_entry(src_dir: Path) -> tuple[str, set[str], dict]: + """ + Returns: + version: str + cves: set[str] found in the top entry + evidence: dict with excerpt for explainability + """ + changelog_path = src_dir / "debian" / "changelog" + if not changelog_path.exists(): + return "", set(), {} + + lines = changelog_path.read_text(errors="replace").splitlines() + if not lines: + return "", set(), {} + + # First line: "pkgname (version) distro; urgency=..." + m = re.match(r"^[^\s]+\s+\(([^)]+)\)\s+", lines[0]) + version = m.group(1) if m else "" + + entry_lines = [lines[0]] + # Collect until maintainer trailer line: " -- Name date" + for line in lines[1:]: + entry_lines.append(line) + if line.startswith(" -- "): + break + + entry_text = "\n".join(entry_lines) + cves = extract_cves(entry_text) + + evidence = { + "file": "debian/changelog", + "version": version, + "excerpt": entry_text[:2000], # store small excerpt, not whole file + } + return version, cves, evidence +``` + +### 6.3 Parse CVEs from patch headers (DEP-3-ish) + +```python +def parse_debian_patches_for_cves(src_dir: Path) -> tuple[dict[str, list[dict]], dict]: + """ + Returns: + cve_to_patches: {CVE: [ {path, sha256, header_excerpt}, ... ]} + evidence_summary: dict + """ + patches_dir = src_dir / "debian" / "patches" + if not patches_dir.exists(): + return {}, {} + + cve_to_patches: dict[str, list[dict]] = {} + + for patch in patches_dir.glob("*"): + if not patch.is_file(): + continue + # Read only first N lines to keep it cheap + header = "\n".join(patch.read_text(errors="replace").splitlines()[:80]) + cves = extract_cves(header + "\n" + patch.name) + if not cves: + continue + + digest = sha256(patch.read_bytes()).hexdigest() + rec = { + "path": str(patch.relative_to(src_dir)), + "sha256": digest, + "header_excerpt": header[:1200], + } + for cve in cves: + cve_to_patches.setdefault(cve, []).append(rec) + + evidence = { + "dir": "debian/patches", + "matched_cves": len(cve_to_patches), + } + return cve_to_patches, evidence +``` + +### 6.4 Produce FixRecords from the source tree + +```python +def infer_fix_records_from_debian_source(src_dir: Path, distro: str, release: str, source_pkg: str, snapshot_id: str): + version, changelog_cves, changelog_ev = parse_debian_changelog_top_entry(src_dir) + cve_to_patches, patch_ev = parse_debian_patches_for_cves(src_dir) + + records = [] + + # Changelog-based: treat CVE mentioned in top entry as fixed in this version + for cve in changelog_cves: + records.append({ + "distro": distro, + "release": release, + "source_pkg": source_pkg, + "cve_id": cve, + "state": "fixed", + "fixed_version": version, + "method": "changelog", + "confidence": 0.80, + "evidence": {"changelog": changelog_ev}, + "snapshot_id": snapshot_id, + }) + + # Patch-header-based: treat CVE-tagged patches as fixed in this version + for cve, patches in cve_to_patches.items(): + records.append({ + "distro": distro, + "release": release, + "source_pkg": source_pkg, + "cve_id": cve, + "state": "fixed", + "fixed_version": version, + "method": "patch_header", + "confidence": 0.87, + "evidence": {"patches": patches, "patch_summary": patch_ev}, + "snapshot_id": snapshot_id, + }) + + return records +``` + +That is the automated “patch-aware” signal generator. + +--- + +## 7) Wiring this into your database build + +### 7.1 Store raw evidence and merged result + +Two-stage storage is worth it: + +1. `cve_fix_evidence` (append-only) +2. 
`cve_fix_index` (merged best record) + +So you can: + +* rerun merge rules +* improve confidence scoring +* keep auditability + +### 7.2 Merging “fixed_version” for a CVE + +When multiple versions mention the same CVE, you usually want the **latest** mentioning version (highest by distro comparator), because repeated mentions often indicate earlier partial fix. + +Pseudo: + +```python +def choose_fixed_version(existing: str | None, candidate: str, vercmp) -> str: + if not existing: + return candidate + return candidate if vercmp(candidate, existing) > 0 else existing +``` + +Where `vercmp` calls `dpkg --compare-versions` (Debian) or equivalent for that distro. + +--- + +## 8) Decisioning logic at scan time (what changes with MVP2) + +Without MVP2, you likely do: + +* upstream range check (false positives for backports) + +With MVP2, you do: + +1. identify `distro+release` from environment (or image base) +2. map `binary_pkg → source_pkg` +3. query `cve_fix_index(distro, release, source_pkg, cve)` +4. if `state=fixed` and `pkg_version >= fixed_version` → fixed +5. if `state=not_affected` → safe +6. else fallback to upstream ranges + +That single substitution removes most backport noise. + +--- + +## 9) Practical notes so you don’t get trapped + +### A) You must know the distro release + +Backport reality is release-specific. The same package name/version can have different patching across releases. + +### B) Arch-specific fixes exist + +Your schema should allow `arch` on fix records (nullable). If the feed says “only amd64 affected,” store it. + +### C) False positives in changelog parsing + +Mitigation without humans: + +* require CVE mention + security-ish verbs (“fix”, “security”, “patch”) OR presence in patch headers +* otherwise lower confidence and let feed override + +### D) Keep evidence small + +Store: + +* excerpt + patch hashes + Not entire source tarballs. + +--- + +## 10) Minimal “done definition” for MVP2 + +You have MVP2 when, for Debian/Ubuntu at least, you can demonstrate: + +* A CVE that upstream marks vulnerable for version X +* The distro backported it in X- +* Your system classifies: + + * `X-older_revision` as vulnerable + * `X-newer_revision` as fixed +* With evidence: fix feed record and/or changelog/patch proof + +No human required. + +--- + +If you want, I can provide the same “Tier 2/3 inference” module for RPM (SRPM/spec parsing) and Alpine (APKBUILD `secfixes` extraction), plus the exact Postgres DDL for `cve_fix_evidence` and `cve_fix_index`, and the merge SQL. diff --git a/docs/product-advisories/unprocessed/21-Dec-2025 - Smart Diff - Reproducibility as a Feature.md b/docs/product-advisories/unprocessed/21-Dec-2025 - Smart Diff - Reproducibility as a Feature.md new file mode 100644 index 000000000..37ce9daa5 --- /dev/null +++ b/docs/product-advisories/unprocessed/21-Dec-2025 - Smart Diff - Reproducibility as a Feature.md @@ -0,0 +1,247 @@ +Yes — you need **dedicated UI affordances** for “Verdict + DeltaVerdict + Evidence” because the interaction model is fundamentally different from a classic “vulnerability list” UI. +But you do **not** necessarily need a whole new top‑level product area on day one. + +The right approach is usually: + +1. **Embed the experience where decisions happen** (build/release/PR gates). +2. Add **one dedicated “Compare / Delta” screen** (a focused view) reachable from those contexts. +3. Introduce a **top-level “Assurance/Audit” workspace only if you have compliance-heavy users** who need cross-project oversight. 
+ +Below is a concrete way to implement both options and when to choose each. + +--- + +## When a dedicated UI is warranted + +A dedicated UI is justified if at least **two** of these are true: + +* You have **multiple repos/services** and security/compliance need to see **fleet-wide deltas**, not just per build. +* You need **approval workflows** (exceptions, risk acceptance, “ship with waiver”). +* You need **auditor-grade artifact browsing**: signatures, provenance, replay, evidence packs. +* Developers complain about “scan noise” and need **diff-first triage** to be fast. +* You have separate personas: **Dev**, **Security**, **Compliance/Audit** — each needs different default views. + +If those aren’t true, keep it embedded and light. + +--- + +## Recommended approach (most teams): Dedicated “Compare view” + embedded panels + +### Where it belongs in the existing UI + +Assuming your current navigation is something like: + +**Projects → Repos → Builds/Releases → Findings/Vulnerabilities** + +Then “DeltaVerdict” belongs primarily in **Build/Release details**, not in the global vulnerability list. + +**Add two key entry points:** + +1. A **status + delta summary** on every Build/Release page (above the fold). +2. A **Compare** action that opens a dedicated comparison screen (or tab). + +### Information architecture (practical, minimal) + +On the **Build/Release details page**, add a header section: + +* **Verdict chip**: Allowed / Blocked / Warn +* **Delta chip**: “+2 new exploitable highs”, “Reachability flip: yes/no”, “Unknowns: +3” +* **Baseline**: “Compared to: v1.4.2 (last green in prod)” +* **Actions**: + + * **Compare** (opens dedicated delta view) + * **Download Evidence Pack** + * **Verify Signatures** + * **Replay** (copy command / show determinism hash) + +Then add a tab set: + +* **Delta (default)** +* Components (SBOM) +* Vulnerabilities +* Reachability +* VEX / Claims +* Attestations (hashes, signatures, provenance) + +#### Why “Delta” should be the default tab + +The user’s first question in a release is: *What changed that affects risk?* +If you make them start in a full vuln list, you rebuild the noise problem. + +--- + +## How the dedicated “Compare / Delta” view should work + +Think of it as a “git diff”, but for risk and provenance. + +### 1) Baseline selection (must be explicit and explainable) + +Top of the Compare view: + +* **Base** selector (default chosen by system): + + * “Last green verdict in same environment” + * “Previous release tag” + * “Parent commit / merge-base” +* **Head** selector: + + * Current build/release +* Show **why** the baseline was chosen (small text): + “Selected last prod release with Allowed verdict under policy P123.” + +This matters because auditors will ask “why did you compare against *that*?” + +### 2) Delta summary strip (fast triage) + +A horizontal strip with only the key deltas: + +* **New exploitable vulns:** N (by severity) +* **Reachability flips:** N (new reachable / newly unreachable) +* **Component changes:** +A / −R / ~C +* **VEX claim flips:** N +* **Policy/feed drift:** policy changed? feed snapshot changed? stale? 
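A sketch of the payload behind that strip. Field names are illustrative; the real shape is whatever your DeltaVerdict schema signs:

```python
# Illustrative shape only; the real DeltaVerdict schema is whatever the
# attestation pipeline signs.
delta_summary = {
    "base": {"ref": "v1.4.2", "verdict_hash": "sha256:..."},
    "head": {"ref": "v1.5.0-rc1", "verdict_hash": "sha256:..."},
    "baseline_reason": "last prod release with Allowed verdict under policy P123",
    "new_exploitable": {"critical": 0, "high": 2, "medium": 1},
    "reachability_flips": {"newly_reachable": 1, "newly_unreachable": 3},
    "component_changes": {"added": 4, "removed": 1, "changed": 7},
    "vex_claim_flips": 2,
    "drift": {"policy_changed": False, "feed_snapshot_changed": True, "feed_stale": False},
}
```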
+ +### 3) Three-pane layout (best for speed) + +Left: **Delta categories** (counts) + +* New exploitable vulns +* Newly reachable +* Component adds/removes +* Changed versions +* Claim changes +* Unknowns / missing data + +Middle: **List of changed items** (sorted by risk) + +* Each item shows: component, version, CVE (if applicable), exploitability, reachability, current disposition (VEX), gating rule triggered + +Right: **Proof / explanation panel** + +* “Why is it blocked?” +* Shows: + + * the **policy rule** that fired (with rule ID) + * the **witness path** for reachability (minimal path) + * the **claim sources** for VEX (vendor/distro/internal) and merge explanation + * links to the exact **envelope hashes** involved + +This is where “proof-carrying” becomes usable. + +### 4) Actionables output (make it operational) + +At the top of the item list include a “What to do next” section: + +* Upgrade component X → version Y +* Patch CVE Z +* Add/confirm VEX claim with evidence +* Reduce reachability (feature flag, build config) +* Resolve unknowns (SBOM missing for module A) + +This prevents the compare screen from becoming yet another “informational dashboard.” + +--- + +## If you do NOT create any new dedicated view + +If you strongly want zero new screens, the minimum acceptable integration is: + +* Add a **Delta toggle** on the existing Vulnerabilities page: + + * “All findings” vs “Changes since baseline” +* Add a **baseline selector** on that page. +* Add an **Attestations panel** on the Build/Release page for evidence pack + signature verification. + +This can work, but it tends to fail as the system grows because: + +* Vulnerability list UIs are optimized for volume browsing, not causal proof +* Reachability and VEX explanation become buried +* Auditors still need a coherent “verdict story” + +If you go this route, at least add a **“Compare drawer”** (modal) that shows the delta summary and links into filtered views. + +--- + +## When you SHOULD add a top-level dedicated UI (“Assurance” workspace) + +Create a dedicated left-nav item only when you have these needs: + +1. **Cross-project oversight**: “show me all new exploitable highs introduced this week across org.” +2. **Audit operations**: evidence pack management, replay logs, signature verification at scale. +3. **Policy governance**: browse policy versions, rollout status, exceptions, owners. +4. **Release approvals**: security sign-off steps, waivers, expiry dates. + +### What that workspace would contain + +* **Overview dashboard** + + * blocked releases (by reason) + * new risk deltas by team/repo + * unknowns trend + * stale feed snapshot alerts +* **Comparisons** + + * search by repo/build/tag and compare any two artifacts +* **Attestations & Evidence** + + * list of verdicts/delta verdicts with verification status + * evidence pack download and replay +* **Policies & Exceptions** + + * policy versions, diffs, who changed what + * exceptions with expiry and justification + +This becomes the home for Security/Compliance, while Devs stay in the build/release context. + +--- + +## Implementation details that make the UI “work” (avoid common failures) + +### 1) Idempotent “Compute delta” behavior + +When user opens Compare view: + +* UI requests DeltaVerdict by `{base_verdict_hash, head_verdict_hash, policy_hash}`. +* If not present, backend computes it. +* UI shows deterministic progress (“pending”), not “scanning…”. 
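A minimal sketch of that lookup-or-compute behavior, with a hypothetical `store` and `compute`; the point is that the cache key is the hash triple, so reopening the view never recomputes:

```python
import hashlib

def delta_key(base_hash: str, head_hash: str, policy_hash: str) -> str:
    # The triple fully determines the delta, so it doubles as the cache key.
    raw = "|".join((base_hash, head_hash, policy_hash))
    return hashlib.sha256(raw.encode()).hexdigest()

def get_or_compute_delta(store, compute, base_hash, head_hash, policy_hash):
    # store: key -> DeltaVerdict mapping; compute: deterministic derivation
    # from the two verdicts under the pinned policy. Both are hypothetical.
    key = delta_key(base_hash, head_hash, policy_hash)
    cached = store.get(key)
    if cached is not None:
        return cached, "ready"
    store[key] = compute(base_hash, head_hash, policy_hash)
    return store[key], "computed"
```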
+ +### 2) Determinism and trust indicators + +Every compare screen should surface: + +* Determinism hash +* Policy version/hash +* Feed snapshot timestamp/age +* Signature verification status + +If verification fails, the UI must degrade clearly (red banner, disable “Approved” actions). + +### 3) Baseline rules must be visible + +Auditors hate “magic.” +Show baseline selection logic and allow override. + +### 4) Don’t show full graphs by default + +Default to: + +* minimal witness path(s) +* minimal changed subgraph +* expand-on-demand for deep investigation + +### 5) Role-based access + +* Developers: see deltas, actionables, witness paths +* Security: see claims sources, merge rationale, policy reasoning +* Audit: see signatures, replay, evidence pack + +--- + +## Decision recommendation (most likely correct) + +* Build **embedded panels** + a **dedicated Compare/Delta view** reachable from Build/Release and PR checks. +* Delay a top-level “Assurance” workspace until you see real demand from security/compliance for cross-project oversight and approvals. + +This gives you the usability benefits of “diff-first” without fragmenting navigation or building a parallel UI too early. + +If you share (even roughly) your existing nav structure (what pages exist today), I can map the exact placements and propose a concrete IA tree and page wireframe outline aligned to your current UI.