Here’s a compact, practical design for a **smart‑difference scanner** that produces tiny, verifiable binary deltas and plugs cleanly into a release/provenance workflow, explained from the ground up.

---

# What this thing does (in plain words)

It compares two software artifacts (containers, packages, binaries), computes the *smallest safe update* between them, and emits both:

* a **delta** (what to apply),
* and **proof** (why it’s safe and who built it).

You get faster rollouts, smaller downloads, and auditable provenance, plus a built‑in rollback that’s just as verifiable.

---

# Core idea

1. **Content‑defined chunking (CDC)**
   Split files into variable‑size chunks using Rabin‑fingerprint CDC, so similar regions line up even if bytes shift. Build a **Merkle DAG** over the chunks.
2. **Deterministic delta ops**
   Delta = ordered ops: `COPY {chunk_id}` or `ADD {len, bytes}`. No “magic heuristics”; same inputs → same delta.
3. **Function‑level diffs (executables only)**
   For ELF/PE, disassemble and compare by symbol/function to highlight *semantic* changes (added/removed/modified functions), but still ship chunk‑level ops for patching.
4. **Verification & attestation**
   Every delta links to attestations (SLSA/DSSE/cosign/Rekor) so a verifier can check builder identity, materials, and inclusion proofs **offline**.

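To make steps 1 and 2 concrete, here is a minimal Python sketch of content‑defined chunking plus deterministic `COPY`/`ADD` ops. It is an illustration under stated assumptions, not the reference implementation: it swaps Rabin fingerprinting for a simpler Gear‑style rolling hash, skips the Merkle DAG, and the parameters (`MASK`, `MIN_CHUNK`, `MAX_CHUNK`) and function names (`chunk`, `make_delta`, `apply_delta`) are hypothetical.

```python
import hashlib

# Illustrative parameters only (assumptions, not values from the design).
MASK = (1 << 11) - 1              # cut when the low 11 hash bits are zero
MIN_CHUNK, MAX_CHUNK = 256, 8192  # bounds on chunk size in bytes

# Gear table: one fixed 64-bit value per byte value, derived from SHA-256
# so every machine computes the same chunk boundaries.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]


def sha(data: bytes) -> str:
    """Chunk id / artifact digest: hex SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()


def chunk(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries (Gear-style rolling hash)."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
        size = i - start + 1
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def make_delta(base: bytes, target: bytes) -> list[tuple]:
    """Emit ("COPY", chunk_id) for chunks already in base, else ("ADD", bytes)."""
    known = {sha(c) for c in chunk(base)}
    ops = []
    for c in chunk(target):
        cid = sha(c)
        ops.append(("COPY", cid) if cid in known else ("ADD", c))
    return ops


def apply_delta(base: bytes, ops: list[tuple], target_digest: str) -> bytes:
    """Rebuild the target from base + ops and verify its digest."""
    index = {sha(c): c for c in chunk(base)}
    out = bytearray()
    for kind, arg in ops:
        if kind == "COPY":
            if arg not in index:  # every COPY target must exist in base
                raise ValueError(f"COPY references unknown chunk {arg}")
            out += index[arg]
        elif kind == "ADD":
            out += arg
        else:
            raise ValueError(f"unknown op {kind!r}")
    if sha(bytes(out)) != target_digest:
        raise ValueError("reconstructed artifact does not match target digest")
    return bytes(out)
```

Because boundaries depend only on nearby bytes, chunks before an edit are reproduced exactly and chunks after it usually realign within a chunk or two, so a small edit should yield mostly `COPY` ops. `apply_delta` rejects any op stream that references unknown chunks or fails to reproduce `target_digest`.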

---

# Supported inputs

* **Blobs**: OCI layers, .deb/.rpm payloads, zip/jar/war
* **Binaries**: ELF/PE segments (per‑section CDC first, then optional symbol compare)

---

# Artifacts the scanner emits

**`delta-manifest.json` (deterministic):**

* `base_digest`, `target_digest`, `artifact_type`
* `changed_chunks[]` (ids, byte ranges)
* `ops[]` (COPY/ADD sequence)
* `functions_changed` (added/removed/modified counts; top symbols)
* `materials_delta` (new/removed deps & digests)
* `attestations[]` (DSSE/cosign refs, Rekor log pointers or embedded CT tile)
* `score_inputs` (pre‑computed metrics to keep scoring reproducible)

The actual **delta payload** is a compact binary: header + op stream + ADD byte blobs.

---

# How verification works (offline‑first)

* **Content addressability**: chunk ids are hashes; COPY ops verify by recomputing.
* **Attestations**: DSSE/cosign bundle includes builder identity and `materials[]` digests. Rekor inclusion proof (or embedded tile fragment) lets verifiers reassemble the transparency chain without the Internet.
* **Policy**: if SLSA predicate present and policy threshold met → “green”; else fall back to vendor signature + content checks and mark **provenance gaps**.

---

# Risk scoring (explainable)

Compute a single `delta_risk` from:

* `provenance_completeness` (SLSA level, DSSE validity, Rekor inclusion)
* `delta_entropy` (how many new bytes vs copies; unexpected high entropy is riskier)
* `new_deps_count` (materials delta)
* `signed_attestation_validity` (key/trust chain freshness)
* `function_change_impact` (count/criticality of changed symbols)

Expose the **breakdown** directly in UI so reviewers see *why* the score is what it is.

---

# Rollback that’s actually safe

* Rollback is just “apply delta going to previous artifact” **plus** a **signed rollback attestation** anchored in the transparency log.
* Verifier refuses rollbacks without matching provenance or if the computed rollback delta doesn’t reproduce the earlier artifact’s digest.

---

# Minimal internal data structures (sketch)

```txt
Chunk { id: sha256(bytes), size: u32, merkle: sha256(left||right) }
DeltaOp = COPY {chunk_id} | ADD {len, bytes}
DeltaManifest {
  base_digest, target_digest, artifact_type,
  ops[], changed_chunks[],
  functions_changed: {added[], removed[], modified[]},
  materials_delta: {added[], removed[]},
  attestations: {dsse_bundle_ref, rekor_inclusion[]},
  score_inputs: {provenance, entropy, deps, attestation_validity, fn_impact}
}
```

---

# Pipeline (end‑to‑end)

1. **Ingest** base & target → normalize (strip nondeterministic metadata; preserve signatures).
2. **CDC pass** → chunk map → Merkle DAGs.
3. **Delta construction** (greedy minimal ADDs, prefer COPY of identical chunk ids).
4. **(Executables)** symbol table → lightweight disassembly → function map diff.
5. **Attestation linkage** → attach DSSE bundle refs + Rekor proofs.
6. **Scoring** → deterministic `delta_risk` + breakdown.
7. **Emit** `delta-manifest.json` + `delta.bin`.

---

# UI: what reviewers see

* **Top changed functions** (name, section, size delta, call‑fanout hint)
* **Provenance panel** (SLSA level, DSSE signer, Rekor entry; click to open)
* **Delta anatomy** (COPY/ADD ratio, entropy, bytes added)
* **Dependencies delta** (new/removed materials with digests)
* **“Apply” / “Rollback”** buttons gated by policy & attestation validity

---

# How this fits your Stella Ops stack (drop‑in plan)

* **Module**: add `DeltaScanner` service under Evidence/Attestor boundary.
* **Air‑gap**: store DSSE bundles and Rekor tile fragments alongside artifacts in EvidenceLocker.
* **SBOM/VEX**: on delta, also diff SBOM nodes and attach a *delta‑SBOM* for impacted components; feed VEX evaluation to **AdvisoryAI** for surfaced risk notes.
* **Release gates**: block promotion if `delta_risk > threshold` or `provenance_completeness < policy`.
* **CLI**: `stella delta create|verify|apply|rollback --base A --target B --policy policy.yaml`.

---

# Implementation notes (concise)

* **CDC**: Rabin fingerprinting window 48–64B; average chunk 4–16 KiB; rolling mask yields boundaries.
* **Hashing**: BLAKE3 for speed; SHA‑256 for interop (store both if needed).
* **Disassembly**: Capstone/llvm‑objdump (ELF/PE), symbol map fallback if stripped.
* **Determinism**: fix chunk params, hash orderings, and traversal; sort tables prior to emit.
* **Security**: validate all COPY targets exist in base; cap ADD size; verify DSSE before score.

---

# Deliverables you can ship quickly

* `delta-scanner` lib (CDC + DAG + ops)
* `delta-verify` (attestations, Rekor proof check offline)
* `delta-score` (pure function over `delta-manifest`)
* UI panels: Delta, Provenance, Risk (reuse Stella’s style system)
* CI job: create delta + attach DSSE + upload to EvidenceLocker

---

# Test matrix (essentials)

* Small edit in large file (ADD minimal)
* Repacked zip with same payload (COPY dominates)
* Stripped vs non‑stripped ELF (function compare graceful)
* Added dependency layer in OCI (materials_delta flagged)
* Missing SLSA but valid vendor sig (gap recorded, lower score)
* Rollback with/without signed rollback attestation (accept/deny)

---

If you want, I can generate:

* a ready‑to‑commit **Go/.NET** reference implementation skeleton,
* a **policy.yaml** template with thresholds,
* and **UI wireframes** (ASCII + Mermaid) for the three panels.
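
As a small starting point in the meantime, the `delta-score` deliverable (a pure function over the manifest's `score_inputs`) could look roughly like the Python sketch below. The weights, the 0..1 normalisation of each input, the cap of 10 new dependencies, and the names `ScoreInputs`, `WEIGHTS`, `delta_risk`, and `gate` are all assumptions made for illustration; the design above only fixes which inputs feed the score and that the breakdown must be exposed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreInputs:
    # All floats are assumed to be pre-normalised to the range 0..1.
    provenance_completeness: float      # 1.0 = SLSA + DSSE + Rekor all verified
    delta_entropy: float                # 1.0 = all-new, high-entropy bytes
    new_deps_count: int                 # number of added materials
    signed_attestation_validity: float  # 1.0 = fresh, fully trusted chain
    function_change_impact: float       # 1.0 = many critical symbols changed


# Hypothetical weights; chosen only so they sum to 1.0.
WEIGHTS = {
    "provenance": 0.30,
    "entropy": 0.20,
    "deps": 0.15,
    "attestation_validity": 0.20,
    "fn_impact": 0.15,
}
MAX_DEPS = 10  # assumed saturation point for the dependency term


def delta_risk(s: ScoreInputs) -> tuple[float, dict[str, float]]:
    """Return (risk in 0..1, per-factor breakdown). Pure and deterministic."""
    parts = {
        "provenance": WEIGHTS["provenance"] * (1.0 - s.provenance_completeness),
        "entropy": WEIGHTS["entropy"] * s.delta_entropy,
        "deps": WEIGHTS["deps"] * min(s.new_deps_count, MAX_DEPS) / MAX_DEPS,
        "attestation_validity": WEIGHTS["attestation_validity"]
        * (1.0 - s.signed_attestation_validity),
        "fn_impact": WEIGHTS["fn_impact"] * s.function_change_impact,
    }
    return round(sum(parts.values()), 6), parts


def gate(risk: float, threshold: float) -> bool:
    """Release gate: promotion is allowed only if risk does not exceed threshold."""
    return risk <= threshold
```

Returning the breakdown alongside the total keeps the score explainable: the UI can render each term directly, and two verifiers given the same manifest compute the same numbers.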