Here’s a tight, practical plan to add **deterministic binary‑patch evidence** to Stella Ops by integrating **B2R2** (IR lifter/disassembler for .NET/F#) into your scanning pipeline, then feeding stable “diff signatures” into your **VEX Resolver**.

# What & why (one minute)

* **Goal:** Prove (offline) that a distro backport truly patched a CVE, even if version strings look “vulnerable”, by comparing *what the CPU will execute* before/after a patch.
* **How:** Lift binaries to a normalized IR with **B2R2**, canonicalize semantics (strip address noise, relocations, NOPs, padding), **bucket** by function and **hash** stable opcode/semantics. Patch deltas become small, reproducible evidence blobs your VEX engine can consume.

# High‑level flow

1. **Collect**: For each package/artifact, grab the *installed binary*, the *claimed patched reference* (vendor’s patched ELF/PE or your golden set), and optionally the *original vulnerable build*.
2. **Lift**: Use B2R2 to disassemble → lift to **LowUIR** (B2R2’s arch‑agnostic IR) → **SSA** form.
3. **Normalize** (deterministic):
   * Strip addrs/symbols/relocations; fold NOPs; normalize register aliases; constant‑prop + dead‑code elim; canonical call/ret; normalize PLT stubs; elide alignment/padding.
4. **Segment**: Per‑function IR slices bounded by CFG; compute **stable function IDs** = `SHA256(package@version, build-id, arch, fn-cfg-shape)`.
5. **Hashing**:
   * **Opcode hash**: SHA256 of the normalized opcode stream.
   * **Semantic hash**: SHA256 of (basic‑block graph + dataflow summaries).
   * **Const set hash**: SHA256 of the extracted immediate set (range‑bucketed), to detect patched constants and lookup tables.
6. **Diff**:
   * Compare (patched vs baseline) per function: unchanged / changed / added / removed.
   * For changed: emit a **delta record** with before/after hashes and a minimal edit script (block‑level).
7. **Evidence object** (deterministic, replayable):
   * `type: "stella.disasm.patch-evidence@1"`
   * inputs: file digests (SHA256/SHA3‑256), Build‑ID, arch, toolchain versions, B2R2 commit, normalization profile ID
   * outputs: per‑function records + global summary
   * sign: DSSE (in‑toto link) with your offline key profile
8. **Feed VEX**:
   * Map CVE→fix‑site heuristics (from vendor advisories/diff hints) to function buckets.
   * If all required buckets show “patched” (the semantic hash change matches the rule for that fix site), set **`state=not_affected, justification=code_not_present`** (CycloneDX VEX, per CVE) with a pointer to the evidence object.

# Module boundaries in Stella Ops

* **Scanner.WebService** (per your rule): hosts the *lattice algorithms* + this disassembly stage.
* **Sbomer**: records exact files/Build‑IDs in the CycloneDX 1.6/1.7 SBOM (you’re moving to 1.7 soon; ensure `properties` include `disasm.profile`, `b2r2.version`).
* **Feedser/Vexer**: consume evidence blobs; Vexer attaches VEX statements referencing `evidenceRef`.
* **Authority/Attestor**: sign DSSE attestations; Timeline/Notify surface verdict transitions.

# On‑disk schemas (minimal)

```json
{
  "type": "stella.disasm.patch-evidence@1",
  "subject": [
    {"name": "libssl.so.1.1", "digest": {"sha256": "<...>"}, "buildId": "elf:..."}
  ],
  "tool": {"name": "stella-b2r2", "b2r2": "", "profile": "norm-v1"},
  "arch": "x86_64",
  "functions": [
    {
      "fnId": "sha256(pkg,buildId,arch,cfgShape)",
      "addrRange": "0x401000-0x40118f",
      "opcodeHashBefore": "<...>",
      "opcodeHashAfter": "<...>",
      "semanticHashBefore": "<...>",
      "semanticHashAfter": "<...>",
      "delta": {"blocksEdited": 2, "immDiff": ["0x7f->0x00"]}
    }
  ],
  "summary": {"unchanged": 812, "changed": 6, "added": 1, "removed": 0}
}
```

# Determinism controls

* Pin **B2R2 version** and **normalization profile**; serialize the profile (passes + order + flags) and include it in evidence.
* Containerize the lifter; record image digest in evidence.
* Eliminate randomness: fix any hash salts/seeds to zero; set `TZ=UTC`, `LC_ALL=C`, and pin a stable CPU feature set.
* Replay manifests: list all inputs (file digests, B2R2 commit, profile) so anyone can re‑run and reproduce the exact hashes.

# C# integration sketch (.NET 10)

```csharp
// StellaOps.Scanner.Disasm
using System.Threading.Tasks;

public sealed class DisasmService
{
    private readonly IBinarySource _source;    // pulls files + vendor refs
    private readonly IB2R2Host _b2r2;          // thin wrapper over B2R2 (F#): direct assembly reference or CLI
    private readonly INormalizer _norm;        // norm-v1 pipeline
    private readonly IEvidenceStore _evidence;

    public DisasmService(IBinarySource source, IB2R2Host b2r2, INormalizer norm, IEvidenceStore evidence)
    {
        _source = source;
        _b2r2 = b2r2;
        _norm = norm;
        _evidence = evidence;
    }

    // PatchEvidence is an assumed name for whatever EvidenceBuilder.Build returns.
    public async Task<PatchEvidence> AnalyzeAsync(Artifact a, Artifact baseline)
    {
        // Lift both binaries with the same pinned B2R2 build.
        var liftedAfter  = await _b2r2.LiftAsync(a.Path, a.Arch);
        var liftedBefore = await _b2r2.LiftAsync(baseline.Path, baseline.Arch);

        // Normalize with the same profile so hashes are comparable.
        var fnAfter  = _norm.Normalize(liftedAfter).Functions;
        var fnBefore = _norm.Normalize(liftedBefore).Functions;

        // Bucket(...) and DiffBuckets(...) are private helpers omitted from this sketch.
        var bucketsAfter  = Bucket(fnAfter);
        var bucketsBefore = Bucket(fnBefore);
        var diff = DiffBuckets(bucketsBefore, bucketsAfter);

        var evidence = EvidenceBuilder.Build(a, baseline, diff, _norm.ProfileId, _b2r2.Version);
        await _evidence.PutAsync(evidence); // write + DSSE sign via Attestor
        return evidence;
    }
}
```

# Normalization profile (norm‑v1)

* **Pass order:** CFG build → SSA → const‑prop → DCE → register‑rename‑canon → call/ret stub‑canon → PLT/plt.got unwrap → NOP/padding strip → reloc placeholder canon (`IMM_RELOC` tokens) → block re‑ordering freeze (cfg sort).
* **Hash material:** `for block in topo(cfg): emit (opcode, operandKinds, IMM_BUCKETS)`; exclude absolute addrs/symbols.

# Hash‑bucketing details

* **IMM_BUCKETS:** bucket immediates by role: {addr, const, mask, len}. For `addr`, replace with `IMM_RELOC(section, relType)`. For `const`, clamp to ranges (e.g., table sizes).
* **CFG shape hash:** adjacency list over block arity; keeps compiler noise (block layout, addresses) from destabilizing the hash.
* **Semantic hash seed:** SHA256 of (CFG shape hash || value‑flow summaries per def‑use), matching the semantic hash in the flow above.
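The hash material and immediate bucketing described above can be sketched in C# as follows. This is a minimal illustration, not the norm‑v1 implementation: the `Imm`, `Instr`, and `OpcodeHasher` types are hypothetical stand‑ins for whatever the B2R2 wrapper produces, and the power‑of‑two range bucketing for `const` immediates is an assumed scheme chosen only to show the idea of clamping to ranges.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Security.Cryptography;
using System.Text;

// Hypothetical, simplified shapes for normalized instructions.
public enum ImmRole { Addr, Const, Mask, Len }
public sealed record Imm(ImmRole Role, long Value, string Section = "", string RelType = "");
public sealed record Instr(string Opcode, IReadOnlyList<string> OperandKinds, IReadOnlyList<Imm> Imms);

public static class OpcodeHasher
{
    // Address immediates become relocation placeholders; constants are clamped
    // into power-of-two ranges (assumed scheme); masks and lengths stay exact.
    public static string Bucket(Imm imm) => imm.Role switch
    {
        ImmRole.Addr  => $"IMM_RELOC({imm.Section},{imm.RelType})",
        ImmRole.Const => $"const:{RangeBucket(imm.Value)}",
        ImmRole.Mask  => $"mask:0x{imm.Value:x}",
        _             => $"len:{imm.Value}",
    };

    private static string RangeBucket(long v)
    {
        // ~v maps a negative value to a non-negative magnitude without overflow.
        ulong mag = (ulong)(v < 0 ? ~v : v);
        int bits = 64 - BitOperations.LeadingZeroCount(mag);
        string sign = v < 0 ? "-" : "+";
        return $"{sign}2^{bits}";
    }

    // Caller supplies instructions in the frozen, deterministic block order.
    public static string Hash(IEnumerable<Instr> normalized)
    {
        var sb = new StringBuilder();
        foreach (var i in normalized)
        {
            sb.Append(i.Opcode).Append('|')
              .Append(string.Join(",", i.OperandKinds)).Append('|')
              .Append(string.Join(",", i.Imms.Select(Bucket)))
              .Append('\n');
        }
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString()));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

With this scheme, two comparisons against constants in the same range (say 100 and 127) hash identically, while a change such as `0x7f -> 0x00` moves the immediate to a different bucket and changes the function's opcode hash. How coarse the ranges should be is a policy choice that belongs in the serialized normalization profile.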
# VEX Resolver hookup

* Extend rule language: `requires(fnId in {"EVP_DigestVerifyFinal", ...} && delta.immDiff.any == true)` → verdict `not_affected` with `justification="code_not_present"` and `impactStatement="Patched verification path altered constants"`.
* If some required fix‑sites are unchanged → verdict `affected` with `actionStatement="Patched binary mismatch: function(s) unchanged"`, and raise priority.

# Golden set + backports

* Maintain per‑distro **golden patched refs** (Build‑ID pinned). If the vendor publishes only a source patch, build once with a fixed toolchain profile to derive reference hashes.
* Backports: You’ll often see *different* opcode deltas with the *same* semantic intent. Treat evidence as **policy‑mappable**: define acceptable delta patterns (e.g., bounds‑check added) and store them as **“semantic signatures”**.

# CLI user journey (StellaOps standard CLI)

```
stella scan disasm \
  --pkg openssl --file /usr/lib/x86_64-linux-gnu/libssl.so.1.1 \
  --baseline @golden:debian-12/libssl.so.1.1 \
  --out evidence.json --attest
```

* Output: DSSE‑signed evidence; `stella vex resolve` then pulls it and updates the VEX verdicts.

# Minimal MVP (2 sprints)

**Sprint A (MVP)**

* B2R2 host + norm‑v1 for x86_64, aarch64 (ELF).
* Function bucketing + opcode hash; per‑function delta; DSSE evidence.
* VEX rule: “all listed fix‑sites changed → not_affected”.

**Sprint B**

* Semantic hash; IMM bucketing; PLT/reloc canon; UI diff viewer in Timeline.
* Golden‑set builder & cache; distro backport adapters (Debian, RHEL, Alpine, SUSE, Astra).

# Risks & guardrails

* Stripped binaries: OK (IR still works). PIE/ASLR: neutralized via reloc canon. LTO/inlining: mitigate with CFG shape + semantic hash (not symbol names).
* False positives: keep “changed‑but‑harmless” patterns allowlisted via semantic signatures (policy‑versioned).
* Performance: cache lifted IR by `(digest, arch, profile)`; parallelize per function.
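As a rough illustration of the per‑function diff and the Sprint A rule (“all listed fix‑sites changed → not_affected”), here is a minimal C# sketch. The `FnStatus`, `FnHashes`, and `PatchDiff` names are hypothetical, and it assumes the bucket keys are stable across the baseline and patched builds so functions can be matched by key. Treating an empty fix‑site list as `affected` is a conservative assumption, not something the plan specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum FnStatus { Unchanged, Changed, Added, Removed }

// Hypothetical per-function hash pair; records compare by value.
public sealed record FnHashes(string OpcodeHash, string SemanticHash);

public static class PatchDiff
{
    // Keys are function bucket IDs assumed to be stable across both builds.
    public static Dictionary<string, FnStatus> Diff(
        IReadOnlyDictionary<string, FnHashes> before,
        IReadOnlyDictionary<string, FnHashes> after)
    {
        var result = new Dictionary<string, FnStatus>(StringComparer.Ordinal);
        foreach (var (id, b) in before)
        {
            if (!after.TryGetValue(id, out var a))
                result[id] = FnStatus.Removed;
            else
                result[id] = a == b ? FnStatus.Unchanged : FnStatus.Changed;
        }
        foreach (var id in after.Keys)
        {
            if (!before.ContainsKey(id))
                result[id] = FnStatus.Added;
        }
        return result;
    }

    // Sprint A rule: every listed fix site must have changed.
    public static string Verdict(
        IReadOnlyDictionary<string, FnStatus> diff,
        IEnumerable<string> requiredFixSites)
    {
        var sites = requiredFixSites.ToList();
        if (sites.Count == 0)
            return "affected"; // no mapped fix sites means no evidence (assumption)

        bool allChanged = sites.All(id =>
            diff.TryGetValue(id, out var s) && s == FnStatus.Changed);
        return allChanged ? "not_affected" : "affected";
    }
}
```

A fuller rule would also check the delta against a semantic signature (for example, that a bounds check was added) rather than accepting any change, since an unrelated edit to a fix‑site function would otherwise satisfy this rule.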
If you want, I can draft the **norm‑v1** pass list as a concrete F# pipeline for B2R2 and a **.proto/JSON‑Schema** for `stella.disasm.patch-evidence@1`, ready to drop into `scanner.webservice`.