git.stella-ops.org/docs-archived/product/advisories/03-Dec-2026 - C# Disassembly with Deterministic Signatures.md
2026-01-08 09:06:03 +02:00


Here's a tight, practical plan to add deterministic binary-patch evidence to StellaOps by integrating B2R2 (a binary analysis framework written in F# for .NET that provides disassembly and IR lifting) into your scanning pipeline, then feeding stable “diff signatures” into your VEX Resolver.

What & why (one minute)

  • Goal: Prove (offline) that a distro backport truly patched a CVE—even if version strings look “vulnerable”—by comparing what the CPU will execute before/after a patch.
  • How: Lift binaries to a normalized IR with B2R2, canonicalize semantics (strip address noise, relocations, NOPs, padding), bucket by function and hash stable opcode/semantics. Patch deltas become small, reproducible evidence blobs your VEX engine can consume.

High-level flow

  1. Collect: For each package/artifact, grab: installed binary, claimed patched reference (vendor's patched ELF/PE or your golden set), and optional original vulnerable build.

  2. Lift: Use B2R2 to disassemble → lift to its low-level IR (LowUIR), then SSA (arch-agnostic).

  3. Normalize (deterministic):

    • Strip addrs/symbols/relocations; fold NOPs; normalize register aliases; constant-prop + dead-code elim; canonical call/ret; normalize PLT stubs; elide alignment/padding.
  4. Segment: Per-function IR slices bounded by CFG; compute stable function IDs = SHA-256(package@version, build-id, arch, fn-cfg-shape).

  5. Hashing:

    • Opcode hash: SHA-256 of normalized opcode stream.
    • Semantic hash: SHA-256 of (basic-block graph + dataflow summaries).
    • Const set hash: extracted immediate set (range-bucketed) to detect patched lookups.
  6. Diff:

    • Compare (patched vs baseline) per function: unchanged / changed / added / removed.
    • For changed: emit delta record with before/after hashes and minimal edit script (block-level).
  7. Evidence object (deterministic, replayable):

    • type: "stella.disasm.patch-evidence@1"
    • inputs: file digests (SHA-256/SHA3-256), BuildID, arch, toolchain versions, B2R2 commit, normalization profile ID
    • outputs: per-function records + global summary
    • sign: DSSE (in-toto link) with your offline key profile
  8. Feed VEX:

    • Map CVE→fix-site heuristics (from vendor advisories/diff hints) to function buckets.
    • If all required buckets show “patched” (semantic hash change matches inventory rule), set state=not_affected, justification=code_not_present (CycloneDX VEX, CVE-level; CycloneDX lists code_not_present and code_not_reachable as separate justification values) with pointer to evidence object.
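Steps 4 and 5 can be sketched in C# roughly as follows. This is a minimal illustration, not a B2R2 or StellaOps API: `FunctionHashing`, its method names, and the length-prefixed field encoding are all assumptions made for the example. The normalized opcode stream is assumed to arrive as plain strings.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration; not a B2R2 or StellaOps API.
public static class FunctionHashing
{
    // Length-prefix each field so ("ab","c") and ("a","bc") hash differently.
    private static string HashFields(IEnumerable<string> fields)
    {
        using var sha = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        Span<byte> len = stackalloc byte[4];
        foreach (var field in fields)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(field);
            BinaryPrimitives.WriteInt32LittleEndian(len, bytes.Length);
            sha.AppendData(len);
            sha.AppendData(bytes);
        }
        return Convert.ToHexString(sha.GetHashAndReset()).ToLowerInvariant();
    }

    // Function ID = SHA-256(package@version, build-id, arch, fn-cfg-shape).
    public static string FunctionId(string package, string version,
                                    string buildId, string arch, string cfgShapeHash)
        => HashFields(new[] { $"{package}@{version}", buildId, arch, cfgShapeHash });

    // Opcode hash = SHA-256 over the normalized opcode stream, in order.
    public static string OpcodeHash(IEnumerable<string> normalizedOpcodes)
        => HashFields(normalizedOpcodes);
}
```

Because the function ID mixes in package@version and build-id, it differs between the baseline and the patched build by construction. Matching functions across the two builds therefore likely needs a build-independent key (for example the CFG-shape hash or a symbol name where available).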

Module boundaries in StellaOps

  • Scanner.WebService (per your rule): host lattice algorithms + this disassembly stage.
  • Sbomer: records exact files/BuildIDs in the CycloneDX 1.6/1.7 SBOM (you're moving to 1.7 soon, so ensure properties include disasm.profile, b2r2.version).
  • Feedser/Vexer: consume evidence blobs; Vexer attaches VEX statements referencing evidenceRef.
  • Authority/Attestor: sign DSSE attestations; Timeline/Notify surface verdict transitions.

On-disk schemas (minimal)

{
  "type": "stella.disasm.patch-evidence@1",
  "subject": [{"name": "libssl.so.1.1", "digest": {"sha256": "<...>"}, "buildId": "elf:..."}],
  "tool": {"name": "stella-b2r2", "b2r2": "<commit>", "profile": "norm-v1"},
  "arch": "x86_64",
  "functions": [{
    "fnId": "sha256(pkg,buildId,arch,cfgShape)",
    "addrRange": "0x401000-0x40118f",
    "opcodeHashBefore": "<...>",
    "opcodeHashAfter":  "<...>",
    "semanticHashBefore": "<...>",
    "semanticHashAfter":  "<...>",
    "delta": {"blocksEdited": 2, "immDiff": ["0x7f->0x00"]}
  }],
  "summary": {"unchanged": 812, "changed": 6, "added": 1, "removed": 0}
}
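For the consuming side (Vexer, for example), a minimal C# sketch of reading this document with System.Text.Json could look like the following. The record and class names are hypothetical; only the JSON field names come from the schema above. `JsonSerializerDefaults.Web` is used so that camelCase JSON keys bind to PascalCase record properties.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical DTOs mirroring the minimal schema; names are illustrative only.
public sealed record SubjectRef(string Name, Dictionary<string, string> Digest, string BuildId);
public sealed record ToolInfo(string Name, string B2r2, string Profile);
public sealed record FunctionDelta(int BlocksEdited, List<string> ImmDiff);
public sealed record FunctionRecord(
    string FnId, string AddrRange,
    string OpcodeHashBefore, string OpcodeHashAfter,
    string SemanticHashBefore, string SemanticHashAfter,
    FunctionDelta Delta);
public sealed record DiffSummary(int Unchanged, int Changed, int Added, int Removed);
public sealed record PatchEvidence(
    string Type, List<SubjectRef> Subject, ToolInfo Tool, string Arch,
    List<FunctionRecord> Functions, DiffSummary Summary);

public static class EvidenceJson
{
    // Web defaults: camelCase property names, case-insensitive matching.
    private static readonly JsonSerializerOptions Options = new(JsonSerializerDefaults.Web);

    public static PatchEvidence Parse(string json) =>
        JsonSerializer.Deserialize<PatchEvidence>(json, Options)
        ?? throw new JsonException("evidence document was null");
}
```

Fields absent from the JSON come back as null or zero in this sketch, so a real consumer would want validation on top before trusting a verdict.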

Determinism controls

  • Pin B2R2 version and normalization profile; serialize the profile (passes + order + flags) and include it in evidence.
  • Containerize the lifter; record image digest in evidence.
  • For randomness (e.g., hash salts), set fixed zeros; set TZ=UTC, LC_ALL=C, and stable CPU features.
  • Replay manifests: list all inputs (file digests, B2R2 commit, profile) so anyone can rerun and reproduce the exact hashes.
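A replay manifest only helps if it serializes identically everywhere. One possible approach, sketched here under assumptions, is to sort keys ordinally (culture-independent) and hash the resulting text. `ReplayManifest` and the `key=value` line format are hypothetical, and the sketch assumes keys and values contain no `=` or newline characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; the key=value line format is an assumption of this sketch.
public static class ReplayManifest
{
    // Ordinal key order makes the output independent of insertion order and locale.
    public static string Canonicalize(IReadOnlyDictionary<string, string> inputs)
    {
        var sb = new StringBuilder();
        foreach (var kv in inputs.OrderBy(kv => kv.Key, StringComparer.Ordinal))
            sb.Append(kv.Key).Append('=').Append(kv.Value).Append('\n');
        return sb.ToString();
    }

    // SHA-256 over the UTF-8 bytes of the canonical text, as lowercase hex.
    public static string Digest(IReadOnlyDictionary<string, string> inputs)
        => Convert.ToHexString(
               SHA256.HashData(Encoding.UTF8.GetBytes(Canonicalize(inputs))))
           .ToLowerInvariant();
}
```

Two runs that record the same file digests, B2R2 commit, and profile then produce the same manifest digest regardless of the order in which the inputs were collected.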

C# integration sketch (.NET 10)

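// StellaOps.Scanner.Disasm
using System.Threading.Tasks;

public sealed class DisasmService
{
    private readonly IBinarySource _source; // pulls files + vendor refs
    private readonly IB2R2Host _b2r2;       // thin wrapper over the B2R2 F# assemblies (direct .NET reference) or its CLI
    private readonly INormalizer _norm;     // norm-v1 pipeline
    private readonly IEvidenceStore _evidence;

    // Dependencies are injected so the readonly fields are actually assigned.
    public DisasmService(IBinarySource source, IB2R2Host b2r2, INormalizer norm, IEvidenceStore evidence)
    {
        _source = source;
        _b2r2 = b2r2;
        _norm = norm;
        _evidence = evidence;
    }

    public async Task<DisasmEvidence> AnalyzeAsync(Artifact a, Artifact baseline)
    {
        // Lift both binaries to IR ("after" = artifact under test, "before" = baseline).
        var liftedAfter = await _b2r2.LiftAsync(a.Path, a.Arch);
        var liftedBefore = await _b2r2.LiftAsync(baseline.Path, baseline.Arch);

        // Apply the same normalization profile to both sides.
        var fnAfter = _norm.Normalize(liftedAfter).Functions;
        var fnBefore = _norm.Normalize(liftedBefore).Functions;

        // Bucket and DiffBuckets are helpers left undefined in this sketch.
        var bucketsAfter = Bucket(fnAfter);
        var bucketsBefore = Bucket(fnBefore);

        var diff = DiffBuckets(bucketsBefore, bucketsAfter);
        var evidence = EvidenceBuilder.Build(a, baseline, diff, _norm.ProfileId, _b2r2.Version);

        await _evidence.PutAsync(evidence);  // write + DSSE sign via Attestor
        return evidence;
    }
}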
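The sketch above leaves `DiffBuckets` undefined. One way it might look, assuming each side has already been reduced to a map from a function key to a hash, is shown below. `FnStatus`, `FnDiff`, and `BucketDiff` are hypothetical names. The function key is assumed to be comparable across builds, which a build-id-derived fnId would not be.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public enum FnStatus { Unchanged, Changed, Added, Removed }

public sealed record FnDiff(string Key, FnStatus Status, string? HashBefore, string? HashAfter);

// Hypothetical diff helper: classifies every function key seen on either side.
public static class BucketDiff
{
    public static IReadOnlyList<FnDiff> Diff(
        IReadOnlyDictionary<string, string> before,
        IReadOnlyDictionary<string, string> after)
    {
        var result = new List<FnDiff>();
        // Ordinal sort keeps the output order stable across runs and machines.
        foreach (var key in before.Keys.Union(after.Keys).OrderBy(k => k, StringComparer.Ordinal))
        {
            bool hasBefore = before.TryGetValue(key, out var hb);
            bool hasAfter = after.TryGetValue(key, out var ha);
            var status = (hasBefore, hasAfter) switch
            {
                (true, true) => hb == ha ? FnStatus.Unchanged : FnStatus.Changed,
                (false, true) => FnStatus.Added,
                _ => FnStatus.Removed,
            };
            result.Add(new FnDiff(key, status, hasBefore ? hb : null, hasAfter ? ha : null));
        }
        return result;
    }
}
```

Counting the results by `Status` gives the unchanged/changed/added/removed totals used in the evidence summary.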

Normalization profile (norm-v1)

  • Pass order: CFG build → SSA → const-prop → DCE → register-rename canon → call/ret stub canon → PLT/plt.got unwrap → NOP/padding strip → reloc placeholder canon (IMM_RELOC tokens) → block reordering freeze (cfg sort).
  • Hash material: for block in topo(cfg): emit (opcode, operandKinds, IMM_BUCKETS); exclude absolute addrs/symbols.
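Real CFGs contain loops, so a strict topological order does not always exist. The sketch below substitutes a reverse post-order walk from the entry block, which is deterministic as long as successor order is canonical. That substitution is an assumption of this example, not something the profile specifies. `Insn`, `Block`, and `HashMaterial` are hypothetical types, and successors are assumed to be stored in a fixed order (for instance fall-through first, then branch target).

```csharp
using System.Collections.Generic;

// Hypothetical normalized-IR shapes, for illustration only.
public sealed record Insn(string Opcode, string OperandKinds, string ImmBuckets);
public sealed record Block(int Id, IReadOnlyList<Insn> Insns, IReadOnlyList<int> Successors);

public static class HashMaterial
{
    // Reverse post-order from the entry block; back edges are skipped via the visited set.
    // Recursive for brevity; very deep CFGs would need an explicit stack.
    public static List<int> ReversePostOrder(IReadOnlyDictionary<int, Block> cfg, int entry)
    {
        var visited = new HashSet<int>();
        var postOrder = new List<int>();

        void Visit(int id)
        {
            if (!visited.Add(id)) return;
            foreach (var succ in cfg[id].Successors) Visit(succ);
            postOrder.Add(id);
        }

        Visit(entry);
        postOrder.Reverse();
        return postOrder;
    }

    // Emits one token per instruction; absolute addresses and symbols never appear.
    public static IEnumerable<string> Emit(IReadOnlyDictionary<int, Block> cfg, int entry)
    {
        foreach (var id in ReversePostOrder(cfg, entry))
            foreach (var insn in cfg[id].Insns)
                yield return $"{insn.Opcode}|{insn.OperandKinds}|{insn.ImmBuckets}";
    }
}
```

The emitted token stream is what would be fed to the opcode hash. Block IDs affect only lookup, not the output, so renumbering blocks consistently leaves the stream unchanged.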

Hash-bucketing details

  • IMM_BUCKETS: bucket immediates by role: {addr, const, mask, len}. For addr, replace with IMM_RELOC(section, relType). For const, clamp to ranges (e.g., table sizes).
  • CFG shape hash: adjacency list over block arity; keeps compiler noise from breaking determinism.
  • Semantic hash seed: keccak of (CFG shape hash || value-flow summaries per def-use).
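The IMM_BUCKETS idea can be sketched as a small role-based mapper. Only the `addr` and `const` behaviour comes from the description above. Treating `len` like `const` and keeping `mask` values exact are placeholder policies chosen for this example, as is the power-of-two range clamp. `ImmRole` and `ImmBucketing` are hypothetical names.

```csharp
using System;

public enum ImmRole { Addr, Const, Mask, Len }

// Hypothetical bucketing policy; the range scheme is an assumption of this sketch.
public static class ImmBucketing
{
    // Smallest power of two >= v (minimum 1, capped at 2^63).
    private static ulong RangeCeiling(ulong v)
    {
        ulong p = 1;
        while (p < v && p <= ulong.MaxValue / 2) p <<= 1;
        return p;
    }

    public static string Bucket(ImmRole role, ulong value, string section = "", string relType = "") =>
        role switch
        {
            // Addresses collapse to a relocation token; the numeric value is dropped.
            ImmRole.Addr => $"IMM_RELOC({section},{relType})",
            // Constants and lengths are clamped to a coarse range.
            ImmRole.Const => $"const<={RangeCeiling(value)}",
            ImmRole.Len => $"len<={RangeCeiling(value)}",
            // Masks are kept exact here (placeholder policy).
            ImmRole.Mask => $"mask:0x{value:x}",
            _ => throw new ArgumentOutOfRangeException(nameof(role)),
        };
}
```

Coarse ranges trade sensitivity for stability: a patch that changes a bound from 200 to 220 lands in the same bucket and would be invisible to this hash, so the exact immediates probably still need to be recorded separately for `immDiff`.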

VEX Resolver hookup

  • Extend rule language: requires(fnId in {"EVP_DigestVerifyFinal", ...} && delta.immDiff.any == true) → verdict not_affected with justification="code_not_present" and impactStatement="Patched verification path altered constants".
  • If some required fix-sites unchanged → affected=true with actionStatement="Patched binary mismatch: function(s) unchanged", priority ↑.
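The simplest form of this rule ("all listed fix-sites changed → not_affected") could be evaluated along these lines. `FixSiteRule`, `Verdict`, and `VexRules` are hypothetical types, not the resolver's actual rule language. The justification string uses CycloneDX's `code_not_present` value, which assumes CycloneDX is the target VEX vocabulary.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule and verdict shapes, for illustration only.
public sealed record FixSiteRule(string Cve, IReadOnlyList<string> RequiredFunctions);
public sealed record Verdict(string State, string? Justification, string Statement);

public static class VexRules
{
    public static Verdict Evaluate(FixSiteRule rule, IReadOnlySet<string> changedFunctions)
    {
        // An empty rule must not yield a vacuous "not_affected".
        if (rule.RequiredFunctions.Count == 0)
            throw new ArgumentException("rule lists no fix-sites", nameof(rule));

        var unchanged = rule.RequiredFunctions
            .Where(f => !changedFunctions.Contains(f))
            .OrderBy(f => f, StringComparer.Ordinal)
            .ToList();

        return unchanged.Count == 0
            ? new Verdict("not_affected", "code_not_present", "All listed fix-sites changed")
            : new Verdict("affected", null,
                "Patched binary mismatch: function(s) unchanged: " + string.Join(", ", unchanged));
    }
}
```

A fuller version would also check the delta predicate (for example `immDiff` being non-empty) per fix-site and attach the evidence reference to the verdict.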

Golden set + backports

  • Maintain per-distro golden patched refs (BuildID pinned). If vendor publishes only source patch, build once with a fixed toolchain profile to derive reference hashes.
  • Backports: You'll often see different opcode deltas with the same semantic intent. Treat evidence as policy-mappable: define acceptable delta patterns (e.g., bounds-check added) and store them as “semantic signatures”.

CLI user journey (StellaOps standard CLI)

stella scan disasm \
  --pkg openssl --file /usr/lib/x86_64-linux-gnu/libssl.so.1.1 \
  --baseline @golden:debian-12/libssl.so.1.1 \
  --out evidence.json --attest
  • Output: DSSE-signed evidence; stella vex resolve then pulls it and updates the VEX verdicts.

Minimal MVP (2 sprints)

Sprint A (MVP)

  • B2R2 host + norm-v1 for x86_64, aarch64 (ELF).
  • Function bucketing + opcode hash; per-function delta; DSSE evidence.
  • VEX rule: “all listed fix-sites changed → not_affected”.

Sprint B

  • Semantic hash; IMM bucketing; PLT/reloc canon; UI diff viewer in Timeline.
  • Golden-set builder & cache; distro backport adapters (Debian, RHEL, Alpine, SUSE, Astra).

Risks & guardrails

  • Stripped binaries: OK (IR still works). PIE/ASLR: neutralized via reloc canon. LTO/inlining: mitigate with CFG shape + semantic hash (not symbol names).
  • False positives: keep “changed-but-harmless” patterns whitelisted via semantic signatures (policy-versioned).
  • Performance: cache lifted IR by (digest, arch, profile); parallelize per function.

If you want, I can draft the norm-v1 pass list as a concrete F# pipeline for B2R2 and a .proto/JSON Schema for stella.disasm.patch-evidence@1, ready to drop into scanner.webservice.