Here’s a tight, practical plan to add deterministic binary‑patch evidence to Stella Ops by integrating B2R2 (IR lifter/disassembler for .NET/F#) into your scanning pipeline, then feeding stable “diff signatures” into your VEX Resolver.
What & why (one minute)
- Goal: Prove (offline) that a distro backport truly patched a CVE—even if version strings look “vulnerable”—by comparing what the CPU will execute before/after a patch.
- How: Lift binaries to a normalized IR with B2R2, canonicalize semantics (strip address noise, relocations, NOPs, padding), bucket by function and hash stable opcode/semantics. Patch deltas become small, reproducible evidence blobs your VEX engine can consume.
High‑level flow
- Collect: For each package/artifact, grab the installed binary, the claimed patched reference (vendor’s patched ELF/PE or your golden set), and optionally the original vulnerable build.
- Lift: Use B2R2 to disassemble → lift to its LowUIR, then SSA (arch‑agnostic).
- Normalize (deterministic):
  - Strip addrs/symbols/relocations; fold NOPs; normalize register aliases; constant‑prop + dead‑code elim; canonical call/ret; normalize PLT stubs; elide alignment/padding.
- Segment: Per‑function IR slices bounded by CFG; compute stable function IDs = SHA256(package@version, build-id, arch, fn-cfg-shape).
- Hashing:
  - Opcode hash: SHA256 of normalized opcode stream.
  - Semantic hash: SHA256 of (basic‑block graph + dataflow summaries).
  - Const set hash: extracted immediate set (range‑bucketed) to detect patched lookups.
- Diff:
  - Compare (patched vs baseline) per function: unchanged / changed / added / removed.
  - For changed: emit delta record with before/after hashes and minimal edit script (block‑level).
- Evidence object (deterministic, replayable):
  - type: "stella.disasm.patch-evidence@1"
  - inputs: file digests (SHA256/SHA3‑256), Build‑ID, arch, toolchain versions, B2R2 commit, normalization profile ID
  - outputs: per‑function records + global summary
  - sign: DSSE (in‑toto link) with your offline key profile
- Feed VEX:
  - Map CVE→fix‑site heuristics (from vendor advisories/diff hints) to function buckets.
  - If all required buckets show “patched” (semantic hash change matches inventory rule), set affected=false, justification=code_not_present (the CycloneDX VEX justification value for removed vulnerable code, CVE‑level) with pointer to evidence object.
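The Diff step can be sketched as follows. This is a minimal illustration in Python (chosen only for brevity; the real stage would live in the .NET scanner). The `FnHashes` type, the `diff_functions` name, and the idea that both builds share a build‑independent function key are assumptions, not part of any existing Stella Ops API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FnHashes:
    """Hashes for one normalized function (hypothetical record shape)."""
    opcode_hash: str
    semantic_hash: str


def diff_functions(before: dict[str, FnHashes], after: dict[str, FnHashes]) -> dict:
    """Classify every function as unchanged / changed / added / removed.

    Keys are assumed to identify the same function in both builds.
    Iteration is over sorted keys so the output order is reproducible.
    """
    summary = {"unchanged": 0, "changed": 0, "added": 0, "removed": 0}
    records = []
    for key in sorted(before.keys() | after.keys()):
        b, a = before.get(key), after.get(key)
        if b is None:
            status = "added"
        elif a is None:
            status = "removed"
        elif a == b:
            status = "unchanged"
        else:
            status = "changed"
        summary[status] += 1
        if status != "unchanged":
            # Only deltas are recorded; unchanged functions appear in the summary.
            records.append({
                "fnKey": key,
                "status": status,
                "opcodeHashBefore": b.opcode_hash if b else None,
                "opcodeHashAfter": a.opcode_hash if a else None,
                "semanticHashBefore": b.semantic_hash if b else None,
                "semanticHashAfter": a.semantic_hash if a else None,
            })
    return {"functions": records, "summary": summary}
```

Sorting the keys before iterating is the one design choice that matters here: it keeps the record order independent of dictionary insertion order, so two runs over the same inputs serialize identically.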
Module boundaries in Stella Ops
- Scanner.WebService (per your rule): host lattice algorithms + this disassembly stage.
- Sbomer: records exact files/Build‑IDs in CycloneDX 1.6/1.7 SBOM (you’re moving to 1.7 soon; ensure properties include disasm.profile, b2r2.version).
- Feedser/Vexer: consume evidence blobs; Vexer attaches VEX statements referencing evidenceRef.
- Authority/Attestor: sign DSSE attestations; Timeline/Notify surface verdict transitions.
On‑disk schemas (minimal)
{
  "type": "stella.disasm.patch-evidence@1",
  "subject": [{"name": "libssl.so.1.1", "digest": {"sha256": "<...>"}, "buildId": "elf:..."}],
  "tool": {"name": "stella-b2r2", "b2r2": "<commit>", "profile": "norm-v1"},
  "arch": "x86_64",
  "functions": [{
    "fnId": "sha256(pkg,buildId,arch,cfgShape)",
    "addrRange": "0x401000-0x40118f",
    "opcodeHashBefore": "<...>",
    "opcodeHashAfter": "<...>",
    "semanticHashBefore": "<...>",
    "semanticHashAfter": "<...>",
    "delta": {"blocksEdited": 2, "immDiff": ["0x7f->0x00"]}
  }],
  "summary": {"unchanged": 812, "changed": 6, "added": 1, "removed": 0}
}
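One way the fnId field might be derived, sketched in Python for brevity. The `function_id` name, the `sha256:` prefix, and the length‑prefixed field encoding are assumptions; the schema only says the ID is a SHA256 over (pkg, buildId, arch, cfgShape). Also note that because the Build‑ID is part of the input, this ID is specific to one binary, so matching a function across baseline and patched builds would need a separate key without it.

```python
import hashlib


def function_id(package: str, version: str, build_id: str, arch: str,
                cfg_shape_hash: str) -> str:
    """SHA-256 over (package@version, build-id, arch, cfg-shape).

    Each field is length-prefixed before hashing so that field boundaries
    are unambiguous: ("ab", "c") and ("a", "bc") cannot collide.
    """
    digest = hashlib.sha256()
    for field in (f"{package}@{version}", build_id, arch, cfg_shape_hash):
        data = field.encode("utf-8")
        digest.update(len(data).to_bytes(4, "big"))  # 4-byte big-endian length
        digest.update(data)
    return "sha256:" + digest.hexdigest()
```

Plain concatenation would also be deterministic, but the length prefix avoids accidental collisions when a field value happens to contain the separator.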
Determinism controls
- Pin B2R2 version and normalization profile; serialize the profile (passes + order + flags) and include it in evidence.
- Containerize the lifter; record image digest in evidence.
- For randomness (e.g., hash‑salts), set fixed zeros; set TZ=UTC, LC_ALL=C, and stable CPU features.
- Replay manifests: list all inputs (file digests, B2R2 commit, profile) so anyone can re‑run and reproduce the exact hashes.
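A replay manifest is only useful if its own digest is stable, so a sketch of canonical serialization may help. This is Python for illustration only; `manifest_digest` and the manifest keys are hypothetical, and the sorted‑key compact JSON used here is a simple convention, not a full implementation of a canonicalization standard such as RFC 8785.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """Hash a replay manifest independent of key order and whitespace.

    sort_keys fixes the key order, separators removes optional whitespace,
    and ensure_ascii keeps the byte stream free of encoding differences.
    """
    canonical = json.dumps(
        manifest,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two manifests with the same content but different key order then hash to the same value, while any change to a pinned input (profile, tool commit, file digest) changes the digest.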
C# integration sketch (.NET 10)
// StellaOps.Scanner.Disasm
// Sketch only: Bucket, DiffBuckets, and EvidenceBuilder are helpers not shown here.
using System.Threading.Tasks;

public sealed class DisasmService
{
    private readonly IBinarySource _source;     // pulls files + vendor refs
    private readonly IB2R2Host _b2r2;           // thin wrapper over B2R2 (F# assembly reference or CLI)
    private readonly INormalizer _norm;         // norm-v1 pipeline
    private readonly IEvidenceStore _evidence;

    public DisasmService(IBinarySource source, IB2R2Host b2r2, INormalizer norm, IEvidenceStore evidence)
    {
        _source = source;
        _b2r2 = b2r2;
        _norm = norm;
        _evidence = evidence;
    }

    // Lifts and normalizes both binaries, diffs them per function, and stores the evidence.
    public async Task<DisasmEvidence> AnalyzeAsync(Artifact a, Artifact baseline)
    {
        var liftedAfter = await _b2r2.LiftAsync(a.Path, a.Arch);
        var liftedBefore = await _b2r2.LiftAsync(baseline.Path, baseline.Arch);

        var fnAfter = _norm.Normalize(liftedAfter).Functions;
        var fnBefore = _norm.Normalize(liftedBefore).Functions;

        var bucketsAfter = Bucket(fnAfter);
        var bucketsBefore = Bucket(fnBefore);

        var diff = DiffBuckets(bucketsBefore, bucketsAfter);
        var evidence = EvidenceBuilder.Build(a, baseline, diff, _norm.ProfileId, _b2r2.Version);
        await _evidence.PutAsync(evidence); // write + DSSE sign via Attestor
        return evidence;
    }
}
Normalization profile (norm‑v1)
- Pass order: CFG build → SSA → const‑prop → DCE → register‑rename‑canon → call/ret stub‑canon → PLT/plt.got unwrap → NOP/padding strip → reloc placeholder canon (IMM_RELOC tokens) → block re‑ordering freeze (cfg sort).
- Hash material: for block in topo(cfg): emit (opcode, operandKinds, IMM_BUCKETS); exclude absolute addrs/symbols.
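The hash‑material rule can be sketched like this, in Python for brevity. The CFG representation (a dict of blocks with `instrs` and `succs`) and both function names are assumptions. Since a CFG with loops has no strict topological order, this sketch substitutes reverse post‑order from the entry block, which is deterministic as long as each block's successor list is already in a canonical order (for example, fall‑through first).

```python
import hashlib


def reverse_postorder(cfg: dict, entry) -> list:
    """Deterministic block order: iterative DFS, successors in listed order."""
    seen = {entry}
    post = []
    stack = [(entry, iter(cfg[entry]["succs"]))]
    while stack:
        block, succs = stack[-1]
        nxt = next((s for s in succs if s not in seen), None)
        if nxt is None:
            post.append(block)  # all successors done: emit in post-order
            stack.pop()
        else:
            seen.add(nxt)
            stack.append((nxt, iter(cfg[nxt]["succs"])))
    return post[::-1]


def opcode_hash(cfg: dict, entry) -> str:
    """SHA-256 over (opcode, operand kinds) per block; block IDs are never hashed."""
    digest = hashlib.sha256()
    for block in reverse_postorder(cfg, entry):
        for opcode, operand_kinds in cfg[block]["instrs"]:
            token = f"{opcode}({','.join(operand_kinds)});"
            digest.update(token.encode("ascii"))
        digest.update(b"|")  # block boundary marker
    return digest.hexdigest()
```

Because block identifiers (typically addresses) only drive traversal and never enter the digest, relocating a function leaves its hash unchanged, while editing an opcode or operand kind changes it.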
Hash‑bucketing details
- IMM_BUCKETS: bucket immediates by role: {addr, const, mask, len}. For addr, replace with IMM_RELOC(section, relType). For const, clamp to ranges (e.g., table sizes).
- CFG shape hash: adjacency list over block arity; keeps compiler‑noise from breaking determinism.
- Semantic hash seed: SHA‑256 of (CFG shape hash || value‑flow summaries per def‑use), the same digest the flow above specifies for the semantic hash.
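A rough sketch of the bucketing and the CFG shape hash, in Python for illustration. The token formats, the bit‑length ranges for const, and the choice to keep mask and len values exact are all assumptions; only the four roles and the IMM_RELOC(section, relType) replacement come from the list above.

```python
import hashlib


def bucket_immediate(role: str, value: int = 0, section: str = "", rel_type: str = "") -> str:
    """Map an immediate to a stable token based on its role."""
    if role == "addr":
        # Addresses are replaced entirely; only the relocation target kind survives.
        return f"IMM_RELOC({section},{rel_type})"
    if role == "const":
        # Clamp to a power-of-two range via bit length (assumed bucketing scheme).
        sign = "-" if value < 0 else "+"
        return f"CONST({sign}{abs(value).bit_length()}b)"
    if role in ("mask", "len"):
        # Masks and lengths are kept exact in this sketch.
        return f"{role.upper()}({value:#x})"
    raise ValueError(f"unknown immediate role: {role}")


def cfg_shape_hash(successors: list[list[int]]) -> str:
    """Hash block arity and edges; blocks are indexed in a canonical order.

    successors[i] lists the indices of the blocks reachable from block i.
    """
    digest = hashlib.sha256()
    for succs in successors:
        edges = ",".join(str(s) for s in succs)
        digest.update(f"{len(succs)}:{edges};".encode("ascii"))
    return digest.hexdigest()
```

The trade‑off with range buckets is sensitivity: a change such as 0x7f to 0x00 moves to a different bucket and is detected, but a bound that moves within the same range (0x40 to 0x7f here) is not, so bucket width is a policy decision.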
VEX Resolver hookup
- Extend rule language:
requires(fnId in {"EVP_DigestVerifyFinal", ...} && delta.immDiff.any == true)→ verdictnot_affectedwithjustification="code_not_present_or_not_reachable"andimpactStatement="Patched verification path altered constants". - If some required fix‑sites unchanged →
affected=truewithactionStatement="Patched binary mismatch: function(s) unchanged", priority ↑.
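How the resolver might evaluate such a rule against the evidence records, sketched in Python for brevity. The rule dict layout (`requires`, `requireImmDiff`, `justification`, `impactStatement`) and the function names are hypothetical; the record fields follow the on‑disk schema above. A fix‑site with no record at all is treated as not patched, which is a conservative assumption.

```python
def site_is_patched(record, require_imm_diff: bool) -> bool:
    """A fix-site counts as patched when its semantic hash changed
    and, if the rule asks for it, at least one immediate changed."""
    if record is None:
        return False  # no evidence for this function: be conservative
    if record["semanticHashBefore"] == record["semanticHashAfter"]:
        return False
    if require_imm_diff and not record.get("delta", {}).get("immDiff"):
        return False
    return True


def resolve_verdict(rule: dict, function_records: list[dict]) -> dict:
    """Return not_affected only if every required fix-site is patched."""
    by_id = {rec["fnId"]: rec for rec in function_records}
    unchanged = [
        fn_id for fn_id in sorted(rule["requires"])
        if not site_is_patched(by_id.get(fn_id), rule.get("requireImmDiff", False))
    ]
    if not unchanged:
        return {
            "verdict": "not_affected",
            "justification": rule["justification"],
            "impactStatement": rule["impactStatement"],
        }
    return {
        "verdict": "affected",
        "actionStatement": "Patched binary mismatch: function(s) unchanged",
        "unchangedFixSites": unchanged,
    }
```

Keeping the justification and impact statement in the rule rather than in code means policy changes do not require a resolver release.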
Golden set + backports
- Maintain per‑distro golden patched refs (Build‑ID pinned). If vendor publishes only source patch, build once with a fixed toolchain profile to derive reference hashes.
- Backports: You’ll often see different opcode deltas with the same semantic intent—treat evidence as policy‑mappable: define acceptable delta patterns (e.g., bounds‑check added) and store them as “semantic signatures”.
CLI user journey (StellaOps standard CLI)
stella scan disasm \
  --pkg openssl --file /usr/lib/x86_64-linux-gnu/libssl.so.1.1 \
  --baseline @golden:debian-12/libssl.so.1.1 \
  --out evidence.json --attest
- Output: DSSE‑signed evidence; stella vex resolve then pulls it and updates the VEX verdicts.
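For reference, what “DSSE‑signed” means at the byte level: the signature covers the DSSE pre‑authentication encoding (PAE) of the payload type and payload, not the raw JSON. A minimal Python sketch follows; the `sign` callable and `keyid` are placeholders for whatever the Attestor's offline key profile provides, and nothing here reflects how the stella CLI is actually implemented.

```python
import base64
from typing import Callable


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding:
    "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    where LEN is the ASCII decimal byte length."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def envelope(payload_type: str, payload: bytes,
             sign: Callable[[bytes], bytes], keyid: str) -> dict:
    """Build a DSSE envelope; `sign` is a caller-supplied signing function."""
    signature = sign(pae(payload_type, payload))
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": payload_type,
        "signatures": [{
            "keyid": keyid,
            "sig": base64.b64encode(signature).decode("ascii"),
        }],
    }
```

Signing the PAE rather than the payload alone binds the payload type into the signature, so a verifier cannot be tricked into interpreting the same bytes as a different document type.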
Minimal MVP (2 sprints)
Sprint A (MVP)
- B2R2 host + norm‑v1 for x86_64, aarch64 (ELF).
- Function bucketing + opcode hash; per‑function delta; DSSE evidence.
- VEX rule: “all listed fix‑sites changed → not_affected”.
Sprint B
- Semantic hash; IMM bucketing; PLT/reloc canon; UI diff viewer in Timeline.
- Golden‑set builder & cache; distro backport adapters (Debian, RHEL, Alpine, SUSE, Astra).
Risks & guardrails
- Stripped binaries: OK (IR still works). PIE/ASLR: neutralized via reloc canon. LTO/inlining: mitigate with CFG shape + semantic hash (not symbol names).
- False positives: keep “changed‑but‑harmless” patterns whitelisted via semantic signatures (policy‑versioned).
- Performance: cache lifted IR by (digest, arch, profile); parallelize per function.
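The performance guardrail could look roughly like this, in Python for illustration. `LiftedIrCache`, `map_functions`, and the in‑memory dict are assumptions; a real deployment would presumably back the cache with disk or object storage.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class LiftedIrCache:
    """In-memory cache keyed by (file digest, arch, normalization profile)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str, str], bytes] = {}

    def get_or_lift(self, digest: str, arch: str, profile: str,
                    lift: Callable[[], bytes]) -> bytes:
        key = (digest, arch, profile)
        if key not in self._entries:
            # Lift only on a miss; any change to the profile is a new key.
            self._entries[key] = lift()
        return self._entries[key]


def map_functions(fn_ids: list[str], work: Callable[[str], str],
                  max_workers: int = 4) -> list[str]:
    """Run per-function work in parallel.

    Executor.map yields results in input order, so the output stays
    deterministic regardless of which worker finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(work, fn_ids))
```

Including the profile in the cache key matters for determinism: IR normalized under one profile must never be served to a run that pins a different one.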
If you want, I can draft the norm‑v1 pass list as a concrete F# pipeline for B2R2 and a .proto/JSON‑Schema for stella.disasm.patch-evidence@1, ready to drop into scanner.webservice.