save progress

This commit is contained in:
StellaOps Bot
2026-01-03 12:41:57 +02:00
parent 83c37243e0
commit d486d41a48
48 changed files with 7174 additions and 1086 deletions


@@ -0,0 +1,153 @@
Here's a tight, practical plan to add **deterministic binary-patch evidence** to StellaOps by integrating **B2R2** (IR lifter/disassembler for .NET/F#) into your scanning pipeline, then feeding stable “diff signatures” into your **VEX Resolver**.
# What & why (one minute)
* **Goal:** Prove (offline) that a distro backport truly patched a CVE—even if version strings look “vulnerable”—by comparing *what the CPU will execute* before/after a patch.
* **How:** Lift binaries to a normalized IR with **B2R2**, canonicalize semantics (strip address noise, relocations, NOPs, padding), **bucket** by function and **hash** stable opcode/semantics. Patch deltas become small, reproducible evidence blobs your VEX engine can consume.
# High-level flow
1. **Collect**: For each package/artifact, grab: *installed binary*, *claimed patched reference* (vendor's patched ELF/PE or your golden set), and optional *original vulnerable build*.
2. **Lift**: Use B2R2 to disassemble → lift to **LIR**/**SSA** (arch-agnostic).
3. **Normalize** (deterministic):
* Strip addrs/symbols/relocations; fold NOPs; normalize register aliases; constant-prop + dead-code elim; canonical call/ret; normalize PLT stubs; elide alignment/padding.
4. **Segment**: Per-function IR slices bounded by CFG; compute **stable function IDs** = `SHA256(package@version, build-id, arch, fn-cfg-shape)`.
5. **Hashing**:
* **Opcode hash**: SHA256 of normalized opcode stream.
* **Semantic hash**: SHA256 of (basic-block graph + dataflow summaries).
* **Const set hash**: extracted immediate set (range-bucketed) to detect patched lookups.
6. **Diff**:
* Compare (patched vs baseline) per function: unchanged / changed / added / removed.
* For changed: emit **delta record** with before/after hashes and minimal edit script (block-level).
7. **Evidence object** (deterministic, replayable):
* `type: "stella.disasm.patch-evidence@1"`
* inputs: file digests (SHA-256/SHA3-256), BuildID, arch, toolchain versions, B2R2 commit, normalization profile ID
* outputs: per-function records + global summary
* sign: DSSE (in-toto link) with your offline key profile
8. **Feed VEX**:
* Map CVE→fix-site heuristics (from vendor advisories/diff hints) to function buckets.
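* If all required buckets show “patched” (semantic hash change matches inventory rule), set **`affected=false, justification=code_not_present_or_not_reachable`** (CycloneDX VEX, CVE-level; in CycloneDX terms this is the `not_affected` analysis state with a `code_not_present` or `code_not_reachable` justification) with a pointer to the evidence object.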
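
The stable function ID from step 4 could be derived roughly as follows. This is a minimal sketch, not StellaOps code: the `FunctionId` class, its `Compute` method, and the length-prefixed field encoding are assumptions made for illustration; the plan itself only names the four inputs and SHA256.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: hashes the four inputs named in step 4 into one hex ID.
public static class FunctionId
{
    public static string Compute(string packageAtVersion, string buildId, string arch, string cfgShape)
    {
        using var buffer = new MemoryStream();
        var length = new byte[4];
        foreach (var field in new[] { packageAtVersion, buildId, arch, cfgShape })
        {
            // Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
            byte[] bytes = Encoding.UTF8.GetBytes(field);
            BinaryPrimitives.WriteInt32BigEndian(length, bytes.Length);
            buffer.Write(length, 0, length.Length);
            buffer.Write(bytes, 0, bytes.Length);
        }
        // Lower-case hex keeps the ID stable across platforms and cultures.
        return Convert.ToHexString(SHA256.HashData(buffer.ToArray())).ToLowerInvariant();
    }
}
```

Length-prefixing is one simple way to make the tuple encoding unambiguous; any canonical encoding works as long as it is recorded in the normalization profile.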
# Module boundaries in StellaOps
* **Scanner.WebService** (per your rule): host *lattice algorithms* + this disassembly stage.
* **Sbomer**: records exact files/BuildIDs in CycloneDX 1.6/1.7 SBOM (you're moving to 1.7 soon, so ensure `properties` include `disasm.profile`, `b2r2.version`).
* **Feedser/Vexer**: consume evidence blobs; Vexer attaches VEX statements referencing `evidenceRef`.
* **Authority/Attestor**: sign DSSE attestations; Timeline/Notify surface verdict transitions.
# On-disk schemas (minimal)
```json
{
  "type": "stella.disasm.patch-evidence@1",
  "subject": [{"name": "libssl.so.1.1", "digest": {"sha256": "<...>"}, "buildId": "elf:..."}],
  "tool": {"name": "stella-b2r2", "b2r2": "<commit>", "profile": "norm-v1"},
  "arch": "x86_64",
  "functions": [{
    "fnId": "sha256(pkg,buildId,arch,cfgShape)",
    "addrRange": "0x401000-0x40118f",
    "opcodeHashBefore": "<...>",
    "opcodeHashAfter": "<...>",
    "semanticHashBefore": "<...>",
    "semanticHashAfter": "<...>",
    "delta": {"blocksEdited": 2, "immDiff": ["0x7f->0x00"]}
  }],
  "summary": {"unchanged": 812, "changed": 6, "added": 1, "removed": 0}
}
```
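
For the .NET side, the schema above might map onto plain records serialized with `System.Text.Json`. The type names below (`PatchEvidence`, `EvidenceSubject`, and so on) and the `EvidenceJson` helper are hypothetical; only the JSON field names come from the schema, and the camel-case naming policy is assumed to reproduce them.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical record types mirroring the JSON fields shown above.
public sealed record SubjectDigest(string Sha256);
public sealed record EvidenceSubject(string Name, SubjectDigest Digest, string BuildId);
public sealed record EvidenceTool(string Name, string B2r2, string Profile);
public sealed record FunctionDelta(int BlocksEdited, IReadOnlyList<string> ImmDiff);
public sealed record FunctionRecord(
    string FnId, string AddrRange,
    string OpcodeHashBefore, string OpcodeHashAfter,
    string SemanticHashBefore, string SemanticHashAfter,
    FunctionDelta Delta);
public sealed record EvidenceSummary(int Unchanged, int Changed, int Added, int Removed);
public sealed record PatchEvidence(
    string Type, IReadOnlyList<EvidenceSubject> Subject, EvidenceTool Tool, string Arch,
    IReadOnlyList<FunctionRecord> Functions, EvidenceSummary Summary);

public static class EvidenceJson
{
    // Camel-case policy turns FnId into "fnId", B2r2 into "b2r2", and so on.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        WriteIndented = false,
    };

    public static string Serialize(PatchEvidence evidence) =>
        JsonSerializer.Serialize(evidence, Options);

    public static PatchEvidence Deserialize(string json) =>
        JsonSerializer.Deserialize<PatchEvidence>(json, Options)
        ?? throw new JsonException("Evidence document was null.");
}
```

If the serialized bytes are what gets signed, the property order and whitespace settings should be pinned alongside the normalization profile, since the signature covers exact bytes.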
# Determinism controls
* Pin **B2R2 version** and **normalization profile**; serialize the profile (passes + order + flags) and include it in evidence.
* Containerize the lifter; record image digest in evidence.
* For randomness (e.g., hash salts), set fixed zeros; set `TZ=UTC`, `LC_ALL=C`, and stable CPU features.
* Replay manifests: list all inputs (file digests, B2R2 commit, profile) so anyone can rerun and reproduce the exact hashes.
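
One way to serialize the profile (passes + order + flags) so that it can be pinned and embedded in evidence is sketched below. The `NormalizationProfile` record, the line-based canonical form, and the `ProfileDigest` helper are all assumptions for illustration, not an existing StellaOps format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical profile shape: ordered passes plus key/value flags.
public sealed record NormalizationProfile(
    string Id,
    IReadOnlyList<string> Passes,
    IReadOnlyDictionary<string, string> Flags);

public static class ProfileDigest
{
    // Passes keep their declared order (order is semantically significant);
    // flags are sorted ordinally so dictionary iteration order cannot leak in.
    public static string Canonicalize(NormalizationProfile profile)
    {
        var sb = new StringBuilder();
        sb.Append("id=").Append(profile.Id).Append('\n');
        foreach (var pass in profile.Passes)
            sb.Append("pass=").Append(pass).Append('\n');
        foreach (var flag in profile.Flags.OrderBy(f => f.Key, StringComparer.Ordinal))
            sb.Append("flag=").Append(flag.Key).Append('=').Append(flag.Value).Append('\n');
        return sb.ToString();
    }

    public static string Compute(NormalizationProfile profile) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(Canonicalize(profile))))
            .ToLowerInvariant();
}
```

Recording this digest in the replay manifest lets a verifier confirm it is rerunning the same pass pipeline before comparing any function hashes.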
# C# integration sketch (.NET 10)
```csharp
// StellaOps.Scanner.Disasm
using System.Threading.Tasks;

public sealed class DisasmService
{
    private readonly IBinarySource _source;     // pulls files + vendor refs
    private readonly IB2R2Host _b2r2;           // thin wrapper over F# via FFI or CLI
    private readonly INormalizer _norm;         // norm-v1 pipeline
    private readonly IEvidenceStore _evidence;

    public DisasmService(IBinarySource source, IB2R2Host b2r2, INormalizer norm, IEvidenceStore evidence)
    {
        _source = source;
        _b2r2 = b2r2;
        _norm = norm;
        _evidence = evidence;
    }

    // `a` is the artifact under analysis (the "after" side); `baseline` is the "before" side of the diff.
    public async Task<DisasmEvidence> AnalyzeAsync(Artifact a, Artifact baseline)
    {
        // Lift both binaries to IR.
        var liftedAfter = await _b2r2.LiftAsync(a.Path, a.Arch);
        var liftedBefore = await _b2r2.LiftAsync(baseline.Path, baseline.Arch);

        // Normalize, then split into per-function slices.
        var fnAfter = _norm.Normalize(liftedAfter).Functions;
        var fnBefore = _norm.Normalize(liftedBefore).Functions;

        // Bucket and DiffBuckets are helpers not shown in this sketch.
        var bucketsAfter = Bucket(fnAfter);
        var bucketsBefore = Bucket(fnBefore);
        var diff = DiffBuckets(bucketsBefore, bucketsAfter);

        var evidence = EvidenceBuilder.Build(a, baseline, diff, _norm.ProfileId, _b2r2.Version);
        await _evidence.PutAsync(evidence); // write + DSSE sign via Attestor
        return evidence;
    }
}
```
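
The `DiffBuckets` step is left undefined in the sketch above. One possible shape for it is shown here, assuming each bucket set is a map from a function key to its hash; the `FnStatus`, `FnDelta`, and `BucketDiff` names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum FnStatus { Unchanged, Changed, Added, Removed }

// Hypothetical per-function delta: hashes are null on the side where the function is absent.
public sealed record FnDelta(string Key, FnStatus Status, string? HashBefore, string? HashAfter);

public static class BucketDiff
{
    public static IReadOnlyList<FnDelta> Diff(
        IReadOnlyDictionary<string, string> before,
        IReadOnlyDictionary<string, string> after)
    {
        var result = new List<FnDelta>();
        // Ordinal sort keeps the output order independent of dictionary internals.
        foreach (var key in before.Keys.Union(after.Keys).OrderBy(k => k, StringComparer.Ordinal))
        {
            bool inBefore = before.TryGetValue(key, out var hashBefore);
            bool inAfter = after.TryGetValue(key, out var hashAfter);
            var status = (inBefore, inAfter) switch
            {
                (true, true) => hashBefore == hashAfter ? FnStatus.Unchanged : FnStatus.Changed,
                (false, true) => FnStatus.Added,
                _ => FnStatus.Removed,
            };
            result.Add(new FnDelta(key, status, inBefore ? hashBefore : null, inAfter ? hashAfter : null));
        }
        return result;
    }
}
```

Counting the resulting statuses gives the `summary` block of the evidence object directly.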
# Normalization profile (norm-v1)
* **Pass order:** CFG build → SSA → const-prop → DCE → register-rename-canon → call/ret stub-canon → PLT/plt.got unwrap → NOP/padding strip → reloc placeholder canon (`IMM_RELOC` tokens) → block reordering freeze (cfg sort).
* **Hash material:** `for block in topo(cfg): emit (opcode, operandKinds, IMM_BUCKETS)`; exclude absolute addrs/symbols.
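
The hash-material rule can be sketched as an incremental SHA-256 over the emitted tuples. The `NormInstr` record and the `|`/`,` separators are assumptions; the sketch also assumes the caller passes blocks already in the frozen CFG order and does not perform the topological sort itself.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical normalized instruction: no absolute addresses or symbol names.
public sealed record NormInstr(
    string Opcode,
    IReadOnlyList<string> OperandKinds,
    IReadOnlyList<string> ImmBuckets);

public static class OpcodeHash
{
    // blocksInCfgOrder must already be in the deterministic order fixed by the profile.
    public static string Compute(IEnumerable<IReadOnlyList<NormInstr>> blocksInCfgOrder)
    {
        using var hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        foreach (var block in blocksInCfgOrder)
        {
            // A block marker keeps block boundaries part of the hashed material.
            hash.AppendData(Encoding.UTF8.GetBytes("block\n"));
            foreach (var instr in block)
            {
                string line = instr.Opcode + "|"
                    + string.Join(",", instr.OperandKinds) + "|"
                    + string.Join(",", instr.ImmBuckets) + "\n";
                hash.AppendData(Encoding.UTF8.GetBytes(line));
            }
        }
        return Convert.ToHexString(hash.GetHashAndReset()).ToLowerInvariant();
    }
}
```

Hashing incrementally avoids materializing the whole normalized stream for large functions.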
# Hash-bucketing details
* **IMM_BUCKETS:** bucket immediates by role: {addr, const, mask, len}. For `addr`, replace with `IMM_RELOC(section, relType)`. For `const`, clamp to ranges (e.g., table sizes).
* **CFG shape hash:** adjacency list over block arity; keeps compiler-noise from breaking determinism.
* **Semantic hash seed:** keccak of (CFG shape hash || value-flow summaries per def-use).
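
As an illustration of role-based immediate bucketing, the sketch below turns an immediate into a stable token. The four roles and the `IMM_RELOC(section, relType)` form come from the list above; the power-of-two ranges for `const` and `len`, the exact-value treatment of `mask`, and the token spellings are assumptions chosen only to make the example concrete.

```csharp
using System;
using System.Numerics;

public enum ImmRole { Addr, Const, Mask, Len }

public static class ImmBucket
{
    // section and relType are only used for the Addr role.
    public static string Bucket(ImmRole role, ulong value, string section = "", string relType = "") =>
        role switch
        {
            // Addresses never contribute their numeric value, only how they relocate.
            ImmRole.Addr => $"IMM_RELOC({section},{relType})",
            ImmRole.Const => $"CONST[{RangeLabel(value)}]",
            ImmRole.Mask => $"MASK[0x{value:x}]",
            ImmRole.Len => $"LEN[{RangeLabel(value)}]",
            _ => throw new ArgumentOutOfRangeException(nameof(role)),
        };

    // Assumed bucketing: 0, 1, 2-3, 4-7, 8-15, ... (one bucket per power of two).
    private static string RangeLabel(ulong value)
    {
        if (value == 0) return "0";
        int log2 = BitOperations.Log2(value);
        ulong low = 1UL << log2;
        ulong high = log2 == 63 ? ulong.MaxValue : (low << 1) - 1;
        return low == high ? low.ToString() : $"{low}-{high}";
    }
}
```

Coarse ranges trade sensitivity for stability: a change that stays inside one bucket is invisible to the opcode hash, so the bucket widths belong in the versioned profile.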
# VEX Resolver hookup
* Extend rule language: `requires(fnId in {"EVP_DigestVerifyFinal", ...} && delta.immDiff.any == true)` → verdict `not_affected` with `justification="code_not_present_or_not_reachable"` and `impactStatement="Patched verification path altered constants"`.
* If some required fix-sites unchanged → `affected=true` with `actionStatement="Patched binary mismatch: function(s) unchanged"`, priority ↑.
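
A simplified evaluator for the two rules above might look like this. It models only the "all listed fix-sites changed" condition and ignores the `delta.immDiff` predicate; `FixSiteRule`, `Verdict`, and `VexRule` are hypothetical names, while the status and justification strings are copied from the rules as written.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule: the functions that must differ from the baseline for a CVE to count as fixed.
public sealed record FixSiteRule(string CveId, IReadOnlyList<string> RequiredFunctions);

public sealed record Verdict(string Status, string? Justification, string Statement);

public static class VexRule
{
    public static Verdict Evaluate(FixSiteRule rule, IReadOnlySet<string> changedFunctions)
    {
        // An empty rule would be vacuously satisfied, so reject it explicitly.
        if (rule.RequiredFunctions.Count == 0)
            throw new ArgumentException("Rule must list at least one fix-site.", nameof(rule));

        var unchanged = rule.RequiredFunctions
            .Where(fn => !changedFunctions.Contains(fn))
            .ToList();

        if (unchanged.Count == 0)
            return new Verdict(
                "not_affected",
                "code_not_present_or_not_reachable",
                "Patched verification path altered constants");

        return new Verdict(
            "affected",
            null,
            "Patched binary mismatch: function(s) unchanged: " + string.Join(", ", unchanged));
    }
}
```

Naming the unchanged functions in the statement gives whoever triages the verdict a direct pointer to the fix-sites that failed the check.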
# Golden set + backports
* Maintain per-distro **golden patched refs** (BuildID pinned). If the vendor publishes only a source patch, build once with a fixed toolchain profile to derive reference hashes.
* Backports: You'll often see *different* opcode deltas with the *same* semantic intent, so treat evidence as **policy-mappable**: define acceptable delta patterns (e.g., bounds-check added) and store them as **“semantic signatures”**.
# CLI user journey (StellaOps standard CLI)
```
stella scan disasm \
--pkg openssl --file /usr/lib/x86_64-linux-gnu/libssl.so.1.1 \
--baseline @golden:debian-12/libssl.so.1.1 \
--out evidence.json --attest
```
* Output: DSSE-signed evidence; `stella vex resolve` then pulls it and updates the VEX verdicts.
# Minimal MVP (2 sprints)
**Sprint A (MVP)**
* B2R2 host + norm-v1 for x86_64, aarch64 (ELF).
* Function bucketing + opcode hash; per-function delta; DSSE evidence.
* VEX rule: “all listed fix-sites changed → not_affected”.
**Sprint B**
* Semantic hash; IMM bucketing; PLT/reloc canon; UI diff viewer in Timeline.
* Golden-set builder & cache; distro backport adapters (Debian, RHEL, Alpine, SUSE, Astra).
# Risks & guardrails
* Stripped binaries: OK (IR still works). PIE/ASLR: neutralized via reloc canon. LTO/inlining: mitigate with CFG shape + semantic hash (not symbol names).
* False positives: keep “changed-but-harmless” patterns whitelisted via semantic signatures (policy-versioned).
* Performance: cache lifted IR by `(digest, arch, profile)`; parallelize per function.
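
The caching guardrail could be sketched as a thread-safe map keyed on `(digest, arch, profile)`. `LiftKey` and `LiftCache<TIr>` are hypothetical names, and the sketch is in-memory only; a persistent cache would need the same key.

```csharp
using System;
using System.Collections.Concurrent;

// The cache key named in the guardrail: file digest, architecture, normalization profile.
public readonly record struct LiftKey(string Digest, string Arch, string Profile);

public sealed class LiftCache<TIr>
{
    // Lazy<T> ensures the expensive lift runs at most once per key,
    // even when several workers request the same binary concurrently.
    private readonly ConcurrentDictionary<LiftKey, Lazy<TIr>> _entries = new();

    public TIr GetOrLift(LiftKey key, Func<LiftKey, TIr> lift) =>
        _entries.GetOrAdd(key, k => new Lazy<TIr>(() => lift(k))).Value;
}
```

Including the profile in the key matters because a profile change alters the normalized IR even when the input file is byte-identical.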
If you want, I can draft the **norm-v1** pass list as a concrete F# pipeline for B2R2 and a **.proto/JSON-Schema** for `stella.disasm.patch-evidence@1`, ready to drop into `scanner.webservice`.