# Smart-Diff Technical Reference

**Source Advisories**:
- 09-Dec-2025 - Smart‑Diff and Provenance‑Rich Binaries
- 12-Dec-2025 - Smart‑Diff Detects Meaningful Risk Shifts
- 13-Dec-2025 - Smart‑Diff - Defining Meaningful Risk Change
- 05-Dec-2025 - Design Notes on Smart‑Diff and Call‑Stack Analysis

**Last Updated**: 2025-12-14

---

## 1. SMART-DIFF PREDICATE SCHEMA

```json
{
  "predicateType": "stellaops.dev/predicates/smart-diff@v1",
  "predicate": {
    "baseImage": {"name":"...", "digest":"sha256:..."},
    "targetImage": {"name":"...", "digest":"sha256:..."},
    "diff": {
      "filesAdded": [...],
      "filesRemoved": [...],
      "filesChanged": [{"path":"...", "hunks":[...]}],
      "packagesChanged": [{"name":"openssl","from":"1.1.1u","to":"3.0.14"}]
    },
    "context": {
      "entrypoint":["/app/start"],
      "env":{"FEATURE_X":"true"},
      "user":{"uid":1001,"caps":["NET_BIND_SERVICE"]}
    },
    "reachabilityGate": {"reachable":true,"configActivated":true,"runningUser":false,"class":6},
    "scanner": {"name":"StellaOps.Scanner","version":"...","ruleset":"reachability-2025.12"}
  }
}
```

## 2. REACHABILITY GATE (3-BIT SEVERITY)

**Data Model:**

```csharp
public sealed record ReachabilityGate(
    bool? Reachable,        // true / false / null for unknown
    bool? ConfigActivated,
    bool? RunningUser,
    int Class,              // 0..7 derived from the bits when all known
    string Rationale        // short explanation, human-readable
);
```

**Class Computation:** 0-7 based on 3 binary gates (reachable, config-activated, running user)

**Unknown Handling:**
- Never silently treat `null` as `false` or `true`
- If any bit is `null`, set `Class = -1` or compute from known bits only

## 3. DELTA DATA STRUCTURES

```csharp
// Delta.Packages  { added[], removed[], changed[{name, fromVer, toVer}] }
// Delta.Layers    { changed[{path, fromHash, toHash, licenseDelta}] }
// Delta.Functions { added[], removed[], changed[{symbol, file, signatureHashFrom, signatureHashTo}] }
// PatchDelta      { addedSymbols[], removedSymbols[], changedSignatures[] }
```
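
The class computation from Section 2 can be sketched as a small bit-packing function. The Go sketch below is illustrative only: `Gate`, `Class`, `UnknownClass`, and `ptr` are hypothetical names, and the bit weights (reachable = 4, config-activated = 2, running user = 1) are an assumption inferred from the Section 1 example, where `true, true, false` carries `"class": 6`. For unknown bits it takes the `Class = -1` option rather than computing from the known bits only.

```go
package main

import "fmt"

// Gate mirrors the three tri-state bits of the reachability gate.
// A nil pointer stands for "unknown" (hypothetical Go mapping of bool?).
type Gate struct {
	Reachable       *bool
	ConfigActivated *bool
	RunningUser     *bool
}

// UnknownClass is the sentinel returned when any bit is unknown.
const UnknownClass = -1

// Class packs the three bits into 0..7, most significant bit first.
// Assumed weights: reachable=4, configActivated=2, runningUser=1.
// A nil bit is never coerced to true or false; it yields UnknownClass.
func Class(g Gate) int {
	class := 0
	for _, b := range []*bool{g.Reachable, g.ConfigActivated, g.RunningUser} {
		if b == nil {
			return UnknownClass
		}
		class <<= 1
		if *b {
			class |= 1
		}
	}
	return class
}

// ptr is a small helper for building tri-state values.
func ptr(b bool) *bool { return &b }

func main() {
	g := Gate{Reachable: ptr(true), ConfigActivated: ptr(true), RunningUser: ptr(false)}
	fmt.Println(Class(g)) // 6

	g.RunningUser = nil // the bit becomes unknown
	fmt.Println(Class(g)) // -1
}
```

Returning a sentinel keeps an unknown bit visible to downstream policy code, which then decides how to treat it instead of inheriting a silent default.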
## 4. SMART-DIFF ALGORITHMS

**Core Diff Computation:**

```pseudo
prev = load_snapshot(t-1)
curr = load_snapshot(t)

Δ.pkg    = diff_packages(prev.lock, curr.lock)
Δ.layers = diff_layers(prev.sbom, curr.sbom)
Δ.funcs  = diff_cfg(prev.cfgIndex, curr.cfgIndex)

scope = union(
  impact_of(Δ.pkg.changed),
  impact_of_files(Δ.layers.changed),
  reachability_of(Δ.funcs.changed)
)

for f in scope.functions:
  rescore(f)

for v in impacted_vulns(scope):
  annotate(v, patch_delta(Δ))
  link_evidence(v, dsse_attestation(), proof_links())

for v in previously_flagged where vulnerable_apis_now_absent(v, curr):
  emit_vex_candidate(v, status="not_affected", rationale="API not present", evidence=proof_links())
```

## 5. MATERIAL RISK CHANGE DETECTION RULES

**FindingKey:**

```
FindingKey = (component_purl, component_version, cve_id)
```

**RiskState Fields:**
- `reachable: bool | unknown`
- `vex_status: enum` (AFFECTED | NOT_AFFECTED | FIXED | UNDER_INVESTIGATION | UNKNOWN)
- `in_affected_range: bool | unknown`
- `kev: bool`
- `epss_score: float | null`
- `policy_flags: set`
- `evidence_links: list`

**Rule R1: Reachability Flip**
- `reachable` changes: `false → true` (risk ↑) or `true → false` (risk ↓)

**Rule R2: VEX Status Flip**
- Meaningful changes: `AFFECTED ↔ NOT_AFFECTED`, `UNDER_INVESTIGATION → NOT_AFFECTED`

**Rule R3: Affected Range Boundary**
- `in_affected_range` flips: `false → true` or `true → false`

**Rule R4: Intelligence/Policy Flip**
- `kev` changes `false → true`
- `epss_score` crosses configured threshold
- `policy_flag` changes severity (warn → block)

## 6. SUPPRESSION RULES

**Suppression Conditions (ALL must apply):**
1. `reachable == false`
2. `vex_status == NOT_AFFECTED`
3. `kev == false`
4. No policy override

**Patch Churn Suppression:**
- If version changes AND `in_affected_range` remains false in both AND no KEV/policy flip → suppress
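
Rules R1 to R4 and the four suppression conditions can be read as pure functions over two `RiskState` snapshots. The Go sketch below is one possible encoding, not the reference implementation: `Tri`, `RiskState`, `MaterialChanges`, `Suppressed`, and the `PolicyDecision` field are hypothetical names. It also assumes that transitions to or from `unknown` do not count as flips, and that an EPSS crossing counts in either direction.

```go
package main

import "fmt"

// Tri is a three-valued boolean; Unknown is its own state, never coerced.
type Tri int

const (
	Unknown Tri = iota
	False
	True
)

// RiskState holds the subset of Section 5 fields that the rules read.
type RiskState struct {
	Reachable       Tri
	VexStatus       string // "AFFECTED", "NOT_AFFECTED", "UNDER_INVESTIGATION", ...
	InAffectedRange Tri
	Kev             bool
	EpssScore       *float64 // nil when no score is available
	PolicyDecision  string   // hypothetical: "", "WARN" or "BLOCK"
}

// flipped reports a known-to-known change (assumption: Unknown never flips).
func flipped(a, b Tri) bool { return a != Unknown && b != Unknown && a != b }

// vexFlip encodes the transitions listed under R2.
func vexFlip(from, to string) bool {
	return (from == "AFFECTED" && to == "NOT_AFFECTED") ||
		(from == "NOT_AFFECTED" && to == "AFFECTED") ||
		(from == "UNDER_INVESTIGATION" && to == "NOT_AFFECTED")
}

// crossed reports whether the score moved across the threshold in either direction.
func crossed(prev, curr *float64, threshold float64) bool {
	if prev == nil || curr == nil {
		return false
	}
	return (*prev < threshold) != (*curr < threshold)
}

// MaterialChanges returns the IDs of the rules that fire between prev and curr.
func MaterialChanges(prev, curr RiskState, epssThreshold float64) []string {
	var fired []string
	if flipped(prev.Reachable, curr.Reachable) {
		fired = append(fired, "R1")
	}
	if vexFlip(prev.VexStatus, curr.VexStatus) {
		fired = append(fired, "R2")
	}
	if flipped(prev.InAffectedRange, curr.InAffectedRange) {
		fired = append(fired, "R3")
	}
	if (!prev.Kev && curr.Kev) ||
		crossed(prev.EpssScore, curr.EpssScore, epssThreshold) ||
		(prev.PolicyDecision == "WARN" && curr.PolicyDecision == "BLOCK") {
		fired = append(fired, "R4")
	}
	return fired
}

// Suppressed applies the four Section 6 conditions; all of them must hold.
func Suppressed(s RiskState, policyOverride bool) bool {
	return s.Reachable == False && s.VexStatus == "NOT_AFFECTED" && !s.Kev && !policyOverride
}

func main() {
	prev := RiskState{Reachable: False, VexStatus: "NOT_AFFECTED"}
	curr := RiskState{Reachable: True, VexStatus: "AFFECTED", Kev: true}
	fmt.Println(MaterialChanges(prev, curr, 0.5)) // [R1 R2 R4]
	fmt.Println(Suppressed(prev, false))          // true
	fmt.Println(Suppressed(curr, false))          // false
}
```

Because the rules are side-effect free, each one can be unit-tested in isolation and the list of fired rule IDs can be attached to the finding as its change reason.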
## 7. CALL-STACK ANALYSIS

**C# Roslyn Skeleton:**

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.FindSymbols;
using Microsoft.CodeAnalysis.MSBuild;

public static class SmartDiff
{
    // Returns "entrypoint -> sink" pairs.
    // Skeleton only: a pair is recorded whenever the sink is referenced anywhere
    // in the solution; no call path from the entrypoint is verified yet.
    public static async Task<HashSet<string>> ReachableSinks(string solutionPath, string[] entrypoints, string[] sinks)
    {
        // Note: MSBuildLocator.RegisterDefaults() (Microsoft.Build.Locator) normally
        // has to run once before the workspace is created.
        var workspace = MSBuildWorkspace.Create();
        var solution = await workspace.OpenSolutionAsync(solutionPath);
        var index = new HashSet<string>();

        foreach (var proj in solution.Projects)
        {
            var comp = await proj.GetCompilationAsync();
            if (comp is null) continue;

            // Resolve entrypoint and sink names to method symbols.
            var epSymbols = comp.GlobalNamespace.GetMembers().SelectMany(Descend)
                .OfType<IMethodSymbol>().Where(m => entrypoints.Contains(m.ToDisplayString())).ToList();
            var sinkSymbols = comp.GlobalNamespace.GetMembers().SelectMany(Descend)
                .OfType<IMethodSymbol>().Where(m => sinks.Contains(m.ToDisplayString())).ToList();

            foreach (var ep in epSymbols)
            foreach (var sink in sinkSymbols)
            {
                var refs = await SymbolFinder.FindReferencesAsync(sink, solution);
                if (refs.SelectMany(r => r.Locations).Any())
                    index.Add($"{ep.ToDisplayString()} -> {sink.ToDisplayString()}");
            }
        }
        return index;

        // Recursively yields every member below a namespace or type.
        static IEnumerable<ISymbol> Descend(INamespaceOrTypeSymbol sym)
        {
            foreach (var m in sym.GetMembers())
            {
                yield return m;
                if (m is INamespaceOrTypeSymbol nt)
                    foreach (var x in Descend(nt))
                        yield return x;
            }
        }
    }
}
```

**Go SSA Skeleton:**

```go
package main

import (
	"fmt"
	"log"

	"golang.org/x/tools/go/callgraph/cha"
	"golang.org/x/tools/go/packages"
	"golang.org/x/tools/go/ssa"
	"golang.org/x/tools/go/ssa/ssautil"
)

func main() {
	cfg := &packages.Config{Mode: packages.LoadAllSyntax, Tests: false}
	pkgs, err := packages.Load(cfg, "./...")
	if err != nil {
		log.Fatal(err)
	}
	if packages.PrintErrors(pkgs) > 0 {
		log.Fatal("packages contain errors")
	}

	// Build SSA for the loaded packages and all of their dependencies.
	prog, _ := ssautil.AllPackages(pkgs, ssa.BuilderMode(0))
	prog.Build()

	// Class Hierarchy Analysis: a fast, conservative call graph.
	cg := cha.CallGraph(prog)
	fmt.Println("nodes:", len(cg.Nodes))
}
```

## 8. SINK TAXONOMY

```yaml
sinks:
  - CMD_EXEC
  - UNSAFE_DESER
  - SQL_RAW
  - SSRF
  - FILE_WRITE
  - PATH_TRAVERSAL
  - TEMPLATE_INJECTION
  - CRYPTO_WEAK
  - AUTHZ_BYPASS
```
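
Neither skeleton in Section 7 yet answers whether a sink is reachable from a given entrypoint: the Roslyn one only checks that the sink is referenced somewhere, and the Go one only counts call-graph nodes. Once call edges are available, that check reduces to a graph search. The sketch below shows a breadth-first search over a plain adjacency map. `CallGraph` and `PathToSink` are hypothetical names, and the sample symbols are invented for illustration.

```go
package main

import "fmt"

// CallGraph maps a caller to its callees. The keys are hypothetical symbol names;
// a real index would be filled from Roslyn or go/callgraph output.
type CallGraph map[string][]string

// PathToSink returns the shortest call path from entry to any sink,
// or nil when no sink is reachable.
func PathToSink(g CallGraph, entry string, sinks map[string]bool) []string {
	parent := map[string]string{}
	visited := map[string]bool{entry: true}
	queue := []string{entry}

	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]

		if sinks[cur] {
			// Walk the parent links back to the entrypoint to rebuild the path.
			path := []string{cur}
			for cur != entry {
				cur = parent[cur]
				path = append([]string{cur}, path...)
			}
			return path
		}
		for _, next := range g[cur] {
			if !visited[next] {
				visited[next] = true
				parent[next] = cur
				queue = append(queue, next)
			}
		}
	}
	return nil
}

func main() {
	g := CallGraph{
		"Main":         {"HandleUpload", "Health"},
		"HandleUpload": {"SaveFile"},
		"SaveFile":     {"os.WriteFile"},
	}
	sinks := map[string]bool{"os.WriteFile": true} // would be tagged FILE_WRITE

	fmt.Println(PathToSink(g, "Main", sinks))   // [Main HandleUpload SaveFile os.WriteFile]
	fmt.Println(PathToSink(g, "Health", sinks)) // []
}
```

Returning the path rather than a boolean gives the finding a concrete call stack that can be stored as evidence alongside the sink category.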
## 9. POLICY SCORING FORMULA

**Priority Score:**

```
score =
  + 1000 if new.kev
  + 500 if new.reachable
  + 200 if reason includes RANGE_FLIP to affected
  + 150 if VEX_FLIP to AFFECTED
  + 0..100 based on EPSS (epss * 100)
  + policy weight: +300 if decision BLOCK, +100 if WARN
```

## 10. PROVENANCE-RICH BINARIES (BINARY SCA + PROVENANCE + SARIF)

Smart-Diff becomes materially stronger when it can reason about *binary-level* deltas (symbols/sections/hardening), not only package versions.

Required extractors (deterministic):
- ELF/PE/Mach-O headers, sections, imports/exports, build-id, rpaths
- Symbol tables (public + demangled), string tables, debug info pointers (DWARF/PDB when present)
- Compiler/linker fingerprints (e.g., `.comment`, PE version info, toolchain IDs)
- Per-section and per-function rolling hashes (stable across identical bytes)
- Optional: Bloom filter for symbol presence proofs (binary digest + filter digest)

Provenance capture (per binary):
- Compiler name/version, target triple, LTO mode, linker name/version
- Hardening flags (PIE/RELRO/CFGuard/CET/FORTIFY, stack protector)
- Link inputs (libraries + order) and build materials (git commit, dependency lock digests)

Attestation output:
- Emit a DSSE-wrapped in-toto statement per binary (SLSA provenance compatible) with subject = binary sha256.

CI output (developer-facing):
- Emit SARIF 2.1.0 (`tool`: `StellaOps.BinarySCA`) so binary findings and hardening regressions can surface in code scanning.
- Each SARIF result references the binary digest, symbol/section, and the attestation digest(s) needed to verify the claim.

Smart-Diff linkage rule:
- When a binary changes, map file delta → binary digest delta → symbol delta → impacted sinks/vulns, then re-score only the impacted scope.

---

**Document Version**: 1.0
**Target Platform**: .NET 10, PostgreSQL ≥16, Angular v17
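
As a worked instance of the Section 9 priority score, the Go sketch below sums the listed terms for one finding. The `Finding` type and `PriorityScore` function are hypothetical names. Clamping EPSS to `[0, 1]` and rounding `epss * 100` to the nearest integer are assumptions; the formula itself only states a 0..100 contribution.

```go
package main

import (
	"fmt"
	"math"
)

// Finding carries the inputs of the priority score (hypothetical shape).
type Finding struct {
	Kev                 bool
	Reachable           bool
	RangeFlipToAffected bool     // reason includes RANGE_FLIP to affected
	VexFlipToAffected   bool     // VEX_FLIP to AFFECTED
	Epss                *float64 // nil when no EPSS score is available
	Decision            string   // "BLOCK", "WARN", or anything else for no weight
}

// PriorityScore adds up the weighted terms of the Section 9 formula.
func PriorityScore(f Finding) int {
	score := 0
	if f.Kev {
		score += 1000
	}
	if f.Reachable {
		score += 500
	}
	if f.RangeFlipToAffected {
		score += 200
	}
	if f.VexFlipToAffected {
		score += 150
	}
	if f.Epss != nil {
		// Assumption: clamp to [0, 1], then round epss*100 to an integer.
		e := math.Max(0, math.Min(1, *f.Epss))
		score += int(math.Round(e * 100))
	}
	switch f.Decision {
	case "BLOCK":
		score += 300
	case "WARN":
		score += 100
	}
	return score
}

func main() {
	epss := 0.42
	f := Finding{Kev: true, Reachable: true, Epss: &epss, Decision: "BLOCK"}
	// 1000 (KEV) + 500 (reachable) + 42 (EPSS) + 300 (BLOCK)
	fmt.Println(PriorityScore(f)) // 1842
}
```

With these weights a KEV-listed finding always outranks one without KEV, since the remaining terms add up to at most 1250 before the KEV bonus and a KEV finding starts at 1000 plus whatever else applies.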