Here’s a compact, practical way to add two high‑leverage capabilities to your scanner: **DSSE‑signed path witnesses** and **Smart‑Diff × Reachability**—what they are, why they matter, and exactly how to implement them in Stella Ops without ceremony.

---

# 1) DSSE‑signed path witnesses (entrypoint → calls → sink)

**What it is (in plain terms):** When you flag a CVE as “reachable,” also emit a tiny, human‑readable proof: the **exact path** from a real entrypoint (e.g., HTTP route, CLI verb, cron) through functions/methods to the **vulnerable sink**. Wrap that proof in a **DSSE** envelope and sign it. Anyone can verify the witness later—offline—without rerunning analysis.

**Why it matters:**

* Turns red flags into **auditable evidence** (quiet‑by‑design).
* Lets CI/CD, auditors, and customers **verify** findings independently.
* Enables **deterministic replay** and provenance chains (ties nicely to in‑toto/SLSA).

**Minimal JSON witness (stable, vendor‑neutral):**

```json
{
  "witness_schema": "stellaops.witness.v1",
  "artifact": {
    "sbom_digest": "sha256:...",
    "component_purl": "pkg:nuget/Example@1.2.3"
  },
  "vuln": { "id": "CVE-2024-XXXX", "source": "NVD", "range": "≤1.2.3" },
  "entrypoint": { "kind": "http", "name": "GET /billing/pay" },
  "path": [
    {"symbol": "BillingController.Pay()", "file": "BillingController.cs", "line": 42},
    {"symbol": "PaymentsService.Authorize()", "file": "PaymentsService.cs", "line": 88},
    {"symbol": "LibXYZ.Parser.Parse()", "file": "Parser.cs", "line": 17}
  ],
  "sink": { "symbol": "LibXYZ.Parser.Parse()", "type": "deserialization" },
  "evidence": {
    "callgraph_digest": "sha256:...",
    "build_id": "dotnet:RID:linux-x64:sha256:...",
    "analysis_config_digest": "sha256:..."
  },
  "observed_at": "2025-12-18T00:00:00Z"
}
```

**Wrap in DSSE (payloadType & payload are required):**

```json
{
  "payloadType": "application/vnd.stellaops.witness+json",
  "payload": "base64(JSON_above)",
  "signatures": [{ "keyid": "attestor-stellaops-ed25519", "sig": "base64(...)" }]
}
```

**.NET signing sketch (Ed25519):** shown here with NSec.Cryptography since the BCL doesn’t expose an Ed25519 primitive directly; swap in your Ed25519 provider of choice. Note that DSSE signs the pre‑authentication encoding (PAE), not the raw payload.

```csharp
using System.Linq;
using System.Text;
using System.Text.Json;
using NSec.Cryptography; // assumption: NSec supplies the Ed25519 primitive

var payloadBytes = JsonSerializer.SerializeToUtf8Bytes(witnessJsonObj);
const string payloadType = "application/vnd.stellaops.witness+json";

var dsse = new {
  payloadType,
  payload = Convert.ToBase64String(payloadBytes),
  signatures = new [] {
    // Per the DSSE spec, the signature covers PAE(payloadType, payload), not the bare payload.
    new { keyid = keyId, sig = Convert.ToBase64String(Sign(Pae(payloadType, payloadBytes), privateKey)) }
  }
};

byte[] Sign(byte[] data, byte[] privateKey) {
  var alg = SignatureAlgorithm.Ed25519;
  using var key = Key.Import(alg, privateKey, KeyBlobFormat.RawPrivateKey);
  return alg.Sign(key, data);
}

// DSSE pre-authentication encoding: "DSSEv1 <len(type)> <type> <len(body)> <body>"
byte[] Pae(string type, byte[] body) {
  var header = Encoding.UTF8.GetBytes($"DSSEv1 {Encoding.UTF8.GetByteCount(type)} {type} {body.Length} ");
  return header.Concat(body).ToArray();
}
```

**Where to emit:**

* **Scanner.Worker**: after reachability confirms `reachable=true`, emit witness → **Attestor** signs → **Authority** stores (Postgres) → optional Rekor‑style mirror.
* Expose `/witness/{findingId}` for download & independent verification.

---

# 2) Smart‑Diff × Reachability (incremental, low‑noise updates)

**What it is:** On **SBOM/VEX/dependency** deltas, don’t rescan everything. Update only **affected regions** of the call graph and recompute reachability **just for changed nodes/edges**.

**Why it matters:**

* **Order‑of‑magnitude faster** incremental scans.
* Fewer flaky diffs; triage stays focused on **meaningful risk change**.
* Perfect for PR gating: “what changed” → “what became reachable/unreachable.”

**Core idea (graph‑reachability):**

* Maintain a per‑service **call graph** `G = (V, E)` with **entrypoint set** `S`.
* On diff: compute changed nodes/edges ΔV/ΔE.
* Run **incremental BFS/DFS** from impacted nodes to sinks (forward or backward), reusing memoized results.
* Recompute only **frontiers** touched by Δ.

**Minimal tables (Postgres):**

```sql
-- Nodes (functions/methods)
CREATE TABLE cg_nodes(
  id BIGSERIAL PRIMARY KEY,
  service TEXT,
  symbol TEXT,
  file TEXT,
  line INT,
  hash TEXT,
  UNIQUE(service, hash)
);

-- Edges (calls)
CREATE TABLE cg_edges(
  src BIGINT REFERENCES cg_nodes(id),
  dst BIGINT REFERENCES cg_nodes(id),
  kind TEXT,
  PRIMARY KEY(src, dst)
);

-- Entrypoints & Sinks
CREATE TABLE cg_entrypoints(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY);
CREATE TABLE cg_sinks(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY, sink_type TEXT);

-- Memoized reachability cache
CREATE TABLE cg_reach_cache(
  entry_id BIGINT,
  sink_id BIGINT,
  path JSONB,
  reachable BOOLEAN,
  updated_at TIMESTAMPTZ,
  PRIMARY KEY(entry_id, sink_id)
);
```

**Incremental algorithm (pseudocode):**

```text
Input: ΔSBOM, ΔDeps, ΔCode → ΔNodes, ΔEdges
1) Apply Δ to cg_nodes/cg_edges
2) ImpactSet = neighbors(ΔNodes ∪ endpoints(ΔEdges))
3) For each e ∈ Entrypoints ∩ ancestors(ImpactSet):
     Recompute forward search to affected sinks, stop early on unchanged subgraphs
     Update cg_reach_cache; if state flips, emit new/updated DSSE witness
```

**.NET 10 reachability sketch (fast & local):**

```csharp
HashSet<long> ImpactSet = ComputeImpact(deltaNodes, deltaEdges); // node ids touched by the delta

foreach (var e in Intersect(Entrypoints, Ancestors(ImpactSet)))
{
    var res = BoundedReach(e, affectedSinks, graph, cache);
    foreach (var r in res.Changed)
    {
        cache.Upsert(e, r.Sink, r.Path, r.Reachable);
        if (r.Reachable) EmitDsseWitness(e, r.Sink, r.Path);
    }
}
```

**CI/PR flow:**

1. Build → SBOM diff → Dependency diff → Call‑graph delta.
2. Run incremental reachability.
3. If any `unreachable→reachable` transitions: **fail gate**, attach DSSE witnesses.
4. If `reachable→unreachable`: auto‑close prior findings (and archive prior witness).

---

# UX hooks (quick wins)

* In findings list, add a **“Show Witness”** button → modal renders the signed path (entrypoint→…→sink) + **“Verify Signature”** one‑click.
* In PR checks, summarize only **state flips** with tiny links: “+2 reachable (view witness)” / “−1 (now unreachable)”.

---

# Minimal tasks to get this live

* **Scanner.Worker**: build call‑graph extraction (per language), add incremental graph store, reachability cache.
* **Attestor**: DSSE signing endpoint + key management (Ed25519 by default; PQC mode later).
* **Authority**: tables above + witness storage + retrieval API.
* **Router/CI plugin**: PR annotation with **state flips** and links to witnesses.
* **UI**: witness modal + signature verify.

If you want, I can draft the exact Postgres migrations, the C# repositories, and a tiny verifier CLI that checks DSSE signatures and prints the call path.

---

Below is a concrete, buildable blueprint for an **advanced reachability analysis engine** inside Stella Ops. I’m going to assume your “Stella Ops” components are roughly:

* **Scanner.Worker**: runs analyses in CI / on artifacts
* **Authority**: stores graphs/findings/witnesses
* **Attestor**: signs DSSE envelopes (Ed25519)
* (optional) **SurfaceBuilder**: background worker that computes “vuln surfaces” for packages

The key advance is: **don’t treat a CVE as “a package”**. Treat it as a **set of trigger methods** (public API) that can reach the vulnerable code inside the dependency—computed by “Smart‑Diff” once, reused everywhere.
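Before the contract and schema below, here’s a minimal sketch of what a “vuln surface” looks like in memory and the single question the per‑service scan reduces to. The type and member names are illustrative only; the persisted form is the `vuln_surfaces` / `vuln_surface_*` tables in §2.2.

```csharp
// Illustrative only: in-memory shape of a vuln surface (the persisted form lives in §2.2).
public sealed record VulnSurface(
    string CveId,
    string Package,                               // e.g. "LibXYZ"
    IReadOnlySet<string> SinkMethodKeys,          // methods changed between vulnerable and fixed versions
    IReadOnlySet<string> TriggerMethodKeys);      // public APIs that can reach those sinks inside the library

public static class SurfaceCheck
{
    // The per-service question: can any entrypoint reach any trigger method?
    public static bool IsReachable(VulnSurface surface, IReadOnlySet<string> reachableFromEntrypoints)
        => surface.TriggerMethodKeys.Overlaps(reachableFromEntrypoints);
}
```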
--- ## 0) Define the contract (precision/soundness) up front If you don’t write this down, you’ll fight false positives/negatives forever. ### What Stella Ops will guarantee (first release) * **Whole-program static call graph** (app + selected dependency assemblies) * **Context-insensitive** (fast), **path witness** extracted (shortest path) * **Dynamic dispatch handled** with CHA/RTA (+ DI hints), with explicit uncertainty flags * **Reflection handled best-effort** (constant-string resolution), otherwise “unknown edge” ### What it will NOT guarantee (first release) * Perfect handling of reflection / `dynamic` / runtime codegen * Perfect delegate/event resolution across complex flows * Full taint/dataflow reachability (you can add later) This is fine. The major value is: “**we can show you the call path**” and “**we can prove the vuln is triggered by calling these library APIs**”. --- ## 1) The big idea: “Vuln surfaces” (Smart-Diff → triggers) ### Problem CVE feeds typically say “package X version range Y is vulnerable” but rarely say *which methods*. If you only do package-level reachability, noise is huge. ### Solution For each CVE+package, compute a **vulnerability surface**: * **Candidate sinks** = methods changed between vulnerable and fixed versions (diff at IL level) * **Trigger methods** = *public/exported* methods in the vulnerable version that can reach those changed methods internally Then your service scan becomes: > “Can any entrypoint reach any trigger method?” This is both faster and more precise. --- ## 2) Data model (Authority / Postgres) You already had call graph tables; here’s a concrete schema that supports: * graph snapshots * incremental updates * vuln surfaces * reachability cache * DSSE witnesses ### 2.1 Graph tables ```sql CREATE TABLE cg_snapshots ( snapshot_id BIGSERIAL PRIMARY KEY, service TEXT NOT NULL, build_id TEXT NOT NULL, graph_digest TEXT NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE(service, build_id) ); CREATE TABLE cg_nodes ( node_id BIGSERIAL PRIMARY KEY, snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE, method_key TEXT NOT NULL, -- stable key (see below) asm_name TEXT, type_name TEXT, method_name TEXT, file_path TEXT, line_start INT, il_hash TEXT, -- normalized IL hash for diffing flags INT NOT NULL DEFAULT 0, -- bitflags: has_reflection, compiler_generated, etc. UNIQUE(snapshot_id, method_key) ); CREATE TABLE cg_edges ( snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE, src_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE, dst_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE, kind SMALLINT NOT NULL, -- 0=call,1=newobj,2=dispatch,3=delegate,4=reflection_guess,... PRIMARY KEY(snapshot_id, src_node_id, dst_node_id, kind) ); CREATE TABLE cg_entrypoints ( snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE, node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE, kind TEXT NOT NULL, -- http, grpc, cli, job, etc. name TEXT NOT NULL, -- GET /foo, "Main", etc. 
PRIMARY KEY(snapshot_id, node_id, kind, name) ); ``` ### 2.2 Vuln surface tables (Smart‑Diff artifacts) ```sql CREATE TABLE vuln_surfaces ( surface_id BIGSERIAL PRIMARY KEY, ecosystem TEXT NOT NULL, -- nuget package TEXT NOT NULL, cve_id TEXT NOT NULL, vuln_version TEXT NOT NULL, -- a representative vulnerable version fixed_version TEXT NOT NULL, surface_digest TEXT NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE(ecosystem, package, cve_id, vuln_version, fixed_version) ); CREATE TABLE vuln_surface_sinks ( surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE, sink_method_key TEXT NOT NULL, reason TEXT NOT NULL, -- changed|added|removed|heuristic PRIMARY KEY(surface_id, sink_method_key) ); CREATE TABLE vuln_surface_triggers ( surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE, trigger_method_key TEXT NOT NULL, sink_method_key TEXT NOT NULL, internal_path JSONB, -- optional: library internal witness path PRIMARY KEY(surface_id, trigger_method_key, sink_method_key) ); ``` ### 2.3 Reachability cache & witnesses ```sql CREATE TABLE reach_findings ( finding_id BIGSERIAL PRIMARY KEY, snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE, cve_id TEXT NOT NULL, ecosystem TEXT NOT NULL, package TEXT NOT NULL, package_version TEXT NOT NULL, reachable BOOLEAN NOT NULL, reachable_entrypoints INT NOT NULL DEFAULT 0, updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE(snapshot_id, cve_id, package, package_version) ); CREATE TABLE reach_witnesses ( witness_id BIGSERIAL PRIMARY KEY, finding_id BIGINT REFERENCES reach_findings(finding_id) ON DELETE CASCADE, entry_node_id BIGINT REFERENCES cg_nodes(node_id), dsse_envelope JSONB NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT now() ); ``` --- ## 3) Stable identity: MethodKey + IL hash ### 3.1 MethodKey (must be stable across builds) Use a normalized string like: ``` {AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...) ``` Examples: * `MyApp|BillingController|Pay(System.String)` * `LibXYZ|LibXYZ.Parser|Parse(System.ReadOnlySpan)` ### 3.2 Normalized IL hash (for smart-diff + incremental graph updates) Raw IL bytes aren’t stable (metadata tokens change). Normalize: * opcode names * branch targets by *instruction index*, not offset * method operands by **resolved MethodKey** * string operands by literal or hashed literal * type operands by full name Then hash `SHA256(normalized_bytes)`. --- ## 4) Call graph extraction for .NET (concrete, doable) ### Tooling choice Start with **Mono.Cecil** (MIT license, easy IL traversal). You can later swap to `System.Reflection.Metadata` for speed. ### 4.1 Build process (Scanner.Worker) 1. `dotnet restore` (use your locked restore) 2. `dotnet build -c Release /p:DebugType=portable /p:DebugSymbols=true` 3. 
Collect:

* app assemblies: `bin/Release/**/publish/*.dll` or build output
* `.pdb` files for sequence points (file/line for witnesses)

### 4.2 Cecil loader

```csharp
var rp = new ReaderParameters {
  ReadSymbols = true,
  SymbolReaderProvider = new PortablePdbReaderProvider()
};
var asm = AssemblyDefinition.ReadAssembly(dllPath, rp);
```

### 4.3 Node extraction (methods)

Walk all types, including nested:

```csharp
IEnumerable<TypeDefinition> AllTypes(ModuleDefinition m)
{
    var stack = new Stack<TypeDefinition>(m.Types);
    while (stack.Count > 0)
    {
        var t = stack.Pop();
        yield return t;
        foreach (var nt in t.NestedTypes) stack.Push(nt);
    }
}

foreach (var type in AllTypes(asm.MainModule))
foreach (var method in type.Methods)
{
    var key = MethodKey.From(method); // your normalizer
    var (file, line) = PdbFirstSequencePoint(method);
    var ilHash = method.HasBody ? ILFingerprint(method) : null;
    // store node (method_key, file, line, il_hash, flags...)
}
```

### 4.4 Edge extraction (direct calls)

```csharp
foreach (var method in type.Methods.Where(m => m.HasBody))
{
    var srcKey = MethodKey.From(method);

    foreach (var ins in method.Body.Instructions)
    {
        if (ins.Operand is MethodReference mr)
        {
            if (ins.OpCode.Code is Code.Call or Code.Callvirt or Code.Newobj)
            {
                var dstKey = MethodKey.From(mr); // important: stable even if not resolved
                edges.Add(new Edge(srcKey, dstKey, kind: CallKind.Direct));
            }

            if (ins.OpCode.Code is Code.Ldftn or Code.Ldvirtftn)
            {
                // delegate capture (handle later)
            }
        }
    }
}
```

---

## 5) Advanced precision: dynamic dispatch + DI + async/await

If you stop at direct edges only, you’ll miss many real paths.

### 5.1 Async/await mapping (critical for readable witnesses)

Async methods compile into a state machine `MoveNext()`. You want edges attributed back to the original method.

In Cecil:

* Check `AsyncStateMachineAttribute` on a method
* It references a state machine type
* Find that type’s `MoveNext` method
* Map `MoveNextKey -> OriginalMethodKey`

Then, while extracting edges:

```csharp
srcKey = MoveNextToOriginal.TryGetValue(srcKey, out var original) ? original : srcKey;
```

Do the same for iterator state machines.

### 5.2 Virtual/interface dispatch (CHA/RTA)

You need 2 maps:

1. **type hierarchy / interface impl map**
2. **override map** from “declared method” → “implementation method(s)”

**Build override map**

```csharp
// For each method, Cecil exposes method.Overrides for explicit implementations.
overrideMap[MethodKey.From(overrideRef)] = MethodKey.From(methodDef);
```

**CHA**: for a `callvirt` to virtual method `T.M`, add edges to overrides in derived classes.
**RTA**: restrict to derived classes that are actually instantiated.

How to get instantiated types:

* look for `newobj` instructions and add the created type to `InstantiatedTypes`
* plus DI registrations (below)

### 5.3 DI hints (Microsoft.Extensions.DependencyInjection)

You will see calls like:

* `ServiceCollectionServiceExtensions.AddTransient<TService, TImpl>(...)`

In IL these are generic method calls. Detect and record `TService -> TImpl` as “instantiated”. This massively improves RTA for modern .NET apps.

### 5.4 Delegates/lambdas (good enough approach)

Implement intraprocedural tracking:

* when you see `ldftn SomeMethod` then `newobj Action::.ctor` then `stloc.s X`
* store `delegateTargets[local X] += SomeMethod`
* when you see `ldloc.s X` and later `callvirt Invoke`, add edges to targets

This makes Minimal API entrypoint discovery work too.
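To make §5.4 concrete, here’s a hedged Mono.Cecil sketch of that intraprocedural tracking. The `addEdgeToTarget` callback is a placeholder for your edge store, and a real pass would also handle `stloc.0`–`stloc.3`, fields, and delegate-typed arguments.

```csharp
using System;
using System.Collections.Generic;
using Mono.Cecil;
using Mono.Cecil.Cil;

static void TrackDelegates(MethodDefinition method, Action<MethodReference> addEdgeToTarget)
{
    if (!method.HasBody) return;

    var delegateTargets = new Dictionary<VariableDefinition, List<MethodReference>>();
    MethodReference pendingLdftn = null;        // target captured by the last ldftn/ldvirtftn
    VariableDefinition lastDelegateLocal = null;

    foreach (var ins in method.Body.Instructions)
    {
        if (ins.OpCode.Code is Code.Ldftn or Code.Ldvirtftn && ins.Operand is MethodReference target)
        {
            pendingLdftn = target;              // ldftn SomeMethod (the delegate .ctor newobj that follows is ignored)
        }
        else if (ins.OpCode.Code is Code.Stloc or Code.Stloc_S
                 && ins.Operand is VariableDefinition v && pendingLdftn is not null)
        {
            if (!delegateTargets.TryGetValue(v, out var list))
                delegateTargets[v] = list = new List<MethodReference>();
            list.Add(pendingLdftn);             // delegateTargets[local X] += SomeMethod
            pendingLdftn = null;
        }
        else if (ins.OpCode.Code is Code.Ldloc or Code.Ldloc_S
                 && ins.Operand is VariableDefinition lv && delegateTargets.ContainsKey(lv))
        {
            lastDelegateLocal = lv;             // a known delegate local was pushed on the stack
        }
        else if (ins.OpCode.Code == Code.Callvirt
                 && ins.Operand is MethodReference mr && mr.Name == "Invoke"
                 && lastDelegateLocal is not null)
        {
            foreach (var t in delegateTargets[lastDelegateLocal])
                addEdgeToTarget(t);             // edge: method -> captured target
        }
    }
}
```

This is the same mechanism §6.2 relies on to resolve Minimal API handlers registered via `MapGet`/`MapPost`.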
### 5.5 Reflection (best-effort) Implement only high-signal heuristics: * `typeof(T).GetMethod("Foo")` with constant "Foo" * `GetType().GetMethod("Foo")` with constant "Foo" (type unknown → mark uncertain) If resolved, add edge with `kind=reflection_guess`. If not, set node flag `has_reflection = true` and in results show “may be incomplete”. --- ## 6) Entrypoint detection (concrete detectors) ### 6.1 MVC controllers Detect: * types deriving from `Microsoft.AspNetCore.Mvc.ControllerBase` * methods: * public * not `[NonAction]` * has `[HttpGet]`, `[HttpPost]`, `[Route]` etc. Extract route template from attributes’ ctor arguments. Store in `cg_entrypoints`: * kind = `http` * name = `GET /billing/pay` (compose verb+template) ### 6.2 Minimal APIs Scan `Program.Main` IL: * find calls to `MapGet`, `MapPost`, ... * extract route string from preceding `ldstr` * resolve handler method via delegate tracking (ldftn) Entry: * kind = `http` * name = `GET /foo` ### 6.3 CLI Find assembly entry point method (`asm.EntryPoint`) or `static Main`. Entry: * kind = `cli` * name = `Main` Start here. Add gRPC/jobs later. --- ## 7) Smart-Diff SurfaceBuilder (the “advanced” part) This is what makes your reachability actually meaningful for CVEs. ### 7.1 SurfaceBuilder inputs From your vuln ingestion pipeline: * ecosystem = nuget * package = `LibXYZ` * affected range = `<= 1.2.3` * fixed version = `1.2.4` * CVE id ### 7.2 Choose a vulnerable version to diff Pick the **highest affected version below fixed**. * fixed = 1.2.4 * vulnerable representative = 1.2.3 (If multiple fixed versions exist, build multiple surfaces.) ### 7.3 Download both packages Use NuGet.Protocol to download `.nupkg`, unzip, pick TFMs you care about (often `netstandard2.0` is safest). Compute fingerprints for each assembly. ### 7.4 Compute method fingerprints For each method: * MethodKey * Normalized IL hash ### 7.5 Diff ``` ChangedMethods = { k | hashVuln[k] != hashFixed[k] } ∪ added ∪ removed ``` Store these as `vuln_surface_sinks` with reason. ### 7.6 Build internal library call graph Same Cecil extraction, but only for package assemblies. Now compute triggers: **Reverse BFS from sinks**: * Start from all sink method keys * Walk predecessors * When you encounter a **public/exported method**, record it as a trigger Also store one internal path for each trigger → sink (for witnesses). ### 7.7 Add interface/base declarations as triggers Important: your app might call a library via an interface method signature, not the concrete implementation. For each trigger implementation method: * for each `method.Overrides` entry, add the overridden method key as an additional trigger This reduces dependence on perfect dispatch expansion during app scanning. ### 7.8 Persist the surface Store: * sinks set * triggers set * internal witness paths (optional but highly valuable) Now you’ve converted a “version range” CVE into “these specific library APIs are dangerous”. --- ## 8) Reachability engine (fast, witness-producing) ### 8.1 In-memory graph format (CSR) Don’t BFS off dictionaries; you’ll die on perf. Build integer indices: * `method_key -> nodeIndex (0..N-1)` * store arrays: * `predOffsets[N+1]` * `preds[edgeCount]` Construction: 1. count predecessors per node 2. prefix sum to offsets 3. 
fill preds

### 8.2 Reverse BFS from sinks

This computes:

* `visited[node]` = can reach a sink
* `parent[node]` = next node toward a sink (for path reconstruction)

```csharp
public sealed class ReachabilityEngine
{
    public ReachabilityResult Compute(
        Graph g,
        ReadOnlySpan<int> entrypoints,
        ReadOnlySpan<int> sinks)
    {
        var visitedMark = g.VisitMark;   // int[] length N (reused across runs)
        var parent = g.Parent;           // int[] length N (reused)
        g.RunId++;

        var q = new IntQueue(capacity: g.NodeCount);
        var sinkSet = new BitSet(g.NodeCount);

        foreach (var s in sinks)
        {
            sinkSet.Set(s);
            visitedMark[s] = g.RunId;
            parent[s] = s;
            q.Enqueue(s);
        }

        while (q.TryDequeue(out var v))
        {
            var start = g.PredOffsets[v];
            var end = g.PredOffsets[v + 1];
            for (int i = start; i < end; i++)
            {
                var p = g.Preds[i];
                if (visitedMark[p] == g.RunId) continue;
                visitedMark[p] = g.RunId;
                parent[p] = v;
                q.Enqueue(p);
            }
        }

        // Collect reachable entrypoints and paths
        var results = new List<EntryWitness>();
        foreach (var e in entrypoints)
        {
            if (visitedMark[e] != g.RunId) continue;
            var path = ReconstructPath(e, parent, sinkSet);
            results.Add(new EntryWitness(e, path));
        }

        return new ReachabilityResult(results);
    }

    private static int[] ReconstructPath(int entry, int[] parent, BitSet sinks)
    {
        var path = new List<int>(32);
        int cur = entry;
        path.Add(cur);

        // follow parent pointers until a sink
        for (int guard = 0; guard < 10_000; guard++)
        {
            if (sinks.Get(cur)) break;
            var nxt = parent[cur];
            if (nxt == cur || nxt < 0) break; // safety
            cur = nxt;
            path.Add(cur);
        }

        return path.ToArray();
    }
}
```

### 8.3 Producing the witness

For each node index in the path:

* method_key
* file_path / line_start (if known)
* optional flags (reflection_guess edge, dispatch edge)

Then attach:

* vuln id, package, version
* entrypoint kind/name
* graph digest + config digest
* surface digest
* timestamp

Send JSON to Attestor for DSSE signing, store envelope in Authority.

---

## 9) Scaling: don’t do BFS 500 times if you can avoid it

### 9.1 First-line scaling (usually enough)

* Group vulnerabilities by package/version → surfaces reused
* Only run reachability for vulns where:
  * dependency present AND
  * surface exists OR fallback mode
* Limit witnesses per vuln (top 3)

In practice, with N~50k nodes and E~200k edges, a reverse BFS is fast in C# if done with arrays.

### 9.2 Incremental Smart-Diff × Reachability (your “low noise” killer feature)

#### Step A: compute graph delta between snapshots

Use `il_hash` per method to detect changed nodes:

* added / removed / changed nodes
* edges updated only for changed nodes

#### Step B: decide which vulnerabilities need recompute

Store a cached reverse-reachable set per vuln surface if you want (bitset), OR just do a cheaper heuristic.

Recompute a vulnerability if:

* its sink set changed (new surface or version changed), OR
* any changed node is on any previously stored witness path, OR
* entrypoints changed, OR
* impacted nodes touch any trigger node’s predecessors (use a small localized search)

A practical approach:

* store all node IDs that appear in any witness path for that vuln
* if the delta touches any of those nodes/edges, recompute
* otherwise reuse the cached result

This yields a massive win on PR scans where most code is unchanged.

#### Step C: “Impact frontier” recompute (optional)

If you want to go further:

* compute `ImpactSet = ΔNodes ∪ endpoints(ΔEdges)`
* run reverse BFS **starting from ImpactSet ∩ ReverseReachSet** and update visited marks

This is trickier to implement correctly (dynamic graph), so I’d ship the heuristic first.
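Here’s a small sketch of the Step B heuristic above; `VulnRef` and the two callbacks are hypothetical shapes standing in for Authority lookups (witness-path node sets per finding, and surface change detection).

```csharp
using System;
using System.Collections.Generic;

public sealed record VulnRef(string CveId, string Package, string Version);

public static class IncrementalRecompute
{
    // Decide which vulnerabilities need a fresh reachability run after a graph delta.
    public static IEnumerable<VulnRef> SelectForRecompute(
        IEnumerable<VulnRef> vulnsInSbom,
        IReadOnlySet<string> changedMethodKeys,                // ΔNodes ∪ endpoints(ΔEdges), as method keys
        bool entrypointsChanged,
        Func<VulnRef, IReadOnlySet<string>> witnessNodesFor,   // node keys on previously stored witness paths
        Func<VulnRef, bool> surfaceChanged)                    // new/updated vuln surface for this CVE?
    {
        foreach (var v in vulnsInSbom)
        {
            if (entrypointsChanged || surfaceChanged(v))
            {
                yield return v;                                // global triggers: always recompute
                continue;
            }

            var witnessNodes = witnessNodesFor(v);
            if (witnessNodes.Count == 0 || witnessNodes.Overlaps(changedMethodKeys))
                yield return v;                                // delta touches a stored path (or nothing cached yet)

            // otherwise: keep the cached reach_findings row as-is
        }
    }
}
```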
--- ## 10) Practical fallback modes (don’t block shipping) You won’t have surfaces for every CVE on day 1. Handle this gracefully: ### Mode 1: Surface-based reachability (best) * sink = trigger methods from surface * result: “reachable” with path ### Mode 2: Package API usage (good fallback) * sink = *any* method in that package that is called by app * result: “package reachable” (lower confidence), still provide path to callsite ### Mode 3: Dependency present only (SBOM level) * no call graph needed * result: “present” only Your UI can show confidence tiers: * **Confirmed reachable (surface)** * **Likely reachable (package API)** * **Present only (SBOM)** --- ## 11) Integration points inside Stella Ops ### Scanner.Worker (per build) 1. Build/collect assemblies + pdb 2. `CallGraphBuilder` → nodes/edges/entrypoints + graph_digest 3. Load SBOM vulnerabilities list 4. For each vuln: * resolve surface triggers; if missing → enqueue SurfaceBuilder job + fallback mode * run reachability BFS * for each reachable entrypoint: emit DSSE witness 5. Persist findings/witnesses ### SurfaceBuilder (async worker) * triggered by “surface missing” events or nightly preload of top packages * computes surface once, stores forever ### Authority * stores graphs, surfaces, findings, witnesses * provides retrieval APIs for UI/CI --- ## 12) What to implement first (in the order that produces value fastest) ### Week 1–2 scope (realistic, shippable) 1. Cecil call graph extraction (direct calls) 2. MVC + Minimal API entrypoints 3. Reverse BFS reachability with path witnesses 4. DSSE witness signing + storage 5. SurfaceBuilder v1: * IL hash per method * changed methods as sinks * triggers via internal reverse BFS 6. UI: “Show Witness” + “Verify Signature” ### Next increment (precision upgrades) 7. async/await mapping to original methods 8. RTA + DI registration hints 9. delegate tracking for Minimal API handlers (if not already) 10. interface override triggers in surface builder ### Later (if you want “attackability”, not just “reachability”) 11. taint/dataflow for top sink classes (deserialization, path traversal, SQL, command exec) 12. sanitizer modeling & parameter constraints --- ## 13) Common failure modes and how to harden ### MethodKey mismatches (surface vs app call) * Ensure both are generated from the same normalization rules * For generic methods, prefer **definition** keys (strip instantiation) * Store both “exact” and “erased generic” variants if needed ### Multi-target frameworks * SurfaceBuilder: compute triggers for each TFM, union them * App scan: choose TFM closest to build RID, but allow fallback to union ### Huge graphs * Drop `System.*` nodes/edges unless: * the vuln is in System.* (rare, but handle separately) * Deduplicate nodes by MethodKey across assemblies where safe * Use CSR arrays + pooled queues ### Reflection heavy projects * Mark analysis confidence lower * Include “unknown edges present” in finding metadata * Still produce a witness path up to the reflective callsite --- If you want, I can also paste a **complete Cecil-based CallGraphBuilder class** (nodes+edges+PDB lines), plus the **SurfaceBuilder** that downloads NuGet packages and generates `vuln_surface_triggers` end-to-end.
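In the meantime, here’s a hedged sketch of the §3.1 MethodKey normalizer that §13 depends on, using Mono.Cecil; generic instantiations are erased to definition keys, and the assembly-name resolution is deliberately simplified.

```csharp
using System.Linq;
using Mono.Cecil;

public static class MethodKey
{
    // {AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...)
    public static string From(MethodReference m)
    {
        // Prefer the open generic definition so surface keys and app keys agree (§13).
        if (m is GenericInstanceMethod gim) m = gim.ElementMethod;

        var asm = m.DeclaringType.Scope switch
        {
            AssemblyNameReference anr => anr.Name,
            ModuleDefinition mod      => mod.Assembly.Name.Name,
            _                         => m.DeclaringType.Scope.Name, // simplified fallback
        };

        var arity = m.HasGenericParameters ? m.GenericParameters.Count : 0;
        var parms = string.Join(",", m.Parameters.Select(p => p.ParameterType.FullName));

        return arity == 0
            ? $"{asm}|{m.DeclaringType.FullName}|{m.Name}({parms})"
            : $"{asm}|{m.DeclaringType.FullName}|{m.Name}`{arity}({parms})";
    }
}
```

Storing the erased key (plus the exact key if you need it) is what keeps surface-side and app-side keys comparable across builds.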