git.stella-ops.org/docs/product-advisories/18-Dec-2025 - Concrete Advances in Reachability Analysis.md
StellaOps Bot 28823a8960 save progress
2025-12-18 09:10:36 +02:00


Here's a compact, practical way to add two high-leverage capabilities to your scanner: DSSE-signed path witnesses and Smart-Diff × Reachability. It covers what they are, why they matter, and exactly how to implement them in StellaOps without ceremony.


1) DSSE-signed path witnesses (entrypoint → calls → sink)

What it is (in plain terms): When you flag a CVE as “reachable,” also emit a tiny, human-readable proof: the exact path from a real entrypoint (e.g., HTTP route, CLI verb, cron) through functions/methods to the vulnerable sink. Wrap that proof in a DSSE envelope and sign it. Anyone can verify the witness later, offline, without rerunning the analysis.

Why it matters:

  • Turns red flags into auditable evidence (quiet-by-design).
  • Lets CI/CD, auditors, and customers verify findings independently.
  • Enables deterministic replay and provenance chains (ties nicely to in-toto/SLSA).

Minimal JSON witness (stable, vendor-neutral):

{
  "witness_schema": "stellaops.witness.v1",
  "artifact": { "sbom_digest": "sha256:...", "component_purl": "pkg:nuget/Example@1.2.3" },
  "vuln": { "id": "CVE-2024-XXXX", "source": "NVD", "range": "≤1.2.3" },
  "entrypoint": { "kind": "http", "name": "GET /billing/pay" },
  "path": [
    {"symbol": "BillingController.Pay()", "file": "BillingController.cs", "line": 42},
    {"symbol": "PaymentsService.Authorize()", "file": "PaymentsService.cs", "line": 88},
    {"symbol": "LibXYZ.Parser.Parse()", "file": "Parser.cs", "line": 17}
  ],
  "sink": { "symbol": "LibXYZ.Parser.Parse()", "type": "deserialization" },
  "evidence": {
    "callgraph_digest": "sha256:...",
    "build_id": "dotnet:RID:linux-x64:sha256:...",
    "analysis_config_digest": "sha256:..."
  },
  "observed_at": "2025-12-18T00:00:00Z"
}

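Wrap in DSSE (payloadType and payload are required; each signature is computed over the DSSE pre-authentication encoding of those two fields, not over the raw payload):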

{
  "payloadType": "application/vnd.stellaops.witness+json",
  "payload": "base64(JSON_above)",
  "signatures": [{ "keyid": "attestor-stellaops-ed25519", "sig": "base64(...)" }]
}

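.NET 10 signing sketch (Ed25519)

using System.Text;
using System.Text.Json;

// DSSE signs the pre-authentication encoding (PAE) of the payload, not the raw payload bytes.
const string payloadType = "application/vnd.stellaops.witness+json";
byte[] payloadBytes = JsonSerializer.SerializeToUtf8Bytes(witnessJsonObj);
byte[] pae = Pae(payloadType, payloadBytes);

var dsse = new {
  payloadType,
  payload = Convert.ToBase64String(payloadBytes),
  signatures = new [] { new { keyid = keyId, sig = Convert.ToBase64String(Sign(pae, privateKey)) } }
};

// PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body (LEN = ASCII decimal byte count)
static byte[] Pae(string type, byte[] body)
{
    byte[] header = Encoding.UTF8.GetBytes($"DSSEv1 {Encoding.UTF8.GetByteCount(type)} {type} {body.Length} ");
    return [.. header, .. body];
}

// System.Security.Cryptography has no built-in Ed25519 type as far as I know, so this stays a
// placeholder: plug in your Ed25519 helper (for example one backed by NSec or BouncyCastle).
static byte[] Sign(byte[] data, byte[] privateKey) =>
    throw new NotSupportedException("Wire up your Ed25519 helper here.");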

Where to emit:

  • Scanner.Worker: after reachability confirms reachable=true, emit witness → Attestor signs → Authority stores (Postgres) → optional Rekor-style mirror.
  • Expose /witness/{findingId} for download & independent verification.
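
For the independent-verification side, a minimal sketch of what a verifier could do with a downloaded envelope: rebuild the DSSE pre-authentication encoding from payloadType and payload, then check each signature over those bytes. The class name DsseWitness, the TryVerify method, and the verifySignature callback are assumptions for illustration; the callback stands in for whichever Ed25519 implementation you use.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class DsseWitness
{
    // PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body,
    // where LEN is the ASCII decimal byte length.
    public static byte[] Pae(string payloadType, byte[] payload)
    {
        byte[] header = Encoding.UTF8.GetBytes(FormattableString.Invariant(
            $"DSSEv1 {Encoding.UTF8.GetByteCount(payloadType)} {payloadType} {payload.Length} "));
        byte[] result = new byte[header.Length + payload.Length];
        header.CopyTo(result, 0);
        payload.CopyTo(result, header.Length);
        return result;
    }

    // Returns true and the decoded witness JSON if at least one signature verifies.
    // verifySignature(keyId, signedBytes, signature) is a hypothetical callback.
    public static bool TryVerify(
        string envelopeJson,
        Func<string, byte[], byte[], bool> verifySignature,
        out string witnessJson)
    {
        using JsonDocument doc = JsonDocument.Parse(envelopeJson);
        JsonElement root = doc.RootElement;
        string payloadType = root.GetProperty("payloadType").GetString() ?? "";
        byte[] payload = Convert.FromBase64String(root.GetProperty("payload").GetString() ?? "");
        byte[] signedBytes = Pae(payloadType, payload);

        foreach (JsonElement s in root.GetProperty("signatures").EnumerateArray())
        {
            string keyId = s.GetProperty("keyid").GetString() ?? "";
            byte[] sig = Convert.FromBase64String(s.GetProperty("sig").GetString() ?? "");
            if (verifySignature(keyId, signedBytes, sig))
            {
                witnessJson = Encoding.UTF8.GetString(payload);
                return true;
            }
        }
        witnessJson = null;
        return false;
    }
}
```

Keeping the signature check behind a callback means the same verifier can run offline with only the envelope and the public key.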

2) Smart-Diff × Reachability (incremental, low-noise updates)

What it is: On SBOM/VEX/dependency deltas, don't rescan everything. Update only affected regions of the call graph and recompute reachability just for changed nodes/edges.

Why it matters:

  • Order-of-magnitude faster incremental scans when the delta is small relative to the graph.
  • Fewer flaky diffs; triage stays focused on meaningful risk change.
  • Perfect for PR gating: “what changed” → “what became reachable/unreachable.”

Core idea (graph reachability):

  • Maintain a per-service call graph G = (V, E) with entrypoint set S.
  • On diff: compute changed nodes/edges ΔV/ΔE.
  • Run incremental BFS/DFS from impacted nodes to sinks (forward or backward), reusing memoized results.
  • Recompute only frontiers touched by Δ.

Minimal tables (Postgres):

-- Nodes (functions/methods)
CREATE TABLE cg_nodes(
  id BIGSERIAL PRIMARY KEY,
  service TEXT, symbol TEXT, file TEXT, line INT,
  hash TEXT, UNIQUE(service, hash)
);
-- Edges (calls)
CREATE TABLE cg_edges(
  src BIGINT REFERENCES cg_nodes(id),
  dst BIGINT REFERENCES cg_nodes(id),
  kind TEXT, PRIMARY KEY(src, dst)
);
-- Entrypoints & Sinks
CREATE TABLE cg_entrypoints(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY);
CREATE TABLE cg_sinks(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY, sink_type TEXT);

-- Memoized reachability cache
CREATE TABLE cg_reach_cache(
  entry_id BIGINT, sink_id BIGINT,
  path JSONB, reachable BOOLEAN,
  updated_at TIMESTAMPTZ,
  PRIMARY KEY(entry_id, sink_id)
);

Incremental algorithm (pseudocode):

Input: ΔSBOM, ΔDeps, ΔCode → ΔNodes, ΔEdges
1) Apply Δ to cg_nodes/cg_edges
2) ImpactSet = neighbors(ΔNodes ∪ endpoints(ΔEdges))
3) For each e ∈ Entrypoints ∩ ancestors(ImpactSet):
     Recompute forward search to affected sinks, stop early on unchanged subgraphs
     Update cg_reach_cache; if state flips, emit new/updated DSSE witness

.NET 10 reachability sketch (fast & local):

HashSet<int> ImpactSet = ComputeImpact(deltaNodes, deltaEdges);
foreach (var e in Intersect(Entrypoints, Ancestors(ImpactSet)))
{
    var res = BoundedReach(e, affectedSinks, graph, cache);
    foreach (var r in res.Changed)
    {
        cache.Upsert(e, r.Sink, r.Path, r.Reachable);
        if (r.Reachable) EmitDsseWitness(e, r.Sink, r.Path);
    }
}
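
The ComputeImpact and Ancestors helpers used above are not defined in the sketch. One way they might look, assuming the graph is available as a predecessor map (callee → callers) keyed by in-memory node index; the class name ImpactAnalysis is hypothetical, and the extra neighbors(...) expansion from the pseudocode is left out for brevity:

```csharp
using System.Collections.Generic;

public static class ImpactAnalysis
{
    // ImpactSet = ΔNodes ∪ endpoints(ΔEdges)
    public static HashSet<int> ComputeImpact(
        IEnumerable<int> deltaNodes,
        IEnumerable<(int Src, int Dst)> deltaEdges)
    {
        var impact = new HashSet<int>(deltaNodes);
        foreach (var (src, dst) in deltaEdges)
        {
            impact.Add(src);
            impact.Add(dst);
        }
        return impact;
    }

    // All nodes that can reach any seed (seeds included), found by a BFS
    // that walks predecessor lists instead of successor lists.
    public static HashSet<int> Ancestors(
        IEnumerable<int> seeds,
        IReadOnlyDictionary<int, List<int>> preds)
    {
        var seen = new HashSet<int>(seeds);
        var queue = new Queue<int>(seen);
        while (queue.Count > 0)
        {
            int v = queue.Dequeue();
            if (!preds.TryGetValue(v, out var callers)) continue;
            foreach (int p in callers)
            {
                if (seen.Add(p)) queue.Enqueue(p); // first visit only
            }
        }
        return seen;
    }
}
```

Intersecting the Ancestors result with the entrypoint set gives exactly the entrypoints whose reachability could have changed.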

CI/PR flow:

  1. Build → SBOM diff → Dependency diff → Callgraph delta.
  2. Run incremental reachability.
  3. If any unreachable→reachable transitions: fail gate, attach DSSE witnesses.
  4. If reachable→unreachable: auto-close prior findings (and archive prior witness).
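
Steps 3 and 4 amount to a set difference between the findings that were reachable before and after the change. A minimal sketch, assuming each finding is identified by an opaque string key (the key format, StateFlips, and PrGate are illustrative names, not existing StellaOps types):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record StateFlips(
    IReadOnlyList<string> BecameReachable,
    IReadOnlyList<string> BecameUnreachable)
{
    // The gate fails only on unreachable→reachable transitions.
    public bool FailsGate => BecameReachable.Count > 0;
}

public static class PrGate
{
    public static StateFlips Compare(
        IEnumerable<string> reachableBefore,
        IEnumerable<string> reachableAfter)
    {
        var before = new HashSet<string>(reachableBefore, StringComparer.Ordinal);
        var after = new HashSet<string>(reachableAfter, StringComparer.Ordinal);

        // Sorted output keeps PR annotations deterministic across runs.
        var gained = after.Where(f => !before.Contains(f))
                          .OrderBy(f => f, StringComparer.Ordinal).ToList();
        var lost = before.Where(f => !after.Contains(f))
                         .OrderBy(f => f, StringComparer.Ordinal).ToList();
        return new StateFlips(gained, lost);
    }
}
```

Findings that stay reachable or stay unreachable produce no output, which is what keeps the PR check quiet.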

UX hooks (quick wins)

  • In findings list, add a “Show Witness” button → modal renders the signed path (entrypoint→…→sink) + “Verify Signature” one-click.
  • In PR checks, summarize only state flips with tiny links: “+2 reachable (view witness)” / “-1 (now unreachable)”.

Minimal tasks to get this live

  • Scanner.Worker: build callgraph extraction (per language), add incremental graph store, reachability cache.
  • Attestor: DSSE signing endpoint + key management (Ed25519 by default; PQC mode later).
  • Authority: tables above + witness storage + retrieval API.
  • Router/CI plugin: PR annotation with state flips and links to witnesses.
  • UI: witness modal + signature verify.

If you want, I can draft the exact Postgres migrations, the C# repositories, and a tiny verifier CLI that checks DSSE signatures and prints the call path.

Below is a concrete, buildable blueprint for an advanced reachability analysis engine inside Stella Ops. I'm going to assume your “Stella Ops” components are roughly:

  • Scanner.Worker: runs analyses in CI / on artifacts
  • Authority: stores graphs/findings/witnesses
  • Attestor: signs DSSE envelopes (Ed25519)
  • (optional) SurfaceBuilder: background worker that computes “vuln surfaces” for packages

The key advance is: don't treat a CVE as “a package”. Treat it as a set of trigger methods (public API) that can reach the vulnerable code inside the dependency, computed by “Smart-Diff” once and reused everywhere.


0) Define the contract (precision/soundness) up front

If you don't write this down, you'll fight false positives/negatives forever.

What Stella Ops will guarantee (first release)

  • Whole-program static call graph (app + selected dependency assemblies)
  • Context-insensitive (fast), path witness extracted (shortest path)
  • Dynamic dispatch handled with CHA/RTA (+ DI hints), with explicit uncertainty flags
  • Reflection handled best-effort (constant-string resolution), otherwise “unknown edge”

What it will NOT guarantee (first release)

  • Perfect handling of reflection / dynamic / runtime codegen
  • Perfect delegate/event resolution across complex flows
  • Full taint/dataflow reachability (you can add later)

This is fine. The major value is: “we can show you the call path” and “we can prove the vuln is triggered by calling these library APIs”.


1) The big idea: “Vuln surfaces” (Smart-Diff → triggers)

Problem

CVE feeds typically say “package X version range Y is vulnerable” but rarely say which methods. If you only do package-level reachability, noise is huge.

Solution

For each CVE+package, compute a vulnerability surface:

  • Candidate sinks = methods changed between vulnerable and fixed versions (diff at IL level)
  • Trigger methods = public/exported methods in the vulnerable version that can reach those changed methods internally

Then your service scan becomes:

“Can any entrypoint reach any trigger method?”

This is both faster and more precise.


2) Data model (Authority / Postgres)

You already had call graph tables; here's a concrete schema that supports:

  • graph snapshots
  • incremental updates
  • vuln surfaces
  • reachability cache
  • DSSE witnesses

2.1 Graph tables

CREATE TABLE cg_snapshots (
  snapshot_id BIGSERIAL PRIMARY KEY,
  service TEXT NOT NULL,
  build_id TEXT NOT NULL,
  graph_digest TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(service, build_id)
);

CREATE TABLE cg_nodes (
  node_id BIGSERIAL PRIMARY KEY,
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  method_key TEXT NOT NULL,              -- stable key (see below)
  asm_name TEXT,
  type_name TEXT,
  method_name TEXT,
  file_path TEXT,
  line_start INT,
  il_hash TEXT,                          -- normalized IL hash for diffing
  flags INT NOT NULL DEFAULT 0,          -- bitflags: has_reflection, compiler_generated, etc.
  UNIQUE(snapshot_id, method_key)
);

CREATE TABLE cg_edges (
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  src_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  dst_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind SMALLINT NOT NULL,                -- 0=call,1=newobj,2=dispatch,3=delegate,4=reflection_guess,...
  PRIMARY KEY(snapshot_id, src_node_id, dst_node_id, kind)
);

CREATE TABLE cg_entrypoints (
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind TEXT NOT NULL,                    -- http, grpc, cli, job, etc.
  name TEXT NOT NULL,                    -- GET /foo, "Main", etc.
  PRIMARY KEY(snapshot_id, node_id, kind, name)
);

2.2 Vuln surface tables (Smart-Diff artifacts)

CREATE TABLE vuln_surfaces (
  surface_id BIGSERIAL PRIMARY KEY,
  ecosystem TEXT NOT NULL,               -- nuget
  package TEXT NOT NULL,
  cve_id TEXT NOT NULL,
  vuln_version TEXT NOT NULL,            -- a representative vulnerable version
  fixed_version TEXT NOT NULL,
  surface_digest TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(ecosystem, package, cve_id, vuln_version, fixed_version)
);

CREATE TABLE vuln_surface_sinks (
  surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  sink_method_key TEXT NOT NULL,
  reason TEXT NOT NULL,                  -- changed|added|removed|heuristic
  PRIMARY KEY(surface_id, sink_method_key)
);

CREATE TABLE vuln_surface_triggers (
  surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  trigger_method_key TEXT NOT NULL,
  sink_method_key TEXT NOT NULL,
  internal_path JSONB,                   -- optional: library internal witness path
  PRIMARY KEY(surface_id, trigger_method_key, sink_method_key)
);

2.3 Reachability cache & witnesses

CREATE TABLE reach_findings (
  finding_id BIGSERIAL PRIMARY KEY,
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  cve_id TEXT NOT NULL,
  ecosystem TEXT NOT NULL,
  package TEXT NOT NULL,
  package_version TEXT NOT NULL,
  reachable BOOLEAN NOT NULL,
  reachable_entrypoints INT NOT NULL DEFAULT 0,
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(snapshot_id, cve_id, package, package_version)
);

CREATE TABLE reach_witnesses (
  witness_id BIGSERIAL PRIMARY KEY,
  finding_id BIGINT REFERENCES reach_findings(finding_id) ON DELETE CASCADE,
  entry_node_id BIGINT REFERENCES cg_nodes(node_id),
  dsse_envelope JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

3) Stable identity: MethodKey + IL hash

3.1 MethodKey (must be stable across builds)

Use a normalized string like:

{AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...)

Examples:

  • MyApp|BillingController|Pay(System.String)
  • LibXYZ|LibXYZ.Parser|Parse(System.ReadOnlySpan<System.Byte>)
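
A minimal sketch of a formatter for this key, working on plain strings so the same rule can be shared by the app scanner and the surface builder. MethodKeyFormat is a hypothetical name, and omitting the arity suffix for non-generic methods is an assumption inferred from the examples above:

```csharp
using System.Collections.Generic;

public static class MethodKeyFormat
{
    public static string Format(
        string assemblyName,
        string declaringTypeFullName,
        string methodName,
        int genericArity,
        IEnumerable<string> parameterTypeFullNames)
    {
        // Assumption: the `N suffix appears only for generic methods,
        // matching the non-generic examples in the text.
        string arity = genericArity > 0 ? "`" + genericArity : "";
        string parameters = string.Join(",", parameterTypeFullNames);
        return $"{assemblyName}|{declaringTypeFullName}|{methodName}{arity}({parameters})";
    }
}
```

For example, Format("MyApp", "BillingController", "Pay", 0, new[] { "System.String" }) yields the first example key above.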

3.2 Normalized IL hash (for smart-diff + incremental graph updates)

Raw IL bytes aren't stable (metadata tokens change). Normalize:

  • opcode names
  • branch targets by instruction index, not offset
  • method operands by resolved MethodKey
  • string operands by literal or hashed literal
  • type operands by full name

Then hash SHA256(normalized_bytes).
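
As a rough sketch of the hashing step only, assume normalization has already turned each instruction into an opcode name plus a resolved operand string (MethodKey, type full name, instruction index, or literal). NormalizedInstruction, IlFingerprint, and the tab/newline serialization are illustrative choices, not a fixed format:

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public readonly record struct NormalizedInstruction(string OpCode, string Operand);

public static class IlFingerprint
{
    public static string Hash(IEnumerable<NormalizedInstruction> body)
    {
        // Serialize one instruction per line: "<opcode>\t<operand>\n".
        // Metadata tokens never appear here, so rebuilds hash identically.
        var sb = new StringBuilder();
        foreach (var ins in body)
        {
            sb.Append(ins.OpCode).Append('\t').Append(ins.Operand).Append('\n');
        }
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString()));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

String operands that may contain tabs or newlines are one reason to prefer hashed literals, since they keep the serialization unambiguous.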


4) Call graph extraction for .NET (concrete, doable)

Tooling choice

Start with Mono.Cecil (MIT license, easy IL traversal). You can later swap to System.Reflection.Metadata for speed.

4.1 Build process (Scanner.Worker)

  1. dotnet restore (use your locked restore)

  2. dotnet build -c Release /p:DebugType=portable /p:DebugSymbols=true

  3. Collect:

    • app assemblies: bin/Release/**/publish/*.dll or build output
    • .pdb files for sequence points (file/line for witnesses)

4.2 Cecil loader

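using Mono.Cecil;
using Mono.Cecil.Cil;

// Read portable PDB symbols alongside the assembly so sequence points (file/line) are available.
var rp = new ReaderParameters {
    ReadSymbols = true,
    SymbolReaderProvider = new PortablePdbReaderProvider()
};

var asm = AssemblyDefinition.ReadAssembly(dllPath, rp);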

4.3 Node extraction (methods)

Walk all types, including nested:

IEnumerable<TypeDefinition> AllTypes(ModuleDefinition m)
{
    var stack = new Stack<TypeDefinition>(m.Types);
    while (stack.Count > 0)
    {
        var t = stack.Pop();
        yield return t;
        foreach (var nt in t.NestedTypes) stack.Push(nt);
    }
}

foreach (var type in AllTypes(asm.MainModule))
foreach (var method in type.Methods)
{
    var key = MethodKey.From(method);           // your normalizer
    var (file, line) = PdbFirstSequencePoint(method);
    var ilHash = method.HasBody ? ILFingerprint(method) : null;

    // store node (method_key, file, line, il_hash, flags...)
}

4.4 Edge extraction (direct calls)

foreach (var method in type.Methods.Where(m => m.HasBody))
{
    var srcKey = MethodKey.From(method);
    foreach (var ins in method.Body.Instructions)
    {
        if (ins.Operand is MethodReference mr)
        {
            if (ins.OpCode.Code is Code.Call or Code.Callvirt or Code.Newobj)
            {
                var dstKey = MethodKey.From(mr); // important: stable even if not resolved
                edges.Add(new Edge(srcKey, dstKey, kind: CallKind.Direct));
            }
            if (ins.OpCode.Code is Code.Ldftn or Code.Ldvirtftn)
            {
                // delegate capture (handle later)
            }
        }
    }
}

5) Advanced precision: dynamic dispatch + DI + async/await

If you stop at direct edges only, you'll miss many real paths.

5.1 Async/await mapping (critical for readable witnesses)

Async methods compile into a state machine type whose MoveNext() method holds the original body. You want edges attributed back to the original method.

In Cecil:

  • Check AsyncStateMachineAttribute on a method
  • It references a state machine type
  • Find that type's MoveNext method
  • Map MoveNextKey -> OriginalMethodKey

Then, while extracting edges:

srcKey = MoveNextToOriginal.TryGetValue(srcKey, out var original) ? original : srcKey;

Do the same for iterator state machines.

5.2 Virtual/interface dispatch (CHA/RTA)

You need 2 maps:

  1. type hierarchy / interface impl map
  2. override map from “declared method” → “implementation method(s)”

Build override map

// For each method, Cecil exposes method.Overrides for explicit implementations.
overrideMap[MethodKey.From(overrideRef)] = MethodKey.From(methodDef);

CHA: for callvirt to virtual method T.M, add edges to overrides in derived classes.

RTA: restrict to derived classes that are actually instantiated.

How to get instantiated types:

  • look for newobj instructions and add the created type to InstantiatedTypes
  • plus DI registrations (below)
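
To make the CHA/RTA difference concrete, here is a small sketch over plain dictionaries. It assumes the override map has been expanded to “declared method → (implementing type, implementation method)” pairs and that InstantiatedTypes is a set of type names; DispatchResolver and the tuple shape are illustrative. It also simplifies RTA by ignoring subclasses that inherit an implementation without overriding it.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DispatchResolver
{
    // CHA: every known implementation of the declared method is a possible target.
    public static List<string> Cha(
        string declaredMethod,
        IReadOnlyDictionary<string, List<(string ImplType, string ImplMethod)>> overrides)
    {
        return overrides.TryGetValue(declaredMethod, out var impls)
            ? impls.Select(i => i.ImplMethod).ToList()
            : new List<string>();
    }

    // RTA: keep only implementations whose type is instantiated somewhere
    // (newobj sites plus DI registrations).
    public static List<string> Rta(
        string declaredMethod,
        IReadOnlyDictionary<string, List<(string ImplType, string ImplMethod)>> overrides,
        ISet<string> instantiatedTypes)
    {
        return overrides.TryGetValue(declaredMethod, out var impls)
            ? impls.Where(i => instantiatedTypes.Contains(i.ImplType))
                   .Select(i => i.ImplMethod)
                   .ToList()
            : new List<string>();
    }
}
```

Edges produced this way would be stored with the dispatch kind so witnesses can flag them as less certain than direct calls.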

5.3 DI hints (Microsoft.Extensions.DependencyInjection)

You will see calls like:

  • ServiceCollectionServiceExtensions.AddTransient<TService, TImpl>(...)

In IL these are generic method calls. Detect and record TService -> TImpl as “instantiated”. This massively improves RTA for modern .NET apps.

5.4 Delegates/lambdas (good enough approach)

Implement intraprocedural tracking:

  • when you see ldftn SomeMethod then newobj Action::.ctor then stloc.s X
  • store delegateTargets[local X] += SomeMethod
  • when you see ldloc.s X and later callvirt Invoke, add edges to targets

This makes Minimal API entrypoint discovery work too.

5.5 Reflection (best-effort)

Implement only high-signal heuristics:

  • typeof(T).GetMethod("Foo") with constant "Foo"
  • GetType().GetMethod("Foo") with constant "Foo" (type unknown → mark uncertain)

If resolved, add edge with kind=reflection_guess. If not, set node flag has_reflection = true and in results show “may be incomplete”.


6) Entrypoint detection (concrete detectors)

6.1 MVC controllers

Detect:

  • types deriving from Microsoft.AspNetCore.Mvc.ControllerBase

  • methods:

    • public
    • not [NonAction]
    • has [HttpGet], [HttpPost], [Route] etc.

Extract the route template from the attribute's constructor arguments.

Store in cg_entrypoints:

  • kind = http
  • name = GET /billing/pay (compose verb+template)

6.2 Minimal APIs

Scan Program.Main IL:

  • find calls to MapGet, MapPost, ...
  • extract route string from preceding ldstr
  • resolve handler method via delegate tracking (ldftn)

Entry:

  • kind = http
  • name = GET /foo

6.3 CLI

Find assembly entry point method (asm.EntryPoint) or static Main. Entry:

  • kind = cli
  • name = Main

Start here. Add gRPC/jobs later.


7) Smart-Diff SurfaceBuilder (the “advanced” part)

This is what makes your reachability actually meaningful for CVEs.

7.1 SurfaceBuilder inputs

From your vuln ingestion pipeline:

  • ecosystem = nuget
  • package = LibXYZ
  • affected range = <= 1.2.3
  • fixed version = 1.2.4
  • CVE id

7.2 Choose a vulnerable version to diff

Pick the highest affected version below fixed.

  • fixed = 1.2.4
  • vulnerable representative = 1.2.3

(If multiple fixed versions exist, build multiple surfaces.)

7.3 Download both packages

Use NuGet.Protocol to download .nupkg, unzip, pick TFMs you care about (often netstandard2.0 is safest). Compute fingerprints for each assembly.

7.4 Compute method fingerprints

For each method:

  • MethodKey
  • Normalized IL hash

7.5 Diff

ChangedMethods = { k | hashVuln[k] != hashFixed[k] } ∪ added ∪ removed

Store these as vuln_surface_sinks with reason.
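
A minimal sketch of that diff, assuming each version has been reduced to a MethodKey → IL hash map; the returned reason strings mirror the values noted on vuln_surface_sinks.reason, and SurfaceDiff is a hypothetical name:

```csharp
using System.Collections.Generic;

public static class SurfaceDiff
{
    // Returns sink MethodKey -> reason ("changed", "added", or "removed").
    public static Dictionary<string, string> ComputeSinks(
        IReadOnlyDictionary<string, string> hashVuln,
        IReadOnlyDictionary<string, string> hashFixed)
    {
        var sinks = new Dictionary<string, string>();
        foreach (var (key, vulnHash) in hashVuln)
        {
            if (!hashFixed.TryGetValue(key, out var fixedHash))
                sinks[key] = "removed";          // exists only in the vulnerable version
            else if (fixedHash != vulnHash)
                sinks[key] = "changed";          // body differs between versions
        }
        foreach (var key in hashFixed.Keys)
        {
            if (!hashVuln.ContainsKey(key))
                sinks[key] = "added";            // exists only in the fixed version
        }
        return sinks;
    }
}
```

Methods marked "added" do not exist in the vulnerable assembly, so they are mainly useful as hints about where the fix landed.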

7.6 Build internal library call graph

Same Cecil extraction, but only for package assemblies. Now compute triggers:

Reverse BFS from sinks:

  • Start from all sink method keys
  • Walk predecessors
  • When you encounter a public/exported method, record it as a trigger

Also store one internal path for each trigger → sink (for witnesses).
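
The trigger computation can be sketched as a reverse BFS that remembers, for every visited method, the next hop toward a sink. The caller map (callee → callers), the isPublic predicate, and the TriggerFinder name are assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class TriggerFinder
{
    // Returns trigger MethodKey -> one internal path (trigger, ..., sink).
    public static Dictionary<string, List<string>> FindTriggers(
        IEnumerable<string> sinks,
        IReadOnlyDictionary<string, List<string>> callers,
        Func<string, bool> isPublic)
    {
        // next[m] = the method m calls on its shortest path to a sink; sinks map to themselves.
        var next = new Dictionary<string, string>();
        var queue = new Queue<string>();
        foreach (var s in sinks)
        {
            if (next.TryAdd(s, s)) queue.Enqueue(s);
        }
        while (queue.Count > 0)
        {
            string v = queue.Dequeue();
            if (!callers.TryGetValue(v, out var preds)) continue;
            foreach (string p in preds)
            {
                if (next.TryAdd(p, v)) queue.Enqueue(p);
            }
        }

        var triggers = new Dictionary<string, List<string>>();
        foreach (string node in next.Keys)
        {
            if (!isPublic(node)) continue;
            var path = new List<string> { node };
            for (string cur = node; next[cur] != cur;)
            {
                cur = next[cur];
                path.Add(cur);
            }
            triggers[node] = path;
        }
        return triggers;
    }
}
```

Because BFS visits nodes in order of distance from the sinks, each stored path is a shortest internal path, which keeps witnesses short.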

7.7 Add interface/base declarations as triggers

Important: your app might call a library via an interface method signature, not the concrete implementation.

For each trigger implementation method:

  • for each method.Overrides entry, add the overridden method key as an additional trigger

This reduces dependence on perfect dispatch expansion during app scanning.

7.8 Persist the surface

Store:

  • sinks set
  • triggers set
  • internal witness paths (optional but highly valuable)

Now you've converted a “version range” CVE into “these specific library APIs are dangerous”.


8) Reachability engine (fast, witness-producing)

8.1 In-memory graph format (CSR)

Don't BFS off dictionaries; hashing and pointer chasing will dominate the runtime on large graphs.

Build integer indices:

  • method_key -> nodeIndex (0..N-1)

  • store arrays:

    • predOffsets[N+1]
    • preds[edgeCount]

Construction:

  1. count predecessors per node
  2. prefix sum to offsets
  3. fill preds
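
The three construction steps can be sketched as follows, assuming edges arrive as (src, dst) pairs of node indices in 0..N-1. CsrPredecessors is an illustrative name; its PredOffsets and Preds arrays correspond to the arrays listed above:

```csharp
using System;
using System.Collections.Generic;

public sealed class CsrPredecessors
{
    public int[] PredOffsets { get; }   // length N + 1
    public int[] Preds { get; }         // length edgeCount

    public CsrPredecessors(int nodeCount, IReadOnlyList<(int Src, int Dst)> edges)
    {
        PredOffsets = new int[nodeCount + 1];

        // 1. Count predecessors per node (stored one slot to the right).
        foreach (var (_, dst) in edges) PredOffsets[dst + 1]++;

        // 2. Prefix sum turns counts into start offsets.
        for (int i = 0; i < nodeCount; i++) PredOffsets[i + 1] += PredOffsets[i];

        // 3. Fill preds, using a moving write cursor per node.
        Preds = new int[edges.Count];
        var cursor = (int[])PredOffsets.Clone();
        foreach (var (src, dst) in edges) Preds[cursor[dst]++] = src;
    }

    // Predecessors of a node are the slice [PredOffsets[n], PredOffsets[n + 1]).
    public ReadOnlySpan<int> PredecessorsOf(int node) =>
        Preds.AsSpan(PredOffsets[node], PredOffsets[node + 1] - PredOffsets[node]);
}
```

The whole graph lives in two flat int arrays, so the BFS inner loop is a linear scan over a contiguous slice.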

8.2 Reverse BFS from sinks

This computes:

  • visited[node] = can reach a sink
  • parent[node] = next node toward a sink (for path reconstruction)

public sealed class ReachabilityEngine
{
    public ReachabilityResult Compute(
        Graph g,
        ReadOnlySpan<int> entrypoints,
        ReadOnlySpan<int> sinks)
    {
        var visitedMark = g.VisitMark;      // int[] length N (reused across runs)
        var parent = g.Parent;              // int[] length N (reused)
        g.RunId++;

        var q = new IntQueue(capacity: g.NodeCount);
        var sinkSet = new BitSet(g.NodeCount);
        foreach (var s in sinks)
        {
            sinkSet.Set(s);
            visitedMark[s] = g.RunId;
            parent[s] = s;
            q.Enqueue(s);
        }

        while (q.TryDequeue(out var v))
        {
            var start = g.PredOffsets[v];
            var end = g.PredOffsets[v + 1];
            for (int i = start; i < end; i++)
            {
                var p = g.Preds[i];
                if (visitedMark[p] == g.RunId) continue;
                visitedMark[p] = g.RunId;
                parent[p] = v;
                q.Enqueue(p);
            }
        }

        // Collect reachable entrypoints and paths
        var results = new List<EntryWitness>();
        foreach (var e in entrypoints)
        {
            if (visitedMark[e] != g.RunId) continue;
            var path = ReconstructPath(e, parent, sinkSet);
            results.Add(new EntryWitness(e, path));
        }

        return new ReachabilityResult(results);
    }

    private static int[] ReconstructPath(int entry, int[] parent, BitSet sinks)
    {
        var path = new List<int>(32);
        int cur = entry;
        path.Add(cur);

        // follow parent pointers until a sink
        for (int guard = 0; guard < 10_000; guard++)
        {
            if (sinks.Get(cur)) break;
            var nxt = parent[cur];
            if (nxt == cur || nxt < 0) break; // safety
            cur = nxt;
            path.Add(cur);
        }
        return path.ToArray();
    }
}

8.3 Producing the witness

For each node index in the path:

  • method_key
  • file_path / line_start (if known)
  • optional flags (reflection_guess edge, dispatch edge)

Then attach:

  • vuln id, package, version
  • entrypoint kind/name
  • graph digest + config digest
  • surface digest
  • timestamp

Send JSON to Attestor for DSSE signing, store envelope in Authority.


9) Scaling: don't do BFS 500 times if you can avoid it

9.1 First-line scaling (usually enough)

  • Group vulnerabilities by package/version → surfaces reused

  • Only run reachability for vulns where:

    • dependency present AND
    • surface exists OR fallback mode
  • Limit witnesses per vuln (top 3)

In practice, with N ≈ 50k nodes and E ≈ 200k edges, a reverse BFS is fast in C# if done with arrays.

9.2 Incremental Smart-Diff × Reachability (your “low noise” killer feature)

Step A: compute graph delta between snapshots

Use il_hash per method to detect changed nodes:

  • added / removed / changed nodes
  • edges updated only for changed nodes

Step B: decide which vulnerabilities need recompute

Store a cached reverse-reachable set per vuln surface if you want (bitset), OR just do a cheaper heuristic:

Recompute for vulnerability if:

  • sink set changed (new surface or version changed), OR
  • any changed node is on any previously stored witness path, OR
  • entrypoints changed, OR
  • impacted nodes touch any trigger node's predecessors (use a small localized search)

A practical approach:

  • store all node IDs that appear in any witness path for that vuln
  • if delta touches any of those nodes/edges, recompute
  • otherwise reuse cached result
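
A sketch of that decision as a single predicate. The wasReachable branch is an added assumption rather than something stated above: a finding that was previously unreachable has no witness nodes to compare against, so this version conservatively recomputes it whenever the graph changed at all. RecomputePolicy and its parameter names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RecomputePolicy
{
    public static bool NeedsRecompute(
        bool surfaceChanged,
        bool entrypointsChanged,
        bool wasReachable,
        ISet<int> witnessNodes,                      // node IDs on any stored witness path
        IEnumerable<int> deltaNodes,
        IEnumerable<(int Src, int Dst)> deltaEdges)
    {
        if (surfaceChanged || entrypointsChanged) return true;

        // Assumption: no witness exists for an unreachable finding, so any
        // graph change could have created a new path to a trigger.
        if (!wasReachable) return deltaNodes.Any() || deltaEdges.Any();

        // Reachable finding: recompute only if the delta touches a witness path.
        if (deltaNodes.Any(witnessNodes.Contains)) return true;
        return deltaEdges.Any(e => witnessNodes.Contains(e.Src) || witnessNodes.Contains(e.Dst));
    }
}
```

The localized predecessor search mentioned in the rule list could replace the conservative branch once it is available.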

This yields a massive win on PR scans where most code is unchanged.

Step C: “Impact frontier” recompute (optional)

If you want more advanced:

  • compute ImpactSet = ΔNodes ∪ endpoints(ΔEdges)
  • run reverse BFS starting from ImpactSet ∩ ReverseReachSet and update visited marks

This is trickier to implement correctly (dynamic graph), so I'd ship the heuristic first.

10) Practical fallback modes (don't block shipping)

You won't have surfaces for every CVE on day 1. Handle this gracefully:

Mode 1: Surface-based reachability (best)

  • sink = trigger methods from surface
  • result: “reachable” with path

Mode 2: Package API usage (good fallback)

  • sink = any method in that package that is called by app
  • result: “package reachable” (lower confidence), still provide path to callsite

Mode 3: Dependency present only (SBOM level)

  • no call graph needed
  • result: “present” only

Your UI can show confidence tiers:

  • Confirmed reachable (surface)
  • Likely reachable (package API)
  • Present only (SBOM)

11) Integration points inside Stella Ops

Scanner.Worker (per build)

  1. Build/collect assemblies + pdb

  2. CallGraphBuilder → nodes/edges/entrypoints + graph_digest

  3. Load SBOM vulnerabilities list

  4. For each vuln:

    • resolve surface triggers; if missing → enqueue SurfaceBuilder job + fallback mode
    • run reachability BFS
    • for each reachable entrypoint: emit DSSE witness
  5. Persist findings/witnesses

SurfaceBuilder (async worker)

  • triggered by “surface missing” events or nightly preload of top packages
  • computes surface once, stores forever

Authority

  • stores graphs, surfaces, findings, witnesses
  • provides retrieval APIs for UI/CI

12) What to implement first (in the order that produces value fastest)

Week 1–2 scope (realistic, shippable)

  1. Cecil call graph extraction (direct calls)

  2. MVC + Minimal API entrypoints

  3. Reverse BFS reachability with path witnesses

  4. DSSE witness signing + storage

  5. SurfaceBuilder v1:

    • IL hash per method
    • changed methods as sinks
    • triggers via internal reverse BFS
  6. UI: “Show Witness” + “Verify Signature”

Next increment (precision upgrades)

  1. async/await mapping to original methods
  2. RTA + DI registration hints
  3. delegate tracking for Minimal API handlers (if not already)
  4. interface override triggers in surface builder

Later (if you want “attackability”, not just “reachability”)

  1. taint/dataflow for top sink classes (deserialization, path traversal, SQL, command exec)
  2. sanitizer modeling & parameter constraints

13) Common failure modes and how to harden

MethodKey mismatches (surface vs app call)

  • Ensure both are generated from the same normalization rules
  • For generic methods, prefer definition keys (strip instantiation)
  • Store both “exact” and “erased generic” variants if needed

Multi-target frameworks

  • SurfaceBuilder: compute triggers for each TFM, union them
  • App scan: choose TFM closest to build RID, but allow fallback to union

Huge graphs

  • Drop System.* nodes/edges unless:

    • the vuln is in System.* (rare, but handle separately)
  • Deduplicate nodes by MethodKey across assemblies where safe

  • Use CSR arrays + pooled queues

Reflection heavy projects

  • Mark analysis confidence lower
  • Include “unknown edges present” in finding metadata
  • Still produce a witness path up to the reflective callsite

If you want, I can also paste a complete Cecil-based CallGraphBuilder class (nodes+edges+PDB lines), plus the SurfaceBuilder that downloads NuGet packages and generates vuln_surface_triggers end-to-end.