Here’s a compact, practical way to add two high‑leverage capabilities to your scanner: DSSE‑signed path witnesses and Smart‑Diff × Reachability—what they are, why they matter, and exactly how to implement them in Stella Ops without ceremony.
1) DSSE‑signed path witnesses (entrypoint → calls → sink)
What it is (in plain terms): When you flag a CVE as “reachable,” also emit a tiny, human‑readable proof: the exact path from a real entrypoint (e.g., HTTP route, CLI verb, cron) through functions/methods to the vulnerable sink. Wrap that proof in a DSSE envelope and sign it. Anyone can verify the witness later—offline—without rerunning analysis.
Why it matters:
- Turns red flags into auditable evidence (quiet‑by‑design).
- Lets CI/CD, auditors, and customers verify findings independently.
- Enables deterministic replay and provenance chains (ties nicely to in‑toto/SLSA).
Minimal JSON witness (stable, vendor‑neutral):
{
"witness_schema": "stellaops.witness.v1",
"artifact": { "sbom_digest": "sha256:...", "component_purl": "pkg:nuget/Example@1.2.3" },
"vuln": { "id": "CVE-2024-XXXX", "source": "NVD", "range": "≤1.2.3" },
"entrypoint": { "kind": "http", "name": "GET /billing/pay" },
"path": [
{"symbol": "BillingController.Pay()", "file": "BillingController.cs", "line": 42},
{"symbol": "PaymentsService.Authorize()", "file": "PaymentsService.cs", "line": 88},
{"symbol": "LibXYZ.Parser.Parse()", "file": "Parser.cs", "line": 17}
],
"sink": { "symbol": "LibXYZ.Parser.Parse()", "type": "deserialization" },
"evidence": {
"callgraph_digest": "sha256:...",
"build_id": "dotnet:RID:linux-x64:sha256:...",
"analysis_config_digest": "sha256:..."
},
"observed_at": "2025-12-18T00:00:00Z"
}
Wrap in DSSE (payloadType & payload are required)
{
"payloadType": "application/vnd.stellaops.witness+json",
"payload": "base64(JSON_above)",
"signatures": [{ "keyid": "attestor-stellaops-ed25519", "sig": "base64(...)" }]
}
.NET 10 signing/verifying (Ed25519)
using System.Linq;
using System.Text;
using System.Text.Json;
using NSec.Cryptography; // Ed25519 signer of your choice; NSec or BouncyCastle work if your runtime has no built-in Ed25519
var payloadType = "application/vnd.stellaops.witness+json";
var payloadBytes = JsonSerializer.SerializeToUtf8Bytes(witnessJsonObj);
var dsse = new {
    payloadType,
    payload = Convert.ToBase64String(payloadBytes),
    // per the DSSE spec, the signature covers PAE(payloadType, payload), not the raw payload bytes
    signatures = new[] { new { keyid = keyId, sig = Convert.ToBase64String(Sign(Pae(payloadType, payloadBytes), privateKey)) } }
};
// Pre-Authentication Encoding: "DSSEv1" SP len(type) SP type SP len(body) SP body
static byte[] Pae(string type, byte[] body)
{
    var header = $"DSSEv1 {Encoding.UTF8.GetByteCount(type)} {type} {body.Length} ";
    return Encoding.UTF8.GetBytes(header).Concat(body).ToArray();
}
static byte[] Sign(byte[] data, byte[] privateKey)
{
    // keyId/privateKey come from your key management (Attestor)
    var alg = SignatureAlgorithm.Ed25519;
    using var key = Key.Import(alg, privateKey, KeyBlobFormat.RawPrivateKey);
    return alg.Sign(key, data);
}
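The heading above promises verification too; here is a minimal verify-side sketch under the same assumptions (NSec or any Ed25519 verifier, the Pae helper above, a raw Ed25519 public key):
using System.Text.Json;
using NSec.Cryptography;
static bool VerifyDsse(string envelopeJson, byte[] publicKey)
{
    using var doc = JsonDocument.Parse(envelopeJson);
    var payloadType = doc.RootElement.GetProperty("payloadType").GetString()!;
    var payload = Convert.FromBase64String(doc.RootElement.GetProperty("payload").GetString()!);
    var sig = Convert.FromBase64String(doc.RootElement.GetProperty("signatures")[0].GetProperty("sig").GetString()!);
    var alg = SignatureAlgorithm.Ed25519;
    var key = PublicKey.Import(alg, publicKey, KeyBlobFormat.RawPublicKey);
    return alg.Verify(key, Pae(payloadType, payload), sig); // verify over the same PAE as the signing side
}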
Where to emit:
- Scanner.Worker: after reachability confirms reachable=true, emit witness → Attestor signs → Authority stores (Postgres) → optional Rekor‑style mirror.
- Expose /witness/{findingId} for download & independent verification.
2) Smart‑Diff × Reachability (incremental, low‑noise updates)
What it is: On SBOM/VEX/dependency deltas, don’t rescan everything. Update only affected regions of the call graph and recompute reachability just for changed nodes/edges.
Why it matters:
- Order‑of‑magnitude faster incremental scans.
- Fewer flaky diffs; triage stays focused on meaningful risk change.
- Perfect for PR gating: “what changed” → “what became reachable/unreachable.”
Core idea (graph‑reachability):
- Maintain a per‑service call graph G = (V, E) with entrypoint set S.
- On diff: compute the changed nodes/edges ΔV/ΔE.
- Run incremental BFS/DFS from impacted nodes to sinks (forward or backward), reusing memoized results.
- Recompute only frontiers touched by Δ.
Minimal tables (Postgres):
-- Nodes (functions/methods)
CREATE TABLE cg_nodes(
id BIGSERIAL PRIMARY KEY,
service TEXT, symbol TEXT, file TEXT, line INT,
hash TEXT, UNIQUE(service, hash)
);
-- Edges (calls)
CREATE TABLE cg_edges(
src BIGINT REFERENCES cg_nodes(id),
dst BIGINT REFERENCES cg_nodes(id),
kind TEXT, PRIMARY KEY(src, dst)
);
-- Entrypoints & Sinks
CREATE TABLE cg_entrypoints(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY);
CREATE TABLE cg_sinks(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY, sink_type TEXT);
-- Memoized reachability cache
CREATE TABLE cg_reach_cache(
entry_id BIGINT, sink_id BIGINT,
path JSONB, reachable BOOLEAN,
updated_at TIMESTAMPTZ,
PRIMARY KEY(entry_id, sink_id)
);
Incremental algorithm (pseudocode):
Input: ΔSBOM, ΔDeps, ΔCode → ΔNodes, ΔEdges
1) Apply Δ to cg_nodes/cg_edges
2) ImpactSet = neighbors(ΔNodes ∪ endpoints(ΔEdges))
3) For each e∈Entrypoints intersect ancestors(ImpactSet):
Recompute forward search to affected sinks, stop early on unchanged subgraphs
Update cg_reach_cache; if state flips, emit new/updated DSSE witness
.NET 10 reachability sketch (fast & local):
HashSet<int> ImpactSet = ComputeImpact(deltaNodes, deltaEdges);
foreach (var e in Intersect(Entrypoints, Ancestors(ImpactSet)))
{
var res = BoundedReach(e, affectedSinks, graph, cache);
foreach (var r in res.Changed)
{
cache.Upsert(e, r.Sink, r.Path, r.Reachable);
if (r.Reachable) EmitDsseWitness(e, r.Sink, r.Path);
}
}
CI/PR flow:
- Build → SBOM diff → Dependency diff → Call‑graph delta.
- Run incremental reachability.
- If any unreachable→reachable transitions: fail the gate and attach DSSE witnesses.
- If reachable→unreachable: auto-close prior findings (and archive the prior witness).
UX hooks (quick wins)
- In findings list, add a “Show Witness” button → modal renders the signed path (entrypoint→…→sink) + “Verify Signature” one‑click.
- In PR checks, summarize only state flips with tiny links: “+2 reachable (view witness)” / “−1 (now unreachable)”.
Minimal tasks to get this live
- Scanner.Worker: build call‑graph extraction (per language), add incremental graph store, reachability cache.
- Attestor: DSSE signing endpoint + key management (Ed25519 by default; PQC mode later).
- Authority: tables above + witness storage + retrieval API.
- Router/CI plugin: PR annotation with state flips and links to witnesses.
- UI: witness modal + signature verify.
If you want, I can draft the exact Postgres migrations, the C# repositories, and a tiny verifier CLI that checks DSSE signatures and prints the call path.
Below is a concrete, buildable blueprint for an advanced reachability analysis engine inside Stella Ops. I’m going to assume your “Stella Ops” components are roughly:
- Scanner.Worker: runs analyses in CI / on artifacts
- Authority: stores graphs/findings/witnesses
- Attestor: signs DSSE envelopes (Ed25519)
- (optional) SurfaceBuilder: background worker that computes “vuln surfaces” for packages
The key advance is: don’t treat a CVE as “a package”. Treat it as a set of trigger methods (public API) that can reach the vulnerable code inside the dependency—computed by “Smart‑Diff” once, reused everywhere.
0) Define the contract (precision/soundness) up front
If you don’t write this down, you’ll fight false positives/negatives forever.
What Stella Ops will guarantee (first release)
- Whole-program static call graph (app + selected dependency assemblies)
- Context-insensitive (fast), path witness extracted (shortest path)
- Dynamic dispatch handled with CHA/RTA (+ DI hints), with explicit uncertainty flags
- Reflection handled best-effort (constant-string resolution), otherwise “unknown edge”
What it will NOT guarantee (first release)
- Perfect handling of reflection / dynamic / runtime codegen
- Perfect delegate/event resolution across complex flows
- Full taint/dataflow reachability (you can add later)
This is fine. The major value is: “we can show you the call path” and “we can prove the vuln is triggered by calling these library APIs”.
1) The big idea: “Vuln surfaces” (Smart-Diff → triggers)
Problem
CVE feeds typically say “package X version range Y is vulnerable” but rarely say which methods. If you only do package-level reachability, noise is huge.
Solution
For each CVE+package, compute a vulnerability surface:
- Candidate sinks = methods changed between vulnerable and fixed versions (diff at IL level)
- Trigger methods = public/exported methods in the vulnerable version that can reach those changed methods internally
Then your service scan becomes:
“Can any entrypoint reach any trigger method?”
This is both faster and more precise.
2) Data model (Authority / Postgres)
You already had call graph tables; here’s a concrete schema that supports:
- graph snapshots
- incremental updates
- vuln surfaces
- reachability cache
- DSSE witnesses
2.1 Graph tables
CREATE TABLE cg_snapshots (
snapshot_id BIGSERIAL PRIMARY KEY,
service TEXT NOT NULL,
build_id TEXT NOT NULL,
graph_digest TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE(service, build_id)
);
CREATE TABLE cg_nodes (
node_id BIGSERIAL PRIMARY KEY,
snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
method_key TEXT NOT NULL, -- stable key (see below)
asm_name TEXT,
type_name TEXT,
method_name TEXT,
file_path TEXT,
line_start INT,
il_hash TEXT, -- normalized IL hash for diffing
flags INT NOT NULL DEFAULT 0, -- bitflags: has_reflection, compiler_generated, etc.
UNIQUE(snapshot_id, method_key)
);
CREATE TABLE cg_edges (
snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
src_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
dst_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
kind SMALLINT NOT NULL, -- 0=call,1=newobj,2=dispatch,3=delegate,4=reflection_guess,...
PRIMARY KEY(snapshot_id, src_node_id, dst_node_id, kind)
);
CREATE TABLE cg_entrypoints (
snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
kind TEXT NOT NULL, -- http, grpc, cli, job, etc.
name TEXT NOT NULL, -- GET /foo, "Main", etc.
PRIMARY KEY(snapshot_id, node_id, kind, name)
);
2.2 Vuln surface tables (Smart‑Diff artifacts)
CREATE TABLE vuln_surfaces (
surface_id BIGSERIAL PRIMARY KEY,
ecosystem TEXT NOT NULL, -- nuget
package TEXT NOT NULL,
cve_id TEXT NOT NULL,
vuln_version TEXT NOT NULL, -- a representative vulnerable version
fixed_version TEXT NOT NULL,
surface_digest TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE(ecosystem, package, cve_id, vuln_version, fixed_version)
);
CREATE TABLE vuln_surface_sinks (
surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
sink_method_key TEXT NOT NULL,
reason TEXT NOT NULL, -- changed|added|removed|heuristic
PRIMARY KEY(surface_id, sink_method_key)
);
CREATE TABLE vuln_surface_triggers (
surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
trigger_method_key TEXT NOT NULL,
sink_method_key TEXT NOT NULL,
internal_path JSONB, -- optional: library internal witness path
PRIMARY KEY(surface_id, trigger_method_key, sink_method_key)
);
2.3 Reachability cache & witnesses
CREATE TABLE reach_findings (
finding_id BIGSERIAL PRIMARY KEY,
snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
cve_id TEXT NOT NULL,
ecosystem TEXT NOT NULL,
package TEXT NOT NULL,
package_version TEXT NOT NULL,
reachable BOOLEAN NOT NULL,
reachable_entrypoints INT NOT NULL DEFAULT 0,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE(snapshot_id, cve_id, package, package_version)
);
CREATE TABLE reach_witnesses (
witness_id BIGSERIAL PRIMARY KEY,
finding_id BIGINT REFERENCES reach_findings(finding_id) ON DELETE CASCADE,
entry_node_id BIGINT REFERENCES cg_nodes(node_id),
dsse_envelope JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
3) Stable identity: MethodKey + IL hash
3.1 MethodKey (must be stable across builds)
Use a normalized string like:
{AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...)
Examples:
MyApp|BillingController|Pay(System.String)
LibXYZ|LibXYZ.Parser|Parse(System.ReadOnlySpan<System.Byte>)
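A minimal sketch of the MethodKey.From normalizer referenced throughout the rest of this blueprint, written against Mono.Cecil (the tooling chosen in section 4); generic instantiations are stripped per section 13, and assembly-scope/nested-type handling is simplified:
using System.Linq;
using Mono.Cecil;
static class MethodKey
{
    public static string From(MethodReference m)
    {
        if (m is GenericInstanceMethod gim) m = gim.ElementMethod;   // prefer definition keys (strip instantiation)
        var asm = m.DeclaringType.Scope switch                        // name the target assembly, not the referencing one
        {
            AssemblyNameReference anr => anr.Name,
            ModuleDefinition md => md.Assembly.Name.Name,
            var other => other.Name
        };
        var arity = m.GenericParameters.Count > 0 ? $"`{m.GenericParameters.Count}" : "";
        var args = string.Join(",", m.Parameters.Select(p => p.ParameterType.FullName));
        return $"{asm}|{m.DeclaringType.FullName}|{m.Name}{arity}({args})";
    }
}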
3.2 Normalized IL hash (for smart-diff + incremental graph updates)
Raw IL bytes aren’t stable (metadata tokens change). Normalize:
- opcode names
- branch targets by instruction index, not offset
- method operands by resolved MethodKey
- string operands by literal or hashed literal
- type operands by full name
Then hash SHA256(normalized_bytes).
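A sketch of the ILFingerprint helper used in section 4.3, following these normalization rules with Mono.Cecil; operand handling is simplified (switch targets and field operands get a generic fallback):
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using Mono.Cecil;
using Mono.Cecil.Cil;
static string ILFingerprint(MethodDefinition method)
{
    // index instructions so branch targets can be written positionally, not by byte offset
    var index = method.Body.Instructions.Select((ins, i) => (ins, i)).ToDictionary(x => x.ins, x => x.i);
    var sb = new StringBuilder();
    foreach (var ins in method.Body.Instructions)
    {
        sb.Append(ins.OpCode.Name).Append(' ');
        sb.Append(ins.Operand switch
        {
            Instruction target => $"@{index[target]}",   // branch target by instruction index
            MethodReference mr => MethodKey.From(mr),    // method operand by resolved MethodKey
            TypeReference tr => tr.FullName,             // type operand by full name
            string s => s,                               // string literal (or hash it)
            null => string.Empty,
            var other => other.ToString()
        });
        sb.Append('\n');
    }
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString())));
}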
4) Call graph extraction for .NET (concrete, doable)
Tooling choice
Start with Mono.Cecil (MIT license, easy IL traversal). You can later swap to System.Reflection.Metadata for speed.
4.1 Build process (Scanner.Worker)
- dotnet restore (use your locked restore)
- dotnet build -c Release /p:DebugType=portable /p:DebugSymbols=true
- Collect:
  - app assemblies: bin/Release/**/publish/*.dll or build output
  - .pdb files for sequence points (file/line for witnesses)
4.2 Cecil loader
var rp = new ReaderParameters {
ReadSymbols = true,
SymbolReaderProvider = new PortablePdbReaderProvider()
};
var asm = AssemblyDefinition.ReadAssembly(dllPath, rp);
4.3 Node extraction (methods)
Walk all types, including nested:
IEnumerable<TypeDefinition> AllTypes(ModuleDefinition m)
{
var stack = new Stack<TypeDefinition>(m.Types);
while (stack.Count > 0)
{
var t = stack.Pop();
yield return t;
foreach (var nt in t.NestedTypes) stack.Push(nt);
}
}
foreach (var type in AllTypes(asm.MainModule))
foreach (var method in type.Methods)
{
var key = MethodKey.From(method); // your normalizer
var (file, line) = PdbFirstSequencePoint(method);
var ilHash = method.HasBody ? ILFingerprint(method) : null;
// store node (method_key, file, line, il_hash, flags...)
}
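PdbFirstSequencePoint above is a hypothetical helper; a sketch of it over Cecil’s debug information (available because ReadSymbols = true loaded the portable PDB):
using System.Linq;
using Mono.Cecil;
static (string? file, int line) PdbFirstSequencePoint(MethodDefinition method)
{
    var sp = method.DebugInformation?.SequencePoints.FirstOrDefault(p => !p.IsHidden);
    return sp is null ? (null, 0) : (sp.Document?.Url, sp.StartLine);
}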
4.4 Edge extraction (direct calls)
foreach (var method in type.Methods.Where(m => m.HasBody))
{
var srcKey = MethodKey.From(method);
foreach (var ins in method.Body.Instructions)
{
if (ins.Operand is MethodReference mr)
{
if (ins.OpCode.Code is Code.Call or Code.Callvirt or Code.Newobj)
{
var dstKey = MethodKey.From(mr); // important: stable even if not resolved
edges.Add(new Edge(srcKey, dstKey, kind: CallKind.Direct));
}
if (ins.OpCode.Code is Code.Ldftn or Code.Ldvirtftn)
{
// delegate capture (handle later)
}
}
}
}
5) Advanced precision: dynamic dispatch + DI + async/await
If you stop at direct edges only, you’ll miss many real paths.
5.1 Async/await mapping (critical for readable witnesses)
Async methods compile into a state machine MoveNext(). You want edges attributed back to the original method.
In Cecil:
- Check AsyncStateMachineAttribute on the method
- It references a state machine type
- Find that type's MoveNext method
- Map MoveNextKey -> OriginalMethodKey
Then, while extracting edges:
srcKey = MoveNextToOriginal.TryGetValue(srcKey, out var original) ? original : srcKey;
Do the same for iterator state machines.
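A sketch of building that map with Cecil for both async and iterator state machines, reusing AllTypes and MethodKey.From from above; MoveNextToOriginal is the map consumed during edge extraction:
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;
var MoveNextToOriginal = new Dictionary<string, string>();
foreach (var type in AllTypes(asm.MainModule))
foreach (var method in type.Methods)
{
    var attr = method.CustomAttributes.FirstOrDefault(a =>
        a.AttributeType.Name is "AsyncStateMachineAttribute" or "IteratorStateMachineAttribute");
    if (attr?.ConstructorArguments[0].Value is TypeReference stateMachine)
    {
        var moveNext = stateMachine.Resolve()?.Methods.FirstOrDefault(m => m.Name == "MoveNext");
        if (moveNext != null)
            MoveNextToOriginal[MethodKey.From(moveNext)] = MethodKey.From(method);
    }
}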
5.2 Virtual/interface dispatch (CHA/RTA)
You need 2 maps:
- type hierarchy / interface impl map
- override map from “declared method” → “implementation method(s)”
Build override map
// For each method, Cecil exposes method.Overrides for explicit implementations.
overrideMap[MethodKey.From(overrideRef)] = MethodKey.From(methodDef);
CHA: for callvirt to virtual method T.M, add edges to overrides in derived classes
RTA: restrict to derived classes that are actually instantiated.
How to get instantiated types:
- look for newobj instructions and add the created type to InstantiatedTypes
- plus DI registrations (below)
5.3 DI hints (Microsoft.Extensions.DependencyInjection)
You will see calls like:
ServiceCollectionServiceExtensions.AddTransient<TService, TImpl>(...)
In IL these are generic method calls. Detect and record TService -> TImpl as “instantiated”. This massively improves RTA for modern .NET apps.
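A detection sketch that slots into the instruction loop from section 4.4; instantiatedTypes and serviceMap are hypothetical collections feeding RTA:
// calls to AddTransient<TService, TImpl> / AddScoped / AddSingleton show up as generic method calls
if (ins.Operand is GenericInstanceMethod gim
    && gim.DeclaringType.Name == "ServiceCollectionServiceExtensions"
    && gim.Name.StartsWith("Add")
    && gim.GenericArguments.Count == 2)
{
    var service = gim.GenericArguments[0];
    var impl = gim.GenericArguments[1];
    instantiatedTypes.Add(impl.FullName);           // RTA: TImpl is constructible
    serviceMap[service.FullName] = impl.FullName;   // optional: resolve interface dispatch via DI registrations
}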
5.4 Delegates/lambdas (good enough approach)
Implement intraprocedural tracking:
- when you see ldftn SomeMethod then newobj Action::.ctor then stloc.s X, store delegateTargets[local X] += SomeMethod
- when you see ldloc.s X and later callvirt Invoke, add edges to the stored targets
This makes Minimal API entrypoint discovery work too.
5.5 Reflection (best-effort)
Implement only high-signal heuristics:
- typeof(T).GetMethod("Foo") with constant "Foo"
- GetType().GetMethod("Foo") with constant "Foo" (type unknown → mark uncertain)
If resolved, add edge with kind=reflection_guess.
If not, set node flag has_reflection = true and in results show “may be incomplete”.
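A sketch of the constant-string heuristic over the same instruction stream; resolving the receiver (a preceding typeof via ldtoken) is left out for brevity:
// high-signal case: a constant string pushed immediately before Type.GetMethod(...)
var il = method.Body.Instructions;
for (int i = 1; i < il.Count; i++)
{
    if (il[i].Operand is MethodReference mr
        && mr.DeclaringType.FullName == "System.Type"
        && mr.Name == "GetMethod"
        && il[i - 1].OpCode.Code == Code.Ldstr)
    {
        var targetName = (string)il[i - 1].Operand;
        // if the receiver is a constant typeof(T): resolve T.targetName and add an edge with kind=reflection_guess;
        // otherwise set the node flag has_reflection = true
    }
}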
6) Entrypoint detection (concrete detectors)
6.1 MVC controllers
Detect:
- types deriving from Microsoft.AspNetCore.Mvc.ControllerBase
- methods that are:
  - public
  - not [NonAction]
  - annotated with [HttpGet], [HttpPost], [Route], etc.
Extract route template from attributes’ ctor arguments.
Store in cg_entrypoints:
- kind = http
- name = GET /billing/pay (compose verb + template)
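A detector sketch over the Cecil model, reusing AllTypes from section 4.3; attributes are matched by simple name and class-level [Route] prefixes are omitted for brevity:
using System.Linq;
using Mono.Cecil;
static bool IsController(TypeDefinition t)
{
    for (var b = t.BaseType?.Resolve(); b != null; b = b.BaseType?.Resolve())
        if (b.FullName == "Microsoft.AspNetCore.Mvc.ControllerBase") return true;
    return false;
}
foreach (var type in AllTypes(asm.MainModule).Where(IsController))
foreach (var method in type.Methods.Where(m => m.IsPublic && !m.IsConstructor))
{
    if (method.CustomAttributes.Any(a => a.AttributeType.Name == "NonActionAttribute")) continue;
    var http = method.CustomAttributes.FirstOrDefault(a => a.AttributeType.Name.StartsWith("Http"));
    if (http is null) continue;
    var verb = http.AttributeType.Name.Replace("Http", "").Replace("Attribute", "").ToUpperInvariant();
    var template = http.ConstructorArguments.Count > 0 ? http.ConstructorArguments[0].Value as string : "";
    // store entrypoint: kind = "http", name = $"{verb} /{template}"
}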
6.2 Minimal APIs
Scan Program.Main IL:
- find calls to MapGet, MapPost, ...
- extract the route string from the preceding ldstr
- resolve the handler method via delegate tracking (ldftn)
Entry:
- kind = http
- name = GET /foo
6.3 CLI
Find assembly entry point method (asm.EntryPoint) or static Main.
Entry:
- kind = cli
- name = Main
Start here. Add gRPC/jobs later.
7) Smart-Diff SurfaceBuilder (the “advanced” part)
This is what makes your reachability actually meaningful for CVEs.
7.1 SurfaceBuilder inputs
From your vuln ingestion pipeline:
- ecosystem = nuget
- package = LibXYZ
- affected range = <= 1.2.3
- fixed version = 1.2.4
- CVE id
7.2 Choose a vulnerable version to diff
Pick the highest affected version below fixed.
- fixed = 1.2.4
- vulnerable representative = 1.2.3
(If multiple fixed versions exist, build multiple surfaces.)
7.3 Download both packages
Use NuGet.Protocol to download .nupkg, unzip, pick TFMs you care about (often netstandard2.0 is safest). Compute fingerprints for each assembly.
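A download sketch with NuGet.Protocol, using the running LibXYZ example; TFM selection and unzip are as described above:
using NuGet.Common;
using NuGet.Protocol;
using NuGet.Protocol.Core.Types;
using NuGet.Versioning;
var repo = Repository.Factory.GetCoreV3("https://api.nuget.org/v3/index.json");
var byId = await repo.GetResourceAsync<FindPackageByIdResource>();
using var cache = new SourceCacheContext();
foreach (var version in new[] { "1.2.3", "1.2.4" })   // vulnerable representative + fixed
{
    await using var nupkg = File.Create($"LibXYZ.{version}.nupkg");
    await byId.CopyNupkgToStreamAsync("LibXYZ", NuGetVersion.Parse(version), nupkg,
        cache, NullLogger.Instance, CancellationToken.None);
    // a .nupkg is a zip: extract lib/<tfm>/*.dll and fingerprint each assembly
}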
7.4 Compute method fingerprints
For each method:
- MethodKey
- Normalized IL hash
7.5 Diff
ChangedMethods = { k | hashVuln[k] != hashFixed[k] } ∪ added ∪ removed
Store these as vuln_surface_sinks with reason.
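The same diff in C#, assuming hashVuln and hashFixed are Dictionary<string, string> maps from MethodKey to normalized IL hash for each version:
using System.Linq;
var changed = hashVuln.Keys.Intersect(hashFixed.Keys).Where(k => hashVuln[k] != hashFixed[k]);
var removed = hashVuln.Keys.Except(hashFixed.Keys);   // deleted by the fix
var added   = hashFixed.Keys.Except(hashVuln.Keys);   // introduced by the fix
var sinks   = changed.Concat(removed).Concat(added).ToHashSet();   // candidate sinks, stored with their reason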
7.6 Build internal library call graph
Same Cecil extraction, but only for package assemblies. Now compute triggers:
Reverse BFS from sinks:
- Start from all sink method keys
- Walk predecessors
- When you encounter a public/exported method, record it as a trigger
Also store one internal path for each trigger → sink (for witnesses).
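A sketch of that walk, assuming preds maps each MethodKey to its callers inside the package graph and isPublic checks visibility:
var triggers = new HashSet<string>();
var queue = new Queue<string>(sinks);
var seen = new HashSet<string>(sinks);
while (queue.Count > 0)
{
    var m = queue.Dequeue();
    if (isPublic(m)) triggers.Add(m);                 // exported API that can reach a changed method
    if (!preds.TryGetValue(m, out var callers)) continue;
    foreach (var caller in callers)
        if (seen.Add(caller)) queue.Enqueue(caller);  // record parent pointers here to keep an internal witness path
}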
7.7 Add interface/base declarations as triggers
Important: your app might call a library via an interface method signature, not the concrete implementation.
For each trigger implementation method:
- for each
method.Overridesentry, add the overridden method key as an additional trigger
This reduces dependence on perfect dispatch expansion during app scanning.
7.8 Persist the surface
Store:
- sinks set
- triggers set
- internal witness paths (optional but highly valuable)
Now you’ve converted a “version range” CVE into “these specific library APIs are dangerous”.
8) Reachability engine (fast, witness-producing)
8.1 In-memory graph format (CSR)
Don’t BFS off dictionaries; you’ll die on perf.
Build integer indices:
- method_key -> nodeIndex (0..N-1)
- store arrays:
  - predOffsets[N+1]
  - preds[edgeCount]
Construction:
- count predecessors per node
- prefix sum to offsets
- fill preds
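The three steps in code, assuming edges is a list of (src, dst) node-index pairs and n is the node count:
var predOffsets = new int[n + 1];
foreach (var (src, dst) in edges) predOffsets[dst + 1]++;            // count predecessors per node
for (int i = 0; i < n; i++) predOffsets[i + 1] += predOffsets[i];    // prefix sum to offsets
var preds = new int[edges.Count];
var cursor = (int[])predOffsets.Clone();
foreach (var (src, dst) in edges) preds[cursor[dst]++] = src;        // fill preds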
8.2 Reverse BFS from sinks
This computes:
- visited[node] = can reach a sink
- parent[node] = next node toward a sink (for path reconstruction)
public sealed class ReachabilityEngine
{
public ReachabilityResult Compute(
Graph g,
ReadOnlySpan<int> entrypoints,
ReadOnlySpan<int> sinks)
{
var visitedMark = g.VisitMark; // int[] length N (reused across runs)
var parent = g.Parent; // int[] length N (reused)
g.RunId++;
var q = new IntQueue(capacity: g.NodeCount);
var sinkSet = new BitSet(g.NodeCount);
foreach (var s in sinks)
{
sinkSet.Set(s);
visitedMark[s] = g.RunId;
parent[s] = s;
q.Enqueue(s);
}
while (q.TryDequeue(out var v))
{
var start = g.PredOffsets[v];
var end = g.PredOffsets[v + 1];
for (int i = start; i < end; i++)
{
var p = g.Preds[i];
if (visitedMark[p] == g.RunId) continue;
visitedMark[p] = g.RunId;
parent[p] = v;
q.Enqueue(p);
}
}
// Collect reachable entrypoints and paths
var results = new List<EntryWitness>();
foreach (var e in entrypoints)
{
if (visitedMark[e] != g.RunId) continue;
var path = ReconstructPath(e, parent, sinkSet);
results.Add(new EntryWitness(e, path));
}
return new ReachabilityResult(results);
}
private static int[] ReconstructPath(int entry, int[] parent, BitSet sinks)
{
var path = new List<int>(32);
int cur = entry;
path.Add(cur);
// follow parent pointers until a sink
for (int guard = 0; guard < 10_000; guard++)
{
if (sinks.Get(cur)) break;
var nxt = parent[cur];
if (nxt == cur || nxt < 0) break; // safety
cur = nxt;
path.Add(cur);
}
return path.ToArray();
}
}
8.3 Producing the witness
For each node index in the path:
- method_key
- file_path / line_start (if known)
- optional flags (reflection_guess edge, dispatch edge)
Then attach:
- vuln id, package, version
- entrypoint kind/name
- graph digest + config digest
- surface digest
- timestamp
Send JSON to Attestor for DSSE signing, store envelope in Authority.
9) Scaling: don’t do BFS 500 times if you can avoid it
9.1 First-line scaling (usually enough)
- Group vulnerabilities by package/version → surfaces reused
- Only run reachability for vulns where:
  - dependency present AND
  - surface exists OR fallback mode
- Limit witnesses per vuln (top 3)
In practice, with N ≈ 50k nodes and E ≈ 200k edges, a reverse BFS is fast in C# if done with arrays.
9.2 Incremental Smart-Diff × Reachability (your “low noise” killer feature)
Step A: compute graph delta between snapshots
Use il_hash per method to detect changed nodes:
- added / removed / changed nodes
- edges updated only for changed nodes
Step B: decide which vulnerabilities need recompute
Store a cached reverse-reachable set per vuln surface if you want (bitset), OR just do a cheaper heuristic:
Recompute for vulnerability if:
- sink set changed (new surface or version changed), OR
- any changed node is on any previously stored witness path, OR
- entrypoints changed, OR
- impacted nodes touch any trigger node’s predecessors (use a small localized search)
A practical approach:
- store all node IDs that appear in any witness path for that vuln
- if delta touches any of those nodes/edges, recompute
- otherwise reuse cached result
This yields a massive win on PR scans where most code is unchanged.
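A sketch of that check, assuming witnessNodes maps each vuln id to the node keys seen in its stored witness paths:
static bool NeedsRecompute(string vulnId, IReadOnlySet<string> deltaNodes,
    IReadOnlyDictionary<string, HashSet<string>> witnessNodes)
{
    return !witnessNodes.TryGetValue(vulnId, out var pathNodes)   // nothing cached → recompute
        || pathNodes.Overlaps(deltaNodes);                        // the delta touches a stored witness path
}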
Step C: “Impact frontier” recompute (optional)
If you want something more advanced:
- compute ImpactSet = ΔNodes ∪ endpoints(ΔEdges)
- run reverse BFS starting from ImpactSet ∩ ReverseReachSet and update visited marks
This is trickier to implement correctly (dynamic graph), so I’d ship the heuristic first.
10) Practical fallback modes (don’t block shipping)
You won’t have surfaces for every CVE on day 1. Handle this gracefully:
Mode 1: Surface-based reachability (best)
- sink = trigger methods from surface
- result: “reachable” with path
Mode 2: Package API usage (good fallback)
- sink = any method in that package that is called by app
- result: “package reachable” (lower confidence), still provide path to callsite
Mode 3: Dependency present only (SBOM level)
- no call graph needed
- result: “present” only
Your UI can show confidence tiers:
- Confirmed reachable (surface)
- Likely reachable (package API)
- Present only (SBOM)
11) Integration points inside Stella Ops
Scanner.Worker (per build)
- Build/collect assemblies + pdb
- CallGraphBuilder → nodes/edges/entrypoints + graph_digest
- Load SBOM vulnerabilities list
- For each vuln:
  - resolve surface triggers; if missing → enqueue SurfaceBuilder job + fallback mode
  - run reachability BFS
  - for each reachable entrypoint: emit DSSE witness
- Persist findings/witnesses
SurfaceBuilder (async worker)
- triggered by “surface missing” events or nightly preload of top packages
- computes surface once, stores forever
Authority
- stores graphs, surfaces, findings, witnesses
- provides retrieval APIs for UI/CI
12) What to implement first (in the order that produces value fastest)
Week 1–2 scope (realistic, shippable)
- Cecil call graph extraction (direct calls)
- MVC + Minimal API entrypoints
- Reverse BFS reachability with path witnesses
- DSSE witness signing + storage
- SurfaceBuilder v1:
  - IL hash per method
  - changed methods as sinks
  - triggers via internal reverse BFS
- UI: “Show Witness” + “Verify Signature”
Next increment (precision upgrades)
- async/await mapping to original methods
- RTA + DI registration hints
- delegate tracking for Minimal API handlers (if not already)
- interface override triggers in surface builder
Later (if you want “attackability”, not just “reachability”)
- taint/dataflow for top sink classes (deserialization, path traversal, SQL, command exec)
- sanitizer modeling & parameter constraints
13) Common failure modes and how to harden
MethodKey mismatches (surface vs app call)
- Ensure both are generated from the same normalization rules
- For generic methods, prefer definition keys (strip instantiation)
- Store both “exact” and “erased generic” variants if needed
Multi-target frameworks
- SurfaceBuilder: compute triggers for each TFM, union them
- App scan: choose TFM closest to build RID, but allow fallback to union
Huge graphs
- Drop System.* nodes/edges unless the vuln is in System.* (rare, but handle it separately)
- Deduplicate nodes by MethodKey across assemblies where safe
- Use CSR arrays + pooled queues
Reflection heavy projects
- Mark analysis confidence lower
- Include “unknown edges present” in finding metadata
- Still produce a witness path up to the reflective callsite
If you want, I can also paste a complete Cecil-based CallGraphBuilder class (nodes+edges+PDB lines), plus the SurfaceBuilder that downloads NuGet packages and generates vuln_surface_triggers end-to-end.