Here’s a practical blueprint for building a reachability‑first code+binary scanner that fuses static call‑graphs with runtime evidence, and scales to large monorepos/microservices.
1) Static analyzers (per language)
- .NET (Roslyn / IL)
  - Parse solutions with Microsoft.CodeAnalysis.MSBuild, collect symbols, and build the call graph from ISymbol → IInvocationOperation.
  - Handle reflection edges by heuristics (string literals, Type.GetType, DI registrations).
  - IL pass: read assemblies with System.Reflection.Metadata to connect external/library calls.
  - Minimal sample:
using System.Linq;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;

// Register an MSBuild instance before creating the workspace
// (requires the Microsoft.Build.Locator package).
MSBuildLocator.RegisterDefaults();

using var ws = MSBuildWorkspace.Create();
var sln = await ws.OpenSolutionAsync(@"path\to.sln");

foreach (var proj in sln.Projects)
foreach (var doc in proj.Documents)
{
    var model = await doc.GetSemanticModelAsync();
    var root = await doc.GetSyntaxRootAsync();
    if (model is null || root is null) continue; // document has no syntax tree or semantic model

    foreach (var node in root.DescendantNodes().OfType<InvocationExpressionSyntax>())
    {
        if (model.GetSymbolInfo(node).Symbol is IMethodSymbol sym)
        {
            // record edge: caller -> sym.ContainingType.Name + "." + sym.Name
        }
    }
}
- Java (Soot or WALA)
  - Build bytecode call graph (CHA/RTA/points‑to) and export edges.
  - Seed entrypoints from public static void main, Spring Boot controllers, servlet mappings.
- Node/Python
  - Build AST + import graph; resolve exports (module.exports, export default, Python __all__).
  - Track dynamic requires (best‑effort string eval); record web/router handlers as entrypoints.
- Go/Rust
  - Use build graph (Go modules, Cargo metadata) + AST to map main and handler functions.
  - Include linker‑time features/conditions to avoid dead edges.
- Binary‑only (containers, closed libs)
  - Recover function boundaries (Ghidra/rizin), mine strings/imports, detect candidates for entrypoints from container ENTRYPOINT/CMD, service files, and exposed ports.
  - Heuristics: exported symbols, syscall usage, and common framework stubs.
2) Runtime confirmation (evidence)
- Windows/.NET: ETW sampling to “mint” runtime edges (method IDs, stack samples) without heavy overhead.
- Linux/containers: eBPF/USDT or perf sampling to confirm hot paths; record PID→image→build info to link evidence back to SBOM components.
- Rule: static edge exists → mark probable; static+runtime match → mark proven (confidence ↑, prioritize).
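The rule above can be illustrated with a minimal Python sketch. The function name `classify_edges` and the tuple representation of an edge are assumptions made for illustration; the labels "probable" and "proven" come from the rule itself.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]  # (caller, callee); hypothetical in-memory representation


def classify_edges(static_edges: Iterable[Edge],
                   runtime_edges: Iterable[Edge]) -> Dict[Edge, str]:
    """Label every static edge 'probable'; upgrade to 'proven' if runtime saw it too."""
    observed = set(runtime_edges)
    return {edge: ("proven" if edge in observed else "probable")
            for edge in set(static_edges)}


labels = classify_edges(
    static_edges=[("Api.Get", "Repo.Load"), ("Api.Get", "Legacy.Parse")],
    runtime_edges=[("Api.Get", "Repo.Load")],
)
```

Edges seen only at runtime are deliberately not labelled here; they are the "coverage debt" case handled by the drift checks in section 7.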
3) Entrypoint discovery
- Web services: framework routers (ASP.NET Core endpoints, Spring mappings, Express routes, FastAPI decorators).
- Jobs/CLIs: scheduler configs (Cron, systemd timers, k8s CronJobs).
- Events: message consumers (RabbitMQ/Kafka topics), gRPC service maps.
Entrypoints seed reachability: start from entry, traverse call graph, intersect with SBOM → “reachable components + reachable vulns”.
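As a rough sketch of that seeding step (Python, with hypothetical names `reachable_components` and `node_to_purl`), a breadth-first walk from the entrypoints followed by an intersection with the SBOM might look like this:

```python
from collections import deque


def reachable_components(edges, entrypoints, node_to_purl, sbom_purls):
    """Walk the call graph from the entrypoints and return reachable SBOM purls."""
    adjacency = {}
    for caller, callee in edges:
        adjacency.setdefault(caller, []).append(callee)

    seen = set(entrypoints)
    queue = deque(entrypoints)
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    # Keep only purls that both map to a reached node and appear in the SBOM.
    reached_purls = {node_to_purl[n] for n in seen if n in node_to_purl}
    return reached_purls & set(sbom_purls)


result = reachable_components(
    edges=[("main", "parse"), ("parse", "yaml.load"), ("unused", "xml.parse")],
    entrypoints=["main"],
    node_to_purl={"yaml.load": "pkg:pypi/pyyaml@5.3", "xml.parse": "pkg:pypi/lxml@4.6"},
    sbom_purls=["pkg:pypi/pyyaml@5.3", "pkg:pypi/lxml@4.6"],
)
```

The purls in the usage example are illustrative values only.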
4) Scale & storage
- Shard by repo/service; compute graphs independently.
- Compress with SCCs (strongly connected components) to shrink graph size.
- Cap cardinality using hot‑path sampling (keep top‑N edges by observed frequency).
- Cache: content‑addressed graphs keyed by (SBOM hash, compiler flags, env); invalidate on source/SBOM/CFG changes or new VEX/policy.
- Store edges as (caller, callee, kind: static|runtime, weight, build-id) in Postgres; keep Valkey for ephemeral reachability queries.
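One way to derive the content-addressed cache key is sketched below in Python. The canonical JSON layout and the decision to sort compiler flags (that is, to treat flag order as irrelevant) are assumptions, not something the blueprint prescribes.

```python
import hashlib
import json


def graph_cache_key(sbom_hash, compiler_flags, env):
    """Hash (SBOM hash, compiler flags, env) into a stable hex key."""
    payload = json.dumps(
        {
            "sbom": sbom_hash,
            "flags": sorted(compiler_flags),  # assumption: flag order does not matter
            "env": env,
        },
        sort_keys=True,             # canonical key order
        separators=(",", ":"),      # no insignificant whitespace
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = graph_cache_key("sha256:abc", ["-c", "Release"], {"os": "linux", "tfm": "net10.0"})
key_b = graph_cache_key("sha256:abc", ["Release", "-c"], {"tfm": "net10.0", "os": "linux"})
key_c = graph_cache_key("sha256:def", ["-c", "Release"], {"os": "linux", "tfm": "net10.0"})
```

Any change to one of the three inputs produces a different key, which is what makes invalidation on SBOM or configuration changes automatic.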
5) SBOM/VEX linkage
- Normalize package coordinates (purl), map symbols/binaries → SBOM components.
- For each CVE:
  - Reachable? (entrypoint‑anchored traversal hits affected symbol/library)
  - Proven at runtime? (evidence present)
  - Gated by config? (feature flags, platform checks)
- Emit VEX with machine‑explainable reasons (e.g., not reachable, reachable but not loaded, reachable+proven).
6) APIs and outputs (developer‑friendly)
- CLI
  - scan graph --lang dotnet --sln path.sln --out graph.scc.json
  - scan runtime --target pod/myservice --duration 30s --out stacks.json
  - reachability join --graph graph.scc.json --runtime stacks.json --sbom bom.cdx.json --out reach.cdxr.json
- HTTP
  - POST /graph (upload call graph)
  - POST /runtime (upload evidence)
  - POST /reachability → returns ranked, evidence‑linked findings
- Artifacts
  - graph.scc.json (SCC‑compressed call graph)
  - reach.cdxr.json (CycloneDX extension with evidence)
  - vex.json (OpenVEX/CSAF w/ “justifications”)
7) Quality gates & tests
- Golden images: tiny test services where reachable/unreachable CVEs are known.
- Mutation tests: toggle entrypoints, flags, and ensure reachability shifts correctly.
- Drift checks: if runtime sees edges not in static graph → open “coverage debt” issue.
8) Security & perf knobs
- Sampling rate caps (CPU bound), PID/image allowlists, PII‑safe symbol hashing option.
- Offline mode: bundle symbols + evidence into a replayable archive (deterministic re‑evaluation).
If you want, I can generate a starter repo layout (Roslyn worker, Java WALA worker, eBPF sampler, joiner, and a Postgres schema) tailored to your .NET 10 + microservices stack.
Below is a developer-ready product + BA implementation specification for the Reachability-First Scanner described earlier, tailored to StellaOps (.NET 10) and your standing architecture rules (lattice algorithms run in scanner.webservice; Concelier/Excititor provide pruned sources only; Postgres is SoR; Valkey is ephemeral only).
StellaOps Reachability-First Scanner
Developer Implementation Specification (v1)
0) Objective and boundaries
Objective
Reduce vulnerability noise by classifying findings as Unreachable / Possibly Reachable / Reachable (Static) / Proven Reachable (Runtime) using:
- Static call graph (best-effort; language-aware)
- Runtime evidence (sampling, low overhead)
- Entrypoint seeding (framework-aware)
- Join against SBOM component mapping + vulnerability data (from Concelier) + VEX (from Excititor)
Non-goals (v1)
- Perfect points-to analysis for all languages.
- Full decompilation for every binary (support is “best-effort” with confidence).
- Executing or fuzzing workloads.
1) Product behavior: what the user sees
1.1 Reachability statuses (canonical)
These labels must be stable across UI/CLI/API:
- UNREACHABLE: no path from any discovered entrypoint to affected component/symbol.
- POSSIBLY_REACHABLE: graph incomplete / dynamic behavior; heuristics indicate risk.
- REACHABLE_STATIC: a static path exists from at least one entrypoint.
- REACHABLE_PROVEN: runtime evidence confirms code path or library load (stronger than static).
Required explanation fields (always returned)
Every reachability classification must include:
- why[]: list of structured reasons (machine-readable codes + human text)
- evidence[]: references to graph paths and/or runtime samples
- confidence: 0.0–1.0
- scope: component-only or symbol-level (if symbol mapping exists)
1.2 Key UX outputs (pipeline-first)
- CLI output for CI gates: stella scan reachability --format sarif|json
- UI detail panel must show:
  - Entry point(s) → path summary (k shortest paths, default k=3)
  - Whether runtime proved it (samples, timestamps, container/build IDs)
  - Which assumptions/heuristics were used (reflection, DI, dynamic import, etc.)
2) System architecture (StellaOps modules)
2.1 Services and responsibilities
StellaOps.Scanner.WebService (authoritative)
Owns the reachability pipeline and the lattice computation for reachability decisions. Responsibilities:
- Ingest static graphs from language workers
- Ingest runtime evidence (from collectors)
- Normalize symbols → components (SBOM join)
- Compute reachability results, confidence, and explanation artifacts
- Expose query APIs and CI export formats
- Persist everything to Postgres (SoR)
- Use Valkey only as ephemeral accelerator
Language workers (stateless compute)
Examples:
- StellaOps.Scanner.Worker.DotNet
- StellaOps.Scanner.Worker.Java
- StellaOps.Scanner.Worker.Node
- StellaOps.Scanner.Worker.Python
- StellaOps.Scanner.Worker.Go
- StellaOps.Scanner.Worker.Rust
- StellaOps.Scanner.Worker.Binary
Responsibilities:
- Produce CallGraph.v1.json (+ optional Entrypoints.v1.json)
- Provide symbol IDs stable within a scan (see hashing rules)
Runtime collectors (agent/sidecar; optional)
- Windows: ETW/EventPipe sampling for .NET
- Linux: eBPF/perf sampling for native; plus runtime-specific exporters where feasible
Collectors only emit evidence events; they never compute reachability.
Concelier / Excititor integration
- Concelier provides vulnerability facts (CVE ↔ component versions).
- Excititor provides VEX statements.
Neither computes reachability or lattice merges; they provide pruned sources only.
3) Data contracts (hard requirements)
3.1 Stable identifiers
All graph nodes must have:
- nodeId: stable across replays when code is unchanged.
- symbolKey: canonical string (language-specific)
- artifactKey: assembly/jar/module/binary identity (prefer build ID + path + hash)
- Optional: purlCandidates[] (library mapping hints)
DotNet nodeId rule (v1):
- nodeId = SHA256(assemblyMvid + ":" + metadataToken + ":" + genericArity + ":" + signatureShape)
- If token unavailable (source-only), fallback: SHA256(projectPath + ":" + file + ":" + span + ":" + symbolDisplayString)
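A minimal Python sketch of the primary nodeId rule follows. The spec fixes the fields and their order; the concrete encoding used here (lowercase MVID, token as 8 hex digits, UTF-8 bytes, hex digest) is an assumption that static workers and runtime collectors would have to agree on.

```python
import hashlib


def dotnet_node_id(assembly_mvid, metadata_token, generic_arity, signature_shape):
    """SHA256(assemblyMvid:metadataToken:genericArity:signatureShape) as a hex string."""
    raw = ":".join([
        assembly_mvid.lower(),         # assumption: normalise GUID casing
        format(metadata_token, "08x"),  # assumption: token rendered as 8 hex digits
        str(generic_arity),
        signature_shape,
    ])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


node_a = dotnet_node_id("6F9619FF-8B86-D011-B42D-00C04FC964FF", 0x06000012, 0, "(System.String)")
node_b = dotnet_node_id("6f9619ff-8b86-d011-b42d-00c04fc964ff", 0x06000012, 0, "(System.String)")
node_c = dotnet_node_id("6f9619ff-8b86-d011-b42d-00c04fc964ff", 0x06000013, 0, "(System.String)")
```

The MVID and token values are made-up examples. Normalising the inputs before hashing is what lets two independently written producers arrive at the same nodeId.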
3.2 CallGraph.v1.json
Minimum required schema:
{
"schema": "stella.callgraph.v1",
"scanKey": "uuid",
"language": "dotnet|java|node|python|go|rust|binary",
"artifacts": [{ "artifactKey": "…", "kind": "assembly|jar|module|binary", "sha256": "…" }],
"nodes": [{
"nodeId": "…",
"artifactKey": "…",
"symbolKey": "Namespace.Type::Method(…)",
"visibility": "public|internal|private|unknown",
"isEntrypointCandidate": false
}],
"edges": [{
"from": "nodeId",
"to": "nodeId",
"kind": "static|heuristic",
"reason": "direct_call|virtual_call|reflection_string|di_binding|dynamic_import|unknown",
"weight": 1.0
}],
"entrypoints": [{
"nodeId": "…",
"kind": "http|grpc|cli|job|event|unknown",
"route": "/api/orders/{id}",
"framework": "aspnetcore|minimalapi|spring|express|unknown"
}]
}
3.3 RuntimeEvidence.v1.json
{
"schema": "stella.runtimeevidence.v1",
"scanKey": "uuid",
"collectedAt": "2025-12-14T10:00:00Z",
"environment": {
"os": "linux|windows",
"k8s": { "namespace": "…", "pod": "…", "container": "…" },
"imageDigest": "sha256:…",
"buildId": "…"
},
"samples": [{
"timestamp": "…",
"pid": 1234,
"threadId": 77,
"frames": ["nodeId","nodeId","nodeId"],
"sampleWeight": 1.0
}],
"loadedArtifacts": [{
"artifactKey": "…",
"evidence": "loaded_module|mapped_file|jar_loaded"
}]
}
4) Postgres schema (system of record)
4.1 Core tables
You can implement with migrations in StellaOps.Scanner.Persistence (EF Core 9).
scan
- scan_id uuid pk
- created_at timestamptz
- repo_uri text null
- commit_sha text null
- sbom_digest text (hash of SBOM input)
- policy_digest text (hash of reachability policy inputs)
- status text (NEW/RUNNING/DONE/FAILED)
Indexes:
- (commit_sha, sbom_digest) for caching
artifact
- artifact_id uuid pk
- scan_id uuid fk
- artifact_key text unique per scan
- kind text
- sha256 text
- build_id text null
- purl text null
Index:
- (scan_id, artifact_key) unique
cg_node
- scan_id uuid fk
- node_id text (hash string)
- artifact_key text
- symbol_key text
- visibility text
- flags int (bitset: entrypointCandidate, external, generated, etc.)
- PK: (scan_id, node_id)
GIN index:
- symbol_key trigram for search (optional)
cg_edge
- scan_id uuid fk
- from_node_id text
- to_node_id text
- kind smallint (0 static, 1 heuristic, 2 runtime_minted)
- reason smallint
- weight real
- PK: (scan_id, from_node_id, to_node_id, kind, reason)
Indexes:
- (scan_id, from_node_id)
- (scan_id, to_node_id)
entrypoint
- scan_id uuid
- node_id text
- kind text
- framework text
- route text (store '' when there is no route; Postgres primary-key columns cannot be null)
- PK: (scan_id, node_id, kind, framework, route)
runtime_sample
- scan_id uuid
- collected_at timestamptz
- env_hash text (hash of environment identity)
- sample_id bigserial pk
- timestamp timestamptz
- pid int
- thread_id int
- frames text[] (nodeIds)
- weight real
Partition suggestion:
- Partition by scan_id or by month depending on retention.
symbol_component_map
- scan_id uuid
- node_id text
- purl text
- mapping_kind text (exact|heuristic|external)
- confidence real
- PK: (scan_id, node_id, purl)
reachability_component
- scan_id uuid
- purl text
- status smallint (0 unreachable, 1 possible, 2 reachable_static, 3 reachable_proven)
- confidence real
- why jsonb
- evidence jsonb
- PK: (scan_id, purl)
reachability_finding
- scan_id uuid
- cve_id text
- purl text
- status smallint
- confidence real
- why jsonb
- evidence jsonb
- PK: (scan_id, cve_id, purl)
4.2 Valkey usage (ephemeral only)
Allowed:
- Dedup keys for evidence ingest (short TTL)
- Hot query cache: (scan_id, purl) → reachability result
- Rate limits / nonces
Not allowed:
- Authoritative queueing for scan state
- Any “only copy” of results
5) Reachability computation (the actual algorithm)
5.1 Inputs
- Call graph nodes/edges + entrypoints
- Runtime evidence (optional)
- SBOM (CycloneDX/SPDX) with purls
- Concelier vulnerability facts (CVE ↔ purl/version ranges)
- Excititor VEX statements (not affected / affected / under investigation)
5.2 Normalize to a graph suitable for traversal
In scanner.webservice:
- Build adjacency list for cg_edge.kind in (static, heuristic)
- Optionally compress SCCs:
  - Compute SCCs (Tarjan/Kosaraju)
  - Store SCC mapping for explanation paths (must remain explainable)
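The optional SCC compression could be sketched as follows in Python, using an iterative form of Kosaraju's algorithm. The function names are hypothetical, and the returned node-to-component mapping is what would be stored so that compressed paths can still be expanded for explanations.

```python
def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm (iterative). Returns {node: scc_index}."""
    fwd = {n: [] for n in nodes}
    rev = {n: [] for n in nodes}
    for a, b in edges:
        fwd[a].append(b)
        rev[b].append(a)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, visited = [], set()
    for root in nodes:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(fwd[root]))]
        while stack:
            node, successors = stack[-1]
            nxt = next((m for m in successors if m not in visited), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                visited.add(nxt)
                stack.append((nxt, iter(fwd[nxt])))

    # Pass 2: sweep the reversed graph in reverse completion order.
    scc_of, count = {}, 0
    for root in reversed(order):
        if root in scc_of:
            continue
        scc_of[root] = count
        stack = [root]
        while stack:
            node = stack.pop()
            for m in rev[node]:
                if m not in scc_of:
                    scc_of[m] = count
                    stack.append(m)
        count += 1
    return scc_of


def condense(edges, scc_of):
    """Collapse edges onto SCC indices, dropping edges inside a component."""
    return {(scc_of[a], scc_of[b]) for a, b in edges if scc_of[a] != scc_of[b]}


demo_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D")]
scc_map = strongly_connected_components(["A", "B", "C", "D"], demo_edges)
condensed = condense(demo_edges, scc_map)
```

In the example, A and B call each other and collapse into one component, so four nodes shrink to three and the cycle disappears from the traversal graph.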
5.3 Entrypoint seeding rules
Entrypoints come from:
- Worker-reported entrypoints (preferred)
- Framework discovery in worker (ASP.NET maps, Spring mappings, etc.)
- Fallback: Main, exported symbols, container CMD/ENTRYPOINT
If entrypoints are empty, mark all results as POSSIBLY_REACHABLE with reason NO_ENTRYPOINTS_DISCOVERED, unless runtime evidence exists.
5.4 Traversal
For each scan:
- Start from all entrypoints; traverse reachable nodes.
- Track:
  - firstSeenFromEntrypoint[node] (for k-shortest path reconstruction)
  - pathWitness[node] (parent pointers or compressed witness)
Produce:
- reachableNodesStatic set
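A small Python sketch of this traversal, assuming parent pointers are used as the path witness (the spec also allows a compressed witness). The function names `traverse` and `witness_path` are illustrative.

```python
from collections import deque


def traverse(adjacency, entrypoints):
    """Multi-source BFS. Returns (firstSeenFromEntrypoint, pathWitness parent pointers)."""
    first_seen = {}  # node -> entrypoint whose search reached it first
    parent = {}      # node -> predecessor on a shortest path (None for entrypoints)
    queue = deque()
    for ep in entrypoints:
        if ep not in parent:
            parent[ep] = None
            first_seen[ep] = ep
            queue.append(ep)
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                first_seen[nxt] = first_seen[node]
                queue.append(nxt)
    return first_seen, parent


def witness_path(parent, node):
    """Rebuild one entrypoint -> node path by following parent pointers."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


graph = {"GET /orders": ["OrderSvc.List"], "OrderSvc.List": ["Json.Parse"], "Job.Run": ["Json.Parse"]}
first_seen, parent = traverse(graph, ["GET /orders", "Job.Run"])
reachable_nodes_static = set(parent)
```

Here `reachable_nodes_static` is simply the key set of the parent map. Producing k shortest paths (rather than one witness) would need an additional algorithm such as Yen's, which this sketch does not attempt.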
5.5 Join to components (SBOM)
Map reachable nodes to purls using symbol_component_map.
Mapping sources (priority order):
- Exact binary symbol → package metadata (where available)
- Assembly/jar/module to SBOM component (by hash/purl)
- Heuristics: namespace prefixes, import paths, jar manifest, npm package.json, go module path
If a vulnerable purl is in SBOM but has no symbol mapping, component reachability defaults:
- If artifact is loaded at runtime → at least REACHABLE_PROVEN (component level)
- Else if referenced by static dependency graph → POSSIBLY_REACHABLE
- Else → UNREACHABLE (with NO_SYMBOL_MAPPING reason)
5.6 Runtime evidence upgrade (“minting”)
If runtime evidence is present:
- For each sample stack:
  - Mark each frame node as “executed”
  - Mint runtime edges: consecutive frames become cg_edge.kind=runtime_minted (optional table or derived view)
- If any executed node maps to purl affected by CVE:
  - Upgrade status to REACHABLE_PROVEN
- If only loaded artifact exists:
  - Upgrade component status to REACHABLE_PROVEN (component-only), but keep symbol-level as unknown.
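The minting step might look like the Python sketch below. It assumes frames in each sample are ordered outermost caller first, which RuntimeEvidence.v1.json does not specify; if collectors emit leaf-first stacks, the pairing would need to be reversed. The function names are hypothetical.

```python
def mint_runtime_edges(samples):
    """Return (executed node set, {(caller, callee): accumulated sample weight})."""
    executed, minted = set(), {}
    for sample in samples:
        frames = sample["frames"]
        weight = sample.get("sampleWeight", 1.0)
        executed.update(frames)
        # Assumption: frames[i] called frames[i + 1] (outermost caller first).
        for caller, callee in zip(frames, frames[1:]):
            minted[(caller, callee)] = minted.get((caller, callee), 0.0) + weight
    return executed, minted


def upgrade_status(current_status, executed, affected_nodes):
    """Upgrade to REACHABLE_PROVEN when any executed node belongs to the affected component."""
    return "REACHABLE_PROVEN" if executed & set(affected_nodes) else current_status


executed, minted = mint_runtime_edges([
    {"frames": ["a", "b", "c"], "sampleWeight": 1.0},
    {"frames": ["a", "b"], "sampleWeight": 2.0},
])
```

Accumulating sample weights per minted edge also gives the observed frequency needed for hot-path capping and drift reports.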
5.7 Confidence scoring (deterministic)
A simple deterministic scoring function (v1) used everywhere:
- Base:
  - UNREACHABLE → 0.05
  - POSSIBLY_REACHABLE → 0.35
  - REACHABLE_STATIC → 0.70
  - REACHABLE_PROVEN → 0.95
- Modifiers:
  - +0.10 if path uses only static edges (no heuristic)
  - −0.15 if path includes reflection_string|dynamic_import
  - +0.10 if runtime evidence hits a node in affected component
  - −0.10 if entrypoints incomplete (NO_ENTRYPOINTS_DISCOVERED)
Clamp to [0, 1].
All modifiers must be recorded in why[].
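The scoring rules above translate fairly directly into code. In this Python sketch the base values and modifiers are taken from the spec, while the reason codes other than NO_ENTRYPOINTS_DISCOVERED and the rounding to two decimals are assumptions added to keep results stable across platforms.

```python
BASE = {
    "UNREACHABLE": 0.05,
    "POSSIBLY_REACHABLE": 0.35,
    "REACHABLE_STATIC": 0.70,
    "REACHABLE_PROVEN": 0.95,
}


def confidence(status, edge_kinds, edge_reasons, runtime_hit, entrypoints_incomplete):
    """Return (confidence, why[]) for one finding; every applied modifier is recorded."""
    value = BASE[status]
    why = ["BASE_" + status]  # hypothetical reason-code naming
    if edge_kinds and all(kind == "static" for kind in edge_kinds):
        value += 0.10
        why.append("PATH_STATIC_ONLY")
    if any(reason in ("reflection_string", "dynamic_import") for reason in edge_reasons):
        value -= 0.15
        why.append("PATH_USES_DYNAMIC_RESOLUTION")
    if runtime_hit:
        value += 0.10
        why.append("RUNTIME_HIT_IN_COMPONENT")
    if entrypoints_incomplete:
        value -= 0.10
        why.append("NO_ENTRYPOINTS_DISCOVERED")
    clamped = min(1.0, max(0.0, value))
    return round(clamped, 2), why  # assumption: round to 2 decimals for determinism


static_score, static_why = confidence(
    "REACHABLE_STATIC", ["static", "static"], ["direct_call", "direct_call"], False, False)
proven_score, _ = confidence(
    "REACHABLE_PROVEN", ["static"], ["direct_call"], True, False)
weak_score, weak_why = confidence(
    "POSSIBLY_REACHABLE", ["static", "heuristic"], ["direct_call", "reflection_string"], False, True)
```

Passing the path's edge kinds and reasons explicitly keeps the function pure, which makes the ±0.01 golden-corpus assertion in section 9 straightforward to test.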
6) Language worker specs (what each worker must do)
6.1 .NET worker (Roslyn + optional IL)
Goal (v1): produce good-enough call graph + entrypoints for ASP.NET Core and workers.
Required features
- Direct invocation edges: InvocationExpressionSyntax
- Object creation edges: constructors
- Delegate invocation: best-effort; record heuristic edge when target unresolved
- Virtual/interface dispatch:
  - record virtual_call edge to declared method
  - optionally add edges to known overrides within solution (static, conservative)
- Async/await: treat state machine calls as implementation detail; connect logical caller → awaited method
Entrypoint discovery (.NET)
Implement these detectors:
- Program.Main (classic)
- ASP.NET Core:
  - Controllers: [ApiController], route attributes, action methods
  - Minimal APIs: MapGet/MapPost/MapMethods patterns (syntactic + semantic)
  - gRPC: MapGrpcService<T>() and service methods
  - Hosted services: IHostedService, BackgroundService.ExecuteAsync as job entrypoints
- Message consumers (if present): known libs patterns (e.g., MassTransit consumers)
Reflection and DI heuristics
Produce heuristic edges when you see:
- Type.GetType("…"), Assembly.GetType, GetMethod("…"), Invoke
- services.AddTransient<IFoo,Foo>() / AddScoped / AddSingleton
  - Add edge IFoo → Foo constructor as di_binding heuristic
- Activator.CreateInstance, ServiceProvider.GetService patterns
Output guarantees
- Must not crash on partial compilation (missing refs); produce partial graph with why=COMPILATION_PARTIAL
- Provide artifact_key per assembly/project output
6.2 Java / Node / Python / Go / Rust workers
v1 expectations:
- Provide import graph + framework entrypoints + best-effort call edges.
- Always label uncertain resolution as heuristic with a reason code.
6.3 Binary worker
v1 expectations:
- Identify artifacts, exported symbols, imported libs, and candidate entrypoints from container metadata.
- Provide component-level mapping primarily; symbol-level mapping only when confident.
7) APIs (scanner.webservice)
7.1 Ingestion endpoints
- POST /api/scans → creates scan record (returns scanId)
- POST /api/scans/{scanId}/callgraphs → accepts CallGraph.v1.json
- POST /api/scans/{scanId}/runtimeevidence → accepts RuntimeEvidence.v1.json
- POST /api/scans/{scanId}/sbom → accepts CycloneDX/SPDX
- POST /api/scans/{scanId}/compute-reachability → triggers computation (idempotent)
Rules:
- All ingests must be idempotent via contentDigest header (store seen digests in Postgres; Valkey may accelerate dedupe).
- Reject mismatched scanKey / scanId.
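The two ingest rules can be sketched in Python as below. The in-memory dictionary stands in for the Postgres table of seen digests; treating "mismatched" as simple inequality between scanKey and scanId, and verifying the digest against the body, are both assumptions.

```python
import hashlib


class IngestStore:
    """In-memory stand-in for the Postgres record of seen content digests."""

    def __init__(self):
        self._seen = {}

    def ingest(self, scan_id, scan_key, body, content_digest):
        # Assumption: the payload's scanKey must equal the scanId in the URL.
        if scan_key != scan_id:
            raise ValueError("scanKey does not match scanId")
        # Assumption: the server recomputes the digest instead of trusting the header.
        actual = "sha256:" + hashlib.sha256(body).hexdigest()
        if actual != content_digest:
            raise ValueError("contentDigest does not match body")
        key = (scan_id, content_digest)
        if key in self._seen:
            return "duplicate"  # idempotent replay: nothing is stored twice
        self._seen[key] = body
        return "stored"


store = IngestStore()
payload = b'{"schema":"stella.callgraph.v1"}'
digest = "sha256:" + hashlib.sha256(payload).hexdigest()
first = store.ingest("scan-1", "scan-1", payload, digest)
second = store.ingest("scan-1", "scan-1", payload, digest)
```

Keying on (scanId, digest) means the same graph uploaded to two different scans is stored once per scan, not deduplicated globally.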
7.2 Query endpoints
- GET /api/scans/{scanId}/reachability/components?purl=...
- GET /api/scans/{scanId}/reachability/findings?cve=...
- GET /api/scans/{scanId}/reachability/explain?cve=...&purl=...
  - returns why[] + path witness + sample refs
7.3 Export endpoints
- GET /api/scans/{scanId}/exports/sarif
- GET /api/scans/{scanId}/exports/cdxr (CycloneDX reachability extension)
- GET /api/scans/{scanId}/exports/openvex (reachability justifications as VEX annotations)
8) Deterministic replay requirements (must-have)
Every reachability result must be reproducible from:
- SBOM digest
- CallGraph digests (per worker)
- RuntimeEvidence digests (optional)
- Concelier feed snapshot digest
- Excititor VEX snapshot digest
- Policy digest (confidence scoring + gating rules)
Implement ReplayManifest.json:
{
"schema": "stella.replaymanifest.v1",
"scanId": "uuid",
"inputs": {
"sbomDigest": "sha256:…",
"callGraphs": [{"language":"dotnet","digest":"sha256:…"}],
"runtimeEvidence": [{"digest":"sha256:…"}],
"concelierSnapshot": "sha256:…",
"excititorSnapshot": "sha256:…",
"policyDigest": "sha256:…"
}
}
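To check that two scans were computed from identical inputs, the manifest can be reduced to a single digest. The Python sketch below is one possible approach: it canonicalises the inputs (sorted keys, sorted digest lists) and leaves scanId out of the hash. Both choices are assumptions, not requirements of the spec.

```python
import hashlib
import json


def manifest_digest(manifest):
    """Digest of the manifest inputs; independent of scanId and of list ordering."""
    inputs = dict(manifest["inputs"])
    inputs["callGraphs"] = sorted(
        inputs.get("callGraphs", []), key=lambda g: (g["language"], g["digest"]))
    inputs["runtimeEvidence"] = sorted(
        inputs.get("runtimeEvidence", []), key=lambda e: e["digest"])
    canonical = json.dumps(
        {"schema": manifest["schema"], "inputs": inputs},
        sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_manifest(scan_id, call_graphs, policy):
    """Build a sample manifest; all digest values here are placeholders."""
    return {
        "schema": "stella.replaymanifest.v1",
        "scanId": scan_id,
        "inputs": {
            "sbomDigest": "sha256:aaa",
            "callGraphs": call_graphs,
            "runtimeEvidence": [],
            "concelierSnapshot": "sha256:bbb",
            "excititorSnapshot": "sha256:ccc",
            "policyDigest": policy,
        },
    }


dotnet = {"language": "dotnet", "digest": "sha256:111"}
java = {"language": "java", "digest": "sha256:222"}
m1 = manifest_digest(make_manifest("scan-1", [dotnet, java], "sha256:p1"))
m2 = manifest_digest(make_manifest("scan-2", [java, dotnet], "sha256:p1"))
m3 = manifest_digest(make_manifest("scan-1", [dotnet, java], "sha256:p2"))
```

Equal digests indicate that a replay should produce the same reachability results; a differing digest pinpoints that at least one input changed.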
9) Quality gates and acceptance criteria
9.1 Golden corpus (mandatory)
Create /tests/Reachability.Golden/ with:
- Minimal ASP.NET controller app with known reachable endpoint → vulnerable lib call
- Minimal app with vulnerable lib present but never called → unreachable
- Reflection-based activation case → “possible” unless runtime proves
- BackgroundService job case
Acceptance:
- Each golden test asserts:
  - Reachability status
  - At least one why[] reason
  - Deterministic confidence within ±0.01
9.2 Drift detection (mandatory)
If the share of runtime-minted edges that are absent from the static graph exceeds a threshold:
- Emit COVERAGE_DRIFT warning with top missing edges
- Store drift report in Postgres (reachability_drift table or JSONB field)
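A possible shape for the drift check, as a Python sketch. The spec does not define the threshold, so this example assumes it is the weight share of minted edges missing from the static graph; the report field names are likewise hypothetical.

```python
def coverage_drift(static_edges, minted_edges, threshold=0.05, top_n=10):
    """Return a COVERAGE_DRIFT report, or None when drift stays within the threshold.

    minted_edges maps (caller, callee) -> observed sample weight.
    """
    static = set(static_edges)
    missing = {edge: w for edge, w in minted_edges.items() if edge not in static}
    total = sum(minted_edges.values())
    ratio = sum(missing.values()) / total if total else 0.0
    if ratio <= threshold:
        return None
    # Heaviest missing edges first; ties broken by edge name for determinism.
    top = sorted(missing.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    return {"code": "COVERAGE_DRIFT", "missingRatio": ratio, "topMissingEdges": top}


observed = {("a", "b"): 3.0, ("b", "x"): 1.0}
report = coverage_drift([("a", "b")], observed)
quiet = coverage_drift([("a", "b")], observed, threshold=0.5)
```

The returned dictionary is JSON-serialisable once edge tuples are converted to lists, so it could be written to a JSONB column as is.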
9.3 Performance SLOs (v1 targets)
- 1 medium service (100k LOC .NET) static graph: < 2 minutes on CI runner class machine
- Reachability compute: < 30 seconds
- Query GET finding: < 200ms p95 (use Postgres indexes + optional Valkey cache)
10) Implementation plan (developer checklist)
Milestone A — Data plumbing (1)
- Implement Postgres tables + migrations
- Implement ingestion endpoints
- Persist call graphs and runtime evidence
- Implement digesting + idempotency
Done when: you can ingest a call graph + SBOM and query stored nodes/edges.
Milestone B — DotNet worker MVP (2)
- Roslyn graph extraction (direct calls + constructors)
- ASP.NET Core entrypoints (controllers + minimal APIs)
- Output CallGraph.v1.json
Done when: golden corpus “reachable/unreachable” works on static only.
Milestone C — Reachability compute (3)
- Entry point seeding
- SCC compression optional
- Traversal + path witness generation
- Join to SBOM purls
- Generate reachability_component + reachability_finding
Done when: component-level reachability returns stable answers.
Milestone D — Runtime evidence upgrade (4)
- Ingest runtime evidence
- Upgrade statuses to REACHABLE_PROVEN
- Drift report
Done when: runtime samples can flip a “possible” to “proven”.
Milestone E — Exports (5)
- SARIF export
- CycloneDX reachability extension export
- OpenVEX annotations export
Done when: CI pipeline can gate on proven/static reachability.
If you want this converted into repo-facing artifacts (e.g., docs/modules/scanner/reachability-spec.md, EF Core entity classes skeletons, and OpenAPI snippets for the endpoints), tell me the exact module paths you use under /src/ and I’ll emit drop-in files matching your StellaOps naming conventions.
Below is a concrete, implementable architecture + technology stack for StellaOps.Scanner.Worker.DotNet (the .NET static graph worker), written as if you’re handing it directly to an engineer to build.
1) Worker’s purpose in the StellaOps scanner system
StellaOps.Scanner.Worker.DotNet is a stateless “graph producer.” It does not compute reachability, confidence, or any lattice logic (that lives in scanner.webservice). The worker:
- Discovers entrypoints (ASP.NET Core controllers, minimal APIs, gRPC, hosted services, etc.)
- Extracts a static call graph (method → method edges)
- Adds heuristic edges for DI/reflection/dynamic patterns
- Emits CallGraph.v1.json and optionally uploads it to scanner.webservice
Key constraint: node IDs must be compatible with runtime evidence (EventPipe/ETW) mapping. That’s why we build node IDs from (Module MVID + metadata token) whenever possible.
2) Deployment model
2.1 Container image choice
You have two legitimate modes; implement both:
Mode A — “Artifacts-first” (preferred for security)
- Input: already-built assemblies from CI (bin/Release/.../*.dll + associated files)
- Worker does no dotnet build
- Worker performs IL/metadata scanning + optional Roslyn source parsing for entrypoints/heuristics
Mode B — “Build-and-scan” (convenience; higher risk)
- Input: repo checkout with .sln
- Worker runs dotnet restore / dotnet build inside a sandboxed container, then scans outputs
Because .NET build can execute MSBuild tasks, analyzers, and source generators (code execution risk), the product-default should be Mode A in any untrusted scenario.
2.2 Runtime requirements
- Base runtime: .NET 10 (LTS). Microsoft’s support policy lists .NET 10 as LTS with original release Nov 11, 2025 and latest patch 10.0.1 (Dec 9, 2025). (Microsoft)
- If you use Mode B, the image must include .NET 10 SDK (not just runtime). (Microsoft)
2.3 Sandbox controls (Mode B)
If you allow building:
- Run with no outbound network (or allowlist only internal NuGet proxy).
- Read-only root FS; writable temp only.
- Drop Linux capabilities; use seccomp/apparmor defaults.
- Mount repo read-only; write outputs to a dedicated volume.
- Disable telemetry: DOTNET_CLI_TELEMETRY_OPTOUT=1.
3) Core architecture (pipeline)
Implement the worker as a single executable (CLI) with internal pipeline stages:
┌───────────────────────────────────────────────────────────────┐
│ Worker.DotNet CLI │
│ Inputs: --sln / --assemblies / --repo, --scanKey, --out │
└───────────────┬───────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 0: Discovery │
│ - Find solutions/projects or assemblies │
│ - Determine configuration/TFM │
└───────────────┬───────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 1: Build (optional) │
│ - dotnet restore/build OR skip │
│ - Collect output assembly paths │
└───────────────┬───────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 2: Reference Indexer │
│ - Build mapping: (AssemblyName, Version) -> artifactKey │
│ - Compute sha256 per referenced dll │
└───────────────┬───────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 3: IL Call Graph Extractor │
│ - Parse each project assembly │
│ - Create method nodes (nodeId = hash(MVID:token)) │
│ - Parse IL & add static edges (call/callvirt/newobj/ldftn...) │
│ - Emit external nodes for member refs │
└───────────────┬───────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 4: Roslyn Entrypoints + Heuristics │
│ - Controllers/minimal APIs/gRPC/HostedService entrypoints │
│ - DI binding edges (AddTransient/AddScoped/AddSingleton etc.) │
│ - Reflection edges (Type.GetType/GetMethod/Invoke etc.) │
│ - Resolve Roslyn symbols -> nodeIds via symbolKey dictionary │
└───────────────┬───────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 5: Merge + Emit │
│ - Merge nodes/edges/entrypoints │
│ - Output CallGraph.v1.json │
│ - Optional POST to scanner.webservice │
└───────────────────────────────────────────────────────────────┘
Why IL-first? Because you want metadata token + MVID node IDs that correlate naturally with runtime stacks. Deterministic builds make MVID stable for identical compilation inputs. (Microsoft Learn)
4) Technology stack (NuGet + platform APIs)
4.1 Roslyn / MSBuild loading
Use Roslyn MSBuild workspace packages:
- Microsoft.CodeAnalysis.Workspaces.MSBuild (MSBuildWorkspace support) (NuGet)
- Microsoft.CodeAnalysis.CSharp.Workspaces (C# semantic model / operations API)
- Optional: Microsoft.CodeAnalysis meta-package (superset) (NuGet)
- Microsoft.Build.Locator (register MSBuild instances for workspace loading)
Roslyn packages are actively published by RoslynTeam (latest shown as 5.0.0 as of Nov 2025). (NuGet)
4.2 IL + metadata scanning
Prefer BCL APIs (no extra dependencies):
- System.Reflection.Metadata
- System.Reflection.PortableExecutable
- System.Reflection.Emit.OpCodes for IL decoding (operand sizes)
(This lets you implement a compact IL parser without Cecil.)
Optional alternative (faster development, more deps):
- Mono.Cecil (makes IL traversal trivial) (NuGet)
4.3 CLI + logging + JSON
- System.CommandLine (recommended)
- Microsoft.Extensions.Logging (+ Console logger)
- System.Text.Json (source-generated serializers strongly recommended)
4.4 Runtime alignment note
Runtime collectors commonly rely on EventPipe/ETW; the .NET diagnostics client library (Microsoft.Diagnostics.NETCore.Client) is the standard managed API for EventPipe sessions. (Microsoft Learn)
The worker itself doesn’t collect runtime evidence, but the nodeId algorithm must match what runtime collectors can compute (hence MVID+token).
5) Internal module decomposition
Implement these internal components as classes/services. Keep them testable (pure functions where possible).
5.1 WorkerOptions
Holds CLI options:
- ScanKey (uuid)
- RepoRoot, SolutionPath OR AssembliesPath[]
- Configuration (default Release)
- TargetFramework (optional)
- BuildMode = Artifacts | Build
- OutFile
- UploadUrl + ApiKey (optional)
- MaxEdgesPerNode (optional throttle)
- IncludeExternalNodes (bool)
- Concurrency (int)
5.2 BuildOrchestrator (Mode B only)
Responsibilities:
- Run dotnet restore and dotnet build
- Capture output logs and surface them as structured diagnostics
- Return discovered output assemblies (dll paths)
Hard requirements:
- Support --no-restore and --no-build toggles (or equivalent)
- Support ContinuousIntegrationBuild=true to improve determinism when available
- If build fails, still attempt to scan any assemblies that exist, but mark output with why=BUILD_FAILED_PARTIAL.
5.3 MsbuildWorkspaceLoader (Roslyn)
Responsibilities:
- Register MSBuild with MSBuildLocator
- Load .sln via MSBuildWorkspace
- Provide:
  - Solution object
  - Project list (C# only for v1)
  - Compilation(s) when needed (for semantic analysis)
MSBuildWorkspace is the canonical Roslyn path for analyzing MSBuild solutions. (NuGet)
5.4 ReferenceIndexer
Responsibilities:
- Build a map from referenced assemblies to artifactKey
- For each PortableExecutableReference with a file path:
  - compute sha256
  - read assembly identity (name, version)
  - create artifactKey
  - add to:
    - AssemblyIdentity -> artifactKey
    - artifactKey -> sha256/path/version
This index is used by IL extractor to attribute external nodes to correct artifacts.
5.5 IlCallGraphExtractor
Responsibilities:
- For each “root” assembly (project output):
  - open PE
  - get module MVID
  - enumerate MethodDefinition rows
  - create nodes for all methods
  - parse IL bodies and emit edges
IL parsing scope (v1)
You only need to recognize these opcodes as “calls”:
- call
- callvirt
- newobj
- jmp
- ldftn
- ldvirtftn
Node identity
- Internal method nodeId:
  - nodeId = SHA256( MVID + ":" + metadataToken + ":" + arity + ":" + signatureShape )
  - Minimal acceptable: SHA256(MVID + ":" + metadataToken)
This is intentionally compatible with how runtime stacks identify methods (module + token).
External method nodes
If a call operand is a MemberRef/MethodSpec that targets another assembly:
- Create an “external node” with:
  - symbolKey computed from metadata signature
  - artifactKey resolved via ReferenceIndexer (assembly identity match)
  - nodeId = SHA256("ext:" + artifactKey + ":" + symbolKey) (runtime-proof not required)
Set flags |= External.
5.6 RoslynEntrypointExtractor
Responsibilities:
- Produce entrypoints[] records pointing to nodeIds.
Must support (v1)
ASP.NET Core MVC controllers
- Type has [ApiController] or derives from ControllerBase
- Action methods: public instance methods with routing attributes [HttpGet], [HttpPost], [Route], etc.
- Route template:
  - combine controller + action route attributes (best effort)
- entrypoint.kind = http, framework = aspnetcore
Minimal APIs
- Detect invocation of MapGet, MapPost, MapPut, MapDelete, MapMethods
- Extract route string literal when available
- Handler target:
  - lambda => map to generated method? (best effort)
  - method group => resolve to method symbolKey => nodeId
gRPC
- Detect MapGrpcService<T>() (endpoint registration)
- Entry points: service methods on generated base types (best effort)
Background jobs
- Types implementing IHostedService
- BackgroundService.ExecuteAsync override
- entrypoint.kind = job
Mapping Roslyn → nodeId
Do not attempt to compute metadata tokens from Roslyn symbols directly.
Instead:
- Generate the same canonical symbolKey for Roslyn symbols
- Resolve symbolKey -> nodeId using a dictionary built from IL nodes
If not resolvable, emit an entrypoint with a synthetic “unresolved” node:
- nodeId = SHA256("unresolved:" + symbolKey)
- flags |= Unresolved
- why += ENTRYPOINT_SYMBOL_UNRESOLVED
5.7 RoslynHeuristicEdgeExtractor
Responsibilities:
- Add heuristic edges that IL won’t reliably capture.
DI bindings (must-have)
Detect common DI registration patterns:
- services.AddTransient<IFoo, Foo>()
- AddScoped, AddSingleton
Emit heuristic edge:
- from: interface method set? (v1 simplify to type-level constructor edge)
- to: Foo..ctor(...) node
- reason = di_binding
Practical v1 implementation:
- Create edge from a synthetic “DI container” node per assembly to implementation constructors.
- Or create edges from the registration site method to the constructor. (Choose one and keep consistent.)
Reflection (must-have)
Emit heuristic edges with lower confidence:
- Type.GetType("Namespace.Type, Assembly")
- Assembly.Load(...), GetMethod("X"), Invoke
- Activator.CreateInstance(...)
If string literal resolves to a type/method in the solution, create edge:
- from: caller method
- to: target method/ctor
- reason = reflection_string
If not resolvable, record a why=REFLECTION_UNRESOLVED_STRING diagnostic; do not crash.
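A minimal sketch of the string-literal case, assuming the literal has already been extracted from a Type.GetType call. It is Python used as a neutral notation: parse_type_name, reflection_edge, the ctor_index lookup, and the "heuristic" kind value are hypothetical, and generic instantiations (whose names contain bracketed commas) are deliberately not handled.

```python
def parse_type_name(literal):
    """Split "Namespace.Type, Assembly[, Version=...]" into (type_name, assembly or None)."""
    type_part, _, rest = literal.partition(",")
    assembly = rest.split(",")[0].strip() or None
    return type_part.strip(), assembly


def reflection_edge(caller_node, literal, ctor_index, diagnostics):
    """Return a reflection_string edge, or record a diagnostic and return None."""
    type_name, assembly = parse_type_name(literal)
    target = ctor_index.get(type_name)  # type name -> ctor nodeId within the solution
    if target is None:
        # Unresolvable strings are reported, never fatal.
        diagnostics.append({
            "why": "REFLECTION_UNRESOLVED_STRING",
            "caller": caller_node,
            "literal": literal,
        })
        return None
    return {
        "from": caller_node,
        "to": target,
        "kind": "heuristic",
        "reason": "reflection_string",
        "assembly": assembly,
    }
```

Passing the diagnostics list in, rather than raising, keeps the "do not crash" requirement local to this function.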
5.8 GraphMerger
Responsibilities:
- Merge nodes/edges/entrypoints from IL and Roslyn stages
- De-duplicate edges by (from,to,kind,reason)
- Apply optional throttles:
  - cap edges per node
  - drop low-weight heuristics if too many
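The de-duplication key and the per-node cap can be sketched as follows. This is an illustrative Python outline: merge_edges and max_edges_per_node are hypothetical names, and treating kind == "heuristic" as the "low-weight" class is an assumption, since the text does not define edge weights.

```python
from collections import defaultdict


def merge_edges(il_edges, roslyn_edges, max_edges_per_node=None):
    """Merge two edge lists, de-duplicate by (from, to, kind, reason), then throttle."""
    seen = set()
    merged = []
    for edge in list(il_edges) + list(roslyn_edges):
        key = (edge["from"], edge["to"], edge["kind"], edge["reason"])
        if key in seen:
            continue
        seen.add(key)
        merged.append(edge)

    if max_edges_per_node is None:
        return merged

    # Throttle per source node: keep non-heuristic edges first, drop heuristics first.
    by_source = defaultdict(list)
    for edge in merged:
        by_source[edge["from"]].append(edge)

    result = []
    for edges in by_source.values():
        edges.sort(key=lambda e: e["kind"] == "heuristic")  # stable: static edges lead
        result.extend(edges[:max_edges_per_node])
    return result
```

Because the sort is stable and dictionaries preserve insertion order, the output order is deterministic for a given input order, which helps when comparing graphs across runs.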
5.9 CallGraphWriter
Responsibilities:
- Serialize CallGraph.v1.json exactly to spec
- Include:
  - artifacts[] (project outputs + references)
  - nodes[], edges[]
  - entrypoints[]
  - language = "dotnet"
  - scanKey
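One way to keep the serialized file stable across runs is to sort every collection before writing. The Python sketch below shows that idea only: the top-level keys come from the list above, but render_call_graph, the per-record field names used as sort keys (artifactKey, nodeId, from, to, kind, reason), and the choice of sorted, indented JSON are assumptions; the authoritative layout is whatever the CallGraph.v1 schema specifies.

```python
import json


def render_call_graph(scan_key, artifacts, nodes, edges, entrypoints):
    """Render a call graph document as deterministic JSON text."""
    document = {
        "language": "dotnet",
        "scanKey": scan_key,
        "artifacts": sorted(artifacts, key=lambda a: a["artifactKey"]),
        "nodes": sorted(nodes, key=lambda n: n["nodeId"]),
        "edges": sorted(
            edges, key=lambda e: (e["from"], e["to"], e["kind"], e["reason"])
        ),
        "entrypoints": sorted(entrypoints, key=lambda e: e["nodeId"]),
    }
    # sort_keys makes the output independent of dict construction order.
    return json.dumps(document, indent=2, sort_keys=True) + "\n"
```

Writing the returned text to the --out path is then a single file write, and two scans of identical inputs produce byte-identical files that can be diffed or hashed.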
6) Canonical symbolKey format (critical for merges)
Pick one canonical form and use it everywhere.
Recommended v1 symbolKey shape:
{Namespace}.{TypeName}[`Arity][+Nested]::{MethodName}[`Arity]({ParamType1},{ParamType2},...)
Rules:
- Use System.* full names for BCL types
- Use + for nested types (metadata style)
- Use backtick arity for generic type/method definitions
- For arrays: System.String[]
- For byref: System.String&
Implementation detail:
- IL extractor can build this from metadata signatures.
- Roslyn extractor can build this using a controlled SymbolDisplayFormat.
If you get this right, Roslyn → IL mapping becomes reliable.
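To make the recommended shape concrete, here is a small formatter that assembles a symbolKey from already-normalized parts. It is a Python sketch with hypothetical names (build_symbol_key and its parameters); it assumes the caller has already produced metadata-style type names, including backtick arity on each type and [] or & suffixes on parameter types.

```python
def build_symbol_key(namespace, type_chain, method_name, param_types, method_arity=0):
    """Format {Namespace}.{Type}[+Nested]::{Method}[`Arity]({Params}).

    type_chain lists type names from outermost to innermost, each already
    carrying its own backtick arity, e.g. ["Cache`1", "Entry"].
    """
    type_part = "+".join(type_chain)
    qualified = f"{namespace}.{type_part}" if namespace else type_part
    method_part = method_name + (f"`{method_arity}" if method_arity else "")
    return f"{qualified}::{method_part}({','.join(param_types)})"
```

Both extractors would feed the same function shape: the IL side from decoded metadata signatures, the Roslyn side from symbol display parts. Any divergence in how either side names a type then shows up as an unresolved symbolKey rather than a silent mismatch.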
7) CLI surface (what developers will actually run)
Minimum viable commands:
Artifacts-first scan
stella-worker-dotnet scan \
--scanKey 00000000-0000-0000-0000-000000000000 \
--assemblies ./artifacts/bin/Release \
--out ./callgraph.json
Build-and-scan (internal trusted only)
stella-worker-dotnet scan \
--scanKey ... \
--sln ./src/MySolution.sln \
--configuration Release \
--tfm net10.0 \
--buildMode build \
--out ./callgraph.json
Upload to scanner.webservice
stella-worker-dotnet scan \
--scanKey ... \
--assemblies ./artifacts/bin/Release \
--upload https://scanner/api/scans/{scanId}/callgraphs \
--apiKey $STELLA_API_KEY
8) Observability and failure behavior
8.1 Structured diagnostics
Always emit:
- counts: nodes/edges/entrypoints
- build outcome: success/failed/partial
- list of projects scanned/skipped
- unresolved symbol counts (entrypoints + heuristic edges)
8.2 Hard failure vs partial output
- If at least one assembly is scanned successfully, output a graph even if others fail.
- Mark diagnostics in output:
  - add why/notes (if you extend schema) OR log to stderr and let webservice record the warning on ingest.
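The partial-output rule and the diagnostics from 8.1 can be combined in one loop. The following Python sketch is illustrative only: scan_assemblies, the scan_one callback, and the diagnostics field names are hypothetical, and reusing the success/failed/partial vocabulary for the scan outcome is an assumption.

```python
def scan_assemblies(paths, scan_one):
    """Scan each assembly, keep going on failure, and summarize the outcome.

    scan_one(path) must return (nodes, edges) or raise on failure.
    """
    nodes, edges, scanned, skipped = [], [], [], []
    for path in paths:
        try:
            found_nodes, found_edges = scan_one(path)
        except Exception as exc:  # one bad assembly must not abort the whole scan
            skipped.append({"path": path, "error": str(exc)})
            continue
        nodes.extend(found_nodes)
        edges.extend(found_edges)
        scanned.append(path)

    if not scanned:
        outcome = "failed"
    elif skipped:
        outcome = "partial"
    else:
        outcome = "success"

    diagnostics = {
        "outcome": outcome,
        "counts": {"nodes": len(nodes), "edges": len(edges)},
        "scanned": scanned,
        "skipped": skipped,
    }
    return nodes, edges, diagnostics
```

A caller would write the graph whenever the outcome is not "failed" and send the diagnostics record either into the output (if the schema is extended) or to stderr.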
9) Why this architecture works for reachability
- IL extraction provides the most faithful call edges and stable node IDs.
- Roslyn adds what IL can’t: framework entrypoints, DI and reflection heuristics.
- Node IDs based on MVID+token align with deterministic compilation behavior and runtime tooling patterns. Deterministic compilation replaces timestamp/MVID with values derived from compilation inputs, enabling stable identity under stable inputs. (Microsoft Learn)
- Roslyn MSBuildWorkspace is the canonical way to load solutions/projects with correct references and compilation options. (NuGet)
10) Implementation “must-do” checklist for the developer
- Define canonical symbolKey and implement it in:
  - IL extractor
  - Roslyn symbol formatter
- Implement ReferenceIndexer to map assembly identity → artifactKey.
- Implement IL extractor:
  - nodes for method defs
  - edges for call opcodes
- Implement entrypoint detectors (controllers + minimal APIs + hosted service).
- Implement DI + reflection heuristic edges.
- Merge and output CallGraph.v1.json matching schema.
- Add golden tests (small ASP.NET apps) verifying:
  - entrypoint detection
  - at least one static path exists
  - DI/reflection edges appear with correct reason codes
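The "at least one static path exists" golden check reduces to a breadth-first search over the emitted edges. The Python sketch below assumes edges are records with "from" and "to" nodeIds, as in the earlier sections; has_path is a hypothetical helper name, and a real golden test would load the nodes and edges from the generated CallGraph.v1.json.

```python
from collections import deque


def has_path(edges, start, target):
    """Return True if target is reachable from start by following edges."""
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge["from"], []).append(edge["to"])

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:  # guard against cycles (recursion in the app)
                seen.add(neighbor)
                queue.append(neighbor)
    return False
```

In a golden test, start would be the nodeId of a detected controller action and target the nodeId of a known library method that the sample app calls, so the assertion exercises entrypoint detection, IL edges, and node ID stability together.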
If you want, I can also provide:
- a concrete folder layout (/src/StellaOps.Scanner.Worker.DotNet/...) with project files,
- the exact SymbolDisplayFormat you should use for symbolKey stability,
- and an IL opcode decoder snippet that only records call-like opcodes (fast, minimal).