Here’s a practical blueprint for building a **reachability‑first code+binary scanner** that fuses static call‑graphs with runtime evidence, and scales to large monorepos/microservices.

---

# 1) Static analyzers (per language)

* **.NET (Roslyn / IL)**
  * Parse solutions with `Microsoft.CodeAnalysis.MSBuild`, collect symbols, build the call graph from `ISymbol` → `IInvocationOperation`.
  * Handle reflection edges with heuristics (string literals, `Type.GetType`, DI registrations).
  * IL pass: read assemblies with `System.Reflection.Metadata` to connect external/library calls.
  * Minimal sample:

```csharp
using System.Linq;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;

MSBuildLocator.RegisterDefaults(); // must run before any MSBuildWorkspace call

using var ws = MSBuildWorkspace.Create();
var sln = await ws.OpenSolutionAsync(@"path\to.sln");

foreach (var proj in sln.Projects)
foreach (var doc in proj.Documents)
{
    var model = await doc.GetSemanticModelAsync();
    var root = await doc.GetSyntaxRootAsync();
    if (model is null || root is null) continue;

    foreach (var node in root.DescendantNodes().OfType<InvocationExpressionSyntax>())
    {
        if (model.GetSymbolInfo(node).Symbol is IMethodSymbol sym)
        {
            // record edge: caller -> sym.ContainingType.Name + "." + sym.Name
        }
    }
}
```

* **Java (Soot or WALA)**
  * Build bytecode call graph (CHA/RTA/points‑to) and export edges.
  * Seed entrypoints from `public static void main`, Spring Boot controllers, servlet mappings.
* **Node/Python**
  * Build AST + import graph; resolve exports (`module.exports`, `export default`, Python `__all__`).
  * Track dynamic requires (best‑effort string eval); record web/router handlers as entrypoints.
* **Go/Rust**
  * Use build graph (Go modules, Cargo metadata) + AST to map `main` and handler functions.
  * Include linker‑time features/conditions to avoid dead edges.
* **Binary‑only (containers, closed libs)**
  * Recover function boundaries (Ghidra/rizin), mine strings/imports, detect candidate entrypoints from container `ENTRYPOINT/CMD`, service files, and exposed ports.
  * Heuristics: exported symbols, syscall usage, and common framework stubs.

---

# 2) Runtime confirmation (evidence)

* **Windows/.NET:** ETW sampling to “mint” runtime edges (method IDs, stack samples) without heavy overhead.
* **Linux/containers:** eBPF/USDT or perf sampling to confirm hot paths; record PID→image→build info to link evidence back to SBOM components.
* **Rule:** static edge exists → mark **probable**; static+runtime match → mark **proven** (confidence ↑, prioritize).

---

# 3) Entrypoint discovery

* **Web services:** framework routers (ASP.NET Core endpoints, Spring mappings, Express routes, FastAPI decorators).
* **Jobs/CLIs:** scheduler configs (cron, systemd timers, k8s CronJobs).
* **Events:** message consumers (RabbitMQ/Kafka topics), gRPC service maps.

Entrypoints seed reachability: start from each entry, traverse the call graph, intersect with the SBOM → “reachable components + reachable vulns”.

---

# 4) Scale & storage

* **Shard** by repo/service; compute graphs independently.
* **Compress** with SCCs (strongly connected components) to shrink graph size.
* **Cap cardinality** using hot‑path sampling (keep top‑N edges by observed frequency).
* **Cache**: content‑addressed graphs keyed by `(SBOM hash, compiler flags, env)`; invalidate on source/SBOM/CFG changes or new VEX/policy.
* **Store** edges as `(caller, callee, kind: static|runtime, weight, build-id)` in Postgres; keep Valkey for ephemeral reachability queries.

---

# 5) SBOM/VEX linkage

* Normalize package coordinates (purl); map symbols/binaries → SBOM components.
* For each CVE:
  * **Reachable?** (entrypoint‑anchored traversal hits affected symbol/library)
  * **Proven at runtime?** (evidence present)
  * **Gated by config?** (feature flags, platform checks)
* Emit VEX with machine‑explainable reasons (e.g., *not reachable*, *reachable but not loaded*, *reachable+proven*).
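The SCC compression step from §4 is small enough to sketch directly. The following Python sketch (illustrative only, not the product implementation) condenses a call graph with Tarjan's algorithm, so mutually recursive methods collapse to a single node before reachability traversal; edges arrive as `(caller, callee)` pairs:

```python
from collections import defaultdict

def tarjan_scc(edges):
    """Condense a directed call graph: map each node to an SCC id and
    return the edge set between distinct SCCs (intra-SCC edges collapse away)."""
    graph, nodes = defaultdict(list), set()
    for a, b in edges:
        graph[a].append(b)
        nodes.update((a, b))

    index, low, on_stack, stack, scc_of = {}, {}, set(), [], {}
    next_scc = 0

    def connect(v):
        nonlocal next_scc
        index[v] = low[v] = len(index)
        stack.append(v); on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                connect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:            # v roots an SCC: pop its members
            while True:
                w = stack.pop(); on_stack.discard(w)
                scc_of[w] = next_scc
                if w == v:
                    break
            next_scc += 1

    for v in nodes:
        if v not in index:
            connect(v)

    condensed = {(scc_of[a], scc_of[b]) for a, b in edges if scc_of[a] != scc_of[b]}
    return scc_of, condensed
```

A cycle `a ↔ b` plus a caller and a callee condenses to three nodes and two edges, which is the size reduction the sharding/compression bullet relies on.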
---

# 6) APIs and outputs (developer‑friendly)

* **CLI**
  * `scan graph --lang dotnet --sln path.sln --out graph.scc.json`
  * `scan runtime --target pod/myservice --duration 30s --out stacks.json`
  * `reachability join --graph graph.scc.json --runtime stacks.json --sbom bom.cdx.json --out reach.cdxr.json`
* **HTTP**
  * `POST /graph` (upload call graph)
  * `POST /runtime` (upload evidence)
  * `POST /reachability` → returns ranked, evidence‑linked findings
* **Artifacts**
  * `graph.scc.json` (SCC‑compressed call graph)
  * `reach.cdxr.json` (CycloneDX extension with evidence)
  * `vex.json` (OpenVEX/CSAF w/ “justifications”)

---

# 7) Quality gates & tests

* **Golden images**: tiny test services where reachable/unreachable CVEs are known.
* **Mutation tests**: toggle entrypoints and flags, and ensure reachability shifts correctly.
* **Drift checks**: if runtime sees edges not in the static graph → open a “coverage debt” issue.

---

# 8) Security & perf knobs

* Sampling rate caps (CPU bound), PID/image allowlists, PII‑safe symbol hashing option.
* Offline mode: bundle symbols + evidence into a replayable archive (deterministic re‑evaluation).

---

If you want, I can generate a **starter repo layout** (Roslyn worker, Java WALA worker, eBPF sampler, joiner, and a Postgres schema) tailored to your .NET 10 + microservices stack.

Below is a developer-ready **product + BA implementation specification** for the **Reachability-First Scanner** described earlier, tailored to **StellaOps (.NET 10)** and your standing architecture rules (**lattice algorithms run in `scanner.webservice`; Concelier/Excititor preserve prune source; Postgres is SoR; Valkey is ephemeral only**).

---

# StellaOps Reachability-First Scanner

## Developer Implementation Specification (v1)

## 0) Objective and boundaries

### Objective

Reduce vulnerability noise by classifying findings as **Unreachable / Possibly Reachable / Reachable (Static) / Proven Reachable (Runtime)** using:
1. **Static call graph** (best-effort; language-aware)
2. **Runtime evidence** (sampling, low overhead)
3. **Entrypoint seeding** (framework-aware)
4. **Join** against SBOM component mapping + vulnerability data (from Concelier) + VEX (from Excititor)

### Non-goals (v1)

* Perfect points-to analysis for all languages.
* Full decompilation for every binary (support is “best-effort” with confidence).
* Executing or fuzzing workloads.

---

# 1) Product behavior: what the user sees

## 1.1 Reachability statuses (canonical)

These labels must be stable across UI/CLI/API:

* **UNREACHABLE**: no path from any discovered entrypoint to the affected component/symbol.
* **POSSIBLY_REACHABLE**: graph incomplete / dynamic behavior; heuristics indicate risk.
* **REACHABLE_STATIC**: a static path exists from at least one entrypoint.
* **REACHABLE_PROVEN**: runtime evidence confirms the code path or library load (stronger than static).

### Required explanation fields (always returned)

Every reachability classification must include:

* `why[]`: list of structured reasons (machine-readable codes + human text)
* `evidence[]`: references to graph paths and/or runtime samples
* `confidence`: 0.0–1.0
* `scope`: component-only or symbol-level (if symbol mapping exists)

## 1.2 Key UX outputs (pipeline-first)

* CLI output for CI gates: `stella scan reachability --format sarif|json`
* UI detail panel must show:
  * Entry point(s) → path summary (k shortest paths, default k=3)
  * Whether runtime proved it (samples, timestamps, container/build IDs)
  * Which assumptions/heuristics were used (reflection, DI, dynamic import, etc.)
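Because the four canonical statuses form a total order from weakest to strongest, merging claims from multiple sources (static graph, runtime evidence, heuristics) reduces to a lattice join: take the strongest claim. A minimal Python sketch of that ordering, using the status names above:

```python
from enum import IntEnum

class Reachability(IntEnum):
    """The four canonical statuses, ordered weakest to strongest."""
    UNREACHABLE = 0
    POSSIBLY_REACHABLE = 1
    REACHABLE_STATIC = 2
    REACHABLE_PROVEN = 3

def join(*claims: Reachability) -> Reachability:
    # Lattice join on a total order: the strongest evidence wins.
    # No claims at all collapses to the bottom element.
    return max(claims, default=Reachability.UNREACHABLE)
```

So a finding with a static path plus confirming runtime samples joins to `REACHABLE_PROVEN`, matching the “static+runtime → proven” upgrade rule used throughout this spec.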
---

# 2) System architecture (StellaOps modules)

## 2.1 Services and responsibilities

### `StellaOps.Scanner.WebService` (authoritative)

**Owns the reachability pipeline and the lattice computation for reachability decisions.**

Responsibilities:

* Ingest static graphs from language workers
* Ingest runtime evidence (from collectors)
* Normalize symbols → components (SBOM join)
* Compute reachability results, confidence, and explanation artifacts
* Expose query APIs and CI export formats
* Persist everything to Postgres (SoR)
* Use Valkey only as an ephemeral accelerator

### Language workers (stateless compute)

Examples:

* `StellaOps.Scanner.Worker.DotNet`
* `StellaOps.Scanner.Worker.Java`
* `StellaOps.Scanner.Worker.Node`
* `StellaOps.Scanner.Worker.Python`
* `StellaOps.Scanner.Worker.Go`
* `StellaOps.Scanner.Worker.Rust`
* `StellaOps.Scanner.Worker.Binary`

Responsibilities:

* Produce `CallGraph.v1.json` (+ optional `Entrypoints.v1.json`)
* Provide symbol IDs stable within a scan (see hashing rules)

### Runtime collectors (agent/sidecar; optional)

* Windows: ETW/EventPipe sampling for .NET
* Linux: eBPF/perf sampling for native; plus runtime-specific exporters where feasible

Collectors only emit **evidence events**; they never compute reachability.

### Concelier / Excititor integration

* Concelier provides vulnerability facts (CVE ↔ component versions).
* Excititor provides VEX statements.

**Neither computes reachability or lattice merges**; they provide **pruned sources** only.

---

# 3) Data contracts (hard requirements)

## 3.1 Stable identifiers

All graph nodes must have:

* `nodeId`: stable across replays when code is unchanged.
* `symbolKey`: canonical string (language-specific)
* `artifactKey`: assembly/jar/module/binary identity (prefer build ID + path + hash)
* Optional: `purlCandidates[]` (library mapping hints)

**DotNet nodeId rule (v1):**

`nodeId = SHA256(assemblyMvid + ":" + metadataToken + ":" + genericArity + ":" + signatureShape)`

* If the token is unavailable (source-only), fallback:
  `SHA256(projectPath + ":" + file + ":" + span + ":" + symbolDisplayString)`

## 3.2 CallGraph.v1.json

Minimum required schema:

```json
{
  "schema": "stella.callgraph.v1",
  "scanKey": "uuid",
  "language": "dotnet|java|node|python|go|rust|binary",
  "artifacts": [{
    "artifactKey": "…",
    "kind": "assembly|jar|module|binary",
    "sha256": "…"
  }],
  "nodes": [{
    "nodeId": "…",
    "artifactKey": "…",
    "symbolKey": "Namespace.Type::Method(…)",
    "visibility": "public|internal|private|unknown",
    "isEntrypointCandidate": false
  }],
  "edges": [{
    "from": "nodeId",
    "to": "nodeId",
    "kind": "static|heuristic",
    "reason": "direct_call|virtual_call|reflection_string|di_binding|dynamic_import|unknown",
    "weight": 1.0
  }],
  "entrypoints": [{
    "nodeId": "…",
    "kind": "http|grpc|cli|job|event|unknown",
    "route": "/api/orders/{id}",
    "framework": "aspnetcore|minimalapi|spring|express|unknown"
  }]
}
```

## 3.3 RuntimeEvidence.v1.json

```json
{
  "schema": "stella.runtimeevidence.v1",
  "scanKey": "uuid",
  "collectedAt": "2025-12-14T10:00:00Z",
  "environment": {
    "os": "linux|windows",
    "k8s": { "namespace": "…", "pod": "…", "container": "…" },
    "imageDigest": "sha256:…",
    "buildId": "…"
  },
  "samples": [{
    "timestamp": "…",
    "pid": 1234,
    "threadId": 77,
    "frames": ["nodeId", "nodeId", "nodeId"],
    "sampleWeight": 1.0
  }],
  "loadedArtifacts": [{
    "artifactKey": "…",
    "evidence": "loaded_module|mapped_file|jar_loaded"
  }]
}
```

---

# 4) Postgres schema (system of record)

## 4.1 Core tables

You can implement these with migrations in `StellaOps.Scanner.Persistence` (EF Core 9).
### `scan`

* `scan_id uuid pk`
* `created_at timestamptz`
* `repo_uri text null`
* `commit_sha text null`
* `sbom_digest text` (hash of SBOM input)
* `policy_digest text` (hash of reachability policy inputs)
* `status text` (NEW/RUNNING/DONE/FAILED)

Indexes:

* `(commit_sha, sbom_digest)` for caching

### `artifact`

* `artifact_id uuid pk`
* `scan_id uuid fk`
* `artifact_key text` unique per scan
* `kind text`
* `sha256 text`
* `build_id text null`
* `purl text null`

Index:

* `(scan_id, artifact_key)` unique

### `cg_node`

* `scan_id uuid fk`
* `node_id text` (hash string)
* `artifact_key text`
* `symbol_key text`
* `visibility text`
* `flags int` (bitset: entrypointCandidate, external, generated, etc.)

PK: `(scan_id, node_id)`

GIN index:

* `symbol_key` trigram for search (optional)

### `cg_edge`

* `scan_id uuid fk`
* `from_node_id text`
* `to_node_id text`
* `kind smallint` (0 static, 1 heuristic, 2 runtime_minted)
* `reason smallint`
* `weight real`

PK: `(scan_id, from_node_id, to_node_id, kind, reason)`

Indexes:

* `(scan_id, from_node_id)`
* `(scan_id, to_node_id)`

### `entrypoint`

* `scan_id uuid`
* `node_id text`
* `kind text`
* `framework text`
* `route text null`

PK: `(scan_id, node_id, kind, framework, route)`

### `runtime_sample`

* `scan_id uuid`
* `collected_at timestamptz`
* `env_hash text` (hash of environment identity)
* `sample_id bigserial pk`
* `timestamp timestamptz`
* `pid int`
* `thread_id int`
* `frames text[]` (nodeIds)
* `weight real`

Partition suggestion:

* Partition by `scan_id` or by month, depending on retention.
### `symbol_component_map`

* `scan_id uuid`
* `node_id text`
* `purl text`
* `mapping_kind text` (exact|heuristic|external)
* `confidence real`

PK: `(scan_id, node_id, purl)`

### `reachability_component`

* `scan_id uuid`
* `purl text`
* `status smallint` (0 unreachable, 1 possible, 2 reachable_static, 3 reachable_proven)
* `confidence real`
* `why jsonb`
* `evidence jsonb`

PK: `(scan_id, purl)`

### `reachability_finding`

* `scan_id uuid`
* `cve_id text`
* `purl text`
* `status smallint`
* `confidence real`
* `why jsonb`
* `evidence jsonb`

PK: `(scan_id, cve_id, purl)`

## 4.2 Valkey usage (ephemeral only)

Allowed:

* Dedup keys for evidence ingest (short TTL)
* Hot query cache: `(scan_id, purl)` → reachability result
* Rate limits / nonces

Not allowed:

* Authoritative queueing for scan state
* Any “only copy” of results

---

# 5) Reachability computation (the actual algorithm)

## 5.1 Inputs

* Call graph nodes/edges + entrypoints
* Runtime evidence (optional)
* SBOM (CycloneDX/SPDX) with purls
* Concelier vulnerability facts (CVE ↔ purl/version ranges)
* Excititor VEX statements (not affected / affected / under investigation)

## 5.2 Normalize to a graph suitable for traversal

In `scanner.webservice`:

1. Build an adjacency list for `cg_edge.kind in (static, heuristic)`
2. Optionally compress SCCs:
   * Compute SCCs (Tarjan/Kosaraju)
   * Store the SCC mapping for explanation paths (must remain explainable)

## 5.3 Entrypoint seeding rules

Entrypoints come from:

* Worker-reported entrypoints (preferred)
* Framework discovery in the worker (ASP.NET maps, Spring mappings, etc.)
* Fallback: `Main`, exported symbols, container CMD/ENTRYPOINT

**If entrypoints are empty**, mark all results as `POSSIBLY_REACHABLE` with reason `NO_ENTRYPOINTS_DISCOVERED`, unless runtime evidence exists.

## 5.4 Traversal

For each scan:

* Start from all entrypoints; traverse reachable nodes.
* Track:
  * `firstSeenFromEntrypoint[node]` (for k-shortest path reconstruction)
  * `pathWitness[node]` (parent pointers or compressed witness)

Produce:

* `reachableNodesStatic` set

## 5.5 Join to components (SBOM)

Map reachable nodes to purls using `symbol_component_map`.

Mapping sources (priority order):

1. Exact binary symbol → package metadata (where available)
2. Assembly/jar/module to SBOM component (by hash/purl)
3. Heuristics: namespace prefixes, import paths, jar manifest, npm package.json, go module path

If a vulnerable purl is in the SBOM but has **no symbol mapping**, component reachability defaults:

* If the artifact is **loaded at runtime** → at least `REACHABLE_PROVEN` (component level)
* Else if referenced by the static dependency graph → `POSSIBLY_REACHABLE`
* Else → `UNREACHABLE` (with `NO_SYMBOL_MAPPING` reason)

## 5.6 Runtime evidence upgrade (“minting”)

If runtime evidence is present:

* For each sample stack:
  * Mark each frame node as “executed”
  * Mint runtime edges: consecutive frames become `cg_edge.kind=runtime_minted` (optional table or derived view)
* If any executed node maps to a purl affected by a CVE:
  * Upgrade status to `REACHABLE_PROVEN`
* If only a loaded artifact exists:
  * Upgrade component status to `REACHABLE_PROVEN` (component-only), but keep symbol-level as unknown.

## 5.7 Confidence scoring (deterministic)

A simple deterministic scoring function (v1) used everywhere:

* Base:
  * `UNREACHABLE` → 0.05
  * `POSSIBLY_REACHABLE` → 0.35
  * `REACHABLE_STATIC` → 0.70
  * `REACHABLE_PROVEN` → 0.95
* Modifiers:
  * +0.10 if the path uses only `static` edges (no heuristic)
  * −0.15 if the path includes `reflection_string|dynamic_import`
  * +0.10 if runtime evidence hits a node in the affected component
  * −0.10 if entrypoints are incomplete (`NO_ENTRYPOINTS_DISCOVERED`)

Clamp to `[0, 1]`. All modifiers must be recorded in `why[]`.
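The scoring rules above can be sketched as one pure function. This Python sketch is illustrative: the base values and modifiers come straight from the spec, while the `why[]` code strings (other than `NO_ENTRYPOINTS_DISCOVERED`) are placeholder names, not the canonical reason codes:

```python
BASE = {
    "UNREACHABLE": 0.05,
    "POSSIBLY_REACHABLE": 0.35,
    "REACHABLE_STATIC": 0.70,
    "REACHABLE_PROVEN": 0.95,
}

def score(status, *, static_only_path=False, dynamic_edge_on_path=False,
          runtime_hit_in_component=False, entrypoints_incomplete=False):
    """Deterministic v1 scoring: base by status, then the four modifiers,
    clamped to [0, 1]. Every applied modifier is echoed into why[]."""
    conf, why = BASE[status], [f"BASE_{status}"]
    if static_only_path:
        conf += 0.10; why.append("PATH_STATIC_ONLY")
    if dynamic_edge_on_path:            # reflection_string | dynamic_import on path
        conf -= 0.15; why.append("PATH_HAS_DYNAMIC_EDGE")
    if runtime_hit_in_component:
        conf += 0.10; why.append("RUNTIME_HIT_AFFECTED_COMPONENT")
    if entrypoints_incomplete:
        conf -= 0.10; why.append("NO_ENTRYPOINTS_DISCOVERED")
    return max(0.0, min(1.0, conf)), why
```

Keeping this a pure function of its inputs is what makes the ±0.01 determinism assertion in the golden corpus (§9.1) cheap to enforce.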
---

# 6) Language worker specs (what each worker must do)

## 6.1 .NET worker (Roslyn + optional IL)

**Goal (v1):** produce a good-enough call graph + entrypoints for ASP.NET Core and workers.

### Required features

* Direct invocation edges: `InvocationExpressionSyntax`
* Object creation edges: constructors
* Delegate invocation: best-effort; record a heuristic edge when the target is unresolved
* Virtual/interface dispatch:
  * record a `virtual_call` edge to the declared method
  * optionally add edges to known overrides within the solution (static, conservative)
* Async/await: treat state machine calls as an implementation detail; connect the logical caller → awaited method

### Entrypoint discovery (.NET)

Implement these detectors:

* `Program.Main` (classic)
* ASP.NET Core:
  * Controllers: `[ApiController]`, route attributes, action methods
  * Minimal APIs: `MapGet/MapPost/MapMethods` patterns (syntactic + semantic)
  * gRPC: `MapGrpcService<TService>()` and service methods
* Hosted services: `IHostedService`, `BackgroundService.ExecuteAsync` as job entrypoints
* Message consumers (if present): known library patterns (e.g., MassTransit consumers)

### Reflection and DI heuristics

Produce **heuristic edges** when you see:

* `Type.GetType("…")`, `Assembly.GetType`, `GetMethod("…")`, `Invoke`
* `services.AddTransient<IFoo, Foo>()` / `AddScoped` / `AddSingleton`
  * Add edge `IFoo` → `Foo` constructor as a `di_binding` heuristic
* `Activator.CreateInstance`, `ServiceProvider.GetService` patterns

### Output guarantees

* Must not crash on partial compilation (missing refs); produce a partial graph with `why=COMPILATION_PARTIAL`
* Provide `artifact_key` per assembly/project output

## 6.2 Java / Node / Python / Go / Rust workers

v1 expectations:

* Provide import graph + framework entrypoints + best-effort call edges.
* Always label uncertain resolution as `heuristic` with a reason code.

## 6.3 Binary worker

v1 expectations:

* Identify artifacts, exported symbols, imported libs, and candidate entrypoints from container metadata.
* Provide component-level mapping primarily; symbol-level mapping only when confident.

---

# 7) APIs (scanner.webservice)

## 7.1 Ingestion endpoints

* `POST /api/scans` → creates scan record (returns `scanId`)
* `POST /api/scans/{scanId}/callgraphs` → accepts `CallGraph.v1.json`
* `POST /api/scans/{scanId}/runtimeevidence` → accepts `RuntimeEvidence.v1.json`
* `POST /api/scans/{scanId}/sbom` → accepts CycloneDX/SPDX
* `POST /api/scans/{scanId}/compute-reachability` → triggers computation (idempotent)

Rules:

* All ingests must be **idempotent** via a `contentDigest` header (store seen digests in Postgres; Valkey may accelerate dedupe).
* Reject mismatched `scanKey/scanId`.

## 7.2 Query endpoints

* `GET /api/scans/{scanId}/reachability/components?purl=...`
* `GET /api/scans/{scanId}/reachability/findings?cve=...`
* `GET /api/scans/{scanId}/reachability/explain?cve=...&purl=...`
  * returns `why[]` + path witness + sample refs

## 7.3 Export endpoints

* `GET /api/scans/{scanId}/exports/sarif`
* `GET /api/scans/{scanId}/exports/cdxr` (CycloneDX reachability extension)
* `GET /api/scans/{scanId}/exports/openvex` (reachability justifications as VEX annotations)

---

# 8) Deterministic replay requirements (must-have)

Every reachability result must be reproducible from:

* SBOM digest
* CallGraph digests (per worker)
* RuntimeEvidence digests (optional)
* Concelier feed snapshot digest
* Excititor VEX snapshot digest
* Policy digest (confidence scoring + gating rules)

Implement `ReplayManifest.json`:

```json
{
  "schema": "stella.replaymanifest.v1",
  "scanId": "uuid",
  "inputs": {
    "sbomDigest": "sha256:…",
    "callGraphs": [{ "language": "dotnet", "digest": "sha256:…" }],
    "runtimeEvidence": [{ "digest": "sha256:…" }],
    "concelierSnapshot": "sha256:…",
    "excititorSnapshot": "sha256:…",
    "policyDigest": "sha256:…"
  }
}
```

---

# 9) Quality gates and acceptance criteria

## 9.1 Golden corpus (mandatory)

Create `/tests/Reachability.Golden/` with:

* Minimal ASP.NET controller app with a known reachable
  endpoint → vulnerable lib call
* Minimal app with the vulnerable lib present but never called → unreachable
* Reflection-based activation case → “possible” unless runtime proves it
* BackgroundService job case

**Acceptance**:

* Each golden test asserts:
  * Reachability status
  * At least one `why[]` reason
  * Deterministic `confidence` within ±0.01

## 9.2 Drift detection (mandatory)

If runtime-minted edges missing from the static graph exceed a threshold:

* Emit a `COVERAGE_DRIFT` warning with the top missing edges
* Store the drift report in Postgres (`reachability_drift` table or JSONB field)

## 9.3 Performance SLOs (v1 targets)

* 1 medium service (100k LOC .NET) static graph: < 2 minutes on a CI-runner-class machine
* Reachability compute: < 30 seconds
* Query `GET finding`: < 200ms p95 (use Postgres indexes + optional Valkey cache)

---

# 10) Implementation plan (developer checklist)

## Milestone A — Data plumbing (1)

* Implement Postgres tables + migrations
* Implement ingestion endpoints
* Persist call graphs and runtime evidence
* Implement digesting + idempotency

**Done when:** you can ingest a call graph + SBOM and query stored nodes/edges.

## Milestone B — DotNet worker MVP (2)

* Roslyn graph extraction (direct calls + constructors)
* ASP.NET Core entrypoints (controllers + minimal APIs)
* Output `CallGraph.v1.json`

**Done when:** golden corpus “reachable/unreachable” works on static only.

## Milestone C — Reachability compute (3)

* Entry point seeding
* SCC compression (optional)
* Traversal + path witness generation
* Join to SBOM purls
* Generate `reachability_component` + `reachability_finding`

**Done when:** component-level reachability returns stable answers.

## Milestone D — Runtime evidence upgrade (4)

* Ingest runtime evidence
* Upgrade statuses to `REACHABLE_PROVEN`
* Drift report

**Done when:** runtime samples can flip a “possible” to “proven”.
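The edge-minting step behind Milestone D (§5.6) is mechanical: every pair of adjacent frames in a sample stack yields one `runtime_minted` edge. A Python sketch, assuming frames are ordered leaf-first as in many samplers (flip the pairing if your collector emits root-first):

```python
def mint_runtime_edges(samples):
    """Turn stack samples into runtime_minted edges.

    samples: list of frame lists (nodeIds), assumed leaf-first.
    Returns a de-duplicated set of (caller, callee, kind) tuples,
    matching the cg_edge.kind=runtime_minted rows from §5.6.
    """
    edges = set()
    for frames in samples:
        # Adjacent frames: frames[i] was called by frames[i+1].
        for callee, caller in zip(frames, frames[1:]):
            edges.add((caller, callee, "runtime_minted"))
    return edges
```

Using a set gives the dedup for free; weights (observed frequency) would need a counter instead if you keep the hot-path cardinality cap.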
## Milestone E — Exports (5)

* SARIF export
* CycloneDX reachability extension export
* OpenVEX annotations export

**Done when:** the CI pipeline can gate on proven/static reachability.

---

If you want this converted into **repo-facing artifacts** (e.g., `docs/modules/scanner/reachability-spec.md`, EF Core entity class skeletons, and OpenAPI snippets for the endpoints), tell me the exact module paths you use under `/src/` and I’ll emit drop-in files matching your StellaOps naming conventions.

Below is a concrete, implementable **architecture + technology stack** for **`StellaOps.Scanner.Worker.DotNet`** (the .NET static graph worker), written as if you’re handing it directly to an engineer to build.

---

# 1) Worker’s purpose in the StellaOps scanner system

**`StellaOps.Scanner.Worker.DotNet` is a stateless “graph producer.”**

It does **not** compute reachability, confidence, or any lattice logic (that lives in `scanner.webservice`).

The worker:

1. **Discovers entrypoints** (ASP.NET Core controllers, minimal APIs, gRPC, hosted services, etc.)
2. **Extracts a static call graph** (method → method edges)
3. **Adds heuristic edges** for DI/reflection/dynamic patterns
4. **Emits `CallGraph.v1.json`** and optionally uploads it to `scanner.webservice`

Key constraint: node IDs must be compatible with runtime evidence (EventPipe/ETW) mapping. That’s why we build node IDs from **(Module MVID + metadata token)** whenever possible.
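The (MVID + metadata token) identity is just a fixed hash over a few fields. This Python sketch mirrors the nodeId rule from the spec; the exact field encoding (decimal token, `:` separators) is a decision that simply has to be fixed identically on the worker and the runtime-collector side:

```python
import hashlib

def dotnet_node_id(mvid: str, metadata_token: int, arity: int, signature_shape: str) -> str:
    """nodeId = SHA256(MVID:token:arity:signatureShape), hex-encoded.

    mvid: module version id as a string; metadata_token: the method's
    MethodDef token; signature_shape: an opaque canonical signature string.
    """
    material = f"{mvid}:{metadata_token}:{arity}:{signature_shape}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Because every input is stable for an unchanged deterministic build, replaying a scan reproduces the same nodeIds, which is what makes graph caching and runtime-stack correlation work.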
---

# 2) Deployment model

## 2.1 Container image choice

You have two legitimate modes; implement both:

### Mode A — “Artifacts-first” (preferred for security)

* Input: already-built assemblies from CI (`bin/Release/.../*.dll` + associated files)
* Worker does **no `dotnet build`**
* Worker performs **IL/metadata scanning** + optional Roslyn source parsing for entrypoints/heuristics

### Mode B — “Build-and-scan” (convenience; higher risk)

* Input: repo checkout with `.sln`
* Worker runs `dotnet restore`/`dotnet build` inside a sandboxed container, then scans outputs

Because a .NET build can execute **MSBuild tasks, analyzers, and source generators** (code execution risk), the product default should be Mode A in any untrusted scenario.

## 2.2 Runtime requirements

* Base runtime: **.NET 10 (LTS)**. Microsoft’s support policy lists .NET 10 as LTS with original release **Nov 11, 2025** and latest patch **10.0.1 (Dec 9, 2025)**. ([Microsoft][1])
* If you use Mode B, the image must include the **.NET 10 SDK** (not just the runtime). ([Microsoft][2])

## 2.3 Sandbox controls (Mode B)

If you allow building:

* Run with **no outbound network** (or allowlist only an internal NuGet proxy).
* Read-only root FS; writable temp only.
* Drop Linux capabilities; use seccomp/apparmor defaults.
* Mount the repo read-only; write outputs to a dedicated volume.
* Disable telemetry: `DOTNET_CLI_TELEMETRY_OPTOUT=1`.
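Those sandbox controls translate roughly into container flags like the following. This is an illustrative `docker run` invocation only; the image name, mount paths, and CLI arguments are placeholders, and an equivalent pod security context applies on Kubernetes:

```shell
# Hardened Mode B invocation (sketch): no network, read-only rootfs with a
# tmpfs scratch dir, all capabilities dropped, repo mounted read-only,
# outputs to a dedicated volume, telemetry off.
docker run --rm \
  --network none \
  --read-only --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -v "$PWD/repo:/src:ro" \
  -v "$PWD/out:/out" \
  -e DOTNET_CLI_TELEMETRY_OPTOUT=1 \
  stella/worker-dotnet:latest \
  scan --sln /src/MySolution.sln --buildMode build --out /out/callgraph.json
```

If an internal NuGet proxy is required, replace `--network none` with a dedicated network that only routes to that proxy.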
---

# 3) Core architecture (pipeline)

Implement the worker as a single executable (CLI) with internal pipeline stages:

```
┌───────────────────────────────────────────────────────────────┐
│ Worker.DotNet CLI                                             │
│ Inputs: --sln / --assemblies / --repo, --scanKey, --out       │
└───────────────┬───────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 0: Discovery                                            │
│  - Find solutions/projects or assemblies                      │
│  - Determine configuration/TFM                                │
└───────────────┬───────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 1: Build (optional)                                     │
│  - dotnet restore/build OR skip                               │
│  - Collect output assembly paths                              │
└───────────────┬───────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 2: Reference Indexer                                    │
│  - Build mapping: (AssemblyName, Version) -> artifactKey      │
│  - Compute sha256 per referenced dll                          │
└───────────────┬───────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 3: IL Call Graph Extractor                              │
│  - Parse each project assembly                                │
│  - Create method nodes (nodeId = hash(MVID:token))            │
│  - Parse IL & add static edges (call/callvirt/newobj/ldftn..) │
│  - Emit external nodes for member refs                        │
└───────────────┬───────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 4: Roslyn Entrypoints + Heuristics                      │
│  - Controllers/minimal APIs/gRPC/HostedService entrypoints    │
│  - DI binding edges (AddTransient/AddScoped/AddSingleton etc.)│
│  - Reflection edges (Type.GetType/GetMethod/Invoke etc.)      │
│  - Resolve Roslyn symbols -> nodeIds via symbolKey dictionary │
└───────────────┬───────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────────────────────────────────────┐
│ Stage 5: Merge + Emit                                         │
│  - Merge nodes/edges/entrypoints                              │
│  - Output CallGraph.v1.json                                   │
│  - Optional POST to scanner.webservice                        │
└───────────────────────────────────────────────────────────────┘
```

**Why IL-first?** Because you want **metadata token + MVID** node IDs that correlate naturally with runtime stacks. Deterministic builds make the MVID stable for identical compilation inputs. ([Microsoft Learn][3])

---

# 4) Technology stack (NuGet + platform APIs)

## 4.1 Roslyn / MSBuild loading

Use Roslyn MSBuild workspace packages:

* `Microsoft.CodeAnalysis.Workspaces.MSBuild` (MSBuildWorkspace support) ([NuGet][4])
* `Microsoft.CodeAnalysis.CSharp.Workspaces` (C# semantic model / operations API)
* Optional: `Microsoft.CodeAnalysis` meta-package (superset) ([NuGet][5])
* `Microsoft.Build.Locator` (register MSBuild instances for workspace loading)

Roslyn packages are actively published by RoslynTeam (latest shown as **5.0.0** as of Nov 2025). ([NuGet][6])

## 4.2 IL + metadata scanning

Prefer BCL APIs (no extra dependencies):

* `System.Reflection.Metadata`
* `System.Reflection.PortableExecutable`
* `System.Reflection.Emit.OpCodes` for IL decoding (operand sizes)

(This lets you implement a compact IL parser without Cecil.)

Optional alternative (faster development, more deps):

* `Mono.Cecil` (makes IL traversal trivial) ([NuGet][7])

## 4.3 CLI + logging + JSON

* `System.CommandLine` (recommended)
* `Microsoft.Extensions.Logging` (+ Console logger)
* `System.Text.Json` (source-generated serializers strongly recommended)

## 4.4 Runtime alignment note

Runtime collectors commonly rely on EventPipe/ETW; the .NET diagnostics client library (`Microsoft.Diagnostics.NETCore.Client`) is the standard managed API for EventPipe sessions.
([Microsoft Learn][8])

The worker itself doesn’t collect runtime evidence, but the **nodeId algorithm must match what runtime collectors can compute** (hence MVID+token).

---

# 5) Internal module decomposition

Implement these internal components as classes/services. Keep them testable (pure functions where possible).

## 5.1 `WorkerOptions`

Holds CLI options:

* `ScanKey` (uuid)
* `RepoRoot`, `SolutionPath` OR `AssembliesPath[]`
* `Configuration` (default Release)
* `TargetFramework` (optional)
* `BuildMode` = `Artifacts | Build`
* `OutFile`
* `UploadUrl` + `ApiKey` (optional)
* `MaxEdgesPerNode` (optional throttle)
* `IncludeExternalNodes` (bool)
* `Concurrency` (int)

## 5.2 `BuildOrchestrator` (Mode B only)

Responsibilities:

* Run `dotnet restore` and `dotnet build`
* Capture output logs and surface them as structured diagnostics
* Return discovered output assemblies (dll paths)

Hard requirements:

* Support `--no-restore` and `--no-build` toggles (or equivalent)
* Support `ContinuousIntegrationBuild=true` to improve determinism when available
* If the build fails, still attempt to scan any assemblies that exist, but mark the output with `why=BUILD_FAILED_PARTIAL`.

## 5.3 `MsbuildWorkspaceLoader` (Roslyn)

Responsibilities:

* Register MSBuild with `MSBuildLocator`
* Load `.sln` via `MSBuildWorkspace`
* Provide:
  * `Solution` object
  * `Project` list (C# only for v1)
  * Compilation(s) when needed (for semantic analysis)

MSBuildWorkspace is the canonical Roslyn path for analyzing MSBuild solutions. ([NuGet][4])

## 5.4 `ReferenceIndexer`

Responsibilities:

* Build a map from referenced assemblies to `artifactKey`
* For each `PortableExecutableReference` with a file path:
  * compute sha256
  * read assembly identity (name, version)
  * create `artifactKey`
  * add to:
    * `AssemblyIdentity -> artifactKey`
    * `artifactKey -> sha256/path/version`

This index is used by the IL extractor to attribute **external nodes** to the correct artifacts.
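One `ReferenceIndexer` entry boils down to hashing the file and deriving a key. This Python sketch shows the shape of the computation; the `artifactKey` format used here (`name@version+sha256 prefix`) is a placeholder, not the StellaOps format:

```python
import hashlib
from pathlib import Path

def index_reference(dll_path, assembly_name, assembly_version):
    """Build one reference-index entry: sha256 of the dll file plus a
    derived artifactKey (illustrative key shape, not the product's)."""
    digest = hashlib.sha256(Path(dll_path).read_bytes()).hexdigest()
    return {
        "artifactKey": f"{assembly_name}@{assembly_version}+sha256:{digest[:12]}",
        "sha256": digest,
        "path": str(dll_path),
        "identity": (assembly_name, assembly_version),
    }
```

In the real worker the identity comes from the assembly metadata itself rather than being passed in, and the two lookup maps (`AssemblyIdentity -> artifactKey`, `artifactKey -> sha256/path/version`) are built from these entries.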
## 5.5 `IlCallGraphExtractor`

Responsibilities:

* For each “root” assembly (project output):
  * open the PE
  * get the module MVID
  * enumerate `MethodDefinition` rows
  * create nodes for all methods
  * parse IL bodies and emit edges

### IL parsing scope (v1)

You only need to recognize these opcodes as “calls”:

* `call`
* `callvirt`
* `newobj`
* `jmp`
* `ldftn`
* `ldvirtftn`

### Node identity

* Internal method nodeId:
  * `nodeId = SHA256( MVID + ":" + metadataToken + ":" + arity + ":" + signatureShape )`
  * Minimal acceptable: `SHA256(MVID + ":" + metadataToken)`

This is intentionally compatible with how runtime stacks identify methods (module + token).

### External method nodes

If a call operand is a `MemberRef`/`MethodSpec` that targets another assembly:

* Create an “external node” with:
  * `symbolKey` computed from the metadata signature
  * `artifactKey` resolved via `ReferenceIndexer` (assembly identity match)
  * `nodeId = SHA256("ext:" + artifactKey + ":" + symbolKey)` (runtime-proof not required)

Set `flags |= External`.

## 5.6 `RoslynEntrypointExtractor`

Responsibilities:

* Produce `entrypoints[]` records pointing to nodeIds.

### Must support (v1)

**ASP.NET Core MVC controllers**

* Type has `[ApiController]` or derives from `ControllerBase`
* Action methods: public instance methods with routing attributes `[HttpGet]`, `[HttpPost]`, `[Route]`, etc.
* Route template:
  * combine controller + action route attributes (best effort)
* `entrypoint.kind = http`, `framework=aspnetcore`

**Minimal APIs**

* Detect invocation of `MapGet`, `MapPost`, `MapPut`, `MapDelete`, `MapMethods`
* Extract the route string literal when available
* Handler target:
  * lambda => map to generated method?
(best effort) * method group => resolve to method symbolKey => nodeId **gRPC** * Detect `MapGrpcService()` (endpoint registration) * Entry points: service methods on generated base types (best effort) **Background jobs** * Types implementing `IHostedService` * `BackgroundService.ExecuteAsync` override * `entrypoint.kind = job` ### Mapping Roslyn → nodeId Do **not** attempt to compute metadata tokens from Roslyn symbols directly. Instead: * Generate the same canonical `symbolKey` for Roslyn symbols * Resolve `symbolKey -> nodeId` using a dictionary built from IL nodes If not resolvable, emit an entrypoint with a synthetic “unresolved” node: * `nodeId = SHA256("unresolved:" + symbolKey)` * `flags |= Unresolved` * `why += ENTRYPOINT_SYMBOL_UNRESOLVED` ## 5.7 `RoslynHeuristicEdgeExtractor` Responsibilities: * Add **heuristic edges** that IL won’t reliably capture. ### DI bindings (must-have) Detect common DI registration patterns: * `services.AddTransient()` * `AddScoped`, `AddSingleton` Emit heuristic edge: * from: interface method set? (v1 simplify to type-level constructor edge) * to: `Foo..ctor(...)` node * `reason = di_binding` Practical v1 implementation: * Create edge from a synthetic “DI container” node per assembly to implementation constructors. * Or create edges from the registration site method to the constructor. (Choose one and keep consistent.) ### Reflection (must-have) Emit heuristic edges with lower confidence: * `Type.GetType("Namespace.Type, Assembly")` * `Assembly.Load(...)`, `GetMethod("X")`, `Invoke` * `Activator.CreateInstance(...)` If string literal resolves to a type/method in the solution, create edge: * from: caller method * to: target method/ctor * `reason = reflection_string` If not resolvable, record a `why=REFLECTION_UNRESOLVED_STRING` diagnostic; do not crash. 
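The symbolKey → nodeId resolution from §5.6, including the synthetic "unresolved" fallback, can be sketched as pure functions. Helper and type names here are illustrative, not part of the spec:

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Sketch of the §5.6 mapping: Roslyn symbols never compute metadata tokens;
// they resolve a canonical symbolKey against an index built from IL nodes.
static class NodeIdResolver
{
    static string Sha256Hex(string s) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(s))).ToLowerInvariant();

    // Minimal internal nodeId: SHA256(MVID + ":" + metadataToken).
    public static string IlNodeId(Guid mvid, int metadataToken) =>
        Sha256Hex($"{mvid:D}:{metadataToken}");

    // Resolve against the IL-built index; if unresolved, emit
    // SHA256("unresolved:" + symbolKey) and flag the entrypoint.
    public static (string NodeId, bool Unresolved) Resolve(
        IReadOnlyDictionary<string, string> ilIndex, string symbolKey) =>
        ilIndex.TryGetValue(symbolKey, out var nodeId)
            ? (nodeId, false)
            : (Sha256Hex("unresolved:" + symbolKey), true);
}
```

Because both sides hash deterministic inputs, the same `(MVID, token)` pair always yields the same nodeId, and an unresolved entrypoint still gets a stable, distinguishable identity.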
## 5.8 `GraphMerger`

Responsibilities:

* Merge nodes/edges/entrypoints from the IL and Roslyn stages
* De-duplicate edges by `(from, to, kind, reason)`
* Apply optional throttles:
  * cap edges per node
  * drop low-weight heuristics if there are too many

## 5.9 `CallGraphWriter`

Responsibilities:

* Serialize `CallGraph.v1.json` exactly to spec
* Include:
  * `artifacts[]` (project outputs + references)
  * `nodes[]`, `edges[]`
  * `entrypoints[]`
  * `language = "dotnet"`
  * `scanKey`

---

# 6) Canonical symbolKey format (critical for merges)

Pick one canonical form and use it everywhere. Recommended v1 `symbolKey` shape:

```
{Namespace}.{TypeName}[`Arity][+Nested]::{MethodName}[`Arity]({ParamType1},{ParamType2},...)
```

Rules:

* Use `System.*` full names for BCL types
* Use `+` for nested types (metadata style)
* Use backtick arity for generic type/method definitions
* For arrays: `System.String[]`
* For byref: `System.String&`

**Implementation detail:**

* The IL extractor can build this from metadata signatures.
* The Roslyn extractor can build this using a controlled `SymbolDisplayFormat`.

If you get this right, Roslyn → IL mapping becomes reliable.

---

# 7) CLI surface (what developers will actually run)

Minimum viable commands:

### Artifacts-first scan

```bash
stella-worker-dotnet scan \
  --scanKey 00000000-0000-0000-0000-000000000000 \
  --assemblies ./artifacts/bin/Release \
  --out ./callgraph.json
```

### Build-and-scan (internal trusted only)

```bash
stella-worker-dotnet scan \
  --scanKey ... \
  --sln ./src/MySolution.sln \
  --configuration Release \
  --tfm net10.0 \
  --buildMode build \
  --out ./callgraph.json
```

### Upload to scanner.webservice

```bash
stella-worker-dotnet scan \
  --scanKey ... \
  --assemblies ./artifacts/bin/Release \
  --upload https://scanner/api/scans/{scanId}/callgraphs \
  --apiKey $STELLA_API_KEY
```

---

# 8) Observability and failure behavior

## 8.1 Structured diagnostics

Always emit:

* counts: nodes/edges/entrypoints
* build outcome: success/failed/partial
* the list of projects scanned/skipped
* unresolved symbol counts (entrypoints + heuristic edges)

## 8.2 Hard failure vs partial output

* If at least one assembly was scanned, output a graph even if others fail.
* Mark diagnostics in the output:
  * add `why`/`notes` (if you extend the schema), OR
  * log to stderr and let the webservice record the warning on ingest.

---

# 9) Why this architecture works for reachability

* **IL extraction** provides the most faithful call edges and stable node IDs.
* **Roslyn adds what IL can’t:** framework entrypoints, DI and reflection heuristics.
* Node IDs based on **MVID+token** align with deterministic compilation behavior and runtime tooling patterns. Deterministic compilation replaces the timestamp/MVID with values derived from the compilation inputs, enabling stable identity under stable inputs. ([Microsoft Learn][3])
* Roslyn’s MSBuildWorkspace is the canonical way to load solutions/projects with correct references and compilation options. ([NuGet][4])

---

# 10) Implementation “must-do” checklist for the developer

1. **Define the canonical symbolKey** and implement it in:
   * the IL extractor
   * the Roslyn symbol formatter
2. Implement `ReferenceIndexer` to map assembly identity → artifactKey.
3. Implement the IL extractor:
   * nodes for method defs
   * edges for call opcodes
4. Implement the entrypoint detectors (controllers + minimal APIs + hosted services).
5. Implement the DI + reflection heuristic edges.
6. Merge and output `CallGraph.v1.json` matching the schema.
7. Add golden tests (small ASP.NET apps) verifying:
   * entrypoint detection
   * at least one static path exists
   * DI/reflection edges appear with the correct reason codes

---

If you want, I can also provide:

* a **concrete folder layout** (`/src/StellaOps.Scanner.Worker.DotNet/...`) with project files,
* the **exact `SymbolDisplayFormat`** you should use for symbolKey stability,
* and an IL opcode decoder snippet that only records call-like opcodes (fast, minimal).

[1]: https://dotnet.microsoft.com/en-us/platform/support/policy/dotnet-core ".NET and .NET Core official support policy | .NET"
[2]: https://dotnet.microsoft.com/en-US/download/dotnet/10.0 "Download .NET 10.0 (Linux, macOS, and Windows) | .NET"
[3]: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-options/code-generation?utm_source=chatgpt.com "C# Compiler Options that control code generation"
[4]: https://www.nuget.org/packages/Microsoft.CodeAnalysis.Workspaces.MSBuild/?utm_source=chatgpt.com "Microsoft.CodeAnalysis.Workspaces.MSBuild 5.0.0"
[5]: https://www.nuget.org/packages/microsoft.codeanalysis?utm_source=chatgpt.com "Microsoft.CodeAnalysis 5.0.0"
[6]: https://www.nuget.org/profiles/RoslynTeam?utm_source=chatgpt.com "NuGet Gallery | RoslynTeam"
[7]: https://www.nuget.org/packages/mono.cecil/?utm_source=chatgpt.com "Mono.Cecil 0.11.6"
[8]: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/diagnostics-client-library?utm_source=chatgpt.com "Diagnostics client library - .NET"