Here’s a practical, from‑scratch blueprint for a **two‑stage reachability map** that turns low‑level runtime facts into auditable, reproducible evidence for triage and VEX decisions.

---

# What this is (plain English)

* **Goal:** prove (or rule out) whether a vulnerable function/package could actually run in *your* build and deployment.
* **How:**

  1. extract **binary‑level call targets** (what functions your program *could* call),
  2. map those targets onto **symbol graphs** (named functions/classes/modules),
  3. correlate those symbols with **SBOM components** (which package/image layer they live in),
  4. store each “slice” of reachability as a **signed attestation** so anyone can replay and verify it.

---

# Stage A — Binary → Symbol graph

* **Inputs:** built artifacts (ELF/PE/Mach‑O), debug symbols (when available), stripped binaries, and language runtimes.
* **Process (per artifact):**

  * Parse binaries (headers, sections, symbol tables, relocations).
  * Recover call edges:

    * Direct calls: disassemble; record `caller -> callee`.
    * Indirect calls: resolve via PLT/IAT/vtables; fall back to conservative points‑to sets.
    * Dynamic loading: log `dlopen/LoadLibrary` + exported symbol usage heuristics.
  * Normalize to **Symbol Graph**: nodes = `{binary, symbol, addr, hash}`, edges = `CALLS`.
* **Outputs:** `symbol-graph.jsonl` (+ compact binary form), content‑addressed by hash.
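
The normalization and content-addressing step could look roughly like this. It is a minimal sketch that assumes call edges have already been recovered by a disassembler. The node fields follow the `{binary, symbol, addr, hash}` shape above; `write_jsonl`, the record layout, and the choice of SHA-256 are illustrative assumptions, not a fixed format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True, order=True)
class SymbolNode:
    binary: str  # name or path of the artifact
    symbol: str  # symbol name as found in the symbol table
    addr: int    # virtual address of the symbol
    hash: str    # digest of the containing file (SHA-256 assumed here)


def canonical_line(record: dict) -> str:
    # Sorted keys and fixed separators keep the bytes stable across runs.
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


def write_jsonl(nodes, edges):
    """Serialize nodes and CALLS edges in sorted, de-duplicated order.

    edges is an iterable of (caller, callee) SymbolNode pairs.
    Returns (jsonl_bytes, "sha256:<hex>") so the output can be content-addressed.
    """
    lines = [canonical_line({"type": "node", **asdict(n)}) for n in sorted(set(nodes))]
    for src, dst in sorted(set(edges)):
        lines.append(canonical_line(
            {"type": "edge", "kind": "CALLS", "src": asdict(src), "dst": asdict(dst)}
        ))
    data = ("\n".join(lines) + "\n").encode("utf-8")
    return data, "sha256:" + hashlib.sha256(data).hexdigest()
```

Sorting and de-duplicating before serialization is what makes the digest independent of the order in which the disassembler happened to emit nodes and edges.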

# Stage B — Symbol graph ↔ SBOM components

* **Inputs:** CycloneDX/SPDX SBOM for the image/build; file→component mapping (path→pkg).
* **Process:**

  * For each symbol: derive file path (or Build‑ID) → map to SBOM component/version/layer.
  * Build **Component Reachability Graph**:

    * nodes = `{component@version}`, edges = “component provides symbol X used by Y”.
    * annotate with file hashes, Build‑IDs, container layer digests.
* **Outputs:** `reachability-slices/COMPONENT@VERSION.slice.json` (per impacted component).
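
As a rough illustration of Stage B, the sketch below lifts symbol-level edges to component-level edges. It assumes the file→component map has already been extracted from the SBOM as a plain dictionary keyed by file hash; `build_component_graph`, the dictionary shapes, and the `"unknown"` placeholder are hypothetical.

```python
from collections import defaultdict

UNKNOWN = "unknown"  # placeholder for files the SBOM does not cover


def build_component_graph(symbols, edges, file_to_component):
    """Lift symbol-level call edges to component-level edges.

    symbols:           {symbol_id: {"name": str, "file_sha256": str}}
    edges:             iterable of (src_symbol_id, dst_symbol_id)
    file_to_component: {file_sha256: "component@version" or purl}
    Returns {(user_component, provider_component): sorted symbol names}.
    """
    def component_of(symbol_id):
        return file_to_component.get(symbols[symbol_id]["file_sha256"], UNKNOWN)

    graph = defaultdict(set)
    for src, dst in edges:
        user, provider = component_of(src), component_of(dst)
        # Calls inside one known component add no cross-component edge;
        # edges touching unmapped files are kept so they stay visible.
        if user != provider or user == UNKNOWN:
            graph[(user, provider)].add(symbols[dst]["name"])
    return {key: sorted(names) for key, names in sorted(graph.items())}
```

Keeping unmapped files as an explicit `"unknown"` component, rather than dropping them, lets the policy stage count them against an unknowns budget.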

# Attestable “slice” (the evidence object)

Each slice is a minimal proof unit answering: *“This vulnerable symbol is (or isn’t) on a feasible path at runtime in build X.”*

* **Contents:**

  * Scan manifest (tool versions, ruleset hashes, feed versions).
  * Inputs digests (binaries, SBOM, container layers).
  * The subgraph (only nodes/edges needed).
  * Query + result (e.g., “is `openssl:EVP_PKEY_decrypt` reachable from any exported entrypoint?”).
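* **Format:** DSSE envelope around an in‑toto statement, stored as an OCI artifact or a file; the signed payload is **deterministic** (same inputs → same bytes), although signature bytes can differ between runs depending on the signing scheme.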
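
To make the envelope concrete, here is a minimal sketch of wrapping a slice in an in-toto Statement and a DSSE envelope. The pre-authentication encoding (PAE) and the envelope fields follow the DSSE specification; the predicate type URI, `make_statement`, `make_envelope`, and the HMAC-based `demo_sign` are assumptions for illustration only. A real deployment would sign with an asymmetric key held by the Attestor.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"
# Hypothetical predicate type; pick and publish your own URI.
PREDICATE_TYPE = "https://example.invalid/reachability-slice/v1"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that actually get signed."""
    t = payload_type.encode("utf-8")
    return b" ".join(
        [b"DSSEv1", str(len(t)).encode(), t, str(len(payload)).encode(), payload]
    )


def make_statement(slice_doc: dict, subject_name: str, subject_sha256: str) -> dict:
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": subject_name, "digest": {"sha256": subject_sha256}}],
        "predicateType": PREDICATE_TYPE,
        "predicate": slice_doc,
    }


def make_envelope(statement: dict, sign, keyid: str) -> dict:
    # Canonical-ish JSON so the same statement always yields the same payload.
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(PAYLOAD_TYPE, payload))
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": PAYLOAD_TYPE,
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}],
    }


def demo_sign(message: bytes) -> bytes:
    # Stand-in signer for the sketch only: HMAC is NOT a public-key signature.
    return hmac.new(b"demo-key", message, hashlib.sha256).digest()
```

Because the signature covers the PAE of the payload type and payload rather than the raw JSON, a verifier never has to re-canonicalize the statement before checking it.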

# Triage flow (how it helps today)

* Given CVE → map to symbols/functions → check reachability slice:

  * **Reachable path found:** mark “affected (reachable)”, include call chain and components; raise priority.
  * **No path / gated by feature flag:** mark “not affected (unreachable/mitigated)”, with proof chain.
  * **Unknowns present:** fail‑safe policy (e.g., “unknowns > N → block prod”) with explicit unknown edges listed.
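
The three branches above can be sketched as a small policy function. This is an assumption-laden illustration: the `triage` name, the `action` values, and the use of the common VEX status strings (`affected`, `not_affected`, `under_investigation`) are choices made here, and the "unknowns > N" rule is read as "more unknown edges than the budget allows".

```python
VALID_VERDICTS = {"reachable", "unreachable", "unknown"}


def triage(verdict, unknown_edges=(), max_unknowns=0):
    """Map a slice verdict plus its unknown edges to a triage decision."""
    if verdict not in VALID_VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    unknowns = sorted(unknown_edges)
    if verdict == "reachable":
        # A concrete path outweighs any unknowns elsewhere in the graph.
        status, action = "affected", "raise_priority"
    elif verdict == "unknown" or len(unknowns) > max_unknowns:
        # Fail safe: too many unresolved edges to claim "unreachable".
        status, action = "under_investigation", "block_prod"
    else:
        status, action = "not_affected", "none"
    # Unknown edges are always listed so reviewers can see what was unresolved.
    return {"vex_status": status, "action": action, "unknown_edges": unknowns}
```

For example, `triage("unreachable", [("parse_body", "?")], max_unknowns=0)` yields `under_investigation` with `block_prod`, while raising the budget to 1 yields `not_affected`.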

# Minimal data model (JSON hints)

* `Symbol`: `{ id, name, demangled, addr, file_sha256, build_id }`
* `Edge`: `{ src_symbol_id, dst_symbol_id, kind: "direct"|"plt"|"indirect" }`
* `Mapping`: `{ file_sha256|build_id -> component_purl, layer_digest, path }`
* `Slice`: `{ inputs:{…}, query:{…}, subgraph:{symbols:[…],edges:[…]}, verdict:"reachable"|"unreachable"|"unknown" }`
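
One possible typed rendering of these hints, as a sketch. Field names follow the bullets above; the `Subgraph` wrapper, the single `key` field on `Mapping` (holding either a file hash or a Build-ID), and the `dangling_edges` helper are additions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional

EdgeKind = Literal["direct", "plt", "indirect"]
Verdict = Literal["reachable", "unreachable", "unknown"]


@dataclass(frozen=True)
class Symbol:
    id: str
    name: str
    demangled: str
    addr: int
    file_sha256: str
    build_id: Optional[str] = None  # absent when the binary carries no Build-ID


@dataclass(frozen=True)
class Edge:
    src_symbol_id: str
    dst_symbol_id: str
    kind: EdgeKind


@dataclass(frozen=True)
class Mapping:
    key: str  # file_sha256 or build_id
    component_purl: str
    layer_digest: str
    path: str


@dataclass
class Subgraph:
    symbols: List[Symbol] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def dangling_edges(self) -> List[Edge]:
        """Edges that reference a symbol id missing from this subgraph."""
        known = {s.id for s in self.symbols}
        return [e for e in self.edges
                if e.src_symbol_id not in known or e.dst_symbol_id not in known]


@dataclass
class Slice:
    inputs: dict
    query: dict
    subgraph: Subgraph
    verdict: Verdict
```

A slice whose `dangling_edges()` is non-empty is not self-contained, which would make it impossible to verify from the slice alone.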

# Determinism & replay

* Pin **everything**: disassembler version, rules, demangler options, container digests, SBOM doc hash, symbolization flags.
* Emit a **Scan Manifest** with content hashes; store alongside slices.
* Provide a `replay` command that re‑hydrates inputs and re‑computes the slice; byte‑for‑byte match required.
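
A replay check might be sketched as follows. It assumes slices are JSON-serializable dictionaries with `inputs` and `query` keys, and that `recompute` is a caller-supplied function that re-runs the analysis; both names are hypothetical. The sorted-key JSON used here is a simple stand-in for a formal canonicalization scheme.

```python
import hashlib
import json


def canonical_bytes(doc) -> bytes:
    # Sorted keys + fixed separators: a simple stand-in for formal canonical JSON.
    return json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def digest(doc) -> str:
    return "sha256:" + hashlib.sha256(canonical_bytes(doc)).hexdigest()


def replay(stored_slice: dict, recompute) -> dict:
    """Re-run the analysis from the stored inputs and compare byte-for-byte.

    recompute(inputs, query) must return a fresh slice dict.
    """
    fresh = recompute(stored_slice["inputs"], stored_slice["query"])
    expected, actual = digest(stored_slice), digest(fresh)
    if expected == actual:
        return {"result": "match", "digest": expected}
    # Report which top-level keys differ as a coarse diff.
    changed = sorted(
        key for key in set(stored_slice) | set(fresh)
        if stored_slice.get(key) != fresh.get(key)
    )
    return {"result": "mismatch", "expected": expected,
            "actual": actual, "changed_keys": changed}
```

Comparing digests of canonical bytes, rather than comparing parsed objects, is what enforces the byte-for-byte requirement.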

# Where this plugs into Stella Ops (suggested modules)

* **Sbomer**: component/file mapping & SBOM import.
* **Scanner.webservice**: binary parse & call‑graph extraction (keep lattice/policy elsewhere per your rule).
* **Vexer/Policy Engine**: consume slices as evidence for “affected/not‑affected” claims.
* **Attestor/Authority**: sign DSSE/in‑toto statements; push to OCI.
* **Timeline/Notify**: surface verdict deltas over time, link to slices.

# Guardrails & fallbacks

* For stripped binaries: prefer Build‑ID lookups against external symbol servers; otherwise over‑approximate conservatively and mark the affected edges as unknown.
* For JIT/dynamic plugins: capture runtime traces (eBPF/ETW) and merge as **observed edges** with timestamps.
* Mixed‑language stacks: key symbols by file hash and normalize names using each toolchain's mangling rules.
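
Merging runtime observations into the static graph could be sketched like this. It assumes trace events have already been reduced to `(src, dst, timestamp)` tuples with numeric timestamps; `merge_observed`, the `"observed"` edge kind, and the annotation fields are hypothetical.

```python
def merge_observed(static_edges, observed):
    """Annotate static edges with runtime observations and add trace-only edges.

    static_edges: {(src, dst): kind} from static analysis
    observed:     iterable of (src, dst, timestamp) from eBPF/ETW traces,
                  with numeric timestamps (e.g. epoch seconds)
    Returns a new dict; the input mapping is left untouched.
    """
    merged = {
        edge: {"kind": kind, "observed": False, "first_seen": None, "last_seen": None}
        for edge, kind in static_edges.items()
    }
    for src, dst, ts in observed:
        entry = merged.setdefault(
            (src, dst),
            # Edge seen only at runtime, e.g. a JIT-compiled or plugin call.
            {"kind": "observed", "observed": False, "first_seen": None, "last_seen": None},
        )
        entry["observed"] = True
        entry["first_seen"] = ts if entry["first_seen"] is None else min(entry["first_seen"], ts)
        entry["last_seen"] = ts if entry["last_seen"] is None else max(entry["last_seen"], ts)
    return merged
```

Since timestamps vary from run to run, it is probably safest to keep observed-edge annotations out of the slices that must replay byte-for-byte and emit them as separate "observed-path" slices.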

# Quick implementation plan (6 sprints)

1. **Binary ingest**: ELF/PE/Mach‑O parsing, Build‑ID extraction and file hashing, symbol tables, PLT/IAT resolution.
2. **Call‑edge recovery**: direct calls, basic indirect resolution, slice extractor by entrypoint.
3. **SBOM mapping**: file→component map, layer digests, purl normalization.
4. **Evidence format**: DSSE/in‑toto schema, deterministic manifests, OCI storage.
5. **Queries & policies**: “is‑reachable?” API, unknowns budget, feature‑flag conditions, VEX plumbing.
6. **Runtime merge**: optional eBPF/ETW traces → annotate edges, produce “observed‑path” slices.

# Lightweight APIs (sketch)

* `POST /reachability/query { cve, symbols[], entrypoints[], policy } -> slice+verdict`
* `GET /slice/{digest}` -> attested slice
* `POST /replay { slice_digest }` -> match | mismatch (with diff)

# Small example (CVE → symbol mapping)

* `CVE‑XXXX‑YYYY` → advisory lists function `foo_decrypt` in `libfoo.so`
* We resolve the Build‑ID of `libfoo.so` in the image, find symbols matching the demangled name, and build call paths from the service entrypoints. If a path exists, the slice is “reachable” and carries its call chain (say, 3–7 hops); otherwise it is “unreachable” with reasons (no import, symbol stripped at link time, dead code eliminated, or gated by `FEATURE_X=false`).
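
The path search in this example can be sketched as a breadth-first search over call edges. The function name `find_call_path` and the symbol names in the usage note are made up for illustration; edges gated by a disabled feature flag are assumed to have been removed before the search.

```python
from collections import deque


def find_call_path(edges, entrypoints, target):
    """Breadth-first search from the entrypoints to the target symbol.

    edges:       iterable of (caller, callee) symbol names
    entrypoints: iterable of symbol names the service exposes
    Returns the shortest call chain as a list, or None if no path exists.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    # Sorting keeps the result deterministic when several shortest paths exist.
    queue = deque([entry] for entry in sorted(entrypoints))
    seen = set(entrypoints)
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for callee in sorted(adjacency.get(path[-1], [])):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None
```

With edges `main -> handle_request -> parse_body -> foo_decrypt`, a query from `main` returns that four-symbol chain (three hops), while a query from an entrypoint that never reaches `foo_decrypt` returns `None`.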

# Costs (rough, for planning inside Stella Ops)

* **Core parsing & graph**: 3–4 engineer‑weeks
* **Indirect calls & heuristics**: +3–5 weeks
* **SBOM mapping & layers**: 2 weeks
* **Attestations & OCI storage**: 1–2 weeks
* **Policy/VEX integration & UI surfacing**: 2–3 weeks
* **Runtime trace merge (optional)**: 2–4 weeks

*(All figures are engineer‑weeks and the work is parallelizable; add 25–40% for hardening/tests.)*
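
For planning, the ranges above can be totalled mechanically. The sketch below treats every figure as engineer-weeks of effort (not calendar time) and applies the 25% uplift to the low end and the 40% uplift to the high end; the function names are illustrative.

```python
# (low, high) engineer-weeks, copied from the list above.
ESTIMATES = {
    "core parsing & graph": (3, 4),
    "indirect calls & heuristics": (3, 5),
    "SBOM mapping & layers": (2, 2),
    "attestations & OCI storage": (1, 2),
    "policy/VEX integration & UI": (2, 3),
    "runtime trace merge (optional)": (2, 4),
}


def total(estimates, skip=()):
    """Sum the low and high ends, optionally skipping named items."""
    kept = [rng for name, rng in estimates.items() if name not in skip]
    return sum(lo for lo, _ in kept), sum(hi for _, hi in kept)


def with_hardening(low, high, low_pct=25, high_pct=40):
    """Apply the hardening/test uplift: low_pct to the low end, high_pct to the high end."""
    return low * (100 + low_pct) / 100, high * (100 + high_pct) / 100


base = total(ESTIMATES)                   # (13, 20)
print(base, with_hardening(*base))        # (13, 20) (16.25, 28.0)

core_only = total(ESTIMATES, skip=("runtime trace merge (optional)",))
print(core_only, with_hardening(*core_only))
```

So the full scope comes to 13–20 engineer-weeks before hardening and roughly 16–28 after, or 11–16 before hardening if the optional runtime merge is dropped.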

If you want, I can turn this into:

* a concrete **.NET 10 service skeleton** (endpoints + data contracts),
* a **DSSE/in‑toto schema** for the slice, and
* a **dev checklist** for deterministic builds and replay harness.