Here’s a crisp idea you can drop straight into Stella Ops: treat “unknowns” as first‑class data, not noise.
Unknowns Registry — turning uncertainty into signals
Why: Scanners and VEX feeds miss things (ambiguous package IDs, unverifiable hashes, orphaned layers, missing SBOM edges, runtime-only artifacts). Today these get logged and forgotten. If we structure them, downstream agents can reason about risk and shrink blast radius proactively.
What it is: A small service + schema that records every uncertainty with enough context for later inference.
Core model (v0)
{
"unknown_id": "unk:sha256:…",
"observed_at": "2025-11-18T12:00:00Z",
"provenance": {
"source": "Scanner.Analyzer.DotNet|Sbomer|Signals|Vexer",
"host": "runner-42",
"scan_id": "scan:…"
},
"scope": {
"artifact": { "type": "oci.image", "ref": "registry/app@sha256:…" },
"subpath": "/app/bin/Contoso.dll",
"phase": "build|scan|runtime"
},
"unknown_type": "identity_gap|version_conflict|hash_mismatch|missing_edge|runtime_shadow|policy_undecidable",
"evidence": {
"raw": "nuget id 'Serilog' but assembly name 'Serilog.Core'",
"signals": ["sym:Serilog.Core.Logger", "procopen:/app/agent"]
},
"transitive": {
"depth": 2,
"parents": ["pkg:nuget/Serilog@?"],
"children": []
},
"confidence": { "p": 0.42, "method": "bayes-merge|rule" },
"exposure_hints": {
"surface": ["logging pipeline", "startup path"],
"runtime_hits": 3
},
"status": "open|triaged|suppressed|resolved",
"labels": ["reachability:possible", "sbom:incomplete"]
}
Categorize by three axes
- Provenance (where it came from): Scanner vs Sbomer vs Vexer vs Signals.
- Scope (what it touches): image/layer/file/symbol/runtime‑proc/policy.
- Transitive depth (how far from an entry point): 0 = direct, 1..N via deps.
How agents use it
- Cartographer: includes unknown edges in the graph with special weight; lets Policy/Lattice down‑rank vulnerable nodes near high‑impact unknowns.
- Remedy Assistant (Zastava): proposes micro‑probes (“add EventPipe/JFR tap for X symbol”) or build‑time assertions (“pin Serilog>=3.1, regenerate SBOM”).
- Scheduler: prioritizes scans where unknown density × asset criticality is highest.
Minimal API (idempotent, additive)
- POST /unknowns/ingest - upsert by unknown_id (hash of type+scope+evidence).
- GET /unknowns?artifact=…&status=open - list for a target.
- POST /unknowns/:id/triage - set status/labels, attach rationale.
- GET /metrics - density by artifact/namespace/unknown_type.
All additive; no versioning required. Repeat calls with the same payload are no‑ops.
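The idempotent upsert can be illustrated with a minimal Python sketch. It assumes unknown_id is the SHA-256 of a canonical JSON rendering of type + scope + evidence; the canonicalization rule, the `UnknownsStore` class, and its `ingest` method are hypothetical stand-ins, not part of any existing Stella Ops API.

```python
import hashlib
import json


def unknown_id(unknown_type, scope, evidence):
    """Derive a stable ID from type + scope + evidence (canonical form is an assumption)."""
    canonical = json.dumps(
        {"unknown_type": unknown_type, "scope": scope, "evidence": evidence},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return "unk:sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class UnknownsStore:
    """Hypothetical in-memory stand-in for the registry's storage."""

    def __init__(self):
        self._records = {}

    def ingest(self, record):
        """Upsert by unknown_id; returns (id, created). Repeat calls are no-ops."""
        uid = unknown_id(record["unknown_type"], record["scope"], record["evidence"])
        created = uid not in self._records
        if created:
            self._records[uid] = {**record, "unknown_id": uid, "status": "open"}
        return uid, created

    def __len__(self):
        return len(self._records)
```

Because keys are sorted before hashing, two emitters that serialize the same fields in a different order still land on the same record.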
Scoring hook (into your lattice)
- Add an “Unknowns Pressure” term: risk = base ⊕ (α * density_depth≤1) ⊕ (β * runtime_shadow) ⊕ (γ * policy_undecidable)
- Gate “green” only if density_depth≤1 == 0 or compensating controls are active.
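A small Python sketch of the pressure term and the gate follows. The text does not define ⊕, so this sketch assumes it is addition saturating at 1.0; the default α/β/γ values and the function names are placeholders chosen only for illustration.

```python
def oplus(a, b):
    """Assumed combiner for ⊕: addition that saturates at 1.0."""
    return min(1.0, a + b)


def unknowns_risk(base, density_depth_le1, runtime_shadow, policy_undecidable,
                  alpha=0.05, beta=0.2, gamma=0.1):
    """Fold the three Unknowns Pressure terms into the base risk."""
    risk = base
    for term in (alpha * density_depth_le1,
                 beta * runtime_shadow,
                 gamma * policy_undecidable):
        risk = oplus(risk, term)
    return risk


def gate_green(density_depth_le1, compensating_controls_active):
    """Green only with no shallow unknowns, or with compensating controls."""
    return density_depth_le1 == 0 or compensating_controls_active
```

If the lattice uses a different join (for example max), only `oplus` would need to change.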
Storage & plumbing
- Store: append‑only KV (Badger/Rocks) + Graph overlay (SQLite/Neo4j—your call).
- Emit: DSSE‑signed “Unknowns Attestation” per scan for replayable audits.
- UI: heatmap per artifact (unknowns by type × depth), drill‑down to evidence.
First 2‑day slice
- Define unknown_type enum + hashable unknown_id.
- Wire Scanner/Sbomer/Vexer to emit unknowns (start with: identity_gap, missing_edge).
- Persist + expose /metrics (density, by depth and type).
- In Policy Studio, add the Unknowns Pressure term with default α/β/γ.
If you want, I’ll draft the exact protobuf/JSON schema and drop in .NET 10 record types + an EF model, plus a tiny CLI to query and a Grafana panel JSON.
I will treat “it” as the whole vision behind Pushing Binary Reachability Toward True Determinism inside Stella Ops: function-/symbol-level reachability for binaries and higher-level languages, wired into Scanner, Cartographer, Signals, and VEX.
Below is an implementation-oriented architecture plan you can hand directly to agents.
1. Scope, goals, and non-negotiable invariants
1.1. Scope
Deliver a deterministic reachability pipeline for containers that:
- Builds call graphs and symbol usage maps for:
  - Native binaries (ELF, PE, Mach-O), primary for this branch.
  - Scripted/VM languages later: JS, Python, PHP (as part of the same architecture).
- Maps symbols and functions to:
  - Packages (purls).
  - Vulnerabilities (CVE → symbol/function list via Concelier/VEX data).
- Computes deterministic reachability states for each (vulnerability, artifact) pair.
- Emits:
  - Machine-readable JSON (with purls).
  - Graph overlays for Cartographer.
  - Inputs for the lattice/trust engine and VEXer/Excitor.
1.2. Invariants
- Deterministic replay: given the same inputs, the same reachability outputs must be produced, bit-for-bit. The inputs are:
  - Image digest(s),
  - Analyzer versions,
  - Config + policy,
  - Runtime trace inputs (if any).
- Idempotent, additive APIs:
  - No versioning of endpoints, only additive/optional fields.
  - Same request = same response, no side effects besides storing/caching.
- Lattice logic runs in Scanner.WebService:
  - All “reachable/unreachable/unknown” and confidence merging lives in Scanner, not Concelier/Excitors.
- Preserve prune source:
  - Concelier and Excitors preserve provenance and do not “massage” reachability; they only consume it.
- Offline, air-gap friendly:
  - No mandatory external calls; dependency on local analyzers and local advisory/VEX cache.
2. High-level pipeline
From container image to reachability output:
- Image enumeration: Scanner.WebService receives an image ref or tarball and spawns an analysis run.
- Binary discovery & classification: binary analyzers detect ELF/PE/Mach-O + main interpreters (python, node, php) and scripts.
- Symbolization & call graph building
  - For each binary/module, we produce:
    - Symbol table (exported + imported).
    - Call graph edges (function-level where possible).
  - For dynamic languages, we later plug in appropriate analyzers.
- Symbol→package mapping
  - Match symbols to packages and purls using:
    - Known vendor symbol maps (from Concelier / Feedser).
    - Heuristics, path patterns, build IDs.
- Vulnerability→symbol mapping
  - From Concelier/VEX/CSAF: map each CVE to the set of symbols/functions it affects.
- Reachability solving
  - For each (CVE, artifact):
    - Determine presence and reachability of affected symbols from known entrypoints.
    - Merge static call graph and runtime signals (if available) via deterministic lattice.
- Output & storage
  - Reachability JSON with purls and confidence.
  - Graph overlay into Cartographer.
  - Signals/events for downstream scoring.
  - DSSE-signed reachability attestation for replay/audit.
3. Component architecture
3.1. New and extended services
- StellaOps.Scanner.WebService (extended)
  - Orchestration of reachability analyses.
  - Lattice/merging engine.
  - Idempotent reachability APIs.
- StellaOps.Scanner.Analyzers.Binary.* (new)
  - …Binary.Discovery: file type detection, ELF/PE/Mach-O parsing.
  - …Binary.Symbolizer: resolves symbols, imports/exports, relocations.
  - …Binary.CallGraph.Native: builds call graphs where possible (via disassembly/CFG).
  - …Binary.CallGraph.DynamicStubs: heuristics for indirect calls, PLT/GOT, vtables.
- StellaOps.Scanner.Analyzers.Script.* (future extension)
  - …Lang.JavaScript.CallGraph
  - …Lang.Python.CallGraph
  - …Lang.Php.CallGraph
  - These emit the same generic call-graph IR.
- StellaOps.Reachability.Engine (within Scanner.WebService)
  - Normalizes all call graphs into a common IR.
  - Merges static and dynamic evidence.
  - Computes reachability states and scores.
- StellaOps.Cartographer.ReachabilityOverlay (new overlay module)
  - Stores per-artifact call graphs and reachability tags.
  - Provides graph queries for UI and policy tools.
- StellaOps.Signals (extended)
  - Ingests runtime call traces (e.g., from EventPipe/JFR/eBPF in other branches).
  - Feeds function-hit events into the Reachability Engine.
- Unknowns Registry integration (optional but recommended)
  - Stores unresolved symbol/package mappings and incomplete edges as unknowns.
  - Used to adjust risk scores (“Unknowns Pressure”) when binary analysis is incomplete.
4. Detailed design by layer
4.1. Static analysis layer (binaries)
4.1.1. Binary discovery
Module: StellaOps.Scanner.Analyzers.Binary.Discovery
- Inputs:
  - Per-image file list (from existing Scanner).
  - Byte slices of candidate binaries.
- Logic:
  - Detect ELF/PE/Mach-O via magic bytes, not extensions.
  - Classify as:
    - Main executable
    - Shared library
    - Plugin/module
- Output:
  - binary_manifest.json per image:
{
  "image_ref": "registry/app@sha256:…",
  "binaries": [
    {
      "id": "bin:elf:/usr/local/bin/app",
      "path": "/usr/local/bin/app",
      "format": "elf",
      "arch": "x86_64",
      "role": "executable"
    }
  ]
}
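Magic-byte detection can be sketched in a few lines of Python. This is an illustration under assumptions rather than the Discovery module itself: `detect_format` is a hypothetical helper, it only inspects the leading bytes, and it deliberately skips fat/universal Mach-O files because their 0xCAFEBABE magic collides with Java class files and needs extra disambiguation.

```python
import struct

# Thin Mach-O magics: 32-bit and 64-bit, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",
}


def detect_format(data):
    """Return 'elf', 'pe', 'macho', or None, based on magic bytes only."""
    if data[:4] == b"\x7fELF":
        return "elf"
    if data[:4] in MACHO_MAGICS:
        return "macho"
    if data[:2] == b"MZ" and len(data) >= 0x40:
        # The DOS header field e_lfanew (offset 0x3C) points at "PE\0\0".
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "pe"
    return None
```

Checking the PE signature behind the MZ stub avoids classifying plain DOS executables or arbitrary files that happen to start with "MZ" as PE.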
4.1.2. Symbolization
Module: StellaOps.Scanner.Analyzers.Binary.Symbolizer
- Uses:
  - ELF/PE/Mach-O parsers (internal or third-party), no external calls.
- Output per binary:
{
  "binary_id": "bin:elf:/usr/local/bin/app",
  "build_id": "buildid:abcd…",
  "exports": ["pkg1::ClassA::method1", "..."],
  "imports": ["openssl::EVP_EncryptInit_ex", "..."],
  "sections": { "text": { "va": "0x...", "size": 12345 } }
}
- Writes unresolved symbol sets to Unknowns Registry when:
  - Imports cannot be tied to known packages or symbols.
4.1.3. Call graph construction
Module: StellaOps.Scanner.Analyzers.Binary.CallGraph.Native
- Core tasks:
  - Build control-flow graphs (CFG) for each function via:
    - Disassembly.
    - Basic block detection.
  - Identify direct calls (call func) and indirect calls (function pointers, vtables).
- IR model:
{
  "binary_id": "bin:elf:/usr/local/bin/app",
  "functions": [
    { "fid": "func:app::main", "va": "0x401000", "size": 128 },
    { "fid": "func:libssl::EVP_EncryptInit_ex", "external": true }
  ],
  "edges": [
    { "caller": "func:app::main", "callee": "func:app::init_config", "type": "direct" },
    { "caller": "func:app::main", "callee": "func:libssl::EVP_EncryptInit_ex", "type": "import" }
  ]
}
- Edge confidence:
  - type: direct|import|indirect|heuristic
  - Used later by the lattice.
4.1.4. Entry point inference
- Sources:
  - ELF e_entry (the header entry point; PT_INTERP only names the dynamic loader), PE AddressOfEntryPoint.
  - Application-level hints (known frameworks, service main methods).
  - Container metadata (CMD, ENTRYPOINT).
- Output:
{ "binary_id": "bin:elf:/usr/local/bin/app", "entrypoints": ["func:app::main"] }
Note: For JS/Python/PHP, equivalent analyzers will later define module entrypoints (index.js, wsgi_app, public/index.php).
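With entrypoints and IR edges in hand, static reachability reduces to a graph traversal. The Python sketch below runs a breadth-first search over edges shaped like the IR model above; `reachable_functions` is a hypothetical helper, and it treats every edge type as equally traversable, which a real solver would likely refine using the edge confidence.

```python
from collections import deque


def reachable_functions(edges, entrypoints):
    """Breadth-first search over call-graph IR edges, starting at the entrypoints."""
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge["caller"], []).append(edge["callee"])

    seen = set(entrypoints)
    # Sorting keeps the visit order stable across runs, which helps replayable logs.
    queue = deque(sorted(entrypoints))
    while queue:
        current = queue.popleft()
        for callee in sorted(adjacency.get(current, [])):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen
```

An affected symbol is then statically reachable when its function ID appears in the returned set.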
4.2. Symbol-to-package and CVE-to-symbol mapping
4.2.1. Symbol→package mapping
Module: StellaOps.Reachability.Mapping.SymbolToPurl
- Inputs:
  - Binary symbolization outputs.
  - Local mapping DB in Concelier (vendor symbol maps, debug info, name patterns).
  - File path + container context (/usr/lib/..., /site-packages/...).
- Output:
{
  "symbol": "libssl::EVP_EncryptInit_ex",
  "purl": "pkg:apk/alpine/openssl@3.1.5-r2",
  "confidence": 0.93,
  "method": "vendor_map+path_heuristic"
}
- Unresolved / ambiguous symbols:
  - Stored as unknowns of type identity_gap.
4.2.2. CVE→symbol mapping
Responsibility: Concelier + its advisory ingestion.
- For each vulnerability:
{
  "cve_id": "CVE-2025-12345",
  "purl": "pkg:apk/alpine/openssl@3.1.5-r2",
  "affected_symbols": [
    "libssl::EVP_EncryptInit_ex",
    "libssl::EVP_EncryptUpdate"
  ],
  "source": "vendor_vex",
  "confidence": 1.0
}
- Reachability Engine consumes this mapping read-only.
4.3. Reachability Engine
Module: StellaOps.Reachability.Engine (in Scanner.WebService)
4.3.1. Core data model
Per (artifact, cve, purl):
{
"artifact": { "type": "oci.image", "ref": "registry/app@sha256:…" },
"cve_id": "CVE-2025-12345",
"purl": "pkg:apk/alpine/openssl@3.1.5-r2",
"symbols": [
{
"symbol": "libssl::EVP_EncryptInit_ex",
"static_presence": "present|absent|unknown",
"static_reachability": "reachable|unreachable|unknown",
"runtime_hits": 3,
"runtime_reachability": "observed|not_observed|unknown"
}
],
"reachability_state": "runtime_observed|statically_reachable|present_not_reachable|not_present|unknown",
"confidence": {
"p": 0.87,
"evidence": ["static_callgraph", "runtime_trace", "symbol_map"],
"unknowns_pressure": 0.12
}
}
4.3.2. Lattice / state machine
Define a deterministic lattice over states:
- NOT_PRESENT
- PRESENT_NOT_REACHABLE
- STATICALLY_REACHABLE
- RUNTIME_OBSERVED
“Unknown” flags are overlaid on these states when evidence is missing.
Merging rules (simplified):
- If NOT_PRESENT and no conflicting evidence → NOT_PRESENT.
- If at least one affected symbol is on a static path from any entrypoint → STATICALLY_REACHABLE.
- If symbol observed at runtime → RUNTIME_OBSERVED (top state).
- If symbol present in binary but not on any static path → PRESENT_NOT_REACHABLE, unless unknown edges exist near it (then downgrade with lower confidence).
- Unknowns Registry entries near affected symbols increase unknowns_pressure and may push from NOT_PRESENT to UNKNOWN.
Implementation: pure functional merge functions inside Scanner.WebService:
ReachabilityState Merge(ReachabilityState a, ReachabilityState b);
ReachabilityState FromEvidence(StaticEvidence s, RuntimeEvidence r, UnknownsPressure u);
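To make the merge semantics concrete, here is a Python sketch of the two functions. It assumes the four states form a total order with RUNTIME_OBSERVED on top, so the merge is simply the higher of the two; the "unknown" overlay and unknowns pressure are left out, and the simplified `from_evidence` parameters are illustrative rather than the real evidence types.

```python
from enum import IntEnum


class ReachabilityState(IntEnum):
    """States ordered from weakest to strongest evidence of exposure (assumed order)."""
    NOT_PRESENT = 0
    PRESENT_NOT_REACHABLE = 1
    STATICALLY_REACHABLE = 2
    RUNTIME_OBSERVED = 3


def merge(a, b):
    """Lattice join: the higher state wins. Commutative, associative, idempotent."""
    return max(a, b)


def from_evidence(present, on_static_path, runtime_hits):
    """Map simplified evidence for one symbol onto a state."""
    if runtime_hits > 0:
        return ReachabilityState.RUNTIME_OBSERVED
    if present and on_static_path:
        return ReachabilityState.STATICALLY_REACHABLE
    if present:
        return ReachabilityState.PRESENT_NOT_REACHABLE
    return ReachabilityState.NOT_PRESENT
```

Because the join is order-independent, per-symbol states can be folded in any sequence and still yield the same result for the (artifact, cve) pair, which is what deterministic replay needs.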
4.3.3. Deterministic inputs
To guarantee replay:
- Build Reachability Plan Manifest per run:
{
  "plan_id": "reach:sha256:…",
  "scanner_version": "1.4.0",
  "analyzers": {
    "binary_discovery": "1.0.0",
    "binary_symbolizer": "1.1.0",
    "binary_callgraph": "1.2.0"
  },
  "inputs": {
    "image_digest": "sha256:…",
    "runtime_trace_files": ["signals:run:2025-11-18T12:00:00Z"],
    "config": {
      "assume_indirect_calls": "conservative",
      "max_call_depth": 10
    }
  }
}
- DSSE-sign the plan + result.
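The signing step might look like the following Python sketch. The pre-authentication encoding follows the DSSE format ("DSSEv1", then length-prefixed payload type and payload); everything else is an assumption made for illustration: the payload type string is hypothetical, canonicalization is plain sorted-key JSON, and HMAC-SHA256 stands in for whatever signature scheme the deployment actually uses.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.stellaops.reachability+json"  # hypothetical type


def canonical_bytes(obj):
    """Sorted-key, whitespace-free JSON so equal content yields equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def pae(payload_type, payload):
    """DSSE pre-authentication encoding: 'DSSEv1 <len> <type> <len> <payload>'."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def sign_envelope(plan, result, key, keyid):
    """Build a DSSE-shaped envelope over plan + result (HMAC is a stand-in signer)."""
    payload = canonical_bytes({"plan": plan, "result": result})
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": PAYLOAD_TYPE,
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }
```

Signing the canonical bytes rather than whatever the serializer happened to emit is what lets a replayed run reproduce the same envelope.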
4.4. Storage and graph overlay
4.4.1. Reachability store
Backend: re-use existing Scanner/Cartographer storage stack (e.g., Postgres or SQLite + blob store).
Tables/collections:
- reachability_runs
  - plan_id, image_ref, created_at, scanner_version.
- reachability_results
  - plan_id, cve_id, purl, state, confidence_p, unknowns_pressure, payload_json.
- Indexes on (image_ref, cve_id), (image_ref, purl).
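As one possible concrete shape, here is a SQLite version of the two tables using Python's standard sqlite3 module. The column types, the composite primary key, and the index names are assumptions; in particular, image_ref is copied into reachability_results so the two indexes named above can be built without a join.

```python
import sqlite3

SCHEMA = """
CREATE TABLE reachability_runs (
    plan_id         TEXT PRIMARY KEY,
    image_ref       TEXT NOT NULL,
    created_at      TEXT NOT NULL,
    scanner_version TEXT NOT NULL
);
CREATE TABLE reachability_results (
    plan_id           TEXT NOT NULL REFERENCES reachability_runs(plan_id),
    image_ref         TEXT NOT NULL,  -- denormalized for the indexes (assumption)
    cve_id            TEXT NOT NULL,
    purl              TEXT NOT NULL,
    state             TEXT NOT NULL,
    confidence_p      REAL NOT NULL,
    unknowns_pressure REAL NOT NULL,
    payload_json      TEXT NOT NULL,
    PRIMARY KEY (plan_id, cve_id, purl)
);
CREATE INDEX idx_results_image_cve  ON reachability_results (image_ref, cve_id);
CREATE INDEX idx_results_image_purl ON reachability_results (image_ref, purl);
"""


def open_store(path=":memory:"):
    """Open (or create) the store and apply the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

With the composite primary key, writing results via INSERT OR IGNORE keeps repeated runs of the same plan from duplicating rows.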
4.4.2. Cartographer overlay
Edges:
- IMAGE → BINARY → FUNCTION → PACKAGE → CVE
- Extra property on IMAGE -[AFFECTED_BY]-> CVE:
  - reachability_state
  - reachability_plan_id
Enables queries:
- “Show me all CVEs with STATICALLY_REACHABLE in this namespace.”
- “Show me binaries with high density of reachable crypto CVEs.”
4.5. APIs (idempotent, additive)
4.5.1. Trigger reachability
POST /reachability/runs
Request:
{
"artifact": { "type": "oci.image", "ref": "registry/app@sha256:…" },
"config": {
"include_languages": ["binary"],
"max_call_depth": 10,
"assume_indirect_calls": "conservative"
}
}
Response:
{ "plan_id": "reach:sha256:…" }
- Idempotent key: (image_ref, config_hash). Subsequent calls return the same plan_id.
4.5.2. Fetch results
GET /reachability/runs/:plan_id
{
"plan": { /* reachability plan manifest */ },
"results": [
{
"cve_id": "CVE-2025-12345",
"purl": "pkg:apk/alpine/openssl@3.1.5-r2",
"reachability_state": "statically_reachable",
"confidence": { "p": 0.84, "unknowns_pressure": 0.1 }
}
]
}
4.5.3. Per-CVE view for VEXer/Excitor
GET /reachability/by-cve?artifact=…&cve_id=…
- Returns filtered result for downstream VEX creation.
All APIs are read-only except for the side effect of storing/caching runs.
5. Interaction with other Stella Ops modules
5.1. Concelier
- Provides:
  - CVE→purl→symbol mapping.
  - Vendor VEX statements indicating affected functions.
- Consumes:
  - Nothing from reachability directly; Scanner.WebService passes the reachability summary to VEXer/Excitor, which merges it with vendor statements.
5.2. VEXer / Excitor
- Input:
  - For each (artifact, cve):
    - Reachability state.
    - Confidence.
- Logic:
  - Translate states to VEX statements:
    - NOT_PRESENT → not_affected
    - PRESENT_NOT_REACHABLE → not_affected (with justification “code not reachable according to analysis”)
    - STATICALLY_REACHABLE → affected
    - RUNTIME_OBSERVED → affected (higher severity)
  - Attach determinism proof:
    - Plan ID + DSSE of reachability run.
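The translation table above is small enough to express directly. In this Python sketch the status values and the two justification labels are the standard VEX ones, but pairing them with each state this way is an assumption, as are the statement field names and the fallback to under_investigation for states outside the lattice.

```python
# Assumed mapping from lattice state to (VEX status, justification).
VEX_BY_STATE = {
    "NOT_PRESENT": ("not_affected", "component_not_present"),
    "PRESENT_NOT_REACHABLE": ("not_affected", "vulnerable_code_not_in_execute_path"),
    "STATICALLY_REACHABLE": ("affected", None),
    "RUNTIME_OBSERVED": ("affected", None),
}


def to_vex_statement(cve_id, state, plan_id, dsse_digest):
    """Translate one reachability result into a VEX-style statement dict."""
    status, justification = VEX_BY_STATE.get(state, ("under_investigation", None))
    statement = {
        "vulnerability": cve_id,
        "status": status,
        # Determinism proof: which plan and which signed run back this claim.
        "evidence": {
            "reachability_state": state,
            "plan_id": plan_id,
            "dsse_digest": dsse_digest,
        },
        # Lets consumers rank runtime-confirmed findings higher.
        "runtime_observed": state == "RUNTIME_OBSERVED",
    }
    if justification is not None:
        statement["justification"] = justification
    return statement
```

Keeping the table as data rather than branching logic makes it easy to audit and to replay against older runs.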
5.3. Signals
- Provides:
  - Function hit events: (binary_id, function_id, timestamp) aggregated per image.
- Reachability Engine:
  - Marks runtime_hits and state RUNTIME_OBSERVED for symbols with hits.
- Unknowns:
  - If runtime sees hits in functions with no static edges to entrypoints (or unmapped symbols), these produce Unknowns and increase unknowns_pressure.
5.4. Unknowns Registry
- From reachability pipeline, create Unknowns when:
  - Symbol→package mapping is ambiguous.
  - CVE→symbol mapping exists, but symbol cannot be found in binaries.
  - Call graph has indirect calls that cannot be resolved.
- The “Unknowns Pressure” term is fed into:
  - Reachability confidence.
  - Global risk scoring (Trust Algebra Studio).
6. Implementation phases and engineering plan
Phase 0 – Scaffolding & manifests (1 sprint)
- Create:
  - StellaOps.Reachability.Engine skeleton.
  - Reachability Plan Manifest schema.
  - Reachability Run + Result persistence.
- Add /reachability/runs and /reachability/runs/:plan_id endpoints, returning mock data.
- Wire DSSE attestation generation for reachability results (even if payload is empty).
Phase 1 – Binary discovery + symbolization (1–2 sprints)
- Implement Binary.Discovery and Binary.Symbolizer.
- Feed symbol tables into Reachability Engine as “presence-only evidence”:
  - States: NOT_PRESENT vs PRESENT_NOT_REACHABLE vs UNKNOWN.
- Integrate with Concelier’s CVE→purl mapping (no symbol-level yet):
  - For CVEs affecting a package present in the image, mark as PRESENT_NOT_REACHABLE.
- Emit Unknowns for unresolved binary roles and ambiguous package mapping.
Deliverable: package-level reachability with deterministic manifests.
Phase 2 – Binary call graphs & entrypoints (2–3 sprints)
- Implement Binary.CallGraph.Native:
  - CFG + direct call edges.
- Implement entrypoint inference from binary + container ENTRYPOINT/CMD.
- Add static reachability algorithm:
  - DFS/BFS from entrypoints through call graph.
  - Mark affected symbols as reachable if found on paths.
- Extend Concelier to ingest symbol-aware vulnerability metadata (for pilots; can be partial).
Deliverable: function-level static reachability for native binaries where symbol maps exist.
Phase 3 – Runtime integration (2 sprints, may be in parallel workstream)
- Integrate Signals runtime evidence:
  - Define schema for function hit events.
  - Add ingestion path into Reachability Engine.
- Update lattice:
  - Promote symbols to RUNTIME_OBSERVED when hits exist.
- Extend DSSE attestation to reference runtime evidence URIs (hashes of trace inputs).
Deliverable: static + runtime-confirmed reachability.
Phase 4 – Unknowns & pressure (1 sprint)
- Wire Unknowns Registry:
  - Emit unknowns from Symbolizer and CallGraph (identity gaps, missing edges).
  - Compute unknowns_pressure per (artifact, cve) as density of unknowns near affected symbols.
- Adjust confidence calculation in Reachability Engine.
- Expose unknowns metrics in API and Cartographer.
Deliverable: explicit modelling of uncertainty, feeding into trust/lattice.
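"Density of unknowns near affected symbols" leaves room for interpretation. One possible reading, sketched in Python below, takes every node within a fixed number of hops of an affected symbol (ignoring edge direction) and returns the fraction of those nodes flagged as unknown. The radius, the undirected neighborhood, and the function name are all assumptions of this sketch.

```python
from collections import deque


def unknowns_pressure(edges, affected, unknown_nodes, radius=1):
    """Fraction of nodes within `radius` hops of affected symbols that are unknown."""
    neighbors = {}
    for caller, callee in edges:
        neighbors.setdefault(caller, set()).add(callee)
        neighbors.setdefault(callee, set()).add(caller)

    # Multi-source BFS, bounded by the radius.
    dist = {symbol: 0 for symbol in affected}
    queue = deque(sorted(affected))
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    if not dist:
        return 0.0
    return len(set(dist) & set(unknown_nodes)) / len(dist)
```

The result stays in [0, 1], so it can be reported directly as the unknowns_pressure field of a result.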
Phase 5 – Language extensions (JS/Python/PHP) (ongoing)
- Implement per-language call-graph analyzers creating the same IR as binary.
- Extend symbol→purl mapping for these ecosystems (npm, PyPI, Packagist).
- Update reachability solver to include multi-language edges (e.g., Python calling into native modules).
7. Minimal contracts for agents
To hand off to agents, you can codify:
- IR schemas
  - Call graph IR.
  - Reachability Result JSON.
  - Reachability Plan Manifest.
- API contracts
  - POST /reachability/runs
  - GET /reachability/runs/:plan_id
  - GET /reachability/by-cve
- Module boundaries
  - Scanner.Analyzers.Binary.* produce IR only; NO network calls.
  - Reachability.Engine is the only place where lattice logic lives.
  - Concelier is read-only for reachability; no custom logic there.
- Determinism practices
  - All algorithmic randomness is banned; where unavoidable, seed with values derived from plan_id.
  - All external inputs must be listed in the Plan Manifest.
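Where randomness truly cannot be avoided (sampling, tie-breaking), the seeding rule could be applied as in this Python sketch. The `rng_for_plan` helper and the `purpose` label are hypothetical; the point is only that the seed is a pure function of the plan_id, never of time or process state.

```python
import hashlib
import random


def rng_for_plan(plan_id, purpose):
    """Return a private RNG whose seed depends only on plan_id and a purpose label."""
    digest = hashlib.sha256(f"{plan_id}|{purpose}".encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    # A dedicated Random instance avoids touching the shared global generator.
    return random.Random(seed)
```

Using a separate purpose label per call site keeps one component's draws from shifting another's when code paths change.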
If you like, as a next step I can draft:
- Concrete C# record types for the IRs.
- A small pseudo-code implementation of the lattice functions and static reachability DFS.
- A proposed directory layout under src/StellaOps.Scanner and src/StellaOps.Cartographer.