Here’s a practical, from‑scratch blueprint for a two‑stage reachability map that turns low‑level runtime facts into auditable, reproducible evidence for triage and VEX decisions.
What this is (plain English)
- Goal: prove (or rule out) whether a vulnerable function/package could actually run in your build and deployment.
- How:
  - extract binary‑level call targets (what functions your program could call),
  - map those targets onto symbol graphs (named functions/classes/modules),
  - correlate those symbols with SBOM components (which package/image layer they live in),
  - store each “slice” of reachability as a signed attestation so anyone can replay and verify it.
Stage A — Binary → Symbol graph
- Inputs: built artifacts (ELF/COFF/Mach‑O), debug symbols (when available), stripped bins, and language runtimes.
- Process (per artifact):
  - Parse binaries (headers, sections, symbol tables, relocations).
  - Recover call edges:
    - Direct calls: disassemble; record `caller -> callee`.
    - Indirect calls: resolve via PLT/IAT/vtables; fall back to conservative points‑to sets.
    - Dynamic loading: log `dlopen`/`LoadLibrary` + exported symbol usage heuristics.
  - Normalize to Symbol Graph: nodes = `{binary, symbol, addr, hash}`, edges = `CALLS`.
- Outputs: `symbol-graph.jsonl` (+ compact binary form), content‑addressed by hash.
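The normalization and content-addressing step could look roughly like the sketch below. It assumes call edges have already been recovered by a disassembler and arrive as plain tuples; the function names (`symbol_id`, `build_symbol_graph`, `to_jsonl`, `content_address`) and the exact JSONL record layout are hypothetical, chosen only to illustrate how sorting and canonical JSON make the output independent of scan order.

```python
import hashlib
import json


def symbol_id(binary_hash: str, symbol: str, addr: int) -> str:
    """Derive a stable node id from content only, never from scan order."""
    raw = f"{binary_hash}:{symbol}:{addr:#x}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]


def build_symbol_graph(binary: str, binary_hash: str, calls):
    """calls: iterable of (caller, caller_addr, callee, callee_addr, kind)."""
    nodes, edges = {}, set()
    for caller, caller_addr, callee, callee_addr, kind in calls:
        ids = []
        for symbol, addr in ((caller, caller_addr), (callee, callee_addr)):
            sid = symbol_id(binary_hash, symbol, addr)
            nodes[sid] = {"id": sid, "binary": binary, "symbol": symbol,
                          "addr": addr, "hash": binary_hash}
            ids.append(sid)
        edges.add((ids[0], ids[1], kind))
    return nodes, edges


def to_jsonl(nodes, edges) -> bytes:
    """One canonical JSON object per line, lines sorted for determinism."""
    records = [{"type": "node", **n} for n in nodes.values()]
    records += [{"type": "CALLS", "src": s, "dst": d, "kind": k}
                for s, d, k in edges]
    lines = sorted(json.dumps(r, sort_keys=True, separators=(",", ":"))
                   for r in records)
    return ("\n".join(lines) + "\n").encode()


def content_address(blob: bytes) -> str:
    """Address the serialized graph by the hash of its bytes."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


# Illustrative input only: symbol names and addresses are made up.
calls = [
    ("main", 0x1000, "handle_request", 0x1200, "direct"),
    ("handle_request", 0x1200, "foo_decrypt", 0x2000, "plt"),
]
blob = to_jsonl(*build_symbol_graph("service", "ab" * 32, calls))
digest = content_address(blob)
```

Because records are sorted after serialization, feeding the same call edges in a different order should yield the same bytes and therefore the same digest.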
Stage B — Symbol graph ↔ SBOM components
- Inputs: CycloneDX/SPDX SBOM for the image/build; file→component mapping (path→pkg).
- Process:
  - For each symbol: derive file path (or Build‑ID) → map to SBOM component/version/layer.
  - Build Component Reachability Graph:
    - nodes = `{component@version}`, edges = “component provides symbol X used by Y”.
    - annotate with file hashes, Build‑IDs, container layer digests.
- Outputs: `reachability-slices/COMPONENT@VERSION.slice.json` (per impacted component).
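As a rough illustration of Stage B, the sketch below lifts symbol-level edges to component-level edges using a file-hash to purl lookup. The input shapes, the `component_graph` name, and the example purls are assumptions made for this sketch; a real implementation would read the mapping from the CycloneDX/SPDX document and also carry layer digests and Build‑IDs.

```python
from collections import defaultdict


def component_graph(symbols, edges, file_to_component):
    """Lift symbol edges to "provider supplies symbol X used by user" edges.

    symbols: {symbol_id: {"name": str, "file_sha256": str}}
    edges: iterable of (src_symbol_id, dst_symbol_id)
    file_to_component: {file_sha256: purl}
    Returns (component_edges, unmapped_symbol_ids).
    """
    comp_edges = defaultdict(set)
    unmapped = set()

    def owner(sid):
        purl = file_to_component.get(symbols[sid]["file_sha256"])
        if purl is None:
            unmapped.add(sid)  # keep unknowns explicit instead of dropping them
        return purl

    for src, dst in edges:
        user, provider = owner(src), owner(dst)
        if user is None or provider is None or user == provider:
            continue  # skip unmapped endpoints and intra-component calls
        comp_edges[(provider, user)].add(symbols[dst]["name"])

    ordered = {key: sorted(names) for key, names in sorted(comp_edges.items())}
    return ordered, sorted(unmapped)


# Illustrative data only: hashes are shortened and purls are hypothetical.
symbols = {
    "s1": {"name": "handle_request", "file_sha256": "aa"},
    "s2": {"name": "foo_decrypt", "file_sha256": "bb"},
    "s3": {"name": "mystery_fn", "file_sha256": "cc"},
}
edges = [("s1", "s2"), ("s1", "s3")]
file_to_component = {
    "aa": "pkg:generic/service@1.0.0",
    "bb": "pkg:generic/libfoo@2.1.0",
}
graph, unmapped = component_graph(symbols, edges, file_to_component)
```

Symbols whose file has no SBOM entry are returned separately so they can be counted against an unknowns budget later rather than silently ignored.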
Attestable “slice” (the evidence object)
Each slice is a minimal proof unit answering: “This vulnerable symbol is (or isn’t) on a feasible path at runtime in build X.”
- Contents:
  - Scan manifest (tool versions, ruleset hashes, feed versions).
  - Input digests (binaries, SBOM, container layers).
  - The subgraph (only nodes/edges needed).
  - Query + result (e.g., “is `openssl:EVP_PKEY_decrypt` reachable from any exported entrypoint?”).
- Format: DSSE + in‑toto statement, stored as OCI artifact or file; deterministic (same inputs → same bytes).
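A minimal sketch of wrapping a slice in an in‑toto statement and a DSSE envelope follows. The pre-authentication encoding (`DSSEv1 <len> <type> <len> <body>`) and the envelope fields follow the DSSE specification, and the statement fields follow in‑toto Statement v1. Everything else is an assumption for illustration: the predicate type URI is hypothetical, sorted-key JSON stands in for whatever canonical form the project picks, and HMAC‑SHA256 is only a placeholder for a real signing key held by the signing service.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"
# Hypothetical predicate type; the source does not define one.
PREDICATE_TYPE = "https://example.invalid/reachability-slice/v1"


def canonical(obj) -> bytes:
    """Sorted keys, no whitespace: same object, same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that actually get signed."""
    t = payload_type.encode()
    return b"DSSEv1 %d %s %d %s" % (len(t), t, len(payload), payload)


def make_envelope(predicate: dict, subject_name: str, subject_sha256: str,
                  key: bytes, keyid: str) -> dict:
    statement = {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": subject_name, "digest": {"sha256": subject_sha256}}],
        "predicateType": PREDICATE_TYPE,
        "predicate": predicate,
    }
    payload = canonical(statement)
    # Placeholder signature: HMAC is NOT what a production signer would use.
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payload": base64.b64encode(payload).decode(),
        "payloadType": PAYLOAD_TYPE,
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode()}],
    }


# Illustrative slice content only.
slice_predicate = {
    "inputs": {"sbom_sha256": "cd" * 32},
    "query": {"symbol": "openssl:EVP_PKEY_decrypt"},
    "subgraph": {"symbols": [], "edges": []},
    "verdict": "unreachable",
}
envelope = make_envelope(slice_predicate, "service-image", "ef" * 32,
                         key=b"demo-key", keyid="demo")
```

Signing the PAE bytes rather than the raw JSON binds the payload type to the signature, and canonical serialization keeps the envelope identical across runs with the same inputs.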
Triage flow (how it helps today)
- Given CVE → map to symbols/functions → check reachability slice:
  - Reachable path found: mark “affected (reachable)”, include call chain and components; raise priority.
  - No path / gated by feature flag: mark “not affected (unreachable/mitigated)”, with proof chain.
  - Unknowns present: fail‑safe policy (e.g., “unknowns > N → block prod”) with explicit unknown edges listed.
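The three triage branches can be expressed as a small decision function. This is a sketch under stated assumptions: the `triage` name, the returned field names, and the `under_investigation` status for the unknowns case are illustrative choices, not something the source prescribes, and the ordering (unknowns budget checked before accepting an “unreachable” verdict) is one possible fail‑safe reading.

```python
def triage(verdict, unknown_edges=(), max_unknowns=0, gate=None):
    """Map a slice verdict to a status plus a deploy decision."""
    if verdict not in ("reachable", "unreachable", "unknown"):
        raise ValueError(f"unexpected verdict: {verdict!r}")

    if verdict == "reachable":
        return {"status": "affected", "detail": "reachable",
                "raise_priority": True, "block_prod": False}

    over_budget = len(unknown_edges) > max_unknowns
    if verdict == "unknown" or over_budget:
        # Fail-safe: too many unresolved edges means no "not affected" claim.
        return {"status": "under_investigation", "detail": "unknowns",
                "raise_priority": False, "block_prod": over_budget,
                "unknown_edges": sorted(unknown_edges)}

    detail = f"mitigated: gated by {gate}" if gate else "unreachable"
    return {"status": "not_affected", "detail": detail,
            "raise_priority": False, "block_prod": False}


# Illustrative calls only.
reachable = triage("reachable")
gated = triage("unreachable", gate="FEATURE_X=false")
blocked = triage("unreachable",
                 unknown_edges=[("plugin_load", "?"), ("dispatch", "?")],
                 max_unknowns=1)
```

Returning the unknown edges alongside the decision keeps the block explainable: reviewers see exactly which edges exceeded the budget.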
Minimal data model (JSON hints)
- Symbol: `{ id, name, demangled, addr, file_sha256, build_id }`
- Edge: `{ src_symbol_id, dst_symbol_id, kind: "direct"|"plt"|"indirect" }`
- Mapping: `{ file_sha256|build_id -> component_purl, layer_digest, path }`
- Slice: `{ inputs:{…}, query:{…}, subgraph:{symbols:[…],edges:[…]}, verdict:"reachable"|"unreachable"|"unknown" }`
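One way to pin these shapes down is with frozen dataclasses, sketched below. Field names follow the hints above; the validation rules, the `key` field on `Mapping` (standing in for "file_sha256 or build_id"), and the `to_json` helper are assumptions added for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

EDGE_KINDS = ("direct", "plt", "indirect")
VERDICTS = ("reachable", "unreachable", "unknown")


@dataclass(frozen=True)
class Symbol:
    id: str
    name: str
    demangled: str
    addr: int
    file_sha256: str
    build_id: Optional[str] = None  # absent for binaries without a Build-ID


@dataclass(frozen=True)
class Edge:
    src_symbol_id: str
    dst_symbol_id: str
    kind: str

    def __post_init__(self):
        if self.kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {self.kind!r}")


@dataclass(frozen=True)
class Mapping:
    key: str  # file_sha256 or build_id (hypothetical field name)
    component_purl: str
    layer_digest: str
    path: str


@dataclass(frozen=True)
class Slice:
    inputs: dict
    query: dict
    symbols: tuple
    edges: tuple
    verdict: str

    def __post_init__(self):
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")

    def to_json(self) -> str:
        """Serialize with sorted keys so equal slices give equal text."""
        body = {
            "inputs": self.inputs,
            "query": self.query,
            "subgraph": {"symbols": [asdict(s) for s in self.symbols],
                         "edges": [asdict(e) for e in self.edges]},
            "verdict": self.verdict,
        }
        return json.dumps(body, sort_keys=True, separators=(",", ":"))


# Illustrative values only.
sym = Symbol("s2", "_Z11foo_decryptv", "foo_decrypt()", 0x2000, "bb" * 32)
example = Slice(inputs={}, query={"symbol": "foo_decrypt"},
                symbols=(sym,), edges=(), verdict="unknown")
```

Frozen dataclasses keep evidence objects immutable once built, which fits the replay requirement better than mutable dictionaries passed between stages.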
Determinism & replay
- Pin everything: disassembler version, rules, demangler options, container digests, SBOM doc hash, symbolization flags.
- Emit a Scan Manifest with content hashes; store alongside slices.
- Provide a `replay` command that re‑hydrates inputs and re‑computes the slice; byte‑for‑byte match required.
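The replay check itself can be small: recompute, compare bytes, and produce a readable diff on mismatch. In this sketch `compute_slice` is a stand-in for the real pipeline (it only echoes the manifest), and the `replay` signature and result shape are hypothetical.

```python
import difflib
import hashlib
import json


def canonical(obj) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def compute_slice(manifest: dict) -> bytes:
    """Stand-in for the real analysis: output depends only on pinned inputs."""
    return canonical({"inputs": manifest, "verdict": "unreachable"})


def replay(stored: bytes, recompute) -> dict:
    """Re-run the computation and require a byte-for-byte match."""
    fresh = recompute()
    if fresh == stored:
        return {"result": "match",
                "digest": hashlib.sha256(stored).hexdigest()}

    def pretty(blob: bytes):
        # Re-indent only for the diff; the comparison above uses raw bytes.
        return json.dumps(json.loads(blob), indent=2, sort_keys=True).splitlines()

    diff = list(difflib.unified_diff(pretty(stored), pretty(fresh),
                                     "stored", "replayed", lineterm=""))
    return {"result": "mismatch", "diff": diff}


# Illustrative manifest only: tool name and versions are made up.
manifest = {"disassembler": "1.2.3", "sbom_sha256": "cd" * 32}
stored = compute_slice(manifest)
same = replay(stored, lambda: compute_slice(manifest))
drifted = replay(stored, lambda: compute_slice({**manifest, "disassembler": "1.2.4"}))
```

A drifted tool version shows up as a one-line diff in the manifest section, which is usually enough to tell an environment problem apart from a genuine change in the verdict.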
Where this plugs into Stella Ops (suggested modules)
- Sbomer: component/file mapping & SBOM import.
- Scanner.webservice: binary parse & call‑graph extraction (keep lattice/policy elsewhere per your rule).
- Vexer/Policy Engine: consume slices as evidence for “affected/not‑affected” claims.
- Attestor/Authority: sign DSSE/in‑toto statements; push to OCI.
- Timeline/Notify: surface verdict deltas over time, link to slices.
Guardrails & fallbacks
- If stripped binaries: prefer Build‑ID + external symbol servers; else conservative over‑approx (mark unknown).
- For JIT/dynamic plugins: capture runtime traces (eBPF/ETW) and merge as observed edges with timestamps.
- Mixed‑lang stacks: unify by file hash + symbol name mangling rules per toolchain.
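For the JIT/dynamic-plugin fallback, merging trace-derived edges into the static graph might look like the sketch below. It assumes the trace collector (eBPF/ETW) has already been reduced to `(src, dst, timestamp)` tuples; the `merge_observed` name and the annotation fields are hypothetical.

```python
def merge_observed(static_edges, observed):
    """Annotate edges with provenance: static analysis, runtime trace, or both.

    static_edges: iterable of (src, dst)
    observed: iterable of (src, dst, iso_timestamp)
    """
    merged = {edge: {"static": True, "observed_at": []} for edge in static_edges}
    for src, dst, ts in observed:
        entry = merged.setdefault((src, dst), {"static": False, "observed_at": []})
        entry["observed_at"].append(ts)
    for entry in merged.values():
        entry["observed_at"].sort()  # keep output independent of trace order
    return dict(sorted(merged.items()))


# Illustrative data only.
static_edges = [("main", "handle_request"), ("handle_request", "foo_decrypt")]
observed = [
    ("handle_request", "plugin_entry", "2025-01-02T10:00:05Z"),
    ("main", "handle_request", "2025-01-02T10:00:01Z"),
    ("main", "handle_request", "2025-01-02T09:59:00Z"),
]
merged = merge_observed(static_edges, observed)
```

Edges seen only at runtime stay flagged with `static: False`, so a slice built from them can be labeled as observed rather than proven by static analysis.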
Quick implementation plan (6 sprints)
- Binary ingest: ELF/PE/Mach‑O parsing, Build‑ID hashing, symbol tables, PLT/IAT resolution.
- Call‑edge recovery: direct calls, basic indirect resolution, slice extractor by entrypoint.
- SBOM mapping: file→component map, layer digests, purl normalization.
- Evidence format: DSSE/in‑toto schema, deterministic manifests, OCI storage.
- Queries & policies: “is‑reachable?” API, unknowns budget, feature‑flag conditions, VEX plumbing.
- Runtime merge: optional eBPF/ETW traces → annotate edges, produce “observed‑path” slices.
Lightweight APIs (sketch)
- `POST /reachability/query { cve, symbols[], entrypoints[], policy }` -> slice + verdict
- `GET /slice/{digest}` -> attested slice
- `POST /replay { slice_digest }` -> match | mismatch (with diff)
Small example (CVE → symbol mapping)
- CVE‑XXXX‑YYYY → advisory lists function `foo_decrypt` in `libfoo.so`.
- We resolve the `libfoo.so` Build‑ID in the image, find symbols that match the demangled name, and build call paths from service entrypoints. If a path exists, the slice is “reachable” and carries the call chain (e.g., 3–7 hops); otherwise it is “unreachable” with reasons (no import, stripped at link‑time, dead code eliminated, or gated by `FEATURE_X=false`).
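The path search in this example is a plain breadth-first search from the entrypoints to the vulnerable symbol. The sketch below assumes the symbol graph is available as `(caller, callee)` pairs; the `find_call_chain` name and the intermediate function names are made up for illustration.

```python
from collections import deque


def find_call_chain(edges, entrypoints, target):
    """Return the shortest call chain from any entrypoint to target, or None."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    for callees in adjacency.values():
        callees.sort()  # deterministic traversal order

    queue = deque(sorted(entrypoints))
    parent = {entry: None for entry in queue}
    while queue:
        node = queue.popleft()
        if node == target:
            chain = []
            while node is not None:  # walk parents back to the entrypoint
                chain.append(node)
                node = parent[node]
            return chain[::-1]
        for callee in adjacency.get(node, []):
            if callee not in parent:
                parent[callee] = node
                queue.append(callee)
    return None


# Illustrative graph only.
edges = [
    ("main", "handle_request"),
    ("handle_request", "parse_token"),
    ("parse_token", "foo_decrypt"),
    ("debug_cli", "dump_state"),
]
chain = find_call_chain(edges, ["main"], "foo_decrypt")
verdict = "reachable" if chain else "unreachable"
```

With `main` as the only entrypoint the chain has three hops; querying a symbol that is only called from `debug_cli` returns `None`, which maps to an “unreachable” verdict as long as no unknown edges are involved.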
Costs (rough, for planning inside Stella Ops)
- Core parsing & graph: 3–4 engineer‑weeks
- Indirect calls & heuristics: +3–5 weeks
- SBOM mapping & layers: 2 weeks
- Attestations & OCI storage: 1–2 weeks
- Policy/VEX integration & UI surfacing: 2–3 weeks
- Runtime trace merge (optional): 2–4 weeks

(Parallelizable; add 25–40% for hardening/tests.)
If you want, I can turn this into:
- a concrete .NET 10 service skeleton (endpoints + data contracts),
- a DSSE/in‑toto schema for the slice, and
- a dev checklist for deterministic builds and replay harness.