git.stella-ops.org/docs/product-advisories/20-Dec-2025 - Layered binary + call‑stack reachability.md


Here's a practical, from-scratch blueprint for a two-stage reachability map that turns low-level runtime facts into auditable, reproducible evidence for triage and VEX decisions.


What this is (plain English)

  • Goal: prove (or rule out) whether a vulnerable function/package could actually run in your build and deployment.

  • How:

    1. extract binary-level call targets (what functions your program could call),
    2. map those targets onto symbol graphs (named functions/classes/modules),
    3. correlate those symbols with SBOM components (which package/image layer they live in),
    4. store each “slice” of reachability as a signed attestation so anyone can replay and verify it.

Stage A — Binary → Symbol graph

  • Inputs: built artifacts (ELF/COFF/Mach-O), debug symbols (when available), stripped bins, and language runtimes.

  • Process (per artifact):

    • Parse binaries (headers, sections, symbol tables, relocations).

    • Recover call edges:

      • Direct calls: disassemble; record caller -> callee.
      • Indirect calls: resolve via PLT/IAT/vtables; fall back to conservative points-to sets.
      • Dynamic loading: log dlopen/LoadLibrary + exported symbol usage heuristics.
    • Normalize to Symbol Graph: nodes = {binary, symbol, addr, hash}, edges = CALLS.

  • Outputs: symbol-graph.jsonl (+ compact binary form), content-addressed by hash.
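
A minimal sketch of the normalize-and-hash step, assuming the call edges have already been recovered. The `id` field, the helper names, and the sample values are illustrative assumptions, not part of any existing StellaOps schema.

```python
import hashlib
import json


def symbol_id(binary_hash: str, symbol: str, addr: int) -> str:
    """Derive a node id from content only, never from scan order (assumed scheme)."""
    raw = f"{binary_hash}:{symbol}:{addr:#x}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]


def to_jsonl(nodes: list[dict], edges: list[dict]) -> bytes:
    """One compact JSON object per line; lines are sorted so input order is irrelevant."""
    records = [{"type": "node", **n} for n in nodes] + [{"type": "edge", **e} for e in edges]
    lines = sorted(json.dumps(r, sort_keys=True, separators=(",", ":")) for r in records)
    return ("\n".join(lines) + "\n").encode()


def graph_digest(nodes: list[dict], edges: list[dict]) -> str:
    """Content address of the serialized graph."""
    return "sha256:" + hashlib.sha256(to_jsonl(nodes, edges)).hexdigest()


# Hypothetical sample: one binary, two symbols, one CALLS edge.
BIN_HASH = "ab" * 32
main_id = symbol_id(BIN_HASH, "main", 0x1000)
dec_id = symbol_id(BIN_HASH, "foo_decrypt", 0x2000)
nodes = [
    {"id": main_id, "binary": "/usr/bin/svc", "symbol": "main", "addr": 0x1000, "hash": BIN_HASH},
    {"id": dec_id, "binary": "/usr/bin/svc", "symbol": "foo_decrypt", "addr": 0x2000, "hash": BIN_HASH},
]
edges = [{"src": main_id, "dst": dec_id, "kind": "CALLS"}]
print(graph_digest(nodes, edges))
```

Sorting the serialized lines means the digest depends only on the set of nodes and edges, which is what makes content addressing usable across parallel or reordered scans.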

Stage B — Symbol graph ↔ SBOM components

  • Inputs: CycloneDX/SPDX SBOM for the image/build; file→component mapping (path→pkg).

  • Process:

    • For each symbol: derive file path (or BuildID) → map to SBOM component/version/layer.

    • Build Component Reachability Graph:

      • nodes = {component@version}, edges = “component provides symbol X used by Y”.
      • annotate with file hashes, BuildIDs, container layer digests.
  • Outputs: reachability-slices/COMPONENT@VERSION.slice.json (per impacted component).
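
The mapping step can be sketched roughly as follows, assuming symbols carry a `file_sha256` and the SBOM has already been flattened to a hash → purl table. The function names, the edge direction (user, provider), and the sample purls are assumptions made for illustration.

```python
from collections import defaultdict


def map_symbols_to_components(symbols: list[dict], file_to_component: dict) -> dict:
    """Attach a component purl to each symbol id via its file hash; None when unmapped."""
    return {s["id"]: file_to_component.get(s["file_sha256"]) for s in symbols}


def component_edges(edges: list[dict], owner: dict) -> dict:
    """Collapse symbol-level edges into {(user, provider): [provided symbol ids]}."""
    out = defaultdict(set)
    for e in edges:
        user, provider = owner.get(e["src"]), owner.get(e["dst"])
        if user is None or provider is None or user == provider:
            continue  # skip unmapped symbols and calls inside one component
        out[(user, provider)].add(e["dst"])
    return {k: sorted(v) for k, v in sorted(out.items())}


def unmapped_symbols(owner: dict) -> list:
    """Symbols with no SBOM component; these should surface as unknowns."""
    return sorted(sid for sid, comp in owner.items() if comp is None)


# Hypothetical sample data.
symbols = [
    {"id": "s1", "name": "main", "file_sha256": "aa"},
    {"id": "s2", "name": "foo_decrypt", "file_sha256": "bb"},
    {"id": "s3", "name": "helper", "file_sha256": "cc"},
]
file_to_component = {
    "aa": "pkg:generic/myservice@0.1.0",
    "bb": "pkg:deb/debian/libfoo@1.2.3",
}
sym_edges = [{"src": "s1", "dst": "s2"}, {"src": "s1", "dst": "s3"}]

owner = map_symbols_to_components(symbols, file_to_component)
print(component_edges(sym_edges, owner))
print(unmapped_symbols(owner))
```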

Attestable “slice” (the evidence object)

Each slice is a minimal proof unit answering: “This vulnerable symbol is (or isn't) on a feasible path at runtime in build X.”

  • Contents:

    • Scan manifest (tool versions, ruleset hashes, feed versions).
    • Inputs digests (binaries, SBOM, container layers).
    • The subgraph (only nodes/edges needed).
    • Query + result (e.g., “is openssl:EVP_PKEY_decrypt reachable from any exported entrypoint?”).
  • Format: DSSE + in-toto statement, stored as OCI artifact or file; deterministic (same inputs → same bytes).
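
To make the envelope concrete, here is a rough sketch that wraps a slice in an in-toto v1 Statement and a DSSE envelope. The pre-authentication encoding (PAE) follows the DSSE spec. The predicate type URI is a made-up placeholder, and the HMAC is only a standard-library stand-in for a real asymmetric signature (for example Ed25519 or ECDSA via the Attestor).

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def canonical(obj) -> bytes:
    """Sorted keys, no whitespace: same object always yields the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: 'DSSEv1 LEN(type) type LEN(body) body'."""
    t = payload_type.encode()
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def make_envelope(slice_doc: dict, subject_name: str, subject_sha256: str,
                  key: bytes, keyid: str) -> dict:
    statement = {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": subject_name, "digest": {"sha256": subject_sha256}}],
        # Hypothetical predicate type; a real one would be defined by the project.
        "predicateType": "https://example.invalid/reachability-slice/v1",
        "predicate": slice_doc,
    }
    payload = canonical(statement)
    # Stand-in signature only: HMAC-SHA256 over the PAE, NOT a real DSSE signer.
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payload": base64.b64encode(payload).decode(),
        "payloadType": PAYLOAD_TYPE,
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode()}],
    }


envelope = make_envelope(
    {"query": {"symbol": "foo_decrypt"}, "verdict": "unreachable"},
    subject_name="registry.example/app",  # hypothetical subject
    subject_sha256="cd" * 32,
    key=b"demo-key",
    keyid="demo",
)
print(json.dumps(envelope, indent=2))
```

Signing the PAE rather than the raw payload binds the payload type into the signature, so a verifier cannot be tricked into interpreting the same bytes as a different document type.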

Triage flow (how it helps today)

  • Given CVE → map to symbols/functions → check reachability slice:

    • Reachable path found: mark “affected (reachable)”, include call chain and components; raise priority.
    • No path / gated by feature flag: mark “not affected (unreachable/mitigated)”, with proof chain.
    • Unknowns present: fail-safe policy (e.g., “unknowns > N → block prod”) with explicit unknown edges listed.
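
The three branches above could be encoded along these lines. This is a sketch under assumptions: the `unknown_edges` field, the `action` values, and the choice to let a reachable verdict win over the unknowns budget are illustrative, not an existing policy engine contract.

```python
def triage(slice_doc: dict, unknown_budget: int = 0) -> dict:
    """Map a slice verdict to a VEX-style status plus a gate action."""
    verdict = slice_doc["verdict"]
    unknowns = slice_doc.get("unknown_edges", [])  # assumed field listing unresolved edges

    if verdict == "reachable":
        # A proven path outranks everything else: affected, raise priority.
        return {"status": "affected", "action": "raise_priority", "unknown_edges": unknowns}
    if len(unknowns) > unknown_budget:
        # Fail-safe: too many unresolved edges to trust an "unreachable" claim.
        return {"status": "under_investigation", "action": "block_prod", "unknown_edges": unknowns}
    if verdict == "unreachable":
        return {"status": "not_affected", "action": "none", "unknown_edges": unknowns}
    return {"status": "under_investigation", "action": "none", "unknown_edges": unknowns}


print(triage({"verdict": "unreachable", "unknown_edges": [{"src": "dispatch", "dst": "?"}]}))
```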

Minimal data model (JSON hints)

  • Symbol: { id, name, demangled, addr, file_sha256, build_id }
  • Edge: { src_symbol_id, dst_symbol_id, kind: "direct"|"plt"|"indirect" }
  • Mapping: { file_sha256|build_id -> component_purl, layer_digest, path }
  • Slice: { inputs:{…}, query:{…}, subgraph:{symbols:[…],edges:[…]}, verdict:"reachable"|"unreachable"|"unknown" }
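
As a sketch of how these shapes might look in code, the block below models Symbol and Edge as dataclasses and extracts the minimal subgraph for a query: the symbols that are both forward-reachable from an entrypoint and backward-reachable from the target. The function names are assumptions, and this simple version never emits an "unknown" verdict.

```python
from collections import defaultdict
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Symbol:
    id: str
    name: str
    demangled: Optional[str] = None
    addr: Optional[int] = None
    file_sha256: Optional[str] = None
    build_id: Optional[str] = None


@dataclass(frozen=True)
class Edge:
    src_symbol_id: str
    dst_symbol_id: str
    kind: str  # "direct" | "plt" | "indirect"


def _closure(starts, adj) -> set:
    """All nodes reachable from `starts` following `adj` (iterative DFS)."""
    seen, stack = set(starts), list(starts)
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def extract_slice(edges: list, entrypoints: list, target: str) -> dict:
    fwd, rev = defaultdict(list), defaultdict(list)
    for e in edges:
        fwd[e.src_symbol_id].append(e.dst_symbol_id)
        rev[e.dst_symbol_id].append(e.src_symbol_id)
    # Keep only symbols that lie on some entrypoint -> target path.
    keep = _closure(entrypoints, fwd) & _closure([target], rev)
    sub = sorted(
        (e for e in edges if e.src_symbol_id in keep and e.dst_symbol_id in keep),
        key=lambda e: (e.src_symbol_id, e.dst_symbol_id, e.kind),
    )
    return {
        "subgraph": {"symbols": sorted(keep), "edges": [asdict(e) for e in sub]},
        "verdict": "reachable" if keep else "unreachable",
    }


# Hypothetical call edges, using symbol names as ids for readability.
call_edges = [
    Edge("main", "handler", "direct"),
    Edge("handler", "foo_decrypt", "plt"),
    Edge("main", "log", "direct"),
    Edge("orphan", "foo_decrypt", "direct"),
]
print(extract_slice(call_edges, ["main"], "foo_decrypt"))
```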

Determinism & replay

  • Pin everything: disassembler version, rules, demangler options, container digests, SBOM doc hash, symbolization flags.
  • Emit a Scan Manifest with content hashes; store alongside slices.
  • Provide a replay command that rehydrates inputs and recomputes the slice; byte-for-byte match required.
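
A replay check might look roughly like this. `recompute` is a hypothetical callable standing in for the real pipeline, and sorted-key compact JSON is used as a simple canonical form (a stricter scheme such as RFC 8785 canonicalization may be preferable in practice).

```python
import hashlib
import json


def canonical_bytes(doc: dict) -> bytes:
    """Sorted keys, no whitespace, UTF-8: a simple canonical serialization."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def digest(doc: dict) -> str:
    return "sha256:" + hashlib.sha256(canonical_bytes(doc)).hexdigest()


def replay(stored_slice: dict, recompute) -> dict:
    """Recompute the slice from its pinned inputs and demand identical bytes."""
    fresh = recompute(stored_slice["inputs"])
    if canonical_bytes(stored_slice) == canonical_bytes(fresh):
        return {"result": "match", "digest": digest(fresh)}
    # Report which top-level keys differ, to help locate the nondeterminism.
    differing = sorted(
        k for k in set(stored_slice) | set(fresh) if stored_slice.get(k) != fresh.get(k)
    )
    return {"result": "mismatch", "differing_keys": differing}


stored = {"inputs": {"sbom_sha256": "11" * 32}, "verdict": "unreachable"}
print(replay(stored, lambda inputs: {"verdict": "unreachable", "inputs": inputs}))
```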

Where this plugs into StellaOps (suggested modules)

  • Sbomer: component/file mapping & SBOM import.
  • Scanner.webservice: binary parse & call-graph extraction (keep lattice/policy elsewhere per your rule).
  • Vexer/Policy Engine: consume slices as evidence for “affected/not-affected” claims.
  • Attestor/Authority: sign DSSE/in-toto statements; push to OCI.
  • Timeline/Notify: surface verdict deltas over time, link to slices.

Guardrails & fallbacks

  • For stripped binaries: prefer BuildID + external symbol servers; otherwise fall back to a conservative over-approximation (mark unknown).
  • For JIT/dynamic plugins: capture runtime traces (eBPF/ETW) and merge as observed edges with timestamps.
  • Mixed-language stacks: unify by file hash + symbol-name mangling rules per toolchain.
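
For the runtime-trace fallback, merging observed edges into the static graph could be sketched as below. The record shapes (`src`, `dst`, `ts`) and the "observed" edge kind are assumptions; real eBPF/ETW events would need symbolization first.

```python
def merge_observed(static_edges: list, observed: list) -> list:
    """Annotate static edges with observation timestamps and add trace-only edges."""
    merged = {(e["src"], e["dst"]): {**e, "observed_at": []} for e in static_edges}
    for o in observed:
        key = (o["src"], o["dst"])
        # Edges seen only at runtime (JIT, plugins) get the assumed kind "observed".
        edge = merged.setdefault(
            key, {"src": o["src"], "dst": o["dst"], "kind": "observed", "observed_at": []}
        )
        edge["observed_at"].append(o["ts"])
    for edge in merged.values():
        edge["observed_at"].sort()
    return [merged[k] for k in sorted(merged)]


static = [{"src": "a", "dst": "b", "kind": "direct"}]
trace = [
    {"src": "a", "dst": "b", "ts": 20},
    {"src": "a", "dst": "b", "ts": 10},
    {"src": "b", "dst": "plugin_init", "ts": 15},
]
print(merge_observed(static, trace))
```

Because timestamps vary between runs, observed edges are probably best kept in separate slices so the static slices stay byte-for-byte reproducible.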

Quick implementation plan (6 sprints)

  1. Binary ingest: ELF/PE/Mach-O parsing, BuildID hashing, symbol tables, PLT/IAT resolution.
  2. Call-edge recovery: direct calls, basic indirect resolution, slice extractor by entrypoint.
  3. SBOM mapping: file→component map, layer digests, purl normalization.
  4. Evidence format: DSSE/in-toto schema, deterministic manifests, OCI storage.
  5. Queries & policies: “is-reachable?” API, unknowns budget, feature-flag conditions, VEX plumbing.
  6. Runtime merge: optional eBPF/ETW traces → annotate edges, produce “observed-path” slices.

Lightweight APIs (sketch)

  • POST /reachability/query { cve, symbols[], entrypoints[], policy } -> slice+verdict
  • GET /slice/{digest} -> attested slice
  • POST /replay { slice_digest } -> match | mismatch (with diff)
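
To show how the query and fetch endpoints relate, here is a framework-free sketch with an in-memory, content-addressed store. `SliceStore`, `handle`, the response field names, and the `analyze` callable are all hypothetical; a real service would sit behind an HTTP framework and return attested slices.

```python
import hashlib
import json
from typing import Optional


class SliceStore:
    """In-memory store keyed by the digest of the canonical slice bytes."""

    def __init__(self) -> None:
        self._by_digest = {}

    def put(self, slice_doc: dict) -> str:
        body = json.dumps(slice_doc, sort_keys=True, separators=(",", ":")).encode()
        d = "sha256:" + hashlib.sha256(body).hexdigest()
        self._by_digest[d] = body
        return d

    def get(self, digest: str) -> Optional[bytes]:
        return self._by_digest.get(digest)


def handle(store, method, path, body=None, analyze=None):
    """Tiny dispatcher returning (status_code, response_bytes)."""
    if method == "GET" and path.startswith("/slice/"):
        blob = store.get(path[len("/slice/"):])
        return (200, blob) if blob is not None else (404, b"")
    if method == "POST" and path == "/reachability/query":
        required = ("cve", "symbols", "entrypoints", "policy")
        missing = [k for k in required if k not in (body or {})]
        if missing:
            return 400, json.dumps({"missing": missing}).encode()
        slice_doc = analyze(body)  # hypothetical analysis pipeline
        resp = {"slice_digest": store.put(slice_doc), "verdict": slice_doc["verdict"]}
        return 200, json.dumps(resp).encode()
    return 404, b""


store = SliceStore()
query = {"cve": "CVE-XXXX-YYYY", "symbols": ["foo_decrypt"], "entrypoints": ["main"], "policy": {}}
status, raw = handle(store, "POST", "/reachability/query", query,
                     analyze=lambda q: {"query": q, "verdict": "unknown"})
print(status, raw.decode())
```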

Small example (CVE → symbol mapping)

  • CVE-XXXX-YYYY → advisory lists function foo_decrypt in libfoo.so
  • We resolve libfoo.so's BuildID in the image, find symbols that match the demangled name, and build call paths from service entrypoints. If a path exists, the slice is “reachable” with a 3-7 hop chain; otherwise it is “unreachable” with reasons (no import, stripped at link time, dead code eliminated, or gated by FEATURE_X=false).
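
The path search in this example can be illustrated with a plain breadth-first search that returns the shortest call chain. The call graph below is invented for illustration; only foo_decrypt comes from the example above.

```python
from collections import deque


def shortest_call_chain(calls: dict, entrypoints: list, target: str):
    """Return the shortest entrypoint -> target chain as a list, or None if unreachable."""
    parent = {e: None for e in entrypoints}
    queue = deque(entrypoints)
    while queue:
        cur = queue.popleft()
        if cur == target:
            chain = []
            while cur is not None:  # walk parents back to the entrypoint
                chain.append(cur)
                cur = parent[cur]
            return chain[::-1]
        for callee in calls.get(cur, ()):
            if callee not in parent:
                parent[callee] = cur
                queue.append(callee)
    return None


# Hypothetical service call graph: caller -> callees.
calls = {
    "main": ["handle_request"],
    "handle_request": ["parse", "decrypt_body"],
    "decrypt_body": ["foo_decrypt"],
}
chain = shortest_call_chain(calls, ["main"], "foo_decrypt")
print(chain)           # ['main', 'handle_request', 'decrypt_body', 'foo_decrypt']
print(len(chain) - 1)  # 3 hops
```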

Costs (rough, for planning inside StellaOps)

  • Core parsing & graph: 3-4 engineer-weeks
  • Indirect calls & heuristics: +3-5 weeks
  • SBOM mapping & layers: 2 weeks
  • Attestations & OCI storage: 1-2 weeks
  • Policy/VEX integration & UI surfacing: 2-3 weeks
  • Runtime trace merge (optional): 2-4 weeks

(Parallelizable; add 25-40% for hardening/tests.)
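
For a quick planning total, the line items can be summed as below. This assumes the ranges in the list read 3-4, 3-5, 2, 1-2, 2-3, and 2-4 weeks with a 25-40% hardening uplift; the dictionary keys are shorthand, and the totals are engineer-weeks of effort, not calendar time, since the work is parallelizable.

```python
# (low, high) engineer-weeks per line item; ranges are an assumed reading of the list above.
estimates = {
    "core_parsing_graph": (3, 4),
    "indirect_calls_heuristics": (3, 5),
    "sbom_mapping_layers": (2, 2),
    "attestations_oci": (1, 2),
    "policy_vex_ui": (2, 3),
    "runtime_trace_merge_optional": (2, 4),
}

low = sum(lo for lo, _ in estimates.values())
high = sum(hi for _, hi in estimates.values())

# Apply the smaller uplift to the low end and the larger uplift to the high end.
hardened = (low * 125 / 100, high * 140 / 100)

print(low, high)  # 13 20
print(hardened)   # (16.25, 28.0)
```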

If you want, I can turn this into:

  • a concrete .NET 10 service skeleton (endpoints + data contracts),
  • a DSSE/in-toto schema for the slice, and
  • a dev checklist for deterministic builds and replay harness.