master 53503cb407 Add reference architecture and testing strategy documentation

- Created a new document for the Stella Ops Reference Architecture outlining the system's topology, trust boundaries, artifact association, and interfaces.
- Developed a comprehensive Testing Strategy document detailing the importance of offline readiness, interoperability, determinism, and operational guardrails.
- Introduced a README for the Testing Strategy, summarizing processing details and key concepts implemented.
- Added guidance for AI agents and developers in the tests directory, including directory structure, test categories, key patterns, and rules for test development.

2025-12-22 07:59:30 +02:00

Here’s a practical, from‑scratch blueprint for a two‑stage reachability map that turns low‑level binary and runtime facts into auditable, reproducible evidence for triage and VEX decisions.

What this is (plain English)

Goal: prove (or rule out) whether a vulnerable function/package could actually run in your build and deployment.
How:
1. extract binary‑level call targets (what functions your program could call),
2. map those targets onto symbol graphs (named functions/classes/modules),
3. correlate those symbols with SBOM components (which package/image layer they live in),
4. store each “slice” of reachability as a signed attestation so anyone can replay and verify it.

Stage A — Binary → Symbol graph

Inputs: built artifacts (ELF/COFF/Mach‑O), debug symbols (when available), stripped bins, and language runtimes.
Process (per artifact):
- Parse binaries (headers, sections, symbol tables, relocations).
- Recover call edges:
  - Direct calls: disassemble; record caller -> callee.
  - Indirect calls: resolve via PLT/IAT/vtables; fall back to conservative points‑to sets.
  - Dynamic loading: log dlopen/LoadLibrary + exported symbol usage heuristics.
- Normalize to Symbol Graph: nodes = {binary, symbol, addr, hash}, edges = CALLS.
Outputs: symbol-graph.jsonl (+ compact binary form), content‑addressed by hash.
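
As a rough sketch of the normalization step, the Python below turns recovered symbols and call edges into sorted JSONL and derives a content address from the resulting bytes. The function names (symbol_id, build_symbol_graph, content_address), the "type"/"rel"/"kind" record fields, and the 16-character hash truncation are assumptions made for illustration; the blueprint only fixes nodes = {binary, symbol, addr, hash} and edges = CALLS.

```python
import hashlib
import json


def symbol_id(binary_sha256, name, addr):
    """Derive a node hash from content only, never from discovery order."""
    raw = f"{binary_sha256}:{name}:{addr:#x}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:16]  # truncation length is arbitrary


def build_symbol_graph(binary_sha256, symbols, calls):
    """symbols: list of (name, addr); calls: list of (caller_addr, callee_addr, kind)."""
    ids = {addr: symbol_id(binary_sha256, name, addr) for name, addr in symbols}
    records = [
        {"type": "node", "binary": binary_sha256, "symbol": name,
         "addr": addr, "hash": ids[addr]}
        for name, addr in symbols
    ]
    for src, dst, kind in calls:
        # This sketch drops edges with unsymbolized endpoints; a real tool
        # would more likely record them as unknown edges.
        if src in ids and dst in ids:
            records.append({"type": "edge", "rel": "CALLS",
                            "src": ids[src], "dst": ids[dst], "kind": kind})
    # Sorting the serialized records makes the output independent of input order.
    lines = sorted(json.dumps(r, sort_keys=True, separators=(",", ":")) for r in records)
    return "\n".join(lines) + "\n"


def content_address(jsonl):
    """Address the graph by the SHA-256 of its exact bytes."""
    return "sha256:" + hashlib.sha256(jsonl.encode("utf-8")).hexdigest()
```

Feeding the same symbols and edges in a different order yields the same bytes, and therefore the same address, which is the property the later replay step relies on.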

Stage B — Symbol graph ↔ SBOM components

Inputs: CycloneDX/SPDX SBOM for the image/build; file→component mapping (path→pkg).
Process:
- For each symbol: derive file path (or Build‑ID) → map to SBOM component/version/layer.
- Build Component Reachability Graph:
  - nodes = {component@version}, edges = “component provides symbol X used by Y”.
  - annotate with file hashes, Build‑IDs, container layer digests.
Outputs: reachability-slices/COMPONENT@VERSION.slice.json (per impacted component).
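
A minimal sketch of the symbol-to-component lift, assuming symbols are already keyed by file hash. The function name component_edges, the input shapes, and the example purls in the usage are hypothetical; Build‑ID lookup and layer annotation are left out to keep the idea visible.

```python
from collections import defaultdict


def component_edges(symbols, edges, file_to_component):
    """
    symbols: {symbol_id: {"name": str, "file_sha256": str}}
    edges: iterable of (src_symbol_id, dst_symbol_id)
    file_to_component: {file_sha256: component purl}
    Returns (graph, unmapped): graph maps (user, provider) to the sorted
    symbol names the provider supplies; unmapped lists edges whose files
    have no SBOM component.
    """
    graph = defaultdict(set)
    unmapped = set()
    for src, dst in edges:
        user = file_to_component.get(symbols[src]["file_sha256"])
        provider = file_to_component.get(symbols[dst]["file_sha256"])
        if user is None or provider is None:
            unmapped.add((src, dst))  # surfaced, not silently dropped
            continue
        if user != provider:  # intra-component calls add no component edge
            graph[(user, provider)].add(symbols[dst]["name"])
    return {k: sorted(v) for k, v in sorted(graph.items())}, sorted(unmapped)
```

For example, with a service binary mapped to pkg:generic/service@1.0.0 and libfoo.so mapped to pkg:generic/libfoo@1.2.3, a call from handle_request to foo_decrypt produces one component edge labeled foo_decrypt, while a call into a file missing from the SBOM lands in the unmapped list.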

Attestable “slice” (the evidence object)

Each slice is a minimal proof unit answering: “This vulnerable symbol is (or isn’t) on a feasible path at runtime in build X.”

Contents:
- Scan manifest (tool versions, ruleset hashes, feed versions).
- Inputs digests (binaries, SBOM, container layers).
- The subgraph (only nodes/edges needed).
- Query + result (e.g., “is openssl:EVP_PKEY_decrypt reachable from any exported entrypoint?”).
Format: DSSE + in‑toto statement, stored as OCI artifact or file; deterministic (same inputs → same bytes).
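
To make the envelope concrete, here is a hedged sketch of wrapping a slice in an in‑toto statement and a DSSE envelope. The pre-authentication encoding and the envelope field names follow the DSSE specification; everything else is an assumption: PREDICATE_TYPE is a placeholder URL, the HMAC is only a stand-in for a real asymmetric signature, and sort_keys-based serialization approximates canonical JSON without implementing a full canonicalization standard.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"
PREDICATE_TYPE = "https://example.invalid/reachability-slice/v1"  # hypothetical


def canonical_json(obj):
    """Stable bytes for the same logical object (approximation, see lead-in)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def pae(payload_type, payload):
    """DSSE pre-authentication encoding: the bytes that actually get signed."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def slice_statement(slice_obj, subject_name, subject_sha256):
    """Wrap a slice as the predicate of an in-toto v1 statement."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": subject_name, "digest": {"sha256": subject_sha256}}],
        "predicateType": PREDICATE_TYPE,
        "predicate": slice_obj,
    }


def make_envelope(statement, key, keyid):
    payload = canonical_json(statement)
    # Stand-in only: swap HMAC for the signer your Attestor/Authority uses.
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": PAYLOAD_TYPE,
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }
```

Because the payload is serialized before signing and the signature covers the PAE bytes, two runs over the same slice produce an identical envelope as long as the signing scheme itself is deterministic.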

Triage flow (how it helps today)

Given CVE → map to symbols/functions → check reachability slice:
- Reachable path found: mark “affected (reachable)”, include call chain and components; raise priority.
- No path / gated by feature flag: mark “not affected (unreachable/mitigated)”, with proof chain.
- Unknowns present: fail‑safe policy (e.g., “unknowns > N → block prod”) with explicit unknown edges listed.
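
The three branches above can be sketched as a small decision function. The function name, the status strings, the default budget of zero, and the choice to let a confirmed reachable path take precedence over the unknowns budget are all assumptions for illustration, not a fixed policy.

```python
def triage(verdict, unknown_edges, gated_by_flag=False, max_unknowns=0):
    """Map a slice verdict plus its unknown edges to a triage outcome."""
    if verdict == "reachable" and not gated_by_flag:
        return {"status": "affected", "detail": "reachable",
                "action": "raise priority"}
    if len(unknown_edges) > max_unknowns:
        # Fail-safe: too many unresolved edges to trust a negative answer.
        return {"status": "under_investigation",
                "detail": "unknowns budget exceeded",
                "action": "block prod",
                "unknown_edges": sorted(unknown_edges)}
    if verdict == "reachable":  # a path exists, but only behind a disabled flag
        return {"status": "not_affected", "detail": "mitigated",
                "action": "attach proof chain"}
    if verdict == "unreachable":
        return {"status": "not_affected", "detail": "unreachable",
                "action": "attach proof chain"}
    return {"status": "under_investigation", "detail": "verdict unknown",
            "action": "manual review"}
```

Listing the offending edges in the blocked outcome keeps the failure actionable: a reviewer can see exactly which indirect calls or unmapped files caused the block.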

Minimal data model (JSON hints)

Symbol: { id, name, demangled, addr, file_sha256, build_id }
Edge: { src_symbol_id, dst_symbol_id, kind: "direct"|"plt"|"indirect" }
Mapping: { file_sha256|build_id -> component_purl, layer_digest, path }
Slice: { inputs:{…}, query:{…}, subgraph:{symbols:[…],edges:[…]}, verdict:"reachable"|"unreachable"|"unknown" }
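
One possible typed rendering of these hints, as a sketch only: the field names come from the hints above, while the choice of frozen dataclasses, the optional build_id, and the validation in __post_init__ are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Optional, Tuple

EDGE_KINDS = {"direct", "plt", "indirect"}
VERDICTS = {"reachable", "unreachable", "unknown"}


@dataclass(frozen=True)
class Symbol:
    id: str
    name: str
    demangled: str
    addr: int
    file_sha256: str
    build_id: Optional[str] = None  # absent for binaries without a Build-ID


@dataclass(frozen=True)
class Edge:
    src_symbol_id: str
    dst_symbol_id: str
    kind: str

    def __post_init__(self):
        if self.kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {self.kind!r}")


@dataclass(frozen=True)
class Slice:
    inputs: dict
    query: dict
    symbols: Tuple[Symbol, ...]
    edges: Tuple[Edge, ...]
    verdict: str

    def __post_init__(self):
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")

    def to_dict(self):
        """Render in the nested shape used by the Slice hint above."""
        return {
            "inputs": self.inputs,
            "query": self.query,
            "subgraph": {"symbols": [asdict(s) for s in self.symbols],
                         "edges": [asdict(e) for e in self.edges]},
            "verdict": self.verdict,
        }
```

Rejecting unexpected kind and verdict values at construction time keeps malformed records out of signed slices.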

Determinism & replay

Pin everything: disassembler version, rules, demangler options, container digests, SBOM doc hash, symbolization flags.
Emit a Scan Manifest with content hashes; store alongside slices.
Provide a replay command that re‑hydrates inputs and re‑computes the slice; byte‑for‑byte match required.
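
A replay check could look roughly like the following. The names canonical and replay, and the idea of passing the re-computation in as a callable, are assumptions; how inputs are actually re-hydrated is out of scope for this sketch.

```python
import difflib
import hashlib
import json


def canonical(obj):
    """Serialize a slice to stable bytes (sorted keys, no extra whitespace)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def replay(stored_bytes, recompute):
    """recompute: zero-argument callable returning freshly computed slice bytes."""
    fresh = recompute()
    if fresh == stored_bytes:  # byte-for-byte comparison, not semantic equality
        return {"result": "match",
                "digest": "sha256:" + hashlib.sha256(fresh).hexdigest()}

    def pretty(raw):
        # Pretty-print only for the diff; the comparison above uses raw bytes.
        return json.dumps(json.loads(raw), sort_keys=True, indent=2).splitlines()

    diff = list(difflib.unified_diff(pretty(stored_bytes), pretty(fresh),
                                     "stored", "replayed", lineterm=""))
    return {"result": "mismatch", "diff": diff}
```

Comparing raw bytes rather than parsed JSON is deliberate: a serializer or tool-version drift that changes the bytes should count as a mismatch even when the verdict is unchanged.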

Where this plugs into Stella Ops (suggested modules)

Sbomer: component/file mapping & SBOM import.
Scanner.webservice: binary parse & call‑graph extraction (keep lattice/policy elsewhere per your rule).
Vexer/Policy Engine: consume slices as evidence for “affected/not‑affected” claims.
Attestor/Authority: sign DSSE/in‑toto statements; push to OCI.
Timeline/Notify: surface verdict deltas over time, link to slices.

Guardrails & fallbacks

Stripped binaries: prefer Build‑ID lookups against external symbol servers; otherwise fall back to a conservative over‑approximation and mark the affected edges unknown.
For JIT/dynamic plugins: capture runtime traces (eBPF/ETW) and merge as observed edges with timestamps.
Mixed‑lang stacks: unify by file hash + symbol name mangling rules per toolchain.
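
For the JIT/plugin case, merging runtime observations into the static graph might be sketched as below. The function name, the (src, dst, timestamp) tuple shape, and the first_seen/last_seen fields are hypothetical; the actual eBPF/ETW capture is not shown.

```python
def merge_observed(static_edges, observations):
    """
    static_edges: iterable of (src, dst) from binary analysis.
    observations: iterable of (src, dst, unix_ts) from runtime traces.
    Returns {(src, dst): info}, keeping static and observed provenance distinct.
    """
    def blank(is_static):
        return {"static": is_static, "observed": False,
                "first_seen": None, "last_seen": None}

    merged = {edge: blank(True) for edge in static_edges}
    for src, dst, ts in observations:
        info = merged.setdefault((src, dst), blank(False))
        info["observed"] = True
        info["first_seen"] = ts if info["first_seen"] is None else min(info["first_seen"], ts)
        info["last_seen"] = ts if info["last_seen"] is None else max(info["last_seen"], ts)
    return merged
```

Edges that appear only in traces keep static set to False, so a policy can treat them as evidence of a gap in static recovery rather than folding them silently into the graph.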

Quick implementation plan (6 sprints)

1. Binary ingest: ELF/PE/Mach‑O parsing, Build‑ID hashing, symbol tables, PLT/IAT resolution.
2. Call‑edge recovery: direct calls, basic indirect resolution, slice extractor by entrypoint.
3. SBOM mapping: file→component map, layer digests, purl normalization.
4. Evidence format: DSSE/in‑toto schema, deterministic manifests, OCI storage.
5. Queries & policies: “is‑reachable?” API, unknowns budget, feature‑flag conditions, VEX plumbing.
6. Runtime merge: optional eBPF/ETW traces → annotate edges, produce “observed‑path” slices.

Lightweight APIs (sketch)

POST /reachability/query { cve, symbols[], entrypoints[], policy } -> slice+verdict
GET /slice/{digest} -> attested slice
POST /replay { slice_digest } -> match | mismatch (with diff)
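
Behind GET /slice/{digest}, a content-addressed store is one natural fit. The in-memory class below is a sketch under that assumption; SliceStore and its methods are hypothetical names, and a real deployment would presumably sit on OCI storage instead of a dict.

```python
import hashlib


def digest_of(data):
    return "sha256:" + hashlib.sha256(data).hexdigest()


class SliceStore:
    """Toy in-memory store keyed by the digest of the attested slice bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        digest = digest_of(data)
        self._blobs[digest] = data  # idempotent: same bytes, same key
        return digest

    def get(self, digest):
        data = self._blobs.get(digest)
        if data is None:
            return None  # an HTTP layer would translate this to 404
        if digest_of(data) != digest:
            # Re-verify on read so corruption or tampering is caught early.
            raise ValueError("stored slice does not match its digest")
        return data
```

Keying by digest means a client that holds a slice digest from a VEX statement can fetch the evidence and confirm on its own that the bytes are the ones that were attested.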

Small example (CVE → symbol mapping)

CVE‑XXXX‑YYYY → advisory lists function foo_decrypt in libfoo.so
We resolve the Build‑ID of libfoo.so in the image, find the symbols whose demangled name matches, and build call paths from the service entrypoints. If a path exists, the slice is “reachable” and carries the call chain (typically 3–7 hops); otherwise it is “unreachable” with the reasons recorded (no import, stripped at link time, dead code eliminated, or gated by FEATURE_X=false).
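
The path search in this example can be sketched as a breadth-first search over CALLS edges. The function name and the symbol names in the usage are made up; a production query would also need to handle unknown edges and feature-flag conditions, which this sketch ignores.

```python
from collections import deque


def find_call_chain(edges, entrypoints, target):
    """Return the shortest entrypoint-to-target call chain, or None."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    for callees in adjacency.values():
        callees.sort()  # fixed visit order keeps the reported chain deterministic

    parent = {entry: None for entry in entrypoints}
    queue = deque(sorted(entrypoints))
    while queue:
        node = queue.popleft()
        if node == target:
            chain = []
            while node is not None:  # walk parents back to the entrypoint
                chain.append(node)
                node = parent[node]
            return chain[::-1]
        for callee in adjacency.get(node, []):
            if callee not in parent:
                parent[callee] = node
                queue.append(callee)
    return None
```

With edges main → handle_request → decode_body → foo_decrypt, the call find_call_chain(edges, ["main"], "foo_decrypt") returns the four-symbol chain that would go into the slice's subgraph; if the decode_body → foo_decrypt edge is absent, it returns None and the slice records the reason instead.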

Costs (rough, for planning inside Stella Ops)

Core parsing & graph: 3–4 engineer‑weeks
Indirect calls & heuristics: +3–5 weeks
SBOM mapping & layers: 2 weeks
Attestations & OCI storage: 1–2 weeks
Policy/VEX integration & UI surfacing: 2–3 weeks
Runtime trace merge (optional): 2–4 weeks
(Parallelizable; add 25–40% for hardening/tests.)
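
For planning, the ranges above add up as follows. This is plain arithmetic on the listed figures, assuming the optional runtime work is included and that the low uplift pairs with the low estimate; since the work is parallelizable, the totals are effort, not calendar time.

```python
# Engineer-week ranges copied from the list above: (low, high).
ESTIMATES = {
    "core parsing & graph": (3, 4),
    "indirect calls & heuristics": (3, 5),
    "SBOM mapping & layers": (2, 2),
    "attestations & OCI storage": (1, 2),
    "policy/VEX integration & UI": (2, 3),
    "runtime trace merge (optional)": (2, 4),
}

low = sum(lo for lo, _ in ESTIMATES.values())
high = sum(hi for _, hi in ESTIMATES.values())

# Hardening/tests uplift of 25-40%, computed with integer percentages.
low_hardened = low * 125 / 100
high_hardened = high * 140 / 100

print(f"base: {low}-{high} weeks; with hardening: {low_hardened}-{high_hardened} weeks")
# prints "base: 13-20 weeks; with hardening: 16.25-28.0 weeks"
```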

If you want, I can turn this into:

a concrete .NET 10 service skeleton (endpoints + data contracts),
a DSSE/in‑toto schema for the slice, and
a dev checklist for deterministic builds and replay harness.
