Function-Level Evidence Readiness (Nov 2025 Advisory)

Last updated: 2025-11-12. Owner: Business Analysis Guild.

This memo captures the outstanding work required to make Stella Ops scanners emit stable, function-level evidence that matches the November 2025 advisory. It does not implement any code; instead it enumerates requirements, links them to sprint tasks, and spells out the schema/API updates that the next agent must land.

1. Goal & Scope

Goal. Anchor every vulnerability finding to an immutable {artifact_digest, code_id} tuple plus optional symbol hints so replayers can prove reachability against stripped binaries.

Scope. Scanner analyzers, runtime ingestion, Signals scoring, Replay manifests, Policy/VEX emission, CLI/UI explainers, and documentation/runbooks needed to operationalise the advisory.

Out of scope: implementing disassemblers or symbol servers; those will be handled inside the module-specific backlog tasks referenced below.

2. Advisory Requirements vs. System Gaps

Requirement	Current gap	Task references	Notes
Immutable code identity (`code_id` = `{format, build_id, start, length}` + optional `code_block_hash`)	Callgraph nodes are opaque strings with no address metadata.	Sprint 401 `GRAPH-CAS-401-001`, `GAP-SCAN-001`, `GAP-SYM-007`	`code_id` should live alongside existing `SymbolID` helpers so analyzers can emit it without duplicating logic.
Symbol hints (demangled name, source, confidence)	No schema fields for symbol metadata; demangling is ad-hoc per analyzer.	`GAP-SYM-007`	Require deterministic casing + `symbol.source ∈ {DWARF,PDB,SYM,none}`.
Runtime facts mapped to code anchors	`/signals/runtime-facts` now accepts JSON and NDJSON (gzip) streams and stores symbol/code/process/container metadata.	Sprint 400 `ZASTAVA-REACH-201-001`, Sprint 401 `SIGNALS-RUNTIME-401-002`, `GAP-ZAS-002`, `GAP-SIG-003`	Provenance enrichment (process/socket/container) is persisted; the next step is exposing CAS URIs + context facts and emitting events for Policy/Replay.
Replay/DSSE coverage	Replay manifests don’t enforce hash/CAS registration for graphs/traces.	Sprint 400 `REPLAY-REACH-201-005`, Sprint 401 `REPLAY-401-004`, `GAP-REP-004`	Extend manifest v2 with analyzer versions + BLAKE3 digests; add DSSE predicate types.
Policy/VEX/UI explainability	Policy uses coarse `reachability:*` tags; UI/CLI cannot show call paths or evidence hashes.	Sprint 401 `POLICY-VEX-401-006`, `UI-CLI-401-007`, `GAP-POL-005`, `GAP-VEX-006`, `EXPERIENCE-GAP-401-012`	Evidence blocks must cite `code_id`, graph hash, runtime CAS URI, analyzer version.
Operator documentation & samples	No guide shows how to replay `{build_id,start,len}` across CLI/API.	Sprint 401 `QA-DOCS-401-008`, `GAP-DOC-008`	Produce samples under `samples/reachability/**` plus CLI walkthroughs.
Build-id propagation	Build-id not consistently captured or threaded into `SymbolID`/`code_id`; SBOM/runtime joins are brittle.	Sprint 401 `SCANNER-BUILDID-401-035`	Capture `.note.gnu.build-id`, include in code identity, expose in SBOM exports and runtime events.
Load-time constructors as roots	Graph roots omit `.preinit_array`/`.init_array`/`_init`, missing load-time edges.	Sprint 401 `SCANNER-INITROOT-401-036`	Add synthetic roots with `phase=load`; include `DT_NEEDED` deps’ constructors.
PURL-resolved edges	Call edges do not carry `purl` or `symbol_digest`, slowing SBOM joins.	Sprint 401 `GRAPH-PURL-401-034`	Annotate edges per `docs/reachability/purl-resolved-edges.md`; keep deterministic graph hash.
Unknowns handling	Unresolved symbols/edges disappear silently.	Sprint 400 `SIGNALS-UNKNOWN-201-008`	Emit Unknowns records (see `docs/signals/unknowns-registry.md`) and feed `unknowns_pressure` into scoring.
Patch-oracle QA	No guard-rail tests proving binary analyzers see real patch deltas.	Sprint 401 `QA-PORACLE-401-037`	Add paired vuln/fixed fixtures and expectations; wire to CI using `docs/reachability/patch-oracles.md`.
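
The build-id row above depends on reading `.note.gnu.build-id` reliably. The following is a minimal sketch of that step, assuming the analyzer has already located the section and holds its raw bytes. The function name `parse_build_id` is hypothetical, not part of any existing Stella Ops API; the layout it parses (three 32-bit header words, then name and descriptor each padded to 4 bytes) is the standard ELF note layout.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # note type used by GNU toolchains for build-ids


def parse_build_id(note_section: bytes, little_endian: bool = True) -> Optional[str]:
    """Return the hex build-id found in raw ELF note bytes, or None."""
    header = "<III" if little_endian else ">III"
    offset = 0
    while offset + 12 <= len(note_section):
        namesz, descsz, note_type = struct.unpack_from(header, note_section, offset)
        offset += 12
        name = note_section[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = note_section[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded the same way
        if len(desc) != descsz:
            return None  # truncated section: refuse to guess
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None
```

Returning lowercase hex keeps the value usable as a join key between SBOM exports and runtime events; whether the code identity stores it in that exact form is an assumption here.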

3. Workstreams & Expectations

3.1 Scanner Symbolization (GAP-SCAN-001 / GAP-SYM-007)

Define SymbolID helpers that glue together {artifact_digest, file, optional section, addr, length, code_block_hash}.
Update analyzer contracts so every analyzer returns both symbol_id and code_id, with demangled names stored under the new symbol block.
Persist the data into richgraph-v1 payloads and attach CAS URIs via StellaOps.Scanner.Reachability.
Deliver fixtures in tests/reachability/StellaOps.ScannerSignals.IntegrationTests that prove determinism (same hash when analyzer flags are reordered).
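
As a rough illustration of the code_id payload described in §2, the sketch below models `{format, build_id, start, length}` plus the optional `code_block_hash` and derives a stable digest from a canonical JSON form. The class name, the casing rules, and the use of SHA-256 are assumptions made for the example; the real helper is expected to live alongside SymbolID and may normalise differently.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class CodeId:
    format: str                      # e.g. "ELF" or "PE"
    build_id: str                    # hex build-id of the containing binary
    start: int                       # start address of the code range
    length: int                      # length of the code range in bytes
    code_block_hash: Optional[str] = None

    def canonical(self) -> str:
        """Serialise with sorted keys and fixed casing (assumed rules)."""
        fields = {k: v for k, v in asdict(self).items() if v is not None}
        fields["format"] = self.format.upper()
        fields["build_id"] = self.build_id.lower()
        return json.dumps(fields, sort_keys=True, separators=(",", ":"))

    def digest(self) -> str:
        data = self.canonical().encode("utf-8")
        return "sha256:" + hashlib.sha256(data).hexdigest()
```

Two analyzers that report the same range with different casing, for example `CodeId("elf", "ABCDEF", 0x4012A0, 312)` and `CodeId("ELF", "abcdef", 0x4012a0, 312)`, produce the same digest under these rules.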

3.2 Runtime + Signals (GAP-ZAS-002 / GAP-SIG-003)

Extend Zastava Observer NDJSON schema to emit: symbol_id, code_id, hit_count, observed_at, loader_base, process.buildId.
Implement /signals/runtime-facts ingestion (gzip + NDJSON) with CAS-backed storage under cas://reachability/runtime/{sha256}.
Update ReachabilityScoringService to score with lattice states and to include runtime evidence references plus CAS URIs in ReachabilityFactDocument.Metadata.
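
The runtime bundle described above can be sketched as follows: events are written as NDJSON, compressed with gzip, and addressed under `cas://reachability/runtime/{sha256}`. The function name is hypothetical, and hashing the uncompressed NDJSON (rather than the gzip bytes) is an assumption; the memo does not say which bytes the CAS key covers.

```python
import gzip
import hashlib
import json
from typing import Dict, List, Tuple


def pack_runtime_facts(events: List[Dict]) -> Tuple[bytes, str]:
    """Encode events as gzip-compressed NDJSON and derive a CAS URI."""
    lines = [json.dumps(e, sort_keys=True, separators=(",", ":")) for e in events]
    ndjson = ("\n".join(lines) + "\n").encode("utf-8")
    # Assumption: the CAS key is the SHA-256 of the uncompressed NDJSON.
    digest = hashlib.sha256(ndjson).hexdigest()
    # mtime=0 keeps the gzip header identical across runs.
    payload = gzip.compress(ndjson, mtime=0)
    return payload, "cas://reachability/runtime/" + digest


sample_events = [
    {
        "symbol_id": "sym:example",       # placeholder values throughout
        "code_id": "code:example",
        "hit_count": 3,
        "observed_at": "2025-11-12T00:00:00Z",
        "loader_base": "0x400000",
        "process": {"buildId": "abcdef"},
    }
]
payload, cas_uri = pack_runtime_facts(sample_events)
```

Sorting keys and pinning the gzip timestamp means the same events always yield the same bytes, which is what lets a replayer verify a stored trace against its URI.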

3.3 Replay & Evidence (GAP-REP-004)

Enforce CAS registration + BLAKE3 hashing before manifest writes (graphs and traces).
Teach ReachabilityReplayWriter to require analyzer name/version, graph kind, code_id coverage summary.
Update docs/replay/DETERMINISTIC_REPLAY.md once schema v2 is finalized.
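
The "enforce before write" rule above amounts to a gate in front of the manifest writer. The sketch below shows one possible shape, assuming each graph or trace reference is a dictionary and that the set of registered CAS URIs is available to the writer. All field names, the `blake3:` prefix convention, and the function names are hypothetical; manifest v2 is not finalized in this memo.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical field names for one graph/trace reference in a manifest.
REQUIRED_FIELDS = ("kind", "casUri", "hash", "analyzer", "analyzerVersion")


class ManifestError(ValueError):
    """Raised when a reference may not be written to the manifest."""


def check_reference(ref: Dict, registered_cas_uris: Set[str]) -> None:
    missing = [f for f in REQUIRED_FIELDS if not ref.get(f)]
    if missing:
        raise ManifestError("missing fields: " + ", ".join(missing))
    if not ref["hash"].startswith("blake3:"):
        raise ManifestError("hash must be a BLAKE3 digest")
    if ref["casUri"] not in registered_cas_uris:
        raise ManifestError(ref["casUri"] + " is not registered in CAS")


def build_manifest(refs: Iterable[Dict], registered_cas_uris: Set[str]) -> Dict:
    """Validate every reference first, then emit them in a stable order."""
    checked: List[Dict] = list(refs)
    for ref in checked:
        check_reference(ref, registered_cas_uris)
    return {
        "schemaVersion": 2,
        "reachability": sorted(checked, key=lambda r: r["casUri"]),
    }
```

Validating the whole list before building the result means a single unregistered trace blocks the write instead of producing a partial manifest.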

3.4 Policy, VEX, CLI/UI (GAP-POL-005 / GAP-VEX-006)

Policy Engine: ingest new reachability facts, expose reachability.state, max_path_conf, and evidence.graph_hash via SPL + API.
CLI/UI: add the stella graph explain command and an explain drawer that show the call path (SymbolID list), code anchors, runtime hits, and DSSE references.
Notify templates: include short evidence summary (first hop + truncated code_id).

3.5 Documentation & Samples (GAP-DOC-008)

Publish schema diffs in docs/data/evidence-schema.md (new file) covering SBOM evidence nodes, runtime NDJSON, and API responses.
Write CLI/API walkthroughs in docs/09_API_CLI_REFERENCE.md and docs/api/policy.md showing how to request reachability evidence and verify DSSE chains.
Produce OpenVEX + replay samples under samples/reachability/ showing facts.type = "stella.reachability" with graph_hash and code_id arrays.

3.6 Native lifter & Reachability Store (SCANNER-NATIVE-401-015 / SIG-STORE-401-016)

Stand up Scanner.Symbols.Native + Scanner.CallGraph.Native libraries that:
- parse ELF (DWARF + .symtab/.dynsym), PE/COFF (CodeView/PDB), and stripped binaries via probabilistic carving;
- emit deterministic FuncNode + CallEdge records with demangled names, language hints, and {confidence,evidence} arrays; and
- attach analyzer + toolchain identifiers consumed by richgraph-v1.
Introduce Reachability.Store collections in Mongo:
- func_nodes – keyed by func:<format>:<sha256>:<va> with {binDigest,name,addr,size,lang,confidence,sym}.
- call_edges – {from,to,kind,confidence,evidence[]} linking internal/external nodes.
- cve_func_hits – {cve,purl,func_id,match_kind,confidence,source} for advisory alignment.
Build indexes (binDigest+name, from→to, cve+func_id) and expose repository interfaces so Scanner, Signals, and Policy can reuse the same canonical data without duplicating queries.

4. Schema & API Touchpoints

Authoritative field list lives in docs/reachability/evidence-schema.md; use it for DTOs and CAS writers.

The next implementation pass must cover the following documents/files (create them if missing):

docs/data/evidence-schema.md – authoritative schema for {code_id, symbol, tool} blocks.
docs/runbooks/reachability-runtime.md – operator steps for staging runtime ingestion bundles, retention, and troubleshooting.
docs/runbooks/replay_ops.md – add section detailing replay verification using the new graph/runtime CAS entries.

API contracts to amend:

POST /signals/callgraphs response should include graphHash (BLAKE3) once GRAPH-CAS-401-001 lands.
POST /signals/runtime-facts request body schema (NDJSON) must carry symbol_id, code_id, hit_count, and loader_base.
GET /policy/findings payload must surface reachability.evidence[] objects.

4.1 Signals runtime ingestion snapshot (Nov 2025)

/signals/runtime-facts (JSON) and /signals/runtime-facts/ndjson (streaming, optional gzip) accept the following event fields:
- symbolId (required), codeId, loaderBase, hitCount, processId, processName, socketAddress, containerId, evidenceUri, metadata.
- Subject context (scanId / imageDigest / component / version) plus callgraphId is supplied either in the JSON body or as query params for the NDJSON endpoint.
Signals dedupes events, merges metadata, and persists the aggregated RuntimeFacts onto ReachabilityFactDocument. These facts now feed reachability scoring (SIGNALS-24-004/005) as part of the runtime bonus lattice.
Outstanding work: record CAS URIs for runtime traces, emit provenance events, and expose the enriched context to Policy/Replay consumers.
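
The dedupe-and-merge behaviour described in this snapshot can be pictured with the sketch below. It is not the Signals implementation: the choice of dedupe key, summing hitCount, defaulting a missing hitCount to 1, and letting later metadata keys overwrite earlier ones are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def aggregate_runtime_facts(events: Iterable[Dict]) -> List[Dict]:
    """Collapse duplicate runtime events into one fact per code anchor."""
    merged: Dict[Tuple, Dict] = {}
    for event in events:
        if not event.get("symbolId"):
            raise ValueError("symbolId is required")
        # Assumed dedupe key: the code anchor plus process/container context.
        key = (
            event["symbolId"],
            event.get("codeId"),
            event.get("loaderBase"),
            event.get("processId"),
            event.get("containerId"),
        )
        fact = merged.get(key)
        if fact is None:
            fact = dict(event)
            fact["hitCount"] = 0
            fact["metadata"] = {}
            merged[key] = fact
        fact["hitCount"] += int(event.get("hitCount", 1))
        fact["metadata"].update(event.get("metadata") or {})
    # Sort by key so the aggregated output is deterministic.
    ordered = sorted(merged, key=lambda k: tuple("" if v is None else str(v) for v in k))
    return [merged[k] for k in ordered]
```

Feeding two events for the same symbolId and codeId with hit counts 2 and 3 yields a single fact with hitCount 5 and the union of both metadata maps.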

4.2 Reachability store layout (SIG-STORE-401-016)

All producers must persist native function evidence using the shared collections below (names are advisory; exact names live in Mongo options):

// func_nodes
{
  "_id": "func:ELF:sha256:4012a0",
  "binDigest": "sha256:deadbeef...",
  "name": "ssl3_read_bytes",
  "addr": "0x4012a0",
  "size": 312,
  "lang": "c",
  "confidence": 0.92,
  "symbol": { "mangled": "_Z15ssl3_read_bytes", "demangled": "ssl3_read_bytes", "source": "DWARF" },
  "sym": "present"
}

// call_edges
{
  "from": "func:ELF:sha256:4012a0",
  "to": "func:ELF:sha256:40f0ff",
  "kind": "static",
  "confidence": 0.88,
  "evidence": ["reloc:.plt.got", "bb-target:0x40f0ff"]
}

// cve_func_hits
{
  "cve": "CVE-2023-XXXX",
  "purl": "pkg:generic/openssl@1.1.1u",
  "func_id": "func:ELF:sha256:4012a0",
  "match_kind": "name+version",
  "confidence": 0.77,
  "source": "concelier:openssl-advisory"
}

Writers must:

Upsert func_nodes before emitting edges/hits to ensure _id lookups remain stable.
Serialize evidence arrays in deterministic order (reloc, bb-target, import, …) and normalise hex casing.
Attach analyzer fingerprints (scanner.native@sha256:...) so Replay/Policy can enforce provenance.
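
The second writer rule can be sketched as a small normalisation pass. The order `reloc, bb-target, import` comes from the list above; what follows the elided "…" is not specified, so this sketch sorts any other evidence kind after the known ones, alphabetically. Lowercasing hex literals is also an assumption, chosen to match the casing used in the sample documents.

```python
import re
from typing import Iterable, List

EVIDENCE_ORDER = ("reloc", "bb-target", "import")  # known prefix of the order
_HEX_LITERAL = re.compile(r"0[xX][0-9a-fA-F]+")


def normalise_evidence(evidence: Iterable[str]) -> List[str]:
    """Lowercase hex literals and sort evidence by kind, then by value."""

    def normalise(item: str) -> str:
        return _HEX_LITERAL.sub(lambda m: m.group(0).lower(), item)

    def sort_key(item: str):
        kind = item.split(":", 1)[0]
        if kind in EVIDENCE_ORDER:
            rank = EVIDENCE_ORDER.index(kind)
        else:
            rank = len(EVIDENCE_ORDER)  # unknown kinds sort last (assumed)
        return (rank, item)

    return sorted((normalise(e) for e in evidence), key=sort_key)
```

With this pass, `["bb-target:0x40F0FF", "reloc:.plt.got"]` and `["reloc:.plt.got", "bb-target:0x40f0ff"]` serialise identically, so the edge document hashes the same regardless of the order in which an analyzer discovered its evidence.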

5. Test & Fixture Expectations

Reachbench fixtures: update golden cases with code_id + symbol metadata. Ensure both reachable/unreachable variants still pass once graphs contain the richer IDs.
Signals unit tests: add deterministic tests for lattice scoring + runtime evidence linking (tests/reachability/StellaOps.Signals.Reachability.Tests).
Replay tests: extend tests/reachability/StellaOps.Replay.Core.Tests to assert manifest v2 serialization and hash enforcement.

All fixtures must remain deterministic: sort nodes/edges, normalise casing, and freeze timestamps in test data.
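
A fixture check along these lines might look like the sketch below: nodes and edges are sorted before hashing, so the same graph in a different order yields the same digest. The function name is hypothetical, node and edge shapes follow the §4.2 samples, and SHA-256 stands in for BLAKE3 only because the Python standard library does not ship BLAKE3.

```python
import hashlib
import json
from typing import Dict, List


def canonical_graph_hash(nodes: List[Dict], edges: List[Dict]) -> str:
    """Hash a graph independently of node/edge ordering."""
    canonical = {
        "nodes": sorted(nodes, key=lambda n: n["_id"]),
        "edges": sorted(edges, key=lambda e: (e["from"], e["to"], e["kind"])),
    }
    blob = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    # SHA-256 is a stand-in here; the memo calls for BLAKE3 digests.
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


fixture_nodes = [
    {"_id": "func:ELF:sha256:4012a0", "name": "ssl3_read_bytes"},
    {"_id": "func:ELF:sha256:40f0ff", "name": "callee"},  # placeholder name
]
fixture_edges = [
    {"from": "func:ELF:sha256:4012a0", "to": "func:ELF:sha256:40f0ff", "kind": "static"},
]
golden = canonical_graph_hash(fixture_nodes, fixture_edges)
```

A golden-file test can then assert that `canonical_graph_hash(list(reversed(fixture_nodes)), fixture_edges)` equals `golden`; any timestamps in fixture data would need to be frozen for the same reason.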

6. Handoff Checklist for the Next Agent

Confirm sprint entries (SPRINT_400 and SPRINT_401) remain in sync when moving GAP-* tasks to DOING/DONE.
Start with GAP-SYM-007 (schema/helper implementation) because downstream work depends on the new code_id payload shape.
Once schema PR merges, coordinate with Signals + Policy guilds to align on CAS naming and DSSE predicates before wiring APIs.
Update the docs listed in §4 as each component lands; keep this file current with statuses and links to PRs/ADRs.
Before shipping, run the reachbench fixtures end-to-end and capture hashes for inclusion in replay docs.

Keep this document updated as tasks change state; it is the authoritative hand-off note for the advisory.
