Function-Level Evidence Readiness (Nov 2025 Advisory)
Last updated: 2025-11-09. Owner: Business Analysis Guild.
This memo captures the outstanding work required to make Stella Ops scanners emit stable, function-level evidence that matches the November 2025 advisory. It does not implement any code; instead it enumerates requirements, links them to sprint tasks, and spells out the schema/API updates that the next agent must land.
1. Goal & Scope
Goal. Anchor every vulnerability finding to an immutable {artifact_digest, code_id} tuple plus optional symbol hints so replayers can prove reachability against stripped binaries.
Scope. Scanner analyzers, runtime ingestion, Signals scoring, Replay manifests, Policy/VEX emission, CLI/UI explainers, and documentation/runbooks needed to operationalise the advisory.
Out of scope: implementing disassemblers or symbol servers; those will be handled inside the module-specific backlog tasks referenced below.
2. Advisory Requirements vs. System Gaps
| Requirement | Current gap | Task references | Notes |
|---|---|---|---|
| Immutable code identity (code_id = {format, build_id, start, length} + optional code_block_hash) | Callgraph nodes are opaque strings with no address metadata. | Sprint 401 GRAPH-CAS-401-001, GAP-SCAN-001, GAP-SYM-007 | code_id should live alongside existing SymbolID helpers so analyzers can emit it without duplicating logic. |
| Symbol hints (demangled name, source, confidence) | No schema fields for symbol metadata; demangling is ad-hoc per analyzer. | GAP-SYM-007 | Require deterministic casing + symbol.source ∈ {DWARF,PDB,SYM,none}. |
| Runtime facts mapped to code anchors | /signals/runtime-facts now accepts JSON and NDJSON (gzip) streams, stores symbol/code/process/container metadata. | Sprint 400 ZASTAVA-REACH-201-001, Sprint 401 SIGNALS-RUNTIME-401-002, GAP-ZAS-002, GAP-SIG-003 | Provenance enrichment (process/socket/container) persisted; next step is exposing CAS URIs + context facts and emitting events for Policy/Replay. |
| Replay/DSSE coverage | Replay manifests don’t enforce hash/CAS registration for graphs/traces. | Sprint 400 REPLAY-REACH-201-005, Sprint 401 REPLAY-401-004, GAP-REP-004 | Extend manifest v2 with analyzer versions + BLAKE3 digests; add DSSE predicate types. |
| Policy/VEX/UI explainability | Policy uses coarse reachability:* tags; UI/CLI cannot show call paths or evidence hashes. | Sprint 401 POLICY-VEX-401-006, UI-CLI-401-007, GAP-POL-005, GAP-VEX-006, EXPERIENCE-GAP-401-012 | Evidence blocks must cite code_id, graph hash, runtime CAS URI, analyzer version. |
| Operator documentation & samples | No guide shows how to replay {build_id,start,len} across CLI/API. | Sprint 401 QA-DOCS-401-008, GAP-DOC-008 | Produce samples under samples/reachability/** plus CLI walkthroughs. |
3. Workstreams & Expectations
3.1 Scanner Symbolization (GAP-SCAN-001 / GAP-SYM-007)
- Define SymbolID helpers that glue together {artifact_digest, file, optional section, addr, length, code_block_hash}.
- Update analyzer contracts so every analyzer returns both symbol_id and code_id, with demangled names stored under the new symbol block.
- Persist the data into richgraph-v1 payloads and attach CAS URIs via StellaOps.Scanner.Reachability.
- Deliver fixtures in tests/reachability/StellaOps.ScannerSignals.IntegrationTests that prove determinism (same hash when analyzer flags reorder).
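The helper shape described above can be sketched as follows. This is a minimal illustration in Python, not the project's implementation: the class name CodeId, the function make_symbol_id, the "sym:" prefix, and the use of SHA-256 over canonical JSON are all assumptions made for the example. Only the field names come from this memo.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def _canonical_json(fields: dict) -> str:
    # Sorted keys and fixed separators give byte-identical output for equal input.
    return json.dumps(fields, sort_keys=True, separators=(",", ":"))


@dataclass(frozen=True)
class CodeId:
    """Hypothetical container for code_id = {format, build_id, start, length}."""
    format: str
    build_id: str
    start: int
    length: int
    code_block_hash: Optional[str] = None

    def canonical(self) -> str:
        # Lower-case the textual fields so casing differences between
        # analyzers do not produce different anchors (assumed policy).
        fields = {
            "format": self.format.lower(),
            "build_id": self.build_id.lower(),
            "start": self.start,
            "length": self.length,
        }
        if self.code_block_hash is not None:
            fields["code_block_hash"] = self.code_block_hash.lower()
        return _canonical_json(fields)


def make_symbol_id(artifact_digest: str, file: str, addr: int, length: int,
                   section: Optional[str] = None,
                   code_block_hash: Optional[str] = None) -> str:
    """Glue the SymbolID inputs into one stable identifier (illustrative only)."""
    fields = {
        "artifact_digest": artifact_digest.lower(),
        "file": file,  # file paths are case-sensitive, so left untouched
        "addr": addr,
        "length": length,
    }
    if section is not None:
        fields["section"] = section
    if code_block_hash is not None:
        fields["code_block_hash"] = code_block_hash.lower()
    digest = hashlib.sha256(_canonical_json(fields).encode("utf-8")).hexdigest()
    return "sym:" + digest
```

Optional fields are omitted from the canonical form, not emitted as null, so a later schema addition does not change identifiers that never used it.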
3.2 Runtime + Signals (GAP-ZAS-002 / GAP-SIG-003)
- Extend Zastava Observer NDJSON schema to emit: symbol_id, code_id, hit_count, observed_at, loader_base, process.buildId.
- Implement /signals/runtime-facts ingestion (gzip + NDJSON) with CAS-backed storage under cas://reachability/runtime/{sha256}.
- Update ReachabilityScoringService to use lattice states and include runtime evidence references plus CAS URIs in ReachabilityFactDocument.Metadata.
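The ingestion step above could look roughly like the sketch below. It is an illustration under stated assumptions: the function names are hypothetical, only symbol_id is treated as mandatory, and the CAS key is computed over the decompressed bytes so that gzip settings do not change the address. None of these choices is confirmed by this memo.

```python
import gzip
import hashlib
import json
from typing import List, Tuple

GZIP_MAGIC = b"\x1f\x8b"


def cas_uri_for(raw: bytes) -> str:
    # URI layout taken from the memo: cas://reachability/runtime/{sha256}
    return "cas://reachability/runtime/" + hashlib.sha256(raw).hexdigest()


def ingest_runtime_facts(blob: bytes) -> Tuple[str, List[dict]]:
    """Parse an NDJSON payload (optionally gzip-compressed) into events."""
    if blob[:2] == GZIP_MAGIC:
        blob = gzip.decompress(blob)

    events = []
    for line_no, line in enumerate(blob.decode("utf-8").splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        event = json.loads(line)
        if "symbol_id" not in event:
            # Assumption: symbol_id is the only mandatory field.
            raise ValueError(f"line {line_no}: missing symbol_id")
        events.append(event)

    return cas_uri_for(blob), events
```

Hashing the decompressed stream means the same trace uploaded plain or gzipped resolves to one CAS entry.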
3.3 Replay & Evidence (GAP-REP-004)
- Enforce CAS registration + BLAKE3 hashing before manifest writes (graphs and traces).
- Teach ReachabilityReplayWriter to require analyzer name/version, graph kind, code_id coverage summary.
- Update docs/replay/DETERMINISTIC_REPLAY.md once schema v2 is finalized.
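One way to picture the enforcement described above is a pre-write check that rejects incomplete graph entries. The sketch below is hypothetical: the field names (analyzer_name, analyzer_version, graph_kind, code_id_coverage, cas_uri, digest) and the idea of passing in a set of registered CAS URIs are assumptions for illustration, since manifest v2 is not yet finalized.

```python
from typing import Dict, List, Set

# Hypothetical field names; the real manifest v2 schema may differ.
REQUIRED_FIELDS = (
    "analyzer_name",
    "analyzer_version",
    "graph_kind",
    "code_id_coverage",
    "cas_uri",
    "digest",
)


def validate_graph_entry(entry: Dict[str, object],
                         registered_cas_uris: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the entry may be written."""
    errors = []
    for name in REQUIRED_FIELDS:
        if entry.get(name) in (None, ""):
            errors.append(f"missing required field: {name}")

    cas_uri = entry.get("cas_uri")
    if cas_uri and cas_uri not in registered_cas_uris:
        # The graph or trace must be registered in CAS before the manifest
        # is allowed to reference it.
        errors.append(f"CAS entry not registered: {cas_uri}")
    return errors
```

Returning every problem at once, instead of failing on the first, keeps replay failures easier to diagnose.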
3.4 Policy, VEX, CLI/UI (GAP-POL-005 / GAP-VEX-006)
- Policy Engine: ingest new reachability facts, expose reachability.state, max_path_conf, and evidence.graph_hash via SPL + API.
- CLI/UI: add stella graph explain and explain drawer showing call path (SymbolID list), code anchors, runtime hits, DSSE references.
- Notify templates: include short evidence summary (first hop + truncated code_id).
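The short evidence summary for Notify templates could be produced along these lines. The function name, the bracketed output format, and the 12-character truncation length are assumptions made for this sketch. The memo only specifies "first hop + truncated code_id".

```python
from typing import Sequence


def short_evidence_summary(call_path: Sequence[str], code_id: str,
                           max_len: int = 12) -> str:
    """Build a one-line summary: first hop of the call path plus a truncated code_id."""
    first_hop = call_path[0] if call_path else "(no call path)"
    if len(code_id) > max_len:
        # Mark truncation explicitly so readers do not mistake it for the full id.
        code_id = code_id[:max_len] + "..."
    return f"{first_hop} [{code_id}]"
```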
3.5 Documentation & Samples (GAP-DOC-008)
- Publish schema diffs in docs/data/evidence-schema.md (new file) covering SBOM evidence nodes, runtime NDJSON, and API responses.
- Write CLI/API walkthroughs in docs/09_API_CLI_REFERENCE.md and docs/api/policy.md showing how to request reachability evidence and verify DSSE chains.
- Produce OpenVEX + replay samples under samples/reachability/ showing facts.type = "stella.reachability" with graph_hash and code_id arrays.
4. Schema & API Touchpoints
The next implementation pass must cover the following documents/files (create them if missing):
- docs/data/evidence-schema.md – authoritative schema for {code_id, symbol, tool} blocks.
- docs/runbooks/reachability-runtime.md – operator steps for staging runtime ingestion bundles, retention, and troubleshooting.
- docs/runbooks/replay_ops.md – add section detailing replay verification using the new graph/runtime CAS entries.
API contracts to amend:
- POST /signals/callgraphs response should include graphHash (BLAKE3) once GRAPH-CAS-401-001 lands.
- POST /signals/runtime-facts request body schema (NDJSON) with symbol_id, code_id, hit_count, loader_base.
- GET /policy/findings payload must surface reachability.evidence[] objects.
4.1 Signals runtime ingestion snapshot (Nov 2025)
- /signals/runtime-facts (JSON) and /signals/runtime-facts/ndjson (streaming, optional gzip) accept the following event fields: symbolId (required), codeId, loaderBase, hitCount, processId, processName, socketAddress, containerId, evidenceUri, metadata.
- Subject context (scanId/imageDigest/component/version) plus callgraphId is supplied either in the JSON body or as query params for the NDJSON endpoint.
- Signals dedupes events, merges metadata, and persists the aggregated RuntimeFacts onto ReachabilityFactDocument. These facts now feed reachability scoring (SIGNALS-24-004/005) as part of the runtime bonus lattice.
- Outstanding work: record CAS URIs for runtime traces, emit provenance events, and expose the enriched context to Policy/Replay consumers.
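The dedupe-and-merge behaviour described above might be approximated as in the sketch below. The memo does not define the dedupe key or the merge rules, so several details here are assumptions: events are keyed on (symbolId, codeId, loaderBase), hit counts are summed, a missing hitCount counts as 1, and later metadata values overwrite earlier ones.

```python
from typing import Dict, Iterable, List, Tuple


def aggregate_runtime_facts(events: Iterable[dict]) -> List[dict]:
    """Collapse duplicate runtime events into one fact per code anchor."""
    merged: Dict[Tuple, dict] = {}
    for event in events:
        # Assumed dedupe key; the real service may use a different one.
        key = (event["symbolId"], event.get("codeId"), event.get("loaderBase"))
        hits = event.get("hitCount", 1)
        fact = merged.get(key)
        if fact is None:
            merged[key] = {
                "symbolId": event["symbolId"],
                "codeId": event.get("codeId"),
                "loaderBase": event.get("loaderBase"),
                "hitCount": hits,
                "metadata": dict(event.get("metadata") or {}),
            }
        else:
            fact["hitCount"] += hits
            fact["metadata"].update(event.get("metadata") or {})

    # Sort so the persisted order does not depend on arrival order.
    return sorted(
        merged.values(),
        key=lambda f: (f["symbolId"], str(f["codeId"] or ""), str(f["loaderBase"] or "")),
    )
```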
5. Test & Fixture Expectations
- Reachbench fixtures: update golden cases with code_id + symbol metadata. Ensure both reachable/unreachable variants still pass once graphs contain the richer IDs.
- Signals unit tests: add deterministic tests for lattice scoring + runtime evidence linking (tests/reachability/StellaOps.Signals.Reachability.Tests).
- Replay tests: extend tests/reachability/StellaOps.Replay.Core.Tests to assert manifest v2 serialization and hash enforcement.
All fixtures must remain deterministic: sort nodes/edges, normalise casing, and freeze timestamps in test data.
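The determinism rule can be illustrated with a canonical graph hash that ignores input ordering. This is a sketch under assumptions: node identifiers are treated as case-insensitive, duplicates are collapsed, and SHA-256 stands in for BLAKE3 because the Python standard library does not ship BLAKE3. The function name is hypothetical.

```python
import hashlib
import json
from typing import Iterable, Tuple


def canonical_graph_hash(nodes: Iterable[str],
                         edges: Iterable[Tuple[str, str]]) -> str:
    """Hash a callgraph so that node/edge ordering and casing do not matter."""
    # Normalise casing, drop duplicates, then sort for a stable layout.
    norm_nodes = sorted({n.lower() for n in nodes})
    norm_edges = sorted({(src.lower(), dst.lower()) for src, dst in edges})
    payload = json.dumps(
        {"nodes": norm_nodes, "edges": [list(e) for e in norm_edges]},
        sort_keys=True,
        separators=(",", ":"),
    )
    # SHA-256 is a stand-in here; the memo calls for BLAKE3 digests.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A fixture test can then assert that shuffling the analyzer output leaves the hash unchanged, while adding or removing an edge changes it.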
6. Handoff Checklist for the Next Agent
- Confirm sprint entries (SPRINT_400 and SPRINT_401) remain in sync when moving GAP-* tasks to DOING/DONE.
- Start with GAP-SYM-007 (schema/helper implementation) because downstream work depends on the new code_id payload shape.
- Once schema PR merges, coordinate with Signals + Policy guilds to align on CAS naming and DSSE predicates before wiring APIs.
- Update the docs listed in §4 as each component lands; keep this file current with statuses and links to PRs/ADRs.
- Before shipping, run the reachbench fixtures end-to-end and capture hashes for inclusion in replay docs.
Keep this document updated as tasks change state; it is the authoritative hand-off note for the advisory.