chore(docs): normalize reachability sprint references

This commit is contained in:
StellaOps Bot
2025-11-22 16:36:27 +00:00
parent dc7c75b496
commit ca09400069
3 changed files with 313 additions and 313 deletions

View File

@@ -1,159 +1,159 @@
# Reachability Evidence Delivery Guide
_Last updated: November 8, 2025. Owner: Reachability Tiger Team (Scanner, Signals, Replay, Policy, Authority, UI)._
This guide translates the deterministic reachability blueprint into concrete work streams that average contributors can pick up without re-reading the entire proposal. Use it as the single navigation point when you land a reachability ticket. For a task-centric view of remaining gaps, see `docs/reachability/REACHABILITY_GAP_TASKS.md`.
---
## 1. Scope & Principles
**Goal**: ship a verifiable reachability signal for every scan by chaining SBOM → graph → runtime facts → VEX into DSSE-attested, replayable evidence.
**Principles**
1. **Deterministic inputs** canonical IDs, sorted payloads, normalized timestamps.
2. **Provable facts** every artifact has a DSSE envelope anchored in Authority + Rekor mirror.
3. **Replay-first** manifests pin feed snapshots, analyzer digests, and policies so auditors can rerun.
4. **Least surprise** same API and file layouts across languages; tests run fixture packs at CI time.
---
## 2. Evidence Chain Overview
| Stage | Producer | Artifact | Requirements |
|-------|----------|----------|--------------|
| SBOM per layer & composed image | Scanner Worker + Sbomer | `sbom.layer.cdx.json`, `sbom.image.cdx.json` | Deterministic CycloneDX 1.6, DSSE envelope, CAS URI |
| Static reachability graph | Scanner Worker lifters (DotNet, Go, Node/Deno, Rust, Swift, JVM, Binary, Shell) | `richgraph-v1.json` + `sha256` | Canonical SymbolIDs, framework entries, predicates, graph hash |
| Runtime facts | Zastava Observer / runtime probes | `runtime-trace.ndjson` (gzip or JSON) | EntryTrace schema, CAS pointer, process/socket/container metadata, optional compression |
| Replay manifest | Scanner Worker + Replay Core | `replay.yaml` | Contains analyzer versions, feed locks, graph hash, runtime trace digests |
| VEX statements | Scanner WebService + Policy Engine | `reachability.json` + OpenVEX doc | Links SBOM attn, graph attn, runtime evidence IDs |
| Signed bundle | Authority + Signer | DSSE envelope referencing above | Support FIPS + PQ variants (Dilithium where required) |
---
## 3. Work Streams (modules + hand-offs)
| Stream | Owner Guild(s) | Key deliverables |
|--------|----------------|------------------|
| **Native symbols & callgraphs** | Scanner Worker · Symbols Guild | Ship `Scanner.Symbols.Native` + `Scanner.CallGraph.Native`, integrate Symbol Manifest v1, demangle Itanium/MSVC names, emit `FuncNode`/`CallEdge` CAS bundles (task `SCANNER-NATIVE-401-015`). |
| **Reachability store** | Signals · BE-Base Platform | Provision shared Mongo collections (`func_nodes`, `call_edges`, `cve_func_hits`), indexes, and repositories plus REST hooks for reuse (task `SIG-STORE-401-016`). |
| **Language lifters** | Scanner Worker | CLI/hosted lifters for DotNet, Go, Node/Deno, JVM, Rust, Swift, Binary, Shell with CAS uploads and richgraph output |
| **Signals ingestion & scoring** | Signals | `/callgraphs`, `/runtime-facts` (JSON + NDJSON/gzip), `/graphs/{id}`, `/reachability/recompute` GA; CAS-backed storage, runtime dedupe, BFS+predicates scoring |
| **Runtime capture** | Zastava + Runtime Guild | EntryTrace/eBPF samplers, NDJSON batches (symbol IDs + timestamps + counts) |
| **Replay evidence** | Replay Core + Scanner Worker | Manifest schema v2, `ReachabilityReplayWriter` integration, hash-lock tests |
| **Authority attestations** | Authority + Signer | DSSE predicates for SBOM, Graph, Replay, VEX; Rekor mirror alignment |
| **Policy & VEX** | Policy Engine + Web + CLI + UI | Accept reachability states, render “Why safe” call paths, CLI/UI explain flows |
| **QA & Docs** | QA + Docs Guilds | `reachbench-2025-expanded` fixtures wired to CI; operator + developer runbooks |
| **Binary quality guardrails (Nov 2026)** | Scanner · Signals · QA | Build-id capture, init-array roots, purl-resolved edges, unknowns emission, and patch-oracle fixtures; see sections 5.75.9 |
---
## 4. Sprint Targets
| Sprint | Nickname | Focus | Exit Criteria |
|--------|----------|-------|---------------|
| **401** | Evidence Pipeline | Finish static lifters + CAS graph storage + runtime ingestion endpoint | Graph CAS layout documented, lifter fixtures passing, `/runtime-facts` receives NDJSON batches |
| **402** | Replay & Attest | Manifest v2, DSSE envelopes, Authority/Rekor publishing | Replay packs include hashes + analyzer fingerprint; DSSE statements passed integration; Rekor mirror updated |
| **403** | Policy & Explain | VEX generation, SPL predicates, UI/CLI explainers | Policy engine uses reachability states, CLI `stella graph explain` returns signed paths, UI shows explain drawer |
Each sprint is two weeks; refer to `docs/implplan/SPRINT_0401_0001_0001_reachability_evidence_chain.md` (new) for per-task tracking.
---
## 5. Task Breakdown Cheat Sheet
### 5.1 Scanner Worker
1. **Lifter SDK** Define `RichGraphWriter`, canonical SymbolID helpers, analyzer interface updates.
2. **Language passes** deliverables per language: discovery, graph build, framework wiring, predicate extraction, runtime overlay.
3. **Replay hooks** plug lifter output + runtime traces into `ReachabilityReplayWriter`; enforce CAS registration before emitting manifest references.
4. **Fixture runs** add tests under `tests/reachability/StellaOps.ScannerSignals.IntegrationTests` to execute lifter outputs against reachbench A/B cases.
### 5.2 Signals Service
1. **Callgraph CAS layout** migrate from filesystem to CAS (`cas://reachability/graphs/{hash}`), include metadata doc.
2. **Runtime facts API** accept NDJSON or gzip, dedupe events, compute hit stats, link to graph nodes.
3. **Scoring engine v2** support multi-state lattice (`Unknown → Observed`), record predicates, blocked edges, runtime evidence CAS URIs.
4. **API responses** `/graphs/{scanId}` returns graph CAS refs + manifest pointers; `/reachability/recompute` accepts replay manifest IDs.
### 5.3 Replay Core & Authority
1. **Manifest schema v2** YAML + JSON versions, includes feeds/analyzers/policies.
2. **CAS naming** standardize `cas://reachability/{kind}/{sha256}`.
3. **DSSE predicate types** `SbomAttestation`, `GraphAttestation`, `VexAttestation`, `ReplayManifest`.
4. **Authority integration** new endpoints for submitting reachability predicates, rotation tests, Rekor mirror update instructions.
### 5.4 Policy / Web / UI / CLI
1. **Policy Engine** ingest reachability fact from Signals, expose via SPL, produce metrics, integrate into explanation tree.
2. **Web API** join reachability fields in vuln responses, add override endpoints, simulate support.
3. **UI/CLI** Visual explain drawer/CLI command showing signed call-path, predicates, runtime hits; counterfactual toggles.
4. **VEX emitter** generate OpenVEX statements with evidence references, DSSE sign via Signer.
### 5.5 Native binaries (build-id + init roots)
- Capture ELF build-id (`.note.gnu.build-id`) alongside soname/path and propagate into `SymbolID`/`code_id` so SBOM/runtime joins stay stable even when paths change.
- Treat `.preinit_array`, `.init_array`, `.ctors`, and `_init` as synthetic graph roots with `phase=load`; include constructors from `DT_NEEDED` deps. Persist the root list in scan evidence.
- Add deterministic tests covering build-id present/absent and init-array edge creation.
### 5.6 PURL-resolved edges
- Annotate every call edge with callee `purl` and `symbol_digest` per `docs/reachability/purl-resolved-edges.md`.
- Update `richgraph-v1` schema, CAS metadata, and CLI/UI explainers to display `purl@version` + demangled name.
- Signals merges graphs by `(purl, symbol_digest)`; Policy uses the same keys when mapping CVE-affected functions.
### 5.7 Unknowns Registry integration
- Emit structured Unknowns when symbol→purl mapping, edge targets, or hashes are ambiguous; write them via Signals API per `docs/signals/unknowns-registry.md`.
- Scoring adds `unknowns_pressure` so `not_affected` claims cannot bypass unresolved evidence.
- UI/CLI should surface unknown chips and triage actions.
### 5.8 Patch-oracle guardrails
- Add `tests/reachability/patch-oracles/**` with paired vuln/fixed binaries and `oracle.yml` expectations (functions/edges added/removed).
- Scanner binary analyzer tests must fail if expected guard functions or edges are missing; CI job ensures determinism.
- See `docs/reachability/patch-oracles.md` for fixture layout and manifest schema.
### 5.9 JS/PHP framework reachability
- Model framework entrypoints explicitly: Express/Fastify/Nest handlers, Laravel/Symfony routes/commands/hooks. Generate graph roots from route/handler catalogs instead of generic `main` only.
- Represent dynamic import/require/include resolution as graph nodes so ambiguity stays visible (`resolution` edges with confidence).
- Keep multi-layer graphs: source-level (TS/JS/PHP) plus bundled output (Webpack/Vite). Merge with runtime hints when available.
- Status model: `always_reachable`, `conditional`, `not_reachable`, `not_analyzed`, `ambiguous`, each with confidence and evidence tags.
- Deliver language-specific profiles + fixture cases to prove coverage; update CLI/UI explainers to show framework route context.
---
## 6. Acceptance Tests
1. **Hash-lock** reorder analyzer flags and confirm graph hash unchanged.
2. **Replay** delete caches, replay manifest, verify DSSE + hash equality.
3. **Tamper** alter single edge and expect VEX verification failure with specific path mismatch.
4. **Golden corpus** run all reachbench cases; ensure NotReachable vs Reachable twins align with expectations JSON.
5. **Runtime sanity** feed staged runtime traces and ensure confidence bump + `observed=true` path chips propagate to UI.
---
## 7. Documentation & Runbooks
- Place developer-facing updates here (`docs/reachability`).
- [Function-level evidence guide](function-level-evidence.md) captures the Nov2025 advisory scope, task references, and schema expectations; keep it in lockstep with sprint status.
- [Reachability runtime runbook](../runbooks/reachability-runtime.md) now documents ingestion, CAS staging, air-gap handling, and troubleshooting—link every runtime feature PR to this guide.
- [VEX Evidence Playbook](../benchmarks/vex-evidence-playbook.md) defines the bench repo layout, artifact shapes, verifier tooling, and metrics; keep it updated when Policy/Signer/CLI features land.
- [Reachability lattice](lattice.md) describes the confidence states, evidence/mitigation kinds, scoring policy, event graph schema, and VEX gates; update it when lattices or probes change.
- [PURL-resolved edges spec](purl-resolved-edges.md) defines the purl + symbol-digest annotation rules for graphs and SBOM joins.
- [Patch-oracles QA pattern](patch-oracles.md) describes the fixture layout and expectations for binary reachability guards.
- [Unknowns registry](../signals/unknowns-registry.md) documents how unresolved symbols/edges are recorded and how scoring uses `unknowns_pressure`.
- [Evidence schema](evidence-schema.md) is the canonical field list for richgraph, runtime facts, and Unknowns CAS objects.
- Update module dossiers (Scanner, Signals, Replay, Authority, Policy, UI) once each guild lands work.
---
## 8. Contact & Rituals
- **Daily reachability stand-up** in `#reachability-build`.
- **Fixture sync** every Friday: QA leads run reachbench matrix, post report to Confluence + link in `docs/reachability/DELIVERY_GUIDE.md`.
- **Decision log** Append ADRs under `docs/adr/reachability-*` for schema changes.
Keep this guide updated whenever scope shifts or a new sprint is added.
# Reachability Evidence Delivery Guide
_Last updated: November 8, 2025. Owner: Reachability Tiger Team (Scanner, Signals, Replay, Policy, Authority, UI)._
This guide translates the deterministic reachability blueprint into concrete work streams that average contributors can pick up without re-reading the entire proposal. Use it as the single navigation point when you land a reachability ticket. For a task-centric view of remaining gaps, see `docs/reachability/REACHABILITY_GAP_TASKS.md`.
---
## 1. Scope & Principles
**Goal**: ship a verifiable reachability signal for every scan by chaining SBOM → graph → runtime facts → VEX into DSSE-attested, replayable evidence.
**Principles**
1. **Deterministic inputs** canonical IDs, sorted payloads, normalized timestamps.
2. **Provable facts** every artifact has a DSSE envelope anchored in Authority + Rekor mirror.
3. **Replay-first** manifests pin feed snapshots, analyzer digests, and policies so auditors can rerun.
4. **Least surprise** same API and file layouts across languages; tests run fixture packs at CI time.
---
## 2. Evidence Chain Overview
| Stage | Producer | Artifact | Requirements |
|-------|----------|----------|--------------|
| SBOM per layer & composed image | Scanner Worker + Sbomer | `sbom.layer.cdx.json`, `sbom.image.cdx.json` | Deterministic CycloneDX 1.6, DSSE envelope, CAS URI |
| Static reachability graph | Scanner Worker lifters (DotNet, Go, Node/Deno, Rust, Swift, JVM, Binary, Shell) | `richgraph-v1.json` + `sha256` | Canonical SymbolIDs, framework entries, predicates, graph hash |
| Runtime facts | Zastava Observer / runtime probes | `runtime-trace.ndjson` (gzip or JSON) | EntryTrace schema, CAS pointer, process/socket/container metadata, optional compression |
| Replay manifest | Scanner Worker + Replay Core | `replay.yaml` | Contains analyzer versions, feed locks, graph hash, runtime trace digests |
| VEX statements | Scanner WebService + Policy Engine | `reachability.json` + OpenVEX doc | Links SBOM attn, graph attn, runtime evidence IDs |
| Signed bundle | Authority + Signer | DSSE envelope referencing above | Support FIPS + PQ variants (Dilithium where required) |
---
## 3. Work Streams (modules + hand-offs)
| Stream | Owner Guild(s) | Key deliverables |
|--------|----------------|------------------|
| **Native symbols & callgraphs** | Scanner Worker · Symbols Guild | Ship `Scanner.Symbols.Native` + `Scanner.CallGraph.Native`, integrate Symbol Manifest v1, demangle Itanium/MSVC names, emit `FuncNode`/`CallEdge` CAS bundles (task `SCANNER-NATIVE-401-015`). |
| **Reachability store** | Signals · BE-Base Platform | Provision shared Mongo collections (`func_nodes`, `call_edges`, `cve_func_hits`), indexes, and repositories plus REST hooks for reuse (task `SIG-STORE-401-016`). |
| **Language lifters** | Scanner Worker | CLI/hosted lifters for DotNet, Go, Node/Deno, JVM, Rust, Swift, Binary, Shell with CAS uploads and richgraph output |
| **Signals ingestion & scoring** | Signals | `/callgraphs`, `/runtime-facts` (JSON + NDJSON/gzip), `/graphs/{id}`, `/reachability/recompute` GA; CAS-backed storage, runtime dedupe, BFS+predicates scoring |
| **Runtime capture** | Zastava + Runtime Guild | EntryTrace/eBPF samplers, NDJSON batches (symbol IDs + timestamps + counts) |
| **Replay evidence** | Replay Core + Scanner Worker | Manifest schema v2, `ReachabilityReplayWriter` integration, hash-lock tests |
| **Authority attestations** | Authority + Signer | DSSE predicates for SBOM, Graph, Replay, VEX; Rekor mirror alignment |
| **Policy & VEX** | Policy Engine + Web + CLI + UI | Accept reachability states, render “Why safe” call paths, CLI/UI explain flows |
| **QA & Docs** | QA + Docs Guilds | `reachbench-2025-expanded` fixtures wired to CI; operator + developer runbooks |
| **Binary quality guardrails (Nov 2026)** | Scanner · Signals · QA | Build-id capture, init-array roots, purl-resolved edges, unknowns emission, and patch-oracle fixtures; see sections 5.75.9 |
---
## 4. Sprint Targets
| Sprint | Nickname | Focus | Exit Criteria |
|--------|----------|-------|---------------|
| **401** | Evidence Pipeline | Finish static lifters + CAS graph storage + runtime ingestion endpoint | Graph CAS layout documented, lifter fixtures passing, `/runtime-facts` receives NDJSON batches |
| **402** | Replay & Attest | Manifest v2, DSSE envelopes, Authority/Rekor publishing | Replay packs include hashes + analyzer fingerprint; DSSE statements passed integration; Rekor mirror updated |
| **403** | Policy & Explain | VEX generation, SPL predicates, UI/CLI explainers | Policy engine uses reachability states, CLI `stella graph explain` returns signed paths, UI shows explain drawer |
Each sprint is two weeks; refer to `docs/implplan/SPRINT_0401_0001_0001_reachability_evidence_chain.md` (new) for per-task tracking.
---
## 5. Task Breakdown Cheat Sheet
### 5.1 Scanner Worker
1. **Lifter SDK** Define `RichGraphWriter`, canonical SymbolID helpers, analyzer interface updates.
2. **Language passes** deliverables per language: discovery, graph build, framework wiring, predicate extraction, runtime overlay.
3. **Replay hooks** plug lifter output + runtime traces into `ReachabilityReplayWriter`; enforce CAS registration before emitting manifest references.
4. **Fixture runs** add tests under `tests/reachability/StellaOps.ScannerSignals.IntegrationTests` to execute lifter outputs against reachbench A/B cases.
### 5.2 Signals Service
1. **Callgraph CAS layout** migrate from filesystem to CAS (`cas://reachability/graphs/{hash}`), include metadata doc.
2. **Runtime facts API** accept NDJSON or gzip, dedupe events, compute hit stats, link to graph nodes.
3. **Scoring engine v2** support multi-state lattice (`Unknown → Observed`), record predicates, blocked edges, runtime evidence CAS URIs.
4. **API responses** `/graphs/{scanId}` returns graph CAS refs + manifest pointers; `/reachability/recompute` accepts replay manifest IDs.
### 5.3 Replay Core & Authority
1. **Manifest schema v2** YAML + JSON versions, includes feeds/analyzers/policies.
2. **CAS naming** standardize `cas://reachability/{kind}/{sha256}`.
3. **DSSE predicate types** `SbomAttestation`, `GraphAttestation`, `VexAttestation`, `ReplayManifest`.
4. **Authority integration** new endpoints for submitting reachability predicates, rotation tests, Rekor mirror update instructions.
### 5.4 Policy / Web / UI / CLI
1. **Policy Engine** ingest reachability fact from Signals, expose via SPL, produce metrics, integrate into explanation tree.
2. **Web API** join reachability fields in vuln responses, add override endpoints, simulate support.
3. **UI/CLI** Visual explain drawer/CLI command showing signed call-path, predicates, runtime hits; counterfactual toggles.
4. **VEX emitter** generate OpenVEX statements with evidence references, DSSE sign via Signer.
### 5.5 Native binaries (build-id + init roots)
- Capture ELF build-id (`.note.gnu.build-id`) alongside soname/path and propagate into `SymbolID`/`code_id` so SBOM/runtime joins stay stable even when paths change.
- Treat `.preinit_array`, `.init_array`, `.ctors`, and `_init` as synthetic graph roots with `phase=load`; include constructors from `DT_NEEDED` deps. Persist the root list in scan evidence.
- Add deterministic tests covering build-id present/absent and init-array edge creation.
### 5.6 PURL-resolved edges
- Annotate every call edge with callee `purl` and `symbol_digest` per `docs/reachability/purl-resolved-edges.md`.
- Update `richgraph-v1` schema, CAS metadata, and CLI/UI explainers to display `purl@version` + demangled name.
- Signals merges graphs by `(purl, symbol_digest)`; Policy uses the same keys when mapping CVE-affected functions.
### 5.7 Unknowns Registry integration
- Emit structured Unknowns when symbol→purl mapping, edge targets, or hashes are ambiguous; write them via Signals API per `docs/signals/unknowns-registry.md`.
- Scoring adds `unknowns_pressure` so `not_affected` claims cannot bypass unresolved evidence.
- UI/CLI should surface unknown chips and triage actions.
### 5.8 Patch-oracle guardrails
- Add `tests/reachability/patch-oracles/**` with paired vuln/fixed binaries and `oracle.yml` expectations (functions/edges added/removed).
- Scanner binary analyzer tests must fail if expected guard functions or edges are missing; CI job ensures determinism.
- See `docs/reachability/patch-oracles.md` for fixture layout and manifest schema.
### 5.9 JS/PHP framework reachability
- Model framework entrypoints explicitly: Express/Fastify/Nest handlers, Laravel/Symfony routes/commands/hooks. Generate graph roots from route/handler catalogs instead of generic `main` only.
- Represent dynamic import/require/include resolution as graph nodes so ambiguity stays visible (`resolution` edges with confidence).
- Keep multi-layer graphs: source-level (TS/JS/PHP) plus bundled output (Webpack/Vite). Merge with runtime hints when available.
- Status model: `always_reachable`, `conditional`, `not_reachable`, `not_analyzed`, `ambiguous`, each with confidence and evidence tags.
- Deliver language-specific profiles + fixture cases to prove coverage; update CLI/UI explainers to show framework route context.
---
## 6. Acceptance Tests
1. **Hash-lock** reorder analyzer flags and confirm graph hash unchanged.
2. **Replay** delete caches, replay manifest, verify DSSE + hash equality.
3. **Tamper** alter single edge and expect VEX verification failure with specific path mismatch.
4. **Golden corpus** run all reachbench cases; ensure NotReachable vs Reachable twins align with expectations JSON.
5. **Runtime sanity** feed staged runtime traces and ensure confidence bump + `observed=true` path chips propagate to UI.
---
## 7. Documentation & Runbooks
- Place developer-facing updates here (`docs/reachability`).
- [Function-level evidence guide](function-level-evidence.md) captures the Nov2025 advisory scope, task references, and schema expectations; keep it in lockstep with sprint status.
- [Reachability runtime runbook](../runbooks/reachability-runtime.md) now documents ingestion, CAS staging, air-gap handling, and troubleshooting—link every runtime feature PR to this guide.
- [VEX Evidence Playbook](../benchmarks/vex-evidence-playbook.md) defines the bench repo layout, artifact shapes, verifier tooling, and metrics; keep it updated when Policy/Signer/CLI features land.
- [Reachability lattice](lattice.md) describes the confidence states, evidence/mitigation kinds, scoring policy, event graph schema, and VEX gates; update it when lattices or probes change.
- [PURL-resolved edges spec](purl-resolved-edges.md) defines the purl + symbol-digest annotation rules for graphs and SBOM joins.
- [Patch-oracles QA pattern](patch-oracles.md) describes the fixture layout and expectations for binary reachability guards.
- [Unknowns registry](../signals/unknowns-registry.md) documents how unresolved symbols/edges are recorded and how scoring uses `unknowns_pressure`.
- [Evidence schema](evidence-schema.md) is the canonical field list for richgraph, runtime facts, and Unknowns CAS objects.
- Update module dossiers (Scanner, Signals, Replay, Authority, Policy, UI) once each guild lands work.
---
## 8. Contact & Rituals
- **Daily reachability stand-up** in `#reachability-build`.
- **Fixture sync** every Friday: QA leads run reachbench matrix, post report to Confluence + link in `docs/reachability/DELIVERY_GUIDE.md`.
- **Decision log** Append ADRs under `docs/adr/reachability-*` for schema changes.
Keep this guide updated whenever scope shifts or a new sprint is added.