docs consolidation and others

2026-01-06 19:02:21 +02:00
parent d7bdca6d97
commit 4789027317
849 changed files with 16551 additions and 66770 deletions
--- a/docs/modules/scanner/architecture.md
+++ b/docs/modules/scanner/architecture.md
@@ -87,11 +87,11 @@ CLI usage: `stella scan --semantic <image>` enables semantic analysis in output.
 - **Stripped-binary pipeline**: native analyzers must recover functions even without symbols (prolog patterns, xrefs, PLT/GOT, vtables). Emit a tool-agnostic neutral JSON (NJIF) with functions, CFG/CG, and evidence tags. Keep heuristics deterministic and record toolchain hashes in the scan manifest.
 - **Synthetic roots**: treat `.preinit_array`, `.init_array`, legacy `.ctors`, and `_init` as graph entrypoints; add roots for constructors in each `DT_NEEDED` dependency. Tag edges from these roots with `phase=load` for explainers.
 - **Build-id capture**: read `.note.gnu.build-id` for every ELF, store hex build-id alongside soname/path, propagate into `SymbolID`/`code_id`, and expose it to SBOM + runtime joiners. If missing, fall back to file hash and mark source accordingly.
- **PURL-resolved edges**: annotate call edges with the callee purl and `symbol_digest` so graphs merge with SBOM components. See `docs/reachability/purl-resolved-edges.md` for schema rules and acceptance tests.
+- **PURL-resolved edges**: annotate call edges with the callee purl and `symbol_digest` so graphs merge with SBOM components. See `docs/modules/reach-graph/guides/purl-resolved-edges.md` for schema rules and acceptance tests.
 - **Symbol hints in evidence**: reachability union and richgraph payloads emit `symbol {mangled,demangled,source,confidence}` plus optional `code_block_hash` for stripped/heuristic functions; serializers clamp confidence to [0,1] and uppercase `source` (`DWARF|PDB|SYM|NONE`) for determinism.
- **Unknowns emission**: when symbol → purl mapping or edge targets remain unresolved, emit structured Unknowns to Signals (see `docs/signals/unknowns-registry.md`) instead of dropping evidence.
+- **Unknowns emission**: when symbol -> purl mapping or edge targets remain unresolved, emit structured Unknowns to Signals (see `docs/modules/signals/guides/unknowns-registry.md`) instead of dropping evidence.
 - **Hybrid attestation**: emit **graph-level DSSE** for every `richgraph-v1` (mandatory) and optional **edge-bundle DSSE** (≤512 edges) for runtime/init-root/contested edges or third-party provenance. Publish graph DSSE digests to Rekor by default; edge-bundle Rekor publish is policy-driven. CAS layout: `cas://reachability/graphs/{blake3}` for graph body, `.../{blake3}.dsse` for envelope, and `cas://reachability/edges/{graph_hash}/{bundle_id}[.dsse]` for bundles. Deterministic ordering before hashing/signing is required.
- **Deterministic call-graph manifest**: capture analyzer versions, feed hashes, toolchain digests, and flags in a manifest stored alongside `richgraph-v1`; replaying with the same manifest MUST yield identical node/edge sets and hashes (see `docs/reachability/lead.md`).
+- **Deterministic call-graph manifest**: capture analyzer versions, feed hashes, toolchain digests, and flags in a manifest stored alongside `richgraph-v1`; replaying with the same manifest MUST yield identical node/edge sets and hashes (see `docs/modules/reach-graph/guides/lead.md`).

 ### 1.1 Queue backbone (Valkey / NATS)

@@ -281,7 +281,7 @@ When `scanner.events.enabled = true`, the WebService serialises the signed repor
 * The exported metadata (`stellaops.os.*` properties, license list, source package) feeds policy scoring and export pipelines
  directly – Policy evaluates quiet rules against package provenance while Exporters forward the enriched fields into
  downstream JSON/Trivy payloads.
-* **Reachability lattice**: analyzers + runtime probes emit `Evidence`/`Mitigation` records (see `docs/reachability/lattice.md`). The lattice engine joins static path evidence, runtime hits (EventPipe/JFR), taint flows, environment gates, and mitigations into `ReachDecision` documents that feed VEX gating and event graph storage.
+* **Reachability lattice**: analyzers + runtime probes emit `Evidence`/`Mitigation` records (see `docs/modules/reach-graph/guides/lattice.md`). The lattice engine joins static path evidence, runtime hits (EventPipe/JFR), taint flows, environment gates, and mitigations into `ReachDecision` documents that feed VEX gating and event graph storage.
 * Sprint 401 introduces `StellaOps.Scanner.Symbols.Native` (DWARF/PDB reader + demangler) and `StellaOps.Scanner.CallGraph.Native`
  (function boundary detector + call-edge builder). These libraries feed `FuncNode`/`CallEdge` CAS bundles and enrich reachability
  graphs with `{code_id, confidence, evidence}` so Signals/Policy/UI can cite function-level justifications.
@@ -378,7 +378,7 @@ The emitted `buildId` metadata is preserved in component hashes, diff payloads,
 * WebService constructs **predicate** with `image_digest`, `stellaops_version`, `license_id`, `policy_digest?` (when emitting **final reports**), timestamps.
 * Calls **Signer** (requires **OpTok + PoE**); Signer verifies **entitlement + scanner image integrity** and returns **DSSE bundle**.
 * **Attestor** logs to **Rekor v2**; returns `{uuid,index,proof}` → stored in `artifacts.rekor`.
-* **Hybrid reachability attestations**: graph-level DSSE (mandatory) plus optional edge-bundle DSSEs for runtime/init/contested edges. See [`docs/reachability/hybrid-attestation.md`](../../reachability/hybrid-attestation.md) for verification runbooks and Rekor guidance.
+* **Hybrid reachability attestations**: graph-level DSSE (mandatory) plus optional edge-bundle DSSEs for runtime/init/contested edges. See [`docs/modules/reach-graph/guides/hybrid-attestation.md`](../reach-graph/guides/hybrid-attestation.md) for verification runbooks and Rekor guidance.
 * Operator enablement runbooks (toggles, env-var map, rollout guidance) live in [`operations/dsse-rekor-operator-guide.md`](operations/dsse-rekor-operator-guide.md) per SCANNER-ENG-0015.

 ---
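
The symbol-hint rule in `docs/modules/scanner/architecture.md` (serializers clamp confidence to [0,1] and uppercase `source` to one of `DWARF|PDB|SYM|NONE`) can be sketched roughly as below. This is a minimal illustration, not the Scanner's serializer: the `SymbolHint` type and `normalize_symbol_hint` function are hypothetical names, and collapsing an unrecognised source to `NONE` is an assumption the document does not state.

```python
from dataclasses import dataclass

# Source tags listed in the architecture note.
ALLOWED_SOURCES = {"DWARF", "PDB", "SYM", "NONE"}


@dataclass(frozen=True)
class SymbolHint:
    """Hypothetical container for the symbol {mangled,demangled,source,confidence} fields."""
    mangled: str
    demangled: str
    source: str
    confidence: float


def normalize_symbol_hint(hint: SymbolHint) -> SymbolHint:
    """Return a copy with `source` uppercased and `confidence` clamped to [0, 1]."""
    source = hint.source.strip().upper()
    if source not in ALLOWED_SOURCES:
        # Assumption: unknown tags fall back to NONE; the real serializer may reject them.
        source = "NONE"
    confidence = min(1.0, max(0.0, hint.confidence))
    return SymbolHint(hint.mangled, hint.demangled, source, confidence)


if __name__ == "__main__":
    raw = SymbolHint("_Z3foov", "foo()", "dwarf", 1.7)
    print(normalize_symbol_hint(raw))
```

Normalising at serialization time keeps two runs over the same input byte-identical even if an analyzer reports `dwarf` in one run and `DWARF` in another.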
--- a/docs/modules/scanner/deterministic-execution.md
+++ b/docs/modules/scanner/deterministic-execution.md
@@ -36,6 +36,6 @@ This note collects the invariants required for reproducible Scanner runs and rep
 - Rekor lookups skipped; rely on bundled proofs.

 ## References
- `docs/replay/DETERMINISTIC_REPLAY.md`
- `docs/replay/TEST_STRATEGY.md`
+- `docs/modules/replay/guides/DETERMINISTIC_REPLAY.md`
+- `docs/modules/replay/guides/TEST_STRATEGY.md`
 - `docs/modules/scanner/determinism-score.md`
--- a/docs/modules/scanner/deterministic-sbom-compose.md
+++ b/docs/modules/scanner/deterministic-sbom-compose.md
@@ -36,7 +36,7 @@ Guarantee that every container scan yields **provably deterministic** SBOM artif

 - **CLI (`stella sbomer ...`)**: adds `layer` and `compose` verbs, deterministic diff reporting, and offline verification per `_composition.json`.
 - **UI/Policy**: determinism badge, drift diffs, and a policy gate that blocks releases when fragment DSSE/verifications fail.
- **Docs**: new guides under `docs/scanner` & `docs/cli` plus policy references detailing how to interpret determinism metadata.
+- **Docs**: new guides under `docs/modules/scanner` & `docs/modules/cli/guides` plus policy references detailing how to interpret determinism metadata.
 - **Crypto**: PQ-friendly DSSE toggle delivered via `SCANNER-CRYPTO-90-002/003` so sovereign bundles can select Dilithium/Falcon.

 ## 3. Verification Flow (offline kit)
@@ -72,7 +72,7 @@ Guarantee that every container scan yields **provably deterministic** SBOM artif

 ## 5. Operational workflow (worker → CLI/UI/Policy)
 - **Worker**: emit fragment DSSE + `_composition.json` into the surface manifest; persist `stellaops:composition.manifest` and `stellaops:merkle.root` properties on composed BOMs so downstream consumers do not recompute merges.
- **CLI**: verify bundles offline with `stella sbomer compose --recipe docs/modules/scanner/fixtures/deterministic-compose/_composition.json --fragments-dir docs/modules/scanner/fixtures/deterministic-compose --verify` (see `docs/cli/sbomer.md`). The command should fail if any DSSE signature, Merkle root, or BOM hash diverges.
+- **CLI**: verify bundles offline with `stella sbomer compose --recipe docs/modules/scanner/fixtures/deterministic-compose/_composition.json --fragments-dir docs/modules/scanner/fixtures/deterministic-compose --verify` (see `docs/modules/cli/guides/commands/sbomer.md`). The command should fail if any DSSE signature, Merkle root, or BOM hash diverges.
 - **UI / Policy**: render determinism badge using `stellaops:merkle.root`; block promotion when `_composition.json` is missing or hashes disagree; expose drift diagnostics by recomputing composition locally and comparing to BOM properties.
 - **Export/Offline**: include `_composition.json`, fragment DSSEs, `bom.cdx.json`, and `hashes.txt` when building Offline Kit bundles so replay jobs can validate without network.

--- a/docs/modules/scanner/guides/SCANNER_RUNTIME_READINESS.md
+++ b/docs/modules/scanner/guides/SCANNER_RUNTIME_READINESS.md
@@ -0,0 +1,81 @@
+# Scanner Runtime Readiness Checklist
+
+Last updated: 2025-10-19
+
+This runbook confirms that Scanner.WebService now surfaces the metadata Runtime Guild consumers requested: quieted finding counts in the signed report events and progress hints on the scan event stream. Follow the checklist before relying on these fields in production automation.
+
+---
+
+## 1. Prerequisites
+
+- Scanner.WebService release includes **SCANNER-POLICY-09-107** (adds quieted provenance and score inputs to `/reports`).  
+- Docs repository at commit containing `docs/events/scanner.report.ready@1.json` with `quietedFindingCount`.  
+- Access to a Scanner environment (staging or sandbox) with an image capable of producing policy verdicts.
+
+---
+
+## 2. Verify quieted finding hints
+
+1. **Trigger a report** – run a scan that produces at least one quieted finding (policy with `quiet: true`). After the scan completes, call:
+   ```http
+   POST /api/v1/reports
+   Authorization: Bearer <token>
+   Content-Type: application/json
+   ```
+   Ensure the JSON response contains `report.summary.quieted` and that the DSSE payload mirrors the same count.
+2. **Check emitted event** – pull the latest `scanner.report.ready` event (from the queue or sample capture). Confirm the payload includes:
+   - `quietedFindingCount` equal to the `summary.quieted` value.
+   - Updated `summary` block with the quieted counter.
+3. **Schema validation** – optionally validate the payload against `docs/events/scanner.report.ready@1.json` to guarantee downstream compatibility:
+   ```bash
+   npx ajv validate -c ajv-formats \
+     -s docs/events/scanner.report.ready@1.json \
+     -d <payload.json>
+   ```
+   (Use `npm install --no-save ajv ajv-cli ajv-formats` once per clone.)
+
+> Snapshot fixtures: see `docs/events/samples/scanner.event.report.ready@1.sample.json` for a canonical orchestrator event that already carries `quietedFindingCount`.
+
+---
+
+## 3. Verify progress hints (SSE / JSONL)
+
+Scanner streams structured progress messages for each scan. The `data` map inside every frame carries the hints Runtime systems consume (force flag, client metadata, additional stage-specific attributes).
+
+1. **Submit a scan** with custom metadata (for example `pipeline=github`, `build=1234`).
+2. **Stream events**:
+   ```http
+   GET /api/v1/scans/{scanId}/events?format=jsonl
+   Authorization: Bearer <token>
+   Accept: application/x-ndjson
+   ```
+3. **Confirm payload** – each frame should resemble:
+   ```json
+   {
+     "scanId": "2f6c17f9b3f548e2a28b9c412f4d63f8",
+     "sequence": 1,
+     "state": "Pending",
+     "message": "queued",
+     "timestamp": "2025-10-19T03:12:45.118Z",
+     "correlationId": "2f6c17f9b3f548e2a28b9c412f4d63f8:0001",
+     "data": {
+       "force": false,
+       "meta.pipeline": "github"
+     }
+   }
+   ```
+   Subsequent frames include additional hints as analyzers progress (e.g., `stage`, `meta.*`, or analyzer-provided keys). Ensure newline-delimited JSON consumers preserve the `data` dictionary when forwarding to runtime dashboards.
+
+> The same frame structure is documented in `docs/API_CLI_REFERENCE.md` §2.6. Copy that snippet into integration tests to keep compatibility.
+
+---
+
+## 4. Sign-off matrix
+
+| Stakeholder | Checklist | Status | Notes |
+|-------------|-----------|--------|-------|
+| Runtime Guild | Sections 2 & 3 completed | ☐ | Capture sample payloads for webhook regression tests. |
+| Notify Guild | `quietedFindingCount` consumed in notifications | ☐ | Update templates after Runtime sign-off. |
+| Docs Guild | Checklist published & linked from updates | ☑ | 2025-10-19 |
+
+Mark the stakeholder boxes as each team completes its validation. Once all checks are green, update `docs/TASKS.md` to reflect task completion.
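
The checks in sections 2 and 3 of the readiness checklist could be automated along the following lines. This is a sketch under assumptions: the function names are hypothetical, the report is taken to be the parsed JSON response carrying `report.summary.quieted`, and the event payload is taken to be a plain dictionary with `quietedFindingCount` at its top level. Adjust the lookups if the real envelope nests these fields differently.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def quieted_counts_match(report: Dict[str, Any], event_payload: Dict[str, Any]) -> bool:
    """True when the event's quietedFindingCount equals report.summary.quieted."""
    quieted = report.get("report", {}).get("summary", {}).get("quieted")
    return quieted is not None and quieted == event_payload.get("quietedFindingCount")


def iter_progress_frames(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Parse newline-delimited JSON frames and keep each frame's `data` map intact."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        frame = json.loads(line)
        if not isinstance(frame.get("data"), dict):
            # The checklist says every frame carries a `data` map; treat its absence as an error.
            raise ValueError(f"frame {frame.get('sequence')} has no data map")
        yield frame


if __name__ == "__main__":
    report = {"report": {"summary": {"quieted": 2}}}
    print(quieted_counts_match(report, {"quietedFindingCount": 2}))

    stream = [
        '{"scanId": "2f6c17f9b3f548e2a28b9c412f4d63f8", "sequence": 1, '
        '"state": "Pending", "message": "queued", '
        '"data": {"force": false, "meta.pipeline": "github"}}'
    ]
    for frame in iter_progress_frames(stream):
        print(frame["sequence"], frame["data"])
```

Yielding the whole frame, rather than a selected subset of keys, is one way to make sure analyzer-provided hints in `data` survive when frames are forwarded to runtime dashboards.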
--- a/docs/modules/scanner/operations/analyzers.md
+++ b/docs/modules/scanner/operations/analyzers.md
@@ -41,8 +41,8 @@ Keep the language analyzer microbench under the < 5 s SBOM pledge. CI emits
 - Pager payload should include `scenario`, `max_ms`, `baseline_max_ms`, and `commit`.
 - Immediate triage steps:
  1. Check `latest.json` artefact for the failing scenario – confirm commit and environment.
-  2. Re-run the harness with `--captured-at` and `--baseline` pointing at the last known good CSV to verify determinism; include `surface/determinism.json` in the release bundle (see `release-determinism.md`).
+  2. Re-run the harness with `--captured-at` and `--baseline` pointing at the last known good CSV to verify determinism; include `surface/determinism.json` in the release bundle (see `release-determinism.md`).
  3. If regression persists, open an incident ticket tagged `scanner-analyzer-perf` and page the owning language guild.
  4. Roll back the offending change or update the baseline after sign-off from the guild lead and Perf captain.

-Document the outcome in `docs/12_PERFORMANCE_WORKBOOK.md` (section 8) so trendlines reflect any accepted regressions.
+Document the outcome in `docs/PERFORMANCE_WORKBOOK.md` (section 8) so trendlines reflect any accepted regressions.
--- a/docs/modules/scanner/samples/node-phase22/node-phase22-sample.ndjson
+++ b/docs/modules/scanner/samples/node-phase22/node-phase22-sample.ndjson
@@ -0,0 +1,7 @@
+{"type":"entrypoint","path":"/app/dist/main.js","format":"esm","reason":"bundle-entrypoint","confidence":0.88,"resolverTrace":["bundle:/app/dist/main.js","map:/app/dist/main.js.map","source:/src/app.js"]}
+{"type":"component","componentType":"pkg","path":"/src/app.js","format":"esm","fromBundle":true,"reason":"source-map","confidence":0.87,"resolverTrace":["bundle:/app/dist/main.js","map:/app/dist/main.js.map","source:/src/app.js"]}
+{"type":"component","componentType":"native","path":"/app/native/addon.node","arch":"x86_64","platform":"linux","reason":"native-addon-file","confidence":0.82,"resolverTrace":["file:/app/native/addon.node","require:/app/dist/native-entry.js"]}
+{"type":"component","componentType":"wasm","path":"/app/pkg/pkg.wasm","exports":["init","run"],"reason":"wasm-file","confidence":0.8,"resolverTrace":["file:/app/pkg/pkg.wasm","import:/app/dist/wasm-entry.js"]}
+{"type":"edge","edgeType":"native-addon","from":"/src/app.js","to":"/app/native/addon.node","reason":"native-dlopen-string","confidence":0.76,"resolverTrace":["source:/src/app.js","call:process.dlopen('./native/addon.node')"]}
+{"type":"edge","edgeType":"wasm","from":"/src/app.js","to":"/app/pkg/pkg.wasm","reason":"wasm-import","confidence":0.74,"resolverTrace":["source:/src/app.js","call:WebAssembly.instantiateStreaming(fetch('./pkg.wasm'))"]}
+{"type":"edge","edgeType":"capability","from":"/src/app.js","to":"child_process.execFile","reason":"capability-child-process","confidence":0.7,"resolverTrace":["source:/src/app.js","call:child_process.execFile"]}
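
A consumer of `node-phase22-sample.ndjson` might read the records roughly as follows. The sketch assumes only what the sample shows (one JSON object per line, each with a `type`, and a numeric `confidence` on edges); the helper names and the 0.75 threshold are illustrative, not part of the Scanner. Two edge records from the sample are inlined so the block runs on its own.

```python
import json
from collections import Counter
from typing import Any, Dict, List

# Two edge records copied from the sample file above.
SAMPLE = """\
{"type":"edge","edgeType":"native-addon","from":"/src/app.js","to":"/app/native/addon.node","reason":"native-dlopen-string","confidence":0.76,"resolverTrace":["source:/src/app.js","call:process.dlopen('./native/addon.node')"]}
{"type":"edge","edgeType":"capability","from":"/src/app.js","to":"child_process.execFile","reason":"capability-child-process","confidence":0.7,"resolverTrace":["source:/src/app.js","call:child_process.execFile"]}
"""


def load_records(ndjson_text: str) -> List[Dict[str, Any]]:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]


def count_by_type(records: List[Dict[str, Any]]) -> Counter:
    """Tally records by their `type` field (entrypoint, component, edge)."""
    return Counter(record["type"] for record in records)


def edges_at_or_above(records: List[Dict[str, Any]], threshold: float) -> List[Dict[str, Any]]:
    """Keep edge records whose confidence meets the (illustrative) threshold."""
    return [r for r in records if r["type"] == "edge" and r["confidence"] >= threshold]


if __name__ == "__main__":
    records = load_records(SAMPLE)
    print(count_by_type(records))
    for edge in edges_at_or_above(records, 0.75):
        print(edge["edgeType"], edge["from"], "->", edge["to"])
```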