diff --git a/docs/product-advisories/01-Dec-2025 - Benchmarks for a Testable Security Moat.md b/docs/product-advisories/01-Dec-2025 - Benchmarks for a Testable Security Moat.md new file mode 100644 index 000000000..232ecad78 --- /dev/null +++ b/docs/product-advisories/01-Dec-2025 - Benchmarks for a Testable Security Moat.md @@ -0,0 +1,446 @@ +Here’s a crisp, practical way to turn Stella Ops’ “verifiable proof spine” into a moat—and how to measure it. + +# Why this matters (in plain terms) + +Security tools often say “trust me.” You’ll say “prove it”—every finding and every “not‑affected” claim ships with cryptographic receipts anyone can verify. + +--- + +# Differentiators to build in + +**1) Bind every verdict to a graph hash** + +* Compute a stable **Graph Revision ID** (Merkle root) over: SBOM nodes, edges, policies, feeds, scan params, and tool versions. +* Store the ID on each finding/VEX item; show it in the UI and APIs. +* Rule: any data change → new graph hash → new revisioned verdicts. + +**2) Attach machine‑verifiable receipts (in‑toto/DSSE)** + +* For each verdict, emit a **DSSE‑wrapped in‑toto statement**: + + * predicateType: `stellaops.dev/verdict@v1` + * includes: graphRevisionId, artifact digests, rule id/version, inputs (CPE/CVE/CVSS), timestamps. +* Sign with your **Authority** (Sigstore key, offline mode supported). +* Keep receipts queryable and exportable; mirror to Rekor‑compatible ledger when online. + +**3) Add reachability “call‑stack slices” or binary‑symbol proofs** + +* For code‑level reachability, store compact slices: entry → sink, with symbol names + file:line. +* For binary-only targets, include **symbol presence proofs** (e.g., Bloom filters + offsets) with executable digest. +* Compress and embed a hash of the slice/proof inside the DSSE payload. + +**4) Deterministic replay manifests** + +* Alongside receipts, publish a **Replay Manifest** (inputs, feeds, rule versions, container digests) so any auditor can reproduce the same graph hash and verdicts offline. + +--- + +# Benchmarks to publish (make them your headline KPIs) + +**A) False‑positive reduction vs. baseline scanners (%)** + +* Method: run a public corpus (e.g., sample images + app stacks) across 3–4 popular scanners; label ground truth once; compare FP rate. +* Report: mean & p95 FP reduction. + +**B) Proof coverage (% of findings with signed evidence)** + +* Definition: `(# findings or VEX items carrying valid DSSE receipts) / (total surfaced items)`. +* Break out: runtime‑reachable vs. unreachable, and “not‑affected” claims. + +**C) Triage time saved (p50/p95)** + +* Measure analyst minutes from “alert created” → “final disposition.” +* A/B with receipts hidden vs. visible; publish median/p95 deltas. + +**D) Determinism stability** + +* Re-run identical scans N times / across nodes; publish `% identical graph hashes` and drift causes when different. + +--- + +# Minimal implementation plan (week‑by‑week) + +**Week 1: primitives** + +* Add Graph Revision ID generator in `scanner.webservice` (Merkle over normalized JSON of SBOM+edges+policies+toolVersions). +* Define `VerdictReceipt` schema (protobuf/JSON) and DSSE envelope types. + +**Week 2: signing + storage** + +* Wire DSSE signing in **Authority**; offline key support + rotation. +* Persist receipts in `Receipts` table (Postgres) keyed by `(graphRevisionId, verdictId)`; enable export (JSONL) and ledger mirror. 
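To make the Week 1 and Week 2 deliverables concrete, here is a minimal C# sketch of the receipt contracts; every type and field name below is illustrative, not a committed schema:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract types for the Week 1 deliverables; field names are
// illustrative and would live in a *.Contracts project.
public sealed record VerdictReceipt(
    string GraphRevisionId,        // Merkle root over SBOM + edges + policies + tool versions
    string VerdictId,
    string ArtifactDigest,         // e.g., "sha256:..."
    string RuleId,
    string RuleVersion,
    IReadOnlyList<string> Inputs,  // CPE/CVE/CVSS identifiers that fed the decision
    DateTimeOffset CreatedAtUtc);

// Standard DSSE envelope shape: base64 payload plus one or more signatures.
public sealed record DsseEnvelope(
    string Payload,                // base64(JSON-serialized in-toto statement)
    string PayloadType,            // "application/vnd.in-toto+json"
    IReadOnlyList<DsseSignature> Signatures);

public sealed record DsseSignature(string KeyId, string Sig);
```

The `(graphRevisionId, verdictId)` pair in the Week 2 `Receipts` table maps directly onto the first two fields of `VerdictReceipt`.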
+ +**Week 3: reachability proofs** + +* Add call‑stack slice capture in reachability engine; serialize compactly; hash + reference from receipts. +* Binary symbol proof module for ELF/PE: symbol bitmap + digest. + +**Week 4: replay + UX** + +* Emit `replay.manifest.json` per scan (inputs, tool digests). +* UI: show **“Verified”** badge, graph hash, signature issuer, and a one‑click “Copy receipt” button. +* API: `GET /verdicts/{id}/receipt`, `GET /graphs/{rev}/replay`. + +**Week 5: benchmarks harness** + +* Create `bench/` with golden fixtures and a runner: + + * Baseline scanner adapters + * Ground‑truth labels + * Metrics export (FP%, proof coverage, triage time capture hooks) + +--- + +# Developer guardrails (make these non‑negotiable) + +* **No receipt, no ship:** any surfaced verdict must carry a DSSE receipt. +* **Schema freeze windows:** changes to rule inputs or policy logic must bump rule version and therefore the graph hash. +* **Replay‑first CI:** PRs touching scanning/rules must pass a replay test that reproduces prior graph hashes on gold fixtures. +* **Clock safety:** use monotonic time inside receipts; add UTC wall‑time separately. + +--- + +# What to show buyers/auditors + +* A short **audit kit**: sample container + your receipts + replay manifest + one command to reproduce the same graph hash. +* A one‑page **benchmark readout**: FP reduction, proof coverage, and triage time saved (p50/p95), with corpus description. + +--- + +If you want, I’ll draft: + +1. the DSSE `predicate` schema, +2. the Postgres DDL for `Receipts` and `Graphs`, and +3. a tiny .NET verification CLI (`stellaops-verify`) that replays a manifest and validates signatures. +Here’s a focused “developer guidelines” doc just for **Benchmarks for a Testable Security Moat** in Stella Ops. + +--- + +# Stella Ops Developer Guidelines + +## Benchmarks for a Testable Security Moat + +> **Goal:** Benchmarks are how we *prove* Stella Ops is better, not just say it is. If a “moat” claim can’t be tied to a benchmark, it doesn’t exist. + +Everything here is about how you, as a developer, design, extend, and run those benchmarks. + +--- + +## 1. What our benchmarks must measure + +Every core product claim needs at least one benchmark: + +1. **Detection quality** + + * Precision / recall vs ground truth. + * False positives vs popular scanners. + * False negatives on known‑bad samples. + +2. **Proof & evidence quality** + + * % of findings with **valid receipts** (DSSE). + * % of VEX “not‑affected” with attached proofs. + * Reachability proof quality: + + * call‑stack slice present? + * symbol proof present for binaries? + +3. **Triage & workflow impact** + + * Time‑to‑decision for analysts (p50/p95). + * Click depth and context switches per decision. + * “Verified” vs “unverified” verdict triage times. + +4. **Determinism & reproducibility** + + * Same inputs → same **Graph Revision ID**. + * Stable verdict sets across runs/nodes. + +> **Rule:** If you add a feature that impacts any of these, you must either hook it into an existing benchmark or add a new one. + +--- + +## 2. Benchmark assets and layout + +**2.1 Repo layout (convention)** + +Under `bench/` we maintain everything benchmark‑related: + +* `bench/corpus/` + + * `images/` – curated container images / tarballs. + * `repos/` – sample codebases (with known vulns). + * `sboms/` – canned SBOMs for edge cases. +* `bench/scenarios/` + + * `*.yaml` – scenario definitions (inputs + expected outputs). 
+* `bench/golden/` + + * `*.json` – golden results (expected findings, metrics). +* `bench/tools/` + + * adapters for baseline scanners, parsers, helpers. +* `bench/scripts/` + + * `run_benchmarks.[sh/cs]` – single entrypoint. + +**2.2 Scenario definition (high‑level)** + +Each scenario yaml should minimally specify: + +* **Inputs** + + * artifact references (image name / path / repo SHA / SBOM file). + * environment knobs (features enabled/disabled). +* **Ground truth** + + * list of expected vulns (or explicit “none”). + * for some: expected reachability (reachable/unreachable). + * expected VEX entries (affected / not affected). +* **Expectations** + + * required metrics (e.g., “no more than 2 FPs”, “no FNs”). + * required proof coverage (e.g., “100% of surfaced findings have receipts”). + +--- + +## 3. Core benchmark metrics (developer‑facing definitions) + +Use these consistently across code and docs. + +### 3.1 Detection metrics + +* `true_positive_count` (TP) +* `false_positive_count` (FP) +* `false_negative_count` (FN) + +Derived: + +* `precision = TP / (TP + FP)` +* `recall = TP / (TP + FN)` +* For UX: track **FP per asset** and **FP per 100 findings**. + +**Developer guideline:** + +* When you introduce a filter, deduper, or rule tweak, add/modify a scenario where: + + * the change **helps** (reduces FP or FN); and + * a different scenario guards against regressions. + +### 3.2 Moat‑specific metrics + +These are the ones that directly support the “testable moat” story: + +1. **False‑positive reduction vs baseline scanners** + + * Run baseline scanners across our corpus (via adapters in `bench/tools`). + * Compute: + + * `baseline_fp_rate` + * `stella_fp_rate` + * `fp_reduction = (baseline_fp_rate - stella_fp_rate) / baseline_fp_rate`. + +2. **Proof coverage** + + * `proof_coverage_all = findings_with_valid_receipts / total_findings` + * `proof_coverage_vex = vex_items_with_valid_receipts / total_vex_items` + * `proof_coverage_reachable = reachable_findings_with_proofs / total_reachable_findings` + +3. **Triage time improvement** + + * In test harnesses, simulate or record: + + * `time_to_triage_with_receipts` + * `time_to_triage_without_receipts` + * Compute median & p95 deltas. + +4. **Determinism** + + * Re‑run the same scenario `N` times: + + * `% runs with identical Graph Revision ID` + * `% runs with identical verdict sets` + * On mismatch, diff and log cause (e.g., non‑stable sort, non‑pinned feed). + +--- + +## 4. How developers should work with benchmarks + +### 4.1 “No feature without benchmarks” + +If you’re adding or changing: + +* graph structure, +* rule logic, +* scanner integration, +* VEX handling, +* proof / receipt generation, + +you **must** do *at least one* of: + +1. **Extend an existing scenario** + + * Add expectations that cover your change, or + * tighten an existing bound (e.g., lower FP threshold). + +2. **Add a new scenario** + + * For new attack classes / edge cases / ecosystems. + +**Anti‑patterns:** + +* Shipping a new capability with *no* corresponding scenario. +* Updating golden outputs without explaining why metrics changed. + +### 4.2 CI gates + +We treat benchmarks as **blocking**: + +* Add a CI job, e.g.: + + * `make bench:quick` on every PR (small subset). + * `make bench:full` on main / nightly. +* CI fails if: + + * Any scenario marked `strict: true` has: + + * Precision or recall below its threshold. + * Proof coverage below its configured threshold. + * Global regressions above tolerance: + + * e.g. 
total FP increases > X% without an explicit override. + +**Developer rule:** + +* If you intentionally change behavior: + + * Update the relevant golden files. + * Include a short note in the PR (e.g., `bench-notes.md` snippet) describing: + + * what changed, + * why the new result is better, and + * which moat metric it improves (FP, proof coverage, determinism, etc.). + +--- + +## 5. Benchmark implementation guidelines + +### 5.1 Make benchmarks deterministic + +* **Pin everything**: + + * feed snapshots, + * tool container digests, + * rule versions, + * time windows. +* Use **Replay Manifests** as the source of truth: + + * `replay.manifest.json` should contain: + + * input artifacts, + * tool versions, + * feed versions, + * configuration flags. +* If a benchmark depends on time: + + * Inject a **fake clock** or explicit “as of” timestamp. + +### 5.2 Keep scenarios small but meaningful + +* Prefer many **focused** scenarios over a few huge ones. +* Each scenario should clearly answer: + + * “What property of Stella Ops are we testing?” + * “What moat claim does this support?” + +Examples: + +* `bench/scenarios/false_pos_kubernetes.yaml` + + * Focus: config noise reduction vs baseline scanner. +* `bench/scenarios/reachability_java_webapp.yaml` + + * Focus: reachable vs unreachable vuln proofs. +* `bench/scenarios/vex_not_affected_openssl.yaml` + + * Focus: VEX correctness and proof coverage. + +### 5.3 Use golden outputs, not ad‑hoc assertions + +* Bench harness should: + + * Run Stella Ops on scenario inputs. + * Normalize outputs (sorted lists, stable IDs). + * Compare to `bench/golden/.json`. +* Golden file should include: + + * expected findings (id, severity, reachable?, etc.), + * expected VEX entries, + * expected metrics (precision, recall, coverage). + +--- + +## 6. Moat‑critical benchmark types (we must have all of these) + +When you’re thinking about gaps, check that we have: + +1. **Cross‑tool comparison** + + * Same corpus, multiple scanners. + * Metrics vs baselines for FP/FN. + +2. **Proof density & quality** + + * Corpus where: + + * some vulns are reachable, + * some are not, + * some are not present. + * Ensure: + + * reachable ones have rich proofs (stack slices / symbol proofs). + * non‑reachable or absent ones have: + + * correct disposition, and + * clear receipts explaining why. + +3. **VEX accuracy** + + * Scenarios with known SBOM + known vulnerability impact. + * Check: + + * VEX “affected”/“not‑affected” matches ground truth. + * every VEX entry has a receipt. + +4. **Analyst workflow** + + * Small usability corpus for internal testing: + + * Measure time‑to‑triage with/without receipts. + * Use the same scenarios across releases to track improvement. + +5. **Upgrade / drift resistance** + + * Scenarios that are **expected to remain stable** across: + + * rule changes that *shouldn’t* affect outcomes. + * feed updates (within a given version window). + * These act as canaries for unintended regressions. + +--- + +## 7. Developer checklist (TL;DR) + +Before merging a change that touches security logic, ask yourself: + +1. **Is there at least one benchmark scenario that exercises this change?** +2. **Does the change improve at least one moat metric, or is it neutral?** +3. **Have I run `make bench:quick` locally and checked diffs?** +4. **If goldens changed, did I explain why in the PR?** +5. **Did I keep benchmarks deterministic (pinned versions, fake time, etc.)?** + +If any answer is “no”, fix that before merging. 
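As a reference for the definitions in section 3, here is a minimal sketch of the metric arithmetic; the record and method names are illustrative only:

```csharp
// Illustrative-only helpers encoding the metric definitions from section 3.
public sealed record DetectionCounts(int TruePositives, int FalsePositives, int FalseNegatives)
{
    // Note: with a zero denominator these evaluate to NaN, which callers should treat as "no data".
    public double Precision => (double)TruePositives / (TruePositives + FalsePositives);
    public double Recall    => (double)TruePositives / (TruePositives + FalseNegatives);
}

public static class MoatMetrics
{
    // fp_reduction = (baseline_fp_rate - stella_fp_rate) / baseline_fp_rate
    public static double FpReduction(double baselineFpRate, double stellaFpRate) =>
        (baselineFpRate - stellaFpRate) / baselineFpRate;

    // proof_coverage = findings_with_valid_receipts / total_findings
    public static double ProofCoverage(int withValidReceipts, int totalFindings) =>
        totalFindings == 0 ? 1.0 : (double)withValidReceipts / totalFindings;
}
```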
+ +--- + +If you’d like, next step I can sketch a concrete `bench/scenarios/*.yaml` and matching `bench/golden/*.json` example that encodes one *specific* moat claim (e.g., “30% fewer FPs than Scanner X on Kubernetes configs”) so your team has a ready‑to-copy pattern. diff --git a/docs/product-advisories/01-Dec-2025 - Common Developers guides.md b/docs/product-advisories/01-Dec-2025 - Common Developers guides.md new file mode 100644 index 000000000..31c1097e5 --- /dev/null +++ b/docs/product-advisories/01-Dec-2025 - Common Developers guides.md @@ -0,0 +1,287 @@ +Here’s a condensed **“Stella Ops Developer Guidelines”** based on the official engineering docs and dev guides. + +--- + +## 0. Where to start + +* **Dev docs index:** The main entrypoint is `Development Guides & Tooling` (docs/technical/development/README.md). It links to coding standards, test strategy, performance workbook, plug‑in SDK, examples, and more. ([Gitea: Git with a cup of tea][1]) +* **If a term is unfamiliar:** Check the one‑page *Glossary of Terms* first. ([Stella Ops][2]) +* **Big picture:** Stella Ops is an SBOM‑first, offline‑ready container security platform; a lot of design decisions (determinism, signatures, policy DSL, SBOM delta scans) flow from that. ([Stella Ops][3]) + +--- + +## 1. Core engineering principles + +From **Coding Standards & Contributor Guide**: ([Gitea: Git with a cup of tea][4]) + +1. **SOLID first** – especially interface & dependency inversion. +2. **100‑line file rule** – if a file grows >100 physical lines, split or refactor. +3. **Contracts vs runtime** – public DTOs and interfaces live in lightweight `*.Contracts` projects; implementations live in sibling runtime projects. +4. **Single composition root** – DI wiring happens in `StellaOps.Web/Program.cs` and each plug‑in’s `IoCConfigurator`. Nothing else creates a service provider. +5. **No service locator** – constructor injection only; no global `ServiceProvider` or static service lookups. +6. **Fail‑fast startup** – validate configuration *before* the web host starts listening. +7. **Hot‑load compatibility** – avoid static singletons that would survive plug‑in unload; don’t manually load assemblies outside the built‑in loader. + +These all serve the product goals of **deterministic, offline, explainable security decisions**. ([Stella Ops][3]) + +--- + +## 2. Repository layout & layering + +From the repo layout section: ([Gitea: Git with a cup of tea][4]) + +* **Top‑level structure (simplified):** + + ```text + src/ + backend/ + StellaOps.Web/ # ASP.NET host + composition root + StellaOps.Common/ # logging, helpers + StellaOps.Contracts/ # DTO + interface contracts + … more runtime projects + plugins-sdk/ # plug‑in templates & abstractions + frontend/ # Angular workspace + tests/ # mirrors src 1‑to‑1 + ``` + +* **Rules:** + + * No “Module” folders or nested solution hierarchies. + * Tests mirror `src/` structure 1:1; **no test code in production projects**. + * New features follow *feature folder* layout (e.g., `Scan/ScanService.cs`, `Scan/ScanController.cs`). + +--- + +## 3. Naming, style & language usage + +Key conventions: ([Gitea: Git with a cup of tea][4]) + +* **Namespaces:** file‑scoped, `StellaOps.*`. +* **Interfaces:** `I` prefix (`IScannerRunner`). +* **Classes/records:** PascalCase (`ScanRequest`, `TrivyRunner`). +* **Private fields:** `camelCase` (no leading `_`). +* **Constants:** `SCREAMING_SNAKE_CASE`. +* **Async methods:** end with `Async`. +* **Usings:** outside namespace, sorted, no wildcard imports. 
+* **File length:** keep ≤100 lines including `using` and braces (enforced by tooling). + +C# feature usage: ([Gitea: Git with a cup of tea][4]) + +* Nullable reference types **on**. +* Use `record` for immutable DTOs. +* Prefer pattern matching over long `switch` cascades. +* `Span`/`Memory` only when you’ve measured that you need them. +* Use `await foreach` instead of manual iterator loops. + +Formatting & analysis: + +* `dotnet format` must be clean; StyleCop + security analyzers + CodeQL run in CI and are treated as gates. ([Gitea: Git with a cup of tea][4]) + +--- + +## 4. Dependency injection, async & concurrency + +DI policy (core + plug‑ins): ([Gitea: Git with a cup of tea][4]) + +* Exactly **one composition root** per process (`StellaOps.Web/Program.cs`). +* Plug‑ins contribute through: + + * `[ServiceBinding]` attributes for simple bindings, or + * An `IoCConfigurator : IDependencyInjectionRoutine` for advanced setups. +* Default lifetime is **scoped**. Use singletons only for truly stateless, thread‑safe helpers. +* Never use a service locator or manually build nested service providers except in tests. + +Async & threading: ([Gitea: Git with a cup of tea][4]) + +* All I/O is async; avoid `.Result` / `.Wait()`. +* Library code uses `ConfigureAwait(false)`. +* Control concurrency with channels or `Parallel.ForEachAsync`, not ad‑hoc `Task.Run` loops. + +--- + +## 5. Tests, tooling & quality gates + +The **Automated Test‑Suite Overview** spells out all CI layers and budgets. ([Gitea: Git with a cup of tea][5]) + +**Test layers (high‑level):** + +* Unit tests: xUnit. +* Property‑based tests: FsCheck. +* Integration: + + * API integration with Testcontainers. + * DB/merge flows using Mongo + Redis. +* Contracts: gRPC breakage checks with Buf. +* Frontend: + + * Unit tests with Jest. + * E2E tests with Playwright. + * Lighthouse runs for performance & accessibility. +* Non‑functional: + + * Load tests via k6. + * Chaos experiments (CPU/OOM) using Docker tooling. + * Dependency & license scanning. + * SBOM reproducibility/attestation checks. + +**Quality gates (examples):** ([Gitea: Git with a cup of tea][5]) + +* API unit test line coverage ≥ ~85%. +* API P95 latency ≤ ~120 ms in nightly runs. +* Δ‑SBOM warm scan P95 ≤ ~5 s on reference hardware. +* Lighthouse perf score ≥ ~90, a11y ≥ ~95. + +**Local workflows:** + +* Use `./scripts/dev-test.sh` for “fast” local runs and `--full` for the entire stack (API, UI, Playwright, Lighthouse, etc.). Needs Docker and modern Node. ([Gitea: Git with a cup of tea][5]) +* Some suites use Mongo2Go + an OpenSSL 1.1 shim; others use a helper script to spin up a local `mongod` for deeper debugging. ([Gitea: Git with a cup of tea][5]) + +--- + +## 6. Plug‑ins & connectors + +The **Plug‑in SDK Guide** is your bible for schedule jobs, scanner adapters, TLS providers, notification channels, etc. ([Gitea: Git with a cup of tea][6]) + +**Basics:** + +* Use `.NET` templates to scaffold: + + ```bash + dotnet new stellaops-plugin-schedule -n MyPlugin.Schedule --output src + ``` + +* At publish time, copy **signed** artefacts to: + + ```text + src/backend/Stella.Ops.Plugin.Binaries// + MyPlugin.dll + MyPlugin.dll.sig + ``` + +* The backend: + + * Verifies the Cosign signature. + * Enforces `[StellaPluginVersion]` compatibility. + * Loads plug‑ins in isolated `AssemblyLoadContext`s. + +**DI entrypoints:** + +* For simple cases, mark implementations with `[ServiceBinding(typeof(IMyContract), ServiceLifetime.Scoped, …)]`. 
+* For more control, implement `IoCConfigurator : IDependencyInjectionRoutine` and configure services/options in `Register(...)`. ([Gitea: Git with a cup of tea][6]) + +**Examples:** + +* **Schedule job:** implement `IJob.ExecuteAsync`, add `[StellaPluginVersion("X.Y.Z")]`, register cron with `services.AddCronJob("0 15 * * *")`. +* **Scanner adapter:** implement `IScannerRunner` and register via `services.AddScanner("alt")`; document Docker sidecars if needed. ([Gitea: Git with a cup of tea][6]) + +**Signing & deployment:** + +* Publish, sign with Cosign, optionally zip: + + ```bash + dotnet publish -c Release -p:PublishSingleFile=true -o out + cosign sign --key $COSIGN_KEY out/MyPlugin.Schedule.dll + ``` + +* Copy into the backend container (e.g., `/opt/plugins/`) and restart. + +* Unsigned DLLs are rejected when `StellaOps:Security:DisableUnsigned=false`. ([Gitea: Git with a cup of tea][6]) + +**Marketplace:** + +* Tag releases like `plugin-vX.Y.Z`, attach the signed ZIP, and submit metadata to the community plug‑in index so it shows up in the UI Marketplace. ([Gitea: Git with a cup of tea][6]) + +--- + +## 7. Policy DSL & security decisions + +For policy authors and tooling engineers, the **Stella Policy DSL (stella‑dsl@1)** doc is key. ([Stella Ops][7]) + +**Goals:** + +* Deterministic: same inputs → same findings on every machine. +* Declarative: no arbitrary loops, network calls, or clocks. +* Explainable: each decision carries rule, inputs, rationale. +* Offline‑friendly and reachability‑aware (SBOM + advisories + VEX + reachability). ([Stella Ops][7]) + +**Structure:** + +* One `policy` block per `.stella` file, with: + + * `metadata` (description, tags). + * `profile` blocks (severity, trust, reachability adjustments). + * `rule` blocks (`when` / `then` logic). + * Optional `settings`. ([Stella Ops][7]) + +**Context & built‑ins:** + +* Namespaces like `sbom`, `advisory`, `vex`, `env`, `telemetry`, `secret`, `profile.*`, etc. ([Stella Ops][7]) +* Helpers such as `normalize_cvss`, `risk_score`, `vex.any`, `vex.latest`, `sbom.any_component`, `exists`, `coalesce`, and secrets‑specific helpers. ([Stella Ops][7]) + +**Rules of thumb:** + +* Always include a clear `because` when you change `status` or `severity`. ([Stella Ops][7]) +* Avoid catch‑all suppressions (`when true` + `status := "suppressed"`); the linter will flag them. ([Stella Ops][7]) +* Use `stella policy lint/compile/simulate` in CI and locally; test in sealed (offline) mode to ensure no network dependencies. ([Stella Ops][7]) + +--- + +## 8. Commits, PRs & docs + +From the commit/PR checklist: ([Gitea: Git with a cup of tea][4]) + +Before opening a PR: + +1. Use **Conventional Commit** prefixes (`feat:`, `fix:`, `docs:`, etc.). +2. Run `dotnet format` and `dotnet test`; both must be green. +3. Keep new/changed files within the 100‑line guideline. +4. Update XML‑doc comments for any new public API. +5. If you add/change a public contract: + + * Update the relevant markdown docs. + * Update JSON schema / API descriptions as needed. +6. Ensure static analyzers and CI jobs relevant to your change are passing. + +For new test layers or jobs, also update the test‑suite overview and metrics docs so the CI configuration stays discoverable. ([Gitea: Git with a cup of tea][5]) + +--- + +## 9. Licensing & reciprocity + +Stella Ops ships under **AGPL‑3.0‑or‑later** with a strong reciprocity clause: ([Stella Ops][8]) + +* You may run, study, modify, and redistribute it, including as a hosted service. 
+* If you run a **modified** version for others over a network, you must make that exact source code available to those users. +* Official containers are signed and include SBOMs and attestations; verify them with Cosign as described on the license/security pages. ([Stella Ops][8]) + +When you build extensions: + +* Keep plug‑ins compatible with AGPL expectations around combined works. +* Don’t embed proprietary logic into the core without checking license implications. + +--- + +## 10. If you just want a “first contribution” recipe + +A practical path that follows the guidelines: + +1. Clone the main repo; skim **Coding Standards**, **Test Suite Overview**, and the **Dev Guides & Tooling** index. ([Gitea: Git with a cup of tea][4]) +2. Get `dotnet`, Docker, Node set up; run `./scripts/dev-test.sh` to make sure your environment is healthy. ([Gitea: Git with a cup of tea][5]) +3. Pick a small issue (docs, small refactor, or new test), make changes respecting: + + * 100‑line files, + * DI patterns, + * naming & style. +4. Add/adjust tests plus any affected docs or JSON schemas. +5. Run tests + formatting locally, push, and open a PR with a conventional title and a short “how I tested this” note. + +If you tell me what you’re planning to work on (plug‑in, policy pack, core feature, or UI), I can turn this into a very concrete checklist tailored to that slice of Stella Ops. + +[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/technical/development/README.md "git.stella-ops.org/README.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea" +[2]: https://stella-ops.org/docs/14_glossary_of_terms/?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops" +[3]: https://stella-ops.org/docs/05_SYSTEM_REQUIREMENTS_SPEC/?utm_source=chatgpt.com "system requirements specification - Stella Ops – Open • Sovereign ..." 
+[4]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/18_CODING_STANDARDS.md "git.stella-ops.org/18_CODING_STANDARDS.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea" +[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/19_TEST_SUITE_OVERVIEW.md "git.stella-ops.org/19_TEST_SUITE_OVERVIEW.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea" +[6]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/10_PLUGIN_SDK_GUIDE.md "git.stella-ops.org/10_PLUGIN_SDK_GUIDE.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea" +[7]: https://stella-ops.org/docs/policy/dsl/index.html "Stella Ops – Signed Reachability · Deterministic Replay · Sovereign Crypto" +[8]: https://stella-ops.org/license/?utm_source=chatgpt.com "AGPL‑3.0‑or‑later - Stella Ops" diff --git a/docs/product-advisories/01-Dec-2025 - DSSE‑Signed Offline Scanner Updates.md b/docs/product-advisories/01-Dec-2025 - DSSE‑Signed Offline Scanner Updates.md new file mode 100644 index 000000000..d557d311f --- /dev/null +++ b/docs/product-advisories/01-Dec-2025 - DSSE‑Signed Offline Scanner Updates.md @@ -0,0 +1,585 @@ +Here’s a tight, practical pattern to make your scanner’s vuln‑DB updates rock‑solid even when feeds hiccup: + +# Offline, verifiable update bundles (DSSE + Rekor v2) + +**Idea:** distribute DB updates as offline tarballs. Each tarball ships with: + +* a **DSSE‑signed** statement (e.g., in‑toto style) over the bundle hash +* a **Rekor v2 receipt** proving the signature/statement was logged +* a small **manifest.json** (version, created_at, content hashes) + +**Startup flow (happy path):** + +1. Load latest tarball from your local `updates/` cache. +2. Verify DSSE signature against your trusted public keys. +3. Verify Rekor v2 receipt (inclusion proof) matches the DSSE payload hash. +4. If both pass, unpack/activate; record the bundle’s **trust_id** (e.g., statement digest). +5. If anything fails, **keep using the last good bundle**. No service disruption. + +**Why this helps** + +* **Air‑gap friendly:** no live network needed at activation time. +* **Tamper‑evident:** DSSE + Rekor receipt proves provenance and transparency. +* **Operational stability:** feed outages become non‑events—scanner just keeps the last good state. + +--- + +## File layout inside each bundle + +``` +/bundle-2025-11-29/ + manifest.json # { version, created_at, entries[], sha256s } + payload.tar.zst # the actual DB/indices + payload.tar.zst.sha256 + statement.dsse.json # DSSE-wrapped statement over payload hash + rekor-receipt.json # Rekor v2 inclusion/verification material +``` + +--- + +## Acceptance/Activation rules + +* **Trust root:** pin one (or more) publisher public keys; rotate via separate, out‑of‑band process. +* **Monotonicity:** only activate if `manifest.version > current.version` (or if trust policy explicitly allows replay for rollback testing). +* **Atomic switch:** unpack to `db/staging/`, validate, then symlink‑flip to `db/active/`. +* **Quarantine on failure:** move bad bundles to `updates/quarantine/` with a reason code. 
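The verifier sketch in the next section delegates the flip to a `SymlinkUtil.AtomicSwap` helper. A minimal implementation might look like the following, assuming a Linux host (where `rename(2)` over an existing path is atomic), .NET 6+ symlink APIs, and that `target` (e.g., `db/active`) is a symlink into a versioned staging directory:

```csharp
using System.IO;

// Hypothetical helper matching the SymlinkUtil.AtomicSwap call in the verifier sketch.
static class SymlinkUtil
{
    public static void AtomicSwap(string source, string target)
    {
        var tmp = target + ".tmp";
        File.Delete(tmp);                        // drop any stale temp link from a crashed run
        File.CreateSymbolicLink(tmp, source);    // build the new link off to the side
        File.Move(tmp, target, overwrite: true); // rename(2) over the live link is atomic on POSIX
    }
}
```

Because the final rename is atomic, readers never observe a missing `db/active`; they see the old database until the instant the new one is live.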
---

## Minimal .NET 10 verifier sketch (C#)

```csharp
public sealed record BundlePaths(string Dir) {
    public string Manifest => Path.Combine(Dir, "manifest.json");
    public string Payload => Path.Combine(Dir, "payload.tar.zst");
    public string Dsse => Path.Combine(Dir, "statement.dsse.json");
    public string Receipt => Path.Combine(Dir, "rekor-receipt.json");
}

public static class BundleActivator {
    public static async Task<bool> ActivateBundleAsync(BundlePaths b, TrustConfig trust, string activeDir) {
        var manifest = await Manifest.LoadAsync(b.Manifest);
        if (!await Hashes.VerifyAsync(b.Payload, manifest.PayloadSha256)) return false;

        // 1) DSSE verify (publisher keys pinned in trust)
        var (okSig, dssePayloadDigest) = await Dsse.VerifyAsync(b.Dsse, trust.PublisherKeys);
        if (!okSig || dssePayloadDigest != manifest.PayloadSha256) return false;

        // 2) Rekor v2 receipt verify (inclusion + statement digest == dssePayloadDigest)
        if (!await RekorV2.VerifyReceiptAsync(b.Receipt, dssePayloadDigest, trust.RekorPub)) return false;

        // 3) Stage, validate, then atomically flip
        var staging = Path.Combine(activeDir, "..", "staging");
        DirUtil.Empty(staging);
        await TarZstd.ExtractAsync(b.Payload, staging);
        if (!await LocalDbSelfCheck.RunAsync(staging)) return false;

        SymlinkUtil.AtomicSwap(source: staging, target: activeDir);
        State.WriteLastGood(manifest.Version, dssePayloadDigest);
        return true;
    }
}
```

---

## Operational playbook

* **On boot & daily at HH:MM:** try `ActivateBundleAsync()` on the newest bundle; on failure, log and continue.
* **Telemetry (no PII):** reason codes (SIG_FAIL, RECEIPT_FAIL, HASH_MISMATCH, SELFTEST_FAIL), versions, last_good.
* **Keys & rotation:** keep `publisher.pub` and `rekor.pub` in a root‑owned, read‑only path; rotate via a separate signed “trust bundle”.
* **Defense‑in‑depth:** verify both the **payload hash** and each file’s hash listed in `manifest.entries[]`.
* **Rollback:** allow `--force-activate <version>` for emergency testing, but mark it as **non‑monotonic** in state.

---

## What to hand your release team

* A Make/CI target that:

  1. Builds `payload.tar.zst` and computes hashes
  2. Generates `manifest.json`
  3. Creates and signs the **DSSE statement**
  4. Submits to Rekor (or your mirror) and saves the **v2 receipt**
  5. Packages the bundle folder and publishes to your offline repo
* A checksum file (`*.sha256sum`) for ops to verify out‑of‑band.

---

If you want, I can turn this into a Stella Ops spec page (`docs/modules/scanner/offline-bundles.md`) plus a small reference implementation (C# library + CLI) that drops right into your Scanner service.
Here’s a “drop‑in” Stella Ops dev guide for **DSSE‑Signed Offline Scanner Updates** — written in the same spirit as the existing docs and sprint files.

You can treat this as the seed for `docs/modules/scanner/development/dsse-offline-updates.md` (or similar).

---

# DSSE‑Signed Offline Scanner Updates — Developer Guidelines

> **Audience**
> Scanner, Export Center, Attestor, CLI, and DevOps engineers implementing DSSE‑signed offline vulnerability updates and integrating them into the Offline Update Kit (OUK).
+> +> **Context** +> +> * OUK already ships **signed, atomic offline update bundles** with merged vulnerability feeds, container images, and an attested manifest.([git.stella-ops.org][1]) +> * DSSE + Rekor is already used for **scan evidence** (SBOM attestations, Rekor proofs).([git.stella-ops.org][2]) +> * Sprints 160/162 add **attestation bundles** with manifest, checksums, DSSE signature, and optional transparency log segments, and integrate them into OUK and CLI flows.([git.stella-ops.org][3]) + +These guidelines tell you how to **wire all of that together** for “offline scanner updates” (feeds, rules, packs) in a way that matches Stella Ops’ determinism + sovereignty promises. + +--- + +## 0. Mental model + +At a high level, you’re building this: + +```text + Advisory mirrors / Feeds builders + │ + ▼ + ExportCenter.AttestationBundles + (creates DSSE + Rekor evidence + for each offline update snapshot) + │ + ▼ + Offline Update Kit (OUK) builder + (adds feeds + evidence to kit tarball) + │ + ▼ + stella offline kit import / admin CLI + (verifies Cosign + DSSE + Rekor segments, + then atomically swaps scanner feeds) +``` + +Online, Rekor is live; offline, you rely on **bundled Rekor segments / snapshots** and the existing OUK mechanics (import is atomic, old feeds kept until new bundle is fully verified).([git.stella-ops.org][1]) + +--- + +## 1. Goals & non‑goals + +### Goals + +1. **Authentic offline snapshots** + Every offline scanner update (OUK or delta) must be verifiably tied to: + + * a DSSE envelope, + * a certificate chain rooted in Stella’s Fulcio/KMS profile or BYO KMS/HSM, + * *and* a Rekor v2 inclusion proof or bundled log segment.([Stella Ops][4]) + +2. **Deterministic replay** + Given: + + * a specific offline update kit (`stella-ops-offline-kit-.tgz` + `offline-manifest-.json`)([git.stella-ops.org][1]) + * its DSSE attestation bundle + Rekor segments + every verifier must reach the *same* verdict on integrity and contents — online or fully air‑gapped. + +3. **Separation of concerns** + + * Export Center: build attestation bundles, no business logic about scanning.([git.stella-ops.org][5]) + * Scanner: import & apply feeds; verify but not generate DSSE. + * Signer / Attestor: own DSSE & Rekor integration.([git.stella-ops.org][2]) + +4. **Operational safety** + + * Imports remain **atomic and idempotent**. + * Old feeds stay live until the new update is **fully verified** (Cosign + DSSE + Rekor).([git.stella-ops.org][1]) + +### Non‑goals + +* Designing new crypto or log formats. +* Per‑feed DSSE envelopes (you can have more later, but the minimum contract is **bundle‑level** attestation). + +--- + +## 2. Bundle contract for DSSE‑signed offline updates + +You’re extending the existing OUK contract: + +* OUK already packs: + + * merged vuln feeds (OSV, GHSA, optional NVD 2.0, CNNVD/CNVD, ENISA, JVN, BDU), + * container images (`stella-ops`, Zastava, etc.), + * provenance (Cosign signature, SPDX SBOM, in‑toto SLSA attestation), + * `offline-manifest.json` + detached JWS signed during export.([git.stella-ops.org][1]) + +For **DSSE‑signed offline scanner updates**, add a new logical layer: + +### 2.1. Files to ship + +Inside each offline kit (full or delta) you must produce: + +```text +/attestations/ + offline-update.dsse.json # DSSE envelope + offline-update.rekor.json # Rekor entry + inclusion proof (or segment descriptor) +/manifest/ + offline-manifest.json # existing manifest + offline-manifest.json.jws # existing detached JWS +/feeds/ + ... 
# existing feed payloads +``` + +The exact paths can be adjusted, but keep: + +* **One DSSE bundle per kit** (min spec). +* **One canonical Rekor proof file** per DSSE envelope. + +### 2.2. DSSE payload contents (minimal) + +Define (or reuse) a predicate type such as: + +```jsonc +{ + "payloadType": "application/vnd.in-toto+json", + "payload": { /* base64 */ } +} +``` + +Decoded payload (in-toto statement) should **at minimum** contain: + +* **Subject** + + * `name`: `stella-ops-offline-kit-.tgz` + * `digest.sha256`: tarball digest + +* **Predicate type** (recommendation) + + * `https://stella-ops.org/attestations/offline-update/1` + +* **Predicate fields** + + * `offline_manifest_sha256` – SHA‑256 of `offline-manifest.json` + * `feeds` – array of feed entries such as `{ name, snapshot_date, archive_digest }` (mirrors `rules_and_feeds` style used in the moat doc).([Stella Ops][6]) + * `builder` – CI workflow id / git commit / Export Center job id + * `created_at` – UTC ISO‑8601 + * `oukit_channel` – e.g., `edge`, `stable`, `fips-profile` + +**Guideline:** this DSSE payload is the **single canonical description** of “what this offline update snapshot is”. + +### 2.3. Rekor material + +Attestor must: + +* Submit `offline-update.dsse.json` to Rekor v2, obtaining: + + * `uuid` + * `logIndex` + * inclusion proof (`rootHash`, `hashes`, `checkpoint`) +* Serialize that to `offline-update.rekor.json` and store it in object storage + OUK staging, so it ships in the kit.([git.stella-ops.org][2]) + +For fully offline operation: + +* Either: + + * embed a **minimal log segment** containing that entry; or + * rely on daily Rekor snapshot exports included elsewhere in the kit.([git.stella-ops.org][2]) + +--- + +## 3. Implementation by module + +### 3.1 Export Center — attestation bundles + +**Working directory:** `src/ExportCenter/StellaOps.ExportCenter.AttestationBundles`([git.stella-ops.org][7]) + +**Responsibilities** + +1. **Compose attestation bundle job** (EXPORT‑ATTEST‑74‑001) + + * Input: a snapshot identifier (e.g., offline kit build id or feed snapshot date). + * Read manifest and feed metadata from the Export Center’s storage.([git.stella-ops.org][5]) + * Generate the DSSE payload structure described above. + * Call `StellaOps.Signer` to wrap it in a DSSE envelope. + * Call `StellaOps.Attestor` to submit DSSE → Rekor and get the inclusion proof.([git.stella-ops.org][2]) + * Persist: + + * `offline-update.dsse.json` + * `offline-update.rekor.json` + * any log segment artifacts. + +2. **Integrate into offline kit packaging** (EXPORT‑ATTEST‑74‑002 / 75‑001) + + * The OUK builder (Python script `ops/offline-kit/build_offline_kit.py`) already assembles artifacts & manifests.([Stella Ops][8]) + * Extend that pipeline (or add an Export Center step) to: + + * fetch the attestation bundle for the snapshot, + * place it under `/attestations/` in the kit staging dir, + * ensure `offline-manifest.json` contains entries for the DSSE and Rekor files (name, sha256, size, capturedAt).([git.stella-ops.org][1]) + +3. **Contracts & schemas** + + * Define a small JSON schema for `offline-update.rekor.json` (UUID, index, proof fields) and check it into `docs/11_DATA_SCHEMAS.md` or module‑local schemas. + * Keep all new payload schemas **versioned**; avoid “shape drift”. 
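Tying the section 2.2 fields together, a sketch of the decoded in‑toto statement that the DSSE envelope would wrap; all `<...>` values are placeholders, and the `_type`/`subject` framing follows the in‑toto Statement v1 layout:

```jsonc
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "stella-ops-offline-kit-<version>.tgz",
      "digest": { "sha256": "<tarball-digest>" }
    }
  ],
  "predicateType": "https://stella-ops.org/attestations/offline-update/1",
  "predicate": {
    "offline_manifest_sha256": "<manifest-digest>",
    "feeds": [
      { "name": "osv", "snapshot_date": "2025-11-28", "archive_digest": "<feed-digest>" }
    ],
    "builder": "<ci-workflow-id>@<git-commit>",
    "created_at": "2025-11-29T00:00:00Z",
    "oukit_channel": "stable"
  }
}
```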
+ +**Do / Don’t** + +* ✅ **Do** treat attestation bundle job as *pure aggregation* (AOC guardrail: no modification of evidence).([git.stella-ops.org][5]) +* ✅ **Do** rely on Signer + Attestor; don’t hand‑roll DSSE/Rekor logic in Export Center.([git.stella-ops.org][2]) +* ❌ **Don’t** reach out to external networks from this job — it must run with the same offline‑ready posture as the rest of the platform. + +--- + +### 3.2 Offline Update Kit builder + +**Working area:** `ops/offline-kit/*` + `docs/24_OFFLINE_KIT.md`([git.stella-ops.org][1]) + +Guidelines: + +1. **Preserve current guarantees** + + * Imports must remain **idempotent and atomic**, with **old feeds kept until the new bundle is fully verified**. This now includes DSSE/Rekor checks in addition to Cosign + JWS.([git.stella-ops.org][1]) + +2. **Staging layout** + + * When staging a kit, ensure the tree looks like: + + ```text + out/offline-kit/staging/ + feeds/... + images/... + manifest/offline-manifest.json + attestations/offline-update.dsse.json + attestations/offline-update.rekor.json + ``` + + * Update `offline-manifest.json` so each new file appears with: + + * `name`, `sha256`, `size`, `capturedAt`.([git.stella-ops.org][1]) + +3. **Deterministic ordering** + + * File lists in manifests must be in a stable order (e.g., lexical paths). + * Timestamps = UTC ISO‑8601 only; never use local time. (Matches determinism guidance in AGENTS.md + policy/runs docs.)([git.stella-ops.org][9]) + +4. **Delta kits** + + * For deltas (`stella-ouk-YYYY-MM-DD.delta.tgz`), DSSE should still cover: + + * the delta tarball digest, + * the **logical state** (feeds & versions) after applying the delta. + * Don’t shortcut by “attesting only the diff files” — the predicate must describe the resulting snapshot. + +--- + +### 3.3 Scanner — import & activation + +**Working directory:** `src/Scanner/StellaOps.Scanner.WebService`, `StellaOps.Scanner.Worker`([git.stella-ops.org][9]) + +Scanner already exposes admin flows for: + +* **Offline kit import**, which: + + * validates the Cosign signature of the kit, + * uses the attested manifest, + * keeps old feeds until verification is done.([git.stella-ops.org][1]) + +Add DSSE/Rekor awareness as follows: + +1. **Verification sequence (happy path)** + + On `import-offline-usage-kit`: + + 1. Validate **Cosign** signature of the tarball. + 2. Validate `offline-manifest.json` with its JWS signature. + 3. Verify **file digests** for all entries (including `/attestations/*`). + 4. Verify **DSSE**: + + * Call `StellaOps.Attestor.Verify` (or CLI equivalent) with: + + * `offline-update.dsse.json` + * `offline-update.rekor.json` + * local Rekor log snapshot / segment (if configured)([git.stella-ops.org][2]) + * Ensure the payload digest matches the kit tarball + manifest digests. + 5. Only after all checks pass: + + * swap Scanner’s feed pointer to the new snapshot, + * emit an audit event noting: + + * kit filename, tarball digest, + * DSSE statement digest, + * Rekor UUID + log index. + +2. **Config surface** + + Add config keys (names illustrative): + + ```yaml + scanner: + offlineKit: + requireDsse: true # fail import if DSSE/Rekor verification fails + rekorOfflineMode: true # use local snapshots only + attestationVerifier: https://attestor.internal + ``` + + * Mirror them via ASP.NET Core config + env vars (`SCANNER__OFFLINEKIT__REQUIREDSSSE`, etc.), following the same pattern as the DSSE/Rekor operator guide.([git.stella-ops.org][2]) + +3. 
**Failure behaviour** + + * **DSSE/Rekor fail, Cosign + manifest OK** + + * Keep old feeds active. + * Mark import as failed; surface a `ProblemDetails` error via API/UI. + * Log structured fields: `rekorUuid`, `attestationDigest`, `offlineKitHash`, `failureReason`.([git.stella-ops.org][2]) + + * **Config flag to soften during rollout** + + * When `requireDsse=false`, treat DSSE/Rekor failure as a warning and still allow the import (for initial observation phase), but emit alerts. This mirrors the “observe → enforce” pattern in the DSSE/Rekor operator guide.([git.stella-ops.org][2]) + +--- + +### 3.4 Signer & Attestor + +You mostly **reuse** existing guidance:([git.stella-ops.org][2]) + +* Add a new predicate type & schema for offline updates in Signer. + +* Ensure Attestor: + + * can submit offline‑update DSSE envelopes to Rekor, + * can emit verification routines (used by CLI and Scanner) that: + + * verify the DSSE signature, + * check the certificate chain against the configured root pack (FIPS/eIDAS/GOST/SM, etc.),([Stella Ops][4]) + * verify Rekor inclusion using either live log or local snapshot. + +* For fully air‑gapped installs: + + * rely on Rekor **snapshots mirrored** into Offline Kit (already recommended in the operator guide’s offline section).([git.stella-ops.org][2]) + +--- + +### 3.5 CLI & UI + +Extend CLI with explicit verbs (matching EXPORT‑ATTEST sprints):([git.stella-ops.org][10]) + +* `stella attest bundle verify --bundle path/to/offline-kit.tgz --rekor-key rekor.pub` +* `stella attest bundle import --bundle ...` (for sites that prefer a two‑step “verify then import” flow) +* Wire UI Admin → Offline Kit screen so that: + + * verification status shows both **Cosign/JWS** and **DSSE/Rekor** state, + * policy banners display kit generation time, manifest hash, and DSSE/Rekor freshness.([Stella Ops][11]) + +--- + +## 4. Determinism & offline‑safety rules + +When touching any of this code, keep these rules front‑of‑mind (they align with the policy DSL and architecture docs):([Stella Ops][4]) + +1. **No hidden network dependencies** + + * All verification **must work offline** given the kit + Rekor snapshots. + * Any fallback to live Rekor / Fulcio endpoints must be explicitly toggled and never on by default for “offline mode”. + +2. **Stable serialization** + + * DSSE payload JSON: + + * stable ordering of fields, + * no float weirdness, + * UTC timestamps. + +3. **Replayable imports** + + * Running `import-offline-usage-kit` twice with the same bundle must be a no‑op after the first time. + * The DSSE payload for a given snapshot must not change over time; if it does, bump the predicate or snapshot version. + +4. **Explainability** + + * When verification fails, errors must explain **what** mismatched (kit digest, manifest digest, DSSE envelope hash, Rekor inclusion) so auditors can reason about it. + +--- + +## 5. Testing & CI expectations + +Tie this into the existing CI workflows (`scanner-determinism.yml`, `attestation-bundle.yml`, `offline-kit` pipelines, etc.):([git.stella-ops.org][12]) + +### 5.1 Unit & integration tests + +Write tests that cover: + +1. **Happy paths** + + * Full kit import with valid: + + * Cosign, + * manifest JWS, + * DSSE, + * Rekor proof (online and offline modes). + +2. **Corruption scenarios** + + * Tampered feed file (hash mismatch). + * Tampered `offline-manifest.json`. + * Tampered DSSE payload (signature fails). + * Mismatched Rekor entry (payload digest doesn’t match DSSE). + +3. 
**Offline scenarios** + + * No network access, only Rekor snapshot: + + * DSSE verification still passes, + * Rekor proof validates against local tree head. + +4. **Roll‑back logic** + + * Import fails at DSSE/Rekor step: + + * scanner DB still points at previous feeds, + * metrics/logs show failure and no partial state. + +### 5.2 SLOs & observability + +Reuse metrics suggested by DSSE/Rekor guide and adapt to OUK imports:([git.stella-ops.org][2]) + +* `offlinekit_import_total{status="success|failed_dsse|failed_rekor|failed_cosign"}` +* `offlinekit_attestation_verify_latency_seconds` (histogram) +* `attestor_rekor_success_total`, `attestor_rekor_retry_total`, `rekor_inclusion_latency` +* Dashboards: kit versions per environment, time since last kit, DSSE/Rekor health. + +--- + +## 6. Developer checklist (TL;DR) + +When you pick up a task touching DSSE‑signed offline updates: + +1. **Read the background** + + * `docs/modules/scanner/operations/dsse-rekor-operator-guide.md`([git.stella-ops.org][2]) + * `docs/24_OFFLINE_KIT.md` (and public offline kit guide).([git.stella-ops.org][1]) + * Relevant sprint file (`SPRINT_160_export_evidence`, `SPRINT_162_exportcenter_i`, etc.).([git.stella-ops.org][10]) + +2. **Implement** + + * Generate DSSE payloads in Export Center only. + * Call Signer & Attestor; persist DSSE + Rekor JSON next to manifests. + * Extend OUK builder to include attestation bundle and list it in `offline-manifest.json`. + * Update Scanner import flow to verify DSSE/Rekor before swapping feeds. + +3. **Test** + + * Unit tests for bundle composition & schema. + * Integration tests for import + rollback. + * Determinism tests (same inputs → same DSSE payload). + +4. **Wire telemetry** + + * Counters + latency histograms. + * Logs with `offlineKitHash`, `attestationDigest`, `rekorUuid`. + +5. **Document** + + * Update `docs/modules/export-center/architecture.md`, `docs/modules/scanner/architecture.md`, and the OUK docs where flows or contracts changed.([git.stella-ops.org][5]) + +--- + +If you tell me which module you’re actually coding in next (Scanner, Export Center, CLI, or Attestor), I can turn this into a very concrete “AGENTS.md‑style” section with exact file paths, class names, and a starter test layout for that module. + +[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7bf40b8589c94078e8eadb240553c02f097a5127/docs/24_OFFLINE_KIT.md "git.stella-ops.org/24_OFFLINE_KIT.md at 7bf40b8589c94078e8eadb240553c02f097a5127 - git.stella-ops.org - Gitea: Git with a cup of tea" +[2]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/13e4b53dda1575ba46c6188c794fd465ec6fdeec/docs/modules/scanner/operations/dsse-rekor-operator-guide.md "git.stella-ops.org/dsse-rekor-operator-guide.md at 13e4b53dda1575ba46c6188c794fd465ec6fdeec - git.stella-ops.org - Gitea: Git with a cup of tea" +[3]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/raw/commit/61f963fd52cd4d6bb2f86afc5a82eac04c04b00e/docs/implplan/SPRINT_162_exportcenter_i.md?utm_source=chatgpt.com "https://git.stella-ops.org/stella-ops.org/git.stel..." 
+[4]: https://stella-ops.org/docs/07_high_level_architecture/index.html?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops" +[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/d870da18ce194c6a5f1a6d71abea36205d9fb276/docs/export-center/architecture.md?utm_source=chatgpt.com "Export Center Architecture - Stella Ops" +[6]: https://stella-ops.org/docs/moat/?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops" +[7]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/79b8e53441e92dbc63684f42072434d40b80275f/src/ExportCenter?utm_source=chatgpt.com "Code - Stella Ops" +[8]: https://stella-ops.org/docs/24_offline_kit/?utm_source=chatgpt.com "Offline Update Kit (OUK) — Air‑Gap Bundle - Stella Ops – Open ..." +[9]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7768555f2d107326050cc5ff7f5cb81b82b7ce5f/AGENTS.md "git.stella-ops.org/AGENTS.md at 7768555f2d107326050cc5ff7f5cb81b82b7ce5f - git.stella-ops.org - Gitea: Git with a cup of tea" +[10]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/66cb6c4b8af58a33efa1521b7953dda834431497/docs/implplan/SPRINT_160_export_evidence.md?utm_source=chatgpt.com "git.stella-ops.org/SPRINT_160_export_evidence.md at ..." +[11]: https://stella-ops.org/about/?utm_source=chatgpt.com "Signed Reachability · Deterministic Replay · Sovereign Crypto" +[12]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/actions/?actor=0&status=0&workflow=sdk-publish.yml&utm_source=chatgpt.com "Actions - git.stella-ops.org - Gitea: Git with a cup of tea" diff --git a/docs/product-advisories/01-Dec-2025 - PostgreSQL Patterns for Each StellaOps Module.md b/docs/product-advisories/01-Dec-2025 - PostgreSQL Patterns for Each StellaOps Module.md new file mode 100644 index 000000000..b471dae51 --- /dev/null +++ b/docs/product-advisories/01-Dec-2025 - PostgreSQL Patterns for Each StellaOps Module.md @@ -0,0 +1,819 @@ +Here’s a crisp, opinionated storage blueprint you can hand to your Stella Ops devs right now, plus zero‑downtime conversion tactics so you can keep prototyping fast without painting yourself into a corner. + +# Module → store map (deterministic by default) + +* **Authority / OAuth / Accounts & Audit** + + * **PostgreSQL** as the primary source of truth. + * Tables: `users`, `clients`, `oauth_tokens`, `roles`, `grants`, `audit_log`. + * **Row‑Level Security (RLS)** on `users`, `grants`, `audit_log`; **STRICT FK + CHECK** constraints; **immutable UUID PKs**. + * **Audit**: `audit_log(actor_id, action, entity, entity_id, at timestamptz default now(), diff jsonb)`. + * **Why**: ACID + RLS keeps authz decisions and audit trails deterministic and reviewable. + +* **VEX & Vulnerability Writes** + + * **PostgreSQL** with **JSONB facts + relational decisions**. + * Tables: `vuln_fact(jsonb)`, `vex_decision(package_id, vuln_id, status, rationale, proof_ref, updated_at)`. + * **Materialized views** for triage queues, e.g. `mv_triage_hotset` (refresh on commit or scheduled). + * **Why**: JSONB lets you ingest vendor‑shaped docs; decisions stay relational for joins, integrity, and explainability. + +* **Routing / Feature Flags / Rate‑limits** + + * **PostgreSQL** (truth) + **Redis** (cache). + * Tables: `feature_flag(key, rules jsonb, version)`, `route(domain, service, instance_id, last_heartbeat)`, `rate_limiter(bucket, quota, interval)`. + * Redis keys: `flag:{key}:{version}`, `route:{domain}`, `rl:{bucket}` with short TTLs. 
+ * **Why**: one canonical RDBMS for consistency; Redis for hot path latency. + +* **Unknowns Registry (ambiguity tracker)** + + * **PostgreSQL** with **temporal tables** (bitemporal pattern via `valid_from/valid_to`, `sys_from/sys_to`). + * Table: `unknowns(subject_hash, kind, context jsonb, valid_from, valid_to, sys_from default now(), sys_to)`. + * Views: `unknowns_current` where `valid_to is null`. + * **Why**: preserves how/when uncertainty changed (critical for proofs and audits). + +* **Artifacts / SBOM / VEX files** + + * **OCI‑compatible CAS** (e.g., self‑hosted registry or MinIO bucket as content‑addressable store). + * Keys by **digest** (`sha256:...`), metadata in Postgres `artifact(index)` with `digest`, `media_type`, `size`, `signatures`. + * **Why**: blobs don’t belong in your RDBMS; use CAS for scale + cryptographic addressing. + +--- + +# PostgreSQL implementation essentials (copy/paste starters) + +* **RLS scaffold (Authority)**: + + ```sql + alter table audit_log enable row level security; + create policy p_audit_read_self + on audit_log for select + using (actor_id = current_setting('app.user_id')::uuid or + exists (select 1 from grants g where g.user_id = current_setting('app.user_id')::uuid and g.role = 'auditor')); + ``` + +* **JSONB facts + relational decisions**: + + ```sql + create table vuln_fact ( + id uuid primary key default gen_random_uuid(), + source text not null, + payload jsonb not null, + received_at timestamptz default now() + ); + + create table vex_decision ( + package_id uuid not null, + vuln_id text not null, + status text check (status in ('not_affected','affected','fixed','under_investigation')), + rationale text, + proof_ref text, + decided_at timestamptz default now(), + primary key (package_id, vuln_id) + ); + ``` + +* **Materialized view for triage**: + + ```sql + create materialized view mv_triage_hotset as + select v.id as fact_id, v.payload->>'vuln' as vuln, v.received_at + from vuln_fact v + where (now() - v.received_at) < interval '7 days'; + -- refresh concurrently via job + ``` + +* **Temporal pattern (Unknowns)**: + + ```sql + create table unknowns ( + id uuid primary key default gen_random_uuid(), + subject_hash text not null, + kind text not null, + context jsonb not null, + valid_from timestamptz not null default now(), + valid_to timestamptz, + sys_from timestamptz not null default now(), + sys_to timestamptz + ); + + create view unknowns_current as + select * from unknowns where valid_to is null; + ``` + +--- + +# Conversion (not migration): zero‑downtime, prototype‑friendly + +Even if you’re “not migrating anything yet,” set these rails now so cutting over later is painless. + +1. **Encode Mongo‑shaped docs into JSONB with versioned schemas** + +* Ingest pipeline writes to `*_fact(payload jsonb, schema_version int)`. +* Add a **`validate(schema_version, payload)`** step in your service layer (JSON Schema or SQL checks). +* Keep a **forward‑compatible view** that projects stable columns from JSONB (e.g., `payload->>'id' as vendor_id`) so downstream code doesn’t break when payload evolves. + +2. **Outbox pattern for exactly‑once side‑effects** + +* Add `outbox(id, topic, key, payload jsonb, created_at, dispatched bool default false)`. +* On the same transaction as your write, insert the outbox row. +* A background dispatcher reads `dispatched=false`, publishes to MQ/Webhook, then marks `dispatched=true`. +* Guarantees: no lost events, no duplicates to external systems. + +3. 
+
+* Keep old readers (e.g., Mongo driver) and new Postgres readers in the same service.
+* Gate by `feature_flag('pg_reads')` per tenant or env; flip gradually.
+* Add a **read‑diff monitor** that compares results and logs mismatches to `audit_log(diff)`.
+
+4. **CDC for analytics without coupling**
+
+* Enable **logical replication** (pgoutput) on your key tables.
+* Stream changes into analyzers (reachability, heuristics) without hitting primaries.
+* This lets you keep OLTP clean and still power dashboards/tests.
+
+5. **Materialized views & job cadence**
+
+* Refresh `mv_*` on a fixed cadence (e.g., every 2–5 minutes) or post‑commit for hot paths.
+* Keep **“cold path”** analytics in separate schemas (`analytics.*`) sourced from CDC.
+
+6. **Cutover playbook (phased)**
+
+* Phase A (Dark Read): write Postgres, still serve from Mongo; compare results silently.
+* Phase B (Shadow Serve): 5–10% traffic from Postgres via flag; auto‑rollback switch.
+* Phase C (Authoritative): Postgres becomes source; Mongo path left for emergency read‑only.
+* Phase D (Retire): freeze Mongo, back up, remove writes, delete code paths after 2 stable sprints.
+
+---
+
+# Rate‑limits & flags: single truth, fast edges
+
+* **Truth in Postgres** with versioned flag docs:
+
+  ```sql
+  create table feature_flag (
+    key text primary key,
+    rules jsonb not null,
+    version int not null default 1,
+    updated_at timestamptz default now()
+  );
+  ```
+
+* **Edge cache** in Redis:
+
+  * `SETEX flag:{key}:{version} <ttl-seconds> <rules-json>`
+  * On update, bump `version`; readers compose the cache key with the version (cache‑busting without deletes).
+
+* **Rate limiting**: Persist quotas in Postgres; counters in Redis (`INCR rl:{bucket}:{window}`), with periodic reconciliation jobs writing summaries back to Postgres for audits.
+
+---
+
+# CAS for SBOM/VEX/attestations
+
+* Push blobs to OCI/MinIO by digest; store only pointers in Postgres:
+
+  ```sql
+  create table artifact_index (
+    digest text primary key,
+    media_type text not null,
+    size bigint not null,
+    created_at timestamptz default now(),
+    signature_refs jsonb
+  );
+  ```
+* Benefits: immutable, deduped, easy to mirror into offline kits.
+
+---
+
+# Guardrails your team should follow
+
+* **Always** wrap multi‑table writes (facts + outbox + decisions) in a single transaction.
+* **Prefer** `jsonb_path_query` for targeted reads; **avoid** scanning entire payloads.
+* **Enforce** RLS + least‑privilege roles; the application sets `app.user_id` at session start.
+* **Version everything**: schemas, flags, materialized views; never “change in place” without bumping the version.
+* **Observability**: expose `pg_stat_statements`, refresh latency for `mv_*`, outbox lag, Redis hit ratio, and RLS policy hits.
+
+---
+
+If you want, I can turn this into:
+
+* ready‑to‑run **EF Core 10** migrations,
+* a **/docs/architecture/store-map.md** for your repo,
+* and a tiny **dev seed** (Docker + sample data) so the team can poke it immediately.
+Here’s a focused “PostgreSQL patterns per module” doc you can hand straight to your StellaOps devs.
+
+---
+
+# StellaOps – PostgreSQL Patterns per Module
+
+**Scope:** How each StellaOps module should use PostgreSQL: schema patterns, constraints, RLS, indexing, and transaction rules.
+
+---
+
+## 0. Cross‑cutting PostgreSQL Rules
+
+These apply everywhere unless explicitly overridden.
+
+### 0.1 Core conventions
+
+* **Schemas**
+
+  * Use **one logical schema** per module: `authority`, `routing`, `vex`, `unknowns`, `artifact`.
+  * Shared utilities (e.g., `outbox`) live in a `core` schema.
+
+* **Naming**
+
+  * Tables: `snake_case`, singular: `user`, `feature_flag`, `vuln_fact` (quote reserved names such as `"user"` and `"grant"`).
+  * PK: `id uuid primary key`.
+  * FKs: `_id` (e.g., `user_id`, `tenant_id`).
+  * Timestamps:
+
+    * `created_at timestamptz not null default now()`
+    * `updated_at timestamptz not null default now()`
+
+* **Multi‑tenancy**
+
+  * All tenant‑scoped tables must have `tenant_id uuid not null`.
+  * Enforce tenant isolation with **RLS** on `tenant_id`.
+
+* **Time & timezones**
+
+  * Always `timestamptz`, always store **UTC**, let the DB default `now()`.
+
+### 0.2 RLS & security
+
+* RLS must be **enabled** on any table reachable from a user‑initiated path.
+* Every session must set:
+
+  ```sql
+  select set_config('app.user_id', '<user-uuid>', false);
+  select set_config('app.tenant_id', '<tenant-uuid>', false);
+  select set_config('app.roles', 'role1,role2', false);
+  ```
+* RLS policies:
+
+  * Base policy: `tenant_id = current_setting('app.tenant_id')::uuid`.
+  * Extra predicates for per‑user privacy (e.g., only see own tokens, only own API clients).
+* DB users:
+
+  * Each module’s service has its **own role** with access only to its schema + `core.outbox`.
+
+### 0.3 JSONB & versioning
+
+* Any JSONB column must have:
+
+  * `payload jsonb not null`,
+  * `schema_version int not null`.
+* Always index:
+
+  * by source (`source` / `origin`),
+  * by a small set of projected fields used in WHERE clauses.
+
+### 0.4 Migrations
+
+* All schema changes via migrations, forward‑only.
+* Backwards‑compat pattern:
+
+  1. Add new columns / tables.
+  2. Backfill.
+  3. Flip code to use the new structure (behind a feature flag).
+  4. After stability, remove old columns/paths.
+
+---
+
+## 1. Authority Module (auth, accounts, audit)
+
+**Schema:** `authority.*`
+**Mission:** identity, OAuth, roles, grants, audit.
+
+### 1.1 Core tables & patterns
+
+* `authority."user"`
+
+  ```sql
+  -- "user" is a reserved word in PostgreSQL, so the table name must be quoted
+  create table authority."user" (
+    id uuid primary key default gen_random_uuid(),
+    tenant_id uuid not null,
+    email text not null,
+    display_name text not null,
+    is_disabled boolean not null default false,
+    created_at timestamptz not null default now(),
+    updated_at timestamptz not null default now(),
+    unique (tenant_id, email)
+  );
+  ```
+
+  * Never hard‑delete users: use `is_disabled` (and optionally `disabled_at`).
+
+* `authority.role`
+
+  ```sql
+  create table authority.role (
+    id uuid primary key default gen_random_uuid(),
+    tenant_id uuid not null,
+    name text not null,
+    description text,
+    created_at timestamptz not null default now(),
+    updated_at timestamptz not null default now(),
+    unique (tenant_id, name)
+  );
+  ```
+
+* `authority."grant"`
+
+  ```sql
+  -- "grant" is a reserved word in PostgreSQL, so the table name must be quoted
+  create table authority."grant" (
+    id uuid primary key default gen_random_uuid(),
+    tenant_id uuid not null,
+    user_id uuid not null references authority."user"(id),
+    role_id uuid not null references authority.role(id),
+    created_at timestamptz not null default now(),
+    unique (tenant_id, user_id, role_id)
+  );
+  ```
+
+* `authority.oauth_client`, `authority.oauth_token`
+
+  * Enforce token uniqueness:
+
+    ```sql
+    create table authority.oauth_token (
+      id uuid primary key default gen_random_uuid(),
+      tenant_id uuid not null,
+      user_id uuid not null references authority."user"(id),
+      client_id uuid not null references authority.oauth_client(id),
+      token_hash text not null,      -- hash, never raw
+      expires_at timestamptz not null,
+      created_at timestamptz not null default now(),
+      revoked_at timestamptz,
+      unique (token_hash)
+    );
+    ```
+
+### 1.2 Audit log pattern
+
+* `authority.audit_log`
+
+  ```sql
+  create table authority.audit_log (
+    id uuid primary key default gen_random_uuid(),
+    tenant_id uuid not null,
+    actor_id uuid,            -- null for system
+    action text not null,
+    entity_type text not null,
+    entity_id uuid,
+    at timestamptz not null default now(),
+    diff jsonb not null
+  );
+  ```
+* Insert audit rows in the **same transaction** as the change.
+
+### 1.3 RLS patterns
+
+* Base RLS:
+
+  ```sql
+  alter table authority."user" enable row level security;
+
+  create policy p_user_tenant on authority."user"
+    for all using (tenant_id = current_setting('app.tenant_id')::uuid);
+  ```
+* Extra policies:
+
+  * Audit log is visible only to:
+
+    * the actor themselves, or
+    * users with an `auditor` or `admin` role.
+
+---
+
+## 2. Routing & Feature Flags Module
+
+**Schema:** `routing.*`
+**Mission:** where instances live, what features are on, rate‑limit configuration.
+
+### 2.1 Feature flags
+
+* `routing.feature_flag`
+
+  ```sql
+  create table routing.feature_flag (
+    id uuid primary key default gen_random_uuid(),
+    tenant_id uuid not null,
+    key text not null,
+    rules jsonb not null,
+    version int not null default 1,
+    is_enabled boolean not null default true,
+    created_at timestamptz not null default now(),
+    updated_at timestamptz not null default now(),
+    unique (tenant_id, key)
+  );
+  ```
+
+* **Immutability by version**:
+
+  * On update, **increment `version`**, don’t overwrite historical data.
+ * Mirror changes into a history table via trigger: + + ```sql + create table routing.feature_flag_history ( + id uuid primary key default gen_random_uuid(), + feature_flag_id uuid not null references routing.feature_flag(id), + tenant_id uuid not null, + key text not null, + rules jsonb not null, + version int not null, + changed_at timestamptz not null default now(), + changed_by uuid + ); + ``` + +### 2.2 Instance registry + +* `routing.instance` + + ```sql + create table routing.instance ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + instance_key text not null, + domain text not null, + last_heartbeat timestamptz not null default now(), + status text not null check (status in ('active','draining','offline')), + created_at timestamptz not null default now(), + updated_at timestamptz not null default now(), + unique (tenant_id, instance_key), + unique (tenant_id, domain) + ); + ``` + +* Pattern: + + * Heartbeats use `update ... set last_heartbeat = now()` without touching other fields. + * Routing logic filters by `status='active'` and heartbeat recency. + +### 2.3 Rate‑limit configuration + +* Config in Postgres, counters in Redis: + + ```sql + create table routing.rate_limit_config ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + key text not null, + limit_per_interval int not null, + interval_seconds int not null, + created_at timestamptz not null default now(), + updated_at timestamptz not null default now(), + unique (tenant_id, key) + ); + ``` + +--- + +## 3. VEX & Vulnerability Module + +**Schema:** `vex.*` +**Mission:** ingest vulnerability facts, keep decisions & triage state. + +### 3.1 Facts as JSONB + +* `vex.vuln_fact` + + ```sql + create table vex.vuln_fact ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + source text not null, -- e.g. "nvd", "vendor_x_vex" + external_id text, -- e.g. CVE, advisory id + payload jsonb not null, + schema_version int not null, + received_at timestamptz not null default now() + ); + ``` + +* Index patterns: + + ```sql + create index on vex.vuln_fact (tenant_id, source); + create index on vex.vuln_fact (tenant_id, external_id); + create index vuln_fact_payload_gin on vex.vuln_fact using gin (payload); + ``` + +### 3.2 Decisions as relational data + +* `vex.package` + + ```sql + create table vex.package ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + name text not null, + version text not null, + ecosystem text not null, -- e.g. "pypi", "npm" + created_at timestamptz not null default now(), + unique (tenant_id, name, version, ecosystem) + ); + ``` + +* `vex.vex_decision` + + ```sql + create table vex.vex_decision ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + package_id uuid not null references vex.package(id), + vuln_id text not null, + status text not null check (status in ( + 'not_affected', 'affected', 'fixed', 'under_investigation' + )), + rationale text, + proof_ref text, -- CAS digest or URL + decided_by uuid, + decided_at timestamptz not null default now(), + created_at timestamptz not null default now(), + updated_at timestamptz not null default now(), + unique (tenant_id, package_id, vuln_id) + ); + ``` + +* For history: + + * Keep current state in `vex_decision`. + * Mirror previous versions into `vex_decision_history` table (similar to feature flags). 
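+
+If you prefer to keep the history mirroring in application code instead of a trigger, the same rule can be enforced in a single transaction together with the `core.outbox` write from section 6. A minimal Npgsql sketch follows; the `vex_decision_history` columns and the helper name are illustrative assumptions, not a fixed contract:
+
+```csharp
+using Npgsql;
+
+public static class VexDecisions
+{
+    // Mirror-then-upsert in one transaction (RLS session settings omitted for brevity).
+    public static async Task RecordDecisionAsync(
+        NpgsqlDataSource db, Guid tenantId, Guid packageId, string vulnId, string status)
+    {
+        await using var conn = await db.OpenConnectionAsync();
+        await using var tx = await conn.BeginTransactionAsync();
+
+        // 1) Snapshot the current row into history before changing it.
+        await using (var hist = new NpgsqlCommand(
+            "insert into vex.vex_decision_history " +
+            "  (vex_decision_id, tenant_id, package_id, vuln_id, status, changed_at) " +
+            "select id, tenant_id, package_id, vuln_id, status, now() " +
+            "from vex.vex_decision " +
+            "where tenant_id = @t and package_id = @p and vuln_id = @v", conn, tx))
+        {
+            hist.Parameters.AddWithValue("t", tenantId);
+            hist.Parameters.AddWithValue("p", packageId);
+            hist.Parameters.AddWithValue("v", vulnId);
+            await hist.ExecuteNonQueryAsync();
+        }
+
+        // 2) Upsert the current decision (unique (tenant_id, package_id, vuln_id)).
+        await using (var upsert = new NpgsqlCommand(
+            "insert into vex.vex_decision (tenant_id, package_id, vuln_id, status) " +
+            "values (@t, @p, @v, @s) " +
+            "on conflict (tenant_id, package_id, vuln_id) " +
+            "do update set status = excluded.status, updated_at = now()", conn, tx))
+        {
+            upsert.Parameters.AddWithValue("t", tenantId);
+            upsert.Parameters.AddWithValue("p", packageId);
+            upsert.Parameters.AddWithValue("v", vulnId);
+            upsert.Parameters.AddWithValue("s", status);
+            await upsert.ExecuteNonQueryAsync();
+        }
+
+        // 3) Outbox row in the same transaction (see section 6).
+        //    Use a real JSON serializer for the payload in production code.
+        await using (var outbox = new NpgsqlCommand(
+            "insert into core.outbox (tenant_id, aggregate_type, topic, payload) " +
+            "values (@t, 'vex_decision', 'vex.decision.changed', @j::jsonb)", conn, tx))
+        {
+            outbox.Parameters.AddWithValue("t", tenantId);
+            outbox.Parameters.AddWithValue("j", $"{{\"vulnId\":\"{vulnId}\",\"status\":\"{status}\"}}");
+            await outbox.ExecuteNonQueryAsync();
+        }
+
+        await tx.CommitAsync();
+    }
+}
+```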
+ +### 3.3 Triage queues with materialized views + +* Example triage view: + + ```sql + create materialized view vex.mv_triage_queue as + select + d.tenant_id, + p.name, + p.version, + d.vuln_id, + d.status, + d.decided_at + from vex.vex_decision d + join vex.package p on p.id = d.package_id + where d.status = 'under_investigation'; + ``` + +* Refresh options: + + * Scheduled refresh (cron/worker). + * Or **incremental** via triggers (more complex; use only when needed). + +### 3.4 RLS for VEX + +* All tables scoped by `tenant_id`. +* Typical policy: + + ```sql + alter table vex.vex_decision enable row level security; + + create policy p_vex_tenant on vex.vex_decision + for all using (tenant_id = current_setting('app.tenant_id')::uuid); + ``` + +--- + +## 4. Unknowns Module + +**Schema:** `unknowns.*` +**Mission:** represent uncertainty and how it changes over time. + +### 4.1 Bitemporal unknowns table + +* `unknowns.unknown` + + ```sql + create table unknowns.unknown ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + subject_hash text not null, -- stable identifier for "thing" being reasoned about + kind text not null, -- e.g. "reachability", "version_inferred" + context jsonb not null, -- extra info: call graph node, evidence, etc. + valid_from timestamptz not null default now(), + valid_to timestamptz, + sys_from timestamptz not null default now(), + sys_to timestamptz, + created_at timestamptz not null default now() + ); + ``` + +* “Exactly one open unknown per subject/kind” pattern: + + ```sql + create unique index unknown_one_open_per_subject + on unknowns.unknown (tenant_id, subject_hash, kind) + where valid_to is null; + ``` + +### 4.2 Closing an unknown + +* Close by setting `valid_to` and `sys_to`: + + ```sql + update unknowns.unknown + set valid_to = now(), sys_to = now() + where id = :id and valid_to is null; + ``` + +* Never hard-delete; keep all rows for audit/explanation. + +### 4.3 Convenience views + +* Current unknowns: + + ```sql + create view unknowns.current as + select * + from unknowns.unknown + where valid_to is null; + ``` + +### 4.4 RLS + +* Same tenant policy as other modules; unknowns are tenant‑scoped. + +--- + +## 5. Artifact Index / CAS Module + +**Schema:** `artifact.*` +**Mission:** index of immutable blobs stored in OCI / S3 / MinIO etc. + +### 5.1 Artifact index + +* `artifact.artifact` + + ```sql + create table artifact.artifact ( + digest text primary key, -- e.g. "sha256:..." 
+ tenant_id uuid not null, + media_type text not null, + size_bytes bigint not null, + created_at timestamptz not null default now(), + created_by uuid + ); + ``` + +* Validate digest shape with a CHECK: + + ```sql + alter table artifact.artifact + add constraint chk_digest_format + check (digest ~ '^sha[0-9]+:[0-9a-fA-F]{32,}$'); + ``` + +### 5.2 Signatures and tags + +* `artifact.signature` + + ```sql + create table artifact.signature ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + artifact_digest text not null references artifact.artifact(digest), + signer text not null, + signature_payload jsonb not null, + created_at timestamptz not null default now() + ); + ``` + +* `artifact.tag` + + ```sql + create table artifact.tag ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid not null, + name text not null, + artifact_digest text not null references artifact.artifact(digest), + created_at timestamptz not null default now(), + unique (tenant_id, name) + ); + ``` + +### 5.3 RLS + +* Ensure that tenants cannot see each other’s digests, even if the CAS backing store is shared: + + ```sql + alter table artifact.artifact enable row level security; + + create policy p_artifact_tenant on artifact.artifact + for all using (tenant_id = current_setting('app.tenant_id')::uuid); + ``` + +--- + +## 6. Shared Outbox / Event Pattern + +**Schema:** `core.*` +**Mission:** reliable events for external side‑effects. + +### 6.1 Outbox table + +* `core.outbox` + + ```sql + create table core.outbox ( + id uuid primary key default gen_random_uuid(), + tenant_id uuid, + aggregate_type text not null, -- e.g. "vex_decision", "feature_flag" + aggregate_id uuid, + topic text not null, + payload jsonb not null, + created_at timestamptz not null default now(), + dispatched_at timestamptz, + dispatch_attempts int not null default 0, + error text + ); + ``` + +### 6.2 Usage rule + +* For anything that must emit an event (webhook, Kafka, notifications): + + * In the **same transaction** as the change: + + * write primary data (e.g. `vex.vex_decision`), + * insert an `outbox` row. + * A background worker: + + * pulls undelivered rows, + * sends to external system, + * updates `dispatched_at`/`dispatch_attempts`/`error`. + +--- + +## 7. Indexing & Query Patterns per Module + +### 7.1 Authority + +* Index: + + * `user(tenant_id, email)` + * `grant(tenant_id, user_id)` + * `oauth_token(token_hash)` +* Typical query patterns: + + * Look up user by `tenant_id + email`. + * All roles/grants for a user; design composite indexes accordingly. + +### 7.2 Routing & Flags + +* Index: + + * `feature_flag(tenant_id, key)` + * partial index on enabled flags: + + ```sql + create index on routing.feature_flag (tenant_id, key) + where is_enabled; + ``` + * `instance(tenant_id, status)`, `instance(tenant_id, domain)`. + +### 7.3 VEX + +* Index: + + * `package(tenant_id, name, version, ecosystem)` + * `vex_decision(tenant_id, package_id, vuln_id)` + * GIN on `vuln_fact.payload` for flexible querying. + +### 7.4 Unknowns + +* Index: + + * unique open unknown per subject/kind (shown above). + * `unknown(tenant_id, kind)` for filtering by kind. + +### 7.5 Artifact + +* Index: + + * PK on `digest`. + * `signature(tenant_id, artifact_digest)`. + * `tag(tenant_id, name)`. + +--- + +## 8. Transaction & Isolation Guidelines + +* Default isolation: **READ COMMITTED**. 
+* For critical sequences (e.g., provisioning a tenant, bulk role updates): + + * consider **REPEATABLE READ** or **SERIALIZABLE** and keep transactions short. +* Pattern: + + * One transaction per logical user action (e.g., “set flag”, “record decision”). + * Never do long‑running external calls inside a database transaction. + +--- + +If you’d like, next step I can turn this into: + +* concrete `CREATE SCHEMA` + `CREATE TABLE` migration files, and +* a short “How to write queries in each module” cheat‑sheet for devs (with example SELECT/INSERT/UPDATE patterns). diff --git a/docs/product-advisories/01-Dec-2025 - Proof-Linked VEX User Interface.md b/docs/product-advisories/01-Dec-2025 - Proof-Linked VEX User Interface.md new file mode 100644 index 000000000..d557d311f --- /dev/null +++ b/docs/product-advisories/01-Dec-2025 - Proof-Linked VEX User Interface.md @@ -0,0 +1,585 @@ +Here’s a tight, practical pattern to make your scanner’s vuln‑DB updates rock‑solid even when feeds hiccup: + +# Offline, verifiable update bundles (DSSE + Rekor v2) + +**Idea:** distribute DB updates as offline tarballs. Each tarball ships with: + +* a **DSSE‑signed** statement (e.g., in‑toto style) over the bundle hash +* a **Rekor v2 receipt** proving the signature/statement was logged +* a small **manifest.json** (version, created_at, content hashes) + +**Startup flow (happy path):** + +1. Load latest tarball from your local `updates/` cache. +2. Verify DSSE signature against your trusted public keys. +3. Verify Rekor v2 receipt (inclusion proof) matches the DSSE payload hash. +4. If both pass, unpack/activate; record the bundle’s **trust_id** (e.g., statement digest). +5. If anything fails, **keep using the last good bundle**. No service disruption. + +**Why this helps** + +* **Air‑gap friendly:** no live network needed at activation time. +* **Tamper‑evident:** DSSE + Rekor receipt proves provenance and transparency. +* **Operational stability:** feed outages become non‑events—scanner just keeps the last good state. + +--- + +## File layout inside each bundle + +``` +/bundle-2025-11-29/ + manifest.json # { version, created_at, entries[], sha256s } + payload.tar.zst # the actual DB/indices + payload.tar.zst.sha256 + statement.dsse.json # DSSE-wrapped statement over payload hash + rekor-receipt.json # Rekor v2 inclusion/verification material +``` + +--- + +## Acceptance/Activation rules + +* **Trust root:** pin one (or more) publisher public keys; rotate via separate, out‑of‑band process. +* **Monotonicity:** only activate if `manifest.version > current.version` (or if trust policy explicitly allows replay for rollback testing). +* **Atomic switch:** unpack to `db/staging/`, validate, then symlink‑flip to `db/active/`. +* **Quarantine on failure:** move bad bundles to `updates/quarantine/` with a reason code. 
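+
+The monotonicity and quarantine rules are easy to enforce as a small gate before the verifier sketch below runs. This is a hedged sketch; `ActivationState` and the on‑disk layout are illustrative assumptions:
+
+```csharp
+using System;
+using System.IO;
+
+public sealed record ActivationState(Version LastGoodVersion);
+
+public static class ActivationGate
+{
+    // Returns true when the bundle may proceed to DSSE/Rekor verification.
+    // On a monotonicity violation the bundle is quarantined with a reason code.
+    public static bool TryAccept(
+        string bundleDir, Version bundleVersion, ActivationState state,
+        string quarantineDir, bool allowReplay = false)
+    {
+        if (bundleVersion <= state.LastGoodVersion && !allowReplay)
+        {
+            Directory.CreateDirectory(quarantineDir);
+            // Keep the evidence; encode the reason code in the folder name.
+            Directory.Move(bundleDir, Path.Combine(
+                quarantineDir, Path.GetFileName(bundleDir) + ".VERSION_NOT_MONOTONIC"));
+            return false;
+        }
+        return true;
+    }
+}
+```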
+
+---
+
+## Minimal .NET 10 verifier sketch (C#)
+
+```csharp
+public sealed record BundlePaths(string Dir) {
+  public string Manifest => Path.Combine(Dir, "manifest.json");
+  public string Payload  => Path.Combine(Dir, "payload.tar.zst");
+  public string Dsse     => Path.Combine(Dir, "statement.dsse.json");
+  public string Receipt  => Path.Combine(Dir, "rekor-receipt.json");
+}
+
+// Returns true only when every verification step passes; helper types are illustrative.
+public async Task<bool> ActivateBundleAsync(BundlePaths b, TrustConfig trust, string activeDir) {
+  var manifest = await Manifest.LoadAsync(b.Manifest);
+  if (!await Hashes.VerifyAsync(b.Payload, manifest.PayloadSha256)) return false;
+
+  // 1) DSSE verify (publisher keys pinned in trust)
+  var (okSig, dssePayloadDigest) = await Dsse.VerifyAsync(b.Dsse, trust.PublisherKeys);
+  if (!okSig || dssePayloadDigest != manifest.PayloadSha256) return false;
+
+  // 2) Rekor v2 receipt verify (inclusion + statement digest == dssePayloadDigest)
+  if (!await RekorV2.VerifyReceiptAsync(b.Receipt, dssePayloadDigest, trust.RekorPub)) return false;
+
+  // 3) Stage, validate, then atomically flip
+  var staging = Path.Combine(activeDir, "..", "staging");
+  DirUtil.Empty(staging);
+  await TarZstd.ExtractAsync(b.Payload, staging);
+  if (!await LocalDbSelfCheck.RunAsync(staging)) return false;
+
+  SymlinkUtil.AtomicSwap(source: staging, target: activeDir);
+  State.WriteLastGood(manifest.Version, dssePayloadDigest);
+  return true;
+}
+```
+
+---
+
+## Operational playbook
+
+* **On boot & daily at HH:MM:** try `ActivateBundleAsync()` on the newest bundle; on failure, log and continue.
+* **Telemetry (no PII):** reason codes (SIG_FAIL, RECEIPT_FAIL, HASH_MISMATCH, SELFTEST_FAIL), versions, last_good.
+* **Keys & rotation:** keep `publisher.pub` and `rekor.pub` in a root‑owned, read‑only path; rotate via a separate signed “trust bundle”.
+* **Defense‑in‑depth:** verify both the **payload hash** and each file’s hash listed in `manifest.entries[]`.
+* **Rollback:** allow `--force-activate <version>` for emergency testing, but mark it as **non‑monotonic** in state.
+
+---
+
+## What to hand your release team
+
+* A Make/CI target that:
+
+  1. Builds `payload.tar.zst` and computes hashes
+  2. Generates `manifest.json`
+  3. Creates and signs the **DSSE statement**
+  4. Submits to Rekor (or your mirror) and saves the **v2 receipt**
+  5. Packages the bundle folder and publishes to your offline repo
+* A checksum file (`*.sha256sum`) for ops to verify out‑of‑band.
+
+---
+
+If you want, I can turn this into a Stella Ops spec page (`docs/modules/scanner/offline-bundles.md`) plus a small reference implementation (C# library + CLI) that drops right into your Scanner service.
+Here’s a “drop‑in” Stella Ops dev guide for **DSSE‑signed Offline Scanner Updates** — written in the same spirit as the existing docs and sprint files.
+
+You can treat this as the seed for `docs/modules/scanner/development/dsse-offline-updates.md` (or similar).
+
+---
+
+# DSSE‑Signed Offline Scanner Updates — Developer Guidelines
+
+> **Audience**
+> Scanner, Export Center, Attestor, CLI, and DevOps engineers implementing DSSE‑signed offline vulnerability updates and integrating them into the Offline Update Kit (OUK).
+>
+> **Context**
+>
+> * OUK already ships **signed, atomic offline update bundles** with merged vulnerability feeds, container images, and an attested manifest.([git.stella-ops.org][1])
+> * DSSE + Rekor is already used for **scan evidence** (SBOM attestations, Rekor proofs).([git.stella-ops.org][2])
+> * Sprints 160/162 add **attestation bundles** with manifest, checksums, DSSE signature, and optional transparency log segments, and integrate them into OUK and CLI flows.([git.stella-ops.org][3])
+
+These guidelines tell you how to **wire all of that together** for “offline scanner updates” (feeds, rules, packs) in a way that matches Stella Ops’ determinism + sovereignty promises.
+
+---
+
+## 0. Mental model
+
+At a high level, you’re building this:
+
+```text
+   Advisory mirrors / Feeds builders
+                │
+                ▼
+   ExportCenter.AttestationBundles
+   (creates DSSE + Rekor evidence
+    for each offline update snapshot)
+                │
+                ▼
+   Offline Update Kit (OUK) builder
+   (adds feeds + evidence to kit tarball)
+                │
+                ▼
+   stella offline kit import / admin CLI
+   (verifies Cosign + DSSE + Rekor segments,
+    then atomically swaps scanner feeds)
+```
+
+Online, Rekor is live; offline, you rely on **bundled Rekor segments / snapshots** and the existing OUK mechanics (import is atomic, old feeds kept until the new bundle is fully verified).([git.stella-ops.org][1])
+
+---
+
+## 1. Goals & non‑goals
+
+### Goals
+
+1. **Authentic offline snapshots**
+   Every offline scanner update (OUK or delta) must be verifiably tied to:
+
+   * a DSSE envelope,
+   * a certificate chain rooted in Stella’s Fulcio/KMS profile or BYO KMS/HSM,
+   * *and* a Rekor v2 inclusion proof or bundled log segment.([Stella Ops][4])
+
+2. **Deterministic replay**
+   Given:
+
+   * a specific offline update kit (`stella-ops-offline-kit-<version>.tgz` + `offline-manifest-<version>.json`)([git.stella-ops.org][1])
+   * its DSSE attestation bundle + Rekor segments
+
+   every verifier must reach the *same* verdict on integrity and contents — online or fully air‑gapped.
+
+3. **Separation of concerns**
+
+   * Export Center: build attestation bundles, no business logic about scanning.([git.stella-ops.org][5])
+   * Scanner: import & apply feeds; verify but not generate DSSE.
+   * Signer / Attestor: own DSSE & Rekor integration.([git.stella-ops.org][2])
+
+4. **Operational safety**
+
+   * Imports remain **atomic and idempotent**.
+   * Old feeds stay live until the new update is **fully verified** (Cosign + DSSE + Rekor).([git.stella-ops.org][1])
+
+### Non‑goals
+
+* Designing new crypto or log formats.
+* Per‑feed DSSE envelopes (you can add more later, but the minimum contract is **bundle‑level** attestation).
+
+---
+
+## 2. Bundle contract for DSSE‑signed offline updates
+
+You’re extending the existing OUK contract:
+
+* OUK already packs:
+
+  * merged vuln feeds (OSV, GHSA, optional NVD 2.0, CNNVD/CNVD, ENISA, JVN, BDU),
+  * container images (`stella-ops`, Zastava, etc.),
+  * provenance (Cosign signature, SPDX SBOM, in‑toto SLSA attestation),
+  * `offline-manifest.json` + detached JWS signed during export.([git.stella-ops.org][1])
+
+For **DSSE‑signed offline scanner updates**, add a new logical layer:
+
+### 2.1. Files to ship
+
+Inside each offline kit (full or delta) you must produce:
+
+```text
+/attestations/
+  offline-update.dsse.json    # DSSE envelope
+  offline-update.rekor.json   # Rekor entry + inclusion proof (or segment descriptor)
+/manifest/
+  offline-manifest.json       # existing manifest
+  offline-manifest.json.jws   # existing detached JWS
+/feeds/
+  ...                         # existing feed payloads
+```
+
+The exact paths can be adjusted, but keep:
+
+* **One DSSE bundle per kit** (min spec).
+* **One canonical Rekor proof file** per DSSE envelope.
+
+### 2.2. DSSE payload contents (minimal)
+
+Define (or reuse) a predicate type such as:
+
+```jsonc
+{
+  "payloadType": "application/vnd.in-toto+json",
+  "payload": "<base64-encoded in-toto statement>"
+}
+```
+
+The decoded payload (in-toto statement) should **at minimum** contain:
+
+* **Subject**
+
+  * `name`: `stella-ops-offline-kit-<version>.tgz`
+  * `digest.sha256`: tarball digest
+
+* **Predicate type** (recommendation)
+
+  * `https://stella-ops.org/attestations/offline-update/1`
+
+* **Predicate fields**
+
+  * `offline_manifest_sha256` – SHA‑256 of `offline-manifest.json`
+  * `feeds` – array of feed entries such as `{ name, snapshot_date, archive_digest }` (mirrors the `rules_and_feeds` style used in the moat doc).([Stella Ops][6])
+  * `builder` – CI workflow id / git commit / Export Center job id
+  * `created_at` – UTC ISO‑8601
+  * `oukit_channel` – e.g., `edge`, `stable`, `fips-profile`
+
+**Guideline:** this DSSE payload is the **single canonical description** of “what this offline update snapshot is”.
+
+### 2.3. Rekor material
+
+Attestor must:
+
+* Submit `offline-update.dsse.json` to Rekor v2, obtaining:
+
+  * `uuid`
+  * `logIndex`
+  * inclusion proof (`rootHash`, `hashes`, `checkpoint`)
+* Serialize that to `offline-update.rekor.json` and store it in object storage + OUK staging, so it ships in the kit.([git.stella-ops.org][2])
+
+For fully offline operation:
+
+* Either:
+
+  * embed a **minimal log segment** containing that entry; or
+  * rely on daily Rekor snapshot exports included elsewhere in the kit.([git.stella-ops.org][2])
+
+---
+
+## 3. Implementation by module
+
+### 3.1 Export Center — attestation bundles
+
+**Working directory:** `src/ExportCenter/StellaOps.ExportCenter.AttestationBundles`([git.stella-ops.org][7])
+
+**Responsibilities**
+
+1. **Compose attestation bundle job** (EXPORT‑ATTEST‑74‑001)
+
+   * Input: a snapshot identifier (e.g., offline kit build id or feed snapshot date).
+   * Read manifest and feed metadata from the Export Center’s storage.([git.stella-ops.org][5])
+   * Generate the DSSE payload structure described above.
+   * Call `StellaOps.Signer` to wrap it in a DSSE envelope.
+   * Call `StellaOps.Attestor` to submit DSSE → Rekor and get the inclusion proof.([git.stella-ops.org][2])
+   * Persist:
+
+     * `offline-update.dsse.json`
+     * `offline-update.rekor.json`
+     * any log segment artifacts.
+
+2. **Integrate into offline kit packaging** (EXPORT‑ATTEST‑74‑002 / 75‑001)
+
+   * The OUK builder (Python script `ops/offline-kit/build_offline_kit.py`) already assembles artifacts & manifests.([Stella Ops][8])
+   * Extend that pipeline (or add an Export Center step) to:
+
+     * fetch the attestation bundle for the snapshot,
+     * place it under `/attestations/` in the kit staging dir,
+     * ensure `offline-manifest.json` contains entries for the DSSE and Rekor files (name, sha256, size, capturedAt).([git.stella-ops.org][1])
+
+3. **Contracts & schemas**
+
+   * Define a small JSON schema for `offline-update.rekor.json` (UUID, index, proof fields) and check it into `docs/11_DATA_SCHEMAS.md` or module‑local schemas (a starter model is sketched below).
+   * Keep all new payload schemas **versioned**; avoid “shape drift”.
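+
+As a seed for that schema, a hedged C# model of `offline-update.rekor.json` might look like the following; the field names mirror the Rekor material listed in section 2.3 but are illustrative until the real schema is checked in:
+
+```csharp
+using System.Collections.Generic;
+using System.Text.Json.Serialization;
+
+// Versioned model for offline-update.rekor.json (field names illustrative).
+public sealed record RekorReceipt(
+    [property: JsonPropertyName("schemaVersion")] int SchemaVersion,
+    [property: JsonPropertyName("uuid")] string Uuid,
+    [property: JsonPropertyName("logIndex")] long LogIndex,
+    [property: JsonPropertyName("inclusionProof")] InclusionProof InclusionProof);
+
+// Inclusion proof material: root hash, sibling hashes, and the signed checkpoint.
+public sealed record InclusionProof(
+    [property: JsonPropertyName("rootHash")] string RootHash,
+    [property: JsonPropertyName("hashes")] IReadOnlyList<string> Hashes,
+    [property: JsonPropertyName("checkpoint")] string Checkpoint);
+```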
+
+**Do / Don’t**
+
+* ✅ **Do** treat the attestation bundle job as *pure aggregation* (AOC guardrail: no modification of evidence).([git.stella-ops.org][5])
+* ✅ **Do** rely on Signer + Attestor; don’t hand‑roll DSSE/Rekor logic in Export Center.([git.stella-ops.org][2])
+* ❌ **Don’t** reach out to external networks from this job — it must run with the same offline‑ready posture as the rest of the platform.
+
+---
+
+### 3.2 Offline Update Kit builder
+
+**Working area:** `ops/offline-kit/*` + `docs/24_OFFLINE_KIT.md`([git.stella-ops.org][1])
+
+Guidelines:
+
+1. **Preserve current guarantees**
+
+   * Imports must remain **idempotent and atomic**, with **old feeds kept until the new bundle is fully verified**. This now includes DSSE/Rekor checks in addition to Cosign + JWS.([git.stella-ops.org][1])
+
+2. **Staging layout**
+
+   * When staging a kit, ensure the tree looks like:
+
+     ```text
+     out/offline-kit/staging/
+       feeds/...
+       images/...
+       manifest/offline-manifest.json
+       attestations/offline-update.dsse.json
+       attestations/offline-update.rekor.json
+     ```
+
+   * Update `offline-manifest.json` so each new file appears with:
+
+     * `name`, `sha256`, `size`, `capturedAt`.([git.stella-ops.org][1])
+
+3. **Deterministic ordering**
+
+   * File lists in manifests must be in a stable order (e.g., lexical paths).
+   * Timestamps = UTC ISO‑8601 only; never use local time. (Matches determinism guidance in AGENTS.md + policy/runs docs.)([git.stella-ops.org][9])
+
+4. **Delta kits**
+
+   * For deltas (`stella-ouk-YYYY-MM-DD.delta.tgz`), DSSE should still cover:
+
+     * the delta tarball digest,
+     * the **logical state** (feeds & versions) after applying the delta.
+   * Don’t shortcut by “attesting only the diff files” — the predicate must describe the resulting snapshot.
+
+---
+
+### 3.3 Scanner — import & activation
+
+**Working directory:** `src/Scanner/StellaOps.Scanner.WebService`, `StellaOps.Scanner.Worker`([git.stella-ops.org][9])
+
+Scanner already exposes admin flows for:
+
+* **Offline kit import**, which:
+
+  * validates the Cosign signature of the kit,
+  * uses the attested manifest,
+  * keeps old feeds until verification is done.([git.stella-ops.org][1])
+
+Add DSSE/Rekor awareness as follows:
+
+1. **Verification sequence (happy path)**
+
+   On `import-offline-usage-kit`:
+
+   1. Validate the **Cosign** signature of the tarball.
+   2. Validate `offline-manifest.json` with its JWS signature.
+   3. Verify **file digests** for all entries (including `/attestations/*`).
+   4. Verify **DSSE**:
+
+      * Call `StellaOps.Attestor.Verify` (or CLI equivalent) with:
+
+        * `offline-update.dsse.json`
+        * `offline-update.rekor.json`
+        * local Rekor log snapshot / segment (if configured)([git.stella-ops.org][2])
+      * Ensure the payload digest matches the kit tarball + manifest digests.
+   5. Only after all checks pass:
+
+      * swap Scanner’s feed pointer to the new snapshot,
+      * emit an audit event noting:
+
+        * kit filename, tarball digest,
+        * DSSE statement digest,
+        * Rekor UUID + log index.
+
+2. **Config surface**
+
+   Add config keys (names illustrative):
+
+   ```yaml
+   scanner:
+     offlineKit:
+       requireDsse: true          # fail import if DSSE/Rekor verification fails
+       rekorOfflineMode: true     # use local snapshots only
+       attestationVerifier: https://attestor.internal
+   ```
+
+   * Mirror them via ASP.NET Core config + env vars (`SCANNER__OFFLINEKIT__REQUIREDSSE`, etc.), following the same pattern as the DSSE/Rekor operator guide.([git.stella-ops.org][2])
+
+3. 
**Failure behaviour** + + * **DSSE/Rekor fail, Cosign + manifest OK** + + * Keep old feeds active. + * Mark import as failed; surface a `ProblemDetails` error via API/UI. + * Log structured fields: `rekorUuid`, `attestationDigest`, `offlineKitHash`, `failureReason`.([git.stella-ops.org][2]) + + * **Config flag to soften during rollout** + + * When `requireDsse=false`, treat DSSE/Rekor failure as a warning and still allow the import (for initial observation phase), but emit alerts. This mirrors the “observe → enforce” pattern in the DSSE/Rekor operator guide.([git.stella-ops.org][2]) + +--- + +### 3.4 Signer & Attestor + +You mostly **reuse** existing guidance:([git.stella-ops.org][2]) + +* Add a new predicate type & schema for offline updates in Signer. + +* Ensure Attestor: + + * can submit offline‑update DSSE envelopes to Rekor, + * can emit verification routines (used by CLI and Scanner) that: + + * verify the DSSE signature, + * check the certificate chain against the configured root pack (FIPS/eIDAS/GOST/SM, etc.),([Stella Ops][4]) + * verify Rekor inclusion using either live log or local snapshot. + +* For fully air‑gapped installs: + + * rely on Rekor **snapshots mirrored** into Offline Kit (already recommended in the operator guide’s offline section).([git.stella-ops.org][2]) + +--- + +### 3.5 CLI & UI + +Extend CLI with explicit verbs (matching EXPORT‑ATTEST sprints):([git.stella-ops.org][10]) + +* `stella attest bundle verify --bundle path/to/offline-kit.tgz --rekor-key rekor.pub` +* `stella attest bundle import --bundle ...` (for sites that prefer a two‑step “verify then import” flow) +* Wire UI Admin → Offline Kit screen so that: + + * verification status shows both **Cosign/JWS** and **DSSE/Rekor** state, + * policy banners display kit generation time, manifest hash, and DSSE/Rekor freshness.([Stella Ops][11]) + +--- + +## 4. Determinism & offline‑safety rules + +When touching any of this code, keep these rules front‑of‑mind (they align with the policy DSL and architecture docs):([Stella Ops][4]) + +1. **No hidden network dependencies** + + * All verification **must work offline** given the kit + Rekor snapshots. + * Any fallback to live Rekor / Fulcio endpoints must be explicitly toggled and never on by default for “offline mode”. + +2. **Stable serialization** + + * DSSE payload JSON: + + * stable ordering of fields, + * no float weirdness, + * UTC timestamps. + +3. **Replayable imports** + + * Running `import-offline-usage-kit` twice with the same bundle must be a no‑op after the first time. + * The DSSE payload for a given snapshot must not change over time; if it does, bump the predicate or snapshot version. + +4. **Explainability** + + * When verification fails, errors must explain **what** mismatched (kit digest, manifest digest, DSSE envelope hash, Rekor inclusion) so auditors can reason about it. + +--- + +## 5. Testing & CI expectations + +Tie this into the existing CI workflows (`scanner-determinism.yml`, `attestation-bundle.yml`, `offline-kit` pipelines, etc.):([git.stella-ops.org][12]) + +### 5.1 Unit & integration tests + +Write tests that cover: + +1. **Happy paths** + + * Full kit import with valid: + + * Cosign, + * manifest JWS, + * DSSE, + * Rekor proof (online and offline modes). + +2. **Corruption scenarios** + + * Tampered feed file (hash mismatch). + * Tampered `offline-manifest.json`. + * Tampered DSSE payload (signature fails). + * Mismatched Rekor entry (payload digest doesn’t match DSSE). + +3. 
+**Offline scenarios**
+
+   * No network access, only a Rekor snapshot:
+
+     * DSSE verification still passes,
+     * the Rekor proof validates against the local tree head.
+
+4. **Roll‑back logic**
+
+   * Import fails at the DSSE/Rekor step:
+
+     * the scanner DB still points at the previous feeds,
+     * metrics/logs show the failure and no partial state.
+
+### 5.2 SLOs & observability
+
+Reuse the metrics suggested by the DSSE/Rekor guide and adapt them to OUK imports:([git.stella-ops.org][2])
+
+* `offlinekit_import_total{status="success|failed_dsse|failed_rekor|failed_cosign"}`
+* `offlinekit_attestation_verify_latency_seconds` (histogram)
+* `attestor_rekor_success_total`, `attestor_rekor_retry_total`, `rekor_inclusion_latency`
+* Dashboards: kit versions per environment, time since last kit, DSSE/Rekor health.
+
+---
+
+## 6. Developer checklist (TL;DR)
+
+When you pick up a task touching DSSE‑signed offline updates:
+
+1. **Read the background**
+
+   * `docs/modules/scanner/operations/dsse-rekor-operator-guide.md`([git.stella-ops.org][2])
+   * `docs/24_OFFLINE_KIT.md` (and the public offline kit guide).([git.stella-ops.org][1])
+   * The relevant sprint file (`SPRINT_160_export_evidence`, `SPRINT_162_exportcenter_i`, etc.).([git.stella-ops.org][10])
+
+2. **Implement**
+
+   * Generate DSSE payloads in Export Center only.
+   * Call Signer & Attestor; persist DSSE + Rekor JSON next to manifests.
+   * Extend the OUK builder to include the attestation bundle and list it in `offline-manifest.json`.
+   * Update the Scanner import flow to verify DSSE/Rekor before swapping feeds.
+
+3. **Test**
+
+   * Unit tests for bundle composition & schema.
+   * Integration tests for import + rollback.
+   * Determinism tests (same inputs → same DSSE payload).
+
+4. **Wire telemetry**
+
+   * Counters + latency histograms.
+   * Logs with `offlineKitHash`, `attestationDigest`, `rekorUuid`.
+
+5. **Document**
+
+   * Update `docs/modules/export-center/architecture.md`, `docs/modules/scanner/architecture.md`, and the OUK docs where flows or contracts changed.([git.stella-ops.org][5])
+
+---
+
+If you tell me which module you’re actually coding in next (Scanner, Export Center, CLI, or Attestor), I can turn this into a very concrete “AGENTS.md‑style” section with exact file paths, class names, and a starter test layout for that module.
+
+[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7bf40b8589c94078e8eadb240553c02f097a5127/docs/24_OFFLINE_KIT.md "git.stella-ops.org/24_OFFLINE_KIT.md at 7bf40b8589c94078e8eadb240553c02f097a5127"
+[2]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/13e4b53dda1575ba46c6188c794fd465ec6fdeec/docs/modules/scanner/operations/dsse-rekor-operator-guide.md "git.stella-ops.org/dsse-rekor-operator-guide.md at 13e4b53dda1575ba46c6188c794fd465ec6fdeec"
+[3]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/raw/commit/61f963fd52cd4d6bb2f86afc5a82eac04c04b00e/docs/implplan/SPRINT_162_exportcenter_i.md "SPRINT_162_exportcenter_i.md - Stella Ops"
+[4]: https://stella-ops.org/docs/07_high_level_architecture/index.html "Open • Sovereign • Modular container security - Stella Ops"
+[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/d870da18ce194c6a5f1a6d71abea36205d9fb276/docs/export-center/architecture.md "Export Center Architecture - Stella Ops"
+[6]: https://stella-ops.org/docs/moat/ "Open • Sovereign • Modular container security - Stella Ops"
+[7]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/79b8e53441e92dbc63684f42072434d40b80275f/src/ExportCenter "Code - Stella Ops"
+[8]: https://stella-ops.org/docs/24_offline_kit/ "Offline Update Kit (OUK) — Air‑Gap Bundle - Stella Ops"
+[9]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7768555f2d107326050cc5ff7f5cb81b82b7ce5f/AGENTS.md "git.stella-ops.org/AGENTS.md at 7768555f2d107326050cc5ff7f5cb81b82b7ce5f"
+[10]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/66cb6c4b8af58a33efa1521b7953dda834431497/docs/implplan/SPRINT_160_export_evidence.md "git.stella-ops.org/SPRINT_160_export_evidence.md"
+[11]: https://stella-ops.org/about/ "Signed Reachability · Deterministic Replay · Sovereign Crypto"
+[12]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/actions/?actor=0&status=0&workflow=sdk-publish.yml "Actions - git.stella-ops.org"
diff --git a/docs/product-advisories/01-Dec-2025 - Tracking UX Health with Time‑to‑Evidence.md b/docs/product-advisories/01-Dec-2025 - Tracking UX Health with Time‑to‑Evidence.md
new file mode 100644
index 000000000..dbe185a4c
--- /dev/null
+++ b/docs/product-advisories/01-Dec-2025 - Tracking UX Health with Time‑to‑Evidence.md
@@ -0,0 +1,425 @@
+Here’s a simple metric that will make your security UI (and teams) radically better: **Time‑to‑Evidence (TTE)** — the time from opening a finding to seeing *raw proof* (a data‑flow edge, an SBOM line, or a VEX note), not a summary.
+
+---
+
+### What it is
+
+* **Definition:** TTE = `t_first_proof_rendered − t_open_finding`.
+* **Proof =** the exact artifact or path that justifies the claim (e.g., `package-lock.json: line 214 → openssl@1.1.1`, `reachability: A → B → C sink`, or `VEX: not_affected due to unreachable code`).
+* **Target:** **P95 ≤ 15s** (stretch: P99 ≤ 30s). If 95% of findings show proof within 15 seconds, the UI stays honest: evidence before opinion, low noise, fast explainability.
+
+---
+
+### Why it matters
+
+* **Trust:** People accept decisions they can *verify* quickly.
+* **Triage speed:** Proof-first UIs cut back-and-forth and guesswork.
+* **Noise control:** If you can’t surface proof fast, you probably shouldn’t surface the finding yet.
+
+---
+
+### How to measure (engineering‑ready)
+
+* Emit two stamps per finding view:
+
+  * `t_open_finding` (on route enter or modal open).
+  * `t_first_proof_rendered` (first DOM paint of the SBOM line / path list / VEX clause).
+* Store as `tte_ms` in a lightweight events table (Postgres) with tags: `tenant`, `finding_id`, `proof_kind` (`sbom|reachability|vex`), `source` (`local|remote|cache`).
+* Nightly rollup: compute P50/P90/P95/P99 by proof_kind and page.
+* Alert when **P95 > 15s** for 15 minutes.
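+
+On the backend, a tiny ingestion endpoint is enough to land these events in Postgres. A minimal ASP.NET Core sketch follows; the endpoint path, payload shape, and `tte_events` columns are illustrative assumptions that mirror the tags above:
+
+```csharp
+using Npgsql;
+
+var builder = WebApplication.CreateBuilder(args);
+builder.Services.AddNpgsqlDataSource(builder.Configuration.GetConnectionString("tte")!);
+var app = builder.Build();
+
+// Accepts one pre-computed TTE measurement per finding view.
+app.MapPost("/metrics/tte", async (TteEvent e, NpgsqlDataSource db) =>
+{
+    await using var cmd = db.CreateCommand(
+        "insert into tte_events (ts, tenant, finding_id, proof_kind, source, tte_ms) " +
+        "values (now(), @tenant, @finding, @kind, @source, @tte)");
+    cmd.Parameters.AddWithValue("tenant", e.Tenant);
+    cmd.Parameters.AddWithValue("finding", e.FindingId);
+    cmd.Parameters.AddWithValue("kind", e.ProofKind);   // sbom | reachability | vex
+    cmd.Parameters.AddWithValue("source", e.Source);    // local | remote | cache
+    cmd.Parameters.AddWithValue("tte", e.TteMs);
+    await cmd.ExecuteNonQueryAsync();
+    return Results.Accepted();
+});
+
+app.Run();
+
+public sealed record TteEvent(
+    string Tenant, string FindingId, string ProofKind, string Source, int TteMs);
+```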
+ +--- + +### UI contract (keeps the UX honest) + +* **Above the fold:** always show a compact **Proof panel** first (not hidden behind tabs). +* **Skeletons over spinners:** reserve space; render partial proof as soon as any piece is ready. +* **Plain text copy affordance:** “Copy SBOM line / path” button right next to the proof. +* **Defer non‑proof widgets:** CVSS badges, remediation prose, and charts load *after* proof. +* **Empty‑state truth:** if no proof exists, say “No proof available yet” and show the loader for *that* proof type only (don’t pretend with summaries). + +--- + +### Backend rules of thumb + +* **Pre‑index for first paint:** cache top N proof items per hot finding (e.g., first SBOM hit + shortest path). +* **Bound queries:** proof queries must be *O(log n)* on indexed columns (pkg name@version, file hash, graph node id). +* **Chunked streaming:** send first proof chunk <200 ms after backend hit; don’t hold for the full set. +* **Timeout budget:** 12s backend budget + 3s UI/render margin = 15s P95. + +--- + +### Minimal contract to add in your code + +```ts +// Frontend: fire on open +metrics.emit('finding_open', { findingId, t: performance.now() }); + +// When the first real proof node/line hits the DOM: +metrics.emit('proof_rendered', { findingId, proofKind, t: performance.now() }); +``` + +```sql +-- Rollup (hourly) +SELECT + proof_kind, + percentile_cont(0.95) WITHIN GROUP (ORDER BY tte_ms) AS p95_ms +FROM tte_events +WHERE ts >= now() - interval '1 hour' +GROUP BY proof_kind; +``` + +--- + +### What to put on the team dashboard + +* **TTE P95 by page** (Findings list, Finding details). +* **TTE P95 by proof_kind** (sbom / reachability / vex). +* **Error budget burn**: minutes over target per day. +* **Top regressions**: last 7 days vs prior 7. + +--- + +### Acceptance checklist for any finding view + +* [ ] First paint shows a real proof snippet (not a summary). +* [ ] “Copy proof” button works within 1 click. +* [ ] TTE P95 in staging ≤ 10s; in prod ≤ 15s. +* [ ] If proof missing, explicit empty‑state + retry path. +* [ ] Telemetry sampled ≥ 50% of sessions (or 100% for internal). + +--- + +If you want, I can turn this into a **`docs/UX/tte.md`** spec plus a **Grafana panel JSON** and a tiny **.NET middleware** to emit the metrics—just say the word and I’ll drop in ready‑to‑use snippets. +Perfect, got it — let’s turn **Time‑to‑Evidence (TTE)** into a proper set of *“how we track UX health”* guidelines. + +I’ll treat this like something you can drop into an internal engineering/UX doc. + +--- + +## 1. What exactly is Time‑to‑Evidence? + +**Definition** + +> **TTE = t_first_proof_rendered − t_open_finding** + +* **t_open_finding** – when the user first opens a “finding” / detail view (e.g., vulnerability, alert, ticket, log event). +* **t_first_proof_rendered** – when the UI first paints **actual evidence** that backs the finding, for example: + + * The SBOM row showing `package@version`. + * The call‑graph/data‑flow path to a sink. + * A VEX note explaining why something is (not) affected. + * A raw log snippet that the alert is based on. + +**Key principle:** +TTE measures **how long users have to trust you blindly** before they can see proof with their own eyes. + +--- + +## 2. UX health goals & targets + +Treat TTE like latency SLOs: + +* **Primary SLO**: + + * **P95 TTE ≤ 15s** for all findings in normal conditions. +* **Stretch SLO**: + + * **P99 TTE ≤ 30s** for heavy cases (big graphs, huge SBOMs, cold caches). 
+* **Guardrail**: + + * P50 TTE should be **< 3s**. If the median creeps up, you’re in trouble even if P95 looks OK. + +You can refine by feature: + +* “Simple” proof (single SBOM row, small payload): + + * P95 ≤ 5s. +* “Complex” proof (reachability graph, cross‑repo joins): + + * P95 ≤ 15s. + +**UX rule of thumb** + +* < 2s: feels instant. +* 2–10s: acceptable if clearly loading something heavy. +* > 10s: needs **strong** feedback (progress, partial results, explanations). +* > 30s: the system should probably **offer fallback** (e.g., “download raw evidence” or “retry”). + +--- + +## 3. Instrumentation guidelines + +### 3.1 Event model + +Emit two core events per finding view: + +1. **`finding_open`** + + * When user opens the finding details (route enter / modal open). + * Must include: + + * `finding_id` + * `tenant_id` / `org_id` + * `user_role` (admin, dev, triager, etc.) + * `entry_point` (list, search, notification, deep link) + * `ui_version` / `build_sha` + +2. **`proof_rendered`** + + * First time *any* qualifying proof element is painted. + * Must include: + + * `finding_id` + * `proof_kind` (`sbom | reachability | vex | logs | other`) + * `source` (`local_cache | backend_api | 3rd_party`) + * `proof_height` (e.g., pixel offset from top) – to ensure it’s actually above the fold or very close. + +**Derived metric** + +Your telemetry pipeline should compute: + +```text +tte_ms = proof_rendered.timestamp - finding_open.timestamp +``` + +If there are multiple `proof_rendered` events for the same `finding_open`, use: + +* **TTE (first proof)** – minimum timestamp; primary SLO. +* Optionally: **TTE (full evidence)** – last proof in a defined “bundle” (e.g., path + SBOM row). + +### 3.2 Implementation notes + +**Frontend** + +* Emit `finding_open` as soon as: + + * The route is confirmed and + * You know which `finding_id` is being displayed. +* Emit `proof_rendered`: + + * **Not** when you *fetch* data, but when at least one evidence component is **visibly rendered**. + * Easiest approach: hook into component lifecycle / intersection observer on the evidence container. + +Pseudo‑example: + +```ts +// On route/mount: +metrics.emit('finding_open', { + findingId, + entryPoint, + userRole, + uiVersion, + t: performance.now() +}); + +// In EvidencePanel component, after first render with real data: +if (!hasEmittedProof && hasRealEvidence) { + metrics.emit('proof_rendered', { + findingId, + proofKind: 'sbom', + source: 'backend_api', + t: performance.now() + }); + hasEmittedProof = true; +} +``` + +**Backend** + +* No special requirement beyond: + + * Stable IDs (`finding_id`). + * Knowing which API endpoints respond with evidence payloads — you’ll want to correlate backend latency with TTE later. + +--- + +## 4. Data quality & sampling + +If you want TTE to drive decisions, the data must be boringly reliable. + +**Guidelines** + +1. **Sample rate** + + * Start with **100%** in staging. + * In production, aim for **≥ 25% of sessions** for TTE events at minimum; 100% is ideal if volume is reasonable. + +2. **Clock skew** + + * Prefer **frontend timestamps** using `performance.now()` for TTE; they’re monotonic within a tab. + * Don’t mix backend clocks into the TTE calculation. + +3. **Bot / synthetic traffic** + + * Tag synthetic tests (`is_synthetic = true`) and exclude them from UX health dashboards. + +4. 
**Retry behavior** + + * If the proof fails to load and user hits “retry”: + + * Treat it as a separate measurement (`retry = true`) or + * Log an additional `proof_error` event with error class (timeout, 5xx, network, parse, etc.). + +--- + +## 5. Dashboards: how to watch TTE + +You want a small, opinionated set of views that answer: + +> “Is UX getting better or worse for people trying to understand findings?” + +### 5.1 Core widgets + +1. **TTE distribution** + + * P50 / P90 / P95 / P99 per day (or per release). + * Split by `proof_kind`. + +2. **TTE by page / surface** + + * Finding list → detail. + * Deep links from notifications. + * Direct URLs / bookmarks. + +3. **TTE by user segment** + + * New users vs power users. + * Different roles (security engineer vs application dev). + +4. **Error budget panel** + + * “Minutes over SLO per day” – e.g., sum of all user‑minutes where TTE > 15s. + * Use this to prioritize work. + +5. **Correlation with engagement** + + * Scatter: TTE vs session length, or TTE vs “user clicked ‘ignore’ / ‘snooze’”. + * Aim to confirm the obvious: **long TTE → worse engagement/completion**. + +### 5.2 Operational details + +* Update granularity: **real‑time or ≤15 min** for on‑call/ops panels. +* Retention: at least **90 days** to see trends across big releases. +* Breakdowns: + + * `backend_region` (to catch regional issues). + * `build_version` (to spot regressions quickly). + +--- + +## 6. UX & engineering design rules anchored in TTE + +These are the **behavior rules** for the product that keep TTE healthy. + +### 6.1 “Evidence first” layout rules + +* **Evidence above the fold** + + * At least *one* proof element must be visible **without scrolling** on a typical laptop viewport. +* **Summary second** + + * CVSS scores, severity badges, long descriptions: all secondary. Evidence should come *before* opinion. +* **No fake proof** + + * Don’t use placeholders that *look* like evidence but aren’t (e.g., “example path” or generic text). + * If evidence is still loading, show a clear skeleton/loader with “Loading evidence…”. + +### 6.2 Loading strategy rules + +* Start fetching evidence **as soon as navigation begins**, not after the page is fully mounted. +* Use **lazy loading** for non‑critical widgets until after proof is shown. +* If a call is known to be heavy: + + * Consider **precomputing** and caching the top evidence (shortest path, first SBOM hit). + * Stream results: render first proof item as soon as it arrives; don’t wait for the full list. + +### 6.3 Empty / error state rules + +* If there is genuinely no evidence: + + * Explicitly say **“No supporting evidence available yet”** and treat TTE as: + + * Either “no value” (excluded), or + * A special bucket `proof_kind = "none"`. +* If loading fails: + + * Show a clear error and a **retry** that re‑emits `proof_rendered` when successful. + * Log `proof_error` with reason; track error rate alongside TTE. + +--- + +## 7. How to *use* TTE in practice + +### 7.1 For releases + +For any change that affects findings UI or evidence plumbing: + +* Add a release checklist item: + + * “No regression on TTE P95 for [pages X, Y].” +* During rollout: + + * Compare **pre‑ vs post‑release** TTE P95 by `ui_version`. + * If regression > 20%: + + * Roll back, or + * Add a follow‑up ticket explicitly tagged with the regression. + +### 7.2 For experiments / A/B tests + +When running UI experiments around findings: + +* Always capture TTE per variant. +* Compare: + + * TTE P50/P95. 
+ * Task completion rate (e.g., “user changed status”). + * Subjective UX (CSAT) if you have it. + +You’re looking for patterns like: + +* Variant B: **+5% completion**, **+8% TTE** → maybe OK. +* Variant C: **+2% completion**, **+70% TTE** → probably not acceptable. + +### 7.3 For prioritization + +Use TTE as a lever in planning: + +* If P95 TTE is healthy and stable: + + * More room for new features / experiments. +* If P95 TTE is trending up for 2+ weeks: + + * Time to schedule a “TTE debt” story: caching, query optimization, UI re‑layout, etc. + +--- + +## 8. Quick “TTE‑ready” checklist + +You’re “tracking UX health with TTE” if you can honestly tick these: + +1. **Instrumentation** + + * [ ] `finding_open` + `proof_rendered` events exist and are correlated. + * [ ] TTE computed in a stable pipeline (joins, dedupe, etc.). +2. **Targets** + + * [ ] TTE SLOs defined (P95, P99) and agreed by UX + engineering. +3. **Dashboards** + + * [ ] A dashboard shows TTE by proof kind, page, and release. + * [ ] On‑call / ops can see TTE in near real‑time. +4. **UX rules** + + * [ ] Evidence is visible above the fold for all main finding types. + * [ ] Non‑critical widgets load after evidence. + * [ ] Empty/error states are explicit about evidence availability. +5. **Process** + + * [ ] Major UI changes check TTE pre vs post as part of release acceptance. + * [ ] Regressions in TTE create real tickets, not just “we’ll watch it”. + +--- + +If you tell me what stack you’re on (e.g., React + Next.js + OpenTelemetry + X observability tool), I can turn this into concrete code snippets and an example dashboard spec (fields, queries, charts) tailored exactly to your setup. diff --git a/docs/product-advisories/01-Dec-2025 - Turning SBOM Data Into Verifiable Proofs.md b/docs/product-advisories/01-Dec-2025 - Turning SBOM Data Into Verifiable Proofs.md new file mode 100644 index 000000000..51a54ab79 --- /dev/null +++ b/docs/product-advisories/01-Dec-2025 - Turning SBOM Data Into Verifiable Proofs.md @@ -0,0 +1,576 @@ +Here’s a tight, practical blueprint to turn your SBOM→VEX links into an auditable “proof spine”—using signed DSSE statements and a per‑dependency trust anchor—so every VEX verdict can be traced, verified, and replayed. + +# What this gives you + +* A **chain of evidence** from each SBOM entry → analysis → VEX verdict. +* **Tamper‑evident** DSSE‑signed records (offline‑friendly). +* **Deterministic replay**: same inputs → same verdicts (great for audits/regulators). + +# Core objects (canonical IDs) + +* **ArtifactID**: digest of package/container (e.g., `sha256:…`). +* **SBOMEntryID**: stable ID for a component in an SBOM (`sbomDigest:package@version[:purl]`). +* **EvidenceID**: hash of raw evidence (scanner JSON, reachability, exploit intel). +* **ReasoningID**: hash of normalized reasoning (rules/lattice inputs used). +* **VEXVerdictID**: hash of the final VEX statement body. +* **ProofBundleID**: merkle root of {SBOMEntryID, EvidenceID[], ReasoningID, VEXVerdictID}. +* **TrustAnchorID**: per‑dependency anchor (public key + policy) used to validate the above. + +# Signed DSSE envelopes you’ll produce + +1. **Evidence Statement** (per evidence item) + +* `subject`: SBOMEntryID +* `predicateType`: `evidence.stella/v1` +* `predicate`: source, tool version, timestamps, EvidenceID +* **Signers**: scanner/ingestor key + +2. 
+2. **Reasoning Statement**
+
+* `subject`: SBOMEntryID
+* `predicateType`: `reasoning.stella/v1`
+* `predicate`: your lattice/policy inputs + ReasoningID
+* **Signers**: “Policy/Lattice Engine” key (Authority)
+
+3. **VEX Verdict Statement**
+
+* `subject`: SBOMEntryID
+* `predicateType`: a versioned VEX type (e.g., `cdx-vex.stella/v1`)
+* `predicate`: the CycloneDX or CSAF VEX body + VEXVerdictID
+* **Signers**: VEXer key (or vendor key if you have it)
+
+4. **Proof Spine Statement** (the spine itself)
+
+* `subject`: SBOMEntryID
+* `predicateType`: `proofspine.stella/v1`
+* `predicate`: EvidenceID[], ReasoningID, VEXVerdictID, ProofBundleID
+* **Signers**: Authority key
+
+# Trust model (per‑dependency anchor)
+
+* **TrustAnchor** (per package/purl): { TrustAnchorID, allowed signers (KMS refs, public keys), accepted predicateTypes, policy version, revocation list }.
+* Store anchors in **Authority** and pin them in your graph by SBOMEntryID→TrustAnchorID.
+* Optional: PQC mode (Dilithium/Falcon) for long‑term archives.
+
+# Verification pipeline (deterministic)
+
+1. Resolve SBOMEntryID → TrustAnchorID.
+2. Verify every DSSE envelope’s signature **against the anchor’s allowed keys**.
+3. Recompute EvidenceID/ReasoningID/VEXVerdictID from raw content; compare hashes.
+4. Recompute ProofBundleID (Merkle root) and compare to the spine.
+5. Emit a **Receipt**: {ProofBundleID, verification log, tool digests}. Cache it.
+
+# Storage layout (Postgres + blob store)
+
+* `sbom_entries(entry_id PK, bom_digest, purl, version, artifact_digest, trust_anchor_id)`
+* `dsse_envelopes(env_id PK, entry_id, predicate_type, signer_keyid, body_hash, envelope_blob_ref, signed_at)`
+* `spines(entry_id PK, bundle_id, evidence_ids[], reasoning_id, vex_id, anchor_id, created_at)`
+* `trust_anchors(anchor_id PK, purl_pattern, allowed_keyids[], policy_ref, revoked_keys[])`
+* Blobs (immutable): raw evidence, normalized reasoning JSON, VEX JSON, DSSE bytes.
+
+# API surface (clean and small)
+
+* `POST /proofs/:entry/spine` → submit or update spine (idempotent by ProofBundleID)
+* `GET /proofs/:entry/receipt` → full verification receipt (JSON)
+* `GET /proofs/:entry/vex` → the verified VEX body
+* `GET /anchors/:anchor` → fetch trust anchor (for offline kits)
+
+# Normalization rules (so hashes are stable)
+
+* Canonical JSON (UTF‑8, sorted keys, no insignificant whitespace).
+* Strip volatile fields (timestamps that aren’t part of the semantic claim).
+* Version your schemas: `evidence.stella/v1`, `reasoning.stella/v1`, etc.
+
+# Signing keys & rotation
+
+* Keep keys in your **Authority** module (KMS/HSM; offline export for air‑gap).
+* Publish key material via an **attestation feed** (or Rekor‑mirror) for third‑party audit.
+* Rotate by **adding** new allowed_keyids in the TrustAnchor; never mutate old envelopes.
+
+# CI/CD hooks
+
+* On SBOM ingest → create/refresh SBOMEntry rows + attach TrustAnchor.
+* On scan completion → produce Evidence Statements (DSSE) immediately.
+* On policy evaluation → produce Reasoning + VEX, then assemble Spine.
+* Gate releases on `GET /proofs/:entry/receipt` == PASS.
+
+# UX (auditor‑friendly)
+
+* **Proof timeline** per entry: SBOM → Evidence tiles → Reasoning → VEX → Receipt.
+* One‑click “Recompute & Compare” to show deterministic replay passes.
+* Red/amber flags when a signature no longer matches a TrustAnchor or a key is revoked.
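+
+Before the checklist: a compact sign/verify sketch of the DSSE mechanics above. It uses Ed25519 via `node:crypto` purely for brevity (the checklist below asks for ECDSA plus optional PQC), and every name here is illustrative, not an existing Stella API.
+
+```ts
+import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";
+
+interface DsseEnvelope {
+  payloadType: string;
+  payload: string; // base64(canonical statement JSON)
+  signatures: { keyid: string; sig: string }[];
+}
+
+// DSSE pre-authentication encoding (PAE): what actually gets signed.
+function pae(payloadType: string, payload: Buffer): Buffer {
+  const header = `DSSEv1 ${Buffer.byteLength(payloadType)} ${payloadType} ${payload.length} `;
+  return Buffer.concat([Buffer.from(header, "utf8"), payload]);
+}
+
+export function signEnvelope(
+  payloadType: string,
+  statement: object,
+  keyid: string,
+  privateKey: KeyObject,
+): DsseEnvelope {
+  const payload = Buffer.from(JSON.stringify(statement), "utf8"); // canonicalize in real code
+  const sig = sign(null, pae(payloadType, payload), privateKey);  // Ed25519: algorithm is null
+  return {
+    payloadType,
+    payload: payload.toString("base64"),
+    signatures: [{ keyid, sig: sig.toString("base64") }],
+  };
+}
+
+// Step 2 of the pipeline: accept only signatures from the anchor's allowed keys.
+export function verifyAgainstAnchor(env: DsseEnvelope, allowedKeys: Map<string, KeyObject>): boolean {
+  const payload = Buffer.from(env.payload, "base64");
+  return env.signatures.some(({ keyid, sig }) => {
+    const pub = allowedKeys.get(keyid);
+    if (!pub) return false; // unknown keyid: untrusted even if the signature is valid
+    return verify(null, pae(env.payloadType, payload), pub, Buffer.from(sig, "base64"));
+  });
+}
+
+// Usage sketch:
+const { publicKey, privateKey } = generateKeyPairSync("ed25519");
+const envelope = signEnvelope(
+  "proofspine.stella/v1",
+  { evidenceIds: [], reasoningId: "r1", vexVerdictId: "v1" },
+  "authority-key-1",
+  privateKey,
+);
+console.log(verifyAgainstAnchor(envelope, new Map([["authority-key-1", publicKey]]))); // true
+```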
+# Minimal dev checklist
+
+* [ ] Implement canonicalizers (Evidence, Reasoning, VEX).
+* [ ] Implement DSSE sign/verify (ECDSA + optional PQC).
+* [ ] TrustAnchor registry + resolver by purl pattern.
+* [ ] Merkle bundling to get ProofBundleID.
+* [ ] Receipt generator + verifier.
+* [ ] Postgres schema + blob GC (content‑addressed).
+* [ ] CI gates + API endpoints above.
+* [ ] Auditor UI: timeline + diff + receipts download.
+
+If you want, I can drop in a ready‑to‑use JSON schema set (`evidence.stella/v1`, `reasoning.stella/v1`, `proofspine.stella/v1`) and sample DSSE envelopes wired to your .NET 10 stack.
+Here’s a focused **Stella Ops Developer Guidelines** doc, specifically for the pipeline that turns **SBOM data into verifiable proofs** (your SBOM → Evidence → Reasoning → VEX → Proof Spine).
+
+Feel free to paste this into your internal handbook and tweak names to match your repos/services.
+
+---
+
+# Stella Ops Developer Guidelines
+
+## Turning SBOM Data Into Verifiable Proofs
+
+---
+
+## 1. Mental Model: What You’re Actually Building
+
+For every component in an SBOM, Stella must be able to answer, *“Why should anyone trust our VEX verdict for this dependency, today and ten years from now?”*
+
+We do that with a pipeline:
+
+1. **SBOM Ingest**
+   Raw SBOM → validated → normalized → `SBOMEntryID`.
+
+2. **Evidence Collection**
+   Scans, feeds, configs, reachability, etc. → canonical evidence blobs → `EvidenceID` → DSSE-signed.
+
+3. **Reasoning / Policy**
+   Policy + evidence → deterministic reasoning → `ReasoningID` → DSSE-signed.
+
+4. **VEX Verdict**
+   VEX statement (CycloneDX / CSAF) → canonicalized → `VEXVerdictID` → DSSE-signed.
+
+5. **Proof Spine**
+   `{SBOMEntryID, EvidenceIDs[], ReasoningID, VEXVerdictID}` → Merkle bundle → `ProofBundleID` → DSSE-signed.
+
+6. **Verification & Receipts**
+   Re-run verification → `Receipt` that proves everything above is intact and anchored to trusted keys.
+
+Everything you do in this area should keep this spine intact and verifiable.
+
+---
+
+## 2. Non‑Negotiable Invariants
+
+These are the rules you don’t break without an explicit, company-level decision:
+
+1. **Immutability of Signed Facts**
+
+   * DSSE envelopes (evidence, reasoning, VEX, spines) are append‑only.
+   * You never edit or delete content inside a previously signed envelope.
+   * Corrections are made by **superseding** (new statement pointing at the old one).
+
+2. **Determinism**
+
+   * Same `{SBOMEntryID, Evidence set, policyVersion}` ⇒ same `{ReasoningID, VEXVerdictID, ProofBundleID}`.
+   * No non-deterministic inputs (e.g., “current time”, random IDs) in anything that affects IDs or verdicts.
+
+3. **Traceability**
+
+   * Every VEX verdict must be traceable back to:
+
+     * The precise SBOM entry
+     * Concrete evidence blobs
+     * A specific policy & reasoning snapshot
+     * A trust anchor defining allowed signers
+
+4. **Least Trust / Least Privilege**
+
+   * Services only know the keys and data they need.
+   * Trust is always explicit: through **TrustAnchors** and signature verification, never “because it’s in our DB”.
+
+5. **Backwards Compatibility**
+
+   * New code must continue to verify **old proofs**.
+   * New policies must **not rewrite history**; they produce *new* spines, leaving old ones intact.
+
+---
+
+## 3. SBOM Ingestion Guidelines
+
+**Goal:** Turn arbitrary SBOMs into stable, addressable `SBOMEntryID`s and safe internal models. The sketch below shows the ID discipline in miniature; the detailed rules follow in 3.1–3.3.
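+
+A minimal sketch, assuming SHA-256 and the canonical-JSON rules from the blueprint above (sorted keys, UTF-8, no insignificant whitespace); the helper names are hypothetical:
+
+```ts
+import { createHash } from "node:crypto";
+
+// Canonical JSON: sorted keys, UTF-8, no insignificant whitespace.
+// Assumes JSON-safe input (no undefined, functions, or cycles).
+function canonicalJson(value: unknown): string {
+  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
+  if (value !== null && typeof value === "object") {
+    const entries = Object.entries(value as Record<string, unknown>)
+      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)) // code-unit order, not locale-aware
+      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalJson(v)}`);
+    return `{${entries.join(",")}}`;
+  }
+  return JSON.stringify(value);
+}
+
+const sha256 = (s: string): string => createHash("sha256").update(s, "utf8").digest("hex");
+
+// SBOMEntryID as a stable tuple (see 3.2): no timestamps, no database IDs.
+export function sbomEntryId(rawSbomCanonical: string, purl: string, version: string): string {
+  const sbomDigest = sha256(rawSbomCanonical);
+  return sha256(canonicalJson({ purl, sbomDigest, version }));
+}
+
+// EvidenceID over a canonical evidence body (see 4.1).
+export function evidenceId(evidence: object): string {
+  return sha256(canonicalJson(evidence));
+}
+```
+
+Because both helpers are pure functions of their inputs, re-running them over the same SBOM and evidence yields the same IDs, which is exactly the determinism invariant above.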
+### 3.1 Inputs & Formats
+
+* Support at least:
+
+  * CycloneDX (JSON)
+  * SPDX (JSON / Tag-Value)
+* For each ingested SBOM, store:
+
+  * Raw SBOM bytes (immutable, content-addressed)
+  * A normalized internal representation (your own model)
+
+### 3.2 IDs
+
+* Generate:
+
+  * `sbomDigest` = hash(raw SBOM, canonical form)
+  * `SBOMEntryID` = `sbomDigest + purl + version` (or equivalent stable tuple)
+* `SBOMEntryID` must:
+
+  * Not depend on ingestion time or database IDs.
+  * Be reproducible from SBOM + deterministic normalization.
+
+### 3.3 Validation & Errors
+
+* Validate:
+
+  * Syntax (JSON, schema)
+  * Core semantics (package identifiers, digests, versions)
+* If invalid:
+
+  * Reject the SBOM **but** record a small DSSE “failure attestation” explaining:
+
+    * Why it failed
+    * Which file
+    * Which system version
+  * This still gives you a proof trail for “we tried and it failed”.
+
+---
+
+## 4. Evidence Collection Guidelines
+
+**Goal:** Capture all inputs that influence the verdict in a canonical, signed form.
+
+Typical evidence types:
+
+* SCA / vuln scanner results
+* CVE feeds & advisory data
+* Reachability / call graph analysis
+* Runtime context (where this component is used)
+* Manual assessments (e.g., security engineer verdicts)
+
+### 4.1 Evidence Canonicalization
+
+For every evidence item:
+
+* Normalize to a schema like `evidence.stella/v1` with fields such as:
+
+  * `source` (scanner name, feed)
+  * `sourceVersion` (tool version, DB version)
+  * `collectionTime`
+  * `sbomEntryId`
+  * `vulnerabilityId` (if applicable)
+  * `rawFinding` (or pointer to it)
+* Canonical JSON rules:
+
+  * Sorted keys
+  * UTF‑8, no extraneous whitespace
+  * No volatile fields beyond what’s semantically needed (e.g., if you include `collectionTime`, it becomes part of the hash, so include it deliberately or strip it).
+
+Then:
+
+* Compute `EvidenceID = hash(canonicalEvidenceJson)`.
+* Wrap in DSSE:
+
+  * `subject`: `SBOMEntryID`
+  * `predicateType`: `evidence.stella/v1`
+  * `predicate`: canonical evidence + `EvidenceID`.
+* Sign with **evidence-ingestor key** (per environment).
+
+### 4.2 Ops Rules
+
+* **Idempotency:**
+  Re-running the same scan with the same inputs should produce the same evidence object and `EvidenceID`.
+* **Tool changes:**
+  When the tool version or configuration changes, that’s a *new* evidence statement with a new `EvidenceID`. Do not overwrite old evidence.
+* **Partial failure:**
+  If a scan fails, produce a minimal failure evidence record (with error details) instead of “nothing”.
+
+---
+
+## 5. Reasoning & Policy Engine Guidelines
+
+**Goal:** Turn evidence into a defensible, replayable reasoning step with a clear policy version.
+
+### 5.1 Reasoning Object
+
+Define a canonical reasoning schema, e.g. `reasoning.stella/v1`:
+
+* `sbomEntryId`
+* `evidenceIds[]` (sorted)
+* `policyVersion`
+* `inputs`: normalized form of all policy inputs (severity thresholds, lattice rules, etc.)
+* `intermediateFindings`: optional but useful — e.g., “reachable vulns = …”
+
+Then:
+
+* Canonicalize JSON and compute `ReasoningID = hash(canonicalReasoning)`.
+* Wrap in DSSE:
+
+  * `subject`: `SBOMEntryID`
+  * `predicateType`: `reasoning.stella/v1`
+  * `predicate`: canonical reasoning + `ReasoningID`.
+* Sign with **Policy/Authority key**.
+
+### 5.2 Determinism
+
+* Reasoning functions must be **pure**:
+
+  * Inputs: SBOMEntryID, evidence set, policy version, configuration.
+  * No hidden calls to external APIs at decision time (fetch feeds earlier and record them as evidence).
+* If you need “current time” in policy: + + * Treat it as **explicit input** and record it inside reasoning under `inputs.currentEvaluationTime`. + +### 5.3 Policy Evolution + +* When policy changes: + + * Bump `policyVersion`. + * New evaluations produce new `ReasoningID` and new VEX/spines. + * Don’t retroactively apply new policy to old reasoning objects; generate new ones alongside. + +--- + +## 6. VEX Verdict Guidelines + +**Goal:** Generate VEX statements that are strongly tied to SBOM entries and your reasoning. + +### 6.1 Shape + +* Target standard formats: + + * CycloneDX VEX + * or CSAF +* Required linkages: + + * Component reference = `SBOMEntryID` or a resolvable component identifier from your SBOM normalize layer. + * Vulnerability IDs (CVE, GHSA, internal IDs). + * Status (`not_affected`, `affected`, `fixed`, etc.). + * Justification & impact. + +### 6.2 Canonicalization & Signing + +* Define a canonical VEX body schema (subset of the standard + internal metadata): + + * `sbomEntryId` + * `vulnerabilityId` + * `status` + * `justification` + * `policyVersion` + * `reasoningId` +* Canonicalize JSON → `VEXVerdictID = hash(canonicalVexBody)`. +* DSSE-envelope: + + * `subject`: `SBOMEntryID` + * `predicateType`: e.g. `cdx-vex.stella/v1` + * `predicate`: canonical VEX + `VEXVerdictID`. +* Sign with **VEXer key** or vendor key (depending on trust anchor). + +### 6.3 External VEX + +* When importing vendor VEX: + + * Verify signature against vendor’s TrustAnchor. + * Canonicalize to your internal schema but preserve: + + * Original document + * Original signature material + * Record “source = vendor” vs “source = stella” so auditors see origin. + +--- + +## 7. Proof Spine Guidelines + +**Goal:** Build a compact, tamper-evident “bundle” that ties everything together. + +### 7.1 Structure + +For each `SBOMEntryID`, gather: + +* `EvidenceIDs[]` (sorted lexicographically). +* `ReasoningID`. +* `VEXVerdictID`. + +Compute: + +* Merkle tree root (or deterministic hash) over: + + * `sbomEntryId` + * sorted `EvidenceIDs[]` + * `ReasoningID` + * `VEXVerdictID` +* Result is `ProofBundleID`. + +Create a DSSE “spine”: + +* `subject`: `SBOMEntryID` +* `predicateType`: `proofspine.stella/v1` +* `predicate`: + + * `evidenceIds[]` + * `reasoningId` + * `vexVerdictId` + * `policyVersion` + * `proofBundleId` +* Sign with **Authority key**. + +### 7.2 Ops Rules + +* Spine generation is idempotent: + + * Same inputs → same `ProofBundleID`. +* Never mutate existing spines; new policy or new evidence ⇒ new spine. +* Keep a clear API contract: + + * `GET /proofs/:entry` returns **all** spines, each labeled with `policyVersion` and timestamps. + +--- + +## 8. Storage & Schema Guidelines + +**Goal:** Keep proofs queryable forever without breaking verification. + +### 8.1 Tables (conceptual) + +* `sbom_entries`: `entry_id`, `bom_digest`, `purl`, `version`, `artifact_digest`, `trust_anchor_id`. +* `dsse_envelopes`: `env_id`, `entry_id`, `predicate_type`, `signer_keyid`, `body_hash`, `envelope_blob_ref`, `signed_at`. +* `spines`: `entry_id`, `proof_bundle_id`, `policy_version`, `evidence_ids[]`, `reasoning_id`, `vex_verdict_id`, `anchor_id`, `created_at`. +* `trust_anchors`: `anchor_id`, `purl_pattern`, `allowed_keyids[]`, `policy_ref`, `revoked_keys[]`. + +### 8.2 Schema Changes + +Always follow: + +1. **Expand** + + * Add new columns/tables. + * Make new code tolerant of old data. + +2. **Backfill** + + * Idempotent jobs that fill in new IDs/fields without touching old DSSE payloads. + +3. 
**Contract** + + * Only after all code uses the new model. + * Never drop the raw DSSE or raw SBOM blobs. + +--- + +## 9. Verification & Receipts + +**Goal:** Make it trivial (for you, customers, and regulators) to recheck everything. + +### 9.1 Verification Flow + +Given `SBOMEntryID` or `ProofBundleID`: + +1. Fetch spine and trust anchor. +2. Verify: + + * Spine DSSE signature against TrustAnchor’s allowed keys. + * VEX, reasoning, and evidence DSSE signatures. +3. Recompute: + + * `EvidenceIDs` from stored canonical evidence. + * `ReasoningID` from reasoning. + * `VEXVerdictID` from VEX body. + * `ProofBundleID` from the above. +4. Compare to stored IDs. + +Emit a **Receipt**: + +* `proofBundleId` +* `verifiedAt` +* `verifierVersion` +* `anchorId` +* `result` (pass/fail, with reasons) + +### 9.2 Offline Kit + +* Provide a minimal CLI (`stella verify`) that: + + * Accepts a bundle export (SBOM + DSSE envelopes + anchors). + * Verifies everything without network access. + +Developers must ensure: + +* Export format is documented and stable. +* All fields required for verification are included. + +--- + +## 10. Security & Key Management (for Devs) + +* Keys live in **KMS/HSM**, not env vars or config files. +* Separate keysets: + + * `dev`, `staging`, `prod` + * Authority vs VEXer vs Evidence Ingestor. +* TrustAnchors: + + * Edit via Authority service only. + * Every change: + + * Requires code-reviewed change. + * Writes an audit log entry. + +Never: + +* Log private keys. +* Log full DSSE envelopes in plaintext logs (log IDs and hashes instead). + +--- + +## 11. Observability & On‑Call Expectations + +### 11.1 Metrics + +For the SBOM→Proof pipeline, expose: + +* `sboms_ingested_total` +* `sbom_ingest_errors_total{reason}` +* `evidence_statements_created_total` +* `reasoning_statements_created_total` +* `vex_statements_created_total` +* `proof_spines_created_total` +* `proof_verifications_total{result}` (pass/fail reason) +* Latency histograms per stage (`_duration_seconds`) + +### 11.2 Logging + +Include in structured logs wherever relevant: + +* `sbomEntryId` +* `proofBundleId` +* `anchorId` +* `policyVersion` +* `requestId` / `traceId` + +### 11.3 Runbooks + +You should maintain runbooks for at least: + +* “Pipeline is stalled” (backlog of SBOMs, evidence, or spines). +* “Verification failures increased”. +* “Trust anchor or key issues” (rotation, revocation, misconfiguration). +* “Backfill gone wrong” (how to safely stop, resume, and audit). + +--- + +## 12. Dev Workflow & PR Checklist (SBOM→Proof Changes Only) + +When your change touches SBOM ingestion, evidence, reasoning, VEX, or proof spines, check: + +* [ ] IDs (`SBOMEntryID`, `EvidenceID`, `ReasoningID`, `VEXVerdictID`, `ProofBundleID`) remain **deterministic** and fully specified. +* [ ] No mutation of existing DSSE envelopes or historical proof data. +* [ ] Schema changes follow **expand → backfill → contract**. +* [ ] New/updated TrustAnchors reviewed by Authority owner. +* [ ] Unit tests cover: + + * Canonicalization for any new/changed predicate. + * ID computation. +* [ ] Integration test covers: + + * SBOM → Evidence → Reasoning → VEX → Spine → Verification → Receipt. +* [ ] Observability updated: + + * New paths emit logs & metrics. +* [ ] Rollback plan documented (especially for migrations & policy changes). + +--- + +If you tell me which microservices/repos map to these stages (e.g. 
`stella-sbom-ingest`, `stella-proof-authority`, `stella-vexer`), I can turn this into a more concrete, service‑by‑service checklist with example API contracts and class/interface sketches.
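+
+Until then, a deliberately small end-to-end sketch of spine assembly plus recompute-and-compare verification (sections 7.1 and 9.1). A flat hash stands in for the Merkle tree, and every name, including the verifier version string, is hypothetical:
+
+```ts
+import { createHash } from "node:crypto";
+
+const sha256 = (s: string): string => createHash("sha256").update(s, "utf8").digest("hex");
+
+interface SpineInputs {
+  sbomEntryId: string;
+  evidenceIds: string[];
+  reasoningId: string;
+  vexVerdictId: string;
+}
+
+// Deterministic bundle ID (section 7.1). A real Merkle tree over the same
+// sorted leaves is a drop-in upgrade; what matters is sorted, canonical input.
+export function proofBundleId(s: SpineInputs): string {
+  const leaves = [s.sbomEntryId, ...[...s.evidenceIds].sort(), s.reasoningId, s.vexVerdictId];
+  return sha256(leaves.join("\n"));
+}
+
+interface Receipt {
+  proofBundleId: string;
+  verifiedAt: string; // wall-clock time belongs in the receipt, never in the IDs
+  verifierVersion: string;
+  anchorId: string;
+  result: "pass" | "fail";
+  reasons: string[];
+}
+
+// The recompute-and-compare step of section 9.1; DSSE signature checks are
+// elided here (see the sign/verify sketch in the blueprint above).
+export function verifySpine(
+  stored: SpineInputs & { proofBundleId: string },
+  anchorId: string,
+): Receipt {
+  const recomputed = proofBundleId(stored);
+  const ok = recomputed === stored.proofBundleId;
+  return {
+    proofBundleId: stored.proofBundleId,
+    verifiedAt: new Date().toISOString(),
+    verifierVersion: "stella-verify/0.1-sketch", // hypothetical version string
+    anchorId,
+    result: ok ? "pass" : "fail",
+    reasons: ok ? [] : [`bundle mismatch: recomputed ${recomputed}`],
+  };
+}
+```
+
+Because `proofBundleId` is a pure function, re-submitting the same spine is naturally idempotent, which is what the `POST /proofs/:entry/spine` contract and the section 7.2 ops rules rely on.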