add advisories
Here’s a crisp, practical way to turn Stella Ops’ “verifiable proof spine” into a moat, and how to measure it.

# Why this matters (in plain terms)

Security tools often say “trust me.” You’ll say “prove it”: every finding and every “not‑affected” claim ships with cryptographic receipts anyone can verify.

---

# Differentiators to build in

**1) Bind every verdict to a graph hash**

* Compute a stable **Graph Revision ID** (a Merkle root) over SBOM nodes, edges, policies, feeds, scan parameters, and tool versions.
* Store the ID on each finding/VEX item; show it in the UI and APIs.
* Rule: any data change → new graph hash → new revisioned verdicts.

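As a sketch of that rule, a Graph Revision ID can be derived by hashing a canonical JSON form of each input and then hashing the sorted leaf digests (illustrative Python, not the actual Stella Ops implementation; a production version would use a proper Merkle tree, and all names here are hypothetical):

```python
import hashlib
import json

def _leaf_hash(obj) -> str:
    # Canonicalize first: sorted keys, no whitespace, so semantically
    # equal inputs always produce the same bytes and the same digest.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()

def graph_revision_id(sbom_nodes, edges, policies, feeds, scan_params, tool_versions) -> str:
    # Hash each input section, then hash the sorted leaf digests together.
    # Sorting makes the result independent of argument order.
    leaves = sorted(_leaf_hash(part) for part in
                    (sbom_nodes, edges, policies, feeds, scan_params, tool_versions))
    return hashlib.sha256("".join(leaves).encode()).hexdigest()
```

Because the leaves are canonicalized and sorted, identical inputs always yield the same ID, and any change to any input yields a new one.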
**2) Attach machine‑verifiable receipts (in‑toto/DSSE)**

* For each verdict, emit a **DSSE‑wrapped in‑toto statement**:

  * predicateType: `stellaops.dev/verdict@v1`
  * includes: graphRevisionId, artifact digests, rule id/version, inputs (CPE/CVE/CVSS), timestamps.
* Sign with your **Authority** (Sigstore key; offline mode supported).
* Keep receipts queryable and exportable; mirror to a Rekor‑compatible ledger when online.

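A minimal sketch of what “DSSE‑wrapped” means in practice (illustrative Python; a real deployment would use a Sigstore/DSSE library and a real key, rather than the injected `sign` callable assumed here):

```python
import base64
import hashlib
import json

def pae(payload_type: str, payload: bytes) -> bytes:
    # DSSE Pre-Authentication Encoding: the bytes that actually get signed.
    return b"DSSEv1 %d %s %d %s" % (
        len(payload_type.encode()), payload_type.encode(), len(payload), payload)

def make_envelope(statement: dict, sign) -> dict:
    # `sign` is any callable bytes -> bytes; key management is out of scope here.
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode()
    payload_type = "application/vnd.in-toto+json"
    sig = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode(),
        "signatures": [{"sig": base64.b64encode(sig).decode()}],
    }
```

The PAE framing binds the signature to both the payload type and the payload bytes, which is what makes the envelope tamper‑evident.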
**3) Add reachability “call‑stack slices” or binary‑symbol proofs**

* For code‑level reachability, store compact slices: entry → sink, with symbol names plus file:line.
* For binary‑only targets, include **symbol presence proofs** (e.g., Bloom filters + offsets) with the executable digest.
* Compress the slice/proof and embed its hash inside the DSSE payload.

**4) Deterministic replay manifests**

* Alongside receipts, publish a **Replay Manifest** (inputs, feeds, rule versions, container digests) so any auditor can reproduce the same graph hash and verdicts offline.

---

# Benchmarks to publish (make them your headline KPIs)

**A) False‑positive reduction vs. baseline scanners (%)**

* Method: run a public corpus (e.g., sample images + app stacks) across 3–4 popular scanners; label ground truth once; compare FP rates.
* Report: mean and p95 FP reduction.

**B) Proof coverage (% of findings with signed evidence)**

* Definition: `(# findings or VEX items carrying valid DSSE receipts) / (total surfaced items)`.
* Break out: runtime‑reachable vs. unreachable, and “not‑affected” claims.

**C) Triage time saved (p50/p95)**

* Measure analyst minutes from “alert created” → “final disposition.”
* A/B with receipts hidden vs. visible; publish median/p95 deltas.

**D) Determinism stability**

* Re‑run identical scans N times and across nodes; publish the `% identical graph hashes` and the drift causes when they differ.

---

# Minimal implementation plan (week‑by‑week)

**Week 1: primitives**

* Add a Graph Revision ID generator in `scanner.webservice` (Merkle over normalized JSON of SBOM + edges + policies + tool versions).
* Define the `VerdictReceipt` schema (protobuf/JSON) and DSSE envelope types.

**Week 2: signing + storage**

* Wire DSSE signing in **Authority**; offline key support + rotation.
* Persist receipts in a `Receipts` table (Postgres) keyed by `(graphRevisionId, verdictId)`; enable export (JSONL) and a ledger mirror.

**Week 3: reachability proofs**

* Add call‑stack slice capture in the reachability engine; serialize compactly; hash + reference from receipts.
* Binary symbol proof module for ELF/PE: symbol bitmap + digest.

**Week 4: replay + UX**

* Emit `replay.manifest.json` per scan (inputs, tool digests).
* UI: show a **“Verified”** badge, the graph hash, the signature issuer, and a one‑click “Copy receipt” button.
* API: `GET /verdicts/{id}/receipt`, `GET /graphs/{rev}/replay`.

**Week 5: benchmarks harness**

* Create `bench/` with golden fixtures and a runner:

  * Baseline scanner adapters
  * Ground‑truth labels
  * Metrics export (FP%, proof coverage, triage time capture hooks)

---

# Developer guardrails (make these non‑negotiable)

* **No receipt, no ship:** any surfaced verdict must carry a DSSE receipt.
* **Schema freeze windows:** changes to rule inputs or policy logic must bump the rule version and therefore the graph hash.
* **Replay‑first CI:** PRs touching scanning/rules must pass a replay test that reproduces prior graph hashes on gold fixtures.
* **Clock safety:** use monotonic time inside receipts; add UTC wall time separately.

---

# What to show buyers/auditors

* A short **audit kit**: sample container + your receipts + replay manifest + one command to reproduce the same graph hash.
* A one‑page **benchmark readout**: FP reduction, proof coverage, and triage time saved (p50/p95), with a corpus description.

---

If you want, I’ll draft:

1. the DSSE `predicate` schema,
2. the Postgres DDL for `Receipts` and `Graphs`, and
3. a tiny .NET verification CLI (`stellaops-verify`) that replays a manifest and validates signatures.

Here’s a focused “developer guidelines” doc just for **Benchmarks for a Testable Security Moat** in Stella Ops.

---

# Stella Ops Developer Guidelines

## Benchmarks for a Testable Security Moat

> **Goal:** Benchmarks are how we *prove* Stella Ops is better, not just say it is. If a “moat” claim can’t be tied to a benchmark, it doesn’t exist.

Everything here is about how you, as a developer, design, extend, and run those benchmarks.

---

## 1. What our benchmarks must measure

Every core product claim needs at least one benchmark:

1. **Detection quality**

   * Precision / recall vs. ground truth.
   * False positives vs. popular scanners.
   * False negatives on known‑bad samples.

2. **Proof & evidence quality**

   * % of findings with **valid receipts** (DSSE).
   * % of VEX “not‑affected” claims with attached proofs.
   * Reachability proof quality:

     * call‑stack slice present?
     * symbol proof present for binaries?

3. **Triage & workflow impact**

   * Time‑to‑decision for analysts (p50/p95).
   * Click depth and context switches per decision.
   * “Verified” vs. “unverified” verdict triage times.

4. **Determinism & reproducibility**

   * Same inputs → same **Graph Revision ID**.
   * Stable verdict sets across runs/nodes.

> **Rule:** If you add a feature that impacts any of these, you must either hook it into an existing benchmark or add a new one.

---

## 2. Benchmark assets and layout

**2.1 Repo layout (convention)**

Under `bench/` we maintain everything benchmark‑related:

* `bench/corpus/`

  * `images/` – curated container images / tarballs.
  * `repos/` – sample codebases (with known vulns).
  * `sboms/` – canned SBOMs for edge cases.
* `bench/scenarios/`

  * `*.yaml` – scenario definitions (inputs + expected outputs).
* `bench/golden/`

  * `*.json` – golden results (expected findings, metrics).
* `bench/tools/`

  * adapters for baseline scanners, parsers, helpers.
* `bench/scripts/`

  * `run_benchmarks.[sh/cs]` – the single entrypoint.

**2.2 Scenario definition (high‑level)**

Each scenario YAML should minimally specify:

* **Inputs**

  * artifact references (image name / path / repo SHA / SBOM file).
  * environment knobs (features enabled/disabled).
* **Ground truth**

  * the list of expected vulns (or an explicit “none”).
  * for some: expected reachability (reachable/unreachable).
  * expected VEX entries (affected / not affected).
* **Expectations**

  * required metrics (e.g., “no more than 2 FPs”, “no FNs”).
  * required proof coverage (e.g., “100% of surfaced findings have receipts”).

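Putting those three parts together, a scenario file might look roughly like this (a sketch; the field names are hypothetical, not a fixed schema):

```yaml
# bench/scenarios/alpine_openssl_example.yaml (hypothetical schema)
id: alpine_openssl_example
inputs:
  image: corpus/images/alpine-openssl.tar
  features:
    reachability: true
ground_truth:
  vulns:
    - cve: CVE-2023-0464
      reachable: false
  vex:
    - cve: CVE-2023-0464
      status: not_affected
expectations:
  max_false_positives: 2
  max_false_negatives: 0
  min_proof_coverage: 1.0
```
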
---

## 3. Core benchmark metrics (developer‑facing definitions)

Use these consistently across code and docs.

### 3.1 Detection metrics

* `true_positive_count` (TP)
* `false_positive_count` (FP)
* `false_negative_count` (FN)

Derived:

* `precision = TP / (TP + FP)`
* `recall = TP / (TP + FN)`
* For UX: track **FP per asset** and **FP per 100 findings**.

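The derived metrics are simple enough to pin down in code (a sketch, with guards so empty scenarios report zero instead of dividing by zero):

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    # Guard the denominators: an empty scenario reports 0.0, not an exception.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    surfaced = tp + fp  # everything the scanner actually reported
    return {
        "precision": precision,
        "recall": recall,
        "fp_per_100_findings": 100 * fp / surfaced if surfaced else 0.0,
    }
```
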
**Developer guideline:**

* When you introduce a filter, deduper, or rule tweak, add or modify scenarios so that:

  * one scenario shows the change **helps** (reduces FP or FN); and
  * a different scenario guards against regressions.

### 3.2 Moat‑specific metrics

These are the ones that directly support the “testable moat” story:

1. **False‑positive reduction vs. baseline scanners**

   * Run baseline scanners across our corpus (via adapters in `bench/tools/`).
   * Compute:

     * `baseline_fp_rate`
     * `stella_fp_rate`
     * `fp_reduction = (baseline_fp_rate - stella_fp_rate) / baseline_fp_rate`

2. **Proof coverage**

   * `proof_coverage_all = findings_with_valid_receipts / total_findings`
   * `proof_coverage_vex = vex_items_with_valid_receipts / total_vex_items`
   * `proof_coverage_reachable = reachable_findings_with_proofs / total_reachable_findings`

3. **Triage time improvement**

   * In test harnesses, simulate or record:

     * `time_to_triage_with_receipts`
     * `time_to_triage_without_receipts`
   * Compute median and p95 deltas.

4. **Determinism**

   * Re‑run the same scenario `N` times:

     * `% runs with identical Graph Revision ID`
     * `% runs with identical verdict sets`
   * On mismatch, diff and log the cause (e.g., a non‑stable sort or a non‑pinned feed).

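The median/p95 deltas from item 3 can be computed with the standard library (a sketch; `statistics.quantiles` with `n=20` gives a p95 estimate):

```python
from statistics import median, quantiles

def triage_deltas(with_receipts: list, without_receipts: list) -> dict:
    # Positive deltas mean receipts made triage faster.
    def p95(samples):
        return quantiles(samples, n=20)[18]  # 19th of 19 cut points ≈ p95
    return {
        "median_delta": median(without_receipts) - median(with_receipts),
        "p95_delta": p95(without_receipts) - p95(with_receipts),
    }
```
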
---

## 4. How developers should work with benchmarks

### 4.1 “No feature without benchmarks”

If you’re adding or changing:

* graph structure,
* rule logic,
* scanner integration,
* VEX handling, or
* proof / receipt generation,

you **must** do *at least one* of:

1. **Extend an existing scenario**

   * Add expectations that cover your change, or
   * tighten an existing bound (e.g., lower an FP threshold).

2. **Add a new scenario**

   * For new attack classes / edge cases / ecosystems.

**Anti‑patterns:**

* Shipping a new capability with *no* corresponding scenario.
* Updating golden outputs without explaining why the metrics changed.

### 4.2 CI gates

We treat benchmarks as **blocking**:

* Add CI jobs, e.g.:

  * `make bench:quick` on every PR (small subset).
  * `make bench:full` on main / nightly.
* CI fails if:

  * Any scenario marked `strict: true` has:

    * precision or recall below its threshold, or
    * proof coverage below its configured threshold.
  * Global regressions exceed tolerance:

    * e.g., total FPs increase by more than X% without an explicit override.

**Developer rule:**

* If you intentionally change behavior:

  * Update the relevant golden files.
  * Include a short note in the PR (e.g., a `bench-notes.md` snippet) describing:

    * what changed,
    * why the new result is better, and
    * which moat metric it improves (FP, proof coverage, determinism, etc.).

---

## 5. Benchmark implementation guidelines

### 5.1 Make benchmarks deterministic

* **Pin everything**:

  * feed snapshots,
  * tool container digests,
  * rule versions,
  * time windows.
* Use **Replay Manifests** as the source of truth:

  * `replay.manifest.json` should contain:

    * input artifacts,
    * tool versions,
    * feed versions,
    * configuration flags.
* If a benchmark depends on time:

  * Inject a **fake clock** or an explicit “as of” timestamp.

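Fake‑clock injection can be as small as this (a sketch; the receipt shape here is illustrative, not the real `VerdictReceipt` schema):

```python
from dataclasses import dataclass

@dataclass
class FixedClock:
    # Deterministic "as of" timestamp injected into benchmark runs
    # instead of reading the wall clock.
    as_of: str  # e.g. "2025-11-29T00:00:00Z"

    def now(self) -> str:
        return self.as_of

def build_receipt(verdict_id: str, clock) -> dict:
    # Any component that stamps time takes a clock dependency, so
    # replaying a manifest reproduces byte-identical output.
    return {"verdictId": verdict_id, "timestamp": clock.now()}
```
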
### 5.2 Keep scenarios small but meaningful

* Prefer many **focused** scenarios over a few huge ones.
* Each scenario should clearly answer:

  * “What property of Stella Ops are we testing?”
  * “What moat claim does this support?”

Examples:

* `bench/scenarios/false_pos_kubernetes.yaml`

  * Focus: config noise reduction vs. a baseline scanner.
* `bench/scenarios/reachability_java_webapp.yaml`

  * Focus: reachable vs. unreachable vuln proofs.
* `bench/scenarios/vex_not_affected_openssl.yaml`

  * Focus: VEX correctness and proof coverage.

### 5.3 Use golden outputs, not ad‑hoc assertions

* The bench harness should:

  * Run Stella Ops on the scenario inputs.
  * Normalize outputs (sorted lists, stable IDs).
  * Compare to `bench/golden/<scenario>.json`.
* Golden files should include:

  * expected findings (id, severity, reachable?, etc.),
  * expected VEX entries,
  * expected metrics (precision, recall, coverage).

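A sketch of the normalize‑and‑compare step (the golden dict would be parsed from `bench/golden/<scenario>.json`; the keys are illustrative):

```python
def normalize(results: dict) -> dict:
    # Sort by stable ID so diffs reflect real changes, not iteration order.
    return {
        "findings": sorted(results.get("findings", []), key=lambda f: f["id"]),
        "vex": sorted(results.get("vex", []), key=lambda v: v["id"]),
    }

def diff_against_golden(results: dict, golden: dict) -> list:
    # Returns the sections that differ; an empty list means the run matches.
    actual, expected = normalize(results), normalize(golden)
    return [key for key in ("findings", "vex") if actual[key] != expected[key]]
```
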
---

## 6. Moat‑critical benchmark types (we must have all of these)

When you’re thinking about gaps, check that we have:

1. **Cross‑tool comparison**

   * Same corpus, multiple scanners.
   * Metrics vs. baselines for FP/FN.

2. **Proof density & quality**

   * A corpus where:

     * some vulns are reachable,
     * some are not, and
     * some are not present.
   * Ensure:

     * reachable ones have rich proofs (stack slices / symbol proofs).
     * non‑reachable or absent ones have:

       * a correct disposition, and
       * clear receipts explaining why.

3. **VEX accuracy**

   * Scenarios with a known SBOM + known vulnerability impact.
   * Check:

     * VEX “affected”/“not‑affected” matches ground truth.
     * every VEX entry has a receipt.

4. **Analyst workflow**

   * A small usability corpus for internal testing:

     * Measure time‑to‑triage with/without receipts.
     * Use the same scenarios across releases to track improvement.

5. **Upgrade / drift resistance**

   * Scenarios that are **expected to remain stable** across:

     * rule changes that *shouldn’t* affect outcomes, and
     * feed updates (within a given version window).
   * These act as canaries for unintended regressions.

---

## 7. Developer checklist (TL;DR)

Before merging a change that touches security logic, ask yourself:

1. **Is there at least one benchmark scenario that exercises this change?**
2. **Does the change improve at least one moat metric, or is it neutral?**
3. **Have I run `make bench:quick` locally and checked the diffs?**
4. **If goldens changed, did I explain why in the PR?**
5. **Did I keep benchmarks deterministic (pinned versions, fake time, etc.)?**

If any answer is “no”, fix that before merging.

---

If you’d like, as a next step I can sketch a concrete `bench/scenarios/*.yaml` and matching `bench/golden/*.json` example that encodes one *specific* moat claim (e.g., “30% fewer FPs than Scanner X on Kubernetes configs”) so your team has a ready‑to‑copy pattern.

Here’s a condensed **“Stella Ops Developer Guidelines”** based on the official engineering docs and dev guides.

---

## 0. Where to start

* **Dev docs index:** The main entrypoint is `Development Guides & Tooling` (docs/technical/development/README.md). It links to coding standards, test strategy, the performance workbook, the plug‑in SDK, examples, and more. ([Gitea: Git with a cup of tea][1])
* **If a term is unfamiliar:** Check the one‑page *Glossary of Terms* first. ([Stella Ops][2])
* **Big picture:** Stella Ops is an SBOM‑first, offline‑ready container security platform; many design decisions (determinism, signatures, the policy DSL, SBOM delta scans) flow from that. ([Stella Ops][3])

---

## 1. Core engineering principles

From the **Coding Standards & Contributor Guide**: ([Gitea: Git with a cup of tea][4])

1. **SOLID first** – especially interface segregation and dependency inversion.
2. **100‑line file rule** – if a file grows beyond 100 physical lines, split or refactor it.
3. **Contracts vs. runtime** – public DTOs and interfaces live in lightweight `*.Contracts` projects; implementations live in sibling runtime projects.
4. **Single composition root** – DI wiring happens in `StellaOps.Web/Program.cs` and each plug‑in’s `IoCConfigurator`. Nothing else creates a service provider.
5. **No service locator** – constructor injection only; no global `ServiceProvider` or static service lookups.
6. **Fail‑fast startup** – validate configuration *before* the web host starts listening.
7. **Hot‑load compatibility** – avoid static singletons that would survive plug‑in unload; don’t manually load assemblies outside the built‑in loader.

These all serve the product goals of **deterministic, offline, explainable security decisions**. ([Stella Ops][3])

---

## 2. Repository layout & layering

From the repo layout section: ([Gitea: Git with a cup of tea][4])

* **Top‑level structure (simplified):**

  ```text
  src/
    backend/
      StellaOps.Web/        # ASP.NET host + composition root
      StellaOps.Common/     # logging, helpers
      StellaOps.Contracts/  # DTO + interface contracts
      …                     # more runtime projects
    plugins-sdk/            # plug-in templates & abstractions
    frontend/               # Angular workspace
  tests/                    # mirrors src 1-to-1
  ```

* **Rules:**

  * No “Module” folders or nested solution hierarchies.
  * Tests mirror the `src/` structure 1:1; **no test code in production projects**.
  * New features follow the *feature folder* layout (e.g., `Scan/ScanService.cs`, `Scan/ScanController.cs`).

---

## 3. Naming, style & language usage

Key conventions: ([Gitea: Git with a cup of tea][4])

* **Namespaces:** file‑scoped, `StellaOps.*`.
* **Interfaces:** `I` prefix (`IScannerRunner`).
* **Classes/records:** PascalCase (`ScanRequest`, `TrivyRunner`).
* **Private fields:** `camelCase` (no leading `_`).
* **Constants:** `SCREAMING_SNAKE_CASE`.
* **Async methods:** end with `Async`.
* **Usings:** outside the namespace, sorted, no wildcard imports.
* **File length:** keep files ≤100 lines including `using` directives and braces (enforced by tooling).

C# feature usage: ([Gitea: Git with a cup of tea][4])

* Nullable reference types **on**.
* Use `record` for immutable DTOs.
* Prefer pattern matching over long `switch` cascades.
* `Span`/`Memory` only when you’ve measured that you need them.
* Use `await foreach` instead of manual iterator loops.

Formatting & analysis:

* `dotnet format` must be clean; StyleCop, the security analyzers, and CodeQL run in CI and are treated as gates. ([Gitea: Git with a cup of tea][4])

---

## 4. Dependency injection, async & concurrency

DI policy (core + plug‑ins): ([Gitea: Git with a cup of tea][4])

* Exactly **one composition root** per process (`StellaOps.Web/Program.cs`).
* Plug‑ins contribute through:

  * `[ServiceBinding]` attributes for simple bindings, or
  * an `IoCConfigurator : IDependencyInjectionRoutine` for advanced setups.
* The default lifetime is **scoped**. Use singletons only for truly stateless, thread‑safe helpers.
* Never use a service locator or manually build nested service providers except in tests.

Async & threading: ([Gitea: Git with a cup of tea][4])

* All I/O is async; avoid `.Result` / `.Wait()`.
* Library code uses `ConfigureAwait(false)`.
* Control concurrency with channels or `Parallel.ForEachAsync`, not ad‑hoc `Task.Run` loops.

---

## 5. Tests, tooling & quality gates

The **Automated Test‑Suite Overview** spells out all the CI layers and budgets. ([Gitea: Git with a cup of tea][5])

**Test layers (high‑level):**

* Unit tests: xUnit.
* Property‑based tests: FsCheck.
* Integration:

  * API integration with Testcontainers.
  * DB/merge flows using Mongo + Redis.
* Contracts: gRPC breakage checks with Buf.
* Frontend:

  * Unit tests with Jest.
  * E2E tests with Playwright.
  * Lighthouse runs for performance & accessibility.
* Non‑functional:

  * Load tests via k6.
  * Chaos experiments (CPU/OOM) using Docker tooling.
  * Dependency & license scanning.
  * SBOM reproducibility/attestation checks.

**Quality gates (examples):** ([Gitea: Git with a cup of tea][5])

* API unit test line coverage ≥ ~85%.
* API P95 latency ≤ ~120 ms in nightly runs.
* Δ‑SBOM warm scan P95 ≤ ~5 s on reference hardware.
* Lighthouse performance score ≥ ~90, accessibility ≥ ~95.

**Local workflows:**

* Use `./scripts/dev-test.sh` for “fast” local runs and `--full` for the entire stack (API, UI, Playwright, Lighthouse, etc.). Needs Docker and a modern Node. ([Gitea: Git with a cup of tea][5])
* Some suites use Mongo2Go + an OpenSSL 1.1 shim; others use a helper script to spin up a local `mongod` for deeper debugging. ([Gitea: Git with a cup of tea][5])

---

## 6. Plug‑ins & connectors

The **Plug‑in SDK Guide** is your bible for schedule jobs, scanner adapters, TLS providers, notification channels, etc. ([Gitea: Git with a cup of tea][6])

**Basics:**

* Use the `.NET` templates to scaffold:

  ```bash
  dotnet new stellaops-plugin-schedule -n MyPlugin.Schedule --output src
  ```

* At publish time, copy **signed** artefacts to:

  ```text
  src/backend/Stella.Ops.Plugin.Binaries/<MyPlugin>/
    MyPlugin.dll
    MyPlugin.dll.sig
  ```

* The backend:

  * verifies the Cosign signature,
  * enforces `[StellaPluginVersion]` compatibility, and
  * loads plug‑ins in isolated `AssemblyLoadContext`s.

**DI entrypoints:**

* For simple cases, mark implementations with `[ServiceBinding(typeof(IMyContract), ServiceLifetime.Scoped, …)]`.
* For more control, implement `IoCConfigurator : IDependencyInjectionRoutine` and configure services/options in `Register(...)`. ([Gitea: Git with a cup of tea][6])

**Examples:**

* **Schedule job:** implement `IJob.ExecuteAsync`, add `[StellaPluginVersion("X.Y.Z")]`, and register the cron schedule with `services.AddCronJob<MyJob>("0 15 * * *")`.
* **Scanner adapter:** implement `IScannerRunner` and register via `services.AddScanner<MyAltScanner>("alt")`; document Docker sidecars if needed. ([Gitea: Git with a cup of tea][6])

**Signing & deployment:**

* Publish, sign with Cosign, and optionally zip:

  ```bash
  dotnet publish -c Release -p:PublishSingleFile=true -o out
  cosign sign --key $COSIGN_KEY out/MyPlugin.Schedule.dll
  ```

* Copy into the backend container (e.g., `/opt/plugins/`) and restart.
* Unsigned DLLs are rejected when `StellaOps:Security:DisableUnsigned=false`. ([Gitea: Git with a cup of tea][6])

**Marketplace:**

* Tag releases like `plugin-vX.Y.Z`, attach the signed ZIP, and submit metadata to the community plug‑in index so it shows up in the UI Marketplace. ([Gitea: Git with a cup of tea][6])

---

## 7. Policy DSL & security decisions

For policy authors and tooling engineers, the **Stella Policy DSL (stella‑dsl@1)** doc is key. ([Stella Ops][7])

**Goals:**

* Deterministic: same inputs → same findings on every machine.
* Declarative: no arbitrary loops, network calls, or clocks.
* Explainable: each decision carries its rule, inputs, and rationale.
* Offline‑friendly and reachability‑aware (SBOM + advisories + VEX + reachability). ([Stella Ops][7])

**Structure:**

* One `policy` block per `.stella` file, with:

  * `metadata` (description, tags),
  * `profile` blocks (severity, trust, reachability adjustments),
  * `rule` blocks (`when` / `then` logic), and
  * optional `settings`. ([Stella Ops][7])

**Context & built‑ins:**

* Namespaces like `sbom`, `advisory`, `vex`, `env`, `telemetry`, `secret`, `profile.*`, etc. ([Stella Ops][7])
* Helpers such as `normalize_cvss`, `risk_score`, `vex.any`, `vex.latest`, `sbom.any_component`, `exists`, `coalesce`, and secrets‑specific helpers. ([Stella Ops][7])

**Rules of thumb:**

* Always include a clear `because` when you change `status` or `severity`. ([Stella Ops][7])
* Avoid catch‑all suppressions (`when true` + `status := "suppressed"`); the linter will flag them. ([Stella Ops][7])
* Use `stella policy lint/compile/simulate` in CI and locally; test in sealed (offline) mode to ensure there are no network dependencies. ([Stella Ops][7])

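To make that concrete, a rule honoring a vendor VEX claim might look roughly like this (a sketch assembled only from the constructs named above: `rule`, `when`/`then`, `vex.any`, `status :=`, and `because`; the exact grammar is defined by the stella‑dsl@1 doc and may differ):

```
policy "honor-vendor-vex" {
  metadata {
    description = "Suppress findings the vendor has marked not-affected"
  }

  rule honor_vendor_vex {
    when vex.any(status == "not_affected")
    then {
      status := "suppressed"
      because "vendor VEX statement marks this component not-affected"
    }
  }
}
```

Unlike a catch‑all `when true` suppression, this fires only on an explicit vendor claim and records its rationale in the `because` clause.
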
---

## 8. Commits, PRs & docs

From the commit/PR checklist: ([Gitea: Git with a cup of tea][4])

Before opening a PR:

1. Use **Conventional Commit** prefixes (`feat:`, `fix:`, `docs:`, etc.).
2. Run `dotnet format` and `dotnet test`; both must be green.
3. Keep new/changed files within the 100‑line guideline.
4. Update XML‑doc comments for any new public API.
5. If you add or change a public contract:

   * Update the relevant markdown docs.
   * Update JSON schemas / API descriptions as needed.
6. Ensure the static analyzers and CI jobs relevant to your change are passing.

For new test layers or jobs, also update the test‑suite overview and metrics docs so the CI configuration stays discoverable. ([Gitea: Git with a cup of tea][5])

---

## 9. Licensing & reciprocity

Stella Ops ships under **AGPL‑3.0‑or‑later** with a strong reciprocity clause: ([Stella Ops][8])

* You may run, study, modify, and redistribute it, including as a hosted service.
* If you run a **modified** version for others over a network, you must make that exact source code available to those users.
* Official containers are signed and include SBOMs and attestations; verify them with Cosign as described on the license/security pages. ([Stella Ops][8])

When you build extensions:

* Keep plug‑ins compatible with AGPL expectations around combined works.
* Don’t embed proprietary logic into the core without checking the license implications.

---

## 10. If you just want a “first contribution” recipe

A practical path that follows the guidelines:

1. Clone the main repo; skim the **Coding Standards**, **Test Suite Overview**, and the **Dev Guides & Tooling** index. ([Gitea: Git with a cup of tea][4])
2. Get `dotnet`, Docker, and Node set up; run `./scripts/dev-test.sh` to make sure your environment is healthy. ([Gitea: Git with a cup of tea][5])
3. Pick a small issue (docs, a small refactor, or a new test) and make changes respecting:

   * 100‑line files,
   * DI patterns, and
   * naming & style.
4. Add/adjust tests plus any affected docs or JSON schemas.
5. Run tests + formatting locally, push, and open a PR with a conventional title and a short “how I tested this” note.

If you tell me what you’re planning to work on (plug‑in, policy pack, core feature, or UI), I can turn this into a very concrete checklist tailored to that slice of Stella Ops.

[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/technical/development/README.md "docs/technical/development/README.md – git.stella-ops.org (Gitea)"
[2]: https://stella-ops.org/docs/14_glossary_of_terms/ "Glossary of Terms – Stella Ops"
[3]: https://stella-ops.org/docs/05_SYSTEM_REQUIREMENTS_SPEC/ "System Requirements Specification – Stella Ops"
[4]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/18_CODING_STANDARDS.md "docs/18_CODING_STANDARDS.md – git.stella-ops.org (Gitea)"
[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/19_TEST_SUITE_OVERVIEW.md "docs/19_TEST_SUITE_OVERVIEW.md – git.stella-ops.org (Gitea)"
[6]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/10_PLUGIN_SDK_GUIDE.md "docs/10_PLUGIN_SDK_GUIDE.md – git.stella-ops.org (Gitea)"
[7]: https://stella-ops.org/docs/policy/dsl/index.html "Stella Policy DSL – Stella Ops"
[8]: https://stella-ops.org/license/ "AGPL-3.0-or-later – Stella Ops"

Here’s a tight, practical pattern to make your scanner’s vuln‑DB updates rock‑solid even when feeds hiccup:
|
||||
|
||||
# Offline, verifiable update bundles (DSSE + Rekor v2)
|
||||
|
||||
**Idea:** distribute DB updates as offline tarballs. Each tarball ships with:
|
||||
|
||||
* a **DSSE‑signed** statement (e.g., in‑toto style) over the bundle hash
|
||||
* a **Rekor v2 receipt** proving the signature/statement was logged
|
||||
* a small **manifest.json** (version, created_at, content hashes)
|
||||
|
||||
**Startup flow (happy path):**
|
||||
|
||||
1. Load latest tarball from your local `updates/` cache.
|
||||
2. Verify DSSE signature against your trusted public keys.
|
||||
3. Verify Rekor v2 receipt (inclusion proof) matches the DSSE payload hash.
|
||||
4. If both pass, unpack/activate; record the bundle’s **trust_id** (e.g., statement digest).
|
||||
5. If anything fails, **keep using the last good bundle**. No service disruption.
**Why this helps**

* **Air‑gap friendly:** no live network needed at activation time.
* **Tamper‑evident:** DSSE + Rekor receipt proves provenance and transparency.
* **Operational stability:** feed outages become non‑events—scanner just keeps the last good state.

---

## File layout inside each bundle

```
/bundle-2025-11-29/
  manifest.json            # { version, created_at, entries[], sha256s }
  payload.tar.zst          # the actual DB/indices
  payload.tar.zst.sha256
  statement.dsse.json      # DSSE-wrapped statement over payload hash
  rekor-receipt.json       # Rekor v2 inclusion/verification material
```
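As an illustration, a small generator for `manifest.json` over this layout (the field names here are illustrative, not a fixed schema):

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def sha256_hex(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def write_manifest(bundle_dir: Path, version: str) -> dict:
    """List every payload file with its digest so verifiers can check each entry."""
    entries = [
        {"name": p.name, "sha256": sha256_hex(p), "size": p.stat().st_size}
        for p in sorted(bundle_dir.glob("payload.*"))  # stable order for determinism
    ]
    manifest = {
        "version": version,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "entries": entries,
    }
    (bundle_dir / "manifest.json").write_text(
        json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```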
---

## Acceptance/Activation rules

* **Trust root:** pin one (or more) publisher public keys; rotate via a separate, out‑of‑band process.
* **Monotonicity:** only activate if `manifest.version > current.version` (or if trust policy explicitly allows replay for rollback testing).
* **Atomic switch:** unpack to `db/staging/`, validate, then symlink‑flip to `db/active/`.
* **Quarantine on failure:** move bad bundles to `updates/quarantine/` with a reason code.
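The symlink‑flip in the atomic‑switch rule can use `rename(2)` semantics: build a temporary symlink, then rename it over the old one. A sketch (POSIX‑only; Windows symlinks need extra privileges):

```python
import os
from pathlib import Path

def atomic_swap(staging: Path, active_link: Path) -> None:
    """Repoint `active_link` at `staging` atomically: readers always see either
    the old target or the new one, never a half-written state."""
    tmp = active_link.with_suffix(".tmp")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    tmp.symlink_to(staging, target_is_directory=True)
    os.replace(tmp, active_link)  # rename() atomically replaces the old symlink
```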
---

## Minimal .NET 10 verifier sketch (C#)

```csharp
// Helper types (Manifest, Hashes, Dsse, RekorV2, TarZstd, DirUtil, SymlinkUtil,
// LocalDbSelfCheck, State) are illustrative stand-ins for your own utilities.
public sealed record BundlePaths(string Dir) {
  public string Manifest => Path.Combine(Dir, "manifest.json");
  public string Payload  => Path.Combine(Dir, "payload.tar.zst");
  public string Dsse     => Path.Combine(Dir, "statement.dsse.json");
  public string Receipt  => Path.Combine(Dir, "rekor-receipt.json");
}

public async Task<bool> ActivateBundleAsync(BundlePaths b, TrustConfig trust, string activeDir) {
  var manifest = await Manifest.LoadAsync(b.Manifest);
  if (!await Hashes.VerifyAsync(b.Payload, manifest.PayloadSha256)) return false;

  // 1) DSSE verify (publisher keys pinned in trust)
  var (okSig, dssePayloadDigest) = await Dsse.VerifyAsync(b.Dsse, trust.PublisherKeys);
  if (!okSig || dssePayloadDigest != manifest.PayloadSha256) return false;

  // 2) Rekor v2 receipt verify (inclusion + statement digest == dssePayloadDigest)
  if (!await RekorV2.VerifyReceiptAsync(b.Receipt, dssePayloadDigest, trust.RekorPub)) return false;

  // 3) Stage, validate, then atomically flip
  var staging = Path.Combine(activeDir, "..", "staging");
  DirUtil.Empty(staging);
  await TarZstd.ExtractAsync(b.Payload, staging);
  if (!await LocalDbSelfCheck.RunAsync(staging)) return false;

  SymlinkUtil.AtomicSwap(source: staging, target: activeDir);
  State.WriteLastGood(manifest.Version, dssePayloadDigest);
  return true;
}
```
---

## Operational playbook

* **On boot & daily at HH:MM:** try `ActivateBundleAsync()` on the newest bundle; on failure, log and continue.
* **Telemetry (no PII):** reason codes (SIG_FAIL, RECEIPT_FAIL, HASH_MISMATCH, SELFTEST_FAIL), versions, last_good.
* **Keys & rotation:** keep `publisher.pub` and `rekor.pub` in a root‑owned, read‑only path; rotate via a separate signed “trust bundle”.
* **Defense‑in‑depth:** verify both the **payload hash** and each file’s hash listed in `manifest.entries[]`.
* **Rollback:** allow `--force-activate <bundle>` for emergency testing, but mark it as **non‑monotonic** in state.

---

## What to hand your release team

* A Make/CI target that:

  1. Builds `payload.tar.zst` and computes hashes
  2. Generates `manifest.json`
  3. Creates and signs the **DSSE statement**
  4. Submits to Rekor (or your mirror) and saves the **v2 receipt**
  5. Packages the bundle folder and publishes to your offline repo

* A checksum file (`*.sha256sum`) for ops to verify out‑of‑band.

---

If you want, I can turn this into a Stella Ops spec page (`docs/modules/scanner/offline-bundles.md`) plus a small reference implementation (C# library + CLI) that drops right into your Scanner service.
Here’s a “drop‑in” Stella Ops dev guide for **DSSE‑signed Offline Scanner Updates** — written in the same spirit as the existing docs and sprint files.

You can treat this as the seed for `docs/modules/scanner/development/dsse-offline-updates.md` (or similar).

---

# DSSE‑Signed Offline Scanner Updates — Developer Guidelines

> **Audience**
> Scanner, Export Center, Attestor, CLI, and DevOps engineers implementing DSSE‑signed offline vulnerability updates and integrating them into the Offline Update Kit (OUK).
>
> **Context**
>
> * OUK already ships **signed, atomic offline update bundles** with merged vulnerability feeds, container images, and an attested manifest.([git.stella-ops.org][1])
> * DSSE + Rekor is already used for **scan evidence** (SBOM attestations, Rekor proofs).([git.stella-ops.org][2])
> * Sprints 160/162 add **attestation bundles** with manifest, checksums, DSSE signature, and optional transparency log segments, and integrate them into OUK and CLI flows.([git.stella-ops.org][3])

These guidelines tell you how to **wire all of that together** for “offline scanner updates” (feeds, rules, packs) in a way that matches Stella Ops’ determinism + sovereignty promises.

---

## 0. Mental model

At a high level, you’re building this:

```text
Advisory mirrors / Feeds builders
            │
            ▼
ExportCenter.AttestationBundles
  (creates DSSE + Rekor evidence
   for each offline update snapshot)
            │
            ▼
Offline Update Kit (OUK) builder
  (adds feeds + evidence to kit tarball)
            │
            ▼
stella offline kit import / admin CLI
  (verifies Cosign + DSSE + Rekor segments,
   then atomically swaps scanner feeds)
```

Online, Rekor is live; offline, you rely on **bundled Rekor segments / snapshots** and the existing OUK mechanics (import is atomic, old feeds kept until the new bundle is fully verified).([git.stella-ops.org][1])
---

## 1. Goals & non‑goals

### Goals

1. **Authentic offline snapshots**
   Every offline scanner update (OUK or delta) must be verifiably tied to:

   * a DSSE envelope,
   * a certificate chain rooted in Stella’s Fulcio/KMS profile or BYO KMS/HSM,
   * *and* a Rekor v2 inclusion proof or bundled log segment.([Stella Ops][4])

2. **Deterministic replay**
   Given:

   * a specific offline update kit (`stella-ops-offline-kit-<DATE>.tgz` + `offline-manifest-<DATE>.json`)([git.stella-ops.org][1])
   * its DSSE attestation bundle + Rekor segments

   every verifier must reach the *same* verdict on integrity and contents — online or fully air‑gapped.

3. **Separation of concerns**

   * Export Center: builds attestation bundles, no business logic about scanning.([git.stella-ops.org][5])
   * Scanner: imports & applies feeds; verifies but does not generate DSSE.
   * Signer / Attestor: own DSSE & Rekor integration.([git.stella-ops.org][2])

4. **Operational safety**

   * Imports remain **atomic and idempotent**.
   * Old feeds stay live until the new update is **fully verified** (Cosign + DSSE + Rekor).([git.stella-ops.org][1])

### Non‑goals

* Designing new crypto or log formats.
* Per‑feed DSSE envelopes (you can add more later, but the minimum contract is **bundle‑level** attestation).
---

## 2. Bundle contract for DSSE‑signed offline updates

You’re extending the existing OUK contract:

* OUK already packs:

  * merged vuln feeds (OSV, GHSA, optional NVD 2.0, CNNVD/CNVD, ENISA, JVN, BDU),
  * container images (`stella-ops`, Zastava, etc.),
  * provenance (Cosign signature, SPDX SBOM, in‑toto SLSA attestation),
  * `offline-manifest.json` + detached JWS signed during export.([git.stella-ops.org][1])

For **DSSE‑signed offline scanner updates**, add a new logical layer:

### 2.1. Files to ship

Inside each offline kit (full or delta) you must produce:

```text
/attestations/
  offline-update.dsse.json    # DSSE envelope
  offline-update.rekor.json   # Rekor entry + inclusion proof (or segment descriptor)
/manifest/
  offline-manifest.json       # existing manifest
  offline-manifest.json.jws   # existing detached JWS
/feeds/
  ...                         # existing feed payloads
```
The exact paths can be adjusted, but keep:

* **One DSSE bundle per kit** (min spec).
* **One canonical Rekor proof file** per DSSE envelope.

### 2.2. DSSE payload contents (minimal)

Define (or reuse) a predicate type such as:

```jsonc
{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "<base64-encoded in-toto statement>"
}
```

The decoded payload (an in‑toto statement) should **at minimum** contain:

* **Subject**

  * `name`: `stella-ops-offline-kit-<DATE>.tgz`
  * `digest.sha256`: tarball digest

* **Predicate type** (recommendation)

  * `https://stella-ops.org/attestations/offline-update/1`

* **Predicate fields**

  * `offline_manifest_sha256` – SHA‑256 of `offline-manifest.json`
  * `feeds` – array of feed entries such as `{ name, snapshot_date, archive_digest }` (mirrors the `rules_and_feeds` style used in the moat doc).([Stella Ops][6])
  * `builder` – CI workflow id / git commit / Export Center job id
  * `created_at` – UTC ISO‑8601
  * `oukit_channel` – e.g., `edge`, `stable`, `fips-profile`

**Guideline:** this DSSE payload is the **single canonical description** of “what this offline update snapshot is”.
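A sketch of assembling that statement and encoding it for the envelope (the in‑toto v1 `_type` is standard; the predicate construction mirrors the fields above and is illustrative):

```python
import base64
import hashlib
import json

def build_offline_update_statement(kit_name: str, kit_sha256: str,
                                   manifest_sha256: str, feeds: list,
                                   builder: str, created_at: str,
                                   channel: str) -> dict:
    """Assemble the in-toto statement described above; the caller wraps the
    encoded form in a DSSE envelope and signs it."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": kit_name, "digest": {"sha256": kit_sha256}}],
        "predicateType": "https://stella-ops.org/attestations/offline-update/1",
        "predicate": {
            "offline_manifest_sha256": manifest_sha256,
            "feeds": feeds,
            "builder": builder,
            "created_at": created_at,
            "oukit_channel": channel,
        },
    }

def to_dsse_payload(statement: dict) -> str:
    # Canonical serialization (sorted keys, no extra whitespace) keeps the
    # payload digest deterministic across builders.
    raw = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode()
    return base64.b64encode(raw).decode()
```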
### 2.3. Rekor material

Attestor must:

* Submit `offline-update.dsse.json` to Rekor v2, obtaining:

  * `uuid`
  * `logIndex`
  * inclusion proof (`rootHash`, `hashes`, `checkpoint`)

* Serialize that to `offline-update.rekor.json` and store it in object storage + OUK staging, so it ships in the kit.([git.stella-ops.org][2])

For fully offline operation, either:

* embed a **minimal log segment** containing that entry; or
* rely on daily Rekor snapshot exports included elsewhere in the kit.([git.stella-ops.org][2])
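For intuition, offline receipt checking boils down to recomputing a Merkle audit path against the bundled tree head. A self‑contained sketch using RFC 6962‑style leaf/node hashing (the scheme transparency logs like Rekor build on); `mth`/`audit_path` are included only so the verifier can be exercised locally:

```python
import hashlib

def _leaf(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()

def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def mth(entries: list) -> bytes:
    """Merkle tree head over the log entries (RFC 6962 §2.1)."""
    n = len(entries)
    if n == 1:
        return _leaf(entries[0])
    k = 1
    while k * 2 < n:
        k *= 2
    return _node(mth(entries[:k]), mth(entries[k:]))

def audit_path(m: int, entries: list) -> list:
    """Inclusion proof for entry m (RFC 6962 §2.1.1)."""
    n = len(entries)
    if n == 1:
        return []
    k = 1
    while k * 2 < n:
        k *= 2
    if m < k:
        return audit_path(m, entries[:k]) + [mth(entries[k:])]
    return audit_path(m - k, entries[k:]) + [mth(entries[:k])]

def verify_inclusion(leaf_data: bytes, index: int, tree_size: int,
                     proof: list, root: bytes) -> bool:
    """Fold the audit path from the leaf up to the root (RFC 9162 §2.1.3.2)."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = _leaf(leaf_data)
    for p in proof:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            r = _node(p, r)
            if not fn & 1:
                while fn and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            r = _node(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```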
---

## 3. Implementation by module

### 3.1 Export Center — attestation bundles

**Working directory:** `src/ExportCenter/StellaOps.ExportCenter.AttestationBundles`([git.stella-ops.org][7])

**Responsibilities**

1. **Compose attestation bundle job** (EXPORT‑ATTEST‑74‑001)

   * Input: a snapshot identifier (e.g., offline kit build id or feed snapshot date).
   * Read manifest and feed metadata from the Export Center’s storage.([git.stella-ops.org][5])
   * Generate the DSSE payload structure described above.
   * Call `StellaOps.Signer` to wrap it in a DSSE envelope.
   * Call `StellaOps.Attestor` to submit DSSE → Rekor and get the inclusion proof.([git.stella-ops.org][2])
   * Persist:

     * `offline-update.dsse.json`
     * `offline-update.rekor.json`
     * any log segment artifacts.

2. **Integrate into offline kit packaging** (EXPORT‑ATTEST‑74‑002 / 75‑001)

   * The OUK builder (Python script `ops/offline-kit/build_offline_kit.py`) already assembles artifacts & manifests.([Stella Ops][8])
   * Extend that pipeline (or add an Export Center step) to:

     * fetch the attestation bundle for the snapshot,
     * place it under `/attestations/` in the kit staging dir,
     * ensure `offline-manifest.json` contains entries for the DSSE and Rekor files (name, sha256, size, capturedAt).([git.stella-ops.org][1])

3. **Contracts & schemas**

   * Define a small JSON schema for `offline-update.rekor.json` (UUID, index, proof fields) and check it into `docs/11_DATA_SCHEMAS.md` or module‑local schemas.
   * Keep all new payload schemas **versioned**; avoid “shape drift”.

**Do / Don’t**

* ✅ **Do** treat the attestation bundle job as *pure aggregation* (AOC guardrail: no modification of evidence).([git.stella-ops.org][5])
* ✅ **Do** rely on Signer + Attestor; don’t hand‑roll DSSE/Rekor logic in Export Center.([git.stella-ops.org][2])
* ❌ **Don’t** reach out to external networks from this job — it must run with the same offline‑ready posture as the rest of the platform.
---

### 3.2 Offline Update Kit builder

**Working area:** `ops/offline-kit/*` + `docs/24_OFFLINE_KIT.md`([git.stella-ops.org][1])

Guidelines:

1. **Preserve current guarantees**

   * Imports must remain **idempotent and atomic**, with **old feeds kept until the new bundle is fully verified**. This now includes DSSE/Rekor checks in addition to Cosign + JWS.([git.stella-ops.org][1])

2. **Staging layout**

   * When staging a kit, ensure the tree looks like:

     ```text
     out/offline-kit/staging/
       feeds/...
       images/...
       manifest/offline-manifest.json
       attestations/offline-update.dsse.json
       attestations/offline-update.rekor.json
     ```

   * Update `offline-manifest.json` so each new file appears with:

     * `name`, `sha256`, `size`, `capturedAt`.([git.stella-ops.org][1])

3. **Deterministic ordering**

   * File lists in manifests must be in a stable order (e.g., lexical paths).
   * Timestamps = UTC ISO‑8601 only; never use local time. (Matches determinism guidance in AGENTS.md + policy/runs docs.)([git.stella-ops.org][9])

4. **Delta kits**

   * For deltas (`stella-ouk-YYYY-MM-DD.delta.tgz`), DSSE should still cover:

     * the delta tarball digest,
     * the **logical state** (feeds & versions) after applying the delta.

   * Don’t shortcut by “attesting only the diff files” — the predicate must describe the resulting snapshot.
---

### 3.3 Scanner — import & activation

**Working directory:** `src/Scanner/StellaOps.Scanner.WebService`, `StellaOps.Scanner.Worker`([git.stella-ops.org][9])

Scanner already exposes admin flows for:

* **Offline kit import**, which:

  * validates the Cosign signature of the kit,
  * uses the attested manifest,
  * keeps old feeds until verification is done.([git.stella-ops.org][1])

Add DSSE/Rekor awareness as follows:

1. **Verification sequence (happy path)**

   On `import-offline-usage-kit`:

   1. Validate the **Cosign** signature of the tarball.
   2. Validate `offline-manifest.json` with its JWS signature.
   3. Verify **file digests** for all entries (including `/attestations/*`).
   4. Verify **DSSE**:

      * Call `StellaOps.Attestor.Verify` (or the CLI equivalent) with:

        * `offline-update.dsse.json`
        * `offline-update.rekor.json`
        * the local Rekor log snapshot / segment (if configured)([git.stella-ops.org][2])

      * Ensure the payload digest matches the kit tarball + manifest digests.

   5. Only after all checks pass:

      * swap Scanner’s feed pointer to the new snapshot,
      * emit an audit event noting:

        * kit filename, tarball digest,
        * DSSE statement digest,
        * Rekor UUID + log index.
2. **Config surface**

   Add config keys (names illustrative):

   ```yaml
   scanner:
     offlineKit:
       requireDsse: true         # fail import if DSSE/Rekor verification fails
       rekorOfflineMode: true    # use local snapshots only
       attestationVerifier: https://attestor.internal
   ```

   * Mirror them via ASP.NET Core config + env vars (`SCANNER__OFFLINEKIT__REQUIREDSSE`, etc.), following the same pattern as the DSSE/Rekor operator guide.([git.stella-ops.org][2])
3. **Failure behaviour**

   * **DSSE/Rekor fail, Cosign + manifest OK**

     * Keep old feeds active.
     * Mark the import as failed; surface a `ProblemDetails` error via API/UI.
     * Log structured fields: `rekorUuid`, `attestationDigest`, `offlineKitHash`, `failureReason`.([git.stella-ops.org][2])

   * **Config flag to soften during rollout**

     * When `requireDsse=false`, treat DSSE/Rekor failure as a warning and still allow the import (for the initial observation phase), but emit alerts. This mirrors the “observe → enforce” pattern in the DSSE/Rekor operator guide.([git.stella-ops.org][2])

---

### 3.4 Signer & Attestor

You mostly **reuse** existing guidance:([git.stella-ops.org][2])

* Add a new predicate type & schema for offline updates in Signer.

* Ensure Attestor:

  * can submit offline‑update DSSE envelopes to Rekor,
  * can expose verification routines (used by CLI and Scanner) that:

    * verify the DSSE signature,
    * check the certificate chain against the configured root pack (FIPS/eIDAS/GOST/SM, etc.),([Stella Ops][4])
    * verify Rekor inclusion using either the live log or a local snapshot.

* For fully air‑gapped installs:

  * rely on Rekor **snapshots mirrored** into the Offline Kit (already recommended in the operator guide’s offline section).([git.stella-ops.org][2])
---

### 3.5 CLI & UI

Extend the CLI with explicit verbs (matching the EXPORT‑ATTEST sprints):([git.stella-ops.org][10])

* `stella attest bundle verify --bundle path/to/offline-kit.tgz --rekor-key rekor.pub`
* `stella attest bundle import --bundle ...` (for sites that prefer a two‑step “verify then import” flow)
* Wire the UI Admin → Offline Kit screen so that:

  * verification status shows both **Cosign/JWS** and **DSSE/Rekor** state,
  * policy banners display kit generation time, manifest hash, and DSSE/Rekor freshness.([Stella Ops][11])
---

## 4. Determinism & offline‑safety rules

When touching any of this code, keep these rules front‑of‑mind (they align with the policy DSL and architecture docs):([Stella Ops][4])

1. **No hidden network dependencies**

   * All verification **must work offline** given the kit + Rekor snapshots.
   * Any fallback to live Rekor / Fulcio endpoints must be explicitly toggled and never on by default for “offline mode”.

2. **Stable serialization**

   * DSSE payload JSON must use:

     * stable ordering of fields,
     * no locale‑ or precision‑dependent number formatting,
     * UTC timestamps.
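A minimal sketch of what “stable serialization” means in practice (Python for brevity; the same rules apply to the .NET serializers):

```python
import json
from datetime import datetime, timezone

def canonical_json(obj) -> bytes:
    """Deterministic encoding: sorted keys, fixed separators, UTF-8, no NaN/Inf."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False).encode("utf-8")

def utc_now_iso() -> str:
    # Always UTC with an explicit 'Z' suffix — never local time.
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Two payloads that are logically equal now hash identically, regardless of key insertion order on the builder side.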
3. **Replayable imports**

   * Running `import-offline-usage-kit` twice with the same bundle must be a no‑op after the first time.
   * The DSSE payload for a given snapshot must not change over time; if it does, bump the predicate or snapshot version.

4. **Explainability**

   * When verification fails, errors must explain **what** mismatched (kit digest, manifest digest, DSSE envelope hash, Rekor inclusion) so auditors can reason about it.

---
## 5. Testing & CI expectations

Tie this into the existing CI workflows (`scanner-determinism.yml`, `attestation-bundle.yml`, `offline-kit` pipelines, etc.):([git.stella-ops.org][12])

### 5.1 Unit & integration tests

Write tests that cover:

1. **Happy paths**

   * Full kit import with valid:

     * Cosign,
     * manifest JWS,
     * DSSE,
     * Rekor proof (online and offline modes).

2. **Corruption scenarios**

   * Tampered feed file (hash mismatch).
   * Tampered `offline-manifest.json`.
   * Tampered DSSE payload (signature fails).
   * Mismatched Rekor entry (payload digest doesn’t match DSSE).

3. **Offline scenarios**

   * No network access, only a Rekor snapshot:

     * DSSE verification still passes,
     * Rekor proof validates against the local tree head.

4. **Roll‑back logic**

   * Import fails at the DSSE/Rekor step:

     * scanner DB still points at previous feeds,
     * metrics/logs show the failure and no partial state.

### 5.2 SLOs & observability

Reuse the metrics suggested by the DSSE/Rekor guide and adapt them to OUK imports:([git.stella-ops.org][2])

* `offlinekit_import_total{status="success|failed_dsse|failed_rekor|failed_cosign"}`
* `offlinekit_attestation_verify_latency_seconds` (histogram)
* `attestor_rekor_success_total`, `attestor_rekor_retry_total`, `rekor_inclusion_latency`
* Dashboards: kit versions per environment, time since last kit, DSSE/Rekor health.
---

## 6. Developer checklist (TL;DR)

When you pick up a task touching DSSE‑signed offline updates:

1. **Read the background**

   * `docs/modules/scanner/operations/dsse-rekor-operator-guide.md`([git.stella-ops.org][2])
   * `docs/24_OFFLINE_KIT.md` (and the public offline kit guide).([git.stella-ops.org][1])
   * The relevant sprint file (`SPRINT_160_export_evidence`, `SPRINT_162_exportcenter_i`, etc.).([git.stella-ops.org][10])

2. **Implement**

   * Generate DSSE payloads in Export Center only.
   * Call Signer & Attestor; persist DSSE + Rekor JSON next to manifests.
   * Extend the OUK builder to include the attestation bundle and list it in `offline-manifest.json`.
   * Update the Scanner import flow to verify DSSE/Rekor before swapping feeds.

3. **Test**

   * Unit tests for bundle composition & schema.
   * Integration tests for import + rollback.
   * Determinism tests (same inputs → same DSSE payload).

4. **Wire telemetry**

   * Counters + latency histograms.
   * Logs with `offlineKitHash`, `attestationDigest`, `rekorUuid`.

5. **Document**

   * Update `docs/modules/export-center/architecture.md`, `docs/modules/scanner/architecture.md`, and the OUK docs where flows or contracts changed.([git.stella-ops.org][5])

---

If you tell me which module you’re actually coding in next (Scanner, Export Center, CLI, or Attestor), I can turn this into a very concrete “AGENTS.md‑style” section with exact file paths, class names, and a starter test layout for that module.
[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7bf40b8589c94078e8eadb240553c02f097a5127/docs/24_OFFLINE_KIT.md "git.stella-ops.org/24_OFFLINE_KIT.md at 7bf40b8589c94078e8eadb240553c02f097a5127 - git.stella-ops.org - Gitea: Git with a cup of tea"
[2]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/13e4b53dda1575ba46c6188c794fd465ec6fdeec/docs/modules/scanner/operations/dsse-rekor-operator-guide.md "git.stella-ops.org/dsse-rekor-operator-guide.md at 13e4b53dda1575ba46c6188c794fd465ec6fdeec - git.stella-ops.org - Gitea: Git with a cup of tea"
[3]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/raw/commit/61f963fd52cd4d6bb2f86afc5a82eac04c04b00e/docs/implplan/SPRINT_162_exportcenter_i.md?utm_source=chatgpt.com "https://git.stella-ops.org/stella-ops.org/git.stel..."
[4]: https://stella-ops.org/docs/07_high_level_architecture/index.html?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops"
[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/d870da18ce194c6a5f1a6d71abea36205d9fb276/docs/export-center/architecture.md?utm_source=chatgpt.com "Export Center Architecture - Stella Ops"
[6]: https://stella-ops.org/docs/moat/?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops"
[7]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/79b8e53441e92dbc63684f42072434d40b80275f/src/ExportCenter?utm_source=chatgpt.com "Code - Stella Ops"
[8]: https://stella-ops.org/docs/24_offline_kit/?utm_source=chatgpt.com "Offline Update Kit (OUK) — Air‑Gap Bundle - Stella Ops – Open ..."
[9]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7768555f2d107326050cc5ff7f5cb81b82b7ce5f/AGENTS.md "git.stella-ops.org/AGENTS.md at 7768555f2d107326050cc5ff7f5cb81b82b7ce5f - git.stella-ops.org - Gitea: Git with a cup of tea"
[10]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/66cb6c4b8af58a33efa1521b7953dda834431497/docs/implplan/SPRINT_160_export_evidence.md?utm_source=chatgpt.com "git.stella-ops.org/SPRINT_160_export_evidence.md at ..."
[11]: https://stella-ops.org/about/?utm_source=chatgpt.com "Signed Reachability · Deterministic Replay · Sovereign Crypto"
[12]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/actions/?actor=0&status=0&workflow=sdk-publish.yml&utm_source=chatgpt.com "Actions - git.stella-ops.org - Gitea: Git with a cup of tea"
---
Here’s a crisp, opinionated storage blueprint you can hand to your Stella Ops devs right now, plus zero‑downtime conversion tactics so you can keep prototyping fast without painting yourself into a corner.

# Module → store map (deterministic by default)

* **Authority / OAuth / Accounts & Audit**

  * **PostgreSQL** as the primary source of truth.
  * Tables: `users`, `clients`, `oauth_tokens`, `roles`, `grants`, `audit_log`.
  * **Row‑Level Security (RLS)** on `users`, `grants`, `audit_log`; **STRICT FK + CHECK** constraints; **immutable UUID PKs**.
  * **Audit**: `audit_log(actor_id, action, entity, entity_id, at timestamptz default now(), diff jsonb)`.
  * **Why**: ACID + RLS keeps authz decisions and audit trails deterministic and reviewable.

* **VEX & Vulnerability Writes**

  * **PostgreSQL** with **JSONB facts + relational decisions**.
  * Tables: `vuln_fact(jsonb)`, `vex_decision(package_id, vuln_id, status, rationale, proof_ref, decided_at)`.
  * **Materialized views** for triage queues, e.g. `mv_triage_hotset` (refresh on commit or scheduled).
  * **Why**: JSONB lets you ingest vendor‑shaped docs; decisions stay relational for joins, integrity, and explainability.

* **Routing / Feature Flags / Rate‑limits**

  * **PostgreSQL** (truth) + **Redis** (cache).
  * Tables: `feature_flag(key, rules jsonb, version)`, `route(domain, service, instance_id, last_heartbeat)`, `rate_limiter(bucket, quota, interval)`.
  * Redis keys: `flag:{key}:{version}`, `route:{domain}`, `rl:{bucket}` with short TTLs.
  * **Why**: one canonical RDBMS for consistency; Redis for hot‑path latency.

* **Unknowns Registry (ambiguity tracker)**

  * **PostgreSQL** with **temporal tables** (bitemporal pattern via `valid_from/valid_to`, `sys_from/sys_to`).
  * Table: `unknowns(subject_hash, kind, context jsonb, valid_from, valid_to, sys_from default now(), sys_to)`.
  * Views: `unknowns_current` where `valid_to is null`.
  * **Why**: preserves how/when uncertainty changed (critical for proofs and audits).

* **Artifacts / SBOM / VEX files**

  * **OCI‑compatible CAS** (e.g., self‑hosted registry or MinIO bucket as a content‑addressable store).
  * Keys by **digest** (`sha256:...`), metadata in the Postgres `artifact_index` table with `digest`, `media_type`, `size`, `signatures`.
  * **Why**: blobs don’t belong in your RDBMS; use CAS for scale + cryptographic addressing.

---
# PostgreSQL implementation essentials (copy/paste starters)

* **RLS scaffold (Authority)**:

```sql
alter table audit_log enable row level security;
create policy p_audit_read_self
  on audit_log for select
  using (actor_id = current_setting('app.user_id')::uuid or
         exists (select 1 from grants g
                 where g.user_id = current_setting('app.user_id')::uuid
                   and g.role = 'auditor'));
```

* **JSONB facts + relational decisions**:

```sql
create table vuln_fact (
  id uuid primary key default gen_random_uuid(),
  source text not null,
  payload jsonb not null,
  received_at timestamptz default now()
);

create table vex_decision (
  package_id uuid not null,
  vuln_id text not null,
  status text check (status in ('not_affected','affected','fixed','under_investigation')),
  rationale text,
  proof_ref text,
  decided_at timestamptz default now(),
  primary key (package_id, vuln_id)
);
```

* **Materialized view for triage**:

```sql
create materialized view mv_triage_hotset as
select v.id as fact_id, v.payload->>'vuln' as vuln, v.received_at
from vuln_fact v
where (now() - v.received_at) < interval '7 days';
-- refresh concurrently via job
```

* **Temporal pattern (Unknowns)**:

```sql
create table unknowns (
  id uuid primary key default gen_random_uuid(),
  subject_hash text not null,
  kind text not null,
  context jsonb not null,
  valid_from timestamptz not null default now(),
  valid_to timestamptz,
  sys_from timestamptz not null default now(),
  sys_to timestamptz
);

create view unknowns_current as
select * from unknowns where valid_to is null;
```

---
# Conversion (not migration): zero‑downtime, prototype‑friendly

Even if you’re “not migrating anything yet,” set these rails now so cutting over later is painless.

1. **Encode Mongo‑shaped docs into JSONB with versioned schemas**

   * Ingest pipeline writes to `*_fact(payload jsonb, schema_version int)`.
   * Add a **`validate(schema_version, payload)`** step in your service layer (JSON Schema or SQL checks).
   * Keep a **forward‑compatible view** that projects stable columns from JSONB (e.g., `payload->>'id' as vendor_id`) so downstream code doesn’t break when the payload evolves.

2. **Outbox pattern for reliable side‑effects**

   * Add `outbox(id, topic, key, payload jsonb, created_at, dispatched bool default false)`.
   * In the same transaction as your write, insert the outbox row.
   * A background dispatcher reads `dispatched=false`, publishes to MQ/Webhook, then marks `dispatched=true`.
   * Guarantees: no lost events; delivery is at‑least‑once (a crash between publish and mark re‑sends), so external consumers should dedupe on `id` for exactly‑once effects.
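A dispatcher sketch (SQLite stands in for Postgres here so it runs anywhere; in Postgres you’d add `for update skip locked` to the select so parallel dispatchers don’t double‑claim rows):

```python
import json
import sqlite3

def dispatch_outbox(conn, publish, batch: int = 100) -> int:
    """Publish undispatched rows, then mark them. The mark + commit happen
    after publish, so a crash re-sends (at-least-once) rather than loses events;
    consumers dedupe on the row id."""
    rows = conn.execute(
        "select id, topic, key, payload from outbox "
        "where dispatched = 0 order by id limit ?", (batch,)
    ).fetchall()
    for row_id, topic, key, payload in rows:
        publish(topic, key, json.loads(payload))
        conn.execute("update outbox set dispatched = 1 where id = ?", (row_id,))
    conn.commit()
    return len(rows)
```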
3. **Parallel read adapters behind feature flags**

   * Keep old readers (e.g., Mongo driver) and new Postgres readers in the same service.
   * Gate by `feature_flag('pg_reads')` per tenant or env; flip gradually.
   * Add a **read‑diff monitor** that compares results and logs mismatches to `audit_log(diff)`.

4. **CDC for analytics without coupling**

   * Enable **logical replication** (pgoutput) on your key tables.
   * Stream changes into analyzers (reachability, heuristics) without hitting primaries.
   * This lets you keep OLTP clean and still power dashboards/tests.

5. **Materialized views & job cadence**

   * Refresh `mv_*` on a fixed cadence (e.g., every 2–5 minutes) or post‑commit for hot paths.
   * Keep **“cold path”** analytics in separate schemas (`analytics.*`) sourced from CDC.

6. **Cutover playbook (phased)**

   * Phase A (Dark Read): write to Postgres, still serve from Mongo; compare results silently.
   * Phase B (Shadow Serve): 5–10% of traffic from Postgres via flag; auto‑rollback switch.
   * Phase C (Authoritative): Postgres becomes the source; Mongo path left for emergency read‑only.
   * Phase D (Retire): freeze Mongo, back up, remove writes, delete code paths after 2 stable sprints.

---

# Rate‑limits & flags: single truth, fast edges

* **Truth in Postgres** with versioned flag docs:

```sql
create table feature_flag (
  key text primary key,
  rules jsonb not null,
  version int not null default 1,
  updated_at timestamptz default now()
);
```

* **Edge cache** in Redis:

  * `SETEX flag:{key}:{version} <ttl> <json>`
  * On update, bump `version`; readers compose the cache key with the version (cache‑busting without deletes).

* **Rate limiting:** persist quotas in Postgres; keep counters in Redis (`INCR rl:{bucket}:{window}`), with periodic reconciliation jobs writing summaries back to Postgres for audits.

---

# CAS for SBOM/VEX/attestations

* Push blobs to OCI/MinIO by digest; store only pointers in Postgres:

```sql
create table artifact_index (
  digest text primary key,
  media_type text not null,
  size bigint not null,
  created_at timestamptz default now(),
  signature_refs jsonb
);
```

* Benefits: immutable, deduped, easy to mirror into offline kits.
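
Because the digest is the primary key, re-indexing the same blob is naturally a no-op; a sketch (the digest and media type values are illustrative):

```sql
insert into artifact_index (digest, media_type, size)
values (
  'sha256:0f343b0931126a20f133d67c2b018a3b5c9b4e3e7bdeadbe0000000000000000',  -- illustrative digest
  'application/vnd.cyclonedx+json',
  123456
)
on conflict (digest) do nothing;  -- dedupe: the pointer row is written at most once
```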

---

# Guardrails your team should follow

* **Always** wrap multi‑table writes (facts + outbox + decisions) in a single transaction.
* **Prefer** `jsonb_path_query` for targeted reads; **avoid** scanning entire payloads.
* **Enforce** RLS + least‑privilege roles; the application sets `app.user_id` at session start.
* **Version everything**: schemas, flags, materialized views; never “change in place” without bumping the version.
* **Observability**: expose `pg_stat_statements`, refresh latency for `mv_*`, outbox lag, Redis hit ratio, and RLS policy hits.
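
As an example of a targeted JSONB read (a sketch; the JSON path and table shape are illustrative, following the `*_fact` pattern above):

```sql
-- Pull only the affected package names out of each payload,
-- instead of deserializing whole documents in the service layer.
select jsonb_path_query(payload, '$.affected[*].package.name') as package_name
from vuln_fact
where source = 'osv';
```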

---

If you want, I can turn this into:

* ready‑to‑run **EF Core 10** migrations,
* a **/docs/architecture/store-map.md** for your repo,
* and a tiny **dev seed** (Docker + sample data) so the team can poke it immediately.

Here’s a focused “PostgreSQL patterns per module” doc you can hand straight to your StellaOps devs.

---

# StellaOps – PostgreSQL Patterns per Module

**Scope:** How each StellaOps module should use PostgreSQL: schema patterns, constraints, RLS, indexing, and transaction rules.

---

## 0. Cross‑cutting PostgreSQL Rules

These apply everywhere unless explicitly overridden.

### 0.1 Core conventions

* **Schemas**

  * Use **one logical schema** per module: `authority`, `routing`, `vex`, `unknowns`, `artifact`.
  * Shared utilities (e.g., `outbox`) live in a `core` schema.

* **Naming**

  * Tables: `snake_case`, singular: `user`, `feature_flag`, `vuln_fact`. (Note: `user` and `grant` are reserved words in PostgreSQL, so quote them — `authority."user"`, `authority."grant"` — or pick non‑reserved names.)
  * PK: `id uuid primary key`.
  * FKs: `<referenced_table>_id` (e.g., `user_id`, `tenant_id`).
  * Timestamps:

    * `created_at timestamptz not null default now()`
    * `updated_at timestamptz not null default now()`

* **Multi‑tenancy**

  * All tenant‑scoped tables must have `tenant_id uuid not null`.
  * Enforce tenant isolation with **RLS** on `tenant_id`.

* **Time & timezones**

  * Always `timestamptz`, always store **UTC**, let the DB default `now()`.

### 0.2 RLS & security

* RLS must be **enabled** on any table reachable from a user‑initiated path.
* Every session must set:

```sql
select set_config('app.user_id', '<uuid>', false);
select set_config('app.tenant_id', '<uuid>', false);
select set_config('app.roles', 'role1,role2', false);
```

* RLS policies:

  * Base policy: `tenant_id = current_setting('app.tenant_id')::uuid`.
  * Extra predicates for per‑user privacy (e.g., only see own tokens, only own API clients).
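
A sketch of such a per-user predicate, assuming the `authority.oauth_token` table from §1 and the `app.*` session settings above (the policy name is illustrative):

```sql
-- Owners may read only their own tokens, within their tenant.
create policy p_token_owner on authority.oauth_token
  for select
  using (
    tenant_id = current_setting('app.tenant_id')::uuid
    and user_id = current_setting('app.user_id')::uuid
  );
```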

* DB users:

  * Each module’s service has its **own role** with access only to its schema + `core.outbox`.

### 0.3 JSONB & versioning

* Any JSONB column must have:

  * `payload jsonb not null`,
  * `schema_version int not null`.

* Always index:

  * by source (`source` / `origin`),
  * a small set of projected fields used in WHERE clauses.

### 0.4 Migrations

* All schema changes via migrations, forward‑only.
* Backwards‑compat pattern:

  1. Add new columns / tables.
  2. Backfill.
  3. Flip code to use the new structure (behind a feature flag).
  4. After stability, remove old columns/paths.

---

## 1. Authority Module (auth, accounts, audit)

**Schema:** `authority.*`
**Mission:** identity, OAuth, roles, grants, audit.

### 1.1 Core tables & patterns

* `authority."user"` (quoted: `user` is reserved)

```sql
create table authority."user" (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  email text not null,
  display_name text not null,
  is_disabled boolean not null default false,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, email)
);
```

* Never hard‑delete users: use `is_disabled` (and optionally `disabled_at`).

* `authority.role`

```sql
create table authority.role (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  name text not null,
  description text,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, name)
);
```

* `authority."grant"` (quoted: `grant` is reserved)

```sql
create table authority."grant" (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  user_id uuid not null references authority."user"(id),
  role_id uuid not null references authority.role(id),
  created_at timestamptz not null default now(),
  unique (tenant_id, user_id, role_id)
);
```

* `authority.oauth_client`, `authority.oauth_token`

  * Enforce token uniqueness:

```sql
create table authority.oauth_token (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  user_id uuid not null references authority."user"(id),
  client_id uuid not null references authority.oauth_client(id),
  token_hash text not null, -- store a hash, never the raw token
  expires_at timestamptz not null,
  created_at timestamptz not null default now(),
  revoked_at timestamptz,
  unique (token_hash)
);
```

### 1.2 Audit log pattern

* `authority.audit_log`

```sql
create table authority.audit_log (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  actor_id uuid,               -- null for system actions
  action text not null,
  entity_type text not null,
  entity_id uuid,
  at timestamptz not null default now(),
  diff jsonb not null
);
```

* Insert audit rows in the **same transaction** as the change.

### 1.3 RLS patterns

* Base RLS:

```sql
alter table authority."user" enable row level security;

create policy p_user_tenant on authority."user"
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);
```

* Extra policies:

  * The audit log is visible only to:

    * the actor themselves, or
    * users with an `auditor` or `admin` role.
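
A sketch of that audit-log policy, assuming roles arrive as the comma-separated `app.roles` session setting from §0.2 (a regex over the CSV value is one simple way to test membership; the policy name is illustrative):

```sql
alter table authority.audit_log enable row level security;

create policy p_audit_visibility on authority.audit_log
  for select
  using (
    tenant_id = current_setting('app.tenant_id')::uuid
    and (
      actor_id = current_setting('app.user_id')::uuid
      -- match "auditor" or "admin" inside the comma-separated app.roles value
      or current_setting('app.roles') ~ '(^|,)(auditor|admin)(,|$)'
    )
  );
```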

---

## 2. Routing & Feature Flags Module

**Schema:** `routing.*`
**Mission:** where instances live, what features are on, rate‑limit configuration.

### 2.1 Feature flags

* `routing.feature_flag`

```sql
create table routing.feature_flag (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null default 1,
  is_enabled boolean not null default true,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);
```

* **Immutability by version**:

  * On update, **increment `version`**; don’t overwrite historical data.
  * Mirror changes into a history table via trigger:

```sql
create table routing.feature_flag_history (
  id uuid primary key default gen_random_uuid(),
  feature_flag_id uuid not null references routing.feature_flag(id),
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null,
  changed_at timestamptz not null default now(),
  changed_by uuid
);
```
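
The trigger itself isn't shown above; a minimal sketch (function and trigger names are illustrative) that snapshots the pre-update row into the history table:

```sql
create or replace function routing.fn_feature_flag_history()
returns trigger language plpgsql as $$
begin
  -- Snapshot the row as it was before the update.
  insert into routing.feature_flag_history
    (feature_flag_id, tenant_id, key, rules, version)
  values
    (old.id, old.tenant_id, old.key, old.rules, old.version);
  return new;
end;
$$;

create trigger trg_feature_flag_history
  before update on routing.feature_flag
  for each row
  execute function routing.fn_feature_flag_history();
```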

### 2.2 Instance registry

* `routing.instance`

```sql
create table routing.instance (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  instance_key text not null,
  domain text not null,
  last_heartbeat timestamptz not null default now(),
  status text not null check (status in ('active','draining','offline')),
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, instance_key),
  unique (tenant_id, domain)
);
```

* Pattern:

  * Heartbeats use `update ... set last_heartbeat = now()` without touching other fields.
  * Routing logic filters by `status='active'` and heartbeat recency.
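
For example, a routing lookup that treats anything silent for more than 30 seconds as dead (the threshold is an illustrative choice):

```sql
select id, instance_key, domain
from routing.instance
where tenant_id = current_setting('app.tenant_id')::uuid
  and status = 'active'
  and last_heartbeat > now() - interval '30 seconds';
```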

### 2.3 Rate‑limit configuration

* Config in Postgres, counters in Redis:

```sql
create table routing.rate_limit_config (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  limit_per_interval int not null,
  interval_seconds int not null,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);
```

---

## 3. VEX & Vulnerability Module

**Schema:** `vex.*`
**Mission:** ingest vulnerability facts, keep decisions & triage state.

### 3.1 Facts as JSONB

* `vex.vuln_fact`

```sql
create table vex.vuln_fact (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  source text not null,        -- e.g. "nvd", "vendor_x_vex"
  external_id text,            -- e.g. CVE or advisory id
  payload jsonb not null,
  schema_version int not null,
  received_at timestamptz not null default now()
);
```

* Index patterns:

```sql
create index on vex.vuln_fact (tenant_id, source);
create index on vex.vuln_fact (tenant_id, external_id);
create index vuln_fact_payload_gin on vex.vuln_fact using gin (payload);
```

### 3.2 Decisions as relational data

* `vex.package`

```sql
create table vex.package (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  name text not null,
  version text not null,
  ecosystem text not null,     -- e.g. "pypi", "npm"
  created_at timestamptz not null default now(),
  unique (tenant_id, name, version, ecosystem)
);
```

* `vex.vex_decision`

```sql
create table vex.vex_decision (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  package_id uuid not null references vex.package(id),
  vuln_id text not null,
  status text not null check (status in (
    'not_affected', 'affected', 'fixed', 'under_investigation'
  )),
  rationale text,
  proof_ref text,              -- CAS digest or URL
  decided_by uuid,
  decided_at timestamptz not null default now(),
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, package_id, vuln_id)
);
```

* For history:

  * Keep the current state in `vex_decision`.
  * Mirror previous versions into a `vex_decision_history` table (same pattern as feature flags).

### 3.3 Triage queues with materialized views

* Example triage view:

```sql
create materialized view vex.mv_triage_queue as
select
  d.tenant_id,
  p.name,
  p.version,
  d.vuln_id,
  d.status,
  d.decided_at
from vex.vex_decision d
join vex.package p on p.id = d.package_id
where d.status = 'under_investigation';
```

* Refresh options:

  * Scheduled refresh (cron/worker).
  * Or **incremental** via triggers (more complex; use only when needed).
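
A scheduled worker can refresh without blocking readers; note that `concurrently` requires a unique index on the view (the index shown is an illustrative choice of key):

```sql
-- One-time: give the view a unique index so concurrent refresh is allowed.
create unique index mv_triage_queue_uq
  on vex.mv_triage_queue (tenant_id, vuln_id, name, version);

-- Periodic job: rebuild without taking an exclusive lock against readers.
refresh materialized view concurrently vex.mv_triage_queue;
```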

### 3.4 RLS for VEX

* All tables scoped by `tenant_id`.
* Typical policy:

```sql
alter table vex.vex_decision enable row level security;

create policy p_vex_tenant on vex.vex_decision
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);
```

---

## 4. Unknowns Module

**Schema:** `unknowns.*`
**Mission:** represent uncertainty and how it changes over time.

### 4.1 Bitemporal unknowns table

* `unknowns.unknown`

```sql
create table unknowns.unknown (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  subject_hash text not null,  -- stable identifier for the "thing" being reasoned about
  kind text not null,          -- e.g. "reachability", "version_inferred"
  context jsonb not null,      -- extra info: call-graph node, evidence, etc.
  valid_from timestamptz not null default now(),
  valid_to timestamptz,
  sys_from timestamptz not null default now(),
  sys_to timestamptz,
  created_at timestamptz not null default now()
);
```

* “Exactly one open unknown per subject/kind” pattern:

```sql
create unique index unknown_one_open_per_subject
  on unknowns.unknown (tenant_id, subject_hash, kind)
  where valid_to is null;
```

### 4.2 Closing an unknown

* Close by setting `valid_to` and `sys_to`:

```sql
update unknowns.unknown
set valid_to = now(), sys_to = now()
where id = :id and valid_to is null;
```

* Never hard‑delete; keep all rows for audit/explanation.

### 4.3 Convenience views

* Current unknowns:

```sql
create view unknowns.current as
select *
from unknowns.unknown
where valid_to is null;
```

### 4.4 RLS

* Same tenant policy as the other modules; unknowns are tenant‑scoped.

---

## 5. Artifact Index / CAS Module

**Schema:** `artifact.*`
**Mission:** index of immutable blobs stored in OCI / S3 / MinIO, etc.

### 5.1 Artifact index

* `artifact.artifact`

```sql
create table artifact.artifact (
  digest text primary key,     -- e.g. "sha256:..."
  tenant_id uuid not null,
  media_type text not null,
  size_bytes bigint not null,
  created_at timestamptz not null default now(),
  created_by uuid
);
```

* Validate the digest shape with a CHECK constraint:

```sql
alter table artifact.artifact
  add constraint chk_digest_format
  check (digest ~ '^sha[0-9]+:[0-9a-fA-F]{32,}$');
```

### 5.2 Signatures and tags

* `artifact.signature`

```sql
create table artifact.signature (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  artifact_digest text not null references artifact.artifact(digest),
  signer text not null,
  signature_payload jsonb not null,
  created_at timestamptz not null default now()
);
```

* `artifact.tag`

```sql
create table artifact.tag (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  name text not null,
  artifact_digest text not null references artifact.artifact(digest),
  created_at timestamptz not null default now(),
  unique (tenant_id, name)
);
```

### 5.3 RLS

* Ensure that tenants cannot see each other’s digests, even if the CAS backing store is shared:

```sql
alter table artifact.artifact enable row level security;

create policy p_artifact_tenant on artifact.artifact
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);
```

---

## 6. Shared Outbox / Event Pattern

**Schema:** `core.*`
**Mission:** reliable events for external side‑effects.

### 6.1 Outbox table

* `core.outbox`

```sql
create table core.outbox (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid,
  aggregate_type text not null,  -- e.g. "vex_decision", "feature_flag"
  aggregate_id uuid,
  topic text not null,
  payload jsonb not null,
  created_at timestamptz not null default now(),
  dispatched_at timestamptz,
  dispatch_attempts int not null default 0,
  error text
);
```

### 6.2 Usage rule

* For anything that must emit an event (webhook, Kafka, notifications):

  * In the **same transaction** as the change:

    * write the primary data (e.g. `vex.vex_decision`),
    * insert an `outbox` row.

  * A background worker:

    * pulls undelivered rows,
    * sends them to the external system,
    * updates `dispatched_at` / `dispatch_attempts` / `error`.

---

## 7. Indexing & Query Patterns per Module

### 7.1 Authority

* Index:

  * `user(tenant_id, email)`
  * `grant(tenant_id, user_id)`
  * `oauth_token(token_hash)`

* Typical query patterns:

  * Look up a user by `tenant_id + email`.
  * All roles/grants for a user; design composite indexes accordingly.

### 7.2 Routing & Flags

* Index:

  * `feature_flag(tenant_id, key)`
  * a partial index on enabled flags:

```sql
create index on routing.feature_flag (tenant_id, key)
  where is_enabled;
```

* `instance(tenant_id, status)`, `instance(tenant_id, domain)`.

### 7.3 VEX

* Index:

  * `package(tenant_id, name, version, ecosystem)`
  * `vex_decision(tenant_id, package_id, vuln_id)`
  * GIN on `vuln_fact.payload` for flexible querying.

### 7.4 Unknowns

* Index:

  * the unique open‑unknown‑per‑subject/kind index (shown above).
  * `unknown(tenant_id, kind)` for filtering by kind.

### 7.5 Artifact

* Index:

  * PK on `digest`.
  * `signature(tenant_id, artifact_digest)`.
  * `tag(tenant_id, name)`.

---

## 8. Transaction & Isolation Guidelines

* Default isolation: **READ COMMITTED**.
* For critical sequences (e.g., provisioning a tenant, bulk role updates):

  * consider **REPEATABLE READ** or **SERIALIZABLE**, and keep transactions short.

* Pattern:

  * One transaction per logical user action (e.g., “set flag”, “record decision”).
  * Never do long‑running external calls inside a database transaction.

---

If you’d like, as a next step I can turn this into:

* concrete `CREATE SCHEMA` + `CREATE TABLE` migration files, and
* a short “How to write queries in each module” cheat‑sheet for devs (with example SELECT/INSERT/UPDATE patterns).

---

Here’s a tight, practical pattern to make your scanner’s vuln‑DB updates rock‑solid even when feeds hiccup:

# Offline, verifiable update bundles (DSSE + Rekor v2)

**Idea:** distribute DB updates as offline tarballs. Each tarball ships with:

* a **DSSE‑signed** statement (e.g., in‑toto style) over the bundle hash
* a **Rekor v2 receipt** proving the signature/statement was logged
* a small **manifest.json** (version, created_at, content hashes)

**Startup flow (happy path):**

1. Load the latest tarball from your local `updates/` cache.
2. Verify the DSSE signature against your trusted public keys.
3. Verify the Rekor v2 receipt (inclusion proof) matches the DSSE payload hash.
4. If both pass, unpack/activate; record the bundle’s **trust_id** (e.g., the statement digest).
5. If anything fails, **keep using the last good bundle**. No service disruption.

**Why this helps**

* **Air‑gap friendly:** no live network needed at activation time.
* **Tamper‑evident:** DSSE + the Rekor receipt prove provenance and transparency.
* **Operational stability:** feed outages become non‑events; the scanner just keeps the last good state.

---

## File layout inside each bundle

```
/bundle-2025-11-29/
  manifest.json            # { version, created_at, entries[], sha256s }
  payload.tar.zst          # the actual DB/indices
  payload.tar.zst.sha256
  statement.dsse.json      # DSSE-wrapped statement over the payload hash
  rekor-receipt.json       # Rekor v2 inclusion/verification material
```

---

## Acceptance/Activation rules

* **Trust root:** pin one (or more) publisher public keys; rotate via a separate, out‑of‑band process.
* **Monotonicity:** only activate if `manifest.version > current.version` (or if trust policy explicitly allows replay for rollback testing).
* **Atomic switch:** unpack to `db/staging/`, validate, then symlink‑flip to `db/active/`.
* **Quarantine on failure:** move bad bundles to `updates/quarantine/` with a reason code.
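
The symlink flip can be done with atomic `rename(2)` semantics; a sketch using GNU coreutils (paths are illustrative):

```shell
# Prepare the new version out of band.
mkdir -p db/staging
# ... unpack and validate the bundle into db/staging ...

# Point a temporary symlink at the validated tree, then atomically
# rename it over db/active; readers always see old or new, never a mix.
ln -sfn "$PWD/db/staging" db/active.new
mv -T db/active.new db/active
```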

---

## Minimal .NET 10 verifier sketch (C#)

```csharp
public sealed record BundlePaths(string Dir) {
  public string Manifest => Path.Combine(Dir, "manifest.json");
  public string Payload  => Path.Combine(Dir, "payload.tar.zst");
  public string Dsse     => Path.Combine(Dir, "statement.dsse.json");
  public string Receipt  => Path.Combine(Dir, "rekor-receipt.json");
}

public async Task<bool> ActivateBundleAsync(BundlePaths b, TrustConfig trust, string activeDir) {
  var manifest = await Manifest.LoadAsync(b.Manifest);
  if (!await Hashes.VerifyAsync(b.Payload, manifest.PayloadSha256)) return false;

  // 1) DSSE verify (publisher keys pinned in trust)
  var (okSig, dssePayloadDigest) = await Dsse.VerifyAsync(b.Dsse, trust.PublisherKeys);
  if (!okSig || dssePayloadDigest != manifest.PayloadSha256) return false;

  // 2) Rekor v2 receipt verify (inclusion + statement digest == dssePayloadDigest)
  if (!await RekorV2.VerifyReceiptAsync(b.Receipt, dssePayloadDigest, trust.RekorPub)) return false;

  // 3) Stage, validate, then atomically flip
  var staging = Path.Combine(activeDir, "..", "staging");
  DirUtil.Empty(staging);
  await TarZstd.ExtractAsync(b.Payload, staging);
  if (!await LocalDbSelfCheck.RunAsync(staging)) return false;

  SymlinkUtil.AtomicSwap(source: staging, target: activeDir);
  State.WriteLastGood(manifest.Version, dssePayloadDigest);
  return true;
}
```

---

## Operational playbook

* **On boot & daily at HH:MM:** try `ActivateBundleAsync()` on the newest bundle; on failure, log and continue.
* **Telemetry (no PII):** reason codes (SIG_FAIL, RECEIPT_FAIL, HASH_MISMATCH, SELFTEST_FAIL), versions, last_good.
* **Keys & rotation:** keep `publisher.pub` and `rekor.pub` in a root‑owned, read‑only path; rotate via a separate signed “trust bundle”.
* **Defense‑in‑depth:** verify both the **payload hash** and each file’s hash listed in `manifest.entries[]`.
* **Rollback:** allow `--force-activate <bundle>` for emergency testing, but mark it as **non‑monotonic** in state.

---

## What to hand your release team

* A Make/CI target that:

  1. Builds `payload.tar.zst` and computes hashes
  2. Generates `manifest.json`
  3. Creates and signs the **DSSE statement**
  4. Submits to Rekor (or your mirror) and saves the **v2 receipt**
  5. Packages the bundle folder and publishes it to your offline repo

* A checksum file (`*.sha256sum`) for ops to verify out‑of‑band.
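
The out-of-band check is standard tooling; a self-contained sketch (filenames and contents are illustrative):

```shell
# Release side: record the digest next to the artifact.
echo "demo bundle contents" > payload.tar.zst
sha256sum payload.tar.zst > bundle.sha256sum

# Ops side: verify before import; exits non-zero on any mismatch.
sha256sum -c bundle.sha256sum
```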

---

If you want, I can turn this into a Stella Ops spec page (`docs/modules/scanner/offline-bundles.md`) plus a small reference implementation (C# library + CLI) that drops right into your Scanner service.

Here’s a “drop‑in” Stella Ops dev guide for **DSSE‑signed Offline Scanner Updates**, written in the same spirit as the existing docs and sprint files.

You can treat this as the seed for `docs/modules/scanner/development/dsse-offline-updates.md` (or similar).

---

# DSSE‑Signed Offline Scanner Updates — Developer Guidelines

> **Audience**
> Scanner, Export Center, Attestor, CLI, and DevOps engineers implementing DSSE‑signed offline vulnerability updates and integrating them into the Offline Update Kit (OUK).
>
> **Context**
>
> * OUK already ships **signed, atomic offline update bundles** with merged vulnerability feeds, container images, and an attested manifest.([git.stella-ops.org][1])
> * DSSE + Rekor is already used for **scan evidence** (SBOM attestations, Rekor proofs).([git.stella-ops.org][2])
> * Sprints 160/162 add **attestation bundles** with manifest, checksums, DSSE signature, and optional transparency log segments, and integrate them into OUK and CLI flows.([git.stella-ops.org][3])

These guidelines tell you how to **wire all of that together** for “offline scanner updates” (feeds, rules, packs) in a way that matches Stella Ops’ determinism + sovereignty promises.

---

## 0. Mental model

At a high level, you’re building this:

```text
Advisory mirrors / Feeds builders
        │
        ▼
ExportCenter.AttestationBundles
  (creates DSSE + Rekor evidence
   for each offline update snapshot)
        │
        ▼
Offline Update Kit (OUK) builder
  (adds feeds + evidence to kit tarball)
        │
        ▼
stella offline kit import / admin CLI
  (verifies Cosign + DSSE + Rekor segments,
   then atomically swaps scanner feeds)
```

Online, Rekor is live; offline, you rely on **bundled Rekor segments / snapshots** and the existing OUK mechanics (import is atomic; old feeds are kept until the new bundle is fully verified).([git.stella-ops.org][1])

---

## 1. Goals & non‑goals

### Goals

1. **Authentic offline snapshots**
   Every offline scanner update (OUK or delta) must be verifiably tied to:

   * a DSSE envelope,
   * a certificate chain rooted in Stella’s Fulcio/KMS profile or a BYO KMS/HSM,
   * *and* a Rekor v2 inclusion proof or bundled log segment.([Stella Ops][4])

2. **Deterministic replay**
   Given:

   * a specific offline update kit (`stella-ops-offline-kit-<DATE>.tgz` + `offline-manifest-<DATE>.json`)([git.stella-ops.org][1])
   * its DSSE attestation bundle + Rekor segments

   every verifier must reach the *same* verdict on integrity and contents, online or fully air‑gapped.

3. **Separation of concerns**

   * Export Center: builds attestation bundles; no business logic about scanning.([git.stella-ops.org][5])
   * Scanner: imports & applies feeds; verifies but does not generate DSSE.
   * Signer / Attestor: own DSSE & Rekor integration.([git.stella-ops.org][2])

4. **Operational safety**

   * Imports remain **atomic and idempotent**.
   * Old feeds stay live until the new update is **fully verified** (Cosign + DSSE + Rekor).([git.stella-ops.org][1])

### Non‑goals

* Designing new crypto or log formats.
* Per‑feed DSSE envelopes (you can add more later, but the minimum contract is **bundle‑level** attestation).

---

## 2. Bundle contract for DSSE‑signed offline updates

You’re extending the existing OUK contract:

* OUK already packs:

  * merged vuln feeds (OSV, GHSA, optional NVD 2.0, CNNVD/CNVD, ENISA, JVN, BDU),
  * container images (`stella-ops`, Zastava, etc.),
  * provenance (Cosign signature, SPDX SBOM, in‑toto SLSA attestation),
  * `offline-manifest.json` + a detached JWS signed during export.([git.stella-ops.org][1])

For **DSSE‑signed offline scanner updates**, add a new logical layer:

### 2.1. Files to ship

Inside each offline kit (full or delta) you must produce:

```text
/attestations/
  offline-update.dsse.json      # DSSE envelope
  offline-update.rekor.json     # Rekor entry + inclusion proof (or segment descriptor)
/manifest/
  offline-manifest.json         # existing manifest
  offline-manifest.json.jws     # existing detached JWS
/feeds/
  ...                           # existing feed payloads
```

The exact paths can be adjusted, but keep:

* **One DSSE bundle per kit** (minimum spec).
* **One canonical Rekor proof file** per DSSE envelope.
|
||||
|
||||
### 2.2. DSSE payload contents (minimal)
|
||||
|
||||
Define (or reuse) a predicate type such as:
|
||||
|
||||
```jsonc
{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "<base64-encoded in-toto statement>",
  "signatures": [
    { "keyid": "<signer key id>", "sig": "<base64 signature>" }
  ]
}
```
Decoded payload (in‑toto statement) should **at minimum** contain:

* **Subject**

  * `name`: `stella-ops-offline-kit-<DATE>.tgz`
  * `digest.sha256`: tarball digest

* **Predicate type** (recommendation)

  * `https://stella-ops.org/attestations/offline-update/1`

* **Predicate fields**

  * `offline_manifest_sha256` – SHA‑256 of `offline-manifest.json`
  * `feeds` – array of feed entries such as `{ name, snapshot_date, archive_digest }` (mirrors the `rules_and_feeds` style used in the moat doc).([Stella Ops][6])
  * `builder` – CI workflow id / git commit / Export Center job id
  * `created_at` – UTC ISO‑8601
  * `oukit_channel` – e.g., `edge`, `stable`, `fips-profile`

**Guideline:** this DSSE payload is the **single canonical description** of “what this offline update snapshot is”.
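
The statement above can be sketched as a small builder. This is a minimal illustration, not the actual Export Center code: the function and helper names are invented, and `createdAt` is passed in so the payload stays deterministic for a given snapshot.

```typescript
import { createHash } from "node:crypto";

interface FeedEntry { name: string; snapshot_date: string; archive_digest: string; }

function sha256Hex(bytes: Buffer): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Compose the in-toto statement for one offline kit snapshot.
// `createdAt` must be UTC ISO-8601 and fixed per snapshot (determinism).
function buildOfflineUpdateStatement(
  kitName: string,
  kitTarball: Buffer,
  manifestJson: Buffer,
  feeds: FeedEntry[],
  builder: string,
  channel: string,
  createdAt: string,
) {
  return {
    _type: "https://in-toto.io/Statement/v1",
    subject: [{ name: kitName, digest: { sha256: sha256Hex(kitTarball) } }],
    predicateType: "https://stella-ops.org/attestations/offline-update/1",
    predicate: {
      offline_manifest_sha256: sha256Hex(manifestJson),
      // Stable feed ordering keeps the payload byte-identical across builds.
      feeds: [...feeds].sort((a, b) => a.name.localeCompare(b.name)),
      builder,
      created_at: createdAt,
      oukit_channel: channel,
    },
  };
}
```

The returned object is what gets serialized, base64‑encoded into the DSSE `payload`, and signed.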
### 2.3. Rekor material

Attestor must:

* Submit `offline-update.dsse.json` to Rekor v2, obtaining:

  * `uuid`
  * `logIndex`
  * inclusion proof (`rootHash`, `hashes`, `checkpoint`)

* Serialize that to `offline-update.rekor.json` and store it in object storage + OUK staging, so it ships in the kit.([git.stella-ops.org][2])

For fully offline operation, either:

* embed a **minimal log segment** containing that entry; or
* rely on daily Rekor snapshot exports included elsewhere in the kit.([git.stella-ops.org][2])
---

## 3. Implementation by module

### 3.1 Export Center — attestation bundles

**Working directory:** `src/ExportCenter/StellaOps.ExportCenter.AttestationBundles`([git.stella-ops.org][7])

**Responsibilities**

1. **Compose attestation bundle job** (EXPORT‑ATTEST‑74‑001)

   * Input: a snapshot identifier (e.g., offline kit build id or feed snapshot date).
   * Read manifest and feed metadata from the Export Center’s storage.([git.stella-ops.org][5])
   * Generate the DSSE payload structure described above.
   * Call `StellaOps.Signer` to wrap it in a DSSE envelope.
   * Call `StellaOps.Attestor` to submit DSSE → Rekor and get the inclusion proof.([git.stella-ops.org][2])
   * Persist:

     * `offline-update.dsse.json`
     * `offline-update.rekor.json`
     * any log segment artifacts.
2. **Integrate into offline kit packaging** (EXPORT‑ATTEST‑74‑002 / 75‑001)

   * The OUK builder (Python script `ops/offline-kit/build_offline_kit.py`) already assembles artifacts & manifests.([Stella Ops][8])
   * Extend that pipeline (or add an Export Center step) to:

     * fetch the attestation bundle for the snapshot,
     * place it under `/attestations/` in the kit staging dir,
     * ensure `offline-manifest.json` contains entries for the DSSE and Rekor files (name, sha256, size, capturedAt).([git.stella-ops.org][1])

3. **Contracts & schemas**

   * Define a small JSON schema for `offline-update.rekor.json` (UUID, index, proof fields) and check it into `docs/11_DATA_SCHEMAS.md` or module‑local schemas.
   * Keep all new payload schemas **versioned**; avoid “shape drift”.
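
A minimal sketch of such a schema, using the Rekor v2 field names from §2.3 (`uuid`, `logIndex`, inclusion proof). The `$id` URL and exact property names are illustrative until the contract is pinned down:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://stella-ops.org/schemas/offline-update-rekor/1",
  "type": "object",
  "required": ["uuid", "logIndex", "inclusionProof"],
  "properties": {
    "uuid": { "type": "string" },
    "logIndex": { "type": "integer", "minimum": 0 },
    "inclusionProof": {
      "type": "object",
      "required": ["rootHash", "hashes", "checkpoint"],
      "properties": {
        "rootHash": { "type": "string" },
        "hashes": { "type": "array", "items": { "type": "string" } },
        "checkpoint": { "type": "string" }
      }
    }
  },
  "additionalProperties": false
}
```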
**Do / Don’t**

* ✅ **Do** treat the attestation bundle job as *pure aggregation* (AOC guardrail: no modification of evidence).([git.stella-ops.org][5])
* ✅ **Do** rely on Signer + Attestor; don’t hand‑roll DSSE/Rekor logic in Export Center.([git.stella-ops.org][2])
* ❌ **Don’t** reach out to external networks from this job — it must run with the same offline‑ready posture as the rest of the platform.
---

### 3.2 Offline Update Kit builder

**Working area:** `ops/offline-kit/*` + `docs/24_OFFLINE_KIT.md`([git.stella-ops.org][1])

Guidelines:

1. **Preserve current guarantees**

   * Imports must remain **idempotent and atomic**, with **old feeds kept until the new bundle is fully verified**. This now includes DSSE/Rekor checks in addition to Cosign + JWS.([git.stella-ops.org][1])

2. **Staging layout**

   * When staging a kit, ensure the tree looks like:

     ```text
     out/offline-kit/staging/
       feeds/...
       images/...
       manifest/offline-manifest.json
       attestations/offline-update.dsse.json
       attestations/offline-update.rekor.json
     ```

   * Update `offline-manifest.json` so each new file appears with:

     * `name`, `sha256`, `size`, `capturedAt`.([git.stella-ops.org][1])
3. **Deterministic ordering**

   * File lists in manifests must be in a stable order (e.g., lexical paths).
   * Timestamps = UTC ISO‑8601 only; never use local time. (Matches determinism guidance in AGENTS.md + policy/runs docs.)([git.stella-ops.org][9])

4. **Delta kits**

   * For deltas (`stella-ouk-YYYY-MM-DD.delta.tgz`), DSSE should still cover:

     * the delta tarball digest,
     * the **logical state** (feeds & versions) after applying the delta.

   * Don’t shortcut by “attesting only the diff files” — the predicate must describe the resulting snapshot.
---

### 3.3 Scanner — import & activation

**Working directories:** `src/Scanner/StellaOps.Scanner.WebService`, `StellaOps.Scanner.Worker`([git.stella-ops.org][9])

Scanner already exposes admin flows for **offline kit import**, which:

* validates the Cosign signature of the kit,
* uses the attested manifest,
* keeps old feeds until verification is done.([git.stella-ops.org][1])

Add DSSE/Rekor awareness as follows:
1. **Verification sequence (happy path)**

   On `import-offline-usage-kit`:

   1. Validate the **Cosign** signature of the tarball.
   2. Validate `offline-manifest.json` with its JWS signature.
   3. Verify **file digests** for all entries (including `/attestations/*`).
   4. Verify **DSSE**:

      * Call `StellaOps.Attestor.Verify` (or the CLI equivalent) with:

        * `offline-update.dsse.json`
        * `offline-update.rekor.json`
        * the local Rekor log snapshot / segment (if configured)([git.stella-ops.org][2])

      * Ensure the payload digest matches the kit tarball + manifest digests.

   5. Only after all checks pass:

      * swap Scanner’s feed pointer to the new snapshot,
      * emit an audit event noting:

        * kit filename, tarball digest,
        * DSSE statement digest,
        * Rekor UUID + log index.
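
The gate ordering above can be sketched as a short sequential pipeline. This is an illustration of the control flow only (the check names and functions are stand‑ins for the real Cosign/JWS/digest/DSSE verifiers):

```typescript
// Each check must pass before the next runs; the feed-pointer swap happens
// only after all of them, so a failure never commits partial state.
type Check = { name: string; run: () => boolean };

function importOfflineKit(
  checks: Check[],
  swapFeeds: () => void,
): { ok: boolean; failedAt?: string } {
  for (const check of checks) {
    if (!check.run()) {
      // Old feeds stay active; report which gate failed for the audit log.
      return { ok: false, failedAt: check.name };
    }
  }
  swapFeeds(); // only reached when every verification succeeded
  return { ok: true };
}
```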
2. **Config surface**

   Add config keys (names illustrative):

   ```yaml
   scanner:
     offlineKit:
       requireDsse: true        # fail import if DSSE/Rekor verification fails
       rekorOfflineMode: true   # use local snapshots only
       attestationVerifier: https://attestor.internal
   ```

   * Mirror them via ASP.NET Core config + env vars (`SCANNER__OFFLINEKIT__REQUIREDSSE`, etc.), following the same pattern as the DSSE/Rekor operator guide.([git.stella-ops.org][2])
3. **Failure behaviour**

   * **DSSE/Rekor fail, Cosign + manifest OK**

     * Keep old feeds active.
     * Mark import as failed; surface a `ProblemDetails` error via API/UI.
     * Log structured fields: `rekorUuid`, `attestationDigest`, `offlineKitHash`, `failureReason`.([git.stella-ops.org][2])

   * **Config flag to soften during rollout**

     * When `requireDsse=false`, treat DSSE/Rekor failure as a warning and still allow the import (for an initial observation phase), but emit alerts. This mirrors the “observe → enforce” pattern in the DSSE/Rekor operator guide.([git.stella-ops.org][2])
---

### 3.4 Signer & Attestor

You mostly **reuse** existing guidance:([git.stella-ops.org][2])

* Add a new predicate type & schema for offline updates in Signer.

* Ensure Attestor:

  * can submit offline‑update DSSE envelopes to Rekor,
  * can expose verification routines (used by CLI and Scanner) that:

    * verify the DSSE signature,
    * check the certificate chain against the configured root pack (FIPS/eIDAS/GOST/SM, etc.),([Stella Ops][4])
    * verify Rekor inclusion using either the live log or a local snapshot.

* For fully air‑gapped installs:

  * rely on Rekor **snapshots mirrored** into the Offline Kit (already recommended in the operator guide’s offline section).([git.stella-ops.org][2])
---

### 3.5 CLI & UI

Extend the CLI with explicit verbs (matching the EXPORT‑ATTEST sprints):([git.stella-ops.org][10])

* `stella attest bundle verify --bundle path/to/offline-kit.tgz --rekor-key rekor.pub`
* `stella attest bundle import --bundle ...` (for sites that prefer a two‑step “verify then import” flow)

Wire the UI Admin → Offline Kit screen so that:

* verification status shows both **Cosign/JWS** and **DSSE/Rekor** state,
* policy banners display kit generation time, manifest hash, and DSSE/Rekor freshness.([Stella Ops][11])
---

## 4. Determinism & offline‑safety rules

When touching any of this code, keep these rules front‑of‑mind (they align with the policy DSL and architecture docs):([Stella Ops][4])

1. **No hidden network dependencies**

   * All verification **must work offline** given the kit + Rekor snapshots.
   * Any fallback to live Rekor / Fulcio endpoints must be explicitly toggled and never on by default in “offline mode”.

2. **Stable serialization**

   * DSSE payload JSON must use:

     * stable ordering of fields,
     * no floating‑point oddities,
     * UTC timestamps.

3. **Replayable imports**

   * Running `import-offline-usage-kit` twice with the same bundle must be a no‑op after the first time.
   * The DSSE payload for a given snapshot must not change over time; if it does, bump the predicate or snapshot version.

4. **Explainability**

   * When verification fails, errors must explain **what** mismatched (kit digest, manifest digest, DSSE envelope hash, Rekor inclusion) so auditors can reason about it.
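
The “stable serialization” rule can be sketched as a canonical JSON serializer that recursively sorts object keys, so the same statement always hashes to the same bytes. This is a simplified illustration; a production implementation would follow a spec such as RFC 8785 (JCS):

```typescript
// Deterministic JSON: arrays keep their order, object keys are sorted,
// scalars use JSON.stringify's standard rendering.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    return "{" + entries
      .map(([k, v]) => JSON.stringify(k) + ":" + canonicalJson(v))
      .join(",") + "}";
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}
```

Hash the output of `canonicalJson` rather than whatever a default serializer emits, and key order in source code stops mattering.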
---

## 5. Testing & CI expectations

Tie this into the existing CI workflows (`scanner-determinism.yml`, `attestation-bundle.yml`, the `offline-kit` pipelines, etc.):([git.stella-ops.org][12])

### 5.1 Unit & integration tests

Write tests that cover:

1. **Happy paths**

   * Full kit import with valid:

     * Cosign,
     * manifest JWS,
     * DSSE,
     * Rekor proof (online and offline modes).

2. **Corruption scenarios**

   * Tampered feed file (hash mismatch).
   * Tampered `offline-manifest.json`.
   * Tampered DSSE payload (signature fails).
   * Mismatched Rekor entry (payload digest doesn’t match DSSE).

3. **Offline scenarios**

   * No network access, only a Rekor snapshot:

     * DSSE verification still passes,
     * Rekor proof validates against the local tree head.
4. **Roll‑back logic**

   * Import fails at the DSSE/Rekor step:

     * scanner DB still points at the previous feeds,
     * metrics/logs show the failure and no partial state.

### 5.2 SLOs & observability

Reuse the metrics suggested by the DSSE/Rekor guide and adapt them to OUK imports:([git.stella-ops.org][2])

* `offlinekit_import_total{status="success|failed_dsse|failed_rekor|failed_cosign"}`
* `offlinekit_attestation_verify_latency_seconds` (histogram)
* `attestor_rekor_success_total`, `attestor_rekor_retry_total`, `rekor_inclusion_latency`
* Dashboards: kit versions per environment, time since last kit, DSSE/Rekor health.
---

## 6. Developer checklist (TL;DR)

When you pick up a task touching DSSE‑signed offline updates:

1. **Read the background**

   * `docs/modules/scanner/operations/dsse-rekor-operator-guide.md`([git.stella-ops.org][2])
   * `docs/24_OFFLINE_KIT.md` (and the public offline kit guide).([git.stella-ops.org][1])
   * The relevant sprint file (`SPRINT_160_export_evidence`, `SPRINT_162_exportcenter_i`, etc.).([git.stella-ops.org][10])

2. **Implement**

   * Generate DSSE payloads in Export Center only.
   * Call Signer & Attestor; persist the DSSE + Rekor JSON next to the manifests.
   * Extend the OUK builder to include the attestation bundle and list it in `offline-manifest.json`.
   * Update the Scanner import flow to verify DSSE/Rekor before swapping feeds.

3. **Test**

   * Unit tests for bundle composition & schema.
   * Integration tests for import + rollback.
   * Determinism tests (same inputs → same DSSE payload).

4. **Wire telemetry**

   * Counters + latency histograms.
   * Logs with `offlineKitHash`, `attestationDigest`, `rekorUuid`.

5. **Document**

   * Update `docs/modules/export-center/architecture.md`, `docs/modules/scanner/architecture.md`, and the OUK docs where flows or contracts changed.([git.stella-ops.org][5])

---
If you tell me which module you’re actually coding in next (Scanner, Export Center, CLI, or Attestor), I can turn this into a very concrete “AGENTS.md‑style” section with exact file paths, class names, and a starter test layout for that module.

[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7bf40b8589c94078e8eadb240553c02f097a5127/docs/24_OFFLINE_KIT.md "docs/24_OFFLINE_KIT.md"
[2]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/13e4b53dda1575ba46c6188c794fd465ec6fdeec/docs/modules/scanner/operations/dsse-rekor-operator-guide.md "docs/modules/scanner/operations/dsse-rekor-operator-guide.md"
[3]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/raw/commit/61f963fd52cd4d6bb2f86afc5a82eac04c04b00e/docs/implplan/SPRINT_162_exportcenter_i.md?utm_source=chatgpt.com "docs/implplan/SPRINT_162_exportcenter_i.md"
[4]: https://stella-ops.org/docs/07_high_level_architecture/index.html?utm_source=chatgpt.com "Stella Ops high‑level architecture"
[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/d870da18ce194c6a5f1a6d71abea36205d9fb276/docs/export-center/architecture.md?utm_source=chatgpt.com "Export Center Architecture"
[6]: https://stella-ops.org/docs/moat/?utm_source=chatgpt.com "Stella Ops moat doc"
[7]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/79b8e53441e92dbc63684f42072434d40b80275f/src/ExportCenter?utm_source=chatgpt.com "src/ExportCenter"
[8]: https://stella-ops.org/docs/24_offline_kit/?utm_source=chatgpt.com "Offline Update Kit (OUK) — Air‑Gap Bundle"
[9]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7768555f2d107326050cc5ff7f5cb81b82b7ce5f/AGENTS.md "AGENTS.md"
[10]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/66cb6c4b8af58a33efa1521b7953dda834431497/docs/implplan/SPRINT_160_export_evidence.md "docs/implplan/SPRINT_160_export_evidence.md"
[11]: https://stella-ops.org/about/?utm_source=chatgpt.com "Signed Reachability · Deterministic Replay · Sovereign Crypto"
[12]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/actions/?actor=0&status=0&workflow=sdk-publish.yml&utm_source=chatgpt.com "Gitea Actions"
Here’s a simple metric that will make your security UI (and teams) radically better: **Time‑to‑Evidence (TTE)** — the time from opening a finding to seeing *raw proof* (a data‑flow edge, an SBOM line, or a VEX note), not a summary.

---

### What it is

* **Definition:** TTE = `t_first_proof_rendered − t_open_finding`.
* **Proof =** the exact artifact or path that justifies the claim (e.g., `package-lock.json: line 214 → openssl@1.1.1`, `reachability: A → B → C sink`, or `VEX: not_affected due to unreachable code`).
* **Target:** **P95 ≤ 15s** (stretch: P99 ≤ 30s). If 95% of findings show proof within 15 seconds, the UI stays honest: evidence before opinion, low noise, fast explainability.

---
### Why it matters

* **Trust:** People accept decisions they can *verify* quickly.
* **Triage speed:** Proof‑first UIs cut back‑and‑forth and guesswork.
* **Noise control:** If you can’t surface proof fast, you probably shouldn’t surface the finding yet.

---

### How to measure (engineering‑ready)

* Emit two stamps per finding view:

  * `t_open_finding` (on route enter or modal open).
  * `t_first_proof_rendered` (first DOM paint of the SBOM line / path list / VEX clause).

* Store as `tte_ms` in a lightweight events table (Postgres) with tags: `tenant`, `finding_id`, `proof_kind` (`sbom|reachability|vex`), `source` (`local|remote|cache`).
* Nightly rollup: compute P50/P90/P95/P99 by `proof_kind` and page.
* Alert when **P95 > 15s** for 15 minutes.

---
### UI contract (keeps the UX honest)

* **Above the fold:** always show a compact **Proof panel** first (not hidden behind tabs).
* **Skeletons over spinners:** reserve space; render partial proof as soon as any piece is ready.
* **Plain‑text copy affordance:** a “Copy SBOM line / path” button right next to the proof.
* **Defer non‑proof widgets:** CVSS badges, remediation prose, and charts load *after* proof.
* **Empty‑state truth:** if no proof exists, say “No proof available yet” and show the loader for *that* proof type only (don’t pretend with summaries).

---

### Backend rules of thumb

* **Pre‑index for first paint:** cache the top N proof items per hot finding (e.g., first SBOM hit + shortest path).
* **Bound queries:** proof queries must be *O(log n)* on indexed columns (package name@version, file hash, graph node id).
* **Chunked streaming:** send the first proof chunk <200 ms after the backend hit; don’t hold for the full set.
* **Timeout budget:** 12s backend budget + 3s UI/render margin = 15s P95.

---
### Minimal contract to add in your code

```ts
// Frontend: fire on open
metrics.emit('finding_open', { findingId, t: performance.now() });

// When the first real proof node/line hits the DOM:
metrics.emit('proof_rendered', { findingId, proofKind, t: performance.now() });
```
```sql
-- Rollup (hourly)
SELECT
  proof_kind,
  percentile_cont(0.95) WITHIN GROUP (ORDER BY tte_ms) AS p95_ms
FROM tte_events
WHERE ts >= now() - interval '1 hour'
GROUP BY proof_kind;
```

---
### What to put on the team dashboard

* **TTE P95 by page** (Findings list, Finding details).
* **TTE P95 by proof_kind** (sbom / reachability / vex).
* **Error budget burn:** minutes over target per day.
* **Top regressions:** last 7 days vs the prior 7.

---
### Acceptance checklist for any finding view

* [ ] First paint shows a real proof snippet (not a summary).
* [ ] A “Copy proof” button works within 1 click.
* [ ] TTE P95 in staging ≤ 10s; in prod ≤ 15s.
* [ ] If proof is missing: explicit empty‑state + retry path.
* [ ] Telemetry sampled ≥ 50% of sessions (or 100% for internal).

---

If you want, I can turn this into a **`docs/UX/tte.md`** spec plus a **Grafana panel JSON** and a tiny **.NET middleware** to emit the metrics—just say the word and I’ll drop in ready‑to‑use snippets.
Perfect, got it — let’s turn **Time‑to‑Evidence (TTE)** into a proper set of *“how we track UX health”* guidelines.

I’ll treat this like something you can drop into an internal engineering/UX doc.

---
## 1. What exactly is Time‑to‑Evidence?

**Definition**

> **TTE = t_first_proof_rendered − t_open_finding**

* **t_open_finding** – when the user first opens a “finding” / detail view (e.g., vulnerability, alert, ticket, log event).
* **t_first_proof_rendered** – when the UI first paints **actual evidence** that backs the finding, for example:

  * the SBOM row showing `package@version`,
  * the call‑graph/data‑flow path to a sink,
  * a VEX note explaining why something is (not) affected,
  * a raw log snippet that the alert is based on.

**Key principle:** TTE measures **how long users have to trust you blindly** before they can see proof with their own eyes.

---
## 2. UX health goals & targets

Treat TTE like latency SLOs:

* **Primary SLO:**

  * **P95 TTE ≤ 15s** for all findings in normal conditions.

* **Stretch SLO:**

  * **P99 TTE ≤ 30s** for heavy cases (big graphs, huge SBOMs, cold caches).

* **Guardrail:**

  * P50 TTE should be **< 3s**. If the median creeps up, you’re in trouble even if P95 looks OK.

You can refine by feature:

* “Simple” proof (single SBOM row, small payload): P95 ≤ 5s.
* “Complex” proof (reachability graph, cross‑repo joins): P95 ≤ 15s.

**UX rule of thumb**

* Under 2s: feels instant.
* 2–10s: acceptable if clearly loading something heavy.
* Over 10s: needs **strong** feedback (progress, partial results, explanations).
* Over 30s: the system should probably **offer a fallback** (e.g., “download raw evidence” or “retry”).
---

## 3. Instrumentation guidelines

### 3.1 Event model

Emit two core events per finding view:

1. **`finding_open`**

   * When the user opens the finding details (route enter / modal open).
   * Must include:

     * `finding_id`
     * `tenant_id` / `org_id`
     * `user_role` (admin, dev, triager, etc.)
     * `entry_point` (list, search, notification, deep link)
     * `ui_version` / `build_sha`
2. **`proof_rendered`**

   * The first time *any* qualifying proof element is painted.
   * Must include:

     * `finding_id`
     * `proof_kind` (`sbom | reachability | vex | logs | other`)
     * `source` (`local_cache | backend_api | 3rd_party`)
     * `proof_height` (e.g., pixel offset from the top) – to ensure it’s actually above the fold or very close.

**Derived metric**

Your telemetry pipeline should compute:

```text
tte_ms = proof_rendered.timestamp - finding_open.timestamp
```

If there are multiple `proof_rendered` events for the same `finding_open`, use:

* **TTE (first proof)** – minimum timestamp; the primary SLO.
* Optionally: **TTE (full evidence)** – the last proof in a defined “bundle” (e.g., path + SBOM row).
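
The first‑proof variant can be sketched as a small reducer over raw events. A minimal illustration, assuming the event shapes above with `performance.now()`‑style timestamps:

```typescript
interface UxEvent {
  kind: "finding_open" | "proof_rendered";
  findingId: string;
  t: number; // ms, monotonic within a tab
}

// TTE (first proof) for one finding view: the earliest proof_rendered
// at or after the finding_open, or null if either side is missing.
function tteFirstProofMs(events: UxEvent[], findingId: string): number | null {
  const scoped = events.filter(e => e.findingId === findingId);
  const open = scoped.find(e => e.kind === "finding_open");
  if (!open) return null;
  const proofTimes = scoped
    .filter(e => e.kind === "proof_rendered" && e.t >= open.t)
    .map(e => e.t);
  return proofTimes.length ? Math.min(...proofTimes) - open.t : null;
}
```

Swapping `Math.min` for `Math.max` over a defined bundle gives the optional “full evidence” variant.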
### 3.2 Implementation notes

**Frontend**

* Emit `finding_open` as soon as:

  * the route is confirmed, and
  * you know which `finding_id` is being displayed.

* Emit `proof_rendered`:

  * **not** when you *fetch* data, but when at least one evidence component is **visibly rendered**.
  * Easiest approach: hook into the component lifecycle / an intersection observer on the evidence container.

Pseudo‑example:
```ts
// On route/mount:
metrics.emit('finding_open', {
  findingId,
  entryPoint,
  userRole,
  uiVersion,
  t: performance.now()
});

// In the EvidencePanel component, after the first render with real data:
if (!hasEmittedProof && hasRealEvidence) {
  metrics.emit('proof_rendered', {
    findingId,
    proofKind: 'sbom',
    source: 'backend_api',
    t: performance.now()
  });
  hasEmittedProof = true;
}
```

**Backend**

* No special requirement beyond:

  * stable IDs (`finding_id`),
  * knowing which API endpoints respond with evidence payloads — you’ll want to correlate backend latency with TTE later.
---

## 4. Data quality & sampling

If you want TTE to drive decisions, the data must be boringly reliable.

**Guidelines**

1. **Sample rate**

   * Start with **100%** in staging.
   * In production, aim for **≥ 25% of sessions** for TTE events at minimum; 100% is ideal if volume is reasonable.

2. **Clock skew**

   * Prefer **frontend timestamps** using `performance.now()` for TTE; they’re monotonic within a tab.
   * Don’t mix backend clocks into the TTE calculation.

3. **Bot / synthetic traffic**

   * Tag synthetic tests (`is_synthetic = true`) and exclude them from UX health dashboards.

4. **Retry behavior**

   * If the proof fails to load and the user hits “retry”:

     * treat it as a separate measurement (`retry = true`), or
     * log an additional `proof_error` event with an error class (timeout, 5xx, network, parse, etc.).

---
## 5. Dashboards: how to watch TTE

You want a small, opinionated set of views that answer:

> “Is UX getting better or worse for people trying to understand findings?”

### 5.1 Core widgets

1. **TTE distribution**

   * P50 / P90 / P95 / P99 per day (or per release).
   * Split by `proof_kind`.

2. **TTE by page / surface**

   * Finding list → detail.
   * Deep links from notifications.
   * Direct URLs / bookmarks.

3. **TTE by user segment**

   * New users vs power users.
   * Different roles (security engineer vs application dev).

4. **Error budget panel**

   * “Minutes over SLO per day” – e.g., the sum of all user‑minutes where TTE > 15s.
   * Use this to prioritize work.

5. **Correlation with engagement**

   * Scatter: TTE vs session length, or TTE vs “user clicked ‘ignore’ / ‘snooze’”.
   * Aim to confirm the obvious: **long TTE → worse engagement/completion**.

### 5.2 Operational details

* Update granularity: **real‑time or ≤15 min** for on‑call/ops panels.
* Retention: at least **90 days** to see trends across big releases.
* Breakdowns:

  * `backend_region` (to catch regional issues),
  * `build_version` (to spot regressions quickly).

---
## 6. UX & engineering design rules anchored in TTE

These are the **behavior rules** for the product that keep TTE healthy.

### 6.1 “Evidence first” layout rules

* **Evidence above the fold**

  * At least *one* proof element must be visible **without scrolling** on a typical laptop viewport.

* **Summary second**

  * CVSS scores, severity badges, long descriptions: all secondary. Evidence should come *before* opinion.

* **No fake proof**

  * Don’t use placeholders that *look* like evidence but aren’t (e.g., “example path” or generic text).
  * If evidence is still loading, show a clear skeleton/loader with “Loading evidence…”.

### 6.2 Loading strategy rules

* Start fetching evidence **as soon as navigation begins**, not after the page is fully mounted.
* Use **lazy loading** for non‑critical widgets until after proof is shown.
* If a call is known to be heavy:

  * consider **precomputing** and caching the top evidence (shortest path, first SBOM hit),
  * stream results: render the first proof item as soon as it arrives; don’t wait for the full list.

### 6.3 Empty / error state rules

* If there is genuinely no evidence:

  * explicitly say **“No supporting evidence available yet”** and treat TTE as:

    * either “no value” (excluded), or
    * a special bucket `proof_kind = "none"`.

* If loading fails:

  * show a clear error and a **retry** that re‑emits `proof_rendered` when successful.
  * log `proof_error` with a reason; track the error rate alongside TTE.

---
## 7. How to *use* TTE in practice

### 7.1 For releases

For any change that affects the findings UI or evidence plumbing:

* Add a release checklist item:

  * “No regression on TTE P95 for [pages X, Y].”

* During rollout:

  * Compare **pre‑ vs post‑release** TTE P95 by `ui_version`.
  * If the regression is > 20%:

    * roll back, or
    * add a follow‑up ticket explicitly tagged with the regression.

### 7.2 For experiments / A/B tests

When running UI experiments around findings:

* Always capture TTE per variant.
* Compare:

  * TTE P50/P95.
  * Task completion rate (e.g., “user changed status”).
  * Subjective UX (CSAT) if you have it.

You’re looking for patterns like:

* Variant B: **+5% completion**, **+8% TTE** → maybe OK.
* Variant C: **+2% completion**, **+70% TTE** → probably not acceptable.

### 7.3 For prioritization

Use TTE as a lever in planning:

* If P95 TTE is healthy and stable:

  * more room for new features / experiments.

* If P95 TTE is trending up for 2+ weeks:

  * time to schedule a “TTE debt” story: caching, query optimization, UI re‑layout, etc.

---
## 8. Quick “TTE‑ready” checklist
|
||||
|
||||
You’re “tracking UX health with TTE” if you can honestly tick these:
|
||||
|
||||
1. **Instrumentation**
|
||||
|
||||
* [ ] `finding_open` + `proof_rendered` events exist and are correlated.
|
||||
* [ ] TTE computed in a stable pipeline (joins, dedupe, etc.).
|
||||
2. **Targets**
|
||||
|
||||
* [ ] TTE SLOs defined (P95, P99) and agreed by UX + engineering.
|
||||
3. **Dashboards**
|
||||
|
||||
* [ ] A dashboard shows TTE by proof kind, page, and release.
|
||||
* [ ] On‑call / ops can see TTE in near real‑time.
|
||||
4. **UX rules**
|
||||
|
||||
* [ ] Evidence is visible above the fold for all main finding types.
|
||||
* [ ] Non‑critical widgets load after evidence.
|
||||
* [ ] Empty/error states are explicit about evidence availability.
|
||||
5. **Process**
|
||||
|
||||
* [ ] Major UI changes check TTE pre vs post as part of release acceptance.
|
||||
* [ ] Regressions in TTE create real tickets, not just “we’ll watch it”.
|
||||
|
||||
---
If you tell me what stack you’re on (e.g., React + Next.js + OpenTelemetry + X observability tool), I can turn this into concrete code snippets and an example dashboard spec (fields, queries, charts) tailored exactly to your setup.
Here’s a tight, practical blueprint to turn your SBOM→VEX links into an auditable “proof spine”—using signed DSSE statements and a per‑dependency trust anchor—so every VEX verdict can be traced, verified, and replayed.

# What this gives you

* A **chain of evidence** from each SBOM entry → analysis → VEX verdict.
* **Tamper‑evident** DSSE‑signed records (offline‑friendly).
* **Deterministic replay**: same inputs → same verdicts (great for audits/regulators).
# Core objects (canonical IDs)

* **ArtifactID**: digest of the package/container (e.g., `sha256:…`).
* **SBOMEntryID**: stable ID for a component in an SBOM (`sbomDigest:package@version[:purl]`).
* **EvidenceID**: hash of raw evidence (scanner JSON, reachability, exploit intel).
* **ReasoningID**: hash of the normalized reasoning (rules/lattice inputs used).
* **VEXVerdictID**: hash of the final VEX statement body.
* **ProofBundleID**: Merkle root of {SBOMEntryID, EvidenceID[], ReasoningID, VEXVerdictID}.
* **TrustAnchorID**: per‑dependency anchor (public key + policy) used to validate the above.
# Signed DSSE envelopes you’ll produce

1. **Evidence Statement** (per evidence item)

   * `subject`: SBOMEntryID
   * `predicateType`: `evidence.stella/v1`
   * `predicate`: source, tool version, timestamps, EvidenceID
   * **Signers**: scanner/ingestor key

2. **Reasoning Statement**

   * `subject`: SBOMEntryID
   * `predicateType`: `reasoning.stella/v1` (your lattice/policy inputs + ReasoningID)
   * **Signers**: “Policy/Lattice Engine” key (Authority)

3. **VEX Verdict Statement**

   * `subject`: SBOMEntryID
   * `predicateType`: CycloneDX or CSAF VEX; `predicate`: the VEX body + VEXVerdictID
   * **Signers**: VEXer key (or the vendor key if you have it)

4. **Proof Spine Statement** (the spine itself)

   * `subject`: SBOMEntryID
   * `predicateType`: `proofspine.stella/v1`
   * `predicate`: EvidenceID[], ReasoningID, VEXVerdictID, ProofBundleID
   * **Signers**: Authority key
# Trust model (per‑dependency anchor)

* **TrustAnchor** (per package/purl): { TrustAnchorID, allowed signers (KMS refs, public keys), accepted predicateTypes, policy version, revocation list }.
* Store anchors in **Authority** and pin them in your graph by SBOMEntryID → TrustAnchorID.
* Optional: PQC mode (Dilithium/Falcon) for long‑term archives.
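A minimal resolver sketch for this model: field names follow the `trust_anchors` table in the storage layout (`purl_pattern`, `allowed_keyids`, `revoked_keys`); glob-style pattern matching via `fnmatch` is an assumption, not a requirement of the design.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch

@dataclass
class TrustAnchor:
    anchor_id: str
    purl_pattern: str            # e.g. "pkg:npm/*" (glob syntax is an assumption)
    allowed_keyids: list[str]
    revoked_keys: list[str] = field(default_factory=list)

def resolve_anchor(purl: str, anchors: list[TrustAnchor]) -> TrustAnchor:
    """Pin an SBOM entry's purl to an anchor; most specific pattern wins."""
    matches = [a for a in anchors if fnmatch(purl, a.purl_pattern)]
    if not matches:
        raise LookupError(f"no trust anchor for {purl}")
    return max(matches, key=lambda a: len(a.purl_pattern))

def key_allowed(anchor: TrustAnchor, keyid: str) -> bool:
    """Explicit trust only: allowed AND not revoked."""
    return keyid in anchor.allowed_keyids and keyid not in anchor.revoked_keys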
# Verification pipeline (deterministic)

1. Resolve SBOMEntryID → TrustAnchorID.
2. Verify every DSSE envelope’s signature **against the anchor’s allowed keys**.
3. Recompute EvidenceID/ReasoningID/VEXVerdictID from the raw content; compare hashes.
4. Recompute ProofBundleID (Merkle root) and compare it to the spine.
5. Emit a **Receipt**: {ProofBundleID, verification log, tool digests}. Cache it.
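Steps 3–4 can be sketched as a pure recompute-and-compare function. The canonical-JSON hashing helper here is illustrative (your real canonicalizer is authoritative), and the spine field names are the ones used throughout this blueprint:

```python
import hashlib
import json

def h(obj) -> str:
    """Illustrative canonical-JSON SHA-256 (sorted keys, compact separators)."""
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

def verify_spine(spine: dict, raw: dict) -> dict:
    """Recompute every ID from raw content and compare to the stored spine."""
    evidence_ids = sorted(h(e) for e in raw["evidence"])
    checks = {
        "evidence": evidence_ids == sorted(spine["evidenceIds"]),
        "reasoning": h(raw["reasoning"]) == spine["reasoningId"],
        "vex": h(raw["vex"]) == spine["vexVerdictId"],
    }
    # Deterministic bundle hash over the recomputed parts.
    bundle = h({"sbomEntryId": spine["sbomEntryId"],
                "evidenceIds": evidence_ids,
                "reasoningId": spine["reasoningId"],
                "vexVerdictId": spine["vexVerdictId"]})
    checks["bundle"] = bundle == spine["proofBundleId"]
    return {"pass": all(checks.values()), "checks": checks}
```

Signature verification (step 2) happens before this, against the anchor’s allowed keys; this function only covers the content-integrity half.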
# Storage layout (Postgres + blob store)

* `sbom_entries(entry_id PK, bom_digest, purl, version, artifact_digest, trust_anchor_id)`
* `dsse_envelopes(env_id PK, entry_id, predicate_type, signer_keyid, body_hash, envelope_blob_ref, signed_at)`
* `spines(entry_id PK, bundle_id, evidence_ids[], reasoning_id, vex_id, anchor_id, created_at)`
* `trust_anchors(anchor_id PK, purl_pattern, allowed_keyids[], policy_ref, revoked_keys[])`
* Blobs (immutable): raw evidence, normalized reasoning JSON, VEX JSON, DSSE bytes.
# API surface (clean and small)

* `POST /proofs/:entry/spine` → submit or update a spine (idempotent by ProofBundleID)
* `GET /proofs/:entry/receipt` → full verification receipt (JSON)
* `GET /proofs/:entry/vex` → the verified VEX body
* `GET /anchors/:anchor` → fetch a trust anchor (for offline kits)
# Normalization rules (so hashes are stable)

* Canonical JSON (UTF‑8, sorted keys, no insignificant whitespace).
* Strip volatile fields (timestamps that aren’t part of the semantic claim).
* Version your schemas: `evidence.stella/v1`, `reasoning.stella/v1`, etc.
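A minimal canonicalizer implementing these rules; the volatile-field set is illustrative, not a fixed list:

```python
import hashlib
import json

VOLATILE = {"collectedAt", "scanDurationMs"}  # illustrative volatile fields

def canonicalize(doc: dict) -> bytes:
    """Canonical JSON: UTF-8, sorted keys, no insignificant whitespace,
    volatile fields stripped recursively."""
    def strip(v):
        if isinstance(v, dict):
            return {k: strip(x) for k, x in v.items() if k not in VOLATILE}
        if isinstance(v, list):
            return [strip(x) for x in v]
        return v
    return json.dumps(strip(doc), sort_keys=True,
                      separators=(",", ":"), ensure_ascii=False).encode("utf-8")

def content_id(doc: dict) -> str:
    """Content-addressed ID: two semantically equal docs hash identically."""
    return "sha256:" + hashlib.sha256(canonicalize(doc)).hexdigest()
```

The same canonical bytes are what you feed the DSSE signer, so hashing and signing can never disagree about the payload.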
# Signing keys & rotation

* Keep keys in your **Authority** module (KMS/HSM; offline export for air‑gap).
* Publish key material via an **attestation feed** (or Rekor mirror) for third‑party audit.
* Rotate by **adding** new allowed_keyids to the TrustAnchor; never mutate old envelopes.

# CI/CD hooks

* On SBOM ingest → create/refresh SBOMEntry rows + attach a TrustAnchor.
* On scan completion → produce Evidence Statements (DSSE) immediately.
* On policy evaluation → produce Reasoning + VEX, then assemble the Spine.
* Gate releases on `GET /proofs/:entry/receipt` == PASS.
# UX (auditor‑friendly)

* **Proof timeline** per entry: SBOM → Evidence tiles → Reasoning → VEX → Receipt.
* One‑click “Recompute & Compare” to show that deterministic replay passes.
* Red/amber flags when a signature no longer matches a TrustAnchor or a key is revoked.

# Minimal dev checklist

* [ ] Implement canonicalizers (Evidence, Reasoning, VEX).
* [ ] Implement DSSE sign/verify (ECDSA + optional PQC).
* [ ] TrustAnchor registry + resolver by purl pattern.
* [ ] Merkle bundling to get ProofBundleID.
* [ ] Receipt generator + verifier.
* [ ] Postgres schema + blob GC (content‑addressed).
* [ ] CI gates + the API endpoints above.
* [ ] Auditor UI: timeline + diff + receipts download.

If you want, I can drop in a ready‑to‑use JSON schema set (`evidence.stella/v1`, `reasoning.stella/v1`, `proofspine.stella/v1`) and sample DSSE envelopes wired to your .NET 10 stack.
Here’s a focused **Stella Ops Developer Guidelines** doc, specifically for the pipeline that turns **SBOM data into verifiable proofs** (your SBOM → Evidence → Reasoning → VEX → Proof Spine).

Feel free to paste this into your internal handbook and tweak the names to match your repos/services.

---

# Stella Ops Developer Guidelines

## Turning SBOM Data Into Verifiable Proofs

---

## 1. Mental Model: What You’re Actually Building

For every component in an SBOM, Stella must be able to answer, *“Why should anyone trust our VEX verdict for this dependency, today and ten years from now?”*

We do that with a pipeline:

1. **SBOM Ingest**
   Raw SBOM → validated → normalized → `SBOMEntryID`.

2. **Evidence Collection**
   Scans, feeds, configs, reachability, etc. → canonical evidence blobs → `EvidenceID` → DSSE-signed.

3. **Reasoning / Policy**
   Policy + evidence → deterministic reasoning → `ReasoningID` → DSSE-signed.

4. **VEX Verdict**
   VEX statement (CycloneDX / CSAF) → canonicalized → `VEXVerdictID` → DSSE-signed.

5. **Proof Spine**
   `{SBOMEntryID, EvidenceIDs[], ReasoningID, VEXVerdictID}` → Merkle bundle → `ProofBundleID` → DSSE-signed.

6. **Verification & Receipts**
   Re-run verification → a `Receipt` that proves everything above is intact and anchored to trusted keys.

Everything you do in this area should keep this spine intact and verifiable.

---
## 2. Non‑Negotiable Invariants

These are the rules you don’t break without an explicit, company-level decision:

1. **Immutability of Signed Facts**

   * DSSE envelopes (evidence, reasoning, VEX, spines) are append‑only.
   * You never edit or delete content inside a previously signed envelope.
   * Corrections are made by **superseding** (a new statement pointing at the old one).

2. **Determinism**

   * Same `{SBOMEntryID, evidence set, policyVersion}` ⇒ same `{ReasoningID, VEXVerdictID, ProofBundleID}`.
   * No non-deterministic inputs (e.g., “current time”, random IDs) in anything that affects IDs or verdicts.

3. **Traceability**

   * Every VEX verdict must be traceable back to:

     * The precise SBOM entry
     * Concrete evidence blobs
     * A specific policy & reasoning snapshot
     * A trust anchor defining the allowed signers

4. **Least Trust / Least Privilege**

   * Services only know the keys and data they need.
   * Trust is always explicit: through **TrustAnchors** and signature verification, never “because it’s in our DB”.

5. **Backwards Compatibility**

   * New code must continue to verify **old proofs**.
   * New policies must **not rewrite history**; they produce *new* spines, leaving old ones intact.

---
## 3. SBOM Ingestion Guidelines

**Goal:** Turn arbitrary SBOMs into stable, addressable `SBOMEntryID`s and safe internal models.

### 3.1 Inputs & Formats

* Support at least:

  * CycloneDX (JSON)
  * SPDX (JSON / Tag-Value)
* For each ingested SBOM, store:

  * The raw SBOM bytes (immutable, content-addressed)
  * A normalized internal representation (your own model)

### 3.2 IDs

* Generate:

  * `sbomDigest` = hash(raw SBOM, canonical form)
  * `SBOMEntryID` = `sbomDigest + purl + version` (or an equivalent stable tuple)
* `SBOMEntryID` must:

  * Not depend on ingestion time or database IDs.
  * Be reproducible from SBOM + deterministic normalization.
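A sketch of that derivation. The newline-joined tuple encoding is an assumption; any fixed, documented encoding works, as long as it is stable and unambiguous:

```python
import hashlib

def sbom_digest(raw_sbom: bytes) -> str:
    """Digest of the stored raw bytes; canonicalize upstream if formats vary."""
    return "sha256:" + hashlib.sha256(raw_sbom).hexdigest()

def sbom_entry_id(raw_sbom: bytes, purl: str, version: str) -> str:
    """Stable tuple digest: no dependence on ingest time or DB row IDs."""
    tuple_bytes = "\n".join([sbom_digest(raw_sbom), purl, version]).encode("utf-8")
    return "sha256:" + hashlib.sha256(tuple_bytes).hexdigest()
```

Because the ID is a function of the SBOM bytes plus the component tuple, re-ingesting the same SBOM on a different day (or into a different database) yields the same `SBOMEntryID`.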
### 3.3 Validation & Errors

* Validate:

  * Syntax (JSON, schema)
  * Core semantics (package identifiers, digests, versions)
* If invalid:

  * Reject the SBOM **but** record a small DSSE “failure attestation” explaining:

    * Why it failed
    * Which file
    * Which system version
  * This still gives you a proof trail for “we tried, and it failed”.

---
## 4. Evidence Collection Guidelines

**Goal:** Capture all inputs that influence the verdict in a canonical, signed form.

Typical evidence types:

* SCA / vuln scanner results
* CVE feeds & advisory data
* Reachability / call-graph analysis
* Runtime context (where this component is used)
* Manual assessments (e.g., security-engineer verdicts)

### 4.1 Evidence Canonicalization

For every evidence item:

* Normalize to a schema like `evidence.stella/v1` with fields such as:

  * `source` (scanner name, feed)
  * `sourceVersion` (tool version, DB version)
  * `collectionTime`
  * `sbomEntryId`
  * `vulnerabilityId` (if applicable)
  * `rawFinding` (or a pointer to it)
* Canonical JSON rules:

  * Sorted keys
  * UTF‑8, no extraneous whitespace
  * No volatile fields beyond what’s semantically needed (e.g., you might include `collectionTime`, but then be aware that it affects the hash, and treat that consciously).

Then:

* Compute `EvidenceID = hash(canonicalEvidenceJson)`.
* Wrap in DSSE:

  * `subject`: `SBOMEntryID`
  * `predicateType`: `evidence.stella/v1`
  * `predicate`: the canonical evidence + `EvidenceID`.
* Sign with the **evidence-ingestor key** (per environment).
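A sketch of the envelope shape, using DSSE’s pre-authentication encoding (PAE) over the payload type and payload. HMAC stands in for the real signer here purely so the example is self-contained; in production this would be ECDSA via your KMS/HSM-backed Authority keys:

```python
import base64
import hashlib
import hmac

def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: what actually gets signed."""
    return b" ".join([b"DSSEv1",
                      str(len(payload_type)).encode(), payload_type.encode(),
                      str(len(payload)).encode(), payload])

def dsse_envelope(payload_type: str, payload: bytes,
                  keyid: str, secret: bytes) -> dict:
    # HMAC is a stand-in signer; swap in ECDSA/KMS for real envelopes.
    sig = hmac.new(secret, pae(payload_type, payload), hashlib.sha256).digest()
    return {"payloadType": payload_type,
            "payload": base64.b64encode(payload).decode(),
            "signatures": [{"keyid": keyid,
                            "sig": base64.b64encode(sig).decode()}]}

def dsse_verify(env: dict, secret: bytes) -> bool:
    """Accept if any signature verifies under the given key material."""
    payload = base64.b64decode(env["payload"])
    expect = hmac.new(secret, pae(env["payloadType"], payload),
                      hashlib.sha256).digest()
    return any(hmac.compare_digest(base64.b64decode(s["sig"]), expect)
               for s in env["signatures"])
```

Signing the PAE rather than the bare payload is what binds the payload type into the signature, so an attacker can’t replay an evidence payload as, say, a reasoning statement.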
### 4.2 Ops Rules

* **Idempotency:**
  Re-running the same scan with the same inputs should produce the same evidence object and `EvidenceID`.
* **Tool changes:**
  When the tool version or configuration changes, that’s a *new* evidence statement with a new `EvidenceID`. Do not overwrite old evidence.
* **Partial failure:**
  If a scan fails, produce a minimal failure evidence record (with error details) instead of “nothing”.

---
## 5. Reasoning & Policy Engine Guidelines

**Goal:** Turn evidence into a defensible, replayable reasoning step with a clear policy version.

### 5.1 Reasoning Object

Define a canonical reasoning schema, e.g. `reasoning.stella/v1`:

* `sbomEntryId`
* `evidenceIds[]` (sorted)
* `policyVersion`
* `inputs`: normalized form of all policy inputs (severity thresholds, lattice rules, etc.)
* `intermediateFindings`: optional but useful — e.g., “reachable vulns = …”

Then:

* Canonicalize the JSON and compute `ReasoningID = hash(canonicalReasoning)`.
* Wrap in DSSE:

  * `subject`: `SBOMEntryID`
  * `predicateType`: `reasoning.stella/v1`
  * `predicate`: the canonical reasoning + `ReasoningID`.
* Sign with the **Policy/Authority key**.

### 5.2 Determinism

* Reasoning functions must be **pure**:

  * Inputs: SBOMEntryID, evidence set, policy version, configuration.
  * No hidden calls to external APIs at decision time (fetch feeds earlier and record them as evidence).
* If you need “current time” in policy:

  * Treat it as an **explicit input** and record it inside the reasoning under `inputs.currentEvaluationTime`.
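A pure evaluation step honoring these rules might look like the sketch below. The schema fields follow `reasoning.stella/v1` above; the “reachable vulns” rule and the `reachable` evidence flag are illustrative, not part of the schema:

```python
import hashlib
import json

def evaluate(sbom_entry_id: str, evidence: list[dict],
             policy_version: str, evaluation_time: str) -> dict:
    """Pure: the output depends only on the explicit arguments.
    'Current time' is an explicit input, recorded in the reasoning."""
    reachable = sorted(e["vulnerabilityId"] for e in evidence
                       if e.get("reachable"))      # illustrative policy rule
    reasoning = {
        "sbomEntryId": sbom_entry_id,
        "evidenceIds": sorted(e["evidenceId"] for e in evidence),
        "policyVersion": policy_version,
        "inputs": {"currentEvaluationTime": evaluation_time},
        "intermediateFindings": {"reachableVulns": reachable},
    }
    blob = json.dumps(reasoning, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")
    return {"reasoning": reasoning,
            "reasoningId": "sha256:" + hashlib.sha256(blob).hexdigest()}
```

Note the sorting of `evidenceIds` and findings: the order in which evidence arrives must not change the `ReasoningID`.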
### 5.3 Policy Evolution

* When policy changes:

  * Bump `policyVersion`.
  * New evaluations produce a new `ReasoningID` and new VEX/spines.
* Don’t retroactively apply new policy to old reasoning objects; generate new ones alongside.

---
## 6. VEX Verdict Guidelines

**Goal:** Generate VEX statements that are strongly tied to SBOM entries and your reasoning.

### 6.1 Shape

* Target standard formats:

  * CycloneDX VEX
  * or CSAF
* Required linkages:

  * Component reference = `SBOMEntryID` or a resolvable component identifier from your SBOM normalization layer.
  * Vulnerability IDs (CVE, GHSA, internal IDs).
  * Status (`not_affected`, `affected`, `fixed`, etc.).
  * Justification & impact.

### 6.2 Canonicalization & Signing

* Define a canonical VEX body schema (a subset of the standard + internal metadata):

  * `sbomEntryId`
  * `vulnerabilityId`
  * `status`
  * `justification`
  * `policyVersion`
  * `reasoningId`
* Canonicalize the JSON → `VEXVerdictID = hash(canonicalVexBody)`.
* DSSE-envelope:

  * `subject`: `SBOMEntryID`
  * `predicateType`: e.g. `cdx-vex.stella/v1`
  * `predicate`: the canonical VEX + `VEXVerdictID`.
* Sign with the **VEXer key** or the vendor key (depending on the trust anchor).
### 6.3 External VEX

* When importing vendor VEX:

  * Verify the signature against the vendor’s TrustAnchor.
  * Canonicalize to your internal schema, but preserve:

    * The original document
    * The original signature material
  * Record “source = vendor” vs “source = stella” so auditors can see the origin.

---
## 7. Proof Spine Guidelines

**Goal:** Build a compact, tamper-evident “bundle” that ties everything together.

### 7.1 Structure

For each `SBOMEntryID`, gather:

* `EvidenceIDs[]` (sorted lexicographically).
* `ReasoningID`.
* `VEXVerdictID`.

Compute:

* A Merkle tree root (or deterministic hash) over:

  * `sbomEntryId`
  * the sorted `EvidenceIDs[]`
  * `ReasoningID`
  * `VEXVerdictID`
* The result is `ProofBundleID`.
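A deterministic bundling sketch. A plain binary Merkle tree is shown; promoting an odd trailing node unchanged is one of several valid conventions, so pick one and document it (changing it later changes every `ProofBundleID`):

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Binary Merkle tree; an odd trailing node is promoted unchanged."""
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = [_h(level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd node: promote to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]

def proof_bundle_id(sbom_entry_id: str, evidence_ids: list[str],
                    reasoning_id: str, vex_verdict_id: str) -> str:
    # Sorting the evidence IDs keeps the root order-insensitive.
    leaves = ([sbom_entry_id.encode()]
              + [e.encode() for e in sorted(evidence_ids)]
              + [reasoning_id.encode(), vex_verdict_id.encode()])
    return "sha256:" + merkle_root(leaves).hex()
```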
Create a DSSE “spine”:

* `subject`: `SBOMEntryID`
* `predicateType`: `proofspine.stella/v1`
* `predicate`:

  * `evidenceIds[]`
  * `reasoningId`
  * `vexVerdictId`
  * `policyVersion`
  * `proofBundleId`
* Sign with the **Authority key**.

### 7.2 Ops Rules

* Spine generation is idempotent:

  * Same inputs → same `ProofBundleID`.
* Never mutate existing spines; new policy or new evidence ⇒ a new spine.
* Keep a clear API contract:

  * `GET /proofs/:entry` returns **all** spines, each labeled with `policyVersion` and timestamps.

---
## 8. Storage & Schema Guidelines

**Goal:** Keep proofs queryable forever without breaking verification.

### 8.1 Tables (conceptual)

* `sbom_entries`: `entry_id`, `bom_digest`, `purl`, `version`, `artifact_digest`, `trust_anchor_id`.
* `dsse_envelopes`: `env_id`, `entry_id`, `predicate_type`, `signer_keyid`, `body_hash`, `envelope_blob_ref`, `signed_at`.
* `spines`: `entry_id`, `proof_bundle_id`, `policy_version`, `evidence_ids[]`, `reasoning_id`, `vex_verdict_id`, `anchor_id`, `created_at`.
* `trust_anchors`: `anchor_id`, `purl_pattern`, `allowed_keyids[]`, `policy_ref`, `revoked_keys[]`.

### 8.2 Schema Changes

Always follow:

1. **Expand**

   * Add new columns/tables.
   * Make new code tolerant of old data.

2. **Backfill**

   * Idempotent jobs that fill in new IDs/fields without touching old DSSE payloads.

3. **Contract**

   * Only after all code uses the new model.
   * Never drop the raw DSSE or raw SBOM blobs.

---
## 9. Verification & Receipts

**Goal:** Make it trivial (for you, customers, and regulators) to recheck everything.

### 9.1 Verification Flow

Given an `SBOMEntryID` or `ProofBundleID`:

1. Fetch the spine and trust anchor.
2. Verify:

   * The spine’s DSSE signature against the TrustAnchor’s allowed keys.
   * The VEX, reasoning, and evidence DSSE signatures.
3. Recompute:

   * `EvidenceIDs` from the stored canonical evidence.
   * `ReasoningID` from the reasoning.
   * `VEXVerdictID` from the VEX body.
   * `ProofBundleID` from the above.
4. Compare to the stored IDs.

Emit a **Receipt**:

* `proofBundleId`
* `verifiedAt`
* `verifiedVersion` → `verifierVersion`
* `anchorId`
* `result` (pass/fail, with reasons)
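A receipt-builder sketch with exactly those fields. The `checks` map and verifier version string are illustrative; `verifiedAt` is passed in explicitly so the builder itself stays pure and testable:

```python
VERIFIER_VERSION = "0.1.0-example"  # illustrative version string

def make_receipt(proof_bundle_id: str, anchor_id: str,
                 checks: dict[str, bool], verified_at: str) -> dict:
    """Receipt names the verifier and anchor, so a later auditor can tell
    exactly what was checked, by what, and against which trust anchor."""
    failures = sorted(name for name, ok in checks.items() if not ok)
    return {
        "proofBundleId": proof_bundle_id,
        "verifiedAt": verified_at,
        "verifierVersion": VERIFIER_VERSION,
        "anchorId": anchor_id,
        "result": {"pass": not failures, "failures": failures},
    }
```

Listing the failing check names (rather than a bare boolean) is what makes a FAIL receipt actionable: the runbook can branch on “signature problem” vs “hash mismatch”.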
### 9.2 Offline Kit

* Provide a minimal CLI (`stella verify`) that:

  * Accepts a bundle export (SBOM + DSSE envelopes + anchors).
  * Verifies everything without network access.

Developers must ensure:

* The export format is documented and stable.
* All fields required for verification are included.

---
## 10. Security & Key Management (for Devs)

* Keys live in **KMS/HSM**, not in env vars or config files.
* Separate keysets:

  * `dev`, `staging`, `prod`
  * Authority vs VEXer vs Evidence Ingestor.
* TrustAnchors:

  * Edit via the Authority service only.
  * Every change:

    * Requires a code-reviewed change.
    * Writes an audit-log entry.

Never:

* Log private keys.
* Log full DSSE envelopes in plaintext logs (log IDs and hashes instead).

---
## 11. Observability & On‑Call Expectations

### 11.1 Metrics

For the SBOM→Proof pipeline, expose:

* `sboms_ingested_total`
* `sbom_ingest_errors_total{reason}`
* `evidence_statements_created_total`
* `reasoning_statements_created_total`
* `vex_statements_created_total`
* `proof_spines_created_total`
* `proof_verifications_total{result}` (pass/fail reason)
* Latency histograms per stage (`_duration_seconds`)

### 11.2 Logging

Include in structured logs wherever relevant:

* `sbomEntryId`
* `proofBundleId`
* `anchorId`
* `policyVersion`
* `requestId` / `traceId`

### 11.3 Runbooks

You should maintain runbooks for at least:

* “Pipeline is stalled” (a backlog of SBOMs, evidence, or spines).
* “Verification failures increased”.
* “Trust anchor or key issues” (rotation, revocation, misconfiguration).
* “Backfill gone wrong” (how to safely stop, resume, and audit).

---
## 12. Dev Workflow & PR Checklist (SBOM→Proof Changes Only)

When your change touches SBOM ingestion, evidence, reasoning, VEX, or proof spines, check:

* [ ] IDs (`SBOMEntryID`, `EvidenceID`, `ReasoningID`, `VEXVerdictID`, `ProofBundleID`) remain **deterministic** and fully specified.
* [ ] No mutation of existing DSSE envelopes or historical proof data.
* [ ] Schema changes follow **expand → backfill → contract**.
* [ ] New/updated TrustAnchors reviewed by the Authority owner.
* [ ] Unit tests cover:

  * Canonicalization for any new/changed predicate.
  * ID computation.
* [ ] An integration test covers:

  * SBOM → Evidence → Reasoning → VEX → Spine → Verification → Receipt.
* [ ] Observability updated:

  * New paths emit logs & metrics.
* [ ] Rollback plan documented (especially for migrations & policy changes).

---

If you tell me which microservices/repos map to these stages (e.g. `stella-sbom-ingest`, `stella-proof-authority`, `stella-vexer`), I can turn this into a more concrete, service‑by‑service checklist with example API contracts and class/interface sketches.