Here’s a crisp, practical way to turn Stella Ops’ “verifiable proof spine” into a moat—and how to measure it.
|
||||
|
||||
# Why this matters (in plain terms)
|
||||
|
||||
Security tools often say “trust me.” You’ll say “prove it”—every finding and every “not‑affected” claim ships with cryptographic receipts anyone can verify.
|
||||
|
||||
---
|
||||
|
||||
# Differentiators to build in
|
||||
|
||||
**1) Bind every verdict to a graph hash**
|
||||
|
||||
* Compute a stable **Graph Revision ID** (Merkle root) over: SBOM nodes, edges, policies, feeds, scan params, and tool versions.
|
||||
* Store the ID on each finding/VEX item; show it in the UI and APIs.
|
||||
* Rule: any data change → new graph hash → new revisioned verdicts.
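
For example, a minimal sketch of that binding, assuming each input has already been normalized to a canonical JSON string (sorted keys, fixed number formatting); the names and layout here are illustrative, not the shipped format:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Sketch of a Graph Revision ID: a Merkle root over the normalized inputs
// (SBOM nodes, edges, policies, feeds, scan params, tool versions).
public static class GraphRevision
{
    public static string Compute(IReadOnlyList<(string Name, string CanonicalJson)> inputs)
    {
        // Leaf hash = SHA-256 over "<name>\n<canonical json>", sorted for stability.
        var leaves = inputs
            .Select(i => SHA256.HashData(Encoding.UTF8.GetBytes($"{i.Name}\n{i.CanonicalJson}")))
            .OrderBy(Convert.ToHexString, StringComparer.Ordinal)
            .ToList();

        // Fold pairs upward until one root remains; an odd leaf is paired with itself.
        while (leaves.Count > 1)
        {
            var next = new List<byte[]>();
            for (var i = 0; i < leaves.Count; i += 2)
            {
                var right = i + 1 < leaves.Count ? leaves[i + 1] : leaves[i];
                next.Add(SHA256.HashData(leaves[i].Concat(right).ToArray()));
            }
            leaves = next;
        }

        return "sha256:" + Convert.ToHexString(leaves[0]).ToLowerInvariant();
    }
}
```

Sorting the leaf hashes keeps the root independent of ingestion order, so only a real data change produces a new revision.
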
**2) Attach machine‑verifiable receipts (in‑toto/DSSE)**
|
||||
|
||||
* For each verdict, emit a **DSSE‑wrapped in‑toto statement**:
|
||||
|
||||
* predicateType: `stellaops.dev/verdict@v1`
|
||||
* includes: graphRevisionId, artifact digests, rule id/version, inputs (CPE/CVE/CVSS), timestamps.
|
||||
* Sign with your **Authority** (Sigstore key, offline mode supported).
|
||||
* Keep receipts queryable and exportable; mirror to Rekor‑compatible ledger when online.
|
||||
|
||||
**3) Add reachability “call‑stack slices” or binary‑symbol proofs**
|
||||
|
||||
* For code‑level reachability, store compact slices: entry → sink, with symbol names + file:line.
|
||||
* For binary-only targets, include **symbol presence proofs** (e.g., Bloom filters + offsets) with executable digest.
|
||||
* Compress and embed a hash of the slice/proof inside the DSSE payload.
|
||||
|
||||
**4) Deterministic replay manifests**
|
||||
|
||||
* Alongside receipts, publish a **Replay Manifest** (inputs, feeds, rule versions, container digests) so any auditor can reproduce the same graph hash and verdicts offline.
|
||||
|
||||
---
|
||||
|
||||
# Benchmarks to publish (make them your headline KPIs)
|
||||
|
||||
**A) False‑positive reduction vs. baseline scanners (%)**
|
||||
|
||||
* Method: run a public corpus (e.g., sample images + app stacks) across 3–4 popular scanners; label ground truth once; compare FP rate.
|
||||
* Report: mean & p95 FP reduction.
|
||||
|
||||
**B) Proof coverage (% of findings with signed evidence)**
|
||||
|
||||
* Definition: `(# findings or VEX items carrying valid DSSE receipts) / (total surfaced items)`.
|
||||
* Break out: runtime‑reachable vs. unreachable, and “not‑affected” claims.
|
||||
|
||||
**C) Triage time saved (p50/p95)**
|
||||
|
||||
* Measure analyst minutes from “alert created” → “final disposition.”
|
||||
* A/B with receipts hidden vs. visible; publish median/p95 deltas.
|
||||
|
||||
**D) Determinism stability**
|
||||
|
||||
* Re-run identical scans N times / across nodes; publish `% identical graph hashes` and drift causes when different.
|
||||
|
||||
---
|
||||
|
||||
# Minimal implementation plan (week‑by‑week)
|
||||
|
||||
**Week 1: primitives**
|
||||
|
||||
* Add Graph Revision ID generator in `scanner.webservice` (Merkle over normalized JSON of SBOM+edges+policies+toolVersions).
|
||||
* Define `VerdictReceipt` schema (protobuf/JSON) and DSSE envelope types.
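
A possible starting shape for that schema, sketched as C# records; property names beyond the fields listed earlier (graph revision, digests, rule id/version, inputs, timestamps) are assumptions rather than a frozen contract:

```csharp
using System;
using System.Collections.Generic;

// Sketch only: a starting point for the VerdictReceipt contract and its DSSE envelope.
public sealed record VerdictReceipt(
    string PredicateType,          // e.g. "stellaops.dev/verdict@v1"
    string GraphRevisionId,        // Merkle root of the evaluated graph
    string VerdictId,
    IReadOnlyList<string> ArtifactDigests,
    string RuleId,
    string RuleVersion,
    IReadOnlyDictionary<string, string> Inputs,   // CPE/CVE/CVSS, etc.
    DateTimeOffset EvaluatedAtUtc);

// DSSE envelope: payload is the base64-encoded, canonicalized statement JSON.
public sealed record DsseSignature(string KeyId, string Sig);

public sealed record DsseEnvelope(
    string PayloadType,            // "application/vnd.in-toto+json"
    string Payload,                // base64(canonical JSON of the statement)
    IReadOnlyList<DsseSignature> Signatures);
```
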
**Week 2: signing + storage**
|
||||
|
||||
* Wire DSSE signing in **Authority**; offline key support + rotation.
|
||||
* Persist receipts in `Receipts` table (Postgres) keyed by `(graphRevisionId, verdictId)`; enable export (JSONL) and ledger mirror.
|
||||
|
||||
**Week 3: reachability proofs**
|
||||
|
||||
* Add call‑stack slice capture in reachability engine; serialize compactly; hash + reference from receipts.
|
||||
* Binary symbol proof module for ELF/PE: symbol bitmap + digest.
|
||||
|
||||
**Week 4: replay + UX**
|
||||
|
||||
* Emit `replay.manifest.json` per scan (inputs, tool digests).
|
||||
* UI: show **“Verified”** badge, graph hash, signature issuer, and a one‑click “Copy receipt” button.
|
||||
* API: `GET /verdicts/{id}/receipt`, `GET /graphs/{rev}/replay`.
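
A minimal-API sketch of those two routes; `IReceiptStore` and `IReplayManifestStore` are placeholder abstractions for whatever persistence Week 2 lands, not existing services:

```csharp
// Program.cs sketch of the two read-only endpoints above.
var builder = WebApplication.CreateBuilder(args);
// builder.Services.AddSingleton<IReceiptStore, ...>();          // wired in the composition root
// builder.Services.AddSingleton<IReplayManifestStore, ...>();
var app = builder.Build();

app.MapGet("/verdicts/{id}/receipt", async (string id, IReceiptStore receipts) =>
    await receipts.FindByVerdictIdAsync(id) is { } receipt
        ? Results.Json(receipt)          // DSSE envelope exactly as persisted
        : Results.NotFound());

app.MapGet("/graphs/{rev}/replay", async (string rev, IReplayManifestStore manifests) =>
    await manifests.FindByGraphRevisionAsync(rev) is { } manifest
        ? Results.Json(manifest)         // replay.manifest.json content
        : Results.NotFound());

app.Run();
```
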
**Week 5: benchmarks harness**
|
||||
|
||||
* Create `bench/` with golden fixtures and a runner:
|
||||
|
||||
* Baseline scanner adapters
|
||||
* Ground‑truth labels
|
||||
* Metrics export (FP%, proof coverage, triage time capture hooks)
|
||||
|
||||
---
|
||||
|
||||
# Developer guardrails (make these non‑negotiable)
|
||||
|
||||
* **No receipt, no ship:** any surfaced verdict must carry a DSSE receipt.
|
||||
* **Schema freeze windows:** changes to rule inputs or policy logic must bump rule version and therefore the graph hash.
|
||||
* **Replay‑first CI:** PRs touching scanning/rules must pass a replay test that reproduces prior graph hashes on gold fixtures (a sketch follows this list).
|
||||
* **Clock safety:** use monotonic time inside receipts; add UTC wall‑time separately.
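
For the replay-first CI guardrail, a sketch of what such a gate could look like as an xUnit test; `ReplayRunner` and the fixture paths are placeholders, not existing APIs:

```csharp
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;
using Xunit;

// Sketch of a replay-first gate: re-run the scan described by a golden replay
// manifest and require the recomputed Graph Revision ID to match.
public class ReplayDeterminismTests
{
    [Theory]
    [InlineData("bench/golden/false_pos_kubernetes.replay.json")]
    public async Task Replaying_a_golden_manifest_reproduces_its_graph_hash(string manifestPath)
    {
        using var manifest = JsonDocument.Parse(await File.ReadAllTextAsync(manifestPath));
        var expected = manifest.RootElement.GetProperty("graphRevisionId").GetString();

        // Re-run with the pinned feeds, rule versions, and tool digests from the manifest.
        var replay = await ReplayRunner.RunAsync(manifestPath);

        Assert.Equal(expected, replay.GraphRevisionId);
    }
}
```
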
---
|
||||
|
||||
# What to show buyers/auditors
|
||||
|
||||
* A short **audit kit**: sample container + your receipts + replay manifest + one command to reproduce the same graph hash.
|
||||
* A one‑page **benchmark readout**: FP reduction, proof coverage, and triage time saved (p50/p95), with corpus description.
|
||||
|
||||
---
|
||||
|
||||
If you want, I’ll draft:
|
||||
|
||||
1. the DSSE `predicate` schema,
|
||||
2. the Postgres DDL for `Receipts` and `Graphs`, and
|
||||
3. a tiny .NET verification CLI (`stellaops-verify`) that replays a manifest and validates signatures.
|
||||
Here’s a focused “developer guidelines” doc just for **Benchmarks for a Testable Security Moat** in Stella Ops.
|
||||
|
||||
---
|
||||
|
||||
# Stella Ops Developer Guidelines
|
||||
|
||||
## Benchmarks for a Testable Security Moat
|
||||
|
||||
> **Goal:** Benchmarks are how we *prove* Stella Ops is better, not just say it is. If a “moat” claim can’t be tied to a benchmark, it doesn’t exist.
|
||||
|
||||
Everything here is about how you, as a developer, design, extend, and run those benchmarks.
|
||||
|
||||
---
|
||||
|
||||
## 1. What our benchmarks must measure
|
||||
|
||||
Every core product claim needs at least one benchmark:
|
||||
|
||||
1. **Detection quality**
|
||||
|
||||
* Precision / recall vs ground truth.
|
||||
* False positives vs popular scanners.
|
||||
* False negatives on known‑bad samples.
|
||||
|
||||
2. **Proof & evidence quality**
|
||||
|
||||
* % of findings with **valid receipts** (DSSE).
|
||||
* % of VEX “not‑affected” with attached proofs.
|
||||
* Reachability proof quality:
|
||||
|
||||
* call‑stack slice present?
|
||||
* symbol proof present for binaries?
|
||||
|
||||
3. **Triage & workflow impact**
|
||||
|
||||
* Time‑to‑decision for analysts (p50/p95).
|
||||
* Click depth and context switches per decision.
|
||||
* “Verified” vs “unverified” verdict triage times.
|
||||
|
||||
4. **Determinism & reproducibility**
|
||||
|
||||
* Same inputs → same **Graph Revision ID**.
|
||||
* Stable verdict sets across runs/nodes.
|
||||
|
||||
> **Rule:** If you add a feature that impacts any of these, you must either hook it into an existing benchmark or add a new one.
|
||||
|
||||
---
|
||||
|
||||
## 2. Benchmark assets and layout
|
||||
|
||||
**2.1 Repo layout (convention)**
|
||||
|
||||
Under `bench/` we maintain everything benchmark‑related:
|
||||
|
||||
* `bench/corpus/`
|
||||
|
||||
* `images/` – curated container images / tarballs.
|
||||
* `repos/` – sample codebases (with known vulns).
|
||||
* `sboms/` – canned SBOMs for edge cases.
|
||||
* `bench/scenarios/`
|
||||
|
||||
* `*.yaml` – scenario definitions (inputs + expected outputs).
|
||||
* `bench/golden/`
|
||||
|
||||
* `*.json` – golden results (expected findings, metrics).
|
||||
* `bench/tools/`
|
||||
|
||||
* adapters for baseline scanners, parsers, helpers.
|
||||
* `bench/scripts/`
|
||||
|
||||
* `run_benchmarks.[sh/cs]` – single entrypoint.
|
||||
|
||||
**2.2 Scenario definition (high‑level)**
|
||||
|
||||
Each scenario yaml should minimally specify:
|
||||
|
||||
* **Inputs**
|
||||
|
||||
* artifact references (image name / path / repo SHA / SBOM file).
|
||||
* environment knobs (features enabled/disabled).
|
||||
* **Ground truth**
|
||||
|
||||
* list of expected vulns (or explicit “none”).
|
||||
* for some: expected reachability (reachable/unreachable).
|
||||
* expected VEX entries (affected / not affected).
|
||||
* **Expectations**
|
||||
|
||||
* required metrics (e.g., “no more than 2 FPs”, “no FNs”).
|
||||
* required proof coverage (e.g., “100% of surfaced findings have receipts”).
|
||||
|
||||
---
|
||||
|
||||
## 3. Core benchmark metrics (developer‑facing definitions)
|
||||
|
||||
Use these consistently across code and docs.
|
||||
|
||||
### 3.1 Detection metrics
|
||||
|
||||
* `true_positive_count` (TP)
|
||||
* `false_positive_count` (FP)
|
||||
* `false_negative_count` (FN)
|
||||
|
||||
Derived:
|
||||
|
||||
* `precision = TP / (TP + FP)`
|
||||
* `recall = TP / (TP + FN)`
|
||||
* For UX: track **FP per asset** and **FP per 100 findings**.
|
||||
|
||||
**Developer guideline:**
|
||||
|
||||
* When you introduce a filter, deduper, or rule tweak:

  * add or modify a scenario where the change **helps** (reduces FP or FN); and
  * keep a different scenario that guards against regressions.
|
||||
|
||||
### 3.2 Moat‑specific metrics
|
||||
|
||||
These are the ones that directly support the “testable moat” story:
|
||||
|
||||
1. **False‑positive reduction vs baseline scanners**
|
||||
|
||||
* Run baseline scanners across our corpus (via adapters in `bench/tools`).
|
||||
* Compute:
|
||||
|
||||
* `baseline_fp_rate`
|
||||
* `stella_fp_rate`
|
||||
* `fp_reduction = (baseline_fp_rate - stella_fp_rate) / baseline_fp_rate`.
|
||||
|
||||
2. **Proof coverage**
|
||||
|
||||
* `proof_coverage_all = findings_with_valid_receipts / total_findings`
|
||||
* `proof_coverage_vex = vex_items_with_valid_receipts / total_vex_items`
|
||||
* `proof_coverage_reachable = reachable_findings_with_proofs / total_reachable_findings`
|
||||
|
||||
3. **Triage time improvement**
|
||||
|
||||
* In test harnesses, simulate or record:
|
||||
|
||||
* `time_to_triage_with_receipts`
|
||||
* `time_to_triage_without_receipts`
|
||||
* Compute median & p95 deltas.
|
||||
|
||||
4. **Determinism**
|
||||
|
||||
* Re‑run the same scenario `N` times:
|
||||
|
||||
* `% runs with identical Graph Revision ID`
|
||||
* `% runs with identical verdict sets`
|
||||
* On mismatch, diff and log cause (e.g., non‑stable sort, non‑pinned feed).
|
||||
|
||||
---
|
||||
|
||||
## 4. How developers should work with benchmarks
|
||||
|
||||
### 4.1 “No feature without benchmarks”
|
||||
|
||||
If you’re adding or changing:
|
||||
|
||||
* graph structure,
|
||||
* rule logic,
|
||||
* scanner integration,
|
||||
* VEX handling,
|
||||
* proof / receipt generation,
|
||||
|
||||
you **must** do *at least one* of:
|
||||
|
||||
1. **Extend an existing scenario**
|
||||
|
||||
* Add expectations that cover your change, or
|
||||
* tighten an existing bound (e.g., lower FP threshold).
|
||||
|
||||
2. **Add a new scenario**
|
||||
|
||||
* For new attack classes / edge cases / ecosystems.
|
||||
|
||||
**Anti‑patterns:**
|
||||
|
||||
* Shipping a new capability with *no* corresponding scenario.
|
||||
* Updating golden outputs without explaining why metrics changed.
|
||||
|
||||
### 4.2 CI gates
|
||||
|
||||
We treat benchmarks as **blocking**:
|
||||
|
||||
* Add a CI job, e.g.:
|
||||
|
||||
* `make bench:quick` on every PR (small subset).
|
||||
* `make bench:full` on main / nightly.
|
||||
* CI fails if:
|
||||
|
||||
* Any scenario marked `strict: true` has:
|
||||
|
||||
* Precision or recall below its threshold.
|
||||
* Proof coverage below its configured threshold.
|
||||
* Global regressions above tolerance:
|
||||
|
||||
* e.g. total FP increases > X% without an explicit override.
|
||||
|
||||
**Developer rule:**
|
||||
|
||||
* If you intentionally change behavior:
|
||||
|
||||
* Update the relevant golden files.
|
||||
* Include a short note in the PR (e.g., `bench-notes.md` snippet) describing:
|
||||
|
||||
* what changed,
|
||||
* why the new result is better, and
|
||||
* which moat metric it improves (FP, proof coverage, determinism, etc.).
|
||||
|
||||
---
|
||||
|
||||
## 5. Benchmark implementation guidelines
|
||||
|
||||
### 5.1 Make benchmarks deterministic
|
||||
|
||||
* **Pin everything**:
|
||||
|
||||
* feed snapshots,
|
||||
* tool container digests,
|
||||
* rule versions,
|
||||
* time windows.
|
||||
* Use **Replay Manifests** as the source of truth:
|
||||
|
||||
* `replay.manifest.json` should contain:
|
||||
|
||||
* input artifacts,
|
||||
* tool versions,
|
||||
* feed versions,
|
||||
* configuration flags.
|
||||
* If a benchmark depends on time:
|
||||
|
||||
* Inject a **fake clock** or explicit “as of” timestamp.
|
||||
|
||||
### 5.2 Keep scenarios small but meaningful
|
||||
|
||||
* Prefer many **focused** scenarios over a few huge ones.
|
||||
* Each scenario should clearly answer:
|
||||
|
||||
* “What property of Stella Ops are we testing?”
|
||||
* “What moat claim does this support?”
|
||||
|
||||
Examples:
|
||||
|
||||
* `bench/scenarios/false_pos_kubernetes.yaml`
|
||||
|
||||
* Focus: config noise reduction vs baseline scanner.
|
||||
* `bench/scenarios/reachability_java_webapp.yaml`
|
||||
|
||||
* Focus: reachable vs unreachable vuln proofs.
|
||||
* `bench/scenarios/vex_not_affected_openssl.yaml`
|
||||
|
||||
* Focus: VEX correctness and proof coverage.
|
||||
|
||||
### 5.3 Use golden outputs, not ad‑hoc assertions
|
||||
|
||||
* Bench harness should:
|
||||
|
||||
* Run Stella Ops on scenario inputs.
|
||||
* Normalize outputs (sorted lists, stable IDs).
|
||||
* Compare to `bench/golden/<scenario>.json`.
|
||||
* Golden file should include:
|
||||
|
||||
* expected findings (id, severity, reachable?, etc.),
|
||||
* expected VEX entries,
|
||||
* expected metrics (precision, recall, coverage).
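
A sketch of that normalize-and-compare step as an xUnit test; `ScenarioRunner` and the exact golden shape are placeholders under the layout described above:

```csharp
using System.IO;
using System.Linq;
using System.Text.Json.Nodes;
using System.Threading.Tasks;
using Xunit;

// Sketch of the golden comparison: run the scenario, normalize ordering, then
// diff against bench/golden/<scenario>.json.
public class GoldenScenarioTests
{
    [Theory]
    [InlineData("vex_not_affected_openssl")]
    public async Task Scenario_output_matches_golden(string scenario)
    {
        var result = await ScenarioRunner.RunAsync($"bench/scenarios/{scenario}.yaml");
        var golden = JsonNode.Parse(await File.ReadAllTextAsync($"bench/golden/{scenario}.json"))!;

        // Normalize: sort finding ids so ordering differences never flip the result.
        var actualIds = result.Findings
            .Select(f => f.Id)
            .OrderBy(id => id, StringComparer.Ordinal)
            .ToArray();

        var expectedIds = golden["findings"]!.AsArray()
            .Select(f => f!["id"]!.GetValue<string>())
            .OrderBy(id => id, StringComparer.Ordinal)
            .ToArray();

        Assert.Equal(expectedIds, actualIds);
    }
}
```
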
---
|
||||
|
||||
## 6. Moat‑critical benchmark types (we must have all of these)
|
||||
|
||||
When you’re thinking about gaps, check that we have:
|
||||
|
||||
1. **Cross‑tool comparison**
|
||||
|
||||
* Same corpus, multiple scanners.
|
||||
* Metrics vs baselines for FP/FN.
|
||||
|
||||
2. **Proof density & quality**
|
||||
|
||||
* Corpus where:
|
||||
|
||||
* some vulns are reachable,
|
||||
* some are not,
|
||||
* some are not present.
|
||||
* Ensure:
|
||||
|
||||
* reachable ones have rich proofs (stack slices / symbol proofs).
|
||||
* non‑reachable or absent ones have:
|
||||
|
||||
* correct disposition, and
|
||||
* clear receipts explaining why.
|
||||
|
||||
3. **VEX accuracy**
|
||||
|
||||
* Scenarios with known SBOM + known vulnerability impact.
|
||||
* Check:
|
||||
|
||||
* VEX “affected”/“not‑affected” matches ground truth.
|
||||
* every VEX entry has a receipt.
|
||||
|
||||
4. **Analyst workflow**
|
||||
|
||||
* Small usability corpus for internal testing:
|
||||
|
||||
* Measure time‑to‑triage with/without receipts.
|
||||
* Use the same scenarios across releases to track improvement.
|
||||
|
||||
5. **Upgrade / drift resistance**
|
||||
|
||||
* Scenarios that are **expected to remain stable** across:
|
||||
|
||||
* rule changes that *shouldn’t* affect outcomes.
|
||||
* feed updates (within a given version window).
|
||||
* These act as canaries for unintended regressions.
|
||||
|
||||
---
|
||||
|
||||
## 7. Developer checklist (TL;DR)
|
||||
|
||||
Before merging a change that touches security logic, ask yourself:
|
||||
|
||||
1. **Is there at least one benchmark scenario that exercises this change?**
|
||||
2. **Does the change improve at least one moat metric, or is it neutral?**
|
||||
3. **Have I run `make bench:quick` locally and checked diffs?**
|
||||
4. **If goldens changed, did I explain why in the PR?**
|
||||
5. **Did I keep benchmarks deterministic (pinned versions, fake time, etc.)?**
|
||||
|
||||
If any answer is “no”, fix that before merging.
|
||||
|
||||
---
|
||||
|
||||
If you’d like, next step I can sketch a concrete `bench/scenarios/*.yaml` and matching `bench/golden/*.json` example that encodes one *specific* moat claim (e.g., “30% fewer FPs than Scanner X on Kubernetes configs”) so your team has a ready‑to-copy pattern.
---
Here’s a condensed **“Stella Ops Developer Guidelines”** based on the official engineering docs and dev guides.
|
||||
|
||||
---
|
||||
|
||||
## 0. Where to start
|
||||
|
||||
* **Dev docs index:** The main entrypoint is `Development Guides & Tooling` (docs/technical/development/README.md). It links to coding standards, test strategy, performance workbook, plug‑in SDK, examples, and more. ([Gitea: Git with a cup of tea][1])
|
||||
* **If a term is unfamiliar:** Check the one‑page *Glossary of Terms* first. ([Stella Ops][2])
|
||||
* **Big picture:** Stella Ops is an SBOM‑first, offline‑ready container security platform; a lot of design decisions (determinism, signatures, policy DSL, SBOM delta scans) flow from that. ([Stella Ops][3])
|
||||
|
||||
---
|
||||
|
||||
## 1. Core engineering principles
|
||||
|
||||
From **Coding Standards & Contributor Guide**: ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
1. **SOLID first** – especially interface & dependency inversion.
|
||||
2. **100‑line file rule** – if a file grows >100 physical lines, split or refactor.
|
||||
3. **Contracts vs runtime** – public DTOs and interfaces live in lightweight `*.Contracts` projects; implementations live in sibling runtime projects.
|
||||
4. **Single composition root** – DI wiring happens in `StellaOps.Web/Program.cs` and each plug‑in’s `IoCConfigurator`. Nothing else creates a service provider.
|
||||
5. **No service locator** – constructor injection only; no global `ServiceProvider` or static service lookups.
|
||||
6. **Fail‑fast startup** – validate configuration *before* the web host starts listening.
|
||||
7. **Hot‑load compatibility** – avoid static singletons that would survive plug‑in unload; don’t manually load assemblies outside the built‑in loader.
|
||||
|
||||
These all serve the product goals of **deterministic, offline, explainable security decisions**. ([Stella Ops][3])
|
||||
|
||||
---
|
||||
|
||||
## 2. Repository layout & layering
|
||||
|
||||
From the repo layout section: ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
* **Top‑level structure (simplified):**
|
||||
|
||||
```text
|
||||
src/
|
||||
backend/
|
||||
StellaOps.Web/ # ASP.NET host + composition root
|
||||
StellaOps.Common/ # logging, helpers
|
||||
StellaOps.Contracts/ # DTO + interface contracts
|
||||
… more runtime projects
|
||||
plugins-sdk/ # plug‑in templates & abstractions
|
||||
frontend/ # Angular workspace
|
||||
tests/ # mirrors src 1‑to‑1
|
||||
```
|
||||
|
||||
* **Rules:**
|
||||
|
||||
* No “Module” folders or nested solution hierarchies.
|
||||
* Tests mirror `src/` structure 1:1; **no test code in production projects**.
|
||||
* New features follow *feature folder* layout (e.g., `Scan/ScanService.cs`, `Scan/ScanController.cs`).
|
||||
|
||||
---
|
||||
|
||||
## 3. Naming, style & language usage
|
||||
|
||||
Key conventions: ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
* **Namespaces:** file‑scoped, `StellaOps.*`.
|
||||
* **Interfaces:** `I` prefix (`IScannerRunner`).
|
||||
* **Classes/records:** PascalCase (`ScanRequest`, `TrivyRunner`).
|
||||
* **Private fields:** `camelCase` (no leading `_`).
|
||||
* **Constants:** `SCREAMING_SNAKE_CASE`.
|
||||
* **Async methods:** end with `Async`.
|
||||
* **Usings:** outside namespace, sorted, no wildcard imports.
|
||||
* **File length:** keep ≤100 lines including `using` and braces (enforced by tooling).
|
||||
|
||||
C# feature usage: ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
* Nullable reference types **on**.
|
||||
* Use `record` for immutable DTOs.
|
||||
* Prefer pattern matching over long `switch` cascades.
|
||||
* `Span`/`Memory` only when you’ve measured that you need them.
|
||||
* Use `await foreach` instead of manual iterator loops.
|
||||
|
||||
Formatting & analysis:
|
||||
|
||||
* `dotnet format` must be clean; StyleCop + security analyzers + CodeQL run in CI and are treated as gates. ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
---
|
||||
|
||||
## 4. Dependency injection, async & concurrency
|
||||
|
||||
DI policy (core + plug‑ins): ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
* Exactly **one composition root** per process (`StellaOps.Web/Program.cs`).
|
||||
* Plug‑ins contribute through:
|
||||
|
||||
* `[ServiceBinding]` attributes for simple bindings, or
|
||||
* An `IoCConfigurator : IDependencyInjectionRoutine` for advanced setups.
|
||||
* Default lifetime is **scoped**. Use singletons only for truly stateless, thread‑safe helpers.
|
||||
* Never use a service locator or manually build nested service providers except in tests.
|
||||
|
||||
Async & threading: ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
* All I/O is async; avoid `.Result` / `.Wait()`.
|
||||
* Library code uses `ConfigureAwait(false)`.
|
||||
* Control concurrency with channels or `Parallel.ForEachAsync`, not ad‑hoc `Task.Run` loops.
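
As an illustration of that last rule, a small bounded-concurrency sketch; the `ScanAsync` call is a stand-in for whatever per-item work you fan out:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Bounded concurrency the approved way: Parallel.ForEachAsync with an explicit
// degree of parallelism, instead of fire-and-forget Task.Run loops.
public static class ScanFanOut
{
    public static async Task ScanAllAsync(
        IReadOnlyList<ScanRequest> requests,
        IScannerRunner scanner,
        CancellationToken cancellationToken)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount,
            CancellationToken = cancellationToken
        };

        await Parallel.ForEachAsync(requests, options, async (request, ct) =>
        {
            // Each body invocation is awaited by the scheduler; no unobserved tasks.
            await scanner.ScanAsync(request, ct).ConfigureAwait(false);
        });
    }
}
```
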
---
|
||||
|
||||
## 5. Tests, tooling & quality gates
|
||||
|
||||
The **Automated Test‑Suite Overview** spells out all CI layers and budgets. ([Gitea: Git with a cup of tea][5])
|
||||
|
||||
**Test layers (high‑level):**
|
||||
|
||||
* Unit tests: xUnit.
|
||||
* Property‑based tests: FsCheck.
|
||||
* Integration:
|
||||
|
||||
* API integration with Testcontainers.
|
||||
* DB/merge flows using Mongo + Redis.
|
||||
* Contracts: gRPC breakage checks with Buf.
|
||||
* Frontend:
|
||||
|
||||
* Unit tests with Jest.
|
||||
* E2E tests with Playwright.
|
||||
* Lighthouse runs for performance & accessibility.
|
||||
* Non‑functional:
|
||||
|
||||
* Load tests via k6.
|
||||
* Chaos experiments (CPU/OOM) using Docker tooling.
|
||||
* Dependency & license scanning.
|
||||
* SBOM reproducibility/attestation checks.
|
||||
|
||||
**Quality gates (examples):** ([Gitea: Git with a cup of tea][5])
|
||||
|
||||
* API unit test line coverage ≥ ~85%.
|
||||
* API P95 latency ≤ ~120 ms in nightly runs.
|
||||
* Δ‑SBOM warm scan P95 ≤ ~5 s on reference hardware.
|
||||
* Lighthouse perf score ≥ ~90, a11y ≥ ~95.
|
||||
|
||||
**Local workflows:**
|
||||
|
||||
* Use `./scripts/dev-test.sh` for “fast” local runs and `--full` for the entire stack (API, UI, Playwright, Lighthouse, etc.). Needs Docker and modern Node. ([Gitea: Git with a cup of tea][5])
|
||||
* Some suites use Mongo2Go + an OpenSSL 1.1 shim; others use a helper script to spin up a local `mongod` for deeper debugging. ([Gitea: Git with a cup of tea][5])
|
||||
|
||||
---
|
||||
|
||||
## 6. Plug‑ins & connectors
|
||||
|
||||
The **Plug‑in SDK Guide** is your bible for schedule jobs, scanner adapters, TLS providers, notification channels, etc. ([Gitea: Git with a cup of tea][6])
|
||||
|
||||
**Basics:**
|
||||
|
||||
* Use `.NET` templates to scaffold:
|
||||
|
||||
```bash
|
||||
dotnet new stellaops-plugin-schedule -n MyPlugin.Schedule --output src
|
||||
```
|
||||
|
||||
* At publish time, copy **signed** artefacts to:
|
||||
|
||||
```text
|
||||
src/backend/Stella.Ops.Plugin.Binaries/<MyPlugin>/
|
||||
MyPlugin.dll
|
||||
MyPlugin.dll.sig
|
||||
```
|
||||
|
||||
* The backend:
|
||||
|
||||
* Verifies the Cosign signature.
|
||||
* Enforces `[StellaPluginVersion]` compatibility.
|
||||
* Loads plug‑ins in isolated `AssemblyLoadContext`s.
|
||||
|
||||
**DI entrypoints:**
|
||||
|
||||
* For simple cases, mark implementations with `[ServiceBinding(typeof(IMyContract), ServiceLifetime.Scoped, …)]`.
|
||||
* For more control, implement `IoCConfigurator : IDependencyInjectionRoutine` and configure services/options in `Register(...)`. ([Gitea: Git with a cup of tea][6])
|
||||
|
||||
**Examples:**
|
||||
|
||||
* **Schedule job:** implement `IJob.ExecuteAsync`, add `[StellaPluginVersion("X.Y.Z")]`, register cron with `services.AddCronJob<MyJob>("0 15 * * *")` (see the sketch after this list).
|
||||
* **Scanner adapter:** implement `IScannerRunner` and register via `services.AddScanner<MyAltScanner>("alt")`; document Docker sidecars if needed. ([Gitea: Git with a cup of tea][6])
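
A sketch of the schedule-job example; `IJob`, `[StellaPluginVersion]`, `AddCronJob`, and `IDependencyInjectionRoutine` are the names used in this guide, but their exact signatures in the Plug-in SDK may differ:

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;

[StellaPluginVersion("1.0.0")]
public sealed class NightlyReportJob : IJob
{
    private readonly ILogger<NightlyReportJob> logger;   // constructor injection only, camelCase field

    public NightlyReportJob(ILogger<NightlyReportJob> logger) => this.logger = logger;

    public Task ExecuteAsync(CancellationToken cancellationToken)
    {
        // Real work goes here; keep it deterministic and offline-friendly.
        logger.LogInformation("Nightly report job triggered");
        return Task.CompletedTask;
    }
}

public sealed class NightlyReportIoCConfigurator : IDependencyInjectionRoutine
{
    public IServiceCollection Register(IServiceCollection services, IConfiguration configuration)
    {
        services.AddCronJob<NightlyReportJob>("0 15 * * *");   // cron registration as shown above
        return services;
    }
}
```
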
**Signing & deployment:**
|
||||
|
||||
* Publish, sign with Cosign, optionally zip:
|
||||
|
||||
```bash
|
||||
dotnet publish -c Release -p:PublishSingleFile=true -o out
|
||||
cosign sign --key $COSIGN_KEY out/MyPlugin.Schedule.dll
|
||||
```
|
||||
|
||||
* Copy into the backend container (e.g., `/opt/plugins/`) and restart.
|
||||
|
||||
* Unsigned DLLs are rejected when `StellaOps:Security:DisableUnsigned=false`. ([Gitea: Git with a cup of tea][6])
|
||||
|
||||
**Marketplace:**
|
||||
|
||||
* Tag releases like `plugin-vX.Y.Z`, attach the signed ZIP, and submit metadata to the community plug‑in index so it shows up in the UI Marketplace. ([Gitea: Git with a cup of tea][6])
|
||||
|
||||
---
|
||||
|
||||
## 7. Policy DSL & security decisions
|
||||
|
||||
For policy authors and tooling engineers, the **Stella Policy DSL (stella‑dsl@1)** doc is key. ([Stella Ops][7])
|
||||
|
||||
**Goals:**
|
||||
|
||||
* Deterministic: same inputs → same findings on every machine.
|
||||
* Declarative: no arbitrary loops, network calls, or clocks.
|
||||
* Explainable: each decision carries rule, inputs, rationale.
|
||||
* Offline‑friendly and reachability‑aware (SBOM + advisories + VEX + reachability). ([Stella Ops][7])
|
||||
|
||||
**Structure:**
|
||||
|
||||
* One `policy` block per `.stella` file, with:
|
||||
|
||||
* `metadata` (description, tags).
|
||||
* `profile` blocks (severity, trust, reachability adjustments).
|
||||
* `rule` blocks (`when` / `then` logic).
|
||||
* Optional `settings`. ([Stella Ops][7])
|
||||
|
||||
**Context & built‑ins:**
|
||||
|
||||
* Namespaces like `sbom`, `advisory`, `vex`, `env`, `telemetry`, `secret`, `profile.*`, etc. ([Stella Ops][7])
|
||||
* Helpers such as `normalize_cvss`, `risk_score`, `vex.any`, `vex.latest`, `sbom.any_component`, `exists`, `coalesce`, and secrets‑specific helpers. ([Stella Ops][7])
|
||||
|
||||
**Rules of thumb:**
|
||||
|
||||
* Always include a clear `because` when you change `status` or `severity`. ([Stella Ops][7])
|
||||
* Avoid catch‑all suppressions (`when true` + `status := "suppressed"`); the linter will flag them. ([Stella Ops][7])
|
||||
* Use `stella policy lint/compile/simulate` in CI and locally; test in sealed (offline) mode to ensure no network dependencies. ([Stella Ops][7])
|
||||
|
||||
---
|
||||
|
||||
## 8. Commits, PRs & docs
|
||||
|
||||
From the commit/PR checklist: ([Gitea: Git with a cup of tea][4])
|
||||
|
||||
Before opening a PR:
|
||||
|
||||
1. Use **Conventional Commit** prefixes (`feat:`, `fix:`, `docs:`, etc.).
|
||||
2. Run `dotnet format` and `dotnet test`; both must be green.
|
||||
3. Keep new/changed files within the 100‑line guideline.
|
||||
4. Update XML‑doc comments for any new public API.
|
||||
5. If you add/change a public contract:
|
||||
|
||||
* Update the relevant markdown docs.
|
||||
* Update JSON schema / API descriptions as needed.
|
||||
6. Ensure static analyzers and CI jobs relevant to your change are passing.
|
||||
|
||||
For new test layers or jobs, also update the test‑suite overview and metrics docs so the CI configuration stays discoverable. ([Gitea: Git with a cup of tea][5])
|
||||
|
||||
---
|
||||
|
||||
## 9. Licensing & reciprocity
|
||||
|
||||
Stella Ops ships under **AGPL‑3.0‑or‑later** with a strong reciprocity clause: ([Stella Ops][8])
|
||||
|
||||
* You may run, study, modify, and redistribute it, including as a hosted service.
|
||||
* If you run a **modified** version for others over a network, you must make that exact source code available to those users.
|
||||
* Official containers are signed and include SBOMs and attestations; verify them with Cosign as described on the license/security pages. ([Stella Ops][8])
|
||||
|
||||
When you build extensions:
|
||||
|
||||
* Keep plug‑ins compatible with AGPL expectations around combined works.
|
||||
* Don’t embed proprietary logic into the core without checking license implications.
|
||||
|
||||
---
|
||||
|
||||
## 10. If you just want a “first contribution” recipe
|
||||
|
||||
A practical path that follows the guidelines:
|
||||
|
||||
1. Clone the main repo; skim **Coding Standards**, **Test Suite Overview**, and the **Dev Guides & Tooling** index. ([Gitea: Git with a cup of tea][4])
|
||||
2. Get `dotnet`, Docker, Node set up; run `./scripts/dev-test.sh` to make sure your environment is healthy. ([Gitea: Git with a cup of tea][5])
|
||||
3. Pick a small issue (docs, small refactor, or new test), make changes respecting:
|
||||
|
||||
* 100‑line files,
|
||||
* DI patterns,
|
||||
* naming & style.
|
||||
4. Add/adjust tests plus any affected docs or JSON schemas.
|
||||
5. Run tests + formatting locally, push, and open a PR with a conventional title and a short “how I tested this” note.
|
||||
|
||||
If you tell me what you’re planning to work on (plug‑in, policy pack, core feature, or UI), I can turn this into a very concrete checklist tailored to that slice of Stella Ops.
|
||||
|
||||
[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/technical/development/README.md "git.stella-ops.org/README.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[2]: https://stella-ops.org/docs/14_glossary_of_terms/?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops"
|
||||
[3]: https://stella-ops.org/docs/05_SYSTEM_REQUIREMENTS_SPEC/?utm_source=chatgpt.com "system requirements specification - Stella Ops – Open • Sovereign ..."
|
||||
[4]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/18_CODING_STANDARDS.md "git.stella-ops.org/18_CODING_STANDARDS.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/19_TEST_SUITE_OVERVIEW.md "git.stella-ops.org/19_TEST_SUITE_OVERVIEW.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[6]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/08b27b8a266c82960c7653797460e1e1d17ecd45/docs/10_PLUGIN_SDK_GUIDE.md "git.stella-ops.org/10_PLUGIN_SDK_GUIDE.md at 08b27b8a266c82960c7653797460e1e1d17ecd45 - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[7]: https://stella-ops.org/docs/policy/dsl/index.html "Stella Ops – Signed Reachability · Deterministic Replay · Sovereign Crypto"
|
||||
[8]: https://stella-ops.org/license/?utm_source=chatgpt.com "AGPL‑3.0‑or‑later - Stella Ops"
---
Here’s a tight, practical pattern to make your scanner’s vuln‑DB updates rock‑solid even when feeds hiccup:
|
||||
|
||||
# Offline, verifiable update bundles (DSSE + Rekor v2)
|
||||
|
||||
**Idea:** distribute DB updates as offline tarballs. Each tarball ships with:
|
||||
|
||||
* a **DSSE‑signed** statement (e.g., in‑toto style) over the bundle hash
|
||||
* a **Rekor v2 receipt** proving the signature/statement was logged
|
||||
* a small **manifest.json** (version, created_at, content hashes)
|
||||
|
||||
**Startup flow (happy path):**
|
||||
|
||||
1. Load latest tarball from your local `updates/` cache.
|
||||
2. Verify DSSE signature against your trusted public keys.
|
||||
3. Verify Rekor v2 receipt (inclusion proof) matches the DSSE payload hash.
|
||||
4. If both pass, unpack/activate; record the bundle’s **trust_id** (e.g., statement digest).
|
||||
5. If anything fails, **keep using the last good bundle**. No service disruption.
|
||||
|
||||
**Why this helps**
|
||||
|
||||
* **Air‑gap friendly:** no live network needed at activation time.
|
||||
* **Tamper‑evident:** DSSE + Rekor receipt proves provenance and transparency.
|
||||
* **Operational stability:** feed outages become non‑events—scanner just keeps the last good state.
|
||||
|
||||
---
|
||||
|
||||
## File layout inside each bundle
|
||||
|
||||
```
|
||||
/bundle-2025-11-29/
|
||||
manifest.json # { version, created_at, entries[], sha256s }
|
||||
payload.tar.zst # the actual DB/indices
|
||||
payload.tar.zst.sha256
|
||||
statement.dsse.json # DSSE-wrapped statement over payload hash
|
||||
rekor-receipt.json # Rekor v2 inclusion/verification material
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Acceptance/Activation rules
|
||||
|
||||
* **Trust root:** pin one (or more) publisher public keys; rotate via separate, out‑of‑band process.
|
||||
* **Monotonicity:** only activate if `manifest.version > current.version` (or if trust policy explicitly allows replay for rollback testing).
|
||||
* **Atomic switch:** unpack to `db/staging/`, validate, then symlink‑flip to `db/active/`.
|
||||
* **Quarantine on failure:** move bad bundles to `updates/quarantine/` with a reason code.
|
||||
|
||||
---
|
||||
|
||||
## Minimal .NET 10 verifier sketch (C#)
|
||||
|
||||
```csharp
|
||||
public sealed record BundlePaths(string Dir) {
|
||||
public string Manifest => Path.Combine(Dir, "manifest.json");
|
||||
public string Payload => Path.Combine(Dir, "payload.tar.zst");
|
||||
public string Dsse => Path.Combine(Dir, "statement.dsse.json");
|
||||
public string Receipt => Path.Combine(Dir, "rekor-receipt.json");
|
||||
}
|
||||
|
||||
public async Task<bool> ActivateBundleAsync(BundlePaths b, TrustConfig trust, string activeDir) {
|
||||
var manifest = await Manifest.LoadAsync(b.Manifest);
|
||||
if (!await Hashes.VerifyAsync(b.Payload, manifest.PayloadSha256)) return false;
|
||||
|
||||
// 1) DSSE verify (publisher keys pinned in trust)
|
||||
var (okSig, dssePayloadDigest) = await Dsse.VerifyAsync(b.Dsse, trust.PublisherKeys);
|
||||
if (!okSig || dssePayloadDigest != manifest.PayloadSha256) return false;
|
||||
|
||||
// 2) Rekor v2 receipt verify (inclusion + statement digest == dssePayloadDigest)
|
||||
if (!await RekorV2.VerifyReceiptAsync(b.Receipt, dssePayloadDigest, trust.RekorPub)) return false;
|
||||
|
||||
// 3) Stage, validate, then atomically flip
|
||||
var staging = Path.Combine(activeDir, "..", "staging");
|
||||
DirUtil.Empty(staging);
|
||||
await TarZstd.ExtractAsync(b.Payload, staging);
|
||||
if (!await LocalDbSelfCheck.RunAsync(staging)) return false;
|
||||
|
||||
SymlinkUtil.AtomicSwap(source: staging, target: activeDir);
|
||||
State.WriteLastGood(manifest.Version, dssePayloadDigest);
|
||||
return true;
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Operational playbook
|
||||
|
||||
* **On boot & daily at HH:MM:** try `ActivateBundleAsync()` on the newest bundle; on failure, log and continue.
|
||||
* **Telemetry (no PII):** reason codes (SIG_FAIL, RECEIPT_FAIL, HASH_MISMATCH, SELFTEST_FAIL), versions, last_good.
|
||||
* **Keys & rotation:** keep `publisher.pub` and `rekor.pub` in a root‑owned, read‑only path; rotate via a separate signed “trust bundle”.
|
||||
* **Defense‑in‑depth:** verify both the **payload hash** and each file’s hash listed in `manifest.entries[]`.
|
||||
* **Rollback:** allow `--force-activate <bundle>` for emergency testing, but mark as **non‑monotonic** in state.
|
||||
|
||||
---
|
||||
|
||||
## What to hand your release team
|
||||
|
||||
* A Make/CI target that:
|
||||
|
||||
1. Builds `payload.tar.zst` and computes hashes
|
||||
2. Generates `manifest.json`
|
||||
3. Creates and signs the **DSSE statement**
|
||||
4. Submits to Rekor (or your mirror) and saves the **v2 receipt**
|
||||
5. Packages the bundle folder and publishes to your offline repo
|
||||
* A checksum file (`*.sha256sum`) for ops to verify out‑of‑band.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can turn this into a Stella Ops spec page (`docs/modules/scanner/offline-bundles.md`) plus a small reference implementation (C# library + CLI) that drops right into your Scanner service.
|
||||
Here’s a “drop‑in” Stella Ops dev guide for **DSSE‑signed Offline Scanner Updates** — written in the same spirit as the existing docs and sprint files.
|
||||
|
||||
You can treat this as the seed for `docs/modules/scanner/development/dsse-offline-updates.md` (or similar).
|
||||
|
||||
---
|
||||
|
||||
# DSSE‑Signed Offline Scanner Updates — Developer Guidelines
|
||||
|
||||
> **Audience**
|
||||
> Scanner, Export Center, Attestor, CLI, and DevOps engineers implementing DSSE‑signed offline vulnerability updates and integrating them into the Offline Update Kit (OUK).
|
||||
>
|
||||
> **Context**
|
||||
>
|
||||
> * OUK already ships **signed, atomic offline update bundles** with merged vulnerability feeds, container images, and an attested manifest.([git.stella-ops.org][1])
|
||||
> * DSSE + Rekor is already used for **scan evidence** (SBOM attestations, Rekor proofs).([git.stella-ops.org][2])
|
||||
> * Sprints 160/162 add **attestation bundles** with manifest, checksums, DSSE signature, and optional transparency log segments, and integrate them into OUK and CLI flows.([git.stella-ops.org][3])
|
||||
|
||||
These guidelines tell you how to **wire all of that together** for “offline scanner updates” (feeds, rules, packs) in a way that matches Stella Ops’ determinism + sovereignty promises.
|
||||
|
||||
---
|
||||
|
||||
## 0. Mental model
|
||||
|
||||
At a high level, you’re building this:
|
||||
|
||||
```text
|
||||
Advisory mirrors / Feeds builders
|
||||
│
|
||||
▼
|
||||
ExportCenter.AttestationBundles
|
||||
(creates DSSE + Rekor evidence
|
||||
for each offline update snapshot)
|
||||
│
|
||||
▼
|
||||
Offline Update Kit (OUK) builder
|
||||
(adds feeds + evidence to kit tarball)
|
||||
│
|
||||
▼
|
||||
stella offline kit import / admin CLI
|
||||
(verifies Cosign + DSSE + Rekor segments,
|
||||
then atomically swaps scanner feeds)
|
||||
```
|
||||
|
||||
Online, Rekor is live; offline, you rely on **bundled Rekor segments / snapshots** and the existing OUK mechanics (import is atomic, old feeds kept until new bundle is fully verified).([git.stella-ops.org][1])
|
||||
|
||||
---
|
||||
|
||||
## 1. Goals & non‑goals
|
||||
|
||||
### Goals
|
||||
|
||||
1. **Authentic offline snapshots**
|
||||
Every offline scanner update (OUK or delta) must be verifiably tied to:
|
||||
|
||||
* a DSSE envelope,
|
||||
* a certificate chain rooted in Stella’s Fulcio/KMS profile or BYO KMS/HSM,
|
||||
* *and* a Rekor v2 inclusion proof or bundled log segment.([Stella Ops][4])
|
||||
|
||||
2. **Deterministic replay**
|
||||
Given:
|
||||
|
||||
* a specific offline update kit (`stella-ops-offline-kit-<DATE>.tgz` + `offline-manifest-<DATE>.json`)([git.stella-ops.org][1])
|
||||
* its DSSE attestation bundle + Rekor segments
|
||||
every verifier must reach the *same* verdict on integrity and contents — online or fully air‑gapped.
|
||||
|
||||
3. **Separation of concerns**
|
||||
|
||||
* Export Center: build attestation bundles, no business logic about scanning.([git.stella-ops.org][5])
|
||||
* Scanner: import & apply feeds; verify but not generate DSSE.
|
||||
* Signer / Attestor: own DSSE & Rekor integration.([git.stella-ops.org][2])
|
||||
|
||||
4. **Operational safety**
|
||||
|
||||
* Imports remain **atomic and idempotent**.
|
||||
* Old feeds stay live until the new update is **fully verified** (Cosign + DSSE + Rekor).([git.stella-ops.org][1])
|
||||
|
||||
### Non‑goals
|
||||
|
||||
* Designing new crypto or log formats.
|
||||
* Per‑feed DSSE envelopes (you can have more later, but the minimum contract is **bundle‑level** attestation).
|
||||
|
||||
---
|
||||
|
||||
## 2. Bundle contract for DSSE‑signed offline updates
|
||||
|
||||
You’re extending the existing OUK contract:
|
||||
|
||||
* OUK already packs:
|
||||
|
||||
* merged vuln feeds (OSV, GHSA, optional NVD 2.0, CNNVD/CNVD, ENISA, JVN, BDU),
|
||||
* container images (`stella-ops`, Zastava, etc.),
|
||||
* provenance (Cosign signature, SPDX SBOM, in‑toto SLSA attestation),
|
||||
* `offline-manifest.json` + detached JWS signed during export.([git.stella-ops.org][1])
|
||||
|
||||
For **DSSE‑signed offline scanner updates**, add a new logical layer:
|
||||
|
||||
### 2.1. Files to ship
|
||||
|
||||
Inside each offline kit (full or delta) you must produce:
|
||||
|
||||
```text
|
||||
/attestations/
|
||||
offline-update.dsse.json # DSSE envelope
|
||||
offline-update.rekor.json # Rekor entry + inclusion proof (or segment descriptor)
|
||||
/manifest/
|
||||
offline-manifest.json # existing manifest
|
||||
offline-manifest.json.jws # existing detached JWS
|
||||
/feeds/
|
||||
... # existing feed payloads
|
||||
```
|
||||
|
||||
The exact paths can be adjusted, but keep:
|
||||
|
||||
* **One DSSE bundle per kit** (min spec).
|
||||
* **One canonical Rekor proof file** per DSSE envelope.
|
||||
|
||||
### 2.2. DSSE payload contents (minimal)
|
||||
|
||||
Define (or reuse) a predicate type (recommended below) and wrap the signed statement in a standard DSSE envelope:

```jsonc
{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "<base64-encoded in-toto statement>",
  "signatures": [{ "keyid": "…", "sig": "…" }]
}
```
|
||||
|
||||
Decoded payload (in-toto statement) should **at minimum** contain:
|
||||
|
||||
* **Subject**
|
||||
|
||||
* `name`: `stella-ops-offline-kit-<DATE>.tgz`
|
||||
* `digest.sha256`: tarball digest
|
||||
|
||||
* **Predicate type** (recommendation)
|
||||
|
||||
* `https://stella-ops.org/attestations/offline-update/1`
|
||||
|
||||
* **Predicate fields**
|
||||
|
||||
* `offline_manifest_sha256` – SHA‑256 of `offline-manifest.json`
|
||||
* `feeds` – array of feed entries such as `{ name, snapshot_date, archive_digest }` (mirrors `rules_and_feeds` style used in the moat doc).([Stella Ops][6])
|
||||
* `builder` – CI workflow id / git commit / Export Center job id
|
||||
* `created_at` – UTC ISO‑8601
|
||||
* `oukit_channel` – e.g., `edge`, `stable`, `fips-profile`
|
||||
|
||||
**Guideline:** this DSSE payload is the **single canonical description** of “what this offline update snapshot is”.
|
||||
|
||||
### 2.3. Rekor material
|
||||
|
||||
Attestor must:
|
||||
|
||||
* Submit `offline-update.dsse.json` to Rekor v2, obtaining:
|
||||
|
||||
* `uuid`
|
||||
* `logIndex`
|
||||
* inclusion proof (`rootHash`, `hashes`, `checkpoint`)
|
||||
* Serialize that to `offline-update.rekor.json` and store it in object storage + OUK staging, so it ships in the kit.([git.stella-ops.org][2])
|
||||
|
||||
For fully offline operation:
|
||||
|
||||
* Either:
|
||||
|
||||
* embed a **minimal log segment** containing that entry; or
|
||||
* rely on daily Rekor snapshot exports included elsewhere in the kit.([git.stella-ops.org][2])
|
||||
|
||||
---
|
||||
|
||||
## 3. Implementation by module
|
||||
|
||||
### 3.1 Export Center — attestation bundles
|
||||
|
||||
**Working directory:** `src/ExportCenter/StellaOps.ExportCenter.AttestationBundles`([git.stella-ops.org][7])
|
||||
|
||||
**Responsibilities**
|
||||
|
||||
1. **Compose attestation bundle job** (EXPORT‑ATTEST‑74‑001)
|
||||
|
||||
* Input: a snapshot identifier (e.g., offline kit build id or feed snapshot date).
|
||||
* Read manifest and feed metadata from the Export Center’s storage.([git.stella-ops.org][5])
|
||||
* Generate the DSSE payload structure described above.
|
||||
* Call `StellaOps.Signer` to wrap it in a DSSE envelope.
|
||||
* Call `StellaOps.Attestor` to submit DSSE → Rekor and get the inclusion proof.([git.stella-ops.org][2])
|
||||
* Persist:
|
||||
|
||||
* `offline-update.dsse.json`
|
||||
* `offline-update.rekor.json`
|
||||
* any log segment artifacts.
|
||||
|
||||
2. **Integrate into offline kit packaging** (EXPORT‑ATTEST‑74‑002 / 75‑001)
|
||||
|
||||
* The OUK builder (Python script `ops/offline-kit/build_offline_kit.py`) already assembles artifacts & manifests.([Stella Ops][8])
|
||||
* Extend that pipeline (or add an Export Center step) to:
|
||||
|
||||
* fetch the attestation bundle for the snapshot,
|
||||
* place it under `/attestations/` in the kit staging dir,
|
||||
* ensure `offline-manifest.json` contains entries for the DSSE and Rekor files (name, sha256, size, capturedAt).([git.stella-ops.org][1])
|
||||
|
||||
3. **Contracts & schemas**
|
||||
|
||||
* Define a small JSON schema for `offline-update.rekor.json` (UUID, index, proof fields) and check it into `docs/11_DATA_SCHEMAS.md` or module‑local schemas.
|
||||
* Keep all new payload schemas **versioned**; avoid “shape drift”.
|
||||
|
||||
**Do / Don’t**
|
||||
|
||||
* ✅ **Do** treat attestation bundle job as *pure aggregation* (AOC guardrail: no modification of evidence).([git.stella-ops.org][5])
|
||||
* ✅ **Do** rely on Signer + Attestor; don’t hand‑roll DSSE/Rekor logic in Export Center.([git.stella-ops.org][2])
|
||||
* ❌ **Don’t** reach out to external networks from this job — it must run with the same offline‑ready posture as the rest of the platform.
|
||||
|
||||
---
|
||||
|
||||
### 3.2 Offline Update Kit builder
|
||||
|
||||
**Working area:** `ops/offline-kit/*` + `docs/24_OFFLINE_KIT.md`([git.stella-ops.org][1])
|
||||
|
||||
Guidelines:
|
||||
|
||||
1. **Preserve current guarantees**
|
||||
|
||||
* Imports must remain **idempotent and atomic**, with **old feeds kept until the new bundle is fully verified**. This now includes DSSE/Rekor checks in addition to Cosign + JWS.([git.stella-ops.org][1])
|
||||
|
||||
2. **Staging layout**
|
||||
|
||||
* When staging a kit, ensure the tree looks like:
|
||||
|
||||
```text
|
||||
out/offline-kit/staging/
|
||||
feeds/...
|
||||
images/...
|
||||
manifest/offline-manifest.json
|
||||
attestations/offline-update.dsse.json
|
||||
attestations/offline-update.rekor.json
|
||||
```
|
||||
|
||||
* Update `offline-manifest.json` so each new file appears with:
|
||||
|
||||
* `name`, `sha256`, `size`, `capturedAt`.([git.stella-ops.org][1])
|
||||
|
||||
3. **Deterministic ordering**
|
||||
|
||||
* File lists in manifests must be in a stable order (e.g., lexical paths).
|
||||
* Timestamps = UTC ISO‑8601 only; never use local time. (Matches determinism guidance in AGENTS.md + policy/runs docs.)([git.stella-ops.org][9])
|
||||
|
||||
4. **Delta kits**
|
||||
|
||||
* For deltas (`stella-ouk-YYYY-MM-DD.delta.tgz`), DSSE should still cover:
|
||||
|
||||
* the delta tarball digest,
|
||||
* the **logical state** (feeds & versions) after applying the delta.
|
||||
* Don’t shortcut by “attesting only the diff files” — the predicate must describe the resulting snapshot.
|
||||
|
||||
---
|
||||
|
||||
### 3.3 Scanner — import & activation
|
||||
|
||||
**Working directory:** `src/Scanner/StellaOps.Scanner.WebService`, `StellaOps.Scanner.Worker`([git.stella-ops.org][9])
|
||||
|
||||
Scanner already exposes admin flows for:
|
||||
|
||||
* **Offline kit import**, which:
|
||||
|
||||
* validates the Cosign signature of the kit,
|
||||
* uses the attested manifest,
|
||||
* keeps old feeds until verification is done.([git.stella-ops.org][1])
|
||||
|
||||
Add DSSE/Rekor awareness as follows:
|
||||
|
||||
1. **Verification sequence (happy path)**
|
||||
|
||||
On `import-offline-usage-kit`:
|
||||
|
||||
1. Validate **Cosign** signature of the tarball.
|
||||
2. Validate `offline-manifest.json` with its JWS signature.
|
||||
3. Verify **file digests** for all entries (including `/attestations/*`).
|
||||
4. Verify **DSSE**:
|
||||
|
||||
* Call `StellaOps.Attestor.Verify` (or CLI equivalent) with:
|
||||
|
||||
* `offline-update.dsse.json`
|
||||
* `offline-update.rekor.json`
|
||||
* local Rekor log snapshot / segment (if configured)([git.stella-ops.org][2])
|
||||
* Ensure the payload digest matches the kit tarball + manifest digests.
|
||||
5. Only after all checks pass:
|
||||
|
||||
* swap Scanner’s feed pointer to the new snapshot,
|
||||
* emit an audit event noting:
|
||||
|
||||
* kit filename, tarball digest,
|
||||
* DSSE statement digest,
|
||||
* Rekor UUID + log index.
|
||||
|
||||
2. **Config surface**
|
||||
|
||||
Add config keys (names illustrative):
|
||||
|
||||
```yaml
|
||||
scanner:
|
||||
offlineKit:
|
||||
requireDsse: true # fail import if DSSE/Rekor verification fails
|
||||
rekorOfflineMode: true # use local snapshots only
|
||||
attestationVerifier: https://attestor.internal
|
||||
```
|
||||
|
||||
* Mirror them via ASP.NET Core config + env vars (`SCANNER__OFFLINEKIT__REQUIREDSSE`, etc.), following the same pattern as the DSSE/Rekor operator guide.([git.stella-ops.org][2])
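
A sketch of binding those keys to an options class; names mirror the illustrative YAML above and are not a frozen contract:

```csharp
using System;

// Options class for the illustrative scanner.offlineKit keys above.
public sealed class ScannerOfflineKitOptions
{
    public const string SectionName = "scanner:offlineKit";

    public bool RequireDsse { get; init; } = true;        // fail import on DSSE/Rekor failure
    public bool RekorOfflineMode { get; init; } = true;   // use local Rekor snapshots only
    public Uri? AttestationVerifier { get; init; }        // e.g. https://attestor.internal
}

// In the composition root (illustrative):
//   services.AddOptions<ScannerOfflineKitOptions>()
//           .BindConfiguration(ScannerOfflineKitOptions.SectionName)
//           .ValidateOnStart();
```
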
3. **Failure behaviour**
|
||||
|
||||
* **DSSE/Rekor fail, Cosign + manifest OK**
|
||||
|
||||
* Keep old feeds active.
|
||||
* Mark import as failed; surface a `ProblemDetails` error via API/UI.
|
||||
* Log structured fields: `rekorUuid`, `attestationDigest`, `offlineKitHash`, `failureReason`.([git.stella-ops.org][2])
|
||||
|
||||
* **Config flag to soften during rollout**
|
||||
|
||||
* When `requireDsse=false`, treat DSSE/Rekor failure as a warning and still allow the import (for initial observation phase), but emit alerts. This mirrors the “observe → enforce” pattern in the DSSE/Rekor operator guide.([git.stella-ops.org][2])
|
||||
|
||||
---
|
||||
|
||||
### 3.4 Signer & Attestor
|
||||
|
||||
You mostly **reuse** existing guidance:([git.stella-ops.org][2])
|
||||
|
||||
* Add a new predicate type & schema for offline updates in Signer.
|
||||
|
||||
* Ensure Attestor:
|
||||
|
||||
* can submit offline‑update DSSE envelopes to Rekor,
|
||||
* can emit verification routines (used by CLI and Scanner) that:
|
||||
|
||||
* verify the DSSE signature,
|
||||
* check the certificate chain against the configured root pack (FIPS/eIDAS/GOST/SM, etc.),([Stella Ops][4])
|
||||
* verify Rekor inclusion using either live log or local snapshot.
|
||||
|
||||
* For fully air‑gapped installs:
|
||||
|
||||
* rely on Rekor **snapshots mirrored** into Offline Kit (already recommended in the operator guide’s offline section).([git.stella-ops.org][2])
|
||||
|
||||
---
|
||||
|
||||
### 3.5 CLI & UI
|
||||
|
||||
Extend CLI with explicit verbs (matching EXPORT‑ATTEST sprints):([git.stella-ops.org][10])
|
||||
|
||||
* `stella attest bundle verify --bundle path/to/offline-kit.tgz --rekor-key rekor.pub`
|
||||
* `stella attest bundle import --bundle ...` (for sites that prefer a two‑step “verify then import” flow)
|
||||
* Wire UI Admin → Offline Kit screen so that:
|
||||
|
||||
* verification status shows both **Cosign/JWS** and **DSSE/Rekor** state,
|
||||
* policy banners display kit generation time, manifest hash, and DSSE/Rekor freshness.([Stella Ops][11])
|
||||
|
||||
---
|
||||
|
||||
## 4. Determinism & offline‑safety rules
|
||||
|
||||
When touching any of this code, keep these rules front‑of‑mind (they align with the policy DSL and architecture docs):([Stella Ops][4])
|
||||
|
||||
1. **No hidden network dependencies**
|
||||
|
||||
* All verification **must work offline** given the kit + Rekor snapshots.
|
||||
* Any fallback to live Rekor / Fulcio endpoints must be explicitly toggled and never on by default for “offline mode”.
|
||||
|
||||
2. **Stable serialization**
|
||||
|
||||
* DSSE payload JSON:
|
||||
|
||||
* stable ordering of fields,
|
||||
* no float weirdness,
|
||||
* UTC timestamps.
|
||||
|
||||
3. **Replayable imports**
|
||||
|
||||
* Running `import-offline-usage-kit` twice with the same bundle must be a no‑op after the first time.
|
||||
* The DSSE payload for a given snapshot must not change over time; if it does, bump the predicate or snapshot version.
|
||||
|
||||
4. **Explainability**
|
||||
|
||||
* When verification fails, errors must explain **what** mismatched (kit digest, manifest digest, DSSE envelope hash, Rekor inclusion) so auditors can reason about it.
|
||||
|
||||
---
|
||||
|
||||
## 5. Testing & CI expectations
|
||||
|
||||
Tie this into the existing CI workflows (`scanner-determinism.yml`, `attestation-bundle.yml`, `offline-kit` pipelines, etc.):([git.stella-ops.org][12])
|
||||
|
||||
### 5.1 Unit & integration tests
|
||||
|
||||
Write tests that cover:
|
||||
|
||||
1. **Happy paths**
|
||||
|
||||
* Full kit import with valid:
|
||||
|
||||
* Cosign,
|
||||
* manifest JWS,
|
||||
* DSSE,
|
||||
* Rekor proof (online and offline modes).
|
||||
|
||||
2. **Corruption scenarios**
|
||||
|
||||
* Tampered feed file (hash mismatch).
|
||||
* Tampered `offline-manifest.json`.
|
||||
* Tampered DSSE payload (signature fails).
|
||||
* Mismatched Rekor entry (payload digest doesn’t match DSSE).
|
||||
|
||||
3. **Offline scenarios**
|
||||
|
||||
* No network access, only Rekor snapshot:
|
||||
|
||||
* DSSE verification still passes,
|
||||
* Rekor proof validates against local tree head.
|
||||
|
||||
4. **Roll‑back logic**
|
||||
|
||||
* Import fails at DSSE/Rekor step:
|
||||
|
||||
* scanner DB still points at previous feeds,
|
||||
* metrics/logs show failure and no partial state.
|
||||
|
||||
### 5.2 SLOs & observability
|
||||
|
||||
Reuse metrics suggested by DSSE/Rekor guide and adapt to OUK imports:([git.stella-ops.org][2])
|
||||
|
||||
* `offlinekit_import_total{status="success|failed_dsse|failed_rekor|failed_cosign"}`
|
||||
* `offlinekit_attestation_verify_latency_seconds` (histogram)
|
||||
* `attestor_rekor_success_total`, `attestor_rekor_retry_total`, `rekor_inclusion_latency`
|
||||
* Dashboards: kit versions per environment, time since last kit, DSSE/Rekor health.
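
A sketch of emitting the import metrics above with `System.Diagnostics.Metrics`; the meter name and tag keys are illustrative:

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Emits the counters/histograms listed above; wire the meter into your
// OpenTelemetry exporter in the composition root.
public sealed class OfflineKitImportMetrics
{
    private static readonly Meter Meter = new("StellaOps.Scanner.OfflineKit");

    private static readonly Counter<long> ImportTotal =
        Meter.CreateCounter<long>("offlinekit_import_total");

    private static readonly Histogram<double> VerifyLatencySeconds =
        Meter.CreateHistogram<double>("offlinekit_attestation_verify_latency_seconds");

    public void RecordImport(string status, double verifySeconds)
    {
        // status: success | failed_dsse | failed_rekor | failed_cosign
        ImportTotal.Add(1, new KeyValuePair<string, object?>("status", status));
        VerifyLatencySeconds.Record(verifySeconds);
    }
}
```
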
---
|
||||
|
||||
## 6. Developer checklist (TL;DR)
|
||||
|
||||
When you pick up a task touching DSSE‑signed offline updates:
|
||||
|
||||
1. **Read the background**
|
||||
|
||||
* `docs/modules/scanner/operations/dsse-rekor-operator-guide.md`([git.stella-ops.org][2])
|
||||
* `docs/24_OFFLINE_KIT.md` (and public offline kit guide).([git.stella-ops.org][1])
|
||||
* Relevant sprint file (`SPRINT_160_export_evidence`, `SPRINT_162_exportcenter_i`, etc.).([git.stella-ops.org][10])
|
||||
|
||||
2. **Implement**
|
||||
|
||||
* Generate DSSE payloads in Export Center only.
|
||||
* Call Signer & Attestor; persist DSSE + Rekor JSON next to manifests.
|
||||
* Extend OUK builder to include attestation bundle and list it in `offline-manifest.json`.
|
||||
* Update Scanner import flow to verify DSSE/Rekor before swapping feeds.
|
||||
|
||||
3. **Test**
|
||||
|
||||
* Unit tests for bundle composition & schema.
|
||||
* Integration tests for import + rollback.
|
||||
* Determinism tests (same inputs → same DSSE payload).
|
||||
|
||||
4. **Wire telemetry**
|
||||
|
||||
* Counters + latency histograms.
|
||||
* Logs with `offlineKitHash`, `attestationDigest`, `rekorUuid`.
|
||||
|
||||
5. **Document**
|
||||
|
||||
* Update `docs/modules/export-center/architecture.md`, `docs/modules/scanner/architecture.md`, and the OUK docs where flows or contracts changed.([git.stella-ops.org][5])
|
||||
|
||||
---
|
||||
|
||||
If you tell me which module you’re actually coding in next (Scanner, Export Center, CLI, or Attestor), I can turn this into a very concrete “AGENTS.md‑style” section with exact file paths, class names, and a starter test layout for that module.
|
||||
|
||||
[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7bf40b8589c94078e8eadb240553c02f097a5127/docs/24_OFFLINE_KIT.md "git.stella-ops.org/24_OFFLINE_KIT.md at 7bf40b8589c94078e8eadb240553c02f097a5127 - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[2]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/13e4b53dda1575ba46c6188c794fd465ec6fdeec/docs/modules/scanner/operations/dsse-rekor-operator-guide.md "git.stella-ops.org/dsse-rekor-operator-guide.md at 13e4b53dda1575ba46c6188c794fd465ec6fdeec - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[3]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/raw/commit/61f963fd52cd4d6bb2f86afc5a82eac04c04b00e/docs/implplan/SPRINT_162_exportcenter_i.md?utm_source=chatgpt.com "https://git.stella-ops.org/stella-ops.org/git.stel..."
|
||||
[4]: https://stella-ops.org/docs/07_high_level_architecture/index.html?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops"
|
||||
[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/d870da18ce194c6a5f1a6d71abea36205d9fb276/docs/export-center/architecture.md?utm_source=chatgpt.com "Export Center Architecture - Stella Ops"
|
||||
[6]: https://stella-ops.org/docs/moat/?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops"
|
||||
[7]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/79b8e53441e92dbc63684f42072434d40b80275f/src/ExportCenter?utm_source=chatgpt.com "Code - Stella Ops"
|
||||
[8]: https://stella-ops.org/docs/24_offline_kit/?utm_source=chatgpt.com "Offline Update Kit (OUK) — Air‑Gap Bundle - Stella Ops – Open ..."
|
||||
[9]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7768555f2d107326050cc5ff7f5cb81b82b7ce5f/AGENTS.md "git.stella-ops.org/AGENTS.md at 7768555f2d107326050cc5ff7f5cb81b82b7ce5f - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[10]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/66cb6c4b8af58a33efa1521b7953dda834431497/docs/implplan/SPRINT_160_export_evidence.md?utm_source=chatgpt.com "git.stella-ops.org/SPRINT_160_export_evidence.md at ..."
|
||||
[11]: https://stella-ops.org/about/?utm_source=chatgpt.com "Signed Reachability · Deterministic Replay · Sovereign Crypto"
|
||||
[12]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/actions/?actor=0&status=0&workflow=sdk-publish.yml&utm_source=chatgpt.com "Actions - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
|
||||
Here’s a crisp, opinionated storage blueprint you can hand to your Stella Ops devs right now, plus zero‑downtime conversion tactics so you can keep prototyping fast without painting yourself into a corner.
|
||||
|
||||
# Module → store map (deterministic by default)
|
||||
|
||||
* **Authority / OAuth / Accounts & Audit**
|
||||
|
||||
* **PostgreSQL** as the primary source of truth.
|
||||
* Tables: `users`, `clients`, `oauth_tokens`, `roles`, `grants`, `audit_log`.
|
||||
* **Row‑Level Security (RLS)** on `users`, `grants`, `audit_log`; **STRICT FK + CHECK** constraints; **immutable UUID PKs**.
|
||||
* **Audit**: `audit_log(actor_id, action, entity, entity_id, at timestamptz default now(), diff jsonb)`.
|
||||
* **Why**: ACID + RLS keeps authz decisions and audit trails deterministic and reviewable.
|
||||
|
||||
* **VEX & Vulnerability Writes**
|
||||
|
||||
* **PostgreSQL** with **JSONB facts + relational decisions**.
|
||||
* Tables: `vuln_fact(jsonb)`, `vex_decision(package_id, vuln_id, status, rationale, proof_ref, updated_at)`.
|
||||
* **Materialized views** for triage queues, e.g. `mv_triage_hotset` (refresh on commit or scheduled).
|
||||
* **Why**: JSONB lets you ingest vendor‑shaped docs; decisions stay relational for joins, integrity, and explainability.
|
||||
|
||||
* **Routing / Feature Flags / Rate‑limits**
|
||||
|
||||
* **PostgreSQL** (truth) + **Redis** (cache).
|
||||
* Tables: `feature_flag(key, rules jsonb, version)`, `route(domain, service, instance_id, last_heartbeat)`, `rate_limiter(bucket, quota, interval)`.
|
||||
* Redis keys: `flag:{key}:{version}`, `route:{domain}`, `rl:{bucket}` with short TTLs.
|
||||
* **Why**: one canonical RDBMS for consistency; Redis for hot path latency.
|
||||
|
||||
* **Unknowns Registry (ambiguity tracker)**
|
||||
|
||||
* **PostgreSQL** with **temporal tables** (bitemporal pattern via `valid_from/valid_to`, `sys_from/sys_to`).
|
||||
* Table: `unknowns(subject_hash, kind, context jsonb, valid_from, valid_to, sys_from default now(), sys_to)`.
|
||||
* Views: `unknowns_current` where `valid_to is null`.
|
||||
* **Why**: preserves how/when uncertainty changed (critical for proofs and audits).
|
||||
|
||||
* **Artifacts / SBOM / VEX files**
|
||||
|
||||
* **OCI‑compatible CAS** (e.g., self‑hosted registry or MinIO bucket as content‑addressable store).
|
||||
* Keys by **digest** (`sha256:...`), metadata in a Postgres `artifact_index` table with `digest`, `media_type`, `size`, `signatures`.
|
||||
* **Why**: blobs don’t belong in your RDBMS; use CAS for scale + cryptographic addressing.
|
||||
|
||||
---
|
||||
|
||||
# PostgreSQL implementation essentials (copy/paste starters)
|
||||
|
||||
* **RLS scaffold (Authority)**:
|
||||
|
||||
```sql
|
||||
alter table audit_log enable row level security;
|
||||
create policy p_audit_read_self
|
||||
on audit_log for select
|
||||
using (actor_id = current_setting('app.user_id')::uuid or
|
||||
exists (select 1 from grants g where g.user_id = current_setting('app.user_id')::uuid and g.role = 'auditor'));
|
||||
```
|
||||
|
||||
* **JSONB facts + relational decisions**:
|
||||
|
||||
```sql
|
||||
create table vuln_fact (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
source text not null,
|
||||
payload jsonb not null,
|
||||
received_at timestamptz default now()
|
||||
);
|
||||
|
||||
create table vex_decision (
|
||||
package_id uuid not null,
|
||||
vuln_id text not null,
|
||||
status text check (status in ('not_affected','affected','fixed','under_investigation')),
|
||||
rationale text,
|
||||
proof_ref text,
|
||||
decided_at timestamptz default now(),
|
||||
primary key (package_id, vuln_id)
|
||||
);
|
||||
```
|
||||
|
||||
* **Materialized view for triage**:
|
||||
|
||||
```sql
|
||||
create materialized view mv_triage_hotset as
|
||||
select v.id as fact_id, v.payload->>'vuln' as vuln, v.received_at
|
||||
from vuln_fact v
|
||||
where (now() - v.received_at) < interval '7 days';
|
||||
-- refresh concurrently via job
|
||||
```
|
||||
|
||||
* **Temporal pattern (Unknowns)**:
|
||||
|
||||
```sql
|
||||
create table unknowns (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
subject_hash text not null,
|
||||
kind text not null,
|
||||
context jsonb not null,
|
||||
valid_from timestamptz not null default now(),
|
||||
valid_to timestamptz,
|
||||
sys_from timestamptz not null default now(),
|
||||
sys_to timestamptz
|
||||
);
|
||||
|
||||
create view unknowns_current as
|
||||
select * from unknowns where valid_to is null;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Conversion (not migration): zero‑downtime, prototype‑friendly
|
||||
|
||||
Even if you’re “not migrating anything yet,” set these rails now so cutting over later is painless.
|
||||
|
||||
1. **Encode Mongo‑shaped docs into JSONB with versioned schemas**
|
||||
|
||||
* Ingest pipeline writes to `*_fact(payload jsonb, schema_version int)`.
|
||||
* Add a **`validate(schema_version, payload)`** step in your service layer (JSON Schema or SQL checks).
|
||||
* Keep a **forward‑compatible view** that projects stable columns from JSONB (e.g., `payload->>'id' as vendor_id`) so downstream code doesn’t break when payload evolves.
|
||||
|
||||
2. **Outbox pattern for exactly‑once side‑effects**
|
||||
|
||||
* Add `outbox(id, topic, key, payload jsonb, created_at, dispatched bool default false)`.
|
||||
* On the same transaction as your write, insert the outbox row.
|
||||
* A background dispatcher reads `dispatched=false`, publishes to MQ/Webhook, then marks `dispatched=true`.
|
||||
* Guarantees: no lost events, no duplicates to external systems.
|
||||
|
||||
3. **Parallel read adapters behind feature flags**
|
||||
|
||||
* Keep old readers (e.g., Mongo driver) and new Postgres readers in the same service.
|
||||
* Gate by `feature_flag('pg_reads')` per tenant or env; flip gradually.
|
||||
* Add a **read‑diff monitor** that compares results and logs mismatches to `audit_log(diff)`.
|
||||
|
||||
4. **CDC for analytics without coupling**
|
||||
|
||||
* Enable **logical replication** (pgoutput) on your key tables.
|
||||
* Stream changes into analyzers (reachability, heuristics) without hitting primaries.
|
||||
* This lets you keep OLTP clean and still power dashboards/tests.
|
||||
|
||||
5. **Materialized views & job cadence**
|
||||
|
||||
* Refresh `mv_*` on a fixed cadence (e.g., every 2–5 minutes) or post‑commit for hot paths.
|
||||
* Keep **“cold path”** analytics in separate schemas (`analytics.*`) sourced from CDC.
|
||||
|
||||
6. **Cutover playbook (phased)**
|
||||
|
||||
* Phase A (Dark Read): write Postgres, still serve from Mongo; compare results silently.
|
||||
* Phase B (Shadow Serve): 5–10% traffic from Postgres via flag; auto‑rollback switch.
|
||||
* Phase C (Authoritative): Postgres becomes source; Mongo path left for emergency read‑only.
|
||||
* Phase D (Retire): freeze Mongo, back up, remove writes, delete code paths after 2 stable sprints.
|
||||
|
||||
---
|
||||
|
||||
# Rate‑limits & flags: single truth, fast edges
|
||||
|
||||
* **Truth in Postgres** with versioned flag docs:
|
||||
|
||||
```sql
|
||||
create table feature_flag (
|
||||
key text primary key,
|
||||
rules jsonb not null,
|
||||
version int not null default 1,
|
||||
updated_at timestamptz default now()
|
||||
);
|
||||
```
|
||||
|
||||
* **Edge cache** in Redis:
|
||||
|
||||
* `SETEX flag:{key}:{version} <ttl> <json>`
|
||||
* On update, bump `version`; readers compose cache key with version (cache‑busting without deletes).
|
||||
|
||||
* **Rate limiting**: Persist quotas in Postgres; counters in Redis (`INCR rl:{bucket}:{window}`), with periodic reconciliation jobs writing summaries back to Postgres for audits.
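If you want those reconciliation summaries queryable for audits, one possible shape (the `rate_limit_usage` table is an assumption for illustration, not an existing StellaOps table):

```sql
create table rate_limit_usage (
  bucket text not null,
  window_start timestamptz not null,
  request_count bigint not null,
  written_at timestamptz not null default now(),
  primary key (bucket, window_start)
);

-- Reconciliation job: snapshot the Redis counter for a finished window into Postgres.
insert into rate_limit_usage (bucket, window_start, request_count)
values (:bucket, :window_start, :count_from_redis)
on conflict (bucket, window_start)
do update set request_count = excluded.request_count, written_at = now();
```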
|
||||
|
||||
---
|
||||
|
||||
# CAS for SBOM/VEX/attestations
|
||||
|
||||
* Push blobs to OCI/MinIO by digest; store only pointers in Postgres:
|
||||
|
||||
```sql
|
||||
create table artifact_index (
|
||||
digest text primary key,
|
||||
media_type text not null,
|
||||
size bigint not null,
|
||||
created_at timestamptz default now(),
|
||||
signature_refs jsonb
|
||||
);
|
||||
```
|
||||
* Benefits: immutable, deduped, easy to mirror into offline kits.
|
||||
|
||||
---
|
||||
|
||||
# Guardrails your team should follow
|
||||
|
||||
* **Always** wrap multi‑table writes (facts + outbox + decisions) in a single transaction.
|
||||
* **Prefer** `jsonb_path_query` for targeted reads (see the sketch after this list); **avoid** scanning entire payloads.
|
||||
* **Enforce** RLS + least‑privilege roles; application sets `app.user_id` at session start.
|
||||
* **Version everything**: schemas, flags, materialized views; never “change in place” without bumping version.
|
||||
* **Observability**: expose `pg_stat_statements`, refresh latency for `mv_*`, outbox lag, Redis hit ratio, and RLS policy hits.
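For example, a targeted read from the `jsonb_path_query` family (a minimal sketch; the `$.cvss.baseScore` path and the `'nvd'` source are illustrative, not a guaranteed payload shape):

```sql
-- Project one value out of matching facts instead of pulling whole payloads back.
select id,
       jsonb_path_query_first(payload, '$.cvss.baseScore') as base_score
from vuln_fact
where source = 'nvd'
  and received_at > now() - interval '7 days';
```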
|
||||
|
||||
---
|
||||
|
||||
If you want, I can turn this into:
|
||||
|
||||
* ready‑to‑run **EF Core 10** migrations,
|
||||
* a **/docs/architecture/store-map.md** for your repo,
|
||||
* and a tiny **dev seed** (Docker + sample data) so the team can poke it immediately.
|
||||
Here’s a focused “PostgreSQL patterns per module” doc you can hand straight to your StellaOps devs.
|
||||
|
||||
---
|
||||
|
||||
# StellaOps – PostgreSQL Patterns per Module
|
||||
|
||||
**Scope:** How each StellaOps module should use PostgreSQL: schema patterns, constraints, RLS, indexing, and transaction rules.
|
||||
|
||||
---
|
||||
|
||||
## 0. Cross‑cutting PostgreSQL Rules
|
||||
|
||||
These apply everywhere unless explicitly overridden.
|
||||
|
||||
### 0.1 Core conventions
|
||||
|
||||
* **Schemas**
|
||||
|
||||
* Use **one logical schema** per module: `authority`, `routing`, `vex`, `unknowns`, `artifact`.
|
||||
* Shared utilities (e.g., `outbox`) live in a `core` schema.
|
||||
|
||||
* **Naming**
|
||||
|
||||
* Tables: `snake_case`, singular: `user`, `feature_flag`, `vuln_fact`; quote names that collide with reserved words (e.g. `"user"`, `"grant"`).
|
||||
* PK: `id uuid primary key`.
|
||||
* FKs: `<referenced_table>_id` (e.g., `user_id`, `tenant_id`).
|
||||
* Timestamps:
|
||||
|
||||
* `created_at timestamptz not null default now()`
|
||||
* `updated_at timestamptz not null default now()`
|
||||
|
||||
* **Multi‑tenancy**
|
||||
|
||||
* All tenant‑scoped tables must have `tenant_id uuid not null`.
|
||||
* Enforce tenant isolation with **RLS** on `tenant_id`.
|
||||
|
||||
* **Time & timezones**
|
||||
|
||||
* Always `timestamptz`, always store **UTC**, let the DB default `now()`.
|
||||
|
||||
### 0.2 RLS & security
|
||||
|
||||
* RLS must be **enabled** on any table reachable from a user‑initiated path.
|
||||
* Every session must set:
|
||||
|
||||
```sql
|
||||
select set_config('app.user_id', '<uuid>', false);
|
||||
select set_config('app.tenant_id', '<uuid>', false);
|
||||
select set_config('app.roles', 'role1,role2', false);
|
||||
```
|
||||
* RLS policies:
|
||||
|
||||
* Base policy: `tenant_id = current_setting('app.tenant_id')::uuid`.
|
||||
* Extra predicates for per‑user privacy (e.g., only see own tokens, only own API clients).
|
||||
* DB users:
|
||||
|
||||
* Each module’s service has its **own role** with access only to its schema + `core.outbox`.
|
||||
|
||||
### 0.3 JSONB & versioning
|
||||
|
||||
* Any JSONB column must have:
|
||||
|
||||
* `payload jsonb not null`,
|
||||
* `schema_version int not null`.
|
||||
* Always index:
|
||||
|
||||
* by source (`source` / `origin`),
|
||||
* by a small set of projected fields used in WHERE clauses.
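A minimal sketch of the projected-field rule, using `vex.vuln_fact` from §3.1 (the `severity` field is illustrative; index whatever small set of projections your WHERE clauses actually use):

```sql
-- Composite index over tenant + one projected JSONB field.
create index vuln_fact_tenant_severity_idx
  on vex.vuln_fact (tenant_id, (payload->>'severity'));

-- Queries that repeat the same expression can use it without a GIN scan.
select id, external_id
from vex.vuln_fact
where tenant_id = current_setting('app.tenant_id')::uuid
  and payload->>'severity' = 'critical';
```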
|
||||
|
||||
### 0.4 Migrations
|
||||
|
||||
* All schema changes via migrations, forward‑only.
|
||||
* Backwards‑compat pattern:
|
||||
|
||||
1. Add new columns / tables.
|
||||
2. Backfill.
|
||||
3. Flip code to use new structure (behind a feature flag).
|
||||
4. After stability, remove old columns/paths.
|
||||
|
||||
---
|
||||
|
||||
## 1. Authority Module (auth, accounts, audit)
|
||||
|
||||
**Schema:** `authority.*`
|
||||
**Mission:** identity, OAuth, roles, grants, audit.
|
||||
|
||||
### 1.1 Core tables & patterns
|
||||
|
||||
* `authority.user`
|
||||
|
||||
```sql
|
||||
create table authority."user" (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
email text not null,
|
||||
display_name text not null,
|
||||
is_disabled boolean not null default false,
|
||||
created_at timestamptz not null default now(),
|
||||
updated_at timestamptz not null default now(),
|
||||
unique (tenant_id, email)
|
||||
);
|
||||
```
|
||||
|
||||
* Never hard‑delete users: use `is_disabled` (and optionally `disabled_at`).
|
||||
|
||||
* `authority.role`
|
||||
|
||||
```sql
|
||||
create table authority.role (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
name text not null,
|
||||
description text,
|
||||
created_at timestamptz not null default now(),
|
||||
updated_at timestamptz not null default now(),
|
||||
unique (tenant_id, name)
|
||||
);
|
||||
```
|
||||
|
||||
* `authority.grant`
|
||||
|
||||
```sql
|
||||
create table authority."grant" (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
  user_id uuid not null references authority."user"(id),
|
||||
role_id uuid not null references authority.role(id),
|
||||
created_at timestamptz not null default now(),
|
||||
unique (tenant_id, user_id, role_id)
|
||||
);
|
||||
```
|
||||
|
||||
* `authority.oauth_client`, `authority.oauth_token`
|
||||
|
||||
* Enforce token uniqueness:
|
||||
|
||||
```sql
|
||||
create table authority.oauth_token (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
  user_id uuid not null references authority."user"(id),
|
||||
client_id uuid not null references authority.oauth_client(id),
|
||||
token_hash text not null, -- hash, never raw
|
||||
expires_at timestamptz not null,
|
||||
created_at timestamptz not null default now(),
|
||||
revoked_at timestamptz,
|
||||
unique (token_hash)
|
||||
);
|
||||
```
|
||||
|
||||
### 1.2 Audit log pattern
|
||||
|
||||
* `authority.audit_log`
|
||||
|
||||
```sql
|
||||
create table authority.audit_log (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
actor_id uuid, -- null for system
|
||||
action text not null,
|
||||
entity_type text not null,
|
||||
entity_id uuid,
|
||||
at timestamptz not null default now(),
|
||||
diff jsonb not null
|
||||
);
|
||||
```
|
||||
* Insert audit rows in the **same transaction** as the change.
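A minimal sketch of that rule (the role-grant scenario and the `diff` shape are illustrative):

```sql
begin;

-- The change and its audit row share one transaction via a data-modifying CTE.
with new_grant as (
  insert into authority."grant" (tenant_id, user_id, role_id)
  values (current_setting('app.tenant_id')::uuid, :user_id, :role_id)
  returning id, tenant_id, user_id, role_id
)
insert into authority.audit_log (tenant_id, actor_id, action, entity_type, entity_id, diff)
select tenant_id,
       current_setting('app.user_id')::uuid,
       'grant.created',
       'grant',
       id,
       jsonb_build_object('user_id', user_id, 'role_id', role_id)
from new_grant;

commit;
```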
|
||||
|
||||
### 1.3 RLS patterns
|
||||
|
||||
* Base RLS:
|
||||
|
||||
```sql
|
||||
alter table authority."user" enable row level security;
|
||||
|
||||
create policy p_user_tenant on authority."user"
|
||||
for all using (tenant_id = current_setting('app.tenant_id')::uuid);
|
||||
```
|
||||
* Extra policies:
|
||||
|
||||
* Audit log is visible only to:
|
||||
|
||||
* actor themself, or
|
||||
* users with an `auditor` or `admin` role.
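A minimal policy sketch for that rule, assuming `app.roles` carries the comma-separated role list from §0.2:

```sql
alter table authority.audit_log enable row level security;

create policy p_audit_log_read on authority.audit_log
  for select
  using (
    tenant_id = current_setting('app.tenant_id')::uuid
    and (
      actor_id = current_setting('app.user_id')::uuid
      or string_to_array(current_setting('app.roles'), ',') && array['auditor', 'admin']
    )
  );
```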
|
||||
|
||||
---
|
||||
|
||||
## 2. Routing & Feature Flags Module
|
||||
|
||||
**Schema:** `routing.*`
|
||||
**Mission:** where instances live, what features are on, rate‑limit configuration.
|
||||
|
||||
### 2.1 Feature flags
|
||||
|
||||
* `routing.feature_flag`
|
||||
|
||||
```sql
|
||||
create table routing.feature_flag (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
key text not null,
|
||||
rules jsonb not null,
|
||||
version int not null default 1,
|
||||
is_enabled boolean not null default true,
|
||||
created_at timestamptz not null default now(),
|
||||
updated_at timestamptz not null default now(),
|
||||
unique (tenant_id, key)
|
||||
);
|
||||
```
|
||||
|
||||
* **Immutability by version**:
|
||||
|
||||
* On update, **increment `version`**, don’t overwrite historical data.
|
||||
* Mirror changes into a history table via trigger:
|
||||
|
||||
```sql
|
||||
create table routing.feature_flag_history (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
feature_flag_id uuid not null references routing.feature_flag(id),
|
||||
tenant_id uuid not null,
|
||||
key text not null,
|
||||
rules jsonb not null,
|
||||
version int not null,
|
||||
changed_at timestamptz not null default now(),
|
||||
changed_by uuid
|
||||
);
|
||||
```
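A minimal trigger sketch for that mirroring (the function and trigger names are illustrative; adapt `changed_by` to your session convention):

```sql
create or replace function routing.feature_flag_mirror() returns trigger
language plpgsql as $$
begin
  insert into routing.feature_flag_history
    (feature_flag_id, tenant_id, key, rules, version, changed_by)
  values
    (new.id, new.tenant_id, new.key, new.rules, new.version,
     nullif(current_setting('app.user_id', true), '')::uuid);
  return new;
end;
$$;

create trigger trg_feature_flag_mirror
  after insert or update on routing.feature_flag
  for each row execute function routing.feature_flag_mirror();
```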
|
||||
|
||||
### 2.2 Instance registry
|
||||
|
||||
* `routing.instance`
|
||||
|
||||
```sql
|
||||
create table routing.instance (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
instance_key text not null,
|
||||
domain text not null,
|
||||
last_heartbeat timestamptz not null default now(),
|
||||
status text not null check (status in ('active','draining','offline')),
|
||||
created_at timestamptz not null default now(),
|
||||
updated_at timestamptz not null default now(),
|
||||
unique (tenant_id, instance_key),
|
||||
unique (tenant_id, domain)
|
||||
);
|
||||
```
|
||||
|
||||
* Pattern:
|
||||
|
||||
* Heartbeats use `update ... set last_heartbeat = now()` without touching other fields.
|
||||
* Routing logic filters by `status='active'` and heartbeat recency.
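A minimal lookup sketch combining both filters (the 30-second liveness window is an assumption; tune it to your heartbeat interval):

```sql
select instance_key
from routing.instance
where tenant_id = current_setting('app.tenant_id')::uuid
  and domain = :domain
  and status = 'active'
  and last_heartbeat > now() - interval '30 seconds'
order by last_heartbeat desc
limit 1;
```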
|
||||
|
||||
### 2.3 Rate‑limit configuration
|
||||
|
||||
* Config in Postgres, counters in Redis:
|
||||
|
||||
```sql
|
||||
create table routing.rate_limit_config (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
key text not null,
|
||||
limit_per_interval int not null,
|
||||
interval_seconds int not null,
|
||||
created_at timestamptz not null default now(),
|
||||
updated_at timestamptz not null default now(),
|
||||
unique (tenant_id, key)
|
||||
);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. VEX & Vulnerability Module
|
||||
|
||||
**Schema:** `vex.*`
|
||||
**Mission:** ingest vulnerability facts, keep decisions & triage state.
|
||||
|
||||
### 3.1 Facts as JSONB
|
||||
|
||||
* `vex.vuln_fact`
|
||||
|
||||
```sql
|
||||
create table vex.vuln_fact (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
source text not null, -- e.g. "nvd", "vendor_x_vex"
|
||||
external_id text, -- e.g. CVE, advisory id
|
||||
payload jsonb not null,
|
||||
schema_version int not null,
|
||||
received_at timestamptz not null default now()
|
||||
);
|
||||
```
|
||||
|
||||
* Index patterns:
|
||||
|
||||
```sql
|
||||
create index on vex.vuln_fact (tenant_id, source);
|
||||
create index on vex.vuln_fact (tenant_id, external_id);
|
||||
create index vuln_fact_payload_gin on vex.vuln_fact using gin (payload);
|
||||
```
|
||||
|
||||
### 3.2 Decisions as relational data
|
||||
|
||||
* `vex.package`
|
||||
|
||||
```sql
|
||||
create table vex.package (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
name text not null,
|
||||
version text not null,
|
||||
ecosystem text not null, -- e.g. "pypi", "npm"
|
||||
created_at timestamptz not null default now(),
|
||||
unique (tenant_id, name, version, ecosystem)
|
||||
);
|
||||
```
|
||||
|
||||
* `vex.vex_decision`
|
||||
|
||||
```sql
|
||||
create table vex.vex_decision (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
package_id uuid not null references vex.package(id),
|
||||
vuln_id text not null,
|
||||
status text not null check (status in (
|
||||
'not_affected', 'affected', 'fixed', 'under_investigation'
|
||||
)),
|
||||
rationale text,
|
||||
proof_ref text, -- CAS digest or URL
|
||||
decided_by uuid,
|
||||
decided_at timestamptz not null default now(),
|
||||
created_at timestamptz not null default now(),
|
||||
updated_at timestamptz not null default now(),
|
||||
unique (tenant_id, package_id, vuln_id)
|
||||
);
|
||||
```
|
||||
|
||||
* For history:
|
||||
|
||||
* Keep current state in `vex_decision`.
|
||||
* Mirror previous versions into `vex_decision_history` table (similar to feature flags).
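A minimal sketch of that history table (columns are illustrative; keep whatever you need to explain past decisions):

```sql
create table vex.vex_decision_history (
  id uuid primary key default gen_random_uuid(),
  vex_decision_id uuid not null references vex.vex_decision(id),
  tenant_id uuid not null,
  status text not null,
  rationale text,
  proof_ref text,
  decided_by uuid,
  decided_at timestamptz not null,
  changed_at timestamptz not null default now()
);
```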
|
||||
|
||||
### 3.3 Triage queues with materialized views
|
||||
|
||||
* Example triage view:
|
||||
|
||||
```sql
|
||||
create materialized view vex.mv_triage_queue as
|
||||
select
|
||||
d.tenant_id,
|
||||
p.name,
|
||||
p.version,
|
||||
d.vuln_id,
|
||||
d.status,
|
||||
d.decided_at
|
||||
from vex.vex_decision d
|
||||
join vex.package p on p.id = d.package_id
|
||||
where d.status = 'under_investigation';
|
||||
```
|
||||
|
||||
* Refresh options:
|
||||
|
||||
* Scheduled refresh (cron/worker).
|
||||
* Or **incremental** via triggers (more complex; use only when needed).
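For the scheduled path, refresh without blocking readers. Note that `refresh ... concurrently` requires a unique index on the view; this sketch assumes `(tenant_id, name, version, vuln_id)` is unique in your data (add ecosystem to the view first if packages can share name and version across ecosystems):

```sql
-- One-time: unique index so the view can be refreshed concurrently.
create unique index mv_triage_queue_key
  on vex.mv_triage_queue (tenant_id, name, version, vuln_id);

-- Run from the cron/worker job:
refresh materialized view concurrently vex.mv_triage_queue;
```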
|
||||
|
||||
### 3.4 RLS for VEX
|
||||
|
||||
* All tables scoped by `tenant_id`.
|
||||
* Typical policy:
|
||||
|
||||
```sql
|
||||
alter table vex.vex_decision enable row level security;
|
||||
|
||||
create policy p_vex_tenant on vex.vex_decision
|
||||
for all using (tenant_id = current_setting('app.tenant_id')::uuid);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Unknowns Module
|
||||
|
||||
**Schema:** `unknowns.*`
|
||||
**Mission:** represent uncertainty and how it changes over time.
|
||||
|
||||
### 4.1 Bitemporal unknowns table
|
||||
|
||||
* `unknowns.unknown`
|
||||
|
||||
```sql
|
||||
create table unknowns.unknown (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
subject_hash text not null, -- stable identifier for "thing" being reasoned about
|
||||
kind text not null, -- e.g. "reachability", "version_inferred"
|
||||
context jsonb not null, -- extra info: call graph node, evidence, etc.
|
||||
valid_from timestamptz not null default now(),
|
||||
valid_to timestamptz,
|
||||
sys_from timestamptz not null default now(),
|
||||
sys_to timestamptz,
|
||||
created_at timestamptz not null default now()
|
||||
);
|
||||
```
|
||||
|
||||
* “Exactly one open unknown per subject/kind” pattern:
|
||||
|
||||
```sql
|
||||
create unique index unknown_one_open_per_subject
|
||||
on unknowns.unknown (tenant_id, subject_hash, kind)
|
||||
where valid_to is null;
|
||||
```
|
||||
|
||||
### 4.2 Closing an unknown
|
||||
|
||||
* Close by setting `valid_to` and `sys_to`:
|
||||
|
||||
```sql
|
||||
update unknowns.unknown
|
||||
set valid_to = now(), sys_to = now()
|
||||
where id = :id and valid_to is null;
|
||||
```
|
||||
|
||||
* Never hard-delete; keep all rows for audit/explanation.
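When uncertainty is revised rather than resolved, supersede instead of updating in place; a minimal sketch, done in one transaction so the partial unique index from §4.1 stays satisfied:

```sql
begin;

-- Close the currently open row for this subject/kind...
update unknowns.unknown
set valid_to = now(), sys_to = now()
where tenant_id = current_setting('app.tenant_id')::uuid
  and subject_hash = :subject_hash
  and kind = :kind
  and valid_to is null;

-- ...then open a new one carrying the revised context.
insert into unknowns.unknown (tenant_id, subject_hash, kind, context)
values (current_setting('app.tenant_id')::uuid, :subject_hash, :kind, :new_context);

commit;
```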
|
||||
|
||||
### 4.3 Convenience views
|
||||
|
||||
* Current unknowns:
|
||||
|
||||
```sql
|
||||
create view unknowns.current as
|
||||
select *
|
||||
from unknowns.unknown
|
||||
where valid_to is null;
|
||||
```
|
||||
|
||||
### 4.4 RLS
|
||||
|
||||
* Same tenant policy as other modules; unknowns are tenant‑scoped.
|
||||
|
||||
---
|
||||
|
||||
## 5. Artifact Index / CAS Module
|
||||
|
||||
**Schema:** `artifact.*`
|
||||
**Mission:** index of immutable blobs stored in OCI / S3 / MinIO etc.
|
||||
|
||||
### 5.1 Artifact index
|
||||
|
||||
* `artifact.artifact`
|
||||
|
||||
```sql
|
||||
create table artifact.artifact (
|
||||
digest text primary key, -- e.g. "sha256:..."
|
||||
tenant_id uuid not null,
|
||||
media_type text not null,
|
||||
size_bytes bigint not null,
|
||||
created_at timestamptz not null default now(),
|
||||
created_by uuid
|
||||
);
|
||||
```
|
||||
|
||||
* Validate digest shape with a CHECK:
|
||||
|
||||
```sql
|
||||
alter table artifact.artifact
|
||||
add constraint chk_digest_format
|
||||
check (digest ~ '^sha[0-9]+:[0-9a-fA-F]{32,}$');
|
||||
```
|
||||
|
||||
### 5.2 Signatures and tags
|
||||
|
||||
* `artifact.signature`
|
||||
|
||||
```sql
|
||||
create table artifact.signature (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
artifact_digest text not null references artifact.artifact(digest),
|
||||
signer text not null,
|
||||
signature_payload jsonb not null,
|
||||
created_at timestamptz not null default now()
|
||||
);
|
||||
```
|
||||
|
||||
* `artifact.tag`
|
||||
|
||||
```sql
|
||||
create table artifact.tag (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid not null,
|
||||
name text not null,
|
||||
artifact_digest text not null references artifact.artifact(digest),
|
||||
created_at timestamptz not null default now(),
|
||||
unique (tenant_id, name)
|
||||
);
|
||||
```
|
||||
|
||||
### 5.3 RLS
|
||||
|
||||
* Ensure that tenants cannot see each other’s digests, even if the CAS backing store is shared:
|
||||
|
||||
```sql
|
||||
alter table artifact.artifact enable row level security;
|
||||
|
||||
create policy p_artifact_tenant on artifact.artifact
|
||||
for all using (tenant_id = current_setting('app.tenant_id')::uuid);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Shared Outbox / Event Pattern
|
||||
|
||||
**Schema:** `core.*`
|
||||
**Mission:** reliable events for external side‑effects.
|
||||
|
||||
### 6.1 Outbox table
|
||||
|
||||
* `core.outbox`
|
||||
|
||||
```sql
|
||||
create table core.outbox (
|
||||
id uuid primary key default gen_random_uuid(),
|
||||
tenant_id uuid,
|
||||
aggregate_type text not null, -- e.g. "vex_decision", "feature_flag"
|
||||
aggregate_id uuid,
|
||||
topic text not null,
|
||||
payload jsonb not null,
|
||||
created_at timestamptz not null default now(),
|
||||
dispatched_at timestamptz,
|
||||
dispatch_attempts int not null default 0,
|
||||
error text
|
||||
);
|
||||
```
|
||||
|
||||
### 6.2 Usage rule
|
||||
|
||||
* For anything that must emit an event (webhook, Kafka, notifications):
|
||||
|
||||
* In the **same transaction** as the change:
|
||||
|
||||
* write primary data (e.g. `vex.vex_decision`),
|
||||
* insert an `outbox` row.
|
||||
* A background worker:
|
||||
|
||||
* pulls undelivered rows,
|
||||
* sends to external system,
|
||||
* updates `dispatched_at`/`dispatch_attempts`/`error`.
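A minimal sketch of both halves (the topic name and payload shape are illustrative):

```sql
-- Producer: primary write + outbox row in one transaction.
begin;

update vex.vex_decision
set status = 'not_affected', updated_at = now()
where id = :decision_id;

insert into core.outbox (tenant_id, aggregate_type, aggregate_id, topic, payload)
values (
  current_setting('app.tenant_id')::uuid,
  'vex_decision',
  :decision_id,
  'vex.decision.updated',
  jsonb_build_object('decision_id', :decision_id, 'status', 'not_affected')
);

commit;

-- Dispatcher: claim a batch of undelivered rows without blocking other workers.
select id, topic, payload
from core.outbox
where dispatched_at is null
order by created_at
limit 100
for update skip locked;
```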
|
||||
|
||||
---
|
||||
|
||||
## 7. Indexing & Query Patterns per Module
|
||||
|
||||
### 7.1 Authority
|
||||
|
||||
* Index:
|
||||
|
||||
* `user(tenant_id, email)`
|
||||
* `grant(tenant_id, user_id)`
|
||||
* `oauth_token(token_hash)`
|
||||
* Typical query patterns:
|
||||
|
||||
* Look up user by `tenant_id + email`.
|
||||
* All roles/grants for a user; design composite indexes accordingly.
|
||||
|
||||
### 7.2 Routing & Flags
|
||||
|
||||
* Index:
|
||||
|
||||
* `feature_flag(tenant_id, key)`
|
||||
* partial index on enabled flags:
|
||||
|
||||
```sql
|
||||
create index on routing.feature_flag (tenant_id, key)
|
||||
where is_enabled;
|
||||
```
|
||||
* `instance(tenant_id, status)`, `instance(tenant_id, domain)`.
|
||||
|
||||
### 7.3 VEX
|
||||
|
||||
* Index:
|
||||
|
||||
* `package(tenant_id, name, version, ecosystem)`
|
||||
* `vex_decision(tenant_id, package_id, vuln_id)`
|
||||
* GIN on `vuln_fact.payload` for flexible querying.
|
||||
|
||||
### 7.4 Unknowns
|
||||
|
||||
* Index:
|
||||
|
||||
* unique open unknown per subject/kind (shown above).
|
||||
* `unknown(tenant_id, kind)` for filtering by kind.
|
||||
|
||||
### 7.5 Artifact
|
||||
|
||||
* Index:
|
||||
|
||||
* PK on `digest`.
|
||||
* `signature(tenant_id, artifact_digest)`.
|
||||
* `tag(tenant_id, name)`.
|
||||
|
||||
---
|
||||
|
||||
## 8. Transaction & Isolation Guidelines
|
||||
|
||||
* Default isolation: **READ COMMITTED**.
|
||||
* For critical sequences (e.g., provisioning a tenant, bulk role updates):
|
||||
|
||||
* consider **REPEATABLE READ** or **SERIALIZABLE** and keep transactions short.
|
||||
* Pattern:
|
||||
|
||||
* One transaction per logical user action (e.g., “set flag”, “record decision”).
|
||||
* Never do long‑running external calls inside a database transaction.
|
||||
|
||||
---
|
||||
|
||||
If you’d like, next step I can turn this into:
|
||||
|
||||
* concrete `CREATE SCHEMA` + `CREATE TABLE` migration files, and
|
||||
* a short “How to write queries in each module” cheat‑sheet for devs (with example SELECT/INSERT/UPDATE patterns).
|
||||
|
||||
Here’s a tight, practical pattern to make your scanner’s vuln‑DB updates rock‑solid even when feeds hiccup:
|
||||
|
||||
# Offline, verifiable update bundles (DSSE + Rekor v2)
|
||||
|
||||
**Idea:** distribute DB updates as offline tarballs. Each tarball ships with:
|
||||
|
||||
* a **DSSE‑signed** statement (e.g., in‑toto style) over the bundle hash
|
||||
* a **Rekor v2 receipt** proving the signature/statement was logged
|
||||
* a small **manifest.json** (version, created_at, content hashes)
|
||||
|
||||
**Startup flow (happy path):**
|
||||
|
||||
1. Load latest tarball from your local `updates/` cache.
|
||||
2. Verify DSSE signature against your trusted public keys.
|
||||
3. Verify Rekor v2 receipt (inclusion proof) matches the DSSE payload hash.
|
||||
4. If both pass, unpack/activate; record the bundle’s **trust_id** (e.g., statement digest).
|
||||
5. If anything fails, **keep using the last good bundle**. No service disruption.
|
||||
|
||||
**Why this helps**
|
||||
|
||||
* **Air‑gap friendly:** no live network needed at activation time.
|
||||
* **Tamper‑evident:** DSSE + Rekor receipt proves provenance and transparency.
|
||||
* **Operational stability:** feed outages become non‑events—scanner just keeps the last good state.
|
||||
|
||||
---
|
||||
|
||||
## File layout inside each bundle
|
||||
|
||||
```
|
||||
/bundle-2025-11-29/
|
||||
manifest.json # { version, created_at, entries[], sha256s }
|
||||
payload.tar.zst # the actual DB/indices
|
||||
payload.tar.zst.sha256
|
||||
statement.dsse.json # DSSE-wrapped statement over payload hash
|
||||
rekor-receipt.json # Rekor v2 inclusion/verification material
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Acceptance/Activation rules
|
||||
|
||||
* **Trust root:** pin one (or more) publisher public keys; rotate via separate, out‑of‑band process.
|
||||
* **Monotonicity:** only activate if `manifest.version > current.version` (or if trust policy explicitly allows replay for rollback testing).
|
||||
* **Atomic switch:** unpack to `db/staging/`, validate, then symlink‑flip to `db/active/`.
|
||||
* **Quarantine on failure:** move bad bundles to `updates/quarantine/` with a reason code.
|
||||
|
||||
---
|
||||
|
||||
## Minimal .NET 10 verifier sketch (C#)
|
||||
|
||||
```csharp
|
||||
public sealed record BundlePaths(string Dir) {
|
||||
public string Manifest => Path.Combine(Dir, "manifest.json");
|
||||
public string Payload => Path.Combine(Dir, "payload.tar.zst");
|
||||
public string Dsse => Path.Combine(Dir, "statement.dsse.json");
|
||||
public string Receipt => Path.Combine(Dir, "rekor-receipt.json");
|
||||
}
|
||||
|
||||
public async Task<bool> ActivateBundleAsync(BundlePaths b, TrustConfig trust, string activeDir) {
|
||||
var manifest = await Manifest.LoadAsync(b.Manifest);
|
||||
if (!await Hashes.VerifyAsync(b.Payload, manifest.PayloadSha256)) return false;
|
||||
|
||||
// 1) DSSE verify (publisher keys pinned in trust)
|
||||
var (okSig, dssePayloadDigest) = await Dsse.VerifyAsync(b.Dsse, trust.PublisherKeys);
|
||||
if (!okSig || dssePayloadDigest != manifest.PayloadSha256) return false;
|
||||
|
||||
// 2) Rekor v2 receipt verify (inclusion + statement digest == dssePayloadDigest)
|
||||
if (!await RekorV2.VerifyReceiptAsync(b.Receipt, dssePayloadDigest, trust.RekorPub)) return false;
|
||||
|
||||
// 3) Stage, validate, then atomically flip
|
||||
var staging = Path.Combine(activeDir, "..", "staging");
|
||||
DirUtil.Empty(staging);
|
||||
await TarZstd.ExtractAsync(b.Payload, staging);
|
||||
if (!await LocalDbSelfCheck.RunAsync(staging)) return false;
|
||||
|
||||
SymlinkUtil.AtomicSwap(source: staging, target: activeDir);
|
||||
State.WriteLastGood(manifest.Version, dssePayloadDigest);
|
||||
return true;
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Operational playbook
|
||||
|
||||
* **On boot & daily at HH:MM:** try `ActivateBundleAsync()` on the newest bundle; on failure, log and continue.
|
||||
* **Telemetry (no PII):** reason codes (SIG_FAIL, RECEIPT_FAIL, HASH_MISMATCH, SELFTEST_FAIL), versions, last_good.
|
||||
* **Keys & rotation:** keep `publisher.pub` and `rekor.pub` in a root‑owned, read‑only path; rotate via a separate signed “trust bundle”.
|
||||
* **Defense‑in‑depth:** verify both the **payload hash** and each file’s hash listed in `manifest.entries[]`.
|
||||
* **Rollback:** allow `--force-activate <bundle>` for emergency testing, but mark as **non‑monotonic** in state.
|
||||
|
||||
---
|
||||
|
||||
## What to hand your release team
|
||||
|
||||
* A Make/CI target that:
|
||||
|
||||
1. Builds `payload.tar.zst` and computes hashes
|
||||
2. Generates `manifest.json`
|
||||
3. Creates and signs the **DSSE statement**
|
||||
4. Submits to Rekor (or your mirror) and saves the **v2 receipt**
|
||||
5. Packages the bundle folder and publishes to your offline repo
|
||||
* A checksum file (`*.sha256sum`) for ops to verify out‑of‑band.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can turn this into a Stella Ops spec page (`docs/modules/scanner/offline-bundles.md`) plus a small reference implementation (C# library + CLI) that drops right into your Scanner service.
|
||||
Here’s a “drop‑in” Stella Ops dev guide for **DSSE‑signed Offline Scanner Updates** — written in the same spirit as the existing docs and sprint files.
|
||||
|
||||
You can treat this as the seed for `docs/modules/scanner/development/dsse-offline-updates.md` (or similar).
|
||||
|
||||
---
|
||||
|
||||
# DSSE‑Signed Offline Scanner Updates — Developer Guidelines
|
||||
|
||||
> **Audience**
|
||||
> Scanner, Export Center, Attestor, CLI, and DevOps engineers implementing DSSE‑signed offline vulnerability updates and integrating them into the Offline Update Kit (OUK).
|
||||
>
|
||||
> **Context**
|
||||
>
|
||||
> * OUK already ships **signed, atomic offline update bundles** with merged vulnerability feeds, container images, and an attested manifest.([git.stella-ops.org][1])
|
||||
> * DSSE + Rekor is already used for **scan evidence** (SBOM attestations, Rekor proofs).([git.stella-ops.org][2])
|
||||
> * Sprints 160/162 add **attestation bundles** with manifest, checksums, DSSE signature, and optional transparency log segments, and integrate them into OUK and CLI flows.([git.stella-ops.org][3])
|
||||
|
||||
These guidelines tell you how to **wire all of that together** for “offline scanner updates” (feeds, rules, packs) in a way that matches Stella Ops’ determinism + sovereignty promises.
|
||||
|
||||
---
|
||||
|
||||
## 0. Mental model
|
||||
|
||||
At a high level, you’re building this:
|
||||
|
||||
```text
|
||||
Advisory mirrors / Feeds builders
|
||||
│
|
||||
▼
|
||||
ExportCenter.AttestationBundles
|
||||
(creates DSSE + Rekor evidence
|
||||
for each offline update snapshot)
|
||||
│
|
||||
▼
|
||||
Offline Update Kit (OUK) builder
|
||||
(adds feeds + evidence to kit tarball)
|
||||
│
|
||||
▼
|
||||
stella offline kit import / admin CLI
|
||||
(verifies Cosign + DSSE + Rekor segments,
|
||||
then atomically swaps scanner feeds)
|
||||
```
|
||||
|
||||
Online, Rekor is live; offline, you rely on **bundled Rekor segments / snapshots** and the existing OUK mechanics (import is atomic, old feeds kept until new bundle is fully verified).([git.stella-ops.org][1])
|
||||
|
||||
---
|
||||
|
||||
## 1. Goals & non‑goals
|
||||
|
||||
### Goals
|
||||
|
||||
1. **Authentic offline snapshots**
|
||||
Every offline scanner update (OUK or delta) must be verifiably tied to:
|
||||
|
||||
* a DSSE envelope,
|
||||
* a certificate chain rooted in Stella’s Fulcio/KMS profile or BYO KMS/HSM,
|
||||
* *and* a Rekor v2 inclusion proof or bundled log segment.([Stella Ops][4])
|
||||
|
||||
2. **Deterministic replay**
|
||||
Given:
|
||||
|
||||
* a specific offline update kit (`stella-ops-offline-kit-<DATE>.tgz` + `offline-manifest-<DATE>.json`)([git.stella-ops.org][1])
|
||||
* its DSSE attestation bundle + Rekor segments
|
||||
every verifier must reach the *same* verdict on integrity and contents — online or fully air‑gapped.
|
||||
|
||||
3. **Separation of concerns**
|
||||
|
||||
* Export Center: build attestation bundles, no business logic about scanning.([git.stella-ops.org][5])
|
||||
* Scanner: import & apply feeds; verify DSSE, but never generate it.
|
||||
* Signer / Attestor: own DSSE & Rekor integration.([git.stella-ops.org][2])
|
||||
|
||||
4. **Operational safety**
|
||||
|
||||
* Imports remain **atomic and idempotent**.
|
||||
* Old feeds stay live until the new update is **fully verified** (Cosign + DSSE + Rekor).([git.stella-ops.org][1])
|
||||
|
||||
### Non‑goals
|
||||
|
||||
* Designing new crypto or log formats.
|
||||
* Per‑feed DSSE envelopes (you can have more later, but the minimum contract is **bundle‑level** attestation).
|
||||
|
||||
---
|
||||
|
||||
## 2. Bundle contract for DSSE‑signed offline updates
|
||||
|
||||
You’re extending the existing OUK contract:
|
||||
|
||||
* OUK already packs:
|
||||
|
||||
* merged vuln feeds (OSV, GHSA, optional NVD 2.0, CNNVD/CNVD, ENISA, JVN, BDU),
|
||||
* container images (`stella-ops`, Zastava, etc.),
|
||||
* provenance (Cosign signature, SPDX SBOM, in‑toto SLSA attestation),
|
||||
* `offline-manifest.json` + detached JWS signed during export.([git.stella-ops.org][1])
|
||||
|
||||
For **DSSE‑signed offline scanner updates**, add a new logical layer:
|
||||
|
||||
### 2.1. Files to ship
|
||||
|
||||
Inside each offline kit (full or delta) you must produce:
|
||||
|
||||
```text
|
||||
/attestations/
|
||||
offline-update.dsse.json # DSSE envelope
|
||||
offline-update.rekor.json # Rekor entry + inclusion proof (or segment descriptor)
|
||||
/manifest/
|
||||
offline-manifest.json # existing manifest
|
||||
offline-manifest.json.jws # existing detached JWS
|
||||
/feeds/
|
||||
... # existing feed payloads
|
||||
```
|
||||
|
||||
The exact paths can be adjusted, but keep:
|
||||
|
||||
* **One DSSE bundle per kit** (min spec).
|
||||
* **One canonical Rekor proof file** per DSSE envelope.
|
||||
|
||||
### 2.2. DSSE payload contents (minimal)
|
||||
|
||||
Define (or reuse) a predicate type such as:
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"payloadType": "application/vnd.in-toto+json",
|
||||
"payload": { /* base64 */ }
|
||||
}
|
||||
```
|
||||
|
||||
Decoded payload (in-toto statement) should **at minimum** contain:
|
||||
|
||||
* **Subject**
|
||||
|
||||
* `name`: `stella-ops-offline-kit-<DATE>.tgz`
|
||||
* `digest.sha256`: tarball digest
|
||||
|
||||
* **Predicate type** (recommendation)
|
||||
|
||||
* `https://stella-ops.org/attestations/offline-update/1`
|
||||
|
||||
* **Predicate fields**
|
||||
|
||||
* `offline_manifest_sha256` – SHA‑256 of `offline-manifest.json`
|
||||
* `feeds` – array of feed entries such as `{ name, snapshot_date, archive_digest }` (mirrors `rules_and_feeds` style used in the moat doc).([Stella Ops][6])
|
||||
* `builder` – CI workflow id / git commit / Export Center job id
|
||||
* `created_at` – UTC ISO‑8601
|
||||
* `oukit_channel` – e.g., `edge`, `stable`, `fips-profile`
|
||||
|
||||
**Guideline:** this DSSE payload is the **single canonical description** of “what this offline update snapshot is”.
|
||||
|
||||
### 2.3. Rekor material
|
||||
|
||||
Attestor must:
|
||||
|
||||
* Submit `offline-update.dsse.json` to Rekor v2, obtaining:
|
||||
|
||||
* `uuid`
|
||||
* `logIndex`
|
||||
* inclusion proof (`rootHash`, `hashes`, `checkpoint`)
|
||||
* Serialize that to `offline-update.rekor.json` and store it in object storage + OUK staging, so it ships in the kit.([git.stella-ops.org][2])
|
||||
|
||||
For fully offline operation:
|
||||
|
||||
* Either:
|
||||
|
||||
* embed a **minimal log segment** containing that entry; or
|
||||
* rely on daily Rekor snapshot exports included elsewhere in the kit.([git.stella-ops.org][2])
|
||||
|
||||
---
|
||||
|
||||
## 3. Implementation by module
|
||||
|
||||
### 3.1 Export Center — attestation bundles
|
||||
|
||||
**Working directory:** `src/ExportCenter/StellaOps.ExportCenter.AttestationBundles`([git.stella-ops.org][7])
|
||||
|
||||
**Responsibilities**
|
||||
|
||||
1. **Compose attestation bundle job** (EXPORT‑ATTEST‑74‑001)
|
||||
|
||||
* Input: a snapshot identifier (e.g., offline kit build id or feed snapshot date).
|
||||
* Read manifest and feed metadata from the Export Center’s storage.([git.stella-ops.org][5])
|
||||
* Generate the DSSE payload structure described above.
|
||||
* Call `StellaOps.Signer` to wrap it in a DSSE envelope.
|
||||
* Call `StellaOps.Attestor` to submit DSSE → Rekor and get the inclusion proof.([git.stella-ops.org][2])
|
||||
* Persist:
|
||||
|
||||
* `offline-update.dsse.json`
|
||||
* `offline-update.rekor.json`
|
||||
* any log segment artifacts.
|
||||
|
||||
2. **Integrate into offline kit packaging** (EXPORT‑ATTEST‑74‑002 / 75‑001)
|
||||
|
||||
* The OUK builder (Python script `ops/offline-kit/build_offline_kit.py`) already assembles artifacts & manifests.([Stella Ops][8])
|
||||
* Extend that pipeline (or add an Export Center step) to:
|
||||
|
||||
* fetch the attestation bundle for the snapshot,
|
||||
* place it under `/attestations/` in the kit staging dir,
|
||||
* ensure `offline-manifest.json` contains entries for the DSSE and Rekor files (name, sha256, size, capturedAt).([git.stella-ops.org][1])
|
||||
|
||||
3. **Contracts & schemas**
|
||||
|
||||
* Define a small JSON schema for `offline-update.rekor.json` (UUID, index, proof fields) and check it into `docs/11_DATA_SCHEMAS.md` or module‑local schemas.
|
||||
* Keep all new payload schemas **versioned**; avoid “shape drift”.
|
||||
|
||||
**Do / Don’t**
|
||||
|
||||
* ✅ **Do** treat attestation bundle job as *pure aggregation* (AOC guardrail: no modification of evidence).([git.stella-ops.org][5])
|
||||
* ✅ **Do** rely on Signer + Attestor; don’t hand‑roll DSSE/Rekor logic in Export Center.([git.stella-ops.org][2])
|
||||
* ❌ **Don’t** reach out to external networks from this job — it must run with the same offline‑ready posture as the rest of the platform.
|
||||
|
||||
---
|
||||
|
||||
### 3.2 Offline Update Kit builder
|
||||
|
||||
**Working area:** `ops/offline-kit/*` + `docs/24_OFFLINE_KIT.md`([git.stella-ops.org][1])
|
||||
|
||||
Guidelines:
|
||||
|
||||
1. **Preserve current guarantees**
|
||||
|
||||
* Imports must remain **idempotent and atomic**, with **old feeds kept until the new bundle is fully verified**. This now includes DSSE/Rekor checks in addition to Cosign + JWS.([git.stella-ops.org][1])
|
||||
|
||||
2. **Staging layout**
|
||||
|
||||
* When staging a kit, ensure the tree looks like:
|
||||
|
||||
```text
|
||||
out/offline-kit/staging/
|
||||
feeds/...
|
||||
images/...
|
||||
manifest/offline-manifest.json
|
||||
attestations/offline-update.dsse.json
|
||||
attestations/offline-update.rekor.json
|
||||
```
|
||||
|
||||
* Update `offline-manifest.json` so each new file appears with:
|
||||
|
||||
* `name`, `sha256`, `size`, `capturedAt`.([git.stella-ops.org][1])
|
||||
|
||||
3. **Deterministic ordering**
|
||||
|
||||
* File lists in manifests must be in a stable order (e.g., lexical paths).
|
||||
* Timestamps = UTC ISO‑8601 only; never use local time. (Matches determinism guidance in AGENTS.md + policy/runs docs.)([git.stella-ops.org][9])
|
||||
|
||||
4. **Delta kits**
|
||||
|
||||
* For deltas (`stella-ouk-YYYY-MM-DD.delta.tgz`), DSSE should still cover:
|
||||
|
||||
* the delta tarball digest,
|
||||
* the **logical state** (feeds & versions) after applying the delta.
|
||||
* Don’t shortcut by “attesting only the diff files” — the predicate must describe the resulting snapshot.
|
||||
|
||||
---
|
||||
|
||||
### 3.3 Scanner — import & activation
|
||||
|
||||
**Working directory:** `src/Scanner/StellaOps.Scanner.WebService`, `StellaOps.Scanner.Worker`([git.stella-ops.org][9])
|
||||
|
||||
Scanner already exposes admin flows for:
|
||||
|
||||
* **Offline kit import**, which:
|
||||
|
||||
* validates the Cosign signature of the kit,
|
||||
* uses the attested manifest,
|
||||
* keeps old feeds until verification is done.([git.stella-ops.org][1])
|
||||
|
||||
Add DSSE/Rekor awareness as follows:
|
||||
|
||||
1. **Verification sequence (happy path)**
|
||||
|
||||
On `import-offline-usage-kit`:
|
||||
|
||||
1. Validate **Cosign** signature of the tarball.
|
||||
2. Validate `offline-manifest.json` with its JWS signature.
|
||||
3. Verify **file digests** for all entries (including `/attestations/*`).
|
||||
4. Verify **DSSE**:
|
||||
|
||||
* Call `StellaOps.Attestor.Verify` (or CLI equivalent) with:
|
||||
|
||||
* `offline-update.dsse.json`
|
||||
* `offline-update.rekor.json`
|
||||
* local Rekor log snapshot / segment (if configured)([git.stella-ops.org][2])
|
||||
* Ensure the payload digest matches the kit tarball + manifest digests.
|
||||
5. Only after all checks pass:
|
||||
|
||||
* swap Scanner’s feed pointer to the new snapshot,
|
||||
* emit an audit event noting:
|
||||
|
||||
* kit filename, tarball digest,
|
||||
* DSSE statement digest,
|
||||
* Rekor UUID + log index.
|
||||
|
||||
2. **Config surface**
|
||||
|
||||
Add config keys (names illustrative):
|
||||
|
||||
```yaml
|
||||
scanner:
|
||||
offlineKit:
|
||||
requireDsse: true # fail import if DSSE/Rekor verification fails
|
||||
rekorOfflineMode: true # use local snapshots only
|
||||
attestationVerifier: https://attestor.internal
|
||||
```
|
||||
|
||||
* Mirror them via ASP.NET Core config + env vars (`SCANNER__OFFLINEKIT__REQUIREDSSE`, etc.), following the same pattern as the DSSE/Rekor operator guide.([git.stella-ops.org][2])
|
||||
|
||||
3. **Failure behaviour**
|
||||
|
||||
* **DSSE/Rekor fail, Cosign + manifest OK**
|
||||
|
||||
* Keep old feeds active.
|
||||
* Mark import as failed; surface a `ProblemDetails` error via API/UI.
|
||||
* Log structured fields: `rekorUuid`, `attestationDigest`, `offlineKitHash`, `failureReason`.([git.stella-ops.org][2])
|
||||
|
||||
* **Config flag to soften during rollout**
|
||||
|
||||
* When `requireDsse=false`, treat DSSE/Rekor failure as a warning and still allow the import (for initial observation phase), but emit alerts. This mirrors the “observe → enforce” pattern in the DSSE/Rekor operator guide.([git.stella-ops.org][2])
|
||||
|
||||
---
|
||||
|
||||
### 3.4 Signer & Attestor
|
||||
|
||||
You mostly **reuse** existing guidance:([git.stella-ops.org][2])
|
||||
|
||||
* Add a new predicate type & schema for offline updates in Signer.
|
||||
|
||||
* Ensure Attestor:
|
||||
|
||||
* can submit offline‑update DSSE envelopes to Rekor,
|
||||
* can emit verification routines (used by CLI and Scanner) that:
|
||||
|
||||
* verify the DSSE signature,
|
||||
* check the certificate chain against the configured root pack (FIPS/eIDAS/GOST/SM, etc.),([Stella Ops][4])
|
||||
* verify Rekor inclusion using either live log or local snapshot.
|
||||
|
||||
* For fully air‑gapped installs:
|
||||
|
||||
* rely on Rekor **snapshots mirrored** into Offline Kit (already recommended in the operator guide’s offline section).([git.stella-ops.org][2])
|
||||
|
||||
---
|
||||
|
||||
### 3.5 CLI & UI
|
||||
|
||||
Extend CLI with explicit verbs (matching EXPORT‑ATTEST sprints):([git.stella-ops.org][10])
|
||||
|
||||
* `stella attest bundle verify --bundle path/to/offline-kit.tgz --rekor-key rekor.pub`
|
||||
* `stella attest bundle import --bundle ...` (for sites that prefer a two‑step “verify then import” flow)
|
||||
* Wire UI Admin → Offline Kit screen so that:
|
||||
|
||||
* verification status shows both **Cosign/JWS** and **DSSE/Rekor** state,
|
||||
* policy banners display kit generation time, manifest hash, and DSSE/Rekor freshness.([Stella Ops][11])
|
||||
|
||||
---
|
||||
|
||||
## 4. Determinism & offline‑safety rules
|
||||
|
||||
When touching any of this code, keep these rules front‑of‑mind (they align with the policy DSL and architecture docs):([Stella Ops][4])
|
||||
|
||||
1. **No hidden network dependencies**
|
||||
|
||||
* All verification **must work offline** given the kit + Rekor snapshots.
|
||||
* Any fallback to live Rekor / Fulcio endpoints must be explicitly toggled and never on by default for “offline mode”.
|
||||
|
||||
2. **Stable serialization**
|
||||
|
||||
* DSSE payload JSON:
|
||||
|
||||
* stable ordering of fields,
|
||||
* no float weirdness,
|
||||
* UTC timestamps.
|
||||
|
||||
3. **Replayable imports**
|
||||
|
||||
* Running `import-offline-usage-kit` twice with the same bundle must be a no‑op after the first time.
|
||||
* The DSSE payload for a given snapshot must not change over time; if it does, bump the predicate or snapshot version.
|
||||
|
||||
4. **Explainability**
|
||||
|
||||
* When verification fails, errors must explain **what** mismatched (kit digest, manifest digest, DSSE envelope hash, Rekor inclusion) so auditors can reason about it.
|
||||
|
||||
---
|
||||
|
||||
## 5. Testing & CI expectations
|
||||
|
||||
Tie this into the existing CI workflows (`scanner-determinism.yml`, `attestation-bundle.yml`, `offline-kit` pipelines, etc.):([git.stella-ops.org][12])
|
||||
|
||||
### 5.1 Unit & integration tests
|
||||
|
||||
Write tests that cover:
|
||||
|
||||
1. **Happy paths**
|
||||
|
||||
* Full kit import with valid:
|
||||
|
||||
* Cosign,
|
||||
* manifest JWS,
|
||||
* DSSE,
|
||||
* Rekor proof (online and offline modes).
|
||||
|
||||
2. **Corruption scenarios**
|
||||
|
||||
* Tampered feed file (hash mismatch).
|
||||
* Tampered `offline-manifest.json`.
|
||||
* Tampered DSSE payload (signature fails).
|
||||
* Mismatched Rekor entry (payload digest doesn’t match DSSE).
|
||||
|
||||
3. **Offline scenarios**
|
||||
|
||||
* No network access, only Rekor snapshot:
|
||||
|
||||
* DSSE verification still passes,
|
||||
* Rekor proof validates against local tree head.
|
||||
|
||||
4. **Roll‑back logic**
|
||||
|
||||
* Import fails at DSSE/Rekor step:
|
||||
|
||||
* scanner DB still points at previous feeds,
|
||||
* metrics/logs show failure and no partial state.
|
||||
|
||||
### 5.2 SLOs & observability
|
||||
|
||||
Reuse metrics suggested by DSSE/Rekor guide and adapt to OUK imports:([git.stella-ops.org][2])
|
||||
|
||||
* `offlinekit_import_total{status="success|failed_dsse|failed_rekor|failed_cosign"}`
|
||||
* `offlinekit_attestation_verify_latency_seconds` (histogram)
|
||||
* `attestor_rekor_success_total`, `attestor_rekor_retry_total`, `rekor_inclusion_latency`
|
||||
* Dashboards: kit versions per environment, time since last kit, DSSE/Rekor health.
|
||||
|
||||
---
|
||||
|
||||
## 6. Developer checklist (TL;DR)
|
||||
|
||||
When you pick up a task touching DSSE‑signed offline updates:
|
||||
|
||||
1. **Read the background**
|
||||
|
||||
* `docs/modules/scanner/operations/dsse-rekor-operator-guide.md`([git.stella-ops.org][2])
|
||||
* `docs/24_OFFLINE_KIT.md` (and public offline kit guide).([git.stella-ops.org][1])
|
||||
* Relevant sprint file (`SPRINT_160_export_evidence`, `SPRINT_162_exportcenter_i`, etc.).([git.stella-ops.org][10])
|
||||
|
||||
2. **Implement**
|
||||
|
||||
* Generate DSSE payloads in Export Center only.
|
||||
* Call Signer & Attestor; persist DSSE + Rekor JSON next to manifests.
|
||||
* Extend OUK builder to include attestation bundle and list it in `offline-manifest.json`.
|
||||
* Update Scanner import flow to verify DSSE/Rekor before swapping feeds.
|
||||
|
||||
3. **Test**
|
||||
|
||||
* Unit tests for bundle composition & schema.
|
||||
* Integration tests for import + rollback.
|
||||
* Determinism tests (same inputs → same DSSE payload).
|
||||
|
||||
4. **Wire telemetry**
|
||||
|
||||
* Counters + latency histograms.
|
||||
* Logs with `offlineKitHash`, `attestationDigest`, `rekorUuid`.
|
||||
|
||||
5. **Document**
|
||||
|
||||
* Update `docs/modules/export-center/architecture.md`, `docs/modules/scanner/architecture.md`, and the OUK docs where flows or contracts changed.([git.stella-ops.org][5])
|
||||
|
||||
---
|
||||
|
||||
If you tell me which module you’re actually coding in next (Scanner, Export Center, CLI, or Attestor), I can turn this into a very concrete “AGENTS.md‑style” section with exact file paths, class names, and a starter test layout for that module.
|
||||
|
||||
[1]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7bf40b8589c94078e8eadb240553c02f097a5127/docs/24_OFFLINE_KIT.md "git.stella-ops.org/24_OFFLINE_KIT.md at 7bf40b8589c94078e8eadb240553c02f097a5127 - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[2]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/13e4b53dda1575ba46c6188c794fd465ec6fdeec/docs/modules/scanner/operations/dsse-rekor-operator-guide.md "git.stella-ops.org/dsse-rekor-operator-guide.md at 13e4b53dda1575ba46c6188c794fd465ec6fdeec - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[3]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/raw/commit/61f963fd52cd4d6bb2f86afc5a82eac04c04b00e/docs/implplan/SPRINT_162_exportcenter_i.md?utm_source=chatgpt.com "https://git.stella-ops.org/stella-ops.org/git.stel..."
|
||||
[4]: https://stella-ops.org/docs/07_high_level_architecture/index.html?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops"
|
||||
[5]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/d870da18ce194c6a5f1a6d71abea36205d9fb276/docs/export-center/architecture.md?utm_source=chatgpt.com "Export Center Architecture - Stella Ops"
|
||||
[6]: https://stella-ops.org/docs/moat/?utm_source=chatgpt.com "Open • Sovereign • Modular container security - Stella Ops"
|
||||
[7]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/79b8e53441e92dbc63684f42072434d40b80275f/src/ExportCenter?utm_source=chatgpt.com "Code - Stella Ops"
|
||||
[8]: https://stella-ops.org/docs/24_offline_kit/?utm_source=chatgpt.com "Offline Update Kit (OUK) — Air‑Gap Bundle - Stella Ops – Open ..."
|
||||
[9]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/7768555f2d107326050cc5ff7f5cb81b82b7ce5f/AGENTS.md "git.stella-ops.org/AGENTS.md at 7768555f2d107326050cc5ff7f5cb81b82b7ce5f - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
[10]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/66cb6c4b8af58a33efa1521b7953dda834431497/docs/implplan/SPRINT_160_export_evidence.md?utm_source=chatgpt.com "git.stella-ops.org/SPRINT_160_export_evidence.md at ..."
|
||||
[11]: https://stella-ops.org/about/?utm_source=chatgpt.com "Signed Reachability · Deterministic Replay · Sovereign Crypto"
|
||||
[12]: https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/actions/?actor=0&status=0&workflow=sdk-publish.yml&utm_source=chatgpt.com "Actions - git.stella-ops.org - Gitea: Git with a cup of tea"
|
||||
|
||||
Here’s a simple metric that will make your security UI (and teams) radically better: **Time‑to‑Evidence (TTE)** — the time from opening a finding to seeing *raw proof* (a data‑flow edge, an SBOM line, or a VEX note), not a summary.
|
||||
|
||||
---
|
||||
|
||||
### What it is
|
||||
|
||||
* **Definition:** TTE = `t_first_proof_rendered − t_open_finding`.
|
||||
* **Proof =** the exact artifact or path that justifies the claim (e.g., `package-lock.json: line 214 → openssl@1.1.1`, `reachability: A → B → C sink`, or `VEX: not_affected due to unreachable code`).
|
||||
* **Target:** **P95 ≤ 15s** (stretch: P99 ≤ 30s). If 95% of findings show proof within 15 seconds, the UI stays honest: evidence before opinion, low noise, fast explainability.
|
||||
|
||||
---
|
||||
|
||||
### Why it matters
|
||||
|
||||
* **Trust:** People accept decisions they can *verify* quickly.
|
||||
* **Triage speed:** Proof-first UIs cut back-and-forth and guesswork.
|
||||
* **Noise control:** If you can’t surface proof fast, you probably shouldn’t surface the finding yet.
|
||||
|
||||
---
|
||||
|
||||
### How to measure (engineering‑ready)
|
||||
|
||||
* Emit two stamps per finding view:
|
||||
|
||||
* `t_open_finding` (on route enter or modal open).
|
||||
* `t_first_proof_rendered` (first DOM paint of SBOM line / path list / VEX clause).
|
||||
* Store as `tte_ms` in a lightweight events table (Postgres) with tags: `tenant`, `finding_id`, `proof_kind` (`sbom|reachability|vex`), `source` (`local|remote|cache`).
|
||||
* Nightly rollup: compute P50/P90/P95/P99 by proof_kind and page.
|
||||
* Alert when **P95 > 15s** for 15 minutes.
|
||||
|
||||
---
|
||||
|
||||
### UI contract (keeps the UX honest)
|
||||
|
||||
* **Above the fold:** always show a compact **Proof panel** first (not hidden behind tabs).
|
||||
* **Skeletons over spinners:** reserve space; render partial proof as soon as any piece is ready.
|
||||
* **Plain text copy affordance:** “Copy SBOM line / path” button right next to the proof.
|
||||
* **Defer non‑proof widgets:** CVSS badges, remediation prose, and charts load *after* proof.
|
||||
* **Empty‑state truth:** if no proof exists, say “No proof available yet” and show the loader for *that* proof type only (don’t pretend with summaries).
|
||||
|
||||
---
|
||||
|
||||
### Backend rules of thumb
|
||||
|
||||
* **Pre‑index for first paint:** cache top N proof items per hot finding (e.g., first SBOM hit + shortest path).
|
||||
* **Bound queries:** proof queries must be *O(log n)* on indexed columns (pkg name@version, file hash, graph node id).
|
||||
* **Chunked streaming:** send first proof chunk <200 ms after backend hit; don’t hold for the full set.
|
||||
* **Timeout budget:** 12s backend budget + 3s UI/render margin = 15s P95.
|
||||
|
||||
---
|
||||
|
||||
### Minimal contract to add in your code
|
||||
|
||||
```ts
// Frontend: fire on open
metrics.emit('finding_open', { findingId, t: performance.now() });

// When the first real proof node/line hits the DOM:
metrics.emit('proof_rendered', { findingId, proofKind, t: performance.now() });
```
|
||||
|
||||
```sql
-- Rollup (hourly)
SELECT
  proof_kind,
  percentile_cont(0.95) WITHIN GROUP (ORDER BY tte_ms) AS p95_ms
FROM tte_events
WHERE ts >= now() - interval '1 hour'
GROUP BY proof_kind;
```
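If you also want a server-side sink for these events, a minimal .NET endpoint could look like the sketch below. It assumes Npgsql's DI helpers and the `tte_events` table above; the route, record, and connection-string names are placeholders to adapt to your schema.

```csharp
// Minimal ASP.NET Core sink for TTE events; tte_ms is computed client-side
// from performance.now() deltas, we only persist it.
using Npgsql;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddNpgsqlDataSource(builder.Configuration.GetConnectionString("Telemetry")!);
var app = builder.Build();

app.MapPost("/telemetry/tte", async (TteEvent evt, NpgsqlDataSource db) =>
{
    await using var cmd = db.CreateCommand(
        "INSERT INTO tte_events (finding_id, proof_kind, source, tte_ms, ts) " +
        "VALUES ($1, $2, $3, $4, now())");
    cmd.Parameters.Add(new NpgsqlParameter { Value = evt.FindingId });
    cmd.Parameters.Add(new NpgsqlParameter { Value = evt.ProofKind });
    cmd.Parameters.Add(new NpgsqlParameter { Value = evt.Source });
    cmd.Parameters.Add(new NpgsqlParameter { Value = evt.TteMs });
    await cmd.ExecuteNonQueryAsync();
    return Results.Accepted();
});

app.Run();

// Hypothetical payload shape; mirror whatever your frontend metrics client sends.
public sealed record TteEvent(string FindingId, string ProofKind, string Source, int TteMs);
```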
|
||||
|
||||
---
|
||||
|
||||
### What to put on the team dashboard
|
||||
|
||||
* **TTE P95 by page** (Findings list, Finding details).
|
||||
* **TTE P95 by proof_kind** (sbom / reachability / vex).
|
||||
* **Error budget burn**: minutes over target per day.
|
||||
* **Top regressions**: last 7 days vs prior 7.
|
||||
|
||||
---
|
||||
|
||||
### Acceptance checklist for any finding view
|
||||
|
||||
* [ ] First paint shows a real proof snippet (not a summary).
|
||||
* [ ] “Copy proof” button works within 1 click.
|
||||
* [ ] TTE P95 in staging ≤ 10s; in prod ≤ 15s.
|
||||
* [ ] If proof missing, explicit empty‑state + retry path.
|
||||
* [ ] Telemetry sampled ≥ 50% of sessions (or 100% for internal).
|
||||
|
||||
---
|
||||
|
||||
If you want, I can turn this into a **`docs/UX/tte.md`** spec plus a **Grafana panel JSON** and a tiny **.NET middleware** to emit the metrics—just say the word and I’ll drop in ready‑to‑use snippets.
|
||||
Perfect, got it — let’s turn **Time‑to‑Evidence (TTE)** into a proper set of *“how we track UX health”* guidelines.
|
||||
|
||||
I’ll treat this like something you can drop into an internal engineering/UX doc.
|
||||
|
||||
---
|
||||
|
||||
## 1. What exactly is Time‑to‑Evidence?
|
||||
|
||||
**Definition**
|
||||
|
||||
> **TTE = t_first_proof_rendered − t_open_finding**
|
||||
|
||||
* **t_open_finding** – when the user first opens a “finding” / detail view (e.g., vulnerability, alert, ticket, log event).
|
||||
* **t_first_proof_rendered** – when the UI first paints **actual evidence** that backs the finding, for example:
|
||||
|
||||
* The SBOM row showing `package@version`.
|
||||
* The call‑graph/data‑flow path to a sink.
|
||||
* A VEX note explaining why something is (not) affected.
|
||||
* A raw log snippet that the alert is based on.
|
||||
|
||||
**Key principle:**
|
||||
TTE measures **how long users have to trust you blindly** before they can see proof with their own eyes.
|
||||
|
||||
---
|
||||
|
||||
## 2. UX health goals & targets
|
||||
|
||||
Treat TTE like latency SLOs:
|
||||
|
||||
* **Primary SLO**:
|
||||
|
||||
* **P95 TTE ≤ 15s** for all findings in normal conditions.
|
||||
* **Stretch SLO**:
|
||||
|
||||
* **P99 TTE ≤ 30s** for heavy cases (big graphs, huge SBOMs, cold caches).
|
||||
* **Guardrail**:
|
||||
|
||||
* P50 TTE should be **< 3s**. If the median creeps up, you’re in trouble even if P95 looks OK.
|
||||
|
||||
You can refine by feature:
|
||||
|
||||
* “Simple” proof (single SBOM row, small payload):
|
||||
|
||||
* P95 ≤ 5s.
|
||||
* “Complex” proof (reachability graph, cross‑repo joins):
|
||||
|
||||
* P95 ≤ 15s.
|
||||
|
||||
**UX rule of thumb**
|
||||
|
||||
* < 2s: feels instant.
|
||||
* 2–10s: acceptable if clearly loading something heavy.
|
||||
* > 10s: needs **strong** feedback (progress, partial results, explanations).
|
||||
* > 30s: the system should probably **offer fallback** (e.g., “download raw evidence” or “retry”).
|
||||
|
||||
---
|
||||
|
||||
## 3. Instrumentation guidelines
|
||||
|
||||
### 3.1 Event model
|
||||
|
||||
Emit two core events per finding view:
|
||||
|
||||
1. **`finding_open`**
|
||||
|
||||
* When user opens the finding details (route enter / modal open).
|
||||
* Must include:
|
||||
|
||||
* `finding_id`
|
||||
* `tenant_id` / `org_id`
|
||||
* `user_role` (admin, dev, triager, etc.)
|
||||
* `entry_point` (list, search, notification, deep link)
|
||||
* `ui_version` / `build_sha`
|
||||
|
||||
2. **`proof_rendered`**
|
||||
|
||||
* First time *any* qualifying proof element is painted.
|
||||
* Must include:
|
||||
|
||||
* `finding_id`
|
||||
* `proof_kind` (`sbom | reachability | vex | logs | other`)
|
||||
* `source` (`local_cache | backend_api | 3rd_party`)
|
||||
* `proof_height` (e.g., pixel offset from top) – to ensure it’s actually above the fold or very close.
|
||||
|
||||
**Derived metric**
|
||||
|
||||
Your telemetry pipeline should compute:
|
||||
|
||||
```text
tte_ms = proof_rendered.timestamp - finding_open.timestamp
```
|
||||
|
||||
If there are multiple `proof_rendered` events for the same `finding_open`, use:
|
||||
|
||||
* **TTE (first proof)** – minimum timestamp; primary SLO.
|
||||
* Optionally: **TTE (full evidence)** – last proof in a defined “bundle” (e.g., path + SBOM row).
|
||||
|
||||
### 3.2 Implementation notes
|
||||
|
||||
**Frontend**
|
||||
|
||||
* Emit `finding_open` as soon as:
|
||||
|
||||
* The route is confirmed and
|
||||
* You know which `finding_id` is being displayed.
|
||||
* Emit `proof_rendered`:
|
||||
|
||||
* **Not** when you *fetch* data, but when at least one evidence component is **visibly rendered**.
|
||||
* Easiest approach: hook into component lifecycle / intersection observer on the evidence container.
|
||||
|
||||
Pseudo‑example:
|
||||
|
||||
```ts
// On route/mount:
metrics.emit('finding_open', {
  findingId,
  entryPoint,
  userRole,
  uiVersion,
  t: performance.now()
});

// In EvidencePanel component, after first render with real data:
if (!hasEmittedProof && hasRealEvidence) {
  metrics.emit('proof_rendered', {
    findingId,
    proofKind: 'sbom',
    source: 'backend_api',
    t: performance.now()
  });
  hasEmittedProof = true;
}
```
|
||||
|
||||
**Backend**
|
||||
|
||||
* No special requirement beyond:
|
||||
|
||||
* Stable IDs (`finding_id`).
|
||||
* Knowing which API endpoints respond with evidence payloads — you’ll want to correlate backend latency with TTE later.
|
||||
|
||||
---
|
||||
|
||||
## 4. Data quality & sampling
|
||||
|
||||
If you want TTE to drive decisions, the data must be boringly reliable.
|
||||
|
||||
**Guidelines**
|
||||
|
||||
1. **Sample rate**
|
||||
|
||||
* Start with **100%** in staging.
|
||||
* In production, aim for **≥ 25% of sessions** for TTE events at minimum; 100% is ideal if volume is reasonable.
|
||||
|
||||
2. **Clock skew**
|
||||
|
||||
* Prefer **frontend timestamps** using `performance.now()` for TTE; they’re monotonic within a tab.
|
||||
* Don’t mix backend clocks into the TTE calculation.
|
||||
|
||||
3. **Bot / synthetic traffic**
|
||||
|
||||
* Tag synthetic tests (`is_synthetic = true`) and exclude them from UX health dashboards.
|
||||
|
||||
4. **Retry behavior**
|
||||
|
||||
* If the proof fails to load and user hits “retry”:
|
||||
|
||||
* Treat it as a separate measurement (`retry = true`) or
|
||||
* Log an additional `proof_error` event with error class (timeout, 5xx, network, parse, etc.).
|
||||
|
||||
---
|
||||
|
||||
## 5. Dashboards: how to watch TTE
|
||||
|
||||
You want a small, opinionated set of views that answer:
|
||||
|
||||
> “Is UX getting better or worse for people trying to understand findings?”
|
||||
|
||||
### 5.1 Core widgets
|
||||
|
||||
1. **TTE distribution**
|
||||
|
||||
* P50 / P90 / P95 / P99 per day (or per release).
|
||||
* Split by `proof_kind`.
|
||||
|
||||
2. **TTE by page / surface**
|
||||
|
||||
* Finding list → detail.
|
||||
* Deep links from notifications.
|
||||
* Direct URLs / bookmarks.
|
||||
|
||||
3. **TTE by user segment**
|
||||
|
||||
* New users vs power users.
|
||||
* Different roles (security engineer vs application dev).
|
||||
|
||||
4. **Error budget panel**
|
||||
|
||||
* “Minutes over SLO per day” – e.g., sum of all user‑minutes where TTE > 15s.
|
||||
* Use this to prioritize work.
|
||||
|
||||
5. **Correlation with engagement**
|
||||
|
||||
* Scatter: TTE vs session length, or TTE vs “user clicked ‘ignore’ / ‘snooze’”.
|
||||
* Aim to confirm the obvious: **long TTE → worse engagement/completion**.
|
||||
|
||||
### 5.2 Operational details
|
||||
|
||||
* Update granularity: **real‑time or ≤15 min** for on‑call/ops panels.
|
||||
* Retention: at least **90 days** to see trends across big releases.
|
||||
* Breakdowns:
|
||||
|
||||
* `backend_region` (to catch regional issues).
|
||||
* `build_version` (to spot regressions quickly).
|
||||
|
||||
---
|
||||
|
||||
## 6. UX & engineering design rules anchored in TTE
|
||||
|
||||
These are the **behavior rules** for the product that keep TTE healthy.
|
||||
|
||||
### 6.1 “Evidence first” layout rules
|
||||
|
||||
* **Evidence above the fold**
|
||||
|
||||
* At least *one* proof element must be visible **without scrolling** on a typical laptop viewport.
|
||||
* **Summary second**
|
||||
|
||||
* CVSS scores, severity badges, long descriptions: all secondary. Evidence should come *before* opinion.
|
||||
* **No fake proof**
|
||||
|
||||
* Don’t use placeholders that *look* like evidence but aren’t (e.g., “example path” or generic text).
|
||||
* If evidence is still loading, show a clear skeleton/loader with “Loading evidence…”.
|
||||
|
||||
### 6.2 Loading strategy rules
|
||||
|
||||
* Start fetching evidence **as soon as navigation begins**, not after the page is fully mounted.
|
||||
* Use **lazy loading** for non‑critical widgets until after proof is shown.
|
||||
* If a call is known to be heavy:
|
||||
|
||||
* Consider **precomputing** and caching the top evidence (shortest path, first SBOM hit).
|
||||
* Stream results: render first proof item as soon as it arrives; don’t wait for the full list.
|
||||
|
||||
### 6.3 Empty / error state rules
|
||||
|
||||
* If there is genuinely no evidence:
|
||||
|
||||
* Explicitly say **“No supporting evidence available yet”** and treat TTE as:
|
||||
|
||||
* Either “no value” (excluded), or
|
||||
* A special bucket `proof_kind = "none"`.
|
||||
* If loading fails:
|
||||
|
||||
* Show a clear error and a **retry** that re‑emits `proof_rendered` when successful.
|
||||
* Log `proof_error` with reason; track error rate alongside TTE.
|
||||
|
||||
---
|
||||
|
||||
## 7. How to *use* TTE in practice
|
||||
|
||||
### 7.1 For releases
|
||||
|
||||
For any change that affects findings UI or evidence plumbing:
|
||||
|
||||
* Add a release checklist item:
|
||||
|
||||
* “No regression on TTE P95 for [pages X, Y].”
|
||||
* During rollout:
|
||||
|
||||
* Compare **pre‑ vs post‑release** TTE P95 by `ui_version`.
|
||||
* If regression > 20%:
|
||||
|
||||
* Roll back, or
|
||||
* Add a follow‑up ticket explicitly tagged with the regression.
|
||||
|
||||
### 7.2 For experiments / A/B tests
|
||||
|
||||
When running UI experiments around findings:
|
||||
|
||||
* Always capture TTE per variant.
|
||||
* Compare:
|
||||
|
||||
* TTE P50/P95.
|
||||
* Task completion rate (e.g., “user changed status”).
|
||||
* Subjective UX (CSAT) if you have it.
|
||||
|
||||
You’re looking for patterns like:
|
||||
|
||||
* Variant B: **+5% completion**, **+8% TTE** → maybe OK.
|
||||
* Variant C: **+2% completion**, **+70% TTE** → probably not acceptable.
|
||||
|
||||
### 7.3 For prioritization
|
||||
|
||||
Use TTE as a lever in planning:
|
||||
|
||||
* If P95 TTE is healthy and stable:
|
||||
|
||||
* More room for new features / experiments.
|
||||
* If P95 TTE is trending up for 2+ weeks:
|
||||
|
||||
* Time to schedule a “TTE debt” story: caching, query optimization, UI re‑layout, etc.
|
||||
|
||||
---
|
||||
|
||||
## 8. Quick “TTE‑ready” checklist
|
||||
|
||||
You’re “tracking UX health with TTE” if you can honestly tick these:
|
||||
|
||||
1. **Instrumentation**
|
||||
|
||||
* [ ] `finding_open` + `proof_rendered` events exist and are correlated.
|
||||
* [ ] TTE computed in a stable pipeline (joins, dedupe, etc.).
|
||||
2. **Targets**
|
||||
|
||||
* [ ] TTE SLOs defined (P95, P99) and agreed by UX + engineering.
|
||||
3. **Dashboards**
|
||||
|
||||
* [ ] A dashboard shows TTE by proof kind, page, and release.
|
||||
* [ ] On‑call / ops can see TTE in near real‑time.
|
||||
4. **UX rules**
|
||||
|
||||
* [ ] Evidence is visible above the fold for all main finding types.
|
||||
* [ ] Non‑critical widgets load after evidence.
|
||||
* [ ] Empty/error states are explicit about evidence availability.
|
||||
5. **Process**
|
||||
|
||||
* [ ] Major UI changes check TTE pre vs post as part of release acceptance.
|
||||
* [ ] Regressions in TTE create real tickets, not just “we’ll watch it”.
|
||||
|
||||
---
|
||||
|
||||
If you tell me what stack you’re on (e.g., React + Next.js + OpenTelemetry + X observability tool), I can turn this into concrete code snippets and an example dashboard spec (fields, queries, charts) tailored exactly to your setup.
|
||||
@@ -0,0 +1,576 @@
|
||||
Here’s a tight, practical blueprint to turn your SBOM→VEX links into an auditable “proof spine”—using signed DSSE statements and a per‑dependency trust anchor—so every VEX verdict can be traced, verified, and replayed.
|
||||
|
||||
# What this gives you
|
||||
|
||||
* A **chain of evidence** from each SBOM entry → analysis → VEX verdict.
|
||||
* **Tamper‑evident** DSSE‑signed records (offline‑friendly).
|
||||
* **Deterministic replay**: same inputs → same verdicts (great for audits/regulators).
|
||||
|
||||
# Core objects (canonical IDs)
|
||||
|
||||
* **ArtifactID**: digest of package/container (e.g., `sha256:…`).
|
||||
* **SBOMEntryID**: stable ID for a component in an SBOM (`sbomDigest:package@version[:purl]`).
|
||||
* **EvidenceID**: hash of raw evidence (scanner JSON, reachability, exploit intel).
|
||||
* **ReasoningID**: hash of normalized reasoning (rules/lattice inputs used).
|
||||
* **VEXVerdictID**: hash of the final VEX statement body.
|
||||
* **ProofBundleID**: merkle root of {SBOMEntryID, EvidenceID[], ReasoningID, VEXVerdictID}.
|
||||
* **TrustAnchorID**: per‑dependency anchor (public key + policy) used to validate the above.
|
||||
|
||||
# Signed DSSE envelopes you’ll produce
|
||||
|
||||
1. **Evidence Statement** (per evidence item)
|
||||
|
||||
* `subject`: SBOMEntryID
|
||||
* `predicateType`: `evidence.stella/v1`
|
||||
* `predicate`: source, tool version, timestamps, EvidenceID
|
||||
* **Signers**: scanner/ingestor key
|
||||
|
||||
2. **Reasoning Statement**
|
||||
|
||||
* `subject`: SBOMEntryID
|
||||
* `predicateType`: `reasoning.stella/v1` (your lattice/policy inputs + ReasoningID)
|
||||
* **Signers**: “Policy/Lattice Engine” key (Authority)
|
||||
|
||||
3. **VEX Verdict Statement**
|
||||
|
||||
* `subject`: SBOMEntryID
|
||||
* `predicateType`: CycloneDX or CSAF VEX; `predicate`: the VEX body + VEXVerdictID
|
||||
* **Signers**: VEXer key (or vendor key if you have it)
|
||||
|
||||
4. **Proof Spine Statement** (the spine itself)
|
||||
|
||||
* `subject`: SBOMEntryID
|
||||
* `predicateType`: `proofspine.stella/v1`
|
||||
* `predicate`: EvidenceID[], ReasoningID, VEXVerdictID, ProofBundleID
|
||||
* **Signers**: Authority key
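All four envelopes share the same signing mechanics. A minimal sketch of that step using .NET's built-in ECDSA (the payload type is the standard in-toto media type; key handling is simplified, since Authority owns the real key material and algorithm policy):

```csharp
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class DsseSigner
{
    // DSSE pre-authentication encoding (PAE):
    // "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    private static byte[] Pae(string payloadType, byte[] payload)
    {
        var header = $"DSSEv1 {Encoding.UTF8.GetByteCount(payloadType)} {payloadType} {payload.Length} ";
        return Encoding.UTF8.GetBytes(header).Concat(payload).ToArray();
    }

    // Returns the base64 payload and signature fields for a DSSE envelope
    // around an in-toto statement; the ECDsa instance stands in for a key
    // resolved from Authority/KMS.
    public static (string PayloadB64, string SignatureB64) Sign(string statementJson, ECDsa signer)
    {
        const string payloadType = "application/vnd.in-toto+json";
        var payload = Encoding.UTF8.GetBytes(statementJson);
        var signature = signer.SignData(Pae(payloadType, payload), HashAlgorithmName.SHA256);
        return (Convert.ToBase64String(payload), Convert.ToBase64String(signature));
    }
}
```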
|
||||
|
||||
# Trust model (per‑dependency anchor)
|
||||
|
||||
* **TrustAnchor** (per package/purl): { TrustAnchorID, allowed signers (KMS refs, PKs), accepted predicateTypes, policy version, revocation list }.
|
||||
* Store anchors in **Authority** and pin them in your graph by SBOMEntryID→TrustAnchorID.
|
||||
* Optional: PQC mode (Dilithium/Falcon) for long‑term archives.
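A sketch of what the anchor record and purl-pattern resolution might look like (the record shape and glob-style matching are illustrative assumptions, not the final Authority API):

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed record TrustAnchor(
    string TrustAnchorId,
    string PurlPattern,                       // e.g. "pkg:npm/*" or an exact purl
    IReadOnlyList<string> AllowedKeyIds,
    IReadOnlyList<string> AcceptedPredicateTypes,
    string PolicyVersion,
    IReadOnlyList<string> RevokedKeyIds);

public static class TrustAnchorResolver
{
    // Most specific pattern wins: longest non-wildcard prefix first.
    public static TrustAnchor? Resolve(string purl, IEnumerable<TrustAnchor> anchors) =>
        anchors
            .Where(a => Matches(purl, a.PurlPattern))
            .OrderByDescending(a => a.PurlPattern.TakeWhile(c => c != '*').Count())
            .FirstOrDefault();

    private static bool Matches(string purl, string pattern)
    {
        var star = pattern.IndexOf('*');
        return star < 0
            ? string.Equals(purl, pattern, StringComparison.Ordinal)
            : purl.StartsWith(pattern[..star], StringComparison.Ordinal);
    }
}
```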
|
||||
|
||||
# Verification pipeline (deterministic)
|
||||
|
||||
1. Resolve SBOMEntryID → TrustAnchorID.
|
||||
2. Verify every DSSE envelope’s signature **against the anchor’s allowed keys**.
|
||||
3. Recompute EvidenceID/ReasoningID/VEXVerdictID from raw content; compare hashes.
|
||||
4. Recompute ProofBundleID (merkle root) and compare to the spine.
|
||||
5. Emit a **Receipt**: {ProofBundleID, verification log, tool digests}. Cache it.
|
||||
|
||||
# Storage layout (Postgres + blob store)
|
||||
|
||||
* `sbom_entries(entry_id PK, bom_digest, purl, version, artifact_digest, trust_anchor_id)`
|
||||
* `dsse_envelopes(env_id PK, entry_id, predicate_type, signer_keyid, body_hash, envelope_blob_ref, signed_at)`
|
||||
* `spines(entry_id PK, bundle_id, evidence_ids[], reasoning_id, vex_id, anchor_id, created_at)`
|
||||
* `trust_anchors(anchor_id PK, purl_pattern, allowed_keyids[], policy_ref, revoked_keys[])`
|
||||
* Blobs (immutable): raw evidence, normalized reasoning JSON, VEX JSON, DSSE bytes.
|
||||
|
||||
# API surface (clean and small)
|
||||
|
||||
* `POST /proofs/:entry/spine` → submit or update spine (idempotent by ProofBundleID)
|
||||
* `GET /proofs/:entry/receipt` → full verification receipt (JSON)
|
||||
* `GET /proofs/:entry/vex` → the verified VEX body
|
||||
* `GET /anchors/:anchor` → fetch trust anchor (for offline kits)
|
||||
|
||||
# Normalization rules (so hashes are stable)
|
||||
|
||||
* Canonical JSON (UTF‑8, sorted keys, no insignificant whitespace).
|
||||
* Strip volatile fields (timestamps that aren’t part of the semantic claim).
|
||||
* Version your schemas: `evidence.stella/v1`, `reasoning.stella/v1`, etc.
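For illustration, a minimal canonical-hash helper along those lines (recursive key sort, compact UTF-8 output, then SHA-256); this is a sketch, not the exact canonicalizer the platform ships:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

public static class CanonicalHash
{
    // Recursively rewrite a JSON document with object keys sorted; arrays keep their order.
    private static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        JsonObject obj => new JsonObject(obj.OrderBy(p => p.Key, StringComparer.Ordinal)
                                            .Select(p => KeyValuePair.Create(p.Key, Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(Canonicalize).ToArray()),
        _ => node is null ? null : JsonNode.Parse(node.ToJsonString())
    };

    public static string Sha256OfCanonicalJson(string json)
    {
        var canonical = Canonicalize(JsonNode.Parse(json))!.ToJsonString();
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical))).ToLowerInvariant();
    }
}
```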
|
||||
|
||||
# Signing keys & rotation
|
||||
|
||||
* Keep keys in your **Authority** module (KMS/HSM; offline export for air‑gap).
|
||||
* Publish key material via an **attestation feed** (or Rekor‑mirror) for third‑party audit.
|
||||
* Rotate by **adding** new allowed_keyids in the TrustAnchor; never mutate old envelopes.
|
||||
|
||||
# CI/CD hooks
|
||||
|
||||
* On SBOM ingest → create/refresh SBOMEntry rows + attach TrustAnchor.
|
||||
* On scan completion → produce Evidence Statements (DSSE) immediately.
|
||||
* On policy evaluation → produce Reasoning + VEX, then assemble Spine.
|
||||
* Gate releases on `GET /proofs/:entry/receipt` == PASS.
|
||||
|
||||
# UX (auditor‑friendly)
|
||||
|
||||
* **Proof timeline** per entry: SBOM → Evidence tiles → Reasoning → VEX → Receipt.
|
||||
* One‑click “Recompute & Compare” to show deterministic replay passes.
|
||||
* Red/amber flags when a signature no longer matches a TrustAnchor or a key is revoked.
|
||||
|
||||
# Minimal dev checklist
|
||||
|
||||
* [ ] Implement canonicalizers (Evidence, Reasoning, VEX).
|
||||
* [ ] Implement DSSE sign/verify (ECDSA + optional PQC).
|
||||
* [ ] TrustAnchor registry + resolver by purl pattern.
|
||||
* [ ] Merkle bundling to get ProofBundleID.
|
||||
* [ ] Receipt generator + verifier.
|
||||
* [ ] Postgres schema + blob GC (content‑addressed).
|
||||
* [ ] CI gates + API endpoints above.
|
||||
* [ ] Auditor UI: timeline + diff + receipts download.
|
||||
|
||||
If you want, I can drop in a ready‑to‑use JSON schema set (`evidence.stella/v1`, `reasoning.stella/v1`, `proofspine.stella/v1`) and sample DSSE envelopes wired to your .NET 10 stack.
|
||||
Here’s a focused **Stella Ops Developer Guidelines** doc, specifically for the pipeline that turns **SBOM data into verifiable proofs** (your SBOM → Evidence → Reasoning → VEX → Proof Spine).
|
||||
|
||||
Feel free to paste this into your internal handbook and tweak names to match your repos/services.
|
||||
|
||||
---
|
||||
|
||||
# Stella Ops Developer Guidelines
|
||||
|
||||
## Turning SBOM Data Into Verifiable Proofs
|
||||
|
||||
---
|
||||
|
||||
## 1. Mental Model: What You’re Actually Building
|
||||
|
||||
For every component in an SBOM, Stella must be able to answer, *“Why should anyone trust our VEX verdict for this dependency, today and ten years from now?”*
|
||||
|
||||
We do that with a pipeline:
|
||||
|
||||
1. **SBOM Ingest**
|
||||
Raw SBOM → validated → normalized → `SBOMEntryID`.
|
||||
|
||||
2. **Evidence Collection**
|
||||
Scans, feeds, configs, reachability, etc. → canonical evidence blobs → `EvidenceID` → DSSE-signed.
|
||||
|
||||
3. **Reasoning / Policy**
|
||||
Policy + evidence → deterministic reasoning → `ReasoningID` → DSSE-signed.
|
||||
|
||||
4. **VEX Verdict**
|
||||
VEX statement (CycloneDX / CSAF) → canonicalized → `VEXVerdictID` → DSSE-signed.
|
||||
|
||||
5. **Proof Spine**
|
||||
`{SBOMEntryID, EvidenceIDs[], ReasoningID, VEXVerdictID}` → merkle bundle → `ProofBundleID` → DSSE-signed.
|
||||
|
||||
6. **Verification & Receipts**
|
||||
Re-run verification → `Receipt` that proves everything above is intact and anchored to trusted keys.
|
||||
|
||||
Everything you do in this area should keep this spine intact and verifiable.
|
||||
|
||||
---
|
||||
|
||||
## 2. Non‑Negotiable Invariants
|
||||
|
||||
These are the rules you don’t break without an explicit, company-level decision:
|
||||
|
||||
1. **Immutability of Signed Facts**
|
||||
|
||||
* DSSE envelopes (evidence, reasoning, VEX, spines) are append‑only.
|
||||
* You never edit or delete content inside a previously signed envelope.
|
||||
* Corrections are made by **superseding** (new statement pointing at the old one).
|
||||
|
||||
2. **Determinism**
|
||||
|
||||
* Same `{SBOMEntryID, Evidence set, policyVersion}` ⇒ same `{ReasoningID, VEXVerdictID, ProofBundleID}`.
|
||||
* No non-deterministic inputs (e.g., “current time”, random IDs) in anything that affects IDs or verdicts.
|
||||
|
||||
3. **Traceability**
|
||||
|
||||
* Every VEX verdict must be traceable back to:
|
||||
|
||||
* The precise SBOM entry
|
||||
* Concrete evidence blobs
|
||||
* A specific policy & reasoning snapshot
|
||||
* A trust anchor defining allowed signers
|
||||
|
||||
4. **Least Trust / Least Privilege**
|
||||
|
||||
* Services only know the keys and data they need.
|
||||
* Trust is always explicit: through **TrustAnchors** and signature verification, never “because it’s in our DB”.
|
||||
|
||||
5. **Backwards Compatibility**
|
||||
|
||||
* New code must continue to verify **old proofs**.
|
||||
* New policies must **not rewrite history**; they produce *new* spines, leaving old ones intact.
|
||||
|
||||
---
|
||||
|
||||
## 3. SBOM Ingestion Guidelines
|
||||
|
||||
**Goal:** Turn arbitrary SBOMs into stable, addressable `SBOMEntryID`s and safe internal models.
|
||||
|
||||
### 3.1 Inputs & Formats
|
||||
|
||||
* Support at least:
|
||||
|
||||
* CycloneDX (JSON)
|
||||
* SPDX (JSON / Tag-Value)
|
||||
* For each ingested SBOM, store:
|
||||
|
||||
* Raw SBOM bytes (immutable, content-addressed)
|
||||
* A normalized internal representation (your own model)
|
||||
|
||||
### 3.2 IDs
|
||||
|
||||
* Generate:
|
||||
|
||||
* `sbomDigest` = hash(raw SBOM, canonical form)
|
||||
* `SBOMEntryID` = `sbomDigest + purl + version` (or equivalent stable tuple)
|
||||
* `SBOMEntryID` must:
|
||||
|
||||
* Not depend on ingestion time or database IDs.
|
||||
* Be reproducible from SBOM + deterministic normalization.
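One way to pin that down, sketched from nothing but the stable tuple (the prefix and separator are placeholders; align them with whatever your normalization layer already emits):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SbomEntryId
{
    // Derived only from stable inputs: SBOM digest, purl, version.
    // No database IDs, no timestamps, so re-ingesting the same SBOM reproduces the same ID.
    public static string Compute(string sbomDigest, string purl, string version)
    {
        var canonical = $"{sbomDigest}\n{purl}\n{version}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return $"sbom-entry:sha256:{Convert.ToHexString(hash).ToLowerInvariant()}";
    }
}
```

For example, `SbomEntryId.Compute("sha256:ab…", "pkg:npm/lodash", "4.17.21")` yields the same ID on every ingestion of that SBOM.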
|
||||
|
||||
### 3.3 Validation & Errors
|
||||
|
||||
* Validate:
|
||||
|
||||
* Syntax (JSON, schema)
|
||||
* Core semantics (package identifiers, digests, versions)
|
||||
* If invalid:
|
||||
|
||||
* Reject the SBOM **but** record a small DSSE “failure attestation” explaining:
|
||||
|
||||
* Why it failed
|
||||
* Which file
|
||||
* Which system version
|
||||
* This still gives you a proof trail for “we tried and it failed”.
|
||||
|
||||
---
|
||||
|
||||
## 4. Evidence Collection Guidelines
|
||||
|
||||
**Goal:** Capture all inputs that influence the verdict in a canonical, signed form.
|
||||
|
||||
Typical evidence types:
|
||||
|
||||
* SCA / vuln scanner results
|
||||
* CVE feeds & advisory data
|
||||
* Reachability / call graph analysis
|
||||
* Runtime context (where this component is used)
|
||||
* Manual assessments (e.g., security engineer verdicts)
|
||||
|
||||
### 4.1 Evidence Canonicalization
|
||||
|
||||
For every evidence item:
|
||||
|
||||
* Normalize to a schema like `evidence.stella/v1` with fields such as:
|
||||
|
||||
* `source` (scanner name, feed)
|
||||
* `sourceVersion` (tool version, DB version)
|
||||
* `collectionTime`
|
||||
* `sbomEntryId`
|
||||
* `vulnerabilityId` (if applicable)
|
||||
* `rawFinding` (or pointer to it)
|
||||
* Canonical JSON rules:
|
||||
|
||||
* Sorted keys
|
||||
* UTF‑8, no extraneous whitespace
|
||||
* No volatile fields beyond what’s semantically needed (e.g., you might include `collectionTime`, but then know it affects the hash and treat that consciously).
|
||||
|
||||
Then:
|
||||
|
||||
* Compute `EvidenceID = hash(canonicalEvidenceJson)`.
|
||||
* Wrap in DSSE:
|
||||
|
||||
* `subject`: `SBOMEntryID`
|
||||
* `predicateType`: `evidence.stella/v1`
|
||||
* `predicate`: canonical evidence + `EvidenceID`.
|
||||
* Sign with **evidence-ingestor key** (per environment).
|
||||
|
||||
### 4.2 Ops Rules
|
||||
|
||||
* **Idempotency:**
|
||||
Re-running the same scan with same inputs should produce the same evidence object and `EvidenceID`.
|
||||
* **Tool changes:**
|
||||
When tool version or configuration changes, that’s a *new* evidence statement with a new `EvidenceID`. Do not overwrite old evidence.
|
||||
* **Partial failure:**
|
||||
If a scan fails, produce a minimal failure evidence record (with error details) instead of “nothing”.
|
||||
|
||||
---
|
||||
|
||||
## 5. Reasoning & Policy Engine Guidelines
|
||||
|
||||
**Goal:** Turn evidence into a defensible, replayable reasoning step with a clear policy version.
|
||||
|
||||
### 5.1 Reasoning Object
|
||||
|
||||
Define a canonical reasoning schema, e.g. `reasoning.stella/v1`:
|
||||
|
||||
* `sbomEntryId`
|
||||
* `evidenceIds[]` (sorted)
|
||||
* `policyVersion`
|
||||
* `inputs`: normalized form of all policy inputs (severity thresholds, lattice rules, etc.)
|
||||
* `intermediateFindings`: optional but useful — e.g., “reachable vulns = …”
|
||||
|
||||
Then:
|
||||
|
||||
* Canonicalize JSON and compute `ReasoningID = hash(canonicalReasoning)`.
|
||||
* Wrap in DSSE:
|
||||
|
||||
* `subject`: `SBOMEntryID`
|
||||
* `predicateType`: `reasoning.stella/v1`
|
||||
* `predicate`: canonical reasoning + `ReasoningID`.
|
||||
* Sign with **Policy/Authority key**.
|
||||
|
||||
### 5.2 Determinism
|
||||
|
||||
* Reasoning functions must be **pure**:
|
||||
|
||||
* Inputs: SBOMEntryID, evidence set, policy version, configuration.
|
||||
* No hidden calls to external APIs at decision time (fetch feeds earlier and record them as evidence).
|
||||
* If you need “current time” in policy:
|
||||
|
||||
* Treat it as **explicit input** and record it inside reasoning under `inputs.currentEvaluationTime`.
|
||||
|
||||
### 5.3 Policy Evolution
|
||||
|
||||
* When policy changes:
|
||||
|
||||
* Bump `policyVersion`.
|
||||
* New evaluations produce new `ReasoningID` and new VEX/spines.
|
||||
* Don’t retroactively apply new policy to old reasoning objects; generate new ones alongside.
|
||||
|
||||
---
|
||||
|
||||
## 6. VEX Verdict Guidelines
|
||||
|
||||
**Goal:** Generate VEX statements that are strongly tied to SBOM entries and your reasoning.
|
||||
|
||||
### 6.1 Shape
|
||||
|
||||
* Target standard formats:
|
||||
|
||||
* CycloneDX VEX
|
||||
* or CSAF
|
||||
* Required linkages:
|
||||
|
||||
* Component reference = `SBOMEntryID` or a resolvable component identifier from your SBOM normalize layer.
|
||||
* Vulnerability IDs (CVE, GHSA, internal IDs).
|
||||
* Status (`not_affected`, `affected`, `fixed`, etc.).
|
||||
* Justification & impact.
|
||||
|
||||
### 6.2 Canonicalization & Signing
|
||||
|
||||
* Define a canonical VEX body schema (subset of the standard + internal metadata):
|
||||
|
||||
* `sbomEntryId`
|
||||
* `vulnerabilityId`
|
||||
* `status`
|
||||
* `justification`
|
||||
* `policyVersion`
|
||||
* `reasoningId`
|
||||
* Canonicalize JSON → `VEXVerdictID = hash(canonicalVexBody)`.
|
||||
* DSSE-envelope:
|
||||
|
||||
* `subject`: `SBOMEntryID`
|
||||
* `predicateType`: e.g. `cdx-vex.stella/v1`
|
||||
* `predicate`: canonical VEX + `VEXVerdictID`.
|
||||
* Sign with **VEXer key** or vendor key (depending on trust anchor).
|
||||
|
||||
### 6.3 External VEX
|
||||
|
||||
* When importing vendor VEX:
|
||||
|
||||
* Verify signature against vendor’s TrustAnchor.
|
||||
* Canonicalize to your internal schema but preserve:
|
||||
|
||||
* Original document
|
||||
* Original signature material
|
||||
* Record “source = vendor” vs “source = stella” so auditors see origin.
|
||||
|
||||
---
|
||||
|
||||
## 7. Proof Spine Guidelines
|
||||
|
||||
**Goal:** Build a compact, tamper-evident “bundle” that ties everything together.
|
||||
|
||||
### 7.1 Structure
|
||||
|
||||
For each `SBOMEntryID`, gather:
|
||||
|
||||
* `EvidenceIDs[]` (sorted lexicographically).
|
||||
* `ReasoningID`.
|
||||
* `VEXVerdictID`.
|
||||
|
||||
Compute:
|
||||
|
||||
* Merkle tree root (or deterministic hash) over:
|
||||
|
||||
* `sbomEntryId`
|
||||
* sorted `EvidenceIDs[]`
|
||||
* `ReasoningID`
|
||||
* `VEXVerdictID`
|
||||
* Result is `ProofBundleID`.
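A minimal sketch of that bundling rule (a flat SHA-256 over the sorted IDs rather than a full Merkle tree; swap in your Merkle builder if you need per-leaf inclusion proofs):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class ProofBundle
{
    // Deterministic root over the spine inputs: sbomEntryId, sorted evidence IDs,
    // reasoningId, vexVerdictId. Same inputs always yield the same ProofBundleID.
    public static string ComputeId(
        string sbomEntryId,
        IEnumerable<string> evidenceIds,
        string reasoningId,
        string vexVerdictId)
    {
        var parts = new List<string> { sbomEntryId };
        parts.AddRange(evidenceIds.OrderBy(x => x, StringComparer.Ordinal));
        parts.Add(reasoningId);
        parts.Add(vexVerdictId);

        var payload = string.Join("\n", parts);
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(payload));
        return $"proof-bundle:sha256:{Convert.ToHexString(hash).ToLowerInvariant()}";
    }
}
```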
|
||||
|
||||
Create a DSSE “spine”:
|
||||
|
||||
* `subject`: `SBOMEntryID`
|
||||
* `predicateType`: `proofspine.stella/v1`
|
||||
* `predicate`:
|
||||
|
||||
* `evidenceIds[]`
|
||||
* `reasoningId`
|
||||
* `vexVerdictId`
|
||||
* `policyVersion`
|
||||
* `proofBundleId`
|
||||
* Sign with **Authority key**.
|
||||
|
||||
### 7.2 Ops Rules
|
||||
|
||||
* Spine generation is idempotent:
|
||||
|
||||
* Same inputs → same `ProofBundleID`.
|
||||
* Never mutate existing spines; new policy or new evidence ⇒ new spine.
|
||||
* Keep a clear API contract:
|
||||
|
||||
* `GET /proofs/:entry` returns **all** spines, each labeled with `policyVersion` and timestamps.
|
||||
|
||||
---
|
||||
|
||||
## 8. Storage & Schema Guidelines
|
||||
|
||||
**Goal:** Keep proofs queryable forever without breaking verification.
|
||||
|
||||
### 8.1 Tables (conceptual)
|
||||
|
||||
* `sbom_entries`: `entry_id`, `bom_digest`, `purl`, `version`, `artifact_digest`, `trust_anchor_id`.
|
||||
* `dsse_envelopes`: `env_id`, `entry_id`, `predicate_type`, `signer_keyid`, `body_hash`, `envelope_blob_ref`, `signed_at`.
|
||||
* `spines`: `entry_id`, `proof_bundle_id`, `policy_version`, `evidence_ids[]`, `reasoning_id`, `vex_verdict_id`, `anchor_id`, `created_at`.
|
||||
* `trust_anchors`: `anchor_id`, `purl_pattern`, `allowed_keyids[]`, `policy_ref`, `revoked_keys[]`.
|
||||
|
||||
### 8.2 Schema Changes
|
||||
|
||||
Always follow:
|
||||
|
||||
1. **Expand**
|
||||
|
||||
* Add new columns/tables.
|
||||
* Make new code tolerant of old data.
|
||||
|
||||
2. **Backfill**
|
||||
|
||||
* Idempotent jobs that fill in new IDs/fields without touching old DSSE payloads.
|
||||
|
||||
3. **Contract**
|
||||
|
||||
* Only after all code uses the new model.
|
||||
* Never drop the raw DSSE or raw SBOM blobs.
|
||||
|
||||
---
|
||||
|
||||
## 9. Verification & Receipts
|
||||
|
||||
**Goal:** Make it trivial (for you, customers, and regulators) to recheck everything.
|
||||
|
||||
### 9.1 Verification Flow
|
||||
|
||||
Given `SBOMEntryID` or `ProofBundleID`:
|
||||
|
||||
1. Fetch spine and trust anchor.
|
||||
2. Verify:
|
||||
|
||||
* Spine DSSE signature against TrustAnchor’s allowed keys.
|
||||
* VEX, reasoning, and evidence DSSE signatures.
|
||||
3. Recompute:
|
||||
|
||||
* `EvidenceIDs` from stored canonical evidence.
|
||||
* `ReasoningID` from reasoning.
|
||||
* `VEXVerdictID` from VEX body.
|
||||
* `ProofBundleID` from the above.
|
||||
4. Compare to stored IDs.
|
||||
|
||||
Emit a **Receipt**:
|
||||
|
||||
* `proofBundleId`
|
||||
* `verifiedAt`
|
||||
* `verifierVersion`
|
||||
* `anchorId`
|
||||
* `result` (pass/fail, with reasons)
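As a shape, the receipt can be as small as this record (field names are suggestions, not a fixed contract):

```csharp
using System;
using System.Collections.Generic;

public sealed record VerificationReceipt(
    string ProofBundleId,
    DateTimeOffset VerifiedAt,
    string VerifierVersion,
    string AnchorId,
    bool Pass,
    IReadOnlyList<string> Reasons);
```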
|
||||
|
||||
### 9.2 Offline Kit
|
||||
|
||||
* Provide a minimal CLI (`stella verify`) that:
|
||||
|
||||
* Accepts a bundle export (SBOM + DSSE envelopes + anchors).
|
||||
* Verifies everything without network access.
|
||||
|
||||
Developers must ensure:
|
||||
|
||||
* Export format is documented and stable.
|
||||
* All fields required for verification are included.
|
||||
|
||||
---
|
||||
|
||||
## 10. Security & Key Management (for Devs)
|
||||
|
||||
* Keys live in **KMS/HSM**, not env vars or config files.
|
||||
* Separate keysets:
|
||||
|
||||
* `dev`, `staging`, `prod`
|
||||
* Authority vs VEXer vs Evidence Ingestor.
|
||||
* TrustAnchors:
|
||||
|
||||
* Edit via Authority service only.
|
||||
* Every change:
|
||||
|
||||
* Requires code-reviewed change.
|
||||
* Writes an audit log entry.
|
||||
|
||||
Never:
|
||||
|
||||
* Log private keys.
|
||||
* Log full DSSE envelopes in plaintext logs (log IDs and hashes instead).
|
||||
|
||||
---
|
||||
|
||||
## 11. Observability & On‑Call Expectations
|
||||
|
||||
### 11.1 Metrics
|
||||
|
||||
For the SBOM→Proof pipeline, expose:
|
||||
|
||||
* `sboms_ingested_total`
|
||||
* `sbom_ingest_errors_total{reason}`
|
||||
* `evidence_statements_created_total`
|
||||
* `reasoning_statements_created_total`
|
||||
* `vex_statements_created_total`
|
||||
* `proof_spines_created_total`
|
||||
* `proof_verifications_total{result}` (pass/fail reason)
|
||||
* Latency histograms per stage (`_duration_seconds`)
|
||||
|
||||
### 11.2 Logging
|
||||
|
||||
Include in structured logs wherever relevant:
|
||||
|
||||
* `sbomEntryId`
|
||||
* `proofBundleId`
|
||||
* `anchorId`
|
||||
* `policyVersion`
|
||||
* `requestId` / `traceId`
|
||||
|
||||
### 11.3 Runbooks
|
||||
|
||||
You should maintain runbooks for at least:
|
||||
|
||||
* “Pipeline is stalled” (backlog of SBOMs, evidence, or spines).
|
||||
* “Verification failures increased”.
|
||||
* “Trust anchor or key issues” (rotation, revocation, misconfiguration).
|
||||
* “Backfill gone wrong” (how to safely stop, resume, and audit).
|
||||
|
||||
---
|
||||
|
||||
## 12. Dev Workflow & PR Checklist (SBOM→Proof Changes Only)
|
||||
|
||||
When your change touches SBOM ingestion, evidence, reasoning, VEX, or proof spines, check:
|
||||
|
||||
* [ ] IDs (`SBOMEntryID`, `EvidenceID`, `ReasoningID`, `VEXVerdictID`, `ProofBundleID`) remain **deterministic** and fully specified.
|
||||
* [ ] No mutation of existing DSSE envelopes or historical proof data.
|
||||
* [ ] Schema changes follow **expand → backfill → contract**.
|
||||
* [ ] New/updated TrustAnchors reviewed by Authority owner.
|
||||
* [ ] Unit tests cover:
|
||||
|
||||
* Canonicalization for any new/changed predicate.
|
||||
* ID computation.
|
||||
* [ ] Integration test covers:
|
||||
|
||||
* SBOM → Evidence → Reasoning → VEX → Spine → Verification → Receipt.
|
||||
* [ ] Observability updated:
|
||||
|
||||
* New paths emit logs & metrics.
|
||||
* [ ] Rollback plan documented (especially for migrations & policy changes).
|
||||
|
||||
---
|
||||
|
||||
If you tell me which microservices/repos map to these stages (e.g. `stella-sbom-ingest`, `stella-proof-authority`, `stella-vexer`), I can turn this into a more concrete, service‑by‑service checklist with example API contracts and class/interface sketches.
|
||||
@@ -0,0 +1,381 @@
|
||||
Here’s a crisp, plug‑in set of **reproducible benchmarks** you can bake into Stella Ops so buyers, auditors, and your own team can see measurable wins—without hand‑wavy heuristics.
|
||||
|
||||
# Benchmarks Stella Ops should standardize
|
||||
|
||||
**1) Time‑to‑Evidence (TTE)**
|
||||
How fast Stella Ops turns a “suspicion” into a signed, auditor‑usable proof (e.g., VEX+attestations).
|
||||
|
||||
* **Definition:** `TTE = t(proof_ready) – t(artifact_ingested)`
|
||||
* **Scope:** scanning, reachability, policy evaluation, proof generation, notarization, and publication to your proof ledger.
|
||||
* **Targets:**
|
||||
|
||||
* *P50* < 2m for typical container images (≤ 500 MB, known ecosystems).
|
||||
* *P95* < 5m including cold‑start/offline‑bundle mode.
|
||||
* **Report:** Median/P95 by artifact size bucket; break down stages (fetch → analyze → reachability → VEX → sign → publish).
|
||||
* **Auditable logs:** DSSE/DSD signatures, policy hash, feed set IDs, scanner build hash.
|
||||
|
||||
**2) False‑Negative Drift Rate (FN‑Drift)**
|
||||
Catches when a previously “clean” artifact later becomes “affected” because the world changed (new CVE, rule, or feed).
|
||||
|
||||
* **Definition (rolling window 30d):**
|
||||
`FN‑Drift = (# artifacts re‑classified from {unaffected/unknown} → affected) / (total artifacts re‑evaluated)`
|
||||
* **Stratify by cause:** feed delta, rule delta, lattice/policy delta, reachability delta.
|
||||
* **Goal:** keep *feed‑caused* FN‑Drift low by faster deltas (good) while keeping *engine‑caused* FN‑Drift near zero (stability).
|
||||
* **Guardrails:** require **explanations** on re‑classification: include diff of feeds, rule versions, and lattice policy commit.
|
||||
* **Badge:** “No engine‑caused FN drift in 90d” (hash‑linked evidence bundle).
|
||||
|
||||
**3) Deterministic Re‑scan Reproducibility (Hash‑Stable Proofs)**
|
||||
Same inputs → same outputs, byte‑for‑byte, including proofs. Crucial for audits and regulated buys.
|
||||
|
||||
* **Definition:**
|
||||
Given a **scan manifest** (artifact digest, feed snapshots, engine build hash, lattice/policy hash), re‑scan must produce **identical**: findings set, VEX decisions, proofs, and top‑level bundle hash.
|
||||
* **Metric:** `Repro rate = identical_outputs / total_replays` (target 100%).
|
||||
* **Proof object:**
|
||||
|
||||
```
{
  artifact_digest,
  scan_manifest_hash,
  feeds_merkle_root,
  engine_build_hash,
  policy_lattice_hash,
  findings_sha256,
  vex_bundle_sha256,
  proof_bundle_sha256
}
```
|
||||
* **CI check:** nightly replay of a fixed corpus; fail pipeline on any non‑determinism (with diff).
|
||||
|
||||
# Minimal implementation plan (developer‑ready)
|
||||
|
||||
* **Canonical Scan Manifest (CSM):** immutable JSON (canonicalized), covering: artifact digests; feed URIs + content hashes; engine build + ruleset hashes; lattice/policy hash; config flags; environment fingerprint (CPU features, locale). Store CSM + DSSE envelope.
|
||||
* **Stage timers:** emit monotonic timestamps for each stage; roll up to TTE. Persist per‑artifact in Postgres (time‑series table by artifact_digest); see the timer sketch after this list.
|
||||
* **Delta re‑eval daemon:** on any feed/rule/policy change, re‑score the corpus referenced by that feed snapshot; log re‑classifications with cause; compute FN‑Drift daily.
|
||||
* **Replay harness:** given a CSM, re‑run pipeline in sealed mode (no network, feeds from snapshot); recompute bundle hashes; assert equality.
|
||||
* **Proof bundle:** tar/zip with canonical ordering; include SBOM slice, reachability graph, VEX, signatures, and an index.json (canonical). The bundle’s SHA256 is your public “proof hash.”
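The stage timers mentioned above can be as simple as a per-stage monotonic Stopwatch wrapper (a sketch; persisting the rolled-up values into the per-artifact table is left to your repository layer):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public sealed class StageTimer
{
    private readonly Dictionary<string, long> _elapsedMs = new();

    // Wrap each pipeline stage; Stopwatch is monotonic, never mix in wall-clock time.
    public async Task<T> MeasureAsync<T>(string stage, Func<Task<T>> work)
    {
        var sw = Stopwatch.StartNew();
        try { return await work(); }
        finally { _elapsedMs[stage] = sw.ElapsedMilliseconds; }
    }

    public long TotalTteMs => _elapsedMs.Values.Sum();
    public IReadOnlyDictionary<string, long> PerStageMs => _elapsedMs;
}
```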
|
||||
|
||||
# What to put on dashboards & in SLAs
|
||||
|
||||
* **TTE panel:** P50/P95 by image size; stacked bars by stage; alerts when P95 breaches SLO.
|
||||
* **FN‑Drift panel:** overall and by cause; red flag if engine‑caused drift > 0.1% in 30d.
|
||||
* **Repro panel:** last 24h/7d replay pass rate (goal 100%); list any non‑deterministic modules.
|
||||
|
||||
# Why this wins sales & audits
|
||||
|
||||
* **Auditors:** can pick any proof hash → replay from CSM → get the exact same signed outcome.
|
||||
* **Buyers:** TTE proves speed; FN‑Drift proves stability and feed hygiene; Repro proves you’re not heuristic‑wobbly.
|
||||
* **Competitors:** many can’t show deterministic replay or attribute drift causes—your “hash‑stable proofs” make that gap obvious.
|
||||
|
||||
If you want, I can generate the exact **PostgreSQL schema**, **.NET 10 structs**, and a **nightly replay GitLab job** that enforces these three metrics out‑of‑the‑box.
|
||||
Below is the complete, implementation-ready package you asked for: PostgreSQL schema, .NET 10 types, and a CI replay job for the three Stella Ops benchmarks: Time-to-Evidence (TTE), False-Negative Drift (FN-Drift), and Deterministic Replayability.
|
||||
|
||||
This is written so your mid-level developers can drop it directly into Stella Ops without re-architecting anything.
|
||||
|
||||
---
|
||||
|
||||
# 1. PostgreSQL Schema (Canonical, Deterministic, Normalized)
|
||||
|
||||
## 1.1 Table: scan_manifest
|
||||
|
||||
Immutable record describing exactly what was used for a scan.
|
||||
|
||||
```sql
CREATE TABLE scan_manifest (
    manifest_id             UUID PRIMARY KEY,
    created_at              TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    artifact_digest         TEXT NOT NULL,
    feeds_merkle_root       TEXT NOT NULL,
    engine_build_hash       TEXT NOT NULL,
    policy_lattice_hash     TEXT NOT NULL,

    ruleset_hash            TEXT NOT NULL,
    config_flags            JSONB NOT NULL,

    environment_fingerprint JSONB NOT NULL,

    raw_manifest            JSONB NOT NULL,
    raw_manifest_sha256     TEXT NOT NULL
);
```
|
||||
|
||||
Notes:
|
||||
|
||||
* `raw_manifest` is the canonical JSON used for deterministic replay.
|
||||
* `raw_manifest_sha256` is the canonicalized-JSON hash, not a hash of the unformatted body.
|
||||
|
||||
---
|
||||
|
||||
## 1.2 Table: scan_execution
|
||||
|
||||
One execution corresponds to one run of the scanner with one manifest.
|
||||
|
||||
```sql
CREATE TABLE scan_execution (
    execution_id        UUID PRIMARY KEY,
    manifest_id         UUID NOT NULL REFERENCES scan_manifest(manifest_id) ON DELETE CASCADE,

    started_at          TIMESTAMPTZ NOT NULL,
    finished_at         TIMESTAMPTZ NOT NULL,

    t_ingest_ms         INT NOT NULL,
    t_analyze_ms        INT NOT NULL,
    t_reachability_ms   INT NOT NULL,
    t_vex_ms            INT NOT NULL,
    t_sign_ms           INT NOT NULL,
    t_publish_ms        INT NOT NULL,

    proof_bundle_sha256 TEXT NOT NULL,
    findings_sha256     TEXT NOT NULL,
    vex_bundle_sha256   TEXT NOT NULL,

    replay_mode         BOOLEAN NOT NULL DEFAULT FALSE
);
```
|
||||
|
||||
Derived view for Time-to-Evidence:
|
||||
|
||||
```sql
CREATE VIEW scan_tte AS
SELECT
    execution_id,
    manifest_id,
    (finished_at - started_at) AS tte_interval
FROM scan_execution;
```
|
||||
|
||||
---
|
||||
|
||||
## 1.3 Table: classification_history
|
||||
|
||||
Used for FN-Drift tracking.
|
||||
|
||||
```sql
CREATE TABLE classification_history (
    id              BIGSERIAL PRIMARY KEY,
    artifact_digest TEXT NOT NULL,
    manifest_id     UUID NOT NULL REFERENCES scan_manifest(manifest_id) ON DELETE CASCADE,
    execution_id    UUID NOT NULL REFERENCES scan_execution(execution_id) ON DELETE CASCADE,

    previous_status TEXT NOT NULL,  -- unaffected | unknown | affected
    new_status      TEXT NOT NULL,
    cause           TEXT NOT NULL,  -- engine_delta | feed_delta | ruleset_delta | policy_delta

    changed_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
|
||||
|
||||
Materialized view for drift statistics:
|
||||
|
||||
```sql
CREATE MATERIALIZED VIEW fn_drift_stats AS
SELECT
    date_trunc('day', changed_at) AS day_bucket,
    COUNT(*) FILTER (WHERE new_status = 'affected') AS affected_count,
    COUNT(*) AS total_reclassified,
    ROUND(
        (COUNT(*) FILTER (WHERE new_status = 'affected')::numeric /
         NULLIF(COUNT(*), 0)) * 100, 4
    ) AS drift_percent
FROM classification_history
GROUP BY 1;
```
|
||||
|
||||
---
|
||||
|
||||
# 2. .NET 10 / C# Types (Deterministic, Hash-Stable)
|
||||
|
||||
The following structs map 1:1 to the DB entities and enforce canonicalization rules.
|
||||
|
||||
## 2.1 CSM Structure
|
||||
|
||||
```csharp
public sealed record CanonicalScanManifest
{
    public required string ArtifactDigest { get; init; }
    public required string FeedsMerkleRoot { get; init; }
    public required string EngineBuildHash { get; init; }
    public required string PolicyLatticeHash { get; init; }
    public required string RulesetHash { get; init; }

    public required IReadOnlyDictionary<string, string> ConfigFlags { get; init; }
    public required EnvironmentFingerprint Environment { get; init; }
}

public sealed record EnvironmentFingerprint
{
    public required string CpuModel { get; init; }
    public required string RuntimeVersion { get; init; }
    public required string Os { get; init; }
    public required IReadOnlyDictionary<string, string> Extra { get; init; }
}
```
|
||||
|
||||
### Deterministic canonical-JSON serializer
|
||||
|
||||
Your developers must generate a stable JSON:
|
||||
|
||||
```csharp
internal static class CanonicalJson
{
    private static readonly JsonSerializerOptions Options = new()
    {
        WriteIndented = false,
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    public static string Serialize(object obj)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream, new JsonWriterOptions
        {
            Indented = false,
            SkipValidation = false
        }))
        {
            JsonSerializer.Serialize(writer, obj, obj.GetType(), Options);
        }

        var bytes = stream.ToArray();
        // Sort object keys alphabetically and array items in stable order.
        // This step is mandatory to guarantee canonical form:
        var canonical = JsonCanonicalizer.Canonicalize(bytes);

        return canonical;
    }
}
```
|
||||
|
||||
`JsonCanonicalizer` is your deterministic canonicalization engine (already referenced in other Stella Ops modules).
|
||||
|
||||
---
|
||||
|
||||
## 2.2 Execution record
|
||||
|
||||
```csharp
public sealed record ScanExecutionMetrics
{
    public required int IngestMs { get; init; }
    public required int AnalyzeMs { get; init; }
    public required int ReachabilityMs { get; init; }
    public required int VexMs { get; init; }
    public required int SignMs { get; init; }
    public required int PublishMs { get; init; }
}
```
|
||||
|
||||
---
|
||||
|
||||
## 2.3 Replay harness entrypoint
|
||||
|
||||
```csharp
public static class ReplayRunner
{
    public static ReplayResult Replay(Guid manifestId, IScannerEngine engine)
    {
        var manifest = ManifestRepository.Load(manifestId);
        var canonical = CanonicalJson.Serialize(manifest.RawObject);
        var canonicalHash = Sha256(canonical);

        if (canonicalHash != manifest.RawManifestSHA256)
            throw new InvalidOperationException("Manifest integrity violation.");

        using var feeds = FeedSnapshotResolver.Open(manifest.FeedsMerkleRoot);

        var exec = engine.Scan(new ScanRequest
        {
            ArtifactDigest = manifest.ArtifactDigest,
            Feeds = feeds,
            LatticeHash = manifest.PolicyLatticeHash,
            EngineBuildHash = manifest.EngineBuildHash,
            CanonicalManifest = canonical
        });

        return new ReplayResult(
            exec.FindingsHash == manifest.FindingsSHA256,
            exec.VexBundleHash == manifest.VexBundleSHA256,
            exec.ProofBundleHash == manifest.ProofBundleSHA256,
            exec
        );
    }
}
```
|
||||
|
||||
Replay must run with:
|
||||
|
||||
* no network
|
||||
* feeds resolved strictly from snapshots
|
||||
* deterministic clock (monotonic timers only)
|
||||
|
||||
---
|
||||
|
||||
# 3. GitLab CI Job for Nightly Deterministic Replay
|
||||
|
||||
```yaml
replay-test:
  stage: test
  image: mcr.microsoft.com/dotnet/sdk:10.0
  script:
    - echo "Starting nightly deterministic replay"

    # 1. Export 200 random manifests from Postgres
    - |
      psql "$PG_CONN" -Atc "
        SELECT manifest_id
        FROM scan_manifest
        ORDER BY random()
        LIMIT 200
      " > manifests.txt

    # 2. Replay each manifest
    - |
      while read mid; do
        echo "Replaying $mid"
        dotnet run --project src/StellaOps.Scanner.Replay \
          --manifest $mid || exit 1
      done < manifests.txt

    # 3. Aggregate results
    - |
      if grep -R "NON-DETERMINISTIC" replay-logs; then
        echo "Replay failures detected"
        exit 1
      else
        echo "All replays deterministic"
      fi
  artifacts:
    paths:
      - replay-logs/
    expire_in: 7 days
  only:
    - schedules
```
|
||||
|
||||
Replay job failure criteria:
|
||||
|
||||
* Any mismatch in findings/VEX/proof bundle hash
|
||||
* Any non-canonical input or manifest discrepancy
|
||||
* Any accidental feed/network access
|
||||
|
||||
---
|
||||
|
||||
# 4. Developer Rules (Should be added to docs/stellaops-developer-rules.md)
|
||||
|
||||
1. A scan is not valid unless the Canonical Scan Manifest (CSM) hash is stored.
|
||||
2. Every stage must emit monotonic timestamps for TTE. Do not mix monotonic and wall clock.
|
||||
3. Classification changes must always include a cause: no silent reclassification.
|
||||
4. Replay mode must never reach network, dynamic rules, cloud feeds, or external clocks.
|
||||
5. Proof bundles must be TAR with deterministic ordering: alphabetical filenames, fixed uid/gid=0, fixed mtime=0.
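A sketch of rule 5 with `System.Formats.Tar` (ustar entries, epoch mtime, uid/gid 0); treat it as a starting point and verify byte-stability against your own canonical-ordering spec before adopting:

```csharp
using System;
using System.Collections.Generic;
using System.Formats.Tar;
using System.IO;
using System.Linq;

public static class DeterministicTar
{
    // Writes files in alphabetical order with neutralized metadata so the same
    // content always produces a byte-identical archive.
    public static void Write(Stream output, IReadOnlyDictionary<string, byte[]> files)
    {
        using var writer = new TarWriter(output, TarEntryFormat.Ustar, leaveOpen: true);

        foreach (var (name, content) in files.OrderBy(f => f.Key, StringComparer.Ordinal))
        {
            var entry = new UstarTarEntry(TarEntryType.RegularFile, name)
            {
                Uid = 0,
                Gid = 0,
                ModificationTime = DateTimeOffset.UnixEpoch,
                DataStream = new MemoryStream(content)
            };
            writer.WriteEntry(entry);
        }
    }
}
```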
|
||||
|
||||
---
|
||||
|
||||
# 5. Ready for integration
|
||||
|
||||
If you want, I can produce:
|
||||
|
||||
* the full EF Core 9 mapping classes
|
||||
* a migration file consistent with your existing Stella Ops module naming
|
||||
* the Angular UI block that displays TTE, FN-Drift, and Replay statistics
|
||||
* a deterministic TAR writer (C#) for proof bundles
|
||||
|
||||
Tell me which part you want next.
|
||||
@@ -0,0 +1,634 @@
|
||||
Here’s a clean way to turn an SBOM into a verifiable supply‑chain proof without extra fluff: use the component `hashes` CycloneDX already records (plus `metadata.component.hashes` for the primary artifact) as the cryptographic anchors, map each component@version to an **in‑toto subject**, wrap the result in a **DSSE** envelope, record it in **Rekor**, and (optionally) attach or reference your **VEX** claims. This gives you a deterministic, end‑to‑end “SBOM → DSSE → Rekor → VEX” spine you can replay and audit anytime.
|
||||
|
||||
---
|
||||
|
||||
# Why this works (quick background)
|
||||
|
||||
* **CycloneDX SBOM**: lists components; each can carry hashes (SHA‑256/512) in its `hashes` array, with `metadata.component.hashes` covering the primary artifact the BOM describes.
|
||||
* **in‑toto**: describes supply‑chain steps; a “subject” is just a file/artifact + its digest(s).
|
||||
* **DSSE**: standard envelope to sign statements (like in‑toto) without touching payload bytes.
|
||||
* **Rekor** (Sigstore): transparency log—append‑only proofs (inclusion/consistency).
|
||||
* **VEX**: vulnerability status for components (affected/not affected, under investigation, fixed).
|
||||
|
||||
---
|
||||
|
||||
# Minimal mapping
|
||||
|
||||
1. **From CycloneDX → subjects**
|
||||
|
||||
* For each component with a hash:
|
||||
|
||||
* Subject name: `pkg:<type>/<name>@<version>` (or your canonical URI)
|
||||
* Subject digest(s): copy from the component’s `hashes` entries
|
||||
|
||||
2. **in‑toto statement**
|
||||
|
||||
```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "predicateType": "https://stellaops.dev/predicate/sbom-linkage/v1",
  "subject": [
    { "name": "pkg:npm/lodash@4.17.21",
      "digest": { "sha256": "…", "sha512": "…" } }
  ],
  "predicate": {
    "sbom": {
      "format": "CycloneDX",
      "version": "1.6",
      "sha256": "…sbom file hash…"
    },
    "generatedAt": "2025-12-01T00:00:00Z",
    "generator": "StellaOps.Sbomer/1.0"
  }
}
```
|
||||
|
||||
3. **Wrap in DSSE**
|
||||
|
||||
* Create DSSE envelope with the statement as payload.
|
||||
* Sign with your org key (or keyless Sigstore if online; for air‑gap, use your offline CA/PKCS#11).
|
||||
|
||||
4. **Log to Rekor**
|
||||
|
||||
* Submit DSSE to Rekor; store back the **logIndex**, **UUID**, and **inclusion proof**.
|
||||
* In offline/air‑gap kits, mirror to your own Rekor instance and sync later.
|
||||
|
||||
5. **Link VEX**
|
||||
|
||||
* For each component subject, attach a VEX item (same subject name + digest) or store a pointer:
|
||||
|
||||
```json
"predicate": {
  "vex": [
    { "subject": "pkg:npm/lodash@4.17.21",
      "digest": { "sha256": "…" },
      "vulnerability": "CVE-XXXX-YYYY",
      "status": "not_affected",
      "justification": "component_not_present",
      "timestamp": "2025-12-01T00:00:00Z" }
  ]
}
```
|
||||
|
||||
* You can keep VEX in a separate DSSE/in‑toto document; cross‑reference by subject digest.
|
||||
|
||||
---
|
||||
|
||||
# Deterministic replay recipe (Stella Ops‑style)
|
||||
|
||||
* **Input**: CycloneDX file + deterministic hashing rules.
|
||||
* **Process**:
|
||||
|
||||
1. Normalize SBOM (stable sort keys, strip volatile fields).
|
||||
2. Extract each component’s `hashes`; fail build if missing.
|
||||
3. Emit in‑toto statement with sorted subjects.
|
||||
4. DSSE‑sign with fixed algorithm (e.g., SHA‑256 + Ed25519) and pinned key id.
|
||||
5. Rekor log; record `logIndex` in your store.
|
||||
6. Emit VEX statements keyed by the *same* subject digests.
|
||||
* **Output**: `(SBOM hash, DSSE envelope, Rekor proofs, VEX docs)` — all content‑addressed.
|
||||
|
||||
---
|
||||
|
||||
# Quick C# sketch (DOTNET 10) to build subjects
|
||||
|
||||
```csharp
public record Subject(string Name, Dictionary<string,string> Digest);

IEnumerable<Subject> ToSubjects(CycloneDxSbom sbom)
{
    foreach (var c in sbom.Components)
    {
        if (c.Hashes == null || c.Hashes.Count == 0) continue;
        var name = $"pkg:{c.Type}/{c.Name}@{c.Version}";
        var dig = c.Hashes
            .OrderBy(h => h.Algorithm)        // deterministic
            .ToDictionary(h => h.Algorithm.ToLowerInvariant(), h => h.Value.ToLowerInvariant());
        yield return new Subject(name, dig);
    }
}
```
|
||||
|
||||
---
|
||||
|
||||
# Validation gates you’ll want
|
||||
|
||||
* **No‑hash = no‑ship**: reject SBOM components without strong digests.
|
||||
* **Stable ordering**: sort subjects and digests before signing.
|
||||
* **Key policy**: pin algorithm + key id; rotate on a schedule; record KMS path.
|
||||
* **Proof check**: verify Rekor inclusion on CI and during runtime attestation.
|
||||
* **VEX parity**: every shipped subject must have a VEX stance (even “unknown/under‑investigation”).
|
||||
|
||||
---
|
||||
|
||||
# Where this helps you
|
||||
|
||||
* **Audits**: one click from running container → component digest → Rekor proof → VEX decision.
|
||||
* **Air‑gap**: DSSE + local Rekor mirror keeps everything verifiable offline, syncs later.
|
||||
* **Determinism**: same inputs always produce byte‑identical envelopes and proofs.
|
||||
|
||||
If you want, I can turn this into a drop‑in **StellaOps.Sbomer → Vexer** guideline (with schema files, DSSE signing helper, and a Rekor client wrapper) tailored to your .NET 10 repos.
|
||||
Below is a compact but complete guideline you can hand directly to Stella Ops devs.
|
||||
|
||||
---
|
||||
|
||||
# Stella Ops Developer Guidelines
|
||||
|
||||
## Converting SBOM Data into Proof Chains
|
||||
|
||||
### 1. Objective
|
||||
|
||||
Define how Stella Ops components (Sbomer, Authority, Vexer, Proof Graph, Rekor bridge) convert raw SBOM data (CycloneDX / SPDX) into **cryptographically verifiable proof chains**:
|
||||
|
||||
`Artifact/Image → SBOM → in-toto Statement → DSSE Envelope → Rekor Entry → VEX Attestations → Proof-of-Integrity Graph`.
|
||||
|
||||
This must be:
|
||||
|
||||
* Deterministic (replayable).
|
||||
* Content-addressed (hashes everywhere).
|
||||
* Offline-capable (air-gapped), with later synchronization.
|
||||
* Crypto-sovereign (pluggable crypto backends, including PQC later).
|
||||
|
||||
---
|
||||
|
||||
## 2. Responsibilities by Service
|
||||
|
||||
**StellaOps.Sbomer**
|
||||
|
||||
* Ingest SBOMs (CycloneDX 1.6, SPDX 3.x).
|
||||
* Canonicalize and hash SBOM.
|
||||
* Extract component subjects from SBOM.
|
||||
* Build in-toto Statement for “sbom-linkage”.
|
||||
* Call Authority to DSSE-sign Statement.
|
||||
* Hand signed envelopes to Rekor bridge + Proof Graph.
|
||||
|
||||
**StellaOps.Authority**
|
||||
|
||||
* Abstract cryptography (sign/verify, hash, key resolution).
|
||||
* Support multiple profiles (default: FIPS-style SHA-256 + Ed25519/ECDSA; future: GOST/SM/eIDAS/PQC).
|
||||
* Enforce key policies (which key for which tenant/realm).
|
||||
|
||||
**StellaOps.RekorBridge** (could be sub-package of Authority or separate microservice)
|
||||
|
||||
* Log DSSE envelopes to Rekor (or local Rekor-compatible ledger).
|
||||
* Handle offline queuing and later sync.
|
||||
* Return stable Rekor metadata: `logIndex`, `logId`, `inclusionProof`.
|
||||
|
||||
**StellaOps.Vexer (Excititor)**
|
||||
|
||||
* Produce VEX statements that reference **the same subjects** as the SBOM proof chain.
|
||||
* DSSE-sign VEX statements via Authority.
|
||||
* Optionally log VEX DSSE envelopes to Rekor using the same bridge.
|
||||
* Never run lattice logic here (per your rule); only attach VEX and preserve provenance.
|
||||
|
||||
**StellaOps.ProofGraph**
|
||||
|
||||
* Persist the full chain:
|
||||
|
||||
* Artifacts, SBOM docs, in-toto Statements, DSSE envelopes, Rekor entries, VEX docs.
|
||||
* Expose graph APIs for Scanner / runtime agents:
|
||||
|
||||
* “Show me proof for this container/image/binary.”
|
||||
|
||||
---
|
||||
|
||||
## 3. High-Level Flow
|
||||
|
||||
For each scanned artifact (e.g., container image):
|
||||
|
||||
1. **SBOM ingestion** (Sbomer)
|
||||
|
||||
* Accept SBOM file/stream (CycloneDX/SPDX).
|
||||
* Normalize & hash the SBOM document.
|
||||
2. **Subject extraction** (Sbomer)
|
||||
|
||||
* Derive a stable list of `subjects[]` from SBOM components (name + digests).
|
||||
3. **Statement construction** (Sbomer)
|
||||
|
||||
* Build in-toto Statement with `predicateType = "https://stella-ops.org/predicates/sbom-linkage/v1"`.
|
||||
4. **DSSE signing** (Authority)
|
||||
|
||||
* Wrap Statement as DSSE envelope.
|
||||
* Sign with the appropriate org/tenant key.
|
||||
5. **Rekor logging** (RekorBridge)
|
||||
|
||||
* Submit DSSE envelope to Rekor.
|
||||
* Store log metadata & proofs.
|
||||
6. **VEX linkage** (Vexer)
|
||||
|
||||
* For each subject, optionally emit VEX statements (status: affected/not_affected/etc.).
|
||||
* DSSE-sign and log VEX to Rekor (same pattern).
|
||||
7. **Proof-of-Integrity Graph** (ProofGraph)
|
||||
|
||||
* Insert nodes & edges to represent the whole chain, content-addressed by hash.
|
||||
|
||||
---
|
||||
|
||||
## 4. Canonicalizing and Hashing SBOMs (Sbomer)
|
||||
|
||||
### 4.1 Supported formats
|
||||
|
||||
* MUST support:
|
||||
|
||||
* CycloneDX JSON 1.4+ (target 1.6).
|
||||
* SPDX 3.x JSON.
|
||||
* MUST map both formats into a common internal `SbomDocument` model.
|
||||
|
||||
### 4.2 Canonicalization rules
|
||||
|
||||
All **hashes used as identifiers** MUST be computed over **canonical form**:
|
||||
|
||||
* For JSON SBOMs:
|
||||
|
||||
* Remove insignificant whitespace.
|
||||
* Sort object keys lexicographically.
|
||||
* For arrays where order is not semantically meaningful (e.g., `components`), sort deterministically (e.g., by `bom-ref` or `purl`).
|
||||
* Strip volatile fields if present:
|
||||
|
||||
* Timestamps (generation time).
|
||||
* Tool build IDs.
|
||||
* Non-deterministic UUIDs.
|
||||
|
||||
* For other formats (if ever accepted):
|
||||
|
||||
* Convert to internal JSON representation first, then canonicalize JSON.
|
||||
|
||||
Example C# signature:
|
||||
|
||||
```csharp
public interface ISbomCanonicalizer
{
    byte[] Canonicalize(ReadOnlySpan<byte> rawSbom, string mediaType);
}

public interface IBlobHasher
{
    string ComputeSha256Hex(ReadOnlySpan<byte> data);
}
```
|
||||
|
||||
**Contract:** same input bytes → same canonical bytes → same `sha256` → replayable.
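
A minimal implementation sketch of these two contracts, assuming JSON input only (the array sorting and volatile-field stripping from 4.2 are format-specific and omitted here); the class names are illustrative:

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text.Json;

public sealed class JsonSbomCanonicalizer : ISbomCanonicalizer
{
    public byte[] Canonicalize(ReadOnlySpan<byte> rawSbom, string mediaType)
    {
        // mediaType would select CycloneDX vs SPDX handling; both are JSON here.
        // Parse, then re-emit with object keys sorted lexicographically and no
        // insignificant whitespace. Array sorting and volatile-field stripping
        // (timestamps, UUIDs) would slot in here and are omitted for brevity.
        using var doc = JsonDocument.Parse(rawSbom.ToArray());
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))   // compact output by default
        {
            WriteCanonical(doc.RootElement, writer);
        }
        return stream.ToArray();
    }

    private static void WriteCanonical(JsonElement element, Utf8JsonWriter writer)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                writer.WriteStartObject();
                foreach (var property in element.EnumerateObject().OrderBy(p => p.Name, StringComparer.Ordinal))
                {
                    writer.WritePropertyName(property.Name);
                    WriteCanonical(property.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonValueKind.Array:
                writer.WriteStartArray();
                foreach (var item in element.EnumerateArray())
                    WriteCanonical(item, writer);
                writer.WriteEndArray();
                break;
            default:
                element.WriteTo(writer);   // strings, numbers, booleans, null pass through
                break;
        }
    }
}

public sealed class Sha256BlobHasher : IBlobHasher
{
    public string ComputeSha256Hex(ReadOnlySpan<byte> data)
        => Convert.ToHexString(SHA256.HashData(data)).ToLowerInvariant();
}
```

With these two pieces, the `sbomId` defined in 4.3 below is simply `ComputeSha256Hex(Canonicalize(rawSbom, mediaType))`.
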
|
||||
|
||||
### 4.3 SBOM identity
|
||||
|
||||
Define SBOM identity as:
|
||||
|
||||
```text
sbomId = sha256(canonicalSbomBytes)
```
|
||||
|
||||
Store:
|
||||
|
||||
* `SbomId` (hex string).
|
||||
* `MediaType` (e.g., `application/vnd.cyclonedx+json`).
|
||||
* `SpecVersion`.
|
||||
* Optional `Source` (file path, OCI label, etc.).
|
||||
|
||||
---
|
||||
|
||||
## 5. Extracting Subjects from SBOM Components
|
||||
|
||||
### 5.1 Subject schema
|
||||
|
||||
Internal model:
|
||||
|
||||
```csharp
public sealed record ProofSubject(
    string Name,                                // e.g. "pkg:npm/lodash@4.17.21"
    IReadOnlyDictionary<string,string> Digest   // e.g. { ["sha256"] = "..." }
);
```
|
||||
|
||||
### 5.2 Name rules
|
||||
|
||||
* Prefer **PURL** when present.
|
||||
|
||||
* `Name = purl` exactly as in SBOM.
|
||||
* Fallback per ecosystem:
|
||||
|
||||
* npm: `pkg:npm/{name}@{version}`
|
||||
* NuGet/.NET: `pkg:nuget/{name}@{version}`
|
||||
* Maven: `pkg:maven/{groupId}/{artifactId}@{version}`
|
||||
* OS packages (rpm/deb/apk): appropriate purl.
|
||||
* If nothing else is available:
|
||||
|
||||
* `Name = "component:" + UrlEncode(componentName + "@" + version)`.
|
||||
|
||||
### 5.3 Digest rules
|
||||
|
||||
* Consume all strong digests provided (CycloneDX `hashes[]`, SPDX checksums).
|
||||
* Normalize algorithm keys:
|
||||
|
||||
* Lowercase (e.g., `sha256`, `sha512`).
|
||||
* For SHA-1, still capture it but mark as weak in predicate metadata.
|
||||
* MUST have at least one of:
|
||||
|
||||
* `sha256`
|
||||
* `sha512`
|
||||
* If no strong digest exists, the component:
|
||||
|
||||
* MUST NOT be used as a primary subject in the proof chain.
|
||||
* MAY be logged in an “incomplete_subjects” block inside the predicate for diagnostics.
|
||||
|
||||
### 5.4 Deterministic ordering
|
||||
|
||||
* Sort subjects by:
|
||||
|
||||
1. `Name` ascending.
|
||||
2. Then by lexicographic concat of `algorithm:value` pairs.
|
||||
|
||||
This ordering must be applied before building the in-toto Statement.
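
A small comparer sketch for that ordering, assuming the `ProofSubject` record from 5.1 (the `|` separator used in the tie-break key is an implementation choice, not part of the contract):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProofSubjectOrdering
{
    // Secondary sort key: lexicographic concat of "algorithm:value" pairs, algorithms sorted.
    private static string DigestKey(ProofSubject s) =>
        string.Join("|", s.Digest
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => $"{kv.Key}:{kv.Value}"));

    public static IReadOnlyList<ProofSubject> SortDeterministically(IEnumerable<ProofSubject> subjects) =>
        subjects
            .OrderBy(s => s.Name, StringComparer.Ordinal)   // 1. Name ascending
            .ThenBy(DigestKey, StringComparer.Ordinal)      // 2. then the digest key
            .ToList();
}
```
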
|
||||
|
||||
---
|
||||
|
||||
## 6. Building the in-toto Statement (Sbomer)
|
||||
|
||||
### 6.1 Statement shape
|
||||
|
||||
Use the generic in-toto v1 Statement:
|
||||
|
||||
```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [ /* from SBOM subjects */ ],
  "predicateType": "https://stella-ops.org/predicates/sbom-linkage/v1",
  "predicate": {
    "sbom": {
      "id": "<sbomId hex>",
      "format": "CycloneDX",
      "specVersion": "1.6",
      "mediaType": "application/vnd.cyclonedx+json",
      "sha256": "<sha256 of canonicalSbomBytes>",
      "location": "oci://… or file://…"
    },
    "generator": {
      "name": "StellaOps.Sbomer",
      "version": "x.y.z"
    },
    "generatedAt": "2025-12-09T10:37:42Z",
    "incompleteSubjects": [ /* optional, see 5.3 */ ],
    "tags": {
      "tenantId": "…",
      "projectId": "…",
      "pipelineRunId": "…"
    }
  }
}
```
|
||||
|
||||
### 6.2 Implementation rules
|
||||
|
||||
* All dictionary keys in the final JSON MUST be sorted.
|
||||
* Use UTC ISO-8601 for timestamps.
|
||||
* `tags` is an **extensible** string map; do not put secrets here.
|
||||
* The Statement payload given to DSSE MUST be the canonical JSON (same key order each time).
|
||||
|
||||
C# sketch:
|
||||
|
||||
```csharp
public record SbomLinkagePredicate(
    SbomDescriptor Sbom,
    GeneratorDescriptor Generator,
    DateTimeOffset GeneratedAt,
    IReadOnlyList<IncompleteSubject>? IncompleteSubjects,
    IReadOnlyDictionary<string,string>? Tags
);
```
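
A sketch tying 6.2 together, reusing the canonicalizer and hasher contracts from section 4 (the builder class is illustrative, and the statement model is assumed to carry explicit `JsonPropertyName` attributes for fields such as `_type`):

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed class StatementPayloadBuilder
{
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
    };

    private readonly ISbomCanonicalizer _canonicalizer;
    private readonly IBlobHasher _hasher;

    public StatementPayloadBuilder(ISbomCanonicalizer canonicalizer, IBlobHasher hasher)
        => (_canonicalizer, _hasher) = (canonicalizer, hasher);

    // Serialize the statement object, canonicalize it, and return the bytes handed to
    // IDsseSigner plus the content-addressed StatementId used by ProofGraph.
    public (byte[] CanonicalPayload, string StatementId) Build(object statement)
    {
        var raw = JsonSerializer.SerializeToUtf8Bytes(statement, Options);
        var canonical = _canonicalizer.Canonicalize(raw, "application/vnd.in-toto+json");
        return (canonical, _hasher.ComputeSha256Hex(canonical));
    }
}
```
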
|
||||
|
||||
---
|
||||
|
||||
## 7. DSSE Signing (Authority)
|
||||
|
||||
### 7.1 Abstraction
|
||||
|
||||
All signing MUST run through Authority; no direct crypto calls from Sbomer/Vexer.
|
||||
|
||||
```csharp
public interface IDsseSigner
{
    Task<DsseEnvelope> SignAsync(
        ReadOnlyMemory<byte> payload,
        string payloadType,   // always "application/vnd.in-toto+json"
        string keyProfile,    // e.g. "default", "gov-bg", "pqc-lab"
        CancellationToken ct = default);
}
```
|
||||
|
||||
### 7.2 DSSE rules
|
||||
|
||||
* `payloadType` fixed: `"application/vnd.in-toto+json"`.
|
||||
|
||||
* `signatures[]`:
|
||||
|
||||
* At least one signature.
|
||||
* Each signature MUST carry:
|
||||
|
||||
* `keyid` (stable identifier within Authority).
|
||||
* `sig` (base64).
|
||||
* Optional `cert` if X.509 is used (but not required to be in the hashed payload).
|
||||
|
||||
* Crypto profile:
|
||||
|
||||
* Default: SHA-256 + Ed25519/ECDSA (configurable).
|
||||
* Key resolution must be **config-driven per tenant/realm**.
|
||||
|
||||
### 7.3 Determinism
|
||||
|
||||
* DSSE envelope JSON MUST also be canonical when hashed or sent to Rekor.
|
||||
* Signature bytes will differ across runs (due to non-deterministic ECDSA), but **payload hash** and Statement hash MUST remain stable.
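
For `IDsseSigner` implementors, a sketch of the DSSE pre-authentication encoding (PAE) that the signature is actually computed over; the encoding follows the public DSSE v1 spec, only the helper name is ours:

```csharp
using System;
using System.Text;

public static class DssePae
{
    // DSSE v1 PAE: "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload
    // (LEN is the decimal byte length). The signature covers these bytes, not the raw payload.
    public static byte[] Encode(string payloadType, ReadOnlySpan<byte> payload)
    {
        var header = $"DSSEv1 {Encoding.UTF8.GetByteCount(payloadType)} {payloadType} {payload.Length} ";
        var headerBytes = Encoding.UTF8.GetBytes(header);
        var result = new byte[headerBytes.Length + payload.Length];
        headerBytes.CopyTo(result, 0);
        payload.CopyTo(result.AsSpan(headerBytes.Length));
        return result;
    }
}
```
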
|
||||
|
||||
---
|
||||
|
||||
## 8. Rekor Logging (RekorBridge)
|
||||
|
||||
### 8.1 When to log
|
||||
|
||||
* Every SBOM linkage DSSE envelope SHOULD be logged to a Rekor-compatible transparency log.
|
||||
* In air-gapped mode:
|
||||
|
||||
* Enqueue entries in a local store.
|
||||
* Tag them with a “pending” status and sync log later.
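
A minimal sketch of such a queue entry (record and status names are illustrative, not a fixed schema):

```csharp
using System;

public sealed record PendingRekorSubmission(
    string DsseSha256,             // content address of the canonical DSSE envelope
    byte[] EnvelopeBytes,          // envelope stored verbatim, ready to submit
    string Status,                 // "pending" | "submitted" | "failed"
    int Attempts,
    DateTimeOffset EnqueuedAt,
    DateTimeOffset? LastAttemptAt
);
```

A background worker drains `pending` rows when connectivity returns, submits each envelope through the bridge, and writes back the Rekor metadata described in 8.3.
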
|
||||
|
||||
### 8.2 Entry type
|
||||
|
||||
Use Rekor’s DSSE/in-toto entry kind (the exact spec is an implementation detail); guidelines:
|
||||
|
||||
* Entry contains:
|
||||
|
||||
* DSSE envelope.
|
||||
* `apiVersion` / `kind` fields required by Rekor.
|
||||
* On success, Rekor returns:
|
||||
|
||||
* `logIndex`
|
||||
* `logId`
|
||||
* `integratedTime`
|
||||
* `inclusionProof` (Merkle proof).
|
||||
|
||||
### 8.3 Data persisted back into ProofGraph
|
||||
|
||||
For each DSSE envelope:
|
||||
|
||||
* Store:
|
||||
|
||||
```json
{
  "dsseSha256": "<sha256 of canonical dsse envelope>",
  "rekor": {
    "logIndex": 12345,
    "logId": "…",
    "integratedTime": 1733736000,
    "inclusionProof": { /* Merkle path */ }
  }
}
```
|
||||
|
||||
* Link this Rekor entry node to the DSSE envelope node with `LOGGED_IN` edge.
|
||||
|
||||
---
|
||||
|
||||
## 9. VEX Linkage (Vexer)
|
||||
|
||||
### 9.1 Core rule
|
||||
|
||||
VEX subjects MUST align with SBOM proof subjects:
|
||||
|
||||
* Same `name` value.
|
||||
* Same digest set (`sha256` at minimum).
|
||||
* If VEX is created later (e.g., days after SBOM), they still link through the subject digests.
|
||||
|
||||
### 9.2 VEX statement
|
||||
|
||||
StellaOps VEX may be its own predicateType, e.g.:
|
||||
|
||||
```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    { "name": "pkg:npm/lodash@4.17.21",
      "digest": { "sha256": "…" } }
  ],
  "predicateType": "https://stella-ops.org/predicates/vex/v1",
  "predicate": {
    "vulnerabilities": [
      {
        "id": "CVE-2024-XXXX",
        "status": "not_affected",
        "justification": "component_not_present",
        "timestamp": "2025-12-09T10:40:00Z",
        "details": "…"
      }
    ]
  }
}
```
|
||||
|
||||
Then:
|
||||
|
||||
1. Canonicalize JSON.
|
||||
2. DSSE-sign via Authority.
|
||||
3. Optionally log DSSE envelope to Rekor.
|
||||
4. Insert into ProofGraph with `HAS_VEX` relationships from subject → VEX node.
|
||||
|
||||
### 9.3 Non-functional
|
||||
|
||||
* Vexer must **not** run lattice algorithms; Scanner’s policy engine consumes these VEX proofs.
|
||||
* Vexer MUST be idempotent when re-emitting VEX for the same (subject, CVE, status) tuple.
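
One way to make that idempotency concrete is a deterministic VEX identity derived from the tuple itself; a sketch (helper name and key layout are assumptions):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class VexIdentity
{
    // Deterministic id for a (subject, vulnerability, status) tuple, so re-emitting
    // the same VEX decision maps onto the same row/node instead of a duplicate.
    public static string Compute(string subjectName, string subjectSha256, string vulnId, string status)
    {
        var key = $"{subjectName}|{subjectSha256}|{vulnId}|{status}";
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(key))).ToLowerInvariant();
    }
}
```
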
|
||||
|
||||
---
|
||||
|
||||
## 10. Proof-of-Integrity Graph (ProofGraph)
|
||||
|
||||
### 10.1 Node types (suggested)
|
||||
|
||||
* `Artifact` (container image, binary, Helm chart, etc.).
|
||||
* `SbomDocument` (by `sbomId`).
|
||||
* `InTotoStatement` (by statement hash).
|
||||
* `DsseEnvelope`.
|
||||
* `RekorEntry`.
|
||||
* `VexStatement`.
|
||||
|
||||
### 10.2 Edge types
|
||||
|
||||
* `DESCRIBED_BY`: `Artifact` → `SbomDocument`.
|
||||
* `ATTESTED_BY`: `SbomDocument` → `InTotoStatement`.
|
||||
* `WRAPPED_BY`: `InTotoStatement` → `DsseEnvelope`.
|
||||
* `LOGGED_IN`: `DsseEnvelope` → `RekorEntry`.
|
||||
* `HAS_VEX`: `Artifact`/`Subject` → `VexStatement`.
|
||||
* Optionally `CONTAINS_SUBJECT`: `InTotoStatement` → `Subject` nodes if you materialise them.
|
||||
|
||||
### 10.3 Identifiers
|
||||
|
||||
* All nodes MUST be addressable by a content hash:
|
||||
|
||||
* `ArtifactId` = hash of image manifest or binary.
|
||||
* `SbomId` = hash of canonical SBOM.
|
||||
* `StatementId` = hash of canonical in-toto JSON.
|
||||
* `DsseId` = hash of canonical DSSE JSON.
|
||||
* `VexId` = hash of canonical VEX Statement JSON.
|
||||
|
||||
Idempotence rule: inserting the same chain twice must result in the same nodes, not duplicates.
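
A minimal store contract sketch that bakes that rule into the API surface (interface and method names are illustrative):

```csharp
using System.Threading;
using System.Threading.Tasks;

public sealed record GraphNode(string NodeId, string NodeType);   // NodeId = content hash (SbomId, StatementId, DsseId, ...)

public interface IProofGraphStore
{
    // Upsert keyed on the content hash, so replaying the same chain is a no-op.
    Task<GraphNode> UpsertNodeAsync(string nodeType, string contentHash, CancellationToken ct = default);

    // Edge insert is likewise idempotent on (fromNodeId, edgeType, toNodeId).
    Task LinkAsync(string fromNodeId, string edgeType, string toNodeId, CancellationToken ct = default);
}
```
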
|
||||
|
||||
---
|
||||
|
||||
## 11. Error Handling & Policy Gates
|
||||
|
||||
### 11.1 Ingestion failures
|
||||
|
||||
* If SBOM is missing or invalid:
|
||||
|
||||
* Mark the artifact as “unproven” in the graph.
|
||||
* Raise a policy event so Scanner/CI can enforce “no SBOM, no ship” if configured.
|
||||
|
||||
### 11.2 Missing digests
|
||||
|
||||
* If a component lacks `sha256`/`sha512`:
|
||||
|
||||
* Log as incomplete subject.
|
||||
* Expose in predicate and UI as “unverifiable component – not anchored to proof chain”.
|
||||
|
||||
### 11.3 Rekor failures
|
||||
|
||||
* If Rekor is unavailable:
|
||||
|
||||
* Still store DSSE envelope locally.
|
||||
* Queue for retry.
|
||||
* Proof chain is internal-only until Rekor sync succeeds; flag accordingly (`rekorStatus: "pending"`).
|
||||
|
||||
---
|
||||
|
||||
## 12. Definition of Done for Dev Work
|
||||
|
||||
Any feature that “converts SBOMs into proof chains” is only done when:
|
||||
|
||||
1. **Canonicalization**
|
||||
|
||||
* Given the same SBOM file, multiple runs produce identical:
|
||||
|
||||
* `sbomId`
|
||||
* Statement JSON bytes
|
||||
* DSSE payload bytes (before signing)
|
||||
|
||||
2. **Subject extraction**
|
||||
|
||||
* All strong-digest components appear as subjects.
|
||||
* Deterministic ordering is tested with golden fixtures.
|
||||
|
||||
3. **DSSE + Rekor**
|
||||
|
||||
* DSSE envelopes verifiable with Authority key material.
|
||||
* Rekor entry present (or in offline queue) for each envelope.
|
||||
* Rekor metadata linked in ProofGraph.
|
||||
|
||||
4. **VEX integration**
|
||||
|
||||
* VEX for a subject is discoverable via the same subject in graph queries.
|
||||
* Scanner can prove: “this vulnerability is (not_)affected because of VEX X”.
|
||||
|
||||
5. **Graph query**
|
||||
|
||||
* From a running container/image, you can traverse:
|
||||
|
||||
* `Artifact → SBOM → Statement → DSSE → Rekor → VEX` in a single query.
|
||||
|
||||
---
|
||||
|
||||
If you want, next step I can do a concrete `.cs` layout (interfaces + record types + one golden test fixture) specifically for `StellaOps.Sbomer` and `StellaOps.ProofGraph`, so you can drop it straight into your .NET 10 solution.
|
||||
|
||||
Here’s a crisp product idea you can drop straight into Stella Ops: a **VEX “proof spine”**—an interactive, signed chain that shows exactly *why* a vuln is **not exploitable**, end‑to‑end.
|
||||
|
||||
---
|
||||
|
||||
# What it is (plain speak)
|
||||
|
||||
* A **proof spine** is a linear (but zoomable) chain of facts: *vuln → package → reachable symbol → guarded path → runtime context → policy verdict*.
|
||||
* Each segment is **cryptographically signed** (DSSE, in‑toto style) so users can audit who/what asserted it, with hashes for inputs/outputs.
|
||||
* In the UI, the chain appears as **locked graph segments**. Users can expand a segment to see the evidence, but they can’t alter it without breaking the signature.
|
||||
|
||||
---
|
||||
|
||||
# Why it’s different
|
||||
|
||||
* **From “scanner says so” to “here’s the evidence.”** This leap is what Trivy/Snyk static readouts don’t fully deliver: deterministic reachability + proof‑linked UX.
|
||||
* **Time‑to‑Evidence (TtE)** drops: the path from alert → proof is one click, reducing back‑and‑forth with security and auditors.
|
||||
* **Replayable & sovereign:** works offline, and every step is reproducible in air‑gapped audits.
|
||||
|
||||
---
|
||||
|
||||
# Minimal UX spec (fast to ship)
|
||||
|
||||
1. **Evidence Rail (left side)**
|
||||
|
||||
* Badges per segment: *SBOM*, *Match*, *Reachability*, *Guards*, *Runtime*, *Policy*.
|
||||
* Each badge shows status: ✅ verified, ⚠️ partial, ❌ missing, ⏳ pending.
|
||||
2. **Chain Canvas (center)**
|
||||
|
||||
* Segments render as locked pills connected by a line.
|
||||
* Clicking a pill opens an **Evidence Drawer** with:
|
||||
|
||||
* Inputs (hashes, versions), Tool ID, Who signed (key ID), Signature, Timestamp.
|
||||
* “Reproduce” button → prefilled `stellaops scan --replay <manifest hash>`.
|
||||
3. **Verdict Capsule (top‑right)**
|
||||
|
||||
* Final VEX statement (e.g., `not_affected: guarded-by-feature-flag`) with signer, expiry, and policy that produced it.
|
||||
4. **Audit Mode toggle**
|
||||
|
||||
* Freezes the view, shows raw DSSE envelopes and canonical JSON of each step.
|
||||
|
||||
---
|
||||
|
||||
# Data model (lean)
|
||||
|
||||
* `ProofSegment`
|
||||
|
||||
* `type`: `SBOM|Match|Reachability|Guard|Runtime|Policy`
|
||||
* `inputs`: array of `{name, hash, mediaType}`
|
||||
* `result`: JSON blob (canonicalized)
|
||||
* `attestation`: DSSE envelope
|
||||
* `tool_id`, `version`, `started_at`, `finished_at`
|
||||
* `ProofSpine`
|
||||
|
||||
* `vuln_id`, `artifact_id`, `segments[]`, `verdict`, `spine_hash`
|
||||
|
||||
---
|
||||
|
||||
# Deterministic pipeline (dev notes)
|
||||
|
||||
1. **SBOM lock** → hash the SBOM slice relevant to the package.
|
||||
2. **Vuln match** → store matcher inputs (CPE/PURL rules) and result.
|
||||
3. **Reachability pass** → static callgraph diff with symbol list; record *exact* rule set and graph hash.
|
||||
4. **Guard analysis** → record predicates (feature flags, config gates) and satisfiability result.
|
||||
5. **Runtime sampling (optional)** → link eBPF trace or app telemetry digest.
|
||||
6. **Policy evaluation** → lattice rule IDs + decision; emit final VEX statement.
|
||||
7. DSSE‑sign each step; **link by previous segment hash** (spine = mini‑Merkle chain).
|
||||
|
||||
---
|
||||
|
||||
# Quick .NET 10 implementation hints
|
||||
|
||||
* **Canonical JSON:** `System.Text.Json` with deterministic ordering; pre‑normalize floats/timestamps.
|
||||
* **DSSE:** wrap payloads, sign with your Authority service; store `key_id`, `sig`, `alg`.
|
||||
* **Hashing:** SHA‑256 of canonical result; spine hash = hash(concat of segment hashes).
|
||||
* **Replay manifests:** emit a single `scan.replay.json` containing feed versions, ruleset IDs, and all input hashes.
|
||||
|
||||
---
|
||||
|
||||
# Tiny UI contract for Angular
|
||||
|
||||
* Component: `ProofSpineComponent`
|
||||
|
||||
* `@Input() spine: ProofSpine`
|
||||
* Emits: `replayRequested(spine_hash)`, `segmentOpened(segment_id)`
|
||||
* Drawer shows: `inputs`, `result`, `attestation`, `reproduce` CTA.
|
||||
* Badge colors map to verification state from backend (`verified/partial/missing/pending`).
|
||||
|
||||
---
|
||||
|
||||
# How it lands value fast
|
||||
|
||||
* Gives customers a **credible “not exploitable”** stance with audit‑ready proofs.
|
||||
* Shortens investigations (SecOps, Dev, Compliance speak the same artifact).
|
||||
* Creates a **moat**: deterministic, signed evidence chains—hard to copy with pure static lists.
|
||||
|
||||
If you want, I’ll draft the C# models, the DSSE signer interface, and the Angular component skeleton next.
|
||||
Good, let’s turn the “proof spine” into something you can actually brief to devs, UX, and auditors as a concrete capability.
|
||||
|
||||
I’ll structure it around: domain model, lifecycle, storage, signing & trust, UX, and dev/testing guidelines.
|
||||
|
||||
---
|
||||
|
||||
## 1. Scope the Proof Spine precisely
|
||||
|
||||
### Core intent
|
||||
|
||||
A **Proof Spine** is the *minimal signed chain of reasoning* that justifies a VEX verdict for a given `(artifact, vulnerability)` pair. It must be:
|
||||
|
||||
* Deterministic: same inputs → bit-identical spine.
|
||||
* Replayable: every step has enough context to re-run it.
|
||||
* Verifiable: each step is DSSE-signed, chained by hashes.
|
||||
* Decoupled: you can verify a spine even if Scanner/Vexer code changes later.
|
||||
|
||||
### Non-goals (so devs don’t overextend)
|
||||
|
||||
* Not a general logging system.
|
||||
* Not a full provenance graph (that’s for your Proof-of-Integrity Graph).
|
||||
* Not a full data warehouse of all intermediate findings. It’s a curated, compressed reasoning chain.
|
||||
|
||||
---
|
||||
|
||||
## 2. Domain model: from “nice idea” to strict contract
|
||||
|
||||
Think in terms of three primitives:
|
||||
|
||||
1. `ProofSpine`
|
||||
2. `ProofSegment`
|
||||
3. `ReplayManifest`
|
||||
|
||||
### 2.1 `ProofSpine` (aggregate root)
|
||||
|
||||
Per `(ArtifactId, VulnerabilityId, PolicyProfileId)` you have at most one **latest** active spine.
|
||||
|
||||
Key fields:
|
||||
|
||||
* `SpineId` (ULID / GUID): stable ID for references and URLs.
|
||||
* `ArtifactId` (image digest, repo+tag, etc.).
|
||||
* `VulnerabilityId` (CVE, GHSA, etc.).
|
||||
* `PolicyProfileId` (which lattice/policy produced the verdict).
|
||||
* `Segments[]` (ordered; see below).
|
||||
* `Verdict` (`affected`, `not_affected`, `fixed`, `under_investigation`, etc.).
|
||||
* `VerdictReason` (short machine code, e.g. `unreachable-code`, `guarded-runtime-config`).
|
||||
* `RootHash` (hash of concatenated segment hashes).
|
||||
* `ScanRunId` (link back to scan execution).
|
||||
* `CreatedAt`, `SupersededBySpineId?`.
|
||||
|
||||
C# sketch:
|
||||
|
||||
```csharp
public sealed record ProofSpine(
    string SpineId,
    string ArtifactId,
    string VulnerabilityId,
    string PolicyProfileId,
    IReadOnlyList<ProofSegment> Segments,
    string Verdict,
    string VerdictReason,
    string RootHash,
    string ScanRunId,
    DateTimeOffset CreatedAt,
    string? SupersededBySpineId
);
```
|
||||
|
||||
### 2.2 `ProofSegment` (atomic evidence step)
|
||||
|
||||
Each segment represents **one logical transformation**:
|
||||
|
||||
Examples of `SegmentType`:
|
||||
|
||||
* `SBOM_SLICE` – “Which components are relevant?”
|
||||
* `MATCH` – “Which SBOM component matches this vuln feed record?”
|
||||
* `REACHABILITY` – “Is the vulnerable symbol reachable in this build?”
|
||||
* `GUARD_ANALYSIS` – “Is this path gated by config/feature flag?”
|
||||
* `RUNTIME_OBSERVATION` – “Was this code observed at runtime?”
|
||||
* `POLICY_EVAL` – “How did the lattice/policy combine evidence?”
|
||||
|
||||
Fields:
|
||||
|
||||
* `SegmentId`
|
||||
* `SegmentType`
|
||||
* `Index` (0-based position in spine)
|
||||
* `Inputs` (canonical JSON)
|
||||
* `Result` (canonical JSON)
|
||||
* `InputHash` (`SHA256(canonical(Inputs))`)
|
||||
* `ResultHash`
|
||||
* `PrevSegmentHash` (optional for first segment)
|
||||
* `Envelope` (DSSE payload + signature)
|
||||
* `ToolId`, `ToolVersion`
|
||||
* `Status` (`verified`, `partial`, `invalid`, `unknown`)
|
||||
|
||||
C# sketch:
|
||||
|
||||
```csharp
public sealed record ProofSegment(
    string SegmentId,
    string SegmentType,
    int Index,
    string InputHash,
    string ResultHash,
    string? PrevSegmentHash,
    DsseEnvelope Envelope,
    string ToolId,
    string ToolVersion,
    string Status
);

public sealed record DsseEnvelope(
    string PayloadType,
    string PayloadBase64,
    IReadOnlyList<DsseSignature> Signatures
);

public sealed record DsseSignature(
    string KeyId,
    string SigBase64
);
```
|
||||
|
||||
### 2.3 `ReplayManifest` (reproducibility anchor)
|
||||
|
||||
A `ReplayManifest` is emitted per scan run and referenced by multiple spines:
|
||||
|
||||
* `ReplayManifestId`
|
||||
* `Feeds` (names + versions + digests)
|
||||
* `Rulesets` (reachability rules version, lattice policy version)
|
||||
* `Tools` (scanner, sbomer, vexer versions)
|
||||
* `Environment` (OS, arch, container image digest where the scan ran)
|
||||
|
||||
This is what your CLI will take:
|
||||
|
||||
```bash
stellaops scan --replay <ReplayManifestId> --artifact <ArtifactId> --vuln <VulnerabilityId>
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Lifecycle: where the spine is built in Stella Ops
|
||||
|
||||
### 3.1 Producer components
|
||||
|
||||
The following services contribute segments:
|
||||
|
||||
* `Sbomer` → `SBOM_SLICE`
|
||||
* `Scanner` → `MATCH`, maybe `RUNTIME_OBSERVATION` if it integrates runtime traces
|
||||
* `Reachability Engine` inside `Scanner` / dedicated module → `REACHABILITY`
|
||||
* `Guard Analyzer` (config/feature flag evaluator) → `GUARD_ANALYSIS`
|
||||
* `Vexer/Excititor` → `POLICY_EVAL`, final verdict
|
||||
* `Authority` → optional cross-signing / endorsement segment (`TRUST_ASSERTION`)
|
||||
|
||||
Important: each microservice **emits its own segments**, not a full spine. A small orchestrator (inside Vexer or a dedicated `ProofSpineBuilder`) collects, orders, and chains them.
|
||||
|
||||
### 3.2 Build sequence
|
||||
|
||||
Example for a “not affected due to guard” verdict:
|
||||
|
||||
1. `Sbomer` produces `SBOM_SLICE` segment for `(Artifact, Vuln)` and DSSE-signs it.
|
||||
2. `Scanner` takes slice, produces `MATCH` segment (component X -> vuln Y).
|
||||
3. `Reachability` produces `REACHABILITY` segment (symbol reachable or not).
|
||||
4. `Guard Analyzer` produces `GUARD_ANALYSIS` segment (path is gated by `feature_x_enabled=false` under current policy context).
|
||||
5. `Vexer` evaluates lattice, produces `POLICY_EVAL` segment with final VEX statement `not_affected`.
|
||||
6. `ProofSpineBuilder`:
|
||||
|
||||
* Sorts segments by predetermined order.
|
||||
* Chains `PrevSegmentHash`.
|
||||
* Computes `RootHash`.
|
||||
* Stores `ProofSpine` in canonical store and exposes it via API/GraphQL.
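
A sketch of the chaining step in item 6, assuming the `ProofSegment`/`ProofSpine` records from section 2 and treating `RootHash` as SHA-256 over the ordered, concatenated `ResultHash` values (the exact root-hash encoding is a design choice to pin down, not a given):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class ProofSpineChaining
{
    // Re-indexes segments in their predetermined order, links PrevSegmentHash,
    // and computes the spine RootHash over the ordered ResultHash values.
    public static (IReadOnlyList<ProofSegment> Chained, string RootHash) Chain(IEnumerable<ProofSegment> ordered)
    {
        var chained = new List<ProofSegment>();
        string? prevHash = null;

        foreach (var (segment, index) in ordered.Select((s, i) => (s, i)))
        {
            chained.Add(segment with { Index = index, PrevSegmentHash = prevHash });
            prevHash = segment.ResultHash;
        }

        var concatenated = string.Concat(chained.Select(s => s.ResultHash));
        var rootHash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(concatenated))).ToLowerInvariant();
        return (chained, rootHash);
    }
}
```
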
|
||||
|
||||
---
|
||||
|
||||
## 4. Storage & PostgreSQL patterns
|
||||
|
||||
You are moving more to Postgres for canonical data, so think:
|
||||
|
||||
### 4.1 Tables (conceptual)
|
||||
|
||||
`proof_spines`:
|
||||
|
||||
* `spine_id` (PK)
|
||||
* `artifact_id`
|
||||
* `vuln_id`
|
||||
* `policy_profile_id`
|
||||
* `verdict`
|
||||
* `verdict_reason`
|
||||
* `root_hash`
|
||||
* `scan_run_id`
|
||||
* `created_at`
|
||||
* `superseded_by_spine_id` (nullable)
|
||||
* `segment_count`
|
||||
|
||||
Indexes:
|
||||
|
||||
* `(artifact_id, vuln_id, policy_profile_id)`
|
||||
* `(scan_run_id)`
|
||||
* `(root_hash)`
|
||||
|
||||
`proof_segments`:
|
||||
|
||||
* `segment_id` (PK)
|
||||
* `spine_id` (FK)
|
||||
* `idx`
|
||||
* `segment_type`
|
||||
* `input_hash`
|
||||
* `result_hash`
|
||||
* `prev_segment_hash`
|
||||
* `envelope` (bytea or text)
|
||||
* `tool_id`
|
||||
* `tool_version`
|
||||
* `status`
|
||||
* `created_at`
|
||||
|
||||
Optional `proof_segment_payloads` if you want fast JSONB search on `inputs` / `result`:
|
||||
|
||||
* `segment_id` (PK) FK
|
||||
* `inputs_jsonb`
|
||||
* `result_jsonb`
|
||||
|
||||
Guidelines:
|
||||
|
||||
* Use **append-only** semantics: never mutate segments; supersede by new spine.
|
||||
* Partition `proof_spines` and `proof_segments` by time or `scan_run_id` if volume is large.
|
||||
* Keep envelopes as raw bytes; only parse/validate on demand or asynchronously for indexing.
|
||||
|
||||
---
|
||||
|
||||
## 5. Signing, keys, and trust model
|
||||
|
||||
### 5.1 Signers
|
||||
|
||||
At minimum:
|
||||
|
||||
* One keypair per *service* (Sbomer, Scanner, Reachability, Vexer).
|
||||
* Optional: vendor keys for imported spines/segments.
|
||||
|
||||
Key management:
|
||||
|
||||
* Keys and key IDs are owned by `Authority` service.
|
||||
* Services obtain signing keys via short-lived tokens or integrate with HSM/Key vault under Authority control.
|
||||
* Key rotation:
|
||||
|
||||
* Keys have validity intervals.
|
||||
* Spines keep `KeyId` in each DSSE signature.
|
||||
* Authority maintains a trust table: which keys are trusted for which `SegmentType` and time window.
|
||||
|
||||
### 5.2 Verification flow
|
||||
|
||||
When UI loads a spine:
|
||||
|
||||
1. Fetch `ProofSpine` + `ProofSegments`.
|
||||
2. For each segment:
|
||||
|
||||
* Verify DSSE signature via Authority API.
|
||||
* Validate `PrevSegmentHash` integrity.
|
||||
3. Compute `RootHash` and check against stored `RootHash`.
|
||||
4. Expose per-segment `status` to UI: `verified`, `untrusted-key`, `signature-failed`, `hash-mismatch`.
|
||||
|
||||
This drives the badge colors in the UX.
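
A sketch of that verification pass, assuming a hypothetical Authority-backed `IDsseVerifier`; the root-hash recomputation (step 3) mirrors the builder and is omitted here:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public interface IDsseVerifier
{
    // Hypothetical Authority API: returns "verified", "untrusted-key", or "signature-failed".
    Task<string> VerifyAsync(DsseEnvelope envelope, string segmentType);
}

public sealed class ProofSpineVerifier
{
    private readonly IDsseVerifier _verifier;
    public ProofSpineVerifier(IDsseVerifier verifier) => _verifier = verifier;

    // Walks the chain in order; a PrevSegmentHash that does not match the previous
    // segment's ResultHash downgrades that segment to "hash-mismatch".
    public async Task<IReadOnlyList<(ProofSegment Segment, string Status)>> VerifyAsync(ProofSpine spine)
    {
        var results = new List<(ProofSegment, string)>();
        string? expectedPrev = null;

        foreach (var segment in spine.Segments.OrderBy(s => s.Index))
        {
            var status = segment.PrevSegmentHash != expectedPrev
                ? "hash-mismatch"
                : await _verifier.VerifyAsync(segment.Envelope, segment.SegmentType);

            results.Add((segment, status));
            expectedPrev = segment.ResultHash;
        }
        return results;
    }
}
```
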
|
||||
|
||||
---
|
||||
|
||||
## 6. UX: from “rail + pills” to full flows
|
||||
|
||||
Think of three primary UX contexts:
|
||||
|
||||
1. **Vulnerability detail → “Explain why not affected”**
|
||||
2. **Audit view → “Show me all evidence behind this VEX statement”**
|
||||
3. **Developer triage → “Where exactly did the reasoning go conservative?”**
|
||||
|
||||
### 6.1 Spine view patterns
|
||||
|
||||
For each `(artifact, vuln)`:
|
||||
|
||||
* **Top summary bar**
|
||||
|
||||
* Verdict pill: `Not affected (guarded by runtime config)`
|
||||
* Confidence / verification status: e.g. `Proof verified`, `Partial proof`.
|
||||
* Links:
|
||||
|
||||
* “Download Proof Spine” (JSON/DSSE bundle).
|
||||
* “Replay this analysis” (CLI snippet).
|
||||
|
||||
* **Spine stepper**
|
||||
|
||||
* Horizontal list of segments (SBOM → Match → Reachability → Guard → Policy).
|
||||
* Each segment displays:
|
||||
|
||||
* Type
|
||||
* Service name
|
||||
* Status (icon + color)
|
||||
* On click: open side drawer.
|
||||
|
||||
* **Side drawer (segment detail)**
|
||||
|
||||
* `Who`: `ToolId`, `ToolVersion`, `KeyId`.
|
||||
* `When`: timestamps.
|
||||
* `Inputs`:
|
||||
|
||||
* Pretty-printed subset with “Show canonical JSON” toggle.
|
||||
* `Result`:
|
||||
|
||||
* Human-oriented short explanation + raw JSON view.
|
||||
* `Attestation`:
|
||||
|
||||
* Signature summary: `Signature verified / Key untrusted / Invalid`.
|
||||
* `PrevSegmentHash` & `ResultHash` (shortened with copy icons).
|
||||
* “Run this step in isolation” button if you support it (nice-to-have).
|
||||
|
||||
### 6.2 Time-to-Evidence (TtE) integration
|
||||
|
||||
You already asked for guidelines on “Tracking UX Health with Time-to-Evidence”.
|
||||
|
||||
Use the spine as the data source:
|
||||
|
||||
* Measure `TtE` as:
|
||||
|
||||
* `time_from_alert_opened_to_first_spine_view` OR
|
||||
* `time_from_alert_opened_to_verdict_understood`.
|
||||
* Instrument events:
|
||||
|
||||
* `spine_opened`, `segment_opened`, `segment_scrolled_to_end`, `replay_clicked`.
|
||||
* Use this to spot UX bottlenecks:
|
||||
|
||||
* Too many irrelevant segments.
|
||||
* Missing human explanations.
|
||||
* Overly verbose JSON.
|
||||
|
||||
### 6.3 Multiple paths and partial evidence
|
||||
|
||||
You might have:
|
||||
|
||||
* Static reachability: says “unreachable”.
|
||||
* Runtime traces: not collected.
|
||||
* Policy: chooses conservative path.
|
||||
|
||||
UI guidelines:
|
||||
|
||||
* Allow small branching visualization if you ever model alternative reasoning paths, but for v1:
|
||||
|
||||
* Treat missing segments as explicit `pending` / `unknown`.
|
||||
* Show them as grey pills: “Runtime observation: not available”.
|
||||
|
||||
---
|
||||
|
||||
## 7. Replay & offline/air-gap story
|
||||
|
||||
For air-gapped Stella Ops this is one of your moats.
|
||||
|
||||
### 7.1 Manifest shape
|
||||
|
||||
`ReplayManifest` (JSON, canonicalized):
|
||||
|
||||
* `manifest_id`
|
||||
* `generated_at`
|
||||
* `tools`:
|
||||
|
||||
* `{ "id": "Scanner", "version": "10.1.3", "image_digest": "..." }`
|
||||
* etc.
|
||||
* `feeds`:
|
||||
|
||||
* `{ "name": "nvd", "version": "2025-11-30T00:00:00Z", "hash": "..." }`
|
||||
* `policies`:
|
||||
|
||||
* `{ "policy_profile_id": "default-eu", "version": "3.4.0", "hash": "..." }`
|
||||
|
||||
CLI contract:
|
||||
|
||||
```bash
stellaops scan \
  --replay-manifest <id-or-file> \
  --artifact <image-digest> \
  --vuln <cve> \
  --explain
```
|
||||
|
||||
Replay guarantees:
|
||||
|
||||
* If the artifact and feeds are still available, replay reproduces:
|
||||
|
||||
* identical segments,
|
||||
* identical `RootHash`,
|
||||
* identical verdict.
|
||||
* If anything changed:
|
||||
|
||||
* CLI clearly marks divergence: “Recomputed proof differs from stored spine (hash mismatch).”
|
||||
|
||||
### 7.2 Offline bundle integration
|
||||
|
||||
Your offline update kit should:
|
||||
|
||||
* Ship manifests alongside feed bundles.
|
||||
* Keep a small index “manifest_id → bundle file”.
|
||||
* Allow customers to verify that a spine produced 6 months ago used feed version X that they still have in archive.
|
||||
|
||||
---
|
||||
|
||||
## 8. Performance, dedup, and scaling
|
||||
|
||||
### 8.1 Dedup segments
|
||||
|
||||
Many artifacts share partial reasoning, e.g.:
|
||||
|
||||
* Same base image SBOM slice.
|
||||
* Same reachability result for a shared library.
|
||||
|
||||
You have options:
|
||||
|
||||
1. **Simple v1:** keep segments embedded in spines. Optimize later.
|
||||
2. **Advanced:** deduplicate by `ResultHash` + `SegmentType` + `ToolId`:
|
||||
|
||||
* Store unique “segment payloads” in a table keyed by that combination.
|
||||
* `ProofSegment` then references payload via foreign key.
|
||||
|
||||
Guideline for now: instruct devs to design with **possible dedup** in mind (segment payloads should be referable).
|
||||
|
||||
### 8.2 Retention strategy
|
||||
|
||||
* Keep full spines for:
|
||||
|
||||
* Recent scans (e.g., last 90 days) for triage.
|
||||
* Any spines that were exported to auditors or regulators.
|
||||
* For older scans:
|
||||
|
||||
* Option A: keep only `POLICY_EVAL` + `RootHash` + short summary.
|
||||
* Option B: archive full spines to object storage (S3/minio) keyed by `RootHash`.
|
||||
|
||||
---
|
||||
|
||||
## 9. Security & multi-tenant boundaries
|
||||
|
||||
Stella Ops will likely serve many customers / environments.
|
||||
|
||||
Guidelines:
|
||||
|
||||
* `SpineId` is globally unique, but all queries must be scope-checked by:
|
||||
|
||||
* `TenantId`
|
||||
* `EnvironmentId`
|
||||
* Authority verifies not only signatures, but also **key scopes**:
|
||||
|
||||
* Key X is only allowed to sign for Tenant T / Environment E, or for system-wide tools.
|
||||
* Never leak file paths, internal IPs, or customer-specific configs in the human-friendly explanation. Those can stay in the canonical JSON, which is exposed only in advanced / audit mode.
|
||||
|
||||
---
|
||||
|
||||
## 10. Developer & tester guidelines
|
||||
|
||||
### 10.1 For implementors (C# / .NET 10)
|
||||
|
||||
* Use a **single deterministic JSON serializer** (e.g. wrapper around `System.Text.Json`) with:
|
||||
|
||||
* Stable property order.
|
||||
* Standardized timestamp format (UTC ISO 8601).
|
||||
* Explicit numeric formats (no locale-dependent decimals).
|
||||
* Before signing:
|
||||
|
||||
* Canonicalize JSON.
|
||||
* Hash bytes directly.
|
||||
* Never change canonicalization semantics in a minor version. If you must, bump a major version and record it in `ReplayManifest`.
|
||||
|
||||
### 10.2 For test engineers
|
||||
|
||||
Build a curated suite of fixture scenarios:
|
||||
|
||||
1. “Straightforward not affected”:
|
||||
|
||||
* Unreachable symbol, no runtime data, conservative policy: still `not_affected` due to unreachable.
|
||||
2. “Guarded at runtime”:
|
||||
|
||||
* Reachable symbol, but guard based on config → `not_affected`.
|
||||
3. “Missing segment”:
|
||||
|
||||
* Remove `REACHABILITY` segment → policy should downgrade to `affected` or `under_investigation`.
|
||||
4. “Signature tampering”:
|
||||
|
||||
* Flip a byte in one DSSE payload → UI must show `invalid` and mark entire spine as compromised.
|
||||
5. “Key revocation”:
|
||||
|
||||
* Mark a key untrusted → segments signed with it become `untrusted-key` and spine is partially verified.
|
||||
|
||||
Provide golden JSON for:
|
||||
|
||||
* `ProofSpine` object.
|
||||
* Each `ProofSegment` envelope.
|
||||
* Expected `RootHash`.
|
||||
* Expected UI status per segment.
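
A minimal golden-fixture test sketch (xUnit); the fixture paths are illustrative and the root-hash recipe must match whatever the spine builder actually does:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
using Xunit;

public class ProofSpineGoldenTests
{
    [Fact]
    public void Guarded_at_runtime_fixture_reproduces_expected_root_hash()
    {
        // Ordered segment ResultHash values captured once as a golden fixture.
        var segmentHashes = File.ReadAllLines("fixtures/guarded-at-runtime.segment-hashes.txt");
        var concatenated = Encoding.UTF8.GetBytes(string.Concat(segmentHashes));
        var recomputedRootHash = Convert.ToHexString(SHA256.HashData(concatenated)).ToLowerInvariant();

        var expectedRootHash = File.ReadAllText("fixtures/guarded-at-runtime.roothash.txt").Trim();
        Assert.Equal(expectedRootHash, recomputedRootHash);
    }
}
```
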
|
||||
|
||||
---
|
||||
|
||||
## 11. How this ties into your moats
|
||||
|
||||
This Proof Spine is not just “nice UX”:
|
||||
|
||||
* It is the **concrete substrate** for:
|
||||
|
||||
* Trust Algebra Studio (the lattice engine acts on segments and outputs `POLICY_EVAL` segments).
|
||||
* Proof-Market Ledger (publish `RootHash` + minimal metadata).
|
||||
* Deterministic, replayable scans (spine + manifest).
|
||||
* Competitors can show “reasons”, but you are explicitly providing:
|
||||
|
||||
* Signed, chain-of-evidence reasoning,
|
||||
* With deterministic replay,
|
||||
* Packaged for regulators and procurement.
|
||||
|
||||
---
|
||||
|
||||
If you want, next step I can draft:
|
||||
|
||||
* A proto/JSON schema for `ProofSpine` bundles for export/import.
|
||||
* A minimal set of REST/GraphQL endpoints for querying spines from UI and external auditors.
|
||||
|
||||
I thought you’d be interested in this — there’s real momentum toward exactly what you want for Stella Ops’s vision: tools now offer VEX‑based attestations and more advanced UX around vulnerability context and suppression.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
## ✅ What others are doing now that matches Stella’s roadmap
|
||||
|
||||
* **Docker Scout** — already supports creating exceptions using VEX documents, both via CLI and GUI. That means you can attach a VEX (OpenVEX) statement to a container image marking certain CVEs non‑applicable, fixed, or mitigated. Scout then automatically suppresses those CVEs from scan results. ([Docker Documentation][1])
|
||||
* The CLI now includes a command to fetch a merged VEX document (`docker scout vex get`), which allows retrieving the effective vulnerability‑status attestations for a given image. That gives a machine‑readable manifest of “what is safe/justified.” ([Docker Documentation][2])
|
||||
* Exception management via GUI: you can use the dashboard or Docker Desktop to create “Accepted risk” or “False positive” exceptions, with justifications and scopes (single image, repo, org-wide, etc.). That ensures flexibility when a vulnerability exists but is considered safe given context. ([Docker Documentation][3])
|
||||
* **Anchore Enterprise** — with release 5.23 (Nov 10, 2025), it added support for exporting vulnerability annotations in the format of CycloneDX VEX, plus support for vulnerability disclosure reports (VDR). That means teams can annotate which CVEs are effectively mitigated, non‑applicable, or fixed, and generate standardized VEX/VDR outputs. ([Anchore][4])
|
||||
* Anchore’s UI now gives improved UX: rather than just a severity pie chart, there are linear metrics — severity distribution, EPSS score ranges, KEV status, fix availability — and filtering tools to quickly assess risk posture. Annotations are accessible via UI or API, making vulnerability justification part of the workflow. ([Anchore Documentation][5])
|
||||
|
||||
Because of these developments, a product like Stella can realistically embed inline “Show Proof” / “Why safe?” panels that link directly to VEX documents or attestation digests — much like what Docker Scout and Anchore now support.
|
||||
|
||||
## 🔍 What this suggests for Stella’s UX & Feature Moats
|
||||
|
||||
* **Inline attestation linkage is viable now.** Since Docker Scout allows exporting/fetching a VEX JSON attestation per image, Stella could similarly pull up a VEX file and link users to it (or embed it) in a “Why safe?” panel.
|
||||
* **Vendor-agnostic VEX support makes dual-format export (OpenVEX + CycloneDX) a realistic baseline.** Anchore’s support for both formats shows that supply-chain tools are converging; Stella can adopt the same approach to increase interoperability.
|
||||
* **Exception annotation + context-aware suppression is feasible.** The “Accepted risk / False positive” model from Docker Scout — including scope, justification, and organizational visibility — gives a blueprint for how Stella might let users record contextual judgments (e.g. “component unused”, “mitigated by runtime configuration”) and persist them in a standardized VEX message.
|
||||
* **Better UX for risk prioritization and filtering.** Anchore’s shift from pie-chart severity to multi-dimensional risk summaries (severity, EPSS, fix status) gives a better mental model for users than raw CVE counts. Stella’s prioritization UI could adopt a similar holistic scoring approach — perhaps further enriched by runtime context, as you envision.
|
||||
|
||||
## ⚠️ What to watch out for
|
||||
|
||||
* The field of VEX‑based scanning tools is still maturing. A recent academic paper found that different VEX‑aware scanners often produce inconsistent vulnerability‑status results on the same container images — meaning that automated tools still differ substantially in interpretation. ([arXiv][6])
|
||||
* As reported by some users of Docker Scout, there are occasional issues when attaching VEX attestations to images in practice — e.g. attestations aren’t always honored in the web dashboard or CLI unless additional steps are taken. ([Docker Community Forums][7])
|
||||
|
||||
---
|
||||
|
||||
Given all this — your Stella Ops moats around deterministic, audit‑ready SBOM/VEX bundles and inline proof panels are *absolutely* aligned with the current trajectory of industry tooling.
|
||||
|
||||
If you like, I can collect **5–10 recent open‑source implementations** (with links) that already use VEX or CycloneDX in exactly this way; they could be useful reference code or inspiration for Stella.
|
||||
|
||||
[1]: https://docs.docker.com/scout/how-tos/create-exceptions-vex/?utm_source=chatgpt.com "Create an exception using the VEX"
|
||||
[2]: https://docs.docker.com/scout/release-notes/cli/?utm_source=chatgpt.com "Docker Scout CLI release notes"
|
||||
[3]: https://docs.docker.com/scout/how-tos/create-exceptions-gui/?utm_source=chatgpt.com "Create an exception using the GUI"
|
||||
[4]: https://anchore.com/blog/anchore-enterprise-5-23-cyclonedx-vex-and-vdr-support/?utm_source=chatgpt.com "Anchore Enterprise 5.23: CycloneDX VEX and VDR Support"
|
||||
[5]: https://docs.anchore.com/current/docs/release_notes/enterprise/5230/?utm_source=chatgpt.com "Anchore Enterprise Release Notes - Version 5.23.0"
|
||||
[6]: https://arxiv.org/abs/2503.14388?utm_source=chatgpt.com "Vexed by VEX tools: Consistency evaluation of container vulnerability scanners"
|
||||
[7]: https://forums.docker.com/t/struggling-with-adding-vex-attestations-with-docker-scout/143422?utm_source=chatgpt.com "Struggling with adding vex attestations with docker scout"
|
||||
Good, let’s pivot explicitly to DevOps as the primary persona and strip the UX down to what helps them ship and sleep.
|
||||
|
||||
I’ll frame this as a reusable guideline document you can drop into `docs/ux/ux-devops-guidelines.md`.
|
||||
|
||||
---
|
||||
|
||||
## 1. DevOps mental model
|
||||
|
||||
Design every surface assuming:
|
||||
|
||||
* They are under time pressure, context-switching, and on-call.
|
||||
* They already live in:
|
||||
|
||||
* CI logs and pipeline dashboards
|
||||
* Chat (alerts, incident rooms)
|
||||
* Kubernetes / Docker / Terraform / Ansible
|
||||
* They tolerate *some* complexity, but hate ceremony and “wizards”.
|
||||
|
||||
**Rule:** Stella UX for DevOps must always answer one of three questions clearly:
|
||||
|
||||
1. Can I ship this?
|
||||
2. If not, what exactly blocks me?
|
||||
3. What’s the minimum safe change to unblock?
|
||||
|
||||
Everything else is secondary.
|
||||
|
||||
---
|
||||
|
||||
## 2. Global UX principles for DevOps
|
||||
|
||||
1. **Pipeline-first, UI-second**
|
||||
|
||||
* Every decision surfaced in the UI must be reproducible via:
|
||||
|
||||
* CLI
|
||||
* API
|
||||
* Pipeline config (YAML)
|
||||
* UI is the “explainer & debugger”, not the only interface.
|
||||
|
||||
2. **Time-to-evidence ≤ 30 seconds**
|
||||
|
||||
* From a red flag in the pipeline to concrete, human-readable evidence:
|
||||
|
||||
* Max 3 clicks / interactions.
|
||||
* No abstract “risk scores” without a path to:
|
||||
|
||||
* SBOM line
|
||||
* VEX statement
|
||||
* Feed / CVE record
|
||||
* Artifact / image name + digest
|
||||
|
||||
3. **Three-step resolution path**
|
||||
For any finding in the UI:
|
||||
|
||||
1. See impact: “What is affected, where, and how bad?”
|
||||
2. See options: “Fix now / Waive with proof / Defer with conditions”
|
||||
3. Generate action: patch snippet, ticket, MR template, or policy change.
|
||||
|
||||
4. **No dead ends**
|
||||
|
||||
* Every screen must offer at least one next action:
|
||||
|
||||
* “Open in pipeline run”
|
||||
* “Open in cluster view”
|
||||
* “Create exception”
|
||||
* “Open proof bundle”
|
||||
* “Export as JSON”
|
||||
|
||||
5. **Deterministic, not magical**
|
||||
|
||||
* Always show *why* a decision was made:
|
||||
|
||||
* Why did the lattice say “not affected”?
|
||||
* Why is this vulnerability prioritized over others?
|
||||
* DevOps must be able to say in an incident review:
|
||||
“Stella said this is safe because X, Y, Z.”
|
||||
|
||||
---
|
||||
|
||||
## 3. Core views DevOps actually need
|
||||
|
||||
### 3.1. Pipeline / run-centric view
|
||||
|
||||
**Use:** during CI/CD failures and investigations.
|
||||
|
||||
Key elements:
|
||||
|
||||
* List of recent runs with status:
|
||||
|
||||
* ✅ Passed with notes
|
||||
* 🟡 Passed with waivers
|
||||
* 🔴 Failed by policy
|
||||
* Columns:
|
||||
|
||||
* Commit / branch
|
||||
* Image(s) or artifacts involved
|
||||
* Policy summary (“Blocked: critical vuln with no VEX coverage”)
|
||||
* Time-to-evidence: clickable “Details” link
|
||||
|
||||
On clicking a failed run:
|
||||
|
||||
* Top section:
|
||||
|
||||
* “Why this run failed” in one sentence.
|
||||
* Example:
|
||||
`Blocked: CVE-2025-12345 (Critical, reachable, no fix, no VEX proof).`
|
||||
* Immediately below:
|
||||
|
||||
* Button: **“Show evidence”** → opens vulnerability detail with:
|
||||
|
||||
* SBOM component
|
||||
* Path in image (e.g. `/usr/lib/libfoo.so`)
|
||||
* Feed record used
|
||||
* VEX status (if any)
|
||||
* Lattice verdict (“reachable because …”)
|
||||
* Side rail:
|
||||
|
||||
* “Possible actions”:
|
||||
|
||||
* Propose upgrade (version suggestions)
|
||||
* Draft exception (with required justification template)
|
||||
* Open in cluster view (if deployed)
|
||||
* Export proof bundle (for auditor / security team)
|
||||
|
||||
### 3.2. Artifact-centric view (image / component)
|
||||
|
||||
**Use:** when DevOps wants a clean risk story per image.
|
||||
|
||||
Key elements:
|
||||
|
||||
* Title: `<registry>/<repo>:<tag> @ sha256:…`
|
||||
* Score block:
|
||||
|
||||
* Number of vulnerabilities by status:
|
||||
|
||||
* Affected
|
||||
* Not affected (with VEX proof)
|
||||
* Fixed in newer tag
|
||||
* Policy verdict: “Allowed / Allowed with waivers / Blocked”
|
||||
* “Proof Spine” panel:
|
||||
|
||||
* SBOM hash
|
||||
* VEX attestation hashes
|
||||
* Scan manifest hash
|
||||
* Link to Rekor / internal ledger entry (if present)
|
||||
* Table:
|
||||
|
||||
* Column set:
|
||||
|
||||
* CVE / ID
|
||||
* Effective status (after VEX & lattice)
|
||||
* Reachability (reachable / not reachable / unknown)
|
||||
* Fix available?
|
||||
* Exceptions applied?
|
||||
* Filters:
|
||||
|
||||
* “Show only blockers”
|
||||
* “Show only items with VEX”
|
||||
* “Show only unknown reachability”
|
||||
|
||||
From here, DevOps should be able to:
|
||||
|
||||
* Promote / block this artifact in specific environments.
|
||||
* Generate a short “risk summary” text to paste into change records.
|
||||
|
||||
### 3.3. Environment / cluster-centric view
|
||||
|
||||
**Use:** operational posture and compliance.
|
||||
|
||||
Key elements:
|
||||
|
||||
* Node: `environment → service → artifact`.
|
||||
* Color-coded status:
|
||||
|
||||
* Green: no blockers / only accepted risk with proof
|
||||
* Yellow: waivers that are close to expiry or weakly justified
|
||||
* Red: policy-violating deployments
|
||||
* For each service:
|
||||
|
||||
* Running image(s)
|
||||
* Last scan age
|
||||
* VEX coverage ratio:
|
||||
|
||||
* “80% of critical vulns have VEX or explicit policy decision”
|
||||
|
||||
Critical UX rule:
|
||||
From a red environment tile, DevOps can drill down in 2 steps to:
|
||||
|
||||
1. The exact conflicting artifact.
|
||||
2. The exact vulnerability + policy rule causing the violation.
|
||||
|
||||
---
|
||||
|
||||
## 4. Evidence & proof presentation
|
||||
|
||||
For DevOps, the key is: **“Can I trust this automated decision during an incident?”**
|
||||
|
||||
UX pattern for a single vulnerability:
|
||||
|
||||
1. **Summary strip**
|
||||
|
||||
* `CVE-2025-12345 · Critical · Reachable · No fix`
|
||||
* Small chip: `Policy: BLOCK`
|
||||
|
||||
2. **Evidence tabs**
|
||||
|
||||
* `SBOM`
|
||||
Exact component, version, and path.
|
||||
* `Feeds`
|
||||
Which feed(s) and timestamps were used.
|
||||
* `VEX`
|
||||
All VEX statements (source, status, time).
|
||||
* `Lattice decision`
|
||||
Human-readable explanation of why the final verdict is what it is.
|
||||
* `History`
|
||||
Changes over time: “Previously not affected via vendor VEX; changed to affected on <date>.”
|
||||
|
||||
3. **Action panel**
|
||||
|
||||
* For DevOps:
|
||||
|
||||
* “Suggest upgrade to safe version”
|
||||
* “Propose temporary exception”
|
||||
* “Re-run scan with latest feeds” (if allowed)
|
||||
* Guardrail: exceptions require:
|
||||
|
||||
* Scope (image / service / environment / org)
|
||||
* Duration / expiry
|
||||
* Justification text
|
||||
* Optional attachment (ticket link, vendor email)
|
||||
|
||||
---
|
||||
|
||||
## 5. Exception & waiver UX specifically for DevOps
|
||||
|
||||
DevOps needs fast but controlled handling of “we must ship with this risk.”
|
||||
|
||||
Guidelines:
|
||||
|
||||
1. **Default scope presets**
|
||||
|
||||
* “This run only”
|
||||
* “This branch / service”
|
||||
* “This environment (e.g. staging only)”
|
||||
* “Global (requires higher role / Authority approval)”
|
||||
|
||||
2. **Strong, structured justification UI**
|
||||
|
||||
* Dropdown reason categories:
|
||||
|
||||
* “Not reachable in this deployment”
|
||||
* “Mitigated by config / WAF”
|
||||
* “Vendor VEX says not affected”
|
||||
* “Business override / emergency”
|
||||
* Required free-text field:
|
||||
|
||||
* 2–3 suggested sentence starters to prevent “OK” as justification.
|
||||
|
||||
3. **Expiry as first-class attribute**
|
||||
|
||||
* Every exception must show:
|
||||
|
||||
* End date
|
||||
* “Time left” indicator
|
||||
* UI warning when exceptions are about to expire in critical environments.
|
||||
|
||||
4. **Audit-friendly timeline**
|
||||
|
||||
* For each exception:
|
||||
|
||||
* Who created it
|
||||
* Which run / artifact triggered it
|
||||
* Policy evaluation before/after
|
||||
|
||||
DevOps UX goal:
|
||||
Create waiver in < 60 seconds, but with enough structure that auditors and security are not furious later.
|
||||
|
||||
---
|
||||
|
||||
## 6. CLI and automation UX
|
||||
|
||||
DevOps often never open the web UI during normal work; they see:
|
||||
|
||||
* CLI output
|
||||
* Pipeline logs
|
||||
* Alerts in chat
|
||||
|
||||
Guidelines:
|
||||
|
||||
1. **Stable, simple exit codes**
|
||||
|
||||
* `0` = no policy violation
|
||||
* `1` = policy violation
|
||||
* `2` = scanner/system error (distinguish clearly from “found vulns”)
|
||||
|
||||
2. **Dual output**
|
||||
|
||||
* Human-readable summary:
|
||||
|
||||
* Short, 3–5 lines by default
|
||||
* Machine-readable JSON:
|
||||
|
||||
* `--output json` or auto-detected in CI
|
||||
* Includes links to:
|
||||
|
||||
* Web UI run page
|
||||
* Proof bundle ID
|
||||
* Rekor / ledger reference
|
||||
|
||||
3. **Minimal default noise**
|
||||
|
||||
* Default CLI mode is concise; verbose details via `-v`/`-vv`.
|
||||
* One-line per blocking issue, with an ID you can copy into the web UI.
|
||||
|
||||
4. **Copy/paste-friendly**
|
||||
|
||||
* IDs, hashes, URLs must be selectable and minimally noisy.
|
||||
* Don’t wrap hashes in decorations that make copy hard.
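
A small sketch of how the CLI reporting path might honor the exit-code and dual-output rules above (flag handling, type and property names are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public sealed record BlockingIssue(string Id, string Summary);
public sealed record ScanResult(string Verdict, bool SystemError, IReadOnlyList<BlockingIssue> BlockingIssues);

public static class ScanCliExit
{
    public const int Ok = 0;              // no policy violation
    public const int PolicyViolation = 1; // policy violation
    public const int SystemError = 2;     // scanner/system error, distinct from "found vulns"

    public static int Report(ScanResult scanResult, bool jsonOutput)
    {
        if (jsonOutput)
        {
            // Machine-readable output for CI; the JSON would also carry run URL and proof bundle ID.
            Console.WriteLine(JsonSerializer.Serialize(scanResult));
        }
        else
        {
            // Concise human summary: one line per blocking issue, with a copyable ID.
            Console.WriteLine($"Policy verdict: {scanResult.Verdict}");
            foreach (var issue in scanResult.BlockingIssues.Take(5))
                Console.WriteLine($"  {issue.Id}  {issue.Summary}");
        }

        return scanResult.SystemError ? SystemError
             : scanResult.BlockingIssues.Any() ? PolicyViolation
             : Ok;
    }
}
```
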
|
||||
|
||||
---
|
||||
|
||||
## 7. Alerting & incident integration
|
||||
|
||||
When DevOps is on-call, Stella UX should behave like a good colleague, not a chatty auditor.
|
||||
|
||||
1. **Alert text pattern**
|
||||
|
||||
* Subject / title:
|
||||
|
||||
* `[Stella] Production blocked: Image X (CVE-YYYY-NNNN)`
|
||||
* First line:
|
||||
|
||||
* “Policy blocked deployment of `<service>@<env>` due to: `<summary>`”
|
||||
* Then a single deep link:
|
||||
|
||||
* “Open in Stella (Run #12345)”
|
||||
|
||||
2. **Degraded-mode cues**
|
||||
|
||||
* If feeds are stale or air-gapped kit is outdated:
|
||||
|
||||
* Clear banner in UI and in CLI output:
|
||||
|
||||
* “Scanner currently operating with feeds from <date>. Confidence reduced.”
|
||||
* This is vital for trustworthy DevOps decisions.
|
||||
|
||||
---
|
||||
|
||||
## 8. Metrics that matter for DevOps UX
|
||||
|
||||
To keep DevOps-focused UX honest, track:
|
||||
|
||||
* Median **time from pipeline failure to first evidence view**.
|
||||
* Median **time from evidence view to decision** (fix / exception / revert).
|
||||
* % of exceptions with:
|
||||
|
||||
* Valid justification
|
||||
* Non-expired status
|
||||
* % of blocked deployments that were later overruled by humans, by reason.
|
||||
|
||||
Use these to iteratively simplify screens and flows that DevOps clearly struggle with.
|
||||
|
||||
---
|
||||
|
||||
If you want, the next step can be:
|
||||
|
||||
* A concrete wireframe spec for **one key flow**, for example:
|
||||
“Pipeline fails → DevOps opens Stella → applies time-bounded exception for staging, but not production.”
|
||||
That can be expressed as step-by-step UI states that you hand to your frontend devs.
|
||||
Stella DevOps UX Implementation Guide
|
||||
1. DevOps Mental Model
|
||||
Development Direction: Align the platform’s design with a DevOps engineer’s mental model of the software delivery flow. All key entities (pipelines, builds, artifacts, environments, deployments) should be first-class concepts in both UI and API. The system must allow tracing the path from code commit through CI/CD pipeline to the artifact and finally to the running environment, reflecting how DevOps think about changes moving through stages. This means using consistent identifiers (e.g. commit SHA, artifact version, build number) across views so everything is linked in a coherent flow[1]. For example, an engineer should easily follow a chain from a security control or test result, to the artifact produced, to where that artifact is deployed.
|
||||
Implementation Plan: Model the domain objects (pipeline runs, artifacts, environments) in the backend with clear relationships. For instance, store each pipeline run with metadata: commit ID, associated artifact IDs, and target environment. Implement linking in the UI: pipeline run pages link to the artifacts they produced; artifact pages link to the deployments or environments where they’re running. Use tags or labels (in a database or artifact repository metadata) to tie artifacts back to source commits or tickets. This could leverage existing CI systems (Jenkins, GitLab CI, etc.) by pulling their data via APIs, or be built on a custom pipeline engine (e.g. Tekton on Kubernetes for native pipeline CRDs). Ensure any integration (with Git or ticketing) populates these references automatically. By tagging and correlating objects, we enable deep linking: e.g. clicking an artifact’s version shows which pipeline produced it and which environment it's in[1].
DevOps-facing Outcome: DevOps users will experience a platform that “thinks” the way they do. In practice, they can trace a story of a change across the system: for a given commit, see the CI/CD run that built it, view the artifact (container image, package, etc.) with its SBOM and test results attached, and see exactly which environment or cluster is running that version[1]. This traceability instills confidence – it’s obvious where any given change is and what happened to it. New team members find the UI intuitive because it mirrors real deployment workflows rather than abstract concepts.
2. Global UX Principles for DevOps
Development Direction: Build the user experience with an emphasis on clarity, consistency, and minimal friction for DevOps tasks. The platform should be intuitive enough that common actions require few clicks and little to no documentation. Use familiar conventions from other DevOps tools (icons, terminology, keyboard shortcuts) to leverage existing mental models[2]. Prioritize core functionality over feature bloat to keep the interface straightforward – focus on the top tasks DevOps engineers perform daily. Every part of the tool (UI, CLI, API) should follow the same design principles so that switching contexts doesn’t confuse the user[3].
Implementation Plan: Adopt a consistent design system and navigation structure across all modules. For example, use standard color coding (green for success, red for failure) and layout similar to popular CI/CD tools for pipeline status to meet user expectations[2]. Implement safe defaults and templates: e.g. provide pipeline configuration templates and environment defaults so users aren’t overwhelmed with setup (following “convention over configuration” for common scenarios[4]). Ensure immediate, contextual feedback in the UI – if a pipeline fails, highlight the failed step with error details right there (no hunting through logs unnecessarily). Incorporate guidance into the product: for instance, tooltips or inline hints for first-time setup, but design the flow so that the “right way” is also the easiest way (leveraging constraints to guide best practices[5]). Integrate authentication and SSO with existing systems (LDAP/OIDC) to avoid extra logins, and integrate with familiar interfaces (ChatOps, Slack, IDE plugins) to reduce context-switching. Maintain parity between the web UI and CLI by making both use the same underlying APIs – this ensures consistency and that improvements apply to all interfaces. In development, use UX best practices such as usability testing with actual DevOps users to refine workflows (e.g. ensure creating a new environment or pipeline is a short, logical sequence). Keep pages responsive and lightweight for quick load times, as speed is part of good UX.
DevOps-facing Outcome: DevOps practitioners will find the tool intuitive and efficient. They can accomplish routine tasks (triggering a deployment, approving a change, checking logs) without referring to documentation, because the UI naturally leads them through workflows. The system provides feedback that is specific and actionable – for example, error messages clearly state what failed (e.g. “Deployment to QA failed policy check X”) and suggest next steps (with a link to the policy or waiver option), rather than generic errors[6]. Users notice that everything feels familiar: the terminology matches their conventions, and even the CLI commands and outputs align with tools they know. Friction is minimized: they aren’t wasting time on redundant confirmations or searching for information across different screens. Overall, this leads to improved flow state and productivity – the tool “gets out of the way” and lets DevOps focus on delivering software[3].
3. Core Views DevOps Actually Need
Pipeline/Run-Centric View
Development Direction: Provide a pipeline-run dashboard that gives a real-time and historical view of CI/CD pipeline executions. DevOps users need to see each pipeline run’s status, stages, and logs at a glance, with the ability to drill down into any step. Key requirements include visual indicators of progress (running, passed, failed), links to related entities (commit, artifacts produced, deployment targets), and controls to re-run or rollback if needed. Essentially, we need to build what is often seen in tools like Jenkins Blue Ocean or GitLab Pipelines: a clear timeline or graph of pipeline stages with results. The view should support filtering (by branch, status, timeframe) and show recent pipeline outcomes to quickly spot failures[7].
Implementation Plan: Leverage the CI system’s data to populate this view. If using an existing CI (Jenkins/GitLab/GitHub Actions), integrate through their APIs to fetch pipeline run details (jobs, status, logs). Alternatively, if building a custom pipeline service (e.g. Tekton on Kubernetes), use its pipeline CRDs and results to construct the UI. Implement a real-time update mechanism (WebSocket or long-poll) so users can watch a running pipeline’s progress live (e.g. seeing stages turn green or red as they complete). The UI could be a linear timeline of stages or a node graph for parallel stages. Each stage node should be clickable to view logs and any artifacts from that stage. Include a sidebar or modal for logs with search and highlight (so DevOps can quickly diagnose failures). Provide controls to download logs or artifacts right from the UI. Integrate links: e.g. the commit hash in the pipeline header links to the SCM, the artifact name links to the artifact repository or artifact-centric view. If a pipeline fails a quality gate or test, highlight it and possibly prompt next actions (create a ticket or issue, or jump to evidence). Use CI webhooks or event listeners to update pipeline status in the platform database, and maintain a history of past runs. This can be backed by a database table (storing run id, pipeline id, status, duration, initiator, etc.) for querying and metrics.
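
A minimal sketch of the live-update piece, using Server-Sent Events from the Go standard library; the endpoint path and message format are assumptions, and a production version would also handle client disconnects and persist run history to the database described above:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// runEvents fans out stage/status updates for pipeline runs to any
// browser sessions currently watching them.
type runEvents struct {
	mu   sync.Mutex
	subs map[string][]chan string // run ID -> subscriber channels
}

func (e *runEvents) subscribe(runID string) chan string {
	e.mu.Lock()
	defer e.mu.Unlock()
	ch := make(chan string, 16)
	e.subs[runID] = append(e.subs[runID], ch)
	return ch
}

// publish is called by the CI webhook receiver when a stage changes state.
func (e *runEvents) publish(runID, msg string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	for _, ch := range e.subs[runID] {
		select {
		case ch <- msg:
		default: // drop updates for clients that fall behind
		}
	}
}

// watchHandler streams updates for one run as Server-Sent Events, so the
// UI can repaint stages as they pass or fail without polling.
func watchHandler(events *runEvents) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		runID := r.URL.Query().Get("run")
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		for msg := range events.subscribe(runID) {
			fmt.Fprintf(w, "data: %s\n\n", msg)
			flusher.Flush()
		}
	}
}

func main() {
	events := &runEvents{subs: map[string][]chan string{}}
	http.Handle("/api/runs/watch", watchHandler(events))
	http.ListenAndServe(":8080", nil)
}
```
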
DevOps-facing Outcome: The pipeline-centric view becomes the mission control for builds and releases. A DevOps engineer looking at this dashboard can immediately answer: “What’s the state of our pipelines right now?” They’ll see perhaps a list or grid of recent runs, with status color-codes (e.g. green check for success, red X for failure, yellow for running). They can click a failed pipeline and instantly see which stage failed and the error message, without wading through raw logs. For a running deployment, they might see a live streaming log of tests and a progress bar of stages. This greatly speeds up troubleshooting and situational awareness[7]. Moreover, from this view they can trigger actions – e.g. re-run a failed job or approve a manual gate – making it a one-stop interface for pipeline operations. Overall, this view ensures that pipeline status and history are highly visible (no more digging through Jenkins job lists or disparate tools), which supports faster feedback and collaboration (e.g. a team board showing these pipeline dashboards to all team members[7]).
Artifact-Centric View
Development Direction: Create an artifact-centric view that tracks the build outputs (artifacts) through their lifecycle. DevOps teams often manage artifacts like container images, binaries, or packages that are built once and then promoted across environments. This view should list artifact versions along with metadata: what build produced it, which tests it passed, security scan results, and where it’s currently deployed. The guiding principle is “promote artifacts, not code” – once an artifact is proven in one environment, it should be the same artifact moving forward[8]. Therefore, the system must support viewing an artifact (say version 1.2.3 of a service) and seeing its chain of custody: built by Pipeline #123 from Commit ABC, signed and stored in registry, deployed to Staging, awaiting promotion to Prod. It should also highlight if an artifact is approved (all checks passed) or if it carries any waivers/exceptions.
Implementation Plan: Integrate with artifact repositories and registries. For example, if using Docker images, connect to a container registry (AWS ECR, Docker Hub, etc.) via API or CLI to list image tags and digests. For JARs or packages, integrate with a binary repository (Artifactory, Nexus, etc.). Store metadata in a database linking artifact IDs (e.g. digest or version) to pipeline run and test results. The implementation could include a dedicated microservice to handle artifact metadata: when a pipeline produces a new artifact, record its details (checksum, storage URL, SBOM, test summary, security scan outcome). Implement the artifact view UI to display a table or list of artifact versions, each expandable to show details like: build timestamp, commit ID, link to pipeline run, list of environments where it’s deployed, and compliance status (e.g. “Signed ✅, Security scan ✅, Tests ✅”). Provide actions like promoting an artifact to an environment (which could trigger a deployment pipeline or Argo CD sync behind the scenes). Include promotion workflows with approvals – e.g. a button to “Promote to Production” that will enforce an approval if required by policy[8]. Ensure the artifact view can filter or search by component/service name and version. Behind the scenes, implement retention policies for artifacts (possibly configurable) and mark artifacts that are no longer deployed so they can be archived or cleaned up[8]. Use signing tools (like Cosign for container images) and display signature verification status in the UI to ensure integrity[8]. This likely means storing signature info and verification results in our metadata DB and updating on artifact fetch.
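
One way the promotion rule could look in the metadata service; this is a sketch under the assumption that signature, scan, and test status are already recorded on the artifact record, with names chosen for illustration rather than taken from the actual service:

```go
package artifacts

import (
	"errors"
	"time"
)

// Record captures what the pipeline knew about an artifact at build time.
type Record struct {
	Digest      string
	Version     string
	CommitSHA   string
	PipelineRun string
	SBOMPath    string
	Signed      bool
	ScanPassed  bool
	TestsPassed bool
	Deployments map[string]time.Time // environment -> when it was promoted there
}

// Promote moves a vetted artifact to the next environment. Approval is the
// caller's responsibility (e.g. checked against the policy engine) and is
// modeled here as a plain boolean for illustration.
func Promote(rec *Record, env string, approved bool) error {
	if !rec.Signed || !rec.ScanPassed || !rec.TestsPassed {
		return errors.New("artifact has not passed all required checks")
	}
	if env == "production" && !approved {
		return errors.New("promotion to production requires approval")
	}
	if rec.Deployments == nil {
		rec.Deployments = map[string]time.Time{}
	}
	rec.Deployments[env] = time.Now()
	// A real implementation would trigger the deployment pipeline or an
	// Argo CD sync here, then update the environment view.
	return nil
}
```
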
DevOps-facing Outcome: Users gain a single source of truth for artifacts. Instead of manually cross-referencing CI runs and Docker registries, they can go to “Artifact X version Y” page and get a comprehensive picture: “Built 2 days ago from commit abc123 by pipeline #56[8]. Passed all tests and security checks. Currently in UAT and Prod.” They will see if the artifact was signed and by whom, and they can trust that what went through QA is exactly what’s in production (no surprise re-builds). If an artifact has a known vulnerability, they can quickly find everywhere it’s running. Conversely, if a deployment is failing, they can confirm the artifact’s provenance (maybe the issue is that it wasn’t the artifact they expected). This view also streamlines promotions: a DevOps engineer can promote a vetted artifact to the next environment with one click, knowing the platform will handle the deployment and update the status. Overall, the artifact-centric view reduces release errors by emphasizing immutability and traceability of builds, and it gives teams confidence that only approved artifacts progress through environments[8].
Environment/Cluster-Centric View
Development Direction: Provide an environment or cluster-centric dashboard focusing on the state of each deployment environment (Dev, QA, Prod, or specific Kubernetes clusters). DevOps need to see what is running where and the health/status of those environments. This view should show each environment’s active versions of services, configuration, last deployment time, and any pending changes or issues. Essentially, when selecting an environment (or a cluster), the user should see all relevant information: which artifacts/versions are deployed, whether there are any out-of-policy conditions, recent deployment history for that environment, and live metrics or alerts for it. It’s about answering “Is everything OK in environment X right now? What’s deployed there?” at a glance. The environment view should also integrate any Infrastructure-as-Code context – e.g. show if the environment’s infrastructure (Terraform, Kubernetes resources) is in sync or drifted from the desired state.
Implementation Plan: Represent environments as entities in the system with attributes and links to resources. For a Kubernetes cluster environment, integrate with the K8s API or Argo CD to fetch the list of deployed applications and their versions. For VM or cloud environments, integrate with deployment scripts or Terraform state: e.g. tag deployments with an environment ID so the system knows what’s deployed. Implement an environment overview page showing a grid or list of services in that environment and their current version (pull this from a deployment registry or continuous delivery tool). Include environment-specific status checks: e.g. call Kubernetes for pod statuses or use health check endpoints of services. If using Terraform or another IaC, query its state or run a drift detection (using Terraform plan or Terraform Cloud APIs) to identify differences between desired and actual infrastructure; highlight those if any. Additionally, integrate recent deployment logs: e.g. “Deployed version 1.2.3 of ServiceA 2 hours ago by pipeline #45 (passed ✅)” so that context is visible[7]. Enable quick access to logs or monitoring: e.g. links to Kibana for logs or Prometheus/Grafana for metrics specific to that environment. For environment config, provide a way to manage environment-specific variables or secrets (possibly by integrating with a vault or config management). This view might also expose controls like pausing deployments (maintenance mode) or manually triggering a rollback in that environment. If the organization uses approval gates on environments, show whether the environment is open for deployment or awaiting approvals. Use role-based access control to ensure users only see and act on environments they’re allowed to. In terms of tech, you might integrate with Kubernetes via the Kubernetes API (client libraries) for cluster state, and with cloud providers (AWS, etc.) for resource statuses. If multiple clusters, aggregate them or allow selecting each.
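
For the Kubernetes case, the raw data behind the environment page can be pulled with client-go; a minimal sketch, assuming one namespace per environment and that the first container's image tag stands in for the service version (kubeconfig path and namespace names are placeholders):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// listEnvironment prints what is running in one namespace: service name,
// image (version), and readiness, i.e. the raw data behind the environment view.
func listEnvironment(kubeconfig, namespace string) error {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return err
	}
	deps, err := client.AppsV1().Deployments(namespace).List(context.Background(), metav1.ListOptions{})
	if err != nil {
		return err
	}
	for _, d := range deps.Items {
		image := ""
		if len(d.Spec.Template.Spec.Containers) > 0 {
			image = d.Spec.Template.Spec.Containers[0].Image
		}
		fmt.Printf("%-30s %-50s ready %d/%d\n",
			d.Name, image, d.Status.ReadyReplicas, d.Status.Replicas)
	}
	return nil
}

func main() {
	if err := listEnvironment("/path/to/kubeconfig", "production"); err != nil {
		fmt.Println("error:", err)
	}
}
```
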
DevOps-facing Outcome: When a DevOps engineer opens the environment view (say for “Production”), they get a comprehensive snapshot of Prod. For example, they see that Service A version 2.3 is running (with a green check indicating all health checks pass), Service B version 1.8 is running but has a warning (perhaps a policy violation or a pod restarting). They can see that the last deployment was yesterday, and maybe an approval is pending for a new version (clearly indicated). They also notice any environment-level alerts (e.g. “Disk space low” or “Compliance drift detected: one config changed outside of pipeline”). This reduces the need to jump between different monitoring and deployment tools – key information is aggregated. They can directly access logs or metrics if something looks off. For example, if an incident occurs in production, the on-call can open this view to quickly find what changed recently and on which nodes. The environment-centric view thus bridges operations and release info: it’s not just what versions are deployed, but also their run-state and any issues. As a result, DevOps teams can more effectively manage environments, verify deployments, and ensure consistency. This high-level visibility aligns with best practices where environments are monitored and audited continuously[9] – the UI will show deployment history and status in one place, simplifying compliance and troubleshooting.
4. Evidence & Proof Presentation
Development Direction: The platform must automatically collect and present evidence of compliance and quality for each release, making audits and reviews straightforward. This means every pipeline and deployment should leave an “evidence trail” – test results, security scan reports, configuration snapshots, audit logs – that is organized and accessible. DevOps users (and auditors or security teams) need a dedicated view or report that proves all required checks were done (for example, that an artifact has an SBOM, passed vulnerability scanning, was signed, and met policy criteria). Essentially, treat evidence as a first-class artifact of the process, not an afterthought[1]. The UX should include dashboards or evidence pages where one can inspect and download these proofs, whether for an individual release or an environment’s compliance status.
Implementation Plan: Automate evidence generation and storage in the CI/CD pipeline. Incorporate steps in pipelines to generate artifacts like test reports (e.g. JUnit XML, coverage reports), security scan outputs (SAST/DAST results, SBOMs), and policy compliance logs. Use a secure storage (artifact repository or object storage bucket) for these evidence artifacts. For example, after a pipeline run, store the JUnit report and link it to that run record. Implement an “Evidence” section in the UI for each pipeline run or release: this could list the artifacts with download links or visual summaries (like a list of passed tests vs failed tests, vulnerability counts, etc.). Leverage “audit as code” practices – encode compliance checks as code so their output can be captured as evidence[10]. For instance, if using Policy as Code (OPA, HashiCorp Sentinel, etc.), have the pipeline produce a policy evaluation report and save it. Use version-controlled snapshots: for a deployment, take a snapshot of environment configuration (container image digests, config values) and store that as a JSON/YAML file as evidence of “what was deployed”. Utilize tagging and retention: mark these evidence files with the build or release ID and keep them immutably (perhaps using an object store with write-once settings[1]). Integrate a compliance dashboard that aggregates evidence status – e.g. “100% of builds have test reports, 95% have no critical vulns” etc., for a quick view of compliance posture[10]. We may implement a database of compliance statuses (each control check per pipeline run) to quickly query and display summaries. Also, provide an export or report generation feature: allow users to download an “attestation bundle” (ZIP of SBOMs, test results, etc.) for a release to provide to auditors[1]. Security-wise, ensure this evidence store is append-only to prevent tampering (using object locks or checksums). In terms of tech, tools like SLSA attestations can be integrated to sign and verify evidence (for supply chain security). The UI can show verification status of attestation signatures to prove integrity.
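
A sketch of the “attestation bundle” export mentioned above, assuming the evidence files for a release have already been collected: it zips them and adds a SHA-256 manifest so the bundle can be checked for tampering after handover to an auditor. Names and layout are illustrative, not the actual export format.

```go
package evidence

import (
	"archive/zip"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// ExportBundle zips the evidence files for one release (test reports, SBOMs,
// scan results, policy logs) and writes a MANIFEST.sha256 of file digests so
// the bundle contents can be verified later.
func ExportBundle(outPath string, files []string) error {
	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()

	zw := zip.NewWriter(out)
	defer zw.Close()

	manifest := ""
	for _, path := range files {
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		w, err := zw.Create(path)
		if err != nil {
			f.Close()
			return err
		}
		h := sha256.New()
		if _, err := io.Copy(io.MultiWriter(w, h), f); err != nil {
			f.Close()
			return err
		}
		f.Close()
		manifest += fmt.Sprintf("%s  %s\n", hex.EncodeToString(h.Sum(nil)), path)
	}
	mw, err := zw.Create("MANIFEST.sha256")
	if err != nil {
		return err
	}
	_, err = mw.Write([]byte(manifest))
	return err
}
```
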
DevOps-facing Outcome: DevOps teams and compliance officers will see a clear, accessible trail of proof for each deployment. For example, when viewing a particular release, they might see: Tests: 120/120 passed (link to detailed results), Security: 0 critical vulns (link to scanner report), Config Audit: 1 minor deviation (waiver granted, link to waiver details). They can click any of those to dive deeper – e.g. open the actual security scan report artifact or view the SBOM file. Instead of scrambling to gather evidence from multiple tools at audit time, the platform surfaces it continuously[10][1]. An auditor or DevOps lead could open a compliance dashboard and see in real-time that all production releases have the required documentation and checks attached, and even download a bundle for an audit. This builds trust with stakeholders: when someone asks “How do we know this release is secure and compliant?”, the answer is a few clicks away in the evidence tab, not a week-long hunt. It also helps engineers themselves – if a question arises about “Did we run performance tests before this release?”, the evidence view will show if that artifact is present. By making evidence visible and automatic, it encourages teams to incorporate compliance into daily work (no more hidden spreadsheets or missing screenshots), ultimately making audits “boringly” smooth[1].
5. Exception & Waiver UX
Example of an exemption request form (Harness.io) where a user selects scope (pipeline, target, project), duration, and reason for a waiver. Our implementation will provide a similar interface to manage policy exceptions.
Development Direction: Implement a controlled workflow for exceptions/waivers that allows DevOps to override certain failures (policy violations, test failures) only with proper approval and tracking. In real-world pipelines, there are cases where a security vulnerability or policy may be temporarily excepted (waived) to unblock a deployment – but this must be done transparently and with accountability. The UX should make it easy to request an exception when needed (with justification) and to see the status of that request, but also make the presence of any waivers very visible to everyone (so they’re not forgotten). Key requirements: ability to request a waiver with specific scope (e.g. just for this pipeline run or environment, vs broader), mandatory reason and expiration for each waiver, an approval step by authorized roles, and an “exception register” in the UI that lists all active waivers and their expiry[11]. Essentially, treat waivers as temporary, auditable objects in the system.
Implementation Plan: Build a feature where pipeline policy checks or scan results that would fail the pipeline can be turned into an exception request. For example, if a pipeline finds a critical vulnerability, provide a “Request Waiver” button next to the failure message in the UI. This triggers a form (like the image example) to capture details: scope of waiver (this specific deployment, this application, or whole project)[12], duration (e.g. 14 days or until a certain date), and a required reason category and description (like “Acceptable risk – low impact, fix in next release” or “False positive”[13]). Once submitted, store the request in a database with status “Pending” and notify the appropriate approvers (could integrate with email/Slack or just within the app). Implement an approval interface where a security lead or product owner can review the request and either approve (possibly adjusting scope or duration)[14] or reject it. Use role-based permissions to ensure only certain roles (e.g. Security Officer) can approve. If approved, the pipeline or policy engine should automatically apply that exception: e.g. mark that particular check as waived for the specified scope. This could be implemented by updating a policy store (for instance, adding an entry that “vuln XYZ is waived for app A in staging until date D”). The pipeline then reads these waivers on the next run so it doesn’t fail for a known, waived issue. Ensure the waiver is time-bound: perhaps schedule a job to auto-expire it (or the pipeline will treat it as fail after expiration). In the UI, implement an “Active Waivers” dashboard[11] listing all current exceptions, with details: what was waived, why, who approved, and countdown to expiration. Possibly show this on the environment and artifact views too (e.g. a banner “Running with 1 waiver: CVE-1234 in ServiceA (expires in 5 days)”). Also log all waiver actions in the audit trail. Technically, this could integrate with a policy engine like OPA – e.g. OPA could have a data map of exceptions which the policies check. Or simpler, our app’s database serves as the source of truth and our pipeline code consults it. Finally, enforce in code that any exception must have an owner and expiry set (no indefinite waivers) – e.g. do not allow submission without an expiry date, and prevent using expired waivers (pipeline should fail if an expired waiver is encountered). This follows the best practice of “time-boxed exceptions with owners”[11].
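
A sketch of how the time-boxed waiver could be modeled and checked; the field names and scope-string format are assumptions, but the two invariants from the workflow above (a mandatory reason and owner, and a future expiry) are enforced in `Validate`, and `Covers` is what the pipeline would consult before failing on a known, accepted issue:

```go
package waivers

import (
	"errors"
	"time"
)

type Status string

const (
	Pending  Status = "pending"
	Approved Status = "approved"
	Rejected Status = "rejected"
)

// Waiver is a time-boxed exception for one finding in one scope.
type Waiver struct {
	ID          string
	FindingID   string // e.g. "CVE-2024-12345"
	Scope       string // e.g. "app:serviceA/env:staging" or "project:payments"
	Reason      string
	RequestedBy string
	ApprovedBy  string
	Status      Status
	ExpiresAt   time.Time
}

// Validate enforces the invariants described above:
// every waiver needs an owner, a reason, and a future expiry.
func (w Waiver) Validate() error {
	if w.Reason == "" || w.RequestedBy == "" {
		return errors.New("waiver must have a reason and an owner")
	}
	if w.ExpiresAt.IsZero() || !w.ExpiresAt.After(time.Now()) {
		return errors.New("waiver must have a future expiry date")
	}
	return nil
}

// Covers reports whether a finding in a given scope is currently waived.
// Expired waivers no longer match, so the failure resurfaces automatically.
func Covers(active []Waiver, findingID, scope string, now time.Time) bool {
	for _, w := range active {
		if w.Status == Approved && w.FindingID == findingID &&
			w.Scope == scope && now.Before(w.ExpiresAt) {
			return true
		}
	}
	return false
}
```
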
DevOps-facing Outcome: Instead of ad-hoc Slack approvals or lingering risk acceptances, DevOps users get a transparent, self-service mechanism to handle necessary exceptions. For example, if a pipeline is blocking a deployment due to a vulnerability that is a false positive, the engineer can click “Request Waiver”, fill in the justification (selecting “False positive” and adding notes) and submit. They will see the request in a pending state and, if authorized, an approver will get notified. Once approved, the pipeline might automatically continue or allow a rerun to succeed. In the UI, a clear label might mark that deployment as “Waiver applied” so it’s never hidden[15]. The team and auditors can always consult the Waivers dashboard to see, for instance, that “CVE-1234 in ServiceA was waived for 7 days by Jane Doe on Oct 10, reason: Acceptable risk[15].” As waivers near expiration, perhaps the system alerts the team to fix the underlying issue. This prevents “forever exceptions” – it’s obvious if something is continuously waived. By integrating this UX, we maintain velocity without sacrificing governance: teams aren’t stuck when a known low-risk issue pops up, but any deviation from standards is documented and tracked. Over time, the exception log can even drive improvement (e.g. seeing which policies frequently get waived might indicate they need adjustment). In summary, DevOps engineers experience a workflow where getting an exception is streamlined yet responsible, and they always know which releases are carrying exceptions (no surprises to be caught in audits or incidents)[11].
6. CLI and Automation UX
Development Direction: Offer a powerful CLI tool that mirrors the capabilities of the UI, enabling automation and scripting of all DevOps workflows. DevOps engineers often prefer or need command-line access for integration into CI scripts, Infrastructure as Code pipelines, or simply for speed. The CLI experience should be considered part of the product’s UX – it must be intuitive, consistent with the UI concepts, and provide useful output (including machine-readable formats). Essentially, anything you can do in the web console (view pipeline status, approve a waiver, deploy an artifact, fetch evidence) should be doable via the CLI or API. This empowers advanced users and facilitates integration with other automation (shell scripts, CI jobs, Git hooks, etc.). A good CLI follows standard conventions and provides help, clear errors, and supports environment configuration for non-interactive use.
Implementation Plan: Develop the CLI as a first-class client to the platform’s REST/GraphQL API. Likely implement it in a language suited for cross-platform command-line tools (Go is a common choice for CLIs due to easy binary distribution, or Python for rapid development with an installer). Use an existing CLI framework (for Go, something like Cobra or Click for Python) to structure commands and flags. Ensure the commands map closely to the domain: e.g. stella pipeline list, stella pipeline logs <id>, stella artifact promote <artifact> --env prod, stella evidence download --release <id>, stella waiver request ... etc. Follow common UNIX CLI design principles: support --help for every command, use short (-f) and long (--force) flags appropriately, and return proper exit codes (so scripts can detect success/failure). Include output format switches, e.g. --output json for commands to get machine-parseable output (allowing integration with other tools). Integrate authentication in a user-friendly way: perhaps stella auth login to do an OAuth device code flow or accept a token, and store it (maybe in ~/.stella/config). The CLI should respect environment variables for non-interactive use (e.g. STELLA_API_TOKEN, STELLA_TENANT) for easy CI integration[16]. Provide auto-completion scripts for common shells to improve usability. Tie the CLI version to the server API version, and provide a clear upgrade path (maybe stella upgrade to get the latest version). As part of development, create comprehensive docs and examples for the CLI, and possibly a testing harness to ensure it works on all platforms. Consider also that the CLI might be used in pipelines: ensure it’s efficient (no unnecessary output when not needed, perhaps a quiet mode). For implementing heavy tasks (like streaming logs), use web socket or long polling under the hood to show live logs in the terminal, similar to how kubectl logs -f works. If the CLI will handle potentially sensitive operations (like approvals or secret management), ensure it can prompt for confirmation or use flags to force through in scripts. Also, align CLI error messages and terminology with the UI for consistency.
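
A minimal Cobra-based sketch of what one such command could look like; the command names mirror the examples above (`stella pipeline status`), the API call is stubbed out rather than real, and the `--output json` switch plus a non-zero exit code on error keep it script-friendly:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

func main() {
	var output string

	statusCmd := &cobra.Command{
		Use:   "status <run-id>",
		Short: "Show the status of a pipeline run",
		Args:  cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			// A real client would call the platform API using the token from
			// STELLA_API_TOKEN; here the response is faked for illustration.
			run := map[string]string{"id": args[0], "status": "failed", "failedStage": "integration-tests"}
			if output == "json" {
				return json.NewEncoder(os.Stdout).Encode(run)
			}
			fmt.Printf("Run %s: %s (stage: %s)\n", run["id"], run["status"], run["failedStage"])
			return nil
		},
	}
	statusCmd.Flags().StringVarP(&output, "output", "o", "text", "output format: text or json")

	pipelineCmd := &cobra.Command{Use: "pipeline", Short: "Work with pipeline runs"}
	pipelineCmd.AddCommand(statusCmd)

	rootCmd := &cobra.Command{Use: "stella", Short: "Stella DevOps platform CLI"}
	rootCmd.AddCommand(pipelineCmd)

	if err := rootCmd.Execute(); err != nil {
		os.Exit(1)
	}
}
```

In a CI step this would be invoked as, say, `stella pipeline status 123 --output json`, with the exit code deciding whether the job proceeds.
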
DevOps-facing Outcome: For DevOps engineers, the CLI becomes a productivity booster and a Swiss army knife in automation. They can script repetitive tasks: for instance, a release engineer might run a script that uses stella artifact list --env staging to verify what's in staging, then stella artifact promote to push to production followed by stella pipeline monitor --wait to watch the rollout complete. All of this can be done without leaving their terminal or clicking in a browser. The CLI output is designed to be readable but also parseable: e.g. stella pipeline status 123 might output a concise summary in human-readable form, or with --json give a JSON that a script can parse to decide next steps. In on-call situations, an engineer could quickly fetch evidence or status: e.g. stella evidence summary --release 2025.10.05 to see if all checks passed for a particular release, right from the terminal. This complements the UI by enabling automation integration – the CLI can be used in CI pipelines (maybe even in other systems, e.g. a Jenkins job could call stella ... to trigger something in Stella). Because the CLI uses the same language as the UI, users don’t have to learn a completely different syntax or mental model. And by providing robust help and logical command names, even newcomers find it accessible (for example, typing stella --help lists subcommands in a clear way, similar to kubectl or git CLIs they know). Overall, the DevOps-facing outcome is that the tool meets engineers where they are – whether they love GUIs or CLIs – and supports automation at scale, which is a core DevOps principle.
7. Alerting & Incident Integration
Development Direction: The platform should seamlessly integrate with alerting and incident management workflows so that issues in pipelines or environments automatically notify the right people, and ongoing incidents are visible in the deployment context. DevOps teams rely on fast feedback for failures or abnormal conditions – whether a pipeline fails, a deployment causes a service outage, or a security scan finds a critical issue, the system needs to push alerts to the channels where engineers are already looking (chat, email, incident tools). Additionally, when viewing the DevOps dashboards, users should see indicators of active incidents or alerts related to recent changes. This tight integration helps bridge the gap between CI/CD and operations: deployments and incidents should not be separate silos. The UX should support configuring alert rules and connecting to tools like PagerDuty, Opsgenie, Slack/MS Teams, or even Jira for incident tickets, with minimal setup.
Implementation Plan: Introduce an alerting configuration module where certain events trigger notifications. Key events to consider: pipeline failures, pipeline successes (optional), deployment to production, policy violations, security vulnerabilities found, and performance regressions in metrics. Allow users to configure where these go – e.g. a Slack webhook, an email list, or an incident management system’s API. For pipeline failures or critical security findings, integration with PagerDuty/On-call rotation can create an incident automatically. Use webhooks and APIs: for Slack or Teams, send a formatted message (e.g. “:red_circle: Deployment Failed – Pipeline #123 failed at step 'Integration Tests'. Click here to view details.” with a link to the UI). For PagerDuty, use their Events API to trigger an incident with details including the pipeline or service impacted. On the incoming side, integrate with monitoring tools to reflect incidents: e.g. use status from an incident management system or monitoring alerts to display in the platform. If the organization uses something like ServiceNow or Jira for incidents, consider a plugin or link: for instance, tag deployments with change IDs and then auto-update those tickets if a deployment triggers an alert. In the environment view, include a widget that shows current alerts for that environment (by pulling from Prometheus Alertmanager or cloud monitoring alerts relevant to that cluster). Implement ChatOps commands as well: possibly allow acknowledging or redeploying via Slack bot commands. This can be achieved by having a small service listening to chat commands (Slack slash commands or similar) that call the same internal APIs (for example, a “/deploy rollback serviceA” command in Slack triggers the rollback pipeline). For UI implementation, ensure that when an alert is active, it’s clearly indicated: e.g. a red badge on the environment or pipeline view, and maybe a top-level “Incidents” section that lists all unresolved incidents (with links to their external system if applicable). Use the information radiators approach – maybe a large screen mode or summary panel showing system health and any ongoing incidents[7]. Technically, setting up these integrations means building outbound webhook capabilities and possibly small integration plugins for each target (Slack, PagerDuty, etc.). Also include the ability to throttle or filter alerts (to avoid spamming on every minor issue). Logging and auditing: record what alerts were sent and when (so one can later review incident timelines).
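
A sketch of the outbound Slack notification, assuming a standard incoming-webhook URL is configured per channel; PagerDuty or Teams integrations would follow the same shape with their own payload formats:

```go
package alerts

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// NotifyPipelineFailure posts a short, linkable message to a Slack
// incoming webhook when a pipeline fails.
func NotifyPipelineFailure(webhookURL, pipeline, stage, runURL string) error {
	payload := map[string]string{
		"text": fmt.Sprintf(":red_circle: Deployment failed: pipeline %q failed at stage %q.\n<%s|Open in Stella>",
			pipeline, stage, runURL),
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return err
	}
	resp, err := http.Post(webhookURL, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("slack webhook returned %s", resp.Status)
	}
	return nil
}
```
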
DevOps-facing Outcome: DevOps engineers will be immediately aware of problems without having to constantly watch the dashboards. For example, if a nightly build fails or a critical vulnerability is found in a new build, the on-call engineer might get a PagerDuty alert or a Slack message in the team channel within seconds. The message will contain enough context (pipeline name, failure reason snippet, a link to view details) so they can quickly respond. During a live incident, when they open the Stella environment view, they might see an incident banner or an “Active Alerts” list indicating which services are affected, aligning with what their monitoring is showing. This context speeds up remediation: if a production incident is ongoing, the team can see which recent deployment might have caused it (since the platform correlates deployment events with incident alerts). Conversely, when doing a deployment, if an alert fires (e.g. error rate spiked), the system could even pause further stages and notify the team. By integrating ChatOps, some users might even resolve things without leaving their chat: e.g. the Slack bot reports “Deployment failed” and the engineer types a command to rollback right in Slack, which the platform executes[17]. Overall, the outcome is a highly responsive DevOps process: issues are caught and communicated in real-time, and the platform becomes part of the incident handling loop, not isolated. Management can also see in retrospective reports that alerts were linked to changes (useful for blameless postmortems, since you can trace alert -> deployment). The tight coupling of alerting with the DevOps UX ensures nothing falls through the cracks, and teams can react swiftly, embodying the DevOps ideal of continuous feedback[7].
8. Metrics That Matter
Development Direction: Define and display the key metrics that truly measure DevOps success and software delivery performance, rather than vanity metrics. This likely includes industry-standard DORA metrics (Deployment Frequency, Lead Time for Changes, Change Failure Rate, Time to Restore) to gauge velocity and stability[18], as well as any domain-specific metrics (like compliance metrics or efficiency metrics relevant to the team). The UX should provide a metrics dashboard that is easy to interpret – with trends over time, targets or benchmarks, and the ability to drill down into what’s influencing those metrics. By focusing on “metrics that matter,” the platform steers teams toward continuous improvement on important outcomes (like faster deployments with high reliability) and avoids information overload. Each metric should be backed by data collected from the pipelines, incidents, and other parts of the system.
Implementation Plan: Instrument the CI/CD pipeline and operations data to collect these metrics automatically. For example, every successful deployment should log an event with a timestamp and environment, which can feed Deployment Frequency calculations (e.g. how many deploys to prod per day/week)[19]. Track lead time by measuring time from code commit (or merge) to deployment completion – this might involve integrating with the version control system to get commit timestamps and comparing to deployment events[20]. Change Failure Rate can be inferred by flagging deployments that resulted in a failure or rollback – integrate with incident tracking or post-deployment health checks to mark a deployment as “failed” if it had to be reverted or caused an alert. Time to Restore is measured from incident start to resolution – integrate with incident management timestamps or pipeline rollback completion times. Additionally, incorporate compliance/quality metrics highlighted earlier: e.g. “% of builds with all tests passing”, “average time to remediate critical vulnerabilities” – many of these can be derived from the evidence and waiver data we track[21]. Use a time-series database (Prometheus, InfluxDB) or even just a relational DB with time-series tables to store metric data points. Implement a Metrics Dashboard UI with charts for each key metric, ideally with the ability to view by different scopes (maybe per service or team or environment). For instance, a line chart for Deployment Frequency (deploys per week) with annotations when big changes happened, or a bar chart for Change Failure Rate per month. Provide comparison to industry benchmarks if available (e.g. highlighting if the team is elite per DORA benchmarks). Also, crucially, implement drill-down links: if a metric spike or drop is observed, the user should be able to click it and see underlying data – e.g. clicking a high Change Failure Rate in April shows which deployments failed in April and links to those pipeline runs[22]. Use color-coding to flag concerning trends (like increasing failure rate). Allow export of metrics for reporting purposes. Possibly integrate with existing analytics (if using Datadog or other BI, allow data export or API access to metrics). Ensure that metrics are updated in near real-time (maybe after each pipeline run or incident closure, recalc relevant metrics) so the dashboard is always current. We should also secure the metrics view (maybe management only for some, but ideally DevOps leads have it openly to promote transparency). In development, validate that these metrics indeed correlate with what teams care about (work with users to refine).
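
A sketch of how three of the DORA metrics could be derived from recorded deployment events; the `Deployment` record is an assumed shape (commit time, deploy time, failure flag), and a real implementation would read these rows from the events table described above:

```go
package metrics

import (
	"sort"
	"time"
)

// Deployment pairs a production deploy with the commit it shipped.
type Deployment struct {
	CommittedAt time.Time
	DeployedAt  time.Time
	Failed      bool // rolled back or linked to an incident
}

// DeploymentFrequency returns deploys per week over the window the events span.
func DeploymentFrequency(deploys []Deployment) float64 {
	if len(deploys) < 2 {
		return float64(len(deploys))
	}
	sort.Slice(deploys, func(i, j int) bool { return deploys[i].DeployedAt.Before(deploys[j].DeployedAt) })
	weeks := deploys[len(deploys)-1].DeployedAt.Sub(deploys[0].DeployedAt).Hours() / (24 * 7)
	if weeks <= 0 {
		return float64(len(deploys))
	}
	return float64(len(deploys)) / weeks
}

// MedianLeadTime is the median commit-to-deploy duration.
func MedianLeadTime(deploys []Deployment) time.Duration {
	if len(deploys) == 0 {
		return 0
	}
	leads := make([]time.Duration, 0, len(deploys))
	for _, d := range deploys {
		leads = append(leads, d.DeployedAt.Sub(d.CommittedAt))
	}
	sort.Slice(leads, func(i, j int) bool { return leads[i] < leads[j] })
	return leads[len(leads)/2]
}

// ChangeFailureRate is the fraction of deployments flagged as failed.
func ChangeFailureRate(deploys []Deployment) float64 {
	if len(deploys) == 0 {
		return 0
	}
	failed := 0
	for _, d := range deploys {
		if d.Failed {
			failed++
		}
	}
	return float64(failed) / float64(len(deploys))
}
```
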
DevOps-facing Outcome: The team gets a focused insight into how they are performing and where to improve. On the metrics dashboard, they might see for example: Deployment Frequency – 20 deploys/week (trending upward), Lead Time – 1 day median, Change Failure Rate – 5%, Time to Restore – 1 hour median. These will be shown perhaps as simple cards or charts. They can quickly glean, say, “We’re deploying more often, but our change failure rate spiked last month,” prompting investigation. By clicking that spike, they see a list of incidents or failed deployments that contributed, allowing them to identify common causes and address them[22]. The dashboard might also show compliance metrics if relevant: e.g. “100% of builds had SBOMs attached this quarter” (the team could celebrate this boring but important win)[23], or “Median time to patch critical vulns: 2 days” – these could be in a separate section for security/compliance. Importantly, all metrics shown are ones that drive behavior the organization cares about – no pointless graphs that don’t lead to action. This ensures that when leadership asks “How are we doing in DevOps?”, the answer is readily available with evidence[18]. It also gamifies improvement: teams can see the needle move when they streamline a pipeline or improve testing. For example, after investing in parallel tests, Lead Time drops – the dashboard confirms such improvements. Furthermore, the presence of drill-down and context means metrics are trusted by engineers: if someone questions a number, they can click in and see the raw data behind it (making it hard to ignore or dispute the findings)[22]. Overall, this focus on meaningful metrics helps align everyone (Dev, Ops, and management) on common goals and provides continuous feedback at a high level on the effectiveness of DevOps practices. It’s not just data for managers – it’s a working tool for teams to guide decisions (like where to invest automation efforts next). By keeping the metrics visible and up-to-date, we encourage a culture of data-driven improvement in the DevOps process, as opposed to anecdotal or vanity measures[21].
________________________________________

[1] [11] [21] [22] [23] Bake Ruthless Compliance Into CI/CD Without Slowing Releases - DevOps Oasis
https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/

[2] [3] [4] [5] [6] 7 UX principles everyone needs to understand to adopt better tools that improve developer experience (DevEx)
https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex

[7] [8] [10] [17] DevOps for Classified Environments
https://www.getambush.com/article/devops-for-classified-environments/

[9] Understanding Azure DevOps Pipelines: Environment and variables | BrowserStack
https://www.browserstack.com/guide/azure-devops-environment

[12] [13] [14] [15] Request Issue Exemption | Harness Developer Hub
https://developer.harness.io/docs/security-testing-orchestration/exemptions/exemption-workflows/

[16] git.stella-ops.org/11_AUTHORITY.md at 48702191bed7d66b8e29929a8fad4ecdb40b9490 - git.stella-ops.org
https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/48702191bed7d66b8e29929a8fad4ecdb40b9490/docs/11_AUTHORITY.md

[18] [19] [20] DevOps Research and Assessment (DORA) metrics | GitLab Docs
https://docs.gitlab.com/user/analytics/dora_metrics/

---

You might find this relevant — recent developments and research strengthen the case for **context‑aware, evidence‑backed vulnerability triage and evaluation metrics** when running container security at scale.

---

## 🔎 Why reachability‑with‑evidence matters now

* The latest update from Snyk Container (Nov 4, 2025) signals a shift: the tool will begin integrating **runtime insights as a “signal”** in their Container Registry Sync service, making it possible to link vulnerabilities to images actually deployed in production — not just theoretical ones. ([Snyk][1])
* The plan is to evolve from static‑scan noise (a long list of CVEs) to a **prioritized, actionable workflow** where developers and security teams see which issues truly matter based on real deployment context: what’s running, what’s reachable, and thus what’s realistically exploitable. ([Snyk][1])
* This aligns with the broader shift toward container runtime security: static scanning alone misses a lot — configuration drift, privilege escalation, unexpected container behavior and misconfigurations only visible at runtime. ([Snyk][2])

**Implication:** The future of container‑security triage will rely heavily on runtime/context signals — increasing confidence that flagged issues are genuinely relevant and deserve remediation urgency.

---

## ⚠️ Why heuristics & scanner evaluation matter — and how unreliable “gold standards” can be

* A recent study, *A Comparative Analysis of Docker Image Security*, analyzed **927 Docker images** with two popular scanners (Trivy and Grype). Among 865 images flagged as vulnerable, the two tools **disagreed both on the total number of vulnerabilities and on the specific CVE IDs** found per image. ([montana.edu][3])
* A more recent study, *Consistency evaluation of container vulnerability scanners* (2025), observed **low consistency and similarity** between tools’ results when applied to the same container datasets — even under the VEX/SBOM‑based paradigm. ([arXiv][4])
* The root causes include: divergent vulnerability databases, differing matching logic, and incomplete visibility (some scanners miss obscure containers or layers). ([montana.edu][3])

**Implication:** There is **no reliable “single source of truth”** today. Without golden fixtures and metrics like *proof coverage %*, *diff‑closure rate*, or *mean time to evidence* (i.e. how long until a vulnerability is confirmed exploitable at runtime), relying solely on scans is risky.

---

## 🧩 What this means for your security strategy (and for Stella Ops)

Given what you’re building around Stella Ops — with its emphasis on **deterministic, replayable scans, cryptographic integrity, and VEX/SBOM proofs** — this context reinforces why your “moats” are necessary and well‑timed:
* Prioritizing runtime evidence (reachability, runtime context) — like what Snyk is now moving toward — can help reduce noise and focus your remediation on what truly matters.
* Golden‑fixture benchmarks — container images with known, audited vulnerabilities and expected outcomes — are critical to evaluate scanner performance over time.
* Metrics such as closure rate (how fast an issue goes from flagged to confirmed exploitable), proof coverage (percentage of dependencies with valid SBOM/VEX proofs), and differential‑closure (how new database updates or policy changes affect prior scan results) should be part of any mature container‑security program.

---

If you like, I can dig up **3–5 recent academic or industry studies (2024–2025)** that benchmark scanners *with* runtime detection, to show where the frontier is heading.
[1]: https://snyk.io/blog/future-snyk-container/?utm_source=chatgpt.com "Beyond the Scan: The Future of Snyk Container"
[2]: https://snyk.io/articles/container-runtime-security/?utm_source=chatgpt.com "What is Container Runtime Security?"
[3]: https://www.montana.edu/cyber/products/Grype_Vs_Trivy_Boles_et_al.pdf?utm_source=chatgpt.com "A Comparative Analysis of Docker Image Security"
[4]: https://arxiv.org/html/2503.14388v1?utm_source=chatgpt.com "Consistency evaluation of container vulnerability scanners"

---

Comparative Analysis of Container Vulnerability Scanning and Prioritization Studies (2024–2025)
1. Consistency Evaluation of Container Vulnerability Scanners (2025)
Methodology: This study evaluates VEX-enabled container scanners by measuring their consistency across a common dataset[1]. The authors assembled 48 Docker images (with fixed hashes for reproducibility[2]) divided into subsets: 8 images with no known vulns, 8 with a high vuln count (as per Docker Hub data), and 32 random images[3][4]. Seven scanning tools supporting the Vulnerability Exploitability eXchange (VEX) format were tested: Trivy, Grype, OWASP DepScan, Docker Scout, Snyk CLI, OSV-Scanner, and “Vexy”[5]. For fairness, each tool was run in its default optimal mode – e.g. directly scanning the image when possible, or scanning a uniform SBOM (CycloneDX 1.4/SPDX 2.3) generated by Docker Scout for tools that cannot scan images directly[6]. The output of each tool is a VEX report listing vulnerabilities and their exploitability status. The study then compared tools’ outputs in terms of vulnerabilities found and their statuses. Crucially, instead of attempting to know the absolute ground truth, they assessed pairwise and multi-tool agreement. They computed the Jaccard similarity between each pair of tools’ vulnerability sets[7] and a generalized Tversky index for overlap among groups of tools[8]. Key metrics included the total number of vulns each tool reported per image subset and the overlap fraction of specific CVEs identified.
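
For reference, the two similarity measures are the standard set-overlap definitions (restated here from general usage, not quoted from the paper), where A and B are the sets of vulnerability IDs two tools report for the same image:

```latex
J(A,B) = \frac{|A \cap B|}{|A \cup B|}
\qquad
T_{\alpha,\beta}(A,B) = \frac{|A \cap B|}{|A \cap B| + \alpha\,|A \setminus B| + \beta\,|B \setminus A|}
```

Setting α = β = 1 in the Tversky index recovers Jaccard; the paper applies a generalized form of it to measure overlap across groups of more than two tools.
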
Findings and Algorithms: The results revealed large inconsistencies among scanners. For the full image set, one tool (DepScan) reported 18,680 vulnerabilities while another (Vexy) reported only 191 – a two orders of magnitude difference[9]. Even tools with similar totals did not necessarily find the same CVEs[10]. For example, Trivy vs Grype had relatively close counts (~12.3k vs ~12.8k on complete set) yet still differed in specific vulns found. No two tools produced identical vulnerability lists or statuses for an image[11]. Pairwise Jaccard indices were very low (often near 0), indicating minimal overlap in the sets of CVEs found by different scanners[11]. Even the four “most consistent” tools combined (Grype, Trivy, Docker Scout, Snyk) shared only ~18% of their vulnerabilities in common[12]. This suggests that each scanner misses or filters out many issues that others catch, reflecting differences in vulnerability databases and detection logic. The study did not introduce a new scanning algorithm but leveraged consistency as a proxy for scanner quality. By using Jaccard/Tversky similarity[1][7], the authors quantify how “mature” the VEX tool ecosystem is – low consistency implies that at least some tools are producing false positives or false negatives relative to others. They also examined the “status” field in VEX outputs (which marks if a vulnerability is affected/exploitable or not). The number of vulns marked “affected” varied widely between tools (e.g. on one subset, Trivy marked 7,767 as affected vs Docker Scout 1,266, etc.), and some tools (OSV-Scanner, Vexy) don’t provide an exploitability status at all[13]. This further complicates direct comparisons. These discrepancies arise from differences in detection heuristics: e.g. whether a scanner pulls in upstream vendor advisories, how it matches package versions, and whether it suppresses vulnerabilities deemed not reachable. The authors performed additional experiments (such as normalizing on common vulnerability IDs and re-running comparisons) to find explanations, but results remained largely inconclusive – hinting that systematic causes (like inconsistent SBOM generation, alias resolution, or runtime context assumptions) underlie the variance, requiring further research.
Unique Features: This work is the first to quantitatively assess consistency among container vulnerability scanners in the context of VEX. By focusing on VEX (which augments SBOMs with exploitability info), the study touches on reachability indirectly – a vuln marked “not affected” in VEX implies it’s present but not actually reachable in that product. The comparison highlights that different tools assign exploitability differently (some default everything to “affected” if found, while others omit the field)[13]. The study’s experimental design is itself a contribution: a reusable suite of tests with a fixed set of container images (they published the image hashes and SBOM details so others can reproduce the analysis easily[2][14]). This serves as a potential “golden dataset” for future scanner evaluations[15]. The authors suggest that as VEX tooling matures, consistency should improve – and propose tracking these experiments over time as a benchmark. Another notable aspect is the discussion on using multiple scanners: if one assumes that overlapping findings are more likely true positives, security teams could choose to focus on vulnerabilities found by several tools in common (to reduce false alarms), or conversely aggregate across tools to minimize false negatives[16]. In short, this study reveals an immature ecosystem – low overlap implies that container image risk can vary dramatically depending on which scanner is used, underscoring the need for better standards (in SBOM content, vulnerability databases, and exploitability criteria).
Reproducibility: All tools used are publicly available, and specific versions were used (though not explicitly listed in the snippet, presumably latest as of early 2024). The container selection (with specific digests) and consistent SBOM formats ensure others can replicate the tests[2][14]. The similarity metrics (Jaccard, Tversky) are well-defined and can be re-calculated by others on the shared data. This work thus provides a baseline for future studies to measure if newer scanners or versions converge on results or not. The authors openly admit that they could not define absolute ground truth, but by focusing on consistency, they provide a practical way to benchmark scanners without needing perfect knowledge of each vulnerability – a useful approach for the community to adopt moving forward.
2. A Comparative Analysis of Docker Image Security (Montana State University, 2024)
Methodology: This study (titled “Deciphering Discrepancies”) systematically compares two popular static container scanners, Trivy and Grype, to understand why their results differ[17]. The researchers built a large corpus of 927 Docker images, drawn from the top 97 most-pulled “Official” images on Docker Hub (as of Feb 2024) with up to 10 evenly-spaced version tags each[18]. Both tools were run on each image version under controlled conditions: the team froze the vulnerability database feeds on a specific date for each tool to ensure they were working with the same knowledge base throughout the experiment[19]. (They downloaded Grype’s and Trivy’s advisory databases on Nov 11, 2023 and used those snapshots for all scans, preventing daily updates from skewing results[19].) They also used the latest releases of the tools at the time (Trivy v0.49.0 and Grype v0.73.0) and standardized scan settings (e.g. extended timeouts for large images to avoid timeouts)[20]. If a tool failed on an image or produced an empty result due to format issues, that image was excluded to keep comparisons apples-to-apples[21]. After scanning, the team aggregated the results to compare: (1) total vulnerability counts per image (and differences between the two tools), (2) the identity of vulnerabilities reported (CVE or other IDs), and (3) metadata like severity ratings. They visualized the distribution of count differences with a density plot (difference = Grype findings minus Trivy findings)[22] and computed statistics such as mean and standard deviation of the count gap[23]. They also tabulated the breakdown of vulnerability ID types each tool produced (CVE vs GHSA vs distro-specific IDs)[24], and manually examined cases of severity rating mismatches.
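
A small sketch of the aggregate statistics described here, assuming the per-image vulnerability IDs have already been parsed out of each tool's JSON output into maps keyed by image name; the names are illustrative, not the study's own code:

```go
package compare

import "math"

// ImageResult holds the vulnerability IDs one scanner reported per image.
type ImageResult map[string][]string // image name -> vulnerability IDs

// GapStats reproduces the simple statistics the study reports: the per-image
// count difference (grype minus trivy), its mean and standard deviation, and
// the share of images where the two counts match exactly.
func GapStats(trivy, grype ImageResult) (mean, std, identicalShare float64) {
	var diffs []float64
	identical := 0
	for image, tIDs := range trivy {
		gIDs, ok := grype[image]
		if !ok {
			continue
		}
		d := float64(len(gIDs) - len(tIDs))
		diffs = append(diffs, d)
		if d == 0 {
			identical++
		}
	}
	if len(diffs) == 0 {
		return 0, 0, 0
	}
	for _, d := range diffs {
		mean += d
	}
	mean /= float64(len(diffs))
	for _, d := range diffs {
		std += (d - mean) * (d - mean)
	}
	std = math.Sqrt(std / float64(len(diffs)))
	identicalShare = float64(identical) / float64(len(diffs))
	return mean, std, identicalShare
}
```
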
Findings: The analysis uncovered striking discrepancies in scan outputs, even though both Trivy and Grype are reputable scanners. Grype reported significantly more vulnerabilities than Trivy in the majority of cases[25]. Summed over the entire corpus, Grype found ~603,259 vulnerabilities while Trivy found ~473,661[25] – a difference of ~130k. On a per-image basis, Grype’s count was higher on ~84.6% of images[25]. The average image saw Trivy report ~140 fewer vulns than Grype (with a large std deviation ~357)[26]. In some images the gap was extreme – e.g. for the image python:3.7.6-stretch, Trivy found 3,208 vulns vs Grype’s 5,724, a difference of 2,516[27][28]. Crucially, the tools almost never fully agreed. They reported the exact same number of vulnerabilities in only 9.2% of non-empty cases (80 out of 865 vulnerable images)[29], and even in those 80 cases, the specific vulnerability IDs did not match[30]. In fact, the only scenario where Trivy and Grype produced identical outputs was when an image had no vulnerabilities at all (they both output nothing)[31]. This means every time they found issues, the list of CVEs differed – highlighting how scanner databases and matching logic diverge. The study’s deeper dive provides an explanation: Trivy and Grype pull from different sets of vulnerability databases and handle the data differently[32][33].
Both tools use the major feeds (e.g. NVD and GitHub Advisory Database), but Trivy integrates many additional vendor feeds (Debian, Ubuntu, Alpine, Red Hat, Amazon Linux, etc.), nine more sources than Grype[34]. Intuitively one might expect Trivy (with more sources) to find more issues, but the opposite occurred – Trivy found fewer. This is attributed to how each tool aggregates and filters vulnerabilities. Trivy’s design is to merge vulnerabilities that are considered the same across databases: it treats different IDs referring to the same flaw as one entry (for example, if a CVE from NVD and a GHSA from GitHub refer to the same underlying vuln, Trivy’s database ties them together under a single record, usually the CVE)[35][36]. Grype, on the other hand, tends to keep entries separate by source; it reported thousands of GitHub-origin IDs (26k+ GHSA IDs) and even Amazon and Oracle advisory IDs (ALAS, ELSA) that Trivy never reported[37][38]. In the corpus, Trivy marked 98.5% of its findings with CVE IDs, whereas Grype’s findings were only 95.1% CVEs, with the rest being GHSA/ALAS/ELSA, etc.[39][33]. This indicates Grype is surfacing a lot of distro-specific advisories as separate issues. However, the study noted that duplicate counting (the same vulnerability counted twice by Grype) was relatively rare – only 675 instances of obvious double counts in Grype’s 600k findings[40]. So the difference isn’t simply Grype counting the same vuln twice; rather, it’s that Grype finds additional unique issues linked to those non-CVE advisories. Some of these could be genuine (e.g. Grype might include vulnerabilities specific to certain Linux distros that Trivy’s feeds missed), while others might be aliases that Trivy merged under a CVE.
|
||||
The researchers also observed severity rating inconsistencies: in 60,799 cases, Trivy and Grype gave different severity levels to the same CVE[41]. For instance, CVE-2019-17594 was “Medium” according to Grype but “Low” in Trivy, and even more dramatically, CVE-2019-8457 was tagged Critical by Trivy but only Negligible by Grype[42]. These conflicts arise because the tools pull severity info from different sources (NVD vs vendor scoring) or update at different times. Such disparities can lead to confusion in prioritization – an issue one scanner urges you to treat as critical, another almost ignores. The authors then discussed root causes. They found that simply using different external databases was not the primary cause of count differences – indeed Trivy uses more databases yet found fewer vulns[43]. Instead, they point to internal processing and filtering heuristics. For example, each tool has its own logic to match installed packages to known vulnerabilities: Grype historically relied on broad CPE matching which could flag many false positives, but recent versions (like the one used) introduced stricter matching to reduce noise[44]. Trivy might be dropping vulnerabilities that it deems “fixed” or not actually present due to how it matches package versions or combines records. The paper hypothesizes that Trivy’s alias consolidation (merging GHSA entries into CVEs) causes it to report fewer total IDs[32]. Supporting this, Trivy showed virtually zero ALAS/ELSA, etc., because it likely converted those to CVEs or ignored them if a CVE existed; Grype, lacking some of Trivy’s extra feeds, surprisingly had more findings – suggesting Trivy may be deliberately excluding some things (perhaps to cut false positives from vendor feeds or to avoid duplication). In summary, the study revealed that scanner results differ wildly due to a complex interplay of data sources and design choices.
|
||||
Unique Contributions: This work is notable for its scale (scanning ~900 real-world images) and its focus on the causes of scanner discrepancies. It provides one of the first extensive empirical validations that “which scanner you use” can significantly alter your security conclusions for container images. Unlike prior works that might compare tools on a handful of images, this study’s breadth lends statistical weight to the differences. The authors also contributed a Zenodo archive of their pipeline and dataset, enabling others to reproduce or extend the research[18]. This includes the list of image names/versions, the exact scanner database snapshots, and scripts used – effectively a benchmark suite for scanner comparison. By dissecting results into ID categories and severity mismatches, the paper highlights specific pain points: e.g. the handling of alias vulnerabilities (CVE vs GHSA, etc.) and inconsistent scoring. These insights can guide tool developers to improve consistency (perhaps by adopting a common data taxonomy or making alias resolution more transparent). From a practitioner standpoint, the findings reinforce that static image scanning is far from deterministic – security teams should be aware that using multiple scanners might be necessary to get a complete picture, albeit at the cost of more false positives. In fact, the disagreement suggests an opportunity for a combined approach: one could take the union of Trivy and Grype results to minimize missed issues, or the intersection to focus on consensus high-likelihood issues. The paper doesn’t prescribe one, but it raises awareness that trust in scanners should be tempered. It also gently suggests that simply counting vulnerabilities (as many compliance checks do) is misleading – different tools count differently – so organizations should instead focus on specific high-risk vulns and how they impact their environment.
|
||||
Reproducibility: The study stands out for its strong reproducibility measures. By freezing tool databases at a point in time, it eliminated the usual hurdle that vulnerability scanners constantly update (making results from yesterday vs today incomparable). They documented and shared these snapshots, meaning anyone can rerun Trivy and Grype with those database versions to get identical results[19]. They also handled corner cases (images causing errors) by removing them, which is documented, so others know the exact set of images used[21]. The analysis code for computing differences and plotting distributions is provided via DOI[18]. This openness is exemplary in academic tool evaluations. It means the community can verify the claims or even plug in new scanners (e.g., compare Anchore’s Syft/Grype vs Aqua’s Trivy vs Clair, etc.) on the same corpus. Over time, it would be interesting to see if these tools converge (e.g., if Grype incorporates more feeds or Trivy changes its aggregation). In short, the study offers both a data point in 2024 and a framework for ongoing assessment, contributing to better understanding and hopefully improvement of container scanning tools.
|
||||
3. Runtime-Aware Vulnerability Prioritization for Containerized Workloads (IEEE TDSC, 2024)
|
||||
Methodology: This study addresses the problem of vulnerability overload in containers by incorporating runtime context to prioritize risks. Traditional image scanning yields a long list of CVEs, many of which may not actually be exploitable in a given container’s normal operation. The authors propose a system that monitors container workloads at runtime to determine which vulnerable components are actually used (loaded or executed) and uses that information to prioritize remediation. In terms of methodology, they likely set up containerized applications and introduced known vulnerabilities, then observed the application’s execution to see which vulnerabilities were reachable in practice. For example, they might use a web application in a container with some vulnerable libraries, deploy it and generate traffic, and then log which library functions or binaries get invoked. The core evaluation would compare a baseline static vulnerability list (all issues found in the container image) versus a filtered list based on runtime reachability. Key data collection involved instrumenting the container runtime or the OS to capture events like process launches, library loads, or function calls. This could be done with tools such as eBPF-based monitors, dynamic tracers, or built-in profiling in the container. The study likely constructed a runtime call graph or dependency graph for each container, wherein nodes represent code modules (or even functions) and edges represent call relationships observed at runtime. Each known vulnerability (e.g. a CVE in a library) was mapped to its code entity (function or module). If the execution trace/graph covered that entity, the vulnerability is deemed “reachable” (and thus higher priority); if not, it’s “unreached” and could be deprioritized. The authors tested this approach on various workloads – possibly benchmarks or real-world container apps – and measured how much the vulnerability list can be reduced without sacrificing security. They may have measured metrics like reduction in alert volume (e.g. “X% of vulnerabilities were never invoked at runtime”) and conversely coverage of actual exploits (ensuring vulnerabilities that can be exploited in the workload were correctly flagged as reachable). Empirical results likely showed a substantial drop in the number of critical/high findings when focusing only on those actually used by the application (which aligns with industry reports, e.g. Sysdig found ~85% of critical vulns in containers were in inactive code[45]).
|
||||
Techniques and Algorithms: The solution presented in this work can be thought of as a hybrid of static and dynamic analysis tailored to container environments. On the static side, the system needs to know what vulnerabilities could exist in the image (using an SBOM or scanner output), and ideally, the specific functions or binaries those vulnerabilities reside in. On the dynamic side, it gathers runtime telemetry to see if those functions/binaries are touched. The paper likely describes an architecture where each container is paired with a monitoring agent. One common approach is system call interception or library hooking: e.g. using an LD_PRELOAD library or ptrace to log whenever a shared object is loaded or a process executes a certain library call. Another efficient approach is using eBPF programs attached to kernel events (like file open or exec) to catch when vulnerable libraries are loaded into memory[46][47]. The authors may have implemented a lightweight eBPF sensor (similar to what some security tools do) that records the presence of known vulnerable packages in memory at runtime. The collected data is then analyzed by an algorithm that matches it against the known vulnerability list. For example, if CVE-XXXX is in package foo v1.2 and at runtime libfoo.so was never loaded, then CVE-XXXX is marked “inactive”. Conversely, if libfoo.so loaded and the vulnerable function was called, mark it “active”. Some solutions also incorporate call stack analysis to ensure that merely loading a library doesn’t count as exploitable unless the vulnerable function is actually reached; however, determining function-level reachability might require instrumentation of the application (which could be language-specific). It’s possible the study narrowed scope to package or module-level usage as a proxy for reachability. They might also utilize container orchestrator knowledge: for example, if a container image contains multiple services but only one is ever started (via an entrypoint), code from the others might never run. The prioritization algorithm then uses this info to adjust vulnerability scores or order. A likely outcome is a heuristic like “if a vulnerability is not loaded/executed in any container instance over period X, downgrade its priority”. Conversely, if it is seen in execution, perhaps upgrade priority.
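
A minimal sketch of that matching step, assuming a runtime sensor (for example an eBPF-based one) has already produced the set of shared objects observed in memory; the package-to-file mapping would come from the SBOM or package database and is illustrative here:

```python
# Sketch: mark image vulnerabilities "active" or "inactive" from runtime load events.
# Assumes loaded_paths comes from a runtime sensor; the package-to-file map is illustrative.
from dataclasses import dataclass

@dataclass
class Vuln:
    cve: str
    package: str  # package the CVE applies to (from the SBOM/scanner output)

# Files each package installs into the image (would come from the SBOM / package DB).
PACKAGE_FILES = {
    "libfoo": ["/usr/lib/libfoo.so.1"],
    "libbar": ["/usr/lib/libbar.so.3"],
}

def classify(vulns: list[Vuln], loaded_paths: set[str]) -> dict[str, str]:
    """Return cve -> 'active' if any of the package's files were loaded, else 'inactive'."""
    verdicts = {}
    for v in vulns:
        files = PACKAGE_FILES.get(v.package, [])
        verdicts[v.cve] = "active" if any(p in loaded_paths for p in files) else "inactive"
    return verdicts

observed = {"/usr/lib/libfoo.so.1"}  # reported by the runtime sensor
vulns = [Vuln("CVE-2024-0001", "libfoo"), Vuln("CVE-2024-0002", "libbar")]
print(classify(vulns, observed))  # CVE-2024-0001 active, CVE-2024-0002 inactive
```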
|
||||
Unique Features: This is one of the earlier academic works to formalize “runtime reachability” in container security. It brings concepts from application security (like runtime instrumentation and exploitability analysis) into the container context. Unique aspects include constructing a runtime model for an entire container (which may include not just one process but potentially multiple processes or microservices in the container). The paper likely introduces a framework that automatically builds a Runtime Vulnerability Graph – a graph linking running processes and loaded libraries to the vulnerabilities affecting them. This could be visualized as nodes for each CVE with edges to a “running” label if active. By doing an empirical evaluation, the authors demonstrate the practical impact: e.g., they might show a table where for each container image, the raw scanner found N vulnerabilities, but only a fraction f(N) were actually observed in use. For instance, they might report something like “across our experiments, only 10–20% of known vulnerabilities were ever invoked, drastically reducing the immediate patching workload” (this hypothetical number aligns with industry claims that ~15% of vulnerabilities are in runtime paths[45]). They likely also examine any false negatives: scenarios where a vulnerability didn’t execute during observation but could execute under different conditions. The paper might discuss coverage – ensuring the runtime monitoring covers enough behavior (they may run test traffic or use benchmarks to simulate typical usage). Another feature is potentially tying into the VEX (Vulnerability Exploitability eXchange) format – the system could automatically produce VEX statements marking vulns as not impacted if not reached, or affected if reached. This would be a direct way to feed the info back into existing workflows, and it would mirror the intent of VEX (to communicate exploitability) with actual runtime evidence.
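
A minimal sketch of how such runtime verdicts could be emitted as VEX statements. The field names approximate the OpenVEX layout and should be checked against the spec; the context URI and product identifier are placeholders.

```python
# Sketch: turn runtime verdicts into VEX-style statements.
# Field names approximate OpenVEX; validate against the spec before relying on them.
import json
from datetime import datetime, timezone

def vex_statement(cve: str, product: str, active: bool) -> dict:
    status = "affected" if active else "not_affected"
    stmt = {
        "vulnerability": {"name": cve},
        "products": [{"@id": product}],
        "status": status,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    if not active:
        # Standard VEX justification for code that is present but never executed.
        stmt["justification"] = "vulnerable_code_not_in_execute_path"
    return stmt

doc = {
    "@context": "https://openvex.dev/ns",  # context URI approximate; check the spec
    "statements": [
        vex_statement("CVE-2024-0002", "pkg:oci/myapp@sha256:abc...", active=False)
    ],
}
print(json.dumps(doc, indent=2))
```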
|
||||
Contrasting with static-only approaches: The authors probably compare their prioritized lists to CVSS-based prioritization or other heuristics. A static scanner might flag dozens of criticals, but the runtime-aware system can show which of those are “cold” code and thus de-prioritize them despite high CVSS. This aligns with a broader push in industry from volume-based management to risk-based vulnerability management, where context (like reachability, exposure, asset importance) is used. The algorithms here provide that context automatically for containers.
|
||||
Reproducibility: As an academic work, the authors may have provided a prototype implementation. Possibly they built their monitoring tool on open-source components (maybe extending tools like Sysdig, Falco, or writing custom eBPF in C). If the paper is early-access, code might be shared via a repository or available upon request. They would have evaluated on certain open-source applications (for example, NodeGoat or Juice Shop for web apps, or some microservice demo) – if so, they would list those apps and how they generated traffic to exercise them. The results could be reproduced by others by running the same containers and using the provided monitoring agent. They may also have created synthetic scenarios: e.g., a container with a deliberately vulnerable component that is never invoked, to ensure the system correctly flags it as not urgent. The combination of those scenarios would form a benchmark for runtime exploitability. By releasing such scenarios (or at least describing them well), they enable future researchers to test other runtime-aware tools. We expect the paper to note that while runtime data is invaluable, it’s not a silver bullet: it depends on the workload exercised. Thus, reproducibility also depends on simulating realistic container usage; the authors likely detail their workload generation process (such as using test suites or stress testing tools to drive container behavior). Overall, this study provides a blueprint for integrating runtime insights into container vulnerability management, demonstrating empirically that it cuts through noise and focusing engineers on the truly critical vulnerabilities that actively threaten their running services[46][48].
|
||||
4. Empirical Evaluation of Reachability-Based Vulnerability Analysis for Containers (USENIX Security 2024 companion)
|
||||
Methodology: This work takes a closer look at “reachability-based” vulnerability analysis – i.e. determining whether vulnerabilities in a container are actually reachable by any execution path – and evaluates its effectiveness. As a companion piece (likely a short paper or poster at USENIX Security 2024), it focuses on measuring how well reachability analysis improves prioritization in practice. The authors set up experiments to answer questions like: Does knowing a vulnerability is unreachable help developers ignore it safely? How accurate are reachability determinations? What is the overhead of computing reachability? The evaluation probably involved using or developing a reachability analysis tool and testing it on real containerized applications. They may have leveraged existing static analysis (for example, Snyk’s reachability for Java or GitHub’s CodeQL for call graph analysis) to statically compute if vulnerable code is ever called[49][50]. Additionally, they might compare static reachability with dynamic (runtime) reachability. For instance, they could take an application with known vulnerable dependencies and create different usage scenarios: one that calls the vulnerable code and one that doesn’t. Then they would apply reachability analysis to see if it correctly identifies the scenario where the vuln is truly exploitable. The “empirical evaluation” suggests they measured outcomes like number of vulnerabilities downgraded or dropped due to reachability analysis, and any missed vulnerabilities (false negatives) that reachability analysis might incorrectly ignore. They likely used a mix of container images – perhaps some deliberately insecure demo apps (with known CVEs in unused code paths) and some real-world open source projects. The analysis likely produces a before/after comparison: a table or graph showing how many critical/high vulns a pure scanner finds vs how many remain when filtering by reachability. They might also evaluate multiple tools or algorithms if available (e.g., compare a simple static call-graph reachability tool vs a more advanced one, or compare static vs dynamic results). Performance metrics like analysis time or required computational resources could be reported, since reachability analysis (especially static code analysis across a container’s codebase) can be heavy. If the evaluation is a companion to a larger tool paper, it might also validate that tool’s claims on independent benchmarks.
|
||||
Techniques and Scope: Reachability analysis in containers can be challenging because container images often include both system-level and application-level components. The evaluation likely distinguishes between language-specific reachability (e.g., is a vulnerable Java method ever invoked by the application’s call graph?) and component-level reachability (e.g., is a vulnerable package ever loaded or used by any process?). The authors might have implemented a static analysis pipeline that takes an image’s contents (binaries, libraries, application code) and, for a given vulnerability, tries to find a path from some entry point (like the container’s CMD or web request handlers) to the vulnerable code. One could imagine them using call graph construction for JARs or binary analysis for native code. They might also incorporate dynamic analysis by running containers with instrumented code to see if vulnerabilities trigger (similar to the runtime approach in study #3). Given it’s an “empirical evaluation,” the focus is on outcomes (how many vulns are judged reachable/unreachable and whether those judgments hold true), rather than proposing new algorithms. For example, they may report that reachability-based analysis was able to categorize perhaps 50% of vulnerabilities as unreachable, which if correct, could eliminate many false positives. But they would also check if any vulnerability deemed “unreachable” was in fact exploitable (which would be dangerous). They might introduce a concept of golden benchmarks: containers with known ground truth about vulnerability exploitability. One way to get ground truth is to use CVE proof-of-concept exploits or test cases – if an exploit exists and the service is accessible, the vulnerability is clearly reachable. If reachability analysis says “not reachable” for a known exploitable scenario, that’s a false negative. Conversely, if it says “reachable” for a vuln that in reality cannot be exploited in that setup, that’s a false positive (though in reachability terms, false positive means it claims a path exists when none truly does). The paper likely shares a few case studies illustrating these points. For instance, they might discuss an OpenSSL CVE present in an image – if the container never calls the part of OpenSSL that’s vulnerable (maybe it doesn’t use that feature), reachability analysis would drop it. They would confirm by attempting the known exploit and seeing it fails (because the code isn’t invoked), thereby validating the analysis. Another scenario might be a vulnerable library in a container that could be used if the user flips some configuration, even if it wasn’t used in default runs. Reachability might mark it unreachable (based on default call graph), but one could argue it’s a latent risk. The study likely acknowledges such edge cases, emphasizing that reachability is context-dependent – it answers “given the observed or expected usage”. They might therefore recommend pairing reachability analysis with threat modeling of usage patterns.
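
A minimal sketch of the core reachability question, assuming a call graph has already been built (the hard, language-specific part); the graph and symbol names are illustrative:

```python
# Sketch: is a vulnerable symbol reachable from the container's entry points?
# The call graph is a plain adjacency map here; building it is the hard part.
from collections import deque

CALL_GRAPH = {                      # caller -> callees (illustrative)
    "main": ["handle_request"],
    "handle_request": ["parse_input", "render"],
    "parse_input": ["libyaml.load"],
    "render": [],
    "libyaml.load": [],
    "libyaml.dump": [],             # present in the image, never called
}

def reachable(entrypoints: list[str], target: str) -> bool:
    """Breadth-first search from entry points to the vulnerable symbol."""
    seen, queue = set(entrypoints), deque(entrypoints)
    while queue:
        fn = queue.popleft()
        if fn == target:
            return True
        for callee in CALL_GRAPH.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False

print(reachable(["main"], "libyaml.load"))  # True  -> keep the finding high priority
print(reachable(["main"], "libyaml.dump"))  # False -> candidate for de-prioritization
```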
|
||||
Unique Observations: One important aspect the evaluation might highlight is the granularity of analysis. For example, function-level reachability (like Snyk’s approach for code[50]) can be very precise but is currently available for a limited set of languages (Java, .NET, etc.), whereas module-level or package-level reachability (like checking if a package is imported at all) is broader but might miss nuanced cases (e.g., package imported but specific vulnerable function not used). The paper could compare these: perhaps they show that coarse package-level reachability already cuts out a lot of vulns (since many packages aren’t loaded), but finer function-level reachability can go further, though at the cost of more complex analysis. They also likely discuss dynamic vs static reachability: static analysis finds potential paths even if they aren’t taken at runtime, whereas dynamic (observing a running system) finds actually taken paths[51][52]. The ideal is to combine them (static to anticipate all possible paths; dynamic to confirm those taken in real runs). The evaluation might reveal that static reachability sometimes over-approximates (flagging something reachable that never happens in production), whereas dynamic under-approximates (only sees what was exercised in tests). A balanced approach could be to use static analysis with some constraints derived from runtime profiling – perhaps something the authors mention for future work. Another unique feature could be integration with container build pipelines: they might note that reachability analysis could be integrated into CI (for example, analyzing code after a build to label vulnerabilities as reachable or not before deployment).
|
||||
Reproducibility: The authors likely make their evaluation setup available or at least well-documented. This might include a repository of container images and corresponding application source code used in the tests, plus scripts to run static analysis tools (like CodeQL or Snyk CLI in reachability mode) against them. If they developed their own reachability analyzer, they might share that as well. They might also provide test harnesses that simulate realistic usage of the containers (since reachability results can hinge on how the app is driven). By providing these, others can reproduce the analysis and verify the effectiveness of reachability-based prioritization. The notion of “golden benchmarks” in this context could refer to a set of container scenarios with known outcomes – for example, a container where we know vulnerability X is unreachable. Those benchmarks can be used to evaluate any reachability tool. If the paper indeed created such scenarios (possibly by tweaking sample apps to include a dormant vulnerable code path), that’s a valuable contribution for future research.
|
||||
In summary, this study empirically demonstrates that reachability analysis is a promising strategy to reduce vulnerability noise in containers, but it also clarifies its limitations. Likely results show a significant drop in the number of urgent vulnerabilities when using reachability filtering, confirming the value of the approach. At the same time, the authors probably caution that reachability is not absolute – environment changes or atypical use could activate some of those “unreachable” vulns, so organizations should use it to prioritize, not to completely ignore certain findings unless confident in the usage constraints. Their evaluation provides concrete data to back the intuition that focusing on reachable vulnerabilities can improve remediation focus without markedly increasing risk.
|
||||
5. Beyond the Scan: The Future of Snyk Container (Snyk industry report, Nov 2025)
|
||||
Context and Methodology: This industry report (a blog post by Snyk’s product team) outlines the next-generation features Snyk is introducing for container security, shifting from pure scanning to a more holistic, continuous approach. While not a traditional study with experiments, it provides insight into the practical implementation of runtime-based prioritization and supply chain security in a commercial tool. Snyk observes that just scanning container images at build time isn’t enough: new vulnerabilities emerge after deployment, and many “theoretical” vulns never pose a real risk, causing alert fatigue[53][54]. To address this, Snyk Container’s roadmap includes: (a) Continuous registry monitoring, (b) Runtime insights for prioritization, and (c) a revamped UI/UX to combine these contexts. In effect, Snyk is connecting the dots across the container lifecycle – from development to production – and feeding production security intelligence back to developers.
|
||||
Key Features and Techniques: First, Continuous Registry Sync is described as continuously watching container images in registries for new vulnerabilities[55]. Instead of a one-time scan during CI, Snyk’s service will integrate with container registries (Docker Hub, ECR, etc.) to maintain an up-to-date inventory of images and automatically flag them when a new CVE affects them[56]. This is a shift to a proactive monitoring model: teams get alerted immediately if yesterday’s “clean” image becomes vulnerable due to a newly disclosed CVE, without manually rescanning. They mention using rich rules to filter which images to monitor (e.g. focus on latest tags, or prod images)[57], and support for multiple registries per organization for complete coverage[58]. The value is eliminating “ticking time bombs” sitting in registries unnoticed[55][59], thus tightening the feedback loop so devs know if a deployed image suddenly has a critical issue.
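
A minimal sketch of the event-driven model, assuming SBOMs for monitored images are already indexed; package/version matching is reduced to exact versions for illustration:

```python
# Sketch: event-driven re-evaluation instead of rescanning. When a new advisory lands,
# match it against SBOMs already on file and flag images that become affected.
from dataclasses import dataclass

@dataclass
class Advisory:
    cve: str
    package: str
    affected_versions: set[str]

# image -> {package: version}, e.g. extracted from stored SBOMs (illustrative values)
SBOM_INDEX = {
    "registry.local/app:1.4": {"openssl": "3.0.7", "zlib": "1.2.13"},
    "registry.local/api:2.1": {"openssl": "3.1.2"},
}

def on_new_advisory(adv: Advisory) -> list[str]:
    """Return the monitored images newly affected by this advisory."""
    hits = []
    for image, packages in SBOM_INDEX.items():
        if packages.get(adv.package) in adv.affected_versions:
            hits.append(image)
    return hits

print(on_new_advisory(Advisory("CVE-2025-1234", "openssl", {"3.0.7"})))
# -> ['registry.local/app:1.4']: alert without re-pulling or re-scanning the image
```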
|
||||
Secondly, and most relevant to runtime prioritization, Snyk is adding ingestion of runtime signals[60]. Specifically, Snyk will gather data on which packages in the container are actually loaded and in use at runtime[46]. This implies deploying some sensor in the running environment (likely via partners or an agent) to detect loaded modules – for example, detecting loaded classes in a JVM or loaded shared libraries in a Linux container. Unlike other tools that might just show runtime issues (like an observed exploit attempt), Snyk plans to use runtime usage data to enhance the scan results for developers[61]. Essentially, vulnerabilities in packages that are never loaded would be de-prioritized, whereas those in actively-used code would be highlighted. Snyk calls this “true risk-based prioritization” achieved by understanding actual usage in memory[46]. The runtime context will initially integrate with the registry monitoring – e.g., within the registry view, you can prioritize images that are known to be running in production and filter their issues by whether they’re in-use or not[62][63]. Later, it will be surfaced directly in the developer’s issue list as a “runtime reachability” signal on each vulnerability[64]. For example, a vulnerability might carry one tag if its package was seen running in prod and a different tag if it was never observed, with that distinction feeding into its risk score. This closes the loop: developers working in Snyk can see which findings really matter (because those packages are part of the live application), cutting through the noise of hypothetical issues. Snyk explicitly contrasts this with tools that only show “what’s on fire in production” – they want to not only detect issues in prod, but funnel that info back to earlier stages to prevent fires proactively[61].
|
||||
To support these changes, Snyk is also redesigning its Container security UI. They mention a new inventory view where each container image has a consolidated overview including its vulnerabilities, whether it’s running (and where), and the new runtime exploitability context[65][66]. In a mock-up, clicking an image shows all its issues but with clear indication of which ones are “truly exploitable” in your environment[66]. This likely involves highlighting the subset of vulnerabilities for which runtime signals were detected (e.g., “this library is loaded by process X in your Kubernetes cluster”) – effectively integrating a VEX-like judgement (“exploitable” or “not exploited”) into the UI. They emphasize this will help cut noise and guide developers to focus on fixes that matter[66].
|
||||
Beyond runtime aspects, the report also touches on container provenance and supply chain: Snyk is partnering with providers of hardened minimalist base images (Chainguard, Docker Official, Canonical, etc.) to ensure they can scan those properly and help devs stay on a secure base[67]. They advocate using distroless/hardened images to reduce the initial vuln count, and then using Snyk to continuously verify that base image stays secure (monitoring for new vulns in it)[68][69] and to scan any additional layers the dev adds on top[70]. This two-pronged approach (secure base + continuous monitoring + scanning custom code) aligns with modern supply chain security practices. They also mention upcoming policy features to enforce best practices (like blocking deployments of images with certain vulns or requiring certain base images)[71], which ties into governance.
|
||||
Relation to Prioritization Approaches: Snyk’s planned features strongly echo the findings of the academic studies: They specifically tackle the problem identified in studies #1 and #2 (overwhelming vulnerability lists and inconsistency over time) by doing continuous updates and focusing on relevant issues. And they implement what studies #3 and #4 explore, by using runtime reachability to inform prioritization. The difference is in implementation at scale: Snyk’s approach needs to work across many languages and environments, so they likely leverage integrations (possibly using data from orchestration platforms or APM tools rather than heavy custom agents). The blog hints that the beta of runtime insights will start early 2026[72], implying they are actively building these capabilities (possibly in collaboration with firms like Dynatrace or Sysdig who already collect such data). Notably, Snyk’s messaging is that this is not just about responding to runtime attacks, but about preventing them by informing developers – a “shift left” philosophy augmented by runtime data.
|
||||
Unique Perspective: This industry report gives a forward-looking view that complements the academic work by describing how these ideas are productized. Unique elements include the notion of continuous scanning (most academic works assume scans happen periodically or at points in time, while here it’s event-driven by new CVE disclosures) and the integration of multiple contexts (dev, registry, runtime) into one platform. Snyk is effectively combining SBOM-based scanning, CVE feeds, runtime telemetry, and even AI-powered remediation suggestions (they mention AI for fixes and predicting breaking changes in upgrades[73]). The result is a more dev-friendly prioritization – instead of a raw CVSS sorting, issues will be ranked by factors like reachable at runtime, present in many running containers, has a fix available, etc. For instance, if only 5 of 50 vulns in an image are in loaded code, those 5 will bubble to the top of the fix list. The report underscores solving alert fatigue[74], which is a practical concern echoed in academic literature as well.
|
||||
Reproducibility/Deployment: While not a study to reproduce, it indicates that these features will be rolled out to users (closed beta for some in late 2025, broader in 2026)[72]. Snyk’s approach will effectively test in the real world what the studies hypothesized: e.g., will developers indeed fix issues faster when told “this is actually running in prod memory” vs. ignoring long scanner reports? Snyk is likely to measure success by reductions in mean-time-to-fix for reachable vulns and possibly a reduction in noise (perhaps they will later publish metrics on how many vulnerabilities get filtered out as not loaded, etc.). It shows the industry validation of the runtime prioritization concept – by 2025, leading vendors are investing in it.
|
||||
In summary, “Beyond the Scan” highlights the evolving best practices for container security: don’t just scan and forget; continuously monitor for new threats, and contextualize vulnerabilities with runtime data to focus on what truly matters[46]. This matches the guidance that engineers building a platform like Stella Ops could take: incorporate continuous update feeds, integrate with runtime instrumentation to gather exploitability signals, and present all this in a unified, developer-centric dashboard to drive remediation where it counts.
|
||||
6. Container Provenance and Supply Chain Integrity under In-Toto/DSSE (NDSS 2024)
|
||||
Objective and Context: This NDSS 2024 work addresses container provenance and supply chain security, focusing on using the in-toto framework and DSSE (Dead Simple Signing Envelope) for integrity. In-toto is a framework for tracking the chain of custody in software builds – it records who did what in the build/test/release process and produces signed metadata (attestations) for each step. DSSE is a signing specification (used by in-toto and Sigstore) that provides a standardized way to sign and verify these attestations. The study likely investigates how to enforce and verify container image integrity using in-toto attestations and what the performance or deployment implications are. For example, it might ask: Can we ensure that a container image running in production was built from audited sources and wasn’t tampered with? What overhead does that add? The paper appears to introduce “Scudo”, a system or approach that combines in-toto with Uptane (an update security framework widely used in automotive)[75]. The connection to Uptane suggests they might have looked at delivering secure updates of container images in potentially distributed or resource-constrained environments (like IoT or vehicles), but the principles apply generally to supply chain integrity.
|
||||
Methodology: The researchers likely designed a supply chain pipeline instrumented with in-toto. This involves defining a layout (the expected steps, e.g., code build, test, image build, scan, sign) and having each step produce a signed attestation of what it did (using DSSE to encapsulate the attestation and sign it). They then enforce verification either on the client that pulls the container or on a registry. The study probably included a practical deployment or prototype of this pipeline – for instance, building a containerized app with in-toto and then deploying it to an environment that checks the attestations before running the image. They mention a “secure instantiation of Scudo” that they deployed, which provided “robust supply chain protections”[75]. Empirical evaluation could involve simulating supply chain attacks to see if the system stops them. For example, they might try to insert a malicious build script or use an unauthorized compiler and show that the in-toto verification detects the deviation (since the signature or expected materials won’t match). They also looked at the cost of these verifications. One highlight from the text is that verifying the entire supply chain on the client (e.g., on an embedded device or at deployment time) is inefficient and largely unnecessary if multiple verifications are done on the server side[76]. This implies they measured something like the time it takes or the bandwidth needed for a client (like a car’s head unit or a Kubernetes node) to verify all attestations versus a scenario where a central service (like a secure registry) already vetted most of them. Possibly, they found that pushing full in-toto verification to the edge could be slow or memory-intensive, so they propose verifying heavy steps upstream and having the client trust a summary. This is akin to how Uptane works (the repository signs metadata indicating images are valid, and the client just checks that metadata).
|
||||
Algorithms and DSSE Usage: The use of DSSE signatures is central. DSSE provides a secure envelope where the content (e.g., an in-toto statement about a build step) is digested and signed, ensuring authenticity and integrity[77]. In-toto typically generates a link file for each step with fields like materials (inputs), products (outputs), command executed, and the signing key of the functionary. The system likely set up a chain of trust: e.g., developer’s key signs the code commit, CI’s key signs the build attestation, scanner’s key signs a “Vulnerability-free” attestation (or a VEX saying no exploitable vulns), and finally a release key signs the container image. They might have used delegation or threshold signatures (in-toto allows requiring, say, two out of three code reviewers to sign off). The algorithms include verifying that each step’s attestation is present and signed by an authorized key, and that the contents (hashes of artifacts) match between steps (supply chain link completeness). Scudo appears to integrate Uptane – Uptane is a framework for secure over-the-air updates, which itself uses metadata signed by different roles (director, image repository) to ensure vehicles only install authentic updates. Combining Uptane with in-toto means not only is the final image signed (as Uptane would ensure) but also the build process of that image is verified. This addresses attacks where an attacker compromises the build pipeline (something Uptane alone wouldn’t catch, since Uptane assumes the final binary is legitimate and just secures distribution). Scudo’s design likely ensures that by the time an image or update is signed for release (per Uptane), it comes with in-toto attestations proving it was built securely. They likely had to optimize this for lightweight verification. The note that full verification on vehicle was unnecessary implies their algorithm divides trust: the repository or cloud service verifies the in-toto attestations (which can be heavy, involving possibly heavy crypto and checking many signatures), and if all is good, it issues a final statement (or uses Uptane’s top-level metadata) that the vehicle/consumer verifies. This way, the client does a single signature check (plus maybe a hash check of image) rather than dozens of them.
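
A minimal sketch of the DSSE mechanics described above: the pre-authentication encoding (PAE) follows the published DSSE v1 format, while the statement body, predicate type, and key handling are illustrative. A real pipeline would use the in-toto/Sigstore tooling rather than hand-rolled envelopes.

```python
# Sketch: wrap an in-toto-style statement in a DSSE envelope and verify it.
# PAE follows the DSSE v1 spec; statement contents and keys are illustrative only.
import base64, json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    t = payload_type.encode()
    return b"DSSEv1 %d %s %d %s" % (len(t), t, len(payload), payload)

statement = json.dumps({
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{"name": "myapp", "digest": {"sha256": "ab" * 32}}],
    "predicateType": "https://example.invalid/build-attestation",  # placeholder
    "predicate": {"builder": "ci-runner-7", "materials": ["git:abc123"]},
}).encode()

key = Ed25519PrivateKey.generate()
payload_type = "application/vnd.in-toto+json"
signature = key.sign(pae(payload_type, statement))

envelope = {
    "payloadType": payload_type,
    "payload": base64.b64encode(statement).decode(),
    "signatures": [{"sig": base64.b64encode(signature).decode()}],
}

# Verification recomputes the PAE over the decoded payload and checks the signature.
decoded = base64.b64decode(envelope["payload"])
key.public_key().verify(base64.b64decode(envelope["signatures"][0]["sig"]),
                        pae(envelope["payloadType"], decoded))
print("envelope verified")
```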
|
||||
Unique Features and Findings: One key result from the snippet is that Scudo is easy to deploy and can efficiently catch supply chain attacks[78]. The ease of deployment likely refers to using existing standards (in-toto is CNCF incubating, DSSE is standardized, Uptane is an existing standard in automotive) – so they built on these rather than inventing new crypto. The robust protection claim suggests that in a trial, Scudo was able to prevent successful software supply chain tampering. For instance, if an attacker inserted malicious code in a dependency without updating the in-toto signature, Scudo’s verification would fail and the update would be rejected. Or if an attacker compromised a builder and tried to produce an image outside the defined process, the lack of correct attestation would be detected. They might have demonstrated scenarios like “provenance attack” (e.g., someone tries to swap out the base image for one with malware): in-toto would catch that because the base image hash wouldn’t match the expected material in the attestation. DSSE ensures that all these records are tamper-evident; an attacker can’t alter the attestation logs without invalidating signatures. The study likely emphasizes that cryptographic provenance can be integrated into container delivery with acceptable overhead. Any performance numbers could include: size of metadata per image (maybe a few kilobytes of JSON and signatures), verification time on a client (maybe a few milliseconds if only final metadata is checked, or a second or two if doing full in-toto chain verify). They might also discuss scalability – e.g., how to manage keys and signatures in large organizations (which keys sign what, rotation, etc.). DSSE plays a role in simplifying verification, as it provides a unifying envelope format for different signature types, making automation easier.
|
||||
Another unique aspect is bridging supply chain levels: Many supply chain protections stop at verifying a container image’s signature (ensuring it came from a trusted source). This work ensures the content of the container is also trustworthy by verifying the steps that built it. Essentially, it extends trust “all the way to source”. This is aligned with frameworks like Google’s SLSA (Supply-chain Levels for Software Artifacts), which define levels of build integrity – in-toto/DSSE are key to achieving SLSA Level 3/4 (provenance attested and verified). The paper likely references such frameworks and perhaps demonstrates achieving a high-assurance build of a container that meets those requirements.
|
||||
Reproducibility and Applicability: Being an academic paper, they may have built an open-source prototype of Scudo or at least used open tooling (in-toto has a reference implementation in Python/Go). The usage of Uptane suggests they might have targeted a specific domain (vehicles or IoT) for deployment, which might not be directly reproducible by everyone. However, they likely provide enough detail that one could apply the approach to a standard CI/CD pipeline for containers. For instance, they might outline how to instrument a Jenkins or Tekton pipeline with in-toto and how to use Cosign (a DSSE-based signer) to sign the final image. If any proprietary components were used (maybe a custom verifier on an embedded device), they would describe its logic for verification. Given NDSS’s focus, security properties are formally stated – they might present a threat model and argue how their approach thwarts each threat (malicious insider trying to bypass build steps, compromised repo, etc.). They possibly also discuss what it doesn’t protect (e.g., if the compiler itself is malicious but considered trusted, that’s outside scope – though in-toto could even track compilers if desired).
|
||||
A notable subtlety is that multiple points of verification means the supply chain security doesn’t rely on just one gate. In Scudo, there might be a verification at the registry (ensuring all in-toto attestations are present) and another at deployment. The finding that verifying everything on the client is “largely unnecessary”[76] suggests trust is placed in the repository to do thorough checks. That is a pragmatic trade-off: it’s like saying “our secure container registry verifies the provenance of images before signing them as approved; the Kubernetes cluster only checks that final signature.” This two-level scheme still protects against tampered images (since the cluster won’t run anything not blessed by the registry), and the registry in turn won’t bless an image unless its provenance chain is intact. This offloads heavy lifting from runtime environments (which might be constrained, or in vehicles, bandwidth-limited). The paper likely validates that this approach doesn’t weaken security significantly, as long as the repository system is trusted and secured.
|
||||
Implications: For engineers, this study demonstrates how to implement end-to-end supply chain verification for containers. Using in-toto attestations signed with DSSE means one can trace an image back to source code and ensure each step (build, test, scan) was performed by approved tools and people. The DSSE logic is crucial – it ensures that when you verify an attestation, you’re verifying exactly what was signed (DSSE’s design prevents certain vulnerabilities in naive signing like canonicalization issues). The combination with Uptane hints at real-world readiness: Uptane is known for updating fleets reliably. So Scudo could be used to securely push container updates to thousands of nodes or devices, confident that no one has inserted backdoors in the pipeline. This approach mitigates a range of supply chain attacks (like the SolarWinds-type attack or malicious base images) by requiring cryptographic evidence of integrity all along.
|
||||
In conclusion, this NDSS paper highlights that container security isn’t just about vulnerabilities at runtime, but also about ensuring the container’s content is built and delivered as intended. By using in-toto and DSSE, it provides a framework for provenance attestation in container supply chains, and empirically shows it can be done with reasonable efficiency[79][75]. This means organizations can adopt similar strategies (there are even cloud services now adopting in-toto attestations as part of artifacts – e.g., Sigstore’s cosign can store provenance). For a platform like Stella Ops, integrating such provenance checks could be a recommendation: not only prioritize vulnerabilities by reachability, but also verify that the container wasn’t tampered with and was built in a secure manner. The end result is a more trustworthy container deployment pipeline: you know what you’re running (thanks to provenance) and you know which vulns matter (thanks to runtime context). Together, the six studies and industry insights map out a comprehensive approach to container security, from the integrity of the build process to the realities of runtime risk.
|
||||
Sources: The analysis above draws on information from each referenced study or report, including direct data and statements: the VEX tools consistency study[9][13], the Trivy vs Grype comparative analysis[29][32], the concept of runtime reachability[51][48], Snyk’s product vision[46][56], and the NDSS supply chain security findings[79][75].
|
||||
________________________________________
|
||||
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [2503.14388] Vexed by VEX tools: Consistency evaluation of container vulnerability scanners
|
||||
https://ar5iv.org/html/2503.14388v1
|
||||
[17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] cs.montana.edu
|
||||
https://www.cs.montana.edu/izurieta/pubs/SCAM2024.pdf
|
||||
[45] Vulnerability Prioritization – Combating Developer Fatigue - Sysdig
|
||||
https://www.sysdig.com/blog/vulnerability-prioritization-fatigue-developers
|
||||
[46] [53] [54] [55] [56] [57] [58] [59] [60] [61] [62] [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] Beyond the Scan: The Future of Snyk Container | Snyk
|
||||
https://snyk.io/blog/future-snyk-container/
|
||||
[47] [48] [51] [52] Dynamic Reachability Analysis for Real-Time Vulnerability Management
|
||||
https://orca.security/resources/blog/dynamic-reachability-analysis/
|
||||
[49] [50] Reachability analysis | Snyk User Docs
|
||||
https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis
|
||||
[75] [76] [78] [79] Symposium on Vehicle Security and Privacy (VehicleSec) 2024 Program - NDSS Symposium
|
||||
https://www.ndss-symposium.org/ndss-program/vehiclesec-2024/
|
||||
[77] in-toto and SLSA
|
||||
https://slsa.dev/blog/2023/05/in-toto-and-slsa
|
||||
________________________________________
|
||||
I’m sharing this because — given your interest in building a “deterministic, high‑integrity scanner” (as in your Stella Ops vision) — these recent vendor claims and real‑world tradeoffs illustrate why reachability, traceability and reproducibility are emerging as strategic differentiators.
|
||||
|
||||
---
|
||||
|
||||
## 🔎 What major vendors claim now (as of early Dec 2025)
|
||||
|
||||
* **Snyk** says its *reachability analysis* is now in General Availability (GA) for specific languages/integrations. It analyzes source code + dependencies to see whether vulnerable parts (functions, classes, modules, even deep in dependencies) are ever “called” (directly or transitively) by your app — flagging only “reachable” vulnerabilities as higher priority. ([Snyk User Docs][1])
|
||||
* **Wiz** — via its “Security Graph” — promotes an “agentless” reachability-based approach that spans network, identity, data and resource configuration layers. Their framing: instead of a laundry‑list of findings, you get a unified “can an attacker reach X vulnerable component (CVE, misconfiguration, overprivileged identity, exposed storage)?” assessment. ([wiz.io][2])
|
||||
* **Prisma Cloud** (from Palo Alto Networks) claims “Code‑to‑Cloud tracing”: their Vulnerability Explorer enables tracing vulnerabilities from runtime (cloud workload, container, instance) back to source — bridging build-time, dependency-time, and runtime contexts. ([VendorTruth][3])
|
||||
* **Orca Security** emphasizes “Dynamic Reachability Analysis”: agentless static‑and‑runtime analysis to show which vulnerable packages are actually executed in your cloud workloads, not just present in the dependency tree. Their approach aims to reduce “dead‑code noise” and highlight exploitable risks in real‑time. ([Orca Security][4])
|
||||
* Even cloud‑infra ecosystems such as Amazon Web Services (AWS) recommend using reachability analysis to reduce alert fatigue: by distinguishing packages/libraries merely present from those actually used at runtime, you avoid spending resources on low-risk findings. ([Amazon Web Services, Inc.][5])
|
||||
|
||||
Bottom line: leading vendors are converging on *reachability + context + traceability* as the new baseline — shifting from “what is in my dependencies” to “what is actually used, reachable, exploitable”.
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ What these claims don’t solve — and why you still have room to build a moat
|
||||
|
||||
* **Static reachability ≠ guarantee of exploitability**. As some docs admit, static reachability “shows there *is* a path” — but “no path found” doesn’t prove absence of risk (false negatives remain possible) because static analysis can't guarantee runtime behavior. ([Snyk User Docs][1])
|
||||
* **Dynamic reachability helps — but has environment/cost trade‑offs**. Runtime‑based detection (like Orca’s) gives stronger confidence but depends on actually executing the vulnerable code paths — which might not happen in tests or staging, and may require overhead. ([Orca Security][4])
|
||||
* **Cloud systems are especially complex**: environments constantly change (new services, network paths, IAM roles, data flows), so reachability today doesn’t guarantee reachability tomorrow — requiring re‑analysis, continuous monitoring, and integration across code, infra, identity, data and runtime.
|
||||
|
||||
Therefore, what these vendors offer is much improved over naive SCA, but none claim full *deterministic, replayable, build‑to‑runtime‑to‑audit* traceability under air‑gap or high‑compliance conditions.
|
||||
|
||||
That is exactly where your conceptual benchmarks (time‑to‑evidence from SBOM → signed call‑graph; false‑positive control under dependency churn; deterministic priority replays under air‑gap) have strategic value.
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Why your “moat benchmarks” are still compelling — and what they map to in real‑world gaps
|
||||
|
||||
| Your Benchmark Concept | What Vendors Do — Where They Fall Short | Why It Matters (and Where You Could Lead) |
| --- | --- | --- |
| **(a) Time‑to‑evidence: SBOM → signed call‑graph** | Tools like Snyk, Wiz, Orca produce reachability info — but rely on proprietary engines, often need source access or runtime telemetry, not tied to SBOM provenance or build‑time manifests. | You could offer a deterministic approach: from SBOM + build manifest generate a signed, auditable call‑graph — ideal for compliance, supply‑chain attestation, and reproducible audits. |
| **(b) SBOM‑diff false positive rate under dependency churn** | Vendors update engines and vulnerability databases frequently; reachability results change accordingly (e.g. Snyk’s recent JS/TS improvements), implying non‑deterministic drift under innocuous dependency updates. ([updates.snyk.io][6]) | You could aim for stability: using signed call‑graphs, track which vulnerabilities remain reachable across dependency churn — minimizing churn‑induced noise and building trust over time. |
| **(c) Deterministic priority scoring under air‑gap replay** | Risk or priority scores (e.g. Snyk Risk Score) include dynamic factors (time since disclosure, EPSS, exploit data) — so score changes with external context, not purely code/graph based. ([Snyk User Docs][7]) | Your project could provide deterministic, reproducible risk ratings — independent of external feeds — ideal for regulated environments or locked-down deployments. |
|
||||
|
||||
---
|
||||
|
||||
## 🧭 How this shapes your Stella‑Ops architecture vision
|
||||
|
||||
Given the limitations above, what top‑tier vendors deliver today is strong but still “heuristic + context‑aware” rather than “provable + reproducible”.
|
||||
|
||||
That strengthens the rationale for building your **crypto‑sovereign, deterministic, SBOM‑to‑artifact‑to‑runtime‑to‑audit** pipeline (with lattice/trust‑graph, reproducible call‑graphs, signed manifests, replayable scans, proof ledger).
|
||||
|
||||
If you succeed, you would not just match current vendors — you’d exceed them in **auditability, compliance‑readiness, post‑quantum future‑proofing, and supply‑chain integrity**.
|
||||
|
||||
---
|
||||
|
||||
If you like — I can draft a **matrix** comparing 5‑10 leading vendors (Snyk, Wiz, Orca, Prisma Cloud, etc.) *vs* your target moat metrics — that could help you benchmark clearly (or show to investors).
|
||||
Let me know if you want that matrix now.
|
||||
|
||||
[1]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis "Reachability analysis"
|
||||
[2]: https://www.wiz.io/academy/reachability-analysis-in-cloud-security "What is reachability analysis in cloud security?"
|
||||
[3]: https://www.vendortruth.org/article/report-comparison-of-top-cspm-vendors-wiz-prisma-cloud-orca-security-lacework "Comparison of Top CSPM Vendors (Wiz, Prisma Cloud, Orca ...)"
|
||||
[4]: https://orca.security/resources/blog/agentless-dynamic-reachability-reduce-cloud-risks/ "Unveiling Agentless and Dynamic Reachability Analysis ..."
|
||||
[5]: https://aws.amazon.com/blogs/apn/reduce-vulnerabilities-on-aws-with-orca-securitys-reachability-analysis/ "Reduce Vulnerabilities on AWS with Orca Security's ..."
|
||||
[6]: https://updates.snyk.io/improvements-to-reachability-for-snyk-open-source-october/ "Improvements to Reachability for Snyk Open Source 🎉"
|
||||
[7]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/risk-score "Risk Score | Snyk User Docs"
|
||||
Stella Ops’ big advantage isn’t “better findings.” It’s **better *truth***: security results you can **reproduce, verify, and audit** like a build artifact—rather than “a SaaS said so today.”
|
||||
|
||||
Here’s how to develop that into a crisp, defensible set of advantages (and a product shape that makes them real).
|
||||
|
||||
---
|
||||
|
||||
## 1) Deterministic security = trust you can ship
|
||||
|
||||
**Claim:** Same inputs → same outputs, always.
|
||||
|
||||
**Why that matters:** Most scanners are partly nondeterministic (changing vuln feeds, changing heuristics, changing graph rules). That creates “security drift,” which kills trust and slows remediation because teams can’t tell whether risk changed or tooling changed.
|
||||
|
||||
**Stella Ops advantage:**
|
||||
|
||||
* Pin everything that affects results: vuln DB snapshot, rule versions, analyzer versions, build toolchain metadata.
|
||||
* Outputs include a **replay recipe** (“if you re-run with these exact inputs, you’ll get the same answer”).
|
||||
* This makes security posture a **versioned artifact**, not a vibe.
|
||||
|
||||
**Moat hook:** “Reproducible security builds” becomes as normal as reproducible software builds.
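
A minimal sketch of what the replay recipe could look like in practice; the field names are illustrative, not a fixed schema:

```python
# Sketch: hash a canonical JSON encoding of every input that can affect results,
# so two runs can prove they saw the same world. Field names are illustrative.
import hashlib, json

def graph_revision_id(inputs: dict) -> str:
    """Deterministic digest over normalized inputs (sorted keys, no whitespace drift)."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(canonical).hexdigest()

recipe = {
    "vuln_db_snapshot": "2025-11-11T00:00:00Z",
    "rule_versions": {"reachability": "1.4.2", "severity-policy": "0.9.0"},
    "analyzer_versions": {"sbom": "2.3.1", "callgraph": "0.7.5"},
    "image_digest": "sha256:" + "ab" * 32,
    "scan_params": {"timeout_s": 1800, "offline": True},
}
print(graph_revision_id(recipe))  # same recipe -> same ID; any change -> new revision
```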
|
||||
|
||||
---
|
||||
|
||||
## 2) Evidence-first findings (not alerts-first)
|
||||
|
||||
**Claim:** Every finding comes with a *proof bundle*.
|
||||
|
||||
Most tools do: `CVE exists in dependency tree → alert`.
|
||||
Reachability tools do: `CVE reachable? → alert`.
|
||||
Stella Ops can do: `CVE reachable + here’s the exact path + here’s why the analysis is sound + here’s the provenance of inputs → evidence`.
|
||||
|
||||
**What “proof” looks like:**
|
||||
|
||||
* Exact dependency coordinates + SBOM excerpt (what is present)
|
||||
* Call chain / data-flow chain / entrypoint mapping (what is used)
|
||||
* Build context: lockfile hashes, compiler flags, platform targets (why this binary includes it)
|
||||
* Constraints: “reachable only if feature flag X is on” (conditional reachability)
|
||||
* Optional runtime corroboration (telemetry or test execution), but not required
|
||||
|
||||
**Practical benefit:** You eliminate “AppSec debates.” Dev teams stop arguing and start fixing because the reasoning is legible and portable.
|
||||
|
||||
---
|
||||
|
||||
## 3) Signed call-graphs and signed SBOMs = tamper-evident integrity
|
||||
|
||||
**Claim:** You can cryptographically attest to *what was analyzed* and *what was concluded*.
|
||||
|
||||
This is the step vendors usually skip because it’s hard and unglamorous—but it’s where regulated orgs and serious supply-chain buyers pay.
|
||||
|
||||
**Stella Ops advantage:**
|
||||
|
||||
* Produce **signed SBOMs**, **signed call-graphs**, and **signed scan attestations**.
|
||||
* Store them in a tamper-evident log (doesn’t need to be blockchain hype—just append-only + verifiable).
|
||||
* When something goes wrong, you can answer: *“Was this artifact scanned? Under what rules? Before the deploy? By whom?”*
|
||||
|
||||
**Moat hook:** You become the “security notary” for builds and deployments.
|
||||
|
||||
---
|
||||
|
||||
## 4) Diff-native security: less noise, faster action
|
||||
|
||||
**Claim:** Stella Ops speaks “diff” as a first-class concept.
|
||||
|
||||
A lot of security pain comes from not knowing what changed.
|
||||
|
||||
**Stella Ops advantage:**
|
||||
|
||||
* Treat every scan as a **delta** from the last known-good state.
|
||||
* Findings are grouped into:
|
||||
|
||||
* **New risk introduced** (code or dependency change)
|
||||
* **Risk removed**
|
||||
* **Same risk, new intel** (CVE severity changed, exploit published)
|
||||
* **Tooling change** (rule update caused reclassification) — explicitly labeled
|
||||
|
||||
**Result:** Teams stop chasing churn. You reduce alert fatigue without hiding risk.
|
||||
|
||||
---
|
||||
|
||||
## 5) Air-gap and sovereign-mode as a *design center*, not an afterthought
|
||||
|
||||
**Claim:** “Offline replay” is a feature, not a limitation.
|
||||
|
||||
Most cloud security tooling assumes internet connectivity, cloud control-plane access, and continuous updates. Some customers can’t do that.
|
||||
|
||||
**Stella Ops advantage:**
|
||||
|
||||
* Run fully offline: pinned feeds, mirrored registries, packaged analyzers.
|
||||
* Export/import “scan capsules” that include all artifacts needed for verification.
|
||||
* Deterministic scoring works even without live exploit intel.
|
||||
|
||||
**Moat hook:** This unlocks defense, healthcare, critical infrastructure, and M&A diligence use cases that SaaS-first vendors struggle with.
|
||||
|
||||
---
|
||||
|
||||
## 6) Priority scoring that is stable *and* configurable
|
||||
|
||||
**Claim:** You can separate “risk facts” from “risk policy.”
|
||||
|
||||
Most tools blend:
|
||||
|
||||
* facts (is it reachable? what’s the CVSS? is there an exploit?)
|
||||
* policy (what your org considers urgent)
|
||||
* and sometimes vendor-secret sauce
|
||||
|
||||
**Stella Ops advantage:**
|
||||
|
||||
* Output **two layers**:
|
||||
|
||||
1. **Deterministic fact layer** (reachable path, attack surface, blast radius)
|
||||
2. **Policy layer** (your org’s thresholds, compensating controls, deadlines)
|
||||
* Scoring becomes replayable and explainable.
|
||||
|
||||
**Result:** You can say “this is why we deferred this CVE” with credible, auditable logic.
|
||||
|
||||
---
|
||||
|
||||
## 7) “Code-to-cloud” without hand-waving (but with boundaries)
|
||||
|
||||
**Claim:** Stella Ops can unify code reachability with *deployment reachability*.
|
||||
|
||||
Here’s where Wiz/Orca/Prisma play, but often with opaque graph logic. Stella Ops can be the version that’s provable.
|
||||
|
||||
**Stella Ops advantage:**
|
||||
|
||||
* Join three graphs:
|
||||
|
||||
* **Call graph** (code execution)
|
||||
* **Artifact graph** (what shipped where; image → workload → service)
|
||||
* **Exposure graph** (network paths, identity permissions, data access)
|
||||
* The key is not claiming omniscience—**it’s declaring assumptions**:
|
||||
|
||||
* “Reachable from the internet” vs “reachable from VPC” vs “reachable only with role X”
|
||||
|
||||
**Moat hook:** The ability to *prove* your assumptions beats a “security graph” that’s impossible to audit.
|
||||
|
||||
---
|
||||
|
||||
## 8) Extreme developer ergonomics: fix speed as the KPI
|
||||
|
||||
If you want adoption, don’t compete on “most findings.” Compete on **time-to-fix**.
|
||||
|
||||
**Stella Ops advantage:**
|
||||
|
||||
* Every finding includes:
|
||||
|
||||
* exact dependency edge causing inclusion
|
||||
* minimal remediation set (upgrade, replace, feature flag off)
|
||||
* impact analysis (“what breaks if you upgrade?” via API surface diff where possible)
|
||||
* Output is CI-friendly: PR comments, merge gates, and a local CLI that matches CI outputs 1:1.
|
||||
|
||||
**Result:** Devs trust it because it’s consistent between laptop and pipeline.
|
||||
|
||||
---
|
||||
|
||||
## 9) A wedge that vendors can’t easily copy: “Security proofs” as a portable artifact
|
||||
|
||||
This is the strategic framing:
|
||||
|
||||
* Snyk/Wiz/Orca can add more reachability heuristics.
|
||||
* It’s much harder for them to retrofit **reproducible, cryptographically verifiable, offline-replayable** evidence chains across their whole stack without breaking their SaaS economics.
|
||||
|
||||
**Stella Ops advantage:** You’re not just a scanner. You’re a **verification layer** that:
|
||||
|
||||
* can sit above existing tools
|
||||
* can validate their outputs
|
||||
* can produce the audit-grade record customers actually need
|
||||
|
||||
That’s a platform position.
|
||||
|
||||
---
|
||||
|
||||
## What to build first (to make the advantages real fast)
|
||||
|
||||
If you want a sharp first release that screams “Stella Ops”:
|
||||
|
||||
1. **Signed SBOM + signed call-graph** for 1–2 languages you can do extremely well
|
||||
2. **Reachability certificate** per vulnerability (path + provenance + replay recipe)
|
||||
3. **Diff-native scan output** (new/removed/tooling-changed buckets)
|
||||
4. **Offline replay bundle** (“scan capsule”) export/import
|
||||
5. **Policy layer**: deterministic facts + org-defined thresholds
|
||||
|
||||
That combination gives you a story competitors can’t easily match: **auditable security with low noise**.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can also turn this into:
|
||||
|
||||
* a one-page positioning doc (CISO + AppSec versions), or
|
||||
* a “moat metrics” scorecard you can use to benchmark vendors and your MVP.
|
||||
|
||||
Here’s a compact, practical design you can drop into Stella Ops to make findings *provable* and gating *trustable*—no mystery meat.
|
||||
|
||||
---
|
||||
|
||||
# Proof‑linked findings (reachability + receipts)
|
||||
|
||||
**Why:** “Reachable” ≠ just a label; it should ship with cryptographic receipts. Snyk popularized reachability (call‑graph evidence to show a vuln is actually invoked), so let’s mirror the affordance—but back it with proofs. ([docs.snyk.io][1])
|
||||
|
||||
**UI:** for every finding, show a right‑rail “Evidence” drawer with four artifacts:
|
||||
|
||||
1. **SBOM snippet (signed)**
|
||||
|
||||
* Minimal CycloneDX/SPDX slice (component + version + file refs) wrapped as an in‑toto **DSSE** attestation (`application/vnd.in-toto+json`). Verify with cosign. ([in-toto][2])
|
||||
|
||||
2. **Call‑stack slice (reachability)**
|
||||
|
||||
* Small, human‑readable excerpt: entrypoint → vulnerable symbol, with file:line and hash of the static call graph node set. Status pill: `Reachable`, `Potentially reachable`, `Unreachable (suppressed)`. (Snyk’s “reachability” term and behavior reference.) ([docs.snyk.io][1])
|
||||
|
||||
3. **Attestation chain**
|
||||
|
||||
* Show the DSSE envelope summary (subject digest, predicate type) and verification status. Link: “Verify locally” -> `cosign verify-attestation …`. ([Sigstore][3])
|
||||
|
||||
4. **Transparency receipt**
|
||||
|
||||
* Rekor inclusion proof (log index, UUID, checkpoint). Button: “Verify inclusion” -> `rekor-cli verify …`. ([Sigstore][4])
|
||||
|
||||
**One‑click export:**
|
||||
|
||||
* “Export Evidence (.tar.gz)” bundling: SBOM slice, call‑stack JSON, DSSE attestation, Rekor proof JSON. (Helps audits and vendor hand‑offs.)
|
||||
|
||||
**Dev notes:**
|
||||
|
||||
* Attestation predicates: start with SLSA provenance + custom `stellaops.reachability/v1` (symbol list + call‑edges + source hashes). Use DSSE envelopes and publish to Rekor (or your mirror). ([in-toto][2])
|
||||
|
||||
---
|
||||
|
||||
# VEX‑gated policy UX (clear decisions, quick drill‑downs)
|
||||
|
||||
**Why:** VEX exists to state *why* a product *is or isn’t affected*—use it to drive gates, not just annotate. Support CSAF/OpenVEX now. ([OASIS Open Documentation][5])
|
||||
|
||||
**Gate banner (top of finding list / CI run):**
|
||||
|
||||
* Status chip: **Block** | **Allow** | **Needs‑VEX**
|
||||
* **Decision hash**: SHA‑256 over (policy version + inputs’ digests) → deterministic, auditable runs.
|
||||
* Links to inputs: **Scans**, **SBOM**, **Attestations**, **VEX**.
|
||||
* “**Why blocked?**” expands to the exact lattice rule hit + referenced VEX statement (`status: not_affected/affected` with justification). ([first.org][6])
|
||||
|
||||
**Diff‑aware override (with justification):**
|
||||
|
||||
* “Request override” opens a panel pre‑filled with the delta (changed components/paths). Require a **signed justification** (DSSE‑wrapped note + optional time‑boxed TTL). Record to the transparency log (org‑local Rekor mirror is fine). ([Sigstore][4])
|
||||
|
||||
**VEX ingestion:**
|
||||
|
||||
* Accept CSAF VEX and OpenVEX; normalize into a single internal model (product tree ↔ component purls, status + rationale). Red Hat’s guidance is a good structural map. ([redhatproductsecurity.github.io][7])
|
||||
|
||||
---
|
||||
|
||||
# Bare‑minimum schema & API (so your team can build now)
|
||||
|
||||
**Evidence object (per finding)**
|
||||
|
||||
* `sbom_snippet_attestation` (DSSE)
|
||||
* `reachability_proof` { entrypoint, frames[], file_hashes[], graph_digest }
|
||||
* `attestation_chain[]` (DSSE summaries)
|
||||
* `transparency_receipt` { logIndex, uuid, inclusionProof, checkpoint }
|
||||
|
||||
**Gate decision**
|
||||
|
||||
* `decision` enum, `decision_hash`, `policy_version`, `inputs[]` (digests), `rule_id`, `explanation`, `vex_refs[]`
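To make these shapes concrete, here is a minimal .NET record sketch of the evidence object and gate decision. Property names mirror the fields listed above; enum members, nullability, and string-vs-structured choices are illustrative assumptions, not a finalized contract.

```csharp
// Sketch of the evidence / gate-decision shapes above (.NET 10 records).
// Field names follow this section's lists; the rest is an assumption.
using System.Collections.Generic;

public enum GateDecision { Block, Allow, NeedsVex }

public sealed record ReachabilityProof(
    string Entrypoint,
    IReadOnlyList<string> Frames,       // e.g. "Namespace.Type.Method (file:line)"
    IReadOnlyList<string> FileHashes,   // sha256 of files the frames reference
    string GraphDigest);                // digest of the call-graph slice

public sealed record TransparencyReceipt(
    long LogIndex,
    string Uuid,
    string InclusionProof,              // raw proof JSON as returned by the log
    string Checkpoint);

public sealed record EvidenceObject(
    string SbomSnippetAttestation,      // DSSE envelope (JSON or base64)
    ReachabilityProof ReachabilityProof,
    IReadOnlyList<string> AttestationChain,   // DSSE summaries
    TransparencyReceipt TransparencyReceipt);

public sealed record GateDecisionRecord(
    GateDecision Decision,
    string DecisionHash,                // sha256 over policy version + input digests
    string PolicyVersion,
    IReadOnlyList<string> Inputs,       // digests of SBOM, scans, VEX docs
    string RuleId,
    string Explanation,
    IReadOnlyList<string> VexRefs);
```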
|
||||
|
||||
**CLI hooks**
|
||||
|
||||
* `stella verify-evidence <findingId>` → runs `cosign verify-attestation` + `rekor-cli verify` under the hood. ([Sigstore][3])
|
||||
|
||||
---
|
||||
|
||||
# Implementation tips (quick wins)
|
||||
|
||||
* **Start with read‑only proofs:** generate DSSE attestations for today’s SBOM slices and publish to Rekor; wire the Evidence drawer before enforcing gates. ([Sigstore][4])
|
||||
* **Reachability MVP:** static call‑graph for .NET 10 (Roslyn analyzers) capturing symbol‑to‑sink edges; label *Potentially reachable* when edges cross unknown reflection/dynamic boundaries; store the call‑stack slice in the predicate. (UX mirrors Snyk’s concept so devs “get it”.) ([docs.snyk.io][1])
|
||||
* **VEX first class:** parse CSAF/OpenVEX, show the raw “justification” inline on hover, and let gates consume it. ([OASIS Open Documentation][5])
|
||||
* **Make it verifiable offline:** keep a Rekor *mirror* or signed append‑only log bundle for air‑gapped clients; surface inclusion proofs the same way. (Sigstore now even ships public datasets for analysis/mirroring patterns.) ([openssf.org][8])
|
||||
|
||||
If you want, I can turn this into: (1) a .NET 10 DTO/record set, (2) Angular component stubs for the drawer and banner, and (3) a tiny cosign/rekor verification wrapper for your CI.
|
||||
|
||||
[1]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis?utm_source=chatgpt.com "Reachability analysis | Snyk User Docs"
|
||||
[2]: https://in-toto.io/docs/specs/?utm_source=chatgpt.com "Specifications"
|
||||
[3]: https://docs.sigstore.dev/cosign/verifying/attestation/?utm_source=chatgpt.com "In-Toto Attestations"
|
||||
[4]: https://docs.sigstore.dev/logging/overview/?utm_source=chatgpt.com "Rekor"
|
||||
[5]: https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html?utm_source=chatgpt.com "Common Security Advisory Framework Version 2.0 - Index of /"
|
||||
[6]: https://www.first.org/standards/frameworks/psirts/Consolidated-SBOM-VEX-Operational-Framework.pdf?utm_source=chatgpt.com "Consolidated SBOM and CSAF/VEX Operational Framework"
|
||||
[7]: https://redhatproductsecurity.github.io/security-data-guidelines/csaf-vex/?utm_source=chatgpt.com "CSAF/VEX - Red Hat Security Data Guidelines"
|
||||
[8]: https://openssf.org/blog/2025/10/15/announcing-the-sigstore-transparency-log-research-dataset/?utm_source=chatgpt.com "Announcing the Sigstore Transparency Log Research ..."
|
||||
Below are developer-facing guidelines for designing **traceable evidence** in security UX—so every “this is vulnerable / reachable / blocked” claim can be **verified, reproduced, and audited**.
|
||||
|
||||
---
|
||||
|
||||
## 1) Start from a hard rule: *every UI assertion must map to evidence*
|
||||
|
||||
Define a small set of “claim types” your UI will ever display (examples):
|
||||
|
||||
* “Component X@version Y is present”
|
||||
* “CVE-… matches component X”
|
||||
* “CVE-… is reachable from entrypoint E”
|
||||
* “This build was produced by pipeline P from commit C”
|
||||
* “Gate blocked because policy R + VEX says affected”
|
||||
|
||||
For each claim type, require a **minimum evidence bundle** (see section 6). Don’t ship a UI label that can’t be backed by artifacts + a verifier.
|
||||
|
||||
---
|
||||
|
||||
## 2) Bind evidence to immutable subjects (digests first, names second)
|
||||
|
||||
Traceability collapses if identifiers drift.
|
||||
|
||||
**Do:**
|
||||
|
||||
* Identify the “subject” using content digests (e.g., `sha256`) and stable package identifiers (purl, coordinates).
|
||||
* Keep **component names/versions as metadata**, not the primary key.
|
||||
* Track “subject sets” (multi-arch images, multi-file builds) explicitly.
|
||||
|
||||
This matches the supply-chain attestation model where a statement binds evidence to a particular subject. ([GitHub][1])
|
||||
|
||||
---
|
||||
|
||||
## 3) Use a standard evidence envelope (in‑toto Statement + DSSE)
|
||||
|
||||
Don’t invent your own signing format. Use:
|
||||
|
||||
* **in‑toto Statement v1** as the inner statement (subject + predicateType + predicate). ([GitHub][1])
|
||||
* **DSSE** as the signing envelope. ([GitHub][2])
|
||||
|
||||
Minimal shape:
|
||||
|
||||
```json
|
||||
// DSSE envelope (outer)
|
||||
{
|
||||
"payloadType": "application/vnd.in-toto+json",
|
||||
"payload": "<base64(in-toto Statement)>",
|
||||
"signatures": [{ "sig": "<base64(signature)>" }]
|
||||
}
|
||||
```
|
||||
|
||||
Sigstore bundles expect this payload type and an in‑toto statement in the payload. ([Sigstore][3])
|
||||
|
||||
For build provenance, prefer the **SLSA provenance predicate** (`predicateType: https://slsa.dev/provenance/v1`). ([SLSA][4])
|
||||
|
||||
---
|
||||
|
||||
## 4) Make verification a first-class UX action (not a hidden “trust me”)
|
||||
|
||||
In the UI, every claim should have:
|
||||
|
||||
* **Verification status**: `Verified`, `Unverified`, `Failed verification`, `Expired/Outdated`
|
||||
* **Verifier details**: who signed, what policy verified, what log entry proves transparency
|
||||
* A **“Verify locally”** copy button with exact commands (developers love this)
|
||||
|
||||
Example direction (image attestations):
|
||||
|
||||
* `cosign verify-attestation …` is explicitly designed to verify attestations and check transparency log claims. ([GitHub][5])
|
||||
* Use transparency inclusion verification as part of the “Verified” state. ([Sigstore][6])
|
||||
|
||||
**UX tip:** Default to a friendly summary (“Signed by CI key from Org X; logged to transparency; subject sha256:…”) and progressively disclose raw JSON.
|
||||
|
||||
---
|
||||
|
||||
## 5) Prefer “receipts” over screenshots: transparency logs + bundles
|
||||
|
||||
Traceable evidence is strongest when it’s:
|
||||
|
||||
* **Signed** (authenticity + integrity)
|
||||
* **Publicly/append-only recorded** (non-repudiation)
|
||||
* **Exportable** (audits, incident response, vendor escalation)
|
||||
|
||||
If you use Sigstore:
|
||||
|
||||
* Publish to **Rekor** and store/ship a **Sigstore bundle** that includes the DSSE envelope and log proofs. ([Sigstore][7])
|
||||
* In UX, show: log index/UUID + “inclusion proof present”.
|
||||
|
||||
---
|
||||
|
||||
## 6) Define minimum evidence bundles per feature (practical templates)
|
||||
|
||||
### A) “Component is present”
|
||||
|
||||
Minimum evidence:
|
||||
|
||||
* SBOM fragment (SPDX/CycloneDX) that includes the component identity and where it came from.
|
||||
|
||||
* SPDX 3.x explicitly models SBOM as a collection describing a package/system. ([SPDX][8])
|
||||
* Signed attestation for the SBOM artifact.
|
||||
|
||||
### B) “Vulnerability match”
|
||||
|
||||
Minimum evidence:
|
||||
|
||||
* The matching rule details (CPE/purl/range) + scanner identity/version
|
||||
* Signed vulnerability report attestation (or signed scan output)
|
||||
|
||||
### C) “Reachable vulnerability”
|
||||
|
||||
Minimum evidence:
|
||||
|
||||
* A **call path**: entrypoint → frames → vulnerable symbol
|
||||
* A hash/digest of the call graph slice (so the path is tamper-evident)
|
||||
* Tool info + limitations (reflection/dynamic dispatch uncertainty)
|
||||
|
||||
This mirrors how reachability analysis is typically explained: build a call graph and determine whether any execution path from an entrypoint actually reaches the vulnerable function. ([Snyk User Docs][9])
|
||||
|
||||
### D) “Not affected” via VEX
|
||||
|
||||
Minimum evidence:
|
||||
|
||||
* The VEX statement (OpenVEX/CSAF) + signer
|
||||
* **Justification** for `not_affected` (OpenVEX requires justification or an impact statement for not_affected). ([GitHub][10])
|
||||
* If using CSAF VEX, include `product_status` and related required fields. ([docs.oasis-open.org][11])
|
||||
* Align to minimum requirements guidance (CISA). ([CISA][12])
|
||||
|
||||
### E) “Gate decision: blocked/allowed”
|
||||
|
||||
Minimum evidence:
|
||||
|
||||
* Inputs digests (SBOM digest, scan attestation digests, VEX doc digests)
|
||||
* Policy version + rule id
|
||||
* A deterministic **decision hash** over (policy + input digests)
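A minimal sketch of one way to compute that decision hash deterministically: sort the input digests and hash a canonical string. The exact canonicalization (newline-joined, ordinal-sorted) is an assumption; any stable rule works as long as it is recorded alongside the policy version.

```csharp
// Sketch: deterministic decision hash over (policy version + rule id + input digests).
// The canonical string format here is an assumption, not a mandated layout.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class DecisionHash
{
    public static string Compute(string policyVersion, string ruleId,
                                 IEnumerable<string> inputDigests)
    {
        // Sort digests so the hash does not depend on input ordering.
        var canonical = policyVersion + "\n" + ruleId + "\n" +
            string.Join("\n", inputDigests.OrderBy(d => d, StringComparer.Ordinal));

        var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(bytes).ToLowerInvariant();
    }
}
```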
|
||||
|
||||
**UX:** Let users open a “Decision details” panel that shows exactly which VEX statement and which rule caused the block.
|
||||
|
||||
---
|
||||
|
||||
## 7) Build evidence UX around progressive disclosure + copyability
|
||||
|
||||
Recommended layout patterns:
|
||||
|
||||
* **Finding header:** severity + status + “Verified” badge
|
||||
* **Evidence drawer (right panel):**
|
||||
|
||||
1. Human summary (“why you should care”)
|
||||
2. Evidence list (SBOM snippet, reachability path, VEX statement)
|
||||
3. Verification section (who signed, transparency receipt)
|
||||
4. Raw artifacts (download / copy JSON)
|
||||
|
||||
**Avoid:** forcing users to leave the app to “trust” you. Provide the artifacts and verification steps inline.
|
||||
|
||||
---
|
||||
|
||||
## 8) Handle uncertainty explicitly (don’t overclaim)
|
||||
|
||||
Reachability and exploitability often have gray areas.
|
||||
|
||||
* Use a three-state model: `Reachable`, `Potentially reachable`, `Not reachable (with reason)`.
|
||||
* Make the reason machine-readable (so policies can use it) and human-readable (so devs accept it).
|
||||
* If the analysis is approximate (reflection, native calls), show “Why uncertain” and what would tighten it (e.g., runtime trace, config constraints).
|
||||
|
||||
---
|
||||
|
||||
## 9) Security & privacy: evidence is sensitive
|
||||
|
||||
Evidence can leak:
|
||||
|
||||
* internal source paths
|
||||
* dependency structure
|
||||
* environment details
|
||||
* user data
|
||||
|
||||
Guidelines:
|
||||
|
||||
* **Minimize**: store only the smallest slice needed (e.g., call-stack slice, not whole graph).
|
||||
* **Redact**: secrets, usernames, absolute paths; replace with stable file hashes.
|
||||
* **Access-control**: evidence visibility should follow least privilege; treat it like production logs.
|
||||
* **Retention**: use TTL for volatile evidence; keep signed receipts longer.
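One way to implement the "replace absolute paths with stable file hashes" rule is sketched below; the regex and the `path:<sha256-prefix>` placeholder format are illustrative assumptions, not a prescribed scheme.

```csharp
// Sketch: redact absolute paths in evidence text by replacing them with stable hashes,
// so the same path always redacts to the same token without leaking the path itself.
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

public static class EvidenceRedactor
{
    // Matches Unix-style absolute paths; extend for Windows paths as needed.
    private static readonly Regex AbsolutePath = new(@"(/[\w.\-]+)+", RegexOptions.Compiled);

    public static string RedactPaths(string evidenceText) =>
        AbsolutePath.Replace(evidenceText, m =>
        {
            var digest = SHA256.HashData(Encoding.UTF8.GetBytes(m.Value));
            return $"path:{Convert.ToHexString(digest)[..12].ToLowerInvariant()}";
        });
}
```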
|
||||
|
||||
---
|
||||
|
||||
## 10) Developer checklist (ship-ready)
|
||||
|
||||
Before you ship a “traceable evidence” feature, verify you have:
|
||||
|
||||
* [ ] Stable subject identifiers (digests + purls)
|
||||
* [ ] Standard envelope (in‑toto Statement + DSSE) ([GitHub][2])
|
||||
* [ ] Provenance attestation (SLSA provenance where applicable) ([SLSA][4])
|
||||
* [ ] Transparency receipt (Rekor/bundle) and UX that surfaces it ([Sigstore][7])
|
||||
* [ ] “Verify locally” commands in UI (cosign verify-attestation, etc.) ([GitHub][5])
|
||||
* [ ] VEX ingestion + justification handling for `not_affected` ([GitHub][10])
|
||||
* [ ] Clear uncertainty states (esp. reachability) ([Snyk User Docs][9])
|
||||
* [ ] Exportable evidence bundle (for audits/incidents)
|
||||
|
||||
---
|
||||
|
||||
If you tell me your stack (e.g., .NET 10 + Angular, or Go + React) and where you store artifacts (OCI registry, S3, etc.), I can propose a concrete evidence object schema + UI component contract that fits your architecture.
|
||||
|
||||
[1]: https://github.com/in-toto/attestation/blob/main/spec/v1/statement.md?utm_source=chatgpt.com "attestation/spec/v1/statement.md at main · in-toto/attestation"
|
||||
[2]: https://github.com/secure-systems-lab/dsse?utm_source=chatgpt.com "DSSE: Dead Simple Signing Envelope"
|
||||
[3]: https://docs.sigstore.dev/about/bundle/?utm_source=chatgpt.com "Sigstore Bundle Format"
|
||||
[4]: https://slsa.dev/spec/v1.0/provenance?utm_source=chatgpt.com "SLSA • Provenance"
|
||||
[5]: https://github.com/sigstore/cosign/blob/main/doc/cosign_verify-attestation.md?utm_source=chatgpt.com "cosign/doc/cosign_verify-attestation.md at main"
|
||||
[6]: https://docs.sigstore.dev/quickstart/quickstart-cosign/?utm_source=chatgpt.com "Sigstore Quickstart with Cosign"
|
||||
[7]: https://docs.sigstore.dev/logging/overview/?utm_source=chatgpt.com "Rekor"
|
||||
[8]: https://spdx.dev/wp-content/uploads/sites/31/2024/12/SPDX-3.0.1-1.pdf?utm_source=chatgpt.com "SPDX© Specification v3.0.1"
|
||||
[9]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis?utm_source=chatgpt.com "Reachability analysis | Snyk User Docs"
|
||||
[10]: https://github.com/openvex/spec/blob/main/OPENVEX-SPEC.md?utm_source=chatgpt.com "spec/OPENVEX-SPEC.md at main"
|
||||
[11]: https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html?utm_source=chatgpt.com "Common Security Advisory Framework Version 2.0 - Index of /"
|
||||
[12]: https://www.cisa.gov/sites/default/files/2023-04/minimum-requirements-for-vex-508c.pdf?utm_source=chatgpt.com "minimum-requirements-for-vex-508c.pdf"
|
||||
|
||||
Here’s a simple, *actionable* way to keep “unknowns” from piling up in Stella Ops: rank them by **how risky they might be** and **how widely they could spread**—then let Scheduler auto‑recheck or escalate based on that score.
|
||||
|
||||
---
|
||||
|
||||
# Unknowns Triage: a lightweight, high‑leverage scheme
|
||||
|
||||
**Goal:** decide which “Unknown” findings (no proof yet; inconclusive reachability; unparsed advisory; mismatched version; missing evidence) to re‑scan first or route into VEX escalation—without waiting for perfect certainty.
|
||||
|
||||
## 1) Define the score
|
||||
|
||||
Score each Unknown `U` with a weighted sum (normalize each input to 0–1):
|
||||
|
||||
* **Component popularity (P):** how many distinct workloads/images depend on this package (direct + transitive).
|
||||
*Proxy:* in‑degree or deployment count across environments.
|
||||
* **CVSS uncertainty (C):** how fuzzy the risk is (e.g., missing vector, version ranges like `<=`, vendor ambiguity).
|
||||
*Proxy:* 1 − certainty; higher = less certain, more dangerous to ignore.
|
||||
* **Graph centrality (G):** how “hub‑like” the component is in your dependency graph.
|
||||
*Proxy:* normalized betweenness/degree centrality in your SBOM DAG.
|
||||
|
||||
**TriageScore(U) = wP·P + wC·C + wG·G**, with default weights: `wP=0.4, wC=0.35, wG=0.25`.
|
||||
|
||||
**Thresholds (tuneable):**
|
||||
|
||||
* `≥ 0.70` → **Hot**: immediate rescan + VEX escalation job
|
||||
* `0.40–0.69` → **Warm**: schedule rescan within 24–48h
|
||||
* `< 0.40` → **Cold**: batch into weekly sweep
|
||||
|
||||
## 2) Minimal schema (Postgres or Mongo) to support it
|
||||
|
||||
* `unknowns(id, pkg_id, version, source, first_seen, last_seen, certainty, evidence_hash, status)`
|
||||
* `deploy_refs(pkg_id, image_id, env, first_seen, last_seen)` → compute **popularity P**
|
||||
* `graph_metrics(pkg_id, degree_c, betweenness_c, last_calc_at)` → compute **centrality G**
|
||||
* `advisory_gaps(pkg_id, missing_fields[], has_range_version, vendor_mismatch)` → compute **uncertainty C**
|
||||
|
||||
> Store `triage_score`, `triage_band` on write so Scheduler can act without recomputing everything.
|
||||
|
||||
## 3) Fast heuristics to fill inputs
|
||||
|
||||
* **P (popularity):** `P = min(1, log10(1 + deployments)/log10(1 + 100))`
|
||||
* **C (uncertainty):** start at 0; +0.3 if version range, +0.2 if vendor mismatch, +0.2 if missing CVSS vector, +0.2 if evidence stale (>7d), cap at 1.0
|
||||
* **G (centrality):** precompute on SBOM DAG nightly; normalize to [0,1]
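These heuristics are cheap to compute inline. The sketch below mirrors the P and C rules above (the constants are copied from this section); representing the uncertainty inputs as booleans is an implementation assumption. Combine the results with the Score helper in the reference snippets below.

```csharp
// Sketch of the P/C heuristics above. Constants mirror this section's defaults.
using System;

public static class TriageInputs
{
    // P: popularity, log-scaled and capped at 100 deployments.
    public static double Popularity(int deployments) =>
        Math.Min(1.0, Math.Log10(1 + deployments) / Math.Log10(1 + 100));

    // C: uncertainty, accumulated from explicit flags and capped at 1.0.
    public static double Uncertainty(bool versionRange, bool vendorMismatch,
                                     bool missingCvssVector, bool staleEvidence) =>
        Math.Min(1.0,
            (versionRange ? 0.3 : 0.0) +
            (vendorMismatch ? 0.2 : 0.0) +
            (missingCvssVector ? 0.2 : 0.0) +
            (staleEvidence ? 0.2 : 0.0));
}
```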
|
||||
|
||||
## 4) Scheduler rules (UnknownsRegistry → jobs)
|
||||
|
||||
* On `unknowns.upsert`:
|
||||
|
||||
* compute (P,C,G) → `triage_score`
|
||||
* if **Hot** → enqueue:
|
||||
|
||||
* **Deterministic rescan** (fresh feeds + strict lattice)
|
||||
* **VEX escalation** (Excititor) with context pack (SBOM slice, provenance, last evidence)
|
||||
* if **Warm** → enqueue rescan with jitter (spread load)
|
||||
* if **Cold** → tag for weekly batch
|
||||
* Backoff: if the same Unknown stays **Hot** after N attempts, widen evidence (alternate feeds, secondary matcher, vendor OVAL, NVD mirror) and alert.
|
||||
|
||||
## 5) Operator‑visible UX (DevOps‑friendly)
|
||||
|
||||
* Unknowns list: columns = pkg@ver, deployments, centrality, uncertainty flags, last evidence age, **score badge** (Hot/Warm/Cold), **Next action** chip.
|
||||
* Side panel: show *why* the score is high (P/C/G sub‑scores) + scheduled jobs and last outcomes.
|
||||
* Bulk actions: “Recompute scores”, “Force VEX escalation”, “De‑dupe aliases”.
|
||||
|
||||
## 6) Guardrails to keep it deterministic
|
||||
|
||||
* Record the **inputs + weights + feed hashes** in the scan manifest (your “replay” object).
|
||||
* Any change to weights or heuristics → new policy version in the manifest; old runs remain replayable.
|
||||
|
||||
## 7) Reference snippets
|
||||
|
||||
**SQL (Postgres) — compute and persist score:**
|
||||
|
||||
```sql
|
||||
update unknowns u
|
||||
set triage_score = least(1, 0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g),
|
||||
triage_band = case
|
||||
when (0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g) >= 0.70 then 'HOT'
|
||||
when (0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g) >= 0.40 then 'WARM'
|
||||
else 'COLD'
|
||||
end,
|
||||
last_scored_at = now()
|
||||
where u.status = 'OPEN';
|
||||
```
|
||||
|
||||
**C# (Common) — score helper:**
|
||||
|
||||
```csharp
|
||||
public static (double score, string band) Score(double p, double c, double g,
|
||||
double wP=0.4, double wC=0.35, double wG=0.25)
|
||||
{
|
||||
var s = Math.Min(1.0, wP*p + wC*c + wG*g);
|
||||
var band = s >= 0.70 ? "HOT" : s >= 0.40 ? "WARM" : "COLD";
|
||||
return (s, band);
|
||||
}
|
||||
```
|
||||
|
||||
## 8) Where this plugs into Stella Ops
|
||||
|
||||
* **Scanner.WebService**: writes Unknowns with raw flags (range‑version, vector missing, vendor mismatch).
|
||||
* **UnknownsRegistry**: computes P/C/G, persists triage fields, emits `Unknown.Triaged`.
|
||||
* **Scheduler**: listens → enqueues **Rescan** / **VEX Escalation** with jitter/backoff.
|
||||
* **Excititor (VEX)**: builds vendor‑merge proof or raises “Unresolvable” with rationale.
|
||||
* **Authority**: records policy version + weights in replay manifest.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can drop in a ready‑to‑use `UnknownsRegistry` table DDL + EF Core 9 model and a tiny Scheduler job that implements these thresholds.
|
||||
Below is a complete, production-grade **developer guideline for Ranking Unknowns in Reachability Graphs** inside **Stella Ops**.
|
||||
It fits the existing architectural rules (scanner = origin of truth, Concelier/Vexer = prune-preservers, Authority = replay manifest owner, Scheduler = executor).
|
||||
|
||||
These guidelines give:
|
||||
|
||||
1. Definitions
|
||||
2. Ranking dimensions
|
||||
3. Deterministic scoring formula
|
||||
4. Evidence capture
|
||||
5. Scheduler policies
|
||||
6. UX and API rules
|
||||
7. Testing rules and golden fixtures
|
||||
|
||||
---
|
||||
|
||||
# Stella Ops Developer Guidelines
|
||||
|
||||
# Ranking Unknowns in Reachability Graphs
|
||||
|
||||
## 0. Purpose
|
||||
|
||||
An **Unknown** is any vulnerability-like record where **reachability**, **affectability**, or **evidence linkage** cannot yet be proved true or false.
|
||||
We rank Unknowns to:
|
||||
|
||||
1. Prioritize rescans
|
||||
2. Trigger VEX escalation
|
||||
3. Guide operators in constrained time windows
|
||||
4. Maintain deterministic behaviour under replay manifests
|
||||
5. Avoid non-deterministic or “probabilistic” security decisions
|
||||
|
||||
Unknown ranking **never declares security state**.
|
||||
It determines **the order of proof acquisition**.
|
||||
|
||||
---
|
||||
|
||||
# 1. Formal Definition of “Unknown”
|
||||
|
||||
A record is classified as **Unknown** if one or more of the following is true:
|
||||
|
||||
1. **Dependency Reachability Unproven**
|
||||
|
||||
* Graph traversal exists but is not validated by call-graph/rule-graph evidence.
|
||||
* Downstream node is reachable but no execution path has sufficient evidence.
|
||||
|
||||
2. **Version Semantics Uncertain**
|
||||
|
||||
* Advisory reports `<=`, `<`, `>=`, version ranges, or ambiguous pseudo-versions.
|
||||
* Normalized version mapping disagrees between data sources.
|
||||
|
||||
3. **Component Provenance Uncertain**
|
||||
|
||||
* Package cannot be deterministically linked to its SBOM node (name-alias confusion, epoch mismatch, distro backport case).
|
||||
|
||||
4. **Missing/Contradictory Evidence**
|
||||
|
||||
* Feeds disagree; Vendor VEX differs from NVD; OSS index has missing CVSS vector; environment evidence incomplete.
|
||||
|
||||
5. **Weak Graph Anchoring**
|
||||
|
||||
* Node exists but cannot be anchored to a layer digest or artifact hash (common in scratch/base images and badly packaged libs).
|
||||
|
||||
Unknowns **must be stored with explicit flags**—not as a collapsed bucket.
|
||||
|
||||
---
|
||||
|
||||
# 2. Dimensions for Ranking Unknowns
|
||||
|
||||
Each Unknown is ranked along **five deterministic axes**:
|
||||
|
||||
### 2.1 Popularity Impact (P)
|
||||
|
||||
How broadly the component is used across workloads.
|
||||
|
||||
Evidence sources:
|
||||
|
||||
* SBOM deployment graph
|
||||
* Workload registry
|
||||
* Layer-to-package index
|
||||
|
||||
Compute:
|
||||
`P = normalized log(deployment_count)`.
|
||||
|
||||
### 2.2 Exploit Consequence Potential (E)
|
||||
|
||||
Not risk. Consequence if the Unknown turns out to be an actual vulnerability.
|
||||
|
||||
Compute from:
|
||||
|
||||
* Maximum CVSS across feeds
|
||||
* CWE category weight
|
||||
* Vendor “criticality marker” if present
|
||||
* If CVSS missing → use CWE fallback → mark uncertainty penalty.
|
||||
|
||||
### 2.3 Uncertainty Density (U)
|
||||
|
||||
How much is missing or contradictory.
|
||||
|
||||
Flags (examples):
|
||||
|
||||
* version_range → +0.25
|
||||
* missing_vector → +0.15
|
||||
* conflicting_feeds → +0.20
|
||||
* no_provenance_anchor → +0.30
|
||||
* unreachable_source_advisory → +0.10
|
||||
|
||||
U ∈ [0, 1].
|
||||
|
||||
### 2.4 Graph Centrality (C)
|
||||
|
||||
Is this component a structural hub?
|
||||
|
||||
Use:
|
||||
|
||||
* In-degree
|
||||
* Out-degree
|
||||
* Betweenness centrality
|
||||
|
||||
Normalize per artifact type.
|
||||
|
||||
### 2.5 Evidence Staleness (S)
|
||||
|
||||
Age of last successful evidence pull.
|
||||
|
||||
Decay function:
|
||||
`S = min(1, age_days / 14)`.
|
||||
|
||||
---
|
||||
|
||||
# 3. Deterministic Ranking Score
|
||||
|
||||
All Unknowns get a reproducible score under replay manifest:
|
||||
|
||||
```
|
||||
Score = clamp01(
|
||||
wP·P +
|
||||
wE·E +
|
||||
wU·U +
|
||||
wC·C +
|
||||
wS·S
|
||||
)
|
||||
```
|
||||
|
||||
Default recommended weights:
|
||||
|
||||
```
|
||||
wP = 0.25 (deployment impact)
|
||||
wE = 0.25 (potential consequence)
|
||||
wU = 0.25 (uncertainty density)
|
||||
wC = 0.15 (graph centrality)
|
||||
wS = 0.10 (evidence staleness)
|
||||
```
|
||||
|
||||
The manifest must record:
|
||||
|
||||
* weights
|
||||
* transform functions
|
||||
* normalization rules
|
||||
* feed hashes
|
||||
* evidence hashes
|
||||
|
||||
Thus the ranking is replayable bit-for-bit.
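A minimal sketch of a replayable scorer that takes its weights from the manifest rather than hard-coding them; the record shapes are assumptions, the point is that identical manifest-declared parameters and identical axis inputs always reproduce the same score and band.

```csharp
// Sketch: five-axis score with weights supplied by the replay manifest.
using System;

public sealed record RankingWeights(double P, double E, double U, double C, double S);

public sealed record UnknownAxes(double Popularity, double Consequence,
                                 double Uncertainty, double Centrality, double Staleness);

public static class UnknownRanker
{
    public static (double Score, string Band) Rank(UnknownAxes a, RankingWeights w)
    {
        var score = Math.Clamp(
            w.P * a.Popularity +
            w.E * a.Consequence +
            w.U * a.Uncertainty +
            w.C * a.Centrality +
            w.S * a.Staleness,
            0.0, 1.0);

        var band = score >= 0.70 ? "HOT" : score >= 0.40 ? "WARM" : "COLD";
        return (score, band);
    }
}
```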
|
||||
|
||||
---
|
||||
|
||||
# 4. Ranking Bands
|
||||
|
||||
After computing Score:
|
||||
|
||||
* **Hot (Score ≥ 0.70)**
|
||||
Immediate rescan, VEX escalation, widen evidence sources.
|
||||
|
||||
* **Warm (0.40 ≤ Score < 0.70)**
|
||||
Scheduled rescan, no escalation yet.
|
||||
|
||||
* **Cold (Score < 0.40)**
|
||||
Batch weekly; suppressed from UI noise except on request.
|
||||
|
||||
Band assignment must be stored explicitly.
|
||||
|
||||
---
|
||||
|
||||
# 5. Evidence Capture Requirements
|
||||
|
||||
Every Unknown must persist:
|
||||
|
||||
1. **UnknownFlags[]** – all uncertainty flags
|
||||
2. **GraphSliceHash** – deterministic hash of dependents/ancestors
|
||||
3. **EvidenceSetHash** – hashes of advisories, vendor VEXes, feed extracts
|
||||
4. **NormalizationTrace** – version normalization decision path
|
||||
5. **CallGraphAttemptHash** – even if incomplete
|
||||
6. **PackageMatchTrace** – exact match reasoning (name, epoch, distro backport heuristics)
|
||||
|
||||
This allows Inspector/Authority to replay everything and prevents “ghost Unknowns” caused by environment drift.
|
||||
|
||||
---
|
||||
|
||||
# 6. Scheduler Policies
|
||||
|
||||
### 6.1 On Unknown Created
|
||||
|
||||
Scheduler receives event: `Unknown.Created`.
|
||||
|
||||
Decision matrix:
|
||||
|
||||
| Condition       | Action                                |
| --------------- | ------------------------------------- |
| Score ≥ 0.70    | Immediate Rescan + VEX Escalation job |
| Score 0.40–0.69 | Queue rescan within 12–72h (jitter)   |
| Score < 0.40    | Add to weekly batch                   |
|
||||
|
||||
### 6.2 On Unknown Unchanged after N rescans
|
||||
|
||||
If N = 3 consecutive runs with same UnknownFlags:
|
||||
|
||||
* Force alternate feeds (mirror, vendor direct)
|
||||
* Run Excititor (VEX) with the full provenance pack
|
||||
* If still unresolved → emit `Unknown.Unresolvable` event (not an error; a state)
|
||||
|
||||
### 6.3 Failure Recovery
|
||||
|
||||
If fetch/feed errors → Unknown transitions to `Unknown.EvidenceFailed`.
|
||||
This must raise S (staleness) on next compute.
|
||||
|
||||
---
|
||||
|
||||
# 7. Scanner Implementation Guidelines (.NET 10)
|
||||
|
||||
### 7.1 Ranking Computation Location
|
||||
|
||||
Ranking is computed inside **scanner.webservice** immediately after Unknown classification.
|
||||
Concelier/Vexer must **not** touch ranking logic.
|
||||
|
||||
### 7.2 Graph Metrics Service
|
||||
|
||||
Maintain a cached daily calculation of centrality metrics to prevent per-scan recomputation cost explosion.
|
||||
|
||||
### 7.3 Compute Path
|
||||
|
||||
```
|
||||
1. Build evidence set
|
||||
2. Classify UnknownFlags
|
||||
3. Compute P, E, U, C, S
|
||||
4. Compute Score
|
||||
5. Assign Band
|
||||
6. Persist UnknownRecord
|
||||
7. Emit Unknown.Triaged event
|
||||
```
|
||||
|
||||
### 7.4 Storage Schema (Postgres)
|
||||
|
||||
Fields required:
|
||||
|
||||
```
|
||||
unknown_id PK
|
||||
pkg_id
|
||||
pkg_version
|
||||
digest_anchor
|
||||
unknown_flags jsonb
|
||||
popularity_p float
|
||||
potential_e float
|
||||
uncertainty_u float
|
||||
centrality_c float
|
||||
staleness_s float
|
||||
score float
|
||||
band enum
|
||||
graph_slice_hash bytea
|
||||
evidence_set_hash bytea
|
||||
normalization_trace jsonb
|
||||
callgraph_attempt_hash bytea
|
||||
created_at, updated_at
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 8. API and UX Guidelines
|
||||
|
||||
### 8.1 Operator UI
|
||||
|
||||
For every Unknown:
|
||||
|
||||
* Score badge (Hot/Warm/Cold)
|
||||
* Sub-component contributions (P/E/U/C/S)
|
||||
* Flags list
|
||||
* Evidence age
|
||||
* Scheduled next action
|
||||
* History graph of score evolution
|
||||
|
||||
### 8.2 Filters
|
||||
|
||||
Operators may filter by:
|
||||
|
||||
* High P (impactful components)
|
||||
* High U (ambiguous advisories)
|
||||
* High S (stale data)
|
||||
* High C (graph hubs)
|
||||
|
||||
### 8.3 Reasoning Transparency
|
||||
|
||||
UI must show *exactly why* the ranking is high. No hidden heuristics.
|
||||
|
||||
---
|
||||
|
||||
# 9. Unit Testing & Golden Fixtures
|
||||
|
||||
### 9.1 Golden Unknown Cases
|
||||
|
||||
Provide frozen fixtures for:
|
||||
|
||||
* Version range ambiguity
|
||||
* Mismatched epoch/backport
|
||||
* Missing vector
|
||||
* Conflicting severity between vendor/NVD
|
||||
* Unanchored filesystem library
|
||||
|
||||
Each fixture stores expected:
|
||||
|
||||
* Flags
|
||||
* P/E/U/C/S
|
||||
* Score
|
||||
* Band
|
||||
|
||||
### 9.2 Replay Manifest Tests
|
||||
|
||||
Given a manifest containing:
|
||||
|
||||
* feed hashes
|
||||
* rules version
|
||||
* normalization logic
|
||||
* lattice rules (for overall system)
|
||||
|
||||
Ensure ranking recomputes identically.
|
||||
|
||||
---
|
||||
|
||||
# 10. Developer Checklist (must be followed)
|
||||
|
||||
1. Did I persist all traces needed for deterministic replay?
|
||||
2. Does ranking depend only on manifest-declared parameters (not environment)?
|
||||
3. Are all uncertainty factors explicit flags, never inferred fuzzily?
|
||||
4. Is the scoring reproducible under identical inputs?
|
||||
5. Is Scheduler decision table deterministic and exhaustively tested?
|
||||
6. Does API expose full reasoning without hiding rules?
|
||||
|
||||
---
|
||||
|
||||
If you want, I can now produce:
|
||||
|
||||
1. **A full Postgres DDL** for Unknowns.
|
||||
2. **A .NET 10 service class** for ranking calculation.
|
||||
3. **A golden test suite** with 20 fixtures.
|
||||
4. **UI wireframe** for Unknown triage screen.
|
||||
|
||||
Which one should I generate?
|
||||
|
||||
Here’s a crisp, practical game plan to take your SBOM/VEX pipeline from “SBOM‑only” → “VEX‑ready” → “signed, provable evidence graph” with Rekor inclusion‑proof checks—plus an explainability track you can ship alongside it.
|
||||
|
||||
---
|
||||
|
||||
# 1) Freeze on SBOM specs (CycloneDX + SPDX)
|
||||
|
||||
* **Adopt two inputs only:** CycloneDX v1.6 (Ecma ECMA‑424) and SPDX 3.0.1. Lock parsers and schemas; reject anything else at ingest. ([Ecma International][1])
|
||||
* **Scope notes:** CycloneDX covers software/hardware/ML/config & provenance; SPDX 3.x brings a richer, granular data model. ([CycloneDX][2])
|
||||
|
||||
**Action:**
|
||||
|
||||
* `Sbomer.Ingest` accepts `*.cdx.json` and `*.spdx.json` only.
|
||||
* Validate against ECMA‑424 (CycloneDX 1.6) and SPDX 3.0.1 canonical docs before storage. ([Ecma International][1])
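A minimal sketch of the ingest gate is below. Real validation should run the official CycloneDX 1.6 and SPDX 3.0.1 schemas; this pre-check only rejects obviously wrong inputs. The CycloneDX `bomFormat`/`specVersion` fields are standard top-level JSON fields, while the SPDX detection via the JSON-LD `@context` string is a simplifying assumption.

```csharp
// Sketch of an ingest format gate: hard-fail anything that is not
// CycloneDX 1.6 or SPDX 3.0.1 before full schema validation runs.
using System.Text.Json;

public static class IngestGate
{
    public static bool IsAccepted(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;

        // CycloneDX JSON carries bomFormat + specVersion at the top level.
        if (root.TryGetProperty("bomFormat", out var fmt) &&
            fmt.GetString() == "CycloneDX" &&
            root.TryGetProperty("specVersion", out var ver) &&
            ver.GetString() == "1.6")
        {
            return true;
        }

        // SPDX 3.0.1 serializations are JSON-LD; check the declared context (simplified).
        if (root.TryGetProperty("@context", out var ctx) &&
            ctx.ToString().Contains("spdx.org/rdf/3.0.1"))
        {
            return true;
        }

        return false; // hard fail everything else at ingest
    }
}
```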
|
||||
|
||||
---
|
||||
|
||||
# 2) Wire VEX predicates (VEXer)
|
||||
|
||||
* **Model:** in‑toto Attestation layered as DSSE Envelope → in‑toto Statement → VEX predicate payload. ([Legit Security][3])
|
||||
* **Why DSSE:** avoids fragile canonicalization; standard across in‑toto/Sigstore. ([Medium][4])
|
||||
|
||||
**Action:**
|
||||
|
||||
* Accept VEX as an **attestation** (JSON) with `statementType: in-toto`, `predicateType: VEX`. Wrap/verify via DSSE at the edge. ([Legit Security][3])
|
||||
|
||||
---
|
||||
|
||||
# 3) Sign every artifact & edge (Authority)
|
||||
|
||||
* **Artifacts to sign:** SBOM files, VEX attestations, and each **evidence edge** you materialize in the proof graph (e.g., “image X derives from build Y,” “package Z fixed in version…”)—all as DSSE envelopes. ([in-toto][5])
|
||||
* **Sigstore/Cosign path:** Sign + optionally keyless; publish signatures/attestations; send to Rekor. ([Sigstore][6])
|
||||
|
||||
**Action:**
|
||||
|
||||
* Output: `{ artifact, DSSE-envelope, signature, Rekor UUID }` per node/edge.
|
||||
* Keep offline mode by queueing envelopes; mirror later to Rekor.
|
||||
|
||||
---
|
||||
|
||||
# 4) Rekor inclusion‑proofs (Proof Service)
|
||||
|
||||
* **Goal:** For every submitted signature/attestation, store the Rekor *UUID*, *log index*, and verify **inclusion proofs** regularly. ([Sigstore][7])
|
||||
|
||||
**CLI contract (reference):**
|
||||
|
||||
* `rekor-cli verify --rekor_server <url> --signature <sig> --public-key <pub> --artifact <path>` (yields inclusion proof). ([Sigstore][8])
|
||||
|
||||
**Action:**
|
||||
|
||||
* Background metrics: “% entries with valid inclusion proof,” “median verify latency,” “last inclusion‑proof age.” ([Sigstore][7])
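A minimal sketch of the Proof Service side: shell out to `rekor-cli verify` with the flags from the CLI contract above and capture the latency for the "median verify latency" metric. The process wrapper and return shape are assumptions; swap in your own process and metrics infrastructure.

```csharp
// Sketch: run `rekor-cli verify` and record verification latency.
using System;
using System.Diagnostics;

public static class RekorVerifier
{
    public static (bool Ok, TimeSpan Latency) Verify(
        string rekorServer, string signaturePath, string publicKeyPath, string artifactPath)
    {
        var psi = new ProcessStartInfo
        {
            FileName = "rekor-cli",
            Arguments = $"verify --rekor_server {rekorServer} " +
                        $"--signature {signaturePath} --public-key {publicKeyPath} " +
                        $"--artifact {artifactPath}",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
        };

        var sw = Stopwatch.StartNew();
        using var proc = Process.Start(psi)!;
        proc.StandardOutput.ReadToEnd();   // drain output so the process can exit
        proc.StandardError.ReadToEnd();
        proc.WaitForExit();
        sw.Stop();

        // Exit code 0 means the inclusion proof verified.
        return (proc.ExitCode == 0, sw.Elapsed);
    }
}
```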
|
||||
|
||||
---
|
||||
|
||||
# 5) Deterministic evidence graph (Graph & Ledger)
|
||||
|
||||
* Store **hash‑addressed** nodes and signed edges; persist the DSSE for each.
|
||||
* Export a **deterministic ledger** dump (stable sort, normalized JSON) to guarantee byte‑for‑byte reproducible rebuilds.
|
||||
* Track **provenance chain** from container → build → source → SBOM → VEX.
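A minimal sketch of the hash-addressing rule, assuming one simple normalization (ordinally sorted property names, compact output). The specific normalization is an assumption; what matters is that the rule is fixed and recorded, so ledger exports rebuild byte-for-byte.

```csharp
// Sketch: content-addressed node/edge IDs via normalized JSON.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

public static class ContentAddress
{
    public static string HashNode(JsonObject node)
    {
        var canonical = Canonicalize(node)!.ToJsonString(); // compact, no whitespace
        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }

    private static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        null => null,
        JsonObject obj => new JsonObject(obj
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => KeyValuePair.Create(kv.Key, Canonicalize(kv.Value)))),
        JsonArray arr => new JsonArray(arr.Select(Canonicalize).ToArray()),
        _ => node.DeepClone()   // leaf values copied as-is
    };
}
```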
|
||||
|
||||
---
|
||||
|
||||
# 6) Explainability: Smart‑Diff + Reachability + Scores
|
||||
|
||||
* **Human‑readable proof trails:** For every verdict, render the chain: finding → SBOM component → VEX predicate → reachability basis → runtime/CFG evidence → signature + Rekor proof.
|
||||
* **Smart‑Diff:** Image‑to‑image diff includes env/config deltas; highlight changes that flip reachability (e.g., library upgrade, flag on/off).
|
||||
* **Call‑stack reachability:** Add compositional call‑graph checks per language (Java/JS/Python/Go/C/C++/.NET) and label evidence origins.
|
||||
* **Deterministic scoring:** Pin a formula (e.g., `Score = weight(VEX status) + weight(Reachability) + weight(Exploit/EPSS) + weight(Runtime hit)`), emit the formula + inputs in the UI.
|
||||
* **Explicit UNKNOWNs:** When data is missing, mark `UNKNOWN` and run sandboxed probes to shrink unknowns over time; surface these as tasks.
|
||||
* **Track Rekor verification latency** as a UX health metric (evidence “time‑to‑trust”). ([Sigstore][7])
|
||||
|
||||
---
|
||||
|
||||
# 7) Minimal .NET 10 module checklist (Stella Ops)
|
||||
|
||||
* **Sbomer**: strict CycloneDX/SPDX validation → normalize → hash. ([Ecma International][1])
|
||||
* **Vexer**: ingest DSSE/in‑toto VEX; verify signature; map to components. ([Legit Security][3])
|
||||
* **Authority**: DSSE signers (keyed + keyless) + Cosign integration. ([Sigstore][6])
|
||||
* **Proof**: Rekor submit/verify; store UUID/index/inclusion‑proof. ([Sigstore][7])
|
||||
* **Scanner**: reachability plugins per language; emit call‑chain evidence.
|
||||
* **UI**: proof‑trail pages; Smart‑Diff; deterministic score panel; UNKNOWN badge.
|
||||
|
||||
---
|
||||
|
||||
# 8) Guardrails & defaults
|
||||
|
||||
* **Only** CycloneDX 1.6 / SPDX 3.0.1 at ingest. Hard fail others. ([Ecma International][1])
|
||||
* DSSE everywhere (even edges). ([in-toto][5])
|
||||
* For online mode, default to public Rekor; for air‑gap, queue and verify later against your mirror. ([Sigstore][7])
|
||||
* Persist inclusion‑proof artifacts so audits don’t require re-fetching. ([Sigstore][7])
|
||||
|
||||
---
|
||||
|
||||
# 9) Tiny starter backlog (ready to copy into SPRINT)
|
||||
|
||||
1. **Ingest Freeze:** Add format gate (CDX1.6/SPDX3.0.1 validators). ([Ecma International][1])
|
||||
2. **Attest API:** DSSE verify endpoint for VEX statements. ([Legit Security][3])
|
||||
3. **Signer:** Cosign wrapper for DSSE + push to Rekor; store UUID. ([Sigstore][6])
|
||||
4. **Proof‑Verifier:** `rekor-cli verify` integration + metrics. ([Sigstore][8])
|
||||
5. **Graph Store:** hash‑addressed nodes/edges; deterministic export.
|
||||
6. **Explain UI:** proof trail, Smart‑Diff, reachability call‑chains, UNKNOWNs.
|
||||
|
||||
If you want, I can turn this into concrete `.csproj` skeletons, validator stubs, DSSE signing/verify helpers, and a Rekor client wrapper next.
|
||||
|
||||
[1]: https://ecma-international.org/publications-and-standards/standards/ecma-424/?utm_source=chatgpt.com "ECMA-424"
|
||||
[2]: https://cyclonedx.org/specification/overview/?utm_source=chatgpt.com "Specification Overview"
|
||||
[3]: https://www.legitsecurity.com/blog/slsa-provenance-blog-series-part-1-what-is-software-attestation?utm_source=chatgpt.com "SLSA Provenance Blog Series, Part 1: What Is Software ..."
|
||||
[4]: https://dlorenc.medium.com/signature-formats-9b7b2a127473?utm_source=chatgpt.com "Signature Formats. Envelopes and Wrappers and Formats, Oh…"
|
||||
[5]: https://in-toto.readthedocs.io/en/latest/model.html?utm_source=chatgpt.com "Metadata Model — in-toto 3.0.0 documentation"
|
||||
[6]: https://docs.sigstore.dev/cosign/verifying/attestation/?utm_source=chatgpt.com "In-Toto Attestations"
|
||||
[7]: https://docs.sigstore.dev/logging/overview/?utm_source=chatgpt.com "Rekor"
|
||||
[8]: https://docs.sigstore.dev/logging/cli/?utm_source=chatgpt.com "CLI"
|
||||
## Stella Ops — what you get that “SBOM-only” tools don’t
|
||||
|
||||
### 1) **Proof-carrying security decisions**
|
||||
|
||||
Stella Ops doesn’t just *compute* a verdict (“CVE present / fixed / not affected”). It produces a **verifiable story**:
|
||||
|
||||
**SBOM → VEX → Reachability/runtime evidence → policy decision → signature(s) → transparency-log proof**
|
||||
|
||||
* Every artifact (SBOM, VEX, scan results, “edge” in the evidence graph) is wrapped as an **in-toto attestation** and signed (DSSE) (Cosign uses DSSE for payload signing). ([Sigstore][1])
|
||||
* Signatures/attestations are anchored in **Rekor**, and you can verify “proof of entry” with `rekor-cli verify`. ([Sigstore][2])
|
||||
|
||||
**Advantage:** audits, incident reviews, and partner trust become *mechanical verification* instead of “trust us”.
|
||||
|
||||
---
|
||||
|
||||
### 2) **Noise reduction that’s accountable**
|
||||
|
||||
VEX is explicitly about exploitability in context (not just “a scanner saw it”). CycloneDX frames VEX as a way to prioritize by real-world exploitability. ([CycloneDX][3])
|
||||
OpenVEX is designed to be SBOM-agnostic and minimal, though it’s still marked as a draft spec. ([GitHub][4])
|
||||
|
||||
**Advantage:** you can suppress false positives *with receipts* (justifications + signed statements), not tribal knowledge.
|
||||
|
||||
---
|
||||
|
||||
### 3) **Version-aware interoperability (without chaos)**
|
||||
|
||||
* CycloneDX’s **current** release is **1.7** (2025‑10‑21). ([CycloneDX][5])
|
||||
* The ECMA standard **ECMA‑424** corresponds to **CycloneDX v1.6**. ([Ecma International][6])
|
||||
* SPDX has an official **3.0.1** spec. ([SPDX][7])
|
||||
|
||||
**Advantage:** Stella Ops can accept real-world supplier outputs, while still keeping your internal model stable and upgradeable.
|
||||
|
||||
---
|
||||
|
||||
### 4) **Deterministic evidence graph = fast “blast radius” answers**
|
||||
|
||||
Because evidence is stored as a graph of content-addressed nodes/edges (hash IDs), you can answer:
|
||||
|
||||
* “Which deployed images include package X@version Y?”
|
||||
* “Which builds were declared *not affected* by vendor VEX, and why?”
|
||||
* “What changed between build A and build B that made CVE reachable?”
|
||||
|
||||
**Advantage:** incident response becomes query + verify, not archaeology.
|
||||
|
||||
---
|
||||
|
||||
### 5) **Security improvements beyond vulnerabilities**
|
||||
|
||||
CycloneDX 1.6 added stronger cryptographic asset discovery/reporting to help manage crypto posture (including agility and policy compliance). ([CycloneDX][8])
|
||||
**Advantage:** Stella Ops can expand beyond “CVEs” into crypto hygiene, provenance, and operational config integrity.
|
||||
|
||||
---
|
||||
|
||||
# Developer guidelines (two audiences)
|
||||
|
||||
## A) Guidelines for *app teams* producing Stella-ready evidence
|
||||
|
||||
### 1) Pick formats + pin versions (don’t wing it)
|
||||
|
||||
**SBOMs**
|
||||
|
||||
* Prefer **CycloneDX 1.7** going forward; allow **1.6** when you need strict ECMA‑424 alignment. ([CycloneDX][5])
|
||||
* Accept **SPDX 3.0.1** as the SPDX target. ([SPDX][7])
|
||||
|
||||
**VEX**
|
||||
|
||||
* Prefer **OpenVEX** for a minimal, SBOM-agnostic VEX doc (but treat it as a draft spec and lock to a versioned context like `…/v0.2.0`). ([GitHub][4])
|
||||
|
||||
**Rule of thumb:** “Versioned in, versioned out.” Keep the original document bytes, plus a normalized internal form.
|
||||
|
||||
---
|
||||
|
||||
### 2) Use stable identities everywhere
|
||||
|
||||
* **Subjects:** reference immutable artifacts (e.g., container image digest), not tags (`:latest`).
|
||||
* **Components:** use PURLs when possible, and include hashes when available.
|
||||
* **VEX “products”:** use the same identifiers your SBOM uses (PURLs are ideal).
|
||||
|
||||
---
|
||||
|
||||
### 3) Sign and attach evidence as attestations
|
||||
|
||||
Cosign supports SBOM attestations and in-toto predicates; it supports SBOM formats including SPDX and CycloneDX. ([Sigstore][9])
|
||||
Example: attach an SPDX SBOM as an attestation (Sigstore sample policy shows the exact pattern): ([Sigstore][10])
|
||||
|
||||
```bash
|
||||
cosign attest --yes --type https://spdx.dev/Document \
|
||||
--predicate sbom.spdx.json \
|
||||
--key cosign.key \
|
||||
"${IMAGE_DIGEST}"
|
||||
```
|
||||
|
||||
OpenVEX examples in the ecosystem use a versioned predicate type like `https://openvex.dev/ns/v0.2.0`. ([Docker Documentation][11])
|
||||
(Your Stella Ops policy can accept either `--type openvex` or the explicit URI; the explicit URI is easiest to reason about.)
|
||||
|
||||
---
|
||||
|
||||
### 4) Always log + verify transparency proofs
|
||||
|
||||
Rekor’s CLI supports verifying inclusion proofs (proof-of-entry). ([Sigstore][2])
|
||||
|
||||
```bash
|
||||
rekor-cli verify --rekor_server https://rekor.sigstore.dev \
|
||||
--signature artifact.sig \
|
||||
--public-key cosign.pub \
|
||||
--artifact artifact.bin
|
||||
```
|
||||
|
||||
**Team rule:** releases aren’t “trusted” until signatures + inclusion proofs verify.
|
||||
|
||||
---
|
||||
|
||||
### 5) Write VEX like it will be cross-examined
|
||||
|
||||
A good VEX statement includes:
|
||||
|
||||
* **status** (e.g., not_affected / affected / fixed)
|
||||
* **justification** (why)
|
||||
* **timestamp** and author
|
||||
* link to supporting evidence (ticket, code change, runtime data)
|
||||
|
||||
If you can’t justify a “not_affected”, use “under investigation” and make it expire.
|
||||
|
||||
---
|
||||
|
||||
## B) Guidelines for *Stella Ops contributors* (platform developers)
|
||||
|
||||
### 1) Core principle: “Everything is evidence, evidence is immutable”
|
||||
|
||||
* Treat every ingest as **untrusted input**: strict schema validation, size limits, decompression limits, deny SSRF in “external references”, etc.
|
||||
* Store artifacts as **content-addressed blobs**: `sha256(bytes)` is the primary ID.
|
||||
* Never mutate evidence; publish a *new* node/edge with its own signature.
|
||||
|
||||
---
|
||||
|
||||
### 2) Canonical internal model + lossless preservation
|
||||
|
||||
**Store three things per document:**
|
||||
|
||||
1. raw bytes (for audits)
|
||||
2. parsed form (for queries)
|
||||
3. normalized canonical form (for deterministic hashing & diffs)
|
||||
|
||||
**Why:** it lets you evolve internal representation without losing provenance.
|
||||
|
||||
---
|
||||
|
||||
### 3) Evidence graph rules (keep it explainable)
|
||||
|
||||
* Nodes: `Artifact`, `Component`, `Vulnerability`, `Attestation`, `Build`, `Deployment`, `RuntimeSignal`
|
||||
* Edges: `DESCRIBES`, `AFFECTS`, `NOT_AFFECTED_BY`, `FIXED_IN`, `DERIVED_FROM`, `DEPLOYS`, `OBSERVED_AT_RUNTIME`
|
||||
* **Sign edges**, not just nodes (edge = claim).
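A minimal sketch of an edge treated as a signed claim; field names are assumptions, the essential parts are the two content-addressed endpoints, the relation, and the DSSE envelope over exactly that edge payload.

```csharp
// Sketch: an evidence-graph edge as a first-class, signed claim.
public enum EdgeRelation
{
    Describes, Affects, NotAffectedBy, FixedIn, DerivedFrom, Deploys, ObservedAtRuntime
}

public sealed record EvidenceEdge(
    string FromNodeDigest,   // sha256 of the source node's canonical form
    string ToNodeDigest,     // sha256 of the target node's canonical form
    EdgeRelation Relation,
    string PayloadDigest,    // sha256 of the canonical edge payload that was signed
    string DsseEnvelope,     // signature over that payload
    string? RekorUuid);      // transparency receipt, if mirrored online
```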
|
||||
|
||||
**UI rule:** every “status” shown to users must have a clickable proof trail ending in signatures + inclusion proof.
|
||||
|
||||
---
|
||||
|
||||
### 4) Attestation envelope standardization
|
||||
|
||||
Use **in-toto statements in DSSE envelopes** for everything you sign (SBOM, VEX, scan predicates). Cosign’s attestation flow is built on DSSE. ([Sigstore][1])
|
||||
|
||||
**Fail closed:** if a policy requires an attestation and it’s missing/unverifiable, the decision should not silently pass.
|
||||
|
||||
---
|
||||
|
||||
### 5) Rekor proof lifecycle
|
||||
|
||||
Persist per entry:
|
||||
|
||||
* Rekor UUID
|
||||
* log index
|
||||
* integrated time
|
||||
* inclusion proof data (when available)
|
||||
|
||||
And track health metrics:
|
||||
|
||||
* % verified proofs
|
||||
* proof verification failures
|
||||
* “proof age” since last verification
|
||||
|
||||
---
|
||||
|
||||
### 6) Versioning strategy (avoid future pain)
|
||||
|
||||
* Maintain an explicit compatibility matrix:
|
||||
|
||||
* CycloneDX 1.6 (ECMA‑424) and 1.7 (current). ([Ecma International][6])
|
||||
* SPDX 3.0.1. ([SPDX][7])
|
||||
* Add new versions behind feature flags and “upgrade playbooks”.
|
||||
* Include the **media type + version** in stored metadata (CycloneDX supports a `version=` media type parameter). ([CycloneDX][5])
|
||||
|
||||
---
|
||||
|
||||
### 7) “Definition of done” for any new scanner / evidence source
|
||||
|
||||
A new plugin isn’t complete unless it:
|
||||
|
||||
* outputs a **stable, hashable predicate**
|
||||
* links evidence to **subjects/components** using stable identifiers
|
||||
* emits **reason codes** + minimal repro info
|
||||
* produces a signed attestation and (optionally) Rekor entry
|
||||
* supports deterministic re-run (same inputs ⇒ same normalized output)
|
||||
|
||||
---
|
||||
|
||||
If you want a one-page “Stella Ops Developer Contract” template (API shapes + required fields for SBOM/VEX/attestations + policy expectations), I can draft it in a copy/paste form suitable for a repo `docs/` folder.
|
||||
|
||||
[1]: https://docs.sigstore.dev/cosign/verifying/attestation/ "In-Toto Attestations - Sigstore"
|
||||
[2]: https://docs.sigstore.dev/logging/cli/ "CLI - Sigstore"
|
||||
[3]: https://cyclonedx.org/capabilities/vex/?utm_source=chatgpt.com "Vulnerability Exploitability eXchange (VEX)"
|
||||
[4]: https://github.com/openvex/spec?utm_source=chatgpt.com "OpenVEX Specification"
|
||||
[5]: https://cyclonedx.org/specification/overview/ "Specification Overview | CycloneDX"
|
||||
[6]: https://ecma-international.org/publications-and-standards/standards/ecma-424/?utm_source=chatgpt.com "ECMA-424"
|
||||
[7]: https://spdx.github.io/spdx-spec/v3.0.1/?utm_source=chatgpt.com "SPDX Specification 3.0.1"
|
||||
[8]: https://cyclonedx.org/news/cyclonedx-v1.6-released/?utm_source=chatgpt.com "CycloneDX v1.6 Released, Advances Software Supply ..."
|
||||
[9]: https://docs.sigstore.dev/cosign/system_config/specifications/ "Specifications - Sigstore"
|
||||
[10]: https://docs.sigstore.dev/policy-controller/sample-policies/ "Sample Policies - Sigstore"
|
||||
[11]: https://docs.docker.com/scout/how-tos/create-exceptions-vex/?utm_source=chatgpt.com "Create an exception using the VEX"
|
||||
|
||||
Here’s a compact blueprint for two high‑impact Stella Ops features that cut noise and speed triage: a **smart‑diff scanner** and a **call‑stack analyzer**.
|
||||
|
||||
# Smart‑diff scanner (rescore only what changed)
|
||||
|
||||
**Goal:** When an image/app updates, recompute risk only for deltas—packages, SBOM layers, and changed functions—then attach machine‑verifiable evidence.
|
||||
|
||||
**Why it helps (plain English):**
|
||||
|
||||
* Most “new” alerts are repeats. Diffing old vs new narrows work to just what changed.
|
||||
* If a vulnerable API disappears, auto‑draft a VEX “not affected” (NA) with proof.
|
||||
* Evidence (DSSE attestations + links) makes audits fast and deterministic.
|
||||
|
||||
**Inputs to diff:**
|
||||
|
||||
* Package lock/manifest (e.g., `package-lock.json`, `Pipfile.lock`, `go.sum`, `pom.xml`, `packages.lock.json`).
|
||||
* Image layer SBOMs (CycloneDX/SPDX per layer).
|
||||
* Function‑level CFG summaries (per language; see below).
|
||||
|
||||
**Core flow (pseudocode):**
|
||||
|
||||
```pseudo
|
||||
prev = load_snapshot(t-1) // lockfiles + layer SBOM + CFG index + reachability cache
|
||||
curr = load_snapshot(t)
|
||||
|
||||
Δ.pkg = diff_packages(prev.lock, curr.lock) // added/removed/changed packages
|
||||
Δ.layers = diff_layers(prev.sbom, curr.sbom) // image files, licenses, hashes
|
||||
Δ.funcs = diff_cfg(prev.cfgIndex, curr.cfgIndex) // added/removed/changed functions
|
||||
|
||||
scope = union(
|
||||
impact_of(Δ.pkg.changed),
|
||||
impact_of_files(Δ.layers.changed),
|
||||
reachability_of(Δ.funcs.changed)
|
||||
)
|
||||
|
||||
for f in scope.functions:
|
||||
rescore(f) // recompute reachability, version bounds, EPSS, KEV, exploit hints
|
||||
|
||||
for v in impacted_vulns(scope):
|
||||
annotate(v, patch_delta(Δ)) // symbols added/removed/changed
|
||||
link_evidence(v, dsse_attestation(), proof_links())
|
||||
|
||||
for v in previously_flagged where vulnerable_apis_now_absent(v, curr):
|
||||
emit_vex_candidate(v, status="not_affected", rationale="API not present", evidence=proof_links())
|
||||
```
|
||||
|
||||
**Evidence & provenance:**
|
||||
|
||||
* Emit **DSSE** envelopes for: (a) diff result, (b) rescoring inputs, (c) VEX candidates.
|
||||
* Attach **proof links**: Rekor entry, content digests, source commit, layer digest, and normalized lockfile hash.
|
||||
* Deterministic IDs: `sha256(canonical-json(record))`.
|
||||
|
||||
**Data model (minimal):**
|
||||
|
||||
* `Delta.Packages { added[], removed[], changed[{name, fromVer, toVer}] }`
|
||||
* `Delta.Layers { changed[{path, fromHash, toHash, licenseDelta}] }`
|
||||
* `Delta.Functions { added[], removed[], changed[{symbol, file, signatureHashFrom, signatureHashTo}] }`
|
||||
* `PatchDelta { addedSymbols[], removedSymbols[], changedSignatures[] }`
|
||||
|
||||
**.NET 10 implementation hints:**
|
||||
|
||||
* Projects: `StellaOps.Scanner.Diff`, `StellaOps.Scanner.Rescore`, `StellaOps.Evidence`.
|
||||
* Use `System.Formats.Asn1`/`System.Security.Cryptography` for digests & signing adapters.
|
||||
* Keep a **content‑addressed cache** by `(artifactDigest, toolVersion)` to make rescoring O(Δ).
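
A sketch of that content-addressed cache key in C# (the `RescoreCache` name and shape are assumptions):

```csharp
using System;
using System.Collections.Concurrent;

// Cache keyed by (artifactDigest, toolVersion): a rescore result is reused only when
// both the artifact bytes and the tooling are identical.
public readonly record struct RescoreCacheKey(string ArtifactDigest, string ToolVersion);

public sealed class RescoreCache<TResult>
{
    private readonly ConcurrentDictionary<RescoreCacheKey, TResult> _entries = new();

    // Recompute only on a true miss, keeping rescoring O(Δ).
    public TResult GetOrAdd(RescoreCacheKey key, Func<RescoreCacheKey, TResult> compute)
        => _entries.GetOrAdd(key, compute);
}
```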
|
||||
|
||||
**Language normalizers (lockfiles → canonical):**
|
||||
|
||||
* Node: parse `package-lock.json` v2/v3 → `{name, version, resolved, integrity}`.
|
||||
* Python: consolidate `pip freeze` + `pipdeptree` or `poetry.lock` into name/version/source.
|
||||
* Java: `mvn -DskipTests -q help:effective-pom` + `dependency:tree -DoutputType=json`.
|
||||
* Go: parse `go.sum` + `go list -m -json all`.
|
||||
* .NET: `dotnet list package --format json` + `packages.lock.json`.
|
||||
|
||||
---
|
||||
|
||||
# Call‑stack analyzer (fast reachability + readable explainers)
|
||||
|
||||
**Goal:** Rank vulns by whether your code can realistically hit the vulnerable sink, and show a **minimal, human‑readable path** (“why here?”).
|
||||
|
||||
**Strategy: hybrid analysis**
|
||||
|
||||
* **Static pre‑compute:** Build language‑specific call graphs (normalize package symbols, collapse known framework boilerplate). Techniques: CHA (Class Hierarchy Analysis), RTA (Rapid Type Analysis), and Spark‑style dataflow over edges.
|
||||
* **JIT refinement:** On demand, prune with types/points‑to from build artifacts (PDBs, `dotnet build` metadata, `javac -h`, `tsc --declaration`), eliminate dead generics, inline trivial wrappers.
|
||||
* **Path collapse:** Merge equivalent prefixes/suffixes; cap frames to the **smallest user‑code slice** plus critical sink frames.
|
||||
|
||||
**Scoring & ranking:**
|
||||
|
||||
* `score = user_code_distance^-1 * sink_criticality * evidence_weight`
|
||||
* `user_code_distance`: hops from repo code to sink (shorter = riskier).
|
||||
* `sink_criticality`: CWE/AV:N + KEV/EPSS boost.
|
||||
* `evidence_weight`: runtime hints (observed stack traces, symbols present).
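
A direct transcription of that formula (the clamp is an assumption, added so local sinks don't divide by zero):

```csharp
using System;

public static class ReachabilityScore
{
    // score = (1 / user_code_distance) * sink_criticality * evidence_weight
    public static double Compute(int userCodeDistanceHops, double sinkCriticality, double evidenceWeight)
    {
        var distance = Math.Max(1, userCodeDistanceHops); // shorter distance => higher score
        return (1.0 / distance) * sinkCriticality * evidenceWeight;
    }
}
```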
|
||||
|
||||
**Explainer format (what triage sees):**
|
||||
|
||||
```
|
||||
[Reachable: HIGH] CVE-2024-XXXX in log4j-core@2.14.0
|
||||
why here? MyService.Process() → LoggingUtil.Wrap() → org.apache...JndiLookup.lookup()
|
||||
minimal path (3/17 frames), pruned 14 library frames
|
||||
proof: layer sha256:…, PDB match, symbol hash match, DSSE att#… (click to expand)
|
||||
```
|
||||
|
||||
**.NET 10 building blocks:**
|
||||
|
||||
* Build symbol index from PDBs (`Microsoft.DiaSymReader`), Roslyn analyzers for method refs.
|
||||
* Generate a compact call graph (`StellaOps.Reach.Graph`) with node IDs = `sha256(normalized-signature)`.
|
||||
* JIT refinement: read IL (`System.Reflection.Metadata`) to resolve virtual dispatch edges when type sets are small (from compile artifacts).
|
||||
* Renderer: keep to ≤5 frames by default; toggle “show hidden frames”.
|
||||
|
||||
**CFG + function diff for rescoring (bridge to smart‑diff):**
|
||||
|
||||
* Store per‑function signature hash and basic‑block count.
|
||||
* On change, register function for rescoring reachability + sinks affecting that symbol.
|
||||
|
||||
---
|
||||
|
||||
# Minimal deliverables to get moving (1 sprint)
|
||||
|
||||
1. **Delta core**: canonical lockfile/Layer/Symbol diff + patch‑delta annotator.
|
||||
2. **Rescore loop**: take `Delta` → select functions → recompute reachability & risk.
|
||||
3. **Explainer renderer**: minimal‑frames call path with “why here?” badges.
|
||||
4. **Evidence emitter**: DSSE envelopes + proof links; VEX NA when vulnerable APIs vanish.
|
||||
5. **Cache & determinism**: content‑addressed store; stable JSON; golden tests.
|
||||
|
||||
If you want, I can generate the .NET 10 project skeletons (`StellaOps.Scanner.Diff`, `StellaOps.Reach.Graph`, `StellaOps.Evidence`) and stub methods next.
|
||||
Stella Ops’ big advantage is that it treats security findings as **versioned, provable changes in your system** (not a perpetual firehose of “still vulnerable” alerts). That unlocks a bunch of practical wins:
|
||||
|
||||
## 1) Massive noise reduction via “delta-first” security
|
||||
|
||||
Most scanners re-report the whole universe on every build. Stella Ops flips it: **only rescore what changed** (packages, image layers, symbols/functions), and inherit prior conclusions for everything else.
|
||||
|
||||
What you get:
|
||||
|
||||
* Fewer duplicate tickets (“same CVE, same component, nothing changed”)
|
||||
* Less rescanning cost and faster CI feedback
|
||||
* A clear answer to “what’s new and why?”
|
||||
|
||||
Why this is a real moat: making incremental results *reliable* requires stable canonicalization, caching, and evidence that the diff is correct—most tools stop at “diff packages,” not “diff exploitability.”
|
||||
|
||||
## 2) Reachability-driven prioritization (the call-stack explainer)
|
||||
|
||||
Instead of ranking by CVSS alone, Stella Ops asks: **can our code actually hit the vulnerable sink?** Then it shows the *minimal* path that makes it believable.
|
||||
|
||||
What you get:
|
||||
|
||||
* Engineers fix what’s *actually* dangerous first
|
||||
* Security can justify prioritization with a “why here?” trace
|
||||
* “Unreachable” findings become low-touch (auto-suppress with expiry, or mark as NA with evidence)
|
||||
|
||||
This is the difference between “we have log4j somewhere” and “this service calls JndiLookup from a request path.”
|
||||
|
||||
## 3) Evidence-first security: every decision is auditable
|
||||
|
||||
Stella Ops can attach cryptographic, machine-verifiable evidence to each conclusion:
|
||||
|
||||
* **Diff attestations**: what changed between artifact A and B
|
||||
* **Rescore attestations**: inputs used to decide “reachable/not reachable”
|
||||
* **VEX candidates**: “not affected” or “affected” claims with rationale
|
||||
|
||||
A clean way to package this is **DSSE envelopes** (a standard signing wrapper used by supply-chain tooling). DSSE is widely used in attestations and supported in supply chain ecosystems like in-toto and sigstore/cosign. ([GitHub][1])
|
||||
|
||||
What you get:
|
||||
|
||||
* Audit-ready trails (“show me why you marked this NA”)
|
||||
* Tamper-evident compliance artifacts
|
||||
* Less “trust me” and more “verify me”
|
||||
|
||||
## 4) Auto-VEX that’s grounded in reality (and standards)
|
||||
|
||||
When a vulnerability is present in a dependency but **not exploitable in your context**, you want a VEX “not affected” statement *with a justification*—not an ad-hoc spreadsheet.
|
||||
|
||||
CISA has documented minimum elements for VEX documents, and points out multiple formats (including CycloneDX/OpenVEX/CSAF) that can carry VEX data. ([CISA][2])
|
||||
CycloneDX specifically positions VEX as context-focused exploitability information (“can it actually be exploited here?”). ([cyclonedx.org][3])
|
||||
|
||||
What you get:
|
||||
|
||||
* Fast, standardized “NA” responses with a paper trail
|
||||
* Cleaner vendor/customer conversations (“here’s our VEX, here’s why”)
|
||||
* Less time arguing about theoretical vs practical exposure
|
||||
|
||||
## 5) Faster blast-radius answers when a 0‑day drops
|
||||
|
||||
The “smart diff + symbol index + call paths” combo turns incident questions from days to minutes:
|
||||
|
||||
* “Which services contain the vulnerable function/symbol?”
|
||||
* “Which ones have a reachable path from exposed entrypoints?”
|
||||
* “Which builds/images introduced it, and when?”
|
||||
|
||||
That’s an *Ops* superpower: you can scope impact precisely, patch the right places, and avoid mass-panic upgrades that break production for no gain.
|
||||
|
||||
## 6) Lower total cost: fewer cycles, less compute, fewer human interrupts
|
||||
|
||||
Even without quoting numbers, the direction is obvious:
|
||||
|
||||
* Delta rescoring reduces CPU/time and storage churn
|
||||
* Reachability reduces triage load (fewer high-severity false alarms)
|
||||
* Evidence reduces audit and exception-management overhead
|
||||
|
||||
Net effect: security becomes a **steady pipeline** instead of a periodic “CVE storm.”
|
||||
|
||||
## 7) Better developer UX: findings that are actionable, not accusatory
|
||||
|
||||
Stella Ops can present findings like engineering wants to see them:
|
||||
|
||||
* “This new dependency bump added X, removed Y”
|
||||
* “Here’s the minimal path from your code to the vulnerable call”
|
||||
* “Here’s the exact commit / layer / symbol change that made risk go up”
|
||||
|
||||
That framing turns security into debugging, which engineers are already good at.
|
||||
|
||||
## 8) Standards alignment without being “standards only”
|
||||
|
||||
Stella Ops can speak the language auditors and customers care about:
|
||||
|
||||
* SBOM-friendly (CycloneDX is a BOM standard; it’s also published as ECMA-424). ([GitHub][4])
|
||||
* Supply chain framework alignment (SLSA describes controls/guidelines to prevent tampering and improve integrity). ([SLSA][5])
|
||||
* Attestations that fit modern ecosystems (DSSE, in-toto style envelopes, sigstore verification).
|
||||
|
||||
The advantage is you’re not just “producing an SBOM”—you’re producing **decisions + proofs** that are portable.
|
||||
|
||||
## 9) Defensibility: a compounding knowledge graph
|
||||
|
||||
Every scan produces structured facts:
|
||||
|
||||
* What changed
|
||||
* What functions exist
|
||||
* What call paths exist
|
||||
* What was concluded, when, and based on what evidence
|
||||
|
||||
Over time that becomes a proprietary, high-signal dataset:
|
||||
|
||||
* Faster future triage (because prior context is reused)
|
||||
* Better suppression correctness (because it’s anchored to symbols/paths, not text matching)
|
||||
* Better cross-repo correlation (“this vulnerable sink shows up in 12 services, but only 2 are reachable”)
|
||||
|
||||
## 10) “Ops” is the product: governance, exceptions, expiry, and drift control
|
||||
|
||||
The last advantage is cultural: Stella Ops isn’t just a scanner, it’s a **risk operations system**:
|
||||
|
||||
* time-bound suppressions that auto-expire
|
||||
* policy-as-code gates that understand reachability and diffs
|
||||
* evidence-backed exceptions (so you don’t re-litigate every quarter)
|
||||
|
||||
---
|
||||
|
||||
### A crisp way to pitch it internally
|
||||
|
||||
**Stella Ops turns vulnerability management from a static list of CVEs into a living, evidence-backed change log of what actually matters—and why.**
|
||||
Delta scanning cuts noise, call-stack analysis makes prioritization real, and DSSE/VEX-style artifacts make every decision auditable. ([CISA][2])
|
||||
|
||||
[1]: https://github.com/secure-systems-lab/dsse?utm_source=chatgpt.com "DSSE: Dead Simple Signing Envelope"
|
||||
[2]: https://www.cisa.gov/resources-tools/resources/minimum-requirements-vulnerability-exploitability-exchange-vex?utm_source=chatgpt.com "Minimum Requirements for Vulnerability Exploitability ..."
|
||||
[3]: https://cyclonedx.org/capabilities/vex/?utm_source=chatgpt.com "Vulnerability Exploitability eXchange (VEX)"
|
||||
[4]: https://github.com/CycloneDX/specification?utm_source=chatgpt.com "CycloneDX/specification"
|
||||
[5]: https://slsa.dev/?utm_source=chatgpt.com "SLSA • Supply-chain Levels for Software Artifacts"
|
||||
@@ -0,0 +1,240 @@
|
||||
I thought you might find these recent developments useful — they directly shape the competitive landscape and highlight where a tool like “Stella Ops” could stand out.
|
||||
|
||||
Here’s a quick run‑through of what’s happening — and where you could try to create advantage.
|
||||
|
||||
---
|
||||
|
||||
## 🔎 What competitors have recently shipped (competitive cues)
|
||||
|
||||
* Snyk Open Source recently rolled out a new **“dependency‑grouped” default view**, shifting from listing individual vulnerabilities to grouping them by library + version, so that you see the full impact of an upgrade (i.e. how many vulnerabilities a single library bump would remediate). ([updates.snyk.io][1])
|
||||
* Prisma Cloud (via its Vulnerability Explorer) now supports **Code‑to‑Cloud tracing**, meaning runtime vulnerabilities in container images or deployed assets can be traced back to the originating code/package in source repositories. ([docs.prismacloud.io][2])
|
||||
* Prisma Cloud also emphasizes **contextual risk scoring** that factors in risk elements beyond raw CVE severity — e.g. exposure, deployment context, asset type — to prioritize what truly matters. ([Palo Alto Networks][3])
|
||||
|
||||
These moves reflect a clear shift from “just list vulnerabilities” to “give actionable context and remediation clarity.”
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Where to build stronger differentiation (your conceptual moats)
|
||||
|
||||
Given what others have done, there’s now a window to own features that go deeper than “scan + score.” I think the following conceptual differentiators could give a tool like yours a strong, defensible edge:
|
||||
|
||||
* **“Stack‑Trace Lens”** — produce a first‑repro (or first‑hit) path from root cause to sink: show exactly how a vulnerability flows from a vulnerable library/line of code into a vulnerable runtime or container. That gives clarity developers rarely get from typical SCA/CSPM dashboards.
|
||||
* **“VEX Receipt” sidebar** — for issues flagged but deemed non‑exploitable (e.g. mitigated by runtime guards, configuration, or because the code path isn’t reachable), show a structured explanation for *why* it’s safe. That helps reduce noise, foster trust, and defensibly suppress “false positives” while retaining an audit trail.
|
||||
* **“Noise Ledger”** — an audit log of all suppressions, silences, or de‑prioritisations. If later the environment changes (e.g. a library bump, configuration change, or new code), you can re‑evaluate suppressed risks — or easily re‑enable previously suppressed issues.
|
||||
|
||||
---
|
||||
|
||||
## 💡 Why this matters — and where “Stella Ops” can shine
|
||||
|
||||
Because leading tools are increasingly offering dependency‑group grouping and risk‑scored vulnerability ranking + code‑to‑cloud tracing, the baseline expectation from users is rising: they don’t just want scans — they want *actionable clarity*.
|
||||
|
||||
By building lenses (traceability), receipts (rationalized suppressions), and auditability (reversible noise control), you move from “noise‑heavy scanning” to **“security as insight & governance”** — which aligns cleanly with your ambitions around deterministic scanning, compliance‑ready SBOM/VEX, and long‑term traceability.
|
||||
|
||||
You could position “Stella Ops” not as “another scanner,” but as a **governance‑grade, trace‑first, compliance‑centric security toolkit** — something that outpaces both SCA‑focused and cloud‑context tools by unifying them under auditability, trust, and clarity.
|
||||
|
||||
---
|
||||
|
||||
If you like, I can sketch a **draft competitive matrix** (Snyk vs Prisma Cloud vs Stella Ops) showing exactly which features you beat them on — that might help when you write your positioning.
|
||||
|
||||
[1]: https://updates.snyk.io/group-by-dependency-a-new-view-for-snyk-open-source-319578/?utm_source=chatgpt.com "Group by Dependency: A New View for Snyk Open Source"
|
||||
[2]: https://docs.prismacloud.io/en/enterprise-edition/content-collections/search-and-investigate/c2c-tracing-vulnerabilities/c2c-tracing-vulnerabilities?utm_source=chatgpt.com "Code to Cloud Tracing for Vulnerabilities"
|
||||
[3]: https://www.paloaltonetworks.com/prisma/cloud/vulnerability-management?utm_source=chatgpt.com "Vulnerability Management"
|
||||
To make Stella Ops feel *meaningfully* better than “scan + score” tools, lean into three advantages that compound over time: **traceability**, **explainability**, and **auditability**. Here’s a deeper, more buildable version of the ideas (and a few adjacent moats that reinforce them).
|
||||
|
||||
---
|
||||
|
||||
## 1) Stack‑Trace Lens → “Show me the exploit path, not the CVE”
|
||||
|
||||
**Promise:** “This vuln matters because *this* request route can reach *that* vulnerable function under *these* runtime conditions.”
|
||||
|
||||
### What it looks like in product
|
||||
|
||||
* **Exploit Path View** (per finding)
|
||||
|
||||
* Entry point: API route / job / message topic / cron
|
||||
* Call chain: `handler → service → lib.fn() → vulnerable sink`
|
||||
* **Reachability verdict:** reachable / likely reachable / not reachable (with rationale)
|
||||
* **Runtime gates:** feature flag off, auth wall, input constraints, WAF, env var, etc.
|
||||
* **“Why this is risky” panel**
|
||||
|
||||
* Severity + exploit maturity + exposure (internet-facing?) + privilege required
|
||||
* But crucially: **show the factors**, don’t hide behind a single score.
|
||||
|
||||
### How this becomes a moat (harder to copy)
|
||||
|
||||
* You’re building a **code + dependency + runtime graph** that improves with every build/deploy.
|
||||
* Competitors can map “package ↔ image ↔ workload”; fewer can answer “*can user input reach the vulnerable code path?*”
|
||||
|
||||
### Killer demo
|
||||
|
||||
Pick a noisy transitive dependency CVE.
|
||||
|
||||
* Stella shows: “Not reachable: the vulnerable function isn’t invoked in your codebase. Here’s the nearest call site; it dead-ends.”
|
||||
* Then show a second CVE where it *is* reachable, with a path that ends at a public endpoint. The contrast sells.
|
||||
|
||||
---
|
||||
|
||||
## 2) VEX Receipt → “Suppressions you can defend”
|
||||
|
||||
**Promise:** When you say “won’t fix” or “not affected,” Stella produces a **structured, portable explanation** that stands up in audits and survives team churn.
|
||||
|
||||
### What a “receipt” contains
|
||||
|
||||
* Vulnerability ID(s), component + version, where detected (SBOM node)
|
||||
* **Status:** affected / not affected / under investigation
|
||||
* **Justification template** (pick one, pre-filled where possible):
|
||||
|
||||
* Not in execution path (reachability)
|
||||
* Mitigated by configuration (e.g., feature disabled, safe defaults)
|
||||
* Environment not vulnerable (e.g., OS/arch mismatch)
|
||||
* Only dev/test dependency
|
||||
* Patched downstream / backported fix
|
||||
* **Evidence attachments** (hashable)
|
||||
|
||||
* Call graph snippet, config snapshot, runtime trace, build attestation reference
|
||||
* **Owner + approver + expiry**
|
||||
|
||||
* “This expires in 90 days unless re-approved”
|
||||
* **Reopen triggers**
|
||||
|
||||
* “If this package version changes” / “if this endpoint becomes public” / “if config flag flips”
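
A sketch of the receipt as a typed object (field names are assumptions; the real schema would mirror whichever VEX format is emitted):

```csharp
using System;
using System.Collections.Generic;

// Illustrative "VEX receipt": everything needed to defend a suppression later.
public sealed record VexReceipt(
    IReadOnlyList<string> VulnerabilityIds,   // CVE / GHSA / OSV ids
    string ComponentBomRef,                   // SBOM node the claim is scoped to
    string Status,                            // affected / not_affected / under_investigation
    string Justification,                     // template, e.g. "not in execution path"
    IReadOnlyList<string> EvidenceDigests,    // hashes of call-graph snippet, config snapshot, trace
    string Owner,
    string Approver,
    DateTimeOffset ExpiresAt,                 // forces re-approval
    IReadOnlyList<string> ReopenTriggers);    // e.g. "package version changes"
```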
|
||||
|
||||
### Why it’s a competitive advantage
|
||||
|
||||
* Most tools offer “ignore” or “risk accept.” Few make it **portable governance**.
|
||||
* The receipt becomes a **memory system** for security decisions, not a pile of tribal knowledge.
|
||||
|
||||
### Killer demo
|
||||
|
||||
Open a SOC2/ISO audit scenario:
|
||||
|
||||
* “Why is this critical CVE not fixed?”
|
||||
Stella: click → receipt → evidence → approver → expiry → automatically scheduled revalidation.
|
||||
|
||||
---
|
||||
|
||||
## 3) Noise Ledger → “Safe noise reduction without blind spots”
|
||||
|
||||
**Promise:** You can reduce noise aggressively *without* creating a security black hole.
|
||||
|
||||
### What to build
|
||||
|
||||
* A first-class **Suppression Object**
|
||||
|
||||
* Scope (repo/service/env), matching logic, owner, reason, risk rating, expiry
|
||||
* Links to receipts (VEX) when applicable
|
||||
* **Suppression Drift Detection**
|
||||
|
||||
* If conditions change (new code path, new exposure, new dependency graph), Stella flags:
|
||||
|
||||
* “This suppression is now invalid”
|
||||
* **Suppression Debt dashboard**
|
||||
|
||||
* How many suppressions exist
|
||||
* How many expired
|
||||
* How many are blocking remediation
|
||||
* “Top 10 suppressions by residual risk”
|
||||
|
||||
### Why it wins
|
||||
|
||||
* Teams want fewer alerts. Auditors want rigor. The ledger gives both.
|
||||
* It also creates a **governance flywheel**: each suppression forces a structured rationale, which improves the product’s prioritization later.
|
||||
|
||||
---
|
||||
|
||||
## 4) Deterministic Scanning → “Same inputs, same outputs (and provable)”
|
||||
|
||||
This is subtle but huge for trust.
|
||||
|
||||
### Buildable elements
|
||||
|
||||
* **Pinned scanner/toolchain versions** per org, per policy pack
|
||||
* **Reproducible scan artifacts**
|
||||
|
||||
* Results are content-addressed (hash), signed, and versioned
|
||||
* **Diff-first UX**
|
||||
|
||||
* “What changed since last build?” is the default view:
|
||||
|
||||
* new findings / resolved / severity changes / reachability changes
|
||||
* **Stable finding IDs**
|
||||
|
||||
* The same issue stays the same issue across refactors, so workflows don’t rot.
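
One way to keep finding IDs stable is to hash only identity-defining fields and exclude anything volatile. A sketch (the chosen identity fields are assumptions):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class FindingId
{
    // Identity = rule + vulnerable component + anchoring symbol; never file paths or timestamps,
    // so refactors and rebuilds keep the same ID.
    public static string Compute(string ruleId, string componentPurl, string anchorSymbol)
    {
        var canonical = $"{ruleId}\n{componentPurl}\n{anchorSymbol}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```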
|
||||
|
||||
### Why it’s hard to copy
|
||||
|
||||
* Determinism is a *systems* choice (pipelines + data model + UI). It’s not a feature toggle.
|
||||
|
||||
---
|
||||
|
||||
## 5) Remediation Planner → “Best fix set, minimal breakage”
|
||||
|
||||
Competitors often say “upgrade X.” Stella can say “Here’s the *smallest set of changes* that removes the most risk.”
|
||||
|
||||
### What it does
|
||||
|
||||
* **Upgrade simulation**
|
||||
|
||||
* “If you bump `libA` to 2.3, you eliminate 14 vulns but introduce 1 breaking change risk”
|
||||
* **Patch plan**
|
||||
|
||||
* Ordered steps, test guidance, rollout suggestions
|
||||
* **Campaign mode**
|
||||
|
||||
* One CVE → many repos/services → coordinated PRs + tracking
|
||||
|
||||
### Why it wins
|
||||
|
||||
* Reduces time-to-fix by turning vulnerability work into an **optimization problem**, not a scavenger hunt.
|
||||
|
||||
---
|
||||
|
||||
## 6) “Audit Pack” Mode → instant compliance evidence
|
||||
|
||||
**Promise:** “Give me evidence for this control set for the last 90 days.”
|
||||
|
||||
### Contents
|
||||
|
||||
* SBOM + VEX exports (per release)
|
||||
* Exception receipts + approvals + expiries
|
||||
* Policy results + change history
|
||||
* Attestation references tying code → artifact → deploy
|
||||
|
||||
This is how you position Stella Ops as **governance-grade**, not just developer-grade.
|
||||
|
||||
---
|
||||
|
||||
## 7) Open standards + portability as a wedge (without being “open-source-y”)
|
||||
|
||||
Make it easy to *leave*—ironically, that increases trust and adoption.
|
||||
|
||||
* SBOM: SPDX/CycloneDX exports
|
||||
* VEX: OpenVEX/CycloneDX VEX outputs
|
||||
* Attestations: in-toto/SLSA-style provenance references (even if you don’t implement every spec day one)
|
||||
|
||||
The advantage: “Your security posture is not trapped in our UI.”
|
||||
|
||||
---
|
||||
|
||||
## 8) The positioning that ties it together
|
||||
|
||||
A crisp way to frame Stella Ops:
|
||||
|
||||
* **Snyk-like:** finds issues fast.
|
||||
* **Prisma-like:** adds runtime/cloud context.
|
||||
* **Stella Ops:** turns findings into **defensible decisions** with **traceable evidence**, and keeps those decisions correct as the system changes.
|
||||
|
||||
If you want a north-star tagline that matches the above:
|
||||
|
||||
* **“Security you can prove.”**
|
||||
* **“From CVEs to verifiable decisions.”**
|
||||
|
||||
---
|
||||
|
||||
### Three “hero workflows” that sell all of this in one demo
|
||||
|
||||
1. **New CVE drops** → impact across deployments → exploit path → fix set → PRs → rollout tracking
|
||||
2. **Developer sees a finding** → Stack-Trace Lens explains why it matters → one-click remediation plan
|
||||
3. **Auditor asks** → Audit Pack + VEX receipts + ledger shows governance end-to-end
|
||||
|
||||
If you want, I can turn this into a one-page competitive matrix (Snyk / Prisma / Stella Ops) plus a recommended MVP cut that still preserves the moats (the parts that are hardest to copy).
|
||||
@@ -0,0 +1,628 @@
|
||||
Here’s a tight, step‑through recipe for making every VEX statement **verifiably** tied to build evidence—using CycloneDX (SBOM), deterministic identifiers, and attestations (in‑toto/DSSE).
|
||||
|
||||
---
|
||||
|
||||
# 1) Build time: mint stable, content‑addressed IDs
|
||||
|
||||
* For every artifact (source, module, package, container layer), compute:
|
||||
|
||||
* `sha256` of canonical bytes
|
||||
* a **deterministic component ID**: `pkg:<ecosystem>/<name>@<version>?sha256=<digest>` (CycloneDX supports `bom-ref`; use this value as the `bom-ref`).
|
||||
* Emit SBOM (CycloneDX 1.6) with:
|
||||
|
||||
* `metadata.component` = the top artifact
|
||||
* each `components[].bom-ref` = the deterministic ID
|
||||
* `properties[]` for extras: build system run ID, git commit, tool versions.
|
||||
|
||||
**Example (SBOM fragment):**
|
||||
|
||||
```json
|
||||
{
|
||||
"bomFormat": "CycloneDX",
|
||||
"specVersion": "1.6",
|
||||
"serialNumber": "urn:uuid:7b4f3f64-8f0b-4a7d-9b3f-7a0a2b6cf6a9",
|
||||
"version": 1,
|
||||
"metadata": {
|
||||
"component": {
|
||||
"type": "container",
|
||||
"name": "stellaops/scanner",
|
||||
"version": "1.2.3",
|
||||
"bom-ref": "pkg:docker/stellaops/scanner@1.2.3?sha256=7e1a...b9"
|
||||
}
|
||||
},
|
||||
"components": [
|
||||
{
|
||||
"type": "library",
|
||||
"name": "openssl",
|
||||
"version": "3.2.1",
|
||||
"purl": "pkg:apk/alpine/openssl@3.2.1-r0",
|
||||
"bom-ref": "pkg:apk/alpine/openssl@3.2.1-r0?sha256=2c0f...54e",
|
||||
"properties": [
|
||||
{"name": "build.git", "value": "ef3d9b4"},
|
||||
{"name": "build.run", "value": "gha-61241"}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
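
A small helper for minting the deterministic `bom-ref` described above could look like this (a sketch; the helper name is an assumption, and the `layerStream` in the usage comment is hypothetical):

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class BomRef
{
    // bom-ref = <purl>?sha256=<digest over canonical bytes>
    // (if the purl already carries qualifiers, append with '&' instead of '?')
    public static string Create(string purl, Stream canonicalBytes)
    {
        var digest = Convert.ToHexString(SHA256.HashData(canonicalBytes)).ToLowerInvariant();
        return $"{purl}?sha256={digest}";
    }
}

// Usage: BomRef.Create("pkg:apk/alpine/openssl@3.2.1-r0", layerStream)
//   => "pkg:apk/alpine/openssl@3.2.1-r0?sha256=2c0f...54e"
```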
|
||||
|
||||
---
|
||||
|
||||
# 2) Sign the SBOM as evidence
|
||||
|
||||
* Wrap the SBOM in **DSSE** and sign it (cosign or in‑toto).
|
||||
* Record to Rekor (or your offline mirror). Store the **log index**/UUID.
|
||||
|
||||
**Provenance note:** keep `{ sbomDigest, dsseSignature, rekorLogID }`.
|
||||
|
||||
---
|
||||
|
||||
# 3) Normalize vulnerability findings to the same IDs
|
||||
|
||||
* Your scanner should output findings where `affected.bom-ref` equals the component’s deterministic ID.
|
||||
* If using CVE/OSV, keep both the upstream ID and your local `bom-ref`.
|
||||
|
||||
**Finding (internal record):**
|
||||
|
||||
```json
|
||||
{
|
||||
"vulnId": "CVE-2024-12345",
|
||||
"affected": "pkg:apk/alpine/openssl@3.2.1-r0?sha256=2c0f...54e",
|
||||
"source": "grype@0.79.0",
|
||||
"introducedBy": "stellaops/scanner@1.2.3",
|
||||
"evidence": {"scanDigest": "sha256:aa1b..."}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 4) Issue VEX with deterministic targets
|
||||
|
||||
* Create a CycloneDX **VEX** doc where each `vulnerabilities[].affects[].ref` equals the SBOM `bom-ref`.
|
||||
* Use `analysis.state` and `analysis.justification`. CycloneDX's native states are `not_affected`, `exploitable`, `in_triage`, `resolved`, and `false_positive`; if you think in CISA/OpenVEX terms, map `affected`→`exploitable`, `under_investigation`→`in_triage`, and `fixed`→`resolved`.
|
||||
* Add **tight reasons** (reachability, config, platform) and a **link back to evidence** via properties.
|
||||
|
||||
**VEX (CycloneDX) minimal:**
|
||||
|
||||
```json
|
||||
{
|
||||
"bomFormat": "CycloneDX",
|
||||
"specVersion": "1.6",
|
||||
"version": 1,
|
||||
"vulnerabilities": [
|
||||
{
|
||||
"id": "CVE-2024-12345",
|
||||
"source": {"name": "NVD"},
|
||||
"analysis": {
|
||||
"state": "not_affected",
|
||||
"justification": "vulnerable_code_not_present",
|
||||
"response": ["will_not_fix"],
|
||||
"detail": "Linked OpenSSL feature set excludes the vulnerable cipher."
|
||||
},
|
||||
"affects": [
|
||||
{"ref": "pkg:apk/alpine/openssl@3.2.1-r0?sha256=2c0f...54e"}
|
||||
],
|
||||
"properties": [
|
||||
{"name": "evidence.sbomDigest", "value": "sha256:91f2...9a"},
|
||||
{"name": "evidence.rekorLogID", "value": "425c1d1e..."},
|
||||
{"name": "reachability.report", "value": "sha256:reacha..."},
|
||||
{"name": "policy.decision", "value": "TrustGate#R-17.2"}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 5) Sign the VEX and anchor it
|
||||
|
||||
* Wrap the VEX in DSSE, sign, and (optionally) publish to Rekor (or your Proof‑Market mirror).
|
||||
* Now you can verify: **component digest ↔ SBOM bom‑ref ↔ VEX affects.ref ↔ signatures/log**.
|
||||
|
||||
---
|
||||
|
||||
# 6) Verifier flow (what your UI/CLI should do)
|
||||
|
||||
1. Load VEX → verify DSSE signature → (optional) Rekor inclusion.
|
||||
2. For each `affects.ref`, check there exists an SBOM component with the **exact same value**.
|
||||
3. Verify the SBOM signature and Rekor entry (hash of SBOM equals what VEX references in `properties.evidence.sbomDigest`).
|
||||
4. Cross‑check the running artifact/container digest matches the SBOM `metadata.component.bom-ref` (or OCI manifest digest).
|
||||
5. Render the decision with **explainable evidence** (links to proofs, reachability report hash, policy rule ID).
|
||||
|
||||
---
|
||||
|
||||
# 7) Attestation shapes (quick starters)
|
||||
|
||||
**DSSE envelope (JSON) around SBOM or VEX payload:**
|
||||
|
||||
```json
|
||||
{
|
||||
"payloadType": "application/vnd.cyclonedx+json;version=1.6",
|
||||
"payload": "BASE64(SBOM_OR_VEX_JSON)",
|
||||
"signatures": [
|
||||
{"keyid": "SHA256-PUBKEY", "sig": "BASE64(SIG)"}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**in‑toto Statement for provenance → attach SBOM hash:**
|
||||
|
||||
```json
|
||||
{
|
||||
"_type": "https://in-toto.io/Statement/v1",
|
||||
"predicateType": "https://slsa.dev/provenance/v1",
|
||||
"subject": [{"name": "stellaops/scanner", "digest": {"sha256": "7e1a...b9"}}],
|
||||
"predicate": {
|
||||
"buildType": "stellaops/ci",
|
||||
"materials": [{"uri": "git+https://...#ef3d9b4"}],
|
||||
"metadata": {"buildInvocationID": "gha-61241"},
|
||||
"externalParameters": {"sbomDigest": "sha256:91f2...9a"}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 8) Practical guardrails (so it stays deterministic)
|
||||
|
||||
* **Never** generate `bom-ref` from mutable fields (like file paths). Use content digests + stable PURL.
|
||||
* Pin toolchains and normalize JSON (UTF‑8, sorted keys if you post‑hash).
|
||||
* Store `{ toolVersions, feed snapshots, policy set hash }` to replay decisions.
|
||||
* For containers, prefer `bom-ref = pkg:oci/<repo>@<digest>` PLUS layer evidence in `components[]`.
|
||||
|
||||
---
|
||||
|
||||
# 9) “Hello‑world” verification script (pseudo)
|
||||
|
||||
```bash
|
||||
# 1) Verify SBOM sig -> get sbomDigest
|
||||
cosign verify-blob --key sbom-signer.pub --signature sbom.sig sbom.json
|
||||
|
||||
# 2) Verify VEX sig
|
||||
cosign verify-blob --key vex-signer.pub --signature vex.sig vex.json
|
||||
|
||||
# 3) Check that every VEX affects.ref exists in SBOM
|
||||
jq -r '.vulnerabilities[].affects[].ref' vex.json | while read ref; do
|
||||
  jq -e --arg r "$ref" '.components[] | select(.["bom-ref"]==$r)' sbom.json >/dev/null || { echo "missing bom-ref: $ref" >&2; exit 1; }
|
||||
done
|
||||
|
||||
# 4) Compare running image digest to SBOM metadata.component.bom-ref
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Where this fits in Stella Ops (quick wiring)
|
||||
|
||||
* **Sbomer**: emits CycloneDX with deterministic `bom-ref`s + DSSE sig.
|
||||
* **Scanner**: normalizes findings to `bom-ref`.
|
||||
* **Vexer**: produces and signs VEX; includes `properties` linking back to SBOM/reachability/policy.
|
||||
* **Authority/Verifier**: one click “Prove it” view → checks DSSE, Rekor, and `ref` equality.
|
||||
* **Proof Graph**: edge types: `produces(SBOM)`, `affects(VEX↔component)`, `signedBy`, `recordedAt(Rekor)`.
|
||||
|
||||
If you want, I can turn this into:
|
||||
|
||||
* a **.NET 10** helper lib for stable `bom-ref` generation,
|
||||
* a **CLI** that takes `sbom.json` + `vex.json` and runs the full verification,
|
||||
* or **fixtures** (golden SBOM/VEX/DSSE triplets) for your CI.
|
||||
Below is a developer-oriented blueprint you can hand to engineers as “How we build a verifiable SBOM→VEX chain”.
|
||||
|
||||
---
|
||||
|
||||
## 1. Objectives and Trust Model
|
||||
|
||||
**Goal:** Any VEX statement about a component must be:
|
||||
|
||||
1. **Precisely scoped** to one or more concrete artifacts.
|
||||
2. **Cryptographically linked** to the SBOM that defined those artifacts.
|
||||
3. **Replayable**: a third party can re-run verification and reach the same conclusion.
|
||||
4. **Auditable**: every step is backed by signatures and immutable logs (e.g., Rekor or internal ledger).
|
||||
|
||||
**Questions you must be able to answer deterministically:**
|
||||
|
||||
* “Which exact artifact does this VEX statement apply to?”
|
||||
* “Show me the SBOM where this artifact is defined, and prove it was not tampered with.”
|
||||
* “Prove that the VEX document I am looking at was authored and/or approved by the expected party.”
|
||||
|
||||
---
|
||||
|
||||
## 2. Canonical Identifiers: Non-Negotiable Foundation
|
||||
|
||||
You cannot build a verifiable chain without **stable, content-addressed IDs**.
|
||||
|
||||
### 2.1 Component IDs
|
||||
|
||||
For every component, choose a deterministic scheme:
|
||||
|
||||
* Base: PURL or URN, e.g.,
|
||||
`pkg:maven/org.apache.commons/commons-lang3@3.14.0`
|
||||
* Extend with content hash:
|
||||
`pkg:maven/org.apache.commons/commons-lang3@3.14.0?sha256=<digest>`
|
||||
* Use this value as the **CycloneDX `bom-ref`**.
|
||||
|
||||
**Developer rule:**
|
||||
|
||||
* `bom-ref` must be:
|
||||
|
||||
* Stable across SBOM regenerations for identical content.
|
||||
* Independent of local, ephemeral data (paths, build numbers).
|
||||
* Derived from canonical bytes (normalized archive/layer, not “whatever we saw on disk”).
|
||||
|
||||
### 2.2 Top-Level Artifact IDs
|
||||
|
||||
For images, archives, etc.:
|
||||
|
||||
* Prefer OCI-style naming:
|
||||
`pkg:oci/<repo>@sha256:<manifestDigest>`
|
||||
* Set this as `metadata.component.bom-ref` in the SBOM.
|
||||
|
||||
---
|
||||
|
||||
## 3. SBOM Generation Guidelines
|
||||
|
||||
### 3.1 Required Properties
|
||||
|
||||
When emitting a CycloneDX SBOM (1.5/1.6):
|
||||
|
||||
* `metadata.component`:
|
||||
|
||||
* `name`, `version`, `bom-ref`.
|
||||
* `components[]`:
|
||||
|
||||
* `name`, `version`, `purl` (if available), **`bom-ref`**.
|
||||
* `hashes[]`: include at least `SHA-256`.
|
||||
* `properties[]`:
|
||||
|
||||
* Build metadata:
|
||||
|
||||
* `build.gitCommit`
|
||||
* `build.pipelineRunId`
|
||||
* `build.toolchain` (e.g., `dotnet-10.0.100`, `maven-3.9.9`)
|
||||
* Optional:
|
||||
|
||||
* `provenance.statementDigest`
|
||||
* `scm.url`
|
||||
|
||||
Minimal JSON fragment:
|
||||
|
||||
```json
|
||||
{
|
||||
"bomFormat": "CycloneDX",
|
||||
"specVersion": "1.6",
|
||||
"metadata": {
|
||||
"component": {
|
||||
"type": "container",
|
||||
"name": "example/api-gateway",
|
||||
"version": "1.0.5",
|
||||
"bom-ref": "pkg:oci/example/api-gateway@sha256:abcd..."
|
||||
}
|
||||
},
|
||||
"components": [
|
||||
{
|
||||
"type": "library",
|
||||
"name": "openssl",
|
||||
"version": "3.2.1",
|
||||
"purl": "pkg:apk/alpine/openssl@3.2.1-r0",
|
||||
"bom-ref": "pkg:apk/alpine/openssl@3.2.1-r0?sha256:1234...",
|
||||
"hashes": [
|
||||
{ "alg": "SHA-256", "content": "1234..." }
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### 3.2 SBOM Normalization
|
||||
|
||||
Developer directions:
|
||||
|
||||
* Normalize JSON before hashing/signing:
|
||||
|
||||
* Sorted keys, UTF-8, consistent whitespace.
|
||||
* Ensure SBOM generation is **deterministic** given the same:
|
||||
|
||||
* Inputs (image, source tree)
|
||||
* Tool versions
|
||||
* Settings/flags
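
A minimal sketch of that normalization-then-hash step using `System.Text.Json` (a production implementation should follow a real canonicalization scheme such as RFC 8785 JCS, which also pins number and string-escaping rules):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

public static class SbomHash
{
    // Recursively sorts object keys, re-serializes without extra whitespace, then hashes.
    public static string CanonicalSha256(string json)
    {
        var node = JsonNode.Parse(json) ?? throw new ArgumentException("empty JSON");
        var canonical = Canonicalize(node).ToJsonString();
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical))).ToLowerInvariant();
    }

    private static JsonNode Canonicalize(JsonNode node) => node switch
    {
        JsonObject obj => new JsonObject(obj.OrderBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => new KeyValuePair<string, JsonNode?>(
                p.Key, p.Value is null ? null : Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(i => i is null ? null : Canonicalize(i)).ToArray()),
        _ => JsonNode.Parse(node.ToJsonString())!   // clone leaf values (numbers, strings, booleans)
    };
}
```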
|
||||
|
||||
---
|
||||
|
||||
## 4. Signing and Publishing the SBOM
|
||||
|
||||
### 4.1 DSSE Envelope
|
||||
|
||||
Wrap the raw SBOM bytes in a DSSE envelope and sign:
|
||||
|
||||
```json
|
||||
{
|
||||
"payloadType": "application/vnd.cyclonedx+json;version=1.6",
|
||||
"payload": "BASE64(SBOM_JSON)",
|
||||
"signatures": [
|
||||
{
|
||||
"keyid": "<KID>",
|
||||
"sig": "BASE64(SIGNATURE)"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Guidelines:
|
||||
|
||||
* Use a **dedicated signing identity** (keypair or KMS key) for SBOMs.
|
||||
* Publish signature and payload hash to:
|
||||
|
||||
* Rekor or
|
||||
* Your internal immutable log / ledger.
|
||||
|
||||
Persist:
|
||||
|
||||
* `sbomDigest = sha256(SBOM_JSON)`.
|
||||
* `sbomLogId` (Rekor UUID or internal ledger ID).
|
||||
|
||||
---
|
||||
|
||||
## 5. Vulnerability Findings → Normalized Targets
|
||||
|
||||
Your scanners (or imports from external scanners) must map findings onto **the same IDs used in the SBOM**.
|
||||
|
||||
### 5.1 Mapping Rule
|
||||
|
||||
For each finding:
|
||||
|
||||
* `vulnId`: CVE, GHSA, OSV ID, etc.
|
||||
* `affectedRef`: **exact `bom-ref`** from SBOM.
|
||||
* Optional: secondary keys (file path, package manager coordinates).
|
||||
|
||||
Example internal record:
|
||||
|
||||
```json
|
||||
{
|
||||
"vulnId": "CVE-2025-0001",
|
||||
"affectedRef": "pkg:apk/alpine/openssl@3.2.1-r0?sha256:1234...",
|
||||
"scanner": "grype@0.79.0",
|
||||
"sourceSbomDigest": "sha256:91f2...",
|
||||
"foundAt": "2025-12-09T12:34:56Z"
|
||||
}
|
||||
```
|
||||
|
||||
Developer directions:
|
||||
|
||||
* Build a **component index** keyed by `bom-ref` when ingesting SBOMs.
|
||||
* Any finding that cannot be mapped to a known `bom-ref` must be flagged:
|
||||
|
||||
* `status = "unlinked"` and either:
|
||||
|
||||
* dropped from VEX scope, or
|
||||
* fixed by improving normalization rules.
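
A sketch of that ingest-time index and the "unlinked" guard (types and names are illustrative):

```csharp
using System.Collections.Generic;

public sealed class ComponentIndex
{
    private readonly HashSet<string> _bomRefs;

    public ComponentIndex(IEnumerable<string> bomRefsFromSbom)
        => _bomRefs = new HashSet<string>(bomRefsFromSbom);

    // A finding that cannot be tied to a known bom-ref must surface as "unlinked",
    // never silently carried into VEX scope.
    public string Classify(string affectedRef)
        => _bomRefs.Contains(affectedRef) ? "linked" : "unlinked";
}
```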
|
||||
|
||||
---
|
||||
|
||||
## 6. VEX Authoring Guidelines
|
||||
|
||||
Use CycloneDX VEX (or OpenVEX) with a strict mapping to SBOM `bom-ref`s.
|
||||
|
||||
### 6.1 Minimal VEX Structure
|
||||
|
||||
```json
|
||||
{
|
||||
"bomFormat": "CycloneDX",
|
||||
"specVersion": "1.6",
|
||||
"version": 1,
|
||||
"vulnerabilities": [
|
||||
{
|
||||
"id": "CVE-2025-0001",
|
||||
"source": { "name": "NVD" },
|
||||
"analysis": {
|
||||
"state": "not_affected",
|
||||
"justification": "vulnerable_code_not_in_execute_path",
|
||||
"response": ["will_not_fix"],
|
||||
"detail": "The vulnerable function is not reachable in this configuration."
|
||||
},
|
||||
"affects": [
|
||||
{ "ref": "pkg:apk/alpine/openssl@3.2.1-r0?sha256:1234..." }
|
||||
],
|
||||
"properties": [
|
||||
{ "name": "evidence.sbomDigest", "value": "sha256:91f2..." },
|
||||
{ "name": "evidence.sbomLogId", "value": "rekor:abcd-..." },
|
||||
{ "name": "policy.decisionId", "value": "TRUST-ALG-001#rule-7" }
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### 6.2 Required Analysis Discipline
|
||||
|
||||
For each `(vulnId, affectedRef)`:
|
||||
|
||||
* `state`: use the VEX statuses `not_affected`, `affected`, `fixed`, `under_investigation` (CycloneDX encodes these natively as `not_affected`, `exploitable`, `resolved`, `in_triage`).
|
||||
* `justification`:
|
||||
|
||||
* `vulnerable_code_not_present`
|
||||
* `vulnerable_code_not_in_execute_path`
|
||||
* `vulnerable_code_not_configured`
|
||||
* `vulnerable_code_cannot_be_controlled_by_adversary`
|
||||
* etc. (these are the OpenVEX-style values; CycloneDX's native enum uses the shorter `code_not_present`, `code_not_reachable`, `requires_configuration`, and related forms)
|
||||
* `detail`: **concrete explanation**, not generic text.
|
||||
* Reference back to SBOM and other proofs via `properties`.
|
||||
|
||||
Developer rules:
|
||||
|
||||
* Every `affects.ref` must match **exactly** a `bom-ref` in at least one SBOM.
|
||||
* VEX generator must fail if it cannot confirm this mapping.
|
||||
|
||||
---
|
||||
|
||||
## 7. Cryptographic Linking: SBOM ↔ VEX
|
||||
|
||||
To make the chain verifiable:
|
||||
|
||||
1. Compute `sbomDigest = sha256(SBOM_JSON)`.
|
||||
2. Inside each VEX vulnerability (or at top-level), include:
|
||||
|
||||
* `properties.evidence.sbomDigest = sbomDigest`
|
||||
* `properties.evidence.sbomLogId` if a transparency log is used.
|
||||
3. Sign the VEX document with DSSE:
|
||||
|
||||
* Separate key from SBOM key, or the same with different usage metadata.
|
||||
4. Optionally publish VEX DSSE to Rekor (or equivalent).
|
||||
|
||||
Resulting verification chain:
|
||||
|
||||
* Artifact digest → matches SBOM `metadata.component.bom-ref`.
|
||||
* SBOM `bom-ref`s → referenced by `vulnerabilities[].affects[].ref`.
|
||||
* VEX references SBOM by hash/log ID.
|
||||
* Both SBOM and VEX have valid signatures and log inclusion proofs.
|
||||
|
||||
---
|
||||
|
||||
## 8. Verifier Implementation Guidelines
|
||||
|
||||
You should implement a **verifier library** and then thin wrappers:
|
||||
|
||||
* CLI
|
||||
* API endpoint
|
||||
* UI “Prove it” button
|
||||
|
||||
### 8.1 Verification Steps (Algorithm)
|
||||
|
||||
Given: artifact digest, SBOM, VEX, signatures, logs.
|
||||
|
||||
1. **Verify SBOM DSSE signature.**
|
||||
2. **Verify VEX DSSE signature.**
|
||||
3. If using Rekor/log:
|
||||
|
||||
* Verify SBOM and VEX entries:
|
||||
|
||||
* log inclusion proof
|
||||
* payload hashes match local files.
|
||||
4. Confirm that:
|
||||
|
||||
* `artifactDigest` matches `metadata.component.bom-ref` or the indicated digest.
|
||||
5. Build a map of `bom-ref` from SBOM.
|
||||
6. For each VEX `affects.ref`:
|
||||
|
||||
* Ensure it exists in SBOM components.
|
||||
* Ensure `properties.evidence.sbomDigest == sbomDigest`.
|
||||
7. Compile per-component decisions:
|
||||
|
||||
For each component:
|
||||
|
||||
* List associated VEX records.
|
||||
* Derive effective state using a policy (e.g., most recent, highest priority source).
|
||||
|
||||
Verifier output should be **structured** (not just logs), e.g.:
|
||||
|
||||
```json
|
||||
{
|
||||
"artifact": "pkg:oci/example/api-gateway@sha256:abcd...",
|
||||
"sbomVerified": true,
|
||||
"vexVerified": true,
|
||||
"components": [
|
||||
{
|
||||
"bomRef": "pkg:apk/alpine/openssl@3.2.1-r0?sha256:1234...",
|
||||
"vulnerabilities": [
|
||||
{
|
||||
"id": "CVE-2025-0001",
|
||||
"state": "not_affected",
|
||||
"justification": "vulnerable_code_not_in_execute_path"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
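
A condensed C# sketch of steps 1–6 (signature and log-inclusion checks are delegated to an injected verifier; the per-component rollup from step 7 is omitted, and all type names are assumptions):

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed record VerificationReport(bool SbomVerified, bool VexVerified, IReadOnlyList<string> Errors);

public sealed class ChainVerifier
{
    private readonly IDsseVerifier _dsse;   // wraps signature + (optional) log inclusion checks
    public ChainVerifier(IDsseVerifier dsse) => _dsse = dsse;

    public VerificationReport Verify(SbomDocument sbom, VexDocument vex, string artifactDigest)
    {
        var errors = new List<string>();
        var sbomOk = _dsse.Verify(sbom.Envelope);                        // steps 1 and 3
        var vexOk = _dsse.Verify(vex.Envelope);                          // steps 2 and 3

        if (!sbom.TopLevelBomRef.Contains(artifactDigest))               // step 4
            errors.Add("artifact digest does not match metadata.component.bom-ref");

        var bomRefs = sbom.ComponentBomRefs.ToHashSet();                 // step 5
        foreach (var entry in vex.Entries)                               // step 6
        {
            if (!bomRefs.Contains(entry.AffectedRef))
                errors.Add($"unknown affects.ref: {entry.AffectedRef}");
            if (entry.EvidenceSbomDigest != sbom.Digest)
                errors.Add($"sbomDigest mismatch for {entry.VulnId}");
        }
        return new VerificationReport(sbomOk, vexOk, errors);
    }
}

// Supporting shapes assumed by the sketch:
public interface IDsseVerifier { bool Verify(byte[] dsseEnvelope); }
public sealed record SbomDocument(byte[] Envelope, string Digest, string TopLevelBomRef, IReadOnlyList<string> ComponentBomRefs);
public sealed record VexDocument(byte[] Envelope, IReadOnlyList<VexEntry> Entries);
public sealed record VexEntry(string VulnId, string AffectedRef, string State, string EvidenceSbomDigest);
```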
|
||||
|
||||
---
|
||||
|
||||
## 9. Data Model and Storage
|
||||
|
||||
A minimal relational / document model:
|
||||
|
||||
* `Artifacts`
|
||||
|
||||
* `id`
|
||||
* `purl`
|
||||
* `digest`
|
||||
* `bomRef` (top level)
|
||||
* `Sboms`
|
||||
|
||||
* `id`
|
||||
* `digest`
|
||||
* `dsseSignature`
|
||||
* `logId`
|
||||
* `rawJson`
|
||||
* `SbomComponents`
|
||||
|
||||
* `id`
|
||||
* `sbomId`
|
||||
* `bomRef` (unique per SBOM)
|
||||
* `purl`
|
||||
* `hash`
|
||||
* `VexDocuments`
|
||||
|
||||
* `id`
|
||||
* `digest`
|
||||
* `dsseSignature`
|
||||
* `logId`
|
||||
* `rawJson`
|
||||
* `VexEntries`
|
||||
|
||||
* `id`
|
||||
* `vexId`
|
||||
* `vulnId`
|
||||
* `affectedBomRef`
|
||||
* `state`
|
||||
* `justification`
|
||||
* `evidenceSbomDigest`
|
||||
* `policyDecisionId`
|
||||
|
||||
Guideline: store **raw JSON** plus an **indexed view** for efficient queries.
|
||||
|
||||
---
|
||||
|
||||
## 10. Testing: Golden Chains
|
||||
|
||||
Developers should maintain **golden fixtures** where:
|
||||
|
||||
* A known image or package → SBOM (JSON) → VEX (JSON) → DSSE envelopes → log entries.
|
||||
* For each fixture:
|
||||
|
||||
* A test harness runs the verifier.
|
||||
* Asserts:
|
||||
|
||||
* All signatures valid.
|
||||
* All `affects.ref` map to a SBOM `bom-ref`.
|
||||
* The final summarized decision for specific `(vulnId, bomRef)` pairs matches expectations.
|
||||
|
||||
Include negative tests:
|
||||
|
||||
* VEX referencing unknown `bom-ref` → verification error.
|
||||
* Mismatching `evidence.sbomDigest` → verification error.
|
||||
* Tampered SBOM or VEX → signature/log verification failure.
|
||||
|
||||
---
|
||||
|
||||
## 11. Operational Practices and Guardrails
|
||||
|
||||
Developer-facing rules of thumb:
|
||||
|
||||
1. **Never** generate `bom-ref` from mutable fields (paths, timestamps).
|
||||
2. Treat tool versions and feed snapshots as part of the “scan config”:
|
||||
|
||||
* Include hashes/versions in SBOM/VEX properties.
|
||||
3. Enforce **strict types** in code (e.g., enums for VEX states/justifications).
|
||||
4. Keep keys and signing policies separate per role:
|
||||
|
||||
* Build pipeline SBOM signer.
|
||||
* Security team VEX signer.
|
||||
5. Offer a single, stable API:
|
||||
|
||||
* `POST /verify`:
|
||||
|
||||
* Inputs: artifact digest (or image reference), SBOM+VEX or references.
|
||||
* Outputs: structured verification report.
|
||||
|
||||
---
|
||||
|
||||
If you want, next step I can do is sketch a small reference implementation outline (e.g., .NET 10 service with DTOs and verification pipeline) that you can drop directly into your codebase.
|
||||
@@ -0,0 +1,558 @@
|
||||
You might find this interesting — there’s a new paper, ReachCheck, that describes a breakthrough in call‑graph reachability analysis for IDEs that could be exactly what you need for Stella’s third‑party library precomputations and incremental call‑stack explainers. ([LiLin's HomePage][1])
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
## 🔍 What ReachCheck does
|
||||
|
||||
* ReachCheck builds a *compositional, library‑aware call‑graph summary*: it pre‑summarizes third‑party library reachability (offline), then merges those summaries on‑demand with your application code. ([LiLin's HomePage][1])
|
||||
* It relies on a matrix‑based representation of call graphs + fast matrix multiplication to compute transitive closures. That lets it answer “can method A reach method B (possibly via library calls)?” queries extremely quickly. ([Chengpeng Wang][2])
|
||||
|
||||
## ⚡ Impressive Efficiency Gains
|
||||
|
||||
In experiments over real‑world codebases, ReachCheck outperformed online traversal and classic summary‑based or graph algorithms by **2–4 orders of magnitude**. ([LiLin's HomePage][1])
|
||||
|
||||
* For example: some comparisons showed speedups of 80×, 238×, even up to 4369× vs older techniques. ([LiLin's HomePage][1])
|
||||
* Downstream tasks — like dependency‑conflict detection or vulnerability (CVE) risk detection — completed in **sub‑second times** (0.35 s, 0.61 s in reported cases). ([LiLin's HomePage][1])
|
||||
|
||||
## ✅ Why this aligns with Stella’s Architecture Goals
|
||||
|
||||
Given your plans for Stella Ops — especially around **pre‑computing third‑party libs**, **deterministic/replayable scans**, and **explainable call‑stack / dependency tracing** — ReachCheck fits like a glove:
|
||||
|
||||
* You could **precompute summaries** for every third‑party library once, store them, and reuse them across scans.
|
||||
* On-demand analysis becomes **lightweight and near instantaneous**, which supports incremental, fast feedback (e.g. in IDE or CI/CD).
|
||||
* Combining with your planned **Proof‑of‑Integrity Graph** and **AI-assisted explainers**, call‑graph queries (e.g. “does our code reach this vulnerable library method?”) become efficient enough to embed in runtime or build‑time workflows.
|
||||
|
||||
If you like — I can fetch the paper’s **limitations** section (hard cases, unsoundness/tradeoffs) so you can assess whether ReachCheck’s assumptions align with your threat‑model for Stella.
|
||||
|
||||
[1]: https://matafeiyanll.github.io/paper/TOSEM-2025.pdf?utm_source=chatgpt.com "ReachCheck: Compositional Library-Aware Call Graph ..."
|
||||
[2]: https://chengpeng-wang.github.io/publications/TOSEM2025.pdf?utm_source=chatgpt.com "ReachCheck: Compositional Library-Aware Call Graph ..."
|
||||
Right, if the variation space explodes, you do **not** want a full all-pairs precompute. The usual pattern is:
|
||||
|
||||
* Precompute and store only the **underlying graph/index** once per artifact.
|
||||
* Answer reachability **on demand per query** (per “call”).
|
||||
* **Cache query results** (and maybe the discovered path) keyed by a small, deterministic tuple.
|
||||
|
||||
Let me outline how I’d shape this for Stella Ops.
|
||||
|
||||
---
|
||||
|
||||
## 1. Separate “graph build” from “reachability query”
|
||||
|
||||
Instead of “precompute all paths”, you:
|
||||
|
||||
1. **Once per artifact (image / repo / package):**
|
||||
|
||||
* Build a compact call graph (nodes = functions/methods, edges = calls).
|
||||
* Persist it in a deterministic format:
|
||||
|
||||
* E.g. `callgraph.{language}.{artifact_hash}.bin`.
|
||||
* Think of this as “static index” – it is stable for all queries on that artifact.
|
||||
|
||||
2. **For each reachability query (once per call):**
|
||||
|
||||
* Input: `(artifact_hash, source_symbol, target_symbol, query_context)`
|
||||
* Load call graph (or retrieve from in-memory cache).
|
||||
* Run a **bounded graph search**:
|
||||
|
||||
* BFS / bidirectional BFS / A* on the call graph.
|
||||
* Return:
|
||||
|
||||
* `reachable: bool`
|
||||
* `path: [symbol1, symbol2, ...]` (canonical shortest path)
|
||||
* maybe some “explanation metadata” (callsite locations, files/lines).
|
||||
|
||||
No all-pairs transitive closure. Just efficient search on a pre-indexed graph.
|
||||
|
||||
---
|
||||
|
||||
## 2. Caching “once per query” – but done smartly
|
||||
|
||||
Your idea “do it once per call and maybe cache the result” is exactly the right middle ground. The key is to define what the cache key is.
|
||||
|
||||
### 2.1. Suggested cache key
|
||||
|
||||
For Stella, something like:
|
||||
|
||||
```text
|
||||
Key = (
|
||||
artifact_digest, // container image / repo hash
|
||||
language, // java, dotnet, go, etc.
|
||||
source_symbol_id, // normalized symbol id (e.g. method handle)
|
||||
target_symbol_id, // same
|
||||
context_flags_hash // OS/arch, feature flags, framework env etc (optional)
|
||||
)
|
||||
```
|
||||
|
||||
Value:
|
||||
|
||||
```text
|
||||
Value = {
|
||||
reachable: bool,
|
||||
path: [symbol_id...], // empty if not reachable
|
||||
computed_at_version: graph_version_id
|
||||
}
|
||||
```
|
||||
|
||||
Where `graph_version_id` increments if you change the call-graph builder, so you can invalidate stale cache entries across releases.
|
||||
|
||||
### 2.2. Cache scopes
|
||||
|
||||
You can have layered caches:
|
||||
|
||||
1. **In-scan in-memory cache (per scanner run):**
|
||||
|
||||
* Lives only for the current scan.
|
||||
* No eviction needed, deterministic, very simple.
|
||||
* Great when a UX asks the same or similar question repeatedly.
|
||||
|
||||
2. **Local persistent cache (per node / per deployment):**
|
||||
|
||||
* E.g. Postgres / RocksDB with the key above.
|
||||
* Useful if:
|
||||
|
||||
* The same artifact is scanned repeatedly (typical for CI and policy checks).
|
||||
* The same CVEs / sinks get queried often.
|
||||
|
||||
You can keep the persistent cache optional so air-gapped/offline deployments can decide whether they want this extra optimization.
|
||||
|
||||
---
|
||||
|
||||
## 3. Why this works even with many variations
|
||||
|
||||
You are right: there are “too many variations” if you think in terms of:
|
||||
|
||||
* All entrypoints × all sinks
|
||||
* All frameworks × all environment conditions
|
||||
|
||||
But note:
|
||||
|
||||
* You are **not** computing all combinations.
|
||||
* You only compute **the queries actually asked for by:**
|
||||
|
||||
* The UX (“show me path from `vuln_method` to `Controller.Foo`”).
|
||||
* The policy engine (“prove whether this HTTP handler can reach this vulnerable method”).
|
||||
|
||||
So the number of **distinct, real queries** is usually far smaller than the combinatorial space.
|
||||
|
||||
And for each such query, a graph search on a typical microservice-size codebase is usually cheap (tens of milliseconds) if:
|
||||
|
||||
* The call graph is kept in memory (or memory-mapped).
|
||||
* You keep a compact node/edge representation (integer IDs, adjacency lists).
|
||||
|
||||
---
|
||||
|
||||
## 4. Implementation details you might care about
|
||||
|
||||
### 4.1. On-demand graph search
|
||||
|
||||
Per language you can stay with simple, predictable algorithms:
|
||||
|
||||
* **BFS / Dijkstra / bidirectional BFS** on a directed graph.
|
||||
* For large graphs, **bidirectional search** is usually the best “bang for the buck”.
|
||||
|
||||
You can encapsulate this in a small “Reachability Engine” module with a uniform API:
|
||||
|
||||
```csharp
|
||||
ReachabilityResult CheckReachability(
|
||||
ArtifactId artifact,
|
||||
string language,
|
||||
SymbolId from,
|
||||
SymbolId to,
|
||||
ReachabilityContext ctx);
|
||||
```
|
||||
|
||||
Internally it:
|
||||
|
||||
1. Looks up `callgraph(artifact, language)`.
|
||||
2. Checks in-memory cache for the key `(artifact, lang, from, to, ctxhash)`.
|
||||
3. If miss:
|
||||
|
||||
* Does bidirectional BFS.
|
||||
* Stores result (reachable, canonical path) into cache.
|
||||
4. Returns result.
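
A sketch of the deterministic search itself: plain BFS over pre-sorted adjacency lists (the graph shape is an assumption, and a production engine would likely use bidirectional search as noted above):

```csharp
using System.Collections.Generic;

public sealed class ReachabilityEngine
{
    // adjacency[n] is sorted once at graph build time, so traversal order is deterministic.
    private readonly IReadOnlyDictionary<int, int[]> _adjacency;
    public ReachabilityEngine(IReadOnlyDictionary<int, int[]> sortedAdjacency) => _adjacency = sortedAdjacency;

    // Returns the first shortest path discovered under deterministic ordering, or null if unreachable.
    public IReadOnlyList<int>? FindPath(int from, int to)
    {
        var parent = new Dictionary<int, int> { [from] = from };
        var queue = new Queue<int>();
        queue.Enqueue(from);

        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            if (node == to)
            {
                var path = new List<int>();
                for (var n = to; ; n = parent[n]) { path.Add(n); if (n == from) break; }
                path.Reverse();
                return path;
            }
            if (!_adjacency.TryGetValue(node, out var callees)) continue;
            foreach (var callee in callees)                // sorted => same result every run
                if (parent.TryAdd(callee, node)) queue.Enqueue(callee);
        }
        return null; // not reachable
    }
}
```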
|
||||
|
||||
### 4.2. Determinism
|
||||
|
||||
For Stella’s deterministic / replayable scans, you want:
|
||||
|
||||
* **Deterministic traversal order** inside the BFS:
|
||||
|
||||
* Sort adjacency lists once when building the graph.
|
||||
* Then BFS always visits neighbors in the same order.
|
||||
* **Canonical path**:
|
||||
|
||||
* Always store/report the lexicographically minimal shortest path, or “first discovered shortest path” with deterministic adjacency sorting.
|
||||
|
||||
For your **replay bundles**, you simply store:
|
||||
|
||||
* The original graph (or a hash referencing it).
|
||||
* The list of queries and their results (including the paths).
|
||||
* Optionally, the cache is re-derivable from those query logs.
|
||||
|
||||
You do not need to store the whole cache; you just need enough to replay.
|
||||
|
||||
---
|
||||
|
||||
## 5. When you still might want precomputation (but limited)
|
||||
|
||||
You might selectively precompute for:
|
||||
|
||||
* **Hot sinks**:
|
||||
|
||||
* e.g. `Runtime.exec`, `ProcessBuilder.start`, dangerous deserialization APIs, SQL exec functions.
|
||||
* **Framework entrypoints**:
|
||||
|
||||
* HTTP controllers, message handlers, scheduled tasks.
|
||||
|
||||
For those, you can precompute or at least **pre-index outgoing/incoming cones**:
|
||||
|
||||
* E.g. build “forward cones” of dangerous sinks:
|
||||
|
||||
* From each sink, compute reachable callers up to some depth and store only that summary.
|
||||
* Or “backward cones” from hot entrypoints.
|
||||
|
||||
This is much cheaper than all-pairs and helps common queries (CVE affecting a known sink) to be answered almost instantly.
|
||||
|
||||
---
|
||||
|
||||
## 6. How this maps to your multi-language priority (Java, JS, Python, Go, C/C++, .NET)
|
||||
|
||||
The strategy is language-agnostic:
|
||||
|
||||
* Each language plugin produces:
|
||||
|
||||
* `CallGraphNodes` (with stable `SymbolId`s).
|
||||
* `CallGraphEdges`.
|
||||
* The reachability engine is shared; only symbol encoding / mapping is language-specific.
|
||||
|
||||
The “once per call, then cache” strategy stays identical:
|
||||
|
||||
1. Normalize `(from_symbol, to_symbol)` into internal IDs.
|
||||
2. Run search.
|
||||
3. Cache.
|
||||
|
||||
---
|
||||
|
||||
## 7. Concrete recommendation for Stella Ops
|
||||
|
||||
If I distill it into a decision:
|
||||
|
||||
* Do **not** precompute all reachability; it will not scale.
|
||||
* Do **precompute and store only the call graph** per artifact.
|
||||
* Implement a **deterministic, on-demand reachability engine** with:
|
||||
|
||||
* In-scan in-memory cache (mandatory).
|
||||
* Optional persisted cache (per deployment).
|
||||
* Add the query log + results to your **deterministic scan manifest** so each answer is replayable.
|
||||
|
||||
If you like, next step I can sketch a small C# interface and data model for `StellaOps.Reachability` (nodes, edges, cache keys, and the main `CheckReachability` entrypoint) that your agents can implement directly.
|
||||
Below is a distilled and expanded articulation of **Stella Ops’ competitive advantages**, followed by **deep, developer-oriented guidelines** for implementing them consistently across all modules (Scanner, Sbomer, Authority, Vexer, Concelier, Excitors, Scheduler, Feedser, Gateway, UI).
|
||||
|
||||
The intention is to give you both the **strategic narrative** and the **engineering discipline** to ensure Stella retains a fundamental moat versus all classical scanners.
|
||||
|
||||
---
|
||||
|
||||
# Stella Ops: Expanded Advantages
|
||||
|
||||
Structured in a way that a developer or architect can immediately translate into code, data models, policies, and UX surfaces.
|
||||
|
||||
## 1. Deterministic Security Engine
|
||||
|
||||
**Advantage:** Modern vulnerability scanners produce non-deterministic results: changing feeds, inconsistent call-graphs, transient metadata. Stella Ops produces **replayable evidence**.
|
||||
|
||||
### What this means for developers
|
||||
|
||||
* Every scan must produce a **Manifest of Deterministic Inputs**:
|
||||
|
||||
* Feed versions, rule versions, SBOM versions, VEX versions.
|
||||
* Hashes of each input.
|
||||
* The scan output must be fully reproducible with no external network calls.
|
||||
* Every module must support a **Replay Mode**:
|
||||
|
||||
* Inputs only from the manifest bundle.
|
||||
* Deterministic ordering of graph traversals, vulnerability matches, and path results.
|
||||
* No module may fetch anything non-pinned or non-hashed.
|
||||
|
||||
**Outcome:** An auditor can verify the exact same result years later.
|
||||
|
||||
---
|
||||
|
||||
## 2. Proof-Linked SBOM → VEX Chain
|
||||
|
||||
**Advantage:** Stella generates **cryptographically signed evidence graphs**, not just raw SBOMs and JSON VEX files.
|
||||
|
||||
### Developer requirements
|
||||
|
||||
* Always produce **DSSE attestations** for SBOM, reachability, and call-graph outputs.
|
||||
* The Authority service maintains a **Proof Ledger** linking:
|
||||
|
||||
* SBOM digest → Reachability digest → VEX reduction digest → Final policy decision.
|
||||
* Each reduction step records:
|
||||
|
||||
* Rule ID, lattice rule, inputs digests, output digest, timestamp, signer.
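
One possible shape for a single ledger record; the names are assumptions, not the final schema:

```csharp
using System;
using System.Collections.Generic;

// Illustrative shape of one Proof Ledger reduction record.
public sealed record ProofLedgerEntry(
    string RuleId,
    string LatticeRule,
    IReadOnlyList<string> InputDigests,   // e.g. SBOM digest, reachability digest
    string OutputDigest,                  // e.g. VEX reduction or policy decision digest
    DateTimeOffset Timestamp,
    string Signer,                        // key or identity that signed the DSSE envelope
    byte[] DsseEnvelope);
```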
|
||||
|
||||
**Outcome:** A customer can present a *chain of proof*, not a PDF.
|
||||
|
||||
---
|
||||
|
||||
## 3. Compositional Reachability Engine
|
||||
|
||||
**Advantage:** Stella calculates call-stack reachability **on demand**, with deterministic caching and pre-summarized third-party libraries.
|
||||
|
||||
### Developer requirements
|
||||
|
||||
* Store only the **call graph** per artifact.
|
||||
* Provide an engine API:
|
||||
|
||||
```csharp
|
||||
ReachabilityResult Query(ArtifactId a, SymbolId from, SymbolId to, Context ctx);
|
||||
```
|
||||
* Ensure deterministic BFS/bidirectional BFS with sorted adjacency lists.
|
||||
* Cache on:
|
||||
|
||||
* `(artifact_digest, from_id, to_id, ctx_hash)`.
|
||||
* Store optional summaries for:
|
||||
|
||||
* Hot sinks (deserialization, SQL exec, command exec).
|
||||
* Framework entrypoints (HTTP handlers, queues).
|
||||
|
||||
**Outcome:** Fast and precise evidence, not “best guess” matching.
|
||||
|
||||
---
|
||||
|
||||
## 4. Lattice-Based VEX Resolution
|
||||
|
||||
**Advantage:** A visual “Trust Algebra Studio” where users define how VEX, vendor attestations, runtime info, and internal evidence merge.
|
||||
|
||||
### Developer requirements
|
||||
|
||||
* Implement lattice operators as code interfaces:
|
||||
|
||||
```csharp
|
||||
interface ILatticeRule {
|
||||
EvidenceState Combine(EvidenceState left, EvidenceState right);
|
||||
}
|
||||
```
|
||||
* Produce canonical merge logs for every decision.
|
||||
* Store the final state with:
|
||||
|
||||
* Trace of merges, reductions, evidence nodes.
|
||||
* Ensure monotonic, deterministic ordering of rule evaluation.
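
For illustration only, one trivial rule implementing the `ILatticeRule` interface above; the state ordering is an assumed policy choice, not the shipped default:

```csharp
// One deliberately simple rule; the EvidenceState ordering is an assumption made
// only to make the interface concrete.
public enum EvidenceState { Unknown = 0, Affected = 1, Mitigated = 2, NotAffected = 3 }

public sealed class StrongestClaimWinsRule : ILatticeRule
{
    // Deterministic and order-independent: the maximum over a total order on states.
    public EvidenceState Combine(EvidenceState left, EvidenceState right)
        => left >= right ? left : right;
}
```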
|
||||
|
||||
**Outcome:** Transparent, explainable policy outcomes, not opaque severity scores.
|
||||
|
||||
---
|
||||
|
||||
## 5. Quiet-by-Design Vulnerability Triage
|
||||
|
||||
**Advantage:** Stella only flags what is provable and relevant, unlike noisy scanners.
|
||||
|
||||
### Developer requirements
|
||||
|
||||
* Every finding must include:
|
||||
|
||||
* Evidence chain
|
||||
* Reachability path (or absence of one)
|
||||
* Provenance
|
||||
* Confidence class
|
||||
* Findings must be grouped by:
|
||||
|
||||
* Exploitable
|
||||
* Probably exploitable
|
||||
* Non-exploitable
|
||||
* Unknown (with ranking of unknowns)
|
||||
* Unknowns must be ranked by:
|
||||
|
||||
* Distance to sinks
|
||||
* Structural entropy
|
||||
* Pattern similarity to vulnerable nodes
|
||||
* Missing metadata dimensions
|
||||
|
||||
**Outcome:** DevOps receives actionable intelligence, not spreadsheet chaos.
|
||||
|
||||
---
|
||||
|
||||
## 6. Crypto-Sovereign Readiness
|
||||
|
||||
**Advantage:** Stella works in any national crypto regime (eIDAS, FIPS, GOST, SM2/3/4, PQC).
|
||||
|
||||
### Developer requirements
|
||||
|
||||
* Modular signature providers:
|
||||
|
||||
```csharp
// Parameter shapes are illustrative; the real abstraction is policy-driven.
public interface ISignatureProvider
{
    byte[] Sign(byte[] payload, string keyId);
    bool Verify(byte[] payload, byte[] signature, string keyId);
}
```
|
||||
* Allow switching signature suite via configuration.
|
||||
* Include post-quantum functions (Dilithium/Falcon) for long-term archival.
|
||||
|
||||
**Outcome:** Sovereign deployments across Europe, Middle East, Asia without compromise.
|
||||
|
||||
---
|
||||
|
||||
## 7. Proof-of-Integrity Graph (Runtime → Build Ancestry)
|
||||
|
||||
**Advantage:** Stella links running containers to provable build origins.
|
||||
|
||||
### Developer requirements
|
||||
|
||||
* Each runtime probe generates:
|
||||
|
||||
* Container digest
|
||||
* Build recipe digest
|
||||
* Git commit digest
|
||||
* SBOM + VEX chain
|
||||
* Graph nodes: artifacts; edges: integrity proofs.
|
||||
* The final ancestry graph must be persisted and queryable:
|
||||
|
||||
* “Show me all running containers derived from a compromised artifact.”
|
||||
|
||||
**Outcome:** Real runtime accountability.
|
||||
|
||||
---
|
||||
|
||||
## 8. Adaptive Trust Economics
|
||||
|
||||
**Advantage:** Vendors earn trust credits; untrustworthy artifacts lose trust weight.
|
||||
|
||||
### Developer requirements
|
||||
|
||||
* Trust scoring function must be deterministic and signed.
|
||||
* Inputs:
|
||||
|
||||
* Vendor signature quality
|
||||
* Update cadence
|
||||
* Vulnerability density
|
||||
* Historical reliability
|
||||
* SBOM completeness
|
||||
* Store trust evolution over time for auditing.
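
A minimal sketch of such a deterministic scoring function over the inputs listed above; the weights, input ranges, and rounding are illustrative assumptions, and the output would be signed via Authority:

```csharp
using System;

// Illustrative deterministic trust score; weights and input ranges are assumptions.
public sealed record TrustInputs(
    double SignatureQuality,      // 0..1
    double UpdateCadence,         // 0..1
    double VulnerabilityDensity,  // 0..1, higher is worse
    double HistoricalReliability, // 0..1
    double SbomCompleteness);     // 0..1

public static class TrustScoring
{
    public static double Compute(TrustInputs i) =>
        Math.Round(
            0.25 * i.SignatureQuality +
            0.15 * i.UpdateCadence +
            0.20 * (1.0 - i.VulnerabilityDensity) +
            0.25 * i.HistoricalReliability +
            0.15 * i.SbomCompleteness,
            4); // fixed rounding keeps the output hash-stable across runs
}
```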
|
||||
|
||||
**Outcome:** Procurement decisions driven by quantifiable reliability, not guesswork.
|
||||
|
||||
---
|
||||
|
||||
# Developers Guidelines for Implementing Those Advantages
|
||||
|
||||
Here is an actionable, module-by-module guideline set.
|
||||
|
||||
---
|
||||
|
||||
# Global Engineering Principles (apply to all modules)
|
||||
|
||||
1. **Determinism First**
|
||||
|
||||
* All loops with collections must use sorted structures.
|
||||
* All graph algorithms must use canonical neighbor ordering.
|
||||
* All outputs must be hash-stable.
|
||||
|
||||
2. **Evidence Everywhere**
|
||||
|
||||
* Every decision includes a provenance node.
|
||||
* Never return a boolean without a proof trail.
|
||||
|
||||
3. **Separation of Reduction Steps**
|
||||
|
||||
* SBOM generation
|
||||
* Vulnerability mapping
|
||||
* Reachability estimation
|
||||
* VEX reduction
|
||||
* Policy/Lattice resolution
|
||||
must be separate services or separate steps with isolated digests.
|
||||
|
||||
4. **Offline First**
|
||||
|
||||
* Feed updates must be packaged and pinned.
|
||||
* No live API calls allowed during scanning.
|
||||
|
||||
5. **Replay Mode Required**
|
||||
|
||||
* Every service can re-run the scan from recorded evidence without external data.
|
||||
|
||||
---
|
||||
|
||||
# Module-Specific Developer Guidelines
|
||||
|
||||
## Scanner
|
||||
|
||||
* Perform layered FS exploration deterministically.
|
||||
* Load vulnerability datasets from Feedser by digest.
|
||||
* For each match, require:
|
||||
|
||||
* Package evidence
|
||||
* Version bound match
|
||||
* Full rule trace.
|
||||
|
||||
## Sbomer
|
||||
|
||||
* Produce SPDX 3.0.1 + CycloneDX 1.6 simultaneously.
|
||||
* Emit DSSE attestations.
|
||||
* Guarantee stable ordering of all components.
|
||||
|
||||
## Reachability Engine
|
||||
|
||||
* Implement deterministic bidirectional BFS.
|
||||
* Add “unknown symbol” ranking heuristics.
|
||||
* Cache per `(artifact, from, to, context_hash)`.
|
||||
* Store path and evidence.
|
||||
|
||||
## Vexer / Excitors
|
||||
|
||||
* Interpret vendor VEX, internal evidence, runtime annotations.
|
||||
* Merge using lattice logic.
|
||||
* Produce signed reduction logs.
|
||||
|
||||
## Concelier
|
||||
|
||||
* Enforces policies using lattice outputs.
|
||||
* Must produce a “policy decision record” per artifact.
|
||||
* Must include a machine-readable decision DAG.
|
||||
|
||||
## Feedser
|
||||
|
||||
* Feeds emitted with version, timestamp, immutable hash.
|
||||
* Must support delta updates.
|
||||
* Must support full offline installation.
|
||||
|
||||
## Authority
|
||||
|
||||
* Stores all proofs in a verifiable ledger.
|
||||
* Exposes graph queries:
|
||||
|
||||
* “Show the exact evidence chain leading to this status.”
|
||||
|
||||
## Scheduler
|
||||
|
||||
* Ensures scan runs are reproducible.
|
||||
* Logs all parameters.
|
||||
* Stores outputs with manifest.
|
||||
|
||||
## UI / Gateway
|
||||
|
||||
* No decision without showing the proof chain.
|
||||
* Always show:
|
||||
|
||||
* Path evidence
|
||||
* Lattice reasoning
|
||||
* Final verdict with confidence class
|
||||
|
||||
---
|
||||
|
||||
# Final Deliverable: Stella Ops Advantage Summary for Developers
|
||||
|
||||
You can give this to your engineers:
|
||||
|
||||
1. Produce **deterministic, replayable evidence** for every scan.
|
||||
2. Store and link SBOM → Reachability → VEX → Policy outputs into a **Proof Ledger**.
|
||||
3. Use **compositional reachability** with deterministic caching.
|
||||
4. Implement **lattice-based evidence merging**.
|
||||
5. Maintain **quiet-by-design triage**: only show evidence-proven risks.
|
||||
6. Support **sovereign crypto** including PQC.
|
||||
7. Build a **runtime integrity graph** referencing build ancestry.
|
||||
8. Maintain **trust economics** for artifacts and vendors.
|
||||
9. Everything must be auditable, replayable, signed, and offline-capable.
|
||||
|
||||
If you want, I can now produce **architectural diagrams**, **module interfaces**, or **a complete AGENTS.md charter** incorporating these guidelines.
|
||||
---
|
||||
Here’s a compact idea you can drop straight into Stella Ops: make CVSS scoring **reproducible and auditable** by binding inputs to signed evidence and re‑scoring only what’s actually reachable.
|
||||
|
||||
### Why this matters (quick background)
|
||||
|
||||
* **CVSS** is a formula; the weak point is *inputs* (AV/AC/PR/UI/S/C/I/A + environment).
|
||||
* Inputs often come from ad‑hoc notes, so scores drift and aren’t traceable.
|
||||
* **SBOMs** (CycloneDX/SPDX) + **reachability** (traces from code/dep graphs) + **in‑toto/DSSE attestations** can freeze those inputs as verifiable evidence.
|
||||
|
||||
### The core concept: ScoreGraph
|
||||
|
||||
A normalized graph that ties every CVSS input to a signed fact:
|
||||
|
||||
* **Nodes:** Vulnerability, Package, Version, Artifact, Evidence (SBOM entry, reachability trace, config fact), EnvironmentAssumption, Score.
|
||||
* **Edges:** *derived_from*, *observed_in*, *signed_by*, *applies_to_env*, *supersedes*.
|
||||
* **Rules:** a tiny engine that recomputes CVSS when upstream evidence or assumptions change.
|
||||
|
||||
### Minimal schema (skeletal)
|
||||
|
||||
```yaml
|
||||
# scheduler/scoregraph.schema.yml
|
||||
ScoreGraph:
|
||||
nodes:
|
||||
Vulnerability: { id: string } # e.g., CVE-2025-12345
|
||||
Artifact: { digest: sha256, name: string }
|
||||
Package: { purl: string, version: string }
|
||||
Evidence:
|
||||
id: string
|
||||
kind: [SBOM, Reachability, Config, VEX, Manual]
|
||||
attestation: { dsse: bytes, signer: string, rekor: string? }
|
||||
hash: sha256
|
||||
observedAt: datetime
|
||||
EnvAssumption:
|
||||
id: string
|
||||
key: string # e.g., "scope.network"
|
||||
value: string|bool
|
||||
provenance: string # policy file, ticket, SOP
|
||||
attestation?: { dsse: bytes }
|
||||
Score:
|
||||
id: string
|
||||
cvss: { v: "3.1|4.0", base: number, temporal?: number, environmental?: number }
|
||||
inputsRef: string[] # Evidence/EnvAssumption ids
|
||||
computedAt: datetime
|
||||
edges:
|
||||
- { from: Vulnerability, to: Package, rel: "affects" }
|
||||
- { from: Package, to: Artifact, rel: "contained_in" }
|
||||
- { from: Evidence, to: Package, rel: "observed_in" }
|
||||
- { from: Evidence, to: Vulnerability, rel: "supports" }
|
||||
- { from: EnvAssumption, to: Score, rel: "parameter_of" }
|
||||
- { from: Evidence, to: Score, rel: "input_to" }
|
||||
- { from: Score, to: Vulnerability, rel: "score_for" }
|
||||
```
|
||||
|
||||
### How it works (end‑to‑end)
|
||||
|
||||
1. **Bind facts:**
|
||||
|
||||
* Import SBOM → create `Evidence(SBOM)` nodes signed via DSSE.
|
||||
* Import reachability traces (e.g., call‑graph hits, route exposure) → `Evidence(Reachability)`.
|
||||
* Record environment facts (network scope, auth model, mitigations) as `EnvAssumption` with optional DSSE attestation.
|
||||
2. **Normalize CVSS inputs:** A mapper converts Evidence/Assumptions → AV/AC/PR/UI/S/C/I/A (and CVSS v4.0 metrics if you adopt them).
|
||||
3. **Compute score:** Scheduler assembles a **ScoreRun** from referenced inputs; emits `Score` node plus a diff against prior `Score`.
|
||||
4. **Make deltas auditable:** Every score carries `inputsRef` hashes and signer IDs; any change shows *which* fact moved and *why*.
|
||||
5. **Trace‑based rescoring:** If a reachability trace flips (e.g., method no longer reachable), only affected `Score` nodes are recomputed.
|
||||
|
||||
### Where to put it in Stella Ops
|
||||
|
||||
* **Scheduler**: owns ScoreGraph lifecycle and re‑score jobs.
|
||||
* **Scanner/Vexer**: produce Evidence nodes (reachability, VEX).
|
||||
* **Authority**: verifies DSSE, Rekor anchors, and maintains trusted keys.
|
||||
* **Concelier**: policy that decides when a delta is “material” (e.g., gate builds if Environmental score ≥ threshold).
|
||||
|
||||
### Minimal APIs (developer‑friendly)
|
||||
|
||||
```http
|
||||
POST /scoregraph/evidence
|
||||
POST /scoregraph/env-assumptions
|
||||
POST /scoregraph/score:compute # body: { vulnId, artifactDigest, envProfileId }
|
||||
GET /scoregraph/score/{id} # returns inputs + signatures + diff vs previous
|
||||
GET /scoregraph/vuln/{id}/history
|
||||
```
|
||||
|
||||
### Quick implementation steps
|
||||
|
||||
1. **Define protobuf/JSON contracts** for ScoreGraph nodes/edges (under `Scheduler.Contracts`).
|
||||
2. **Add DSSE verify utility** in Authority SDK (accepts multiple key suites incl. PQC toggle).
|
||||
3. **Write mappers**: Evidence → CVSS inputs (v3.1 now, v4.0 behind a feature flag).
|
||||
4. **Implement rescoring triggers**: on new Evidence, EnvAssumption change, or artifact rebuild.
|
||||
5. **Ship a “replay file”** (deterministic run manifest: feed hashes, policies, versions) to make any score reproducible offline.
|
||||
6. **UI**: a “Why this score?” panel listing inputs, signatures, and a one‑click diff between Score versions.
|
||||
|
||||
### Guardrails
|
||||
|
||||
* **No unsigned inputs** in production mode.
|
||||
* **Environment profiles** are versioned (e.g., `onprem‑dmz‑v3`), so ops changes don’t silently alter scores.
|
||||
* **Reachability confidence** annotated (static/dynamic/probe); low confidence requires human sign‑off Evidence.
|
||||
|
||||
If you want, I can draft the C# contracts and the Scheduler job that builds a `ScoreRun` from a set of Evidence/Assumptions next.
|
||||
### Stella Ops advantage: “Score-as-Evidence” instead of “Score-as-Opinion”
|
||||
|
||||
The core upgrade is that Stella Ops treats every CVSS input as **a derived value backed by signed, immutable evidence** (SBOM, VEX, reachability, config, runtime exposure), and makes scoring **deterministic + replayable**.
|
||||
|
||||
Here’s what that buys you, in practical terms.
|
||||
|
||||
## 1) Advantages for Stella Ops (product + platform)
|
||||
|
||||
### A. Reproducible risk you can replay in audits
|
||||
|
||||
* Every score is tied to **exact artifacts** (`sha256`), exact dependencies (`purl@version`), exact policies, and exact evidence hashes.
|
||||
* You can “replay” a score later and prove: *same inputs → same vector → same score*.
|
||||
* Great for: SOC2/ISO narratives, incident postmortems (“why did we ship?”), customer security reviews.
|
||||
|
||||
### B. Fewer false positives via reachability + exposure context
|
||||
|
||||
Traditional scanners flag “present in SBOM” as “risky”. Stella Ops can separate:
|
||||
|
||||
* **Present but unreachable** (e.g., dead code path, optional feature never enabled)
|
||||
* **Reachable but not exposed** (internal-only, behind auth)
|
||||
* **Externally exposed + reachable** (highest priority)
|
||||
|
||||
This lets you cut vulnerability “noise” without hiding anything—because the de-prioritization is itself **evidence-backed**.
|
||||
|
||||
### C. Faster triage: “Why this score?” becomes a clickable chain
|
||||
|
||||
Each score can explain itself:
|
||||
|
||||
* “CVE affects package X → present in artifact Y → reachable via path Z → exposed on ingress route R → env assumption S → computed vector …”
|
||||
|
||||
That collapses hours of manual investigation into a few verifiable links.
|
||||
|
||||
### D. Incremental rescoring instead of full rescans
|
||||
|
||||
The scheduler only recomputes what changes:
|
||||
|
||||
* New SBOM? Only affected packages/artifacts.
|
||||
* Reachability trace changes? Only scores referencing those traces.
|
||||
* Environment profile changes? Only scores for that profile.
|
||||
|
||||
This is huge for monorepos and large fleets.
|
||||
|
||||
### E. Safe collaboration: humans can override, but overrides are signed & expiring
|
||||
|
||||
Stella Ops can support:
|
||||
|
||||
* Manual “not exploitable because …” decisions
|
||||
* Mitigation acceptance (WAF, sandbox, feature flag)
|
||||
…but as first-class evidence with:
|
||||
* signer identity + ticket link
|
||||
* scope (which artifacts/envs)
|
||||
* TTL/expiry (forces revalidation)
|
||||
|
||||
### F. Clear separation of “Base CVSS” vs “Your Environment”
|
||||
|
||||
A key differentiator: don’t mutate upstream base CVSS.
|
||||
|
||||
* Store vendor/NVD base vector as **BaseScore**
|
||||
* Compute Stella’s **EnvironmentalScore** (CVSS environmental metrics + policy overlays) from evidence
|
||||
|
||||
That preserves compatibility while still making the score reflect reality.
|
||||
|
||||
### G. Standard-friendly integration (future-proof)
|
||||
|
||||
Even if your internal graph is proprietary, your inputs/outputs can align with:
|
||||
|
||||
* SBOM: CycloneDX / SPDX
|
||||
* “Not affected / fixed”: OpenVEX / CSAF VEX
|
||||
* Provenance/attestation: DSSE / in-toto style envelopes (and optionally transparency logs)
|
||||
|
||||
This reduces vendor lock-in fears and eases ecosystem integrations.
|
||||
|
||||
---
|
||||
|
||||
## 2) Reference flow (how teams actually use it)
|
||||
|
||||
```
|
||||
CI build
|
||||
├─ generate SBOM (CycloneDX/SPDX)
|
||||
├─ generate build provenance attestation
|
||||
├─ run reachability (static/dynamic)
|
||||
└─ sign all outputs (DSSE)
|
||||
|
||||
Stella Ops ingestion
|
||||
├─ verify signatures + signer trust
|
||||
├─ create immutable Evidence nodes
|
||||
└─ link Evidence → Package → Artifact → Vulnerability
|
||||
|
||||
Scheduler
|
||||
├─ assemble ScoreRun(inputs)
|
||||
├─ compute Base + Environmental score
|
||||
└─ emit Score node + diff
|
||||
|
||||
Policy/Gates
|
||||
├─ PR comment: “risk delta”
|
||||
├─ build gate: threshold rules
|
||||
└─ deployment gate: env-profile-specific
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Developer Guidelines (building it without foot-guns)
|
||||
|
||||
## Guideline 1: Make everything immutable and content-addressed
|
||||
|
||||
**Do**
|
||||
|
||||
* Evidence ID = `sha256(canonical_payload_bytes)`
|
||||
* Artifact ID = `sha256(image_or_binary)`
|
||||
* ScoreRun ID = `sha256(sorted_input_ids + policy_version + scorer_version)`
|
||||
|
||||
**Don’t**
|
||||
|
||||
* Allow “editing” evidence. Corrections are *new evidence* with `supersedes` links.
|
||||
|
||||
This makes dedupe, caching, and audit trails trivial.
|
||||
|
||||
---
|
||||
|
||||
## Guideline 2: Canonicalize before hashing/signing
|
||||
|
||||
If two systems serialize JSON differently, hashes won’t match.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
* Use a canonical JSON scheme (e.g., RFC 8785/JCS-style) or a strict protobuf canonical encoding.
|
||||
* Store the canonical bytes alongside the parsed object.
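
A simplified C# sketch of "canonicalize then hash": it sorts object keys recursively and hashes the resulting bytes, but it does not implement the full JCS number-formatting rules, so treat it as a stand-in rather than an RFC 8785 implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

public static class CanonicalHash
{
    public static string Sha256OfCanonicalJson(string json)
    {
        var node = JsonNode.Parse(json) ?? throw new ArgumentException("empty JSON document");
        var canonicalBytes = Encoding.UTF8.GetBytes(Canonicalize(node).ToJsonString());
        return Convert.ToHexString(SHA256.HashData(canonicalBytes)).ToLowerInvariant();
    }

    // Recursively rebuild the tree with object keys in ordinal order so that two
    // semantically equal documents serialize to the same bytes.
    private static JsonNode Canonicalize(JsonNode node) => node switch
    {
        JsonObject obj => new JsonObject(
            obj.OrderBy(p => p.Key, StringComparer.Ordinal)
               .Select(p => new KeyValuePair<string, JsonNode?>(
                   p.Key, p.Value is null ? null : Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(
            arr.Select(e => e is null ? null : Canonicalize(e)).ToArray()),
        _ => node.DeepClone()   // leaf values (strings, numbers, booleans)
    };
}
```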
|
||||
|
||||
---
|
||||
|
||||
## Guideline 3: Treat attestations as the security boundary
|
||||
|
||||
Ingestion should be “verify-then-store”.
|
||||
|
||||
**Do**
|
||||
|
||||
* Verify DSSE signature
|
||||
* Verify signer identity against allowlist / trust policy
|
||||
* Optionally verify transparency-log inclusion (if you use one)
|
||||
|
||||
**Don’t**
|
||||
|
||||
* Accept unsigned “manual” facts in production mode.
|
||||
|
||||
---
|
||||
|
||||
## Guideline 4: Keep Base CVSS pristine; compute environment as a separate layer
|
||||
|
||||
**Data model pattern**
|
||||
|
||||
* `Score.cvss.base` = upstream (vendor/NVD) vector + version + source
|
||||
* `Score.cvss.environmental` = Stella computed (with Modified metrics + requirements)
|
||||
|
||||
**Why**
|
||||
|
||||
* Preserves comparability and avoids arguments about “changing CVSS”.
|
||||
|
||||
---
|
||||
|
||||
## Guideline 5: Define a strict mapping from evidence → environmental metrics
|
||||
|
||||
Create one module that converts evidence into CVSS environmental knobs.
|
||||
|
||||
Example mapping rules (illustrative):
|
||||
|
||||
* Reachability says “only callable via local admin CLI” → `MAV=Local`
|
||||
* Config evidence says “internet-facing ingress” → `MAV=Network`
|
||||
* Mitigation evidence says “strong sandbox + no data exfil path” → lower confidentiality impact *only if* the mitigation is signed and scoped
|
||||
|
||||
**Do**
|
||||
|
||||
* Put these mappings in versioned policy code (“score policy v12”)
|
||||
* Record `policyVersion` inside Score
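
A tiny sketch of what one such mapping could look like in code; the evidence flags and precedence order are illustrative and would normally live in the versioned score policy rather than being hard-coded:

```csharp
// Illustrative mapper from evidence facts to the CVSS v3.1 Modified Attack Vector.
public enum ModifiedAttackVector { NotDefined, Network, Adjacent, Local, Physical }

public static class EnvironmentalMapper
{
    public static ModifiedAttackVector MapAttackVector(
        bool internetFacingIngress,     // from signed Config evidence
        bool reachableOnlyViaLocalCli)  // from signed Reachability evidence
    {
        // Exposure evidence wins over reachability narrowing, evaluated in a fixed order.
        if (internetFacingIngress) return ModifiedAttackVector.Network;
        if (reachableOnlyViaLocalCli) return ModifiedAttackVector.Local;
        return ModifiedAttackVector.NotDefined; // fall back to the base metric
    }
}
```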
|
||||
|
||||
---
|
||||
|
||||
## Guideline 6: Reachability evidence must include confidence + method
|
||||
|
||||
Reachability is never perfectly certain. Encode that.
|
||||
|
||||
**Reachability evidence fields**
|
||||
|
||||
* `method`: `static_callgraph | dynamic_trace | fuzz_probe | manual_review`
|
||||
* `confidence`: 0–1 or `low/med/high`
|
||||
* `path`: minimal path proof (entrypoint → sink)
|
||||
* `scope`: commit SHA, build ID, feature flags, runtime config
|
||||
|
||||
**Policy tip**
|
||||
|
||||
* Low-confidence reachability should not auto-downgrade risk without a human sign-off evidence node.
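
One possible C# shape for this evidence, covering the fields above; names are assumptions:

```csharp
using System.Collections.Generic;

// Possible shape for reachability evidence carrying method + confidence.
public enum ReachabilityMethod { StaticCallgraph, DynamicTrace, FuzzProbe, ManualReview }
public enum ReachabilityConfidence { Low, Medium, High }

public sealed record ReachabilityEvidence(
    ReachabilityMethod Method,
    ReachabilityConfidence Confidence,
    IReadOnlyList<string> Path,          // minimal entrypoint → sink proof
    string CommitSha,
    string BuildId,
    IReadOnlyList<string> FeatureFlags); // plus any runtime config that scoped the trace
```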
|
||||
|
||||
---
|
||||
|
||||
## Guideline 7: Make scoring deterministic (down to library versions)
|
||||
|
||||
A score should be reproducible on a laptop later.
|
||||
|
||||
**Do**
|
||||
|
||||
* Freeze scorer implementation version (`scorerSemver`, `gitSha`)
|
||||
* Store the computed **vector string**, not just numeric score
|
||||
* Store the exact inputsRef list
|
||||
|
||||
---
|
||||
|
||||
## Guideline 8: Design the Scheduler as a pure function + incremental triggers
|
||||
|
||||
**Event triggers**
|
||||
|
||||
* `EvidenceAdded`
|
||||
* `EvidenceSuperseded`
|
||||
* `EnvProfileChanged`
|
||||
* `ArtifactBuilt`
|
||||
* `VEXUpdated`
|
||||
|
||||
**Incremental rule**
|
||||
|
||||
* Recompute scores that reference changed input IDs, and scores reachable via graph edges (Vuln→Package→Artifact).
|
||||
|
||||
---
|
||||
|
||||
## Guideline 9: Implement overrides as evidence, not database edits
|
||||
|
||||
Override workflow:
|
||||
|
||||
* Create `Evidence(kind=Manual|VEX)` referencing a ticket and rationale
|
||||
* Signed by authorized role
|
||||
* Scoped (artifact + env profile)
|
||||
* Has expiry
|
||||
|
||||
This prevents “quiet” risk suppression.
|
||||
|
||||
---
|
||||
|
||||
## Guideline 10: Provide “developer ergonomics” by default
|
||||
|
||||
If engineers don’t like it, they’ll bypass it.
|
||||
|
||||
**Must-have DX**
|
||||
|
||||
* PR comment: “risk delta” (before/after dependency bump)
|
||||
* One-click “Why this score?” graph trace
|
||||
* Local replay tool: `stella score replay --scoreRunId …`
|
||||
* Clear “what evidence would reduce this risk?” hints:
|
||||
|
||||
* add VEX from vendor
|
||||
* prove unreachable via integration test trace
|
||||
* fix exposure via ingress policy
|
||||
|
||||
---
|
||||
|
||||
# Minimal payload contracts (starter templates)
|
||||
|
||||
### 1) Evidence ingestion (signed DSSE payload inside)
|
||||
|
||||
```json
|
||||
{
|
||||
"kind": "SBOM",
|
||||
"subject": { "artifactDigest": "sha256:..." },
|
||||
"payloadType": "application/vnd.cyclonedx+json",
|
||||
"payload": { "...": "..." },
|
||||
"attestation": {
|
||||
"dsseEnvelope": "BASE64(...)",
|
||||
"signer": "spiffe://org/ci/scanner",
|
||||
"rekorEntry": "optional"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 2) Score compute request
|
||||
|
||||
```json
|
||||
{
|
||||
"vulnId": "CVE-2025-12345",
|
||||
"artifactDigest": "sha256:...",
|
||||
"envProfileId": "onprem-dmz-v3",
|
||||
"cvssVersion": "3.1"
|
||||
}
|
||||
```
|
||||
|
||||
### 3) Score response (what you want to show in UI/logs)
|
||||
|
||||
```json
|
||||
{
|
||||
"scoreId": "score_...",
|
||||
"cvss": {
|
||||
"v": "3.1",
|
||||
"base": 7.5,
|
||||
"environmental": 5.3,
|
||||
"vector": "CVSS:3.1/AV:N/AC:L/..."
|
||||
},
|
||||
"inputsRef": ["evidence_sha256:...", "env_sha256:..."],
|
||||
"policyVersion": "score-policy@12",
|
||||
"scorerVersion": "stella-scorer@1.8.2",
|
||||
"computedAt": "2025-12-09T10:20:30Z"
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## “Definition of Done” checklist for teams integrating with Stella Ops
|
||||
|
||||
* [ ] SBOM generated per build (CycloneDX/SPDX)
|
||||
* [ ] SBOM + build provenance signed (DSSE)
|
||||
* [ ] Reachability evidence produced for critical services
|
||||
* [ ] Environment profiles versioned + signed
|
||||
* [ ] Scoring policy versioned and recorded in Score
|
||||
* [ ] Overrides implemented as expiring signed evidence
|
||||
* [ ] CI gate uses **EnvironmentalScore** thresholds + materiality rules
|
||||
* [ ] PR shows “risk delta” and “why” trace
|
||||
|
||||
---
|
||||
|
||||
If you want to push this from “concept” to “developer-ready”, the next step is to define:
|
||||
|
||||
1. the exact **Evidence payload schemas** per kind (SBOM, Reachability, VEX, Config), and
|
||||
2. the **policy mapping** rules that convert those into CVSS environmental metrics (v3.1 now, v4.0 behind a flag).
|
||||
---
|
||||
Here’s a compact, from‑scratch playbook for running **attestation, verification, and SBOM ingestion fully offline**—including pre‑seeded keyrings, an offline Rekor‑style log, and deterministic evidence reconciliation inside sealed networks.
|
||||
|
||||
---
|
||||
|
||||
# 1) Core concepts (quick)
|
||||
|
||||
* **SBOM**: a machine‑readable inventory (CycloneDX/SPDX) of what’s in an artifact.
|
||||
* **Attestation**: signed metadata (e.g., in‑toto/SLSA provenance, VEX) bound to an artifact’s digest.
|
||||
* **Verification**: cryptographically checking the artifact + attestations against trusted keys/policies.
|
||||
* **Transparency log (Rekor‑style)**: tamper‑evident ledger of entries (hashes + proofs). Offline we use a **private mirror** (no internet).
|
||||
* **Deterministic reconciliation**: repeatable joining of SBOM + attestation + policy into a stable “evidence graph” with identical results when inputs match.
|
||||
|
||||
---
|
||||
|
||||
# 2) Golden inputs you must pre‑seed into the air‑gap
|
||||
|
||||
* **Root of trust**:
|
||||
|
||||
* Vendor/org public keys (X.509 or SSH/age/PGP), **AND** their certificate chains if using Fulcio‑like PKI.
|
||||
* A pinned **CT/transparency log root** (your private one) + inclusion proof parameters.
|
||||
* **Policy bundle**:
|
||||
|
||||
* Verification policies (Cosign/in‑toto rules, VEX merge rules, allow/deny lists).
|
||||
* Hash‑pinned toolchain manifests (exact versions + SHA256 of cosign, oras, jq, your scanners, etc.).
|
||||
* **Evidence bundle**:
|
||||
|
||||
* SBOMs (CycloneDX 1.5/1.6 and/or SPDX 3.0.x).
|
||||
* DSSE‑wrapped attestations (provenance, build, SLSA, VEX).
|
||||
* Optional: vendor CVE feeds/VEX as static snapshots.
|
||||
* **Offline log snapshot**:
|
||||
|
||||
* A **signed checkpoint** (tree head) and **entry pack** (all log entries you rely on), plus Merkle proofs.
|
||||
|
||||
Ship all of the above on signed, write‑once media (WORM/BD‑R or signed tar with detached sigs).
|
||||
|
||||
---
|
||||
|
||||
# 3) Minimal offline directory layout
|
||||
|
||||
```
|
||||
/evidence/
|
||||
keys/
|
||||
roots/ # root/intermediate certs, PGP pubkeys
|
||||
identities/ # per-vendor public keys
|
||||
tlog-root/ # hashed/pinned tlog root(s)
|
||||
policy/
|
||||
verify-policy.yaml # cosign/in-toto verification policies
|
||||
lattice-rules.yaml # your VEX merge / trust lattice rules
|
||||
sboms/ # *.cdx.json, *.spdx.json
|
||||
attestations/ # *.intoto.jsonl.dsig (DSSE)
|
||||
tlog/
|
||||
checkpoint.sig # signed tree head
|
||||
entries/ # *.jsonl (Merkle leaves) + proofs
|
||||
tools/
|
||||
cosign-<ver> (sha256)
|
||||
oras-<ver> (sha256)
|
||||
jq-<ver> (sha256)
|
||||
your-scanner-<ver> (sha256)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 4) Pre‑seeded keyrings (no online CA lookups)
|
||||
|
||||
**Cosign** (example with file‑based roots and identity pins):
|
||||
|
||||
```bash
|
||||
# Verify a DSSE attestation with local roots & identities only
|
||||
COSIGN_EXPERIMENTAL=1 cosign verify-attestation \
|
||||
--key ./evidence/keys/identities/vendor_A.pub \
|
||||
--insecure-ignore-tlog \
|
||||
--certificate-identity "https://ci.vendorA/build" \
|
||||
--certificate-oidc-issuer "https://fulcio.offline" \
|
||||
--rekor-url "http://127.0.0.1:8080" \
|
||||
--policy ./evidence/policy/verify-policy.yaml \
|
||||
<artifact-digest-or-ref>
|
||||
```
|
||||
|
||||
If you do **not** run any server inside the air‑gap, omit `--rekor-url` and use **local tlog proofs** (see §6).
|
||||
|
||||
**in‑toto** (offline layout):
|
||||
|
||||
```bash
|
||||
in-toto-verify \
|
||||
--layout ./attestations/layout.root.json \
|
||||
--layout-keys ./keys/identities/vendor_A.pub \
|
||||
--products <artifact-file>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 5) SBOM ingestion (deterministic)
|
||||
|
||||
1. Normalize SBOMs to a canonical form:
|
||||
|
||||
```bash
|
||||
jq -S . sboms/app.cdx.json > sboms/_canon/app.cdx.json
|
||||
jq -S . sboms/app.spdx.json > sboms/_canon/app.spdx.json
|
||||
```
|
||||
|
||||
2. Validate schemas (use vendored validators).
|
||||
3. Hash‑pin the canonical files and record in a **manifest.lock**:
|
||||
|
||||
```bash
|
||||
sha256sum sboms/_canon/*.json > manifest.lock
|
||||
```
|
||||
|
||||
4. Import into your DB with **idempotent keys = (artifactDigest, sbomHash)**. Reject if same key exists with different bytes.
|
||||
|
||||
---
|
||||
|
||||
# 6) Offline Rekor mirror (no internet)
|
||||
|
||||
Two patterns:
|
||||
|
||||
**A. Embedded file‑ledger (simplest)**
|
||||
|
||||
* Keep `tlog/checkpoint.sig` (signed tree head) and `tlog/entries/*.jsonl` (leaves + inclusion proofs).
|
||||
* During verify:
|
||||
|
||||
* Recompute the Merkle root from entries.
|
||||
* Check it matches `checkpoint.sig` (after verifying its signature with your **tlog root key**).
|
||||
* For each attestation, verify its **UUID / digest** appears in the entry pack and the **inclusion proof** resolves.
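
A minimal sketch of the inclusion-proof check in pattern A, assuming an RFC 6962-style hash scheme (0x00 prefix for leaves, 0x01 for interior nodes) and a proof format that records whether each sibling sits to the left; the real entry-pack layout may differ:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public readonly record struct ProofStep(byte[] SiblingHash, bool SiblingIsLeft);

public static class OfflineTlog
{
    public static byte[] LeafHash(byte[] leafBytes) =>
        SHA256.HashData(new byte[] { 0x00 }.Concat(leafBytes).ToArray());

    public static bool VerifyInclusion(byte[] leafBytes, ProofStep[] path, byte[] expectedRoot)
    {
        var hash = LeafHash(leafBytes);
        foreach (var step in path)
        {
            var combined = step.SiblingIsLeft
                ? step.SiblingHash.Concat(hash)
                : hash.Concat(step.SiblingHash);
            hash = SHA256.HashData(new byte[] { 0x01 }.Concat(combined).ToArray());
        }
        // expectedRoot comes from checkpoint.sig, whose signature must already be
        // verified against the pinned tlog root key.
        return hash.AsSpan().SequenceEqual(expectedRoot);
    }
}
```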
|
||||
|
||||
**B. Private Rekor instance (inside air‑gap)**
|
||||
|
||||
* Run Rekor pointing to your local storage.
|
||||
* Load entries via an **import job** from the entry pack.
|
||||
* Pin the Rekor **public key** in `keys/tlog-root/`.
|
||||
* Verification uses `--rekor-url http://rekor.local:3000` with no outbound traffic.
|
||||
|
||||
> In both cases, verification must **not** fall back to the public internet. Fail closed if proofs or keys are missing.
|
||||
|
||||
---
|
||||
|
||||
# 7) Deterministic evidence reconciliation (the “merge without magic” loop)
|
||||
|
||||
Goal: produce the same “evidence graph” every time given the same inputs.
|
||||
|
||||
Algorithm sketch:
|
||||
|
||||
1. **Index** artifacts by immutable digest.
|
||||
2. For each artifact digest:
|
||||
|
||||
* Collect SBOM nodes (components) from canonical SBOM files.
|
||||
* Collect attestations: provenance, VEX, SLSA, signatures (DSSE).
|
||||
* Validate each attestation **before** merge:
|
||||
|
||||
* Sig verifies with pre‑seeded keys.
|
||||
* (If used) tlog inclusion proof verifies against offline checkpoint.
|
||||
3. **Normalize** all docs (stable sort keys, strip timestamps to allowed fields, lower‑case URIs).
|
||||
4. **Apply lattice rules** (your `lattice-rules.yaml`):
|
||||
|
||||
* Example: `VEX: under_review < affected < fixed < not_affected (statement-trust)` with **vendor > maintainer > 3rd‑party** precedence.
|
||||
* Conflicts resolved via deterministic priority list (source, signature strength, issuance time rounded to minutes, then lexical tiebreak).
|
||||
5. Emit:
|
||||
|
||||
* `evidence-graph.json` (stable node/edge order).
|
||||
* `evidence-graph.sha256` and a DSSE signature from your **Authority** key.
|
||||
|
||||
This gives you **byte‑for‑byte identical** outputs across runs.
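
To make the step-4 tie-break concrete, here is a sketch of the priority-tuple sort; the source ranks, the "earlier wins" time ordering, and the field names are illustrative assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record VexStatement(
    string Source,            // "vendor", "maintainer", "third-party"
    int SignatureStrength,    // higher = stronger key/chain, per crypto policy
    DateTimeOffset IssuedAt,
    string StatementId,
    string Status);

public static class ConflictResolution
{
    private static readonly Dictionary<string, int> SourceRank = new()
    {
        ["vendor"] = 0, ["maintainer"] = 1, ["third-party"] = 2
    };

    public static VexStatement Resolve(IEnumerable<VexStatement> conflicting) =>
        conflicting
            .OrderBy(s => SourceRank.GetValueOrDefault(s.Source, int.MaxValue))
            .ThenByDescending(s => s.SignatureStrength)
            .ThenBy(s => new DateTimeOffset(
                s.IssuedAt.Year, s.IssuedAt.Month, s.IssuedAt.Day,
                s.IssuedAt.Hour, s.IssuedAt.Minute, 0, s.IssuedAt.Offset)) // minute rounding
            .ThenBy(s => s.StatementId, StringComparer.Ordinal)            // lexical tiebreak
            .First();
}
```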
|
||||
|
||||
---
|
||||
|
||||
# 8) Offline provenance for the tools themselves
|
||||
|
||||
* Treat every tool binary in `/evidence/tools/` like a supply‑chain artifact:
|
||||
|
||||
* Keep **SBOM for the tool**, its **checksum**, and a **signature** from your build or a trusted vendor.
|
||||
* Verification policy must reject running a tool without a matching `(checksum, signature)` entry.
|
||||
|
||||
---
|
||||
|
||||
# 9) Example verification policy (cosign‑style, offline)
|
||||
|
||||
```yaml
|
||||
# evidence/policy/verify-policy.yaml
|
||||
keys:
|
||||
- ./evidence/keys/identities/vendor_A.pub
|
||||
- ./evidence/keys/identities/your_authority.pub
|
||||
tlog:
|
||||
mode: "offline" # never reach out
|
||||
checkpoint: "./evidence/tlog/checkpoint.sig"
|
||||
entry_pack: "./evidence/tlog/entries"
|
||||
attestations:
|
||||
required:
|
||||
- type: slsa-provenance
|
||||
- type: cyclonedx-sbom
|
||||
optional:
|
||||
- type: vex
|
||||
constraints:
|
||||
subjects:
|
||||
alg: "sha256" # only sha256 digests accepted
|
||||
certs:
|
||||
allowed_issuers:
|
||||
- "https://fulcio.offline"
|
||||
allow_expired_if_timepinned: true
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 10) Operational flow inside the sealed network
|
||||
|
||||
1. **Import bundle** (mount WORM media read‑only).
|
||||
2. **Verify tools** (hash + signature) before execution.
|
||||
3. **Verify tlog checkpoint**, then **verify each inclusion proof**.
|
||||
4. **Verify attestations** (keyring + policy).
|
||||
5. **Ingest SBOMs** (canonicalize + hash).
|
||||
6. **Reconcile** (apply lattice rules → evidence graph).
|
||||
7. **Record your run**:
|
||||
|
||||
* Write `run.manifest` with hashes of: inputs, policies, tools, outputs.
|
||||
* DSSE‑sign `run.manifest` with the Authority key.
|
||||
|
||||
---
|
||||
|
||||
# 11) Disaster‑ready “seed and refresh” model
|
||||
|
||||
* **Seed**: quarterly (or release‑based) export from connected world → signed bundle.
|
||||
* **Delta refreshes**: smaller entry packs with only new SBOMs/attestations + updated checkpoint.
|
||||
* Always keep **N previous checkpoints** to allow replay and audits.
|
||||
|
||||
---
|
||||
|
||||
# 12) Quick hardening checklist
|
||||
|
||||
* Fail closed on: unknown keys, missing proofs, schema drift, clock skew beyond tolerance.
|
||||
* No online fallbacks: enforce guardrails such as a `NO_NETWORK=1` environment variable in your verification binaries.
|
||||
* Pin all versions and capture `--version` output into `run.manifest`.
|
||||
* Use reproducible container images (digest‑locked) even for your internal tools.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can turn this into:
|
||||
|
||||
* a **ready‑to‑run folder template** (with sample policies + scripts),
|
||||
* a **.NET 10** helper library for DSSE + offline tlog proof checks,
|
||||
* or a **Stella Ops module sketch** (Authority, Sbomer, Vexer, Scanner, Feedser) wired exactly to this flow.
|
||||
I will split this into two parts:
|
||||
|
||||
1. Stella Ops advantages (deepened, structured as “moats”).
|
||||
2. Concrete developer guidelines you can drop into a `DEV_GUIDELINES.md` for all Stella services.
|
||||
|
||||
---
|
||||
|
||||
## 1. Stella Ops advantages – expanded
|
||||
|
||||
### 1.1 Evidence-first, not “CVE list-first”
|
||||
|
||||
**Problem in the market**
|
||||
|
||||
Most tools:
|
||||
|
||||
* Dump long CVE lists from a single scanner + single feed.
|
||||
* Have weak cross-scanner consistency.
|
||||
* Treat SBOM, VEX, and runtime evidence as separate, loosely coupled features.
|
||||
|
||||
**Stella advantage**
|
||||
|
||||
Stella’s core product is an **evidence graph**, not a report:
|
||||
|
||||
* All inputs (SBOMs, scanner findings, VEX, runtime probes, policies) are ingested as **immutable evidence nodes**, with:
|
||||
|
||||
* Cryptographic identity (hash / dsse envelope / tlog proof).
|
||||
* Clear provenance (source, time, keys, feeds).
|
||||
* Risk signals (what is exploitable/important) are derived **after** evidence is stored, via lattice logic in `Scanner.WebService`, not during ingestion.
|
||||
* UI, API, and CI output are always **explanations of the evidence graph** (“this CVE is suppressed by this signed VEX statement, proven by these keys and these rules”).
|
||||
|
||||
This gives you:
|
||||
|
||||
* **Quiet-by-design UX**: the “noise vs. signal” ratio is controlled by lattice logic and reachability, not vendor marketing severity.
|
||||
* **Traceable decisions**: every “allow/deny” decision can be traced to concrete evidence and rules.
|
||||
|
||||
Developer consequence:
|
||||
Every Stella module must treat its job as **producing or transforming evidence**, not “telling the user what to do.”
|
||||
|
||||
---
|
||||
|
||||
### 1.2 Deterministic, replayable scans
|
||||
|
||||
**Problem**
|
||||
|
||||
* Existing tools are hard to replay: feeds change, scanners change, rules change.
|
||||
* For audits/compliance you cannot easily re-run “the same scan” from 9 months ago and get the same answer.
|
||||
|
||||
**Stella advantage**
|
||||
|
||||
Each scan in Stella is defined by a **deterministic manifest**:
|
||||
|
||||
* Precise hashes and versions of:
|
||||
|
||||
* Scanner binaries / containers.
|
||||
* SBOM parsers, VEX parsers.
|
||||
* Lattice rules, policies, allow/deny lists.
|
||||
* Feeds snapshots (CVE, CPE/CPE-2.3, OS vendor advisories, distro data).
|
||||
* Exact artifact digests (image, files, dependencies).
|
||||
* Exact tlog checkpoints used for attestation verification.
|
||||
* Config parameters (flags, perf knobs) recorded.
|
||||
|
||||
From this:
|
||||
|
||||
* You can recreate a *replay bundle* and re-run the scan offline with **byte-for-byte identical outcomes**, given the same inputs.
|
||||
* Auditors/clients can verify that a historical decision was correct given the knowledge at that time.
|
||||
|
||||
Developer consequence:
|
||||
Any new feature that affects risk decisions must:
|
||||
|
||||
* Persist versioned configuration and inputs in a **scan manifest**, and
|
||||
* Be able to reconstruct results from that manifest without network calls.
|
||||
|
||||
---
|
||||
|
||||
### 1.3 Crypto-sovereign, offline-ready by design
|
||||
|
||||
**Problem**
|
||||
|
||||
* Most “Sigstore-enabled” tooling assumes access to public Fulcio/Rekor over the internet.
|
||||
* Many orgs (banks, defense, state operators) cannot rely on foreign CAs or public transparency logs.
|
||||
* Regional crypto standards (GOST, SM2/3/4, eIDAS, FIPS) are rarely supported properly.
|
||||
|
||||
**Stella advantage**
|
||||
|
||||
* **Offline trust anchors**: Stella runs with a fully pre-seeded root of trust:
|
||||
|
||||
* Local CA chains (Fulcio-like), private Rekor mirror or file-based Merkle log.
|
||||
* Vendor/org keys and cert chains for SBOM, VEX, and provenance.
|
||||
* **Crypto abstraction layer**:
|
||||
|
||||
* Pluggable algorithms: NIST curves, Ed25519, GOST, SM2/3/4, PQC (Dilithium/Falcon) as optional profiles.
|
||||
* Policy-driven: per-tenant crypto policy that defines what signatures are acceptable in which contexts.
|
||||
* **No online fallback**:
|
||||
|
||||
* Verification will never “phone home” to public CAs/logs.
|
||||
* Missing keys/proofs → deterministic, explainable failures.
|
||||
|
||||
Developer consequence:
|
||||
Every crypto operation must:
|
||||
|
||||
* Go through a **central crypto and trust-policy abstraction**, not directly through platform libraries.
|
||||
* Support an **offline-only execution mode** that fails closed when external services are not available.
|
||||
|
||||
---
|
||||
|
||||
### 1.4 Rich SBOM/VEX semantics and “link-not-merge”
|
||||
|
||||
**Problem**
|
||||
|
||||
* Many tools turn SBOMs into their own proprietary schema early, losing fidelity.
|
||||
* VEX data is often flattened into flags (“affected/not affected”) without preserving original statements and signatures.
|
||||
|
||||
**Stella advantage**
|
||||
|
||||
* **Native support** for:
|
||||
|
||||
* CycloneDX 1.5/1.6 and SPDX 3.x as first-class citizens.
|
||||
* DSSE-wrapped attestations (provenance, VEX, custom).
|
||||
* **Link-not-merge model**:
|
||||
|
||||
* Original SBOM/VEX files are stored **immutable** (canonical JSON).
|
||||
* Stella maintains **links** between:
|
||||
|
||||
* Artifacts ↔ Components ↔ Vulnerabilities ↔ VEX statements ↔ Attestations.
|
||||
* Derived views are computed on top of links, not by mutating original data.
|
||||
* **Trust-aware VEX lattice**:
|
||||
|
||||
* Multiple VEX statements from different parties can conflict.
|
||||
* A lattice engine defines precedence and resolution: vendor vs maintainer vs third-party; affected/under-investigation/not-affected/fixed, etc.
|
||||
|
||||
Developer consequence:
|
||||
No module is ever allowed to “rewrite” SBOM/VEX content. They may:
|
||||
|
||||
* Store it,
|
||||
* Canonicalize it,
|
||||
* Link it,
|
||||
* Derive views on top of it,
|
||||
but must always keep original bytes addressable and hash-pinned.
|
||||
|
||||
---
|
||||
|
||||
### 1.5 Lattice-based trust algebra (Stella “Trust Algebra Studio”)
|
||||
|
||||
**Problem**
|
||||
|
||||
* Existing tools treat suppression, exception, and VEX as ad-hoc rule sets, hard to reason about and even harder to audit.
|
||||
* There is no clear, composable way to combine multiple trust sources.
|
||||
|
||||
**Stella advantage**
|
||||
|
||||
* Use of **lattice theory** for trust:
|
||||
|
||||
* Risk states (e.g., exploitable, mitigated, irrelevant, unknown) are elements of a lattice.
|
||||
* VEX statements, policies, and runtime evidence act as **morphisms** over that lattice.
|
||||
* Final state is a deterministic “join/meet” over all evidence.
|
||||
* Vendor- and customer-configurable:
|
||||
|
||||
* Visual and declarative editing in “Trust Algebra Studio.”
|
||||
* Exported as machine-readable manifests used by `Scanner.WebService`.
|
||||
|
||||
Developer consequence:
|
||||
All “Is this safe?” or “Should we fail the build?” logic:
|
||||
|
||||
* Lives in the **lattice engine in `Scanner.WebService`**, not in Sbomer / Vexer / Feedser / Concelier.
|
||||
* Must be fully driven by declarative policy artifacts, which are:
|
||||
|
||||
* Versioned,
|
||||
* Hash-pinned,
|
||||
* Stored as evidence.
|
||||
|
||||
---
|
||||
|
||||
### 1.6 Proof-of-Integrity Graph (build → deploy → runtime)
|
||||
|
||||
**Problem**
|
||||
|
||||
* Many vendors provide a one-shot scan or “image signing” with no continuous linkage back to build provenance and SBOM.
|
||||
* Runtime views are disconnected from build-time evidence.
|
||||
|
||||
**Stella advantage**
|
||||
|
||||
* **Proof-of-Integrity Graph**:
|
||||
|
||||
* For each running container/process, Stella tracks:
|
||||
|
||||
* Image digest → SBOM → provenance attestation → signatures and tlog proofs → policies applied → runtime signals.
|
||||
* Every node in that chain is cryptographically linked.
|
||||
* This lets you say:
|
||||
|
||||
* “This running pod corresponds to this exact build, these SBOM components, and these VEX statements, verified with these keys.”
|
||||
|
||||
Developer consequence:
|
||||
Any runtime-oriented module (scanner sidecars, agents, k8s admission, etc.) must:
|
||||
|
||||
* Treat the **digest** + attestation chain as the identity of a workload.
|
||||
* Never operate solely on mutable labels (tags, names, namespaces) without a digest backlink.
|
||||
|
||||
---
|
||||
|
||||
### 1.7 AI Codex / Assistant on top of proofs, not heuristics
|
||||
|
||||
**Problem**
|
||||
|
||||
* Most AI-driven security assistants are “LLM over text reports,” effectively hallucinating risk judgments.
|
||||
|
||||
**Stella advantage**
|
||||
|
||||
* AI assistant (Zastava / Companion) is constrained to:
|
||||
|
||||
* Read from the **evidence graph**, lattice decisions, and deterministic manifests.
|
||||
* Generate **explanations**, remediation plans, and playbooks—but never bypass hard rules.
|
||||
* This yields:
|
||||
|
||||
* Human-readable, audit-friendly reasoning.
|
||||
* Low hallucination risk, because the assistant is grounded in structured facts.
|
||||
|
||||
Developer consequence:
|
||||
All AI-facing APIs must:
|
||||
|
||||
* Expose **structured, well-typed evidence and decisions**, not raw strings.
|
||||
* Treat LLM/AI output as advisory, never as an authority that can modify evidence, policy, or crypto state.
|
||||
|
||||
---
|
||||
|
||||
## 2. Stella Ops – developer guidelines
|
||||
|
||||
You can think of this as a “short charter” for all devs in the Stella codebase.
|
||||
|
||||
### 2.1 Architectural principles
|
||||
|
||||
1. **Evidence-first, policy-second, UI-third**
|
||||
|
||||
* First: model and persist raw evidence (SBOM, VEX, scanner findings, attestations, logs).
|
||||
* Second: apply policies/lattice logic to evaluate evidence.
|
||||
* Third: build UI and CLI views that explain decisions based on evidence and policies.
|
||||
|
||||
2. **Pipeline-first interfaces**
|
||||
|
||||
* Every capability must be consumable from:
|
||||
|
||||
* CLI,
|
||||
* API,
|
||||
* CI/CD YAML integration.
|
||||
* The web UI is an explainer/debugger, not the only control plane.
|
||||
|
||||
3. **Offline-first design**
|
||||
|
||||
* Every network dependency must have:
|
||||
|
||||
* A clear “online” path, and
|
||||
* A documented “offline bundle” path (pre-seeded feeds, keyrings, logs).
|
||||
* No module is allowed to perform optional online calls that change security outcomes when offline.
|
||||
|
||||
4. **Determinism by default**
|
||||
|
||||
* Core algorithms (matching, reachability, lattice resolution) must not:
|
||||
|
||||
* Depend on wall-clock time (beyond inputs captured in the scan manifest),
|
||||
* Depend on network responses,
|
||||
* Use randomness without a seed recorded in the manifest.
|
||||
* Outputs must be reproducible given:
|
||||
|
||||
* Same inputs,
|
||||
* Same policies,
|
||||
* Same versions of components.
|
||||
|
||||
---
|
||||
|
||||
### 2.2 Solution & code organization (.NET 10 / C#)
|
||||
|
||||
For each service, follow a consistent layout, e.g.:
|
||||
|
||||
* `StellaOps.<Module>.Domain`
|
||||
|
||||
* Pure domain models, lattice algebra types, value objects.
|
||||
* No I/O, no HTTP, no EF, no external libs except core BCL and domain math libs.
|
||||
* `StellaOps.<Module>.Application`
|
||||
|
||||
* Use-cases / handlers / orchestrations (CQRS style if preferred).
|
||||
* Interfaces for repositories, crypto, feeds, scanners.
|
||||
* `StellaOps.<Module>.Infrastructure`
|
||||
|
||||
* Implementations of ports:
|
||||
|
||||
* EF Core 9 / Dapper for Postgres,
|
||||
* MongoDB drivers,
|
||||
* Integration with external scanners and tools.
|
||||
* `StellaOps.<Module>.WebService`
|
||||
|
||||
* ASP.NET minimal APIs or controllers.
|
||||
* AuthZ, multi-tenancy boundaries, DTOs, API versioning.
|
||||
* Lattice engine for Scanner only (per your standing rule).
|
||||
* `StellaOps.Sdk.*`
|
||||
|
||||
* Shared models and clients for:
|
||||
|
||||
* Evidence graph schemas,
|
||||
* DSSE/attestation APIs,
|
||||
* Crypto abstraction.
|
||||
|
||||
Guideline:
|
||||
No domain logic inside controllers, jobs, or EF entities. All logic lives in `Domain` and `Application`.
|
||||
|
||||
---
|
||||
|
||||
### 2.3 Global invariants developers must respect
|
||||
|
||||
1. **Original evidence is immutable**
|
||||
|
||||
* Once an SBOM/VEX/attestation/scanner report is stored:
|
||||
|
||||
* Never update the stored bytes.
|
||||
* Only mark it as superseded / obsolete via new records.
|
||||
* Every mutation of state must be modeled as:
|
||||
|
||||
* New evidence node or
|
||||
* New relationship.
|
||||
|
||||
2. **“Link-not-merge” for external content**
|
||||
|
||||
* Store external documents as canonical blobs + parsed, normalized models.
|
||||
* Link them to internal models; do not re-serialize a “Stella version” and throw away the original.
|
||||
|
||||
3. **Lattice logic only in `Scanner.WebService`**
|
||||
|
||||
* Sbomer/Vexer/Feedser/Concelier must:
|
||||
|
||||
* Ingest/normalize/publish evidence,
|
||||
* Never implement their own evaluation of “safe vs unsafe.”
|
||||
* `Scanner.WebService` is the only place where:
|
||||
|
||||
* Reachability,
|
||||
* Severity,
|
||||
* VEX resolution,
|
||||
* Policy decisions
|
||||
are computed.
|
||||
|
||||
4. **Crypto operations via Authority**
|
||||
|
||||
* Any signing or verification of:
|
||||
|
||||
* SBOMs,
|
||||
* VEX,
|
||||
* Provenance,
|
||||
* Scan manifests,
|
||||
must go through Authority abstractions:
|
||||
* Key store,
|
||||
* Trust policy engine,
|
||||
* Rekor/log verifier (online or offline).
|
||||
* No direct `RSA.Create()` etc. inside application services.
|
||||
|
||||
5. **No implicit network trust**
|
||||
|
||||
* Any HTTP client must:
|
||||
|
||||
* Explicitly declare whether it is allowed in:
|
||||
|
||||
* Online mode only, or
|
||||
* Online + offline (with mirror).
|
||||
* Online fetches may only:
|
||||
|
||||
* Pull feeds and cache them as immutable snapshots.
|
||||
* Never change decisions made for already-completed scans.
|
||||
|
||||
---
|
||||
|
||||
### 2.4 Module-level guidelines
|
||||
|
||||
#### 2.4.1 Scanner.*
|
||||
|
||||
Responsibilities:
|
||||
|
||||
* Integrate one or more scanners (Trivy, Grype, OSV, custom engines, Bun/Node etc.).
|
||||
* Normalize their findings into a **canonical finding model**.
|
||||
* Run lattice + reachability algorithms to derive final “risk states”.
|
||||
|
||||
Guidelines:
|
||||
|
||||
* Each engine integration:
|
||||
|
||||
* Runs in an isolated, well-typed adapter (e.g., `IScannerEngine`).
|
||||
* Produces **raw findings** with full context (CVE, package, version, location, references).
|
||||
* Canonical model:
|
||||
|
||||
* Represent vulnerability, package, location, and evidence origin explicitly.
|
||||
* Track which engine(s) reported each finding.
|
||||
* Lattice engine:
|
||||
|
||||
* Consumes:
|
||||
|
||||
* Canonical findings,
|
||||
* SBOM components,
|
||||
* VEX statements,
|
||||
* Policies,
|
||||
* Optional runtime call graph / reachability information.
|
||||
* Produces:
|
||||
|
||||
* Deterministic risk state per (vulnerability, component, artifact).
|
||||
* Scanner output:
|
||||
|
||||
* Always include:
|
||||
|
||||
* Raw evidence references (IDs),
|
||||
* Decisions (state),
|
||||
* Justification (which rules fired).
|
||||
|
||||
#### 2.4.2 Sbomer.*
|
||||
|
||||
Responsibilities:
|
||||
|
||||
* Ingest, validate, and store SBOMs.
|
||||
* Canonicalize and expose them as structured evidence.
|
||||
|
||||
Guidelines:
|
||||
|
||||
* Support CycloneDX + SPDX first; plug-in architecture for others.
|
||||
* Canonicalization:
|
||||
|
||||
* Sort keys, normalize IDs, strip non-essential formatting.
|
||||
* Compute **canonical hash** and store.
|
||||
* Never drop information:
|
||||
|
||||
* Unknown/extension fields should be preserved in a generic structure.
|
||||
* Indexing:
|
||||
|
||||
* Index SBOMs by artifact digest and canonical hash.
|
||||
* Make ingestion idempotent for identical content.
|
||||
|
||||
#### 2.4.3 Vexer / Excitors.*
|
||||
|
||||
Responsibilities:
|
||||
|
||||
* Ingest and normalize VEX and advisory documents from multiple sources.
|
||||
* Maintain a source-preserving model of statements, not final risk.
|
||||
|
||||
Guidelines:
|
||||
|
||||
* VEX statements:
|
||||
|
||||
* Model as: (subject, vulnerability, status, justification, timestamp, signer, source).
|
||||
* Keep source granularity (which file, line, signature).
|
||||
* Excitors (feed-to-VEX/advisory converters):
|
||||
|
||||
* Pull from vendors (Red Hat, Debian, etc.) and convert into normalized VEX-like statements or internal advisory format.
|
||||
* Preserve raw docs alongside normalized statements.
|
||||
* No lattice resolution:
|
||||
|
||||
* They only output statements; resolution happens in Scanner based on trust lattice.
|
||||
|
||||
#### 2.4.4 Feedser.*
|
||||
|
||||
Responsibilities:
|
||||
|
||||
* Fetch and snapshot external feeds (CVE, OS, language ecosystems, vendor advisories).
|
||||
|
||||
Guidelines:
|
||||
|
||||
* Snapshot model:
|
||||
|
||||
* Each fetch = versioned snapshot with:
|
||||
|
||||
* Source URL,
|
||||
* Time,
|
||||
* Hash,
|
||||
* Signed metadata if available.
|
||||
* Offline bundles:
|
||||
|
||||
* Ability to export/import snapshots as tarballs for air-gapped environments.
|
||||
* Idempotency:
|
||||
|
||||
* Importing the same snapshot twice must be a no-op.
|
||||
|
||||
#### 2.4.5 Authority.*
|
||||
|
||||
Responsibilities:
|
||||
|
||||
* Central key and trust management.
|
||||
* DSSE, signing, verification.
|
||||
* Rekor/log (online or offline) integration.
|
||||
|
||||
Guidelines:
|
||||
|
||||
* Key management:
|
||||
|
||||
* Clearly separate:
|
||||
|
||||
* Online signing keys,
|
||||
* Offline/HSM keys,
|
||||
* Root keys.
|
||||
* Verification:
|
||||
|
||||
* Use local keyrings, pinned CAs, and offline logs by default.
|
||||
* Enforce “no public fallback” unless explicitly opted in by the admin.
|
||||
* API:
|
||||
|
||||
* Provide a stable interface for:
|
||||
|
||||
* `VerifyAttestation(artifactDigest, dsseEnvelope, verificationPolicy) → VerificationResult`
|
||||
* `SignEvidence(evidenceHash, keyId, context) → Signature`
|
||||
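The two operations above could be shaped as an interface along these lines; the exact type and member names are assumptions and would live in the Authority client library.

```csharp
using System.Threading;
using System.Threading.Tasks;

public interface IAuthorityClient
{
    // VerifyAttestation(artifactDigest, dsseEnvelope, verificationPolicy) → VerificationResult
    Task<VerificationResult> VerifyAttestationAsync(
        string artifactDigest,
        byte[] dsseEnvelope,
        VerificationPolicy policy,
        CancellationToken ct = default);

    // SignEvidence(evidenceHash, keyId, context) → Signature
    Task<Signature> SignEvidenceAsync(
        string evidenceHash,
        string keyId,
        SigningContext context,
        CancellationToken ct = default);
}

public sealed record VerificationResult(bool Valid, string? FailureReason, string KeyId);
public sealed record Signature(string KeyId, string Algorithm, byte[] Bytes);
public sealed record VerificationPolicy(string[] TrustedKeyIds, bool AllowExpiredKeys);
public sealed record SigningContext(string Purpose, string Tenant);
```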
|
||||
#### 2.4.6 Concelier.*
|
||||
|
||||
Responsibilities:
|
||||
|
||||
* Map the evidence graph to business context:
|
||||
|
||||
* Applications, environments, customers, SLAs.
|
||||
|
||||
Guidelines:
|
||||
|
||||
* Never change evidence; only:
|
||||
|
||||
* Attach business labels,
|
||||
* Build views (per app, per cluster, per customer).
|
||||
* Use decisions from Scanner:
|
||||
|
||||
* Do not re-implement risk logic.
|
||||
* Only interpret risk in business terms (SLA breaches, policy exceptions, etc.).
|
||||
|
||||
---
|
||||
|
||||
### 2.5 Testing & quality guidelines
|
||||
|
||||
1. **Golden fixtures everywhere**
|
||||
|
||||
* For SBOM/VEX/attestation/scan pipelines:
|
||||
|
||||
* Maintain small, realistic fixture sets with:
|
||||
|
||||
* Inputs (files),
|
||||
* Config manifests,
|
||||
* Expected evidence graph outputs.
|
||||
* Tests must be deterministic and work offline.
|
||||
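As an illustration, a golden-fixture test typically looks like the sketch below (xUnit, with a placeholder pipeline so the pattern stands alone; in practice `RunPipeline` would invoke the real scan or VEX pipeline and the fixture paths are illustrative).

```csharp
using System.IO;
using Xunit;

public class GoldenFixtureTests
{
    // Hypothetical entry point of the code under test; in Stella Ops this would be
    // the scan / VEX pipeline invoked with a fixture manifest.
    private static string RunPipeline(string manifestJson) =>
        manifestJson; // placeholder so the pattern compiles in isolation

    [Theory]
    [InlineData("scan-basic")]
    public void Fixture_output_matches_expected_bytes(string fixture)
    {
        var manifest = File.ReadAllText($"fixtures/{fixture}/manifest.json");
        var expected = File.ReadAllText($"fixtures/{fixture}/expected-output.json");

        var actual = RunPipeline(manifest);

        // Byte-for-byte equality after canonicalization is the determinism contract.
        Assert.Equal(expected, actual);
    }
}
```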
|
||||
2. **Snapshot-style tests**
|
||||
|
||||
* For lattice decisions:
|
||||
|
||||
* Use snapshot tests of decisions per (artifact, vulnerability).
|
||||
* Any change must be reviewed as a potential policy or algorithm change.
|
||||
|
||||
3. **Offline mode tests**
|
||||
|
||||
* CI must include a job that:
|
||||
|
||||
* Runs with `NO_NETWORK=1` (or equivalent),
|
||||
* Uses only pre-seeded bundles,
|
||||
* Ensures features degrade gracefully but deterministically.
|
||||
|
||||
4. **Performance caps**
|
||||
|
||||
* For core algorithms (matching, lattice, reachability):
|
||||
|
||||
* Maintain per-feature benchmarks with target upper bounds.
|
||||
* Fail PRs that introduce significant regressions.
|
||||
|
||||
---
|
||||
|
||||
### 2.6 CI/CD and deployment guidelines
|
||||
|
||||
1. **Immutable build**
|
||||
|
||||
* All binaries and containers:
|
||||
|
||||
* Built in controlled CI,
|
||||
* SBOM-ed,
|
||||
* Signed (Authority),
|
||||
* Optional tlog entry (online or offline).
|
||||
|
||||
2. **Self-hosting expectations**
|
||||
|
||||
* Default deployment is:
|
||||
|
||||
* Docker Compose or Kubernetes,
|
||||
* Postgres + Mongo (if used) pinned with migrations,
|
||||
* No internet required after initial bundle import.
|
||||
|
||||
3. **Scan as code**
|
||||
|
||||
* Scans declared as YAML manifests (or JSON) checked into Git:
|
||||
|
||||
* Artifact(s),
|
||||
* Policies,
|
||||
* Feeds snapshot IDs,
|
||||
* Toolchain versions.
|
||||
* CI jobs call Stella CLI/SDK using those manifests.
|
||||
|
||||
---
|
||||
|
||||
### 2.7 Definition of Done for a new feature
|
||||
|
||||
When you implement a new Stella feature, it is “Done” only if:
|
||||
|
||||
1. Evidence
|
||||
|
||||
* New data is persisted as immutable evidence with canonical hashes.
|
||||
* Original external content is stored and linkable.
|
||||
|
||||
2. Determinism
|
||||
|
||||
* Results are deterministic given a manifest of inputs; a “replay” test exists.
|
||||
|
||||
3. Offline
|
||||
|
||||
* Feature works with offline bundles and does not silently call the internet.
|
||||
* Degradation behavior is clearly defined and tested.
|
||||
|
||||
4. Trust & crypto
|
||||
|
||||
* All signing/verification goes through Authority.
|
||||
* Any new trust decisions are expressible via lattice/policy manifests.
|
||||
|
||||
5. UX & pipeline
|
||||
|
||||
* Feature is accessible from:
|
||||
|
||||
* CLI,
|
||||
* API,
|
||||
* CI.
|
||||
* UI only explains and navigates; it is not the sole control.
|
||||
|
||||
If you like, the next step I can take is to pick one module (e.g., `StellaOps.Scanner.WebService`) and write a concrete, file-level skeleton (projects, directories, main classes/interfaces) that follows all of these rules.
|
||||
@@ -0,0 +1,180 @@
|
||||
I thought you might find this — a snapshot of the evolving landscape around «defensible core» security / supply‑chain tooling — interesting. It touches exactly on the kind of architecture you seem to be building for Stella Ops.
|
||||
|
||||
Here’s what’s relevant:
|
||||
|
||||
## 🔎 What the “core risk + reachability + prioritization” trend is — and where it stands
|
||||
|
||||
* Endor Labs is explicitly marketing “function‑level reachability + call‑paths + prioritization” as a differentiator compared to generic SCA tools. Their reachability analysis labels functions (or dependencies) as **Reachable / Unreachable / Potentially Reachable** based on static call‑graph + dependency analysis. ([Endor Labs Docs][1])
|
||||
* In their docs, they say this lets customers “reduce thousands of vulnerability findings to just a handful (e.g. 5)” when combining severity, patch availability, and reachability — effectively de‑noising the output. ([Endor Labs][2])
|
||||
* Snyk — the other widely used vendor in this space — does something related but different: e.g. their “Priority Score” for scan findings is a deterministic score (1–1000) combining severity, occurrence count, “hot‑files”, fix‑example availability, etc. ([Snyk][3])
|
||||
* However: Snyk’s prioritization is *not* the same as reachability-based call‑graph analysis. Their score helps prioritize, but doesn’t guarantee exploitability or call‑path feasibility.
|
||||
|
||||
**What this means:** There is growing industry push for “defensible core” practices — combining call‑path / reachability analysis (to focus on *actually reachable* code), deterministic prioritization scoring (so actions are repeatable and explainable), and better UX/context so developers see “what matters.” Endor Labs appears to lead here; Snyk is more generic but still useful for initial triage.
|
||||
|
||||
## 🔐 Why combining evidence‑backed provenance (signatures + transparency logs) matters — and recent progress
|
||||
|
||||
* Sigstore’s transparency‑log project Rekor recently hit **General Availability** (v2) as of Oct 10, 2025. The new tile‑backed log is cheaper to maintain, more scalable, and simpler to operate compared to earlier versions. ([Sigstore Blog][4])
|
||||
* With Rekor v2 + Sigstore tooling (e.g. Cosign v2.6.0+), you can sign artifacts and generate *bundles* (wrapped DSSE + in‑toto attestations) whose inclusion in the public log can be verified, producing proofs of supply‑chain provenance. ([Sigstore Blog][4])
|
||||
* This means you can tie each “finding” (e.g. a vulnerable library, or a build artifact) to verifiable cryptographic evidence, anchored in a public transparency log — not just heuristics or internal metadata.
|
||||
|
||||
**What this means for a “defensible core”:** You can go beyond “reachability + scoring” (i.e. “this vulnerability matters for our code”) — and link each finding to actual cryptographic proof that the artifact existed, was signed by who you expect, wasn’t tampered with, and is publicly logged. That makes auditability and traceability much stronger — which aligns with many of your own strategic moats (e.g. deterministic, auditable build/replay, proof‑of‑integrity graph).
|
||||
|
||||
## 🧩 Where this converges — and how it speaks to *your* Stella Ops vision
|
||||
|
||||
Given your interest in deterministic reproducibility, cryptographically verifiable build artifacts, and a “proof‑market ledger” — this convergence matters:
|
||||
|
||||
* Tools like Endor Labs show that **reachable‑call‑path + prioritization UX** is becoming a commercially accepted approach. That validates part of your “Policy/Lattice Engine + Prioritized remediation” vision.
|
||||
* With Sigstore/Rekor v2 supporting DSSE‑wrapped in‑toto attestations, you get a **real-world, deployed substrate** for binding findings → signed evidence → public log, which matches your “Proof‑Market Ledger + decentralized Rekor‑mirror” idea.
|
||||
* Because Rekor v2 reduces operational complexity and cost, it lowers barrier-to-entry for widespread adoption — increasing the odds that forging a tooling ecosystem around it (e.g. for containers, SBOMs, runtime artifacts) will gain traction.
|
||||
|
||||
## ⚠ Limitations and What to Watch
|
||||
|
||||
* Reachability analysis — even when function‑level — is inherently best‑effort static analysis. There may be dynamic behaviours, reflection, runtime code‑gen, or edge‑cases that evade call‑graph analysis. Thus “reachable = exploitable” remains probabilistic.
|
||||
* Transparency logs like Rekor guarantee integrity of signature metadata and provenance, but they don’t guarantee correctness of code, absence of vulnerabilities, or semantic security properties. They give auditability — not safety by themselves.
|
||||
* Adoption — for both reachability‑based SCA and Sigstore‑style provenance — remains uneven. Not all languages, ecosystems, or organisations integrate deeply (especially transitive dependencies, legacy code, dynamic languages).
|
||||
|
||||
---
|
||||
|
||||
If you like, I can **pull up 4–6 recent (2025) academic papers or real‑world case studies** exploring reachability + provenance + prioritization — could be useful for Stella Ops research/whitepaper.
|
||||
|
||||
[1]: https://docs.endorlabs.com/introduction/reachability-analysis/?utm_source=chatgpt.com "Reachability analysis"
|
||||
[2]: https://www.endorlabs.com/learn/how-to-prioritize-reachable-open-source-software-oss-vulnerabilities?utm_source=chatgpt.com "How to Prioritize Reachable Open Source Software (OSS) ..."
|
||||
[3]: https://snyk.io/blog/snyk-code-priority-score-prioritizes-vulnerabilities/?utm_source=chatgpt.com "How Snyk Code prioritizes vulnerabilities using their ..."
|
||||
[4]: https://blog.sigstore.dev/rekor-v2-ga/?utm_source=chatgpt.com "Rekor v2 GA - Cheaper to run, simpler to maintain"
|
||||
Stella Ops wins if it becomes the place where **risk decisions are both correct *and provable***—not just “high/medium/low” dashboards.
|
||||
|
||||
## Stella Ops advantages (the “Defensible Core” thesis)
|
||||
|
||||
### 1) Less noise: prioritize what’s *actually* executable
|
||||
|
||||
Classic SCA/container scanners mostly tell you “a vulnerable thing exists.” Stella should instead answer: **“Can an attacker reach it from *our* entrypoints?”**
|
||||
Industry precedent: reachability tooling classifies findings as *reachable / potentially reachable / unreachable* based on call-graph style analysis. ([Endor Labs Docs][1])
|
||||
**Advantage:** cut backlog from thousands to the handful that are realistically exploitable.
|
||||
|
||||
### 2) Evidence-first trust: tie every claim to cryptographic proof
|
||||
|
||||
Make “what shipped” and “who built it” verifiable via signatures + attestations, not screenshots and tribal knowledge.
|
||||
|
||||
* **Cosign** supports signing and **in-toto attestations** ([Sigstore][2])
|
||||
* Sigstore **Bundles** wrap a DSSE envelope containing an in-toto statement ([Sigstore][3])
|
||||
* Rekor provides a transparency log; Rekor v2 is positioned as cheaper/simpler to operate ([blog.sigstore.dev][4])
|
||||
|
||||
**Advantage:** audits become “verify these proofs,” not “trust our process.”
|
||||
|
||||
### 3) Deterministic decisions: the score is explainable and repeatable
|
||||
|
||||
Security teams hate debates; developers hate random severity flipping. Stella can compute a deterministic “Fix Now” priority based on:
|
||||
|
||||
* severity + exploit context + **reachability evidence** + patchability + blast radius
|
||||
This is directionally similar to Snyk’s Priority Score approach (0–1000) that blends severity/impact/actionability. ([docs.snyk.io][5])
|
||||
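To illustrate what "deterministic" means here, a sketch of such a score as a pure function; the input fields, weights, and ranges are illustrative, not a prescribed formula.

```csharp
using System;

public enum Reachability { Unreachable = 0, PotentiallyReachable = 1, Reachable = 2 }

public sealed record FindingSignal(
    double CvssBase,         // 0–10
    bool KnownExploited,     // e.g., present in an exploited-vulnerabilities catalog
    Reachability Reach,
    bool PatchAvailable,
    int DependentServices);  // crude blast-radius proxy

public static class FixNowScore
{
    // Deterministic, explainable 0–1000 score: the same inputs always give the same
    // number, and each weighted term can be shown in the ticket as the "why".
    public static int Compute(FindingSignal s)
    {
        double severity = s.CvssBase / 10.0;
        double reach = s.Reach switch
        {
            Reachability.Reachable => 1.0,
            Reachability.PotentiallyReachable => 0.6,
            _ => 0.1
        };
        double exploit = s.KnownExploited ? 1.0 : 0.4;
        double patch   = s.PatchAvailable ? 1.0 : 0.7;   // fixable now → act now
        double blast   = Math.Min(1.0, s.DependentServices / 10.0);

        double score = 1000 * severity * (0.5 * reach + 0.2 * exploit + 0.15 * patch + 0.15 * blast);
        return (int)Math.Round(score);
    }
}
```

Because the function has no hidden inputs (no clock, no network), the same finding always gets the same score.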
**Advantage:** every ticket includes *why* and *what to do next*, consistently.
|
||||
|
||||
### 4) One graph from source → build → artifact → deploy → runtime
|
||||
|
||||
The moat is not “one more scanner.” It’s a **proof graph**:
|
||||
|
||||
* commit / repo identity
|
||||
* build provenance (how/where built)
|
||||
* SBOM (what’s inside)
|
||||
* VEX (is it affected or not?)
|
||||
* deployment admission (what actually ran)
|
||||
|
||||
Standards to lean on:
|
||||
|
||||
* **SLSA provenance** describes where/when/how an artifact was produced ([SLSA][6])
|
||||
* **CycloneDX** is a widely-used SBOM format ([cyclonedx.org][7])
|
||||
* **OpenVEX / VEX** communicates exploitability status ([GitHub][8])
|
||||
|
||||
**Advantage:** when a new CVE drops, you can answer in minutes: “which deployed digests are affected and reachable?”
|
||||
|
||||
---
|
||||
|
||||
## Developer guidelines (the “Stella-ready” workflow)
|
||||
|
||||
### A. Repo hygiene (make builds and reachability analyzable)
|
||||
|
||||
1. **Pin dependencies** (lockfiles, digest-pinned base images, avoid floating tags like `latest`).
|
||||
2. **Declare entrypoints**: API routes, CLIs, queue consumers, cron jobs—anything that can trigger code paths. This massively improves reachability signal.
|
||||
3. Prefer **explicit wiring over reflection/dynamic loading** where feasible (or annotate the dynamic edges).
|
||||
|
||||
### B. CI “Golden Path” (always produce proof artifacts)
|
||||
|
||||
Every build should output *three things*: **artifact + SBOM + provenance**, then sign/attest.
|
||||
|
||||
Minimal pipeline shape:
|
||||
|
||||
1. Build + test
|
||||
2. Generate SBOM (CycloneDX recommended) ([cyclonedx.org][7])
|
||||
3. Generate provenance (SLSA-style) ([SLSA][6])
|
||||
4. Sign artifact by **digest** (immutable) and attach attestations
|
||||
|
||||
* Cosign signing + keyless flow is documented; signing is `cosign sign <image@digest>` ([Sigstore][2])
|
||||
* Attestations use in-toto predicates ([Sigstore][9])
|
||||
|
||||
Concrete example (illustrative):
|
||||
|
||||
```bash
# build produces $IMAGE_URI_DIGEST (image@sha256:...)
cosign sign "$IMAGE_URI_DIGEST"   # keyless signing common in CI
cosign attest --predicate sbom.cdx.json --type cyclonedx "$IMAGE_URI_DIGEST"
cosign attest --predicate provenance.json --type slsa-provenance "$IMAGE_URI_DIGEST"
```
|
||||
|
||||
(Exact flags vary by environment; the key idea is: **sign the digest and attach attestations**.) ([Sigstore][2])
|
||||
|
||||
### C. CD / runtime guardrails (verify before running)
|
||||
|
||||
1. Admission policy: **only run signed artifacts**; verify signer identity and OIDC issuer (keyless). ([Sigstore][2])
|
||||
2. Require attestations for deploy:
|
||||
|
||||
* provenance present
|
||||
* SBOM present
|
||||
* (optional) vulnerability scan attestation present
|
||||
3. Fail closed by default; allow **time-bounded exceptions** (see below).
|
||||
|
||||
### D. Vulnerability handling (reachability + VEX, not panic patching)
|
||||
|
||||
When a finding arrives:
|
||||
|
||||
1. **Check reachability category**:
|
||||
|
||||
* *Reachable* → fix quickly
|
||||
* *Potentially reachable* → investigate (add entrypoint annotations/tests)
|
||||
* *Unreachable* → document + monitor (don’t ignore forever) ([Endor Labs Docs][1])
|
||||
2. Publish a VEX statement for each shipped artifact:
|
||||
|
||||
* affected / not affected / under investigation / fixed
|
||||
OpenVEX exists specifically to express this status in a minimal way. ([GitHub][8])
|
||||
|
||||
### E. Policy & exceptions (keep velocity without lying)
|
||||
|
||||
1. Policies should be **machine-checkable** and backed by attestations (not “we promise”).
|
||||
2. Exceptions must include:
|
||||
|
||||
* owner + reason
|
||||
* scope (which artifacts/services)
|
||||
* expiry date (auto-reopen)
|
||||
* compensating control (mitigation, WAF rule, feature flag off, etc.)
|
||||
|
||||
### F. Developer ergonomics (make the secure path the easy path)
|
||||
|
||||
* Provide a single local command that mirrors CI verification (e.g., “verify I can ship”).
|
||||
* PR comments should include:
|
||||
|
||||
* top 3 risks with deterministic score
|
||||
* call-path snippet if reachable
|
||||
* one-click remediation suggestion (upgrade path, patch PR)
|
||||
|
||||
---
|
||||
|
||||
If you want to sharpen this into a “Stella Ops Developer Playbook” doc, the most useful format is usually **two pages**: (1) the Golden Path checklist, (2) the exception/triage rubric with examples of reachable vs unreachable + a sample OpenVEX statement.
|
||||
|
||||
[1]: https://docs.endorlabs.com/introduction/reachability-analysis/?utm_source=chatgpt.com "Reachability analysis"
|
||||
[2]: https://docs.sigstore.dev/quickstart/quickstart-cosign/?utm_source=chatgpt.com "Sigstore Quickstart with Cosign"
|
||||
[3]: https://docs.sigstore.dev/about/bundle/?utm_source=chatgpt.com "Sigstore Bundle Format"
|
||||
[4]: https://blog.sigstore.dev/rekor-v2-ga/?utm_source=chatgpt.com "Rekor v2 GA - Cheaper to run, simpler to maintain"
|
||||
[5]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/priority-score?utm_source=chatgpt.com "Priority Score | Snyk User Docs"
|
||||
[6]: https://slsa.dev/spec/draft/build-provenance?utm_source=chatgpt.com "Build: Provenance"
|
||||
[7]: https://cyclonedx.org/specification/overview/?utm_source=chatgpt.com "Specification Overview"
|
||||
[8]: https://github.com/openvex/spec?utm_source=chatgpt.com "OpenVEX Specification"
|
||||
[9]: https://docs.sigstore.dev/cosign/verifying/attestation/?utm_source=chatgpt.com "In-Toto Attestations"
|
||||
@@ -0,0 +1,277 @@
|
||||
I’m sharing this because I think your architecture‑moat ambitions for Stella Ops map really well onto what’s already emerging in SBOM/VEX + call‑graph / contextual‑analysis tooling — and you could use those ideas to shape Stella Ops’ “policy + proof‑market” features.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
## ✅ What SBOM↔VEX + Reachability / Call‑Path Tools Already Offer
|
||||
|
||||
* The combination of Snyk’s “reachability analysis” and Vulnerability Exploitability eXchange (VEX) lets you label each reported vulnerability as **REACHABLE / NO PATH FOUND / NOT APPLICABLE**, based on static call‑graph (or AI‑enhanced analysis) of your actual application code rather than just “this library version has a CVE.” ([Snyk User Docs][1])
|
||||
* If reachable, Snyk even provides a **“call‑path” view** showing how your code leads to the vulnerable element — giving a human-readable trace from your own functions/modules into the vulnerable dependency. ([Snyk User Docs][1])
|
||||
* The VEX model (as defined, e.g., in CycloneDX) is designed to let you embed exploitability and context‑specific data alongside a standard SBOM. That way, you don’t just convey “what components are present,” but “which known vulnerabilities actually matter in this build or environment.” ([CycloneDX][2])
|
||||
|
||||
In short: SBOM → alerts many potential CVEs. SBOM + VEX + Reachability/Call‑path → highlights only those with an actual path from your code — drastically reducing noise and focusing remediation where it matters.
|
||||
|
||||
## 🔧 What Artefact‑ or Container‑Level “Contextual Analysis” Adds (Triage + Proof Trail)
|
||||
|
||||
* JFrog Xray’s “Contextual Analysis” (when used on container images or binary artifacts) goes beyond “is the library present” — it tries to reason **whether the vulnerable code is even invoked / relevant in this build**. If not, it marks the CVE as “not exploitable / not applicable.” That dramatically reduces false positives: in one study JFrog found ~78% of reported CVEs in popular DockerHub images were not actually exploitable. ([JFrog][3])
|
||||
* Contextual Analysis includes a **call‑graph view** (direct vs transitive calls), highlights affected files/functions & line numbers, and lets you copy details for remediation or auditing. ([JFrog][4])
|
||||
* Combined with SBOM/VEX metadata, this enables a **full proof trail**: you can track from “component in SBOM” → “vulnerability discovered” → “reachable from my code or image” → “call‑path evidence” → “justified exploitability status.” That makes your SBOM/VEX data actionable and auditable without manual spreadsheets or email threads. ([Endor Labs][5])
|
||||
|
||||
## 🎯 Why This Matters for Stella Ops’ Moats
|
||||
|
||||
Given your ambitious Stella Ops design — deterministic scans, reproducible proofs, cryptographically verifiable supply‑chain attestations — this established workflow model is a **very powerful blueprint**:
|
||||
|
||||
* You could integrate call‑graph reachability + artifact‑level contextual analysis directly into your “Proof‑of‑Integrity Graph,” giving customers a **clear, machine‑readable + human‑auditable exploitability verdict** per vulnerability.
|
||||
* The “one‑click evidence” model (call‑path + signed SBOM/VEX + transparency‑log index) maps neatly onto your planned **Proof‑Market Ledger** / “trust‑credit” economy: consumers see not just “vulnerable component included,” but “here’s the exact evidence proving it is or isn’t reachable/exploitable.”
|
||||
* It also supports **tiered remediation prioritization** (e.g. reachable + high severity + business‑critical vs unreachable / low‑impact), reducing burden on teams just like Snyk and JFrog do for their customers.
|
||||
|
||||
## 🧑💻 What’s Still Missing / What You Should Improve
|
||||
|
||||
* Reachability analysis — especially static — is not perfect: “NO PATH FOUND” doesn’t guarantee “never exploitable.” Reflection, dynamic loading, environment‑specific behavior can cause false negatives. ([Snyk User Docs][1])
|
||||
* VEX annotations often remain **semi‑manual or vendor‑supplied** — but for a fully trustworthy "Proof‑Market," you’ll need **automated generation + strong signing + revocation / update mechanisms** (which seems aligned with what you planned).
|
||||
* For containerized builds, the contextual analysis must tie SBOM entries to actual runtime/packaged artifacts — your deterministic rebuild + binary lineage + transparency‑log approach could help guarantee that.
|
||||
|
||||
---
|
||||
|
||||
I think combining SBOM + VEX + reachability/call‑path + artifact‑level contextual analysis gives a **real, operational model** that matches what you envision for Stella Ops — but with some limitations that your added layers (cryptographic signing, replayability, trust‑market) can uniquely address.
|
||||
|
||||
If you like — I can **sketch a minimal architecture** (modules & data flow) that implements exactly that model in Stella Ops (call‑graph analyzer → VEX annotator → signed bundle → ledger entry).
|
||||
|
||||
[1]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis?utm_source=chatgpt.com "Reachability analysis | Snyk User Docs"
|
||||
[2]: https://cyclonedx.org/capabilities/vex/?utm_source=chatgpt.com "Vulnerability Exploitability eXchange (VEX)"
|
||||
[3]: https://jfrog.com/blog/turns-out-78-of-reported-cves-on-top-dockerhub-images-are-not-really-exploitable/?utm_source=chatgpt.com "Turns out 78% of reported common CVEs on top ..."
|
||||
[4]: https://jfrog.com/help/r/jfrog-security-user-guide/products/advanced-security/features-and-capabilities/contextual-analysis-of-cves?utm_source=chatgpt.com "Contextual Analysis of CVEs"
|
||||
[5]: https://www.endorlabs.com/learn/how-cyclonedx-vex-makes-your-sbom-useful?utm_source=chatgpt.com "How CycloneDX VEX Makes Your SBOM Useful | Blog"
|
||||
Stella Ops’ big advantage can be: **turn “security findings” into “verifiable claims with evidence”**—portable across org boundaries, continuously updateable, and audit-friendly—by combining SBOM + VEX + reachability/call-path + signed provenance into one evidence graph.
|
||||
|
||||
## Stella Ops advantages (what becomes uniquely hard to copy)
|
||||
|
||||
### 1) Actionable vulnerability truth, not CVE spam
|
||||
|
||||
SBOMs tell you *what’s present*; VEX tells you *whether a known vuln matters in your specific context* (affected vs not_affected vs fixed vs under_investigation), which is the difference between “alert fatigue” and prioritized remediation. ([cyclonedx.org][1])
|
||||
|
||||
**Stella Ops moat:** VEX isn’t just a checkbox—it’s backed by *machine-verifiable evidence* (reachability traces, policy decisions, build lineage).
|
||||
|
||||
---
|
||||
|
||||
### 2) “Evidence bundles” that any downstream can verify
|
||||
|
||||
If every release ships with:
|
||||
|
||||
* SBOM (what’s in it)
|
||||
* VEX (what matters + why)
|
||||
* Provenance/attestations (how it was built)
|
||||
* Signatures + transparency log inclusion
|
||||
|
||||
…then downstream teams can verify claims *without trusting your internal tooling*.
|
||||
|
||||
This mirrors best practices in the supply-chain world: SLSA recommends distributing provenance and using transparency logs as part of verification. ([SLSA][2])
|
||||
Sigstore also standardizes “bundles” that can include DSSE-wrapped attestations plus transparency log material/timestamps. ([Sigstore][3])
|
||||
|
||||
**Stella Ops moat:** “Proof packaging + verification UX” becomes a platform primitive, not an afterthought.
|
||||
|
||||
---
|
||||
|
||||
### 3) A unified **Proof Graph** (the missing layer in most tooling)
|
||||
|
||||
Most tools produce *reports*. Stella Ops can maintain a **typed graph**:
|
||||
|
||||
`source commit → build step(s) → artifact digest → SBOM components → CVEs → reachability evidence → VEX statements → signers → log inclusion`
|
||||
|
||||
That graph lets you answer hard questions fast:
|
||||
|
||||
* “Is CVE-XXXX exploitable in prod image sha256:…?”
|
||||
* “Show the call-path evidence or runtime proof.”
|
||||
* “Which policy or signer asserted not_affected, and when?”
|
||||
|
||||
SPDX 3.x explicitly aims to support vulnerability metadata (including VEX fields) in a way that can evolve as security knowledge changes. ([spdx.dev][4])
|
||||
|
||||
**Moat:** graph-scale lineage + queryability + verification, not just scanning.
|
||||
|
||||
---
|
||||
|
||||
### 4) Reachability becomes **a signed, reviewable artifact**
|
||||
|
||||
Reachability analysis commonly produces statuses like “REACHABLE / NO PATH FOUND / NOT APPLICABLE.” ([docs.snyk.io][5])
|
||||
Stella Ops can store:
|
||||
|
||||
* the reachability result,
|
||||
* the methodology (static, runtime, hybrid),
|
||||
* confidence/coverage,
|
||||
* and the call-path (optionally redacted),
|
||||
then sign it and tie it to a specific artifact digest.
|
||||
|
||||
**Moat:** you’re not asking teams to *trust* a reachability claim—you’re giving them something they can verify and audit.
|
||||
|
||||
---
|
||||
|
||||
### 5) Continuous updates without chaos (versioned statements, not tribal knowledge)
|
||||
|
||||
VEX statements change over time (“under_investigation” → “not_affected” or “affected”). OpenVEX requires that “not_affected” includes a justification or an impact statement—so consumers can understand *why* it’s not affected. ([GitHub][6])
|
||||
Stella Ops can make those transitions explicit and signed, with append-only history.
|
||||
|
||||
**Moat:** an operational truth system for vulnerability status, not a spreadsheet.
|
||||
|
||||
---
|
||||
|
||||
### 6) “Proof Market” (if you want the deep moat)
|
||||
|
||||
Once evidence is a first-class signed object, you can support multiple signers:
|
||||
|
||||
* vendor (you),
|
||||
* third-party auditors,
|
||||
* internal security team,
|
||||
* trusted scanner services.
|
||||
|
||||
A “proof market” is essentially: **policy chooses whose attestations count** for which claims. (You can start simple—just “org signers”—and expand.)
|
||||
|
||||
**Moat:** trust-routing + signer reputation + network effects.
|
||||
|
||||
---
|
||||
|
||||
## Developer guidelines (for teams adopting Stella Ops)
|
||||
|
||||
### A. Build + identity: make artifacts verifiable
|
||||
|
||||
1. **Anchor everything to an immutable subject**
|
||||
Use the *artifact digest* (e.g., OCI image digest) as the primary key for SBOM, VEX, provenance, reachability results.
|
||||
|
||||
2. **Aim for reproducible-ish builds**
|
||||
Pin dependencies (lockfiles), pin toolchains, and record build inputs/params in provenance. The goal is: “same inputs → same digest,” or at least “digest ↔ exact inputs.” (Even partial determinism pays off.)
|
||||
|
||||
3. **Use standard component identifiers**
|
||||
Prefer PURLs for dependencies and keep them consistent across SBOM + VEX. (This avoids “can’t match vulnerability to component” pain.)
|
||||
|
||||
---
|
||||
|
||||
### B. SBOM: generate it like you mean it
|
||||
|
||||
4. **Generate SBOMs at the right layer(s)**
|
||||
|
||||
* Source-level SBOM (dependency graph)
|
||||
* Artifact/container SBOM (what actually shipped)
|
||||
|
||||
If they disagree, treat that as a signal—your build pipeline is mutating inputs.
|
||||
|
||||
5. **Don’t weld vulnerability state into SBOM unless you must**
|
||||
It’s often cleaner operationally to publish SBOM + separate VEX (since vuln knowledge changes faster). SPDX 3.x explicitly supports richer, evolving security/vulnerability info. ([spdx.dev][4])
|
||||
|
||||
---
|
||||
|
||||
### C. VEX: make statuses evidence-backed and automatable
|
||||
|
||||
6. **Use the standard status set**
|
||||
Common VEX implementations use:
|
||||
|
||||
* `affected`
|
||||
* `not_affected`
|
||||
* `fixed`
|
||||
* `under_investigation` ([Docker Documentation][7])
|
||||
|
||||
7. **Require justification for `not_affected`**
|
||||
OpenVEX requires a status justification or an impact statement for `not_affected`. ([GitHub][6])
|
||||
Practical rule: no “not_affected” without one of:
|
||||
|
||||
* “vulnerable code not in execute path” (+ evidence)
|
||||
* “component not present”
|
||||
* “inline mitigations exist”
|
||||
…plus a link to the supporting artifact(s). (A validation sketch follows after this list.)
|
||||
|
||||
8. **Version and timestamp VEX statements**
|
||||
Treat VEX like a living contract. Consumers need to know what changed and when.
|
||||
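A minimal validation sketch for the `not_affected` rule above; the types and field names are illustrative, only the checks matter.

```csharp
using System.Collections.Generic;

public enum VexStatus { Affected, NotAffected, Fixed, UnderInvestigation }

public sealed record VexClaim(
    VexStatus Status,
    string? Justification,                  // e.g., "vulnerable_code_not_in_execute_path"
    IReadOnlyList<string> EvidenceDigests); // links to reach-maps, SBOM slices, etc.

public static class VexClaimValidator
{
    // Enforces the practical rule above: no "not_affected" without a justification
    // and at least one supporting evidence artifact.
    public static IEnumerable<string> Validate(VexClaim claim)
    {
        if (claim.Status == VexStatus.NotAffected)
        {
            if (string.IsNullOrWhiteSpace(claim.Justification))
                yield return "not_affected requires a status justification or impact statement";
            if (claim.EvidenceDigests.Count == 0)
                yield return "not_affected requires a link to at least one supporting artifact";
        }
    }
}
```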
|
||||
---
|
||||
|
||||
### D. Reachability / contextual analysis: avoid false certainty
|
||||
|
||||
9. **Treat reachability as “evidence with confidence,” not absolute truth**
|
||||
Static reachability is great but imperfect (reflection, plugins, runtime dispatch). Operationally:
|
||||
|
||||
* `REACHABLE` → prioritize
|
||||
* `NO PATH FOUND` → deprioritize, don’t ignore
|
||||
* `NOT APPLICABLE` → fall back to other signals ([docs.snyk.io][5])
|
||||
|
||||
10. **Attach the “why”: call-path or runtime proof**
|
||||
If you claim “reachable,” include the call path (or a redacted proof).
|
||||
If you claim “not_affected,” include the justification and a reachability artifact.
|
||||
|
||||
---
|
||||
|
||||
### E. Signing + distribution: ship proofs the way you ship artifacts
|
||||
|
||||
11. **Bundle evidence and sign it**
|
||||
A practical Stella Ops “release bundle” looks like:
|
||||
|
||||
* `sbom.(cdx|spdx).json`
|
||||
* `vex.(openvex|cdx|spdx|csaf).json`
|
||||
* `provenance.intoto.json`
|
||||
* `reachability.json|sarif`
|
||||
* `bundle.sigstore.json` (or equivalent)
|
||||
|
||||
Sigstore’s bundle format supports DSSE envelopes over attestations and can include transparency log entry material/timestamps. ([Sigstore][3])
|
||||
|
||||
12. **Publish to an append-only transparency log**
|
||||
Transparency logs are valuable because they’re auditable and append-only; monitors can check consistency/inclusion. ([Sigstore][8])
|
||||
|
||||
---
|
||||
|
||||
### F. Policy: gate on what matters, not what’s loud
|
||||
|
||||
13. **Write policies in terms of (severity × exploitability × confidence)**
|
||||
Example policy pattern (a code sketch follows after this list):
|
||||
|
||||
* Block deploy if: `affected AND reachable AND critical`
|
||||
* Warn if: `affected AND no_path_found`
|
||||
* Allow with waiver if: `under_investigation` but time-boxed and signed
|
||||
|
||||
14. **Make exceptions first-class and expiring**
|
||||
Exceptions should be signed statements tied to artifact digests, with TTL and rationale.
|
||||
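The example policy pattern above, transcribed into a small pure function; the statuses and reachability labels come from the bullets, everything else is illustrative (a real gate would probably not default to allow).

```csharp
public enum Verdict { Block, Warn, AllowWithWaiver, Allow }

public sealed record PolicyInput(
    string Status,          // "affected", "not_affected", "fixed", "under_investigation"
    string Reachability,    // "reachable", "no_path_found", "not_applicable"
    string Severity,        // "critical", "high", ...
    bool SignedTimeBoxedWaiverPresent);

public static class DeployGate
{
    public static Verdict Evaluate(PolicyInput p) => p switch
    {
        { Status: "affected", Reachability: "reachable", Severity: "critical" } => Verdict.Block,
        { Status: "affected", Reachability: "no_path_found" }                   => Verdict.Warn,
        { Status: "under_investigation", SignedTimeBoxedWaiverPresent: true }   => Verdict.AllowWithWaiver,
        _                                                                        => Verdict.Allow,
    };
}
```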
|
||||
---
|
||||
|
||||
## Developer guidelines (for engineers building Stella Ops itself)
|
||||
|
||||
1. **Everything is a signed claim about a subject**
|
||||
Model each output as: `subject digest + predicate + evidence + signer + time` (a record sketch follows at the end of this list).
|
||||
|
||||
2. **Support multiple VEX formats, but normalize internally**
|
||||
There are multiple VEX implementations (e.g., CycloneDX, SPDX, OpenVEX, CSAF); normalize into a canonical internal model so policy doesn’t care about input format. ([Open Source Security Foundation][9])
|
||||
|
||||
3. **Expose uncertainty**
|
||||
Store:
|
||||
|
||||
* analysis method (static/runtime/hybrid),
|
||||
* coverage (entrypoints, languages supported),
|
||||
* confidence score,
|
||||
* and known limitations.
|
||||
This prevents “NO PATH FOUND” being treated as “impossible.”
|
||||
|
||||
4. **Make verification fast and offline-friendly**
|
||||
Cache transparency log checkpoints, ship inclusion proofs in bundles, and support air-gapped verification flows where possible.
|
||||
|
||||
5. **Design for redaction**
|
||||
Call-path evidence can leak internals. Provide:
|
||||
|
||||
* full evidence (internal),
|
||||
* redacted evidence (external),
|
||||
* plus hash-based linking so the two correspond.
|
||||
|
||||
6. **Build plugin rails**
|
||||
Reachability analyzers, SBOM generators, scanners, and policy engines will vary by ecosystem. A stable plugin interface is key for adoption.
|
||||
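Rule 1 above ("everything is a signed claim about a subject") can be modelled as a single record; the field names here are an assumption, not the actual schema.

```csharp
using System;

// Every output as a signed claim about a subject.
public sealed record SignedClaim(
    string SubjectDigest,        // what the claim is about (artifact digest)
    string PredicateType,        // e.g., "stellaops.dev/predicates/smart-diff@v1"
    string PredicateJson,        // canonical JSON body of the claim
    string[] EvidenceDigests,    // supporting artifacts (reach-maps, SBOM slices, ...)
    string SignerKeyId,
    DateTimeOffset SignedAt);
```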
|
||||
---
|
||||
|
||||
If you want a crisp deliverable to hand to engineering, you can lift the above into a 1–2 page “Stella Ops Integration Guide” with: **pipeline steps, required artifacts, recommended policy defaults, and a VEX decision checklist**.
|
||||
|
||||
[1]: https://cyclonedx.org/capabilities/vex/?utm_source=chatgpt.com "Vulnerability Exploitability eXchange (VEX)"
|
||||
[2]: https://slsa.dev/spec/v1.0/distributing-provenance?utm_source=chatgpt.com "Distributing provenance"
|
||||
[3]: https://docs.sigstore.dev/about/bundle/?utm_source=chatgpt.com "Sigstore Bundle Format"
|
||||
[4]: https://spdx.dev/capturing-software-vulnerability-data-in-spdx-3-0/?utm_source=chatgpt.com "Capturing Software Vulnerability Data in SPDX 3.0"
|
||||
[5]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis?utm_source=chatgpt.com "Reachability analysis | Snyk User Docs"
|
||||
[6]: https://github.com/openvex/spec/blob/main/OPENVEX-SPEC.md?utm_source=chatgpt.com "spec/OPENVEX-SPEC.md at main"
|
||||
[7]: https://docs.docker.com/scout/how-tos/create-exceptions-vex/?utm_source=chatgpt.com "Create an exception using the VEX"
|
||||
[8]: https://docs.sigstore.dev/logging/overview/?utm_source=chatgpt.com "Rekor"
|
||||
[9]: https://openssf.org/blog/2023/09/07/vdr-vex-openvex-and-csaf/?utm_source=chatgpt.com "VDR, VEX, OpenVEX and CSAF"
|
||||
@@ -0,0 +1,595 @@
|
||||
Here’s a compact pattern you can drop into Stella Ops to make reachability checks fast, reproducible, and audit‑friendly.
|
||||
|
||||
---
|
||||
|
||||
# Lazy, single‑use reachability cache + signed “reach‑map” artifacts
|
||||
|
||||
**Why:** reachability queries explode combinatorially; precomputing everything wastes RAM and goes stale. Cache results only when first asked, make them deterministic, and emit a signed artifact so the same evidence can be replayed in VEX proofs.
|
||||
|
||||
**Core ideas (plain English):**
|
||||
|
||||
* **Lazy on first call:** compute only the exact path/query requested; cache that result.
|
||||
* **Deterministic key:** cache key = `algo_signature + inputs_hash + call_path_hash` so the same inputs always hit the same entry.
|
||||
* **Single‑use / bounded TTL:** entries survive just long enough to serve concurrent deduped calls, then get evicted (or on TTL/size). This keeps memory tight and avoids stale proofs.
|
||||
* **Reach‑map artifact:** every cache fill writes a compact, deterministic JSON “reach‑map” (edges, justifications, versions, timestamps) and signs it (DSSE). The artifact is what VEX cites, not volatile memory.
|
||||
* **Replayable proofs:** later runs can skip recomputation by verifying + loading the reach‑map, yielding byte‑for‑byte identical evidence.
|
||||
|
||||
**Minimal shape (C#/.NET 10):**
|
||||
|
||||
```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public readonly record struct ReachKey(
    string AlgoSig,       // e.g., "RTA@sha256:…"
    string InputsHash,    // SBOM slice + policy + versions
    string CallPathHash   // normalized query graph (src->sink, opts)
);

public sealed class ReachCache {
    private readonly ConcurrentDictionary<ReachKey, Lazy<Task<ReachResult>>> _memo = new();

    public async Task<ReachResult> GetOrComputeAsync(
        ReachKey key,
        Func<Task<ReachResult>> compute,
        CancellationToken ct)
    {
        // Lazy<Task<…>> coalesces concurrent callers onto a single computation per key.
        var lazy = _memo.GetOrAdd(key, _ => new Lazy<Task<ReachResult>>(
            compute, LazyThreadSafetyMode.ExecutionAndPublication));

        try {
            return await lazy.Value.WaitAsync(ct);
        }
        catch {
            _memo.TryRemove(key, out _); // don’t retain failures (or cancellations)
            throw;
        }
    }

    public void Evict(ReachKey key) => _memo.TryRemove(key, out _);
}
|
||||
```
|
||||
|
||||
**Compute path → emit DSSE reach‑map (pseudocode):**
|
||||
|
||||
```csharp
var result = await cache.GetOrComputeAsync(key, async () => {
    var graph = BuildSlice(inputs);              // deterministic ordering!
    var paths = FindReachable(graph, query);     // your chosen algo
    var reachMap = Canonicalize(new {
        algo = key.AlgoSig,
        inputs_hash = key.InputsHash,
        call_path = key.CallPathHash,
        edges = paths.Edges,
        witnesses = paths.Witnesses,             // file:line, symbol ids, versions
        created = NowUtcIso8601()
    });
    var dsse = Dsse.Sign(reachMap, signingKey);  // e.g., in‑toto/DSSE
    await ArtifactStore.PutAsync(KeyToPath(key), dsse.Bytes);
    return new ReachResult(paths, dsse.Digest);
}, ct);
```
|
||||
|
||||
**Operational rules:**
|
||||
|
||||
* **Canonical everything:** sort nodes/edges, normalize file paths, strip nondeterministic fields.
|
||||
* **Cache scope:** per‑scan, per‑workspace, or per‑feed version. Evict on feed/policy changes.
|
||||
* **TTL:** e.g., 15–60 minutes; or evict after pipeline completes. Guard with a max‑entries cap.
|
||||
* **Concurrency:** use `Lazy<Task<…>>` (above) to coalesce duplicate in‑flight calls.
|
||||
* **Validation path:** before computing, look for `reach-map.dsse` by `ReachKey`; if signature verifies and schema version matches, load and return (no compute).
|
||||
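A sketch of the validation path from the last rule, in the same pseudocode style as the blocks above; it reuses the hypothetical `ArtifactStore`, `Dsse`, `KeyToPath`, and `ReachResult` names, and `ReachMap` with its schema check is likewise an assumption.

```csharp
static async Task<ReachResult?> TryLoadSignedAsync(ReachKey key, byte[] trustedKey, CancellationToken ct)
{
    var bytes = await ArtifactStore.TryGetAsync(KeyToPath(key), ct);
    if (bytes is null) return null;                    // nothing cached → compute

    if (!Dsse.Verify(bytes, trustedKey, out var payload))
        return null;                                   // bad signature → recompute

    var map = JsonSerializer.Deserialize<ReachMap>(payload);
    if (map is null || map.SchemaVersion != "reach-map/v1")
        return null;                                   // schema drift → recompute

    return ReachResult.FromReachMap(map);              // replay: identical evidence, no compute
}
```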
|
||||
**How this helps VEX in Stella Ops:**
|
||||
|
||||
* **Consistency:** the DSSE reach‑map is the evidence blob your VEX record links to.
|
||||
* **Speed:** repeat scans and parallel microservices reuse cached or pre‑signed artifacts.
|
||||
* **Memory safety:** no unbounded precompute; everything is small, query‑driven.
|
||||
|
||||
**Drop‑in tasks for your agents:**
|
||||
|
||||
1. **Define ReachKey builders** in `Scanner.WebService` (inputs hash = SBOM slice + policy + resolver versions).
|
||||
2. **Add ReachCache** as a scoped service with size/TTL config (appsettings → `Scanner.Reach.Cache`).
|
||||
3. **Implement Canonicalize + Dsse.Sign** in `StellaOps.Crypto` (support FIPS/eIDAS/GOST modes).
|
||||
4. **ArtifactStore**: write/read `reach-map.dsse.json` under deterministic path:
|
||||
`artifacts/reach/<algo>/<inputsHash>/<callPathHash>.dsse.json`.
|
||||
5. **Wire VEXer** to reference the artifact digest and include a verification note.
|
||||
6. **Tests:** golden fixtures asserting stable bytes for the same inputs; mutation tests to ensure any input change invalidates the cache key.
|
||||
|
||||
If you want, I can turn this into a ready‑to‑commit `StellaOps.Scanner.Reach` module (interfaces, options, tests, and a stub DSSE signer).
|
||||
I will split this into two parts:
|
||||
|
||||
1. What are Stella Ops’ concrete advantages (the “moats”).
|
||||
2. How developers must build to actually realize them (guidelines and checklists).
|
||||
|
||||
---
|
||||
|
||||
## 1. Stella Ops Advantages – What We Are Optimizing For
|
||||
|
||||
### 1.1 Deterministic, Replayable Security Evidence
|
||||
|
||||
**Idea:** Any scan or VEX decision run today must be replayable bit-for-bit in 3–5 years for audits, disputes, and compliance.
|
||||
|
||||
**What this means:**
|
||||
|
||||
* Every scan has an explicit **input manifest** (feeds, rules, policies, versions, timestamps).
|
||||
* Outputs (findings, reachability, VEX, attestations) are **pure functions** of that manifest.
|
||||
* Evidence is stored as **immutable artifacts** (DSSE, SBOMs, reach-maps, policy snapshots), not just rows in a DB.
|
||||
|
||||
---
|
||||
|
||||
### 1.2 Reachability-First, Quiet-By-Design Triage
|
||||
|
||||
**Idea:** The main value is not “finding more CVEs” but **proving which ones matter** in your actual runtime and call graph – and keeping noise down.
|
||||
|
||||
**What this means:**
|
||||
|
||||
* Scoring/prioritization is dominated by **reachability + runtime context**, not just CVSS.
|
||||
* Unknowns and partial evidence are surfaced **explicitly**, not hidden.
|
||||
* UX is intentionally quiet: “Can I ship?” → “Yes / No, because of these N concrete, reachable issues.”
|
||||
|
||||
---
|
||||
|
||||
### 1.3 Crypto-Sovereign, Air-Gap-Ready Trust
|
||||
|
||||
**Idea:** The platform must run offline, support local CAs/HSMs, and switch between cryptographic regimes (FIPS, eIDAS, GOST, SM, PQC) by configuration, not by code changes.
|
||||
|
||||
**What this means:**
|
||||
|
||||
* No hard dependency on any public CA, cloud KMS, or single trust provider.
|
||||
* All attestations are **locally verifiable** with bundled roots and policies.
|
||||
* Crypto suites are **pluggable profiles** selected per deployment / tenant.
|
||||
|
||||
---
|
||||
|
||||
### 1.4 Policy / Lattice Engine (“Trust Algebra Studio”)
|
||||
|
||||
**Idea:** Vendors, customers, and regulators speak different languages. Stella Ops provides a **formal lattice** to merge and reason over:
|
||||
|
||||
* VEX statements
|
||||
* Runtime observations
|
||||
* Code provenance
|
||||
* Organizational policies
|
||||
|
||||
…without losing provenance (“who said what”).
|
||||
|
||||
**What this means:**
|
||||
|
||||
* Clear separation between **facts** (observations) and **policies** (how we rank/merge them).
|
||||
* Lattice merge operations are **explicit, testable functions**, not hidden heuristics.
|
||||
* Same artifact can be interpreted differently by different tenants via different lattice policies.
|
||||
|
||||
---
|
||||
|
||||
### 1.5 Proof-Linked SBOM→VEX Chain
|
||||
|
||||
**Idea:** Every VEX claim must point to concrete, verifiable evidence:
|
||||
|
||||
* Which SBOM / version?
|
||||
* Which reachability analysis?
|
||||
* Which runtime signals?
|
||||
* Which signer/policy?
|
||||
|
||||
**What this means:**
|
||||
|
||||
* VEX is not just a JSON document – it is a **graph of links** to attestations and analysis artifacts.
|
||||
* You can click from a VEX statement to the exact DSSE reach-map / scan run that justified it.
|
||||
|
||||
---
|
||||
|
||||
### 1.6 Proof-of-Integrity Graph (Build → Image → Runtime)
|
||||
|
||||
**Idea:** Connect:
|
||||
|
||||
* Source → Build → Image → SBOM → Scan → VEX → Runtime
|
||||
|
||||
…into a single **cryptographically verifiable graph**.
|
||||
|
||||
**What this means:**
|
||||
|
||||
* Every step has a **signed attestation** (in-toto/DSSE style).
|
||||
* Graph queries like “Show me all running pods that descend from this compromised builder” or
|
||||
“Show me all VEX statements that rely on this revoked key” are first-class.
|
||||
|
||||
---
|
||||
|
||||
### 1.7 AI Codex & Zastava Companion (Explainable by Construction)
|
||||
|
||||
**Idea:** AI is used only as a **narrator and planner** on top of hard evidence, not as an oracle.
|
||||
|
||||
**What this means:**
|
||||
|
||||
* Zastava never invents facts; it explains **what is already in the evidence graph**.
|
||||
* Remediation plans cite **concrete artifacts** (scan IDs, attestations, policies) and affected assets.
|
||||
* All AI outputs include links back to raw structured data and can be re-generated in future with the same evidence set.
|
||||
|
||||
---
|
||||
|
||||
### 1.8 Proof-Market Ledger & Adaptive Trust Economics
|
||||
|
||||
**Idea:** Over time, vendors publishing good SBOM/VEX evidence should **gain trust-credit**; sloppy or contradictory publishers lose it.
|
||||
|
||||
**What this means:**
|
||||
|
||||
* A ledger of **published proofs**, signatures, and revocations.
|
||||
* A **trust score** per artifact / signer / vendor, derived from consistency, coverage, and historical correctness.
|
||||
* This feeds into procurement and risk dashboards, not just security triage.
|
||||
|
||||
---
|
||||
|
||||
## 2. Developer Guidelines – How to Build for These Advantages
|
||||
|
||||
I will phrase this as rules and checklists you can directly apply in Stella Ops repos (.NET 10, C#, Postgres, MongoDB, etc.).
|
||||
|
||||
---
|
||||
|
||||
### 2.1 Determinism & Replayability
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Pure functions, explicit manifests**
|
||||
|
||||
* Any long-running or non-trivial computation (scan, reachability, lattice merge, trust score) must accept a **single, structured input manifest**, e.g.:
|
||||
|
||||
```jsonc
{
  "scannerVersion": "1.3.0",
  "rulesetId": "stella-default-2025.11",
  "feeds": {
    "nvdDigest": "sha256:...",
    "osvDigest": "sha256:..."
  },
  "sbomDigest": "sha256:...",
  "policyDigest": "sha256:..."
}
```
|
||||
|
||||
* No hidden configuration from environment variables, machine-local files, or system clock inside the core algorithm.
|
||||
|
||||
2. **Canonicalization everywhere**
|
||||
|
||||
* Before hashing or signing:
|
||||
|
||||
* Sort arrays by stable keys.
|
||||
* Normalize paths (POSIX style), line endings (LF), and encodings (UTF-8).
|
||||
* Provide a shared `StellaOps.Core.Canonicalization` library used by all services.
|
||||
|
||||
3. **Stable IDs**
|
||||
|
||||
* Every scan, reachability call, lattice evaluation, and VEX bundle gets an opaque but **stable** ID based on the input manifest hash.
|
||||
* Do not use incremental integer IDs for evidence; use digests (hashes) or ULIDs/GUIDs derived from content (a short sketch follows after the checklist below).
|
||||
|
||||
4. **Golden fixtures**
|
||||
|
||||
* For each non-trivial algorithm, ship at least one **golden fixture**:
|
||||
|
||||
* Input manifest JSON
|
||||
* Expected output JSON
|
||||
* CI must assert byte-for-byte equality for these fixtures (after canonicalization).
|
||||
|
||||
**Developer checklist (per feature):**
|
||||
|
||||
* [ ] Input manifest type defined and versioned.
|
||||
* [ ] Canonicalization applied before hashing/signing.
|
||||
* [ ] Output stored with `inputsDigest` and `algoDigest`.
|
||||
* [ ] At least one golden fixture proves determinism.
|
||||
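For the stable-IDs rule, a content-derived ID can be as simple as hashing the canonical input manifest; the prefix and helper name here are illustrative.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EvidenceId
{
    // Stable, content-derived ID: the same canonical input manifest always yields
    // the same ID, so evidence can be replayed and cross-referenced years later.
    public static string FromCanonicalManifest(string canonicalManifestJson)
    {
        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonicalManifestJson));
        return "scan:sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```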
|
||||
---
|
||||
|
||||
### 2.2 Reachability-First Analysis & Quiet UX
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Reachability lives in Scanner.WebService**
|
||||
|
||||
* All lattice/graph heavy lifting for reachability must run in `Scanner.WebService` (standing architectural rule).
|
||||
* Other services (Concelier, Excitors, Feedser) only **consume** reachability artifacts and must preserve the source they prune from (never rewrite paths/proofs, only annotate or filter).
|
||||
|
||||
2. **Lazy, query-driven computation**
|
||||
|
||||
* Do not precompute reachability for entire SBOMs.
|
||||
* Compute per **exact query** (image + vulnerability or source→sink path).
|
||||
* Use an in-memory or short-lived cache keyed by:
|
||||
|
||||
* Algorithm signature
|
||||
* Input manifest hash
|
||||
* Query description (call-path hash)
|
||||
|
||||
3. **Evidence-first, severity-second**
|
||||
|
||||
* Internal ranking objects should look like:
|
||||
|
||||
```csharp
public sealed record FindingRank(
    string FindingId,
    EvidencePointer Evidence,
    ReachabilityScore Reach,
    ExploitStatus Exploit,
    RuntimePresence Runtime,
    double FinalScore);
```
|
||||
|
||||
* UI always has a “Show evidence” or “Explain” action that can be serialized as JSON and re-used by Zastava.
|
||||
|
||||
4. **Quiet-by-design UX**
|
||||
|
||||
* For any list view, default sort is:
|
||||
|
||||
1. Reachable, exploitable, runtime-present
|
||||
2. Reachable, exploitable
|
||||
3. Reachable, unknown exploit
|
||||
4. Unreachable
|
||||
* Show **counts by bucket**, not only total CVE count.
|
||||
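The default sort above can be expressed as a tiny bucket function that the UI and API share; the names are illustrative.

```csharp
public sealed record TriageSignals(bool Reachable, bool ExploitKnown, bool RuntimePresent);

public static class TriageBuckets
{
    // Lower bucket number = shown first; mirrors the default sort described above.
    public static int Bucket(TriageSignals s) => s switch
    {
        { Reachable: true, ExploitKnown: true, RuntimePresent: true } => 1,
        { Reachable: true, ExploitKnown: true }                        => 2,
        { Reachable: true }                                            => 3,
        _                                                              => 4,
    };
}
```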
|
||||
**Developer checklist:**
|
||||
|
||||
* [ ] Reachability algorithms only in Scanner.WebService.
|
||||
* [ ] Cache is lazy and keyed by deterministic inputs.
|
||||
* [ ] Output includes explicit evidence pointers.
|
||||
* [ ] UI endpoints expose reachability state in structured form.
|
||||
|
||||
---
|
||||
|
||||
### 2.3 Crypto-Sovereign & Air-Gap Mode
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Cryptography via “profiles”**
|
||||
|
||||
* Implement a `CryptoProfile` abstraction (e.g. `FipsProfile`, `GostProfile`, `EidasProfile`, `SmProfile`, `PqcProfile`).
|
||||
* All signing/verifying APIs take a `CryptoProfile` or resolve one from tenant config; no direct calls to `RSA.Create()` etc. in business code (an interface sketch follows after these rules).
|
||||
|
||||
2. **No hard dependency on public PKI**
|
||||
|
||||
* All verification logic must accept:
|
||||
|
||||
* Provided root cert bundle
|
||||
* Local CRL or OCSP-equivalent
|
||||
* Never assume internet OCSP/CRL.
|
||||
|
||||
3. **Offline bundles**
|
||||
|
||||
* Any operation required for air-gapped mode must be satisfiable with:
|
||||
|
||||
* SBOM + feeds + policy bundle + key material
|
||||
* Define explicit **“offline bundle” formats** (zip/tar + manifest) with hashes of all contents.
|
||||
|
||||
4. **Key rotation and algorithm agility**
|
||||
|
||||
* Metadata for every signature must record:
|
||||
|
||||
* Algorithm
|
||||
* Key ID
|
||||
* Profile
|
||||
* Verification code must fail safely when a profile is disabled, and error messages must be precise.
|
||||
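A minimal sketch of the profile abstraction referenced in rule 1; the interface shape is an assumption, the point is that business code depends only on it and records the metadata from rule 4.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical profile abstraction: business code never touches RSA/ECDsa directly,
// it only asks the active profile to sign or verify.
public interface ICryptoProfile
{
    string Name { get; }   // "fips", "gost", "eidas", "sm", "pqc"
    Task<byte[]> SignAsync(byte[] payload, string keyId, CancellationToken ct = default);
    Task<bool> VerifyAsync(byte[] payload, byte[] signature, string keyId, CancellationToken ct = default);
}

// Metadata every signature must carry (rule 4).
public sealed record SignatureMetadata(string Algorithm, string KeyId, string Profile);
```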
|
||||
**Developer checklist:**
|
||||
|
||||
* [ ] No direct crypto calls in feature code; only via profile layer.
|
||||
* [ ] All attestations carry algorithm + key id + profile.
|
||||
* [ ] Offline bundle type exists for this workflow.
|
||||
* [ ] Tests for at least 2 different crypto profiles.
|
||||
|
||||
---
|
||||
|
||||
### 2.4 Policy / Lattice Engine
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Facts vs. Policies separation**
|
||||
|
||||
* Facts:
|
||||
|
||||
* SBOM components, CVEs, reachability edges, runtime signals.
|
||||
* Policies:
|
||||
|
||||
* “If vendor says ‘not affected’ and reachability says unreachable, treat as Informational.”
|
||||
* Serialize facts and policies separately, with their own digests.
|
||||
|
||||
2. **Lattice implementation location**
|
||||
|
||||
* Lattice evaluation (trust algebra) for VEX decisions happens in:
|
||||
|
||||
* `Scanner.WebService` for scan-time interpretation
|
||||
* `Vexer/Excitor` for publishing and transformation into VEX documents
|
||||
* Concelier/Feedser must not recompute lattice results, only read them.
|
||||
|
||||
3. **Formal merge operations**
|
||||
|
||||
* Each lattice merge function must be:
|
||||
|
||||
* Explicitly named (e.g. `MaxSeverity`, `VendorOverridesIfStrongerEvidence`, `ConservativeIntersection`).
|
||||
* Versioned and referenced by ID in artifacts (e.g. `latticeAlgo: "trust-algebra/v1/max-severity"`).
|
||||
|
||||
4. **Studio-ready representation**
|
||||
|
||||
* Internal data structures must align with a future “Trust Algebra Studio” UI:
|
||||
|
||||
* Nodes = statements (VEX, runtime observation, reachability result)
|
||||
* Edges = “derived_from” / “overrides” / “constraints”
|
||||
* Policies = transformations over these graphs.
|
||||
|
||||
**Developer checklist:**
|
||||
|
||||
* [ ] Facts and policies are serialized separately.
|
||||
* [ ] Lattice code is in allowed services only.
|
||||
* [ ] Merge strategies are named and versioned.
|
||||
* [ ] Artifacts record which lattice algorithm was used.
|
||||
|
||||
---
|
||||
|
||||
### 2.5 Proof-Linked SBOM→VEX Chain
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Link, don’t merge**
|
||||
|
||||
* SBOM, scan result, reachability artifact, and VEX should keep their own schemas.
|
||||
* Use **linking IDs** instead of denormalizing everything into one mega-document.
|
||||
|
||||
2. **Evidence pointers in VEX**
|
||||
|
||||
* Every VEX statement (per vuln/component) includes:
|
||||
|
||||
* `sbomDigest`
|
||||
* `scanId`
|
||||
* `reachMapDigest`
|
||||
* `policyDigest`
|
||||
* `signerKeyId`
|
||||
|
||||
3. **DSSE everywhere**
|
||||
|
||||
* All analysis artifacts are wrapped in DSSE:
|
||||
|
||||
* Payload = canonical JSON
|
||||
* Envelope = signature + key metadata + profile
|
||||
* Do not invent yet another custom envelope format.
|
||||
|
||||
**Developer checklist:**
|
||||
|
||||
* [ ] VEX schema includes pointers back to all upstream artifacts.
|
||||
* [ ] No duplication of SBOM or scan content inside VEX.
|
||||
* [ ] DSSE used as standard envelope type.
|
||||
|
||||
---
|
||||
|
||||
### 2.6 Proof-of-Integrity Graph
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Graph-first storage model**
|
||||
|
||||
* Model the lifecycle as a graph:
|
||||
|
||||
* Nodes: source commit, build, image, SBOM, scan, VEX, runtime instance.
|
||||
* Edges: “built_from”, “scanned_as”, “deployed_as”, “derived_from”.
|
||||
* Use stable IDs and store in a graph-friendly form (e.g. adjacency collections in Postgres or document graph in Mongo).
|
||||
|
||||
2. **Attestations as edges**
|
||||
|
||||
* Attestations represent edges, not just metadata blobs.
|
||||
* Example: a build attestation is an edge: `commit -> image`, signed by the CI builder (see the edge sketch after these rules).
|
||||
|
||||
3. **Queryable from APIs**
|
||||
|
||||
* Expose API endpoints like:
|
||||
|
||||
* `GET /graph/runtime/{podId}/lineage`
|
||||
* `GET /graph/image/{digest}/vex`
|
||||
* Zastava and the UI must use the same APIs, not private shortcuts.
|
||||
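The "attestations as edges" rule can be captured in a single edge record; the names and edge kinds below mirror the bullets and are otherwise illustrative.

```csharp
using System;

public enum EdgeKind { BuiltFrom, ScannedAs, DeployedAs, DerivedFrom }

// The attestation *is* the edge: a signed statement connecting two lifecycle nodes.
public sealed record GraphEdge(
    string FromNodeId,            // e.g., "commit:abc123"
    string ToNodeId,              // e.g., "image:sha256:..."
    EdgeKind Kind,
    string AttestationDigest,     // DSSE envelope that asserts this relationship
    string SignerKeyId,
    DateTimeOffset AssertedAt);
```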
|
||||
**Developer checklist:**
|
||||
|
||||
* [ ] Graph nodes and edges modelled explicitly.
|
||||
* [ ] Each edge type has an attestation schema.
|
||||
* [ ] At least two graph traversal APIs implemented.
|
||||
|
||||
---
|
||||
|
||||
### 2.7 AI Codex & Zastava Companion
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Evidence in, explanation out**
|
||||
|
||||
* Zastava must receive:
|
||||
|
||||
* Explicit evidence bundle (JSON) for a question.
|
||||
* The user’s question.
|
||||
* It must not be responsible for data retrieval or correlation itself – that is the platform’s job.
|
||||
|
||||
2. **Stable explanation contracts**
|
||||
|
||||
* Define a structured response format, for example:
|
||||
|
||||
```json
{
  "shortAnswer": "You can ship, with 1 reachable critical.",
  "findingsSummary": [...],
  "remediationPlan": [...],
  "evidencePointers": [...]
}
```
|
||||
|
||||
* This allows regeneration and multi-language rendering later.
|
||||
|
||||
3. **No silent decisions**
|
||||
|
||||
* Every recommendation must include:
|
||||
|
||||
* Which lattice policy was assumed.
|
||||
* Which artifacts were used (by ID).
|
||||
|
||||
**Developer checklist:**
|
||||
|
||||
* [ ] Zastava APIs accept evidence bundles, not query strings against the DB.
|
||||
* [ ] Responses are structured and deterministic given the evidence.
|
||||
* [ ] Explanations include policy and artifact references.
|
||||
|
||||
---
|
||||
|
||||
### 2.8 Proof-Market Ledger & Adaptive Trust
|
||||
|
||||
**Rules:**
|
||||
|
||||
1. **Ledger as append-only**
|
||||
|
||||
* Treat proof-market ledger as an **append-only log**:
|
||||
|
||||
* New proofs (SBOM/VEX/attestations)
|
||||
* Revocations
|
||||
* Corrections / contradictions
|
||||
* Do not delete; instead emit revocation events.
|
||||
|
||||
2. **Trust-score derivation**
|
||||
|
||||
* Trust is not a free-form label; it is a numeric or lattice value computed from:
|
||||
|
||||
* Number of consistent proofs over time.
|
||||
* Speed of publishing after CVE.
|
||||
* Rate of contradictions or revocations.
|
||||
|
||||
3. **Separation from security decisions**
|
||||
|
||||
* Trust scores feed into:
|
||||
|
||||
* Sorting and highlighting.
|
||||
* Procurement / vendor dashboards.
|
||||
* Do not hard-gate security decisions solely on trust scores.
|
||||
|
||||
**Developer checklist:**

* [ ] Ledger is append-only with explicit revocations.
* [ ] Trust scoring algorithm documented and versioned.
* [ ] UI uses trust scores only as a dimension, not a gate.

---

### 2.9 Quantum-Resilient Mode

**Rules:**

1. **Optional PQC**

   * PQC algorithms (e.g. Dilithium, Falcon) are an **opt-in crypto profile**.
   * Artifacts can carry multiple signatures (classical + PQC) to ease migration (see the sketch after this list).

2. **No PQC assumption in core logic**

   * Core logic must treat the algorithm as opaque; only the crypto layer knows whether it is post-quantum or classical.
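One way to model the dual-signature carrying described in rule 1, sketched as an envelope where each signature records the crypto profile that produced it (type and property names are illustrative, not an existing schema):

```csharp
using System.Collections.Generic;
using System.Linq;

// Illustrative multi-signature envelope: one payload, several detached signatures,
// each tagged with the crypto profile (classical or PQC) that produced it.
public sealed record EnvelopeSignature(
    string KeyId,
    string Algorithm,        // e.g. "ecdsa-p256-sha256" or "ml-dsa-65" (Dilithium)
    string Profile,          // e.g. "fips", "pqc"
    byte[] Signature);

public sealed record MultiSignatureEnvelope(
    string PayloadType,      // e.g. "application/vnd.in-toto+json"
    byte[] Payload,
    IReadOnlyList<EnvelopeSignature> Signatures)
{
    // During migration both classical and PQC signatures are attached; either may satisfy policy.
    public bool IsDualSigned =>
        Signatures.Any(s => s.Profile == "pqc") && Signatures.Any(s => s.Profile != "pqc");
}
```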
**Developer checklist:**

* [ ] PQC profile implemented as a first-class profile.
* [ ] Artifacts support multi-signature envelopes.

---

## 3. Definition of Done Templates

You can use this as a per-feature DoD in Stella Ops:

**For any new feature that touches scans, VEX, or evidence:**

* [ ] Deterministic: input manifest defined, canonicalization applied, golden fixture(s) added.
* [ ] Evidence: outputs are DSSE-wrapped and linked (not merged) into existing artifacts.
* [ ] Reachability / Lattice: if applicable, runs only in allowed services and records algorithm IDs.
* [ ] Crypto: crypto calls go through the profile abstraction; tests for at least 2 profiles if security-sensitive.
* [ ] Graph: lineage edges added where appropriate; node/edge IDs stable and queryable.
* [ ] UX/API: at least one API to retrieve structured evidence for Zastava and the UI.
* [ ] Tests: unit + golden + at least one integration test with a full SBOM → scan → VEX chain.

If you want, the next step can be to pick one module (e.g. Scanner.WebService or Vexer) and turn these high-level rules into a concrete CONTRIBUTING.md / ARCHITECTURE.md for that service.
@@ -0,0 +1,646 @@

Here’s a compact, first‑time‑friendly plan to add two high‑leverage features to your platform: an **image Smart‑Diff** (with signed, policy‑aware deltas) and **better binaries** (symbol/byte‑level SCA + provenance + SARIF).

# Smart‑Diff (images & containers) — what/why/how

**What it is:** Compute *deltas* between two images (or layers) and enrich them with context: which files, which packages, which configs flip behavior, and whether the change is actually *reachable at runtime*. Then sign the report so downstream tools can trust it.

**Why it matters:** Teams drown in “changed but harmless” noise. A diff that knows “is this reachable, config‑activated, and under the running user?” prioritizes real risk and shortens MTTR.

**How to ship it (Stella Ops‑style, on‑prem, .NET 10):**

* **Scope of diff**

  * Layer → file → package → symbol (map file changes to package + version; map package to symbols/exports when available).
  * Config/env lens: overlay `ENTRYPOINT/CMD`, env, feature flags, mounted secrets, user/UID.
* **Reachability gates (3‑bit severity gate)**

  * `Reachable?` (call graph / entrypoints / process tree)
  * `Config‑activated?` (feature flags, env, args)
  * `Running user?` (match file/dir ACLs, capabilities, container `User:`)
  * Compute a severity class from these bits (e.g., 0–7) and attach a short rationale.
* **Attestation**

  * Emit a **DSSE‑wrapped in‑toto attestation** with the Smart‑Diff as predicate.
  * Include: artifact digests (old/new), diff summary, gate bits, rule versions, and scanner build info.
  * Sign offline; verify with cosign/rekor when online is available.
* **Predicate (minimal JSON shape)**

```json
{
  "predicateType": "stellaops.dev/predicates/smart-diff@v1",
  "predicate": {
    "baseImage": {"name":"...", "digest":"sha256:..."},
    "targetImage": {"name":"...", "digest":"sha256:..."},
    "diff": {
      "filesAdded": [...],
      "filesRemoved": [...],
      "filesChanged": [{"path":"...", "hunks":[...]}],
      "packagesChanged": [{"name":"openssl","from":"1.1.1u","to":"3.0.14"}]
    },
    "context": {
      "entrypoint":["/app/start"],
      "env":{"FEATURE_X":"true"},
      "user":{"uid":1001,"caps":["NET_BIND_SERVICE"]}
    },
    "reachabilityGate": {"reachable":true,"configActivated":true,"runningUser":false,"class":6},
    "scanner": {"name":"StellaOps.Scanner","version":"...","ruleset":"reachability-2025.12"}
  }
}
```

* **Pipelines**

  * Scanner computes diff → predicate JSON → DSSE envelope → write `.intoto.jsonl` (a minimal wrapping sketch follows after this list).
  * Optionally export a lightweight **human report** (markdown) and a **machine delta** (protobuf/JSON).
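As a rough illustration of the DSSE wrapping step above, the following C# sketch serializes a predicate statement, applies DSSE’s pre-authentication encoding (PAE), and signs it with an ECDSA P-256 key. It is a standalone sketch, not the actual `StellaOps.Attestation` API:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

public static class DsseSketch
{
    // Wraps an in-toto statement in a DSSE envelope (illustrative only).
    public static string Wrap(object inTotoStatement, ECDsa key, string keyId)
    {
        const string payloadType = "application/vnd.in-toto+json";
        byte[] payload = JsonSerializer.SerializeToUtf8Bytes(inTotoStatement);

        // DSSE pre-authentication encoding: "DSSEv1" SP len(type) SP type SP len(payload) SP payload.
        byte[] pae = Encoding.UTF8
            .GetBytes($"DSSEv1 {Encoding.UTF8.GetByteCount(payloadType)} {payloadType} {payload.Length} ")
            .Concat(payload)
            .ToArray();

        byte[] sig = key.SignData(pae, HashAlgorithmName.SHA256);

        var envelope = new
        {
            payloadType,
            payload = Convert.ToBase64String(payload),
            signatures = new[] { new { keyid = keyId, sig = Convert.ToBase64String(sig) } }
        };
        return JsonSerializer.Serialize(envelope);   // one line per statement when writing .intoto.jsonl
    }
}
```

Each serialized envelope becomes one line of the `.intoto.jsonl` output; cosign or a Rekor mirror then verifies the envelope, never the raw predicate.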
# Better binaries — symbol/byte SCA + provenance + SARIF

**What it is:** Go beyond package SBOMs. Identify *symbols, sections, and compiler fingerprints* in each produced binary; capture provenance (compiler, flags, LTO, link order, hashes); then emit:

1. an **in‑toto statement per binary**, and
2. a **SARIF 2.1.0** report for GitHub code scanning.

**Why it matters:** A lot of risk hides below package level (vendored code, static libs, LTO). Symbol/byte SCA catches it; provenance proves how the binary was built.

**How to ship it:**

* **Extractors (modular analyzers)**

  * ELF/PE/Mach‑O parsers (sections, imports/exports, build‑ids, rpaths).
  * Symbol tables (public + demangled), string tables, compiler notes (`.comment`), PDB/DWARF when present.
  * Fingerprints: rolling hashes per section/function; Bloom filters for quick symbol presence checks.
* **Provenance capture**

  * Compiler: name/version, target triple, LTO (on/off/mode).
  * Flags: `-O`, `-fstack-protector`, `-D_FORTIFY_SOURCE`, PIE/RELRO, CET/CFGuard.
  * Linker: version, libs, order, dead‑strip/LTO decisions.
* **In‑toto statement (per binary)**

```json
{
  "predicateType":"slsa.dev/provenance/v1",
  "subject":[{"name":"bin/app","digest":{"sha256":"..."}}],
  "predicate":{
    "builder":{"id":"stellaops://builder/ci"},
    "buildType":"stellaops.dev/build/native@v1",
    "metadata":{"buildInvocationID":"...","buildStartedOn":"...","buildFinishedOn":"..."},
    "materials":[{"uri":"git+ssh://...#<commit>","digest":{"sha1":"..."}}],
    "buildConfig":{
      "compiler":{"name":"clang","version":"18.1.3"},
      "flags":["-O2","-fstack-protector-strong","-fPIE"],
      "lto":"thin",
      "linker":{"name":"lld","version":"18.1.3"},
      "hardening":{"pie":true,"relro":"full","fortify":true}
    }
  }
}
```

* **SARIF 2.1.0 for GitHub code scanning**

  * One SARIF file per build (or per repo), tool name `StellaOps.BinarySCA`.
  * For each finding (e.g., vulnerable function signature or insecure linker flag), add:

    * `ruleId`, CWE/Vuln ID, severity, location (binary + symbol), `helpUri`.
  * Upload via Actions/API so issues appear in *Security → Code scanning alerts*.
* **CI wiring (on‑prem friendly)**

  * Build → run binary analyzers → write `binary.intoto.jsonl` + `sca.sarif.json`.
  * Sign the in‑toto statement (DSSE). If air‑gapped, store in your internal evidence bucket; sync to a Rekor mirror later.
  * Optional: export a compact “binary SBOM” (function inventory + hashes).

# Minimal .NET 10 / CLI layout (suggested)

```
src/Scanner/StellaOps.Scanner.SmartDiff/
src/Scanner/StellaOps.Scanner.BinarySCA/
src/Predicates/StellaOps.Predicates/      # JSON schemas, versioned
src/Sign/StellaOps.Attestation/           # DSSE envelopes, cosign integration
src/Exports/StellaOps.Exports.Sarif/
src/Exports/StellaOps.Exports.InToto/
```

* **Contracts:** freeze JSON Schemas under `StellaOps.Predicates` and version them (e.g., `smart-diff@v1`, `binary-provenance@v1`).
* **Determinism:** lock analyzer rulesets + feeds with content hashes; record them in each predicate (`rulesetDigest`).
# Quick wins this week

* Implement the **3‑bit reachability gate** and surface it in your UI filter.
* Emit **DSSE‑wrapped in‑toto** for Smart‑Diff first (binary provenance next).
* Add a **basic SARIF exporter** that maps binary findings → GitHub alerts (even if only a couple of rules to start: missing RELRO/PIE; insecure `__strcpy` occurrences).

If you want, I can draft:

* The JSON Schemas for `smart-diff@v1` and `binary-provenance@v1`.
* A tiny C# sample that wraps a predicate in DSSE and verifies with cosign.
* A SARIF emitter stub wired to your CI.

I will split this into two parts:

1. A compact list of Stella Ops advantages (how we win vs other scanners).
2. Concrete developer guidelines that make those advantages real in code.

---

## 1. Stella Ops advantages (what we are optimizing for)

When you build any feature, it should clearly reinforce at least one of these:

1. **Evidence-first, signed, replayable**
   Every non-trivial operation produces signed, DSSE-wrapped attestations and can be re-run later to obtain the same result byte-for-byte.

2. **Reachability-first triage (Smart-Diff + gates)**
   We never just say “this CVE exists”; we say: *it changed*, *it is or is not reachable*, *it is or is not activated by config*, and *which user actually executes it*.

3. **Binary-level SCA + provenance**
   We do not stop at packages. We inspect binaries (symbols, sections, toolchain fingerprints) and provide in-toto/SLSA provenance plus SARIF to development tools.

4. **Crypto-sovereign and offline-ready**
   All signing/verification can use local trust roots and local cryptographic profiles (FIPS / eIDAS / GOST / SM) with no hard dependency on public CAs or external clouds.

5. **Deterministic, replayable scans**
   A “scan” is a pure function of: artifact digests, feeds, rules, lattice policies, and configuration. Anything not captured there is a bug.

6. **Policy & lattice engine instead of ad-hoc rules**
   Risk and VEX decisions are the result of explicit lattice merge rules (“trust algebra”), not opaque if-else trees in the code.

7. **Proof-of-integrity graph**
   All artifacts (source → build → container → runtime) are connected in a cryptographic graph that can be traversed, audited, and exported.

8. **Quiet-by-design UX**
   The system is optimized to answer three questions fast:

   1) Can I ship this? 2) If not, what blocks me? 3) What is the minimal safe change?

Everything you build should clearly map to one or more of the above.

---
## 2. Developer guidelines by advantage

### 2.1 Evidence-first, signed, replayable

**Core rule:** Any non-trivial action must be traceable as a signed, replayable evidence record.

**Implementation guidelines**

1. **Uniform attestation model**

   * Define and use a shared library, e.g. `StellaOps.Predicates`, with:

     * Versioned JSON Schemas (e.g. `smart-diff@v1`, `binary-provenance@v1`, `reachability-summary@v1`).
     * Strongly-typed C# DTOs that match the schemas.
   * Every module (Scanner, Sbomer, Concelier, Excititor/Vexer, Authority, Scheduler, Feedser) must:

     * Emit **DSSE-wrapped in-toto statements**.
     * Use the same hashing strategy (e.g., SHA-256 over canonical JSON, no whitespace variance).
     * Include tool name, version, ruleset/feeds digests, and configuration id in each predicate.

2. **Link-not-merge**

   * Never rewrite or mutate third-party SBOM/VEX/attestations.
   * Instead:

     * Store original documents as immutable blobs addressed by hash.
     * Refer to them using digests and URIs (e.g. `sha256:…`) from your own predicates.
     * Emit **linking evidence**: “this SBOM (digest X) was used to compute decision Y”.

3. **Deterministic scan manifests**

   * Each scan must have a manifest object:

```json
{
  "artifactDigest": "sha256:...",
  "scannerVersion": "1.2.3",
  "rulesetDigest": "sha256:...",
  "feedsDigests": { "nvd": "sha256:...", "vendorX": "sha256:..." },
  "latticePolicyDigest": "sha256:...",
  "configId": "prod-eu-1",
  "timestamp": "2025-12-09T13:37:00Z"
}
```

   * Store it alongside results and include its digest in all predicates produced by that run (a canonical-hashing sketch follows after this list).
4. **Signing & verification**

   * All attestation writing goes through a single abstraction, e.g.:

```csharp
interface IAttestationSigner {
    Task<DSSEEnvelope> SignAsync<TPredicate>(TPredicate predicate, CancellationToken ct);
}
```

   * Implementations may use:

     * Sigstore (Fulcio + Rekor) when online.
     * Local keys (HSM, TPM, file key) when offline.
   * Never do ad-hoc crypto directly in features; always go through the shared crypto layer (see 2.4).
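Returning to the deterministic scan manifest in point 3, here is a sketch of one way to compute its digest: canonicalize the JSON (recursively sorted keys, compact output) and hash the UTF-8 bytes. This is illustrative and not the official Stella Ops canonicalization spec:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text.Json;
using System.Text.Json.Nodes;

public static class ManifestDigest
{
    // Canonical-JSON digest: sort object keys recursively, serialize compactly, SHA-256 the bytes.
    public static string Compute(string manifestJson)
    {
        JsonNode canonical = Canonicalize(JsonNode.Parse(manifestJson)!);
        byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(canonical);   // compact output, no whitespace variance
        return "sha256:" + Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
    }

    private static JsonNode Canonicalize(JsonNode node) => node switch
    {
        JsonObject obj => new JsonObject(obj
            .OrderBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => new KeyValuePair<string, JsonNode?>(p.Key, p.Value is null ? null : Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(e => e is null ? null : Canonicalize(e)).ToArray()),
        _ => node.DeepClone()                                            // leaf values are kept as-is
    };
}
```

The same routine can produce `rulesetDigest` and `feedsDigests`, so every digest recorded in a predicate is reproducible byte-for-byte.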
---

### 2.2 Reachability-first triage and Smart-Diff

**Core rule:** You must never treat “found a CVE” as sufficient. You must track change + reachability + config + execution context.

#### Smart-Diff

1. **Diff levels**

   * Implement layered diffs:

     * Image / layer → file → package → symbol.
   * Map:

     * File changes → owning package, version.
     * Package changes → known vulnerabilities and exports.
   * Attach config context: entrypoint, env vars, feature flags, user/UID.

2. **Signed Smart-Diff predicate**

   * Use a minimal shape like this (simplified):

```json
{
  "predicateType": "stellaops.dev/predicates/smart-diff@v1",
  "predicate": {
    "baseImage": {...},
    "targetImage": {...},
    "diff": {...},
    "context": {...},
    "reachabilityGate": {...},
    "scanner": {...}
  }
}
```

   * Always sign as DSSE and attach the scan manifest digest.

#### Reachability gate

Use the **3-bit gate** consistently:

* `reachable` (static/dynamic call graph says “yes”)
* `configActivated` (env/flags/args activate the code path)
* `runningUser` (the user/UID that can actually execute it)

Guidelines:

1. **Data model**

```csharp
public sealed record ReachabilityGate(
    bool?  Reachable,        // true / false / null for unknown
    bool?  ConfigActivated,
    bool?  RunningUser,
    int    Class,            // 0..7 derived from the bits when all known
    string Rationale         // short explanation, human-readable
);
```

2. **Unknowns must stay unknown**

   * Never silently treat `null` as `false` or `true`.
   * If any of the bits is `null`, compute `Class` only from known bits or set `Class = -1` to denote “incomplete”.
   * Feed all “unknown” cases into a dedicated “Unknowns ranking” path (separate heuristics and UX); a derivation sketch follows after this list.

3. **Where reachability is computed**

   * Respect your standing rule: **lattice and reachability algorithms run in `Scanner.WebService`**, not in Concelier, Feedser, or Excititor/Vexer.
   * Other services only:

     * Persist / index results.
     * Prune / filter based on policy.
     * Present data — never recompute core reachability.

4. **Caching reachability**

   * Key caches by:

     * Artifact digest (image/layer/binary).
     * Ruleset/lattice digest.
     * Language/runtime version (for static analysis).
   * Pattern:

     * The first time a call path is requested, compute and cache.
     * Subsequent accesses in the same scan use the in-memory cache.
     * For cross-scan reuse, store a compact summary keyed by (artifactDigest, rulesetDigest) in Scanner’s persistence node.
   * Never cache across incompatible rule or feed versions (a cache-key sketch follows after this list).
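Two small sketches tie the guidelines above together. First, a possible derivation of `Class` that respects the rule that unknowns stay unknown; the bit order is an assumption, not a specification:

```csharp
// Illustrative Class derivation: reachable is the most significant bit, runningUser the least.
// Any unknown bit yields Class = -1 so downstream code cannot mistake "incomplete" for "safe".
public static class ReachabilityClass
{
    public static int Derive(bool? reachable, bool? configActivated, bool? runningUser)
    {
        if (reachable is null || configActivated is null || runningUser is null)
            return -1;   // incomplete evidence; route to the Unknowns ranking path instead

        return ((reachable.Value       ? 1 : 0) << 2)
             | ((configActivated.Value ? 1 : 0) << 1)
             |  (runningUser.Value     ? 1 : 0);
    }
}
```

With this bit order, the sample Smart-Diff predicate earlier (reachable and config-activated, but not the running user) lands in class 6. Second, a sketch of a reachability cache keyed by exactly the inputs that make a result reusable; `TSummary` stands in for whatever compact summary Scanner persists:

```csharp
using System;
using System.Collections.Concurrent;

// Illustrative cache key: because it embeds ruleset and runtime identifiers,
// results computed under incompatible versions can never collide.
public sealed record ReachabilityCacheKey(
    string ArtifactDigest,     // image / layer / binary digest
    string RulesetDigest,      // lattice + reachability ruleset content hash
    string RuntimeVersion);    // e.g. "python-3.12", relevant for static analysis

public sealed class ReachabilityCache<TSummary>
{
    private readonly ConcurrentDictionary<ReachabilityCacheKey, TSummary> _entries = new();

    public TSummary GetOrCompute(ReachabilityCacheKey key, Func<ReachabilityCacheKey, TSummary> compute) =>
        _entries.GetOrAdd(key, compute);   // first request computes; later requests in the same scan reuse it
}
```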
---

### 2.3 Binary-level SCA and provenance

**Core rule:** Treat each built binary as a first-class subject with its own SBOM, SCA, and provenance.

1. **Pluggable analyzers**

   * Create analyzers per binary format/language:

     * ELF, PE, Mach-O.
     * Language/toolchain detectors (GCC/Clang/MSVC/.NET/Go/Rust).
   * Common interface:

```csharp
interface IBinaryAnalyzer {
    bool CanHandle(BinaryContext ctx);
    Task<BinaryAnalysisResult> AnalyzeAsync(BinaryContext ctx, CancellationToken ct);
}
```

2. **Binary SBOM + SCA**

   * Output per binary:

     * Function/symbol inventory (names, addresses).
     * Linked static libraries.
     * Detected third-party components (via fingerprints).
   * Map to known vulnerabilities via:

     * Symbol signatures.
     * Function-level or section-level hashes.
   * Emit:

     * CycloneDX/SPDX component entries for binaries.
     * A separate predicate `binary-sca@v1`.

3. **Provenance (in-toto/SLSA)**

   * Emit an in-toto statement per binary:

     * Subject = `bin/app` (digest).
     * Predicate = build metadata (compiler, flags, LTO, linker, hardening).
   * Always include:

     * Source material (git repo + commit).
     * Build environment (container image digest or runner OS).
     * Exact build command / script identifier.

4. **SARIF for GitHub / IDEs**

   * Provide an exporter:

     * Input: `BinaryAnalysisResult`.
     * Output: SARIF 2.1.0 with:

       * Findings: missing RELRO/PIE, unsafe functions, known vulns, weak flags.
       * Locations: binary path + symbol/function name.
   * Keep rule IDs stable and documented (e.g. `STB001_NO_RELRO`, `STB010_VULN_SYMBOL`); a stub exporter is sketched after this list.
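A stub of that exporter could look like the following; it emits a minimal but valid SARIF 2.1.0 run with the tool name `StellaOps.BinarySCA` and one result per finding. The `BinaryFinding` shape is assumed here for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Illustrative SARIF 2.1.0 stub: one run, one result per binary finding.
public sealed record BinaryFinding(string RuleId, string Level, string Message, string BinaryPath, string Symbol);

public static class SarifExporter
{
    public static string Export(IEnumerable<BinaryFinding> findings) => JsonSerializer.Serialize(new
    {
        version = "2.1.0",
        runs = new[]
        {
            new
            {
                tool = new { driver = new { name = "StellaOps.BinarySCA" } },
                results = findings.Select(f => new
                {
                    ruleId = f.RuleId,                 // e.g. "STB001_NO_RELRO"
                    level = f.Level,                   // "error" / "warning" / "note"
                    message = new { text = f.Message },
                    locations = new[]
                    {
                        new
                        {
                            physicalLocation = new { artifactLocation = new { uri = f.BinaryPath } },
                            logicalLocations = new[] { new { name = f.Symbol } }   // symbol/function context
                        }
                    }
                }).ToArray()
            }
        }
    }, new JsonSerializerOptions { WriteIndented = true });
}
```

Uploading the resulting file through the code-scanning API is then an ordinary CI step.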
---

### 2.4 Crypto-sovereign, offline-ready

**Core rule:** No feature may rely on a single global PKI or an always-online trust path.

1. **Crypto abstraction**

   * Introduce a narrow interface:

```csharp
interface ICryptoProfile {
    string Name { get; }
    IAttestationSigner AttestationSigner { get; }
    IVerifier DefaultVerifier { get; }
}
```

   * Provide implementations:

     * `FipsCryptoProfile`
     * `EUeIDASCryptoProfile`
     * `GostCryptoProfile`
     * `SmCryptoProfile`
   * Selection via configuration, not code changes (see the registration sketch after this list).

2. **Offline bundles**

   * Everything needed to verify a decision must be downloadable:

     * Scanner binaries.
     * Rules/feeds snapshot.
     * CA chains / trust roots.
     * Public keys for signers.
   * Implement a “bundle manifest” that ties these together and is itself signed.

3. **Rekor / ledger independence**

   * If Rekor is available:

     * Log attestations.
   * If not:

     * Log to the Stella Ops Proof-Market Ledger or an internal append-only store.
   * Features must not break when Rekor is absent.
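Going back to the crypto abstraction in point 1, one possible registration pattern uses the standard .NET dependency-injection container and a configuration key to pick the active profile (the `Crypto:Profile` key is a hypothetical name):

```csharp
using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

// Illustrative profile selection: switching between FIPS / eIDAS / GOST / SM (or a PQC profile)
// is a configuration change, never a code change.
public static class CryptoProfileRegistration
{
    public static IServiceCollection AddCryptoProfile(this IServiceCollection services, IConfiguration config)
    {
        string profileName = config["Crypto:Profile"] ?? "fips";   // hypothetical config key

        services.AddSingleton<ICryptoProfile>(_ => profileName switch
        {
            "fips"  => new FipsCryptoProfile(),
            "eidas" => new EUeIDASCryptoProfile(),
            "gost"  => new GostCryptoProfile(),
            "sm"    => new SmCryptoProfile(),
            _       => throw new InvalidOperationException($"Unknown crypto profile '{profileName}'")
        });

        // Downstream code only ever asks for the abstractions, never a concrete algorithm.
        services.AddSingleton(sp => sp.GetRequiredService<ICryptoProfile>().AttestationSigner);
        return services;
    }
}
```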
---

### 2.5 Policy & lattice engine

**Core rule:** Risk decisions are lattice evaluations over facts; do not hide policy logic inside business code.

1. **Facts vs policy**

   * Facts are:

     * CVE presence, severity, exploit data.
     * Reachability gates.
     * Runtime events (was this function ever executed?).
     * Vendor VEX statements.
   * Policy is:

     * Lattice definitions and merge rules.
     * Trust preferences (vendor vs runtime vs scanner).
   * In code:

     * Facts are input DTOs stored in the evidence graph.
     * Policy is JSON/YAML configuration with versioned schemas.

2. **Single evaluation engine in Scanner.WebService**

   * Lattice evaluation must run only in `StellaOps.Scanner.WebService` (your standing rule).
   * Other services:

     * Request decisions from Scanner.
     * Pass only references (IDs/digests) to facts, not raw policy.

3. **Deterministic evaluation**

   * Lattice evaluation must:

     * Use only input facts + policy.
     * Never depend on the current time, randomness, or environment state.
   * Every decision object must include (a minimal DTO is sketched after this list):

     * `policyDigest`
     * `inputFactsDigests[]`
     * `decisionReason` (short machine+human readable explanation)
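A minimal sketch of such a decision object as a C# DTO, with field names following the bullets above (the `Verdict` values are illustrative):

```csharp
using System.Collections.Generic;

// Illustrative decision DTO: everything needed to replay or audit the evaluation travels with the verdict.
public sealed record LatticeDecision(
    string Verdict,                            // e.g. "affected", "not_affected", "needs_review"
    string PolicyDigest,                       // digest of the lattice policy that was evaluated
    IReadOnlyList<string> InputFactsDigests,   // digests of every fact fed into the merge
    string DecisionReason);                    // short machine+human readable explanation
```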
---

### 2.6 Proof-of-integrity graph

**Core rule:** Everything is a node; all relationships are typed edges; nothing disappears.

1. **Graph model**

   * Nodes: source repo, commit, build job, SBOM, attestation, image, container runtime, host.
   * Edges: “built_from”, “scanned_with”, “deployed_as”, “executes_on”, “derived_from”.
   * Store in a graph store or a graph-like relational schema:

     * IDs are content digests where possible.

2. **Append-only**

   * Never delete or overwrite nodes; mark them as superseded if needed.
   * Evidence mutations (e.g. a new scan) are new nodes/edges.

3. **APIs**

   * Provide traversal APIs:

     * “Given this CVE, which production pods are affected?”
     * “Given this pod, show full ancestry up to source commit.”
   * All UI queries must work via these APIs, not ad-hoc joins.

---

### 2.7 Quiet-by-design UX and observability

**Core rule:** Default to minimal, actionable noise; logs and telemetry must be compliant and air-gap friendly.

1. **Triage model**

   * Classify everything into:

     * “Blockers” (fail pipeline).
     * “Needs review” (warn but pass).
     * “Noise” (hidden unless requested).
   * The classification uses (a classification sketch follows after this list):

     * Lattice decisions.
     * Reachability gates.
     * Environment criticality (prod vs dev).

2. **Evidence-centric UX**

   * Each UI card or API answer must:

     * Reference the underlying attestations by ID/digest.
     * Provide a one-click path to “show raw evidence”.

3. **Logging & telemetry defaults**

   * Logging:

     * Structured JSON.
     * No secrets, no PII, no full source in logs.
     * Local file + log rotation is the default.
   * Telemetry:

     * OpenTelemetry-compatible exporters.
     * Pluggable sinks:

       * In-memory (dev).
       * Postgres.
       * External APM if configured.
   * For on-prem:

     * All telemetry must be optional.
     * The system must be fully operational with only local logs.
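Returning to the triage model in point 1, here is a sketch of the classification as a pure function; the thresholds are placeholders and assume the reachability class bit order sketched in 2.2, since the real mapping is driven by lattice policy rather than code:

```csharp
// Illustrative triage classification from decision + gate + environment criticality.
public enum TriageBucket { Blocker, NeedsReview, Noise }

public static class Triage
{
    public static TriageBucket Classify(string verdict, int reachabilityClass, bool isProduction)
    {
        if (verdict == "not_affected") return TriageBucket.Noise;
        if (reachabilityClass < 0)     return TriageBucket.NeedsReview;          // unknowns are surfaced, never hidden
        if (isProduction && reachabilityClass >= 6) return TriageBucket.Blocker; // reachable + config-activated in prod
        return reachabilityClass >= 4 ? TriageBucket.NeedsReview : TriageBucket.Noise;
    }
}
```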
---

### 2.8 AI Codex / Zastava Companion

**Core rule:** AI is a consumer of the evidence graph, never a source of truth.

1. **Separation of roles**

   * Zastava:

     * Reads evidence, decisions, and context.
     * Produces explanations and remediation plans.
   * It must not:

     * Invent vulnerabilities or states not present in evidence.
     * Change decisions or policies.

2. **Interfaces**

   * Input:

     * IDs/digests of:

       * Attestations.
       * Lattice decisions.
       * Smart-Diff results.
   * Output:

     * Natural language summary.
     * Ordered remediation steps with references back to evidence IDs.

3. **Determinism around AI**

   * Core security behaviour must not depend on AI responses.
   * Pipelines should never “pass/fail based on AI text”.
   * AI is advice only; enforcement is always policy + lattice + evidence.

---
## 3. Cross-cutting rules for all Stella Ops developers

When you implement anything in Stella Ops, verify you comply with these:

1. **Determinism first**

   * If re-running with the same:

     * artifact digests,
     * feeds,
     * rules,
     * lattices,
     * config,
   * then results must be identical (except for timestamps and cryptographic randomness inside signatures).

2. **Offline-first**

   * No hard dependency on:

     * External CAs.
     * External DBs of vulnerabilities.
     * External ledgers.
   * All remote interactions must be:

     * Optional.
     * Pluggable.
     * Replaceable with local mirrors.

3. **Evidence over UI**

   * Never implement logic “only in the UI”.
   * The API and attestations must fully reflect what the UI shows.

4. **Contracts over convenience**

   * Schemas are contracts:

     * Version them.
     * Do not change existing fields’ meaning.
     * Add fields with defaults.
     * Deprecate explicitly, never silently break consumers.

5. **Golden fixtures**

   * For any new predicate or decision:

     * Create golden fixtures (input → output → attestations).
     * Use them in regression tests.
   * This is crucial for “deterministic replayable scans”.

6. **Respect service boundaries**

   * Scanner: facts + evaluation (lattices, reachability).
   * Sbomer: SBOM generation and normalization.
   * Concelier / Vexer: policy application, filtering, presentation; they prune and filter but preserve the source.
   * Authority: signing keys, crypto profiles, trust roots.
   * Feedser: feeds ingestion; must never “decide”, only normalize.

If you want, as a next step I can write a very concrete checklist for adding a **new scanner feature** (e.g. “Smart-Diff for Python wheels”) with the exact project structure (`src/Scanner/StellaOps.Scanner.*`), tests, and the minimal set of predicates and attestations that must be produced.
@@ -0,0 +1,366 @@

Stella DevOps UX Implementation Guide
=====================================

1\. DevOps Mental Model
-----------------------

**Development Direction:** Align the platform’s design with a DevOps engineer’s mental model of the software delivery flow. All key entities (pipelines, builds, artifacts, environments, deployments) should be first-class concepts in both UI and API. The system must allow tracing the path from code commit through CI/CD pipeline to the artifact and finally to the running environment, reflecting how DevOps think about changes moving through stages. This means using consistent identifiers (e.g. commit SHA, artifact version, build number) across views so everything is linked in a coherent flow[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is). For example, an engineer should easily follow a chain from a security control or test result, to the artifact produced, to where that artifact is deployed.
**Implementation Plan:** Model the domain objects (pipeline runs, artifacts, environments) in the backend with clear relationships. For instance, store each pipeline run with metadata: commit ID, associated artifact IDs, and target environment. Implement linking in the UI: pipeline run pages link to the artifacts they produced; artifact pages link to the deployments or environments where they’re running. Use tags or labels (in a database or artifact repository metadata) to tie artifacts back to source commits or tickets. This could leverage existing CI systems (Jenkins, GitLab CI, etc.) by pulling their data via APIs, or be built on a custom pipeline engine (e.g. Tekton on Kubernetes for native pipeline CRDs). Ensure any integration (with Git or ticketing) populates these references automatically. By tagging and correlating objects, we enable deep linking: e.g. clicking an artifact’s version shows which pipeline produced it and which environment it's in[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is).
**DevOps-facing Outcome:** DevOps users will experience a platform that “thinks” the way they do. In practice, they can trace a story of a change across the system: for a given commit, see the CI/CD run that built it, view the artifact (container image, package, etc.) with its SBOM and test results attached, and see exactly which environment or cluster is running that version[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is). This traceability instills confidence – it’s obvious where any given change is and what happened to it. New team members find the UI intuitive because it mirrors real deployment workflows rather than abstract concepts.
2\. Global UX Principles for DevOps
-----------------------------------

**Development Direction:** Build the user experience with an emphasis on **clarity, consistency, and minimal friction** for DevOps tasks. The platform should be intuitive enough that common actions require few clicks and little to no documentation. Use familiar conventions from other DevOps tools (icons, terminology, keyboard shortcuts) to leverage existing mental models[opslevel.com](https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex#:~:text=5,developers%20to%20provide%20multiple%20inputs). Prioritize core functionality over feature bloat to keep the interface straightforward – focus on the top tasks DevOps engineers perform daily. Every part of the tool (UI, CLI, API) should follow the same design principles so that switching contexts doesn’t confuse the user[opslevel.com](https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex#:~:text=,enter%20and%20maintain%20flow%20state).
**Implementation Plan:** Adopt a consistent design system and navigation structure across all modules. For example, use standard color coding (green for success, red for failure) and layout similar to popular CI/CD tools for pipeline status to meet user expectations[opslevel.com](https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex#:~:text=5,developers%20to%20provide%20multiple%20inputs). Implement safe defaults and templates: e.g. provide pipeline configuration templates and environment defaults so users aren’t overwhelmed with setup (following “convention over configuration” for common scenarios[opslevel.com](https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex#:~:text=make%20your%20tool%20more%20intuitive,developers%20to%20provide%20multiple%20inputs)). Ensure **immediate, contextual feedback** in the UI – if a pipeline fails, highlight the failed step with error details right there (no hunting through logs unnecessarily). Incorporate guidance into the product: for instance, tooltips or inline hints for first-time setup, but design the flow so that the “right way” is also the easiest way (leveraging constraints to guide best practices[opslevel.com](https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex#:~:text=1,must%20provide%20feedback%20that%E2%80%99s%20informative)). Integrate authentication and SSO with existing systems (LDAP/OIDC) to avoid extra logins, and integrate with familiar interfaces (ChatOps, Slack, IDE plugins) to reduce context-switching. Maintain parity between the web UI and CLI by making both use the same underlying APIs – this ensures consistency and that improvements apply to all interfaces. In development, use UX best practices such as usability testing with actual DevOps users to refine workflows (e.g. ensure creating a new environment or pipeline is a short, logical sequence). Keep pages responsive and lightweight for quick load times, as speed is part of good UX.
**DevOps-facing Outcome:** DevOps practitioners will find the tool **intuitive and efficient**. They can accomplish routine tasks (triggering a deployment, approving a change, checking logs) without referring to documentation, because the UI naturally leads them through workflows. The system provides feedback that is specific and actionable – for example, error messages clearly state what failed (e.g. “Deployment to QA failed policy check X”) and suggest next steps (with a link to the policy or waiver option), rather than generic errors[opslevel.com](https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex#:~:text=Well,not%20reading%20the%20documentation%2C%20but). Users notice that everything feels **familiar**: the terminology matches their conventions, and even the CLI commands and outputs align with tools they know. Friction is minimized: they aren’t wasting time on redundant confirmations or searching for information across different screens. Overall, this leads to improved flow state and productivity – the tool “gets out of the way” and lets DevOps focus on delivering software[opslevel.com](https://www.opslevel.com/resources/devex-series-part-2-how-tooling-affects-developer-experience-devex#:~:text=,enter%20and%20maintain%20flow%20state).
3\. Core Views DevOps Actually Need
-----------------------------------

### Pipeline/Run-Centric View
**Development Direction:** Provide a **pipeline-run dashboard** that gives a real-time and historical view of CI/CD pipeline executions. DevOps users need to see each pipeline run’s status, stages, and logs at a glance, with the ability to drill down into any step. Key requirements include visual indicators of progress (running, passed, failed), links to related entities (commit, artifacts produced, deployment targets), and controls to re-run or rollback if needed. Essentially, we need to build what is often seen in tools like Jenkins Blue Ocean or GitLab Pipelines: a clear timeline or graph of pipeline stages with results. The view should support filtering (by branch, status, timeframe) and show recent pipeline outcomes to quickly spot failures[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Information%20radiators%20display%20system%20status,demonstrate%20system%20health%20and%20trends).
**Implementation Plan:** Leverage the CI system’s data to populate this view. If using an existing CI (Jenkins/GitLab/GitHub Actions), integrate through their APIs to fetch pipeline run details (jobs, status, logs). Alternatively, if building a custom pipeline service (e.g. Tekton on Kubernetes), use its pipeline CRDs and results to construct the UI. Implement a **real-time update** mechanism (WebSocket or long-poll) so users can watch a running pipeline’s progress live (e.g. seeing stages turn green or red as they complete). The UI could be a linear timeline of stages or a node graph for parallel stages. Each stage node should be clickable to view logs and any artifacts from that stage. Include a sidebar or modal for logs with search and highlight (so DevOps can quickly diagnose failures). Provide controls to download logs or artifacts right from the UI. Integrate links: e.g. the commit hash in the pipeline header links to the SCM, the artifact name links to the artifact repository or artifact-centric view. If a pipeline fails a quality gate or test, highlight it and possibly prompt next actions (create a ticket or issue, or jump to evidence). Use CI webhooks or event listeners to update pipeline status in the platform database, and maintain a history of past runs. This can be backed by a database table (storing run id, pipeline id, status, duration, initiator, etc.) for querying and metrics.
**DevOps-facing Outcome:** The pipeline-centric view becomes the **mission control** for builds and releases. A DevOps engineer looking at this dashboard can immediately answer: “What’s the state of our pipelines right now?” They’ll see perhaps a list or grid of recent runs, with status color-codes (e.g. green check for success, red X for failure, yellow for running). They can click a failed pipeline and instantly see which stage failed and the error message, without wading through raw logs. For a running deployment, they might see a live streaming log of tests and a progress bar of stages. This greatly speeds up troubleshooting and situational awareness[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Information%20radiators%20display%20system%20status,demonstrate%20system%20health%20and%20trends). Moreover, from this view they can trigger actions – e.g. re-run a failed job or approve a manual gate – making it a one-stop interface for pipeline operations. Overall, this view ensures that pipeline status and history are highly visible (no more digging through Jenkins job lists or disparate tools), which supports faster feedback and collaboration (e.g. a team board showing these pipeline dashboards to all team members[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Information%20radiators%20display%20system%20status,demonstrate%20system%20health%20and%20trends)).
### Artifact-Centric View
**Development Direction:** Create an **artifact-centric view** that tracks the build outputs (artifacts) through their lifecycle. DevOps teams often manage artifacts like container images, binaries, or packages that are built once and then promoted across environments. This view should list artifact versions along with metadata: what build produced it, which tests it passed, security scan results, and where it’s currently deployed. The guiding principle is “promote artifacts, not code” – once an artifact is proven in one environment, it should be the same artifact moving forward[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Managing%20artifacts%20in%20classified%20environments,ensures%20artifact%20integrity%20and%20authenticity). Therefore, the system must support viewing an artifact (say version 1.2.3 of a service) and seeing its chain of custody: built by Pipeline #123 from Commit ABC, signed and stored in registry, deployed to Staging, awaiting promotion to Prod. It should also highlight if an artifact is **approved** (all checks passed) or if it carries any waivers/exceptions.
**Implementation Plan:** Integrate with artifact repositories and registries. For example, if using Docker images, connect to a container registry (AWS ECR, Docker Hub, etc.) via API or CLI to list image tags and digests. For JARs or packages, integrate with a binary repository (Artifactory, Nexus, etc.). Store metadata in a database linking artifact IDs (e.g. digest or version) to pipeline run and test results. The implementation could include a dedicated microservice to handle artifact metadata: when a pipeline produces a new artifact, record its details (checksum, storage URL, SBOM, test summary, security scan outcome). Implement the artifact view UI to display a table or list of artifact versions, each expandable to show details like: build timestamp, commit ID, link to pipeline run, list of environments where it’s deployed, and compliance status (e.g. “Signed ✅, Security scan ✅, Tests ✅”). Provide actions like promoting an artifact to an environment (which could trigger a deployment pipeline or Argo CD sync behind the scenes). Include **promotion workflows** with approvals – e.g. a button to “Promote to Production” that will enforce an approval if required by policy[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Managing%20artifacts%20in%20classified%20environments,ensures%20artifact%20integrity%20and%20authenticity). Ensure the artifact view can filter or search by component/service name and version. Behind the scenes, implement retention policies for artifacts (possibly configurable) and mark artifacts that are no longer deployed so they can be archived or cleaned up[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Managing%20artifacts%20in%20classified%20environments,ensures%20artifact%20integrity%20and%20authenticity). Use signing tools (like Cosign for container images) and display signature verification status in the UI to ensure integrity[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Managing%20artifacts%20in%20classified%20environments,ensures%20artifact%20integrity%20and%20authenticity). This likely means storing signature info and verification results in our metadata DB and updating on artifact fetch.
**DevOps-facing Outcome:** Users gain a **single source of truth** for artifacts. Instead of manually cross-referencing CI runs and Docker registries, they can go to “Artifact X version Y” page and get a comprehensive picture: “Built 2 days ago from commit `abc123` by pipeline #56[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Managing%20artifacts%20in%20classified%20environments,ensures%20artifact%20integrity%20and%20authenticity). Passed all tests and security checks. Currently in UAT and Prod.” They will see if the artifact was signed and by whom, and they can trust that what went through QA is exactly what’s in production (no surprise re-builds). If an artifact has a known vulnerability, they can quickly find everywhere it’s running. Conversely, if a deployment is failing, they can confirm the artifact’s provenance (maybe the issue is that it wasn’t the artifact they expected). This view also streamlines promotions: a DevOps engineer can promote a vetted artifact to the next environment with one click, knowing the platform will handle the deployment and update the status. Overall, the artifact-centric view reduces release errors by emphasizing immutability and traceability of builds, and it gives teams confidence that only approved artifacts progress through environments[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Managing%20artifacts%20in%20classified%20environments,ensures%20artifact%20integrity%20and%20authenticity).
### Environment/Cluster-Centric View
**Development Direction:** Provide an **environment or cluster-centric dashboard** focusing on the state of each deployment environment (Dev, QA, Prod, or specific Kubernetes clusters). DevOps need to see **what is running where** and the health/status of those environments. This view should show each environment’s active versions of services, configuration, last deployment time, and any pending changes or issues. Essentially, when selecting an environment (or a cluster), the user should see all relevant information: which artifacts/versions are deployed, whether there are any out-of-policy conditions, recent deployment history for that environment, and live metrics or alerts for it. It’s about answering “Is everything OK in environment X right now? What’s deployed there?” at a glance. The environment view should also integrate any Infrastructure-as-Code context – e.g. show if the environment’s infrastructure (Terraform, Kubernetes resources) is in sync or drifted from the desired state.
**Implementation Plan:** Represent environments as entities in the system with attributes and links to resources. For a Kubernetes cluster environment, integrate with the K8s API or Argo CD to fetch the list of deployed applications and their versions. For VM or cloud environments, integrate with deployment scripts or Terraform state: e.g. tag deployments with an environment ID so the system knows what’s deployed. Implement an environment overview page showing a grid or list of services in that environment and their current version (pull this from a deployment registry or continuous delivery tool). Include environment-specific status checks: e.g. call Kubernetes for pod statuses or use health check endpoints of services. If using Terraform or another IaC, query its state or run a drift detection (using Terraform plan or Terraform Cloud APIs) to identify differences between desired and actual infrastructure; highlight those if any. Additionally, integrate recent deployment logs: e.g. “Deployed version 1.2.3 of ServiceA 2 hours ago by pipeline #45 (passed ✅)” so that context is visible[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Information%20radiators%20display%20system%20status,demonstrate%20system%20health%20and%20trends). Enable quick access to logs or monitoring: e.g. links to Kibana for logs or Prometheus/Grafana for metrics specific to that environment. For environment config, provide a way to manage environment-specific variables or secrets (possibly by integrating with a vault or config management). This view might also expose controls like pausing deployments (maintenance mode) or manually triggering a rollback in that environment. If the organization uses approval gates on environments, show whether the environment is open for deployment or awaiting approvals. Use role-based access control to ensure users only see and act on environments they’re allowed to. In terms of tech, you might integrate with Kubernetes via the Kubernetes API (client libraries) for cluster state, and with cloud providers (AWS, etc.) for resource statuses. If multiple clusters, aggregate them or allow selecting each.
**DevOps-facing Outcome:** When a DevOps engineer opens the environment view (say for “Production”), they get a **comprehensive snapshot** of Prod. For example, they see that Service A version 2.3 is running (with a green check indicating all health checks pass), Service B version 1.8 is running but has a warning (perhaps a policy violation or a pod restarting). They can see that the last deployment was yesterday, and maybe an approval is pending for a new version (clearly indicated). They also notice any environment-level alerts (e.g. “Disk space low” or “Compliance drift detected: one config changed outside of pipeline”). This reduces the need to jump between different monitoring and deployment tools – key information is aggregated. They can directly access logs or metrics if something looks off. For example, if an incident occurs in production, the on-call can open this view to quickly find what changed recently and on which nodes. The environment-centric view thus **bridges operations and release info**: it’s not just what versions are deployed, but also their run-state and any issues. As a result, DevOps teams can more effectively manage environments, verify deployments, and ensure consistency. This high-level visibility aligns with best practices where environments are monitored and audited continuously[browserstack.com](https://www.browserstack.com/guide/azure-devops-environment#:~:text=Step%206%3A%20Monitor%20and%20Manage,Deployments) – the UI will show deployment history and status in one place, simplifying compliance and troubleshooting.
4\. Evidence & Proof Presentation
---------------------------------

**Development Direction:** The platform must automatically collect and present **evidence of compliance and quality** for each release, making audits and reviews straightforward. This means every pipeline and deployment should leave an “evidence trail” – test results, security scan reports, configuration snapshots, audit logs – that is organized and accessible. DevOps users (and auditors or security teams) need a dedicated view or report that proves all required checks were done (for example, that an artifact has an SBOM, passed vulnerability scanning, was signed, and met policy criteria). Essentially, treat evidence as a first-class artifact of the process, not an afterthought[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is). The UX should include dashboards or evidence pages where one can inspect and download these proofs, whether for an individual release or an environment’s compliance status.
**Implementation Plan:** **Automate evidence generation and storage** in the CI/CD pipeline. Incorporate steps in pipelines to generate artifacts like test reports (e.g. JUnit XML, coverage reports), security scan outputs (SAST/DAST results, SBOMs), and policy compliance logs. Use a secure storage (artifact repository or object storage bucket) for these evidence artifacts. For example, after a pipeline run, store the JUnit report and link it to that run record. Implement an “Evidence” section in the UI for each pipeline run or release: this could list the artifacts with download links or visual summaries (like a list of passed tests vs failed tests, vulnerability counts, etc.). Leverage “audit as code” practices – encode compliance checks as code so their output can be captured as evidence[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=cloud%20as%20an%20enabler,time%20visibility%20into%20system%20status). For instance, if using Policy as Code (OPA, HashiCorp Sentinel, etc.), have the pipeline produce a policy evaluation report and save it. Use version-controlled snapshots: for a deployment, take a snapshot of environment configuration (container image digests, config values) and store that as a JSON/YAML file as evidence of “what was deployed”. Utilize tagging and retention: mark these evidence files with the build or release ID and keep them immutably (perhaps using an object store with write-once settings[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is)). Integrate a **compliance dashboard** that aggregates evidence status – e.g. “100% of builds have test reports, 95% have no critical vulns” etc., for a quick view of compliance posture[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=cloud%20as%20an%20enabler,time%20visibility%20into%20system%20status). We may implement a database of compliance statuses (each control check per pipeline run) to quickly query and display summaries. Also, provide an export or report generation feature: allow users to download an “attestation bundle” (ZIP of SBOMs, test results, etc.) for a release to provide to auditors[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is). Security-wise, ensure this evidence store is append-only to prevent tampering (using object locks or checksums). In terms of tech, tools like **SLSA attestations** can be integrated to sign and verify evidence (for supply chain security). The UI can show verification status of attestation signatures to prove integrity.
**DevOps-facing Outcome:** DevOps teams and compliance officers will see a **clear, accessible trail of proof** for each deployment. For example, when viewing a particular release, they might see: _Tests:_ 120/120 passed (link to detailed results), _Security:_ 0 critical vulns (link to scanner report), _Config Audit:_ 1 minor deviation (waiver granted, link to waiver details). They can click any of those to dive deeper – e.g. open the actual security scan report artifact or view the SBOM file. Instead of scrambling to gather evidence from multiple tools at audit time, the platform surfaces it continuously[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=cloud%20as%20an%20enabler,time%20visibility%20into%20system%20status)[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is). An auditor or DevOps lead could open a compliance dashboard and see in real-time that all production releases have the required documentation and checks attached, and even download a bundle for an audit. This **builds trust** with stakeholders: when someone asks “How do we know this release is secure and compliant?”, the answer is a few clicks away in the evidence tab, not a week-long hunt. It also helps engineers themselves – if a question arises about “Did we run performance tests before this release?”, the evidence view will show if that artifact is present. By making evidence visible and automatic, it encourages teams to incorporate compliance into daily work (no more hidden spreadsheets or missing screenshots), ultimately making audits “boringly” smooth[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=Evidence%20should%20flow%20like%20logs%2C,good%3B%20centralizing%20with%20context%20is).
5\. Exception & Waiver UX
-------------------------

_Example of an exemption request form (Harness.io) where a user selects scope (pipeline, target, project), duration, and reason for a waiver. Our implementation will provide a similar interface to manage policy exceptions._
**Development Direction:** Implement a **controlled workflow for exceptions/waivers** that allows DevOps to override certain failures (policy violations, test failures) **only with proper approval and tracking**. In real-world pipelines, there are cases where a security vulnerability or policy may be temporarily excepted (waived) to unblock a deployment – but this must be done transparently and with accountability. The UX should make it easy to request an exception when needed (with justification) and to see the status of that request, but also make the presence of any waivers very visible to everyone (so they’re not forgotten). Key requirements: ability to request a waiver with specific scope (e.g. just for this pipeline run or environment, vs broader), mandatory reason and expiration for each waiver, an approval step by authorized roles, and an “exception register” in the UI that lists all active waivers and their expiry[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=secret%20sauce%3A%20each%20metric%20links,We%20don%E2%80%99t). Essentially, treat waivers as temporary, auditable objects in the system.
**Implementation Plan:** Build a feature where pipeline policy checks or scan results that would fail the pipeline can be turned into an **exception request**. For example, if a pipeline finds a critical vulnerability, provide a “Request Waiver” button next to the failure message in the UI. This triggers a form (like the image example) to capture details: scope of waiver (this specific deployment, this application, or whole project)[developer.harness.io](https://developer.harness.io/docs/security-testing-orchestration/exemptions/exemption-workflows/#:~:text=You%20can%20create%20exemption%20requests,specific%20occurrences%20within%20an%20issue), duration (e.g. 14 days or until a certain date), and a required reason category and description (like “Acceptable risk – low impact, fix in next release” or “False positive”[developer.harness.io](https://developer.harness.io/docs/security-testing-orchestration/exemptions/exemption-workflows/#:~:text=)). Once submitted, store the request in a database with status “Pending” and notify the appropriate approvers (could integrate with email/Slack or just within the app). Implement an approval interface where a security lead or product owner can review the request and either approve (possibly adjusting scope or duration)[developer.harness.io](https://developer.harness.io/docs/security-testing-orchestration/exemptions/exemption-workflows/#:~:text=Reviewers%20have%20the%20flexibility%20to,requests%2C%20refer%20to%20%2037) or reject it. Use role-based permissions to ensure only certain roles (e.g. Security Officer) can approve. If approved, the pipeline or policy engine should automatically apply that exception: e.g. mark that particular check as waived for the specified scope. This could be implemented by updating a policy store (for instance, adding an entry that “vuln XYZ is waived for app A in staging until date D”). The pipeline then reads these waivers on the next run so it doesn’t fail for a known, waived issue. Ensure the waiver is time-bound: perhaps schedule a job to auto-expire it (or the pipeline will treat it as fail after expiration). In the UI, implement an **“Active Waivers” dashboard**[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=secret%20sauce%3A%20each%20metric%20links,We%20don%E2%80%99t) listing all current exceptions, with details: what was waived, why, who approved, and countdown to expiration. Possibly show this on the environment and artifact views too (e.g. a banner “Running with 1 waiver: CVE-1234 in ServiceA (expires in 5 days)”). Also log all waiver actions in the audit trail. Technically, this could integrate with a policy engine like OPA – e.g. OPA could have a data map of exceptions which the policies check. Or simpler, our app’s database serves as the source of truth and our pipeline code consults it. Finally, enforce in code that any exception must have an owner and expiry set (no indefinite waivers) – e.g. do not allow submission without an expiry date, and prevent using expired waivers (pipeline should fail if an expired waiver is encountered). This follows the best practice of “time-boxed exceptions with owners”[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=secret%20sauce%3A%20each%20metric%20links,We%20don%E2%80%99t).
|
||||
|
||||
**DevOps-facing Outcome:** Instead of ad-hoc Slack approvals or lingering risk acceptances, DevOps users get a **transparent, self-service mechanism** to handle necessary exceptions. For example, if a pipeline is blocking a deployment due to a vulnerability that is a false positive, the engineer can click “Request Waiver”, fill in the justification (selecting “False positive” and adding notes) and submit. They will see the request in a pending state and, if authorized, an approver will get notified. Once approved, the pipeline might automatically continue or allow a rerun to succeed. In the UI, a clear label might mark that deployment as “**Waiver applied**” so it’s never hidden[developer.harness.io](https://developer.harness.io/docs/security-testing-orchestration/exemptions/exemption-workflows/#:~:text=Issue%20exemptions%20help%20unblock%20pipelines,to%20the%20issue%20exemptions%20workflow). The team and auditors can always consult the Waivers dashboard to see, for instance, that “CVE-1234 in ServiceA was waived for 7 days by Jane Doe on Oct 10, reason: Acceptable risk[developer.harness.io](https://developer.harness.io/docs/security-testing-orchestration/exemptions/exemption-workflows/#:~:text=Issue%20exemptions%20help%20unblock%20pipelines,to%20the%20issue%20exemptions%20workflow).” As waivers near expiration, perhaps the system alerts the team to fix the underlying issue. This prevents “forever exceptions” – it’s obvious if something is continuously waived. By integrating this UX, we **maintain velocity without sacrificing governance**: teams aren’t stuck when a known low-risk issue pops up, but any deviation from standards is documented and tracked. Over time, the exception log can even drive improvement (e.g. seeing which policies frequently get waived might indicate they need adjustment). In summary, DevOps engineers experience a workflow where getting an exception is streamlined yet responsible, and they always know which releases are carrying exceptions (no surprises to be caught in audits or incidents)[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=secret%20sauce%3A%20each%20metric%20links,We%20don%E2%80%99t).
|
||||
|
||||
6\. CLI and Automation UX
|
||||
-------------------------
|
||||
|
||||
**Development Direction:** Offer a **powerful CLI tool** that mirrors the capabilities of the UI, enabling automation and scripting of all DevOps workflows. DevOps engineers often prefer or need command-line access for integration into CI scripts, Infrastructure as Code pipelines, or simply for speed. The CLI experience should be considered part of the product’s UX – it must be intuitive, consistent with the UI concepts, and provide useful output (including machine-readable formats). Essentially, anything you can do in the web console (view pipeline status, approve a waiver, deploy an artifact, fetch evidence) should be doable via the CLI or API. This empowers advanced users and facilitates integration with other automation (shell scripts, CI jobs, Git hooks, etc.). A good CLI follows standard conventions and provides help, clear errors, and supports environment configuration for non-interactive use.
|
||||
|
||||
**Implementation Plan:** Develop the CLI as a first-class client to the platform’s REST/GraphQL API. Likely implement it in a language suited for cross-platform command-line tools (Go is a common choice for CLIs due to easy binary distribution, or Python for rapid development with an installer). Use an existing CLI framework (for Go, something like Cobra or Click for Python) to structure commands and flags. Ensure the commands map closely to the domain: e.g. `stella pipeline list`, `stella pipeline logs <id>`, `stella artifact promote <artifact> --env prod`, `stella evidence download --release <id>`, `stella waiver request ...` etc. Follow common UNIX CLI design principles: support `--help` for every command, use short (`-f`) and long (`--force`) flags appropriately, and return proper exit codes (so scripts can detect success/failure). Include output format switches, e.g. `--output json` for commands to get machine-parseable output (allowing integration with other tools). Integrate authentication in a user-friendly way: perhaps `stella auth login` to do an OAuth device code flow or accept a token, and store it (maybe in `~/.stella/config`). The CLI should respect environment variables for non-interactive use (e.g. `STELLA_API_TOKEN`, `STELLA_TENANT`) for easy CI integration[git.stella-ops.org](https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/src/commit/48702191bed7d66b8e29929a8fad4ecdb40b9490/docs/11_AUTHORITY.md#:~:text=errors%20to%20operators%20and%20provide,STELLA_POLICY_DIGEST). Provide auto-completion scripts for common shells to improve usability. Tie the CLI version to the server API version, and provide a clear upgrade path (maybe `stella upgrade` to get the latest version). As part of development, create comprehensive docs and examples for the CLI, and possibly a testing harness to ensure it works on all platforms. Consider also that the CLI might be used in pipelines: ensure it’s efficient (no unnecessary output when not needed, perhaps a quiet mode). For implementing heavy tasks (like streaming logs), use web socket or long polling under the hood to show live logs in the terminal, similar to how `kubectl logs -f` works. If the CLI will handle potentially sensitive operations (like approvals or secret management), ensure it can prompt for confirmation or use flags to force through in scripts. Also, align CLI error messages and terminology with the UI for consistency.
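
As a framework-agnostic illustration of those conventions (env-var auth for CI, `--output json`, meaningful exit codes), here's a tiny C# sketch; a real CLI would more likely sit on top of a framework such as System.CommandLine, and the placeholder data below stands in for actual API calls:

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Sketch of CLI conventions only: env-var auth, --output json, exit codes.
internal static class Program
{
    private static int Main(string[] args)
    {
        var token = Environment.GetEnvironmentVariable("STELLA_API_TOKEN");
        if (string.IsNullOrEmpty(token))
        {
            Console.Error.WriteLine("error: STELLA_API_TOKEN is not set (see 'stella auth login')");
            return 2; // configuration error
        }

        bool asJson = args.SkipWhile(a => a != "--output").Skip(1).FirstOrDefault() == "json";

        switch (args.FirstOrDefault())
        {
            case "pipeline" when args.ElementAtOrDefault(1) == "list":
                var runs = new[] { new { id = 123, status = "FAILED" } }; // placeholder data
                Console.WriteLine(asJson
                    ? JsonSerializer.Serialize(runs)
                    : string.Join(Environment.NewLine, runs.Select(r => $"{r.id}\t{r.status}")));
                return 0; // success

            default:
                Console.Error.WriteLine("usage: stella pipeline list [--output json]");
                return 1; // usage error
        }
    }
}
```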
|
||||
|
||||
**DevOps-facing Outcome:** For DevOps engineers, the CLI becomes a **productivity booster** and a Swiss army knife in automation. They can script repetitive tasks: for instance, a release engineer might run a script that uses `stella artifact list --env staging` to verify what's in staging, then `stella artifact promote` to push to production followed by `stella pipeline monitor --wait` to watch the rollout complete. All of this can be done without leaving their terminal or clicking in a browser. The CLI output is designed to be readable but also parseable: e.g. `stella pipeline status 123` might output a concise summary in human-readable form, or with `--json` give a JSON that a script can parse to decide next steps. In on-call situations, an engineer could quickly fetch evidence or status: e.g. `stella evidence summary --release 2025.10.05` to see if all checks passed for a particular release, right from the terminal. This complements the UI by enabling **automation integration** – the CLI can be used in CI pipelines (maybe even in other systems, e.g. a Jenkins job could call `stella ...` to trigger something in Stella). Because the CLI uses the same language as the UI, users don’t have to learn a completely different syntax or mental model. And by providing robust help and logical command names, even newcomers find it accessible (for example, typing `stella --help` lists subcommands in a clear way, similar to kubectl or git CLIs they know). Overall, the DevOps-facing outcome is that the tool meets engineers where they are – whether they love GUIs or CLIs – and supports **automation at scale**, which is a core DevOps principle.
|
||||
|
||||
7\. Alerting & Incident Integration
|
||||
-----------------------------------
|
||||
|
||||
**Development Direction:** The platform should seamlessly integrate with **alerting and incident management workflows** so that issues in pipelines or environments automatically notify the right people, and ongoing incidents are visible in the deployment context. DevOps teams rely on fast feedback for failures or abnormal conditions – whether a pipeline fails, a deployment causes a service outage, or a security scan finds a critical issue, the system needs to push alerts to the channels where engineers are already looking (chat, email, incident tools). Additionally, when viewing the DevOps dashboards, users should see indicators of active incidents or alerts related to recent changes. This tight integration helps bridge the gap between CI/CD and operations: deployments and incidents should not be separate silos. The UX should support configuring alert rules and connecting to tools like PagerDuty, Opsgenie, Slack/MS Teams, or even Jira for incident tickets, with minimal setup.
|
||||
|
||||
**Implementation Plan:** Introduce an **alerting configuration** module where certain events trigger notifications. Key events to consider: pipeline failures, pipeline successes (optional), deployment to production, policy violations, security vulnerabilities found, and performance regressions in metrics. Allow users to configure where these go – e.g. a Slack webhook, an email list, or an incident management system’s API. For pipeline failures or critical security findings, integration with PagerDuty/On-call rotation can create an incident automatically. Use webhooks and APIs: for Slack or Teams, send a formatted message (e.g. “:red\_circle: _Deployment Failed_ – Pipeline #123 failed at step 'Integration Tests'. Click here to view details.” with a link to the UI). For PagerDuty, use their Events API to trigger an incident with details including the pipeline or service impacted. On the **incoming side**, integrate with monitoring tools to reflect incidents: e.g. use status from an incident management system or monitoring alerts to display in the platform. If the organization uses something like ServiceNow or Jira for incidents, consider a plugin or link: for instance, tag deployments with change IDs and then auto-update those tickets if a deployment triggers an alert. In the environment view, include a widget that shows current alerts for that environment (by pulling from Prometheus Alertmanager or cloud monitoring alerts relevant to that cluster). Implement ChatOps commands as well: possibly allow acknowledging or redeploying via Slack bot commands. This can be achieved by having a small service listening to chat commands (Slack slash commands or similar) that call the same internal APIs (for example, a “/deploy rollback serviceA” command in Slack triggers the rollback pipeline). For UI implementation, ensure that when an alert is active, it’s clearly indicated: e.g. a red badge on the environment or pipeline view, and maybe a top-level “Incidents” section that lists all unresolved incidents (with links to their external system if applicable). Use the information radiators approach – maybe a large screen mode or summary panel showing system health and any ongoing incidents[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Information%20radiators%20display%20system%20status,demonstrate%20system%20health%20and%20trends). Technically, setting up these integrations means building outbound webhook capabilities and possibly small integration plugins for each target (Slack, PagerDuty, etc.). Also include the ability to throttle or filter alerts (to avoid spamming on every minor issue). Logging and auditing: record what alerts were sent and when (so one can later review incident timelines).
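
For the outbound side, a minimal C# sketch of a Slack incoming-webhook notification for a failed pipeline (Slack incoming webhooks accept a JSON body with a `text` field; the class name, webhook handling, and wording here are placeholders):

```csharp
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

// Minimal sketch of an outbound Slack notification for a failed pipeline.
public sealed class SlackNotifier(HttpClient http, string webhookUrl)
{
    public Task NotifyPipelineFailedAsync(string pipelineId, string failedStep, string detailsUrl) =>
        http.PostAsJsonAsync(webhookUrl, new
        {
            text = $":red_circle: *Deployment failed* - pipeline #{pipelineId} failed at step '{failedStep}'. <{detailsUrl}|View details>"
        });
}
```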
|
||||
|
||||
**DevOps-facing Outcome:** DevOps engineers will be **immediately aware** of problems without having to constantly watch the dashboards. For example, if a nightly build fails or a critical vulnerability is found in a new build, the on-call engineer might get a PagerDuty alert or a Slack message in the team channel within seconds. The message will contain enough context (pipeline name, failure reason snippet, a link to view details) so they can quickly respond. During a live incident, when they open the Stella environment view, they might see an **incident banner** or an “Active Alerts” list indicating which services are affected, aligning with what their monitoring is showing. This context speeds up remediation: if a production incident is ongoing, the team can see which recent deployment might have caused it (since the platform correlates deployment events with incident alerts). Conversely, when doing a deployment, if an alert fires (e.g. error rate spiked), the system could even pause further stages and notify the team. By integrating ChatOps, some users might even resolve things without leaving their chat: e.g. the Slack bot reports “Deployment failed” and the engineer types a command to rollback right in Slack, which the platform executes[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=requiring%20attention,system%20health%20and%20trends). Overall, the outcome is a highly responsive DevOps process: issues are caught and communicated in real-time, and the platform becomes part of the incident handling loop, not isolated. Management can also see in retrospective reports that alerts were linked to changes (useful for blameless postmortems, since you can trace alert -> deployment). The tight coupling of alerting with the DevOps UX ensures nothing falls through the cracks, and teams can react swiftly, embodying the DevOps ideal of continuous feedback[getambush.com](https://www.getambush.com/article/devops-for-classified-environments/#:~:text=Information%20radiators%20display%20system%20status,demonstrate%20system%20health%20and%20trends).
|
||||
|
||||
8\. Metrics That Matter
|
||||
-----------------------
|
||||
|
||||
**Development Direction:** Define and display the **key metrics** that truly measure DevOps success and software delivery performance, rather than vanity metrics. This likely includes industry-standard **DORA metrics** (Deployment Frequency, Lead Time for Changes, Change Failure Rate, Time to Restore) to gauge velocity and stability[docs.gitlab.com](https://docs.gitlab.com/user/analytics/dora_metrics/#:~:text=,quickly%20your%20organization%20delivers%20software), as well as any domain-specific metrics (like compliance metrics or efficiency metrics relevant to the team). The UX should provide a metrics dashboard that is easy to interpret – with trends over time, targets or benchmarks, and the ability to drill down into what’s influencing those metrics. By focusing on “metrics that matter,” the platform steers teams toward continuous improvement on important outcomes (like faster deployments with high reliability) and avoids information overload. Each metric should be backed by data collected from the pipelines, incidents, and other parts of the system.
|
||||
|
||||
**Implementation Plan:** **Instrument the CI/CD pipeline and operations data** to collect these metrics automatically. For example, every successful deployment should log an event with a timestamp and environment, which can feed Deployment Frequency calculations (e.g. how many deploys to prod per day/week)[docs.gitlab.com](https://docs.gitlab.com/user/analytics/dora_metrics/#:~:text=,metrics%20measure%20your%20software%E2%80%99s%20reliability). Track lead time by measuring time from code commit (or merge) to deployment completion – this might involve integrating with the version control system to get commit timestamps and comparing to deployment events[docs.gitlab.com](https://docs.gitlab.com/user/analytics/dora_metrics/#:~:text=,metrics%20measure%20your%20software%E2%80%99s%20reliability). Change Failure Rate can be inferred by flagging deployments that resulted in a failure or rollback – integrate with incident tracking or post-deployment health checks to mark a deployment as “failed” if it had to be reverted or caused an alert. Time to Restore is measured from incident start to resolution – integrate with incident management timestamps or pipeline rollback completion times. Additionally, incorporate compliance/quality metrics highlighted earlier: e.g. “% of builds with all tests passing”, “average time to remediate critical vulnerabilities” – many of these can be derived from the evidence and waiver data we track[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=We%20try%20to%20make%20compliance,quarter). Use a time-series database (Prometheus, InfluxDB) or even just a relational DB with time-series tables to store metric data points. Implement a **Metrics Dashboard UI** with charts for each key metric, ideally with the ability to view by different scopes (maybe per service or team or environment). For instance, a line chart for Deployment Frequency (deploys per week) with annotations when big changes happened, or a bar chart for Change Failure Rate per month. Provide comparison to industry benchmarks if available (e.g. highlighting if the team is elite per DORA benchmarks). Also, crucially, implement **drill-down links**: if a metric spike or drop is observed, the user should be able to click it and see underlying data – e.g. clicking a high Change Failure Rate in April shows which deployments failed in April and links to those pipeline runs[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=policy%20to%20merged%20after%20fixes,We%20don%E2%80%99t). Use color-coding to flag concerning trends (like increasing failure rate). Allow export of metrics for reporting purposes. Possibly integrate with existing analytics (if using Datadog or other BI, allow data export or API access to metrics). Ensure that metrics are updated in near real-time (maybe after each pipeline run or incident closure, recalc relevant metrics) so the dashboard is always current. We should also secure the metrics view (maybe management only for some, but ideally DevOps leads have it openly to promote transparency). In development, validate that these metrics indeed correlate with what teams care about (work with users to refine).
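
As a rough sketch of how the four DORA numbers could be derived from recorded events, assuming hypothetical event shapes rather than the platform's actual schema:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event shapes - field names are assumptions, not the platform's schema.
public sealed record DeploymentEvent(DateTimeOffset CommitAt, DateTimeOffset FinishedAt, bool FailedOrRolledBack);
public sealed record IncidentEvent(DateTimeOffset StartedAt, DateTimeOffset ResolvedAt);

public static class DoraMetrics
{
    public static double DeploysPerWeek(IReadOnlyCollection<DeploymentEvent> deploys, DateTimeOffset from, DateTimeOffset to) =>
        deploys.Count(d => d.FinishedAt >= from && d.FinishedAt < to) / Math.Max(1.0, (to - from).TotalDays / 7.0);

    // Lead time for changes: commit (or merge) to deployment completion.
    public static TimeSpan MedianLeadTime(IReadOnlyCollection<DeploymentEvent> deploys) =>
        Median(deploys.Select(d => d.FinishedAt - d.CommitAt));

    public static double ChangeFailureRate(IReadOnlyCollection<DeploymentEvent> deploys) =>
        deploys.Count == 0 ? 0 : (double)deploys.Count(d => d.FailedOrRolledBack) / deploys.Count;

    public static TimeSpan MedianTimeToRestore(IReadOnlyCollection<IncidentEvent> incidents) =>
        Median(incidents.Select(i => i.ResolvedAt - i.StartedAt));

    // Upper-median approximation; good enough for a dashboard card.
    private static TimeSpan Median(IEnumerable<TimeSpan> values)
    {
        var ordered = values.OrderBy(t => t).ToList();
        return ordered.Count == 0 ? TimeSpan.Zero : ordered[ordered.Count / 2];
    }
}
```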
|
||||
|
||||
**DevOps-facing Outcome:** The team gets a **focused insight** into how they are performing and where to improve. On the metrics dashboard, they might see for example: Deployment Frequency – 20 deploys/week (trending upward), Lead Time – 1 day median, Change Failure Rate – 5%, Time to Restore – 1 hour median. These will be shown perhaps as simple cards or charts. They can quickly glean, say, “We’re deploying more often, but our change failure rate spiked last month,” prompting investigation. By clicking that spike, they see a list of incidents or failed deployments that contributed, allowing them to identify common causes and address them[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=policy%20to%20merged%20after%20fixes,We%20don%E2%80%99t). The dashboard might also show compliance metrics if relevant: e.g. “100% of builds had SBOMs attached this quarter” (the team could celebrate this boring but important win)[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=We%20celebrate%20the%20boring%20wins%3A,stakes%20we%E2%80%99re%20proud%20to%20meet), or “Median time to patch critical vulns: 2 days” – these could be in a separate section for security/compliance. Importantly, all metrics shown are ones that drive behavior the organization cares about – no pointless graphs that don’t lead to action. This ensures that when leadership asks “How are we doing in DevOps?”, the answer is readily available with evidence[docs.gitlab.com](https://docs.gitlab.com/user/analytics/dora_metrics/#:~:text=,quickly%20your%20organization%20delivers%20software). It also gamifies improvement: teams can see the needle move when they streamline a pipeline or improve testing. For example, after investing in parallel tests, Lead Time drops – the dashboard confirms such improvements. Furthermore, the presence of drill-down and context means metrics are **trusted** by engineers: if someone questions a number, they can click in and see the raw data behind it (making it hard to ignore or dispute the findings)[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=policy%20to%20merged%20after%20fixes,We%20don%E2%80%99t). Overall, this focus on meaningful metrics helps align everyone (Dev, Ops, and management) on common goals and provides continuous feedback at a high level on the effectiveness of DevOps practices. It’s not just data for managers – it’s a working tool for teams to guide decisions (like where to invest automation efforts next). By keeping the metrics visible and up-to-date, we encourage a culture of **data-driven improvement** in the DevOps process, as opposed to anecdotal or vanity measures[devopsoasis.blog](https://devopsoasis.blog/bake-ruthless-compliance-into-cicd-without-slowing-releases/#:~:text=We%20try%20to%20make%20compliance,quarter).
|
||||
|
||||
Short answer: they turn “nice principles” into a concrete blueprint of **what screens you need, what each must show, and what interactions must exist** — so your Angular/UX team can build Stella Ops UI with clear scope and acceptance criteria instead of guesswork.
|
||||
|
||||
I’ll break it down in practical terms.
|
||||
|
||||
---
|
||||
|
||||
## 1. From principles to UI backlog
|
||||
|
||||
Those 8 sections already give you:
|
||||
|
||||
* **Primary entry points (top-level navigation):**
|
||||
|
||||
* Pipelines / Runs
|
||||
* Artifacts / Images
|
||||
* Environments / Clusters
|
||||
* Evidence / Waivers
|
||||
* Metrics
|
||||
* **Cross-cutting objects:**
|
||||
|
||||
* Vulnerability detail
|
||||
* Proof bundle
|
||||
* Exception / Waiver
|
||||
* Pipeline policy decision
|
||||
|
||||
That is essentially your **information architecture**. You can now:
|
||||
|
||||
* Define Angular routes for each area (`/pipelines`, `/pipelines/:id`, `/artifacts/:id`, `/env/:id`, `/metrics`, etc.).
|
||||
* Turn every subsection into **epics and tickets**:
|
||||
|
||||
* “Implement pipeline run detail view with ‘Why this failed’ header, Evidence panel, Actions side rail.”
|
||||
* “Implement artifact risk summary component (‘score block’ + proof spine).”
|
||||
* “Implement waiver request dialog with scope, reason and expiry.”
|
||||
|
||||
Instead of “build a UI for DevOps”, your backlog becomes a set of very specific, testable UI tasks.
|
||||
|
||||
---
|
||||
|
||||
## 2. Direct mapping to concrete screens and components
|
||||
|
||||
### 2.1. DevOps mental model → Navigation and page layout
|
||||
|
||||
“Can I ship / what blocks me / what’s the minimum safe change?” directly drives:
|
||||
|
||||
* **Home / default view**:
|
||||
|
||||
* Recent pipeline runs with status + a “Blocked by X” cause line.
|
||||
* **Context around everything**:
|
||||
|
||||
* Every “red” or “yellow” item has:
|
||||
|
||||
* A one-line answer to “why”.
|
||||
* A visible next step (“fix”, “waive”, or “defer”).
|
||||
|
||||
For UI dev this means:
|
||||
|
||||
* Every error block must have:
|
||||
|
||||
* A cause string: `reasonSummary`.
|
||||
* A small action cluster: buttons for `View evidence`, `Propose fix`, `Request waiver`.
|
||||
|
||||
That becomes UI component contracts and TypeScript models.
|
||||
|
||||
---
|
||||
|
||||
### 2.2. Global UX principles → Acceptance criteria for each page
|
||||
|
||||
Examples:
|
||||
|
||||
* **Pipeline-first, UI-second**
|
||||
|
||||
* UI stories must always include:
|
||||
|
||||
* “Same data must be available via `/api/...` and CLI.”
|
||||
* When you design a screen, you also define:
|
||||
|
||||
* Which REST endpoints and DTOs back it.
|
||||
* Which CLI command shows the same information.
|
||||
|
||||
* **Time-to-evidence ≤ 30 seconds / 3 clicks**
|
||||
|
||||
* For every design, you can set **explicit acceptance tests**:
|
||||
|
||||
* From a failed pipeline in the list, user can reach:
|
||||
|
||||
* The vulnerability detail page
|
||||
* With SBOM line, VEX statements and lattice decision
|
||||
* In **≤ 2 navigations**.
|
||||
|
||||
During implementation you can literally check: “Do we need more than 3 clicks? If yes, redesign.”
|
||||
|
||||
* **No dead ends**
|
||||
|
||||
* Every page spec must include a “Next actions” section:
|
||||
|
||||
* E.g. for run detail: `open cluster`, `open artifact`, `export proof`, `request waiver`.
|
||||
* Frontend definition: each detail view must embed `<stella-next-actions [context]="...">`.
|
||||
|
||||
So the principles become **checklist items in PR review** and **Storybook stories**:
|
||||
|
||||
* “Evidence view: verify user is never stuck with no action.”
|
||||
|
||||
---
|
||||
|
||||
### 2.3. Core views → Page blueprints
|
||||
|
||||
Each of the three core views is practically a ready-made UI spec.
|
||||
|
||||
**Pipeline / run-centric view**
|
||||
|
||||
* Route: `/pipelines` and `/pipelines/:runId`
|
||||
* Components:
|
||||
|
||||
* `PipelinesTableComponent` (statuses, columns, filter bar)
|
||||
* `RunHeaderComponent` (“Why this run failed” line)
|
||||
* `EvidencePanelComponent` (SBOM / Feeds / VEX / Lattice / History tabs)
|
||||
* `RunActionsSidebarComponent` (propose upgrade, waiver, open cluster, export proof)
|
||||
|
||||
You can hand this to a designer as a wireframe task and to a dev as:
|
||||
|
||||
* API needed: `GET /runs`, `GET /runs/:id`, `GET /runs/:id/evidence`.
|
||||
* UI tests: “When run is FAILED, header shows single sentence reason.”
|
||||
|
||||
**Artifact-centric view**
|
||||
|
||||
* Route: `/artifacts/:id`
|
||||
* Components:
|
||||
|
||||
* `ArtifactHeader` (`<registry>/<repo>:tag @ sha256...`)
|
||||
* `RiskScoreBlock`
|
||||
* `ProofSpine` (hashes and Rekor link)
|
||||
* `VulnerabilityTable` with filters (“blockers”, “with VEX”, “unknown reachability”)
|
||||
|
||||
For UI dev this defines:
|
||||
|
||||
* Data model: `ArtifactDto` with `verdict`, `vulnSummary`, `proofSpine`.
|
||||
* Interactions: toggling filters updates the table (Angular reactive forms + data source).
|
||||
|
||||
**Environment / cluster view**
|
||||
|
||||
* Route: `/environments/:envId`
|
||||
* Components:
|
||||
|
||||
* `EnvironmentTree` (`env → service → artifact`)
|
||||
* `StatusLegend` (green / yellow / red)
|
||||
* Quick drill-down: click red service → artifact view → vuln view.
|
||||
|
||||
This is a straightforward “topology” page spec.
|
||||
|
||||
---
|
||||
|
||||
## 3. Evidence & waivers → Reusable UI patterns
|
||||
|
||||
The “Evidence tabs” and “Waiver flow” sections give you **reusable patterns**:
|
||||
|
||||
* **Evidence tabs**:
|
||||
|
||||
* One generic `EvidenceTabsComponent` with five tabs:
|
||||
|
||||
* SBOM, Feeds, VEX, Lattice decision, History.
|
||||
* Used in:
|
||||
|
||||
* Pipeline run details
|
||||
* Artifact view
|
||||
* Single-vulnerability modal
|
||||
|
||||
* **Waiver UX**:
|
||||
|
||||
* A single `WaiverDialogComponent` with:
|
||||
|
||||
* Scope dropdown
|
||||
* Reason dropdown
|
||||
* Free-text justification
|
||||
* Expiry picker
|
||||
* Used from:
|
||||
|
||||
* Run detail
|
||||
* Vulnerability detail
|
||||
* Environment warnings
|
||||
|
||||
Implement once, reuse everywhere. That keeps the UI consistent and reduces dev effort.
|
||||
|
||||
---
|
||||
|
||||
## 4. CLI & alerts → UI/UX boundaries
|
||||
|
||||
The CLI and alerting guidelines tell you **what not to overstuff into the web UI**:
|
||||
|
||||
* Some flows **must** be comfortable in terminal:
|
||||
|
||||
* Monitoring runs in CI logs, non-interactive approvals, export JSON.
|
||||
* Some interactions primarily happen in:
|
||||
|
||||
* Slack / email (alert text pattern).
|
||||
|
||||
For UI design this means:
|
||||
|
||||
* Your web UI concentrates on:
|
||||
|
||||
* Exploration, debugging, visualizing proof, managing waivers and policy.
|
||||
* CLI/Chat handle:
|
||||
|
||||
* “Fast path” operations (quick approvals, re-runs).
|
||||
|
||||
That clear separation prevents you from bloating the UI with obscure admin features that only scripts should use.
|
||||
|
||||
---
|
||||
|
||||
## 5. Metrics → Management / lead views
|
||||
|
||||
The metrics section defines:
|
||||
|
||||
* A dedicated **“Metrics & Reports”** area of the UI.
|
||||
* Concrete charts to build:
|
||||
|
||||
* DORA metrics
|
||||
* Exception health
|
||||
* Time-to-evidence, time-to-decision
|
||||
|
||||
For the UI team:
|
||||
|
||||
* You can design one `MetricsDashboard` route, fed by `/metrics/...` endpoints.
|
||||
* The drill-down requirement (“click spike → list underlying runs”) becomes:
|
||||
|
||||
* Link from chart points to pre-filtered `/pipelines` or `/incidents`.
|
||||
|
||||
Again, this is a direct translation from text to wires, not a vague “we should have some stats”.
|
||||
|
||||
---
|
||||
|
||||
## 6. How to operationalize this for Stella Ops
|
||||
|
||||
If you want this to directly drive implementation, you can do:
|
||||
|
||||
1. **Create a UI map document**
|
||||
|
||||
* List all routes and screens inferred above.
|
||||
* For each screen:
|
||||
|
||||
* Purpose (which question it answers)
|
||||
* Main components
|
||||
* Required backend endpoints.
|
||||
|
||||
2. **Derive a component inventory**
|
||||
|
||||
* Common reusables:
|
||||
|
||||
* EvidenceTabs
|
||||
* NextActionsSidebar
|
||||
* RiskScoreBlock
|
||||
* WaiverDialog
|
||||
* Build them first in isolation (Storybook) to lock UX.
|
||||
|
||||
3. **Turn principles into checklists**
|
||||
|
||||
* Add “3 clicks to evidence”, “No dead ends”, “Explain ‘why’ for every verdict” as:
|
||||
|
||||
* UI review checklists
|
||||
* Acceptance criteria on tickets.
|
||||
|
||||
4. **Prioritize DevOps-critical flows first**
|
||||
|
||||
* Sprint 1–2:
|
||||
|
||||
* Pipeline list + run detail with evidence and waiver.
|
||||
* Sprint 3–4:
|
||||
|
||||
* Artifact view + environment view.
|
||||
* Later:
|
||||
|
||||
* Metrics dashboards and richer filters.
|
||||
|
||||
So: these guidelines are not abstract; they **define your pages, your components, your routes, and your acceptance tests**. That is exactly what your Stella Ops UI team needs to move from idea → concrete Angular screens that DevOps can actually live in every day.
|
||||
@@ -0,0 +1,750 @@
|
||||
Here’s a simple, practical way to score vulnerabilities that’s more auditable than plain CVSS: build a **deterministic score** from three reproducible inputs—**Reachability**, **Evidence**, and **Provenance**—so every number is explainable and replayable.
|
||||
|
||||
---
|
||||
|
||||
### Why move beyond CVSS?
|
||||
|
||||
* **CVSS is context-light**: it rarely knows *your* call paths, configs, or runtime.
|
||||
* **Audits need proof**: regulators and customers increasingly ask, “show me how you got this score.”
|
||||
* **Teams need consistency**: the same image should get the same score across environments when inputs are identical.
|
||||
|
||||
---
|
||||
|
||||
### The scoring idea (plain English)
|
||||
|
||||
Score = a weighted function of:
|
||||
|
||||
1. **Reachability Depth (R)** — how close the vulnerable function is to a real entry point in *your* app (e.g., public HTTP route → handler → library call).
|
||||
2. **Evidence Density (E)** — how much concrete proof you have (stack traces, symbol hits, config toggles, feature flags, SCA vs. SAST vs. DAST vs. runtime).
|
||||
3. **Provenance Integrity (P)** — how trustworthy the artifact chain is (signed SBOM, DSSE attestations, SLSA/Rekor entries, reproducible build match).
|
||||
|
||||
A compact, auditable formula you can start with:
|
||||
|
||||
```
NormalizedScore = W_R * f(R) + W_E * g(E) + W_P * h(P)
```
|
||||
|
||||
* Pick monotonic, bounded transforms (e.g., map to 0..1):
|
||||
|
||||
* f(R): inverse of hops (shorter path ⇒ higher value)
|
||||
* g(E): weighted sum of evidence types (runtime>DAST>SAST>SCA, with decay for stale data)
|
||||
* h(P): cryptographic/provenance checks (unsigned < signed < signed+attested < signed+attested+reproducible)
|
||||
|
||||
Keep **W_R + W_E + W_P = 1** (e.g., 0.5, 0.35, 0.15 for reachability-first triage).
|
||||
|
||||
---
|
||||
|
||||
### What makes this “deterministic”?
|
||||
|
||||
* Inputs are **machine-replayable**: call-graph JSON, evidence bundle (hashes + timestamps), provenance attestations.
|
||||
* The score is **purely a function of those inputs**, so anyone can recompute it later and match your result byte-for-byte.
|
||||
|
||||
---
|
||||
|
||||
### Minimal rubric (ready to implement)
|
||||
|
||||
* **Reachability (R, 0..1)**
|
||||
|
||||
* 1.00 = vulnerable symbol called on a hot path from a public route (≤3 hops)
|
||||
* 0.66 = reachable but behind uncommon feature flag or deep path (4–7 hops)
|
||||
* 0.33 = only theoretically reachable (code present, no discovered path)
|
||||
* 0.00 = dead/unreferenced code in this build
|
||||
* **Evidence (E, 0..1)** (sum, capped at 1.0)
|
||||
|
||||
* +0.6 runtime trace hitting the symbol
|
||||
* +0.3 DAST/integ test activating vulnerable behavior
|
||||
* +0.2 SAST precise sink match
|
||||
* +0.1 SCA presence only (no call evidence)
|
||||
* (Apply 10–30% decay if older than N days)
|
||||
* **Provenance (P, 0..1)**
|
||||
|
||||
* 0.0 unsigned/unknown origin
|
||||
* 0.3 signed image only
|
||||
* 0.6 signed + SBOM (hash-linked)
|
||||
* 1.0 signed + SBOM + DSSE attestations + reproducible build match
|
||||
|
||||
Example weights: `W_R=0.5, W_E=0.35, W_P=0.15`.
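
Worked example with those weights: a finding that is reachable behind an uncommon feature flag (R = 0.66), backed by a DAST trigger plus a precise SAST sink match (E = min(1.0, 0.3 + 0.2) = 0.5), on a signed image with a hash-linked SBOM (P = 0.6), scores 0.5·0.66 + 0.35·0.5 + 0.15·0.6 ≈ 0.60 on the 0..1 scale.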
|
||||
|
||||
---
|
||||
|
||||
### How this plugs into **Stella Ops**
|
||||
|
||||
* **Scanner** produces call-graphs & symbol maps (R).
|
||||
* **Vexer**/Evidence store aggregates SCA/SAST/DAST/runtime proofs with timestamps (E).
|
||||
* **Authority/Proof‑Graph** verifies signatures, SBOM↔image hash links, DSSE/Rekor (P).
|
||||
* **Policy Engine** applies the scoring formula (YAML policy) and emits a signed VEX note with the score + input hashes.
|
||||
* **Replay**: any audit can re-run the same policy with the same inputs and get the same score.
|
||||
|
||||
---
|
||||
|
||||
### Developer checklist (do this first)
|
||||
|
||||
* Emit a **Reachability JSON** per build: entrypoints, hops, functions, edges, timestamps, hashes.
|
||||
* Normalize **Evidence Types** with IDs, confidence, freshness, and content hashes.
|
||||
* Record **Provenance Facts** (signing certs, SBOM digest, DSSE bundle, reproducible-build fingerprint).
|
||||
* Implement the **score function as a pure library** (no I/O), version it (e.g., `score.v1`), and include the version + inputs’ hashes in every VEX note.
|
||||
* Add a **30‑sec “Time‑to‑Evidence” UI**: click a score → see the exact call path, evidence list, and provenance checks.
|
||||
|
||||
---
|
||||
|
||||
### Why this helps compliance & sales
|
||||
|
||||
* Every number is **auditable** (inputs + function are transparent).
|
||||
* Scores remain **consistent across air‑gapped sites** (deterministic, no hidden heuristics).
|
||||
* You can **prove reduction** after a fix (paths disappear, evidence decays, provenance improves).
|
||||
|
||||
If you want, I can draft the YAML policy schema and a tiny .NET 10 library stub for `score.v1` so you can drop it into Stella Ops today.
|
||||
Below is an extended, **developer-ready implementation plan** to build the deterministic vulnerability score into **Stella Ops** (Scanner → Evidence/Vexer → Authority/Proof‑Graph → Policy Engine → UI/VEX output). I’m assuming a .NET-centric stack (since you mentioned .NET 10 earlier), but everything is laid out so the scoring core stays language-agnostic.
|
||||
|
||||
---
|
||||
|
||||
## 1) Extend the scoring model into a stable, “auditable primitive”
|
||||
|
||||
### 1.1 Outputs you should standardize on
|
||||
|
||||
Produce **two** signed artifacts per finding (plus optional UI views):
|
||||
|
||||
1. **ScoreResult** (primary):
|
||||
|
||||
* `riskScore` (0–100 integer)
|
||||
* `subscores` (each 0–100 integer): `baseSeverity`, `reachability`, `evidence`, `provenance`
|
||||
* `explain[]` (structured reasons, ordered deterministically)
|
||||
* `inputs` (digests of all upstream inputs)
|
||||
* `policy` (policy version + digest)
|
||||
* `engine` (engine version + digest)
|
||||
* `asOf` timestamp (the only “time” allowed to affect the result)
|
||||
|
||||
2. **VEX note** (OpenVEX/CSAF-compatible wrapper):
|
||||
|
||||
* references ScoreResult digest
|
||||
* embeds the score (optional) + the input digests
|
||||
* signed by Stella Ops Authority
|
||||
|
||||
> Key audit requirement: anyone can recompute the score **offline** from the input bundle + policy + engine version.
|
||||
|
||||
---
|
||||
|
||||
## 2) Make determinism non-negotiable
|
||||
|
||||
### 2.1 Determinism rules (implement as “engineering constraints”)
|
||||
|
||||
These rules target the most common ways a supposedly deterministic system drifts into non-determinism:
|
||||
|
||||
* **No floating point** in scoring math. Use integer “basis points” and integer bucket tables.
|
||||
* **No implicit time**. Scoring takes `asOf` as an explicit input. Evidence “freshness” is computed as `asOf - evidence.timestamp`.
|
||||
* **Canonical serialization** for hashing:
|
||||
|
||||
* Use RFC-style canonical JSON (e.g., JCS) or a strict canonical CBOR profile.
|
||||
* Sort keys and arrays deterministically.
|
||||
* **Stable ordering** for explanation lists:
|
||||
|
||||
* Always sort factors by `(factorId, contributingObjectDigest)`.
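
A minimal .NET sketch of the canonicalize-then-hash step. This is a simplified stand-in for full canonical JSON (JCS): it only sorts object keys recursively and hashes the compact serialization, while real JCS also pins number and string formatting, so treat it as an illustration of the idea rather than a drop-in implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

// Simplified canonical-JSON hash sketch: sort object keys, hash compact serialization.
public static class CanonicalHash
{
    public static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        JsonObject obj => new JsonObject(obj
            .OrderBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => KeyValuePair.Create(p.Key, Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(Canonicalize).ToArray()),
        _ => node?.DeepClone() // leaf values are cloned so they can be re-parented
    };

    public static string Sha256Hex(JsonNode node)
    {
        var canonical = Canonicalize(node)!.ToJsonString(); // compact by default
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical))).ToLowerInvariant();
    }
}
```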
|
||||
|
||||
### 2.2 Fixed-point scoring approach (recommended)
|
||||
|
||||
Represent weights and multipliers as **basis points** (bps):
|
||||
|
||||
* 100% = 10,000 bps
|
||||
* 1% = 100 bps
|
||||
|
||||
Example: `totalScore = (wB*B + wR*R + wE*E + wP*P) / 10000`
|
||||
|
||||
---
|
||||
|
||||
## 3) Extended score definition (v1)
|
||||
|
||||
### 3.1 Subscores (0–100 integers)
|
||||
|
||||
#### BaseSeverity (B)
|
||||
|
||||
* Source: CVSS if present, else vendor severity, else default.
|
||||
* Normalize to 0–100:
|
||||
|
||||
* CVSS 0.0–10.0 → 0–100 by `B = round(CVSS * 10)`
|
||||
|
||||
Keep its weight small so you’re “beyond CVSS” but still anchored.
|
||||
|
||||
#### Reachability (R)
|
||||
|
||||
Computed from reachability report (call-path depth + gating conditions).
|
||||
|
||||
**Hop buckets** (example):
|
||||
|
||||
* 0–2 hops: 100
|
||||
* 3 hops: 85
|
||||
* 4 hops: 70
|
||||
* 5 hops: 55
|
||||
* 6 hops: 45
|
||||
* 7 hops: 35
|
||||
* 8+ hops: 20
|
||||
* unreachable: 0
|
||||
|
||||
**Gate multipliers** (apply multiplicatively in bps):
|
||||
|
||||
* behind feature flag: ×7000
|
||||
* auth required: ×8000
|
||||
* only admin role: ×8500
|
||||
* non-default config: ×7500
|
||||
|
||||
Final: `R = bucketScore * gateMultiplier / 10000`
|
||||
|
||||
#### Evidence (E)
|
||||
|
||||
Sum evidence “points” capped at 100, then apply freshness multiplier.
|
||||
|
||||
Evidence points (example):
|
||||
|
||||
* runtime trace hitting vulnerable symbol: +60
|
||||
* DAST / integration test triggers behavior: +30
|
||||
* SAST precise sink match: +20
|
||||
* SCA presence only: +10
|
||||
|
||||
Freshness bucket multiplier (example):
|
||||
|
||||
* age ≤ 7 days: ×10000
|
||||
* ≤ 30 days: ×9000
|
||||
* ≤ 90 days: ×7500
|
||||
* ≤ 180 days: ×6000
|
||||
* ≤ 365 days: ×4000
|
||||
* > 365: ×2000
|
||||
|
||||
Final: `E = min(100, sum(points)) * freshness / 10000`
|
||||
|
||||
#### Provenance (P)
|
||||
|
||||
Based on verified supply-chain checks.
|
||||
|
||||
Levels:
|
||||
|
||||
* unsigned/unknown: 0
|
||||
* signed image: 30
|
||||
* signed + SBOM hash-linked to image: 60
|
||||
* signed + SBOM + DSSE attestations verified: 80
|
||||
* above + reproducible build match: 100
|
||||
|
||||
### 3.2 Total score and overrides
|
||||
|
||||
Weights (example):
|
||||
|
||||
* `wB=1000` (10%)
|
||||
* `wR=4500` (45%)
|
||||
* `wE=3000` (30%)
|
||||
* `wP=1500` (15%)
|
||||
|
||||
Total:
|
||||
|
||||
* `riskScore = (wB*B + wR*R + wE*E + wP*P) / 10000`
|
||||
|
||||
Override examples (still deterministic, because they depend on evidence flags):
|
||||
|
||||
* If `knownExploited=true` AND `R >= 70` → force score to 95+
|
||||
* If unreachable (`R=0`) AND only SCA evidence (`E<=10`) → clamp score ≤ 25
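
To make the fixed-point math above concrete, here is a minimal C# sketch of a pure `score.v1` function. The method names and shapes are assumptions (not the actual `Stella.ScoreEngine` API); the bucket, multiplier, and weight values simply mirror the examples in this section:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Integer-only sketch of the score.v1 math above.
public static class ScoreV1
{
    private const int Bps = 10_000;

    // Weights in basis points; must sum to 10,000.
    public static int RiskScore(int b, int r, int e, int p,
        int wB = 1000, int wR = 4500, int wE = 3000, int wP = 1500)
    {
        if (wB + wR + wE + wP != Bps) throw new ArgumentException("weights must sum to 10,000 bps");
        return (wB * b + wR * r + wE * e + wP * p) / Bps;
    }

    // Reachability: hop bucket, then multiplicative gate discounts in bps.
    public static int Reachability(int? hops, IEnumerable<int> gateMultipliersBps)
    {
        if (hops is null) return 0; // unreachable in this build
        int bucket = hops.Value switch
        {
            <= 2 => 100, 3 => 85, 4 => 70, 5 => 55, 6 => 45, 7 => 35, _ => 20
        };
        long score = bucket;
        foreach (var m in gateMultipliersBps) score = score * m / Bps;
        return (int)score;
    }

    // Evidence: capped point sum, then a freshness multiplier in bps.
    public static int Evidence(IEnumerable<int> evidencePoints, int ageDays)
    {
        int freshness = ageDays switch
        {
            <= 7 => 10_000, <= 30 => 9_000, <= 90 => 7_500,
            <= 180 => 6_000, <= 365 => 4_000, _ => 2_000
        };
        return Math.Min(100, evidencePoints.Sum()) * freshness / Bps;
    }

    // Deterministic overrides from above: they depend only on input flags and subscores.
    public static int ApplyOverrides(int score, bool knownExploited, int r, int e)
    {
        if (knownExploited && r >= 70) return Math.Max(score, 95);
        if (r == 0 && e <= 10) return Math.Min(score, 25);
        return score;
    }
}
```

For example, CVSS 7.5 gives B = 75; a 3-hop path with no gates gives R = 85; runtime plus SCA evidence about 20 days old gives E = 70 × 9000 / 10000 = 63; a signed image with a hash-linked SBOM gives P = 60; the total is (1000·75 + 4500·85 + 3000·63 + 1500·60) / 10000 = 73.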
|
||||
|
||||
---
|
||||
|
||||
## 4) Canonical schemas (what to build first)
|
||||
|
||||
### 4.1 ReachabilityReport (per artifact + vuln)
|
||||
|
||||
Minimum fields:
|
||||
|
||||
* `artifactDigest` (sha256 of image or build artifact)
|
||||
* `graphDigest` (sha256 of canonical call-graph representation)
|
||||
* `vulnId` (CVE/OSV/etc)
|
||||
* `vulnerableSymbol` (fully-qualified)
|
||||
* `entrypoints[]` (HTTP routes, queue consumers, CLI commands, cron handlers)
|
||||
* `shortestPath`:
|
||||
|
||||
* `hops` (int)
|
||||
* `nodes[]` (ordered list of symbols)
|
||||
* `edges[]` (optional)
|
||||
* `gates[]`:
|
||||
|
||||
* `type` (“featureFlag” | “authRequired” | “configNonDefault” | …)
|
||||
* `detail` (string)
|
||||
* `computedAt` (timestamp)
|
||||
* `toolVersion`
|
||||
|
||||
### 4.2 EvidenceBundle (per artifact + vuln)
|
||||
|
||||
Evidence items are immutable and deduped by content hash.
|
||||
|
||||
* `evidenceId` (content hash)
|
||||
* `artifactDigest`
|
||||
* `vulnId`
|
||||
* `type` (“SCA” | “SAST” | “DAST” | “RUNTIME” | “ADVISORY”)
|
||||
* `tool` (name/version)
|
||||
* `timestamp`
|
||||
* `confidence` (0–100)
|
||||
* `subject` (package, symbol, endpoint)
|
||||
* `payloadDigest` (hash of raw payload stored separately)
|
||||
|
||||
### 4.3 ProvenanceReport (per artifact)
|
||||
|
||||
* `artifactDigest`
|
||||
* `signatureChecks[]` (who signed, what key, result)
|
||||
* `sbomDigest` + `sbomType`
|
||||
* `attestations[]` (DSSE digests + verification result)
|
||||
* `transparencyLogRefs[]` (optional)
|
||||
* `reproducibleMatch` (bool)
|
||||
* `computedAt`
|
||||
* `toolVersion`
|
||||
* `verificationLogDigest`
|
||||
|
||||
### 4.4 ScoreInput + ScoreResult
|
||||
|
||||
**ScoreInput** should include:
|
||||
|
||||
* `asOf`
|
||||
* `policyVersion`
|
||||
* digests for reachability/evidence/provenance/base severity source
|
||||
|
||||
**ScoreResult** should include:
|
||||
|
||||
* `riskScore`, `subscores`
|
||||
* `explain[]` (deterministic)
|
||||
* `engineVersion`, `policyDigest`
|
||||
* `inputs[]` (digests)
|
||||
* `resultDigest` (hash of canonical ScoreResult)
|
||||
* `signature` (Authority signs the digest)
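
As a C# shape for these two artifacts (property names follow the field lists above; the exact types and nesting are assumptions):

```csharp
using System;
using System.Collections.Generic;

// Illustrative C# shapes for ScoreInput and ScoreResult; details are assumptions.
public sealed record ScoreInput(
    DateTimeOffset AsOf,
    string PolicyVersion,
    string ReachabilityReportDigest,
    string EvidenceBundleDigest,
    string ProvenanceReportDigest,
    string BaseSeveritySourceDigest);

public sealed record ScoreResult(
    int RiskScore,
    IReadOnlyDictionary<string, int> Subscores,  // baseSeverity, reachability, evidence, provenance
    IReadOnlyList<string> Explain,               // deterministically ordered reasons
    string EngineVersion,
    string PolicyDigest,
    IReadOnlyList<string> Inputs,                // digests of everything that influenced the score
    string ResultDigest,                         // hash of the canonical ScoreResult (excluding this field)
    string Signature);                           // Authority signature over ResultDigest
```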
|
||||
|
||||
---
|
||||
|
||||
## 5) Development implementation plan (phased, with deliverables + acceptance criteria)
|
||||
|
||||
### Phase A — Foundations: schemas, hashing, policy format, test harness
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Canonical JSON format rules + hashing utilities (shared lib)
|
||||
* JSON Schemas for: ReachabilityReport, EvidenceItem, ProvenanceReport, ScoreInput, ScoreResult
|
||||
* “Golden fixture” repo: a set of input bundles and expected ScoreResults
|
||||
* Policy format `score.v1` (YAML or JSON) using **integer bps**
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Same input bundle → identical `resultDigest` across:
|
||||
|
||||
* OS (Linux/Windows)
|
||||
* CPU (x64/ARM64)
|
||||
* runtime versions (supported .NET versions)
|
||||
* Fixtures run in CI and fail on any byte-level diff
|
||||
|
||||
---
|
||||
|
||||
### Phase B — Scoring engine (pure function library)
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* `Stella.ScoreEngine` as a pure library:
|
||||
|
||||
* `ComputeScore(ScoreInputBundle) -> ScoreResult`
|
||||
* `Explain(ScoreResult) -> structured explanation` (already embedded)
|
||||
* Policy parser + validator:
|
||||
|
||||
* weights sum to 10,000
|
||||
* bucket tables monotonic
|
||||
  * override rules are deterministic and totally ordered
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* 100% deterministic tests passing (golden fixtures)
|
||||
* “Explain” always includes:
|
||||
|
||||
* subscores
|
||||
* applied buckets
|
||||
* applied gate multipliers
|
||||
* freshness bucket selected
|
||||
* provenance level selected
|
||||
* No non-deterministic dependencies (time, random, locale, float)
|
||||
|
||||
---
|
||||
|
||||
### Phase C — Evidence pipeline (Vexer / Evidence Store)
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Normalized evidence ingestion adapters:
|
||||
|
||||
* SCA ingest (from your existing scanner output)
|
||||
* SAST ingest
|
||||
* DAST ingest
|
||||
* runtime trace ingest (optional MVP → “symbol hit” events)
|
||||
* Evidence Store service:
|
||||
|
||||
* immutability (append-only)
|
||||
* dedupe by `evidenceId`
|
||||
* query by `(artifactDigest, vulnId)`
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Ingesting the same evidence twice yields identical state (idempotent)
|
||||
* Every evidence record can be exported as a bundle with content hashes
|
||||
* Evidence timestamps preserved; `asOf` drives freshness deterministically
|
||||
|
||||
---
|
||||
|
||||
### Phase D — Reachability analyzer (Scanner extension)
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Call-graph builder and symbol resolver:
|
||||
|
||||
* for .NET: IL-level call graph + ASP.NET route discovery
|
||||
* Reachability computation:
|
||||
|
||||
* compute shortest path hops from entrypoints to vulnerable symbol
|
||||
* attach gating detections (config/feature/auth heuristics)
|
||||
* Reachability report emitter:
|
||||
|
||||
* emits ReachabilityReport with stable digests
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Given the same build artifact, reachability report digest is stable
|
||||
* Paths are replayable and visualizable (nodes are resolvable)
|
||||
* Unreachable findings are explicitly marked and explainable
|
||||
|
||||
---
|
||||
|
||||
### Phase E — Provenance verification (Authority / Proof‑Graph)
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Verification pipeline:
|
||||
|
||||
* signature verification for artifact digest
|
||||
* SBOM hash linking
|
||||
* attestation verification (DSSE/in‑toto style)
|
||||
* optional transparency log reference capture
|
||||
* optional reproducible-build comparison input
|
||||
* ProvenanceReport emitter (signed verification log digest)
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Verification is offline-capable if given the necessary bundles
|
||||
* Any failed check is captured with a deterministic error code + message
|
||||
* ProvenanceReport digest is stable for same inputs
|
||||
|
||||
---
|
||||
|
||||
### Phase F — Orchestration: “score a finding” workflow + VEX output
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Orchestrator service (or existing pipeline step) that:
|
||||
|
||||
1. receives a vulnerability finding
|
||||
2. fetches reachability/evidence/provenance bundles
|
||||
3. builds ScoreInput with `asOf`
|
||||
4. computes ScoreResult
|
||||
5. signs ScoreResult digest
|
||||
6. emits VEX note referencing ScoreResult digest
|
||||
* Storage for ScoreResult + VEX note (immutable, versioned)
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* “Recompute” produces same ScoreResult digest if inputs unchanged
|
||||
* VEX note includes:
|
||||
|
||||
* policy version + digest
|
||||
* engine version
|
||||
* input digests
|
||||
* score + subscores
|
||||
* End-to-end API returns “why” data in at most one round trip (served from cache)
|
||||
|
||||
---
|
||||
|
||||
### Phase G — UI: “Why this score?” and replay/export
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Findings view enhancements:
|
||||
|
||||
* score badge + risk bucket (Low/Med/High/Critical)
|
||||
* click-through “Why this score”
|
||||
* “Why this score” panel:
|
||||
|
||||
* call path visualization (at least as an ordered list for MVP)
|
||||
* evidence list with freshness + confidence
|
||||
* provenance checks list (pass/fail)
|
||||
* export bundle (inputs + policy + engine version) for audit replay
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Any score is explainable in <30 seconds by a human reviewer
|
||||
* Exported bundle can reproduce score offline
|
||||
|
||||
---
|
||||
|
||||
### Phase H — Governance: policy-as-code, versioning, calibration, rollout
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Policy registry:
|
||||
|
||||
* store `score.v1` policies by org/project/environment
|
||||
* approvals + change log
|
||||
* Versioning strategy:
|
||||
|
||||
* engine semantic versioning
|
||||
* policy digest pinned in ScoreResult
|
||||
* migration tooling (e.g., score.v1 → score.v2)
|
||||
* Rollout mechanics:
|
||||
|
||||
* shadow mode: compute score but don’t enforce
|
||||
* enforcement gates: block deploy if score ≥ threshold
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Policy changes never rewrite past scores
|
||||
* You can backfill new scores with a new policy version without ambiguity
|
||||
* Audit log shows: who changed policy, when, why (optional but recommended)
|
||||
|
||||
---
|
||||
|
||||
## 6) Engineering backlog (epics → stories → DoD)
|
||||
|
||||
### Epic 1: Deterministic core
|
||||
|
||||
* Story: implement canonical JSON + hashing
|
||||
* Story: implement fixed-point math helpers (bps)
|
||||
* Story: implement score.v1 buckets + overrides
|
||||
* DoD:
|
||||
|
||||
* no floats
|
||||
* golden test suite
|
||||
* deterministic explain ordering
|
||||
|
||||
### Epic 2: Evidence normalization
|
||||
|
||||
* Story: evidence schema + dedupe
|
||||
* Story: adapters (SCA/SAST/DAST/runtime)
|
||||
* Story: evidence query API
|
||||
* DoD:
|
||||
|
||||
* idempotent ingest
|
||||
* bundle export with digests
|
||||
|
||||
### Epic 3: Reachability
|
||||
|
||||
* Story: entrypoint discovery for target frameworks
|
||||
* Story: call graph extraction
|
||||
* Story: shortest-path computation
|
||||
* Story: gating heuristics
|
||||
* DoD:
|
||||
|
||||
* stable digests
|
||||
* replayable paths
|
||||
|
||||
### Epic 4: Provenance
|
||||
|
||||
* Story: verify signatures
|
||||
* Story: verify SBOM link
|
||||
* Story: verify attestations
|
||||
* Story: reproducible match input support
|
||||
* DoD:
|
||||
|
||||
* deterministic error codes
|
||||
* stable provenance scoring
|
||||
|
||||
### Epic 5: End-to-end score + VEX
|
||||
|
||||
* Story: orchestration
|
||||
* Story: ScoreResult signing
|
||||
* Story: VEX generation and storage
|
||||
* DoD:
|
||||
|
||||
* recompute parity
|
||||
* verifiable signatures
|
||||
|
||||
### Epic 6: UI
|
||||
|
||||
* Story: score badge + buckets
|
||||
* Story: why panel
|
||||
* Story: export bundle + recompute button
|
||||
* DoD:
|
||||
|
||||
* human explainability
|
||||
* offline replay works
|
||||
|
||||
---
|
||||
|
||||
## 7) APIs to implement (minimal but complete)
|
||||
|
||||
### 7.1 Compute score (internal)
|
||||
|
||||
* `POST /api/score/compute`
|
||||
|
||||
* input: `ScoreInput` + references or inline bundles
|
||||
* output: `ScoreResult`
|
||||
|
||||
### 7.2 Get score (product)
|
||||
|
||||
* `GET /api/findings/{findingId}/score`
|
||||
|
||||
* returns latest ScoreResult + VEX reference
|
||||
|
||||
### 7.3 Explain score
|
||||
|
||||
* `GET /api/findings/{findingId}/score/explain`
|
||||
|
||||
* returns `explain[]` + call path + evidence list + provenance checks
|
||||
|
||||
### 7.4 Export replay bundle
|
||||
|
||||
* `GET /api/findings/{findingId}/score/bundle`
|
||||
|
||||
* returns a tar/zip containing:
|
||||
|
||||
* ScoreInput
|
||||
* policy file
|
||||
* reachability/evidence/provenance reports
|
||||
* engine version manifest
|
||||
|
||||
---
|
||||
|
||||
## 8) Testing strategy (what to automate early)
|
||||
|
||||
### Unit tests
|
||||
|
||||
* bucket selection correctness
|
||||
* gate multiplier composition
|
||||
* evidence freshness bucketing
|
||||
* provenance level mapping
|
||||
* override rule ordering
|
||||
|
||||
### Golden fixtures
|
||||
|
||||
* fixed input bundles → fixed ScoreResult digest
|
||||
* run on every supported platform/runtime
|
||||
|
||||
### Property-based tests
|
||||
|
||||
* monotonicity:
|
||||
|
||||
* fewer hops should not reduce R
|
||||
* more evidence points should not reduce E
|
||||
* stronger provenance should not reduce P
|
||||
|
||||
### Integration tests
|
||||
|
||||
* full pipeline: finding → bundles → score → VEX
|
||||
* “recompute” parity tests
|
||||
|
||||
---
|
||||
|
||||
## 9) Operational concerns and hardening
|
||||
|
||||
### Performance
|
||||
|
||||
* Cache reachability per `(artifactDigest, vulnId, symbol)`
|
||||
* Cache provenance per `artifactDigest`
|
||||
* Evidence queries should be indexed by `(artifactDigest, vulnId, type)`
|
||||
|
||||
### Security
|
||||
|
||||
* Treat evidence ingestion as untrusted input:
|
||||
|
||||
* strict schema validation
|
||||
* content-hash dedupe prevents tampering via overwrite
|
||||
* Sign ScoreResults and VEX notes
|
||||
* RBAC:
|
||||
|
||||
* who can change policy
|
||||
* who can override scores (if allowed at all)
|
||||
|
||||
### Data retention
|
||||
|
||||
* Evidence payloads can be large; keep digests + store raw payloads in object storage
|
||||
* Keep a “minimal replay bundle” always (schemas + digests + policy + engine)
|
||||
|
||||
---
|
||||
|
||||
## 10) Concrete “MVP first” slice (smallest valuable product)
|
||||
|
||||
If you want a crisp MVP that still satisfies “auditable determinism”:
|
||||
|
||||
1. Scoring engine (`B + R + E + P`), fixed-point, golden tests
|
||||
2. Evidence store (SCA + runtime optional)
|
||||
3. Reachability: only hop depth from HTTP routes to symbol (no fancy gates)
|
||||
4. Provenance: signed image + SBOM link only
|
||||
5. UI: score + “why” panel showing:
|
||||
|
||||
* hops/path list
|
||||
* evidence list
|
||||
* provenance checklist
|
||||
6. Emit a signed VEX note containing the score + input digests
|
||||
|
||||
That MVP already proves the core differentiator: **deterministic, replayable risk scoring**.
|
||||
|
||||
---
|
||||
|
||||
## 11) Starter policy file (score.v1) using basis points
|
||||
|
||||
Here’s a good “real implementation” starting point (int-only):
|
||||
|
||||
```yaml
policyVersion: score.v1
weightsBps:
  baseSeverity: 1000
  reachability: 4500
  evidence: 3000
  provenance: 1500

reachability:
  hopBuckets:
    - { maxHops: 2, score: 100 }
    - { maxHops: 3, score: 85 }
    - { maxHops: 4, score: 70 }
    - { maxHops: 5, score: 55 }
    - { maxHops: 6, score: 45 }
    - { maxHops: 7, score: 35 }
    - { maxHops: 9999, score: 20 }
  unreachableScore: 0
  gateMultipliersBps:
    featureFlag: 7000
    authRequired: 8000
    adminOnly: 8500
    nonDefaultConfig: 7500

evidence:
  points:
    runtime: 60
    dast: 30
    sast: 20
    sca: 10
  freshnessBuckets:
    - { maxAgeDays: 7, multiplierBps: 10000 }
    - { maxAgeDays: 30, multiplierBps: 9000 }
    - { maxAgeDays: 90, multiplierBps: 7500 }
    - { maxAgeDays: 180, multiplierBps: 6000 }
    - { maxAgeDays: 365, multiplierBps: 4000 }
    - { maxAgeDays: 99999, multiplierBps: 2000 }

provenance:
  levels:
    unsigned: 0
    signed: 30
    signedWithSbom: 60
    signedWithSbomAndAttestations: 80
    reproducible: 100

overrides:
  - name: knownExploitedAndReachable
    when:
      flags:
        knownExploited: true
      minReachability: 70
    setScore: 95

  - name: unreachableAndOnlySca
    when:
      maxReachability: 0
      maxEvidence: 10
    clampMaxScore: 25
```
|
||||
|
||||
---
|
||||
|
||||
If you want, I can also include a **repo layout + CI “golden fixture” test runner** (dotnet test + cross-platform determinism checks) and a **.NET 10 ScoreEngine skeleton** that enforces: no floats, canonical JSON hashing, and stable explanation ordering.
|
||||
@@ -0,0 +1,744 @@
|
||||
Here’s a simple, high‑leverage UX metric to add to your pipeline run view that will immediately make DevOps feel faster and calmer:
|
||||
|
||||
# Time‑to‑First‑Signal (TTFS)
|
||||
|
||||
**What it is:** the time from opening a run’s details page until the UI renders the **first actionable insight** (e.g., “Stage `build` failed – `dotnet restore` 401 – token expired”).
|
||||
**Why it matters:** engineers don’t need *all* data instantly—just the first trustworthy clue to start acting. Lower TTFS = quicker triage, lower stress, tighter MTTR.
|
||||
|
||||
---
|
||||
|
||||
## What counts as a “first signal”
|
||||
|
||||
* Failed stage + reason (exit code, key log line, failing test name)
|
||||
* Degraded but actionable status (e.g., flaky test signature)
|
||||
* Policy gate block with the specific rule that failed
|
||||
* Reachability‑aware security finding that blocks deploy (one concrete example, not the whole list)
|
||||
|
||||
> Not a signal: spinners, generic “loading…”, or unactionable counts.
|
||||
|
||||
---
|
||||
|
||||
## How to optimize TTFS (practical steps)
|
||||
|
||||
1. **Deferred loading (prioritize critical panes):**
|
||||
|
||||
* Render header + failing stage card first; lazy‑load artifacts, full logs, and graphs after.
|
||||
* Pre‑expand the *first failing node* in the stage graph.
|
||||
|
||||
2. **Log pre‑indexing at ingest:**
|
||||
|
||||
* During CI, stream logs into chunks keyed by `[jobId, phase, severity, firstErrorLine]`.
|
||||
* Extract the **first error tuple** (timestamp, step, message) and store it next to the job record.
|
||||
* On UI open, fetch only that tuple (sub‑100 ms) before fetching the rest.
|
||||
|
||||
3. **Cached summaries:**
|
||||
|
||||
* Persist a tiny JSON “run.summary.v1” (status, first failing stage, first error line, blocking policies) in Redis/Postgres.
|
||||
* Invalidate on new job events; always serve this summary first.
|
||||
|
||||
4. **Edge prefetch:**
|
||||
|
||||
* When the runs table is visible, prefetch summaries for rows in viewport so details pages open “warm”.
|
||||
|
||||
5. **Compress + cap first log burst:**
|
||||
|
||||
* Send the first **5–10 error lines** (already extracted) immediately; stream the rest.
|
||||
|
||||
---
|
||||
|
||||
## Instrumentation (so you can prove it)
|
||||
|
||||
Emit these points as telemetry:
|
||||
|
||||
* `ttfs_start`: when the run details route is entered (or when tab becomes visible)
|
||||
* `ttfs_signal_rendered`: when the first actionable card is in the DOM
|
||||
* `ttfs_ms = signal_rendered - start`
|
||||
* Dimensions: `pipeline_provider`, `repo`, `branch`, `run_type` (PR/main), `device`, `release`, `network_state`
|
||||
|
||||
**SLO:** *P50 ≤ 700 ms, P95 ≤ 2.5 s* (adjust to your infra).
|
||||
|
||||
**Dashboards to track:**
|
||||
|
||||
* TTFS distribution (P50/P90/P95) by release
|
||||
* Correlate TTFS with bounce rate and “open → rerun” delay
|
||||
* Error budget: % of views with TTFS > 3 s
|
||||
|
||||
---
|
||||
|
||||
## Minimal backend contract (example)
|
||||
|
||||
```json
|
||||
GET /api/runs/{runId}/first-signal
|
||||
{
|
||||
"runId": "123",
|
||||
"firstSignal": {
|
||||
"type": "stage_failed",
|
||||
"stage": "build",
|
||||
"step": "dotnet restore",
|
||||
"message": "401 Unauthorized: token expired",
|
||||
"at": "2025-12-11T09:22:31Z",
|
||||
"artifact": { "kind": "log", "range": {"start": 1880, "end": 1896} }
|
||||
},
|
||||
"summaryEtag": "W/\"a1b2c3\""
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Frontend pattern (Angular 17, signal‑first)
|
||||
|
||||
* Fire `first-signal` request in route resolver.
|
||||
* Render `FirstSignalCard` immediately.
|
||||
* Lazy‑load stage graph, full logs, security panes.
|
||||
* Fire `ttfs_signal_rendered` when `FirstSignalCard` enters viewport.
|
||||
|
||||
---
|
||||
|
||||
## CI adapter hints (GitLab/GitHub/Azure)
|
||||
|
||||
* Hook on job status webhooks to compute & store the first error tuple.
|
||||
* For GitLab: scan `trace` stream for first `ERRO|FATAL|##[error]` match; store to DB table `ci_run_first_signal(run_id, stage, step, message, t)`.
|
||||
|
||||
---
|
||||
|
||||
## “Good TTFS” acceptance tests
|
||||
|
||||
* Run with early fail → first signal < 1 s, shows exact command + exit code.
|
||||
* Run with policy gate fail → rule name + fix hint visible first.
|
||||
* Offline/slow network → cached summary still renders an actionable hint.
|
||||
|
||||
---
|
||||
|
||||
## Copy to put in your UX guidelines
|
||||
|
||||
> “Optimize **Time‑to‑First‑Signal (TTFS)** above all. Users must see one trustworthy, actionable clue within 1 second on a warm path—even if the rest of the UI is still loading.”
|
||||
|
||||
If you want, I can sketch the exact DB schema for the pre‑indexed log tuples and the Angular resolver + telemetry hooks next.
|
||||
Below is an extended, end‑to‑end implementation plan for **Time‑to‑First‑Signal (TTFS)** that you can drop into your backlog. It includes architecture, data model, API contracts, frontend work, observability, QA, and rollout—structured as epics/phases with “definition of done” and acceptance criteria.
|
||||
|
||||
---
|
||||
|
||||
# Scope extension
|
||||
|
||||
## What we’re building
|
||||
|
||||
A run details experience that renders **one actionable clue** fast—before loading heavy UI like full logs, graphs, artifacts.
|
||||
|
||||
**“First signal”** is a small payload derived from run/job events and the earliest meaningful error evidence (stage/step + key log line(s) + reason/classification).
|
||||
|
||||
## What we’re extending beyond the initial idea
|
||||
|
||||
1. **First‑Signal Quality** (not just speed)
|
||||
|
||||
* Classify error type (auth, dependency, compilation, test, infra, policy, timeout).
|
||||
* Identify “culprit step” and a stable “signature” for dedupe and search.
|
||||
2. **Progressive disclosure UX**
|
||||
|
||||
* Summary → First signal card → expanded context (stage graph, logs, artifacts).
|
||||
3. **Provider‑agnostic ingestion**
|
||||
|
||||
* Adapters for GitLab/GitHub/Azure (or your CI provider).
|
||||
4. **Caching + prefetch**
|
||||
|
||||
* Warm open from list/table, with ETags and stale‑while‑revalidate.
|
||||
5. **Observability & SLOs**
|
||||
|
||||
* TTFS metrics, dashboards, alerting, and quality metrics (false signals).
|
||||
6. **Rollout safety**
|
||||
|
||||
* Feature flags, canary, A/B gating, and a guaranteed fallback path.
|
||||
|
||||
---
|
||||
|
||||
# Success criteria
|
||||
|
||||
## Primary metric
|
||||
|
||||
* **TTFS (ms)**: time from details page route enter → first actionable signal rendered.
|
||||
|
||||
## Targets (example SLOs)
|
||||
|
||||
* **P50 ≤ 700 ms**, **P95 ≤ 2500 ms** on warm path.
|
||||
* **Cold path**: P95 ≤ 4000 ms (depends on infra).
|
||||
|
||||
## Secondary outcome metrics
|
||||
|
||||
* **Open→Action time**: time from opening run to first user action (rerun, cancel, assign, open failing log line).
|
||||
* **Bounce rate**: close page within 10 seconds without interaction.
|
||||
* **MTTR proxy**: time from failure to first rerun or fix commit.
|
||||
|
||||
## Quality metrics
|
||||
|
||||
* **Signal availability rate**: % of run views that show a first signal card within 3s.
|
||||
* **Signal accuracy score** (sampled): engineer confirms “helpful vs not”.
|
||||
* **Extractor failure rate**: parsing errors / missing mappings / timeouts.
|
||||
|
||||
---
|
||||
|
||||
# Architecture overview
|
||||
|
||||
## Data flow
|
||||
|
||||
1. **CI provider events** (job started, job finished, stage failed, log appended) land in your backend.
|
||||
2. **Run summarizer** maintains:
|
||||
|
||||
* `run_summary` (small JSON)
|
||||
* `first_signal` (small, actionable payload)
|
||||
3. **UI opens run details**
|
||||
|
||||
* Immediately calls `GET /runs/{id}/first-signal` (or `/summary`).
|
||||
* Renders FirstSignalCard as soon as payload arrives.
|
||||
4. Background fetches:
|
||||
|
||||
* Stage graph, full logs, artifacts, security scans, trends.
|
||||
|
||||
## Key decision: where to compute first signal
|
||||
|
||||
* **Option A: at ingest time (recommended)**
|
||||
Compute first signal when logs/events arrive, store it, serve it instantly.
|
||||
* **Option B: on demand**
|
||||
Compute when user opens run details (simpler initially, worse TTFS and load).
|
||||
|
||||
---
|
||||
|
||||
# Data model
|
||||
|
||||
## Tables (relational example)
|
||||
|
||||
### `ci_run`
|
||||
|
||||
* `run_id (pk)`
|
||||
* `provider`
|
||||
* `repo_id`
|
||||
* `branch`
|
||||
* `status`
|
||||
* `created_at`, `updated_at`
|
||||
|
||||
### `ci_job`
|
||||
|
||||
* `job_id (pk)`
|
||||
* `run_id (fk)`
|
||||
* `stage_name`
|
||||
* `job_name`
|
||||
* `status`
|
||||
* `started_at`, `finished_at`
|
||||
|
||||
### `ci_log_chunk`
|
||||
|
||||
* `chunk_id (pk)`
|
||||
* `job_id (fk)`
|
||||
* `seq` (monotonic)
|
||||
* `byte_start`, `byte_end` (range into blob)
|
||||
* `first_error_line_no` (nullable)
|
||||
* `first_error_excerpt` (nullable, short)
|
||||
* `severity_max` (info/warn/error)
|
||||
|
||||
### `ci_run_summary`
|
||||
|
||||
* `run_id (pk)`
|
||||
* `version` (e.g., `1`)
|
||||
* `etag` (hash)
|
||||
* `summary_json` (small, 1–5 KB)
|
||||
* `updated_at`
|
||||
|
||||
### `ci_first_signal`
|
||||
|
||||
* `run_id (pk)`
|
||||
* `etag`
|
||||
* `signal_json` (small, 0.5–2 KB)
|
||||
* `quality_flags` (bitmask or json)
|
||||
* `updated_at`
|
||||
|
||||
## Cache layer
|
||||
|
||||
* Redis keys:
|
||||
|
||||
* `run:{runId}:summary:v1`
|
||||
* `run:{runId}:first-signal:v1`
|
||||
* TTL: generous but safe (e.g., 24h) with “write‑through” on event updates.
|
||||
|
||||
---
|
||||
|
||||
# First signal definition
|
||||
|
||||
## `FirstSignal` object (recommended shape)
|
||||
|
||||
```json
|
||||
{
|
||||
"runId": "123",
|
||||
"computedAt": "2025-12-12T09:22:31Z",
|
||||
"status": "failed",
|
||||
"firstSignal": {
|
||||
"type": "stage_failed",
|
||||
"classification": "dependency_auth",
|
||||
"stage": "build",
|
||||
"job": "build-linux-x64",
|
||||
"step": "dotnet restore",
|
||||
"message": "401 Unauthorized: token expired",
|
||||
"signature": "dotnet-restore-401-unauthorized",
|
||||
"log": {
|
||||
"jobId": "job-789",
|
||||
"lines": [
|
||||
"error : Response status code does not indicate success: 401 (Unauthorized).",
|
||||
"error : The token is expired."
|
||||
],
|
||||
"range": { "start": 1880, "end": 1896 }
|
||||
},
|
||||
"suggestedActions": [
|
||||
{ "label": "Rotate token", "type": "doc", "target": "internal://docs/tokens" },
|
||||
{ "label": "Rerun job", "type": "action", "target": "rerun-job:job-789" }
|
||||
]
|
||||
},
|
||||
"etag": "W/\"a1b2c3\""
|
||||
}
|
||||
```
|
||||
|
||||
### Notes
|
||||
|
||||
* `signature` should be stable for grouping.
|
||||
* `suggestedActions` is optional but hugely valuable (even 1–2 actions).
|
||||
|
||||
---
|
||||
|
||||
# APIs
|
||||
|
||||
## 1) First signal endpoint
|
||||
|
||||
**GET** `/api/runs/{runId}/first-signal`
|
||||
|
||||
Headers:
|
||||
|
||||
* `If-None-Match: W/"..."` supported
|
||||
* Response includes `ETag` and `Cache-Control`
|
||||
|
||||
Responses:
|
||||
|
||||
* `200`: full first signal object
|
||||
* `304`: not modified
|
||||
* `404`: run not found
|
||||
* `204`: run exists but signal not available yet (rare; should degrade gracefully)
|
||||
|
||||
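A minimal ASP.NET Core sketch of the endpoint above, assuming a hypothetical `IFirstSignalStore` (cache first, then database) that returns the cached payload plus its ETag; the 200/304/204/404 behavior mirrors the response list.

```csharp
// Minimal API sketch (web SDK implicit usings assumed); IFirstSignalStore is hypothetical.
var builder = WebApplication.CreateBuilder(args);
// builder.Services.AddSingleton<IFirstSignalStore, CachedFirstSignalStore>(); // concrete store registered here
var app = builder.Build();

app.MapGet("/api/runs/{runId}/first-signal",
    async (string runId, HttpContext ctx, IFirstSignalStore store) =>
{
    var entry = await store.GetAsync(runId);            // Redis first, then DB
    if (entry is null) return Results.NotFound();       // 404: unknown run
    if (entry.Signal is null)                           // 204: run exists, signal not computed yet
        return Results.StatusCode(StatusCodes.Status204NoContent);

    if (ctx.Request.Headers.IfNoneMatch == entry.ETag)  // 304: client copy is current
        return Results.StatusCode(StatusCodes.Status304NotModified);

    ctx.Response.Headers.ETag = entry.ETag;
    ctx.Response.Headers.CacheControl = "private, max-age=5";
    return Results.Json(entry.Signal);                  // 200: full first-signal payload
});

app.Run();

// Hypothetical contract used by the sketch.
public interface IFirstSignalStore
{
    Task<FirstSignalEntry?> GetAsync(string runId);
}
public record FirstSignalEntry(object? Signal, string ETag);
```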
## 2) Summary endpoint (optional but useful)
|
||||
|
||||
**GET** `/api/runs/{runId}/summary`
|
||||
|
||||
* Includes: status, first failing stage/job, timestamps, blocking policies, artifact counts.
|
||||
|
||||
## 3) SSE / WebSocket updates (nice-to-have)
|
||||
|
||||
**GET** `/api/runs/{runId}/events` (SSE)
|
||||
|
||||
* Push new signal or summary updates in near real-time while user is on the page.
|
||||
|
||||
---
|
||||
|
||||
# Frontend implementation plan (Angular 17)
|
||||
|
||||
## UX behavior
|
||||
|
||||
1. **Route enter**
|
||||
|
||||
* Start TTFS timer.
|
||||
2. Render instantly:
|
||||
|
||||
* Title, status badge, pipeline metadata (run id, commit, branch).
|
||||
* Skeleton for details area.
|
||||
3. Fetch first signal:
|
||||
|
||||
* Render `FirstSignalCard` immediately when available.
|
||||
* Fire telemetry event when card is **in DOM and visible**.
|
||||
4. Lazy-load:
|
||||
|
||||
* Stage graph
|
||||
* Full logs viewer
|
||||
* Artifacts list
|
||||
* Security findings
|
||||
* Trends, flaky tests, etc.
|
||||
|
||||
## Angular structure
|
||||
|
||||
* `RunDetailsResolver` (or `resolveFn`) requests first signal.
|
||||
* `RunDetailsComponent` uses signals to render quickly.
|
||||
* `FirstSignalCardComponent` is standalone + minimal deps.
|
||||
|
||||
## Prefetch strategy from runs list view
|
||||
|
||||
* When the runs table is visible, prefetch summaries/first signals for items in viewport:
|
||||
|
||||
* Use `IntersectionObserver` to prefetch only visible rows.
|
||||
* Store results in an in-memory cache (e.g., `Map<runId, FirstSignal>`).
|
||||
* Respect ETag to avoid redundant payloads.
|
||||
|
||||
## Telemetry hooks
|
||||
|
||||
* `ttfs_start`: route activation + tab visible
|
||||
* `ttfs_signal_rendered`: FirstSignalCard attached and visible
|
||||
* Dimensions: provider, repo, branch, run_type, release_version, network_state
|
||||
|
||||
---
|
||||
|
||||
# Backend implementation plan
|
||||
|
||||
## Summarizer / First-signal service
|
||||
|
||||
A service or module that:
|
||||
|
||||
* subscribes to run/job events
|
||||
* receives log chunks (or pointers)
|
||||
* computes and stores:
|
||||
|
||||
* `run_summary`
|
||||
* `first_signal`
|
||||
* publishes updates (optional) to an event stream for SSE
|
||||
|
||||
### Concurrency rule
|
||||
|
||||
First signal should be set once per run unless a “better” signal appears (a sketch of this rule follows the list):
|
||||
|
||||
* if current signal is missing → set
|
||||
* if current signal is “generic” and new one is “specific” → replace
|
||||
* otherwise keep (avoid churn)
|
||||
|
||||
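A minimal sketch of that replacement rule; the record shape and the `IsGeneric` flag are assumptions (in practice “generic” might mean no classification or culprit step was identified):

```csharp
// Hypothetical first-signal record; IsGeneric marks low-specificity signals
// (unknown classification, bare exit code, no culprit step identified).
public record FirstSignal(string Signature, string Classification, bool IsGeneric);

public static class FirstSignalPolicy
{
    // Returns the signal that should be stored after a new candidate arrives.
    public static FirstSignal Merge(FirstSignal? current, FirstSignal candidate)
    {
        if (current is null) return candidate;                            // missing -> set
        if (current.IsGeneric && !candidate.IsGeneric) return candidate;  // generic -> specific
        return current;                                                    // otherwise keep (avoid churn)
    }
}
```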
---
|
||||
|
||||
# Extraction & classification logic
|
||||
|
||||
## Minimum viable extractor (Phase 1)
|
||||
|
||||
* Heuristics (a scanning sketch follows this section):
|
||||
|
||||
* first match among patterns: `FATAL`, `ERROR`, `##[error]`, `panic:`, `Unhandled exception`, `npm ERR!`, `BUILD FAILED`, etc.
|
||||
* plus provider-specific fail markers
|
||||
* Pull:
|
||||
|
||||
* stage/job/step context (from job metadata or step boundaries)
|
||||
* 5–10 log lines around first error line
|
||||
|
||||
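As an illustration, a minimal scan over an incoming log chunk using the markers above; the tuple shape and class names are assumptions:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical "first error tuple" stored next to the job record.
public record FirstErrorTuple(int LineNo, string Line, string Marker);

public static class FirstErrorScanner
{
    // Markers mirroring the Phase 1 heuristics; extend per provider/toolchain.
    private static readonly Regex Markers = new(
        @"(FATAL|ERROR|##\[error\]|panic:|Unhandled exception|npm ERR!|BUILD FAILED)",
        RegexOptions.Compiled);

    // Scans one chunk of log lines; returns the first match or null.
    public static FirstErrorTuple? Scan(IEnumerable<string> lines, int startLineNo)
    {
        int lineNo = startLineNo;
        foreach (var line in lines)
        {
            var m = Markers.Match(line);
            if (m.Success)
                return new FirstErrorTuple(lineNo, line.Trim(), m.Value);
            lineNo++;
        }
        return null; // caller keeps scanning subsequent chunks until a match is found
    }
}
```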
## Improved extractor (Phase 2+)
|
||||
|
||||
* Language/tool specific rules:
|
||||
|
||||
* dotnet, maven/gradle, npm/yarn/pnpm, python/pytest, go test, docker build, terraform, helm
|
||||
* Add `classification` and `signature`:
|
||||
|
||||
* normalize common errors:
|
||||
|
||||
* auth expired/forbidden
|
||||
* missing dependency / DNS / TLS
|
||||
* compilation error
|
||||
* test failure (include test name)
|
||||
* infra capacity / agent lost
|
||||
* policy gate failure
|
||||
|
||||
## Guardrails
|
||||
|
||||
* **Secret redaction**: before storing excerpts, run your existing redaction pipeline.
|
||||
* **Payload cap**: cap message length and excerpt lines.
|
||||
* **PII discipline**: avoid including arbitrary stack traces if they contain sensitive paths; include only key lines.
|
||||
|
||||
---
|
||||
|
||||
# Development plan by phases (epics)
|
||||
|
||||
Each phase below includes deliverables + acceptance criteria. You can treat each as a sprint/iteration.
|
||||
|
||||
---
|
||||
|
||||
## Phase 0 — Baseline and alignment
|
||||
|
||||
### Deliverables
|
||||
|
||||
* Baseline TTFS measurement (current behavior)
|
||||
* Definition of “actionable signal” and priority rules
|
||||
* Performance budget for run details view
|
||||
|
||||
### Tasks
|
||||
|
||||
* Add client-side telemetry for current page load steps:
|
||||
|
||||
* route enter, summary loaded, logs loaded, graph loaded
|
||||
* Measure TTFS proxy today (likely “time to status shown”)
|
||||
* Identify top 20 failure modes in your CI (from historical logs)
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
* Dashboard shows baseline P50/P95 for current experience.
|
||||
* “First signal” contract signed off with UI + backend teams.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 — Data model and storage
|
||||
|
||||
### Deliverables
|
||||
|
||||
* DB migrations for `ci_run_summary` and `ci_first_signal`
|
||||
* Redis cache keys and invalidation strategy
|
||||
* ADR: where summaries live and how they update
|
||||
|
||||
### Tasks
|
||||
|
||||
* Create tables and indices:
|
||||
|
||||
* index on `run_id`, `updated_at`, `provider`
|
||||
* Add serializer/deserializer for `summary_json` and `signal_json`
|
||||
* Implement ETag generation (hash of JSON payload)
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
* Can store and retrieve summary + first signal for a run in < 50ms (DB) and < 10ms (cache).
|
||||
* ETag works end-to-end.
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 — Ingestion and first signal computation
|
||||
|
||||
### Deliverables
|
||||
|
||||
* First-signal computation module
|
||||
* Provider adapter integration points (webhook consumers)
|
||||
* “first error tuple” extraction from logs
|
||||
|
||||
### Tasks
|
||||
|
||||
* On job log append:
|
||||
|
||||
* scan incrementally for first error markers
|
||||
* store excerpt + line range + job/stage/step mapping
|
||||
* On job finish/fail:
|
||||
|
||||
* finalize first signal with best known context
|
||||
* Implement the “better signal replaces generic” rule
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
* For a known failing run, API returns first signal without reading full log blob.
|
||||
* Computation does not exceed a small CPU budget per log chunk (guard with limits).
|
||||
* Extraction failure rate < 1% for sampled runs (initial).
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 — API endpoints and caching
|
||||
|
||||
### Deliverables
|
||||
|
||||
* `/runs/{id}/first-signal` endpoint
|
||||
* Optional `/runs/{id}/summary`
|
||||
* Cache-control + ETag support
|
||||
* Access control checks consistent with existing run authorization
|
||||
|
||||
### Tasks
|
||||
|
||||
* Serve cached first signal first; fallback to DB
|
||||
* If missing:
|
||||
|
||||
* return `204` (or a “pending” object) and allow UI fallback
|
||||
* Add server-side metrics:
|
||||
|
||||
* endpoint latency, cache hit rate, payload size
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
* Endpoint P95 latency meets target (e.g., < 200ms internal).
|
||||
* Cache hit rate is high for active runs (after prefetch).
|
||||
|
||||
---
|
||||
|
||||
## Phase 4 — Frontend progressive rendering
|
||||
|
||||
### Deliverables
|
||||
|
||||
* FirstSignalCard component
|
||||
* Route resolver + local cache
|
||||
* Prefetch on runs list view
|
||||
* Telemetry for TTFS
|
||||
|
||||
### Tasks
|
||||
|
||||
* Render shell immediately
|
||||
* Fetch and render first signal
|
||||
* Lazy-load heavy panels using `@defer` / dynamic imports
|
||||
* Implement “open failing stage” default behavior
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
* In a throttled-network test, the first-signal card appears significantly earlier than the logs and graphs.
|
||||
* `ttfs_signal_rendered` fires exactly once per view, with correct dimensions.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5 — Observability, dashboards, and alerting
|
||||
|
||||
### Deliverables
|
||||
|
||||
* TTFS dashboards by:
|
||||
|
||||
* provider, repo, run type, release version
|
||||
* Alerts:
|
||||
|
||||
* P95 regression threshold
|
||||
* Quality dashboard:
|
||||
|
||||
* availability rate, extraction failures, “generic signal rate”
|
||||
|
||||
### Tasks
|
||||
|
||||
* Create event pipeline for telemetry into your analytics system
|
||||
* Define SLO/error budget alerts
|
||||
* Add tracing (OpenTelemetry) for endpoint and summarizer
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
* You can correlate TTFS with:
|
||||
|
||||
* bounce rate
|
||||
* open→action time
|
||||
* You can pinpoint whether regressions are backend, frontend, or provider‑specific.
|
||||
|
||||
---
|
||||
|
||||
## Phase 6 — QA, performance testing, rollout
|
||||
|
||||
### Deliverables
|
||||
|
||||
* Automated tests
|
||||
* Feature flag + gradual rollout
|
||||
* A/B experiment (optional)
|
||||
|
||||
### Tasks
|
||||
|
||||
**Testing**
|
||||
|
||||
* Unit tests:
|
||||
|
||||
* extractor patterns
|
||||
* classification rules
|
||||
* Integration tests:
|
||||
|
||||
* simulated job logs with known outcomes
|
||||
* E2E (Playwright/Cypress):
|
||||
|
||||
* verify first signal appears before logs
|
||||
* verify fallback path works if endpoint fails
|
||||
* Performance tests:
|
||||
|
||||
* cold cache vs warm cache
|
||||
* throttled CPU/network profiles
|
||||
|
||||
**Rollout**
|
||||
|
||||
* Feature flag:
|
||||
|
||||
* enabled for internal users first
|
||||
* ramp by repo or percentage
|
||||
* Monitor key metrics during ramp:
|
||||
|
||||
* TTFS P95
|
||||
* API error rate
|
||||
* UI error rate
|
||||
* cache miss spikes
|
||||
|
||||
### Acceptance criteria
|
||||
|
||||
* No increase in overall error rates.
|
||||
* TTFS improves by at least X% for a meaningful slice of users (define X from the baseline).
|
||||
* Fallback UX remains usable when signals are unavailable.
|
||||
|
||||
---
|
||||
|
||||
# Backlog examples (ready-to-create Jira tickets)
|
||||
|
||||
## Epic: Run summary and first signal storage
|
||||
|
||||
* Create `ci_first_signal` table
|
||||
* Create `ci_run_summary` table
|
||||
* Implement ETag hashing
|
||||
* Implement Redis caching layer
|
||||
* Add admin/debug endpoint (internal only) to inspect computed signals
|
||||
|
||||
## Epic: Log chunk extraction
|
||||
|
||||
* Implement incremental log scanning
|
||||
* Store first error excerpt + range
|
||||
* Map excerpt to job + step
|
||||
* Add redaction pass to excerpts
|
||||
|
||||
## Epic: Run details progressive UI
|
||||
|
||||
* FirstSignalCard UI component
|
||||
* Lazy-load logs viewer
|
||||
* Default to opening failing stage
|
||||
* Prefetch signals in runs list
|
||||
|
||||
## Epic: Telemetry and dashboards
|
||||
|
||||
* Add `ttfs_start` and `ttfs_signal_rendered`
|
||||
* Add endpoint latency metrics
|
||||
* Build dashboards + alerts
|
||||
* Add sampling for “signal helpfulness” feedback
|
||||
|
||||
---
|
||||
|
||||
# Risk register and mitigations
|
||||
|
||||
## Risk: First signal is wrong/misleading
|
||||
|
||||
* Mitigation:
|
||||
|
||||
* track “generic signal rate” and “corrected by user” feedback
|
||||
* classification confidence scoring
|
||||
* always provide quick access to full logs as fallback
|
||||
|
||||
## Risk: Logs contain secrets
|
||||
|
||||
* Mitigation:
|
||||
|
||||
* redact excerpts before storing/serving
|
||||
* cap excerpt lines and length
|
||||
* keep raw logs behind existing permissions
|
||||
|
||||
## Risk: Increased ingest CPU cost
|
||||
|
||||
* Mitigation:
|
||||
|
||||
* incremental scanning with early stop after first error captured
|
||||
* limit scanning per chunk
|
||||
* sample/skip overly large logs until job completion
|
||||
|
||||
## Risk: Cache invalidation bugs
|
||||
|
||||
* Mitigation:
|
||||
|
||||
* ETag-based correctness
|
||||
* versioned keys: `:v1`
|
||||
* “write-through” cache updates from summarizer
|
||||
|
||||
---
|
||||
|
||||
# Definition of Done checklist
|
||||
|
||||
A phase is “done” when:
|
||||
|
||||
* ✅ TTFS measured with reliable client events
|
||||
* ✅ FirstSignalCard renders from `/first-signal` endpoint
|
||||
* ✅ ETag caching works
|
||||
* ✅ Fallback path is solid (no blank screens)
|
||||
* ✅ Dashboards exist and are actively watched during rollout
|
||||
* ✅ Security review completed for log excerpts/redaction
|
||||
* ✅ Load tests show no unacceptable backend regressions
|
||||
|
||||
---
|
||||
|
||||
# Optional enhancements after initial launch
|
||||
|
||||
1. **Next-step recommendations**
|
||||
Add action suggestions and deep links (rotate token, open failing test, open doc).
|
||||
2. **Flaky test / known issue detection**
|
||||
Show “this matches known flaky signature” with last-seen frequency.
|
||||
3. **“Compare to last green”**
|
||||
Summarize what changed since last successful run (commit diff, dependency bump).
|
||||
4. **SSE live updates**
|
||||
Update first signal as soon as failure occurs while user watches.
|
||||
|
||||
---
|
||||
|
||||
If you tell me your current backend stack (Node/Go/.NET), log storage (S3/Elastic/Loki), and which CI providers you support, I can translate this into a concrete set of modules/classes, exact schema migrations, and the Angular routing + signals code structure you’d implement.
|
||||
@@ -0,0 +1,643 @@
|
||||
Here’s a simple, practical idea to make your scans provably repeatable over time and catch drift fast.
|
||||
|
||||
# Replay Fidelity (what, why, how)
|
||||
|
||||
**What it is:** the share of historical scans that reproduce **bit‑for‑bit** when re‑run using their saved manifests (inputs, versions, rules, seeds). Higher = more deterministic system.
|
||||
|
||||
**Why you want it:** it exposes hidden nondeterminism (feed drift, time‑dependent rules, race conditions, unstable dependency resolution) and proves auditability for customers/compliance.
|
||||
|
||||
---
|
||||
|
||||
## The metric
|
||||
|
||||
* **Per‑scan:** `replay_match = 1` if SBOM/VEX/findings + hashes are identical; else `0`.
|
||||
* **Windowed:** `Replay Fidelity = (Σ replay_match) / (# historical replays in window)`.
|
||||
* **Breakdown:** also track by scanner, language, image base, feed version, and environment.
|
||||
|
||||
---
|
||||
|
||||
## What must be captured in the scan manifest
|
||||
|
||||
* Exact source refs (image digest / repo SHA), container layers’ digests
|
||||
* Scanner build ID + config (flags, rules, lattice/policy sets, seeds)
|
||||
* Feed snapshots (CVE DB, OVAL, vendor advisories) as **content‑addressed** bundles
|
||||
* Normalization/version of SBOM schema (e.g., CycloneDX 1.6 vs SPDX 3.0.1)
|
||||
* Platform facts (OS/kernel, tz, locale), toolchain versions, clock policy
|
||||
|
||||
---
|
||||
|
||||
## Pass/Fail rules you can ship
|
||||
|
||||
* **Green:** Fidelity ≥ 0.98 over 30 days, and no bucket < 0.95
|
||||
* **Warn:** Any bucket drops by ≥ 2% week‑over‑week
|
||||
* **Fail the pipeline:** If fidelity < 0.90 or any regulated project < 0.95
|
||||
|
||||
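Expressed as code, a minimal sketch of those gate rules; the thresholds are copied from the bullets above, while the enum, signature, and the treatment of the in‑between band are assumptions:

```csharp
public enum FidelityGate { Green, Warn, Fail }

public static class ReplayGate
{
    // fidelity/minBucket/weekOverWeekDrop/regulatedFidelity are fractions in [0, 1].
    public static FidelityGate Evaluate(
        double fidelity, double minBucket, double weekOverWeekDrop, double regulatedFidelity)
    {
        if (fidelity < 0.90 || regulatedFidelity < 0.95) return FidelityGate.Fail;  // block the pipeline
        if (weekOverWeekDrop >= 0.02)                    return FidelityGate.Warn;  // bucket regression
        if (fidelity >= 0.98 && minBucket >= 0.95)       return FidelityGate.Green; // healthy
        return FidelityGate.Warn; // anything between green and fail gets a warning
    }
}
```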
---
|
||||
|
||||
## Minimal replay harness (outline)
|
||||
|
||||
1. Pick N historical scans (e.g., last 200 or stratified by image language).
|
||||
2. Restore their **frozen** manifest (scanner binary, feed bundle, policy lattice, seeds).
|
||||
3. Re‑run in a pinned runtime (OCI digest, pinned kernel in VM, fixed TZ/locale).
|
||||
4. Compare artifacts: SBOM JSON, VEX JSON, findings list, evidence blobs → SHA‑256.
|
||||
5. Emit: pass/fail, diff summary, and the “cause” tag if mismatch (feed, policy, runtime, code).
|
||||
|
||||
---
|
||||
|
||||
## Dashboard (what to show)
|
||||
|
||||
* Fidelity % (30/90‑day) + sparkline
|
||||
* Top offenders (by language/scanner/policy set)
|
||||
* “Cause of mismatch” histogram (feed vs runtime vs code vs policy)
|
||||
* Click‑through: deterministic diff (e.g., which CVEs flipped and why)
|
||||
|
||||
---
|
||||
|
||||
## Quick wins for Stella Ops
|
||||
|
||||
* Treat **feeds as immutable snapshots** (content‑addressed tar.zst) and record their digest in each scan.
|
||||
* Run scanner in a **repro shell** (OCI image digest + fixed TZ/locale + no network).
|
||||
* Normalize SBOM/VEX (key order, whitespace, float precision) before hashing.
|
||||
* Add a `stella replay --from MANIFEST.json` command + nightly cron to sample replays.
|
||||
* Store `replay_result` rows; expose `/metrics` for Prometheus and a CI badge: `Replay Fidelity: 99.2%`.
|
||||
|
||||
Want me to draft the `stella replay` CLI spec and the DB table (DDL) you can drop into Postgres?
|
||||
Below is an extended “Replay Fidelity” design **plus a concrete development implementation plan** you can hand to engineering. I’m assuming Stella Ops is doing container/app security scans that output SBOM + findings (and optionally VEX), and uses vulnerability “feeds” and policy/lattice/rules.
|
||||
|
||||
---
|
||||
|
||||
## 1) Extend the concept: Replay Fidelity as a product capability
|
||||
|
||||
### 1.1 Fidelity levels (so you can be strict without being brittle)
|
||||
|
||||
Instead of a single yes/no, define **tiers** that you can report and gate on:
|
||||
|
||||
1. **Bitwise Fidelity (BF)**
|
||||
|
||||
* *Definition:* All primary artifacts (SBOM, findings, VEX, evidence) match **byte-for-byte** after canonicalization.
|
||||
* *Use:* strongest auditability, catch ordering/nondeterminism.
|
||||
|
||||
2. **Semantic Fidelity (SF)**
|
||||
|
||||
* *Definition:* The *meaning* matches even if formatting differs (e.g., key order, whitespace, timestamps).
|
||||
* *How:* compare normalized objects: same packages, versions, CVEs, fix versions, severities, policy verdicts.
|
||||
* *Use:* protects you from “cosmetic diffs” and helps triage.
|
||||
|
||||
3. **Policy Fidelity (PF)**
|
||||
|
||||
* *Definition:* Final policy decision (pass/fail + reason codes) matches.
|
||||
* *Use:* useful when outputs may evolve but governance outcome must remain stable.
|
||||
|
||||
**Recommended reporting:**
|
||||
|
||||
* Dashboard shows BF, SF, PF together.
|
||||
* Default engineering SLO: **BF ≥ 0.98**; compliance SLO: **BF ≥ 0.95** for regulated projects; PF should be ~1.0 unless policy changed intentionally.
|
||||
|
||||
---
|
||||
|
||||
### 1.2 “Why did it drift?”—Mismatch classification taxonomy
|
||||
|
||||
When a replay fails, auto-tag the cause so humans don’t diff JSON by hand.
|
||||
|
||||
**Primary mismatch classes**
|
||||
|
||||
* **Feed drift:** CVE/OVAL/vendor advisory snapshot differs.
|
||||
* **Policy drift:** policy/lattice/rules differ (or default rule set changed).
|
||||
* **Runtime drift:** base image / libc / kernel / locale / tz / CPU arch differences.
|
||||
* **Scanner drift:** scanner binary build differs or dependency versions changed.
|
||||
* **Nondeterminism:** ordering instability, concurrency race, unseeded RNG, time-based logic.
|
||||
* **External IO:** network calls, “latest” resolution, remote package registry changes.
|
||||
|
||||
**Output:** a `mismatch_reason` plus a short `diff_summary`.
|
||||
|
||||
---
|
||||
|
||||
### 1.3 Deterministic “scan envelope” design
|
||||
|
||||
A replay only works if the scan is fully specified.
|
||||
|
||||
**Scan envelope components**
|
||||
|
||||
* **Inputs:** image digest, repo commit SHA, build provenance, layers digests.
|
||||
* **Scanner:** scanner OCI image digest (or binary digest), config flags, feature toggles.
|
||||
* **Feeds:** content-addressed feed bundle digests (see §2.3).
|
||||
* **Policy/rules:** git commit SHA + content digest of compiled rules.
|
||||
* **Environment:** OS/arch, tz/locale, “clock mode”, network mode, CPU count.
|
||||
* **Normalization:** “canonicalization version” for SBOM/VEX/findings.
|
||||
|
||||
---
|
||||
|
||||
### 1.4 Canonicalization so “bitwise” is meaningful
|
||||
|
||||
To make BF achievable:
|
||||
|
||||
* Canonical JSON serialization (sorted keys, stable array ordering, normalized floats)
|
||||
* Strip/normalize volatile fields (timestamps, “scan_duration_ms”, hostnames)
|
||||
* Stable ordering for lists: packages sorted by `(purl, version)`, vulnerabilities by `(cve_id, affected_purl)`
|
||||
* Deterministic IDs: if you generate internal IDs, derive from stable hashes of content (not UUID4)
|
||||
|
||||
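A minimal canonicalize‑then‑hash sketch, assuming the volatile field names are known up front; stable array ordering (packages by purl/version, vulnerabilities by CVE) and float normalization are deliberately omitted here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

public static class CanonicalHash
{
    // Assumed list of volatile fields stripped before hashing.
    private static readonly HashSet<string> Volatile =
        new() { "timestamp", "scan_duration_ms", "hostname" };

    // Canonicalize (drop volatile fields, sort object keys), then SHA-256 the bytes.
    public static string Sha256(JsonNode node)
    {
        var canonical = Canonicalize(node)!;
        var bytes = Encoding.UTF8.GetBytes(canonical.ToJsonString());
        return Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
    }

    private static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        null => null,
        JsonObject obj => new JsonObject(
            obj.Where(p => !Volatile.Contains(p.Key))
               .OrderBy(p => p.Key, StringComparer.Ordinal)
               .Select(p => KeyValuePair.Create(p.Key, Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(Canonicalize).ToArray()),
        _ => JsonNode.Parse(node.ToJsonString()) // clone leaf values unchanged
    };
}
```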
---
|
||||
|
||||
### 1.5 Sampling strategy
|
||||
|
||||
You don’t need to replay everything.
|
||||
|
||||
**Nightly sample:** stratified by:
|
||||
|
||||
* language ecosystem (npm, pip, maven, go, rust…)
|
||||
* scanner engine
|
||||
* base OS
|
||||
* “regulatory tier”
|
||||
* image size/complexity
|
||||
|
||||
**Plus:** always replay “golden canaries” (a fixed set of reference images) after every scanner release and feed ingestion pipeline change.
|
||||
|
||||
---
|
||||
|
||||
## 2) Technical architecture blueprint
|
||||
|
||||
### 2.1 System components
|
||||
|
||||
1. **Manifest Writer (in the scan pipeline)**
|
||||
|
||||
* Produces `ScanManifest v1` JSON
|
||||
* Records all digests and versions
|
||||
|
||||
2. **Artifact Store**
|
||||
|
||||
* Stores SBOM, findings, VEX, evidence blobs
|
||||
* Stores canonical hashes for BF checks
|
||||
|
||||
3. **Feed Snapshotter**
|
||||
|
||||
* Periodically builds immutable feed bundles
|
||||
* Content-addressed (digest-keyed)
|
||||
* Stores metadata (source URLs, generation timestamp, signature)
|
||||
|
||||
4. **Replay Orchestrator**
|
||||
|
||||
* Chooses historical scans to replay
|
||||
* Launches “replay executor” jobs
|
||||
|
||||
5. **Replay Executor**
|
||||
|
||||
* Runs scanner in pinned container image
|
||||
* Network off, tz fixed, clock policy applied
|
||||
* Produces new artifacts + hashes
|
||||
|
||||
6. **Diff & Scoring Engine**
|
||||
|
||||
* Computes BF/SF/PF
|
||||
* Generates mismatch classification + diff summary
|
||||
|
||||
7. **Metrics + UI Dashboard**
|
||||
|
||||
* Prometheus metrics
|
||||
* UI for drill-down diffs
|
||||
|
||||
---
|
||||
|
||||
### 2.2 Data model (Postgres-friendly)
|
||||
|
||||
**Core tables**
|
||||
|
||||
* `scan_manifests`
|
||||
|
||||
* `scan_id (pk)`
|
||||
* `manifest_json`
|
||||
* `manifest_sha256`
|
||||
* `created_at`
|
||||
* `scan_artifacts`
|
||||
|
||||
* `scan_id (fk)`
|
||||
* `artifact_type` (sbom|findings|vex|evidence)
|
||||
* `artifact_uri`
|
||||
* `canonical_sha256`
|
||||
* `schema_version`
|
||||
* `feed_snapshots`
|
||||
|
||||
* `feed_digest (pk)`
|
||||
* `bundle_uri`
|
||||
* `sources_json`
|
||||
* `generated_at`
|
||||
* `signature`
|
||||
* `replay_runs`
|
||||
|
||||
* `replay_id (pk)`
|
||||
* `original_scan_id (fk)`
|
||||
* `status` (queued|running|passed|failed)
|
||||
* `bf_match bool`, `sf_match bool`, `pf_match bool`
|
||||
* `mismatch_reason`
|
||||
* `diff_summary_json`
|
||||
* `started_at`, `finished_at`
|
||||
* `executor_env_json` (arch, tz, cpu, image digest)
|
||||
|
||||
**Indexes**
|
||||
|
||||
* `(created_at)` for sampling windows
|
||||
* `(mismatch_reason, finished_at)` for triage
|
||||
* `(scanner_version, ecosystem)` for breakdown dashboards
|
||||
|
||||
---
|
||||
|
||||
### 2.3 Feed Snapshotting (the key to long-term replay)
|
||||
|
||||
**Feed bundle format**
|
||||
|
||||
* `feeds/<source>/<date>/...` inside a tar.zst
|
||||
* manifest file inside bundle: `feed_bundle_manifest.json` containing:
|
||||
|
||||
* source URLs
|
||||
* retrieval commit/etag (if any)
|
||||
* file hashes
|
||||
* generated_by version
|
||||
|
||||
**Content addressing**
|
||||
|
||||
* Digest of the entire bundle (`sha256(tar.zst)`) is the reference.
|
||||
* Scans record only the digest + URI.
|
||||
|
||||
**Immutability**
|
||||
|
||||
* Store bundles in object storage with WORM / retention if you need compliance.
|
||||
|
||||
---
|
||||
|
||||
### 2.4 Replay execution sandbox
|
||||
|
||||
For determinism, enforce:
|
||||
|
||||
* **No network** (K8s NetworkPolicy, firewall rules, or container runtime flags)
|
||||
* **Fixed TZ/locale**
|
||||
* **Pinned container image digest**
|
||||
* **Clock policy**
|
||||
|
||||
* Either “real time but recorded” or “frozen time at original scan timestamp”
|
||||
* If scanner logic uses current date for severity windows, freeze time
|
||||
|
||||
---
|
||||
|
||||
## 3) Development implementation plan
|
||||
|
||||
I’ll lay this out as **workstreams** + **a sprinted plan**. You can compress/expand depending on team size.
|
||||
|
||||
### Workstream A — Scan Manifest & Canonical Artifacts
|
||||
|
||||
**Goal:** every scan is replayable on paper, even before replays run.
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* `ScanManifest v1` schema + writer integrated into scan pipeline
|
||||
* Canonicalization library + canonical hashing for all artifacts
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Every scan stores: input digests, scanner digest, policy digest, feed digest placeholders
|
||||
* Artifact hashes are stable across repeated runs in the same environment
|
||||
|
||||
---
|
||||
|
||||
### Workstream B — Feed Snapshotting & Policy Versioning
|
||||
|
||||
**Goal:** eliminate “feed drift” by pinning immutable inputs.
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Feed bundle builder + signer + uploader
|
||||
* Policy/rules bundler (compiled rules bundle, digest recorded)
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* New scans reference feed bundle digests (not “latest”)
|
||||
* A scan can be re-run with the same feed bundle and policy bundle
|
||||
|
||||
---
|
||||
|
||||
### Workstream C — Replay Runner & Diff Engine
|
||||
|
||||
**Goal:** execute historical scans and score BF/SF/PF with actionable diffs.
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* `stella replay --from manifest.json`
|
||||
* Orchestrator job to schedule replays
|
||||
* Diff engine + mismatch classifier
|
||||
* Storage of replay results
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Replay produces deterministic artifacts in a pinned environment
|
||||
* Dashboard/CLI shows BF/SF/PF + diff summary for failures
|
||||
|
||||
---
|
||||
|
||||
### Workstream D — Observability, Dashboard, and CI Gates
|
||||
|
||||
**Goal:** make fidelity visible and enforceable.
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Prometheus metrics: `replay_fidelity_bf`, `replay_fidelity_sf`, `replay_fidelity_pf`
|
||||
* Breakdown labels (scanner, ecosystem, policy_set, base_os)
|
||||
* Alerts for drop thresholds
|
||||
* CI gate option: “block release if BF < threshold on canary set”
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Engineering can see drift within 24h
|
||||
* Releases are blocked when fidelity regressions occur
|
||||
|
||||
---
|
||||
|
||||
## 4) Suggested sprint plan with concrete tasks
|
||||
|
||||
### Sprint 0 — Design lock + baseline
|
||||
|
||||
**Tasks**
|
||||
|
||||
* Define manifest schema: `ScanManifest v1` fields + versioning rules
|
||||
* Decide canonicalization rules (what is normalized vs preserved)
|
||||
* Choose initial “golden canary” scan set (10–20 representative targets)
|
||||
* Add “replay-fidelity” epic with ownership & SLIs/SLOs
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* Approved schema + canonicalization spec
|
||||
* Canary set stored and tagged
|
||||
|
||||
---
|
||||
|
||||
### Sprint 1 — Manifest writer + artifact hashing (MVP)
|
||||
|
||||
**Tasks**
|
||||
|
||||
* Implement manifest writer in scan pipeline
|
||||
* Store `manifest_json` + `manifest_sha256`
|
||||
* Implement canonicalization + hashing for:
|
||||
|
||||
* findings list (sorted)
|
||||
* SBOM (normalized)
|
||||
* VEX (if present)
|
||||
* Persist canonical hashes in `scan_artifacts`
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* Two identical scans in the same environment yield identical artifact hashes
|
||||
* A “manifest export” endpoint/CLI works:
|
||||
|
||||
* `stella scan --emit-manifest out.json`
|
||||
|
||||
---
|
||||
|
||||
### Sprint 2 — Feed snapshotter + policy bundling
|
||||
|
||||
**Tasks**
|
||||
|
||||
* Build feed bundler job:
|
||||
|
||||
* pull raw sources
|
||||
* normalize layout
|
||||
* generate `feed_bundle_manifest.json`
|
||||
* tar.zst + sha256
|
||||
* upload + record in `feed_snapshots`
|
||||
* Update scan pipeline:
|
||||
|
||||
* resolve feed bundle digest at scan start
|
||||
* record digest in scan manifest
|
||||
* Bundle policy/lattice:
|
||||
|
||||
* compile rules into an immutable artifact
|
||||
* record policy bundle digest in manifest
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* Scans reference immutable feed + policy digests
|
||||
* You can fetch feed bundle by digest and reproduce the same feed inputs
|
||||
|
||||
---
|
||||
|
||||
### Sprint 3 — Replay executor + “no network” sandbox
|
||||
|
||||
**Tasks**
|
||||
|
||||
* Create replay container image / runtime wrapper
|
||||
* Implement `stella replay --from MANIFEST.json`
|
||||
|
||||
* pulls scanner image by digest
|
||||
* mounts feed bundle + policy bundle
|
||||
* runs in network-off mode
|
||||
* applies tz/locale + clock mode
|
||||
* Store replay outputs as artifacts (`replay_scan_id` or `replay_id` linkage)
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* Replay runs end-to-end for canary scans
|
||||
* Deterministic runtime controls verified (no DNS egress, fixed tz)
|
||||
|
||||
---
|
||||
|
||||
### Sprint 4 — Diff engine + mismatch classification
|
||||
|
||||
**Tasks**
|
||||
|
||||
* Implement BF compare (canonical hashes)
|
||||
* Implement SF compare (semantic JSON/object comparison)
|
||||
* Implement PF compare (policy decision equivalence)
|
||||
* Implement mismatch classification rules (see the sketch after this task list):
|
||||
|
||||
* if feed digest differs → feed drift
|
||||
* if scanner digest differs → scanner drift
|
||||
* if environment differs → runtime drift
|
||||
* else → nondeterminism (with sub-tags for ordering/time/RNG)
|
||||
* Generate `diff_summary_json`:
|
||||
|
||||
* top N changed CVEs
|
||||
* packages added/removed
|
||||
* policy verdict changes
|
||||
|
||||
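A minimal classifier sketch following those rules in priority order (policy drift added from the taxonomy in §1.2); the record of digests pulled from the two manifests is an assumption:

```csharp
// Assumed digest fields pulled from the original and replayed ScanManifest v1 documents.
public record ManifestFacts(string FeedDigest, string ScannerDigest, string PolicyDigest, string EnvFingerprint);

public static class MismatchClassifier
{
    // Applied only when a replay did not match; rules are evaluated in priority order.
    public static string Classify(ManifestFacts original, ManifestFacts replay)
    {
        if (original.FeedDigest     != replay.FeedDigest)     return "feed_drift";
        if (original.ScannerDigest  != replay.ScannerDigest)  return "scanner_drift";
        if (original.PolicyDigest   != replay.PolicyDigest)   return "policy_drift";
        if (original.EnvFingerprint != replay.EnvFingerprint) return "runtime_drift";
        return "nondeterminism"; // same pinned inputs, different output: ordering/time/RNG
    }
}
```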
**Exit criteria**
|
||||
|
||||
* Every failed replay has a cause tag and a diff summary that’s useful in <2 minutes
|
||||
* Engineers can reproduce failures locally with the manifest
|
||||
|
||||
---
|
||||
|
||||
### Sprint 5 — Dashboard + alerts + CI gate
|
||||
|
||||
**Tasks**
|
||||
|
||||
* Expose Prometheus metrics from replay service
|
||||
* Build dashboard:
|
||||
|
||||
* BF/SF/PF trends
|
||||
* breakdown by ecosystem/scanner/policy
|
||||
* mismatch cause histogram
|
||||
* Add alerting rules (drop threshold, bucket regression)
|
||||
* Add CI gate mode:
|
||||
|
||||
* “run replays on canary set for this release candidate”
|
||||
* block merge if BF < target
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* Fidelity visible to leadership and engineering
|
||||
* Release process is protected by canary replays
|
||||
|
||||
---
|
||||
|
||||
### Sprint 6 — Hardening + compliance polish
|
||||
|
||||
**Tasks**
|
||||
|
||||
* Backward compatible manifest upgrades:
|
||||
|
||||
* `manifest_version` bump rules
|
||||
* migration support
|
||||
* Artifact signing / integrity:
|
||||
|
||||
* sign manifest hash
|
||||
* optional transparency log later
|
||||
* Storage & retention policies (cost controls)
|
||||
* Runbook + oncall playbook
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* Audit story is complete: “show me exactly how scan X was produced”
|
||||
* Operational load is manageable and cost-bounded
|
||||
|
||||
---
|
||||
|
||||
## 5) Engineering specs you can start implementing immediately
|
||||
|
||||
### 5.1 `ScanManifest v1` skeleton (example)
|
||||
|
||||
```json
|
||||
{
|
||||
"manifest_version": "1.0",
|
||||
"scan_id": "scan_123",
|
||||
"created_at": "2025-12-12T10:15:30Z",
|
||||
|
||||
"input": {
|
||||
"type": "oci_image",
|
||||
"image_ref": "registry/app@sha256:...",
|
||||
"layers": ["sha256:...", "sha256:..."],
|
||||
"source_provenance": {"repo_sha": "abc123", "build_id": "ci-999"}
|
||||
},
|
||||
|
||||
"scanner": {
|
||||
"engine": "stella",
|
||||
"scanner_image_digest": "sha256:...",
|
||||
"scanner_version": "2025.12.0",
|
||||
"config_digest": "sha256:...",
|
||||
"flags": ["--deep", "--vex"]
|
||||
},
|
||||
|
||||
"feeds": {
|
||||
"vuln_feed_bundle_digest": "sha256:...",
|
||||
"license_db_digest": "sha256:..."
|
||||
},
|
||||
|
||||
"policy": {
|
||||
"policy_bundle_digest": "sha256:...",
|
||||
"policy_set": "prod-default"
|
||||
},
|
||||
|
||||
"environment": {
|
||||
"arch": "amd64",
|
||||
"os": "linux",
|
||||
"tz": "UTC",
|
||||
"locale": "C",
|
||||
"network": "disabled",
|
||||
"clock_mode": "frozen",
|
||||
"clock_value": "2025-12-12T10:15:30Z"
|
||||
},
|
||||
|
||||
"normalization": {
|
||||
"canonicalizer_version": "1.2.0",
|
||||
"sbom_schema": "cyclonedx-1.6",
|
||||
"vex_schema": "cyclonedx-vex-1.0"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 5.2 CLI spec (minimal)
|
||||
|
||||
* `stella scan ... --emit-manifest MANIFEST.json --emit-artifacts-dir out/`
|
||||
* `stella replay --from MANIFEST.json --out-dir replay_out/`
|
||||
* `stella diff --a out/ --b replay_out/ --mode bf|sf|pf --json`
|
||||
|
||||
---
|
||||
|
||||
## 6) Testing strategy (to prevent determinism regressions)
|
||||
|
||||
### Unit tests
|
||||
|
||||
* Canonicalization: same object → same bytes
|
||||
* Sorting stability: randomized input order → stable output
|
||||
* Hash determinism
|
||||
|
||||
### Integration tests
|
||||
|
||||
* Golden canaries:
|
||||
|
||||
* run scan twice in same runner → BF match
|
||||
* replay from manifest → BF match
|
||||
* “Network leak” test:
|
||||
|
||||
* DNS requests must be zero
|
||||
* “Clock leak” test:
|
||||
|
||||
* freeze time; ensure outputs do not include real timestamps
|
||||
|
||||
### Chaos tests
|
||||
|
||||
* Vary CPU count, run concurrency, run order → still BF match
|
||||
* Randomized scheduling / thread interleavings to find races
|
||||
|
||||
---
|
||||
|
||||
## 7) Operational policies (so it stays useful)
|
||||
|
||||
### Retention & cost controls
|
||||
|
||||
* Keep full artifacts for regulated scans (e.g., 1–7 years)
|
||||
* For non-regulated:
|
||||
|
||||
* keep manifests + canonical hashes long-term
|
||||
* expire heavy evidence blobs after N days
|
||||
* Compress large artifacts and dedupe by digest
|
||||
|
||||
### Alerting examples
|
||||
|
||||
* BF drops by ≥2% week-over-week (any major bucket) → warn
|
||||
* BF < 0.90 overall or regulated BF < 0.95 → page / block release
|
||||
|
||||
### Triage workflow
|
||||
|
||||
* Failed replay auto-creates a ticket with:
|
||||
|
||||
* manifest link
|
||||
* mismatch_reason
|
||||
* diff_summary
|
||||
* reproduction command
|
||||
|
||||
---
|
||||
|
||||
## 8) What “done” looks like (definition of success)
|
||||
|
||||
* Any customer/auditor can pick a scan from 6 months ago and you can:
|
||||
|
||||
1. retrieve manifest + feed bundle + policy bundle by digest
|
||||
2. replay in a pinned sandbox
|
||||
3. show BF/SF/PF results and diffs
|
||||
* Engineering sees drift quickly and can attribute it to feed vs scanner vs runtime.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can also provide:
|
||||
|
||||
* a **Postgres DDL** for the tables above,
|
||||
* a **Prometheus metrics contract** (names + labels + example queries),
|
||||
* and a **diff_summary_json schema** that supports a UI “diff view” without reprocessing artifacts.
|
||||
@@ -0,0 +1,840 @@
|
||||
Here’s a quick, plain‑English idea you can use right away: **not all code diffs are equal**—some actually change what’s *reachable* at runtime (and thus security posture), while others just refactor internals. A “**Smart‑Diff**” pipeline flags only the diffs that open or close attack paths by combining (1) call‑stack traces, (2) dependency graphs, and (3) dataflow.
|
||||
|
||||
---
|
||||
|
||||
### Why this matters (background)
|
||||
|
||||
* Text diffs ≠ behavior diffs. A rename or refactor can look big in Git but do nothing to reachable flows from external entry points (HTTP, gRPC, CLI, message consumers).
|
||||
* Security triage gets noisy because scanners attach CVEs to all present packages, not to the code paths you can actually hit.
|
||||
* **Dataflow‑aware diffs** shrink noise and make VEX generation honest: “vuln present but **not exploitable** because the sink is unreachable from any policy‑defined entrypoint.”
|
||||
|
||||
---
|
||||
|
||||
### Minimal architecture (fits Stella Ops)
|
||||
|
||||
1. **Entrypoint map** (per service): controllers, handlers, consumers.
|
||||
2. **Call graph + dataflow** (per commit): Roslyn for C#, `golang.org/x/tools/go/callgraph` for Go, plus taint rules (source→sink).
|
||||
3. **Reachability cache** keyed by (commit, entrypoint, package@version).
|
||||
4. **Smart‑Diff** = `reachable_paths(commit_B) – reachable_paths(commit_A)` (a set‑difference sketch follows this list).
|
||||
|
||||
* If a path to a sensitive sink is newly reachable → **High**.
|
||||
* If a path disappears → auto‑generate **VEX “not affected (no reachable path)”**.
|
||||
|
||||
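As a set operation, the diff in step 4 could look like the sketch below, with paths keyed by a stable `"EP -> sink"` string as in the Roslyn seed further down:

```csharp
using System.Collections.Generic;

// Newly reachable paths raise risk; removed paths can back a "not affected" VEX claim.
public record SmartDiffResult(IReadOnlySet<string> NewlyReachable, IReadOnlySet<string> Removed);

public static class SmartDiffSet
{
    public static SmartDiffResult Diff(ISet<string> pathsAtBase, ISet<string> pathsAtHead)
    {
        var added = new HashSet<string>(pathsAtHead);
        added.ExceptWith(pathsAtBase);     // reachable_paths(B) - reachable_paths(A)

        var removed = new HashSet<string>(pathsAtBase);
        removed.ExceptWith(pathsAtHead);   // paths that disappeared -> candidate auto-VEX

        return new SmartDiffResult(added, removed);
    }
}
```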
---
|
||||
|
||||
### Tiny working seeds
|
||||
|
||||
**C# (.NET 10) — Roslyn skeleton to diff call‑reachability**
|
||||
|
||||
```csharp
// SmartDiff.csproj targets net10.0
// NuGet: Microsoft.CodeAnalysis.Workspaces.MSBuild, Microsoft.Build.Locator
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.FindSymbols;
using Microsoft.CodeAnalysis.MSBuild;

public static class SmartDiff
{
    public static async Task<HashSet<string>> ReachableSinks(string solutionPath, string[] entrypoints, string[] sinks)
    {
        MSBuildLocator.RegisterDefaults();            // call once per process before opening the solution
        using var workspace = MSBuildWorkspace.Create();
        var solution = await workspace.OpenSolutionAsync(solutionPath);
        var index = new HashSet<string>();

        foreach (var proj in solution.Projects)
        {
            var comp = await proj.GetCompilationAsync();
            if (comp is null) continue;

            // Resolve entrypoints & sinks by fully-qualified symbol name
            var allMethods = comp.GlobalNamespace.GetMembers().SelectMany(Descend)
                .OfType<IMethodSymbol>().ToList();
            var epSymbols = allMethods.Where(m => entrypoints.Contains(m.ToDisplayString())).ToList();
            var sinkSymbols = allMethods.Where(m => sinks.Contains(m.ToDisplayString())).ToList();

            foreach (var ep in epSymbols)
            foreach (var sink in sinkSymbols)
            {
                // Heuristic reachability: "sink is referenced somewhere" over-approximates
                // an EP -> sink path; replace with a real call-graph walk from `ep`.
                var refs = await SymbolFinder.FindReferencesAsync(sink, solution);
                if (refs.SelectMany(r => r.Locations).Any())
                    index.Add($"{ep.ToDisplayString()} -> {sink.ToDisplayString()}");
            }
        }
        return index;

        static IEnumerable<ISymbol> Descend(INamespaceOrTypeSymbol sym)
        {
            foreach (var m in sym.GetMembers())
            {
                yield return m;
                if (m is INamespaceOrTypeSymbol nt) foreach (var x in Descend(nt)) yield return x;
            }
        }
    }
}
```
|
||||
|
||||
**Go — SSA & callgraph seed**
|
||||
|
||||
```go
// go.mod: require golang.org/x/tools latest
package main

import (
	"fmt"
	"log"

	"golang.org/x/tools/go/callgraph/cha"
	"golang.org/x/tools/go/packages"
	"golang.org/x/tools/go/ssa"
	"golang.org/x/tools/go/ssa/ssautil"
)

func main() {
	// Load all packages with full syntax and type info.
	cfg := &packages.Config{Mode: packages.LoadAllSyntax, Tests: false}
	pkgs, err := packages.Load(cfg, "./...")
	if err != nil {
		log.Fatal(err)
	}

	// Build SSA form for every loaded package.
	prog, _ := ssautil.AllPackages(pkgs, ssa.BuilderMode(0))
	prog.Build()

	// Cheap call graph via Class Hierarchy Analysis; swap in pointer analysis later for precision.
	cg := cha.CallGraph(prog)
	// TODO: map entrypoints & sinks, then walk cg from EPs to sinks
	fmt.Println("nodes:", len(cg.Nodes))
}
```
|
||||
|
||||
---
|
||||
|
||||
### How to use it in your pipeline (fast win)
|
||||
|
||||
* **Pre‑merge job**:
|
||||
|
||||
1. Build call graph for `HEAD` and `HEAD^`.
|
||||
2. Compute Smart‑Diff.
|
||||
3. If any *new* EP→sink path appears, fail with a short, proof‑linked note:
|
||||
“New reachable path: `POST /Invoices -> PdfExporter.Save(string path)` (writes outside sandbox).”
|
||||
* **Post‑scan VEX**:
|
||||
|
||||
* For each CVE on a package, mark **Affected** only if any EP can reach a symbol that uses that package’s vulnerable surface.
|
||||
|
||||
---
|
||||
|
||||
### Evidence to show in the UI
|
||||
|
||||
* “**Path card**”: EP → … → Sink, with file:line hop‑list and commit hash.
|
||||
* “**What changed**”: before/after path diff (green removed, red added).
|
||||
* “**Why it matters**”: sink classification (network write, file write, deserialization, SQL, crypto).
|
||||
|
||||
---
|
||||
|
||||
### Developer checklist (Stella Ops style)
|
||||
|
||||
* [ ] Define entrypoints per service (attribute or YAML).
|
||||
* [ ] Define sink taxonomy (FS, NET, DESER, SQL, CRYPTO).
|
||||
* [ ] Implement language adapters: `.NET (Roslyn)`, `Go (SSA)`, later `Java (Soot/WALA)`.
|
||||
* [ ] Add a **ReachabilityCache** (Postgres table keyed by commit+lang+service).
|
||||
* [ ] Wire a `SmartDiffJob` in CI; emit SARIF + CycloneDX `vulnerability-assertions` extension or OpenVEX.
|
||||
* [ ] Gate merges on **newly‑reachable sensitive sinks**; auto‑VEX when paths disappear.
|
||||
|
||||
If you want, I can turn this into a small repo scaffold (Roslyn + Go adapters, Postgres schema, a GitLab/GitHub pipeline, and a minimal UI “path card”).
|
||||
Below is a concrete **development implementation plan** to take the “Smart‑Diff” idea (reachability + dataflow + dependency/vuln context) into a shippable product integrated into your pipeline (Stella Ops style). I’ll assume the initial languages are **.NET (C#)** and **Go**, and the initial goal is **PR gating + VEX automation** with strong evidence (paths + file/line hops).
|
||||
|
||||
---
|
||||
|
||||
## 1) Product definition
|
||||
|
||||
### Problem you’re solving
|
||||
|
||||
Security noise comes from:
|
||||
|
||||
* “Vuln exists in dependency” ≠ “vuln exploitable from any entrypoint”
|
||||
* Git diffs look big even when behavior is unchanged
|
||||
* Teams struggle to triage “is this change actually risky?”
|
||||
|
||||
### What Smart‑Diff should do (core behavior)
|
||||
|
||||
Given **base commit A** and **head commit B**:
|
||||
|
||||
1. Identify **entrypoints** (web handlers, RPC methods, message consumers, CLI commands).
|
||||
2. Identify **sinks** (file write, command exec, SQL, SSRF, deserialization, crypto misuse, templating, etc.).
|
||||
3. Compute **reachable paths** from entrypoints → sinks (call graph + dataflow/taint).
|
||||
4. Emit **Smart‑Diff**:
|
||||
|
||||
* **Newly reachable** EP→sink paths (risk ↑)
|
||||
* **Removed** EP→sink paths (risk ↓)
|
||||
* **Changed** paths (same sink but different sanitization/guards)
|
||||
5. Attach **dependency vulnerability context**:
|
||||
|
||||
* If a vulnerable API surface is reachable (or data reaches it), mark “affected/exploitable”
|
||||
* Otherwise generate **VEX**: “not affected” / “not exploitable” with evidence
|
||||
|
||||
### MVP definition (minimum shippable)
|
||||
|
||||
A PR check that:
|
||||
|
||||
* Flags **new** reachable paths to a small set of high‑risk sinks (e.g., command exec, unsafe deserialization, filesystem write, SSRF/network dial, raw SQL).
|
||||
* Produces:
|
||||
|
||||
* SARIF report (for code scanning UI)
|
||||
* JSON artifact containing proof paths (EP → … → sink with file:line)
|
||||
* Optional VEX statement for dependency vulnerabilities (if you already have an SCA feed)
|
||||
|
||||
---
|
||||
|
||||
## 2) Architecture you can actually build
|
||||
|
||||
### High‑level components
|
||||
|
||||
1. **Policy & Taxonomy Service**
|
||||
|
||||
* Defines entrypoints, sources, sinks, sanitizers, confidence rules
|
||||
* Versioned and centrally managed (but supports repo overrides)
|
||||
|
||||
2. **Analyzer Workers (language adapters)**
|
||||
|
||||
* .NET analyzer (Roslyn + control flow)
|
||||
* Go analyzer (SSA + callgraph)
|
||||
* Outputs standardized IR (Intermediate Representation)
|
||||
|
||||
3. **Graph Store + Reachability Engine**
|
||||
|
||||
* Stores symbol nodes + call edges + dataflow edges
|
||||
* Computes reachable sinks per entrypoint
|
||||
* Computes diff between commits A and B
|
||||
|
||||
4. **Vulnerability Mapper + VEX Generator**
|
||||
|
||||
* Maps vulnerable packages/functions → “surfaces”
|
||||
* Joins with reachability results
|
||||
* Emits OpenVEX (or CycloneDX VEX) with evidence links
|
||||
|
||||
5. **CI/PR Integrations**
|
||||
|
||||
* CLI that runs in CI
|
||||
* Optional server mode (cache + incremental processing)
|
||||
|
||||
6. **UI/API**
|
||||
|
||||
* Path cards: “what changed”, “why it matters”, “proof”
|
||||
* Filters by sink class, confidence, service, entrypoint
|
||||
|
||||
### Data contracts (standardized IR)
|
||||
|
||||
Make every analyzer output the same shapes so the rest of the pipeline is language‑agnostic:
|
||||
|
||||
* **Symbols**
|
||||
|
||||
* `symbol_id`: stable hash of (lang, module, fully-qualified name, signature) (see the hashing sketch after this list)
|
||||
* metadata: file, line ranges, kind (method/function), accessibility
|
||||
|
||||
* **Edges**
|
||||
|
||||
* Call edge: `caller_symbol_id -> callee_symbol_id`
|
||||
* Dataflow edge: `source_symbol_id -> sink_symbol_id` with variable/parameter traces
|
||||
* Edge metadata: type, confidence, reason (static, reflection guess, interface dispatch, etc.)
|
||||
|
||||
* **Entrypoints / Sources / Sinks**
|
||||
|
||||
* entrypoint: (symbol_id, route/topic/command metadata)
|
||||
* sink: (symbol_id, sink_type, severity, cwe mapping optional)
|
||||
|
||||
* **Paths**
|
||||
|
||||
* `entrypoint -> ... -> sink`
|
||||
* hop list: symbol_id + file:line, plus “dataflow step evidence” when relevant
|
||||
|
||||
---
|
||||
|
||||
## 3) Workstreams and deliverables
|
||||
|
||||
### Workstream A — Policy, taxonomy, configuration
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* `smartdiff.policy.yaml` schema and validator
|
||||
* A default sink taxonomy:
|
||||
|
||||
* `CMD_EXEC`, `UNSAFE_DESER`, `SQL_RAW`, `SSRF`, `FILE_WRITE`, `PATH_TRAVERSAL`, `TEMPLATE_INJECTION`, `CRYPTO_WEAK`, `AUTHZ_BYPASS` (expand later)
|
||||
* Initial sanitizer patterns:
|
||||
|
||||
* For example: parameter validation, safe deserialization wrappers, ORM parameterized APIs, path normalization, allowlists
|
||||
|
||||
**Implementation notes**
|
||||
|
||||
* Start strict and small: 10–20 sinks, 10 sources, 10 sanitizers.
|
||||
* Provide repo-level overrides:
|
||||
|
||||
* `smartdiff.policy.yaml` in repo root
|
||||
* Central policies referenced by version tag
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* A service can onboard by configuring:
|
||||
|
||||
* entrypoint discovery mode (auto + manual)
|
||||
* sink classes to enforce
|
||||
* severity threshold to fail PR
|
||||
|
||||
---
|
||||
|
||||
### Workstream B — .NET analyzer (Roslyn)
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Build pipeline that produces:
|
||||
|
||||
* call graph (methods and invocations)
|
||||
* basic control-flow guards for reachability (optional for MVP)
|
||||
* taint propagation for common patterns (MVP: parameter → sink)
|
||||
* Entry point discovery for:
|
||||
|
||||
* ASP.NET controllers (`[HttpGet]`, `[HttpPost]`)
|
||||
* Minimal APIs (`MapGet/MapPost`)
|
||||
* gRPC service methods
|
||||
* message consumers (configurable attributes/interfaces)
|
||||
|
||||
**Implementation notes (practical path)**
|
||||
|
||||
* MVP static callgraph:
|
||||
|
||||
* Use Roslyn semantic model to resolve invocation targets (a sketch follows this list)
|
||||
* For virtual/interface calls: conservative resolution to possible implementations within the compilation
|
||||
* MVP taint:
|
||||
|
||||
* “Sources”: request params/body, headers, query string, message payloads
|
||||
* “Sinks”: wrappers around `Process.Start`, `SqlCommand`, `File.WriteAllText`, `HttpClient.Send`, deserializers, etc.
|
||||
* Propagate taint across:
|
||||
|
||||
* parameter → local → argument
|
||||
* return values
|
||||
* simple assignments and concatenations (heuristic)
|
||||
* Confidence scoring:
|
||||
|
||||
* Direct static call resolution: high
|
||||
* Reflection/dynamic: low (flag separately)
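
A minimal Roslyn sketch of the direct-call resolution step (assumes you already have a `Compilation`; virtual/interface dispatch expansion and reflection handling are intentionally omitted here):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class CallEdgeExtractor
{
    // Resolve each invocation to its target method symbol via the semantic model.
    // Dynamic/unresolved targets are skipped here; a real analyzer would record
    // them as separate low-confidence edges instead of dropping them.
    public static IEnumerable<(IMethodSymbol Caller, IMethodSymbol Callee)> ResolveCallEdges(Compilation compilation)
    {
        foreach (var tree in compilation.SyntaxTrees)
        {
            var model = compilation.GetSemanticModel(tree);
            foreach (var invocation in tree.GetRoot().DescendantNodes().OfType<InvocationExpressionSyntax>())
            {
                var callee = model.GetSymbolInfo(invocation).Symbol as IMethodSymbol;
                var caller = model.GetEnclosingSymbol(invocation.SpanStart) as IMethodSymbol;
                if (caller is not null && callee is not null)
                    yield return (caller, callee);
            }
        }
    }
}
```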
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* On a demo ASP.NET service, if a PR adds:
|
||||
|
||||
* `HttpPost /upload` → `File.WriteAllBytes(userPath, ...)`
|
||||
Smart‑Diff flags **new EP→FILE_WRITE path** and shows hops with file/line.
|
||||
|
||||
---
|
||||
|
||||
### Workstream C — Go analyzer (SSA)
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* SSA build + callgraph extraction
|
||||
* Entrypoint discovery for:
|
||||
|
||||
* `net/http` handlers
|
||||
* common routers (Gin/Echo/Chi) via adapter rules
|
||||
* gRPC methods
|
||||
* consumers (Kafka/NATS/etc.) by config
|
||||
|
||||
**Implementation notes**
|
||||
|
||||
* Use `golang.org/x/tools/go/packages` + `ssa` build
|
||||
* Callgraph:
|
||||
|
||||
* start with CHA (Class Hierarchy Analysis) for speed
|
||||
* later add pointer analysis for precision on interfaces
|
||||
* Taint:
|
||||
|
||||
* sources: `http.Request`, router params, message payloads
|
||||
* sinks: `os/exec`, `database/sql` raw query, file I/O, `net/http` outbound, unsafe deserialization libs
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* A PR that adds `exec.Command(req.FormValue("cmd"))` becomes a **new EP→CMD_EXEC** finding.
|
||||
|
||||
---
|
||||
|
||||
### Workstream D — Graph store + reachability computation
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Schema in Postgres (recommended first) for:
|
||||
|
||||
* commits, services, languages
|
||||
* symbols, edges, entrypoints, sinks
|
||||
* computed reachable “facts” (entrypoint→sink with shortest path(s))
|
||||
* Reachability engine:
|
||||
|
||||
* BFS/DFS per entrypoint with early cutoffs (a BFS sketch follows this list)
|
||||
* path reconstruction storage (store predecessor map or store k-shortest paths)
|
||||
|
||||
**Implementation notes**
|
||||
|
||||
* Don’t start with a graph DB unless you must.
|
||||
* Use Postgres tables + indexes:
|
||||
|
||||
* `edges(from_symbol, to_symbol, commit_id, kind)`
|
||||
* `symbols(symbol_id, lang, module, fqn, file, line_start, line_end)`
|
||||
* `reachability(entrypoint_id, sink_id, commit_id, path_hash, confidence, severity, evidence_json)`
|
||||
* Cache:
|
||||
|
||||
* keyed by (commit, policy_version, analyzer_version)
|
||||
* avoids recompute on re-runs
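
A minimal sketch of the per-entrypoint reachability walk with a predecessor map for proof-path reconstruction (the adjacency-list shape is illustrative; in practice it would be loaded from the `edges` table):

```csharp
using System.Collections.Generic;

public static class Reachability
{
    // BFS from one entrypoint over call edges; the predecessor map lets us
    // reconstruct a single proof path (hop list) for any sink we reach.
    public static List<string>? FindProofPath(
        string entrypoint,
        string sink,
        IReadOnlyDictionary<string, List<string>> outEdges)
    {
        var predecessor = new Dictionary<string, string> { [entrypoint] = entrypoint };
        var queue = new Queue<string>();
        queue.Enqueue(entrypoint);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            if (current == sink)
            {
                var path = new List<string> { sink };
                while (path[^1] != entrypoint)
                    path.Add(predecessor[path[^1]]);  // walk back to the entrypoint
                path.Reverse();
                return path;
            }

            if (!outEdges.TryGetValue(current, out var nexts)) continue;
            foreach (var next in nexts)
            {
                if (predecessor.ContainsKey(next)) continue;  // already visited
                predecessor[next] = current;
                queue.Enqueue(next);
            }
        }
        return null; // sink not reachable from this entrypoint
    }
}
```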
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* For any analyzed commit, you can answer:
|
||||
|
||||
* “Which sinks are reachable from these entrypoints?”
|
||||
* “Show me one proof path per (entrypoint, sink_type).”
|
||||
|
||||
---
|
||||
|
||||
### Workstream E — Smart‑Diff engine (the “diff” part)
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Diff algorithm producing three buckets:
|
||||
|
||||
* `added_paths`, `removed_paths`, `changed_paths`
|
||||
* “Changed” means:
|
||||
|
||||
* same entrypoint + sink type, but path differs OR taint/sanitization differs OR confidence changes
|
||||
|
||||
**Implementation notes**
|
||||
|
||||
* Identify a path by a stable fingerprint (a sketch follows the gating rules below):
|
||||
|
||||
* `path_id = hash(entrypoint_symbol + sink_symbol + sink_type + policy_version + analyzer_version)`
|
||||
* Store:
|
||||
|
||||
* top-k paths for each pair for evidence (k=1 for MVP, add more later)
|
||||
* Severity gating rules:
|
||||
|
||||
* Example:
|
||||
|
||||
* New path to `CMD_EXEC` = fail
|
||||
* New path to `FILE_WRITE` = warn unless under `/tmp` allowlist
|
||||
* New path to `SQL_RAW` = fail unless parameterized sanitizer present
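
A minimal sketch of the fingerprint and the added/removed buckets (the `changed` bucket is then computed by comparing taint, sanitization, and confidence on fingerprints that survive in both commits); names are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class SmartDiffPaths
{
    // path_id fingerprints the (entrypoint, sink, sink_type) pair plus rule versions,
    // mirroring the definition above; identical inputs always yield the same id.
    public static string PathId(string entrypoint, string sink, string sinkType,
                                string policyVersion, string analyzerVersion)
    {
        var payload = string.Join("|", entrypoint, sink, sinkType, policyVersion, analyzerVersion);
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(payload));
        return Convert.ToHexString(hash).ToLowerInvariant();
    }

    // added/removed fall out of plain set operations; "changed" is decided later
    // for the surviving ids by comparing their taint/sanitization/confidence.
    public static (HashSet<string> Added, HashSet<string> Removed, HashSet<string> Surviving) Diff(
        IEnumerable<string> basePathIds, IEnumerable<string> headPathIds)
    {
        var baseSet = new HashSet<string>(basePathIds);
        var headSet = new HashSet<string>(headPathIds);

        var added = new HashSet<string>(headSet);
        added.ExceptWith(baseSet);

        var removed = new HashSet<string>(baseSet);
        removed.ExceptWith(headSet);

        var surviving = new HashSet<string>(baseSet);
        surviving.IntersectWith(headSet);

        return (added, removed, surviving);
    }
}
```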
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Given commits A and B:
|
||||
|
||||
* If B introduces a new reachable sink, CI fails with a single actionable card:
|
||||
|
||||
* **EP**: route / handler
|
||||
* **Sink**: type + symbol
|
||||
* **Proof**: hop list
|
||||
* **Why**: policy rule triggered
|
||||
|
||||
---
|
||||
|
||||
### Workstream F — Vulnerability mapping + VEX
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* Ingest dependency inventory (SBOM or lockfiles)
|
||||
* Map vulnerabilities to “surfaces”
|
||||
|
||||
* package → vulnerable module/function patterns
|
||||
* minimal version/range matching (from your existing vuln feed)
|
||||
* Decision logic:
|
||||
|
||||
* **Affected** if any reachable path intersects vulnerable surface OR dataflow reaches vulnerable sink
|
||||
* else **Not affected / Not exploitable** with justification
|
||||
|
||||
**Implementation notes**
|
||||
|
||||
* Start with a pragmatic approach:
|
||||
|
||||
* package‑level reachability: “is any symbol in that package reachable?”
|
||||
* then iterate toward function‑level surfaces
|
||||
* VEX output:
|
||||
|
||||
* include commit hash, policy version, evidence paths
|
||||
* embed links to internal “path card” URLs if available
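
A minimal sketch of the pragmatic package-level decision (the namespace-prefix match is a placeholder for a real purl → module mapping from the vuln feed; names are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VexDecision
{
    // Package-level reachability: "affected" if any reachable symbol belongs to the
    // vulnerable package, otherwise "not_affected" with a standard justification.
    public static (string Status, string? Justification) Decide(
        string vulnerablePackagePrefix,            // e.g. "Newtonsoft.Json."
        IEnumerable<string> reachableSymbolFqns)   // FQNs reachable from any entrypoint
    {
        var affected = reachableSymbolFqns.Any(
            fqn => fqn.StartsWith(vulnerablePackagePrefix, StringComparison.Ordinal));

        return affected
            ? ("affected", null)
            : ("not_affected", "vulnerable_code_not_in_execute_path");
    }
}
```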
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* For a known vulnerable dependency, the system emits:
|
||||
|
||||
* VEX “not affected” if package code is never reached from any entrypoint, with proof references.
|
||||
|
||||
---
|
||||
|
||||
### Workstream G — CI integration + developer UX
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* A single CLI:
|
||||
|
||||
* `smartdiff analyze --commit <sha> --service <svc> --lang <dotnet|go>`
|
||||
* `smartdiff diff --base <shaA> --head <shaB> --out sarif`
|
||||
* CI templates for:
|
||||
|
||||
* GitHub Actions / GitLab CI
|
||||
* Outputs:
|
||||
|
||||
* SARIF
|
||||
* JSON evidence bundle
|
||||
* optional OpenVEX file
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* Teams can enable Smart‑Diff by adding:
|
||||
|
||||
* CI job + config file
|
||||
* no additional infra required for MVP (local artifacts mode)
|
||||
* When infra is available, enable server caching mode for speed.
|
||||
|
||||
---
|
||||
|
||||
### Workstream H — UI “Path Cards”
|
||||
|
||||
**Deliverables**
|
||||
|
||||
* UI components:
|
||||
|
||||
* Path card list with filters (sink type, severity, confidence)
|
||||
* “What changed” diff view:
|
||||
|
||||
* red = added hops
|
||||
* green = removed hops
|
||||
* “Evidence” panel:
|
||||
|
||||
* file:line for each hop
|
||||
* code snippets (optional)
|
||||
* APIs:
|
||||
|
||||
* `GET /smartdiff/{repo}/{pr}/findings`
|
||||
* `GET /smartdiff/{repo}/{commit}/path/{path_id}`
|
||||
|
||||
**Acceptance criteria**
|
||||
|
||||
* A developer can click one finding and understand:
|
||||
|
||||
* how the data got there
|
||||
* exactly what line introduced the risk
|
||||
* how to fix (sanitize/guard/allowlist)
|
||||
|
||||
---
|
||||
|
||||
## 4) Milestone plan (sequenced, no time promises)
|
||||
|
||||
### Milestone 0 — Foundation
|
||||
|
||||
* Repo scaffolding:
|
||||
|
||||
* `smartdiff-cli/`
|
||||
* `analyzers/dotnet/`
|
||||
* `analyzers/go/`
|
||||
* `core-ir/` (schemas + validation)
|
||||
* `server/` (optional; can come later)
|
||||
* Define IR JSON schema + versioning rules
|
||||
* Implement policy YAML + validator + sample policies
|
||||
* Implement “local mode” artifact output
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* You can run `smartdiff analyze` and get a valid IR file for at least one trivial repo.
|
||||
|
||||
---
|
||||
|
||||
### Milestone 1 — Callgraph reachability MVP
|
||||
|
||||
* .NET: build call edges + entrypoint discovery (basic)
|
||||
* Go: build call edges + entrypoint discovery (basic)
|
||||
* Graph store: in-memory or local sqlite/postgres
|
||||
* Compute reachable sinks (callgraph only, no taint)
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* On a demo repo, you can list:
|
||||
|
||||
* entrypoints
|
||||
* reachable sinks (callgraph reachability only)
|
||||
* a proof path (hop list)
|
||||
|
||||
---
|
||||
|
||||
### Milestone 2 — Smart‑Diff MVP (PR gating)
|
||||
|
||||
* Compute diff between base/head reachable sink sets
|
||||
* Produce SARIF with:
|
||||
|
||||
* rule id = sink type
|
||||
* message includes entrypoint + sink + link to evidence JSON
|
||||
* CI templates + documentation
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* In PR checks, the job fails on new EP→sink paths and links to a proof.
|
||||
|
||||
---
|
||||
|
||||
### Milestone 3 — Taint/dataflow MVP (high-value sinks only)
|
||||
|
||||
* Add taint propagation to reduce false positives:
|
||||
|
||||
* differentiate “sink reachable” vs “untrusted data reaches sink”
|
||||
* Add sanitizer recognition
|
||||
* Add confidence scoring + suppression mechanisms (policy allowlists)
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* A sink is only “high severity” if it is both reachable and tainted (or policy says otherwise).
|
||||
|
||||
---
|
||||
|
||||
### Milestone 4 — VEX integration MVP
|
||||
|
||||
* Join reachability with dependency vulnerabilities
|
||||
* Emit OpenVEX (and/or CycloneDX VEX)
|
||||
* Store evidence references (paths) inside VEX justification
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* For a repo with a vulnerable dependency, you can automatically produce:
|
||||
|
||||
* affected/not affected with evidence.
|
||||
|
||||
---
|
||||
|
||||
### Milestone 5 — Scale and precision improvements
|
||||
|
||||
* Incremental analysis (only analyze changed projects/packages)
|
||||
* Better dynamic dispatch handling (Go pointer analysis, .NET interface dispatch expansion)
|
||||
* Optional runtime telemetry integration:
|
||||
|
||||
* import production traces to prioritize “actually observed” entrypoints
|
||||
|
||||
**Exit criteria**
|
||||
|
||||
* Works on large services with acceptable run time and stable noise levels.
|
||||
|
||||
---
|
||||
|
||||
## 5) Backlog you can paste into Jira (epics + key stories)
|
||||
|
||||
### Epic: Policy & taxonomy
|
||||
|
||||
* Story: Define `smartdiff.policy.yaml` schema and validator
|
||||
**AC:** invalid configs fail with clear errors; configs are versioned.
|
||||
* Story: Provide default sink list and severities
|
||||
**AC:** at least 10 sink rules with test cases.
|
||||
|
||||
### Epic: .NET analyzer
|
||||
|
||||
* Story: Resolve method invocations to symbols (Roslyn)
|
||||
**AC:** correct targets for direct calls; conservative handling for virtual calls.
|
||||
* Story: Discover ASP.NET routes and bind to entrypoint symbols
|
||||
**AC:** entrypoints include route/method metadata.
|
||||
|
||||
### Epic: Go analyzer
|
||||
|
||||
* Story: SSA build and callgraph extraction
|
||||
**AC:** function nodes and edges generated for a multi-package repo.
|
||||
* Story: net/http entrypoint discovery
|
||||
**AC:** handler functions recognized as entrypoints with path labels.
|
||||
|
||||
### Epic: Reachability engine
|
||||
|
||||
* Story: Compute reachable sinks per entrypoint
|
||||
**AC:** store at least one path with hop list.
|
||||
* Story: Smart‑Diff A vs B
|
||||
**AC:** added/removed paths computed deterministically.
|
||||
|
||||
### Epic: CI/SARIF
|
||||
|
||||
* Story: Emit SARIF results
|
||||
**AC:** findings appear in code scanning UI; include file/line.
|
||||
|
||||
### Epic: Taint analysis
|
||||
|
||||
* Story: Propagate taint from request to sink for 3 sink classes
|
||||
**AC:** produces “tainted” evidence with a variable/argument trace.
|
||||
* Story: Sanitizer recognition
|
||||
**AC:** path marked “sanitized” and downgraded per policy.
|
||||
|
||||
### Epic: VEX
|
||||
|
||||
* Story: Generate OpenVEX statements from reachability + vuln feed
|
||||
**AC:** for “not affected” includes justification and evidence references.
|
||||
|
||||
---
|
||||
|
||||
## 6) Key engineering decisions (recommended defaults)
|
||||
|
||||
### Storage
|
||||
|
||||
* Start with **Postgres** (or even local sqlite for MVP) for simplicity.
|
||||
* Introduce a graph DB only if:
|
||||
|
||||
* you need very large multi-commit graph queries at low latency
|
||||
* Postgres performance becomes a hard blocker
|
||||
|
||||
### Confidence model
|
||||
|
||||
Every edge/path should carry:
|
||||
|
||||
* `confidence`: High/Med/Low
|
||||
* `reasons`: e.g., `DirectCall`, `InterfaceDispatch`, `ReflectionGuess`, `RouterHeuristic`
|
||||
This lets you:
|
||||
* gate only on high-confidence paths in early rollout
|
||||
* keep low-confidence as “informational”
|
||||
|
||||
### Suppression model
|
||||
|
||||
* Local suppressions:
|
||||
|
||||
* `smartdiff.suppress.yaml` with rule id + symbol id + reason + expiry
|
||||
* Policy allowlists:
|
||||
|
||||
* allow file writes only under certain directories
|
||||
* allow outbound network only to configured domains
|
||||
|
||||
---
|
||||
|
||||
## 7) Testing strategy (to avoid “cool demo, unusable tool”)
|
||||
|
||||
### Unit tests
|
||||
|
||||
* Symbol hashing stability tests
|
||||
* Call resolution tests:
|
||||
|
||||
* overloads, generics, interfaces, lambdas
|
||||
* Policy parsing/validation tests
|
||||
|
||||
### Integration tests (must-have)
|
||||
|
||||
* Golden repos in `testdata/`:
|
||||
|
||||
* one ASP.NET minimal API
|
||||
* one MVC controller app
|
||||
* one Go net/http + one Gin app
|
||||
* Golden outputs:
|
||||
|
||||
* expected entrypoints
|
||||
* expected reachable sinks
|
||||
* expected diff between commits
|
||||
|
||||
### Regression tests
|
||||
|
||||
* A curated corpus of “known issues”:
|
||||
|
||||
* false positives you fixed should never return
|
||||
* false negatives: ensure known risky path is always found
|
||||
|
||||
### Performance tests
|
||||
|
||||
* Measure:
|
||||
|
||||
* analysis time per 50k LOC
|
||||
* memory peak
|
||||
* graph size
|
||||
* Budget enforcement:
|
||||
|
||||
* if over budget, degrade gracefully (lower precision, mark low confidence)
|
||||
|
||||
---
|
||||
|
||||
## 8) Example configs and outputs (to make onboarding easy)
|
||||
|
||||
### Example policy YAML (minimal)
|
||||
|
||||
```yaml
|
||||
version: 1
|
||||
service: invoices-api
|
||||
entrypoints:
|
||||
autodiscover:
|
||||
dotnet:
|
||||
aspnet: true
|
||||
go:
|
||||
net_http: true
|
||||
|
||||
sinks:
|
||||
- type: CMD_EXEC
|
||||
severity: high
|
||||
match:
|
||||
dotnet:
|
||||
symbols:
|
||||
- "System.Diagnostics.Process.Start(string)"
|
||||
go:
|
||||
symbols:
|
||||
- "os/exec.Command"
|
||||
- type: FILE_WRITE
|
||||
severity: medium
|
||||
match:
|
||||
dotnet:
|
||||
namespaces: ["System.IO"]
|
||||
go:
|
||||
symbols: ["os.WriteFile"]
|
||||
|
||||
gating:
|
||||
fail_on:
|
||||
- sink_type: CMD_EXEC
|
||||
when: "added && confidence >= medium"
|
||||
- sink_type: FILE_WRITE
|
||||
when: "added && tainted && confidence >= medium"
|
||||
```
|
||||
|
||||
### Evidence JSON shape (what the UI consumes)
|
||||
|
||||
```json
|
||||
{
|
||||
"commit": "abc123",
|
||||
"entrypoint": {"symbol": "InvoicesController.Upload()", "route": "POST /upload"},
|
||||
"sink": {"type": "FILE_WRITE", "symbol": "System.IO.File.WriteAllBytes"},
|
||||
"confidence": "high",
|
||||
"tainted": true,
|
||||
"path": [
|
||||
{"symbol": "InvoicesController.Upload()", "file": "Controllers/InvoicesController.cs", "line": 42},
|
||||
{"symbol": "UploadService.Save()", "file": "Services/UploadService.cs", "line": 18},
|
||||
{"symbol": "System.IO.File.WriteAllBytes", "file": null, "line": null}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 9) Risks and mitigations (explicit)
|
||||
|
||||
1. **Dynamic behavior (reflection, DI, router magic)**
|
||||
|
||||
* Mitigation: conservative fallbacks + confidence labels + optional runtime traces later
|
||||
|
||||
2. **Noise from huge callgraphs**
|
||||
|
||||
* Mitigation: sink-first slicing (compute reachability backwards from sinks), entrypoint scoping, k‑shortest paths only
|
||||
|
||||
3. **Large repo build failures**
|
||||
|
||||
* Mitigation: analyzer runs inside build containers; allow partial analysis with explicit “incomplete” result flag
|
||||
|
||||
4. **Teams rejecting gating**
|
||||
|
||||
* Mitigation: staged rollout:
|
||||
|
||||
* Observe-only mode → warn-only → fail-only for high-confidence CMD_EXEC/UNSAFE_DESER
|
||||
|
||||
---
|
||||
|
||||
## 10) Definition of done (what “implemented” means)
|
||||
|
||||
You should consider Smart‑Diff “implemented” when:
|
||||
|
||||
* A repo can enable it with one config + one CI job.
|
||||
* PRs get:
|
||||
|
||||
* a small number of **actionable** findings (not hundreds)
|
||||
* each finding has a proof path with file/line hops
|
||||
* It reliably detects at least:
|
||||
|
||||
* new command execution paths
|
||||
* new unsafe deserialization paths
|
||||
* new tainted filesystem write paths
|
||||
* It can optionally emit VEX decisions backed by reachability evidence.
|
||||
|
||||
---
|
||||
|
||||
If you want the next step, I can also give you:
|
||||
|
||||
* a **concrete repo layout** with module boundaries,
|
||||
* the **Postgres schema** (tables + indexes),
|
||||
* and a **language adapter interface** (so adding Java/Python later is straightforward).
|
||||
---
|
||||
Here’s a simple, battle‑tested way to make your UX feel fast under pressure: treat **Time‑to‑First‑Signal (TTFS)** as a product SLO and design everything backwards from it.
|
||||
|
||||
---
|
||||
|
||||
# TTFS SLO: the idea in one line
|
||||
|
||||
Guarantee **p50 < 2s, p95 < 5s** from user action (or CI event) to the **first meaningful signal** (status, cause, or next step)—fast enough to calm triage, short enough to be felt.
|
||||
|
||||
---
|
||||
|
||||
## What counts as “First Signal”?
|
||||
|
||||
* A clear, human message like: “Scan started; last error matched: `NU1605` (likely transitive). Retry advice →”
|
||||
* Or a progress token with context: “Queued (ETA ~18s). Cached reachability graph loaded.”
|
||||
|
||||
Not a spinner. Not 0% progress. A real, decision‑shaping hint.
|
||||
|
||||
---
|
||||
|
||||
## Budget the pipeline backwards (guardrails)
|
||||
|
||||
* **Frontend (≤150 ms):** render instant skeleton + last known state; optimistic UI; no blocking on fresh data.
|
||||
* **Edge/API (≤250 ms):** return a “signal frame” fast path (status + last error signature + cached ETA) from cache.
|
||||
* **Core services (≤500–1500 ms):** pre‑index failures, warm reachability summaries, enqueue heavy work, emit stream token.
|
||||
* **Slow work (async):** full scan, lattice policy merge, provenance trails—arrive later via push updates.
|
||||
|
||||
---
|
||||
|
||||
## Minimal implementation (1–2 sprints)
|
||||
|
||||
1. **Define the signal contract**
|
||||
|
||||
* `FirstSignal { kind, verb, scope, lastKnownOutcome?, ETA?, nextAction? }`
|
||||
* Version it; keep it <1 KB; always return within the SLO window.
|
||||
|
||||
2. **Cache last error signature**
|
||||
|
||||
* Key: `(repo, branch|imageDigest, toolchain-hash)`
|
||||
* Value: `{errorCode, excerpt, fixLink, firstSeenAt, hitCount}`
|
||||
* Evict by LRU + TTL (e.g., 7–14 days). Use Valkey in default profile; Postgres JSONB in air‑gap.
|
||||
|
||||
3. **Pre‑index the failing step**
|
||||
|
||||
* When a job fails, extract and store:
|
||||
|
||||
* normalized step id (e.g., `scanner:deps-restore`)
|
||||
* top 1–3 error tokens (codes, regex’d phrases)
|
||||
* minimal context (package id, version range)
|
||||
* Write a tiny **“failure indexer”** that runs in‑band on failure and out‑of‑band on success.
|
||||
|
||||
4. **Lazy‑load everything else**
|
||||
|
||||
* UI shows FirstSignal + “Details loading…”
|
||||
* Fetch heavy panes (full CVE list, call‑graph, SBOM diff) after paint.
|
||||
|
||||
5. **Fast path endpoint**
|
||||
|
||||
* `GET /signal/{jobId}` returns from cache or snapshot table.
|
||||
* If cache miss: fall back to “cold signal” (`queued`, basic ETA) and **immediately** enqueue warmup tasks. A minimal endpoint sketch follows this list.
|
||||
|
||||
6. **Streaming updates**
|
||||
|
||||
* Emit compact deltas: `status:started → status:analyzing → triage:blocked(POLICY_X)` etc.
|
||||
* UI subscribes; CI annotates with the same tokens.
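
A minimal ASP.NET Core sketch of the step‑5 fast path (the in-memory dictionary stands in for the Valkey cache and Postgres snapshot table; field names follow the FirstSignal idea above but are illustrative):

```csharp
using System.Collections.Concurrent;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// In-memory stand-in for the snapshot table / cache; a real deployment reads
// through Valkey and falls back to the Postgres snapshot table as described above.
var snapshots = new ConcurrentDictionary<string, object>();

app.MapGet("/jobs/{jobId}/signal", (string jobId) =>
{
    if (snapshots.TryGetValue(jobId, out var warm))
        return Results.Ok(warm);                       // warm path: cached snapshot

    // Cold path: answer immediately with a minimal FirstSignal; heavy work
    // arrives later via streaming updates. Warm the store best-effort.
    var cold = new
    {
        version = "1.0",
        jobId,
        kind = "queued",
        phase = "unknown",
        summary = "Queued. Preparing scan…"
    };
    snapshots.TryAdd(jobId, cold);
    return Results.Ok(cold);
});

app.Run();
```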
|
||||
|
||||
---
|
||||
|
||||
## TTFS SLO monitor (keep it honest)
|
||||
|
||||
* Emit for every user‑visible action: `ttfs_ms`, `path` (UI|CLI|CI), `signal_kind`, `cache_hit` (T/F).
|
||||
* Track **p50/p95** by surface and by repo size.
|
||||
* Page on **p95 > 5s** for 5 consecutive minutes (or when more than 2% of requests miss the budget).
|
||||
* Store exemplars (trace ids) to replay slow paths.
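
A minimal sketch of emitting `ttfs_ms` with .NET's `System.Diagnostics.Metrics` (tag names mirror the list above; the recorded value is illustrative):

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

var meter = new Meter("StellaOps.Ttfs", "1.0");
var ttfsMs = meter.CreateHistogram<double>("ttfs_ms", unit: "ms",
    description: "Time from user action to first meaningful signal");

// Record one observation per user-visible action.
ttfsMs.Record(1840,
    new KeyValuePair<string, object?>("path", "ui"),
    new KeyValuePair<string, object?>("signal_kind", "queued"),
    new KeyValuePair<string, object?>("cache_hit", true));
```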
|
||||
|
||||
---
|
||||
|
||||
## Stella Ops–specific hooks (drop‑in)
|
||||
|
||||
* **Scanner.WebService:** on job accept, write `FirstSignal{kind:"queued", ETA}`; if failure index has a hit, attach `lastKnownOutcome`.
|
||||
* **Feedser/Vexer:** publish “known criticals changed since last run” as a hint in FirstSignal.
|
||||
* **Policy Engine:** pre‑evaluate “obvious blocks” (e.g., banned license) and surface as `nextAction:"toggle waiver or update license map"`.
|
||||
* **Air‑gapped profile:** skip Valkey; keep a `first_signal_snapshots` Postgres table + NOTIFY/LISTEN for streaming.
|
||||
|
||||
---
|
||||
|
||||
## UX micro‑rules
|
||||
|
||||
* **Never show a spinner alone**; always pair with a sentence or chip (“Warm cache found; verifying”).
|
||||
* **3 taps max** to reach evidence: Button → FirstSignal → Evidence card.
|
||||
* **Always include a next step** (“Retry with `--ignore NU1605` is unsafe; use `PackageReference` pin → link”).
|
||||
|
||||
---
|
||||
|
||||
## Quick success criteria
|
||||
|
||||
* New incident claims: “I knew what was happening within 2 seconds.”
|
||||
* CI annotates within 5s on p95.
|
||||
* Support tickets referencing “stuck scans” drop ≥40%.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can turn this into a ready‑to‑paste **TASKS.md** (owners, DOD, metrics, endpoints, DB schemas) for your Stella Ops repos.
|
||||
````md
|
||||
# TASKS.md — TTFS (Time‑to‑First‑Signal) Fast Signal + Progressive Updates
|
||||
|
||||
> Paste this file into the repo root (or `/docs/TTFS/TASKS.md`).
|
||||
> This plan is structured as two sprints (A + B) with clear owners, dependencies, and DoD.
|
||||
|
||||
---
|
||||
|
||||
## 0) Product SLO and non‑negotiables
|
||||
|
||||
### SLO
|
||||
- **TTFS p50 < 2s, p95 < 5s**
|
||||
- Applies to: **Web UI**, **CLI**, **CI annotations**
|
||||
- TTFS = time from **user action / CI start** → **first meaningful signal rendered/logged**
|
||||
|
||||
### What counts as “First Signal”
|
||||
A First Signal must include at least one of:
|
||||
- Status + context (“Queued, ETA ~18s”; “Started, phase: restore”; “Blocked by policy XYZ”)
|
||||
- Known cause hint (error token/code/category)
|
||||
- Next action (open logs, docs link, retry command)
|
||||
|
||||
A spinner alone does **not** count.
|
||||
|
||||
### Hard constraints
|
||||
- `/jobs/{id}/signal` must **never block** on full scan work
|
||||
- FirstSignal payload in normal cases **< 1KB**
|
||||
- **No secrets** in snapshots, excerpts, telemetry
|
||||
|
||||
---
|
||||
|
||||
## 1) Scope and module owners
|
||||
|
||||
### Modules (assumed)
|
||||
- **Scanner.WebService** (job API + signal provider)
|
||||
- **Scanner.Worker** (phase changes + event publishing)
|
||||
- **Policy Engine** (block reasons + quick pre-eval hooks)
|
||||
- **Feedser/Vexer** (optional: “critical changed” hint)
|
||||
- **Web UI** (progressive rendering + streaming)
|
||||
- **CLI** (first signal + streaming)
|
||||
- **CI Integration** (checks/annotations)
|
||||
- **Platform/Observability** (metrics, dashboards, alerts)
|
||||
- **Security/Compliance** (redaction + tenant isolation)
|
||||
|
||||
### Owners (replace with actual people/teams)
|
||||
- **Backend Lead:** @be-owner
|
||||
- **Frontend Lead:** @fe-owner
|
||||
- **DevEx/CLI Lead:** @dx-owner
|
||||
- **CI Integrations Lead:** @ci-owner
|
||||
- **SRE/Obs Lead:** @sre-owner
|
||||
- **Security Lead:** @sec-owner
|
||||
- **PM:** @pm-owner
|
||||
|
||||
---
|
||||
|
||||
## 2) Canonical contract: FirstSignal v1.0
|
||||
|
||||
### FirstSignal shape (canonical)
|
||||
All surfaces (UI/CLI/CI) must be representable via this contract.
|
||||
|
||||
```json
|
||||
{
|
||||
"version": "1.0",
|
||||
"signalId": "sig_...",
|
||||
"jobId": "job_...",
|
||||
|
||||
"timestamp": "2025-12-14T18:22:31.014Z",
|
||||
"kind": "queued|started|phase|blocked|failed|succeeded|canceled|unavailable",
|
||||
"phase": "resolve|fetch|restore|analyze|policy|report|unknown",
|
||||
|
||||
"scope": { "type": "repo|image|artifact", "id": "org/repo@branch-or-digest" },
|
||||
|
||||
"summary": "Queued (ETA ~18s). Last failure matched: NU1605 (dependency downgrade).",
|
||||
"etaSeconds": 18,
|
||||
|
||||
"lastKnownOutcome": {
|
||||
"signatureId": "sigerr_...",
|
||||
"errorCode": "NU1605",
|
||||
"token": "dependency-downgrade",
|
||||
"excerpt": "Detected package downgrade: ...",
|
||||
"confidence": "low|medium|high",
|
||||
"firstSeenAt": "2025-12-01T00:00:00Z",
|
||||
"hitCount": 14
|
||||
},
|
||||
|
||||
"nextActions": [
|
||||
{ "type": "open_logs|open_job|docs|retry|cli_command", "label": "Open logs", "target": "/jobs/job_.../logs" }
|
||||
],
|
||||
|
||||
"diagnostics": {
|
||||
"cacheHit": true,
|
||||
"source": "snapshot|failure_index|cold_start",
|
||||
"correlationId": "corr_..."
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Contract rules
|
||||
|
||||
* Must always include: `version`, `jobId`, `timestamp`, `kind`, `summary`
|
||||
* Keep normal payload < 1KB (enforce excerpt max length; avoid lists)
|
||||
* Never include secrets; excerpts must be redacted
|
||||
|
||||
---
|
||||
|
||||
## 3) Milestones
|
||||
|
||||
### Sprint A — “TTFS Baseline”
|
||||
|
||||
Goal: Always show **some** meaningful First Signal quickly.
|
||||
|
||||
Deliverables:
|
||||
|
||||
* Snapshot persistence (DB) + optional cache
|
||||
* `/jobs/{id}/signal` fast path
|
||||
* UI skeleton + immediate FirstSignal rendering (poll fallback OK)
|
||||
* Base telemetry: `ttfs_ms`, endpoint latency, cache hit
|
||||
|
||||
### Sprint B — “Smart Hints + Streaming”
|
||||
|
||||
Goal: First Signal is helpful and updates live.
|
||||
|
||||
Deliverables:
|
||||
|
||||
* Failure signature indexer + lookup
|
||||
* SSE events (or WebSocket) for incremental updates
|
||||
* CLI streaming + CI annotations
|
||||
* Dashboards + alerts + exemplars/traces
|
||||
* Redaction hardening and tenant isolation validation
|
||||
|
||||
---
|
||||
|
||||
## 4) Sprint A tasks — TTFS baseline
|
||||
|
||||
### A1 — Implement FirstSignal types and helpers (shared package)
|
||||
|
||||
**Owner:** @be-owner
|
||||
**Depends on:** none
|
||||
**Est:** 2–4 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Define FirstSignal v1.0 schema in a shared package (`/common/contracts/firstsignal`)
|
||||
* [ ] Add validators:
|
||||
|
||||
* [ ] required fields present
|
||||
* [ ] size limits (excerpt length; total serialized bytes threshold warning)
|
||||
* [ ] allowed enums for kind/phase
|
||||
* [ ] Add builders:
|
||||
|
||||
* [ ] `buildQueuedSignal(job, eta?)`
|
||||
* [ ] `buildColdSignal(job)`
|
||||
* [ ] `mergeHint(signal, lastKnownOutcome)`
|
||||
* [ ] `addNextActions(signal, actions[])`
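
A minimal sketch of the shared contract types as C# records (property names mirror the JSON contract in section 2; nullability encodes which fields are required):

```csharp
using System;
using System.Collections.Generic;

// Required fields are non-nullable; optional fields default to null.
public sealed record FirstSignal(
    string Version,
    string JobId,
    DateTimeOffset Timestamp,
    string Kind,          // queued|started|phase|blocked|failed|succeeded|canceled|unavailable
    string Summary,
    string? Phase = null,
    int? EtaSeconds = null,
    LastKnownOutcome? LastKnownOutcome = null,
    IReadOnlyList<NextAction>? NextActions = null);

public sealed record LastKnownOutcome(
    string SignatureId,
    string? ErrorCode,
    string Token,
    string? Excerpt,
    string Confidence,    // low|medium|high
    DateTimeOffset FirstSeenAt,
    int HitCount);

public sealed record NextAction(string Type, string Label, string Target);
```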
|
||||
|
||||
**DoD**
|
||||
|
||||
* Contract is versioned, unit-tested, and used by backend endpoint
|
||||
* Validation rejects/flags invalid signals in tests
|
||||
|
||||
---
|
||||
|
||||
### A2 — Snapshot storage: `first_signal_snapshots` table + migrations
|
||||
|
||||
**Owner:** @be-owner
|
||||
**Depends on:** A1
|
||||
**Est:** 3–5 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Add Postgres migration for `first_signal_snapshots`
|
||||
* [ ] Implement CRUD:
|
||||
|
||||
* [ ] `createSnapshot(jobId, signal)`
|
||||
* [ ] `updateSnapshot(jobId, partialSignal)` (phase transitions)
|
||||
* [ ] `getSnapshot(jobId)`
|
||||
* [ ] Enforce:
|
||||
|
||||
* [ ] `payload_json` size guard (soft warn + hard cap via excerpt limit)
|
||||
* [ ] `updated_at` maintained automatically
|
||||
|
||||
**Suggested schema**
|
||||
|
||||
```sql
|
||||
CREATE TABLE first_signal_snapshots (
|
||||
job_id TEXT PRIMARY KEY,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||
kind TEXT NOT NULL,
|
||||
phase TEXT NOT NULL,
|
||||
summary TEXT NOT NULL,
|
||||
eta_seconds INT NULL,
|
||||
payload_json JSONB NOT NULL
|
||||
);
|
||||
CREATE INDEX ON first_signal_snapshots (updated_at DESC);
|
||||
```
|
||||
|
||||
**DoD**
|
||||
|
||||
* Migration included
|
||||
* Integration test: create job → snapshot exists within request lifecycle (or best-effort async write + immediate cold response)
|
||||
|
||||
---
|
||||
|
||||
### A3 — Cache layer (default profile) with Postgres fallback
|
||||
|
||||
**Owner:** @be-owner
|
||||
**Depends on:** A2
|
||||
**Est:** 3–6 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Add optional Valkey/Redis support:
|
||||
|
||||
* [ ] key: `signal:job:{jobId}` TTL: 24h
|
||||
* [ ] read-through cache on `/signal`
|
||||
* [ ] write-through on snapshot updates
|
||||
* [ ] Air-gapped mode behavior:
|
||||
|
||||
* [ ] cache disabled → read/write snapshots in Postgres only
|
||||
* [ ] Add config toggles:
|
||||
|
||||
* [ ] `TTFS_CACHE_BACKEND=valkey|postgres|none`
|
||||
* [ ] `TTFS_CACHE_TTL_SECONDS=86400`
|
||||
|
||||
**DoD**
|
||||
|
||||
* With cache enabled: `/signal` p95 latency meets budget in load test
|
||||
* With cache disabled: correctness remains; p95 within acceptable baseline
|
||||
|
||||
---
|
||||
|
||||
### A4 — `/jobs/{jobId}/signal` fast-path endpoint
|
||||
|
||||
**Owner:** @be-owner
|
||||
**Depends on:** A2, A3
|
||||
**Est:** 4–8 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Implement `GET /jobs/{jobId}/signal`
|
||||
|
||||
* [ ] Try cache snapshot
|
||||
* [ ] Else DB snapshot
|
||||
* [ ] Else cold signal (`kind=queued`, `phase=unknown`, summary “Queued. Preparing scan…”)
|
||||
* [ ] Best-effort snapshot write if missing (non-blocking)
|
||||
* [ ] Response headers:
|
||||
|
||||
* [ ] `X-Correlation-Id`
|
||||
* [ ] `Cache-Status: hit|miss|bypass`
|
||||
* [ ] Add server-side timing logs (debug-level) for:
|
||||
|
||||
* [ ] cache read time
|
||||
* [ ] db read time
|
||||
* [ ] cold path time
|
||||
|
||||
**Performance budget**
|
||||
|
||||
* Cache-hit response: **p95 ≤ 250ms**
|
||||
* Cold response: **p95 ≤ 500ms**
|
||||
|
||||
**DoD**
|
||||
|
||||
* Endpoint never blocks on scan work
|
||||
* Returns a valid FirstSignal every time job exists
|
||||
* Load test demonstrates budgets
|
||||
|
||||
---
|
||||
|
||||
### A5 — Create snapshot at job creation and update on phase changes
|
||||
|
||||
**Owner:** @be-owner + @worker-owner
|
||||
**Depends on:** A2
|
||||
**Est:** 5–8 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] In `POST /jobs`:
|
||||
|
||||
* [ ] Immediately write initial snapshot:
|
||||
|
||||
* `kind=queued`
|
||||
* `phase=unknown`
|
||||
* summary includes “Queued” and optional ETA
|
||||
* [ ] In worker:
|
||||
|
||||
* [ ] When job starts: update snapshot to `kind=started`, `phase=resolve|fetch|restore…`
|
||||
* [ ] On phase transitions: update snapshot
|
||||
* [ ] On terminal: `kind=succeeded|failed|canceled`
|
||||
* [ ] Ensure updates are idempotent and safe (replays)
|
||||
|
||||
**DoD**
|
||||
|
||||
* For any started job, snapshot shows phase changes within a few seconds
|
||||
* Terminal kind always correct
|
||||
|
||||
---
|
||||
|
||||
### A6 — UI: Immediate “First Signal” rendering with polling fallback
|
||||
|
||||
**Owner:** @fe-owner
|
||||
**Depends on:** A4
|
||||
**Est:** 6–10 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] On scan trigger:
|
||||
|
||||
* [ ] Render skeleton + “Preparing scan…” message (no spinner-only)
|
||||
* [ ] Call `POST /jobs` (get jobId)
|
||||
* [ ] Immediately call `GET /jobs/{jobId}/signal`
|
||||
* [ ] Render summary + at least one next action button (Open job/logs)
|
||||
* [ ] Poll fallback:
|
||||
|
||||
* [ ] If streaming not available yet (Sprint A), poll `/signal` every 2–5s until terminal
|
||||
* [ ] Lazy-load heavy panels (must not block First Signal):
|
||||
|
||||
* [ ] vulnerability list
|
||||
* [ ] dependency graph
|
||||
* [ ] SBOM diff
|
||||
|
||||
**DoD**
|
||||
|
||||
* Real user monitoring shows UI TTFS p50 < 2s, p95 < 5s for the baseline path
|
||||
* No spinner-only states
|
||||
|
||||
---
|
||||
|
||||
### A7 — Telemetry: baseline metrics and tracing
|
||||
|
||||
**Owner:** @sre-owner + @be-owner + @fe-owner
|
||||
**Depends on:** A4, A6
|
||||
**Est:** 5–10 pts
|
||||
|
||||
**Metrics**
|
||||
|
||||
* [ ] `ttfs_ms` (emitted client-side for UI; server-side for CLI/CI if needed)
|
||||
|
||||
* tags: `surface=ui|cli|ci`, `cache_hit=true|false`, `signal_source=snapshot|cold_start`, `kind`, `repo_size_bucket`
|
||||
* [ ] `signal_endpoint_latency_ms`
|
||||
* [ ] `signal_payload_bytes`
|
||||
* [ ] `signal_error_rate`
|
||||
|
||||
**Tracing**
|
||||
|
||||
* [ ] Correlation id propagated:
|
||||
|
||||
* [ ] API response header
|
||||
* [ ] worker logs
|
||||
* [ ] events (Sprint B)
|
||||
|
||||
**Dashboards**
|
||||
|
||||
* [ ] TTFS p50/p95 by surface
|
||||
* [ ] cache hit rate
|
||||
* [ ] endpoint latency percentiles
|
||||
|
||||
**DoD**
|
||||
|
||||
* Metrics visible in dashboard
|
||||
* Correlation ids make it possible to trace slow examples end-to-end
|
||||
|
||||
---
|
||||
|
||||
## 5) Sprint B tasks — smart hints + streaming
|
||||
|
||||
### B1 — Failure signature extraction + redaction library
|
||||
|
||||
**Owner:** @be-owner + @sec-owner
|
||||
**Depends on:** A1
|
||||
**Est:** 6–12 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Implement redaction utility (unit-tested):
|
||||
|
||||
* [ ] strip bearer tokens, API keys, access tokens, private URLs
|
||||
* [ ] cap excerpt length (e.g., 240 chars)
|
||||
* [ ] normalize whitespace
|
||||
* [ ] Implement signature extraction from:
|
||||
|
||||
* [ ] structured step errors (preferred)
|
||||
* [ ] raw logs (fallback) via regex ruleset
|
||||
* [ ] Map to:
|
||||
|
||||
* `errorCode` (if present)
|
||||
* `token` (normalized category)
|
||||
* `confidence` (high/med/low)
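
A minimal sketch of the redaction utility (the regex patterns are illustrative and deliberately conservative; a real ruleset would be larger and unit-tested against known secret shapes):

```csharp
using System.Text.RegularExpressions;

public static class ExcerptRedactor
{
    // Strip obvious secret-like patterns, collapse whitespace, cap the length.
    public static string Redact(string raw, int maxLength = 240)
    {
        var s = Regex.Replace(raw,
            @"(?i)\b(bearer|token|apikey|api[_-]key|password|secret)\b\s*[:=]?\s*\S+",
            "[redacted]");
        s = Regex.Replace(s, @"https?://[^\s/]*:[^\s/]*@\S+", "[redacted-url]"); // creds in URLs
        s = Regex.Replace(s, @"\s+", " ").Trim();
        return s.Length <= maxLength ? s : s[..maxLength];
    }
}
```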
|
||||
|
||||
**DoD**
|
||||
|
||||
* Redaction unit tests include “known secret-like patterns”
|
||||
* Extraction produces stable tokens for top failure families
|
||||
|
||||
---
|
||||
|
||||
### B2 — Failure signature storage: `failure_signatures` table + upsert on failures
|
||||
|
||||
**Owner:** @be-owner
|
||||
**Depends on:** B1
|
||||
**Est:** 5–10 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Add Postgres migration for `failure_signatures`
|
||||
* [ ] Implement lookup key:
|
||||
|
||||
* `(scope_type, scope_id, toolchain_hash)`
|
||||
* [ ] On job failure:
|
||||
|
||||
* [ ] extract signature → redaction → upsert
|
||||
* [ ] increment hit_count; update last_seen_at
|
||||
* [ ] Retention:
|
||||
|
||||
* [ ] TTL job: delete signatures older than 14 days (configurable)
|
||||
* [ ] or retain last N signatures per scope
|
||||
|
||||
**Suggested schema**
|
||||
|
||||
```sql
|
||||
CREATE TABLE failure_signatures (
|
||||
signature_id TEXT PRIMARY KEY,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||
scope_type TEXT NOT NULL,
|
||||
scope_id TEXT NOT NULL,
|
||||
toolchain_hash TEXT NOT NULL,
|
||||
error_code TEXT NULL,
|
||||
token TEXT NOT NULL,
|
||||
excerpt TEXT NULL,
|
||||
confidence TEXT NOT NULL,
|
||||
first_seen_at TIMESTAMPTZ NOT NULL,
|
||||
last_seen_at TIMESTAMPTZ NOT NULL,
|
||||
hit_count INT NOT NULL DEFAULT 1
|
||||
);
|
||||
CREATE INDEX ON failure_signatures (scope_type, scope_id, toolchain_hash);
|
||||
CREATE INDEX ON failure_signatures (token);
|
||||
```
|
||||
|
||||
**DoD**
|
||||
|
||||
* Failure runs populate signatures
|
||||
* Excerpts are redacted and capped
|
||||
* Retention job verified
|
||||
|
||||
---
|
||||
|
||||
### B3 — Enrich FirstSignal with “lastKnownOutcome” hint
|
||||
|
||||
**Owner:** @be-owner
|
||||
**Depends on:** B2
|
||||
**Est:** 3–6 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] On `/signal` (fast path):
|
||||
|
||||
* [ ] if snapshot exists but has no hint, attempt signature lookup by scope+toolchain hash
|
||||
* [ ] merge hint into signal
|
||||
* [ ] include `diagnostics.source=failure_index` when used
|
||||
* [ ] Add “next actions” for common tokens:
|
||||
|
||||
* [ ] docs link for known error codes/tokens
|
||||
* [ ] “open logs” always present
|
||||
|
||||
**DoD**
|
||||
|
||||
* For scopes with prior failures, FirstSignal includes hint within SLO budgets
|
||||
|
||||
---
|
||||
|
||||
### B4 — Streaming updates via SSE (recommended)
|
||||
|
||||
**Owner:** @be-owner + @worker-owner + @fe-owner
|
||||
**Depends on:** A5
|
||||
**Est:** 8–16 pts
|
||||
|
||||
**Backend tasks**
|
||||
|
||||
* [ ] Add `GET /jobs/{jobId}/events` SSE endpoint
|
||||
* [ ] Define event payloads:
|
||||
|
||||
* `status` (kind+phase+message)
|
||||
* `hint` (token+errorCode+confidence)
|
||||
* `policy` (blocked + policyId)
|
||||
* `complete` (terminal)
|
||||
* [ ] Worker publishes events at:
|
||||
|
||||
* start
|
||||
* phase transitions
|
||||
* policy decision
|
||||
* terminal
|
||||
* [ ] Ensure reconnect safety:
|
||||
|
||||
* [ ] event id monotonic or timestamp
|
||||
* [ ] optional replay window (last N events in memory or DB)
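
A minimal ASP.NET Core sketch of the SSE endpoint (the hard-coded event list stands in for the worker's event bus; monotonic ids support the reconnect behavior listed above):

```csharp
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.MapGet("/jobs/{jobId}/events", async (string jobId, HttpContext ctx, CancellationToken ct) =>
{
    ctx.Response.Headers.ContentType = "text/event-stream";
    ctx.Response.Headers.CacheControl = "no-cache";

    // Stand-in for the worker's event bus: a handful of meaningful transitions.
    var events = new[] { "status:started", "status:analyzing", "complete" };
    var id = 0;

    foreach (var evt in events)
    {
        // Monotonic ids let clients resume cleanly after a reconnect.
        await ctx.Response.WriteAsync($"id: {++id}\nevent: status\ndata: {evt}\n\n", ct);
        await ctx.Response.Body.FlushAsync(ct);
        await Task.Delay(TimeSpan.FromSeconds(1), ct); // placeholder for real phase timing
    }
});

app.Run();
```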
|
||||
|
||||
**Frontend tasks**
|
||||
|
||||
* [ ] Subscribe after jobId known
|
||||
* [ ] Update FirstSignal UI in-place on deltas
|
||||
* [ ] Fallback to polling when SSE fails
|
||||
|
||||
**DoD**
|
||||
|
||||
* UI updates without refresh
|
||||
* Event stream doesn’t spam (3–8 meaningful events per job typical)
|
||||
* SSE failure degrades gracefully
|
||||
|
||||
---
|
||||
|
||||
### B5 — Policy Engine: “obvious block” pre-eval for early signal
|
||||
|
||||
**Owner:** @be-owner + @policy-owner
|
||||
**Depends on:** B4 (optional), or can enrich snapshot directly
|
||||
**Est:** 5–10 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Add a quick pre-evaluation hook for high-signal blocks:
|
||||
|
||||
* banned license
|
||||
* disallowed package
|
||||
* org-level denylist
|
||||
* [ ] Emit early policy event or update snapshot:
|
||||
|
||||
* `kind=blocked`, `phase=policy`, summary names the policy
|
||||
* next action points to waiver/docs (if supported)
|
||||
|
||||
**DoD**
|
||||
|
||||
* When an obvious block is present, users see it in FirstSignal without waiting for full analysis
|
||||
|
||||
---
|
||||
|
||||
### B6 — CLI: First Signal + streaming
|
||||
|
||||
**Owner:** @dx-owner
|
||||
**Depends on:** A4, B4
|
||||
**Est:** 5–10 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Ensure CLI prints FirstSignal within TTFS budget
|
||||
* [ ] Add `--follow` default behavior:
|
||||
|
||||
* connect to SSE and stream deltas
|
||||
* [ ] Provide minimal, non-spammy output:
|
||||
|
||||
* only on meaningful transitions
|
||||
* [ ] Print correlation id for support triage
|
||||
|
||||
**DoD**
|
||||
|
||||
* CLI TTFS p50 < 2s, p95 < 5s
|
||||
* Streaming works and degrades to polling
|
||||
|
||||
---
|
||||
|
||||
### B7 — CI annotations/checks: initial First Signal within 5s p95
|
||||
|
||||
**Owner:** @ci-owner
|
||||
**Depends on:** A4, B4 (optional)
|
||||
**Est:** 6–12 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] On CI job start:
|
||||
|
||||
* [ ] call `/signal` and publish check/annotation with summary + job link
|
||||
* [ ] Update annotations only on state changes:
|
||||
|
||||
* queued → started
|
||||
* started → blocked/failed/succeeded
|
||||
* [ ] Avoid annotation spam (max 3–5 updates)
|
||||
|
||||
**DoD**
|
||||
|
||||
* CI shows actionable first message within 5s p95
|
||||
* Updates are minimal and meaningful
|
||||
|
||||
---
|
||||
|
||||
### B8 — Observability: SLO alerts + exemplars
|
||||
|
||||
**Owner:** @sre-owner
|
||||
**Depends on:** A7
|
||||
**Est:** 5–10 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Alerts:
|
||||
|
||||
* [ ] page when `p95(ttfs_ms) > 5000` for 5 mins
|
||||
* [ ] page when `signal_endpoint_error_rate > 1%`
|
||||
* [ ] Add exemplars / trace links on slow TTFS samples
|
||||
* [ ] Add breakdown dashboard:
|
||||
|
||||
* surface (ui/cli/ci)
|
||||
* cacheHit
|
||||
* repo size bucket
|
||||
* kind/phase
|
||||
|
||||
**DoD**
|
||||
|
||||
* On-call can diagnose slow TTFS with one click to traces/logs
|
||||
|
||||
---
|
||||
|
||||
## 6) Cross-cutting: security, privacy, and tenancy
|
||||
|
||||
### S1 — Tenant-safe caching and lookups
|
||||
|
||||
**Owner:** @sec-owner + @be-owner
|
||||
**Depends on:** A3, B2
|
||||
**Est:** 3–6 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Ensure cache keys include tenant/org boundary where applicable:
|
||||
|
||||
* `tenant:{tenantId}:signal:job:{jobId}`
|
||||
* [ ] Ensure failure signatures are only looked up within same tenant
|
||||
* [ ] Add tests for cross-tenant leakage
|
||||
|
||||
**DoD**
|
||||
|
||||
* No cross-tenant access possible via cache or signature index
|
||||
|
||||
---
|
||||
|
||||
### S2 — No secrets policy enforcement
|
||||
|
||||
**Owner:** @sec-owner
|
||||
**Depends on:** B1
|
||||
**Est:** 2–5 pts
|
||||
|
||||
**Tasks**
|
||||
|
||||
* [ ] Add “secret scanning” unit tests for redaction
|
||||
* [ ] Add runtime guardrails:
|
||||
|
||||
* if excerpt contains forbidden patterns → replace with “[redacted]”
|
||||
* [ ] Ensure telemetry attributes never include excerpts
|
||||
|
||||
**DoD**
|
||||
|
||||
* Security review sign-off for snapshot + signature + telemetry
|
||||
|
||||
---
|
||||
|
||||
## 7) Global Definition of Done
|
||||
|
||||
A feature is “done” only when:
|
||||
|
||||
* [ ] Meets TTFS SLO in staging load test and in production RUM (within agreed rollout window)
|
||||
* [ ] Has:
|
||||
|
||||
* [ ] unit tests
|
||||
* [ ] integration tests
|
||||
* [ ] basic load test coverage for `/signal`
|
||||
* [ ] Has:
|
||||
|
||||
* [ ] dashboards
|
||||
* [ ] alerts (or explicitly deferred with signed waiver)
|
||||
* [ ] Has:
|
||||
|
||||
* [ ] secure redaction
|
||||
* [ ] tenant isolation
|
||||
* [ ] Has a rollback plan via feature flag
|
||||
|
||||
---
|
||||
|
||||
## 8) Test plan
|
||||
|
||||
### Unit tests
|
||||
|
||||
* FirstSignal contract validation (required fields, enums)
|
||||
* Redaction patterns (bearer tokens, API keys, URLs, long strings)
|
||||
* Signature extraction rule correctness
|
||||
|
||||
### Integration tests
|
||||
|
||||
* Create job → snapshot exists → `/signal` returns it
|
||||
* Worker phase transitions update snapshot
|
||||
* Job fail → signature stored → next job → `/signal` includes lastKnownOutcome
|
||||
* SSE connect → receive events in order → terminal event once
|
||||
|
||||
### Load tests (must-have)
|
||||
|
||||
* `/jobs/{id}/signal`:
|
||||
|
||||
* cache-hit p95 ≤ 250ms
|
||||
* cold path p95 ≤ 500ms
|
||||
* error rate < 0.1% under expected concurrency
|
||||
|
||||
### Chaos/degraded tests
|
||||
|
||||
* Cache down → Postgres fallback works
|
||||
* SSE blocked → UI polls and still updates
|
||||
|
||||
---
|
||||
|
||||
## 9) Feature flags and rollout
|
||||
|
||||
### Flags
|
||||
|
||||
* `ttfs.first_signal_enabled` (default ON in staging)
|
||||
* `ttfs.cache_enabled`
|
||||
* `ttfs.failure_index_enabled`
|
||||
* `ttfs.sse_enabled`
|
||||
* `ttfs.policy_preeval_enabled`
|
||||
|
||||
### Rollout steps
|
||||
|
||||
1. Enable baseline FirstSignal + snapshots for internal/staging
|
||||
2. Enable cache in default profile
|
||||
3. Enable failure index (read-only first; then write)
|
||||
4. Enable SSE for 10% traffic → 50% → 100%
|
||||
5. Enable CI annotations (start with non-blocking informational checks)
|
||||
|
||||
---
|
||||
|
||||
## 10) PR review checklist (paste into PR template)
|
||||
|
||||
* [ ] No blocking heavy work added to `/signal` path
|
||||
* [ ] Signal payload size remains < 1KB in normal cases
|
||||
* [ ] Excerpts are redacted + length-capped
|
||||
* [ ] Tenant boundary included in cache keys and DB queries
|
||||
* [ ] Metrics emitted (`ttfs_ms`, endpoint latency, cacheHit)
|
||||
* [ ] UI has no spinner-only state; always shows message + next action
|
||||
* [ ] Streaming has polling fallback
|
||||
* [ ] Tests added/updated (unit + integration)
|
||||
|
||||
---
|
||||
|
||||
## 11) “Ready for QA” scenarios
|
||||
|
||||
QA should validate:
|
||||
|
||||
* UI:
|
||||
|
||||
* click scan → first message within 2s typical
|
||||
* see queued/started/blocked states clearly
|
||||
* open logs works
|
||||
* CLI:
|
||||
|
||||
* first output within 2s typical
|
||||
* follow stream updates
|
||||
* CI:
|
||||
|
||||
* first annotation/check appears quickly and links to job
|
||||
* Security:
|
||||
|
||||
* inject fake token into logs → stored excerpt is redacted
|
||||
* Multi-tenant:
|
||||
|
||||
* run jobs across tenants → no leakage in signals or hints
|
||||
|
||||
---
|
||||
|
||||
````
|
||||
|
||||
If you want this split into **multiple repo-local files** (e.g., `/docs/TTFS/ARCH.md`, `/docs/TTFS/SCHEMAS.sql`, `/docs/TTFS/RUNBOOK.md`, plus a PR template snippet), say the folder structure you prefer and I’ll output them in the same paste-ready format.
|
||||
|
||||
---
|
||||
Here’s a crisp, first‑time‑friendly blueprint for **Smart‑Diff**—a minimal‑noise way to highlight only changes that actually shift security risk, not every tiny SBOM/VEX delta.
|
||||
|
||||
---
|
||||
|
||||
# What “Smart‑Diff” means (in plain terms)
|
||||
|
||||
Smart‑Diff is the **smallest set of changes** between two builds/releases that **materially change risk**. We only surface a change when it affects exploitability or policy, not when a dev-only transitive dependency bumps a patch version with no runtime path.
|
||||
|
||||
**Count it as a Smart‑Diff only if at least one of these flips:**
|
||||
|
||||
* **Reachability:** new reachable vulnerable code appears, or previously reachable code becomes unreachable.
|
||||
* **VEX status:** a CVE’s status changes (e.g., to `not_affected`).
|
||||
* **Version vs affected ranges:** a dependency crosses into/out of a known vulnerable range.
|
||||
* **KEV/EPSS/Policy:** CISA KEV listing, EPSS spike, or your org policy gates change.
|
||||
|
||||
Ignore:
|
||||
|
||||
* CVEs that are both **unreachable** and **VEX = not_affected**.
|
||||
* Pure patch‑level churn that doesn’t cross an affected range and isn’t KEV‑listed.
|
||||
* Dev/test‑only deps with **no runtime path**.
|
||||
|
||||
---
|
||||
|
||||
# Minimal data model (practical)
|
||||
|
||||
* **DiffSet { added, removed, changed }** for packages, symbols, CVEs, and policy gates.
|
||||
* **AffectedGraph { package → symbol → call‑site }**: reachability edges from entrypoints to vulnerable sinks.
|
||||
* **EvidenceLink { attestation | VEX | KEV | scanner trace }** per item, so every claim is traceable.
|
||||
|
||||
---
|
||||
|
||||
# Core algorithms (what makes it “smart”)
|
||||
|
||||
* **Reachability‑aware set ops:** run set diffs only on **reachable** vuln findings.
|
||||
* **SemVer gates:** treat “crossing an affected range” as a boolean boundary; patch bumps inside a safe range don’t alert.
|
||||
* **VEX merge logic:** vendor or internal VEX that says `not_affected` suppresses noise unless KEV contradicts.
|
||||
* **EPSS‑weighted priority:** rank surfaced diffs by latest EPSS; KEV always escalates to top.
|
||||
* **Policy overlays:** org rules (e.g., “block any KEV,” “warn if EPSS > 0.7”) applied last.
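
A minimal sketch of the SemVer gate, using the NuGet.Versioning package purely as an example range parser (any semver library works; the range string is illustrative):

```csharp
using NuGet.Versioning;

public static class SemverGate
{
    // Only crossing into or out of an affected range surfaces a diff item;
    // patch churn inside a safe range stays quiet.
    public static bool CrossesAffectedBoundary(string oldVersion, string newVersion, string affectedRange)
    {
        var range = VersionRange.Parse(affectedRange);                    // e.g. "[3.0.0, 3.0.12)"
        var wasAffected = range.Satisfies(NuGetVersion.Parse(oldVersion));
        var isAffected  = range.Satisfies(NuGetVersion.Parse(newVersion));
        return wasAffected != isAffected;                                 // boundary crossed either way
    }
}
```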
|
||||
|
||||
---
|
||||
|
||||
# Example (why it’s quieter, but safer)
|
||||
|
||||
* **OpenSSL 3.0.10 → 3.0.11** with VEX `not_affected` for a CVE: Smart‑Diff marks **risk down** and **closes** the prior alert.
|
||||
* A **transitive dev dependency** changes with **no runtime path**: Smart‑Diff **logs only**, no red flag.
|
||||
|
||||
---
|
||||
|
||||
# Implementation plan (Stella Ops‑ready)
|
||||
|
||||
**1) Inputs**
|
||||
|
||||
* SBOM (CycloneDX/SPDX) old vs new
|
||||
* VEX (OpenVEX/CycloneDX VEX)
|
||||
* Vuln feeds (NVD, vendor), **CISA KEV**, **EPSS**
|
||||
* Reachability traces (per language analyzers)
|
||||
|
||||
**2) Normalize**
|
||||
|
||||
* Map all deps to **purl**, normalize versions, index CVEs → affected ranges.
|
||||
* Ingest VEX and attach to CVE ↔ component with clear status precedence.
|
||||
|
||||
**3) Build graphs**
|
||||
|
||||
* Generate/refresh **AffectedGraph** per build: entrypoints → call stacks → vulnerable symbols.
|
||||
* Tag each finding with `{reachable?, vex_status, kev?, epss, policy_flags}`.
|
||||
|
||||
**4) Diff**
|
||||
|
||||
* Compute **DiffSet** between builds for:
|
||||
|
||||
* Reachable findings
|
||||
* VEX statuses
|
||||
* Version/range crossings
|
||||
* Policy/KEV/EPSS gates
|
||||
|
||||
**5) Prioritize & suppress**
|
||||
|
||||
* Drop items that are **unreachable AND not_affected**.
|
||||
* Collapse patch‑level churn unless **KEV‑listed**.
|
||||
* Sort remaining by **KEV first**, then **EPSS**, then **runtime blast‑radius** (fan‑in/fan‑out).
|
||||
|
||||
**6) Evidence**
|
||||
|
||||
* Attach **EvidenceLink** to each surfaced change:
|
||||
|
||||
* VEX doc (line/ID)
|
||||
* KEV entry
|
||||
* EPSS score + timestamp
|
||||
* Reachability call stack (top 1‑3 paths)
|
||||
|
||||
**7) UX**
|
||||
|
||||
* Pipeline‑first: output a **Smart‑Diff report JSON** + concise CLI table:
|
||||
|
||||
* `risk ↑/↓`, reason (reachability/VEX/KEV/EPSS), component@version, CVE, **one** example call‑stack.
|
||||
* UI is an explainer: expand to full stack, VEX note, KEV link, and “minimum safe change” suggestion.
|
||||
|
||||
---
|
||||
|
||||
# Module sketch (your stack)
|
||||
|
||||
* **Services:** `Sbomer.Diff`, `Vexer.Merge`, `Scanner.Reachability`, `Feedser.KEV/EPSS`, `Policy.Engine`, `SmartDiff.Service`
|
||||
* **Store:** PostgreSQL (system of record), Valkey cache (ephemeral). Tables: `components`, `cves`, `vex_entries`, `reachability_edges`, `smartdiff_events`, `evidence_links`.
|
||||
* **APIs:**
|
||||
|
||||
* `POST /smartdiff/compare` → returns filtered diff + priorities
|
||||
* `GET /smartdiff/:id/evidence` → links to VEX/KEV/EPSS + trace
|
||||
* **CI usage:** `stella smart-diff --old sbomA.json --new sbomB.json --vex vex.json --out smartdiff.json`
|
||||
|
||||
---
|
||||
|
||||
# Guardrails (to keep it deterministic)
|
||||
|
||||
* Freeze feed snapshots per run (hash KEV/EPSS CSVs + VEX docs).
|
||||
* Version the merge rules (VEX precedence + policy) and emit in the report header.
|
||||
* Log the **exact** semver comparisons that triggered/exempted an alert.
|
||||
|
||||
If you want, I can draft the **Postgres schema**, the **.NET 10 DTOs** for `DiffSet` and `AffectedGraph`, and a **CLI prototype** (`stella smart-diff`) you can drop into your pipeline.
|
||||
Noted: the services are **Concelier** (feeds: KEV/EPSS/NVD/vendor snapshots) and **Excititor** (VEX merge + status resolution). I’ll use those names going forward.
|
||||
|
||||
Below is a **product + business analysis implementation spec** that a developer can follow to build the Smart‑Diff capability you described.
|
||||
|
||||
---
|
||||
|
||||
# 1) Product objective
|
||||
|
||||
## Problem
|
||||
|
||||
Classic SBOM/VEX diffs are noisy: they surface *all* dependency/CVE churn, even when nothing changes in **actual exploitable risk**.
|
||||
|
||||
## Goal
|
||||
|
||||
Produce a **Smart‑Diff report** between two builds/releases that highlights only changes that **materially impact security risk**, with evidence attached.
|
||||
|
||||
## Success criteria
|
||||
|
||||
* **Noise reduction:** >80% fewer diff items vs raw SBOM diff for typical builds (measured by count).
|
||||
* **No missed “high-risk flips”:** any change that creates or removes a **reachable vulnerable path** must appear.
|
||||
* **Traceability:** every surfaced Smart‑Diff item has at least **one evidence link** (VEX entry, reachability trace, KEV reference, feed snapshot hash, scanner output).
|
||||
|
||||
---
|
||||
|
||||
# 2) Scope
|
||||
|
||||
## In scope (MVP)
|
||||
|
||||
* Compare two “build snapshots”: `{SBOM, VEX, reachability traces, vuln feed snapshot, policy snapshot}`
|
||||
* Detect & report these change types:
|
||||
|
||||
1. **Reachability flips** (reachable ↔ unreachable)
|
||||
2. **VEX status changes** (e.g., `affected` → `not_affected`)
|
||||
3. **Version crosses vuln boundary** (safe ↔ affected range)
|
||||
4. **KEV/EPSS/policy gate flips** (e.g., becomes KEV-listed)
|
||||
* Suppress noise using explicit rules (see section 6)
|
||||
* Output:
|
||||
|
||||
* JSON report for CI
|
||||
* concise CLI output (table)
|
||||
* optional UI list view (later)
|
||||
|
||||
## Out of scope (for now)
|
||||
|
||||
* Full remediation planning / patch PR automation
|
||||
* Cross-repo portfolio aggregation (doable later)
|
||||
* Advanced exploit intelligence beyond KEV/EPSS
|
||||
|
||||
---
|
||||
|
||||
# 3) Key definitions (developers must implement these exactly)
|
||||
|
||||
## 3.1 Finding
|
||||
|
||||
A “finding” is a tuple:
|
||||
|
||||
`FindingKey = (component_purl, component_version, cve_id)`
|
||||
|
||||
…and includes computed fields:
|
||||
|
||||
* `reachable: bool | unknown`
|
||||
* `vex_status: enum` (see 3.3)
|
||||
* `in_affected_range: bool | unknown`
|
||||
* `kev: bool`
|
||||
* `epss_score: float | null`
|
||||
* `policy_flags: set<string>`
|
||||
* `evidence_links: list<EvidenceLink>`
|
||||
|
||||
## 3.2 Material risk change (Smart‑Diff item)
|
||||
|
||||
A change is “material” if it changes the computed **RiskState** for any `FindingKey` or creates/removes a `FindingKey` that is in-scope after suppression rules.
|
||||
|
||||
## 3.3 VEX status vocabulary
|
||||
|
||||
Normalize all incoming VEX statuses into a fixed internal enum:
|
||||
|
||||
* `AFFECTED`
|
||||
* `NOT_AFFECTED`
|
||||
* `FIXED`
|
||||
* `UNDER_INVESTIGATION`
|
||||
* `UNKNOWN` (no statement or unparseable)
|
||||
|
||||
> Note: Use OpenVEX/CycloneDX VEX mappings, but internal logic must operate on the above set.
|
||||
|
||||
---
|
||||
|
||||
# 4) System context and responsibilities
|
||||
|
||||
You already have a modular setup. Developers should implement Smart‑Diff as a pipeline over these components:
|
||||
|
||||
## Components (names aligned to your system)
|
||||
|
||||
* **Sbomer**
|
||||
|
||||
* Ingest SBOM(s), normalize to purl/version graph
|
||||
* **Scanner.Reachability**
|
||||
|
||||
* Produce reachability traces: entrypoints → call paths → vulnerable symbol/sink
|
||||
* **Concelier**
|
||||
|
||||
* Fetch + snapshot vulnerability intelligence (NVD/vendor/OSV as applicable), **CISA KEV**, **EPSS**
|
||||
* Provide *feed snapshot identifiers* (hashes) per run
|
||||
* **Excititor**
|
||||
|
||||
* Ingest and merge VEX sources
|
||||
* Resolve a final `vex_status` per (component, cve)
|
||||
* Provide precedence + explanation
|
||||
* **Policy.Engine**
|
||||
|
||||
* Evaluate org rules against a computed finding (e.g., “block if KEV”)
|
||||
* **SmartDiff.Service**
|
||||
|
||||
* Compute risk states for “old” and “new”
|
||||
* Diff them
|
||||
* Suppress noise
|
||||
* Rank + output report with evidence
|
||||
|
||||
---
|
||||
|
||||
# 5) Developer deliverables
|
||||
|
||||
## Deliverable A: Smart‑Diff computation library
|
||||
|
||||
A deterministic library that takes:
|
||||
|
||||
* `OldSnapshot` and `NewSnapshot` (see section 7)
|
||||
* returns a `SmartDiffReport`
|
||||
|
||||
## Deliverable B: Service endpoint
|
||||
|
||||
`POST /smartdiff/compare` returns report JSON.
|
||||
|
||||
## Deliverable C: CLI command
|
||||
|
||||
`stella smart-diff --old <dir|file> --new <dir|file> [--policy policy.json] --out smartdiff.json`
|
||||
|
||||
---
|
||||
|
||||
# 6) Smart‑Diff rules
|
||||
|
||||
Developers must implement these as **explicit, testable rule functions**.
|
||||
|
||||
## 6.1 Suppression rules (noise filters)
|
||||
|
||||
A finding is **suppressed** only if ALL of the following apply (see the sketch after this list):
|
||||
|
||||
1. `reachable == false` (treat `unknown` as false only by explicit decision; recommended: `unknown` is *not* suppressible)
|
||||
2. `vex_status == NOT_AFFECTED`
|
||||
3. `kev == false`
|
||||
4. no policy requires it (e.g., “report all vuln findings” override)
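A minimal sketch of the combined predicate, assuming the `Finding`/`VexStatus` shapes sketched in 3.1 and a hypothetical `IPolicy` hook for the "report all vuln findings" override; it pins down the ALL-of-these semantics, not the final API.

```csharp
// Assumes the Finding / VexStatus records sketched in section 3.1.
public interface IPolicy
{
    bool RequiresReporting(Finding finding);   // hypothetical org override, e.g. "report all vuln findings"
}

public static class SuppressionRules
{
    // Suppress only when every condition holds; unknown reachability (null) is never suppressible.
    public static bool IsSuppressed(Finding f, IPolicy policy) =>
        f.Reachable == false
        && f.VexStatus == VexStatus.NotAffected
        && !f.Kev
        && !policy.RequiresReporting(f);
}
```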
|
||||
|
||||
**Patch churn suppression**
|
||||
|
||||
* If a component version changes but:
|
||||
|
||||
* `in_affected_range` remains false in both versions, AND
|
||||
* no KEV/policy flag flips,
|
||||
* then suppress (don’t surface).
|
||||
|
||||
**Dev/test dependency suppression (optional if you already tag scopes)**
|
||||
|
||||
* If SBOM scope indicates `dev/test` AND `reachable == false`, suppress.
|
||||
* If reachability is unknown, do **not** suppress by scope alone (avoid false negatives).
|
||||
|
||||
## 6.2 Material change detection rules
|
||||
|
||||
Surface a Smart‑Diff item when any of the following changes between old and new:
|
||||
|
||||
### Rule R1: Reachability flip
|
||||
|
||||
* `reachable` changes: `false → true` (risk ↑) or `true → false` (risk ↓)
|
||||
* Include at least one call path as evidence if reachable is true.
|
||||
|
||||
### Rule R2: VEX status flip
|
||||
|
||||
* `vex_status` changes meaningfully:
|
||||
|
||||
* `AFFECTED ↔ NOT_AFFECTED`
|
||||
* `UNDER_INVESTIGATION → NOT_AFFECTED` etc.
|
||||
* Changes involving `UNKNOWN` should be shown but ranked lower unless KEV.
|
||||
|
||||
### Rule R3: Affected range boundary
|
||||
|
||||
* `in_affected_range` flips:
|
||||
|
||||
* `false → true` (risk ↑)
|
||||
* `true → false` (risk ↓)
|
||||
* This is the main guard against patch churn noise.
|
||||
|
||||
### Rule R4: Intelligence / policy flip
|
||||
|
||||
* `kev` changes `false → true` or `epss_score` crosses a configured threshold
|
||||
* any `policy_flag` changes severity (warn → block)
|
||||
|
||||
---
|
||||
|
||||
# 7) Snapshot contract (what Smart‑Diff compares)
|
||||
|
||||
Define a stable internal format:
|
||||
|
||||
```json
|
||||
{
|
||||
"snapshot_id": "build-2025.12.14+sha.abc123",
|
||||
"created_at": "2025-12-14T12:34:56Z",
|
||||
"sbom": { "...": "CycloneDX or SPDX raw" },
|
||||
"vex_documents": [ { "...": "OpenVEX/CycloneDX VEX raw" } ],
|
||||
"reachability": {
|
||||
"analyzer": "java-callgraph@1.2.0",
|
||||
"entrypoints": ["com.app.Main#main"],
|
||||
"paths": [
|
||||
{
|
||||
"component_purl": "pkg:maven/org.example/foo@1.2.3",
|
||||
"cve": "CVE-2024-1234",
|
||||
"sink": "org.example.foo.VulnClass#vulnMethod",
|
||||
"callstack": ["...", "..."]
|
||||
}
|
||||
]
|
||||
},
|
||||
"concelier_feed_snapshot": {
|
||||
"kev_hash": "sha256:...",
|
||||
"epss_hash": "sha256:...",
|
||||
"vuln_db_hash": "sha256:..."
|
||||
},
|
||||
"policy_snapshot": { "policy_hash": "sha256:...", "rules": [ ... ] }
|
||||
}
|
||||
```
|
||||
|
||||
**Implementation note**
|
||||
|
||||
* SBOM/VEX can remain “raw”, but you must also build normalized indexes (in-memory or stored) for diffing.
|
||||
|
||||
---
|
||||
|
||||
# 8) Data normalization requirements
|
||||
|
||||
## 8.1 Component identity
|
||||
|
||||
* Use **purl** as canonical component ID.
|
||||
* Normalize casing, qualifiers, and version strings per ecosystem.
|
||||
|
||||
## 8.2 Vulnerability identity
|
||||
|
||||
* Use `CVE-*` as primary key where available.
|
||||
* If you ingest OSV IDs too, map them to CVE when possible but keep OSV ID in evidence.
|
||||
|
||||
## 8.3 Affected range evaluation
|
||||
|
||||
Implement:
|
||||
`bool? IsVersionInAffectedRange(version, affectedRanges)`
|
||||
|
||||
Return `null` (unknown) if version cannot be parsed or range semantics are unknown.
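A sketch of the tri-state check for SemVer-style ecosystems, assuming the NuGet.Versioning package for parsing/comparison and a hypothetical `AffectedRange` shape with introduced/fixed bounds; other ecosystems (deb, rpm, PyPI) would need their own comparer behind the same signature.

```csharp
using NuGet.Versioning;   // assumption: SemVer-style versions; swap the comparer per ecosystem

public sealed record AffectedRange(string? IntroducedInclusive, string? FixedExclusive);

public static class RangeEvaluation
{
    // true = inside an affected range, false = outside all ranges, null = unknown.
    public static bool? IsVersionInAffectedRange(string version, IReadOnlyList<AffectedRange> ranges)
    {
        if (!NuGetVersion.TryParse(version, out var v))
            return null;                               // version not parseable -> unknown

        var sawUsableRange = false;
        foreach (var range in ranges)
        {
            NuGetVersion? lo = null, hi = null;
            if (range.IntroducedInclusive is { } loText && !NuGetVersion.TryParse(loText, out lo)) continue;
            if (range.FixedExclusive is { } hiText && !NuGetVersion.TryParse(hiText, out hi)) continue;
            sawUsableRange = true;

            var aboveLo = lo is null || v >= lo;
            var belowHi = hi is null || v < hi;
            if (aboveLo && belowHi) return true;
        }
        return sawUsableRange ? false : (bool?)null;   // no usable range semantics -> unknown
    }
}
```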
|
||||
|
||||
---
|
||||
|
||||
# 9) Excititor: VEX merge requirements
|
||||
|
||||
Developers should implement Excititor as a deterministic resolver:
|
||||
|
||||
## 9.1 Inputs
|
||||
|
||||
* List of VEX documents, each with metadata:
|
||||
|
||||
* `source` (vendor/internal/scanner)
|
||||
* `issued_at`
|
||||
* `signature/attestation` info (if present)
|
||||
|
||||
## 9.2 Output
|
||||
|
||||
For each `(component_purl, cve_id)`:
|
||||
|
||||
* `final_status`
|
||||
* `winning_statement_id`
|
||||
* `precedence_reason`
|
||||
* `all_statements[]` (for audit)
|
||||
|
||||
## 9.3 Precedence rules (recommendation)
|
||||
|
||||
Implement as ordered priority (highest wins), unless overridden by your org:
|
||||
|
||||
1. **Internal signed VEX** (security team attested)
|
||||
2. **Vendor signed VEX**
|
||||
3. **Internal unsigned VEX**
|
||||
4. **Scanner/VEX-like annotations**
|
||||
5. None → `UNKNOWN`
|
||||
|
||||
Conflict handling (see the sketch below):
|
||||
|
||||
* If two same-priority statements disagree, pick newest by `issued_at`, but **record conflict** and surface it as a low-priority Smart‑Diff meta-item (optional).
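A sketch of the deterministic resolver, assuming a flattened `VexStatement` shape carrying the 9.1 metadata mapped onto a numeric source tier; the ordering mirrors the precedence list above, and the conflict flag feeds the optional meta-item.

```csharp
using System.Linq;

public enum VexStatus { Affected, NotAffected, Fixed, UnderInvestigation, Unknown }   // §3.3 vocabulary

public enum VexSourceTier { InternalSigned = 1, VendorSigned = 2, InternalUnsigned = 3, ScannerAnnotation = 4 }

public sealed record VexStatement(string Id, VexSourceTier Tier, DateTimeOffset IssuedAt, VexStatus Status);

public sealed record VexResolution(VexStatus FinalStatus, string? WinningStatementId, string PrecedenceReason, bool Conflict);

public static class ExcititorResolver
{
    public static VexResolution Resolve(IReadOnlyList<VexStatement> statements)
    {
        if (statements.Count == 0)
            return new(VexStatus.Unknown, null, "no statements", Conflict: false);

        // Highest-priority tier wins; within that tier, newest issued_at wins.
        var bestTier = statements.Min(s => s.Tier);
        var sameTier = statements.Where(s => s.Tier == bestTier)
                                 .OrderByDescending(s => s.IssuedAt)
                                 .ToList();

        var winner = sameTier[0];
        var conflict = sameTier.Any(s => s.Status != winner.Status);   // same tier, disagreeing statuses

        return new(winner.Status, winner.Id, $"tier {bestTier}, newest issued_at wins", conflict);
    }
}
```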
|
||||
|
||||
---
|
||||
|
||||
# 10) Concelier: feed snapshot requirements
|
||||
|
||||
Concelier must provide deterministic inputs to Smart‑Diff.
|
||||
|
||||
## 10.1 What Concelier stores
|
||||
|
||||
* KEV list snapshot
|
||||
* EPSS snapshot
|
||||
* Vulnerability database snapshot (your choice: NVD mirror, OSV, vendor advisories)
|
||||
|
||||
## 10.2 Required APIs (internal)
|
||||
|
||||
* `GET /concelier/snapshots/latest`
|
||||
* `GET /concelier/snapshots/{hash}`
|
||||
* `GET /concelier/kev/{snapshotHash}/is_listed?cve=CVE-...`
|
||||
* `GET /concelier/epss/{snapshotHash}/score?cve=CVE-...`
|
||||
|
||||
## 10.3 Determinism
|
||||
|
||||
The Smart‑Diff report must include the snapshot hashes used, so the result can be reproduced.
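A minimal C# interface sketch mirroring these endpoints; the method and record names are assumptions, and every lookup is pinned to a snapshot hash so a re-run reproduces the same answers.

```csharp
using System.Threading;
using System.Threading.Tasks;

public sealed record FeedSnapshot(string SnapshotHash, string KevHash, string EpssHash, string VulnDbHash);

// Hypothetical internal client surface for Concelier.
public interface IConcelierClient
{
    Task<FeedSnapshot> GetLatestSnapshotAsync(CancellationToken ct = default);
    Task<FeedSnapshot> GetSnapshotAsync(string snapshotHash, CancellationToken ct = default);

    // Pinned to a snapshot hash so Smart-Diff results stay reproducible.
    Task<bool> IsKevListedAsync(string snapshotHash, string cveId, CancellationToken ct = default);
    Task<double?> GetEpssScoreAsync(string snapshotHash, string cveId, CancellationToken ct = default);
}
```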
|
||||
|
||||
---
|
||||
|
||||
# 11) RiskState computation (core dev logic)
|
||||
|
||||
Implement a pure function:
|
||||
|
||||
`RiskState ComputeRiskState(FindingKey key, Snapshot snapshot)`
|
||||
|
||||
### Inputs used
|
||||
|
||||
* SBOM: to confirm component exists, scope, runtime path
|
||||
* Concelier feeds: KEV, EPSS, affected ranges
|
||||
* Excititor: VEX status
|
||||
* Reachability analyzer output
|
||||
* Policy engine: flags based on org rules
|
||||
|
||||
### Output
|
||||
|
||||
```json
|
||||
{
|
||||
"finding_key": { "purl": "...", "version": "...", "cve": "..." },
|
||||
"reachable": true,
|
||||
"vex_status": "AFFECTED",
|
||||
"in_affected_range": true,
|
||||
"kev": false,
|
||||
"epss": 0.42,
|
||||
"policy": {
|
||||
"decision": "WARN|BLOCK|ALLOW",
|
||||
"flags": ["epss_over_0_4"]
|
||||
},
|
||||
"evidence": [
|
||||
{ "type": "reachability_trace", "ref": "trace:abc", "detail": "short call stack..." },
|
||||
{ "type": "vex", "ref": "openvex:doc123#stmt7" },
|
||||
{ "type": "concelier_snapshot", "ref": "sha256:..." }
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# 12) Diff engine specification
|
||||
|
||||
## 12.1 Inputs
|
||||
|
||||
* `OldRiskStates: map<FindingKey, RiskState>`
|
||||
* `NewRiskStates: map<FindingKey, RiskState>`
|
||||
|
||||
You build these maps by:
|
||||
|
||||
1. Enumerating candidate findings in each snapshot:
|
||||
|
||||
* from vulnerability matching against SBOM components (affected ranges)
|
||||
* plus any VEX statements referencing components
|
||||
2. Joining with reachability traces
|
||||
3. Resolving status via Excititor
|
||||
4. Applying Concelier intelligence + policy
|
||||
|
||||
## 12.2 Diff output types
|
||||
|
||||
Return `SmartDiffItem` with:
|
||||
|
||||
* `change_type`: `ADDED|REMOVED|CHANGED`
|
||||
* `risk_direction`: `UP|DOWN|NEUTRAL`
|
||||
* `reason_codes`: `[REACHABILITY_FLIP, VEX_FLIP, RANGE_FLIP, KEV_FLIP, POLICY_FLIP, EPSS_THRESHOLD]`
|
||||
* `old_state` / `new_state`
|
||||
* `priority_score`
|
||||
* `evidence_links[]`
|
||||
|
||||
## 12.3 Suppress AFTER diff, not before
|
||||
|
||||
Important: compute the diff on the full sets, then suppress items by rules (see the sketch below), because:
|
||||
|
||||
* suppression itself can flip (e.g., VEX becomes `not_affected` → item disappears, which is meaningful as “risk down”).
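A compressed sketch of that ordering, assuming the `FindingKey`/`RiskState` shapes from earlier sections and leaving reason-code and direction derivation to the real implementation; the point is that suppression filters already-diffed items.

```csharp
using System.Linq;

public enum ChangeType { Added, Removed, Changed }

// Minimal stand-in; the real SmartDiffItem also carries direction, reason codes, score, and evidence.
public sealed record SmartDiffItem(FindingKey Key, ChangeType ChangeType, RiskState? OldState, RiskState? NewState);

public static class DiffEngine
{
    public static IReadOnlyList<SmartDiffItem> DiffRiskStates(
        IReadOnlyDictionary<FindingKey, RiskState> oldStates,
        IReadOnlyDictionary<FindingKey, RiskState> newStates,
        Func<SmartDiffItem, bool> isSuppressed)            // §6.1 rules, evaluated AFTER the diff
    {
        var items = new List<SmartDiffItem>();
        foreach (var key in oldStates.Keys.Union(newStates.Keys))
        {
            var oldState = oldStates.GetValueOrDefault(key);
            var newState = newStates.GetValueOrDefault(key);

            if (oldState is null) items.Add(new(key, ChangeType.Added, null, newState));
            else if (newState is null) items.Add(new(key, ChangeType.Removed, oldState, null));
            else if (!oldState.Equals(newState)) items.Add(new(key, ChangeType.Changed, oldState, newState));
            // identical states produce no item
        }

        // Suppress last: a finding that vanishes because VEX flipped still registers as a removal (risk down).
        return items.Where(i => !isSuppressed(i)).ToList();
    }
}
```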
|
||||
|
||||
---
|
||||
|
||||
# 13) Priority scoring & ranking
|
||||
|
||||
Implement a deterministic score:
|
||||
|
||||
### Hard ordering
|
||||
|
||||
1. `kev == true` in new state → top tier
|
||||
2. Reachable in new state (`reachable == true`) → next tier
|
||||
|
||||
### Numeric scoring (example)
|
||||
|
||||
```
|
||||
score =
|
||||
+ 1000 if new.kev
|
||||
+ 500 if new.reachable
|
||||
+ 200 if reason includes RANGE_FLIP to affected
|
||||
+ 150 if VEX_FLIP to AFFECTED
|
||||
+ 0..100 based on EPSS (epss * 100)
|
||||
+ policy weight: +300 if decision BLOCK, +100 if WARN
|
||||
```
|
||||
|
||||
Always include `score_breakdown` in report for explainability.
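A direct transcription of the example weights into C#, assuming a `RiskState` with `Kev`, `Reachable`, `Epss`, and `PolicyDecision` members (names are assumptions); the per-term dictionary is what gets emitted as `score_breakdown`.

```csharp
using System.Linq;

public static class PriorityScoring
{
    // Sketch of the example weights above; tune them alongside policy, not ad hoc.
    public static (int Score, IReadOnlyDictionary<string, int> Breakdown) Score(
        RiskState newState, IReadOnlySet<string> reasonCodes)
    {
        var parts = new Dictionary<string, int>();

        if (newState.Kev) parts["kev"] = 1000;
        if (newState.Reachable == true) parts["reachable"] = 500;
        // Spec awards these only when the flip is INTO the affected range/state; the direction check is omitted here.
        if (reasonCodes.Contains("RANGE_FLIP")) parts["range_flip"] = 200;
        if (reasonCodes.Contains("VEX_FLIP")) parts["vex_flip"] = 150;
        if (newState.Epss is { } epss) parts["epss"] = (int)Math.Round(epss * 100);
        parts["policy"] = newState.PolicyDecision switch { "BLOCK" => 300, "WARN" => 100, _ => 0 };

        return (parts.Values.Sum(), parts);
    }
}
```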
|
||||
|
||||
---
|
||||
|
||||
# 14) Evidence requirements (must implement)
|
||||
|
||||
Every Smart‑Diff item must include **at least one** evidence link, and ideally 2–4:
|
||||
|
||||
EvidenceLink schema:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "vex|reachability|kev|epss|scanner|sbom|policy",
|
||||
"ref": "stable identifier",
|
||||
"summary": "one-line human readable",
|
||||
"blob_hash": "sha256 of raw evidence payload (optional)"
|
||||
}
|
||||
```
|
||||
|
||||
Examples:
|
||||
|
||||
* `type=kev`: ref is `concelier:kev@{snapshotHash}#CVE-2024-1234`
|
||||
* `type=reachability`: ref is `reach:{snapshotId}:{traceId}`
|
||||
* `type=vex`: ref is `openvex:{docHash}#statement:{id}`
|
||||
|
||||
---
|
||||
|
||||
# 15) API specification
|
||||
|
||||
## 15.1 Compare endpoint
|
||||
|
||||
`POST /smartdiff/compare`
|
||||
|
||||
Request:
|
||||
|
||||
```json
|
||||
{
|
||||
"old_snapshot_id": "buildA",
|
||||
"new_snapshot_id": "buildB",
|
||||
"options": {
|
||||
"include_suppressed": false,
|
||||
"max_items": 200,
|
||||
"epss_threshold": 0.7
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Response:
|
||||
|
||||
```json
|
||||
{
|
||||
"report_id": "smartdiff:2025-12-14:xyz",
|
||||
"old": { "snapshot_id": "buildA", "feed_hashes": { ... } },
|
||||
"new": { "snapshot_id": "buildB", "feed_hashes": { ... } },
|
||||
"summary": {
|
||||
"risk_up": 3,
|
||||
"risk_down": 8,
|
||||
"reachable_new": 2,
|
||||
"kev_new": 1,
|
||||
"suppressed": 143
|
||||
},
|
||||
"items": [
|
||||
{
|
||||
"change_type": "CHANGED",
|
||||
"risk_direction": "UP",
|
||||
"priority_score": 1680,
|
||||
"reason_codes": ["REACHABILITY_FLIP","RANGE_FLIP"],
|
||||
"finding_key": {
|
||||
"purl": "pkg:maven/org.example/foo",
|
||||
"version_old": "1.2.3",
|
||||
"version_new": "1.2.4",
|
||||
"cve": "CVE-2024-1234"
|
||||
},
|
||||
"old_state": { "...": "RiskState" },
|
||||
"new_state": { "...": "RiskState" },
|
||||
"evidence": [ ... ]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## 15.2 Evidence endpoint
|
||||
|
||||
`GET /smartdiff/{report_id}/evidence/{evidence_ref}`
|
||||
|
||||
Returns raw stored evidence (or a signed URL if you store blobs elsewhere).
|
||||
|
||||
---
|
||||
|
||||
# 16) CLI behavior
|
||||
|
||||
Command:
|
||||
|
||||
```
|
||||
stella smart-diff \
|
||||
--old ./snapshots/buildA \
|
||||
--new ./snapshots/buildB \
|
||||
--policy ./policy.json \
|
||||
--out ./smartdiff.json
|
||||
```
|
||||
|
||||
CLI output (human):
|
||||
|
||||
* Summary line: `risk ↑ 3 | risk ↓ 8 | new reachable 2 | new KEV 1`
|
||||
* Then top N items sorted by priority, each one line:
|
||||
|
||||
* `↑ REACHABILITY_FLIP foo@1.2.4 CVE-2024-1234 (EPSS 0.42) path: Main→...→vulnMethod`
|
||||
|
||||
Exit code:
|
||||
|
||||
* `0` if policy decision overall is ALLOW/WARN
|
||||
* `2` if any item triggers policy BLOCK in new snapshot (configurable)
|
||||
|
||||
---
|
||||
|
||||
# 17) Storage schema (Postgres) — implementation-ready
|
||||
|
||||
You can implement everything in a single schema to start and split it later.
|
||||
|
||||
## Core tables
|
||||
|
||||
### `snapshots`
|
||||
|
||||
* `snapshot_id (pk)`
|
||||
* `created_at`
|
||||
* `sbom_hash`
|
||||
* `policy_hash`
|
||||
* `kev_hash`
|
||||
* `epss_hash`
|
||||
* `vuln_db_hash`
|
||||
* `metadata jsonb`
|
||||
|
||||
### `components`
|
||||
|
||||
* `component_id (pk)` (internal UUID)
|
||||
* `snapshot_id (fk)`
|
||||
* `purl`
|
||||
* `version`
|
||||
* `scope` (runtime/dev/test/unknown)
|
||||
* `direct bool`
|
||||
* indexes on `(snapshot_id, purl)` and `(purl, version)`
|
||||
|
||||
### `findings`
|
||||
|
||||
* `finding_id (pk)`
|
||||
* `snapshot_id (fk)`
|
||||
* `purl`
|
||||
* `version`
|
||||
* `cve`
|
||||
* `reachable bool null`
|
||||
* `vex_status text`
|
||||
* `in_affected_range bool null`
|
||||
* `kev bool`
|
||||
* `epss real null`
|
||||
* `policy_decision text`
|
||||
* `policy_flags text[]`
|
||||
* index `(snapshot_id, purl, cve)`
|
||||
|
||||
### `reachability_traces`
|
||||
|
||||
* `trace_id (pk)`
|
||||
* `snapshot_id (fk)`
|
||||
* `purl`
|
||||
* `cve`
|
||||
* `sink`
|
||||
* `callstack jsonb`
|
||||
* index `(snapshot_id, purl, cve)`
|
||||
|
||||
### `vex_statements`
|
||||
|
||||
* `stmt_id (pk)`
|
||||
* `snapshot_id (fk)`
|
||||
* `purl`
|
||||
* `cve`
|
||||
* `source`
|
||||
* `issued_at`
|
||||
* `status`
|
||||
* `doc_hash`
|
||||
* `raw jsonb`
|
||||
* index `(snapshot_id, purl, cve)`
|
||||
|
||||
### `smartdiff_reports`
|
||||
|
||||
* `report_id (pk)`
|
||||
* `created_at`
|
||||
* `old_snapshot_id`
|
||||
* `new_snapshot_id`
|
||||
* `options jsonb`
|
||||
* `summary jsonb`
|
||||
|
||||
### `smartdiff_items`
|
||||
|
||||
* `item_id (pk)`
|
||||
* `report_id (fk)`
|
||||
* `change_type`
|
||||
* `risk_direction`
|
||||
* `priority_score`
|
||||
* `reason_codes text[]`
|
||||
* `purl`
|
||||
* `cve`
|
||||
* `old_version`
|
||||
* `new_version`
|
||||
* `old_state jsonb`
|
||||
* `new_state jsonb`
|
||||
|
||||
### `evidence_links`
|
||||
|
||||
* `evidence_id (pk)`
|
||||
* `report_id (fk)`
|
||||
* `item_id (fk)`
|
||||
* `type`
|
||||
* `ref`
|
||||
* `summary`
|
||||
* `blob_hash`
|
||||
|
||||
---
|
||||
|
||||
# 18) Implementation plan (developer-focused)
|
||||
|
||||
## Phase 1 — MVP (end-to-end working)
|
||||
|
||||
1. **Normalize SBOM**
|
||||
|
||||
* Parse CycloneDX/SPDX
|
||||
* Build `components` list with purl + version + scope
|
||||
2. **Concelier integration**
|
||||
|
||||
* Load KEV + EPSS snapshots (even from local files initially)
|
||||
* Expose snapshot hashes
|
||||
3. **Excititor integration**
|
||||
|
||||
* Parse OpenVEX/CycloneDX VEX
|
||||
* Implement precedence rules and output `final_status`
|
||||
4. **Affected range matching**
|
||||
|
||||
* For each component, query vulnerability DB snapshot for affected ranges
|
||||
* Produce candidate findings `(purl, version, cve)`
|
||||
5. **Reachability ingestion**
|
||||
|
||||
* Accept reachability JSON traces (even if generated elsewhere initially)
|
||||
* Mark `reachable=true` when trace exists for (purl,cve)
|
||||
6. **Compute RiskState**
|
||||
|
||||
* For each finding compute `kev`, `epss`, `policy_decision`
|
||||
7. **Diff + suppression + ranking**
|
||||
|
||||
* Generate `SmartDiffReport`
|
||||
8. **Outputs**
|
||||
|
||||
* JSON report + CLI table
|
||||
* Store report + items in Postgres
|
||||
|
||||
Acceptance tests for Phase 1:
|
||||
|
||||
* Given a known pair of snapshots, Smart‑Diff only includes:
|
||||
|
||||
* reachable vulnerable changes
|
||||
* VEX flips
|
||||
* affected range boundary flips
|
||||
* KEV flips
|
||||
* Patch churn that does not cross affected ranges is absent from the report.
|
||||
|
||||
## Phase 2 — Determinism & evidence hardening
|
||||
|
||||
* Store raw evidence blobs (VEX doc hash, trace payload hash)
|
||||
* Ensure feed snapshots are immutable and referenced by hash
|
||||
* Add `score_breakdown`
|
||||
* Add conflict surfacing for VEX merge
|
||||
|
||||
## Phase 3 — Performance & scale
|
||||
|
||||
* Incremental computation (recompute only the components that changed)
|
||||
* Cache Concelier lookups by `(snapshotHash, cve)`
|
||||
* Batch range matching queries
|
||||
* Add pagination and `max_items` enforcement
|
||||
|
||||
---
|
||||
|
||||
# 19) Edge cases developers must handle
|
||||
|
||||
1. **Reachability unknown**
|
||||
|
||||
* If no analyzer output exists, set `reachable = null`
|
||||
* Do not suppress solely based on `reachable=null`
|
||||
2. **Version parse failures**
|
||||
|
||||
* `in_affected_range = null`
|
||||
* Surface range-related changes only when one side is determinable
|
||||
3. **Component renamed / purl drift**
|
||||
|
||||
* Consider purl normalization rules (namespace casing, qualifiers)
|
||||
* If the purl changes but refers to the same artifact, treat it as a new component (unless you implement alias mapping later)
|
||||
4. **Multiple CVE sources / duplicates**
|
||||
|
||||
* Deduplicate by CVE ID per component+version
|
||||
5. **Conflicting VEX statements**
|
||||
|
||||
* Pick winner deterministically, but log conflict evidence
|
||||
6. **KEV listed but VEX says not affected**
|
||||
|
||||
* Still suppress? Recommended:
|
||||
|
||||
* Do **not** suppress; surface as “KEV listed but VEX not_affected” and rank high (KEV tier)
|
||||
7. **Policy config changes**
|
||||
|
||||
* Treat policy hash difference as a diff dimension; surface “policy flip” items even if underlying vuln unchanged
|
||||
|
||||
---
|
||||
|
||||
# 20) Testing strategy (must implement)
|
||||
|
||||
## Unit tests
|
||||
|
||||
* SemVer compare + affected range evaluation
|
||||
* Excititor precedence resolution
|
||||
* Suppression rules (table-driven tests)
|
||||
* Priority scoring determinism
|
||||
|
||||
## Integration tests
|
||||
|
||||
* Build synthetic snapshots:
|
||||
|
||||
* A: vuln present, unreachable, VEX not_affected
|
||||
* B: same vuln reachable
|
||||
* Assert Smart‑Diff surfaces exactly one item with `REACHABILITY_FLIP`
|
||||
* KEV flip test:
|
||||
|
||||
* Same findings, KEV list changes between Concelier snapshots
|
||||
* Assert item surfaces with `KEV_FLIP`
|
||||
|
||||
## Regression suite
|
||||
|
||||
Keep a folder of snapshot pairs and expected outputs:
|
||||
|
||||
* `fixtures/snapA`, `fixtures/snapB`, `expected.smartdiff.json`
|
||||
|
||||
---
|
||||
|
||||
# 21) What the developer should code first (practical order)
|
||||
|
||||
1. DTOs:
|
||||
|
||||
* `Snapshot`, `Component`, `VexStatement`, `ReachTrace`, `FindingKey`, `RiskState`, `SmartDiffItem`, `SmartDiffReport`
|
||||
2. Pure functions:
|
||||
|
||||
* `NormalizePurl`
|
||||
* `IsVersionInAffectedRange`
|
||||
* `ResolveVexStatus` (Excititor)
|
||||
* `ComputeRiskState`
|
||||
* `DiffRiskStates`
|
||||
* `ApplySuppression`
|
||||
* `ScoreAndRank`
|
||||
3. Persistence:
|
||||
|
||||
* store snapshots and computed findings
|
||||
4. API + CLI wrappers
|
||||
|
||||
---
|
||||
|
||||
If you want, I can also provide:
|
||||
|
||||
* a **concrete JSON Schema** for `SmartDiffReport`
|
||||
* **C# (.NET 10) interfaces + class skeletons** for `ConcelierClient`, `ExcititorResolver`, and `SmartDiffService`
|
||||
* a **fixture set** (sample SBOM/VEX/reach traces) to bootstrap the test suite
|
||||
---
|
||||
Here’s a compact playbook for building **10–20 “toy services” with planted, labeled vulnerabilities** so you can demo reachability, measure scanner accuracy, and make the “why” behind each finding obvious.
|
||||
|
||||
### Why do this
|
||||
|
||||
* **Repeatable benchmarks:** same inputs → same findings → track accuracy over time.
|
||||
* **Explainable demos:** each vuln has a story, proof path, and a fix.
|
||||
* **Coverage sanity checks:** distinguish **reachable** vs **unreachable** vulns so tools can’t inflate results.
|
||||
|
||||
### Core design
|
||||
|
||||
* Each service = 1 repo with:
|
||||
|
||||
* `/app` (tiny API or worker), `/infra` (Dockerfile/compose), `/tests` (PyTest/Jest + attack scripts), `/labels.yaml` (ground‑truth).
|
||||
* `labels.yaml` schema:
|
||||
|
||||
```yaml
|
||||
service: svc-01-password-reset
|
||||
vulns:
|
||||
- id: V1
|
||||
cve: CVE-2022-XXXXX
|
||||
type: dep_runtime
|
||||
package: express
|
||||
version: 4.17.0
|
||||
reachable: true
|
||||
path_tags: ["route:/reset", "call:crypto.md5", "env:DEV_MODE"]
|
||||
proof: ["curl.sh#L10", "trace.json:/reset stack -> md5()"]
|
||||
fix_hint: "upgrade express to 4.18.3"
|
||||
- id: V2
|
||||
type: dep_build
|
||||
package: lodash
|
||||
version: 4.17.5
|
||||
reachable: false
|
||||
path_tags: ["devDependency", "no-import"]
|
||||
```
|
||||
* **Tagged paths**: add lightweight traces (e.g., log “TAG:route:/reset” before the vulnerable call) so tests can assert reachability.
|
||||
|
||||
### Suggested catalog (pick 10–20)
|
||||
|
||||
1. **Password reset token** (MD5, predictable tokens) – reachable via `/reset`.
|
||||
2. **SQL injection** (string‑concat query) – reachable via `/search`.
|
||||
3. **Path traversal** (`../` in `?file=`) – reachable but sandboxed; variant unreachable behind dead route flag.
|
||||
4. **Deserialization bug** (unsafe `pickle`/`BinaryFormatter`) – reachable in worker queue.
|
||||
5. **SSRF** (proxy fetch) – guarded by allow‑list in unreachable variant.
|
||||
6. **Command injection** (`child_process.exec`) – reachable via debug param; unreachable alt uses execFile.
|
||||
7. **JWT none‑alg** acceptance – only when `DEV_MODE=1`.
|
||||
8. **Hardcoded credentials** (in config) – present but not used (unreachable).
|
||||
9. **Dependency vuln (runtime)** old `express/fastapi` called in hot path.
|
||||
10. **Dependency vuln (build‑time only)** devDependency only (unreachable at runtime).
|
||||
11. **Insecure TLS** (skip verify) – gated behind feature flag.
|
||||
12. **Open redirect** – requires crafted `next=` param.
|
||||
13. **XXE** in XML upload – off by default in unreachable variant.
|
||||
14. **Insecure deserialization in message bus consumer** – invoked by test producer.
|
||||
15. **Race condition** (TOCTOU temp file) – demonstrated by parallel test.
|
||||
16. **Use‑after‑free style bug** (C tiny service) – reachable with specific sequence; alt path never called.
|
||||
17. **CSRF** on state‑changing route – reachable only without SameSite/CSRF tokens.
|
||||
18. **Directory listing** (misconfigured static server) – reachable under `/public`.
|
||||
19. **Prototype pollution** (JS merge) – only reachable when `content-type: application/json`.
|
||||
20. **Zip‑slip** in archive import – prevented in unreachable variant via safe unzip.
|
||||
|
||||
### Tech stack mix
|
||||
|
||||
* **Languages:** Node (Express), Python (FastAPI/Flask), Go (net/http), C# (.NET Minimal API), one small C binary.
|
||||
* **Packaging:** Docker per service; one multi‑stage image whose vulnerable build tooling exists only in the build stage (to test build‑time vs runtime vulns).
|
||||
* **Data:** SQLite or in‑memory maps to avoid ops noise.
|
||||
|
||||
### Test harness (deterministic)
|
||||
|
||||
* `make test` runs:
|
||||
|
||||
1. **Smoke** (service up).
|
||||
2. **Exploit scripts** trigger each *reachable* vuln and store `evidence/trace.json`.
|
||||
3. **Scanner run** (your tool + competitors) against the image/container/fs.
|
||||
4. **Evaluator** compares scanner output to `labels.yaml`.
|
||||
|
||||
### Metrics you’ll get
|
||||
|
||||
* **Precision/recall** overall and by class (dep_runtime, dep_build, code, config).
|
||||
* **Reachability precision**: % of flagged vulns with a proven path tag match.
|
||||
* **Overreport index**: unreachable‑flag hits / total hits.
|
||||
* **TTFS (Time‑to‑first‑signal)**: from scan start to first evidence‑backed block.
|
||||
* **Fix guidance score**: did the tool propose the correct minimal upgrade/patch?
|
||||
|
||||
### Minimal evaluator format
|
||||
|
||||
Scanner output → normalized JSON:
|
||||
|
||||
```json
|
||||
{ "findings": [
|
||||
{"cve":"CVE-2022-XXXXX","package":"express","version":"4.17.0",
|
||||
"class":"dep_runtime","path_tags":["route:/reset","call:crypto.md5"]}
|
||||
]}
|
||||
```
|
||||
|
||||
Evaluator joins on `(cve|type|package)` and checks:
|
||||
|
||||
* tag overlap with `labels.vulns[*].path_tags`
|
||||
* reachable expectation matches
|
||||
* counts per class; exports `report.md` + `report.csv`.
|
||||
|
||||
### Demo storyline (5 min)
|
||||
|
||||
1. Run **svc‑01**; hit `/reset`; show trace marker.
|
||||
2. Run your scanner; show it ranks the **reachable dep vuln** above the **devDependency vuln**.
|
||||
3. Flip env to disable route; rerun → reachable finding disappears → score improves.
|
||||
4. Show **fix hint** applied (upgrade) → green.
|
||||
|
||||
### Repo layout (monorepo)
|
||||
|
||||
```
|
||||
/toys/
|
||||
svc-01-reset-md5/
|
||||
svc-02-sql-injection/
|
||||
...
|
||||
/harness/
|
||||
normalize.py
|
||||
evaluate.py
|
||||
run_scans.sh
|
||||
/docs/
|
||||
rubric.md # metric definitions & thresholds
|
||||
```
|
||||
|
||||
### Guardrails
|
||||
|
||||
* Keep images tiny (<150MB) and ports unique.
|
||||
* Deterministic seeds for any randomness.
|
||||
* No outbound calls in tests (use local mocks).
|
||||
* Clearly mark **unsafe** code blocks with comments.
|
||||
|
||||
### First 5 to build this week
|
||||
|
||||
1. `svc-01-reset-md5` (Node)
|
||||
2. `svc-02-sql-injection` (Python/FastAPI)
|
||||
3. `svc-03-dep-build-only` (Node devDependency)
|
||||
4. `svc-04-cmd-injection` (.NET Minimal API)
|
||||
5. `svc-05-zip-slip` (Go)
|
||||
|
||||
If you want, I can generate the skeleton repos (Dockerfile, app, tests, `labels.yaml`, and the evaluator script) so you can drop them into your monorepo and start measuring immediately.
|
||||
Below is a **developer framework** you can hand to the team as the governing “contract” for implementing the full toy-service catalogue at a **best-in-class** standard, while keeping the suite deterministic, safe, and maximally useful for scanner R&D.
|
||||
|
||||
---
|
||||
|
||||
## 1) Non-negotiable principles
|
||||
|
||||
1. **Determinism first**
|
||||
|
||||
* Same git SHA + same inputs ⇒ identical images, SBOMs, findings, scores.
|
||||
* Pin everything: base image **by digest**, language deps **by lockfiles**, tool versions **by exact semver**, and record it in an evidence manifest.
|
||||
|
||||
2. **Ground truth is authoritative**
|
||||
|
||||
* Every planted weakness must have a **machine-readable label**, and at least one **verifiable proof artifact**.
|
||||
* No “implicit” vulnerabilities; if it’s not labeled, it does not exist for scoring.
|
||||
|
||||
3. **Reachability is tiered, not binary**
|
||||
|
||||
* You will label and prove *how* it is reachable (imported vs executed vs tainted input), not just “reachable: true”.
|
||||
|
||||
4. **Safety by construction**
|
||||
|
||||
* Services run on an isolated docker network; tests must not require internet.
|
||||
* Proofs should demonstrate *execution and dataflow* rather than “weaponized exploitation”.
|
||||
|
||||
---
|
||||
|
||||
## 2) Repository and service contract
|
||||
|
||||
### Standard monorepo layout
|
||||
|
||||
```
|
||||
/toys/
|
||||
svc-01-.../
|
||||
app/
|
||||
infra/ # Dockerfile, compose, network policy
|
||||
tests/ # positive + negative reachability tests
|
||||
labels.yaml # ground truth
|
||||
evidence/ # generated by tests (trace, tags, manifests)
|
||||
fix/ # minimal patch proving remediation
|
||||
/harness/
|
||||
run-suite/
|
||||
normalize/
|
||||
evaluate/
|
||||
/schemas/
|
||||
labels.schema.json
|
||||
/docs/
|
||||
benchmark-contract.md
|
||||
scoring.md
|
||||
reviewer-checklist.md
|
||||
```
|
||||
|
||||
### Required service deliverables (Definition of Done)
|
||||
|
||||
A service PR is “DONE” only if it includes:
|
||||
|
||||
* `labels.yaml` validated by `schemas/labels.schema.json`
|
||||
* Docker build reproducible enough to be stable in CI (digest pinned; lockfiles committed)
|
||||
* **Positive tests** that generate evidence proving reachability tiers (see §3)
|
||||
* **Negative tests** proving “unreachable” claims (feature flags off, devDependency only, dead route, etc.)
|
||||
* `fix/` patch that removes/mitigates the weakness and produces a measurable delta (findings drop, reachability flips, or config gate blocks)
|
||||
* An `evidence/manifest.json` capturing tool versions, git SHA, image digest, UTC timestamps, and hashes of evidence files (a writer sketch follows this list)
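A small C# sketch of one way to produce that manifest (the suite itself is polyglot, so treat this as logic illustration); the field names follow the bullet above and `Sha256OfFile` is the only helper assumed.

```csharp
using System.Linq;
using System.Security.Cryptography;
using System.Text.Json;

// Sketch: capture run context plus a content hash per evidence file.
public sealed record EvidenceManifest(
    string GitSha,
    string ImageDigest,
    DateTimeOffset CreatedUtc,
    IReadOnlyDictionary<string, string> ToolVersions,
    IReadOnlyDictionary<string, string> EvidenceHashes);   // relative path -> sha256

public static class EvidenceManifestWriter
{
    public static void Write(string evidenceDir, string gitSha, string imageDigest,
                             IReadOnlyDictionary<string, string> toolVersions)
    {
        var hashes = Directory.EnumerateFiles(evidenceDir, "*", SearchOption.AllDirectories)
            .Where(p => Path.GetFileName(p) != "manifest.json")
            .ToDictionary(p => Path.GetRelativePath(evidenceDir, p), Sha256OfFile);

        var manifest = new EvidenceManifest(gitSha, imageDigest, DateTimeOffset.UtcNow, toolVersions, hashes);
        File.WriteAllText(Path.Combine(evidenceDir, "manifest.json"),
            JsonSerializer.Serialize(manifest, new JsonSerializerOptions { WriteIndented = true }));
    }

    private static string Sha256OfFile(string path)
    {
        using var stream = File.OpenRead(path);
        return "sha256:" + Convert.ToHexString(SHA256.HashData(stream)).ToLowerInvariant();
    }
}
```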
|
||||
|
||||
---
|
||||
|
||||
## 3) Reachability tiers and evidence requirements
|
||||
|
||||
### Reachability levels (use these everywhere)
|
||||
|
||||
* **R0 Present**: vulnerable component exists in image/SBOM, not imported/loaded.
|
||||
* **R1 Loaded**: imported/linked/initialized, but no executed path proven.
|
||||
* **R2 Executed**: vulnerable function/module is executed in a test (deterministic trace).
|
||||
* **R3 Tainted execution**: execution occurs with externally influenced input (route param/message/body).
|
||||
* **R4 Exploitable** (optional): controlled, non-harmful PoC demonstrates full impact.
|
||||
|
||||
### Minimum evidence per level
|
||||
|
||||
* R0: SBOM + file hash / package metadata
|
||||
* R1: runtime startup logs or module load trace tag
|
||||
* R2: callsite tag + stack trace snippet (or deterministic trace file)
|
||||
* R3: R2 + taint marker showing data originated from external boundary (HTTP/queue/env) and reached call
|
||||
* R4: only if safe and necessary; keep it non-weaponized and sandboxed
|
||||
|
||||
**Key rule:** prefer proving **execution + dataflow** over providing “payload recipes”.
|
||||
|
||||
---
|
||||
|
||||
## 4) Ground truth schema (what `labels.yaml` must capture)
|
||||
|
||||
Every vuln entry must have:
|
||||
|
||||
* Stable ID: `svc-XX:Vn` (never renumber once published)
|
||||
* Class: `dep_runtime | dep_build | code | config | os_pkg | supply_chain`
|
||||
* Identity: `cve` (if applicable), `purl`, `package`, `version`, `location` (path/module)
|
||||
* Reachability: `reachability_level: R0..R4`, `entrypoint` (route/topic/cli), `preconditions` (flags/env/auth)
|
||||
* Proofs:
|
||||
|
||||
* `proof.artifacts[]` (e.g., trace file, tag log, coverage snippet)
|
||||
* `proof.tags[]` (canonical tag strings)
|
||||
* Fix:
|
||||
|
||||
* `fix.type` (upgrade/config/code)
|
||||
* `fix.patch_path` (under `fix/`)
|
||||
* `fix.expected_delta` (what should change in findings/evidence)
|
||||
* Negatives (if unreachable):
|
||||
|
||||
* `negative_proof` explaining and proving why it is unreachable
|
||||
|
||||
Canonical tag format (consistent across languages):
|
||||
|
||||
* `TAG:route:/reset`
|
||||
* `TAG:call:Crypto.Md5`
|
||||
* `TAG:taint:http.body.resetToken`
|
||||
* `TAG:flag:DEV_MODE=true`
|
||||
|
||||
---
|
||||
|
||||
## 5) Service implementation standards (how developers build each toy)
|
||||
|
||||
### A. Vulnerability planting patterns (approved)
|
||||
|
||||
* **Dependency runtime**: vulnerable version is a production dependency and exercised on a normal route/job.
|
||||
* **Dependency build-only**: devDependency only, or used only in build stage; prove it never ships in final image.
|
||||
* **Code vuln**: the vulnerable sink is behind a clean, deterministic entrypoint and instrumented.
|
||||
* **Config vuln**: misconfig is explicit and versioned (headers, TLS settings, authz rules), with a fix patch.
|
||||
|
||||
### B. Instrumentation requirements
|
||||
|
||||
* Every reachable vuln must emit:
|
||||
|
||||
* one **entrypoint tag** (route/topic/command)
|
||||
* one **sink tag** (the vulnerable call or module)
|
||||
* optional **taint tag** for R3
|
||||
* Evidence generation must be stable and machine-parsable:
|
||||
|
||||
* JSON trace preferred (`evidence/trace.json`)
|
||||
* Logs acceptable if structured and anchored with tags
|
||||
|
||||
### C. Negative-case discipline (unreachable means proven unreachable)
|
||||
|
||||
Unreachable claims must be backed by one of:
|
||||
|
||||
* compilation/linker exclusion (dead code eliminated) + proof
|
||||
* dependency not present in final image (multi-stage) + proof (image file listing / SBOM diff)
|
||||
* feature flag off + proof (config captured + route unavailable)
|
||||
* auth gate + proof (unauthorized cannot reach sink)
|
||||
|
||||
---
|
||||
|
||||
## 6) Harness and scoring gates (how you enforce “best in class”)
|
||||
|
||||
### Normalization
|
||||
|
||||
All scanners’ outputs must normalize into one internal shape:
|
||||
|
||||
* `(identity: purl+cve+version+location) + class + reachability_claim + evidence_refs`
|
||||
|
||||
### Core metrics (tracked per commit)
|
||||
|
||||
* **Recall (by class)**: runtime deps, OS pkgs, code, config
|
||||
* **Precision**: false positive rate, especially R0/R1 misclassified as R2/R3
|
||||
* **Reachability accuracy**:
|
||||
|
||||
* overreach: predicted reachable but labeled R0/R1
|
||||
* underreach: labeled R2/R3 but predicted non-reachable
|
||||
* **TTFS** (Time-to-First-Signal): time to first *evidence-backed* blocking issue
|
||||
* **Fix validation**: applying `fix/` must produce the expected delta
|
||||
|
||||
### Quality gates (example thresholds you can enforce in CI)
|
||||
|
||||
* Runtime dependency recall ≥ 0.95
|
||||
* Unreachable false positives ≤ 0.05 (for R0/R1)
|
||||
* Reachability underreport ≤ 0.10 (for labeled R2/R3)
|
||||
* TTFS regression: no worse than +10% vs main
|
||||
* Fix validation pass rate = 100% for modified services
|
||||
|
||||
(Adjust numbers as your suite matures; the framework is the key.)
|
||||
|
||||
---
|
||||
|
||||
## 7) Review checklist (what reviewers enforce)
|
||||
|
||||
A PR adding/modifying a service is rejected if any of these fail:
|
||||
|
||||
* Labels complete, schema-valid, and stable IDs preserved
|
||||
* Proof artifacts are deterministic and generated by tests
|
||||
* Reachability tier justified and matches evidence
|
||||
* Unreachable claims have negative proofs
|
||||
* Docker build uses pinned digests + lockfiles committed
|
||||
* `fix/` produces measurable delta and does not introduce new unlabeled issues
|
||||
* No network egress required; tests are hermetic
|
||||
|
||||
---
|
||||
|
||||
## 8) Catalogue coverage plan (so the suite actually pushes “best in class”)
|
||||
|
||||
Define coverage targets across axes (each service must declare which axes it covers):
|
||||
|
||||
* Language/runtime: Node, Python, Go, .NET 10, Java, native
|
||||
* Packaging: fat images, slim, distroless, multi-stage, vendored deps
|
||||
* Dependency complexity: direct, transitive, aliased/renamed, bundled
|
||||
* Reachability: R0..R3 mandatory; R4 only for a small subset
|
||||
* Supply chain: SBOM mismatches, provenance metadata, build-time tools present but excluded
|
||||
|
||||
This prevents you from building 20 near-duplicates and thinking you have “coverage”.
|
||||
|
||||
---
|
||||
|
||||
## 9) What to hand developers today (copy/paste into your repo)
|
||||
|
||||
Create these docs/templates as the “contract”:
|
||||
|
||||
* `/docs/benchmark-contract.md` (principles + DoD + reachability tiers)
|
||||
* `/docs/reviewer-checklist.md` (the rejection criteria)
|
||||
* `/schemas/labels.schema.json` (strict validation)
|
||||
* `/toys/_TEMPLATE/` (service skeleton with instrumentation + tests scaffolding)
|
||||
* `/harness/evaluate/` (metrics + CI gating)
|
||||
|
||||
If you want, I can draft the actual **`labels.schema.json`**, the **service template**, and the **review checklist** in ready-to-commit form (paths, names, and .NET 10 conventions included).
|
||||
Below is an **extended, implementable catalogue** (20 services + 4 “meta” repos) with **concrete stacks, entrypoints, reachability tiers, canonical tags, proof tests, and fixes**. It is written so developers can start building immediately while keeping the suite deterministic and scanner-grade.
|
||||
|
||||
---
|
||||
|
||||
## A. Catalogue index (what each service is for)
|
||||
|
||||
| ID | Service | Stack | Primary capability tested |
|
||||
| -: | --------------------------- | --------------------- | ------------------------------------------------------ |
|
||||
| 01 | reset-token-weak-crypto | Node/Express | Code reachability + crypto misuse |
|
||||
| 02 | search-sql-injection | Python/FastAPI | Taint → sink (SQLi), route evidence |
|
||||
| 03 | cmd-injection-diagnostics | .NET 10 Minimal API | Taint → shell sink + gating |
|
||||
| 04 | zip-import-zip-slip | Go net/http | Archive handling (Zip Slip), filesystem proof |
|
||||
| 05 | xml-upload-xxe | Java/Spring Boot | XML parser config (XXE), safe proof |
|
||||
| 06 | jwt-none-devmode | .NET 10 | Config-gated auth bypass (reachability depends on env) |
|
||||
| 07 | fetcher-ssrf | Node/Express | SSRF to internal-only target, network isolation |
|
||||
| 08 | outbound-tls-skipverify | Go | TLS misconfig + “reachable only if feature enabled” |
|
||||
| 09 | queue-pickle-deser | Python worker | Async reachability via queue + unsafe deserialization |
|
||||
| 10 | efcore-rawsql | .NET 10 + EF Core | ORM raw SQL misuse + input flow |
|
||||
| 11 | shaded-jar-deps | Java/Gradle | Shaded/fat jar dependency discovery |
|
||||
| 12 | webpack-bundled-dep | Node/Webpack | Bundled deps + SBOM correctness |
|
||||
| 13 | go-static-modver | Go static | Detect module versions in static binaries |
|
||||
| 14 | dotnet-singlefile-trim | .NET 10 publish | Single-file/trimmed dependency evidence |
|
||||
| 15 | cors-credentials-wildcard | .NET 10 or Node | Config vulnerability (CORS) + fix delta |
|
||||
| 16 | open-redirect | Node/Express | Web vuln classification + allowlist fix |
|
||||
| 17 | csrf-state-change | .NET 10 Razor/Minimal | Missing CSRF protections + cookie semantics |
|
||||
| 18 | prototype-pollution-merge | Node | JSON-body gated path + sink |
|
||||
| 19 | path-traversal-download | Python/Flask | File handling traversal + normalization |
|
||||
| 20 | insecure-tempfile-toctou | Go or .NET | Concurrency/race evidence (safe) |
|
||||
| 21 | k8s-misconfigs | YAML/Helm | IaC scanning (privileged, hostPath, etc.) |
|
||||
| 22 | docker-multistage-buildonly | Any | Build-time-only vuln exclusion proof |
|
||||
| 23 | secrets-fakes-corpus | Any | Secret detection precision (fake tokens) |
|
||||
| 24 | sbom-mismatch-lab | Any | SBOM validation + diff correctness |
|
||||
|
||||
---
|
||||
|
||||
## B. Canonical tagging (use across all services)
|
||||
|
||||
Every reachable vuln must produce at least:
|
||||
|
||||
* `TAG:route:<method> <path>` or `TAG:topic:<name>`
|
||||
* `TAG:call:<sink>`
|
||||
* If R3: `TAG:taint:<boundary>` (http.query, http.body, queue.msg, env.var)
|
||||
|
||||
**Evidence artifact:** `evidence/trace.json` lines such as:
|
||||
|
||||
```json
|
||||
{"ts":"...","corr":"...","tags":["TAG:route:POST /reset","TAG:taint:http.body.email","TAG:call:Crypto.MD5"]}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## C. Service specs (developers can implement 1:1)
|
||||
|
||||
### 01) `svc-01-reset-token-weak-crypto` (Node/Express)
|
||||
|
||||
**Purpose:** R3 code reachability; crypto misuse; ensure scanner doesn’t over-rank unreachable dev deps.
|
||||
**Entrypoints:** `POST /reset` and `POST /reset/confirm`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-327 Weak Crypto** — reset token derived from deterministic inputs (no CSPRNG).
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:POST /reset`, `TAG:taint:http.body.email`, `TAG:call:Crypto.WeakToken`
|
||||
* Proof test: request reset; assert trace contains sink tag.
|
||||
* Fix: use `crypto.randomBytes()` and store hashed token.
|
||||
* `V2` **dep_build** — vulnerable npm devDependency present only in `devDependencies`.
|
||||
|
||||
* Reachability: **R0**
|
||||
* Negative proof: final image contains no node_modules entry for it OR it is never imported (coverage + grep import map).
|
||||
|
||||
**Hard mode variant:** token generation only happens when `FEATURE_RESET_V1=1` → label unreachable when off.
|
||||
|
||||
---
|
||||
|
||||
### 02) `svc-02-search-sql-injection` (Python/FastAPI + SQLite)
|
||||
|
||||
**Purpose:** Classic taint → SQL sink; evidence-driven.
|
||||
**Entrypoint:** `GET /search?q=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-89 SQL Injection** — query constructed via string concatenation.
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:GET /search`, `TAG:taint:http.query.q`, `TAG:call:SQL.Unparameterized`
|
||||
* Proof test: send query with SQL metacharacters; verify trace hits sink.
|
||||
* Fix: parameterized query / query builder.
|
||||
|
||||
**Hard mode variant:** same route exists but safe path uses parameters; unsafe path only if header `X-Debug=1` and env `DEV_MODE=1`.
|
||||
|
||||
---
|
||||
|
||||
### 03) `svc-03-cmd-injection-diagnostics` (.NET 10 Minimal API)
|
||||
|
||||
**Purpose:** Detect command execution sink and prove gating.
|
||||
**Entrypoint:** `GET /diag/ping?host=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-78 Command Injection** — shell invocation with user-influenced argument.
|
||||
|
||||
* Reachability: **R3** when `DIAG_ENABLED=1`
|
||||
* Tags: `TAG:route:GET /diag/ping`, `TAG:taint:http.query.host`, `TAG:call:Process.Start.Shell`
|
||||
* Proof test: call endpoint with characters that would alter shell parsing; evidence is sink tag + controlled output marker (not destructive).
|
||||
* Fix: avoid shell, use argument arrays (`ProcessStartInfo.ArgumentList`) + allowlist hostnames.
|
||||
|
||||
**Hard mode variant:** sink is in a helper library referenced transitively; scanner must resolve call graph.
|
||||
|
||||
---
|
||||
|
||||
### 04) `svc-04-zip-import-zip-slip` (Go)
|
||||
|
||||
**Purpose:** File/archive handling; safe filesystem proof; no “real system” impact.
|
||||
**Entrypoint:** `POST /import-zip`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-22 Path Traversal (Zip Slip)** — extraction path not normalized/validated.
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:POST /import-zip`, `TAG:taint:http.body.zip`, `TAG:call:Archive.Extract.UnsafeJoin`
|
||||
* Proof test: upload crafted zip that attempts to place `evidence/sentinel.txt` outside dest; assert sentinel ends up outside intended folder.
|
||||
* Fix: clean paths; reject entries escaping dest; forbid absolute paths.
|
||||
|
||||
---
|
||||
|
||||
### 05) `svc-05-xml-upload-xxe` (Java/Spring Boot)
|
||||
|
||||
**Purpose:** Parser config scanning + code-path proof.
|
||||
**Entrypoint:** `POST /upload-xml`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-611 XXE** — DocumentBuilderFactory with external entities enabled.
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:POST /upload-xml`, `TAG:taint:http.body.xml`, `TAG:call:XML.Parse.XXEEnabled`
|
||||
* Proof test: XML references a **local test file under `/app/testdata/`** and returns its sentinel string (no external network).
|
||||
* Fix: disable external entity resolution and secure processing.
|
||||
|
||||
---
|
||||
|
||||
### 06) `svc-06-jwt-none-devmode` (.NET 10)
|
||||
|
||||
**Purpose:** Reachability depends on environment and config.
|
||||
**Entrypoint:** `GET /admin` (Bearer JWT)
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-345 Insufficient Verification** — accepts unsigned token when `DEV_MODE=1`.
|
||||
|
||||
* Reachability: **R2** (exec) / **R3** (if token from request)
|
||||
* Tags: `TAG:route:GET /admin`, `TAG:flag:DEV_MODE=true`, `TAG:call:Auth.JWT.AcceptNoneAlg`
|
||||
* Proof test: run container with DEV_MODE=1; request triggers sink tag.
|
||||
* Negative test: DEV_MODE=0 must not hit sink tag.
|
||||
* Fix: enforce algorithm + signature validation always.
|
||||
|
||||
---
|
||||
|
||||
### 07) `svc-07-fetcher-ssrf` (Node/Express)
|
||||
|
||||
**Purpose:** SSRF detection with internal-only target in docker network.
|
||||
**Entrypoint:** `GET /fetch?url=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-918 SSRF** — URL fetched without scheme/host restrictions.
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:GET /fetch`, `TAG:taint:http.query.url`, `TAG:call:HTTP.Client.Fetch`
|
||||
* Proof test: fetch `http://internal-metadata/health` (a companion container in compose); assert response contains sentinel + sink tag.
|
||||
* Fix: allowlist hosts/schemes; block private ranges; require signed destinations.
|
||||
|
||||
---
|
||||
|
||||
### 08) `svc-08-outbound-tls-skipverify` (Go)
|
||||
|
||||
**Purpose:** Config vuln + “reachable only when feature on.”
|
||||
**Entrypoint:** `POST /sync` triggers outbound HTTPS call
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-295 Improper Cert Validation** — `InsecureSkipVerify=true` when `SYNC_FAST=1`.
|
||||
|
||||
* Reachability: **R2** (exec)
|
||||
* Tags: `TAG:route:POST /sync`, `TAG:flag:SYNC_FAST=true`, `TAG:call:TLS.InsecureSkipVerify`
|
||||
* Fix: proper CA pinning / system pool; explicit cert verification.
|
||||
|
||||
---
|
||||
|
||||
### 09) `svc-09-queue-pickle-deser` (Python API + worker)
|
||||
|
||||
**Purpose:** Async reachability: API enqueues → worker executes sink.
|
||||
**Entrypoints:** `POST /enqueue` + worker consumer
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-502 Unsafe Deserialization** — worker uses unsafe deserializer.
|
||||
|
||||
* Reachability: **R3** (taint from HTTP → queue → worker)
|
||||
* Tags: `TAG:route:POST /enqueue`, `TAG:topic:jobs`, `TAG:call:Deserialize.Unsafe`
|
||||
* Proof test: enqueue benign payload that triggers sink tag and deterministic “handled” response (no arbitrary execution PoC).
|
||||
* Fix: switch to safe format (JSON) and validate schema.
|
||||
|
||||
---
|
||||
|
||||
### 10) `svc-10-efcore-rawsql` (.NET 10 + EF Core)
|
||||
|
||||
**Purpose:** ORM misuse; taint → SQL sink detection.
|
||||
**Entrypoint:** `GET /reports?where=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-89 SQLi** — `FromSqlRaw`/`ExecuteSqlRaw` with interpolated input.
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:GET /reports`, `TAG:taint:http.query.where`, `TAG:call:EFCore.FromSqlRaw.Unsafe`
|
||||
* Fix: `FromSqlInterpolated` with parameters or LINQ predicates.
|
||||
|
||||
---
|
||||
|
||||
### 11) `svc-11-shaded-jar-deps` (Java/Gradle)
|
||||
|
||||
**Purpose:** Dependency discovery inside fat/shaded jar; reachable vs present-only.
|
||||
**Entrypoint:** `GET /parse`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **dep_runtime** — vulnerable lib included in shaded jar and actually invoked.
|
||||
|
||||
* Reachability: **R2**
|
||||
* Tags: `TAG:route:GET /parse`, `TAG:call:Lib.Parse.VulnerableMethod`
|
||||
* `V2` **dep_build/test** — test-scoped vulnerable lib not packaged in runtime jar.
|
||||
|
||||
* Reachability: **R0**
|
||||
* Negative proof: SBOM for runtime jar excludes it; file listing confirms.
|
||||
|
||||
**Fix:** bump dependency and rebuild shaded jar.
|
||||
|
||||
---
|
||||
|
||||
### 12) `svc-12-webpack-bundled-dep` (Node/Webpack)
|
||||
|
||||
**Purpose:** Bundled dependencies, source map presence/absence, SBOM correctness.
|
||||
**Entrypoint:** `GET /render?template=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **dep_runtime** — vulnerable template lib bundled; invoked by render.
|
||||
|
||||
* Reachability: **R2/R3** depending on input usage
|
||||
* Tags: `TAG:route:GET /render`, `TAG:taint:http.query.template`, `TAG:call:Template.Render`
|
||||
* `V2` **R0** — vulnerable package in lockfile but tree-shaken and absent from output bundle.
|
||||
|
||||
* Negative proof: bundle inspection + build manifest.
|
||||
|
||||
**Fix:** upgrade dependency and rebuild bundle; ensure SBOM maps bundle contents.
|
||||
|
||||
---
|
||||
|
||||
### 13) `svc-13-go-static-modver` (Go static binary)
|
||||
|
||||
**Purpose:** Scanner capability to extract module versions from static binary.
|
||||
**Entrypoint:** `GET /hash?alg=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **dep_runtime** — vulnerable Go module version linked; executed on route.
|
||||
|
||||
* Reachability: **R2**
|
||||
* Tags: `TAG:route:GET /hash`, `TAG:call:GoMod.VulnFunc`
|
||||
* `V2` **R1** — module linked but only used in dead code path (guarded by constant false).
|
||||
|
||||
* Negative proof: coverage/trace never hits sink.
|
||||
|
||||
**Fix:** update `go.mod` and rebuild.
|
||||
|
||||
---
|
||||
|
||||
### 14) `svc-14-dotnet-singlefile-trim` (.NET 10 publish single-file)
|
||||
|
||||
**Purpose:** Detect assemblies in single-file + trimming edge cases.
|
||||
**Entrypoint:** `GET /export`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **dep_runtime** — vulnerable NuGet referenced and executed.
|
||||
|
||||
* Reachability: **R2**
|
||||
* Tags: `TAG:route:GET /export`, `TAG:call:NuGet.VulnMethod`
|
||||
* `V2` **R0** — package referenced in project but trimmed out and not present.
|
||||
|
||||
* Negative proof: runtime file map (single-file manifest) excludes it.
|
||||
|
||||
**Fix:** bump NuGet; adjust trimming settings if needed.
|
||||
|
||||
---
|
||||
|
||||
### 15) `svc-15-cors-credentials-wildcard` (.NET 10)
|
||||
|
||||
**Purpose:** Config/misconfig detection; clear fix delta.
|
||||
**Entrypoint:** any API route
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-942 / CORS Misconfig** — `Access-Control-Allow-Origin: *` with credentials.
|
||||
|
||||
* Reachability: **R2** (observed in response headers)
|
||||
* Tags: `TAG:route:GET /health`, `TAG:call:HTTP.Headers.CORSWildcardCreds`
|
||||
* Proof test: request and assert headers + tag.
|
||||
* Fix: explicit allowed origins + disable credentials unless needed.
|
||||
|
||||
---
|
||||
|
||||
### 16) `svc-16-open-redirect` (Node/Express)
|
||||
|
||||
**Purpose:** Web vuln classification, allowlist fix.
|
||||
**Entrypoint:** `GET /login?next=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-601 Open Redirect** — next param used directly.
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:GET /login`, `TAG:taint:http.query.next`, `TAG:call:Redirect.Unvalidated`
|
||||
* Fix: allowlist relative paths; reject absolute URLs.
|
||||
|
||||
---
|
||||
|
||||
### 17) `svc-17-csrf-state-change` (.NET 10)
|
||||
|
||||
**Purpose:** CSRF detection + cookie semantics.
|
||||
**Entrypoint:** `POST /account/email` (cookie auth)
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-352 CSRF** — no anti-forgery token; SameSite mis-set.
|
||||
|
||||
* Reachability: **R2**
|
||||
* Tags: `TAG:route:POST /account/email`, `TAG:call:Auth.CSRF.MissingProtection`
|
||||
* Fix: antiforgery token + SameSite=Lax/Strict and proper CORS.
|
||||
|
||||
---
|
||||
|
||||
### 18) `svc-18-prototype-pollution-merge` (Node)
|
||||
|
||||
**Purpose:** JSON-body gated sink; reachability must respect content-type and route.
|
||||
**Entrypoint:** `POST /profile` (application/json)
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-1321 Prototype Pollution** — unsafe deep merge of user object into defaults.
|
||||
|
||||
* Reachability: **R3** (only if JSON)
|
||||
* Tags: `TAG:route:POST /profile`, `TAG:taint:http.body.json`, `TAG:call:Object.Merge.Unsafe`
|
||||
* Negative test: same request with non-JSON must not hit sink tag.
|
||||
* Fix: safe merge, deny `__proto__` / `constructor` keys.
|
||||
|
||||
---
|
||||
|
||||
### 19) `svc-19-path-traversal-download` (Python/Flask)
|
||||
|
||||
**Purpose:** File traversal with safe, local sentinel proof.
|
||||
**Entrypoint:** `GET /download?file=`
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-22 Path Traversal** — file path concatenated without normalization.
|
||||
|
||||
* Reachability: **R3**
|
||||
* Tags: `TAG:route:GET /download`, `TAG:taint:http.query.file`, `TAG:call:FS.Read.UnsafePath`
|
||||
* Proof test: attempt to read a known sentinel file outside the allowed directory (within container).
|
||||
* Fix: normalize path, enforce base dir constraint.
|
||||
|
||||
---
|
||||
|
||||
### 20) `svc-20-insecure-tempfile-toctou` (Go or .NET)
|
||||
|
||||
**Purpose:** Concurrency/race category; deterministic reproduction via controlled scheduling.
|
||||
**Entrypoint:** `POST /export` creates temp file and then reopens by name
|
||||
**Vulns:**
|
||||
|
||||
* `V1` **CWE-367 TOCTOU** — uses predictable temp name + separate open.
|
||||
|
||||
* Reachability: **R2** (requires parallel test harness)
|
||||
* Tags: `TAG:route:POST /export`, `TAG:call:FS.TempFile.InsecurePattern`
|
||||
* Proof test: run two coordinated requests; assert race condition triggers sentinel behavior.
|
||||
* Fix: use secure temp APIs + hold open FD; atomic operations.
|
||||
|
||||
---
|
||||
|
||||
## D. Meta repos (not “services” but essential for best-in-class scanning)
|
||||
|
||||
### 21) `svc-21-k8s-misconfigs` (YAML/Helm)
|
||||
|
||||
**Purpose:** IaC scanning; false-positive discipline.
|
||||
**Artifacts:** `manifests/*.yaml`, `helm/Chart.yaml`
|
||||
**Findings to plant:**
|
||||
|
||||
* privileged container, `hostPath`, `runAsUser: 0`, missing resource limits, writable rootfs, wildcard RBAC
|
||||
**Proof:** static assertions in tests (OPA/Conftest or your harness) generate evidence tags like `TAG:iac:k8s.privileged`.
|
||||
|
||||
---
|
||||
|
||||
### 22) `svc-22-docker-multistage-buildonly`
|
||||
|
||||
**Purpose:** Prove build-time-only deps do not ship; prevent scanners from overreporting.
|
||||
**Pattern:** builder stage installs vulnerable tooling; final stage is distroless and excludes it.
|
||||
**Proof:** final image SBOM + `docker export` file list hash; must not include builder artifacts.
|
||||
|
||||
---
|
||||
|
||||
### 23) `svc-23-secrets-fakes-corpus`
|
||||
|
||||
**Purpose:** Secret detection precision/recall without storing real secrets.
|
||||
**Pattern:** files containing **fake** tokens matching common regexes but clearly marked `FAKE_` and useless.
|
||||
**Labels:** must distinguish:
|
||||
|
||||
* `R0 present` fake secret in docs/examples
|
||||
* `R2 reachable` secret injected into runtime env accidentally (then fixed)
|
||||
|
||||
---
|
||||
|
||||
### 24) `svc-24-sbom-mismatch-lab`
|
||||
|
||||
**Purpose:** SBOM validation and drift detection.
|
||||
**Pattern:** generate an SBOM, then change deps without regenerating; label mismatch as a “supply_chain” issue.
|
||||
**Proof:** harness compares `image digest + lockfile hash + sbom hash`.
|
||||
|
||||
---
|
||||
|
||||
## E. Implementation notes that raise the bar (recommended defaults)
|
||||
|
||||
1. **Each service ships with both**:
|
||||
|
||||
* `tests/test_positive_v*.{py,js,cs}` producing evidence for reachable vulns
|
||||
* `tests/test_negative_v*.{py,js,cs}` proving unreachable claims
|
||||
2. **Every service includes a `fix/` patch** and a CI job that:
|
||||
|
||||
* builds “vuln image”, scans, evaluates
|
||||
* applies fix, rebuilds, re-scans, confirms expected delta
|
||||
3. **Hard-mode toggle per service** (optional but valuable):
|
||||
|
||||
* `MODE=easy`: vuln sits on hot path (for demos)
|
||||
* `MODE=hard`: same vuln behind realistic conditions (auth, header, flag, content-type, async)
|
||||
|
||||
---
|
||||
|
||||
If you want this to be maximally useful for scanner R&D, the next step is to add **one additional dimension per service** (fat jar, single-file, distroless, vendored deps, shaded deps, optional extras, transitive only, etc.). I can propose a precise pairing (which dimension goes to which service) so the suite covers all packaging and reachability edge cases without duplication.
|
||||
---
|
||||
Here’s a tight, practical blueprint for building (and proving) a fast, evidence‑first triage workflow—plus the power‑user affordances that make Stella Ops feel “snappy” even offline.
|
||||
|
||||
# What “good” looks like (background in plain words)
|
||||
|
||||
* **Alert → evidence → decision** in one flow: an alert should open directly onto the concrete proof (reachability, call‑stack, provenance), then offer a one‑click decision (VEX/CSAF status) with audit logging.
|
||||
* **Time‑to‑First‑Signal (TTFS)** is king: how fast a human sees the first credible piece of evidence that explains *why this alert matters here*.
|
||||
* **Clicks‑to‑Closure**: count how many interactions to reach a defensible decision recorded in the audit log.
|
||||
|
||||
# Minimal evidence bundle per finding
|
||||
|
||||
* **Reachability proof**: function‑level path or package‑level import chain (with “toggle reachability view” hotkey).
|
||||
* **Call‑stack snippet**: 5–10 frames around the sink/source with file:line anchors.
|
||||
* **Provenance**: attestation / DSSE + build ancestry (image → layer → artifact → commit).
|
||||
* **VEX/CSAF status**: affected/not‑affected/under‑investigation + reason.
|
||||
* **Diff**: what changed since last scan (SBOM or VEX delta), rendered as a small, human‑readable “smart‑diff.”
|
||||
|
||||
# KPIs to measure in CI and UI
|
||||
|
||||
* **TTFS (p50/p95)** from alert creation to first rendered evidence.
|
||||
* **Clicks‑to‑Closure (median)** per decision type.
|
||||
* **Evidence completeness score** (0–4): reachability, call‑stack, provenance, VEX/CSAF present.
|
||||
* **Offline friendliness score**: % of evidence resolvable with no network.
|
||||
* **Audit log completeness**: every decision has: evidence hash set, actor, policy context, replay token.
|
||||
|
||||
# Power‑user affordances (keyboard first)
|
||||
|
||||
* **Jump to evidence** (`J`): focuses the first incomplete evidence pane.
|
||||
* **Copy DSSE** (`Y`): copies the attestation block or Rekor entry ref.
|
||||
* **Toggle reachability view** (`R`): path list ↔ compact graph ↔ textual proof.
|
||||
* **Search‑within‑graph** (`/`): node/func/package, instant.
|
||||
* **Deterministic sort** (`S`): stable sort by (reachability→severity→age→component) to remove hesitation.
|
||||
* **Quick VEX set** (`A`, `N`, `U`): Affected / Not‑affected / Under‑investigation with templated reasons.
|
||||
|
||||
# UX flow to implement (end‑to‑end)
|
||||
|
||||
1. **Alert row** shows: TTFS timer, reachability badge, “decision state,” and a diff‑dot if something changed.
|
||||
2. **Open alert** lands on **Evidence tab** (not Details). Top strip = three proof pills:
|
||||
|
||||
* Reachability ✓ / Call‑stack ✓ / Provenance ✓ (click to expand inline).
|
||||
3. **Decision drawer** pinned on the right:
|
||||
|
||||
* VEX/CSAF radio (A/N/U) → Reason presets → “Record decision.”
|
||||
* Shows **audit‑ready summary** (hashes, timestamps, policy).
|
||||
4. **Diff tab**: SBOM/VEX delta since last run, grouped by “meaningful risk shift.”
|
||||
5. **Activity tab**: immutable audit log; export as a signed bundle for audits.
|
||||
|
||||
# Graph performance on large call‑graphs
|
||||
|
||||
* **Minimal‑latency snapshots**: pre‑render static PNG/SVG thumbnails server‑side; open with a tiny preview first, then hydrate to the interactive graph lazily.
|
||||
* **Progressive neighborhood expansion**: load 1‑hop first, expand on demand; keep the first TTFS < 500 ms.
|
||||
* **Stable node ordering**: deterministic layout with consistent anchors to avoid “graph shuffle” anxiety.
|
||||
* **Chunked graph edges** with capped fan‑out; collapse identical library paths into a **reachability macro‑edge**.
|
||||
|
||||
# Offline‑friendly design
|
||||
|
||||
* **Local evidence cache**: store (SBOM slices, path proofs, DSSE attestations, compiled call‑stacks) in a signed bundle beside the SARIF/VEX.
|
||||
* **Deferred enrichment**: mark fields that need internet (e.g., upstream CSAF fetch) and queue a background “enricher” when network returns.
|
||||
* **Predictable fallbacks**: if provenance server missing, show embedded DSSE and “verification pending,” never blank states.
|
||||
|
||||
# Audit & replay
|
||||
|
||||
* **Deterministic replay token**: hash(feed manifests + rules + lattice policy + inputs) → attach to every decision.
|
||||
* **One‑click “Reproduce”**: opens CLI snippet pinned to the exact versions and policies.
|
||||
* **Evidence hash‑set**: content‑address each proof artifact; the audit entry stores only hashes + signer.
|
||||
|
||||
# TTFS & Clicks‑to‑Closure: how to measure in code
|
||||
|
||||
* Emit a `ttfs.start` at alert creation; first paint of any evidence card emits `ttfs.signal`.
|
||||
* Increment a per‑alert **interaction counter**; on “Record decision” emit `close.clicks`.
|
||||
* Log **evidence bitset** (reach, stack, prov, vex) at decision time for completeness scoring.
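
A minimal sketch of these measurement hooks in C#. The type and member names (`TriageTelemetry`, the `Evidence` bitset) are illustrative assumptions, not an existing Stella Ops API; the point is that TTFS, the click counter, and the completeness bitset are all derivable from three small events.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Illustrative telemetry hooks for TTFS / Clicks-to-Closure (hypothetical names).
[Flags]
public enum Evidence { None = 0, Reachability = 1, CallStack = 2, Provenance = 4, Vex = 8 }

public sealed class TriageTelemetry
{
    private readonly ConcurrentDictionary<string, DateTimeOffset> _opened = new();
    private readonly ConcurrentDictionary<string, int> _clicks = new();

    // ttfs.start: alert opened
    public void AlertOpened(string alertId) => _opened[alertId] = DateTimeOffset.UtcNow;

    // ttfs.signal: first evidence card painted; returns the TTFS for this alert
    public TimeSpan? EvidenceFirstPaint(string alertId)
        => _opened.TryGetValue(alertId, out var t0) ? DateTimeOffset.UtcNow - t0 : null;

    // per-alert interaction counter
    public void Interaction(string alertId) => _clicks.AddOrUpdate(alertId, 1, (_, n) => n + 1);

    // close.clicks: emitted on "Record decision" together with the completeness bitset
    public (int Clicks, Evidence Bitset) DecisionRecorded(string alertId, Evidence present)
        => (_clicks.GetValueOrDefault(alertId), present);
}
```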
|
||||
|
||||
# Developer tasks (concrete, shippable)
|
||||
|
||||
* **Evidence API**: `GET /alerts/{id}/evidence` returns `{reachability, callstack, provenance, vex, hashes[]}` with deterministic sort.
|
||||
* **Proof renderer**: tiny, no‑framework widget that can render from the offline bundle; hydrate to full only on interaction.
|
||||
* **Keyboard map**: global handler with overlay help (`?`); no collisions; all actions are idempotent.
|
||||
* **Graph service**: server‑side layout + snapshot PNG; client hydrates WebGL only when user expands.
|
||||
* **Smart‑diff**: diff SBOM/VEX → classify into “risk‑raising / neutral / reducing,” surface only the first item by default.
|
||||
* **Audit logger**: append‑only stream; signed checkpoints; export `.stella-audit.tgz` (attestations + JSONL).
|
||||
|
||||
# Benchmarks to run weekly
|
||||
|
||||
* **TTFS under poor network** (100 ms RTT, 1% loss): p95 < 1.5 s to first evidence.
|
||||
* **Graph hydration on 250k‑edge image**: preview < 300 ms, interactive < 2.0 s.
|
||||
* **Keyboard coverage**: ≥90% of triage actions executable without mouse.
|
||||
* **Offline replay**: 100% of decisions re‑render from bundle; zero web calls required.
|
||||
|
||||
# Why Stella’s approach reduces hesitation
|
||||
|
||||
* **Deterministic sort orders** keep findings in place between refreshes.
|
||||
* **Minimal‑latency graph snapshots** show something trustworthy immediately, then refine—no “blank panel” delay.
|
||||
* **Replayable, signed bundles** make every click auditable and reversible, which builds operator confidence.
|
||||
|
||||
If you want, I can turn this into:
|
||||
|
||||
* a **UI checklist** for a design review,
|
||||
* a **.NET 10 API contract** (DTOs + endpoints),
|
||||
* or a **Cypress/Playwright test plan** that measures TTFS and clicks‑to‑closure automatically.
|
||||
Below is a PM‑style implementation guideline you can hand to developers. It’s written as a **build spec**: clear goals, “MUST/SHOULD” requirements, acceptance criteria, and the non‑functional guardrails (performance, offline, auditability) that make triage feel fast and defensible.
|
||||
|
||||
---
|
||||
|
||||
# Stella Ops — Evidence‑First Triage Implementation Guidelines (PM Spec)
|
||||
|
||||
## 0) Assumptions and scope
|
||||
|
||||
**Assumptions**
|
||||
|
||||
* Stella Ops ingests vulnerability findings (SCA/SAST/image scans), has SBOM context, and can compute reachability/call paths.
|
||||
* Triage outcomes must be recorded as VEX/CSAF‑compatible states with reasons and audit trails.
|
||||
* Users may operate in restricted networks and need an offline mode that still shows evidence.
|
||||
|
||||
**In scope**
|
||||
|
||||
* Evidence‑first alert triage UI + APIs + telemetry.
|
||||
* Reachability proof + call stack view + provenance attestation view.
|
||||
* VEX/CSAF decision recording with audit export.
|
||||
* Offline evidence bundle and deterministic replay token.
|
||||
|
||||
**Out of scope (for this phase)**
|
||||
|
||||
* Building the underlying static analyzer or SBOM generator (we consume their outputs).
|
||||
* Full CSAF publishing workflow (we store and export; publishing is separate).
|
||||
* Remediation automation (PRs, patching).
|
||||
|
||||
---
|
||||
|
||||
## 1) Product principles (non‑negotiables)
|
||||
|
||||
1. **Evidence before detail**
|
||||
Opening an alert **MUST** show the best available evidence immediately (even partial/placeholder), not a generic “details” page.
|
||||
2. **Fast first signal**
|
||||
The UI **MUST** render a credible “first signal” quickly (reachability badge, call stack snippet, or provenance block).
|
||||
3. **Determinism reduces hesitation**
|
||||
Sorting, graphs, and diffs **MUST** be stable across refreshes. No jittery re-layout.
|
||||
4. **Offline by design**
|
||||
If evidence exists locally (bundle), the UI **MUST** render it without network access.
|
||||
5. **Audit-ready by default**
|
||||
Every decision **MUST** be reproducible, attributable, and exportable with evidence hashes.
|
||||
|
||||
---
|
||||
|
||||
## 2) Success metrics (what we ship toward)
|
||||
|
||||
These become acceptance criteria and dashboards.
|
||||
|
||||
### Primary metrics (P0)
|
||||
|
||||
* **TTFS (Time‑to‑First‑Signal)**: p95 < **1.5s** from opening an alert to first evidence card rendering (with 100ms RTT, 1% loss simulation).
|
||||
* **Clicks‑to‑Closure**: median < **6** interactions to record a VEX decision.
|
||||
* **Evidence completeness** at decision time: ≥ **90%** of decisions include evidence hash set + reason + replay token.
|
||||
|
||||
### Secondary metrics (P1)
|
||||
|
||||
* **Offline resolution rate**: ≥ **95%** of alerts opened with a local bundle show reachability + provenance without network.
|
||||
* **Graph usability**: preview render < **300ms**, interactive hydration < **2.0s** for large graphs (see §7).
|
||||
|
||||
---
|
||||
|
||||
## 3) User workflows and “Definition of Done”
|
||||
|
||||
### Workflow A: Triage an alert to a decision
|
||||
|
||||
**DoD**: user can open an alert, see evidence, set VEX state, and the system records a signed/auditable decision event.
|
||||
|
||||
**Steps**
|
||||
|
||||
1. Alert list shows key signals (reachability badge, decision state, diff indicator).
|
||||
2. Open alert → Evidence view loads first.
|
||||
3. User reviews reachability/call stack/provenance.
|
||||
4. User sets VEX status + reason preset (editable).
|
||||
5. User records decision.
|
||||
6. Audit log entry appears instantly and is exportable.
|
||||
|
||||
### Workflow B: Explain “why is this flagged?”
|
||||
|
||||
**DoD**: user can show a defensible proof (path/call stack/provenance) and copy it into a ticket.
|
||||
|
||||
---
|
||||
|
||||
## 4) UI requirements (MUST/SHOULD/MAY)
|
||||
|
||||
### 4.1 Alert list page
|
||||
|
||||
**MUST**
|
||||
|
||||
* Each row includes:
|
||||
|
||||
* Severity + component identifier
|
||||
* **Decision state** (Unset / Under Investigation / Not Affected / Affected)
|
||||
* **Reachability badge** (Reachable / Not Reachable / Unknown) where available
|
||||
* **Diff indicator** if SBOM/VEX changed since last scan (simple dot/label)
|
||||
* Age / first seen / last updated
|
||||
* **Deterministic sort** default:
|
||||
`Reachability DESC → Severity DESC → Decision state (Unset first) → Age DESC → Component name ASC` (a comparer sketch follows at the end of this subsection)
|
||||
* Keyboard navigation:
|
||||
|
||||
* `↑/↓` move selection, `Enter` open alert.
|
||||
* `/` search/filter focus.
|
||||
|
||||
**SHOULD**
|
||||
|
||||
* Inline “quick set” decision menu (Affected / Not affected / Under investigation) without leaving list for obvious cases, but still requires reason and logs evidence hashes.
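
A minimal sketch of the deterministic default sort referenced above, as a C# comparer. The `AlertRow` fields are assumptions about what the list view model exposes; the ordering logic itself mirrors the spec.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row model; field names are illustrative.
public sealed record AlertRow(int ReachabilityRank, int SeverityRank, bool HasDecision,
                              DateTimeOffset FirstSeen, string Component);

// Reachability DESC -> Severity DESC -> Unset decisions first -> Age DESC -> Component ASC.
public sealed class DefaultAlertOrder : IComparer<AlertRow>
{
    public int Compare(AlertRow? x, AlertRow? y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x is null) return 1;
        if (y is null) return -1;

        int c = y.ReachabilityRank.CompareTo(x.ReachabilityRank);   // DESC
        if (c != 0) return c;
        c = y.SeverityRank.CompareTo(x.SeverityRank);               // DESC
        if (c != 0) return c;
        c = x.HasDecision.CompareTo(y.HasDecision);                 // false (Unset) sorts first
        if (c != 0) return c;
        c = x.FirstSeen.CompareTo(y.FirstSeen);                     // older first = Age DESC
        if (c != 0) return c;
        return string.CompareOrdinal(x.Component, y.Component);     // ASC, culture-independent
    }
}
```

Using an ordinal string comparison and fully tie-broken keys keeps the order identical across refreshes and nodes, which is the point of the requirement.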
|
||||
|
||||
### 4.2 Alert detail — landing tab MUST be Evidence
|
||||
|
||||
**MUST**
|
||||
|
||||
* Default landing is **Evidence** (not “Overview”).
|
||||
* Top section shows 3 “proof pills” with status:
|
||||
|
||||
* Reachability (✓ / ! / …)
|
||||
* Call stack (✓ / ! / …)
|
||||
* Provenance (✓ / ! / …)
|
||||
* Each pill expands inline (no navigation) into a compact evidence panel.
|
||||
|
||||
**MUST: No blank panels**
|
||||
|
||||
* If evidence is loading, show skeleton + “what’s coming.”
|
||||
* If evidence missing, show a reason (“not computed”, “requires source map”, “offline – enrichment pending”).
|
||||
|
||||
### 4.3 Decision drawer
|
||||
|
||||
**MUST**
|
||||
|
||||
* Pinned right drawer (or persistent bottom sheet on small screens).
|
||||
* Controls:
|
||||
|
||||
* VEX/CSAF status: **Affected / Not affected / Under investigation**
|
||||
* Reason preset dropdown + editable reason text
|
||||
* “Record decision” button
|
||||
* Preview “Audit summary” before submit:
|
||||
|
||||
* Evidence hashes included
|
||||
* Policy context (ruleset version)
|
||||
* Replay token
|
||||
* Actor identity
|
||||
|
||||
**MUST**
|
||||
|
||||
* On submit, create an append-only audit event and immediately reflect status in UI.
|
||||
|
||||
**SHOULD**
|
||||
|
||||
* Allow attaching references: ticket URL, incident ID, PR link (stored as metadata).
|
||||
|
||||
### 4.4 Diff tab
|
||||
|
||||
**MUST**
|
||||
|
||||
* Show delta since last scan:
|
||||
|
||||
* SBOM diffs (component version changes, removals/additions)
|
||||
* VEX diffs (status changes)
|
||||
* Group diffs by **risk shift**:
|
||||
|
||||
* Risk‑raising (new reachable vuln, severity increase)
|
||||
* Neutral (metadata-only)
|
||||
* Risk‑reducing (fixed version, reachability removed)
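
A minimal sketch of how the grouping above could be computed. The `DiffEntry` flags are assumptions about what the SBOM/VEX diff already exposes; the mapping follows the three buckets listed here.

```csharp
// Hypothetical diff classification; field names are illustrative.
public enum RiskShift { Raising, Neutral, Reducing }

public sealed record DiffEntry(
    bool NewVulnerability, bool BecameReachable, bool SeverityIncreased,
    bool FixedVersionApplied, bool ReachabilityRemoved, bool SeverityDecreased);

public static class RiskShiftClassifier
{
    public static RiskShift Classify(DiffEntry e)
    {
        if (e.NewVulnerability || e.BecameReachable || e.SeverityIncreased)
            return RiskShift.Raising;
        if (e.FixedVersionApplied || e.ReachabilityRemoved || e.SeverityDecreased)
            return RiskShift.Reducing;
        return RiskShift.Neutral;   // metadata-only changes
    }
}
```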
|
||||
|
||||
**SHOULD**
|
||||
|
||||
* Provide “Copy diff summary” for change management.
|
||||
|
||||
### 4.5 Activity/Audit tab
|
||||
|
||||
**MUST**
|
||||
|
||||
* Immutable timeline of decisions and evidence changes.
|
||||
* Each entry includes:
|
||||
|
||||
* actor, timestamp, decision, reason
|
||||
* evidence hash set
|
||||
* replay token
|
||||
* bundle/export availability
|
||||
|
||||
---
|
||||
|
||||
## 5) Power-user and accessibility requirements
|
||||
|
||||
### Keyboard shortcuts (MUST)
|
||||
|
||||
* `J`: jump to next missing/incomplete evidence panel
|
||||
* `R`: toggle reachability view (list ↔ compact graph ↔ textual proof)
|
||||
* `Y`: copy selected evidence block (call stack / DSSE / path proof)
|
||||
* `A`: set “Affected” (opens reason preset selection)
|
||||
* `N`: set “Not affected”
|
||||
* `U`: set “Under investigation”
|
||||
* `?`: keyboard help overlay
|
||||
|
||||
### Accessibility (MUST)
|
||||
|
||||
* Fully navigable by keyboard
|
||||
* Visible focus states
|
||||
* Screen-reader labels for evidence pills and drawer controls
|
||||
* Color is never the only signal (badges must have text/icon)
|
||||
|
||||
---
|
||||
|
||||
## 6) Evidence model: what every alert should attempt to provide
|
||||
|
||||
Treat this as the **minimum evidence bundle**. Each item may be “unavailable,” but must be explicit.
|
||||
|
||||
**MUST** support:
|
||||
|
||||
1. **Reachability proof**
|
||||
|
||||
* At least one of:
|
||||
|
||||
* function-level call path: `entry → … → vulnerable_sink`
|
||||
* package/module import chain
|
||||
* Includes confidence/algorithm tag: `static`, `dynamic`, `heuristic`
|
||||
2. **Call stack snippet**
|
||||
|
||||
* 5–10 frames around the relevant node with file:line anchors where possible
|
||||
3. **Provenance**
|
||||
|
||||
* DSSE attestation or equivalent statement
|
||||
* Artifact ancestry chain: image → layer → artifact → commit (as available)
|
||||
* Verification status: verified / pending / failed (with reason)
|
||||
4. **Decision state**
|
||||
|
||||
* VEX status + reason + timestamps
|
||||
5. **Evidence hash set**
|
||||
|
||||
* Content-addressed hashes of each evidence artifact included in the decision
|
||||
|
||||
**SHOULD**
|
||||
|
||||
* “Evidence freshness”: when computed, tool version, input revisions.
|
||||
|
||||
---
|
||||
|
||||
## 7) Performance and graph rendering requirements
|
||||
|
||||
### TTFS budget (MUST)
|
||||
|
||||
* When opening an alert:
|
||||
|
||||
* **<200ms**: show skeleton and cached row metadata
|
||||
* **<500ms**: render at least one evidence pill with meaningful content OR a cached preview image
|
||||
* **<1.5s p95**: render reachability + provenance for typical alerts
|
||||
|
||||
### Graph rendering for large call graphs (MUST)
|
||||
|
||||
* **Two-phase rendering**
|
||||
|
||||
1. Server-generated **static snapshot** (PNG/SVG) displayed immediately
|
||||
2. Interactive graph hydrates lazily on user expand
|
||||
* **Progressive expansion**
|
||||
|
||||
* Load 1-hop neighborhood first; expand on click
|
||||
* **Deterministic layout**
|
||||
|
||||
* Same input produces same layout anchors (no reshuffles between refreshes)
|
||||
* **Fan-out control**
|
||||
|
||||
* Collapse repeated library paths into “macro edges” to keep the graph readable
|
||||
|
||||
---
|
||||
|
||||
## 8) Offline mode requirements
|
||||
|
||||
Offline is not “nice to have”; it is a defined mode.
|
||||
|
||||
### Offline evidence bundle (MUST)
|
||||
|
||||
* A single file (e.g., `.stella.bundle.tgz`) that contains:
|
||||
|
||||
* Alert metadata snapshot
|
||||
* Evidence artifacts (reachability proofs, call stacks, provenance attestations)
|
||||
* SBOM slice(s) necessary for diffs
|
||||
* VEX decision history (if available)
|
||||
* Manifest with content hashes (Merkle-ish)
|
||||
* Bundle must be **signed** (or include signature material) and verifiable.
|
||||
|
||||
### UI behavior (MUST)
|
||||
|
||||
* If bundle is present:
|
||||
|
||||
* UI loads evidence from it first
|
||||
* Any missing items show “enrichment pending” (not “error”)
|
||||
* If network returns:
|
||||
|
||||
* Background refresh allowed, but **must not reorder** the alert list unexpectedly
|
||||
* Must surface “updated evidence available” as a user-controlled refresh, not an auto-switch that changes context mid-triage
|
||||
|
||||
---
|
||||
|
||||
## 9) Auditability and replay requirements
|
||||
|
||||
### Decision event schema (MUST)
|
||||
|
||||
Every recorded decision must store:
|
||||
|
||||
* `alert_id`, `artifact_id` (image digest or commit hash)
|
||||
* `actor_id`, `timestamp`
|
||||
* `decision_status` (Affected/Not affected/Under investigation)
|
||||
* `reason_code` (preset) + `reason_text`
|
||||
* `evidence_hashes[]` (content-addressed hashes)
|
||||
* `policy_context` (ruleset version, policy id)
|
||||
* `replay_token` (hash of inputs needed to reproduce)
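
One possible .NET shape for this event, shown only to make the field list concrete; names mirror the bullets above and are not a committed schema.

```csharp
using System;
using System.Collections.Generic;

public enum DecisionStatus { Affected, NotAffected, UnderInvestigation }

// Sketch of an append-only decision event record (illustrative, not a final contract).
public sealed record DecisionEvent(
    string AlertId,
    string ArtifactId,                      // image digest or commit hash
    string ActorId,
    DateTimeOffset Timestamp,               // UTC
    DecisionStatus Status,
    string ReasonCode,                      // preset
    string ReasonText,
    IReadOnlyList<string> EvidenceHashes,   // content-addressed hashes
    string PolicyContext,                   // ruleset version / policy id
    string ReplayToken);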
|
||||
|
||||
### Replay token (MUST)
|
||||
|
||||
* Deterministic hash of:
|
||||
|
||||
* scan inputs (SBOM digest, image digest, tool versions)
|
||||
* policy/rules versions
|
||||
* reachability algorithm version
|
||||
* “Reproduce” button produces a CLI snippet (copyable) pinned to these versions.
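
A minimal sketch of the replay token computation, assuming the inputs listed above are available as strings. The canonical ordering step is what makes the token deterministic; the exact input set and labels are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class ReplayToken
{
    public static string Compute(
        string sbomDigest, string imageDigest,
        IReadOnlyDictionary<string, string> toolVersions,
        string policyVersion, string reachabilityAlgoVersion)
    {
        var parts = new List<string>
        {
            $"sbom={sbomDigest}",
            $"image={imageDigest}",
            $"policy={policyVersion}",
            $"reach={reachabilityAlgoVersion}",
        };
        // Sort tool versions so the same inputs always hash to the same token.
        parts.AddRange(toolVersions
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => $"tool:{kv.Key}={kv.Value}"));

        var canonical = string.Join("\n", parts);
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return "sha256:" + Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```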
|
||||
|
||||
### Export (MUST)
|
||||
|
||||
* Exportable audit bundle that includes:
|
||||
|
||||
* JSONL of decision events
|
||||
* evidence artifacts referenced by hashes
|
||||
* signatures/attestations
|
||||
* Export must be stable and verifiable later.
|
||||
|
||||
---
|
||||
|
||||
## 10) API and data contract guidelines (developer-facing)
|
||||
|
||||
This is an implementation guideline, not a full API spec—keep it simple and cache-friendly.
|
||||
|
||||
### MUST endpoints (or equivalent)
|
||||
|
||||
* `GET /alerts?filters…` → list view payload (small, cacheable)
|
||||
* `GET /alerts/{id}/evidence` → evidence payload (reachability, call stack, provenance, hashes)
|
||||
* `POST /alerts/{id}/decisions` → record decision event (append-only)
|
||||
* `GET /alerts/{id}/audit` → audit timeline
|
||||
* `GET /alerts/{id}/diff?baseline=…` → SBOM/VEX diff view
|
||||
* `GET /bundles/{id}` and/or `POST /bundles/verify` → offline bundle download/verify
|
||||
|
||||
### Evidence payload guidelines (MUST)
|
||||
|
||||
* Deterministic ordering for arrays and nodes (stable sorts).
|
||||
* Explicit `status` per evidence section: `available | loading | unavailable | error`.
|
||||
* Include `hash` per artifact for content addressing.
|
||||
|
||||
**Example shape**
|
||||
|
||||
```json
{
  "alert_id": "a123",
  "reachability": { "status": "available", "hash": "sha256:…", "proof": { "type": "call_path", "nodes": [...] } },
  "callstack": { "status": "available", "hash": "sha256:…", "frames": [...] },
  "provenance": { "status": "pending", "hash": null, "dsse": { "embedded": true, "payload": "…" } },
  "vex": { "status": "available", "current": {...}, "history": [...] },
  "hashes": ["sha256:…", "sha256:…"]
}
```
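
If the evidence payload is served from .NET, a possible DTO shape matching the example is sketched below. Property and type names are assumptions; status values would need a lower-case enum converter (or plain strings) to round-trip the JSON exactly.

```csharp
using System.Collections.Generic;
using System.Text.Json.Serialization;

public enum EvidenceStatus { Available, Loading, Unavailable, Error, Pending }

public sealed record EvidenceSection(
    [property: JsonPropertyName("status")] EvidenceStatus Status,
    [property: JsonPropertyName("hash")] string? Hash);

public sealed record AlertEvidence(
    [property: JsonPropertyName("alert_id")] string AlertId,
    [property: JsonPropertyName("reachability")] EvidenceSection Reachability,
    [property: JsonPropertyName("callstack")] EvidenceSection Callstack,
    [property: JsonPropertyName("provenance")] EvidenceSection Provenance,
    [property: JsonPropertyName("vex")] EvidenceSection Vex,
    [property: JsonPropertyName("hashes")] IReadOnlyList<string> Hashes);
```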
|
||||
|
||||
---
|
||||
|
||||
## 11) Telemetry requirements (how we prove it’s fast)
|
||||
|
||||
**MUST** instrument:
|
||||
|
||||
* `alert_opened` (timestamp, alert_id)
|
||||
* `evidence_first_paint` (timestamp, evidence_type)
|
||||
* `decision_recorded` (timestamp, clicks_count, evidence_bitset)
|
||||
* `bundle_loaded` (hit/miss, size, verification_status)
|
||||
* `graph_preview_paint` and `graph_hydrated`
|
||||
|
||||
**MUST** compute:
|
||||
|
||||
* TTFS = `evidence_first_paint - alert_opened`
|
||||
* Clicks‑to‑Closure = interaction counter per alert until decision recorded
|
||||
* Evidence completeness bitset at decision time: reachability/callstack/provenance/vex present
|
||||
|
||||
---
|
||||
|
||||
## 12) Error handling and edge cases
|
||||
|
||||
**MUST**
|
||||
|
||||
* Never show empty states without explanation.
|
||||
* Distinguish between:
|
||||
|
||||
* “not computed yet”
|
||||
* “not possible due to missing inputs”
|
||||
* “blocked by permissions”
|
||||
* “offline—enrichment pending”
|
||||
* “verification failed”
|
||||
|
||||
**SHOULD**
|
||||
|
||||
* Offer “Request enrichment” action when evidence missing (creates a job/task id).
|
||||
|
||||
---
|
||||
|
||||
## 13) Security, permissions, and multi-tenancy
|
||||
|
||||
**MUST**
|
||||
|
||||
* RBAC gating for:
|
||||
|
||||
* viewing provenance attestations
|
||||
* recording decisions
|
||||
* exporting audit bundles
|
||||
* All decision events are immutable; corrections are new events (append-only).
|
||||
* PII handling:
|
||||
|
||||
* Avoid storing freeform reasons with secrets; warn on paste patterns (optional P1).
|
||||
|
||||
---
|
||||
|
||||
## 14) Engineering execution plan (priorities)
|
||||
|
||||
### P0 (ship first)
|
||||
|
||||
* Evidence-first alert detail landing
|
||||
* Decision drawer + append-only audit
|
||||
* Deterministic alert list sort + reachability badge
|
||||
* Evidence API + decision POST
|
||||
* TTFS + clicks telemetry
|
||||
* Static graph preview + lazy hydration
|
||||
|
||||
### P1
|
||||
|
||||
* Offline bundle load/verify + offline rendering
|
||||
* Smart diff view (risk shift grouping)
|
||||
* Exportable audit bundle
|
||||
* Keyboard shortcuts + help overlay
|
||||
|
||||
### P2
|
||||
|
||||
* Inline quick decisions from list
|
||||
* Advanced graph search within view
|
||||
* Suggest reason presets based on evidence patterns
|
||||
|
||||
---
|
||||
|
||||
## 15) Acceptance criteria checklist (what QA signs off)
|
||||
|
||||
A build is acceptable when:
|
||||
|
||||
* Opening an alert renders at least one evidence pill within **500ms** (with cache) and TTFS p95 meets target under network simulation.
|
||||
* Users can record A/N/U decisions with reason and see an audit event immediately.
|
||||
* Decision event includes evidence hashes + replay token.
|
||||
* Alert list sorting is stable and deterministic across refresh.
|
||||
* Graph preview appears instantly; interactive graph hydrates only on expand.
|
||||
* Offline bundle renders evidence without network; missing items show “enrichment pending,” not errors.
|
||||
* Keyboard shortcuts work; `?` overlay lists them; full keyboard navigation is possible.
|
||||
|
||||
---
|
||||
|
||||
If you want, I can also format this into a **developer-ready ticket pack** (epics + user stories + acceptance tests) so engineers can implement without interpretation drift.
|
||||
@@ -0,0 +1,544 @@
|
||||
Here’s a quick, practical cheat‑sheet on choosing **PostgreSQL vs MongoDB** for security/DevOps apps—plus how I’d model SBOM/VEX and queues in Stella Ops without adding moving parts.
|
||||
|
||||
---
|
||||
|
||||
# PostgreSQL you can lean on (why it often wins for ops apps)
|
||||
|
||||
* **JSONB that flies:** Store documents yet query like SQL. Add **GIN indexes** on JSONB fields for fast lookups (`jsonb_ops` general; `jsonb_path_ops` great for `@>` containment).
|
||||
* **Queue pattern built‑in:** `SELECT … FOR UPDATE SKIP LOCKED` lets multiple workers pop jobs from the same table safely—no head‑of‑line blocking, no extra broker.
|
||||
* **Cooperative locks:** **Advisory locks** (session/transaction) for “at‑most‑once” sections or leader election.
|
||||
* **Lightweight pub/sub:** **LISTEN/NOTIFY** for async nudges between services (poke a worker to re‑scan, refresh cache, etc.).
|
||||
* **Search included:** **Full‑text search** (tsvector/tsquery) is native—no separate search service for moderate needs.
|
||||
* **Serious backups:** **PITR** with WAL archiving / `pg_basebackup` for deterministic rollbacks and offline bundles.
|
||||
|
||||
# MongoDB facts to factor in
|
||||
|
||||
* **Flexible ingest:** Schemaless docs make it easy to absorb varied telemetry and vendor feeds.
|
||||
* **Horizontal scale:** Sharding is mature for huge, read‑heavy datasets.
|
||||
* **Consistency is a choice:** Design embedding vs refs and when to use multi‑document transactions.
|
||||
|
||||
---
|
||||
|
||||
# A simple rule of thumb (Stella Ops‑style)
|
||||
|
||||
* **System of record:** PostgreSQL (JSONB first).
|
||||
* **Hot paths:** Materialized views + JSONB GIN indexes.
|
||||
* **Queues & coordination:** PostgreSQL (skip‑locked + advisory locks).
|
||||
* **Cache/accel only:** Valkey (ephemeral).
|
||||
* **MongoDB:** Optional for **very large, read‑optimized graph snapshots** (e.g., periodically baked reachability graphs) if Postgres starts to strain.
|
||||
|
||||
---
|
||||
|
||||
# Concrete patterns you can drop in today
|
||||
|
||||
**1) SBOM/VEX storage (Postgres JSONB)**
|
||||
|
||||
```sql
-- Documents
CREATE TABLE sbom (
  id BIGSERIAL PRIMARY KEY,
  artifact_purl TEXT NOT NULL,
  doc JSONB NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX sbom_purl_idx ON sbom(artifact_purl);
CREATE INDEX sbom_doc_gin ON sbom USING GIN (doc jsonb_path_ops);

-- Common queries
-- find components by name/version:
-- SELECT * FROM sbom WHERE doc @> '{"components":[{"name":"openssl","version":"3.0.14"}]}';

-- VEX
CREATE TABLE vex (
  id BIGSERIAL PRIMARY KEY,
  subject_purl TEXT NOT NULL,
  vex_doc JSONB NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX vex_subject_idx ON vex(subject_purl);
CREATE INDEX vex_doc_gin ON vex USING GIN (vex_doc jsonb_path_ops);
```
|
||||
|
||||
**2) Hot reads via materialized views**
|
||||
|
||||
```sql
CREATE MATERIALIZED VIEW mv_open_findings AS
SELECT
  s.artifact_purl,
  c->>'name'    AS comp,
  c->>'version' AS ver,
  v.vex_doc
FROM sbom s
CROSS JOIN LATERAL jsonb_array_elements(s.doc->'components') c
LEFT JOIN vex v ON v.subject_purl = s.artifact_purl
-- add WHERE clauses to pre-filter only actionable rows
;
CREATE INDEX mv_open_findings_idx ON mv_open_findings(artifact_purl, comp);
```
|
||||
|
||||
Refresh cadence: on feed import or via a scheduler; `REFRESH MATERIALIZED VIEW CONCURRENTLY mv_open_findings;` (note that `CONCURRENTLY` requires a unique index on the view, per §6 of the guidelines below).
|
||||
|
||||
**3) Queue without a broker**
|
||||
|
||||
```sql
CREATE TABLE job_queue(
  id BIGSERIAL PRIMARY KEY,
  kind TEXT NOT NULL,              -- e.g., 'scan', 'sbom-diff'
  payload JSONB NOT NULL,
  run_after TIMESTAMPTZ DEFAULT now(),
  attempts INT DEFAULT 0,
  locked_at TIMESTAMPTZ,
  locked_by TEXT
);
CREATE INDEX job_queue_ready_idx ON job_queue(kind, run_after);

-- Worker loop
WITH cte AS (
  SELECT id FROM job_queue
  WHERE kind = $1 AND run_after <= now() AND locked_at IS NULL
  ORDER BY id
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(), locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;
```
|
||||
|
||||
Release/fail with: set `locked_at=NULL, locked_by=NULL, attempts=attempts+1` or delete on success.
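
A minimal worker-loop sketch around that claim query, assuming Dapper on top of Npgsql (the doc mentions Dapper/EF Core as candidates; nothing here is a committed Stella Ops API). Work happens after the claim statement commits, so no row lock is held during the scan itself.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Dapper;
using Npgsql;

public sealed record JobRow(long Id, string Kind, string Payload, int Attempts);

public sealed class JobWorker(string connectionString, string workerId)
{
    private const string ClaimSql = """
        WITH cte AS (
          SELECT id FROM job_queue
          WHERE kind = @kind AND run_after <= now() AND locked_at IS NULL
          ORDER BY id
          FOR UPDATE SKIP LOCKED
          LIMIT 1
        )
        UPDATE job_queue j SET locked_at = now(), locked_by = @worker
        FROM cte WHERE j.id = cte.id
        RETURNING j.id, j.kind, j.payload::text AS payload, j.attempts;
        """;

    public async Task RunAsync(string kind, CancellationToken ct)
    {
        await using var conn = new NpgsqlConnection(connectionString);
        await conn.OpenAsync(ct);

        while (!ct.IsCancellationRequested)
        {
            var job = await conn.QuerySingleOrDefaultAsync<JobRow>(
                ClaimSql, new { kind, worker = workerId });
            if (job is null) { await Task.Delay(TimeSpan.FromSeconds(2), ct); continue; }

            try
            {
                // ... do the actual work here, outside any open transaction ...
                await conn.ExecuteAsync("DELETE FROM job_queue WHERE id = @id", new { id = job.Id });
            }
            catch
            {
                // Release with backoff so another worker can retry later.
                await conn.ExecuteAsync(
                    "UPDATE job_queue SET locked_at = NULL, locked_by = NULL, attempts = attempts + 1, " +
                    "run_after = now() + interval '30 seconds' WHERE id = @id",
                    new { id = job.Id });
            }
        }
    }
}
```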
|
||||
|
||||
**4) Advisory lock for singletons**
|
||||
|
||||
```sql
-- Acquire (per tenant, per artifact)
SELECT pg_try_advisory_xact_lock(hashtextextended('recalc:'||tenant||':'||artifact, 0));
```
|
||||
|
||||
**5) Nudge workers without a bus**
|
||||
|
||||
```sql
-- NOTIFY only accepts a constant payload; use pg_notify() when the payload is computed
SELECT pg_notify('stella_scan', json_build_object('purl', $1, 'priority', 5)::text);
-- workers LISTEN stella_scan and enqueue quickly
```
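
On the consumer side, a minimal listener sketch using Npgsql's notification support (an assumption about the client library, consistent with the .NET 10 worker mentioned below): issue `LISTEN`, then wait for notifications and enqueue work from the event handler.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Npgsql;

public static class ScanListener
{
    public static async Task ListenAsync(string connectionString, CancellationToken ct)
    {
        await using var conn = new NpgsqlConnection(connectionString);
        await conn.OpenAsync(ct);

        // Fires whenever a NOTIFY on a subscribed channel arrives on this connection.
        conn.Notification += (_, e) =>
            Console.WriteLine($"channel={e.Channel} payload={e.Payload}"); // enqueue work here

        await using (var cmd = new NpgsqlCommand("LISTEN stella_scan;", conn))
            await cmd.ExecuteNonQueryAsync(ct);

        while (!ct.IsCancellationRequested)
            await conn.WaitAsync(ct);   // blocks until a notification is delivered
    }
}
```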
|
||||
|
||||
---
|
||||
|
||||
# When to add MongoDB
|
||||
|
||||
* You need **interactive exploration** over **hundreds of millions of nodes/edges** (e.g., historical “proof‑of‑integrity” graphs) where document fan‑out and denormalized reads beat relational joins.
|
||||
* Snapshot cadence is **batchy** (hourly/daily), and you can **re‑emit** snapshots deterministically from Postgres (single source of truth).
|
||||
* You want to isolate read spikes from the transactional core.
|
||||
|
||||
**Snapshot pipe:** Postgres → (ETL) → MongoDB collection `{graph_id, node, edges[], attrs}` with **compound shard keys** tuned to your UI traversal.
|
||||
|
||||
---
|
||||
|
||||
# Why this fits Stella Ops
|
||||
|
||||
* Fewer moving parts on‑prem/air‑gapped.
|
||||
* Deterministic replays (PITR + immutable imports).
|
||||
* Clear performance levers (GIN indexes, MVs, skip‑locked queues).
|
||||
* MongoDB stays optional, purpose‑built for giant read graphs—not a default dependency.
|
||||
|
||||
If you want, I can turn the above into ready‑to‑run `.sql` migrations and a small **.NET 10** worker (Dapper/EF Core) that implements the queue loop + advisory locks + LISTEN/NOTIFY hooks.
|
||||
Below is a handoff-ready set of **PostgreSQL tables/views engineering guidelines** intended for developer review. It is written as a **gap-finding checklist** with **concrete DDL patterns** and **performance red flags** (Postgres as system of record, JSONB where useful, derived projections where needed).
|
||||
|
||||
---
|
||||
|
||||
# PostgreSQL Tables & Views Engineering Guide
|
||||
|
||||
## 0) Non-negotiable principles
|
||||
|
||||
1. **Every hot query must have an index story.** If you cannot name the index that serves it, you have a performance gap.
|
||||
2. **Write path stays simple.** Prefer **append-only** versioning to large updates (especially for JSONB).
|
||||
3. **Multi-tenant must be explicit.** Every core table includes `tenant_id` and indexes are tenant-prefixed.
|
||||
4. **Derived data is a product.** If the UI needs it fast, model it as a **projection table or materialized view**, not as an ad-hoc mega-join.
|
||||
5. **Idempotency is enforced in the DB.** Unique keys for imports/jobs/results; no “best effort” dedupe in application only.
|
||||
|
||||
---
|
||||
|
||||
# 1) Table taxonomy and what to look for
|
||||
|
||||
Use this to classify every table; each class has different indexing/retention/locking rules.
|
||||
|
||||
### A. Source-of-truth (SOR) tables
|
||||
|
||||
Examples: `sbom_document`, `vex_document`, `feed_import`, `scan_manifest`, `attestation`.
|
||||
|
||||
* **Expect:** immutable rows, versioning via new row inserts.
|
||||
* **Gaps:** frequent updates to large JSONB; missing `content_hash`; no unique idempotency key.
|
||||
|
||||
### B. Projection tables (query-optimized)
|
||||
|
||||
Examples: `open_findings`, `artifact_risk_summary`, `component_index`.
|
||||
|
||||
* **Expect:** denormalized, indexed for UI/API; refresh/update strategy defined.
|
||||
* **Gaps:** projections rebuilt from scratch too often; missing incremental update plan; no retention plan.
|
||||
|
||||
### C. Queue/outbox tables
|
||||
|
||||
Examples: `job_queue`, `outbox_events`.
|
||||
|
||||
* **Expect:** `SKIP LOCKED` claim pattern; retry + DLQ; minimal lock duration.
|
||||
* **Gaps:** holding row locks while doing work; missing partial index for “ready” jobs.
|
||||
|
||||
### D. Audit/event tables
|
||||
|
||||
Examples: `scan_run_event`, `decision_event`, `access_audit`.
|
||||
|
||||
* **Expect:** append-only; partitioned by time; BRIN on timestamps.
|
||||
* **Gaps:** single huge table without partitioning; slow deletes instead of partition drops.
|
||||
|
||||
---
|
||||
|
||||
# 2) Naming, keys, and required columns
|
||||
|
||||
## Required columns per class
|
||||
|
||||
### SOR documents (SBOM/VEX/Attestations)
|
||||
|
||||
* `tenant_id uuid`
|
||||
* `id bigserial` (internal PK)
|
||||
* `external_id uuid` (optional API-facing id)
|
||||
* `content_hash bytea` (sha256) **NOT NULL**
|
||||
* `doc jsonb` **NOT NULL**
|
||||
* `created_at timestamptz` **NOT NULL default now()**
|
||||
* `supersedes_id bigint NULL` (version chain) OR `version int`
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] Unique constraint exists: `(tenant_id, content_hash)`
|
||||
* [ ] Version strategy exists (supersedes/version) and is queryable
|
||||
* [ ] “Latest” access is index-backed (see §4)
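
A small sketch of how `content_hash` could be derived so the `(tenant_id, content_hash)` idempotency key stays stable across re-imports. The JSON round-trip shown here is a simplification, not a full canonicalization scheme; if upstream key order is not already stable, plug in a real canonicalizer before hashing.

```csharp
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

public static class ContentHash
{
    public static byte[] Sha256OfCanonicalJson(string rawJson)
    {
        using var doc = JsonDocument.Parse(rawJson);
        // Re-serialize without indentation; assumes the producer already emits keys
        // in a stable order (otherwise canonicalize explicitly first).
        var canonical = JsonSerializer.Serialize(doc.RootElement,
            new JsonSerializerOptions { WriteIndented = false });
        return SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
    }
}
```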
|
||||
|
||||
### Queue
|
||||
|
||||
* `tenant_id uuid` (if multi-tenant)
|
||||
* `id bigserial`
|
||||
* `kind text`
|
||||
* `payload jsonb`
|
||||
* `run_after timestamptz`
|
||||
* `attempts int`
|
||||
* `locked_at timestamptz NULL`
|
||||
* `locked_by text NULL`
|
||||
* `status smallint` (optional; e.g., ready/running/done/dead)
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] “Ready to claim” has a partial index (see §4)
|
||||
* [ ] Claim transaction is short (claim+commit; work outside lock)
|
||||
|
||||
---
|
||||
|
||||
# 3) JSONB rules that prevent “looks fine → melts in prod”
|
||||
|
||||
## When JSONB is appropriate
|
||||
|
||||
* Storing signed envelopes (DSSE), SBOM/VEX raw docs, vendor payloads.
|
||||
* Ingest-first scenarios where schema evolves.
|
||||
|
||||
## When JSONB is a performance hazard
|
||||
|
||||
* You frequently query deep keys/arrays (components, vulnerabilities, call paths).
|
||||
* You need sorting/aggregations on doc fields.
|
||||
|
||||
**Mandatory pattern for hot JSON fields**
|
||||
|
||||
1. Keep the raw JSONB for fidelity.
|
||||
2. Extract **hot keys** into **stored generated columns** (or real columns), index those.
|
||||
3. Extract **hot arrays** into child tables (components, vulnerabilities).
|
||||
|
||||
Example:
|
||||
|
||||
```sql
CREATE TABLE sbom_document (
  id bigserial PRIMARY KEY,
  tenant_id uuid NOT NULL,
  artifact_purl text NOT NULL,
  content_hash bytea NOT NULL,
  doc jsonb NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),

  -- hot keys as generated columns
  bom_format text GENERATED ALWAYS AS ((doc->>'bomFormat')) STORED,
  spec_version text GENERATED ALWAYS AS ((doc->>'specVersion')) STORED
);

CREATE UNIQUE INDEX ux_sbom_doc_hash ON sbom_document(tenant_id, content_hash);
CREATE INDEX ix_sbom_doc_tenant_artifact ON sbom_document(tenant_id, artifact_purl, created_at DESC);
CREATE INDEX ix_sbom_doc_json_gin ON sbom_document USING GIN (doc jsonb_path_ops);
CREATE INDEX ix_sbom_doc_bomformat ON sbom_document(tenant_id, bom_format);
```
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] Any query using `doc->>` in WHERE has either an expression index or a generated column index
|
||||
* [ ] Any query using `jsonb_array_elements(...)` in hot path has been replaced by a normalized child table or a projection table
|
||||
|
||||
---
|
||||
|
||||
# 4) Indexing standards (what devs must justify)
|
||||
|
||||
## Core rules
|
||||
|
||||
1. **Tenant-first**: `INDEX(tenant_id, …)` for anything read per tenant.
|
||||
2. **Sort support**: if query uses `ORDER BY created_at DESC`, index must end with `created_at DESC`.
|
||||
3. **Partial indexes** for sparse predicates (status/locked flags).
|
||||
4. **BRIN** for massive append-only time series.
|
||||
5. **GIN jsonb_path_ops** for containment (`@>`) on JSONB; avoid GIN for everything.
|
||||
|
||||
## Required index patterns by use case
|
||||
|
||||
### “Latest version per artifact”
|
||||
|
||||
If you store versions as rows:
|
||||
|
||||
```sql
-- supports: WHERE tenant_id=? AND artifact_purl=? ORDER BY created_at DESC LIMIT 1
CREATE INDEX ix_sbom_latest ON sbom_document(tenant_id, artifact_purl, created_at DESC);
```
|
||||
|
||||
### Ready queue claims
|
||||
|
||||
```sql
CREATE INDEX ix_job_ready
  ON job_queue(kind, run_after, id)
  WHERE locked_at IS NULL;

-- Optional: tenant scoped
CREATE INDEX ix_job_ready_tenant
  ON job_queue(tenant_id, kind, run_after, id)
  WHERE locked_at IS NULL;
```
|
||||
|
||||
### JSON key lookup (expression index)
|
||||
|
||||
```sql
-- supports: WHERE (doc->>'subject') = ?
CREATE INDEX ix_vex_subject_expr
  ON vex_document(tenant_id, (doc->>'subject'));
```
|
||||
|
||||
### Massive event table time filtering
|
||||
|
||||
```sql
CREATE INDEX brin_scan_events_time
  ON scan_run_event USING BRIN (occurred_at);
```
|
||||
|
||||
**Red flags**
|
||||
|
||||
* GIN index on a JSONB column + frequent updates = bloat and write amplification.
|
||||
* No partial index for queue readiness → sequential scans under load.
|
||||
* Composite indexes with wrong leading column order (e.g., `created_at, tenant_id`) → not used.
|
||||
|
||||
---
|
||||
|
||||
# 5) Partitioning and retention (avoid “infinite tables”)
|
||||
|
||||
Use partitioning for:
|
||||
|
||||
* audit/events
|
||||
* scan run logs
|
||||
* large finding histories
|
||||
* anything > tens of millions rows with time-based access
|
||||
|
||||
## Standard approach
|
||||
|
||||
* Partition by `occurred_at` (monthly) for event/audit tables.
|
||||
* Retention by dropping partitions (fast and vacuum-free).
|
||||
|
||||
Example:
|
||||
|
||||
```sql
CREATE TABLE scan_run_event (
  tenant_id uuid NOT NULL,
  scan_run_id bigint NOT NULL,
  occurred_at timestamptz NOT NULL,
  event_type text NOT NULL,
  payload jsonb NOT NULL
) PARTITION BY RANGE (occurred_at);
```
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] Partition creation/rollover process exists (migration or scheduler)
|
||||
* [ ] Retention is “DROP PARTITION”, not “DELETE WHERE occurred_at < …”
|
||||
* [ ] Each partition has needed local indexes (BRIN/time + tenant filters)
|
||||
|
||||
---
|
||||
|
||||
# 6) Views vs Materialized Views vs Projection Tables
|
||||
|
||||
## Use a normal VIEW when
|
||||
|
||||
* It’s thin (renaming columns, simple joins) and not used in hot paths.
|
||||
|
||||
## Use a MATERIALIZED VIEW when
|
||||
|
||||
* It accelerates complex joins/aggregations and can be refreshed on a schedule.
|
||||
* You can tolerate refresh lag.
|
||||
|
||||
**Materialized view requirements**
|
||||
|
||||
* Must have a **unique index** to use `REFRESH … CONCURRENTLY`.
|
||||
* Refresh must be **outside** an explicit transaction block.
|
||||
|
||||
Example:
|
||||
|
||||
```sql
CREATE MATERIALIZED VIEW mv_artifact_risk AS
SELECT tenant_id, artifact_purl, max(score) AS risk_score
FROM open_findings
GROUP BY tenant_id, artifact_purl;

CREATE UNIQUE INDEX ux_mv_artifact_risk
  ON mv_artifact_risk(tenant_id, artifact_purl);
```
|
||||
|
||||
## Prefer projection tables over MV when
|
||||
|
||||
* You need **incremental updates** (on import/scan completion).
|
||||
* You need deterministic “point-in-time” snapshots per manifest.
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] Every MV has refresh cadence + owner (which worker/job triggers it)
|
||||
* [ ] UI/API queries do not depend on a heavy non-materialized view
|
||||
* [ ] If “refresh cost” scales with whole dataset, projection table exists instead
|
||||
|
||||
---
|
||||
|
||||
# 7) Queue and outbox patterns that do not deadlock
|
||||
|
||||
## Claim pattern (short transaction)
|
||||
|
||||
```sql
WITH cte AS (
  SELECT id
  FROM job_queue
  WHERE kind = $1
    AND run_after <= now()
    AND locked_at IS NULL
  ORDER BY id
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(),
    locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;
```
|
||||
|
||||
**Rules**
|
||||
|
||||
* Claim + commit quickly.
|
||||
* Do work outside the lock.
|
||||
* On completion: update row to done (or delete if you want compactness).
|
||||
* On failure: increment attempts, set `run_after = now() + backoff`, release lock.
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] Worker does not keep transaction open while scanning/importing
|
||||
* [ ] Backoff policy is encoded (in DB columns) and observable
|
||||
* [ ] DLQ condition exists (attempts > N) and is queryable
|
||||
|
||||
---
|
||||
|
||||
# 8) Query performance review checklist (what to require in PRs)
|
||||
|
||||
For each new endpoint/query:
|
||||
|
||||
* [ ] Provide the query (SQL) and the intended parameters.
|
||||
* [ ] Provide `EXPLAIN (ANALYZE, BUFFERS)` from a dataset size that resembles staging.
|
||||
* [ ] Identify the serving index(es).
|
||||
* [ ] Confirm row estimates are not wildly wrong (if they are: stats or predicate mismatch).
|
||||
* [ ] Confirm it is tenant-scoped and uses the tenant-leading index.
|
||||
|
||||
**Common fixes**
|
||||
|
||||
* Replace `IN (SELECT …)` with `EXISTS` for correlated checks.
|
||||
* Replace `ORDER BY … LIMIT` without index with an index that matches ordering.
|
||||
* Avoid exploding joins with JSON arrays; pre-extract.
|
||||
|
||||
---
|
||||
|
||||
# 9) Vacuum, bloat, and “why is disk growing”
|
||||
|
||||
## Design to avoid bloat
|
||||
|
||||
* Append-only for large docs and events.
|
||||
* If frequent updates are needed, isolate hot-updated columns into a smaller table.
|
||||
|
||||
Example split:
|
||||
|
||||
* `job_queue_payload` (stable)
|
||||
* `job_queue_state` (locked/status/attempts updated frequently)
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] Large frequently-updated JSONB tables have been questioned
|
||||
* [ ] Updates do not rewrite big TOAST values repeatedly
|
||||
* [ ] Retention is partition-drop where possible
|
||||
|
||||
---
|
||||
|
||||
# 10) Migration safety rules (prevent production locks)
|
||||
|
||||
* Index creation: `CREATE INDEX CONCURRENTLY`.
|
||||
* Dropping indexes: `DROP INDEX CONCURRENTLY`.
|
||||
* New column with default on large table:
|
||||
|
||||
1. `ADD COLUMN` nullable
|
||||
2. backfill in batches
|
||||
3. `ALTER COLUMN SET NOT NULL`
|
||||
4. add default if needed
|
||||
|
||||
**Checklist**
|
||||
|
||||
* [ ] No long-running `ALTER TABLE` on huge tables without plan
|
||||
* [ ] Any new NOT NULL constraint is staged safely
|
||||
|
||||
---
|
||||
|
||||
# 11) Stella Ops-specific schema guidance (SBOM/VEX/Finding)
|
||||
|
||||
## Minimum recommended normalized tables
|
||||
|
||||
Even if you keep raw SBOM/VEX JSON:
|
||||
|
||||
* `sbom_document` (raw, immutable)
|
||||
* `sbom_component` (extracted components)
|
||||
* `vex_document` (raw, immutable)
|
||||
* `vex_statement` (extracted statements per CVE/component)
|
||||
* `finding` (facts: CVE ↔ component ↔ artifact ↔ scan_run)
|
||||
* `scan_manifest` (determinism: feed versions/hashes, policy hash)
|
||||
* `scan_run` (links results to manifest)
|
||||
|
||||
**Key gap detectors**
|
||||
|
||||
* If “find all artifacts affected by CVE X” is slow → missing `finding` indexing.
|
||||
* If “component search” is slow → missing `sbom_component` and its indexes.
|
||||
* If “replay this scan” is not exact → missing `scan_manifest` + feed import hashes.
|
||||
|
||||
---
|
||||
|
||||
# 12) Minimal “definition of done” for a new table/view
|
||||
|
||||
A PR adding a table/view is incomplete unless it includes:
|
||||
|
||||
* [ ] Table classification (SOR / projection / queue / event)
|
||||
* [ ] Primary key and idempotency unique key
|
||||
* [ ] Tenant scoping strategy
|
||||
* [ ] Index plan mapped to known queries
|
||||
* [ ] Retention plan (especially for event/projection tables)
|
||||
* [ ] Refresh/update plan if derived
|
||||
* [ ] Example query + `EXPLAIN` for the top 1–3 access patterns
|
||||
|
||||
---
|
||||
|
||||
If you want this as a single drop-in repo document, tell me the target path (e.g., `/docs/platform/postgres-table-view-guidelines.md`) and I will format it exactly as a team-facing guideline, including a one-page “Architecture/Performance Gaps” review form that engineers can paste into PR descriptions.
|
||||
@@ -0,0 +1,20 @@
|
||||
# 29-Nov-2025 · Acceptance Tests Pack and Guardrails
|
||||
|
||||
**Why now:** Guardrail coverage (AT1–AT10) is required before Md.I ladder can proceed; acceptance packs must be deterministic, signed, and offline-ready.
|
||||
|
||||
## Scope
|
||||
- Publish acceptance test pack schema + checklist for CI/DB/rew definitions.
|
||||
- Bundle deterministic fixtures (pinned seeds, UTC timestamps) with DSSE provenance.
|
||||
- Define gating thresholds and replay parity checks for admission/VEX/auth flows.
|
||||
|
||||
## Required artefacts (MVP for DONE)
|
||||
- Advisory summary (this file) plus checklist stub under `docs/process/` referencing AT1–AT10.
|
||||
- Links into sprint tracker row 4 (`SPRINT_0300_0001_0001_documentation_process.md`).
|
||||
- Placeholder fixture pack path reserved under `tests/acceptance/packs/guardrails/` (no network).
|
||||
|
||||
## Determinism & Offline
|
||||
- Freeze scanner/db versions; record them in an `inputs.lock` for the pack.
|
||||
- All fixtures must be reproducible from seeds; include DSSE envelopes for pack manifests.
|
||||
|
||||
## Next actions
|
||||
- Add checklist stub and register the pack path; log evidence in sprint Execution Log before 2025-12-08.
|
||||
@@ -0,0 +1,19 @@
|
||||
# 29-Nov-2025 · CVSS v4.0 Momentum in Vulnerability Management
|
||||
|
||||
**Why now:** Vendors (NVD, GitHub, Microsoft, Snyk) are shipping CVSS v4 signals; StellaOps needs awareness to align receipts, reporting, and UI before defaulting to v4 everywhere.
|
||||
|
||||
## Scope
|
||||
- Brief on adoption signals and compatibility risks when mixing v3.1/v4.
|
||||
- Map impacts to receipt schemas (`SPRINT_0190_0001_0001_cvss_v4_receipts.md`).
|
||||
- Identify quick UI/reporting deltas required for transparency.
|
||||
|
||||
## Required artefacts (MVP for DONE)
|
||||
- This briefing plus linkage in `docs/product-advisories/ADVISORY_INDEX.md` (already indexed).
|
||||
- Note in sprint Decisions & Risks for CVSS receipts sprints; ensure SPRINT_0300 tracker row 15 records completion.
|
||||
|
||||
## Determinism & Offline
|
||||
- Keep CVSS vector parsing deterministic; pin scoring library versions in receipts.
|
||||
- Avoid live API dependency; rely on mirrored NVD feeds or frozen samples.
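
A minimal sketch of deterministic vector handling, assuming a CVSS v4.0 vector string like `CVSS:4.0/AV:N/AC:L/...`: parse the metrics and re-emit them in a fixed order so receipts hash identically regardless of input ordering. Scoring itself is left to a pinned library version as noted above; the canonical-order list below covers only the base metrics and is an illustrative simplification.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CvssV4Vector
{
    // Base metric order; threat/environmental/supplemental metrics are appended sorted.
    private static readonly string[] CanonicalOrder =
        { "AV", "AC", "AT", "PR", "UI", "VC", "VI", "VA", "SC", "SI", "SA" };

    public static string Normalize(string vector)
    {
        var parts = vector.Split('/', StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 0 || !parts[0].StartsWith("CVSS:4.0", StringComparison.Ordinal))
            throw new FormatException("Not a CVSS v4.0 vector.");

        var metrics = parts.Skip(1)
            .Select(p => p.Split(':', 2))
            .ToDictionary(kv => kv[0], kv => kv[1], StringComparer.Ordinal);

        var ordered = CanonicalOrder.Where(metrics.ContainsKey).Select(m => $"{m}:{metrics[m]}");
        var extras = metrics.Keys.Except(CanonicalOrder, StringComparer.Ordinal)
            .OrderBy(k => k, StringComparer.Ordinal)
            .Select(m => $"{m}:{metrics[m]}");

        return "CVSS:4.0/" + string.Join("/", ordered.Concat(extras));
    }
}
```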
|
||||
|
||||
## Next actions
|
||||
- Cross-link to receipts schema draft; add Execution Log entry when briefing is published.
|
||||
@@ -0,0 +1,19 @@
|
||||
# 29-Nov-2025 · SBOM to VEX Proof Pipeline Blueprint
|
||||
|
||||
**Why now:** The Docs ladder needs a canonical blueprint tying SBOM ingestion to VEX proofs with DSSE/Rekor integration, to unblock downstream module dossier updates.
|
||||
|
||||
## Scope
|
||||
- Describe DSSE → Rekor v2 → VEX linkage with offline verification steps.
|
||||
- Capture diagram/stub scripts for proof generation and verification.
|
||||
- Define inputs.lock/idempotency rules and chain hash recipe.
|
||||
|
||||
## Required artefacts (MVP for DONE)
|
||||
- Diagram placeholder (`docs/diagrams/sbom-vex-blueprint.svg` reserved) and script stub path `docs/scripts/sbom-vex/verify.sh` (offline, deterministic sorting/hashes).
|
||||
- Cross-links in `docs/modules/platform/architecture-overview.md` and sprint row 16 completion evidence.
|
||||
|
||||
## Determinism & Offline
|
||||
- Sort inputs canonically before hashing; use UTC timestamps only when unavoidable, otherwise derive them from content.
|
||||
- No network calls; use bundled Rekor root + mirror snapshot for verification examples.
|
||||
|
||||
## Next actions
|
||||
- Land the stub diagram/script placeholders and log completion in the sprint Execution Log.
|
||||
@@ -0,0 +1,19 @@
|
||||
# 29-Nov-2025 · SCA Failure Catalogue for StellaOps Tests
|
||||
|
||||
**Why now:** Recent regressions show noisy or divergent SCA results; we need a deterministic failure catalogue to anchor acceptance tests and fixture packs.
|
||||
|
||||
## Scope
|
||||
- Document the five observed regressions (credential leak, Trivy offline DB mismatch, SBOM parity drift, Grype version divergence, inconsistent detection).
|
||||
- Provide expected signals for acceptance tests and links to fixture locations.
|
||||
- Drive remediation task SCA-FIXTURE-GAPS-300-014 (FC1–FC10) in the sprint.
|
||||
|
||||
## Required artefacts (MVP for DONE)
|
||||
- This catalogue plus a pointer to fixture pack root `tests/fixtures/sca/catalogue/` (to be populated with deterministic seeds + DSSE manifests).
|
||||
- Sprint Execution Log entry for row 17 when published.
|
||||
|
||||
## Determinism & Offline
|
||||
- Fixtures must pin scanner versions and feeds; include `inputs.lock` and DSSE manifest for each case.
|
||||
- Results should be normalized (ordering, casing) to avoid flaky comparisons.
|
||||
|
||||
## Next actions
|
||||
- Create initial fixture directory with README and seed notes; log status in sprint tracker and move row 17 toward DONE.
|
||||
@@ -0,0 +1,20 @@
|
||||
# 29-Nov-2025 · StellaOps – Mid-Level .NET Onboarding (Quick Start)
|
||||
|
||||
**Why now:** The Docs ladder needs a deterministic, offline-ready quickstart to unblock module dossier refreshes and align mid-level .NET contributors with DSSE/VEX requirements.
|
||||
|
||||
## Scope
|
||||
- Provide a 1–2 day runway for mid-level .NET engineers to become productive on StellaOps.
|
||||
- Emphasise determinism, offline posture, DSSE/in-toto usage, and the canonical data model.
|
||||
- Pair this advisory with the living guide at `docs/onboarding/dev-quickstart.md`.
|
||||
|
||||
## Required artefacts (MVP for DONE)
|
||||
- Update `docs/onboarding/dev-quickstart.md` with deterministic/offline steps, DSSE/key-handling, and DB matrix pointers.
|
||||
- Cross-links in `docs/README.md` and `docs/modules/platform/architecture-overview.md` to the quickstart.
|
||||
- Sprint tracker: `docs/implplan/SPRINT_0300_0001_0001_documentation_process.md` row 3 marked DONE with Execution Log proof.
|
||||
|
||||
## Determinism & Offline
|
||||
- Use fixed seeds and pinned toolchain versions for any sample commands.
|
||||
- Avoid live network calls; prefer cached feeds/mirrors and note mirror paths.
|
||||
|
||||
## Next actions
|
||||
- Land the cross-link updates and note completion in the sprint Execution Log.
|
||||
@@ -0,0 +1,18 @@
|
||||
# 30-Nov-2025 · Comparative Evidence Patterns for Stella Ops
|
||||
|
||||
**Why now:** UX and data-model decisions need a grounded comparison of evidence/suppression patterns across major vendors.
|
||||
|
||||
## Scope
|
||||
- Summarise how Snyk, GitHub, Aqua, Anchore/Grype, and Prisma Cloud handle evidence, suppression, audit/export.
|
||||
- Feed UX/data-model decisions for VEX, evidence views, and export policies.
|
||||
- Drives task 23 (EVIDENCE + suppression pattern sync) in sprint tracker.
|
||||
|
||||
## Required artefacts (MVP)
|
||||
- This brief plus links to any UI or schema follow-ups once drafted.
|
||||
- Sprint tracker row 23 updated when evidence is logged.
|
||||
|
||||
## Determinism & Offline
|
||||
- Keep examples deterministic; no live API calls in comparisons; cite cached docs/artefacts where needed.
|
||||
|
||||
## Next actions
|
||||
- Add schema/UX notes to module docs when ready; log completion in Execution Log.
|
||||
@@ -0,0 +1,18 @@
|
||||
# 30-Nov-2025 · Ecosystem Reality Test Cases for StellaOps
|
||||
|
||||
**Why now:** Real incidents (credential leak, offline DB schema mismatch, SBOM parity drift, scanner instability) must translate into deterministic acceptance tests.
|
||||
|
||||
## Scope
|
||||
- Document the five incidents and expected signals.
|
||||
- Map each to acceptance tests and fixture paths (`tests/fixtures/sca/catalogue/`).
|
||||
- Drives ECOSYS-FIXTURES-GAPS-300-017 (ET1–ET10).
|
||||
|
||||
## Required artefacts (MVP)
|
||||
- This advisory plus fixture root path and acceptance test references.
|
||||
- Sprint tracker row 21 updated when evidence lands.
|
||||
|
||||
## Determinism & Offline
|
||||
- Fixtures must pin tool versions and feeds; no live network.
|
||||
|
||||
## Next actions
|
||||
- Populate fixtures and acceptance specs; log in sprint Execution Log when added.
|
||||
@@ -0,0 +1,18 @@
|
||||
# 30-Nov-2025 · Implementor Guidelines for Stella Ops
|
||||
|
||||
**Why now:** Contributors need an enforceable checklist that ties SRS, release playbook, and determinism/offline requirements into a CI-enforced guardrail.
|
||||
|
||||
## Scope
|
||||
- Operational checklist for code and docs changes; mandates determinism, offline posture, provenance, and boundary rules.
|
||||
- Intended to drive lint/CI that enforces `docs touched → docs: n/a` tagging, schema/versioning control, and perf/quota expectations.
|
||||
|
||||
## Required artefacts (MVP)
|
||||
- Checklist mapped into `docs/process/implementor-guidelines.md` (to be created/expanded in sprints 18/19 tasks).
|
||||
- CI lint hook stub path declared (e.g., `tools/lint/implementor-guidelines.sh`).
|
||||
- Sprint tracker row 18 marked DONE once linked and logged.
|
||||
|
||||
## Determinism & Offline
|
||||
- Prefer reproducible seeds, pinned toolchain versions, and no live network in examples.
|
||||
|
||||
## Next actions
|
||||
- Add the checklist doc and CI stub; link from sprint Decisions & Risks where relevant.
|
||||
@@ -0,0 +1,18 @@
|
||||
# 30-Nov-2025 · Rekor Receipt Checklist for Stella Ops
|
||||
|
||||
**Why now:** Rekor receipts must be deterministic, tenant-scoped, and verifiable offline for Authority/Sbomer/Vexer flows.
|
||||
|
||||
## Scope
|
||||
- Field-level ownership map for receipts and bundles.
|
||||
- Offline verifier expectations and mirror snapshot rules.
|
||||
- DSSE/receipt schema pointers to be consumed by Authority/Sbomer/Vexer modules.
|
||||
|
||||
## Required artefacts (MVP)
|
||||
- Checklist page (this file) and cross-link in module docs when schemas land.
|
||||
- Sprint sync rows in `SPRINT_0300_0001_0001_documentation_process.md` and `SPRINT_0314_0001_0001_docs_modules_authority.md`.
|
||||
|
||||
## Determinism & Offline
|
||||
- Bundle TSA/time anchors with receipts; prefer mirror snapshots; avoid live log fetches in examples.
|
||||
|
||||
## Next actions
|
||||
- Publish schema draft and offline verifier stub; update module dossiers accordingly.
|
||||
@@ -0,0 +1,16 @@
|
||||
# 30-Nov-2025 · Standup Sprint Kickstarters
|
||||
|
||||
**Why now:** Day-0 unblockers accelerate sprint readiness and reduce blocker latency for Docs ladder and downstream modules.
|
||||
|
||||
## Scope
|
||||
- Three kickstarter tasks (scanner regressions, Postgres slice, DSSE/Rekor sweep) with ticket names/owners.
|
||||
- Alignment with sprint template and readiness checklist expectations.
|
||||
|
||||
## Required artefacts (MVP)
|
||||
- This advisory; sprint tracker row 22 updated; readiness checklist ties into `docs/implplan/README.md` template.
|
||||
|
||||
## Determinism & Offline
|
||||
- Keep examples and scripts offline-friendly; pin tool versions.
|
||||
|
||||
## Next actions
|
||||
- Add readiness checklist snippets to sprint template; log completion in Execution Log when linked.
|
||||
@@ -0,0 +1,49 @@
|
||||
# 30-Nov-2025 - UI Micro-Interactions for StellaOps
|
||||
|
||||
## Purpose
|
||||
Define canonical micro-interaction rules (MI1-MI10) for the StellaOps Console so motion, latency, error, and offline behaviors stay accessible, deterministic, and offline-ready. This advisory is the source of truth for UI sprints 0209/0210/0211 and Storybook/Playwright harnesses.
|
||||
|
||||
## Scope
|
||||
- Applies to Angular workspace `src/Web/StellaOps.Web` (Console), shared UI tokens, Storybook stories, and Playwright/axe/perf checks.
|
||||
- Covers micro-copy/localisation, telemetry events, reduced-motion behavior, and deterministic seeds/snapshots.
|
||||
|
||||
## Principles
|
||||
- Prefer informative micro-interactions over decorative motion; default to reduced-motion safe values.
|
||||
- Deterministic by default: fixed seeds, frozen fixtures, stable ordering, UTC timestamps.
|
||||
- Offline-first: no CDN assets; latency/error/offline states always represented with cached assets and copy.
|
||||
- WCAG 2.1 AA (focus order, keyboard, contrast, reduced motion, status messaging).
|
||||
|
||||
## MI Tracks (acceptance)
|
||||
| ID | Requirement | Acceptance Signals |
| --- | --- | --- |
| MI1 | Motion tokens catalogue (durations 80/140/200/260/320ms; easings `standard`,`decel`,`accel`,`emphasized`; distance scales XS-XL) in `tokens/motion.ts` and SCSS variables. | Storybook token page renders tokens; tokens consumed by at least 3 components (button, modal, banner). |
| MI2 | Reduced-motion rules: prefers-reduced-motion -> durations clamp to 0-80ms, disable parallax/auto-animations; focus/hover states remain visible. | Playwright + axe snapshots with reduced-motion emulation; no animated GIF/Lottie autoplay; acceptance screenshots stable. |
| MI3 | Latency/idle/load patterns: skeletons (<=400ms show), progress (400-1200ms), optimistic states, and offline banners with retry/backoff copy. | Fixtures for slow/failed/offline in Storybook; Playwright asserts state transitions and retry button wiring. |
| MI4 | Error/cancel/undo: deterministic snackbar/toast pattern with keyboard focus, `aria-live=polite`, and 8s undo window. | Playwright e2e shows undo restores prior state; axe passes; snapshot stable. |
| MI5 | Perf budgets: interaction response <=100ms, animation frame budget 16ms avg/50ms p95; LCP placeholder shown; no layout shift >0.05. | Lighthouse CI (offline config) JSON attached; flamechart samples stored under `tests/perf/micro/`. |
| MI6 | Component mapping: table of components -> interaction type -> token usage; avoid bespoke easing outside catalog. | Mapping lives in `docs/modules/ui/micro-interactions-map.md`; lint rule blocks non-catalog easings. |
| MI7 | Telemetry schema: `ui.micro.*` events with versioned schema, fields (tenant, surface, component, action, latency_ms, outcome, reduced_motion, offline_mode, error_code); PII-free. | JSON schema under `docs/modules/ui/telemetry/ui-micro.schema.json`; unit test validates sample payloads. |
| MI8 | Deterministic seeds/snapshots: RNG seeds fixed per story/test; timestamps frozen; CSS animations paused in snapshots. | Storybook stories set `parameters.chromatic.disableAnimation`; Playwright uses `useFakeTimers` + fixed seeds exported from `micro-fixtures.ts`. |
| MI9 | Micro-copy localisation: keys and ICU messages for micro-interaction copy; defaults EN, fallbacks present; no hard-coded strings in components. | i18n extraction shows zero TODO keys; snapshot verifies translated skeleton/error/undo copy. |
| MI10 | Theme/contrast guidance: light/dark/HC tokens for backgrounds, borders, focus rings, and status colors; contrast >= 4.5:1 text, 3:1 UI elements. | Theming doc `docs/modules/ui/micro-theme.md`; axe-color passes for sample stories; focus ring visible in HC mode. |
|
||||
|
||||
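
To make the MI1/MI2 rows concrete, a minimal sketch of what `tokens/motion.ts` could export follows. Only the duration scale (80/140/200/260/320 ms), the easing names, and the reduced-motion clamp to the 0-80 ms band come from the table above; the exported identifiers and cubic-bezier values are placeholders until MI1 lands.

```typescript
// Illustrative motion token catalogue; names and curve values are placeholders.
export const motionDurations = {
  xs: 80,
  sm: 140,
  md: 200,
  lg: 260,
  xl: 320,
} as const;

export const motionEasings = {
  standard: "cubic-bezier(0.2, 0, 0, 1)",
  decel: "cubic-bezier(0, 0, 0, 1)",
  accel: "cubic-bezier(0.3, 0, 1, 1)",
  emphasized: "cubic-bezier(0.2, 0, 0, 1.3)",
} as const;

export type MotionDuration = keyof typeof motionDurations;

// MI2: under prefers-reduced-motion, every duration clamps into the 0-80 ms band.
export function effectiveDuration(token: MotionDuration, reducedMotion: boolean): number {
  const ms = motionDurations[token];
  return reducedMotion ? Math.min(ms, 80) : ms;
}
```

Components would read `prefers-reduced-motion` once (for example via `matchMedia`) and pass the flag in, which keeps the clamp pure and easy to pin in frozen Storybook fixtures.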

## Deliverables

- Token catalog: `src/Web/StellaOps.Web/src/styles/tokens/motion.{ts,scss}` with reduced-motion overrides.
- Storybook stories: `apps/storybook/src/stories/micro/*` covering slow, error, offline, reduced-motion, and undo flows; deterministic seeds.
- Playwright suite: `tests/e2e/micro-interactions.spec.ts` covering MI2/MI3/MI4/MI8.
- Telemetry schema + validators: `docs/modules/ui/telemetry/ui-micro.schema.json` plus unit test in `src/app/telemetry/__tests__/ui-micro.schema.spec.ts` (see the validation sketch after this list).
- Mapping doc: `docs/modules/ui/micro-interactions-map.md` linking components to token usage.
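
To show the intended shape of the telemetry unit test, a minimal validation sketch follows, assuming Ajv is available in the workspace; the inline schema only mirrors the MI7 field list, the `outcome` values are illustrative, and this is not the published `ui-micro.schema.json`.

```typescript
// Illustrative check that a ui.micro.* payload matches the MI7 field list.
import Ajv from "ajv";

const uiMicroSchema = {
  type: "object",
  required: ["tenant", "surface", "component", "action", "latency_ms", "outcome"],
  additionalProperties: false,
  properties: {
    tenant: { type: "string" },
    surface: { type: "string" },
    component: { type: "string" },
    action: { type: "string" },
    latency_ms: { type: "number", minimum: 0 },
    outcome: { type: "string" },        // e.g. "success", "error", "cancelled"
    reduced_motion: { type: "boolean" },
    offline_mode: { type: "boolean" },
    error_code: { type: "string" },
  },
};

const validate = new Ajv().compile(uiMicroSchema);

// Fixture payload; values are synthetic and PII-free.
const sample = {
  tenant: "tenant-a",
  surface: "findings",
  component: "undo-toast",
  action: "undo",
  latency_ms: 42,
  outcome: "success",
  reduced_motion: true,
  offline_mode: false,
};

if (!validate(sample)) {
  throw new Error(JSON.stringify(validate.errors));
}
```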

## Test & Determinism Rules

- Freeze timers to 2025-12-04T12:00:00Z in stories/e2e; seed RNG with `0x5EED2025` unless scenario-specific.
- All fixtures and snapshots stored under `tests/fixtures/micro/`; no network calls; offline assets bundled.
- Playwright runs with `--disable-animations` and a reduced-motion emulation variant (see the spec sketch after this list).
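
A minimal sketch of how an MI2/MI4/MI8-style spec could apply these rules follows; the route, `data-testid` values, and fixture wiring are placeholders, and the frozen-clock calls assume a Playwright version that ships the `page.clock` API.

```typescript
// Illustrative spec: reduced-motion emulation plus a frozen clock (selectors are placeholders).
import { test, expect } from "@playwright/test";

test("undo toast stays deterministic under reduced motion", async ({ page }) => {
  // MI8: freeze time so relative timestamps and snapshots do not drift.
  await page.clock.install({ time: new Date("2025-12-04T12:00:00Z") });

  // MI2: emulate prefers-reduced-motion before any animation can start.
  await page.emulateMedia({ reducedMotion: "reduce" });

  await page.goto("/console/findings");
  await page.getByTestId("dismiss-finding").click();

  // MI4: the toast announces politely and keeps the undo affordance inside the 8s window.
  const toast = page.getByTestId("undo-toast");
  await expect(toast).toHaveAttribute("aria-live", "polite");
  await page.clock.fastForward(7_000);
  await toast.getByRole("button", { name: "Undo" }).click();
  await expect(page.getByTestId("dismiss-finding")).toBeVisible();
});
```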

## Risks & Mitigations

- Workspace absence blocks implementation -> tracked in sprint action #7.
- Token drift across sprints -> add lint rule and CI check for non-catalog easings.
- Localisation regressions -> include i18n extraction in PR CI for micro stories.

## Traceability

- Sprint: `docs/implplan/SPRINT_0209_0001_0001_ui_i.md` (task UI-MICRO-GAPS-0209-011).
- Findings reference: `docs/product-advisories/31-Nov-2025 FINDINGS.md` (MI1-MI10 gap family).

@@ -0,0 +1,17 @@

# 30-Nov-2025 · Unknowns Decay & Triage Heuristics

**Why now:** Stale “unknown” findings create noise; we need deterministic decay and triage rules with UI/export artifacts.

## Scope

- Define the confidence decay card, triage queue UI, and export artifacts for planning.
- Map to the runtime signals sprint (`SPRINT_0140_0001_0001_runtime_signals.md`) and docs tracker row 20.

## Required artefacts (MVP)

- This brief plus references to the UnknownsRegistry docs (to be expanded).
- Expectations for UI/export snapshots and a description of the deterministic decay logic.

## Determinism & Offline

- Decay windows and thresholds must be deterministic; exports should be reproducible without live dependencies.
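
To illustrate the determinism requirement, a minimal decay sketch follows; the window length, step size, floor, and field names are placeholders until the UnknownsRegistry docs fix the real thresholds.

```typescript
// Illustrative step-wise confidence decay: output depends only on two timestamps
// and fixed constants, so replays and exports reproduce identical values offline.
export interface UnknownFinding {
  firstSeenUtc: string;    // ISO-8601, UTC
  baseConfidence: number;  // 0..1 at first observation
}

const DECAY_WINDOW_DAYS = 14;   // placeholder window length
const DECAY_STEP = 0.1;         // placeholder drop per elapsed window
const CONFIDENCE_FLOOR = 0.2;   // placeholder floor that keeps items triagable

export function decayedConfidence(finding: UnknownFinding, asOfUtc: string): number {
  const elapsedMs = Date.parse(asOfUtc) - Date.parse(finding.firstSeenUtc);
  const windows = Math.max(0, Math.floor(elapsedMs / (DECAY_WINDOW_DAYS * 86_400_000)));
  const value = finding.baseConfidence - windows * DECAY_STEP;
  return Math.max(CONFIDENCE_FLOOR, Number(value.toFixed(2)));
}
```

Because the function is pure over a frozen `asOfUtc`, the triage queue and exports can evaluate it without live dependencies and still agree on the same numbers.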

## Next actions

- Land UI mock/export schema; link into sprint Decisions & Risks and module docs once available.