# A. Executive directive (send as-is to both PM + Dev)
1. **A “Release” is not an SBOM or a scan report. A Release is a “Security State Snapshot.”**
* A snapshot is a **versioned, content-addressed bundle** containing:
* SBOM graph (canonical form, hashed)
* Reachability graph (canonical form, hashed)
* VEX claim set (canonical form, hashed)
* Policies + rule versions used (hashed)
* Data-feed identifiers used (hashed)
* Toolchain versions (hashed)
2. **Diff is a product primitive, not a UI feature.**
* “Diff” must exist as a stable API and artifact, not a one-off report.
* Every comparison produces a **Delta object** (machine-readable) and a **Delta Verdict attestation** (signed).
3. **The CI/CD gate should never ask “how many CVEs?”**
* It should ask: **“What materially changed in exploitable risk since the last approved baseline?”**
* The Delta Verdict must be deterministically reproducible given the same snapshots and policy.
4. **Every Delta Verdict must be portable and auditable.**
* It must be a signed attestation that can be stored with the build artifact (OCI attach) and replayed offline.
---
# B. Product Management directions
## B1) Define the product concept: “Security Delta as the unit of governance”
**Position the capability as change-control for software risk**, not as “a scanner with comparisons.”
### Primary user stories (MVP)
1. **Release Manager / Security Engineer**
* “Compare the candidate build to the last approved build and explain *what changed* in exploitable risk.”
2. **CI Pipeline Owner**
* “Fail the build only for *new* reachable high-risk exposures (or policy-defined deltas), not for unchanged legacy issues.”
3. **Auditor / Compliance**
* “Show a signed delta verdict with evidence references proving why this release passed.”
### MVP “Delta Verdict” policy questions to support
* Are there **new reachable vulnerabilities** introduced?
* Did any **previously unreachable vulnerability become reachable**?
* Are there **VEX status regressions** (e.g., not_affected → affected)?
* Are there **new Unknowns** above a threshold?
* Is the **net exploitable surface** increased beyond policy budget?
## B2) Define the baseline selection rules (product-critical)
Diff is meaningless without a baseline contract. Product must specify baseline selection as a first-class choice.
Minimum baseline modes:
* **Previous build in the same pipeline**
* **Last “approved” snapshot** (from an approval gate)
* **Last deployed in environment X** (optional later, but roadmap it)
Acceptance criteria:
* The delta object must always contain:
* `baseline_snapshot_digest`
* `target_snapshot_digest`
* `baseline_selection_method` and identifiers
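A minimal sketch of that contract as a typed record (Go; all names here are assumptions, not the final wire format):

```go
package stella // illustrative package name

// BaselineSelection sketches the baseline contract above. Field names
// mirror the required delta-object fields; the method values are
// illustrative assumptions, not a fixed enum.
type BaselineSelection struct {
	BaselineSnapshotDigest  string            `json:"baseline_snapshot_digest"`
	TargetSnapshotDigest    string            `json:"target_snapshot_digest"`
	BaselineSelectionMethod string            `json:"baseline_selection_method"` // e.g. "previous_build", "last_approved", "last_deployed"
	SelectorIdentifiers     map[string]string `json:"selector_identifiers"`      // e.g. pipeline ID, approval gate ID, environment name
}
```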
## B3) Define the delta taxonomy (what your product “knows” how to talk about)
Avoid “diffing findings lists.” You need consistent delta categories.
Minimum taxonomy:
1. **SBOM deltas**
* Component added/removed
* Component version change
* Dependency edge change (graph-level)
2. **VEX deltas**
* Claim added/removed
* Status change (e.g., under_investigation → fixed)
* Justification/evidence change (optional MVP)
3. **Reachability deltas**
* New reachable vulnerable symbol(s)
* Removed reachability
* Entry point changes
4. **Decision deltas**
* Policy outcome changed (PASS → FAIL)
* Explanation changed (drivers of decision)
PM deliverable:
* A one-page **Delta Taxonomy Spec** that becomes the canonical list used across API, UI, and attestations.
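To make that one-page spec machine-enforceable, the canonical list can be encoded as a closed set of delta kinds shared by API, UI, and attestations. A hedged sketch; the identifiers are assumptions:

```go
package stella // illustrative package name

// DeltaKind enumerates the Delta Taxonomy (v1) as one closed list.
type DeltaKind string

const (
	// SBOM deltas
	SBOMComponentAdded   DeltaKind = "sbom.component_added"
	SBOMComponentRemoved DeltaKind = "sbom.component_removed"
	SBOMVersionChanged   DeltaKind = "sbom.version_changed"
	SBOMEdgeChanged      DeltaKind = "sbom.edge_changed"

	// VEX deltas
	VEXClaimAdded    DeltaKind = "vex.claim_added"
	VEXClaimRemoved  DeltaKind = "vex.claim_removed"
	VEXStatusChanged DeltaKind = "vex.status_changed"

	// Reachability deltas
	ReachNewReachable      DeltaKind = "reach.new_reachable"
	ReachRemoved           DeltaKind = "reach.removed"
	ReachEntrypointChanged DeltaKind = "reach.entrypoint_changed"

	// Decision deltas
	DecisionOutcomeChanged     DeltaKind = "decision.outcome_changed"
	DecisionExplanationChanged DeltaKind = "decision.explanation_changed"
)
```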
## B4) Define what “signed delta verdict” means in product terms
A delta verdict is not a PDF.
It is:
* A deterministic JSON payload
* Wrapped in a signature envelope (DSSE)
* Attached to the artifact (OCI attach)
* Carrying pointers (hash references) to the evidence graphs
PM must define:
* Where customers can view it (UI + CLI)
* Where it lives (artifact registry + Stella store)
* How it is consumed (policy gate, audit export)
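For concreteness, this is the standard DSSE envelope shape the verdict travels in; the predicate type URI is whatever the team registers (an assumption here):

```go
package stella // illustrative package name

// Envelope follows the DSSE envelope layout. The payload is the
// base64-encoded canonical delta-verdict JSON.
type Envelope struct {
	PayloadType string      `json:"payloadType"` // predicate type URI (team-defined)
	Payload     string      `json:"payload"`     // base64(canonical verdict JSON)
	Signatures  []Signature `json:"signatures"`
}

type Signature struct {
	KeyID string `json:"keyid"`
	Sig   string `json:"sig"` // base64(sign(PAE(payloadType, payload bytes)))
}
```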
## B5) PM success metrics (must be measurable)
* % of releases gated by delta verdict
* Mean time to explain “why failed”
* Reduction in “unchanged legacy vuln” false gating
* Reproducibility rate: same inputs → same verdict (target: 100%)
---
# C. Development Management directions
## C1) Architecture: treat Snapshot and Delta as immutable, content-addressed objects
You need four core services/modules:
1. **Canonicalization + Hashing**
* Deterministic serialization (stable field ordering, normalized IDs)
* Content addressing: every graph and claim set gets a digest
2. **Snapshot Store (Ledger)**
* Store snapshots keyed by digest
* Store relationships: artifact → snapshot, snapshot → predecessor(s)
* Must support offline export/import later (design now)
3. **Diff Engine**
* Inputs: `baseline_snapshot_digest`, `target_snapshot_digest`
* Outputs:
* `delta_object` (structured)
* `delta_summary` (human-friendly)
* Must be deterministic and testable with golden fixtures
4. **Verdict Engine + Attestation Writer**
* Evaluate policies against delta
* Produce `delta_verdict`
* Wrap as DSSE / in-toto-style statement (or your chosen predicate type)
* Sign and optionally attach to OCI artifact
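A sketch of the four module boundaries as Go interfaces, to pin down the contracts; all type and method names are assumptions (the `Snapshot`, `DeltaObject`, and `DeltaVerdict` types are sketched under C2 below):

```go
package stella // illustrative package name

// Canonicalizer produces the deterministic serialization + digest.
type Canonicalizer interface {
	Canonicalize(doc any) (canonical []byte, digest string, err error)
}

// SnapshotStore is the content-addressed ledger.
type SnapshotStore interface {
	Put(s Snapshot) (digest string, err error)
	Get(digest string) (Snapshot, error)
	LinkPredecessor(targetDigest, baselineDigest string) error
}

// DiffEngine is pure: same digests in, byte-identical delta out.
type DiffEngine interface {
	Diff(baselineDigest, targetDigest string) (DeltaObject, error)
}

// VerdictEngine evaluates policy against the delta; the result is
// wrapped as a DSSE statement and signed by the attestation writer.
type VerdictEngine interface {
	Evaluate(delta DeltaObject, policyBundleDigest string) (DeltaVerdict, error)
}
```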
## C2) Data model (minimum viable schemas)
### Snapshot (conceptual fields)
* `snapshot_id` (digest)
* `artifact_ref` (e.g., image digest)
* `sbom_graph_digest`
* `reachability_graph_digest`
* `vex_claimset_digest`
* `policy_bundle_digest`
* `feed_snapshot_digest`
* `toolchain_digest`
* `created_at`
### Delta object (conceptual fields)
* `delta_id` (digest)
* `baseline_snapshot_digest`
* `target_snapshot_digest`
* `sbom_delta` (structured)
* `reachability_delta` (structured)
* `vex_delta` (structured)
* `unknowns_delta` (structured)
* `derived_risk_delta` (structured)
* `created_at`
### Delta verdict attestation (must include)
* Subjects: artifact digest(s)
* Baseline snapshot digest + Target snapshot digest
* Policy bundle digest
* Verdict enum: PASS/WARN/FAIL
* Drivers: references to delta nodes (hash pointers)
* Signature metadata
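The conceptual fields above as hedged Go structs; a sketch of shape, not the final schema:

```go
package stella // illustrative package name

import "encoding/json"

type Snapshot struct {
	SnapshotID              string `json:"snapshot_id"` // digest of the canonical form
	ArtifactRef             string `json:"artifact_ref"`
	SBOMGraphDigest         string `json:"sbom_graph_digest"`
	ReachabilityGraphDigest string `json:"reachability_graph_digest"`
	VEXClaimsetDigest       string `json:"vex_claimset_digest"`
	PolicyBundleDigest      string `json:"policy_bundle_digest"`
	FeedSnapshotDigest      string `json:"feed_snapshot_digest"`
	ToolchainDigest         string `json:"toolchain_digest"`
	CreatedAt               string `json:"created_at"` // excluded from hash inputs (see C3)
}

type DeltaObject struct {
	DeltaID                string          `json:"delta_id"`
	BaselineSnapshotDigest string          `json:"baseline_snapshot_digest"`
	TargetSnapshotDigest   string          `json:"target_snapshot_digest"`
	SBOMDelta              json.RawMessage `json:"sbom_delta"`
	ReachabilityDelta      json.RawMessage `json:"reachability_delta"`
	VEXDelta               json.RawMessage `json:"vex_delta"`
	UnknownsDelta          json.RawMessage `json:"unknowns_delta"`
	DerivedRiskDelta       json.RawMessage `json:"derived_risk_delta"`
	CreatedAt              string          `json:"created_at"`
}

type DeltaVerdict struct {
	Subjects               []string `json:"subjects"` // artifact digest(s)
	BaselineSnapshotDigest string   `json:"baseline_snapshot_digest"`
	TargetSnapshotDigest   string   `json:"target_snapshot_digest"`
	PolicyBundleDigest     string   `json:"policy_bundle_digest"`
	Verdict                string   `json:"verdict"` // PASS | WARN | FAIL
	Drivers                []string `json:"drivers"` // hash pointers into the delta object
}
```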
## C3) Determinism requirements (non-negotiable)
Development must implement:
* **Canonical ID scheme** for components and graph nodes
(example: package URL (purl) + version + supplier + qualifiers, then hashed; see the sketch at the end of this section)
* Stable sorting for node/edge lists
* Stable normalization of timestamps (do not include wall-clock in hash inputs unless explicitly policy-relevant)
* A “replay test harness”:
* Given the same inputs, byte-for-byte identical snapshot/delta/verdict
Definition of Done:
* Golden test vectors for snapshots and deltas checked into repo
* Deterministic hashing tests in CI
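A minimal sketch of the canonical node ID scheme referenced above, assuming SHA-256 over a purl-style identity tuple with sorted qualifiers:

```go
package stella // illustrative package name

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// CanonicalNodeID hashes a stable identity tuple into a node ID.
// Qualifier keys are sorted so map ordering can never leak into the digest.
func CanonicalNodeID(purl, version, supplier string, qualifiers map[string]string) string {
	keys := make([]string, 0, len(qualifiers))
	for k := range qualifiers {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	fmt.Fprintf(&b, "%s|%s|%s", purl, version, supplier)
	for _, k := range keys {
		fmt.Fprintf(&b, "|%s=%s", k, qualifiers[k])
	}
	sum := sha256.Sum256([]byte(b.String()))
	return "sha256:" + hex.EncodeToString(sum[:])
}
```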
## C4) Graph diff design (how to do it without drowning in noise)
### SBOM graph diff (MVP)
Implement:
* Node set delta: added/removed/changed nodes (by stable node ID)
* Edge set delta: added/removed edges (dependency relations)
* A “noise suppressor” layer:
* ignore ordering differences
* ignore metadata-only changes unless a policy explicitly enables them
Output should identify:
* “What changed?” (added/removed/upgraded/downgraded)
* “Why does it matter?” (ties to vulnerabilities and reachability where available)
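A sketch of the node-set delta by stable node ID; the `Digest` field (an assumption) lets “changed” detection ignore ordering and pick up content changes only:

```go
package stella // illustrative package name

import "sort"

// Node carries a stable canonical ID (see C3) and a content digest.
type Node struct {
	ID, Digest string
}

type NodeDelta struct {
	Added, Removed, Changed []string // node IDs, sorted for determinism
}

// DiffNodes computes added/removed/changed node sets by stable ID.
func DiffNodes(baseline, target []Node) NodeDelta {
	base := map[string]string{}
	for _, n := range baseline {
		base[n.ID] = n.Digest
	}
	var d NodeDelta
	inTarget := map[string]bool{}
	for _, n := range target {
		inTarget[n.ID] = true
		if old, ok := base[n.ID]; !ok {
			d.Added = append(d.Added, n.ID)
		} else if old != n.Digest {
			d.Changed = append(d.Changed, n.ID)
		}
	}
	for _, n := range baseline {
		if !inTarget[n.ID] {
			d.Removed = append(d.Removed, n.ID)
		}
	}
	sort.Strings(d.Added)
	sort.Strings(d.Removed)
	sort.Strings(d.Changed)
	return d
}
```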
### VEX claimset diff (MVP)
Implement:
* Keyed by `(product/artifact scope, component ID, vulnerability ID)`
* Delta types:
* claim added/removed
* status changed
* justification changed (optional later)
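The claim diff keyed on that tuple; a compact sketch (callers must sort the outputs before hashing, since Go map iteration order is random):

```go
package stella // illustrative package name

// ClaimKey is the stable identity of a VEX claim.
type ClaimKey struct {
	Scope     string // product/artifact scope
	Component string // canonical component ID
	VulnID    string // vulnerability ID, e.g. a CVE
}

type StatusChange struct {
	Key      ClaimKey
	From, To string // e.g. "under_investigation" -> "fixed"
}

// DiffClaims reports added/removed claims and status changes between
// two claimsets keyed by (scope, component, vulnerability).
func DiffClaims(baseline, target map[ClaimKey]string) (added, removed []ClaimKey, changed []StatusChange) {
	for k, status := range target {
		if old, ok := baseline[k]; !ok {
			added = append(added, k)
		} else if old != status {
			changed = append(changed, StatusChange{Key: k, From: old, To: status})
		}
	}
	for k := range baseline {
		if _, ok := target[k]; !ok {
			removed = append(removed, k)
		}
	}
	return added, removed, changed
}
```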
### Reachability diff (incremental approach)
MVP can start narrow:
* Support one or two ecosystems initially (e.g., Java + Maven, or Go modules)
* Represent reachability as:
* `entrypoint → function/symbol → vulnerable symbol`
* Diff should highlight:
* Newly reachable vulnerable symbols
* Removed reachability
Important: even if reachability coverage is initially partial, the diff model must support it cleanly (an explicit unknown state must exist; see the sketch below).
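A sketch of that representation, with the unknown state first-class (all names are assumptions):

```go
package stella // illustrative package name

// Reachability is a tri-state: unknown is explicit, not absence of data.
type Reachability string

const (
	Reachable   Reachability = "reachable"
	Unreachable Reachability = "unreachable"
	Unknown     Reachability = "unknown"
)

// ReachPath records one evidence path:
// entrypoint -> intermediate symbols -> vulnerable symbol.
type ReachPath struct {
	Entrypoint       string   // e.g. an exported main or handler symbol
	CallChain        []string // intermediate function/symbol IDs
	VulnerableSymbol string   // canonical symbol ID tied to a vulnerability
	State            Reachability
}
```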
## C5) Policy evaluation must run on delta, not on raw findings
Define a policy DSL contract like:
* `fail_if new_reachable_critical > 0`
* `warn_if new_unknowns > 10`
* `fail_if vex_status_regressed == true`
* `pass_if no_net_increase_exploitable_surface == true`
Engineering directive:
* Policies must reference **delta fields**, not scanner-specific output.
* Keep the policy evaluation pure and deterministic.
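A pure evaluation sketch over delta-derived facts only; the field names mirror the DSL examples above, and the thresholds are the illustrative ones from those examples:

```go
package stella // illustrative package name

// DeltaFacts are the only inputs a policy may read: derived from the
// delta object, never from raw scanner output.
type DeltaFacts struct {
	NewReachableCritical        int
	NewUnknowns                 int
	VEXStatusRegressed          bool
	NetExploitableSurfaceGrowth int
}

// Evaluate is pure and deterministic: same facts in, same verdict out.
func Evaluate(f DeltaFacts) string {
	switch {
	case f.NewReachableCritical > 0, f.VEXStatusRegressed:
		return "FAIL"
	case f.NewUnknowns > 10, f.NetExploitableSurfaceGrowth > 0:
		return "WARN"
	default:
		return "PASS"
	}
}
```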
## C6) Signing and attachment (implementation-level)
Minimum requirements:
* Support signing delta verdict as a DSSE envelope with a stable predicate type.
* Support:
* keyless signing (optional)
* customer-managed keys (enterprise)
* Attach to the OCI artifact as an attestation (where possible), and store in the Stella ledger for retrieval.
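For reference, the pre-authentication encoding (PAE) that a DSSE signature covers, with a stdlib Ed25519 sketch; key handling is elided, and keyless or customer-managed flows would replace the raw key:

```go
package stella // illustrative package name

import (
	"crypto/ed25519"
	"fmt"
)

// pae implements DSSE's pre-authentication encoding:
//   "DSSEv1" SP len(type) SP type SP len(body) SP body
func pae(payloadType string, body []byte) []byte {
	return []byte(fmt.Sprintf("DSSEv1 %d %s %d %s",
		len(payloadType), payloadType, len(body), body))
}

// SignVerdict signs the canonical verdict bytes under a predicate type.
func SignVerdict(key ed25519.PrivateKey, payloadType string, verdict []byte) []byte {
	return ed25519.Sign(key, pae(payloadType, verdict))
}
```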
Definition of Done:
* A CI workflow can:
1. create snapshots
2. compute delta
3. produce signed delta verdict
4. verify signature and gate
---
# D. Roadmap (sequenced to deliver value early without painting ourselves into a corner)
## Phase 1: “Snapshot + SBOM Diff + Delta Verdict”
* Version SBOM graphs
* Diff SBOM graphs
* Produce delta verdict based on SBOM delta + vulnerability delta (even before reachability)
* Signed delta verdict artifact exists
Output:
* Baseline/target selection
* Delta taxonomy v1
* Signed delta verdict v1
## Phase 2: “VEX claimsets and VEX deltas”
* Ingest OpenVEX/CycloneDX/CSAF
* Store canonical claimsets per snapshot
* Diff claimsets and incorporate into delta verdict
Output:
* “VEX status regression” gating works deterministically
## Phase 3: “Reachability graphs and reachability deltas”
* Start with one ecosystem
* Generate reachability evidence
* Diff reachability and incorporate into verdict
Output:
* “new reachable critical” becomes the primary gate
## Phase 4: “Offline replay bundle”
* Export/import snapshot + feed snapshot + policy bundle
* Replay delta verdict identically in air-gapped environment
---
# E. Acceptance criteria checklist (use this as a release gate for your own feature)
A feature is not done until:
1. **Snapshot is content-addressed** and immutable.
2. **Delta is content-addressed** and immutable.
3. Delta shows:
* SBOM delta
* VEX delta (when enabled)
* Reachability delta (when enabled)
* Unknowns delta
4. **Delta verdict is signed** and verification is automated.
5. **Replay test**: given same baseline/target snapshots + policy bundle, verdict is identical byte-for-byte.
6. The product answers, clearly:
* What changed?
* Why does it matter?
* Why is the verdict pass/fail?
* What evidence supports this?
---
# F. What to tell your teams to avoid (common failure modes)
* Do **not** ship “diff” as a UI compare of two scan outputs.
* Do **not** make reachability an unstructured “note” field; it must be a graph with stable IDs.
* Do **not** allow non-deterministic inputs into verdict hashes (timestamps, random IDs, nondeterministic ordering).
* Do **not** treat VEX as “ignore rules” only; treat it as a claimset with provenance and merge semantics (even if merge comes later).