# Building a Deterministic Verdict Engine

> **Status:** PLANNED (implementation in progress)
> **Date:** 2025-12-25
> **Updated:** 2025-12-26
> **Related Sprints:** [`SPRINT_20251226_007_BE_determinism_gaps.md`](../implplan/SPRINT_20251226_007_BE_determinism_gaps.md)
> **Merged Advisories:** [`25-Dec-2025 - Enforcing Canonical JSON for Stable Verdicts.md`](./25-Dec-2025%20-%20Enforcing%20Canonical%20JSON%20for%20Stable%20Verdicts.md) (SUPERSEDED)

---

## Implementation Status

| Component | Status | Location |
|-----------|--------|----------|
| Canonical JSON (JCS) | COMPLETE | `StellaOps.Canonical.Json` |
| NFC String Normalization | COMPLETE | `StellaOps.Resolver.NfcStringNormalizer` |
| Content-Addressed IDs | COMPLETE | `Attestor.ProofChain/Identifiers/` |
| DSSE Signing | COMPLETE | `Signer/`, `Attestor/` |
| Delta Verdict | COMPLETE | `Policy/Deltas/DeltaVerdict.cs` |
| Merkle Trees | COMPLETE | `ProofChain/Merkle/` |
| Determinism Guards | COMPLETE | `Policy.Engine/DeterminismGuard/` |
| Replay Manifest | COMPLETE | `StellaOps.Replay.Core` |
| Feed Snapshot Coordinator | TODO | SPRINT_20251226_007 DET-GAP-01..04 |
| Keyless Signing | TODO | SPRINT_20251226_001 |
| Cross-Platform Testing | TODO | SPRINT_20251226_007 DET-GAP-11..13 |

**Overall Progress:** ~85% complete

---

## Advisory Content

This advisory is a practical blueprint for evolving the Stella Ops policy engine into a **fully deterministic verdict engine**: the *same SBOM + VEX + reachability subgraph* always produces *the exact same, replayable verdict*, with auditor-grade trails and signed "delta verdicts."

## Why this matters (quick)

* **Reproducibility:** auditors can replay any scan and get identical results.
* **Trust & scale:** cross-agent consensus via content-addressed inputs and signed outputs.
* **Operational clarity:** diffs between builds become crisp, machine-verifiable artifacts.

## Core principles

* **Determinism-first:** no wall-clock time, no random iteration order, no network during evaluation.
* **Content-addressing:** hash every *input* (SBOM, VEX docs, reachability subgraph, policy set, rule versions, feed snapshots).
* **Declarative state:** a compact **Scan Manifest** lists input hashes + policy bundle hash + engine version.
* **Pure evaluation:** the verdict function is referentially transparent: `Verdict = f(Manifest)`.

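
The content-addressing and declarative-state principles can be illustrated with a minimal sketch. It assumes content IDs are lowercase hex SHA-256 digests of the raw bytes; `ContentAddress`, `ScanManifest`, and the three fields shown are hypothetical names chosen for illustration, not existing StellaOps types, and the newline-joined digest is a stand-in for hashing the canonical JSON form.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: a content ID is the lowercase hex SHA-256 of the raw bytes.
public static class ContentAddress
{
    public static string Of(byte[] content) =>
        Convert.ToHexString(SHA256.HashData(content)).ToLowerInvariant();

    public static string Of(string utf8Text) => Of(Encoding.UTF8.GetBytes(utf8Text));
}

// Hypothetical manifest: only hashes and versions, never clocks, paths, or hostnames.
public sealed record ScanManifest(
    string SbomSha256,
    string PolicyBundleSha256,
    string EngineVersion)
{
    // Fields are joined in a fixed order, so equal manifests always produce the same digest.
    // (Assumption: a real manifest would hash its canonical JSON form instead.)
    public string Digest() =>
        ContentAddress.Of(string.Join("\n", SbomSha256, PolicyBundleSha256, EngineVersion));
}
```

Because the manifest carries nothing but hashes and versions, two agents that see the same inputs derive the same manifest digest, which is what makes `Verdict = f(Manifest)` checkable from the outside.
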
## Canonical JSON (Merged from Canonical JSON Advisory)

All JSON artifacts must use **RFC 8785 JCS** canonicalization with optional **Unicode NFC** normalization:

```csharp
// Existing implementation
using StellaOps.Canonical.Json;

var canonical = CanonJson.Canonicalize(myObject);
var hash = CanonJson.Hash(myObject);
var versionedHash = CanonJson.HashVersioned(myObject, CanonVersion.V1);
```

**Canonicalization Rules:**
1. Object keys sorted lexicographically (ordinal, i.e. by UTF-16 code unit, as RFC 8785 requires)
2. No whitespace or formatting variations
3. UTF-8 encoding without BOM
4. Numbers serialized per RFC 8785 (ECMAScript formatting of IEEE 754 doubles)
5. Version markers for migration safety (`_canonVersion: "stella:canon:v1"`)

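
To make rules 1 and 2 and the optional NFC step concrete, here is a small sketch built on `System.Text.Json.Nodes`. It is not the `StellaOps.Canonical.Json` implementation and not a complete RFC 8785 serializer: numbers keep their source text and the default encoder escapes non-ASCII characters, both of which differ from JCS. `CanonSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json.Nodes;

// Illustrative only: ordinal key sorting, compact output, NFC-normalized strings.
public static class CanonSketch
{
    public static string Canonicalize(string json) =>
        Normalize(JsonNode.Parse(json))?.ToJsonString() ?? "null";

    // Rebuilds the tree so that every object is populated in sorted key order.
    private static JsonNode? Normalize(JsonNode? node) => node switch
    {
        JsonObject obj => new JsonObject(obj
            .Select(p => KeyValuePair.Create(
                p.Key.Normalize(NormalizationForm.FormC), Normalize(p.Value)))
            .OrderBy(p => p.Key, StringComparer.Ordinal)),
        JsonArray arr => new JsonArray(arr.Select(Normalize).ToArray()),
        JsonValue v when v.TryGetValue<string>(out var s) =>
            JsonValue.Create(s.Normalize(NormalizationForm.FormC)),
        // Numbers and booleans are copied unchanged (NOT JCS number formatting).
        JsonValue v => JsonNode.Parse(v.ToJsonString()),
        _ => null,
    };
}
```

Under these assumptions, `{ "b": 1, "a": [2, 1] }` and `{"a":[2,1],"b":1}` canonicalize to the same string, and array order is preserved because it is semantically significant.
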
## Data artifacts

* **Scan Manifest (`manifest.jsonc`)**
  * `sbom_sha256`, `vex_set_sha256[]`, `reach_subgraph_sha256`, `feeds_snapshot_sha256`, `policy_bundle_sha256`, `engine_version`, `policy_semver`, `options_hash`

* **Verdict (`verdict.json`)**
  * canonical JSON (stable key order); includes:
    * `risk_score`, `status` (pass/warn/fail), `unknowns_count`
    * **evidence_refs:** content IDs for cited VEX statements, nodes/edges from reachability, CVE records, feature-flags, env-guards
    * **explanations:** stable, template-driven strings (+ machine reasons)

* **Delta Verdict (`delta.json`)**
  * computed between two manifests/verdicts:
    * `added_findings[]`, `removed_findings[]`, `severity_shift[]`, `unknowns_delta`, `policy_effects[]`
  * signed (DSSE/COSE/JWS), time-stamped, and linkable to both verdicts

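
The delta computation can be sketched as a set difference over finding IDs with explicitly sorted output. The `Delta` record and `DeltaSketch.Compute` below are hypothetical shapes for illustration; the existing `Policy/Deltas/DeltaVerdict.cs` may be structured differently, and `severity_shift[]` and `policy_effects[]` are omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced delta shape: only added/removed findings and the unknowns delta.
public sealed record Delta(
    IReadOnlyList<string> AddedFindings,
    IReadOnlyList<string> RemovedFindings,
    int UnknownsDelta);

public static class DeltaSketch
{
    public static Delta Compute(
        IEnumerable<string> baseFindings, int baseUnknowns,
        IEnumerable<string> headFindings, int headUnknowns)
    {
        var baseSet = new HashSet<string>(baseFindings, StringComparer.Ordinal);
        var headSet = new HashSet<string>(headFindings, StringComparer.Ordinal);

        // Sort ordinally so the output never depends on hash-set iteration order or culture.
        var added = headSet.Except(baseSet).OrderBy(id => id, StringComparer.Ordinal).ToList();
        var removed = baseSet.Except(headSet).OrderBy(id => id, StringComparer.Ordinal).ToList();

        return new Delta(added, removed, headUnknowns - baseUnknowns);
    }
}
```

Sorting before emission matters here: the delta is signed, so two engines must serialize the same lists in the same order to co-sign identical bytes.
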
## Engine architecture (deterministic path)

1. **Normalize inputs**
   * SBOM: sort by `packageUrl`/`name@version`; resolve aliases; freeze semver comparison rules.
   * VEX: normalize each provider's statements to `vex_id`, `product_ref`, `status` (`affected`, `not_affected`, …), with the source trust score precomputed from a strict, versioned **trust registry**.
   * Reachability: store the subgraph as adjacency lists sorted by node ID; hash after a stable topological ordering.
   * Feeds: lock to a **snapshot** (timestamp + commit/hash); no live calls.

2. **Policy bundle**
   * Declarative rules (e.g., lattice/merge semantics), compiled to a **canonical IR** (e.g., OPA-Rego → sorted DNF).
   * Merge precedence is explicit (e.g., a fixed `vendor > distro > internal` ordering, which can be replaced by a lattice-merge table).
   * Unknowns policy baked in: e.g., `fail_if_unknowns > N in prod`.

3. **Evaluation**
   * Build a **finding set**: `(component, vuln, context)` tuples with deterministic IDs.
   * Apply **lattice-based VEX merge** (proof-carrying): each suppression must carry an evidence pointer (feature flag off, code path unreachable, patched-backport proof).
   * Compute final `status` and `risk_score` using fixed-precision math; rounding rules are part of the bundle.

4. **Emit**
   * Canonicalize verdict JSON; attach **evidence map** (content IDs only).
   * Sign verdict; attach as **OCI attestation** to image/digest.

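
Two parts of the evaluation step lend themselves to a short sketch: deterministic finding IDs and fixed-precision scoring. Everything below is an assumption made for illustration: the `finding:` prefix, the unit-separator delimiter, the multiplicative score, and the two-decimal banker's rounding are hypothetical, since the source says only that IDs are deterministic and that rounding rules live in the policy bundle.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EvaluationSketch
{
    // Deterministic ID: SHA-256 over the tuple fields, joined with U+001F
    // (assumed never to occur inside a field, so distinct tuples cannot collide by concatenation).
    public static string FindingId(string componentPurl, string vulnId, string context)
    {
        var key = string.Join("\u001f", componentPurl, vulnId, context);
        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(key));
        return "finding:" + Convert.ToHexString(digest).ToLowerInvariant();
    }

    // decimal avoids binary floating-point drift across platforms;
    // the rounding mode is stated explicitly instead of relying on a default.
    public static decimal RiskScore(decimal baseSeverity, decimal reachabilityWeight) =>
        Math.Round(baseSeverity * reachabilityWeight, 2, MidpointRounding.ToEven);
}
```

With these assumptions, a severity of `7.5` and a weight of `0.33` multiply to exactly `2.475`, which rounds to `2.48` under round-half-to-even; the same inputs give the same score on any platform because no binary floating-point arithmetic is involved.
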
## APIs (minimal but complete)

* `POST /evaluate` → returns `verdict.json` + attestation
* `POST /delta` with `{base_verdict, head_verdict}` → `delta.json` (signed)
* `GET /replay?manifest_sha=` → re-executes using cached snapshot bundles, returns the same `verdict_sha`
* `GET /evidence/:cid` → fetches immutable evidence blobs (offline-ready)

## Storage & indexing

* **CAS (content-addressable store):** `/evidence/<sha256>` for SBOM/VEX/graphs/feeds/policies.
* **Verdict registry:** keyed by `(image_digest, manifest_sha, engine_version)`.
* **Delta ledger:** append-only, signed; supports cross-agent consensus (multiple engines can co-sign identical deltas).

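
The CAS contract is small enough to sketch in memory. `InMemoryCas` is a hypothetical stand-in, not an existing StellaOps class; a real store would persist blobs under `/evidence/<sha256>`, but the two properties shown (keys derived from content, integrity re-checked on read) carry over.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical in-memory stand-in for the content-addressable store.
public sealed class InMemoryCas
{
    private readonly Dictionary<string, byte[]> _blobs = new(StringComparer.Ordinal);

    private static string Sha256Hex(byte[] data) =>
        Convert.ToHexString(SHA256.HashData(data)).ToLowerInvariant();

    // The key is derived from the content, so identical blobs deduplicate
    // and callers cannot choose or overwrite an arbitrary key.
    public string Put(byte[] content)
    {
        var cid = Sha256Hex(content);
        _blobs[cid] = (byte[])content.Clone();
        return cid;
    }

    // Re-hash on read so corruption or tampering is detected before the blob is used.
    public byte[] Get(string cid)
    {
        var content = _blobs[cid];
        if (!string.Equals(Sha256Hex(content), cid, StringComparison.Ordinal))
            throw new InvalidOperationException($"CAS entry {cid} failed its integrity check.");
        return (byte[])content.Clone();
    }
}
```
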
## UI slices (where it lives)

* **Run details → "Verdict" tab:** status, risk score, unknowns, top evidence links.
* **"Diff" tab:** render **Delta Verdict** (added/removed/changed), with drill-down to proofs.
* **"Replay" button:** shows the exact manifest & engine version; one-click re-evaluation (offline possible).
* **Audit export:** zip of `manifest.jsonc`, `verdict.json`, `delta.json` (if any), attestation, and referenced evidence.

## Testing & QA (must-have)

* **Golden tests:** fixtures of manifests → frozen verdict JSONs (byte-for-byte).
* **Chaos determinism tests:** vary thread counts, env vars, map iteration seeds; assert identical verdicts.
* **Cross-engine round-trips:** two independent builds of the engine produce the same verdict for the same manifest.
* **Time-travel tests:** replay older feed snapshots to ensure stability.

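
A chaos determinism test can be as simple as feeding the same inputs in many shuffled orders and comparing hashes. The sketch below uses a stand-in `VerdictSha` that sorts findings ordinally before hashing; it is a hypothetical placeholder for the real engine, and `DeterminismCheck` is not an existing StellaOps type.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class DeterminismCheck
{
    // Stand-in for the engine: sorts findings ordinally, then hashes the joined result.
    public static string VerdictSha(string[] findings)
    {
        var canonical = string.Join("\n", findings.OrderBy(f => f, StringComparer.Ordinal));
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical)))
            .ToLowerInvariant();
    }

    // Presents the same findings in `runs` shuffled orders (seeded, so failures are reproducible)
    // and reports whether every run matched the unshuffled result.
    public static bool IsOrderIndependent(string[] findings, int runs)
    {
        var expected = VerdictSha(findings);
        for (var seed = 0; seed < runs; seed++)
        {
            var rng = new Random(seed);
            var shuffled = findings.OrderBy(_ => rng.Next()).ToArray();
            if (VerdictSha(shuffled) != expected) return false;
        }
        return true;
    }
}
```

Seeding the shuffle keeps the test itself deterministic: a failing seed can be replayed exactly, which is the same property the engine is being tested for.
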
## Rollout plan

1. **Phase 1:** Introduce Manifest + canonical verdict format alongside existing policy engine (shadow mode).
2. **Phase 2:** Make verdicts the **first-class artifact** (OCI-attached); ship UI "Verdict/Diff".
3. **Phase 3:** Enforce **delta-gates** in CI/CD (risk budgets + exception packs referenceable by content ID).
4. **Phase 4:** Open **consensus mode**: accept externally signed identical delta verdicts to strengthen trust.

## Notes for Stella modules

* **scanner.webservice:** keep lattice algorithms here (per your standing rule). Concelier/Excitors "preserve-prune source."
* **Authority/Attestor:** handle DSSE signing, key management, regional crypto profiles (eIDAS/FIPS/GOST/SM).
* **Feedser/Vexer:** produce immutable **snapshot bundles**; never query live during evaluation.
* **Router/Scheduler:** schedule replay jobs; cache manifests to speed up audits.
* **Db:** Postgres as SoR; Valkey only for ephemeral queues/caches (per your BSD-only profile).