save progress

StellaOps Bot
2025-12-26 22:03:32 +02:00
parent 9a4cd2e0f7
commit e6c47c8f50
3634 changed files with 253222 additions and 56632 deletions

@@ -0,0 +1,567 @@
# Consolidated Advisory: Deterministic Evidence and Verdict Architecture
> **Status:** PLANNED — Implementation ~85% complete
> **Created:** 2025-12-26
> **Consolidated From:**
> - `25-Dec-2025 - Building a Deterministic Verdict Engine.md` (original)
> - `25-Dec-2025 - Enforcing Canonical JSON for Stable Verdicts.md` (superseded)
> - `25-Dec-2025 - Planning Keyless Signing for Verdicts.md` (original)
> - `26-Dec-2026 - SmartDiff as a Core Evidence Primitive.md` (archived)
> - `26-Dec-2026 - Reachability as Cryptographic Proof.md` (archived)
> **Technical Specification:** [`docs/technical/architecture/determinism-specification.md`](../technical/architecture/determinism-specification.md)
---
## Executive Summary
This document consolidates StellaOps guidance on **deterministic verdict computation**, **canonical serialization**, **keyless signing**, and **proof-carrying reachability** into a single authoritative reference. The core proposition:
**Same SBOM + VEX + reachability subgraph ⇒ exact same, replayable verdict every time—with auditor-grade trails and signed evidence.**
### Key Capabilities
1. **Deterministic Evaluation**: Pure functions with no wall-clock reads, RNG, or network access during evaluation
2. **Canonical Serialization**: RFC 8785 JCS + Unicode NFC for stable hashes
3. **Content-Addressed Storage**: Every input identified by cryptographic hash
4. **Keyless Signing**: Sigstore/Fulcio for short-lived certificates with Rekor transparency
5. **Proof-Carrying Reachability**: Minimal, reproducible chains showing why vulns can/cannot hit runtime
6. **Delta Verdicts**: Signed diffs between evaluation states for CI/CD gates
### Implementation Status
| Component | Status | Location |
|-----------|--------|----------|
| Canonical JSON (JCS) | ✅ COMPLETE | `StellaOps.Canonical.Json` |
| NFC String Normalization | ✅ COMPLETE | `StellaOps.Resolver.NfcStringNormalizer` |
| Content-Addressed IDs | ✅ COMPLETE | `Attestor.ProofChain/Identifiers/` |
| DSSE Signing | ✅ COMPLETE | `Signer/`, `Attestor/` |
| Delta Verdict | ✅ COMPLETE | `Policy/Deltas/DeltaVerdict.cs` |
| Merkle Trees | ✅ COMPLETE | `ProofChain/Merkle/` |
| Determinism Guards | ✅ COMPLETE | `Policy.Engine/DeterminismGuard/` |
| Replay Manifest | ✅ COMPLETE | `StellaOps.Replay.Core` |
| Feed Snapshot Coordinator | 🔄 IN PROGRESS | SPRINT_20251226_007 |
| Keyless Signing (Fulcio) | 🔄 IN PROGRESS | SPRINT_20251226_001 |
| Cross-Platform Testing | 🔄 IN PROGRESS | SPRINT_20251226_007 |
**Overall Progress:** ~85% complete
---
## Table of Contents
1. [Why Determinism Matters](#1-why-determinism-matters)
2. [Core Principles](#2-core-principles)
3. [Canonical Serialization](#3-canonical-serialization)
4. [Data Artifacts](#4-data-artifacts)
5. [Signing & Attestation](#5-signing--attestation)
6. [Proof-Carrying Reachability](#6-proof-carrying-reachability)
7. [Delta Verdicts](#7-delta-verdicts)
8. [Engine Architecture](#8-engine-architecture)
9. [Testing Strategy](#9-testing-strategy)
10. [APIs & Integration](#10-apis--integration)
11. [Implementation Status Matrix](#11-implementation-status-matrix)
---
## 1. Why Determinism Matters
### Reproducibility for Auditors
Auditors can replay any scan and get identical results. No "it worked on my machine" scenarios—verdicts are cryptographically verifiable.
### Content-Addressed Caching
Hash-based storage enables:
- Deduplication across scans
- Cache hits on unchanged inputs
- Efficient delta computation
### Cross-Agent Consensus
Multiple evaluation engines can independently produce the same verdict for the same manifest, enabling:
- Distributed verification
- Multi-party attestations
- Trust without centralization
### Operational Clarity
Diffs between builds become crisp, machine-verifiable artifacts. When a verdict changes, you know exactly why.
---
## 2. Core Principles
### 2.1 No Wall-Clock Time
Evaluation functions never read current time. All timestamps come from input manifests.
### 2.2 No Random Iteration
All collections use deterministic ordering:
- Objects: keys sorted lexicographically (Ordinal)
- Arrays: preserve input order or sort by stable key
- Sets: sort by content hash
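
The ordering rules above can be sketched as follows. This is a minimal illustration assuming .NET 6 or later; `DeterministicOrder`, `SortKeys`, and `SortSetByHash` are hypothetical names chosen for the example and are not StellaOps APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Usage: both results are independent of insertion order and hash-set iteration order.
var keys = DeterministicOrder.SortKeys(new Dictionary<string, int> { ["b"] = 2, ["a"] = 1, ["B"] = 3 });
Console.WriteLine(string.Join(",", keys.Select(kv => kv.Key))); // B,a,b

var members = DeterministicOrder.SortSetByHash(new HashSet<string> { "pkg:npm/x@1", "pkg:npm/y@2" });
Console.WriteLine(string.Join(",", members));

public static class DeterministicOrder
{
    // Objects: order keys by ordinal (UTF-16 code unit) comparison, never by culture.
    public static IReadOnlyList<KeyValuePair<string, T>> SortKeys<T>(
        IEnumerable<KeyValuePair<string, T>> pairs) =>
        pairs.OrderBy(kv => kv.Key, StringComparer.Ordinal).ToList();

    // Sets: order members by the hex SHA-256 of their UTF-8 bytes.
    public static IReadOnlyList<string> SortSetByHash(IEnumerable<string> members) =>
        members.OrderBy(
            m => Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(m))),
            StringComparer.Ordinal).ToList();
}
```

Ordinal comparison places `B` before `a` because it compares code units rather than applying culture-specific collation, which is what keeps the order stable across machines and locales.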
### 2.3 No Network During Evaluation
All external data is pre-fetched and pinned by hash before evaluation begins.
### 2.4 Content-Addressing All Inputs
Every input is identified by its cryptographic hash:
- `sbom_sha256` - SBOM graph hash
- `vex_set_sha256[]` - VEX document hashes
- `reach_subgraph_sha256` - Reachability graph hash
- `feeds_snapshot_sha256` - Feed snapshot hash
- `policy_bundle_sha256` - Policy/rules hash
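
As a rough sketch of how such an identifier might be derived (assuming .NET 6 or later, and assuming the `sha256:<lowercase hex>` form used by the examples in this document), the hypothetical helper `ContentAddress.Of` below hashes bytes that are already canonical:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Usage: identical bytes always yield the identical identifier.
var id = ContentAddress.Of(Encoding.UTF8.GetBytes("{\"name\":\"demo\"}"));
Console.WriteLine(id); // sha256: followed by 64 lowercase hex characters

public static class ContentAddress
{
    // Hypothetical helper: formats a SHA-256 digest as "sha256:<lowercase hex>".
    // The caller is responsible for passing canonical bytes.
    public static string Of(ReadOnlySpan<byte> canonicalBytes) =>
        "sha256:" + Convert.ToHexString(SHA256.HashData(canonicalBytes)).ToLowerInvariant();
}
```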
### 2.5 Pure Evaluation Functions
The verdict function is referentially transparent:
```
Verdict = f(Manifest)
```
Given the same manifest, the function always returns the same verdict.
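
A toy illustration of what referential transparency means in practice, assuming .NET 6 or later. The `Manifest`, `Verdict`, and `Evaluator` types here are simplified stand-ins invented for the example, not the real StellaOps types; the point is only that the function reads nothing but its argument, including the timestamp.

```csharp
using System;
using System.Collections.Generic;

// Usage: evaluating the same manifest twice yields equal verdicts.
var manifest = new Manifest(
    "sha256:aaa",
    new[] { "CVE-2025-0001" },
    new DateTimeOffset(2025, 12, 26, 0, 0, 0, TimeSpan.Zero));
Console.WriteLine(Evaluator.Evaluate(manifest) == Evaluator.Evaluate(manifest)); // True

// Simplified stand-ins for illustration only.
public sealed record Manifest(string SbomSha256, IReadOnlyList<string> OpenFindings, DateTimeOffset SnapshotTime);
public sealed record Verdict(string Status, int FindingCount, DateTimeOffset AsOf);

public static class Evaluator
{
    // Pure: reads only its argument. No DateTime.UtcNow, no Random, no I/O.
    // The timestamp on the verdict is copied from the manifest, not from the clock.
    public static Verdict Evaluate(Manifest m) =>
        new(m.OpenFindings.Count == 0 ? "pass" : "warn", m.OpenFindings.Count, m.SnapshotTime);
}
```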
---
## 3. Canonical Serialization
### 3.1 The Rule
**Adopt one canonicalization spec and apply it everywhere at ingress/egress of your resolver:**
- **Strings:** normalize to **UTF-8, Unicode NFC** (Normalization Form C)
- **JSON:** canonicalize with **RFC 8785 JCS**: sorted keys, no insignificant whitespace, exact number formatting
- **Binary for hashing/signing:** always hash **the canonical bytes**, never ad-hoc serializer output
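
To see why NFC normalization has to happen before hashing, consider the short sketch below (assuming .NET 6 or later with default globalization settings). It uses only standard library calls; the `Hash` helper is a local function written for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

string composed = "caf\u00E9";     // 'é' as a single code point
string decomposed = "cafe\u0301";  // 'e' followed by a combining acute accent

// The two strings render identically but have different UTF-8 bytes.
Console.WriteLine(Hash(composed) == Hash(decomposed)); // False

// After NFC normalization both collapse to the same bytes and the same hash.
Console.WriteLine(
    Hash(composed.Normalize(NormalizationForm.FormC)) ==
    Hash(decomposed.Normalize(NormalizationForm.FormC))); // True

// Hashes the UTF-8 bytes (no BOM) of a string.
static string Hash(string s) =>
    Convert.ToHexString(SHA256.HashData(new UTF8Encoding(false).GetBytes(s)));
```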
### 3.2 Implementation
```csharp
using StellaOps.Attestor.ProofChain.Json;
using StellaOps.Canonical.Json;
using StellaOps.Resolver;

// Canonical JSON with version markers (StellaOps.Canonical.Json)
var canonical = CanonJson.Canonicalize(myObject);
var hash = CanonJson.Hash(myObject);
var versionedHash = CanonJson.HashVersioned(myObject, CanonVersion.V1);

// NFC normalization (StellaOps.Resolver)
var normalizer = NfcStringNormalizer.Instance;
var nfcString = normalizer.Normalize(input);

// RFC 8785 JCS for raw JSON bytes (StellaOps.Attestor.ProofChain.Json)
var canonicalizer = new Rfc8785JsonCanonicalizer();
var canonicalBytes = canonicalizer.Canonicalize(utf8JsonBytes);
```
### 3.3 Canonicalization Rules
1. **Object keys** sorted lexicographically (Ordinal comparison)
2. **No whitespace** or formatting variations
3. **UTF-8 encoding** without BOM
4. **Number formatting** per RFC 8785: IEEE 754 double values serialized with the ECMAScript number-to-string rules (no trailing zeros, no exponent for small integers)
5. **Version markers** for migration safety: `_canonVersion: "stella:canon:v1"`
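
Rules 1 to 3 can be illustrated with a deliberately partial sketch built on `System.Text.Json` (assuming .NET 6 or later). `MiniCanon` is a hypothetical name and is not the `StellaOps.Canonical.Json` implementation; it sorts keys and strips whitespace but does not apply the RFC 8785 number or string-escaping rules, so it should not be used for real hashing.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.Encodings.Web;
using System.Text.Json;

// Usage: key order and whitespace in the input do not affect the output.
Console.WriteLine(Encoding.UTF8.GetString(
    MiniCanon.Canonicalize("{ \"b\": 1,\n \"a\": [true, null] }")));
// {"a":[true,null],"b":1}

public static class MiniCanon
{
    public static byte[] Canonicalize(string json)
    {
        using var doc = JsonDocument.Parse(json);
        using var buffer = new MemoryStream();
        // Indented defaults to false, so no insignificant whitespace is emitted.
        var options = new JsonWriterOptions { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping };
        using (var writer = new Utf8JsonWriter(buffer, options))
        {
            Write(doc.RootElement, writer);
        } // disposing the writer flushes it into the buffer
        return buffer.ToArray(); // UTF-8 without BOM
    }

    private static void Write(JsonElement element, Utf8JsonWriter writer)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                writer.WriteStartObject();
                // Rule 1: keys sorted with ordinal comparison.
                foreach (var property in element.EnumerateObject().OrderBy(x => x.Name, StringComparer.Ordinal))
                {
                    writer.WritePropertyName(property.Name);
                    Write(property.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonValueKind.Array:
                writer.WriteStartArray();
                foreach (var item in element.EnumerateArray()) Write(item, writer); // array order preserved
                writer.WriteEndArray();
                break;
            default:
                // Scalars are copied through; RFC 8785 number/string rules are NOT applied here.
                element.WriteTo(writer);
                break;
        }
    }
}
```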
### 3.4 Contract
1. Inputs may arrive in any well-formed JSON
2. Resolver **normalizes strings (NFC)** and **re-emits JSON in JCS**
3. **Content hash** is computed from **JCS-canonical UTF-8 bytes** only
4. Any signature/attestation (DSSE/OCI) MUST cover those same bytes
5. Any module that can't speak JCS must pass raw data to the resolver and let it canonicalize, rather than hashing or signing its own serializer output
---
## 4. Data Artifacts
### 4.1 Scan Manifest
The manifest lists all input hashes plus engine version:
```json
{
  "sbom_sha256": "sha256:a1b2c3...",
  "vex_set_sha256": ["sha256:d4e5f6...", "sha256:g7h8i9..."],
  "reach_subgraph_sha256": "sha256:j0k1l2...",
  "feeds_snapshot_sha256": "sha256:m3n4o5...",
  "policy_bundle_sha256": "sha256:p6q7r8...",
  "engine_version": "1.0.0",
  "policy_semver": "2025.12",
  "options_hash": "sha256:s9t0u1..."
}
```
### 4.2 Verdict
Canonical JSON with stable key order:
```json
{
  "risk_score": 42,
  "status": "warn",
  "unknowns_count": 3,
  "evidence_refs": [
    "sha256:...",
    "sha256:..."
  ],
  "explanations": [
    {
      "template": "CVE-{cve} suppressed by VEX claim from {source}",
      "params": {"cve": "2025-1234", "source": "vendor"},
      "machine_reason": "VEX_NOT_AFFECTED"
    }
  ]
}
```
### 4.3 Delta Verdict
Computed between two manifests/verdicts:
```json
{
  "base_manifest_sha": "sha256:...",
  "head_manifest_sha": "sha256:...",
  "added_findings": [...],
  "removed_findings": [...],
  "severity_shift": [...],
  "unknowns_delta": -2,
  "policy_effects": [...],
  "timestamp": "2025-12-26T00:00:00Z",
  "signature": "..."
}
```
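
The `added_findings`, `removed_findings`, and `unknowns_delta` fields above amount to set differences and a subtraction. A minimal sketch, assuming .NET 6 or later and assuming findings are identified by plain string IDs (the real `DeltaVerdict` model is richer); `Delta` and `DeltaSketch` are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage with made-up finding IDs.
var delta = DeltaSketch.Compute(
    baseFindings: new[] { "CVE-2025-0001", "CVE-2025-0002" },
    headFindings: new[] { "CVE-2025-0002", "CVE-2025-0003" },
    baseUnknowns: 5,
    headUnknowns: 3);
Console.WriteLine(string.Join(",", delta.Added));   // CVE-2025-0003
Console.WriteLine(string.Join(",", delta.Removed)); // CVE-2025-0001
Console.WriteLine(delta.UnknownsDelta);             // -2

public sealed record Delta(IReadOnlyList<string> Added, IReadOnlyList<string> Removed, int UnknownsDelta);

public static class DeltaSketch
{
    public static Delta Compute(
        IEnumerable<string> baseFindings, IEnumerable<string> headFindings,
        int baseUnknowns, int headUnknowns)
    {
        var baseSet = new HashSet<string>(baseFindings, StringComparer.Ordinal);
        var headSet = new HashSet<string>(headFindings, StringComparer.Ordinal);

        // Sort the results so the serialized delta never depends on hash-set iteration order.
        var added = headSet.Except(baseSet).OrderBy(f => f, StringComparer.Ordinal).ToList();
        var removed = baseSet.Except(headSet).OrderBy(f => f, StringComparer.Ordinal).ToList();

        return new Delta(added, removed, headUnknowns - baseUnknowns);
    }
}
```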
---
## 5. Signing & Attestation
### 5.1 Keyless Signing with Sigstore
Use **keyless** signing in CI pipelines:
- Obtain an OIDC token from your CI runner
- **Fulcio** issues a short-lived X.509 cert (~10 minutes)
- Sign with the ephemeral key
- Cert + signature logged to **Rekor**
**Why:** No key escrow in CI, nothing persistent to steal, every signature is time-bound + transparency-logged.
### 5.2 Hardware-Backed Org Key
Reserve a physical HSM/YubiKey (or KMS) key for:
- Re-signing monthly bundles
- Offline/air-gapped verification workflows
### 5.3 OCI Attestations
Emit DSSE/attestations as OCI-attached artifacts:
- SBOM deltas
- Reachability graphs
- Policy results
- Verdicts
### 5.4 Bundle Rotation Policy
Every month:
1. Collect older attestations
2. Re-sign into a long-lived "bundle" (plus timestamps) using the org key
3. Bundle contains: cert chain, Rekor inclusion proof, timestamps
**Suggested SLOs:**
- CI keyless cert TTL: 10 minutes (Fulcio default)
- Bundle cadence: monthly (or per release)
- Retention: 24 months
### 5.5 Offline Verification
Mirror the image, attestation, and Rekor proof (or bundle) into the disconnected registry. Verify with `cosign verify` (and `cosign verify-attestation` for attestations) against the mirrored materials; no internet access is needed.
### 5.6 Implementation Sprints
| Sprint | Module | Topic |
|--------|--------|-------|
| SPRINT_20251226_001 | Signer | Fulcio keyless signing client |
| SPRINT_20251226_002 | Attestor | Monthly bundle rotation |
| SPRINT_20251226_003 | Attestor | Offline/air-gap verification |
| SPRINT_20251226_004 | Backend | CI/CD integration templates |
---
## 6. Proof-Carrying Reachability
### 6.1 The Concept
**Reachability** asks: "Could data flow from an attacker to the vulnerable code path during real execution?"
**Proof-carrying reachability** says: "Don't just say yes/no—hand me a *proof chain* I can re-run."
### 6.2 Proof Structure
1. **Scope hash**: content digests for artifact(s) (image layers, SBOM nodes, commit IDs, compiler flags)
2. **Policy hash**: the decision rules used
3. **Graph snippet**: the *minimal subgraph* connecting entrypoints → sources → validators → sinks
4. **Conditions**: feature flags, env vars, platform guards, version ranges, eBPF-observed edges
5. **Verdict** (signed): maps artifact A to one of {Affected | Not Affected | Under-Constrained}, with reason codes
6. **Replay manifest**: the inputs needed to recompute the same verdict
### 6.3 Example Proof
```
Artifact: svc.payments:1.4.7 (image digest sha256:...)
CVE: CVE-2024-XYZ in libyaml 0.2.5
Entry: POST /import, body → YamlDeserializer.Parse
Guards: none (no schema/whitelist prior to parse)
Edge chain: HttpBody → Parse(bytes) → LoadNode() → vulnerable_path()
Condition: feature flag BULK_IMPORT=true
Verdict: AFFECTED
Signed: DSSE envelope over {scope hash, policy hash, graph snippet, conditions, verdict}
```
### 6.4 Operating Modes
| Mode | Unknowns Policy | Proofs |
|------|-----------------|--------|
| **Strict** (prod) | Fail-closed | Required for Not Affected |
| **Lenient** (dev) | Tolerated | Optional but encouraged |
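
One way to read the table as code is shown below. This is a sketch under stated assumptions: it treats an Under-Constrained verdict as an "unknown", and the names `Mode`, `Reach`, and `Gate.ClaimIsAdmissible` are invented for the example.

```csharp
using System;

Console.WriteLine(Gate.ClaimIsAdmissible(Mode.Strict, Reach.NotAffected, hasProof: false));       // False
Console.WriteLine(Gate.ClaimIsAdmissible(Mode.Strict, Reach.NotAffected, hasProof: true));        // True
Console.WriteLine(Gate.ClaimIsAdmissible(Mode.Strict, Reach.UnderConstrained, hasProof: true));   // False
Console.WriteLine(Gate.ClaimIsAdmissible(Mode.Lenient, Reach.UnderConstrained, hasProof: false)); // True

public enum Mode { Strict, Lenient }
public enum Reach { Affected, NotAffected, UnderConstrained }

public static class Gate
{
    // Returns true when a reachability claim may be accepted as stated under the given mode.
    public static bool ClaimIsAdmissible(Mode mode, Reach claim, bool hasProof) => mode switch
    {
        Mode.Lenient => true,                 // unknowns tolerated, proofs optional
        Mode.Strict => claim switch
        {
            Reach.UnderConstrained => false,  // fail-closed on unknowns (assumption: treated as unknown)
            Reach.NotAffected => hasProof,    // proof required for Not Affected
            _ => true                         // Affected needs no proof to be acted on
        },
        _ => throw new ArgumentOutOfRangeException(nameof(mode))
    };
}
```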
### 6.5 What to Measure
- Proof generation rate
- Median proof size (KB)
- Replay success %
- Proof dedup ratio
- "Unknowns" burn-down
---
## 7. Delta Verdicts
### 7.1 Evidence Model
A **semantic delta** captures meaningful differences between two states:
```json
{
  "subject": {"ociDigest": "sha256:..."},
  "inputs": {
    "feeds": [{"type":"cve","digest":"sha256:..."}],
    "tools": {"sbomer":"1.6.3","reach":"0.9.0","policy":"lattice-2025.12"},
    "baseline": {"sbomG":"sha256:...","vexSet":"sha256:..."}
  },
  "delta": {
    "components": {"added":[...],"removed":[...],"updated":[...]},
    "reachability": {"edgesAdded":[...],"edgesRemoved":[...]},
    "settings": {"changed":[...]},
    "vex": [{"cve":"CVE-2025-1234","from":"affected","to":"not_affected",
             "reason":"config_flag_off","evidenceRef":"att#cfg-42"}],
    "attestations": {"changed":[...]}
  },
  "verdict": {
    "decision": "allow",
    "riskBudgetUsed": 2,
    "policyId": "lattice-2025.12",
    "explanationRefs": ["vex[0]","reachability.edgesRemoved[3]"]
  },
  "signing": {"dsse":"...","signer":"stella-authority"}
}
```
### 7.2 Merge Semantics
Define a policy-controlled lattice for claims:
- **Orderings:** `exploit_observed > affected > under_investigation > fixed > not_affected`
- **Source weights:** vendor, distro, internal SCA, runtime sensor, pentest
- **Conflict rules:** tie-breaks, quorum, freshness windows, required evidence hooks
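
The ordering alone already defines a simple merge: take the highest status among the claims. The sketch below shows only that part and deliberately ignores source weights and conflict rules; `VexStatus` and `VexLattice` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var merged = VexLattice.Merge(new[] { VexStatus.NotAffected, VexStatus.UnderInvestigation, VexStatus.Fixed });
Console.WriteLine(merged); // UnderInvestigation

// Declared in ascending order so the numeric value encodes the ordering
// exploit_observed > affected > under_investigation > fixed > not_affected.
public enum VexStatus { NotAffected, Fixed, UnderInvestigation, Affected, ExploitObserved }

public static class VexLattice
{
    // Join of the claims: the highest status in the ordering wins.
    // Throws InvalidOperationException when there are no claims.
    public static VexStatus Merge(IEnumerable<VexStatus> claims) => claims.Max();
}
```

Because the maximum is commutative and associative, the result does not depend on the order in which claims arrive, which is the property the deterministic engine needs from a merge.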
### 7.3 OCI Attachment
Publish delta verdicts as OCI-attached attestations:
- Media type: `application/vnd.stella.delta-verdict+json`
- Attached alongside SBOM + VEX
---
## 8. Engine Architecture
### 8.1 Evaluation Pipeline
1. **Normalize inputs**
   - SBOM: sort by `packageUrl`/`name@version`; resolve aliases
   - VEX: normalize provider → `vex_id`, `product_ref`, `status`
   - Reachability: adjacency lists sorted by node ID; hash after topological ordering
   - Feeds: lock to snapshot (timestamp + commit/hash); no live calls
2. **Policy bundle**
   - Declarative rules compiled to canonical IR
   - Explicit merge precedence (lattice-merge table)
   - Unknowns policy baked in
3. **Evaluation**
   - Build finding set: `(component, vuln, context)` tuples with deterministic IDs
   - Apply lattice-based VEX merge with evidence pointers
   - Compute `status` and `risk_score` using fixed-precision math
4. **Emit**
   - Canonicalize verdict JSON (RFC 8785 JCS)
   - Sign verdict (DSSE/COSE/JWS)
   - Attach as OCI attestation
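
Two details of the evaluation step, deterministic finding IDs and fixed-precision scoring, might look like the following sketch (assuming .NET 6 or later). The `Finding` class, the newline separator, and the weighting formula are all illustrative assumptions, not the engine's actual scheme.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Usage with made-up inputs: the same tuple always produces the same ID.
Console.WriteLine(Finding.Id("pkg:npm/lodash@4.17.20", "CVE-2025-1234", "linux/amd64"));
Console.WriteLine(Finding.Score(baseScore: 7.5m, weight: 0.4m) == 3.0m); // True

public static class Finding
{
    // Deterministic ID: SHA-256 over the tuple fields joined with '\n'
    // (assumption: the fields themselves never contain a newline).
    public static string Id(string component, string vuln, string context) =>
        "sha256:" + Convert.ToHexString(SHA256.HashData(
            Encoding.UTF8.GetBytes(string.Join("\n", component, vuln, context)))).ToLowerInvariant();

    // decimal is a base-10 type, so the result is identical on every platform,
    // unlike double arithmetic that may be fused or reordered by the JIT.
    public static decimal Score(decimal baseScore, decimal weight) =>
        Math.Round(baseScore * weight, 2, MidpointRounding.ToEven);
}
```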
### 8.2 Storage & Indexing
- **CAS (content-addressable store):** `/evidence/<sha256>` for SBOM/VEX/graphs/feeds/policies
- **Verdict registry:** keyed by `(image_digest, manifest_sha, engine_version)`
- **Delta ledger:** append-only, signed; supports cross-agent consensus
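
The CAS layout can be illustrated with an in-memory stand-in (assuming .NET 6 or later). `InMemoryCas` is a hypothetical class for the example; a real store would sit on a filesystem or object storage, but the key derivation and the verify-on-read idea carry over.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

var cas = new InMemoryCas();
string key1 = cas.Put(Encoding.UTF8.GetBytes("{\"a\":1}"));
string key2 = cas.Put(Encoding.UTF8.GetBytes("{\"a\":1}")); // same bytes, same key, stored once
Console.WriteLine(key1 == key2); // True
Console.WriteLine(cas.Count);    // 1

public sealed class InMemoryCas
{
    private readonly Dictionary<string, byte[]> _blobs = new(StringComparer.Ordinal);

    public int Count => _blobs.Count;

    // Stores the blob under /evidence/<sha256> and returns that key.
    public string Put(byte[] canonicalBytes)
    {
        string key = KeyFor(canonicalBytes);
        _blobs.TryAdd(key, (byte[])canonicalBytes.Clone()); // existing entries are never overwritten
        return key;
    }

    // Re-hashes on read so a corrupted blob is detected instead of returned.
    public byte[] Get(string key)
    {
        byte[] blob = _blobs[key];
        if (KeyFor(blob) != key) throw new InvalidOperationException("content hash mismatch");
        return (byte[])blob.Clone();
    }

    private static string KeyFor(byte[] bytes) =>
        "/evidence/" + Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
}
```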
---
## 9. Testing Strategy
### 9.1 Golden Tests
Fixtures of manifests → frozen verdict JSONs (byte-for-byte comparison).
```csharp
[Theory]
[MemberData(nameof(GoldenTestCases))]
public async Task Verdict_MatchesGoldenOutput(string manifestPath, string expectedVerdictPath)
{
    var manifest = await LoadManifest(manifestPath);
    var actual = await _engine.Evaluate(manifest);
    var expected = await File.ReadAllBytesAsync(expectedVerdictPath);

    // Byte-for-byte comparison against the frozen canonical verdict
    Assert.Equal(expected, CanonJson.Canonicalize(actual));
}
```
### 9.2 Chaos Determinism Tests
Vary thread counts, env vars, map iteration seeds; assert identical verdicts.
```csharp
[Fact]
public async Task Verdict_IsDeterministic_AcrossThreadCounts()
{
    var manifest = CreateTestManifest();
    var verdicts = new List<byte[]>();

    for (int threads = 1; threads <= 16; threads++)
    {
        var verdict = await EvaluateWithThreads(manifest, threads);
        verdicts.Add(CanonJson.Canonicalize(verdict));
    }

    // Every run must produce the same canonical bytes as the first
    Assert.All(verdicts, v => Assert.Equal(verdicts[0], v));
}
```
### 9.3 Cross-Engine Round-Trips
Two independent builds of the engine produce the same verdict for the same manifest.
### 9.4 Time-Travel Tests
Replay older feed snapshots to ensure stability.
---
## 10. APIs & Integration
### 10.1 API Endpoints
| Endpoint | Purpose |
|----------|---------|
| `POST /evaluate` | Returns `verdict.json` + attestation |
| `POST /delta` | Returns `delta.json` (signed) |
| `GET /replay?manifest_sha=` | Re-executes with cached snapshots |
| `GET /evidence/:cid` | Fetches immutable evidence blobs |
### 10.2 CLI Commands
```bash
# Evaluate an image
stella evaluate --subject sha256:... --policy prod.json
# Verify delta between versions
stella verify delta --from abc123 --to def456 --print-proofs
# Replay a verdict
stella replay --manifest-sha sha256:... --assert-identical
```
### 10.3 UI Integration
- **Run details → "Verdict" tab:** status, risk score, unknowns, top evidence links
- **"Diff" tab:** render Delta Verdict (added/removed/changed) with drill-down to proofs
- **"Replay" button:** shows exact manifest & engine version; one-click re-evaluation
- **Audit export:** zip of manifest, verdict, delta (if any), attestation, referenced evidence
---
## 11. Implementation Status Matrix
### 11.1 Complete (✅)
| Component | Location | Notes |
|-----------|----------|-------|
| Canonical JSON (JCS) | `StellaOps.Canonical.Json` | RFC 8785 compliant |
| NFC Normalization | `StellaOps.Resolver.NfcStringNormalizer` | Unicode NFC |
| Content-Addressed IDs | `Attestor.ProofChain/Identifiers/` | VerdictId, EvidenceId, GraphRevisionId |
| DSSE Signing | `Signer/`, `Attestor/` | Multiple algorithm support |
| Delta Verdict | `Policy/Deltas/DeltaVerdict.cs` | Full delta computation |
| Merkle Trees | `ProofChain/Merkle/` | Evidence chain verification |
| Determinism Guards | `Policy.Engine/DeterminismGuard/` | Runtime enforcement |
| Replay Manifest | `StellaOps.Replay.Core` | Full manifest serialization |
### 11.2 In Progress (🔄)
| Component | Sprint | Priority |
|-----------|--------|----------|
| Feed Snapshot Coordinator | SPRINT_20251226_007 (DET-GAP-01..04) | P0 |
| Keyless Signing (Fulcio) | SPRINT_20251226_001 | P0 |
| Monthly Bundle Rotation | SPRINT_20251226_002 | P1 |
| Offline Verification | SPRINT_20251226_003 | P2 |
| Cross-Platform Testing | SPRINT_20251226_007 (DET-GAP-11..13) | P1 |
### 11.3 Planned (📋)
| Component | Target | Notes |
|-----------|--------|-------|
| Roslyn Analyzer for Resolver Boundary | Q1 2026 | Compile-time enforcement |
| Pre-canonical Hash Debug Logging | Q1 2026 | Audit trail |
| Consensus Mode | Q2 2026 | Multi-agent verification |
---
## Appendix A: Rollout Plan
### Phase 1: Shadow Mode
Introduce Manifest + canonical verdict format alongside existing policy engine.
### Phase 2: First-Class Verdicts
Make verdicts the first-class artifact (OCI-attached); ship UI "Verdict/Diff".
### Phase 3: Delta Gates
Enforce delta-gates in CI/CD (risk budgets + exception packs referenceable by content ID).
### Phase 4: Consensus Mode
Accept externally signed identical delta verdicts to strengthen trust.
---
## Appendix B: Archive References
The following advisories were consolidated into this document:
| Original File | Archive Location |
|--------------|------------------|
| `25-Dec-2025 - Building a Deterministic Verdict Engine.md` | (kept in place - primary reference) |
| `25-Dec-2025 - Enforcing Canonical JSON for Stable Verdicts.md` | (kept in place - marked superseded) |
| `25-Dec-2025 - Planning Keyless Signing for Verdicts.md` | (kept in place - primary reference) |
| `26-Dec-2026 - SmartDiff as a Core Evidence Primitive.md` | `archived/2025-12-26-superseded/` |
| `26-Dec-2026 - Reachability as Cryptographic Proof.md` | `archived/2025-12-26-superseded/` |
---
## Appendix C: Related Documents
| Document | Relationship |
|----------|--------------|
| [`docs/modules/policy/architecture.md`](../modules/policy/architecture.md) | Policy Engine implementation |
| [`docs/modules/policy/design/deterministic-evaluator.md`](../modules/policy/design/deterministic-evaluator.md) | Evaluator design |
| [`docs/modules/policy/design/policy-determinism-tests.md`](../modules/policy/design/policy-determinism-tests.md) | Test strategy |
| [`docs/modules/scanner/deterministic-execution.md`](../modules/scanner/deterministic-execution.md) | Scanner determinism |
| [`docs/technical/architecture/determinism-specification.md`](../technical/architecture/determinism-specification.md) | Technical specification |