Consolidated Advisory: Diff-Aware Release Gates and Risk Budgets

Status: PLANNED (consolidated reference document)

Created: 2025-12-26

Consolidated From:

25-Dec-2025 - Building a Deterministic Verdict Engine.md (original)

26-Dec-2026 - Diff‑Aware Releases and Auditable Exceptions.md (archived)

26-Dec-2026 - Smart‑Diff as a Core Evidence Primitive.md (archived)

25-Dec-2025 - Visual Diffs for Explainable Triage.md (archived)

26-Dec-2026 - Visualizing the Risk Budget.md (archived)

26-Dec-2026 - Weighted Confidence for VEX Sources.md (archived)

Technical References:

archived/2025-12-21-moat-gap-closure/14-Dec-2025 - Smart-Diff Technical Reference.md

archived/2025-12-21-moat-phase2/20-Dec-2025 - Moat Explanation - Risk Budgets and Diff-Aware Release Gates.md

Executive Summary

This document consolidates StellaOps guidance on diff-aware release gates, risk budgets, delta verdicts, and VEX trust scoring into a single authoritative reference. The core proposition:

Ship fast on low-risk diffs, slow down only when the change warrants it—with deterministic, auditable, replayable evidence at every step.

Key Capabilities

Risk Budgets: Quantitative "capacity to take risk" per service tier, preventing reliability degradation
Diff-Aware Gates: Release strictness scales with what changed, not generic process
Delta Verdicts: Signed, replayable verdicts comparing before/after states
VEX Trust Scoring: Lattice-based merge of conflicting vulnerability evidence
Exception Workflow: Auditable, evidence-backed, auto-expiring exceptions
Visual Diffs: Explainable triage UI showing exactly what changed and why

Implementation Status

Component	Status	Location
Canonical JSON (JCS)	COMPLETE	`StellaOps.Canonical.Json`
Delta Verdict Engine	COMPLETE	`StellaOps.DeltaVerdict.Engine`
Smart-Diff UI	COMPLETE	`TriageWorkspaceComponent`
Proof Tree Visualization	COMPLETE	`ProofTreeComponent`
VEX Merge with Trust Scoring	COMPLETE	`Policy.Engine/VexMerge/`
Exception Entity Model	COMPLETE	`Policy.Engine/Exceptions/`
Risk Budget Dashboard	TODO	Sprint 2025Q1
Feed Snapshot Coordinator	TODO	SPRINT_20251226_007

Table of Contents

1. Core Concepts
2. Risk Budget Model
3. Release Gate Levels
4. Delta Verdict Engine
5. Smart-Diff Algorithm
6. Exception Workflow
7. VEX Trust Scoring
8. UI/UX Patterns
9. CI/CD Integration
10. Data Models

1. Core Concepts

1.1 SBOM, VEX, and Reachability

SBOM (Software Bill of Materials): Complete inventory of components (CycloneDX 1.6 / SPDX 3.0.1)
VEX (Vulnerability Exploitability eXchange): Claims about whether vulnerabilities affect a specific product
Reachability: Analysis of whether vulnerable code paths are actually exercised at runtime

1.2 Semantic Delta

A semantic delta captures meaningful differences between two states:

Components added/removed/updated
Reachability edges added/removed
VEX claim transitions (affected → not_affected)
Configuration/feature flag changes
Attestation/provenance changes

1.3 Determinism-First Principles

All verdict computations must be:

Reproducible: Same inputs → identical outputs, always
Content-addressed: Every input identified by cryptographic hash
Declarative: Compact manifest lists all input hashes + engine version
Pure: No wall-clock time, no random iteration, no network during evaluation

2. Risk Budget Model

2.1 Service Tiers

Each service/product component must be assigned a Criticality Tier:

Tier	Description	Monthly Budget (RP)
Tier 0	Internal only, low business impact	300
Tier 1	Customer-facing non-critical	200
Tier 2	Customer-facing critical	120
Tier 3	Safety/financial/data-critical	80

2.2 Risk Point Scoring

Release Risk Score (RRS) = Base + Diff Risk + Operational Context − Mitigations

Base (by criticality):

Tier 0: +1
Tier 1: +3
Tier 2: +6
Tier 3: +10

Diff Risk (additive):

Change Type	Points
Docs, comments, non-executed code	+1
UI changes, refactors with high coverage	+3
API contract changes, dependency upgrades	+6
Database schema migrations, auth logic	+10
Infra/networking, encryption, payment flows	+15

Operational Context (additive):

Condition	Points
Active incident or recent Sev1/Sev2	+5
Error budget < 50% remaining	+3
High on-call load	+2
Release during freeze window	+5

Mitigations (subtract):

Control	Points
Feature flag with staged rollout + kill switch	−3
Canary + automated health gates + tested rollback	−3
High-confidence integration coverage	−2
Backward-compatible migration with proven rollback	−2
Change isolated behind permission boundary	−2
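
As a rough illustration, the scoring formula above can be written as a small function. The function name, the argument shapes, and the example release are assumptions made for this sketch; only the point values come from the tables in this section.

```python
# Base points per criticality tier, taken from the "Base" list above.
BASE_BY_TIER = {0: 1, 1: 3, 2: 6, 3: 10}


def release_risk_score(tier, diff_points, context_points, mitigation_points):
    """RRS = Base + Diff Risk + Operational Context - Mitigations.

    Mitigations are passed as positive numbers and subtracted here.
    The result is not floored; whether RRS may drop below 1 is not
    specified in this document.
    """
    return (
        BASE_BY_TIER[tier]
        + sum(diff_points)
        + sum(context_points)
        - sum(mitigation_points)
    )


# Hypothetical release: Tier 2 service, dependency upgrade (+6),
# error budget below 50% (+3), canary with tested rollback (-3).
rrs = release_risk_score(tier=2, diff_points=[6], context_points=[3], mitigation_points=[3])
print(rrs)  # 6 + 6 + 3 - 3 = 12
```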

2.3 Budget Thresholds

Status	Remaining	Action
Green	≥60%	Normal operation
Yellow	30–59%	Gates tighten by 1 level for medium/high-risk diffs
Red	<30%	Freeze high-risk diffs; allow only low-risk or reliability work
Exhausted	≤0%	Incident/security fixes only with explicit sign-off
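
The threshold table can be sketched as a classification function. This is a minimal sketch: the function name is hypothetical, and treating Yellow as the half-open range from 30% up to (but not including) 60% is an assumption, since the table leaves fractional values between 59% and 60% unspecified.

```python
def budget_status(budget_rp: float, consumed_rp: float) -> str:
    """Classify a risk budget by the percentage of risk points remaining."""
    remaining_pct = 100.0 * (budget_rp - consumed_rp) / budget_rp
    if remaining_pct <= 0:
        return "exhausted"
    if remaining_pct < 30:
        return "red"
    if remaining_pct < 60:  # assumption: Yellow covers [30, 60)
        return "yellow"
    return "green"


# Tier 1 service with a 200 RP monthly budget (from the tier table).
print(budget_status(200, 80))   # 60% remaining -> green
print(budget_status(200, 150))  # 25% remaining -> red
```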

2.4 Risk Budget Visualization

The Risk Budget Burn-Up Chart is the key PM dashboard:

X-axis: Calendar dates up to code freeze
Y-axis: Risk points
Budget line: Allowable risk over time (flat or stepped)
Actual Risk line: Cumulative unknowns + knowns − mitigations
Shaded area: Headroom (green) or Overrun (red)
Vertical markers: Feature freeze, pen-test start, dependency bumps
Burn targets: Dotted lines showing required pace

Dashboard KPIs:

"Headroom: 28 pts (green)"
"Unknowns↑ +6 (24h)", "Risk retired −18 (7d)"
"Exceptions expiring: 3"
"At current burn, overrun in 5 days"

3. Release Gate Levels

3.1 Gate Definitions

G0 — No-risk / Administrative

Use for: docs-only, comments-only, non-functional metadata

Requirements:

Lint/format checks
Basic CI pass (build)

G1 — Low Risk

Use for: small localized changes with strong unit coverage, non-core UI, telemetry additions

Requirements:

All automated unit tests
Static analysis/linting
1 peer review
Automated deploy to staging
Post-deploy smoke checks

G2 — Moderate Risk

Use for: moderate logic changes in customer-facing paths, dependency upgrades, backward-compatible API changes

Requirements:

G1 +
Integration tests for impacted modules
Code owner review
Feature flag required if customer impact possible
Staged rollout: canary or small cohort
Rollback plan documented in PR

G3 — High Risk

Use for: schema migrations, auth/permission changes, core business logic, infra changes

Requirements:

G2 +
Security scan + dependency audit
Migration plan (forward + rollback) reviewed
Load/performance checks if in hot path
New/updated dashboards/alerts
Release captain sign-off
Progressive delivery with automatic health gates

G4 — Very High Risk / Safety-Critical

Use for: Tier 3 systems with low budget, freeze window exceptions, broad blast radius, post-incident remediation

Requirements:

G3 +
Formal risk review (PM+DM+Security/SRE) in writing
Explicit rollback rehearsal
Extended canary with success/abort criteria
Customer comms plan if impact plausible
Post-release verification checklist executed

3.2 Gate Selection Logic

1. Compute RRS from diff + context
2. Map RRS to default gate:
   - 1–5 RP → G1
   - 6–12 RP → G2
   - 13–20 RP → G3
   - 21+ RP → G4
3. Apply modifiers:
   - Budget Yellow → escalate one gate for ≥G2
   - Budget Red → escalate one gate for ≥G1, block high-risk unless exception
   - Active incident → block non-fix releases by default
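
The selection steps above might look like the following sketch. Several details are assumptions and are marked as such in the comments: "high-risk" is read as a default gate of G3 or above, escalation is capped at G4, and the Exhausted budget state is not modeled because the selection logic does not mention it. All function and parameter names are hypothetical.

```python
def default_gate(rrs: int) -> int:
    """Map a Release Risk Score to a default gate number (1..4)."""
    if rrs >= 21:
        return 4
    if rrs >= 13:
        return 3
    if rrs >= 6:
        return 2
    return 1  # the mapping starts at 1 RP; lower scores are treated as G1 here


def select_gate(rrs, budget_status, active_incident=False,
                is_fix=False, has_exception=False):
    """Return (gate_name, blocked) after applying the budget and incident modifiers."""
    gate = default_gate(rrs)
    high_risk = gate >= 3  # assumption: "high-risk" means default gate G3+
    blocked = False

    if budget_status == "yellow" and gate >= 2:
        gate = min(gate + 1, 4)  # assumption: escalation is capped at G4
    elif budget_status == "red":
        if high_risk and not has_exception:
            blocked = True
        gate = min(gate + 1, 4)

    if active_incident and not is_fix:
        blocked = True

    return f"G{gate}", blocked


print(select_gate(12, "green"))   # ('G2', False)
print(select_gate(12, "yellow"))  # ('G3', False)
print(select_gate(15, "red"))     # ('G4', True)
```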

4. Delta Verdict Engine

4.1 Core Architecture

The delta verdict engine computes deterministic, signed verdicts comparing two states:

Verdict = f(Manifest)

Where Manifest contains:

sbom_sha256 - SBOM graph hash
vex_set_sha256[] - VEX document hashes
reach_subgraph_sha256 - Reachability graph hash
feeds_snapshot_sha256 - Feed snapshot hash
policy_bundle_sha256 - Policy/rules hash
engine_version - Engine version for reproducibility
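
To illustrate how a manifest can itself be content-addressed, the sketch below hashes a manifest with a simplified canonical encoding. It is not an RFC 8785 implementation: sorted keys with compact separators only coincide with JCS for simple inputs (ASCII keys, string values, small integers). Sorting the VEX hash list before encoding is also an assumption, made so that the order in which VEX documents were collected does not affect the digest. The function name and placeholder hash values are hypothetical.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """Return a sha256 digest of a manifest using a simplified canonical JSON form."""
    normalized = dict(manifest)
    # Assumption: the VEX hash list is a set semantically, so sort it first.
    normalized["vex_set_sha256"] = sorted(manifest.get("vex_set_sha256", []))
    encoded = json.dumps(
        normalized, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


# Placeholder hashes for illustration only.
a = {
    "sbom_sha256": "sha256:aaa",
    "vex_set_sha256": ["sha256:v2", "sha256:v1"],
    "engine_version": "1.0.0",
}
b = {
    "engine_version": "1.0.0",
    "vex_set_sha256": ["sha256:v1", "sha256:v2"],
    "sbom_sha256": "sha256:aaa",
}
print(manifest_digest(a) == manifest_digest(b))  # True: key and list order do not matter
```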

4.2 Evaluation Pipeline

1. Normalize inputs
   - SBOM: sort by packageUrl/name@version; resolve aliases
   - VEX: normalize provider → vex_id, product_ref, status
   - Reachability: adjacency lists sorted by node ID; hash after topological ordering
   - Feeds: lock to snapshot (timestamp + commit/hash); no live calls
2. Policy bundle
   - Declarative rules compiled to canonical IR
   - Explicit merge precedence (lattice-merge table)
   - Unknowns policy baked in: e.g., fail_if_unknowns > N in prod
3. Evaluation
   - Build finding set: (component, vuln, context) tuples with deterministic IDs
   - Apply lattice-based VEX merge with evidence pointers
   - Compute status and risk_score using fixed-precision math
4. Emit
   - Canonicalize verdict JSON (RFC 8785 JCS)
   - Sign verdict (DSSE/COSE/JWS)
   - Attach as OCI attestation to image/digest
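
The SBOM normalization step can be illustrated with a short sketch: components are sorted by a stable key before hashing, so the order in which a generator emits them does not change the digest. The field names (`purl`, `name`, `version`) and the function name are assumptions for this example, and alias resolution is omitted.

```python
import hashlib
import json


def sbom_digest(components):
    """Hash a component list independently of its input order."""
    def sort_key(component):
        # Sort by package URL first, then name and version as tie-breakers.
        return (
            component.get("purl", ""),
            component.get("name", ""),
            component.get("version", ""),
        )

    ordered = sorted(components, key=sort_key)
    encoded = json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


c1 = {"purl": "pkg:npm/left-pad@1.3.0", "name": "left-pad", "version": "1.3.0"}
c2 = {"purl": "pkg:npm/lodash@4.17.21", "name": "lodash", "version": "4.17.21"}
print(sbom_digest([c1, c2]) == sbom_digest([c2, c1]))  # True
```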

4.3 Delta Verdict Structure

{
  "subject": {"ociDigest": "sha256:..."},
  "inputs": {
    "feeds": [{"type":"cve","digest":"sha256:..."}],
    "tools": {"sbomer":"1.6.3","reach":"0.9.0","policy":"lattice-2025.12"},
    "baseline": {"sbomG":"sha256:...","vexSet":"sha256:..."}
  },
  "delta": {
    "components": {"added":[...],"removed":[...],"updated":[...]},
    "reachability": {"edgesAdded":[...],"edgesRemoved":[...]},
    "settings": {"changed":[...]},
    "vex": [{"cve":"CVE-2025-1234","from":"affected","to":"not_affected",
             "reason":"config_flag_off","evidenceRef":"att#cfg-42"}],
    "attestations": {"changed":[...]}
  },
  "verdict": {
    "decision": "allow",
    "riskBudgetUsed": 2,
    "policyId": "lattice-2025.12",
    "explanationRefs": ["vex[0]","reachability.edgesRemoved[3]"]
  },
  "signing": {"dsse":"...","signer":"stella-authority"}
}

4.4 Replay Contract

For deterministic replay, pin and record:

Feed snapshots + hashes
Scanner versions + rule packs + lattice/policy version
SBOM generator version + mode
Reachability engine settings
Merge semantics ID

Replayer re-hydrates exact inputs and must reproduce the same verdict bit-for-bit.

5. Smart-Diff Algorithm

5.1 Material Risk Change Detection

FindingKey: (component_purl, component_version, cve_id)

RiskState Fields:

reachable: bool | unknown
vex_status: enum (AFFECTED | NOT_AFFECTED | FIXED | UNDER_INVESTIGATION | UNKNOWN)
in_affected_range: bool | unknown
kev: bool
epss_score: float | null
policy_flags: set<string>
evidence_links: list<EvidenceLink>

5.2 Change Detection Rules

Rule R1: Reachability Flip

reachable changes: false → true (risk ↑) or true → false (risk ↓)

Rule R2: VEX Status Flip

Meaningful changes: AFFECTED ↔ NOT_AFFECTED, UNDER_INVESTIGATION → NOT_AFFECTED

Rule R3: Affected Range Boundary

in_affected_range flips: false → true or true → false

Rule R4: Intelligence/Policy Flip

kev changes false → true
epss_score crosses configured threshold
policy_flag changes severity (warn → block)
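
Rules R1 to R4 can be sketched as a comparison of two risk states. This is an illustration under assumptions, not the engine's implementation: the reason labels, the `policy_decision` field (a simplification of `policy_flags`), and the default EPSS threshold are hypothetical, and transitions to or from an unknown value are not flagged because the rules above do not list them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RiskState:
    reachable: Optional[bool]          # None means unknown
    vex_status: str
    in_affected_range: Optional[bool]  # None means unknown
    kev: bool
    epss_score: Optional[float]
    policy_decision: str = "none"      # hypothetical: "none" | "warn" | "block"


MEANINGFUL_VEX_FLIPS = {
    ("AFFECTED", "NOT_AFFECTED"),
    ("NOT_AFFECTED", "AFFECTED"),
    ("UNDER_INVESTIGATION", "NOT_AFFECTED"),
}


def material_changes(old: RiskState, new: RiskState, epss_threshold: float = 0.5):
    """Return the list of rule labels that fired between two states."""
    reasons = []
    # R1: reachability flips between known values.
    if {old.reachable, new.reachable} == {True, False}:
        reasons.append("REACHABILITY_FLIP")
    # R2: meaningful VEX status transitions.
    if (old.vex_status, new.vex_status) in MEANINGFUL_VEX_FLIPS:
        reasons.append("VEX_FLIP")
    # R3: affected-range boundary crossed.
    if {old.in_affected_range, new.in_affected_range} == {True, False}:
        reasons.append("RANGE_FLIP")
    # R4: intelligence and policy flips.
    if not old.kev and new.kev:
        reasons.append("KEV_ADDED")
    if old.epss_score is not None and new.epss_score is not None:
        if (old.epss_score < epss_threshold) != (new.epss_score < epss_threshold):
            reasons.append("EPSS_THRESHOLD_CROSSED")
    if old.policy_decision == "warn" and new.policy_decision == "block":
        reasons.append("POLICY_FLIP")
    return reasons


before = RiskState(False, "NOT_AFFECTED", False, False, 0.1)
after = RiskState(True, "AFFECTED", False, True, 0.7)
print(material_changes(before, after))
```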

5.3 Suppression Rules

All must apply for suppression:

reachable == false
vex_status == NOT_AFFECTED
kev == false
No policy override

Patch Churn Suppression:

If version changes AND in_affected_range remains false in both AND no KEV/policy flip → suppress
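
Both suppression rules reduce to conjunctions, as the sketch below shows. The function and parameter names are hypothetical. Unknown values (`None`) never satisfy a condition here, which keeps unknowns from being silently treated as safe.

```python
def is_suppressed(reachable, vex_status, kev, policy_override):
    """All four conditions must hold; an unknown (None) reachability does not qualify."""
    return (
        reachable is False
        and vex_status == "NOT_AFFECTED"
        and kev is False
        and not policy_override
    )


def is_patch_churn(version_changed, old_in_range, new_in_range, kev_or_policy_flip):
    """Version bump that stays outside the affected range with no KEV or policy flip."""
    return (
        version_changed
        and old_in_range is False
        and new_in_range is False
        and not kev_or_policy_flip
    )


print(is_suppressed(False, "NOT_AFFECTED", False, False))  # True
print(is_suppressed(None, "NOT_AFFECTED", False, False))   # False: reachability unknown
```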

5.4 Priority Score Formula

score =
  + 1000 if new.kev
  + 500 if new.reachable
  + 200 if reason includes RANGE_FLIP to affected
  + 150 if VEX_FLIP to AFFECTED
  + 0..100 based on EPSS (epss * 100)
  + policy weight: +300 if decision BLOCK, +100 if WARN
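
A direct transcription of the formula into code might look like this. The reason labels and parameter names are hypothetical, and rounding the EPSS term to an integer is an assumption made to keep the score integral.

```python
def priority_score(kev, reachable, reasons, epss, policy_decision):
    """Sum the priority terms for one finding in its new state."""
    score = 0
    if kev:
        score += 1000
    if reachable is True:  # unknown reachability adds nothing
        score += 500
    if "RANGE_FLIP_TO_AFFECTED" in reasons:
        score += 200
    if "VEX_FLIP_TO_AFFECTED" in reasons:
        score += 150
    if epss is not None:
        score += round(epss * 100)  # assumption: integer rounding
    if policy_decision == "BLOCK":
        score += 300
    elif policy_decision == "WARN":
        score += 100
    return score


# KEV-listed, reachable, VEX flipped to AFFECTED, EPSS 0.5, policy BLOCK.
print(priority_score(True, True, {"VEX_FLIP_TO_AFFECTED"}, 0.5, "BLOCK"))  # 2000
```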

5.5 Reachability Gate (3-Bit Severity)

public sealed record ReachabilityGate(
    bool? Reachable,       // true / false / null for unknown
    bool? ConfigActivated,
    bool? RunningUser,
    int Class,             // 0..7 derived from the bits when all known
    string Rationale
);

Class Computation: 0-7 based on 3 binary gates (reachable, config-activated, running user)

Unknown Handling: Never silently treat null as false or true. If any bit is null, set Class = -1 or compute from known bits only.
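
One way to compute the class is sketched below. The bit order (reachable as the most significant bit, running user as the least) is an assumption inferred from the predicate example in section 10.3, where reachable=true, configActivated=true, runningUser=false yields class 6. The sketch takes the `-1` option for unknowns rather than computing from known bits.

```python
from typing import Optional


def gate_class(reachable: Optional[bool],
               config_activated: Optional[bool],
               running_user: Optional[bool]) -> int:
    """Return 0..7 when all three bits are known, otherwise -1."""
    bits = (reachable, config_activated, running_user)
    if any(bit is None for bit in bits):
        return -1  # never coerce unknown to true or false
    return (int(reachable) << 2) | (int(config_activated) << 1) | int(running_user)


print(gate_class(True, True, False))  # 6
print(gate_class(True, None, False))  # -1
```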

6. Exception Workflow

6.1 Exception Entity Model

public record Exception(
    string Id,
    string Scope,           // image:repo/app:tag, component:pkg@ver
    string Subject,         // CVE-2025-1234, package name
    string Reason,          // Human-readable justification
    List<string> EvidenceRefs,  // att:sha256:..., vex:sha256:...
    string CreatedBy,
    DateTime CreatedAt,
    DateTime? ExpiresAt,
    string PolicyBinding,
    string Signature
);

6.2 Exception Requirements

Signed rationale + evidence: Justification with linked proofs (attestation IDs, VEX note, reachability subgraph slice)
Auto-expiry & revalidation: Scheduler re-tests on expiry or when feeds mark "fix available / EPSS ↑ / reachability ↑"
Audit view: Timeline of exception lifecycle (who/why, evidence, re-checks)
Policy hooks: "allow only if: reason ∧ evidence present ∧ max TTL ≤ X ∧ owner = team-Y"
Inheritance: repo→image→env scoping with explicit shadowing
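
The policy hook above can be read as a predicate over an exception record. The sketch below is one possible reading: the `owner` field, the function name, and the rule that a missing expiry fails the TTL check are assumptions. The current time is passed in explicitly rather than read from the clock, in line with the determinism principles in section 1.3.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence


def exception_is_valid(reason: str, evidence_refs: Sequence[str], owner: str,
                       created_at: datetime, expires_at: Optional[datetime],
                       now: datetime, *, max_ttl: timedelta,
                       required_owner: str) -> bool:
    """reason AND evidence present AND TTL <= max AND owner matches AND not expired."""
    if not reason.strip() or not evidence_refs:
        return False
    if owner != required_owner:
        return False
    # Assumption: an exception without an expiry exceeds any maximum TTL.
    if expires_at is None or expires_at - created_at > max_ttl:
        return False
    return now < expires_at


created = datetime(2025, 12, 26, tzinfo=timezone.utc)
ok = exception_is_valid(
    reason="Feature disabled",
    evidence_refs=["att:sha256:..."],  # placeholder reference, as in the CLI example
    owner="team-y",
    created_at=created,
    expires_at=created + timedelta(days=30),
    now=created + timedelta(days=10),
    max_ttl=timedelta(days=30),
    required_owner="team-y",
)
print(ok)  # True
```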

6.3 Exception CLI

stella exception create \
  --cve CVE-2025-1234 \
  --scope image:repo/app:tag \
  --reason "Feature disabled" \
  --evidence att:sha256:... \
  --ttl 30d

6.4 Break-Glass Policy

Exceptions permitted only for:

Incident mitigation or customer harm prevention
Urgent security fix (actively exploited or high severity)
Legal/compliance deadline

Requirements:

Recorded rationale in PR/release ticket
Named approvers: DM + on-call owner; PM for customer-impacting risk
Mandatory follow-up within 5 business days
Budget penalty: +50% of change's RRS

7. VEX Trust Scoring

7.1 Evidence Atoms

For every VEX statement, extract:

scope: package@version, image@digest, file hash
claim: affected, not_affected, under_investigation, fixed
reason: reachable?, feature flag off, vulnerable code not present
provenance: who said it, how it's signed
when: issued_at, observed_at, expires_at
supporting artifacts: SBOM ref, in-toto link, CVE IDs

7.2 Confidence Score (C: 0–1)

Sum the weights of the factors that apply, cap at 1:

Factor	Weight
DSSE + Sigstore/Rekor inclusion	0.35
Hardware-backed key or org OIDC	0.15
NVD source	0.20
Major distro PSIRT	0.20
Upstream vendor	0.20
Reputable CERT	0.15
Small vendor	0.10
Reachability proof/test	0.25
Code diff linking	0.20
Deterministic build link	0.15
"Reason" present	0.10
≥2 independent concurring sources	+0.10
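
A minimal sketch of the confidence computation follows, assuming the weights are additive (the worked example in 7.6 adds them). Weights are stored as integer hundredths so that sums stay exact. The factor keys and the function name are hypothetical labels for the rows of the table above.

```python
# Weights in hundredths, one entry per row of the table above.
CONFIDENCE_WEIGHTS = {
    "dsse_rekor": 35,
    "hardware_key_or_oidc": 15,
    "nvd": 20,
    "distro_psirt": 20,
    "upstream_vendor": 20,
    "reputable_cert": 15,
    "small_vendor": 10,
    "reachability_proof": 25,
    "code_diff_link": 20,
    "deterministic_build_link": 15,
    "reason_present": 10,
    "concurring_sources": 10,
}


def confidence(factors) -> float:
    """Sum the weights of the applicable factors and cap the result at 1.0."""
    total = sum(CONFIDENCE_WEIGHTS[name] for name in set(factors))
    return min(total, 100) / 100


print(confidence({"nvd", "reason_present"}))  # 0.3
print(confidence(CONFIDENCE_WEIGHTS))         # 1.0 (capped)
```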

7.3 Freshness Score (F: 0–1)

F = exp(−Δdays / τ)

τ values by source class:

Vendor VEX: τ = 30
NVD: τ = 90
Exploit-active feeds: τ = 14

Update reset: A new attestation with the same subject resets Δdays.
Expiry clamp: If now > expires_at, set F = 0.
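
The decay formula and the expiry clamp can be sketched as follows. The source-class keys and the function name are hypothetical, and clamping negative ages to zero is an added safeguard that the text does not specify.

```python
import math

# Decay constants in days per source class, from the list above.
TAU_DAYS = {"vendor_vex": 30.0, "nvd": 90.0, "exploit_active": 14.0}


def freshness(source_class: str, age_days: float, expired: bool = False) -> float:
    """F = exp(-age_days / tau), or 0 once the statement has expired."""
    if expired:
        return 0.0
    age = max(age_days, 0.0)  # assumption: future-dated statements count as age 0
    return math.exp(-age / TAU_DAYS[source_class])


print(round(freshness("vendor_vex", 7), 2))  # 0.79
print(round(freshness("nvd", 180), 2))       # 0.14
```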

7.4 Claim Strength (S_claim)

Claim	Base Weight
not_affected	0.9
fixed	0.8
affected	0.7
under_investigation	0.4

Reason adjustments (added to the base weight; result capped at 1.0):

reachable? → +0.15 to "affected"
"feature flag off" → +0.10 to "not_affected"
platform mismatch → +0.10
backport patch note (with commit hash) → +0.10

7.5 Lattice Merge

Per evidence e:

Score(e) = C(e) × F(e) × S_claim(e)

Merge in distributive lattice ordered by:

Claim precedence: not_affected > fixed > affected > under_investigation
Break ties by Score(e)
If competing top claims within ε (0.05), escalate to "disputed" and surface both with proofs
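
The merge rules above leave some room for interpretation. The sketch below takes one reading: evidence is ranked by claim precedence first and score second, and the result is "disputed" when an item with a different claim scores within ε of the top item. The class, the function name, and the sample values are hypothetical.

```python
from dataclasses import dataclass

# Claim precedence from the list above: a higher number wins.
PRECEDENCE = {"not_affected": 3, "fixed": 2, "affected": 1, "under_investigation": 0}
EPSILON = 0.05


@dataclass(frozen=True)
class Evidence:
    source: str
    claim: str
    confidence: float      # C(e)
    freshness: float       # F(e)
    claim_strength: float  # S_claim(e)

    @property
    def score(self) -> float:
        return self.confidence * self.freshness * self.claim_strength


def merge(evidence):
    """Return (status, supporting evidence) for one finding."""
    if not evidence:
        raise ValueError("at least one evidence item is required")
    ranked = sorted(evidence, key=lambda e: (PRECEDENCE[e.claim], e.score), reverse=True)
    top = ranked[0]
    # Competing claims whose score is within epsilon of the top item.
    rivals = [e for e in ranked[1:]
              if e.claim != top.claim and abs(top.score - e.score) <= EPSILON]
    if rivals:
        return "disputed", [top] + rivals
    return top.claim, [top]


vendor = Evidence("vendor", "not_affected", 0.8, 0.9, 1.0)  # score about 0.72
nvd = Evidence("nvd", "affected", 0.2, 0.5, 0.7)            # score about 0.07
status, support = merge([nvd, vendor])
print(status)  # not_affected
```

Under this reading a lower-precedence claim never overrides a higher-precedence one on score alone; it can only force a "disputed" result when the scores are close.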

7.6 Worked Example

Small vendor Sigstore VEX (signed, reason: code path unreachable, issued 7 days ago):

C ≈ 0.35 (DSSE + Sigstore) + 0.10 (small vendor) + 0.10 (reason present) + 0.25 (reachability proof) = 0.80
F = exp(−7/30) ≈ 0.79
S_claim = 0.9 + 0.10 = 1.0 (capped)
Score ≈ 0.80 × 0.79 × 1.0 = 0.63

NVD entry (affected, no reasoning, 180 days old):

C ≈ 0.20
F = exp(−180/90) ≈ 0.14
S_claim = 0.7
Score ≈ 0.20 × 0.14 × 0.7 = 0.02

Outcome: Vendor VEX wins → not_affected with linked proofs.

8. UI/UX Patterns

8.1 Three-Pane Layout

Categories Pane: Filterable list of change categories
Items Pane: Delta items within selected category
Proof Pane: Evidence details for selected item

8.2 Visual Diff Components

Component	Purpose
`DeltaSummaryStripComponent`	Risk delta header: "Risk ↓ Medium → Low"
`ProofPaneComponent`	Evidence rail with witness paths
`VexMergeExplanationComponent`	Trust algebra visualization
`CompareViewComponent`	Side-by-side before/after
`TriageShortcutsService`	Keyboard navigation

8.3 Micro-interactions

Hover changed node → inline badge explaining why it changed
Click rule change → spotlight the exact subgraph it affected
"Explain like I'm new" toggle → expand jargon into plain language
"Copy audit bundle" → export delta + evidence as attachment

8.4 Hotkeys

Key	Action
`1`	Focus changes only
`2`	Show full graph
`E`	Expand evidence
`A`	Export audit

8.5 Empty States

Incomplete evidence: Yellow "Unknowns present" ribbon with count and collection button
Huge graphs: Default to "changed neighborhood only" with mini-map

9. CI/CD Integration

9.1 API Endpoints

Endpoint	Purpose
`POST /evaluate`	Returns `verdict.json` + attestation
`POST /delta`	Returns `delta.json` (signed)
`GET /replay?manifest_sha=`	Re-executes with cached snapshots
`GET /evidence/:cid`	Fetches immutable evidence blobs

9.2 CLI Commands

# Verify delta between two versions
stella verify delta \
  --from abc123 \
  --to def456 \
  --policy prod.json \
  --print-proofs

# Create exception
stella exception create \
  --cve CVE-2025-1234 \
  --scope image:repo/app:tag \
  --reason "Feature disabled" \
  --evidence att:sha256:... \
  --ttl 30d

# Replay a verdict
stella replay \
  --manifest-sha sha256:... \
  --assert-identical

9.3 Exit Codes

Code	Meaning
0	PASS - Release allowed
1	FAIL - Gate blocked
2	WARN - Proceed with caution
3	ERROR - Evaluation failed

9.4 Pipeline Recipe

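release-gate:
  script:
    - |
      # Capture the exit code in the same shell block: runners that stop on the
      # first failing command never reach a separate `$?` check.
      rc=0
      stella evaluate --subject "$IMAGE_DIGEST" --policy "$GATE_POLICY" || rc=$?
      case "$rc" in
        0) echo "Gate passed" ;;
        2) echo "Gate passed with warnings - proceed with caution" ;;
        1) echo "Gate blocked - risk budget exceeded or policy violation"; exit 1 ;;
        *) echo "Evaluation failed (exit code $rc)"; exit "$rc" ;;
      esac
    - stella delta --from "$BASELINE" --to "$IMAGE_DIGEST" --export audit-bundle.zip
  artifacts:
    paths:
      - audit-bundle.zip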

10. Data Models

10.1 Scan Manifest

{
  "sbom_sha256": "sha256:...",
  "vex_set_sha256": ["sha256:..."],
  "reach_subgraph_sha256": "sha256:...",
  "feeds_snapshot_sha256": "sha256:...",
  "policy_bundle_sha256": "sha256:...",
  "engine_version": "1.0.0",
  "policy_semver": "2025.12",
  "options_hash": "sha256:..."
}

10.2 Verdict

{
  "risk_score": 42,
  "status": "pass|warn|fail",
  "unknowns_count": 3,
  "evidence_refs": ["sha256:...", "sha256:..."],
  "explanations": [
    {"template": "CVE-{cve} suppressed by VEX claim from {source}",
     "params": {"cve": "2025-1234", "source": "vendor"}}
  ]
}

10.3 Smart-Diff Predicate

{
  "predicateType": "stellaops.dev/predicates/smart-diff@v1",
  "predicate": {
    "baseImage": {"name":"...", "digest":"sha256:..."},
    "targetImage": {"name":"...", "digest":"sha256:..."},
    "diff": {
      "filesAdded": [...],
      "filesRemoved": [...],
      "filesChanged": [{"path":"...", "hunks":[...]}],
      "packagesChanged": [{"name":"openssl","from":"1.1.1u","to":"3.0.14"}]
    },
    "context": {
      "entrypoint":["/app/start"],
      "env":{"FEATURE_X":"true"},
      "user":{"uid":1001,"caps":["NET_BIND_SERVICE"]}
    },
    "reachabilityGate": {"reachable":true,"configActivated":true,"runningUser":false,"class":6}
  }
}

Appendix A: Success Metrics

Metric	Description
Mean Time to Explain (MTTE)	Time from "why did this change?" to "Understood"
Change Failure Rate	% of releases causing incidents
MTTR	Mean time to recovery
Gate Compliance Rate	% of releases following required gates
Budget Utilization	Actual RP consumed vs. allocated

Appendix B: Related Documents

Document	Relationship
`docs/modules/policy/architecture.md`	Policy Engine implementation
`docs/modules/scanner/architecture.md`	Scanner/Reachability implementation
`docs/modules/web/smart-diff-ui-architecture.md`	UI component specifications
`SPRINT_20251226_007_BE_determinism_gaps.md`	Determinism implementation sprint

Appendix C: Archive References

The following advisories were consolidated into this document:

Original File	Archive Location
`25-Dec-2025 - Building a Deterministic Verdict Engine.md`	(kept in place - primary reference)
`26-Dec-2026 - Diff‑Aware Releases and Auditable Exceptions.md`	`archived/2025-12-26-superseded/`
`26-Dec-2026 - Smart‑Diff as a Core Evidence Primitive.md`	`archived/2025-12-26-superseded/`
`25-Dec-2025 - Visual Diffs for Explainable Triage.md`	`archived/2025-12-26-triage-advisories/`
`26-Dec-2026 - Visualizing the Risk Budget.md`	`archived/2025-12-26-triage-advisories/`
`26-Dec-2026 - Weighted Confidence for VEX Sources.md`	`archived/2025-12-26-vex-scoring/`

Technical References (not moved):

archived/2025-12-21-moat-gap-closure/14-Dec-2025 - Smart-Diff Technical Reference.md
archived/2025-12-21-moat-phase2/20-Dec-2025 - Moat Explanation - Risk Budgets and Diff-Aware Release Gates.md
