Consolidated Advisory: Diff-Aware Release Gates and Risk Budgets
Status: PLANNED (consolidated reference document)
Created: 2025-12-26
Consolidated From:
- 25-Dec-2025 - Building a Deterministic Verdict Engine.md (original)
- 26-Dec-2026 - Diff‑Aware Releases and Auditable Exceptions.md (archived)
- 26-Dec-2026 - Smart‑Diff as a Core Evidence Primitive.md (archived)
- 25-Dec-2025 - Visual Diffs for Explainable Triage.md (archived)
- 26-Dec-2026 - Visualizing the Risk Budget.md (archived)
- 26-Dec-2026 - Weighted Confidence for VEX Sources.md (archived)
Technical References:
- archived/2025-12-21-moat-gap-closure/14-Dec-2025 - Smart-Diff Technical Reference.md
- archived/2025-12-21-moat-phase2/20-Dec-2025 - Moat Explanation - Risk Budgets and Diff-Aware Release Gates.md
Executive Summary
This document consolidates StellaOps guidance on diff-aware release gates, risk budgets, delta verdicts, and VEX trust scoring into a single authoritative reference. The core proposition:
Ship fast on low-risk diffs, slow down only when the change warrants it—with deterministic, auditable, replayable evidence at every step.
Key Capabilities
- Risk Budgets: Quantitative "capacity to take risk" per service tier, preventing reliability degradation
- Diff-Aware Gates: Release strictness scales with what changed, not generic process
- Delta Verdicts: Signed, replayable verdicts comparing before/after states
- VEX Trust Scoring: Lattice-based merge of conflicting vulnerability evidence
- Exception Workflow: Auditable, evidence-backed, auto-expiring exceptions
- Visual Diffs: Explainable triage UI showing exactly what changed and why
Implementation Status
| Component | Status | Location |
|---|---|---|
| Canonical JSON (JCS) | COMPLETE | StellaOps.Canonical.Json |
| Delta Verdict Engine | COMPLETE | StellaOps.DeltaVerdict.Engine |
| Smart-Diff UI | COMPLETE | TriageWorkspaceComponent |
| Proof Tree Visualization | COMPLETE | ProofTreeComponent |
| VEX Merge with Trust Scoring | COMPLETE | Policy.Engine/VexMerge/ |
| Exception Entity Model | COMPLETE | Policy.Engine/Exceptions/ |
| Risk Budget Dashboard | TODO | Sprint 2025Q1 |
| Feed Snapshot Coordinator | TODO | SPRINT_20251226_007 |
Table of Contents
- Core Concepts
- Risk Budget Model
- Release Gate Levels
- Delta Verdict Engine
- Smart-Diff Algorithm
- Exception Workflow
- VEX Trust Scoring
- UI/UX Patterns
- CI/CD Integration
- Data Models
1. Core Concepts
1.1 SBOM, VEX, and Reachability
- SBOM (Software Bill of Materials): Complete inventory of components (CycloneDX 1.6 / SPDX 3.0.1)
- VEX (Vulnerability Exploitability eXchange): Claims about whether vulnerabilities affect a specific product
- Reachability: Analysis of whether vulnerable code paths are actually exercised at runtime
1.2 Semantic Delta
A semantic delta captures meaningful differences between two states:
- Components added/removed/updated
- Reachability edges added/removed
- VEX claim transitions (affected → not_affected)
- Configuration/feature flag changes
- Attestation/provenance changes
1.3 Determinism-First Principles
All verdict computations must be:
- Reproducible: Same inputs → identical outputs, always
- Content-addressed: Every input identified by cryptographic hash
- Declarative: Compact manifest lists all input hashes + engine version
- Pure: No wall-clock time, no random iteration, no network during evaluation
2. Risk Budget Model
2.1 Service Tiers
Each service/product component must be assigned a Criticality Tier:
| Tier | Description | Monthly Budget (RP) |
|---|---|---|
| Tier 0 | Internal only, low business impact | 300 |
| Tier 1 | Customer-facing non-critical | 200 |
| Tier 2 | Customer-facing critical | 120 |
| Tier 3 | Safety/financial/data-critical | 80 |
2.2 Risk Point Scoring
Release Risk Score (RRS) = Base + Diff Risk + Operational Context − Mitigations
Base (by criticality):
- Tier 0: +1
- Tier 1: +3
- Tier 2: +6
- Tier 3: +10
Diff Risk (additive):
| Change Type | Points |
|---|---|
| Docs, comments, non-executed code | +1 |
| UI changes, refactors with high coverage | +3 |
| API contract changes, dependency upgrades | +6 |
| Database schema migrations, auth logic | +10 |
| Infra/networking, encryption, payment flows | +15 |
Operational Context (additive):
| Condition | Points |
|---|---|
| Active incident or recent Sev1/Sev2 | +5 |
| Error budget < 50% remaining | +3 |
| High on-call load | +2 |
| Release during freeze window | +5 |
Mitigations (subtract):
| Control | Points |
|---|---|
| Feature flag with staged rollout + kill switch | −3 |
| Canary + automated health gates + tested rollback | −3 |
| High-confidence integration coverage | −2 |
| Backward-compatible migration with proven rollback | −2 |
| Change isolated behind permission boundary | −2 |
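The scoring formula above can be sketched as a small function. This is a minimal illustration, assuming the point values from the tables are passed in by the caller; the function name and argument names are hypothetical, and the text does not say whether RRS is floored at zero, so no floor is applied here.

```python
# Base points per criticality tier, copied from the list above.
BASE_POINTS = {0: 1, 1: 3, 2: 6, 3: 10}

def release_risk_score(tier, diff_points, context_points=(), mitigation_points=()):
    """RRS = Base + Diff Risk + Operational Context - Mitigations.

    Mitigation points are passed as positive numbers and subtracted.
    """
    return (BASE_POINTS[tier]
            + sum(diff_points)
            + sum(context_points)
            - sum(mitigation_points))

# Tier 2 service (+6), dependency upgrade (+6), error budget < 50% (+3),
# canary with automated health gates and tested rollback (-3).
example_rrs = release_risk_score(2, diff_points=[6], context_points=[3], mitigation_points=[3])
print(example_rrs)  # 12
```

In this example a Tier 2 dependency upgrade lands at 12 RP, which sits at the top of the G2 band described in section 3.2.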
2.3 Budget Thresholds
| Status | Remaining | Action |
|---|---|---|
| Green | ≥60% | Normal operation |
| Yellow | 30–59% | Gates tighten by 1 level for medium/high-risk diffs |
| Red | <30% | Freeze high-risk diffs; allow only low-risk or reliability work |
| Exhausted | ≤0% | Incident/security fixes only with explicit sign-off |
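As a rough sketch, the threshold table maps to a simple classifier over the remaining budget percentage. The function name is hypothetical, and treating values between 59% and 60% as Yellow is an assumption, since the table leaves that gap open.

```python
def budget_status(remaining_pct: float) -> str:
    """Classify remaining risk budget (in percent) using the thresholds above."""
    if remaining_pct <= 0:
        return "Exhausted"
    if remaining_pct < 30:
        return "Red"
    if remaining_pct < 60:  # assumption: anything below 60% counts as Yellow
        return "Yellow"
    return "Green"

# Tier 2 service: 120 RP monthly budget, 90 RP already consumed.
remaining = (120 - 90) / 120 * 100
print(budget_status(remaining))  # Red
```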
2.4 Risk Budget Visualization
The Risk Budget Burn-Up Chart is the key PM dashboard:
- X-axis: Calendar dates up to code freeze
- Y-axis: Risk points
- Budget line: Allowable risk over time (flat or stepped)
- Actual Risk line: Cumulative unknowns + knowns − mitigations
- Shaded area: Headroom (green) or Overrun (red)
- Vertical markers: Feature freeze, pen-test start, dependency bumps
- Burn targets: Dotted lines showing required pace
Dashboard KPIs:
- "Headroom: 28 pts (green)"
- "Unknowns↑ +6 (24h)", "Risk retired −18 (7d)"
- "Exceptions expiring: 3"
- "At current burn, overrun in 5 days"
3. Release Gate Levels
3.1 Gate Definitions
G0 — No-risk / Administrative
Use for: docs-only, comments-only, non-functional metadata
Requirements:
- Lint/format checks
- Basic CI pass (build)
G1 — Low Risk
Use for: small localized changes with strong unit coverage, non-core UI, telemetry additions
Requirements:
- All automated unit tests
- Static analysis/linting
- 1 peer review
- Automated deploy to staging
- Post-deploy smoke checks
G2 — Moderate Risk
Use for: moderate logic changes in customer-facing paths, dependency upgrades, backward-compatible API changes
Requirements:
- G1 +
- Integration tests for impacted modules
- Code owner review
- Feature flag required if customer impact possible
- Staged rollout: canary or small cohort
- Rollback plan documented in PR
G3 — High Risk
Use for: schema migrations, auth/permission changes, core business logic, infra changes
Requirements:
- G2 +
- Security scan + dependency audit
- Migration plan (forward + rollback) reviewed
- Load/performance checks if in hot path
- New/updated dashboards/alerts
- Release captain sign-off
- Progressive delivery with automatic health gates
G4 — Very High Risk / Safety-Critical
Use for: Tier 3 systems with low budget, freeze window exceptions, broad blast radius, post-incident remediation
Requirements:
- G3 +
- Formal risk review (PM+DM+Security/SRE) in writing
- Explicit rollback rehearsal
- Extended canary with success/abort criteria
- Customer comms plan if impact plausible
- Post-release verification checklist executed
3.2 Gate Selection Logic
- Compute RRS from diff + context
- Map RRS to default gate:
  - 1–5 RP → G1
  - 6–12 RP → G2
  - 13–20 RP → G3
  - 21+ RP → G4
- Apply modifiers:
  - Budget Yellow → escalate one gate for ≥G2
  - Budget Red → escalate one gate for ≥G1, block high-risk unless exception
  - Active incident → block non-fix releases by default
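The selection steps above could look roughly like the following. This is a sketch under stated assumptions: "high-risk" is read as a default gate of G3 or above, escalation is capped at G4, and the function and parameter names are hypothetical. The text gives no RRS band for G0, so the sketch never selects it.

```python
GATES = ["G0", "G1", "G2", "G3", "G4"]

def default_gate(rrs: int) -> int:
    """Map RRS to the default gate index using the bands listed above."""
    if rrs >= 21:
        return 4
    if rrs >= 13:
        return 3
    if rrs >= 6:
        return 2
    return 1

def select_gate(rrs, budget="Green", active_incident=False, is_fix=False, has_exception=False):
    """Return (gate_name, blocked) after applying budget and incident modifiers."""
    gate = default_gate(rrs)
    blocked = active_incident and not is_fix  # incidents block non-fix releases
    if budget == "Yellow" and gate >= 2:
        gate = min(gate + 1, 4)
    elif budget == "Red":
        if gate >= 3 and not has_exception:  # assumption: high-risk means G3+
            blocked = True
        if gate >= 1:
            gate = min(gate + 1, 4)
    return GATES[gate], blocked

print(select_gate(12))            # ('G2', False)
print(select_gate(12, "Yellow"))  # ('G3', False)
print(select_gate(15, "Red"))     # ('G4', True)
```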
4. Delta Verdict Engine
4.1 Core Architecture
The delta verdict engine computes deterministic, signed verdicts comparing two states:
Verdict = f(Manifest)
Where Manifest contains:
- sbom_sha256 - SBOM graph hash
- vex_set_sha256[] - VEX document hashes
- reach_subgraph_sha256 - Reachability graph hash
- feeds_snapshot_sha256 - Feed snapshot hash
- policy_bundle_sha256 - Policy/rules hash
- engine_version - Engine version for reproducibility
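To illustrate why a manifest of hashes makes the verdict a pure function of its inputs, here is a minimal sketch of content-addressing a manifest. It approximates canonical JSON with sorted keys and compact separators; this is not a full RFC 8785 (JCS) implementation, which also fixes number formatting and key ordering rules. The helper names and sample hash values are hypothetical.

```python
import hashlib
import json

def canonical_bytes(obj) -> bytes:
    """Approximate canonical JSON: sorted keys, no insignificant whitespace, UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

def manifest_digest(manifest: dict) -> str:
    """Content address of a manifest: SHA-256 over its canonical bytes."""
    return "sha256:" + hashlib.sha256(canonical_bytes(manifest)).hexdigest()

# Two manifests with the same content but different key order.
a = {"sbom_sha256": "sha256:aaa", "engine_version": "1.0.0", "vex_set_sha256": ["sha256:bbb"]}
b = {"vex_set_sha256": ["sha256:bbb"], "engine_version": "1.0.0", "sbom_sha256": "sha256:aaa"}

print(manifest_digest(a) == manifest_digest(b))  # True
```

Because key order no longer influences the bytes being hashed, two evaluations that list the same inputs always address the same manifest.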
4.2 Evaluation Pipeline
- Normalize inputs
  - SBOM: sort by packageUrl/name@version; resolve aliases
  - VEX: normalize provider → vex_id, product_ref, status
  - Reachability: adjacency lists sorted by node ID; hash after topological ordering
  - Feeds: lock to snapshot (timestamp + commit/hash); no live calls
- Policy bundle
  - Declarative rules compiled to canonical IR
  - Explicit merge precedence (lattice-merge table)
  - Unknowns policy baked in: e.g., fail_if_unknowns > N in prod
- Evaluation
  - Build finding set: (component, vuln, context) tuples with deterministic IDs
  - Apply lattice-based VEX merge with evidence pointers
  - Compute status and risk_score using fixed-precision math
- Emit
  - Canonicalize verdict JSON (RFC 8785 JCS)
  - Sign verdict (DSSE/COSE/JWS)
  - Attach as OCI attestation to image/digest
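Two of the pipeline steps, sorting SBOM components and deriving deterministic finding IDs, can be sketched as below. The field names (`purl`, `version`), the separator byte, and the 16-character ID length are assumptions made for illustration only.

```python
import hashlib

def normalize_components(components):
    """Sort SBOM components by (purl, version) so input order cannot affect hashes."""
    return sorted(components, key=lambda c: (c["purl"], c["version"]))

def finding_id(component_purl: str, vuln_id: str, context: str) -> str:
    """Derive a stable ID from the (component, vuln, context) tuple."""
    # A unit-separator byte keeps the fields unambiguous when joined.
    key = "\x1f".join((component_purl, vuln_id, context))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]

first = [{"purl": "pkg:npm/b", "version": "1.0"}, {"purl": "pkg:npm/a", "version": "2.0"}]
second = list(reversed(first))

print(normalize_components(first) == normalize_components(second))  # True
```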
4.3 Delta Verdict Structure
{
  "subject": {"ociDigest": "sha256:..."},
  "inputs": {
    "feeds": [{"type":"cve","digest":"sha256:..."}],
    "tools": {"sbomer":"1.6.3","reach":"0.9.0","policy":"lattice-2025.12"},
    "baseline": {"sbomG":"sha256:...","vexSet":"sha256:..."}
  },
  "delta": {
    "components": {"added":[...],"removed":[...],"updated":[...]},
    "reachability": {"edgesAdded":[...],"edgesRemoved":[...]},
    "settings": {"changed":[...]},
    "vex": [{"cve":"CVE-2025-1234","from":"affected","to":"not_affected",
             "reason":"config_flag_off","evidenceRef":"att#cfg-42"}],
    "attestations": {"changed":[...]}
  },
  "verdict": {
    "decision": "allow",
    "riskBudgetUsed": 2,
    "policyId": "lattice-2025.12",
    "explanationRefs": ["vex[0]","reachability.edgesRemoved[3]"]
  },
  "signing": {"dsse":"...","signer":"stella-authority"}
}
4.4 Replay Contract
For deterministic replay, pin and record:
- Feed snapshots + hashes
- Scanner versions + rule packs + lattice/policy version
- SBOM generator version + mode
- Reachability engine settings
- Merge semantics ID
Replayer re-hydrates exact inputs and must reproduce the same verdict bit-for-bit.
5. Smart-Diff Algorithm
5.1 Material Risk Change Detection
FindingKey: (component_purl, component_version, cve_id)
RiskState Fields:
- reachable: bool | unknown
- vex_status: enum(AFFECTED | NOT_AFFECTED | FIXED | UNDER_INVESTIGATION | UNKNOWN)
- in_affected_range: bool | unknown
- kev: bool
- epss_score: float | null
- policy_flags: set<string>
- evidence_links: list<EvidenceLink>
5.2 Change Detection Rules
Rule R1: Reachability Flip
- reachable changes: false → true (risk ↑) or true → false (risk ↓)
Rule R2: VEX Status Flip
- Meaningful changes: AFFECTED ↔ NOT_AFFECTED, UNDER_INVESTIGATION → NOT_AFFECTED
Rule R3: Affected Range Boundary
- in_affected_range flips: false → true or true → false
Rule R4: Intelligence/Policy Flip
- kev changes false → true
- epss_score crosses configured threshold
- policy_flag changes severity (warn → block)
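Rules R1 through R4 can be sketched as a comparison of two risk states. The reason codes, the `policy_decision` field, and the default EPSS threshold of 0.5 are hypothetical; unknown values are modeled as `None` and never count as a flip.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RiskState:
    reachable: Optional[bool]          # None models "unknown"
    vex_status: str
    in_affected_range: Optional[bool]
    kev: bool
    epss_score: Optional[float]
    policy_decision: str = "allow"     # hypothetical field: allow | warn | block

# R2 transitions that the rules above name as meaningful.
VEX_FLIPS = {("AFFECTED", "NOT_AFFECTED"), ("NOT_AFFECTED", "AFFECTED"),
             ("UNDER_INVESTIGATION", "NOT_AFFECTED")}

def detect_changes(old: RiskState, new: RiskState, epss_threshold: float = 0.5):
    """Return the list of rule codes that fired between two states."""
    reasons = []
    if {old.reachable, new.reachable} == {True, False}:
        reasons.append("R1_REACHABILITY_FLIP")
    if (old.vex_status, new.vex_status) in VEX_FLIPS:
        reasons.append("R2_VEX_FLIP")
    if {old.in_affected_range, new.in_affected_range} == {True, False}:
        reasons.append("R3_RANGE_FLIP")
    if not old.kev and new.kev:
        reasons.append("R4_KEV_ADDED")
    if old.epss_score is not None and new.epss_score is not None:
        if (old.epss_score >= epss_threshold) != (new.epss_score >= epss_threshold):
            reasons.append("R4_EPSS_CROSSED")
    if (old.policy_decision, new.policy_decision) == ("warn", "block"):
        reasons.append("R4_POLICY_ESCALATED")
    return reasons

before = RiskState(False, "AFFECTED", True, False, 0.1)
after = RiskState(True, "AFFECTED", True, True, 0.7)
print(detect_changes(before, after))
# ['R1_REACHABILITY_FLIP', 'R4_KEV_ADDED', 'R4_EPSS_CROSSED']
```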
5.3 Suppression Rules
All must apply for suppression:
- reachable == false
- vex_status == NOT_AFFECTED
- kev == false
- No policy override
Patch Churn Suppression:
- If version changes AND in_affected_range remains false in both AND no KEV/policy flip → suppress
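A minimal sketch of both suppression checks follows, using plain dictionaries. The key names `policy_override`, `policy_decision`, and `version` are assumptions for illustration. Identity checks (`is False`) are used so that an unknown (`None`) value never satisfies a suppression condition.

```python
def is_suppressed(state: dict) -> bool:
    """All four suppression conditions must hold at once."""
    return (
        state["reachable"] is False
        and state["vex_status"] == "NOT_AFFECTED"
        and state["kev"] is False
        and not state.get("policy_override", False)
    )

def is_patch_churn(old: dict, new: dict) -> bool:
    """Version moved, but the finding stayed out of range with no KEV or policy flip."""
    return (
        old["version"] != new["version"]
        and old["in_affected_range"] is False
        and new["in_affected_range"] is False
        and old["kev"] == new["kev"]
        and old.get("policy_decision") == new.get("policy_decision")
    )

quiet = {"reachable": False, "vex_status": "NOT_AFFECTED", "kev": False}
unknown = {"reachable": None, "vex_status": "NOT_AFFECTED", "kev": False}
print(is_suppressed(quiet), is_suppressed(unknown))  # True False
```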
5.4 Priority Score Formula
score =
+ 1000 if new.kev
+ 500 if new.reachable
+ 200 if reason includes RANGE_FLIP to affected
+ 150 if VEX_FLIP to AFFECTED
+ 0..100 based on EPSS (epss * 100)
+ policy weight: +300 if decision BLOCK, +100 if WARN
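The formula above translates directly into code. In this sketch the reason codes `RANGE_FLIP_TO_AFFECTED` and `VEX_FLIP_TO_AFFECTED` are hypothetical labels, a missing EPSS score contributes 0, and the EPSS term is rounded to a whole number.

```python
def priority_score(new_kev, new_reachable, reasons, epss, decision):
    """Sum the priority contributions listed in the formula above."""
    score = 0
    if new_kev:
        score += 1000
    if new_reachable is True:  # unknown reachability adds nothing
        score += 500
    if "RANGE_FLIP_TO_AFFECTED" in reasons:
        score += 200
    if "VEX_FLIP_TO_AFFECTED" in reasons:
        score += 150
    score += round((epss or 0.0) * 100)  # 0..100 from EPSS
    score += {"BLOCK": 300, "WARN": 100}.get(decision, 0)
    return score

# KEV-listed, reachable, EPSS 0.42, policy decision BLOCK.
print(priority_score(True, True, [], 0.42, "BLOCK"))  # 1842
```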
5.5 Reachability Gate (3-Bit Severity)
public sealed record ReachabilityGate(
    bool? Reachable,        // true / false / null for unknown
    bool? ConfigActivated,
    bool? RunningUser,
    int Class,              // 0..7 derived from the bits when all known
    string Rationale
);
Class Computation: 0-7 based on 3 binary gates (reachable, config-activated, running user)
Unknown Handling: Never silently treat null as false or true. If any bit is null, set Class = -1 or compute from known bits only.
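One way to compute the class is shown below. The bit order (reachable as the most significant bit, running user as the least) is an assumption inferred from the sample predicate in section 10.3, where reachable=true, configActivated=true, runningUser=false yields class 6. The sketch takes the "set Class = -1" option for unknowns.

```python
from typing import Optional

def gate_class(reachable: Optional[bool],
               config_activated: Optional[bool],
               running_user: Optional[bool]) -> int:
    """Return 0..7 when all three bits are known, or -1 if any bit is unknown."""
    bits = (reachable, config_activated, running_user)
    if any(b is None for b in bits):
        return -1  # never coerce an unknown into true or false
    return (int(reachable) << 2) | (int(config_activated) << 1) | int(running_user)

print(gate_class(True, True, False))  # 6
print(gate_class(None, True, True))   # -1
```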
6. Exception Workflow
6.1 Exception Entity Model
public record Exception(
    string Id,
    string Scope,               // image:repo/app:tag, component:pkg@ver
    string Subject,             // CVE-2025-1234, package name
    string Reason,              // Human-readable justification
    List<string> EvidenceRefs,  // att:sha256:..., vex:sha256:...
    string CreatedBy,
    DateTime CreatedAt,
    DateTime? ExpiresAt,
    string PolicyBinding,
    string Signature
);
6.2 Exception Requirements
- Signed rationale + evidence: Justification with linked proofs (attestation IDs, VEX note, reachability subgraph slice)
- Auto-expiry & revalidation: Scheduler re-tests on expiry or when feeds mark "fix available / EPSS ↑ / reachability ↑"
- Audit view: Timeline of exception lifecycle (who/why, evidence, re-checks)
- Policy hooks: "allow only if: reason ∧ evidence present ∧ max TTL ≤ X ∧ owner = team-Y"
- Inheritance: repo→image→env scoping with explicit shadowing
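The policy hook and auto-expiry requirements might be checked along these lines. This is a sketch only: the dictionary keys, including `owner` (which is not part of the entity model above), are hypothetical, and an exception without an expiry is rejected because its TTL cannot be bounded.

```python
from datetime import datetime, timedelta, timezone

def exception_allowed(exc: dict, max_ttl: timedelta, required_owner: str) -> bool:
    """Allow only if reason and evidence are present, TTL <= max_ttl, and owner matches."""
    if not exc.get("reason") or not exc.get("evidence_refs"):
        return False
    if exc.get("expires_at") is None:
        return False  # no expiry means an unbounded TTL
    ttl = exc["expires_at"] - exc["created_at"]
    return ttl <= max_ttl and exc.get("owner") == required_owner

def is_expired(exc: dict, now: datetime) -> bool:
    """True once the expiry instant has been reached."""
    return exc["expires_at"] is not None and now >= exc["expires_at"]

created = datetime(2025, 12, 26, tzinfo=timezone.utc)
exc = {
    "reason": "Feature disabled",
    "evidence_refs": ["att:sha256:..."],
    "owner": "team-y",
    "created_at": created,
    "expires_at": created + timedelta(days=30),
}
print(exception_allowed(exc, timedelta(days=30), "team-y"))  # True
print(is_expired(exc, created + timedelta(days=31)))         # True
```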
6.3 Exception CLI
stella exception create \
  --cve CVE-2025-1234 \
  --scope image:repo/app:tag \
  --reason "Feature disabled" \
  --evidence att:sha256:... \
  --ttl 30d
6.4 Break-Glass Policy
Exceptions permitted only for:
- Incident mitigation or customer harm prevention
- Urgent security fix (actively exploited or high severity)
- Legal/compliance deadline
Requirements:
- Recorded rationale in PR/release ticket
- Named approvers: DM + on-call owner; PM for customer-impacting risk
- Mandatory follow-up within 5 business days
- Budget penalty: +50% of change's RRS
7. VEX Trust Scoring
7.1 Evidence Atoms
For every VEX statement, extract:
- scope: package@version, image@digest, file hash
- claim: affected, not_affected, under_investigation, fixed
- reason: reachable?, feature flag off, vulnerable code not present
- provenance: who said it, how it's signed
- when: issued_at, observed_at, expires_at
- supporting artifacts: SBOM ref, in-toto link, CVE IDs
7.2 Confidence Score (C: 0–1)
Sum the weights of all factors that apply, capping the total at 1:
| Factor | Weight |
|---|---|
| DSSE + Sigstore/Rekor inclusion | 0.35 |
| Hardware-backed key or org OIDC | 0.15 |
| NVD source | 0.20 |
| Major distro PSIRT | 0.20 |
| Upstream vendor | 0.20 |
| Reputable CERT | 0.15 |
| Small vendor | 0.10 |
| Reachability proof/test | 0.25 |
| Code diff linking | 0.20 |
| Deterministic build link | 0.15 |
| "Reason" present | 0.10 |
| ≥2 independent concurring sources | +0.10 |
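A small sketch of the confidence computation follows. The worked example in section 7.6 adds the weights of the applicable factors, so this sketch sums them and caps the result at 1. The dictionary keys are hypothetical short names for the table rows.

```python
# Weights copied from the table above; keys are illustrative short names.
WEIGHTS = {
    "dsse_rekor": 0.35, "hardware_key_or_oidc": 0.15, "nvd": 0.20,
    "distro_psirt": 0.20, "upstream_vendor": 0.20, "reputable_cert": 0.15,
    "small_vendor": 0.10, "reachability_proof": 0.25, "code_diff_link": 0.20,
    "deterministic_build": 0.15, "reason_present": 0.10, "concurring_sources": 0.10,
}

def confidence(factors) -> float:
    """Sum the weights of the distinct factors present, capped at 1.0."""
    unique = dict.fromkeys(factors)  # de-duplicate while keeping order
    return min(1.0, sum(WEIGHTS[f] for f in unique))

print(confidence(["nvd"]))  # 0.2
print(confidence(["dsse_rekor", "hardware_key_or_oidc", "upstream_vendor",
                  "reachability_proof", "code_diff_link"]))  # 1.0
```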
7.3 Freshness Score (F: 0–1)
F = exp(−Δdays / τ)
τ values by source class:
- Vendor VEX: τ = 30
- NVD: τ = 90
- Exploit-active feeds: τ = 14
Update reset: New attestation with same subject resets Δdays.
Expiry clamp: If now > expires_at, set F = 0.
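The decay and the expiry clamp can be sketched as follows. The source-class keys are hypothetical names for the three τ values listed above.

```python
import math

# Decay constants in days, per source class, from the list above.
TAU_DAYS = {"vendor_vex": 30, "nvd": 90, "exploit_active": 14}

def freshness(delta_days: float, source_class: str, expired: bool = False) -> float:
    """F = exp(-delta_days / tau), clamped to 0 once the statement has expired."""
    if expired:
        return 0.0
    return math.exp(-delta_days / TAU_DAYS[source_class])

print(round(freshness(7, "vendor_vex"), 2))  # 0.79
print(round(freshness(180, "nvd"), 2))       # 0.14
```

A fresh attestation for the same subject would simply be evaluated with `delta_days` reset to its own age.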
7.4 Claim Strength (S_claim)
| Claim | Base Weight |
|---|---|
| not_affected | 0.9 |
| fixed | 0.8 |
| affected | 0.7 |
| under_investigation | 0.4 |
Reason adjustments (added to the base weight, with the result capped at 1.0):
- reachable? → +0.15 to "affected"
- "feature flag off" → +0.10 to "not_affected"
- platform mismatch → +0.10
- backport patch note (with commit hash) → +0.10
7.5 Lattice Merge
Per evidence e:
Score(e) = C(e) × F(e) × S_claim(e)
Merge in distributive lattice ordered by:
- Claim precedence: not_affected > fixed > affected > under_investigation
- Break ties by Score(e)
- If competing top claims within ε (0.05), escalate to "disputed" and surface both with proofs
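The merge ordering can be read in more than one way. The sketch below takes the list literally: evidence is ranked by claim precedence first and by score within the same claim, and the result is escalated to "disputed" when the best evidence for a different claim scores within ε of the winner. Treat this interpretation, the tuple layout, and the function names as assumptions rather than the engine's actual semantics.

```python
PRECEDENCE = {"not_affected": 0, "fixed": 1, "affected": 2, "under_investigation": 3}
EPSILON = 0.05

def evidence_score(c: float, f: float, s_claim: float) -> float:
    """Score(e) = C(e) x F(e) x S_claim(e)."""
    return c * f * s_claim

def merge(evidence):
    """evidence: list of (claim, score) pairs. Returns (status, supporting_evidence)."""
    if not evidence:
        raise ValueError("no evidence to merge")
    ranked = sorted(evidence, key=lambda e: (PRECEDENCE[e[0]], -e[1]))
    top = ranked[0]
    rivals = [e for e in ranked if e[0] != top[0]]
    if rivals:
        best_rival = max(rivals, key=lambda e: e[1])
        if abs(top[1] - best_rival[1]) <= EPSILON:
            return "disputed", [top, best_rival]  # surface both with proofs
    return top[0], [top]

print(merge([("affected", 0.02), ("not_affected", 0.60)]))
# ('not_affected', [('not_affected', 0.6)])
print(merge([("not_affected", 0.40), ("affected", 0.42)])[0])  # disputed
```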
7.6 Worked Example
Small vendor Sigstore VEX (signed, reason: code path unreachable, issued 7 days ago):
- C ≈ 0.35 + 0.10 + 0.10 + 0.25 = 0.80
- F = exp(−7/30) ≈ 0.79
- S_claim = 0.9 + 0.10 = 1.0 (capped)
- Score ≈ 0.80 × 0.79 × 1.0 = 0.63
NVD entry (affected, no reasoning, 180 days old):
- C ≈ 0.20
- F = exp(−180/90) ≈ 0.14
- S_claim = 0.7
- Score ≈ 0.20 × 0.14 × 0.7 = 0.02
Outcome: Vendor VEX wins → not_affected with linked proofs.
8. UI/UX Patterns
8.1 Three-Pane Layout
- Categories Pane: Filterable list of change categories
- Items Pane: Delta items within selected category
- Proof Pane: Evidence details for selected item
8.2 Visual Diff Components
| Component | Purpose |
|---|---|
| DeltaSummaryStripComponent | Risk delta header: "Risk ↓ Medium → Low" |
| ProofPaneComponent | Evidence rail with witness paths |
| VexMergeExplanationComponent | Trust algebra visualization |
| CompareViewComponent | Side-by-side before/after |
| TriageShortcutsService | Keyboard navigation |
8.3 Micro-interactions
- Hover changed node → inline badge explaining why it changed
- Click rule change → spotlight the exact subgraph it affected
- "Explain like I'm new" toggle → expand jargon into plain language
- "Copy audit bundle" → export delta + evidence as attachment
8.4 Hotkeys
| Key | Action |
|---|---|
| 1 | Focus changes only |
| 2 | Show full graph |
| E | Expand evidence |
| A | Export audit |
8.5 Empty States
- Incomplete evidence: Yellow "Unknowns present" ribbon with count and collection button
- Huge graphs: Default to "changed neighborhood only" with mini-map
9. CI/CD Integration
9.1 API Endpoints
| Endpoint | Purpose |
|---|---|
| POST /evaluate | Returns verdict.json + attestation |
| POST /delta | Returns delta.json (signed) |
| GET /replay?manifest_sha= | Re-executes with cached snapshots |
| GET /evidence/:cid | Fetches immutable evidence blobs |
9.2 CLI Commands
# Verify delta between two versions
stella verify delta \
  --from abc123 \
  --to def456 \
  --policy prod.json \
  --print-proofs

# Create exception
stella exception create \
  --cve CVE-2025-1234 \
  --scope image:repo/app:tag \
  --reason "Feature disabled" \
  --evidence att:sha256:... \
  --ttl 30d

# Replay a verdict
stella replay \
  --manifest-sha sha256:... \
  --assert-identical
9.3 Exit Codes
| Code | Meaning |
|---|---|
| 0 | PASS - Release allowed |
| 1 | FAIL - Gate blocked |
| 2 | WARN - Proceed with caution |
| 3 | ERROR - Evaluation failed |
9.4 Pipeline Recipe
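release-gate:
  script:
    # Capture the exit code inside one shell step. Runners that stop at the
    # first failing command (for example GitLab CI) would otherwise abort the
    # job before $? could be inspected in a later script entry.
    - |
      rc=0
      stella evaluate --subject "$IMAGE_DIGEST" --policy "$GATE_POLICY" || rc=$?
      if [ "$rc" -eq 1 ]; then
        echo "Gate blocked - risk budget exceeded or policy violation"
        exit 1
      elif [ "$rc" -ne 0 ] && [ "$rc" -ne 2 ]; then
        exit "$rc"  # 3 = evaluation error; 2 (WARN) proceeds
      fi
    - stella delta --from "$BASELINE" --to "$IMAGE_DIGEST" --export audit-bundle.zip
  artifacts:
    paths:
      - audit-bundle.zip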
10. Data Models
10.1 Scan Manifest
{
  "sbom_sha256": "sha256:...",
  "vex_set_sha256": ["sha256:..."],
  "reach_subgraph_sha256": "sha256:...",
  "feeds_snapshot_sha256": "sha256:...",
  "policy_bundle_sha256": "sha256:...",
  "engine_version": "1.0.0",
  "policy_semver": "2025.12",
  "options_hash": "sha256:..."
}
10.2 Verdict
{
  "risk_score": 42,
  "status": "pass|warn|fail",
  "unknowns_count": 3,
  "evidence_refs": ["sha256:...", "sha256:..."],
  "explanations": [
    {"template": "CVE-{cve} suppressed by VEX claim from {source}",
     "params": {"cve": "2025-1234", "source": "vendor"}}
  ]
}
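The `explanations` entries pair a template with named parameters, which keeps the verdict deterministic while leaving rendering to the consumer. A minimal rendering sketch, assuming the placeholders follow Python-style `{name}` substitution (the function name is hypothetical):

```python
def render_explanation(explanation: dict) -> str:
    """Fill the template's {name} placeholders from its params."""
    return explanation["template"].format(**explanation["params"])

sample = {
    "template": "CVE-{cve} suppressed by VEX claim from {source}",
    "params": {"cve": "2025-1234", "source": "vendor"},
}
print(render_explanation(sample))
# CVE-2025-1234 suppressed by VEX claim from vendor
```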
10.3 Smart-Diff Predicate
{
  "predicateType": "stellaops.dev/predicates/smart-diff@v1",
  "predicate": {
    "baseImage": {"name":"...", "digest":"sha256:..."},
    "targetImage": {"name":"...", "digest":"sha256:..."},
    "diff": {
      "filesAdded": [...],
      "filesRemoved": [...],
      "filesChanged": [{"path":"...", "hunks":[...]}],
      "packagesChanged": [{"name":"openssl","from":"1.1.1u","to":"3.0.14"}]
    },
    "context": {
      "entrypoint":["/app/start"],
      "env":{"FEATURE_X":"true"},
      "user":{"uid":1001,"caps":["NET_BIND_SERVICE"]}
    },
    "reachabilityGate": {"reachable":true,"configActivated":true,"runningUser":false,"class":6}
  }
}
Appendix A: Success Metrics
| Metric | Description |
|---|---|
| Mean Time to Explain (MTTE) | Time from "why did this change?" to "Understood" |
| Change Failure Rate | % of releases causing incidents |
| MTTR | Mean time to recovery |
| Gate Compliance Rate | % of releases following required gates |
| Budget Utilization | Actual RP consumed vs. allocated |
Appendix B: Related Documents
| Document | Relationship |
|---|---|
| docs/modules/policy/architecture.md | Policy Engine implementation |
| docs/modules/scanner/architecture.md | Scanner/Reachability implementation |
| docs/modules/web/smart-diff-ui-architecture.md | UI component specifications |
| SPRINT_20251226_007_BE_determinism_gaps.md | Determinism implementation sprint |
Appendix C: Archive References
The following advisories were consolidated into this document:
| Original File | Archive Location |
|---|---|
| 25-Dec-2025 - Building a Deterministic Verdict Engine.md | (kept in place - primary reference) |
| 26-Dec-2026 - Diff‑Aware Releases and Auditable Exceptions.md | archived/2025-12-26-superseded/ |
| 26-Dec-2026 - Smart‑Diff as a Core Evidence Primitive.md | archived/2025-12-26-superseded/ |
| 25-Dec-2025 - Visual Diffs for Explainable Triage.md | archived/2025-12-26-triage-advisories/ |
| 26-Dec-2026 - Visualizing the Risk Budget.md | archived/2025-12-26-triage-advisories/ |
| 26-Dec-2026 - Weighted Confidence for VEX Sources.md | archived/2025-12-26-vex-scoring/ |
Technical References (not moved):
- archived/2025-12-21-moat-gap-closure/14-Dec-2025 - Smart-Diff Technical Reference.md
- archived/2025-12-21-moat-phase2/20-Dec-2025 - Moat Explanation - Risk Budgets and Diff-Aware Release Gates.md