Add property-based tests for SBOM/VEX document ordering and Unicode normalization determinism

- Implement `SbomVexOrderingDeterminismProperties` for testing component list and vulnerability metadata hash consistency.
- Create `UnicodeNormalizationDeterminismProperties` to validate NFC normalization and Unicode string handling.
- Add project file for `StellaOps.Testing.Determinism.Properties` with necessary dependencies.
- Introduce CI/CD template validation tests including YAML syntax checks and documentation content verification.
- Create validation script for CI/CD templates ensuring all required files and structures are present.
This commit is contained in:
StellaOps Bot
2025-12-26 15:17:15 +02:00
parent 7792749bb4
commit c8f3120174
349 changed files with 78558 additions and 1342 deletions

View File

@@ -184,8 +184,272 @@ var result = await budgetService.CheckBudget(environment, unknowns);
// result.CumulativeUncertainty - total uncertainty score
```
---
# Risk Budget Enforcement
This section describes the risk budget enforcement system that tracks and controls release risk accumulation over time.
## Overview
Risk budgets limit the cumulative risk accepted during a budget window (typically monthly). Each release consumes risk points based on the vulnerabilities it introduces or carries forward. When a budget is exhausted, further high-risk releases are blocked.
## Key Concepts
### Service Tiers
Services are classified by criticality, which determines their risk budget allocation:
| Tier | Name | Monthly Allocation | Description |
|------|------|-------------------|-------------|
| 0 | Internal | 300 RP | Internal-only, low business impact |
| 1 | Customer-Facing Non-Critical | 200 RP | Customer-facing but non-critical |
| 2 | Customer-Facing Critical | 120 RP | Critical customer-facing services |
| 3 | Safety-Critical | 80 RP | Safety, financial, or data-critical |
### Budget Status Thresholds
Budget status transitions based on percentage consumed:
| Status | Threshold | Behavior |
|--------|-----------|----------|
| Green | < 40% consumed | Normal operations |
| Yellow | 40-69% consumed | Increased caution, warnings triggered |
| Red | 70-99% consumed | High-risk diffs frozen, only low-risk allowed |
| Exhausted | >= 100% consumed | Incident and security fixes only |
### Budget Windows
- **Default cadence**: Monthly (YYYY-MM format)
- **Reset behavior**: No carry-over; unused budget expires
- **Window boundary**: UTC midnight on the 1st of each month
## API Endpoints
### Check Budget Status
```http
GET /api/v1/policy/budget/status?serviceId={id}
```
Response:
```json
{
"budgetId": "budget:my-service:2025-12",
"serviceId": "my-service",
"tier": 1,
"window": "2025-12",
"allocated": 200,
"consumed": 85,
"remaining": 115,
"percentageUsed": 42.5,
"status": "Yellow"
}
```
### Record Consumption
```http
POST /api/v1/policy/budget/consume
Content-Type: application/json
{
"serviceId": "my-service",
"riskPoints": 25,
"releaseId": "v1.2.3"
}
```
### Adjust Allocation (Earned Capacity)
```http
POST /api/v1/policy/budget/adjust
Content-Type: application/json
{
"serviceId": "my-service",
"adjustment": 40,
"reason": "MTTR improvement over 2 months"
}
```
### View History
```http
GET /api/v1/policy/budget/history?serviceId={id}&window={yyyy-MM}
```
## CLI Commands
### Check Status
```bash
stella budget status --service my-service
```
Output:
```
Service: my-service
Window: 2025-12
Tier: Customer-Facing Non-Critical (1)
Status: Yellow
Budget: 85 / 200 RP (42.5%)
████████░░░░░░░░░░░░
Remaining: 115 RP
```
### Consume Budget
```bash
stella budget consume --service my-service --points 25 --reason "Release v1.2.3"
```
### List All Budgets
```bash
stella budget list --status Yellow,Red
```
## Earned Capacity Replenishment
Services demonstrating improved reliability can earn additional budget capacity:
### Eligibility Criteria
1. **MTTR Improvement**: Mean Time to Remediate must improve for 2 consecutive windows
2. **CFR Improvement**: Change Failure Rate must improve for 2 consecutive windows
3. **No Major Incidents**: No P1 incidents in the evaluation period
### Increase Calculation
- Minimum increase: 10% of base allocation
- Maximum increase: 20% of base allocation
- Scale: Proportional to improvement magnitude
### Example
```
Service: payment-api (Tier 2, base 120 RP)
MTTR: 48h → 36h → 24h (50% improvement)
CFR: 15% → 12% → 8% (47% improvement)
Earned capacity: +20% = 24 RP
New allocation: 144 RP for next window
```
## Notifications
Budget threshold transitions trigger notifications:
### Warning (Yellow)
Sent when budget reaches 40% consumption:
```
Subject: [Warning] Risk Budget at 40% for my-service
Your risk budget for my-service has reached the warning threshold.
Current: 80 / 200 RP (40%)
Status: Yellow
Consider pausing non-critical changes until the next budget window.
```
### Critical (Red/Exhausted)
Sent when budget reaches 70% or 100%:
```
Subject: [Critical] Risk Budget Exhausted for my-service
Your risk budget for my-service has been exhausted.
Current: 200 / 200 RP (100%)
Status: Exhausted
Only security fixes and incident responses are allowed.
Contact the Platform team for emergency capacity.
```
### Channels
Notifications are sent via:
- Email (to service owners)
- Slack (to designated channel)
- Microsoft Teams (to designated channel)
- Webhooks (for integration)
## Database Schema
```sql
CREATE TABLE policy.budget_ledger (
budget_id TEXT PRIMARY KEY,
service_id TEXT NOT NULL,
tenant_id TEXT,
tier INTEGER NOT NULL,
window TEXT NOT NULL,
allocated INTEGER NOT NULL,
consumed INTEGER NOT NULL DEFAULT 0,
status TEXT NOT NULL DEFAULT 'green',
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE(service_id, window)
);
CREATE TABLE policy.budget_entries (
entry_id TEXT PRIMARY KEY,
service_id TEXT NOT NULL,
window TEXT NOT NULL,
release_id TEXT NOT NULL,
risk_points INTEGER NOT NULL,
consumed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
FOREIGN KEY (service_id, window) REFERENCES policy.budget_ledger(service_id, window)
);
CREATE INDEX idx_budget_entries_service_window ON policy.budget_entries(service_id, window);
```
## Configuration
```yaml
# etc/policy.yaml
policy:
riskBudget:
enabled: true
windowCadence: monthly # monthly | weekly | sprint
carryOver: false
defaultTier: 1
tiers:
0: { name: Internal, allocation: 300 }
1: { name: CustomerFacingNonCritical, allocation: 200 }
2: { name: CustomerFacingCritical, allocation: 120 }
3: { name: SafetyCritical, allocation: 80 }
thresholds:
yellow: 40
red: 70
exhausted: 100
notifications:
enabled: true
channels: [email, slack]
aggregationWindow: 1h # Debounce rapid transitions
earnedCapacity:
enabled: true
requiredImprovementWindows: 2
minIncreasePercent: 10
maxIncreasePercent: 20
```
## Related Documentation
- [Unknown Budget Gates](./unknowns-budget-gates.md)
- [Verdict Attestations](../attestor/verdict-format.md)
- [BudgetCheckPredicate Model](../../api/attestor/budget-check-predicate.md)
- [Risk Point Scoring](./risk-point-scoring.md)
- [Diff-Aware Release Gates](./diff-aware-gates.md)