Budget Threshold Attestation

This document describes how the budget thresholds for unknowns are attested in verdict bundles, so that a policy decision can be reproduced and audited later.

Overview

Budget attestation captures the budget configuration applied during policy evaluation, enabling:

Auditability: Verify what thresholds were enforced at decision time
Reproducibility: Include all inputs for deterministic verification
Compliance: Demonstrate policy enforcement for regulatory requirements

Budget Check Predicate

The budget check is included in the verdict predicate:

{
  "_type": "https://stellaops.dev/predicates/policy-verdict@v1",
  "tenantId": "tenant-1",
  "policyId": "default-policy",
  "policyVersion": 1,
  "verdict": { ... },
  "budgetCheck": {
    "environment": "production",
    "config": {
      "maxUnknownCount": 10,
      "maxCumulativeUncertainty": 2.5,
      "action": "warn",
      "reasonLimits": {
        "Reachability": 5,
        "Identity": 3
      }
    },
    "actualCounts": {
      "total": 3,
      "cumulativeUncertainty": 1.2,
      "byReason": {
        "Reachability": 2,
        "Identity": 1
      }
    },
    "result": "pass",
    "configHash": "sha256:abc123...",
    "evaluatedAt": "2025-12-25T12:00:00Z",
    "violations": []
  }
}

Fields

budgetCheck.config

Field	Type	Description
`maxUnknownCount`	int	Maximum total unknowns allowed
`maxCumulativeUncertainty`	double	Maximum cumulative uncertainty score allowed
`action`	string	Action taken when a limit is exceeded: `warn` or `block`
`reasonLimits`	object	Per-reason limits, keyed by reason code

budgetCheck.actualCounts

Field	Type	Description
`total`	int	Total unknowns observed
`cumulativeUncertainty`	double	Sum of uncertainty factors
`byReason`	object	Breakdown by reason code

budgetCheck.result

Possible values:

pass - All limits satisfied
warn - Limits exceeded but action is warn
fail - Limits exceeded and action is block

budgetCheck.configHash

SHA-256 hash of the budget configuration for determinism verification. Format: sha256:{64 hex characters}
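
A consumer may want to reject malformed hashes before comparing them. The check below is a minimal sketch of the stated format; it assumes lowercase hex digits, which the format description does not specify, and the helper name is hypothetical.

```python
import re

# Assumption: hex digits are lowercase; the documented format only says "64 hex characters".
CONFIG_HASH_RE = re.compile(r"sha256:[0-9a-f]{64}")


def is_valid_config_hash(value: str) -> bool:
    """Return True when value is 'sha256:' followed by exactly 64 hex characters."""
    return CONFIG_HASH_RE.fullmatch(value) is not None
```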

budgetCheck.violations

List of violations when limits are exceeded:

{
  "violations": [
    {
      "type": "total",
      "limit": 10,
      "actual": 15
    },
    {
      "type": "reason",
      "limit": 5,
      "actual": 8,
      "reason": "Reachability"
    }
  ]
}
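
The relationship between `config`, `actualCounts`, `result`, and `violations` can be sketched as follows. This is an illustration rather than the engine's implementation: it assumes "exceeded" means strictly greater than the limit, and it leaves out the cumulative uncertainty check because the violation type for it is not shown above. The function name is hypothetical.

```python
def evaluate_budget(config: dict, counts: dict) -> dict:
    """Derive result and violations from a budget config and observed counts."""
    violations = []

    # Total unknowns against maxUnknownCount.
    if counts["total"] > config["maxUnknownCount"]:
        violations.append(
            {"type": "total", "limit": config["maxUnknownCount"], "actual": counts["total"]}
        )

    # Per-reason counts against reasonLimits (sorted for a stable order).
    by_reason = counts.get("byReason", {})
    for reason, limit in sorted(config.get("reasonLimits", {}).items()):
        actual = by_reason.get(reason, 0)
        if actual > limit:
            violations.append(
                {"type": "reason", "limit": limit, "actual": actual, "reason": reason}
            )

    if not violations:
        result = "pass"
    elif config["action"] == "block":
        result = "fail"
    else:
        result = "warn"
    return {"result": result, "violations": violations}
```

With the configuration from the predicate example and counts of 15 total and 8 `Reachability` unknowns, this yields the two violations listed above and a result of `warn`.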

Usage

Extracting Budget Check from Verdict

using StellaOps.Policy.Engine.Attestation;

// Parse verdict predicate from DSSE envelope
var predicate = VerdictPredicate.Parse(dssePayload);

// Access budget check
if (predicate.BudgetCheck is not null)
{
    var check = predicate.BudgetCheck;
    Console.WriteLine($"Environment: {check.Environment}");
    Console.WriteLine($"Result: {check.Result}");
    Console.WriteLine($"Total: {check.ActualCounts.Total}/{check.Config.MaxUnknownCount}");
    Console.WriteLine($"Config Hash: {check.ConfigHash}");
}
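
Outside .NET, the same fields can be read with any JSON library. The sketch below assumes a DSSE envelope whose base64 `payload` decodes directly to the predicate document shown earlier; it does not verify signatures, which a real consumer must do first. The function name is hypothetical.

```python
import base64
import json
from typing import Optional


def extract_budget_check(envelope_json: str) -> Optional[dict]:
    """Decode a DSSE envelope and return its budgetCheck object, if any."""
    envelope = json.loads(envelope_json)
    # DSSE stores the signed payload as base64 text.
    payload = base64.b64decode(envelope["payload"])
    predicate = json.loads(payload)
    # Verdicts produced without a budget check simply lack the field.
    return predicate.get("budgetCheck")
```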

Verifying Configuration Hash

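// Compute the expected hash from the current configuration.
// Populate every field that was attested (including any per-reason limits);
// a configuration that differs in any field produces a different hash.
var currentConfig = new VerdictBudgetConfig(
    maxUnknownCount: 10,
    maxCumulativeUncertainty: 2.5,
    action: "warn");

var expectedHash = VerdictBudgetCheck.ComputeConfigHash(currentConfig);

// Compare with the attested hash; verdicts without a budget check are skipped
if (predicate.BudgetCheck is { } check &&
    !string.Equals(check.ConfigHash, expectedHash, StringComparison.Ordinal))
{
    Console.WriteLine("Warning: Budget configuration has changed since attestation");
}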

Determinism

The config hash is computed deterministically, so identical configurations always produce identical hashes:

Configuration is serialized to JSON with canonical ordering
SHA-256 is computed over the UTF-8 bytes
Hash is prefixed with sha256: algorithm identifier

This allows verification that the same budget configuration was used across runs.
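
The three steps can be sketched as below. The exact canonical form used by `ComputeConfigHash` is not specified here, so this sketch assumes sorted keys and compact separators; hashes it produces are not guaranteed to match attested values.

```python
import hashlib
import json


def compute_config_hash(config: dict) -> str:
    """Hash a budget config: canonical JSON, UTF-8 bytes, SHA-256, 'sha256:' prefix."""
    # Assumption: "canonical ordering" means keys sorted at every level, no extra whitespace.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"
```

Because keys are sorted before hashing, two configurations that differ only in property order hash to the same value, while any change to a limit or action changes the hash.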

Integration Points

VerdictPredicateBuilder

Budget check is added when building verdict predicates:

var budgetCheck = new VerdictBudgetCheck(
    environment: context.Environment,
    config: config,
    actualCounts: counts,
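    // "pass" within limits; otherwise "warn" or "fail" (a block action maps to "fail")
    result: budgetResult.Passed ? "pass"
        : string.Equals(budgetResult.Budget.Action.ToString(), "block", StringComparison.OrdinalIgnoreCase) ? "fail" : "warn",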
    configHash: VerdictBudgetCheck.ComputeConfigHash(config),
    evaluatedAt: DateTimeOffset.UtcNow,
    violations: violations);

var predicate = new VerdictPredicate(
    tenantId: trace.TenantId,
    policyId: trace.PolicyId,
    // ... other fields
    budgetCheck: budgetCheck);

UnknownBudgetService

The enhanced BudgetCheckResult includes all data needed for attestation:

var result = await budgetService.CheckBudget(environment, unknowns);

// result.Budget - the configuration applied
// result.CountsByReason - breakdown for attestation
// result.CumulativeUncertainty - total uncertainty score

Risk Budget Enforcement

This section describes the risk budget enforcement system that tracks and controls release risk accumulation over time.

Overview

Risk budgets limit the cumulative risk accepted during a budget window (typically monthly). Each release consumes risk points based on the vulnerabilities it introduces or carries forward. When a budget is exhausted, further high-risk releases are blocked.

Key Concepts

Service Tiers

Services are classified by criticality, which determines their risk budget allocation:

Tier	Name	Monthly Allocation	Description
0	Internal	300 RP	Internal-only, low business impact
1	Customer-Facing Non-Critical	200 RP	Customer-facing but non-critical
2	Customer-Facing Critical	120 RP	Critical customer-facing services
3	Safety-Critical	80 RP	Safety, financial, or data-critical

Budget Status Thresholds

Budget status transitions based on percentage consumed:

Status	Threshold	Behavior
Green	< 40% consumed	Normal operations
Yellow	>= 40% and < 70% consumed	Increased caution, warnings triggered
Red	>= 70% and < 100% consumed	High-risk diffs frozen, only low-risk allowed
Exhausted	>= 100% consumed	Incident and security fixes only
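
The table maps directly onto a small classification function. The sketch below uses the default thresholds as parameters; the function name is hypothetical.

```python
def budget_status(consumed: int, allocated: int,
                  yellow: float = 40.0, red: float = 70.0,
                  exhausted: float = 100.0) -> str:
    """Classify a budget by percentage consumed, checking the highest threshold first."""
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    pct = 100.0 * consumed / allocated
    if pct >= exhausted:
        return "Exhausted"
    if pct >= red:
        return "Red"
    if pct >= yellow:
        return "Yellow"
    return "Green"
```

For example, a Tier 1 service that has consumed 85 of 200 RP sits at 42.5% and is classified as Yellow.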

Budget Windows

Default cadence: Monthly (YYYY-MM format)
Reset behavior: No carry-over; unused budget expires
Window boundary: UTC midnight on the 1st of each month
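
Because the boundary is defined in UTC, a timestamp should be converted to UTC before its window is derived. A minimal sketch for the monthly cadence, with hypothetical helper names:

```python
from datetime import datetime, timezone


def window_key(moment: datetime) -> str:
    """Return the YYYY-MM window that contains a timezone-aware timestamp."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    return f"{utc.year:04d}-{utc.month:02d}"


def next_window_start(moment: datetime) -> datetime:
    """Return UTC midnight on the 1st of the month after the given timestamp."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    year, month = (utc.year + 1, 1) if utc.month == 12 else (utc.year, utc.month + 1)
    return datetime(year, month, 1, tzinfo=timezone.utc)
```

A release recorded at 23:30 on 31 December in a UTC-2 zone therefore falls into the January window, since it is already 01:30 on 1 January in UTC.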

API Endpoints

Check Budget Status

GET /api/v1/policy/budget/status?serviceId={id}

Response:

{
  "budgetId": "budget:my-service:2025-12",
  "serviceId": "my-service",
  "tier": 1,
  "window": "2025-12",
  "allocated": 200,
  "consumed": 85,
  "remaining": 115,
  "percentageUsed": 42.5,
  "status": "Yellow"
}
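
The derived fields in this response follow from `allocated` and `consumed`. The sketch below reproduces them; the `budget:{serviceId}:{window}` identifier pattern is inferred from the example and is an assumption, as is the function name. The `status` field is left out.

```python
def budget_summary(service_id: str, tier: int, window: str,
                   allocated: int, consumed: int) -> dict:
    """Build the numeric part of a budget status response."""
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    return {
        # Assumption: identifier pattern inferred from the example response.
        "budgetId": f"budget:{service_id}:{window}",
        "serviceId": service_id,
        "tier": tier,
        "window": window,
        "allocated": allocated,
        "consumed": consumed,
        "remaining": allocated - consumed,
        "percentageUsed": round(100.0 * consumed / allocated, 1),
    }
```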

Record Consumption

POST /api/v1/policy/budget/consume
Content-Type: application/json

{
  "serviceId": "my-service",
  "riskPoints": 25,
  "releaseId": "v1.2.3"
}
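
A client could assemble this request with the standard library as sketched below. The base URL is a placeholder and authentication is not shown, since neither is specified here; the request is only built, not sent.

```python
import json
import urllib.request


def build_consume_request(base_url: str, service_id: str,
                          risk_points: int, release_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that records budget consumption."""
    body = json.dumps(
        {"serviceId": service_id, "riskPoints": risk_points, "releaseId": release_id}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/policy/budget/consume",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it.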

Adjust Allocation (Earned Capacity)

POST /api/v1/policy/budget/adjust
Content-Type: application/json

{
  "serviceId": "my-service",
  "adjustment": 40,
  "reason": "MTTR improvement over 2 months"
}

View History

GET /api/v1/policy/budget/history?serviceId={id}&window={yyyy-MM}

CLI Commands

Check Status

stella budget status --service my-service

Output:

Service: my-service
Window:  2025-12
Tier:    Customer-Facing Non-Critical (1)
Status:  Yellow

Budget:  85 / 200 RP (42.5%)
         ████████░░░░░░░░░░░░

Remaining: 115 RP
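
The usage bar in this output can be reproduced with integer arithmetic. The sketch assumes a 20-cell bar whose filled portion is rounded down, which matches the sample (42.5% shows 8 of 20 cells); the CLI's actual rounding rule is not documented here.

```python
def render_bar(consumed: int, allocated: int, width: int = 20) -> str:
    """Render a fixed-width usage bar, rounding the filled portion down."""
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    # Clamp so over-consumption still renders as a full bar.
    filled = min(width, max(0, consumed * width // allocated))
    return "█" * filled + "░" * (width - filled)
```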

Consume Budget

stella budget consume --service my-service --points 25 --reason "Release v1.2.3"

List All Budgets

stella budget list --status Yellow,Red

Earned Capacity Replenishment

Services demonstrating improved reliability can earn additional budget capacity:

Eligibility Criteria

MTTR Improvement: Mean Time to Remediate must improve for 2 consecutive windows
CFR Improvement: Change Failure Rate must improve for 2 consecutive windows
No Major Incidents: No P1 incidents in the evaluation period

Increase Calculation

Minimum increase: 10% of base allocation
Maximum increase: 20% of base allocation
Scale: Proportional to improvement magnitude

Example

Service: payment-api (Tier 2, base 120 RP)
MTTR: 48h → 36h → 24h (50% improvement)
CFR:  15% → 12% → 8%  (47% improvement)

Earned capacity: +20% = 24 RP
New allocation: 144 RP for next window
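
The example can be reproduced with the sketch below. Several points are assumptions, not documented behaviour: all three eligibility criteria are required, the smaller of the two improvements drives the increase, and a hypothetical `scale` factor of 0.5 maps improvement to increase before clamping to the 10-20% range.

```python
def improved_for(values: list, windows: int = 2) -> bool:
    """True when the metric strictly decreased across each of the last `windows` transitions."""
    if len(values) < windows + 1:
        return False
    recent = values[-(windows + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


def improvement_pct(values: list, windows: int = 2) -> float:
    """Relative improvement over the evaluation period, in percent."""
    recent = values[-(windows + 1):]
    return 100.0 * (recent[0] - recent[-1]) / recent[0]


def earned_capacity(base: int, mttr_hours: list, cfr_pct: list,
                    had_p1_incident: bool = False,
                    min_pct: float = 10.0, max_pct: float = 20.0,
                    scale: float = 0.5) -> int:
    """Return the extra risk points earned for the next window (0 if not eligible)."""
    if had_p1_incident or not (improved_for(mttr_hours) and improved_for(cfr_pct)):
        return 0
    magnitude = min(improvement_pct(mttr_hours), improvement_pct(cfr_pct))
    # Assumption: linear scaling, then clamping to the documented 10-20% range.
    increase_pct = max(min_pct, min(max_pct, magnitude * scale))
    return round(base * increase_pct / 100.0)
```

For `payment-api`, `earned_capacity(120, [48, 36, 24], [15, 12, 8])` gives 24 RP, matching the example.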

Notifications

Budget threshold transitions trigger notifications:

Warning (Yellow)

Sent when budget reaches 40% consumption:

Subject: [Warning] Risk Budget at 40% for my-service

Your risk budget for my-service has reached the warning threshold.

Current: 80 / 200 RP (40%)
Status: Yellow

Consider pausing non-critical changes until the next budget window.

Critical (Red/Exhausted)

Sent when the budget reaches 70% (Red) or 100% (Exhausted) consumption:

Subject: [Critical] Risk Budget Exhausted for my-service

Your risk budget for my-service has been exhausted.

Current: 200 / 200 RP (100%)
Status: Exhausted

Only security fixes and incident responses are allowed.
Contact the Platform team for emergency capacity.
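
Since notifications fire on transitions, a sender only needs to compare the percentage before and after a consumption event. The sketch below assumes one notification per event, at the highest threshold crossed; the function name is hypothetical.

```python
from typing import Optional


def notification_for(previous_pct: float, current_pct: float) -> Optional[str]:
    """Return 'Critical', 'Warning', or None for a change in percentage consumed."""
    # Check the highest threshold first so a large jump reports the most severe level.
    for threshold, level in ((100.0, "Critical"), (70.0, "Critical"), (40.0, "Warning")):
        if previous_pct < threshold <= current_pct:
            return level
    return None
```

Moving from 35% to 40% yields a warning, while moving from 42.5% to 55% crosses no threshold and yields nothing.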

Channels

Notifications are sent via:

Email (to service owners)
Slack (to designated channel)
Microsoft Teams (to designated channel)
Webhooks (for integration)

Database Schema

-- "window" is a reserved word in PostgreSQL, so the column name is quoted.
CREATE TABLE policy.budget_ledger (
    budget_id      TEXT PRIMARY KEY,
    service_id     TEXT NOT NULL,
    tenant_id      TEXT,
    tier           INTEGER NOT NULL,
    "window"       TEXT NOT NULL,
    allocated      INTEGER NOT NULL,
    consumed       INTEGER NOT NULL DEFAULT 0,
    status         TEXT NOT NULL DEFAULT 'green',
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE(service_id, "window")
);

CREATE TABLE policy.budget_entries (
    entry_id       TEXT PRIMARY KEY,
    service_id     TEXT NOT NULL,
    "window"       TEXT NOT NULL,
    release_id     TEXT NOT NULL,
    risk_points    INTEGER NOT NULL,
    consumed_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    FOREIGN KEY (service_id, "window") REFERENCES policy.budget_ledger(service_id, "window")
);

CREATE INDEX idx_budget_entries_service_window ON policy.budget_entries(service_id, "window");
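
Recording a release touches both tables, so the entry insert and the ledger update belong in one transaction. The sketch below illustrates that pattern against an in-memory SQLite database with a reduced version of the schema (no `policy` schema, timestamps, or status column); it is not the service's data access code.

```python
import sqlite3


def record_consumption(conn, entry_id, service_id, window, release_id, risk_points):
    """Insert a ledger entry and bump the consumed total atomically."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            'INSERT INTO budget_entries (entry_id, service_id, "window", release_id, risk_points) '
            "VALUES (?, ?, ?, ?, ?)",
            (entry_id, service_id, window, release_id, risk_points),
        )
        cur = conn.execute(
            'UPDATE budget_ledger SET consumed = consumed + ? WHERE service_id = ? AND "window" = ?',
            (risk_points, service_id, window),
        )
        if cur.rowcount != 1:
            raise LookupError("no ledger row for this service and window")


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE budget_ledger (
    budget_id TEXT PRIMARY KEY, service_id TEXT NOT NULL, "window" TEXT NOT NULL,
    allocated INTEGER NOT NULL, consumed INTEGER NOT NULL DEFAULT 0,
    UNIQUE(service_id, "window"));
CREATE TABLE budget_entries (
    entry_id TEXT PRIMARY KEY, service_id TEXT NOT NULL, "window" TEXT NOT NULL,
    release_id TEXT NOT NULL, risk_points INTEGER NOT NULL);
INSERT INTO budget_ledger VALUES ('budget:my-service:2025-12', 'my-service', '2025-12', 200, 85);
""")
record_consumption(conn, "entry-1", "my-service", "2025-12", "v1.2.3", 25)
```

If the ledger row is missing, the exception rolls back the entry insert, so the two tables cannot drift apart.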

Configuration

# etc/policy.yaml
policy:
  riskBudget:
    enabled: true
    windowCadence: monthly  # monthly | weekly | sprint
    carryOver: false
    defaultTier: 1

    tiers:
      0: { name: Internal, allocation: 300 }
      1: { name: CustomerFacingNonCritical, allocation: 200 }
      2: { name: CustomerFacingCritical, allocation: 120 }
      3: { name: SafetyCritical, allocation: 80 }

    thresholds:
      yellow: 40
      red: 70
      exhausted: 100

    notifications:
      enabled: true
      channels: [email, slack]
      aggregationWindow: 1h  # Debounce rapid transitions

    earnedCapacity:
      enabled: true
      requiredImprovementWindows: 2
      minIncreasePercent: 10
      maxIncreasePercent: 20
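
A few consistency checks catch most mistakes in this block before it is deployed. The sketch below validates an already-parsed `riskBudget` mapping (parsing YAML needs a third-party library, so a plain dictionary is used here); the rules it enforces are reasonable assumptions, not documented validation behaviour.

```python
def validate_risk_budget(cfg: dict) -> list:
    """Return a list of human-readable problems found in a riskBudget mapping."""
    errors = []

    t = cfg["thresholds"]
    if not (0 < t["yellow"] < t["red"] < t["exhausted"]):
        errors.append("thresholds must satisfy 0 < yellow < red < exhausted")

    if cfg["defaultTier"] not in cfg["tiers"]:
        errors.append("defaultTier must name a configured tier")

    for tier, spec in cfg["tiers"].items():
        if spec["allocation"] <= 0:
            errors.append(f"tier {tier} allocation must be positive")

    ec = cfg["earnedCapacity"]
    if ec["minIncreasePercent"] > ec["maxIncreasePercent"]:
        errors.append("minIncreasePercent must not exceed maxIncreasePercent")

    return errors
```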
