
Policy Simulation and Shadow Gates

Version: 1.0 | Date: 2025-11-29 | Status: Canonical

This advisory defines the product rationale, simulation semantics, and implementation strategy for Policy Engine simulation features, covering shadow runs, coverage fixtures, and promotion gates.


1. Executive Summary

Policy simulation enables safe testing of policy changes before production deployment. Key capabilities:

  • Shadow Runs - Execute policies without enforcement
  • Diff Summaries - Compare old vs new policy outcomes
  • Coverage Fixtures - Validate expected findings
  • Promotion Gates - Block promotion until tests pass
  • Deterministic Replay - Reproduce simulation results

2. Market Drivers

2.1 Target Segments

| Segment | Simulation Requirements | Use Case |
|---------|-------------------------|----------|
| Policy Authors | Preview changes | Development workflow |
| Security Leads | Approve promotions | Change management |
| Compliance | Audit trail | Policy change evidence |
| DevSecOps | CI integration | Automated testing |

2.2 Competitive Positioning

Most vulnerability tools lack policy simulation. Stella Ops differentiates with:

  • Shadow execution without production impact
  • Diff visualization of policy changes
  • Coverage testing with fixture validation
  • Promotion gates for governance
  • Deterministic replay for audit

3. Simulation Modes

3.1 Shadow Run

Execute policy against real data without enforcement:

stella policy simulate \
  --policy my-policy:v2 \
  --scope "tenant:acme-corp,namespace:production" \
  --shadow

Behavior:

  • Evaluates all findings
  • Records verdicts to shadow collections
  • No enforcement actions
  • No notifications triggered
  • Metrics tagged with shadow=true
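The behavior listed above can be illustrated with a minimal Python sketch. It is not the Policy Engine implementation: `ShadowRunResult`, `evaluate`, and `run_simulation` are hypothetical names, and the toy policy (block critical findings, warn on the rest) is an assumption made only so the example runs.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowRunResult:
    verdicts: dict = field(default_factory=dict)     # findingId -> verdict
    metric_tags: dict = field(default_factory=dict)  # tags attached to emitted metrics
    enforced: list = field(default_factory=list)     # findings that triggered enforcement
    notified: list = field(default_factory=list)     # findings that triggered notifications


def evaluate(finding: dict) -> str:
    """Hypothetical stand-in for a policy: block critical findings, warn on the rest."""
    return "blocked" if finding["severity"] == "critical" else "warned"


def run_simulation(findings: list, shadow: bool) -> ShadowRunResult:
    result = ShadowRunResult(metric_tags={"shadow": "true" if shadow else "false"})
    for finding in findings:
        verdict = evaluate(finding)
        result.verdicts[finding["id"]] = verdict  # every finding is evaluated and recorded
        if shadow:
            continue  # shadow mode: no enforcement, no notifications
        if verdict == "blocked":
            result.enforced.append(finding["id"])
        result.notified.append(finding["id"])
    return result
```

With `shadow=True`, every finding still receives a verdict, but `enforced` and `notified` stay empty and the metric tags carry `shadow=true`.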

3.2 Diff Run

Compare two policy versions:

stella policy diff \
  --old my-policy:v1 \
  --new my-policy:v2 \
  --scope "tenant:acme-corp"

Output:

{
  "summary": {
    "added": 12,
    "removed": 5,
    "changed": 8,
    "unchanged": 234
  },
  "changes": [
    {
      "findingId": "finding-123",
      "cve": "CVE-2025-12345",
      "oldVerdict": "warned",
      "newVerdict": "blocked",
      "reason": "rule 'critical-cves' now matches"
    }
  ]
}
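As an illustration of how a summary of this shape could be derived, the sketch below compares two maps of finding ID to verdict. The function name `diff_verdicts` is hypothetical, and the interpretation of "added" and "removed" (a finding has a verdict under only one of the two versions) is an assumption; the advisory does not define them. The `cve` and `reason` fields are omitted because producing them requires rule-level detail.

```python
def diff_verdicts(old: dict, new: dict) -> dict:
    """Compare {findingId: verdict} maps produced by two policy versions."""
    summary = {"added": 0, "removed": 0, "changed": 0, "unchanged": 0}
    changes = []
    # Sorted iteration keeps the output order stable across runs.
    for finding_id in sorted(old.keys() | new.keys()):
        if finding_id not in old:
            summary["added"] += 1
        elif finding_id not in new:
            summary["removed"] += 1
        elif old[finding_id] != new[finding_id]:
            summary["changed"] += 1
            changes.append({
                "findingId": finding_id,
                "oldVerdict": old[finding_id],
                "newVerdict": new[finding_id],
            })
        else:
            summary["unchanged"] += 1
    return {"summary": summary, "changes": changes}
```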

3.3 Coverage Run

Validate policy against fixture expectations:

stella policy coverage \
  --policy my-policy:v2 \
  --fixtures fixtures/policy-tests.yaml

4. Coverage Fixtures

4.1 Fixture Format

apiVersion: stellaops.io/policy-test.v1
kind: PolicyFixture
metadata:
  name: critical-cve-blocking
  policy: my-policy

fixtures:
  - name: "Block critical CVE in production"
    input:
      finding:
        cve: "CVE-2025-12345"
        severity: critical
        ecosystem: npm
        component: "lodash@4.17.20"
      context:
        namespace: production
        labels:
          tier: frontend
    expected:
      verdict: blocked
      rulesMatched: ["critical-cves", "production-strict"]

  - name: "Warn on high CVE in staging"
    input:
      finding:
        cve: "CVE-2025-12346"
        severity: high
        ecosystem: npm
    expected:
      verdict: warned

  - name: "Ignore low CVE with VEX"
    input:
      finding:
        cve: "CVE-2025-12347"
        severity: low
        vexStatus: not_affected
        vexJustification: "component_not_present"
    expected:
      verdict: ignored

4.2 Fixture Results

{
  "total": 25,
  "passed": 23,
  "failed": 2,
  "failures": [
    {
      "fixture": "Block critical CVE in production",
      "expected": {"verdict": "blocked"},
      "actual": {"verdict": "warned"},
      "diff": "rule 'critical-cves' did not match due to missing label"
    }
  ]
}
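A coverage runner that yields a report of this shape might look like the following sketch. `run_coverage` and `toy_policy` are hypothetical names; the toy policy stands in for real rule evaluation, and the `diff` explanation string is left out because it depends on rule internals not described here.

```python
def toy_policy(fixture_input: dict) -> dict:
    """Hypothetical policy used only for this example."""
    severity = fixture_input["finding"]["severity"]
    return {"verdict": "blocked" if severity == "critical" else "warned"}


def run_coverage(fixtures: list, evaluate) -> dict:
    """Evaluate each fixture input and compare against its expected fields."""
    failures = []
    for fixture in fixtures:
        actual = evaluate(fixture["input"])
        expected = fixture["expected"]
        # Only the fields the fixture declares are compared.
        if any(actual.get(key) != value for key, value in expected.items()):
            failures.append({
                "fixture": fixture["name"],
                "expected": expected,
                "actual": {key: actual.get(key) for key in expected},
            })
    total = len(fixtures)
    return {
        "total": total,
        "passed": total - len(failures),
        "failed": len(failures),
        "failures": failures,
    }
```

Comparing only the declared fields lets a fixture assert on `verdict` alone, as the second and third fixtures in section 4.1 do, without also pinning `rulesMatched`.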

5. Promotion Gates

5.1 Gate Requirements

Before a policy can be promoted from draft to active:

| Gate | Requirement | Enforcement |
|------|-------------|-------------|
| Shadow Run | Complete without errors | Required |
| Coverage | 100% fixtures pass | Required |
| Diff Review | Changes reviewed | Optional |
| Approval | Human sign-off | Configurable |
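The gate table can be read as a simple check that returns the gates still blocking promotion. The sketch below is one possible reading, not the planned gate schema: `GateStatus`, `blocking_gates`, and the `approval_required` switch (standing in for the "Configurable" enforcement) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateStatus:
    shadow_run_ok: bool        # shadow run completed without errors
    coverage_passed: int       # fixtures that passed
    coverage_total: int        # fixtures that ran
    diff_reviewed: bool = False
    approved: bool = False


def blocking_gates(status: GateStatus, approval_required: bool = True) -> list:
    """Return the names of gates that still block promotion."""
    blocked = []
    if not status.shadow_run_ok:
        blocked.append("Shadow Run")
    if status.coverage_passed != status.coverage_total:
        blocked.append("Coverage")  # required: 100% of fixtures must pass
    # Diff Review is optional, so it never blocks.
    if approval_required and not status.approved:
        blocked.append("Approval")
    return blocked
```

For example, a policy with 23 of 25 fixtures passing and no sign-off is blocked by both `Coverage` and `Approval`.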

5.2 Promotion Workflow

stateDiagram-v2
    [*] --> Draft
    Draft --> Shadow: Start shadow run
    Shadow --> Coverage: Run coverage tests
    Coverage --> Review: Pass fixtures
    Review --> Approval: Review diff
    Approval --> Active: Approve
    Coverage --> Draft: Fix failures
    Approval --> Draft: Reject
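The diagram maps directly onto a transition table. The sketch below encodes exactly the transitions drawn above; the event names and the `advance` helper are hypothetical, since the promotion state machine is still planned (Phase 3).

```python
# (current state, event) -> next state, taken from the state diagram.
TRANSITIONS = {
    ("Draft", "start_shadow_run"): "Shadow",
    ("Shadow", "run_coverage_tests"): "Coverage",
    ("Coverage", "pass_fixtures"): "Review",
    ("Coverage", "fix_failures"): "Draft",
    ("Review", "review_diff"): "Approval",
    ("Approval", "approve"): "Active",
    ("Approval", "reject"): "Draft",
}


def advance(state: str, event: str) -> str:
    """Return the next state, or raise if the diagram has no such transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None
```

Rejecting unknown `(state, event)` pairs means a draft cannot jump straight to `Active` without passing through the intermediate states.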

5.3 CLI Commands

# Start shadow run
stella policy promote start --policy my-policy:v2

# Check promotion status
stella policy promote status --policy my-policy:v2

# Complete promotion (requires approval)
stella policy promote complete --policy my-policy:v2 --comment "Reviewed and approved"

6. Determinism Requirements

6.1 Simulation Guarantees

| Property | Guarantee |
|----------|-----------|
| Input ordering | Stable sort by (tenant, policyId, findingKey) |
| Rule evaluation | First-match semantics |
| Timestamp handling | Injected TimeProvider |
| Random values | Injected IRandom |
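Three of these guarantees (input ordering, first-match evaluation, and an injected clock) can be sketched together in Python. This is an illustration only: the rule tuples, the `"ignored"` default verdict, and all function names are assumptions, and `now` is a plain callable playing the role of the injected TimeProvider. An injected random source would follow the same pattern.

```python
from datetime import datetime, timezone


def order_inputs(findings: list) -> list:
    """Stable sort by (tenant, policyId, findingKey)."""
    return sorted(findings, key=lambda f: (f["tenant"], f["policyId"], f["findingKey"]))


def first_match(rules: list, finding: dict, default: str = "ignored"):
    """Return (ruleName, verdict) for the first rule whose predicate matches."""
    for name, predicate, verdict in rules:
        if predicate(finding):
            return name, verdict
    return None, default  # assumed default when no rule matches


def evaluate_all(findings: list, rules: list, now) -> list:
    evaluated_at = now().isoformat()  # the clock is injected, never read directly
    results = []
    for finding in order_inputs(findings):
        rule, verdict = first_match(rules, finding)
        results.append({
            "findingKey": finding["findingKey"],
            "rule": rule,
            "verdict": verdict,
            "evaluatedAt": evaluated_at,
        })
    return results


# Hypothetical rules and a fixed clock for replay.
RULES = [
    ("critical-cves", lambda f: f["severity"] == "critical", "blocked"),
    ("catch-all", lambda f: True, "warned"),
]


def fixed_clock() -> datetime:
    return datetime(2025, 11, 29, tzinfo=timezone.utc)
```

Because the inputs are sorted and the clock is fixed, feeding the same findings in any order yields the same result list.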

6.2 Replay Hash

Each simulation computes:

determinismHash = SHA256(policyVersion + inputsHash + rulesHash)

Replays with the same hash must produce identical results.
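A minimal sketch of the hash computation, assuming that `+` means plain string concatenation, that the three inputs are strings encoded as UTF-8, and that the digest is rendered as lowercase hex. None of these details are specified above, and the function name is hypothetical.

```python
import hashlib


def determinism_hash(policy_version: str, inputs_hash: str, rules_hash: str) -> str:
    """SHA-256 over the concatenated components, returned as a hex digest."""
    payload = (policy_version + inputs_hash + rules_hash).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Identical components always produce the same 64-character digest, so a replay can be checked by recomputing the hash and comparing it with the stored value.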


7. Implementation Strategy

7.1 Phase 1: Shadow Runs (Complete)

  • Shadow collection isolation
  • Shadow metrics tagging
  • Shadow run API
  • CLI integration

7.2 Phase 2: Diff & Coverage (In Progress)

  • Policy diff algorithm
  • Diff visualization
  • Coverage fixture parser (POLICY-COV-50-001)
  • Coverage runner (POLICY-COV-50-002)

7.3 Phase 3: Promotion Gates (Planned)

  • Gate configuration schema
  • Promotion state machine
  • Approval workflow integration
  • Console UI for review

8. API Surface

8.1 Simulation APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| /api/policy/simulate | POST | policy:simulate | Start simulation |
| /api/policy/simulate/{id} | GET | policy:read | Get simulation status |
| /api/policy/simulate/{id}/results | GET | policy:read | Get results |

8.2 Diff APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| /api/policy/diff | POST | policy:read | Compare versions |

8.3 Coverage APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| /api/policy/coverage | POST | policy:simulate | Run coverage |
| /api/policy/coverage/{id} | GET | policy:read | Get results |

8.4 Promotion APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| /api/policy/promote | POST | policy:promote | Start promotion |
| /api/policy/promote/{id} | GET | policy:read | Get status |
| /api/policy/promote/{id}/approve | POST | policy:approve | Approve promotion |
| /api/policy/promote/{id}/reject | POST | policy:approve | Reject promotion |
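For reference, the endpoint tables in sections 8.1 to 8.4 can be collapsed into one scope lookup. The routes and scopes below are copied from those tables; the `required_scope` helper and the way `{id}` is matched (any single path segment) are assumptions made for illustration, not the service's routing code.

```python
import re

# (method, path template, required scope), copied from the API tables.
ROUTES = [
    ("POST", "/api/policy/simulate", "policy:simulate"),
    ("GET", "/api/policy/simulate/{id}", "policy:read"),
    ("GET", "/api/policy/simulate/{id}/results", "policy:read"),
    ("POST", "/api/policy/diff", "policy:read"),
    ("POST", "/api/policy/coverage", "policy:simulate"),
    ("GET", "/api/policy/coverage/{id}", "policy:read"),
    ("POST", "/api/policy/promote", "policy:promote"),
    ("GET", "/api/policy/promote/{id}", "policy:read"),
    ("POST", "/api/policy/promote/{id}/approve", "policy:approve"),
    ("POST", "/api/policy/promote/{id}/reject", "policy:approve"),
]


def _template_to_regex(template: str):
    # Escape the literal parts and let {id} match exactly one path segment.
    parts = [re.escape(part) for part in template.split("{id}")]
    return re.compile("[^/]+".join(parts))


def required_scope(method: str, path: str):
    """Return the scope needed for a request, or None if no route matches."""
    for route_method, template, scope in ROUTES:
        if route_method == method.upper() and _template_to_regex(template).fullmatch(path):
            return scope
    return None
```

Note that starting, approving, and reading a promotion each require a different scope, which keeps authoring and approval separable.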

9. Storage Model

9.1 Collections

| Collection | Purpose |
|------------|---------|
| policy_simulations | Simulation records |
| policy_simulation_results | Per-finding results |
| policy_coverage_runs | Coverage executions |
| policy_promotions | Promotion state |

9.2 Shadow Isolation

Shadow results are stored in separate collections:

  • effective_finding_{policyId}_shadow
  • Never mixed with production data
  • TTL-based cleanup (default 7 days)
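The naming pattern and TTL rule can be expressed in a few lines. This is a sketch under assumptions: the helper names are hypothetical, and treating a record as expired once its age reaches the TTL is one possible reading of "TTL-based cleanup".

```python
from datetime import datetime, timedelta, timezone

DEFAULT_SHADOW_TTL = timedelta(days=7)  # default stated above


def shadow_collection_name(policy_id: str) -> str:
    """Build the shadow collection name for a policy."""
    return f"effective_finding_{policy_id}_shadow"


def is_expired(created_at: datetime, now: datetime,
               ttl: timedelta = DEFAULT_SHADOW_TTL) -> bool:
    """True once a shadow record is at least `ttl` old."""
    return now - created_at >= ttl
```

The `_shadow` suffix keeps shadow results in their own collection, so cleanup can be applied without touching production data.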

10. Observability

10.1 Metrics

  • policy_simulation_duration_seconds{mode}
  • policy_coverage_pass_rate{policy}
  • policy_promotion_gate_status{gate,status}
  • policy_diff_changes_total{changeType}

10.2 Audit Events

  • policy.simulation.started
  • policy.simulation.completed
  • policy.coverage.passed
  • policy.coverage.failed
  • policy.promotion.approved
  • policy.promotion.rejected

11. Console Integration

11.1 Policy Editor

  • Inline simulation button
  • Real-time diff preview
  • Coverage status badge

11.2 Promotion Dashboard

  • Pending promotions list
  • Gate status visualization
  • Approval/reject actions

12. Related Documentation

| Resource | Location |
|----------|----------|
| Policy architecture | docs/modules/policy/architecture.md |
| DSL reference | docs/policy/dsl.md |
| Lifecycle guide | docs/policy/lifecycle.md |
| Runtime guide | docs/policy/runtime.md |

13. Sprint Mapping

  • Primary Sprint: SPRINT_0185_0001_0001_policy_simulation.md (NEW)
  • Related Sprints:
    • SPRINT_0120_0000_0001_policy_reasoning.md
    • SPRINT_0121_0001_0001_policy_reasoning.md

Key Task IDs:

  • POLICY-SIM-40-001 - Shadow runs (DONE)
  • POLICY-DIFF-41-001 - Diff algorithm (DONE)
  • POLICY-COV-50-001 - Coverage fixtures (IN PROGRESS)
  • POLICY-COV-50-002 - Coverage runner (IN PROGRESS)
  • POLICY-PROM-55-001 - Promotion gates (TODO)

14. Success Metrics

| Metric | Target |
|--------|--------|
| Simulation latency | < 2 min (10k findings) |
| Coverage accuracy | 100% fixture matching |
| Promotion gate enforcement | 100% adherence |
| Shadow isolation | Zero production leakage |
| Replay determinism | 100% hash match |

Last updated: 2025-11-29