Policy Simulation and Shadow Gates

Version: 1.0
Date: 2025-11-29
Status: Canonical

This advisory defines the product rationale, simulation semantics, and implementation strategy for Policy Engine simulation features, covering shadow runs, coverage fixtures, and promotion gates.

1. Executive Summary

Policy simulation enables safe testing of policy changes before production deployment. Key capabilities:

Shadow Runs - Execute policies without enforcement
Diff Summaries - Compare old vs new policy outcomes
Coverage Fixtures - Validate expected findings
Promotion Gates - Block promotion until tests pass
Deterministic Replay - Reproduce simulation results

2. Market Drivers

2.1 Target Segments

Segment	Simulation Requirements	Use Case
Policy Authors	Preview changes	Development workflow
Security Leads	Approve promotions	Change management
Compliance	Audit trail	Policy change evidence
DevSecOps	CI integration	Automated testing

2.2 Competitive Positioning

Most vulnerability tools lack policy simulation. Stella Ops differentiates itself with:

Shadow execution without production impact
Diff visualization of policy changes
Coverage testing with fixture validation
Promotion gates for governance
Deterministic replay for audit

3. Simulation Modes

3.1 Shadow Run

Execute a policy against real data without enforcing its verdicts:

stella policy simulate \
  --policy my-policy:v2 \
  --scope "tenant:acme-corp,namespace:production" \
  --shadow

Behavior:

Evaluates all findings
Records verdicts to shadow collections
No enforcement actions
No notifications triggered
Metrics tagged with shadow=true
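
The behavior listed above can be sketched in a few lines of Python. This is an illustration only, not the Policy Engine's implementation: `ShadowSink`, `run_shadow`, `demo_policy`, the `policy_verdict` metric name, and the finding fields are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ShadowSink:
    """In-memory stand-in for a shadow collection (hypothetical)."""
    verdicts: list = field(default_factory=list)
    metrics: list = field(default_factory=list)


def run_shadow(findings: list[dict], evaluate: Callable[[dict], str], sink: ShadowSink) -> int:
    """Evaluate every finding and record the verdicts in the shadow sink.

    No enforcement or notification hook is called: the shadow path only records.
    """
    for finding in findings:
        verdict = evaluate(finding)
        sink.verdicts.append({"findingId": finding["id"], "verdict": verdict})
        # Every metric sample carries shadow=true so dashboards can filter it out.
        sink.metrics.append(
            {"name": "policy_verdict", "labels": {"shadow": "true", "verdict": verdict}}
        )
    return len(findings)


def demo_policy(finding: dict) -> str:
    # Toy policy used only for this example.
    return "blocked" if finding["severity"] == "critical" else "warned"


sink = ShadowSink()
count = run_shadow(
    [
        {"id": "finding-123", "severity": "critical"},
        {"id": "finding-124", "severity": "high"},
    ],
    demo_policy,
    sink,
)
```

Keeping the sink as an explicit parameter makes it hard for a shadow run to write into a production store by accident.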

3.2 Diff Run

Compare two policy versions:

stella policy diff \
  --old my-policy:v1 \
  --new my-policy:v2 \
  --scope "tenant:acme-corp"

Output:

{
  "summary": {
    "added": 12,
    "removed": 5,
    "changed": 8,
    "unchanged": 234
  },
  "changes": [
    {
      "findingId": "finding-123",
      "cve": "CVE-2025-12345",
      "oldVerdict": "warned",
      "newVerdict": "blocked",
      "reason": "rule 'critical-cves' now matches"
    }
  ]
}
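
One way to derive a summary of this shape is to compare two maps of finding ID to verdict. The sketch below assumes that "added" and "removed" mean findings that receive a verdict under only the new or only the old policy; the source does not define these terms, and `diff_verdicts` is a hypothetical helper rather than the shipped diff algorithm. The `cve` and `reason` fields are omitted for brevity.

```python
def diff_verdicts(old: dict[str, str], new: dict[str, str]) -> dict:
    """Compare two {findingId: verdict} maps and build a summary plus change list."""
    summary = {"added": 0, "removed": 0, "changed": 0, "unchanged": 0}
    changes = []
    # Iterate in sorted order so the output is stable across runs.
    for finding_id in sorted(old.keys() | new.keys()):
        old_v, new_v = old.get(finding_id), new.get(finding_id)
        if old_v is None:
            summary["added"] += 1
        elif new_v is None:
            summary["removed"] += 1
        elif old_v != new_v:
            summary["changed"] += 1
            changes.append(
                {"findingId": finding_id, "oldVerdict": old_v, "newVerdict": new_v}
            )
        else:
            summary["unchanged"] += 1
    return {"summary": summary, "changes": changes}


result = diff_verdicts(
    {"finding-123": "warned", "finding-200": "ignored", "finding-300": "blocked"},
    {"finding-123": "blocked", "finding-200": "ignored", "finding-400": "warned"},
)
```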

3.3 Coverage Run

Validate policy against fixture expectations:

stella policy coverage \
  --policy my-policy:v2 \
  --fixtures fixtures/policy-tests.yaml

4. Coverage Fixtures

4.1 Fixture Format

apiVersion: stellaops.io/policy-test.v1
kind: PolicyFixture
metadata:
  name: critical-cve-blocking
  policy: my-policy

fixtures:
  - name: "Block critical CVE in production"
    input:
      finding:
        cve: "CVE-2025-12345"
        severity: critical
        ecosystem: npm
        component: "lodash@4.17.20"
      context:
        namespace: production
        labels:
          tier: frontend
    expected:
      verdict: blocked
      rulesMatched: ["critical-cves", "production-strict"]

  - name: "Warn on high CVE in staging"
    input:
      finding:
        cve: "CVE-2025-12346"
        severity: high
        ecosystem: npm
    expected:
      verdict: warned

  - name: "Ignore low CVE with VEX"
    input:
      finding:
        cve: "CVE-2025-12347"
        severity: low
        vexStatus: not_affected
        vexJustification: "component_not_present"
    expected:
      verdict: ignored

4.2 Fixture Results

{
  "total": 25,
  "passed": 23,
  "failed": 2,
  "failures": [
    {
      "fixture": "Block critical CVE in production",
      "expected": {"verdict": "blocked"},
      "actual": {"verdict": "warned"},
      "diff": "rule 'critical-cves' did not match due to missing label"
    }
  ]
}
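
A coverage runner that produces a report of this shape might look like the following sketch. It assumes fixtures have already been parsed into dictionaries and that only the keys present in `expected` are compared; `run_coverage` and `toy_policy` are hypothetical names, and the human-readable `diff` field is left out.

```python
from typing import Callable


def run_coverage(fixtures: list[dict], evaluate: Callable[[dict], dict]) -> dict:
    """Run each fixture through `evaluate` and compare against its `expected` block.

    Only keys present in `expected` are compared, so a fixture may assert on
    `verdict` alone or on `verdict` plus `rulesMatched`.
    """
    failures = []
    for fx in fixtures:
        actual = evaluate(fx["input"])
        expected = fx["expected"]
        mismatched = {k: actual.get(k) for k in expected if actual.get(k) != expected[k]}
        if mismatched:
            failures.append({"fixture": fx["name"], "expected": expected, "actual": mismatched})
    total = len(fixtures)
    return {
        "total": total,
        "passed": total - len(failures),
        "failed": len(failures),
        "failures": failures,
    }


def toy_policy(inp: dict) -> dict:
    # Toy policy for the example: it has no VEX handling, so one fixture fails.
    if inp["finding"]["severity"] == "critical":
        return {"verdict": "blocked", "rulesMatched": ["critical-cves"]}
    return {"verdict": "warned", "rulesMatched": []}


fixtures = [
    {
        "name": "Block critical CVE in production",
        "input": {"finding": {"cve": "CVE-2025-12345", "severity": "critical"}},
        "expected": {"verdict": "blocked"},
    },
    {
        "name": "Ignore low CVE with VEX",
        "input": {
            "finding": {"cve": "CVE-2025-12347", "severity": "low", "vexStatus": "not_affected"}
        },
        "expected": {"verdict": "ignored"},
    },
]
report = run_coverage(fixtures, toy_policy)
```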

5. Promotion Gates

5.1 Gate Requirements

Before a policy can be promoted from draft to active, it must satisfy the following gates:

Gate	Requirement	Enforcement
Shadow Run	Complete without errors	Required
Coverage	100% of fixtures pass	Required
Diff Review	Changes reviewed	Optional
Approval	Human sign-off	Configurable
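
The table can be read as a simple rule: promotion is allowed only when every required gate has passed. The sketch below encodes that reading. `GateResult`, `can_promote`, and `coverage_gate` are hypothetical names; treating the configurable Approval gate as required and treating an empty fixture set as a failure are both assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    name: str
    passed: bool
    required: bool


def can_promote(gates: list[GateResult]) -> tuple[bool, list[str]]:
    """Return (allowed, names of the required gates that are still failing)."""
    blocking = [g.name for g in gates if g.required and not g.passed]
    return (not blocking, blocking)


def coverage_gate(passed: int, total: int) -> GateResult:
    # "100% of fixtures pass": a single failure fails the gate.
    # An empty fixture set is treated as failing here (assumption).
    return GateResult("Coverage", total > 0 and passed == total, required=True)


gates = [
    GateResult("Shadow Run", passed=True, required=True),
    coverage_gate(passed=23, total=25),
    GateResult("Diff Review", passed=False, required=False),  # optional gate
    GateResult("Approval", passed=True, required=True),  # configurable; required here
]
allowed, blocking = can_promote(gates)
```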

5.2 Promotion Workflow

stateDiagram-v2
    [*] --> Draft
    Draft --> Shadow: Start shadow run
    Shadow --> Coverage: Run coverage tests
    Coverage --> Review: Pass fixtures
    Review --> Approval: Review diff
    Approval --> Active: Approve
    Coverage --> Draft: Fix failures
    Approval --> Draft: Reject
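
The diagram translates directly into a transition table. The following is a minimal sketch of such a state machine; the event names (`start_shadow`, `run_coverage`, and so on) are hypothetical labels derived from the edge captions, not identifiers from the codebase.

```python
# (current state, event) -> next state, mirroring the edges of the diagram.
TRANSITIONS = {
    ("Draft", "start_shadow"): "Shadow",
    ("Shadow", "run_coverage"): "Coverage",
    ("Coverage", "pass_fixtures"): "Review",
    ("Coverage", "fix_failures"): "Draft",
    ("Review", "review_diff"): "Approval",
    ("Approval", "approve"): "Active",
    ("Approval", "reject"): "Draft",
}


def advance(state: str, event: str) -> str:
    """Return the next state, or raise ValueError if the diagram has no such edge."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {state!r}") from None


state = "Draft"
for event in ["start_shadow", "run_coverage", "pass_fixtures", "review_diff", "approve"]:
    state = advance(state, event)
```

Rejecting any pair that is not in the table means a policy cannot skip a gate, for example by jumping from Draft straight to Active.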

5.3 CLI Commands

# Start shadow run
stella policy promote start --policy my-policy:v2

# Check promotion status
stella policy promote status --policy my-policy:v2

# Complete promotion (requires approval)
stella policy promote complete --policy my-policy:v2 --comment "Reviewed and approved"

6. Determinism Requirements

6.1 Simulation Guarantees

Property	Guarantee
Input ordering	Stable sort by (tenant, policyId, findingKey)
Rule evaluation	First-match semantics
Timestamp handling	Injected TimeProvider
Random values	Injected IRandom
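
The sketch below shows how these four guarantees fit together, using Python stand-ins for the injected `TimeProvider` and `IRandom`: a `now` callable and a seeded `random.Random`. The record layout, the rule tuples, and the `sample` field are hypothetical and exist only to make the effect of each guarantee visible.

```python
import random
from datetime import datetime, timezone
from typing import Callable, Optional

Rule = tuple[str, Callable[[dict], bool], str]  # (name, predicate, verdict)


def first_match(rules: list[Rule], finding: dict) -> Optional[tuple[str, str]]:
    """Return (rule name, verdict) for the first rule whose predicate matches."""
    for name, predicate, verdict in rules:
        if predicate(finding):
            return name, verdict
    return None


def simulate(findings, rules, now: Callable[[], datetime], rng: random.Random) -> list[dict]:
    # Stable input ordering by (tenant, policyId, findingKey).
    ordered = sorted(findings, key=lambda f: (f["tenant"], f["policyId"], f["findingKey"]))
    evaluated_at = now().isoformat()  # injected clock, never the system clock
    results = []
    for f in ordered:
        match = first_match(rules, f)
        results.append({
            "findingKey": f["findingKey"],
            "rule": match[0] if match else None,
            "verdict": match[1] if match else None,
            "evaluatedAt": evaluated_at,
            "sample": rng.random(),  # injected randomness, reproducible from the seed
        })
    return results


rules = [
    ("critical-cves", lambda f: f["severity"] == "critical", "blocked"),
    ("catch-all", lambda f: True, "warned"),
]
findings = [
    {"tenant": "acme-corp", "policyId": "my-policy", "findingKey": "b", "severity": "high"},
    {"tenant": "acme-corp", "policyId": "my-policy", "findingKey": "a", "severity": "critical"},
]
fixed_clock = lambda: datetime(2025, 11, 29, tzinfo=timezone.utc)
run1 = simulate(findings, rules, fixed_clock, random.Random(42))
run2 = simulate(list(reversed(findings)), rules, fixed_clock, random.Random(42))
```

Feeding the same findings in a different order with the same clock and seed yields identical records, which is the property a replay relies on.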

6.2 Replay Hash

Each simulation computes:

determinismHash = SHA256(policyVersion + inputsHash + rulesHash)

Replays with the same hash must produce identical results.
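
As a minimal sketch, the hash can be computed as follows, reading `+` in the formula as string concatenation and encoding as UTF-8. The source does not specify how `inputsHash` and `rulesHash` are derived; hashing a canonical JSON form (sorted keys, no whitespace) is an assumption made here so that key order does not affect the result. The helper names are hypothetical.

```python
import hashlib
import json


def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def canonical_hash(obj) -> str:
    """Hash a JSON-serializable value in canonical form (sorted keys, no whitespace)."""
    return sha256_hex(json.dumps(obj, sort_keys=True, separators=(",", ":")))


def determinism_hash(policy_version: str, inputs_hash: str, rules_hash: str) -> str:
    # Plain concatenation, following the formula as written.
    return sha256_hex(policy_version + inputs_hash + rules_hash)


rules = [{"name": "critical-cves", "verdict": "blocked"}]
h1 = determinism_hash(
    "my-policy:v2",
    canonical_hash([{"findingKey": "a", "severity": "critical"}]),
    canonical_hash(rules),
)
# Same inputs with a different key order: the canonical form hashes identically.
h2 = determinism_hash(
    "my-policy:v2",
    canonical_hash([{"severity": "critical", "findingKey": "a"}]),
    canonical_hash(rules),
)
# A different policy version changes the hash.
h3 = determinism_hash(
    "my-policy:v3",
    canonical_hash([{"findingKey": "a", "severity": "critical"}]),
    canonical_hash(rules),
)
```

Because the two component hashes are fixed-length hex strings, concatenation without a delimiter is ambiguous only if the policy version string itself varies in a way that collides, so a real implementation may prefer an explicit separator.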

7. Implementation Strategy

7.1 Phase 1: Shadow Runs (Complete)

Shadow collection isolation
Shadow metrics tagging
Shadow run API
CLI integration

7.2 Phase 2: Diff & Coverage (In Progress)

Policy diff algorithm
Diff visualization
Coverage fixture parser (POLICY-COV-50-001)
Coverage runner (POLICY-COV-50-002)

7.3 Phase 3: Promotion Gates (Planned)

Gate configuration schema
Promotion state machine
Approval workflow integration
Console UI for review

8. API Surface

8.1 Simulation APIs

Endpoint	Method	Scope	Description
`/api/policy/simulate`	POST	`policy:simulate`	Start simulation
`/api/policy/simulate/{id}`	GET	`policy:read`	Get simulation status
`/api/policy/simulate/{id}/results`	GET	`policy:read`	Get results

8.2 Diff APIs

Endpoint	Method	Scope	Description
`/api/policy/diff`	POST	`policy:read`	Compare versions

8.3 Coverage APIs

Endpoint	Method	Scope	Description
`/api/policy/coverage`	POST	`policy:simulate`	Run coverage
`/api/policy/coverage/{id}`	GET	`policy:read`	Get results

8.4 Promotion APIs

Endpoint	Method	Scope	Description
`/api/policy/promote`	POST	`policy:promote`	Start promotion
`/api/policy/promote/{id}`	GET	`policy:read`	Get status
`/api/policy/promote/{id}/approve`	POST	`policy:approve`	Approve promotion
`/api/policy/promote/{id}/reject`	POST	`policy:approve`	Reject promotion
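
The endpoint tables in this section amount to a mapping from (method, path) to a required scope. The sketch below encodes that mapping and checks a caller's granted scopes against it. It is illustrative only: `required_scope` and `is_authorized` are hypothetical helpers, and how the real service resolves routes and scopes is not described in this document.

```python
import re

# (method, path template, required scope), copied from the tables in section 8.
ROUTES = [
    ("POST", "/api/policy/simulate", "policy:simulate"),
    ("GET", "/api/policy/simulate/{id}", "policy:read"),
    ("GET", "/api/policy/simulate/{id}/results", "policy:read"),
    ("POST", "/api/policy/diff", "policy:read"),
    ("POST", "/api/policy/coverage", "policy:simulate"),
    ("GET", "/api/policy/coverage/{id}", "policy:read"),
    ("POST", "/api/policy/promote", "policy:promote"),
    ("GET", "/api/policy/promote/{id}", "policy:read"),
    ("POST", "/api/policy/promote/{id}/approve", "policy:approve"),
    ("POST", "/api/policy/promote/{id}/reject", "policy:approve"),
]


def _to_regex(template: str) -> re.Pattern:
    # Each {param} segment matches exactly one non-empty path segment.
    parts = [r"[^/]+" if p.startswith("{") else re.escape(p) for p in template.split("/")]
    return re.compile("^" + "/".join(parts) + "$")


def required_scope(method: str, path: str):
    """Return the scope required for a request, or None if no route matches."""
    for route_method, template, scope in ROUTES:
        if route_method == method and _to_regex(template).match(path):
            return scope
    return None


def is_authorized(method: str, path: str, granted: set[str]) -> bool:
    scope = required_scope(method, path)
    return scope is not None and scope in granted
```

Note that approving and rejecting share the `policy:approve` scope, while starting a promotion needs the separate `policy:promote` scope, so authors and approvers can be given different grants.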

9. Storage Model

9.1 Collections

Collection	Purpose
`policy_simulations`	Simulation records
`policy_simulation_results`	Per-finding results
`policy_coverage_runs`	Coverage executions
`policy_promotions`	Promotion state

9.2 Shadow Isolation

Shadow results are stored in separate collections:

effective_finding_{policyId}_shadow
Never mixed with production data
TTL-based cleanup (default 7 days)
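
The naming and expiry rules above can be sketched as two small helpers. The shadow collection name and the 7-day default come from this section; the production name `effective_finding_{policyId}` is inferred from the shadow name and is an assumption, as are the helper names.

```python
from datetime import datetime, timedelta, timezone

SHADOW_TTL = timedelta(days=7)  # default TTL stated above


def collection_name(policy_id: str, shadow: bool) -> str:
    """Return the collection a verdict is written to."""
    base = f"effective_finding_{policy_id}"  # production name is an assumption
    return f"{base}_shadow" if shadow else base


def expires_at(written_at: datetime, ttl: timedelta = SHADOW_TTL) -> datetime:
    """Return the time after which a shadow record is eligible for cleanup."""
    return written_at + ttl


written = datetime(2025, 11, 29, tzinfo=timezone.utc)
shadow_name = collection_name("my-policy", shadow=True)
deadline = expires_at(written)
```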

10. Observability

10.1 Metrics

policy_simulation_duration_seconds{mode}
policy_coverage_pass_rate{policy}
policy_promotion_gate_status{gate,status}
policy_diff_changes_total{changeType}

10.2 Audit Events

policy.simulation.started
policy.simulation.completed
policy.coverage.passed
policy.coverage.failed
policy.promotion.approved
policy.promotion.rejected

11. Console Integration

11.1 Policy Editor

Inline simulation button
Real-time diff preview
Coverage status badge

11.2 Promotion Dashboard

Pending promotions list
Gate status visualization
Approval/reject actions

12. Related Documentation

Resource	Location
Policy architecture	`docs/modules/policy/architecture.md`
DSL reference	`docs/policy/dsl.md`
Lifecycle guide	`docs/policy/lifecycle.md`
Runtime guide	`docs/policy/runtime.md`

13. Sprint Mapping

Primary Sprint: SPRINT_0185_0001_0001_policy_simulation.md (NEW)
Related Sprints:
- SPRINT_0120_0000_0001_policy_reasoning.md
- SPRINT_0121_0001_0001_policy_reasoning.md

Key Task IDs:

POLICY-SIM-40-001 - Shadow runs (DONE)
POLICY-DIFF-41-001 - Diff algorithm (DONE)
POLICY-COV-50-001 - Coverage fixtures (IN PROGRESS)
POLICY-COV-50-002 - Coverage runner (IN PROGRESS)
POLICY-PROM-55-001 - Promotion gates (TODO)

14. Success Metrics

Metric	Target
Simulation latency	< 2 min (10k findings)
Coverage accuracy	100% fixture matching
Promotion gate enforcement	100% adherence
Shadow isolation	Zero production leakage
Replay determinism	100% hash match

Last updated: 2025-11-29
