up

2025-11-26 20:23:28 +02:00
parent 4831c7fcb0
commit d63af51f84
139 changed files with 8010 additions and 2795 deletions
--- a/docs/policy/api.md
+++ b/docs/policy/api.md
@@ -0,0 +1,50 @@
+# Policy API Reference (runtime endpoints)
+
+> **Imposed rule:** Policy API calls must include tenant context and operate on frozen inputs; mutating endpoints require Authority scopes and audit events.
+
+## Base
+`/api/v1/policies`
+
+## Endpoints
+- `GET /policies` – list policies (with filters: tenant, status, name, tags); paginated.
+- `GET /policies/{id}` – fetch metadata and versions.
+- `GET /policies/{id}/versions/{v}` – fetch IR, hash, status, shadow flag, attestation refs.
+- `POST /policies/{id}/simulate` – run simulate; body: `{ inputs: { sbom_digest, advisory_snapshot, vex_set, reachability_hash, signals_digest }, settings: { shadow: bool } }`. Returns `runId`, findings, explain summary; full explain via run endpoint.
+- `POST /policies/{id}/run` – full run with frozen cursors; same body as simulate plus `mode` (`full|incremental`).
+- `GET /policy-runs/{runId}` – returns findings, explain trace refs, hashes, shadow flag, status.
+- `POST /policies/{id}/submit` – attach lint/simulate/coverage artefacts; transitions to `submitted`.
+- `POST /policies/{id}/approve` – requires `policy:approve`; records approval note.
+- `POST /policies/{id}/publish` – requires `policy:publish`; body includes `reason`, `ticket`, `sign=true|false`; returns attestation ref.
+- `POST /policies/{id}/activate` – requires `policy:activate`; activates version.
+- `POST /policies/{id}/archive` – archive version; reason required.
+
+## Headers
+- `X-Stella-Tenant` (required)
+- `X-Stella-Shadow` (optional; simulate)
+- `If-None-Match` (IR cache)
+
+## Auth & scopes
+- Read: `policy:read`
+- Simulate: `policy:simulate`
+- Submit: `policy:author`
+- Approve: `policy:approve`
+- Publish/Promote: `policy:publish`/`policy:promote`
+- Activate/Run: `policy:operate`
+
+## Errors (Problem+JSON)
+- `policy_inputs_unfrozen` (409) – missing cursors.
+- `policy_ir_hash_mismatch` (409) – IR hash differs from attested value.
+- `policy_shadow_required` (412) – shadow gate not satisfied.
+- `policy_attestation_required` (412) – publish without attestation metadata.
+- Standard auth/tenant errors.
+
+## Pagination & determinism
+- `limit`/`cursor`; stable ordering by `policyId` then `version`.
+- All list endpoints return `ETag` and `Content-SHA256` headers.
+
+## Offline
+- API supports `file://` bundle handler when running in sealed mode; simulate/run accept `bundle` path instead of remote cursors.
+
+## Observability
+- Metrics: `policy_api_requests_total{endpoint,status}`, `policy_simulate_latency_seconds`, `policy_run_latency_seconds`.
+- Logs: include `policyId`, `version`, `runId`, `tenant`, `shadow`, `cursors` hashes.
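
A minimal sketch of how a client might assemble the simulate call described in `docs/policy/api.md`. It only builds the request and sends nothing. The helper name `build_simulate_request`, the full path (base `/api/v1/policies` joined with `/{id}/simulate`), and the client-side check for missing inputs are assumptions made for illustration; the doc itself only lists the body fields, the `X-Stella-Tenant` header, and the `policy_inputs_unfrozen` error.

```python
import json

# Frozen input fields named in the simulate body of the API reference.
REQUIRED_INPUTS = (
    "sbom_digest",
    "advisory_snapshot",
    "vex_set",
    "reachability_hash",
    "signals_digest",
)


def build_simulate_request(policy_id, tenant, inputs, shadow=True):
    """Return a dict describing the POST request (hypothetical helper)."""
    # Assumption: reject locally what the server would report as
    # policy_inputs_unfrozen (409) when a cursor/digest is missing.
    missing = [key for key in REQUIRED_INPUTS if not inputs.get(key)]
    if missing:
        raise ValueError("unfrozen inputs, missing: " + ", ".join(missing))

    body = {
        "inputs": {key: inputs[key] for key in REQUIRED_INPUTS},
        "settings": {"shadow": shadow},
    }
    return {
        "method": "POST",
        # Assumed path layout; the doc gives the base and the relative route.
        "path": f"/api/v1/policies/{policy_id}/simulate",
        "headers": {
            "X-Stella-Tenant": tenant,
            "Content-Type": "application/json",
        },
        # Sorted keys keep the serialized body stable across runs.
        "body": json.dumps(body, sort_keys=True),
    }
```

Serializing with sorted keys is a choice made here to match the document's emphasis on determinism; the API reference does not state that the server requires it.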
--- a/docs/policy/dsl.md
+++ b/docs/policy/dsl.md
@@ -1,6 +1,7 @@
 # Stella Policy DSL (`stella-dsl@1`)

-> **Audience:** Policy authors, reviewers, and tooling engineers building lint/compile flows for the Policy Engine v2 rollout (Sprint 20).
+> **Audience:** Policy authors, reviewers, and tooling engineers building lint/compile flows for the Policy Engine v2 rollout (Sprint 20).
+> **Imposed rule:** Policies that alter reachability or trust weighting must run in shadow mode first with coverage fixtures; promotion to active is blocked until shadow + coverage gates pass.

 This document specifies the `stella-dsl@1` grammar, semantics, and guardrails used by Stella Ops to transform SBOM facts, Concelier advisories, and Excititor VEX statements into effective findings. Use it with the [Policy Engine Overview](overview.md) for architectural context and the upcoming lifecycle/run guides for operational workflows.

@@ -14,6 +15,7 @@ This document specifies the `stella-dsl@1` grammar, semantics, and guardrails us
 - **Lean authoring:** Common precedence, severity, and suppression patterns are first-class.
 - **Offline-friendly:** Grammar and built-ins avoid cloud dependencies, run the same in sealed deployments.
 - **Reachability-aware:** Policies can consume reachability lattice states (`ReachState`) and evidence scores to drive VEX gates (`not_affected`, `under_investigation`, `affected`).
+- **Signal-first:** Trust, reachability, entropy, and uncertainty signals are first-class so explain traces stay reproducible.

 ---

@@ -40,14 +42,26 @@ policy "Default Org Policy" syntax "stella-dsl@1" {
    }
  }

-  rule vex_precedence priority 10 {
-    when vex.any(status in ["not_affected","fixed"])
-      and vex.justification in ["component_not_present","vulnerable_code_not_present"]
-    then status := vex.status
-    because "Strong vendor justification prevails";
-  }
-}
-```
+  rule vex_precedence priority 10 {
+    when vex.any(status in ["not_affected","fixed"])
+      and vex.justification in ["component_not_present","vulnerable_code_not_present"]
+    then status := vex.status
+    because "Strong vendor justification prevails";
+  }
+
+  rule reachability_gate priority 20 {
+    when telemetry.reachability.state == "reachable" and telemetry.reachability.score >= 0.6
+    then status := "affected"
+    because "Runtime/graph evidence shows reachable code path";
+  }
+
+  rule trust_penalty priority 30 {
+    when signals.trust_score < 0.4 or signals.entropy_penalty > 0.2
+    then severity := severity_band("critical")
+    because "Low trust score or high entropy";
+  }
+}
+```

 High-level layout:

@@ -127,9 +141,10 @@ annotate    = "annotate", identifier, ":=", expression, ";" ;

 Notes:

-- `helper` is reserved for shared calculcations (not yet implemented in `@1`).
-- `else` branch executes only if `when` predicates evaluate truthy **and** no prior rule earlier in priority handled the tuple.
-- Semicolons inside rule bodies are optional when each clause is on its own line; the compiler emits canonical semicolons in IR.
+- `helper` is reserved for shared calculations (not yet implemented in `@1`).
+- `else` branch executes only if `when` predicates evaluate truthy **and** no prior rule earlier in priority handled the tuple.
+- Semicolons inside rule bodies are optional when each clause is on its own line; the compiler emits canonical semicolons in IR.
+- `settings.shadow = true` enables shadow-mode evaluation (findings recorded but not enforced). Promotion gates require at least one shadow run with coverage fixtures.

 ---

@@ -146,6 +161,7 @@ Within predicates and actions you may reference the following namespaces:
 | `run` | `policyId`, `policyVersion`, `tenant`, `timestamp` | Metadata for explain annotations. |
 | `env` | Arbitrary key/value pairs injected per run (e.g., `environment`, `runtime`). |
 | `telemetry` | Optional reachability signals. Example fields: `telemetry.reachability.state`, `telemetry.reachability.score`, `telemetry.reachability.policyVersion`. Missing fields evaluate to `unknown`. |
+| `signals` | Normalised signal dictionary: `trust_score` (0–1), `reachability.state` (`reachable\|unreachable\|unknown`), `reachability.score` (0–1), `entropy_penalty` (0–0.3), `uncertainty.level` (`U1`–`U3`), `runtime_hits` (bool). |
 | `secret` | `findings`, `bundle`, helper predicates | Populated when the Secrets Analyzer runs. Exposes masked leak findings and bundle metadata for policy decisions. |
 | `profile.<name>` | Values computed inside profile blocks (maps, scalars). |

@@ -162,8 +178,9 @@ Missing fields evaluate to `null`, which is falsey in boolean context and propag
 | `normalize_cvss(advisory)` | `Advisory → SeverityScalar` | Parses `advisory.content.raw` for CVSS data; falls back to policy maps. |
 | `cvss(score, vector)` | `double × string → SeverityScalar` | Constructs a severity object manually. |
 | `severity_band(value)` | `string → SeverityBand` | Normalises strings like `"critical"`, `"medium"`. |
-| `risk_score(base, modifiers...)` | Variadic | Multiplies numeric modifiers (severity × trust × reachability). |
-| `vex.any(predicate)` | `(Statement → bool) → bool` | `true` if any statement satisfies predicate. |
+| `risk_score(base, modifiers...)` | Variadic | Multiplies numeric modifiers (severity × trust × reachability). |
+| `reach_state(state)` | `string → ReachState` | Normalises reachability state strings (`reachable`, `unreachable`, `unknown`). |
+| `vex.any(predicate)` | `(Statement → bool) → bool` | `true` if any statement satisfies predicate. |
 | `vex.all(predicate)` | `(Statement → bool) → bool` | `true` if all statements satisfy predicate. |
 | `vex.latest()` | `→ Statement` | Lexicographically newest statement. |
 | `advisory.has_tag(tag)` | `string → bool` | Checks advisory metadata tags. |
@@ -252,16 +269,30 @@ rule vex_strong_claim priority 5 {
 }
 ```

-### 9.3 Environment-Specific Escalation
+### 9.3 Environment-Specific Escalation

 ```dsl
-rule internet_exposed_guard {
-  when env.exposure == "internet"
-       and severity.normalized >= "High"
-  then escalate to severity_band("Critical")
-  because "Internet-exposed assets require critical posture";
-}
-```
+rule internet_exposed_guard {
+  when env.exposure == "internet"
+       and severity.normalized >= "High"
+  then escalate to severity_band("Critical")
+  because "Internet-exposed assets require critical posture";
+}
+```
+
+### 9.4 Shadow mode & coverage
+
+- Enable `settings { shadow = true; }` for new policies or major changes. Findings are recorded but not enforced.
+- Provide coverage fixtures under `tests/policy/<policyId>/cases/*.json`; run `stella policy test` locally and in CI. Coverage results must be attached on submission.
+- Promotion to active is blocked until shadow runs + coverage gates pass (see lifecycle §3).
+
+### 9.5 Authoring workflow (quick checklist)
+
+1. Write/update policy with shadow enabled.
+2. Add/refresh coverage fixtures; run `stella policy test`.
+3. Run `stella policy lint` and `stella policy simulate --fixtures ...`, noting the expected signals (`trust_score`, `reachability`, `entropy_penalty`) in comments.
+4. Submit with attachments: lint, simulate diff, coverage results.
+5. After approval, disable shadow and promote; retain fixtures for regression tests.

 ### 9.4 Anti-pattern (flagged by linter)

@@ -318,4 +349,4 @@ rule catch_all {

 ---

-*Last updated: 2025-11-05 (Sprint 21).*
+*Last updated: 2025-11-26 (Sprint 0401).* 
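
The two rules added to the DSL example (`reachability_gate` and `trust_penalty`) can be read as plain predicates over the `telemetry` and `signals` namespaces. The Python below is a rough model of that reading, not the Policy Engine evaluator. It assumes that missing fields behave as null and make a comparison false, and that the two rules apply independently because they write different fields; the function names are hypothetical.

```python
def reachability_gate(telemetry):
    """Model of: state == "reachable" and score >= 0.6."""
    reach = telemetry.get("reachability", {})
    score = reach.get("score")
    # Assumption: a missing score is treated as null and fails the comparison.
    return reach.get("state") == "reachable" and score is not None and score >= 0.6


def trust_penalty(signals):
    """Model of: trust_score < 0.4 or entropy_penalty > 0.2."""
    trust = signals.get("trust_score")
    entropy = signals.get("entropy_penalty")
    low_trust = trust is not None and trust < 0.4
    high_entropy = entropy is not None and entropy > 0.2
    return low_trust or high_entropy


def evaluate(telemetry, signals):
    """Apply both rules in priority order (20, then 30) and collect the effects."""
    finding = {}
    if reachability_gate(telemetry):
        finding["status"] = "affected"
    if trust_penalty(signals):
        # The DSL rule assigns severity_band("critical"); a string stands in here.
        finding["severity"] = "critical"
    return finding
```

A sketch like this can help when writing coverage fixtures: each fixture pins the signal values, and the expected finding follows from the thresholds in the rule text.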
--- a/docs/policy/editor.md
+++ b/docs/policy/editor.md
@@ -0,0 +1,49 @@
+# Policy Editor Guide
+
+> **Imposed rule:** Edits must run lint, simulate, and shadow+coverage gates before promotion; UI enforces attachment of results on submission.
+
+This guide walks through the Console Policy Editor: authoring, validation, simulation, approvals, and offline workflow.
+
+## 1. Workspace
+- **Left rail:** policy list, versions, status (draft/submitted/approved/active/archived), shadow flag badge.
+- **Editor pane:** YAML/SPL with schema validation, syntax highlighting, auto-format; shows IR hash after successful lint.
+- **Metadata panel:** description, tags, AOC indicator, attestation status.
+- **Attachments panel:** lint report, simulate diff, coverage results; mandatory before submission.
+
+## 2. Validation
+- Live lint via compiler service; blocks save on fatal errors.
+- Schema assist: hover shows field descriptions; unknown fields flagged as warnings.
+- Determinism check: twin-run diff runs on save; failures block submission.
+
+## 3. Simulation
+- Quick simulate: select fixtures (SBOM/VEX bundles) → runs in shadow mode; results shown inline with deltas vs previous version.
+- Batch simulate: enqueue via orchestrator; results are stored as attachments and must be less than 24 h old at submission.
+
+## 4. Submission & approvals
+- Submit requires: lint OK, simulate attachment, coverage results, shadow enabled.
+- Reviewers comment inline; blocking comments must be resolved before approval.
+- Approvers must enter reason/ticket; Authority enforces two-person rule when configured.
+
+## 5. Promotion & activation
+- Publish & sign: produces DSSE attestation over IR hash + approval metadata; Rekor mirror when online.
+- Activate: selects approved version; records input cursors; triggers run if requested.
+- Rollback: pick prior approved version; requires reason.
+
+## 6. Offline workflow
+- Load policy pack + attachments from Offline Kit; editor runs local lint/simulate with sealed inputs.
+- Submit/approve offline records events locally; sync to Authority when reconnected.
+
+## 7. Shortcuts & a11y
+- Keyboard: `Ctrl+S` save, `Ctrl+Shift+L` lint, `Ctrl+Shift+R` simulate.
+- Screen reader labels on editor, results table, and buttons; focus order follows workflow.
+
+## 8. Troubleshooting
+- Lint failures: open Problems tab; fix schema/unknown fields.
+- Simulate stale: rerun quick simulate; ensure fixtures match policy inputs.
+- Attestation mismatch: regenerate IR (auto) and retry publish; check Authority scopes.
+
+## References
+- `docs/policy/dsl.md`
+- `docs/policy/spl-v1.md`
+- `docs/policy/lifecycle.md`
+- `docs/policy/runtime.md`
--- a/docs/policy/governance.md
+++ b/docs/policy/governance.md
@@ -0,0 +1,51 @@
+# Policy Governance
+
+> **Imposed rule:** Publish/Promote actions require reason + ticket metadata and DSSE attestation; two-person approval is recommended and enforced where configured by Authority.
+
+This guide defines roles, scopes, approvals, signing, and exception handling for Stella policies.
+
+## 1. Roles & scopes
+- Author: `policy:author`, `policy:simulate`
+- Reviewer: `policy:review`, `policy:simulate`
+- Approver: `policy:approve`, `policy:audit`
+- Operator: `policy:operate`, `policy:activate`, `policy:run`
+- Publisher: `policy:publish`, `policy:promote`
+- Auditor: `policy:audit`
+
+Authority policy can map org roles to scopes; two-person rule can be enabled per tenant for publish/promote.
+
+## 2. Approval workflow
+1) Author drafts with shadow + coverage fixtures; runs lint/simulate/test.
+2) Submit with attachments (lint, simulate, coverage); reason/ticket are optional at this stage.
+3) Reviewers comment/resolve; approver checks gates (shadow, coverage, determinism).
+4) Publisher runs `stella policy publish --reason --ticket --sign`; attestation stored and optionally mirrored to Rekor.
+5) Operator activates version; audit events recorded.
+
+## 3. Signing & attestation
+- DSSE payload includes IR hash, policyId/version, reason, ticket, approvals, shadow/coverage evidence refs.
+- Rekor mirror when online; offline deployments store bundle + checkpoint for later replay.
+- Evidence Locker stores DSSE + run inputs/outputs for audit.
+
+## 4. Exceptions & waivers
+- Use SPL rules with explicit scope and `because` rationale; no perpetual suppressions.
+- Waivers must include expiration and owner; DSSE attested if exported.
+- AOC: the Aggregation-Only Contract requires waiver scopes to exclude cross-tenant data; UI/CLI enforce tenant scoping.
+
+## 5. Compliance checklist
+- [ ] Two-person rule enforced (Authority config) for publish/promote.
+- [ ] Reason and ticket captured on publish; stored in attestation metadata.
+- [ ] Shadow + coverage gates passed and attached.
+- [ ] IR hash recorded; attestation verified before activation.
+- [ ] Waivers have expiry, owner, `because`, and scope.
+- [ ] Offline replay path documented for the policy pack.
+
+## 6. Audit & observability
+- Timeline events: `policy.submitted`, `policy.approved`, `policy.published`, `policy.promoted`, `policy.activated`, `policy.archived`.
+- Metrics: `policy_publish_total`, `policy_promote_total`, `policy_attestation_verify_failures`, `policy_shadow_runs_total`.
+- Logs: include `policyId`, `version`, `attestation_ref`, `reason`, `ticket`, `shadow`.
+
+## References
+- `docs/policy/overview.md`
+- `docs/policy/lifecycle.md`
+- `docs/policy/spl-v1.md`
+- `docs/policy/runtime.md`
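
The publish requirements in `docs/policy/governance.md` (reason and ticket metadata, plus an optional two-person rule) could be checked before calling publish along these lines. This is a sketch under assumptions: `publish_violations` is a hypothetical helper, and the two-person rule is interpreted as "at least one approver other than the author", which the governance doc does not spell out.

```python
def publish_violations(reason, ticket, approvers, author, two_person_rule=False):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not reason.strip():
        problems.append("reason is required")
    if not ticket.strip():
        problems.append("ticket is required")
    # Assumed interpretation of the two-person rule: someone other than
    # the author must have approved the version.
    if two_person_rule and not (set(approvers) - {author}):
        problems.append("two-person rule: need an approver other than the author")
    return problems
```

In the documented workflow Authority is the enforcing party; a local check like this would only give earlier feedback to the publisher.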
--- a/docs/policy/lifecycle.md
+++ b/docs/policy/lifecycle.md
@@ -3,17 +3,19 @@
 > **Audience:** Policy authors, reviewers, security approvers, release engineers.  
 > **Scope:** End-to-end flow for `stella-dsl@1` policies from draft through archival, including CLI/Console touch-points, Authority scopes, audit artefacts, and offline considerations.

-This guide explains how a policy progresses through Stella Ops, which roles are involved, and the artefacts produced at every step. Pair it with the [Policy Engine Overview](overview.md), [DSL reference](dsl.md), and upcoming run documentation to ensure consistent authoring and rollout.
+This guide explains how a policy progresses through Stella Ops, which roles are involved, and the artefacts produced at every step. Pair it with the [Policy Engine Overview](overview.md), [DSL reference](dsl.md), and upcoming run documentation to ensure consistent authoring and rollout.
+> **Imposed rule:** New or significantly changed policies must run in **shadow mode** with coverage fixtures before activation. Promotions are blocked until shadow + coverage gates pass.

 ---

 ## 1 · Protocol Summary

-- Policies are **immutable versions** attached to a stable `policy_id`.
-- Lifecycle states: `draft → submitted → approved → active → archived`.
-- Every transition requires explicit Authority scopes and produces structured events + storage artefacts (`policies`, `policy_runs`, audit log collections).
-- Simulation and CI gating happen **before** approvals can be granted.
-- Activation triggers (runs, bundle exports, CLI `promote`) operate on the **latest approved** version per tenant.
+- Policies are **immutable versions** attached to a stable `policy_id`.
+- Lifecycle states: `draft → submitted → approved → active → archived`.
+- Every transition requires explicit Authority scopes and produces structured events + storage artefacts (`policies`, `policy_runs`, audit log collections).
+- Simulation and CI gating happen **before** approvals can be granted.
+- Activation triggers (runs, bundle exports, CLI `promote`) operate on the **latest approved** version per tenant.
+- Shadow mode runs capture findings without enforcement; shadow exit requires coverage + twin-run determinism checks.

 ```mermaid
 stateDiagram-v2
@@ -53,7 +55,9 @@ stateDiagram-v2
 - **Tools:** Console editor, `stella policy edit`, policy DSL files.
 - **Actions:**
  - Author DSL leveraging [stella-dsl@1](dsl.md).
-  - Run `stella policy lint` and `stella policy simulate --sbom <fixtures>` locally.
+  - Run `stella policy lint` and `stella policy simulate --sbom <fixtures>` locally.
+  - Add/refresh coverage fixtures under `tests/policy/<policyId>/cases/*.json`; run `stella policy test`.
+  - Keep `settings.shadow = true` until coverage + shadow gates pass.
  - Attach rationale metadata (`metadata.description`, tags).
 - **Artefacts:**
  - `policies` document with `status=draft`, `version=n`, `provenance.created_by`.
@@ -67,7 +71,8 @@ stateDiagram-v2
 - **Who:** Authors (`policy:author`).
 - **Tools:** Console “Submit for review” button, `stella policy submit <policyId> --reviewers ...`.
 - **Actions:**
-  - Provide review notes and required simulations (CLI uploads attachments).
+  - Provide review notes and required simulations (CLI uploads attachments).
+  - Attach coverage results (shadow mode + `stella policy test`).
  - Choose reviewer groups; Authority records them in submission metadata.
 - **Artefacts:**
  - Policy document transitions to `status=submitted`, capturing `submitted_by`, `submitted_at`, reviewer list, simulation digest references.
@@ -96,7 +101,8 @@ stateDiagram-v2
 - **Who:** Approvers (`policy:approve`).
 - **Tools:** Console “Approve”, CLI `stella policy approve <id> --version n --note "rationale"`.
 - **Actions:**
-  - Confirm compliance checks (see §6) all green.
+  - Confirm compliance checks (see §6) all green.
+  - Verify shadow gate + coverage suite passed in CI.
  - Provide approval note (mandatory string captured in audit trail).
 - **Artefacts:**
  - Policy `status=approved`, `approved_by`, `approved_at`, `approval_note`.
@@ -190,12 +196,14 @@ All CLI commands emit structured JSON by default; use `--format table` for human

 ## 6 · Compliance Gates

-| Gate | Stage | Enforced by | Requirement |
-|------|-------|-------------|-------------|
-| **DSL lint** | Draft → Submit | CLI/CI | `stella policy lint` successful within 24 h. |
-| **Simulation evidence** | Submit | CLI/Console | Attach diff from `stella policy simulate` covering baseline SBOM set. |
-| **Reviewer quorum** | Submit → Approve | Authority | Minimum approver/reviewer count configurable per tenant. |
-| **Determinism CI** | Approve | DevOps job | Twin run diff passes (`DEVOPS-POLICY-20-003`). |
+| Gate | Stage | Enforced by | Requirement |
+|------|-------|-------------|-------------|
+| **DSL lint** | Draft → Submit | CLI/CI | `stella policy lint` successful within 24 h. |
+| **Simulation evidence** | Submit | CLI/Console | Attach diff from `stella policy simulate` covering baseline SBOM set. |
+| **Shadow run** | Submit → Approve | Policy Engine / CI | Shadow mode enabled (`settings.shadow=true`) with findings recorded; must execute once per change. |
+| **Coverage suite** | Submit → Approve | CI (`stella policy test`) | Coverage fixtures present and passing; artefact attached to submission. |
+| **Reviewer quorum** | Submit → Approve | Authority | Minimum approver/reviewer count configurable per tenant. |
+| **Determinism CI** | Approve | DevOps job | Twin run diff passes (`DEVOPS-POLICY-20-003`). |
 | **Attestation metadata** | Approve → Publish | Authority / CLI | `policy:publish` executed with reason & ticket metadata; DSSE attestation verified. |
 | **Activation health** | Publish/Promote → Activate | Policy Engine | Last run status succeeded; orchestrator queue healthy. |
 | **Export validation** | Archive | Offline Kit | DSSE-signed policy pack generated for long-term retention. |
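
The lifecycle states and the compliance gates table in `docs/policy/lifecycle.md` can be pictured as a small transition table. The sketch below is illustrative only: the gate identifiers are made-up short names for the table rows, the grouping of gates per transition is a simplification of the "Stage" column, and any transitions shown in the state diagram but not in the linear `draft → submitted → approved → active → archived` chain are left out.

```python
# Gates assumed to be required for each forward transition.
TRANSITIONS = {
    ("draft", "submitted"): {"lint", "simulation"},
    ("submitted", "approved"): {
        "shadow_run",
        "coverage",
        "reviewer_quorum",
        "determinism",
    },
    ("approved", "active"): {"attestation", "activation_health"},
    ("active", "archived"): set(),
}


def missing_gates(current, target, passed):
    """Return the sorted gate names still blocking current -> target."""
    key = (current, target)
    if key not in TRANSITIONS:
        raise ValueError(f"illegal transition {current} -> {target}")
    return sorted(TRANSITIONS[key] - set(passed))
```

For example, a submitted policy with only the shadow run and determinism check recorded would still be blocked on the coverage suite and reviewer quorum.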
--- a/docs/policy/overview.md
+++ b/docs/policy/overview.md
@@ -1,173 +1,54 @@
-# Policy Engine Overview
-
-> **Goal:** Evaluate organisation policies deterministically against scanner SBOMs, Concelier advisories, and Excititor VEX evidence, then publish effective findings that downstream services can trust.
-
-This document introduces the v2 Policy Engine: how the service fits into Stella Ops, the artefacts it produces, the contracts it honours, and the guardrails that keep policy decisions reproducible across air-gapped and connected deployments.
-
----
-
-## 1 · Role in the Platform
-
-- **Purpose:** Compose policy verdicts by reconciling SBOM inventory, advisory metadata, VEX statements, and organisation rules.
-- **Form factor:** Dedicated `.NET 10` Minimal API host (`StellaOps.Policy.Engine`) plus worker orchestration. Policies are defined in `stella-dsl@1` packs compiled to an intermediate representation (IR) with a stable SHA-256 digest.
-- **Tenancy:** All workloads run under Authority-enforced scopes (`policy:*`, `findings:read`, `effective:write`). Only the Policy Engine identity may materialise effective findings collections.
-- **Consumption:** Findings ledger, Console, CLI, and Notify read the published `effective_finding_{policyId}` materialisations and policy run ledger (`policy_runs`).
-- **Offline parity:** Bundled policies import/export alongside advisories and VEX. In sealed mode the engine degrades gracefully, annotating explanations whenever cached signals replace live lookups.
-
----
-
-## 2 · High-Level Architecture
-
-```mermaid
-flowchart LR
-    subgraph Inputs
-        A[Scanner SBOMs<br/>Inventory & Usage]
-        B[Concelier Advisories<br/>Canonical linksets]
-        C[Excititor VEX<br/>Consensus status]
-        D[Policy Packs<br/>stella-dsl@1]
-    end
-    subgraph PolicyEngine["StellaOps.Policy.Engine"]
-        P1[DSL Compiler<br/>IR + Digest]
-        P2[Joiners<br/>SBOM ↔ Advisory ↔ VEX]
-        P3[Deterministic Evaluator<br/>Rule hits + scoring]
-        P4[Materialisers<br/>effective findings]
-        P5[Run Orchestrator<br/>Full & incremental]
-    end
-    subgraph Outputs
-        O1[Effective Findings Collections]
-        O2[Explain Traces<br/>Rule hit lineage]
-        O3[Metrics & Traces<br/>policy_run_seconds,<br/>rules_fired_total]
-        O4[Simulation/Preview Feeds<br/>CLI & Studio]
-    end
-
-    A --> P2
-    B --> P2
-    C --> P2
-    D --> P1 --> P3
-    P2 --> P3 --> P4 --> O1
-    P3 --> O2
-    P5 --> P3
-    P3 --> O3
-    P3 --> O4
-```
-
----
-
-## 3 · Core Concepts
-
-| Concept | Description |
-|---------|-------------|
-| **Policy Pack** | Versioned bundle of DSL documents, metadata, and checksum manifest. Packs import/export via CLI and Offline Kit bundles. |
-| **Policy Digest** | SHA-256 of the canonical IR; used for caching, explain trace attribution, and audit proofs. |
-| **Effective Findings** | Append-only Mongo collections (`effective_finding_{policyId}`) storing the latest verdict per finding, plus history sidecars. |
-| **Policy Run** | Execution record persisted in `policy_runs` capturing inputs, run mode, timings, and determinism hash. |
-| **Explain Trace** | Structured tree showing rule matches, data provenance, and scoring components for UI/CLI explain features. |
-| **Simulation** | Dry-run evaluation that compares a candidate pack against the active pack and produces verdict diffs without persisting results. |
-| **Incident Mode** | Elevated sampling/trace capture toggled automatically when SLOs breach; emits events for Notifier and Timeline Indexer. |
-
----
-
-## 4 · Inputs & Pre-processing
-
-### 4.1 SBOM Inventory
-
-- **Source:** Scanner.WebService publishes inventory/usage SBOMs plus BOM-Index (roaring bitmap) metadata.
-- **Consumption:** Policy joiners use the index to expand candidate components quickly, keeping evaluation under the `< 5 s` warm path budget.
-- **Schema:** CycloneDX Protobuf + JSON views; Policy Engine reads canonical projections via shared SBOM adapters.
-
-### 4.2 Advisory Corpus
-
-- **Source:** Concelier exports canonical advisories with deterministic identifiers, linksets, and equivalence tables.
-- **Contract:** Policy Engine only consumes raw `content.raw`, `identifiers`, and `linkset` fields per Aggregation-Only Contract (AOC); derived precedence remains a policy concern.
-
-### 4.3 VEX Evidence
-
-- **Source:** Excititor consensus service resolves OpenVEX / CSAF statements, preserving conflicts.
-- **Usage:** Policy rules can require specific VEX vendors or justification codes; evaluator records when cached evidence substitutes for live statements (sealed mode).
-
-### 4.4 Policy Packs
-
-- Authored in Policy Studio or CLI, validated against the `stella-dsl@1` schema.
-- Compiler performs canonicalisation (ordering, defaulting) before emitting IR and digest.
-- Packs bundle scoring profiles, allowlist metadata, and optional reachability weighting tables.
-
----
-
-## 5 · Evaluation Flow
-
-1. **Run selection** – Orchestrator accepts `full`, `incremental`, or `simulate` jobs. Incremental runs listen to change streams from Concelier, Excititor, and SBOM imports to scope re-evaluation.
-2. **Input staging** – Candidates fetched in deterministic batches; identity graph from Concelier strengthens PURL lookups.
-3. **Rule execution** – Evaluator walks rules in lexical order (first-match wins). Actions available: `block`, `ignore`, `warn`, `defer`, `escalate`, `requireVex`, each supporting quieting semantics where permitted.
-4. **Scoring** – `PolicyScoringConfig` applies severity, trust, reachability weights plus penalties (`warnPenalty`, `ignorePenalty`, `quietPenalty`).
-5. **Verdict and explain** – Engine constructs `PolicyVerdict` records with inputs, quiet flags, unknown confidence bands, and provenance markers; explain trees capture rule lineage.
-6. **Materialisation** – Effective findings collections are upserted append-only, stamped with run identifier, policy digest, and tenant.
-7. **Publishing** – Completed run writes to `policy_runs`, emits metrics (`policy_run_seconds`, `rules_fired_total`, `vex_overrides_total`), and raises events for Console/Notify subscribers.
-
----
-
-## 6 · Run Modes
-
-| Mode | Trigger | Scope | Persistence | Typical Use |
-|------|---------|-------|-------------|-------------|
-| **Full** | Manual CLI (`stella policy run`), scheduled nightly, or emergency rebaseline | Entire tenant | Writes effective findings and run record | After policy publish or major advisory/VEX import |
-| **Incremental** | Change-stream queue driven by Concelier/Excititor/SBOM deltas | Only affected artefacts | Writes effective findings and run record | Continuous upkeep; ensures SLA ≤ 5 min from source change |
-| **Simulate** | CLI/Studio preview, CI pipelines | Candidate subset (diff against baseline) | No materialisation; produces explain & diff payloads | Policy authoring, CI regression suites |
-
-All modes are cancellation-aware and checkpoint progress for replay in case of deployment restarts.
-
----
-
-## 7 · Outputs & Integrations
-
-- **APIs** – Minimal API exposes policy CRUD, run orchestration, explain fetches, and cursor-based listing of effective findings (see `/docs/api/policy.md` once published).
-- **CLI** – `stella policy simulate/run/show` commands surface JSON verdicts, exit codes, and diff summaries suitable for CI gating.
-- **Console / Policy Studio** – UI reads explain traces, policy metadata, approval workflow status, and simulation diffs to guide reviewers.
-- **Findings Ledger** – Effective findings feed downstream export, Notify, and risk scoring jobs.
-- **Air-gap bundles** – Offline Kit includes policy packs, scoring configs, and explain indexes; export commands generate DSSE-signed bundles for transfer.
-
----
-
-## 8 · Determinism & Guardrails
-
-- **Deterministic inputs** – All joins rely on canonical linksets and equivalence tables; batches are sorted, and random/wall-clock APIs are blocked by static analysis plus runtime guards (`ERR_POL_004`).
-- **Stable outputs** – Canonical JSON serializers sort keys; digests recorded in run metadata enable reproducible diffs across machines.
-- **Idempotent writes** – Materialisers upsert using `{policyId, findingId, tenant}` keys and retain prior versions with append-only history.
-- **Sandboxing** – Policy evaluation executes in-process with timeouts; restart-only plug-ins guarantee no runtime DLL injection.
-- **Compliance proof** – Every run stores digest of inputs (policy, SBOM batch, advisory snapshot) so auditors can replay decisions offline.
-
----
-
-## 9 · Security, Tenancy & Offline Notes
-
-- **Authority scopes:** Gateway enforces `policy:read`, `policy:write`, `policy:simulate`, `policy:runs`, `findings:read`, `effective:write`. Service identities must present DPoP-bound tokens.
-- **Tenant isolation:** Collections partition by tenant identifier; cross-tenant queries require explicit admin scopes and return audit warnings.
-- **Sealed mode:** In air-gapped deployments the engine surfaces `sealed=true` hints in explain traces, warning about cached EPSS/KEV data and suggesting bundle refreshes (see `docs/airgap/airgap-mode.md`).
-- **Observability:** Structured logs carry correlation IDs matching orchestrator job IDs; metrics integrate with OpenTelemetry exporters; sampled rule-hit logs redact policy secrets.
-- **Incident response:** Incident mode can be forced via API, boosting trace retention and notifying Notifier through `policy.incident.activated` events.
-
----
-
-## 10 · Working with Policy Packs
-
-1. **Author** in Policy Studio or edit DSL files locally. Validate with `stella policy lint`.
-2. **Simulate** against golden SBOM fixtures (`stella policy simulate --sbom fixtures/*.json`). Inspect explain traces for unexpected overrides.
-3. **Publish** via API or CLI; Authority enforces review/approval workflows (`draft → review → approve → rollout`).
-4. **Monitor** the subsequent incremental runs; if determinism diff fails in CI, roll back pack while investigating digests.
-5. **Bundle** packs for offline sites with `stella policy bundle export` and distribute via Offline Kit.
-
----
-
-## 11 · Compliance Checklist
-
-- [ ] **Scopes enforced:** Confirm gateway policy requires `policy:*` and `effective:write` scopes for all mutating endpoints.
-- [ ] **Determinism guard active:** Static analyzer blocks clock/RNG usage; CI determinism job diffing repeated runs passes.
-- [ ] **Materialisation audit:** Effective findings collections use append-only writers and retain history per policy run.
-- [ ] **Explain availability:** UI/CLI expose explain traces for every verdict; sealed-mode warnings display when cached evidence is used.
-- [ ] **Offline parity:** Policy bundles (import/export) tested in sealed environment; air-gap degradations documented for operators.
-- [ ] **Observability wired:** Metrics (`policy_run_seconds`, `rules_fired_total`, `vex_overrides_total`) and sampled rule hit logs emit to the shared telemetry pipeline with correlation IDs.
-- [ ] **Documentation synced:** API (`/docs/api/policy.md`), DSL grammar (`/docs/policy/dsl.md`), lifecycle (`/docs/policy/lifecycle.md`), and run modes (`/docs/policy/runs.md`) cross-link back to this overview.
-
----
-
-*Last updated: 2025-10-26 (Sprint 20).*
-
+# Policy System Overview
+
+> **Imposed rule:** Policies that change reachability or trust weighting must enter shadow mode first and ship coverage fixtures; promotion is blocked until shadow + coverage gates pass (see `docs/policy/lifecycle.md`).
+
+This overview orients authors, reviewers, and operators to the Stella Policy system: the SPL language, lifecycle, evidence inputs, and how policies are enforced online and in air-gapped sites.
+
+## 1. What the Policy System Does
+- Combines SBOM facts, advisories (Concelier), VEX claims (Excititor), reachability signals (Graphs + runtime), trust/entropy signals, and operator metadata to produce deterministic findings.
+- Produces explainable outputs: every verdict carries rule, rationale (`because`), inputs, and evidence hashes.
+- Works online or offline: policies, inputs, and outputs are content-addressed and can be replayed with no network.
+
+## 2. Layers
+- **SPL (Stella Policy Language):** declarative rules (`stella-dsl@1`) with profiles, maps, and rule blocks; no loops or network calls.
+- **Compiler:** canonicalises SPL, emits IR + hash; used by CLI, Console, and CI. Canonical hashes feed attestation and replay.
+- **Engine:** evaluates IR against SBOM, VEX, and reachability signals; outputs effective findings and an explain entry for every rule that fires.
+- **Attestation:** optional DSSE over policy IR and approval metadata; Rekor mirror when online.
+- **Distribution:** policy packs are versioned, tenant-scoped, and promoted via Authority scopes; Offline Kit includes packs + attestations.
+
+## 3. Inputs & Signals
+- SBOM inventory/usage (Scanner), advisories (Concelier), VEX (Excititor), reachability graphs/runtime (Signals), trust/entropy/uncertainty scores, secret-leak findings, environment metadata, and tenant policy defaults.
+- Signals dictionary (normalised): `trust_score`, `reachability.state`, `reachability.score`, `entropy_penalty`, `uncertainty.level`, `runtime_hits`.
+- All inputs must be content-addressed; missing fields evaluate to `unknown`/`null` and must be handled explicitly.
+
+## 4. Lifecycle (summary)
+1. Draft in SPL with shadow mode on and coverage fixtures (`stella policy test`).
+2. Submit with lint/simulate + coverage artefacts attached.
+3. Review/approve with Authority scopes; determinism and shadow gates enforced in CI.
+4. Publish/attest (DSSE + optional Rekor); promote to environments; activate runs.
+5. Archive or roll back with audit trail preserved.
+
+## 5. Governance & Roles
+- Scopes: `policy:author`, `policy:review`, `policy:approve`, `policy:operate`, `policy:publish`, `policy:activate`, `policy:audit`.
+- Two-person rule recommended for publish/promote; enforced by Authority per tenant.
+- AOC: the Aggregation-Only Contract applies to regulated tenants; UI/CLI must respect AOC flags on policies and evidence.
+
+## 6. Review Checklist (fast path)
+- Lint + simulate outputs attached and fresh (<24h).
+- Shadow mode enabled; coverage fixtures passing; twin-run determinism check green.
+- `because` present on every status/severity change; suppressions scoped.
+- Inputs handled explicitly when `unknown` (reachability/runtime missing).
+- Attestation metadata ready (reason, ticket, IR hash) if publish is requested.
+- AOC impact noted; air-gap replay steps documented if applicable.
+
+## 7. Air-gap / Offline Notes
+- Policy packs, attestations, and coverage fixtures ship in Offline Kits; no live feed calls allowed during evaluation.
+- CLI `stella policy simulate --sealed` enforces no-network; policy runs must use frozen SBOM/advisory/VEX bundles and reachability graphs.
+- Attestations and hashes recorded in Evidence Locker; Timeline events emitted on publish/activate.
+
+## 8. Key References
+- `docs/policy/dsl.md` (language)
+- `docs/policy/lifecycle.md` (process, gates)
+- `docs/policy/architecture.md` (engine internals)
+- `docs/modules/policy/implementation_plan.md`
+- `docs/policy/governance.md` (once published)
--- a/docs/policy/runtime.md
+++ b/docs/policy/runtime.md
@@ -0,0 +1,65 @@
+# Policy Runtime & Evaluation
+
+> **Imposed rule:** Runtime evaluations must use frozen inputs (SBOM, advisories, VEX, reachability, signals) and emit explain traces plus DSSE/attestation metadata; no live feed calls during evaluation.
+
+This document describes how SPL policies are compiled, cached, and executed, and how results are surfaced via APIs, CLI, UI, and observability.
+
+## 1. Components
+- **Compiler**: converts SPL (`stella-dsl@1`) into canonical IR JSON, hashes it, and validates lint/coverage. Produces IR cache used by Engine.
+- **Engine**: deterministic evaluator that consumes IR + inputs (SBOM, advisory, VEX, signals) and emits findings + explain traces.
+- **Caches**:
+  - IR cache keyed by `policyId`/`version`/IR hash.
+  - Input cursors (SBOM/advisory/VEX snapshots, reachability graphs) to guarantee replay.
+  - Explain trace cache for recently queried runs (TTL, tenant-scoped).
+- **Attestation**: optional DSSE over IR hash + approval metadata; Rekor mirror when online; stored alongside run outputs in Evidence Locker.
+
+## 2. Execution flow
+1. Resolve active policy version for tenant (or specified version for simulate).
+2. Load IR from cache; verify hash matches attested value if provided.
+3. Fetch frozen inputs via cursors: SBOM digest, advisory snapshot id, VEX set, reachability graph hash, signals bundle.
+4. Evaluate rules in priority order; record explain entries (rule, because, inputs, signals).
+5. Persist findings, explain traces, and run metadata (`runId`, `policyVersion`, hashes) to storage.
+6. Emit events: `policy.run.started`, `policy.run.completed`, `policy.run.failed`; optionally `policy.run.shadow` when `settings.shadow=true`.
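
Step 4 of the flow above (evaluate rules in priority order and record explain entries) can be sketched in Python. This is an illustration only, not the engine's implementation: `Rule`, `RunResult`, and `evaluate` are hypothetical names, rules are modelled as plain predicates rather than IR AST nodes, and the assumption that lower priority numbers run first is not stated in the source.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    name: str
    priority: int
    when: Callable[[dict], bool]  # predicate over the frozen inputs
    then: dict                    # field assignments applied when the rule fires
    because: str


@dataclass
class RunResult:
    finding: dict = field(default_factory=dict)
    explain: list = field(default_factory=list)


def evaluate(rules: list[Rule], inputs: dict) -> RunResult:
    result = RunResult()
    # Assumption: lower priority numbers run first. Ties break on rule name
    # so repeated runs over the same inputs yield the same explain ordering.
    for rule in sorted(rules, key=lambda r: (r.priority, r.name)):
        if rule.when(inputs):
            result.finding.update(rule.then)
            result.explain.append({"rule": rule.name, "because": rule.because})
    return result


rules = [
    Rule("trust_entropy_penalty", 30,
         lambda i: i["signals"]["trust_score"] < 0.4,
         {"severity": "critical"}, "Low trust score"),
    Rule("reachability_gate", 20,
         lambda i: i["signals"]["reachability"]["state"] == "reachable",
         {"status": "affected"}, "Reachable code path"),
]
inputs = {"signals": {"trust_score": 0.3, "reachability": {"state": "reachable"}}}
run = evaluate(rules, inputs)
print([e["rule"] for e in run.explain])  # ['reachability_gate', 'trust_entropy_penalty']
```

Sorting on a total key (priority, then name) is one way to keep the explain ordering deterministic regardless of the order in which rules were loaded.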
+
+## 3. Caching & determinism
+- IR cache warmed at publish; invalidated on new policy version.
+- Input cursors are mandatory; if any are missing, the run is blocked and the API returns `policy_inputs_unfrozen` (409).
+- Explain trace storage keeps deterministic ordering; capped by tenant quotas.
+- Shadow mode runs record findings but mark `enforced=false`; promotion blocked until shadow+coverage gates pass.
+
+## 4. APIs & CLI
+- API: `POST /policies/{id}/simulate`, `POST /policies/{id}/run`, `GET /policy-runs/{runId}` (findings + explain), `GET /policies/{id}/versions/{v}` (IR, hash, attestation refs).
+- CLI: `stella policy simulate`, `stella policy run`, `stella policy explain <runId> --format json|table`, `stella policy export --run <runId> --offline`.
+- Headers: `X-Stella-Tenant`, `X-Stella-Shadow` (optional), `If-None-Match` for IR cache revalidation.
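
As a rough illustration of the simulate call listed above, the following Python sketch assembles the request without sending it. The body shape follows the API reference (`docs/policy/api.md`); `build_simulate_request` is a hypothetical helper, the digest values are placeholders, and the sketch uses the endpoint path as written here without guessing at the host or base prefix.

```python
import json


def build_simulate_request(policy_id: str, tenant: str, inputs: dict, shadow: bool = True) -> dict:
    """Assemble (but do not send) a simulate call as a plain dict."""
    return {
        "method": "POST",
        "path": f"/policies/{policy_id}/simulate",
        "headers": {
            "Content-Type": "application/json",
            "X-Stella-Tenant": tenant,  # required tenant context
        },
        "body": json.dumps({"inputs": inputs, "settings": {"shadow": shadow}}, sort_keys=True),
    }


request = build_simulate_request(
    "example-policy",          # placeholder policy id
    "tenant-a",                # placeholder tenant
    {
        "sbom_digest": "sha256:<placeholder>",
        "advisory_snapshot": "<placeholder>",
        "vex_set": "<placeholder>",
        "reachability_hash": "sha256:<placeholder>",
        "signals_digest": "sha256:<placeholder>",
    },
)
print(request["path"])  # /policies/example-policy/simulate
```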
+
+## 5. Observability & SLOs
+- Metrics: `policy_runs_total{status}`, `policy_run_duration_seconds`, `policy_explain_cache_hits`, `policy_inputs_unfrozen_total`, `policy_shadow_runs_total`.
+- Logs include `policyId`, `version`, `runId`, `tenant`, `shadow`, `input_cursor` hashes.
+- Traces: span per run with events for rule evaluation batches; attributes include counts of rules fired and unknowns encountered.
+- SLOs (suggested):
+  - p95 policy run latency < 2s for simulate, < 10s for full run.
+  - Error budget: <0.5% failed runs per rolling 7d.
+  - Explain cache hit rate >80% for repeated queries.
+
+## 6. Failure modes & handling
+- **Inputs unfrozen**: return 409 with required cursors; emit `policy.inputs_unfrozen` event.
+- **Hash mismatch**: IR hash differs from attested; block run and emit `policy.ir_hash_mismatch` alert.
+- **Unknown signals**: if required signals are missing, treat them as `unknown` and optionally set `status=under_investigation`; flag in explain trace.
+- **Exceeded quotas**: explain storage or run count caps → 429 with `Retry-After`; run not executed.
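
The "inputs unfrozen" case above can be sketched as a pre-flight check. The cursor names come from the simulate body in the API reference and the `policy_inputs_unfrozen` code from its error list; the `check_inputs` helper, the `code` and `missing_cursors` extension members, and the exact response shape are assumptions made for illustration.

```python
REQUIRED_CURSORS = (
    "sbom_digest",
    "advisory_snapshot",
    "vex_set",
    "reachability_hash",
    "signals_digest",
)


def check_inputs(inputs: dict):
    """Return None when every cursor is present, else a Problem+JSON-style dict."""
    missing = [name for name in REQUIRED_CURSORS if not inputs.get(name)]
    if not missing:
        return None
    return {
        "title": "Policy inputs are not frozen",
        "status": 409,
        "code": "policy_inputs_unfrozen",  # hypothetical extension member
        "missing_cursors": missing,         # hypothetical extension member
    }


problem = check_inputs({"sbom_digest": "sha256:<placeholder>"})
print(problem["status"], problem["missing_cursors"])
```

Returning the list of missing cursors matches the "return 409 with required cursors" behaviour described above; the run itself would not start when the check returns a problem.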
+
+## 7. Offline / air-gap
+- All inputs fetched from Offline Kit bundles; no network during evaluate.
+- CLI `stella policy run --sealed --bundle <path>` loads IR, inputs, and signals from bundle; writes outputs + attestation-ready manifest.
+- Runs produce DSSE-ready payloads (`policy.run@1`) that can be signed later when connectivity is restored.
+
+## 8. Data model (high level)
+- `policy_runs`: `runId`, `policyId`, `version`, `tenant`, `shadow`, `input_cursors`, `ir_hash`, `attestation_ref`, `started_at`, `completed_at`, `status`, `stats` (rules fired, explains, unknowns), `storage_refs` (findings, explains).
+- `policy_findings`: flattened findings with references to explain entries.
+- `policy_explains`: rule-level explain traces with inputs, signals, because text.
+
+## 9. References
+- `docs/policy/dsl.md`
+- `docs/policy/lifecycle.md`
+- `docs/policy/architecture.md`
+- `docs/policy/overview.md`
+- `docs/reachability/DELIVERY_GUIDE.md`
--- a/docs/policy/spl-v1.md
+++ b/docs/policy/spl-v1.md
@@ -0,0 +1,116 @@
+# Stella Policy Language (SPL) v1
+
+> **Status:** Draft (2025-11)
+> **Imposed rule:** SPL packs must pass lint, simulate, shadow, and coverage gates before activation; IR hashes must be attested when published.
+
+This document defines the SPL v1 language: syntax, semantics, JSON schema, and examples used by the Policy Engine.
+
+## 1. Syntax summary
+- File-level directive: `policy "<name>" syntax "stella-dsl@1" { ... }`
+- Blocks: `metadata`, `profile <name> {}`, `settings {}`, `rule <name> [priority n] { when ... then ... because "..." }`
+- No loops, no network/clock access; pure, deterministic evaluation.
+
+## 2. JSON Schema (canonical IR)
+
+```jsonc
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "title": "Stella Policy Language v1",
+  "type": "object",
+  "required": ["policyId", "syntax", "rules"],
+  "properties": {
+    "policyId": {"type": "string"},
+    "syntax": {"const": "stella-dsl@1"},
+    "metadata": {"type": "object"},
+    "settings": {
+      "type": "object",
+      "properties": {
+        "shadow": {"type": "boolean"},
+        "default_status": {"type": "string"}
+      }
+    },
+    "profiles": {
+      "type": "object",
+      "additionalProperties": {
+        "type": "object",
+        "properties": {
+          "maps": {"type": "object"},
+          "env": {"type": "object"},
+          "scalars": {"type": "object"}
+        }
+      }
+    },
+    "rules": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "required": ["name", "when", "then"],
+        "properties": {
+          "name": {"type": "string"},
+          "priority": {"type": "integer", "minimum": 0},
+          "when": {"type": "object"},
+          "then": {"type": "array"},
+          "else": {"type": "array"},
+          "because": {"type": "string"}
+        }
+      }
+    }
+  }
+}
+```
+
+Notes:
+- The compiler emits canonical IR JSON sorted by keys; hashing uses this canonical form.
+- `when` and actions are expressed as AST nodes; see engine schema for exact shape.
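
The canonical-form note above can be illustrated with a minimal Python sketch. It assumes canonicalisation means sorted keys with compact separators in UTF-8, and SHA-256 as the digest; the source does not name the actual scheme, so both are assumptions, and `ir_hash` is a hypothetical helper.

```python
import hashlib
import json


def ir_hash(ir: dict) -> str:
    """Hash an IR document so that key order does not affect the result (assumed scheme)."""
    # Sorted keys and fixed separators give one byte sequence per logical document.
    canonical = json.dumps(ir, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"policyId": "demo", "syntax": "stella-dsl@1", "rules": []}
b = {"rules": [], "syntax": "stella-dsl@1", "policyId": "demo"}
print(ir_hash(a) == ir_hash(b))  # True
```

Because the hash depends only on the canonical bytes, the same policy compiled by CLI, Console, or CI would yield the same value under this scheme, which is what attestation and replay rely on.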
+
+## 3. Built-in functions (v1)
+- `normalize_cvss`, `cvss`, `severity_band`, `risk_score`, `reach_state`, `exists`, `coalesce`, `percent_of`, `lowercase`.
+- VEX helpers: `vex.any`, `vex.all`, `vex.latest`.
+- Secrets helpers: `secret.hasFinding`, `secret.match.count`, `secret.bundle.version`, `secret.mask.applied`, `secret.path.allowlist`.
+- Signals: access via `signals.trust_score`, `signals.reachability.state`, `signals.reachability.score`, `signals.entropy_penalty`, `signals.uncertainty.level`, `signals.runtime_hits`.
+
+## 4. Data namespaces
+- `sbom`, `advisory`, `vex`, `run`, `env`, `telemetry`, `signals`, `secret`, `profile.*`.
+- Missing fields evaluate to `null`/`unknown`; comparisons must handle `unknown` explicitly.
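
One way to picture explicit `unknown` handling is three-valued logic, sketched here in Python with `None` standing in for `unknown`. The source does not define how SPL combines `unknown` with `and`/`or`, so the Kleene-style semantics and the helper names `eq`, `gte`, and `and3` are assumptions.

```python
from typing import Optional


def gte(value: Optional[float], threshold: float) -> Optional[bool]:
    """Return None (unknown) rather than guessing when the value is missing."""
    if value is None:
        return None
    return value >= threshold


def eq(value, expected) -> Optional[bool]:
    if value is None:
        return None
    return value == expected


def and3(left: Optional[bool], right: Optional[bool]) -> Optional[bool]:
    """Kleene AND: False dominates; otherwise unknown propagates."""
    if left is False or right is False:
        return False
    if left is None or right is None:
        return None
    return True


reach = {"state": "reachable", "score": None}  # score missing from the signals bundle
verdict = and3(eq(reach["state"], "reachable"), gte(reach["score"], 0.6))
print(verdict)  # None: the condition is neither met nor ruled out
```

Under these assumed semantics a rule author has to decide what an unknown condition means (for example, routing to `under_investigation`) instead of letting it silently count as false.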
+
+## 5. Examples
+
+### 5.1 Reachability-aware gate
+```dsl
+rule reachability_gate priority 20 {
+  when signals.reachability.state == "reachable" and signals.reachability.score >= 0.6
+  then status := "affected"
+  because "Runtime/graph evidence shows reachable code path";
+}
+```
+
+### 5.2 Trust/entropy penalty
+```dsl
+rule trust_entropy_penalty priority 30 {
+  when signals.trust_score < 0.4 or signals.entropy_penalty > 0.2
+  then severity := severity_band("critical")
+  because "Low trust score or high entropy";
+}
+```
+
+### 5.3 Shadow mode on
+```dsl
+settings {
+  shadow = true
+}
+```
+
+## 6. Authoring workflow (quick)
+1. Write/update SPL with shadow enabled; add coverage fixtures.
+2. Run `stella policy lint`, `stella policy simulate`, and `stella policy test`.
+3. Attach artefacts to submission; ensure determinism twin-run passes in CI.
+4. Publish with DSSE attestation (IR hash + metadata) and promote to environments.
+
+## 7. Compatibility
+- SPL v1 aligns with `stella-dsl@1` grammar. Future SPL versions will be additive; declare `syntax` explicitly.
+
+## 8. References
+- `docs/policy/dsl.md`
+- `docs/policy/lifecycle.md`
+- `docs/policy/architecture.md`
+- `docs/policy/overview.md`
--- a/docs/policy/ui-integration.md
+++ b/docs/policy/ui-integration.md
@@ -0,0 +1,41 @@
+# Policy UI Integration for Graph/Vuln
+
+Status: Draft (2025-11-26) — aligns with POLICY-ENGINE-30-001..003 and Graph API overlays.
+
+## Goals
+- Explain how UI surfaces (Console, Vuln Explorer) consume policy/VEX overlays from Graph.
+- Clarify cache usage, simulator contracts, and explain traces.
+
+## Data sources
+- Policy overlays (`policy.overlay.v1`) produced by Policy Engine (POLICY-ENGINE-30-001).
+- VEX overlays (`openvex.v1`) from Concelier/Excititor pipelines.
+- Graph API emits overlays per node (see `docs/api/graph.md`) with deterministic IDs and optional `explainTrace` sampling.
+
+## Cache rules
+- UI should respect overlay cache TTL (5–10 minutes). Cache key: tenant + nodeId + overlay kind.
+- On cache miss, fall back to the Graph API, which populates the cache; avoid fan-out calls per tile.
+- When policy overlay contract version changes, invalidate cache via version tag (e.g., `policy.overlay.v1` → `v2`).
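
The cache rules above can be sketched as a small TTL cache in Python. `OverlayCache` is a hypothetical client-side helper, not part of the Console or Graph API; the 300-second default is the low end of the stated 5–10 minute TTL, and carrying the contract version inside the overlay kind is one assumed way to implement version-tag invalidation.

```python
import time
from typing import Callable, Optional


class OverlayCache:
    def __init__(self, ttl_seconds: float = 300.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict = {}

    @staticmethod
    def key(tenant: str, node_id: str, kind: str) -> tuple:
        # `kind` includes the contract version (e.g. "policy.overlay.v1"),
        # so a move to v2 misses every v1 entry without an explicit purge.
        return (tenant, node_id, kind)

    def put(self, tenant: str, node_id: str, kind: str, overlay: dict) -> None:
        self._entries[self.key(tenant, node_id, kind)] = (self._clock() + self._ttl, overlay)

    def get(self, tenant: str, node_id: str, kind: str) -> Optional[dict]:
        k = self.key(tenant, node_id, kind)
        entry = self._entries.get(k)
        if entry is None:
            return None
        expires_at, overlay = entry
        if self._clock() >= expires_at:
            del self._entries[k]  # expired: caller falls back to the Graph API
            return None
        return overlay


cache = OverlayCache()
cache.put("tenant-a", "node-1", "policy.overlay.v1", {"status": "warn"})
print(cache.get("tenant-a", "node-1", "policy.overlay.v1"))  # {'status': 'warn'}
print(cache.get("tenant-a", "node-1", "policy.overlay.v2"))  # None
```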
+
+## Requests
+- Graph API: `includeOverlays=true` on `/graph/query` or `/graph/paths` to receive overlay payloads inline.
+- Budget: ensure `budget.tiles` leaves room for overlays; UI may need to request higher budgets when overlays are critical to UX.
+- Simulator: when running policy simulator, attach `X-Stella-Simulator: true` header (once enabled) to route to simulator instance; cache should be bypassed for simulator runs.
+
+## UI rendering guidance
+- Show policy status badge (e.g., `warn`, `deny`, `allow`) with ruleId and severity.
+- If `explainTrace` present, render as expandable list; only one sampled node per query may include trace.
+- VEX overlays: render status (`not_affected`, `affected`) and justification; show issued timestamp and source.
+- Overlay provenance: display `overlayId`, version, and source engine version if present.
+
+## Error handling
+- If Graph returns `GRAPH_BUDGET_EXCEEDED`, prompt user to reduce scope or increase budgets; do not silently drop overlays.
+- On overlay cache miss + upstream failure, surface a non-blocking warning and proceed with node data.
+
+## Events & notifications
+- Subscribe to `policy.overlay.updated` (future) or re-poll every 10 minutes to refresh overlays in UI.
+- When VEX status changes, UI should refresh impacted nodes/edges and reflect new status badges.
+
+## References
+- Policy overlay contract: `docs/modules/policy/prep/2025-11-22-policy-engine-30-001-prep.md`
+- Graph API overlays: `docs/api/graph.md`, `docs/modules/graph/architecture-index.md`
+- Concelier/Excititor overlays: `docs/modules/excititor/vex_observations.md`