# Policy Engine Overview

> **Goal:** Evaluate organisation policies deterministically against scanner SBOMs, Concelier advisories, and Excititor VEX evidence, then publish effective findings that downstream services can trust.

This document introduces the v2 Policy Engine: how the service fits into Stella Ops, the artefacts it produces, the contracts it honours, and the guardrails that keep policy decisions reproducible across air-gapped and connected deployments.

---

## 1 · Role in the Platform

- **Purpose:** Compose policy verdicts by reconciling SBOM inventory, advisory metadata, VEX statements, and organisation rules.
- **Form factor:** Dedicated `.NET 10` Minimal API host (`StellaOps.Policy.Engine`) plus worker orchestration. Policies are defined in `stella-dsl@1` packs compiled to an intermediate representation (IR) with a stable SHA-256 digest.
- **Tenancy:** All workloads run under Authority-enforced scopes (`policy:*`, `findings:read`, `effective:write`). Only the Policy Engine identity may materialise effective findings collections.
- **Consumption:** Findings ledger, Console, CLI, and Notify read the published `effective_finding_{policyId}` materialisations and policy run ledger (`policy_runs`).
- **Offline parity:** Bundled policies import/export alongside advisories and VEX. In sealed mode the engine degrades gracefully, annotating explanations whenever cached signals replace live lookups.

---

## 2 · High-Level Architecture

```mermaid
flowchart LR
    subgraph Inputs
        A[Scanner SBOMs<br/>Inventory & Usage]
        B[Concelier Advisories<br/>Canonical linksets]
        C[Excititor VEX<br/>Consensus status]
        D[Policy Packs<br/>stella-dsl@1]
    end

    subgraph PolicyEngine["StellaOps.Policy.Engine"]
        P1[DSL Compiler<br/>IR + Digest]
        P2[Joiners<br/>SBOM ↔ Advisory ↔ VEX]
        P3[Deterministic Evaluator<br/>Rule hits + scoring]
        P4[Materialisers<br/>effective findings]
        P5[Run Orchestrator<br/>Full & incremental]
    end

    subgraph Outputs
        O1[Effective Findings Collections]
        O2[Explain Traces<br/>Rule hit lineage]
        O3[Metrics & Traces<br/>policy_run_seconds,<br/>rules_fired_total]
        O4[Simulation/Preview Feeds<br/>CLI & Studio]
    end

    A --> P2
    B --> P2
    C --> P2
    D --> P1 --> P3
    P2 --> P3 --> P4 --> O1
    P3 --> O2
    P5 --> P3
    P3 --> O3
    P3 --> O4
```

---

## 3 · Core Concepts

| Concept | Description |
|---------|-------------|
| **Policy Pack** | Versioned bundle of DSL documents, metadata, and checksum manifest. Packs import/export via CLI and Offline Kit bundles. |
| **Policy Digest** | SHA-256 of the canonical IR; used for caching, explain trace attribution, and audit proofs. |
| **Effective Findings** | Append-only Mongo collections (`effective_finding_{policyId}`) storing the latest verdict per finding, plus history sidecars. |
| **Policy Run** | Execution record persisted in `policy_runs` capturing inputs, run mode, timings, and determinism hash. |
| **Explain Trace** | Structured tree showing rule matches, data provenance, and scoring components for UI/CLI explain features. |
| **Simulation** | Dry-run evaluation that compares a candidate pack against the active pack and produces verdict diffs without persisting results. |
| **Incident Mode** | Elevated sampling/trace capture toggled automatically when SLOs breach; emits events for Notifier and Timeline Indexer. |

---

## 4 · Inputs & Pre-processing

### 4.1 SBOM Inventory

- **Source:** Scanner.WebService publishes inventory/usage SBOMs plus BOM-Index (roaring bitmap) metadata.
- **Consumption:** Policy joiners use the index to expand candidate components quickly, keeping evaluation under the `< 5 s` warm path budget.
- **Schema:** CycloneDX Protobuf + JSON views; Policy Engine reads canonical projections via shared SBOM adapters.

### 4.2 Advisory Corpus

- **Source:** Concelier exports canonical advisories with deterministic identifiers, linksets, and equivalence tables.
- **Contract:** Policy Engine only consumes raw `content.raw`, `identifiers`, and `linkset` fields per Aggregation-Only Contract (AOC); derived precedence remains a policy concern.

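
As an illustration of the advisory contract above, the following minimal sketch projects an advisory document down to the three AOC-permitted fields. It is not the engine's adapter code: the function name `project_advisory`, the sample document shape, the `derived_severity` field, and the reading of `content.raw` as a nested path are all assumptions made for the example.

```python
from typing import Any, Mapping

# Field paths named by the contract; treating `content.raw` as a nested path is an assumption.
AOC_ALLOWED_PATHS = ("content.raw", "identifiers", "linkset")


def project_advisory(document: Mapping[str, Any]) -> dict[str, Any]:
    """Return a new dict holding only the AOC-permitted fields of `document`."""
    projected: dict[str, Any] = {}
    for path in AOC_ALLOWED_PATHS:
        keys = path.split(".")
        source: Any = document
        found = True
        # Walk the dotted path; skip the field entirely if any segment is missing.
        for key in keys:
            if isinstance(source, Mapping) and key in source:
                source = source[key]
            else:
                found = False
                break
        if not found:
            continue
        # Rebuild the same nesting in the projected document.
        target = projected
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = source
    return projected


# Hypothetical advisory document; only the three permitted fields should survive.
advisory = {
    "content": {"raw": {"id": "ADV-1"}, "format": "osv"},
    "identifiers": {"aliases": ["CVE-2025-0001"]},
    "linkset": {"purls": ["pkg:npm/left-pad@1.3.0"]},
    "derived_severity": "high",  # hypothetical derived field, dropped by the projection
}

print(project_advisory(advisory))
```

Keeping the allowed paths in one tuple makes the contract easy to audit: anything not listed, such as a derived severity, never reaches policy rules through this path.
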
### 4.3 VEX Evidence

- **Source:** Excititor consensus service resolves OpenVEX / CSAF statements, preserving conflicts.
- **Usage:** Policy rules can require specific VEX vendors or justification codes; evaluator records when cached evidence substitutes for live statements (sealed mode).

### 4.4 Policy Packs

- Authored in Policy Studio or CLI, validated against the `stella-dsl@1` schema.
- Compiler performs canonicalisation (ordering, defaulting) before emitting IR and digest.
- Packs bundle scoring profiles, allowlist metadata, and optional reachability weighting tables.

---

## 5 · Evaluation Flow

1. **Run selection** – Orchestrator accepts `full`, `incremental`, or `simulate` jobs. Incremental runs listen to change streams from Concelier, Excititor, and SBOM imports to scope re-evaluation.
2. **Input staging** – Candidates fetched in deterministic batches; identity graph from Concelier strengthens PURL lookups.
3. **Rule execution** – Evaluator walks rules in lexical order (first-match wins). Actions available: `block`, `ignore`, `warn`, `defer`, `escalate`, `requireVex`, each supporting quieting semantics where permitted.
4. **Scoring** – `PolicyScoringConfig` applies severity, trust, reachability weights plus penalties (`warnPenalty`, `ignorePenalty`, `quietPenalty`).
5. **Verdict and explain** – Engine constructs `PolicyVerdict` records with inputs, quiet flags, unknown confidence bands, and provenance markers; explain trees capture rule lineage.
6. **Materialisation** – Effective findings collections are upserted append-only, stamped with run identifier, policy digest, and tenant.
7. **Publishing** – Completed run writes to `policy_runs`, emits metrics (`policy_run_seconds`, `rules_fired_total`, `vex_overrides_total`), and raises events for Console/Notify subscribers.

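
Steps 3 and 4 of the flow above can be sketched roughly as follows. This is an illustration only, not the engine's evaluator: it assumes that "lexical order" means sorting by rule name, that the score is the product of the severity, trust, and reachability weights minus the applicable penalties, and that findings are flat records. The `Rule` and `ScoringConfig` types, the rule names, the penalty values, and the finding fields are all hypothetical; only the action names and the three penalty names come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Finding = dict  # hypothetical flat record; real verdict inputs are richer


@dataclass(frozen=True)
class Rule:
    name: str
    action: str  # one of: block, ignore, warn, defer, escalate, requireVex
    matches: Callable[[Finding], bool]


@dataclass(frozen=True)
class ScoringConfig:
    # Hypothetical stand-in for PolicyScoringConfig; the values are illustrative.
    warn_penalty: float = 0.5
    ignore_penalty: float = 1.0
    quiet_penalty: float = 0.25


def first_match(rules: Sequence[Rule], finding: Finding) -> Optional[Rule]:
    """Walk rules sorted by name and return the first one whose predicate matches."""
    for rule in sorted(rules, key=lambda r: r.name):
        if rule.matches(finding):
            return rule
    return None


def score(finding: Finding, rule: Optional[Rule], config: ScoringConfig, quiet: bool = False) -> float:
    """Assumed formula: product of the three weights minus penalties, floored at zero."""
    base = finding["severity"] * finding["trust"] * finding["reachability"]
    penalty = 0.0
    if rule is not None and rule.action == "warn":
        penalty += config.warn_penalty
    if rule is not None and rule.action == "ignore":
        penalty += config.ignore_penalty
    if quiet:
        penalty += config.quiet_penalty
    return max(base - penalty, 0.0)


# Declared out of order on purpose: evaluation order comes from the sort, not the list.
RULES = [
    Rule("20-warn-low-severity", "warn", lambda f: f["severity"] < 4.0),
    Rule("10-block-known-exploited", "block", lambda f: f["kev"]),
    Rule("30-ignore-not-affected", "ignore", lambda f: f["vex"] == "not_affected"),
]

critical = {"severity": 8.0, "trust": 1.0, "reachability": 0.5, "kev": True, "vex": "not_affected"}
minor = {"severity": 2.0, "trust": 1.0, "reachability": 0.5, "kev": False, "vex": "affected"}

config = ScoringConfig()
for finding in (critical, minor):
    hit = first_match(RULES, finding)
    print(hit.action if hit else "no-match", score(finding, hit, config))
```

The `critical` finding also satisfies the `ignore` rule, but the `block` rule sorts first and wins, which is the behaviour that first-match semantics are meant to guarantee regardless of declaration order.
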
---

## 6 · Run Modes

| Mode | Trigger | Scope | Persistence | Typical Use |
|------|---------|-------|-------------|-------------|
| **Full** | Manual CLI (`stella policy run`), scheduled nightly, or emergency rebaseline | Entire tenant | Writes effective findings and run record | After policy publish or major advisory/VEX import |
| **Incremental** | Change-stream queue driven by Concelier/Excititor/SBOM deltas | Only affected artefacts | Writes effective findings and run record | Continuous upkeep; ensures SLA ≤ 5 min from source change |
| **Simulate** | CLI/Studio preview, CI pipelines | Candidate subset (diff against baseline) | No materialisation; produces explain & diff payloads | Policy authoring, CI regression suites |

All modes are cancellation-aware and checkpoint progress for replay in case of deployment restarts.

---

## 7 · Outputs & Integrations

- **APIs** – Minimal API exposes policy CRUD, run orchestration, explain fetches, and cursor-based listing of effective findings (see `/docs/api/policy.md` once published).
- **CLI** – `stella policy simulate/run/show` commands surface JSON verdicts, exit codes, and diff summaries suitable for CI gating.
- **Console / Policy Studio** – UI reads explain traces, policy metadata, approval workflow status, and simulation diffs to guide reviewers.
- **Findings Ledger** – Effective findings feed downstream export, Notify, and risk scoring jobs.
- **Air-gap bundles** – Offline Kit includes policy packs, scoring configs, and explain indexes; export commands generate DSSE-signed bundles for transfer.

---

## 8 · Determinism & Guardrails

- **Deterministic inputs** – All joins rely on canonical linksets and equivalence tables; batches are sorted, and random/wall-clock APIs are blocked by static analysis plus runtime guards (`ERR_POL_004`).
- **Stable outputs** – Canonical JSON serializers sort keys; digests recorded in run metadata enable reproducible diffs across machines.
- **Idempotent writes** – Materialisers upsert using `{policyId, findingId, tenant}` keys and retain prior versions with append-only history.
- **Sandboxing** – Policy evaluation executes in-process with timeouts; restart-only plug-ins guarantee no runtime DLL injection.
- **Compliance proof** – Every run stores digest of inputs (policy, SBOM batch, advisory snapshot) so auditors can replay decisions offline.

---

## 9 · Security, Tenancy & Offline Notes

- **Authority scopes:** Gateway enforces `policy:read`, `policy:write`, `policy:simulate`, `policy:runs`, `findings:read`, `effective:write`. Service identities must present DPoP-bound tokens.
- **Tenant isolation:** Collections partition by tenant identifier; cross-tenant queries require explicit admin scopes and return audit warnings.
- **Sealed mode:** In air-gapped deployments the engine surfaces `sealed=true` hints in explain traces, warning about cached EPSS/KEV data and suggesting bundle refreshes (see `docs/airgap/EPIC_16_AIRGAP_MODE.md` §3.7).
- **Observability:** Structured logs carry correlation IDs matching orchestrator job IDs; metrics integrate with OpenTelemetry exporters; sampled rule-hit logs redact policy secrets.
- **Incident response:** Incident mode can be forced via API, boosting trace retention and notifying Notifier through `policy.incident.activated` events.

---

## 10 · Working with Policy Packs

1. **Author** in Policy Studio or edit DSL files locally. Validate with `stella policy lint`.
2. **Simulate** against golden SBOM fixtures (`stella policy simulate --sbom fixtures/*.json`). Inspect explain traces for unexpected overrides.
3. **Publish** via API or CLI; Authority enforces review/approval workflows (`draft → review → approve → rollout`).
4. **Monitor** the subsequent incremental runs; if determinism diff fails in CI, roll back pack while investigating digests.
5. **Bundle** packs for offline sites with `stella policy bundle export` and distribute via Offline Kit.

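
The determinism diff mentioned in the Monitor step can be approximated with canonical JSON and SHA-256, as in the sketch below. It assumes verdicts are plain records keyed by `findingId`; the `verdict` and `score` fields, the function names, and the exact canonical form (sorted keys, compact separators, UTF-8) are illustrative choices rather than the engine's actual serializer.

```python
import hashlib
import json
from typing import Any


def canonical_json(value: Any) -> bytes:
    """Serialise with sorted keys and fixed separators so equal data yields equal bytes."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def determinism_hash(verdicts: list[dict]) -> str:
    """Hash a run's verdicts independently of list order and key order."""
    ordered = sorted(verdicts, key=lambda v: v["findingId"])
    return hashlib.sha256(canonical_json(ordered)).hexdigest()


def determinism_diff(run_a: list[dict], run_b: list[dict]) -> list[str]:
    """Return the sorted finding ids whose verdicts differ between two runs."""
    a = {v["findingId"]: canonical_json(v) for v in run_a}
    b = {v["findingId"]: canonical_json(v) for v in run_b}
    return sorted(fid for fid in a.keys() | b.keys() if a.get(fid) != b.get(fid))


first_run = [
    {"findingId": "f-1", "verdict": "block", "score": 4.0},
    {"findingId": "f-2", "verdict": "warn", "score": 0.5},
]
# Same verdicts as first_run, but with a different list order and key order.
second_run = [
    {"score": 0.5, "verdict": "warn", "findingId": "f-2"},
    {"verdict": "block", "findingId": "f-1", "score": 4.0},
]
# One verdict changed between runs: this is what a determinism failure looks like.
drifted_run = [dict(first_run[0]), {**first_run[1], "verdict": "ignore"}]

print(determinism_hash(first_run) == determinism_hash(second_run))  # True
print(determinism_diff(first_run, drifted_run))  # ['f-2']
```

A CI job built along these lines would compare the hashes first and only compute the per-finding diff when they disagree, which narrows the investigation to the findings that actually drifted.
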
---

## 11 · Compliance Checklist

- [ ] **Scopes enforced:** Confirm gateway policy requires `policy:*` and `effective:write` scopes for all mutating endpoints.
- [ ] **Determinism guard active:** Static analyzer blocks clock/RNG usage; CI determinism job diffing repeated runs passes.
- [ ] **Materialisation audit:** Effective findings collections use append-only writers and retain history per policy run.
- [ ] **Explain availability:** UI/CLI expose explain traces for every verdict; sealed-mode warnings display when cached evidence is used.
- [ ] **Offline parity:** Policy bundles (import/export) tested in sealed environment; air-gap degradations documented for operators.
- [ ] **Observability wired:** Metrics (`policy_run_seconds`, `rules_fired_total`, `vex_overrides_total`) and sampled rule hit logs emit to the shared telemetry pipeline with correlation IDs.
- [ ] **Documentation synced:** API (`/docs/api/policy.md`), DSL grammar (`/docs/policy/dsl.md`), lifecycle (`/docs/policy/lifecycle.md`), and run modes (`/docs/policy/runs.md`) cross-link back to this overview.

---

*Last updated: 2025-10-26 (Sprint 20).*