Policy Engine Overview
Goal: Evaluate organisation policies deterministically against scanner SBOMs, Concelier advisories, and Excititor VEX evidence, then publish effective findings that downstream services can trust.
This document introduces the v2 Policy Engine: how the service fits into Stella Ops, the artefacts it produces, the contracts it honours, and the guardrails that keep policy decisions reproducible across air-gapped and connected deployments.
1 · Role in the Platform
- Purpose: Compose policy verdicts by reconciling SBOM inventory, advisory metadata, VEX statements, and organisation rules.
- Form factor: Dedicated .NET 10 Minimal API host (StellaOps.Policy.Engine) plus worker orchestration. Policies are defined in stella-dsl@1 packs compiled to an intermediate representation (IR) with a stable SHA-256 digest.
- Tenancy: All workloads run under Authority-enforced scopes (policy:*, findings:read, effective:write). Only the Policy Engine identity may materialise effective findings collections.
- Consumption: Findings ledger, Console, CLI, and Notify read the published effective_finding_{policyId} materialisations and the policy run ledger (policy_runs).
- Offline parity: Bundled policies import/export alongside advisories and VEX. In sealed mode the engine degrades gracefully, annotating explanations whenever cached signals replace live lookups.
2 · High-Level Architecture
flowchart LR
    subgraph Inputs
        A[Scanner SBOMs<br/>Inventory & Usage]
        B[Concelier Advisories<br/>Canonical linksets]
        C[Excititor VEX<br/>Consensus status]
        D[Policy Packs<br/>stella-dsl@1]
    end
    subgraph PolicyEngine["StellaOps.Policy.Engine"]
        P1[DSL Compiler<br/>IR + Digest]
        P2[Joiners<br/>SBOM ↔ Advisory ↔ VEX]
        P3[Deterministic Evaluator<br/>Rule hits + scoring]
        P4[Materialisers<br/>effective findings]
        P5[Run Orchestrator<br/>Full & incremental]
    end
    subgraph Outputs
        O1[Effective Findings Collections]
        O2[Explain Traces<br/>Rule hit lineage]
        O3[Metrics & Traces<br/>policy_run_seconds,<br/>rules_fired_total]
        O4[Simulation/Preview Feeds<br/>CLI & Studio]
    end
    A --> P2
    B --> P2
    C --> P2
    D --> P1 --> P3
    P2 --> P3 --> P4 --> O1
    P3 --> O2
    P5 --> P3
    P3 --> O3
    P3 --> O4
3 · Core Concepts
| Concept | Description | 
|---|---|
| Policy Pack | Versioned bundle of DSL documents, metadata, and checksum manifest. Packs import/export via CLI and Offline Kit bundles. | 
| Policy Digest | SHA-256 of the canonical IR; used for caching, explain trace attribution, and audit proofs. | 
| Effective Findings | Append-only Mongo collections (effective_finding_{policyId}) storing the latest verdict per finding, plus history sidecars. | 
| Policy Run | Execution record persisted in policy_runs, capturing inputs, run mode, timings, and determinism hash. | 
| Explain Trace | Structured tree showing rule matches, data provenance, and scoring components for UI/CLI explain features. | 
| Simulation | Dry-run evaluation that compares a candidate pack against the active pack and produces verdict diffs without persisting results. | 
| Incident Mode | Elevated sampling/trace capture toggled automatically when SLOs breach; emits events for Notifier and Timeline Indexer. | 
4 · Inputs & Pre-processing
4.1 SBOM Inventory
- Source: Scanner.WebService publishes inventory/usage SBOMs plus BOM-Index (roaring bitmap) metadata.
- Consumption: Policy joiners use the index to expand candidate components quickly, keeping evaluation within the < 5 s warm-path budget.
- Schema: CycloneDX Protobuf + JSON views; Policy Engine reads canonical projections via shared SBOM adapters.
4.2 Advisory Corpus
- Source: Concelier exports canonical advisories with deterministic identifiers, linksets, and equivalence tables.
- Contract: Policy Engine consumes only the raw content.raw, identifiers, and linkset fields per the Aggregation-Only Contract (AOC); derived precedence remains a policy concern.
4.3 VEX Evidence
- Source: Excititor consensus service resolves OpenVEX / CSAF statements, preserving conflicts.
- Usage: Policy rules can require specific VEX vendors or justification codes; evaluator records when cached evidence substitutes for live statements (sealed mode).
4.4 Policy Packs
- Authored in Policy Studio or CLI, validated against the stella-dsl@1 schema.
- Compiler performs canonicalisation (ordering, defaulting) before emitting IR and digest.
- Packs bundle scoring profiles, allowlist metadata, and optional reachability weighting tables.
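The canonicalise-then-digest step can be illustrated with a minimal sketch. It assumes the IR can be represented as a JSON-compatible dictionary and that canonicalisation amounts to sorted keys and compact separators; the real compiler's canonical form is not specified here, and `policy_digest` is a hypothetical helper name.

```python
import hashlib
import json


def policy_digest(ir: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of the IR."""
    canonical = json.dumps(
        ir,
        sort_keys=True,         # key order must not influence the digest
        separators=(",", ":"),  # drop insignificant whitespace
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Two IRs that differ only in key order produce the same digest.
ir_a = {"syntax": "stella-dsl@1", "rules": [{"name": "block-critical", "action": "block"}]}
ir_b = {"rules": [{"action": "block", "name": "block-critical"}], "syntax": "stella-dsl@1"}
same_digest = policy_digest(ir_a) == policy_digest(ir_b)
```

List order is deliberately left untouched in this sketch, since rule order is significant during evaluation.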
5 · Evaluation Flow
- Run selection – Orchestrator accepts full, incremental, or simulate jobs. Incremental runs listen to change streams from Concelier, Excititor, and SBOM imports to scope re-evaluation.
- Input staging – Candidates fetched in deterministic batches; identity graph from Concelier strengthens PURL lookups.
- Rule execution – Evaluator walks rules in lexical order (first-match wins). Actions available: block, ignore, warn, defer, escalate, requireVex, each supporting quieting semantics where permitted.
- Scoring – PolicyScoringConfig applies severity, trust, and reachability weights plus penalties (warnPenalty, ignorePenalty, quietPenalty).
- Verdict and explain – Engine constructs PolicyVerdict records with inputs, quiet flags, unknown confidence bands, and provenance markers; explain trees capture rule lineage.
- Materialisation – Effective findings collections are upserted append-only, stamped with run identifier, policy digest, and tenant.
- Publishing – Completed run writes to policy_runs, emits metrics (policy_run_seconds, rules_fired_total, vex_overrides_total), and raises events for Console/Notify subscribers.
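The first-match-wins rule walk from the flow above can be sketched as follows. This is an illustration only: the `Rule` type, the `evaluate` function, and the rule names are hypothetical, and the rule list is assumed to arrive already in the pack's lexical order.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    action: str  # block, ignore, warn, defer, escalate, or requireVex
    matches: Callable[[dict], bool]


def evaluate(rules: list, finding: dict) -> tuple:
    """Walk rules in the given order; the first matching rule decides the action.

    Returns (action or None, trace), where trace lists every rule visited.
    """
    trace = []
    for rule in rules:
        hit = rule.matches(finding)
        trace.append(f"{rule.name}: {'hit' if hit else 'miss'}")
        if hit:
            return rule.action, trace
    return None, trace


rules = [
    Rule("vex-not-affected", "ignore", lambda f: f.get("vex") == "not_affected"),
    Rule("critical-severity", "block", lambda f: f.get("severity") == "critical"),
    Rule("default", "warn", lambda f: True),
]

action, trace = evaluate(rules, {"severity": "critical", "vex": "not_affected"})
```

Because evaluation stops at the first hit, the VEX rule shadows the severity rule for this finding; the trace records only the rules actually visited, which is the kind of lineage an explain tree would surface.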
6 · Run Modes
| Mode | Trigger | Scope | Persistence | Typical Use | 
|---|---|---|---|---|
| Full | Manual CLI (stella policy run), scheduled nightly, or emergency rebaseline | Entire tenant | Writes effective findings and run record | After policy publish or major advisory/VEX import | 
| Incremental | Change-stream queue driven by Concelier/Excititor/SBOM deltas | Only affected artefacts | Writes effective findings and run record | Continuous upkeep; ensures SLA ≤ 5 min from source change | 
| Simulate | CLI/Studio preview, CI pipelines | Candidate subset (diff against baseline) | No materialisation; produces explain & diff payloads | Policy authoring, CI regression suites | 
All modes are cancellation-aware and checkpoint progress for replay in case of deployment restarts.
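As a rough illustration of what a simulate run produces, the sketch below diffs candidate verdicts against baseline verdicts without persisting anything. The `diff_verdicts` helper and the payload field names are assumptions, not the engine's actual diff format.

```python
def diff_verdicts(baseline: dict, candidate: dict) -> list:
    """Compare per-finding verdicts and return the changes.

    Both arguments map finding id -> verdict. Output is sorted by finding id
    so repeated simulations yield identical payloads.
    """
    changes = []
    for finding_id in sorted(baseline.keys() | candidate.keys()):
        before = baseline.get(finding_id)
        after = candidate.get(finding_id)
        if before != after:
            changes.append({"findingId": finding_id, "before": before, "after": after})
    return changes


baseline = {"f1": "warn", "f2": "block"}
candidate = {"f1": "block", "f2": "block", "f3": "ignore"}
changes = diff_verdicts(baseline, candidate)
```

Here `changes` reports `f1` moving from warn to block and `f3` as newly present; `f2` is omitted because its verdict is unchanged.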
7 · Outputs & Integrations
- APIs – Minimal API exposes policy CRUD, run orchestration, explain fetches, and cursor-based listing of effective findings (see /docs/api/policy.md once published).
- CLI – stella policy simulate/run/show commands surface JSON verdicts, exit codes, and diff summaries suitable for CI gating.
- Console / Policy Studio – UI reads explain traces, policy metadata, approval workflow status, and simulation diffs to guide reviewers.
- Findings Ledger – Effective findings feed downstream export, Notify, and risk scoring jobs.
- Air-gap bundles – Offline Kit includes policy packs, scoring configs, and explain indexes; export commands generate DSSE-signed bundles for transfer.
8 · Determinism & Guardrails
- Deterministic inputs – All joins rely on canonical linksets and equivalence tables; batches are sorted, and random/wall-clock APIs are blocked by static analysis plus runtime guards (ERR_POL_004).
- Stable outputs – Canonical JSON serializers sort keys; digests recorded in run metadata enable reproducible diffs across machines.
- Idempotent writes – Materialisers upsert using {policyId, findingId, tenant} keys and retain prior versions with append-only history.
- Sandboxing – Policy evaluation executes in-process with timeouts; restart-only plug-ins guarantee no runtime DLL injection.
- Compliance proof – Every run stores digest of inputs (policy, SBOM batch, advisory snapshot) so auditors can replay decisions offline.
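The idempotent-write guardrail can be sketched with an in-memory stand-in for the collections. `EffectiveFindingsStore` and its document shape are hypothetical; the real materialisers write to Mongo and their schema is not described here.

```python
class EffectiveFindingsStore:
    """In-memory stand-in: latest verdict per key plus append-only history."""

    def __init__(self):
        self.latest = {}   # (policyId, findingId, tenant) -> current document
        self.history = []  # superseded documents, never rewritten

    def upsert(self, policy_id, finding_id, tenant, verdict, run_id):
        """Write a verdict; return False when the write is an exact replay."""
        key = (policy_id, finding_id, tenant)
        doc = {"verdict": verdict, "runId": run_id}
        current = self.latest.get(key)
        if current == doc:
            return False  # replaying the same write changes nothing
        if current is not None:
            self.history.append((key, current))  # keep the prior version
        self.latest[key] = doc
        return True


store = EffectiveFindingsStore()
first = store.upsert("p1", "f1", "tenant-a", "block", "run-1")
replay = store.upsert("p1", "f1", "tenant-a", "block", "run-1")
changed = store.upsert("p1", "f1", "tenant-a", "warn", "run-2")
```

Replaying `run-1` leaves both the latest document and the history untouched, while `run-2` supersedes the earlier verdict and pushes it into history.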
9 · Security, Tenancy & Offline Notes
- Authority scopes: Gateway enforces policy:read, policy:write, policy:simulate, policy:runs, findings:read, effective:write. Service identities must present DPoP-bound tokens.
- Tenant isolation: Collections partition by tenant identifier; cross-tenant queries require explicit admin scopes and return audit warnings.
- Sealed mode: In air-gapped deployments the engine surfaces sealed=true hints in explain traces, warning about cached EPSS/KEV data and suggesting bundle refreshes (see docs/airgap/EPIC_16_AIRGAP_MODE.md §3.7).
- Observability: Structured logs carry correlation IDs matching orchestrator job IDs; metrics integrate with OpenTelemetry exporters; sampled rule-hit logs redact policy secrets.
- Incident response: Incident mode can be forced via API, boosting trace retention and notifying Notifier through policy.incident.activated events.
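Scope enforcement at the gateway can be pictured with a small deny-by-default check. The scope names come from the list above, but the endpoint paths and the endpoint-to-scope mapping are hypothetical; the actual gateway configuration is not shown in this document.

```python
# Hypothetical endpoint-to-scope table, for illustration only.
REQUIRED_SCOPES = {
    "GET /policies": {"policy:read"},
    "POST /policies": {"policy:write"},
    "POST /policies/simulate": {"policy:simulate"},
    "POST /policies/runs": {"policy:runs", "effective:write"},
    "GET /findings": {"findings:read"},
}


def is_authorised(endpoint: str, token_scopes: set) -> bool:
    """Allow the call only if the token carries every scope the endpoint requires."""
    required = REQUIRED_SCOPES.get(endpoint)
    if required is None:
        return False  # unknown endpoints are denied by default
    return required <= token_scopes


reader = {"policy:read", "findings:read"}
engine = {"policy:runs", "effective:write"}
```

With these sets, `is_authorised("GET /findings", reader)` passes, while the same token is rejected for `POST /policies/runs` because it lacks both `policy:runs` and `effective:write`.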
10 · Working with Policy Packs
- Author in Policy Studio or edit DSL files locally. Validate with stella policy lint.
- Simulate against golden SBOM fixtures (stella policy simulate --sbom fixtures/*.json). Inspect explain traces for unexpected overrides.
- Publish via API or CLI; Authority enforces review/approval workflows (draft → review → approve → rollout).
- Monitor the subsequent incremental runs; if the determinism diff fails in CI, roll back the pack while investigating the digests.
- Bundle packs for offline sites with stella policy bundle export and distribute via Offline Kit.
11 · Compliance Checklist
- Scopes enforced: Confirm gateway policy requires policy:* and effective:write scopes for all mutating endpoints.
- Determinism guard active: Static analyzer blocks clock/RNG usage; CI determinism job diffing repeated runs passes.
- Materialisation audit: Effective findings collections use append-only writers and retain history per policy run.
- Explain availability: UI/CLI expose explain traces for every verdict; sealed-mode warnings display when cached evidence is used.
- Offline parity: Policy bundles (import/export) tested in sealed environment; air-gap degradations documented for operators.
- Observability wired: Metrics (policy_run_seconds, rules_fired_total, vex_overrides_total) and sampled rule hit logs emit to the shared telemetry pipeline with correlation IDs.
- Documentation synced: API (/docs/api/policy.md), DSL grammar (/docs/policy/dsl.md), lifecycle (/docs/policy/lifecycle.md), and run modes (/docs/policy/runs.md) cross-link back to this overview.
Last updated: 2025-10-26 (Sprint 20).