up
Some checks failed
Signals CI & Image / signals-ci (push) Has been cancelled
Signals Reachability Scoring & Events / reachability-smoke (push) Has been cancelled
Signals Reachability Scoring & Events / sign-and-upload (push) Has been cancelled
Manifest Integrity / Validate Schema Integrity (push) Has been cancelled
Manifest Integrity / Validate Contract Documents (push) Has been cancelled
Manifest Integrity / Validate Pack Fixtures (push) Has been cancelled
Manifest Integrity / Audit SHA256SUMS Files (push) Has been cancelled
Manifest Integrity / Verify Merkle Roots (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled

This commit is contained in:
StellaOps Bot
2025-12-12 09:35:37 +02:00
parent ce5ec9c158
commit efaf3cb789
238 changed files with 146274 additions and 5767 deletions

View File

@@ -4,8 +4,8 @@
> **Ownership:** Policy Guild • Platform Guild
> **Services:** `StellaOps.Policy.Engine` (Minimal API + worker host)
> **Data Stores:** MongoDB (`policies`, `policy_runs`, `effective_finding_*`), Object storage (explain bundles), optional NATS/Mongo queue
> **Related docs:** [Policy overview](../../policy/overview.md), [DSL](../../policy/dsl.md), [SPL v1](../../policy/spl-v1.md), [Lifecycle](../../policy/lifecycle.md), [Runtime](../../policy/runtime.md), [Governance](../../policy/governance.md), [REST API](../../policy/api.md), [Policy CLI](../cli/guides/policy.md), [Architecture overview](../platform/architecture-overview.md), [AOC reference](../../ingestion/aggregation-only-contract.md)
> **Data Stores:** PostgreSQL (`policy.*` schemas for packs, runs, exceptions, receipts), Object storage (explain bundles), optional queue
> **Related docs:** [Policy overview](../../policy/overview.md), [DSL](../../policy/dsl.md), [SPL v1](../../policy/spl-v1.md), [Lifecycle](../../policy/lifecycle.md), [Runtime](../../policy/runtime.md), [Governance](../../policy/governance.md), [REST API](../../policy/api.md), [Policy CLI](../cli/guides/policy.md), [Architecture overview](../platform/architecture-overview.md), [AOC reference](../../ingestion/aggregation-only-contract.md)
This dossier describes the internal structure of the Policy Engine service delivered in Epic2. It focuses on module boundaries, deterministic evaluation, orchestration, and integration contracts with Concelier, Excititor, SBOM Service, Authority, Scheduler, and Observability stacks.
@@ -15,18 +15,18 @@ The service operates strictly downstream of the **Aggregation-Only Contract (AOC
## 1·Responsibilities & Constraints
- Compile and evaluate `stella-dsl@1` policy packs into deterministic verdicts.
- Join SBOM inventory, Concelier advisories, and Excititor VEX evidence via canonical linksets and equivalence tables.
- Materialise effective findings (`effective_finding_{policyId}`) with append-only history and produce explain traces.
- Emit CVSS v4.0 receipts with canonical hashing and policy replay/backfill rules; store tenant-scoped receipts with RBAC; export receipts deterministically (UTC/fonts/order) and flag v3.1→v4.0 conversions (see Sprint 0190 CVSS-GAPS-190-014 / `docs/modules/policy/cvss-v4.md`).
- Emit per-finding OpenVEX decisions anchored to reachability evidence, forward them to Signer/Attestor for DSSE/Rekor, and publish the resulting artifacts for bench/verification consumers.
- Consume reachability lattice decisions (`ReachDecision`, `docs/reachability/lattice.md`) to drive confidence-based VEX gates (not_affected / under_investigation / affected) and record the policy hash used for each decision.
- Honor **hybrid reachability attestations**: graph-level DSSE is required input; when edge-bundle DSSEs exist, prefer their per-edge provenance for quarantine, dispute, and high-risk decisions. Quarantined edges (revoked in bundles or listed in Unknowns registry) must be excluded before VEX emission.
- Enforce **shadow + coverage gates** for new/changed policies: shadow runs record findings without enforcement; promotion blocked until shadow and coverage fixtures pass (see lifecycle/runtime docs). CLI/Console enforce attachment of lint/simulate/coverage evidence.
- Operate incrementally: react to change streams (advisory/vex/SBOM deltas) with ≤5min SLA.
- Provide simulations with diff summaries for UI/CLI workflows without modifying state.
- Enforce strict determinism guard (no wall-clock, RNG, network beyond allow-listed services) and RBAC + tenancy via Authority scopes.
- Support sealed/air-gapped deployments with offline bundles and sealed-mode hints.
- Compile and evaluate `stella-dsl@1` policy packs into deterministic verdicts.
- Join SBOM inventory, Concelier advisories, and Excititor VEX evidence via canonical linksets and equivalence tables.
- Materialise effective findings (`effective_finding_{policyId}`) with append-only history and produce explain traces.
- Emit CVSS v4.0 receipts with canonical hashing and policy replay/backfill rules; store tenant-scoped receipts with RBAC; export receipts deterministically (UTC/fonts/order) and flag v3.1→v4.0 conversions (see Sprint 0190 CVSS-GAPS-190-014 / `docs/modules/policy/cvss-v4.md`).
- Emit per-finding OpenVEX decisions anchored to reachability evidence, forward them to Signer/Attestor for DSSE/Rekor, and publish the resulting artifacts for bench/verification consumers.
- Consume reachability lattice decisions (`ReachDecision`, `docs/reachability/lattice.md`) to drive confidence-based VEX gates (not_affected / under_investigation / affected) and record the policy hash used for each decision.
- Honor **hybrid reachability attestations**: graph-level DSSE is required input; when edge-bundle DSSEs exist, prefer their per-edge provenance for quarantine, dispute, and high-risk decisions. Quarantined edges (revoked in bundles or listed in Unknowns registry) must be excluded before VEX emission.
- Enforce **shadow + coverage gates** for new/changed policies: shadow runs record findings without enforcement; promotion blocked until shadow and coverage fixtures pass (see lifecycle/runtime docs). CLI/Console enforce attachment of lint/simulate/coverage evidence.
- Operate incrementally: react to change streams (advisory/vex/SBOM deltas) with ≤5min SLA.
- Provide simulations with diff summaries for UI/CLI workflows without modifying state.
- Enforce strict determinism guard (no wall-clock, RNG, network beyond allow-listed services) and RBAC + tenancy via Authority scopes.
- Support sealed/air-gapped deployments with offline bundles and sealed-mode hints.
Non-goals: policy authoring UI (handled by Console), ingestion or advisory normalisation (Concelier), VEX consensus (Excititor), runtime enforcement (Zastava).
@@ -111,14 +111,14 @@ Key notes:
| **Authority Client** (`Authority/`) | Acquire tokens, enforce scopes, perform DPoP key rotation. | Only service identity uses `effective:write`. |
| **DSL Compiler** (`Dsl/`) | Parse, canonicalise, IR generation, checksum caching. | Uses Roslyn-like pipeline; caches by `policyId+version+hash`. |
| **Selection Layer** (`Selection/`) | Batch SBOM ↔ advisory ↔ VEX joiners; apply equivalence tables; support incremental cursors. | Deterministic ordering (SBOM → advisory → VEX). |
| **Evaluator** (`Evaluation/`) | Execute IR with first-match semantics, compute severity/trust/reachability weights, record rule hits. | Stateless; all inputs provided by selection layer. |
| **Signals** (`Signals/`) | Normalizes reachability, trust, entropy, uncertainty, runtime hits into a single dictionary passed to Evaluator; supplies default `unknown` values when signals missing. Entropy penalties are derived from Scanner `layer_summary.json`/`entropy.report.json` (K=0.5, cap=0.3, block at image opaque ratio > 0.15 w/ unknown provenance) and exported via `policy_entropy_penalty_value` / `policy_entropy_image_opaque_ratio`; SPL scope `entropy.*` exposes `penalty`, `image_opaque_ratio`, `blocked`, `warned`, `capped`, `top_file_opaque_ratio`. | Aligns with `signals.*` namespace in DSL. |
| **Materialiser** (`Materialization/`) | Upsert effective findings, append history, manage explain bundle exports. | Mongo transactions per SBOM chunk. |
| **Evaluator** (`Evaluation/`) | Execute IR with first-match semantics, compute severity/trust/reachability weights, record rule hits. | Stateless; all inputs provided by selection layer. |
| **Signals** (`Signals/`) | Normalizes reachability, trust, entropy, uncertainty, runtime hits into a single dictionary passed to Evaluator; supplies default `unknown` values when signals missing. Entropy penalties are derived from Scanner `layer_summary.json`/`entropy.report.json` (K=0.5, cap=0.3, block at image opaque ratio > 0.15 w/ unknown provenance) and exported via `policy_entropy_penalty_value` / `policy_entropy_image_opaque_ratio`; SPL scope `entropy.*` exposes `penalty`, `image_opaque_ratio`, `blocked`, `warned`, `capped`, `top_file_opaque_ratio`. | Aligns with `signals.*` namespace in DSL. |
| **Materialiser** (`Materialization/`) | Upsert effective findings, append history, manage explain bundle exports. | Mongo transactions per SBOM chunk. |
| **Orchestrator** (`Runs/`) | Change-stream ingestion, fairness, retry/backoff, queue writer. | Works with Scheduler Models DTOs. |
| **API** (`Api/`) | Minimal API endpoints, DTO validation, problem responses, idempotency. | Generated clients for CLI/UI. |
| **Observability** (`Telemetry/`) | Metrics (`policy_run_seconds`, `rules_fired_total`), traces, structured logs. | Sampled rule-hit logs with redaction. |
| **Offline Adapter** (`Offline/`) | Bundle export/import (policies, simulations, runs), sealed-mode enforcement. | Uses DSSE signing via Signer service; bundles include IR hash, input cursors, shadow flag, coverage artefacts. |
| **VEX Decision Emitter** (`Vex/Emitter/`) | Build OpenVEX statements, attach reachability evidence hashes, request DSSE signing, and persist artifacts for Export Center / bench repo. | New (Sprint401); integrates with Signer predicate `stella.ops/vexDecision@v1` and Attestor Rekor logging. |
| **Observability** (`Telemetry/`) | Metrics (`policy_run_seconds`, `rules_fired_total`), traces, structured logs. | Sampled rule-hit logs with redaction. |
| **Offline Adapter** (`Offline/`) | Bundle export/import (policies, simulations, runs), sealed-mode enforcement. | Uses DSSE signing via Signer service; bundles include IR hash, input cursors, shadow flag, coverage artefacts. |
| **VEX Decision Emitter** (`Vex/Emitter/`) | Build OpenVEX statements, attach reachability evidence hashes, request DSSE signing, and persist artifacts for Export Center / bench repo. | New (Sprint401); integrates with Signer predicate `stella.ops/vexDecision@v1` and Attestor Rekor logging. |
---
@@ -179,49 +179,49 @@ Determinism guard instrumentation wraps the evaluator, rejecting access to forbi
---
## 6·Run Orchestration & Incremental Flow
- **Change streams:** Concelier and Excititor publish document changes to the scheduler queue (`policy.trigger.delta`). Payload includes `tenant`, `source`, `linkset digests`, `cursor`.
- **Orchestrator:** Maintains per-tenant backlog; merges deltas until time/size thresholds met, then enqueues `PolicyRunRequest`.
- **Queue:** Mongo queue with lease; each job assigned `leaseDuration`, `maxAttempts`.
## 6·Run Orchestration & Incremental Flow
- **Change streams:** Concelier and Excititor publish document changes to the scheduler queue (`policy.trigger.delta`). Payload includes `tenant`, `source`, `linkset digests`, `cursor`.
- **Orchestrator:** Maintains per-tenant backlog; merges deltas until time/size thresholds met, then enqueues `PolicyRunRequest`.
- **Queue:** Mongo queue with lease; each job assigned `leaseDuration`, `maxAttempts`.
- **Workers:** Lease jobs, execute evaluation pipeline, report status (success/failure/canceled). Failures with recoverable errors requeue with backoff; determinism or schema violations mark job `failed` and raise incident event.
- **Fairness:** Round-robin per `{tenant, policyId}`; emergency jobs (`priority=emergency`) jump queue but limited via circuit breaker.
- **Replay:** On demand, orchestrator rehydrates run via stored cursors and exports sealed bundle for audit/CI determinism checks.
- **Batch evaluation service (`/api/policy/eval/batch`):** Stateless evaluator powering Findings Ledger and replay/offline workflows. Requests contain canonical ledger events plus optional current projection; responses return status/severity/labels/rationale without mutating state. Policy Engine enforces per-tenant cost budgets, caches results by `(tenantId, policyVersion, eventHash, projectionHash)`, and falls back to inline evaluation when the remote service is disabled.
---
### 6.1·VEX decision attestation pipeline
1. **Verdict capture.** Each evaluation result contains `{findingId, cve, productKey, reachabilityState, evidenceRefs}` plus SBOM and runtime CAS hashes.
2. **OpenVEX serialization.** `VexDecisionEmitter` builds an OpenVEX document with one statement per `(cve, productKey)` and fills:
- `status`, `justification`, `status_notes`, `impact_statement`, `action_statement`.
- `products` (purl) and `evidence` array referencing `reachability.json`, `sbom.cdx.json`, `runtimeFacts`.
3. **DSSE signing.** The emitter calls Signer `POST /api/v1/signer/sign/dsse` with predicate `stella.ops/vexDecision@v1`. Signer verifies PoE + scanner integrity and returns a DSSE envelope (`decision.dsse.json`).
4. **Transparency (optional).** When Rekor integration is enabled, Attestor logs the DSSE payload and returns `{uuid, logIndex, checkpoint}` which we persist next to the decision.
5. **Export.** API/CLI endpoints expose `decision.openvex.json`, `decision.dsse.json`, `rekor.txt`, and evidence metadata so Export Center + bench automation can mirror them into `bench/findings/**` as defined in the [VEX Evidence Playbook](../../benchmarks/vex-evidence-playbook.md).
All payloads are immutable and include analyzer fingerprints (`scanner.native@sha256:...`, `policyEngine@sha256:...`) so replay tooling can recompute identical digests. Determinism tests cover both the OpenVEX JSON and the DSSE payload bytes.
---
## 7·Security & Tenancy
- **Auth:** All API calls pass through Authority gateway; DPoP tokens enforced for service-to-service (Policy Engine service principal). CLI/UI tokens include scope claims.
- **Scopes:** Mutations require `policy:*` scopes corresponding to action; `effective:write` restricted to service identity.
- **Replay:** On demand, orchestrator rehydrates run via stored cursors and exports sealed bundle for audit/CI determinism checks.
- **Batch evaluation service (`/api/policy/eval/batch`):** Stateless evaluator powering Findings Ledger and replay/offline workflows. Requests contain canonical ledger events plus optional current projection; responses return status/severity/labels/rationale without mutating state. Policy Engine enforces per-tenant cost budgets, caches results by `(tenantId, policyVersion, eventHash, projectionHash)`, and falls back to inline evaluation when the remote service is disabled.
---
### 6.1·VEX decision attestation pipeline
1. **Verdict capture.** Each evaluation result contains `{findingId, cve, productKey, reachabilityState, evidenceRefs}` plus SBOM and runtime CAS hashes.
2. **OpenVEX serialization.** `VexDecisionEmitter` builds an OpenVEX document with one statement per `(cve, productKey)` and fills:
- `status`, `justification`, `status_notes`, `impact_statement`, `action_statement`.
- `products` (purl) and `evidence` array referencing `reachability.json`, `sbom.cdx.json`, `runtimeFacts`.
3. **DSSE signing.** The emitter calls Signer `POST /api/v1/signer/sign/dsse` with predicate `stella.ops/vexDecision@v1`. Signer verifies PoE + scanner integrity and returns a DSSE envelope (`decision.dsse.json`).
4. **Transparency (optional).** When Rekor integration is enabled, Attestor logs the DSSE payload and returns `{uuid, logIndex, checkpoint}` which we persist next to the decision.
5. **Export.** API/CLI endpoints expose `decision.openvex.json`, `decision.dsse.json`, `rekor.txt`, and evidence metadata so Export Center + bench automation can mirror them into `bench/findings/**` as defined in the [VEX Evidence Playbook](../../benchmarks/vex-evidence-playbook.md).
All payloads are immutable and include analyzer fingerprints (`scanner.native@sha256:...`, `policyEngine@sha256:...`) so replay tooling can recompute identical digests. Determinism tests cover both the OpenVEX JSON and the DSSE payload bytes.
---
## 7·Security & Tenancy
- **Auth:** All API calls pass through Authority gateway; DPoP tokens enforced for service-to-service (Policy Engine service principal). CLI/UI tokens include scope claims.
- **Scopes:** Mutations require `policy:*` scopes corresponding to action; `effective:write` restricted to service identity.
- **Tenancy:** All queries filter by `tenant`. Service identity uses `tenant-global` for shared policies; cross-tenant reads prohibited unless `policy:tenant-admin` scope present.
- **Secrets:** Configuration loaded via environment variables or sealed secrets; runtime avoids writing secrets to logs.
- **Determinism guard:** Static analyzer prevents referencing forbidden namespaces; runtime guard intercepts `DateTime.Now`, `Random`, `Guid`, HTTP clients beyond allow-list.
- **Sealed mode:** Global flag disables outbound network except allow-listed internal hosts; watchers fail fast if unexpected egress attempted.
### Determinism enforcement (DOCS-POLICY-DET-01)
- **Inputs are ordered and frozen:** Selector emits batches sorted deterministically by `(tenant, policyId, vulnerabilityId, productKey, source)` with stable cursors; workers must not resort.
- **No ambient randomness or wall clocks:** Policy code relies on injected `TimeProvider`/`IRandom` shims; guards block `DateTime.Now`, `Guid.NewGuid`, `Random` when not injected.
- **Immutable evidence:** SBOM/VEX inputs carry content hashes; evaluator treats payloads as read-only and surfaces hashes in logs for replay.
- **Side effects prohibited:** Evaluator cannot call external HTTP except allow-listed internal services (Authority, Storage) and must not write files outside temp workspace.
- **Replay hash:** Each batch computes `determinismHash = SHA256(policyVersion + batchCursor + inputsHash)`; included in logs and run exports.
- **Testing:** Determinism tests run the same batch twice with seeded clock/GUID providers and assert identical outputs + determinismHash; add a test per policy package.
- **Determinism guard:** Static analyzer prevents referencing forbidden namespaces; runtime guard intercepts `DateTime.Now`, `Random`, `Guid`, HTTP clients beyond allow-list.
- **Sealed mode:** Global flag disables outbound network except allow-listed internal hosts; watchers fail fast if unexpected egress attempted.
### Determinism enforcement (DOCS-POLICY-DET-01)
- **Inputs are ordered and frozen:** Selector emits batches sorted deterministically by `(tenant, policyId, vulnerabilityId, productKey, source)` with stable cursors; workers must not resort.
- **No ambient randomness or wall clocks:** Policy code relies on injected `TimeProvider`/`IRandom` shims; guards block `DateTime.Now`, `Guid.NewGuid`, `Random` when not injected.
- **Immutable evidence:** SBOM/VEX inputs carry content hashes; evaluator treats payloads as read-only and surfaces hashes in logs for replay.
- **Side effects prohibited:** Evaluator cannot call external HTTP except allow-listed internal services (Authority, Storage) and must not write files outside temp workspace.
- **Replay hash:** Each batch computes `determinismHash = SHA256(policyVersion + batchCursor + inputsHash)`; included in logs and run exports.
- **Testing:** Determinism tests run the same batch twice with seeded clock/GUID providers and assert identical outputs + determinismHash; add a test per policy package.
---