Some checks failed
api-governance / spectral-lint (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
oas-ci / oas-validate (push) Has been cancelled
SDK Publish & Sign / sdk-publish (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled
Policy Simulation / policy-simulate (push) Has been cancelled
devportal-offline / build-offline (push) Has been cancelled
65
docs/policy/runtime.md
Normal file
@@ -0,0 +1,65 @@

# Policy Runtime & Evaluation

> **Imposed rule:** Runtime evaluations must use frozen inputs (SBOM, advisories, VEX, reachability, signals) and emit explain traces plus DSSE/attestation metadata; no live feed calls during evaluation.

This document describes how SPL policies are compiled, cached, and executed, and how results are surfaced via APIs, CLI, UI, and observability.

## 1. Components

- **Compiler**: converts SPL (`stella-dsl@1`) into canonical IR JSON, hashes it, and runs lint and coverage validation. It produces the IR cache entry used by the Engine.
- **Engine**: deterministic evaluator that consumes the IR plus frozen inputs (SBOM, advisories, VEX, reachability, signals) and emits findings and explain traces.
- **Caches**:
  - IR cache keyed by `policyId`/`version`/IR hash.
  - Input cursors (SBOM/advisory/VEX snapshots, reachability graphs) that pin each run's inputs so the run can be replayed.
  - Explain trace cache for recently queried runs (TTL, tenant-scoped).
- **Attestation**: optional DSSE over IR hash + approval metadata; Rekor mirror when online; stored alongside run outputs in the Evidence Locker.
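
To illustrate why a canonical IR form matters for the cache key, here is a minimal sketch. It assumes canonicalization means sorted-key, whitespace-free JSON and that the hash is SHA-256; neither detail is stated above, and the helper names (`canonical_ir_bytes`, `ir_hash`, `ir_cache_key`) are hypothetical.

```python
import hashlib
import json


def canonical_ir_bytes(ir: dict) -> bytes:
    """Serialize IR with sorted keys and no insignificant whitespace (assumed canonical form)."""
    text = json.dumps(ir, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def ir_hash(ir: dict) -> str:
    """Hash the canonical bytes; SHA-256 is an assumption, not a documented choice."""
    return "sha256:" + hashlib.sha256(canonical_ir_bytes(ir)).hexdigest()


def ir_cache_key(policy_id: str, version: int, ir: dict) -> str:
    """Build a key shaped like policyId/version/IR hash."""
    return f"{policy_id}/{version}/{ir_hash(ir)}"


# Two IR documents that differ only in key order hash identically.
ir_a = {"syntax": "stella-dsl@1", "rules": [{"name": "block-critical", "priority": 1}]}
ir_b = {"rules": [{"priority": 1, "name": "block-critical"}], "syntax": "stella-dsl@1"}
key = ir_cache_key("P-7", 3, ir_a)
```

Because the hash is taken over canonical bytes, recompiling the same policy yields the same key, which is what lets the Engine compare a cached IR against an attested hash.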

## 2. Execution flow

1. Resolve the active policy version for the tenant (or the specified version for simulate).
2. Load the IR from cache; verify that its hash matches the attested value, if one is provided.
3. Fetch frozen inputs via cursors: SBOM digest, advisory snapshot id, VEX set, reachability graph hash, signals bundle.
4. Evaluate rules in priority order; record explain entries (rule, because, inputs, signals).
5. Persist findings, explain traces, and run metadata (`runId`, `policyVersion`, hashes) to storage.
6. Emit events: `policy.run.started`, `policy.run.completed`, `policy.run.failed`; optionally `policy.run.shadow` when `settings.shadow=true`.
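
Step 4 can be sketched as a small deterministic loop. This is an illustration only: the `Rule` shape, the `fired` field, the tie-break on rule name, and the convention that a lower `priority` number runs first are all assumptions, not the Engine's actual model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int  # assumption: lower number is evaluated first
    because: str
    status: str
    when: Callable[[dict, dict], bool]  # predicate over (inputs, signals)


def evaluate(rules, inputs, signals):
    """Evaluate rules in priority order and record one explain entry per rule."""
    findings, explains = [], []
    # Sorting by (priority, name) keeps the order stable even when priorities tie.
    for rule in sorted(rules, key=lambda r: (r.priority, r.name)):
        fired = bool(rule.when(inputs, signals))
        explains.append({
            "rule": rule.name,
            "because": rule.because,
            "inputs": dict(inputs),
            "signals": dict(signals),
            "fired": fired,
        })
        if fired:
            # Each finding points back at the explain entry that produced it.
            findings.append({"rule": rule.name, "status": rule.status,
                             "explain": len(explains) - 1})
    return findings, explains


rules = [
    Rule("warn-unreachable", 20, "component is not reachable", "warn",
         lambda inputs, signals: signals.get("reachable") is False),
    Rule("block-kev", 10, "advisory is known-exploited", "blocked",
         lambda inputs, signals: signals.get("kev") is True),
]
inputs = {"sbom_digest": "sha256:aaa", "advisory_snapshot_id": "adv-1"}
signals = {"kev": True, "reachable": False}
findings, explains = evaluate(rules, inputs, signals)
```

Since the function reads only the arguments it is given, replaying a run with the same frozen inputs produces the same findings and the same explain ordering.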

## 3. Caching & determinism

- The IR cache is warmed at publish and invalidated by a new policy version.
- Input cursors are mandatory; if any are missing, the run is blocked (returns `inputs_unfrozen`).
- Explain trace storage preserves deterministic ordering and is capped by tenant quotas.
- Shadow mode runs record findings but mark them `enforced=false`; promotion is blocked until the shadow and coverage gates pass.
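
The cursor gate and the shadow flag can be sketched as two small pure functions. The cursor field names below are hypothetical, derived from the inputs listed in the execution flow; the real schema may differ.

```python
# Hypothetical cursor names, one per frozen input named in the execution flow.
REQUIRED_CURSORS = (
    "sbom_digest",
    "advisory_snapshot_id",
    "vex_set",
    "reachability_graph_hash",
    "signals_bundle",
)


def check_frozen(cursors: dict) -> dict:
    """Block the run when any required cursor is absent or empty."""
    missing = [name for name in REQUIRED_CURSORS if not cursors.get(name)]
    if missing:
        return {"ok": False, "error": "inputs_unfrozen", "missing": missing}
    return {"ok": True, "error": None, "missing": []}


def tag_findings(findings: list, shadow: bool) -> list:
    """Shadow runs keep their findings but mark them as not enforced."""
    return [{**finding, "enforced": not shadow} for finding in findings]


partial = {"sbom_digest": "sha256:aaa", "vex_set": "vex-42"}
gate = check_frozen(partial)
shadowed = tag_findings([{"rule": "block-kev", "status": "blocked"}], shadow=True)
```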

## 4. APIs & CLI

- API: `POST /policies/{id}/simulate`, `POST /policies/{id}/run`, `GET /policy-runs/{runId}` (findings + explain), `GET /policies/{id}/versions/{v}` (IR, hash, attestation refs).
- CLI: `stella policy simulate`, `stella policy run`, `stella policy explain <runId> --format json|table`, `stella policy export --run <runId> --offline`.
- Headers: `X-Stella-Tenant`, `X-Stella-Shadow` (optional), `If-None-Match` for IR cache revalidation.

## 5. Observability & SLOs

- Metrics: `policy_runs_total{status}`, `policy_run_duration_seconds`, `policy_explain_cache_hits`, `policy_inputs_unfrozen_total`, `policy_shadow_runs_total`.
- Logs include `policyId`, `version`, `runId`, `tenant`, `shadow`, and `input_cursor` hashes.
- Traces: one span per run with events for rule evaluation batches; attributes include counts of rules fired and unknowns encountered.
- SLOs (suggested):
  - p95 policy run latency < 2s for simulate, < 10s for full run.
  - Error budget: < 0.5% failed runs per rolling 7d.
  - Explain cache hit rate > 80% for repeated queries.
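
The suggested SLOs can be checked with a few lines of arithmetic. This sketch assumes the nearest-rank definition of p95 and takes raw samples and counters as arguments; how those are pulled from the metrics backend is out of scope, and the function names are hypothetical.

```python
def p95(values: list) -> float:
    """Nearest-rank 95th percentile, using integer math to avoid float rounding."""
    if not values:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(values)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n)
    return ordered[rank - 1]


def slo_report(durations_s: list, failed: int, total: int,
               cache_hits: int, cache_lookups: int, kind: str = "simulate") -> dict:
    """Compare one window of measurements against the suggested thresholds."""
    if total <= 0 or cache_lookups <= 0:
        raise ValueError("totals must be positive")
    latency_limit_s = 2.0 if kind == "simulate" else 10.0
    return {
        "latency_ok": p95(durations_s) < latency_limit_s,
        "errors_ok": failed / total < 0.005,
        "cache_ok": cache_hits / cache_lookups > 0.80,
    }


# 20 simulate runs with one slow outlier: the outlier sits above the 95th percentile.
samples = [0.5] * 19 + [3.0]
report = slo_report(samples, failed=1, total=1000, cache_hits=85, cache_lookups=100)
```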

## 6. Failure modes & handling

- **Inputs unfrozen**: return 409 (`inputs_unfrozen`) listing the required cursors; emit a `policy.inputs_unfrozen` event.
- **Hash mismatch**: the IR hash differs from the attested value; block the run and emit a `policy.ir_hash_mismatch` alert.
- **Unknown signals**: if required signals are missing, downgrade to `unknown` and optionally set `status=under_investigation`; flag this in the explain trace.
- **Exceeded quotas**: when explain storage or run-count caps are exceeded, return 429 with `Retry-After`; the run is not executed.
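
A handler for the blocking failure modes might look like the sketch below. The response body fields and the function name are hypothetical. The HTTP status for a hash mismatch is not specified here, so the sketch leaves it as `None` instead of guessing.

```python
def failure_response(kind: str, *, missing_cursors=(), retry_after_s: int = 60) -> dict:
    """Map a blocking failure to a status code, headers, body, and event name."""
    if kind == "inputs_unfrozen":
        return {
            "status": 409,
            "headers": {},
            "body": {"error": "inputs_unfrozen", "required_cursors": list(missing_cursors)},
            "event": "policy.inputs_unfrozen",
        }
    if kind == "quota_exceeded":
        return {
            "status": 429,
            "headers": {"Retry-After": str(retry_after_s)},  # delay in seconds
            "body": {"error": "quota_exceeded"},
            "event": None,  # no event is documented for this case
        }
    if kind == "ir_hash_mismatch":
        return {
            "status": None,  # status code is not documented; the run is blocked
            "headers": {},
            "body": {"error": "ir_hash_mismatch"},
            "event": "policy.ir_hash_mismatch",
        }
    raise ValueError(f"unknown failure kind: {kind}")


unfrozen = failure_response("inputs_unfrozen", missing_cursors=["vex_set"])
throttled = failure_response("quota_exceeded", retry_after_s=30)
```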

## 7. Offline / air-gap

- All inputs are fetched from Offline Kit bundles; no network access during evaluation.
- CLI `stella policy run --sealed --bundle <path>` loads the IR, inputs, and signals from the bundle and writes outputs plus an attestation-ready manifest.
- Runs produce DSSE-ready payloads (`policy.run@1`) that can be signed later when connectivity is restored.
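
"DSSE-ready" can be made concrete with the pre-authentication encoding (PAE) from the DSSE specification, which defines the exact bytes a signer covers. The sketch below builds an unsigned envelope and its PAE bytes. Using `policy.run@1` verbatim as the `payloadType` and serializing the manifest as sorted-key JSON are assumptions, and the manifest fields are illustrative.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE PAE: 'DSSEv1' SP len(type) SP type SP len(body) SP body."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def unsigned_envelope(payload_type: str, payload: bytes) -> dict:
    """Envelope with an empty signature list, to be signed once back online."""
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [],
    }


# Illustrative run manifest; field names are hypothetical.
manifest = {"runId": "run-1", "ir_hash": "sha256:abc", "status": "completed"}
body = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
envelope = unsigned_envelope("policy.run@1", body)
to_sign = pae("policy.run@1", body)
```

Storing the envelope with an empty `signatures` list keeps the sealed run self-describing: a later signing step only has to compute a signature over `to_sign` and append it.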

## 8. Data model (high level)

- `policy_runs`: `runId`, `policyId`, `version`, `tenant`, `shadow`, `input_cursors`, `ir_hash`, `attestation_ref`, `started_at`, `completed_at`, `status`, `stats` (rules fired, explains, unknowns), `storage_refs` (findings, explains).
- `policy_findings`: flattened findings with references to explain entries.
- `policy_explains`: rule-level explain traces with inputs, signals, because text.
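
For orientation, a `policy_runs` record could be modeled as below. The field names follow the list above; the types, defaults, and the inner keys of `stats` and `storage_refs` are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class PolicyRun:
    runId: str
    policyId: str
    version: int
    tenant: str
    shadow: bool
    input_cursors: dict
    ir_hash: str
    status: str
    started_at: str  # assumption: ISO 8601 timestamp string
    completed_at: Optional[str] = None
    attestation_ref: Optional[str] = None
    # Counters named after the stats listed in the data model.
    stats: dict = field(default_factory=lambda: {"rules_fired": 0, "explains": 0, "unknowns": 0})
    # Pointers to where findings and explain traces are stored.
    storage_refs: dict = field(default_factory=lambda: {"findings": None, "explains": None})


run = PolicyRun(
    runId="run-1",
    policyId="P-7",
    version=3,
    tenant="acme",
    shadow=True,
    input_cursors={"sbom_digest": "sha256:aaa"},
    ir_hash="sha256:abc",
    status="completed",
    started_at="2025-01-01T00:00:00Z",
)
record = asdict(run)
```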

## 9. References

- `docs/policy/dsl.md`
- `docs/policy/lifecycle.md`
- `docs/policy/architecture.md`
- `docs/policy/overview.md`
- `docs/reachability/DELIVERY_GUIDE.md`