Add Policy DSL Validator, Schema Exporter, and Simulation Smoke tools

- Implemented PolicyDslValidator with command-line options for strict mode and JSON output.
- Created PolicySchemaExporter to generate JSON schemas for policy-related models.
- Developed PolicySimulationSmoke tool to validate policy simulations against expected outcomes.
- Added project files and necessary dependencies for each tool.
- Ensured proper error handling and usage instructions across tools.
2025-10-27 08:00:11 +02:00
parent 651b8e0fa3
commit 96d52884e8
712 changed files with 49449 additions and 6124 deletions
--- a/docs/security/authority-scopes.md
+++ b/docs/security/authority-scopes.md
@@ -0,0 +1,194 @@
+# Authority Scopes & Tenancy — AOC Update
+
+> **Audience:** Authority Core, platform security engineers, DevOps owners.  
+> **Scope:** Scope taxonomy, tenancy enforcement, rollout guidance for the Aggregation-Only Contract (Sprint 19).
+
+Authority issues short-lived tokens bound to tenants and scopes. Sprint 19 introduces new scopes to support the AOC guardrails in Concelier and Excititor. This document lists the canonical scope catalogue, describes tenancy propagation, and outlines operational safeguards.
+
+---
+
+## 1 · Scope catalogue (post AOC)
+
+| Scope | Surface | Purpose | Notes |
+|-------|---------|---------|-------|
+| `advisory:write` | Concelier ingestion APIs | Allows append-only writes to `advisory_raw`. | Granted to Concelier WebService and trusted connectors. Requires tenant claim. |
+| `advisory:verify` | Concelier `/aoc/verify`, CLI, UI dashboard | Permits guard verification and access to violation summaries. | Read-only; used by `stella aoc verify` and console dashboard. |
+| `vex:write` | Excititor ingestion APIs | Append-only writes to `vex_raw`. | Mirrors `advisory:write`. |
+| `vex:verify` | Excititor `/aoc/verify`, CLI | Read-only verification of VEX ingestion. | Optional for environments without VEX feeds. |
+| `graph:write` | Cartographer build pipeline | Enqueue graph build/overlay jobs. | Reserved for the Cartographer service identity; requires tenant claim. |
+| `graph:read` | Graph API, Scheduler overlays, UI | Read graph projections/overlays. | Requires tenant claim; granted to Cartographer, Graph API, Scheduler. |
+| `graph:export` | Graph export endpoints | Stream GraphML/JSONL artefacts. | UI/gateway automation only; tenant required. |
+| `graph:simulate` | Policy simulation overlays | Trigger what-if overlays on graphs. | Restricted to automation; tenant required. |
+| `effective:write` | Policy Engine | Allows creation/update of `effective_finding_*` collections. | **Only** the Policy Engine service client may hold this scope. |
+| `effective:read` | Console, CLI, exports | Read derived findings. | Shared across tenants with role-based restrictions. |
+| `aoc:dashboard` | Console UI | Access AOC dashboard resources. | Bundles `advisory:verify`/`vex:verify` by default; keep for UI RBAC group mapping. |
+| `aoc:verify` | Automation service accounts | Execute verification via API without the full dashboard role. | For CI pipelines, offline kit validators. |
+| Existing scopes | (e.g., `policy:*`, `sbom:*`) | Unchanged. | Review `/docs/security/policy-governance.md` for policy-specific scopes. |
+
+### 1.1 Scope bundles (roles)
+
+- **`role/concelier-ingest`** → `advisory:write`, `advisory:verify`.
+- **`role/excititor-ingest`** → `vex:write`, `vex:verify`.
+- **`role/aoc-operator`** → `aoc:dashboard`, `aoc:verify`, `advisory:verify`, `vex:verify`.
+- **`role/policy-engine`** → `effective:write`, `effective:read`.
+- **`role/cartographer-service`** → `graph:write`, `graph:read`.
+- **`role/graph-gateway`** → `graph:read`, `graph:export`, `graph:simulate`.
+
+Roles are declared per tenant in `authority.yaml`:
+
+```yaml
+tenants:
+  - name: default
+    roles:
+      concelier-ingest:
+        scopes: [advisory:write, advisory:verify]
+      aoc-operator:
+        scopes: [aoc:dashboard, aoc:verify, advisory:verify, vex:verify]
+      policy-engine:
+        scopes: [effective:write, effective:read]
+```
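+
+The snippet above declares three of the six bundles from §1.1. A sketch of the remaining three, assuming they follow the same `roles.<name>.scopes` shape; the scope lists are copied from §1.1, and nothing here is taken from a shipped template:
+
+```yaml
+tenants:
+  - name: default
+    roles:
+      # Assumed to mirror the entries above; scope lists come from §1.1.
+      excititor-ingest:
+        scopes: [vex:write, vex:verify]
+      cartographer-service:
+        scopes: [graph:write, graph:read]
+      graph-gateway:
+        scopes: [graph:read, graph:export, graph:simulate]
+```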
+
+---
+
+## 2 · Tenancy enforcement
+
+### 2.1 Token claims
+
+Tokens now include:
+
+- `tenant` claim (string) — required for all ingestion and verification scopes.
+- `service_identity` (optional) — e.g., `policy-engine`, `cartographer`. Required when requesting `effective:write` or `graph:write`.
+- `delegation_allowed` (boolean) — defaults `false`. Prevents console tokens from delegating ingest scopes.
+
+Authority rejects requests when:
+
+- `tenant` is missing while requesting `advisory:*`, `vex:*`, or `aoc:*` scopes.
+- `service_identity != policy-engine` but `effective:write` is present (`ERR_AOC_006` enforcement).
+- `service_identity != cartographer` but `graph:write` is present (graph pipeline enforcement).
+- Tokens attempt to combine `advisory:write` with `effective:write` (separation of duties).
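+
+To make the rejection rules concrete, the sketch below lists one claim set that should pass and four that should be refused. It is illustrative only: claims are rendered as YAML for readability (tokens themselves carry JSON), the space-delimited `scope` claim and the `concelier` identity value are assumptions, and `reason` is an annotation rather than a claim.
+
+```yaml
+accepted:
+  - tenant: default
+    service_identity: policy-engine
+    delegation_allowed: false
+    scope: "effective:write effective:read"
+rejected:
+  - reason: "tenant missing while requesting an advisory scope"
+    scope: "advisory:write advisory:verify"
+  - reason: "effective:write without the policy-engine service identity"
+    tenant: default
+    service_identity: concelier   # hypothetical value
+    scope: "effective:write"
+  - reason: "graph:write without the cartographer service identity"
+    tenant: default
+    scope: "graph:write graph:read"
+  - reason: "advisory:write combined with effective:write"
+    tenant: default
+    service_identity: policy-engine
+    scope: "advisory:write effective:write"
+```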
+
+### 2.2 Propagation
+
+- API Gateway forwards the `tenant` claim as a header (`X-Stella-Tenant`). Services refuse requests that lack the header.
+- Concelier/Excititor stamp tenant into raw documents and structured logs.
+- Policy Engine copies `tenant` from tokens into `effective_finding_*` collections.
+
+### 2.3 Cross-tenant scenarios
+
+- Platform operators with `tenant:admin` can assume other tenants via `/authority/tenant/switch` if explicitly permitted.
+- CLI commands accept `--tenant <id>` to override the environment default; Authority logs tenant switch events (`authority.tenant.switch`).
+- Console tenant picker uses delegated token exchange (`/token/exchange`) to obtain scoped tenant tokens without exposing raw credentials.
+
+---
+
+## 3 · Configuration changes
+
+### 3.1 Authority configuration (`authority.yaml`)
+
+Add the new scopes and the optional claim transforms:
+
+```yaml
+security:
+  scopes:
+    - name: advisory:write
+      description: Concelier raw ingestion
+    - name: advisory:verify
+      description: Verify Concelier ingestion
+    - name: vex:write
+      description: Excititor raw ingestion
+    - name: vex:verify
+      description: Verify Excititor ingestion
+    - name: aoc:dashboard
+      description: Access AOC UI dashboards
+    - name: aoc:verify
+      description: Run AOC verification
+    - name: effective:write
+      description: Policy Engine materialisation
+    - name: effective:read
+      description: Read derived findings
+  claimTransforms:
+    - match: { scope: "effective:write" }
+      require:
+        serviceIdentity: policy-engine
+    - match: { scope: "graph:write" }
+      require:
+        serviceIdentity: cartographer
+```
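+
+The `claimTransforms` entry for `graph:write` suggests the graph scopes are declared too, although the list above omits them. A sketch of those entries, assuming they belong in the same `security.scopes` list; the descriptions paraphrase the catalogue in §1 and are not copied from an existing template:
+
+```yaml
+security:
+  scopes:
+    # Assumed additions; descriptions paraphrase the §1 catalogue.
+    - name: graph:write
+      description: Enqueue graph build/overlay jobs
+    - name: graph:read
+      description: Read graph projections and overlays
+    - name: graph:export
+      description: Stream GraphML/JSONL artefacts
+    - name: graph:simulate
+      description: Trigger what-if overlays on graphs
+```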
+
+### 3.2 Client registration
+
+Update service clients:
+
+- `Concelier.WebService` → request `advisory:write`, `advisory:verify`.
+- `Excititor.WebService` → request `vex:write`, `vex:verify`.
+- `Policy.Engine` → request `effective:write`, `effective:read`; set `properties.serviceIdentity=policy-engine`.
+- `Cartographer.Service` → request `graph:write`, `graph:read`; set `properties.serviceIdentity=cartographer`.
+- `Graph API Gateway` → request `graph:read`, `graph:export`, `graph:simulate`; tenant hint required.
+- `Console` → request `aoc:dashboard`, `effective:read` plus existing UI scopes.
+- `CLI automation` → request `aoc:verify`, `advisory:verify`, `vex:verify` as needed.
+
+Client definition snippet:
+
+```yaml
+clients:
+  - clientId: concelier-web
+    grantTypes: [client_credentials]
+    scopes: [advisory:write, advisory:verify]
+    tenants: [default]
+  - clientId: policy-engine
+    grantTypes: [client_credentials]
+    scopes: [effective:write, effective:read]
+    properties:
+      serviceIdentity: policy-engine
+  - clientId: cartographer-service
+    grantTypes: [client_credentials]
+    scopes: [graph:write, graph:read]
+    properties:
+      serviceIdentity: cartographer
+```
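+
+The other clients listed in this section could be registered along the same lines. In this sketch the `clientId` values are hypothetical, and using `tenants` to carry the tenant hint is an assumption based on the `concelier-web` entry above:
+
+```yaml
+clients:
+  # clientId values are hypothetical; scopes come from the list in §3.2.
+  - clientId: excititor-web
+    grantTypes: [client_credentials]
+    scopes: [vex:write, vex:verify]
+    tenants: [default]
+  - clientId: graph-gateway
+    grantTypes: [client_credentials]
+    scopes: [graph:read, graph:export, graph:simulate]
+    tenants: [default]   # assumed form of the required tenant hint
+  - clientId: aoc-ci-verifier
+    grantTypes: [client_credentials]
+    scopes: [aoc:verify, advisory:verify, vex:verify]
+    tenants: [default]
+```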
+
+---
+
+## 4 · Operational safeguards
+
+- **Audit events:** Authority emits `authority.scope.granted` and `authority.scope.revoked` events with `scope` and `tenant`. Monitor for unexpected grants.
+- **Rate limiting:** Apply stricter limits on `/token` endpoints for clients requesting `advisory:write` or `vex:write` to mitigate brute-force ingestion attempts.
+- **Incident response:** Link AOC alerts to Authority audit logs to confirm whether violations come from expected identities.
+- **Rotation:** Rotate ingest client secrets alongside guard deployments; add rotation steps to `ops/authority-key-rotation.md`.
+- **Testing:** Integration tests must fail if tokens lacking `tenant` attempt ingestion; add coverage in Concelier/Excititor smoke suites (see `CONCELIER-CORE-AOC-19-013`).
+
+---
+
+## 5 · Offline & air-gap notes
+
+- Offline Kit bundles include tenant-scoped service credentials. Ensure ingest bundles ship without `advisory:write` scopes unless strictly required.
+- CLI verification in offline environments uses pre-issued `aoc:verify` tokens; document expiration and renewal processes.
+- Authority replicas in air-gapped environments should restrict scope issuance to known tenants and log all `/token` interactions for later replay.
+
+---
+
+## 6 · References
+
+- [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md)
+- [Architecture overview](../architecture/overview.md)
+- [Concelier architecture](../ARCHITECTURE_CONCELIER.md)
+- [Excititor architecture](../ARCHITECTURE_EXCITITOR.md)
+- [Policy governance](policy-governance.md)
+- [Authority key rotation playbook](../ops/authority-key-rotation.md)
+
+---
+
+## 7 · Compliance checklist
+
+- [ ] Scope catalogue updated in Authority configuration templates.
+- [ ] Role mappings documented for each tenant profile.
+- [ ] Claim transforms enforce `serviceIdentity` for `effective:write`.
+- [ ] Claim transforms enforce `serviceIdentity` for `graph:write`.
+- [ ] Concelier/Excititor smoke tests cover missing tenant rejection.
+- [ ] Offline kit credentials reviewed for least privilege.
+- [ ] Audit/monitoring guidance validated with Observability Guild.
+- [ ] Authority Core sign-off recorded (owner: @authority-core, due 2025-10-28).
+
+---
+
+*Last updated: 2025-10-26 (Sprint 19).* 
--- a/docs/security/policy-governance.md
+++ b/docs/security/policy-governance.md
@@ -0,0 +1,114 @@
+# Policy Governance & Least Privilege
+
+> **Audience:** Security Guild, Policy Guild, Authority Core, auditors.  
+> **Scope:** Scopes, RBAC, approval controls, tenancy, auditing, and compliance requirements for Policy Engine v2.
+
+---
+
+## 1 · Governance Principles
+
+1. **Least privilege by scope** – API clients receive only the `policy:*` scopes required for their role; `effective:write` is reserved for the Policy Engine service identity.
+2. **Immutable history** – All policy changes, approvals, runs, and suppressions produce audit artefacts retrievable offline.
+3. **Separation of duties** – Authors cannot approve their own submissions; approvers require distinct scope and should not have deployment rights.
+4. **Deterministic verification** – Simulations, determinism checks, and incident replay bundles provide reproducible evidence for auditors.
+5. **Tenant isolation** – Policies, runs, and findings are scoped to tenants; cross-tenant access requires explicit admin scopes and is logged.
+6. **Offline parity** – Air-gapped sites follow the same governance workflow with sealed-mode safeguards and signed bundles.
+
+---
+
+## 2 · Authority Scopes & Role Mapping
+
+| Scope | Description | Recommended role |
+|-------|-------------|------------------|
+| `policy:read` | View policies, revisions, runs, findings. | Readers, auditors. |
+| `policy:write` | Create/edit drafts, run lint/compile. | Authors (SecOps engineers). |
+| `policy:submit` | Move draft → submitted, attach simulations. | Authors with submission rights. |
+| `policy:review` | Comment/approve/request changes (non-final). | Reviewers (peer security, product). |
+| `policy:approve` | Final approval; can archive. | Approval board/security lead. |
+| `policy:activate` | Promote approved version, schedule activation. | Runtime operators / release managers. |
+| `policy:run` | Trigger runs, inspect live status. | Operators, automation bots. |
+| `policy:runs` | Read run history, replay bundles. | Operators, auditors. |
+| `policy:archive` | Retire versions, perform rollbacks. | Approvers, operators. |
+| `policy:simulate` | Execute simulations via API/CLI. | Authors, reviewers, CI. |
+| `policy:operate` | Activate incident mode, toggle sampling. | SRE/on-call. |
+| `findings:read` | View effective findings/explain. | Analysts, auditors, CLI. |
+| `effective:write` | **Service only** – materialise findings. | Policy Engine service principal. |
+
+> Map organisation roles to scopes via Authority issuer config (`authority.tenants[].roles`). Document assignments in tenant onboarding checklist.
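+
+One possible mapping is sketched below. The role names are hypothetical and the groupings are an interpretation of the "Recommended role" column, not a prescribed matrix; the `tenants[].roles` shape is assumed to match the one used in the Authority scopes document.
+
+```yaml
+tenants:
+  - name: default
+    roles:
+      # Hypothetical role names; adjust to the organisation's RBAC matrix.
+      policy-author:
+        scopes: [policy:read, policy:write, policy:simulate, policy:submit]
+      policy-reviewer:
+        scopes: [policy:read, policy:review, policy:simulate]
+      policy-approver:
+        scopes: [policy:read, policy:approve, policy:archive]
+      policy-operator:
+        scopes: [policy:read, policy:activate, policy:run, policy:runs, policy:operate]
+      policy-auditor:
+        scopes: [policy:read, policy:runs, findings:read]
+```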
+
+> **Authority configuration tip:** the Policy Engine service client must include `properties.serviceIdentity: policy-engine` and a tenant hint in `authority.yaml`. Authority rejects `effective:write` tokens that lack this marker. See [Authority scopes](authority-scopes.md) for the full scope catalogue.
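+
+A minimal sketch of such a client entry follows. Expressing the tenant hint through a `tenants` list is an assumption; the remaining fields follow the client definitions in the Authority scopes document.
+
+```yaml
+clients:
+  - clientId: policy-engine
+    grantTypes: [client_credentials]
+    scopes: [effective:write, effective:read]
+    tenants: [default]            # assumed form of the tenant hint
+    properties:
+      serviceIdentity: policy-engine   # marker checked for effective:write
+```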
+
+---
+
+## 3 · Workflow Controls
+
+- **Submit gate:** CLI/UI require fresh lint + simulation artefacts (<24 h). Submissions store reviewer list and diff attachments.
+- **Review quorum:** Authority policy enforces minimum reviewers (e.g., 2) and optional separation between functional/security domains.
+- **Approval guard:** Approvers must acknowledge simulation + determinism check completion. CLI enforces `--note` and `--attach` fields.
+- **Activation guard:** Policy Engine refuses activation when latest full run status ≠ success or incremental backlog aged > SLA.
+- **Rollback policy:** Rollbacks require incident reference and produce `policy.rollback` audit events.
+
+---
+
+## 4 · Tenancy & Data Access
+
+- Policies stored per tenant; `tenant-global` used for shared baselines.
+- API filters all requests by `X-Stella-Tenant` (default from token). Cross-tenant requests require `policy:tenant-admin`.
+- Effective findings collections include `tenant` field and unique indexes preventing cross-tenant writes.
+- CLI/Console display tenant context prominently; switching tenant triggers warnings when active policy differs.
+- Offline bundles encode tenant metadata; import commands validate compatibility before applying.
+
+---
+
+## 5 · Audit & Evidence
+
+- **Collections:** `policies`, `policy_reviews`, `policy_history`, `policy_runs`, `policy_run_events`, `effective_finding_*_history`.
+- **Events:** `policy.submitted`, `policy.review.requested`, `policy.approved`, `policy.activated`, `policy.archived`, `policy.run.*`, `policy.incident.*`.
+- **Explain traces:** Stored for critical findings (sampled); available via CLI/UI for auditors (requires `findings:read`).
+- **Offline evidence:** `stella policy bundle export` produces DSSE-signed packages containing DSL, IR digest, simulations, approval notes, run summaries, trace metadata.
+- **Retention:** Default 365 days for run history, extendable per compliance requirements; incident mode extends to 30 days minimum.
+
+---
+
+## 6 · Secrets & Configuration Hygiene
+
+- Policy Engine configuration loaded from environment/secret stores; no secrets in repo.
+- CLI profiles should store tokens encrypted (`stella profile set --secret`).
+- UI/CLI logs redact tokens, reviewer emails, and attachments.
+- Rotating tokens/keys: Authority exposes `policy scopes` in discovery docs; follow `/docs/security/authority-scopes.md` for rotation.
+- Use `policy:operate` to disable self-service simulation temporarily during incident response if needed.
+
+---
+
+## 7 · Incident Response
+
+- Trigger incident mode for determinism violations, backlog surges, or suspected policy abuse.
+- Capture replay bundles and run `stella policy run replay` for affected runs.
+- Coordinate with Observability dashboards (see `/docs/observability/policy.md`) to monitor queue depth and failures.
+- After resolution, document remediation in Lifecycle guide (§8) and attach to approval history.
+
+---
+
+## 8 · Offline / Air-Gapped Governance
+
+- Same scopes apply; tokens issued by local Authority.
+- Approvers must use offline UI/CLI to sign submissions; attachments stored locally.
+- Bundle import/export must be signed (DSSE + cosign). CLI warns if signatures are missing.
+- Sealed-mode banner reminds operators to refresh bundles when staleness thresholds are exceeded.
+- Offline audits rely on evidence bundles and local `policy_runs` snapshot.
+
+---
+
+## 9 · Compliance Checklist
+
+- [ ] **Scope mapping reviewed:** Authority issuer config updated; RBAC matrix stored with change request.
+- [ ] **Separation enforced:** Automated checks block self-approval; review quorum satisfied.
+- [ ] **Activation guard documented:** Operators trained on run health checks before promoting.
+- [ ] **Audit exports tested:** Evidence bundles verified (hash/signature) and stored per compliance policy.
+- [ ] **Incident drills rehearsed:** Replay/rollback procedures executed and logged.
+- [ ] **Offline parity confirmed:** Air-gapped site executes submit/approve flow with sealed-mode guidance.
+- [ ] **Documentation cross-links:** References to lifecycle, runs, observability, CLI, and API docs validated.
+
+---
+
+*Last updated: 2025-10-26 (Sprint 20).*