feat: Add new projects to solution and implement contract testing documentation

- Added "StellaOps.Policy.Engine", "StellaOps.Cartographer", and "StellaOps.SbomService" projects to the StellaOps solution.
- Created AGENTS.md to outline the Contract Testing Guild Charter, detailing mission, scope, and definition of done.
- Established TASKS.md for the Contract Testing Task Board, outlining tasks for Sprint 62 and Sprint 63 related to mock servers and replay testing.
2025-10-27 07:57:55 +02:00
parent 1e41ba7ffa
commit 651b8e0fa3
355 changed files with 17276 additions and 1160 deletions

EPIC_2.md Normal file

@@ -0,0 +1,567 @@
Fine. Here's the next epic, written so you can paste it straight into the repo without having to babysit me. Same structure as before, maximum detail, zero hand-waving.
---
# Epic 2: Policy Engine & Policy Editor (VEX + Advisory Application Rules)
> Short name: **Policy Engine v2**
> Services touched: **Policy Engine, Web API, Console (Policy Editor), CLI, Conseiller, Excitator, SBOM Service, Authority, Workers/Scheduler**
> Data stores: **MongoDB (policies, runs, effective findings), optional Redis/NATS for jobs**
---
## 1) What it is
This epic delivers the **organization-specific decision layer** for Stella. Ingestion is now AOC-compliant (Epic 1). That means advisories and VEX arrive as immutable raw facts. This epic builds the place where those facts become **effective findings** under policies you control.
Core deliverables:
* **Policy Engine**: deterministic evaluator that applies rule sets to inputs:
* Inputs: `advisory_raw`, `vex_raw`, SBOMs, optional telemetry hooks (reachability stubs), org metadata.
* Outputs: `effective_finding_{policyId}` materializations, with full explanation traces.
* **Policy Editor (Console + CLI)**: versioned policy authoring, simulation, review/approval workflow, and change diffs.
* **Rules DSL v1**: safe, declarative language for VEX application, advisory normalization, and risk scoring. No arbitrary code execution, no network calls.
* **Run Orchestrator**: incremental re-evaluation when new raw facts or SBOM changes arrive; efficient partial updates.
The philosophy is boring on purpose: policy is a **pure function of inputs**. Same inputs and same policy yield the same outputs, every time, on every machine. If you want drama, watch reality TV, not your risk pipeline.
---
## 2) Why
* Vendors disagree, contexts differ, and your tolerance for risk is not universal.
* VEX means nothing until you decide **how** to apply it to **your** assets.
* Auditors care about the “why.” You'll need consistent, replayable answers, with traces.
* Security teams need **simulation** before rollouts, and **diffs** after.
---
## 3) How it should work (deep details)
### 3.1 Data model
#### 3.1.1 Policy documents (Mongo: `policies`)
```json
{
  "_id": "policy:P-7:v3",
  "policy_id": "P-7",
  "version": 3,
  "name": "Default Org Policy",
  "status": "approved", // draft | submitted | approved | archived
  "owned_by": "team:sec-plat",
  "valid_from": "2025-01-15T00:00:00Z",
  "valid_to": null,
  "dsl": {
    "syntax": "stella-dsl@1",
    "source": "rule-set text or compiled IR ref"
  },
  "metadata": {
    "description": "Baseline scoring + VEX precedence",
    "tags": ["baseline","vex","cvss"]
  },
  "provenance": {
    "created_by": "user:ali",
    "created_at": "2025-01-15T08:00:00Z",
    "submitted_by": "user:kay",
    "approved_by": "user:root",
    "approval_at": "2025-01-16T10:00:00Z",
    "checksum": "sha256:..."
  },
  "tenant": "default"
}
```
Constraints:
* `status=approved` is required to run in production.
* Version increments are append-only. Old versions remain runnable for replay.
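
The constraints above can be sketched as a small lifecycle guard. This is a minimal illustration, not the engine's implementation: `PolicyVersion`, `transition`, `can_run_in_production`, and `next_version` are hypothetical names, and the strictly linear transition order is an assumption read off the status comment in the example document (a rejection path back to `draft` may well exist).

```python
from dataclasses import dataclass, replace

# Allowed lifecycle moves. The strictly linear order is an assumption drawn
# from the status comment in the policy document example.
ALLOWED_TRANSITIONS = {
    "draft": {"submitted"},
    "submitted": {"approved"},
    "approved": {"archived"},
    "archived": set(),
}


@dataclass(frozen=True)
class PolicyVersion:
    policy_id: str
    version: int
    status: str = "draft"


def transition(policy: PolicyVersion, new_status: str) -> PolicyVersion:
    """Return a copy in the new status, or raise if the move is not allowed."""
    if new_status not in ALLOWED_TRANSITIONS[policy.status]:
        raise ValueError(f"cannot move {policy.status} -> {new_status}")
    return replace(policy, status=new_status)


def can_run_in_production(policy: PolicyVersion) -> bool:
    """Only approved versions may run in production."""
    return policy.status == "approved"


def next_version(existing: list[PolicyVersion]) -> int:
    """Append-only numbering: never reuse or overwrite an earlier version."""
    return max((p.version for p in existing), default=0) + 1
```

Records are frozen so that a status change always produces a new value instead of mutating a stored version, which matches the append-only intent.
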
#### 3.1.2 Policy runs (Mongo: `policy_runs`)
```json
{
  "_id": "run:P-7:2025-02-20T12:34:56Z:abcd",
  "policy_id": "P-7",
  "policy_version": 3,
  "inputs": {
    "sbom_set": ["sbom:S-42"],
    "advisory_cursor": "2025-02-20T00:00:00Z",
    "vex_cursor": "2025-02-20T00:00:00Z"
  },
  "mode": "incremental", // full | incremental | simulate
  "stats": {
    "components": 1742,
    "advisories_considered": 9210,
    "vex_considered": 1187,
    "rules_fired": 68023,
    "findings_out": 4321
  },
  "trace": {
    "location": "blob://traces/run-.../index.json",
    "sampling": "smart-10pct"
  },
  "status": "succeeded", // queued | running | failed | succeeded | canceled
  "started_at": "2025-02-20T12:34:56Z",
  "finished_at": "2025-02-20T12:35:41Z",
  "tenant": "default"
}
```
#### 3.1.3 Effective findings (Mongo: `effective_finding_P-7`)
```json
{
  "_id": "P-7:S-42:pkg:npm/lodash@4.17.21:CVE-2021-23337",
  "policy_id": "P-7",
  "policy_version": 3,
  "sbom_id": "S-42",
  "component_purl": "pkg:npm/lodash@4.17.21",
  "advisory_ids": ["CVE-2021-23337", "GHSA-..."],
  "status": "affected", // affected | not_affected | fixed | under_investigation | suppressed
  "severity": {
    "normalized": "High",
    "score": 7.5,
    "vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
    "rationale": "cvss_base(OSV) + vendor_weighting + env_modifiers"
  },
  "rationale": [
    {"rule":"vex.precedence","detail":"VendorX not_affected justified=component_not_present wins"},
    {"rule":"advisory.cvss.normalization","detail":"mapped GHSA severity to CVSS 3.1 = 7.5"}
  ],
  "references": {
    "advisory_raw_ids": ["advisory_raw:osv:GHSA-...:v3"],
    "vex_raw_ids": ["vex_raw:VendorX:doc-123:v4"]
  },
  "run_id": "run:P-7:2025-02-20T12:34:56Z:abcd",
  "tenant": "default"
}
```
Write protection: only the **Policy Engine** service identity may write any `effective_finding_*` collection.
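
The `_id` in the example reads as `policy_id:sbom_id:component_purl:advisory_id`, which makes writes naturally idempotent per tuple. A minimal sketch under that reading follows; the helper names are hypothetical, and picking a single primary advisory ID for the key is an assumption (the example lists several `advisory_ids` but keys on the CVE).

```python
def finding_id(policy_id: str, sbom_id: str, component_purl: str, advisory_id: str) -> str:
    """Compose the document key in the shape shown in the example finding."""
    return f"{policy_id}:{sbom_id}:{component_purl}:{advisory_id}"


def findings_collection(policy_id: str) -> str:
    """One collection per policy: effective_finding_{policyId}."""
    return f"effective_finding_{policy_id}"


def can_write_findings(scopes: set[str]) -> bool:
    """Write guard: only a caller holding effective:write may write findings."""
    return "effective:write" in scopes
```

PURLs contain colons themselves, so a key built this way should be treated as opaque: look up `sbom_id` and `component_purl` from the document fields rather than by splitting `_id`.
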
---
### 3.2 Rules DSL v1 (stella-dsl@1)
**Design goals**
* Declarative, composable, deterministic.
* No loops, no network IO, no nondeterministic time.
* Policy authors see readable text; the engine compiles to a safe IR.
**Concepts**
* **WHEN** condition matches a tuple `(sbom_component, advisory, optional vex_statements)`
* **THEN** actions set `status`, compute `severity`, attach `rationale`, or `suppress` with reason.
* **Profiles** for severity and scoring; **Maps** for vendor weighting; **Guards** for VEX justification.
**Mini-grammar (subset)**
```
policy "Default Org Policy" syntax "stella-dsl@1" {
  profile severity {
    map vendor_weight {
      source "GHSA" => +0.5
      source "OSV" => +0.0
      source "VendorX" => -0.2
    }
    env base_cvss {
      if env.runtime == "serverless" then -0.5
      if env.exposure == "internal-only" then -1.0
    }
  }
  rule vex_precedence {
    when vex.any(status in ["not_affected","fixed"])
      and vex.justification in ["component_not_present","vulnerable_code_not_present"]
    then status := vex.status
    because "VEX strong justification prevails";
  }
  rule advisory_to_cvss {
    when advisory.source in ["GHSA","OSV"]
    then severity := normalize_cvss(advisory)
    because "Map vendor severity or CVSS vector";
  }
  rule reachability_soft_suppress {
    when severity.normalized <= "Medium"
      and telemetry.reachability == "none"
    then status := "suppressed"
    because "not reachable and low severity";
  }
}
```
**Built-ins** (non-exhaustive)
* `normalize_cvss(advisory)` maps GHSA/OSV/CSAF severity fields to CVSS v3.1 numbers when possible; otherwise it falls back to the vendor-to-numeric mapping table in the policy.
* `vex.any(...)` tests across matching VEX statements for the same `(component, advisory)`.
* `telemetry.*` is an optional input namespace reserved for future reachability data; if absent, expressions evaluate to `unknown` (no effect).
**Determinism**
* Rules are evaluated in **stable order**: explicit `priority` attribute or lexical order.
* **First-match** semantics for conflicting status unless `combine` is used.
* Severity computations are pure; numeric maps are part of policy document.
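
A minimal sketch of stable ordering with first-match status, to make the semantics concrete. It is not the engine's evaluator: the `Rule` shape is hypothetical, "lower `priority` runs first" and "rules with a priority run before rules without one" are assumptions, and the `affected` fallback is the default given in the evaluation model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    when: Callable[[dict], bool]  # pure predicate over the evaluation context
    status: str                   # status assigned when the rule matches
    because: str
    priority: Optional[int] = None


def stable_order(rules: list[Rule]) -> list[Rule]:
    """Explicit priority first (assumed: lower runs earlier), then lexical by name."""
    return sorted(
        rules,
        key=lambda r: (r.priority is None, r.priority if r.priority is not None else 0, r.name),
    )


def evaluate_status(rules: list[Rule], ctx: dict) -> tuple[str, list[dict]]:
    """First matching rule sets the status; every matching rule is recorded."""
    status = None
    rationale = []
    for rule in stable_order(rules):
        if rule.when(ctx):
            rationale.append({"rule": rule.name, "detail": rule.because})
            if status is None:
                status = rule.status
    return (status or "affected"), rationale
```

Because the order depends only on fields stored in the policy, the same rule set yields the same rationale chain regardless of how the rules were loaded.
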
---
### 3.3 Evaluation model
1. **Selection**
* For each SBOM component PURL, find candidate advisories from `advisory_raw` via linkset PURLs or identifiers.
* For each pair `(component, advisory)`, load all matching VEX facts from `vex_raw`.
2. **Context assembly**
* Build an evaluation context from:
* `sbom_component`: PURL, licenses, relationships.
* `advisory`: source, identifiers, references, embedded vendor severity (kept in `content.raw`).
* `vex`: list of statements with status and justification.
* `env`: org-specific env vars configured per policy run (e.g., exposure).
* Optional `telemetry` if available.
3. **Rule execution**
* Compile DSL to IR once per policy version; cache.
* Execute rules per tuple; record which rules fired and the order.
* If no rule sets status, default is `affected`.
* If no rule sets severity, default severity uses `normalize_cvss(advisory)` with vendor defaults.
4. **Materialization**
* Write to `effective_finding_{policyId}` with `rationale` chain and references to raw docs.
* Emit per-tuple trace events; sample and store full traces per run.
5. **Incremental updates**
* A watch job observes new `advisory_raw` and `vex_raw` inserts and SBOM deltas.
* The orchestrator computes the affected tuples and re-evaluates only those.
6. **Replay**
* Any `policy_run` is fully reproducible by `(policy_id, version, input set, cursors)`.
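
The selection step above can be sketched as an in-memory join. This is an illustration only: it matches on exact PURL equality rather than the PURL equivalence and version ranges the real linksets provide, and the `ids` and `advisory_id` field names are hypothetical (only `linkset.purls` appears in this epic).

```python
from collections import defaultdict
from typing import Iterable, Iterator


def index_by_purl(docs: Iterable[dict]) -> dict[str, list[dict]]:
    """Group raw documents by every PURL in their linkset."""
    index = defaultdict(list)
    for doc in docs:
        for purl in doc["linkset"]["purls"]:
            index[purl].append(doc)
    return index


def select_tuples(
    component_purls: Iterable[str],
    advisories: Iterable[dict],
    vex_docs: Iterable[dict],
) -> Iterator[tuple[str, dict, list[dict]]]:
    """Yield (component, advisory, matching VEX) tuples one at a time."""
    advisory_index = index_by_purl(advisories)
    vex_index = index_by_purl(vex_docs)
    for purl in component_purls:
        for advisory in advisory_index.get(purl, []):
            vex = [
                v for v in vex_index.get(purl, [])
                if v["advisory_id"] in advisory["ids"]  # hypothetical field names
            ]
            yield purl, advisory, vex
```

Yielding tuples instead of building a list keeps the evaluator streaming, so the full cross product never has to sit in memory.
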
---
### 3.4 VEX application semantics
* **Precedence**: a `not_affected` with strong justification (`component_not_present`, `vulnerable_code_not_present`, `fix_not_required`) wins unless another rule explicitly overrides by environment context.
* **Scoping**: VEX statements often specify product/component scope. Matching uses PURL equivalence and version ranges extracted during ingestion linkset generation.
* **Conflicts**: If multiple VEX statements conflict, the default is **most-specific scope wins** (component > product > vendor), then newest `document_version`. Policies can override with explicit rules.
* **Explainability**: Every VEX-driven decision records which statement IDs were considered and which one won.
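
The default conflict rule can be sketched as a single ordered comparison. The statement shape (`id`, `scope`, `document_version`) is hypothetical, and the final tie-break on statement ID is an assumption added so that two statements with equal scope and version still resolve deterministically.

```python
from typing import Optional

# Higher rank means more specific: component > product > vendor.
SCOPE_RANK = {"vendor": 0, "product": 1, "component": 2}


def resolve_vex_conflict(statements: list[dict]) -> tuple[Optional[dict], list[str]]:
    """Return (winning statement, IDs considered) for one (component, advisory) pair."""
    if not statements:
        return None, []
    considered = sorted(s["id"] for s in statements)
    winner = max(
        statements,
        key=lambda s: (SCOPE_RANK[s["scope"]], s["document_version"], s["id"]),
    )
    return winner, considered
```

Returning the considered IDs alongside the winner gives the explanation trace both halves it needs: what was looked at and what prevailed.
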
---
### 3.5 Advisory normalization rules
* **Vendor severity mapping**: Map GHSA levels or CSAF product-tree severities to CVSS-like numeric bands via policy maps.
* **CVSS vector use**: If a valid vector exists in `content.raw`, parse and compute; apply policy modifiers from `profile severity`.
* **Temporal/environment modifiers**: Optional reductions for network exposure, isolation, or compensating controls, all encoded in policy.
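
A minimal sketch of how a base score, a vendor weight, and environment modifiers might combine into a normalized band. Combining them additively and clamping to the 0.0 to 10.0 range is an assumption (the epic only says modifiers are encoded in policy); the band thresholds are the CVSS v3.1 qualitative rating scale.

```python
def effective_score(base: float, vendor_weight: float, env_modifiers: list[float]) -> float:
    """Additive combination (assumed), clamped to the CVSS range, one decimal place."""
    raw = base + vendor_weight + sum(env_modifiers)
    return round(min(10.0, max(0.0, raw)), 1)


def severity_band(score: float) -> str:
    """CVSS v3.1 qualitative rating for a numeric score."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Example using the weights from the DSL sample: an OSV advisory (+0.0) with a
# 7.5 base score, running serverless (-0.5) on an internal-only service (-1.0).
score = effective_score(7.5, 0.0, [-0.5, -1.0])
band = severity_band(score)
```

With those inputs the score drops from 7.5 to 6.0, which moves the finding from High to Medium. That is exactly the kind of change a simulation should surface before approval.
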
---
### 3.6 Performance and scale
* Partition evaluation by SBOM ID and hash ranges of PURLs.
* Pre-index `advisory_raw.linkset.purls` and `vex_raw.linkset.purls` (already in Epic 1).
* Use streaming iterators; avoid loading entire SBOM or advisory sets into memory.
* Materialize only changed findings (diff-aware writes).
* Target: an incremental run over 100k components with 1M advisories considered completes within 5 minutes on commodity hardware.
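
Partitioning by PURL hash has to be stable across processes and machines, or two workers will disagree about who owns a component. A minimal sketch, assuming SHA-256 as the hash (the epic does not name one); Python's built-in `hash()` is deliberately avoided because string hashing is randomized per process.

```python
import hashlib
from collections import defaultdict


def partition_for(purl: str, partitions: int) -> int:
    """Map a PURL to a partition using a process-independent hash."""
    digest = hashlib.sha256(purl.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def partition_components(purls: list[str], partitions: int) -> dict[int, list[str]]:
    """Group PURLs by partition, preserving input order inside each group."""
    groups = defaultdict(list)
    for purl in purls:
        groups[partition_for(purl, partitions)].append(purl)
    return dict(groups)
```

A modulo scheme like this reshuffles most keys when the partition count changes; if workers are resized often, fixed hash ranges or consistent hashing would be a better fit.
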
---
### 3.7 Error codes
| Code | Meaning | HTTP |
| ------------- | ----------------------------------------------------- | ---- |
| `ERR_POL_001` | Policy syntax error | 400 |
| `ERR_POL_002` | Policy not approved for run | 403 |
| `ERR_POL_003` | Missing inputs (SBOM/advisory/vex fetch failed) | 424 |
| `ERR_POL_004` | Determinism guard triggered (non-pure function usage) | 500 |
| `ERR_POL_005` | Write denied to effective findings (caller invalid) | 403 |
| `ERR_POL_006` | Run canceled or timed out | 408 |
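
One way to keep the API and CLI consistent with this table is to hold it in a single lookup that both layers import. A minimal sketch; `POLICY_ERRORS` and `PolicyError` are hypothetical names, and only the codes, meanings, and HTTP statuses come from the table.

```python
# code -> (HTTP status, meaning), copied from the error table.
POLICY_ERRORS = {
    "ERR_POL_001": (400, "Policy syntax error"),
    "ERR_POL_002": (403, "Policy not approved for run"),
    "ERR_POL_003": (424, "Missing inputs (SBOM/advisory/vex fetch failed)"),
    "ERR_POL_004": (500, "Determinism guard triggered (non-pure function usage)"),
    "ERR_POL_005": (403, "Write denied to effective findings (caller invalid)"),
    "ERR_POL_006": (408, "Run canceled or timed out"),
}


class PolicyError(Exception):
    """Exception carrying a stable ERR_POL_* code and its HTTP status."""

    def __init__(self, code: str, detail: str = ""):
        http_status, meaning = POLICY_ERRORS[code]
        suffix = f" ({detail})" if detail else ""
        super().__init__(f"{code}: {meaning}{suffix}")
        self.code = code
        self.http_status = http_status
```

An unknown code raises `KeyError` at construction time, so a typo in a code string fails loudly in tests instead of surfacing as a generic 500.
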
---
### 3.8 Observability
* Metrics:
* `policy_compile_seconds`, `policy_run_seconds{mode=...}`, `rules_fired_total`, `findings_written_total`, `vex_overrides_total`, `simulate_diff_total{delta=up|down|unchanged}`.
* Tracing:
* Spans: `policy.compile`, `policy.select`, `policy.eval`, `policy.materialize`.
* Logs:
* Include `policy_id`, `version`, `run_id`, `sbom_id`, `component_purl`, `advisory_id`, `vex_count`, `rule_hits`.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
### 3.9 Security and tenancy
* Only users with `policy:write` can create/modify policies.
* `policy:approve` is a separate privileged role.
* Only Policy Engine service identity has `effective:write`.
* Tenancy is explicit on all documents and queries.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
## 4) API surface
### 4.1 Policy CRUD and lifecycle
* `POST /policies` create draft
* `GET /policies?status=...` list
* `GET /policies/{policyId}/versions/{v}` fetch
* `POST /policies/{policyId}/submit` move draft to submitted
* `POST /policies/{policyId}/approve` approve version
* `POST /policies/{policyId}/archive` archive version
### 4.2 Compilation and validation
* `POST /policies/{policyId}/versions/{v}/compile`
* Returns IR checksum, syntax diagnostics, rule stats.
### 4.3 Runs
* `POST /policies/{policyId}/runs` body: `{mode, sbom_set, advisory_cursor?, vex_cursor?, env?}`
* `GET /policies/{policyId}/runs/{runId}` status + stats
* `POST /policies/{policyId}/simulate` returns **diff** vs current approved version on a sample SBOM set.
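
A minimal sketch of the diff a simulation could return, comparing severity scores per finding ID between the current approved version and the candidate. The `up`, `down`, and `unchanged` buckets mirror the `simulate_diff_total` metric labels; the `added` and `removed` buckets and the score-only comparison are assumptions.

```python
from collections import Counter


def simulate_diff(baseline: dict[str, float], candidate: dict[str, float]) -> dict[str, int]:
    """Count findings whose score went up, went down, stayed the same, appeared, or vanished."""
    counts = Counter()
    for finding_id in baseline.keys() | candidate.keys():
        before = baseline.get(finding_id)
        after = candidate.get(finding_id)
        if before is None:
            counts["added"] += 1
        elif after is None:
            counts["removed"] += 1
        elif after > before:
            counts["up"] += 1
        elif after < before:
            counts["down"] += 1
        else:
            counts["unchanged"] += 1
    return dict(counts)
```

A real diff would likely also compare `status`, since a finding that flips from `affected` to `suppressed` at the same score is still a change reviewers need to see.
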
### 4.4 Findings and explanations
* `GET /findings/{policyId}?sbom_id=S-42&status=affected&severity=High+Critical`
* `GET /findings/{policyId}/{findingId}/explain` returns ordered rule hits and linked raw IDs.
All endpoints require tenant scoping and appropriate `policy:*` or `findings:*` roles.
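
A small sketch of building the findings query shown in 4.4 from a client. It assumes the `+` in `severity=High+Critical` is a form-encoded space separating multiple severities, which is how `urllib.parse.urlencode` renders a space; the helper name is hypothetical, and host, authentication, and tenant headers are left out.

```python
from urllib.parse import quote, urlencode


def findings_query(policy_id: str, sbom_id: str, status: str, severities: list[str]) -> str:
    """Build the path and query string for GET /findings/{policyId}."""
    params = {
        "sbom_id": sbom_id,
        "status": status,
        "severity": " ".join(severities),  # encoded as High+Critical
    }
    return f"/findings/{quote(policy_id, safe='')}?{urlencode(params)}"


path = findings_query("P-7", "S-42", "affected", ["High", "Critical"])
```

Here `path` is `/findings/P-7?sbom_id=S-42&status=affected&severity=High+Critical`, matching the example endpoint.
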
---
## 5) Console (Policy Editor) and CLI behavior
**Console**
* Monaco-style editor with DSL syntax highlighting, lint, quick docs.
* Side-by-side **Simulation** panel: show count of affected findings before/after.
* Approval workflow: submit, review comments, approve with rationale.
* Diffs: show rule-wise changes and estimated impact.
* Read-only run viewer: heatmap of rules fired, top suppressions, VEX wins.
**CLI**
* `stella policy new --name "Default Org Policy"`
* `stella policy edit P-7` opens local editor -> `submit`
* `stella policy approve P-7 --version 3`
* `stella policy simulate P-7 --sbom S-42 --env exposure=internal-only`
* `stella findings ls --policy P-7 --sbom S-42 --status affected`
Exit codes map to `ERR_POL_*`.
---
## 6) Implementation tasks
### 6.1 Policy Engine service
* [ ] Implement DSL parser and IR compiler (`stella-dsl@1`).
* [ ] Build evaluator with stable ordering and first-match semantics.
* [ ] Implement selection joiners for SBOM↔advisory↔vex using linksets.
* [ ] Materialization writer with upsert-only semantics to `effective_finding_{policyId}`.
* [ ] Determinism guard (ban wall-clock, network, and RNG during eval).
* [ ] Incremental orchestrator listening to advisory/vex/SBOM change streams.
* [ ] Trace emitter with rule-hit sampling.
* [ ] Unit tests, property tests, golden fixtures; perf tests to target SLA.
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 6.2 Web API
* [ ] Policy CRUD, compile, run, simulate, findings, explain endpoints.
* [ ] Pagination, filters, and tenant enforcement on all list endpoints.
* [ ] Error mapping to `ERR_POL_*`.
* [ ] Rate limits on simulate endpoints.
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 6.3 Console (Policy Editor)
* [ ] Editor with DSL syntax highlighting and inline diagnostics.
* [ ] Simulation UI with pre/post counts and top deltas.
* [ ] Approval workflow UI with audit trail.
* [ ] Run viewer dashboards (rule heatmap, VEX wins, suppressions).
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 6.4 CLI
* [ ] New commands: `policy new|edit|submit|approve|simulate`, `findings ls|get`.
* [ ] JSON/YAML output formats for CI consumption.
* [ ] Non-zero exits on syntax errors or simulation failures; map to `ERR_POL_*`.
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 6.5 Conseiller & Excitator integration
* [ ] Provide search endpoints optimized for policy selection (batch by PURLs and IDs).
* [ ] Harden linkset extraction to maximize join recall.
* [ ] Add cursors for incremental selection windows per run.
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 6.6 SBOM Service
* [ ] Ensure fast PURL index and component metadata projection for policy queries.
* [ ] Provide relationship graph API for future transitive logic.
* [ ] Emit change events on SBOM updates.
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 6.7 Authority
* [ ] Define scopes: `policy:write`, `policy:approve`, `policy:run`, `findings:read`, `effective:write`.
* [ ] Issue service identity for Policy Engine with `effective:write` only.
* [ ] Enforce tenant claims at gateway.
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 6.8 CI/CD
* [ ] Lint policy DSL in PRs; block invalid syntax.
* [ ] Run `simulate` against golden SBOMs to detect explosive deltas.
* [ ] Determinism CI: two runs with identical inputs (same policy version, input set, and cursors) produce identical outputs.
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
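
A determinism check in CI can be as simple as running the same evaluation twice and comparing a digest of the canonicalized output. A minimal sketch, assuming findings are JSON-serializable dicts keyed by `_id`; `canonical_digest` and `is_deterministic` are hypothetical helpers, and `run` stands in for whatever invokes the engine.

```python
import hashlib
import json
from typing import Callable


def canonical_digest(findings: list[dict]) -> str:
    """Hash findings independent of list order and of key order within each document."""
    ordered = sorted(findings, key=lambda f: f["_id"])
    payload = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_deterministic(run: Callable[[dict], list[dict]], inputs: dict) -> bool:
    """Run the evaluation twice on the same inputs and compare digests."""
    return canonical_digest(run(inputs)) == canonical_digest(run(inputs))
```

Canonicalizing before hashing means the check fails only on real output differences, not on incidental ordering from the datastore or the serializer.
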
---
## 7) Documentation changes (create/update these files)
1. **`/docs/policy/overview.md`**
* What the Policy Engine is, high-level concepts, inputs, outputs, determinism.
2. **`/docs/policy/dsl.md`**
* Full grammar, built-ins, examples, best practices, anti-patterns.
3. **`/docs/policy/lifecycle.md`**
* Draft → submitted → approved → archived, roles, and audit trail.
4. **`/docs/policy/runs.md`**
* Run modes, incremental mechanics, cursors, replay.
5. **`/docs/api/policy.md`**
* Endpoints, request/response schemas, error codes.
6. **`/docs/cli/policy.md`**
* Command usage, examples, exit codes, JSON output contracts.
7. **`/docs/ui/policy-editor.md`**
* Screens, workflows, simulation, diffs, approvals.
8. **`/docs/architecture/policy-engine.md`**
* Detailed sequence diagrams, selection/join strategy, materialization schema.
9. **`/docs/observability/policy.md`**
* Metrics, tracing, logs, sample dashboards.
10. **`/docs/security/policy-governance.md`**
* Scopes, approvals, tenancy, least privilege.
11. **`/docs/examples/policies/`**
* `baseline.pol`, `serverless.pol`, `internal-only.pol`, each with commentary.
12. **`/docs/faq/policy-faq.md`**
* Common pitfalls, VEX conflict handling, determinism gotchas.
Each file includes a **Compliance checklist** for authors and reviewers.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
## 8) Acceptance criteria
* Policies are versioned, approvable, and compilable; invalid DSL blocks merges.
* Engine produces deterministic outputs with full rationale chains.
* VEX precedence rules work per spec and are overridable by policy.
* Simulation yields accurate pre/post deltas and diffs.
* Only Policy Engine can write to `effective_finding_*`.
* Incremental runs pick up new advisories/VEX/SBOM changes without full reruns.
* Console and CLI cover authoring, simulation, approval, and retrieval.
* Observability dashboards show rule hits, VEX wins, and run timings.
---
## 9) Risks and mitigations
* **Policy sprawl**: too many similar policies.
* Mitigation: templates, policy inheritance in v1.1, tagging, ownership metadata.
* **Non-determinism creep**: someone sneaks wall-clock or network into evaluation.
* Mitigation: determinism guard, static analyzer, and CI replay check.
* **Join miss-rate**: weak linksets cause under-matching.
* Mitigation: linkset strengthening in ingestion, PURL equivalence tables, monitoring for “zero-hit” rates.
* **Approval bottlenecks**: blocked rollouts.
* Mitigation: RBAC with delegated approvers and time-boxed SLAs.
---
## 10) Test plan
* **Unit**: parser, compiler, evaluator; conflict resolution; precedence.
* **Property**: random policies over synthetic inputs; ensure no panics and stable outputs.
* **Golden**: fixed SBOM + curated advisories/VEX → expected findings; compare every run.
* **Performance**: large SBOMs with heavy rule sets; assert run times and memory ceilings.
* **Integration**: end-to-end simulate → approve → run → diff; verify write protections.
* **Chaos**: inject malformed VEX, missing advisories; ensure graceful degradation and clear errors.
---
## 11) Developer checklists
**Definition of Ready**
* Policy grammar finalized; examples prepared.
* Linkset join queries benchmarked.
* Owner and approvers assigned.
**Definition of Done**
* All APIs live with RBAC.
* CLI and Console features shipped.
* Determinism and golden tests green.
* Observability dashboards deployed.
* Docs in section 7 merged.
* Two real org policies migrated and in production.
---
## 12) Glossary
* **Policy**: versioned rule set controlling status and severity.
* **DSL**: domain-specific language used to express rules.
* **Run**: a single evaluation execution with defined inputs and outputs.
* **Simulation**: a run that doesn't write findings; returns diffs.
* **Materialization**: persisted effective findings for fast queries.
* **Determinism**: same inputs + same policy = same outputs. Always.
---
### Final imposed reminder
**Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.**