feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
2025-10-30 00:09:39 +02:00
parent 3154c67978
commit 7b5bdcf4d3
503 changed files with 16136 additions and 54638 deletions
--- a/docs/risk/EPIC_18_RISK_PROFILES.md
+++ b/docs/risk/EPIC_18_RISK_PROFILES.md
@@ -1,260 +0,0 @@
-> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
-
---
-
-# Epic 18: Risk Scoring Profiles
-
-**Short name:** Risk Profiles
-**Primary components:** Policy Engine, Findings Ledger, Conseiller (Feedser), Excitator (VEXer), StellaOps Console, Policy Studio, CLI, Export Center, Authority & Tenancy, Observability
-**Surfaces:** policy documents, scoring engine, factor providers, explainability artifacts, APIs, CLI, UI
-
-**AOC ground rule reminder:** Conseiller and Excitator aggregate and link advisories/VEX. They never merge or mutate source records. Risk scoring consumes linked items and computes a contextual score per finding and per asset without collapsing sources; provenance is preserved and shown.
-
---
-
-## 1) What it is
-
-Risk Scoring Profiles let users define, version and apply customizable formulas that turn raw signals (CVSS, EPSS‑like exploit likelihood, KEV‑style exploited lists, VEX status, reachability, runtime evidence, fix availability, asset criticality, provenance trust, etc.) into a single normalized risk score from 0 to 100 with severity buckets. Profiles are authored in Policy Studio, attached to scopes/tenants/projects, simulated against inventories and SBOMs, and executed by a scoring engine that outputs:
-
-* A final score and severity.
-* A factor‑by‑factor contribution breakdown with math.
-* Gating decisions (e.g., VEX “not affected” forces score to 0).
-* Audit and provenance for every signal used.
-
-Profiles can differ by environment: “Exploit‑aware prod,” “Compliance‑focused,” “Safety‑critical,” “Dev velocity,” and so on. The engine is pluggable: new signals can be added without breaking existing profiles.
-
---
-
-## 2) Why
-
-* One size doesn’t fit anyone. Different orgs weigh exploitability vs business criticality differently.
-* Reduce noise and accelerate triage by aligning scores with how teams actually make decisions.
-* Make risk explainable. If a score says 86, show why.
-* Enable policy‑aware flows elsewhere: gates, notifications, dashboards, remediation queues.
-
---
-
-## 3) How it should work
-
-### 3.1 Core model
-
-A **RiskProfile** defines:
-
-* **Metadata:** name, version, description, owner, scope selector, status (draft/published/deprecated).
-* **Signals:** named inputs with source bindings and transforms.
-* **Formula:** a composition of weighted terms, caps, gates, and overrides producing a 0‑100 score.
-* **Severity mapping:** score→{Critical, High, Medium, Low, None}.
-* **Gates:** hard conditions that short‑circuit scoring (e.g., VEX Not Affected → 0).
-* **Overrides:** explicit per‑package/per‑CVE/per‑asset adjustments with audit.
-* **Explainability:** the engine must compute the contribution of each term and include the raw values.
-* **Versioning:** immutable content hash, `profile_id@version`. Inheritance supported via `extends`.
-
-### 3.2 Signals (factor) catalog
-
-Initial signals supported out of the box:
-
-| Signal                 | Description                                      | Expected range | Default transform    |
-| ---------------------- | ------------------------------------------------ | -------------- | -------------------- |
-| `cvss_base`            | CVSS base score from each advisory               | 0..10          | linear: `x/10`       |
-| `epss_like`            | Exploit likelihood (0..1)                        | 0..1           | identity             |
-| `kev_flag`             | Known exploited in the wild (boolean)            | {0,1}          | step: 0 or 1         |
-| `vex_status`           | VEX: affected, not_affected, under_investigation | enum           | gate + multiplier    |
-| `reachability`         | Static reachability to vulnerable code path      | 0..1           | identity             |
-| `runtime_evidence`     | Runtime evidence of vulnerable symbol/path       | 0..1           | identity             |
-| `internet_exposed`     | Asset externally reachable                       | {0,1}          | multiplier           |
-| `asset_criticality`    | Business criticality of asset                    | 1..5           | normalize: `(x-1)/4` |
-| `fix_available`        | Patch or upgrade exists                          | {0,1}          | negative weight      |
-| `age_days`             | Days since advisory published                    | 0..∞           | logistic decay       |
-| `privilege_escalation` | Elevation potential                              | {0,1}          | positive bump        |
-| `rce_flag`             | Remote code execution                            | {0,1}          | positive bump        |
-| `provenance_trust`     | Signature/provenance (SLSA‑ish)                  | 0..1           | inverse weight       |
-| `pkg_popularity`       | Package ecosystem usage                          | 0..1           | mild bump            |
-| `source_consensus`     | Count of agreeing sources (Conseiller‑linked)    | 1..N           | saturating transform |
-
-Notes:
-
-* Conseiller can link multiple advisories per CVE. Signals like `cvss_base` and `kev_flag` are aggregated via declared **reducers**: `max`, `mean`, `any`, or **consensus** (e.g., count of sources claiming exploited).
-
-### 3.3 Formula template
-
-Default formula (normalized result 0..1 before scaling to 0..100):
-
-```
-score =
-  gate(VEX_not_affected => 0) *
-  clamp01(
-    w1*cvss'
-  + w2*epss'
-  + w3*reachability'
-  + w4*runtime_evidence'
-  + w5*internet_exposed'
-  + w6*asset_criticality'
-  + w7*kev_flag'
-  + w8*rce_flag'
-  + w9*privilege_escalation'
-  + w10*source_consensus'
-  + w11*(1 - provenance_trust')
-  + w12*(1 - fix_available')
-  + w13*age_decay'
-  + bias
-  )
-```
-
-* Each term is a transformed, normalized signal (denoted `'`).
-* Weights have defaults; the following example set sums to 1.0 (cvss 0.25, epss 0.2, reachability 0.1, runtime 0.1, internet_exposed 0.08, asset_criticality 0.08, kev 0.07, rce 0.04, priv_esc 0.03, consensus 0.03, provenance inverse 0.01, fix inverse 0.005, age 0.005).
-* **Severity mapping (default):**
-
-  * Critical: score ≥ 85
-  * High: 70 ≤ score < 85
-  * Medium: 40 ≤ score < 70
-  * Low: 15 ≤ score < 40
-  * None: score < 15
-
-Profiles can override weights, gating, transforms and severity thresholds.
-
-### 3.4 Reducers and provenance
-
-For signals with multiple sources:
-
-* `cvss_base`: default reducer `max`.
-* `kev_flag`: reducer `any`.
-* `epss_like`: reducer `max`.
-* `vex_status`: **gate precedence:** if any linked VEX statement says `not_affected`, the gate forces the score to 0 unless an explicit policy disables that source; otherwise, the most conservative status wins (`affected` > `under_investigation` > `unknown`).
-* Every reduction lists contributing sources in the explanation with their digests.
-
-### 3.5 Explainability artifact
-
-For every scored item, produce a JSON object:
-
-```json
-{
-  "profile_id": "risk-default",
-  "profile_version": "1.2.0",
-  "input": { "asset_id": "...", "package": "openssl@1.1.1u", "cve": "CVE-XXXX-YYYY" },
-  "signals": {
-    "cvss_base": { "values": [{"source":"nvd","value":9.8}, {"source":"vendor","value":9.1}], "reducer":"max", "reduced":9.8, "normalized":0.98 },
-    "epss_like": { "value":0.72, "normalized":0.72 },
-    "vex_status": { "values":[{"source":"vendor","value":"affected"}], "decision":"affected" }
-  },
-  "formula": {
-    "weights": { "cvss":0.25, "epss":0.20 },
-    "gates": [{ "name":"VEX_not_affected", "applied": false }]
-  },
-  "contributions": [
-    { "signal":"cvss_base", "weight":0.25, "value":0.98, "contribution":24.5 },
-    { "signal":"epss_like", "weight":0.20, "value":0.72, "contribution":14.4 }
-  ],
-  "score": 87.1,
-  "severity": "Critical",
-  "provenance": { "calculated_at":"2025-10-25T12:00:00Z", "engine":"risk-engine@v0.6.3", "trace_id":"..." }
-}
-```
-
-### 3.6 Profile scoping and inheritance
-
-* Profiles attach to scopes via Authority & Tenancy: org/tenant/project/environment.
-* A scope resolves **one active profile** by precedence: project > environment > org default.
-* A profile may extend a base profile via `extends`, overriding weights and thresholds. Inheritance resolves through the immutable parent chain.
-
-### 3.7 Execution path
-
-1. New or updated findings arrive from Conseiller/Excitator into Findings Ledger.
-2. A **Scoring Job** is enqueued per scope with a batch of items.
-3. The engine pulls necessary signals via **Factor Providers** (reachability, runtime, KEV lists, etc.).
-4. The formula executes; results are upserted to Findings Ledger with an explainability blob pointer.
-5. Notifications Studio triggers alerts based on severity deltas.
-6. Console and CLI read scored findings; filters and charts operate on score and severity.
-
-### 3.8 Factor Provider interface
-
-```
-interface FactorProvider {
-  id(): string;
-  requiredInputs(): string[];
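-  fetch(ctx: ScoringContext, inputs: string[]): Promise<Map<string, FactorValue>>; // ScoringContext and FactorValue are assumed type names, not defined in this document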
-}
-```
-
-Providers must be deterministic and cacheable. Every factor has a TTL and a backfill policy.
-
-### 3.9 Simulation
-
-Policy Studio provides “Simulate with profile” functionality to test profiles against SBOMs or asset sets. Simulation outputs include distributions, severity shifts, and top movers, and can be exported.
-
-### 3.10 Air‑gapped behavior
-
-Profiles work offline; providers rely on bundled datasets produced by Export Center. Missing providers surface explicit gaps in explanations.
-
---
-
-## 4) Architecture
-
-### 4.1 New modules
-
-* `src/RiskEngine/StellaOps.RiskEngine/`
-* `src/RiskEngine/StellaOps.RiskEngine/providers/`
-* `src/Policy/StellaOps.Policy.RiskProfile/`
-* Database migrations for profiles/results/explanations
-* `src/UI/StellaOps.UI`
-* `src/Cli/StellaOps.Cli`
-* `src/ExportCenter/StellaOps.ExportCenter.RiskBundles`
-
-### 4.2 Data model
-
-Tables for `risk_profiles`, `scoring_jobs`, `scoring_results`, `explanations` with indexes on finding keys, scope, severity, and timestamps.
-
---
-
-## 5) APIs and contracts
-
-Endpoints include profile CRUD, publish, simulate, job enqueue, results queries, explanation retrieval, and schema discovery. Authentication scopes: `risk.profile:*`, `risk.result:read`, `risk.job:write`.
-
---
-
-## 6) Documentation changes
-
-List of required docs with banner statements covering overview, profiles, factors, formulas, explainability, API, console UI, CLI commands, air-gapped bundles, and AOC invariants.
-
---
-
-## 7) Implementation plan
-
-Seven phases: foundations, storage/APIs, Console & Policy Studio, CLI & SDKs, expanded factors, air-gapped support, quality/performance.
-
---
-
-## 8) Engineering tasks
-
-Detailed task list spanning schema, engine, providers, APIs, ledger integration, console, CLI, export center, observability, docs, and testing.
-
---
-
-## 9) Feature changes required in other components
-
-Defines cross-team expectations for Conseiller, Excitator, Findings Ledger, Policy Studio, Vulnerability Explorer, SBOM Graph Explorer, Notifications, Authority, Export Center, CLI & SDKs.
-
---
-
-## 10) Acceptance criteria
-
-Coverage of authoring, simulation, scoring, UI, CLI, air-gapped support, AOC invariants, and performance.
-
---
-
-## 11) Risks and mitigations
-
-Addresses signal drift, weight overfitting, performance, VEX trust, and compliance differences.
-
---
-
-## 12) Philosophy
-
-Principles: context, explainability, truth preservation, portability, and loud failures.
-
---
-
-## 13) Example profile
-
-Contains an abbreviated YAML example demonstrating schema usage, weights, gates, severity mapping, and overrides.
-
-> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
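
Note on the removed `EPIC_18_RISK_PROFILES.md`: the default formula (section 3.3) and the VEX gate precedence (section 3.4) can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not the engine's implementation. The function names (`reduce_vex`, `normalized_terms`, `score_finding`, `severity`) and the dictionary input shape are hypothetical, `age_decay` and `source_consensus` are assumed to arrive already transformed into 0..1 because the document does not specify those transforms, `bias` is assumed to be 0, and the severity ranges are read as half-open intervals. Weights and thresholds are the documented defaults.

```python
from typing import Dict, Iterable

# Default weights from section 3.3 of the removed document; they sum to 1.0.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "cvss": 0.25, "epss": 0.20, "reachability": 0.10, "runtime_evidence": 0.10,
    "internet_exposed": 0.08, "asset_criticality": 0.08, "kev_flag": 0.07,
    "rce_flag": 0.04, "privilege_escalation": 0.03, "source_consensus": 0.03,
    "provenance_inverse": 0.01, "fix_inverse": 0.005, "age_decay": 0.005,
}


def clamp01(x: float) -> float:
    """Clamp a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def reduce_vex(statuses: Iterable[str]) -> str:
    """Section 3.4 precedence: any not_affected gates; else most conservative wins."""
    seen = set(statuses)
    if "not_affected" in seen:
        return "not_affected"
    for status in ("affected", "under_investigation"):
        if status in seen:
            return status
    return "unknown"


def normalized_terms(signals: Dict[str, float]) -> Dict[str, float]:
    """Apply the default transforms to already-reduced signals (assumed input shape)."""
    return {
        "cvss": signals["cvss_base"] / 10.0,
        "epss": signals["epss_like"],
        "reachability": signals["reachability"],
        "runtime_evidence": signals["runtime_evidence"],
        "internet_exposed": signals["internet_exposed"],
        "asset_criticality": (signals["asset_criticality"] - 1.0) / 4.0,
        "kev_flag": signals["kev_flag"],
        "rce_flag": signals["rce_flag"],
        "privilege_escalation": signals["privilege_escalation"],
        "source_consensus": signals["source_consensus"],    # assumed pre-saturated
        "provenance_inverse": 1.0 - signals["provenance_trust"],
        "fix_inverse": 1.0 - signals["fix_available"],
        "age_decay": signals["age_decay"],                  # assumed pre-decayed
    }


def score_finding(signals: Dict[str, float], vex_statuses: Iterable[str],
                  weights: Dict[str, float] = DEFAULT_WEIGHTS,
                  bias: float = 0.0) -> float:
    """Return a 0..100 score; the VEX not_affected gate short-circuits to 0."""
    if reduce_vex(vex_statuses) == "not_affected":
        return 0.0
    terms = normalized_terms(signals)
    total = sum(weights[name] * value for name, value in terms.items()) + bias
    return round(100.0 * clamp01(total), 1)


def severity(score: float) -> str:
    """Map a 0..100 score to the default severity buckets."""
    for label, floor in (("Critical", 85), ("High", 70), ("Medium", 40), ("Low", 15)):
        if score >= floor:
            return label
    return "None"
```

With only `cvss_base = 9.8` and `epss_like = 0.72` contributing, the sketch yields 24.5 + 14.4 = 38.9, which lines up with the two contribution rows shown in the section 3.5 example.
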
--- a/docs/risk/risk-profiles.md
+++ b/docs/risk/risk-profiles.md
@@ -0,0 +1,57 @@
+# Risk Scoring Profiles
+
+> Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
+
+## Overview
+
+Risk Scoring Profiles define customizable formulas that convert raw evidence (CVSS, EPSS-like exploit likelihood, KEV exploited lists, VEX status, reachability, runtime evidence, fix availability, asset criticality, provenance trust) into normalized risk scores (0–100) with severity buckets. Profiles are authored in Policy Studio, simulated, versioned, and executed by the scoring engine with full explainability.
+
+- **Primary components:** Policy Engine, Findings Ledger, Conseiller, Excitator, Console, Policy Studio, CLI, Export Center, Authority & Tenancy, Observability.
+- **Surfaces:** policy documents, scoring engine, factor providers, explainability artefacts, APIs, CLI, UI.
+
+The Aggregation-Only Contract (AOC) remains in force: Conseiller and Excitator never merge or mutate source records. Risk scoring consumes linked evidence and preserves provenance for explainability.
+
+## Core workflow
+
+1. **Profile authoring:** Policy Studio exposes a declarative DSL to define factors, weights, thresholds, and severity buckets.
+2. **Simulation:** operators preview profiles against historical findings/SBOMs, compare with existing policies, and inspect factor breakdowns.
+3. **Activation:** Policy Engine evaluates profiles on change streams, producing scores and detailed factor contributions per finding and per asset.
+4. **Explainability:** CLI/Console display math traces, provenance IDs, and rationale for each factor. Export Center packages reports for auditors.
+5. **Versioning:** profiles carry semantic versions, promotion workflows, and rollback hooks; Authority scopes enforce who can publish or edit.
+
+## Factor model
+
+| Factor | Description | Typical signal source |
+| --- | --- | --- |
+| Exploit likelihood | EPSS/KEV or internal intel | Conseiller enrichment |
+| VEX status | not_affected / affected / fixed | Excitator (VEX Lens) |
+| Reachability | entrypoint closure, runtime observations | Scanner + Zastava |
+| Fix availability | patch released, vendor guidance | Conseiller, Policy Engine |
+| Asset criticality | business context, tenant overrides | Policy Studio inputs |
+| Provenance trust | signed evidence, attestation status | Attestor, Authority |
+
+Factors feed into a weighted scoring engine with per-factor contribution reporting.
+
+## Governance & guardrails
+
+- Profiles live in Policy Studio with draft/review/approval workflows.
+- Policy Engine enforces deterministic evaluation; simulations and production runs share the same scoring code.
+- CLI parity enables automated promotion, export/import, and simulation from pipelines.
+- Observability records scoring latency, factor distribution, and profile usage.
+- Offline support: profiles, factor plugins, and explain bundles ship inside mirror bundles for air-gapped environments.
+
+## Deliverables
+
+- Policy language reference and examples.
+- Simulation APIs/CLI with diff output.
+- Scoring engine implementation with explain traces and determinism checks.
+- Console visualizations (severity heatmaps, contribution waterfalls).
+- Export Center reports with risk scoring sections.
+- Observability dashboards for profile health and scoring throughput.
+
+## References
+
+- Policy core: `docs/modules/policy/architecture.md`
+- Findings ledger: `docs/modules/vuln-explorer/architecture.md`
+- VEX consensus: `docs/modules/vex-lens/architecture.md`
+- Offline operations: `docs/airgap/airgap-mode.md`
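
Note on the added `risk-profiles.md`: the document asks for per-factor contribution reporting and determinism checks without showing what either looks like. One way to picture them is the minimal Python sketch below. It is illustrative only: the function names (`explain`, `trace_digest`), the factor keys, the weights, and the trace layout are all hypothetical and do not come from the Policy Engine. The idea is that an explain trace built in a stable order and serialized canonically hashes to the same digest whenever the inputs are the same, so a simulation run and a production run can be compared by digest.

```python
import hashlib
import json
from typing import Dict, List


def explain(weights: Dict[str, float], factors: Dict[str, float]) -> dict:
    """Build an explain trace: one row per weighted factor, plus explicit gaps.

    Factor values are assumed to be normalized to 0..1 already.
    """
    rows: List[dict] = []
    gaps: List[str] = []
    for name in sorted(weights):  # stable order, independent of input ordering
        if name not in factors:
            gaps.append(name)     # missing evidence is reported, not hidden
            continue
        value = factors[name]
        rows.append({
            "factor": name,
            "weight": weights[name],
            "value": value,
            "contribution": round(100.0 * weights[name] * value, 4),
        })
    total = sum(row["contribution"] for row in rows)
    score = round(min(100.0, max(0.0, total)), 1)
    return {"contributions": rows, "gaps": gaps, "score": score}


def trace_digest(trace: dict) -> str:
    """Hash a canonical JSON encoding of the trace (sorted keys, fixed separators)."""
    canonical = json.dumps(trace, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A determinism check then reduces to scoring the same evidence twice, for example once in simulation and once in production, and asserting that the two digests match.
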