feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
2025-10-30 00:09:39 +02:00
parent 3154c67978
commit 7b5bdcf4d3
503 changed files with 16136 additions and 54638 deletions
--- a/docs/modules/vuln-explorer/AGENTS.md
+++ b/docs/modules/vuln-explorer/AGENTS.md
@@ -0,0 +1,22 @@
+# Vulnerability Explorer agent guide
+
+## Mission
+Vulnerability Explorer delivers policy-aware triage, investigation, and reporting surfaces for effective findings.
+
+## Key docs
+- [Module README](./README.md)
+- [Architecture](./architecture.md)
+- [Implementation plan](./implementation_plan.md)
+- [Task board](./TASKS.md)
+
+## How to get started
+1. Review ./architecture.md for ledger schema, workflow states, and export requirements.
+2. Open ../../implplan/SPRINTS.md and locate stories for this component.
+3. Check ./TASKS.md and update status before/after work.
+4. Read README/architecture for design context and update as the implementation evolves.
+
+## Guardrails
+- Uphold Aggregation-Only Contract boundaries when consuming ingestion data.
+- Preserve determinism and provenance in all derived outputs.
+- Document offline/air-gap pathways for any new feature.
+- Update telemetry/observability assets alongside feature work.
--- a/docs/modules/vuln-explorer/README.md
+++ b/docs/modules/vuln-explorer/README.md
@@ -0,0 +1,29 @@
+# StellaOps Vulnerability Explorer
+
+Vulnerability Explorer delivers policy-aware triage, investigation, and reporting surfaces for effective findings.
+
+## Responsibilities
+- Present policy-evaluated findings with advisory, VEX, SBOM, and runtime context.
+- Capture triage workflow in an immutable findings ledger with role-based access.
+- Provide pivots, exports, and reports for auditors and operations teams.
+- Integrate explain traces, remediation notes, and offline bundles.
+
+## Key components
+- Findings Ledger service + API.
+- Console module and CLI verbs for triage workflows.
+- Export integrations for reports and evidence packages.
+
+## Integrations & dependencies
+- Policy Engine for effective findings streams.
+- Concelier/Excititor for evidence provenance.
+- Scheduler for remediation/verification jobs.
+- Notify for triage notifications.
+
+## Operational notes
+- Audit logging per Epic 6 requirements.
+- Offline-ready CSV/PDF exports with deterministic hashes.
+- Dashboards for MTTR and triage throughput.
+
+## Epic alignment
+- Epic 6: Vulnerability Explorer.
+- VULN stories tracked in ../../TASKS.md and src/VulnExplorer/**/TASKS.md.
--- a/docs/modules/vuln-explorer/TASKS.md
+++ b/docs/modules/vuln-explorer/TASKS.md
@@ -0,0 +1,9 @@
+# Task board — Vulnerability Explorer
+
+> Local tasks should link back to ./AGENTS.md and mirror status updates into ../../TASKS.md when applicable.
+
+| ID | Status | Owner(s) | Description | Notes |
+|----|--------|----------|-------------|-------|
+| VULNERABILITY-EXPLORER-DOCS-0001 | DOING (2025-10-29) | Docs Guild | Ensure ./README.md reflects the latest epic deliverables. | Align with ./AGENTS.md |
+| VULNERABILITY-EXPLORER-ENG-0001 | TODO | Module Team | Break down epic milestones into actionable stories. | Sync into ../../TASKS.md |
+| VULNERABILITY-EXPLORER-OPS-0001 | TODO | Ops Guild | Prepare runbooks/observability assets once MVP lands. | Document outputs in ./README.md |
--- a/docs/modules/vuln-explorer/architecture.md
+++ b/docs/modules/vuln-explorer/architecture.md
@@ -0,0 +1,66 @@
+# Vulnerability Explorer architecture
+
+> Based on Epic 6 – Vulnerability Explorer; this specification summarises the ledger model, triage workflows, APIs, and export requirements.
+
+## 1) Ledger data model
+
+- **Collections / tables**
+  - `finding_records` – canonical, policy-derived findings enriched with advisory, VEX, SBOM, runtime context. Includes `policyVersion`, `advisoryRawIds`, `vexRawIds`, `sbomComponentId`, and `explainBundleRef`.
+  - `finding_history` – append-only state transitions (`new`, `triaged`, `accepted_risk`, `remediated`, `false_positive`, etc.) with timestamps, actor, and justification.
+  - `triage_actions` – discrete operator actions (comment, assignment, remediation note, ticket link) with immutable provenance.
+  - `remediation_plans` – structured remediation steps (affected assets, deadlines, recommended fixes, auto-generated from SRM/AI hints).
+  - `reports` – saved report definitions, export manifests, and signatures.
+
+- **Immutability & provenance** – All updates are append-only; previous state is recoverable. Records capture `tenant`, `artifactId`, `findingKey`, `policyVersion`, `sourceRunId`, `srmDigest`.
+
+## 2) Triage workflow
+
+1. **Ingest effective findings** from Policy Engine (stream `policy.finding.delta`). Each delta updates `finding_records`, generates history entries, and triggers notification rules.
+2. **Prioritisation** uses contextual heuristics: runtime exposure, VEX status, policy severity, AI hints. Stored as `priorityScore` with provenance from Zastava/AI modules.
+3. **Assignment & collaboration** – Operators claim findings, add comments, attach evidence, and link tickets. Assignment uses Authority identities and RBAC.
+4. **Remediation tracking** – Link remediation plans, record progress, and integrate with Scheduler for follow-up scans once fixes deploy.
+5. **Closure** – When Policy or a rescan marks a finding resolved, the system logs the closure with its explain trace and updates the audit ledger.
+
+State machine summary:
+
+```
+new -> (triage) triaged -> (remediate) in_progress -> (verify) awaiting_verification -> (scan) remediated
+new -> (false_positive) closed_false_positive
+new -> (risk_accept) accepted_risk
+```
+
+All transitions require a justification; some transitions (for example, to `accepted_risk`) also require a multi-approver workflow defined by Policy Studio.
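The state machine summary above can be sketched as a transition table. This is a minimal illustration, not the module's implementation: `TRANSITIONS`, `MIN_APPROVERS`, `HistoryEntry`, and `apply_transition` are hypothetical names. The states and actions are copied from the summary, and the two-approver threshold is an assumption based on the "dual approval" note in the compliance section.

```python
from dataclasses import dataclass

# (current state, action) -> next state, copied from the state machine summary.
TRANSITIONS = {
    ("new", "triage"): "triaged",
    ("triaged", "remediate"): "in_progress",
    ("in_progress", "verify"): "awaiting_verification",
    ("awaiting_verification", "scan"): "remediated",
    ("new", "false_positive"): "closed_false_positive",
    ("new", "risk_accept"): "accepted_risk",
}

# Assumption: accepting risk needs two distinct approvers ("dual approval").
MIN_APPROVERS = {"risk_accept": 2}


@dataclass(frozen=True)
class HistoryEntry:
    """Immutable record of one transition, in the spirit of `finding_history`."""
    from_state: str
    to_state: str
    action: str
    actor: str
    justification: str


def apply_transition(state, action, actor, justification, approvers=()):
    """Return a history entry for a valid transition, or raise ValueError."""
    if (state, action) not in TRANSITIONS:
        raise ValueError(f"action {action!r} is not allowed from state {state!r}")
    if not justification.strip():
        raise ValueError("every transition requires a justification")
    required = MIN_APPROVERS.get(action, 0)
    if len(set(approvers)) < required:
        raise ValueError(f"{action!r} requires {required} distinct approvers")
    return HistoryEntry(state, TRANSITIONS[(state, action)], action, actor, justification)
```

Returning a new frozen entry instead of mutating a finding keeps the sketch aligned with the append-only ledger: the caller appends the entry and derives the current state from the latest one.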
+
+## 3) APIs
+
+- `GET /v1/findings` — filtered listing with pagination, search (artifact, advisory, priority, status, assignee).
+- `GET /v1/findings/{id}` — detail view (policy context, explain trace, evidence timeline).
+- `POST /v1/findings/{id}/actions` — create triage action (assign, comment, status change, remediation, ticket link) with DSSE signature support.
+- `POST /v1/reports` — generate report artifact (JSON, CSV, PDF) defined by saved templates; records manifest + signature.
+- `GET /v1/exports/offline` — stream deterministic bundle for Offline Kit (findings JSON, history, attachments, manifest).
+
+CLI mirrors these endpoints (`stella findings list|view|update|export`). Console UI consumes the same APIs via typed clients.
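As a rough illustration of the filtered listing endpoint, the sketch below builds a `GET /v1/findings` URL. Only the path and the five search fields come from the list above; the query parameter names, the `pageSize` and `pageToken` pagination parameters, the `findings_url` helper, and the example host are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Search fields named in the API list; their query-parameter spelling is assumed.
SEARCH_FIELDS = ("artifact", "advisory", "priority", "status", "assignee")


def findings_url(base_url, page_size=50, page_token=None, **filters):
    """Build a listing URL with filters in sorted order so equal queries give equal URLs."""
    unknown = set(filters) - set(SEARCH_FIELDS)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    params = sorted((k, v) for k, v in filters.items() if v is not None)
    params.append(("pageSize", page_size))  # hypothetical pagination parameter
    if page_token:
        params.append(("pageToken", page_token))  # hypothetical pagination parameter
    return f"{base_url.rstrip('/')}/v1/findings?{urlencode(params)}"


url = findings_url("https://vuln.example.invalid/", status="triaged", assignee="alice")
```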
+
+## 4) AI/automation integration
+
+- Advisory AI contributes remediation hints and conflict explanations stored alongside findings (`aiInsights`).
+- Scheduler integration triggers follow-up scans or policy re-evaluation when remediation plan reaches checkpoint.
+- Zastava (Differential SBOM) feeds runtime exposure signals to reprioritise findings automatically.
+
+## 5) Observability & compliance
+
+- Metrics: `findings_open_total{severity,tenant}`, `findings_mttr_seconds`, `triage_actions_total{type}`, `report_generation_seconds`.
+- Logs: structured with `findingId`, `artifactId`, `advisory`, `policyVersion`, `actor`, `actionType`.
+- Audit exports: `audit_log.jsonl` appended whenever state changes; offline bundles include signed audit log and manifest.
+- Compliance: accepted risk requires dual approval and stores justification plus expiry reminders (raised through Notify).
+
+## 6) Offline bundle requirements
+
+- Bundle structure:
+  - `manifest.json` (hashes, counts, policy version, generation timestamp).
+  - `findings.jsonl` (current open findings).
+  - `history.jsonl` (state changes).
+  - `actions.jsonl` (comments, assignments, tickets).
+  - `reports/` (generated PDFs/CSVs).
+  - `signatures/` (DSSE envelopes).
+- Bundles are produced deterministically; Export Center consumes them for mirror profiles.
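A sketch of how a `manifest.json` with deterministic hashes might be assembled for the bundle described above. The source only says the manifest holds hashes, counts, policy version, and a generation timestamp; the field names (`files`, `sha256`, `records`, `policyVersion`, `generatedAt`) and the helpers are assumptions. Determinism here comes from sorted keys, fixed separators, a stable file order, and a timestamp supplied by the caller instead of read from the clock.

```python
import hashlib
import json


def canonical_json(obj):
    """Serialize with sorted keys and fixed separators so equal data gives equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def jsonl_bytes(records):
    """Encode records as JSON Lines, one canonical object per line."""
    return b"".join(canonical_json(r) + b"\n" for r in records)


def build_manifest(files, policy_version, generated_at):
    """Return manifest bytes; `files` maps a bundle path to its list of records."""
    entries = {}
    for name in sorted(files):
        data = jsonl_bytes(files[name])
        entries[name] = {
            "sha256": hashlib.sha256(data).hexdigest(),
            "records": len(files[name]),
        }
    manifest = {
        "files": entries,
        "policyVersion": policy_version,
        "generatedAt": generated_at,  # passed in, so reruns can reproduce the same bytes
    }
    return canonical_json(manifest)
```

Two runs over the same findings, policy version, and timestamp then yield byte-identical manifests, which is what allows a signature over the manifest to be re-verified offline.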
--- a/docs/modules/vuln-explorer/implementation_plan.md
+++ b/docs/modules/vuln-explorer/implementation_plan.md
@@ -0,0 +1,70 @@
+# Implementation plan — Vulnerability Explorer
+
+## Delivery phases
+- **Phase 1 – Findings Ledger & resolver**  
+  Create append-only ledger, projector, ecosystem resolvers (npm/Maven/PyPI/Go/RPM/DEB), canonical advisory keys, and provenance hashing.
+- **Phase 2 – API & simulation**  
+  Ship Vuln Explorer API (list/detail/grouping/simulation), batch evaluation with Policy Engine rationales, and export orchestrator.
+- **Phase 3 – Console & CLI workflows**  
+  Deliver triage UI (assignments, comments, remediation plans, simulation bar), keyboard accessibility, and CLI commands (`stella vuln ...`) with JSON/CSV output.
+- **Phase 4 – Automation & integrations**  
+  Integrate Advisory AI hints, Zastava runtime exposure, Notify rules, Scheduler follow-up scans, and Graph Explorer deep links.
+- **Phase 5 – Exports & offline parity**  
+  Generate deterministic bundles (JSON, CSV, PDF, Offline Kit manifests), audit logs, and signed reports.
+- **Phase 6 – Observability & hardening**  
+  Complete dashboards (projection lag, MTTR, accepted-risk cadence), alerts, runbooks, performance tuning (5M findings/tenant), and security/RBAC validation.
+
+## Work breakdown
+- **Findings Ledger**
+  - Define event schema, Merkle root anchoring, append-only storage, history tables.
+  - Projector to `finding_records` and `finding_history`, idempotent event processing, time travel snapshots.
+  - Resolver pipelines referencing SBOM inventory deltas, policy outputs, VEX consensus, runtime signals.
+- **API & exports**
+  - REST endpoints (`/v1/findings`, `/v1/findings/{id}`, `/actions`, `/reports`, `/exports`) with ABAC filters.
+  - Simulation endpoint returning diffs, integration with Policy Engine batch evaluation.
+  - Export jobs for JSON/CSV/PDF plus Offline Kit bundle assembly and signing.
+- **Console**
+  - Feature module `vuln-explorer` with grid, filters, saved views, deep links, detail tabs (policy, evidence, paths, remediation).
+  - Simulation drawer, delta chips, accepted-risk approvals, evidence bundle viewer.
+  - Accessibility (keyboard navigation, ARIA), virtualization for large result sets.
+- **CLI**
+  - Commands `stella vuln list|show|simulate|assign|accept-risk|verify-fix|export`.
+  - Stable schemas for automation; piping support; tests for exit codes.
+- **Integrations**
+  - Concelier/Excititor: normalized advisory keys, linksets, evidence retrieval.
+  - SBOM Service: inventory deltas with scope/runtime flags, safe version hints.
+  - Notify: events for SLA breaches, accepted-risk expiries, remediation deadlines.
+  - Scheduler: trigger rescans when remediation plan milestones complete.
+- **Observability & ops**
+  - Metrics (open findings, MTTR, projection lag, export duration, SLA burn), logs/traces with correlation IDs.
+  - Alerting on projector backlog, API 5xx spikes, export failures, accepted-risk nearing expiry.
+  - Runbooks covering recompute storms, mapping errors, report issues.
+
+## Acceptance criteria
+- Ledger/event sourcing reproduces historical states byte-for-byte; Merkle hashes verify integrity.
+- Resolver respects ecosystem semantics, scope, and runtime context; path evidence presented in UI/CLI.
+- Triage workflows (assignment, comments, accepted-risk) enforce justification and approval requirements with audit records.
+- Simulation returns policy diffs without mutating state; CLI/UI parity achieved for simulation and exports.
+- Exports and Offline Kit bundles reproducible with signed manifests and provenance; reports available in JSON/CSV/PDF.
+- Observability dashboards show green SLOs, alerts fire for projection lag or SLA burns, and runbooks are documented.
+- RBAC/ABAC validated; attachments encrypted; tenant isolation guaranteed.
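The first criterion, Merkle hashes verifying ledger integrity, can be illustrated with a small Merkle root computation over an ordered list of serialized ledger events. This is a generic sketch under stated assumptions: the plan does not specify the tree construction, so SHA-256, the leaf and node prefixes, and the rule of duplicating the last node on odd levels are all assumptions, and `merkle_root` is a hypothetical helper.

```python
import hashlib


def _h(prefix, data):
    """SHA-256 with a one-byte domain prefix to keep leaf and node hashes distinct."""
    return hashlib.sha256(prefix + data).digest()


def merkle_root(events):
    """Compute a hex Merkle root over an ordered list of serialized events (bytes)."""
    if not events:
        return hashlib.sha256(b"").hexdigest()
    level = [_h(b"\x00", e) for e in events]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [_h(b"\x01", level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()
```

Replaying the same events in the same order reproduces the root, while changing, dropping, or reordering any event changes it, which is the property an integrity check against an anchored root relies on.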
+
+## Risks & mitigations
+- **Advisory identity collisions:** strict canonicalization, linkset references, raw evidence access.
+- **Resolver inaccuracies:** property-based tests, path verification, manual override workflows.
+- **Projection lag/backlog:** autoscaling, queue backpressure, alerting, pause controls.
+- **Export size/performance:** streaming NDJSON, size estimators, chunked downloads.
+- **User confusion on suppression:** rationale tab, explicit badges, explain traces.
+
+## Test strategy
+- **Unit:** resolver algorithms, state machine transitions, policy mapping, export builders.
+- **Integration:** ingestion → ledger → projector → API flow, simulation, Notify notifications.
+- **E2E:** Console triage scenarios, CLI flows, accessibility tests.
+- **Performance:** 5M findings/tenant, projection rebuild, export generation.
+- **Security:** RBAC/ABAC matrix, CSRF, attachment encryption, signed URL expiry.
+- **Determinism:** time-travel snapshots, export manifest hashing, Offline Kit replay.
+
+## Definition of done
+- Services, UI/CLI, integrations, exports, and observability deployed with runbooks and Offline Kit parity.
+- Documentation suite (overview, using-console, API, CLI, findings ledger, policy mapping, VEX/SBOM integration, telemetry, security, runbooks, install) updated with imposed rule statement.
+- ./TASKS.md and ../../TASKS.md reflect active progress; compliance checklists appended where required.