feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
This commit is contained in:
2025-10-30 00:09:39 +02:00
parent 3154c67978
commit 7b5bdcf4d3
503 changed files with 16136 additions and 54638 deletions

View File

@@ -0,0 +1,22 @@
# Vulnerability Explorer agent guide
## Mission
Vulnerability Explorer delivers policy-aware triage, investigation, and reporting surfaces for effective findings.
## Key docs
- [Module README](./README.md)
- [Architecture](./architecture.md)
- [Implementation plan](./implementation_plan.md)
- [Task board](./TASKS.md)
## How to get started
1. Review ./architecture.md for ledger schema, workflow states, and export requirements.
2. Open ../../implplan/SPRINTS.md and locate stories for this component.
3. Check ./TASKS.md and update status before/after work.
4. Read README/architecture for design context and update as the implementation evolves.
## Guardrails
- Uphold Aggregation-Only Contract boundaries when consuming ingestion data.
- Preserve determinism and provenance in all derived outputs.
- Document offline/air-gap pathways for any new feature.
- Update telemetry/observability assets alongside feature work.

View File

@@ -0,0 +1,29 @@
# StellaOps Vulnerability Explorer
Vulnerability Explorer delivers policy-aware triage, investigation, and reporting surfaces for effective findings.
## Responsibilities
- Present policy-evaluated findings with advisory, VEX, SBOM, and runtime context.
- Capture triage workflow in an immutable findings ledger with role-based access.
- Provide pivots, exports, and reports for auditors and operations teams.
- Integrate explain traces, remediation notes, and offline bundles.
## Key components
- Findings Ledger service + API.
- Console module and CLI verbs for triage workflows.
- Export integrations for reports and evidence packages.
## Integrations & dependencies
- Policy Engine for effective findings streams.
- Concelier/Excititor for evidence provenance.
- Scheduler for remediation/verification jobs.
- Notify for triage notifications.
## Operational notes
- Audit logging per Epic 6 requirements.
- Offline-ready CSV/PDF exports with deterministic hashes.
- Dashboards for MTTR and triage throughput.
## Epic alignment
- Epic 6: Vulnerability Explorer.
- VULN stories tracked in ../../TASKS.md and src/VulnExplorer/**/TASKS.md.

View File

@@ -0,0 +1,9 @@
# Task board — Vulnerability Explorer
> Local tasks should link back to ./AGENTS.md and mirror status updates into ../../TASKS.md when applicable.
| ID | Status | Owner(s) | Description | Notes |
|----|--------|----------|-------------|-------|
| VULNERABILITY-EXPLORER-DOCS-0001 | DOING (2025-10-29) | Docs Guild | Ensure ./README.md reflects the latest epic deliverables. | Align with ./AGENTS.md |
| VULNERABILITY-EXPLORER-ENG-0001 | TODO | Module Team | Break down epic milestones into actionable stories. | Sync into ../../TASKS.md |
| VULNERABILITY-EXPLORER-OPS-0001 | TODO | Ops Guild | Prepare runbooks/observability assets once MVP lands. | Document outputs in ./README.md |

View File

@@ -0,0 +1,66 @@
# Vulnerability Explorer architecture
> Based on Epic6 Vulnerability Explorer; this specification summarises the ledger model, triage workflows, APIs, and export requirements.
## 1) Ledger data model
- **Collections / tables**
- `finding_records` canonical, policy-derived findings enriched with advisory, VEX, SBOM, runtime context. Includes `policyVersion`, `advisoryRawIds`, `vexRawIds`, `sbomComponentId`, and `explainBundleRef`.
- `finding_history` append-only state transitions (`new`, `triaged`, `accepted_risk`, `remediated`, `false_positive`, etc.) with timestamps, actor, and justification.
- `triage_actions` discrete operator actions (comment, assignment, remediation note, ticket link) with immutable provenance.
- `remediation_plans` structured remediation steps (affected assets, deadlines, recommended fixes, auto-generated from SRM/AI hints).
- `reports` saved report definitions, export manifests, and signatures.
- **Immutability & provenance** All updates are append-only; previous state is recoverable. Records capture `tenant`, `artifactId`, `findingKey`, `policyVersion`, `sourceRunId`, `sr mDigest`.
## 2) Triage workflow
1. **Ingest effective findings** from Policy Engine (stream `policy.finding.delta`). Each delta updates `finding_records`, generates history entries, and triggers notification rules.
2. **Prioritisation** uses contextual heuristics: runtime exposure, VEX status, policy severity, AI hints. Stored as `priorityScore` with provenance from Zastava/AI modules.
3. **Assignment & collaboration** Operators claim findings, add comments, attach evidence, and link tickets. Assignment uses Authority identities and RBAC.
4. **Remediation tracking** Link remediation plans, record progress, and integrate with Scheduler for follow-up scans once fixes deploy.
5. **Closure** When Policy or rescans mark finding resolved, system logs closure with explain trace and updates audit ledger.
State machine summary:
```
new -> (triage) triaged -> (remediate) in_progress -> (verify) awaiting_verification -> (scan) remediated
new -> (false_positive) closed_false_positive
new -> (risk_accept) accepted_risk
```
All transitions require justification; certain transitions (accepted risk) require multi-approver workflow defined by Policy Studio.
## 3) APIs
- `GET /v1/findings` — filtered listing with pagination, search (artifact, advisory, priority, status, assignee).
- `GET /v1/findings/{id}` — detail view (policy context, explain trace, evidence timeline).
- `POST /v1/findings/{id}/actions` — create triage action (assign, comment, status change, remediation, ticket link) with DSSE signature support.
- `POST /v1/reports` — generate report artifact (JSON, CSV, PDF) defined by saved templates; records manifest + signature.
- `GET /v1/exports/offline` — stream deterministic bundle for Offline Kit (findings JSON, history, attachments, manifest).
CLI mirrors these endpoints (`stella findings list|view|update|export`). Console UI consumes the same APIs via typed clients.
## 4) AI/automation integration
- Advisory AI contributes remediation hints and conflict explanations stored alongside findings (`aiInsights`).
- Scheduler integration triggers follow-up scans or policy re-evaluation when remediation plan reaches checkpoint.
- Zastava (Differential SBOM) feeds runtime exposure signals to reprioritise findings automatically.
## 5) Observability & compliance
- Metrics: `findings_open_total{severity,tenant}`, `findings_mttr_seconds`, `triage_actions_total{type}`, `report_generation_seconds`.
- Logs: structured with `findingId`, `artifactId`, `advisory`, `policyVersion`, `actor`, `actionType`.
- Audit exports: `audit_log.jsonl` appended whenever state changes; offline bundles include signed audit log and manifest.
- Compliance: accepted risk requires dual approval and stores justification plus expiry reminders (raised through Notify).
## 6) Offline bundle requirements
- Bundle structure:
- `manifest.json` (hashes, counts, policy version, generation timestamp).
- `findings.jsonl` (current open findings).
- `history.jsonl` (state changes).
- `actions.jsonl` (comments, assignments, tickets).
- `reports/` (generated PDFs/CSVs).
- `signatures/` (DSSE envelopes).
- Bundles produced deterministically; Export Center consumes them for mirror profiles.

View File

@@ -0,0 +1,70 @@
# Implementation plan — Vulnerability Explorer
## Delivery phases
- **Phase 1 Findings Ledger & resolver**
Create append-only ledger, projector, ecosystem resolvers (npm/Maven/PyPI/Go/RPM/DEB), canonical advisory keys, and provenance hashing.
- **Phase 2 API & simulation**
Ship Vuln Explorer API (list/detail/grouping/simulation), batch evaluation with Policy Engine rationales, and export orchestrator.
- **Phase 3 Console & CLI workflows**
Deliver triage UI (assignments, comments, remediation plans, simulation bar), keyboard accessibility, and CLI commands (`stella vuln ...`) with JSON/CSV output.
- **Phase 4 Automation & integrations**
Integrate Advisory AI hints, Zastava runtime exposure, Notify rules, Scheduler follow-up scans, and Graph Explorer deep links.
- **Phase 5 Exports & offline parity**
Generate deterministic bundles (JSON, CSV, PDF, Offline Kit manifests), audit logs, and signed reports.
- **Phase 6 Observability & hardening**
Complete dashboards (projection lag, MTTR, accepted-risk cadence), alerts, runbooks, performance tuning (5M findings/tenant), and security/RBAC validation.
## Work breakdown
- **Findings Ledger**
- Define event schema, Merkle root anchoring, append-only storage, history tables.
- Projector to `finding_records` and `finding_history`, idempotent event processing, time travel snapshots.
- Resolver pipelines referencing SBOM inventory deltas, policy outputs, VEX consensus, runtime signals.
- **API & exports**
- REST endpoints (`/v1/findings`, `/v1/findings/{id}`, `/actions`, `/reports`, `/exports`) with ABAC filters.
- Simulation endpoint returning diffs, integration with Policy Engine batch evaluation.
- Export jobs for JSON/CSV/PDF plus Offline Kit bundle assembly and signing.
- **Console**
- Feature module `vuln-explorer` with grid, filters, saved views, deep links, detail tabs (policy, evidence, paths, remediation).
- Simulation drawer, delta chips, accepted-risk approvals, evidence bundle viewer.
- Accessibility (keyboard navigation, ARIA), virtualization for large result sets.
- **CLI**
- Commands `stella vuln list|show|simulate|assign|accept-risk|verify-fix|export`.
- Stable schemas for automation; piping support; tests for exit codes.
- **Integrations**
- Conseiller/Excitator: normalized advisory keys, linksets, evidence retrieval.
- SBOM Service: inventory deltas with scope/runtime flags, safe version hints.
- Notify: events for SLA breaches, accepted-risk expiries, remediation deadlines.
- Scheduler: trigger rescans when remediation plan milestones complete.
- **Observability & ops**
- Metrics (open findings, MTTR, projection lag, export duration, SLA burn), logs/traces with correlation IDs.
- Alerting on projector backlog, API 5xx spikes, export failures, accepted-risk nearing expiry.
- Runbooks covering recompute storms, mapping errors, report issues.
## Acceptance criteria
- Ledger/event sourcing reproduces historical states byte-for-byte; Merkle hashes verify integrity.
- Resolver respects ecosystem semantics, scope, and runtime context; path evidence presented in UI/CLI.
- Triage workflows (assignment, comments, accepted-risk) enforce justification and approval requirements with audit records.
- Simulation returns policy diffs without mutating state; CLI/UI parity achieved for simulation and exports.
- Exports and Offline Kit bundles reproducible with signed manifests and provenance; reports available in JSON/CSV/PDF.
- Observability dashboards show green SLOs, alerts fire for projection lag or SLA burns, and runbooks documented.
- RBAC/ABAC validated; attachments encrypted; tenant isolation guaranteed.
## Risks & mitigations
- **Advisory identity collisions:** strict canonicalization, linkset references, raw evidence access.
- **Resolver inaccuracies:** property-based tests, path verification, manual override workflows.
- **Projection lag/backlog:** autoscaling, queue backpressure, alerting, pause controls.
- **Export size/performance:** streaming NDJSON, size estimators, chunked downloads.
- **User confusion on suppression:** rationale tab, explicit badges, explain traces.
## Test strategy
- **Unit:** resolver algorithms, state machine transitions, policy mapping, export builders.
- **Integration:** ingestion → ledger → projector → API flow, simulation, Notify notifications.
- **E2E:** Console triage scenarios, CLI flows, accessibility tests.
- **Performance:** 5M findings/tenant, projection rebuild, export generation.
- **Security:** RBAC/ABAC matrix, CSRF, attachment encryption, signed URL expiry.
- **Determinism:** time-travel snapshots, export manifest hashing, Offline Kit replay.
## Definition of done
- Services, UI/CLI, integrations, exports, and observability deployed with runbooks and Offline Kit parity.
- Documentation suite (overview, using-console, API, CLI, findings ledger, policy mapping, VEX/SBOM integration, telemetry, security, runbooks, install) updated with imposed rule statement.
- ./TASKS.md and ../../TASKS.md reflect active progress; compliance checklists appended where required.