feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
This commit is contained in:
2025-10-30 00:09:39 +02:00
parent 3154c67978
commit 7b5bdcf4d3
503 changed files with 16136 additions and 54638 deletions

View File

@@ -0,0 +1,127 @@
# Export Center Architecture
> Derived from Epic10 Export Center and the subsequent export adapter deep dives.
The Export Center is the dedicated service layer that packages StellaOps evidence and policy overlays into reproducible bundles. It runs as a multi-surface API backed by asynchronous workers and format adapters, enforcing Aggregation-Only Contract (AOC) guardrails while providing deterministic manifests, signing, and distribution paths.
## Runtime topology
- **Export Center API (`StellaOps.ExportCenter.WebService`).** Receives profile CRUD, export run requests, status queries, and download streams through the unified Web API gateway. Enforces tenant scopes, RBAC, quotas, and concurrency guards.
- **Export Center Worker (`StellaOps.ExportCenter.Worker`).** Dequeues export jobs from the Orchestrator, resolves selectors, invokes adapters, and writes manifests and bundle artefacts. Stateless; scales horizontally.
- **Backing stores.**
- MongoDB collections: `export_profiles`, `export_runs`, `export_inputs`, `export_distributions`, `export_events`.
- Object storage bucket or filesystem for staging bundle payloads.
- Optional registry/object storage credentials injected via Authority-scoped secrets.
- **Integration peers.**
- **Findings Ledger** for advisory, VEX, SBOM payload streaming.
- **Policy Engine** for deterministic policy snapshots and evaluated findings.
- **Orchestrator** for job scheduling, quotas, and telemetry fan-out.
- **Authority** for tenant-aware access tokens and KMS key references.
- **Console & CLI** as presentation surfaces consuming the API.
## Job lifecycle
1. **Profile selection.** Operator or automation picks a profile (`json:raw`, `json:policy`, `trivy:db`, `trivy:java-db`, `mirror:full`, `mirror:delta`) and submits scope selectors (tenant, time window, products, SBOM subjects, ecosystems). See `docs/modules/export-center/profiles.md` for profile definitions and configuration fields.
2. **Planner resolution.** API validates selectors, expands include/exclude lists, and writes a pending `export_run` with immutable parameters and deterministic ordering hints.
3. **Orchestrator dispatch.** `export_run` triggers a job lease via Orchestrator with quotas per tenant/profile and concurrency caps (default 4 active per tenant).
4. **Worker execution.** Worker streams data from Findings Ledger and Policy Engine using pagination cursors. Adapters write canonical payloads to staging storage, compute checksums, and emit streaming progress events (SSE).
5. **Manifest and provenance emission.** Worker writes `export.json` and `provenance.json`, signs them with configured KMS keys (cosign-compatible), and uploads signatures alongside content.
6. **Distribution registration.** Worker records available distribution methods (download URL, OCI reference, object storage path), raises completion/failure events, and exposes metrics/logs.
7. **Download & verification.** Clients download bundles or pull OCI artefacts, verify signatures, and consume provenance to trace source artefacts.
Cancellation requests mark runs as `aborted` and cause workers to stop iterating sources; partially written files are destroyed and the run is marked with an audit entry.
## Core components
### API surface
- Detailed request and response payloads are catalogued in `docs/modules/export-center/api.md`.
- **Profiles API.**
- `GET /api/export/profiles`: list tenant-scoped profiles.
- `POST /api/export/profiles`: create custom profiles (variants of JSON, Trivy, mirror) with validated configuration schema.
- `PATCH /api/export/profiles/{id}`: update metadata; config changes clone new revision to preserve determinism.
- **Runs API.**
- `POST /api/export/runs`: submit export run for a profile with selectors and options (policy snapshot id, mirror base manifest).
- `GET /api/export/runs/{id}`: status, progress counters, provenance summary.
- `GET /api/export/runs/{id}/events`: server-sent events with state transitions, adapter milestones, signing status.
- `POST /api/export/runs/{id}/cancel`: cooperative cancellation with audit logging.
- **Downloads API.**
- `GET /api/export/runs/{id}/download`: streaming download with range support and checksum trailers.
- `GET /api/export/runs/{id}/manifest`: signed `export.json`.
- `GET /api/export/runs/{id}/provenance`: signed `provenance.json`.
All endpoints require Authority-issued JWT + DPoP tokens with scopes `export:run`, `export:read`, and tenant claim alignment. Rate-limiting and quotas surface via `X-Stella-Quota-*` headers.
### Worker pipeline
- **Input resolvers.** Query Findings Ledger and Policy Engine using stable pagination (Mongo `_id` ascending, or resume tokens for change streams). Selector expressions compile into Mongo filter fragments and/or API query parameters.
- **Adapter host.** Adapter plugin loader (restart-time only) resolves profile variant to adapter implementation. Adapters present a deterministic `RunAsync(context)` contract with streaming writers and telemetry instrumentation.
- **Content writers.**
- JSON adapters emit `.jsonl.zst` files with canonical ordering (tenant, subject, document id).
- Trivy adapters materialise SQLite databases or tar archives matching Trivy DB expectations; schema version gates prevent unsupported outputs.
- Mirror adapters assemble deterministic filesystem trees (manifests, indexes, payload subtrees) and, when configured, OCI artefact layers.
- **Manifest generator.** Aggregates counts, bytes, hash digests (SHA-256), profile metadata, and input references. Writes `export.json` and `provenance.json` using canonical JSON (sorted keys, RFC3339 UTC timestamps).
- **Signing service.** Integrates with platform KMS via Authority (default cosign signer). Produces in-toto SLSA attestations when configured. Supports detached signatures and optional in-bundle signatures.
- **Distribution drivers.** `dist-http` exposes staged files via download endpoint; `dist-oci` pushes artefacts to registries using ORAS with digest pinning; `dist-objstore` uploads to tenant-specific prefixes with immutability flags.
## Data model snapshots
| Collection | Purpose | Key fields | Notes |
|------------|---------|------------|-------|
| `export_profiles` | Profile definitions (kind, variant, config). | `_id`, `tenant`, `name`, `kind`, `variant`, `config_json`, `created_by`, `created_at`. | Config includes adapter parameters (included record types, compression, encryption). |
| `export_runs` | Run state machine and audit info. | `_id`, `profile_id`, `tenant`, `status`, `requested_by`, `selectors`, `policy_snapshot_id`, `started_at`, `completed_at`, `duration_ms`, `error_code`. | Immutable selectors; status transitions recorded in `export_events`. |
| `export_inputs` | Resolved input ranges. | `run_id`, `source`, `cursor`, `count`, `hash`. | Enables resumable retries and audit. |
| `export_distributions` | Distribution artefacts. | `run_id`, `type` (`http`, `oci`, `object`), `location`, `sha256`, `size_bytes`, `expires_at`. | `expires_at` used for retention policies and automatic pruning. |
| `export_events` | Timeline of state transitions and metrics. | `run_id`, `event_type`, `message`, `at`, `metrics`. | Feeds SSE stream and audit trails. |
## Adapter responsibilities
- **JSON (`json:raw`, `json:policy`).**
- Ensures canonical casing, timezone normalization, and linkset preservation.
- Policy variant embeds policy snapshot metadata (`policy_version`, `inputs_hash`, `decision_trace` fingerprint) and emits evaluated findings as separate files.
- Enforces AOC guardrails: no derived modifications to raw evidence fields.
- **Trivy (`trivy:db`, `trivy:java-db`).**
- Maps StellaOps advisory schema to Trivy DB format, handling namespace collisions and ecosystem-specific ranges.
- Validates compatibility against supported Trivy schema versions; run fails fast if mismatch.
- Emits optional manifest summarising package counts and severity distribution.
- **Mirror (`mirror:full`, `mirror:delta`).**
- Builds self-contained filesystem layout (`/manifests`, `/data/raw`, `/data/policy`, `/indexes`).
- Delta variant compares against base manifest (`base_export_id`) to write only changed artefacts; records `removed` entries for cleanup.
- Supports optional encryption of `/data` subtree (age/AES-GCM) with key wrapping stored in `provenance.json`.
Adapters expose structured telemetry events (`adapter.start`, `adapter.chunk`, `adapter.complete`) with record counts and byte totals per chunk. Failures emit `adapter.error` with reason codes.
## Signing and provenance
- **Manifest schema.** `export.json` contains run metadata, profile descriptor, selector summary, counts, SHA-256 digests, compression hints, and distribution list. Deterministic field ordering and normalized timestamps.
- **Provenance schema.** `provenance.json` captures in-toto subject listing (bundle digest, manifest digest), referenced inputs (findings ledger queries, policy snapshot ids, SBOM identifiers), tool version (`exporter_version`, adapter versions), and KMS key identifiers.
- **Attestation.** Cosign SLSA Level 2 template by default; optional SLSA Level 3 when supply chain attestations are enabled. Detached signatures stored alongside manifests; CLI/Console encourage `cosign verify --key <tenant-key>` workflow.
- **Audit trail.** Each run stores success/failure status, signature identifiers, and verification hints for downstream automation (CI pipelines, offline verification scripts).
## Distribution flows
- **HTTP download.** Console and CLI stream bundles via chunked transfer; supports range requests and resumable downloads. Response includes `X-Export-Digest`, `X-Export-Length`, and optional encryption metadata.
- **OCI push.** Worker uses ORAS to publish bundles as OCI artefacts with annotations describing profile, tenant, manifest digest, and provenance reference. Supports multi-tenant registries with `repository-per-tenant` naming.
- **Object storage.** Writes to tenant-prefixed paths (`s3://stella-exports/{tenant}/{run-id}/...`) with immutable retention policies. Retention scheduler purges expired runs based on profile configuration.
- **Offline Kit seeding.** Mirror bundles optionally staged into Offline Kit assembly pipelines, inheriting the same manifests and signatures.
## Observability
- **Metrics.** Emits `exporter_run_duration_seconds`, `exporter_run_bytes_total{profile}`, `exporter_run_failures_total{error_code}`, `exporter_active_runs{tenant}`, `exporter_distribution_push_seconds{type}`.
- **Logs.** Structured logs with fields `run_id`, `tenant`, `profile_kind`, `adapter`, `phase`, `correlation_id`, `error_code`. Phases include `plan`, `resolve`, `adapter`, `manifest`, `sign`, `distribute`.
- **Traces.** Optional OpenTelemetry spans (`export.plan`, `export.fetch`, `export.write`, `export.sign`, `export.distribute`) for cross-service correlation.
- **Dashboards & alerts.** DevOps pipeline seeds Grafana dashboards summarising throughput, size, failure ratios, and distribution latency. Alert thresholds: failure rate >5% per profile, median run duration >p95 baseline, signature verification failures >0.
## Security posture
- Tenant claim enforced at every query and distribution path; cross-tenant selectors rejected unless explicit cross-tenant mirror feature toggled with signed approval.
- RBAC scopes: `export:profile:manage`, `export:run`, `export:read`, `export:download`. Console hides actions without scope; CLI returns `401/403`.
- Encryption options configurable per profile; keys derived from Authority-managed KMS. Mirror encryption uses tenant-specific recipients; JSON/Trivy rely on transport security plus optional encryption at rest.
- Restart-only plugin loading ensures adapters and distribution drivers are vetted at deployment time, reducing runtime injection risks.
- Deterministic output ensures tamper detection via content hashes; provenance links to source runs and policy snapshots to maintain auditability.
## Deployment considerations
- Packaged as separate API and worker containers. Helm chart and compose overlays define horizontal scaling, worker concurrency, queue leases, and object storage credentials.
- Requires Authority client credentials for KMS and optional registry credentials stored via sealed secrets.
- Offline-first deployments disable OCI distribution by default and provide local object storage endpoints; HTTP downloads served via internal gateway.
- Health endpoints: `/health/ready` validates Mongo connectivity, object storage access, adapter registry integrity, and KMS signer readiness.
## Compliance checklist
- [ ] Profiles and runs enforce tenant scoping; cross-tenant exports disabled unless approved.
- [ ] Manifests and provenance files are generated with deterministic hashes and signed via configured KMS.
- [ ] Adapters run with restart-time registration only; no runtime plugin loading.
- [ ] Distribution drivers respect allowlist; OCI push disabled when offline mode is active.
- [ ] Metrics, logs, and traces follow observability guidelines; dashboards and alerts configured.
- [ ] Retention policies and pruning jobs configured for staged bundles.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.