Files
git.stella-ops.org/docs/export-center/architecture.md
root 68da90a11a
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
Restructure solution layout by module
2025-10-28 15:10:40 +02:00

126 lines
13 KiB
Markdown

# Export Center Architecture
The Export Center is the dedicated service layer that packages StellaOps evidence and policy overlays into reproducible bundles. It runs as a multi-surface API backed by asynchronous workers and format adapters, enforcing Aggregation-Only Contract (AOC) guardrails while providing deterministic manifests, signing, and distribution paths.
## Runtime topology
- **Export Center API (`StellaOps.ExportCenter.WebService`).** Receives profile CRUD, export run requests, status queries, and download streams through the unified Web API gateway. Enforces tenant scopes, RBAC, quotas, and concurrency guards.
- **Export Center Worker (`StellaOps.ExportCenter.Worker`).** Dequeues export jobs from the Orchestrator, resolves selectors, invokes adapters, and writes manifests and bundle artefacts. Stateless; scales horizontally.
- **Backing stores.**
- MongoDB collections: `export_profiles`, `export_runs`, `export_inputs`, `export_distributions`, `export_events`.
- Object storage bucket or filesystem for staging bundle payloads.
- Optional registry/object storage credentials injected via Authority-scoped secrets.
- **Integration peers.**
- **Findings Ledger** for advisory, VEX, SBOM payload streaming.
- **Policy Engine** for deterministic policy snapshots and evaluated findings.
- **Orchestrator** for job scheduling, quotas, and telemetry fan-out.
- **Authority** for tenant-aware access tokens and KMS key references.
- **Console & CLI** as presentation surfaces consuming the API.
## Job lifecycle
1. **Profile selection.** Operator or automation picks a profile (`json:raw`, `json:policy`, `trivy:db`, `trivy:java-db`, `mirror:full`, `mirror:delta`) and submits scope selectors (tenant, time window, products, SBOM subjects, ecosystems). See `docs/export-center/profiles.md` for profile definitions and configuration fields.
2. **Planner resolution.** API validates selectors, expands include/exclude lists, and writes a pending `export_run` with immutable parameters and deterministic ordering hints.
3. **Orchestrator dispatch.** `export_run` triggers a job lease via Orchestrator with quotas per tenant/profile and concurrency caps (default 4 active per tenant).
4. **Worker execution.** Worker streams data from Findings Ledger and Policy Engine using pagination cursors. Adapters write canonical payloads to staging storage, compute checksums, and emit streaming progress events (SSE).
5. **Manifest and provenance emission.** Worker writes `export.json` and `provenance.json`, signs them with configured KMS keys (cosign-compatible), and uploads signatures alongside content.
6. **Distribution registration.** Worker records available distribution methods (download URL, OCI reference, object storage path), raises completion/failure events, and exposes metrics/logs.
7. **Download & verification.** Clients download bundles or pull OCI artefacts, verify signatures, and consume provenance to trace source artefacts.
Cancellation requests mark runs as `aborted` and cause workers to stop iterating sources; partially written files are destroyed and the run is marked with an audit entry.
## Core components
### API surface
- Detailed request and response payloads are catalogued in `docs/export-center/api.md`.
- **Profiles API.**
- `GET /api/export/profiles`: list tenant-scoped profiles.
- `POST /api/export/profiles`: create custom profiles (variants of JSON, Trivy, mirror) with validated configuration schema.
- `PATCH /api/export/profiles/{id}`: update metadata; config changes clone new revision to preserve determinism.
- **Runs API.**
- `POST /api/export/runs`: submit export run for a profile with selectors and options (policy snapshot id, mirror base manifest).
- `GET /api/export/runs/{id}`: status, progress counters, provenance summary.
- `GET /api/export/runs/{id}/events`: server-sent events with state transitions, adapter milestones, signing status.
- `POST /api/export/runs/{id}/cancel`: cooperative cancellation with audit logging.
- **Downloads API.**
- `GET /api/export/runs/{id}/download`: streaming download with range support and checksum trailers.
- `GET /api/export/runs/{id}/manifest`: signed `export.json`.
- `GET /api/export/runs/{id}/provenance`: signed `provenance.json`.
All endpoints require Authority-issued JWT + DPoP tokens with scopes `export:run`, `export:read`, and tenant claim alignment. Rate-limiting and quotas surface via `X-Stella-Quota-*` headers.
### Worker pipeline
- **Input resolvers.** Query Findings Ledger and Policy Engine using stable pagination (Mongo `_id` ascending, or resume tokens for change streams). Selector expressions compile into Mongo filter fragments and/or API query parameters.
- **Adapter host.** Adapter plugin loader (restart-time only) resolves profile variant to adapter implementation. Adapters present a deterministic `RunAsync(context)` contract with streaming writers and telemetry instrumentation.
- **Content writers.**
- JSON adapters emit `.jsonl.zst` files with canonical ordering (tenant, subject, document id).
- Trivy adapters materialise SQLite databases or tar archives matching Trivy DB expectations; schema version gates prevent unsupported outputs.
- Mirror adapters assemble deterministic filesystem trees (manifests, indexes, payload subtrees) and, when configured, OCI artefact layers.
- **Manifest generator.** Aggregates counts, bytes, hash digests (SHA-256), profile metadata, and input references. Writes `export.json` and `provenance.json` using canonical JSON (sorted keys, RFC3339 UTC timestamps).
- **Signing service.** Integrates with platform KMS via Authority (default cosign signer). Produces in-toto SLSA attestations when configured. Supports detached signatures and optional in-bundle signatures.
- **Distribution drivers.** `dist-http` exposes staged files via download endpoint; `dist-oci` pushes artefacts to registries using ORAS with digest pinning; `dist-objstore` uploads to tenant-specific prefixes with immutability flags.
## Data model snapshots
| Collection | Purpose | Key fields | Notes |
|------------|---------|------------|-------|
| `export_profiles` | Profile definitions (kind, variant, config). | `_id`, `tenant`, `name`, `kind`, `variant`, `config_json`, `created_by`, `created_at`. | Config includes adapter parameters (included record types, compression, encryption). |
| `export_runs` | Run state machine and audit info. | `_id`, `profile_id`, `tenant`, `status`, `requested_by`, `selectors`, `policy_snapshot_id`, `started_at`, `completed_at`, `duration_ms`, `error_code`. | Immutable selectors; status transitions recorded in `export_events`. |
| `export_inputs` | Resolved input ranges. | `run_id`, `source`, `cursor`, `count`, `hash`. | Enables resumable retries and audit. |
| `export_distributions` | Distribution artefacts. | `run_id`, `type` (`http`, `oci`, `object`), `location`, `sha256`, `size_bytes`, `expires_at`. | `expires_at` used for retention policies and automatic pruning. |
| `export_events` | Timeline of state transitions and metrics. | `run_id`, `event_type`, `message`, `at`, `metrics`. | Feeds SSE stream and audit trails. |
## Adapter responsibilities
- **JSON (`json:raw`, `json:policy`).**
- Ensures canonical casing, timezone normalization, and linkset preservation.
- Policy variant embeds policy snapshot metadata (`policy_version`, `inputs_hash`, `decision_trace` fingerprint) and emits evaluated findings as separate files.
- Enforces AOC guardrails: no derived modifications to raw evidence fields.
- **Trivy (`trivy:db`, `trivy:java-db`).**
- Maps StellaOps advisory schema to Trivy DB format, handling namespace collisions and ecosystem-specific ranges.
- Validates compatibility against supported Trivy schema versions; run fails fast if mismatch.
- Emits optional manifest summarising package counts and severity distribution.
- **Mirror (`mirror:full`, `mirror:delta`).**
- Builds self-contained filesystem layout (`/manifests`, `/data/raw`, `/data/policy`, `/indexes`).
- Delta variant compares against base manifest (`base_export_id`) to write only changed artefacts; records `removed` entries for cleanup.
- Supports optional encryption of `/data` subtree (age/AES-GCM) with key wrapping stored in `provenance.json`.
Adapters expose structured telemetry events (`adapter.start`, `adapter.chunk`, `adapter.complete`) with record counts and byte totals per chunk. Failures emit `adapter.error` with reason codes.
## Signing and provenance
- **Manifest schema.** `export.json` contains run metadata, profile descriptor, selector summary, counts, SHA-256 digests, compression hints, and distribution list. Deterministic field ordering and normalized timestamps.
- **Provenance schema.** `provenance.json` captures in-toto subject listing (bundle digest, manifest digest), referenced inputs (findings ledger queries, policy snapshot ids, SBOM identifiers), tool version (`exporter_version`, adapter versions), and KMS key identifiers.
- **Attestation.** Cosign SLSA Level 2 template by default; optional SLSA Level 3 when supply chain attestations are enabled. Detached signatures stored alongside manifests; CLI/Console encourage `cosign verify --key <tenant-key>` workflow.
- **Audit trail.** Each run stores success/failure status, signature identifiers, and verification hints for downstream automation (CI pipelines, offline verification scripts).
## Distribution flows
- **HTTP download.** Console and CLI stream bundles via chunked transfer; supports range requests and resumable downloads. Response includes `X-Export-Digest`, `X-Export-Length`, and optional encryption metadata.
- **OCI push.** Worker uses ORAS to publish bundles as OCI artefacts with annotations describing profile, tenant, manifest digest, and provenance reference. Supports multi-tenant registries with `repository-per-tenant` naming.
- **Object storage.** Writes to tenant-prefixed paths (`s3://stella-exports/{tenant}/{run-id}/...`) with immutable retention policies. Retention scheduler purges expired runs based on profile configuration.
- **Offline Kit seeding.** Mirror bundles optionally staged into Offline Kit assembly pipelines, inheriting the same manifests and signatures.
## Observability
- **Metrics.** Emits `exporter_run_duration_seconds`, `exporter_run_bytes_total{profile}`, `exporter_run_failures_total{error_code}`, `exporter_active_runs{tenant}`, `exporter_distribution_push_seconds{type}`.
- **Logs.** Structured logs with fields `run_id`, `tenant`, `profile_kind`, `adapter`, `phase`, `correlation_id`, `error_code`. Phases include `plan`, `resolve`, `adapter`, `manifest`, `sign`, `distribute`.
- **Traces.** Optional OpenTelemetry spans (`export.plan`, `export.fetch`, `export.write`, `export.sign`, `export.distribute`) for cross-service correlation.
- **Dashboards & alerts.** DevOps pipeline seeds Grafana dashboards summarising throughput, size, failure ratios, and distribution latency. Alert thresholds: failure rate >5% per profile, median run duration >p95 baseline, signature verification failures >0.
## Security posture
- Tenant claim enforced at every query and distribution path; cross-tenant selectors rejected unless explicit cross-tenant mirror feature toggled with signed approval.
- RBAC scopes: `export:profile:manage`, `export:run`, `export:read`, `export:download`. Console hides actions without scope; CLI returns `401/403`.
- Encryption options configurable per profile; keys derived from Authority-managed KMS. Mirror encryption uses tenant-specific recipients; JSON/Trivy rely on transport security plus optional encryption at rest.
- Restart-only plugin loading ensures adapters and distribution drivers are vetted at deployment time, reducing runtime injection risks.
- Deterministic output ensures tamper detection via content hashes; provenance links to source runs and policy snapshots to maintain auditability.
## Deployment considerations
- Packaged as separate API and worker containers. Helm chart and compose overlays define horizontal scaling, worker concurrency, queue leases, and object storage credentials.
- Requires Authority client credentials for KMS and optional registry credentials stored via sealed secrets.
- Offline-first deployments disable OCI distribution by default and provide local object storage endpoints; HTTP downloads served via internal gateway.
- Health endpoints: `/health/ready` validates Mongo connectivity, object storage access, adapter registry integrity, and KMS signer readiness.
## Compliance checklist
- [ ] Profiles and runs enforce tenant scoping; cross-tenant exports disabled unless approved.
- [ ] Manifests and provenance files are generated with deterministic hashes and signed via configured KMS.
- [ ] Adapters run with restart-time registration only; no runtime plugin loading.
- [ ] Distribution drivers respect allowlist; OCI push disabled when offline mode is active.
- [ ] Metrics, logs, and traces follow observability guidelines; dashboards and alerts configured.
- [ ] Retention policies and pruning jobs configured for staged bundles.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.