# Export Center Architecture

The Export Center is the dedicated service layer that packages StellaOps evidence and policy overlays into reproducible bundles. It runs as a multi-surface API backed by asynchronous workers and format adapters, enforcing Aggregation-Only Contract (AOC) guardrails while providing deterministic manifests, signing, and distribution paths.

## Runtime topology

- **Export Center API (`StellaOps.ExportCenter.WebService`).** Receives profile CRUD, export run requests, status queries, and download streams through the unified Web API gateway. Enforces tenant scopes, RBAC, quotas, and concurrency guards.
- **Export Center Worker (`StellaOps.ExportCenter.Worker`).** Dequeues export jobs from the Orchestrator, resolves selectors, invokes adapters, and writes manifests and bundle artefacts. Stateless; scales horizontally.
- **Backing stores.**
  - MongoDB collections: `export_profiles`, `export_runs`, `export_inputs`, `export_distributions`, `export_events`.
  - Object storage bucket or filesystem for staging bundle payloads.
  - Optional registry/object storage credentials injected via Authority-scoped secrets.
- **Integration peers.**
  - **Findings Ledger** for advisory, VEX, SBOM payload streaming.
  - **Policy Engine** for deterministic policy snapshots and evaluated findings.
  - **Orchestrator** for job scheduling, quotas, and telemetry fan-out.
  - **Authority** for tenant-aware access tokens and KMS key references.
  - **Console & CLI** as presentation surfaces consuming the API.

## Job lifecycle

1. **Profile selection.** Operator or automation picks a profile (`json:raw`, `json:policy`, `trivy:db`, `trivy:java-db`, `mirror:full`, `mirror:delta`) and submits scope selectors (tenant, time window, products, SBOM subjects, ecosystems). See `docs/export-center/profiles.md` for profile definitions and configuration fields.
2. **Planner resolution.** API validates selectors, expands include/exclude lists, and writes a pending `export_run` with immutable parameters and deterministic ordering hints.
3. **Orchestrator dispatch.** `export_run` triggers a job lease via Orchestrator with quotas per tenant/profile and concurrency caps (default 4 active per tenant).
4. **Worker execution.** Worker streams data from Findings Ledger and Policy Engine using pagination cursors. Adapters write canonical payloads to staging storage, compute checksums, and emit streaming progress events (SSE).
5. **Manifest and provenance emission.** Worker writes `export.json` and `provenance.json`, signs them with configured KMS keys (cosign-compatible), and uploads signatures alongside content.
6. **Distribution registration.** Worker records available distribution methods (download URL, OCI reference, object storage path), raises completion/failure events, and exposes metrics/logs.
7. **Download & verification.** Clients download bundles or pull OCI artefacts, verify signatures, and consume provenance to trace source artefacts.

Cancellation requests mark runs as `aborted` and cause workers to stop iterating sources; partially written files are deleted and an audit entry is recorded against the run.

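
The lifecycle and cooperative cancellation described above can be sketched as a small state machine. This is an illustration only: apart from `pending` and `aborted`, the status names (`running`, `completed`, `failed`) and the helpers `transition` and `run_worker` are assumptions, not the service's actual identifiers.

```python
from enum import Enum


class RunStatus(str, Enum):
    PENDING = "pending"      # written by the planner
    RUNNING = "running"      # hypothetical name for a leased, executing run
    COMPLETED = "completed"  # hypothetical terminal state
    FAILED = "failed"        # hypothetical terminal state
    ABORTED = "aborted"      # set when a cancellation request is honoured


# Allowed transitions; terminal states have no outgoing edges.
TRANSITIONS = {
    RunStatus.PENDING: {RunStatus.RUNNING, RunStatus.ABORTED},
    RunStatus.RUNNING: {RunStatus.COMPLETED, RunStatus.FAILED, RunStatus.ABORTED},
    RunStatus.COMPLETED: set(),
    RunStatus.FAILED: set(),
    RunStatus.ABORTED: set(),
}


def transition(current, target):
    """Return target if the move is legal, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


def run_worker(batches, is_cancelled):
    """Iterate source batches, checking for cancellation before each batch."""
    status = transition(RunStatus.PENDING, RunStatus.RUNNING)
    staged = []
    for batch in batches:
        if is_cancelled():
            staged.clear()  # drop partially written output
            return transition(status, RunStatus.ABORTED), staged
        staged.extend(batch)
    return transition(status, RunStatus.COMPLETED), staged
```

Checking the cancellation flag between batches, rather than interrupting a write, is what makes the cancellation cooperative: the worker always stops at a point where cleanup is straightforward.
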
## Core components

### API surface

- Detailed request and response payloads are catalogued in `docs/export-center/api.md`.
- **Profiles API.**
  - `GET /api/export/profiles`: list tenant-scoped profiles.
  - `POST /api/export/profiles`: create custom profiles (variants of JSON, Trivy, mirror) with validated configuration schema.
  - `PATCH /api/export/profiles/{id}`: update metadata; config changes clone a new revision to preserve determinism.
- **Runs API.**
  - `POST /api/export/runs`: submit export run for a profile with selectors and options (policy snapshot id, mirror base manifest).
  - `GET /api/export/runs/{id}`: status, progress counters, provenance summary.
  - `GET /api/export/runs/{id}/events`: server-sent events with state transitions, adapter milestones, signing status.
  - `POST /api/export/runs/{id}/cancel`: cooperative cancellation with audit logging.
- **Downloads API.**
  - `GET /api/export/runs/{id}/download`: streaming download with range support and checksum trailers.
  - `GET /api/export/runs/{id}/manifest`: signed `export.json`.
  - `GET /api/export/runs/{id}/provenance`: signed `provenance.json`.

All endpoints require Authority-issued JWT + DPoP tokens carrying the relevant scope (for example `export:run` or `export:read`) and a tenant claim that matches the requested resources. Rate limits and quotas surface via `X-Stella-Quota-*` headers.

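
A client consuming the `/events` endpoint has to decode the server-sent events wire format. The sketch below parses the standard `event:` and `data:` fields from a sequence of text lines; the event names and JSON payloads in the sample are hypothetical, since the real payloads are defined in `docs/export-center/api.md`.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if any.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream contents, for illustration only.
sample = [
    "event: adapter.chunk",
    'data: {"records": 500}',
    "",
    ": keep-alive",
    "event: run.completed",
    'data: {"status": "completed"}',
    "",
]
events = [(name, json.loads(data)) for name, data in parse_sse(sample)]
```
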
### Worker pipeline

- **Input resolvers.** Query Findings Ledger and Policy Engine using stable pagination (Mongo `_id` ascending, or resume tokens for change streams). Selector expressions compile into Mongo filter fragments and/or API query parameters.
- **Adapter host.** Adapter plugin loader (restart-time only) resolves profile variant to adapter implementation. Adapters present a deterministic `RunAsync(context)` contract with streaming writers and telemetry instrumentation.
- **Content writers.**
  - JSON adapters emit `.jsonl.zst` files with canonical ordering (tenant, subject, document id).
  - Trivy adapters materialise SQLite databases or tar archives matching Trivy DB expectations; schema version gates prevent unsupported outputs.
  - Mirror adapters assemble deterministic filesystem trees (manifests, indexes, payload subtrees) and, when configured, OCI artefact layers.
- **Manifest generator.** Aggregates counts, bytes, hash digests (SHA-256), profile metadata, and input references. Writes `export.json` and `provenance.json` using canonical JSON (sorted keys, RFC3339 UTC timestamps).
- **Signing service.** Integrates with platform KMS via Authority (default cosign signer). Produces in-toto SLSA attestations when configured. Supports detached signatures and optional in-bundle signatures.
- **Distribution drivers.** `dist-http` exposes staged files via download endpoint; `dist-oci` pushes artefacts to registries using ORAS with digest pinning; `dist-objstore` uploads to tenant-specific prefixes with immutability flags.

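
To illustrate why canonical JSON matters for the manifest generator, here is a minimal sketch of sorted-key serialisation with RFC3339 UTC timestamps and a SHA-256 digest. The manifest fields shown are placeholders, and the exact canonicalisation rules used by the Export Center may differ (for example in whitespace or number handling).

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def rfc3339_utc(ts):
    """Format a timezone-aware datetime as an RFC3339 UTC timestamp."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def canonical_json(doc):
    """Serialise with sorted keys and no insignificant whitespace."""
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


created = rfc3339_utc(datetime(2025, 1, 1, 13, 0, tzinfo=timezone(timedelta(hours=1))))

# Same content, different insertion order (placeholder fields).
manifest_a = {"profile": "json:raw", "counts": {"advisories": 2}, "created_at": created}
manifest_b = {"created_at": created, "counts": {"advisories": 2}, "profile": "json:raw"}

digest_a = sha256_hex(canonical_json(manifest_a))
digest_b = sha256_hex(canonical_json(manifest_b))
```

Because both dictionaries serialise to the same bytes, `digest_a` and `digest_b` are equal, which is the property that lets two runs over identical inputs produce identical manifest hashes.
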
## Data model snapshots

| Collection | Purpose | Key fields | Notes |
|------------|---------|------------|-------|
| `export_profiles` | Profile definitions (kind, variant, config). | `_id`, `tenant`, `name`, `kind`, `variant`, `config_json`, `created_by`, `created_at`. | Config includes adapter parameters (included record types, compression, encryption). |
| `export_runs` | Run state machine and audit info. | `_id`, `profile_id`, `tenant`, `status`, `requested_by`, `selectors`, `policy_snapshot_id`, `started_at`, `completed_at`, `duration_ms`, `error_code`. | Immutable selectors; status transitions recorded in `export_events`. |
| `export_inputs` | Resolved input ranges. | `run_id`, `source`, `cursor`, `count`, `hash`. | Enables resumable retries and audit. |
| `export_distributions` | Distribution artefacts. | `run_id`, `type` (`http`, `oci`, `object`), `location`, `sha256`, `size_bytes`, `expires_at`. | `expires_at` used for retention policies and automatic pruning. |
| `export_events` | Timeline of state transitions and metrics. | `run_id`, `event_type`, `message`, `at`, `metrics`. | Feeds SSE stream and audit trails. |

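
The `cursor` and `count` fields of `export_inputs` are what make retries resumable. The following sketch shows the idea with an in-memory list standing in for a source collection paged by ascending `_id`; the `fetch_page` and `export_from` helpers and the checkpoint dictionary are hypothetical, not the worker's real interfaces.

```python
def fetch_page(documents, after_id, page_size):
    """Return the next page of documents with _id greater than after_id."""
    ordered = sorted(documents, key=lambda d: d["_id"])
    remaining = [d for d in ordered if after_id is None or d["_id"] > after_id]
    return remaining[:page_size]


def export_from(documents, checkpoint, page_size=2):
    """Resume from checkpoint['cursor'] and persist progress after every page."""
    exported = []
    while True:
        page = fetch_page(documents, checkpoint["cursor"], page_size)
        if not page:
            return exported
        exported.extend(d["_id"] for d in page)
        # Recording the last seen _id lets a retry skip completed pages.
        checkpoint["cursor"] = page[-1]["_id"]
        checkpoint["count"] += len(page)


docs = [{"_id": i} for i in (3, 1, 2, 5, 4)]

# A previous attempt already exported ids 1 and 2 before failing.
checkpoint = {"cursor": 2, "count": 2}
resumed = export_from(docs, checkpoint)
```
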
## Adapter responsibilities

- **JSON (`json:raw`, `json:policy`).**
  - Ensures canonical casing, timezone normalization, and linkset preservation.
  - Policy variant embeds policy snapshot metadata (`policy_version`, `inputs_hash`, `decision_trace` fingerprint) and emits evaluated findings as separate files.
  - Enforces AOC guardrails: no derived modifications to raw evidence fields.
- **Trivy (`trivy:db`, `trivy:java-db`).**
  - Maps StellaOps advisory schema to Trivy DB format, handling namespace collisions and ecosystem-specific ranges.
  - Validates compatibility against supported Trivy schema versions; run fails fast if mismatch.
  - Emits optional manifest summarising package counts and severity distribution.
- **Mirror (`mirror:full`, `mirror:delta`).**
  - Builds self-contained filesystem layout (`/manifests`, `/data/raw`, `/data/policy`, `/indexes`).
  - Delta variant compares against base manifest (`base_export_id`) to write only changed artefacts; records `removed` entries for cleanup.
  - Supports optional encryption of `/data` subtree (age/AES-GCM) with key wrapping stored in `provenance.json`.

Adapters expose structured telemetry events (`adapter.start`, `adapter.chunk`, `adapter.complete`) with record counts and byte totals per chunk. Failures emit `adapter.error` with reason codes.

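
The delta comparison performed by the `mirror:delta` variant can be sketched as a diff between two path-to-digest maps. This assumes, for illustration, that a manifest can be reduced to such a map; the `plan_delta` helper, the `changed` key, and the sample paths and digests are hypothetical.

```python
def plan_delta(base, current):
    """Compare {path: digest} maps and return what a delta bundle must carry."""
    # New or modified artefacts: absent from the base or with a different digest.
    changed = sorted(p for p, digest in current.items() if base.get(p) != digest)
    # Artefacts present in the base but gone now, recorded for cleanup.
    removed = sorted(p for p in base if p not in current)
    return {"changed": changed, "removed": removed}


base_manifest = {
    "/data/raw/a.jsonl.zst": "1111",
    "/data/raw/b.jsonl.zst": "2222",
    "/data/raw/c.jsonl.zst": "3333",
}
current_manifest = {
    "/data/raw/a.jsonl.zst": "1111",  # unchanged, omitted from the delta
    "/data/raw/b.jsonl.zst": "9999",  # modified
    "/data/raw/d.jsonl.zst": "4444",  # new
}
delta = plan_delta(base_manifest, current_manifest)
```

Sorting both lists keeps the delta plan itself deterministic, in line with the reproducibility goals of the other adapters.
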
## Signing and provenance

- **Manifest schema.** `export.json` contains run metadata, profile descriptor, selector summary, counts, SHA-256 digests, compression hints, and distribution list. Deterministic field ordering and normalized timestamps.
- **Provenance schema.** `provenance.json` captures in-toto subject listing (bundle digest, manifest digest), referenced inputs (findings ledger queries, policy snapshot ids, SBOM identifiers), tool version (`exporter_version`, adapter versions), and KMS key identifiers.
- **Attestation.** Cosign SLSA Level 2 template by default; optional SLSA Level 3 when supply chain attestations are enabled. Detached signatures stored alongside manifests; CLI/Console encourage `cosign verify --key <tenant-key>` workflow.
- **Audit trail.** Each run stores success/failure status, signature identifiers, and verification hints for downstream automation (CI pipelines, offline verification scripts).

## Distribution flows

- **HTTP download.** Console and CLI stream bundles via chunked transfer; supports range requests and resumable downloads. Response includes `X-Export-Digest`, `X-Export-Length`, and optional encryption metadata.
- **OCI push.** Worker uses ORAS to publish bundles as OCI artefacts with annotations describing profile, tenant, manifest digest, and provenance reference. Supports multi-tenant registries with `repository-per-tenant` naming.
- **Object storage.** Writes to tenant-prefixed paths (`s3://stella-exports/{tenant}/{run-id}/...`) with immutable retention policies. Retention scheduler purges expired runs based on profile configuration.
- **Offline Kit seeding.** Mirror bundles optionally staged into Offline Kit assembly pipelines, inheriting the same manifests and signatures.

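
On the client side, an HTTP download can be checked against the `X-Export-Digest` and `X-Export-Length` headers while it streams. The sketch below assumes the digest header carries a hex SHA-256 value, optionally prefixed with `sha256:`; the actual header format is not specified here, so treat that as an assumption.

```python
import hashlib


def verify_download(chunks, expected_digest, expected_length):
    """Hash a stream of byte chunks and compare against the advertised values."""
    hasher = hashlib.sha256()
    total = 0
    for chunk in chunks:
        hasher.update(chunk)  # incremental hashing avoids buffering the bundle
        total += len(chunk)
    # Accept both "sha256:<hex>" and bare "<hex>" (assumed formats).
    expected_hex = expected_digest.split(":", 1)[-1].lower()
    return total == int(expected_length) and hasher.hexdigest() == expected_hex


payload = b"example export bundle bytes"
headers = {
    "X-Export-Digest": "sha256:" + hashlib.sha256(payload).hexdigest(),
    "X-Export-Length": str(len(payload)),
}
ok = verify_download(
    [payload[:10], payload[10:]],
    headers["X-Export-Digest"],
    headers["X-Export-Length"],
)
```
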
## Observability

- **Metrics.** Emits `exporter_run_duration_seconds`, `exporter_run_bytes_total{profile}`, `exporter_run_failures_total{error_code}`, `exporter_active_runs{tenant}`, `exporter_distribution_push_seconds{type}`.
- **Logs.** Structured logs with fields `run_id`, `tenant`, `profile_kind`, `adapter`, `phase`, `correlation_id`, `error_code`. Phases include `plan`, `resolve`, `adapter`, `manifest`, `sign`, `distribute`.
- **Traces.** Optional OpenTelemetry spans (`export.plan`, `export.fetch`, `export.write`, `export.sign`, `export.distribute`) for cross-service correlation.
- **Dashboards & alerts.** DevOps pipeline seeds Grafana dashboards summarising throughput, size, failure ratios, and distribution latency. Alert thresholds: failure rate >5% per profile, median run duration >p95 baseline, signature verification failures >0.

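
The alert thresholds listed above can be expressed as a small evaluation function. This is a sketch over in-memory run records rather than real metric queries; the record shape, the alert labels, and the choice to compute the median across all runs are assumptions.

```python
from statistics import median


def evaluate_alerts(runs, p95_baseline_seconds, signature_failures):
    """Return the alert labels that fire for a window of run records."""
    alerts = []
    by_profile = {}
    for run in runs:
        by_profile.setdefault(run["profile"], []).append(run)
    for profile, items in sorted(by_profile.items()):
        failure_rate = sum(1 for r in items if r["failed"]) / len(items)
        if failure_rate > 0.05:  # failure rate >5% per profile
            alerts.append(f"failure_rate:{profile}")
    if runs and median(r["duration_seconds"] for r in runs) > p95_baseline_seconds:
        alerts.append("median_duration")  # median exceeds the p95 baseline
    if signature_failures > 0:
        alerts.append("signature_verification")
    return alerts


def make_run(profile, failed):
    return {"profile": profile, "failed": failed, "duration_seconds": 10}


# 1 failure in 20 runs is exactly 5% and does not fire; 1 in 10 does.
window = (
    [make_run("json:raw", False)] * 19
    + [make_run("json:raw", True)]
    + [make_run("trivy:db", False)] * 9
    + [make_run("trivy:db", True)]
)
fired = evaluate_alerts(window, p95_baseline_seconds=30, signature_failures=0)
```
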
## Security posture

- Tenant claim enforced at every query and distribution path; cross-tenant selectors rejected unless explicit cross-tenant mirror feature toggled with signed approval.
- RBAC scopes: `export:profile:manage`, `export:run`, `export:read`, `export:download`. Console hides actions without scope; CLI returns `401/403`.
- Encryption options configurable per profile; keys derived from Authority-managed KMS. Mirror encryption uses tenant-specific recipients; JSON/Trivy rely on transport security plus optional encryption at rest.
- Restart-only plugin loading ensures adapters and distribution drivers are vetted at deployment time, reducing runtime injection risks.
- Deterministic output ensures tamper detection via content hashes; provenance links to source runs and policy snapshots to maintain auditability.

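
The scope and tenant checks described in this section might look like the following guard. It is a simplified sketch: the token is modelled as a plain dictionary, and `check_access` and its parameters are hypothetical names rather than the service's authorization API.

```python
def check_access(token, required_scope, selector_tenants, cross_tenant_approved=False):
    """Raise PermissionError unless the scope and tenant checks both pass."""
    if required_scope not in token["scopes"]:
        raise PermissionError(f"missing scope {required_scope}")
    # Any tenant named in the selectors other than the caller's own is foreign.
    foreign = sorted(set(selector_tenants) - {token["tenant"]})
    if foreign and not cross_tenant_approved:
        raise PermissionError(f"cross-tenant selectors rejected: {foreign}")
    return True


token = {"tenant": "acme", "scopes": {"export:run", "export:read"}}
allowed = check_access(token, "export:run", ["acme"])
```

Here `cross_tenant_approved` stands in for the signed-approval toggle; a real implementation would verify that approval rather than trust a boolean.
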
## Deployment considerations

- Packaged as separate API and worker containers. Helm chart and compose overlays define horizontal scaling, worker concurrency, queue leases, and object storage credentials.
- Requires Authority client credentials for KMS and optional registry credentials stored via sealed secrets.
- Offline-first deployments disable OCI distribution by default and provide local object storage endpoints; HTTP downloads served via internal gateway.
- Health endpoints: `/health/ready` validates Mongo connectivity, object storage access, adapter registry integrity, and KMS signer readiness.

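
A readiness endpoint of this kind typically aggregates several independent probes. The sketch below shows one way to do that, treating a probe that raises as a failed check; the probe names and the response shape are assumptions, not the actual `/health/ready` payload.

```python
def readiness(checks):
    """Run each probe and report overall readiness plus per-check results."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False  # a probe that raises counts as not ready
    return {"ready": all(results.values()), "checks": results}


def failing_probe():
    raise ConnectionError("signer unreachable")


# Hypothetical probes standing in for real connectivity checks.
report = readiness(
    {
        "mongo": lambda: True,
        "object_storage": lambda: True,
        "adapter_registry": lambda: True,
        "kms_signer": failing_probe,
    }
)
```
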
## Compliance checklist

- [ ] Profiles and runs enforce tenant scoping; cross-tenant exports disabled unless approved.
- [ ] Manifests and provenance files are generated with deterministic hashes and signed via configured KMS.
- [ ] Adapters run with restart-time registration only; no runtime plugin loading.
- [ ] Distribution drivers respect allowlist; OCI push disabled when offline mode is active.
- [ ] Metrics, logs, and traces follow observability guidelines; dashboards and alerts configured.
- [ ] Retention policies and pruning jobs configured for staged bundles.

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.