git.stella-ops.org/EPIC_10.md
master 651b8e0fa3 feat: Add new projects to solution and implement contract testing documentation
- Added "StellaOps.Policy.Engine", "StellaOps.Cartographer", and "StellaOps.SbomService" projects to the StellaOps solution.
- Created AGENTS.md to outline the Contract Testing Guild Charter, detailing mission, scope, and definition of done.
- Established TASKS.md for the Contract Testing Task Board, outlining tasks for Sprint 62 and Sprint 63 related to mock servers and replay testing.
2025-10-27 07:57:55 +02:00


Fine. Here's your next brick of “maximum documentation.” Try not to drop it on your foot.

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.


Epic 10: Export Center (JSON, Trivy DB, Mirror bundles)

Short name: Export Center
Primary service: exporter
Surfaces: Console (Web UI), CLI, Web API
Touches: Conseiller (Feedser), Excitator (Vexer), SBOM Service, Policy Engine, VEX Consensus Lens, Findings Ledger, Authority (authN/Z), Object Storage, Orchestrator, Signing/Attestation, Telemetry

AOC ground rule: Conseiller and Excitator aggregate but never merge. The Export Center serializes evidence and policy results; it does not rewrite or “improve” your data in-flight.


1) What it is

The Export Center is the unified system for packaging StellaOps data into portable, verifiable bundles:

  • Canonical JSON exports for advisories, VEX, SBOMs, findings, and policy-evaluation snapshots.
  • Trivy DB-compatible bundles so downstream scanners can use Stella's curated vulnerability knowledge without custom integrations.
  • Mirror bundles for air-gapped or disconnected environments, containing raw evidence, normalized records, indexes, and policy snapshots, with provenance, signatures, and optional encryption.

It centralizes format adapters, compliance and provenance, scheduling, versioning, and distribution (download, OCI push, file share). Every export is reproducible by run_id, cryptographically signed, and audit-traceable back to source artifacts.


2) Why (brief)

Teams need to move results into other scanners, CI systems, and isolated networks without babysitting ten different scripts. The Export Center gives a single, policy-aware, verifiable exit point that doesn't surprise compliance or set your ops team on fire.


3) How it should work (maximum detail)

3.1 Capabilities

  • Profiles

    • json:raw - canonical JSON lines for advisories, VEX, SBOMs.
    • json:policy - adds policy evaluation results (allow/deny/risk, rationales).
    • trivy:db - Trivy DB-compatible export (core vulnerability DB).
    • trivy:java-db - optional Java ecosystem DB export, if enabled.
    • mirror:full - air-gap bundle with raw + normalized + indexes + policy + VEX consensus + SBOMs.
    • mirror:delta - incremental bundle based on a previous export manifest.
  • Scope selectors

    • By time window, product, ecosystem, package, image digest, repository, tenant, tag.
    • Include/exclude SBOMs, advisories, VEX, findings, policy snapshots.
  • Policy awareness

    • Optionally bake a policy snapshot into the bundle, including the policy version, inputs and decision traces.
    • Can export raw evidence only (AOC) or raw + evaluated.
  • Provenance & signing

    • Generate attestation metadata with source URIs, artifact hashes, schema versions, export profile and filters.
    • Sign manifests and bundles using the configured KMS; support detached and in-bundle signatures.
  • Distribution

    • Download via Console or API (streaming).
    • Push to OCI registry as an artifact/image with annotations.
    • Write to object storage prefix for batch pickup.
  • Scheduling & automation

    • One-off, cron, and event-triggered exports (e.g., after a “VEX consensus recompute” run).
    • Retention policies and automatic expiry for old bundles.
  • Observability

    • Export metrics, throughput, size, and downstream pull counts (available when bundles are pushed to a registry that reports pulls back).

3.2 Architecture

  • exporter (service)

    • Orchestrates export runs, gathers records from the ledger and indexes, calls format adapters, writes bundles, signs, and publishes distribution tasks.
    • Stateless workers pull “export.jobs” from the orchestrator, stream data, and write manifests into object storage.
  • format adapters

    • Pluggable adapters:

      • adapter-json: canonicalized JSONL writers per record type.
      • adapter-trivy: translates Stella's normalized advisory model into Trivy DB format (and Java DB if enabled).
      • adapter-mirror: constructs a portable filesystem/OCI layout with manifests, indexes, and data subtrees.
  • manifesting & provenance

    • export.json (export manifest): profile, filters, counts, schema versions, content checksums, start/finish times, inputs list (artifact ids + hashes).
    • provenance.json: full chain back to source runs and artifacts; linked signatures.
  • distribution engines

    • dist-http: streaming for downloads.
    • dist-oci: layer writer with descriptors and annotations.
    • dist-objstore: staging to buckets.
  • security

    • Tenant scoping, RBAC on export creation and retrieval.
    • Optional in-bundle encryption (age/AES-GCM) with key wrapping using KMS.
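
The pluggable-adapter idea above can be sketched as a small registry keyed by profile kind. This is a minimal illustration, not the service's actual interface: `FormatAdapter`, `JsonLinesAdapter`, `ADAPTERS`, and `adapter_for` are hypothetical names, and the real adapters would stream to object storage rather than yield bytes in memory.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Protocol


class FormatAdapter(Protocol):
    """Hypothetical adapter contract: turn records into output chunks."""

    def write(self, records: Iterable[dict]) -> Iterator[bytes]: ...


class JsonLinesAdapter:
    """Stand-in for adapter-json: emits one JSON document per line."""

    def write(self, records: Iterable[dict]) -> Iterator[bytes]:
        # Forward-only: each record is serialized and yielded immediately.
        for record in records:
            yield (json.dumps(record, sort_keys=True) + "\n").encode("utf-8")


# Registry keyed by profile kind; trivy and mirror adapters would be added here.
ADAPTERS: Dict[str, Callable[[], FormatAdapter]] = {"json": JsonLinesAdapter}


def adapter_for(kind: str) -> FormatAdapter:
    """Look up the adapter for a profile kind, failing loudly if none exists."""
    try:
        return ADAPTERS[kind]()
    except KeyError:
        raise ValueError(f"no adapter registered for kind {kind!r}") from None
```

Keeping the lookup in one place means a new profile kind only needs a new entry in the registry; the exporter's orchestration code stays unchanged.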

3.3 Data model (selected tables)

  • export_profiles

    • id, name, kind (json|trivy|mirror), variant (raw|policy|db|java-db|full|delta), config_json, created_by, created_at.
  • export_runs

    • id, profile_id, trigger (manual|schedule|event|api), state, filters_json, policy_version, started_at, finished_at, artifact_uri, size_bytes, sig_uri, provenance_uri, tenant_id, error_class, error_message.
  • export_inputs

    • Link table between export_runs and source artifacts. export_run_id, artifact_id, hash.
  • export_distributions

    • export_run_id, type (download|oci|objstore), target, state, meta_json, created_at, updated_at.
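
As a rough illustration of the tables above, the profile and run rows could be modeled as follows. This is a sketch under assumptions: the kind/variant pairing is taken from the profile list in §3.1, the initial state name `queued` is invented because the spec does not enumerate run states, and only a subset of columns is shown.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional


class Trigger(str, Enum):
    MANUAL = "manual"
    SCHEDULE = "schedule"
    EVENT = "event"
    API = "api"


# Variants allowed per profile kind, following the profiles listed in 3.1.
VARIANTS = {
    "json": {"raw", "policy"},
    "trivy": {"db", "java-db"},
    "mirror": {"full", "delta"},
}


@dataclass
class ExportProfile:
    id: str
    name: str
    kind: str
    variant: str
    config: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject combinations such as json:delta that no profile defines.
        if self.variant not in VARIANTS.get(self.kind, set()):
            raise ValueError(
                f"variant {self.variant!r} is not valid for kind {self.kind!r}"
            )


@dataclass
class ExportRun:
    id: str
    profile_id: str
    trigger: Trigger
    tenant_id: str
    state: str = "queued"  # assumed initial state; not named in the spec
    filters: dict = field(default_factory=dict)
    policy_version: Optional[str] = None
    started_at: Optional[datetime] = None
    finished_at: Optional[datetime] = None
```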

3.4 Canonical file layouts

JSON profile output. Directory layout under the export root:

/export/
  export.json                 # export manifest
  provenance.json             # provenance and source artifact chain
  signatures/
    export.sig                # detached signature for export.json
  advisories/
    normalized.jsonl          # normalized advisory records
  vex/
    normalized.jsonl          # normalized VEX records
  sboms/
    <subject>/sbom.spdx.json  # one per subject; SPDX JSON or CycloneDX JSON
  findings/
    policy_evaluated.jsonl    # if profile=json:policy

Trivy DB profile output. Produced as a compressed artifact:

/export/
  export.json
  provenance.json
  trivy/
    db.bundle                 # Trivy DB compatible archive
    java-db.bundle            # optional Java DB bundle (if enabled)
  signatures/
    trivy-db.sig

Notes:

  • The adapter keeps an internal mapping of Stella's normalized fields to Trivy's expected fields and namespaces. The mapping is versioned to track upstream schema evolution.

Mirror bundle (filesystem layout)

/mirror/
  manifest.yaml               # high-level bundle manifest (profile, filters, counts)
  export.json                 # same as JSON profile
  provenance.json
  indexes/
    advisories.index.json     # quick lookups (pkg -> advisory ids)
    vex.index.json
    sbom.index.json
  advisories/raw/...
  advisories/normalized/...
  vex/raw/...
  vex/normalized/...
  sboms/raw/...
  sboms/graph/...
  policy/
    snapshot.yaml             # full policy set used for evaluation
    evaluations.jsonl         # decision outputs if requested
  consensus/
    vex_consensus.jsonl
  signatures/
    manifest.sig
    export.sig
  README.md

Mirror bundle (OCI layout). Follows the standard OCI image artifact layout, with annotations (org.opencontainers.artifact.description, com.stella.export.profile, com.stella.export.filters) and manifest lists for large bundles.

3.5 Export workflow

  1. Plan: The exporter computes candidates based on filters. For mirror:delta, it compares with the previous manifest to compute changes.

  2. Stream & write: Records are streamed from the Findings Ledger and stores. Writers are forward-only, emitting JSONL or adapter-specific structures, chunked for memory safety.

  3. Sign & attest: Once all content hashes are stable, the Export Center writes export.json and provenance.json, and signs them using KMS. Optional encryption wraps data layers.

  4. Distribute: Depending on profile settings, it exposes a download URL, pushes an OCI artifact, or writes to object storage. Distribution metadata is recorded.

  5. Audit & retention: Run, manifest, and signatures are immutable. Retention policy prunes large data folders after N days, with manifests retained longer.
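
The "sign & attest" step depends on content hashes being final before the manifest is written. A minimal sketch of that ordering, assuming SHA-256 checksums and an in-memory file map, is shown below; `build_manifest` and `verify_manifest` are hypothetical helpers, the manifest keys are illustrative rather than the actual export.json schema, and the KMS signing call is deliberately left out.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_manifest(profile: str, filters: dict, files: Dict[str, bytes]) -> dict:
    """Assemble an export.json-like manifest once all file contents are final."""
    paths = sorted(files)  # stable ordering keeps reruns byte-identical
    return {
        "profile": profile,
        "filters": filters,
        # For JSONL files, the record count equals the number of newlines.
        "counts": {p: files[p].count(b"\n") for p in paths},
        "checksums": {p: sha256_hex(files[p]) for p in paths},
    }


def verify_manifest(manifest: dict, files: Dict[str, bytes]) -> List[str]:
    """Return the paths that are missing or no longer match the manifest."""
    return [
        path
        for path, digest in manifest["checksums"].items()
        if path not in files or sha256_hex(files[path]) != digest
    ]
```

An empty list from `verify_manifest` means every listed file is present and unmodified; a real verifier would also check the detached signature over the manifest itself.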

3.6 APIs

POST   /export/profiles
GET    /export/profiles?kind=&variant=
GET    /export/profiles/{id}
PATCH  /export/profiles/{id}
DELETE /export/profiles/{id}

POST   /export/runs
GET    /export/runs?state=&profile_id=&from=&to=&tenant_id=
GET    /export/runs/{run_id}
POST   /export/runs/{run_id}/cancel

GET    /export/runs/{run_id}/download                # presigned URL or streaming
POST   /export/runs/{run_id}/distribute              # { "type":"oci|objstore", "target":"..." }
GET    /export/runs/{run_id}/manifest                # export.json
GET    /export/runs/{run_id}/provenance
GET    /export/runs/{run_id}/signatures

GET    /export/metrics/overview
WS     /export/streams/updates

Request example (create run):

{
  "profile_id": "prof_json_policy",
  "filters": {
    "time_from": "2025-01-01T00:00:00Z",
    "time_to": "2025-01-31T23:59:59Z",
    "ecosystems": ["pypi", "npm"],
    "include": ["advisories", "vex", "sboms", "findings"]
  },
  "distribution": { "type": "download" },
  "policy_version": "pol-v1.8.2"
}
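
For illustration, the request above could be assembled from a script as follows. This is a sketch only: `BASE_URL` and `create_run_request` are hypothetical, authentication headers are omitted because the spec does not describe them here, and the function builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://stella.example.internal"  # hypothetical host


def create_run_request(profile_id, time_from, time_to, include,
                       policy_version=None, ecosystems=None):
    """Build (but do not send) a POST /export/runs request."""
    filters = {
        "time_from": time_from,
        "time_to": time_to,
        "include": list(include),
    }
    if ecosystems:
        filters["ecosystems"] = list(ecosystems)
    body = {
        "profile_id": profile_id,
        "filters": filters,
        "distribution": {"type": "download"},
    }
    if policy_version:
        # Pinning the policy version keeps policy exports reproducible.
        body["policy_version"] = policy_version
    return urllib.request.Request(
        f"{BASE_URL}/export/runs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it; the response shape is not specified in this section.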

3.7 CLI

stella export profiles list --kind mirror
stella export profiles create --file profile.yaml
stella export run create --profile prof_json_policy --from 2025-01-01 --to 2025-01-31 --include advisories,vex,sboms --download
stella export run status <run-id>
stella export run cancel <run-id>
stella export run get <run-id> --manifest
stella export run download <run-id> --out export-jan.tar.zst
stella export distribute <run-id> --oci ghcr.io/org/stella-export:jan2025
stella export verify <bundle> --manifest export.json --sig signatures/export.sig

Exit codes: 0 ok, 2 bad args, 4 not found, 5 denied, 6 integrity failed, 8 export error.
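
A CI wrapper might map these exit codes to names before deciding what to do next. The sketch below assumes only the codes listed above; the enum, the helper names, and the retry rule (retry only on a generic export error) are illustrative choices, not behavior defined by the CLI.

```python
from enum import IntEnum


class ExportExit(IntEnum):
    OK = 0
    BAD_ARGS = 2
    NOT_FOUND = 4
    DENIED = 5
    INTEGRITY_FAILED = 6
    EXPORT_ERROR = 8


def describe(code: int) -> str:
    """Translate a process exit code into a readable label."""
    try:
        return ExportExit(code).name.lower()
    except ValueError:
        return f"unknown({code})"


def should_retry(code: int) -> bool:
    # Assumption: bad arguments, denials, and integrity failures will not
    # fix themselves, so only a generic export error is retried.
    return code == ExportExit.EXPORT_ERROR
```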

3.8 RBAC & security

  • Roles:

    • Export.Viewer: list runs, download completed bundles.
    • Export.Operator: create runs, cancel, schedule, distribute.
    • Export.Admin: manage profiles, set retention, manage signing keys.
  • Tenancy:

    • Every run and artifact is scoped by tenant; cross-tenant export is disallowed.
  • Secrets:

    • KMS references for signing and encryption; never store private keys.
  • PII & redaction:

    • Exporters must not include secrets or credentials. Redaction rules are enforced at writer level with schema-based allowlists.
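
The role and tenancy rules above can be expressed as a small permission check. This is a sketch under assumptions: the action strings are invented labels for the capabilities listed per role, and roles are treated as independent grants because the spec does not say whether Operator or Admin inherit Viewer rights.

```python
# Hypothetical action names derived from the role descriptions above.
ROLE_ACTIONS = {
    "Export.Viewer": {"runs:list", "runs:download"},
    "Export.Operator": {
        "runs:create", "runs:cancel", "runs:schedule", "runs:distribute",
    },
    "Export.Admin": {"profiles:manage", "retention:set", "keys:manage"},
}


def is_allowed(roles, action, caller_tenant, resource_tenant) -> bool:
    """Tenant check first, then the role matrix."""
    if caller_tenant != resource_tenant:
        return False  # cross-tenant access is never permitted
    return any(action in ROLE_ACTIONS.get(role, set()) for role in roles)
```

Checking the tenant before the role matrix means no combination of roles can reach another tenant's runs.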

3.9 Observability

  • Metrics:

    • export_bytes_total{profile,tenant}
    • export_records_total{type}
    • export_duration_ms{profile}
    • export_failures_total{error_class}
    • export_distributions_total{type}
    • export_verify_fail_total
  • Traces:

    • Spans per export phase: plan, stream, write, sign, distribute; baggage includes export_run_id.
  • Logs:

    • Structured JSON with counts, sizes, hashes, and redaction hints.

3.10 Performance targets

  • Stream throughput ≥ 25k records/sec per worker for JSONL writing with compression.
  • Trivy bundle generation for 1M advisories ≤ 8 minutes on a standard worker.
  • Mirror delta export for 5% change set ≤ 2 minutes.

3.11 Edge cases & behavior

  • Schema drift: adapter refuses to emit unknown fields without explicit mapping; run fails with error_class=schema_mismatch.
  • Oversized bundles: automatic sharding by time or content type; mirror OCI uses multi-manifest indices.
  • Missing policy snapshot: profile json:policy will auto-pull the latest version unless pinned; pinning is recommended for reproducibility.
  • Duplicate evidence: writers dedupe by artifact hash and advisory id; AOC forbids merging.
  • Air-gap encryption: if encrypt=true, mirror bundles require recipient public key material; decryption tooling documented.
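
The schema-drift rule above (refuse unknown fields rather than pass them through) might look like the following. The field names in the usage are placeholders and do not represent Trivy's actual schema; `SchemaMismatch` and `map_record` are hypothetical names.

```python
class SchemaMismatch(Exception):
    """Raised when a record carries a field with no explicit mapping."""

    error_class = "schema_mismatch"


def map_record(record: dict, field_map: dict) -> dict:
    """Rename fields via an explicit mapping; unknown fields abort the run."""
    unknown = sorted(set(record) - set(field_map))
    if unknown:
        raise SchemaMismatch(f"unmapped fields: {', '.join(unknown)}")
    return {field_map[key]: value for key, value in record.items()}
```

For example, with `field_map = {"id": "vuln_id", "severity": "severity"}` a record containing an extra `cvss` key raises `SchemaMismatch` instead of silently emitting the field.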

4) Implementation plan

4.1 Modules

  • New service: src/StellaOps.ExportCenter

    • api/: REST + WS

    • planner/: scope planning, delta computation, sampling

    • adapters/

      • json/: canonical writers
      • trivy/: db builders and schema mapping
      • mirror/: fs/OCI builders, sharding, delta logic
    • signing/: KMS clients, attestations

    • dist/: download streaming, OCI push, object storage writer

    • state/: repositories, migrations

    • metrics/, audit/, security/

  • SDK/CLI

    • src/StellaOps.Cli subcommands with streaming download and verification.
  • Console

    • console/apps/export-center/ pages:

      • Overview, Profiles, Runs, Run Detail, Distributions, Settings.
    • Components: ExportPlanPreview, ProfileEditor, RunDiff, VerifyPanel.

  • Existing services updates

    • Findings Ledger: new paginated streaming endpoints for advisories/VEX/SBOM/findings by filters and snapshots.
    • Policy Engine: “policy snapshot” exportable endpoint.
    • VEX Lens: consensus snapshot endpoint.

4.2 Packaging & deployment

  • Containers:

    • stella/exporter:<ver>
    • stella/exporter-worker:<ver> (optional separated worker pool)
  • Helm:

    • WS replicas, concurrent export limits, default compression (zstd), default retention, KMS settings, OCI creds secrets.
  • DB migrations:

    • Create export_* tables with proper indices (tenant, time, state).

4.3 Rollout

  • Phase 1: JSON (raw, policy) and Mirror (full) as download only.
  • Phase 2: Trivy DB adapters, OCI distribution.
  • Phase 3: Mirror deltas, encryption, verification tooling, scheduling.

5) Documentation changes

Create/update the following docs; each page must end with the imposed rule statement.

  1. /docs/export-center/overview.md: Purpose, profiles, supported targets, AOC alignment, security model.

  2. /docs/export-center/architecture.md: Service components, adapters, manifests, signing, distribution flows.

  3. /docs/export-center/profiles.md: Profile schemas, examples, versioning, compatibility notes.

  4. /docs/export-center/api.md: All endpoints with request/response examples and error codes.

  5. /docs/export-center/cli.md: Commands with examples, scripts for CI/CD, verification.

  6. /docs/export-center/mirror-bundles.md: Filesystem and OCI layouts, delta exports, encryption, air-gap import guide.

  7. /docs/export-center/trivy-adapter.md: Field mapping, supported ecosystems, compatibility and test matrix.

  8. /docs/export-center/provenance-and-signing.md: Manifest format, attestation details, verification process.

  9. /docs/operations/export-runbook.md: Common failures, recovery, tuning, capacity planning.

  10. /docs/security/export-hardening.md: RBAC, tenant isolation, secret redaction, encryption keys.

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.


6) Engineering tasks

Backend: exporter

  • Migrations for export_profiles, export_runs, export_inputs, export_distributions.
  • Planner to resolve filters to iterators over advisory/VEX/SBOM/findings datasets with pagination.
  • JSON adapter: canonical JSONL writers with schema normalization and redaction enforcement.
  • Policy snapshot embedder: pull policy version and evaluation outputs when requested.
  • Trivy adapter: implement schema mapping, writer, integrity validation, and compatibility version flag.
  • Mirror adapter: filesystem and OCI writer, sharding, manifest creation, delta computation.
  • Signing/attestation using KMS; detached and embedded options.
  • Distribution engines: download streaming, OCI push, object storage staging.
  • API layer with async export run handling and WebSocket updates.
  • Rate limit and concurrency controls per tenant/profile.
  • Audit logging for all create/cancel/distribute/verify actions.

Integrations

  • Findings Ledger streaming APIs for each content type.
  • Policy Engine endpoint to return deterministic policy snapshot and decision set by run.
  • VEX Lens endpoint to expose consensus snapshot.

Console

  • Profiles CRUD with validation and test preview.
  • Create Run wizard with live count estimates and storage footprint prediction.
  • Runs list + detail page with manifest, provenance, and quick verify.
  • Download and distribution actions with progress and logs.
  • Verification panel to check signatures and hashes client-side.

CLI

  • stella export commands as defined; include verify that checks signatures and hashes.
  • Auto-resume of interrupted downloads with range requests.
  • Friendly error messages for schema mismatch and verification failure.

Observability

  • Metrics and traces per §3.9; dashboards for throughput, durations, failures, sizes.
  • Alerts for export failure rate and verify failures.

Security & RBAC

  • Enforce tenant scoping at query level; fuzz tests for leakage.
  • Role matrix checks on each API; Console hides forbidden actions.
  • Encryption test vectors and key rotation procedure.

Docs

  • Author all files in §5 with concrete examples and diagrams.
  • Cross-link from Orchestrator, Policy Studio, VEX Lens, and SBOM docs to the Export Center pages.
  • Append imposed rule line at the end of each page.

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.


7) Implementation notes per profile

7.1 JSON: raw

  • Content: advisories.normalized, vex.normalized, sboms (SPDX/CycloneDX), optional findings.raw.
  • Normalization: enforce Stella field casing, timestamps in RFC3339, unicode NFC.
  • Compression: .jsonl.zst per file to allow split/merge.
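
The normalization rules above could be applied by a writer roughly as follows. This sketch makes two assumptions the spec does not state: "canonical" is taken to mean sorted keys with compact separators, and timestamps are rendered in UTC with a `Z` suffix. Zstandard compression is omitted because it is not in the Python standard library.

```python
import json
import unicodedata
from datetime import datetime, timezone


def nfc(value):
    """Recursively apply Unicode NFC normalization to strings."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, list):
        return [nfc(item) for item in value]
    if isinstance(value, dict):
        return {nfc(key): nfc(item) for key, item in value.items()}
    return value


def rfc3339(ts: datetime) -> str:
    """Render an aware datetime as an RFC 3339 UTC timestamp."""
    if ts.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def canonical_line(record: dict) -> str:
    """Serialize one record as a single deterministic JSON line."""
    return json.dumps(
        nfc(record), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
```

With these rules, two records that differ only in key order or in composed versus decomposed accents serialize to the same bytes, which is what makes per-file hashes stable across reruns.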

7.2 JSON: policy

  • Adds: policy_snapshot and findings.policy_evaluated.jsonl with decision, rule_id, rationale, inputs fingerprint.
  • Determinism: include policy_version and inputs_hash; replays should match exactly.
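
One way to derive an `inputs_hash` that supports exact replay is to hash the sorted set of input artifacts. The construction below is an assumption, not the spec's definition: it takes (artifact_id, content_hash) pairs as in the export_inputs table and uses SHA-256 with a NUL separator.

```python
import hashlib
from typing import Iterable, Tuple


def inputs_hash(inputs: Iterable[Tuple[str, str]]) -> str:
    """Fingerprint a set of (artifact_id, content_hash) pairs.

    Sorting and de-duplicating first makes the result independent of the
    order in which inputs were streamed.
    """
    digest = hashlib.sha256()
    for artifact_id, content_hash in sorted(set(inputs)):
        # NUL separator avoids ambiguity between adjacent fields.
        digest.update(f"{artifact_id}\x00{content_hash}\n".encode("utf-8"))
    return digest.hexdigest()
```

A replay can then be compared against the original run by checking that both `policy_version` and `inputs_hash` match before comparing decisions.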

7.3 Trivy DB

  • Mapping:

    • Package name, ecosystem, version ranges, CVE/CWE/aliases, severity mapping, vendor statements, fixed versions.
    • Ensure namespace mapping avoids collisions (e.g., distro vs ecosystem).
  • Compatibility: version flag in manifest; adapter throws if upstream schema version not supported.

  • Validation: run post-build sanity checks (counts, required indexes).

7.4 Mirror: full/delta

  • Full: everything needed to spin up an isolated read-only Stella mirror with local search.
  • Delta: compute changed/added/removed advisory ids, VEX statements, SBOM subjects; update indexes and manifest with base_export_id.
  • Encryption: if enabled, encrypt data subtrees; leave manifest.yaml unencrypted for discoverability unless strict=true.
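
The delta computation described above reduces to set comparisons between the base export and the current state. The sketch assumes each side is available as a mapping from record id to content hash; `compute_delta` is a hypothetical helper and the result keys are illustrative.

```python
from typing import Dict, List


def compute_delta(base: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify record ids as added, changed, or removed relative to base.

    Both arguments map a record id (advisory id, VEX statement id, or SBOM
    subject) to its content hash.
    """
    shared = base.keys() & current.keys()
    return {
        "added": sorted(current.keys() - base.keys()),
        "changed": sorted(k for k in shared if base[k] != current[k]),
        "removed": sorted(base.keys() - current.keys()),
    }
```

A delta bundle would then carry data only for the `added` and `changed` ids, list the `removed` ids for the importer to drop, and record `base_export_id` in the manifest.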

8) Acceptance criteria

  • Operators can create an export with filters, download it, verify signatures, and trace back to source artifacts via provenance.
  • Trivy adapter produces a bundle consumable by Trivy without custom flags (basic validation in CI).
  • Mirror bundle imports successfully in an air-gapped “mirror-reader” sample app and serves queries from indexes.
  • Policy-aware exports include deterministic decisions matching the specified policy_version.
  • RBAC prevents a Viewer from creating or canceling exports; tenancy prevents cross-tenant leakage.
  • Metrics and dashboards show per-profile throughput and error classes; alerts trigger on failure spikes.
  • Export retries are idempotent and do not duplicate content; hashes stable across reruns with identical inputs.

9) Risks & mitigations

  • Upstream schema changes break Trivy export. Mitigation: versioned adapter with compatibility gate; integration tests against known fixtures; fail early with clear remediation.

  • Bundle size explosion. Mitigation: zstd compression, sharding, delta exports, content-addressed storage reuse for mirror OCI.

  • Data leakage via exports. Mitigation: strict allowlist schemas, redaction filters, RBAC, tenant scoping tests, encryption for mirror.

  • Nondeterministic policy outputs. Mitigation: pin policy version and inputs hash; snapshot embedded rules; deterministic evaluation mode only.

  • Slow downloads/timeouts. Mitigation: streaming with range support, resumable downloads, CDN integration if needed.


10) Test plan

  • Unit: schema normalization; Trivy mapping; mirror delta computation; manifest hashing; signing.

  • Integration: end-to-end export with each profile; verify bundles; replay determinism of policy exports; OCI push/pull.

  • Compatibility: validate Trivy bundle against a matrix of versions; import mirror bundle into a reference reader and run queries.

  • Security: tenant isolation fuzzing; redaction checks; encryption round-trip; signature verification with rotated keys.

  • Performance: large dataset generation; parallel writer stress; OCI multi-manifest publishing; download resume under packet loss.

  • Chaos: kill exporter mid-write; ensure resume or clean failure without partial corrupt bundles.


11) Philosophy

  • Ports, not prisons. Exports should free your data to move with integrity and context, not trap it in a proprietary maze.
  • Reproducible or it didn't happen. Every bit derived from known inputs, signed, traceable.
  • Air-gap is a first-class citizen. Mirror bundles are not an afterthought; they're how serious orgs actually run.

Final reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.