# component_architecture_devops.md — **Stella Ops Release & Operations** (2025Q4)

> **Scope.** Implementation‑ready blueprint for **how Stella Ops is built, versioned, signed, distributed, upgraded, licensed (PoE)**, and operated in customer environments (online and air‑gapped). Covers reproducible builds, supply‑chain attestations, registries, offline kits, migration/rollback, artifact lifecycle (RustFS default + Mongo, S3 fallback), monitoring SLOs, and customer activation.

---

## 0) Product vision (operations lens)

Stella Ops must be **trustable at a glance** and **boringly operable**:

* Every release ships with **first‑party SBOMs, provenance, and signatures**; services verify **each other’s** integrity at runtime.
* Customers can deploy by **digest** and stay aligned with **LTS/stable/edge** channels.
* Paid customers receive **attestation authority** (Signer accepts their PoE) while the core platform remains **free to run**.
* Air‑gapped customers receive **offline kits** with verifiable digests and deterministic import.
* Artifacts expire predictably; operators know what’s kept, for how long, and why.

---

## 1) Release trains & versioning

### 1.1 Channels

* **LTS** (12‑month support window): quarterly cadence (Q1/Q2/Q3/Q4).
* **Stable** (default): monthly rollup (bug fixes + compatible features).
* **Edge**: weekly; for early adopters, no guarantees.

### 1.2 Version strings

Semantic core + calendar tag:

```
<MAJOR>.<MINOR>.<PATCH>  (<YYYY>.<MM>)   e.g., 2.4.1 (2027.06)
```

* **MAJOR**: breaking API/DB changes (rare).
* **MINOR**: new features, compatible schema migrations (expand/contract pattern).
* **PATCH**: bug fixes, perf and security updates.
* **Calendar tag** exposes **release year** used by Signer for **PoE window checks**.
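
As a rough illustration of the format above, the following sketch splits a version string into its semantic core and calendar tag. The `ReleaseVersion` type and `parse_release_version` function are hypothetical names, and the whitespace tolerance in the pattern is an assumption rather than a documented rule.

```python
import re
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    year: int
    month: int


# Matches strings such as "2.4.1 (2027.06)". Accepting any run of whitespace
# between the two parts is an assumption made for this sketch.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)\s+\((\d{4})\.(\d{2})\)$")


def parse_release_version(text: str) -> ReleaseVersion:
    """Split '<MAJOR>.<MINOR>.<PATCH> (<YYYY>.<MM>)' into integer fields."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a release version string: {text!r}")
    version = ReleaseVersion(*(int(group) for group in match.groups()))
    if not 1 <= version.month <= 12:
        raise ValueError(f"calendar month out of range: {version.month}")
    return version
```

With this shape, `parse_release_version("2.4.1 (2027.06)").year` yields the release year that a PoE window check would compare against.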

### 1.3 Component alignment

A release is a **bundle** of image digests + charts + manifests. All services in a bundle are **wire‑compatible**. Mixed minor versions are allowed within a bounded skew:

* **Web UI ↔ backend**: `±1 minor`.
* **Scanner ↔ Policy/Excititor/Concelier**: `±1 minor`.
* **Authority/Signer/Attestor triangle**: **must** be same minor (crypto and DPoP/mTLS binding rules).

At startup, services **self‑advertise** their semver & channel; the UI surfaces **mismatch warnings**.
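
The skew rules can be expressed as a small check over the versions that services advertise. This is a minimal sketch, not the UI's actual logic: the service keys (`web-ui`, `scanner`, and so on) are illustrative, "backend" is read as every other service in the bundle, and requiring the same major version is an assumption.

```python
from itertools import combinations

TRIANGLE = ("authority", "signer", "attestor")        # must share a minor
SCANNER_PEERS = ("policy", "excititor", "concelier")  # within one minor of scanner


def major_minor(version: str) -> tuple[int, int]:
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def within_skew(a: str, b: str, max_minor_skew: int) -> bool:
    (a_major, a_minor), (b_major, b_minor) = major_minor(a), major_minor(b)
    # Assumption: skew is only tolerated inside one major version.
    return a_major == b_major and abs(a_minor - b_minor) <= max_minor_skew


def skew_violations(versions: dict[str, str]) -> list[str]:
    """Return human-readable warnings for every rule that is broken."""
    problems = []
    present = [name for name in TRIANGLE if name in versions]
    for left, right in combinations(present, 2):
        if not within_skew(versions[left], versions[right], 0):
            problems.append(f"{left} and {right} must share a minor")
    if "scanner" in versions:
        for peer in SCANNER_PEERS:
            if peer in versions and not within_skew(versions["scanner"], versions[peer], 1):
                problems.append(f"scanner and {peer} differ by more than one minor")
    if "web-ui" in versions:
        for name, version in versions.items():
            if name != "web-ui" and not within_skew(versions["web-ui"], version, 1):
                problems.append(f"web-ui and {name} differ by more than one minor")
    return problems
```

An empty result means the advertised versions fit the bounded skew; any entries would map to the mismatch warnings mentioned above.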

---

## 2) Supply‑chain pipeline (how a release is built)

### 2.1 Deterministic builds

* **Builders**: isolated **BuildKit** workers with pinned base images (digest only).
* **Pinning**: lock and version-pin files (`go.mod`, `package-lock.json`, `global.json`, `Directory.Packages.props`) are **frozen** at tag.
* **Reproducibility**: timestamps normalized (source date epoch); deterministic zips/tars.
* **Multi‑arch**: linux/amd64 + linux/arm64 (Windows images track M2 roadmap).
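
To make "deterministic tars" concrete, here is a minimal sketch that packs files so the output bytes depend only on names, contents, and a fixed epoch. It uses Python's standard `tarfile` module; the `deterministic_tar` helper is hypothetical and is not the release tooling itself.

```python
import io
import tarfile


def deterministic_tar(files: dict[str, bytes], source_date_epoch: int) -> bytes:
    """Build a tar archive that is byte-identical for identical inputs."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):            # stable member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = source_date_epoch    # normalized timestamp
            info.mode = 0o644                 # fixed permissions
            info.uid = info.gid = 0           # no builder-specific ownership
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

A build script would typically pass `int(os.environ["SOURCE_DATE_EPOCH"])` as the epoch, following the reproducible-builds convention, so that two builders produce the same archive digest.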

### 2.2 First‑party SBOMs & provenance

* Each image gets **CycloneDX (JSON+Protobuf) SBOM** and **SLSA‑style provenance** attached as **OCI referrers**.
* Scanner’s **Buildx generator** is used to produce SBOMs *during* build; a separate post‑build scan verifies parity (red flag if drift).
* **Release manifest** (see §6.1) lists all digests and SBOM/attestation refs.
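
One way to picture the parity check is as a set comparison between the build-time SBOM and the post-build scan. The sketch below is an assumption about how drift could be detected, not the pipeline's actual comparison: it keys CycloneDX JSON components by `purl`, falling back to `name@version`, and the function names are hypothetical.

```python
import json


def component_keys(sbom_json: str) -> set[str]:
    """Identify each CycloneDX component by purl, else by name@version."""
    document = json.loads(sbom_json)
    keys = set()
    for component in document.get("components", []):
        fallback = f'{component["name"]}@{component.get("version", "")}'
        keys.add(component.get("purl") or fallback)
    return keys


def sbom_drift(build_time: str, post_build: str) -> dict[str, list[str]]:
    """Report components present in one SBOM but not in the other."""
    built, scanned = component_keys(build_time), component_keys(post_build)
    return {
        "missing_from_scan": sorted(built - scanned),
        "missing_from_build": sorted(scanned - built),
    }
```

Two empty lists indicate parity; any entry would be the "red flag" that blocks the release gate.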

### 2.3 Signing & transparency

* Images are **cosign‑signed** (keyless) with a Stella Ops release identity; inclusion in a **transparency log** (Rekor) is required.
* SBOM and provenance attestations are **DSSE** and also transparency‑logged.
* Release keys (Fulcio roots or public keys) are embedded in **Signer** policy (for **scanner‑release validation** at customer side).

### 2.4 Gates & tests

* **Static**: linters, codegen checks, protobuf API freeze (backward‑compat tests).
* **Unit/integration**: per‑component, plus **end‑to‑end** flows (scan→vex→policy→sign→attest).
* **Perf SLOs**: hot paths (SBOM compose, diff, export) measured against budgets.
* **Security**: dependency audit vs Concelier export; container hardening tests; minimal caps.
* **Analyzer smoke**: restart-time language plug-ins (currently Python) verified via `dotnet run --project tools/LanguageAnalyzerSmoke` to ensure manifest integrity plus cold vs warm determinism (< 30 s / < 5 s budgets); the harness logs deviations from repository goldens for follow-up.
* **Canary cohort**: internal staging + selected customers; one week on **edge** before **stable** tag.

### 2.5 Debug-store artefacts

* Every release exports stripped debug information for ELF binaries discovered in service images. Debug files follow the GNU build-id layout (`debug/.build-id/<aa>/<rest>.debug`) and are generated via `objcopy --only-keep-debug`.
* `debug/debug-manifest.json` captures build-id → component/image/source mappings with SHA-256 checksums so operators can mirror the directory into debuginfod or offline symbol stores. The manifest (and its `.sha256` companion) ships with every release bundle and Offline Kit.
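
The build-id layout maps the first two hex characters of the build-id to a directory and the remainder to the file name. A short sketch of that mapping, with `debug_file_path` as a hypothetical helper name:

```python
from pathlib import PurePosixPath

_HEX_DIGITS = set("0123456789abcdef")


def debug_file_path(build_id: str) -> PurePosixPath:
    """Map a GNU build-id to debug/.build-id/<aa>/<rest>.debug."""
    build_id = build_id.lower()
    if len(build_id) < 3 or not set(build_id) <= _HEX_DIGITS:
        raise ValueError(f"not a hexadecimal build-id: {build_id!r}")
    return PurePosixPath("debug/.build-id") / build_id[:2] / f"{build_id[2:]}.debug"
```

An operator mirroring the store can use the same mapping to check that every build-id listed in `debug/debug-manifest.json` has a file at the expected path.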

---

## 3) Distribution & activation

### 3.1 Registries

* **Primary**: `registry.stella-ops.org` (OCI v2, supports Referrers API).
* **Mirrors**: GHCR (read‑only), regional mirrors for latency.
  * Operational runbook: see `docs/ops/concelier-mirror-operations.md` for deployment profiles, CDN guidance, and sync automation.
* **Pull by digest only** in Kubernetes/Compose manifests.

**Gating policy**:

* **Core images** (Authority, Scanner, Concelier, Excititor, Attestor, UI): public **read**.
* **Enterprise add‑ons** (if any) and **pre‑release**: private repos via the **Registry Token Service** (`src/Registry/StellaOps.Registry.TokenService`) which exchanges Authority-issued OpToks for short-lived Docker registry bearer tokens.

> Monetization lever is **signing** (PoE gate), not image pulls, so the core remains simple to consume.

### 3.2 OAuth2 token service (for private repos)

* Docker Registry’s token flow backed by **Authority**:

  1. Client hits registry (`401` with `WWW-Authenticate: Bearer realm=…`).
  2. Client gets an **access token** from the token service (validated by Authority) with `scope=repository:…:pull`.
  3. Registry allows pull for the requested repo.
* Tokens are **short‑lived** (60–300 s) and **DPoP‑bound**.

The token service enforces plan gating via `registry-token.yaml` (see `docs/ops/registry-token-service.md`) and exposes Prometheus metrics (`registry_token_issued_total`, `registry_token_rejected_total`). Revoked licence identifiers halt issuance even when scope requirements are met.
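
Steps 1 and 2 of the token flow hinge on the client reading the `WWW-Authenticate` challenge and turning it into a token request. The sketch below follows the general Docker Registry token-authentication shape (`realm`, `service`, `scope` parameters); the helper names and the example host are hypothetical, and DPoP binding is deliberately left out.

```python
import re
from urllib.parse import urlencode

_PARAM_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_bearer_challenge(header: str) -> dict[str, str]:
    """Extract realm/service/scope from a 'Bearer ...' challenge header."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError("not a Bearer challenge")
    return dict(_PARAM_RE.findall(params))


def token_request_url(challenge: dict[str, str]) -> str:
    """Build the URL the client calls on the token service (step 2)."""
    query = {key: challenge[key] for key in ("service", "scope") if key in challenge}
    return f'{challenge["realm"]}?{urlencode(query)}'


challenge = parse_bearer_challenge(
    'Bearer realm="https://registry.example/token",'
    'service="registry.example",'
    'scope="repository:stellaops/addon:pull"'
)
url = token_request_url(challenge)
```

The client would then request `url` with its Authority-issued credential and retry the pull with the returned short-lived bearer token.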

### 3.3 Offline kits (air‑gapped)

* Tarball per release channel:

  ```
  stellaops-kit-<ver>-<channel>.tar.zst
    /images/   OCI layout with all first-party images (multi-arch)
    /sboms/    CycloneDX JSON+PB for each image
    /attest/   DSSE bundles + Rekor proofs
    /charts/   Helm charts + values templates
    /compose/  docker-compose.yml + .env template
    /plugins/  Concelier/Excititor connectors (restart-time)
    /policy/   example policies
    /manifest/ release.yaml  (see §6.1)
  ```
* Import via the CLI (`stellaops offline kit import`), which checks digests and signatures before loading anything.
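
The digest half of the import check can be sketched as follows. This assumes the expected digests have already been read from a signed manifest into a `path -> sha256` mapping; that mapping shape and the `verify_kit` helper are assumptions for illustration, and signature verification is out of scope here.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large image layers fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_kit(root: Path, expected: dict[str, str]) -> list[str]:
    """Return one failure message per missing or mismatching file."""
    failures = []
    for relative, want in sorted(expected.items()):
        target = root / relative
        if not target.is_file():
            failures.append(f"{relative}: missing")
        elif sha256_file(target) != want.removeprefix("sha256:"):
            failures.append(f"{relative}: digest mismatch")
    return failures
```

An importer built this way would refuse to load any image while the returned list is non-empty, which keeps the import all-or-nothing.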

---

## 4) Licensing (PoE) & monetization

**Principle**: **Only paid Stella Ops issues valid signed attestations.** Running the stack is free; signing requires PoE.

### 4.1 PoE issuance

* Customers purchase a plan and obtain a **PoE artifact** from `www.stella-ops.org`:

  * **PoE‑JWT** (DPoP/mTLS‑bound) **or** **PoE mTLS client certificate**.
  * Contains: `license_id`, `plan`, `valid_release_year`, `max_version`, `exp`, optional `tenant/customer` IDs.

### 4.2 Online enforcement

* **Signer** calls the **Licensing** endpoint `/license/introspect` on every signing request (see signer doc).
* If **revoked/expired/out‑of‑window** → deny with machine‑readable reason.
* All **valid** bundles are DSSE‑signed and **Attestor** logs them; Rekor UUID returned.
* UI badges: “**Verified by Stella Ops**” with link to the public log.
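
The deny path can be illustrated with the claims listed in §4.1. This is only a sketch under stated assumptions: the comparison semantics (release year must not exceed `valid_release_year`, version must not exceed `max_version`), the reason strings, and the `poe_denial_reason` helper are all hypothetical; the signer documentation remains the authority on the real rules.

```python
import time
from typing import Optional


def version_tuple(version: str) -> tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def poe_denial_reason(
    claims: dict,
    release_year: int,
    release_version: str,
    revoked_ids: frozenset = frozenset(),
    now: Optional[float] = None,
) -> Optional[str]:
    """Return a machine-readable denial reason, or None when signing may proceed."""
    now = time.time() if now is None else now
    if claims["license_id"] in revoked_ids:
        return "revoked"
    if now >= claims["exp"]:
        return "expired"
    # Assumed window semantics: the running release must not be newer than
    # the year and version the PoE was issued for.
    if release_year > claims["valid_release_year"]:
        return "out_of_window"
    if version_tuple(release_version) > version_tuple(claims["max_version"]):
        return "out_of_window"
    return None
```

Returning a short reason code rather than a boolean is what lets callers surface the "machine‑readable reason" to the CLI and UI.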

### 4.3 Air‑gapped / offline

* Customers obtain a **time‑boxed PoE lease** (signed JSON, 7–30 days).
* Signer accepts the lease and emits **provisional** attestations (clearly labeled).
* When connectivity returns, a background job **endorses** the provisional entries with the cloud service, updating their status to **verified**.
* Operators can export a **verification bundle** for auditors even before endorsement (contains DSSE + local Rekor proof + lease snapshot).

### 4.4 Stolen/abused PoE

* Customers report theft; **Licensing** flags `license_id` as **revoked**.
* Subsequent Signer requests **deny**; previous attestations remain but can be marked **contested** (UI shows badge, optional re‑sign path upon new PoE).

---

## 5) Deployment path (customer side)

### 5.1 First install

* **Helm** (Kubernetes) or **Compose** (VMs). Example (K8s):

```bash
helm repo add stellaops https://charts.stella-ops.org
helm install stella stellaops/platform \
  --version 2.4.0 \
  --set global.channel=stable \
  --set authority.issuer=https://authority.stella.local \
  --set scanner.minio.endpoint=http://minio.stella.local:9000 \
  --set scanner.mongo.uri=mongodb://mongo/scanner \
  --set concelier.mongo.uri=mongodb://mongo/concelier \
  --set excititor.mongo.uri=mongodb://mongo/excititor
```

* Post‑install job registers **Authority clients** (Scanner, Signer, Attestor, UI) and prints **bootstrap** URLs and client credentials (sealed secrets).
* UI banner shows **release bundle** and verification state (cosign OK? Rekor OK?).

### 5.2 Updates

* **Blue/green**: pull new bundle by **digest**; deploy side‑by‑side; cut traffic.

* **Rolling**: upgrade components in a safe order:

  1. Authority (stateless, dual‑key rotation ready)
  2. Signer/Attestor (same minor)
  3. Scanner WebService & Workers
  4. Concelier, then Excititor (schema migrations are expand/contract)
  5. UI last

* **DB migrations** are **expand/contract**:

  * Phase A (release N): **add** new fields/indexes, write old+new.
  * Phase B (N+1): **read** new fields; **drop** old.
  * Rollback is a matter of redeploying previous images and keeping both schemas valid.
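
The expand/contract phases are easiest to see on a single document. In this sketch the field names (`imageDigest` as the old field, `image.digest` as the new one) are invented for illustration and do not come from the actual schemas.

```python
def write_phase_a(document: dict, digest: str) -> dict:
    """Release N writer: populate the old and the new field together."""
    updated = dict(document)
    updated["imageDigest"] = digest         # old field, still read after a rollback
    updated["image"] = {"digest": digest}   # new field, read from release N+1 on
    return updated


def read_previous_release(document: dict) -> str:
    """Reader shipped before the migration; only knows the old field."""
    return document["imageDigest"]


def read_release_n_plus_1(document: dict) -> str:
    """Reader shipped in release N+1; only needs the new field."""
    return document["image"]["digest"]


stored = write_phase_a({"_id": 1}, "sha256:abc")
```

Because a Phase A document satisfies both readers, redeploying the previous images stays safe until Phase B actually drops the old field.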

### 5.3 Rollback

* Images are referenced by **digest**; keep the release manifests of the previous `K` versions.
* `helm rollback` or compose `docker compose -f release-K.yml up -d`.
* Mongo migrations are additive; **no destructive changes** within a single minor.

---

## 6) Release payloads & manifests

### 6.1 Release manifest (`release.yaml`)

```yaml
release:
  version: "2.4.1"
  channel: "stable"
  date: "2027-06-20T12:00:00Z"
  calendar: "2027.06"
  components:
    - name: scanner-webservice
      image: registry.stella-ops.org/stellaops/scanner-web@sha256:aa..bb
      sbom: oci://.../referrers/cdx-json@sha256:11..22
      provenance: oci://.../attest/provenance@sha256:33..44
      signature: { rekorUUID: "…" }
    - name: signer
      image: registry.stella-ops.org/stellaops/signer@sha256:cc..dd
      signature: { rekorUUID: "…" }
  charts:
    - name: platform
      version: "2.4.1"
      digest: "sha256:ee..ff"
  compose:
    file: "docker-compose.yml"
    digest: "sha256:77..88"
  checksums:
    sha256: "… digest of this release.yaml …"
```

The manifest is **cosign‑signed**; UI/CLI can verify a bundle without talking to registries.
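
A simple guard that pairs with the manifest is checking that every component image is pinned by digest rather than by tag. The sketch assumes the YAML has already been parsed into a plain dictionary with the structure shown above; `unpinned_components` is a hypothetical helper, not part of the CLI.

```python
import re

# A digest-pinned reference: <repository>@sha256:<64 lowercase hex characters>.
_DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def unpinned_components(manifest: dict) -> list[str]:
    """Names of components whose image is not referenced by a full digest."""
    return [
        component["name"]
        for component in manifest["release"]["components"]
        if not _DIGEST_REF.match(component.get("image", ""))
    ]


example = {
    "release": {
        "components": [
            {"name": "signer",
             "image": "registry.stella-ops.org/stellaops/signer@sha256:" + "c" * 64},
            {"name": "scanner-webservice",
             "image": "registry.stella-ops.org/stellaops/scanner-web:2.4.1"},
        ]
    }
}
```

Here `unpinned_components(example)` flags only the tag-based `scanner-webservice` entry, which is the kind of finding a profile validator could fail on.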

> Deployment guardrails – The repository keeps channel-aligned Compose bundles
> in `deploy/compose/` and Helm overlays in `deploy/helm/stellaops/`. Both sets
> pull their digests from `deploy/releases/` and are validated by
> `deploy/tools/validate-profiles.sh` to guarantee lint/dry-run cleanliness.

### 6.2 Image labels (release metadata)

Each image sets OCI labels:

```
org.opencontainers.image.version = "2.4.1"
org.opencontainers.image.revision = "<git sha>"
org.opencontainers.image.created = "2027-06-20T12:00:00Z"
org.stellaops.release.calendar = "2027.06"
org.stellaops.release.channel  = "stable"
org.stellaops.build.slsaProvenance = "oci://…"
```

Signer validates **scanner** image’s cosign identity + calendar tag for **release window** checks.

---

## 7) Artifact lifecycle & storage (RustFS/Mongo)

### 7.1 Buckets & prefixes (RustFS)

```
rustfs://stellaops/
  scanner/
    layers/<sha256>/sbom.cdx.json.zst
    images/<imgDigest>/inventory.cdx.pb
    images/<imgDigest>/usage.cdx.pb
    diffs/<old>_<new>/diff.json.zst
    attest/<artifactSha256>.dsse.json
  concelier/
    json/<exportId>/...
    trivy/<exportId>/...
  excititor/
    exports/<exportId>/...
  attestor/
    dsse/<bundleSha256>.json
    proof/<rekorUuid>.json
```

### 7.2 ILM classes

* **`short`**: working artifacts (diffs, queues) — TTL 7–14 days.
* **`default`**: SBOMs & indexes — TTL 90–180 days (configurable).
* **`compliance`**: signed reports & attested exports — retention enforced via RustFS hold or S3 Object Lock (governance/compliance) 1–7 years.

### 7.3 Artifact Lifecycle Controller (ALC)

* A background worker (part of Scanner.WebService) enforces **TTL** and **reference counting**:

  * Artifacts referenced by **reports** or **tickets** are pinned.
  * ILM actions logged; UI shows per‑class usage & upcoming purges.
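
The interplay of TTL and reference counting can be sketched as a pure function that picks purge candidates. The TTL values are chosen from within the ranges in §7.2, and the `Artifact` record and `purge_candidates` function are hypothetical; compliance-class retention is assumed to be handled entirely by object hold or lock, so this sketch never purges it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Illustrative TTLs picked from the documented ranges; "compliance" is absent
# on purpose because retention there is enforced by object hold/lock.
TTL_BY_CLASS = {"short": timedelta(days=14), "default": timedelta(days=180)}


@dataclass
class Artifact:
    key: str
    ilm_class: str
    created: datetime
    reference_count: int = 0   # reports or tickets pointing at this artifact


def purge_candidates(artifacts: list[Artifact], now: datetime) -> list[str]:
    """Keys that are past their TTL and not pinned by any reference."""
    candidates = []
    for artifact in artifacts:
        ttl = TTL_BY_CLASS.get(artifact.ilm_class)
        if ttl is None:                    # no TTL-driven purge for this class
            continue
        if artifact.reference_count > 0:   # pinned by a report or ticket
            continue
        if now - artifact.created > ttl:
            candidates.append(artifact.key)
    return candidates
```

Keeping the decision separate from the deletion makes it straightforward to log each ILM action and to show upcoming purges before they happen.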

> **Migration note.** Follow `docs/ops/scanner-rustfs-migration.md` when transitioning existing
> MinIO buckets to RustFS. The provided migrator is idempotent and safe to rerun per prefix.

### 7.4 Mongo retention

* **Scanner**: `runtime.events` use TTL (e.g., 30–90 days); **catalog** permanent.
* **Concelier/Excititor**: raw docs keep **last N windows**; canonical stores permanent.
* **Attestor**: `entries` permanent; `dedupe` TTL 24–48h.

### 7.5 Mongo server baseline

* **Minimum supported server:** MongoDB **4.2+**. Driver 3.5.0 removes compatibility shims for 4.0; upstream has already announced 4.0 support will be dropped in upcoming C# driver releases.
* **Deploy images:** Compose/Helm defaults stay on `mongo:7.x`. For air-gapped installs, refresh Offline Kit bundles so the packaged `mongod` matches ≥4.2.
* **Upgrade guard:** During rollout, verify replica sets reach FCV `4.2` or above before swapping binaries; automation should hard-stop if FCV is <4.2.
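
The upgrade guard comes down to a numeric comparison of the feature compatibility version. A minimal sketch follows; the `fcv_allows_upgrade` name is hypothetical, and how the FCV string is fetched (for example via MongoDB's `getParameter` admin command) is left to the surrounding automation.

```python
def fcv_allows_upgrade(fcv: str, minimum: str = "4.2") -> bool:
    """Compare FCV strings numerically, component by component."""
    def as_tuple(version: str) -> tuple[int, ...]:
        return tuple(int(part) for part in version.split("."))

    return as_tuple(fcv) >= as_tuple(minimum)
```

Comparing tuples of integers rather than raw strings matters here: a plain string comparison would rank `"10.0"` below `"4.2"` and wrongly block the rollout.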

---

## 8) Observability & SLOs (operations)

* **Uptime SLO**: 99.9% for Signer/Authority/Attestor; 99.5% for Scanner WebService; Excititor/Concelier 99.0%.
* **Error budgets**: tracked per month; dashboards show burn rates.
* **Golden signals**:

  * **Latency**: token issuance, sign→attest round‑trip, scan enqueue→emit, export build.
  * **Saturation**: queue depth, Mongo write IOPS, RustFS throughput / queue depth (or S3 metrics when in fallback mode).
  * **Traffic**: scans/min, attestations/min, webhook admits/min.
  * **Errors**: 5xx rates, cosign verification failures, Rekor timeouts.

Prometheus + OTLP; Grafana dashboards ship in the charts.
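
To give the uptime targets a concrete scale, the sketch below converts an SLO into a monthly error budget and a burn rate. A 30-day month is assumed, and both function names are illustrative. Under that assumption, 99.9% allows about 43.2 minutes of downtime per month, 99.5% about 216 minutes, and 99.0% about 432 minutes.

```python
def error_budget_minutes(slo_percent: float, window_days: float = 30) -> float:
    """Minutes of allowed unavailability in the window for a given SLO."""
    return (100.0 - slo_percent) / 100.0 * window_days * 24 * 60


def burn_rate(bad_minutes: float, elapsed_days: float, slo_percent: float) -> float:
    """Budget consumption speed; 1.0 means the budget lasts exactly the window."""
    return bad_minutes / error_budget_minutes(slo_percent, elapsed_days)
```

A burn rate that stays above 1.0 means the monthly budget will run out before the month ends, which is the signal the dashboards are meant to surface.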

---

## 9) Security & compliance operations

* **Key rotation**:

  * Authority JWKS: 60‑day cadence, dual‑key overlap.
  * Release signing identities: rotate per minor or quarterly.
  * Sigstore roots mirrored and pinned; alarms on drift.

* **FIPS mode** (Gov build):

  * Enforce `ES256` + KMS/HSM; disable Ed25519; MLS ciphers only.
  * Local **Rekor v2** and **Fulcio** alternatives; **air‑gapped** CA.

* **Vulnerability response**:

  * Concelier red-flag advisories trigger accelerated **stable** patch rollout; UI/CLI “security patch available” notice.
  * 2025-10: Pinned `MongoDB.Driver` **3.5.0** and `SharpCompress` **0.41.0** across services (DEVOPS-SEC-10-301) to eliminate NU1902/NU1903 warnings surfaced during scanner cache/worker test runs; repacked the local `Mongo2Go` feed so test fixtures inherit the patched dependencies; future bumps follow the same central override pattern.

* **Backups/DR**:

  * Mongo nightly snapshots; MinIO versioning + replication (if configured).
  * Restore runbooks tested quarterly with synthetic data.

---

## 10) Customer update flow (how versions are fetched & activated)

### 10.1 Online clusters

* **UI** surfaces update banner with **release manifest** diff and risk notes.
* Operator approves → **Controller** pulls new images by digest; health‑checks; moves traffic; deprecates old revision.
* Post‑switch, **schema Phase B** migrations (if any) run automatically.

### 10.2 Air‑gapped clusters

* Operator downloads **offline kit** from a mirror → `stellaops offline kit import`.
* Controller validates bundle checksums and **cosign signatures**; applies charts/compose by digest.
* After install, **verify** page shows green checks: image sigs, SBOMs attached, provenance logged.

### 10.3 CLI self‑update (optional)

* `stellaops self-update` pulls a **signed release manifest** and verifies the **CLI binary** with cosign before swapping (admin can disable).

---

## 11) Compatibility & deprecation policy

* **APIs** are stable within a **major**; breaking changes imply **MAJOR++** and a deprecation period of one minor.
* **Storage**: expand/contract; “drop old fields” only after one minor grace.
* **Config**: feature flags (default off) for risky features (e.g., eBPF).

---

## 12) Runbooks (selected)

### 12.1 Lost PoE

1. Suspend **automatic attestation** jobs.
2. Use CLI `stellaops signer status` to confirm `entitlement_denied`.
3. Obtain new PoE from portal; verify on Signer `/poe/verify`.
4. Re‑enable; optionally **re‑sign** last N reports (UI button → batch).

### 12.2 Rekor outage (self‑hosted)

* Attestor returns `202 (pending)` with queued proof fetch.
* Keep DSSE bundles locally; re‑submit on schedule; UI badge shows **Pending**.
* If outage > SLA, you can switch to a **mirror** log in config; Attestor writes to both when restored.

### 12.3 Emergency downgrade

* Identify prior release manifest (UI → Admin → Releases).
* `helm rollback stella <revision>` (or compose apply previous file).
* Services tolerate skew per §1.3; ensure **Signer/Authority/Attestor** are rolled together.

---

## 13) Example: cluster bootstrap (Compose)

```yaml
version: "3.9"
services:
  authority:
    image: registry.stella-ops.org/stellaops/authority@sha256:...
    env_file: ./env/authority.env
    ports: ["8440:8440"]
  signer:
    image: registry.stella-ops.org/stellaops/signer@sha256:...
    depends_on: [authority]
    environment:
      - SIGNER__POE__LICENSING__INTROSPECTURL=https://www.stella-ops.org/api/v1/license/introspect
  attestor:
    image: registry.stella-ops.org/stellaops/attestor@sha256:...
    depends_on: [signer]
  scanner-web:
    image: registry.stella-ops.org/stellaops/scanner-web@sha256:...
    environment:
      - SCANNER__S3__ENDPOINT=http://minio:9000
  scanner-worker:
    image: registry.stella-ops.org/stellaops/scanner-worker@sha256:...
    deploy: { replicas: 4 }
  concelier:
    image: registry.stella-ops.org/stellaops/concelier@sha256:...
  excititor:
    image: registry.stella-ops.org/stellaops/excititor@sha256:...
  web-ui:
    image: registry.stella-ops.org/stellaops/web-ui@sha256:...
  mongo:
    image: mongo:7
  minio:
    image: minio/minio:RELEASE.2025-07-10T00-00-00Z
```

---

## 14) Governance & keys (who owns the trust root)

* **Release key policy**: only the Release Engineering group can push signed releases; 4‑eyes approval; TUF‑style manifest possible in future.
* **Signer acceptance policy**: embedded release identities are updated **only** via minor upgrade; emergency CRL supported.
* **Customer keys**: none needed for core use; enterprise add‑ons may require per‑customer registries and keys.

---

## 15) Roadmap (Ops)

* **Windows containers GA** (Scanner + Zastava).
* **Key Transparency** for Signer certs.
* **Delta‑kit** (offline) for incremental updates.
* **Operator CRDs** (K8s) to manage policy and ILM declaratively.
* **SBOM protobuf** as default transport at rest (smaller, faster).

---

### Appendix A — Minimal SLO monitors

* `authority.tokens_issued_total` slope ≈ normal.
* `signer.requests_total{result="success"}/minute` > 0 (when scans occur).
* `attestor.submit_latency_seconds{quantile=0.95}` < 0.3.
* `scanner.scan_latency_seconds{quantile=0.95}` < target per image size.
* `concelier.export.duration_seconds` stable; `excititor.consensus.conflicts_total` not exploding after policy changes.
* RustFS request error rate near zero (or `s3_requests_errors_total` when operating against S3); Mongo `opcounters` hit expected baseline.

### Appendix B — Upgrade safety checklist

* Verify **release manifest** signature.
* Ensure **Signer/Authority/Attestor** are same minor.
* Verify **DB backups** are less than 24 h old.
* Confirm **ILM** won’t purge compliance artifacts during upgrade window.
* Roll **one component** at a time; watch SLOs; abort on regression.

---

**End — component_architecture_devops.md**