Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
Epic 13: Containerized Distribution & Quickstart
Short name: Containerized Distribution & Quickstart
Primary components: OCI images for all services, Compose Quickstart, Helm chart for production, Air‑gap bundles
Surfaces: Container registry, /deploy/*, /docs/install/*, Console onboarding screen
Touches: Authority (authN/Z), Web Services API, Orchestrator, Task Runner, Policy Engine, Conseiller (Feedser), Excitator (Vexer), Findings Ledger, Export Center, Notifications Studio, Advisory AI Assistant, Object Storage/KMS, Telemetry
AOC ground rule reminder: Conseiller and Excitator aggregate and link advisories/VEX. They never merge or mutate source records. Containerized deployments must preserve this behavior and expose links to originals.
1) What it is
A complete, reproducible containerized distribution of StellaOps with three delivery modes:
- Quickstart (single host) using Docker Compose: one command to run a full stack suitable for evaluation and local development. Ships with seed data and sane defaults.
- Production Helm chart for Kubernetes: modular, scalable, secure‑by‑default deployment with optional HA and external dependencies.
- Air‑gapped bundles: signed offline packages containing images, seed configs, and installation scripts for disconnected environments.
All images are multi‑arch (amd64/arm64), signed, SBOM‑attached, and versioned with consistent tags. A “Download & Install” doc set guides users from zero to a working system in minutes and to a production‑ready posture in hours.
2) Why (brief)
People don’t adopt tools they can’t run quickly or securely. Containers make our deployment reproducible; Quickstart removes friction; Helm unlocks real ops. Air‑gap bundles acknowledge reality in regulated environments.
3) How it should work (maximum detail)
3.1 Image catalog
Build and publish OCI images for the following:
- stella-api (Web Services API)
- stella-console (Web UI)
- stella-orchestrator (source/job scheduler)
- stella-task-runner (executes Task Packs remotely)
- stella-conseiller (Feedser; advisory aggregator)
- stella-excitator (Vexer; VEX aggregator)
- stella-policy (Policy Engine)
- stella-ledger (Findings Ledger worker; if separated from API)
- stella-export (Export Center worker; optional if part of API)
- stella-notify (Notifications Studio worker)
- stella-ai (Advisory AI Assistant; lightweight service calling configured LLM backends or local models)
- Support services (optionally bundled for Quickstart): postgres, redis, object-store (S3‑compatible), queue (NATS or RabbitMQ), otel-collector.
Image standards
- Base: distroless or minimal; non‑root user; read‑only filesystem; writable /tmp only if needed.
- Ports: declare via labels; expose health endpoints /health/liveness, /health/readiness.
- Env: explicit, documented, with safe defaults; secrets via env or file mounts only.
- Config: STELLA_* envs or mounted config directory /etc/stella/.
- SBOM: attach SPDX JSON as OCI artifact and include in /app/sbom.spdx.json baked at build time.
- Signing: cosign attestations for image, SBOM, and provenance.
- Labels: org.opencontainers.image.* (title, version, revision, source, licenses).
- Entrypoint: PID 1 with reap; graceful shutdown on SIGTERM; configurable termination grace period.
- Logs: structured JSON by default; stdout/stderr only.
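The label requirement above lends itself to a small CI check. The following is a minimal sketch, assuming the labels have already been read from the image config into a plain dictionary; the function name missing_labels is hypothetical and not part of any StellaOps tooling.

```python
# Label keys named in the image standards; all are standard OCI annotation keys.
REQUIRED_LABELS = (
    "org.opencontainers.image.title",
    "org.opencontainers.image.version",
    "org.opencontainers.image.revision",
    "org.opencontainers.image.source",
    "org.opencontainers.image.licenses",
)


def missing_labels(labels):
    """Return required label keys that are absent or blank, in catalog order."""
    return [key for key in REQUIRED_LABELS if not str(labels.get(key) or "").strip()]
```

A release gate could fail the build whenever this list is non-empty for any image in the catalog.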
Tagging scheme
- :vX.Y.Z (immutable release)
- :vX.Y.Z-rc.N (release candidate)
- :edge (latest main)
- :nightly-YYYYMMDD (optional)
- Multi‑arch manifest lists for linux/amd64 and linux/arm64.
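A release pipeline may want to reject tags that fall outside this scheme. The sketch below classifies a tag (without the leading colon) against the four forms listed above; the regular expressions are an interpretation of the scheme, and the helper names are hypothetical.

```python
import re
from typing import Optional

# Patterns are an interpretation of the tagging scheme, not an official grammar.
TAG_PATTERNS = {
    "release": re.compile(r"v\d+\.\d+\.\d+"),
    "release-candidate": re.compile(r"v\d+\.\d+\.\d+-rc\.\d+"),
    "edge": re.compile(r"edge"),
    "nightly": re.compile(r"nightly-\d{8}"),
}


def classify_tag(tag: str) -> Optional[str]:
    """Return the kind of tag, or None if it does not fit the scheme."""
    for kind, pattern in TAG_PATTERNS.items():
        if pattern.fullmatch(tag):
            return kind
    return None


def is_immutable(tag: str) -> bool:
    """Only vX.Y.Z release tags are described as immutable."""
    return classify_tag(tag) == "release"
```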
3.2 Quickstart (Compose)
Goal: curl | sh equivalent that yields a working stack on a laptop/server with defaults and demo data. No internet beyond pulling images, unless configured.
Compose file deploy/compose/docker-compose.yml
- Services:
  - api, console, orchestrator, task-runner, conseiller, excitator, policy, notify, export, ai
  - postgres, redis, minio (S3), nats or rabbitmq, otel-collector
- Volumes: pgdata, minio-data, redis-data, stella-state (for local cache, packs registry)
- Networks: stella-net bridge
- Ports (defaults): Console 8080, API 8081, MinIO 9000, NATS/RabbitMQ default ports
- Env files: .env.example with safe defaults; users copy to .env.
Seed data
- Seed admin account and tenant on first run via stella-api migration/seed job.
- Seed demo SBOMs, advisories, VEX samples, baseline policy, and a task pack.
- On first login, Console shows “Welcome” wizard: confirm endpoints, generate API token, run sample scan import, open Vulnerability Explorer.
Security posture
- Default credentials only for Quickstart; randomize secrets on first up and store in .secrets/ file.
- All services run as non‑root; bind to localhost by default unless EXPOSE_PUBLIC=1 is set.
- TLS optional via CADDY or nginx sidecar, disabled by default.
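The "randomize secrets on first up" step could look roughly like the sketch below. It assumes one file per secret inside a .secrets/ directory and invents the secret names for illustration; the source does not specify either. Existing files are never overwritten, so repeated runs keep the same values.

```python
import os
import secrets
from pathlib import Path

# Hypothetical secret names; the real list is not given in this document.
SECRET_NAMES = ("POSTGRES_PASSWORD", "MINIO_SECRET_KEY", "STELLA_ADMIN_PASSWORD")


def ensure_secrets(directory: Path, names=SECRET_NAMES) -> dict:
    """Create any missing secret files with random values and return all values."""
    directory.mkdir(mode=0o700, parents=True, exist_ok=True)
    values = {}
    for name in names:
        path = directory / name
        if not path.exists():
            # O_EXCL refuses to clobber an existing file; 0o600 keeps it owner-only.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
            with os.fdopen(fd, "w") as handle:
                handle.write(secrets.token_urlsafe(32))
        values[name] = path.read_text()
    return values
```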
One‑liner
./deploy/compose/quickstart.sh does: preflight checks, pulls images, writes .env, runs docker compose up -d, polls readiness, prints URLs and credentials.
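The "polls readiness" step is the part most likely to hang a first run, so it is worth bounding. This is a minimal sketch of such a poll loop, not the contents of quickstart.sh; the function names, deadline, and interval are assumptions. The probe, sleep, and clock are injectable so the loop can be exercised without a running stack.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_ready(url, timeout=2.0):
    """Return True when the URL answers 200; any network or HTTP error counts as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_ready(urls, probe=http_ready, deadline_s=120.0, interval_s=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until every URL is ready; return the URLs still failing at the deadline."""
    pending = list(urls)
    deadline = clock() + deadline_s
    while pending:
        pending = [url for url in pending if not probe(url)]
        if not pending:
            break
        if clock() >= deadline:
            return pending
        sleep(interval_s)
    return []
```

With the default ports above, a caller might pass something like http://localhost:8081/health/readiness for the API and treat a non-empty return value as a failed start.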
Backups & reset
- ./deploy/compose/backup.sh creates a tarball of volumes and config.
- ./deploy/compose/reset.sh deletes persistent volumes after a prominent confirmation prompt, unless --yes is passed.
3.3 Production Helm chart
Chart location: deploy/helm/stella/ with subcharts or toggles.
Chart features
- Components enabled via values: api, console, orchestrator, taskRunner, conseiller, excitator, policy, notify, export, ai.
- External dependencies by default:
  - PostgreSQL, Redis, S3 bucket, Message queue, OTel endpoint provided via values.
  - Optional “bundled” mode for lab clusters using StatefulSets.
- Security:
  - Pod and container securityContext: runAsNonRoot, readOnlyRootFilesystem (set per container), fsGroup when needed.
  - NetworkPolicy for east‑west traffic; deny‑all then allow specific ports.
  - Secrets delivered as Kubernetes Secret objects from the External Secrets operator or sealed secrets.
  - HPA per component; PDBs; liveness/readiness probes.
- Ingress:
  - One hostname for Console, one for API; TLS required in production values.
  - Option to serve Console as static behind CDN while API behind private ingress gateway.
- Config:
  - Values for Authority provider, token TTLs, policy cache TTL, pack registry endpoint, notifications sinks, export locations.
  - Feature flags per epic enablement.
- Migrations:
  - stella-migrator Job runs before rollouts; idempotent migrations.
  - Optional “break glass” manual job.
- Observability:
  - /metrics endpoints scraped by Prometheus; exemplars via OTel; logs structured.
  - OpenTelemetry auto‑config via env if collector provided.
- Upgrades:
  - Blue/green or rolling; readiness gates based on background indexers catching up.
  - Chart hooks to block until Conseiller/Excitator catch up to feed watermarks.
3.4 Air‑gapped distribution
Bundle format
- stella-bundle-vX.Y.Z.tar.zst containing:
  - All images as OCI layout (multi‑arch), cosign signatures, SBOMs, SLSA provenance.
  - load.sh to import into a local registry.
  - compose/ and helm/ directories with pinned image digests.
  - checksums.txt and bundle.sig.
Process
- Online build job crafts bundle; signatures produced by CI keys.
- Offline install:
  - Verify bundle.sig
  - ./load.sh --to registry.local:5000
  - helm install stella ./helm -f values-airgap.yaml --set image.registry=registry.local:5000
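Once bundle.sig has been verified, the per-file digests in checksums.txt can be checked before anything is loaded. The sketch below assumes checksums.txt uses the common sha256sum layout (hex digest, whitespace, relative path); the document does not state the format, so treat that as an assumption. It does not verify the signature itself.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large image archives are not read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(bundle_dir: Path, manifest_name="checksums.txt"):
    """Return the listed files that are missing or whose digest does not match."""
    failures = []
    for line in (bundle_dir / manifest_name).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        expected, _, name = line.partition(" ")
        name = name.strip().lstrip("*")  # sha256sum marks binary mode with '*'
        target = bundle_dir / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            failures.append(name)
    return failures
```

An installer could refuse to run load.sh when the returned list is non-empty.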
3.5 Configuration matrix
Document every config knob in a single table:
- Auth: Authority issuer, JWKS, RBAC cache TTL.
- Storage: DB URL, pool sizes, migration flags.
- Object store: S3 endpoint, buckets, SSE, IAM.
- Queue: URL, prefetch, retention.
- Policy engine: rule cache TTL, default policy version.
- Conseiller/Excitator: polling intervals, feed sources, retry backoff, max in‑flight; merge‑disabled setting enforced.
- Orchestrator/Task Runner: concurrency, sandbox, network egress policy, artifact retention.
- Notifications: sinks, templates path, batch windows.
- Export Center: formats enabled, rate limits.
- AI Assistant: model endpoint, token limits, guardrails, disabled by default.
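To illustrate how a service might read a few of these knobs from STELLA_* environment variables with safe defaults, here is a minimal sketch. Only the STELLA_ prefix comes from this document; the individual variable names and default values are hypothetical. The AI Assistant stays off unless explicitly enabled, matching the matrix above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _flag(raw: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    db_url: str
    db_pool_size: int
    policy_cache_ttl_s: int
    ai_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from an environment mapping; variable names are illustrative."""
    return Settings(
        db_url=env.get("STELLA_DB_URL", "postgres://stella@postgres:5432/stella"),
        db_pool_size=int(env.get("STELLA_DB_POOL_SIZE", "10")),
        policy_cache_ttl_s=int(env.get("STELLA_POLICY_CACHE_TTL_S", "300")),
        ai_enabled=_flag(env.get("STELLA_AI_ENABLED", "false")),
    )
```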
3.6 Health, readiness, and upgrades
- Health endpoints: GET /health/liveness returns 200 if process responsive; GET /health/readiness checks dependencies with timeout.
- Graceful shutdown: SIGTERM starts drain; HTTP returns 503; background workers flush; exit on deadline.
- Upgrade choreography: migrations run, API becomes ready, workers rolling restart, indexes catch up, AOC evaluation warms caches, then flip traffic.
- Version skew policy: define supported skew between components; chart validates.
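The readiness behaviour described above (check dependencies, bounded by a timeout) can be sketched as follows. Each dependency check is a zero-argument callable supplied by the service; the function name, the shared deadline, and the result shape are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def readiness(checks, timeout_s=2.0):
    """Run dependency checks concurrently under one shared deadline.

    Returns (http_status, per_dependency_results). A check that raises or
    does not finish before the deadline is reported as not ready.
    """
    results = {}
    pool = ThreadPoolExecutor(max_workers=max(1, len(checks)))
    try:
        futures = {name: pool.submit(check) for name, check in checks.items()}
        deadline = time.monotonic() + timeout_s
        for name, future in futures.items():
            remaining = max(0.0, deadline - time.monotonic())
            try:
                results[name] = bool(future.result(timeout=remaining))
            except Exception:  # includes timeouts and errors raised by the check
                results[name] = False
    finally:
        # Do not block the probe on slow checks; their threads finish in the background.
        pool.shutdown(wait=False)
    return (200 if all(results.values()) else 503), results
```

A handler for GET /health/readiness could return the status code directly and include the per-dependency map in the body to help troubleshooting.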
3.7 Security & compliance
- Image signing & verification: cosign attestations; optional admission policy to verify signatures by key.
- SBOM provenance: attach SPDX and provenance attestations; publish via registry referrers.
- Non‑root & least privilege: capabilities dropped; only NET_BIND_SERVICE for proxies if needed.
- Secrets handling: mount from files; avoid putting secrets in args; redacted logs by default.
- Audit: container labels propagate release metadata to all logs and spans.
- AOC enforcement: images for Conseiller/Excitator hard‑disable merge code paths via env/defaults.
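The "redacted logs by default" requirement pairs naturally with the structured JSON logging standard. The sketch below shows one way to do it with the standard logging module: a formatter that emits JSON and masks values whose keys look sensitive. The key pattern and the record attribute name fields are assumptions, not an existing StellaOps convention.

```python
import json
import logging
import re

# Heuristic for sensitive keys; a real deployment would maintain an explicit list.
SENSITIVE_KEY = re.compile(r"(password|secret|token|api[_-]?key)", re.IGNORECASE)


def redact(value):
    """Recursively mask values stored under sensitive-looking keys."""
    if isinstance(value, dict):
        return {
            key: "***" if SENSITIVE_KEY.search(str(key)) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record, with structured fields redacted."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "fields": redact(getattr(record, "fields", {})),
        }
        return json.dumps(payload, sort_keys=True)
```

Callers would attach structured data with logger.info("connected", extra={"fields": {...}}) and install the formatter on a stdout handler.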
3.8 Quickstart UX polish
- Console shows “Connected to Quickstart” banner with a button “View install docs” and “Export pack to production.”
- One click to generate a Task Pack that exports seed data from Quickstart to a production tenant via Export Center.
4) Architecture
4.1 Repos & layout
/deploy
  /compose
    docker-compose.yml
    .env.example
    quickstart.sh
    backup.sh
    reset.sh
  /helm
    /stella
      Chart.yaml
      values.yaml
      values-prod.yaml
      values-airgap.yaml
      templates/*.yaml
  /docker
    stella-api.Dockerfile
    stella-console.Dockerfile
    stella-orchestrator.Dockerfile
    stella-task-runner.Dockerfile
    stella-conseiller.Dockerfile
    stella-excitator.Dockerfile
    stella-policy.Dockerfile
    stella-notify.Dockerfile
    stella-export.Dockerfile
    stella-ai.Dockerfile
4.2 CI/CD flow
- Build multi‑arch with buildx; run unit/integration tests; embed version metadata and SBOM.
- Sign images; push to registry; publish Helm chart with pinned digests.
- Generate Air‑gap bundle and signatures.
- Smoke test Quickstart on fresh VM; e2e tests exercise Console and CLI parity (Epic 12).
5) APIs and contracts
No new external APIs, but every service must expose:
- GET /health/liveness and GET /health/readiness.
- GET /version returning { version, gitCommit, buildDate }.
- GET /metrics when enabled.
- Config discovery endpoint for Console with trimmed, safe values (no secrets).
- Conseiller/Excitator must expose GET /capabilities returning {"merge": false} to prove merge is disabled.
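As an illustration of the contract above, the following sketch routes the required paths in a framework-free way. The response bodies for the health endpoints, the placeholder build values, and the aggregator switch are assumptions; only the paths, the /version field names, and the {"merge": false} payload come from this document.

```python
import json
from http.server import BaseHTTPRequestHandler

# Placeholder build metadata; real values would be embedded at build time.
BUILD_INFO = {"version": "v0.0.0", "gitCommit": "unknown", "buildDate": "unknown"}


def route(path, ready=lambda: True, aggregator=False):
    """Map a request path to (status, JSON-serialisable body)."""
    if path == "/health/liveness":
        return 200, {"status": "ok"}
    if path == "/health/readiness":
        ok = bool(ready())
        return (200 if ok else 503), {"ready": ok}
    if path == "/version":
        return 200, dict(BUILD_INFO)
    if path == "/capabilities" and aggregator:
        # Conseiller/Excitator only: advertise that merge is disabled.
        return 200, {"merge": False}
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    aggregator = False  # set True in Conseiller/Excitator builds

    def do_GET(self):
        status, body = route(self.path, aggregator=self.aggregator)
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

The handler can be served with http.server.ThreadingHTTPServer for local experiments; keeping route as a pure function makes the contract easy to unit test.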
6) Documentation changes
Create/update:
- /docs/install/overview.md: Supported deployment modes, hardware requirements, network ports, quickstart vs production.
- /docs/install/compose-quickstart.md: Preconditions, one‑liner, first‑login wizard, seed data, reset/backup, common pitfalls.
- /docs/install/helm-prod.md: Prereqs, external dependencies, values reference, TLS/ingress, HPA, PDB, upgrades, rollbacks.
- /docs/install/airgap.md: Bundle verification, loading into private registry, running without internet, patching images.
- /docs/install/configuration-reference.md: The full configuration matrix with examples.
- /docs/security/supply-chain.md: Image signing, SBOMs, provenance, admission controls, non‑root posture.
- /docs/operations/health-and-readiness.md: Endpoints, probes, troubleshooting, expected states during upgrades.
- /docs/release/image-catalog.md: All image names, tags, architectures, checksums; mapping between chart version and image digests.
- /docs/console/onboarding.md: Quickstart banner, links to install docs, exporting data to production.
Add at the top of each page:
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
7) Implementation plan
New modules/artifacts
- Dockerfiles per service under /deploy/docker/ with common builder stages.
- Helm chart under /deploy/helm/stella.
- Compose quickstart under /deploy/compose/.
- Air‑gap bundle generator in CI, script tools/make-airgap-bundle.sh.
- Seed dataset packaged as container image layer or mounted config.
Changes to services
- Add health/version/metrics endpoints where missing.
- Ensure all services read config from env/files with defaults suitable for Quickstart.
- Conseiller/Excitator: add hard config flag DISABLE_MERGE=true defaulted in images and values.
- API: seed job and migration runner; serve /welcome state for Console wizard.
- Console: onboarding wizard and Quickstart banner.
- Task Runner: respect offline mode by failing gracefully if egress blocked.
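"Failing gracefully" in offline mode is easiest to reason about as a pre-flight check that runs before any network call. The sketch below is one possible shape for such a guard in the Task Runner; the exception type, function name, and allow-list parameter are hypothetical.

```python
from urllib.parse import urlsplit


class EgressBlockedError(RuntimeError):
    """Raised when a task targets a host that offline mode does not allow."""


def check_egress(url, offline, allowed_hosts=frozenset()):
    """Return the target host, or raise a descriptive error when offline mode forbids it."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"URL has no host: {url!r}")
    if offline and host not in allowed_hosts:
        raise EgressBlockedError(
            f"offline mode: egress to {host} is blocked; "
            "mirror the resource into the local registry or object store"
        )
    return host
```

Raising a dedicated error type lets the runner report an explicit, actionable failure instead of a generic connection timeout.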
Packaging & signing
- Embed SBOM in all images; publish as OCI referrers.
- Cosign sign images and attest provenance; verify in CI.
- Publish checksums and signatures on release page.
8) Engineering tasks
Images
- Author multi‑stage Dockerfiles with cache‑efficient builds.
- Add non‑root user, drop capabilities, read‑only FS, healthcheck scripts.
- Generate and attach SBOM for each image.
- Implement /health/*, /version, optional /metrics.
Compose
- Write docker-compose.yml with all core services and deps.
- Create .env.example, quickstart.sh, backup.sh, reset.sh.
- Seed job container and sample data ingestion on first run.
Helm
- Scaffold chart; values for each component; pinned digests.
- Ingress, TLS, HPA, PDB, NetworkPolicy, ServiceAccount/RBAC.
- Migration Job and upgrade hooks; readiness gates for indexers.
- Documentation of values with helm-docs generator.
Air‑gap
- Build script to save images to OCI layout; compress, sign, and checksum.
- load.sh to import into private registry and rewrite manifests.
- values-airgap.yaml with image registry overrides.
Console & API
- Onboarding wizard, Quickstart banner, links to docs.
- Seed data endpoints guarded behind QUICKSTART_MODE.
- Config discovery endpoint for console.
Security
- Cosign integration; key management; CI verification step.
- Admission policy example in docs to enforce signatures.
- Secret redaction in logs; env var audit.
Observability
- OTel config sample; /metrics endpoints; Prometheus scrape config for Compose.
- Helm values for tracing and metrics.
Validation
- Fresh VM smoke test for Compose quickstart.
- Kind cluster e2e for Helm path.
- Air‑gap install test in CI with a local registry.
Docs
- Write all pages listed in §6 with copy‑pasteable commands and screenshots.
- Include a troubleshooting matrix: symptom → probable cause → fix.
- Add “Imposed rule” header line to each page.
9) Feature changes required
- Console: Onboarding wizard, Quickstart banner, and deep links to install docs; “Copy CLI” buttons should prefer the stella container image in Quickstart if the local binary is missing.
- API: Seed job and health endpoints; version reporting; feature flag QUICKSTART_MODE.
QUICKSTART_MODE. - Registry/Release tooling: Publish image catalog and checksums; maintain compatibility matrix per chart version.
- Task Runner: Offline mode awareness and explicit error when attempting egress in air‑gap.
- Conseiller/Excitator: enforce non‑merge at runtime and show capability endpoint.
10) Acceptance criteria
- Quickstart: from clean host to working Console in under 5 minutes on a typical laptop; seed data visible; AOC rules active.
- Helm: install succeeds with external dependencies; roll forward and roll back with zero data loss; probes green.
- Air‑gap: bundle verifies, loads to a private registry, and installs without external network.
- All images: signed, SBOM‑attached, non‑root, read‑only FS, health endpoints exposed.
- Docs: a new user can complete Quickstart without assistance; a platform team can deploy the chart with only values editing.
- Conseiller/Excitator: capability endpoint confirms merge=false; tests prove aggregation‑only behavior.
11) Risks & mitigations
- Config sprawl. Centralize in /docs/install/configuration-reference.md and ship sane defaults.
- Drift between Compose and Helm. Pin digests; generate manifests from a common values source where possible; CI diff.
- Resource contention in Quickstart. Limit concurrency; ship low default worker counts; document overrides.
- Air‑gap surprises. Remove implicit egress; provide offline doc copies in bundle; deterministic artifact paths.
- Security regressions. Enforce non‑root/read‑only in CI; signature verification gates release.
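The "pin digests" and "CI diff" mitigations for Compose/Helm drift could be automated along these lines. The sketch assumes the CI job has already extracted a service-to-image-reference mapping from each source; how that extraction happens is out of scope, and the function names are hypothetical.

```python
import re

# A digest-pinned reference looks like name@sha256:<64 hex chars>.
DIGEST_REF = re.compile(r"[^@\s]+@sha256:[0-9a-f]{64}")


def unpinned(images):
    """Return the services whose image reference is not pinned by digest."""
    return sorted(name for name, ref in images.items() if not DIGEST_REF.fullmatch(ref))


def digest_of(ref):
    """Return the digest part of a reference, or None when it is tag-only."""
    return ref.partition("@")[2] or None


def drift(compose_images, helm_images):
    """Report services present in both sources whose digests differ."""
    report = {}
    for name in sorted(set(compose_images) & set(helm_images)):
        left, right = digest_of(compose_images[name]), digest_of(helm_images[name])
        if left != right:
            report[name] = (left, right)
    return report
```

Comparing digests rather than full references tolerates a registry override (as in the air‑gap values) while still catching a real version mismatch.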
12) Philosophy
- First run matters. Quickstart must be boring, predictable, and immediately useful.
- Prod isn’t a flag. Helm defaults are safe; “convenience” belongs in Quickstart, not production.
- Prove your supply chain. Signed images, SBOMs, and provenance are table stakes, not an upsell.
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.