# StellaOps Compose Profiles
These Compose bundles ship the minimum services required to exercise the scanner pipeline plus control-plane dependencies. Every profile is pinned to immutable image digests sourced from `deploy/releases/*.yaml` and is linted via `docker compose config` in CI.
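
For orientation, a digest-pinned service entry looks roughly like the sketch below; the registry path, service name, and digest are placeholders, and the authoritative values always come from the release manifests:
```yaml
# placeholder only: the real registry path and sha256 digest come from
# deploy/releases/*.yaml, never from a mutable tag
services:
  scanner-web:
    image: registry.stella-ops.org/stellaops/scanner-web@sha256:<digest-from-release-manifest>
```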
## Layout
| Path | Purpose |
| ---- | ------- |
| `docker-compose.dev.yaml` | Edge/nightly stack tuned for laptops and iterative work. |
| `docker-compose.stage.yaml` | Stable channel stack mirroring pre-production clusters. |
| `docker-compose.prod.yaml` | Production cutover stack with front-door network hand-off and Notify events enabled. |
| `docker-compose.airgap.yaml` | Stable stack with air-gapped defaults (no outbound hostnames). |
| `docker-compose.mirror.yaml` | Managed mirror topology for `*.stella-ops.org` distribution (Concelier + Excititor + CDN gateway). |
| `docker-compose.telemetry.yaml` | Optional OpenTelemetry collector overlay (mutual TLS, OTLP ingest endpoints). |
| `docker-compose.telemetry-storage.yaml` | Prometheus/Tempo/Loki storage overlay with multi-tenant defaults. |
| `docker-compose.gpu.yaml` | Optional GPU overlay enabling NVIDIA devices for Advisory AI web/worker. Apply with `-f docker-compose.<env>.yaml -f docker-compose.gpu.yaml`. |
| `env/*.env.example` | Seed `.env` files that document required secrets and ports per profile. |
| `scripts/backup.sh` | Pauses workers and creates tar.gz of Mongo/MinIO/Redis volumes (deterministic snapshot). |
| `scripts/reset.sh` | Stops the stack and removes Mongo/MinIO/Redis volumes after explicit confirmation. |
| `scripts/quickstart.sh` | Helper to validate config and start dev stack; set `USE_MOCK=1` to include `docker-compose.mock.yaml` overlay. |
| `docker-compose.mock.yaml` | Dev-only overlay with placeholder digests for missing services (orchestrator, policy-registry, packs, task-runner, VEX/Vuln stack). Use only with mock release manifest `deploy/releases/2025.09-mock-dev.yaml`. |
## Usage
```bash
cp env/dev.env.example dev.env
docker compose --env-file dev.env -f docker-compose.dev.yaml config
docker compose --env-file dev.env -f docker-compose.dev.yaml up -d
```
The stage and airgap variants behave the same way—swap the file names accordingly. All profiles expose 443/8443 for the UI and REST APIs, and they share a `stellaops` Docker network scoped to the compose project.
> **Graph Explorer reminder:** If you enable Cartographer or Graph API containers alongside these profiles, update `etc/authority.yaml` so the `cartographer-service` client is marked with `properties.serviceIdentity: "cartographer"` and carries a tenant hint. The Authority host now refuses `graph:write` tokens without that marker, so apply the configuration change before rolling out the updated images.
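
A minimal sketch of that change, assuming the client entry lives under a `clients` list in `etc/authority.yaml` (the key names other than `properties.serviceIdentity` are assumptions; adapt to the file's actual schema):
```yaml
# hypothetical excerpt of etc/authority.yaml
clients:
  - clientId: cartographer-service
    properties:
      serviceIdentity: "cartographer"   # marker required before graph:write tokens are issued
      tenant: "<tenant-id>"             # tenant hint expected by the Authority host
```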
### Telemetry collector overlay
The OpenTelemetry collector overlay is optional and can be layered on top of any profile:
```bash
./ops/devops/telemetry/generate_dev_tls.sh
docker compose -f docker-compose.telemetry.yaml up -d
python ../../ops/devops/telemetry/smoke_otel_collector.py --host localhost
docker compose -f docker-compose.telemetry-storage.yaml up -d
```
The generator script creates a development CA plus server/client certificates under
`deploy/telemetry/certs/`. The smoke test sends OTLP/HTTP payloads using the generated
client certificate and asserts the collector reports accepted traces, metrics, and logs.
The storage overlay starts Prometheus, Tempo, and Loki with multitenancy enabled so you
can validate the end-to-end pipeline before promoting changes to staging. Adjust the
configs in `deploy/telemetry/storage/` before running in production.
Mount the same certificates when running workloads so the collector can enforce mutual TLS.
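
As a sketch, a workload overlay that mounts the generated client material could look like the following; the service name and container path are assumptions, and only the host path under `deploy/telemetry/certs/` comes from the generator script:
```yaml
# hypothetical overlay: mount the dev CA and client certificate into a workload
# so it can satisfy the collector's mutual-TLS requirement
services:
  scanner-web:
    volumes:
      - ../telemetry/certs:/etc/stellaops/telemetry/certs:ro
```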
For production cutovers copy `env/prod.env.example` to `prod.env`, update the secret placeholders, and create the external network expected by the profile:
```bash
docker network create stellaops_frontdoor
docker compose --env-file prod.env -f docker-compose.prod.yaml config
```
### Scanner event stream settings
Scanner WebService can emit signed `scanner.report.*` events to Redis Streams when `SCANNER__EVENTS__ENABLED=true`. Each profile ships environment placeholders you can override in the `.env` file; a sample fragment follows the list.
- `SCANNER_EVENTS_ENABLED`: toggles emission on/off (defaults to `false`).
- `SCANNER_EVENTS_DRIVER`: event driver; currently only `redis` is supported.
- `SCANNER_EVENTS_DSN`: Redis endpoint; leave blank to reuse the queue DSN when it uses `redis://`.
- `SCANNER_EVENTS_STREAM`: stream name (`stella.events` by default).
- `SCANNER_EVENTS_PUBLISH_TIMEOUT_SECONDS`: per-publish timeout window (defaults to `5`).
- `SCANNER_EVENTS_MAX_STREAM_LENGTH`: maximum stream length before Redis trims entries (defaults to `10000`).
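
The values below are illustrative (emission switched on, defaults otherwise mirroring the list above):
```bash
# scanner event stream settings for the .env file
SCANNER_EVENTS_ENABLED=true
SCANNER_EVENTS_DRIVER=redis
# leave the DSN blank to reuse the queue DSN when it uses redis://
SCANNER_EVENTS_DSN=
SCANNER_EVENTS_STREAM=stella.events
SCANNER_EVENTS_PUBLISH_TIMEOUT_SECONDS=5
SCANNER_EVENTS_MAX_STREAM_LENGTH=10000
```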
Helm values mirror the same knobs under each service's `env` map (see `deploy/helm/stellaops/values-*.yaml`).
### Scheduler worker configuration
Every Compose profile now provisions the `scheduler-worker` container (backed by the
`StellaOps.Scheduler.Worker.Host` entrypoint). The environment placeholders exposed
in the `.env` samples match the options bound by `AddSchedulerWorker`; see the sketch after this list.
- `SCHEDULER_QUEUE_KIND`: queue transport (`Nats` or `Redis`).
- `SCHEDULER_QUEUE_NATS_URL`: NATS connection string used by planner/runner consumers.
- `SCHEDULER_STORAGE_DATABASE`: MongoDB database name for scheduler state.
- `SCHEDULER_SCANNER_BASEADDRESS`: base URL the runner uses when invoking Scanner's
`/api/v1/reports` (defaults to the in-cluster `http://scanner-web:8444`).
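
A sample fragment; the NATS URL and database name are placeholders, and only the scanner base address default comes from the documentation above:
```bash
# scheduler-worker settings (illustrative; adjust to your stack)
SCHEDULER_QUEUE_KIND=Nats
SCHEDULER_QUEUE_NATS_URL=nats://nats:4222
SCHEDULER_STORAGE_DATABASE=scheduler
SCHEDULER_SCANNER_BASEADDRESS=http://scanner-web:8444
```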
Helm deployments inherit the same defaults from `services.scheduler-worker.env` in
`values.yaml`; override them per environment as needed.
### Advisory AI configuration
`advisory-ai-web` hosts the API/plan cache while `advisory-ai-worker` executes queued tasks. Both containers mount the shared volumes (`advisory-ai-queue`, `advisory-ai-plans`, `advisory-ai-outputs`) so they always read/write the same deterministic state. New environment knobs (an illustrative fragment follows the list):
- `ADVISORY_AI_SBOM_BASEADDRESS`: endpoint the SBOM context client hits (defaults to the in-cluster Scanner URL).
- `ADVISORY_AI_INFERENCE_MODE`: `Local` (default) keeps inference on-prem; `Remote` posts sanitized prompts to the URL supplied via `ADVISORY_AI_REMOTE_BASEADDRESS`. Optional `ADVISORY_AI_REMOTE_APIKEY` carries the bearer token when remote inference is enabled.
- `ADVISORY_AI_WEB_PORT`: host port for `advisory-ai-web`.
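
The host port and remote endpoint below are placeholders; the SBOM base address mirrors the in-cluster Scanner default used elsewhere in this README:
```bash
# Advisory AI settings (illustrative)
ADVISORY_AI_SBOM_BASEADDRESS=http://scanner-web:8444
ADVISORY_AI_INFERENCE_MODE=Local
# only needed when ADVISORY_AI_INFERENCE_MODE=Remote
#ADVISORY_AI_REMOTE_BASEADDRESS=https://inference.example.internal
#ADVISORY_AI_REMOTE_APIKEY=<token>
ADVISORY_AI_WEB_PORT=8448
```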
The Helm chart mirrors these settings under `services.advisory-ai-web` / `advisory-ai-worker` and expects a PVC named `stellaops-advisory-ai-data` so both deployments can mount the same RWX volume.
### Front-door network hand-off
`docker-compose.prod.yaml` adds a `frontdoor` network so operators can attach Traefik, Envoy, or an on-prem load balancer that terminates TLS. Override `FRONTDOOR_NETWORK` in `prod.env` if your reverse proxy uses a different bridge name. Attach only the externally reachable services (Authority, Signer, Attestor, Concelier, Scanner Web, Notify Web, UI) to that network—internal infrastructure (Mongo, MinIO, RustFS, NATS) stays on the private `stellaops` network.
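
One way to wire that up is a small operator-owned override that joins a reverse proxy to the external bridge; the Traefik image, flags, and port mapping below are illustrative and not part of the shipped profiles:
```yaml
# hypothetical reverse-proxy override attached to the pre-created front-door network
services:
  traefik:
    image: traefik:v3.1
    command:
      - "--providers.docker=true"
      - "--entrypoints.websecure.address=:443"
    ports:
      - "443:443"
    networks:
      - frontdoor
networks:
  frontdoor:
    external: true
    name: ${FRONTDOOR_NETWORK:-stellaops_frontdoor}
```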
### Updating to a new release
1. Import the new manifest into `deploy/releases/` (see `deploy/README.md`).
2. Update image digests in the relevant Compose file(s); an example edit is sketched after these steps.
3. Re-run `docker compose config` to confirm the bundle is deterministic.
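
For example, steps 2 and 3 might look like this in practice, assuming the profile references images by `name@sha256:<digest>` (the service name and digests are placeholders):
```bash
# swap the digest for one service, then re-lint the bundle
sed -i 's|scanner-web@sha256:<old-digest>|scanner-web@sha256:<new-digest>|' docker-compose.prod.yaml
docker compose --env-file prod.env -f docker-compose.prod.yaml config >/dev/null && echo "bundle OK"
```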
### Mock overlay for missing digests (dev only)
Until official digests land, you can exercise Compose packaging with mock placeholders:
```bash
# assumes docker-compose.dev.yaml as the base profile
USE_MOCK=1 ./scripts/quickstart.sh env/dev.env.example
```
The overlay pins the missing services (orchestrator, policy-registry, packs-registry, task-runner, VEX/Vuln stack) to mock digests from `deploy/releases/2025.09-mock-dev.yaml` and uses `sleep infinity` commands. Replace with real digests and service commands as soon as releases publish.
Keep digests synchronized between Compose, Helm, and the release manifest to preserve reproducibility guarantees. `deploy/tools/validate-profiles.sh` performs a quick audit.
### GPU toggle for Advisory AI
GPU is disabled by default. To run inference on NVIDIA GPUs:
```bash
docker compose \
--env-file prod.env \
-f docker-compose.prod.yaml \
-f docker-compose.gpu.yaml \
up -d
```
The GPU overlay requests one GPU for `advisory-ai-worker` and `advisory-ai-web` and sets `ADVISORY_AI_INFERENCE_GPU=true`. Ensure the host has the NVIDIA container runtime and that the base compose file still sets the correct digests.
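
The shipped overlay is authoritative (and also covers `advisory-ai-web`), but conceptually it amounts to a standard Compose GPU reservation along these lines:
```yaml
# approximate shape of the GPU overlay; see docker-compose.gpu.yaml for the real definition
services:
  advisory-ai-worker:
    environment:
      ADVISORY_AI_INFERENCE_GPU: "true"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: ["gpu"]
```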