devops folders consolidate
@@ -1,150 +1,459 @@
|
||||
# Stella Ops Compose Profiles
|
||||
# Stella Ops Docker Compose Profiles
|
||||
|
||||
These Compose bundles ship the minimum services required to exercise the scanner pipeline plus control-plane dependencies. Every profile is pinned to immutable image digests sourced from `deploy/releases/*.yaml` and is linted via `docker compose config` in CI.
|
||||
Consolidated Docker Compose configuration for the StellaOps platform. All profiles use immutable image digests from `deploy/releases/*.yaml` and are validated via `docker compose config` in CI.
|
||||
|
||||
## Layout
|
||||
## Quick Reference
|
||||
|
||||
| I want to... | Command |
|--------------|---------|
| Run the full platform | `docker compose -f docker-compose.stella-ops.yml up -d` |
| Add observability | `docker compose -f docker-compose.stella-ops.yml -f docker-compose.telemetry.yml up -d` |
| Run CI/testing infrastructure | `docker compose -f docker-compose.testing.yml --profile ci up -d` |
| Deploy with China compliance | See [China Compliance](#china-compliance-sm2sm3sm4) |
| Deploy with Russia compliance | See [Russia Compliance](#russia-compliance-gost) |
| Deploy with EU compliance | See [EU Compliance](#eu-compliance-eidas) |
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
### Core Stack Files
|
||||
|
||||
| File | Purpose |
|------|---------|
| `docker-compose.stella-ops.yml` | **Main stack**: PostgreSQL 18.1, Valkey 9.0.1, RustFS, Rekor v2, all StellaOps services |
| `docker-compose.telemetry.yml` | **Observability**: OpenTelemetry collector, Prometheus, Tempo, Loki |
| `docker-compose.testing.yml` | **CI/Testing**: Test databases, mock services, Gitea for integration tests |
| `docker-compose.dev.yml` | **Minimal dev infrastructure**: PostgreSQL, Valkey, RustFS only |
|
||||
|
||||
### Specialized Infrastructure
|
||||
|
||||
| File | Purpose |
|------|---------|
| `docker-compose.bsim.yml` | **BSim analysis**: PostgreSQL for Ghidra binary similarity corpus |
| `docker-compose.corpus.yml` | **Function corpus**: PostgreSQL for function behavior database |
| `docker-compose.sealed-ci.yml` | **Air-gapped CI**: Sealed testing environment with authority, signer, attestor |
| `docker-compose.telemetry-offline.yml` | **Offline observability**: Air-gapped Loki, Promtail, OTEL collector, Tempo, Prometheus |
|
||||
|
||||
### Regional Compliance Overlays
|
||||
|
||||
| File | Purpose | Jurisdiction |
|------|---------|--------------|
| `docker-compose.compliance-china.yml` | SM2/SM3/SM4 ShangMi crypto configuration | China (OSCCA) |
| `docker-compose.compliance-russia.yml` | GOST R 34.10-2012 crypto configuration | Russia (FSB) |
| `docker-compose.compliance-eu.yml` | eIDAS qualified trust services configuration | EU |
|
||||
|
||||
### Crypto Provider Overlays
|
||||
|
||||
| File | Purpose | Use Case |
|------|---------|----------|
| `docker-compose.crypto-sim.yml` | Universal crypto simulation | Testing without licensed crypto |
| `docker-compose.cryptopro.yml` | CryptoPro CSP (real GOST) | Production Russia deployments |
| `docker-compose.sm-remote.yml` | SM Remote service (real SM2) | Production China deployments |
|
||||
|
||||
### Additional Overlays
|
||||
|
||||
| File | Purpose | Use Case |
|------|---------|----------|
| `docker-compose.gpu.yaml` | NVIDIA GPU acceleration | Advisory AI inference with GPU |
| `docker-compose.cas.yaml` | Content Addressable Storage | Dedicated CAS with retention policies |
| `docker-compose.tile-proxy.yml` | Rekor tile caching proxy | Air-gapped Sigstore deployments |
|
||||
|
||||
### Supporting Files
|
||||
|
||||
| Path | Purpose |
|
||||
| ---- | ------- |
|
||||
| `docker-compose.dev.yaml` | Edge/nightly stack tuned for laptops and iterative work. |
|
||||
| `docker-compose.stage.yaml` | Stable channel stack mirroring pre-production clusters. |
|
||||
| `docker-compose.prod.yaml` | Production cutover stack with front-door network hand-off and Notify events enabled. |
|
||||
| `docker-compose.airgap.yaml` | Stable stack with air-gapped defaults (no outbound hostnames). |
|
||||
| `docker-compose.mirror.yaml` | Managed mirror topology for `*.stella-ops.org` distribution (Concelier + Excititor + CDN gateway). |
|
||||
| `docker-compose.rekor-v2.yaml` | Rekor v2 tiles overlay (MySQL-free) for bundled transparency logs. |
|
||||
| `docker-compose.telemetry.yaml` | Optional OpenTelemetry collector overlay (mutual TLS, OTLP ingest endpoints). |
|
||||
| `docker-compose.telemetry-storage.yaml` | Prometheus/Tempo/Loki storage overlay with multi-tenant defaults. |
|
||||
| `docker-compose.gpu.yaml` | Optional GPU overlay enabling NVIDIA devices for Advisory AI web/worker. Apply with `-f docker-compose.<env>.yaml -f docker-compose.gpu.yaml`. |
|
||||
| `env/*.env.example` | Seed `.env` files that document required secrets and ports per profile. |
|
||||
| `scripts/backup.sh` | Pauses workers and creates tar.gz of Mongo/MinIO/Valkey volumes (deterministic snapshot). |
|
||||
| `scripts/reset.sh` | Stops the stack and removes Mongo/MinIO/Valkey volumes after explicit confirmation. |
|
||||
| `scripts/quickstart.sh` | Helper to validate config and start dev stack; set `USE_MOCK=1` to include `docker-compose.mock.yaml` overlay. |
|
||||
| `docker-compose.mock.yaml` | Dev-only overlay with placeholder digests for missing services (orchestrator, policy-registry, packs, task-runner, VEX/Vuln stack). Use only with mock release manifest `deploy/releases/2025.09-mock-dev.yaml`. |
|
||||
|------|---------|
|
||||
| `env/*.env.example` | Environment variable templates per profile |
|
||||
| `scripts/backup.sh` | Create deterministic volume snapshots |
|
||||
| `scripts/reset.sh` | Stop stack and remove volumes (with confirmation) |
|
||||
|
||||
## Usage
|
||||
---
|
||||
|
||||
## Usage Patterns
|
||||
|
||||
### Basic Development
|
||||
|
||||
```bash
|
||||
cp env/dev.env.example dev.env
|
||||
docker compose --env-file dev.env -f docker-compose.dev.yaml config
|
||||
docker compose --env-file dev.env -f docker-compose.dev.yaml up -d
|
||||
# Copy environment template
|
||||
cp env/stellaops.env.example .env
|
||||
|
||||
# Validate configuration
|
||||
docker compose -f docker-compose.stella-ops.yml config
|
||||
|
||||
# Start the platform
|
||||
docker compose -f docker-compose.stella-ops.yml up -d
|
||||
|
||||
# View logs
|
||||
docker compose -f docker-compose.stella-ops.yml logs -f scanner-web
|
||||
```
|
||||
|
||||
The stage and airgap variants behave the same way—swap the file names accordingly. All profiles expose 443/8443 for the UI and REST APIs, and they share a `stellaops` Docker network scoped to the compose project.
|
||||
|
||||
### Rekor v2 overlay (tiles)
|
||||
|
||||
Use the overlay below and set the Rekor env vars in your `.env` file (see
|
||||
`env/dev.env.example`):
|
||||
|
||||
```bash
|
||||
docker compose --env-file dev.env \
|
||||
-f docker-compose.dev.yaml \
|
||||
-f docker-compose.rekor-v2.yaml \
|
||||
--profile sigstore up -d
|
||||
```
|
||||
|
||||
|
||||
> **Surface.Secrets:** set `SCANNER_SURFACE_SECRETS_PROVIDER`/`SCANNER_SURFACE_SECRETS_ROOT` in your `.env` and point `SURFACE_SECRETS_HOST_PATH` to the decrypted bundle path (default `./offline/surface-secrets`). The stack mounts that path read-only into Scanner Web/Worker so `secret://` references resolve without embedding plaintext.
|
||||
|
||||
> **Graph Explorer reminder:** If you enable Cartographer or Graph API containers alongside these profiles, update `etc/authority.yaml` so the `cartographer-service` client is marked with `properties.serviceIdentity: "cartographer"` and carries a tenant hint. The Authority host now refuses `graph:write` tokens without that marker, so apply the configuration change before rolling out the updated images.
|
||||
|
||||
### Telemetry collector overlay
|
||||
|
||||
The OpenTelemetry collector overlay is optional and can be layered on top of any profile:
|
||||
### With Observability
|
||||
|
||||
```bash
|
||||
# Generate TLS certificates for telemetry
|
||||
./ops/devops/telemetry/generate_dev_tls.sh
|
||||
docker compose -f docker-compose.telemetry.yaml up -d
|
||||
python ../../ops/devops/telemetry/smoke_otel_collector.py --host localhost
|
||||
docker compose -f docker-compose.telemetry-storage.yaml up -d
|
||||
|
||||
# Start platform with telemetry
|
||||
docker compose -f docker-compose.stella-ops.yml \
|
||||
-f docker-compose.telemetry.yml up -d
|
||||
```
|
||||
|
||||
The generator script creates a development CA plus server/client certificates under
|
||||
`deploy/telemetry/certs/`. The smoke test sends OTLP/HTTP payloads using the generated
|
||||
client certificate and asserts the collector reports accepted traces, metrics, and logs.
|
||||
The storage overlay starts Prometheus, Tempo, and Loki with multitenancy enabled so you
|
||||
can validate the end-to-end pipeline before promoting changes to staging. Adjust the
|
||||
configs in `deploy/telemetry/storage/` before running in production.
|
||||
Mount the same certificates when running workloads so the collector can enforce mutual TLS.
|
||||
|
||||
For production cutovers copy `env/prod.env.example` to `prod.env`, update the secret placeholders, and create the external network expected by the profile:
|
||||
### CI/Testing Infrastructure
|
||||
|
||||
```bash
# Start CI infrastructure only (different ports to avoid conflicts)
docker compose -f docker-compose.testing.yml --profile ci up -d

# Start mock services for integration testing
docker compose -f docker-compose.testing.yml --profile mock up -d

# Start Gitea for SCM integration tests
docker compose -f docker-compose.testing.yml --profile gitea up -d

# Start everything
docker compose -f docker-compose.testing.yml --profile all up -d
```
|
||||
|
||||
**Test Infrastructure Ports:**
|
||||
| Service | Port | Purpose |
|---------|------|---------|
| postgres-test | 5433 | PostgreSQL 18 for tests |
| valkey-test | 6380 | Valkey for cache/queue tests |
| rustfs-test | 8180 | S3-compatible storage |
| mock-registry | 5001 | Container registry mock |
| gitea | 3000 | Git hosting for SCM tests |
|
||||
|
||||
---
|
||||
|
||||
## Regional Compliance Deployments
|
||||
|
||||
### China Compliance (SM2/SM3/SM4)
|
||||
|
||||
**For Testing (simulation):**
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-china.yml \
  -f docker-compose.crypto-sim.yml up -d
```
|
||||
|
||||
**For Production (real SM crypto):**
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-china.yml \
  -f docker-compose.sm-remote.yml up -d
```
|
||||
|
||||
**With OSCCA-certified HSM:**
|
||||
```bash
# Set HSM connection details in environment
export SM_REMOTE_HSM_URL="https://sm-hsm.example.com:8900"
export SM_REMOTE_HSM_API_KEY="your-api-key"

docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-china.yml \
  -f docker-compose.sm-remote.yml up -d
```
|
||||
|
||||
**Algorithms:**
|
||||
- SM2: Public key cryptography (GM/T 0003-2012)
|
||||
- SM3: Hash function, 256-bit (GM/T 0004-2012)
|
||||
- SM4: Block cipher, 128-bit (GM/T 0002-2012)
|
||||
|
||||
---
|
||||
|
||||
### Russia Compliance (GOST)
|
||||
|
||||
**For Testing (simulation):**
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-russia.yml \
  -f docker-compose.crypto-sim.yml up -d
```
|
||||
|
||||
**For Production (CryptoPro CSP):**
|
||||
```bash
# CryptoPro requires EULA acceptance
CRYPTOPRO_ACCEPT_EULA=1 docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-russia.yml \
  -f docker-compose.cryptopro.yml up -d
```
|
||||
|
||||
**Requirements for CryptoPro:**
|
||||
- CryptoPro CSP license files in `opt/cryptopro/downloads/`
|
||||
- `CRYPTOPRO_ACCEPT_EULA=1` environment variable
|
||||
- Valid CryptoPro container images
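
The first two requirements can be checked before running `docker compose`. The following is a minimal sketch, not part of the repository: `cryptopro_preflight` is a hypothetical helper name, and it only verifies that the EULA variable is `1` and that the downloads directory contains at least one file. It does not validate the license files or the container images.

```bash
# Hypothetical preflight check for the CryptoPro overlay.
# Returns 0 when the EULA flag is set and the downloads directory is non-empty.
cryptopro_preflight() {
  local downloads_dir="${1:-opt/cryptopro/downloads}"
  local status=0

  # The overlay expects explicit EULA acceptance.
  if [ "${CRYPTOPRO_ACCEPT_EULA:-0}" != "1" ]; then
    echo "CRYPTOPRO_ACCEPT_EULA must be 1" >&2
    status=1
  fi

  # License files are expected under the downloads directory.
  if [ ! -d "$downloads_dir" ] || [ -z "$(ls -A "$downloads_dir" 2>/dev/null)" ]; then
    echo "no files found in $downloads_dir" >&2
    status=1
  fi

  return "$status"
}
```

A typical call would be `CRYPTOPRO_ACCEPT_EULA=1 cryptopro_preflight && echo "ready"`, run from the directory that contains `opt/cryptopro/downloads/`.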
|
||||
|
||||
**Algorithms:**
|
||||
- GOST R 34.10-2012: Digital signature (256/512-bit)
|
||||
- GOST R 34.11-2012: Hash function (Streebog, 256/512-bit)
|
||||
- GOST R 34.12-2015: Block cipher (Kuznyechik, Magma)
|
||||
|
||||
---
|
||||
|
||||
### EU Compliance (eIDAS)
|
||||
|
||||
**For Testing (simulation):**
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-eu.yml \
  -f docker-compose.crypto-sim.yml up -d
```
|
||||
|
||||
**For Production:**
|
||||
EU eIDAS deployments typically integrate with external Qualified Trust Service Providers (QTSPs) rather than hosting crypto locally. Configure your QTSP integration in the application settings.
|
||||
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-eu.yml up -d
```
|
||||
|
||||
**Standards:**
|
||||
- ETSI TS 119 312 compliant algorithms
|
||||
- Qualified electronic signatures
|
||||
- QTSP integration for qualified trust services
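
The regional commands in this section follow one pattern: the main stack file, the regional compliance overlay, then at most one crypto provider overlay. The sketch below captures that mapping in a small function. `compliance_overlays` and the `sim`/`prod` mode names are hypothetical and introduced only for illustration; the file names are the ones listed in this README.

```bash
# Hypothetical helper: print the -f arguments for a region and mode.
# region: china | russia | eu     mode: sim | prod
compliance_overlays() {
  local region="$1" mode="$2" provider=""

  case "${region}:${mode}" in
    china:sim|russia:sim|eu:sim) provider="docker-compose.crypto-sim.yml" ;;
    china:prod)  provider="docker-compose.sm-remote.yml" ;;
    russia:prod) provider="docker-compose.cryptopro.yml" ;;
    eu:prod)     provider="" ;;  # QTSP integration is external, no provider overlay
    *) echo "unsupported combination: ${region}/${mode}" >&2; return 1 ;;
  esac

  # Main stack first, then the regional overlay, then the provider (if any).
  printf '%s\n' "-f docker-compose.stella-ops.yml" \
                "-f docker-compose.compliance-${region}.yml"
  if [ -n "$provider" ]; then
    printf '%s\n' "-f ${provider}"
  fi
}
```

For example, `docker compose $(compliance_overlays china sim) up -d` expands to the China simulation command. The Russia production case still needs `CRYPTOPRO_ACCEPT_EULA=1` in the environment.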
|
||||
|
||||
---
|
||||
|
||||
## Crypto Simulation Details
|
||||
|
||||
The `docker-compose.crypto-sim.yml` overlay provides a unified simulation service for all sovereign crypto profiles:
|
||||
|
||||
| Algorithm ID | Simulation | Use Case |
|--------------|------------|----------|
| `SM2`, `sm.sim` | HMAC-SHA256 | China testing |
| `GOST12-256`, `GOST12-512` | HMAC-SHA256 | Russia testing |
| `ru.magma.sim`, `ru.kuznyechik.sim` | HMAC-SHA256 | Russia testing |
| `DILITHIUM3`, `FALCON512`, `pq.sim` | HMAC-SHA256 | Post-quantum testing |
| `fips.sim`, `eidas.sim`, `kcmvp.sim` | ECDSA P-256 | FIPS/EU/Korea testing |
|
||||
|
||||
**Important:** Simulation is for testing only. It uses deterministic HMAC or static ECDSA keys and is not suitable for production or compliance certification.
|
||||
|
||||
---
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
### Infrastructure Services
|
||||
|
||||
| Service | Default Port | Purpose |
|---------|--------------|---------|
| PostgreSQL | 5432 | Primary database |
| Valkey | 6379 | Cache, queues, events |
| RustFS | 8080 | S3-compatible artifact storage |
| Rekor v2 | (internal) | Sigstore transparency log |
|
||||
|
||||
### Application Services
|
||||
|
||||
| Service | Default Port | Purpose |
|---------|--------------|---------|
| Authority | 8440 | OAuth2/OIDC identity provider |
| Signer | 8441 | Cryptographic signing |
| Attestor | 8442 | SLSA attestation |
| Scanner Web | 8444 | SBOM/vulnerability scanning API |
| Concelier | 8445 | Advisory aggregation |
| Notify Web | 8446 | Notification service |
| Issuer Directory | 8447 | CSAF publisher registry |
| Advisory AI Web | 8448 | AI-powered advisory analysis |
| Web UI | 8443 | Angular frontend |
|
||||
|
||||
### Environment Variables
|
||||
|
||||
Key variables (see `env/*.env.example` for complete list):
|
||||
|
||||
```bash
# Database
POSTGRES_USER=stellaops
POSTGRES_PASSWORD=<secret>
POSTGRES_DB=stellaops_platform

# Authority
AUTHORITY_ISSUER=https://authority.example.com

# Scanner
SCANNER_EVENTS_ENABLED=false
SCANNER_OFFLINEKIT_ENABLED=false

# Crypto (for compliance overlays)
STELLAOPS_CRYPTO_PROFILE=default          # or: china, russia, eu
STELLAOPS_CRYPTO_ENABLE_SIM=0             # set to 1 for simulation

# CryptoPro (Russia only)
CRYPTOPRO_ACCEPT_EULA=0                   # must be 1 to use CryptoPro

# SM Remote (China only)
SM_SOFT_ALLOWED=1                         # software-only SM2
SM_REMOTE_HSM_URL=                        # optional: OSCCA-certified HSM
```
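
Before starting the stack, it can help to confirm that the variables above are actually filled in. The sketch below is a hypothetical helper (`missing_env_keys` is not part of the repository) that prints each requested key that is absent or empty in an env file. It assumes plain `KEY=value` lines, and it treats placeholder text such as `<secret>` as a value, so placeholders still need a manual review.

```bash
# Hypothetical helper: list keys that are missing or empty in an env file.
# Usage: missing_env_keys <env-file> KEY [KEY...]
missing_env_keys() {
  local env_file="$1"
  shift
  local key
  for key in "$@"; do
    # A key counts as set only if '=' is followed by a character that is
    # neither whitespace nor the start of a comment.
    if ! grep -Eq "^${key}=[^[:space:]#]" "$env_file"; then
      printf '%s\n' "$key"
    fi
  done
}
```

For example, `missing_env_keys .env POSTGRES_USER POSTGRES_PASSWORD POSTGRES_DB AUTHORITY_ISSUER` prints nothing when all four keys have values.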
|
||||
|
||||
---
|
||||
|
||||
## Networking
|
||||
|
||||
All profiles use a shared `stellaops` Docker network. Production deployments can attach a `frontdoor` network for reverse proxy integration:
|
||||
|
||||
```bash
|
||||
# Create external network for load balancer
|
||||
docker network create stellaops_frontdoor
|
||||
docker compose --env-file prod.env -f docker-compose.prod.yaml config
|
||||
|
||||
# Set in environment
|
||||
export FRONTDOOR_NETWORK=stellaops_frontdoor
|
||||
```
|
||||
|
||||
### Scanner event stream settings
|
||||
Only externally-reachable services (Authority, Signer, Attestor, Concelier, Scanner Web, Notify Web, UI) attach to the frontdoor network. Infrastructure services (PostgreSQL, Valkey, RustFS) remain on the private network.
|
||||
|
||||
Scanner WebService can emit signed `scanner.report.*` events to Redis Streams when `SCANNER__EVENTS__ENABLED=true`. Each profile ships environment placeholders you can override in the `.env` file:
|
||||
---
|
||||
|
||||
- `SCANNER_EVENTS_ENABLED` – toggle emission on/off (defaults to `false`).
|
||||
- `SCANNER_EVENTS_DRIVER` – currently only `redis` is supported.
|
||||
- `SCANNER_EVENTS_DSN` – Redis endpoint; leave blank to reuse the queue DSN when it uses `redis://`.
|
||||
- `SCANNER_EVENTS_STREAM` – stream name (`stella.events` by default).
|
||||
- `SCANNER_EVENTS_PUBLISH_TIMEOUT_SECONDS` – per-publish timeout window (defaults to `5`).
|
||||
- `SCANNER_EVENTS_MAX_STREAM_LENGTH` – max stream length before Redis trims entries (defaults to `10000`).
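
As an illustration, the settings listed above could be written out in a `.env` file as follows. The values are the documented defaults. Leaving the DSN blank assumes the queue DSN already uses `redis://` and should be reused.

```bash
# Scanner event stream settings (values mirror the documented defaults)
SCANNER_EVENTS_ENABLED=false
SCANNER_EVENTS_DRIVER=redis
SCANNER_EVENTS_DSN=
SCANNER_EVENTS_STREAM=stella.events
SCANNER_EVENTS_PUBLISH_TIMEOUT_SECONDS=5
SCANNER_EVENTS_MAX_STREAM_LENGTH=10000
```

Set `SCANNER_EVENTS_ENABLED=true` to turn emission on; the other values only take effect once it is enabled.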
|
||||
## Sigstore Tools
|
||||
|
||||
Helm values mirror the same knobs under each service’s `env` map (see `deploy/helm/stellaops/values-*.yaml`).
|
||||
|
||||
### Scheduler worker configuration
|
||||
|
||||
Every Compose profile now provisions the `scheduler-worker` container (backed by the
|
||||
`StellaOps.Scheduler.Worker.Host` entrypoint). The environment placeholders exposed
|
||||
in the `.env` samples match the options bound by `AddSchedulerWorker`:
|
||||
|
||||
- `SCHEDULER_QUEUE_KIND` – queue transport (`Nats` or `Redis`).
|
||||
- `SCHEDULER_QUEUE_NATS_URL` – NATS connection string used by planner/runner consumers.
|
||||
- `SCHEDULER_STORAGE_DATABASE` – PostgreSQL database name for scheduler state.
|
||||
- `SCHEDULER_SCANNER_BASEADDRESS` – base URL the runner uses when invoking Scanner’s
|
||||
`/api/v1/reports` (defaults to the in-cluster `http://scanner-web:8444`).
|
||||
|
||||
Helm deployments inherit the same defaults from `services.scheduler-worker.env` in
|
||||
`values.yaml`; override them per environment as needed.
|
||||
|
||||
### Advisory AI configuration
|
||||
|
||||
`advisory-ai-web` hosts the API/plan cache while `advisory-ai-worker` executes queued tasks. Both containers mount the shared volumes (`advisory-ai-queue`, `advisory-ai-plans`, `advisory-ai-outputs`) so they always read/write the same deterministic state. New environment knobs:
|
||||
|
||||
- `ADVISORY_AI_SBOM_BASEADDRESS` – endpoint the SBOM context client hits (defaults to the in-cluster Scanner URL).
|
||||
- `ADVISORY_AI_INFERENCE_MODE` – `Local` (default) keeps inference on-prem; `Remote` posts sanitized prompts to the URL supplied via `ADVISORY_AI_REMOTE_BASEADDRESS`. Optional `ADVISORY_AI_REMOTE_APIKEY` carries the bearer token when remote inference is enabled.
|
||||
- `ADVISORY_AI_WEB_PORT` – host port for `advisory-ai-web`.
|
||||
|
||||
The Helm chart mirrors these settings under `services.advisory-ai-web` / `advisory-ai-worker` and expects a PVC named `stellaops-advisory-ai-data` so both deployments can mount the same RWX volume.
|
||||
|
||||
### Front-door network hand-off
|
||||
|
||||
`docker-compose.prod.yaml` adds a `frontdoor` network so operators can attach Traefik, Envoy, or an on-prem load balancer that terminates TLS. Override `FRONTDOOR_NETWORK` in `prod.env` if your reverse proxy uses a different bridge name. Attach only the externally reachable services (Authority, Signer, Attestor, Concelier, Scanner Web, Notify Web, UI) to that network—internal infrastructure (Mongo, MinIO, RustFS, NATS) stays on the private `stellaops` network.
|
||||
|
||||
### Updating to a new release
|
||||
|
||||
1. Import the new manifest into `deploy/releases/` (see `deploy/README.md`).
|
||||
2. Update image digests in the relevant Compose file(s).
|
||||
3. Re-run `docker compose config` to confirm the bundle is deterministic.
|
||||
|
||||
### Mock overlay for missing digests (dev only)
|
||||
|
||||
Until official digests land, you can exercise Compose packaging with mock placeholders:
|
||||
Enable Sigstore CLI tools (rekor-cli, cosign) with the `sigstore` profile:
|
||||
|
||||
```bash
|
||||
# assumes docker-compose.dev.yaml as the base profile
|
||||
USE_MOCK=1 ./scripts/quickstart.sh env/dev.env.example
|
||||
docker compose -f docker-compose.stella-ops.yml --profile sigstore up -d
|
||||
```
|
||||
|
||||
The overlay pins the missing services (orchestrator, policy-registry, packs-registry, task-runner, VEX/Vuln stack) to mock digests from `deploy/releases/2025.09-mock-dev.yaml` and starts their real entrypoints so integration flows can be exercised end-to-end. Replace the mock pins with production digests once releases publish; keep the mock overlay dev-only.
|
||||
---
|
||||
|
||||
Keep digests synchronized between Compose, Helm, and the release manifest to preserve reproducibility guarantees. `deploy/tools/validate-profiles.sh` performs a quick audit.
|
||||
## GPU Support for Advisory AI
|
||||
|
||||
### GPU toggle for Advisory AI
|
||||
|
||||
GPU is disabled by default. To run inference on NVIDIA GPUs:
|
||||
GPU is disabled by default. To enable NVIDIA GPU inference:
|
||||
|
||||
```bash
|
||||
docker compose \
|
||||
--env-file prod.env \
|
||||
-f docker-compose.prod.yaml \
|
||||
-f docker-compose.gpu.yaml \
|
||||
up -d
|
||||
docker compose -f docker-compose.stella-ops.yml \
|
||||
-f docker-compose.gpu.yaml up -d
|
||||
```
|
||||
|
||||
The GPU overlay requests one GPU for `advisory-ai-worker` and `advisory-ai-web` and sets `ADVISORY_AI_INFERENCE_GPU=true`. Ensure the host has the NVIDIA container runtime and that the base compose file still sets the correct digests.
|
||||
**Requirements:**
|
||||
- NVIDIA GPU with CUDA support
|
||||
- nvidia-container-toolkit installed
|
||||
- Docker configured with nvidia runtime
|
||||
|
||||
---
|
||||
|
||||
## Content Addressable Storage (CAS)
|
||||
|
||||
The CAS overlay provides dedicated RustFS instances with retention policies for different artifact types:
|
||||
|
||||
```bash
# Standalone CAS infrastructure
docker compose -f docker-compose.cas.yaml up -d

# Combined with main stack
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.cas.yaml up -d
```
|
||||
|
||||
**CAS Services:**
|
||||
| Service | Port | Purpose |
|---------|------|---------|
| rustfs-cas | 8180 | Runtime facts, signals, replay artifacts |
| rustfs-evidence | 8181 | Merkle roots, hash chains, evidence bundles (immutable) |
| rustfs-attestation | 8182 | DSSE envelopes, in-toto attestations (immutable) |
|
||||
|
||||
**Retention Policies (configurable via `env/cas.env.example`):**
|
||||
- Vulnerability DB: 7 days
|
||||
- SBOM artifacts: 365 days
|
||||
- Scan results: 90 days
|
||||
- Evidence bundles: Indefinite (immutable)
|
||||
- Attestations: Indefinite (immutable)
|
||||
|
||||
---
|
||||
|
||||
## Tile Proxy (Air-Gapped Sigstore)
|
||||
|
||||
For air-gapped deployments, the tile proxy keeps a local cache of Rekor transparency log tiles fetched from the public Sigstore instance:
|
||||
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.tile-proxy.yml up -d
```
|
||||
|
||||
**Tile Proxy vs Rekor v2:**
|
||||
- Use `--profile sigstore` when running your own Rekor transparency log locally
|
||||
- Use `docker-compose.tile-proxy.yml` when caching tiles from public Sigstore (rekor.sigstore.dev)
|
||||
|
||||
**Configuration:**
|
||||
| Variable | Default | Purpose |
|----------|---------|---------|
| `REKOR_SERVER_URL` | `https://rekor.sigstore.dev` | Upstream Rekor to proxy |
| `TILE_PROXY_SYNC_ENABLED` | `true` | Enable periodic tile sync |
| `TILE_PROXY_SYNC_SCHEDULE` | `0 */6 * * *` | Sync every 6 hours |
| `TILE_PROXY_CACHE_MAX_SIZE_GB` | `10` | Local cache size limit |
|
||||
|
||||
The proxy syncs tiles on schedule and serves them to internal services for offline verification.
|
||||
|
||||
---
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Backup
|
||||
|
||||
```bash
./scripts/backup.sh                    # Creates timestamped tar.gz of volumes
```
|
||||
|
||||
### Reset
|
||||
|
||||
```bash
./scripts/reset.sh                     # Stops stack, removes volumes (requires confirmation)
```
|
||||
|
||||
### Validate Configuration
|
||||
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml config
```
|
||||
|
||||
### Update to New Release
|
||||
|
||||
1. Import new manifest to `deploy/releases/`
2. Update image digests in compose files
3. Run `docker compose config` to validate
4. Run `deploy/tools/validate-profiles.sh` for audit
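
Since every profile is expected to pin images by digest, a quick text check after step 2 can catch a tag that slipped through. The sketch below is an illustration only and is not what `validate-profiles.sh` does: `unpinned_images` is a hypothetical function that scans the raw Compose file for `image:` values lacking an `@sha256:` digest. It does not resolve variable interpolation, so an entry such as `${SOME_IMAGE}` would be reported as unpinned.

```bash
# Hypothetical check: print every image reference that is not pinned
# to a sha256 digest. Prints nothing when all images are pinned.
unpinned_images() {
  local compose_file="$1"
  grep -E '^[[:space:]]*image:' "$compose_file" \
    | sed -E 's/^[[:space:]]*image:[[:space:]]*//; s/[[:space:]]*(#.*)?$//' \
    | tr -d "\"'" \
    | grep -Ev '@sha256:[0-9a-f]{64}$' || true
}
```

For example, `unpinned_images docker-compose.stella-ops.yml` should produce no output for a correctly pinned file.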
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Port Conflicts
|
||||
|
||||
Override ports in your `.env` file:
|
||||
```bash
POSTGRES_PORT=5433
VALKEY_PORT=6380
SCANNER_WEB_PORT=8544
```
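
When several overrides accumulate, two variables can end up pointing at the same host port. The following sketch looks for that case inside a single env file. `find_duplicate_ports` is a hypothetical helper, and it assumes port variables follow the `*_PORT=<number>` naming used above; it does not detect conflicts with processes outside the stack.

```bash
# Hypothetical helper: report port numbers assigned to more than one
# *_PORT variable in an env file, as "<port>: VAR_A VAR_B".
find_duplicate_ports() {
  local env_file="$1"
  grep -E '^[A-Z0-9_]+_PORT=[0-9]+' "$env_file" \
    | sed -E 's/[[:space:]]*(#.*)?$//' \
    | awk -F= '{ names[$2] = names[$2] " " $1; count[$2]++ }
               END { for (p in count) if (count[p] > 1) print p ":" names[p] }'
}
```

An empty result from `find_duplicate_ports .env` means no two port variables in that file share a value.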
|
||||
|
||||
### Service Dependencies
|
||||
|
||||
Services declare `depends_on` with health checks. If a service fails to start, check its dependencies:
|
||||
```bash
docker compose -f docker-compose.stella-ops.yml ps
docker compose -f docker-compose.stella-ops.yml logs postgres
docker compose -f docker-compose.stella-ops.yml logs valkey
```
|
||||
|
||||
### Crypto Provider Issues
|
||||
|
||||
For crypto simulation issues:
|
||||
```bash
# Check sim-crypto service (pass the same -f files used to start the stack)
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.crypto-sim.yml logs sim-crypto
curl http://localhost:18090/keys
```
|
||||
|
||||
For CryptoPro issues:
|
||||
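```bash
# Verify EULA acceptance in the current shell
echo "${CRYPTOPRO_ACCEPT_EULA:-unset}"  # must be 1

# Check CryptoPro service (pass the same -f files used to start the stack)
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-russia.yml \
  -f docker-compose.cryptopro.yml logs cryptopro-csp
```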
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Deployment Upgrade Runbook](../../docs/operations/devops/runbooks/deployment-upgrade.md)
|
||||
- [Local CI Guide](../../docs/technical/testing/LOCAL_CI_GUIDE.md)
|
||||
- [Crypto Profile Configuration](../../docs/security/crypto-profile-configuration.md)
|
||||
- [Regional Deployments](../../docs/operations/regional-deployments.md)
|
||||
|
||||