# Advisory AI Deployment Runbook

## Scope

- Helm and Compose packaging for `advisory-ai-web` (API/plan cache) and `advisory-ai-worker` (inference/queue).
- GPU toggle (NVIDIA) for on-prem inference; defaults remain CPU-safe.
- Offline kit pickup instructions for including advisory AI artefacts.

## Helm

Values already ship in `deploy/helm/stellaops/values-*.yaml` under `services.advisory-ai-web` and `services.advisory-ai-worker`. GPU enablement (example):

```yaml
services:
  advisory-ai-worker:
    runtimeClassName: nvidia
    nodeSelector:
      nvidia.com/gpu.present: "true"
    tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
    resources:
      limits:
        nvidia.com/gpu: 1
  advisory-ai-web:
    runtimeClassName: nvidia
    resources:
      limits:
        nvidia.com/gpu: 1
```

Apply:

```bash
helm upgrade --install stellaops ./deploy/helm/stellaops \
  -f deploy/helm/stellaops/values-prod.yaml \
  -f deploy/helm/stellaops/values-mirror.yaml \
  --set services.advisory-ai-worker.resources.limits.nvidia\.com/gpu=1 \
  --set services.advisory-ai-worker.runtimeClassName=nvidia
```

## Compose

- Base profiles: `docker-compose.dev.yaml`, `stage`, `prod`, and `airgap` already include the advisory AI services and shared volumes.
- GPU overlay: `docker-compose.gpu.yaml` (adds NVIDIA device reservations and sets `ADVISORY_AI_INFERENCE_GPU=true`).

Use:

```bash
docker compose --env-file prod.env \
  -f docker-compose.prod.yaml \
  -f docker-compose.gpu.yaml up -d
```

## Offline kit pickup

- Ensure the advisory AI images are mirrored to your registry (or baked into the airgap tar) before running the offline kit build.
- Copy the following into `out/offline-kit/metadata/` before invoking the offline kit script:
  - `advisory-ai-web` image tar
  - `advisory-ai-worker` image tar
  - SBOM/provenance generated by the release pipeline
- Verify `docs/24_OFFLINE_KIT.md` includes the advisory AI entries and rerun `tests/offline/test_build_offline_kit.py` if it changes.
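The GPU overlay file is referenced above but its contents are not shown. As a rough sketch of what such an overlay typically contains (the service name comes from this runbook; the device count and exact layout are assumptions, so check the shipped `docker-compose.gpu.yaml`):

```yaml
# Hypothetical sketch of a GPU overlay; the real docker-compose.gpu.yaml may differ.
services:
  advisory-ai-worker:
    environment:
      ADVISORY_AI_INFERENCE_GPU: "true"   # toggle named in this runbook
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia              # requires the NVIDIA Container Toolkit on the host
              count: 1                    # assumed: one GPU per worker replica
              capabilities: [gpu]
```

Because Compose merges later `-f` files over earlier ones, an overlay like this only needs the keys it changes; `docker compose -f docker-compose.prod.yaml -f docker-compose.gpu.yaml config` shows the merged result before you deploy.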
## Runbook (prod quickstart)

1) Prepare secrets in an ExternalSecret or a Kubernetes secret named `stellaops-prod-core` (see Helm values).
2) Run the Helm install with prod values and GPU overrides as needed.
3) For Compose, use `prod.env` and optionally the `docker-compose.gpu.yaml` overlay.
4) Validate health:
   - `GET /healthz` on `advisory-ai-web`
   - Check that queue directories under the `advisory-ai-*` volumes remain writable
   - Confirm the inference path logs when a GPU is detected (log key `advisory.ai.inference.gpu=true`).

## Evidence to attach (sprint)

- Helm release output (rendered templates for advisory AI)
- `docker-compose config` output with and without the GPU overlay
- Offline kit metadata listing advisory AI images + SBOMs
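The writability check in the validation step can be scripted. A minimal sketch, assuming POSIX shell on the host or in the container; the example path is hypothetical, so substitute your actual `advisory-ai-*` volume mounts:

```shell
#!/usr/bin/env sh
# check_writable: report whether a queue directory exists and is writable,
# returning a matching exit code so the check can gate automation.
check_writable() {
  dir="$1"
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    printf 'OK: %s is writable\n' "$dir"
  else
    printf 'FAIL: %s is missing or read-only\n' "$dir" >&2
    return 1
  fi
}

# Example (hypothetical path):
# check_writable /var/lib/stellaops/advisory-ai-worker/queue
```

Run it once per queue volume after deploy; a non-zero exit flags a mount or permission problem before the worker starts dropping jobs.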