# AdvisoryAI Deployment Guide (AIAI-31-008)

This guide covers packaging AdvisoryAI for on-prem/offline environments, toggling remote inference, and recommended scaling settings.

## Artifacts

- Dockerfile: `ops/advisory-ai/Dockerfile` (multi-role build for WebService / Worker).
- Local compose: `ops/advisory-ai/docker-compose.advisoryai.yaml` (web + worker, shared data volume).
- Helm chart: `ops/advisory-ai/helm/` (web Deployment + optional worker Deployment + Service + PVC stub).

## Build and run locally

```bash
# Build images
# make advisoryai-web:
docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-web:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.WebService/StellaOps.AdvisoryAI.WebService.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.WebService.dll .

# make advisoryai-worker:
docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-worker:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.Worker/StellaOps.AdvisoryAI.Worker.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.Worker.dll .

# Compose (offline friendly)
docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml up -d --build
```

## Remote inference toggle

- Default: `ADVISORYAI__INFERENCE__MODE=Local` (fully offline).
- Remote: set
  - `ADVISORYAI__INFERENCE__MODE=Remote`
  - `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com`
  - `ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference`
  - `ADVISORYAI__INFERENCE__REMOTE__APIKEY=`
  - Optional: `ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30`
- Guardrails are still enforced locally (see the `ADVISORYAI__GUARDRAILS__*` options); keep secrets in mounted env files or secret stores rather than baking them into images.

## Storage & persistence

- File-system queue/cache/output paths default to `/app/data/{queue,plans,outputs}` and are pre-created at startup.
- Compose mounts the `advisoryai-data` volume; Helm uses `emptyDir` by default, or a PVC when `storage.persistence.enabled=true`.
- In sealed/air-gapped mode, mount guardrail lists and policy knobs under `/app/etc` and point the corresponding env vars at them.

## Scaling guidance

- WebService: start with 1 replica; scale horizontally on CPU (tokenization) or queue depth. Set `ADVISORYAI__QUEUE__DIRECTORYPATH` to a shared PVC for multi-replica web + worker.
- Worker: scale independently; use `worker.replicas` in Helm or add `--scale advisoryai-worker=N` in compose. Workers are CPU-bound; pin them via `resources.requests/limits`.
- Set rate-limiter headers: send a distinct `X-StellaOps-Client` per caller to avoid shared buckets.

## Offline posture

- Images build from source without external runtime downloads; keep `ADVISORYAI__INFERENCE__MODE=Local` to stay offline.
- For registry-mirrored environments, push `stellaops-advisoryai-web` and `stellaops-advisoryai-worker` to the allowed registry and reference them via Helm `image.repository`.
- Disable OTEL exporters unless explicitly permitted; logs remain structured JSON on stdout.

## Air-gap checklist

- Remote inference disabled (or routed through an approved enclave).
- Guardrail phrase list mounted read-only.
- Data PVC scoped per tenant/project if multi-tenant; enforce scope via `X-StellaOps-Scopes`.
- Validate that the `/app/data` volume has a backup/retention policy; cache pruning is handled by the storage options.

## Deliverables mapping

- Compose + Dockerfile satisfy on-prem packaging.
- The Helm chart provides cluster deployment with persistence and the remote-inference toggle.
- This guide documents the scaling/offline posture required by Sprint 0110 AIAI-31-008.
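## Appendix: example overrides

The remote-inference toggle above can be collected into a compose override instead of exporting each variable by hand. The sketch below is illustrative only: the service name `advisoryai-web` and the env-file name are assumptions, not confirmed by `docker-compose.advisoryai.yaml`, so check the actual service names in that file before using it.

```yaml
# docker-compose.remote.override.yaml (hypothetical file name)
# Apply with:
#   docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml \
#                  -f docker-compose.remote.override.yaml up -d
services:
  advisoryai-web:                # assumed service name; verify against the compose file
    environment:
      ADVISORYAI__INFERENCE__MODE: "Remote"
      ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS: "https://inference.example.com"
      ADVISORYAI__INFERENCE__REMOTE__ENDPOINT: "/v1/inference"
      ADVISORYAI__INFERENCE__REMOTE__TIMEOUT: "00:00:30"
    # Keep the API key out of committed files; load it from a mounted env file.
    env_file:
      - advisoryai-remote.env    # hypothetical file holding ADVISORYAI__INFERENCE__REMOTE__APIKEY
```

Compose merges override files left to right, so the base file's offline defaults remain intact unless this override is explicitly added.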
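For Helm deployments, the knobs this guide names (`worker.replicas`, `storage.persistence.enabled`, `image.repository`) can be combined in one values override. The sketch below assumes conventional chart value shapes for everything beyond those three keys (`image.tag`, the `worker.resources` block); confirm the exact keys against the chart's `values.yaml` in `ops/advisory-ai/helm/`.

```yaml
# values-onprem.yaml (illustrative override)
image:
  repository: registry.internal.example.com/stellaops-advisoryai-web  # mirrored registry
  tag: dev                      # assumed key; check the chart's values.yaml
worker:
  replicas: 3                   # workers scale independently of the web service
  resources:                    # standard Kubernetes resource shape, assumed supported
    requests:
      cpu: "2"
      memory: 2Gi
    limits:
      cpu: "2"
      memory: 2Gi
storage:
  persistence:
    enabled: true               # switches the data volume from emptyDir to a PVC
```

A plausible invocation would be `helm upgrade --install advisoryai ops/advisory-ai/helm/ -f values-onprem.yaml` (release name is illustrative).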