# AdvisoryAI Deployment Guide (AIAI-31-008)
This guide covers packaging AdvisoryAI for on-prem / offline environments, toggling remote inference, and recommended scaling settings.
## Artifacts

- Dockerfile: `ops/advisory-ai/Dockerfile` (multi-role build for WebService / Worker).
- Local compose: `ops/advisory-ai/docker-compose.advisoryai.yaml` (web + worker, shared data volume).
- Helm chart: `ops/advisory-ai/helm/` (web Deployment + optional worker Deployment + Service + PVC stub).
## Build and run locally

```bash
# Build images
make advisoryai-web     # = docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-web:dev \
                        #     --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.WebService/StellaOps.AdvisoryAI.WebService.csproj \
                        #     --build-arg APP_DLL=StellaOps.AdvisoryAI.WebService.dll .
make advisoryai-worker  # = docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-worker:dev \
                        #     --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.Worker/StellaOps.AdvisoryAI.Worker.csproj \
                        #     --build-arg APP_DLL=StellaOps.AdvisoryAI.Worker.dll .

# Compose (offline friendly)
docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml up -d --build
```
## Remote inference toggle

- Default: `ADVISORYAI__INFERENCE__MODE=Local` (fully offline).
- Remote: set
  - `ADVISORYAI__INFERENCE__MODE=Remote`
  - `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com`
  - `ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference`
  - `ADVISORYAI__INFERENCE__REMOTE__APIKEY=<token>`
- Optional: `ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30`
- Guardrails are still enforced locally (see the `ADVISORYAI__GUARDRAILS__*` options); keep secrets in mounted env files/secrets rather than baked into images.
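For compose-based deployments, the toggle can be collected into an env file and mounted; a minimal sketch, where the file name `advisoryai-remote.env` and the token placeholder are assumptions:

```shell
# Sketch: remote-inference settings as an env file (mount via compose env_file
# or a Kubernetes Secret; do not bake the token into the image).
cat > advisoryai-remote.env <<'EOF'
ADVISORYAI__INFERENCE__MODE=Remote
ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com
ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference
ADVISORYAI__INFERENCE__REMOTE__APIKEY=<token-from-secret-store>
ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30
EOF

# Sanity-check the toggle before deploying
grep -q '^ADVISORYAI__INFERENCE__MODE=Remote$' advisoryai-remote.env && echo "remote mode set"
```

Reverting to the offline default is a one-line change back to `ADVISORYAI__INFERENCE__MODE=Local`.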
## Storage & persistence

- File-system queue/cache/output paths default to `/app/data/{queue,plans,outputs}` and are pre-created at startup.
- Compose mounts the `advisoryai-data` volume; Helm uses `emptyDir` by default, or a PVC when `storage.persistence.enabled=true`.
- In sealed/air-gapped mode, mount guardrail lists/policy knobs under `/app/etc` and point the env vars accordingly.
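When bind-mounting a host directory instead of the named volume, the expected layout can be pre-created; a minimal sketch where the host directory name `advisoryai-data` is an assumption and the subdirectories mirror the `/app/data/{queue,plans,outputs}` defaults above:

```shell
# Sketch: reproduce the container's data layout on the host for a bind mount.
mkdir -p advisoryai-data/queue advisoryai-data/plans advisoryai-data/outputs
ls advisoryai-data
```

The services pre-create these paths at startup anyway, so this only matters when host-side permissions or backup tooling need the directories to exist up front.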
## PostgreSQL attribution and pooling

- AdvisoryAI knowledge-search and unified-search PostgreSQL traffic now uses a shared pooled `NpgsqlDataSource` instead of per-operation transient data sources or raw connections.
- The default `application_name` is `stellaops-advisory-ai-web/knowledge-search`, which makes `pg_stat_activity` attribution stable for the web service.
- The default idle-pool retention is `900` seconds, so the shared pool stays warm across the 5-minute unified-search refresh cycle instead of re-opening physical sessions each run.
- Override these with:
  - `ADVISORYAI__KnowledgeSearch__DatabaseApplicationName`
  - `ADVISORYAI__KnowledgeSearch__DatabasePoolingEnabled`
  - `ADVISORYAI__KnowledgeSearch__DatabaseMinPoolSize`
  - `ADVISORYAI__KnowledgeSearch__DatabaseMaxPoolSize`
  - `ADVISORYAI__KnowledgeSearch__DatabaseConnectionIdleLifetimeSeconds`
- The existing `ADVISORYAI__KnowledgeSearch__ConnectionString` remains authoritative for host/database/credentials; the new options only stamp attribution and pool behavior.
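The overrides are plain environment variables; a minimal sketch with illustrative pool sizes (only the 900-second idle lifetime matches the documented default; the other values are assumptions to size for your workload):

```shell
# Sketch: pooling/attribution overrides for the web service environment.
export ADVISORYAI__KnowledgeSearch__DatabaseApplicationName="stellaops-advisory-ai-web/knowledge-search"
export ADVISORYAI__KnowledgeSearch__DatabasePoolingEnabled=true
export ADVISORYAI__KnowledgeSearch__DatabaseMinPoolSize=1          # illustrative
export ADVISORYAI__KnowledgeSearch__DatabaseMaxPoolSize=16         # illustrative
export ADVISORYAI__KnowledgeSearch__DatabaseConnectionIdleLifetimeSeconds=900  # documented default
```

Keeping a small non-zero min pool size preserves the warm-pool behavior across the 5-minute refresh cycle described above.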
## Scaling guidance

- WebService: start with 1 replica; scale horizontally on CPU (tokenization) or queue depth. Set `ADVISORYAI__QUEUE__DIRECTORYPATH` to a shared PVC for multi-replica web + worker deployments.
- Worker: scale independently; use `worker.replicas` in Helm, or add `--scale advisoryai-worker=N` in compose. Workers are CPU-bound; pin via `resources.requests`/`limits`.
- Set rate-limiter headers: send `X-StellaOps-Client` per caller to avoid shared buckets.
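For Helm, the worker knobs can be collected in a values override; a minimal sketch where `worker.replicas` and `resources.requests`/`limits` come from this guide, while the concrete replica count and CPU/memory numbers are assumptions:

```shell
# Sketch: write a Helm values override for worker scaling.
cat > advisoryai-scale.values.yaml <<'EOF'
worker:
  replicas: 3
  resources:
    requests:
      cpu: "1"
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 2Gi
EOF

# Apply with something like:
#   helm upgrade --install advisoryai ops/advisory-ai/helm -f advisoryai-scale.values.yaml
```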
## Offline posture

- Images build from source without external runtime downloads; keep `ADVISORYAI__INFERENCE__MODE=Local` to stay offline.
- For registry-mirrored environments, push `stellaops-advisoryai-web` and `stellaops-advisoryai-worker` to the allowed registry and reference them via the Helm `image.repository` value.
- Disable OTEL exporters unless explicitly permitted; logs remain structured JSON on stdout.
## Air-gap checklist

- Remote inference disabled (or routed through an approved enclave).
- Guardrail phrase list mounted read-only.
- Data PVC scoped per tenant/project if multi-tenant; enforce scope via `X-StellaOps-Scopes`.
- Validate that the `/app/data` volume has a backup/retention policy; cache pruning is handled by the storage options.
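The first checklist item can be verified mechanically before rollout; a minimal sketch that inspects a rendered env file (the file name and its contents here are illustrative, not part of the shipped artifacts):

```shell
# Sketch: preflight check that remote inference is disabled in a rendered env file.
cat > advisoryai.env <<'EOF'
ADVISORYAI__INFERENCE__MODE=Local
EOF

if grep -q '^ADVISORYAI__INFERENCE__MODE=Local$' advisoryai.env; then
  echo "air-gap preflight: remote inference disabled"
else
  echo "WARN: remote inference appears enabled" >&2
fi
```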
## Deliverables mapping
- Compose + Dockerfile satisfy on-prem packaging.
- Helm chart provides cluster deployment with persistence and remote toggle.
- This guide documents scaling/offline posture required by Sprint 0110 AIAI-31-008.