AdvisoryAI Deployment Guide (AIAI-31-008)

This guide covers packaging AdvisoryAI for on-prem / offline environments, toggling remote inference, and recommended scaling settings.

Artifacts

  • Dockerfile: ops/advisory-ai/Dockerfile (multi-role build for WebService / Worker).
  • Local compose: ops/advisory-ai/docker-compose.advisoryai.yaml (web + worker, shared data volume).
  • Helm chart: ops/advisory-ai/helm/ (web Deployment + optional worker Deployment + Service + PVC stub).

Build and run locally

# Build images (equivalent to the make targets advisoryai-web / advisoryai-worker)
docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-web:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.WebService/StellaOps.AdvisoryAI.WebService.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.WebService.dll .
docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-worker:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.Worker/StellaOps.AdvisoryAI.Worker.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.Worker.dll .

# Compose (offline friendly)
docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml up -d --build

Remote inference toggle

  • Default: ADVISORYAI__INFERENCE__MODE=Local (fully offline).
  • Remote: set
    • ADVISORYAI__INFERENCE__MODE=Remote
    • ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com
    • ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference
    • ADVISORYAI__INFERENCE__REMOTE__APIKEY=<token>
    • Optional: ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30
  • Guardrails are still enforced locally even in Remote mode (see the ADVISORYAI__GUARDRAILS__* options); keep secrets in a mounted env file or secret store rather than baking them into images.
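As a concrete sketch, the remote settings above can be collected into an env file that compose or Kubernetes mounts at deploy time. The file name below is a placeholder, and the API key is deliberately left out so it can be injected from a real secret:

```shell
# Placeholder env file for remote inference; values are illustrative only.
# The API key is intentionally omitted -- inject ADVISORYAI__INFERENCE__REMOTE__APIKEY
# from a secret at runtime instead of committing it here.
cat > advisoryai-remote.env <<'EOF'
ADVISORYAI__INFERENCE__MODE=Remote
ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com
ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference
ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30
EOF
grep -c '^ADVISORYAI__' advisoryai-remote.env   # → 4
```

Reference the file from compose with `env_file:` (or render it into a Kubernetes Secret) so switching back to the offline default is a one-line change.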

Storage & persistence

  • File-system queue/cache/output paths default to /app/data/{queue,plans,outputs} and are pre-created at startup.
  • Compose mounts the advisoryai-data volume; Helm uses emptyDir by default, or a PVC when storage.persistence.enabled=true.
  • In sealed/air-gapped mode, mount guardrail lists/policy knobs under /app/etc and point env vars accordingly.
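For illustration, the default layout pre-created under /app/data looks like the sketch below (using /tmp here so the snippet is runnable outside a container):

```shell
# Mirror the default queue/plans/outputs layout; /tmp stands in for /app/data.
DATA_ROOT=/tmp/advisoryai-data
mkdir -p "$DATA_ROOT/queue" "$DATA_ROOT/plans" "$DATA_ROOT/outputs"
ls "$DATA_ROOT"   # → outputs  plans  queue
```

When multiple replicas share this tree (see the scaling guidance below), the directory must live on a shared volume such as the Helm PVC.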

PostgreSQL attribution and pooling

  • AdvisoryAI knowledge-search and unified-search PostgreSQL traffic now uses a shared pooled NpgsqlDataSource instead of per-operation transient data sources or raw connections.
  • Default application_name is stellaops-advisory-ai-web/knowledge-search, which makes pg_stat_activity attribution stable for the web service.
  • Default idle-pool retention is 900 seconds so the shared pool stays warm across the 5-minute unified-search refresh cycle instead of re-opening physical sessions each run.
  • Override these with:
    • ADVISORYAI__KnowledgeSearch__DatabaseApplicationName
    • ADVISORYAI__KnowledgeSearch__DatabasePoolingEnabled
    • ADVISORYAI__KnowledgeSearch__DatabaseMinPoolSize
    • ADVISORYAI__KnowledgeSearch__DatabaseMaxPoolSize
    • ADVISORYAI__KnowledgeSearch__DatabaseConnectionIdleLifetimeSeconds
  • Existing ADVISORYAI__KnowledgeSearch__ConnectionString remains authoritative for host/database/credentials; the new options only stamp attribution and pool behavior.
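For example, a deployment that wants a larger warm pool for the web service might export the overrides below. The values are illustrative, not recommendations; the connection string itself stays unchanged:

```shell
# Illustrative pool tuning via the override env vars listed above.
export ADVISORYAI__KnowledgeSearch__DatabaseApplicationName='stellaops-advisory-ai-web/knowledge-search'
export ADVISORYAI__KnowledgeSearch__DatabasePoolingEnabled=true
export ADVISORYAI__KnowledgeSearch__DatabaseMaxPoolSize=20
export ADVISORYAI__KnowledgeSearch__DatabaseConnectionIdleLifetimeSeconds=900
```

Keeping the application name stable is what makes per-service attribution in pg_stat_activity useful, so change it only if you need to distinguish additional deployments.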

Scaling guidance

  • WebService: start with 1 replica, scale horizontally by CPU (tokenization) or queue depth; set ADVISORYAI__QUEUE__DIRECTORYPATH to a shared PVC for multi-replica web+worker.
  • Worker: scale independently; use worker.replicas in Helm or add --scale advisoryai-worker=N in compose. Workers are CPU-bound; pin via resources.requests/limits.
  • Rate limiting: have each caller send a distinct X-StellaOps-Client header so callers do not share a rate-limit bucket.
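A minimal Helm values override for the scaling knobs above might look like this. The worker.replicas and storage.persistence.enabled keys come from this guide; treat the exact layout as an assumption about the chart stub:

```shell
# Sketch of a values override; apply with `helm upgrade --install -f advisoryai-values.yaml ...`.
cat > advisoryai-values.yaml <<'EOF'
worker:
  replicas: 3          # scale workers independently of the web service
storage:
  persistence:
    enabled: true      # back /app/data with a PVC shared by web + worker
EOF
grep -c -E 'replicas|enabled' advisoryai-values.yaml   # → 2
```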

Offline posture

  • Images build from source without external runtime downloads; keep ADVISORYAI__INFERENCE__MODE=Local to stay offline.
  • For registry-mirrored environments, push stellaops-advisoryai-web and stellaops-advisoryai-worker to the allowed registry and reference via Helm image.repository.
  • Disable OTEL exporters unless explicitly permitted; logs remain structured JSON to stdout.

Air-gap checklist

  • Remote inference disabled (or routed through approved enclave).
  • Guardrail phrase list mounted read-only.
  • Data PVC scoped per tenant/project if multi-tenant; enforce scope via X-StellaOps-Scopes.
  • Validate that the /app/data volume has a backup/retention policy; cache pruning is handled by the storage options.
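Parts of the checklist can be automated with a small preflight script. The inference-mode variable is from this guide; the guardrail file path is an assumed mount point under /app/etc:

```shell
# Hypothetical air-gap preflight; warns rather than hard-failing so it is safe in CI.
: "${ADVISORYAI__INFERENCE__MODE:=Local}"
if [ "$ADVISORYAI__INFERENCE__MODE" != "Local" ]; then
  echo "WARN: inference mode is $ADVISORYAI__INFERENCE__MODE, expected Local" >&2
fi
GUARDRAIL_LIST=/app/etc/guardrail-phrases.txt   # assumed mount path
if [ -e "$GUARDRAIL_LIST" ] && [ ! -w "$GUARDRAIL_LIST" ]; then
  echo "guardrail list mounted read-only: OK"
fi
echo "mode=$ADVISORYAI__INFERENCE__MODE"
```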

Deliverables mapping

  • Compose + Dockerfile satisfy on-prem packaging.
  • Helm chart provides cluster deployment with persistence and remote toggle.
  • This guide documents scaling/offline posture required by Sprint 0110 AIAI-31-008.