feat(advisory-ai): Add deployment guide, Dockerfile, and Helm chart for on-prem packaging
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
- Introduced a comprehensive deployment guide for AdvisoryAI, detailing local builds, remote inference toggles, and scaling guidance.
- Created a multi-role Dockerfile for building WebService and Worker images.
- Added a docker-compose file for local and offline deployment.
- Implemented a Helm chart for Kubernetes deployment with persistence and remote inference options.
- Established a new API endpoint `/advisories/summary` for deterministic summaries of observations and linksets.
- Introduced a JSON schema for risk profiles and a validator to ensure compliance with the schema.
- Added unit tests for the risk profile validator to ensure functionality and error handling.
@@ -27,9 +27,9 @@ Advisory AI is the retrieval-augmented assistant that synthesizes advisory and V
 - Guardrail behaviour, blocked phrases, and operational alerts are detailed in `/docs/security/assistant-guardrails.md`.
 
 ## Deployment & configuration
 
-- **Containers:** `advisory-ai-web` fronts the API/cache while `advisory-ai-worker` drains the queue and executes prompts. Both containers mount a shared RWX volume providing `/var/lib/advisory-ai/{queue,plans,outputs}`.
-- **Remote inference toggle:** Set `ADVISORYAI__AdvisoryAI__Inference__Mode=Remote` to send sanitized prompts to an external inference tier. Provide `ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress` (and optional `...ApiKey`) to complete the circuit; failures fall back to the sanitized prompt and surface `inference.fallback_*` metadata.
-- **Helm/Compose:** Bundled manifests wire the SBOM base address, queue/plan/output directories, and inference options via the `AdvisoryAI` configuration section. Helm expects a PVC named `stellaops-advisory-ai-data`. Compose creates named volumes so the worker and web instances share deterministic state.
+- **Containers:** `advisory-ai-web` fronts the API/cache while `advisory-ai-worker` drains the queue and executes prompts. Both containers mount a shared RWX volume providing `/app/data/{queue,plans,outputs}` (defaults; configurable via `ADVISORYAI__STORAGE__*`).
+- **Remote inference toggle:** Set `ADVISORYAI__INFERENCE__MODE=Remote` to send sanitized prompts to an external inference tier. Provide `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS` (and optional `...__APIKEY`, `...__TIMEOUT`) to complete the circuit; failures fall back to the sanitized prompt and surface `inference.fallback_*` metadata.
+- **Helm/Compose:** Packaged manifests live under `ops/advisory-ai/` and wire SBOM base address, queue/plan/output directories, and inference options. Helm defaults to `emptyDir` with optional PVC; Compose creates named volumes so worker and web instances share deterministic state. See `docs/modules/advisory-ai/deployment.md` for commands.
 
 ## CLI usage
 
 - `stella advise run <summary|conflict|remediation> --advisory-key <id> [--artifact-id id] [--artifact-purl purl] [--policy-version v] [--profile profile] [--section name] [--force-refresh] [--timeout seconds]`
docs/modules/advisory-ai/deployment.md (Normal file, 58 lines)
@@ -0,0 +1,58 @@
# AdvisoryAI Deployment Guide (AIAI-31-008)

This guide covers packaging AdvisoryAI for on-prem / offline environments, toggling remote inference, and recommended scaling settings.

## Artifacts

- Dockerfile: `ops/advisory-ai/Dockerfile` (multi-role build for WebService / Worker).
- Local compose: `ops/advisory-ai/docker-compose.advisoryai.yaml` (web + worker, shared data volume).
- Helm chart: `ops/advisory-ai/helm/` (web Deployment + optional worker Deployment + Service + PVC stub).

## Build and run locally

```bash
# Build images
docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-web:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.WebService/StellaOps.AdvisoryAI.WebService.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.WebService.dll .

docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-worker:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.Worker/StellaOps.AdvisoryAI.Worker.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.Worker.dll .

# Compose (offline friendly)
docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml up -d --build
```
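
A quick smoke test of the web image before wiring Compose or Helm can be useful. This is a minimal sketch: the `8080` port mapping is an assumption rather than a documented default, so check the image's actual listening port.

```bash
# Hedged smoke test; the port is assumed, verify against the Dockerfile/ASPNETCORE_URLS.
docker run --rm -p 8080:8080 \
  -e ADVISORYAI__INFERENCE__MODE=Local \
  stellaops-advisoryai-web:dev
```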

## Remote inference toggle

- Default: `ADVISORYAI__INFERENCE__MODE=Local` (fully offline).
- Remote: set
  - `ADVISORYAI__INFERENCE__MODE=Remote`
  - `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com`
  - `ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference`
  - `ADVISORYAI__INFERENCE__REMOTE__APIKEY=<token>`
  - Optional: `ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30`
- Guardrails are still enforced locally (see the `ADVISORYAI__GUARDRAILS__*` options); keep secrets in mounted env files or secret stores rather than baking them into images. One wiring pattern is sketched after this list.
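
One way to wire the toggle without rebuilding images is an env file, as below. The file name is illustrative, and whether these variables actually reach the containers depends on how the compose file declares them (`environment`/`env_file`); treat this as a pattern, not the packaged wiring.

```bash
# Hypothetical env file; values mirror the bullets above.
cat > advisoryai.remote.env <<'EOF'
ADVISORYAI__INFERENCE__MODE=Remote
ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com
ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference
ADVISORYAI__INFERENCE__REMOTE__APIKEY=changeme
ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30
EOF

docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml \
  --env-file advisoryai.remote.env up -d
```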

## Storage & persistence

- File-system queue/cache/output paths default to `/app/data/{queue,plans,outputs}` and are pre-created at startup.
- Compose mounts the `advisoryai-data` volume; Helm uses `emptyDir` by default, or a PVC when `storage.persistence.enabled=true` (see the install sketch after this list).
- In sealed/air-gapped mode, mount guardrail lists and policy knobs under `/app/etc` and point the env vars accordingly.
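
For cluster installs, enabling the PVC path might look like the following. The chart path comes from the Artifacts section and `storage.persistence.enabled` from the bullet above; the release name `advisory-ai` is an assumption.

```bash
# Install/upgrade with persistence enabled (release name is illustrative).
helm upgrade --install advisory-ai ops/advisory-ai/helm \
  --set storage.persistence.enabled=true
```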

## Scaling guidance

- WebService: start with 1 replica and scale horizontally on CPU (tokenization) or queue depth; point `ADVISORYAI__QUEUE__DIRECTORYPATH` at a shared PVC when running multiple web and worker replicas.
- Worker: scales independently; use `worker.replicas` in Helm or `--scale advisoryai-worker=N` in Compose (see the sketch after this list). Workers are CPU-bound; pin them via `resources.requests/limits`.
- Rate limiting: have each caller send a distinct `X-StellaOps-Client` header so unrelated callers do not share rate-limit buckets.
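
Concretely, scaling workers could look like this; `3` is an arbitrary example count and the Helm release name is an assumption:

```bash
# Compose: run three worker containers against the shared data volume.
docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml up -d --scale advisoryai-worker=3

# Helm: bump the worker Deployment's replica count.
helm upgrade --install advisory-ai ops/advisory-ai/helm --set worker.replicas=3
```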

## Offline posture

- Images build from source without external runtime downloads; keep `ADVISORYAI__INFERENCE__MODE=Local` to stay offline.
- For registry-mirrored environments, push `stellaops-advisoryai-web` and `stellaops-advisoryai-worker` to the allowed registry and reference them via the Helm `image.repository` value, as sketched after this list.
- Disable OTEL exporters unless explicitly permitted; logs remain structured JSON on stdout.
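
A mirroring pass under these constraints might look like the sketch below; `registry.internal` is a placeholder for your allowed registry, and any values key beyond `image.repository` is an assumption about the chart.

```bash
# Retag and push both images to the mirrored registry (placeholder host).
docker tag stellaops-advisoryai-web:dev registry.internal/stellaops-advisoryai-web:dev
docker push registry.internal/stellaops-advisoryai-web:dev
docker tag stellaops-advisoryai-worker:dev registry.internal/stellaops-advisoryai-worker:dev
docker push registry.internal/stellaops-advisoryai-worker:dev

# Point the chart at the mirror.
helm upgrade --install advisory-ai ops/advisory-ai/helm \
  --set image.repository=registry.internal/stellaops-advisoryai-web
```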

## Air-gap checklist

- Remote inference disabled (or routed through an approved enclave).
- Guardrail phrase list mounted read-only (see the mount sketch after this list).
- Data PVC scoped per tenant/project if multi-tenant; enforce scope via `X-StellaOps-Scopes`.
- Validate that the `/app/data` volume has a backup/retention policy; cache pruning is handled by the storage options.
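
For the read-only guardrail mount, a bind mount with the `:ro` flag is one option. The host path below is illustrative, and running the worker standalone like this is a sketch rather than the packaged deployment:

```bash
# Mount the guardrail phrase list read-only under /app/etc (host path is hypothetical).
docker run --rm \
  -v /opt/stellaops/guardrails:/app/etc:ro \
  -e ADVISORYAI__INFERENCE__MODE=Local \
  stellaops-advisoryai-worker:dev
```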

## Deliverables mapping

- Compose + Dockerfile satisfy on-prem packaging.
- The Helm chart provides cluster deployment with persistence and the remote-inference toggle.
- This guide documents the scaling and offline posture required by Sprint 0110 AIAI-31-008.