feat(advisory-ai): Add deployment guide, Dockerfile, and Helm chart for on-prem packaging
- Introduced a comprehensive deployment guide for AdvisoryAI, detailing local builds, remote inference toggles, and scaling guidance.
- Created a multi-role Dockerfile for building WebService and Worker images.
- Added a docker-compose file for local and offline deployment.
- Implemented a Helm chart for Kubernetes deployment with persistence and remote inference options.
- Established a new API endpoint `/advisories/summary` for deterministic summaries of observations and linksets.
- Introduced a JSON schema for risk profiles and a validator to ensure compliance with the schema.
- Added unit tests for the risk profile validator covering success and error-handling paths.
@@ -27,9 +27,9 @@ Advisory AI is the retrieval-augmented assistant that synthesizes advisory and V
 - Guardrail behaviour, blocked phrases, and operational alerts are detailed in `/docs/security/assistant-guardrails.md`.

 ## Deployment & configuration

-- **Containers:** `advisory-ai-web` fronts the API/cache while `advisory-ai-worker` drains the queue and executes prompts. Both containers mount a shared RWX volume providing `/var/lib/advisory-ai/{queue,plans,outputs}`.
-- **Remote inference toggle:** Set `ADVISORYAI__AdvisoryAI__Inference__Mode=Remote` to send sanitized prompts to an external inference tier. Provide `ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress` (and optional `...ApiKey`) to complete the circuit; failures fall back to the sanitized prompt and surface `inference.fallback_*` metadata.
-- **Helm/Compose:** Bundled manifests wire the SBOM base address, queue/plan/output directories, and inference options via the `AdvisoryAI` configuration section. Helm expects a PVC named `stellaops-advisory-ai-data`. Compose creates named volumes so the worker and web instances share deterministic state.
+- **Containers:** `advisory-ai-web` fronts the API/cache while `advisory-ai-worker` drains the queue and executes prompts. Both containers mount a shared RWX volume providing `/app/data/{queue,plans,outputs}` (defaults; configurable via `ADVISORYAI__STORAGE__*`).
+- **Remote inference toggle:** Set `ADVISORYAI__INFERENCE__MODE=Remote` to send sanitized prompts to an external inference tier. Provide `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS` (and optional `...__APIKEY`, `...__TIMEOUT`) to complete the circuit; failures fall back to the sanitized prompt and surface `inference.fallback_*` metadata.
+- **Helm/Compose:** Packaged manifests live under `ops/advisory-ai/` and wire SBOM base address, queue/plan/output directories, and inference options. Helm defaults to `emptyDir` with optional PVC; Compose creates named volumes so worker and web instances share deterministic state. See `docs/modules/advisory-ai/deployment.md` for commands.

 ## CLI usage

 - `stella advise run <summary|conflict|remediation> --advisory-key <id> [--artifact-id id] [--artifact-purl purl] [--policy-version v] [--profile profile] [--section name] [--force-refresh] [--timeout seconds]`
docs/modules/advisory-ai/deployment.md (new file, 58 lines)
@@ -0,0 +1,58 @@
# AdvisoryAI Deployment Guide (AIAI-31-008)

This guide covers packaging AdvisoryAI for on-prem / offline environments, toggling remote inference, and recommended scaling settings.

## Artifacts

- Dockerfile: `ops/advisory-ai/Dockerfile` (multi-role build for WebService / Worker).
- Local compose: `ops/advisory-ai/docker-compose.advisoryai.yaml` (web + worker, shared data volume).
- Helm chart: `ops/advisory-ai/helm/` (web Deployment + optional worker Deployment + Service + PVC stub).
## Build and run locally

```bash
# Build images (Make targets: advisoryai-web, advisoryai-worker)
docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-web:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.WebService/StellaOps.AdvisoryAI.WebService.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.WebService.dll .

docker build -f ops/advisory-ai/Dockerfile -t stellaops-advisoryai-worker:dev \
  --build-arg PROJECT=src/AdvisoryAI/StellaOps.AdvisoryAI.Worker/StellaOps.AdvisoryAI.Worker.csproj \
  --build-arg APP_DLL=StellaOps.AdvisoryAI.Worker.dll .

# Compose (offline friendly)
docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml up -d --build
```
## Remote inference toggle

- Default: `ADVISORYAI__INFERENCE__MODE=Local` (fully offline).
- Remote: set the following (a `docker run` sketch follows this list):
  - `ADVISORYAI__INFERENCE__MODE=Remote`
  - `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com`
  - `ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference`
  - `ADVISORYAI__INFERENCE__REMOTE__APIKEY=<token>`
  - Optional: `ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30`
- Guardrails are still enforced locally (see `ADVISORYAI__GUARDRAILS__*` options); keep secrets in mounted env files or secret stores rather than baking them into images.
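A minimal sketch of the remote toggle applied to the web image built above. The secret path, volume layout, and inference host are illustrative assumptions, not values prescribed by the chart or Compose file.

```bash
# Illustrative: run the web image with remote inference enabled.
# Host, secret path, and mounts are placeholders; adapt to your environment.
docker run -d --name advisoryai-web \
  -e ADVISORYAI__INFERENCE__MODE=Remote \
  -e ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS=https://inference.example.com \
  -e ADVISORYAI__INFERENCE__REMOTE__ENDPOINT=/v1/inference \
  -e ADVISORYAI__INFERENCE__REMOTE__APIKEY="$(cat /run/secrets/advisoryai_api_key)" \
  -e ADVISORYAI__INFERENCE__REMOTE__TIMEOUT=00:00:30 \
  -v "$(pwd)/advisoryai-data:/app/data" \
  stellaops-advisoryai-web:dev
```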
## Storage & persistence

- File-system queue/cache/output paths default to `/app/data/{queue,plans,outputs}` and are pre-created at startup.
- Compose mounts the `advisoryai-data` volume; Helm uses `emptyDir` by default or a PVC when `storage.persistence.enabled=true` (see the Helm sketch below).
- In sealed/air-gapped mode, mount guardrail lists/policy knobs under `/app/etc` and point the corresponding env vars at them.
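A sketch of enabling persistence at install time. The release name and namespace are placeholders; only `storage.persistence.enabled` and `worker.replicas` are values documented in this guide, so check any other keys against the chart in `ops/advisory-ai/helm/`.

```bash
# Illustrative Helm install with a PVC-backed data volume and two workers.
helm upgrade --install advisory-ai ops/advisory-ai/helm \
  --namespace stellaops --create-namespace \
  --set storage.persistence.enabled=true \
  --set worker.replicas=2
```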
## Scaling guidance

- WebService: start with 1 replica; scale horizontally on CPU (tokenization) or queue depth. Point `ADVISORYAI__QUEUE__DIRECTORYPATH` at a shared PVC when web and worker run with multiple replicas.
- Worker: scale independently; use `worker.replicas` in Helm or `--scale advisoryai-worker=N` in Compose (example below). Workers are CPU-bound; pin them via `resources.requests/limits`.
- Rate limiting: have each caller send a distinct `X-StellaOps-Client` header so callers do not share rate-limit buckets.
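The Compose-side scaling command, using the worker service name from the bundled Compose file:

```bash
# Run three worker replicas alongside the single web front end.
docker compose -f ops/advisory-ai/docker-compose.advisoryai.yaml up -d --scale advisoryai-worker=3
```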
## Offline posture

- Images build from source without external runtime downloads; keep `ADVISORYAI__INFERENCE__MODE=Local` to stay offline.
- For registry-mirrored environments, push `stellaops-advisoryai-web` and `stellaops-advisoryai-worker` to the allowed registry and reference them via the Helm `image.repository` value (tag/push commands below).
- Disable OTEL exporters unless explicitly permitted; logs remain structured JSON on stdout.
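A sketch of mirroring the locally built images into an approved registry; the registry host is a placeholder.

```bash
# Retag and push both images to the mirrored registry.
docker tag stellaops-advisoryai-web:dev registry.example.internal/stellaops-advisoryai-web:dev
docker tag stellaops-advisoryai-worker:dev registry.example.internal/stellaops-advisoryai-worker:dev
docker push registry.example.internal/stellaops-advisoryai-web:dev
docker push registry.example.internal/stellaops-advisoryai-worker:dev
```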
## Air-gap checklist

- Remote inference disabled (or routed through approved enclave).
- Guardrail phrase list mounted read-only.
- Data PVC scoped per tenant/project if multi-tenant; enforce scope via `X-StellaOps-Scopes`.
- Validate that the `/app/data` volume has a backup/retention policy; cache pruning is handled by storage options.
## Deliverables mapping

- Compose + Dockerfile satisfy on-prem packaging.
- Helm chart provides cluster deployment with persistence and remote toggle.
- This guide documents scaling/offline posture required by Sprint 0110 AIAI-31-008.
docs/modules/concelier/api/advisories-summary.md (new file, 76 lines)
@@ -0,0 +1,76 @@
# `/advisories/summary` API (draft v1)

Status: draft; aligns with LNM v1 (frozen 2025-11-17) and observation/linkset models already shipped in Concelier Core.

## Intent

- Provide graph overlays and consoles a deterministic summary of observations and linksets without derived verdicts.
- Preserve provenance and tenant isolation; results are stable for a given tenant + filter set.
## Request

- Method: `GET`
- Path: `/advisories/summary`
- Headers:
  - `X-Stella-Tenant`: required.
  - `X-Stella-Request-Id`: optional, for tracing.
- Query parameters (see the example request below):
  - `purl` (optional, repeatable) — filter by component coordinates.
  - `alias` (optional, repeatable) — advisory aliases (CVE, GHSA, vendor IDs); case-insensitive; normalized and sorted server-side.
  - `source` (optional, repeatable) — upstream source identifiers.
  - `confidence_gte` (optional, decimal 0–1) — minimum linkset confidence.
  - `conflicts_only` (optional, bool) — when `true`, return only summaries with conflicts present.
  - `after` (optional, cursor) — opaque, tenant-scoped cursor for pagination.
  - `take` (optional, int, default 100, max 500) — page size.
  - `sort` (optional, enum: `advisory`, `observedAt`, default `advisory`) — always ascending and stable.
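An example request; the host is a placeholder and the filter values are illustrative.

```bash
# Tenant header is required; the request id is optional but useful for tracing.
curl -sS "https://concelier.example.internal/advisories/summary?alias=CVE-2024-1234&confidence_gte=0.5&take=50" \
  -H "X-Stella-Tenant: acme" \
  -H "X-Stella-Request-Id: req-20251122-0001"
```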
## Response (200)

```json
{
  "meta": {
    "tenant": "acme",
    "count": 2,
    "next": "opaque-cursor-or-null",
    "sort": "advisory"
  },
  "items": [
    {
      "advisoryKey": "cve-2024-1234",
      "aliases": ["GHSA-xxxx-yyyy", "CVE-2024-1234"],
      "source": "nvd",
      "observedAt": "2025-11-22T15:04:05Z",
      "linksetId": "ls_01H9A8...",
      "confidence": 0.82,
      "conflicts": [
        { "field": "severity", "codes": ["severity-mismatch"], "sources": ["nvd", "vendor"] }
      ],
      "counts": {
        "observations": 3,
        "conflictFields": 1
      },
      "provenance": {
        "observationIds": ["obs_01H9...", "obs_01H9..."],
        "schema": "lnm-1.0"
      }
    }
  ]
}
```

- Ordering: stable by `sort` then `advisoryKey` then `linksetId`.
- No derived verdicts or merged severity values; conflicts are emitted as structured markers only.
## Errors

- `400` `ERR_AOC_001`: missing/invalid tenant or unsupported filter.
- `400` `ERR_AOC_006`: `take` exceeds the maximum or the cursor is invalid.
- `401`/`403`: auth/tenant scope failures.
- `429`: returned when per-tenant rate limits are enforced (retry headers surfaced).
## Determinism & caching
|
||||
- Response depends solely on tenant + normalized filters + underlying linksets; no clock-based variation beyond `observedAt` from stored records.
|
||||
- Cache key (for optional summary cache):
|
||||
- `tenant|purls(sorted)|aliases(sorted)|sources(sorted)|confidence_gte|conflicts_only|sort|take|after`
|
||||
- Return transparency headers: `X-Stella-Cache-Key` (sha256-16), `X-Stella-Cache-Hit`, `X-Stella-Cache-Ttl`.
|
||||
- Cache TTL should default to 0 (disabled) until validated; determinism required for hits.
|
||||
|
||||
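A purely illustrative derivation of the cache key. The delimiter handling for repeated or empty filters and the exact meaning of "sha256-16" (the first 16 hex characters are assumed here) are not specified above, so treat the service's canonical form as authoritative.

```bash
# Assumed normalization: lower-cased, sorted filter values joined by '|', empty fields left blank.
key_material='acme|pkg:npm/lodash@4.17.21|cve-2024-1234|nvd|0.5|false|advisory|100|'
printf '%s' "$key_material" | sha256sum | cut -c1-16
```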
## Notes

- The cursor must encode the final sort tuple (`advisoryKey`, `linksetId`, `observedAt`) to keep pagination stable when new data arrives (see the sketch below).
- All string comparisons for filters are case-insensitive; the server normalizes to lower-case and sorts before query execution.
- Conflicts mirror LNM conflict codes already produced by Core (no new codes introduced here).
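The cursor is opaque and its wire format is implementation-defined; this sketch only illustrates the information it must capture (the sort tuple of the last item on the page).

```bash
# Hypothetical cursor payload: base64 of the last item's sort tuple.
printf '%s' '{"advisoryKey":"cve-2024-1234","linksetId":"ls_01H9A8...","observedAt":"2025-11-22T15:04:05Z"}' | base64
```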
@@ -31,6 +31,21 @@ Status: draft · aligned with LNM v1 (frozen 2025-11-17)
- Timestamps use UTC ticks; content hashes retain upstream digest prefix (e.g., `sha256:`) to distinguish sources.
- Cache TTL defaults to 0 (disabled) unless configured; safe for offline/air-gapped deployments.
## Summary cache (for `/advisories/summary`)

- Optional, disabled by default; enables deterministic paging for graph/console overlays.
- Key material (pipe-delimited, normalized/sorted where applicable):
  - `tenant`
  - `purls[]` (sorted)
  - `aliases[]` (sorted, lower-case)
  - `sources[]` (sorted, lower-case)
  - `confidence_gte`
  - `conflicts_only`
  - `sort`
  - `take`
  - `afterCursor` (opaque; includes last sort tuple)
- Transparency headers (if enabled): `X-Stella-Cache-Key` (sha256-16), `X-Stella-Cache-Hit`, `X-Stella-Cache-Ttl`.
- Ordering for cacheable results: `sort` (advisory|observedAt), then `advisoryKey`, then `linksetId`; the cursor encodes the tuple to keep pagination stable when new linksets land.
## Testing notes

- Unit coverage: `AdvisoryChunkCacheKeyTests` (ordering, filter casing) and `AdvisoryChunkBuilderTests` (observationPath pointers influence chunk IDs); a filter command for running both is shown below.
- Integration tests should assert headers when the cache is enabled; disable the cache for tests that assert body-only determinism.
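A convenient way to run just these suites locally, assuming they live in the test projects of the main solution:

```bash
# Run only the cache-key and chunk-builder unit tests by class name.
dotnet test --filter "FullyQualifiedName~AdvisoryChunkCacheKeyTests|FullyQualifiedName~AdvisoryChunkBuilderTests"
```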
@@ -38,3 +53,4 @@ Status: draft · aligned with LNM v1 (frozen 2025-11-17)
## TODOs / follow-ups

- Add integration test that exercises a cache hit path and validates transparency headers.
- Document cache configuration knobs in `appsettings.*` once finalized.
- Add summary cache integration test once `/advisories/summary` endpoint is implemented.