# StellaOps Advisory AI

Advisory AI is the retrieval-augmented assistant that synthesizes advisory and VEX evidence into operator-ready summaries, conflict explanations, and remediation plans with strict provenance.

## Responsibilities

- Generate policy-aware advisory summaries with citations back to Conseiller and Excititor evidence.
- Explain conflicting advisories/VEX statements using weights from VEX Lens and Policy Engine.
- Propose remediation hints aligned with Offline Kit staging and export bundles.
- Expose API/UI surfaces with guardrails on model prompts, outputs, and retention.

## Contributor quickstart

- Read `docs/modules/advisory-ai/AGENTS.md` before making changes; it lists required docs, determinism/offline rules, and working-directory scope.
- Keep outputs aggregation-only with stable ordering and UTC timestamps; tests must cover guardrails, tenant safety, and provenance.
- When updating contracts/telemetry, sync the relevant docs here and cross-link from sprint Decisions & Risks.

## Key components

- RAG pipeline drawing from Conseiller, Excititor, VEX Lens, Policy Engine, and SBOM Service data.
- Prompt templates and guard models enforcing provenance and redaction policies.
- Vercel/offline inference workers with deterministic caching of generated artefacts.

## Integrations & dependencies

- Authority for tenant-aware access control.
- Policy Engine for context-specific decisions and explain traces.
- Console/CLI for interaction surfaces.
- Export Center/Vuln Explorer for embedding generated briefs.

## Operational notes

- Model cache management and offline bundle packaging per Epic 8 requirements.
- Usage/latency dashboards for prompt/response monitoring with `advisory_ai_latency_seconds`, guardrail block/validation counters, and citation coverage histograms wired into the default “Advisory AI” Grafana dashboard.
- Alert policies fire when `advisory_ai_guardrail_blocks_total` or `advisory_ai_validation_failures_total` breach burn-rate thresholds (5 blocks/min, or validation failures above 1% of traffic), and when latency p95 exceeds 30 s.
- Redaction policies validated against security/LLM guardrail tests.
- Guardrail behaviour, blocked phrases, and operational alerts are detailed in `/docs/security/assistant-guardrails.md`.

## Outputs & artefacts

- **Run/plan records (deterministic):** persisted under `/app/data/{queue,plans,outputs}` (or `ADVISORYAI__STORAGE__*` overrides) with ISO timestamps, provenance hashes, and stable ordering for replay.
- **Service surfaces (air-gap friendly):** `/ops/advisory-ai/runs` streams NDJSON status; `/ops/advisory-ai/runs/{id}` returns the immutable run/plan bundle with guardrail decisions.
- **Events:** the worker emits `advisory_ai_run_completed` with digests (plan, output, guardrail) for downstream consumers; the event is feature-flagged to keep offline deployments silent.
- **Offline bundle:** `advisory-ai-bundle.tgz` packages prompts, sanitized inputs, outputs, the guardrail audit trail, and signatures; build it via the `docs/modules/advisory-ai/deployment.md` recipes to keep artefacts deterministic across air-gapped imports.
- **Observability:** metrics/logs share the `advisory_ai` meter/logger namespace (latency, guardrail blocks/validations, citation coverage). Dashboards and alerts must reference these canonical names to avoid drift.

## Deployment & configuration

- **Containers:** `advisory-ai-web` fronts the API/cache while `advisory-ai-worker` drains the queue and executes prompts. Both containers mount a shared RWX volume providing `/app/data/{queue,plans,outputs}` (defaults; configurable via `ADVISORYAI__STORAGE__*`).
- **Remote inference toggle:** set `ADVISORYAI__INFERENCE__MODE=Remote` to send sanitized prompts to an external inference tier.
  Provide `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS` (and optional `...__APIKEY`, `...__TIMEOUT`) to complete the circuit; failures fall back to the sanitized prompt and surface `inference.fallback_*` metadata.
- **Helm/Compose:** packaged manifests live under `ops/advisory-ai/` and wire the SBOM base address, queue/plan/output directories, and inference options. Helm defaults to `emptyDir` with an optional PVC; Compose creates named volumes so worker and web instances share deterministic state. See `docs/modules/advisory-ai/deployment.md` for commands.

## CLI usage

- `stella advise run --advisory-key <key> [--artifact-id id] [--artifact-purl purl] [--policy-version v] [--profile profile] [--section name] [--force-refresh] [--timeout seconds]`
  - Requests an advisory plan from the web service, enqueues execution, then polls for the generated output (default wait 120 s; a single check if `--timeout 0`).
  - Renders plan metadata (cache key, prompt template, token budget), guardrail state, provenance hashes, signatures, and citations in a deterministic table view.
  - Honors `STELLAOPS_ADVISORYAI_URL` when set; otherwise the CLI reuses the backend URL and scopes requests via `X-StellaOps-Scopes`.

## Epic alignment

- Epic 8: Advisory AI Assistant.
- DOCS-AI stories to be tracked in `../../TASKS.md`.