> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

# Advisory AI Architecture

_Updated: 2025-11-03 • Owner: Docs Guild & Advisory AI Guild • Status: Draft_

This document decomposes how Advisory AI transforms immutable evidence into deterministic, explainable outputs. It complements `docs/modules/advisory-ai/architecture.md` with service-level views, data flows, and integration checklists for Sprint 110.

## 1. High-level flow

```
  Conseiller / Excititor / SBOM / Policy
                  |  (retrievers)
                  v
  +------------------------------+
  | AdvisoryPipelineOrchestrator |
  | (plan generation)            |
  +------------------------------+
                  |  plan + cache key
                  v
  +------------------------------+
  | Guarded Prompt Runtime       |
  | (profile-specific)           |
  +------------------------------+
                  |  validated output + citations
                  v
  +------------------------------+
  | Cache & Provenance           |
  | (Mongo + DSSE optional)      |
  +------------------------------+
           |            \
           v             v
       REST API     CLI / Console
```

Key stages:

1. **Retrieval** – deterministic chunkers pull AOC-compliant data: Conseiller advisories, Excititor VEX statements, SBOM context, Policy explain traces, optional runtime telemetry.
2. **Plan generation** – the orchestrator builds an `AdvisoryTaskPlan` (Summary / Conflict / Remediation) containing budgets, prompt template IDs, cache keys, and metadata.
3. **Guarded inference** – profile-specific prompt runners execute with guardrails (redaction, injection defence, citation enforcement). Failures are logged and downstream consumers receive deterministic errors.
4. **Persistence** – outputs are hashed (`outputHash`), referenced with `inputDigest`, optionally sealed with DSSE, and exposed for CLI/Console consumption.

## 2. Component responsibilities

| Component | Description | Notes |
|-----------|-------------|-------|
| `AdvisoryRetrievalService` | Facade that composes Conseiller/Excititor/SBOM/Policy clients into context packs. | Deterministic ordering; per-source limits enforced. |
| `AdvisoryPipelineOrchestrator` | Builds task plans, selects prompt templates, allocates token budgets. | Tenant-scoped; memoises by cache key. |
| `GuardrailService` | Applies redaction filters, prompt allowlists, validation schemas, and DSSE sealing. | Shares configuration with Security Guild. |
| `ProfileRegistry` | Maps profile IDs to runtime implementations (local model, remote connector). | Enforces tenant consent and allowlists. |
| `AdvisoryOutputStore` | Mongo collection storing cached artefacts plus provenance manifest. | TTL defaults 24h; DSSE metadata optional. |
| `AdvisoryPipelineWorker` | Background executor for queued jobs (future sprint once 004A wires queue). | Consumes `advisory.pipeline.execute` messages. |

## 3. Data contracts

### 3.1 `AdvisoryTaskRequest`

```json
{
  "taskType": "Summary",
  "advisoryKey": "csaf:redhat:RHSA-2025:1001",
  "artifactId": "registry.stella-ops.internal/runtime/api",
  "artifactPurl": "pkg:oci/runtime-api@sha256:d2c3...",
  "policyVersion": "2025.10.1",
  "profile": "fips-local",
  "preferredSections": ["Summary", "Remediation"],
  "forceRefresh": false
}
```

- `taskType` ∈ `Summary|Conflict|Remediation`.
- Provide either `artifactId` or `artifactPurl` for remediation tasks (unlocks dependency analysis).
- `forceRefresh` bypasses cache and regenerates output (deterministic with identical inputs).

### 3.2 `AdvisoryPipelinePlanResponse`

Returned when plan preview is enabled; summarises chunk and vector usage so operators can verify evidence.
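
The verification an operator performs on a plan preview can also be scripted. The sketch below is illustrative only. It assumes the response has already been parsed into a dictionary with the `chunks`, `vectors`, and `budget` fields of this response type, and it assumes that every vector match is expected to point at a chunk listed in `chunks`, which this document does not state explicitly. The helper names `unresolved_matches` and `total_token_budget` are hypothetical and not part of the Advisory AI API.

```python
from typing import Any


def unresolved_matches(plan: dict[str, Any]) -> list[str]:
    """Return chunk IDs cited by vector matches but absent from `chunks`.

    Assumption: a match that references an unlisted chunk is worth flagging.
    """
    known = {chunk["chunkId"] for chunk in plan.get("chunks", [])}
    missing: list[str] = []
    for vector in plan.get("vectors", []):
        for match in vector.get("matches", []):
            chunk_id = match["chunkId"]
            # Preserve first-seen order and avoid reporting duplicates.
            if chunk_id not in known and chunk_id not in missing:
                missing.append(chunk_id)
    return missing


def total_token_budget(plan: dict[str, Any]) -> int:
    """Sum the prompt and completion token budgets declared in the plan."""
    budget = plan.get("budget", {})
    return budget.get("promptTokens", 0) + budget.get("completionTokens", 0)
```

An empty list from `unresolved_matches` means every match is backed by a listed chunk. An example response: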
```json
{
  "taskType": "Summary",
  "cacheKey": "adv-summary:csaf:redhat:RHSA-2025:1001:fips-local",
  "budget": { "promptTokens": 1024, "completionTokens": 256 },
  "chunks": [
    { "documentId": "doc-1", "chunkId": "doc-1:0001", "section": "Summary" }
  ],
  "vectors": [
    {
      "query": "Summary query",
      "matches": [{ "chunkId": "doc-1:0001", "score": 0.92 }]
    }
  ],
  "sbom": {
    "artifactId": "registry.stella-ops.internal/runtime/api",
    "versionTimelineCount": 8,
    "dependencyPathCount": 5,
    "dependencyNodeCount": 17
  }
}
```

### 3.3 Output envelope

See `docs/advisory-ai/api.md` §6. Each response includes `inputDigest`, `outputHash`, Markdown content, citations, TTL, and context summary to support offline replay.

## 4. Profiles & runtime selection

| Profile | Runtime | Crypto posture | Default availability |
|---------|---------|----------------|----------------------|
| `default` / `fips-local` | On-prem model (GPU/CPU) | FIPS-validated primitives | Enabled |
| `gost-local` | Sovereign local model | GOST algorithms | Opt-in |
| `cloud-openai` | Remote connector via secure gateway | Depends on hosting region | Disabled (requires tenant consent) |
| Custom | Operator-supplied | Matches declared policy | Disabled until Authority admin approves |

Profile selection is controlled via Authority configuration (`advisoryAi.allowedProfiles`). Remote profiles require tenant consent, allowlisted endpoints, and custom SLIs to track latency/error budgets.

## 5. Guardrails & validation pipeline

1. **Prompt preparation** – sanitized context injected into templated prompts (Liquid/Handlebars). Sensitive tokens scrubbed before render.
2. **Prompt allowlist** – each template fingerprinted; runtime rejects prompts whose hash is not documented.
3. **Response schema** – JSON validator ensures sections, severity tags, and citation arrays meet contract.
4. **Citation resolution** – referenced `[n]` items must map to context chunk identifiers.
5. **DSSE sealing (optional)** – outputs can be sealed with the Advisory AI signing key; DSSE bundle stored alongside cache artefact.
6. **Audit trail** – guardrail results logged (`advisory_ai.guardrail.blocked|passed`) with tenant and trace IDs.

## 6. Caching & storage model

| Field | Description |
|-------|-------------|
| `_id` | `outputHash` (sha256 of content body). |
| `inputDigest` | sha256 of canonical context pack. |
| `taskType` | Summary/Conflict/Remediation. |
| `profile` | Inference profile used. |
| `content` | Markdown/JSON body and format metadata. |
| `citations` | Array of `{index, kind, sourceId, uri}`. |
| `generatedAt` | UTC timestamp. |
| `ttlSeconds` | Derived from tenant configuration (default 86400). |
| `dsse` | Optional DSSE bundle metadata. |

Cache misses trigger orchestration and inference; hits return stored artefacts immediately. TTL expiry removes entries unless `forceRefresh` has already regenerated them.

## 7. Telemetry & SLOs

Metrics (registered in Observability backlog):

- `advisory_ai_requests_total{tenant,task,profile}`
- `advisory_ai_latency_seconds_bucket`
- `advisory_ai_guardrail_blocks_total`
- `advisory_ai_cache_hits_total`
- `advisory_ai_remote_profile_requests_total`

Logs include `traceId`, `tenant`, `task`, `profile`, `outputHash`, `cacheStatus` (`hit|miss|bypass`). Prompt bodies are never logged; guardrail violations log sanitized excerpts only.

Suggested SLOs:

- **Latency:** P95 ≤ 3s (local), ≤ 8s (remote).
- **Availability:** 99.5% successful responses per tenant over 7 days.
- **Guardrail block rate:** ≤ 1%; investigate higher values.

## 8. Deployment & offline guidance

- Package prompts, guardrail configs, profile manifests, and local model weights in the Offline Kit.
- Remote profiles remain disabled until Authority admins set `advisoryAi.remoteProfiles` and record tenant consent.
- Export Center reads cached outputs using `advisory-ai:view` and benefits from DSSE sealing when enabled.

## 9. Checklist

- [ ] `AdvisoryRetrievalService` wired to the SBOM context client (AIAI-31-002).
- [ ] Authority scopes (`advisory-ai:*`, `aoc:verify`) validated in staging.
- [ ] Guardrail library reviewed by Security Guild (AIAI-31-005).
- [ ] Cache TTLs/DSSE policy signed off by Platform & Compliance.
- [ ] Observability dashboards published (DOCS-OBS backlog).
- [ ] Offline Kit bundle updated with prompts, guardrails, local profile assets.

---

_For questions or contributions, contact the Advisory AI Guild (Slack #guild-advisory-ai) and tag Docs Guild reviewers._