# Advisory AI architecture

> Captures the retrieval, guardrail, and inference packaging requirements defined in the Advisory AI implementation plan and related module guides.

## 1) Goals

- Summarise advisories/VEX evidence into operator-ready briefs with citations.
- Explain conflicting statements with provenance and trust weights (using VEX Lens & Excititor data).
- Suggest remediation plans aligned with Offline Kit deployment models and scheduler follow-ups.
- Operate deterministically where possible; cache generated artefacts with digests for audit.

## 2) Pipeline overview

```
                        +---------------------+
Concelier/VEX Lens      | Evidence Retriever  |
Policy Engine     ----> | (vector + keyword)  | ---> Context Pack (JSON)
Zastava runtime         +---------------------+
                                   |
                                   v
                            +-------------+
                            | Prompt      |
                            | Assembler   |
                            +-------------+
                                   |
                                   v
                            +-------------+
                            | Guarded LLM |
                            | (local/host)|
                            +-------------+
                                   |
                                   v
                          +-----------------+
                          | Citation &      |
                          | Validation      |
                          +-----------------+
                                   |
                                   v
                          +----------------+
                          | Output cache   |
                          | (hash, bundle) |
                          +----------------+
```

## 3) Retrieval & context

- Hybrid search: vector embeddings (SBERT-compatible) + keyword filters for advisory IDs, PURLs, CVEs.
- Context packs include:
  - Advisory raw excerpts with highlighted sections and source URLs.
  - VEX statements (normalized tuples + trust metadata).
  - Policy explain traces for the affected finding.
  - Runtime/impact hints from Zastava (exposure, entrypoints).
  - Export-ready remediation data (fixed versions, patches).

All context references include `content_hash` and `source_id` enabling verifiable citations.

## 4) Guardrails

- Prompt templates enforce structure: summary, conflicts, remediation, references.
- Response validator ensures:
  - No hallucinated advisories (every fact must map to input context).
  - Citations follow `[n]` indexing referencing actual sources.
  - Remediation suggestions only cite policy-approved sources (fixed versions, vendor hotfixes).
- Moderation/PII filters prevent leaking secrets; responses failing validation are rejected and logged.

## 5) Output persistence

- Cached artefacts stored in `advisory_ai_outputs` with fields:
  - `output_hash` (sha256 of JSON response).
  - `input_digest` (hash of context pack).
  - `summary`, `conflicts`, `remediation`, `citations`.
  - `generated_at`, `model_id`, `profile` (Sovereign/FIPS etc.).
  - `signatures` (optional DSSE if run in deterministic mode).
- Offline bundle format contains `summary.md`, `citations.json`, `context_manifest.json`, `signatures/`.

## 6) Profiles & sovereignty

- **Profiles:** `default`, `fips-local` (FIPS-compliant local model), `gost-local`, `cloud-openai` (optional, disabled by default). Each profile defines allowed models, key management, and telemetry endpoints.
- **CryptoProfile/RootPack integration:** generated artefacts can be signed using the configured CryptoProfile to satisfy procurement/trust requirements.

## 7) APIs

- `POST /v1/advisory-ai/summaries`: generate (or retrieve cached) summary for `{advisoryKey, artifactId, policyVersion}`.
- `POST /v1/advisory-ai/conflicts`: explain conflicting VEX statements with trust ranking.
- `POST /v1/advisory-ai/remediation`: fetch remediation plan with target fix versions, prerequisites, verification steps.
- `GET /v1/advisory-ai/outputs/{hash}`: retrieve cached artefact (used by CLI/Console/Export Center).

All endpoints accept a `profile` parameter (default `fips-local`) and return `output_hash`, `input_digest`, and `citations` for verification.

## 8) Observability

- Metrics: `advisory_ai_requests_total{profile,type}`, `advisory_ai_latency_seconds`, `advisory_ai_validation_failures_total`.
- Logs: include `output_hash`, `input_digest`, `profile`, `model_id`, `tenant`, `artifacts`. Sensitive context is not logged.
- Traces: spans for retrieval, prompt assembly, model inference, validation, cache write.
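
The `output_hash` and `input_digest` values that appear in logs and API responses are described here only as digests of the JSON response and the context pack. The sketch below shows one way they could be derived and attached to a log record without logging the sensitive context itself. It assumes SHA-256 for both digests and a canonical form of sorted keys with compact separators; the document does not specify a canonicalisation, and the helper names `canonical_json`, `sha256_hex`, and `build_log_record` are hypothetical.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Serialise to stable bytes: sorted keys, compact separators (assumed canonical form)."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(value) -> str:
    """Hex SHA-256 of the canonical JSON form of `value`."""
    return hashlib.sha256(canonical_json(value)).hexdigest()


def build_log_record(context_pack, response, *, profile, model_id, tenant, artifacts):
    """Hypothetical helper: a log record that carries digests, not the context or response text."""
    return {
        "output_hash": sha256_hex(response),
        "input_digest": sha256_hex(context_pack),
        "profile": profile,
        "model_id": model_id,
        "tenant": tenant,
        "artifacts": list(artifacts),
    }
```

Because the serialisation sorts keys, two context packs that differ only in key order produce the same `input_digest`, which is what makes a cache lookup by digest meaningful.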

## 9) Operational controls

- Feature flags per tenant (`ai.summary.enabled`, `ai.remediation.enabled`).
- Rate limits (per tenant, per profile) enforced by Orchestrator to prevent runaway usage.
- Offline/air-gapped deployments run local models packaged with Offline Kit; model weights validated via manifest digests.
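
The per-tenant, per-profile rate limits above can be pictured as one budget per `(tenant, profile)` pair. The sketch below uses a token bucket, which is an assumption made for illustration: the document does not say which algorithm Orchestrator uses. The class names and the `capacity` and `refill_per_second` parameters are hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float
    refill_per_second: float
    tokens: float
    updated_at: float

    def try_acquire(self, now: float) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keeps an independent bucket for every (tenant, profile) pair."""

    def __init__(self, capacity: float, refill_per_second: float, clock=time.monotonic):
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock  # injectable so tests can control time
        self._buckets = {}

    def allow(self, tenant: str, profile: str) -> bool:
        now = self._clock()
        key = (tenant, profile)
        bucket = self._buckets.get(key)
        if bucket is None:
            # A new pair starts with a full bucket.
            bucket = TokenBucket(self._capacity, self._refill, self._capacity, now)
            self._buckets[key] = bucket
        return bucket.try_acquire(now)
```

Keying on the pair rather than the tenant alone means a tenant that exhausts its budget on one profile is not blocked on another, which matches the "per tenant, per profile" wording.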