Advisory AI architecture
Captures the retrieval, guardrail, and inference packaging requirements defined in the Advisory AI implementation plan and related module guides.
1) Goals
- Summarise advisories/VEX evidence into operator-ready briefs with citations.
- Explain conflicting statements with provenance and trust weights (using VEX Lens & Excititor data).
- Suggest remediation plans aligned with Offline Kit deployment models and scheduler follow-ups.
- Operate deterministically where possible; cache generated artefacts with digests for audit.
2) Pipeline overview
                            +---------------------+
    Concelier/VEX Lens      | Evidence Retriever  |
    Policy Engine ---->     | (vector + keyword)  | ---> Context Pack (JSON)
    Zastava runtime         +---------------------+
                                       |
                                       v
                                +-------------+
                                | Prompt      |
                                | Assembler   |
                                +-------------+
                                       |
                                       v
                                +-------------+
                                | Guarded LLM |
                                | (local/host)|
                                +-------------+
                                       |
                                       v
                              +-----------------+
                              | Citation &      |
                              | Validation      |
                              +-----------------+
                                       |
                                       v
                              +----------------+
                              | Output cache   |
                              | (hash, bundle) |
                              +----------------+
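The stage order in the diagram can be read as a plain function chain. The following is a minimal Python sketch for illustration only, not the service's own implementation: `run_pipeline`, the stage callables, and the dictionary cache are hypothetical names, and keying the cache on a SHA-256 of canonical JSON is an assumption about how the "hash" in the output cache might be derived.

```python
import hashlib
import json

def run_pipeline(advisory_key, retrieve, assemble, infer, validate, cache):
    """Run retrieval -> prompt assembly -> guarded inference -> validation -> cache."""
    context_pack = retrieve(advisory_key)
    canonical = json.dumps(context_pack, sort_keys=True, separators=(",", ":"))
    input_digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    if input_digest in cache:          # identical evidence: reuse the stored artefact
        return cache[input_digest]
    prompt = assemble(context_pack)
    response = infer(prompt)
    validate(response, context_pack)   # expected to raise on a guardrail failure
    cache[input_digest] = {"input_digest": input_digest, "response": response}
    return cache[input_digest]

# Usage with stub stages (all hypothetical stand-ins for the real components).
calls = []
stages = dict(
    retrieve=lambda key: {"advisoryKey": key, "sources": ["vendor note"]},
    assemble=lambda pack: "Summarise: " + "; ".join(pack["sources"]),
    infer=lambda prompt: calls.append(prompt) or "Upgrade to the fixed version [1].",
    validate=lambda response, pack: None,
)
cache = {}
first = run_pipeline("ADV-EXAMPLE", cache=cache, **stages)
second = run_pipeline("ADV-EXAMPLE", cache=cache, **stages)
```

In this sketch the second call never reaches the model stub, because the context pack digest is unchanged.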
3) Retrieval & context
- Hybrid search: vector embeddings (SBERT-compatible) + keyword filters for advisory IDs, PURLs, CVEs.
- Context packs include:
  - Advisory raw excerpts with highlighted sections and source URLs.
  - VEX statements (normalized tuples + trust metadata).
  - Policy explain traces for the affected finding.
  - Runtime/impact hints from Zastava (exposure, entrypoints).
  - Export-ready remediation data (fixed versions, patches).
- SBOM context retriever (AIAI-31-002) hydrates:
  - Version timelines (first/last observed, status, fix availability).
  - Dependency paths (runtime vs build/test, deduped by coordinate chain).
  - Tenant environment flags (prod/stage toggles) with optional blast radius summary.
  - Service-side clamps: max 500 timeline entries, 200 dependency paths, with client-provided toggles for env/blast data.
AddSbomContextHttpClient(...) registers the typed HTTP client that calls /v1/sbom/context, while NullSbomContextClient remains the safe default for environments that have not yet exposed the SBOM service.
Sample configuration (wire real SBOM base URL + API key):

    services.AddSbomContextHttpClient(options =>
    {
        options.BaseAddress = new Uri("https://sbom-service.internal");
        options.Endpoint = "/v1/sbom/context";
        options.ApiKey = configuration["SBOM_SERVICE_API_KEY"];
        options.UserAgent = "stellaops-advisoryai/1.0";
        options.Tenant = configuration["TENANT_ID"];
    });

    services.AddAdvisoryPipeline();

After configuration, issue a smoke request (e.g., ISbomContextRetriever.RetrieveAsync) during deployment validation to confirm end-to-end connectivity and credentials before enabling Advisory AI endpoints.
Retriever requests and results are trimmed/normalized before hashing; metadata (counts, provenance keys) is returned for downstream guardrails. Unit coverage ensures deterministic ordering and flag handling.
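A minimal Python sketch of the trim/normalize-then-hash step and the service-side clamps, offered as an illustration rather than the retriever's actual code. Only the 500 and 200 limits come from the text above; the field names, the function names, and the choice of SHA-256 over canonical JSON are assumptions.

```python
import hashlib
import json

MAX_TIMELINE_ENTRIES = 500   # service-side clamp stated in the text
MAX_DEPENDENCY_PATHS = 200   # service-side clamp stated in the text

def normalize_request(purl, max_timeline=MAX_TIMELINE_ENTRIES,
                      max_paths=MAX_DEPENDENCY_PATHS,
                      include_environment_flags=False,
                      include_blast_radius=False):
    """Trim inputs, clamp limits, and return (request, digest of the request)."""
    purl = purl.strip()
    if not purl:
        raise ValueError("purl must not be empty")
    request = {
        "purl": purl,
        "max_timeline_entries": max(0, min(int(max_timeline), MAX_TIMELINE_ENTRIES)),
        "max_dependency_paths": max(0, min(int(max_paths), MAX_DEPENDENCY_PATHS)),
        "include_environment_flags": bool(include_environment_flags),
        "include_blast_radius": bool(include_blast_radius),
    }
    # Hash the normalised form so equivalent requests share one digest.
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return request, hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def dedupe_paths(paths):
    """Deduplicate dependency paths by coordinate chain and sort for stable output."""
    unique = {tuple(path) for path in paths}
    return [list(path) for path in sorted(unique)]
```

Because hashing happens after trimming and clamping, a request padded with whitespace or asking for 9999 timeline entries produces the same digest as its normalised equivalent, which is what makes the ordering and flag handling testable.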
All context references include content_hash and source_id, enabling verifiable citations.
4) Guardrails
- Prompt templates enforce structure: summary, conflicts, remediation, references.
- Response validator ensures:
  - No hallucinated advisories (every fact must map to input context).
  - Citations follow [n] indexing referencing actual sources.
  - Remediation suggestions only cite policy-approved sources (fixed versions, vendor hotfixes).
- Moderation/PII filters prevent leaking secrets; responses failing validation are rejected and logged.
- Pre-flight guardrails redact secrets (AWS keys, generic API tokens, PEM blobs), block "ignore previous instructions"-style prompt injection attempts, enforce citation presence, and cap prompt payload length (default 16 kB). Guardrail outcomes and redaction counts surface via advisory_guardrail_blocks/advisory_outputs_stored metrics.
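The checks above can be sketched in a few lines of Python. This is an illustrative approximation, not the shipped guardrail code: the regular expressions cover only two secret shapes, the violation labels and function names are hypothetical, and "16 kB" is assumed to mean 16 KiB of UTF-8.

```python
import re

MAX_PROMPT_BYTES = 16 * 1024  # assumption: "16 kB" read as 16 KiB of UTF-8

# Illustrative patterns only; a production list would be broader.
SECRET_PATTERNS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]+-----.*?-----END [A-Z ]+-----", re.DOTALL),  # PEM blob
]
INJECTION_PATTERN = re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE)
CITATION_PATTERN = re.compile(r"\[(\d+)\]")

def preflight(prompt):
    """Return (redacted_prompt, redaction_count, violations) for a prompt."""
    redactions = 0
    for pattern in SECRET_PATTERNS:
        prompt, count = pattern.subn("[REDACTED]", prompt)
        redactions += count
    violations = []
    if INJECTION_PATTERN.search(prompt):
        violations.append("prompt_injection")
    if len(prompt.encode("utf-8")) > MAX_PROMPT_BYTES:
        violations.append("prompt_too_long")
    return prompt, redactions, violations

def validate_citations(response, source_count):
    """Return violations when citations are absent or fall outside 1..source_count."""
    cited = [int(n) for n in CITATION_PATTERN.findall(response)]
    if not cited:
        return ["citation_missing"]
    if any(n < 1 or n > source_count for n in cited):
        return ["citation_out_of_range"]
    return []
```

A caller would reject and log any response with a non-empty violation list, and feed the redaction count into the guardrail metrics.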
5) Deterministic tooling
- Version comparators: offline semantic version + RPM EVR parsers with range evaluators. Supports chained constraints (>=, <=, !=) used by remediation advice and blast radius calcs.
  - Registered via AddAdvisoryDeterministicToolset for reuse across orchestrator, CLI, and services.
- Orchestration pipeline: see orchestration-pipeline.md for prerequisites, task breakdown, and cross-guild responsibilities before wiring the execution flows.
- Planned extensions: NEVRA/EVR comparators, ecosystem-specific normalisers, dependency chain scorers (AIAI-31-003 scope).
- Exposed via internal interfaces to allow orchestrator/toolchain reuse; all helpers stay side-effect free and deterministic for golden testing.
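To make "chained constraints" concrete, here is a small Python sketch of a range evaluator. It is a simplified stand-in, not the toolset's comparator: it handles only plain MAJOR.MINOR.PATCH versions (no pre-release tags, build metadata, or RPM EVR), and the comma-separated constraint syntax and function names are assumptions.

```python
import operator
import re

_OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne,
        "==": operator.eq, ">": operator.gt, "<": operator.lt}
_CONSTRAINT = re.compile(r"^\s*(>=|<=|!=|==|>|<)\s*(\d+)\.(\d+)\.(\d+)\s*$")

def parse_version(text):
    """Parse a plain MAJOR.MINOR.PATCH string into a comparable tuple of ints."""
    match = re.fullmatch(r"\s*(\d+)\.(\d+)\.(\d+)\s*", text)
    if not match:
        raise ValueError(f"unsupported version: {text!r}")
    return tuple(int(part) for part in match.groups())

def satisfies(version, constraints):
    """True if version meets every comma-separated constraint in the chain."""
    candidate = parse_version(version)
    for clause in constraints.split(","):
        match = _CONSTRAINT.match(clause)
        if not match:
            raise ValueError(f"unsupported constraint: {clause!r}")
        op = match.group(1)
        bound = tuple(int(part) for part in match.groups()[1:])
        if not _OPS[op](candidate, bound):
            return False
    return True
```

For example, `satisfies("1.4.2", ">=1.2.0,<=1.9.9,!=1.5.0")` holds, while `"1.5.0"` fails the same chain on the exclusion. The function has no I/O or hidden state, which is the property that makes golden testing straightforward.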
6) Output persistence
- Cached artefacts stored in advisory_ai_outputs with fields:
  - output_hash (sha256 of JSON response).
  - input_digest (hash of context pack).
  - summary, conflicts, remediation, citations.
  - generated_at, model_id, profile (Sovereign/FIPS etc.).
  - signatures (optional DSSE if run in deterministic mode).
- Offline bundle format contains summary.md, citations.json, context_manifest.json, signatures/.
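As a rough illustration of how such a record could be assembled, the Python sketch below builds the listed fields from a response and its context pack. The field names follow the list above; everything else is an assumption, including the canonical JSON form used before hashing, SHA-256 for input_digest, and the helper names.

```python
import hashlib
import json
from datetime import datetime, timezone

def canonical_sha256(payload):
    """sha256 over JSON with sorted keys and compact separators (assumed canonical form)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_output_record(context_pack, response, model_id, profile, generated_at=None):
    """Assemble one advisory_ai_outputs row from a validated response."""
    generated_at = generated_at or datetime.now(timezone.utc)
    return {
        "output_hash": canonical_sha256(response),
        "input_digest": canonical_sha256(context_pack),
        "summary": response["summary"],
        "conflicts": response["conflicts"],
        "remediation": response["remediation"],
        "citations": response["citations"],
        "generated_at": generated_at.isoformat(),
        "model_id": model_id,
        "profile": profile,
        "signatures": [],  # optional DSSE envelopes would go here; not produced by this sketch
    }
```

Sorting keys before hashing means two context packs that differ only in key order yield the same input_digest, so an auditor can recompute both hashes from the stored bundle.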
7) Profiles & sovereignty
- Profiles: default, fips-local (FIPS-compliant local model), gost-local, cloud-openai (optional, disabled by default). Each profile defines allowed models, key management, and telemetry endpoints.
- CryptoProfile/RootPack integration: generated artefacts can be signed using the configured CryptoProfile to satisfy procurement/trust requirements.
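One way to picture a profile is as an allow-list that gates model selection. The Python sketch below is hypothetical throughout: the profile names come from the list above, but the `Profile` type, the placeholder model identifiers, and `resolve_model` are invented for illustration, and key management and telemetry endpoints are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Profile:
    name: str
    allowed_models: tuple
    enabled: bool = True

# Model identifiers are placeholders, not real model names.
PROFILES = {
    profile.name: profile for profile in (
        Profile("default", ("local-default",)),
        Profile("fips-local", ("local-fips",)),
        Profile("gost-local", ("local-gost",)),
        Profile("cloud-openai", ("hosted-model",), enabled=False),  # disabled by default
    )
}

def resolve_model(profile_name, model_id):
    """Return model_id if the profile exists, is enabled, and allows that model."""
    profile = PROFILES.get(profile_name)
    if profile is None:
        raise KeyError(f"unknown profile: {profile_name}")
    if not profile.enabled:
        raise PermissionError(f"profile disabled: {profile_name}")
    if model_id not in profile.allowed_models:
        raise ValueError(f"model {model_id!r} not allowed for {profile_name}")
    return model_id
```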
8) APIs
- POST /api/v1/advisory/{task}: executes Summary/Conflict/Remediation pipeline (task ∈ summary|conflict|remediation). Requests accept {advisoryKey, artifactId?, policyVersion?, profile, preferredSections?, forceRefresh} and return sanitized prompt payloads, citations, guardrail metadata, provenance hash, and cache hints.
- GET /api/v1/advisory/outputs/{cacheKey}?taskType=SUMMARY&profile=default: retrieves cached artefacts for downstream consumers (Console, CLI, Export Center). Guardrail state and provenance hash accompany results.
All endpoints accept a profile parameter (default fips-local) and return output_hash, input_digest, and citations for verification.
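A client call to the POST endpoint might be prepared as in the Python sketch below. The path and body fields are taken from the description above; the base URL, the helper name, and the omission of any authentication headers are assumptions, and the request is only constructed here, not sent.

```python
import json
import urllib.request

def build_advisory_request(base_url, task, advisory_key, profile="fips-local",
                           artifact_id=None, policy_version=None,
                           preferred_sections=None, force_refresh=False):
    """Build (but do not send) a POST /api/v1/advisory/{task} request."""
    if task not in ("summary", "conflict", "remediation"):
        raise ValueError(f"unknown task: {task}")
    body = {"advisoryKey": advisory_key, "profile": profile,
            "forceRefresh": force_refresh}
    # Optional fields are only included when the caller supplies them.
    if artifact_id is not None:
        body["artifactId"] = artifact_id
    if policy_version is not None:
        body["policyVersion"] = policy_version
    if preferred_sections:
        body["preferredSections"] = list(preferred_sections)
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/advisory/{task}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would issue the call; a real deployment would also attach whatever credentials the gateway requires.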
9) Observability
- Metrics: advisory_ai_requests_total{profile,type}, advisory_ai_latency_seconds, advisory_ai_validation_failures_total.
- Logs: include output_hash, input_digest, profile, model_id, tenant, artifacts. Sensitive context is not logged.
- Traces: spans for retrieval, prompt assembly, model inference, validation, cache write.
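The "sensitive context is not logged" rule is easiest to enforce with an allow-list. A minimal Python sketch, assuming JSON-formatted log lines and a hypothetical helper name; only the six field names come from the list above.

```python
import json
import logging

LOG_FIELDS = ("output_hash", "input_digest", "profile", "model_id", "tenant", "artifacts")

def log_advisory_output(logger, record):
    """Emit only allow-listed fields; anything else (e.g. prompt text) is dropped."""
    safe = {name: record[name] for name in LOG_FIELDS if name in record}
    line = json.dumps(safe, sort_keys=True)
    logger.info(line)
    return line
```

An allow-list fails closed: a new field added to the record stays out of the logs until someone deliberately adds it to `LOG_FIELDS`.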
10) Operational controls
- Feature flags per tenant (ai.summary.enabled, ai.remediation.enabled).
- Rate limits (per tenant, per profile) enforced by Orchestrator to prevent runaway usage.
- Offline/air-gapped deployments run local models packaged with Offline Kit; model weights validated via manifest digests.
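Validating model weights against manifest digests could look like the Python sketch below. The manifest shape (a mapping of relative path to hex SHA-256) and the function names are assumptions; the Offline Kit's actual manifest format is not described here.

```python
import hashlib
import hmac
from pathlib import Path

def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through sha256 so large weight files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(root, manifest):
    """Return relative paths that are missing or mismatched; an empty list means valid."""
    failures = []
    for rel_path, expected in sorted(manifest.items()):
        target = Path(root) / rel_path
        if not target.is_file() or not hmac.compare_digest(
                sha256_file(target), expected.lower()):
            failures.append(rel_path)
    return failures
```

A host would refuse to load any model for which `verify_manifest` returns a non-empty list.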
11) Hosting surfaces
- WebService: exposes /v1/advisory-ai/pipeline/{task} to materialise plans and enqueue execution messages.
- Worker: background service draining the advisory pipeline queue (file-backed stub) pending integration with shared transport.
- Both hosts register AddAdvisoryAiCore, which wires the SBOM context client, deterministic toolset, pipeline orchestrator, and queue metrics.
- SBOM base address + tenant metadata are configured via AdvisoryAI:SbomBaseAddress and propagated through AddSbomContext.