- Added "StellaOps.Policy.Engine", "StellaOps.Cartographer", and "StellaOps.SbomService" projects to the StellaOps solution. - Created AGENTS.md to outline the Contract Testing Guild Charter, detailing mission, scope, and definition of done. - Established TASKS.md for the Contract Testing Task Board, outlining tasks for Sprint 62 and Sprint 63 related to mock servers and replay testing.
18 KiB
Below is the “maximum documentation” bundle for Epic 8. It’s engineered to be pasted into your repo without turning into yet another unread wiki tomb. Slight sarcasm included to keep blood flowing.
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
Epic 8: Advisory AI Assistant (summaries, conflict explain, remediation hints)
Short name: Advisory AI
Services touched: Conseiller (Feedser), Excitator (Vexer), VEX Lens, SBOM Service, Policy Engine, Findings Ledger, Web API Gateway, Authority (authN/Z), Console (Web UI), CLI, Telemetry/Analytics
AOC ground rule: Conseiller and Excitator aggregate but never merge or mutate source docs. Advisory AI produces derived summaries and plans with strict provenance and citations. No silent rewriting of evidence. Ever.
1) What it is
Advisory AI is a tenant‑scoped, retrieval‑augmented assistant that turns noisy security advisories and VEX statements into three consumable artifacts:
-
Advisory Summary Condenses one or more advisories (CSAF, OSV, GHSA, vendor PDFs, distro notices) into a concise brief with key facts: affected ranges, exploit status, impact, known workarounds, fixed versions, and links. Always cites the exact sources and sections used.
-
Conflict Explain Explains why VEX statements or advisories disagree for a specific artifact and version. Uses the VEX Consensus Lens outputs and issuer trust model to produce a human‑readable, step‑by‑step explanation: who said what, where the product scoping diverges, and what policy thresholds caused the final state.
-
Remediation Hints Suggests practical next steps: upgrade paths compatible with your dependency graph, backports, config toggles, temporary policy suppressions, or compensating controls. Every hint is grounded in SBOM, environment, and policy. It ships as structured JSON plus a human summary, ready to paste into a ticket.
It lives in the Console as a side panel, in the CLI for batch runs, and via APIs for automation. It does not change scanner results or consensus on its own. Humans remain in charge. The machine does the skimming and the math so humans can keep the judgment and the coffee.
2) Why (brief)
Advisories are long, inconsistent, and sometimes contradictory. Teams waste cycles reconciling PDFs with package manifests. The assistant eliminates that sludge: fast summaries, explicit conflict explanations, and remediation hints that are actually applicable to your software, not to an imaginary ideal project from 2013.
3) How it should work (maximum detail)
3.1 Capabilities
-
Summaries
-
Input: one advisory or a bundle linked by the same advisory key (CVE, GHSA, vendor‑ID), product scope, and environment.
-
Output:
- 150 to 300 words summary
AdvisorySummary JSON(schema below)- Citations with paragraph anchors
- Confidence label and coverage score (how much of the advisory set is represented)
-
-
Conflict Explain
- Input:
(artifact_id, purl, version, advisory_key)tuple. - Output: narrative plus a structured breakdown of consensus math, issuer votes, product mapping mismatches, and the exact policy knobs that tipped the result.
- Input:
-
Remediation Hints
- Input: same tuple plus SBOM context and environment.
- Output: ranked list of remediation options with feasibility score, blast radius estimate (derived from dependency paths), effort class, and links to fixed versions. Includes “do nothing” when the VEX consensus is not affected.
3.2 System design
Architecture diagram in words (because ASCII art is a crime):
-
Retrievers
- Structured retriever over Conseiller’s normalized advisory fields.
- Vector retriever over advisory text chunks with paragraph anchors.
- VEX retriever over Excitator evidence and VEX Lens consensus.
- SBOM retriever for purl, version, dependency paths, env flags.
-
Deterministic resolvers
- Version comparators per ecosystem.
- Range satisfaction checks.
- Dependency path scorers and blast radius estimator.
-
Orchestrator
- Task‑specific prompt templates for Summary, Conflict, Remediation.
- Tool calls to deterministics (version check, graph crawl) with results injected into the prompt as structured context.
- Strict token budgets and truncation rules to avoid model babble.
-
Models
- Default: on‑prem inference container with mid‑sized model.
- Optional: tenant‑enabled remote inference. Disabled by default.
- Temperature locked low for summary and conflict. Slightly higher for remediation narrative phrasing. No creativity in facts.
-
Guardrails
- Prompt injection defense by stripping or quarantining advisory text that tries to instruct the model.
- Fact boundary tagger. The assistant must only state facts that appear in structured inputs or cited chunks.
- Redaction of secrets before prompts.
- Output validator checks: required JSON fields, numeric ranges, valid version strings.
3.3 Data contracts
AdvisorySummary JSON
{
"advisory_key": "CVE-2025-12345",
"sources": [
{"id":"csaf:vendorA:2025-001","uri":"...","sections":["2.1","3.4"]},
{"id":"osv:pkg:npm/lodash","uri":"...","sections":["affected","references"]}
],
"affected_ranges": [
{"ecosystem":"npm","purl_family":"pkg:npm/lodash","introduced":"<4.17.15","fixed": "4.17.21"}
],
"exploit_status": "no_known_exploit | poc_public | exploited_in_the_wild | n/a",
"impact": {"cvss":[{"vector":"CVSS:3.1/AV:N/...","score":7.5}], "cwes":["CWE-79"]},
"workarounds": ["Disable feature X", "Set flag Y=false"],
"fixed_versions": ["4.17.21"],
"notes": "Vendor states not affected on platform Z due to build option W.",
"coverage_score": 0.86,
"generated_at": "2025-10-25T12:00:00Z"
}
ConflictExplanation JSON
{
"tuple": {"artifact_id":"svc:checkout@1.9.0","purl":"pkg:npm/lodash@4.17.20","advisory_key":"CVE-2025-12345"},
"consensus": {"state":"NOT_AFFECTED","confidence":0.82},
"quorum": [
{"issuer":"lodash-maintainers","status":"NOT_AFFECTED","weight":0.9,"sig":true,"justification":"component_not_present"},
{"issuer":"vendorX-distro","status":"AFFECTED","weight":0.25,"sig":false,"justification":"generic"}
],
"policy_factors": {"na_threshold":1.0,"aff_threshold":0.6,"recency_decay_days":90},
"mapping_issues": [{"kind":"cpe_to_purl","score":0.6,"detail":"CPE wildcard matched multiple purls"}],
"explanation_steps": [
"Exact purl match found for maintainer VEX; weight 0.9",
"Distro advisory generic; scope score 0.5; effective weight 0.25",
"NA threshold met. Result set to NOT_AFFECTED"
]
}
RemediationPlan JSON
{
"tuple": {"artifact_id":"svc:checkout@1.9.0","purl":"pkg:npm/lodash@4.17.20","advisory_key":"CVE-2025-12345"},
"options": [
{
"kind": "upgrade",
"target_version": "4.17.21",
"feasibility": 0.92,
"blast_radius": {"direct_callers":3,"transitive_depth":2},
"effort": "low | medium | high",
"rationale": "Semver patch, no breaking APIs in release notes",
"links": ["release_notes_uri"]
},
{
"kind": "workaround",
"action": "Set SAFE_MODE=true",
"feasibility": 0.6,
"blast_radius": {"feature_flags":["SAFE_MODE"]},
"effort": "low",
"rationale": "Vendor states mitigation reduces attack surface on feature X"
}
],
"preferred": 0,
"policy_effects": {"sla_days": 7, "severity_override": "medium_if_not_fixed"},
"generated_at": "2025-10-25T12:00:00Z"
}
3.4 APIs
POST /advisory/ai/summary
{
"advisory_key":"CVE-2025-12345",
"artifact_id":"svc:checkout@1.9.0",
"purl":"pkg:npm/lodash@4.17.20",
"sources":["csaf:*","osv:*"], // optional filters
"policy_version":"1.3.0",
"lang":"en"
}
-> 200 { "summary_text":"...", "summary": {AdvisorySummary}, "citations":[...] }
POST /advisory/ai/conflict
{
"artifact_id":"svc:checkout@1.9.0",
"purl":"pkg:npm/lodash@4.17.20",
"advisory_key":"CVE-2025-12345",
"policy_version":"1.3.0"
}
-> 200 { "explanation_text":"...", "explanation": {ConflictExplanation} }
POST /advisory/ai/remediation
{
"artifact_id":"svc:checkout@1.9.0",
"purl":"pkg:npm/lodash@4.17.20",
"advisory_key":"CVE-2025-12345",
"policy_version":"1.3.0",
"max_options":5,
"strategy_preference":["upgrade","backport","workaround"]
}
-> 200 { "plan_text":"...", "plan": {RemediationPlan} }
POST /advisory/ai/batch
{
"items":[ {tuple}, {tuple}, ... ],
"task":"summary | conflict | remediation",
"policy_version":"1.3.0"
}
-> 207 multi-status
Status codes: 400 invalid, 403 RBAC, 404 missing evidence, 409 conflict lock, 422 output validation failed, 429 rate limit.
3.5 Console (Web UI)
-
Surfaces:
- Vuln Explorer detail: “Advisory AI” side panel with 3 tabs: Summary, Conflict, Remediation.
- Consensus Lens detail: prominent “Explain conflict” button.
- Policy Studio sim: “Show effect on assistant output” preview.
-
UX details:
- Citations are footnotes with hover to show source paragraph.
- “Copy as ticket” produces Markdown and JSON.
- Plan options show feasibility bar, blast radius chips, and required approvals per policy.
- Injection warnings appear if advisory text included unsafe instructions.
-
A11y: ARIA tags for tabs, keyboard shortcuts
Gto generate,Rto refresh,Cto copy JSON.
3.6 CLI
stella advise summarize --advisory CVE-2025-12345 --artifact svc:checkout@1.9.0 --purl pkg:npm/lodash@4.17.20 --policy 1.3.0 --json
stella advise explain --advisory CVE-2025-12345 --artifact svc:checkout@1.9.0 --purl pkg:npm/lodash@4.17.20 --policy 1.3.0
stella advise remediate --advisory CVE-2025-12345 --artifact svc:checkout@1.9.0 --purl pkg:npm/lodash@4.17.20 --policy 1.3.0 --strategy upgrade,workaround --out plan.json
stella advise batch --file tuples.json --task remediation --policy 1.3.0
Exit codes: 0 ok, 2 invalid args, 4 not found, 5 denied, 7 validation fail.
3.7 RBAC and security
-
Roles:
- Viewer can run summaries and read explanations.
- Operator can run remediation and export plans.
- Admin can toggle model endpoints and guardrail settings.
-
Defaults:
- Remote model calls disabled.
- Redaction on.
- Prompt logging anonymized.
- Outputs stored as derived artifacts with TTL (default 30 days) unless pinned to a ticket.
3.8 Observability
-
Metrics:
advisory_ai_latency_msby task type.advisory_ai_guardrail_blocks_total.advisory_ai_output_validation_fail_total.advisory_ai_citation_coveragegauge.
-
Traces: retriever spans, tool calls, model inference, validator.
-
Logs: include tuple key, token usage, truncation events, and guardrail outcomes.
3.9 Performance
-
Targets:
- P95 under 1.5 s for Summary and Conflict with warm caches.
- P95 under 2.5 s for Remediation on medium SBOMs (1000 packages).
- Batch throughput 10 tuples per second per worker.
3.10 Edge cases
- Advisory missing fixed versions: produce workaround‑only plan and mark feasibility low.
- Conflicts with near‑tie weights: declare “DISPUTED” and require human approval, no auto plan preferred.
- Exotic version schemes: fallback to string compare with warning and feasibility cap.
- Private packages: no public release notes. Prefer internal changelog links if attached to artifact metadata.
- Multi‑env differences: render per‑env deltas when policy knobs differ (dev vs prod).
4) Implementation plan
4.1 Services and components
-
New:
src/StellaOps.AdvisoryAIretriever/wrappers for Conseiller, Excitator, VEX Lens, SBOM.deterministic/version and path analyzers.orchestrator/task routers and prompt builders.guardrails/injection, redaction, output validator.api/REST endpoints and schema enforcement.
-
Updates:
- Conseiller: expose paragraph‑level anchors for advisories.
- Excitator: expose justifications and product trees in normalized form.
- VEX Lens: stable API for quorum and rationale.
- SBOM Service: efficient path queries and versions timeline per purl.
4.2 Packaging
-
Container images:
stella/advisory-ai:<<version>>stella/inference:<<version>>(if using on‑prem model)
-
Helm values to toggle remote inference and GPU.
4.3 Rollout
- Phase 1: Summary and Conflict read‑only.
- Phase 2: Remediation with “Copy as ticket”.
- Phase 3: Batch APIs, CLI, and Policy Studio simulation hooks.
5) Documentation changes
Create or update the following files. Each doc ends with the imposed rule statement.
-
/docs/advisory-ai/overview.mdWhat it is, capabilities, guardrails, AOC alignment, RBAC. -
/docs/advisory-ai/architecture.mdRAG design, retrievers, orchestrator, deterministics, models, caching. -
/docs/advisory-ai/api.mdEndpoint specs, payload schemas, error codes, examples. -
/docs/advisory-ai/console.mdScreens, actions, a11y, how citations work, copy‑as‑ticket. -
/docs/advisory-ai/cli.mdCommand usage, exit codes, piping examples. -
/docs/policy/assistant-parameters.mdTemperature, max tokens, plan ranking weights, TTLs. -
/docs/security/assistant-guardrails.mdRedaction rules, injection defense, output validation, logging. -
/docs/sbom/remediation-heuristics.mdFeasibility scoring, blast radius, effort classes. -
/docs/runbooks/assistant-ops.mdWarmup, cache priming, model outages, scaling, on‑call steps.
6) Engineering tasks
Backend core
- Implement structured and vector retrievers with paragraph anchors from Conseiller.
- Implement VEX retriever using Lens APIs with caching.
- Build deterministics: ecosystem comparators, range checks, dependency path scorer.
- Implement orchestrator with task‑specific templates and tool call pipeline.
- Implement guardrails and validators with hard failure on invalid JSON.
- Add RBAC to endpoints and anonymized prompt logging.
- Add caching layer with tuple‑keyed entries and policy version scoping.
Integrations
- Conseiller: expose advisory chunk API and metadata needed for citations.
- Excitator: ensure justifications and product trees are queryable.
- VEX Lens: add “policy factors” endpoint for explanation rendering.
- SBOM Service: implement
GET /sbom/paths?purl=...and version timeline.
Console
- Build Advisory AI panel with 3 tabs and citation tooltips.
- Implement “Copy as ticket” (Markdown + JSON) and download.
- Add injection warning banner when triggered.
- Respect a11y requirements and shortcuts.
CLI
stella advise summarize|explain|remediate|batchwith JSON output.- Add
--outoption to save plans and summaries. - Tests for piping and jq workflows.
Observability
- Emit metrics and traces listed in §3.8.
- Dashboards: latency, guardrail blocks, validation fails, coverage.
Docs
- Write all files in §5 with examples and screenshots.
- Cross‑link to VEX Lens and Vulnerability Explorer docs.
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
7) Acceptance criteria
- Summaries cite specific source sections and reflect affected ranges and fixed versions correctly for at least 95% of a validation set.
- Conflict explanations enumerate issuers, weights, justifications, mapping issues, and policy thresholds that caused the consensus state.
- Remediation plans output at least one feasible option when a fixed version exists and correctly flag “no public fix” cases.
- JSON schemas validate for all outputs.
- Console shows the panel with citations, copy‑as‑ticket, and a11y passes.
- CLI produces identical JSON to API responses.
- Guardrails block injection attempts and redact secrets in prompts.
- P95 latency targets are met with warm caches.
- No mutation of raw advisory or VEX evidence occurs anywhere in the pipeline.
8) Risks and mitigations
- Prompt injection in advisory text. Strip instructions, sandbox chunks, and highlight to user when removed.
- Hallucinated facts. Hard validation requires facts to appear in structured inputs or cited text. Fail closed if not provable.
- Mapping errors produce bad hints. Depend on SBOM Graph and VEX Lens scope scores; cap feasibility when scope is weak.
- Model outage. Degrade to deterministic summaries (shorter, but accurate).
- Privacy concerns. Default on‑prem inference, remote endpoints opt‑in with clear flags and audit logs.
9) Test plan
- Unit: version comparators, range checks, feasibility scoring, output validators.
- Golden files: advisory sets mapped to expected summaries and plans; diff on each build.
- Injection tests: adversarial advisories with “ignore prior instructions” payloads must be neutralized.
- Integration: Conseiller→Advisory AI→Console loop, with VEX Lens conflicts and SBOM graph lookups.
- E2E: generate summary, explanation, and plan for representative ecosystems (npm, Maven, PyPI, Go, RPM/DEB).
- Perf: soak tests with 5k tuples batch; observe cache hit ratios and P95.
- A11y: keyboard navigation and screen reader labels.
10) Philosophy
- Facts first. If it is not in structured inputs or citations, it does not exist.
- Explain everything. Humans should see exactly why the tool said what it said.
- Helpful by default. Plans must consider the real dependency graph and environment, not fantasy.
- No silent merges. Evidence is sacred. Summaries and plans are separate, auditable derivatives.
Final reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.