feat: Update Sprint 110 documentation and enhance Advisory AI tests for determinism and mTLS validation

2025-11-08 23:28:41 +02:00
parent ae69b1a8a1
commit d71c81e45d
9 changed files with 395 additions and 19 deletions
--- a/docs/modules/advisory-ai/architecture.md
+++ b/docs/modules/advisory-ai/architecture.md
@@ -131,9 +131,17 @@ All endpoints accept `profile` parameter (default `fips-local`) and return `outp
 - Rate limits (per tenant, per profile) enforced by Orchestrator to prevent runaway usage.
 - Offline/air-gapped deployments run local models packaged with Offline Kit; model weights validated via manifest digests.

-## 11) Hosting surfaces
-
- **WebService** — exposes `/v1/advisory-ai/pipeline/{task}` to materialise plans and enqueue execution messages.
- **Worker** — background service draining the advisory pipeline queue (file-backed stub) pending integration with shared transport.
- Both hosts register `AddAdvisoryAiCore`, which wires the SBOM context client, deterministic toolset, pipeline orchestrator, and queue metrics.
- SBOM base address + tenant metadata are configured via `AdvisoryAI:SbomBaseAddress` and propagated through `AddSbomContext`.
+## 11) Hosting surfaces
+
+- **WebService** — exposes `/v1/advisory-ai/pipeline/{task}` to materialise plans and enqueue execution messages.
+- **Worker** — background service draining the advisory pipeline queue (file-backed stub) pending integration with shared transport.
+- Both hosts register `AddAdvisoryAiCore`, which wires the SBOM context client, deterministic toolset, pipeline orchestrator, and queue metrics.
+- SBOM base address + tenant metadata are configured via `AdvisoryAI:SbomBaseAddress` and propagated through `AddSbomContext`.
+
+## 12) QA harness & determinism (Sprint 110 refresh)
+
+- **Injection fixtures:** `src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/TestData/prompt-injection-fixtures.txt` drives `AdvisoryGuardrailInjectionTests`, ensuring blocked phrases (`ignore previous instructions`, `override the system prompt`, etc.) are rejected with redaction counters, preventing prompt-injection regressions.
+- **Golden prompts:** `summary-prompt.json` now pairs with `conflict-prompt.json`; `AdvisoryPromptAssemblerTests` load both to enforce deterministic JSON payloads across task types and verify vector preview truncation (600 characters + ellipsis) keeps prompts under the documented perf ceiling.
+- **Plan determinism:** `AdvisoryPipelineOrchestratorTests` shuffle structured/vector/SBOM inputs and assert cache keys + metadata remain stable, proving that seeded plan caches stay deterministic even when retrievers emit out-of-order results.
+- **Execution telemetry:** `AdvisoryPipelineExecutorTests` exercise partial citation coverage (target ≥0.5 when only half the structured chunks are cited) so `advisory_ai_citation_coverage_ratio` reflects real guardrail quality.
+- **Plan cache stability:** `AdvisoryPlanCacheTests` now seed the in-memory cache with a fake time provider to confirm TTL refresh when plans are replaced, guaranteeing reproducible eviction under air-gapped runs.
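
Reviewer note on section 12: the checks listed above can be illustrated with a small Python sketch. It is not part of this commit and is not the project's test code. Every name in it (`truncate_preview`, `plan_cache_key`, `citation_coverage`) is hypothetical, and the choices of SHA-256, canonical JSON, and a single-character ellipsis are assumptions made only for illustration. Only the 600-character preview limit and the 0.5 coverage figure come from the text.

```python
import hashlib
import json

# From the documented behaviour: previews are cut at 600 characters + ellipsis.
PREVIEW_LIMIT = 600


def truncate_preview(text: str, limit: int = PREVIEW_LIMIT) -> str:
    """Cut a vector preview to `limit` characters and append an ellipsis.

    Assumption: the ellipsis is the single character U+2026; the real
    implementation may use three dots instead.
    """
    if len(text) <= limit:
        return text
    return text[:limit] + "\u2026"


def plan_cache_key(task: str, structured: list[str],
                   vectors: list[str], sbom: list[str]) -> str:
    """Derive a cache key that ignores the order retrievers return results in.

    Assumption: inputs are identified by string ids. Sorting each list before
    hashing is one way to make the key independent of retrieval order.
    """
    canonical = {
        "task": task,
        "structured": sorted(structured),
        "vectors": sorted(vectors),
        "sbom": sorted(sbom),
    }
    # sort_keys plus fixed separators gives a byte-stable JSON payload.
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def citation_coverage(cited_ids: list[str], structured_ids: list[str]) -> float:
    """Fraction of distinct structured chunks that the output actually cites."""
    known = set(structured_ids)
    if not known:
        return 0.0
    return len(set(cited_ids) & known) / len(known)
```

A test in the spirit of `AdvisoryPipelineOrchestratorTests` would shuffle the three input lists with a seeded random generator and assert that `plan_cache_key` returns the same digest each time. Citing two of four structured chunks gives `citation_coverage` a value of exactly 0.5, which matches the partial-coverage target described for `AdvisoryPipelineExecutorTests`.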