feat: Update Sprint 110 documentation and enhance Advisory AI tests for determinism and mTLS validation
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled

This commit is contained in:
master
2025-11-08 23:28:41 +02:00
parent ae69b1a8a1
commit d71c81e45d
9 changed files with 395 additions and 19 deletions

View File

@@ -8,7 +8,7 @@ Active items only. Completed/historic work now resides in docs/implplan/archived
- 2025-11-04: AIAI-31-002 and AIAI-31-003 shipped with deterministic SBOM context client wiring (`AddSbomContext` typed HTTP client) and toolset integration; WebService/Worker now invoke the orchestrator with SBOM-backed simulations and emit initial metrics.
- 2025-11-03: AIAI-31-002 landed the configurable HTTP client + DI defaults; retriever now resolves data via `/v1/sbom/context`, retaining a null fallback until SBOM service ships.
- 2025-11-03: Follow-up: SBOM guild to deliver base URL/API key and run an Advisory AI smoke retrieval once SBOM-AIAI-31-001 endpoints are live.
- 2025-11-08: AIAI-31-009 moved to DOING building the QA harness (injection fixtures, golden/property/perf tests) plus documenting deterministic cache guarantees before release.
- 2025-11-08: AIAI-31-009 marked DONE injection harness + dual golden prompts + plan-cache determinism tests landed; perf memo added to Advisory AI architecture, `dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj --no-build` green.
- **Concelier** CONCELIER-CORE-AOC-19-004 is the only in-flight Concelier item; air-gap, console, attestation, and Link-Not-Merge tasks remain TODO, and several connector upgrades still carry overdue October due dates.
- **Excititor** Excititor WebService, console, policy, and observability tracks are all TODO and hinge on Link-Not-Merge schema delivery plus trust-provenance connectors (SUSE/Ubuntu) progressing in section 110.C.
- **Mirror** Mirror Creator track (MIRROR-CRT-56-001 through MIRROR-CRT-58-002) has not started; DSSE signing, OCI bundle, and scheduling integrations depend on the deterministic bundle assembler landing first.

View File

@@ -131,9 +131,17 @@ All endpoints accept `profile` parameter (default `fips-local`) and return `outp
- Rate limits (per tenant, per profile) enforced by Orchestrator to prevent runaway usage.
- Offline/air-gapped deployments run local models packaged with Offline Kit; model weights validated via manifest digests.
## 11) Hosting surfaces
- **WebService** — exposes `/v1/advisory-ai/pipeline/{task}` to materialise plans and enqueue execution messages.
- **Worker** — background service draining the advisory pipeline queue (file-backed stub) pending integration with shared transport.
- Both hosts register `AddAdvisoryAiCore`, which wires the SBOM context client, deterministic toolset, pipeline orchestrator, and queue metrics.
- SBOM base address + tenant metadata are configured via `AdvisoryAI:SbomBaseAddress` and propagated through `AddSbomContext`.
## 11) Hosting surfaces
- **WebService** — exposes `/v1/advisory-ai/pipeline/{task}` to materialise plans and enqueue execution messages.
- **Worker** — background service draining the advisory pipeline queue (file-backed stub) pending integration with shared transport.
- Both hosts register `AddAdvisoryAiCore`, which wires the SBOM context client, deterministic toolset, pipeline orchestrator, and queue metrics.
- SBOM base address + tenant metadata are configured via `AdvisoryAI:SbomBaseAddress` and propagated through `AddSbomContext`.
## 12) QA harness & determinism (Sprint 110 refresh)
- **Injection fixtures:** `src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/TestData/prompt-injection-fixtures.txt` drives `AdvisoryGuardrailInjectionTests`, ensuring blocked phrases (`ignore previous instructions`, `override the system prompt`, etc.) are rejected with redaction counters, preventing prompt-injection regressions.
- **Golden prompts:** `summary-prompt.json` now pairs with `conflict-prompt.json`; `AdvisoryPromptAssemblerTests` load both to enforce deterministic JSON payloads across task types and verify vector preview truncation (600 characters + ellipsis) keeps prompts under the documented perf ceiling.
- **Plan determinism:** `AdvisoryPipelineOrchestratorTests` shuffle structured/vector/SBOM inputs and assert cache keys + metadata remain stable, proving that seeded plan caches stay deterministic even when retrievers emit out-of-order results.
- **Execution telemetry:** `AdvisoryPipelineExecutorTests` exercise partial citation coverage (target ≥0.5 when only half the structured chunks are cited) so `advisory_ai_citation_coverage_ratio` reflects real guardrail quality.
- **Plan cache stability:** `AdvisoryPlanCacheTests` now seed the in-memory cache with a fake time provider to confirm TTL refresh when plans are replaced, guaranteeing reproducible eviction under air-gapped runs.