feat(metrics): Add new histograms for chunk latency, results, and sources in AdvisoryAiMetrics

feat(telemetry): Record chunk latency, result count, and source count in AdvisoryAiTelemetry
fix(endpoint): Include telemetry source count in advisory chunks endpoint response
test(metrics): Enhance WebServiceEndpointsTests to validate new metrics for chunk latency, results, and sources
refactor(tests): Update test utilities for Deno language analyzer tests
chore(tests): Add performance tests for AdvisoryGuardrail with scenarios and blocked phrases
docs: Archive Sprint 137 design document for scanner and surface enhancements
2025-11-10 22:26:43 +02:00
parent 56c687253f
commit b059bc7675
22 changed files with 427 additions and 37 deletions
--- a/docs/implplan/SPRINT_110_ingestion_evidence.md
+++ b/docs/implplan/SPRINT_110_ingestion_evidence.md
@@ -17,9 +17,11 @@ Active items only. Completed/historic work now resides in docs/implplan/archived
  - 2025-11-09: AIAI-31-009 remains DOING after converting the guardrail harness into JSON fixtures, expanding property/perf coverage, and validating offline cache seeding; remote inference packaging (AIAI-31-008) is still TODO until the policy knob work in AIAI-31-006..007 completes.
  - 2025-11-09: DOCS-AIAI-31-004 continues DOING—guardrail/offline sections are drafted, but screenshots plus copy blocks wait on CONSOLE-VULN-29-001, CONSOLE-VEX-30-001, and EXCITITOR-CONSOLE-23-001.
  - SBOM-AIAI-31-003 and DOCS-AIAI-31-005/006/008/009 remain BLOCKED pending SBOM-AIAI-31-001, CLI-VULN-29-001, CLI-VEX-30-001, POLICY-ENGINE-31-001, and DEVOPS-AIAI-31-001.
+  - 2025-11-10: AIAI-31-009 performance suite doubled dataset coverage (blocked phrase seed + perf scenarios) and now enforces sub-400 ms guardrail batches so Advisory AI can cite deterministic budgets.
 - **Concelier (110.B)** – `/advisories/{advisoryKey}/chunks` shipped on 2025-11-07 with tenant enforcement, chunk tuning knobs, and regression fixtures; structured field/caching work (CONCELIER-AIAI-31-002) is still TODO while telemetry/guardrail instrumentation (CONCELIER-AIAI-31-003) is DOING.
  - Air-gap provenance/staleness bundles (`CONCELIER-AIRGAP-56-001` → `CONCELIER-AIRGAP-58-001`), console views/deltas (`CONCELIER-CONSOLE-23-001..003`), and attestation metadata (`CONCELIER-ATTEST-73-001/002`) remain TODO pending Link-Not-Merge plus Cartographer schema delivery.
  - Connector provenance refreshes `FEEDCONN-ICSCISA-02-012` and `FEEDCONN-KISA-02-008` are still overdue, leaving evidence parity gaps for those feeds.
+  - 2025-11-10: CONCELIER-AIAI-31-003 shipped cache/request histograms + guardrail counters/log scopes; docs now map the new metrics for Advisory AI dashboards.
 - **Excititor (110.C)** – Normalized VEX justification projections (EXCITITOR-AIAI-31-001) are DOING as of 2025-11-09; the downstream chunk API (EXCITITOR-AIAI-31-002), telemetry/guardrails (EXCITITOR-AIAI-31-003), docs/OpenAPI alignment (EXCITITOR-AIAI-31-004), and attestation payload work (`EXCITITOR-ATTEST-*`) stay TODO until that projection work plus Link-Not-Merge schema land.
  - Mirror/air-gap backlog (`EXCITITOR-AIRGAP-56-001` .. `EXCITITOR-AIRGAP-58-001`) and connector provenance parity (`EXCITITOR-CONN-TRUST-01-001`) remain unscheduled, so Advisory AI cannot yet hydrate sealed VEX evidence or cite connector signatures.
 - **Mirror (110.D)** – MIRROR-CRT-56-001 (deterministic bundle assembler) has not kicked off, so DSSE/TUF (MIRROR-CRT-56-002), OCI exports (MIRROR-CRT-57-001), time anchors (MIRROR-CRT-57-002), CLI verbs (MIRROR-CRT-58-001), and Export Center automation (MIRROR-CRT-58-002) are all blocked.
--- a/docs/implplan/SPRINT_111_advisoryai.md
+++ b/docs/implplan/SPRINT_111_advisoryai.md
@@ -5,8 +5,18 @@ Active items only. Completed/historic work now resides in docs/implplan/archived
 [Ingestion & Evidence] 110.A) AdvisoryAI
 Depends on: Sprint 100.A - Attestor
 Summary: Ingestion & Evidence focus on AdvisoryAI.
+
 Task ID | State | Task description | Owners (Source)
 --- | --- | --- | ---
+DOCS-AIAI-31-006 | BLOCKED (2025-11-03) | Update `/docs/policy/assistant-parameters.md` covering temperature, token limits, ranking weights, TTLs. Dependencies: POLICY-ENGINE-31-001. | Docs Guild, Policy Guild (docs)
+DOCS-AIAI-31-008 | BLOCKED (2025-11-03) | Publish `/docs/sbom/remediation-heuristics.md` (feasibility scoring, blast radius). Dependencies: SBOM-AIAI-31-001. | Docs Guild, SBOM Service Guild (docs)
+DOCS-AIAI-31-009 | BLOCKED (2025-11-03) | Create `/docs/runbooks/assistant-ops.md` for warmup, cache priming, model outages, scaling. Dependencies: DEVOPS-AIAI-31-001. | Docs Guild, DevOps Guild (docs)
+SBOM-AIAI-31-003 | TODO (2025-11-03) | Publish the Advisory AI hand-off kit for `/v1/sbom/context`, share base URL/API key + tenant header contract, and run a joint end-to-end retrieval smoke test with Advisory AI. Dependencies: SBOM-AIAI-31-001. | SBOM Service Guild, Advisory AI Guild (src/SbomService/StellaOps.SbomService)
+AIAI-31-008 | TODO | Package inference on-prem container, remote inference toggle, Helm/Compose manifests, scaling guidance, offline kit instructions. Dependencies: AIAI-31-006..007. | Advisory AI Guild, DevOps Guild (src/AdvisoryAI/StellaOps.AdvisoryAI)
+AIAI-31-009 | DOING (2025-11-09) | Develop unit/golden/property/perf tests, injection harness, and regression suite; ensure determinism with seeded caches. Dependencies: AIAI-31-001..006. | Advisory AI Guild, QA Guild (src/AdvisoryAI/StellaOps.AdvisoryAI) |
+
+
+
 > 2025-11-03: WebService/Worker scaffolds created with in-memory cache/queue, minimal APIs (`/api/v1/advisory/plan`, `/api/v1/advisory/queue`), metrics counters, and plan cache instrumentation; worker processes queue using orchestrator.
 > 2025-11-04: SBOM base address now flows via `SbomContextClientOptions.BaseAddress`, worker emits queue/plan metrics, and orchestrator cache keys expanded to cover SBOM hash inputs.
 DOCS-AIAI-31-004 | DOING (2025-11-07) | Create `/docs/advisory-ai/console.md` with screenshots, a11y notes, copy-as-ticket instructions. Dependencies: CONSOLE-VULN-29-001, CONSOLE-VEX-30-001, EXCITITOR-CONSOLE-23-001. | Docs Guild, Console Guild (docs)
@@ -14,10 +24,6 @@ DOCS-AIAI-31-004 | DOING (2025-11-07) | Create `/docs/advisory-ai/console.md` wi
 > 2025-11-08: Console endpoints are staffed (CONSOLE-VULN-29-001 / CONSOLE-VEX-30-001 DOING); still waiting on EXCITITOR-CONSOLE-23-001 feeds before capturing screenshots/tests.
 > 2025-11-09: Guardrail/inference sections and offline playbooks documented; screenshot placeholders remain open.
 DOCS-AIAI-31-005 | BLOCKED (2025-11-03) | Publish `/docs/advisory-ai/cli.md` covering commands, exit codes, scripting patterns. Dependencies: CLI-VULN-29-001, CLI-VEX-30-001, AIAI-31-004C. | Docs Guild, DevEx/CLI Guild (docs)
-DOCS-AIAI-31-006 | BLOCKED (2025-11-03) | Update `/docs/policy/assistant-parameters.md` covering temperature, token limits, ranking weights, TTLs. Dependencies: POLICY-ENGINE-31-001. | Docs Guild, Policy Guild (docs)
-DOCS-AIAI-31-008 | BLOCKED (2025-11-03) | Publish `/docs/sbom/remediation-heuristics.md` (feasibility scoring, blast radius). Dependencies: SBOM-AIAI-31-001. | Docs Guild, SBOM Service Guild (docs)
-DOCS-AIAI-31-009 | BLOCKED (2025-11-03) | Create `/docs/runbooks/assistant-ops.md` for warmup, cache priming, model outages, scaling. Dependencies: DEVOPS-AIAI-31-001. | Docs Guild, DevOps Guild (docs)
-SBOM-AIAI-31-003 | TODO (2025-11-03) | Publish the Advisory AI hand-off kit for `/v1/sbom/context`, share base URL/API key + tenant header contract, and run a joint end-to-end retrieval smoke test with Advisory AI. Dependencies: SBOM-AIAI-31-001. | SBOM Service Guild, Advisory AI Guild (src/SbomService/StellaOps.SbomService)
 > 2025-11-03: DOCS-AIAI-31-003 moved to DOING – drafting Advisory AI API reference (endpoints, rate limits, error model) for sprint 110.
 > 2025-11-04: AIAI-31-005 DONE – guardrail pipeline redacts secrets, enforces citation/injection policies, emits block counters, and tests (`AdvisoryGuardrailPipelineTests`) cover redaction + citation validation.
 > 2025-11-03: DOCS-AIAI-31-003 marked DONE – `docs/advisory-ai/api.md` published with scopes, request/response schemas, rate limits, and error catalogue (Docs Guild).
@@ -31,12 +37,8 @@ SBOM-AIAI-31-003 | TODO (2025-11-03) | Publish the Advisory AI hand-off kit for
 > 2025-11-03: DOCS-AIAI-31-009 marked BLOCKED – DevOps runbook inputs (DEVOPS-AIAI-31-001) outstanding.
 > 2025-11-03: Shipped `/api/v1/advisory/{task}` execution and `/api/v1/advisory/outputs/{cacheKey}` retrieval endpoints with guardrail integration, provenance hashes, and metrics (RBAC & rate limiting still pending Authority scope delivery).
 > 2025-11-06: AIAI-31-007 completed – Advisory AI WebService/Worker emit latency histograms, guardrail/validation counters, citation coverage ratios, and OTEL spans; Grafana dashboard + burn-rate alerts refreshed.
-AIAI-31-008 | TODO | Package inference on-prem container, remote inference toggle, Helm/Compose manifests, scaling guidance, offline kit instructions. Dependencies: AIAI-31-006..007. | Advisory AI Guild, DevOps Guild (src/AdvisoryAI/StellaOps.AdvisoryAI)
-AIAI-31-009 | DOING (2025-11-09) | Develop unit/golden/property/perf tests, injection harness, and regression suite; ensure determinism with seeded caches. Dependencies: AIAI-31-001..006. | Advisory AI Guild, QA Guild (src/AdvisoryAI/StellaOps.AdvisoryAI)
+
 > 2025-11-09: Guardrail harness converted to JSON fixtures + legacy payloads, property-style plan cache load tests added, and file-system cache/output suites cover seeded/offline scenarios.
-
-
-
 > 2025-11-02: AIAI-31-004 kicked off orchestration pipeline design – establishing deterministic task sequence (summary/conflict/remediation) and cache key strategy.
 > 2025-11-02: AIAI-31-004 orchestration prerequisites documented in docs/modules/advisory-ai/orchestration-pipeline.md (tasks 004A/004B/004C).
 > 2025-11-02: AIAI-31-003 moved to DOING – beginning deterministic tooling (comparators, dependency analysis) while awaiting SBOM context client. Semantic & EVR comparators shipped; toolset interface published for orchestrator adoption.
--- a/docs/implplan/archived/SPRINT_137_scanner_gap_design.md
+++ b/docs/implplan/archived/SPRINT_137_scanner_gap_design.md
--- a/docs/implplan/execution-waves.md
+++ b/docs/implplan/execution-waves.md
@@ -9,9 +9,7 @@ Each wave groups sprints that declare the same leading dependency. Start waves o
 - Shared prerequisite(s): None (explicit)
 - Parallelism guidance: No upstream sprint recorded; confirm module AGENTS and readiness gates before parallel execution.
 - Sprints:
-  - SPRINT_110_ingestion_evidence.md — Sprint 110 - Ingestion & Evidence. Done.
  - SPRINT_130_scanner_surface.md — Sprint 130 - Scanner & Surface. Done.
-  - SPRINT_137_scanner_gap_design.md — Sprint 137 - Scanner & Surface. Done.
  - SPRINT_138_scanner_ruby_parity.md — Sprint 138 - Scanner & Surface. In progress.
  - SPRINT_140_runtime_signals.md — Sprint 140 - Runtime & Signals. In progress.
  - SPRINT_150_scheduling_automation.md — Sprint 150 - Scheduling & Automation
--- a/docs/observability/observability.md
+++ b/docs/observability/observability.md
@@ -150,6 +150,20 @@ Logs are shipped to the central Loki/Elasticsearch cluster. Use the template que

 to spot active AOC violations.

+### 1.3 · Advisory chunk API (Advisory AI feeds)
+
+Advisory AI now leans on Concelier’s `/advisories/{key}/chunks` endpoint for deterministic evidence packs. The service exports dedicated metrics so dashboards can highlight latency spikes, cache noise, or aggressive guardrail filtering before they impact Advisory AI responses.
+
+| Metric | Type | Labels | Description |
+| --- | --- | --- | --- |
+| `advisory_ai_chunk_requests_total` | Counter | `tenant`, `result`, `truncated`, `cache` | Count of chunk API calls, tagged with cache hits/misses and truncation state. |
+| `advisory_ai_chunk_latency_milliseconds` | Histogram | `tenant`, `result`, `truncated`, `cache` | End-to-end build latency (milliseconds) for each chunk request. |
+| `advisory_ai_chunk_segments` | Histogram | `tenant`, `result`, `truncated` | Number of chunk segments returned to the caller; watch for sudden drops tied to guardrails. |
+| `advisory_ai_chunk_sources` | Histogram | `tenant`, `result` | How many upstream observations/sources contributed to a response (after observation limits). |
+| `advisory_ai_guardrail_blocks_total` | Counter | `tenant`, `reason`, `cache` | Per-reason count of segments suppressed by guardrails (length, normalization, character set). |
+
+Dashboards should plot latency P95/P99 next to cache hit rates and guardrail block deltas to catch degradation early. Advisory AI CLI/Console surfaces the same metadata so support engineers can correlate with Grafana/Loki entries using `traceId`/`correlationId` headers.
+
 ---

 ## 4 · Dashboards
--- a/docs/updates/2025-11-07-concelier-advisory-chunks.md
+++ b/docs/updates/2025-11-07-concelier-advisory-chunks.md
@@ -9,4 +9,4 @@

 **Follow-ups**
 - [ ] CONCELIER-AIAI-31-002 – surface structured workaround/fix fields plus caching for downstream retrievers.
- [ ] CONCELIER-AIAI-31-003 – wire chunk request metrics/logs and guardrail telemetry once the API stabilizes.
+- [x] CONCELIER-AIAI-31-003 – wire chunk request metrics/logs and guardrail telemetry once the API stabilizes. (2025-11-10: request/latency/source histograms + structured guardrail logs shipped.)
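
The observability note added in this commit recommends plotting chunk latency P95/P99 next to cache hit rates. The following is a minimal, standalone sketch of that arithmetic and is not taken from the repository. The sample latencies, the tenant name, and the label values (`"hit"`, `"miss"`, `"ok"`) are hypothetical; only the label names (`tenant`, `result`, `truncated`, `cache`) come from the metric table. It uses a nearest-rank percentile over raw samples, whereas a real dashboard would normally estimate percentiles from histogram buckets.

```python
from collections import Counter


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def cache_hit_rate(requests):
    """Share of requests whose `cache` label equals "hit".

    `requests` is a Counter keyed by (tenant, result, truncated, cache),
    mirroring the labels listed for advisory_ai_chunk_requests_total.
    """
    total = sum(requests.values())
    hits = sum(n for (_, _, _, cache), n in requests.items() if cache == "hit")
    return hits / total if total else 0.0


# Hypothetical observations, for illustration only.
latencies_ms = [12, 15, 18, 20, 22, 25, 30, 45, 120, 380]
requests = Counter({
    ("tenant-a", "ok", "false", "hit"): 7,
    ("tenant-a", "ok", "false", "miss"): 2,
    ("tenant-a", "ok", "true", "miss"): 1,
})

p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
hit_rate = cache_hit_rate(requests)
print(f"p95={p95} ms, p99={p99} ms, cache hit rate={hit_rate:.0%}")
```

With only ten samples the P95 and P99 values collapse onto the slowest request, which is one reason to read tail latency alongside request volume rather than on its own.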