Search/AdvisoryAI and DAL conversion to EF finishes up. Preparation for microservices consolidation.

2026-02-25 18:19:22 +02:00
parent 4db038123b
commit 63c70a6d37
447 changed files with 52257 additions and 2636 deletions
--- a/docs/operations/unified-search-operations.md
+++ b/docs/operations/unified-search-operations.md
@@ -0,0 +1,143 @@
+# Unified Search Operations Runbook
+
+## Scope
+Runbook for AdvisoryAI unified search setup, operations, troubleshooting, performance, and rollout control.
+
+## Setup
+1. Configure `AdvisoryAI:KnowledgeSearch:ConnectionString`.
+2. Configure `AdvisoryAI:UnifiedSearch` options.
+3. Ensure model artifact path exists when `VectorEncoderType=onnx`:
+   - default: `models/all-MiniLM-L6-v2.onnx`
+4. Rebuild index:
+   - `POST /v1/search/index/rebuild`
+5. Verify query endpoint:
+   - `POST /v1/search/query` with the `X-StellaOps-Tenant` header and the `advisory-ai:operate` scope.
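The verification call in step 5 can be sketched as below. This is a minimal illustration, not the service's documented contract: the request body field `q`, the bearer-token scheme, and the base URL are assumptions. Only the path and the `X-StellaOps-Tenant` header come from the runbook. The function builds the request without sending it, so it can be inspected first.

```python
import json
import urllib.request


def build_query_request(base_url: str, tenant: str, token: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/search/query request."""
    # "q" is a hypothetical field name; check the real request schema.
    body = json.dumps({"q": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/search/query",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-StellaOps-Tenant": tenant,
            # Assumed auth scheme; the token must carry advisory-ai:operate.
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. A non-2xx response raises `urllib.error.HTTPError`, which is usually enough to tell a missing tenant header from a missing scope.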
+
+## Key Endpoints
+- `POST /v1/search/query`
+- `POST /v1/search/synthesize`
+- `POST /v1/search/index/rebuild`
+- `POST /v1/advisory-ai/search/analytics`
+- `GET /v1/advisory-ai/search/quality/metrics`
+- `GET /v1/advisory-ai/search/quality/alerts`
+
+## Monitoring
+Track per-tenant and global:
+- Query throughput (`query`, `click`, `zero_result`, `synthesis` events)
+- P50/P95/P99 latency for `/v1/search/query`
+- Zero-result rate
+- Synthesis quota denials
+- Index size and rebuild duration
+- Active encoder diagnostics (`diagnostics.activeEncoder`)
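As a rough sketch of the per-tenant zero-result rate, the function below aggregates `(tenant, event_type)` pairs. It assumes that every search emits a `query` event and that a `zero_result` event is emitted in addition when nothing is returned. The tuple shape is hypothetical and does not reflect the analytics endpoint's payload.

```python
from collections import Counter, defaultdict


def zero_result_rate(events):
    """Map each tenant to zero_result events divided by query events."""
    counts = defaultdict(Counter)
    for tenant, event_type in events:
        counts[tenant][event_type] += 1

    rates = {}
    for tenant, per_type in counts.items():
        queries = per_type["query"]
        # Tenants with no query events get 0.0 instead of a division error.
        rates[tenant] = per_type["zero_result"] / queries if queries else 0.0
    return rates
```

A global rate can be obtained the same way by mapping every event to a single tenant key before calling the function.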
+
+## Performance Targets
+- Instant results: P50 < 100ms, P95 < 200ms, P99 < 300ms
+- Full results (federated): P50 < 200ms, P95 < 500ms, P99 < 800ms
+- Deterministic synthesis: P50 < 30ms, P95 < 50ms
+- LLM synthesis: TTFB P50 < 1s, total P95 < 5s
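The latency targets above can be checked against collected samples with a small helper. The sketch below uses the nearest-rank percentile definition, which is an assumption: the project's dashboards may interpolate differently. Only the instant and full-result targets are encoded, and the mode names `instant` and `full` are illustrative.

```python
import math

# Targets in milliseconds, copied from the instant and full-result lines above.
TARGETS_MS = {
    "instant": {50: 100, 95: 200, 99: 300},
    "full": {50: 200, 95: 500, 99: 800},
}


def percentile(samples, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n) in sorted order."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def violations(samples_ms, mode):
    """Return {percentile: observed_ms} for every target that is not met."""
    missed = {}
    for p, limit in TARGETS_MS[mode].items():
        observed = percentile(samples_ms, p)
        if not observed < limit:  # targets are strict "<" bounds
            missed[p] = observed
    return missed
```

An empty result from `violations` means all three percentiles for that mode are inside the target.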
+
+## SQL Query Tuning and EXPLAIN Evidence
+Unified search read paths rely on:
+- FTS query over `advisoryai.kb_chunk.body_tsv*`
+- Trigram fuzzy fallback (`%` / `similarity()`)
+- Vector nearest-neighbor (`embedding_vec <=> query_vector`)
+
+Recommended validation commands:
+```sql
+-- @query is a client-side named parameter; substitute a literal value when running this in psql.
+EXPLAIN (ANALYZE, BUFFERS)
+SELECT c.chunk_id
+FROM advisoryai.kb_chunk c
+WHERE c.body_tsv_en @@ websearch_to_tsquery('english', @query)
+ORDER BY ts_rank_cd(c.body_tsv_en, websearch_to_tsquery('english', @query), 32) DESC, c.chunk_id
+LIMIT 20;
+```
+
+```sql
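+-- @query_vector is a client-side named parameter; substitute a vector literal when running this in psql.
+EXPLAIN (ANALYZE, BUFFERS)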
+SELECT c.chunk_id
+FROM advisoryai.kb_chunk c
+WHERE c.embedding_vec IS NOT NULL
+ORDER BY c.embedding_vec <=> CAST(@query_vector AS vector), c.chunk_id
+LIMIT 20;
+```
+
+Index expectations:
+- `idx_kb_chunk_body_tsv_en` (GIN over `body_tsv_en`)
+- `idx_kb_chunk_body_trgm` (GIN trigram over `body`)
+- `idx_kb_chunk_embedding_vec_hnsw` (HNSW over `embedding_vec`)
+
+Automated EXPLAIN evidence is captured by:
+- `UnifiedSearchLiveAdapterIntegrationTests.PostgresKnowledgeSearchStore_ExplainAnalyze_ShowsIndexedSearchPlans`
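When reviewing a plan by hand, a plain text check is often enough. The sketch below is not the logic of the integration test named above. It assumes text-format `EXPLAIN` output, where a sequential scan appears as `Seq Scan on kb_chunk`, and reports when the expected index name is absent.

```python
# Index names taken from the "Index expectations" list above.
EXPECTED_INDEXES = (
    "idx_kb_chunk_body_tsv_en",
    "idx_kb_chunk_body_trgm",
    "idx_kb_chunk_embedding_vec_hnsw",
)


def plan_findings(plan_text: str, expected_index: str) -> list[str]:
    """Return human-readable problems found in a text-format EXPLAIN plan."""
    findings = []
    if expected_index not in plan_text:
        findings.append(f"index {expected_index} not used")
    if "Seq Scan on kb_chunk" in plan_text:
        findings.append("sequential scan on kb_chunk")
    return findings
```

An empty list means the plan mentions the expected index and contains no sequential scan of `kb_chunk`. It says nothing about whether the row estimates or timings are healthy.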
+
+## Load and Capacity Envelope
+Validated test envelope (in-process benchmark harness):
+- 50 concurrent requests sustained
+- P95 < 500ms, P99 < 800ms
+
+Sizing guidance:
+- Up to 100k chunks: 2 vCPU / 4 GB RAM
+- 100k-500k chunks: 4 vCPU / 8 GB RAM
+- >500k chunks or heavy synthesis: 8 vCPU / 16 GB RAM, and run synthesis workers separately
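The sizing tiers can be expressed as a lookup, for example in a capacity-planning script. The boundary handling is an assumption: exactly 100k chunks is placed in the smallest tier and exactly 500k in the middle tier, since the guidance does not say which side the boundaries fall on.

```python
def recommended_size(chunk_count: int, heavy_synthesis: bool = False) -> tuple[int, int]:
    """Return (vCPU, RAM in GB) for a given index size, per the tiers above."""
    if heavy_synthesis or chunk_count > 500_000:
        return (8, 16)
    if chunk_count > 100_000:
        return (4, 8)
    return (2, 4)
```

The function does not model the advice to separate synthesis workers. It only returns the node size for the top tier.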
+
+## Feature Flags and Rollout
+Config path: `AdvisoryAI:UnifiedSearch:TenantFeatureFlags`
+- `Enabled`
+- `FederationEnabled`
+- `SynthesisEnabled`
+
+Example:
+```json
+{
+  "AdvisoryAI": {
+    "UnifiedSearch": {
+      "TenantFeatureFlags": {
+        "tenant-alpha": { "Enabled": true, "FederationEnabled": true, "SynthesisEnabled": false },
+        "tenant-beta":  { "Enabled": true, "FederationEnabled": false, "SynthesisEnabled": false }
+      }
+    }
+  }
+}
+```
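To confirm what a given tenant would get from a configuration file like the one above, the flags can be read with a short script. This sketch treats a missing tenant or a missing flag as `False`, which is an assumption: the service's real default may differ. It also does not combine tenant flags with any global setting.

```python
import json

FLAG_NAMES = ("Enabled", "FederationEnabled", "SynthesisEnabled")


def tenant_flags(config: dict, tenant: str) -> dict:
    """Read the three rollout flags for one tenant from a parsed config."""
    section = config.get("AdvisoryAI", {}).get("UnifiedSearch", {})
    per_tenant = section.get("TenantFeatureFlags", {}).get(tenant, {})
    # Assumed default: anything not explicitly set counts as disabled.
    return {name: bool(per_tenant.get(name, False)) for name in FLAG_NAMES}


def load_tenant_flags(path: str, tenant: str) -> dict:
    """Convenience wrapper that parses a JSON config file first."""
    with open(path, encoding="utf-8") as handle:
        return tenant_flags(json.load(handle), tenant)
```

Calling `tenant_flags` on the example configuration for `tenant-alpha` yields federation enabled and synthesis disabled, matching the JSON above.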
+
+## Troubleshooting
+### Symptom: empty results
+- Verify tenant header is present.
+- Verify that both the global `UnifiedSearch.Enabled` setting and the tenant-level `Enabled` flag are true.
+- Run an index rebuild (`POST /v1/search/index/rebuild`) and check the chunk count.
+
+### Symptom: poor semantic recall
+- Verify `VectorEncoderType` and active encoder diagnostics.
+- Confirm ONNX model path is accessible and valid.
+- Rebuild the index after switching encoders, since vectors produced by different encoders are not comparable.
+
+### Symptom: synthesis unavailable
+- Check `SynthesisEnabled` at both the global and the tenant level.
+- Check quota counters and provider configuration.
+
+### Symptom: high latency
+- Check federated backend timeout budget.
+- Review `EXPLAIN (ANALYZE)` plans.
+- Verify index health and cardinality growth by tenant.
+
+## Backup and Recovery
+- The unified index is derivable state: it can be rebuilt from the primary domain systems.
+- Recovery sequence:
+  1. Restore primary domain systems (findings/vex/policy/docs sources).
+  2. Restore AdvisoryAI DB schema.
+  3. Trigger full index rebuild.
+  4. Validate with the fast subset of the quality benchmark (`UnifiedSearchQualityBenchmarkFastSubsetTests`).
+
+## Validation Commands
+```bash
+# Fast PR-level quality gate
+dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
+  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkFastSubsetTests
+
+# Full benchmark + tuning evidence
+dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
+  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkTests
+
+# Performance envelope
+dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
+  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchPerformanceEnvelopeTests
+```