master 63c70a6d37 Search/AdvisoryAI and DAL conversion to EF finishes up. Preparation for microservices consolidation.

2026-02-25 18:19:22 +02:00

Unified Search Operations Runbook

Scope

Runbook for AdvisoryAI unified search setup, operations, troubleshooting, performance, and rollout control.

Setup

1. Configure AdvisoryAI:KnowledgeSearch:ConnectionString.
2. Configure AdvisoryAI:UnifiedSearch options.
3. Ensure model artifact path exists when VectorEncoderType=onnx:
   - default: models/all-MiniLM-L6-v2.onnx
4. Rebuild index:
   - POST /v1/search/index/rebuild
5. Verify query endpoint:
   - POST /v1/search/query with X-StellaOps-Tenant and advisory-ai:operate scope.
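
The rebuild and verification steps above can be scripted. The following is a minimal sketch using only the Python standard library. The base URL, the bearer-token authentication, and the `{"q": ...}` request body are assumptions for illustration; this runbook does not document the request schema, so adjust them to the actual API contract.

```python
import json
import urllib.request

# Hypothetical base URL: replace with your deployment's address.
BASE_URL = "https://advisory-ai.example.internal"


def build_request(path, tenant, token, body=None):
    """Build (but do not send) a POST request carrying the tenant header."""
    data = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-StellaOps-Tenant": tenant,
            # Assumption: bearer auth; the token must carry advisory-ai:operate.
            "Authorization": "Bearer " + token,
        },
    )


def send(request, timeout=30):
    """Send the request and return (HTTP status, decoded JSON body or None)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        return response.status, (json.loads(raw) if raw else None)


rebuild = build_request("/v1/search/index/rebuild", "tenant-alpha", "TOKEN")
# The {"q": ...} body shape is an assumption, not the documented schema.
query = build_request("/v1/search/query", "tenant-alpha", "TOKEN", {"q": "openssl"})
```

Calling `send(rebuild)` followed by `send(query)` would perform both steps; a non-2xx response raises `urllib.error.HTTPError`, which is a convenient signal for a missing tenant header or scope.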

Key Endpoints

POST /v1/search/query
POST /v1/search/synthesize
POST /v1/search/index/rebuild
POST /v1/advisory-ai/search/analytics
GET /v1/advisory-ai/search/quality/metrics
GET /v1/advisory-ai/search/quality/alerts

Monitoring

Track per-tenant and global:

Query throughput (query, click, zero_result, synthesis events)
P50/P95/P99 latency for /v1/search/query
Zero-result rate
Synthesis quota denials
Index size and rebuild duration
Active encoder diagnostics (diagnostics.activeEncoder)
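
As a rough illustration of how the latency percentiles and zero-result rate could be derived from a batch of analytics events, here is a small sketch. The event record shape (a dict with `type` and `latency_ms`) is hypothetical, and it assumes every search emits a `query` event with zero-result searches additionally emitting a `zero_result` event.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(events):
    """Summarize a batch of events for one tenant (or globally)."""
    latencies = [e["latency_ms"] for e in events if e["type"] == "query"]
    zero_results = sum(1 for e in events if e["type"] == "zero_result")
    return {
        "queries": len(latencies),
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "zero_result_rate": zero_results / len(latencies),
    }
```

Running `summarize` once per tenant and once over all events gives the per-tenant and global views listed above.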

Performance Targets

Instant results: P50 < 100ms, P95 < 200ms, P99 < 300ms
Full results (federated): P50 < 200ms, P95 < 500ms, P99 < 800ms
Deterministic synthesis: P50 < 30ms, P95 < 50ms
LLM synthesis: TTFB P50 < 1s, total P95 < 5s
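
The targets above can be encoded as data so that measured percentiles are checked mechanically. This is only a sketch: the profile keys (`instant`, `full_federated`, and so on) and metric names are hypothetical labels, while the millisecond limits are transcribed from the list above.

```python
# Targets in milliseconds, transcribed from the performance targets list.
TARGETS_MS = {
    "instant": {"p50": 100, "p95": 200, "p99": 300},
    "full_federated": {"p50": 200, "p95": 500, "p99": 800},
    "deterministic_synthesis": {"p50": 30, "p95": 50},
    "llm_synthesis": {"ttfb_p50": 1000, "total_p95": 5000},
}


def breaches(profile, measured_ms):
    """Return the metrics whose measured value is not strictly below its target."""
    targets = TARGETS_MS[profile]
    return sorted(
        name
        for name, limit in targets.items()
        if name in measured_ms and measured_ms[name] >= limit
    )
```

For example, `breaches("instant", {"p50": 80, "p95": 210, "p99": 300})` reports `p95` and `p99`, because the targets are strict upper bounds.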

SQL Query Tuning and EXPLAIN Evidence

Unified search read paths rely on:

FTS query over advisoryai.kb_chunk.body_tsv*
Trigram fuzzy fallback (% / similarity())
Vector nearest-neighbor (embedding_vec <=> query_vector)

Recommended validation commands:

EXPLAIN (ANALYZE, BUFFERS)
SELECT c.chunk_id
FROM advisoryai.kb_chunk c
WHERE c.body_tsv_en @@ websearch_to_tsquery('english', @query)
ORDER BY ts_rank_cd(c.body_tsv_en, websearch_to_tsquery('english', @query), 32) DESC, c.chunk_id
LIMIT 20;

EXPLAIN (ANALYZE, BUFFERS)
SELECT c.chunk_id
FROM advisoryai.kb_chunk c
WHERE c.embedding_vec IS NOT NULL
ORDER BY c.embedding_vec <=> CAST(@query_vector AS vector), c.chunk_id
LIMIT 20;

Index expectations:

idx_kb_chunk_body_tsv_en (GIN over body_tsv_en)
idx_kb_chunk_body_trgm (GIN trigram over body)
idx_kb_chunk_embedding_vec_hnsw (HNSW over embedding_vec)

Automated EXPLAIN evidence is captured by:

UnifiedSearchLiveAdapterIntegrationTests.PostgresKnowledgeSearchStore_ExplainAnalyze_ShowsIndexedSearchPlans
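
When reviewing EXPLAIN output by hand, a small helper can flag plans that miss the expected indexes. The sketch below is a simple text scan over plan output, not a reimplementation of the integration test named above; the helper name `inspect_plan` is hypothetical.

```python
import re

# Index names taken from the index expectations list.
EXPECTED_INDEXES = (
    "idx_kb_chunk_body_tsv_en",
    "idx_kb_chunk_body_trgm",
    "idx_kb_chunk_embedding_vec_hnsw",
)


def inspect_plan(plan_text):
    """Report which expected indexes a plan mentions and whether it scans kb_chunk sequentially."""
    used = [name for name in EXPECTED_INDEXES if name in plan_text]
    # Also matches "Parallel Seq Scan on kb_chunk".
    seq_scan = re.search(r"Seq Scan on kb_chunk\b", plan_text) is not None
    return {"indexes_used": used, "seq_scan_on_kb_chunk": seq_scan}
```

A plan for the FTS query would be expected to mention `idx_kb_chunk_body_tsv_en` with no sequential scan on `kb_chunk`; a sequential scan on a large table is usually the first thing to investigate.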

Load and Capacity Envelope

Validated test envelope (in-process benchmark harness):

50 concurrent requests sustained
P95 < 500ms, P99 < 800ms

Sizing guidance:

Up to 100k chunks: 2 vCPU / 4 GB RAM
100k-500k chunks: 4 vCPU / 8 GB RAM
More than 500k chunks or heavy synthesis: 8 vCPU / 16 GB RAM, split synthesis workers
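
The sizing tiers can be expressed as a small lookup, sketched below. Treating exactly 100k and exactly 500k chunks as belonging to the lower tier is an assumption, as is the `heavy_synthesis` flag name; the guidance does not define a threshold for "heavy".

```python
def recommend_sizing(chunk_count, heavy_synthesis=False):
    """Map an index size (in chunks) to the sizing tier from the guidance above."""
    if chunk_count < 0:
        raise ValueError("chunk_count must be non-negative")
    if heavy_synthesis or chunk_count > 500_000:
        return {"vcpu": 8, "ram_gb": 16, "split_synthesis_workers": True}
    if chunk_count > 100_000:
        return {"vcpu": 4, "ram_gb": 8, "split_synthesis_workers": False}
    return {"vcpu": 2, "ram_gb": 4, "split_synthesis_workers": False}
```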

Feature Flags and Rollout

Config path: AdvisoryAI:UnifiedSearch:TenantFeatureFlags

Enabled
FederationEnabled
SynthesisEnabled

Example:

{
  "AdvisoryAI": {
    "UnifiedSearch": {
      "TenantFeatureFlags": {
        "tenant-alpha": { "Enabled": true, "FederationEnabled": true, "SynthesisEnabled": false },
        "tenant-beta":  { "Enabled": true, "FederationEnabled": false, "SynthesisEnabled": false }
      }
    }
  }
}
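
To reason about what a given tenant actually gets, the flags can be resolved as in the sketch below. It assumes a feature is active only when both the global setting and the tenant flag allow it, and that a missing value on either side defaults to enabled. Both rules are assumptions for illustration; the real resolution logic lives in the service and may differ.

```python
FLAG_NAMES = ("Enabled", "FederationEnabled", "SynthesisEnabled")


def effective_flags(config, tenant):
    """Combine global and per-tenant flags; a feature is on only if both allow it."""
    unified = config["AdvisoryAI"]["UnifiedSearch"]
    tenant_flags = unified.get("TenantFeatureFlags", {}).get(tenant, {})
    result = {}
    for name in FLAG_NAMES:
        # Assumption: a missing global or tenant value defaults to True.
        global_on = unified.get(name, True)
        tenant_on = tenant_flags.get(name, True)
        result[name] = bool(global_on and tenant_on)
    return result
```

With the example configuration loaded via `json.load`, `effective_flags(config, "tenant-beta")` would show federation and synthesis both off for that tenant.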

Troubleshooting

Symptom: empty results

Verify tenant header is present.
Verify UnifiedSearch.Enabled and tenant flag Enabled.
Run index rebuild and check chunk count.

Symptom: poor semantic recall

Verify VectorEncoderType and active encoder diagnostics.
Confirm ONNX model path is accessible and valid.
Rebuild index after encoder switch.

Symptom: synthesis unavailable

Check SynthesisEnabled (global + tenant).
Check quota counters and provider configuration.

Symptom: high latency

Check federated backend timeout budget.
Review EXPLAIN (ANALYZE) plans.
Verify index health and cardinality growth by tenant.

Backup and Recovery

Unified index is derivable state.
Recovery sequence:
1. Restore primary domain systems (findings/vex/policy/docs sources).
2. Restore AdvisoryAI DB schema.
3. Trigger full index rebuild.
4. Validate with quality benchmark fast subset.

Validation Commands

# Fast PR-level quality gate
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkFastSubsetTests

# Full benchmark + tuning evidence
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkTests

# Performance envelope
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchPerformanceEnvelopeTests
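
The three commands above differ only in the test class, so a thin wrapper can select a suite by name. This is a sketch under stated assumptions: the suite keys (`fast`, `full`, `performance`) are hypothetical labels, and `run_suite` assumes it is invoked from the repository root with `dotnet` on the PATH.

```python
import subprocess

TEST_PROJECT = "src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj"
NAMESPACE = "StellaOps.AdvisoryAI.Tests.UnifiedSearch"

# Hypothetical short names mapped to the test classes listed above.
SUITES = {
    "fast": "UnifiedSearchQualityBenchmarkFastSubsetTests",
    "full": "UnifiedSearchQualityBenchmarkTests",
    "performance": "UnifiedSearchPerformanceEnvelopeTests",
}


def build_command(suite):
    """Return the dotnet test argument list for one of the suites."""
    return [
        "dotnet", "test", TEST_PROJECT,
        "--", "--filter-class", f"{NAMESPACE}.{SUITES[suite]}",
    ]


def run_suite(suite):
    """Run a suite and return the process exit code (0 means the gate passed)."""
    return subprocess.run(build_command(suite), check=False).returncode
```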
