Unified Search Operations Runbook

Scope

Runbook for AdvisoryAI unified search setup, operations, troubleshooting, performance, and rollout control.

Setup

  1. Configure AdvisoryAI:KnowledgeSearch:ConnectionString.
  2. Configure AdvisoryAI:UnifiedSearch options.
  3. Ensure model artifact path exists when VectorEncoderType=onnx:
    • default: models/all-MiniLM-L6-v2.onnx
  4. Rebuild index:
    • POST /v1/search/index/rebuild
  5. Verify query endpoint:
    • POST /v1/search/query with the X-StellaOps-Tenant header and credentials carrying the advisory-ai:operate scope.
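
Steps 4 and 5 can be scripted. The Python sketch below only constructs the two requests and does not send them. The base URL, the bearer-token mechanism, the tenant header on the rebuild call, and the `{"q": ...}` body shape are assumptions for illustration; only the endpoint paths and the X-StellaOps-Tenant header name come from this runbook.

```python
import json
import urllib.request

BASE_URL = "https://advisory-ai.example.internal"  # hypothetical host


def build_request(path, tenant, token, body=None):
    """Build (but do not send) a POST request for a unified search endpoint."""
    data = json.dumps(body or {}).encode("utf-8")
    req = urllib.request.Request(BASE_URL + path, data=data, method="POST")
    req.add_header("Content-Type", "application/json")
    req.add_header("X-StellaOps-Tenant", tenant)
    # Assumption: the advisory-ai:operate scope is carried by a bearer token.
    req.add_header("Authorization", "Bearer " + token)
    return req


# Step 4: rebuild the index (sending the tenant header here is an assumption).
rebuild = build_request("/v1/search/index/rebuild", "tenant-alpha", "TOKEN")

# Step 5: verify the query endpoint. The body shape is hypothetical;
# check the actual API contract before relying on it.
query = build_request("/v1/search/query", "tenant-alpha", "TOKEN", {"q": "example"})

# To send a request: urllib.request.urlopen(rebuild, timeout=30)
```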

Key Endpoints

  • POST /v1/search/query
  • POST /v1/search/synthesize
  • POST /v1/search/index/rebuild
  • POST /v1/advisory-ai/search/analytics
  • GET /v1/advisory-ai/search/quality/metrics
  • GET /v1/advisory-ai/search/quality/alerts

Monitoring

Track the following both per tenant and globally:

  • Query throughput (query, click, zero_result, synthesis events)
  • P50/P95/P99 latency for /v1/search/query
  • Zero-result rate
  • Synthesis quota denials
  • Index size and rebuild duration
  • Active encoder diagnostics (diagnostics.activeEncoder)
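
As an illustration of the per-tenant view, the following Python sketch derives a zero-result rate from a stream of analytics events. It assumes each event is available as a `(tenant, event_type)` pair and that a zero-result search emits both a `query` and a `zero_result` event; neither detail is confirmed by this runbook.

```python
from collections import Counter, defaultdict


def zero_result_rates(events):
    """Return {tenant: zero_result / query} for an iterable of (tenant, event_type)."""
    counts = defaultdict(Counter)
    for tenant, event_type in events:
        counts[tenant][event_type] += 1

    rates = {}
    for tenant, per_type in counts.items():
        # Skip tenants without query events to avoid dividing by zero.
        if per_type["query"]:
            rates[tenant] = per_type["zero_result"] / per_type["query"]
    return rates
```

A global rate can be computed the same way by passing every event with a single placeholder tenant value.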

Performance Targets

  • Instant results: P50 < 100ms, P95 < 200ms, P99 < 300ms
  • Full results (federated): P50 < 200ms, P95 < 500ms, P99 < 800ms
  • Deterministic synthesis: P50 < 30ms, P95 < 50ms
  • LLM synthesis: TTFB P50 < 1s, total P95 < 5s
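
A minimal Python sketch of how measured latencies could be checked against these targets. The profile names and the nearest-rank percentile method are choices made for this example, not part of the service. LLM synthesis is left out because its targets mix two different measurements (time to first byte and total time).

```python
import math

# Targets in milliseconds, copied from the list above.
TARGETS_MS = {
    "instant": {50: 100, 95: 200, 99: 300},
    "full_federated": {50: 200, 95: 500, 99: 800},
    "deterministic_synthesis": {50: 30, 95: 50},
}


def percentile(samples, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def violations(samples_ms, profile):
    """Return {percentile: observed_ms} for every target the samples miss."""
    failed = {}
    for p, limit in TARGETS_MS[profile].items():
        observed = percentile(samples_ms, p)
        if not observed < limit:  # targets are strict upper bounds
            failed[p] = observed
    return failed
```

An empty result from `violations` means every percentile target for that profile was met by the sample set.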

SQL Query Tuning and EXPLAIN Evidence

Unified search read paths rely on:

  • FTS query over advisoryai.kb_chunk.body_tsv*
  • Trigram fuzzy fallback (% / similarity())
  • Vector nearest-neighbor (embedding_vec <=> query_vector)

Recommended validation commands:

-- @query and @query_vector are named parameters; substitute literal values when running in psql.

-- Full-text search path
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.chunk_id
FROM advisoryai.kb_chunk c
WHERE c.body_tsv_en @@ websearch_to_tsquery('english', @query)
ORDER BY ts_rank_cd(c.body_tsv_en, websearch_to_tsquery('english', @query), 32) DESC, c.chunk_id
LIMIT 20;

-- Vector nearest-neighbor path
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.chunk_id
FROM advisoryai.kb_chunk c
WHERE c.embedding_vec IS NOT NULL
ORDER BY c.embedding_vec <=> CAST(@query_vector AS vector), c.chunk_id
LIMIT 20;

Index expectations:

  • idx_kb_chunk_body_tsv_en (GIN over body_tsv_en)
  • idx_kb_chunk_body_trgm (GIN trigram over body)
  • idx_kb_chunk_embedding_vec_hnsw (HNSW over embedding_vec)

Automated EXPLAIN evidence is captured by:

  • UnifiedSearchLiveAdapterIntegrationTests.PostgresKnowledgeSearchStore_ExplainAnalyze_ShowsIndexedSearchPlans
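
For ad hoc checks outside the test suite, a small helper can scan text-format EXPLAIN output for the expected index names. This is a sketch that assumes the plan was captured as plain text; it is not how the integration test above works, and the helper names are hypothetical.

```python
import re

# Index names taken from the "Index expectations" list.
EXPECTED_INDEXES = (
    "idx_kb_chunk_body_tsv_en",
    "idx_kb_chunk_body_trgm",
    "idx_kb_chunk_embedding_vec_hnsw",
)


def indexes_used(plan_text, expected=EXPECTED_INDEXES):
    """Return the expected index names that appear in the plan text."""
    return [name for name in expected if name in plan_text]


def has_seq_scan(plan_text, table="kb_chunk"):
    """True if the plan contains a sequential scan on the table.

    Matches both the plain form ("Seq Scan on kb_chunk") and the
    schema-qualified form that EXPLAIN VERBOSE prints.
    """
    pattern = r"Seq Scan on (\w+\.)?" + re.escape(table) + r"\b"
    return re.search(pattern, plan_text) is not None
```

A plan for one read path would normally list exactly one of the three indexes; a sequential scan on kb_chunk is the signal worth investigating.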

Load and Capacity Envelope

Validated test envelope (in-process benchmark harness):

  • 50 concurrent requests sustained
  • P95 < 500ms, P99 < 800ms

Sizing guidance:

  • Up to 100k chunks: 2 vCPU / 4 GB RAM
  • 100k-500k chunks: 4 vCPU / 8 GB RAM
  • More than 500k chunks or heavy synthesis: 8 vCPU / 16 GB RAM, split synthesis workers
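
The tiers can be expressed as a simple lookup. In this Python sketch the function name is hypothetical and the boundary handling (exactly 100k falls in the smallest tier, exactly 500k in the middle tier) is an assumption, since the guidance does not say which side the boundaries belong to.

```python
def recommended_size(chunk_count, heavy_synthesis=False):
    """Map a chunk count to a (vCPU, RAM in GB) tier from the sizing guidance."""
    if chunk_count < 0:
        raise ValueError("chunk_count must be non-negative")
    # Heavy synthesis moves a deployment to the largest tier regardless of size.
    if heavy_synthesis or chunk_count > 500_000:
        return (8, 16)
    if chunk_count > 100_000:
        return (4, 8)
    return (2, 4)
```

The largest tier additionally calls for splitting out synthesis workers, which a sizing tuple alone does not capture.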

Feature Flags and Rollout

Config path: AdvisoryAI:UnifiedSearch:TenantFeatureFlags

  • Enabled
  • FederationEnabled
  • SynthesisEnabled

Example:

{
  "AdvisoryAI": {
    "UnifiedSearch": {
      "TenantFeatureFlags": {
        "tenant-alpha": { "Enabled": true, "FederationEnabled": true, "SynthesisEnabled": false },
        "tenant-beta":  { "Enabled": true, "FederationEnabled": false, "SynthesisEnabled": false }
      }
    }
  }
}
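
To show how the nested config path resolves, here is a Python sketch that reads a tenant's flags from the example above. The fallback to `False` for an unlisted tenant or missing flag is an assumption of this sketch; the service's actual default is not stated in this runbook.

```python
import json

FLAG_NAMES = ("Enabled", "FederationEnabled", "SynthesisEnabled")

CONFIG = json.loads("""
{
  "AdvisoryAI": {
    "UnifiedSearch": {
      "TenantFeatureFlags": {
        "tenant-alpha": { "Enabled": true, "FederationEnabled": true, "SynthesisEnabled": false },
        "tenant-beta":  { "Enabled": true, "FederationEnabled": false, "SynthesisEnabled": false }
      }
    }
  }
}
""")


def tenant_flags(config, tenant, default=False):
    """Look up a tenant's flags under AdvisoryAI:UnifiedSearch:TenantFeatureFlags."""
    flags = (
        config.get("AdvisoryAI", {})
        .get("UnifiedSearch", {})
        .get("TenantFeatureFlags", {})
        .get(tenant, {})
    )
    # Assumption: a missing tenant or flag falls back to `default`.
    return {name: bool(flags.get(name, default)) for name in FLAG_NAMES}
```

For example, `tenant_flags(CONFIG, "tenant-beta")` reports search enabled with federation and synthesis off, matching a staged rollout where federation is turned on for one tenant first.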

Troubleshooting

Symptom: empty results

  • Verify tenant header is present.
  • Verify that both the global UnifiedSearch.Enabled setting and the tenant's Enabled flag are true.
  • Run index rebuild and check chunk count.

Symptom: poor semantic recall

  • Verify VectorEncoderType and active encoder diagnostics.
  • Confirm ONNX model path is accessible and valid.
  • Rebuild index after encoder switch.

Symptom: synthesis unavailable

  • Check SynthesisEnabled at both the global and the tenant level.
  • Check quota counters and provider configuration.

Symptom: high latency

  • Check federated backend timeout budget.
  • Review EXPLAIN (ANALYZE) plans.
  • Verify index health and cardinality growth by tenant.

Backup and Recovery

  • The unified index is derivable state: it can be rebuilt from the primary domain systems.
  • Recovery sequence:
    1. Restore primary domain systems (findings/vex/policy/docs sources).
    2. Restore AdvisoryAI DB schema.
    3. Trigger full index rebuild.
    4. Validate with quality benchmark fast subset.

Validation Commands

# Fast PR-level quality gate
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkFastSubsetTests

# Full benchmark + tuning evidence
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkTests

# Performance envelope
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchPerformanceEnvelopeTests