
# Unified Search Operations Runbook

## Scope
Runbook for AdvisoryAI unified search setup, operations, troubleshooting, performance, and rollout control.

## Setup
1. Configure `AdvisoryAI:KnowledgeSearch:ConnectionString`.
2. Configure `AdvisoryAI:UnifiedSearch` options.
3. For live compose/runtime, set `AdvisoryAI:KnowledgeSearch:FindingsAdapterBaseUrl`, `...:VexAdapterBaseUrl`, and `...:PolicyAdapterBaseUrl` together so findings, VEX, and policy ingest from live services instead of partial fallback snapshots.
4. Ensure the published AdvisoryAI image carries the repo-shaped local corpus under `/app`, including `src/AdvisoryAI/StellaOps.AdvisoryAI/UnifiedSearch/Snapshots/{findings,vex,policy,graph,opsmemory,timeline,scanner}.snapshot.json`.
5. Ensure the model artifact path exists when `VectorEncoderType=onnx`:
   - default: `models/all-MiniLM-L6-v2.onnx`
6. Rebuild indexes in order when verifying live search quality:
   - `POST /v1/advisory-ai/index/rebuild`
   - `POST /v1/search/index/rebuild`
7. Verify the query endpoint:
   - `POST /v1/search/query` with the `X-StellaOps-Tenant` header and the `advisory-ai:operate` scope.
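
The rebuild-then-verify order above can be sketched as a small call plan. This is a minimal illustration, not the service's client: the base URL, the bearer-token transport for the `advisory-ai:operate` scope, sending the tenant header on the rebuild calls, and the `q` body field are all assumptions. Only the three paths and the `X-StellaOps-Tenant` header name come from this runbook.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PlannedRequest:
    """One HTTP call described as data, so any HTTP client can send it."""
    method: str
    url: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def plan_setup_verification(base_url, tenant, token, query_text):
    """Return the two rebuild calls and the query call in runbook order."""
    headers = {
        "X-StellaOps-Tenant": tenant,
        # Assumption: the advisory-ai:operate scope is carried by a bearer token.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    base = base_url.rstrip("/")
    return [
        PlannedRequest("POST", f"{base}/v1/advisory-ai/index/rebuild", dict(headers)),
        PlannedRequest("POST", f"{base}/v1/search/index/rebuild", dict(headers)),
        # Assumption: the "q" field name is hypothetical; check the real request contract.
        PlannedRequest(
            "POST",
            f"{base}/v1/search/query",
            dict(headers),
            json.dumps({"q": query_text}).encode("utf-8"),
        ),
    ]
```

Keeping the plan as plain data makes the ordering easy to review before anything is sent against a live environment.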

## Key Endpoints
- `POST /v1/search/query`
- `POST /v1/search/synthesize`
- `POST /v1/search/index/rebuild`
- `POST /v1/advisory-ai/search/analytics`
- `GET /v1/advisory-ai/search/quality/metrics`
- `GET /v1/advisory-ai/search/quality/alerts`

## Monitoring
Track the following per tenant and globally:
- Query throughput (`query`, `click`, `zero_result`, `synthesis` events)
- Self-serve journey signals (`answer_frame`, `reformulation`, `rescue_action`)
- P50/P95/P99 latency for `/v1/search/query`
- Zero-result rate
- Fallback answer rate, clarify rate, insufficient-evidence rate
- Reformulation count, rescue-action count, abandoned fallback count
- Synthesis quota denials
- Index size and rebuild duration
- Active encoder diagnostics (`diagnostics.activeEncoder`)
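
As an illustration of the per-tenant and global split, the zero-result rate could be derived from raw analytics events roughly as follows. The event record shape (`tenant` and `type` keys), the `*` aggregate key, and the definition of the rate as `zero_result` events divided by `query` events are assumptions made for this sketch.

```python
from collections import Counter, defaultdict

GLOBAL_KEY = "*"  # hypothetical key for the all-tenant aggregate


def zero_result_rates(events):
    """Return {tenant: zero_result / query}, plus one global entry under GLOBAL_KEY."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event["tenant"]][event["type"]] += 1
        counts[GLOBAL_KEY][event["type"]] += 1
    return {
        # A tenant with no query events reports 0.0 rather than dividing by zero.
        tenant: (c["zero_result"] / c["query"] if c["query"] else 0.0)
        for tenant, c in counts.items()
    }
```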

## Performance Targets
- Instant results: P50 < 100ms, P95 < 200ms, P99 < 300ms
- Full results (federated): P50 < 200ms, P95 < 500ms, P99 < 800ms
- Deterministic synthesis: P50 < 30ms, P95 < 50ms
- LLM synthesis: TTFB P50 < 1s, total P95 < 5s
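
A hedged sketch of checking measured latencies against these targets is shown below. It uses the nearest-rank percentile method, which is one common choice and not necessarily what the benchmark harness uses. The dictionary keys are hypothetical labels, and the LLM synthesis targets are left out because they mix TTFB and total time.

```python
# Targets in milliseconds, copied from the list above; the keys are hypothetical labels.
TARGETS_MS = {
    "instant": {50: 100, 95: 200, 99: 300},
    "full": {50: 200, 95: 500, 99: 800},
    "deterministic_synthesis": {50: 30, 95: 50},
}


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def violations(kind, samples_ms):
    """Return {percentile: observed_ms} for every target that is not strictly met."""
    observed = {p: percentile(samples_ms, p) for p in TARGETS_MS[kind]}
    return {p: v for p, v in observed.items() if v >= TARGETS_MS[kind][p]}
```

An empty result means every listed percentile is strictly below its target.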

## SQL Query Tuning and EXPLAIN Evidence
Unified search read paths rely on:
- FTS query over `advisoryai.kb_chunk.body_tsv*`
- Trigram fuzzy fallback (`%` / `similarity()`)
- Vector nearest-neighbor (`embedding_vec <=> query_vector`)

Recommended validation commands:
```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.chunk_id
FROM advisoryai.kb_chunk c
WHERE c.body_tsv_en @@ websearch_to_tsquery('english', @query)
ORDER BY ts_rank_cd(c.body_tsv_en, websearch_to_tsquery('english', @query), 32) DESC, c.chunk_id
LIMIT 20;
```

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.chunk_id
FROM advisoryai.kb_chunk c
WHERE c.embedding_vec IS NOT NULL
ORDER BY c.embedding_vec <=> CAST(@query_vector AS vector), c.chunk_id
LIMIT 20;
```

Index expectations:
- `idx_kb_chunk_body_tsv_en` (GIN over `body_tsv_en`)
- `idx_kb_chunk_body_trgm` (GIN trigram over `body`)
- `idx_kb_chunk_embedding_vec_hnsw` (HNSW over `embedding_vec`)

Automated EXPLAIN evidence is captured by:
- `UnifiedSearchLiveAdapterIntegrationTests.PostgresKnowledgeSearchStore_ExplainAnalyze_ShowsIndexedSearchPlans`
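
When reviewing plans by hand, a small text check can flag the two most common regressions: the expected index is absent from the plan, or the planner fell back to a sequential scan. The sketch below works on the default text output of `EXPLAIN` and only does string matching; the helper names are hypothetical and it is not a plan parser.

```python
import re

# Index names from the expectations listed above.
EXPECTED_INDEXES = (
    "idx_kb_chunk_body_tsv_en",
    "idx_kb_chunk_body_trgm",
    "idx_kb_chunk_embedding_vec_hnsw",
)


def plan_uses_index(plan_text, index_name):
    """True if the index name appears as a whole word anywhere in the plan text."""
    return re.search(rf"\b{re.escape(index_name)}\b", plan_text) is not None


def has_seq_scan(plan_text, table="kb_chunk"):
    """True if the plan contains a (possibly parallel) sequential scan on the table."""
    return f"Seq Scan on {table}" in plan_text
```

Each query is expected to hit only one of the indexes, so check the plan against the index that matches the read path being tested.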

## Load and Capacity Envelope
Validated test envelope (in-process benchmark harness):
- 50 concurrent requests sustained
- P95 < 500ms, P99 < 800ms

Sizing guidance:
- Up to 100k chunks: 2 vCPU / 4 GB RAM
- 100k-500k chunks: 4 vCPU / 8 GB RAM
- >500k chunks or heavy synthesis: 8 vCPU / 16 GB RAM, split synthesis workers
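
The tiers above can be expressed as a simple lookup, for example in a capacity-planning script. The function name and the `heavy_synthesis` parameter are hypothetical, and the boundary handling (exactly 100k stays in the smallest tier, exactly 500k stays in the middle tier) is one reading of the ranges.

```python
def sizing_tier(chunk_count, heavy_synthesis=False):
    """Map a chunk count to the sizing guidance tiers from this runbook."""
    if heavy_synthesis or chunk_count > 500_000:
        return "8 vCPU / 16 GB RAM, split synthesis workers"
    if chunk_count > 100_000:
        return "4 vCPU / 8 GB RAM"
    return "2 vCPU / 4 GB RAM"
```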

## Feature Flags and Rollout
Config path: `AdvisoryAI:UnifiedSearch:TenantFeatureFlags`
- `Enabled`
- `FederationEnabled`
- `SynthesisEnabled`

Example:
```json
{
  "AdvisoryAI": {
    "UnifiedSearch": {
      "TenantFeatureFlags": {
        "tenant-alpha": { "Enabled": true, "FederationEnabled": true, "SynthesisEnabled": false },
        "tenant-beta":  { "Enabled": true, "FederationEnabled": false, "SynthesisEnabled": false }
      }
    }
  }
}
```
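
To reason about what a tenant actually gets during rollout, the combination of global and tenant flags can be modeled as below. This is a sketch under stated assumptions, not the service's resolution logic: it assumes a feature is on only when both the global flag and the tenant flag are on, and that a tenant with no entry inherits the global value. Verify both against the real implementation before relying on it.

```python
FLAG_NAMES = ("Enabled", "FederationEnabled", "SynthesisEnabled")


def effective_flags(global_flags, tenant_flags, tenant):
    """Combine global flags with one tenant's entry from TenantFeatureFlags."""
    overrides = tenant_flags.get(tenant, {})
    return {
        # Assumption: global AND tenant; a missing tenant value defers to global.
        name: bool(global_flags.get(name, False)) and bool(overrides.get(name, True))
        for name in FLAG_NAMES
    }
```

With the example configuration above and all global flags on, `tenant-alpha` would resolve to federation on and synthesis off under this model.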

## Troubleshooting
### Symptom: empty results
- Verify tenant header is present.
- Verify that both `UnifiedSearch.Enabled` and the tenant flag `Enabled` are set to `true`.
- Run index rebuild and check chunk count.
- If suggestions also fail, verify both rebuild steps were run in order and re-check with a known live query such as `database connectivity`.
- If only findings answer lanes work while VEX/policy/graph/OpsMemory remain corpus-unready, verify the published snapshot files exist under `/app/src/AdvisoryAI/StellaOps.AdvisoryAI/UnifiedSearch/Snapshots/` and confirm the VEX/policy adapter base URLs are configured in runtime env.

### Symptom: poor semantic recall
- Verify `VectorEncoderType` and active encoder diagnostics.
- Confirm ONNX model path is accessible and valid.
- Rebuild index after encoder switch.

### Symptom: synthesis unavailable
- Check `SynthesisEnabled` (global + tenant).
- Check quota counters and provider configuration.

### Symptom: self-serve search feels weak
- Inspect `GET /v1/advisory-ai/search/quality/metrics?period=7d`.
- Watch `fallbackAnswerRate`, `clarifyRate`, `insufficientRate`, `reformulationCount`, `rescueActionCount`, and `abandonedFallbackCount`.
- Inspect `GET /v1/advisory-ai/search/quality/alerts` for `fallback_loop` and `abandoned_fallback`.
- Treat repeated fallback loops as ranking/context gaps; treat abandoned fallback sessions as UX/product gaps.

### Symptom: high latency
- Check federated backend timeout budget.
- Review `EXPLAIN (ANALYZE)` plans.
- Verify index health and cardinality growth by tenant.

## Backup and Recovery
- The unified index is derivable state: it is rebuilt from the source systems rather than restored from its own backup.
- Recovery sequence:
  1. Restore primary domain systems (findings/vex/policy/docs sources).
  2. Restore AdvisoryAI DB schema.
  3. Trigger full index rebuild.
  4. Validate with the quality benchmark fast subset (`UnifiedSearchQualityBenchmarkFastSubsetTests`).

## Validation Commands
```bash
# Fast PR-level quality gate
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkFastSubsetTests

# Full benchmark + tuning evidence
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchQualityBenchmarkTests

# Performance envelope
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj \
  -- --filter-class StellaOps.AdvisoryAI.Tests.UnifiedSearch.UnifiedSearchPerformanceEnvelopeTests

# Self-serve telemetry and gap surfacing slice
dotnet build src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj -v minimal
src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/bin/Debug/net10.0/StellaOps.AdvisoryAI.Tests.exe \
  -method "StellaOps.AdvisoryAI.Tests.Integration.UnifiedSearchSprintIntegrationTests.G10_SelfServeMetrics_IncludeFallbackReformulationAndRescueSignals" \
  -method "StellaOps.AdvisoryAI.Tests.Integration.UnifiedSearchSprintIntegrationTests.G10_RecoveredFallbackSessions_DoNotCountAsAbandoned" \
  -reporter verbose -noColor
```