# Unified Search Architecture

This document defines the architecture for AdvisoryAI unified search (Sprint 100, Phase 4 hardening).

## Goals
- Help operators and users unfamiliar with Stella Ops terminology find relevant results quickly.
- Merge platform knowledge, findings, VEX, policy, graph, timeline, scanner, and OpsMemory signals into one deterministic ranking stream.
- Keep the system offline-capable and tenant-safe.

## Four-Layer Architecture

```mermaid
flowchart LR
    Q[Layer 1: Query Understanding]
    R[Layer 2: Federated Retrieval]
    F[Layer 3: Fusion and Entity Cards]
    S[Layer 4: Synthesis]

    Q --> R --> F --> S
```

### Layer 1: Query Understanding
- Input: `UnifiedSearchRequest` (`q`, filters, ambient context, session id).
- Ambient context envelope from Web clients includes route/session continuity fields and optional last-action metadata (`action`, `source`, `queryHint`, `domain`, `entityKey`, `route`, `occurredAt`) for follow-up ranking/refinement.
- Components:
  - `EntityExtractor`
  - `IntentClassifier`
  - `DomainWeightCalculator`
  - `AmbientContextProcessor`
  - `SearchSessionContextService`
- Output: `QueryPlan` with intent, detected entities, domain weights, and context boosts.

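As a rough illustration of how a request could become a `QueryPlan`, the sketch below extracts CVE identifiers, picks an intent, and adjusts domain weights. Only `UnifiedSearchRequest`, `q`, and `QueryPlan` come from this document; the field names, the `lookup`/`explore` intent labels, the 1.5 multiplier, the 0.1 boost, and `build_query_plan` itself are assumptions made for the example, not the actual component behavior.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


@dataclass
class UnifiedSearchRequest:
    q: str
    filters: Dict[str, str] = field(default_factory=dict)
    ambient: Dict[str, str] = field(default_factory=dict)  # e.g. {"domain": "vex"}
    session_id: Optional[str] = None


@dataclass
class QueryPlan:
    intent: str
    entities: List[str]
    domain_weights: Dict[str, float]
    context_boosts: Dict[str, float]


def build_query_plan(request: UnifiedSearchRequest,
                     base_weights: Dict[str, float]) -> QueryPlan:
    # Entity extraction: only CVE ids in this toy version, sorted for determinism.
    entities = sorted({m.upper() for m in CVE_PATTERN.findall(request.q)})
    # Intent classification: hypothetical two-way split.
    intent = "lookup" if entities else "explore"
    # Domain weighting: favor findings when a CVE id was detected (assumed rule).
    weights = dict(base_weights)
    if entities:
        weights["findings"] = weights.get("findings", 1.0) * 1.5
    # Ambient context: small boost for the domain of the last action (assumed rule).
    boosts: Dict[str, float] = {}
    domain = request.ambient.get("domain")
    if domain:
        boosts[domain] = 0.1
    return QueryPlan(intent, entities, weights, boosts)
```

A call such as `build_query_plan(UnifiedSearchRequest(q="is cve-2024-1234 fixed?"), {"findings": 1.0})` yields a plan with one normalized entity and a raised `findings` weight.
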
### Layer 2: Federated Retrieval
- Sources queried in parallel:
  - Primary universal index (`IKnowledgeSearchStore` FTS + vector candidates)
  - Optional federated backends via `FederatedSearchDispatcher`
- Ingestion adapters keep index coverage aligned across domains:
  - Findings, VEX, Policy (live + snapshot fallback)
  - Graph, Timeline, Scanner, OpsMemory snapshots
  - Platform catalog
- Tenant isolation is enforced in request filters and chunk identities.

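The parallel fan-out with a tenant guard can be sketched with `asyncio.gather`. The two backend coroutines are stand-ins that return canned rows, and the row shape (`chunkId`, `tenant`, `source`) is hypothetical. Treating a failed backend as a diagnostic rather than a request failure is also an assumption of this sketch.

```python
import asyncio


async def search_primary_index(query, tenant):
    # Stand-in for the primary FTS + vector lookup.
    await asyncio.sleep(0)
    return [{"chunkId": "doc-1", "tenant": tenant, "source": "primary"}]


async def search_federated(query, tenant):
    # Stand-in for an optional federated backend; it returns one foreign row
    # on purpose so the tenant guard below has something to drop.
    await asyncio.sleep(0)
    return [
        {"chunkId": "fed-1", "tenant": tenant, "source": "federated"},
        {"chunkId": "fed-2", "tenant": "other-tenant", "source": "federated"},
    ]


async def retrieve(query, tenant, federation_enabled=True):
    tasks = [search_primary_index(query, tenant)]
    if federation_enabled:
        tasks.append(search_federated(query, tenant))
    # gather runs the sources concurrently and keeps results in task order.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    rows, diagnostics = [], []
    for result in results:
        if isinstance(result, Exception):
            diagnostics.append(repr(result))  # degrade instead of failing
            continue
        # Second line of defense: keep only rows owned by the caller's tenant.
        rows.extend(row for row in result if row["tenant"] == tenant)
    return rows, diagnostics
```

Running `asyncio.run(retrieve("openssl", "tenant-a"))` returns the primary row followed by the one federated row that belongs to `tenant-a`.
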
### Layer 3: Fusion and Entity Cards
- `WeightedRrfFusion` merges lexical + vector candidates with domain weights.
- Additional boosts:
  - Entity proximity
  - Ambient/session carry-forward
  - Graph gravity
  - Optional popularity and freshness controls
- Current-page scope is applied here as a ranking bias, not a hard filter. When outside-scope results remain materially stronger, or remain inside a narrow relative score band, the response surfaces them in bounded `overflow` metadata instead of hiding them.
- `EntityCardAssembler` groups facets into entity cards and resolves aliases.

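For readers unfamiliar with reciprocal rank fusion (RRF), here is a minimal weighted variant: each ranked list contributes `weight / (k + rank)` to a chunk's score. The constant `k = 60` is the value commonly used with RRF, not one taken from this system, and the input shape and `weighted_rrf` name are assumptions. The additional boosts listed above are left out.

```python
from collections import defaultdict


def weighted_rrf(rankings, domain_weights, k=60):
    """Fuse ranked lists into one ordering.

    rankings: {list_name: [(chunk_id, domain), ...]}, each list best-first.
    domain_weights: {domain: weight}; unknown domains default to 1.0.
    """
    scores = defaultdict(float)
    for ranked in rankings.values():
        for rank, (chunk_id, domain) in enumerate(ranked, start=1):
            scores[chunk_id] += domain_weights.get(domain, 1.0) / (k + rank)
    # Sort by score descending, then chunk id, so equal scores order stably.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = weighted_rrf(
    {
        "lexical": [("a", "findings"), ("b", "vex")],
        "vector": [("b", "vex"), ("c", "policy")],
    },
    {"vex": 2.0},
)
```

In this example `b` ranks first because it appears in both lists and its domain carries a weight of 2.0, followed by `a` and then `c`.
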
### Layer 4: Synthesis
- Deterministic synthesis is always available from top cards.
- Optional LLM tier (`SearchSynthesisService`) streams over SSE with:
  - quota enforcement
  - grounding score
  - action suggestions
- If the LLM tier is unavailable or blocked by quota, deterministic output is still returned.
- Query responses may also include a deterministic `contextAnswer` envelope for answer-first search UX:
  - `status`: `grounded` | `clarify` | `insufficient`
  - `code`, `summary`, `reason`, `evidence`
  - bounded `citations`
  - bounded follow-up `questions`
- The answer envelope is additive and optional so older clients remain compatible.

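The deterministic-first behavior can be sketched as a generator that always emits the deterministic summary and only then attempts the optional LLM tier. The event shape, the `title` card field, the summary wording, and the `llm` callable are hypothetical. The generator stands in for the SSE stream.

```python
def deterministic_summary(cards, limit=3):
    # Built only from the top cards, so the same cards give the same text.
    titles = [card["title"] for card in cards[:limit]]
    return ("Top results: " + "; ".join(titles)) if titles else "No results."


def synthesize(cards, llm=None, quota_remaining=0):
    # Deterministic output is emitted first, unconditionally.
    yield {"type": "deterministic", "text": deterministic_summary(cards)}
    # The LLM tier is skipped when it is not configured or quota is exhausted.
    if llm is None or quota_remaining <= 0:
        return
    try:
        for chunk in llm(cards):
            yield {"type": "llm", "text": chunk}
    except Exception:
        # An LLM failure ends the stream; the deterministic event was already sent.
        return
```

Because the deterministic event is yielded before any LLM call, a client that stops reading after the first event still receives a usable answer.
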
### Telemetry and gap surfacing
- Search analytics stays optional at the client layer; queries still work when analytics events are never emitted.
- `AdvisoryAI:KnowledgeSearch:SearchTelemetryEnabled=false` also disables backend analytics persistence, feedback persistence, popularity-map reads, and unified-search telemetry sink emission without disabling retrieval or history.
- When enabled, the self-serve lane records `answer_frame`, `reformulation`, and `rescue_action` with hashed query keys, hashed tenant-scoped session ids, and bounded answer metadata.
- Quality review surfaces:
  - `GET /v1/advisory-ai/search/quality/metrics`
  - `GET /v1/advisory-ai/search/quality/alerts`
- Current self-serve gap signals:
  - fallback answer rate
  - clarify rate
  - insufficient-evidence rate
  - reformulation count
  - rescue-action count
  - abandoned fallback count
  - `fallback_loop` and `abandoned_fallback` alerts

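To make "hashed query keys" and the rate-style gap signals concrete, the sketch below hashes a tenant-scoped, normalized query with SHA-256 and derives clarify and insufficient-evidence rates from recorded answer statuses. The normalization, the separator byte, the choice of SHA-256, and the frame shape are assumptions. The real service may key and aggregate differently.

```python
import hashlib
from collections import Counter


def hashed_key(tenant, value):
    # Normalize case and whitespace so equivalent queries share one key.
    normalized = " ".join(value.lower().split())
    # Prefix with the tenant so identical queries never collide across tenants.
    material = f"{tenant}\x1f{normalized}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def gap_rates(answer_frames):
    """answer_frames: [{"status": "grounded" | "clarify" | "insufficient"}, ...]"""
    total = len(answer_frames)
    if total == 0:
        return {"clarify": 0.0, "insufficient": 0.0}
    counts = Counter(frame["status"] for frame in answer_frames)
    return {
        "clarify": counts["clarify"] / total,
        "insufficient": counts["insufficient"] / total,
    }
```

Storing only the digest lets analytics group repeated queries without persisting the raw query text.
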
## Data Flow

```mermaid
sequenceDiagram
    participant UI as Web UI / API Client
    participant API as UnifiedSearchEndpoints
    participant PLAN as QueryUnderstanding
    participant IDX as KnowledgeSearchStore
    participant FED as FederatedDispatcher
    participant FUS as WeightedRrfFusion
    participant CARDS as EntityCardAssembler
    participant SYN as SearchSynthesisService
    participant ANA as SearchAnalyticsService

    UI->>API: POST /v1/search/query
    API->>PLAN: Build QueryPlan
    PLAN-->>API: intent + entities + domain weights
    API->>IDX: SearchFtsAsync + LoadVectorCandidatesAsync
    API->>FED: DispatchAsync (optional)
    IDX-->>API: lexical + vector rows
    FED-->>API: federated rows + diagnostics
    API->>FUS: Fuse rankings
    FUS-->>API: ranked rows
    API->>CARDS: Assemble entity cards
    CARDS-->>API: entity cards
    API->>ANA: Record query/click/zero_result
    API-->>UI: UnifiedSearchResponse

    UI->>API: POST /v1/search/synthesize
    API->>SYN: ExecuteAsync
    SYN-->>UI: SSE deterministic-first + optional LLM chunks
```

## Contracts and API Surface
- `POST /v1/search/query`
- `POST /v1/search/suggestions/evaluate`
- `POST /v1/search/synthesize`
- `POST /v1/search/index/rebuild`

`POST /v1/search/query` response notes:
- Entity cards remain the primary retrieval payload.
- `contextAnswer` is the preferred answer-first surface for Web self-serve UX when present.
- `contextAnswer` is query-driven rather than mode-driven: compare-like requests may blend close evidence clusters, while scoped troubleshoot requests prefer a single decisive answer when one result clearly leads.
- `overflow` is additive and bounded so FE can show "outside the current page, but likely relevant" results without reintroducing a scope toggle.
- `overflow` is intentionally narrow: it is suppressed when the current-scope winner has a clear lead, so FE can trust the primary section as the best local answer.
- `coverage` is additive and bounded so FE can suppress misleading suggestions when the active corpus has no sensible candidates for that domain.
- Live local verification currently covers the Doctor/knowledge path after the documented rebuild order:
  1. `POST /v1/advisory-ai/index/rebuild`
  2. `POST /v1/search/index/rebuild`

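Because `contextAnswer` and `overflow` are additive, a client has to tolerate their absence. The sketch below shows one way a consumer might choose the leading surface. The `cards` field name and the `plan_rendering` helper are assumptions, and the shape of `overflow` is treated as opaque.

```python
def plan_rendering(response):
    """Decide which surface leads, tolerating absent additive fields."""
    answer = response.get("contextAnswer")  # optional, additive
    return {
        # Answer-first when the envelope is present, cards otherwise.
        "lead": "contextAnswer" if answer else "cards",
        "cards": response.get("cards", []),     # "cards" is an assumed field name
        "overflow": response.get("overflow"),   # None when the server omitted it
    }
```

An older server that returns only cards therefore produces `"lead": "cards"` with `"overflow": None`, and the same client code keeps working.
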
`POST /v1/search/suggestions/evaluate` response notes:
- Intended for proactive suggestion chips and page-owned prompts before the user commits a search.
- Returns per-query viability plus bounded domain coverage.
- Additive fields:
  - `viabilityState`: `grounded` | `needs_clarification` | `no_match` | `scope_unready` | `corpus_unready`
  - `scopeReady`: `true` when the current route scope has indexed corpus behind it
- `viable=true` is reserved for suggestions that already have grounded evidence; clarify-only suggestions are not considered executable.
- Does not require telemetry and does not record answer-frame analytics.

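A client-side filter for suggestion chips might look like the sketch below, which keeps only suggestions that are both `viable` and `grounded`. The `query` field and the list-of-dicts response shape are assumptions. `viable` and `viabilityState` are the fields described above.

```python
VIABILITY_STATES = {
    "grounded",
    "needs_clarification",
    "no_match",
    "scope_unready",
    "corpus_unready",
}


def executable_suggestions(evaluations):
    """Return the queries that are safe to render as one-click chips."""
    chips = []
    for item in evaluations:
        state = item.get("viabilityState")
        if state not in VIABILITY_STATES:
            continue  # unknown or missing state: do not offer it
        # Clarify-only suggestions are not executable, so require both signals.
        if item.get("viable") is True and state == "grounded":
            chips.append(item["query"])
    return chips
```

Checking `viabilityState` alongside `viable` keeps the filter conservative if a server ever sends an unexpected combination.
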
OpenAPI contract presence is validated by integration test:
- `UnifiedSearchEndpointsIntegrationTests.OpenApi_Includes_UnifiedSearch_Contracts`

## Determinism Rules
- Stable ordering breaks ties by `kind`, then `chunkId`.
- Ranking benchmark includes a deterministic stability hash across top results.
- Session context is ephemeral and expires after an inactivity timeout.

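The tie-break and stability hash can be illustrated as follows. Sorting by descending score, then `kind`, then `chunkId` makes the order independent of input order, and hashing the top identifiers gives a single value to compare between benchmark runs. The row shape, the `top_n` default, and the use of SHA-256 are assumptions of this sketch.

```python
import hashlib


def rank_rows(rows):
    # Primary key: score descending. Ties fall back to kind, then chunkId.
    return sorted(rows, key=lambda r: (-r["score"], r["kind"], r["chunkId"]))


def stability_hash(rows, top_n=10):
    # Hash only the identities of the top results, in ranked order.
    ids = [f'{r["kind"]}:{r["chunkId"]}' for r in rank_rows(rows)[:top_n]]
    return hashlib.sha256("\n".join(ids).encode("utf-8")).hexdigest()


rows = [
    {"score": 0.5, "kind": "vex", "chunkId": "c2"},
    {"score": 0.5, "kind": "finding", "chunkId": "c9"},
    {"score": 0.9, "kind": "policy", "chunkId": "c1"},
    {"score": 0.5, "kind": "finding", "chunkId": "c3"},
]
ordered_ids = [r["chunkId"] for r in rank_rows(rows)]
```

Shuffling `rows` before calling `stability_hash` produces the same digest, which is the property a ranking benchmark would assert.
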
## Configuration
Primary section: `AdvisoryAI:UnifiedSearch`
- `Enabled`
- `BaseDomainWeights`
- `Weighting.*` (domain/intent/entity/role boosts)
- `Federation.*`
- `GravityBoost.*`
- `Synthesis.*`
- `Ingestion.*`
- `Session.*`
- `TenantFeatureFlags.<tenant>.{Enabled,FederationEnabled,SynthesisEnabled}`

Detailed operator config and examples:
- `docs/operations/unified-search-operations.md`
- `docs/modules/advisory-ai/knowledge-search.md`