AdvisoryAI Knowledge Search (AKS)
Why retrieval-first
AKS is a deterministic retrieval system for operational problem solving across Stella Ops docs, OpenAPI contracts, and Doctor checks. It is designed to work offline and does not require GPU-backed or hosted LLM inference for correctness.
LLMs can still be used as optional formatters later, but AKS correctness is grounded in source retrieval and explicit references.
Scope
- Module owner: src/AdvisoryAI/**.
- Search surfaces consuming AKS:
  - Web global search in src/Web/StellaOps.Web/**.
  - CLI commands in src/Cli/**.
- Doctor execution remains authoritative in Doctor module. AKS only indexes metadata and remediation references.
Architecture
- Ingestion/indexing:
  - Markdown allow-list/manifest -> section chunks.
  - OpenAPI aggregate (openapi_current.json style artifact) -> per-operation chunks + normalized operation tables.
  - Doctor seed + controls metadata (including CLI-discovered Doctor check catalog projection) -> doctor projection chunks.
- Storage:
  - PostgreSQL tables in schema advisoryai via migration src/AdvisoryAI/StellaOps.AdvisoryAI/Storage/Migrations/002_knowledge_search.sql.
- Retrieval:
  - FTS (tsvector + websearch_to_tsquery) + optional vector stage.
  - Deterministic fusion and tie-breaking in KnowledgeSearchService.
- Delivery:
  - API endpoint: POST /v1/advisory-ai/search.
  - Index rebuild endpoint: POST /v1/advisory-ai/index/rebuild.
Data model
AKS schema tables:
- advisoryai.kb_doc: canonical source docs with product/version/content hash metadata.
- advisoryai.kb_chunk: searchable units (md_section, api_operation, doctor_check) with anchors, spans, tsvector, and embeddings.
- advisoryai.api_spec: raw OpenAPI snapshot (jsonb) by service.
- advisoryai.api_operation: normalized operation records (method, path, operation_id, tags, request/response/security json).
- advisoryai.doctor_search_projection: searchable doctor metadata and remediation.
Vector support:
- Tries CREATE EXTENSION vector.
- If the extension is unavailable, AKS remains fully functional via FTS and a deterministic array-embedding fallback.
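The source does not say how the fallback embeddings are computed. As an illustration only, one way to get deterministic, model-free embeddings that fit in a plain float array is feature hashing. The function names, the dimension count, and the hashing scheme below are all assumptions, not the AKS implementation.

```python
import hashlib
import math
import re

DIMENSIONS = 64  # assumed size; the real embedding width is not stated in the source


def embed(text: str, dims: int = DIMENSIONS) -> list[float]:
    """Hash each token into a signed bucket, then L2-normalize the result."""
    vector = [0.0] * dims
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dims
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vector[index] += sign
    norm = math.sqrt(sum(v * v for v in vector))
    # An empty or fully cancelled input stays a zero vector.
    return [v / norm for v in vector] if norm else vector


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for normalized inputs."""
    return sum(x * y for x, y in zip(a, b))
```

Because the vector depends only on the input text, rebuilding the index on any machine yields the same stored arrays, which is the property the fallback needs.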
Deterministic ingestion rules
Markdown
- Source order:
  - Allow-list file: src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/knowledge-docs-allowlist.json.
  - Generated manifest (optional, from CLI tool): knowledge-docs-manifest.json.
  - Fallback scan roots (docs/**) only if allow-list resolves no markdown files.
- Chunk by H2/H3 headings.
- Stable anchors using slug + duplicate suffix.
- Stable chunk IDs from source path + anchor + span.
- Metadata includes path, anchor, section path, tags.
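The chunking rules above can be sketched as follows. This is a minimal illustration, not the AKS code: the slug rules, the ID format (a truncated SHA-256 of path, anchor, and line span), and all function names are assumptions. It also ignores headings that appear inside fenced code blocks.

```python
import hashlib
import re

HEADING = re.compile(r"^(#{2,3})\s+(.*\S)\s*$")  # H2 and H3 only


def slugify(title: str) -> str:
    """Lowercase, drop punctuation, join words with hyphens."""
    slug = re.sub(r"[^a-z0-9\s-]", "", title.lower())
    return re.sub(r"[\s-]+", "-", slug).strip("-") or "section"


def chunk_markdown(path: str, text: str) -> list[dict]:
    """Split on H2/H3 headings; anchors and IDs depend only on the input."""
    lines = text.splitlines()
    starts = [(i, m.group(2)) for i, line in enumerate(lines)
              if (m := HEADING.match(line))]
    seen: dict[str, int] = {}
    chunks = []
    for n, (start, title) in enumerate(starts):
        end = starts[n + 1][0] if n + 1 < len(starts) else len(lines)
        base = slugify(title)
        count = seen.get(base, 0)
        seen[base] = count + 1
        # Repeated headings get a numeric suffix so anchors stay unique.
        anchor = base if count == 0 else f"{base}-{count}"
        key = f"{path}#{anchor}:{start}-{end}"
        chunks.append({
            "id": hashlib.sha256(key.encode("utf-8")).hexdigest()[:16],
            "path": path,
            "anchor": anchor,
            "span": (start, end),
            "body": "\n".join(lines[start:end]),
        })
    return chunks
```

Running the function twice on the same file gives identical anchors and IDs, so a rebuild does not churn chunk identities unless the content actually moves.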
OpenAPI
- Source order:
  - Aggregated OpenAPI file path (default devops/compose/openapi_current.json).
  - Fallback repository scan for openapi.json when aggregate is missing.
- Parse deterministic JSON aggregate for MVP.
- Emit one searchable chunk per HTTP operation.
- Preserve structured operation payloads (request_json, responses_json, security_json).
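As a rough sketch of the per-operation rule, the function below walks a parsed OpenAPI document and emits one record per HTTP operation. The record keys mirror the column names mentioned in this document, but the exact record shape, the sorted iteration order, and the function name are assumptions.

```python
import json

# HTTP methods that may appear as keys of an OpenAPI path item.
HTTP_METHODS = ("get", "put", "post", "delete", "options", "head", "patch", "trace")


def operation_chunks(service: str, spec: dict) -> list[dict]:
    """Emit one record per operation, in a fixed path and method order."""
    chunks = []
    for path in sorted(spec.get("paths", {})):
        item = spec["paths"][path]
        for method in HTTP_METHODS:
            op = item.get(method)
            if not isinstance(op, dict):
                continue  # skips absent methods and non-operation keys
            chunks.append({
                "kind": "api_operation",
                "service": service,
                "method": method.upper(),
                "path": path,
                "operation_id": op.get("operationId"),
                "tags": sorted(op.get("tags", [])),
                # sort_keys keeps the serialized payloads byte-stable.
                "request_json": json.dumps(op.get("requestBody"), sort_keys=True),
                "responses_json": json.dumps(op.get("responses", {}), sort_keys=True),
                "security_json": json.dumps(op.get("security", []), sort_keys=True),
            })
    return chunks
```

Sorting paths and serializing with sorted keys means the same aggregate file always produces the same chunk sequence, independent of key order in the source JSON.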
Doctor
- Source order:
  - Seed file src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/doctor-search-seed.json.
  - Controls file src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/doctor-search-controls.json (contains control fields plus fallback metadata from stella advisoryai sources prepare).
  - Optional Doctor endpoint metadata (DoctorChecksEndpoint) when configured.
- stella advisoryai sources prepare merges configured seed entries with DoctorEngine.ListChecks() (when available in CLI runtime) and writes enriched control projection metadata (title, severity, description, remediation, runCommand, symptoms, tags, references).
- Emit doctor chunk + projection record including:
  - checkCode, title, severity, runCommand, remediation, symptoms.
  - control metadata (control, requiresConfirmation, isDestructive, inspectCommand, verificationCommand).
Ranking strategy
Implemented in src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/KnowledgeSearchService.cs:
- Candidate retrieval:
  - lexical set from FTS.
  - optional vector set from embedding candidates.
- Fusion:
  - reciprocal rank fusion style scoring.
- Deterministic boosts:
  - exact checkCode match.
  - exact operationId match.
  - METHOD /path match.
  - filter-aligned service/tag boosts.
- Deterministic ordering:
  - score desc -> kind asc -> chunk id asc.
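The fusion and ordering steps can be illustrated with a small sketch. It is not taken from KnowledgeSearchService.cs: the smoothing constant 60, the additive form of the boosts, and the function signature are assumptions chosen to show the idea.

```python
from collections import defaultdict

RRF_K = 60  # assumed smoothing constant; the value used by AKS is not stated


def fuse(lexical, vector, boosts=None):
    """Combine two ranked lists of (chunk_id, kind) into one ordered result.

    boosts maps chunk_id -> additive bonus (for example an exact checkCode hit).
    Returns (chunk_id, kind, score) tuples.
    """
    scores = defaultdict(float)
    kinds = {}
    for ranked in (lexical, vector):
        for rank, (chunk_id, kind) in enumerate(ranked, start=1):
            scores[chunk_id] += 1.0 / (RRF_K + rank)
            kinds[chunk_id] = kind
    for chunk_id, bonus in (boosts or {}).items():
        if chunk_id in scores:  # boosts never introduce new candidates
            scores[chunk_id] += bonus
    # score desc -> kind asc -> chunk id asc
    return sorted(
        ((cid, kinds[cid], score) for cid, score in scores.items()),
        key=lambda row: (-row[2], row[1], row[0]),
    )
```

The explicit three-part sort key is what makes the output reproducible: two chunks with equal fused scores always come back in the same order, regardless of the order in which the candidate sets were produced.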
API contract
Search
- POST /v1/advisory-ai/search
- Request:
  - q (required), k, filters.type|product|version|service|tags, includeDebug.
- Response:
  - typed results (docs|api|doctor) with snippet, score, and open action.
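A request for this endpoint could be assembled as in the sketch below. The host and port are hypothetical, and the value types of the filter fields (plain strings here) are assumptions; only the path and the field names q, k, filters, and includeDebug come from the contract above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # hypothetical host; not from the source


def build_search_request(q, k=None, filters=None, include_debug=False):
    """Build (but do not send) a POST request for the AKS search endpoint."""
    if not q or not q.strip():
        raise ValueError("q is required")
    body = {"q": q}
    if k is not None:
        body["k"] = k
    if filters:
        body["filters"] = filters  # e.g. {"type": "doctor"}; value shape assumed
    if include_debug:
        body["includeDebug"] = True
    return urllib.request.Request(
        BASE_URL + "/v1/advisory-ai/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Against a running instance, the request object can be passed to urllib.request.urlopen and the response body parsed with json.loads.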
Rebuild
- POST /v1/advisory-ai/index/rebuild
- Rebuilds AKS deterministically from local docs/specs/doctor metadata.
Web behavior
Global search now consumes AKS and supports:
- Mixed grouped results (Docs, API Endpoints, Doctor Checks).
- Type filter chips.
- Result actions:
  - Docs: Open.
  - API: Curl (copy command).
  - Doctor: Run (navigate to doctor and copy run command).
- More action for "show more like this" local query expansion.
CLI behavior
AKS commands:
- stella search "<query>" [--type docs|api|doctor] [--product ...] [--version ...] [--service ...] [--tag ...] [--k N] [--json]
- stella doctor suggest "<symptom>" [--product ...] [--version ...] [--k N] [--json]
- stella advisoryai index rebuild [--json]
- stella advisoryai sources prepare [--repo-root ...] [--docs-allowlist ...] [--docs-manifest-output ...] [--openapi-output ...] [--doctor-seed ...] [--doctor-controls-output ...] [--overwrite] [--json]
Output:
- Human mode: grouped actionable references.
- JSON mode: stable machine-readable payload.
Test/benchmark strategy
Implemented benchmark framework:
- Generator: KnowledgeSearchBenchmarkDatasetGenerator (deterministic synthetic set with explicit ground truth).
- Runner: KnowledgeSearchBenchmarkRunner (recall@k, p50/p95 latency, stability pass).
- Models/serialization:
  - KnowledgeSearchBenchmarkModels.cs
  - KnowledgeSearchBenchmarkJson.cs
Tests:
- src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/KnowledgeSearch/KnowledgeSearchBenchmarkTests.cs
  - verifies deterministic dataset generation with >= 1000 queries.
  - verifies recall/latency metrics and top-k match behavior.
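For readers unfamiliar with the metrics named above, the sketch below shows one common way to compute recall@k and latency percentiles. It assumes a single expected chunk per query and the nearest-rank percentile definition; the actual KnowledgeSearchBenchmarkRunner may define either differently.

```python
import math


def recall_at_k(results: dict[str, list[str]], truth: dict[str, str], k: int) -> float:
    """Fraction of queries whose expected chunk id appears in the top k results."""
    if not truth:
        return 0.0
    hits = sum(
        1 for query, expected in truth.items()
        if expected in results.get(query, [])[:k]
    )
    return hits / len(truth)


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile, e.g. pct=50 for p50 or pct=95 for p95."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

A stability pass in this style would run the same query set twice and assert that the two result mappings are equal.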
Dedicated AKS test DB
Compose profile:
devops/compose/docker-compose.advisoryai-knowledge-test.yml
Init script:
devops/compose/postgres-init/advisoryai-knowledge-test/01_extensions.sql
Example workflow:
docker compose -f devops/compose/docker-compose.advisoryai-knowledge-test.yml up -d
stella advisoryai sources prepare --json
stella advisoryai index rebuild --json
dotnet test src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/StellaOps.AdvisoryAI.Tests.csproj
Known limitations and follow-ups
- YAML OpenAPI ingestion is not included in MVP.
- End-to-end benchmark against live Postgres-backed AKS service is planned as a follow-up CI lane.
- Optional external embedding providers can be added later without changing API contracts.