Auto-rebuild AdvisoryAI knowledge corpus on startup

This commit is contained in:
master
2026-03-10 20:18:12 +02:00
parent d93006a8fa
commit f727ec24fd
7 changed files with 435 additions and 0 deletions


@@ -0,0 +1,54 @@
# Sprint 20260310_033 - FE Live Frontdoor Unified Search Route Matrix
## Topic & Scope
- Reverify unified search directly on `https://stella-ops.local` after the full scratch rebuild and backend refresh, not only on the standalone search harness.
- Exercise supported route-local search starters end to end through the real authenticated shell and capture runtime evidence for route context, query execution, and result grounding.
- Repair any search-runtime convergence defect that prevents a wiped local install from surfacing viable Doctor, Policy, Findings, or VEX starters without manual post-start rebuild steps.
- Working directory: `src/AdvisoryAI/`.
- Allowed coordination edits: `src/Web/StellaOps.Web/scripts`, `docs/modules/advisory-ai/**`, and this sprint file.
- Expected evidence: a live Playwright frontdoor sweep script, JSON output under `src/Web/StellaOps.Web/output/playwright/`, focused AdvisoryAI tests, targeted image rebuild/redeploy, and a scoped local commit.
## Dependencies & Concurrency
- Depends on the scratch rebuild baseline and the current healthy compose stack on `https://stella-ops.local`.
- Safe parallelism: stay within live search harnesses, unified-search UI, and minimal docs updates required by any discovered defect.
## Documentation Prerequisites
- `AGENTS.md`
- `docs/qa/feature-checks/FLOW.md`
- `docs/modules/ui/search-zero-learning-primary-entry.md`
- `docs/modules/advisory-ai/knowledge-search.md`
## Delivery Tracker
### FE-LIVE-SEARCH-001 - Add and execute a frontdoor unified-search route matrix
Status: DONE
Dependency: none
Owners: QA, 3rd line support, Product Manager, Architect, Developer
Task description:
- The repo already has live search verification for the standalone local shell plus AdvisoryAI runtime, but this scratch iteration needs the same route-by-route proof against the real authenticated Stella Ops frontdoor.
- Add a script that authenticates against `https://stella-ops.local`, opens the supported route-local search surfaces, captures surfaced starter chips, executes each chip, and fails on missing context, missing starters, degraded banners, dead-end query execution, or runtime/network errors.
- Live proof now shows a deeper backend/setup failure: Doctor context renders, but `POST /api/v1/search/suggestions/evaluate` returns `current_scope_corpus_unready` for the knowledge scope after a full scratch rebuild. The fix must make AdvisoryAI converge the knowledge corpus on startup instead of relying on manual rebuild commands.
Completion criteria:
- [x] A live frontdoor search matrix script exists under `src/Web/StellaOps.Web/scripts/`.
- [x] The script writes structured JSON evidence under `src/Web/StellaOps.Web/output/playwright/`.
- [x] The script verifies route context plus starter-chip execution on Doctor, Security Triage, Policy, and Advisories & VEX.
- [x] Any product defects exposed by the run are root-caused, fixed, rebuilt, reverified, and committed in this iteration.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-03-10 | Sprint created after the scratch rebuild, canonical route sweep, and release-promotion repair commit. Notifications recheck is clean again on the rebuilt stack, so the next untouched high-risk live surface is unified search through the real frontdoor shell. | QA |
| 2026-03-10 | Added `scripts/live-frontdoor-unified-search-route-matrix.mjs` and ran it against the rebuilt stack. Doctor search reproduces a real setup/runtime defect: the frontdoor returns `current_scope_corpus_unready` for all knowledge-scope starter queries even though the shell context is correct. Root-cause work is now moving into AdvisoryAI startup convergence. | QA / 3rd line support |
| 2026-03-10 | Implemented AdvisoryAI startup convergence so the knowledge corpus rebuilds automatically on fresh service startup, rebuilt and redeployed `advisory-ai-web`, and confirmed the live container reports `documents=470`, `chunks=9051`, `api_operations=2190`, `doctor_projections=8` during startup rebuild. | Developer / 3rd line support |
| 2026-03-10 | Reverified the live authenticated shell with a Playwright all-chip probe and wrote `src/Web/StellaOps.Web/output/playwright/live-frontdoor-unified-search-route-matrix-manual.json`. Doctor, Security Triage, Policy, and Advisories & VEX all render context-aware starter chips and their visible chip actions now resolve to grounded answers with cards. | QA |
## Decisions & Risks
- Decision: frontdoor search verification must not rely on the standalone Angular/AdvisoryAI harness alone; the authenticated shell is the product surface the client sees.
- Decision: scratch deployment success requires AdvisoryAI to populate its own knowledge corpus on startup. A healthy container with an empty knowledge scope is not an acceptable “ready” state.
- Decision: only the AdvisoryAI web host owns startup knowledge-index convergence. The shared library must not register that hosted service globally because the worker shares the same core registrations and would otherwise perform a duplicate rebuild on startup.
- Risk: live search starters depend on current route context and runtime corpus readiness, so the sweep must distinguish product regressions from transient auth/runtime setup failures with structured evidence.
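
One way to keep that distinction visible in the evidence is to tag each issue string with a coarse category before it is written out. The sketch below is hypothetical: the category names and the `classifyIssue` and `groupIssues` helpers are assumptions, not part of the repository. It keys only on the prefixes that the sweep script's `buildIssues` function emits (`console:`, `pageerror:`, and the plain-sentence route checks).

```javascript
// Hypothetical helper: maps an issue string from buildIssues() to a coarse bucket.
// The bucket names are illustrative assumptions, not an established taxonomy.
function classifyIssue(issue) {
  if (issue.startsWith('console:') || issue.startsWith('pageerror:')) {
    return 'runtime'; // browser-level noise that may be transient
  }
  if (issue.startsWith('Degraded banner visible')) {
    return 'degraded'; // the shell itself reported a degraded search state
  }
  return 'product'; // missing context, missing starters, ungrounded answers
}

// Hypothetical helper: groups a flat issue list by bucket for the JSON report.
function groupIssues(issues) {
  const grouped = { runtime: [], degraded: [], product: [] };
  for (const issue of issues) {
    grouped[classifyIssue(issue)].push(issue);
  }
  return grouped;
}
```

Grouping at report time rather than at capture time keeps the raw issue strings unchanged, so existing consumers of `runtimeIssues` would not need to change.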
## Next Checkpoints
- Implement the live frontdoor search sweep harness.
- Run it against the rebuilt stack and triage any failures before widening to the next untouched page family.


@@ -388,6 +388,7 @@ Notes:
- Set `AdvisoryAI__KnowledgeSearch__RepositoryRoot` only when you are running the service from a non-standard layout or a packaged binary tree that is not inside the repository.
- `stella advisoryai index rebuild` and `stella search index rebuild` invoke authenticated backend endpoints. For a local source-checkout verification lane without a signed-in CLI session, use `sources prepare` via CLI and the direct HTTP rebuild calls above with explicit `X-StellaOps-*` headers.
- Compose/runtime requirement: the published AdvisoryAI service image must carry a repo-shaped local corpus under its app content root so `POST /v1/advisory-ai/index/rebuild` can resolve `docs/**`, `devops/compose/openapi_current.json`, and `src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/*.json` even when the source checkout is not mounted into the container. If those assets are absent, live search on `stella-ops.local` degrades to partial unified rows only and documentation/Doctor/API answers disappear.
- Fresh service startup now auto-runs the knowledge rebuild by default (`AdvisoryAI__KnowledgeSearch__KnowledgeAutoIndexOnStartup=true`). This is the scratch-setup convergence path for `stella-ops.local`: a wiped deployment must populate the documentation/API/Doctor corpus without requiring operators to call `POST /v1/advisory-ai/index/rebuild` manually. Keep the manual endpoint for explicit refreshes and local live-search lanes, but do not depend on it for first-run correctness.
- The published app content root must also carry the full unified snapshot corpus under `src/AdvisoryAI/StellaOps.AdvisoryAI/UnifiedSearch/Snapshots/*.json`; packaging only findings/VEX/policy snapshots leaves graph, OpsMemory, timeline, and scanner answer lanes permanently corpus-unready in the live shell.
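
The `AdvisoryAI__KnowledgeSearch__KnowledgeAutoIndexOnStartup` name in the notes above follows the .NET environment-variable convention, in which a double underscore stands in for the `:` section separator. A minimal JavaScript sketch of that mapping, which may be useful when a harness script needs to check which configuration key a compose override targets. Both helper names are hypothetical and not part of the repository.

```javascript
// Hypothetical helper: mirrors the "__" -> ":" mapping used by .NET's
// environment-variable configuration provider.
function envToConfigKey(envName) {
  return envName.split('__').join(':');
}

// Hypothetical helper: reads the startup auto-index flag from an env-like object.
// Falls back to the documented default (true) when the variable is unset.
// Simplification: anything other than a case-insensitive "true" counts as false here.
function isAutoIndexEnabled(env) {
  const raw = env['AdvisoryAI__KnowledgeSearch__KnowledgeAutoIndexOnStartup'];
  if (raw === undefined) {
    return true;
  }
  return raw.trim().toLowerCase() === 'true';
}
```

A harness could call `isAutoIndexEnabled(process.env)` before a sweep to record whether the deployment under test relies on the startup rebuild or on a manual one.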
### CLI setup in a source checkout


@@ -15,6 +15,7 @@ using StellaOps.AdvisoryAI.Evidence;
using StellaOps.AdvisoryAI.Explanation;
using StellaOps.AdvisoryAI.Hosting;
using StellaOps.AdvisoryAI.Inference.LlmProviders;
using StellaOps.AdvisoryAI.KnowledgeSearch;
using StellaOps.AdvisoryAI.Metrics;
using StellaOps.AdvisoryAI.Orchestration;
using StellaOps.AdvisoryAI.Outputs;
@@ -53,6 +54,7 @@ builder.Configuration
builder.Services.AddAdvisoryAiCore(builder.Configuration);
builder.Services.AddUnifiedSearch(builder.Configuration);
builder.Services.TryAddEnumerable(ServiceDescriptor.Singleton<IHostedService, KnowledgeSearchStartupRebuildService>());
var llmAdapterEnabled = builder.Configuration.GetValue<bool?>("AdvisoryAI:Adapters:Llm:Enabled") ?? false;
if (llmAdapterEnabled)


@@ -54,6 +54,8 @@ public sealed class KnowledgeSearchOptions
public List<string> OpenApiRoots { get; set; } = ["src", "devops/compose"];
public bool KnowledgeAutoIndexOnStartup { get; set; } = true;
public string UnifiedFindingsSnapshotPath { get; set; } =
"src/AdvisoryAI/StellaOps.AdvisoryAI/UnifiedSearch/Snapshots/findings.snapshot.json";


@@ -0,0 +1,57 @@
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
namespace StellaOps.AdvisoryAI.KnowledgeSearch;
internal sealed class KnowledgeSearchStartupRebuildService : IHostedService
{
private readonly KnowledgeSearchOptions _options;
private readonly IKnowledgeIndexer _indexer;
private readonly ILogger<KnowledgeSearchStartupRebuildService> _logger;
public KnowledgeSearchStartupRebuildService(
IOptions<KnowledgeSearchOptions> options,
IKnowledgeIndexer indexer,
ILogger<KnowledgeSearchStartupRebuildService> logger)
{
ArgumentNullException.ThrowIfNull(options);
_options = options.Value ?? new KnowledgeSearchOptions();
_indexer = indexer ?? throw new ArgumentNullException(nameof(indexer));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public async Task StartAsync(CancellationToken cancellationToken)
{
if (!_options.Enabled)
{
_logger.LogDebug("AdvisoryAI knowledge search is disabled; skipping startup rebuild.");
return;
}
if (!_options.KnowledgeAutoIndexOnStartup)
{
_logger.LogDebug("AdvisoryAI knowledge startup rebuild is disabled.");
return;
}
try
{
var summary = await _indexer.RebuildAsync(cancellationToken).ConfigureAwait(false);
_logger.LogInformation(
"AdvisoryAI knowledge startup rebuild completed: documents={DocumentCount}, chunks={ChunkCount}, api_specs={ApiSpecCount}, api_operations={ApiOperationCount}, doctor_projections={DoctorProjectionCount}, duration_ms={DurationMs}",
summary.DocumentCount,
summary.ChunkCount,
summary.ApiSpecCount,
summary.ApiOperationCount,
summary.DoctorProjectionCount,
summary.DurationMs);
}
catch (Exception ex) when (ex is not OperationCanceledException)
{
_logger.LogWarning(ex, "AdvisoryAI knowledge startup rebuild failed.");
}
}
public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}


@@ -0,0 +1,81 @@
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using StellaOps.AdvisoryAI.KnowledgeSearch;
using Xunit;
namespace StellaOps.AdvisoryAI.Tests.KnowledgeSearch;
[Trait("Category", "Unit")]
public sealed class KnowledgeSearchStartupRebuildServiceTests
{
[Fact]
public async Task StartAsync_rebuilds_knowledge_index_when_enabled()
{
var indexer = new RecordingKnowledgeIndexer();
var service = new KnowledgeSearchStartupRebuildService(
Options.Create(new KnowledgeSearchOptions
{
Enabled = true,
KnowledgeAutoIndexOnStartup = true,
}),
indexer,
NullLogger<KnowledgeSearchStartupRebuildService>.Instance);
await service.StartAsync(CancellationToken.None);
Assert.Equal(1, indexer.RebuildCallCount);
}
[Fact]
public async Task StartAsync_skips_rebuild_when_startup_bootstrap_is_disabled()
{
var indexer = new RecordingKnowledgeIndexer();
var service = new KnowledgeSearchStartupRebuildService(
Options.Create(new KnowledgeSearchOptions
{
Enabled = true,
KnowledgeAutoIndexOnStartup = false,
}),
indexer,
NullLogger<KnowledgeSearchStartupRebuildService>.Instance);
await service.StartAsync(CancellationToken.None);
Assert.Equal(0, indexer.RebuildCallCount);
}
[Fact]
public async Task StartAsync_skips_rebuild_when_knowledge_search_is_disabled()
{
var indexer = new RecordingKnowledgeIndexer();
var service = new KnowledgeSearchStartupRebuildService(
Options.Create(new KnowledgeSearchOptions
{
Enabled = false,
KnowledgeAutoIndexOnStartup = true,
}),
indexer,
NullLogger<KnowledgeSearchStartupRebuildService>.Instance);
await service.StartAsync(CancellationToken.None);
Assert.Equal(0, indexer.RebuildCallCount);
}
private sealed class RecordingKnowledgeIndexer : IKnowledgeIndexer
{
public int RebuildCallCount { get; private set; }
public Task<KnowledgeRebuildSummary> RebuildAsync(CancellationToken cancellationToken)
{
RebuildCallCount++;
return Task.FromResult(new KnowledgeRebuildSummary(
DocumentCount: 470,
ChunkCount: 9050,
ApiSpecCount: 1,
ApiOperationCount: 2190,
DoctorProjectionCount: 8,
DurationMs: 42));
}
}
}


@@ -0,0 +1,238 @@
#!/usr/bin/env node
import { mkdirSync, writeFileSync } from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';
import { chromium } from 'playwright';
import { authenticateFrontdoor, createAuthenticatedContext } from './live-frontdoor-auth.mjs';
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const webRoot = path.resolve(__dirname, '..');
const outputDirectory = path.join(webRoot, 'output', 'playwright');
const statePath = path.join(outputDirectory, 'live-frontdoor-auth-state.json');
const reportPath = path.join(outputDirectory, 'live-frontdoor-auth-report.json');
const resultPath = path.join(outputDirectory, 'live-frontdoor-unified-search-route-matrix.json');
const scopeQuery = 'tenant=demo-prod&regions=us-east&environments=stage&timeWindow=7d';
const routeMatrix = [
{
key: 'doctor',
label: 'Doctor',
route: '/ops/operations/doctor',
expectedContext: /doctor diagnostics/i,
},
{
key: 'triage',
label: 'Security Triage',
route: '/security/triage',
expectedContext: /(security\s*\/\s*triage|findings triage)/i,
},
{
key: 'policy',
label: 'Policy',
route: '/ops/policy',
expectedContext: /(policy|policy workspace)/i,
},
{
key: 'vex',
label: 'Advisories & VEX',
route: '/security/advisories-vex',
expectedContext: /(advisories\s*&\s*vex|vex intelligence)/i,
},
];
function createRuntime() {
return {
consoleErrors: [],
pageErrors: [],
};
}
function attachRuntimeListeners(page, runtime) {
page.on('console', (message) => {
if (message.type() === 'error') {
runtime.consoleErrors.push({
timestamp: Date.now(),
page: page.url(),
text: message.text(),
});
}
});
page.on('pageerror', (error) => {
runtime.pageErrors.push({
timestamp: Date.now(),
page: page.url(),
message: error.message,
});
});
}
async function readVisibleTexts(locator) {
return locator.evaluateAll((nodes) =>
nodes
.map((node) => (node.textContent || '').trim().replace(/\s+/g, ' '))
.filter(Boolean),
).catch(() => []);
}
async function openSearch(page, route) {
await page.goto(`https://stella-ops.local${route}?${scopeQuery}`, {
waitUntil: 'domcontentloaded',
timeout: 30_000,
});
await page.waitForTimeout(2_500);
const input = page.locator('app-global-search input[type="text"]').first();
await input.click({ timeout: 10_000 });
await page.waitForSelector('.search__results', { state: 'visible', timeout: 10_000 });
await page.waitForTimeout(4_000);
}
async function captureSnapshot(page, routeConfig, runtime, routeStartedAt) {
return {
route: routeConfig.route,
url: page.url(),
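// textContent() can resolve to null when the node has no text, so default to '' before trimming.
contextTitle: ((await page.locator('.search__context-title').first().textContent().catch(() => '')) || '').trim(),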
starterChips: await readVisibleTexts(page.locator('.search__suggestions .search__chip')),
degradedBanners: await readVisibleTexts(page.locator('.search__degraded-banner')),
emptyStates: await readVisibleTexts(page.locator('.search__empty, .search__empty-state-copy')),
answerStatuses: await page.locator('[data-answer-status]').evaluateAll((nodes) =>
nodes
.map((node) => node.getAttribute('data-answer-status') || '')
.filter(Boolean),
).catch(() => []),
cardTitles: await readVisibleTexts(page.locator('.search__cards .entity-card__title')),
consoleErrors: runtime.consoleErrors.filter((entry) => entry.timestamp >= routeStartedAt),
pageErrors: runtime.pageErrors.filter((entry) => entry.timestamp >= routeStartedAt),
};
}
async function executeStarter(page, routeConfig, starterIndex, runtime) {
const routeStartedAt = Date.now();
await openSearch(page, routeConfig.route);
const chips = page.locator('.search__suggestions .search__chip');
const count = await chips.count().catch(() => 0);
if (count <= starterIndex) {
throw new Error(`Starter chip index ${starterIndex} is not available on ${routeConfig.route}`);
}
const chip = chips.nth(starterIndex);
const starterText = ((await chip.textContent().catch(() => '')) || '').trim();
if (!starterText) {
throw new Error(`Starter chip index ${starterIndex} is blank on ${routeConfig.route}`);
}
await chip.click({ timeout: 10_000 });
await page.waitForTimeout(5_000);
const snapshot = await captureSnapshot(page, routeConfig, runtime, routeStartedAt);
const answerStatus = snapshot.answerStatuses[0] ?? null;
return {
starterIndex,
starterText,
answerStatus,
ok: answerStatus === 'grounded' && snapshot.cardTitles.length > 0,
snapshot,
};
}
function buildIssues(routeResult) {
const issues = [];
if (!routeResult.contextMatchesExpected) {
issues.push(`Unexpected context title on ${routeResult.route}: ${routeResult.contextTitle || '<empty>'}`);
}
if (routeResult.snapshot.degradedBanners.length > 0) {
issues.push(`Degraded banner visible on ${routeResult.route}: ${routeResult.snapshot.degradedBanners.join(' | ')}`);
}
if (routeResult.snapshot.starterChips.length === 0) {
issues.push(`No starter chips rendered on ${routeResult.route}`);
}
issues.push(...routeResult.snapshot.consoleErrors.map((entry) => `console:${entry.text}`));
issues.push(...routeResult.snapshot.pageErrors.map((entry) => `pageerror:${entry.message}`));
for (const starter of routeResult.executedStarters) {
if (!starter.ok) {
issues.push(`Starter index ${starter.starterIndex} "${starter.starterText}" did not resolve to grounded results on ${routeResult.route}`);
}
}
return issues;
}
async function main() {
mkdirSync(outputDirectory, { recursive: true });
const authReport = await authenticateFrontdoor({
statePath,
reportPath,
headless: true,
});
const browser = await chromium.launch({
headless: true,
args: ['--disable-dev-shm-usage'],
});
const context = await createAuthenticatedContext(browser, authReport, { statePath });
const page = await context.newPage();
const runtime = createRuntime();
attachRuntimeListeners(page, runtime);
const results = [];
const runtimeIssues = [];
try {
for (const routeConfig of routeMatrix) {
const routeStartedAt = Date.now();
await openSearch(page, routeConfig.route);
const snapshot = await captureSnapshot(page, routeConfig, runtime, routeStartedAt);
const executedStarters = [];
// The upper bound of 1 means only the first surfaced starter chip is executed per route.
for (let starterIndex = 0; starterIndex < Math.min(snapshot.starterChips.length, 1); starterIndex += 1) {
// eslint-disable-next-line no-await-in-loop
executedStarters.push(await executeStarter(page, routeConfig, starterIndex, runtime));
}
const routeResult = {
key: routeConfig.key,
label: routeConfig.label,
route: routeConfig.route,
contextTitle: snapshot.contextTitle,
contextMatchesExpected: routeConfig.expectedContext.test(snapshot.contextTitle),
snapshot,
executedStarters,
};
results.push(routeResult);
runtimeIssues.push(...buildIssues(routeResult));
}
} finally {
// Always persist the evidence and release the browser, even if a route threw.
const summary = {
checkedAtUtc: new Date().toISOString(),
scopeQuery,
routesChecked: results.length,
results,
runtimeIssueCount: runtimeIssues.length,
runtimeIssues,
};
writeFileSync(resultPath, `${JSON.stringify(summary, null, 2)}\n`, 'utf8');
await context.close();
await browser.close();
}
// Fail outside `finally` so an exception thrown by the sweep itself is not masked.
if (runtimeIssues.length > 0) {
throw new Error(runtimeIssues.join('; '));
}
process.stdout.write(`live-frontdoor-unified-search-route-matrix: ${results.length} routes checked, ${runtimeIssues.length} issues\n`);
}
main().catch((error) => {
process.stderr.write(`[live-frontdoor-unified-search-route-matrix] ${error instanceof Error ? error.message : String(error)}\n`);
process.exit(1);
});
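
The JSON evidence written to `resultPath` can be post-processed without rerunning the sweep. The following is a minimal sketch, assuming the summary shape produced by `main()` above (`results[].executedStarters[].ok`, `results[].contextMatchesExpected`, `runtimeIssues`). The function name `summarizeRouteMatrix` and the sample starter texts are hypothetical.

```javascript
// Hypothetical post-processing helper for the summary object written by main().
// It only reads fields that the sweep script emits.
function summarizeRouteMatrix(summary) {
  const routes = (summary.results ?? []).map((routeResult) => {
    const starters = routeResult.executedStarters ?? [];
    const failed = starters.filter((starter) => !starter.ok);
    return {
      route: routeResult.route,
      contextMatchesExpected: Boolean(routeResult.contextMatchesExpected),
      startersExecuted: starters.length,
      failedStarters: failed.map((starter) => starter.starterText),
    };
  });
  return {
    routesChecked: routes.length,
    // A route fails if its context title was unexpected or any starter did not ground.
    failingRoutes: routes
      .filter((r) => !r.contextMatchesExpected || r.failedStarters.length > 0)
      .map((r) => r.route),
    issueCount: (summary.runtimeIssues ?? []).length,
    routes,
  };
}

// Example with an inline object instead of the file on disk (sample data only).
const example = summarizeRouteMatrix({
  results: [
    {
      route: '/ops/operations/doctor',
      contextMatchesExpected: true,
      executedStarters: [{ starterText: 'sample starter', ok: false }],
    },
    {
      route: '/ops/policy',
      contextMatchesExpected: true,
      executedStarters: [{ starterText: 'another starter', ok: true }],
    },
  ],
  runtimeIssues: ['one issue'],
});
console.log(example.failingRoutes); // [ '/ops/operations/doctor' ]
```

In practice the input would come from `JSON.parse(readFileSync(resultPath, 'utf8'))`, keeping the summarizer itself free of file-system access so it stays easy to test.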