Verify live search suggestions against ingested corpus
@@ -20,7 +20,7 @@
 ## Delivery Tracker
 
 ### QA-ZL-001 - Add live corpus preflight and rebuild checks
-Status: TODO
+Status: DONE
 Dependency: none
 Owners: Test Automation
 Task description:
@@ -28,12 +28,12 @@ Task description:
 - Fail with explicit setup diagnostics when the corpus is empty or stale instead of producing misleading UI failures.
 
 Completion criteria:
-- [ ] The live suite checks rebuild/readiness before suggestion assertions.
-- [ ] Failure output distinguishes ingestion failure from UI failure.
-- [ ] Setup docs reference compiled CLI and HTTP rebuild fallbacks.
+- [x] The live suite checks rebuild/readiness before suggestion assertions.
+- [x] Failure output distinguishes ingestion failure from UI failure.
+- [x] Setup docs reference compiled CLI and HTTP rebuild fallbacks.
 
 ### QA-ZL-002 - Prove every surfaced suggestion succeeds
-Status: TODO
+Status: DONE
 Dependency: QA-ZL-001
 Owners: Test Automation
 Task description:
@@ -41,32 +41,37 @@ Task description:
 - Include pages that rely on current-scope weighting and overflow fallback.
 
 Completion criteria:
-- [ ] The live suite iterates through each surfaced suggestion on the covered pages.
-- [ ] Every rendered suggestion produces a visible non-dead-end state.
-- [ ] Previously failing suggestion paths are covered explicitly.
+- [x] The live suite iterates through each surfaced suggestion on the covered pages.
+- [x] Every rendered suggestion produces a visible non-dead-end state.
+- [x] Previously failing suggestion paths are covered explicitly.
 
 ### QA-ZL-003 - Verify search-to-chat consolidation
-Status: TODO
+Status: DONE
 Dependency: QA-ZL-002
 Owners: Test Automation
 Task description:
 - Verify the compact chat launcher and answer-panel handoff preserve query, page context, and evidence after the search redesign.
 
 Completion criteria:
-- [ ] Search is the tested primary entry in all covered flows.
-- [ ] AdvisoryAI opens as a secondary deep-dive from search with inherited context.
-- [ ] Execution log records the final full-pack commands and outcomes.
+- [x] Search is the tested primary entry in all covered flows.
+- [x] AdvisoryAI opens as a secondary deep-dive from search with inherited context.
+- [x] Execution log records the final full-pack commands and outcomes.
 
 ## Execution Log
 | Date (UTC) | Update | Owner |
 | --- | --- | --- |
 | 2026-03-07 | Sprint created for live corpus-backed suggestion reliability and zero-learning search verification. | Project Manager |
+| 2026-03-07 | Reproduced the user-facing failure against `http://127.1.0.44`: health was up but `POST /v1/advisory-ai/index/rebuild` returned `documentCount=0`, `chunkCount=0`, and `doctorProjectionCount=0`, so suggestion preflight now treats empty-corpus services as setup failures instead of UI regressions. | Test Automation |
+| 2026-03-07 | Prepared sources against the repo-controlled service, rebuilt both indexes, and verified live query `database connectivity` returned `contextAnswer.status=grounded` with knowledge cards and citations. | Test Automation |
+| 2026-03-07 | Ran `npx playwright test tests/e2e/unified-search-contextual-suggestions.live.e2e.spec.ts --config playwright.config.ts` against `http://127.0.0.1:10451`; result `5/5` passed covering chip viability, every surfaced suggestion, result-open follow-up chips, and Ask-AdvisoryAI handoff. | Test Automation |
 
 ## Decisions & Risks
 - Decision: live reliability gates are required because static mocks cannot prove suggestion viability against real corpora.
+- Decision: a healthy service with an empty corpus is an ingestion/setup failure, not a passing baseline; live E2E must fail before UI assertions in that case.
 - Risk: local environments may have partially ingested or empty corpora, especially in Doctor/knowledge projections.
 - Mitigation: add explicit corpus preflight and rebuild guidance so the suite fails with actionable diagnostics.
+- Mitigation: use a repo-controlled local service (`http://127.0.0.1:10451`) with `advisoryai sources prepare`, `POST /v1/advisory-ai/index/rebuild`, and `POST /v1/search/index/rebuild` before running the live suite.
 
 ## Next Checkpoints
-- 2026-03-09: Land live corpus preflight before broadening the suggestion matrix.
-- 2026-03-10: Run the final live suggestion pack and capture exact outcomes in the execution log.
+- 2026-03-09: Broaden live coverage beyond Doctor once findings/policy/VEX ingestion parity is available.
+- 2026-03-10: Fold the live reliability lane into the consolidated zero-learning search redesign phases.
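
The preflight rule recorded in the decisions above (a healthy service with an empty corpus is a setup failure, not a UI regression) can be sketched as a small pure function. This is a minimal illustration, not the suite's actual code: `RebuildSummary`, `Preflight`, `assessCorpus`, and the diagnostic wording are hypothetical. Only the field names `documentCount`, `chunkCount`, and `doctorProjectionCount` come from the execution log.

```typescript
// Hypothetical shape of a rebuild response. Only these three field names
// appear in the execution log; the type itself is an assumption.
interface RebuildSummary {
  documentCount: number;
  chunkCount: number;
  doctorProjectionCount: number;
}

// Result of the preflight: either ready, or a setup failure with a message.
type Preflight =
  | { ready: true }
  | { ready: false; kind: "setup"; message: string };

const REQUIRED_COUNTS = [
  "documentCount",
  "chunkCount",
  "doctorProjectionCount",
] as const;

// Classify a rebuild summary before any UI assertion runs, so an empty
// corpus is reported as an ingestion/setup problem rather than a UI failure.
function assessCorpus(summary: RebuildSummary): Preflight {
  // Treat zero, negative, or NaN counts as "empty".
  const empty = REQUIRED_COUNTS.filter((key) => !(summary[key] > 0));
  if (empty.length === 0) {
    return { ready: true };
  }
  return {
    ready: false,
    kind: "setup",
    message:
      `Corpus not ready: ${empty.join(", ")} reported no data. ` +
      "Prepare sources and rebuild the indexes before running the live suite.",
  };
}
```

A live suite could call a function like this in its setup step and abort with the returned message, which keeps ingestion failures visibly separate from suggestion or rendering failures.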
@@ -403,7 +403,8 @@ Current live verification coverage:
 - Rebuild order exercised against a running local service: `POST /v1/advisory-ai/index/rebuild` then `POST /v1/search/index/rebuild`
 - Verified live query: `database connectivity`
 - Verified live outcome: response includes `contextAnswer.status = grounded`, citations, and entity cards over ingested data
-- Verified live suggestion lane: the Doctor-page `database connectivity` chip remains a viable query after rebuild and is exercised by `src/Web/StellaOps.Web/tests/e2e/unified-search-contextual-suggestions.live.e2e.spec.ts`
+- Verified live suggestion lane: `src/Web/StellaOps.Web/tests/e2e/unified-search-contextual-suggestions.live.e2e.spec.ts` now preflights corpus readiness, validates suggestion viability, executes every surfaced Doctor suggestion, asserts grounded-or-clarify answer states, verifies follow-up chips after result open, and verifies Ask-AdvisoryAI inherits the live query context
+- Verified local corpus baseline on 2026-03-07 after `advisoryai sources prepare`: `documentCount = 470`, `chunkCount = 9050`, `apiOperationCount = 2190`, `doctorProjectionCount = 8`
 - Other routes still rely on deterministic mock-backed Playwright coverage until their ingestion parity is explicitly verified
 
 Or use the full CI testing stack:
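
The rebuild order listed in this hunk can be expressed as a helper that posts to the two endpoints strictly in sequence. This is a sketch under assumptions: `rebuildIndexes` and the injected `post` callback are hypothetical names, and the response bodies are left untyped because their full shape is not given here. Only the two endpoint paths and their order come from the notes above.

```typescript
// Hypothetical transport: any function that POSTs to a URL and resolves
// with the parsed response. Injected so the sketch is client-agnostic.
type Post = (url: string) => Promise<unknown>;

// Endpoint paths and their order are taken from the verification notes.
const REBUILD_PATHS = [
  "/v1/advisory-ai/index/rebuild",
  "/v1/search/index/rebuild",
] as const;

async function rebuildIndexes(baseUrl: string, post: Post): Promise<unknown[]> {
  // Drop trailing slashes so the joined URL has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  const results: unknown[] = [];
  for (const path of REBUILD_PATHS) {
    // Await each call so the search rebuild never starts before the
    // knowledge rebuild has finished.
    results.push(await post(`${root}${path}`));
  }
  return results;
}
```

With a real client, `post` could wrap `fetch(url, { method: "POST" })` and parse the JSON body. The first result would then be the natural input for a corpus-readiness check before any suggestion assertions run.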
@@ -61,6 +61,7 @@
 - Knowledge/domain emptiness should be detectable so the UI can suppress invalid chips.
 - Empty-state contextual chips and page-owned common-question chips should preflight through the backend viability endpoint before they render.
 - Live Playwright coverage must assert that every surfaced suggestion returns visible results.
+- A service health check alone is not enough. On 2026-03-07, `http://127.1.0.44/health` returned `200` while the live knowledge rebuild returned `documentCount=0`; the product still surfaced dead chips. Corpus readiness is the gate, not process liveness.
 
 ## Phase map
 - Phase 1: FE primary-entry consolidation and removal of explicit search controls.
@@ -68,3 +69,4 @@
 - Phase 3: FE consumption of overflow results and executable suggestion contracts.
 - Implemented on 2026-03-07: backend `contextAnswer` is now preferred over frontend heuristics, overflow renders as a secondary result section, and suggestion viability preflight suppresses dead chips before they are shown.
 - Phase 4: Live Playwright reliability matrix with corpus preflight and chip-success guarantees.
+- Implemented on 2026-03-07: the live suite now rebuilds the active corpus, fails fast on empty knowledge projections, iterates every surfaced Doctor suggestion, and verifies Ask-AdvisoryAI inherits the live search context.
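
The suggestion viability preflight described under Phase 3 amounts to filtering candidate chips through a backend check before they render. This is a minimal sketch: the `checkViability` callback is a hypothetical stand-in for the real viability endpoint, whose path and response shape are not given here, and `Chip` and `viableChips` are illustrative names.

```typescript
// Illustrative chip model; the real component's fields may differ.
interface Chip {
  label: string;
  query: string;
}

// Hypothetical stand-in for the backend viability endpoint.
type ViabilityCheck = (query: string) => Promise<boolean>;

// Return only the chips whose query the backend reports as viable.
async function viableChips(
  chips: Chip[],
  checkViability: ViabilityCheck,
): Promise<Chip[]> {
  const verdicts = await Promise.all(
    chips.map((chip) =>
      // A failed check counts as "not viable", so a dead chip is never
      // shown just because the preflight itself errored.
      checkViability(chip.query).catch(() => false),
    ),
  );
  return chips.filter((_, index) => verdicts[index]);
}
```

Failing closed is a design choice in this sketch: hiding a chip that might have worked costs less than surfacing one that leads to a dead end.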