Plan search operator correction phases

2026-03-07 20:35:32 +02:00
parent a3f532359b
commit 8ee5dcf420
5 changed files with 343 additions and 2 deletions
--- a/docs/implplan/SPRINT_20260307_031_DOCS_search_operator_correction_phases.md
+++ b/docs/implplan/SPRINT_20260307_031_DOCS_search_operator_correction_phases.md
@@ -0,0 +1,77 @@
+# Sprint 20260307-031 - Search Operator Correction Phases
+
+## Topic & Scope
+- Convert fresh operator feedback from live search use into explicit product rules and execution phases.
+- Close the gap between "search-first on paper" and the actual shipped interaction model.
+- Publish the next implementation phases so FE, AdvisoryAI, and E2E work align on one zero-learning experience.
+- Working directory: `docs/`.
+- Expected evidence: updated why/how documentation, new phased sprint files, and a docs-only commit.
+
+## Dependencies & Concurrency
+- Extends `SPRINT_20260307_025_DOCS_search_consolidation_corrective_phases.md`.
+- Extends `SPRINT_20260307_028_AdvisoryAI_consolidated_ranking_blending_and_optional_telemetry.md`.
+- Safe parallelism: implementation may start in `src/Web/StellaOps.Web/**`, `src/AdvisoryAI/**`, and `src/Web/StellaOps.Web/tests/e2e/**` once these rules are published.
+
+## Documentation Prerequisites
+- `docs/modules/ui/search-zero-learning-primary-entry.md`
+- `docs/modules/advisory-ai/knowledge-search.md`
+- `docs/modules/advisory-ai/unified-search-architecture.md`
+- `src/AdvisoryAI/__Tests/INFRASTRUCTURE.md`
+
+## Delivery Tracker
+
+### DOCS-ZL2-001 - Capture live operator objections as product rules
+Status: DONE
+Dependency: none
+Owners: Project Manager, Documentation author
+Task description:
+- Record the specific failures reported from real use of the consolidated search: assistant/search split, leftover mode thinking, exposed scope mechanics, misplaced `Did you mean`, dead suggestions, and history pollution from failed searches.
+- Translate those into product rules that are precise enough to guide implementation and reject regressions.
+
+Completion criteria:
+- [x] The why/how doc records the live operator objections explicitly.
+- [x] The rules state that search is primary and AdvisoryAI is secondary.
+- [x] The rules state that telemetry is optional and suggestion viability depends on corpus readiness.
+
+### DOCS-ZL2-002 - Publish implementation phases with single-directory ownership
+Status: DONE
+Dependency: DOCS-ZL2-001
+Owners: Project Manager
+Task description:
+- Break the follow-up work into implementation phases that can be executed and committed independently.
+- Each phase must have a single owning directory, explicit dependencies, and unambiguous completion criteria.
+
+Completion criteria:
+- [x] Separate sprint files exist for FE surface cleanup, AdvisoryAI query/viability hardening, and live E2E verification.
+- [x] Each sprint includes exact tests and evidence expectations.
+- [x] Dependencies and safe parallelism notes are explicit.
+
+### DOCS-ZL2-003 - Publish setup and live validation expectations
+Status: DONE
+Dependency: DOCS-ZL2-002
+Owners: Documentation author
+Task description:
+- Make the live validation path explicit: corpus rebuild order, CLI compile-or-source-run expectations, and the rule that search suggestion coverage is invalid when the corpus is empty even if service health is green.
+- Keep the docs concise and operator-focused.
+
+Completion criteria:
+- [x] The doc links to the compiled CLI or `dotnet run` fallback guidance already in the repo.
+- [x] The phase stack explicitly requires ingestion-backed validation.
+- [x] The decisions section references the updated why/how doc.
+
+## Execution Log
+| Date (UTC) | Update | Owner |
+| --- | --- | --- |
+| 2026-03-07 | Sprint created to turn live operator search feedback into phased corrective work with explicit ownership and test gates. | Project Manager |
+| 2026-03-07 | Published operator-correction rules in `search-zero-learning-primary-entry.md`, added FE/backend/live-E2E phase sprints, and wired ingestion-backed validation plus optional-telemetry expectations into the phase stack. | Project Manager |
+
+## Decisions & Risks
+- Decision: the product no longer asks the user to understand search modes, scope toggles, or recovery mechanics.
+- Decision: suggestion executability is part of product correctness, not a nice-to-have.
+- Risk: previous corrective work removed some visible controls but left hidden dependencies in FE contracts and history storage.
+- Mitigation: phase the correction pass into FE cleanup, backend contract hardening, and live-ingested E2E gates.
+- Reference: `docs/modules/ui/search-zero-learning-primary-entry.md`
+
+## Next Checkpoints
+- 2026-03-08: publish the correction phases and start FE surface cleanup.
+- 2026-03-09: start backend query-understanding and suggestion viability hardening.
--- a/docs/implplan/SPRINT_20260307_032_FE_search_primary_surface_cleanup.md
+++ b/docs/implplan/SPRINT_20260307_032_FE_search_primary_surface_cleanup.md
@@ -0,0 +1,86 @@
+# Sprint 20260307-032 - FE Search Primary Surface Cleanup
+
+## Topic & Scope
+- Finish the search-first surface so operators can use it without learning Stella-specific search mechanics.
+- Remove the remaining FE mode residue, simplify correction/result cues, and make recent history truly success-only.
+- Keep the assistant as a secondary action launched beside search or from grounded answers.
+- Working directory: `src/Web/StellaOps.Web/`.
+- Expected evidence: FE implementation, targeted unit tests, Playwright regression coverage, and a scoped commit.
+
+## Dependencies & Concurrency
+- Depends on `SPRINT_20260307_031_DOCS_search_operator_correction_phases.md`.
+- Informs `SPRINT_20260307_033_AdvisoryAI_search_query_understanding_and_viability.md`.
+- Safe parallelism: backend work may proceed in `src/AdvisoryAI/**` if it does not break the current Web contract before this sprint lands.
+
+## Documentation Prerequisites
+- `docs/modules/ui/search-zero-learning-primary-entry.md`
+- `src/Web/StellaOps.Web/AGENTS.md`
+
+## Delivery Tracker
+
+### FE-ZL2-001 - Remove remaining FE mode dependencies
+Status: TODO
+Dependency: none
+Owners: Developer
+Task description:
+- Remove `Find / Explain / Act` as a frontend concept, not just as a visible control.
+- Eliminate mode-driven prompt helpers and chip/question filtering from the shared search context contracts and FE composition paths.
+
+Completion criteria:
+- [ ] Search and assistant FE code no longer depend on `SearchExperienceModeService`.
+- [ ] Page-owned search contracts do not use `preferredModes`.
+- [ ] Search-to-chat prompts derive from query, route, evidence, and last actions only.
+
+### FE-ZL2-002 - Tighten the primary search surface
+Status: TODO
+Dependency: FE-ZL2-001
+Owners: Developer
+Task description:
+- Keep the assistant launcher beside the search input, move `Did you mean` directly below the field, and simplify operator-facing labels.
+- Remove residual explanatory clutter that teaches the system instead of helping the user search.
+
+Completion criteria:
+- [ ] `Did you mean` renders as an input-adjacent correction cue.
+- [ ] Empty-state starters stay concise and executable with no ranking-mechanics copy.
+- [ ] Result labels use plain operator language for in-scope and spillover sections.
+
+### FE-ZL2-003 - Migrate recent history to a success-only contract
+Status: TODO
+Dependency: FE-ZL2-002
+Owners: Developer
+Task description:
+- Replace the legacy bare-string history store with a structured success-only format.
+- Drop old failed or unknown legacy entries on load so history reflects only searches that actually worked.
+
+Completion criteria:
+- [ ] Local recent history stores structured successful entries rather than bare strings.
+- [ ] Legacy entries are ignored unless they can be confirmed from server history as successful.
+- [ ] No-result searches never reappear after reload.
+
+### FE-ZL2-004 - Verify the simplified search surface
+Status: TODO
+Dependency: FE-ZL2-003
+Owners: Developer, Test Automation
+Task description:
+- Add targeted Angular and Playwright coverage for the simplified search model.
+- Tests must prove the removal of FE modes, input-adjacent correction cues, and success-only history behavior.
+
+Completion criteria:
+- [ ] Angular tests cover history migration and no-mode FE composition.
+- [ ] Playwright covers input correction placement, success-only history, and assistant launch from the field.
+- [ ] No route-jump or visible scope/mode controls remain in covered flows.
+
+## Execution Log
+| Date (UTC) | Update | Owner |
+| --- | --- | --- |
+| 2026-03-07 | Sprint created for the FE half of the operator correction pass on global search. | Project Manager |
+
+## Decisions & Risks
+- Decision: search surface simplification is not cosmetic; it removes product concepts the operator should never need to learn.
+- Risk: FE still carries hidden mode-specific behavior in contracts and helper services.
+- Mitigation: remove the mode service and related contract fields as part of this sprint rather than hiding them behind UI changes.
+- Reference: `docs/modules/ui/search-zero-learning-primary-entry.md`
+
+## Next Checkpoints
+- 2026-03-08: remove FE mode dependencies and simplify the input/result surface.
+- 2026-03-08: land history migration and deterministic regression coverage.
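
A minimal sketch of the success-only history migration described in FE-ZL2-003, not part of this commit. The entry shape (`RecentSearchEntry`), the `outcome` field, and the idea of passing server-confirmed queries as a map are all assumptions for illustration; the sprint does not specify the storage format or any function names.

```typescript
// Hypothetical structured history entry. Field names are assumptions.
interface RecentSearchEntry {
  query: string;
  outcome: 'success';
  executedAtUtc: string; // ISO-8601 timestamp
}

// Narrow an unknown stored value to a well-formed successful entry.
function isSuccessfulEntry(value: unknown): value is RecentSearchEntry {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  const query = record['query'];
  const outcome = record['outcome'];
  const executedAtUtc = record['executedAtUtc'];
  return (
    typeof query === 'string' &&
    query.trim().length > 0 &&
    outcome === 'success' &&
    typeof executedAtUtc === 'string'
  );
}

// Migrate the raw stored payload to the success-only format.
// Legacy bare strings are kept only when the (hypothetical) server history
// confirms them as successful; everything else is dropped on load.
function migrateRecentHistory(
  raw: string | null,
  confirmedSuccessful: ReadonlyMap<string, string> = new Map<string, string>(),
): RecentSearchEntry[] {
  if (raw === null) return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return []; // Corrupt payload: start with an empty history.
  }
  if (!Array.isArray(parsed)) return [];

  const migrated: RecentSearchEntry[] = [];
  for (const item of parsed) {
    if (isSuccessfulEntry(item)) {
      migrated.push(item);
    } else if (typeof item === 'string') {
      const confirmedAt = confirmedSuccessful.get(item);
      if (confirmedAt !== undefined) {
        migrated.push({ query: item, outcome: 'success', executedAtUtc: confirmedAt });
      }
    }
    // Unknown shapes and unconfirmed legacy strings are dropped.
  }
  return migrated;
}
```

Under these assumptions, a no-result search cannot reappear after reload, because nothing without a confirmed success outcome survives the load step.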
--- a/docs/implplan/SPRINT_20260307_033_AdvisoryAI_search_query_understanding_and_viability.md
+++ b/docs/implplan/SPRINT_20260307_033_AdvisoryAI_search_query_understanding_and_viability.md
@@ -0,0 +1,75 @@
+# Sprint 20260307-033 - AdvisoryAI Search Query Understanding And Viability
+
+## Topic & Scope
+- Make unified search infer answer shape from the query and context instead of relying on frontend mode hints.
+- Harden suggestion viability so the UI can suppress dead suggestions when the corpus is empty, stale, or unsupported for the current route.
+- Keep telemetry fully optional and separate from retrieval correctness.
+- Working directory: `src/AdvisoryAI/`.
+- Expected evidence: backend contract/logic changes, focused service/integration tests, and a scoped commit.
+
+## Dependencies & Concurrency
+- Depends on `SPRINT_20260307_031_DOCS_search_operator_correction_phases.md`.
+- Builds on `SPRINT_20260307_028_AdvisoryAI_consolidated_ranking_blending_and_optional_telemetry.md`.
+- Safe parallelism: FE surface work in `src/Web/StellaOps.Web/**` may proceed as long as the additive contracts remain backward-compatible until both sprints land.
+
+## Documentation Prerequisites
+- `docs/modules/ui/search-zero-learning-primary-entry.md`
+- `docs/modules/advisory-ai/knowledge-search.md`
+- `docs/modules/advisory-ai/unified-search-architecture.md`
+- `src/AdvisoryAI/AGENTS.md`
+
+## Delivery Tracker
+
+### AI-ZL2-001 - Remove mode-shaped answer assumptions from unified search
+Status: TODO
+Dependency: none
+Owners: Developer
+Task description:
+- Ensure the backend answer path derives decisive vs blended vs clarify behavior from query structure, route context, evidence distribution, and recent actions.
+- Do not require FE mode hints to choose answer behavior.
+
+Completion criteria:
+- [ ] Unified search answer behavior no longer depends on FE mode fields.
+- [ ] Tests prove query/context-driven decisive and blended answers.
+- [ ] Clarify and insufficient fallbacks remain deterministic.
+
+### AI-ZL2-002 - Strengthen suggestion viability and corpus readiness signals
+Status: TODO
+Dependency: AI-ZL2-001
+Owners: Developer
+Task description:
+- Expand suggestion viability responses so FE can distinguish a genuinely viable suggestion from empty-corpus or unsupported-domain states.
+- Keep the response bounded and deterministic.
+
+Completion criteria:
+- [ ] Viability responses differentiate viable, no-match, and corpus-readiness failures.
+- [ ] Empty corpus or missing supported projections suppress suggestions cleanly.
+- [ ] Integration tests cover live-readiness and empty-corpus behavior.
+
+### AI-ZL2-003 - Keep telemetry optional and non-blocking
+Status: TODO
+Dependency: AI-ZL2-002
+Owners: Developer
+Task description:
+- Preserve the optional telemetry posture while the new viability and answer behavior land.
+- Search correctness, history, and suggestion gating must work with telemetry disabled.
+
+Completion criteria:
+- [ ] Telemetry-disabled paths produce retrieval and viability behavior identical to telemetry-enabled paths.
+- [ ] Tests prove analytics and feedback remain disabled while search works.
+- [ ] Docs state that viability and history do not depend on telemetry.
+
+## Execution Log
+| Date (UTC) | Update | Owner |
+| --- | --- | --- |
+| 2026-03-07 | Sprint created for the backend half of the operator correction pass on unified search. | Project Manager |
+
+## Decisions & Risks
+- Decision: FE should not be responsible for choosing answer mode or rescuing dead suggestions.
+- Risk: if viability is too coarse, FE will still surface suggestions that fail in live corpora.
+- Mitigation: return explicit bounded viability states and keep live-ingested tests as the final gate.
+- Reference: `docs/modules/ui/search-zero-learning-primary-entry.md`
+
+## Next Checkpoints
+- 2026-03-09: land query/context-driven answer shaping.
+- 2026-03-09: land stricter suggestion viability states and backend tests.
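
A minimal sketch of how a consumer might model the bounded viability states from AI-ZL2-002, not part of this commit. It is written from the FE side in TypeScript; the state names, the `reason` values, and the `Suggestion` shape are assumptions chosen to mirror the "viable, no-match, and corpus-readiness failures" wording, not the actual backend contract.

```typescript
// Hypothetical bounded viability states. Names are assumptions.
type SuggestionViability =
  | { state: 'viable' }
  | { state: 'no-match' }
  | { state: 'corpus-unready'; reason: 'empty' | 'stale' | 'unsupported-domain' };

// Hypothetical suggestion as the FE might receive it.
interface Suggestion {
  text: string;
  viability: SuggestionViability;
}

// Only viable suggestions are rendered; everything else is suppressed.
function visibleSuggestions(suggestions: readonly Suggestion[]): string[] {
  return suggestions
    .filter((s) => s.viability.state === 'viable')
    .map((s) => s.text);
}

// Deterministic, human-readable description for logs and test output.
function describeViability(viability: SuggestionViability): string {
  switch (viability.state) {
    case 'viable':
      return 'viable';
    case 'no-match':
      return 'no match in the active corpus';
    case 'corpus-unready':
      return `corpus not ready: ${viability.reason}`;
  }
}
```

A closed union like this keeps the response bounded: the FE never has to guess why a suggestion is missing, and adding a new state forces every `switch` over it to be revisited.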
--- a/docs/implplan/SPRINT_20260307_034_FE_live_search_readiness_matrix.md
+++ b/docs/implplan/SPRINT_20260307_034_FE_live_search_readiness_matrix.md
@@ -0,0 +1,75 @@
+# Sprint 20260307-034 - FE Live Search Readiness Matrix
+
+## Topic & Scope
+- Prove the operator-corrected search experience with deterministic and live-ingested Playwright coverage.
+- Fail early when corpus rebuilds are missing, the local CLI is uncompiled, or the active corpora cannot support surfaced suggestions.
+- Keep telemetry-off behavior covered so optional analytics never become a hidden dependency.
+- Working directory: `src/Web/StellaOps.Web/tests/e2e/`.
+- Expected evidence: Playwright suites, exact setup commands, execution logs, and a scoped commit.
+
+## Dependencies & Concurrency
+- Depends on `SPRINT_20260307_032_FE_search_primary_surface_cleanup.md`.
+- Depends on `SPRINT_20260307_033_AdvisoryAI_search_query_understanding_and_viability.md`.
+- Safe parallelism: live suites may run in parallel only when each suite uses isolated services or a read-only prepared corpus.
+
+## Documentation Prerequisites
+- `docs/modules/ui/search-zero-learning-primary-entry.md`
+- `docs/modules/advisory-ai/knowledge-search.md`
+- `src/AdvisoryAI/__Tests/INFRASTRUCTURE.md`
+
+## Delivery Tracker
+
+### QA-ZL2-001 - Expand deterministic Playwright for the simplified surface
+Status: TODO
+Dependency: none
+Owners: Test Automation
+Task description:
+- Cover the simplified top-bar search experience with deterministic mocks.
+- Prove correction placement, assistant launch, success-only history, and no dead-end starter flows.
+
+Completion criteria:
+- [ ] Deterministic E2E covers `Did you mean` placement, assistant launch, success-only history, and spillover rendering.
+- [ ] Covered flows have no visible scope/mode controls.
+- [ ] Covered starter chips always land on a meaningful result or explicit grounded fallback.
+
+### QA-ZL2-002 - Run live ingestion-backed suggestion and readiness matrix
+Status: TODO
+Dependency: QA-ZL2-001
+Owners: Test Automation
+Task description:
+- Rebuild live corpora using the documented CLI compile-or-source-run path and verify corpus readiness before UI checks start.
+- Execute surfaced suggestions on covered pages and fail on dead ends.
+
+Completion criteria:
+- [ ] Live suites preflight corpus readiness, not just process health.
+- [ ] Every surfaced live suggestion on covered pages resolves to results or an explicit grounded fallback state.
+- [ ] Execution logs record the exact rebuild and Playwright commands.
+
+### QA-ZL2-003 - Verify telemetry-off search flows
+Status: TODO
+Dependency: QA-ZL2-002
+Owners: Test Automation
+Task description:
+- Prove that the operator journey works with telemetry disabled or unavailable.
+- Search and assistant deep-dive must keep functioning without analytics calls.
+
+Completion criteria:
+- [ ] Playwright covers a telemetry-off path.
+- [ ] Search, answer rendering, suggestions, and assistant handoff still work.
+- [ ] The execution log records the telemetry-off configuration and results.
+
+## Execution Log
+| Date (UTC) | Update | Owner |
+| --- | --- | --- |
+| 2026-03-07 | Sprint created to keep live-ingested suggestion correctness and telemetry-off behavior as explicit release gates. | Project Manager |
+
+## Decisions & Risks
+- Decision: live suggestion correctness is a product requirement; deterministic mocks alone are insufficient evidence.
+- Decision: setup failures such as an uncompiled CLI or empty corpus must fail the suite early and clearly.
+- Risk: corpus parity may remain uneven across domains.
+- Mitigation: treat coverage per domain explicitly and only surface suggestions when the backing corpus can support them.
+- Reference: `docs/modules/ui/search-zero-learning-primary-entry.md`
+
+## Next Checkpoints
+- 2026-03-10: run the live readiness matrix against the simplified FE surface.
+- 2026-03-10: add telemetry-off E2E evidence for the corrected flow.
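
A minimal sketch of the corpus-readiness preflight required by QA-ZL2-002, not part of this commit. The snapshot shapes and function names are hypothetical; how the suite actually obtains service health, CLI availability, and corpus counts is not specified in the sprint and is left out here.

```typescript
// Hypothetical per-domain corpus snapshot. Field names are assumptions.
interface CorpusSnapshot {
  domain: string;
  documentCount: number;
  lastRebuildUtc: string | null; // null when no rebuild has been recorded
}

// Hypothetical inputs gathered before any UI check starts.
interface PreflightInput {
  serviceHealthy: boolean;
  cliAvailable: boolean; // compiled CLI or a working source-run fallback
  corpora: readonly CorpusSnapshot[];
}

// Collect every readiness failure so the suite can report them all at once.
function preflightFailures(input: PreflightInput): string[] {
  const failures: string[] = [];
  if (!input.serviceHealthy) failures.push('service is not healthy');
  if (!input.cliAvailable) failures.push('CLI is neither compiled nor runnable from source');
  if (input.corpora.length === 0) failures.push('no corpora reported');
  for (const corpus of input.corpora) {
    if (corpus.lastRebuildUtc === null) {
      failures.push(`corpus "${corpus.domain}" has no recorded rebuild`);
    } else if (corpus.documentCount === 0) {
      failures.push(`corpus "${corpus.domain}" is empty`);
    }
  }
  return failures;
}

// Fail early and clearly; green service health alone is not enough.
function assertLiveReadiness(input: PreflightInput): void {
  const failures = preflightFailures(input);
  if (failures.length > 0) {
    throw new Error(`Live search preflight failed: ${failures.join('; ')}`);
  }
}
```

In a live suite, a check of this kind would run once before the first UI step, so an empty corpus behind a healthy process stops the run instead of producing misleading suggestion results.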
--- a/docs/modules/ui/search-zero-learning-primary-entry.md
+++ b/docs/modules/ui/search-zero-learning-primary-entry.md
@@ -12,15 +12,24 @@
 - Current-scope behavior is correct as a ranking concept, but the UI still talks about the mechanism instead of simply showing the best in-scope answer first and only then out-of-scope overflow.
 - Suggestion reliability is fixed only when the active corpus is actually ingested. A healthy process with an empty corpus is still a bad operator experience unless the UI suppresses or fails those paths explicitly.

+## What still fails after live operator use
+- The product still leaves visible or hidden traces of a dual entry model. Search is supposed to be the starting point, but AdvisoryAI still feels like a separate feature instead of a secondary deep-dive opened from the search field.
+- The current surface still carries internal search concepts in contracts and helpers even after the visible buttons were reduced. If `Find / Explain / Act` still changes prompt or chip behavior, the product is still teaching an internal model.
+- The current history contract cannot reliably remove old failed searches after reload because the local store still accepts legacy bare-string entries with no result outcome attached.
+- `Did you mean` is still visually tied to the results surface rather than the input correction moment. It needs to live immediately below the search field.
+- Suggestions are still too easy to surface without enough corpus proof. Search must treat corpus readiness and suggestion executability as a product requirement, not a test-only concern.
+
 ## Product rules
 1. Search is the primary entry point.
 2. AdvisoryAI is a secondary deep-dive opened from search, not a competing starting point or route jump.
-3. Search should infer relevance and intent; the user should not need to choose a mode.
+3. Search should infer relevance and intent; the user should not need to choose a mode or understand Stella search mechanics.
 4. Search should bias to the current page context automatically; the user should not need to toggle scope.
 5. Search should never advertise a suggestion that cannot execute against the active corpus.
 6. Search history should contain only successful searches.
 7. Telemetry remains optional. Search behavior must not depend on analytics emission.
 8. Search UI must avoid teaching Stella terminology or search mechanics before the operator has even started.
+9. `Did you mean` belongs directly below the search field because it is an input correction, not a downstream refinement.
+10. Search should summarize close evidence automatically. AdvisoryAI expands detail; it should not be required to make the primary result understandable.

 ## Target interaction model
 ### Entry
@@ -101,4 +110,23 @@
 - Implemented before and during the corrective phases: explicit scope/mode/recovery controls were removed from the main search flow, implicit current-scope weighting and overflow contracts were added, and suggestion viability preflight now suppresses dead chips before render.
 - Implemented before the corrective phases: the live Doctor suggestion suite now rebuilds the active corpus, fails on empty knowledge projections, iterates every surfaced suggestion, and verifies Ask-AdvisoryAI inherits the live search context.
 - Implemented from the corrective phases: backend overflow is now narrow enough that clear in-scope winners suppress out-of-scope spillover, blended summaries only appear for genuinely close evidence clusters, and `SearchTelemetryEnabled` cleanly disables analytics/feedback/sink emission without affecting retrieval or history.
-- Still pending from the corrective phases: broader live-page matrices and explicit client-side telemetry opt-out.
+- Still pending from the corrective phases: removal of hidden FE mode dependencies, a structured success-only history contract that purges failed legacy entries on load, stricter suggestion/corpus readiness gating across more routes, and broader live-page matrices.
+
+## Execution phases - operator correction pass
+### Phase 1 - Search-first primary surface cleanup
+- Remove the remaining FE mode concept from shared search contracts, prompt helpers, and page-owned chip selection.
+- Keep one visible primary entry: search input with a compact assistant launcher beside it.
+- Attach `Did you mean` directly to the input area and simplify result labels to plain operator language.
+- Migrate recent history to a structured success-only format and ignore legacy failed entries on load.
+
+### Phase 2 - Backend query understanding and suggestion hardening
+- Infer answer shape from the query, route context, visible entities, and recent actions instead of any FE mode hint.
+- Return stricter suggestion viability signals so FE can suppress dead suggestions when the corpus is empty, stale, or outside the current route's supported domains.
+- Keep out-of-scope overflow secondary and show it only when it materially improves the answer.
+- Keep telemetry optional and separate from retrieval, ranking, suggestion gating, and history.
+
+### Phase 3 - Live ingestion-backed readiness and regression gate
+- Run deterministic Playwright against the simplified surface on every change.
+- Run live Playwright against ingested corpora and fail early on an empty corpus, missing rebuilds, or an uncompiled CLI.
+- Assert that every surfaced suggestion on covered routes resolves to a non-dead-end state.
+- Treat corpus readiness as part of release verification for search suggestions.