Close admin trust audit gaps and stabilize live sweeps

2026-03-12 10:14:00 +02:00
parent a00efb7ab2
commit 6964a046a5
50 changed files with 5968 additions and 2850 deletions
--- a/docs/implplan/SPRINT_20260311_014_Platform_scratch_iteration_003_full_route_action_audit.md
+++ b/docs/implplan/SPRINT_20260311_014_Platform_scratch_iteration_003_full_route_action_audit.md
@@ -0,0 +1,76 @@
+# Sprint 20260311_014 - Platform Scratch Iteration 003 Full Route Action Audit
+
+## Topic & Scope
+- Wipe the Stella-only local runtime again and rerun the documented setup path from zero state.
+- Re-test the rebuilt stack as a first-time operator with Playwright route and action coverage across the core release, security, ops, integration, and setup surfaces.
+- If this fresh-stack pass exposes real failures, trace root cause, choose clean fixes, implement them, redeploy, and reverify before the iteration is closed.
+- Working directory: `.`.
+- Expected evidence: Stella-only wipe log, documented setup execution proof, fresh Playwright route/action results, root-cause notes for any failures, and a local commit for the iteration.
+
+## Dependencies & Concurrency
+- Depends on the clean iteration record in `a00efb7ab`.
+- Safe parallelism: none during the wipe/rebuild and live sweeps because the environment reset is global to the machine.
+
+## Documentation Prerequisites
+- `AGENTS.md`
+- `docs/INSTALL_GUIDE.md`
+- `docs/dev/DEV_ENVIRONMENT_SETUP.md`
+- `docs/qa/feature-checks/FLOW.md`
+
+## Delivery Tracker
+
+### PLATFORM-SCRATCH-ITER3-001 - Wipe Stella-only runtime state and rerun documented setup
+Status: DONE
+Dependency: none
+Owners: QA, 3rd line support
+Task description:
+- Remove Stella-only containers, images, volumes, and networks, then rerun the documented setup path from the same first-time operator entrypoint.
+
+Completion criteria:
+- [x] Stella-only Docker state is removed without touching unrelated local assets.
+- [x] `scripts/setup.ps1` is rerun from zero Stella state.
+- [x] The bootstrap outcome is captured with concrete evidence.
+
+### PLATFORM-SCRATCH-ITER3-002 - Re-run live route and action sweeps on the fresh stack
+Status: DONE
+Dependency: PLATFORM-SCRATCH-ITER3-001
+Owners: QA
+Task description:
+- Re-authenticate on the rebuilt stack and rerun the route and action sweeps needed to validate page loads and user actions on the fresh deployment.
+
+Completion criteria:
+- [x] Fresh route sweep evidence is captured.
+- [x] Fresh action sweep evidence is captured for the covered surface families.
+- [x] Any newly exposed failures are enumerated before fixes begin.
+
+### PLATFORM-SCRATCH-ITER3-003 - Root-cause and repair the next live failures
+Status: DONE
+Dependency: PLATFORM-SCRATCH-ITER3-002
+Owners: 3rd line support, Product Manager, Architect, Developer
+Task description:
+- Diagnose and fix any fresh-stack defects surfaced by the iteration. If no new defect is exposed, record the clean pass explicitly and close the iteration with a local commit.
+
+Completion criteria:
+- [x] Each exposed failure has a documented root cause, or the clean pass is explicitly recorded.
+- [x] Any required fix favors clean ownership/contracts over temporary fallbacks.
+- [x] The iteration is committed locally after re-verification.
+
+## Execution Log
+| Date (UTC) | Update | Owner |
+| --- | --- | --- |
+| 2026-03-11 | Sprint created to start scratch iteration 003 immediately after the previous clean zero-state pass. | QA |
+| 2026-03-12 | Cleared stale Playwright output, reran the setup-topology and uncovered-surface sweeps, and verified that both were clean (`0` failed actions, `0` runtime issues) so that stale harness noise would not carry into the aggregate pass. | QA |
+| 2026-03-12 | Ran `live-full-core-audit.mjs` across all 19 suites. First pass isolated one failing action in `ops-policy-action-sweep` (`/ops/policy/simulation -> button:View Results`) while every other route/page/action suite passed. | QA |
+| 2026-03-12 | Root-caused the policy simulation miss to a harness defect: multiple shadow-mode enable buttons exist during async load, and the sweep was selecting the first matching control, which was still disabled, instead of an enabled action target. | 3rd line support / Architect |
+| 2026-03-12 | Updated `live-ops-policy-action-sweep.mjs` to wait for an enabled shadow-mode control and for `View Results` to become interactable, then reran the targeted policy sweep cleanly (`0` failed actions, `0` runtime issues). | Developer |
+| 2026-03-12 | Reran `live-full-core-audit.mjs`; final aggregate result was `19/19` suites passed with `failedSuiteCount=0`, including `111/111` canonical routes and all covered action families. | QA |
+
+## Decisions & Risks
+- Decision: keep iterating from true zero Stella state even after a clean pass so regressions that appear only intermittently still have a chance to surface.
+- Risk: the documented setup path is expensive by design; correctness under wipe-and-rebuild remains the priority over speed.
+- Decision: treat broad Playwright harness reliability as part of the product verification contract. False negatives that stem from stale readiness assumptions or disabled-control races are fixed before declaring a route family broken.
+- Decision: the policy simulation `View Results` failure was not a product regression. The clean fix was to make the QA harness wait for the first enabled shadow-mode control rather than clicking the first control with a matching label during async load.
+
+## Next Checkpoints
+- Start the next zero-state iteration and repeat the full route/page/action pass before any new fixes.
+- Expand search-specific user journeys beyond the current 4-route matrix if fresh user reports expose ranking or handoff gaps.
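
Note on the harness fix recorded in the Execution Log and in Decisions & Risks: the disabled-control race can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual change to `live-ops-policy-action-sweep.mjs`. The control snapshots, the `Enable shadow mode` label, and the function names `pickFirstMatching`, `pickFirstEnabled`, and `waitForEnabled` are all hypothetical, and the sketch uses plain objects instead of Playwright locators so that it runs on its own.

```javascript
// Hypothetical control snapshot: { id: string, label: string, disabled: boolean }.

// Naive selection: returns the first control with a matching label,
// even if that control is still disabled during async load.
function pickFirstMatching(controls, label) {
  return controls.find((c) => c.label === label) ?? null;
}

// Repaired selection: returns the first matching control that is also enabled.
function pickFirstEnabled(controls, label) {
  return controls.find((c) => c.label === label && !c.disabled) ?? null;
}

// Polls a snapshot function until an enabled match appears or the timeout expires.
// getControls is an assumed callback that returns the current control snapshots.
async function waitForEnabled(
  getControls,
  label,
  { timeoutMs = 5000, intervalMs = 100 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const target = pickFirstEnabled(await getControls(), label);
    if (target) return target;
    if (Date.now() >= deadline) {
      throw new Error(`No enabled "${label}" control within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Simulated mid-load state: two buttons share a label, only the second is enabled.
const midLoad = [
  { id: "a", label: "Enable shadow mode", disabled: true },
  { id: "b", label: "Enable shadow mode", disabled: false },
];
```

With the simulated `midLoad` state, `pickFirstMatching` returns the disabled control `a`, which is the kind of target that produced the false negative, while `pickFirstEnabled` returns `b`. In a real Playwright sweep the same idea would presumably be expressed through locator enabled-state checks rather than plain objects.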