Complete scratch iteration 004 setup and grouped route-action fixes

2026-03-12 19:28:42 +02:00
parent d8d3133060
commit 317e55e623
26 changed files with 1124 additions and 304 deletions
--- a/docs/INSTALL_GUIDE.md
+++ b/docs/INSTALL_GUIDE.md
@@ -55,8 +55,8 @@ The scripts will:
 3. Copy `env/stellaops.env.example` to `.env` if needed (works out of the box)
 4. Start infrastructure and wait for healthy containers
 5. Create or reuse the external frontdoor Docker network from `.env` (`FRONTDOOR_NETWORK`, default `stellaops_frontdoor`)
-6. Stop repo-local host-run Stella services that would lock build outputs, then build repo-owned .NET solutions and publish backend services locally into small Docker contexts before building hardened runtime images (vendored dependency trees such as `node_modules` are excluded)
-7. Launch the full platform with health checks and wait for the first-user frontdoor bootstrap path (`/welcome`, `/envsettings.json`, OIDC discovery, `/connect/authorize`) before reporting success
+6. Stop repo-local host-run Stella services that would lock build outputs, then build repo-owned .NET solutions and publish backend services locally into small Docker contexts before building hardened runtime images (vendored or generated trees such as `node_modules`, `dist`, `coverage`, and `output` are excluded)
+7. Launch the full platform with health checks, perform one bounded restart pass for services that stay unhealthy after first boot, and wait for the first-user frontdoor bootstrap path (`/welcome`, `/envsettings.json`, OIDC discovery, `/connect/authorize`) before reporting success

 Open **https://stella-ops.local** when setup completes.

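Step 7 above describes one bounded restart pass for services that stay unhealthy after first boot. The following is a minimal Python sketch of that control flow only, not the logic in `scripts/setup.ps1` or `scripts/setup.sh`. The function name and the `is_healthy` and `restart` callbacks are hypothetical; a real script would back them with container health checks that poll until their own timeout.

```python
from typing import Callable, Iterable, List


def converge_with_bounded_restart(
    services: Iterable[str],
    is_healthy: Callable[[str], bool],
    restart: Callable[[str], None],
    max_restart_passes: int = 1,
) -> List[str]:
    """Return the services still unhealthy after the allowed restart passes."""
    # First health evaluation after the initial boot.
    unhealthy = [name for name in services if not is_healthy(name)]
    for _ in range(max_restart_passes):
        if not unhealthy:
            break
        # Restart only the services that failed, then re-check only those.
        for name in unhealthy:
            restart(name)
        unhealthy = [name for name in unhealthy if not is_healthy(name)]
    return unhealthy
```

A caller would treat a non-empty result as a setup failure instead of looping again, which keeps the recovery bounded.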
--- a/docs/dev/DEV_ENVIRONMENT_SETUP.md
+++ b/docs/dev/DEV_ENVIRONMENT_SETUP.md
@@ -29,7 +29,7 @@ Setup scripts validate prerequisites, build solutions and Docker images, and lau
 ./scripts/setup.sh --images-only # only build Docker images
 ```

-The scripts will check for required tools (dotnet 10.x, node 20+, npm 10+, docker, git), warn about missing hosts file entries, copy `.env` from the example if needed, and stop repo-local host-run Stella services before the solution build so scratch bootstraps do not fail on locked `bin/Debug` outputs. A full setup now waits for the first-user frontdoor bootstrap path as well: `/welcome`, `/envsettings.json`, OIDC discovery, and a PKCE-style `/connect/authorize` request must all be live before the script prints success. See the manual steps below for details on each stage.
+The scripts will check for required tools (dotnet 10.x, node 20+, npm 10+, docker, git), warn about missing hosts file entries, copy `.env` from the example if needed, and stop repo-local host-run Stella services before the solution build so scratch bootstraps do not fail on locked `bin/Debug` outputs. Solution discovery is limited to repo-owned sources and skips generated trees such as `dist`, `coverage`, and `output`, so copied docs samples do not break scratch setup. A full setup now also performs one bounded restart pass for services that stay unhealthy after the first compose boot, then waits for the first-user frontdoor bootstrap path: `/welcome`, `/envsettings.json`, OIDC discovery, and a PKCE-style `/connect/authorize` request must all be live before the script prints success. See the manual steps below for details on each stage.

 On Windows and Linux, the backend image builder now publishes each selected .NET service locally and builds the hardened runtime image from a small temporary context. That avoids repeatedly streaming the whole monorepo into Docker during scratch setup.

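The discovery rule described above (skip generated trees such as `dist`, `coverage`, and `output`) can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the code in the repo's PowerShell or shell solution builders: `discover_solutions` and `EXCLUDED_DIRS` are hypothetical names, and `node_modules` is included only because the install guide lists it among excluded trees.

```python
import os
from pathlib import Path
from typing import List

# Assumed exclusion set, based on the directory names the setup docs mention.
EXCLUDED_DIRS = {"node_modules", "dist", "coverage", "output"}


def discover_solutions(root: str) -> List[Path]:
    """Collect *.sln files under root without descending into excluded trees."""
    found: List[Path] = []
    for current, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never enters an excluded directory.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            if name.endswith(".sln"):
                found.append(Path(current) / name)
    return found
```

Pruning by directory name at the walk level, instead of filtering individual paths afterwards, means a copied sample solution anywhere below a generated tree is never seen by the build step.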
--- a/docs/implplan/SPRINT_20260312_002_Platform_scratch_iteration_004_setup_solution_discovery_guard.md
+++ b/docs/implplan/SPRINT_20260312_002_Platform_scratch_iteration_004_setup_solution_discovery_guard.md
@@ -0,0 +1,92 @@
+# Sprint 20260312_002 - Platform Scratch Iteration 004 Setup Solution Discovery Guard
+
+## Topic & Scope
+- Wipe Stella-owned runtime state again and rerun the documented setup path from zero.
+- Treat setup itself as a first-user contract: if the documented bootstrap touches generated artifacts as if they were source-owned modules, fix that root cause before continuing into UI QA.
+- Rebuild and re-enter Playwright route/action coverage only after setup converges cleanly from the wipe.
+- Working directory: `.`.
+- Expected evidence: zero-state wipe proof, setup failure root cause, setup-script repair, rerun setup evidence, and the next live Playwright results.
+
+## Dependencies & Concurrency
+- Depends on the clean worktree baseline after `509b97a1a`, `19b9c90a8`, and `d8d313306`.
+- Safe parallelism: none during wipe/setup because the environment reset is global to the machine.
+
+## Documentation Prerequisites
+- `AGENTS.md`
+- `docs/INSTALL_GUIDE.md`
+- `docs/dev/DEV_ENVIRONMENT_SETUP.md`
+- `docs/qa/feature-checks/FLOW.md`
+
+## Delivery Tracker
+
+### PLATFORM-SCRATCH-ITER4-001 - Reproduce scratch setup from zero state
+Status: DONE
+Dependency: none
+Owners: QA, 3rd line support
+Task description:
+- Remove Stella-only containers, images, volumes, and the frontdoor network, then rerun the documented setup entrypoint from zero Stella state.
+
+Completion criteria:
+- [x] Stella-only Docker state is removed.
+- [x] `scripts/setup.ps1` is rerun from zero state.
+- [x] The first blocking setup failure is captured with concrete evidence.
+
+### PLATFORM-SCRATCH-ITER4-002 - Root-cause and repair generated-solution discovery
+Status: DONE
+Dependency: PLATFORM-SCRATCH-ITER4-001
+Owners: 3rd line support, Architect, Developer
+Task description:
+- Diagnose why the documented setup path is trying to build generated docs sample solutions from `dist/`, apply a clean source/discovery fix in the shared solution builders, and document the rule.
+
+Completion criteria:
+- [x] Generated output trees are excluded from solution discovery on both Windows and Linux setup paths.
+- [x] The setup docs state that generated trees are skipped.
+- [x] Scratch setup is rerun from the same zero-state workflow and no longer fails on generated docs sample solutions.
+
+### PLATFORM-SCRATCH-ITER4-003 - Resume first-user Playwright route/action audit after clean setup
+Status: DONE
+Dependency: PLATFORM-SCRATCH-ITER4-002
+Owners: QA
+Task description:
+- Once setup succeeds from zero state, rerun the first-user Playwright route/action audit and continue the normal diagnose/fix/retest loop for any live defects that remain.
+
+Completion criteria:
+- [x] Fresh route sweep evidence is captured on the post-fix scratch stack.
+- [x] Fresh action sweep evidence is captured before any additional fixes.
+- [x] Any newly exposed defects are enumerated before repair work begins.
+
+### PLATFORM-SCRATCH-ITER4-004 - Group post-setup route and action fixes before the next reset
+Status: DONE
+Dependency: PLATFORM-SCRATCH-ITER4-003
+Owners: QA, 3rd line support, Architect, Developer
+Task description:
+- Fix the grouped defects exposed by the resumed audit in one iteration: docs handoff rendering, trust/setup scope preservation, notification setup navigation accessibility, and harness gaps that were misclassifying live behavior during scratch verification.
+
+Completion criteria:
+- [x] Docs search handoffs render shipped markdown even when a module doc contains malformed fenced blocks.
+- [x] Trust/signing and setup-notifications tabs preserve scope query state through all tested navigations.
+- [x] The resumed scratch-stack aggregate audit and the targeted ops-policy rerun both pass on the repaired build.
+
+## Execution Log
+| Date (UTC) | Update | Owner |
+| --- | --- | --- |
+| 2026-03-12 | Sprint created after the next zero-state setup rerun failed during solution discovery. | QA |
+| 2026-03-12 | Reproduced the scratch failure from a full wipe: `scripts/setup.ps1` reached `scripts/build-all-solutions.ps1`, discovered `src/Web/StellaOps.Web/dist/stellaops-web/browser/docs-content/modules/router/samples/Examples.Router.sln`, and failed because the generated docs sample solution is not valid under the repo CPM rules. | QA / 3rd line support |
+| 2026-03-12 | Confirmed the next grouped root causes before fixing: shipped console images omitted `docs-content` because `devops/docker/Dockerfile.console` never copied repo `docs/`, and zero-state setup left `timeline` / `cartographer` unhealthy until a manual restart, so the setup scripts need bounded post-compose convergence instead of reporting success from a partially settled stack. | QA / 3rd line support / Architect |
+| 2026-03-12 | Repaired shared setup discovery to skip generated `dist`, `coverage`, and `output` trees on both PowerShell and shell paths; updated setup docs and reran the scratch bootstrap from zero state through the full `36/36` solution build matrix without rediscovering generated sample solutions. | 3rd line support / Developer |
+| 2026-03-12 | Resumed the first-user Playwright audit on the rebuilt scratch stack and captured a clean aggregate baseline: canonical route sweep `111/111`, aggregate live audit `20/20` suites passed, plus the targeted `ops-policy` rerun passed with `failedActionCount=0` and `runtimeIssueCount=0`. | QA |
+| 2026-03-12 | Grouped the post-setup fixes exposed during the resumed audit: hardened shipped docs markdown rendering against malformed fences, corrected the malformed Advisory AI module doc, preserved scope/query state across trust and notifications setup shells, and tightened several Playwright harnesses so they wait for resolved UI state instead of reporting false negatives during cold-load scratch verification. | QA / Architect / Developer |
+| 2026-03-12 | Verified the grouped fixes with focused Angular coverage `42/42`, `npm run build`, live dist sync into `compose_console-dist`, targeted live sweeps for search, trust/admin, notifications/watchlist, topology, evidence export, release promotion, and a clean rerun of `live-ops-policy-action-sweep.mjs`. | QA / Developer |
+
+## Decisions & Risks
+- Decision: generated output trees under `src/**` are not source-owned build inputs and must be excluded at the shared solution discovery layer, not with one-off exceptions in setup callers.
+- Risk: copied docs samples can reappear after future web builds. The exclusion rule therefore covers `dist`, `coverage`, and `output` globally instead of naming a single sample path.
+- Decision: direct `/docs/*` routes are part of the shipped frontdoor contract, so the console image must package repo docs during Angular builds rather than relying on local dist copies or manual volume sync.
+- Decision: scratch setup should absorb one bounded restart pass for services that remain unhealthy after first compose boot; manual container restarts are not an acceptable first-user recovery path.
+- Decision: shipped docs rendering must tolerate malformed fenced blocks in module markdown, because a single malformed doc must not turn global search knowledge handoffs into blank or broken user routes.
+- Decision: setup shell tabs and sub-tabs are part of the scoped frontdoor contract; query state must be merged through trust and notifications navigation instead of being silently dropped on tab changes.
+- Decision: once an uncovered menu-adjacent route or action is manually exercised during QA, it belongs in the Playwright sweeps so future scratch iterations verify it automatically rather than rediscovering it manually.
+
+## Next Checkpoints
+- Start the next scratch reset iteration from zero Stella-owned runtime state again.
+- Keep extending the aggregate Playwright coverage so fewer manual rediscoveries survive into later setup cycles.
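The sprint's decision that scope query state must be merged through tab navigation, not dropped, can be sketched in a few lines of Python. This only illustrates the merge rule; the console itself is an Angular app and its actual routing code is not shown in this diff. The function name, route paths, and parameter names below are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def merge_scope_query(current_url: str, target_url: str) -> str:
    """Carry query parameters from current_url onto target_url.

    Parameters already present on the target win on conflict.
    Repeated keys collapse to their last value in this sketch.
    """
    current = urlsplit(current_url)
    target = urlsplit(target_url)
    merged = dict(parse_qsl(current.query, keep_blank_values=True))
    merged.update(parse_qsl(target.query, keep_blank_values=True))
    return urlunsplit(target._replace(query=urlencode(merged)))
```

For example, navigating from a hypothetical `/setup/trust-signing?tenant=acme` to `/setup/notifications?tab=rules` would yield `/setup/notifications?tenant=acme&tab=rules`, so the scope survives the tab change.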
--- a/docs/modules/advisory-ai/chat-interface.md
+++ b/docs/modules/advisory-ai/chat-interface.md
@@ -363,7 +363,8 @@ AdvisoryAI:

 ### Vulnerability Investigation

-```n User: What is the status of CVE-2023-44487 in our environment?
+```text
+User: What is the status of CVE-2023-44487 in our environment?

 Assistant: CVE-2023-44487 (HTTP/2 Rapid Reset) **is reachable** in your environment
 [reach:api-gateway:grpc.Server].
@@ -376,12 +377,13 @@ Assistant: CVE-2023-44487 (HTTP/2 Rapid Reset) **is reachable** in your environm
 **Recommended actions:**
 [Accept Risk]{action:approve,cve_id=CVE-2023-44487}
 [Create VEX]{action:create_vex,product=api-gateway,vulnerability=CVE-2023-44487,status=affected}
-```n
+```
 ---

 ### Action Execution

-```n User: Please create a VEX statement marking CVE-2023-44487 as affected for api-gateway.
+```text
+User: Please create a VEX statement marking CVE-2023-44487 as affected for api-gateway.

 Assistant: I will create a VEX statement with the following details:

@@ -391,7 +393,7 @@ Assistant: I will create a VEX statement with the following details:
 - **Justification:** Vulnerable code is reachable

 [Create VEX Statement]{action:create_vex,product=api-gateway,vulnerability=CVE-2023-44487,status=affected,justification=vulnerable_code_present}
-```n
+```
 ---

 ## Configuration
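The two hunks above repair fences that carried stray text after the backticks. Under CommonMark rules a closing fence cannot carry trailing text, so such a block stays open and swallows the content that follows. The sprint also calls for the renderer to tolerate this class of error. One way to do that is a pre-render normalization pass, sketched here in Python. This is a hypothetical approach, not the renderer change made in this commit, and it handles only backtick fences that start at column 0.

```python
import re
from typing import List

# A fence line: three or more backticks at column 0, then optional trailing text.
FENCE = re.compile(r"^(`{3,})(.*)$")


def normalize_fences(markdown: str) -> str:
    """Repair fence lines with trailing junk so every block opens and closes."""
    out: List[str] = []
    open_len = 0  # 0 means "not inside a fenced block".
    for line in markdown.splitlines():
        match = FENCE.match(line)
        if not match:
            out.append(line)
            continue
        ticks, rest = match.group(1), match.group(2).strip()
        if open_len:
            if len(ticks) < open_len:
                out.append(line)  # Shorter run: ordinary content of the block.
                continue
            out.append("`" * open_len)  # Close the block; drop trailing text.
            open_len = 0
        else:
            # Keep the first token as the info string; the rest is block content.
            info, _, spill = rest.partition(" ")
            out.append(ticks + info)
            if spill:
                out.append(spill)
            open_len = len(ticks)
    if open_len:
        out.append("`" * open_len)  # Close a block left open at end of input.
    return "\n".join(out)
```

With this pass, a malformed module doc still renders as a closed code block followed by normal content, so it cannot swallow the remainder of the page.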
--- a/docs/modules/ui/README.md
+++ b/docs/modules/ui/README.md
@@ -8,6 +8,10 @@

 The Console presents operator dashboards for scans, policies, VEX evidence, runtime posture, and admin workflows.

+## Latest updates (2026-03-12)
+- Console container builds now copy the repo `docs/` tree into the Angular build stage so `docs-content` is bundled into shipped images and direct `/docs/*` routes resolve on the live frontdoor instead of only in local dist copies.
+- Live search route verification now treats knowledge-card handoffs as failed unless the destination documentation page renders real content, preventing blank docs routes from slipping through route-only checks.
+
 ## Latest updates (2026-03-10)
 - Hardened revived `Ops > Policy > Simulation` direct-entry surfaces so coverage, lint, promotion-gate, and diff routes restore stable defaults when host wiring omits pack/version/environment inputs.
 - Coverage now hydrates on first render instead of waiting for a second interaction, preventing blank direct-route states on `/ops/policy/simulation/coverage`.