Close scratch iteration 009 grouped policy and VEX audit repairs

2026-03-13 19:25:48 +02:00
parent 6954ac7967
commit bf4ff5bfd7
41 changed files with 2413 additions and 553 deletions
--- a/docs/implplan/SPRINT_20260313_004_Platform_scratch_iteration_009_full_route_action_audit.md
+++ b/docs/implplan/SPRINT_20260313_004_Platform_scratch_iteration_009_full_route_action_audit.md
@@ -0,0 +1,84 @@
+# Sprint 20260313_004 - Platform Scratch Iteration 009 Full Route Action Audit
+
+## Topic & Scope
+- Wipe Stella-owned runtime state again and rerun the documented setup path from zero state.
+- Re-enter the application as a first-time user after bootstrap and rerun the full route, page-load, and page-action audit with Playwright.
+- Recheck changed or newly discovered surfaces and convert any new manual findings into retained Playwright scenarios before the iteration is considered complete.
+- Group any newly exposed defects before fixing so the next commit closes a full iteration rather than a single page slice.
+- Working directory: `.`.
+- Expected evidence: wipe proof, setup convergence proof, fresh Playwright route/page/action evidence, retained scenario coverage for new findings, grouped defect list, fixes, and retest results.
+
+## Dependencies & Concurrency
+- Depends on local commit `6954ac796` as the clean baseline for the next scratch cycle.
+- Safe parallelism: none during wipe/setup because the environment reset is global to the machine.
+
+## Documentation Prerequisites
+- `AGENTS.md`
+- `docs/INSTALL_GUIDE.md`
+- `docs/dev/DEV_ENVIRONMENT_SETUP.md`
+- `docs/qa/feature-checks/FLOW.md`
+
+## Delivery Tracker
+
+### PLATFORM-SCRATCH-ITER9-001 - Rebuild from zero Stella runtime state
+Status: DONE
+Dependency: none
+Owners: QA, 3rd line support
+Task description:
+- Remove Stella-only containers, images, volumes, and the frontdoor network, then rerun the documented setup entrypoint from zero Stella state.
+
+Completion criteria:
+- [x] Stella-only Docker state is removed.
+- [x] scripts/setup.ps1 is rerun from zero state.
+- [x] The first setup outcome is captured before UI verification starts.
+
+### PLATFORM-SCRATCH-ITER9-002 - Re-run the first-user full route/page/action audit
+Status: DONE
+Dependency: PLATFORM-SCRATCH-ITER9-001
+Owners: QA
+Task description:
+- After scratch setup converges, rerun the canonical route sweep plus the full route/page/action audit suite, including changed-surface and route-ownership checks, and enumerate every newly exposed issue before repair work begins.
+
+Completion criteria:
+- [x] Fresh route sweep evidence is captured on the rebuilt stack.
+- [x] Fresh route/page/action evidence is captured across the full aggregate suite, including changed-surface and ownership checks.
+- [x] Newly exposed defects are grouped and any new manual findings are queued into retained Playwright scenarios before any fix commit is prepared.
+
+### PLATFORM-SCRATCH-ITER9-003 - Repair the grouped defects exposed by the fresh audit
+Status: DONE
+Dependency: PLATFORM-SCRATCH-ITER9-002
+Owners: 3rd line support, Architect, Developer
+Task description:
+- Diagnose the grouped failures exposed by the fresh audit, choose the clean product/architecture-conformant fix, implement it, add retained Playwright coverage for the new behavior when needed, and rerun the affected verification slices plus the aggregate audit before committing.
+
+Completion criteria:
+- [x] Root causes are recorded for the grouped failures.
+- [x] Fixes land with focused regression coverage and retained Playwright scenario updates where practical.
+- [x] The rebuilt stack is retested before the iteration commit.
+
+## Execution Log
+| Date (UTC) | Update | Owner |
+| --- | --- | --- |
+| 2026-03-13 | Sprint created for the next scratch iteration after local commit `6954ac796` closed the previous clean baseline. | QA |
+| 2026-03-13 | Removed Stella-only containers, `stellaops/*:dev` images, Stella compose volumes, and the `stellaops` / `stellaops_frontdoor` networks to return the machine to zero Stella runtime state for the new iteration. | QA / 3rd line support |
+| 2026-03-13 | The zero-state setup rerun completed cleanly: `36/36` solution builds passed, the full image matrix rebuilt, platform services converged, and `60/61` Stella containers are healthy on `https://stella-ops.local`. | QA / 3rd line support |
+| 2026-03-13 | The standalone canonical route sweep finished with `111/111` passed routes and `0` failed routes on the rebuilt stack. | QA |
+| 2026-03-13 | The first-user aggregate Playwright audit finished cleanly at `22/22` passed suites. The retained surface now includes tightened user-reported admin/trust/report checks, deeper `/ops/policy/*` tab coverage, and corrected uncovered-surface navigation waiting for slower SPA hand-offs like `/releases/environments -> Open Agents`. | QA |
+| 2026-03-13 | The aggregate audit recorded one first-pass runtime-only setup-topology failure, auto-retried it, and stabilized cleanly. The behavior did not reproduce after retry, so the issue was recorded as cold-start audit noise rather than a product regression. | QA / 3rd line support |
+| 2026-03-13 | Grouped defects from the fresh audit were traced to two root-cause families: policy governance compatibility gaps and placeholder tenant scope on the web shell, plus missing VexHub repository registrations/startup migrations/runtime model compatibility for fresh databases. | 3rd line support |
+| 2026-03-13 | Implemented the grouped repair set, then revalidated it with focused retained tests and targeted executable slices: Angular feature specs `14/14`, `GovernanceCompatibilityEndpointsTests` `6/6`, and VexHub registration/model tests `2/2`. | Developer / Test Automation |
+
+## Decisions & Risks
+- Decision: each scratch iteration remains a full wipe -> setup -> route/page/action audit -> grouped remediation loop; if the audit comes back clean, that still counts as a completed iteration because the full loop was executed.
+- Decision: changed or newly discovered user flows must be converted into retained Playwright coverage before the next scratch iteration starts so the audit surface expands instead of rediscovering the same gaps manually.
+- Risk: scratch rebuilds remain expensive, so verification stays Playwright-first with focused test/build slices rather than indiscriminate full-solution test runs.
+- Decision: policy governance compatibility stays tenant/project scoped end to end. The repair uses shared live scope resolution in the web shell and deterministic compatibility endpoints in the gateway instead of hardcoded tenants or page-local mock state.
+- Decision: fresh-install VexHub convergence stays startup-migration driven. Missing source/conflict/ingestion-job repositories and the `SearchVector` EF model incompatibility were fixed in the persistence layer rather than worked around in the UI.
+- Decision: newly discovered manual routes and user-reported surfaces were converted into retained Playwright coverage before the iteration closed, including security reports tab embedding, trust/admin surfaces, deeper policy navigation, and delayed uncovered-surface link hand-offs.
+- Risk: the full aggregate audit still sees one cold-start-only topology runtime failure that stabilizes after automatic retry. The iteration keeps that retry evidence recorded so repeated occurrence can be treated as a real product defect rather than silently ignored.
+- Risk: `dotnet test --filter` remains unreliable on these Microsoft.Testing.Platform projects. Targeted backend evidence for this iteration therefore uses the direct xUnit executables (`6/6` policy, `2/2` VexHub) instead of solution-level filtered runs.
+
+## Next Checkpoints
+- Start iteration 010 from another Stella-only wipe and rerun the documented setup path from zero state.
+- Run the full Playwright route/page/action audit, including the expanded policy/admin/trust/reports/search retained coverage, before any new fix work begins.
+
--- a/docs/modules/policy/architecture.md
+++ b/docs/modules/policy/architecture.md
@@ -850,6 +850,29 @@ stella exception status <request-id>
 - [Trust Lattice Policy Gates](#63--trust-lattice-policy-gates)
 - [Budget Attestation](./budget-attestation.md)

+### Governance Compatibility Endpoints
+
+The console governance workspaces also depend on a tenant-scoped compatibility surface under `/api/v1/governance/*` that lives in the Policy gateway.
+
+- `GET /api/v1/governance/trust-weights`
+- `PUT /api/v1/governance/trust-weights/{weightId}`
+- `POST /api/v1/governance/trust-weights/preview-impact`
+- `GET /api/v1/governance/staleness/config`
+- `PUT /api/v1/governance/staleness/config/{dataType}`
+- `GET /api/v1/governance/staleness/status`
+- `GET /api/v1/governance/conflicts/dashboard`
+- `GET /api/v1/governance/conflicts`
+- `POST /api/v1/governance/conflicts/{conflictId}/resolve`
+- `POST /api/v1/governance/conflicts/{conflictId}/ignore`
+
+Contract requirements:
+- All requests are tenant-scoped and may include an optional `projectId`.
+- Console clients must resolve live tenant scope from the active session/context and must not rely on legacy placeholder aliases.
+- Conflict dashboard/list responses remain deterministic so scratch rebuilds and replayed Playwright sweeps see stable cards, trend buckets, and action affordances.
+
+Implementation reference:
+- `src/Policy/StellaOps.Policy.Gateway/Endpoints/GovernanceCompatibilityEndpoints.cs`
+
 ---

 ## 7 · Security & Tenancy
--- a/docs/modules/vex-hub/architecture.md
+++ b/docs/modules/vex-hub/architecture.md
@@ -19,7 +19,7 @@ Non-goals: policy decisioning (Policy Engine), consensus computation (VexLens),
 - **VexHub.Worker**: Background workers for ingestion schedules and validation pipelines.
 - **Normalization Pipeline**: Canonicalizes statements, deduplicates, and links provenance.
 - **Validation Pipeline**: Schema validation (OpenVEX/CycloneDX/CSAF) and signature checks.
- **Storage**: PostgreSQL schema `vexhub` for normalized statements, provenance, conflicts, and export cursors.
+- **Storage**: PostgreSQL schema `vexhub` for sources, normalized statements, provenance, conflicts, ingestion jobs, and export cursors.

 ## 4) Data Model (Draft)
 - `vexhub.statement`
@@ -37,11 +37,25 @@ All tables must include `tenant_id`, UTC timestamps, and deterministic ordering
 - `GET /api/v1/vex/cve/{cve-id}`
 - `GET /api/v1/vex/package/{purl}`
 - `GET /api/v1/vex/source/{source-id}`
+- `GET /api/v1/vex/stats`
 - `GET /api/v1/vex/export` (bulk OpenVEX feed)
 - `GET /api/v1/vex/index` (vex-index.json)

 Responses are deterministic: stable ordering by `timestamp DESC`, then `source_id ASC`, then `statement_hash ASC`.

+`GET /api/v1/vex/stats` returns the dashboard contract consumed by the console VEX surfaces:
+- `totalStatements`
+- `verifiedStatements`
+- `flaggedStatements`
+- `byStatus`
+- `bySource`
+- `recentActivity`
+- `trends`
+- `generatedAt`
+
+The stats endpoint must keep working on fresh installs even when a committed EF compiled-model stub is empty; runtime model fallback is required until a real optimized model is generated.
+The service must also auto-apply embedded SQL migrations for schema `vexhub` on startup so wiped volumes converge without manual SQL bootstrap.
+
 ## 6) Determinism & Offline Posture
 - Ingestion runs against frozen snapshots where possible; all outputs include `snapshot_hash`.
 - Canonical JSON serialization with stable key ordering.
@@ -67,6 +81,7 @@ Responses are deterministic: stable ordering by `timestamp DESC`, then `source_i
 ## 10) Testing Strategy
 - Unit tests for normalization and validation pipelines.
 - Integration tests with Postgres for ingestion and API outputs.
+- Persistence registration and runtime-model tests that prove source/conflict/ingestion-job repositories and startup migrations are wired on the service path.
 - Determinism tests comparing repeated exports with identical inputs.

-*Last updated: 2025-12-22.*
+*Last updated: 2026-03-13.*