Restore policy simulation history compatibility

This commit is contained in:
master
2026-03-10 00:42:18 +02:00
parent ac544c0064
commit 1df79ac75e
4 changed files with 1050 additions and 0 deletions

@@ -0,0 +1,79 @@
# Sprint 20260309-011 - Platform Live Remaining Route Contract Repair
## Topic & Scope
- Repair the remaining authenticated live frontdoor route failures exposed by the full scratch rebuild after the shared gateway/runtime regressions were already cleared.
- Fix root causes in the correct layer: Authority scope semantics, Platform compatibility read models, Policy governance/simulation surfaces, Signals compatibility endpoints, JobEngine SQL fallback behavior, and the remaining frontend response-shape adapters.
- Keep the iteration driven by real Playwright evidence from `https://stella-ops.local`, then rebuild and redeploy the touched services before rerunning the authenticated sweep.
- Working directory: `src/Platform/StellaOps.Platform.WebService`.
- Allowed coordination edits: `src/Authority/StellaOps.Authority/StellaOps.Auth.ServerIntegration/**`, `src/Authority/StellaOps.Authority/StellaOps.Auth.ServerIntegration.Tests/**`, `src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Infrastructure/**`, `src/Policy/StellaOps.Policy.Gateway/**`, `src/Policy/__Tests/StellaOps.Policy.Gateway.Tests/**`, `src/Signals/StellaOps.Signals/**`, `src/Signals/__Tests/StellaOps.Signals.Tests/**`, `src/Web/StellaOps.Web/**`, `docs/modules/platform/**`, `docs/modules/policy/**`, `docs/modules/signals/**`, `docs/modules/ui/console-architecture.md`, `docs/implplan/SPRINT_20260309_011_Platform_live_remaining_route_contract_repair.md`.
- Expected evidence: targeted unit/integration test runs against individual `.csproj` files, rebuilt service images, redeployed live stack, refreshed authenticated Playwright route/action artifacts.
## Dependencies & Concurrency
- Depends on `SPRINT_20260309_001_Platform_scratch_setup_bootstrap_restore.md` for the scratch rebuild baseline, `SPRINT_20260309_006_Platform_rebuild_runtime_contract_repairs.md` for the migration/binding recovery, `SPRINT_20260309_008_Router_live_messaging_heartbeat_contract_repair.md` for the cleared gateway health flap, and `SPRINT_20260309_010_FE_live_auth_scope_console_and_policy_alignment.md` for the already-isolated frontend route inventory.
- Safe parallelism: avoid unrelated component-revival and search work outside the paths listed above; do not revert unrelated dirty files in the shared worktree.
## Documentation Prerequisites
- `AGENTS.md`
- `docs/code-of-conduct/CODE_OF_CONDUCT.md`
- `docs/qa/feature-checks/FLOW.md`
- `docs/modules/platform/architecture-overview.md`
- `docs/modules/policy/architecture.md`
- `docs/modules/signals/guides/unknowns-registry.md`
- `docs/modules/ui/console-architecture.md`
## Delivery Tracker
### LIVE-REPAIR-011-001 - Repair remaining authenticated route contracts at the source
Status: DOING
Dependency: none
Owners: Developer, Test Automation
Task description:
- Fix the confirmed live contract defects behind the remaining failed routes: quota authorization OR-scope semantics, dead-letter summary SQL fallback coverage, missing Platform console/AOC compatibility endpoints, missing Policy governance and shadow-mode/simulation endpoints, missing Signals compatibility list/stats endpoints, and the remaining frontend adapters for pack-registry and notifications.
- Favor durable compatibility/read-model layers and tests over route-local workarounds.
Completion criteria:
- [ ] `/ops/operations/quotas`, `/ops/operations/dead-letter`, `/ops/operations/aoc`, `/ops/operations/signals`, `/ops/operations/packs`, `/ops/operations/notifications`, `/ops/operations/status`, `/ops/policy/simulation`, `/ops/policy/trust-weights`, and `/ops/policy/staleness` stop failing for the currently confirmed source-level reasons.
- [ ] Targeted tests against the touched `.csproj` and frontend spec files fail before the fix and pass after it.
- [ ] Updated docs describe any new compatibility contract that is now part of the live platform.
### LIVE-REPAIR-011-002 - Rebuild and redeploy the repaired service slice
Status: TODO
Dependency: LIVE-REPAIR-011-001
Owners: Developer, QA
Task description:
- Rebuild every touched service and the web bundle from the repaired source, redeploy them into the local compose stack, and verify direct service readiness before rerunning Playwright.
Completion criteria:
- [ ] Changed images and the web bundle are rebuilt from current source.
- [ ] The live compose stack is redeployed without disturbing unrelated in-flight work.
- [ ] Direct service probes succeed for the repaired compatibility surfaces before the browser sweep resumes.
### LIVE-REPAIR-011-003 - Reverify the authenticated frontdoor with Playwright
Status: TODO
Dependency: LIVE-REPAIR-011-002
Owners: QA
Task description:
- Rerun the authenticated frontdoor Playwright checks from the rebuilt stack, verify that the previously failing pages load cleanly, and record any remaining route/action defects for the next iteration instead of prematurely declaring an all-clear.
Completion criteria:
- [ ] Authenticated Playwright auth bootstrap and canonical route sweep are rerun against `https://stella-ops.local`.
- [ ] Targeted page/action rechecks are captured for the repaired route family.
- [ ] Remaining failures, if any, are documented with current artifacts and triaged to the next sprint item.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-03-09 | Sprint created after the rebuilt live stack still failed 10 authenticated canonical routes due to confirmed source-level contract gaps across Authority, Platform, JobEngine, Policy, Signals, and Web. | Developer |
| 2026-03-09 | Policy simulation compatibility handlers now serve history, compare, verify, and pin contracts in the Policy gateway; targeted xUnit v3 class execution passed, and live frontdoor retesting isolated the remaining failure to router translation gaps for `/policy/simulations*` rather than missing service endpoints. | Developer |
## Decisions & Risks
- Decision: keep quota backward compatibility in Authority authorization semantics rather than diluting Platform policy names or broadening token issuance.
- Decision: add deterministic compatibility/read-model endpoints where the rebuilt frontend already depends on stable contracts (`/api/console/status`, `/api/v1/aoc/*`, `/api/v1/governance/*`, `/policy/shadow/*`, `/api/v1/signals*`) instead of replacing live HTTP clients with mocks.
- Decision: treat Policy simulation history tools as a two-layer repair. First restore the backend compatibility contract inside `StellaOps.Policy.Gateway`; then handle the frontdoor router translation for `/policy/simulations*` as a separate iteration so service and routing fixes remain independently auditable.
- Risk: the notifications health `400` remains the least-certain defect in the current set; if the direct service probe still disagrees with the frontdoor after the rebuild, isolate it in the Notify slice rather than masking it in Playwright expectations.
- Audit note: one external web lookup was attempted earlier in the session before the repo web-fetch policy was re-read; no external code or configuration was imported, and implementation continued using local docs and source only.
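
The quota decision above depends on OR-scope semantics: a request is authorized when the token carries at least one of several accepted scopes, so older tokens keep working without broader issuance. The following is a minimal sketch of that check, not the Authority implementation. The function names and both scope names are hypothetical; the only assumption taken from outside this document is that the scope claim is a space-delimited string, as in OAuth 2.0.

```typescript
// Parse a space-delimited scope claim into a set, ignoring empty segments.
function parseScopes(scopeClaim: string): Set<string> {
  return new Set(scopeClaim.split(" ").filter((scope) => scope.length > 0));
}

// OR semantics: access is granted when at least one accepted scope is present.
function satisfiesAnyScope(
  scopeClaim: string,
  acceptedScopes: readonly string[],
): boolean {
  const granted = parseScopes(scopeClaim);
  return acceptedScopes.some((scope) => granted.has(scope));
}

// Hypothetical scope names, for illustration only.
const acceptedQuotaScopes: readonly string[] = [
  "example.quota.read",
  "example.legacy.read",
];

// A token holding only the legacy scope still passes the quota check.
const legacyTokenAllowed = satisfiesAnyScope(
  "openid example.legacy.read",
  acceptedQuotaScopes,
);

// A token holding neither accepted scope is rejected.
const unrelatedTokenAllowed = satisfiesAnyScope(
  "openid profile",
  acceptedQuotaScopes,
);
```

Keeping the alternative scopes in the authorization requirement, rather than in token issuance, means the set of tokens in circulation does not change; only the rule that evaluates them does.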
## Next Checkpoints
- 2026-03-09: land scoped source/test fixes for the remaining authenticated route cluster.
- 2026-03-09: rebuild the changed services and web bundle from source.
- 2026-03-09: rerun authenticated Playwright sweeps and either commit the repaired iteration or record the remaining defects for the next pass.

@@ -38,6 +38,17 @@ Non-goals: policy authoring UI (handled by Console), ingestion or advisory norma
- Translation sources are layered deterministically: shared embedded `common` bundle -> Policy embedded bundle (`Translations/*.policy.json`) -> Platform runtime override bundle.
- The rollout localizes selected request validation and readiness responses for `en-US` and `de-DE`.
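
The layering order above can be illustrated with a small merge routine. This is a sketch under stated assumptions: bundles are treated as flat key-to-string maps, later layers are assumed to override earlier ones, and the type name, function name, and translation keys are all hypothetical rather than taken from the Policy service.

```typescript
// Assumption: each bundle is a flat map from translation key to text.
type TranslationBundle = Readonly<Record<string, string>>;

// Merge bundles in precedence order; a later layer overrides an earlier one.
function mergeTranslationLayers(
  layers: readonly TranslationBundle[],
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer)) {
      const value = layer[key];
      if (value !== undefined) {
        merged[key] = value;
      }
    }
  }
  return merged;
}

// Hypothetical keys and texts, listed in the documented order:
// shared common bundle, then Policy bundle, then Platform runtime override.
const commonBundle: TranslationBundle = {
  "common.ok": "OK",
  "common.error": "Error",
};
const policyBundle: TranslationBundle = {
  "common.error": "Policy error",
  "policy.validation.required": "Field is required",
};
const platformOverrideBundle: TranslationBundle = {
  "policy.validation.required": "This field is mandatory",
};

const resolvedTranslations = mergeTranslationLayers([
  commonBundle,
  policyBundle,
  platformOverrideBundle,
]);
```

Because the input order is fixed and each layer is applied in sequence, the same three bundles always resolve to the same result, which is the property the word "deterministically" refers to.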
### 1.2 · Simulation compatibility contract (Sprint 20260309_011)
- The Policy Gateway exposes a deterministic compatibility surface for the Console simulation history workflow while the deeper Policy Engine read models continue to evolve.
- Compatibility endpoints under `/policy` include:
- `GET /policy/simulations/history`
- `GET /policy/simulations/compare`
- `POST /policy/simulations/{simulationId}/verify`
- `PATCH /policy/simulations/{simulationId}`
- These endpoints return tenant-scoped history entries, comparison diffs, reproducibility checks, and pin state with stable field names that match the live Console contract (`resultHash`, `findingsBySeverity`, `pinned`, `matchPercentage`, `discrepancies`).
- The compatibility layer is intentionally stateful per tenant so operators can exercise history actions end to end in the live shell without client-side mocks.
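
To make the tenant-scoped, stateful behavior concrete, here is a minimal in-memory sketch. It is not the gateway's implementation. The field names `resultHash`, `findingsBySeverity`, `pinned`, `matchPercentage`, and `discrepancies` come from the contract above; every type, the `simulationId` field, the store class, and the all-or-nothing hash comparison used for verification are assumptions made for illustration.

```typescript
// Field names follow the documented contract; the types are assumptions.
interface SimulationHistoryEntry {
  simulationId: string; // assumed identifier field
  resultHash: string;
  findingsBySeverity: Record<string, number>;
  pinned: boolean;
}

interface SimulationVerification {
  matchPercentage: number;
  discrepancies: string[];
}

// Hypothetical store keyed by tenant, so one tenant's history and pin
// state are never visible to another tenant.
class TenantSimulationStore {
  private readonly byTenant = new Map<string, Map<string, SimulationHistoryEntry>>();

  add(tenantId: string, entry: SimulationHistoryEntry): void {
    let entries = this.byTenant.get(tenantId);
    if (entries === undefined) {
      entries = new Map<string, SimulationHistoryEntry>();
      this.byTenant.set(tenantId, entries);
    }
    entries.set(entry.simulationId, { ...entry });
  }

  // Models GET /policy/simulations/history; sorted by id for stable output.
  history(tenantId: string): SimulationHistoryEntry[] {
    const entries = this.byTenant.get(tenantId);
    if (entries === undefined) {
      return [];
    }
    return Array.from(entries.values())
      .map((entry) => ({ ...entry }))
      .sort((a, b) =>
        a.simulationId < b.simulationId ? -1 : a.simulationId > b.simulationId ? 1 : 0,
      );
  }

  // Models PATCH /policy/simulations/{simulationId}, assumed to set pin state.
  setPinned(
    tenantId: string,
    simulationId: string,
    pinned: boolean,
  ): SimulationHistoryEntry | undefined {
    const entry = this.byTenant.get(tenantId)?.get(simulationId);
    if (entry === undefined) {
      return undefined;
    }
    entry.pinned = pinned;
    return { ...entry };
  }

  // Models POST /policy/simulations/{simulationId}/verify by comparing a
  // recomputed hash with the stored one (a simplified stand-in).
  verify(
    tenantId: string,
    simulationId: string,
    recomputedHash: string,
  ): SimulationVerification | undefined {
    const entry = this.byTenant.get(tenantId)?.get(simulationId);
    if (entry === undefined) {
      return undefined;
    }
    const matches = entry.resultHash === recomputedHash;
    return {
      matchPercentage: matches ? 100 : 0,
      discrepancies: matches
        ? []
        : [`resultHash mismatch: expected ${entry.resultHash}, got ${recomputedHash}`],
    };
  }
}
```

Returning copies from `history` and `setPinned` keeps callers from mutating stored state by accident, and the fixed sort order keeps repeated reads identical for the same tenant.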
---
## 2·High-Level Architecture