docs(implplan): sprint log updates for scheduler plugin, retest, no-mocks
- SPRINT_20260408_003_JobEngine: TASK-013 added for scheduler persistence
auto-migrations + dedupe 007/008 migrations; execution log notes the
2026-04-13 QA finding and trend-endpoint fix (commit 337aa5802).
- SPRINT_20260409_002_Platform + SPRINT_20260410_001_Web_runtime_no_mocks:
log updates reflecting current state of ongoing work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -500,6 +500,24 @@ Completion criteria:
- [ ] Doctor scheduling migration documented
- [ ] Plugin development guide exists for future plugin authors

### TASK-013 - Wire Scheduler persistence auto-migrations + dedupe 007/008
Status: TODO
Dependency: TASK-006
Owners: Developer / Implementer
Task description:
- The Scheduler service does not wire `AddStartupMigrations` (see `src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Persistence/Extensions/SchedulerPersistenceExtensions.cs`). Migrations 007 (`job_kind`/`plugin_config`) and 008 (`doctor_trends`) never execute on startup. `SystemScheduleBootstrap` crashes on every boot with `Npgsql.PostgresException 42703: column "job_kind" of relation "schedules" does not exist`, blocking default Doctor schedule seeding (the goal of TASK-008).
- Two collision pairs exist under `Migrations/`: `007_add_job_kind_plugin_config.sql` + `007_add_schedule_job_kind.sql`, and `008_add_doctor_trends.sql` + `008_doctor_trends_table.sql`. Pick one of each, delete the other, reconcile index/comment differences.
- Wire `AddStartupMigrations("scheduler", "StellaOps.Scheduler", persistenceAssembly)` from `StellaOps.Infrastructure.Postgres.Migrations` in `AddSchedulerPersistence(...)` so the embedded SQL files run on every cold start. Reference pattern: `src/Signals/__Libraries/StellaOps.Signals.Persistence/Extensions/`.
- This violates the non-negotiable rule in `CLAUDE.md §2.7` (Database auto-migration requirement). Symptom observed 2026-04-13 on a stack that had been running since fresh DB bootstrap.
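
The collision check described in the task can be sketched as a small script. This is a hypothetical helper, not something taken from the repository: the function name `find_prefix_collisions` and the assumption that migration files follow a `<NNN>_<description>.sql` naming scheme are illustrative only. The four file names are the ones listed in the task description.

```python
import re
from collections import defaultdict

# Assumption: migration files are named "<numeric prefix>_<description>.sql".
_PREFIX = re.compile(r"^(\d+)_.+\.sql$")


def find_prefix_collisions(filenames):
    """Return {prefix: sorted file names} for every prefix used more than once."""
    by_prefix = defaultdict(list)
    for name in filenames:
        match = _PREFIX.match(name)
        if match:
            by_prefix[match.group(1)].append(name)
    return {p: sorted(names) for p, names in by_prefix.items() if len(names) > 1}


# File names taken from the task description.
migrations = [
    "007_add_job_kind_plugin_config.sql",
    "007_add_schedule_job_kind.sql",
    "008_add_doctor_trends.sql",
    "008_doctor_trends_table.sql",
]

for prefix, names in sorted(find_prefix_collisions(migrations).items()):
    print(prefix, "->", ", ".join(names))
```

A check along these lines could run in CI so that a second file with an already-used prefix is caught before it reaches a running stack; whether the migration runner itself rejects such collisions is not stated in this log.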

Completion criteria:
- [ ] `scheduler.schema_migrations` table exists after a fresh-DB startup (parity with other services in §2.7)
- [ ] `scheduler.schedules` has `job_kind TEXT NOT NULL DEFAULT 'scan'` and `plugin_config JSONB` columns
- [ ] `scheduler.doctor_trends` table exists
- [ ] `SystemScheduleBootstrap` seeds 3 default Doctor schedules without error on fresh DB
- [ ] Duplicate 007/008 SQL files collapsed to one each
- [ ] Integration test (or targeted manual run) proves volume-reset → working scheduler without any manual `psql`

## Execution Log

| Date (UTC) | Update | Owner |
@@ -508,6 +526,7 @@ Completion criteria:
| 2026-04-08 | Batch 1 complete: Plugin.Abstractions library (ISchedulerJobPlugin, SchedulerPluginRegistry, ScanJobPlugin), Schedule model extended with JobKind+PluginConfig, SQL migration 007, contracts updated, Program.cs wired. All 143 existing tests pass. | Developer |
| 2026-04-08 | Batch 2 complete: DoctorJobPlugin created with HTTP execution, trend storage (PostgresDoctorTrendRepository), alert service, trend endpoints. SQL migration 008 for doctor_trends table. 3 default Doctor schedules seeded. | Developer |
| 2026-04-08 | Batch 3 complete: doctor-scheduler commented out in both compose files. AGENTS.md created for scheduler plugins. Build verified: WebService + Doctor plugin compile with 0 warnings/errors. | Developer |
| 2026-04-13 | QA verification on running stack: Doctor trend endpoints returned 500 due to missing `[FromServices]` on `IDoctorTrendRepository?` in three endpoints. Fixed (commit `337aa5802`); all four trend endpoints now return HTTP 200 via gateway. Discovered Scheduler persistence never wires `AddStartupMigrations` — migrations 007/008 never ran; `SystemScheduleBootstrap` crashes on every boot; duplicate 007/008 SQL files present. Opened TASK-013. | QA / Developer |

## Decisions & Risks

@@ -116,7 +116,7 @@ Completion criteria:
- [x] The `/ops/scripts` UI loads script data instead of showing the generic load failure banner.

### RETEST-008 - Remove bogus local feed-mirror timeout state for `mirror-osv-001`
Status: DOING
Status: DONE
Dependency: RETEST-007
Owners: Developer / QA
Task description:
@@ -125,9 +125,9 @@ Task description:
- Reverify the fix with focused API/UI checks against the local stack.

Completion criteria:
- [ ] The local OSV mirror detail no longer reports a fabricated timeout error.
- [ ] Backend and frontend seed/mock fixtures stay aligned for the OSV mirror state.
- [ ] Live local verification is recorded in the sprint log.
- [x] The local OSV mirror detail no longer reports a fabricated timeout error.
- [x] Backend and frontend seed/mock fixtures stay aligned for the OSV mirror state.
- [x] Live local verification is recorded in the sprint log.

## Execution Log
| Date (UTC) | Update | Owner |
@@ -153,6 +153,9 @@ Completion criteria:
| 2026-04-09 | Authority standard-plugin bootstrap hardening shipped. `StandardPluginBootstrapper` now retries the bootstrap pass instead of abandoning seeding after the first transient storage failure, and the focused Authority Standard Plugin suite now passes `44/44`, including a new transient-failure regression test. Module docs were updated to record that bootstrap user/client provisioning now converges when storage becomes reachable after startup. | Developer / QA |
| 2026-04-09 | Live deployment and browser verification completed. Rebuilding `stellaops/authority:dev` through `devops/docker/Dockerfile.platform` was blocked by stale Dockerfile paths (`/src/Signer`, `/src/Scheduler` no longer exist), so the updated Authority host was published locally and copied into the running container before restarting `stellaops-authority`. On restart, Authority logs confirmed `assigned role 'admin' to bootstrap user 'admin'`. The real browser-level script `npm run test:e2e:live:auth` then succeeded and produced a full `demo-prod` session for `admin`, landing on `https://stella-ops.local/?tenant=demo-prod&regions=apac,eu-west,us-east,us-west` with title `Dashboard - StellaOps`. | Developer / QA |
| 2026-04-10 | `/ops/scripts` was fixed at the owning-service layer. Release Orchestrator scripts now bind isolated `ScriptsPostgresOptions`, embed their own `scripts` schema DDL/seed migration, and register a dedicated startup migration host. A shared infrastructure bug also had to be corrected: `AddStartupMigrations(...)` previously used `AddHostedService(...)`, which deduplicated the second migration host in a single service and prevented the `scripts` schema migrator from starting beside the core `release_orchestrator` migrator. Focused Release Orchestrator integration tests now pass `15/15`, release-orchestrator was republished into the live container, startup logs show `Migration.ReleaseOrchestrator.Scripts` applying `001_initial.sql`, and a real authenticated Playwright probe confirmed `https://stella-ops.local/ops/scripts` loads with heading `Scripts`, no generic error banner, and 4 rendered scripts while `GET /api/v2/scripts` returns `200`. | Developer / QA |
| 2026-04-13 | Feed-mirror backend convergence work replaced the remaining active Concelier placeholder handlers for snapshot actions, retention updates, air-gap bundles/imports, and version locks with persisted PostgreSQL/filesystem-backed behavior. A focused xUnit v3 class run (`StellaOps.Concelier.WebService.Tests.AdvisorySourceEndpointsTests`) verified the touched endpoint slice at `10` total / `9` passed / `1` pre-existing unrelated failure (`ListEndpoints_WithoutTenantHeader_ReturnsBadRequest` returns `200` instead of `400`). | Developer / QA |
| 2026-04-13 | Republish of the updated Concelier host exposed two production-only runtime faults that the prior local binary had been hiding: embedded Release Orchestrator topology stores were still registered through an unbound `Func<Guid>` path, and `001_regions_and_infra_bindings.sql` used an invalid table-level `UNIQUE(... COALESCE(...))` expression. Both were fixed, Concelier was republished into the live container, and startup now applies `ReleaseOrchestrator.Environment` migrations cleanly before reaching `healthy`. | Developer / QA |
| 2026-04-13 | Live browser verification of `https://stella-ops.local/ops/operations/feeds/mirror/mirror-osv-001` succeeded through the authenticated frontdoor. The page no longer renders the `Sync Error` banner, and the browser-backed API call to `/api/v1/concelier/mirrors/mirror-osv-001` returned `200` with `feedType=osv`, `syncStatus=synced`, and `errorMessage=null`. | Developer / QA |

## Decisions & Risks
- Decision: the retest follows the user-specified order exactly: backend unit, backend E2E, frontend unit, frontend E2E.
@@ -172,6 +175,11 @@ Completion criteria:
- Decision: startup schema migration hosts now register as explicit `IHostedService` singletons instead of `AddHostedService(...)` so one service can own and auto-migrate multiple PostgreSQL schemas without the second host being silently deduplicated.
- Risk: the local browser trust chain is still not accepted by the ad hoc Playwright CLI browser session in this terminal, so the final UI verification for `/ops/scripts` used the repo’s existing `live-frontdoor-auth.mjs` harness with `ignoreHTTPSErrors: true` instead of the Playwright MCP bridge.
- Decision: the active Concelier feed-mirror surfaces now use real backend persistence for snapshot operations, retention settings, bundle creation/download/import, and version locks instead of placeholder `501 Not Implemented` handlers or in-memory catalogs behind the live UI routes.
- Decision: Concelier's embedded Release Orchestrator topology runtime now binds its Postgres-backed stores explicitly through `ConcelierTopologyIdentityAccessor` rather than an unbound `Func<Guid>` DI primitive, because the latter only failed once the live service was republished and the deletion worker activated on startup.
- Decision: `src/ReleaseOrchestrator/__Libraries/StellaOps.ReleaseOrchestrator.Environment/Migrations/001_regions_and_infra_bindings.sql` now enforces the infrastructure-binding scope uniqueness rule with a named unique index, not an invalid table-level `UNIQUE` constraint containing `COALESCE(...)`.
- Risk: the Concelier WebService test project still has a pre-existing auth-contract failure (`ListEndpoints_WithoutTenantHeader_ReturnsBadRequest` currently returns `200` instead of `400`), and the repo's Microsoft.Testing.Platform setup still rejects legacy `dotnet test --filter` targeted evidence (`MTP0001`). Focused verification for this fix path therefore used the xUnit v3 class runner plus live frontdoor browser checks.
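
The difference between a table-level `UNIQUE` constraint and a named unique index over a `COALESCE(...)` expression can be illustrated with a minimal sketch. It uses SQLite through Python's `sqlite3` module purely as a self-contained stand-in for PostgreSQL; both engines reject expressions inside a table-level `UNIQUE` constraint and accept them in a unique index. The table and column names (`bindings`, `scope_id`, `region_id`) and the index name are hypothetical and are not taken from `001_regions_and_infra_bindings.sql`.

```python
import sqlite3


def table_level_unique_rejected(conn):
    """Try a table-level UNIQUE constraint that contains COALESCE(...)."""
    try:
        conn.execute(
            "CREATE TABLE bad_bindings ("
            " scope_id TEXT NOT NULL, region_id TEXT,"
            " UNIQUE (scope_id, COALESCE(region_id, '')))"
        )
    except sqlite3.Error:
        return True
    return False


def duplicate_rejected_by_index(conn):
    """Enforce the same rule with a named unique index over the expression."""
    conn.execute("CREATE TABLE bindings (scope_id TEXT NOT NULL, region_id TEXT)")
    conn.execute(
        "CREATE UNIQUE INDEX ux_bindings_scope"
        " ON bindings (scope_id, COALESCE(region_id, ''))"
    )
    conn.execute("INSERT INTO bindings VALUES ('global', NULL)")
    try:
        # A plain unique index on (scope_id, region_id) would accept this row,
        # because NULLs are treated as distinct; COALESCE closes that gap.
        conn.execute("INSERT INTO bindings VALUES ('global', NULL)")
    except sqlite3.IntegrityError:
        return True
    return False


conn = sqlite3.connect(":memory:")
print("table-level constraint rejected:", table_level_unique_rejected(conn))
print("duplicate rejected by index:", duplicate_rejected_by_index(conn))
```

The index form is what allows a nullable scope column to participate in uniqueness; the actual columns and sentinel value used by the migration may differ from this sketch.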

## Next Checkpoints
- Finish backend unit lane and decide whether failures require code fixes before backend E2E.
- Use the live stack for backend API verification once the backend unit lane is complete.

@@ -62,7 +62,7 @@ Completion criteria:
- [ ] Fake seeded timeout/demo bundle/version-lock/import/offline payloads are removed from the live endpoint path.

### NOMOCK-004 - Verify live feed UI behavior and log remaining blocked runtime in-memory services
Status: TODO
Status: DOING
Dependency: NOMOCK-002
Owners: Developer / QA
Task description:
@@ -78,14 +78,18 @@ Completion criteria:
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-04-10 | Sprint created to remove live production-path stubs, mock providers, and in-memory runtime bindings starting with the active Angular app configuration and Concelier feed-mirror surfaces. | Developer |
| 2026-04-13 | Inventory extended beyond the initial Concelier slice. Confirmed `/ops/operations/feeds/mirror/*` now runs on persisted Concelier state, but found two still-live script-path fictions: Platform still registers `IScriptService` to `InMemoryScriptService` in `src/Platform/StellaOps.Platform.WebService/Program.cs`, and the owning Release Orchestrator `/api/v2/scripts/{id}/check-compatibility` endpoint still returns an unconditional stub success from `src/ReleaseOrchestrator/__Apps/StellaOps.ReleaseOrchestrator.WebApi/Endpoints/ScriptsEndpoints.cs`. Next implementation slice expands to `src/ReleaseOrchestrator/**` for script compatibility truthfulness and to the Platform alias only if still needed after the owning-service fix. | Developer |

## Decisions & Risks
- Decision: this sprint prioritizes live runtime paths the browser can currently reach over test-only mock helpers.
- Decision: unsupported operations must return truthful empty/problem responses rather than seeded demo success/error payloads.
- Decision: after the feed-mirror cleanup, the next highest-value runtime slice is the scripts compatibility path because the browser uses the real `/api/v2/scripts` backend and its compatibility action still reports fabricated success.
- Risk: several modules outside the initial slice still boot with runtime in-memory stores (`Notify`, `Graph`, `Policy`, `Platform`, `Scheduler`, `Scanner`, `BinaryIndex`, `Signals`, `SbomService`, `Signer`, `PacksRegistry`, `AdvisoryAI`). They will need follow-on slices unless a real persistence path already exists and can be wired safely.
- Risk: Platform still contains a production registration of `IScriptService -> InMemoryScriptService`, but the live frontdoor currently routes `/api/v2/scripts` to Release Orchestrator. Removing that alias safely requires checking any direct Platform callers so the owning-service fix lands first.
- Risk: some feed-mirror sub-features appear to have no real persisted backend contract yet, so removing fake data may temporarily surface explicit `501`/empty-state behavior in the UI until the owning backend is implemented.

## Next Checkpoints
- Remove the active Angular VEX Hub mock provider.
- Convert the Concelier feed-mirror endpoints from seeded data to real source/read-model state.
- Replace the Release Orchestrator script compatibility stub with persisted script-aware evaluation.
- Re-test the live feed pages and record the next runtime cleanup slice.