Add unit and integration tests for VexCandidateEmitter and SmartDiff repositories

- Implemented comprehensive unit tests for VexCandidateEmitter to validate candidate emission logic based on various scenarios including absent and present APIs, confidence thresholds, and rate limiting.
- Added integration tests for SmartDiff PostgreSQL repositories, covering snapshot storage and retrieval, candidate storage, and material risk change handling.
- Ensured tests validate correct behavior for storing, retrieving, and querying snapshots and candidates, including edge cases and expected outcomes.
2025-12-16 18:44:25 +02:00
parent 2170a58734
commit 3a2100aa78
126 changed files with 15776 additions and 542 deletions
--- a/src/Scheduler/AGENTS.md
+++ b/src/Scheduler/AGENTS.md
@@ -2,7 +2,7 @@

 ## Roles
 - **Scheduler Worker/WebService Engineer**: .NET 10 (preview) across workers, web service, and shared libraries; keep jobs/metrics deterministic and tenant-safe.
- **QA / Reliability**: Adds/maintains unit + integration tests in `__Tests`, covers determinism, job orchestration, and metrics; validates Mongo/Redis/NATS contracts without live cloud deps.
+- **QA / Reliability**: Adds/maintains unit + integration tests in `__Tests`, covers determinism, job orchestration, and metrics; validates PostgreSQL/Redis/NATS contracts without live cloud deps.
 - **Docs/Runbook Touches**: Update `docs/modules/scheduler/**` and `operations/` assets when contracts or operational characteristics change.

 ## Required Reading
--- a/src/Scheduler/StellaOps.Scheduler.WebService/docs/SCHED-WEB-16-103-RUN-APIS.md
+++ b/src/Scheduler/StellaOps.Scheduler.WebService/docs/SCHED-WEB-16-103-RUN-APIS.md
@@ -6,21 +6,21 @@

 | Method | Path | Description | Scopes |
 | ------ | ---- | ----------- | ------ |
-| `GET` | `/api/v1/scheduler/runs` | List runs for the current tenant (filter by schedule, state, createdAfter, cursor). | `scheduler.runs.read` |
-| `GET` | `/api/v1/scheduler/runs/{runId}` | Retrieve run details. | `scheduler.runs.read` |
-| `GET` | `/api/v1/scheduler/runs/{runId}/deltas` | Fetch deterministic delta metadata for the specified run. | `scheduler.runs.read` |
-| `GET` | `/api/v1/scheduler/runs/queue/lag` | Snapshot queue depth per transport/queue for console dashboards. | `scheduler.runs.read` |
-| `GET` | `/api/v1/scheduler/runs/{runId}/stream` | Server-sent events (SSE) stream for live progress, queue lag, and heartbeats. | `scheduler.runs.read` |
-| `POST` | `/api/v1/scheduler/runs` | Create an ad-hoc run bound to an existing schedule. | `scheduler.runs.write` |
-| `POST` | `/api/v1/scheduler/runs/{runId}/cancel` | Transition a run to `cancelled` when still in a non-terminal state. | `scheduler.runs.manage` |
-| `POST` | `/api/v1/scheduler/runs/{runId}/retry` | Clone a terminal run into a new manual retry, preserving provenance. | `scheduler.runs.manage` |
-| `POST` | `/api/v1/scheduler/runs/preview` | Resolve impacted images using the ImpactIndex without enqueuing work. | `scheduler.runs.preview` |
-| `GET` | `/api/v1/scheduler/policies/simulations` | List policy simulations for the current tenant (filters: policyId, status, since, limit). | `policy:simulate` |
-| `GET` | `/api/v1/scheduler/policies/simulations/{simulationId}` | Retrieve simulation status snapshot. | `policy:simulate` |
-| `GET` | `/api/v1/scheduler/policies/simulations/{simulationId}/stream` | SSE stream emitting simulation status, queue lag, and heartbeats. | `policy:simulate` |
-| `POST` | `/api/v1/scheduler/policies/simulations` | Enqueue a policy simulation (mode=`simulate`) with optional SBOM inputs and metadata. | `policy:simulate` |
-| `POST` | `/api/v1/scheduler/policies/simulations/{simulationId}/cancel` | Request cancellation for an in-flight simulation. | `policy:simulate` |
-| `POST` | `/api/v1/scheduler/policies/simulations/{simulationId}/retry` | Clone a terminal simulation into a new run preserving inputs/metadata. | `policy:simulate` |
+| `GET` | `/api/v1/scheduler/runs` | List runs for the current tenant (filter by schedule, state, createdAfter, cursor). | `scheduler.runs.read` |
+| `GET` | `/api/v1/scheduler/runs/{runId}` | Retrieve run details. | `scheduler.runs.read` |
+| `GET` | `/api/v1/scheduler/runs/{runId}/deltas` | Fetch deterministic delta metadata for the specified run. | `scheduler.runs.read` |
+| `GET` | `/api/v1/scheduler/runs/queue/lag` | Snapshot queue depth per transport/queue for console dashboards. | `scheduler.runs.read` |
+| `GET` | `/api/v1/scheduler/runs/{runId}/stream` | Server-sent events (SSE) stream for live progress, queue lag, and heartbeats. | `scheduler.runs.read` |
+| `POST` | `/api/v1/scheduler/runs` | Create an ad-hoc run bound to an existing schedule. | `scheduler.runs.write` |
+| `POST` | `/api/v1/scheduler/runs/{runId}/cancel` | Transition a run to `cancelled` when still in a non-terminal state. | `scheduler.runs.manage` |
+| `POST` | `/api/v1/scheduler/runs/{runId}/retry` | Clone a terminal run into a new manual retry, preserving provenance. | `scheduler.runs.manage` |
+| `POST` | `/api/v1/scheduler/runs/preview` | Resolve impacted images using the ImpactIndex without enqueuing work. | `scheduler.runs.preview` |
+| `GET` | `/api/v1/scheduler/policies/simulations` | List policy simulations for the current tenant (filters: policyId, status, since, limit). | `policy:simulate` |
+| `GET` | `/api/v1/scheduler/policies/simulations/{simulationId}` | Retrieve simulation status snapshot. | `policy:simulate` |
+| `GET` | `/api/v1/scheduler/policies/simulations/{simulationId}/stream` | SSE stream emitting simulation status, queue lag, and heartbeats. | `policy:simulate` |
+| `POST` | `/api/v1/scheduler/policies/simulations` | Enqueue a policy simulation (mode=`simulate`) with optional SBOM inputs and metadata. | `policy:simulate` |
+| `POST` | `/api/v1/scheduler/policies/simulations/{simulationId}/cancel` | Request cancellation for an in-flight simulation. | `policy:simulate` |
+| `POST` | `/api/v1/scheduler/policies/simulations/{simulationId}/retry` | Clone a terminal simulation into a new run preserving inputs/metadata. | `policy:simulate` |

 All endpoints require a tenant context (`X-Tenant-Id`) and the appropriate scheduler scopes. Development mode allows header-based auth; production deployments must rely on Authority-issued tokens (OpTok + DPoP).

@@ -80,12 +80,12 @@ GET /api/v1/scheduler/runs?scheduleId=sch_4f2c7d9e0a2b4c64a0e7b5f9d65c1234&state
 ```

 ```json
-{
-  "runs": [
-    {
-      "schemaVersion": "scheduler.run@1",
-      "id": "run_c7b4e9d2f6a04f8784a40476d8a2f771",
-      "tenantId": "tenant-alpha",
+{
+  "runs": [
+    {
+      "schemaVersion": "scheduler.run@1",
+      "id": "run_c7b4e9d2f6a04f8784a40476d8a2f771",
+      "tenantId": "tenant-alpha",
      "scheduleId": "sch_4f2c7d9e0a2b4c64a0e7b5f9d65c1234",
      "trigger": "manual",
      "state": "planning",
@@ -103,13 +103,13 @@ GET /api/v1/scheduler/runs?scheduleId=sch_4f2c7d9e0a2b4c64a0e7b5f9d65c1234&state
      "reason": {
        "manualReason": "Nightly backfill"
      },
-      "createdAt": "2025-10-26T03:12:45Z"
-    }
-  ]
-}
-```
-
-When additional pages are available the response includes `"nextCursor": "<base64>"`. Clients pass this cursor via `?cursor=` to fetch the next deterministic slice (ordering = `createdAt desc, id desc`).
+      "createdAt": "2025-10-26T03:12:45Z"
+    }
+  ]
+}
+```
+
+When additional pages are available the response includes `"nextCursor": "<base64>"`. Clients pass this cursor via `?cursor=` to fetch the next deterministic slice (ordering = `createdAt desc, id desc`).

 ## Cancel Run

@@ -148,33 +148,33 @@ POST /api/v1/scheduler/runs/run_c7b4e9d2f6a04f8784a40476d8a2f771/cancel

 ## Impact Preview

-`/api/v1/scheduler/runs/preview` resolves impacted images via the ImpactIndex without mutating state. When `scheduleId` is provided the schedule selector is reused; callers may alternatively supply an explicit selector.
-
-## Retry Run
-
-`POST /api/v1/scheduler/runs/{runId}/retry` clones a terminal run into a new manual run with `retryOf` pointing to the original identifier. Retry is scope-gated with `scheduler.runs.manage`; the new run’s `reason.manualReason` gains a `retry-of:<runId>` suffix for provenance.
-
-## Run deltas
-
-`GET /api/v1/scheduler/runs/{runId}/deltas` returns an immutable, deterministically sorted array of delta summaries (`[imageDigest, severity slices, KEV hits, attestations]`).
-
-## Queue lag snapshot
-
-`GET /api/v1/scheduler/runs/queue/lag` exposes queue depth summaries for planner/runner transports. The payload includes `capturedAt`, `totalDepth`, `maxDepth`, and ordered queue entries (transport + queue + depth). Console uses this for backlog dashboards and alert thresholds.
-
-## Live stream (SSE)
-
-`GET /api/v1/scheduler/runs/{runId}/stream` emits server-sent events for:
-
- `initial` — full run snapshot
- `stateChanged` — state/started/finished transitions
- `segmentProgress` — stats updates
- `deltaSummary` — deltas available
- `queueLag` — periodic queue snapshots
- `heartbeat` — uptime keep-alive (default 5s)
- `completed` — terminal summary
-
-The stream is tolerant to clients reconnecting (idempotent payloads, deterministic ordering) and honours tenant scope plus cancellation tokens.
+`/api/v1/scheduler/runs/preview` resolves impacted images via the ImpactIndex without mutating state. When `scheduleId` is provided the schedule selector is reused; callers may alternatively supply an explicit selector.
+
+## Retry Run
+
+`POST /api/v1/scheduler/runs/{runId}/retry` clones a terminal run into a new manual run with `retryOf` pointing to the original identifier. Retry is scope-gated with `scheduler.runs.manage`; the new run’s `reason.manualReason` gains a `retry-of:<runId>` suffix for provenance.
+
+## Run deltas
+
+`GET /api/v1/scheduler/runs/{runId}/deltas` returns an immutable, deterministically sorted array of delta summaries (`[imageDigest, severity slices, KEV hits, attestations]`).
+
+## Queue lag snapshot
+
+`GET /api/v1/scheduler/runs/queue/lag` exposes queue depth summaries for planner/runner transports. The payload includes `capturedAt`, `totalDepth`, `maxDepth`, and ordered queue entries (transport + queue + depth). Console uses this for backlog dashboards and alert thresholds.
+
+## Live stream (SSE)
+
+`GET /api/v1/scheduler/runs/{runId}/stream` emits server-sent events for:
+
+- `initial` — full run snapshot
+- `stateChanged` — state/started/finished transitions
+- `segmentProgress` — stats updates
+- `deltaSummary` — deltas available
+- `queueLag` — periodic queue snapshots
+- `heartbeat` — uptime keep-alive (default 5s)
+- `completed` — terminal summary
+
+The stream is tolerant to clients reconnecting (idempotent payloads, deterministic ordering) and honours tenant scope plus cancellation tokens.

 ```http
 POST /api/v1/scheduler/runs/preview
@@ -216,106 +216,106 @@ POST /api/v1/scheduler/runs/preview

 ### Integration notes

-* Run creation and cancellation produce audit entries under category `scheduler.run` with correlation metadata when provided.
-* The preview endpoint relies on the ImpactIndex stub in development. Production deployments must register the concrete index implementation before use.
-* Planner/worker orchestration tasks will wire run creation to queueing in SCHED-WORKER-16-201/202.
-
-## Policy simulations
-
-The policy simulation APIs mirror the run endpoints but operate on policy-mode jobs (`mode=simulate`) scoped by tenant and RBAC (`policy:simulate`).
-
-### Create simulation
-
-```http
-POST /api/v1/scheduler/policies/simulations
-X-Tenant-Id: tenant-alpha
-Authorization: Bearer <OpTok>
-```
-
-```json
-{
-  "policyId": "P-7",
-  "policyVersion": 4,
-  "priority": "normal",
-  "metadata": {
-    "source": "console.review"
-  },
-  "inputs": {
-    "sbomSet": ["sbom:S-318", "sbom:S-42"],
-    "captureExplain": true
-  }
-}
-```
-
-```json
-HTTP/1.1 201 Created
-Location: /api/v1/scheduler/policies/simulations/run:P-7:20251103T153000Z:e4d1a9b2
-{
-  "simulation": {
-    "schemaVersion": "scheduler.policy-run-status@1",
-    "runId": "run:P-7:20251103T153000Z:e4d1a9b2",
-    "tenantId": "tenant-alpha",
-    "policyId": "P-7",
-    "policyVersion": 4,
-    "mode": "simulate",
-    "status": "queued",
-    "priority": "normal",
-    "queuedAt": "2025-11-03T15:30:00Z",
-    "stats": {
-      "components": 0,
-      "rulesFired": 0,
-      "findingsWritten": 0,
-      "vexOverrides": 0
-    },
-    "inputs": {
-      "sbomSet": ["sbom:S-318", "sbom:S-42"],
-      "captureExplain": true
-    }
-  }
-}
-```
-
-Canonical payload lives in `samples/api/scheduler/policy-simulation-status.json`.
-
-### List and fetch simulations
-
- `GET /api/v1/scheduler/policies/simulations?policyId=P-7&status=queued&limit=25`
- `GET /api/v1/scheduler/policies/simulations/{simulationId}`
-
-The response envelope mirrors `policy-run-status` but uses `simulations` / `simulation` wrappers. All metadata keys are lower-case; retries append `retry-of=<priorRunId>` for provenance.
-
-### Cancel and retry
-
- `POST /api/v1/scheduler/policies/simulations/{simulationId}/cancel`
-  - Marks the job as `cancellationRequested` and surfaces the reason. Worker execution honours this flag before leasing.
- `POST /api/v1/scheduler/policies/simulations/{simulationId}/retry`
-  - Clones a terminal simulation, preserving inputs/metadata and adding `metadata.retry-of` pointing to the original run ID. Returns `409 Conflict` when the simulation is not terminal.
-
-### Live stream (SSE)
-
-`GET /api/v1/scheduler/policies/simulations/{simulationId}/stream` emits:
-
- `retry` — reconnection hint (milliseconds) emitted before events.
- `initial` — current simulation snapshot.
- `status` — status/attempt/stat updates.
- `queueLag` — periodic queue depth summary (shares payload with run streams).
- `heartbeat` — keep-alive ping (default 5s; configurable under `Scheduler:RunStream`).
- `completed` — terminal summary (`succeeded`, `failed`, or `cancelled`).
- `notFound` — emitted if the run record disappears while streaming.
-
-Heartbeats, queue lag summaries, and the reconnection directive are sent immediately after connection so Console clients receive deterministic telemetry when loading a simulation workspace.
-
-### Metrics
-
-```
-GET /api/v1/scheduler/policies/simulations/metrics
-X-Tenant-Id: tenant-alpha
-Authorization: Bearer <OpTok>
-```
-
-Returns queue depth and latency summaries tailored for simulation dashboards and alerting. Response properties align with the metric names exposed via OTEL (`policy_simulation_queue_depth`, `policy_simulation_latency_seconds`). Canonical payload lives at `samples/api/scheduler/policy-simulation-metrics.json`.
-
- `policy_simulation_queue_depth.total` — pending simulation jobs (aggregate of `pending`, `dispatching`, `submitted`).
- `policy_simulation_latency.*` — latency percentiles (seconds) computed from the most recent terminal simulations.
-
-> **Note:** When Mongo storage is not configured the metrics provider is disabled and the endpoint responds with `501 Not Implemented`.
+* Run creation and cancellation produce audit entries under category `scheduler.run` with correlation metadata when provided.
+* The preview endpoint relies on the ImpactIndex stub in development. Production deployments must register the concrete index implementation before use.
+* Planner/worker orchestration tasks will wire run creation to queueing in SCHED-WORKER-16-201/202.
+
+## Policy simulations
+
+The policy simulation APIs mirror the run endpoints but operate on policy-mode jobs (`mode=simulate`) scoped by tenant and RBAC (`policy:simulate`).
+
+### Create simulation
+
+```http
+POST /api/v1/scheduler/policies/simulations
+X-Tenant-Id: tenant-alpha
+Authorization: Bearer <OpTok>
+```
+
+```json
+{
+  "policyId": "P-7",
+  "policyVersion": 4,
+  "priority": "normal",
+  "metadata": {
+    "source": "console.review"
+  },
+  "inputs": {
+    "sbomSet": ["sbom:S-318", "sbom:S-42"],
+    "captureExplain": true
+  }
+}
+```
+
+```json
+HTTP/1.1 201 Created
+Location: /api/v1/scheduler/policies/simulations/run:P-7:20251103T153000Z:e4d1a9b2
+{
+  "simulation": {
+    "schemaVersion": "scheduler.policy-run-status@1",
+    "runId": "run:P-7:20251103T153000Z:e4d1a9b2",
+    "tenantId": "tenant-alpha",
+    "policyId": "P-7",
+    "policyVersion": 4,
+    "mode": "simulate",
+    "status": "queued",
+    "priority": "normal",
+    "queuedAt": "2025-11-03T15:30:00Z",
+    "stats": {
+      "components": 0,
+      "rulesFired": 0,
+      "findingsWritten": 0,
+      "vexOverrides": 0
+    },
+    "inputs": {
+      "sbomSet": ["sbom:S-318", "sbom:S-42"],
+      "captureExplain": true
+    }
+  }
+}
+```
+
+Canonical payload lives in `samples/api/scheduler/policy-simulation-status.json`.
+
+### List and fetch simulations
+
+- `GET /api/v1/scheduler/policies/simulations?policyId=P-7&status=queued&limit=25`
+- `GET /api/v1/scheduler/policies/simulations/{simulationId}`
+
+The response envelope mirrors `policy-run-status` but uses `simulations` / `simulation` wrappers. All metadata keys are lower-case; retries append `retry-of=<priorRunId>` for provenance.
+
+### Cancel and retry
+
+- `POST /api/v1/scheduler/policies/simulations/{simulationId}/cancel`
+  - Marks the job as `cancellationRequested` and surfaces the reason. Worker execution honours this flag before leasing.
+- `POST /api/v1/scheduler/policies/simulations/{simulationId}/retry`
+  - Clones a terminal simulation, preserving inputs/metadata and adding `metadata.retry-of` pointing to the original run ID. Returns `409 Conflict` when the simulation is not terminal.
+
+### Live stream (SSE)
+
+`GET /api/v1/scheduler/policies/simulations/{simulationId}/stream` emits:
+
+- `retry` — reconnection hint (milliseconds) emitted before events.
+- `initial` — current simulation snapshot.
+- `status` — status/attempt/stat updates.
+- `queueLag` — periodic queue depth summary (shares payload with run streams).
+- `heartbeat` — keep-alive ping (default 5s; configurable under `Scheduler:RunStream`).
+- `completed` — terminal summary (`succeeded`, `failed`, or `cancelled`).
+- `notFound` — emitted if the run record disappears while streaming.
+
+Heartbeats, queue lag summaries, and the reconnection directive are sent immediately after connection so Console clients receive deterministic telemetry when loading a simulation workspace.
+
+### Metrics
+
+```
+GET /api/v1/scheduler/policies/simulations/metrics
+X-Tenant-Id: tenant-alpha
+Authorization: Bearer <OpTok>
+```
+
+Returns queue depth and latency summaries tailored for simulation dashboards and alerting. Response properties align with the metric names exposed via OTEL (`policy_simulation_queue_depth`, `policy_simulation_latency_seconds`). Canonical payload lives at `samples/api/scheduler/policy-simulation-metrics.json`.
+
+- `policy_simulation_queue_depth.total` — pending simulation jobs (aggregate of `pending`, `dispatching`, `submitted`).
+- `policy_simulation_latency.*` — latency percentiles (seconds) computed from the most recent terminal simulations.
+
+> **Note:** When PostgreSQL storage is not configured the metrics provider is disabled and the endpoint responds with `501 Not Implemented`.
--- a/src/Scheduler/StellaOps.Scheduler.WebService/docs/SCHED-WEB-27-002-POLICY-SIMULATION-WEBHOOKS.md
+++ b/src/Scheduler/StellaOps.Scheduler.WebService/docs/SCHED-WEB-27-002-POLICY-SIMULATION-WEBHOOKS.md
@@ -8,7 +8,7 @@
 - `GET /api/v1/scheduler/policies/simulations/metrics` (scope: `policy:simulate`)
 - Returns queue depth grouped by status plus latency percentiles derived from the most recent sample window (default 200 terminal runs).
 - Surface area is unchanged from the implementation in Sprint 27 week 1; consumers should continue to rely on the contract in `samples/api/scheduler/policy-simulation-metrics.json`.
- When backing storage is not Mongo the endpoint responds `501 Not Implemented`.
+- When backing storage is not PostgreSQL the endpoint responds `501 Not Implemented`.

 ## 2. Completion webhooks

--- a/src/Scheduler/__Libraries/StellaOps.Scheduler.Models/docs/SCHED-MODELS-16-103-DESIGN.md
+++ b/src/Scheduler/__Libraries/StellaOps.Scheduler.Models/docs/SCHED-MODELS-16-103-DESIGN.md
@@ -2,7 +2,7 @@

 ## Goals
 - Track schema revisions for `Schedule` and `Run` documents so storage upgrades are deterministic across air-gapped installs.
- Provide reusable upgrade helpers that normalize Mongo snapshots (raw BSON → JSON) into the latest DTOs without mutating inputs.
+- Provide reusable upgrade helpers that normalize PostgreSQL snapshots (raw JSONB → JSON) into the latest DTOs without mutating inputs.
 - Formalize the allowed `RunState` graph and surface guard-rail helpers (timestamps, stats monotonicity) for planners/runners.

 ## Non-goals
@@ -17,7 +17,7 @@
  - `scheduler.impact-set@1` (shared envelope used by planners).
 - Expose `EnsureSchedule`, `EnsureRun`, `EnsureImpactSet` helpers mirroring the Notify model pattern to normalize missing/whitespace values.
 - Extend `Schedule`, `Run`, and `ImpactSet` records with an optional `schemaVersion` constructor parameter defaulting through the `Ensure*` helpers. The canonical JSON serializer will list `schemaVersion` first so documents round-trip deterministically.
- Persisted Mongo documents will now always include `schemaVersion`; exporters/backups can rely on this when bundling Offline Kit snapshots.
+- Persisted PostgreSQL documents will now always include `schemaVersion`; exporters/backups can rely on this when bundling Offline Kit snapshots.

 ## Migration Helper Shape
 - Add `SchedulerSchemaMigration` static class with:
@@ -55,8 +55,8 @@
 - Expose small helper to tag `RunReason.ImpactWindowFrom/To` automatically when set by planners (using normalized ISO-8601).

 ## Interaction Points
- **WebService**: call `SchedulerSchemaMigration.UpgradeSchedule` when returning schedules from Mongo, so clients always see the newest DTO regardless of stored version.
- **Storage.Mongo**: wrap DTO round-trips; the migration helper acts during read, and the state machine ensures updates respect transition rules before writing.
+- **WebService**: call `SchedulerSchemaMigration.UpgradeSchedule` when returning schedules from PostgreSQL, so clients always see the newest DTO regardless of stored version.
+- **Storage.Postgres**: wrap DTO round-trips; the migration helper acts during read, and the state machine ensures updates respect transition rules before writing.
 - **Queue/Worker**: use `RunStateMachine.EnsureTransition` to guard planner/runner state updates (replace ad-hoc `with run` clones).
 - **Offline Kit**: embed `schemaVersion` in exported JSON/Trivy artifacts; migrations ensure air-gapped upgrades flow without manual scripts.

@@ -67,20 +67,20 @@
 4. Update modules (Storage, WebService, Worker) to use new helpers; add logging around migrations/transitions.

 ## Test Strategy
- **Migration happy-path**: load sample Mongo fixtures for `schedule@1` and `run@1`, assert `schemaVersion` normalization, deduplicated subscribers, limits defaults. Include snapshots without the version field to exercise defaulting logic.
+- **Migration happy-path**: load sample PostgreSQL fixtures for `schedule@1` and `run@1`, assert `schemaVersion` normalization, deduplicated subscribers, limits defaults. Include snapshots without the version field to exercise defaulting logic.
 - **Legacy upgrade cases**: craft synthetic `schedule@0` / `run@0` JSON fragments (missing new fields, using old enum names) and verify version-specific fixups produce the latest DTO while populating `MigrationResult.Warnings`.
 - **Strict mode behavior**: attempt to upgrade documents with unexpected properties and ensure warnings/throws align with configuration.
 - **Run state transitions**: unit-test `RunStateMachine` for every allowed edge, invalid transitions, and timestamp/error invariants (e.g., `FinishedAt` only set on terminal states). Provide parameterized tests to confirm stats monotonicity enforcement.
 - **Serialization determinism**: round-trip upgraded DTOs via `CanonicalJsonSerializer` to confirm property order includes `schemaVersion` first and produces stable hashes.
 - **Documentation snippets**: extend module README or API docs with example migrations/run-state usage; verify via doc samples test (if available) or include as part of CI doc linting.

-## Open Questions
- Do we need downgrade (`ToVersion`) helpers for Offline Kit exports? (Assumed no for now. Add backlog item if required.)
- Should `ImpactSet` migrations live here or in ImpactIndex module? (Lean towards here because DTO defined in Models; coordinate with ImpactIndex guild if they need specialized upgrades.)
- How do we surface migration warnings to telemetry? Proposal: caller logs `warning` with `MigrationResult.Warnings` immediately after calling helper.
-
-## Status — 2025-10-20
-
- `SchedulerSchemaMigration` now upgrades legacy `@0` schedule/run/impact-set documents to the `@1` schema, defaulting missing counters/arrays and normalizing booleans & severities. Each backfill emits a warning so storage/web callers can log the mutation.
- `RunStateMachine.EnsureTransition` guards timestamp ordering and stats monotonicity; builders and extension helpers are wired into the scheduler worker/web service plans.
- Tests exercising legacy upgrades live in `StellaOps.Scheduler.Models.Tests/SchedulerSchemaMigrationTests.cs`; add new fixtures there when introducing additional schema versions.
+## Open Questions
+- Do we need downgrade (`ToVersion`) helpers for Offline Kit exports? (Assumed no for now. Add backlog item if required.)
+- Should `ImpactSet` migrations live here or in ImpactIndex module? (Lean towards here because DTO defined in Models; coordinate with ImpactIndex guild if they need specialized upgrades.)
+- How do we surface migration warnings to telemetry? Proposal: caller logs `warning` with `MigrationResult.Warnings` immediately after calling helper.
+
+## Status — 2025-10-20
+
+- `SchedulerSchemaMigration` now upgrades legacy `@0` schedule/run/impact-set documents to the `@1` schema, defaulting missing counters/arrays and normalizing booleans & severities. Each backfill emits a warning so storage/web callers can log the mutation.
+- `RunStateMachine.EnsureTransition` guards timestamp ordering and stats monotonicity; builders and extension helpers are wired into the scheduler worker/web service plans.
+- Tests exercising legacy upgrades live in `StellaOps.Scheduler.Models.Tests/SchedulerSchemaMigrationTests.cs`; add new fixtures there when introducing additional schema versions.
--- a/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-16-201-PLANNER.md
+++ b/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-16-201-PLANNER.md
@@ -15,7 +15,7 @@ surface) so we can operate across tenants without bespoke cursors.
 - Delegates resolution to `PlannerExecutionService` which:
  - Pulls the owning `Schedule` and normalises its selector to the run tenant.
  - Invokes `IImpactTargetingService` to resolve impacted digests.
-  - Emits canonical `ImpactSet` snapshots to Mongo for reuse/debugging.
+  - Emits canonical `ImpactSet` snapshots to PostgreSQL for reuse/debugging.
  - Updates run stats/state and projects summaries via `IRunSummaryService`.
  - Enqueues a deterministic `PlannerQueueMessage` to the planner queue when
 impacted images exist; otherwise the run completes immediately.
--- a/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-16-203-RUNNER.md
+++ b/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-16-203-RUNNER.md
@@ -49,6 +49,6 @@ exponential backoff.

 - `AddSchedulerWorker(configuration)` registers impact targeting, planner
  dispatch, runner execution, and the three hosted services. Call it after
-  `AddSchedulerQueues` and `AddSchedulerMongoStorage` when bootstrapping the
+  `AddSchedulerQueues` and `AddSchedulerPostgresStorage` when bootstrapping the
  worker host.
 - Extend execution metrics (Sprint 16-205) before exposing Prometheus counters.
--- a/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-20-301-POLICY-RUNS.md
+++ b/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-20-301-POLICY-RUNS.md
@@ -3,7 +3,7 @@
 _Sprint 20 · Scheduler Worker Guild_

 This milestone introduces the worker-side plumbing required to trigger Policy Engine
-runs from scheduler-managed jobs. The worker now leases policy run jobs from Mongo,
+runs from scheduler-managed jobs. The worker now leases policy run jobs from PostgreSQL,
 submits them to the Policy Engine REST API, and tracks submission state deterministically.

 ## Highlights
@@ -11,8 +11,8 @@ submits them to the Policy Engine REST API, and tracks submission state determin
 - New `PolicyRunJob` DTO (stored in `policy_jobs`) captures run metadata, attempts,
  lease ownership, and cancellation markers. Schema version `scheduler.policy-run-job@1`
  added to `SchedulerSchemaVersions` with canonical serializer coverage.
- Mongo storage gains `policy_jobs` collection with indexes on `{tenantId, status, availableAt}`
-  and `runId` uniqueness for idempotency. Repository `IPolicyRunJobRepository` exposes
+- PostgreSQL storage gains `policy_jobs` table with indexes on `(tenant_id, status, available_at)`
+  and `run_id` uniqueness for idempotency. Repository `IPolicyRunJobRepository` exposes
  leasing and replace semantics guarded by lease owner checks.
 - Worker options now include `Policy` dispatch/API subsections covering lease cadence,
  retry backoff, idempotency headers, and base URL validation.
--- a/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-21-201-GRAPH-BUILD.md
+++ b/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/docs/SCHED-WORKER-21-201-GRAPH-BUILD.md
@@ -2,7 +2,7 @@

 _Sprint 21 · Scheduler Worker Guild_

-The graph build worker leases pending `GraphBuildJob` records from Mongo, invokes
+The graph build worker leases pending `GraphBuildJob` records from PostgreSQL, invokes
 Cartographer to construct graph snapshots, and records terminal status via the
 Scheduler WebService webhook so downstream systems observe completion events.
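
The lease, build, and record-terminal-status flow described in the graph build worker note can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the worker's actual code: the in-memory store stands in for the PostgreSQL-backed repository, and every name here (`InMemoryJobStore`, `try_lease`, `complete`, `run_once`, the `completed`/`failed` status strings, the five-minute lease) is hypothetical. The `build` callback plays the role of the Cartographer call and `notify` the role of the WebService webhook.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, Optional


@dataclass
class GraphBuildJob:
    # Field names are illustrative; they do not mirror the real DTO.
    job_id: str
    status: str = "pending"
    lease_owner: Optional[str] = None
    lease_expires_at: Optional[datetime] = None


class InMemoryJobStore:
    """Hypothetical stand-in for the PostgreSQL-backed job repository."""

    def __init__(self, jobs: Iterable[GraphBuildJob]) -> None:
        self._jobs = {job.job_id: job for job in jobs}

    def try_lease(self, owner: str, now: datetime,
                  lease_for: timedelta) -> Optional[GraphBuildJob]:
        # Scan in job_id order so the choice of job is deterministic.
        for job in sorted(self._jobs.values(), key=lambda j: j.job_id):
            expired = (job.lease_expires_at is not None
                       and job.lease_expires_at <= now)
            if job.status == "pending" and (job.lease_owner is None or expired):
                job.lease_owner = owner
                job.lease_expires_at = now + lease_for
                return job
        return None

    def complete(self, job_id: str, owner: str, status: str) -> bool:
        # Only the current lease owner may write the terminal status.
        job = self._jobs[job_id]
        if job.lease_owner != owner:
            return False
        job.status = status
        job.lease_owner = None
        job.lease_expires_at = None
        return True


def run_once(store: InMemoryJobStore, owner: str,
             build: Callable[[GraphBuildJob], None],
             notify: Callable[[str, str], None],
             now: datetime) -> Optional[str]:
    """Lease one pending job, build it, then report the terminal status."""
    job = store.try_lease(owner, now, timedelta(minutes=5))
    if job is None:
        return None  # nothing to do this tick
    try:
        build(job)  # placeholder for the Cartographer invocation
        status = "completed"
    except Exception:
        status = "failed"
    if store.complete(job.job_id, owner, status):
        notify(job.job_id, status)  # placeholder for the completion webhook
    return status


if __name__ == "__main__":
    store = InMemoryJobStore([GraphBuildJob("job-2"), GraphBuildJob("job-1")])
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    run_once(store, "worker-a", lambda job: None,
             lambda job_id, status: print(job_id, status), now)
```

The owner check in `complete` is the part worth noting: a worker whose lease has been taken over by another worker cannot overwrite that job's status, which is one way to keep status tracking deterministic when leases expire.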