chore: devops compose tweaks, playwright artifacts, sprint log updates

devops/compose: docker-compose.stella-ops.legacy.yml +
docker-compose.stella-services.yml receive small service wiring updates.

Playwright: refreshed auth-state/report fixtures from the latest
integrations + setup-wizard + policy-runtime live runs. Includes a new
playwright-report-integrations/ bundle.

Docs: SPRINT_20260410_001 (runtime no-mocks) significantly expanded with
additional NOMOCK tasks reflecting the Postgres-backed work shipped across
Policy, Graph, Excititor, VexLens, Scanner, VexHub. SPRINT_20260413_004
(UI-only setup bootstrap closure) log updates.

Gitignore: narrow the earlier `output/` rule to `/output/` so the tracked
src/Web/StellaOps.Web/output/playwright fixtures continue to be picked up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 11:16:33 +03:00
parent fc14a59b1f
commit a6a7e0a134
28 changed files with 2937 additions and 762 deletions


@@ -116,6 +116,204 @@ Completion criteria:
- [x] The Angular release-environment client targets `/api/v1/release-orchestrator`, translates real payloads, and submits type-specific target connection config.
- [x] Serial backend/frontend verification passes with concrete evidence.
### NOMOCK-008 - Replace VexLens noise-gating in-memory stores with persisted runtime storage
Status: DONE
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove the live VexLens noise-gating runtime dependency on `InMemorySnapshotStore` and `InMemoryGatingStatisticsStore`.
- Persist raw snapshots, gated snapshots, and gating statistics in the owning VexLens PostgreSQL schema with startup migrations wired through the existing local runtime.
- Make the gate/delta/statistics endpoints truthful by persisting gated snapshots and recorded statistics as part of the real backend path.
Completion criteria:
- [x] `ISnapshotStore` and `IGatingStatisticsStore` resolve to PostgreSQL-backed implementations in the VexLens web runtime.
- [x] The `vexlens` schema auto-migrates the new noise-gating tables on startup.
- [x] Gating operations persist the gated snapshot and statistics so later delta/statistics reads no longer depend on process memory.
- [x] Focused backend tests cover persistence-backed storage and endpoint write-through behavior.
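
The write-through behavior described above can be sketched as follows. This is a minimal illustration, not the VexLens implementation: the `Snapshot` and `GatingStatistics` shapes, the `gateSnapshot` and `createFakeStores` names, and the de-duplication rule are all assumptions, and the Map-backed stores are test doubles standing in for the PostgreSQL-backed ones.

```typescript
// Hypothetical shapes: the real VexLens snapshot and statistics contracts are not shown in this log.
interface Snapshot { id: string; statements: string[]; }
interface GatingStatistics { snapshotId: string; rawCount: number; gatedCount: number; }

interface SnapshotStore {
  getRaw(id: string): Snapshot | undefined;
  saveGated(snapshot: Snapshot): void;
  getGated(id: string): Snapshot | undefined;
}

interface GatingStatisticsStore {
  record(stats: GatingStatistics): void;
  count(): number;
}

// Gate a snapshot and write both outputs through the stores before returning,
// so later delta/statistics reads never depend on this call's local state.
// De-duplication stands in for the real noise-gating rules (assumption).
function gateSnapshot(id: string, snapshots: SnapshotStore, statistics: GatingStatisticsStore): Snapshot {
  const raw = snapshots.getRaw(id);
  if (raw === undefined) {
    throw new Error(`raw snapshot ${id} not found`);
  }
  const gated: Snapshot = { id, statements: Array.from(new Set(raw.statements)) };
  snapshots.saveGated(gated);
  statistics.record({ snapshotId: id, rawCount: raw.statements.length, gatedCount: gated.statements.length });
  return gated;
}

// Test doubles only; in the task above both interfaces are PostgreSQL-backed.
function createFakeStores(raw: Snapshot[]): { snapshots: SnapshotStore; statistics: GatingStatisticsStore } {
  const rawById = new Map<string, Snapshot>();
  raw.forEach((s) => rawById.set(s.id, s));
  const gatedById = new Map<string, Snapshot>();
  const recorded: GatingStatistics[] = [];
  return {
    snapshots: {
      getRaw: (id) => rawById.get(id),
      saveGated: (s) => { gatedById.set(s.id, s); },
      getGated: (id) => gatedById.get(id),
    },
    statistics: {
      record: (s) => { recorded.push(s); },
      count: () => recorded.length,
    },
  };
}
```

The point of the shape is that the endpoint handler has no path that returns a gated result without also persisting it.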
### NOMOCK-009 - Wire Angular noise-gating UI to the real VexLens client
Status: DONE
Dependency: NOMOCK-008
Owners: Developer
Task description:
- Register the production Angular DI bindings for the noise-gating client so triage surfaces use the real `/api/v1/vexlens` backend path instead of an absent optional provider or a mock helper.
- Add a focused regression test that guards the production provider wiring.
Completion criteria:
- [x] `app.config.ts` provides `NOISE_GATING_API_BASE_URL`, `NoiseGatingApiHttpClient`, and `NOISE_GATING_API`.
- [x] The triage runtime resolves the real noise-gating HTTP client without feature-local mocks.
- [x] A focused frontend test guards the provider wiring.
### NOMOCK-010 - Remove VexHub API-key-only runtime auth drift and fake export fallback
Status: DONE
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove the export-path fake-success fallback from VexHub so backend failures surface as truthful `problem+json` errors instead of synthetic empty OpenVEX documents.
- Align the first-party StellaOps bearer-token path and the external API-key path on one canonical VexHub scope contract so the frontdoor, service authorization, and API-key authentication all enforce the same real backend rules.
- Preserve legacy API-key configuration compatibility by normalizing old dot-form VexHub scopes onto the canonical Authority scopes inside the VexHub API-key handler.
Completion criteria:
- [x] VexHub export failures return truthful backend error responses instead of fabricated OpenVEX success payloads.
- [x] First-party Authority bearer tokens with `vexhub:read` or `vexhub:admin` authorize live VexHub routes through the frontdoor.
- [x] Legacy API-key VexHub scope values normalize to the canonical Authority scopes without keeping dual required-scope policies in the live runtime path.
- [x] Focused backend tests and a live frontdoor bearer-auth probe pass with concrete evidence.
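
The legacy-scope normalization can be sketched as below. The two mappings are the ones named in this sprint (`VexHub.Read` / `VexHub.Admin` to `vexhub:read` / `vexhub:admin`); the function names and the exact-match lookup are assumptions, and the real API-key handler is C#, not TypeScript.

```typescript
// Legacy dot-form values and their canonical Authority scopes, as named in this sprint.
const LEGACY_VEXHUB_SCOPES = new Map<string, string>([
  ["VexHub.Read", "vexhub:read"],
  ["VexHub.Admin", "vexhub:admin"],
]);

// Map legacy values to canonical ones, pass everything else through, drop duplicates.
// Exact-match lookup is an assumption; the real handler may also fold case.
function normalizeVexHubScopes(scopes: readonly string[]): string[] {
  const canonical = scopes.map((scope) => LEGACY_VEXHUB_SCOPES.get(scope) ?? scope);
  return Array.from(new Set(canonical));
}

// One required scope per route; no dual legacy/canonical policy is needed
// once the granted scopes have been normalized.
function hasScope(granted: readonly string[], required: string): boolean {
  return normalizeVexHubScopes(granted).indexOf(required) !== -1;
}
```

Normalizing on the way in keeps the required-scope policy single-valued, which matters when an upstream router evaluates required claims as an `AND` set.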
### NOMOCK-011 - Converge the live VEX console onto the real VexHub and VexLens contract
Status: DONE
Dependency: NOMOCK-010
Owners: Developer
Task description:
- Remove the remaining live console dependence on retired VEX mock-era endpoints and compatibility DTOs so the Angular runtime reads statement/search data from the real VexHub API and computes consensus/conflicts through the owning VexLens API.
- Align the browser detail panel with the real backend contract for statement detail, consensus results, and conflict resolution, and add a focused frontend verification lane that covers the active VEX runtime without relying on the repo-wide Angular target.
Completion criteria:
- [x] The Angular VEX runtime uses `GET /api/v1/vex/search`, `GET /api/v1/vex/statement/{id}`, `POST /api/v1/vexlens/consensus`, and `POST /api/v1/vex/conflicts/resolve`.
- [x] VexHub backend search and conflict-resolution endpoints support the live console contract.
- [x] Focused backend/frontend verification passes for the VEX runtime slice.
- [x] Live frontdoor verification confirms the VEX search route reaches the real backend path.
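
For reference, the four routes in the first criterion can be written as small request builders. This is only a sketch: the `VexRequest` type and function names are hypothetical, request bodies are omitted because the DTOs are not shown here, and `limit` is the only query parameter that appears in this log.

```typescript
interface VexRequest { method: "GET" | "POST"; path: string; }

// Statement search is served by VexHub.
function searchStatements(limit: number): VexRequest {
  return { method: "GET", path: `/api/v1/vex/search?limit=${limit}` };
}

// Statement detail is served by VexHub; the id is escaped for use as a path segment.
function getStatement(id: string): VexRequest {
  return { method: "GET", path: `/api/v1/vex/statement/${encodeURIComponent(id)}` };
}

// Consensus is computed by the owning VexLens API.
function computeConsensus(): VexRequest {
  return { method: "POST", path: "/api/v1/vexlens/consensus" };
}

// Conflict resolution goes back to VexHub.
function resolveConflicts(): VexRequest {
  return { method: "POST", path: "/api/v1/vex/conflicts/resolve" };
}
```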
### NOMOCK-012 - Remove Excititor live in-memory VEX stores and demo seed migrations
Status: DONE
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove the live `InMemoryVexProviderStore`, `InMemoryVexConnectorStateRepository`, and `InMemoryVexClaimStore` fallbacks from Excititor so the web host and worker both resolve the real persistence-backed runtime path.
- Add a real PostgreSQL `IVexClaimStore`, wire Excititor startup migrations, and stop embedding archived or demo-seed SQL so fresh local installs converge on truthful persisted state instead of historical demo rows.
Completion criteria:
- [x] Excititor WebService and Worker no longer register runtime in-memory VEX provider, connector-state, or claim stores.
- [x] `StellaOps.Excititor.Persistence` owns a PostgreSQL `IVexClaimStore` and startup migrations for the `vex` schema.
- [x] Archived pre-1.0 SQL and demo seed migrations are excluded from the live Excititor migration assembly.
- [x] Serial verification proves the persistence library and both live hosts build cleanly, and the remaining Excititor persistence-test blocker is logged concretely.
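
Restricting the migration set to active top-level files can be sketched as a path filter. The three migration file names come from this sprint's log; the `Migrations` root, the `_archived` and `seed` folder names, and the `NNN_name.sql` naming rule are assumptions made for illustration.

```typescript
// Active migrations are assumed to be named NNN_description.sql directly under the root.
const ACTIVE_MIGRATION = /^\d{3}_[a-z0-9_]+\.sql$/;

// Keep only files that sit directly under `root` and match the active naming rule.
// Anything in a subfolder is rejected because the pattern does not allow "/".
function selectActiveMigrations(paths: readonly string[], root: string): string[] {
  const prefix = root.endsWith("/") ? root : root + "/";
  return paths
    .filter((path) => path.startsWith(prefix))
    .map((path) => path.slice(prefix.length))
    .filter((relative) => ACTIVE_MIGRATION.test(relative))
    .sort();
}

// Usage with hypothetical archived and seed paths mixed in.
const selected = selectActiveMigrations(
  [
    "Migrations/003_vex_claim_store.sql",
    "Migrations/001_initial_schema.sql",
    "Migrations/_archived/001_pre_1_0.sql",
    "Migrations/seed/demo_seed.sql",
    "Migrations/002_vex_evidence_links.sql",
  ],
  "Migrations",
);
```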
### NOMOCK-013 - Remove Scanner live manifest/proof in-memory repositories from the running WebService
Status: DONE
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove the live `InMemoryScanManifestRepository`, `InMemoryProofBundleRepository`, `TestManifestRepository`, and `TestProofBundleRepository` runtime bindings from `StellaOps.Scanner.WebService` so the manifest/proof and score-replay surfaces resolve the real PostgreSQL-backed storage path.
- Preserve singleton score-replay service lifetimes by using scoped adapters for manifest/proof persistence, then prove the live `/api/v1/scans/{id}/manifest`, `/proofs`, and `/proofs/{rootHash}` routes read persisted rows and survive a `scanner-web` container recreate.
Completion criteria:
- [x] `StellaOps.Scanner.WebService` no longer registers live in-memory or test manifest/proof repositories.
- [x] The running scanner host builds and redeploys with scoped adapters over `PostgresScanManifestRepository` and `PostgresProofBundleRepository`.
- [x] Live API verification against `http://scanner.stella-ops.local/api/v1/scans/{id}/manifest` and `/proofs*` returns persisted PostgreSQL data before and after a `scanner-web` recreate.
- [x] The remaining Scanner targeted-test blocker is logged concretely instead of being treated as implementation-complete verification.
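
The scoped-adapter idea can be sketched independently of the .NET container. The adapter below is safe to hold in a singleton because it keeps only a scope factory and opens a fresh scope per call; all type and member names are hypothetical and only illustrate the lifetime pattern.

```typescript
interface ManifestRepository { getManifest(scanId: string): string | undefined; }

// A scope owns a short-lived repository (for example one bound to a database connection).
interface Scope { repository: ManifestRepository; dispose(): void; }

// Singleton-safe adapter: it holds no scoped state, only the factory.
class ScopedManifestAdapter implements ManifestRepository {
  private readonly createScope: () => Scope;

  constructor(createScope: () => Scope) {
    this.createScope = createScope;
  }

  getManifest(scanId: string): string | undefined {
    const scope = this.createScope();
    try {
      return scope.repository.getManifest(scanId);
    } finally {
      // Always release the scope, including when the lookup throws.
      scope.dispose();
    }
  }
}
```

A singleton replay service can depend on `ManifestRepository` and receive this adapter, so its own lifetime does not change while each read still runs against a fresh scoped repository.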
### NOMOCK-014 - Remove Policy live gate-bypass audit in-memory runtime binding
Status: DONE
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove the live `IGateBypassAuditRepository -> InMemoryGateBypassAuditRepository` binding from `StellaOps.Policy.Engine` and cut the gate-bypass audit path over to the existing PostgreSQL-backed repository using the active tenant context with a deterministic fallback to `public`.
- Repair the Policy host/test blockers uncovered by that cutover: the config-driven persistence registration path was missing `IGateBypassAuditPersistence`, and the live host still had duplicate governance/risk-profile route names that caused first-request `500` failures and masked direct-service verification.
Completion criteria:
- [x] `StellaOps.Policy.Engine` resolves `IGateBypassAuditRepository` to `PostgresGateBypassAuditRepository` instead of the in-memory store.
- [x] The config-driven Policy persistence registration path includes `IGateBypassAuditPersistence`.
- [x] Focused Policy engine verification passes for the new registration path and the pre-existing host-route collision is fixed.
- [x] Live frontdoor verification no longer returns `503 Target microservice unavailable` for the Policy gate path after bringing the real `policy-engine` service up.
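
The duplicate route-name problem can be illustrated with a small check. This is a sketch only: the endpoint names and patterns used in the usage lines are hypothetical, and the real fix was applied in the C# host by namespacing the `/api/v1/governance/*` endpoint names.

```typescript
interface Endpoint { name: string; pattern: string; }

// Group endpoints by name and return only the names that are registered more than once.
function findDuplicateEndpointNames(endpoints: readonly Endpoint[]): Map<string, string[]> {
  const byName = new Map<string, string[]>();
  for (const endpoint of endpoints) {
    const patterns = byName.get(endpoint.name) ?? [];
    patterns.push(endpoint.pattern);
    byName.set(endpoint.name, patterns);
  }
  const duplicates = new Map<string, string[]>();
  byName.forEach((patterns, name) => {
    if (patterns.length > 1) {
      duplicates.set(name, patterns);
    }
  });
  return duplicates;
}

// Prefixing the name with its owning surface keeps names unique across route groups.
function namespacedName(surface: string, name: string): string {
  return `${surface}.${name}`;
}

// Hypothetical names: both groups originally claim "ListRiskProfiles".
const before: Endpoint[] = [
  { name: "ListRiskProfiles", pattern: "/api/v1/governance/risk-profiles" },
  { name: "ListRiskProfiles", pattern: "/api/risk/profiles" },
];
const after: Endpoint[] = [
  { name: namespacedName("Governance", "ListRiskProfiles"), pattern: "/api/v1/governance/risk-profiles" },
  { name: "ListRiskProfiles", pattern: "/api/risk/profiles" },
];
```

A check of this kind in a host test surfaces the collision at build time rather than as a `500` on the first request.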
### NOMOCK-015 - Persist Policy live snapshot and ledger-export runtime state
Status: DONE
Dependency: NOMOCK-014
Owners: Developer
Task description:
- Replace the live engine-local `ISnapshotStore` and `ILedgerExportStore` in-memory bindings in `StellaOps.Policy.Engine` with PostgreSQL-backed adapters owned by `StellaOps.Policy.Persistence`, using runtime tables that match the current engine snapshot/export contracts instead of forcing the older generic snapshot entity shape.
- Make the Policy startup migrations safe on reused local volumes so `policy-engine` can converge the full schema, including the new runtime tables, without crashing on pre-existing indexes, triggers, or RLS policies. Then prove the live snapshot route works both directly and through the frontdoor and survives a `policy-engine` recreate.
Completion criteria:
- [x] `StellaOps.Policy.Engine` resolves its live snapshot and ledger-export runtime stores to PostgreSQL-backed adapters instead of process-local in-memory stores.
- [x] `StellaOps.Policy.Persistence` owns startup-migrated runtime tables for `policy.engine_ledger_exports` and `policy.engine_snapshots`.
- [x] `001_initial_schema.sql` is idempotent enough for reused local volumes and no longer crash-loops `policy-engine` on duplicate indexes, triggers, or policies.
- [x] Focused Policy runtime registration/store tests pass, and live direct plus frontdoor snapshot create/list/get verification succeeds with persistence across a `policy-engine` recreate.
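
One common way to make this kind of DDL safe on reused volumes is sketched below as statement builders. The log does not say exactly how `001_initial_schema.sql` was hardened, so treat this as an assumption: PostgreSQL supports `CREATE INDEX IF NOT EXISTS`, while `CREATE TRIGGER` and `CREATE POLICY` have no `IF NOT EXISTS` form and are shown here as drop-then-create. The helper names are hypothetical, and identifiers are interpolated without quoting, so this is only suitable for trusted migration authors.

```typescript
// Index creation supports IF NOT EXISTS directly.
function createIndexIfMissing(name: string, table: string, columns: readonly string[]): string {
  return `CREATE INDEX IF NOT EXISTS ${name} ON ${table} (${columns.join(", ")});`;
}

// CREATE TRIGGER has no IF NOT EXISTS form, so drop first, then create.
// `timing` is e.g. "BEFORE UPDATE"; `action` is e.g. "FOR EACH ROW EXECUTE FUNCTION fn()".
function recreateTrigger(name: string, table: string, timing: string, action: string): string {
  return [
    `DROP TRIGGER IF EXISTS ${name} ON ${table};`,
    `CREATE TRIGGER ${name} ${timing} ON ${table} ${action};`,
  ].join("\n");
}

// CREATE POLICY has no IF NOT EXISTS form either; same drop-then-create shape.
function recreatePolicy(name: string, table: string, clause: string): string {
  return [
    `DROP POLICY IF EXISTS ${name} ON ${table};`,
    `CREATE POLICY ${name} ON ${table} ${clause};`,
  ].join("\n");
}

// The index name is from the crash log above; the table and column are hypothetical.
const indexSql = createIndexIfMissing("idx_recheck_policies_tenant", "policy.recheck_policies", ["tenant_id"]);
```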
### NOMOCK-016 - Remove Graph API live demo/in-memory runtime graph binding
Status: DONE
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove the active Graph API runtime path that decided between persisted rows and demo/in-memory graph data from an early startup snapshot. The live `/graph/query`, `/graph/diff`, and `/graphs*` compatibility surfaces must resolve their runtime repository from final `Postgres:Graph` options so test hosts and local compose instances use the persisted graph when PostgreSQL is configured.
- Add focused Graph API verification that proves the host resolves `IGraphRuntimeRepository` to the Postgres-backed runtime repository when a Graph connection string is configured, and that the persisted row/snapshot endpoints continue to work through the compatibility facade instead of silently falling back to an empty or seeded in-memory graph.
Completion criteria:
- [x] `StellaOps.Graph.Api` resolves `IGraphRuntimeRepository` from final `PostgresOptions` rather than an early startup boolean.
- [x] The live Graph runtime path no longer relies on demo-seeded `InMemoryGraphRepository` data when `Postgres:Graph` is configured.
- [x] Focused Graph API registration, compatibility, and Postgres runtime integration tests pass against the specific Graph API test project.
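
The difference between deciding from an early startup snapshot and deciding from final options can be sketched as follows. The names are hypothetical and the real registration lives in the C# host; the sketch only shows why a value captured at registration time misses configuration applied later (for example by a test host).

```typescript
interface GraphOptions { connectionString?: string; }
type RepositoryKind = "postgres" | "empty-in-memory";

// Postgres when a connection string is configured, otherwise an empty in-memory repository.
function pickRepository(options: GraphOptions): RepositoryKind {
  return options.connectionString ? "postgres" : "empty-in-memory";
}

// Anti-pattern: the decision is frozen when the registration code runs.
function registerFromStartupSnapshot(options: GraphOptions): () => RepositoryKind {
  const kind = pickRepository(options);
  return () => kind;
}

// Fix: read the options each time the repository is resolved.
function registerFromFinalOptions(getOptions: () => GraphOptions): () => RepositoryKind {
  return () => pickRepository(getOptions());
}

// Usage: configuration arrives after registration.
const graphOptions: GraphOptions = {};
const resolveEarly = registerFromStartupSnapshot(graphOptions);
const resolveLate = registerFromFinalOptions(() => graphOptions);
graphOptions.connectionString = "Host=db;Database=graph"; // hypothetical value applied later
```

After the late assignment, `resolveEarly()` still reports the in-memory repository while `resolveLate()` reports Postgres, which is the behavior gap this task closes.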
### NOMOCK-017 - Remove Policy Gateway live delta snapshot in-memory runtime binding
Status: DONE
Dependency: NOMOCK-015
Owners: Developer
Task description:
- Remove the standalone `StellaOps.Policy.Gateway` runtime binding from `ISnapshotStore -> InMemorySnapshotStore` so `/api/v1/policy/deltas/*` uses the same persisted engine snapshot projection as the real Policy runtime.
- Reuse the engine-owned `PersistedKnowledgeSnapshotStore` and `DeltaSnapshotServiceAdapter` in the gateway host instead of keeping a second compatibility-only projection that fabricates empty packages, reachability, VEX, violation, and unknown sets.
- Add a focused host test that proves the gateway resolves the persisted delta runtime path and projects real compatibility inputs from `policy.engine_snapshots`.
Completion criteria:
- [x] `StellaOps.Policy.Gateway` no longer registers `InMemorySnapshotStore` on the live delta path.
- [x] The standalone gateway resolves `StellaOps.Policy.Snapshots.ISnapshotStore` and `StellaOps.Policy.Deltas.ISnapshotService` through the persisted engine projection services.
- [x] A focused gateway host test proves projected packages, reachability, VEX statements, policy violations, and unknowns are populated from a persisted snapshot document.
### NOMOCK-018 - Remove Policy Gateway live gate-bypass audit in-memory runtime binding
Status: DONE
Dependency: NOMOCK-017
Owners: Developer
Task description:
- Remove the standalone `StellaOps.Policy.Gateway` binding from `IGateBypassAuditRepository -> InMemoryGateBypassAuditRepository` so gate-bypass auditing uses the real Policy PostgreSQL persistence path in both Policy hosts.
- Resolve the gateway repository through `IGateBypassAuditPersistence` and the unified `IStellaOpsTenantAccessor`, keeping the same deterministic default-tenant fallback (`public`) used by `StellaOps.Policy.Engine`.
- Add focused gateway host coverage that proves the gateway now resolves `PostgresGateBypassAuditRepository` for both explicit-tenant and default-tenant cases.
Completion criteria:
- [x] `StellaOps.Policy.Gateway` no longer registers `InMemoryGateBypassAuditRepository` on the live host path.
- [x] The standalone gateway resolves `IGateBypassAuditRepository` to `PostgresGateBypassAuditRepository` with tenant-scoped behavior.
- [x] A focused gateway host test proves current-tenant and default-tenant fallback behavior.
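
The default-tenant fallback shared by both Policy hosts can be sketched in a few lines. The `TenantAccessor` shape and function name are hypothetical, and treating a blank tenant as absent is an assumption; the log only states that `public` is used when no tenant is present.

```typescript
const DEFAULT_TENANT = "public";

// Hypothetical stand-in for the unified tenant accessor.
interface TenantAccessor { tenantId: string | undefined; }

// Use the current tenant when one is present, otherwise fall back deterministically.
// Treating whitespace-only values as absent is an assumption.
function resolveAuditTenant(accessor: TenantAccessor): string {
  const tenant = accessor.tenantId === undefined ? "" : accessor.tenantId.trim();
  return tenant.length > 0 ? tenant : DEFAULT_TENANT;
}
```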
### NOMOCK-019 - Remove fake Policy async gate-evaluation queue runtime
Status: DONE
Dependency: NOMOCK-018
Owners: Developer
Task description:
- Remove the fictional `InMemoryGateEvaluationQueue` and `GateEvaluationWorker` runtime path from both `StellaOps.Policy.Gateway` and the merged `StellaOps.Policy.Engine` gateway surface. The old path fabricated "no drift" gate contexts, emitted fake job IDs, and claimed deferred evaluation had been queued even though no persisted scheduler-backed dispatcher or job-status surface existed.
- Keep the truthful unsupported branch when scheduler persistence is absent, but add the real runtime branch when `Postgres:Scheduler` is configured: scheduler startup migrations, persisted queueing through `StellaOps.Scheduler.Persistence`, dispatch/worker execution, and persisted job-status reads from `/api/v1/policy/gate/jobs/{jobId}`.
- Add focused gateway and engine host tests that prove both runtime branches: honest `501` problem responses when the queue is unavailable, and real pending/completed scheduler-backed job lifecycle behavior when the async runtime is registered.
Completion criteria:
- [x] `StellaOps.Policy.Gateway` and `StellaOps.Policy.Engine` no longer register `InMemoryGateEvaluationQueue` or `GateEvaluationWorker`.
- [x] Registry webhook push endpoints return `501` with an explicit problem response when no scheduler-backed async queue is available.
- [x] When `Postgres:Scheduler` is configured, registry webhook push events enqueue persisted async gate-evaluation jobs and expose `/api/v1/policy/gate/jobs/{jobId}` status/results.
- [x] Focused gateway and engine host tests prove both the unsupported runtime binding and the scheduler-backed enqueue/status/dispatch behavior.
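
The two runtime branches can be sketched as one handler. The `501` status is from the criteria above; the `202` success status, the queue interface, and the problem `detail` text are assumptions, and the problem object follows the standard problem-details members (`type`, `title`, `status`, `detail`).

```typescript
interface ProblemDetails { type: string; title: string; status: number; detail: string; }

// Hypothetical queue contract: enqueue returns the persisted job ID.
interface GateJobQueue { enqueue(imageDigest: string): string; }

type PushResponse =
  | { status: 501; problem: ProblemDetails }
  | { status: 202; jobId: string };

// No queue registered: answer honestly instead of inventing a job ID.
// Queue registered: enqueue and hand back the real job ID for status polling.
function handleRegistryPush(imageDigest: string, queue: GateJobQueue | undefined): PushResponse {
  if (queue === undefined) {
    return {
      status: 501,
      problem: {
        type: "about:blank",
        title: "Not Implemented",
        status: 501,
        detail: "Async gate evaluation requires a scheduler-backed queue, which is not configured.",
      },
    };
  }
  return { status: 202, jobId: queue.enqueue(imageDigest) };
}
```

Modeling the response as a union makes it impossible for a caller to read a `jobId` from the unsupported branch.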
### NOMOCK-020 - Auto-bootstrap Policy first-run baseline state from persisted upstream results
Status: DONE
Dependency: NOMOCK-019
Owners: Developer
Task description:
- Remove the remaining manual-seed requirement for first-run Policy gate evaluation. When a tenant has completed persisted Policy orchestration results but no `policy.engine_ledger_exports` or `policy.engine_snapshots` rows yet, the live engine must build the first ledger export and baseline snapshot automatically instead of failing webhook/sync gate evaluation with "snapshot not found".
- Cut the upstream Policy runtime stores (`IOrchestratorJobStore`, `IWorkerResultStore`) from process-local memory over to PostgreSQL-backed adapters so the bootstrap path reads real `policy.orchestrator_jobs` and `policy.worker_results` data after host/container recreates.
- Add focused engine verification for the persisted runtime-store registrations and the zero-snapshot bootstrap behavior, then prove the live webhook path succeeds for a brand-new tenant seeded only with persisted orchestrator/worker rows.
Completion criteria:
- [x] `StellaOps.Policy.Engine` resolves `IOrchestratorJobStore` and `IWorkerResultStore` to PostgreSQL-backed adapters over persisted Policy tables.
- [x] Sync and async gate evaluation automatically build the first ledger export and baseline snapshot when no baseline exists but completed persisted Policy result data is available.
- [x] Focused engine tests prove the persisted runtime-store registrations and the first-run bootstrap behavior without relying on pre-seeded exports/snapshots.
- [x] Live verification succeeds for a brand-new tenant seeded only with persisted `policy.orchestrator_jobs` and `policy.worker_results`, and Postgres shows auto-created `policy.engine_ledger_exports` plus baseline/target `policy.engine_snapshots`.
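
The first-run bootstrap decision can be sketched as below. This is a simplification under stated assumptions: it models only the baseline snapshot (not the ledger export), the record shapes and the `baseline-<tenant>` ID are hypothetical, and raising "snapshot not found" when no completed results exist is an assumed behavior for the remaining failure case.

```typescript
interface WorkerResult { jobId: string; tenant: string; status: "completed" | "failed"; }
interface EngineSnapshot { id: string; tenant: string; sourceJobIds: string[]; }

// Return the tenant's existing snapshot, or build the first one from completed results.
function ensureBaselineSnapshot(
  tenant: string,
  snapshots: EngineSnapshot[],
  results: readonly WorkerResult[],
): EngineSnapshot {
  const existing = snapshots.find((snapshot) => snapshot.tenant === tenant);
  if (existing !== undefined) {
    return existing;
  }
  const completed = results.filter((result) => result.tenant === tenant && result.status === "completed");
  if (completed.length === 0) {
    // Nothing to bootstrap from: keep the failure explicit.
    throw new Error(`snapshot not found for tenant ${tenant}`);
  }
  const baseline: EngineSnapshot = {
    id: `baseline-${tenant}`,
    tenant,
    sourceJobIds: completed.map((result) => result.jobId),
  };
  snapshots.push(baseline); // stands in for a persisted insert
  return baseline;
}
```

Calling the function twice for the same tenant returns the same snapshot, so gate evaluation can invoke it unconditionally.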
### NOMOCK-021 - Make Policy orchestrator submission produce worker results automatically
Status: DONE
Dependency: NOMOCK-020
Owners: Developer
Task description:
- Remove the remaining manual step on the upstream Policy producer path. Submitting `/policy/orchestrator/jobs` must no longer leave real orchestrator rows stranded in `queued` until an operator or test harness separately calls `/policy/worker/run`.
- Add a deterministic background execution path that wakes on startup and on new submissions, leases queued jobs from the configured `IOrchestratorJobStore`, marks them `running`, executes `PolicyWorkerService`, persists `policy.worker_results`, and records terminal `completed` or `failed` status on the orchestrator job.
- Keep the contract boundary explicit: `/policy/orchestrator/jobs` is the persisted producer surface, while `/api/policy/eval/batch` remains stateless and must not backfill orchestrator or worker tables.
Completion criteria:
- [x] Queued Policy orchestrator jobs auto-execute without a separate manual `POST /policy/worker/run`.
- [x] Terminal orchestrator job state persists `completed` or `failed` instead of leaving jobs stuck in `running`.
- [x] Focused engine host coverage proves submit -> poll -> worker-result behavior for the live producer path.
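
The job lifecycle described above (`queued` to `running` to `completed` or `failed`) can be sketched as a single drain pass. This is a synchronous illustration with hypothetical names: the real path is a background service that leases jobs from `IOrchestratorJobStore` and persists `policy.worker_results`, and arrays stand in for both stores here.

```typescript
type JobStatus = "queued" | "running" | "completed" | "failed";
interface Job { id: string; status: JobStatus; }
interface WorkerResult { jobId: string; output: string; }

// Process every queued job once. Each job always ends in a terminal state,
// so nothing is left stuck in "running" when execution throws.
function drainQueuedJobs(jobs: Job[], execute: (job: Job) => string, results: WorkerResult[]): void {
  for (const job of jobs) {
    if (job.status !== "queued") {
      continue; // already leased or finished
    }
    job.status = "running"; // lease
    try {
      results.push({ jobId: job.id, output: execute(job) });
      job.status = "completed";
    } catch {
      job.status = "failed";
    }
  }
}
```

A second pass over the same list is a no-op because no job remains `queued`, which is what makes waking the worker on both startup and new submissions safe.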
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
@@ -131,6 +329,25 @@ Completion criteria:
| 2026-04-13 | Completed live authenticated end-to-end proof for the persisted release-environment detail surface. Through `https://stella-ops.local/releases/environments` the browser created environment `e2e-728048` (`201`, id `0d1a9597-c30c-4a17-891a-9dcdfe1ccffa`), updated settings (`200`), created a target (`201`, id `2ab5b680-1c09-4149-b160-af08a614b19c`), deleted that target (`204`), created a freeze window (`201`, id `2377401c-425d-4277-a02b-ec5605cccf1a`), deleted that freeze window (`204`), and deleted the environment (`204`). The two leftover trial environments from earlier failed runs, `E2E e2e-792313` and `E2E e2e-543565`, were then cleaned up through the same authenticated backend path with `204` responses so the local stack did not retain stale verification data. | Developer |
| 2026-04-14 | Closed an unrelated live frontdoor blocker uncovered during the authenticated dashboard rerun: `/api/v1/vulnerabilities/status` returned `503` because `scanner-web` was missing from the local compose runtime even though the gateway route still pointed at `scanner.stella-ops.local`. Built `stellaops/scanner-web:dev`, started `stellaops-scanner-web`, and verified the path now returns `401` unauthenticated through `https://stella-ops.local/api/v1/vulnerabilities/status`, proving the route is back on the real scanner service instead of failing as an unavailable target. | Developer |
| 2026-04-14 | Replaced the live Release Control deployment seed path with persisted runtime state. `StellaOps.ReleaseOrchestrator.WebApi` now binds `IDeploymentCompatibilityStore` to a PostgreSQL-backed store, startup auto-migrates `release_orchestrator.deployments`, and the store no longer seeds fake `dep-001`..`dep-004` rows on first access. Serial verification passed: `dotnet test src/ReleaseOrchestrator/__Tests/StellaOps.ReleaseOrchestrator.Integration.Tests/StellaOps.ReleaseOrchestrator.Integration.Tests.csproj` `24/24`, `npm run build` in `src/Web/StellaOps.Web`, live API proof that the deployments list starts empty (`200`, `totalCount=0`), `POST /api/v1/release-orchestrator/deployments` creates `dep-4536d81685ac` (`201`), the same deployment survives a `release-orchestrator` container recreate, and the live browser route `/releases/deployments` now mounts the real Angular deployment feature and renders the persisted `checkout-api` deployment card and detail view instead of the old hardcoded `DEP-2026-*` screens. | Developer |
| 2026-04-14 | Expanded the runtime cleanup inventory into VexLens after the VEX Hub and issuer paths were moved to real services. Confirmed the remaining live noise-gating gap: `src/VexLens/StellaOps.VexLens/Extensions/VexLensServiceCollectionExtensions.cs` still registers `ISnapshotStore` and `IGatingStatisticsStore` to in-memory implementations, `src/VexLens/StellaOps.VexLens.WebService/Extensions/VexLensEndpointExtensions.cs` does not persist gated snapshots/statistics after `GateAsync`, and `src/Web/StellaOps.Web/src/app/app.config.ts` does not bind the production noise-gating API client at all. Next implementation slice is `src/VexLens/**` plus the Angular DI binding in `src/Web/StellaOps.Web/**`. | Developer |
| 2026-04-14 | Completed the VexLens noise-gating runtime slice. `ISnapshotStore` and `IGatingStatisticsStore` now resolve to PostgreSQL-backed implementations, startup auto-migrates `vexlens.noise_gate_*`, gating endpoints persist gated snapshots/statistics, and the Angular production runtime binds the real noise-gating client from `app.config.ts`. Serial verification passed: `dotnet test src/VexLens/__Tests/StellaOps.VexLens.WebService.Tests/StellaOps.VexLens.WebService.Tests.csproj` `7/7`, `npx vitest run src/tests/triage/noise-gating-api.providers.spec.ts --config vitest.codex.config.ts` `1/1`, rebuilt `stellaops/vexlens-web:dev`, verified `stellaops-vexlens-web` healthy, seeded `live-gate-001` into the live PostgreSQL schema, then confirmed live `POST http://vexlens.stella-ops.local/api/v1/vexlens/gating/snapshots/live-gate-001/gate` `200`, live `GET http://vexlens.stella-ops.local/api/v1/vexlens/gating/statistics` `200` with `totalSnapshots=1`, and live DB counts `raw|1`, `gated|1`, `stats|1`. | Developer |
| 2026-04-14 | Completed the VexHub runtime auth/export slice. `StellaOps.VexHub.WebService` no longer fabricates empty OpenVEX exports on backend failure, now accepts first-party Authority bearer tokens alongside API keys, and converges both runtime auth paths on canonical Authority scopes by normalizing legacy API-key `VexHub.Read` / `VexHub.Admin` values to `vexhub:read` / `vexhub:admin`. The gateway/frontdoor denial root cause was also confirmed during this slice: the router currently treats extracted required claims as an `AND` set, so dual-scope compatibility policies break live frontdoor auth even when the service-level authorization would accept either scope. Serial verification passed: `dotnet test src/VexHub/__Tests/StellaOps.VexHub.WebService.Tests/StellaOps.VexHub.WebService.Tests.csproj` `15/15`, rebuilt `stellaops/vexhub-web:dev`, verified `stellaops-vexhub-web` healthy, and confirmed authenticated live `GET https://stella-ops.local/api/v1/vex/export` returns `200` with `application/vnd.openvex+json` and a real `@context` document when called with the Authority bearer token from the signed-in frontdoor session. | Developer |
| 2026-04-14 | Converged the live VEX console onto the real VexHub and VexLens contract. Angular now reads search/detail data from `/api/v1/vex/search` and `/api/v1/vex/statement/{id}`, computes consensus through `POST /api/v1/vexlens/consensus`, and resolves conflicts through `/api/v1/vex/conflicts/resolve`; the VEX detail panel was also corrected to request the real consensus-result DTO instead of the retired consensus shape. Serial verification passed: `dotnet test src/VexHub/__Tests/StellaOps.VexHub.WebService.Tests/StellaOps.VexHub.WebService.Tests.csproj` `17/17`, `npm run test:vex` `101/101`, `npm run build` in `src/Web/StellaOps.Web`, refreshed the live `compose_console-dist` bundle, re-authenticated the frontdoor session with `npm run test:e2e:live:auth`, and verified live `/ops/policy/vex/search` reaches the real backend with `GET https://stella-ops.local/api/v1/vex/search?limit=20` `200`. The local dataset had no statement rows during this pass, so live statement-detail click-through could not be exercised against populated data. | Developer |
| 2026-04-14 | Completed the Excititor runtime-persistence slice. `StellaOps.Excititor.WebService` and `StellaOps.Excititor.Worker` no longer register runtime `InMemoryVexProviderStore`, `InMemoryVexConnectorStateRepository`, or `InMemoryVexClaimStore`; `StellaOps.Excititor.Persistence` now wires startup migrations for `vex`, owns a PostgreSQL `IVexClaimStore`, and adds `003_vex_claim_store.sql` to create `vex.claims` while removing historical demo rows from prior local installs. The migration-loader root cause for the long-standing Excititor persistence failure was also corrected by restricting embedded SQL resources to active top-level migrations and deleting the live demo-seed migration. Serial verification passed: `dotnet build src/Concelier/__Libraries/StellaOps.Excititor.Persistence/StellaOps.Excititor.Persistence.csproj` `0 errors`, manifest-resource check returned only `001_initial_schema.sql`, `002_vex_evidence_links.sql`, `003_vex_claim_store.sql`, `dotnet build src/Concelier/StellaOps.Excititor.WebService/StellaOps.Excititor.WebService.csproj` `0 errors`, and `dotnet build src/Concelier/StellaOps.Excititor.Worker/StellaOps.Excititor.Worker.csproj` `0 errors`. Direct disposable-Postgres application of `001`+`002`+`003` succeeded. The remaining blocker is test-harness related: `dotnet test src/Concelier/__Tests/StellaOps.Excititor.Persistence.Tests/StellaOps.Excititor.Persistence.Tests.csproj` now reaches the testhost after the migration-assembly fix, but the full suite stops advancing and a filtered rerun is unreliable because Microsoft.Testing.Platform ignores `VSTestTestCaseFilter` and one stale Excititor testhost held the `TestResults` log file open until it was terminated. | Developer |
| 2026-04-14 | Completed the Scanner manifest/proof runtime slice. `StellaOps.Scanner.WebService` no longer binds live score-replay and manifest/proof retrieval to `InMemoryScanManifestRepository`, `InMemoryProofBundleRepository`, `TestManifestRepository`, or `TestProofBundleRepository`; it now uses scoped adapters over `PostgresScanManifestRepository` and `PostgresProofBundleRepository` so singleton replay services still read persisted rows. Serial verification passed: `dotnet build src/Scanner/StellaOps.Scanner.WebService/StellaOps.Scanner.WebService.csproj` `0 errors`; `docker build -f devops/docker/Dockerfile.hardened.template . --build-arg SDK_IMAGE=mcr.microsoft.com/dotnet/sdk:10.0-noble --build-arg RUNTIME_IMAGE=mcr.microsoft.com/dotnet/aspnet:10.0-noble --build-arg APP_PROJECT=src/Scanner/StellaOps.Scanner.WebService/StellaOps.Scanner.WebService.csproj --build-arg APP_BINARY=StellaOps.Scanner.WebService --build-arg APP_PORT=8444 -t stellaops/scanner-web:dev` completed; and the live `scanner-web` container was recreated twice. Direct PostgreSQL seeding created one real `scanner.scans` / `scanner.scan_manifest` / `scanner.proof_bundle` dataset for scan `11111111-1111-1111-1111-111111111111`; live API proof through `http://scanner.stella-ops.local/api/v1/scans/11111111-1111-1111-1111-111111111111/manifest` and `/proofs` returned that persisted row set before and after the second `scanner-web` recreate. The remaining blocker is test-harness related: a targeted `dotnet test src/Scanner/__Tests/StellaOps.Scanner.WebService.Tests/StellaOps.Scanner.WebService.Tests.csproj -- --filter-class StellaOps.Scanner.WebService.Tests.ManifestEndpointsTests` run reaches the xUnit/MTP testhost but stops advancing after launch, so live API proof is currently stronger evidence than the class-filtered test lane for this slice. | Developer |
| 2026-04-14 | Completed the Policy gate-bypass audit runtime slice. `StellaOps.Policy.Engine` no longer binds `IGateBypassAuditRepository` to `InMemoryGateBypassAuditRepository`; it now creates `PostgresGateBypassAuditRepository` from the real persistence interface and current tenant context, falling back deterministically to `public` only when no tenant is present. The config-driven Policy persistence extension was also fixed to register `IGateBypassAuditPersistence`, and the pre-existing live Policy host failure was repaired by namespacing the `/api/v1/governance/*` endpoint names so they no longer collide with `/api/risk/*` route names. Serial verification passed: `dotnet build src/Policy/StellaOps.Policy.Engine/StellaOps.Policy.Engine.csproj` `0 errors`, `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineGateBypassAuditRegistrationTests` `2/2`, and `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineApiHostTests` `5/5`. Live verification exposed and then cleared a separate runtime blocker: the frontdoor originally returned `503 Target microservice unavailable` for `POST /api/v1/policy/gate/evaluate` because `stellaops-policy-engine` was not running locally. After building `stellaops/policy-engine:dev` and starting the service, the same authenticated browser-backed frontdoor request returned `401 Authentication required` instead of `503`, and direct `GET http://policy-engine.stella-ops.local/healthz` returned `200`. | Developer |
| 2026-04-14 | Completed the Policy snapshot/export runtime slice. `StellaOps.Policy.Engine` now binds `ISnapshotStore` and `ILedgerExportStore` to PostgreSQL-backed adapters over new runtime tables `policy.engine_snapshots` and `policy.engine_ledger_exports`, and `StellaOps.Policy.Persistence` now applies startup migrations for the Policy schema directly on `policy-engine` boot. The first live redeploy exposed a real migration fault on reused local volumes: `001_initial_schema.sql` still used non-idempotent `CREATE INDEX`, `CREATE TRIGGER`, and `CREATE POLICY` statements, so the service crash-looped on `42P07 relation "idx_recheck_policies_tenant" already exists`. That migration was hardened for reused local databases, the image was rebuilt, and `stellaops-policy-engine` returned to `healthy`. Serial verification passed: `dotnet build src/Policy/StellaOps.Policy.Engine/StellaOps.Policy.Engine.csproj` `0 errors`, `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Ledger.PostgresLedgerExportStoreTests` `2/2`, `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Snapshots.PostgresSnapshotStoreTests` `2/2`, and `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineRuntimeStoreRegistrationTests` `2/2`. Live verification also passed: direct authenticated `POST/GET http://policy-engine.stella-ops.local/api/policy/snapshots*` returned `200`, the created snapshot `CAW59134KVADGKWSSH9RQARA54` remained available after a full `policy-engine` recreate, and authenticated frontdoor `POST/GET https://stella-ops.local/api/policy/snapshots*` returned `200` with snapshot `0BKN7YSPAWQM7SVMVQVTK53QKC`. One transient frontdoor `503 No instances available` appeared immediately after the first service restart, but it cleared once router/service state converged and the routed path verified successfully. | Developer |
| 2026-04-14 | Completed the Policy merged-gateway tenant-bridge slice. The merged gateway endpoints in `StellaOps.Policy.Engine` were still using the unified `RequireTenant()` filter from `StellaOps.Auth.ServerIntegration`, but the host only registered the legacy Policy-specific tenant middleware. That mismatch caused authenticated tenant-scoped gateway routes like `POST /api/policy/deltas/compute` to fail before the handler with `500` even though direct baseline selection and delta computation were healthy. `Program.cs` now registers `AddStellaOpsTenantServices()` and runs `UseStellaOpsTenantMiddleware()` alongside the existing Policy tenant context middleware, and the regression suite now asserts that `IStellaOpsTenantAccessor` resolves in the Policy host. Serial verification passed: `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineDeltaApiTests` `1/1` and `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineRuntimeStoreRegistrationTests` with the new unified-tenant assertion. Live verification also passed after rebuilding `stellaops/policy-engine:dev` with `devops/docker/build-all.ps1 -Services policy-engine` and recreating `stellaops-policy-engine`: direct tenant-scoped `POST http://policy-engine.stella-ops.local/api/policy/deltas/compute` with intentionally incomplete JSON now returns `400` instead of the old pre-handler `500`, and unauthenticated frontdoor `POST https://stella-ops.local/api/v1/policy/gate/evaluate` remains a clean `401` rather than a service failure. | Developer |
| 2026-04-14 | Completed the Graph runtime repository slice. `StellaOps.Graph.Api` no longer decides its live runtime graph source from the early `hasPostgres` startup snapshot; `IGraphRuntimeRepository` now resolves from final `Postgres:Graph` options so test hosts and local compose runs use `PostgresGraphRuntimeRepository` whenever Graph persistence is configured, while the no-Postgres runtime fallback is an empty in-memory repository rather than demo-seeded graph data. Focused verification passed against the specific Graph API test project: `dotnet test src/Graph/__Tests/StellaOps.Graph.Api.Tests/StellaOps.Graph.Api.Tests.csproj -- --filter-class StellaOps.Graph.Api.Tests.GraphRuntimeRepositoryRegistrationTests` `2/2`, `dotnet test src/Graph/__Tests/StellaOps.Graph.Api.Tests/StellaOps.Graph.Api.Tests.csproj -- --filter-class StellaOps.Graph.Api.Tests.GraphCompatibilityEndpointsIntegrationTests` `3/3`, and `dotnet test src/Graph/__Tests/StellaOps.Graph.Api.Tests/StellaOps.Graph.Api.Tests.csproj -- --filter-class StellaOps.Graph.Api.Tests.GraphPostgresRuntimeIntegrationTests` `2/2`. The deeper runtime regression also uncovered and corrected two test-level issues: `/graph/query` requires `query` or `filters`, and raw NDJSON assertions against edge IDs must account for JSON escaping of `>` or assert decoded edge fields instead. | Developer |
| 2026-04-15 | Completed the standalone Policy Gateway delta-runtime slice. `StellaOps.Policy.Gateway` no longer binds `ISnapshotStore` to `InMemorySnapshotStore`; it now reuses `StellaOps.Policy.Engine`'s persisted `PersistedKnowledgeSnapshotStore` and `DeltaSnapshotServiceAdapter` so the live compatibility gateway projects real packages, reachability, VEX statements, policy violations, and unknowns from `policy.engine_snapshots` instead of fabricating mostly-empty delta input. Focused verification passed: `dotnet test src/Policy/__Tests/StellaOps.Policy.Gateway.Tests/StellaOps.Policy.Gateway.Tests.csproj -- --filter-class StellaOps.Policy.Gateway.Tests.PolicyGatewayPersistedDeltaRuntimeTests` `1/1`. The first run exposed a host-startup test harness issue because startup migrations expect a real Policy connection string; the focused factory now removes hosted services so the test proves the runtime DI/projection behavior rather than unrelated migration bootstrapping. | Developer |
| 2026-04-15 | Completed the standalone Policy Gateway gate-bypass audit slice. `StellaOps.Policy.Gateway` no longer binds `IGateBypassAuditRepository` to `InMemoryGateBypassAuditRepository`; it now resolves `PostgresGateBypassAuditRepository` from `IGateBypassAuditPersistence` plus the live `IStellaOpsTenantAccessor`, with the same deterministic `public` fallback tenant used by `StellaOps.Policy.Engine`. Focused verification passed in the specific gateway test project: `dotnet test src/Policy/__Tests/StellaOps.Policy.Gateway.Tests/StellaOps.Policy.Gateway.Tests.csproj -- --filter-class StellaOps.Policy.Gateway.Tests.PolicyGatewayPersistedDeltaRuntimeTests` `3/3`, and `dotnet build src/Policy/StellaOps.Policy.Gateway/StellaOps.Policy.Gateway.csproj` completed with `0` errors. The gateway shared test factory was also corrected to provide a dummy `Postgres:Policy:ConnectionString`, which was previously missing and prevented persistence-backed services from resolving in focused host tests. | Developer |
| 2026-04-15 | Completed the Policy async webhook queue truthfulness slice. `StellaOps.Policy.Gateway` and the merged `StellaOps.Policy.Engine` gateway surface no longer register `InMemoryGateEvaluationQueue` or the background `GateEvaluationWorker`; both hosts now resolve `IGateEvaluationQueue` to an explicit unsupported runtime adapter, and registry webhook push endpoints return `501` problem details instead of fabricated `202 Accepted` job IDs when no scheduler-backed dispatcher exists. Serial verification passed: `dotnet build src/Policy/StellaOps.Policy.Gateway/StellaOps.Policy.Gateway.csproj` `0 errors`, `dotnet build src/Policy/StellaOps.Policy.Engine/StellaOps.Policy.Engine.csproj` `0 errors`, `dotnet test src/Policy/__Tests/StellaOps.Policy.Gateway.Tests/StellaOps.Policy.Gateway.Tests.csproj -- --filter-class StellaOps.Policy.Gateway.Tests.RegistryWebhookQueueRuntimeTests` `2/2`, and `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineRegistryWebhookRuntimeTests` `2/2`. The first gateway test run exposed a harness-only startup migration issue because the shared `TestPolicyGatewayFactory` keeps hosted services enabled by default; the focused webhook tests now remove `IHostedService` registrations so they verify the new DI/runtime behavior rather than an unrelated local PostgreSQL dependency. | Developer |
| 2026-04-15 | Completed the real Policy async registry-webhook dispatcher slice. `StellaOps.Policy.Engine` now provides a runtime-selected async gate-evaluation path: when `Postgres:Scheduler` is absent, webhook push handlers still fail honestly with `501`; when it is configured, both `StellaOps.Policy.Engine` and `StellaOps.Policy.Gateway` register the shared scheduler-backed queue runtime, auto-migrate scheduler persistence, enqueue deduplicated `policy.gate-evaluation` jobs, dispatch them through the worker service, and expose persisted status/results from `GET /api/v1/policy/gate/jobs/{jobId}`. Serial verification passed: `dotnet test src/Policy/__Tests/StellaOps.Policy.Gateway.Tests/StellaOps.Policy.Gateway.Tests.csproj --no-restore -- --filter-class StellaOps.Policy.Gateway.Tests.RegistryWebhookQueueRuntimeTests` `4/4`, `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj --no-restore -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineRegistryWebhookRuntimeTests --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineSchedulerWebhookRuntimeTests` `3/3`, and `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj --no-restore -- --filter-class StellaOps.Policy.Engine.Tests.Deltas.PersistedKnowledgeSnapshotStoreTests --filter-class StellaOps.Policy.Engine.Tests.Deltas.DeltaSnapshotServiceAdapterTests` `3/3`. The first delta rerun used the file name instead of the test class name and selected `0` tests under Microsoft.Testing.Platform; the corrected class filters are now recorded above. | Developer |
| 2026-04-15 | Completed live compose proof for the scheduler-backed Policy async webhook path. The active `docker-compose.stella-services.yml` and legacy compose definition now pass `STELLAOPS_POLICY_ENGINE_Postgres__Scheduler__*` into `policy-engine`; after rebuilding `stellaops/policy-engine:dev` and recreating `stellaops-policy-engine`, the host applied `Scheduler.Persistence` startup migrations and remained healthy. Direct live proof against `http://policy-engine.stella-ops.local` with `X-Stella-Tenant: demo-prod` then showed `POST /api/v1/webhooks/registry/generic` returning `202 Accepted` with `Location: /api/v1/policy/gate/jobs/22ad496a-ef16-4a2b-b132-50f990f41d79`, and `GET /api/v1/policy/gate/jobs/22ad496a-ef16-4a2b-b132-50f990f41d79` returned `200` with persisted failed status, retry counters, timestamps, and the truthful runtime error `Target snapshot sha256:bbbb... not found`. This proves the live host is no longer on the old fake queue path. | Developer |
| 2026-04-15 | Completed the Policy artifact-target snapshot runtime slice. `StellaOps.Policy.Engine` now materializes tenant-scoped target snapshots with artifact digest/repository/tag from persisted `policy.engine_ledger_exports` before synchronous or queued gate evaluation instead of treating image digests as ad-hoc snapshot IDs. The live queue worker also needed a follow-up fix after first deployment: `GateTargetSnapshotMaterializer` had an internal constructor, so the scheduler worker could not resolve it through DI until the constructor was made public and the image was rebuilt. Focused verification passed: `dotnet build src/Policy/StellaOps.Policy.Engine/StellaOps.Policy.Engine.csproj --no-restore` `0 errors`, and `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj --no-restore -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineGateTargetSnapshotRuntimeTests --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineSchedulerWebhookRuntimeTests` `2/2`. Live direct verification then seeded one persisted `policy.engine_ledger_exports` row plus one baseline `policy.engine_snapshots` row for tenant `demo-prod`, replayed `POST http://policy-engine.stella-ops.local/api/v1/webhooks/registry/generic`, received `202 Accepted` for job `73a05fdf-4077-44e1-8deb-744695579631`, observed `GET /api/v1/policy/gate/jobs/73a05fdf-4077-44e1-8deb-744695579631` returning `completed` / `succeeded`, and confirmed persisted target snapshot `N0Q038BR9RWMZRD8J7KC7T9DSC` for `demo/api@sha256:bbbb...` in `policy.engine_snapshots`. | Developer |
| 2026-04-15 | Completed the Policy first-run bootstrap slice. `StellaOps.Policy.Engine` now resolves `IOrchestratorJobStore` and `IWorkerResultStore` through PostgreSQL-backed adapters over `policy.orchestrator_jobs` and `policy.worker_results`, and sync/async gate evaluation now auto-builds the first ledger export and baseline snapshot when no baseline exists but completed persisted Policy result data does. Focused verification passed: `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj --no-restore -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineRuntimeStoreRegistrationTests` `6/6`, `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj --no-restore -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineGateTargetSnapshotRuntimeTests` `3/3`, `dotnet build src/Policy/StellaOps.Policy.Engine/StellaOps.Policy.Engine.csproj --no-restore` `0 errors`, and `dotnet build src/Policy/StellaOps.Policy.Gateway/StellaOps.Policy.Gateway.csproj --no-restore` `0 errors`. Live verification rebuilt `stellaops/policy-engine:dev`, recreated `stellaops-policy-engine`, seeded brand-new tenant `bootstrap-live-20260415b` with only one completed `policy.orchestrator_jobs` row plus one completed `policy.worker_results` row, confirmed `policy.engine_ledger_exports=0` and `policy.engine_snapshots=0` before replay, then observed `POST http://policy-engine.stella-ops.local/api/v1/webhooks/registry/generic` return `202 Accepted` for job `9413c45c-621b-42f3-b8d0-967bb1d0bbab` and `GET /api/v1/policy/gate/jobs/9413c45c-621b-42f3-b8d0-967bb1d0bbab` return `completed` / `succeeded`. Postgres then showed `policy.engine_ledger_exports=1` and two `policy.engine_snapshots` rows for the tenant: auto-created baseline snapshot `62WF54K1SZEZXVNGY7TADCJ46M` and target snapshot `25KR52BRM5YWJHGNDES67X2Z5G` for `demo/api@sha256:bbbb...`. | Developer |
| 2026-04-15 | Completed the Policy orchestrator producer runtime slice. `StellaOps.Policy.Engine` now signals a dedicated background host when `/policy/orchestrator/jobs` submits a queued job; the host drains queued work from `IOrchestratorJobStore`, executes it through `PolicyWorkerService`, persists `policy.worker_results`, and records terminal `completed` or `failed` status instead of depending on a separate manual `/policy/worker/run` step. Focused verification passed: `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj --no-restore -m:1 /p:UseSharedCompilation=false -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineRuntimeStoreRegistrationTests` `6/6` and `dotnet test src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj --no-restore -m:1 /p:UseSharedCompilation=false -- --filter-class StellaOps.Policy.Engine.Tests.Integration.PolicyEngineOrchestratorProducerRuntimeTests` `1/1`. | Developer |
| 2026-04-15 | Added an executable live proof harness for the Policy orchestrator producer path in `src/Web/StellaOps.Web/tests/e2e/integrations/policy-orchestrator.e2e.spec.ts` with runner `npm run test:e2e:policy:producer:live`. Live compose verification now passes end-to-end against `http://policy-engine.stella-ops.local`: the harness acquired a real Authority-backed token through the existing frontdoor auth flow, submitted `POST /policy/orchestrator/jobs` for tenant `demo-prod`, observed queued job `6VRYPQBCYP6N5Z2PX27TN0ERJ0`, polled `GET /policy/orchestrator/jobs/6VRYPQBCYP6N5Z2PX27TN0ERJ0` to terminal `completed`, fetched `GET /policy/worker/jobs/6VRYPQBCYP6N5Z2PX27TN0ERJ0`, and recorded proof artifact `src/Web/StellaOps.Web/output/playwright/policy-orchestrator-live-proof.json` with matching `result_hash` `D70BF3B49550F501A69CD357520F8B51812F248C7DEB133305B5ABE2E7554FB8`. | Developer |
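
The runtime-selected webhook behavior recorded in the 2026-04-15 rows above (`501` problem details when `Postgres:Scheduler` is absent, `202 Accepted` with a job `Location` when it is configured) can be modeled as a small decision function. This is a minimal TypeScript sketch, not the C# handler in `StellaOps.Policy.Engine`; `SchedulerOptions`, `WebhookResponse`, `respondToRegistryPush`, and the problem text are hypothetical names chosen for illustration.

```typescript
// Hypothetical model of the runtime-selected async webhook path.
// None of these type or function names come from the StellaOps codebase.
interface SchedulerOptions {
  connectionString?: string; // stands in for Postgres:Scheduler being configured
}

type WebhookResponse =
  | { status: 501; problem: string }
  | { status: 202; location: string };

function respondToRegistryPush(
  scheduler: SchedulerOptions,
  enqueue: () => string, // returns the persisted job ID
): WebhookResponse {
  if (!scheduler.connectionString) {
    // No scheduler persistence: fail honestly instead of inventing a job ID.
    return { status: 501, problem: "Async gate evaluation requires scheduler persistence." };
  }
  const jobId = enqueue();
  return { status: 202, location: `/api/v1/policy/gate/jobs/${jobId}` };
}
```

In this model the no-scheduler branch never calls `enqueue`, so there is no code path that can return a fabricated job ID.
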
## Decisions & Risks
- Decision: this sprint prioritizes live runtime paths the browser can currently reach over test-only mock helpers.
@@ -145,15 +362,45 @@ Completion criteria:
- Decision: the owning Release Orchestrator environment surface now emits the environment-management enums with the standard Web JSON string-enum contract, and the Angular client accepts the owning API's PascalCase and numeric variants so detail/settings/targets/freeze-window flows remain compatible during rollout.
- Decision: live deployment monitoring state now starts empty and becomes real only after `/api/v1/release-orchestrator/deployments` mutations persist to PostgreSQL; seeded compatibility rows are no longer acceptable on the browser path.
- Decision: `/releases/deployments` now reuses the existing Release Orchestrator deployment store/components that call the real deployments API instead of the older standalone Angular stub pages with hardcoded `DEP-2026-*` payloads.
- Risk: several modules outside the initial slice still boot with runtime in-memory stores (`Notify`, `Graph`, `Policy`, `Platform`, `Scheduler`, `Scanner`, `BinaryIndex`, `Signals`, `SbomService`, `Signer`, `PacksRegistry`, `AdvisoryAI`). They will need follow-on slices unless a real persistence path already exists and can be wired safely.
- Decision: the next active runtime slice is VexLens noise-gating because the backend endpoints exist but still rely on process-local storage and the Angular production app configuration never binds the real client.
- Decision: VexLens noise-gating now persists raw snapshots, gated snapshots, and aggregated statistics in PostgreSQL; the live gating/statistics endpoints are no longer backed by process-local memory.
- Decision: VexHub now uses the canonical Authority scopes `vexhub:read` / `vexhub:admin` as its single runtime contract for both service authorization and frontdoor routing; legacy API-key scope values are normalized inside the VexHub API-key handler instead of keeping multi-scope compatibility policies on the live HTTP path.
- Decision: VexHub export endpoints must fail truthfully with `problem+json` on backend generation faults rather than returning fabricated empty OpenVEX payloads.
- Decision: the live VEX console now uses `GET /api/v1/vex/search`, `GET /api/v1/vex/statement/{id}`, `POST /api/v1/vexlens/consensus`, and `POST /api/v1/vex/conflicts/resolve` as its canonical backend contract; the retired `GET /api/v1/vexlens/consensus/{cve}` and `GET /api/v1/vexlens/conflicts/{cve}` routes are not part of the live runtime path.
- Decision: `stellaops-web:test-vex` is the focused frontend verification lane for the VEX runtime slice because the default Angular target intentionally excludes these specs.
- Decision: Excititor runtime hosts now rely exclusively on the persistence-backed registrations from `AddExcititorPersistence`; live in-memory VEX provider, connector-state, and claim stores are no longer acceptable on the browser or worker path.
- Decision: `StellaOps.Policy.Engine` now runs both tenant stacks on the merged gateway surface: the legacy Policy tenant context for internal repositories and the unified StellaOps tenant accessor/middleware required by `RequireTenant()` endpoint filters copied from Policy Gateway. Without that bridge, tenant-scoped merged gateway routes fail before handlers with `500`.
- Decision: Excititor startup migrations now own both schema convergence and cleanup of historical demo VEX rows in local databases; the live migration assembly must embed only active top-level SQL files.
- Decision: Scanner manifest/proof and score-replay retrieval must resolve `scanner.scan_manifest` and `scanner.proof_bundle` through scoped PostgreSQL repositories even when higher-level replay services remain singletons; runtime bindings to `InMemoryScanManifestRepository`, `InMemoryProofBundleRepository`, `TestManifestRepository`, and `TestProofBundleRepository` are no longer acceptable on the live host.
- Decision: Policy gate-bypass auditing now uses the real PostgreSQL-backed `policy.gate_bypass_audit` path with tenant-aware resolution from `ITenantContextAccessor`; the in-memory audit repository is no longer acceptable on the live Policy host.
- Decision: the `/api/v1/governance/*` compatibility surface now uses unique `Governance.*` endpoint names so it can coexist with the main `/api/risk/*` runtime endpoints without crashing the Policy host on first request.
- Decision: the live Policy snapshot/list/get and ledger-export runtime paths now persist to Policy-owned PostgreSQL tables `policy.engine_snapshots` and `policy.engine_ledger_exports`; engine-local in-memory stores are no longer acceptable for those runtime surfaces.
- Decision: `StellaOps.Policy.Gateway` now reuses `StellaOps.Policy.Engine`'s persisted snapshot projection services for delta compatibility, so the standalone gateway no longer fabricates empty compatibility payloads through `InMemorySnapshotStore`.
- Decision: `StellaOps.Policy.Gateway` now resolves gate-bypass auditing through `PostgresGateBypassAuditRepository` using the unified StellaOps tenant accessor, so the standalone gateway no longer diverges from `policy-engine` with an in-memory audit store.
- Decision: registry webhook push endpoints in both Policy hosts now use a truthful runtime-selected async path: `501 problem+json` when scheduler persistence is absent, and `202 Accepted` with persisted scheduler-backed job IDs plus `/api/v1/policy/gate/jobs/{jobId}` status when `Postgres:Scheduler` is configured. The fictional in-memory queue/worker path and fabricated "no drift" gate contexts remain removed from the live runtime.
- Decision: Policy gate evaluation now materializes a persisted target snapshot with `artifact_digest`, `artifact_repository`, and `artifact_tag` from the latest tenant `policy.engine_ledger_exports` document (or the baseline snapshot's export) before delta computation. The live runtime no longer treats an image digest as a synthetic snapshot identifier on either the sync or async path.
- Decision: first-run Policy bootstrap now reads completed persisted orchestration results from `policy.orchestrator_jobs` and `policy.worker_results`, auto-builds the first `policy.engine_ledger_exports` document, and auto-creates the first baseline snapshot when no baseline exists and the request did not specify an explicit baseline reference.
- Decision: `/policy/orchestrator/jobs` is now the owning persisted producer path for upstream Policy execution state. Submitting a job signals `PolicyOrchestratorJobWorkerHost`, which leases queued jobs from `IOrchestratorJobStore`, executes them through `PolicyWorkerService`, writes `policy.worker_results`, and records terminal `completed` or `failed` status. `/api/policy/eval/batch` remains strictly stateless and is not allowed to backfill those tables.
- Decision: the live proof for the Policy orchestrator producer path is now automated through `src/Web/StellaOps.Web/tests/e2e/integrations/policy-orchestrator.e2e.spec.ts` and `npm run test:e2e:policy:producer:live`, reusing the existing frontdoor OIDC auth bootstrap to obtain a bearer token before calling the direct compose host `http://policy-engine.stella-ops.local`.
- Decision: the active local compose definition for `policy-engine` must pass `STELLAOPS_POLICY_ENGINE_Postgres__Scheduler__ConnectionString` and `SchemaName`, otherwise the live host cannot activate the real async queue branch even though the code is present.
- Decision: `Policy.Persistence` startup migrations must remain idempotent on reused local volumes because the local container reset path frequently reuses existing Policy objects without preserved migration state; duplicate-index/trigger/policy failures are not acceptable convergence behavior.
- Decision: Graph runtime repository selection now happens at service resolution from final `Postgres:Graph` options; the live `/graph/query`, `/graph/diff`, and `/graphs*` compatibility surfaces must never depend on the historical demo-seeded graph when Graph persistence is configured.
- Risk: several modules outside the initial slice still boot with runtime in-memory stores (`Notify`, `Policy`, `Platform`, `Scheduler`, `Scanner`, `BinaryIndex`, `Signals`, `SbomService`, `Signer`, `PacksRegistry`, `AdvisoryAI`). They will need follow-on slices unless a real persistence path already exists and can be wired safely.
- Risk: the code-level and project-level checks are green and the frontdoor is back up, but this sprint slice still lacks an authenticated live `/api/v2/scripts` verification after the remote-path change. `stellaops-router-gateway` remains `unhealthy`, so browser-level proof should wait for the router health follow-up.
- Risk: some feed-mirror sub-features appear to have no real persisted backend contract yet, so removing fake data may temporarily surface explicit `501`/empty-state behavior in the UI until the owning backend is implemented.
- Risk: the global Angular `ng test --watch=false --include ...` path is still blocked by unrelated compile failures outside this slice, so focused frontend verification for the release-environment detail flow currently depends on direct Vitest execution instead of the repo-wide Angular test target.
- Risk: after recreating `release-orchestrator`, the router can transiently serve `503 No instances available` until the new instance state converges; one live verification pass required a router restart before the frontdoor resumed routing the environment-management path.
- Risk: the deployment monitoring route is now truthful and persistent, but it is still a compatibility projection over deployment state rather than the full deployment engine and artifact/evidence pipeline. The browser no longer sees fake rows, but deeper deployment execution slices still need follow-on work.
- Risk: `stellaops-vexlens-web` still logs `libgssapi_krb5.so.2` load failures during startup even though the service becomes healthy and the live gating/statistics path works. That native-library gap still needs a follow-up before claiming the container image is fully converged.
- Risk: the frontdoor root cause uncovered during the VexHub slice is still broader than this module. `src/Router/StellaOps.Gateway.WebService/Authorization/AuthorizationMiddleware.cs` currently enforces extracted required claims as a hard `AND` set, so any future service policy that exposes alternative scope values will break at the gateway even if the service-level policy is intentionally `OR`.
- Risk: the live VEX search route now reaches the real backend, but the local dataset had no statement rows during verification, so statement-detail and consensus click-through remain proven by focused frontend tests rather than a populated live browser session.
- Risk: the Excititor persistence-suite verification is still partially blocked by Microsoft.Testing.Platform behavior rather than product logic. The old migration `42601` failure is gone after trimming embedded resources, but the full `StellaOps.Excititor.Persistence.Tests` project still hangs after launching the testhost, and filtered reruns are not trustworthy because MTP ignores `VSTestTestCaseFilter`.
- Risk: the Scanner manifest/proof runtime slice is live and backed by PostgreSQL, but targeted `StellaOps.Scanner.WebService.Tests` class-filtered runs still stop advancing after the xUnit/MTP testhost launches. Until that harness issue is fixed, live API verification is the most reliable evidence for this slice.
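
The gateway authorization risk listed above (required claims enforced as a hard `AND` set while a service policy intends `OR`) comes down to the difference between two predicates. The TypeScript sketch below is illustrative only and does not reproduce `AuthorizationMiddleware.cs`; the function names and the `example:*` scope values are assumptions.

```typescript
// Illustrative only: hard AND over required scopes versus OR over alternatives.
// Function names and scope values are hypothetical.
function satisfiesAll(granted: readonly string[], required: readonly string[]): boolean {
  return required.every((scope) => granted.indexOf(scope) !== -1);
}

function satisfiesAny(granted: readonly string[], alternatives: readonly string[]): boolean {
  return alternatives.some((scope) => granted.indexOf(scope) !== -1);
}

// A caller holding only the read scope, checked against a policy that
// accepts either scope.
const grantedScopes = ["example:read"];
const policyScopes = ["example:read", "example:admin"];

const gatewayStyleResult = satisfiesAll(grantedScopes, policyScopes); // AND semantics
const serviceStyleResult = satisfiesAny(grantedScopes, policyScopes); // OR semantics
```

Under these assumptions the same caller passes the service-style check and fails the gateway-style check, which is the mismatch the risk describes.
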
## Next Checkpoints
- Remove the active Angular VEX Hub mock provider.
- Re-test the live VEX Hub browser surfaces and continue stripping remaining VEX/VEXLens compatibility or seeded runtime paths.
- Continue the next highest-value live cleanup outside the completed Scanner manifest/proof and Graph runtime slices: replace remaining active runtime in-memory stores in feeds, Policy, and Scheduler.
- Convert the Concelier feed-mirror endpoints from seeded data to real source/read-model state.
- Replace the remaining on-disk stub deployment pages under `src/Web/StellaOps.Web/src/app/features/deployments/` with thin wrappers or remove them once no legacy references remain.
- Decide whether the real release-environment management feature should replace the current `/environments/overview` redirect path or continue to coexist with the topology inventory surface.

@@ -99,6 +99,20 @@ Completion criteria:
- [x] Frontend tests cover the probe-succeeds-but-step-is-not-yet-applied case.
- [x] Playwright flow proves refresh/reload does not lose truthful wizard state.
### BOOTSTRAP-006 - Prevent local hostname binding from double-registering explicit Kestrel ports
Status: DONE
Dependency: BOOTSTRAP-001
Owners: Developer / QA
Task description:
- `timeline-web` was the first local service to ship both an explicit `Kestrel:Endpoints` configuration and the shared `.stella-ops.local` binding helper from `StellaOps.Auth.ServerIntegration`.
- The shared helper re-added `ASPNETCORE_URLS` as manual Kestrel listeners even when the application had already declared Kestrel endpoints in configuration, which caused a duplicate bind on `8080` and left `stellaops-timeline-web` in a restart loop during scratch local setup.
- Fix the shared helper so explicit Kestrel endpoint configuration wins, add regression coverage in the server-integration test project, and revalidate the live timeline container from the rebuilt image.
Completion criteria:
- [x] The shared local-binding helper skips `ServerUrls` re-registration when `Kestrel:Endpoints` are already configured.
- [x] Regression tests cover the explicit-Kestrel case and the legacy no-Kestrel case.
- [x] `stellaops-timeline-web` starts healthy from a rebuilt local image after a CLI scratch bootstrap.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
@@ -116,6 +130,10 @@ Completion criteria:
| 2026-04-14 | Closed the remaining web-suite caveat by synchronizing stale security/audit/settings/setup-wizard specs with the current shipped contracts and rerunning the deterministic web batches through the previously failing tail. Batch `27/33` passed with `79/79` tests, batch `28/33` passed with `65/65`, and batches `29-33/33` passed cleanly, leaving the default web batch lane green. | Developer |
| 2026-04-14 | Fixed the last local setup-finalize blocker by converging `platform.environment_settings` from the legacy tenant-scoped bootstrap shape to the installation-scoped schema expected by the truthful setup flow, updating the compose fallback, and adding regression coverage around the migration/runtime compatibility path. | Developer |
| 2026-04-14 | Re-ran the full setup wizard from scratch through `src/Web/StellaOps.Web/scripts/live-setup-wizard-full-bootstrap.mjs`. The refreshed artifact `src/Web/StellaOps.Web/output/playwright/live-setup-wizard-full-bootstrap.json` recorded `failedActionCount=0`, `runtimeIssueCount=0`, and final completion through `crypto-finalize-completed`, while `https://stella-ops.local/healthz` stayed `ready=true`. | Developer |
| 2026-04-15 | Diagnosed the remaining scratch-bootstrap instability as a shared local-binding defect: `StellaOpsLocalHostnameExtensions` re-added `ASPNETCORE_URLS` listeners even when a service already defined `Kestrel:Endpoints`, which made `timeline-web` bind `8080` twice. Fixed the helper, added regression coverage in `StellaOps.Auth.ServerIntegration.Tests`, rebuilt only `stellaops/timeline-web:dev`, and revalidated the container as `healthy` under the CLI-driven local stack. | Developer |
| 2026-04-15 | Re-ran the local stack from a zero-state wipe, then completed the installation bootstrap through `src/Web/StellaOps.Web/scripts/live-setup-wizard-full-bootstrap.mjs` and tenant onboarding through `src/Web/StellaOps.Web/scripts/live-integrations-ui-bootstrap.mjs`. The first integrations pass exposed a Playwright harness race at the provider-selection step, so the harness was hardened to wait for the active wizard heading (`Select .* Provider` or `Connection & Credentials`) before proceeding. | Developer |
| 2026-04-15 | After the harness fix, the refreshed Playwright artifact `src/Web/StellaOps.Web/output/playwright/live-integrations-ui-bootstrap.json` recorded `failedIntegrationCount=0`, `healthyIntegrationCount=16`, and `successfulTestCount=16`. Independent verification through `GET http://127.1.0.42/api/v1/integrations?page=1&pageSize=50` returned `totalCount=16` and `unhealthy=0` for tenant `demo-prod`, while `https://stella-ops.local/healthz` remained `ready=true`. | Developer |
| 2026-04-15 | Fixed a separate post-bootstrap UI truthfulness bug in `src/app/features/integration-hub/integration-hub.component.ts`: the suggested-setup cards rendered `Not started` before the six summary queries resolved, and the Secrets card queried `RepoSource` instead of `SecretsManager`. The hub now shows loading indicators until counts resolve and correctly counts Vault/Consul as Secrets integrations. | Developer |
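
The harness hardening recorded in the log above (wait for the active wizard heading, `Select .* Provider` or `Connection & Credentials`, before branching) can be sketched as a heading classifier that a polling loop would call until the result is no longer pending. The two patterns come from the log entry; `WizardStep`, `classifyWizardHeading`, and the idea of passing the heading text in as a string are assumptions and not the actual code in `live-integrations-ui-bootstrap.mjs`.

```typescript
// Hypothetical sketch of the wizard-step decision used before branching.
type WizardStep = "provider" | "connection" | "pending";

const PROVIDER_HEADING = /Select .* Provider/;
const CONNECTION_HEADING = /Connection & Credentials/;

function classifyWizardHeading(heading: string | null): WizardStep {
  if (heading === null) return "pending"; // heading not rendered yet
  if (PROVIDER_HEADING.test(heading)) return "provider";
  if (CONNECTION_HEADING.test(heading)) return "connection";
  return "pending"; // some other heading: keep waiting
}
```

Treating every unrecognized heading as pending is what makes the step asynchronous: the caller keeps polling instead of assuming the provider step is present immediately after navigation.
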
## Decisions & Risks
- Decision: a truthful UI setup starts only after the control plane is already reachable in the browser. Docker/host/runtime bring-up remains a machine bootstrap concern, not a browser concern.
@@ -123,8 +141,10 @@ Completion criteria:
- Decision: secret material belongs in a secret authority, not in the integration catalog and not in frontend-only state. The UI must talk to a backend secret-staging contract that returns an authref binding.
- Decision: the first shipped Secret Authority writer targets Vault KV v2 only. Other secrets-manager providers fail explicitly with `501 not_implemented` instead of pretending write support exists.
- Decision: installation-scoped wizard progress is now persisted in `platform.setup_sessions`, and only non-sensitive draft values are stored there.
- Decision: the Playwright integrations harness must treat provider selection as an asynchronous wizard state, not as an immediate DOM fact after navigation. The runner now waits for either the provider step or the connection step before branching, which keeps zero-state UI reruns deterministic on slower local stacks.
- Decision: `platform.environment_settings` is installation-scoped in both startup migrations and compose bootstrap fallbacks; local bootstrap must not preseed `SetupComplete` or carry tenant-scoped keys forward.
- Decision: the live UI bootstrap artifact is considered green when the integration catalog converges to `16/16` healthy entries and the per-integration create/test/health checks succeed, even if background assistant/context requests are aborted during route transitions.
- Decision: `TryAddStellaOpsLocalBinding` must defer to explicit `Kestrel:Endpoints` configuration; the helper may add the extra `.stella-ops.local` listeners on ports `80` and `443`, but it must not re-register `ServerUrls` into Kestrel when the application already owns its endpoint list.
- Risk: if the setup wizard continues to mix installation-scoped and tenant-scoped concerns, it will keep drifting into a misleading all-in-one setup surface that cannot be made truthful.
- Risk: adding a secret staging API without strong audit and scope controls would weaken the platform security posture.
- Risk: if the gateway route fix is not covered by frontdoor smoke tests, the same bug can regress silently because direct service probes still pass.
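
The Playwright harness decision above (treat provider selection as an asynchronous wizard state) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `live-integrations-ui-bootstrap.mjs`: `WizardStep`, `classifyHeading`, `waitForWizardStep`, and the injected `readHeading` callback are hypothetical names, and the timeout values are arbitrary. Only the two heading patterns come from the log.

```typescript
// Hypothetical sketch of the wait-then-branch logic; not the real harness code.
type WizardStep = "provider" | "connection";

// Map the currently visible wizard heading to a step, or null if neither
// expected heading is on screen yet.
function classifyHeading(heading: string | null): WizardStep | null {
  if (heading === null) return null;
  if (/Select .* Provider/.test(heading)) return "provider";
  if (heading.includes("Connection & Credentials")) return "connection";
  return null;
}

// Poll the injected heading reader until one of the two steps is active.
// `readHeading` is assumed to return the current heading text from the page.
async function waitForWizardStep(
  readHeading: () => Promise<string | null>,
  timeoutMs: number = 10000, // arbitrary default, not from the source
  pollMs: number = 100, // arbitrary default, not from the source
): Promise<WizardStep> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const step = classifyHeading(await readHeading());
    if (step !== null) return step;
    if (Date.now() >= deadline) {
      throw new Error("wizard heading did not appear before the timeout");
    }
    // Sleep briefly before re-reading the heading.
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
}
```

The runner would branch only on the returned step, so a slow local stack delays the run instead of sending it down the wrong path.
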