docs: integrations GitLab registry auth + sprint plans

Add GitLab container registry connector docs (WWW-Authenticate Bearer
token exchange, authref config). Add sprint files for container rebuild,
regression retest, and UI no-mocks work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
master
2026-04-10 12:28:59 +03:00
parent 36eaf5e798
commit 39111b35c2
5 changed files with 372 additions and 2 deletions

@@ -0,0 +1,91 @@
# Sprint 20260409-001 -- Local Container Rebuild, Integrations, and Sources
## Topic & Scope
- Rebuild the Stella Ops local container install from a clean compose-owned state so the stack can be reprovisioned deterministically.
- Recreate the supported local integration lane used by the repo docs and E2E coverage: core stack, QA fixtures, and real third-party services.
- Re-enable and verify the advisory source catalog after the rebuild, capturing any blockers or degraded paths encountered during convergence.
- Working directory: `devops/compose`.
- Expected evidence: compose teardown/recreate logs, healthy container status, integration API verification, advisory source check results, and recorded struggles in this sprint.
## Dependencies & Concurrency
- Required docs: `docs/INSTALL_GUIDE.md`, `docs/quickstart.md`, `devops/compose/README.md`, `docs/modules/platform/architecture-overview.md`, `docs/modules/integrations/architecture.md`.
- This is an operator-style sprint; tasks are sequential because the reset wipes the environment that later tasks depend on.
- The rebuild is scoped to Stella-owned Docker compose resources only. Unrelated Docker containers, images, networks, and volumes must not be touched.
- Cross-module edits are allowed only for bootstrap blockers discovered during the rebuild, scoped to `src/Attestor/**`, `src/JobEngine/**`, `src/Integrations/**`, `docs/integrations/**`, and `docs/modules/integrations/**`.
## Documentation Prerequisites
- `docs/INSTALL_GUIDE.md`
- `docs/quickstart.md`
- `devops/compose/README.md`
- `src/Integrations/README.md`
- `src/Concelier/StellaOps.Concelier.WebService/Extensions/SourceManagementEndpointExtensions.cs`
- `src/Integrations/StellaOps.Integrations.WebService/IntegrationEndpoints.cs`
## Delivery Tracker
### LOCAL-REBUILD-001 - Wipe Stella local compose state and bootstrap from scratch
Status: DONE
Dependency: none
Owners: Developer / Ops Integrator
Task description:
- Stop the Stella local compose lanes, remove Stella-owned containers and persistent volumes, recreate the required Docker networks, and bring the documented local stack back with the repo-supported scripts and compose files.
- Preserve repo configuration unless a documented bootstrap precondition is broken. If `.env` or hosts entries require repair, do the minimal corrective action and record it.
Completion criteria:
- [x] Stella-owned compose services are stopped and removed cleanly before rebuild.
- [x] Stella-owned persistent volumes required for a scratch bootstrap are recreated from empty state.
- [x] Core stack is running again with healthy status for required services.
- [x] Any bootstrap deviation from the documented path is logged in `Decisions & Risks`.
### LOCAL-REBUILD-002 - Recreate the local integration provider lane
Status: DONE
Dependency: LOCAL-REBUILD-001
Owners: Developer / Ops Integrator
Task description:
- Start the QA integration fixtures and the real third-party integration compose lane supported by the repo.
- Register the local integrations that the stack can actively exercise after bootstrap, using the live API surface rather than mock-only assumptions.
Completion criteria:
- [x] QA fixtures are running and healthy.
- [x] Real local providers are running for the supported low-idle lane, with optional profiles enabled only when needed.
- [x] Integration catalog entries exist for the local providers that can be verified from this environment.
- [x] Integration test and health endpoints succeed for the registered providers, or failures are logged with concrete cause.
### LOCAL-REBUILD-003 - Enable and verify advisory sources after rebuild
Status: DONE
Dependency: LOCAL-REBUILD-002
Owners: Developer / Ops Integrator
Task description:
- Re-run the advisory source auto-configuration flow on the rebuilt stack, confirm the catalog is populated, and verify the resulting source health state.
- Capture any source families that remain unhealthy or require unavailable credentials/fixtures.
Completion criteria:
- [x] Advisory source catalog is reachable on the rebuilt stack.
- [x] Bulk source check runs to completion.
- [x] Healthy sources are enabled after the check.
- [x] Any unhealthy or skipped sources are recorded with exact failure details.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-04-09 | Sprint created for a Stella-only local wipe, rebuild, integration reprovisioning, and advisory source verification. | Developer |
| 2026-04-09 | Wiped Stella-owned Docker state, recreated `stellaops` / `stellaops_frontdoor`, rebuilt blocked images (`platform`, `scheduler-web`), and bootstrapped the stack with the setup script plus manual network recovery. | Developer |
| 2026-04-09 | Registered the full local provider lane through `integrations-web`, provisioned Vault-backed GitLab credentials, enabled the heavy GitLab registry profile, and converged the live integration catalog to 13/13 active providers. | Developer |
| 2026-04-09 | Ran `POST http://127.1.0.9/api/v1/advisory-sources/check` against the live source catalog and confirmed 74/74 advisory sources healthy and enabled. | Developer |
| 2026-04-09 | Updated local integration docs to document GitLab registry auth (`authref://vault/gitlab#registry-basic`) and the Bearer-challenge registry flow implemented in the Docker registry connector. | Developer |
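
The Bearer-challenge registry flow named in the last log entry can be illustrated with a minimal Python sketch of the standard registry token handshake. This is not the connector's actual code: the function names and the sample realm, service, and scope values are hypothetical, chosen only to show how a `WWW-Authenticate` challenge maps to a token request URL.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header: str) -> dict:
    """Parse a `Bearer k="v",...` WWW-Authenticate header value into a dict."""
    scheme, _, params = header.strip().partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    # Each parameter is key="quoted value"; quoted values may contain commas.
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def build_token_url(challenge: dict) -> str:
    """Build the token endpoint URL from the realm plus service/scope params."""
    query = {k: challenge[k] for k in ("service", "scope") if k in challenge}
    realm = challenge["realm"]
    return f"{realm}?{urlencode(query)}" if query else realm


# Hypothetical challenge, as a registry might return it on a 401 response.
header = (
    'Bearer realm="https://gitlab.example/jwt/auth",'
    'service="container_registry",scope="repository:group/app:pull"'
)
challenge = parse_bearer_challenge(header)
token_url = build_token_url(challenge)
print(token_url)
```

In this kind of flow the client would then call the token URL with its stored credentials and retry the original registry request with `Authorization: Bearer <token>`; how the connector resolves `authref://vault/gitlab#registry-basic` into those credentials is outside this sketch.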
## Decisions & Risks
- Decision: destructive cleanup is limited to Stella-owned Docker compose resources (`stellaops-*` containers, compose-owned volumes, and Stella compose networks) because the user asked to rebuild the local Stella installation, not wipe unrelated Docker state.
- Decision: use the repo-documented compose lanes as the authority for what counts as "all integrations" locally, then register every provider the local stack can actually exercise from this machine.
- Decision: bootstrap blocker fixes were applied in owning modules when the clean rebuild exposed real defects: duplicate publish outputs in Attestor persistence packages, stale script wiring in `scheduler-web`, and missing Bearer-challenge handling in the Docker registry connector.
- Decision: integration docs were updated in [../../devops/compose/README.md](../../devops/compose/README.md), [../integrations/LOCAL_SERVICES.md](../integrations/LOCAL_SERVICES.md), and [../modules/integrations/architecture.md](../modules/integrations/architecture.md) to match the live GitLab registry auth flow used by the rebuilt environment.
- Audit: the user-requested source revalidation used `POST http://127.1.0.9/api/v1/advisory-sources/check` against the runtime catalog exposed by `GET http://127.1.0.9/api/v1/advisory-sources/catalog`; purpose: re-enable and verify all advisory sources after the wipe. Result: 74/74 healthy and enabled.
- Risk: `scripts/setup.ps1` still misparses comments in `hosts.stellaops.local`, reporting ordinary words from comment lines as missing aliases, and it still does not recreate the external `stellaops` network required by the split compose lanes.
- Risk: the local machine could not update `C:\Windows\System32\drivers\etc\hosts` because the session did not have elevation; Docker-network aliases were sufficient for inter-container traffic, but host-based friendly names remain an operator follow-up.
- Risk: all compose files currently share the Docker Compose project name `compose`, so `docker compose ... ps` and `up/down` calls emit cross-file orphan noise and make service-scoped status harder to trust.
- Risk: fresh-volume bootstrap still leaves unrelated core services unhealthy: `router-gateway` rejects duplicate `/platform` routes, `findings-ledger-web` crashes because `findings.ledger_projection_offsets` is missing, `timeline-web` restart-loops in startup migration handling, and `graph-api` / `scheduler-web` remain unhealthy. The requested integration and advisory-source lanes are usable, but a full-stack fresh install is not yet fully converged.
## Next Checkpoints
- Fix the remaining fresh-volume core blockers (`router-gateway`, `findings-ledger-web`, `timeline-web`, `graph-api`, `scheduler-web`) and rerun the bootstrap smoke.
- Repair `scripts/setup.ps1` host-alias parsing and external-network recreation so the documented path works without manual intervention.
- Re-run the clean install after those blockers land and archive this sprint once full-stack convergence is proven.

@@ -0,0 +1,178 @@
# Sprint 20260409-002 -- Local Stack Regression Retest
## Topic & Scope
- Regression-test the rebuilt local Stella Ops environment in the order requested by the user: backend unit tests, backend end-to-end checks, frontend unit tests, then frontend end-to-end checks.
- Keep execution strictly serial: no overlapping test runs across projects or suites.
- Capture failures with concrete project, command, and runtime evidence so blockers can be fixed and retested deterministically.
- Working directory: `.`.
- Expected evidence: per-project test command output, live API/browser verification artifacts, and sprint execution log updates for each lane.
## Dependencies & Concurrency
- Required docs: `docs/qa/feature-checks/FLOW.md`, `docs/code-of-conduct/TESTING_PRACTICES.md`, `devops/compose/README.md`, `docs/modules/platform/architecture-overview.md`.
- Depends on [SPRINT_20260409_001_Platform_local_container_rebuild_integrations_sources.md](SPRINT_20260409_001_Platform_local_container_rebuild_integrations_sources.md) because the retest uses the rebuilt local stack and seeded integration/source catalogs from that sprint.
- Test execution is strictly sequential per the user request. Only one project-level test command may run at a time.
- Cross-module reads and test execution are allowed across `src/**`, `src/Web/**`, and `devops/compose/**`. Code edits are allowed only if a failing test exposes a concrete product defect that must be fixed to continue the retest.
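
The strictly sequential execution rule above can be sketched as a small Python driver. This is an illustration under assumptions, not tooling from the repo: `run_serially`, `failed`, and the two project names are hypothetical, and the demo uses a stand-in runner so it does not need the .NET SDK.

```python
import subprocess
from typing import Callable, Iterable


def run_serially(
    commands: Iterable[list[str]],
    run: Callable[[list[str]], int] = lambda cmd: subprocess.run(cmd).returncode,
) -> list[tuple[list[str], int]]:
    """Run each command to completion before starting the next one."""
    results = []
    for cmd in commands:
        results.append((cmd, run(cmd)))  # blocks until this command exits
    return results


def failed(results: list[tuple[list[str], int]]) -> list[list[str]]:
    """Return the commands that exited with a non-zero code."""
    return [cmd for cmd, code in results if code != 0]


# Demo with a stand-in runner and hypothetical project names.
fake_exit_codes = {"A.Tests.csproj": 0, "B.Tests.csproj": 1}
results = run_serially(
    [["dotnet", "test", "A.Tests.csproj"], ["dotnet", "test", "B.Tests.csproj"]],
    run=lambda cmd: fake_exit_codes[cmd[-1]],
)
print(failed(results))
```

Keeping the exact command list alongside each exit code matches the sprint's requirement to record per-project command evidence.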
## Documentation Prerequisites
- `docs/qa/feature-checks/FLOW.md`
- `docs/code-of-conduct/TESTING_PRACTICES.md`
- `devops/compose/README.md`
- `docs/modules/platform/architecture-overview.md`
## Delivery Tracker
### RETEST-001 - Run backend unit test lane sequentially
Status: DONE
Dependency: none
Owners: QA / Test Automation
Task description:
- Identify the backend test projects that cover the rebuilt local integration and advisory-source paths plus the core services currently known to be unhealthy after fresh bootstrap.
- Run each backend unit/integration-oriented project one at a time, record exact commands and outcomes, and stop to triage any hard failures before advancing to backend E2E checks.
Completion criteria:
- [x] Backend project list and execution order are recorded in the sprint log.
- [x] Each selected backend test project is run individually with exact command evidence.
- [x] Failures are captured with concrete project-level detail and triage notes.
### RETEST-002 - Run backend end-to-end verification against the live stack
Status: DONE
Dependency: RETEST-001
Owners: QA
Task description:
- Exercise the live backend surfaces exposed by the rebuilt local stack, starting with the integration and advisory-source APIs already proven during rebuild and extending into the remaining core services needed for a broader backend regression call.
- Use real HTTP requests and service health evidence from the local environment; do not treat compile/test passes as sufficient.
Completion criteria:
- [x] Live backend endpoints are exercised with fresh requests against the rebuilt environment.
- [x] Responses and any failing services are captured with exact evidence.
- [x] Backend E2E status is recorded as pass/fail/blocker with follow-up notes.
### RETEST-003 - Run frontend unit test lane sequentially
Status: DONE
Dependency: RETEST-002
Owners: QA / Test Automation
Task description:
- Run the frontend unit-test suites one project at a time after backend E2E completes, using the repository-supported Node toolchain already present on this machine.
- Record command output, failures, and any environment issues before moving to browser-based verification.
Completion criteria:
- [x] Frontend unit test projects are executed serially.
- [x] Results are captured with exact commands and pass/fail counts.
- [x] Any failures or skipped areas are recorded with reason.
### RETEST-004 - Run frontend end-to-end verification serially
Status: DONE
Dependency: RETEST-003
Owners: QA
Task description:
- Run browser-based UI verification against the rebuilt local environment after frontend unit tests finish, using the host aliases now present on the machine.
- Validate the key local integration and source-management flows visible through the web UI, and capture failures with enough detail to reproduce.
Completion criteria:
- [x] Browser-based frontend verification runs after unit tests, not before.
- [x] Key UI flows are exercised against the local stack.
- [x] Outcomes and blockers are recorded with concrete evidence.
### RETEST-005 - Restore router frontdoor startup for local browser access
Status: DONE
Dependency: RETEST-004
Owners: Developer / QA
Task description:
- Diagnose and fix the local `router-gateway` startup failure that leaves `https://stella-ops.local/` unavailable.
- Keep the gateway's fail-fast configuration validation intact; remove the duplicate `/platform` route at the correct configuration-loading layer rather than weakening validation.
- Reverify the router with focused gateway tests plus live HTTP/TLS checks against the local host aliases.
Completion criteria:
- [x] The local router no longer crash-loops on duplicate `/platform` routes.
- [x] `https://stella-ops.local/` responds without `ERR_CONNECTION_CLOSED`.
- [x] Focused router tests and live frontdoor checks are recorded in the sprint log.
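
One way a duplicate `/platform` route can appear at the configuration-loading layer is sketched below in Python. It assumes index-keyed array merging, where a JSON array is flattened to keys such as `Routes:0:Path` before the layers are combined. The route lists are hypothetical and the helper names are illustrative only; this is not the gateway's loader.

```python
from collections import Counter


def flatten_routes(routes: list[str]) -> dict[str, str]:
    """Flatten a route array into index-keyed entries, e.g. Routes:0:Path."""
    return {f"Routes:{i}:Path": path for i, path in enumerate(routes)}


def layered(base: list[str], overlay: list[str]) -> list[str]:
    """Overlay wins per key; base entries at indices the overlay lacks survive."""
    merged = {**flatten_routes(base), **flatten_routes(overlay)}
    # Sort by numeric index to rebuild the effective route table.
    ordered = sorted(merged, key=lambda key: int(key.split(":")[1]))
    return [merged[key] for key in ordered]


def duplicate_paths(routes: list[str]) -> list[str]:
    """Return every path that occurs more than once."""
    return [path for path, count in Counter(routes).items() if count > 1]


base = ["/api", "/authority", "/platform"]  # hypothetical baked-in table
local = ["/platform", "/api"]               # hypothetical local table

merged_duplicates = duplicate_paths(layered(base, local))  # loaded as overlay
standalone_duplicates = duplicate_paths(local)             # loaded on its own
print(merged_duplicates, standalone_duplicates)
```

In this model the fail-fast validation stays untouched: the duplicate disappears because the local file is the only source of routes, not because the check is relaxed.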
### RETEST-006 - Restore local admin login convergence after Authority bootstrap race
Status: DONE
Dependency: RETEST-005
Owners: Developer / QA
Task description:
- Diagnose and fix the local login failure for the documented demo admin account (`admin / Admin@Stella2026!`).
- Keep the Authority bootstrap and password-verification paths deterministic; do not paper over the issue by weakening authentication checks.
- Reverify the fix with focused Authority Standard Plugin tests plus a real browser/UI login against `https://stella-ops.local/`.
Completion criteria:
- [x] The Authority bootstrap admin user is created reliably even when PostgreSQL is not ready on the first startup attempt.
- [x] `admin / Admin@Stella2026!` can sign in successfully through the local browser flow.
- [x] Focused tests and live login evidence are recorded in the sprint log.
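
The first completion criterion amounts to retrying the seeding pass until storage becomes reachable. A minimal Python sketch of that idea follows. It is not the Authority plugin's code: the attempt count, delay, exception type, and the names `bootstrap_with_retry` and `seed_admin` are assumptions made for illustration.

```python
import time
from typing import Callable


def bootstrap_with_retry(
    seed: Callable[[], None],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call seed() until it succeeds; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            seed()
            return attempt
        except ConnectionError:
            if attempt == attempts:
                raise  # surface the failure instead of staying silently unseeded
            sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")


# Demo: storage refuses the first two connections, then accepts the third.
calls = {"n": 0}


def seed_admin() -> None:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection refused")


succeeded_on = bootstrap_with_retry(seed_admin, sleep=lambda _: None)
print(succeeded_on)
```

Re-raising after the final attempt keeps the failure visible, rather than leaving the service healthy but without the bootstrap admin account.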
### RETEST-007 - Restore local scripts catalog convergence for `/ops/scripts`
Status: DONE
Dependency: RETEST-006
Owners: Developer / QA
Task description:
- Diagnose and fix the local `/ops/scripts` failure so the scripts catalog loads through the browser against the rebuilt local stack.
- Keep schema ownership deterministic: the service that serves `/api/v2/scripts` must own and auto-migrate the `scripts` schema on startup instead of depending on another module's migrations or manual SQL.
- Reverify the fix with focused Release Orchestrator tests plus live API/browser checks against `https://stella-ops.local/ops/scripts`.
Completion criteria:
- [x] The `scripts` schema converges automatically on fresh local startup under the owning Release Orchestrator service.
- [x] `GET /api/v2/scripts` succeeds through the local gateway without `relation "scripts.scripts" does not exist`.
- [x] The `/ops/scripts` UI loads script data instead of showing the generic load failure banner.
### RETEST-008 - Remove bogus local feed-mirror timeout state for `mirror-osv-001`
Status: DOING
Dependency: RETEST-007
Owners: Developer / QA
Task description:
- Diagnose the `Sync Error / Connection timeout after 30s.` message shown on the local mirror detail page at `/ops/operations/feeds/mirror/mirror-osv-001`.
- Keep local feed-mirror behavior truthful: if the mirror management surface is currently backed by seeded/stubbed data, it must not report a fake runtime timeout that never actually occurred.
- Reverify the fix with focused API/UI checks against the local stack.
Completion criteria:
- [ ] The local OSV mirror detail no longer reports a fabricated timeout error.
- [ ] Backend and frontend seed/mock fixtures stay aligned for the OSV mirror state.
- [ ] Live local verification is recorded in the sprint log.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-04-09 | Sprint created for serial backend/frontend regression retesting of the rebuilt local stack. | QA |
| 2026-04-09 | Backend unit lane selected and ordered for serial execution: `StellaOps.Platform.WebService.Tests`, `StellaOps.Integrations.Tests`, `StellaOps.Integrations.Plugin.Tests`, `StellaOps.Concelier.WebService.Tests`, `StellaOps.Gateway.WebService.Tests`, `StellaOps.Scheduler.Worker.Tests`, `StellaOps.Graph.Api.Tests`, `StellaOps.Findings.Ledger.Tests`, `StellaOps.Timeline.WebService.Tests`. | QA |
| 2026-04-09 | Backend unit lane executed serially. Passed: `StellaOps.Integrations.Plugin.Tests` (18/18), `StellaOps.Scheduler.Worker.Tests` (139/139), `StellaOps.Graph.Api.Tests` (77/77). Failed: `StellaOps.Platform.WebService.Tests` (9 failing assertions around seed/migration/quota flows), `StellaOps.Integrations.Tests` (7 host-boot failures from missing `StellaOps.Audit.Emission`), `StellaOps.Gateway.WebService.Tests` (1 readiness regression), `StellaOps.Findings.Ledger.Tests` (5 contract failures due to PostgreSQL connectivity), `StellaOps.Timeline.WebService.Tests` (17 failures due to PostgreSQL startup migration connectivity). `StellaOps.Concelier.WebService.Tests` was aborted after a non-advancing run following host startup, and is currently treated as a hang-class blocker pending deeper triage. | QA |
| 2026-04-09 | Backend E2E over host aliases completed. Healthy surfaces: `platform` (`/healthz`, `/api/v1/platform/health/summary`), `integrations` (`providers` 17, catalog 13/13 active), and Concelier static source-management (`POST /api/v1/advisory-sources/check` => `74/74` healthy). Blockers: `router` unreachable (`stellaops-router-gateway` unhealthy), `scheduler` connection closed (`stellaops-scheduler-web` unhealthy), `graph` unreachable while `stellaops-graph-api` restart-loops, `findings` unreachable while `stellaops-findings-ledger-web` restart-loops, and `timeline` unreachable while `stellaops-timeline-web` restart-loops. Additional defect: Concelier UI read-model endpoints (`GET /api/v1/advisory-sources`, `/summary`) only report 3 sources (2 healthy, 1 stale), which diverges from the 74-source static catalog/check surface. | QA |
| 2026-04-09 | Frontend unit lane executed serially. `npm test -- --watch=false` failed in batch 1 before test execution because the Angular build is broken (`AuditStatsSummary.byModule` missing new module keys, missing `toggleExpand`, missing scheduler component imports, and implicit-`any` callbacks in scheduler spec files). `npm run test:topology` also failed at compile time with a large missing-file/type-program regression across route specs and lazy-loaded feature components. `npm run test:active-surfaces` passed (`7` files / `25` tests). | QA |
| 2026-04-09 | Frontend E2E over the live frontdoor is currently blocked at entry. Both `npm run test:e2e:live:auth` and `npm run test:e2e:live:changed-surfaces` failed on `page.goto('https://stella-ops.local/welcome')` with `net::ERR_CONNECTION_CLOSED`; plain HTTP `http://stella-ops.local` also returned an empty reply, matching the unhealthy router/frontdoor state. | QA |
| 2026-04-09 | Expanded serial test sweep requested. Second-pass backend order selected: `StellaOps.Authority.Core.Tests`, `StellaOps.Auth.ServerIntegration.Tests`, `StellaOps.Policy.Engine.Tests`, `StellaOps.Policy.Scoring.Tests`, `StellaOps.Scanner.Core.Tests`, `StellaOps.Scanner.WebService.Tests`, `StellaOps.ReleaseOrchestrator.Workflow.Tests`, `StellaOps.ReleaseOrchestrator.Integration.Tests`, `StellaOps.EvidenceLocker.Tests`, `StellaOps.BinaryIndex.WebService.Tests`, `StellaOps.Doctor.WebService.Tests`, `StellaOps.ReachGraph.WebService.Tests`, `StellaOps.Notify.WebService.Tests`, `StellaOps.VexHub.WebService.Tests`, `StellaOps.Unknowns.WebService.Tests`, followed by repo-level integration/E2E projects as time permits. | QA |
| 2026-04-09 | Repository test surface enumerated for scope control: `rg --files src -g "*Tests.csproj"` returned `503` test projects under `src`, confirming the retest remains a sampled regression sweep rather than an exhaustive full-repo run. | QA |
| 2026-04-09 | Second-pass backend sweep executed serially. Passed: `StellaOps.Authority.Core.Tests` (46/46), `StellaOps.Auth.ServerIntegration.Tests` (30/30), `StellaOps.Policy.Scoring.Tests` (263/263), `StellaOps.Scanner.Core.Tests` (339/339), `StellaOps.ReleaseOrchestrator.Workflow.Tests` (488/488), `StellaOps.ReleaseOrchestrator.Integration.Tests` (12/12), `StellaOps.EvidenceLocker.Tests` (132/132), `StellaOps.BinaryIndex.WebService.Tests` (54/54), `StellaOps.Doctor.WebService.Tests` (35/35), `StellaOps.ReachGraph.WebService.Tests` (26/26), `StellaOps.Unknowns.WebService.Tests` (10/10). Failed: `StellaOps.Policy.Engine.Tests` (4 failures from duplicate endpoint name `ListRiskProfiles` in host boot), `StellaOps.Notify.WebService.Tests` (4 endpoint-contract regressions: readiness `400`, normalize endpoints `401`), `StellaOps.VexHub.WebService.Tests` (compile break: `InMemoryVexSourceRepository` missing `UpdateFailureTrackingAsync`). `StellaOps.Scanner.WebService.Tests` was aborted after a non-advancing run entered execution and left only an empty `TestResults` log, so it is currently tracked as a stall-class blocker. | QA |
| 2026-04-09 | Third-pass serial backend sweep focused on root-cause isolation. Passed: `StellaOps.Notify.Core.Tests` (59/59), `StellaOps.Notify.Engine.Tests` (33/33), `StellaOps.Notify.Persistence.Tests` (109/109), `StellaOps.Scheduler.Persistence.Tests` (95/95), `StellaOps.Workflow.WebService.Tests` (4/4), `StellaOps.VexHub.Core.Tests` (1/1), `StellaOps.Concelier.Core.Tests` (569/569), `StellaOps.Concelier.SourceIntel.Tests` (61/61), `StellaOps.Feedser.Core.Tests` (81/81). Failed: `StellaOps.Scheduler.WebService.Tests` (107 failures / 18 passes, all cascading from unresolved DI registrations for `IImpactSnapshotRepository`, `IPolicyRunJobRepository`, and `IGraphJobRepository`), `StellaOps.Workflow.Engine.Tests` (6 failures / 133 passes, canonical workflow rendering now returns only 5 of 10 expected definitions and round-trip compilation injects `assign-business-reference` producing non-identical canonical JSON), `StellaOps.Concelier.Persistence.Tests` (29 failures / 207 passes, missing PostgreSQL relations including `kev_flags`, `sources`, and `merge_events`). This isolates Notify web regressions to the web surface rather than core/engine/persistence layers, and isolates Scheduler breakage to web-host service wiring rather than persistence repositories. | QA |
| 2026-04-09 | Fourth-pass serial backend sweep widened coverage without code changes. Passed: `StellaOps.Notify.Queue.Tests` (14/14), `StellaOps.Scheduler.Queue.Tests` (102/102), `StellaOps.Scheduler.Plugin.Tests` (16/16), `StellaOps.Workflow.DataStore.PostgreSQL.Tests` (13/13), `StellaOps.Feedser.BinaryAnalysis.Tests` (26/26), `StellaOps.Concelier.SbomIntegration.Tests` (130/130). Failed: `StellaOps.Notify.Worker.Tests` failed at build time because the referenced worker project path `src/Notify/StellaOps.Notify.Worker/StellaOps.Notify.Worker.csproj` does not exist and worker namespaces/types (`Handlers`, `Processing`, `INotifyEventHandler`, `NotifyWorkerOptions`) cannot be resolved; `StellaOps.Scheduler.Models.Tests` failed 1/143 because `ScheduleSample_RoundtripsThroughCanonicalSerializer` now emits extra fields (`jobKind`, `source`) and no longer round-trips to the expected canonical sample; `StellaOps.Excititor.WebService.Tests` failed 7/37 because OIDC metadata bootstrapping points to `http://localhost/.well-known/openid-configuration` and rejects non-HTTPS (`IDX20108`). `StellaOps.Concelier.Integration.Tests` did not execute any real integration coverage because its only test was skipped behind `STELLAOPS_INTEGRATION_TESTS=true`. `StellaOps.Workflow.Renderer.Tests` was manually stopped after entering a long-running artifact-generation loop under `TestResults/workflow-renderings/20260409/DocumentProcessingWorkflow` with no terminal result emitted during the observation window. | QA |
| 2026-04-09 | Fifth-pass serial workflow and Excititor-internal sweep executed without code changes. Passed: `StellaOps.Workflow.Signaling.Redis.Tests` (2/2), `StellaOps.Workflow.DataStore.MongoDB.Tests` (11/11), `StellaOps.Excititor.Core.Tests` (185/185), `StellaOps.Excititor.Policy.Tests` (2/2), `StellaOps.Excititor.Export.Tests` (16/16), `StellaOps.Excititor.Formats.OpenVEX.Tests` (15/15), `StellaOps.Excititor.Formats.CycloneDX.Tests` (15/15), `StellaOps.Excititor.Formats.CSAF.Tests` (13/13), `StellaOps.Excititor.Plugin.Tests` (25/25), `StellaOps.Excititor.Attestation.Tests` (17/17), `StellaOps.Excititor.Worker.Tests` (70/70), and `StellaOps.Excititor.ArtifactStores.S3.Tests` (2/2). Failed: `StellaOps.Workflow.DataStore.Oracle.Tests` (26 failures / 14 passes; DI/runtime configuration gaps including missing `StackExchange.Redis.IConnectionMultiplexer`, missing `IOracleAqTransport`, and unconfigured EF `DbContext` provider), and `StellaOps.Excititor.Persistence.Tests` (48 failures / 6 passes; shared Postgres test fixture cannot apply Excititor migrations because PostgreSQL reports `42601: syntax error at or near "("`). `StellaOps.Excititor.Core.UnitTests` remained a harness anomaly: `dotnet test` exited `0` after restore/build, but emitted no test-host execution and produced no `TestResults`. | QA |
| 2026-04-09 | Fifth-pass serial Excititor connector sweep broadened coverage beyond the core libraries. Passed: `StellaOps.Excititor.Connectors.Cisco.CSAF.Tests` (9/9), `StellaOps.Excititor.Connectors.RedHat.CSAF.Tests` (13/14 with 1 skip), `StellaOps.Excititor.Connectors.Ubuntu.CSAF.Tests` (10/10), and `StellaOps.Excititor.Connectors.Oracle.CSAF.Tests` (10/10). This confirms the connector-specific CSAF import/export layers are largely healthy even while Excititor persistence and web-host/OIDC paths remain broken. | QA |
| 2026-04-09 | Sixth-pass serial Excititor connector sweep continued into additional source types. Passed: `StellaOps.Excititor.Connectors.MSRC.CSAF.Tests` (12/12) and `StellaOps.Excititor.Connectors.OCI.OpenVEX.Attest.Tests` (17/17). The connector matrix continues to indicate localized breakage in persistence and web-host startup rather than a connector-wide ingestion/export regression. | QA |
| 2026-04-09 | User-reported browser failure on `https://stella-ops.local/` was rechecked live. `curl -vk https://stella-ops.local/` resolves `stella-ops.local` to `127.1.0.1` but fails during TLS handshake (`schannel: failed to receive handshake`), and `docker ps` still reports `stellaops-router-gateway` as `unhealthy`. Router logs confirm the same startup blocker as earlier: `Duplicate route path '/platform' already defined by Route[96]`, so the frontdoor remains non-functional even though many backend services and test projects are healthy. | QA |
| 2026-04-09 | Router/frontdoor defect fixed without weakening gateway validation. Root cause was config composition in local compose: `devops/compose/router-gateway-local.json` was being mounted as `/app/appsettings.local.json`, so its route table merged with the baked-in gateway `appsettings.json` and duplicated `/platform`. The local router config was normalized into a standalone gateway configuration (`Node`, `Transports`, `Routing`, `OpenApi`, `Auth`, `Health`, `Routes`, `Logging`) and compose now mounts it as `/app/appsettings.json` in both local compose variants so it replaces, rather than merges with, the baked-in route table. A guard test was added to keep the compose config standalone. | Developer / QA |
| 2026-04-09 | Focused router verification after the fix: `dotnet test src/Router/__Tests/StellaOps.Gateway.WebService.Tests/StellaOps.Gateway.WebService.Tests.csproj -v minimal` now runs with the new compose-config guard and reports `289` passed / `1` failed, where the sole remaining failure is the pre-existing readiness regression (`HealthReady_ReturnsOk_WhenRequiredMicroserviceIsRegistered` expected `200`, got `503`). The router container was then force-recreated from `devops/compose`, after which `docker ps` shows `stellaops-router-gateway` `healthy`, `curl -vk https://stella-ops.local/` returns `HTTP/1.1 200 OK` with the Stella Ops HTML shell, and `curl -I http://stella-ops.local/` returns `302 Found` redirecting to HTTPS. | Developer / QA |
| 2026-04-09 | Local login failure triaged to Authority bootstrap convergence, not bad credentials. Live Authority logs showed the browser POSTs reaching `/authorize`, but the standard plugin rejected `admin` as an unknown user. Startup logs showed the root cause: during local cold start the plugin attempted bootstrap seeding once, hit `Npgsql.NpgsqlException: Failed to connect ... Connection refused`, logged the error, and never retried, leaving the service healthy but without the bootstrap admin account. | Developer / QA |
| 2026-04-09 | Authority standard-plugin bootstrap hardening shipped. `StandardPluginBootstrapper` now retries the bootstrap pass instead of abandoning seeding after the first transient storage failure, and the focused Authority Standard Plugin suite now passes `44/44`, including a new transient-failure regression test. Module docs were updated to record that bootstrap user/client provisioning now converges when storage becomes reachable after startup. | Developer / QA |
| 2026-04-09 | Live deployment and browser verification completed. Rebuilding `stellaops/authority:dev` through `devops/docker/Dockerfile.platform` was blocked by stale Dockerfile paths (`/src/Signer`, `/src/Scheduler` no longer exist), so the updated Authority host was published locally and copied into the running container before restarting `stellaops-authority`. On restart, Authority logs confirmed `assigned role 'admin' to bootstrap user 'admin'`. The real browser-level script `npm run test:e2e:live:auth` then succeeded and produced a full `demo-prod` session for `admin`, landing on `https://stella-ops.local/?tenant=demo-prod&regions=apac,eu-west,us-east,us-west` with title `Dashboard - StellaOps`. | Developer / QA |
| 2026-04-10 | `/ops/scripts` was fixed at the owning-service layer. Release Orchestrator scripts now bind isolated `ScriptsPostgresOptions`, embed their own `scripts` schema DDL/seed migration, and register a dedicated startup migration host. A shared infrastructure bug also had to be corrected: `AddStartupMigrations(...)` previously used `AddHostedService(...)`, which deduplicated the second migration host in a single service and prevented the `scripts` schema migrator from starting beside the core `release_orchestrator` migrator. Focused Release Orchestrator integration tests now pass `15/15`, release-orchestrator was republished into the live container, startup logs show `Migration.ReleaseOrchestrator.Scripts` applying `001_initial.sql`, and a real authenticated Playwright probe confirmed `https://stella-ops.local/ops/scripts` loads with heading `Scripts`, no generic error banner, and 4 rendered scripts while `GET /api/v2/scripts` returns `200`. | Developer / QA |
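
The hosted-service deduplication described in the last log entry can be modeled with a simplified Python registry. This is not the .NET DI container or the repo's `AddStartupMigrations(...)` helper: `Registry`, `try_add`, `add`, and `MigrationHost` are hypothetical names that only illustrate how a "register once per implementation type" rule drops a second migration host.

```python
class MigrationHost:
    """Stand-in for a startup migration host bound to one schema."""

    def __init__(self, schema: str):
        self.schema = schema


class Registry:
    def __init__(self):
        self._factories = []  # (implementation type, factory) pairs

    def try_add(self, impl_type, factory) -> bool:
        """Register only if no entry with this implementation type exists."""
        if any(known is impl_type for known, _ in self._factories):
            return False  # a second host of the same type is dropped
        self._factories.append((impl_type, factory))
        return True

    def add(self, impl_type, factory) -> None:
        """Always register, so several hosts of one type can coexist."""
        self._factories.append((impl_type, factory))

    def build(self) -> list:
        return [factory() for _, factory in self._factories]


schemas = ["release_orchestrator", "scripts"]

deduped = Registry()
for name in schemas:
    deduped.try_add(MigrationHost, lambda name=name: MigrationHost(name))

explicit = Registry()
for name in schemas:
    explicit.add(MigrationHost, lambda name=name: MigrationHost(name))

deduped_schemas = [host.schema for host in deduped.build()]
explicit_schemas = [host.schema for host in explicit.build()]
print(deduped_schemas, explicit_schemas)
```

Under the type-keyed rule only the first schema gets a migrator, which mirrors the symptom in the log: the `scripts` migrator never started beside the core `release_orchestrator` one.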
## Decisions & Risks
- Decision: the retest follows the user-specified order exactly: backend unit, backend E2E, frontend unit, frontend E2E.
- Decision: no more than one project-level test run will execute at once, even where the repo could support more concurrency.
- Risk: the fresh bootstrap still leaves several core services unhealthy (`router-gateway`, `findings-ledger-web`, `timeline-web`, `graph-api`, `scheduler-web`), so backend and frontend E2E coverage may be partially blocked even if unit suites pass.
- Risk: some `.NET` test entrypoints in this repo use Microsoft.Testing.Platform, which previously ignored `--filter` in at least one integration suite; command output must be inspected carefully so suite totals are not misreported as targeted evidence.
- Risk: the repository currently exposes `503` distinct `*Tests.csproj` projects under `src`, so this sprint is explicitly a risk-based sampled regression sweep, not a complete full-repo certification run.
- Risk: additional second-pass failures show contract drift and compile drift beyond the original container-health blockers, notably duplicate endpoint names in Policy host boot, normalize/auth regressions in Notify, and an interface-implementation mismatch in VexHub test infrastructure.
- Risk: deeper isolation shows several failures are layer-specific rather than module-wide: Notify core/engine/persistence pass while web contracts fail; Scheduler persistence passes while the web host cannot resolve repository services; Concelier core/source-intel/feed ingestion pass while persistence fails on missing relations and the web-service suite still hangs.
- Risk: the latest batch adds more compile/configuration drift not visible from service health alone: missing Notify worker project references, scheduler model sample drift, Excititor/OIDC test assumptions that now require HTTPS metadata, Concelier integration tests gated entirely by `STELLAOPS_INTEGRATION_TESTS`, and a long-running workflow renderer suite that generates artifacts without reaching a terminal result promptly.
- Risk: the latest workflow datastore pass shows Oracle-specific integration coverage is significantly behind PostgreSQL/MongoDB parity: Oracle tests fail on missing Redis multiplexer wiring, missing `IOracleAqTransport`, and missing EF provider configuration inside the runtime host.
- Risk: Excititor internals are not uniformly broken. Most core/policy/export/format/worker/connector projects are green, but persistence is heavily red because the shared Postgres fixture cannot apply Excititor migrations (`42601` syntax error near `(`), and `StellaOps.Excititor.Core.UnitTests` appears miswired as a non-executing test harness despite `dotnet test` returning success.
- Decision: local compose now treats `devops/compose/router-gateway-local.json` as a full replacement gateway configuration mounted at container `appsettings.json`. This preserves strict duplicate-route validation while preventing local route-table double-loading.
- Risk: the gateway suite still has one unrelated readiness failure (`HealthReady_ReturnsOk_WhenRequiredMicroserviceIsRegistered` returning `503`), so router startup/frontdoor availability is fixed but the router test project is not yet fully green.
- Decision: Authority standard-plugin bootstrap provisioning now retries transient startup failures so the documented local admin account and seeded console client converge after PostgreSQL becomes reachable, rather than depending on startup order luck.
- Risk: `devops/docker/Dockerfile.platform` is stale relative to the current repo layout (`/src/Signer` and `/src/Scheduler` COPY steps fail), so image rebuilds for the updated Authority service required a temporary local publish plus container copy path instead of a clean Docker target rebuild.
- Decision: startup schema migration hosts now register as explicit `IHostedService` singletons instead of going through `AddHostedService(...)`, so one service can own and auto-migrate multiple PostgreSQL schemas without the second host being silently deduplicated.
- Risk: the local browser trust chain is still not accepted by the ad hoc Playwright CLI browser session in this terminal, so the final UI verification for `/ops/scripts` used the repo's existing `live-frontdoor-auth.mjs` harness with `ignoreHTTPSErrors: true` instead of the Playwright MCP bridge.
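
The bootstrap retry decision above can be sketched as follows. This is a minimal Python model, not the `StandardPluginBootstrapper` implementation: `run_bootstrap_with_retry`, `TransientStorageError`, the attempt limit, and the delay are all hypothetical.

```python
import time


class TransientStorageError(Exception):
    """Hypothetical stand-in for a storage failure that may clear on its own."""


def run_bootstrap_with_retry(seed, max_attempts=5, delay_seconds=0.0, sleep=time.sleep):
    """Call seed() until it succeeds or max_attempts is exhausted.

    Returns the number of attempts used. Re-raises the last transient error
    if every attempt fails, so a permanent outage stays visible.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            seed()
            return attempt
        except TransientStorageError:
            if attempt == max_attempts:
                raise
            sleep(delay_seconds)


# Simulated storage that becomes reachable on the third attempt.
calls = {"count": 0}


def seed_admin_and_client():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientStorageError("storage not reachable yet")


attempts_used = run_bootstrap_with_retry(seed_admin_and_client)
```

The point of the sketch is that seeding converges once storage is reachable, independent of container start order.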
## Next Checkpoints
- Finish backend unit lane and decide whether failures require code fixes before backend E2E.
- Use the live stack for backend API verification once the backend unit lane is complete.
- Proceed to frontend unit and frontend E2E only after backend lanes are recorded.


@@ -0,0 +1,91 @@
# Sprint 20260410-001 -- Runtime No-Mocks Real Backend Wiring
## Topic & Scope
- Remove live production-path stubs, mock providers, demo payloads, and in-memory stores that currently let the UI report fictional backend state.
- Start with the active browser/runtime path: Angular production DI bindings plus the Concelier feed-mirror surfaces behind `/ops/operations/feeds`.
- Replace fake success/error payloads with real persistence-backed reads, real job dispatch, or explicit unsupported/problem responses when no real backend exists yet.
- Working directory: `.`.
- Expected evidence: scoped code changes, targeted tests, live API/UI verification, and a logged inventory of remaining runtime in-memory blockers.
## Dependencies & Concurrency
- Required docs: `docs/modules/platform/architecture-overview.md`, `docs/modules/concelier/architecture.md`, `src/Web/StellaOps.Web/AGENTS.md`, `src/Concelier/AGENTS.md`, `src/Concelier/StellaOps.Concelier.WebService/AGENTS.md`.
- Depends on [SPRINT_20260409_002_Platform_local_stack_regression_retest.md](/C:/dev/New%20folder/git.stella-ops.org/docs/implplan/SPRINT_20260409_002_Platform_local_stack_regression_retest.md) for the user-reported local failures that exposed the fake runtime paths.
- Initial implementation is limited to `src/Web/StellaOps.Web/**`, `src/Concelier/**`, and related docs/tests. Additional module cleanup discovered during inventory must be logged before expansion.
- Verification commands may run sequentially; no project-level test concurrency is needed for this sprint.
## Documentation Prerequisites
- `docs/modules/platform/architecture-overview.md`
- `docs/modules/concelier/architecture.md`
- `src/Web/StellaOps.Web/AGENTS.md`
- `src/Concelier/AGENTS.md`
- `src/Concelier/StellaOps.Concelier.WebService/AGENTS.md`
## Delivery Tracker
### NOMOCK-001 - Inventory live runtime mock and in-memory bindings
Status: DOING
Dependency: none
Owners: Developer
Task description:
- Identify production-path Angular providers, API clients, and service registrations that still bind to mocks, seeded demo payloads, or in-memory stores during normal local/runtime execution.
- Separate true runtime bindings from test-only helpers so cleanup work targets the user-visible path first.
Completion criteria:
- [ ] Active production-path mock/in-memory bindings are listed in the execution log with file references.
- [ ] Test-only mocks are distinguished from runtime bindings.
- [ ] The initial implementation slice is explicitly scoped from that inventory.
### NOMOCK-002 - Remove active Angular production mock providers
Status: TODO
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove any Angular app-level production DI binding that resolves a mock client instead of a real HTTP client.
- Add a focused regression test so the main application configuration cannot silently drift back to mock providers.
Completion criteria:
- [ ] `app.config.ts` no longer binds production API tokens to mock implementations.
- [ ] A targeted frontend test guards the affected provider wiring.
- [ ] Live UI requests hit the real backend client path.
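
One way to approach the regression guard is a static check over the application configuration source. The sketch below is in Python rather than the project's Angular test stack, and the provider shape it looks for (`useClass` or `useExisting` pointing at an identifier that starts with `Mock`) and the token names are assumptions, not taken from `app.config.ts`.

```python
import re

# Matches provider entries such as `useClass: MockFooClient`.
MOCK_BINDING = re.compile(r"\buse(?:Class|Existing)\s*:\s*(Mock\w+)")


def find_mock_bindings(config_source: str) -> list:
    """Return every mock implementation name bound in the given source text."""
    return MOCK_BINDING.findall(config_source)


# Hypothetical configuration fragments used only to exercise the check.
drifted = """
providers: [
  { provide: EXAMPLE_API, useClass: MockExampleClient },
  { provide: OTHER_API, useExisting: OtherHttpClient },
]
"""
clean = drifted.replace("MockExampleClient", "ExampleHttpClient")

drifted_hits = find_mock_bindings(drifted)
clean_hits = find_mock_bindings(clean)
```

A test built on this idea would fail as soon as `drifted_hits` is non-empty for the real configuration file.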
### NOMOCK-003 - Replace feed-mirror seeded/stubbed backend behavior with real backend state
Status: TODO
Dependency: NOMOCK-001
Owners: Developer
Task description:
- Remove the seeded feed-mirror DTO catalog and fabricated sync/offline/bundle/version-lock responses from the Concelier web service.
- Back the mirror list/detail/state endpoints with the real advisory-source read model and source persistence, use the real source sync trigger path, and return truthful empty/problem responses for operations that do not yet have a real persistent backend.
Completion criteria:
- [ ] `/api/v1/concelier/mirrors` and `/api/v1/concelier/mirrors/{id}` read from real persisted source state rather than `MirrorSeedData`.
- [ ] `/api/v1/concelier/mirrors/{id}/sync` uses real job dispatch instead of fabricated success payloads.
- [ ] Fake seeded timeout/demo bundle/version-lock/import/offline payloads are removed from the live endpoint path.
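
A truthful response for an operation with no persistent backend could take the form of an RFC 7807 problem document. The following Python sketch is illustrative only: `not_implemented_problem`, the operation name, and the route are hypothetical, and the actual Concelier response shape is not specified in this sprint.

```python
import json


def not_implemented_problem(operation: str, instance: str):
    """Build (status, headers, body) for an operation with no real backend."""
    body = {
        "type": "about:blank",
        "title": "Not Implemented",
        "status": 501,
        "detail": f"{operation} has no persistent backend yet.",
        "instance": instance,
    }
    headers = {"Content-Type": "application/problem+json"}
    return 501, headers, json.dumps(body)


status, headers, payload = not_implemented_problem(
    "Mirror bundle export",                       # hypothetical operation name
    "/api/v1/concelier/mirrors/example/bundles",  # hypothetical route
)
```

Returning an explicit problem document lets the UI render an honest empty or unsupported state instead of seeded demo data.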
### NOMOCK-004 - Verify live feed UI behavior and log remaining blocked runtime in-memory services
Status: TODO
Dependency: NOMOCK-002
Owners: Developer / QA
Task description:
- Re-test the affected browser/API flows after the runtime cleanup and record what still cannot be converted because the owning module lacks a real persistent backend implementation.
- Keep the user-visible contract truthful: blocked modules must fail honestly rather than returning invented data.
Completion criteria:
- [ ] `/ops/operations/feeds` and `/ops/operations/feeds/mirror/*` are verified against the live stack.
- [ ] Targeted automated tests covering the changed runtime path pass or fail with concrete blockers recorded.
- [ ] Remaining cross-module runtime in-memory bindings are logged with next-action notes.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-04-10 | Sprint created to remove live production-path stubs, mock providers, and in-memory runtime bindings starting with the active Angular app configuration and Concelier feed-mirror surfaces. | Developer |
## Decisions & Risks
- Decision: this sprint prioritizes live runtime paths the browser can currently reach over test-only mock helpers.
- Decision: unsupported operations must return truthful empty/problem responses rather than seeded demo success/error payloads.
- Risk: several modules outside the initial slice still boot with runtime in-memory stores (`Notify`, `Graph`, `Policy`, `Platform`, `Scheduler`, `Scanner`, `BinaryIndex`, `Signals`, `SbomService`, `Signer`, `PacksRegistry`, `AdvisoryAI`). They will need follow-on slices unless a real persistence path already exists and can be wired safely.
- Risk: some feed-mirror sub-features appear to have no real persisted backend contract yet, so removing fake data may temporarily surface explicit `501`/empty-state behavior in the UI until the owning backend is implemented.
## Next Checkpoints
- Remove the active Angular VEX Hub mock provider.
- Convert the Concelier feed-mirror endpoints from seeded data to real source/read-model state.
- Re-test the live feed pages and record the next runtime cleanup slice.


@@ -205,6 +205,9 @@ vault kv put secret/jenkins api-token="your-jenkins-token"
# Store Nexus admin password
vault kv put secret/nexus admin-password="your-nexus-password"
# Store GitLab PATs for API and registry access
vault kv put secret/gitlab access-token="glpat-your-token" registry-basic="root:glpat-your-token"
```
---
@@ -320,10 +323,16 @@ GITLAB_ENABLE_REGISTRY=true GITLAB_ENABLE_PACKAGES=true \
docker compose -f docker-compose.integrations.yml --profile heavy up -d gitlab
```
**Stella Ops integration config (SCM):**
**Stella Ops integration config (SCM / CI):**
- Endpoint: `http://gitlab.stella-ops.local:8929`
- AuthRef: `authref://vault/gitlab#access-token`
**Stella Ops integration config (Registry):**
- Endpoint: `http://gitlab.stella-ops.local:5050`
- AuthRef: `authref://vault/gitlab#registry-basic`
- Secret format: `username:personal-access-token` (local default: `root:<token>`)
- The Docker registry connector follows GitLab's `WWW-Authenticate: Bearer` challenge and exchanges this basic secret against `/jwt/auth` before retrying catalog and tag probes.
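
The challenge-and-exchange step can be sketched in Python as below. The parsing follows the general Docker registry token-auth pattern; the realm host, `service` value, scope, and sample token are made-up illustration values, and the request is only constructed, not sent.

```python
import base64
import re
import urllib.parse
import urllib.request


def parse_bearer_challenge(header: str) -> dict:
    """Extract key="value" parameters from a `WWW-Authenticate: Bearer ...` value."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported challenge scheme: {scheme}")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def build_token_request(challenge: dict, basic_secret: str) -> urllib.request.Request:
    """Build the GET request that trades `username:token` for a bearer token."""
    query = {k: v for k, v in challenge.items() if k in ("service", "scope")}
    url = challenge["realm"] + "?" + urllib.parse.urlencode(query)
    encoded = base64.b64encode(basic_secret.encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {encoded}"})


# Illustration values only; a real challenge comes from the registry's 401 response.
header = ('Bearer realm="http://gitlab.stella-ops.local:8929/jwt/auth",'
          'service="container_registry",scope="registry:catalog:*"')
challenge = parse_bearer_challenge(header)
request = build_token_request(challenge, "root:glpat-example")
```

Under the Docker token-auth convention, the JSON response to this request carries the bearer token in a `token` (or `access_token`) field, which is then sent as `Authorization: Bearer <token>` on the retried probe.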
---
## Mock Fixtures
@@ -386,7 +395,7 @@ docker compose -f docker-compose.integrations.yml down -v
| **Registry** | Harbor | harbor-fixture (mock) | Ready |
| **Registry** | Docker Hub / OCI | docker-registry | Ready |
| **Registry** | Nexus | nexus | Ready |
| **Registry** | GitLab Registry | gitlab (heavy) | Ready when `GITLAB_ENABLE_REGISTRY=true` |
| **Registry** | GitLab Registry | gitlab (heavy) | Ready when `GITLAB_ENABLE_REGISTRY=true` and `authref://vault/gitlab#registry-basic` is populated |
| **SCM** | GitHub App | github-app-fixture (mock) | Ready |
| **SCM** | Gitea | gitea | Ready |
| **SCM** | GitLab Server | gitlab (heavy) | Ready with Vault-backed PAT |


@@ -100,6 +100,7 @@ public interface IIntegrationPlugin
- **GitHub App** - Operators provide either the GitHub Cloud root (`https://github.com`), a GitHub Enterprise Server root, or an explicit `/api/v3` base. The connector normalizes the endpoint to a single API root and probes relative `app` / `rate_limit` paths so GitHub Enterprise onboarding never falls back to origin-root `/app`.
- **Harbor** - Operators provide the Harbor base URL. Stella Ops probes the provider-specific `/api/v2.0/health` route for connection tests and health checks.
- **Docker Registry / GitLab Container Registry** - Operators provide the registry base URL. When the registry responds with `WWW-Authenticate: Bearer ...`, the connector exchanges the configured secret against the advertised token realm and retries with the returned bearer token. The local GitLab registry uses `authref://vault/gitlab#registry-basic`, storing `username:personal-access-token`.
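
The probe, exchange, and retry sequence described for the registry connector can be modeled with a stub transport. This Python sketch is not the connector's code: `probe_with_bearer_retry`, the `send` and `exchange` callables, and the canned responses are hypothetical.

```python
def probe_with_bearer_retry(send, exchange, url):
    """Probe url; on a 401 Bearer challenge, exchange for a token and retry once.

    send(url, headers) -> (status, response_headers)
    exchange(challenge_header) -> bearer token string
    """
    status, headers = send(url, {})
    challenge = headers.get("WWW-Authenticate", "")
    if status != 401 or not challenge.lower().startswith("bearer "):
        return status
    token = exchange(challenge)
    status, _ = send(url, {"Authorization": f"Bearer {token}"})
    return status


# Stub registry: rejects anonymous requests, accepts one known token.
seen = []


def fake_send(url, headers):
    seen.append(headers.get("Authorization"))
    if headers.get("Authorization") == "Bearer test-token":
        return 200, {}
    return 401, {"WWW-Authenticate": 'Bearer realm="http://example.invalid/jwt/auth"'}


def fake_exchange(challenge_header):
    return "test-token"


final_status = probe_with_bearer_retry(
    fake_send, fake_exchange, "http://gitlab.stella-ops.local:5050/v2/_catalog")
```

Retrying exactly once keeps a bad secret from looping: if the second attempt is still rejected, the caller sees the failing status directly.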
## Security Considerations