feat(infra-postgres): detect explicit transaction control in migrations

Adds MigrationSqlTransactionClassifier to recognize migration SQL that opens
its own transactions (BEGIN/COMMIT/ROLLBACK) so MigrationRunner can skip
wrapping those files in an outer transaction. StartupMigrationHost now surfaces
a MigrationCategory indicator for runtime-aligned bootstrap. Test harness
extended with an explicit-transaction fixture and execution scenario coverage.
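The skip-wrap decision can be sketched as a small classifier over the script's top-level statements. This is an illustrative shape only: the class name comes from this commit, but the regex and method are hypothetical and deliberately ignore `BEGIN ... END` blocks inside PL/pgSQL function bodies.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative sketch: flag migration SQL that manages its own transaction
// (BEGIN/COMMIT/ROLLBACK as top-level statements) so the runner can skip
// wrapping the file in an outer transaction.
// Simplification: does not exclude BEGIN...END inside PL/pgSQL bodies.
public static class MigrationSqlTransactionClassifier
{
    private static readonly Regex TxnControl = new(
        @"^\s*(BEGIN(\s+(WORK|TRANSACTION))?|START\s+TRANSACTION|COMMIT|ROLLBACK)\s*;",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    public static bool ManagesOwnTransaction(string sql) =>
        !string.IsNullOrWhiteSpace(sql) && TxnControl.IsMatch(sql);
}
```

A runner would consult this before opening its own transaction and, on `true`, execute the file as-is.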

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Branch: master
Commit: a393b6d6e1 (parent 337aa58023)
Date: 2026-04-13 21:56:27 +03:00
12 changed files with 377 additions and 33 deletions

# Sprint 20260408-001 -- Crypto Provider Picker (UI + Backend)
## Topic & Scope
- Admin-facing UI panel for discovering, monitoring, and selecting crypto providers per tenant.
- Platform service backend endpoints for provider health probing and tenant preference persistence.
- Integrates with existing `ICryptoProviderRegistry` (in `src/__Libraries/StellaOps.Cryptography/`) to respect tenant-level provider selection at runtime.
- Working directory: `src/Web/StellaOps.Web` (Angular UI), `src/Platform/` (backend API).
- Expected evidence: UI component rendering, API integration tests, DB migration for tenant crypto preferences.
## Dependencies & Concurrency
- Depends on crypto provider compose refactor (smremote extracted from main compose; crypto overlays renamed to `docker-compose.crypto-provider.*.yml`). **DONE** as of 2026-04-08.
- No upstream sprint blockers. Can run in parallel with other UI or backend work that does not touch Platform settings or Cryptography libraries.
- The `docker-compose.sm-remote.yml` (standalone HSM provider) remains unchanged; this sprint only concerns provider *discovery* and *selection*, not provider lifecycle management.
## Documentation Prerequisites
- `devops/compose/README.md` -- updated Crypto Provider Overlays section (contains health endpoints and compose commands).
- `src/__Libraries/StellaOps.Cryptography/CryptoProviderRegistry.cs` -- current registry implementation with preferred-order resolution.
- `src/__Libraries/StellaOps.Cryptography/CryptoProvider.cs` -- `ICryptoProvider` interface contract.
- `docs/security/crypto-profile-configuration.md` -- current crypto profile configuration docs.
## Delivery Tracker
### CP-001 - Provider health probe API endpoint
Status: DONE
Dependency: none
Owners: Backend Developer
Task description:
Add a Platform admin API endpoint `GET /api/v1/admin/crypto-providers/health` that probes each known crypto provider's health endpoint and returns aggregated status.
Known provider health endpoints:
- SmRemote (router microservice): `http://smremote.stella-ops.local:8080/health` (internal router mesh)
- SM Remote (standalone HSM): `http://localhost:56080/status` (or configured `SM_REMOTE_PORT`)
- CryptoPro CSP: `http://cryptopro-csp:8080/health`
- Crypto Simulator: `http://sim-crypto:8080/keys`
Response schema (JSON):
```json
{
"providers": [
{
"id": "smremote",
"name": "SmRemote (SM2/SM3/SM4)",
"status": "running",
"healthEndpoint": "http://smremote.stella-ops.local:8080/health",
"responseTimeMs": 12,
"composeOverlay": "docker-compose.crypto-provider.smremote.yml",
"startCommand": "docker compose -f docker-compose.stella-ops.yml -f docker-compose.crypto-provider.smremote.yml up -d smremote"
},
{
"id": "cryptopro",
"name": "CryptoPro CSP (GOST)",
"status": "unreachable",
"healthEndpoint": "http://cryptopro-csp:8080/health",
"responseTimeMs": null,
"composeOverlay": "docker-compose.crypto-provider.cryptopro.yml",
"startCommand": "CRYPTOPRO_ACCEPT_EULA=1 docker compose -f docker-compose.stella-ops.yml -f docker-compose.crypto-provider.cryptopro.yml up -d cryptopro-csp"
},
{
"id": "crypto-sim",
"name": "Crypto Simulator (dev/test)",
"status": "stopped",
"healthEndpoint": "http://sim-crypto:8080/keys",
"responseTimeMs": null,
"composeOverlay": "docker-compose.crypto-provider.crypto-sim.yml",
"startCommand": "docker compose -f docker-compose.stella-ops.yml -f docker-compose.crypto-provider.crypto-sim.yml up -d sim-crypto"
}
]
}
```
The endpoint should use `HttpClient` with a short timeout (5s) to probe each provider. Status values: `running`, `stopped`, `unreachable`, `degraded`.
Provider definitions should be stored in configuration (appsettings or DB), not hardcoded, so that custom providers can be registered.
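The probe behavior above can be sketched as follows. The types here are hypothetical (the shipped implementation is `CryptoProviderHealthService` per the execution log), and mapping non-2xx responses to `degraded` is an assumption; note that `stopped` cannot be derived from probing alone (see Decisions & Risks), so this sketch collapses all failures to `unreachable`.

```csharp
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

public sealed record ProviderProbeResult(string Id, string Status, long? ResponseTimeMs);

// Hypothetical per-provider probe: short-timeout HTTP GET, with failures
// mapped to "unreachable" instead of surfacing as a 500.
public static class ProviderProber
{
    private static readonly HttpClient Http = new() { Timeout = TimeSpan.FromSeconds(5) };

    public static async Task<ProviderProbeResult> ProbeAsync(string id, string healthEndpoint)
    {
        var sw = Stopwatch.StartNew();
        try
        {
            using var response = await Http.GetAsync(healthEndpoint);
            sw.Stop();
            // Assumption: a reachable but non-2xx provider is reported as degraded.
            var status = response.IsSuccessStatusCode ? "running" : "degraded";
            return new ProviderProbeResult(id, status, sw.ElapsedMilliseconds);
        }
        catch (Exception) // timeout, DNS failure, connection refused
        {
            return new ProviderProbeResult(id, "unreachable", null);
        }
    }
}
```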
Completion criteria:
- [x] `GET /api/v1/admin/crypto-providers/health` returns JSON with status for all configured providers
- [x] Unreachable providers return `unreachable` status (not 500)
- [x] Response includes `startCommand` with the correct compose overlay filename
- [x] Endpoint is admin-only (requires `ops.admin` or `crypto:admin` scope)
- [ ] Unit test covering probe timeout and mixed healthy/unhealthy scenarios
### CP-002 - Tenant crypto provider preference API + DB table
Status: DONE
Dependency: none
Owners: Backend Developer
Task description:
Add a database table and API endpoints for storing per-tenant crypto provider preferences.
Database table (`platform` schema):
```sql
CREATE TABLE platform.tenant_crypto_preferences (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES shared.tenants(id),
provider_id VARCHAR(100) NOT NULL,
algorithm_scope VARCHAR(100) NOT NULL DEFAULT '*',
priority INT NOT NULL DEFAULT 0,
is_active BOOLEAN NOT NULL DEFAULT true,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (tenant_id, provider_id, algorithm_scope)
);
```
API endpoints:
- `GET /api/v1/admin/crypto-providers/preferences` -- list current tenant's provider preferences
- `PUT /api/v1/admin/crypto-providers/preferences` -- update provider selection (body: `{ providerId, algorithmScope, priority, isActive }`)
- `DELETE /api/v1/admin/crypto-providers/preferences/{id}` -- remove a preference
The preference should feed into `CryptoProviderRegistry` via `CryptoRegistryProfiles`, allowing per-tenant override of the `preferredProviderOrder`.
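How a stored preference is intended to reshape the preferred order can be sketched as below. `PreferredOrderMerger` is an illustrative helper, not a real type, and the assumption that a higher `priority` value wins is mine; the actual ordering semantics belong to the preference store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative: merge active tenant preferences (higher priority first)
// ahead of the static default order, de-duplicating provider ids.
public static class PreferredOrderMerger
{
    public static IReadOnlyList<string> Merge(
        IEnumerable<(string ProviderId, int Priority, bool IsActive)> tenantPreferences,
        IReadOnlyList<string> defaultOrder)
    {
        var preferred = tenantPreferences
            .Where(p => p.IsActive)                 // is_active column
            .OrderByDescending(p => p.Priority)     // priority column (assumed: higher wins)
            .Select(p => p.ProviderId);
        return preferred.Concat(defaultOrder).Distinct().ToList();
    }
}
```

Rows with `is_active = false` are ignored, matching the table's `is_active` column.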
Completion criteria:
- [x] SQL migration file added as embedded resource in Platform persistence library
- [x] Auto-migration on startup (per repo-wide rule 2.7)
- [x] CRUD endpoints work and are admin-scoped
- [x] Preferences are tenant-isolated (multi-tenant safe)
- [ ] Integration test: set preference, resolve provider, confirm correct provider is selected
### CP-003 - Angular crypto provider dashboard panel
Status: DONE
Dependency: CP-001
Owners: Frontend Developer
Task description:
Add a "Crypto Providers" panel in the Platform Settings area of the Angular UI. Location: under the existing settings navigation, accessible at `/settings/crypto-providers`.
Panel layout:
1. **Provider List** -- Table/card grid showing each provider with:
- Provider name and icon/badge (SM, GOST, SIM, etc.)
- Status indicator: green dot (Running), red dot (Stopped/Unreachable)
- Health response time (if running)
- Last checked timestamp
2. **Start Instructions** -- When a provider is stopped or unreachable, show a collapsible section with:
- The exact `docker compose` command to start it (from the API response `startCommand`)
- Copy-to-clipboard button
3. **Refresh Button** -- Re-probe all providers on demand
4. **Auto-refresh** -- Poll every 30 seconds when the panel is visible (use `interval` with `switchMap` and `takeUntilDestroyed`)
Angular implementation:
- Component: `src/Web/StellaOps.Web/src/app/features/settings/crypto-providers/`
- Service: `CryptoProviderService` calling `GET /api/v1/admin/crypto-providers/health`
- Route: Add to settings routing module
- Use existing StellaOps design system components (cards, status badges, tables)
Completion criteria:
- [x] Panel renders provider list with live status from API
- [x] Stopped providers show start command with copy button
- [x] Auto-refresh works and stops when navigating away
- [x] Panel is accessible only to admin users
- [x] Responsive layout (works on tablet and desktop)
### CP-004 - Active provider selection UI
Status: DONE
Dependency: CP-002, CP-003
Owners: Frontend Developer
Task description:
Extend the crypto provider dashboard panel (CP-003) with an "Active Provider" selection feature.
UI additions:
1. **Active Badge** -- Show which provider is currently selected for the tenant
2. **Select Button** -- On each running provider card, show "Set as Active" button
3. **Algorithm Scope** -- Optional: dropdown to scope the selection to specific algorithm families (SM, GOST, default, etc.) or apply globally (`*`)
4. **Confirmation Dialog** -- Before changing the active provider, show a confirmation dialog explaining the impact on signing operations
5. **Priority Ordering** -- Drag-and-drop reordering of provider priority (maps to `CryptoRegistryProfiles.preferredProviderOrder`)
The selection calls `PUT /api/v1/admin/crypto-providers/preferences` and updates the UI immediately.
Completion criteria:
- [x] Admin can select active provider per tenant
- [x] Selection persists across page refreshes (reads from API)
- [x] Cannot select a provider that is currently stopped/unreachable (button disabled with tooltip)
- [x] Confirmation dialog shown before changing provider
- [x] Priority ordering updates the registry's preferred order
### CP-005 - ICryptoProviderRegistry tenant-aware resolution
Status: DONE
Dependency: CP-002
Owners: Backend Developer
Task description:
Extend `CryptoProviderRegistry` (or introduce a decorator/wrapper) to consult tenant preferences when resolving providers. Currently the registry uses a static `preferredProviderOrder` set at startup. The enhancement should:
1. Accept an `IStellaOpsTenantAccessor` to determine the current tenant
2. Query the `platform.tenant_crypto_preferences` table (cached, TTL ~60s) for the tenant's preferred order
3. Override `CryptoRegistryProfiles.ActiveProfile` based on the tenant preference
4. Fall back to the default preferred order if no tenant preference exists
This must not break existing non-tenant-aware code paths (CLI, background workers). The tenant-aware resolution should be opt-in via DI registration.
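The decorator shape implied by steps 1-4 can be sketched as follows. The interfaces here are simplified stand-ins (the real contracts live in `StellaOps.Cryptography`), and per the execution log the shipped type is `TenantAwareCryptoProviderRegistry`.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for the real contracts (illustrative only).
public interface ICryptoProviderRegistry
{
    object ResolveSigner(string algorithm, IReadOnlyList<string>? preferredOrder = null);
}

public interface ITenantCryptoPreferenceProvider
{
    // Returns null when there is no tenant context or no stored preference.
    IReadOnlyList<string>? TryGetPreferredOrder();
}

// Hypothetical decorator: consults tenant preferences, falls back to the
// inner registry's default order when none exist (steps 1-4 above).
public sealed class TenantAwareRegistry : ICryptoProviderRegistry
{
    private readonly ICryptoProviderRegistry _inner;
    private readonly ITenantCryptoPreferenceProvider _preferences;

    public TenantAwareRegistry(ICryptoProviderRegistry inner, ITenantCryptoPreferenceProvider preferences)
        => (_inner, _preferences) = (inner, preferences);

    public object ResolveSigner(string algorithm, IReadOnlyList<string>? preferredOrder = null)
        => _inner.ResolveSigner(algorithm, preferredOrder ?? _preferences.TryGetPreferredOrder());
}
```

Caching (step 2) would live behind `ITenantCryptoPreferenceProvider`, keeping the decorator itself stateless and the non-tenant-aware paths untouched.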
Key files:
- `src/__Libraries/StellaOps.Cryptography/CryptoProviderRegistry.cs`
- `src/__Libraries/StellaOps.Cryptography/CryptoRegistryProfiles.cs` (if exists)
Completion criteria:
- [x] Tenant-aware resolution works when tenant accessor is available
- [x] Falls back to default when no tenant context or no preferences set
- [x] Cached query (not per-request DB hit)
- [x] Existing non-tenant code paths unaffected (unit tests pass)
- [x] Integration test: two tenants with different preferences resolve different providers
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-04-08 | Sprint created. Crypto provider compose overlays refactored (smremote extracted, files renamed). | Planning |
| 2026-04-08 | CP-001 implemented: CryptoProviderHealthService + CryptoProviderAdminEndpoints (health probe). CP-002 implemented: SQL migration 062, ICryptoProviderPreferenceStore with Postgres and InMemory impls, CRUD endpoints. Both wired in Program.cs. Build verified (0 errors, 0 warnings). Unit tests pending. | Developer |
| 2026-04-08 | Compose refactoring confirmed complete: smremote extracted (Slot 31 comment in main compose), overlay files already named `docker-compose.crypto-provider.*.yml`, README Crypto Provider Overlays section up to date, INSTALL_GUIDE.md references correct filenames. No old-named files to rename. | Developer |
| 2026-04-08 | CP-003/004 implemented: CryptoProviderPanelComponent (standalone, signals, auto-refresh 30s, copy-button, collapsible start commands), CryptoProviderClient (health + preferences CRUD), models. Route at `/setup/crypto-providers`, Setup overview card added. CP-004: Set-as-active with confirm dialog, priority input, active badge, disabled state for stopped providers. Build verified (0 errors). CP-005 is backend-only, not in scope for this FE pass. | Frontend Developer |
| 2026-04-08 | CP-005 implemented: TenantAwareCryptoProviderRegistry decorator wrapping ICryptoProviderRegistry, ITenantCryptoPreferenceProvider interface, DI extension AddTenantAwareCryptoResolution, PlatformCryptoPreferenceProvider bridging to ICryptoProviderPreferenceStore. 14 unit tests added (all pass): multi-tenant isolation, cache verification, fallback on missing tenant context, explicit-preferred-overrides-tenant, hasher/signer resolution. Build verified (0 errors). | Developer |
## Decisions & Risks
- **Risk: Provider health probing from within containers.** The Platform service runs inside the Docker network; it can reach other containers by DNS alias, but it cannot distinguish a compose overlay that was never started from a container that is up but unhealthy. Mitigation: treat any non-200 response (including DNS resolution failure) as `unreachable`.
- **Risk: Tenant preference caching coherence.** If an admin changes the active provider, other service instances may use stale cache. Mitigation: use Valkey pub/sub to broadcast preference changes, or accept eventual consistency with a 60s TTL.
- **Decision: `docker-compose.sm-remote.yml` (standalone HSM overlay) remains unchanged.** Only the router-integrated `smremote` microservice was extracted from the main compose. The standalone SM Remote overlay serves a different purpose (HSM integration with build context).
- **Decision: Provider definitions should be configurable, not hardcoded.** Seed the initial set from appsettings but allow DB overrides so operators can add custom providers.
## Next Checkpoints
- CP-001 + CP-002 backend endpoints ready for frontend integration.
- CP-003 initial panel rendering with mock data for design review.
- CP-004 + CP-005 integration testing with live crypto providers.

# Sprint 20260408-004 -- DB Schema Violations Cleanup
## Topic & Scope
- Fix two database schema violations that undermine Stella Ops' multi-schema isolation and central migration governance.
- **Violation 3**: FindingsLedger uses PostgreSQL `public` schema (collision risk with 60+ other services).
- **Violation 4**: 13+ schemas self-create via inline `EnsureTable`/`CREATE SCHEMA IF NOT EXISTS` instead of registering with `MigrationModuleRegistry`.
- Working directory: cross-module (see per-task paths below).
- Expected evidence: builds pass, CLI `stella system migrate` covers new modules, all existing tests pass with schema changes.
## Dependencies & Concurrency
- No upstream sprint dependencies; these are standalone DB hygiene fixes.
- Violation 3 and Violation 4 can be worked in parallel by separate implementers.
- Violation 4 tasks are independent of each other and can be parallelized per-service.
- Fresh DB assumption: no live data migration needed. We amend existing migration DDL directly.
## Documentation Prerequisites
- `src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModuleRegistry.cs` -- registry contract.
- `src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModulePlugins.cs` -- existing plugin examples.
- `src/Platform/__Libraries/StellaOps.Platform.Database/IMigrationModulePlugin.cs` -- plugin interface.
- `src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModulePluginDiscovery.cs` -- auto-discovery mechanism.
- Pattern reference: any existing plugin (e.g., `ScannerMigrationModulePlugin`, `PolicyMigrationModulePlugin`).
---
## Delivery Tracker
---
### V3-01 - FindingsLedger: Change DefaultSchemaName from `public` to `findings`
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
The `FindingsLedgerDbContextFactory.DefaultSchemaName` is currently `"public"`, meaning all 11 FindingsLedger tables (ledger_events, ledger_merkle_roots, findings_projection, finding_history, triage_actions, ledger_projection_offsets, airgap_imports, ledger_attestation_pointers, orchestrator_exports, ledger_snapshots, observations) plus 2 custom ENUM types (ledger_event_type, ledger_action_type) land in the PostgreSQL default schema. This risks name collisions and violates the project's per-module schema isolation pattern.
**What to change:**
1. **`src/Findings/StellaOps.Findings.Ledger/Infrastructure/Postgres/FindingsLedgerDbContextFactory.cs`** (line 10):
- Change `public const string DefaultSchemaName = "public";` to `public const string DefaultSchemaName = "findings";`
- The branching logic on line 21 (`if (string.Equals(normalizedSchema, DefaultSchemaName, ...))`) uses the compiled model only when schema matches default. After the change, the compiled model will be used when schema = `"findings"`. This is correct behavior.
2. **`src/Findings/StellaOps.Findings.Ledger/EfCore/Context/FindingsLedgerDbContext.cs`** (line 14):
- Change the fallback from `"public"` to `"findings"`:
```csharp
_schemaName = string.IsNullOrWhiteSpace(schemaName)
? "findings"
: schemaName.Trim();
```
3. **All 12 migration SQL files** in `src/Findings/StellaOps.Findings.Ledger/migrations/`:
- Prepend `CREATE SCHEMA IF NOT EXISTS findings;` to `001_initial.sql` (before `BEGIN;` or as first statement inside the transaction).
- For `001_initial.sql`: prefix all `CREATE TABLE`, `CREATE INDEX`, `PARTITION OF` statements with `findings.` schema qualifier. Tables: `ledger_events`, `ledger_events_default`, `ledger_merkle_roots`, `ledger_merkle_roots_default`, `findings_projection`, `findings_projection_default`, `finding_history`, `finding_history_default`, `triage_actions`, `triage_actions_default`.
- Move the two `CREATE TYPE` statements into the `findings` schema: `CREATE TYPE findings.ledger_event_type ...`, `CREATE TYPE findings.ledger_action_type ...`.
- For `002_*` through `009_*`: qualify all table references with `findings.` prefix. Currently these use unqualified table names (e.g., `ALTER TABLE ledger_events` becomes `ALTER TABLE findings.ledger_events`).
- For `007_enable_rls.sql`: the `findings_ledger_app` schema for RLS functions is already namespaced and fine. Just qualify the table references in `ALTER TABLE` and `CREATE POLICY` statements.
- Set `search_path` at the top of each migration: `SET search_path TO findings, public;` so that type references resolve correctly.
4. **`src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModulePlugins.cs`** (line 285):
- Change `schemaName: "public"` to `schemaName: "findings"` in `FindingsLedgerMigrationModulePlugin`.
5. **Regenerate the EF Core compiled model** (if the project uses `dotnet ef dbcontext optimize`):
- The compiled model in `src/Findings/StellaOps.Findings.Ledger/EfCore/CompiledModels/` may need regeneration if it bakes in schema names. Current inspection shows it delegates to `OnModelCreating`, so it should pick up the change automatically. Verify by building.
6. **Update tests**: The `MigrationModuleRegistryTests.cs` assertion for FindingsLedger should now expect `schemaName == "findings"`. Add an explicit assertion:
```csharp
Assert.Contains(modules, m => m.Name == "FindingsLedger" && m.SchemaName == "findings");
```
**Hardcoded `public.` SQL queries:** Grep confirms zero hardcoded `public.` prefixed SQL in the Findings codebase. All repository code passes `FindingsLedgerDbContextFactory.DefaultSchemaName` to the factory, so changing the constant propagates everywhere.
**Impact on RLS:** The `findings_ledger_app` schema for RLS helper functions already has its own namespace and will not collide. The `ALTER TABLE` statements in `007_enable_rls.sql` just need the `findings.` prefix.
Completion criteria:
- [ ] `FindingsLedgerDbContextFactory.DefaultSchemaName` == `"findings"`
- [ ] `FindingsLedgerDbContext` constructor default == `"findings"`
- [ ] `FindingsLedgerMigrationModulePlugin.schemaName` == `"findings"`
- [ ] All 12 migration SQL files use `findings.` qualified table names
- [ ] `001_initial.sql` includes `CREATE SCHEMA IF NOT EXISTS findings;`
- [ ] ENUM types created in `findings` schema
- [ ] Fresh DB: `stella system migrate FindingsLedger` creates tables under `findings` schema
- [ ] All FindingsLedger tests pass
- [ ] MigrationModuleRegistryTests updated to assert `findings` schema
---
### V4-01 - Register RiskEngine with MigrationModuleRegistry (HIGH priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
`PostgresRiskScoreResultStore` in `src/Findings/__Libraries/StellaOps.RiskEngine.Infrastructure/Stores/` self-creates the `riskengine` schema and `riskengine.risk_score_results` table via inline `EnsureTableAsync()` (lines 130-164). This bypasses the migration registry entirely.
**Steps:**
1. Create a migration SQL file: `src/Findings/__Libraries/StellaOps.RiskEngine.Infrastructure/Migrations/001_initial_schema.sql` with the DDL currently inline in `EnsureTableAsync()`.
2. Mark the SQL file as an embedded resource in the `.csproj`.
3. Add `RiskEngineMigrationModulePlugin` to `MigrationModulePlugins.cs`:
```csharp
public sealed class RiskEngineMigrationModulePlugin : IMigrationModulePlugin
{
public MigrationModuleInfo Module { get; } = new(
name: "RiskEngine",
schemaName: "riskengine",
migrationsAssembly: typeof(PostgresRiskScoreResultStore).Assembly);
}
```
4. Remove the `EnsureTableAsync()` and `EnsureTable()` methods and the `_initGate`/`_tableInitialized` fields from `PostgresRiskScoreResultStore`. Remove all calls to these methods.
5. Update test assertion: `MigrationCommandHandlersTests` expects 28 modules -- bump to 36 (all V4 sprint plugins added).
6. Add `using StellaOps.RiskEngine.Infrastructure.Stores;` to `MigrationModulePlugins.cs`.
Completion criteria:
- [x] `riskengine` schema created by migration runner, not inline code
- [x] `EnsureTable*` methods removed from `PostgresRiskScoreResultStore`
- [x] `RiskEngineMigrationModulePlugin` registered and discoverable
- [x] `stella system migrate RiskEngine` works
- [x] Build passes, existing RiskEngine tests pass
---
### V4-02 - Register Replay with MigrationModuleRegistry (MEDIUM priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
`ReplayFeedSnapshotStores.cs` in `src/Replay/StellaOps.Replay.WebService/` self-creates the `replay` schema and `replay.feed_snapshot_index` table via inline `EnsureTableAsync()` (line 152).
**Steps:**
1. Create `src/Replay/StellaOps.Replay.WebService/Migrations/001_initial_schema.sql` with the DDL.
2. Embed as resource in `.csproj`.
3. Add `ReplayMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `replay`).
4. Remove inline `EnsureTableAsync()` from `ReplayFeedSnapshotStores.cs`.
5. Add the `using` for the Replay assembly type to `MigrationModulePlugins.cs`.
6. Update module count in test.
Completion criteria:
- [x] `replay` schema created by migration runner
- [x] Inline DDL removed
- [x] Plugin registered
---
### V4-03 - Register ExportCenter with MigrationModuleRegistry (MEDIUM priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
`ExportCenterMigrationRunner` in `src/ExportCenter/StellaOps.ExportCenter/StellaOps.ExportCenter.Infrastructure/Db/` runs its own migration system with a custom `export_center.export_schema_version` table and `EnsureSchemaAsync()`. It has proper SQL migration files but uses a standalone runner instead of the central one.
**Steps:**
1. The SQL migrations already exist under `.../Db/Migrations/`. Verify they are embedded resources.
2. Add `ExportCenterMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `export_center`).
3. Keep the `ExportCenterMigrationRunner` temporarily (it has checksum validation) but ensure the central runner can also apply these migrations. Long-term, converge to central runner only.
4. Add the `using` for the assembly type.
5. Update module count.
Completion criteria:
- [x] `ExportCenterMigrationModulePlugin` registered
- [x] Central migration runner can discover and apply ExportCenter migrations
- [x] Existing ExportCenter functionality unaffected
---
### V4-04 - Register Integrations with MigrationModuleRegistry (MEDIUM priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
`src/Integrations/__Libraries/StellaOps.Integrations.Persistence/Migrations/001_initial_schema.sql` creates `integrations` schema but has no `IMigrationModulePlugin` registered.
**Steps:**
1. Verify migration SQL is an embedded resource.
2. Add `IntegrationsMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `integrations`).
3. Add `using` for the persistence assembly type.
4. Update module count.
Completion criteria:
- [x] `IntegrationsMigrationModulePlugin` registered and discoverable
- [x] `stella system migrate Integrations` works
---
### V4-05 - Register Signer (KeyManagement) with MigrationModuleRegistry (MEDIUM priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
`src/Attestor/__Libraries/StellaOps.Signer.KeyManagement/Migrations/001_initial_schema.sql` creates `signer` schema. The `Attestor` module plugin is registered with schema `proofchain`, but the `signer` schema is a separate concern managed by a different library.
**Steps:**
1. Verify migration SQL is an embedded resource.
2. Add `SignerMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `signer`).
3. Add `using` for `StellaOps.Signer.KeyManagement` assembly type.
4. Update module count.
Completion criteria:
- [x] `SignerMigrationModulePlugin` registered
- [x] `signer` schema created by central runner
---
### V4-06 - Register IssuerDirectory with MigrationModuleRegistry (MEDIUM priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
`src/Authority/__Libraries/StellaOps.IssuerDirectory.Persistence/Migrations/001_initial_schema.sql` creates `issuer` schema. The `Authority` module plugin is registered with schema `authority`, but `issuer` is separate.
**Steps:**
1. Verify migration SQL is an embedded resource.
2. Add `IssuerDirectoryMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `issuer`).
3. Add `using` for `StellaOps.IssuerDirectory.Persistence` assembly type.
4. Update module count.
Completion criteria:
- [x] `IssuerDirectoryMigrationModulePlugin` registered
- [x] `issuer` schema created by central runner
---
### V4-07 - Register Workflow with MigrationModuleRegistry (MEDIUM priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
`src/Workflow/__Libraries/StellaOps.Workflow.DataStore.PostgreSQL/Migrations/001_initial_schema.sql` creates `workflow` schema but has no plugin.
**Steps:**
1. Verify migration SQL is an embedded resource.
2. Add `WorkflowMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `workflow`).
3. Add `using` for the Workflow persistence assembly type.
4. Update module count.
Completion criteria:
- [x] `WorkflowMigrationModulePlugin` registered
- [x] `workflow` schema created by central runner
---
### V4-08 - Register PacksRegistry with MigrationModuleRegistry (MEDIUM priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
PacksRegistry repositories in `src/JobEngine/StellaOps.PacksRegistry.__Libraries/StellaOps.PacksRegistry.Persistence/Postgres/Repositories/` (6 files) all self-create the `packs` schema via `EnsureTableAsync()`. There is also a migration file `src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Infrastructure/migrations/009_packs_registry.sql` that creates this schema.
**Steps:**
1. Consolidate the `packs` schema DDL into a proper migration file under the PacksRegistry persistence library.
2. Embed as resource.
3. Add `PacksRegistryMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `packs`).
4. Remove all 6 `EnsureTableAsync()` methods and `_tableInitialized` fields from the repository classes.
5. Update module count.
Completion criteria:
- [x] `packs` schema created by migration runner
- [x] All 6 inline `EnsureTable*` methods removed
- [x] `PacksRegistryMigrationModulePlugin` registered
---
### V4-09 - Register OpsMemory with MigrationModuleRegistry (LOW priority)
Status: DONE
Dependency: none
Owners: Developer (backend)
Task description:
OpsMemory uses the `opsmemory` schema (referenced in `PostgresOpsMemoryStore.cs` queries like `INSERT INTO opsmemory.decisions`). Its migration SQL lives outside the module at `devops/database/migrations/V20260108__opsmemory_advisoryai_schema.sql` -- a legacy location that the central runner does not discover.
**Steps:**
1. Move/copy the migration SQL into the OpsMemory library as an embedded resource.
2. Add `OpsMemoryMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `opsmemory`).
3. Add `using` for the OpsMemory assembly type.
4. Update test fixtures that currently load migration SQL from the filesystem path.
5. Update module count.
Completion criteria:
- [x] `opsmemory` schema created by central migration runner
- [x] Legacy devops migration file no longer the only source of truth
- [x] Test fixtures updated
---
### V4-10 - Audit and remove remaining inline EnsureTable patterns (LOW priority)
Status: DONE
Dependency: V4-01 through V4-08
Owners: Developer (backend)
Task description:
After the above tasks, audit remaining `EnsureTable` callers that may not have been addressed:
**Known remaining EnsureTable callers (may already be covered by registered modules):**
- `src/Signals/__Libraries/StellaOps.Signals.Persistence/Postgres/Repositories/` (6 files) -- Signals IS registered. Remove inline `CREATE SCHEMA IF NOT EXISTS signals;` from these repositories since the central runner handles schema creation.
- `src/AirGap/__Libraries/StellaOps.AirGap.Persistence/Postgres/Repositories/` (4 files) -- AirGap IS registered. Remove inline schema creation.
- `src/SbomService/__Libraries/StellaOps.SbomService.Persistence/Postgres/Repositories/` (8 files) -- SbomLineage IS registered. Remove inline `CREATE SCHEMA IF NOT EXISTS sbom;`.
- `src/Router/__Libraries/StellaOps.Messaging.Transport.Postgres/` (2 files) -- uses dynamic schema from `_connectionFactory.Schema`. Evaluate if this needs registration or is intentionally dynamic.
- `src/__Libraries/StellaOps.HybridLogicalClock/PostgresHlcStateStore.cs` -- uses configurable `_schema`. Evaluate.
- `src/Concelier/StellaOps.Excititor.WebService/Services/PostgresGraphOverlayStore.cs` -- Excititor IS registered. Remove inline DDL.
- `src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/PostgresKnowledgeSearchStore.cs` -- AdvisoryAI IS registered. Remove inline `CREATE SCHEMA IF NOT EXISTS advisoryai;`.
- `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/BinaryIndexMigrationRunner.cs` -- BinaryIndex IS registered. Remove inline schema creation.
- `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Storage/Rekor/PostgresRekorCheckpointStore.cs` -- creates `attestor` schema inline. Evaluate if this should be a separate plugin or folded into Attestor plugin.
For each: remove the inline `CREATE SCHEMA IF NOT EXISTS` since the central migration runner now owns schema creation. Keep `CREATE TABLE IF NOT EXISTS` as a defensive fallback only if there is a race condition risk; otherwise remove entirely.
Completion criteria:
- [x] All inline `CREATE SCHEMA IF NOT EXISTS` in registered modules removed
- [x] No `EnsureTable` patterns that duplicate central migration runner work
- [x] Build and all tests pass
---
### V4-11 - Update module count test and registry documentation (CLEANUP)
Status: DONE
Dependency: V4-01 through V4-09
Owners: Developer (backend)
Task description:
After all new plugins are registered:
1. Update `MigrationCommandHandlersTests.Registry_Has_All_Modules()` -- currently asserts `28`. Planned count = 28 + 9 new plugins (RiskEngine, Replay, ExportCenter, Integrations, Signer, IssuerDirectory, Workflow, PacksRegistry, OpsMemory) = **37**; the final registered count landed at **36** (see completion criteria below).
2. Update `MigrationModuleRegistryTests.Modules_Populated_With_All_Postgres_Modules()` -- add assertions for all new modules.
3. Update `SystemCommandBuilderTests` if it has a hardcoded module name list.
Completion criteria:
- [x] All test assertions reflect the new module count (36 plugins; MigrationCommandHandlersTests already asserts 36; MigrationModuleRegistryTests already has assertions for all 36 modules)
- [x] `stella system migrate --list` shows all modules
- [x] No test failures (pre-existing Signer assembly reference issue in CLI test project is unrelated to V4-10/V4-11)
---
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-04-08 | Sprint created with detailed task analysis for Violations 3 and 4. | Planning |
| 2026-04-08 | V4-01 through V4-04 implemented: RiskEngine, Replay, ExportCenter, Integrations registered with MigrationModuleRegistry. Inline EnsureTable removed from RiskEngine and Replay. Test count updated to 36. All builds pass. | Developer |
| 2026-04-08 | V3-01 DONE: Changed FindingsLedger schema from `public` to `findings` across factory, DbContext, migration plugin, all 12 SQL migrations (schema-qualified tables/types/indexes, CREATE SCHEMA, SET search_path), and added test assertion. Build verified. | Developer |
| 2026-04-08 | V4-05 through V4-09 DONE: Registered Signer, IssuerDirectory, Workflow, PacksRegistry, OpsMemory with MigrationModuleRegistry. Created consolidated migration SQL for PacksRegistry (from 009_packs_registry.sql + 6 inline EnsureTable DDLs). Copied OpsMemory DDL from devops/ to library. Removed all 6 EnsureTable methods from PacksRegistry repositories. Added EmbeddedResource to PacksRegistry and OpsMemory csproj files. All builds pass (0 warnings, 0 errors). | Developer |
| 2026-04-08 | V4-10 DONE: Removed redundant inline `CREATE SCHEMA IF NOT EXISTS` from 16 files across registered modules: Signals (6 repos), SbomService (8 repos), AdvisoryAI (KnowledgeSearchStore), BinaryIndex (MigrationRunner), Attestor (RekorCheckpointStore). AirGap EnsureTable methods only check table existence (no schema creation) -- already clean. Concelier Excititor only has `CREATE TABLE IF NOT EXISTS` -- already clean. Router, HLC, ExportCenter, PluginRegistry kept as-is (dynamic/standalone). All 5 affected modules build with 0 errors. | Developer |
| 2026-04-08 | V4-11 DONE: Test assertions already at 36 (updated by V4-01 through V4-09 work). MigrationCommandHandlersTests asserts 36, MigrationModuleRegistryTests has per-module assertions for all 36 plugins. No changes needed. | Developer |
## Decisions & Risks
- **Fresh DB only**: All changes assume fresh DB setup (volume delete + rebuild). No online migration path needed for existing deployments since we are pre-GA.
- **Compiled model (V3-01)**: The EF Core compiled model delegates schema to `OnModelCreating`, so changing `DefaultSchemaName` propagates automatically. If the compiled model bakes in schema names at generation time, it must be regenerated. Verify by building and running.
- **ENUM types in schema (V3-01)**: PostgreSQL ENUMs cannot be easily moved between schemas. Since we are on fresh DB, we create them in the `findings` schema from the start. The `search_path` must include `findings` for queries that reference enum values without schema qualification.
- **Dual migration runners (V4-03)**: ExportCenter has its own runner with checksum validation. Registering with the central runner means migrations run via both paths. Short-term this is fine (idempotent SQL). Long-term, deprecate the standalone runner.
- **Dynamic schemas (V4-10)**: Router messaging and HLC use configurable schemas. These are intentionally dynamic and may not need registry entries. Evaluate during implementation.
- **scripts schema (Scheduler)**: The `scripts` schema is created by `004_create_scripts_schema.sql` inside the Scheduler persistence library, which IS registered. No separate plugin needed -- it is already covered.
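The dual-runner note above leans on the migrations being idempotent. A minimal sketch of the shape that stays safe when executed by both runners -- the table and index names are illustrative, and the `exportcenter` schema is assumed to already exist because the central runner owns schema creation:

```sql
-- Safe under repeated execution via both the standalone ExportCenter runner
-- and the central migration runner: every statement is a no-op on re-run.
CREATE TABLE IF NOT EXISTS exportcenter.jobs (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status      text NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS ix_jobs_status ON exportcenter.jobs (status);
```

Non-idempotent statements (plain `CREATE TABLE`, `ALTER TABLE ... ADD COLUMN` without guards) would fail on the second runner, which is why the standalone runner should be deprecated rather than left in place long-term.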
## Next Checkpoints
- V3-01 + V4-01 through V4-09 complete: all schemas governed by MigrationModuleRegistry.
- V4-10 complete: no inline schema creation duplicates central runner.
- V4-11 complete: test coverage confirms full registry.
- Final: fresh DB `docker compose down -v && docker compose up` boots with all schemas created by central runner.

# Sprint 20260409-001 -- Local Container Rebuild, Integrations, and Sources
## Topic & Scope
- Rebuild the Stella Ops local container install from a clean compose-owned state so the stack can be reprovisioned deterministically.
- Recreate the supported local integration lane used by the repo docs and E2E coverage: core stack, QA fixtures, and real third-party services.
- Re-enable and verify the advisory source catalog after the rebuild, capturing any blockers or degraded paths encountered during convergence.
- Working directory: `devops/compose`.
- Expected evidence: compose teardown/recreate logs, healthy container status, integration API verification, advisory source check results, and recorded struggles in this sprint.
## Dependencies & Concurrency
- Required docs: `docs/INSTALL_GUIDE.md`, `docs/quickstart.md`, `devops/compose/README.md`, `docs/modules/platform/architecture-overview.md`, `docs/modules/integrations/architecture.md`.
- This is an operator-style sprint; tasks are sequential because the reset wipes the environment that later tasks depend on.
- The rebuild is scoped to Stella-owned Docker compose resources only. Unrelated Docker containers, images, networks, and volumes must not be touched.
- Cross-module edits are allowed only for bootstrap blockers discovered during the rebuild, scoped to `src/Attestor/**`, `src/JobEngine/**`, `src/Integrations/**`, `docs/integrations/**`, and `docs/modules/integrations/**`.
## Documentation Prerequisites
- `docs/INSTALL_GUIDE.md`
- `docs/quickstart.md`
- `devops/compose/README.md`
- `src/Integrations/README.md`
- `src/Concelier/StellaOps.Concelier.WebService/Extensions/SourceManagementEndpointExtensions.cs`
- `src/Integrations/StellaOps.Integrations.WebService/IntegrationEndpoints.cs`
## Delivery Tracker
### LOCAL-REBUILD-001 - Wipe Stella local compose state and bootstrap from scratch
Status: DONE
Dependency: none
Owners: Developer / Ops Integrator
Task description:
- Stop the Stella local compose lanes, remove Stella-owned containers and persistent volumes, recreate the required Docker networks, and bring the documented local stack back with the repo-supported scripts and compose files.
- Preserve repo configuration unless a documented bootstrap precondition is broken. If `.env` or hosts entries require repair, do the minimal corrective action and record it.
Completion criteria:
- [x] Stella-owned compose services are stopped and removed cleanly before rebuild.
- [x] Stella-owned persistent volumes required for a scratch bootstrap are recreated from empty state.
- [x] Core stack is running again with healthy status for required services.
- [x] Any bootstrap deviation from the documented path is logged in `Decisions & Risks`.
### LOCAL-REBUILD-002 - Recreate the local integration provider lane
Status: DONE
Dependency: LOCAL-REBUILD-001
Owners: Developer / Ops Integrator
Task description:
- Start the QA integration fixtures and the real third-party integration compose lane supported by the repo.
- Register the local integrations that the stack can actively exercise after bootstrap, using the live API surface rather than mock-only assumptions.
Completion criteria:
- [x] QA fixtures are running and healthy.
- [x] Real local providers are running for the supported low-idle lane, with optional profiles enabled only when needed.
- [x] Integration catalog entries exist for the local providers that can be verified from this environment.
- [x] Integration test and health endpoints succeed for the registered providers, or failures are logged with concrete cause.
### LOCAL-REBUILD-003 - Enable and verify advisory sources after rebuild
Status: DONE
Dependency: LOCAL-REBUILD-002
Owners: Developer / Ops Integrator
Task description:
- Re-run the advisory source auto-configuration flow on the rebuilt stack, confirm the catalog is populated, and verify the resulting source health state.
- Capture any source families that remain unhealthy or require unavailable credentials/fixtures.
Completion criteria:
- [x] Advisory source catalog is reachable on the rebuilt stack.
- [x] Bulk source check runs to completion.
- [x] Healthy sources are enabled after the check.
- [x] Any unhealthy or skipped sources are recorded with exact failure details.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-04-09 | Sprint created for a Stella-only local wipe, rebuild, integration reprovisioning, and advisory source verification. | Developer |
| 2026-04-09 | Wiped Stella-owned Docker state, recreated `stellaops` / `stellaops_frontdoor`, rebuilt blocked images (`platform`, `scheduler-web`), and bootstrapped the stack with the setup script plus manual network recovery. | Developer |
| 2026-04-09 | Registered the full local provider lane through `integrations-web`, provisioned Vault-backed GitLab credentials, enabled the heavy GitLab registry profile, and converged the live integration catalog to 13/13 active providers. | Developer |
| 2026-04-09 | Ran `POST http://127.1.0.9/api/v1/advisory-sources/check` against the live source catalog and confirmed 74/74 advisory sources healthy and enabled. | Developer |
| 2026-04-09 | Updated local integration docs to document GitLab registry auth (`authref://vault/gitlab#registry-basic`) and the Bearer-challenge registry flow implemented in the Docker registry connector. | Developer |
## Decisions & Risks
- Decision: destructive cleanup is limited to Stella-owned Docker compose resources (`stellaops-*` containers, compose-owned volumes, and Stella compose networks) because the user asked to rebuild the local Stella installation, not wipe unrelated Docker state.
- Decision: use the repo-documented compose lanes as the authority for what counts as "all integrations" locally, then register every provider the local stack can actually exercise from this machine.
- Decision: bootstrap blocker fixes were applied in owning modules when the clean rebuild exposed real defects: duplicate publish outputs in Attestor persistence packages, stale script wiring in `scheduler-web`, and missing Bearer-challenge handling in the Docker registry connector.
- Decision: integration docs were updated in [../../devops/compose/README.md](../../devops/compose/README.md), [../integrations/LOCAL_SERVICES.md](../integrations/LOCAL_SERVICES.md), and [../modules/integrations/architecture.md](../modules/integrations/architecture.md) to match the live GitLab registry auth flow used by the rebuilt environment.
- Audit: the user-requested source revalidation used `POST http://127.1.0.9/api/v1/advisory-sources/check` against the runtime catalog exposed by `GET http://127.1.0.9/api/v1/advisory-sources/catalog`; purpose: re-enable and verify all advisory sources after the wipe. Result: 74/74 healthy and enabled.
- Risk: `scripts/setup.ps1` still misparses comments in `hosts.stellaops.local`, treating comment words as missing host aliases, and it still does not recreate the external `stellaops` network required by the split compose lanes.
- Risk: the local machine could not update `C:\Windows\System32\drivers\etc\hosts` because the session did not have elevation; Docker-network aliases were sufficient for inter-container traffic, but host-based friendly names remain an operator follow-up.
- Risk: all compose files currently share the Docker Compose project name `compose`, so `docker compose ... ps` and `up/down` calls emit cross-file orphan noise and make service-scoped status harder to trust.
- Risk: fresh-volume bootstrap still leaves core services outside this sprint's scope unhealthy: `router-gateway` rejects duplicate `/platform` routes, `findings-ledger-web` crashes because `findings.ledger_projection_offsets` is missing, `timeline-web` restart-loops in startup migration handling, and `graph-api` / `scheduler-web` remain unhealthy. The requested integration and advisory-source lanes are usable, but a full-stack fresh install is not yet fully converged.
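One way to address the shared project-name risk noted above, assuming Compose v2's top-level `name:` element -- the file and service names here are illustrative:

```yaml
# Give each compose lane its own project name so `docker compose ps` and
# `up`/`down` stop reporting cross-file orphan noise under the shared
# default project name "compose".
name: stellaops-sm-remote

services:
  sm-remote:
    image: stellaops/sm-remote:latest
```

Equivalently, `docker compose -p stellaops-sm-remote ...` or `COMPOSE_PROJECT_NAME` would scope the commands without editing the files, at the cost of every operator having to remember the flag.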
## Next Checkpoints
- Fix the remaining fresh-volume core blockers (`router-gateway`, `findings-ledger-web`, `timeline-web`, `graph-api`, `scheduler-web`) and rerun the bootstrap smoke.
- Repair `scripts/setup.ps1` host-alias parsing and external-network recreation so the documented path works without manual intervention.
- Re-run the clean install after those blockers land and archive this sprint once full-stack convergence is proven.