feat(infra-postgres): detect explicit transaction control in migrations
Adds MigrationSqlTransactionClassifier to recognize migration SQL that opens its own transactions (BEGIN/COMMIT/ROLLBACK) so MigrationRunner can skip wrapping those files in an outer transaction. StartupMigrationHost now surfaces a MigrationCategory indicator for runtime-aligned bootstrap. Test harness extended with an explicit-transaction fixture and execution scenario coverage. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,221 @@
|
||||
# Sprint 20260408-001 -- Crypto Provider Picker (UI + Backend)
|
||||
|
||||
## Topic & Scope
|
||||
- Admin-facing UI panel for discovering, monitoring, and selecting crypto providers per tenant.
|
||||
- Platform service backend endpoints for provider health probing and tenant preference persistence.
|
||||
- Integrates with existing `ICryptoProviderRegistry` (in `src/__Libraries/StellaOps.Cryptography/`) to respect tenant-level provider selection at runtime.
|
||||
- Working directory: `src/Web/StellaOps.Web` (Angular UI), `src/Platform/` (backend API).
|
||||
- Expected evidence: UI component rendering, API integration tests, DB migration for tenant crypto preferences.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on crypto provider compose refactor (smremote extracted from main compose; crypto overlays renamed to `docker-compose.crypto-provider.*.yml`). **DONE** as of 2026-04-08.
|
||||
- No upstream sprint blockers. Can run in parallel with other UI or backend work that does not touch Platform settings or Cryptography libraries.
|
||||
- The `docker-compose.sm-remote.yml` (standalone HSM provider) remains unchanged; this sprint only concerns provider *discovery* and *selection*, not provider lifecycle management.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `devops/compose/README.md` -- updated Crypto Provider Overlays section (contains health endpoints and compose commands).
|
||||
- `src/__Libraries/StellaOps.Cryptography/CryptoProviderRegistry.cs` -- current registry implementation with preferred-order resolution.
|
||||
- `src/__Libraries/StellaOps.Cryptography/CryptoProvider.cs` -- `ICryptoProvider` interface contract.
|
||||
- `docs/security/crypto-profile-configuration.md` -- current crypto profile configuration docs.
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### CP-001 - Provider health probe API endpoint
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Backend Developer
|
||||
|
||||
Task description:
|
||||
Add a Platform admin API endpoint `GET /api/v1/admin/crypto-providers/health` that probes each known crypto provider's health endpoint and returns aggregated status.
|
||||
|
||||
Known provider health endpoints:
|
||||
- SmRemote (router microservice): `http://smremote.stella-ops.local:8080/health` (internal router mesh)
|
||||
- SM Remote (standalone HSM): `http://localhost:56080/status` (or configured `SM_REMOTE_PORT`)
|
||||
- CryptoPro CSP: `http://cryptopro-csp:8080/health`
|
||||
- Crypto Simulator: `http://sim-crypto:8080/keys`
|
||||
|
||||
Response schema (JSON):
|
||||
```json
|
||||
{
|
||||
"providers": [
|
||||
{
|
||||
"id": "smremote",
|
||||
"name": "SmRemote (SM2/SM3/SM4)",
|
||||
"status": "running",
|
||||
"healthEndpoint": "http://smremote.stella-ops.local:8080/health",
|
||||
"responseTimeMs": 12,
|
||||
"composeOverlay": "docker-compose.crypto-provider.smremote.yml",
|
||||
"startCommand": "docker compose -f docker-compose.stella-ops.yml -f docker-compose.crypto-provider.smremote.yml up -d smremote"
|
||||
},
|
||||
{
|
||||
"id": "cryptopro",
|
||||
"name": "CryptoPro CSP (GOST)",
|
||||
"status": "unreachable",
|
||||
"healthEndpoint": "http://cryptopro-csp:8080/health",
|
||||
"responseTimeMs": null,
|
||||
"composeOverlay": "docker-compose.crypto-provider.cryptopro.yml",
|
||||
"startCommand": "CRYPTOPRO_ACCEPT_EULA=1 docker compose -f docker-compose.stella-ops.yml -f docker-compose.crypto-provider.cryptopro.yml up -d cryptopro-csp"
|
||||
},
|
||||
{
|
||||
"id": "crypto-sim",
|
||||
"name": "Crypto Simulator (dev/test)",
|
||||
"status": "stopped",
|
||||
"healthEndpoint": "http://sim-crypto:8080/keys",
|
||||
"responseTimeMs": null,
|
||||
"composeOverlay": "docker-compose.crypto-provider.crypto-sim.yml",
|
||||
"startCommand": "docker compose -f docker-compose.stella-ops.yml -f docker-compose.crypto-provider.crypto-sim.yml up -d sim-crypto"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
The endpoint should use `HttpClient` with a short timeout (5s) to probe each provider. Status values: `running`, `stopped`, `unreachable`, `degraded`.
|
||||
|
||||
Provider definitions should be stored in configuration (appsettings or DB), not hardcoded, so that custom providers can be registered.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `GET /api/v1/admin/crypto-providers/health` returns JSON with status for all configured providers
|
||||
- [x] Unreachable providers return `unreachable` status (not 500)
|
||||
- [x] Response includes `startCommand` with the correct compose overlay filename
|
||||
- [x] Endpoint is admin-only (requires `ops.admin` or `crypto:admin` scope)
|
||||
- [ ] Unit test covering probe timeout and mixed healthy/unhealthy scenarios
|
||||
|
||||
### CP-002 - Tenant crypto provider preference API + DB table
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Backend Developer
|
||||
|
||||
Task description:
|
||||
Add a database table and API endpoints for storing per-tenant crypto provider preferences.
|
||||
|
||||
Database table (`platform` schema):
|
||||
```sql
|
||||
CREATE TABLE platform.tenant_crypto_preferences (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
tenant_id UUID NOT NULL REFERENCES shared.tenants(id),
|
||||
provider_id VARCHAR(100) NOT NULL,
|
||||
algorithm_scope VARCHAR(100) NOT NULL DEFAULT '*',
|
||||
priority INT NOT NULL DEFAULT 0,
|
||||
is_active BOOLEAN NOT NULL DEFAULT true,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||
UNIQUE (tenant_id, provider_id, algorithm_scope)
|
||||
);
|
||||
```
|
||||
|
||||
API endpoints:
|
||||
- `GET /api/v1/admin/crypto-providers/preferences` -- list current tenant's provider preferences
|
||||
- `PUT /api/v1/admin/crypto-providers/preferences` -- update provider selection (body: `{ providerId, algorithmScope, priority, isActive }`)
|
||||
- `DELETE /api/v1/admin/crypto-providers/preferences/{id}` -- remove a preference
|
||||
|
||||
The preference should feed into `CryptoProviderRegistry` via `CryptoRegistryProfiles`, allowing per-tenant override of the `preferredProviderOrder`.
|
||||
|
||||
Completion criteria:
|
||||
- [x] SQL migration file added as embedded resource in Platform persistence library
|
||||
- [x] Auto-migration on startup (per repo-wide rule 2.7)
|
||||
- [x] CRUD endpoints work and are admin-scoped
|
||||
- [x] Preferences are tenant-isolated (multi-tenant safe)
|
||||
- [ ] Integration test: set preference, resolve provider, confirm correct provider is selected
|
||||
|
||||
### CP-003 - Angular crypto provider dashboard panel
|
||||
Status: DONE
|
||||
Dependency: CP-001
|
||||
Owners: Frontend Developer
|
||||
|
||||
Task description:
|
||||
Add a "Crypto Providers" panel in the Platform Settings area of the Angular UI. Location: under the existing settings navigation, accessible at `/settings/crypto-providers`.
|
||||
|
||||
Panel layout:
|
||||
1. **Provider List** -- Table/card grid showing each provider with:
|
||||
- Provider name and icon/badge (SM, GOST, SIM, etc.)
|
||||
- Status indicator: green dot (Running), red dot (Stopped/Unreachable)
|
||||
- Health response time (if running)
|
||||
- Last checked timestamp
|
||||
2. **Start Instructions** -- When a provider is stopped or unreachable, show a collapsible section with:
|
||||
- The exact `docker compose` command to start it (from the API response `startCommand`)
|
||||
- Copy-to-clipboard button
|
||||
3. **Refresh Button** -- Re-probe all providers on demand
|
||||
4. **Auto-refresh** -- Poll every 30 seconds when the panel is visible (use `interval` with `switchMap` and `takeUntilDestroyed`)
|
||||
|
||||
Angular implementation:
|
||||
- Component: `src/Web/StellaOps.Web/src/app/features/settings/crypto-providers/`
|
||||
- Service: `CryptoProviderService` calling `GET /api/v1/admin/crypto-providers/health`
|
||||
- Route: Add to settings routing module
|
||||
- Use existing StellaOps design system components (cards, status badges, tables)
|
||||
|
||||
Completion criteria:
|
||||
- [x] Panel renders provider list with live status from API
|
||||
- [x] Stopped providers show start command with copy button
|
||||
- [x] Auto-refresh works and stops when navigating away
|
||||
- [x] Panel is accessible only to admin users
|
||||
- [x] Responsive layout (works on tablet and desktop)
|
||||
|
||||
### CP-004 - Active provider selection UI
|
||||
Status: DONE
|
||||
Dependency: CP-002, CP-003
|
||||
Owners: Frontend Developer
|
||||
|
||||
Task description:
|
||||
Extend the crypto provider dashboard panel (CP-003) with an "Active Provider" selection feature.
|
||||
|
||||
UI additions:
|
||||
1. **Active Badge** -- Show which provider is currently selected for the tenant
|
||||
2. **Select Button** -- On each running provider card, show "Set as Active" button
|
||||
3. **Algorithm Scope** -- Optional: dropdown to scope the selection to specific algorithm families (SM, GOST, default, etc.) or apply globally (`*`)
|
||||
4. **Confirmation Dialog** -- Before changing the active provider, show a confirmation dialog explaining the impact on signing operations
|
||||
5. **Priority Ordering** -- Drag-and-drop reordering of provider priority (maps to `CryptoRegistryProfiles.preferredProviderOrder`)
|
||||
|
||||
The selection calls `PUT /api/v1/admin/crypto-providers/preferences` and updates the UI immediately.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Admin can select active provider per tenant
|
||||
- [x] Selection persists across page refreshes (reads from API)
|
||||
- [x] Cannot select a provider that is currently stopped/unreachable (button disabled with tooltip)
|
||||
- [x] Confirmation dialog shown before changing provider
|
||||
- [x] Priority ordering updates the registry's preferred order
|
||||
|
||||
### CP-005 - ICryptoProviderRegistry tenant-aware resolution
|
||||
Status: DONE
|
||||
Dependency: CP-002
|
||||
Owners: Backend Developer
|
||||
|
||||
Task description:
|
||||
Extend `CryptoProviderRegistry` (or introduce a decorator/wrapper) to consult tenant preferences when resolving providers. Currently the registry uses a static `preferredProviderOrder` set at startup. The enhancement should:
|
||||
|
||||
1. Accept an `IStellaOpsTenantAccessor` to determine the current tenant
|
||||
2. Query the `platform.tenant_crypto_preferences` table (cached, TTL ~60s) for the tenant's preferred order
|
||||
3. Override `CryptoRegistryProfiles.ActiveProfile` based on the tenant preference
|
||||
4. Fall back to the default preferred order if no tenant preference exists
|
||||
|
||||
This must not break existing non-tenant-aware code paths (CLI, background workers). The tenant-aware resolution should be opt-in via DI registration.
|
||||
|
||||
Key files:
|
||||
- `src/__Libraries/StellaOps.Cryptography/CryptoProviderRegistry.cs`
|
||||
- `src/__Libraries/StellaOps.Cryptography/CryptoRegistryProfiles.cs` (if exists)
|
||||
|
||||
Completion criteria:
|
||||
- [x] Tenant-aware resolution works when tenant accessor is available
|
||||
- [x] Falls back to default when no tenant context or no preferences set
|
||||
- [x] Cached query (not per-request DB hit)
|
||||
- [x] Existing non-tenant code paths unaffected (unit tests pass)
|
||||
- [x] Integration test: two tenants with different preferences resolve different providers
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-04-08 | Sprint created. Crypto provider compose overlays refactored (smremote extracted, files renamed). | Planning |
|
||||
| 2026-04-08 | CP-001 implemented: CryptoProviderHealthService + CryptoProviderAdminEndpoints (health probe). CP-002 implemented: SQL migration 062, ICryptoProviderPreferenceStore with Postgres and InMemory impls, CRUD endpoints. Both wired in Program.cs. Build verified (0 errors, 0 warnings). Unit tests pending. | Developer |
|
||||
| 2026-04-08 | Compose refactoring confirmed complete: smremote extracted (Slot 31 comment in main compose), overlay files already named `docker-compose.crypto-provider.*.yml`, README Crypto Provider Overlays section up to date, INSTALL_GUIDE.md references correct filenames. No old-named files to rename. | Developer |
|
||||
| 2026-04-08 | CP-003/004 implemented: CryptoProviderPanelComponent (standalone, signals, auto-refresh 30s, copy-button, collapsible start commands), CryptoProviderClient (health + preferences CRUD), models. Route at `/setup/crypto-providers`, Setup overview card added. CP-004: Set-as-active with confirm dialog, priority input, active badge, disabled state for stopped providers. Build verified (0 errors). CP-005 is backend-only, not in scope for this FE pass. | Frontend Developer |
|
||||
| 2026-04-08 | CP-005 implemented: TenantAwareCryptoProviderRegistry decorator wrapping ICryptoProviderRegistry, ITenantCryptoPreferenceProvider interface, DI extension AddTenantAwareCryptoResolution, PlatformCryptoPreferenceProvider bridging to ICryptoProviderPreferenceStore. 14 unit tests added (all pass): multi-tenant isolation, cache verification, fallback on missing tenant context, explicit-preferred-overrides-tenant, hasher/signer resolution. Build verified (0 errors). | Developer |
|
||||
|
||||
## Decisions & Risks
|
||||
- **Risk: Provider health probing from within containers.** The Platform service runs inside the Docker network; it can reach other containers by DNS alias but cannot determine whether a compose overlay is loaded vs. the container is unhealthy. Mitigation: treat any non-200 response (including DNS resolution failure) as `unreachable`.
|
||||
- **Risk: Tenant preference caching coherence.** If an admin changes the active provider, other service instances may use stale cache. Mitigation: use Valkey pub/sub to broadcast preference changes, or accept eventual consistency with a 60s TTL.
|
||||
- **Decision: `docker-compose.sm-remote.yml` (standalone HSM overlay) remains unchanged.** Only the router-integrated `smremote` microservice was extracted from the main compose. The standalone SM Remote overlay serves a different purpose (HSM integration with build context).
|
||||
- **Decision: Provider definitions should be configurable, not hardcoded.** Seed the initial set from appsettings but allow DB overrides so operators can add custom providers.
|
||||
|
||||
## Next Checkpoints
|
||||
- CP-001 + CP-002 backend endpoints ready for frontend integration.
|
||||
- CP-003 initial panel rendering with mock data for design review.
|
||||
- CP-004 + CP-005 integration testing with live crypto providers.
|
||||
@@ -0,0 +1,370 @@
|
||||
# Sprint 20260408-004 -- DB Schema Violations Cleanup
|
||||
|
||||
## Topic & Scope
|
||||
- Fix two database schema violations that undermine Stella Ops' multi-schema isolation and central migration governance.
|
||||
- **Violation 3**: FindingsLedger uses PostgreSQL `public` schema (collision risk with 60+ other services).
|
||||
- **Violation 4**: 13+ schemas self-create via inline `EnsureTable`/`CREATE SCHEMA IF NOT EXISTS` instead of registering with `MigrationModuleRegistry`.
|
||||
- Working directory: cross-module (see per-task paths below).
|
||||
- Expected evidence: builds pass, CLI `stella system migrate` covers new modules, all existing tests pass with schema changes.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- No upstream sprint dependencies; these are standalone DB hygiene fixes.
|
||||
- Violation 3 and Violation 4 can be worked in parallel by separate implementers.
|
||||
- Violation 4 tasks are independent of each other and can be parallelized per-service.
|
||||
- Fresh DB assumption: no live data migration needed. We amend existing migration DDL directly.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModuleRegistry.cs` -- registry contract.
|
||||
- `src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModulePlugins.cs` -- existing plugin examples.
|
||||
- `src/Platform/__Libraries/StellaOps.Platform.Database/IMigrationModulePlugin.cs` -- plugin interface.
|
||||
- `src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModulePluginDiscovery.cs` -- auto-discovery mechanism.
|
||||
- Pattern reference: any existing plugin (e.g., `ScannerMigrationModulePlugin`, `PolicyMigrationModulePlugin`).
|
||||
|
||||
---
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
---
|
||||
|
||||
### V3-01 - FindingsLedger: Change DefaultSchemaName from `public` to `findings`
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
The `FindingsLedgerDbContextFactory.DefaultSchemaName` is currently `"public"`, meaning all 11 FindingsLedger tables (ledger_events, ledger_merkle_roots, findings_projection, finding_history, triage_actions, ledger_projection_offsets, airgap_imports, ledger_attestation_pointers, orchestrator_exports, ledger_snapshots, observations) plus 2 custom ENUM types (ledger_event_type, ledger_action_type) land in the PostgreSQL default schema. This risks name collisions and violates the project's per-module schema isolation pattern.
|
||||
|
||||
**What to change:**
|
||||
|
||||
1. **`src/Findings/StellaOps.Findings.Ledger/Infrastructure/Postgres/FindingsLedgerDbContextFactory.cs`** (line 10):
|
||||
- Change `public const string DefaultSchemaName = "public";` to `public const string DefaultSchemaName = "findings";`
|
||||
- The branching logic on line 21 (`if (string.Equals(normalizedSchema, DefaultSchemaName, ...))`) uses the compiled model only when schema matches default. After the change, the compiled model will be used when schema = `"findings"`. This is correct behavior.
|
||||
|
||||
2. **`src/Findings/StellaOps.Findings.Ledger/EfCore/Context/FindingsLedgerDbContext.cs`** (line 14):
|
||||
- Change the fallback from `"public"` to `"findings"`:
|
||||
```csharp
|
||||
_schemaName = string.IsNullOrWhiteSpace(schemaName)
|
||||
? "findings"
|
||||
: schemaName.Trim();
|
||||
```
|
||||
|
||||
3. **All 12 migration SQL files** in `src/Findings/StellaOps.Findings.Ledger/migrations/`:
|
||||
- Prepend `CREATE SCHEMA IF NOT EXISTS findings;` to `001_initial.sql` (before `BEGIN;` or as first statement inside the transaction).
|
||||
- For `001_initial.sql`: prefix all `CREATE TABLE`, `CREATE INDEX`, `PARTITION OF` statements with `findings.` schema qualifier. Tables: `ledger_events`, `ledger_events_default`, `ledger_merkle_roots`, `ledger_merkle_roots_default`, `findings_projection`, `findings_projection_default`, `finding_history`, `finding_history_default`, `triage_actions`, `triage_actions_default`.
|
||||
- Move the two `CREATE TYPE` statements into the `findings` schema: `CREATE TYPE findings.ledger_event_type ...`, `CREATE TYPE findings.ledger_action_type ...`.
|
||||
- For `002_*` through `009_*`: qualify all table references with `findings.` prefix. Currently these use unqualified table names (e.g., `ALTER TABLE ledger_events` becomes `ALTER TABLE findings.ledger_events`).
|
||||
- For `007_enable_rls.sql`: the `findings_ledger_app` schema for RLS functions is already namespaced and fine. Just qualify the table references in `ALTER TABLE` and `CREATE POLICY` statements.
|
||||
- Set `search_path` at the top of each migration: `SET search_path TO findings, public;` so that type references resolve correctly.
|
||||
|
||||
4. **`src/Platform/__Libraries/StellaOps.Platform.Database/MigrationModulePlugins.cs`** (line 285):
|
||||
- Change `schemaName: "public"` to `schemaName: "findings"` in `FindingsLedgerMigrationModulePlugin`.
|
||||
|
||||
5. **Regenerate the EF Core compiled model** (if the project uses `dotnet ef dbcontext optimize`):
|
||||
- The compiled model in `src/Findings/StellaOps.Findings.Ledger/EfCore/CompiledModels/` may need regeneration if it bakes in schema names. Current inspection shows it delegates to `OnModelCreating`, so it should pick up the change automatically. Verify by building.
|
||||
|
||||
6. **Update tests**: The `MigrationModuleRegistryTests.cs` assertion for FindingsLedger should now expect `schemaName == "findings"`. Add an explicit assertion:
|
||||
```csharp
|
||||
Assert.Contains(modules, m => m.Name == "FindingsLedger" && m.SchemaName == "findings");
|
||||
```
|
||||
|
||||
**Hardcoded `public.` SQL queries:** Grep confirms zero hardcoded `public.` prefixed SQL in the Findings codebase. All repository code passes `FindingsLedgerDbContextFactory.DefaultSchemaName` to the factory, so changing the constant propagates everywhere.
|
||||
|
||||
**Impact on RLS:** The `findings_ledger_app` schema for RLS helper functions already has its own namespace and will not collide. The `ALTER TABLE` statements in `007_enable_rls.sql` just need the `findings.` prefix.
|
||||
|
||||
Completion criteria:
|
||||
- [ ] `FindingsLedgerDbContextFactory.DefaultSchemaName` == `"findings"`
|
||||
- [ ] `FindingsLedgerDbContext` constructor default == `"findings"`
|
||||
- [ ] `FindingsLedgerMigrationModulePlugin.schemaName` == `"findings"`
|
||||
- [ ] All 12 migration SQL files use `findings.` qualified table names
|
||||
- [ ] `001_initial.sql` includes `CREATE SCHEMA IF NOT EXISTS findings;`
|
||||
- [ ] ENUM types created in `findings` schema
|
||||
- [ ] Fresh DB: `stella system migrate FindingsLedger` creates tables under `findings` schema
|
||||
- [ ] All FindingsLedger tests pass
|
||||
- [ ] MigrationModuleRegistryTests updated to assert `findings` schema
|
||||
|
||||
---
|
||||
|
||||
### V4-01 - Register RiskEngine with MigrationModuleRegistry (HIGH priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
`PostgresRiskScoreResultStore` in `src/Findings/__Libraries/StellaOps.RiskEngine.Infrastructure/Stores/` self-creates the `riskengine` schema and `riskengine.risk_score_results` table via inline `EnsureTableAsync()` (lines 130-164). This bypasses the migration registry entirely.
|
||||
|
||||
**Steps:**
|
||||
1. Create a migration SQL file: `src/Findings/__Libraries/StellaOps.RiskEngine.Infrastructure/Migrations/001_initial_schema.sql` with the DDL currently inline in `EnsureTableAsync()`.
|
||||
2. Mark the SQL file as an embedded resource in the `.csproj`.
|
||||
3. Add `RiskEngineMigrationModulePlugin` to `MigrationModulePlugins.cs`:
|
||||
```csharp
|
||||
public sealed class RiskEngineMigrationModulePlugin : IMigrationModulePlugin
|
||||
{
|
||||
public MigrationModuleInfo Module { get; } = new(
|
||||
name: "RiskEngine",
|
||||
schemaName: "riskengine",
|
||||
migrationsAssembly: typeof(PostgresRiskScoreResultStore).Assembly);
|
||||
}
|
||||
```
|
||||
4. Remove the `EnsureTableAsync()` and `EnsureTable()` methods and the `_initGate`/`_tableInitialized` fields from `PostgresRiskScoreResultStore`. Remove all calls to these methods.
|
||||
5. Update test assertion: `MigrationCommandHandlersTests` expects 28 modules -- bump to 36 (all V4 sprint plugins added).
|
||||
6. Add `using StellaOps.RiskEngine.Infrastructure.Stores;` to `MigrationModulePlugins.cs`.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `riskengine` schema created by migration runner, not inline code
|
||||
- [x] `EnsureTable*` methods removed from `PostgresRiskScoreResultStore`
|
||||
- [x] `RiskEngineMigrationModulePlugin` registered and discoverable
|
||||
- [x] `stella system migrate RiskEngine` works
|
||||
- [x] Build passes, existing RiskEngine tests pass
|
||||
|
||||
---
|
||||
|
||||
### V4-02 - Register Replay with MigrationModuleRegistry (MEDIUM priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
`ReplayFeedSnapshotStores.cs` in `src/Replay/StellaOps.Replay.WebService/` self-creates the `replay` schema and `replay.feed_snapshot_index` table via inline `EnsureTableAsync()` (line 152).
|
||||
|
||||
**Steps:**
|
||||
1. Create `src/Replay/StellaOps.Replay.WebService/Migrations/001_initial_schema.sql` with the DDL.
|
||||
2. Embed as resource in `.csproj`.
|
||||
3. Add `ReplayMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `replay`).
|
||||
4. Remove inline `EnsureTableAsync()` from `ReplayFeedSnapshotStores.cs`.
|
||||
5. Add the `using` for the Replay assembly type to `MigrationModulePlugins.cs`.
|
||||
6. Update module count in test.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `replay` schema created by migration runner
|
||||
- [x] Inline DDL removed
|
||||
- [x] Plugin registered
|
||||
|
||||
---
|
||||
|
||||
### V4-03 - Register ExportCenter with MigrationModuleRegistry (MEDIUM priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
`ExportCenterMigrationRunner` in `src/ExportCenter/StellaOps.ExportCenter/StellaOps.ExportCenter.Infrastructure/Db/` runs its own migration system with a custom `export_center.export_schema_version` table and `EnsureSchemaAsync()`. It has proper SQL migration files but uses a standalone runner instead of the central one.
|
||||
|
||||
**Steps:**
|
||||
1. The SQL migrations already exist under `.../Db/Migrations/`. Verify they are embedded resources.
|
||||
2. Add `ExportCenterMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `export_center`).
|
||||
3. Keep the `ExportCenterMigrationRunner` temporarily (it has checksum validation) but ensure the central runner can also apply these migrations. Long-term, converge to central runner only.
|
||||
4. Add the `using` for the assembly type.
|
||||
5. Update module count.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `ExportCenterMigrationModulePlugin` registered
|
||||
- [x] Central migration runner can discover and apply ExportCenter migrations
|
||||
- [x] Existing ExportCenter functionality unaffected
|
||||
|
||||
---
|
||||
|
||||
### V4-04 - Register Integrations with MigrationModuleRegistry (MEDIUM priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
`src/Integrations/__Libraries/StellaOps.Integrations.Persistence/Migrations/001_initial_schema.sql` creates `integrations` schema but has no `IMigrationModulePlugin` registered.
|
||||
|
||||
**Steps:**
|
||||
1. Verify migration SQL is an embedded resource.
|
||||
2. Add `IntegrationsMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `integrations`).
|
||||
3. Add `using` for the persistence assembly type.
|
||||
4. Update module count.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `IntegrationsMigrationModulePlugin` registered and discoverable
|
||||
- [x] `stella system migrate Integrations` works
|
||||
|
||||
---
|
||||
|
||||
### V4-05 - Register Signer (KeyManagement) with MigrationModuleRegistry (MEDIUM priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
`src/Attestor/__Libraries/StellaOps.Signer.KeyManagement/Migrations/001_initial_schema.sql` creates `signer` schema. The `Attestor` module plugin is registered with schema `proofchain`, but the `signer` schema is a separate concern managed by a different library.
|
||||
|
||||
**Steps:**
|
||||
1. Verify migration SQL is an embedded resource.
|
||||
2. Add `SignerMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `signer`).
|
||||
3. Add `using` for `StellaOps.Signer.KeyManagement` assembly type.
|
||||
4. Update module count.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `SignerMigrationModulePlugin` registered
|
||||
- [x] `signer` schema created by central runner
|
||||
|
||||
---
|
||||
|
||||
### V4-06 - Register IssuerDirectory with MigrationModuleRegistry (MEDIUM priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
`src/Authority/__Libraries/StellaOps.IssuerDirectory.Persistence/Migrations/001_initial_schema.sql` creates `issuer` schema. The `Authority` module plugin is registered with schema `authority`, but `issuer` is separate.
|
||||
|
||||
**Steps:**
|
||||
1. Verify migration SQL is an embedded resource.
|
||||
2. Add `IssuerDirectoryMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `issuer`).
|
||||
3. Add `using` for `StellaOps.IssuerDirectory.Persistence` assembly type.
|
||||
4. Update module count.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `IssuerDirectoryMigrationModulePlugin` registered
|
||||
- [x] `issuer` schema created by central runner
|
||||
|
||||
---
|
||||
|
||||
### V4-07 - Register Workflow with MigrationModuleRegistry (MEDIUM priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
`src/Workflow/__Libraries/StellaOps.Workflow.DataStore.PostgreSQL/Migrations/001_initial_schema.sql` creates `workflow` schema but has no plugin.
|
||||
|
||||
**Steps:**
|
||||
1. Verify migration SQL is an embedded resource.
|
||||
2. Add `WorkflowMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `workflow`).
|
||||
3. Add `using` for the Workflow persistence assembly type.
|
||||
4. Update module count.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `WorkflowMigrationModulePlugin` registered
|
||||
- [x] `workflow` schema created by central runner
|
||||
|
||||
---
|
||||
|
||||
### V4-08 - Register PacksRegistry with MigrationModuleRegistry (MEDIUM priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
PacksRegistry repositories in `src/JobEngine/StellaOps.PacksRegistry.__Libraries/StellaOps.PacksRegistry.Persistence/Postgres/Repositories/` (6 files) all self-create the `packs` schema via `EnsureTableAsync()`. There is also a migration file `src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Infrastructure/migrations/009_packs_registry.sql` that creates this schema.
|
||||
|
||||
**Steps:**
|
||||
1. Consolidate the `packs` schema DDL into a proper migration file under the PacksRegistry persistence library.
|
||||
2. Embed as resource.
|
||||
3. Add `PacksRegistryMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `packs`).
|
||||
4. Remove all 6 `EnsureTableAsync()` methods and `_tableInitialized` fields from the repository classes.
|
||||
5. Update module count.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `packs` schema created by migration runner
|
||||
- [x] All 6 inline `EnsureTable*` methods removed
|
||||
- [x] `PacksRegistryMigrationModulePlugin` registered
|
||||
|
||||
---
|
||||
|
||||
### V4-09 - Register OpsMemory with MigrationModuleRegistry (LOW priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
OpsMemory uses the `opsmemory` schema (referenced in `PostgresOpsMemoryStore.cs` queries like `INSERT INTO opsmemory.decisions`). Its migration SQL lives outside the module at `devops/database/migrations/V20260108__opsmemory_advisoryai_schema.sql` -- a legacy location that the central runner does not discover.
|
||||
|
||||
**Steps:**
|
||||
1. Move/copy the migration SQL into the OpsMemory library as an embedded resource.
|
||||
2. Add `OpsMemoryMigrationModulePlugin` to `MigrationModulePlugins.cs` (schema: `opsmemory`).
|
||||
3. Add `using` for the OpsMemory assembly type.
|
||||
4. Update test fixtures that currently load migration SQL from the filesystem path.
|
||||
5. Update module count.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `opsmemory` schema created by central migration runner
|
||||
- [x] Legacy devops migration file no longer the only source of truth
|
||||
- [x] Test fixtures updated
|
||||
|
||||
---
|
||||
|
||||
### V4-10 - Audit and remove remaining inline EnsureTable patterns (LOW priority)
|
||||
|
||||
Status: DONE
|
||||
Dependency: V4-01 through V4-08
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
After the above tasks, audit remaining `EnsureTable` callers that may not have been addressed:
|
||||
|
||||
**Known remaining EnsureTable callers (may already be covered by registered modules):**
|
||||
- `src/Signals/__Libraries/StellaOps.Signals.Persistence/Postgres/Repositories/` (6 files) -- Signals IS registered. Remove inline `CREATE SCHEMA IF NOT EXISTS signals;` from these repositories since the central runner handles schema creation.
|
||||
- `src/AirGap/__Libraries/StellaOps.AirGap.Persistence/Postgres/Repositories/` (4 files) -- AirGap IS registered. Remove inline schema creation.
|
||||
- `src/SbomService/__Libraries/StellaOps.SbomService.Persistence/Postgres/Repositories/` (8 files) -- SbomLineage IS registered. Remove inline `CREATE SCHEMA IF NOT EXISTS sbom;`.
|
||||
- `src/Router/__Libraries/StellaOps.Messaging.Transport.Postgres/` (2 files) -- uses dynamic schema from `_connectionFactory.Schema`. Evaluate if this needs registration or is intentionally dynamic.
|
||||
- `src/__Libraries/StellaOps.HybridLogicalClock/PostgresHlcStateStore.cs` -- uses configurable `_schema`. Evaluate.
|
||||
- `src/Concelier/StellaOps.Excititor.WebService/Services/PostgresGraphOverlayStore.cs` -- Excititor IS registered. Remove inline DDL.
|
||||
- `src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/PostgresKnowledgeSearchStore.cs` -- AdvisoryAI IS registered. Remove inline `CREATE SCHEMA IF NOT EXISTS advisoryai;`.
|
||||
- `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/BinaryIndexMigrationRunner.cs` -- BinaryIndex IS registered. Remove inline schema creation.
|
||||
- `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Storage/Rekor/PostgresRekorCheckpointStore.cs` -- creates `attestor` schema inline. Evaluate if this should be a separate plugin or folded into Attestor plugin.
|
||||
|
||||
For each: remove the inline `CREATE SCHEMA IF NOT EXISTS` since the central migration runner now owns schema creation. Keep `CREATE TABLE IF NOT EXISTS` as a defensive fallback only if there is a race condition risk; otherwise remove entirely.
|
||||
|
||||
Completion criteria:
|
||||
- [x] All inline `CREATE SCHEMA IF NOT EXISTS` in registered modules removed
|
||||
- [x] No `EnsureTable` patterns that duplicate central migration runner work
|
||||
- [x] Build and all tests pass
|
||||
|
||||
---
|
||||
|
||||
### V4-11 - Update module count test and registry documentation (CLEANUP)
|
||||
|
||||
Status: DONE
|
||||
Dependency: V4-01 through V4-09
|
||||
Owners: Developer (backend)
|
||||
Task description:
|
||||
|
||||
After all new plugins are registered:
|
||||
1. Update `MigrationCommandHandlersTests.Registry_Has_All_Modules()` -- currently asserts `28`. New count = 28 + N new plugins (RiskEngine, Replay, ExportCenter, Integrations, Signer, IssuerDirectory, Workflow, PacksRegistry, OpsMemory = 9). New expected count: **37**.
|
||||
2. Update `MigrationModuleRegistryTests.Modules_Populated_With_All_Postgres_Modules()` -- add assertions for all new modules.
|
||||
3. Update `SystemCommandBuilderTests` if it has a hardcoded module name list.
|
||||
|
||||
Completion criteria:
|
||||
- [x] All test assertions reflect the new module count (36 plugins; MigrationCommandHandlersTests already asserts 36; MigrationModuleRegistryTests already has assertions for all 36 modules)
|
||||
- [x] `stella system migrate --list` shows all modules
|
||||
- [x] No test failures (pre-existing Signer assembly reference issue in CLI test project is unrelated to V4-10/V4-11)
|
||||
|
||||
---
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-04-08 | Sprint created with detailed task analysis for Violations 3 and 4. | Planning |
|
||||
| 2026-04-08 | V4-01 through V4-04 implemented: RiskEngine, Replay, ExportCenter, Integrations registered with MigrationModuleRegistry. Inline EnsureTable removed from RiskEngine and Replay. Test count updated to 36. All builds pass. | Developer |
|
||||
| 2026-04-08 | V3-01 DONE: Changed FindingsLedger schema from `public` to `findings` across factory, DbContext, migration plugin, all 12 SQL migrations (schema-qualified tables/types/indexes, CREATE SCHEMA, SET search_path), and added test assertion. Build verified. | Developer |
|
||||
| 2026-04-08 | V4-05 through V4-09 DONE: Registered Signer, IssuerDirectory, Workflow, PacksRegistry, OpsMemory with MigrationModuleRegistry. Created consolidated migration SQL for PacksRegistry (from 009_packs_registry.sql + 6 inline EnsureTable DDLs). Copied OpsMemory DDL from devops/ to library. Removed all 6 EnsureTable methods from PacksRegistry repositories. Added EmbeddedResource to PacksRegistry and OpsMemory csproj files. All builds pass (0 warnings, 0 errors). | Developer |
|
||||
| 2026-04-08 | V4-10 DONE: Removed redundant inline `CREATE SCHEMA IF NOT EXISTS` from 16 files across registered modules: Signals (6 repos), SbomService (8 repos), AdvisoryAI (KnowledgeSearchStore), BinaryIndex (MigrationRunner), Attestor (RekorCheckpointStore). AirGap EnsureTable methods only check table existence (no schema creation) -- already clean. Concelier Excititor only has `CREATE TABLE IF NOT EXISTS` -- already clean. Router, HLC, ExportCenter, PluginRegistry kept as-is (dynamic/standalone). All 5 affected modules build with 0 errors. | Developer |
|
||||
| 2026-04-08 | V4-11 DONE: Test assertions already at 36 (updated by V4-01 through V4-09 work). MigrationCommandHandlersTests asserts 36, MigrationModuleRegistryTests has per-module assertions for all 36 plugins. No changes needed. | Developer |
|
||||
|
||||
## Decisions & Risks
|
||||
- **Fresh DB only**: All changes assume fresh DB setup (volume delete + rebuild). No online migration path needed for existing deployments since we are pre-GA.
|
||||
- **Compiled model (V3-01)**: The EF Core compiled model delegates schema to `OnModelCreating`, so changing `DefaultSchemaName` propagates automatically. If the compiled model bakes in schema names at generation time, it must be regenerated. Verify by building and running.
|
||||
- **ENUM types in schema (V3-01)**: PostgreSQL ENUMs cannot be easily moved between schemas. Since we are on fresh DB, we create them in the `findings` schema from the start. The `search_path` must include `findings` for queries that reference enum values without schema qualification.
|
||||
- **Dual migration runners (V4-03)**: ExportCenter has its own runner with checksum validation. Registering with the central runner means migrations run via both paths. Short-term this is fine (idempotent SQL). Long-term, deprecate the standalone runner.
|
||||
- **Dynamic schemas (V4-10)**: Router messaging and HLC use configurable schemas. These are intentionally dynamic and may not need registry entries. Evaluate during implementation.
|
||||
- **scripts schema (Scheduler)**: The `scripts` schema is created by `004_create_scripts_schema.sql` inside the Scheduler persistence library, which IS registered. No separate plugin needed -- it is already covered.
|
||||
|
||||
## Next Checkpoints
|
||||
- V3-01 + V4-01 through V4-09 complete: all schemas governed by MigrationModuleRegistry.
|
||||
- V4-10 complete: no inline schema creation duplicates central runner.
|
||||
- V4-11 complete: test coverage confirms full registry.
|
||||
- Final: fresh DB `docker compose down -v && docker compose up` boots with all schemas created by central runner.
|
||||
@@ -0,0 +1,91 @@
|
||||
# Sprint 20260409-001 -- Local Container Rebuild, Integrations, and Sources
|
||||
|
||||
## Topic & Scope
|
||||
- Rebuild the Stella Ops local container install from a clean compose-owned state so the stack can be reprovisioned deterministically.
|
||||
- Recreate the supported local integration lane used by the repo docs and E2E coverage: core stack, QA fixtures, and real third-party services.
|
||||
- Re-enable and verify the advisory source catalog after the rebuild, capturing any blockers or degraded paths encountered during convergence.
|
||||
- Working directory: `devops/compose`.
|
||||
- Expected evidence: compose teardown/recreate logs, healthy container status, integration API verification, advisory source check results, and recorded struggles in this sprint.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Required docs: `docs/INSTALL_GUIDE.md`, `docs/quickstart.md`, `devops/compose/README.md`, `docs/modules/platform/architecture-overview.md`, `docs/modules/integrations/architecture.md`.
|
||||
- This is an operator-style sprint; tasks are sequential because the reset wipes the environment that later tasks depend on.
|
||||
- The rebuild is scoped to Stella-owned Docker compose resources only. Unrelated Docker containers, images, networks, and volumes must not be touched.
|
||||
- Cross-module edits are allowed only for bootstrap blockers discovered during the rebuild, scoped to `src/Attestor/**`, `src/JobEngine/**`, `src/Integrations/**`, `docs/integrations/**`, and `docs/modules/integrations/**`.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/INSTALL_GUIDE.md`
|
||||
- `docs/quickstart.md`
|
||||
- `devops/compose/README.md`
|
||||
- `src/Integrations/README.md`
|
||||
- `src/Concelier/StellaOps.Concelier.WebService/Extensions/SourceManagementEndpointExtensions.cs`
|
||||
- `src/Integrations/StellaOps.Integrations.WebService/IntegrationEndpoints.cs`
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### LOCAL-REBUILD-001 - Wipe Stella local compose state and bootstrap from scratch
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer / Ops Integrator
|
||||
Task description:
|
||||
- Stop the Stella local compose lanes, remove Stella-owned containers and persistent volumes, recreate the required Docker networks, and bring the documented local stack back with the repo-supported scripts and compose files.
|
||||
- Preserve repo configuration unless a documented bootstrap precondition is broken. If `.env` or hosts entries require repair, do the minimal corrective action and record it.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Stella-owned compose services are stopped and removed cleanly before rebuild.
|
||||
- [x] Stella-owned persistent volumes required for a scratch bootstrap are recreated from empty state.
|
||||
- [x] Core stack is running again with healthy status for required services.
|
||||
- [x] Any bootstrap deviation from the documented path is logged in `Decisions & Risks`.
|
||||
|
||||
### LOCAL-REBUILD-002 - Recreate the local integration provider lane
|
||||
Status: DONE
|
||||
Dependency: LOCAL-REBUILD-001
|
||||
Owners: Developer / Ops Integrator
|
||||
Task description:
|
||||
- Start the QA integration fixtures and the real third-party integration compose lane supported by the repo.
|
||||
- Register the local integrations that the stack can actively exercise after bootstrap, using the live API surface rather than mock-only assumptions.
|
||||
|
||||
Completion criteria:
|
||||
- [x] QA fixtures are running and healthy.
|
||||
- [x] Real local providers are running for the supported low-idle lane, with optional profiles enabled only when needed.
|
||||
- [x] Integration catalog entries exist for the local providers that can be verified from this environment.
|
||||
- [x] Integration test and health endpoints succeed for the registered providers, or failures are logged with concrete cause.
|
||||
|
||||
### LOCAL-REBUILD-003 - Enable and verify advisory sources after rebuild
|
||||
Status: DONE
|
||||
Dependency: LOCAL-REBUILD-002
|
||||
Owners: Developer / Ops Integrator
|
||||
Task description:
|
||||
- Re-run the advisory source auto-configuration flow on the rebuilt stack, confirm the catalog is populated, and verify the resulting source health state.
|
||||
- Capture any source families that remain unhealthy or require unavailable credentials/fixtures.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Advisory source catalog is reachable on the rebuilt stack.
|
||||
- [x] Bulk source check runs to completion.
|
||||
- [x] Healthy sources are enabled after the check.
|
||||
- [x] Any unhealthy or skipped sources are recorded with exact failure details.
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-04-09 | Sprint created for a Stella-only local wipe, rebuild, integration reprovisioning, and advisory source verification. | Developer |
|
||||
| 2026-04-09 | Wiped Stella-owned Docker state, recreated `stellaops` / `stellaops_frontdoor`, rebuilt blocked images (`platform`, `scheduler-web`), and bootstrapped the stack with the setup script plus manual network recovery. | Developer |
|
||||
| 2026-04-09 | Registered the full local provider lane through `integrations-web`, provisioned Vault-backed GitLab credentials, enabled the heavy GitLab registry profile, and converged the live integration catalog to 13/13 active providers. | Developer |
|
||||
| 2026-04-09 | Ran `POST http://127.1.0.9/api/v1/advisory-sources/check` against the live source catalog and confirmed 74/74 advisory sources healthy and enabled. | Developer |
|
||||
| 2026-04-09 | Updated local integration docs to document GitLab registry auth (`authref://vault/gitlab#registry-basic`) and the Bearer-challenge registry flow implemented in the Docker registry connector. | Developer |
|
||||
|
||||
## Decisions & Risks
|
||||
- Decision: destructive cleanup is limited to Stella-owned Docker compose resources (`stellaops-*` containers, compose-owned volumes, and Stella compose networks) because the user asked to rebuild the local Stella installation, not wipe unrelated Docker state.
|
||||
- Decision: use the repo-documented compose lanes as the authority for what counts as "all integrations" locally, then register every provider the local stack can actually exercise from this machine.
|
||||
- Decision: bootstrap blocker fixes were applied in owning modules when the clean rebuild exposed real defects: duplicate publish outputs in Attestor persistence packages, stale script wiring in `scheduler-web`, and missing Bearer-challenge handling in the Docker registry connector.
|
||||
- Decision: integration docs were updated in [../../devops/compose/README.md](../../devops/compose/README.md), [../integrations/LOCAL_SERVICES.md](../integrations/LOCAL_SERVICES.md), and [../modules/integrations/architecture.md](../modules/integrations/architecture.md) to match the live GitLab registry auth flow used by the rebuilt environment.
|
||||
- Audit: the user-requested source revalidation used `POST http://127.1.0.9/api/v1/advisory-sources/check` against the runtime catalog exposed by `GET http://127.1.0.9/api/v1/advisory-sources/catalog`; purpose: re-enable and verify all advisory sources after the wipe. Result: 74/74 healthy and enabled.
|
||||
- Risk: `scripts/setup.ps1` still misparses comments in `hosts.stellaops.local`, reporting bogus missing aliases such as comment words, and it still does not recreate the external `stellaops` network required by the split compose lanes.
|
||||
- Risk: the local machine could not update `C:\Windows\System32\drivers\etc\hosts` because the session did not have elevation; Docker-network aliases were sufficient for inter-container traffic, but host-based friendly names remain an operator follow-up.
|
||||
- Risk: all compose files currently share the Docker Compose project name `compose`, so `docker compose ... ps` and `up/down` calls emit cross-file orphan noise and make service-scoped status harder to trust.
|
||||
- Risk: fresh-volume bootstrap still leaves unrelated core services unhealthy: `router-gateway` rejects duplicate `/platform` routes, `findings-ledger-web` crashes because `findings.ledger_projection_offsets` is missing, `timeline-web` restart-loops in startup migration handling, and `graph-api` / `scheduler-web` remain unhealthy. The requested integration and advisory-source lanes are usable, but a full-stack fresh install is not yet fully converged.
|
||||
|
||||
## Next Checkpoints
|
||||
- Fix the remaining fresh-volume core blockers (`router-gateway`, `findings-ledger-web`, `timeline-web`, `graph-api`, `scheduler-web`) and rerun the bootstrap smoke.
|
||||
- Repair `scripts/setup.ps1` host-alias parsing and external-network recreation so the documented path works without manual intervention.
|
||||
- Re-run the clean install after those blockers land and archive this sprint once full-stack convergence is proven.
|
||||
@@ -0,0 +1,119 @@
|
||||
# Sprint 20260413-001 - Scratch Setup With Local Integrations
|
||||
|
||||
## Topic & Scope
|
||||
- Rebuild the Stella Ops local environment from Stella-owned runtime state only and prove the documented scratch setup path still converges on this machine.
|
||||
- Bring up the local third-party integration lanes that the repo supports for real local validation, including the optional heavier providers when they are practical to run locally.
|
||||
- Fix only the concrete bootstrap and setup blockers exposed by the scratch run, then sync the setup docs and sprint log with the actual working commands and residual risks.
|
||||
- Working directory: `.`.
|
||||
- Expected evidence: Stella-only cleanup and bootstrap commands, compose health/status output, local integration verification results, and updated setup/docs references for any changed behavior.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Required docs: `docs/INSTALL_GUIDE.md`, `docs/dev/DEV_ENVIRONMENT_SETUP.md`, `devops/compose/README.md`, `docs/integrations/LOCAL_SERVICES.md`, `docs/modules/platform/architecture-overview.md`, `docs/modules/integrations/architecture.md`.
|
||||
- This sprint executes sequentially because the scratch setup and integration registration depend on a single mutable local Docker environment.
|
||||
- Cross-module edits are allowed only for setup blockers found during the scratch run, scoped to `scripts/**`, `devops/**`, `docs/**`, `src/Router/**`, `src/Authority/**`, `src/Platform/**`, `src/ReleaseOrchestrator/**`, `src/Integrations/**`, `src/Concelier/**`, `src/Findings/**`, `src/Timeline/**`, `src/Graph/**`, and `src/JobEngine/**`.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/INSTALL_GUIDE.md`
|
||||
- `docs/dev/DEV_ENVIRONMENT_SETUP.md`
|
||||
- `devops/compose/README.md`
|
||||
- `docs/integrations/LOCAL_SERVICES.md`
|
||||
- `docs/modules/platform/architecture-overview.md`
|
||||
- `docs/modules/integrations/architecture.md`
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### SETUP-001 - Rebuild Stella local stack from scratch on this machine
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer / Ops Integrator
|
||||
Task description:
|
||||
- Remove Stella-owned local Docker runtime state needed for a clean bootstrap, then run the repo-supported scratch setup path on this workstation.
|
||||
- Treat the documented scripts and compose files as the authority. If a blocker appears, capture the exact failure, apply the smallest correct fix, and rerun until the core stack is usable or a real blocker remains.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Stella-owned Docker runtime state is reset without touching unrelated machine resources.
|
||||
- [x] The repo-supported setup path is executed from a clean starting point.
|
||||
- [x] Core stack health, frontdoor reachability, and bootstrap readiness are captured with concrete evidence.
|
||||
- [x] Any setup blocker is either fixed or recorded with exact failure details.
|
||||
|
||||
### SETUP-002 - Bring up and verify the local integrations lane
|
||||
Status: DONE
|
||||
Dependency: SETUP-001
|
||||
Owners: Developer / Ops Integrator
|
||||
Task description:
|
||||
- Start the supported local third-party integration compose lanes and verify the providers that Stella Ops can actually exercise in this environment.
|
||||
- Include the deterministic QA fixtures and the real-provider compose lane, enabling optional heavier profiles only when they can be run locally and verified.
|
||||
|
||||
Completion criteria:
|
||||
- [x] QA integration fixtures are started and checked when needed for success-path validation.
|
||||
- [x] The real local provider lane is running for the supported providers on this machine.
|
||||
- [x] Reachable providers are verified with concrete API/health evidence.
|
||||
- [x] Any provider left disabled or degraded is recorded with the exact reason.
|
||||
|
||||
### SETUP-003 - Sync setup docs and sprint evidence with the proven path
|
||||
Status: DONE
|
||||
Dependency: SETUP-002
|
||||
Owners: Developer / Documentation Author
|
||||
Task description:
|
||||
- Update the installation, environment, or integration docs only where the live scratch run proves the documented path is incomplete or wrong.
|
||||
- Record the final working command sequence, deviations, and remaining risks in this sprint so the setup is auditable and repeatable.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Setup and integration docs match the validated local path.
|
||||
- [x] Decisions, deviations, and remaining risks are logged with linked docs.
|
||||
- [x] The sprint execution log captures the scratch run, fixes, and final verification outcome.
|
||||
|
||||
### SETUP-004 - Converge fresh-volume core services to ready state
|
||||
Status: DONE
|
||||
Dependency: SETUP-003
|
||||
Owners: Developer / Ops Integrator
|
||||
Task description:
|
||||
- Repair the remaining startup blockers in Findings Ledger, Timeline, and Graph so a fresh local PostgreSQL volume converges without manual SQL, CLI migration commands, or degraded background services.
|
||||
- Keep fixes scoped to the concrete startup path: migration wiring, migration categorization/idempotency, and hosted-service lifecycle defects required for `https://stella-ops.local/healthz` to return `ready=true`.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `stellaops-findings-ledger-web` starts cleanly after auto-applying the `findings` schema migrations, including `ledger_projection_offsets`.
|
||||
- [x] `stellaops-timeline-web` starts cleanly on a fresh database without pending manual release migrations blocking startup.
|
||||
- [x] `stellaops-graph-api` remains healthy without restart loops from `GraphChangeStreamProcessor`.
|
||||
- [x] Frontdoor health returns `ready=true` with no missing required microservices.
|
||||
|
||||
### SETUP-005 - Make the full local integrations path repeatable
|
||||
Status: DONE
|
||||
Dependency: SETUP-004
|
||||
Owners: Developer / Documentation Author
|
||||
Task description:
|
||||
- Close the gap between the current successful local state and a repeatable scratch path by automating or documenting the remaining GitLab/bootstrap work needed for all local-ready integrations, including GitLab SCM, CI, and registry.
|
||||
- Reverify the final local integration catalog and advisory-source health after the core platform is green so the sprint closes on the actual end-state.
|
||||
|
||||
Completion criteria:
|
||||
- [x] The scratch path can provision all supported local integrations, including GitLab-backed entries, without undocumented manual steps.
|
||||
- [x] The local integration catalog verifies healthy for every supported local provider on tenant `demo-prod`.
|
||||
- [x] Setup docs and sprint evidence reflect the final converged path and any remaining non-local upstream risks.
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-04-13 | Sprint created for a fresh Stella-only local bootstrap plus local integrations bring-up and verification on this machine. | Developer |
|
||||
| 2026-04-13 | Reset Stella-owned compose state, rebuilt only the missing local images, restored the core stack, and confirmed frontdoor readiness at `https://stella-ops.local` plus router health at `/healthz`. | Developer |
|
||||
| 2026-04-13 | Started QA fixtures plus the local third-party provider lane (`gitea`, `jenkins`, `nexus`, `vault`, `docker-registry`, `minio`, `consul`, `gitlab`, `runtime-host-fixture`) and verified the container health set. | Developer |
|
||||
| 2026-04-13 | Added `scripts/register-local-integrations.ps1`, converged `demo-prod` to 13/13 healthy local-ready integration entries, and recorded the optional GitLab/PAT caveat in `docs/INSTALL_GUIDE.md`, `devops/compose/README.md`, and `docs/integrations/LOCAL_SERVICES.md`. | Developer |
|
||||
| 2026-04-13 | Generated a local GitLab PAT through the live GitLab UI, stored `authref://vault/gitlab#access-token` plus `authref://vault/gitlab#registry-basic` in the dev Vault, re-enabled the GitLab registry surface, and converged `demo-prod` to 16/16 healthy local integration entries including GitLab Server, GitLab CI, and GitLab Container Registry. | Developer |
|
||||
| 2026-04-13 | Fixed fresh-volume startup blockers across Findings, Timeline, Graph, and shared Postgres migration infrastructure: explicit-transaction startup migrations now run without an outer transaction, Graph startup now applies both persistence and API schema migrations, wildcard local URL bindings are normalized for Kestrel, and Findings rollback SQL remains operator-only instead of embedded as a forward startup migration. | Developer |
|
||||
| 2026-04-13 | Rebuilt and redeployed `findings-ledger-web`, `timeline-web`, `graph-api`, and `router-gateway`; verified all four containers healthy and confirmed `https://stella-ops.local/healthz` returns `ready=true` with no missing required microservices. | Developer |
|
||||
| 2026-04-13 | Revalidated the tenant integration catalog after core recovery: `demo-prod` reports 16/16 configured local integrations healthy (Consul, Docker Registry, eBPF runtime host, Gitea, GitHub App fixture, GitLab CI, GitLab Container Registry, GitLab Server, Harbor fixture, Jenkins, MinIO, Nexus Registry, NVD Mirror, OSV Mirror, StellaOps Mirror, Vault). | Developer |
|
||||
| 2026-04-13 | Re-ran `powershell -ExecutionPolicy Bypass -File scripts/register-local-integrations.ps1 -Tenant demo-prod -IncludeGitLab -IncludeGitLabRegistry`; all 16 local integrations converged as `existing` and revalidated healthy, confirming the helper is idempotent on the green stack. | Developer |
|
||||
| 2026-04-13 | Re-ran `POST http://127.1.0.9/api/v1/advisory-sources/check` with tenant `demo-prod`; result: 74/74 healthy, 0 failed, all enabled. | Developer |
|
||||
|
||||
## Decisions & Risks
|
||||
- Decision: destructive operations in this sprint are limited to Stella-owned Docker compose containers, volumes, and networks required for the local scratch bootstrap.
|
||||
- Risk: the repo is already in a dirty working tree, including active Concelier and Web changes outside this sprint. Those edits will not be reverted and may influence the observed scratch-setup behavior.
|
||||
- Decision: the repeatable local-registration default is the 13 turnkey providers that pass without extra secret material on a fresh machine: Harbor fixture, Docker Registry, Nexus, GitHub App fixture, Gitea, Jenkins, Vault, Consul, eBPF runtime-host fixture, MinIO, and the three Concelier-backed feed mirror providers.
|
||||
- Decision: GitLab Server, GitLab CI, and GitLab Container Registry remain opt-in from the helper script because the live topology requires Vault-backed PAT material (`authref://vault/gitlab#access-token` / `authref://vault/gitlab#registry-basic`) to avoid `401 Unauthorized`.
|
||||
- Decision: this workstation now has local-only GitLab bootstrap secrets in the dev Vault under `secret/gitlab`, which were generated solely to exercise the GitLab SCM, CI, and registry providers end-to-end during the scratch setup.
|
||||
- Decision: `src/__Libraries/StellaOps.Infrastructure.Postgres/Migrations/StartupMigrationHost.cs` and `MigrationRunner.cs` now detect startup scripts that manage their own `BEGIN`/`COMMIT` lifecycle and execute them outside an outer `NpgsqlTransaction`, which keeps fresh-volume bootstraps deterministic for embedded SQL migrations that use explicit transactions.
|
||||
- Decision: Findings RLS rollback stays available as `migrations/007_enable_rls_rollback.sql`, but it is excluded from embedded resources so startup only applies forward migrations. The operator-facing rollback path is now documented in `docs/modules/findings-ledger/operations/rls-migration.md` and `docs/contracts/findings-ledger-rls.md`.
|
||||
- Risk: the machine-local GitLab PAT and Vault secrets are suitable only for this workstation's dev bootstrap path and will need regeneration if GitLab tokens are revoked or rotated.
|
||||
|
||||
## Next Checkpoints
|
||||
- Re-run `powershell -ExecutionPolicy Bypass -File scripts/register-local-integrations.ps1 -Tenant demo-prod -IncludeGitLab -IncludeGitLabRegistry` after future local compose wipes or secret rotation.
|
||||
- Re-run `POST http://127.1.0.9/api/v1/advisory-sources/check` when external upstream availability needs to be revalidated.
|
||||
Reference in New Issue
Block a user