# Migration Conventions This document defines the standard conventions for database migrations in StellaOps. ## File Naming All migration files must follow the naming pattern: ``` NNN_description.sql # Standard migrations (001-099 startup, 100+ release) SNNN_description.sql # Seed migrations (reference data) DMNNN_description.sql # Data migrations (batched, background) ``` Where: - `NNN` = 3-digit zero-padded number (001, 002, ..., 099, 100) - `description` = lowercase letters, numbers, and underscores only - Extension = `.sql` ### Examples ``` 001_create_tables.sql ✓ Correct (startup) 002_add_indexes.sql ✓ Correct (startup) 100_drop_legacy_column.sql ✓ Correct (release) S001_seed_default_roles.sql ✓ Correct (seed) DM001_backfill_tenant_ids.sql ✓ Correct (data migration) 0059_scans_table.sql ✗ Wrong (4-digit prefix) V1102_001__schema.sql ✗ Wrong (Flyway-style) 20251214_AddSchema.sql ✗ Wrong (EF Core timestamp) create-tables.sql ✗ Wrong (no numeric prefix) ``` ### Migration Categories | Category | Prefix | Execution | Breaking Changes | |----------|--------|-----------|------------------| | Startup | 001-099 | Automatic at application boot | Never | | Release | 100-199 | Manual via CLI before deployment | Yes | | Seed | S001-S999 | Automatic at application boot | Never | | Data | DM001-DM999 | Background job via CLI | Varies | ## File Organization Each module should place migrations in a standard location: ``` src//__Libraries/StellaOps..Storage.Postgres/Migrations/ ``` Alternative paths for specialized modules: ``` src//__Libraries/StellaOps..Persistence/Migrations/ src//StellaOps./StellaOps..Infrastructure/Db/Migrations/ ``` ### Embedded Resources Migration files must be embedded in the assembly for air-gap compatibility: ```xml ``` ## WebService Ownership Each database schema is owned by exactly one WebService: | Schema | Owner WebService | Notes | |--------|------------------|-------| | `auth` | Authority.WebService | | | `vuln` | Concelier.WebService | | | `vex` | Excititor.WebService | | | `policy` | Policy.Gateway | | | `scheduler` | Scheduler.WebService | | | `notify` | Notify.WebService | | | `scanner` | Scanner.WebService | Also owns `binaries` | | `proofchain` | Attestor.WebService | | | `signer` | Signer.WebService | | | `signals` | Signals | Standalone service | | `evidence` | EvidenceLocker.WebService | | | `export` | ExportCenter.WebService | | | `issuer` | IssuerDirectory.WebService | | | `orchestrator` | Orchestrator.WebService | | | `findings` | Findings.Ledger.WebService | | | `vexhub` | VexHub.WebService | | | `unknowns` | Policy.Gateway | Shared ownership | ### Registration Pattern Each WebService registers its migrations in `Program.cs` or a startup extension: ```csharp // Example: Scheduler.WebService/Program.cs builder.Services.AddStartupMigrations( schemaName: "scheduler", moduleName: "Scheduler", migrationsAssembly: typeof(StellaOps.Scheduler.Storage.Postgres.Marker).Assembly, connectionStringSelector: options => options.Postgres.ConnectionString); ``` ### No Shared Migrations Migrations must NOT be shared across WebServices: - Each WebService controls its own schema exclusively - Cross-schema dependencies use conditional DDL (`IF EXISTS`) - API calls are used for runtime cross-module data access ## Migration Content Guidelines ### Startup Migrations (001-099) - Must complete in under 60 seconds - Must be idempotent (use `IF NOT EXISTS`, `CREATE OR REPLACE`) - Must NOT drop tables, columns, or constraints - Must NOT TRUNCATE data - Must NOT add NOT NULL columns without defaults ```sql -- Good: Idempotent table creation CREATE TABLE IF NOT EXISTS scanner.scans ( scan_id UUID PRIMARY KEY, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- Good: Safe index creation CREATE INDEX IF NOT EXISTS idx_scans_created ON scanner.scans(created_at DESC); -- Bad: Non-idempotent (will fail if exists) CREATE TABLE scanner.scans (...); -- Bad: Breaking change in startup migration ALTER TABLE scanner.scans DROP COLUMN legacy_field; ``` ### Release Migrations (100-199) - May contain breaking changes - Require manual execution before deployment - Should be tested in staging environment first - Block application startup if pending ```sql -- Release migration for breaking change -- 100_remove_legacy_columns.sql -- Step 1: Add replacement column (could be startup migration) ALTER TABLE scanner.scans ADD COLUMN IF NOT EXISTS new_field TEXT; -- Step 2: Migrate data (requires release migration) UPDATE scanner.scans SET new_field = legacy_field WHERE new_field IS NULL; -- Step 3: Drop old column (breaking) ALTER TABLE scanner.scans DROP COLUMN IF EXISTS legacy_field; ``` ### Seed Migrations (S001-S999) - Insert reference data that rarely changes - Use `ON CONFLICT DO NOTHING` for idempotency - Run automatically at startup ```sql -- S001_seed_vulnerability_severities.sql INSERT INTO policy.severities (severity_id, name, score_min, score_max) VALUES ('critical', 'Critical', 9.0, 10.0), ('high', 'High', 7.0, 8.9), ('medium', 'Medium', 4.0, 6.9), ('low', 'Low', 0.1, 3.9), ('none', 'None', 0.0, 0.0) ON CONFLICT (severity_id) DO NOTHING; ``` ### Data Migrations (DM001-DM999) - Long-running data transformations - Execute in batches to avoid locks - Run via CLI or background job ```sql -- DM001_backfill_tenant_ids.sql -- Backfill tenant_id for existing records (batched) DO $$ DECLARE batch_size INT := 1000; updated INT := 1; BEGIN WHILE updated > 0 LOOP WITH batch AS ( SELECT scan_id FROM scanner.scans WHERE tenant_id IS NULL LIMIT batch_size FOR UPDATE SKIP LOCKED ) UPDATE scanner.scans s SET tenant_id = '00000000-0000-0000-0000-000000000000'::UUID FROM batch b WHERE s.scan_id = b.scan_id; GET DIAGNOSTICS updated = ROW_COUNT; COMMIT; PERFORM pg_sleep(0.1); -- Rate limit END LOOP; END $$; ``` ## Validation Migrations are validated at startup and in CI: 1. **Duplicate prefix detection**: Multiple files with same number → Error 2. **Naming convention check**: Non-standard naming → Warning 3. **Checksum validation**: Modified applied migrations → Error 4. **Dangerous operation check**: DROP in startup migration → Error ### CI Validation Run migration validation in CI pipelines: ```bash .gitea/scripts/validate/validate-migrations.sh ``` Or with strict mode (fail on warnings): ```bash .gitea/scripts/validate/validate-migrations.sh --strict ``` ## Rollback Strategy StellaOps uses a **forward-only migration strategy**: - Migrations cannot be rolled back automatically - To fix a bad migration, create a new migration that undoes the changes - In emergencies, restore from database backup ### Emergency Rollback 1. Restore database from backup (pre-migration) 2. Deploy previous application version 3. Analyze and fix the migration issue 4. Create corrective migration 5. Deploy new version with fix ## Testing ### Integration Tests Use `PostgresIntegrationFixture` with Testcontainers: ```csharp [Collection(ScannerPostgresCollection.Name)] public class ScanRepositoryTests : MigrationTestBase { public ScanRepositoryTests(ScannerPostgresFixture fixture) : base(fixture) { } [Fact] public async Task Should_Insert_Scan() { // Database is clean (truncated) before each test await ExecuteSqlAsync("INSERT INTO scanner.scans ..."); } } ``` ### Migration Tests Test that migrations apply correctly: ```csharp [Fact] public async Task All_Migrations_Apply_Without_Error() { var status = await _fixture.Fixture.GetMigrationStatusAsync(); Assert.Empty(status.ChecksumErrors); Assert.True(status.IsUpToDate); } ``` ## Monitoring OpenTelemetry metrics for migrations: | Metric | Type | Description | |--------|------|-------------| | `stellaops.migrations.applied.total` | Counter | Migrations applied | | `stellaops.migrations.failed.total` | Counter | Migration failures | | `stellaops.migrations.duration.seconds` | Histogram | Execution duration | | `stellaops.migrations.lock.wait.seconds` | Histogram | Lock wait time | | `stellaops.migrations.pending.count` | UpDownCounter | Pending migrations | Traces are emitted with activity source: `StellaOps.Infrastructure.Postgres.Migrations`