Files
git.stella-ops.org/docs/db/MIGRATION_CONVENTIONS.md

8.5 KiB

Migration Conventions

This document defines the standard conventions for database migrations in StellaOps.

File Naming

All migration files must follow the naming pattern:

NNN_description.sql     # Standard migrations (001-099 startup, 100+ release)
SNNN_description.sql    # Seed migrations (reference data)
DMNNN_description.sql   # Data migrations (batched, background)

Where:

  • NNN = 3-digit zero-padded number (001, 002, ..., 099, 100)
  • description = lowercase letters, numbers, and underscores only
  • Extension = .sql

Examples

001_create_tables.sql           ✓ Correct (startup)
002_add_indexes.sql             ✓ Correct (startup)
100_drop_legacy_column.sql      ✓ Correct (release)
S001_seed_default_roles.sql     ✓ Correct (seed)
DM001_backfill_tenant_ids.sql   ✓ Correct (data migration)

0059_scans_table.sql            ✗ Wrong (4-digit prefix)
V1102_001__schema.sql           ✗ Wrong (Flyway-style)
20251214_AddSchema.sql          ✗ Wrong (EF Core timestamp)
create-tables.sql               ✗ Wrong (no numeric prefix)

Migration Categories

Category Prefix Execution Breaking Changes
Startup 001-099 Automatic at application boot Never
Release 100-199 Manual via CLI before deployment Yes
Seed S001-S999 Automatic at application boot Never
Data DM001-DM999 Background job via CLI Varies

File Organization

Each module should place migrations in a standard location:

src/<Module>/__Libraries/StellaOps.<Module>.Storage.Postgres/Migrations/

Alternative paths for specialized modules:

src/<Module>/__Libraries/StellaOps.<Module>.Persistence/Migrations/
src/<Module>/StellaOps.<Module>/StellaOps.<Module>.Infrastructure/Db/Migrations/

Embedded Resources

Migration files must be embedded in the assembly for air-gap compatibility:

<ItemGroup>
  <EmbeddedResource Include="Migrations\*.sql" />
</ItemGroup>

WebService Ownership

Each database schema is owned by exactly one WebService:

Schema Owner WebService Notes
auth Authority.WebService
vuln Concelier.WebService
vex Excititor.WebService
policy Policy.Gateway
scheduler Scheduler.WebService
notify Notify.WebService
scanner Scanner.WebService Also owns binaries
proofchain Attestor.WebService
signer Signer.WebService
signals Signals Standalone service
evidence EvidenceLocker.WebService
export ExportCenter.WebService
issuer IssuerDirectory.WebService
orchestrator Orchestrator.WebService
findings Findings.Ledger.WebService
vexhub VexHub.WebService
unknowns Policy.Gateway Shared ownership

Registration Pattern

Each WebService registers its migrations in Program.cs or a startup extension:

// Example: Scheduler.WebService/Program.cs
builder.Services.AddStartupMigrations<SchedulerOptions>(
    schemaName: "scheduler",
    moduleName: "Scheduler",
    migrationsAssembly: typeof(StellaOps.Scheduler.Storage.Postgres.Marker).Assembly,
    connectionStringSelector: options => options.Postgres.ConnectionString);

No Shared Migrations

Migrations must NOT be shared across WebServices:

  • Each WebService controls its own schema exclusively
  • Cross-schema dependencies use conditional DDL (IF EXISTS)
  • API calls are used for runtime cross-module data access

Migration Content Guidelines

Startup Migrations (001-099)

  • Must complete in under 60 seconds
  • Must be idempotent (use IF NOT EXISTS, CREATE OR REPLACE)
  • Must NOT drop tables, columns, or constraints
  • Must NOT TRUNCATE data
  • Must NOT add NOT NULL columns without defaults
-- Good: Idempotent table creation
CREATE TABLE IF NOT EXISTS scanner.scans (
    scan_id UUID PRIMARY KEY,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Good: Safe index creation
CREATE INDEX IF NOT EXISTS idx_scans_created
    ON scanner.scans(created_at DESC);

-- Bad: Non-idempotent (will fail if exists)
CREATE TABLE scanner.scans (...);

-- Bad: Breaking change in startup migration
ALTER TABLE scanner.scans DROP COLUMN legacy_field;

Release Migrations (100-199)

  • May contain breaking changes
  • Require manual execution before deployment
  • Should be tested in staging environment first
  • Block application startup if pending
-- Release migration for breaking change
-- 100_remove_legacy_columns.sql

-- Step 1: Add replacement column (could be startup migration)
ALTER TABLE scanner.scans
    ADD COLUMN IF NOT EXISTS new_field TEXT;

-- Step 2: Migrate data (requires release migration)
UPDATE scanner.scans
SET new_field = legacy_field
WHERE new_field IS NULL;

-- Step 3: Drop old column (breaking)
ALTER TABLE scanner.scans
DROP COLUMN IF EXISTS legacy_field;

Seed Migrations (S001-S999)

  • Insert reference data that rarely changes
  • Use ON CONFLICT DO NOTHING for idempotency
  • Run automatically at startup
-- S001_seed_vulnerability_severities.sql
INSERT INTO policy.severities (severity_id, name, score_min, score_max)
VALUES
    ('critical', 'Critical', 9.0, 10.0),
    ('high', 'High', 7.0, 8.9),
    ('medium', 'Medium', 4.0, 6.9),
    ('low', 'Low', 0.1, 3.9),
    ('none', 'None', 0.0, 0.0)
ON CONFLICT (severity_id) DO NOTHING;

Data Migrations (DM001-DM999)

  • Long-running data transformations
  • Execute in batches to avoid locks
  • Run via CLI or background job
-- DM001_backfill_tenant_ids.sql
-- Backfill tenant_id for existing records (batched)

DO $$
DECLARE
    batch_size INT := 1000;
    updated INT := 1;
BEGIN
    WHILE updated > 0 LOOP
        WITH batch AS (
            SELECT scan_id
            FROM scanner.scans
            WHERE tenant_id IS NULL
            LIMIT batch_size
            FOR UPDATE SKIP LOCKED
        )
        UPDATE scanner.scans s
        SET tenant_id = '00000000-0000-0000-0000-000000000000'::UUID
        FROM batch b
        WHERE s.scan_id = b.scan_id;

        GET DIAGNOSTICS updated = ROW_COUNT;
        COMMIT;
        PERFORM pg_sleep(0.1); -- Rate limit
    END LOOP;
END $$;

Validation

Migrations are validated at startup and in CI:

  1. Duplicate prefix detection: Multiple files with same number → Error
  2. Naming convention check: Non-standard naming → Warning
  3. Checksum validation: Modified applied migrations → Error
  4. Dangerous operation check: DROP in startup migration → Error

CI Validation

Run migration validation in CI pipelines:

.gitea/scripts/validate/validate-migrations.sh

Or with strict mode (fail on warnings):

.gitea/scripts/validate/validate-migrations.sh --strict

Rollback Strategy

StellaOps uses a forward-only migration strategy:

  • Migrations cannot be rolled back automatically
  • To fix a bad migration, create a new migration that undoes the changes
  • In emergencies, restore from database backup

Emergency Rollback

  1. Restore database from backup (pre-migration)
  2. Deploy previous application version
  3. Analyze and fix the migration issue
  4. Create corrective migration
  5. Deploy new version with fix

Testing

Integration Tests

Use PostgresIntegrationFixture with Testcontainers:

[Collection(ScannerPostgresCollection.Name)]
public class ScanRepositoryTests : MigrationTestBase<ScannerPostgresFixture>
{
    public ScanRepositoryTests(ScannerPostgresFixture fixture) : base(fixture) { }

    [Fact]
    public async Task Should_Insert_Scan()
    {
        // Database is clean (truncated) before each test
        await ExecuteSqlAsync("INSERT INTO scanner.scans ...");
    }
}

Migration Tests

Test that migrations apply correctly:

[Fact]
public async Task All_Migrations_Apply_Without_Error()
{
    var status = await _fixture.Fixture.GetMigrationStatusAsync();
    Assert.Empty(status.ChecksumErrors);
    Assert.True(status.IsUpToDate);
}

Monitoring

OpenTelemetry metrics for migrations:

Metric Type Description
stellaops.migrations.applied.total Counter Migrations applied
stellaops.migrations.failed.total Counter Migration failures
stellaops.migrations.duration.seconds Histogram Execution duration
stellaops.migrations.lock.wait.seconds Histogram Lock wait time
stellaops.migrations.pending.count UpDownCounter Pending migrations

Traces are emitted with activity source: StellaOps.Infrastructure.Postgres.Migrations