Files
git.stella-ops.org/docs/db/MIGRATION_CONVENTIONS.md

306 lines
8.5 KiB
Markdown

# Migration Conventions
This document defines the standard conventions for database migrations in StellaOps.
## File Naming
All migration files must follow the naming pattern:
```
NNN_description.sql # Standard migrations (001-099 startup, 100+ release)
SNNN_description.sql # Seed migrations (reference data)
DMNNN_description.sql # Data migrations (batched, background)
```
Where:
- `NNN` = 3-digit zero-padded number (001, 002, ..., 099, 100)
- `description` = lowercase letters, numbers, and underscores only
- Extension = `.sql`
### Examples
```
001_create_tables.sql ✓ Correct (startup)
002_add_indexes.sql ✓ Correct (startup)
100_drop_legacy_column.sql ✓ Correct (release)
S001_seed_default_roles.sql ✓ Correct (seed)
DM001_backfill_tenant_ids.sql ✓ Correct (data migration)
0059_scans_table.sql ✗ Wrong (4-digit prefix)
V1102_001__schema.sql ✗ Wrong (Flyway-style)
20251214_AddSchema.sql ✗ Wrong (EF Core timestamp)
create-tables.sql ✗ Wrong (no numeric prefix)
```
### Migration Categories
| Category | Prefix | Execution | Breaking Changes |
|----------|--------|-----------|------------------|
| Startup | 001-099 | Automatic at application boot | Never |
| Release | 100-199 | Manual via CLI before deployment | Yes |
| Seed | S001-S999 | Automatic at application boot | Never |
| Data | DM001-DM999 | Background job via CLI | Varies |
## File Organization
Each module should place migrations in a standard location:
```
src/<Module>/__Libraries/StellaOps.<Module>.Storage.Postgres/Migrations/
```
Alternative paths for specialized modules:
```
src/<Module>/__Libraries/StellaOps.<Module>.Persistence/Migrations/
src/<Module>/StellaOps.<Module>/StellaOps.<Module>.Infrastructure/Db/Migrations/
```
### Embedded Resources
Migration files must be embedded in the assembly for air-gap compatibility:
```xml
<ItemGroup>
<EmbeddedResource Include="Migrations\*.sql" />
</ItemGroup>
```
## WebService Ownership
Each database schema is owned by exactly one WebService:
| Schema | Owner WebService | Notes |
|--------|------------------|-------|
| `auth` | Authority.WebService | |
| `vuln` | Concelier.WebService | |
| `vex` | Excititor.WebService | |
| `policy` | Policy.Gateway | |
| `scheduler` | Scheduler.WebService | |
| `notify` | Notify.WebService | |
| `scanner` | Scanner.WebService | Also owns `binaries` |
| `proofchain` | Attestor.WebService | |
| `signer` | Signer.WebService | |
| `signals` | Signals | Standalone service |
| `evidence` | EvidenceLocker.WebService | |
| `export` | ExportCenter.WebService | |
| `issuer` | IssuerDirectory.WebService | |
| `orchestrator` | Orchestrator.WebService | |
| `findings` | Findings.Ledger.WebService | |
| `vexhub` | VexHub.WebService | |
| `unknowns` | Policy.Gateway | Shared ownership |
### Registration Pattern
Each WebService registers its migrations in `Program.cs` or a startup extension:
```csharp
// Example: Scheduler.WebService/Program.cs
builder.Services.AddStartupMigrations<SchedulerOptions>(
schemaName: "scheduler",
moduleName: "Scheduler",
migrationsAssembly: typeof(StellaOps.Scheduler.Storage.Postgres.Marker).Assembly,
connectionStringSelector: options => options.Postgres.ConnectionString);
```
### No Shared Migrations
Migrations must NOT be shared across WebServices:
- Each WebService controls its own schema exclusively
- Cross-schema dependencies use conditional DDL (`IF EXISTS`)
- API calls are used for runtime cross-module data access
## Migration Content Guidelines
### Startup Migrations (001-099)
- Must complete in under 60 seconds
- Must be idempotent (use `IF NOT EXISTS`, `CREATE OR REPLACE`)
- Must NOT drop tables, columns, or constraints
- Must NOT TRUNCATE data
- Must NOT add NOT NULL columns without defaults
```sql
-- Good: Idempotent table creation
CREATE TABLE IF NOT EXISTS scanner.scans (
scan_id UUID PRIMARY KEY,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Good: Safe index creation
CREATE INDEX IF NOT EXISTS idx_scans_created
ON scanner.scans(created_at DESC);
-- Bad: Non-idempotent (will fail if exists)
CREATE TABLE scanner.scans (...);
-- Bad: Breaking change in startup migration
ALTER TABLE scanner.scans DROP COLUMN legacy_field;
```
### Release Migrations (100-199)
- May contain breaking changes
- Require manual execution before deployment
- Should be tested in staging environment first
- Block application startup if pending
```sql
-- Release migration for breaking change
-- 100_remove_legacy_columns.sql
-- Step 1: Add replacement column (could be startup migration)
ALTER TABLE scanner.scans
ADD COLUMN IF NOT EXISTS new_field TEXT;
-- Step 2: Migrate data (requires release migration)
UPDATE scanner.scans
SET new_field = legacy_field
WHERE new_field IS NULL;
-- Step 3: Drop old column (breaking)
ALTER TABLE scanner.scans
DROP COLUMN IF EXISTS legacy_field;
```
### Seed Migrations (S001-S999)
- Insert reference data that rarely changes
- Use `ON CONFLICT DO NOTHING` for idempotency
- Run automatically at startup
```sql
-- S001_seed_vulnerability_severities.sql
INSERT INTO policy.severities (severity_id, name, score_min, score_max)
VALUES
('critical', 'Critical', 9.0, 10.0),
('high', 'High', 7.0, 8.9),
('medium', 'Medium', 4.0, 6.9),
('low', 'Low', 0.1, 3.9),
('none', 'None', 0.0, 0.0)
ON CONFLICT (severity_id) DO NOTHING;
```
### Data Migrations (DM001-DM999)
- Long-running data transformations
- Execute in batches to avoid locks
- Run via CLI or background job
```sql
-- DM001_backfill_tenant_ids.sql
-- Backfill tenant_id for existing records (batched)
DO $$
DECLARE
batch_size INT := 1000;
updated INT := 1;
BEGIN
WHILE updated > 0 LOOP
WITH batch AS (
SELECT scan_id
FROM scanner.scans
WHERE tenant_id IS NULL
LIMIT batch_size
FOR UPDATE SKIP LOCKED
)
UPDATE scanner.scans s
SET tenant_id = '00000000-0000-0000-0000-000000000000'::UUID
FROM batch b
WHERE s.scan_id = b.scan_id;
GET DIAGNOSTICS updated = ROW_COUNT;
COMMIT;
PERFORM pg_sleep(0.1); -- Rate limit
END LOOP;
END $$;
```
## Validation
Migrations are validated at startup and in CI:
1. **Duplicate prefix detection**: Multiple files with same number → Error
2. **Naming convention check**: Non-standard naming → Warning
3. **Checksum validation**: Modified applied migrations → Error
4. **Dangerous operation check**: DROP in startup migration → Error
### CI Validation
Run migration validation in CI pipelines:
```bash
.gitea/scripts/validate/validate-migrations.sh
```
Or with strict mode (fail on warnings):
```bash
.gitea/scripts/validate/validate-migrations.sh --strict
```
## Rollback Strategy
StellaOps uses a **forward-only migration strategy**:
- Migrations cannot be rolled back automatically
- To fix a bad migration, create a new migration that undoes the changes
- In emergencies, restore from database backup
### Emergency Rollback
1. Restore database from backup (pre-migration)
2. Deploy previous application version
3. Analyze and fix the migration issue
4. Create corrective migration
5. Deploy new version with fix
## Testing
### Integration Tests
Use `PostgresIntegrationFixture` with Testcontainers:
```csharp
[Collection(ScannerPostgresCollection.Name)]
public class ScanRepositoryTests : MigrationTestBase<ScannerPostgresFixture>
{
public ScanRepositoryTests(ScannerPostgresFixture fixture) : base(fixture) { }
[Fact]
public async Task Should_Insert_Scan()
{
// Database is clean (truncated) before each test
await ExecuteSqlAsync("INSERT INTO scanner.scans ...");
}
}
```
### Migration Tests
Test that migrations apply correctly:
```csharp
[Fact]
public async Task All_Migrations_Apply_Without_Error()
{
var status = await _fixture.Fixture.GetMigrationStatusAsync();
Assert.Empty(status.ChecksumErrors);
Assert.True(status.IsUpToDate);
}
```
## Monitoring
OpenTelemetry metrics for migrations:
| Metric | Type | Description |
|--------|------|-------------|
| `stellaops.migrations.applied.total` | Counter | Migrations applied |
| `stellaops.migrations.failed.total` | Counter | Migration failures |
| `stellaops.migrations.duration.seconds` | Histogram | Execution duration |
| `stellaops.migrations.lock.wait.seconds` | Histogram | Lock wait time |
| `stellaops.migrations.pending.count` | UpDownCounter | Pending migrations |
Traces are emitted with activity source: `StellaOps.Infrastructure.Postgres.Migrations`