git.stella-ops.org/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/MIGRATIONS.md
2025-11-18 23:45:25 +02:00


Mongo Schema Migration Playbook

This module owns the persistent shape of Concelier's MongoDB database. Upgrades must be deterministic and safe to run on live replicas. The MongoMigrationRunner executes idempotent migrations on startup immediately after the bootstrapper completes its collection and index checks.

Execution Path

  1. StellaOps.Concelier.WebService calls MongoBootstrapper.InitializeAsync() during startup.
  2. Once collections and baseline indexes are ensured, the bootstrapper invokes MongoMigrationRunner.RunAsync().
  3. IMongoMigration implementations are sorted by Id (ordinal comparison), and each one is executed exactly once. Completion is recorded in the schema_migrations collection.
  4. Failures surface during startup and prevent the service from serving traffic, matching our "fail-fast" requirement for storage incompatibilities.
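
The ordering and bookkeeping described above can be sketched as follows. This is a minimal in-memory model, not the actual MongoMigrationRunner: the function names, the synchronous apply() callback, and the Map standing in for schema_migrations are all assumptions made for illustration.

```javascript
// Hypothetical model of the runner's ordering and bookkeeping.
// `applied` is a Map standing in for the schema_migrations collection.

// Ordinal comparison: compares UTF-16 code units, with no locale rules.
function ordinalCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

function runMigrations(migrations, applied, now = () => new Date()) {
  const ordered = [...migrations].sort((x, y) => ordinalCompare(x.id, y.id));
  const executed = [];
  for (const migration of ordered) {
    if (applied.has(migration.id)) continue; // already recorded, so skip it
    migration.apply(); // an exception here propagates and aborts startup
    applied.set(migration.id, {
      description: migration.description,
      appliedAt: now(),
    });
    executed.push(migration.id);
  }
  return executed;
}
```

Because a migration is recorded only after apply() returns, a failed migration leaves no record and is attempted again on the next start. That retry is one reason migration bodies need to be idempotent.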

Creating a Migration

  1. Implement IMongoMigration under StellaOps.Concelier.Storage.Mongo.Migrations. Use a monotonically increasing identifier such as yyyyMMdd_description.
  2. Keep the body idempotent: query state first, drop and re-create indexes only when a mismatch is detected, and avoid multi-document transactions unless required.
  3. Add the migration to DI in ServiceCollectionExtensions so it flows into the runner.
  4. Write an integration test that exercises the migration against a Mongo2Go instance to validate behaviour.
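
The "query state first" rule in step 2 can be illustrated with a small decision function. This is a sketch under assumptions: planIndexChange and the shape of the desired argument are hypothetical, and the comparison covers only the key, expireAfterSeconds, and partialFilterExpression fields that db.collection.getIndexes() reports.

```javascript
// Decide what an idempotent index migration should do, given the array
// returned by db.collection.getIndexes() and the desired definition.
// Returns "create", "recreate", or "none".
function planIndexChange(existingIndexes, desired) {
  const current = existingIndexes.find((ix) => ix.name === desired.name);
  if (!current) return "create";

  const sameKey = JSON.stringify(current.key) === JSON.stringify(desired.key);
  const sameTtl = current.expireAfterSeconds === desired.expireAfterSeconds;
  // Note: JSON comparison is sensitive to property order inside the filter.
  const sameFilter =
    JSON.stringify(current.partialFilterExpression ?? null) ===
    JSON.stringify(desired.partialFilterExpression ?? null);

  return sameKey && sameTtl && sameFilter ? "none" : "recreate";
}
```

A migration built this way touches the index only when the plan is "create" or "recreate", so running it a second time against an already aligned collection changes nothing.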

Current Migrations

| Id | Description |
| --- | --- |
| 20241005_document_expiry_indexes | Ensures the document collection uses the correct TTL/partial index depending on raw document retention settings. |
| 20241005_gridfs_expiry_indexes | Aligns the GridFS documents.files TTL index with retention settings. |
| 20251019_advisory_event_collections | Creates/aligns indexes for the advisory_statements and advisory_conflicts collections powering the event log + conflict replay pipeline. |
| 20251028_advisory_raw_idempotency_index | Applies a compound unique index on (source.vendor, upstream.upstream_id, upstream.content_hash, tenant) after verifying no duplicates exist. |
| 20251028_advisory_supersedes_backfill | Renames the legacy advisory collection to a backup, replaces it with a read-only view, and backfills supersedes chains across advisory_raw. |
| 20251028_advisory_raw_validator | Applies the Aggregation-Only Contract JSON schema validator to the advisory_raw collection with a configurable enforcement level. |
| 20251104_advisory_observations_raw_linkset | Backfills rawLinkset on advisory_observations using stored advisory_raw documents so canonical and raw projections co-exist for downstream policy joins. |
| 20251117_advisory_linksets_tenant_lower | Lowercases advisory_linksets.tenantId to align writes with lookup filters. |

Operator Runbook

  • schema_migrations records each applied migration (_id, description, appliedAt). Review this collection when auditing upgrades.
  • Prior to applying 20251028_advisory_raw_idempotency_index, run the duplicate audit script against the target database:
    mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;'
    
    Resolve any reported rows before rolling out the migration.
  • After 20251028_advisory_supersedes_backfill completes, ensure db.advisory reports type: "view" and options.viewOn: "advisory_backup_20251028". Supersedes chains can be spot-checked via db.advisory_raw.find({ supersedes: { $exists: true } }).limit(5).
  • To re-run a migration in a lab, delete the corresponding document from schema_migrations and restart the service. Do not do this in production unless the migration body is known to be idempotent and safe.
  • When changing retention settings (RawDocumentRetention), deploy the new configuration and restart Concelier. The migration runner will adjust indexes on the next boot.
  • For the event-log collections (advisory_statements, advisory_conflicts), a lab-only rollback to the pre-event-log schema is db.advisory_statements.drop() and db.advisory_conflicts.drop() followed by a restart. Production rollbacks should instead gate the merge features that rely on these collections.
  • If migrations fail, restart with Logging__LogLevel__StellaOps.Concelier.Storage.Mongo.Migrations=Debug to surface diagnostic output. Remediate underlying index/collection drift before retrying.
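
To show what the duplicate audit for 20251028_advisory_raw_idempotency_index has to detect, here is a minimal sketch that groups documents by the four fields of the unique index. It is not the contents of check-advisory-raw-duplicates.js. The function name is hypothetical, and using the limit parameter as a cap on reported groups is an assumption about what LIMIT means in that script.

```javascript
// Group advisory_raw-shaped documents by the compound key of the unique
// index and report every key that occurs more than once.
// Missing fields are treated as null, which is how MongoDB indexes them.
function findDuplicateGroups(docs, limit = 200) {
  const groups = new Map();
  for (const doc of docs) {
    const key = JSON.stringify([
      doc.source?.vendor ?? null,
      doc.upstream?.upstream_id ?? null,
      doc.upstream?.content_hash ?? null,
      doc.tenant ?? null,
    ]);
    const ids = groups.get(key) ?? [];
    ids.push(doc._id);
    groups.set(key, ids);
  }
  return [...groups.entries()]
    .filter(([, ids]) => ids.length > 1) // keep only keys with duplicates
    .slice(0, limit)
    .map(([key, ids]) => ({ key: JSON.parse(key), ids }));
}
```

Every group this returns would violate the unique index, so the migration cannot succeed until each group is reduced to one document. On a large collection, the same grouping is better done server-side with an aggregation $group stage than by loading documents into memory.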

Validating an Upgrade

  1. Run dotnet test --filter MongoMigrationRunnerTests to exercise integration coverage.
  2. In staging, execute db.schema_migrations.find().sort({_id:1}) to verify applied migrations and timestamps.
  3. Inspect index shapes: db.document.getIndexes() and db.documents.files.getIndexes() for TTL/partial filter alignment.
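
Step 2 can be partly automated by comparing the documents in schema_migrations against the ids listed under Current Migrations. The sketch below assumes, as the runbook states, that each applied migration is stored with its id in _id. The helper name and the constant are hypothetical.

```javascript
// Ids copied from the Current Migrations list in this playbook.
const EXPECTED_MIGRATION_IDS = [
  "20241005_document_expiry_indexes",
  "20241005_gridfs_expiry_indexes",
  "20251019_advisory_event_collections",
  "20251028_advisory_raw_idempotency_index",
  "20251028_advisory_supersedes_backfill",
  "20251028_advisory_raw_validator",
  "20251104_advisory_observations_raw_linkset",
  "20251117_advisory_linksets_tenant_lower",
];

// Return the expected ids that have no record in the given
// schema_migrations documents.
function missingMigrations(appliedDocs, expectedIds = EXPECTED_MIGRATION_IDS) {
  const appliedIds = new Set(appliedDocs.map((doc) => doc._id));
  return expectedIds.filter((id) => !appliedIds.has(id));
}
```

In a shell session the input would be the result of db.schema_migrations.find().sort({_id:1}).toArray(). An empty return value means every listed migration has been recorded.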