Files
git.stella-ops.org/docs/modules/excititor/operations/vex-raw-validator.md
2025-12-24 14:19:46 +02:00

2.3 KiB

Excititor · VEX Raw Collection Validator (AOC-19-001/002)

DEPRECATED: This document describes MongoDB validators which are no longer used. Excititor now uses PostgreSQL for persistence (Sprint 4400). Schema validation is now performed via PostgreSQL constraints and check constraints. See docs/db/SPECIFICATION.md for current database schema.

  • Date: 2025-11-25
  • Scope: EXCITITOR-STORE-AOC-19-001 / 19-002
  • Working directory: src/Excititor/__Libraries/StellaOps.Excititor.Storage.Mongo (deprecated)

What shipped (historical)

  • $jsonSchema validator applied to vex_raw (migration 20251125-vex-raw-json-schema) with validationAction=warn, validationLevel=moderate to surface contract violations without impacting ingestion.
  • Schema lives at docs/modules/excititor/schemas/vex_raw.schema.json (mirrors Mongo validator fields: digest/id, providerId, format, sourceUri, retrievedAt, optional content/GridFS object id, metadata strings).
  • Migration is auto-registered in DI; hosted migration runner applies it on service start. New collections created with the validator if missing.

How to run (online/offline)

  1. Ensure Excititor WebService/Worker starts with Mongo credentials that allow collMod.
  2. Validator applies automatically via migration runner. To force manually:
    mongosh "$MONGO_URI" --eval 'db.runCommand({collMod:"vex_raw", validator:'$(cat docs/modules/excititor/schemas/vex_raw.schema.json)', validationAction:"warn", validationLevel:"moderate"})'
    
  3. Offline kit: bundle docs/modules/excititor/schemas/vex_raw.schema.json with release artifacts; ops can apply via mongosh or mongo offline against snapshots.

Rollback / relax

  • To relax validation (e.g., hotfix window): db.runCommand({collMod:"vex_raw", validator:{}, validationAction:"warn", validationLevel:"off"}).
  • Reapplying the migration restores the schema.

Compatibility notes

  • Validator keeps additionalProperties=true to avoid blocking future fields; required set is minimal to guarantee provenance + content hash presence.
  • Action is warn to avoid breaking existing feeds; flip to error once downstream datasets are clean.

Acceptance

  • Contract + schema captured.
  • Migration in code and auto-applied.
  • Rollback path documented.