# Raw Linkset Backfill & Adoption Plan

_Last updated: 2025-10-31_

Owners: Concelier Storage Guild, DevOps Guild, Policy Guild

## Context

- Concelier observations now emit both a **canonical linkset** (deduped, normalised identifiers) and a **raw linkset** (`rawLinkset`) that preserves upstream ordering, duplicates, and original pointer metadata.
- Existing `concelier.advisory_observations` documents created before 2025-10-31 do **not** contain the `rawLinkset` field.
- Policy Engine selection joiners (`POLICY-ENGINE-20-003`) will switch to the raw projection once backfill completes and consumers validate fixtures.

## Objectives

1. Populate `rawLinkset` for historical observations across online clusters and Offline Kit bundles without breaking append-only guarantees.
2. Provide migration scripts + runbook so operators can rehearse in staging (and air-gapped deployments) before production rollout.
3. Unblock Policy Engine adoption by guaranteeing dual projections exist for all tenants.

## Deliverables

- [ ] **Migration script** (`20251104_advisory_observations_raw_linkset_backfill.csx`)
  - Iterates observations lacking `rawLinkset`
  - Rehydrates raw document via existing snapshot (or cached DTO)
  - Reuses `AdvisoryObservationFactory.CreateRawLinkset`
  - Writes using `$set` with optimistic retry; preserves `updatedAt` via `setOnInsert`
- [ ] **Offline Kit updater** (extend `ops/offline-kit/scripts/export_offline_bundle.py`) to patch bundles in-place
- [ ] **Runbook** covering:
  - Pre-check query: `db.concelier.advisory_observations.countDocuments({ rawLinkset: { $exists: false } })`
  - Backup procedure (`mongodump` or snapshot requirement)
  - Dry-run mode limiting batches by tenant
  - Metrics/telemetry expectations (`concelier.migrations.documents_processed_total`)
  - Rollback (no-op because field addition; note to retain snapshot for verification)
- [ ] **Fixture updates** ensuring storage/CLI/Policy tests include `rawLinkset`
- [ ] **Policy Engine follow-up** to flip joiners once `rawLinkset` population reaches 100% (tracked via metrics).

## Timeline

| Date (UTC) | Milestone | Notes |
|------------|-----------|-------|
| 2025-10-31 | Handshake w/ Policy | Agreement to consume `rawLinkset`; this document created. |
| 2025-11-01 | Draft migration script | Validate against staging dataset snapshots. |
| 2025-11-04 | Storage task CONCELIER-STORE-AOC-19-005 due | Deliver script + runbook for review. |
| 2025-11-06 | Staging backfill rehearsal | Target < 30 min runtime on 5M observations. |
| 2025-11-08 | Policy fixtures updated | POL engine branch consumes `rawLinkset`. |
| 2025-11-11 | Production rollout window | Pending DevOps sign-off after rehearsals. |

## Open Questions

- Do we need archival of the canonical-only projection for backwards compatibility exports? (Policy to confirm.)
- Offline Kit delta: should we regenerate entire bundle or ship incremental patch? (DevOps reviewing.)
- Metrics: add `raw_linkset_missing_total` counter to detect regressions post-backfill?

## Next Actions

- [ ] Concelier Storage Guild: prototype migration script, share for review (`2025-11-01`).
- [ ] DevOps Guild: schedule staging rehearsal + update `docs/deploy/containers.md` with new runbook section.
- [ ] Policy Guild: prepare feature flag/branch to switch joiners once metrics show zero missing `rawLinkset`.
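
As a possible starting point for the prototype migration script, the flow listed under Deliverables (find observations lacking `rawLinkset`, rehydrate the raw document, build the raw linkset, write only the missing field, optionally in dry-run mode scoped to one tenant) might look roughly like the in-memory sketch below. It is not the `.csx` script itself. `create_raw_linkset` is a hypothetical stand-in for `AdvisoryObservationFactory.CreateRawLinkset`, and the document shape (`_id`, `tenant`, `aliases`, `references`), the `load_raw` callback, and the stats keys are all assumptions made for illustration.

```python
from typing import Callable, Optional


def create_raw_linkset(raw_document: dict) -> dict:
    """Hypothetical stand-in for AdvisoryObservationFactory.CreateRawLinkset.

    Copies identifiers exactly as upstream supplied them: original order,
    duplicates included. The field names are illustrative assumptions.
    """
    return {
        "aliases": list(raw_document.get("aliases", [])),
        "references": list(raw_document.get("references", [])),
    }


def count_missing(observations: list, tenant: Optional[str] = None) -> int:
    """In-memory analogue of the pre-check ({ rawLinkset: { $exists: false } })."""
    return sum(
        1
        for obs in observations
        if "rawLinkset" not in obs and (tenant is None or obs["tenant"] == tenant)
    )


def backfill_tenant(
    observations: list,
    load_raw: Callable[[str], Optional[dict]],
    tenant: str,
    batch_size: int = 500,
    dry_run: bool = True,
) -> dict:
    """Backfill up to batch_size observations for a single tenant.

    Only the missing rawLinkset field is added; no other field is touched,
    which mirrors the field-addition-only intent of the plan. In dry-run
    mode nothing is written, but the counters are still reported.
    """
    stats = {"processed": 0, "skipped_no_snapshot": 0}
    for obs in observations:
        if stats["processed"] >= batch_size:
            break
        if obs["tenant"] != tenant or "rawLinkset" in obs:
            continue  # other tenant, or already backfilled (idempotent re-run)
        raw = load_raw(obs["_id"])
        if raw is None:
            # No snapshot or cached DTO available; leave the document as-is.
            stats["skipped_no_snapshot"] += 1
            continue
        if not dry_run:
            obs["rawLinkset"] = create_raw_linkset(raw)
        stats["processed"] += 1
    return stats
```

Calling `backfill_tenant(docs, snapshots.get, "tenant-a")` first and comparing its `processed` count against `count_missing(docs, "tenant-a")` gives a rehearsal step before re-running with `dry_run=False`. Because documents that already carry `rawLinkset` are skipped, a second pass over the same tenant should report zero processed documents, which is one way to confirm the backfill is safe to repeat.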