Supersedes backfill rollout plan (DEVOPS-AOC-19-101)
Scope: Concelier Link-Not-Merge backfill and supersedes processing once advisory_raw idempotency index is in staging.
Preconditions
- Idempotency index verified in staging (`advisory_raw` duplicate inserts rejected; log hash recorded).
- LNM migrations 21-101/102 applied (shards, TTL, tombstones).
- Event transport to NATS/Redis disabled during backfill to avoid noisy downstream replays.
- Offline kit mirror includes current hashes for `advisory_raw` and backfill bundle.
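The preconditions above lend themselves to a simple go/no-go check before the freeze is announced. The following is a minimal sketch only: the `Preconditions` class, its field names, and `unmet_preconditions` are hypothetical and not part of any existing tooling, and how each flag gets populated (staging queries, migration status, config inspection) is left to the operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preconditions:
    """Hypothetical snapshot of the four preconditions; field names are illustrative."""
    idempotency_index_verified: bool   # advisory_raw duplicate inserts rejected in staging
    lnm_migrations_applied: bool       # LNM migrations 21-101/102 (shards, TTL, tombstones)
    event_transport_disabled: bool     # NATS/Redis transport off for the backfill
    offline_kit_hashes_current: bool   # mirror carries hashes for advisory_raw and the bundle


def unmet_preconditions(p: Preconditions) -> list[str]:
    """Return the names of preconditions that are not yet satisfied, in field order."""
    return [name for name, ok in vars(p).items() if not ok]


if __name__ == "__main__":
    status = Preconditions(True, True, False, True)
    missing = unmet_preconditions(status)
    print("GO" if not missing else f"NO-GO, unmet: {missing}")
```

An empty result means all four preconditions hold; any returned name should block the rollout.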
Rollout steps (staging → prod)
- Freeze window (announce 24h prior)
  - Pause Concelier ingest workers (`CONCELIER_INGEST_ENABLED=false`).
  - Stop outbox publisher or point to blackhole NATS subject.
- Dry-run (staging)
  - Run backfill job with `--dry-run` to emit counts only.
  - Verify: new supersedes records count == expected; no write errors; idempotency violations = 0.
  - Capture logs + SHA256 of generated report.
- Prod execution
  - Run backfill job with `--batch-size=500` and `--stop-on-error`.
  - Monitor: insert rate, error rate, Mongo oplog lag; target <5% CPU on primary.
- Validation
  - Run consistency check:
    - `advisory_observations` count stable (no drop).
    - Supersedes edges present for all prior conflicts.
    - Idempotency index hit rate <0.1%.
  - Run API spot check: `/advisories/summary` returns supersedes metadata; `advisory.linkset.updated` events absent during freeze.
- Unfreeze
  - Re-enable ingest + outbox publisher.
  - Trigger single `advisory.observation.updated@1` replay to confirm event path is healthy.
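The dry-run and validation criteria in the rollout steps can be expressed as two small gate functions. This is a sketch under stated assumptions: the report keys (`supersedes_count`, `write_errors`, `idempotency_violations`) and all function and parameter names are hypothetical, and "idempotency index hit rate" is assumed here to mean rejected duplicate inserts divided by attempted inserts, which the plan does not define.

```python
def dry_run_gate(report: dict, expected_supersedes: int) -> list[str]:
    """Return reasons to block the prod run; an empty list means proceed."""
    problems = []
    if report["supersedes_count"] != expected_supersedes:
        problems.append(
            f"supersedes count {report['supersedes_count']} != expected {expected_supersedes}"
        )
    if report["write_errors"] != 0:
        problems.append(f"write errors: {report['write_errors']}")
    if report["idempotency_violations"] != 0:
        problems.append(f"idempotency violations: {report['idempotency_violations']}")
    return problems


def validation_gate(
    observations_before: int,
    observations_after: int,
    conflicts_without_edge: int,
    index_hits: int,
    inserts_attempted: int,
) -> list[str]:
    """Check the post-run consistency criteria; an empty list means validation passed."""
    problems = []
    if observations_after < observations_before:
        problems.append("advisory_observations count dropped")
    if conflicts_without_edge > 0:
        problems.append(f"{conflicts_without_edge} prior conflicts lack supersedes edges")
    # Assumed definition: hit rate = rejected duplicates / attempted inserts.
    hit_rate = index_hits / inserts_attempted if inserts_attempted else 0.0
    if hit_rate >= 0.001:  # plan threshold: <0.1%
        problems.append(f"idempotency index hit rate {hit_rate:.4%} is not below 0.1%")
    return problems
```

Returning a list of reasons instead of a boolean keeps the failure detail available for the evidence bundle.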
Rollback
- If errors >0 or idempotency violations observed:
  - Stop job, keep ingest paused.
  - Run rollback script `ops/devops/scripts/rollback-lnm-backfill.js` to remove supersedes/tombstones inserted in current window.
  - Restore Mongo from last checkpointed snapshot if rollback script fails.
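The rollback branch above can be sketched as a single decision function. This is illustrative only: `handle_failure` and the `run_rollback_script` callable are hypothetical names, and the callable stands in for however the operator invokes the rollback script and reports its success.

```python
from typing import Callable


def handle_failure(
    errors: int,
    violations: int,
    run_rollback_script: Callable[[], bool],
) -> list[str]:
    """Return the ordered rollback actions taken; an empty list means no rollback is needed."""
    if errors == 0 and violations == 0:
        return []
    actions = ["stop job", "keep ingest paused", "run rollback script"]
    # The snapshot restore is a last resort, used only when the script itself fails.
    if not run_rollback_script():
        actions.append("restore Mongo from last checkpointed snapshot")
    return actions
```

The script is invoked only when the trigger condition holds, so a clean run never touches the rollback path.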
Evidence to capture
- Job command + arguments.
- SHA256 of backfill bundle and report.
- Idempotency violation count.
- Post-run consistency report (JSON) stored under `ops/devops/artifacts/aoc-supersedes/<timestamp>/`.
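One way to capture the evidence listed above is a small manifest written next to the consistency report. The sketch below uses only the Python standard library; the manifest file name `evidence.json`, its keys, the timestamp format, and the function names are assumptions made for illustration, not an established layout.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large bundles are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_evidence(
    root: Path,
    command: list[str],
    bundle: Path,
    report: Path,
    idempotency_violations: int,
    timestamp: Optional[str] = None,
) -> Path:
    """Write a manifest under <root>/<timestamp>/ and return its path."""
    # Assumed timestamp format; the plan only specifies a <timestamp> directory.
    stamp = timestamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_dir = Path(root) / stamp
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = {
        "command": command,
        "bundle_sha256": sha256_file(bundle),
        "report_sha256": sha256_file(report),
        "idempotency_violations": idempotency_violations,
    }
    manifest_path = out_dir / "evidence.json"  # hypothetical file name
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest_path
```

In practice `root` would point at `ops/devops/artifacts/aoc-supersedes/`, and `command` would hold the exact job command and arguments that were run.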
Monitoring/Alerts
- Add temporary Grafana panel for idempotency violations and Mongo ops/sec during job.
- Alert if job runtime exceeds 2h or if oplog lag > 60s.
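The two alert thresholds above reduce to a simple comparison. The following is a minimal sketch of that logic in isolation; the alert names and the function are hypothetical, and in a real deployment these conditions would more likely be configured as alert rules in the monitoring stack than evaluated in a script.

```python
MAX_RUNTIME_SECONDS = 2 * 60 * 60   # alert if job runtime exceeds 2h
MAX_OPLOG_LAG_SECONDS = 60          # alert if oplog lag > 60s


def fired_alerts(runtime_seconds: float, oplog_lag_seconds: float) -> list[str]:
    """Return the names of alerts whose thresholds are exceeded."""
    fired = []
    if runtime_seconds > MAX_RUNTIME_SECONDS:
        fired.append("job_runtime_exceeded")
    if oplog_lag_seconds > MAX_OPLOG_LAG_SECONDS:
        fired.append("oplog_lag_exceeded")
    return fired
```

Both comparisons are strict, matching the plan's wording: a job at exactly 2h or a lag of exactly 60s does not fire.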
Owners
- Run: DevOps Guild
- Approvals: Concelier Storage Guild + Platform Security