git.stella-ops.org/ops/devops/aoc/supersedes-rollout.md
StellaOps Bot 44171930ff
2025-12-02 01:28:17 +02:00

Supersedes backfill rollout plan (DEVOPS-AOC-19-101)

Scope: Concelier Link-Not-Merge (LNM) backfill and supersedes processing, to run once the advisory_raw idempotency index is in staging.

Preconditions

  • Idempotency index verified in staging (advisory_raw duplicate inserts rejected; log hash recorded).
  • LNM migrations 21-101/102 applied (shards, TTL, tombstones).
  • Event transport to NATS/Redis disabled during backfill to avoid noisy downstream replays.
  • Offline kit mirror includes current hashes for advisory_raw and backfill bundle.
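
The hash precondition above can be checked mechanically before the freeze window opens. This is a minimal sketch, assuming the mirror ships a manifest that maps file names to SHA256 hex digests; the manifest shape and the function names `sha256_of` and `stale_entries` are hypothetical and not part of the offline kit.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large bundles are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stale_entries(manifest: dict[str, str], root: Path) -> list[str]:
    """Return names whose file is missing or whose hash differs from the manifest.

    `manifest` maps a relative file name to its expected SHA256 hex digest
    (an assumed format for this sketch).
    """
    stale = []
    for name, expected in sorted(manifest.items()):
        target = root / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            stale.append(name)
    return stale
```

An empty result means the mirror matches the manifest; any returned name should block the rollout until the mirror is refreshed.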

Rollout steps (staging → prod)

  1. Freeze window (announce 24h prior)
    • Pause Concelier ingest workers (CONCELIER_INGEST_ENABLED=false).
    • Stop outbox publisher or point to blackhole NATS subject.
  2. Dry-run (staging)
    • Run backfill job with --dry-run to emit counts only.
    • Verify: new supersedes records count == expected; no write errors; idempotency violations = 0.
    • Capture logs + SHA256 of generated report.
  3. Prod execution
    • Run backfill job with --batch-size=500 and --stop-on-error.
    • Monitor: insert rate, error rate, Mongo oplog lag; target <5% CPU on primary.
  4. Validation
    • Run consistency check:
      • advisory_observations count stable (no drop).
      • Supersedes edges present for all prior conflicts.
      • Idempotency index hit rate <0.1%.
    • Run an API spot check: /advisories/summary returns supersedes metadata, and no advisory.linkset.updated events were emitted during the freeze.
  5. Unfreeze
    • Re-enable ingest + outbox publisher.
    • Trigger a single advisory.observation.updated@1 replay to confirm the event path is healthy.
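
The pass/fail criteria in steps 2 and 4 can be written as two gate functions, so the go/no-go decision does not depend on reading logs by eye. This is a sketch under assumptions: the report fields and function names are hypothetical, "hit rate" is taken to mean index hits divided by attempted inserts, and "stable" is read as "did not drop". Only the thresholds come from the plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DryRunReport:
    # Field names are assumptions for this sketch, not the job's real output.
    supersedes_count: int
    write_errors: int
    idempotency_violations: int


def dry_run_failures(report: DryRunReport, expected_supersedes: int) -> list[str]:
    """Step 2: counts must match, with zero write errors and zero violations."""
    failures = []
    if report.supersedes_count != expected_supersedes:
        failures.append(
            f"supersedes count {report.supersedes_count} != expected {expected_supersedes}"
        )
    if report.write_errors != 0:
        failures.append(f"write errors: {report.write_errors}")
    if report.idempotency_violations != 0:
        failures.append(f"idempotency violations: {report.idempotency_violations}")
    return failures


def validation_failures(
    observations_before: int,
    observations_after: int,
    prior_conflicts: int,
    conflicts_with_edges: int,
    index_hits: int,
    inserts_attempted: int,
) -> list[str]:
    """Step 4: no drop in observations, full edge coverage, hit rate below 0.1%."""
    failures = []
    if observations_after < observations_before:
        failures.append("advisory_observations count dropped")
    if conflicts_with_edges < prior_conflicts:
        failures.append("prior conflicts without supersedes edges")
    # Assumed definition: hits on the idempotency index per attempted insert.
    hit_rate = index_hits / inserts_attempted if inserts_attempted else 0.0
    if hit_rate >= 0.001:
        failures.append(f"idempotency index hit rate {hit_rate:.4%} is not below 0.1%")
    return failures
```

Both functions return a list of human-readable failures; an empty list is the only passing result, which keeps the report easy to attach as evidence.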

Rollback

  • If errors >0 or idempotency violations observed:
    • Stop job, keep ingest paused.
    • Run rollback script ops/devops/scripts/rollback-lnm-backfill.js to remove supersedes/tombstones inserted in the current window.
    • Restore Mongo from last checkpointed snapshot if rollback script fails.
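
The rollback order above (script first, snapshot restore only as a fallback) can be sketched as follows. The two callables are injected because the plan does not state how the rollback script or the snapshot restore is invoked; `should_roll_back` and `roll_back` are hypothetical names.

```python
from typing import Callable


def should_roll_back(errors: int, idempotency_violations: int) -> bool:
    """Rollback trigger from the plan: any error or any idempotency violation."""
    return errors > 0 or idempotency_violations > 0


def roll_back(
    run_script: Callable[[], bool],
    restore_snapshot: Callable[[], None],
) -> str:
    """Try the rollback script; restore the snapshot only if the script fails.

    `run_script` returns True on success. Both callables are placeholders
    supplied by the operator's tooling (assumed, not defined by the plan).
    """
    if run_script():
        return "script"
    restore_snapshot()
    return "snapshot"
```

The return value records which path was taken, which is useful to log alongside the other evidence. Ingest stays paused in both cases.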

Evidence to capture

  • Job command + arguments.
  • SHA256 of backfill bundle and report.
  • Idempotency violation count.
  • Post-run consistency report (JSON) stored under ops/devops/artifacts/aoc-supersedes/<timestamp>/.

Monitoring/Alerts

  • Add temporary Grafana panel for idempotency violations and Mongo ops/sec during job.
  • Alert if job runtime exceeds 2h or if oplog lag > 60s.
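
The two alert thresholds above can be expressed as a small check, shown here as an illustration of the intended conditions and not as an actual Grafana or alerting rule. The function and constant names are hypothetical.

```python
from datetime import timedelta

MAX_RUNTIME = timedelta(hours=2)       # from the plan: runtime exceeds 2h
MAX_OPLOG_LAG = timedelta(seconds=60)  # from the plan: oplog lag > 60s


def active_alerts(runtime: timedelta, oplog_lag: timedelta) -> list[str]:
    """Return the alerts that should fire for the current job state."""
    alerts = []
    if runtime > MAX_RUNTIME:
        alerts.append("backfill job runtime exceeds 2h")
    if oplog_lag > MAX_OPLOG_LAG:
        alerts.append("Mongo oplog lag exceeds 60s")
    return alerts
```

Both comparisons are strict, so a job at exactly 2h or a lag of exactly 60s does not alert.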

Owners

  • Run: DevOps Guild
  • Approvals: Concelier Storage Guild + Platform Security