Authority Backup & Restore Runbook

Scope

  • Applies to: StellaOps Authority deployments running the official ops/authority/docker-compose.authority.yaml stack or equivalent Kubernetes packaging.
  • Artifacts covered: MongoDB (stellaops-authority database), Authority configuration (etc/authority.yaml), plugin manifests under etc/authority.plugins/, and signing key material stored in the authority-keys volume (defaults to /app/keys inside the container).
  • Frequency: Run the full procedure prior to upgrades, before rotating keys, and at least once per 24h in production. Store snapshots in an encrypted, access-controlled vault.

Inventory Checklist

| Component | Location (compose default) | Notes |
| --- | --- | --- |
| Mongo data | mongo-data volume (/var/lib/docker/volumes/.../mongo-data) | Contains all Authority collections (AuthorityUser, AuthorityClient, AuthorityToken, etc.). |
| Configuration | etc/authority.yaml | Mounted read-only into the container at /etc/authority.yaml. |
| Plugin manifests | etc/authority.plugins/*.yaml | Includes standard.yaml with tokenSigning.keyDirectory. |
| Signing keys | authority-keys volume -> /app/keys | Path is derived from tokenSigning.keyDirectory (defaults to ../keys relative to the manifest). |

TIP: Confirm the deployed key directory via tokenSigning.keyDirectory in etc/authority.plugins/standard.yaml; some installations relocate keys to /var/lib/stellaops/authority/keys.
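
The check in the tip above could be scripted roughly as follows. This is a minimal sketch, not part of the official tooling: the function name key_dir_from_manifest is hypothetical, and it assumes keyDirectory appears on a single line as a plain "key: value" pair rather than in a more complex YAML form.

```shell
# Print the value of keyDirectory from a plugin manifest.
# Assumption: the key sits on one line as "keyDirectory: <value>",
# optionally quoted and optionally followed by a "# comment".
key_dir_from_manifest() {
  sed -n 's/^[[:space:]]*keyDirectory:[[:space:]]*//p' "$1" |
    sed -e 's/[[:space:]]*#.*$//' \
        -e 's/[[:space:]]*$//' \
        -e 's/^["'\'']//' \
        -e 's/["'\'']$//' |
    head -n 1
}
```

For example, key_dir_from_manifest etc/authority.plugins/standard.yaml prints the configured directory, which can then be compared with the volume mount in the compose file.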

Hot Backup (no downtime)

  1. Create output directory: mkdir -p backup/$(date +%Y-%m-%d) on the host.
  2. Dump Mongo:
    # Compute the UTC timestamp once so the dump and the copy use the same file name.
    TS=$(date -u +%Y%m%dT%H%M%SZ)
    docker compose -f ops/authority/docker-compose.authority.yaml exec mongo \
      mongodump --archive=/dump/authority-$TS.gz \
      --gzip --db stellaops-authority
    docker compose -f ops/authority/docker-compose.authority.yaml cp \
      mongo:/dump/authority-$TS.gz backup/
    
    The mongodump archive preserves indexes and can be restored with mongorestore --archive --gzip.
  3. Capture configuration + manifests:
    cp etc/authority.yaml backup/
    rsync -a etc/authority.plugins/ backup/authority.plugins/
    
  4. Export signing keys: the compose file maps authority-keys to a local Docker volume. Snapshot it without stopping the service:
    docker run --rm \
      -v authority-keys:/keys \
      -v "$(pwd)/backup:/backup" \
      busybox tar czf /backup/authority-keys-$(date -u +%Y%m%dT%H%M%SZ).tar.gz -C /keys .
    
  5. Checksum: generate SHA-256 digests for every file and store them alongside the artefacts.
  6. Encrypt & upload: wrap the backup folder using your secrets management standard (e.g., age, GPG) and upload to the designated offline vault.
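
The checksum step could be sketched as below. This is an illustration rather than a prescribed procedure: the function names and the SHA256SUMS file name are hypothetical, and it assumes GNU coreutils sha256sum is available on the host.

```shell
# Write SHA-256 digests for every file under a backup directory into SHA256SUMS.
# Paths are recorded relative to that directory so the manifest stays valid
# after the folder is moved or uploaded.
write_checksums() (
  cd "$1" || exit 1
  find . -type f ! -name SHA256SUMS -print | sort | while IFS= read -r f; do
    sha256sum "$f"
  done > SHA256SUMS
)

# Re-check every recorded digest; returns non-zero if a file is missing or altered.
verify_checksums() (
  cd "$1" || exit 1
  sha256sum -c SHA256SUMS
)
```

Running write_checksums backup before encryption and verify_checksums backup after retrieval from the vault gives a simple end-to-end integrity check.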

Cold Backup (planned downtime)

  1. Notify stakeholders and drain traffic (CLI clients should refresh tokens afterwards).
  2. Stop services:
    docker compose -f ops/authority/docker-compose.authority.yaml down
    
  3. Back up volumes directly using tar:
    docker run --rm -v mongo-data:/data -v "$(pwd)/backup:/backup" \
      busybox tar czf /backup/mongo-data-$(date +%Y%m%d).tar.gz -C /data .
    docker run --rm -v authority-keys:/keys -v "$(pwd)/backup:/backup" \
      busybox tar czf /backup/authority-keys-$(date +%Y%m%d).tar.gz -C /keys .
    
  4. Copy configuration + manifests as in the hot backup (steps 3-6).
  5. Restart services and verify health:
    docker compose -f ops/authority/docker-compose.authority.yaml up -d
    curl -fsS http://localhost:8080/ready
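
The service may need a few seconds before the readiness probe succeeds, so a single curl call right after up -d can fail spuriously. A small retry helper along these lines could wrap the probe; the function name retry and the attempt and delay values are illustrative assumptions.

```shell
# Run a command until it succeeds, up to a maximum number of attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  until "$@"; do
    # Give up once the attempt budget is spent.
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, retry 30 2 curl -fsS http://localhost:8080/ready polls the endpoint for up to roughly a minute and returns non-zero if it never becomes ready.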
    

Restore Procedure

  1. Provision clean volumes: remove existing volumes if you're rebuilding a node (docker volume rm mongo-data authority-keys), then recreate the compose stack so empty volumes exist.
  2. Restore Mongo:
    docker compose -f ops/authority/docker-compose.authority.yaml exec -T mongo \
      mongorestore --archive --gzip --drop < backup/authority-YYYYMMDDTHHMMSSZ.gz
    
    Use --drop to replace existing collections; omit it when performing a partial restore.
  3. Restore configuration/manifests: copy authority.yaml and authority.plugins/* into place before starting the Authority container.
  4. Restore signing keys: untar into the mounted volume:
    docker run --rm -v authority-keys:/keys -v "$(pwd)/backup:/backup" \
      busybox tar xzf /backup/authority-keys-YYYYMMDD.tar.gz -C /keys
    
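    Ensure private key files remain mode 600. Apply the mode to files only (for example, find <key directory> -type f -exec chmod 600 {} +); a blanket chmod -R 600 also strips the execute bit from directories and makes them untraversable.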
  5. Start services & validate:
    docker compose -f ops/authority/docker-compose.authority.yaml up -d
    curl -fsS http://localhost:8080/health
    
  6. Validate JWKS and tokens: call /jwks and issue a short-lived token via the CLI to confirm key material matches expectations. If the restored environment requires a fresh signing key, follow the rotation SOP in docs/11_AUTHORITY.md using ops/authority/key-rotation.sh to invoke /internal/signing/rotate.
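
One way to confirm that the restored key material matches expectations is to compare the key IDs published by /jwks with a list recorded before the backup. The helper below is a rough sketch: the name jwks_kids is hypothetical, and it assumes the endpoint returns a standard JWKS object whose keys carry a "kid" field.

```shell
# Print the "kid" value of each key in a JWKS document read from stdin,
# one per line, sorted for easy comparison.
# Assumption: each key object carries a "kid" member (optional in the JWK spec).
jwks_kids() {
  grep -o '"kid"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/' |
    sort
}
```

For example, curl -fsS http://localhost:8080/jwks | jwks_kids lists the active key IDs; an empty result or an unexpected ID suggests the wrong key volume was restored.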

Disaster Recovery Notes

  • Air-gapped replication: replicate archives via the Offline Update Kit transport channels; never attach USB devices without scanning.
  • Retention: maintain 30 daily snapshots + 12 monthly archival copies. Rotate encryption keys annually.
  • Key compromise: if signing keys are suspected to be compromised, restore from the latest clean backup, rotate via OPS3 (see ops/authority/key-rotation.sh and docs/11_AUTHORITY.md), and publish a revocation notice.
  • Mongo version: keep dump/restore images pinned to the deployment version (compose uses mongo:7). Restoring across major versions requires a compatibility review.
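
The daily half of the retention rule could be enforced with a small pruning helper such as the one below. It is a sketch under stated assumptions: snapshots are taken to live in date-named directories (backup/YYYY-MM-DD, as created in the hot backup), the function name prune_daily is hypothetical, and monthly archival copies are not handled here.

```shell
# Delete the oldest date-named snapshot directories so that at most <keep> remain.
# Usage: prune_daily <backup-root> <keep>
# Assumption: snapshot directories are named YYYY-MM-DD, which sorts chronologically.
prune_daily() {
  dir=$1
  keep=$2
  total=$(find "$dir" -mindepth 1 -maxdepth 1 -type d -name '????-??-??' | wc -l)
  excess=$((total - keep))
  # Nothing to do while the snapshot count is within the retention limit.
  [ "$excess" -gt 0 ] || return 0
  find "$dir" -mindepth 1 -maxdepth 1 -type d -name '????-??-??' | sort |
    head -n "$excess" |
    while IFS= read -r old; do
      rm -rf -- "$old"
    done
}
```

For example, prune_daily backup 30 keeps the 30 most recent daily snapshots. Run it only after the newest snapshot has been checksummed and uploaded.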

Verification Checklist

  • /ready reports all identity providers ready.
  • OAuth flows issue tokens signed by the restored keys.
  • PluginRegistrationSummary logs expected providers on startup.
  • Revocation manifest export (dotnet run --project src/StellaOps.Authority) succeeds.
  • Monitoring dashboards show metrics resuming (see OPS5 deliverables).