up the blocking tasks
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
Notify Smoke Test / Notifier Service Tests (push) Has been cancelled
Notify Smoke Test / Notification Smoke Test (push) Has been cancelled
Notify Smoke Test / Notify Unit Tests (push) Has been cancelled
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Export Center CI / export-ci (push) Has been cancelled
Manifest Integrity / Validate Schema Integrity (push) Has been cancelled
Manifest Integrity / Validate Contract Documents (push) Has been cancelled
Manifest Integrity / Validate Pack Fixtures (push) Has been cancelled
Manifest Integrity / Audit SHA256SUMS Files (push) Has been cancelled
Manifest Integrity / Verify Merkle Roots (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled
Risk Bundle CI / risk-bundle-build (push) Has been cancelled
Scanner Analyzers / Discover Analyzers (push) Has been cancelled
Scanner Analyzers / Validate Test Fixtures (push) Has been cancelled
Risk Bundle CI / risk-bundle-offline-kit (push) Has been cancelled
Risk Bundle CI / publish-checksums (push) Has been cancelled
Scanner Analyzers / Build Analyzers (push) Has been cancelled
Scanner Analyzers / Test Language Analyzers (push) Has been cancelled
Scanner Analyzers / Verify Deterministic Output (push) Has been cancelled
devportal-offline / build-offline (push) Has been cancelled
Mirror Thin Bundle Sign & Verify / mirror-sign (push) Has been cancelled
@@ -1,23 +1,23 @@
-# Runbook — Replay Operations
+# Runbook - Replay Operations
 
-> **Audience:** Ops Guild · Evidence Locker Guild · Scanner Guild · Authority/Signer · Attestor
-> **Prereqs:** `docs/replay/DETERMINISTIC_REPLAY.md`, `docs/replay/DEVS_GUIDE_REPLAY.md`, `docs/replay/TEST_STRATEGY.md`, `docs/modules/platform/architecture-overview.md` §5
+> **Audience:** Ops Guild / Evidence Locker Guild / Scanner Guild / Authority/Signer / Attestor
+> **Prereqs:** `docs/replay/DETERMINISTIC_REPLAY.md`, `docs/replay/DEVS_GUIDE_REPLAY.md`, `docs/replay/TEST_STRATEGY.md`, `docs/modules/platform/architecture-overview.md`
 
 This runbook governs day-to-day replay operations, retention, and incident handling across online and air-gapped environments. Keep it in sync with the tasks in `docs/implplan/SPRINT_0187_0001_0001_evidence_locker_cli_integration.md`.
 
 ---
 
-## 1 · Terminology
+## 1 Terminology
 
-- **Replay Manifest** — `manifest.json` describing scan inputs, outputs, signatures.
-- **Input Bundle** — `inputbundle.tar.zst` containing feeds, policies, tools, env.
-- **Output Bundle** — `outputbundle.tar.zst` with SBOM, findings, VEX, logs.
-- **DSSE Envelope** — Signed metadata produced by Authority/Signer.
-- **RootPack** — Trusted key bundle used to validate DSSE signatures offline.
+- **Replay Manifest** - `manifest.json` describing scan inputs, outputs, signatures.
+- **Input Bundle** - `inputbundle.tar.zst` containing feeds, policies, tools, env.
+- **Output Bundle** - `outputbundle.tar.zst` with SBOM, findings, VEX, logs.
+- **DSSE Envelope** - Signed metadata produced by Authority/Signer.
+- **RootPack** - Trusted key bundle used to validate DSSE signatures offline.
 
 ---
 
-## 2 · Normal operations
+## 2 Normal operations
 
 1. **Ingestion**
 - Scanner WebService writes manifest metadata to `replay_runs`.
@@ -28,14 +28,15 @@ This runbook governs day-to-day replay operations, retention, and incident handl
 - Metrics `replay_verify_total{result}`, `replay_bundle_size_bytes` recorded in Telemetry Stack (see `docs/modules/telemetry/architecture.md`).
 - Failures alert `#ops-replay` via PagerDuty with runbook link.
 3. **Retention**
-- Hot CAS retention: 180 days (configurable per tenant). Cron job `replay-retention` prunes expired digests and writes audit entries.
-- Cold storage (Evidence Locker): 2 years; legal holds extend via `/evidence/holds`. Ensure holds recorded in `timeline.events` with type `replay.hold.created`.
+- Hot CAS retention: 180 days (configurable per tenant). Cron job `replay-retention` prunes expired digests and writes audit entries.
+- Cold storage (Evidence Locker): 2 years; legal holds extend via `/evidence/holds`. Ensure holds recorded in `timeline.events` with type `replay.hold.created`.
+- Retention declaration: validate against `docs/schemas/replay-retention.schema.json` (frozen 2025-12-10). Include `retention_policy_id`, `tenant_id`, `bundle_type`, `retention_days`, `legal_hold`, `purge_after`, `checksum`, `created_at`. Audit checksum via DSSE envelope when persisting.
 4. **Access control**
- Only service identities with `replay:read` scope may fetch bundles. CLI requires device or client credential flow with DPoP.
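
Review note: the retention declaration added in this hunk lists its fields but not how they relate to each other. The sketch below is one possible reading, not the schema's definition. It assumes that `purge_after` equals `created_at` plus `retention_days`, that an active `legal_hold` blocks purging, and that `checksum` is a SHA-256 digest over the other fields serialized as sorted-key JSON. None of these conventions is confirmed by the diff; `docs/schemas/replay-retention.schema.json` remains authoritative. The function names and the sample values are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

# Field names are taken from the runbook text; everything else is an assumption.
REQUIRED_FIELDS = (
    "retention_policy_id", "tenant_id", "bundle_type", "retention_days",
    "legal_hold", "purge_after", "checksum", "created_at",
)


def build_declaration(policy_id, tenant_id, bundle_type, retention_days,
                      created_at, legal_hold=False):
    """Build a retention declaration dict (hypothetical helper)."""
    # Assumption: purge_after is created_at plus the retention window.
    purge_after = created_at + timedelta(days=retention_days)
    body = {
        "retention_policy_id": policy_id,
        "tenant_id": tenant_id,
        "bundle_type": bundle_type,
        "retention_days": retention_days,
        "legal_hold": legal_hold,
        "purge_after": purge_after.isoformat(),
        "created_at": created_at.isoformat(),
    }
    # Assumption: checksum covers every field except itself, as compact
    # sorted-key JSON. The real canonicalization may differ.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    body["checksum"] = "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return body


def missing_fields(declaration):
    """Return the required field names absent from the declaration."""
    return [name for name in REQUIRED_FIELDS if name not in declaration]


def is_purgeable(declaration, now):
    """True once purge_after has passed and no legal hold is set (assumed rule)."""
    if declaration["legal_hold"]:
        return False
    return now >= datetime.fromisoformat(declaration["purge_after"])
```

Calling `build_declaration("policy-hot", "tenant-a", "input", 180, datetime(2025, 1, 1, tzinfo=timezone.utc))` yields a record that `missing_fields` reports as complete; the presence check does not replace validation against the frozen schema, and signing the checksum in a DSSE envelope is out of scope for this sketch.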
 
 ---
 
-## 3 · Incident response (Replay Integrity)
+## 3 Incident response (Replay Integrity)
 
 | Step | Action | Owner | Notes |
 |------|--------|-------|-------|
@@ -43,13 +44,13 @@ This runbook governs day-to-day replay operations, retention, and incident handl
 | 2 | Lock affected bundles (`POST /evidence/holds`) | Evidence Locker | Reference incident ticket |
 | 3 | Re-run `stella verify` with `--explain` to gather diffs | Scanner Guild | Attach diff JSON to incident |
 | 4 | Check Rekor inclusion proofs (`stella verify --ledger`) | Attestor | Flag if ledger mismatch or stale |
-| 5 | If tool hash drift → coordinate Signer for rotation | Authority/Signer | Rotate DSSE profile, update RootPack |
+| 5 | If tool hash drift -> coordinate Signer for rotation | Authority/Signer | Rotate DSSE profile, update RootPack |
 | 6 | Update incident timeline (`docs/runbooks/replay_ops.md` -> Incident Log) | Ops Guild | Record timestamps and decisions |
 | 7 | Close hold once resolved, publish postmortem | Ops + Docs | Postmortem must reference replay spec sections |
 
 ---
 
-## 4 · Air-gapped workflow
+## 4 Air-gapped workflow
 
 1. Receive Offline Kit bundle containing:
 - `offline/replay/<scan-id>/manifest.json`
@@ -62,17 +63,17 @@ This runbook governs day-to-day replay operations, retention, and incident handl
 
 ---
 
-## 5 · Maintenance checklist
+## 5 Maintenance checklist
 
 - [ ] RootPack rotated quarterly; CLI/Evidence Locker updated with new fingerprints.
-- [ ] CAS retention job executed successfully in the past 24 hours.
+- [ ] CAS retention job executed successfully in the past 24 hours.
 - [ ] Replay verification metrics present in dashboards (x64 + arm64 lanes).
 - [ ] Runbook incident log updated (see section 6) for the last drill.
 - [ ] Offline kit instructions verified against current CLI version.
 
 ---
 
-## 6 · Incident log
+## 6 Incident log
 
 | Date (UTC) | Incident ID | Tenant | Summary | Follow-up |
 |------------|-------------|--------|---------|-----------|
@@ -80,16 +81,16 @@ This runbook governs day-to-day replay operations, retention, and incident handl
 
 ---
 
-## 7 · References
+## 7 References
 
 - `docs/replay/DETERMINISTIC_REPLAY.md`
 - `docs/replay/DEVS_GUIDE_REPLAY.md`
 - `docs/replay/TEST_STRATEGY.md`
-- `docs/modules/platform/architecture-overview.md` §5
+- `docs/modules/platform/architecture-overview.md` section 5
 - `docs/modules/evidence-locker/architecture.md`
 - `docs/modules/telemetry/architecture.md`
 - `docs/implplan/SPRINT_0187_0001_0001_evidence_locker_cli_integration.md`
 
 ---
 
-*Created: 2025-11-03 — Update alongside replay task status changes.*
+*Created: 2025-11-03 - Update alongside replay task status changes.*