git.stella-ops.org/docs/modules/airgap/runbooks/quarantine-investigation.md
2026-01-06 19:07:48 +02:00

AirGap Quarantine Investigation Runbook

Purpose

Quarantine preserves failed bundle imports for offline forensic analysis. It keeps the original bundle together with the verification context (failure reason and logs) so operators can diagnose tampering, trust-root drift, or packaging issues without re-running the import in an online environment.

Location & Structure

Default root: /updates/quarantine

Per-tenant layout: /updates/quarantine/<tenantId>/<timestamp>-<reason>-<id>/

Removal staging: /updates/quarantine/<tenantId>/.removed/<quarantineId>/
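
The layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the AirGap tooling: `list_entries` is a hypothetical helper, and it assumes that every directory under a tenant other than `.removed` is a quarantine entry.

```python
from pathlib import Path
from typing import List

DEFAULT_ROOT = Path("/updates/quarantine")  # default root named in this runbook
REMOVED_DIR = ".removed"                    # removal staging directory


def list_entries(tenant_id: str, root: Path = DEFAULT_ROOT) -> List[Path]:
    """Return the tenant's quarantine entry directories, sorted by name.

    Assumption: anything under <root>/<tenantId>/ that is a directory and
    is not the .removed staging area is a quarantine entry.
    """
    tenant_dir = root / tenant_id
    if not tenant_dir.is_dir():
        return []
    return sorted(
        p for p in tenant_dir.iterdir()
        if p.is_dir() and p.name != REMOVED_DIR
    )
```

Passing `root` explicitly makes it easy to point the helper at a copied quarantine tree in an isolated workspace instead of the live importer host.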

Files in a quarantine entry

  • bundle.tar.zst - the original bundle as provided
  • manifest.json - bundle manifest (when available)
  • verification.log - validation step output (TUF/DSSE/Merkle/rotation/monotonicity, etc.)
  • failure-reason.txt - human-readable failure summary (reason + timestamp + metadata)
  • quarantine.json - structured metadata for listing/automation
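
Before digging into an entry it can help to confirm which of these files are actually present. The sketch below is illustrative only: the function names are hypothetical, and treating every file except manifest.json as expected is an assumption drawn from the "when available" note above.

```python
from pathlib import Path
from typing import Dict, List

# File names as listed in this runbook. manifest.json is only present
# "when available", so it is tracked separately.
EXPECTED = ("bundle.tar.zst", "verification.log",
            "failure-reason.txt", "quarantine.json")
OPTIONAL = ("manifest.json",)


def inventory(entry: Path) -> Dict[str, bool]:
    """Map each known file name to whether it exists in the entry."""
    return {name: (entry / name).is_file() for name in EXPECTED + OPTIONAL}


def missing_expected(entry: Path) -> List[str]:
    """List expected files that are absent (assumption: all but the manifest)."""
    present = inventory(entry)
    return [name for name in EXPECTED if not present[name]]
```

An entry with missing files is still worth keeping for audit; the check only tells the investigator which evidence is unavailable.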

Investigation steps (offline)

  1. Identify the tenant and locate the quarantine root on the importer host.
  2. Pick the newest quarantine entry for the tenant (entry names start with a timestamp).
  3. Read failure-reason.txt first to capture the top-level reason and metadata.
  4. Review verification.log for the precise failing step.
  5. If needed, extract and inspect bundle.tar.zst in an isolated workspace (no network).
  6. Decide whether the entry should be retained (for audit) or removed after investigation.
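
Steps 2 to 4 can be sketched as a small offline triage script. Everything here is an assumption made for illustration: the helper names are hypothetical, "newest" relies on the timestamp prefix sorting lexicographically, and the keyword list is a guess at what a failing line in verification.log might contain, since the log format is not specified in this runbook.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple


def newest_entry(tenant_dir: Path) -> Optional[Path]:
    """Pick the entry whose name sorts last (assumes a sortable timestamp prefix)."""
    entries = sorted(
        p for p in tenant_dir.iterdir()
        if p.is_dir() and not p.name.startswith(".")  # skips .removed
    )
    return entries[-1] if entries else None


def triage(entry: Path,
           keywords: Tuple[str, ...] = ("fail", "error", "invalid", "mismatch"),
           ) -> Dict[str, object]:
    """Collect the failure reason and log lines that look like failures."""
    reason_path = entry / "failure-reason.txt"
    reason = ""
    if reason_path.is_file():
        reason = reason_path.read_text(encoding="utf-8", errors="replace").strip()

    suspect: List[Tuple[int, str]] = []
    log_path = entry / "verification.log"
    if log_path.is_file():
        text = log_path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            # Hypothetical heuristic: flag lines containing a failure keyword.
            if any(word in line.lower() for word in keywords):
                suspect.append((number, line.strip()))

    return {"entry": entry.name, "reason": reason, "suspect_lines": suspect}
```

The script only reads files, so it is safe to run against the live quarantine directory; extracting bundle.tar.zst (step 5) should still happen in a separate workspace with no network access.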

Removal & Retention

  • Removal requires a human-provided reason (audit trail). Implementations should use the quarantine service's remove operation, which moves entries under .removed/ rather than deleting them in place.
  • Retention and quota controls are configured via AirGap:Quarantine settings (root, TTL, max size); TTL cleanup can remove entries older than the retention period.
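
To preview what a TTL cleanup might touch, a read-only check along the following lines can be useful. This is a sketch under stated assumptions, not the service's cleanup logic: `expired_entries` is a hypothetical helper, and it uses the directory's modification time as the entry age, whereas the real cleanup may rely on the metadata in quarantine.json. Actual removal should still go through the quarantine service.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import List, Optional


def expired_entries(tenant_dir: Path,
                    ttl: timedelta,
                    now: Optional[datetime] = None) -> List[Path]:
    """Dry run: list entries whose directory mtime is older than the TTL.

    Assumption: directory mtime approximates when the entry was quarantined.
    Nothing is moved or deleted here.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - ttl).timestamp()
    return sorted(
        p for p in tenant_dir.iterdir()
        if p.is_dir() and p.name != ".removed" and p.stat().st_mtime < cutoff
    )
```

For example, `expired_entries(Path("/updates/quarantine") / tenant_id, timedelta(days=30))` would list entries older than a 30-day retention period; the 30-day figure is only an example value, not a documented default.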

Common failure categories

  • tuf:* - invalid/expired metadata or snapshot hash mismatch
  • dsse:* - signature invalid or trust root mismatch
  • merkle-* - payload entry set invalid or empty
  • rotation:* - root rotation policy failure (dual approval, no-op rotation, etc.)
  • version-non-monotonic:* - rollback prevention triggered (force activation requires a justification)
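
When sorting through many entries, the reason prefixes above can be mapped to their categories with a simple lookup. This is a minimal sketch: the `classify` helper is hypothetical, and the text after each prefix in a real reason string is not specified in this runbook, so only the prefix is matched.

```python
# Prefixes and descriptions taken from the list above.
CATEGORIES = {
    "tuf:": "invalid/expired metadata or snapshot hash mismatch",
    "dsse:": "signature invalid or trust root mismatch",
    "merkle-": "payload entry set invalid or empty",
    "rotation:": "root rotation policy failure",
    "version-non-monotonic:": "rollback prevention triggered",
}


def classify(reason: str) -> str:
    """Return the category description for a failure reason, by prefix."""
    for prefix, description in CATEGORIES.items():
        if reason.startswith(prefix):
            return description
    return "unknown"
```

A reason that returns "unknown" is a signal to read verification.log directly rather than rely on the category.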