# Replay Pipeline Contract (Scanner ↔ Worker ↔ CAS)

Purpose: unblock Sprint 0186 replay tasks by defining the worker→webservice contract, manifest fields, and CAS layout for record/replay.

## Bundle layout
- Format: `tar.zst`, deterministic ordering, UTF-8, LF endings.
- Top-level entries:
  - `manifest.json`: canonical JSON, UTF-8.
  - `inputs/`: sealed scan inputs (config, policies, feeds) as provided to the worker.
  - `artifacts/`: analyzer outputs (SBOM, VEX, findings, entropy, logs), named by subject digest and analyzer id.
  - `evidence/`: DSSE envelopes and attestations.
  - `checksums.txt`: SHA256 of every file in bundle (POSIX path + two spaces + hash).

## manifest.json fields
- `scan_id` (uuid), `tenant`, `subject` (image digest or purl).
- `tool`: `id`, `version`, `commit`, `invocation_hash`.
- `policy`: `id`, `version`, `hash`.
- `feeds`: array of `{ id, version, hash }`.
- `inputs_hash`: SHA256 of normalized `inputs/`.
- `artifacts`: array of `{ path, type, analyzer, subject, hash, merkle_root? }`.
- `entropy`: `{ path, hash, penalties }` when present.
- `timeline`: ordered event ids + hashes for replay audit.
- `created_at`: ISO-8601 UTC.

Canonicalization: RFC3339/ISO timestamps, sorted keys (encoder stable), lists sorted by `path` unless natural order documented (timeline).

## Transport
- Worker POSTs to WebService: `POST /api/v1/replay/runs/{scanId}/bundle`
  - Headers: `X-Tenant-Id`, `Content-Type: application/zstd`
  - Body: bundle bytes
  - Response: `201` with `{ cas_uri, manifest_hash, status_url }`
- WebService stores bundle at CAS path: `cas/{subject}/{scan_id}/{manifest_hash}.tar.zst`
  - `manifest_hash` = SHA256(manifest.json canonical bytes)
  - DSSE envelope optional: `cas/.../{manifest_hash}.tar.zst.dsse`

## DSSE signing
- Payload type: `application/vnd.stellaops.replay-bundle+json`
- Body: canonical `manifest.json`
- Signer: Signer service with replay profile; Authority verifies using replay trust root; Rekor optional.

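The canonicalization rule, the `manifest_hash` definition, and the CAS path template in the sections above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the contract's reference encoder: the exact canonical JSON form is not specified here, so the sketch assumes sorted keys with compact separators and UTF-8 output, and it sorts only the `artifacts` list by `path`. The helper names (`normalize`, `canonical_manifest_bytes`, `manifest_hash`, `cas_path`) are hypothetical.

```python
import hashlib
import json


def normalize(manifest: dict) -> dict:
    """Return a copy with `artifacts` sorted by `path`.

    Other lists (for example `timeline`) keep their natural order.
    """
    out = dict(manifest)
    if "artifacts" in out:
        out["artifacts"] = sorted(out["artifacts"], key=lambda a: a["path"])
    return out


def canonical_manifest_bytes(manifest: dict) -> bytes:
    """Serialize the manifest deterministically.

    Assumption: canonical form = sorted keys, compact separators, UTF-8.
    """
    text = json.dumps(
        normalize(manifest),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


def manifest_hash(manifest: dict) -> str:
    """SHA256 over the canonical manifest bytes, as lowercase hex."""
    return hashlib.sha256(canonical_manifest_bytes(manifest)).hexdigest()


def cas_path(subject: str, scan_id: str, digest: str) -> str:
    """Fill in the CAS path template from the Transport section."""
    return f"cas/{subject}/{scan_id}/{digest}.tar.zst"
```

Because the hash is taken over canonical bytes, two manifests that differ only in key order or in the order of `artifacts` entries map to the same `manifest_hash`, which is what makes the `409` response for an already stored bundle safe to treat as idempotent.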
## Determinism rules
- Fixed clock from worker (override via env `STELLAOPS_REPLAY_FIXED_CLOCK`).
- RNG seed carried in manifest (`tool.rng_seed`), replay MUST reuse.
- Concurrency cap recorded (`tool.max_parallel`), replay must honor <= value.
- Log filtering: strip non-deterministic timestamps before hashing.

## Error handling
- 400: missing tenant, bad bundle
- 422: manifest invalid
- 409: manifest_hash already stored (idempotent)
- 500: CAS failure -> retry with backoff

## Validation checklist
- Verify `checksums.txt` matches bundle.
- Verify `inputs_hash` recomputes.
- Verify `manifest_hash` == canonical SHA256(manifest.json).
- Verify DSSE (if present) against replay trust root.
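
The first checklist item can be sketched as follows. This is a hedged example, not the WebService's actual validator: it assumes the bundle has already been unpacked into an in-memory mapping of POSIX path to bytes, that each `checksums.txt` line is `path`, two spaces, then a lowercase hex SHA256, and that `checksums.txt` does not list itself. The function names are hypothetical.

```python
import hashlib


def parse_checksums(text: str) -> dict[str, str]:
    """Parse lines of the form '<posix path>  <sha256 hex>'."""
    entries: dict[str, str] = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split on the LAST double space so paths containing spaces survive.
        path, sep, digest = line.rpartition("  ")
        if not sep or not path:
            raise ValueError(f"malformed checksum line: {line!r}")
        entries[path] = digest
    return entries


def verify_checksums(files: dict[str, bytes], checksums_text: str) -> list[str]:
    """Return a list of problems; an empty list means the bundle matches."""
    expected = parse_checksums(checksums_text)
    problems: list[str] = []
    for path, data in sorted(files.items()):
        if path == "checksums.txt":
            continue  # assumption: the checksum file does not list itself
        if path not in expected:
            problems.append(f"unlisted: {path}")
        elif expected[path] != hashlib.sha256(data).hexdigest():
            problems.append(f"mismatch: {path}")
    # Entries listed in checksums.txt but absent from the bundle.
    for path in sorted(set(expected) - set(files)):
        problems.append(f"missing: {path}")
    return problems
```

Returning every problem rather than failing on the first one gives the caller enough detail to populate a `400` (bad bundle) response body in one pass.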