Replay Test Strategy
Imposed rule: Replay tests must use frozen inputs (SBOM, advisories, VEX, feeds, policy, tools) and fixed seeds/clocks; any non-determinism is a test failure.
This strategy defines how we validate replayability of Scanner outputs and attestations across tool/definition updates and environments.
1. Goals
- Prove that a recorded scan bundle (inputs + manifests) replays bit-for-bit across environments.
- Detect drift from feeds, policy, or tooling changes before shipping releases.
- Provide auditors with evidence (hashes, DSSE bundles) that replays are deterministic.
2. Test layers
- Golden replay: take a recorded bundle (SBOM/VEX/feeds/policy/tool hashes) and rerun; assert hash equality for SBOM, findings, VEX, logs. Fail on any difference.
- Feed drift guard: rerun bundle after feed update; expect differences; ensure drift is surfaced (hash mismatch, diff report) not silently masked.
- Tool upgrade: rerun with new scanner version; expect stable outputs if no functional change, otherwise require documented diffs.
- Policy change: rerun with updated policy; expect explain trace to show changed rules and hash delta; diff must be recorded.
- Offline: replay in sealed mode using only bundle contents; no network access permitted.
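The golden replay and drift guard layers both reduce to comparing per-artifact hashes between a baseline run and a replay. A minimal sketch of that comparison follows; the directory layout and the function names `hash_tree` and `compare_runs` are illustrative assumptions, not part of the Scanner tooling.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file under root (by relative POSIX path) to its SHA-256 hex digest."""
    digests = {}
    # Sort so the mapping is built in a stable order regardless of filesystem order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digests[path.relative_to(root).as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_runs(baseline: dict[str, str], replay: dict[str, str]) -> list[str]:
    """Return the sorted artifact names that are missing, extra, or differ in hash."""
    names = set(baseline) | set(replay)
    return sorted(name for name in names if baseline.get(name) != replay.get(name))
```

Under this sketch, a golden replay would fail whenever `compare_runs` returns a non-empty list, while a feed drift guard would fail when the list is empty after a feed update, since that would mean the drift was masked.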
3. Inputs
- Replay bundle contents: sbom, feeds.tar.gz, policy.tar.gz, scanner-image, reachability.graph, runtime-trace (optional), replay.yaml.
- Hash manifest: SHA-256 for every file; top-level Merkle root.
- DSSE attestations (optional): for replay manifest and artifacts.
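To illustrate how a hash manifest with a top-level Merkle root could be derived, here is a sketch. The document does not specify the tree construction, so the choices below are assumptions: leaves are the per-file SHA-256 digests ordered by file name, parents are the SHA-256 of the two concatenated child digests, and the last node is duplicated on odd-sized levels. The manifest field names are also hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaf_hashes: list[str]) -> str:
    """Fold hex leaf digests pairwise into a single root (assumed construction)."""
    if not leaf_hashes:
        raise ValueError("manifest has no entries")
    level = [bytes.fromhex(h) for h in leaf_hashes]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def build_manifest(files: dict[str, bytes]) -> dict:
    """Hash every file and derive the root over entries sorted by name."""
    entries = {name: sha256_hex(files[name]) for name in sorted(files)}
    return {"files": entries, "merkle_root": merkle_root(list(entries.values()))}
```

Sorting by name before hashing means the root depends only on file names and contents, not on the order in which files were added to the bundle.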
4. Determinism settings
- Fixed clock (--fixed-clock ISO-8601), RNG seed (RNG_SEED), single-threaded mode (SCANNER_MAX_CONCURRENCY=1), stable ordering (sorted inputs), log filtering (strip timestamps/PIDs).
- Disable network/egress; rely on bundled feeds/policy.
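Log filtering is the setting most likely to need custom code, so a small sketch may help. It assumes that timestamps appear in ISO-8601 form and that process IDs appear as `pid=1234` or `[1234]`; the real Scanner log format may differ, and the placeholder tokens are illustrative.

```python
import re

# Assumed log conventions: ISO-8601 timestamps and "pid=N" or "[N]" process IDs.
TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)
PID = re.compile(r"\bpid=\d+|\[\d+\]")


def canonicalise_log(text: str) -> str:
    """Replace volatile fields with fixed tokens so that log hashes are comparable."""
    lines = []
    for line in text.splitlines():
        line = TIMESTAMP.sub("<ts>", line)
        line = PID.sub("<pid>", line)
        lines.append(line.rstrip())  # trailing whitespace is not meaningful
    return "\n".join(lines) + "\n"
```

Replacing volatile fields with fixed tokens, rather than deleting them, keeps the line structure intact, which makes any remaining diff easier to read.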
5. Assertions
- Hash equality for outputs: SBOMs, findings, VEX, logs (canonicalised), determinism.json (if present).
- Verify DSSE signatures and Rekor proofs when the bundle includes them; fail if any expected signature or proof is mismatched or missing.
- Report diff summary when hashes differ (feed/tool/policy drift).
6. Tooling
- CLI: stella replay run --bundle <path> --fixed-clock 2025-11-01T00:00:00Z --seed 1337 --single-threaded
- Scripts: scripts/replay/verify_bundle.sh (hash/manifest check), scripts/replay/run_replay.sh (orchestrates fixed settings), scripts/replay/diff_outputs.py (canonical diffs).
- CI: bench:determinism target executes golden replay on reference bundles; fails on hash delta.
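For callers that drive the CLI from a test harness, the invocation above can be assembled programmatically. The sketch below only builds the argument list and environment from the flags and variables named in this document; whether the CLI reads RNG_SEED and SCANNER_MAX_CONCURRENCY from the environment in addition to its flags is an assumption, and the helper names are hypothetical.

```python
import os
import subprocess
from typing import Optional


def replay_command(bundle: str, fixed_clock: str, seed: int) -> list[str]:
    """Build the argv for the documented replay invocation."""
    return [
        "stella", "replay", "run",
        "--bundle", bundle,
        "--fixed-clock", fixed_clock,
        "--seed", str(seed),
        "--single-threaded",
    ]


def replay_env(seed: int, base: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Copy the environment and pin the determinism variables (assumed to be honoured)."""
    env = dict(os.environ if base is None else base)
    env["RNG_SEED"] = str(seed)
    env["SCANNER_MAX_CONCURRENCY"] = "1"
    return env


def run_replay(bundle: str, fixed_clock: str, seed: int) -> int:
    """Run the replay and return the process exit code."""
    result = subprocess.run(
        replay_command(bundle, fixed_clock, seed),
        env=replay_env(seed),
        check=False,
    )
    return result.returncode
```

Passing the arguments as a list avoids shell quoting issues with bundle paths that contain spaces.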
7. Outputs
- replay-results.json with per-artifact hashes, pass/fail, diff counts.
- replay.log filtered (no timestamps/PIDs), replay.hashes (sha256sum of outputs).
- Optional DSSE attestation for replay results.
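As a rough illustration of what replay-results.json might contain, the sketch below builds a report from two name-to-hash mappings. The schema (field names `passed`, `diff_count`, `artifacts`, `expected`, `actual`, `match`) is hypothetical; the actual file layout is not defined in this document.

```python
import json


def build_results(baseline: dict[str, str], replay: dict[str, str]) -> dict:
    """Summarise per-artifact hashes, an overall pass/fail flag, and a diff count."""
    artifacts = {}
    for name in sorted(set(baseline) | set(replay)):
        expected, actual = baseline.get(name), replay.get(name)
        artifacts[name] = {
            "expected": expected,
            "actual": actual,
            "match": expected == actual,
        }
    diff_count = sum(1 for entry in artifacts.values() if not entry["match"])
    return {"passed": diff_count == 0, "diff_count": diff_count, "artifacts": artifacts}


def render_results(results: dict) -> str:
    """Serialise with sorted keys so the report itself is byte-stable."""
    return json.dumps(results, sort_keys=True, indent=2) + "\n"
```

Serialising with sorted keys matters here because the report may itself be hashed or attested, so its bytes should not depend on dictionary insertion order.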
8. Reporting
- Publish results to CI artifacts; store in Evidence Locker for audit.
- Add summary to release notes when replay is part of a release gate.
9. Checklists
- Bundle verified (hash manifest, DSSE if present).
- Fixed clock/seed/concurrency applied.
- Network disabled; feeds/policy/tooling from bundle only.
- Outputs hashed and compared to baseline; diffs recorded.
- Replay results stored + (optionally) attested.
References
- docs/modules/scanner/determinism-score.md
- docs/replay/DETERMINISTIC_REPLAY.md
- docs/modules/scanner/entropy.md