git.stella-ops.org/docs/replay/TEST_STRATEGY.md
StellaOps Bot ea970ead2a
2025-11-27 07:46:56 +02:00

Replay Test Strategy

Imposed rule: Replay tests must use frozen inputs (SBOM, advisories, VEX, feeds, policy, tools) and fixed seeds/clocks; any non-determinism is a test failure.

This strategy defines how we validate replayability of Scanner outputs and attestations across tool/definition updates and environments.

1. Goals

  • Prove that a recorded scan bundle (inputs + manifests) replays bit-for-bit across environments.
  • Detect drift from feeds, policy, or tooling changes before shipping releases.
  • Provide auditors with evidence (hashes, DSSE bundles) that replays are deterministic.

2. Test layers

  1. Golden replay: take a recorded bundle (SBOM/VEX/feeds/policy/tool hashes) and rerun it; assert hash equality for the SBOM, findings, VEX, and canonicalised logs. Fail on any difference.
  2. Feed drift guard: rerun the bundle after a feed update; expect differences, and ensure the drift is surfaced (hash mismatch, diff report) rather than silently masked.
  3. Tool upgrade: rerun with the new scanner version; expect stable outputs if there is no functional change, otherwise require documented diffs.
  4. Policy change: rerun with the updated policy; expect the explain trace to show the changed rules and a hash delta; the diff must be recorded.
  5. Offline: replay in sealed mode using only bundle contents; no network access permitted.
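
The pass/fail expectations of the five layers can be sketched as a small decision function. This is a minimal illustration, not the project's implementation: the `Layer` enum, the `layer_passes` name, and the two boolean inputs are hypothetical, and treating the offline layer like a golden replay (no hash difference allowed) is an assumption the list above does not state explicitly.

```python
from enum import Enum


class Layer(Enum):
    GOLDEN = "golden replay"
    FEED_DRIFT = "feed drift guard"
    TOOL_UPGRADE = "tool upgrade"
    POLICY_CHANGE = "policy change"
    OFFLINE = "offline"


def layer_passes(layer: Layer, hashes_differ: bool, diff_recorded: bool = False) -> bool:
    """Return True when the observed replay outcome matches the layer's expectation."""
    if layer in (Layer.GOLDEN, Layer.OFFLINE):
        # Any difference is a failure (offline-as-golden is an assumption).
        return not hashes_differ
    if layer in (Layer.FEED_DRIFT, Layer.POLICY_CHANGE):
        # A difference is expected and must be surfaced in a recorded diff.
        return hashes_differ and diff_recorded
    # Tool upgrade: stable outputs pass; changed outputs need a documented diff.
    return (not hashes_differ) or diff_recorded
```

For example, `layer_passes(Layer.TOOL_UPGRADE, hashes_differ=True)` is `False` until a diff has been recorded for the change.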

3. Inputs

  • Replay bundle contents: sbom, feeds.tar.gz, policy.tar.gz, scanner-image, reachability.graph, runtime-trace (optional), replay.yaml.
  • Hash manifest: SHA-256 for every file; top-level Merkle root.
  • DSSE attestations (optional): for replay manifest and artifacts.
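
As an illustration of the hash manifest, the sketch below computes a SHA-256 digest per file and folds the digests into a single Merkle root. The tree construction is an assumption, since this document does not specify one: leaves hash the relative path plus the file digest in sorted path order, an odd node is paired with itself, and an empty bundle maps to the hash of the empty string. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(bundle_dir):
    """Map each file's POSIX-style relative path to its SHA-256 digest."""
    root = Path(bundle_dir)
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def merkle_root(manifest):
    """Fold the manifest into one root hash (assumed construction, see lead-in)."""
    level = [
        hashlib.sha256(f"{name}\0{digest}".encode("utf-8")).digest()
        for name, digest in sorted(manifest.items())
    ]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pair an odd trailing node with itself
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Sorting the entries before hashing keeps the root independent of filesystem enumeration order, which matters when the same bundle is verified in different environments.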

4. Determinism settings

  • Fixed clock (--fixed-clock <ISO-8601 timestamp>), fixed RNG seed (RNG_SEED), single-threaded mode (SCANNER_MAX_CONCURRENCY=1), stable ordering (sorted inputs), and log filtering (strip timestamps/PIDs).
  • Disable network/egress; rely on bundled feeds/policy.
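
A rough sketch of two of these settings follows: building the environment for a replay run and filtering volatile fields out of a log. The variable names RNG_SEED and SCANNER_MAX_CONCURRENCY come from the list above; the function names, the regular expressions, and the `<ts>`/`<pid>` placeholders are assumptions, and the real log format may need different patterns.

```python
import re

# Assumed log shapes: ISO-8601-style timestamps and "pid=<n>" / "pid: <n>" fields.
_TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?(?:Z|[+-]\d{2}:?\d{2})?"
)
_PID = re.compile(r"\bpid[=:]\s*\d+", re.IGNORECASE)


def replay_environment(seed, base=None):
    """Return a copy of `base` with the determinism variables applied."""
    env = dict(base or {})
    env["RNG_SEED"] = str(seed)
    env["SCANNER_MAX_CONCURRENCY"] = "1"
    return env


def canonicalise_log(text):
    """Replace timestamps and PIDs with placeholders so logs hash identically."""
    lines = []
    for line in text.splitlines():
        line = _TIMESTAMP.sub("<ts>", line)
        line = _PID.sub("pid=<pid>", line)
        lines.append(line.rstrip())
    return "\n".join(lines) + "\n"
```

Two runs that differ only in wall-clock time and process ID then produce the same canonical log, so a hash comparison reflects real behavioural differences only.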

5. Assertions

  • Hash equality for outputs: SBOMs, findings, VEX, logs (canonicalised), determinism.json (if present).
  • Where the bundle carries DSSE signatures or Rekor proofs, verify them; fail on any mismatch, and fail if a signature or proof that is expected is missing.
  • Report diff summary when hashes differ (feed/tool/policy drift).
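
The hash comparison and diff summary could look roughly like the following. It assumes the baseline and replay hashes are already available as name-to-digest mappings; the `diff_summary` name and the keys of the returned dictionary are hypothetical.

```python
def diff_summary(baseline, replay):
    """Compare two {artifact name: hex digest} mappings and summarise the drift."""
    baseline_names = set(baseline)
    replay_names = set(replay)
    added = sorted(replay_names - baseline_names)
    removed = sorted(baseline_names - replay_names)
    changed = sorted(
        name for name in baseline_names & replay_names
        if baseline[name] != replay[name]
    )
    return {
        "added": added,
        "removed": removed,
        "changed": changed,
        # Deterministic only if the artifact sets and every digest agree.
        "deterministic": not (added or removed or changed),
    }
```

Reporting added and removed artifacts separately from changed ones keeps a missing output from being mistaken for a content change.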

6. Tooling

  • CLI: stella replay run --bundle <path> --fixed-clock 2025-11-01T00:00:00Z --seed 1337 --single-threaded.
  • Scripts: scripts/replay/verify_bundle.sh (hash/manifest check), scripts/replay/run_replay.sh (orchestrates fixed settings), scripts/replay/diff_outputs.py (canonical diffs).
  • CI: bench:determinism target executes golden replay on reference bundles; fails on hash delta.
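
A wrapper that assembles the CLI invocation above might look like this. The flags are the ones shown in the CLI bullet; the `replay_command` helper and its clock validation are hypothetical and say nothing about how scripts/replay/run_replay.sh is actually written.

```python
from datetime import datetime


def replay_command(bundle, fixed_clock, seed):
    """Build the argument list for a deterministic replay run."""
    # Reject clocks that are not of the form 2025-11-01T00:00:00Z (raises ValueError).
    datetime.strptime(fixed_clock, "%Y-%m-%dT%H:%M:%SZ")
    return [
        "stella", "replay", "run",
        "--bundle", str(bundle),
        "--fixed-clock", fixed_clock,
        "--seed", str(seed),
        "--single-threaded",
    ]
```

Returning a list rather than a shell string lets the caller pass it to `subprocess.run` without quoting concerns for bundle paths that contain spaces.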

7. Outputs

  • replay-results.json with per-artifact hashes, pass/fail, diff counts.
  • replay.log, filtered (no timestamps/PIDs), and replay.hashes (sha256sum of outputs).
  • Optional DSSE attestation for replay results.
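
To make the relationship between replay.hashes and replay-results.json concrete, the sketch below parses sha256sum-style text and renders a results document. The JSON field names (`pass`, `diff_count`, `artifacts`, `expected`, `actual`, `match`) are hypothetical, as this document does not define the schema, and the parser ignores sha256sum's escaping of unusual file names.

```python
import json


def parse_sha256sum(text):
    """Parse `sha256sum` output lines of the form '<digest>  <name>' or '<digest> *<name>'."""
    hashes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, _, rest = line.partition(" ")
        hashes[rest[1:]] = digest  # drop the mode character (space or '*')
    return hashes


def build_results(baseline, replay):
    """Produce per-artifact records plus an overall pass flag and diff count."""
    artifacts = []
    for name in sorted(set(baseline) | set(replay)):
        expected = baseline.get(name)
        actual = replay.get(name)
        artifacts.append(
            {"name": name, "expected": expected, "actual": actual, "match": expected == actual}
        )
    diff_count = sum(1 for a in artifacts if not a["match"])
    return {"pass": diff_count == 0, "diff_count": diff_count, "artifacts": artifacts}


def render_results(results):
    """Serialise with sorted keys so the results file is itself byte-stable."""
    return json.dumps(results, indent=2, sort_keys=True) + "\n"
```

Sorting artifacts and JSON keys means the results file can be hashed or attested without introducing ordering noise of its own.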

8. Reporting

  • Publish results to CI artifacts; store in Evidence Locker for audit.
  • Add summary to release notes when replay is part of a release gate.

9. Checklists

  • Bundle verified (hash manifest, DSSE if present).
  • Fixed clock/seed/concurrency applied.
  • Network disabled; feeds/policy/tooling from bundle only.
  • Outputs hashed and compared to baseline; diffs recorded.
  • Replay results stored and (optionally) attested.

References

  • docs/modules/scanner/determinism-score.md
  • docs/replay/DETERMINISTIC_REPLAY.md
  • docs/modules/scanner/entropy.md