StellaOps Bot ea970ead2a

2025-11-27 07:46:56 +02:00

Replay Test Strategy

Imposed rule: Replay tests must use frozen inputs (SBOM, advisories, VEX, feeds, policy, tools) and fixed seeds/clocks; any non-determinism is a test failure.

This strategy defines how we validate replayability of Scanner outputs and attestations across tool/definition updates and environments.

1. Goals

Prove that a recorded scan bundle (inputs + manifests) replays bit-for-bit across environments.
Detect drift from feeds, policy, or tooling changes before shipping releases.
Provide auditors with evidence (hashes, DSSE bundles) that replays are deterministic.

2. Test layers

Golden replay: take a recorded bundle (SBOM/VEX/feeds/policy/tool hashes) and rerun it; assert hash equality for SBOM, findings, VEX, and canonicalised logs. Fail on any difference.
Feed drift guard: rerun the bundle after a feed update; differences are expected, and the test checks that the drift is surfaced (hash mismatch, diff report) rather than silently masked.
Tool upgrade: rerun with the new scanner version; expect identical outputs when the upgrade carries no functional change, otherwise require documented diffs.
Policy change: rerun with the updated policy; expect the explain trace to show the changed rules and a hash delta; the diff must be recorded.
Offline: replay in sealed mode using only bundle contents; no network access permitted.
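
The golden replay layer reduces to comparing two maps of artifact name to SHA-256 digest. The following is a minimal sketch of that comparison, not the project's implementation; the function names and the in-memory `outputs` mapping are hypothetical and exist only for illustration.

```python
import hashlib


def hash_outputs(outputs: dict[str, bytes]) -> dict[str, str]:
    """Map each artifact name to the SHA-256 hex digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in outputs.items()}


def differing_artifacts(baseline: dict[str, str], replay: dict[str, str]) -> list[str]:
    """Return the sorted names whose digests differ or that exist on one side only."""
    names = set(baseline) | set(replay)
    return sorted(name for name in names if baseline.get(name) != replay.get(name))


def assert_golden_replay(baseline: dict[str, str], replay: dict[str, str]) -> None:
    """Golden replay rule: any difference at all is a test failure."""
    diffs = differing_artifacts(baseline, replay)
    if diffs:
        raise AssertionError(f"replay drift in: {', '.join(diffs)}")
```

A missing or extra artifact counts as a difference here, so a replay that silently drops an output fails in the same way as one that changes it.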

3. Inputs

Replay bundle contents: sbom, feeds.tar.gz, policy.tar.gz, scanner-image, reachability.graph, runtime-trace (optional), replay.yaml.
Hash manifest: SHA-256 for every file; top-level Merkle root.
DSSE attestations (optional): for replay manifest and artifacts.
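
As an illustration of the hash manifest, the sketch below computes a SHA-256 digest per file and folds the entries into a single Merkle root. The strategy does not specify the tree construction, so everything about it is an assumption: leaves are `path:digest` strings sorted by path, parents hash the concatenation of their two children, and an odd node is paired with itself.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """SHA-256 hex digest for every file, keyed by path in sorted order."""
    return {path: _sha256(files[path]).hex() for path in sorted(files)}


def merkle_root(manifest: dict[str, str]) -> str:
    """Fold the manifest into one root digest (assumed construction, see lead-in)."""
    # Each leaf binds a path to its digest so a rename also changes the root.
    level = [_sha256(f"{path}:{digest}".encode()) for path, digest in sorted(manifest.items())]
    if not level:
        return _sha256(b"").hex()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()
```

Sorting by path makes the root independent of the order in which files are discovered, in keeping with the stable-ordering requirement.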

4. Determinism settings

Fixed clock (--fixed-clock ISO-8601), RNG seed (RNG_SEED), single-threaded mode (SCANNER_MAX_CONCURRENCY=1), stable ordering (sorted inputs), log filtering (strip timestamps/PIDs).
Disable network/egress; rely on bundled feeds/policy.
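
Log filtering can be sketched as a pair of regular-expression substitutions applied before hashing. The log format is an assumption: the sketch expects ISO-8601 timestamps and PIDs written as `pid=<number>`. It also replaces them with fixed placeholders rather than deleting them, so the line structure stays readable in diffs.

```python
import re

# Assumed formats: ISO-8601 timestamps and "pid=<digits>" tokens.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?")
PID = re.compile(r"\bpid=\d+")


def canonicalise_log(text: str) -> str:
    """Replace run-specific tokens with placeholders and normalise line endings."""
    lines = []
    for line in text.splitlines():
        line = TIMESTAMP.sub("<ts>", line)
        line = PID.sub("pid=<pid>", line)
        lines.append(line.rstrip())
    return "\n".join(lines) + "\n"
```

Two runs that differ only in clock and process ID then canonicalise to identical text, so their log hashes match.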

5. Assertions

Hash equality for outputs: SBOMs, findings, VEX, logs (canonicalised), determinism.json (if present).
Verify DSSE signatures and Rekor proofs when the bundle provides them; fail if a signature or proof does not verify, or if one the bundle declares is missing.
Report diff summary when hashes differ (feed/tool/policy drift).

6. Tooling

CLI: stella replay run --bundle <path> --fixed-clock 2025-11-01T00:00:00Z --seed 1337 --single-threaded.
Scripts: scripts/replay/verify_bundle.sh (hash/manifest check), scripts/replay/run_replay.sh (orchestrates fixed settings), scripts/replay/diff_outputs.py (canonical diffs).
CI: bench:determinism target executes golden replay on reference bundles; fails on hash delta.

7. Outputs

replay-results.json with per-artifact hashes, pass/fail, diff counts.
replay.log, filtered (no timestamps/PIDs), and replay.hashes (sha256sum of outputs).
Optional DSSE attestation for replay results.
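
The shape of replay-results.json is not specified here, so the sketch below is only one possible layout. Every field name is hypothetical, and `diff_count` is taken to mean the number of artifacts whose hashes differ. Keys are sorted on serialisation so that the results file is itself reproducible.

```python
import json


def build_replay_results(baseline: dict[str, str], replay: dict[str, str]) -> dict:
    """Summarise per-artifact hashes, pass/fail status, and a diff count."""
    artifacts = []
    for name in sorted(set(baseline) | set(replay)):
        expected = baseline.get(name)  # None when the artifact is new in the replay
        actual = replay.get(name)      # None when the replay did not produce it
        artifacts.append({
            "name": name,
            "baseline_sha256": expected,
            "replay_sha256": actual,
            "status": "pass" if expected == actual else "fail",
        })
    diff_count = sum(1 for a in artifacts if a["status"] == "fail")
    return {
        "artifacts": artifacts,
        "diff_count": diff_count,
        "status": "pass" if diff_count == 0 else "fail",
    }


def render_results(results: dict) -> str:
    """Serialise with sorted keys so identical results give identical bytes."""
    return json.dumps(results, sort_keys=True, indent=2) + "\n"
```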

8. Reporting

Publish results to CI artifacts; store in Evidence Locker for audit.
Add summary to release notes when replay is part of a release gate.

9. Checklists

Bundle verified (hash manifest, DSSE if present).
Fixed clock/seed/concurrency applied.
Network disabled; feeds/policy/tooling from bundle only.
Outputs hashed and compared to baseline; diffs recorded.
Replay results stored and (optionally) attested.

References

docs/modules/scanner/determinism-score.md
docs/replay/DETERMINISTIC_REPLAY.md
docs/modules/scanner/entropy.md
