Determinism Benchmark Harness (BENCH-DETERMINISM-401-057)
Location: src/Bench/StellaOps.Bench/Determinism
What it does
- Runs a deterministic, offline-friendly benchmark that hashes scanner outputs for paired SBOM/VEX inputs.
- Produces results.csv, inputs.sha256, and summary.json capturing determinism rate.
- Ships with a built-in mock scanner so CI/offline runs do not need external tools.
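As a rough illustration of what hashing scanner output deterministically can involve, the sketch below serializes a JSON document with sorted keys and fixed separators before hashing it. The function name canonical_hash and the choice of SHA-256 over canonical JSON are assumptions made for this example; the normalization run_bench.py actually applies is not described here.

```python
import hashlib
import json


def canonical_hash(scanner_output: dict) -> str:
    """Return a SHA-256 hex digest that does not depend on dict key order.

    Hypothetical helper: the real harness may normalize differently.
    """
    # sort_keys and fixed separators give one byte-for-byte serialization
    # for any two dicts that compare equal.
    canonical = json.dumps(
        scanner_output,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    first = {"findings": [{"purl": "pkg:npm/a@1", "status": "affected"}]}
    second = {"findings": [{"status": "affected", "purl": "pkg:npm/a@1"}]}
    # Same content, different key order: the digests match.
    print(canonical_hash(first) == canonical_hash(second))
```

Array order is left untouched in this sketch, so a scanner that emits findings in a different order after a shuffle would still produce a different hash.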
Quick start
cd src/Bench/StellaOps.Bench/Determinism
python3 run_bench.py --shuffle --runs 3 --output out
Outputs land in out/:
- results.csv – per-run hashes (mode/run/scanner)
- inputs.sha256 – deterministic manifest of SBOM/VEX/config inputs
- summary.json – aggregate determinism rate
Inputs
- SBOMs: inputs/sboms/*.json (sample SPDX provided)
- VEX: inputs/vex/*.json (sample OpenVEX provided)
- Scanner config: configs/scanners.json (defaults to built-in mock scanner)
- Sample manifest: inputs/inputs.sha256 covers the bundled sample SBOM/VEX/config for quick offline verification; regenerate when inputs change.
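A deterministic manifest of the kind listed above can be sketched as follows. The helper name build_manifest and the line layout (digest, two spaces, relative path, as sha256sum prints it) are assumptions for illustration; the bundled inputs.sha256 may use a different layout.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def build_manifest(paths: Iterable[Path], root: Path) -> str:
    """Build manifest text whose content does not depend on input order.

    Hypothetical helper; the manifest format is assumed, not taken from the harness.
    """
    lines = []
    # Sort by POSIX-style relative path so the result is stable across
    # platforms and across the order in which files were discovered.
    for path in sorted(paths, key=lambda p: p.relative_to(root).as_posix()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(root).as_posix()}")
    return "\n".join(lines) + "\n"
```

Regenerating the manifest after an input changes would then amount to calling the helper over the SBOM, VEX, and config files and writing the returned text to disk.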
Adding real scanners
- Add an entry to configs/scanners.json with kind: "command" and a command array, e.g.:
{
  "name": "scannerX",
  "kind": "command",
  "command": ["python", "../../scripts/scannerX_wrapper.py", "{sbom}", "{vex}"]
}
- Commands must write JSON with a top-level findings array; each finding should include purl, vulnerability, status, and base_score.
- Keep commands offline and deterministic; pin any feeds to local bundles before running.
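The output contract above could be met by a wrapper along these lines. This is a minimal sketch: to_findings_document and render are hypothetical names, the raw finding dicts stand in for whatever a real scanner returns, and the stable sort is an extra precaution rather than a documented requirement.

```python
import json

# Fields each finding should carry, per the contract above.
REQUIRED_FIELDS = ("purl", "vulnerability", "status", "base_score")


def to_findings_document(raw_findings):
    """Keep only the expected fields and order findings stably.

    Hypothetical helper: raw_findings is a list of dicts from some scanner.
    """
    findings = [
        {field: item.get(field) for field in REQUIRED_FIELDS}
        for item in raw_findings
    ]
    # Sorting removes any dependence on the order the scanner emitted results.
    findings.sort(key=lambda f: (str(f["purl"]), str(f["vulnerability"])))
    return {"findings": findings}


def render(raw_findings) -> str:
    """Serialize the document with sorted keys for byte-stable output."""
    return json.dumps(to_findings_document(raw_findings), sort_keys=True)
```

A wrapper script would call the scanner on the {sbom} and {vex} paths it receives, pass the results through render, and emit the string wherever the harness reads command output (assumed here to be stdout).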
Determinism expectations
- Canonical and shuffled runs should yield identical hashes per scanner/SBOM/VEX tuple.
- CI should treat determinism_rate < 0.95 as a failure once wired into workflows.
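One plausible reading of these expectations is sketched below: a scanner/SBOM/VEX tuple counts as deterministic when every run and mode produced the same hash, and the rate is the share of such tuples. The row keys (scanner, sbom, vex, hash) and both function names are assumptions for this example; the columns of results.csv and the exact formula used by the harness may differ.

```python
from collections import defaultdict

THRESHOLD = 0.95  # failure threshold stated above


def determinism_rate(rows):
    """Fraction of (scanner, sbom, vex) tuples with a single distinct hash.

    Hypothetical helper: rows is an iterable of dicts with assumed keys.
    """
    hashes_by_tuple = defaultdict(set)
    for row in rows:
        key = (row["scanner"], row["sbom"], row["vex"])
        hashes_by_tuple[key].add(row["hash"])
    if not hashes_by_tuple:
        return 0.0  # no data: treat as not deterministic
    stable = sum(1 for hashes in hashes_by_tuple.values() if len(hashes) == 1)
    return stable / len(hashes_by_tuple)


def passes_gate(rate, threshold=THRESHOLD):
    """True when the rate meets the threshold; below it, CI should fail."""
    return rate >= threshold
```

A CI step could compute the rate from the benchmark output and exit non-zero when passes_gate returns False.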
Maintenance
- Tests live in tests/ and cover shuffle stability + manifest generation.
- Update docs/benchmarks/signals/bench-determinism.md when inputs/outputs change.
- Mirror task status in docs/implplan/SPRINT_0512_0001_0001_bench.md and src/Bench/StellaOps.Bench/TASKS.md.