
Determinism Benchmark Harness (BENCH-DETERMINISM-401-057)

Location: src/Bench/StellaOps.Bench/Determinism

What it does

  • Runs a deterministic, offline-friendly benchmark that hashes scanner outputs for paired SBOM/VEX inputs.
  • Produces results.csv, inputs.sha256, and summary.json; the summary records the aggregate determinism rate.
  • Ships with a built-in mock scanner so CI/offline runs do not need external tools.

Quick start

cd src/Bench/StellaOps.Bench/Determinism
python3 run_bench.py --shuffle --runs 3 --output out

Outputs land in out/:

  • results.csv: per-run hashes (mode/run/scanner)
  • inputs.sha256: deterministic manifest of SBOM/VEX/config inputs
  • summary.json: aggregate determinism rate

Inputs

  • SBOMs: inputs/sboms/*.json (sample SPDX provided)
  • VEX: inputs/vex/*.json (sample OpenVEX provided)
  • Scanner config: configs/scanners.json (defaults to built-in mock scanner)
  • Sample manifest: inputs/inputs.sha256 covers the bundled sample SBOM/VEX/config for quick offline verification; regenerate when inputs change.
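
For a quick offline check of the sample manifest, the sketch below recomputes SHA-256 digests and reports any file that no longer matches. It assumes the manifest uses the common sha256sum layout (hex digest, whitespace, relative path), which this README does not confirm; verify_manifest and sha256_file are illustrative names, not part of the harness.

```python
import hashlib
from pathlib import Path


def sha256_file(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path, base_dir="."):
    """Return the relative paths whose digest is missing or differs.

    Assumes sha256sum-style lines: "<hex digest>  <relative path>".
    """
    mismatches = []
    text = Path(manifest_path).read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.lstrip("*")  # sha256sum marks binary mode with '*'
        target = Path(base_dir) / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

An empty list means every listed input still matches; a non-empty list is a hint that the manifest should be regenerated.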

Adding real scanners

  1. Add an entry to configs/scanners.json with kind: "command" and a command array, e.g.:
{
  "name": "scannerX",
  "kind": "command",
  "command": ["python", "../../scripts/scannerX_wrapper.py", "{sbom}", "{vex}"]
}
  2. Commands must write JSON with a top-level findings array; each finding should include purl, vulnerability, status, and base_score.
  3. Keep commands offline and deterministic; pin any feeds to local bundles before running.
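
The steps above can be sketched as a minimal wrapper. This is a hypothetical outline, not the actual scannerX_wrapper.py: it assumes the harness reads the JSON from stdout (the README only says commands must "write JSON"), and run_scanner is a placeholder to be replaced with the real, offline scanner invocation.

```python
import json
import sys

# Fields each finding should carry, per the list above.
REQUIRED_FIELDS = ("purl", "vulnerability", "status", "base_score")


def build_report(raw_findings):
    """Keep only the expected fields and sort findings for stable output."""
    findings = [
        {field: item.get(field) for field in REQUIRED_FIELDS}
        for item in raw_findings
    ]
    findings.sort(key=lambda f: (str(f["purl"]), str(f["vulnerability"])))
    return {"findings": findings}


def run_scanner(sbom_path, vex_path):
    """Placeholder (assumption): call the real scanner here, offline only."""
    return []


def main(argv):
    sbom_path, vex_path = argv[1], argv[2]
    report = build_report(run_scanner(sbom_path, vex_path))
    # Assumption: the harness captures stdout; sort_keys keeps bytes stable.
    json.dump(report, sys.stdout, sort_keys=True)
    sys.stdout.write("\n")
    return 0


# Run only when invoked as a script with both the SBOM and VEX paths.
if __name__ == "__main__" and len(sys.argv) == 3:
    raise SystemExit(main(sys.argv))
```

Sorting the findings and the JSON keys inside the wrapper removes one common source of run-to-run differences before the harness hashes the output.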

Determinism expectations

  • Canonical and shuffled runs should yield identical hashes per scanner/SBOM/VEX tuple.
  • CI should treat determinism_rate < 0.95 as a failure once wired into workflows.
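
One way to picture these expectations is shown below. It is a sketch under stated assumptions rather than the harness's own logic: canonical_hash, determinism_rate, and passes_gate are illustrative names, and the rate is assumed to be the share of scanner/SBOM/VEX tuples whose runs all produced a single digest.

```python
import hashlib
import json
from collections import defaultdict


def canonical_hash(findings):
    """Hash a findings list independently of list order and key order."""
    ordered = sorted(findings, key=lambda f: json.dumps(f, sort_keys=True))
    payload = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def determinism_rate(rows):
    """rows: iterable of (scanner, sbom, vex, digest) tuples.

    Assumed definition: fraction of tuples with exactly one distinct digest.
    """
    groups = defaultdict(set)
    for scanner, sbom, vex, digest in rows:
        groups[(scanner, sbom, vex)].add(digest)
    if not groups:
        return 0.0
    stable = sum(1 for digests in groups.values() if len(digests) == 1)
    return stable / len(groups)


def passes_gate(rate, threshold=0.95):
    """Mirror the CI rule above: a rate below the threshold is a failure."""
    return rate >= threshold
```

With this definition, a single tuple whose canonical and shuffled runs disagree lowers the rate by 1 / (number of tuples), so small input sets trip the 0.95 threshold quickly.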

Maintenance

  • Tests live in tests/ and cover shuffle stability and manifest generation.
  • Update docs/benchmarks/signals/bench-determinism.md when inputs/outputs change.
  • Mirror task status in docs/implplan/SPRINT_0512_0001_0001_bench.md and src/Bench/StellaOps.Bench/TASKS.md.