Determinism Benchmark Harness (BENCH-DETERMINISM-401-057)

Location: src/Bench/StellaOps.Bench/Determinism

What it does

Runs a deterministic, offline-friendly benchmark that hashes scanner outputs for paired SBOM/VEX inputs.
Produces results.csv, inputs.sha256, and summary.json capturing determinism rate.
Ships with a built-in mock scanner so CI/offline runs do not need external tools.
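
As a rough illustration of the idea (not the harness's actual code), the sketch below hashes a scanner's JSON output after serializing it with sorted keys, then checks that a shuffled input yields the same digest. The `mock_scan` function, the placeholder vulnerability ID, and the sample purls are hypothetical stand-ins; the built-in mock scanner in `run_bench.py` may work differently.

```python
import hashlib
import json
import random


def canonical_hash(document):
    """SHA-256 of a JSON document serialized with sorted keys and fixed separators."""
    data = json.dumps(document, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def mock_scan(components):
    """Hypothetical stand-in scanner: one placeholder finding per component."""
    findings = [
        {
            "purl": purl,
            "vulnerability": "CVE-0000-0000",  # placeholder ID, not a real advisory
            "status": "affected",
            "base_score": 0.0,
        }
        for purl in components
    ]
    # Sorting the findings is what makes the output independent of input order.
    findings.sort(key=lambda f: (f["purl"], f["vulnerability"]))
    return {"findings": findings}


components = ["pkg:npm/a@1.0.0", "pkg:pypi/b@2.0.0", "pkg:maven/c/d@3.0.0"]
canonical = canonical_hash(mock_scan(components))

shuffled_input = components[:]
random.Random(7).shuffle(shuffled_input)  # fixed seed keeps the demo repeatable
shuffled = canonical_hash(mock_scan(shuffled_input))

print(canonical == shuffled)  # True: same findings, same digest
```

A scanner that emitted findings in input order would fail this check, which is the kind of instability the benchmark is meant to surface.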

Quick start

cd src/Bench/StellaOps.Bench/Determinism
python3 run_bench.py --shuffle --runs 3 --output out

Outputs land in out/:

results.csv – per-run hashes (mode/run/scanner)
inputs.sha256 – deterministic manifest of SBOM/VEX/config inputs
summary.json – aggregate determinism rate
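
To make the relationship between results.csv and the aggregate rate concrete, here is a minimal sketch under stated assumptions. The column names (`scanner`, `sbom`, `vex`, `mode`, `run`, `hash`) and the definition of the rate (share of scanner/SBOM/VEX groups whose runs all produced one hash) are guesses for illustration; check the real file header and `run_bench.py` before relying on either.

```python
import csv
import io
from collections import defaultdict


def determinism_rate(csv_text):
    """Fraction of (scanner, sbom, vex) groups whose runs all share one hash."""
    hashes = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["scanner"], row["sbom"], row["vex"])
        hashes[key].add(row["hash"])
    if not hashes:
        return 0.0  # no rows: report 0.0 rather than dividing by zero
    stable = sum(1 for seen in hashes.values() if len(seen) == 1)
    return stable / len(hashes)


# Assumed layout; the real results.csv may use different columns.
sample = """scanner,sbom,vex,mode,run,hash
mock,a.json,a.vex.json,canonical,0,aaa
mock,a.json,a.vex.json,shuffled,0,aaa
mock,b.json,b.vex.json,canonical,0,bbb
mock,b.json,b.vex.json,shuffled,0,ccc
"""

print(determinism_rate(sample))  # 0.5: one of the two groups is stable
```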

Inputs

SBOMs: inputs/sboms/*.json (sample SPDX provided)
VEX: inputs/vex/*.json (sample OpenVEX provided)
Scanner config: configs/scanners.json (defaults to built-in mock scanner)
Sample manifest: inputs/inputs.sha256 covers the bundled sample SBOM/VEX/config for quick offline verification; regenerate when inputs change.
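
One way to regenerate a manifest like this is sketched below. It assumes the `sha256sum` text layout (hex digest, two spaces, relative path) and sorts paths so the file is byte-identical across runs; the harness's own manifest format and generator may differ. The demo writes throwaway files into a temporary directory rather than touching the real inputs.

```python
import hashlib
import tempfile
from pathlib import Path

# Glob patterns taken from the input locations listed above.
PATTERNS = ["inputs/sboms/*.json", "inputs/vex/*.json", "configs/scanners.json"]


def build_manifest(root, patterns):
    """Return sha256sum-style lines for all matching files, sorted by relative path."""
    root = Path(root)
    files = {p for pattern in patterns for p in root.glob(pattern) if p.is_file()}
    lines = []
    for path in sorted(files, key=lambda p: p.relative_to(root).as_posix()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(root).as_posix()}\n")
    return "".join(lines)


# Demo against a scratch tree that mimics the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for relative in ("inputs/sboms/sample.json", "inputs/vex/sample.json", "configs/scanners.json"):
        target = base / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("{}", encoding="utf-8")
    manifest = build_manifest(base, PATTERNS)

print(manifest, end="")
```

Sorting by the POSIX form of the relative path avoids differences between filesystems that return directory entries in different orders.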

Adding real scanners

Add an entry to configs/scanners.json with kind: "command" and a command array, e.g.:

{
  "name": "scannerX",
  "kind": "command",
  "command": ["python", "../../scripts/scannerX_wrapper.py", "{sbom}", "{vex}"]
}

Commands must write JSON with a top-level findings array; each finding should include purl, vulnerability, status, and base_score.
Keep commands offline and deterministic; pin any feeds to local bundles before running.
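
Before wiring in a new command, it can help to sanity-check its output against the shape described above. The following is a small, hypothetical checker (not part of the harness) that looks for a top-level findings array and the four expected fields on each finding; it does not validate the field values themselves.

```python
import json

# Field names taken from the requirement stated above.
REQUIRED_FIELDS = ("purl", "vulnerability", "status", "base_score")


def check_scanner_output(text):
    """Return a list of problems found in a scanner's JSON output (empty if none)."""
    try:
        document = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(document, dict):
        return ["top-level value is not an object"]
    findings = document.get("findings")
    if not isinstance(findings, list):
        return ["missing top-level 'findings' array"]
    problems = []
    for index, finding in enumerate(findings):
        if not isinstance(finding, dict):
            problems.append(f"findings[{index}] is not an object")
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in finding]
        if missing:
            problems.append(f"findings[{index}] missing: {', '.join(missing)}")
    return problems


good = '{"findings": [{"purl": "pkg:npm/a@1.0.0", "vulnerability": "X-1", "status": "affected", "base_score": 7.5}]}'
bad = '{"findings": [{"purl": "pkg:npm/a@1.0.0"}]}'

print(check_scanner_output(good))  # []
print(check_scanner_output(bad))   # ['findings[0] missing: vulnerability, status, base_score']
```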

Determinism expectations

Canonical and shuffled runs should yield identical hashes per scanner/SBOM/VEX tuple.
CI should treat determinism_rate < 0.95 as a failure once wired into workflows.
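
A CI gate for the threshold above could look roughly like this. It assumes summary.json exposes the rate as a top-level `determinism_rate` key, which is not confirmed here; the function names are illustrative.

```python
import json

THRESHOLD = 0.95  # failure threshold stated above


def passes_gate(summary, threshold=THRESHOLD):
    """True when the reported determinism rate meets the threshold."""
    return float(summary["determinism_rate"]) >= threshold


def exit_code(summary, threshold=THRESHOLD):
    """Map the gate result to a process exit code (0 = pass, 1 = fail)."""
    return 0 if passes_gate(summary, threshold) else 1


# In CI this would be loaded from out/summary.json; parsed inline here for the demo.
summary = json.loads('{"determinism_rate": 0.9}')
print(exit_code(summary))  # 1: 0.9 is below 0.95
```

In a workflow step, the result of `exit_code` would be passed to `sys.exit` so that a rate below the threshold fails the job.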

Maintenance

Tests live in tests/ and cover shuffle stability + manifest generation.
Update docs/benchmarks/signals/bench-determinism.md when inputs/outputs change.
Mirror task status in docs/implplan/SPRINT_0512_0001_0001_bench.md and src/Bench/StellaOps.Bench/TASKS.md.