Files
git.stella-ops.org/docs/benchmarks/signals/bench-determinism.md
master 2de8d1784b
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
new advisories
2025-11-23 23:38:25 +02:00

50 lines
2.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Determinism Benchmark (cross-scanner) — Draft
Source: advisory “23-Nov-2025 - Benchmarking Determinism in Vulnerability Scoring”. This doc captures the runnable harness pattern and expected outputs for task BENCH-DETERMINISM-401-057.
## Goal
- Measure determinism rate, order-invariance, and CVSS delta σ across scanners when fed identical SBOM+VEX inputs with frozen feeds.
## Minimal harness (Python excerpt)
```python
# run_bench.py (excerpt) — deterministic JSON hashing
def canon(obj):
return json.dumps(obj, sort_keys=True, separators=(',', ':')).encode()
def shas(b):
return hashlib.sha256(b).hexdigest()
for sbom, vex in zip(SBOMS, VEXES):
for scanner, tmpl in SCANNERS.items():
for mode in ("canonical", "shuffled"):
for i in range(10):
if mode == "shuffled":
sb, vx = shuffle(sbom), shuffle(vex)
out = run(tmpl.format(sbom=sb, vex=vx))
norm = normalize(out) # purl, vuln id, base_cvss, effective
blob = canon({"scanner": scanner, "sbom": sbom,
"vex": vex, "findings": norm})
results.append({
"hash": shas(blob), "mode": mode,
"run": i, "scanner": scanner, "sbom": sbom
})
```
## Inputs
- 35 SBOMs (CycloneDX 1.6 / SPDX 3.0.1) + matching VEX docs covering affected/not_affected/fixed.
- Feeds bundle: vendor DBs (NVD, GHSA, OVAL) hashed and frozen.
- Policy: single normalization profile (e.g., prefer vendor scores, CVSS v3.1).
## Metrics
- Determinism rate = identical_hash_runs / total_runs (20 per scanner/SBOM).
- Order-invariance failures (# distinct hashes between canonical vs shuffled).
- CVSS delta σ vs reference; VEX stability (σ_after ≤ σ_before).
## Deliverables
- `bench/determinism/` with harness, hashed inputs, and `results.csv`.
- CI target `bench:determinism` producing determinism% and σ per scanner.
## Links
- Source advisory: `docs/product-advisories/23-Nov-2025 - Benchmarking Determinism in Vulnerability Scoring.md`
- Sprint task: BENCH-DETERMINISM-401-057 (SPRINT_0401_0001_0001_reachability_evidence_chain.md)