Add receipt input JSON and SHA256 hash for CVSS policy scoring tests

- Introduced a new JSON fixture `receipt-input.json` containing base, environmental, and threat metrics for CVSS scoring.
- Added corresponding SHA256 hash file `receipt-input.sha256` to ensure integrity of the JSON fixture.
2025-12-04 07:30:42 +02:00
parent 2d079d61ed
commit e1262eb916
91 changed files with 19493 additions and 187 deletions
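
The integrity check described in the commit message could look roughly like the sketch below. It is not taken from the repository: the helper names `sha256_of` and `matches_sidecar` are hypothetical, and the sidecar layout (a hex digest as the first whitespace-separated field, optionally followed by a file name, as `sha256sum` emits) is an assumption about `receipt-input.sha256`.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(fixture: Path, sidecar: Path) -> bool:
    """Compare a fixture against its hash sidecar file.

    Assumption: the sidecar's first whitespace-separated field is the hex
    digest; anything after it (such as a file name) is ignored.
    """
    fields = sidecar.read_text(encoding="utf-8").split()
    if not fields:
        return False  # an empty sidecar can never match
    return sha256_of(fixture) == fields[0].lower()
```

A test could call `matches_sidecar(Path("receipt-input.json"), Path("receipt-input.sha256"))` with paths resolved against the fixture directory and fail when it returns `False`.
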
--- a/bench/reachability-benchmark/docs/gaps/benchmark-gaps-remediation.md
+++ b/bench/reachability-benchmark/docs/gaps/benchmark-gaps-remediation.md
@@ -8,11 +8,14 @@ This note closes BENCH-GAPS-513-018, DATASET-GAPS-513-019, and REACH-FIXTURE-GAP
 ## What changed
 - **Benchmark kit manifest + schema**: `benchmark/schemas/benchmark-manifest.schema.json` with signed/hashed entries for cases, truth, baselines, schemas, and tools. Sample at `benchmark/manifest.sample.json`.
 - **Offline verifier**: `tools/verify_manifest.py` validates the manifest against local files (hashes, required entries, DSSE envelope presence) to keep runs deterministic and tamper-evident.
+- **Coverage/trace schemas**: `schemas/coverage.schema.json` and `schemas/trace.schema.json` govern oracle outputs referenced by manifest hashes.
 - **Submission provenance checks**: manifest requires SHA-256 for submission schema, scorer package, and each baseline submission; DSSE path optional but encouraged.
 - **Determinism env templates**: manifest captures `sourceDateEpoch` and per-tool pinned versions; cases must provide build seeds in case metadata.
 - **Unreachability oracles**: truth files must include explicit rationale for unreachable cases; manifest enforces presence of `truth` artifact per case.
 - **Sandbox/redaction guidance**: case metadata must declare `sandbox` and `redaction` policy fields (schema updated) to ensure PII removal and constrained execution.
 - **Resource normalization**: manifest records build/runtime resource limits (cpu/memory) for repeatable benchmarking.
+- **Offline kit & checklist**: dataset safety checklist at `benchmark/checklists/dataset-safety.md`; deterministic packaging via `tools/package_offline_kit.sh`.
+- **Frozen baselines**: Semgrep rulepack hash pinned at `baselines/semgrep/rules.sha256`; manifest supports hashed baseline submissions.

 ## How to use
 ```bash
--- a/bench/reachability-benchmark/docs/submission-guide.md
+++ b/bench/reachability-benchmark/docs/submission-guide.md
@@ -53,7 +53,16 @@ This guide explains how to produce a compliant submission for the Stella Ops rea
  - `submission.json`
  - Tool version & configuration (README)
  - Optional logs and runtime metrics
+- For production submissions, sign `submission.json` with DSSE and record the envelope under `signatures` in the manifest (see `benchmark/manifest.sample.json`).
 - Do **not** include binaries that require network access or licenses we cannot redistribute.

+## Provenance & Manifest
+- Reference kit manifest: `benchmark/manifest.sample.json` (schema: `benchmark/schemas/benchmark-manifest.schema.json`).
+- Validate your bundle offline:
+  ```bash
+  python tools/verify_manifest.py benchmark/manifest.sample.json --root bench/reachability-benchmark
+  ```
+- Determinism templates: `benchmark/templates/determinism/*.env` can be sourced by build scripts per language.
+
 ## Support
 - Open issues in the public repo (once live) or provide a reproducible script that runs fully offline.
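
The submission guide says the determinism templates under `benchmark/templates/determinism/*.env` can be sourced by build scripts. For a build driver written in Python, one possible equivalent is sketched below. It assumes the templates are plain `KEY=VALUE` lines, optionally prefixed with `export`, and it does not perform shell expansion. The functions `load_env_template` and `build_env` are hypothetical and not part of the benchmark tooling.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def load_env_template(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no '=' present, so this is not an assignment
        # Assumption: values are literal; surrounding quotes are dropped.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_env(template: Path, base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Overlay the template's variables on a base environment (default: os.environ)."""
    env = dict(os.environ if base is None else base)
    env.update(load_env_template(template))
    return env
```

The resulting mapping can be passed as the `env` argument of `subprocess.run`, so the build sees the pinned values without changing the parent process environment.
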