# Reachability Benchmark Launch (BENCH-LAUNCH-513-017)

## Audience
- Security engineering and platform teams evaluating reachability analysis tools.
- Benchmark participants (vendors, OSS maintainers) who need deterministic scoring.

## Positioning
- **Deterministic by default:** fixed seeds, SOURCE_DATE_EPOCH builds, sorted outputs.
- **Offline ready:** no registry pulls or telemetry; baselines run without network.
- **Explainable:** truth sets include static/dynamic evidence; scorer rewards path + guards.
- **Vendor-neutral:** Semgrep / CodeQL / Stella baselines provided for comparison.

## What’s included
- Cases across JS, Python, C (Java pending JDK availability).
- Schemas for cases, entrypoints, truth, and submissions.
- Baselines: Semgrep, CodeQL, Stella (offline).
- Tooling: scorer (`rb-score`), leaderboard (`rb-compare`), deterministic CI script (`ci/run-ci.sh`).
- Static site (`website/`) for quick start + leaderboard view.

## How to try it
```bash
# Build and validate
python tools/build/build_all.py --cases cases
python tools/validate.py --schemas schemas

# Run baselines (offline)
bash baselines/semgrep/run_all.sh cases /tmp/semgrep
bash baselines/stella/run_all.sh cases /tmp/stella
bash baselines/codeql/run_all.sh cases /tmp/codeql

# Score your submission
tools/scorer/rb_score.py --truth benchmark/truth/.json --submission submission.json --format json
```

## Key dates
- 2025-12-01: Public beta (v1.0.0 schemas, JS/PY/C cases, offline baselines).
- 2025-12-15 (target): Add Java track once JDK available in CI.
- Quarterly: hidden set rotation + leaderboard refresh.

## Calls to action
- Vendors: submit offline-reproducible `submission.json` for inclusion on the public leaderboard.
- Practitioners: run baselines locally to benchmark internal pipelines.
- OSS: propose new cases via PR; follow determinism checklist in `docs/submission-guide.md`.
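
For participants preparing an offline-reproducible `submission.json`, the determinism properties named under Positioning (SOURCE_DATE_EPOCH timestamps, sorted outputs) might be approached along the following lines. This is a minimal sketch, not the benchmark's own tooling: the field names `case_id`, `sink`, `reachable`, and `generated_at` are hypothetical placeholders, and the real layout is whatever the submission schema in `schemas/` defines.

```python
import hashlib
import json
import os
from datetime import datetime, timezone


def build_timestamp() -> str:
    """Derive the timestamp from SOURCE_DATE_EPOCH so reruns emit identical bytes."""
    # Falls back to epoch 0 when the variable is unset (an assumption of this sketch).
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def canonical_submission(findings: list[dict]) -> str:
    """Serialize findings with a stable record order and stable key order."""
    # Hypothetical sort key; any total order over the records works.
    ordered = sorted(findings, key=lambda f: (f["case_id"], f["sink"]))
    doc = {"generated_at": build_timestamp(), "findings": ordered}
    return json.dumps(doc, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def digest(text: str) -> str:
    """SHA-256 of the serialized submission, useful for comparing two runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Comparing `digest(canonical_submission(findings))` across two runs is a quick check that nothing order-dependent or time-dependent leaked into the output.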

## Risks & mitigations
- **Java track blocked (JDK):** provide runner with JDK>=17; until then Java is excluded from CI.
- **Hidden set leakage:** governed by rotation policy in `docs/governance.md`; no public release of hidden cases.
- **Telemetry drift:** all runner scripts disable telemetry by env; reviewers verify no network calls.
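
As an illustration of the "no network calls" check, a reviewer working with Python-based tooling could wrap a run in a guard that fails on any outbound connection attempt. The sketch below is an assumption about how such a check might look, not the project's review procedure; `no_network` and `NetworkAccessError` are hypothetical names. It only covers sockets opened inside the current Python process, so shell-based runners would still need an OS-level control such as a network-isolated container.

```python
import socket
from contextlib import contextmanager


class NetworkAccessError(RuntimeError):
    """Raised when code inside the guard tries to open a connection."""


@contextmanager
def no_network():
    """Make socket connection attempts raise while the block is active."""
    original_connect = socket.socket.connect
    original_connect_ex = socket.socket.connect_ex

    def blocked(self, address, *args, **kwargs):
        raise NetworkAccessError(f"network call attempted: {address!r}")

    # Patch the Python-level socket class; restored in the finally clause.
    socket.socket.connect = blocked
    socket.socket.connect_ex = blocked
    try:
        yield
    finally:
        socket.socket.connect = original_connect
        socket.socket.connect_ex = original_connect_ex
```

Running in-process scoring or validation code inside `with no_network():` turns an unexpected telemetry or registry call into an immediate, visible failure.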