# StellaOps Reachability Benchmark (Public)

Deterministic, reproducible benchmark for reachability analysis tools.

## Goals

- Provide open cases with ground truth for reachable/unreachable sinks.
- Enforce determinism (hash-stable builds, fixed seeds, pinned deps).
- Enable fair scoring via the `rb-score` CLI and published schemas.

## Layout

- `cases///`: benchmark cases with deterministic Dockerfiles, pinned deps, oracle tests.
- `schemas/`: JSON/YAML schemas for cases, entrypoints, truth, submissions.
- `benchmark/truth/`: ground-truth labels (hidden/internal split optional).
- `benchmark/submissions/`: sample submissions and format reference.
- `tools/scorer/`: `rb-score` CLI and tests.
- `baselines/`: reference runners (Semgrep, CodeQL, Stella) with normalized outputs.
- `ci/`: deterministic CI workflows and scripts.
- `website/`: static site (leaderboard/docs/downloads).

## Determinism & Offline Rules

- No network during build/test; pin images/deps; set `SOURCE_DATE_EPOCH`.
- Sort file lists; stable JSON/YAML emitters; fixed RNG seeds.
- All scripts must succeed on a clean machine with cached toolchain tarballs only.

## Licensing

- Apache-2.0 for all benchmark assets. Third-party snippets must be license-compatible and attributed.

## Quick Start (once populated)

```bash
# schema sanity checks (offline)
python tools/validate.py all schemas/examples

# score a submission (coming in task 513-008)
cd tools/scorer
./rb-score --cases ../cases --truth ../benchmark/truth --submission ../benchmark/submissions/sample.json
```

## Contributing

See CONTRIBUTING.md. Open issues/PRs welcome; please provide hashes and logs for reproducibility.
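
## Appendix: Determinism Sketch

The determinism rules above (sorted inputs, stable JSON emitters, fixed RNG seeds, `SOURCE_DATE_EPOCH`) can be illustrated with a minimal Python sketch. This code is not part of the repository: the function names `build_timestamp`, `stable_json`, `digest`, and `sample_cases` are hypothetical and chosen only for illustration, and the sketch uses the Python standard library alone.

```python
import hashlib
import json
import os
import random


def build_timestamp(default: int = 0) -> int:
    """Read SOURCE_DATE_EPOCH instead of the wall clock (hypothetical helper)."""
    return int(os.environ.get("SOURCE_DATE_EPOCH", default))


def stable_json(obj) -> str:
    """Serialize with sorted keys and fixed separators so the bytes are reproducible."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


def digest(obj) -> str:
    """SHA-256 over the stable serialization; equal content gives an equal hash."""
    return hashlib.sha256(stable_json(obj).encode("utf-8")).hexdigest()


def sample_cases(case_ids, k, seed=0):
    """Pick k case ids reproducibly: sort first, then use a seeded local RNG."""
    rng = random.Random(seed)  # local RNG; does not touch global random state
    return rng.sample(sorted(case_ids), k)
```

Sorting before sampling means the result does not depend on filesystem enumeration order, and hashing the stable serialization gives contributors a value they can attach to issues or PRs when reporting reproducibility results.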