# Reachability Benchmark · Submission Guide
This guide explains how to produce a compliant submission for the Stella Ops reachability benchmark. The entire workflow is offline-friendly.
## Prerequisites
- Python 3.11+
- Your analyzer toolchain (no network calls during analysis)
- Schemas from `schemas/` and truth from `benchmark/truth/`
## Steps
1) **Build cases deterministically**
```bash
python tools/build/build_all.py --cases cases
```
- Sets `SOURCE_DATE_EPOCH`.
- Uses vendored Temurin 21 via `tools/java/ensure_jdk.sh` when `JAVA_HOME`/`javac` are missing; pass `--skip-lang` if a language's toolchain is unavailable on your runner.
2) **Run your analyzer**
- For each case, produce sink predictions as machine-readable JSON (a driver sketch follows this list).
- Do not reach out to the internet, package registries, or remote APIs.
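For illustration, a minimal per-case driver could look like the sketch below. The analyzer command, the one-directory-per-case layout, and the prediction format are assumptions; substitute your own toolchain.

```python
import json
import os
import subprocess
from pathlib import Path

CASES_DIR = Path("cases")                       # assumption: one directory per case
ANALYZER = ["my-analyzer", "--format", "json"]  # hypothetical command

# Point proxies at a dead address so proxy-aware clients that
# accidentally reach for the network fail fast.
env = dict(os.environ, HTTP_PROXY="http://127.0.0.1:9",
           HTTPS_PROXY="http://127.0.0.1:9", NO_PROXY="")

predictions = {}
for case_dir in sorted(CASES_DIR.iterdir()):    # sorted for determinism
    if not case_dir.is_dir():
        continue
    result = subprocess.run(
        ANALYZER + [str(case_dir)],
        capture_output=True, text=True, env=env, check=True,
    )
    predictions[case_dir.name] = json.loads(result.stdout)
```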
3) **Emit `submission.json`**
- Must conform to `schemas/submission.schema.json` (`version: 1.0.0`).
- Sort cases and sinks alphabetically to ensure determinism.
- Include optional runtime stats under `run` (`time_s`, `peak_mb`) if available. A minimal emitter sketch follows.
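The sketch below only illustrates the determinism requirements (sorted cases, sorted sinks, sorted keys). Only `version` and `run` come from this guide; `cases`, `name`, and `sinks` are assumed field names, so treat `schemas/submission.schema.json` as the authoritative shape.

```python
import json

# `predictions` as produced by the driver sketch above: case name -> sink list.
submission = {
    "version": "1.0.0",
    "run": {"time_s": 42.0, "peak_mb": 512},        # optional runtime stats
    "cases": [
        {"name": name, "sinks": sorted(sinks)}          # sinks sorted
        for name, sinks in sorted(predictions.items())  # cases sorted
    ],
}

with open("submission.json", "w", encoding="utf-8") as fh:
    json.dump(submission, fh, indent=2, sort_keys=True)
    fh.write("\n")
```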
4) **Validate**
```bash
python tools/validate.py --submission submission.json --schema schemas/submission.schema.json
```
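`tools/validate.py` is the authoritative checker, but if you want the same gate inside your own pipeline, the check is conceptually a JSON Schema validation. A sketch using the third-party `jsonschema` package (vendor it for offline use):

```python
import json
from jsonschema import validate  # third-party; vendor for offline runs

with open("schemas/submission.schema.json", encoding="utf-8") as fh:
    schema = json.load(fh)
with open("submission.json", encoding="utf-8") as fh:
    submission = json.load(fh)

validate(instance=submission, schema=schema)  # raises ValidationError on failure
print("submission.json conforms to the schema")
```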
5) **Score locally**
```bash
python tools/scorer/rb_score.py --truth benchmark/truth/<aggregate>.json --submission submission.json --format json
```
6) **Compare (optional)**
```bash
python tools/scorer/rb_compare.py --truth benchmark/truth/<aggregate>.json \
--submissions submission.json baselines/*/submission.json \
--output leaderboard.json --text
```
## Determinism checklist
- Set `SOURCE_DATE_EPOCH` for all builds.
- Disable telemetry/version checks in your analyzer.
- Avoid nondeterministic ordering (sort file and sink lists).
- No network access; use vendored toolchains only.
- Use fixed seeds for any sampling (see the helper sketch below).
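A small helper tying several checklist items together (a sketch; the epoch value is illustrative):

```python
import json
import os
import random

# Pin the build timestamp for anything that embeds dates.
os.environ.setdefault("SOURCE_DATE_EPOCH", "315532800")  # 1980-01-01T00:00:00Z

# Fixed seed for any sampling steps.
random.seed(0)

def dump_sorted(obj, path):
    """Serialize with sorted keys and a trailing newline for byte-stable files."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(obj, fh, indent=2, sort_keys=True)
        fh.write("\n")
```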
## Packaging
- Submit a zip/tar (see the packaging sketch below) with:
- `submission.json`
- Tool version & configuration (README)
- Optional logs and runtime metrics
- For production submissions, sign `submission.json` with DSSE and record the envelope under `signatures` in the manifest (see `benchmark/manifest.sample.json`).
- Do **not** include binaries that require network access or licenses we cannot redistribute.
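To make the archive itself byte-reproducible, normalize member metadata. A sketch using Python's `tarfile` (the file list is illustrative):

```python
import tarfile

FILES = ["submission.json", "README.md"]  # plus optional logs/metrics

def normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Strip nondeterministic metadata from each archive member."""
    info.mtime = 315532800  # match SOURCE_DATE_EPOCH
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    return info

# Plain tar: gzip embeds its own timestamp in the header, so compress
# separately with `gzip -n` if a compressed archive is required.
with tarfile.open("submission.tar", "w") as tar:
    for path in sorted(FILES):  # stable member order
        tar.add(path, filter=normalize)
```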
## Provenance & Manifest
- Reference kit manifest: `benchmark/manifest.sample.json` (schema: `benchmark/schemas/benchmark-manifest.schema.json`).
- Validate your bundle offline:
```bash
python tools/verify_manifest.py benchmark/manifest.sample.json --root bench/reachability-benchmark
```
- Determinism templates: `benchmark/templates/determinism/*.env` can be sourced by build scripts per language.
## Support
- Open issues in the public repository (once live), or provide a reproducible script that runs fully offline.