Reachability Benchmark · Submission Guide

This guide explains how to produce a compliant submission for the Stella Ops reachability benchmark. Every step can be run fully offline.

Prerequisites

  • Python 3.11+
  • Your analyzer toolchain (no network calls during analysis)
  • The schemas in schemas/ and the ground-truth files in benchmark/truth/

Steps

  1. Build cases deterministically

    python tools/build/build_all.py --cases cases
    
    • Sets SOURCE_DATE_EPOCH.
    • Skips Java by default if JDK is unavailable (pass --skip-lang as needed).
  2. Run your analyzer

    • For each case, produce sink predictions in memory-safe JSON.
    • Do not reach out to the internet, package registries, or remote APIs.
  3. Emit submission.json

    • Must conform to schemas/submission.schema.json (version: 1.0.0).
    • Sort cases and sinks alphabetically to ensure determinism.
    • Include optional runtime stats under run (time_s, peak_mb) if available.
  4. Validate

    python tools/validate.py --submission submission.json --schema schemas/submission.schema.json
    
  5. Score locally

    tools/scorer/rb_score.py --truth benchmark/truth/<aggregate>.json --submission submission.json --format json
    
  6. Compare (optional)

    tools/scorer/rb_compare.py --truth benchmark/truth/<aggregate>.json \
      --submissions submission.json baselines/*/submission.json \
      --output leaderboard.json --text
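The sorting rule in step 3 can be sketched in Python as follows. This is only an illustration: apart from `run`, `time_s`, and `peak_mb`, every field name here (`case_id`, `sink_id`, `prediction`, `tool`) is hypothetical, and the placement of `version` is an assumption, so check schemas/submission.schema.json for the real layout before relying on it.

```python
import json


def build_submission(predictions, tool_name, tool_version, run_stats=None):
    """Return a submission dict with cases and sinks sorted alphabetically.

    `predictions` maps a case id to a {sink id: prediction} mapping.
    Field names other than `run`, `time_s`, and `peak_mb` are hypothetical.
    """
    cases = []
    for case_id in sorted(predictions):
        sinks = [
            {"sink_id": sink_id, "prediction": predictions[case_id][sink_id]}
            for sink_id in sorted(predictions[case_id])
        ]
        cases.append({"case_id": case_id, "sinks": sinks})

    submission = {
        "version": "1.0.0",  # assumed location of the schema version
        "tool": {"name": tool_name, "version": tool_version},
        "cases": cases,
    }
    if run_stats is not None:
        # Optional runtime stats, included only when they were measured.
        submission["run"] = {
            "time_s": run_stats["time_s"],
            "peak_mb": run_stats["peak_mb"],
        }
    return submission


def dump_submission(submission):
    """Serialize with sorted keys so repeated runs give identical bytes."""
    return json.dumps(submission, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


if __name__ == "__main__":
    example = {"case-b": {"sink-2": "reachable"}, "case-a": {"sink-1": "unreachable"}}
    with open("submission.json", "w", encoding="utf-8", newline="\n") as fh:
        fh.write(dump_submission(build_submission(example, "my-analyzer", "0.1.0")))
```

Sorting at construction time and serializing with `sort_keys=True` means the output does not depend on the order in which the analyzer happened to visit cases or sinks.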
    

Determinism checklist

  • Set SOURCE_DATE_EPOCH for all builds.
  • Disable telemetry/version checks in your analyzer.
  • Avoid nondeterministic ordering (sort file and sink lists).
  • No network access; use vendored toolchains only.
  • Use fixed seeds for any sampling.
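One way to check the points above is to run the same step several times and compare output hashes. The sketch below is a minimal illustration, not part of the benchmark tooling: the helper names are hypothetical, and `PYTHONHASHSEED` is an extra variable that only matters if your analyzer is itself written in Python.

```python
import hashlib
import os
import random


def deterministic_env(epoch=0, base=None):
    """Return a copy of the environment with reproducibility variables pinned."""
    env = dict(os.environ if base is None else base)
    env["SOURCE_DATE_EPOCH"] = str(epoch)
    env["PYTHONHASHSEED"] = "0"  # assumption: only relevant for Python-based analyzers
    return env


def is_reproducible(produce, runs=2):
    """Call `produce()` `runs` times and report whether all outputs hash the same.

    `produce` is any zero-argument callable returning the output as bytes,
    for example a wrapper that runs your analyzer and reads its result file.
    """
    digests = {hashlib.sha256(produce()).hexdigest() for _ in range(runs)}
    return len(digests) == 1


def sample_sinks(sinks, k, seed=1234):
    """Pick `k` sinks with a fixed seed, independent of the input order."""
    rng = random.Random(seed)  # local generator, does not touch global state
    return sorted(rng.sample(sorted(sinks), k))
```

Pass the result of `deterministic_env()` as the `env` argument of `subprocess.run` when launching builds or the analyzer, and wrap the full pipeline in `is_reproducible` before packaging.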

Packaging

  • Submit a zip/tar with:
    • submission.json
    • Tool version & configuration (README)
    • Optional logs and runtime metrics
  • Do not include binaries that require network access or that carry licenses we cannot redistribute.
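If you want the archive itself to be reproducible, a sketch along the following lines may help. It is an assumption-laden example rather than a required procedure: the function name `pack_submission` and the flat archive layout are hypothetical, and it reuses `SOURCE_DATE_EPOCH` as the timestamp for every entry.

```python
import gzip
import os
import tarfile


def pack_submission(paths, archive_path, epoch=None):
    """Write a .tar.gz whose bytes depend only on file names and contents."""
    if epoch is None:
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))

    def normalize(info):
        # Strip machine-specific metadata from each archive entry.
        info.mtime = epoch
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        info.mode = 0o644
        return info

    with open(archive_path, "wb") as raw:
        # An empty filename and a fixed mtime keep the gzip header stable.
        with gzip.GzipFile(filename="", mode="wb", fileobj=raw, mtime=epoch) as gz:
            with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
                # Sorted order so the caller's argument order does not matter.
                for path in sorted(paths, key=os.path.basename):
                    tar.add(path, arcname=os.path.basename(path), filter=normalize)


if __name__ == "__main__":
    pack_submission(["submission.json", "README"], "submission.tar.gz")
```

Packing the same files twice should then yield byte-identical archives, which makes it easy to confirm that a resubmission has not changed.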

Support

  • Open an issue in the public repo (once it is live), or provide a reproducible script that runs fully offline.