Stella Ops baseline

Deterministic baseline runner that emits a benchmark submission using the published ground-truth labels and the expected Stella Ops reachability signal shape.

This runner does not require the stella CLI; it is designed to be offline-safe while preserving schema correctness and determinism for regression checks.

Usage

# One case
baselines/stella/run_case.sh cases/js/unsafe-eval /tmp/stella-out

# All cases under a root
baselines/stella/run_all.sh cases /tmp/stella-all

Outputs:

  • Per-case: <out>/submission.json, where <out> is the output directory passed as the second argument
  • All cases: <out>/submission.json (merged, deterministic ordering)
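
The merged output can be pictured with a minimal sketch. This is not the actual run_all.sh logic: the function name merge_submissions and the "cases" and "case_id" keys are hypothetical placeholders, since the real submission schema is not shown here. It only illustrates how sorting by a stable key makes the merged file independent of the order in which cases were processed.

```python
def merge_submissions(per_case):
    """Merge per-case submission dicts into one, ordered by case id.

    Assumption: each submission holds a list under "cases" and each
    entry has a "case_id". Both names are illustrative only.
    """
    merged = []
    for submission in per_case:
        merged.extend(submission.get("cases", []))
    # Sort by id so the result does not depend on input order.
    merged.sort(key=lambda case: case["case_id"])
    return {"cases": merged}


# Two per-case submissions; "c/overflow" is a made-up case id.
a = {"cases": [{"case_id": "js/unsafe-eval"}]}
b = {"cases": [{"case_id": "c/overflow"}]}

merged_ab = merge_submissions([a, b])
merged_ba = merge_submissions([b, a])
```

Both calls yield the same structure, which is the property a regression check would rely on.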

Determinism posture

  • Pure local file reads (case.yaml + truth), no network or external binaries.
  • Stable ordering of cases and sinks.
  • Timestamps are not emitted; all numeric values are fixed.
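
One way to check the points above is to serialize a submission canonically and compare digests across runs. The sketch below is an assumption about how that could be done, not the runner's own code: canonical_bytes and normalize are hypothetical helpers, and the "sinks" and "sink_id" keys are placeholder field names.

```python
import hashlib
import json


def canonical_bytes(submission):
    """Serialize with sorted keys and fixed separators so equal data gives equal bytes."""
    text = json.dumps(
        submission, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return (text + "\n").encode("utf-8")


def normalize(case):
    """Return a copy of the case with sinks in a stable order (by hypothetical sink_id)."""
    return {**case, "sinks": sorted(case["sinks"], key=lambda s: s["sink_id"])}


# Same data, different key order and sink order. No timestamp field is present.
first = normalize(
    {"case_id": "js/unsafe-eval", "sinks": [{"sink_id": "b"}, {"sink_id": "a"}]}
)
second = normalize(
    {"sinks": [{"sink_id": "a"}, {"sink_id": "b"}], "case_id": "js/unsafe-eval"}
)

digest_first = hashlib.sha256(canonical_bytes(first)).hexdigest()
digest_second = hashlib.sha256(canonical_bytes(second)).hexdigest()
```

If the two digests match, the output bytes are identical, so a plain file comparison is enough for regression checks.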

Requirements

  • Python 3.11+.