- Implemented `run-scanner-ci.sh` to build and run tests for the Scanner solution with a warmed NuGet cache.
- Created `excititor-vex-traces.json` dashboard for monitoring Excititor VEX observations.
- Added Docker Compose configuration for the OTLP span sink in `docker-compose.spansink.yml`.
- Configured OpenTelemetry collector in `otel-spansink.yaml` to receive and process traces.
- Developed `run-spansink.sh` script to run the OTLP span sink for Excititor traces.
- Introduced `FileSystemRiskBundleObjectStore` for storing risk bundle artifacts in the filesystem.
- Built `RiskBundleBuilder` for creating risk bundles with associated metadata and providers.
- Established `RiskBundleJob` to execute the risk bundle creation and storage process.
- Defined models for risk bundle inputs, entries, and manifests in `RiskBundleModels.cs`.
- Implemented signing functionality for risk bundle manifests with `HmacRiskBundleManifestSigner`.
- Created unit tests for `RiskBundleBuilder`, `RiskBundleJob`, and signing functionality to ensure correctness.
- Added filesystem artifact reader tests to validate manifest parsing and artifact listing.
- Included test manifests for egress scenarios in the task runner tests.
- Developed timeline query service tests to verify tenant and event ID handling.
StellaOps Reachability Benchmark (Public)
Deterministic, reproducible benchmark for reachability analysis tools.
Goals
- Provide open cases with ground truth for reachable/unreachable sinks.
- Enforce determinism (hash-stable builds, fixed seeds, pinned deps).
- Enable fair scoring via the `rb-score` CLI and published schemas.
Layout
- `cases/<lang>/<project>/`: benchmark cases with deterministic Dockerfiles, pinned deps, oracle tests.
- `schemas/`: JSON/YAML schemas for cases, entrypoints, truth, submissions.
- `benchmark/truth/`: ground-truth labels (hidden/internal split optional).
- `benchmark/submissions/`: sample submissions and format reference.
- `tools/scorer/`: `rb-score` CLI and tests.
- `tools/build/`: `build_all.py` (run all cases) and `validate_builds.py` (run twice and compare hashes).
- `baselines/`: reference runners (Semgrep, CodeQL, Stella) with normalized outputs.
- `ci/`: deterministic CI workflows and scripts.
- `website/`: static site (leaderboard/docs/downloads).
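The "run twice and compare hashes" check attributed to `validate_builds.py` can be sketched as follows. This is a minimal illustration, not the actual script: the function names `tree_digest` and `builds_match`, and the choice of SHA-256 over sorted relative paths, are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return a SHA-256 digest over the relative paths and contents under root.

    Hypothetical helper: the real validate_builds.py may hash differently.
    """
    digest = hashlib.sha256()
    # Sort by POSIX-style relative path so filesystem traversal order
    # never influences the result.
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so adjacent names and contents cannot
        # be confused with one another.
        digest.update(len(rel).to_bytes(8, "big"))
        digest.update(rel)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True when two build output directories are byte-for-byte identical."""
    return tree_digest(first) == tree_digest(second)
```

Calling `builds_match(Path("out/run1"), Path("out/run2"))` on the outputs of two consecutive builds (the directory names are placeholders) would flag any nondeterminism that changes file names or contents.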
Sample cases added (JS track):
- `cases/js/unsafe-eval` (reachable sink) → `benchmark/truth/js-unsafe-eval.json`.
- `cases/js/guarded-eval` (unreachable by default) → `benchmark/truth/js-guarded-eval.json`.
- `cases/js/express-eval` (admin eval reachable) → `benchmark/truth/js-express-eval.json`.
- `cases/js/express-guarded` (admin eval gated by env) → `benchmark/truth/js-express-guarded.json`.
- `cases/js/fastify-template` (template rendering reachable) → `benchmark/truth/js-fastify-template.json`.
Sample cases added (Python track):
- `cases/py/unsafe-exec` (reachable eval) → `benchmark/truth/py-unsafe-exec.json`.
- `cases/py/guarded-exec` (unreachable when `FEATURE_ENABLE != 1`) → `benchmark/truth/py-guarded-exec.json`.
- `cases/py/flask-template` (template rendering reachable) → `benchmark/truth/py-flask-template.json`.
- `cases/py/fastapi-guarded` (unreachable unless `ALLOW_EXEC=true`) → `benchmark/truth/py-fastapi-guarded.json`.
- `cases/py/django-ssti` (template rendering reachable, autoescape off) → `benchmark/truth/py-django-ssti.json`.
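To make "unreachable when `FEATURE_ENABLE != 1`" concrete, here is a hypothetical sketch of an environment-guarded sink. It is not the source of `cases/py/guarded-exec`; the function name `run_expression`, the `"disabled"` return value, and the use of `eval` as the sink are assumptions for illustration only.

```python
import os


def run_expression(expr: str, env=None) -> str:
    """Evaluate expr only when the feature flag is switched on.

    Hypothetical example of a guarded sink, not the benchmark case itself.
    """
    env = os.environ if env is None else env
    # Guard: with the flag unset or set to anything other than "1",
    # control flow never reaches the sink below.
    if env.get("FEATURE_ENABLE") != "1":
        return "disabled"
    # Sink: evaluation of caller-supplied input. Builtins are stripped
    # here only to keep the illustration harmless.
    return str(eval(expr, {"__builtins__": {}}, {}))
```

A tool that ignores the guard would report the `eval` call as reachable in every configuration, while the ground truth for a case of this shape marks it unreachable under the default environment.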
Sample cases added (Java track):
- `cases/java/spring-deserialize` (reachable Java deserialization) → `benchmark/truth/java-spring-deserialize.json`.
- `cases/java/spring-guarded` (deserialization unreachable unless `ALLOW_DESER=true`) → `benchmark/truth/java-spring-guarded.json`.
Determinism & Offline Rules
- No network during build/test; pin images/deps; set `SOURCE_DATE_EPOCH`.
- Sort file lists; stable JSON/YAML emitters; fixed RNG seeds.
- All scripts must succeed on a clean machine with cached toolchain tarballs only.
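The rules above might translate into helper code along these lines. This is a sketch under stated assumptions: the helper names (`build_epoch`, `stable_json`, `make_manifest`), the manifest fields, and the seed value are all hypothetical and do not come from the repository.

```python
import json
import os
import random


def build_epoch(default: int = 0) -> int:
    """Read the build timestamp from SOURCE_DATE_EPOCH instead of the clock."""
    return int(os.environ.get("SOURCE_DATE_EPOCH", default))


def stable_json(obj) -> str:
    """Serialize with sorted keys and fixed separators for byte-stable output."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True) + "\n"


def make_manifest(files, seed: int = 1234) -> dict:
    """Build a manifest that does not depend on input order or global RNG state.

    The field names are illustrative, not a published schema.
    """
    rng = random.Random(seed)  # local, explicitly seeded generator
    return {
        "files": sorted(files),  # sorted file list
        "generated_at": build_epoch(),
        "sample": rng.randint(0, 2**32 - 1),
    }
```

Using a local `random.Random(seed)` rather than the module-level functions keeps the result independent of any other code that touches the global generator.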
Licensing
- Apache-2.0 for all benchmark assets. Third-party snippets must be license-compatible and attributed.
Quick Start (once populated)
    # schema sanity checks (offline)
    python tools/validate.py all schemas/examples
    # score a submission (coming in task 513-008)
    cd tools/scorer
    ./rb-score --cases ../cases --truth ../benchmark/truth --submission ../benchmark/submissions/sample.json
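Since the scorer is not yet available, the following sketch shows one plausible way to compare a submission against ground truth. It is not the `rb-score` implementation: it assumes both inputs have already been reduced to sets of sink identifiers labelled reachable, and the function name `score` and the precision/recall/F1 metrics are assumptions rather than the published scoring rules.

```python
def score(truth_reachable: set, predicted_reachable: set) -> dict:
    """Compare predicted reachable sinks against ground truth.

    Hypothetical metric set; the real rb-score CLI may score differently.
    """
    tp = len(truth_reachable & predicted_reachable)  # correctly reported
    fp = len(predicted_reachable - truth_reachable)  # reported but unreachable
    fn = len(truth_reachable - predicted_reachable)  # reachable but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}
```

Under this scheme, a tool that flags a guarded sink as reachable when the truth file marks it unreachable incurs a false positive, which lowers precision without affecting recall.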
Contributing
See CONTRIBUTING.md. Open issues/PRs welcome; please provide hashes and logs for reproducibility.