up
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
Findings Ledger CI / build-test (push) Has been cancelled
Findings Ledger CI / migration-validation (push) Has been cancelled
Scanner Analyzers / Discover Analyzers (push) Has been cancelled
Signals Reachability Scoring & Events / reachability-smoke (push) Has been cancelled
AOC Guard CI / aoc-guard (push) Has been cancelled
Concelier Attestation Tests / attestation-tests (push) Has been cancelled
cryptopro-linux-csp / build-and-test (push) Has been cancelled
Scanner Analyzers / Validate Test Fixtures (push) Has been cancelled
Signals CI & Image / signals-ci (push) Has been cancelled
sm-remote-ci / build-and-test (push) Has been cancelled
Findings Ledger CI / generate-manifest (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Scanner Analyzers / Build Analyzers (push) Has been cancelled
Scanner Analyzers / Test Language Analyzers (push) Has been cancelled
Scanner Analyzers / Verify Deterministic Output (push) Has been cancelled
Signals Reachability Scoring & Events / sign-and-upload (push) Has been cancelled
This commit is contained in:
1
bench/reachability-benchmark/.gitignore
vendored
Normal file
@@ -0,0 +1 @@
|
||||
.jdk/
|
||||
@@ -20,6 +20,7 @@
|
||||
## Working Agreements
- Determinism: pin toolchains; set `SOURCE_DATE_EPOCH`; sort file lists; stable JSON/YAML ordering; fixed seeds for any sampling.
- Offline posture: no network at build/test time; vendored toolchains; registry pulls are forbidden—use cached/bundled images.
- Java builds: use vendored Temurin 21 via `tools/java/ensure_jdk.sh` when `JAVA_HOME`/`javac` are absent; keep `.jdk/` out of VCS and use `build_all.py --skip-lang` when a toolchain is missing.
- Licensing: all benchmark content Apache-2.0; include LICENSE in repo root; third-party cases must have compatible licenses and attributions.
- Evidence: each case must include oracle tests/coverage proving reachability label; store truth and submissions under `benchmark/truth/` and `benchmark/submissions/` with JSON Schema.
- Security: no secrets; scrub URLs/tokens; deterministic CI artifacts only.
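
The determinism rules above (honor `SOURCE_DATE_EPOCH`, sort file lists, emit stable JSON, fix seeds) can be illustrated with a few small helpers. This is a minimal sketch, not code from the repository: the function names `build_timestamp`, `sorted_files`, `stable_json`, and `seeded_sample` are hypothetical, and the default seed is an arbitrary choice.

```python
import json
import os
import random
from pathlib import Path


def build_timestamp(default: int = 0) -> int:
    """Return SOURCE_DATE_EPOCH as an int, or a fixed default when it is unset."""
    return int(os.environ.get("SOURCE_DATE_EPOCH", default))


def sorted_files(root: Path) -> list[str]:
    """List files under root as POSIX-style relative paths, in sorted order."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


def stable_json(obj) -> str:
    """Serialize with sorted keys and fixed separators so the bytes are reproducible."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True) + "\n"


def seeded_sample(items: list[str], k: int, seed: int = 1234) -> list[str]:
    """Sample k items with a private fixed-seed RNG; input order does not matter."""
    return random.Random(seed).sample(sorted(items), k)
```

Sorting before sampling and using a private `random.Random` instance keeps the result independent of both directory enumeration order and any global RNG state.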
|
||||
|
||||
@@ -8,38 +8,42 @@ Deterministic, reproducible benchmark for reachability analysis tools.
|
||||
- Enable fair scoring via the `rb-score` CLI and published schemas.
|
||||
|
||||
## Layout
|
||||
- `cases/<lang>/<project>/` — benchmark cases with deterministic Dockerfiles, pinned deps, oracle tests.
|
||||
- `schemas/` — JSON/YAML schemas for cases, entrypoints, truth, submissions.
|
||||
- `benchmark/truth/` — ground-truth labels (hidden/internal split optional).
|
||||
- `benchmark/submissions/` — sample submissions and format reference.
|
||||
- `tools/scorer/` — `rb-score` CLI and tests.
|
||||
- `tools/build/` — `build_all.py` (run all cases) and `validate_builds.py` (run twice and compare hashes).
|
||||
- `baselines/` — reference runners (Semgrep, CodeQL, Stella) with normalized outputs.
|
||||
- `ci/` — deterministic CI workflows and scripts.
|
||||
- `website/` — static site (leaderboard/docs/downloads).
|
||||
- `cases/<lang>/<project>/` — benchmark cases with deterministic Dockerfiles, pinned deps, oracle tests.
- `schemas/` — JSON/YAML schemas for cases, entrypoints, truth, submissions.
- `benchmark/truth/` — ground-truth labels (hidden/internal split optional).
- `benchmark/submissions/` — sample submissions and format reference.
- `tools/scorer/` — `rb-score` CLI and tests.
- `tools/build/` — `build_all.py` (run all cases) and `validate_builds.py` (run twice and compare hashes).
- `baselines/` — reference runners (Semgrep, CodeQL, Stella) with normalized outputs.
- `ci/` — deterministic CI workflows and scripts.
- `website/` — static site (leaderboard/docs/downloads).
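
The "run twice and compare hashes" check mentioned for `validate_builds.py` could look roughly like the following. This is only a sketch of the idea under stated assumptions: the real script is not shown in this diff, and `tree_digest` and `build_is_deterministic` are hypothetical names. The build step is modeled as a callable that writes its outputs into a given directory.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def tree_digest(root: Path) -> str:
    """SHA-256 over sorted relative paths and file contents under root."""
    h = hashlib.sha256()
    files = sorted((p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file())
    for rel, path in files:
        rel_bytes = rel.encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so different file splits cannot collide.
        h.update(len(rel_bytes).to_bytes(8, "big") + rel_bytes)
        h.update(len(data).to_bytes(8, "big") + data)
    return h.hexdigest()


def build_is_deterministic(build: Callable[[Path], None]) -> bool:
    """Run the build twice into fresh directories and compare tree digests."""
    digests = []
    for _ in range(2):
        with tempfile.TemporaryDirectory() as tmp:
            out = Path(tmp)
            build(out)
            digests.append(tree_digest(out))
    return digests[0] == digests[1]
```

Hashing relative paths rather than absolute ones keeps the digest independent of where the temporary output directory happens to live.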
|
||||
|
||||
Sample cases added (JS track):
|
||||
- `cases/js/unsafe-eval` (reachable sink) → `benchmark/truth/js-unsafe-eval.json`.
|
||||
- `cases/js/guarded-eval` (unreachable by default) → `benchmark/truth/js-guarded-eval.json`.
|
||||
- `cases/js/express-eval` (admin eval reachable) → `benchmark/truth/js-express-eval.json`.
|
||||
- `cases/js/express-guarded` (admin eval gated by env) → `benchmark/truth/js-express-guarded.json`.
|
||||
- `cases/js/fastify-template` (template rendering reachable) → `benchmark/truth/js-fastify-template.json`.
|
||||
- `cases/js/unsafe-eval` (reachable sink) → `benchmark/truth/js-unsafe-eval.json`.
- `cases/js/guarded-eval` (unreachable by default) → `benchmark/truth/js-guarded-eval.json`.
- `cases/js/express-eval` (admin eval reachable) → `benchmark/truth/js-express-eval.json`.
- `cases/js/express-guarded` (admin eval gated by env) → `benchmark/truth/js-express-guarded.json`.
- `cases/js/fastify-template` (template rendering reachable) → `benchmark/truth/js-fastify-template.json`.
|
||||
|
||||
Sample cases added (Python track):
|
||||
- `cases/py/unsafe-exec` (reachable eval) → `benchmark/truth/py-unsafe-exec.json`.
|
||||
- `cases/py/guarded-exec` (unreachable when FEATURE_ENABLE != 1) → `benchmark/truth/py-guarded-exec.json`.
|
||||
- `cases/py/flask-template` (template rendering reachable) → `benchmark/truth/py-flask-template.json`.
|
||||
- `cases/py/fastapi-guarded` (unreachable unless ALLOW_EXEC=true) → `benchmark/truth/py-fastapi-guarded.json`.
|
||||
- `cases/py/django-ssti` (template rendering reachable, autoescape off) → `benchmark/truth/py-django-ssti.json`.
|
||||
- `cases/py/unsafe-exec` (reachable eval) → `benchmark/truth/py-unsafe-exec.json`.
- `cases/py/guarded-exec` (unreachable when FEATURE_ENABLE != 1) → `benchmark/truth/py-guarded-exec.json`.
- `cases/py/flask-template` (template rendering reachable) → `benchmark/truth/py-flask-template.json`.
- `cases/py/fastapi-guarded` (unreachable unless ALLOW_EXEC=true) → `benchmark/truth/py-fastapi-guarded.json`.
- `cases/py/django-ssti` (template rendering reachable, autoescape off) → `benchmark/truth/py-django-ssti.json`.
|
||||
|
||||
Sample cases added (Java track):
|
||||
- `cases/java/spring-deserialize` (reachable Java deserialization) → `benchmark/truth/java-spring-deserialize.json`.
|
||||
- `cases/java/spring-guarded` (deserialization unreachable unless ALLOW_DESER=true) → `benchmark/truth/java-spring-guarded.json`.
|
||||
- `cases/java/spring-deserialize` (reachable Java deserialization) → `benchmark/truth/java-spring-deserialize.json`.
- `cases/java/spring-guarded` (deserialization unreachable unless ALLOW_DESER=true) → `benchmark/truth/java-spring-guarded.json`.
- `cases/java/micronaut-deserialize` (reachable Micronaut-style deserialization) → `benchmark/truth/java-micronaut-deserialize.json`.
- `cases/java/micronaut-guarded` (unreachable unless ALLOW_MN_DESER=true) → `benchmark/truth/java-micronaut-guarded.json`.
- `cases/java/spring-reflection` (reflection sink reachable via Class.forName) → `benchmark/truth/java-spring-reflection.json`.
|
||||
|
||||
## Determinism & Offline Rules
- No network during build/test; pin images/deps; set `SOURCE_DATE_EPOCH`.
- Sort file lists; stable JSON/YAML emitters; fixed RNG seeds.
- All scripts must succeed on a clean machine with cached toolchain tarballs only.
- Java builds auto-use vendored Temurin 21 via `tools/java/ensure_jdk.sh` when `JAVA_HOME`/`javac` are absent.

## Licensing
- Apache-2.0 for all benchmark assets. Third-party snippets must be license-compatible and attributed.
|
||||
@@ -50,8 +54,10 @@ Sample cases added (Java track):
|
||||
python tools/validate.py all schemas/examples
|
||||
|
||||
# score a submission (coming in task 513-008)
|
||||
cd tools/scorer
|
||||
./rb-score --cases ../cases --truth ../benchmark/truth --submission ../benchmark/submissions/sample.json
|
||||
./tools/scorer/rb-score --cases cases --truth benchmark/truth --submission benchmark/submissions/sample.json
|
||||
|
||||
# deterministic case builds (skip a language when a toolchain is unavailable)
|
||||
python tools/build/build_all.py --cases cases --skip-lang js
|
||||
```
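
For readers wondering what skipping a language amounts to, here is a minimal sketch of case discovery over the `cases/<lang>/<project>/` layout with a skip set. It is an illustration only and not the actual `build_all.py` logic, which this diff does not show; `discover_cases` and its parameters are hypothetical names.

```python
from pathlib import Path


def discover_cases(cases_root: Path, skip_langs: frozenset[str] = frozenset()) -> list[Path]:
    """Return cases/<lang>/<project> directories in sorted order, minus skipped languages."""
    found: list[Path] = []
    for lang_dir in sorted(p for p in cases_root.iterdir() if p.is_dir()):
        if lang_dir.name in skip_langs:
            continue  # toolchain unavailable: leave this language's cases out
        found.extend(sorted(p for p in lang_dir.iterdir() if p.is_dir()))
    return found
```

Sorting at both levels keeps the build order stable across filesystems, which matters when build logs or manifests are compared between runs.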
|
||||
|
||||
## Contributing
|
||||
|
||||
@@ -1,11 +1,16 @@
|
||||
# Reachability Benchmark Changelog
|
||||
|
||||
## 1.0.1 · 2025-12-03
|
||||
## 1.0.2 · 2025-12-05
|
||||
- Unblocked Java track with vendored Temurin 21 (`tools/java/ensure_jdk.sh`) and deterministic build artifacts (coverage + traces).
|
||||
- Added three more Java cases (`micronaut-deserialize`, `micronaut-guarded`, `spring-reflection`) to reach 5/5 required cases.
|
||||
- `tools/build/build_all.py` now supports `--skip-lang` and runs under WSL-aware bash; CI builds Java cases by default.
|
||||
|
||||
## 1.0.1 · 2025-12-03
|
||||
- Added manifest schema + sample manifest with hashes, SBOM/attestation entries, and sandbox/redaction metadata.
|
||||
- Added coverage/trace schemas and extended validator to cover them.
|
||||
- Introduced `tools/verify_manifest.py` and deterministic offline kit packaging script.
|
||||
- Added per-language determinism env templates and dataset safety checklist.
|
||||
- Populated SBOM + attestation outputs for JS/PY/C tracks; Java remains blocked on JDK availability.
|
||||
- Populated SBOM + attestation outputs for JS/PY/C tracks.
|
||||
|
||||
## 1.0.0 · 2025-12-01
|
||||
## 1.0.0 · 2025-12-01
|
||||
- Initial public dataset, scorer, baselines, and website.
|
||||
|
||||
@@ -8,7 +8,7 @@ Version: 1.0.1 · Date: 2025-12-03
|
||||
- [x] Published schemas/validators: truth/submission/coverage/trace + manifest schemas; validated via `tools/validate.py` and `tools/verify_manifest.py`.
|
||||
- [x] Evidence bundles: coverage + traces + attestation + sbom recorded per case (sample manifest).
|
||||
- [x] Binary case recipe: `cases/**/build/build.sh` pinned `SOURCE_DATE_EPOCH` and env templates under `benchmark/templates/determinism/`.
|
||||
- [x] Determinism CI: `ci/run-ci.sh` + `tools/verify_manifest.py` run twice to compare hashes; Java track still blocked on JDK availability.
|
||||
- [x] Determinism CI: `ci/run-ci.sh` + `tools/verify_manifest.py` run twice to compare hashes; Java track uses vendored Temurin 21 via `tools/java/ensure_jdk.sh`.
|
||||
- [x] Signed baselines: baseline submissions may include DSSE path in manifest (not required for sample kit); rulepack hashes recorded separately.
|
||||
- [x] Submission policy: CLA/DSSE optional in sample; production kits require DSSE envelope recorded in `signatures`.
|
||||
- [x] Semantic versioning & changelog: see `benchmark/CHANGELOG.md`; manifest `version` mirrors dataset release.
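
The hash checks in this list can be pictured as a walk over the manifest's `cases[].hashes.<name>.{path, sha256}` entries. The sketch below assumes that layout and is not the real `tools/verify_manifest.py`; `sha256_file` and `verify_manifest` are hypothetical helpers. Directory entries such as `source` are skipped because the rule used to hash a directory is not given here.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(manifest_path: Path, root: Path) -> list[str]:
    """Return a list of problems; an empty list means every file entry matched."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems: list[str] = []
    for case in manifest.get("cases", []):
        for name, entry in sorted(case.get("hashes", {}).items()):
            target = root / entry["path"]
            if target.is_dir():
                continue  # assumption: directory digests need a tree-hash rule not shown here
            if not target.is_file():
                problems.append(f"{case['id']}:{name}: missing {entry['path']}")
            elif sha256_file(target) != entry["sha256"]:
                problems.append(f"{case['id']}:{name}: sha256 mismatch")
    return problems
```

Running such a check after each of two independent builds and requiring an empty problem list both times is one way to realize the run-twice comparison.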
|
||||
|
||||
@@ -1,92 +1,203 @@
|
||||
{
|
||||
"schemaVersion": "1.0.0",
|
||||
"kitId": "reachability-benchmark:public-v1",
|
||||
"version": "1.0.1",
|
||||
"artifacts": {
|
||||
"baselineSubmissions": [],
|
||||
"scorer": {
|
||||
"path": "tools/scorer/rb_score.py",
|
||||
"sha256": "32d4f69f5d1d4b87902d6c4f020efde703487d526bf7d42b4438cb2499813f7f"
|
||||
},
|
||||
"submissionSchema": {
|
||||
"path": "schemas/submission.schema.json",
|
||||
"sha256": "de5bebb2dbcd085d7896f47a16b9d3837a65fb7f816dcf7e587967d5848c50a7"
|
||||
}
|
||||
},
|
||||
"cases": [
|
||||
{
|
||||
"hashes": {
|
||||
"attestation": {
|
||||
"path": "cases/js/unsafe-eval/outputs/attestation.json",
|
||||
"sha256": "be3b0971d805f68730a1c4c0f7a4c3c40dfc7a73099a5524c68759fcc1729d7c"
|
||||
},
|
||||
"binary": {
|
||||
"path": "cases/js/unsafe-eval/outputs/binary.tar.gz",
|
||||
"sha256": "72da19f28c2c36b6666afcc304514b387de20a5de881d5341067481e8418e23e"
|
||||
},
|
||||
"case": {
|
||||
"path": "cases/js/unsafe-eval/case.yaml",
|
||||
"sha256": "a858ff509fda65d69df476e870d9646c6a84744010c812f3d23a88576f20cb6b"
|
||||
},
|
||||
"coverage": {
|
||||
"path": "cases/js/unsafe-eval/outputs/coverage.json",
|
||||
"sha256": "c2cf5af508d33f6ecdc7c0f10200a02a4c0ddeb8e1fc08b55d9bd4a2d6cb926b"
|
||||
},
|
||||
"entrypoints": {
|
||||
"path": "cases/js/unsafe-eval/entrypoints.yaml",
|
||||
"sha256": "77829e728d34c9dc5f56c04784c97f619830ad43bd8410acb3d7134f372a49b3"
|
||||
},
|
||||
"sbom": {
|
||||
"path": "cases/js/unsafe-eval/outputs/sbom.cdx.json",
|
||||
"sha256": "c00ee1e12b1b6a6237e42174b2fe1393bcf575f6605205a2b84366e867b36d5f"
|
||||
},
|
||||
"source": {
|
||||
"path": "cases/js/unsafe-eval",
|
||||
"sha256": "69b0d1cbae1e2c9ddc0f4dba8c6db507e1d3a1c5ea0a0a545c6f3e785529c91c"
|
||||
},
|
||||
"traces": {
|
||||
"path": "cases/js/unsafe-eval/outputs/traces/traces.json",
|
||||
"sha256": "6e63c78e091cc9d06acdc5966dd9e54593ca6b0b97f502928de278b3f80adbd8"
|
||||
},
|
||||
"truth": {
|
||||
"path": "benchmark/truth/js-unsafe-eval.json",
|
||||
"sha256": "ab42f28ed229eb657ffcb36c3a99287436e1822a4c7d395a94de784457a08f62"
|
||||
}
|
||||
},
|
||||
"id": "js-unsafe-eval:001",
|
||||
"language": "js",
|
||||
"redaction": {
|
||||
"pii": false,
|
||||
"policy": "benchmark-default/v1"
|
||||
},
|
||||
"sandbox": {
|
||||
"network": "loopback",
|
||||
"privileges": "rootless"
|
||||
},
|
||||
"size": "small",
|
||||
"truth": {
|
||||
"confidence": "high",
|
||||
"label": "reachable",
|
||||
"rationale": "Unit test hits eval sink via POST /api/exec"
|
||||
}
|
||||
},
|
||||
{
|
||||
"hashes": {
|
||||
"attestation": {
|
||||
"path": "cases/py/fastapi-guarded/outputs/attestation.json",
|
||||
"sha256": "257aa5408a5c6ffe0e193a75a2a54597f8c6f61babfe8aaf26bd47340c3086c3"
|
||||
},
|
||||
"binary": {
|
||||
"path": "cases/py/fastapi-guarded/outputs/binary.tar.gz",
|
||||
"sha256": "ca964fef352dc535b63d35b8f8846cc051e10e54cfd8aceef7566f3c94178b76"
|
||||
},
|
||||
"case": {
|
||||
"path": "cases/py/fastapi-guarded/case.yaml",
|
||||
"sha256": "0add8a5f487ebd21ee20ab88b7c6436fe8471f0a54ab8da0e08c8416aa181346"
|
||||
},
|
||||
"coverage": {
|
||||
"path": "cases/py/fastapi-guarded/outputs/coverage.json",
|
||||
"sha256": "07b1f6dccaa02bd4e1c3e2771064fa3c6e06d02843a724151721ea694762c750"
|
||||
},
|
||||
"entrypoints": {
|
||||
"path": "cases/py/fastapi-guarded/entrypoints.yaml",
|
||||
"sha256": "47c9dd15bf7c5bb8641893a92791d3f7675ed6adba17b251f609335400d29d41"
|
||||
},
|
||||
"sbom": {
|
||||
"path": "cases/py/fastapi-guarded/outputs/sbom.cdx.json",
|
||||
"sha256": "13999d8f3d4c9bdb70ea54ad1de613be3f893d79bdd1a53f7c9401e6add88cf0"
|
||||
},
|
||||
"source": {
|
||||
"path": "cases/py/fastapi-guarded",
|
||||
"sha256": "0869cab10767ac7e7b33c9bbd634f811d98ce5cdeb244769f1a81949438460fb"
|
||||
},
|
||||
"traces": {
|
||||
"path": "cases/py/fastapi-guarded/outputs/traces/traces.json",
|
||||
"sha256": "4633748b8b428b45e3702f2f8f5b3f4270728078e26bce1e08900ed1d5bb3046"
|
||||
},
|
||||
"truth": {
|
||||
"path": "benchmark/truth/py-fastapi-guarded.json",
|
||||
"sha256": "f8c62abeb00006621feeb010d0e47d248918dffd6d6e20e0f47d74e1b3642760"
|
||||
}
|
||||
},
|
||||
"id": "py-fastapi-guarded:104",
|
||||
"language": "py",
|
||||
"redaction": {
|
||||
"pii": false,
|
||||
"policy": "benchmark-default/v1"
|
||||
},
|
||||
"sandbox": {
|
||||
"network": "loopback",
|
||||
"privileges": "rootless"
|
||||
},
|
||||
"size": "small",
|
||||
"truth": {
|
||||
"confidence": "high",
|
||||
"label": "unreachable",
|
||||
"rationale": "Feature flag ALLOW_EXEC must be true before sink executes"
|
||||
}
|
||||
},
|
||||
{
|
||||
"hashes": {
|
||||
"attestation": {
|
||||
"path": "cases/c/unsafe-system/outputs/attestation.json",
|
||||
"sha256": "c3755088182359a45492170fa8a57d826b605176333d109f4f113bc7ccf85f97"
|
||||
},
|
||||
"binary": {
|
||||
"path": "cases/c/unsafe-system/outputs/binary.tar.gz",
|
||||
"sha256": "62200167bd660bad6d131b21f941acdfebe00e949e353a53c97b6691ac8f0e49"
|
||||
},
|
||||
"case": {
|
||||
"path": "cases/c/unsafe-system/case.yaml",
|
||||
"sha256": "7799a3a629c22ad47197309f44e32aabbc4e6711ef78d606ba57a7a4974787ce"
|
||||
},
|
||||
"coverage": {
|
||||
"path": "cases/c/unsafe-system/outputs/coverage.json",
|
||||
"sha256": "03ba8cf09e7e0ed82e9fa8abb48f92355e894fd56e0c0160a504193a6f6ec48a"
|
||||
},
|
||||
"entrypoints": {
|
||||
"path": "cases/c/unsafe-system/entrypoints.yaml",
|
||||
"sha256": "06afee8350460c9d15b26ea9d4ea293e8eb3f4b86b3179e19401fa99947e4490"
|
||||
},
|
||||
"sbom": {
|
||||
"path": "cases/c/unsafe-system/outputs/sbom.cdx.json",
|
||||
"sha256": "4c72a213fc4c646f44b4d0be3c23711b120b2a386374ebaa4897e5058980e0f5"
|
||||
},
|
||||
"source": {
|
||||
"path": "cases/c/unsafe-system",
|
||||
"sha256": "bc39ab3a3e5cb3944a205912ecad8c1ac4b7d15c64b453c9d34a9a5df7fbbbf4"
|
||||
},
|
||||
"traces": {
|
||||
"path": "cases/c/unsafe-system/outputs/traces/traces.json",
|
||||
"sha256": "f6469e46a57b8a6e8e17c9b8e78168edd6657ea8a5e1e96fe6ab4a0fc88a734e"
|
||||
},
|
||||
"truth": {
|
||||
"path": "benchmark/truth/c-unsafe-system.json",
|
||||
"sha256": "9a8200c2cf549b3ac8b19b170e9d34df063351879f19f401d8492e280ad08c13"
|
||||
}
|
||||
},
|
||||
"id": "c-unsafe-system:001",
|
||||
"language": "c",
|
||||
"redaction": {
|
||||
"pii": false,
|
||||
"policy": "benchmark-default/v1"
|
||||
},
|
||||
"sandbox": {
|
||||
"network": "loopback",
|
||||
"privileges": "rootless"
|
||||
},
|
||||
"size": "small",
|
||||
"truth": {
|
||||
"confidence": "high",
|
||||
"label": "reachable",
|
||||
"rationale": "Command injection sink reachable via argv -> system()"
|
||||
}
|
||||
}
|
||||
],
|
||||
"createdAt": "2025-12-03T00:00:00Z",
|
||||
"sourceDateEpoch": 1730000000,
|
||||
"kitId": "reachability-benchmark:public-v1",
|
||||
"resourceLimits": {
|
||||
"cpu": "4",
|
||||
"memory": "8Gi"
|
||||
},
|
||||
"cases": [
|
||||
{
|
||||
"id": "js-unsafe-eval:001",
|
||||
"language": "js",
|
||||
"size": "small",
|
||||
"hashes": {
|
||||
"source": { "path": "cases/js/unsafe-eval", "sha256": "69b0d1cbae1e2c9ddc0f4dba8c6db507e1d3a1c5ea0a0a545c6f3e785529c91c" },
|
||||
"case": { "path": "cases/js/unsafe-eval/case.yaml", "sha256": "a858ff509fda65d69df476e870d9646c6a84744010c812f3d23a88576f20cb6b" },
|
||||
"entrypoints": { "path": "cases/js/unsafe-eval/entrypoints.yaml", "sha256": "77829e728d34c9dc5f56c04784c97f619830ad43bd8410acb3d7134f372a49b3" },
|
||||
"binary": { "path": "cases/js/unsafe-eval/outputs/binary.tar.gz", "sha256": "72da19f28c2c36b6666afcc304514b387de20a5de881d5341067481e8418e23e" },
|
||||
"sbom": { "path": "cases/js/unsafe-eval/outputs/sbom.cdx.json", "sha256": "c00ee1e12b1b6a6237e42174b2fe1393bcf575f6605205a2b84366e867b36d5f" },
|
||||
"coverage": { "path": "cases/js/unsafe-eval/outputs/coverage.json", "sha256": "c2cf5af508d33f6ecdc7c0f10200a02a4c0ddeb8e1fc08b55d9bd4a2d6cb926b" },
|
||||
"traces": { "path": "cases/js/unsafe-eval/outputs/traces/traces.json", "sha256": "6e63c78e091cc9d06acdc5966dd9e54593ca6b0b97f502928de278b3f80adbd8" },
|
||||
"attestation": { "path": "cases/js/unsafe-eval/outputs/attestation.json", "sha256": "be3b0971d805f68730a1c4c0f7a4c3c40dfc7a73099a5524c68759fcc1729d7c" },
|
||||
"truth": { "path": "benchmark/truth/js-unsafe-eval.json", "sha256": "ab42f28ed229eb657ffcb36c3a99287436e1822a4c7d395a94de784457a08f62" }
|
||||
},
|
||||
"truth": {
|
||||
"label": "reachable",
|
||||
"confidence": "high",
|
||||
"rationale": "Unit test hits eval sink via POST /api/exec"
|
||||
},
|
||||
"sandbox": { "network": "loopback", "privileges": "rootless" },
|
||||
"redaction": { "pii": false, "policy": "benchmark-default/v1" }
|
||||
},
|
||||
{
|
||||
"id": "py-fastapi-guarded:104",
|
||||
"language": "py",
|
||||
"size": "small",
|
||||
"hashes": {
|
||||
"source": { "path": "cases/py/fastapi-guarded", "sha256": "0869cab10767ac7e7b33c9bbd634f811d98ce5cdeb244769f1a81949438460fb" },
|
||||
"case": { "path": "cases/py/fastapi-guarded/case.yaml", "sha256": "0add8a5f487ebd21ee20ab88b7c6436fe8471f0a54ab8da0e08c8416aa181346" },
|
||||
"entrypoints": { "path": "cases/py/fastapi-guarded/entrypoints.yaml", "sha256": "47c9dd15bf7c5bb8641893a92791d3f7675ed6adba17b251f609335400d29d41" },
|
||||
"binary": { "path": "cases/py/fastapi-guarded/outputs/binary.tar.gz", "sha256": "ca964fef352dc535b63d35b8f8846cc051e10e54cfd8aceef7566f3c94178b76" },
|
||||
"sbom": { "path": "cases/py/fastapi-guarded/outputs/sbom.cdx.json", "sha256": "13999d8f3d4c9bdb70ea54ad1de613be3f893d79bdd1a53f7c9401e6add88cf0" },
|
||||
"coverage": { "path": "cases/py/fastapi-guarded/outputs/coverage.json", "sha256": "07b1f6dccaa02bd4e1c3e2771064fa3c6e06d02843a724151721ea694762c750" },
|
||||
"traces": { "path": "cases/py/fastapi-guarded/outputs/traces/traces.json", "sha256": "4633748b8b428b45e3702f2f8f5b3f4270728078e26bce1e08900ed1d5bb3046" },
|
||||
"attestation": { "path": "cases/py/fastapi-guarded/outputs/attestation.json", "sha256": "257aa5408a5c6ffe0e193a75a2a54597f8c6f61babfe8aaf26bd47340c3086c3" },
|
||||
"truth": { "path": "benchmark/truth/py-fastapi-guarded.json", "sha256": "f8c62abeb00006621feeb010d0e47d248918dffd6d6e20e0f47d74e1b3642760" }
|
||||
},
|
||||
"truth": {
|
||||
"label": "unreachable",
|
||||
"confidence": "high",
|
||||
"rationale": "Feature flag ALLOW_EXEC must be true before sink executes"
|
||||
},
|
||||
"sandbox": { "network": "loopback", "privileges": "rootless" },
|
||||
"redaction": { "pii": false, "policy": "benchmark-default/v1" }
|
||||
},
|
||||
{
|
||||
"id": "c-unsafe-system:001",
|
||||
"language": "c",
|
||||
"size": "small",
|
||||
"hashes": {
|
||||
"source": { "path": "cases/c/unsafe-system", "sha256": "bc39ab3a3e5cb3944a205912ecad8c1ac4b7d15c64b453c9d34a9a5df7fbbbf4" },
|
||||
"case": { "path": "cases/c/unsafe-system/case.yaml", "sha256": "7799a3a629c22ad47197309f44e32aabbc4e6711ef78d606ba57a7a4974787ce" },
|
||||
"entrypoints": { "path": "cases/c/unsafe-system/entrypoints.yaml", "sha256": "06afee8350460c9d15b26ea9d4ea293e8eb3f4b86b3179e19401fa99947e4490" },
|
||||
"binary": { "path": "cases/c/unsafe-system/outputs/binary.tar.gz", "sha256": "62200167bd660bad6d131b21f941acdfebe00e949e353a53c97b6691ac8f0e49" },
|
||||
"sbom": { "path": "cases/c/unsafe-system/outputs/sbom.cdx.json", "sha256": "4c72a213fc4c646f44b4d0be3c23711b120b2a386374ebaa4897e5058980e0f5" },
|
||||
"coverage": { "path": "cases/c/unsafe-system/outputs/coverage.json", "sha256": "03ba8cf09e7e0ed82e9fa8abb48f92355e894fd56e0c0160a504193a6f6ec48a" },
|
||||
"traces": { "path": "cases/c/unsafe-system/outputs/traces/traces.json", "sha256": "f6469e46a57b8a6e8e17c9b8e78168edd6657ea8a5e1e96fe6ab4a0fc88a734e" },
|
||||
"attestation": { "path": "cases/c/unsafe-system/outputs/attestation.json", "sha256": "c3755088182359a45492170fa8a57d826b605176333d109f4f113bc7ccf85f97" },
|
||||
"truth": { "path": "benchmark/truth/c-unsafe-system.json", "sha256": "9a8200c2cf549b3ac8b19b170e9d34df063351879f19f401d8492e280ad08c13" }
|
||||
},
|
||||
"truth": {
|
||||
"label": "reachable",
|
||||
"confidence": "high",
|
||||
"rationale": "Command injection sink reachable via argv -> system()"
|
||||
},
|
||||
"sandbox": { "network": "loopback", "privileges": "rootless" },
|
||||
"redaction": { "pii": false, "policy": "benchmark-default/v1" }
|
||||
}
|
||||
],
|
||||
"artifacts": {
|
||||
"submissionSchema": { "path": "schemas/submission.schema.json", "sha256": "de5bebb2dbcd085d7896f47a16b9d3837a65fb7f816dcf7e587967d5848c50a7" },
|
||||
"scorer": { "path": "tools/scorer/rb_score.py", "sha256": "32d4f69f5d1d4b87902d6c4f020efde703487d526bf7d42b4438cb2499813f7f" },
|
||||
"baselineSubmissions": []
|
||||
},
|
||||
"schemaVersion": "1.0.0",
|
||||
"signatures": [],
|
||||
"sourceDateEpoch": 1730000000,
|
||||
"tools": {
|
||||
"builder": { "path": "tools/build/build_all.py", "sha256": "64a73f3df9b6f2cdaf5cbb33852b8e9bf443f67cf9dff1573fb635a0252bda9a" },
|
||||
"validator": { "path": "tools/validate.py", "sha256": "776009ef0f3691e60cc87df3f0468181ee7a827be1bd0f73c77fdb68d3ed31c0" }
|
||||
"builder": {
|
||||
"path": "tools/build/build_all.py",
|
||||
"sha256": "64a73f3df9b6f2cdaf5cbb33852b8e9bf443f67cf9dff1573fb635a0252bda9a"
|
||||
},
|
||||
"validator": {
|
||||
"path": "tools/validate.py",
|
||||
"sha256": "776009ef0f3691e60cc87df3f0468181ee7a827be1bd0f73c77fdb68d3ed31c0"
|
||||
}
|
||||
},
|
||||
"signatures": []
|
||||
}
|
||||
"version": "1.0.2"
|
||||
}
|
||||
@@ -0,0 +1,34 @@
|
||||
{
|
||||
"version": "1.0.0",
|
||||
"cases": [
|
||||
{
|
||||
"case_id": "java-micronaut-deserialize:203",
|
||||
"case_version": "1.0.0",
|
||||
"notes": "Micronaut-style controller deserializes base64 payload",
|
||||
"sinks": [
|
||||
{
|
||||
"sink_id": "MicronautDeserialize::handleUpload",
|
||||
"label": "reachable",
|
||||
"confidence": "high",
|
||||
"dynamic_evidence": {
|
||||
"covered_by_tests": [
|
||||
"src/ControllerTest.java"
|
||||
],
|
||||
"coverage_files": [
|
||||
"outputs/coverage.json"
|
||||
]
|
||||
},
|
||||
"static_evidence": {
|
||||
"call_path": [
|
||||
"POST /mn/upload",
|
||||
"Controller.handleUpload",
|
||||
"ObjectInputStream.readObject"
|
||||
]
|
||||
},
|
||||
"config_conditions": [],
|
||||
"notes": "No guard; ObjectInputStream invoked on user-controlled bytes"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -0,0 +1,35 @@
|
||||
{
|
||||
"version": "1.0.0",
|
||||
"cases": [
|
||||
{
|
||||
"case_id": "java-micronaut-guarded:204",
|
||||
"case_version": "1.0.0",
|
||||
"notes": "Deserialization guarded by ALLOW_MN_DESER flag (unreachable by default)",
|
||||
"sinks": [
|
||||
{
|
||||
"sink_id": "MicronautDeserializeGuarded::handleUpload",
|
||||
"label": "unreachable",
|
||||
"confidence": "high",
|
||||
"dynamic_evidence": {
|
||||
"covered_by_tests": [
|
||||
"src/ControllerTest.java"
|
||||
],
|
||||
"coverage_files": [
|
||||
"outputs/coverage.json"
|
||||
]
|
||||
},
|
||||
"static_evidence": {
|
||||
"call_path": [
|
||||
"POST /mn/upload",
|
||||
"Controller.handleUpload"
|
||||
]
|
||||
},
|
||||
"config_conditions": [
|
||||
"ALLOW_MN_DESER=true"
|
||||
],
|
||||
"notes": "Feature flag defaults to false; sink not executed without ALLOW_MN_DESER"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -14,7 +14,9 @@
|
||||
"covered_by_tests": [
|
||||
"src/AppTest.java"
|
||||
],
|
||||
"coverage_files": []
|
||||
"coverage_files": [
|
||||
"outputs/coverage.json"
|
||||
]
|
||||
},
|
||||
"static_evidence": {
|
||||
"call_path": [
|
||||
|
||||
@@ -12,7 +12,7 @@
|
||||
"confidence": "high",
|
||||
"dynamic_evidence": {
|
||||
"covered_by_tests": ["src/AppTest.java"],
|
||||
"coverage_files": []
|
||||
"coverage_files": ["outputs/coverage.json"]
|
||||
},
|
||||
"static_evidence": {
|
||||
"call_path": [
|
||||
|
||||
@@ -0,0 +1,34 @@
|
||||
{
|
||||
"version": "1.0.0",
|
||||
"cases": [
|
||||
{
|
||||
"case_id": "java-spring-reflection:205",
|
||||
"case_version": "1.0.0",
|
||||
"notes": "Reflection endpoint loads arbitrary classes supplied by caller",
|
||||
"sinks": [
|
||||
{
|
||||
"sink_id": "SpringReflection::run",
|
||||
"label": "reachable",
|
||||
"confidence": "high",
|
||||
"dynamic_evidence": {
|
||||
"covered_by_tests": [
|
||||
"src/ReflectControllerTest.java"
|
||||
],
|
||||
"coverage_files": [
|
||||
"outputs/coverage.json"
|
||||
]
|
||||
},
|
||||
"static_evidence": {
|
||||
"call_path": [
|
||||
"POST /api/reflect",
|
||||
"ReflectController.run",
|
||||
"Class.forName"
|
||||
]
|
||||
},
|
||||
"config_conditions": [],
|
||||
"notes": "User-controlled class name flows into Class.forName and reflection instantiation"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
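
Truth files like the ones above pair each `sink_id` with a `label`, which is enough to sketch how a scorer might compare predictions against them. The following is an illustration under assumptions, not the `rb-score` implementation: the prediction format (a mapping from `(case_id, sink_id)` to a label) and the function names are hypothetical, and predictions for sinks absent from the truth are ignored.

```python
def truth_labels(truth_doc: dict) -> dict[tuple[str, str], str]:
    """Flatten a truth document into {(case_id, sink_id): label}."""
    return {
        (case["case_id"], sink["sink_id"]): sink["label"]
        for case in truth_doc["cases"]
        for sink in case["sinks"]
    }


def score(truth: dict[tuple[str, str], str], predicted: dict[tuple[str, str], str]) -> dict[str, float]:
    """Precision, recall, and F1 for the 'reachable' label over sinks present in truth."""
    tp = sum(1 for k, v in truth.items() if v == "reachable" and predicted.get(k) == "reachable")
    fn = sum(1 for k, v in truth.items() if v == "reachable" and predicted.get(k) != "reachable")
    fp = sum(1 for k, v in truth.items() if v != "reachable" and predicted.get(k) == "reachable")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

A tool that labels both Micronaut sinks as reachable would get full recall here but lose precision on the guarded case, which is the behavior the guarded/unguarded case pairs are meant to expose.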
|
||||
@@ -0,0 +1,48 @@
|
||||
id: "java-micronaut-deserialize:203"
|
||||
language: java
|
||||
project: micronaut-deserialize
|
||||
version: "1.0.0"
|
||||
description: "Micronaut-style controller performs unsafe deserialization on request payload"
|
||||
entrypoints:
|
||||
- "POST /mn/upload"
|
||||
sinks:
|
||||
- id: "MicronautDeserialize::handleUpload"
|
||||
path: "bench.reachability.micronaut.Controller.handleUpload"
|
||||
kind: "custom"
|
||||
location:
|
||||
file: src/Controller.java
|
||||
line: 10
|
||||
notes: "ObjectInputStream on user-controlled payload"
|
||||
environment:
|
||||
os_image: "eclipse-temurin:21-jdk"
|
||||
runtime:
|
||||
java: "21"
|
||||
source_date_epoch: 1730000000
|
||||
resource_limits:
|
||||
cpu: "2"
|
||||
memory: "4Gi"
|
||||
build:
|
||||
command: "./build/build.sh"
|
||||
source_date_epoch: 1730000000
|
||||
outputs:
|
||||
artifact_path: outputs/binary.tar.gz
|
||||
sbom_path: outputs/sbom.cdx.json
|
||||
coverage_path: outputs/coverage.json
|
||||
traces_dir: outputs/traces
|
||||
attestation_path: outputs/attestation.json
|
||||
test:
|
||||
command: "./build/build.sh"
|
||||
expected_coverage: []
|
||||
expected_traces: []
|
||||
env:
|
||||
JAVA_TOOL_OPTIONS: "-ea"
|
||||
ground_truth:
|
||||
summary: "Deserialization reachable"
|
||||
evidence_files:
|
||||
- "../benchmark/truth/java-micronaut-deserialize.json"
|
||||
sandbox:
|
||||
network: loopback
|
||||
privileges: rootless
|
||||
redaction:
|
||||
pii: false
|
||||
policy: "benchmark-default/v1"
|
||||
@@ -0,0 +1,8 @@
|
||||
case_id: "java-micronaut-deserialize:203"
entries:
  http:
    - id: "POST /mn/upload"
      route: "/mn/upload"
      method: "POST"
      handler: "Controller.handleUpload"
      description: "Binary payload base64-deserialized"
|
||||
@@ -0,0 +1,12 @@
|
||||
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.stellaops.bench</groupId>
  <artifactId>micronaut-deserialize</artifactId>
  <version>1.0.0</version>
  <packaging>jar</packaging>
  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
  </properties>
</project>
|
||||
@@ -0,0 +1,24 @@
|
||||
package bench.reachability.micronaut;

import java.util.Map;
import java.util.Base64;
import java.io.*;

public class Controller {
    // Unsafe deserialization sink (reachable)
    public static Response handleUpload(Map<String, String> body) {
        String payload = body.get("payload");
        if (payload == null) {
            return new Response(400, "bad request");
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(Base64.getDecoder().decode(payload)))) {
            Object obj = ois.readObject();
            return new Response(200, obj.toString());
        } catch (Exception ex) {
            return new Response(500, ex.getClass().getSimpleName());
        }
    }

    public record Response(int status, String body) {}
}
@@ -0,0 +1,29 @@
package bench.reachability.micronaut;

import java.io.*;
import java.util.*;
import java.util.Base64;

// Simple assertion-based oracle (JUnit-free for offline determinism)
public class ControllerTest {
    private static String serialize(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> body = Map.of("payload", serialize("micronaut"));
        var res = Controller.handleUpload(body);
        assert res.status() == 200 : "status";
        assert res.body().equals("micronaut") : "body";

        File outDir = new File("outputs");
        outDir.mkdirs();
        try (FileWriter fw = new FileWriter(new File(outDir, "SINK_REACHED"))) {
            fw.write("true");
        }
    }
}
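The oracle above only round-trips a `String`. As a minimal sketch of why the sink counts as reachable for arbitrary payloads, the standalone class below copies the handler body from `Controller` and feeds it a different `Serializable` type plus two malformed requests. The class name `UnguardedSinkSketch` and the unchecked `serialize` helper are assumptions made for illustration and are not part of the commit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;

// Hypothetical class name; handleUpload is copied from the Controller in this case.
class UnguardedSinkSketch {
    record Response(int status, String body) {}

    static Response handleUpload(Map<String, String> body) {
        String payload = body.get("payload");
        if (payload == null) {
            return new Response(400, "bad request");
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(Base64.getDecoder().decode(payload)))) {
            Object obj = ois.readObject();
            return new Response(200, obj.toString());
        } catch (Exception ex) {
            return new Response(500, ex.getClass().getSimpleName());
        }
    }

    // Base64-encode the Java serialization of obj; IOException is rethrown unchecked.
    static String serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static void main(String[] args) {
        // Any Serializable type is accepted, not only String.
        List<String> list = new ArrayList<>(List.of("a", "b"));
        System.out.println(handleUpload(Map.of("payload", serialize(list))));
        // A missing field takes the 400 branch.
        System.out.println(handleUpload(Map.of()));
        // Malformed Base64 fails while the stream is being opened and takes the 500 branch.
        System.out.println(handleUpload(Map.of("payload", "not base64!")));
    }
}
```

Because the decoder call sits inside the try-with-resources header, a decoding failure is caught by the same `catch` clause as a deserialization failure, so both surface as a 500 response carrying the exception's simple name.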
@@ -0,0 +1,48 @@
id: "java-micronaut-guarded:204"
language: java
project: micronaut-guarded
version: "1.0.0"
description: "Micronaut-style controller guards deserialization behind ALLOW_MN_DESER flag (unreachable by default)"
entrypoints:
  - "POST /mn/upload"
sinks:
  - id: "MicronautDeserializeGuarded::handleUpload"
    path: "bench.reachability.micronautguard.Controller.handleUpload"
    kind: "custom"
    location:
      file: src/Controller.java
      line: 11
    notes: "ObjectInputStream gated by ALLOW_MN_DESER"
environment:
  os_image: "eclipse-temurin:21-jdk"
  runtime:
    java: "21"
  source_date_epoch: 1730000000
  resource_limits:
    cpu: "2"
    memory: "4Gi"
build:
  command: "./build/build.sh"
  source_date_epoch: 1730000000
  outputs:
    artifact_path: outputs/binary.tar.gz
    sbom_path: outputs/sbom.cdx.json
    coverage_path: outputs/coverage.json
    traces_dir: outputs/traces
    attestation_path: outputs/attestation.json
test:
  command: "./build/build.sh"
  expected_coverage: []
  expected_traces: []
  env:
    JAVA_TOOL_OPTIONS: "-ea"
ground_truth:
  summary: "Guard blocks deserialization unless ALLOW_MN_DESER=true"
  evidence_files:
    - "../benchmark/truth/java-micronaut-guarded.json"
sandbox:
  network: loopback
  privileges: rootless
redaction:
  pii: false
  policy: "benchmark-default/v1"
@@ -0,0 +1,8 @@
case_id: "java-micronaut-guarded:204"
entries:
  http:
    - id: "POST /mn/upload"
      route: "/mn/upload"
      method: "POST"
      handler: "Controller.handleUpload"
      description: "Deserialization guarded by ALLOW_MN_DESER flag"
@@ -0,0 +1,12 @@
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.stellaops.bench</groupId>
  <artifactId>micronaut-guarded</artifactId>
  <version>1.0.0</version>
  <packaging>jar</packaging>
  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
  </properties>
</project>
@@ -0,0 +1,27 @@
package bench.reachability.micronautguard;

import java.util.Map;
import java.util.Base64;
import java.io.*;

public class Controller {
    // Deserialization behind feature flag; unreachable unless ALLOW_MN_DESER=true
    public static Response handleUpload(Map<String, String> body, Map<String, String> env) {
        if (!"true".equals(env.getOrDefault("ALLOW_MN_DESER", "false"))) {
            return new Response(403, "forbidden");
        }
        String payload = body.get("payload");
        if (payload == null) {
            return new Response(400, "bad request");
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(Base64.getDecoder().decode(payload)))) {
            Object obj = ois.readObject();
            return new Response(200, obj.toString());
        } catch (Exception ex) {
            return new Response(500, ex.getClass().getSimpleName());
        }
    }

    public record Response(int status, String body) {}
}
@@ -0,0 +1,29 @@
package bench.reachability.micronautguard;

import java.io.*;
import java.util.*;
import java.util.Base64;

public class ControllerTest {
    private static String serialize(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> body = Map.of("payload", serialize("blocked"));
        Map<String, String> env = Map.of("ALLOW_MN_DESER", "false");
        var res = Controller.handleUpload(body, env);
        assert res.status() == 403 : "status";
        assert res.body().equals("forbidden") : "body";

        File outDir = new File("outputs");
        outDir.mkdirs();
        try (FileWriter fw = new FileWriter(new File(outDir, "SINK_BLOCKED"))) {
            fw.write("true");
        }
    }
}
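The test above exercises only the blocked path. The following is a sketch, under the assumption that the guard behaves exactly as written in the guarded `Controller`, of how the same request fares with the flag absent, set in the wrong case, and set to `true`. `GuardToggleSketch` and its unchecked `serialize` helper are hypothetical names introduced here, not files from the commit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.Map;

// Hypothetical class name; handleUpload mirrors the guarded Controller in this case.
class GuardToggleSketch {
    record Response(int status, String body) {}

    static Response handleUpload(Map<String, String> body, Map<String, String> env) {
        // Only the exact, case-sensitive string "true" opens the guard.
        if (!"true".equals(env.getOrDefault("ALLOW_MN_DESER", "false"))) {
            return new Response(403, "forbidden");
        }
        String payload = body.get("payload");
        if (payload == null) {
            return new Response(400, "bad request");
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(Base64.getDecoder().decode(payload)))) {
            Object obj = ois.readObject();
            return new Response(200, obj.toString());
        } catch (Exception ex) {
            return new Response(500, ex.getClass().getSimpleName());
        }
    }

    // Base64-encode the Java serialization of obj; IOException is rethrown unchecked.
    static String serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static void main(String[] args) {
        Map<String, String> body = Map.of("payload", serialize("allowed"));
        System.out.println(handleUpload(body, Map.of()));                         // flag absent: 403
        System.out.println(handleUpload(body, Map.of("ALLOW_MN_DESER", "TRUE"))); // wrong case: 403
        System.out.println(handleUpload(body, Map.of("ALLOW_MN_DESER", "true"))); // guard open: 200
    }
}
```

Passing the environment as a map, as the benchmark controller does, keeps the guard deterministic: the outcome depends only on the arguments and not on the process environment of the runner.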
@@ -0,0 +1,48 @@
id: "java-spring-reflection:205"
language: java
project: spring-reflection
version: "1.0.0"
description: "Spring-style controller exposes reflection endpoint that loads arbitrary classes"
entrypoints:
  - "POST /api/reflect"
sinks:
  - id: "SpringReflection::run"
    path: "bench.reachability.springreflection.ReflectController.run"
    kind: "custom"
    location:
      file: src/ReflectController.java
      line: 7
    notes: "User-controlled Class.forName + newInstance"
environment:
  os_image: "eclipse-temurin:21-jdk"
  runtime:
    java: "21"
  source_date_epoch: 1730000000
  resource_limits:
    cpu: "2"
    memory: "4Gi"
build:
  command: "./build/build.sh"
  source_date_epoch: 1730000000
  outputs:
    artifact_path: outputs/binary.tar.gz
    sbom_path: outputs/sbom.cdx.json
    coverage_path: outputs/coverage.json
    traces_dir: outputs/traces
    attestation_path: outputs/attestation.json
test:
  command: "./build/build.sh"
  expected_coverage: []
  expected_traces: []
  env:
    JAVA_TOOL_OPTIONS: "-ea"
ground_truth:
  summary: "Reflection sink reachable with user-controlled class name"
  evidence_files:
    - "../benchmark/truth/java-spring-reflection.json"
sandbox:
  network: loopback
  privileges: rootless
redaction:
  pii: false
  policy: "benchmark-default/v1"
@@ -0,0 +1,8 @@
case_id: "java-spring-reflection:205"
entries:
  http:
    - id: "POST /api/reflect"
      route: "/api/reflect"
      method: "POST"
      handler: "ReflectController.run"
      description: "Reflection endpoint loads arbitrary classes"
@@ -0,0 +1,12 @@
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.stellaops.bench</groupId>
  <artifactId>spring-reflection</artifactId>
  <version>1.0.0</version>
  <packaging>jar</packaging>
  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
  </properties>
</project>
@@ -0,0 +1,29 @@
package bench.reachability.springreflection;

import java.util.Map;

public class ReflectController {
    // Reflection sink: user controls Class.forName target
    public static Response run(Map<String, String> body) {
        String className = body.get("class");
        if (className == null || className.isBlank()) {
            return new Response(400, "bad request");
        }
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            return new Response(200, instance.toString());
        } catch (Exception ex) {
            return new Response(500, ex.getClass().getSimpleName());
        }
    }

    public record Response(int status, String body) {}

    public static class Marker {
        @Override
        public String toString() {
            return "marker";
        }
    }
}
@@ -0,0 +1,20 @@
package bench.reachability.springreflection;

import java.io.File;
import java.io.FileWriter;
import java.util.Map;

public class ReflectControllerTest {
    public static void main(String[] args) throws Exception {
        Map<String, String> body = Map.of("class", ReflectController.Marker.class.getName());
        var res = ReflectController.run(body);
        assert res.status() == 200 : "status";
        assert res.body().equals("marker") : "body";

        File outDir = new File("outputs");
        outDir.mkdirs();
        try (FileWriter fw = new FileWriter(new File(outDir, "SINK_REACHED"))) {
            fw.write("true");
        }
    }
}
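The test above instantiates only the bundled `Marker` class. As a sketch of what "user-controlled class name" means in practice, the standalone class below copies `run` from `ReflectController` and calls it with a JDK class, a class that does not exist, and a class without a no-argument constructor. `ReflectionSinkSketch` is a hypothetical name chosen for illustration.

```java
import java.util.Map;

// Hypothetical class name; run is copied from ReflectController in this case.
class ReflectionSinkSketch {
    record Response(int status, String body) {}

    static Response run(Map<String, String> body) {
        String className = body.get("class");
        if (className == null || className.isBlank()) {
            return new Response(400, "bad request");
        }
        try {
            // The caller decides which class gets loaded and instantiated.
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            return new Response(200, instance.toString());
        } catch (Exception ex) {
            return new Response(500, ex.getClass().getSimpleName());
        }
    }

    public static void main(String[] args) {
        // Any class with an accessible no-arg constructor can be instantiated.
        System.out.println(run(Map.of("class", "java.util.ArrayList")));
        // Unknown class names surface as ClassNotFoundException.
        System.out.println(run(Map.of("class", "no.such.Type")));
        // java.lang.Integer has no no-arg constructor, so lookup fails with NoSuchMethodException.
        System.out.println(run(Map.of("class", "java.lang.Integer")));
    }
}
```

The 500 body leaks the exception's simple name, which lets a caller distinguish "class missing" from "class present but not constructible".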
@@ -9,11 +9,14 @@ export DOTNET_CLI_TELEMETRY_OPTOUT=1
 export GIT_TERMINAL_PROMPT=0
 export TZ=UTC

+source "${ROOT}/tools/java/ensure_jdk.sh"
+ensure_bench_jdk
+
 # 1) Validate schemas (truth + submission samples)
 python "${ROOT}/tools/validate.py" --schemas "${ROOT}/schemas"

-# 2) Build all cases deterministically (skips Java since JDK may be missing)
-python "${ROOT}/tools/build/build_all.py" --cases "${ROOT}/cases" --skip-lang java
+# 2) Build all cases deterministically (including Java via vendored JDK)
+python "${ROOT}/tools/build/build_all.py" --cases "${ROOT}/cases"

 # 3) Run Semgrep baseline (offline-safe)
 bash "${ROOT}/baselines/semgrep/run_all.sh" "${ROOT}/cases" "${ROOT}/out/semgrep-baseline"
@@ -13,7 +13,7 @@ This guide explains how to produce a compliant submission for the Stella Ops rea
 python tools/build/build_all.py --cases cases
 ```
 - Sets `SOURCE_DATE_EPOCH`.
-- Skips Java by default if JDK is unavailable (pass `--skip-lang` as needed).
+- Uses vendored Temurin 21 via `tools/java/ensure_jdk.sh` when `JAVA_HOME`/`javac` are missing; pass `--skip-lang` if another toolchain is unavailable on your runner.

 2) **Run your analyzer**
 - For each case, produce sink predictions in memory-safe JSON.
62
bench/reachability-benchmark/tools/java/ensure_jdk.sh
Normal file
@@ -0,0 +1,62 @@
#!/usr/bin/env bash
# Offline-friendly helper to make a JDK available for benchmark builds.
# Order of preference:
# 1) Respect an existing JAVA_HOME when it contains javac.
# 2) Use javac from PATH when present.
# 3) Extract a vendored archive (jdk-21.0.1.tar.gz) into .jdk/ and use it.

ensure_bench_jdk() {
  # Re-use an explicitly provided JAVA_HOME when it already has javac.
  if [[ -n "${JAVA_HOME:-}" && -x "${JAVA_HOME}/bin/javac" ]]; then
    export PATH="${JAVA_HOME}/bin:${PATH}"
    return 0
  fi

  # Use any javac already on PATH.
  if command -v javac >/dev/null 2>&1; then
    return 0
  fi

  local script_dir bench_root repo_root cache_dir archive_dir archive_path candidate
  script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
  bench_root="$(cd "${script_dir}/../.." && pwd)"
  repo_root="$(cd "${bench_root}/../.." && pwd)"
  cache_dir="${bench_root}/.jdk"
  archive_dir="${cache_dir}/jdk-21.0.1+12"

  # Prefer an archive co-located with this script; fall back to the repo copy.
  for candidate in \
    "${script_dir}/jdk-21.0.1.tar.gz" \
    "${repo_root}/src/Sdk/StellaOps.Sdk.Generator/tools/jdk-21.0.1.tar.gz"
  do
    if [[ -f "${candidate}" ]]; then
      archive_path="${candidate}"
      break
    fi
  done

  if [[ -z "${archive_path:-}" ]]; then
    echo "[ensure_jdk] No JDK found. Set JAVA_HOME or place jdk-21.0.1.tar.gz under tools/java/." >&2
    return 1
  fi

  mkdir -p "${cache_dir}"
  if [[ ! -d "${archive_dir}" ]]; then
    tar -xzf "${archive_path}" -C "${cache_dir}"
  fi

  if [[ ! -x "${archive_dir}/bin/javac" ]]; then
    echo "[ensure_jdk] Extracted archive but javac not found under ${archive_dir}" >&2
    return 1
  fi

  export JAVA_HOME="${archive_dir}"
  export PATH="${JAVA_HOME}/bin:${PATH}"
}

# Allow running as a script for quick verification.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
  if ensure_bench_jdk; then
    java -version
  fi
fi
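The first two steps of the preference order in `ensure_jdk.sh` can be restated outside the shell. The sketch below is a rough Java rendering of that lookup, assuming a POSIX-style layout where the compiler binary is named `javac`; `JdkLookupSketch` and `findJavac` are hypothetical names, and the vendored-archive extraction (step 3) is deliberately left out.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper that mirrors steps 1 and 2 of ensure_bench_jdk.
class JdkLookupSketch {
    static Optional<Path> findJavac(Map<String, String> env) {
        // Step 1: an explicit JAVA_HOME wins when it contains an executable bin/javac.
        String home = env.get("JAVA_HOME");
        if (home != null && !home.isEmpty()) {
            Path javac = Path.of(home, "bin", "javac");
            if (Files.isExecutable(javac)) {
                return Optional.of(javac);
            }
        }
        // Step 2: otherwise take the first executable javac found on PATH.
        String path = env.getOrDefault("PATH", "");
        for (String dir : path.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path javac = Path.of(dir, "javac");
            if (Files.isExecutable(javac)) {
                return Optional.of(javac);
            }
        }
        // Step 3 (extracting the vendored archive) is where the shell script would continue.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findJavac(System.getenv())
                .map(Path::toString)
                .orElse("no javac found; the script would fall back to the vendored archive"));
    }
}
```

Taking the environment as a map argument keeps the lookup testable without mutating the process environment.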
17
bench/reachability-benchmark/tools/node/node
Normal file
@@ -0,0 +1,17 @@
#!/usr/bin/env bash
# Lightweight Node shim to support environments where only node.exe (Windows) is present.

if command -v node >/dev/null 2>&1; then
  exec node "$@"
fi

if command -v node.exe >/dev/null 2>&1; then
  exec node.exe "$@"
fi

if [ -x "/mnt/c/Program Files/nodejs/node.exe" ]; then
  exec "/mnt/c/Program Files/nodejs/node.exe" "$@"
fi

echo "node not found; install Node.js or adjust PATH" >&2
exit 127