Reachability Corpus Plan (QA-CORPUS-401-031)
Objective
- Build a multi-runtime reachability corpus (Go/.NET/Python/Rust) with EXPECT.yaml ground truths and captured traces.
- Make fixtures CI-consumable to validate reachability scoring and VEX proofs continuously.
- Add public mini-dataset cases (PHP/JavaScript/C#) from advisory 23-Nov-2025 for ingestion/bench reuse.
Scope & deliverables
- Fixture layout: tests/reachability/corpus/<language>/<case>/
  - expect.yaml: states (reachable|conditional|unreachable), score, evidence refs.
  - callgraph.*.json: static graphs per language.
  - runtime/*.ndjson: traces/probes when available.
  - sbom.*.json: CycloneDX/SPDX slices.
  - vex.openvex.json: expected VEX statement.
- CI integration: add corpus harness to tests/reachability/StellaOps.Reachability.FixtureTests to validate presence, schema, and determinism (hash manifest).
- Offline posture: all artifacts deterministic, no external downloads; hashes recorded in manifest.
- Public mini-dataset layout (PHP/JS/C#) to be mirrored under tests/reachability/samples-public/:
  vuln-reach-dataset/
    schema/ground-truth.schema.json
    runners/run_all.sh
    samples/
      php/php-001-phar-deserialize/...
      js/js-002-yaml-unsafe-load/...
      csharp/cs-001-binaryformatter-deserialize/...
Each sample ships: minimal app, lockfile, SBOM (CycloneDX JSON), VEX, ground truth (EXPECT/JSON), repro script.
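The presence check implied by the fixture layout can be sketched in a few lines. This is an illustration only, written in Python rather than in the project's own test harness; the function names are hypothetical, and treating runtime traces as optional is an assumption drawn from the "when available" wording.

```python
from pathlib import Path

# Glob patterns taken from the fixture layout; which ones are
# mandatory is an assumption of this sketch.
REQUIRED_PATTERNS = [
    "expect.yaml",
    "callgraph.*.json",
    "sbom.*.json",
    "vex.openvex.json",
]
OPTIONAL_PATTERNS = ["runtime/*.ndjson"]  # traces exist only "when available"


def missing_artifacts(case_dir: Path) -> list[str]:
    """Return the required patterns that match no file in case_dir."""
    return [p for p in REQUIRED_PATTERNS if not any(case_dir.glob(p))]


def check_corpus(corpus_root: Path) -> dict[str, list[str]]:
    """Map each failing <language>/<case> directory to its missing patterns."""
    problems: dict[str, list[str]] = {}
    # Sorted iteration keeps the report order stable across machines.
    for case_dir in sorted(corpus_root.glob("*/*")):
        if not case_dir.is_dir():
            continue
        missing = missing_artifacts(case_dir)
        if missing:
            problems[case_dir.relative_to(corpus_root).as_posix()] = missing
    return problems
```

An empty result from `check_corpus(Path("tests/reachability/corpus"))` would mean every case directory carries all required artifacts; schema validation of expect.yaml is a separate step not shown here.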
MVP slice (proposed)
- Go: go-ssh-CVE-2020-9283-keyexchange
- .NET: dotnet-kestrel-CVE-2023-44487-http2-rapid-reset
- Python: python-django-CVE-2019-19844-sqli-like
- Rust: rust-axum-header-parsing-TBD
Work plan
- Define shared manifest schema + hash manifest (NDJSON) under tests/reachability/corpus/manifest.json.
- For each MVP case, add minimal static callgraph + EXPECT.yaml with score/state and evidence links. (DONE: stub versions committed)
- Extend reachability fixture tests to cover corpus folders (presence, hashes, EXPECT.yaml schema). (DONE)
- Wire CI job to run the extended tests in tests/reachability/StellaOps.Reachability.FixtureTests. (TODO)
- Replace stubs with real callgraphs/traces and expand corpus after MVP passes CI. (TODO)
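One possible shape for the hash manifest step is sketched below in Python. The entry fields `path` and `sha256` and the function names are assumptions made for illustration; the plan does not define the manifest schema, and the actual harness lives in the fixture test project.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest_lines(corpus_root: Path, manifest_name: str = "manifest.json") -> list[str]:
    """One JSON line per file, ordered by POSIX-style relative path."""
    entries = []
    for path in corpus_root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(corpus_root).as_posix()
        if rel == manifest_name:
            continue  # the manifest does not hash itself
        entries.append((rel, sha256_file(path)))
    entries.sort()  # stable order regardless of filesystem enumeration
    return [
        json.dumps({"path": rel, "sha256": digest}, sort_keys=True)
        for rel, digest in entries
    ]


def verify_manifest(corpus_root: Path, manifest_lines: list[str]) -> list[str]:
    """Return manifest paths that are missing on disk or whose hash differs."""
    mismatched = []
    for line in manifest_lines:
        entry = json.loads(line)
        target = corpus_root / entry["path"]
        if not target.is_file() or sha256_file(target) != entry["sha256"]:
            mismatched.append(entry["path"])
    return mismatched
```

Writing the lines joined with `"\n"` gives an NDJSON file; a CI check could regenerate the lines and fail when they differ from the committed manifest or when `verify_manifest` returns a non-empty list.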
Determinism rules
- Sort JSON keys; round scores to two decimal places; include timestamps only if needed, and always in UTC.
- Stable ordering of files in manifests; hash with SHA-256.
- No network calls during test or generation.
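The JSON rules above can be illustrated with a small canonicalization helper. This is a minimal Python sketch under stated assumptions: it rounds every float in the document rather than only score fields, and the helper names are hypothetical rather than part of the corpus tooling.

```python
import hashlib
import json


def round_floats(value, ndigits: int = 2):
    """Recursively round every float in a JSON-like structure.

    Assumption: all floats are treated like scores; a real tool might
    round only specific fields.
    """
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {k: round_floats(v, ndigits) for k, v in value.items()}
    if isinstance(value, list):
        return [round_floats(v, ndigits) for v in value]
    return value


def canonical_json(doc) -> bytes:
    """Serialize with sorted keys, fixed separators, and rounded floats."""
    text = json.dumps(
        round_floats(doc),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return text.encode("utf-8")


def digest(doc) -> str:
    """SHA-256 over the canonical bytes."""
    return hashlib.sha256(canonical_json(doc)).hexdigest()


if __name__ == "__main__":
    a = {"state": "reachable", "score": 0.8349999}
    b = {"score": 0.83, "state": "reachable"}
    # Key order and sub-2dp noise do not change the digest.
    print(digest(a) == digest(b))
```

Python's built-in `round` operates on binary floats and rounds exact ties to even, so values sitting exactly on a half step may not round the way a decimal reading suggests; `decimal.Decimal` with an explicit rounding mode is an alternative if that matters for the scores.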