# Reachability Test Datasets

This directory contains ground truth samples for validating reachability analysis accuracy.

## Directory Structure

```
datasets/reachability/
├── README.md                        # This file
├── samples/                         # Test samples by language
│   ├── csharp/
│   │   ├── simple-reachable/        # Positive: direct call path
│   │   └── dead-code/               # Negative: unreachable code
│   ├── java/
│   │   └── vulnerable-log4j/        # Positive: Log4Shell CVE
│   └── native/
│       └── stripped-elf/            # Positive: stripped binary
└── schema/
    ├── manifest.schema.json         # Sample manifest schema
    └── ground-truth.schema.json     # Ground truth schema
```

## Sample Categories

### Positive (Reachable)

Samples where vulnerable code has a confirmed path from entry points:

- `csharp/simple-reachable` - Direct call to vulnerable API
- `java/vulnerable-log4j` - Log4Shell with runtime confirmation
- `native/stripped-elf` - Stripped ELF with heuristic analysis

### Negative (Unreachable)

Samples where vulnerable code exists but is never called:

- `csharp/dead-code` - Deprecated API replaced by safe implementation

## Schema Reference

### manifest.json

Sample metadata including:

- `sampleId` - Unique identifier
- `language` - Primary language (java, csharp, native, etc.)
- `category` - positive, negative, or contested
- `vulnerabilities` - CVEs and affected symbols
- `artifacts` - Binary/SBOM file references

### ground-truth.json

Expected outcomes including:

- `targets` - Symbols with expected lattice states
- `entryPoints` - Program entry points
- `expectedUncertainty` - Expected uncertainty tier
- `expectedGateDecisions` - Expected policy gate outcomes

## Lattice States

| Code | Name | Description |
|------|------|-------------|
| U | Unknown | No analysis performed |
| SR | StaticallyReachable | Static analysis finds a path |
| SU | StaticallyUnreachable | Static analysis finds no path |
| RO | RuntimeObserved | Runtime probe observed execution |
| RU | RuntimeUnobserved | Runtime probe did not observe execution |
| CR | ConfirmedReachable | Static and runtime evidence both confirm reachability |
| CU | ConfirmedUnreachable | Static and runtime evidence both confirm unreachability |
| X | Contested | Static and runtime evidence conflict |

## Running Tests

```bash
# Validate schemas (paths are relative to datasets/reachability/).
# The glob is quoted so that ajv expands it instead of the shell.
npx ajv validate -s schema/ground-truth.schema.json -d "samples/**/ground-truth.json"

# Run benchmark tests (path is relative to the repository root)
dotnet test --filter "GroundTruth" src/Scanner/__Tests/StellaOps.Scanner.Reachability.Benchmarks/
```

## Adding New Samples

1. Create directory: `samples/{language}/{sample-name}/`
2. Add `manifest.json` with sample metadata
3. Add `ground-truth.json` with expected outcomes
4. Include `reasoning` for each target explaining the expected state
5. Validate against schema before committing

## Related Documentation

- [Ground Truth Schema](../../docs/reachability/ground-truth-schema.md)
- [Lattice Model](../../docs/reachability/lattice.md)
- [Policy Gates](../../docs/reachability/policy-gate.md)
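
## Example: Pre-Commit Sample Check

The checklist under Adding New Samples can be partly automated. The following is a minimal Python sketch, not part of this repository, that checks one sample directory for the two required files, the top-level keys listed in the Schema Reference, and a non-empty `reasoning` on each target. The function name `check_sample` is hypothetical, and the shape of `targets` (a list of objects) is an assumption; the schemas in `schema/` remain the authoritative definition, so this complements schema validation rather than replacing it.

```python
import json
from pathlib import Path

# Top-level keys as listed in the Schema Reference section of this README.
MANIFEST_KEYS = {"sampleId", "language", "category", "vulnerabilities", "artifacts"}
GROUND_TRUTH_KEYS = {"targets", "entryPoints", "expectedUncertainty", "expectedGateDecisions"}
CATEGORIES = {"positive", "negative", "contested"}


def check_sample(sample_dir):
    """Return a list of human-readable problems found in one sample directory."""
    sample_dir = Path(sample_dir)
    problems = []
    docs = {}

    for name, required in (("manifest.json", MANIFEST_KEYS),
                           ("ground-truth.json", GROUND_TRUTH_KEYS)):
        path = sample_dir / name
        if not path.is_file():
            problems.append(f"{name}: file is missing")
            continue
        try:
            doc = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{name}: invalid JSON ({exc})")
            continue
        if not isinstance(doc, dict):
            problems.append(f"{name}: top level is not an object")
            continue
        docs[name] = doc
        for key in sorted(required - set(doc)):
            problems.append(f"{name}: missing key '{key}'")

    # The category must be one of the three values named in this README.
    manifest = docs.get("manifest.json", {})
    if "category" in manifest:
        category = manifest["category"]
        if not (isinstance(category, str) and category in CATEGORIES):
            problems.append(f"manifest.json: unknown category {category!r}")

    # Assumption: `targets` is a list of objects, each carrying `reasoning`.
    targets = docs.get("ground-truth.json", {}).get("targets", [])
    if isinstance(targets, list):
        for index, target in enumerate(targets):
            if not isinstance(target, dict) or not target.get("reasoning"):
                problems.append(f"ground-truth.json: targets[{index}] has no reasoning")

    return problems
```

Calling `check_sample("samples/csharp/dead-code")` from `datasets/reachability/` returns an empty list when nothing is flagged, so the result can gate a commit hook or a CI step.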