# Reachability Test Datasets
This directory contains ground-truth samples for validating the accuracy of reachability analysis.
## Directory Structure
```
datasets/reachability/
├── README.md                     # This file
├── samples/                      # Test samples by language
│   ├── csharp/
│   │   ├── simple-reachable/     # Positive: direct call path
│   │   └── dead-code/            # Negative: unreachable code
│   ├── java/
│   │   └── vulnerable-log4j/     # Positive: Log4Shell CVE
│   └── native/
│       └── stripped-elf/         # Positive: stripped binary
└── schema/
    ├── manifest.schema.json      # Sample manifest schema
    └── ground-truth.schema.json  # Ground truth schema
```
## Sample Categories
### Positive (Reachable)
Samples where vulnerable code has a confirmed path from entry points:
- `csharp/simple-reachable` - Direct call to vulnerable API
- `java/vulnerable-log4j` - Log4Shell with runtime confirmation
- `native/stripped-elf` - Stripped ELF with heuristic analysis
### Negative (Unreachable)
Samples where vulnerable code exists but is never called:
- `csharp/dead-code` - Deprecated API replaced by safe implementation
## Schema Reference
### manifest.json
Sample metadata including:
- `sampleId` - Unique identifier
- `language` - Primary language (java, csharp, native, etc.)
- `category` - positive, negative, or contested
- `vulnerabilities` - CVEs and affected symbols
- `artifacts` - Binary/SBOM file references
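
As a rough illustration of the fields listed above, the following Python sketch builds a manifest-shaped dictionary and runs a basic shape check on it. Only the top-level keys and the three category values come from this README; the nested keys (`cveId`, `symbols`, `path`, `kind`), all values, and the helper `check_manifest` are hypothetical, and `schema/manifest.schema.json` remains the authoritative definition.

```python
import json

# Top-level keys follow the field list above. Every nested key and value
# (cveId, symbols, path, kind, ...) is a hypothetical placeholder.
manifest = {
    "sampleId": "csharp-simple-reachable",
    "language": "csharp",
    "category": "positive",
    "vulnerabilities": [
        {"cveId": "CVE-0000-00000", "symbols": ["Example.Vulnerable.Method"]}
    ],
    "artifacts": [
        {"path": "artifacts/sample.dll", "kind": "binary"}
    ],
}

REQUIRED_FIELDS = ("sampleId", "language", "category", "vulnerabilities", "artifacts")
ALLOWED_CATEGORIES = {"positive", "negative", "contested"}


def check_manifest(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic shape looks right."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in doc]
    if "category" in doc and doc["category"] not in ALLOWED_CATEGORIES:
        problems.append(f"unexpected category: {doc['category']!r}")
    return problems


if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
    print(check_manifest(manifest))
```

This check is much weaker than JSON Schema validation; it is only meant to make the field list concrete.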
### ground-truth.json
Expected outcomes including:
- `targets` - Symbols with expected lattice states
- `entryPoints` - Program entry points
- `expectedUncertainty` - Expected uncertainty tier
- `expectedGateDecisions` - Expected policy gate outcomes
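
The sketch below shows one way a ground-truth document could be laid out and checked for valid lattice state codes. The four top-level keys come from the list above and the codes from the Lattice States table; the per-target keys `symbol` and `expected`, the placeholder values, and the helper `invalid_targets` are assumptions made for illustration, so consult `schema/ground-truth.schema.json` for the real field names.

```python
# State codes taken from the Lattice States table in this README.
LATTICE_CODES = {"U", "SR", "SU", "RO", "RU", "CR", "CU", "X"}

# Hypothetical document: only the four top-level keys are from the README.
ground_truth = {
    "targets": [
        {
            "symbol": "Example.Vulnerable.Method",   # hypothetical key and value
            "expected": "SR",                         # hypothetical key name
            "reasoning": "Called directly from the entry point.",
        }
    ],
    "entryPoints": ["Example.Program.Main"],          # hypothetical value
    "expectedUncertainty": "placeholder-tier",        # real tier names not shown here
    "expectedGateDecisions": [],
}


def invalid_targets(doc: dict, state_key: str = "expected") -> list[tuple[str, object]]:
    """List (symbol, state) pairs whose state is not a known lattice code."""
    return [
        (target.get("symbol", "?"), target.get(state_key))
        for target in doc.get("targets", [])
        if target.get(state_key) not in LATTICE_CODES
    ]


if __name__ == "__main__":
    print(invalid_targets(ground_truth))
```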
## Lattice States
| Code | Name | Description |
|------|------|-------------|
| U | Unknown | No analysis performed |
| SR | StaticallyReachable | Static analysis finds a path |
| SU | StaticallyUnreachable | Static analysis finds no path |
| RO | RuntimeObserved | Runtime probe observed execution |
| RU | RuntimeUnobserved | Runtime probe did not observe execution |
| CR | ConfirmedReachable | Both static and runtime evidence confirm reachable |
| CU | ConfirmedUnreachable | Both static and runtime evidence confirm unreachable |
| X | Contested | Static and runtime evidence conflict |
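
To make the table concrete, here is a minimal Python sketch that models the eight states as an enum and combines one static verdict with one runtime verdict. The codes and names come from the table; the `combine` function and its rules are assumptions inferred from the descriptions, not the project's implementation. In particular, mapping SR + RU to Contested is a guess, and the [Lattice Model](../../docs/reachability/lattice.md) document is the authority on the actual transitions.

```python
from enum import Enum


class LatticeState(Enum):
    """Codes (member names) and names (values) from the Lattice States table."""
    U = "Unknown"
    SR = "StaticallyReachable"
    SU = "StaticallyUnreachable"
    RO = "RuntimeObserved"
    RU = "RuntimeUnobserved"
    CR = "ConfirmedReachable"
    CU = "ConfirmedUnreachable"
    X = "Contested"


S = LatticeState

# Hypothetical combination rules inferred from the table descriptions.
_COMBINED = {
    (S.SR, S.RO): S.CR,  # both sources agree: reachable
    (S.SU, S.RU): S.CU,  # both sources agree: unreachable
    (S.SU, S.RO): S.X,   # static says unreachable, runtime saw execution
    (S.SR, S.RU): S.X,   # ASSUMPTION: the real lattice may keep this as SR
}


def combine(static: LatticeState, runtime: LatticeState) -> LatticeState:
    """Merge a static verdict (U, SR, SU) with a runtime verdict (U, RO, RU)."""
    if static not in (S.U, S.SR, S.SU):
        raise ValueError(f"not a static verdict: {static.name}")
    if runtime not in (S.U, S.RO, S.RU):
        raise ValueError(f"not a runtime verdict: {runtime.name}")
    if static is S.U:
        return runtime  # only runtime evidence (or none at all)
    if runtime is S.U:
        return static   # only static evidence
    return _COMBINED[(static, runtime)]


if __name__ == "__main__":
    print(combine(S.SR, S.RO).value)
    print(LatticeState["SU"].value)
```

Looking up a state by its code, as in `LatticeState["SU"]`, is handy when reading codes out of a ground-truth file.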
## Running Tests
```bash
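# Validate ground-truth files against the schema (run from datasets/reachability/).
# Quote the glob so ajv-cli expands it; unquoted, bash only recurses on ** when globstar is enabled.
npx ajv validate -s schema/ground-truth.schema.json -d "samples/**/ground-truth.json"
# Run benchmark tests (the project path is relative to the repository root)
dotnet test --filter "GroundTruth" src/Scanner/__Tests/StellaOps.Scanner.Reachability.Benchmarks/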
```
## Adding New Samples
1. Create directory: `samples/{language}/{sample-name}/`
2. Add `manifest.json` with sample metadata
3. Add `ground-truth.json` with expected outcomes
4. Include `reasoning` for each target explaining the expected state
5. Validate against schema before committing
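
The steps above can be spot-checked before committing with a small standard-library script. This is a sketch under stated assumptions, not part of the repository's tooling: it assumes `targets` is a list of objects that each carry a `reasoning` string, and the helper names `check_sample` and `check_all` are hypothetical. It complements schema validation rather than replacing it.

```python
import json
import sys
from pathlib import Path

REQUIRED_FILES = ("manifest.json", "ground-truth.json")


def check_sample(sample_dir: Path) -> list[str]:
    """Report missing files and targets without reasoning for one sample."""
    problems = []
    for name in REQUIRED_FILES:
        if not (sample_dir / name).is_file():
            problems.append(f"{sample_dir}: missing {name}")

    truth_path = sample_dir / "ground-truth.json"
    if truth_path.is_file():
        truth = json.loads(truth_path.read_text(encoding="utf-8"))
        # Assumption: "targets" is a list of objects with a "reasoning" key.
        for index, target in enumerate(truth.get("targets", [])):
            if not str(target.get("reasoning", "")).strip():
                problems.append(f"{truth_path}: target {index} has no reasoning")
    return problems


def check_all(samples_root: Path) -> list[str]:
    """Check every samples/{language}/{sample-name}/ directory."""
    problems = []
    for sample_dir in sorted(p for p in samples_root.glob("*/*") if p.is_dir()):
        problems.extend(check_sample(sample_dir))
    return problems


if __name__ == "__main__":
    issues = check_all(Path("samples"))
    for issue in issues:
        print(issue)
    sys.exit(1 if issues else 0)
```

Run it from `datasets/reachability/`; a non-zero exit code means at least one sample is incomplete.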
## Related Documentation
- [Ground Truth Schema](../../docs/reachability/ground-truth-schema.md)
- [Lattice Model](../../docs/reachability/lattice.md)
- [Policy Gates](../../docs/reachability/policy-gate.md)