# Reachability Test Datasets

This directory contains ground truth samples for validating reachability analysis accuracy.

## Directory Structure

```
datasets/reachability/
├── README.md                    # This file
├── samples/                     # Test samples by language
│   ├── csharp/
│   │   ├── simple-reachable/    # Positive: direct call path
│   │   └── dead-code/           # Negative: unreachable code
│   ├── java/
│   │   └── vulnerable-log4j/    # Positive: Log4Shell CVE
│   └── native/
│       └── stripped-elf/        # Positive: stripped binary
└── schema/
    ├── manifest.schema.json     # Sample manifest schema
    └── ground-truth.schema.json # Ground truth schema
```

## Sample Categories

### Positive (Reachable)

Samples where vulnerable code has a confirmed path from entry points:

- `csharp/simple-reachable` - Direct call to vulnerable API
- `java/vulnerable-log4j` - Log4Shell with runtime confirmation
- `native/stripped-elf` - Stripped ELF with heuristic analysis

### Negative (Unreachable)

Samples where vulnerable code exists but is never called:

- `csharp/dead-code` - Deprecated API replaced by safe implementation

## Schema Reference

### manifest.json

Sample metadata including:

- `sampleId` - Unique identifier
- `language` - Primary language (java, csharp, native, etc.)
- `category` - positive, negative, or contested
- `vulnerabilities` - CVEs and affected symbols
- `artifacts` - Binary/SBOM file references
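
As a rough illustration of the fields listed above, a manifest could be loaded and checked for those top-level keys as sketched below. This is not the project's validator: the helper name `missing_manifest_fields`, the example values, and the empty-list shape used for `vulnerabilities` and `artifacts` are assumptions made for the example, and `schema/manifest.schema.json` remains the authority on what a manifest must contain.

```python
import json

# Top-level fields named in this README; the real schema may require more.
README_FIELDS = ("sampleId", "language", "category", "vulnerabilities", "artifacts")
# Category values named in this README.
CATEGORIES = {"positive", "negative", "contested"}


def missing_manifest_fields(manifest: dict) -> list[str]:
    """Return the README-listed top-level fields that are absent from a manifest."""
    return [name for name in README_FIELDS if name not in manifest]


# Hypothetical manifest: every value below is illustrative only.
example = json.loads("""
{
  "sampleId": "csharp-simple-reachable",
  "language": "csharp",
  "category": "positive",
  "vulnerabilities": [],
  "artifacts": []
}
""")

print("missing fields:", missing_manifest_fields(example))
print("known category:", example["category"] in CATEGORIES)
```

A check like this only catches missing keys; type and format constraints are better left to schema validation.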

### ground-truth.json

Expected outcomes including:

- `targets` - Symbols with expected lattice states
- `entryPoints` - Program entry points
- `expectedUncertainty` - Expected uncertainty tier
- `expectedGateDecisions` - Expected policy gate outcomes
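
To show how the `targets` entries might be used, the sketch below compares expected lattice states against the states an analyzer reported. The key names `symbol` and `expected`, the symbol names, and the choice to treat an unreported symbol as `U` are all assumptions made for illustration; the actual field names are defined by `schema/ground-truth.schema.json`.

```python
def score_targets(ground_truth: dict, observed: dict[str, str]) -> tuple[int, list[str]]:
    """Count targets whose observed state matches, and list the symbols that differ.

    `observed` maps a symbol to the lattice code reported by the analyzer.
    A symbol missing from `observed` is treated as "U" (an assumption).
    """
    matches = 0
    mismatches = []
    for target in ground_truth["targets"]:
        symbol = target["symbol"]      # assumed key name
        expected = target["expected"]  # assumed key name
        if observed.get(symbol, "U") == expected:
            matches += 1
        else:
            mismatches.append(symbol)
    return matches, mismatches


# Hypothetical ground truth and analyzer output.
ground_truth = {
    "targets": [
        {"symbol": "Lib.Vulnerable.Parse", "expected": "SR"},
        {"symbol": "Lib.Legacy.Decode", "expected": "SU"},
    ],
    "entryPoints": ["App.Program.Main"],
}
observed = {"Lib.Vulnerable.Parse": "SR", "Lib.Legacy.Decode": "SR"}

matches, mismatches = score_targets(ground_truth, observed)
print(f"{matches}/{len(ground_truth['targets'])} targets match; mismatched: {mismatches}")
```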

## Lattice States

| Code | Name | Description |
|------|------|-------------|
| U | Unknown | No analysis performed |
| SR | StaticallyReachable | Static analysis finds a path |
| SU | StaticallyUnreachable | Static analysis finds no path |
| RO | RuntimeObserved | Runtime probe observed execution |
| RU | RuntimeUnobserved | Runtime probe did not observe execution |
| CR | ConfirmedReachable | Both static and runtime evidence confirm reachable |
| CU | ConfirmedUnreachable | Both static and runtime evidence confirm unreachable |
| X | Contested | Static and runtime evidence conflict |
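
One way to read the table is as the result of combining an optional static verdict with an optional runtime verdict. The sketch below encodes that reading. The `combine` function and its boolean inputs are assumptions for illustration, not the analyzer's actual merge rules, which are described in the lattice model documentation.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    U = "Unknown"
    SR = "StaticallyReachable"
    SU = "StaticallyUnreachable"
    RO = "RuntimeObserved"
    RU = "RuntimeUnobserved"
    CR = "ConfirmedReachable"
    CU = "ConfirmedUnreachable"
    X = "Contested"


def combine(static: Optional[bool], runtime: Optional[bool]) -> State:
    """Combine two pieces of evidence into a lattice state.

    static:  True = path found, False = no path found, None = not analyzed.
    runtime: True = execution observed, False = not observed, None = not probed.
    """
    if static is None and runtime is None:
        return State.U
    if runtime is None:
        return State.SR if static else State.SU
    if static is None:
        return State.RO if runtime else State.RU
    if static == runtime:
        return State.CR if static else State.CU
    return State.X  # the two sources disagree


print(combine(True, True).name)   # both sources agree on reachable
print(combine(True, False).name)  # sources disagree
```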
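
## Running Tests

```bash
# Validate samples against the schemas (paths are relative to datasets/reachability/)
# Quote the glob so ajv expands it instead of the shell
npx ajv validate -s schema/manifest.schema.json -d "samples/**/manifest.json"
npx ajv validate -s schema/ground-truth.schema.json -d "samples/**/ground-truth.json"

# Run benchmark tests (path is relative to the repository root)
dotnet test --filter "GroundTruth" src/Scanner/__Tests/StellaOps.Scanner.Reachability.Benchmarks/
```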

## Adding New Samples

1. Create directory: `samples/{language}/{sample-name}/`
2. Add `manifest.json` with sample metadata
3. Add `ground-truth.json` with expected outcomes
4. Include a `reasoning` entry for each target that explains the expected state
5. Validate both files against their schemas before committing
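
The steps above can be spot-checked before committing with a small script such as the sketch below. It assumes that `targets` is a list of objects and that `reasoning` is a string field on each of them; the helper name `check_sample` is hypothetical. It complements schema validation and does not replace it.

```python
import json
from pathlib import Path


def check_sample(sample_dir: Path) -> list[str]:
    """Return a list of problems found in one sample directory."""
    problems = []
    for name in ("manifest.json", "ground-truth.json"):
        if not (sample_dir / name).is_file():
            problems.append(f"missing {name}")

    truth_path = sample_dir / "ground-truth.json"
    if truth_path.is_file():
        truth = json.loads(truth_path.read_text(encoding="utf-8"))
        # Assumes `targets` is a list of objects, each with a `reasoning` string.
        for index, target in enumerate(truth.get("targets", [])):
            if not str(target.get("reasoning", "")).strip():
                problems.append(f"targets[{index}] has no reasoning")
    return problems


if __name__ == "__main__":
    # Run from datasets/reachability/; samples live at samples/{language}/{sample-name}/.
    for sample_dir in sorted(p for p in Path("samples").glob("*/*") if p.is_dir()):
        for problem in check_sample(sample_dir):
            print(f"{sample_dir}: {problem}")
```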

## Related Documentation

- [Ground Truth Schema](../../docs/reachability/ground-truth-schema.md)
- [Lattice Model](../../docs/reachability/lattice.md)
- [Policy Gates](../../docs/reachability/policy-gate.md)