tests/reachability/
├── corpus/
│   ├── manifest.json          # SHA-256 hashes for all corpus files
│   ├── java/                  # Java test cases
│   │   └── <case-id>/
│   │       ├── project/       # Source code
│   │       ├── callgraph.json # Expected call graph
│   │       └── ground-truth.json
│   ├── dotnet/                # .NET test cases
│   └── native/                # Native (C/C++/Rust) test cases
├── fixtures/
│   └── reachbench-2025-expanded/
│       ├── INDEX.json         # Fixture index
│       └── cases/
│           └── <case-id>/
│               └── images/
│                   ├── reachable/
│                   │   └── reachgraph.truth.json
│                   └── unreachable/
│                       └── reachgraph.truth.json
└── StellaOps.Reachability.FixtureTests/
    ├── CorpusFixtureTests.cs
    └── ReachbenchFixtureTests.cs
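
The layout of corpus/manifest.json is not shown here, so the following is only a minimal sketch of how its SHA-256 hashes could be checked. It assumes, hypothetically, that the manifest is a flat JSON object mapping corpus-relative paths to hex digests; the helper names sha256_file and verify_manifest are illustrative and not part of the repository.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(corpus_root: Path) -> list[str]:
    """Return the relative paths that are missing or whose digest differs.

    ASSUMPTION: manifest.json is a flat {"relative/path": "hex digest"} map.
    The real manifest may use a different structure.
    """
    manifest_path = corpus_root / "manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    mismatches = []
    for rel_path, expected in sorted(manifest.items()):
        target = corpus_root / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

An empty return value means every listed file exists and matches its recorded digest.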

Ground-Truth Schema

All ground-truth files follow the reachbench.reachgraph.truth/v1 schema:

{
  "schema_version": "reachbench.reachgraph.truth/v1",
  "case_id": "CVE-2023-38545",
  "variant": "reachable",
  "paths": [
    {
      "entry_point": "main",
      "vulnerable_function": "curl_easy_perform",
      "frames": ["main", "do_http_request", "curl_easy_perform"]
    }
  ],
  "metadata": {
    "cve_id": "CVE-2023-38545",
    "purl": "pkg:generic/curl@8.4.0"
  }
}
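
As a rough illustration of what a structural check on such a file might look like, here is a small Python sketch. It relies only on the fields visible in the example above. Two rules in it are assumptions rather than documented requirements: that variant is either "reachable" or "unreachable" (mirroring the fixture directories), and that each path's frames start at entry_point and end at vulnerable_function. The function name check_ground_truth is hypothetical.

```python
SCHEMA_VERSION = "reachbench.reachgraph.truth/v1"


def check_ground_truth(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the document looks valid."""
    problems = []
    if doc.get("schema_version") != SCHEMA_VERSION:
        problems.append("unexpected schema_version")
    # ASSUMPTION: the only variants are the two fixture directory names.
    if doc.get("variant") not in ("reachable", "unreachable"):
        problems.append("variant must be 'reachable' or 'unreachable'")
    for index, path in enumerate(doc.get("paths", [])):
        frames = path.get("frames", [])
        if not frames:
            problems.append(f"paths[{index}]: frames is empty")
            continue
        # ASSUMPTION: frames run from the entry point to the vulnerable function.
        if frames[0] != path.get("entry_point"):
            problems.append(f"paths[{index}]: first frame is not entry_point")
        if frames[-1] != path.get("vulnerable_function"):
            problems.append(f"paths[{index}]: last frame is not vulnerable_function")
    return problems
```

Run against the example document above, this sketch reports no problems.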

Running Fixture Tests

# Run all reachability fixture tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests

# Run only corpus tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~CorpusFixtureTests"

# Run only reachbench tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~ReachbenchFixtureTests"

# Cross-platform runner scripts
./scripts/reachability/run_all.sh       # Unix
./scripts/reachability/run_all.ps1      # Windows

CI Integration

The reachability corpus is validated in CI via .gitea/workflows/reachability-corpus-ci.yml:

validate-corpus: Runs fixture tests, verifies SHA-256 hashes
validate-ground-truths: Validates schema version and structure
determinism-check: Ensures JSON files have sorted keys
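
The workflow's exact definition of "sorted keys" is not given here. The sketch below shows one plausible form of such a check, assuming keys are compared by Unicode code point at every nesting level; the function name unsorted_key_paths is illustrative only.

```python
import json


def unsorted_key_paths(text: str) -> list[str]:
    """Return slash-separated paths of JSON objects whose keys are out of order.

    json.loads keeps keys in document order (dicts preserve insertion order),
    so comparing against sorted() detects non-canonical key ordering.
    ASSUMPTION: "sorted" means plain code point order at every nesting level.
    """
    found = []

    def walk(node, where):
        if isinstance(node, dict):
            keys = list(node)
            if keys != sorted(keys):
                found.append(where or "/")
            for key, value in node.items():
                walk(value, f"{where}/{key}")
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, f"{where}/{index}")

    walk(json.loads(text), "")
    return found
```

An empty result means every object in the file already has its keys in sorted order.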

Triggers:

Push/PR to paths: tests/reachability/**, scripts/reachability/**
Manual workflow dispatch

CAS Layout Reference

Content-Addressable Storage Paths

StellaOps uses BLAKE3 hashes for content-addressable storage:

Artifact Type	CAS Path Pattern	Example
Call Graph	`cas://reachability/graphs/{blake3}`	`cas://reachability/graphs/3a7f2b...`
Runtime Facts	`cas://reachability/runtime-facts/{blake3}`	`cas://reachability/runtime-facts/8c4d1e...`
Replay Manifest	`cas://reachability/replay/{blake3}`	`cas://reachability/replay/f2e9c8...`
Evidence Bundle	`cas://reachability/evidence/{blake3}`	`cas://reachability/evidence/a1b2c3...`
DSSE Envelope	`cas://attestation/dsse/{blake3}`	`cas://attestation/dsse/d4e5f6...`
Symbol Manifest	`cas://symbols/manifests/{blake3}`	`cas://symbols/manifests/7a8b9c...`

Hash Algorithm

All CAS URIs use BLAKE3 with base16 (hex) encoding:

cas://{namespace}/{artifact-type}/{blake3-hex}

Example hash computation:

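# BLAKE3 is not part of hashlib; use the third-party blake3 package
from blake3 import blake3

# "artifact.bin" is a placeholder path; hash the raw bytes
with open("artifact.bin", "rb") as f:
    file_content = f.read()
content_hash = blake3(file_content).hexdigest()  # hex digest for the CAS URI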
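
The URI layout above can be parsed and rebuilt with a few lines of standard-library Python. This is a sketch under stated assumptions: full digests are taken to be 64 lowercase hex characters (BLAKE3's default 32-byte output), and namespace and artifact-type segments are taken to be lowercase letters, digits, and hyphens, as in the table above. CasRef, parse_cas_uri, and build_cas_uri are hypothetical names, not StellaOps APIs.

```python
import re
from typing import NamedTuple

# ASSUMPTION: 64 lowercase hex chars (32-byte BLAKE3 digest); segment
# character set inferred from the examples in the CAS path table.
CAS_URI = re.compile(
    r"cas://(?P<namespace>[a-z0-9-]+)/(?P<artifact_type>[a-z0-9-]+)/(?P<digest>[0-9a-f]{64})"
)


class CasRef(NamedTuple):
    namespace: str
    artifact_type: str
    digest: str


def parse_cas_uri(uri: str) -> CasRef:
    """Split a CAS URI into its parts, raising ValueError if it is malformed."""
    match = CAS_URI.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a well-formed CAS URI: {uri!r}")
    return CasRef(**match.groupdict())


def build_cas_uri(namespace: str, artifact_type: str, digest: str) -> str:
    """Assemble a CAS URI and validate it by parsing the result."""
    uri = f"cas://{namespace}/{artifact_type}/{digest}"
    parse_cas_uri(uri)
    return uri
```

Validating on both build and parse keeps truncated or non-hex digests from entering a manifest unnoticed.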

Replay Workflow

Replay Manifest v2 Schema

{
  "version": 2,
  "hashAlg": "blake3",
  "hash": "blake3:3a7f2b...",
  "created_at": "2025-12-14T00:00:00Z",
  "entries": [
    {
      "type": "callgraph",
      "cas_uri": "cas://reachability/graphs/3a7f2b...",
      "hash": "blake3:3a7f2b..."
    },
    {
      "type": "runtime-facts",
      "cas_uri": "cas://reachability/runtime-facts/8c4d1e...",
      "hash": "blake3:8c4d1e..."
    }
  ],
  "code_id_coverage": 0.95
}

Replay Steps

Export replay manifest:

stella replay export --scan-id <scan-id> --output replay-manifest.json

Validate manifest integrity:

stella replay validate --manifest replay-manifest.json

Fetch CAS artifacts (online):

stella replay fetch --manifest replay-manifest.json --output ./artifacts/

Import for replay (air-gapped):

stella replay import --bundle replay-bundle.tar.gz --verify

Execute replay:

stella replay run --manifest replay-manifest.json --compare-to <baseline-hash>
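
For scripting, the online steps above could be chained from Python roughly as follows. The sketch uses only the subcommands and flags shown in the steps; the default file names are taken from those examples. It assumes, without confirmation from this document, that the air-gapped import step is an alternative to fetch, so it is left out. The function names replay_commands and run_replay are hypothetical.

```python
import subprocess


def replay_commands(
    scan_id: str,
    baseline_hash: str,
    manifest: str = "replay-manifest.json",
    artifacts_dir: str = "./artifacts/",
) -> list[list[str]]:
    """Build the argv lists for the online replay path, in execution order."""
    return [
        ["stella", "replay", "export", "--scan-id", scan_id, "--output", manifest],
        ["stella", "replay", "validate", "--manifest", manifest],
        ["stella", "replay", "fetch", "--manifest", manifest, "--output", artifacts_dir],
        ["stella", "replay", "run", "--manifest", manifest, "--compare-to", baseline_hash],
    ]


def run_replay(scan_id: str, baseline_hash: str) -> None:
    """Run each step in order; check=True stops at the first failing command."""
    for argv in replay_commands(scan_id, baseline_hash):
        subprocess.run(argv, check=True)
```

Passing argv lists rather than a shell string avoids quoting problems if a scan ID or hash contains unusual characters.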

Validation Error Codes

Code	Description
`REPLAY_MANIFEST_MISSING_VERSION`	Manifest missing version field
`VERSION_MISMATCH`	Unexpected manifest version
`MISSING_HASH_ALG`	Hash algorithm not specified
`UNSORTED_ENTRIES`	CAS entries not sorted (non-deterministic)
`CAS_NOT_FOUND`	Referenced CAS artifact missing
`HASH_MISMATCH`	Computed hash differs from declared
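
To make the structural codes concrete, here is a minimal sketch of a validator that emits the first four of them. It is not the validator used by stella replay validate. It assumes the expected version is 2 (matching the v2 schema above) and that "sorted" means entries ordered by cas_uri, which the document does not specify. CAS_NOT_FOUND and HASH_MISMATCH are omitted because they need access to the CAS store.

```python
def validate_manifest(manifest: dict) -> list[str]:
    """Return the structural error codes that apply to a replay manifest."""
    errors = []
    if "version" not in manifest:
        errors.append("REPLAY_MANIFEST_MISSING_VERSION")
    elif manifest["version"] != 2:  # ASSUMPTION: only v2 is accepted
        errors.append("VERSION_MISMATCH")
    if not manifest.get("hashAlg"):
        errors.append("MISSING_HASH_ALG")
    # ASSUMPTION: entries must be in ascending cas_uri order.
    uris = [entry.get("cas_uri", "") for entry in manifest.get("entries", [])]
    if uris != sorted(uris):
        errors.append("UNSORTED_ENTRIES")
    return errors
```

Under these assumptions the example manifest above passes, since its graphs entry sorts before its runtime-facts entry.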

Benchmark Automation

Running Benchmarks

# Full benchmark pipeline
./scripts/bench/run-baseline.sh --all

# Individual steps
./scripts/bench/run-baseline.sh --populate   # Generate findings from fixtures
./scripts/bench/run-baseline.sh --compute    # Compute metrics

# Compare with baseline scanner
./scripts/bench/run-baseline.sh --compare baseline-results.json

Benchmark Outputs

Results are written to bench/results/:

summary.csv: Per-run metrics (TP, FP, TN, FN, precision, recall, F1)
metrics.json: Detailed findings with evidence hashes
replay/: Replay outputs for verification
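
The derived columns in summary.csv follow the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is their harmonic mean. The sketch below shows that arithmetic. Returning 0.0 when a denominator is zero is an assumed convention, and classification_metrics is a hypothetical name rather than a function from the benchmark scripts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from confusion-matrix counts.

    tn is accepted for completeness but does not enter these three metrics.
    ASSUMPTION: a metric with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, 8 true positives with 2 false positives and 2 false negatives give precision, recall, and F1 of 0.8 each.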

Verification Tools

# Online verification (DSSE + Rekor)
./bench/tools/verify.sh <finding-bundle>

# Offline verification
python3 bench/tools/verify.py --bundle <finding-dir> --offline

# Compare scanners
python3 bench/tools/compare.py --baseline <scanner-results> --json

README.md

StellaOps Test Infrastructure

Reachability Test Fixtures

Corpus Structure