# StellaOps Test Infrastructure
This document describes the test infrastructure for StellaOps, including reachability corpus fixtures, benchmark automation, and CI integration.
## Reachability Test Fixtures

### Corpus Structure

The reachability corpus is located at `tests/reachability/` and contains:
```
tests/reachability/
├── corpus/
│   ├── manifest.json                      # SHA-256 hashes for all corpus files
│   ├── java/                              # Java test cases
│   │   └── <case-id>/
│   │       ├── project/                   # Source code
│   │       ├── callgraph.json             # Expected call graph
│   │       └── ground-truth.json
│   ├── dotnet/                            # .NET test cases
│   └── native/                            # Native (C/C++/Rust) test cases
├── fixtures/
│   └── reachbench-2025-expanded/
│       ├── INDEX.json                     # Fixture index
│       └── cases/
│           └── <case-id>/
│               └── images/
│                   ├── reachable/
│                   │   └── reachgraph.truth.json
│                   └── unreachable/
│                       └── reachgraph.truth.json
└── StellaOps.Reachability.FixtureTests/
    ├── CorpusFixtureTests.cs
    └── ReachbenchFixtureTests.cs
```
### Ground-Truth Schema

All ground-truth files follow the `reachbench.reachgraph.truth/v1` schema:
```json
{
  "schema_version": "reachbench.reachgraph.truth/v1",
  "case_id": "CVE-2023-38545",
  "variant": "reachable",
  "paths": [
    {
      "entry_point": "main",
      "vulnerable_function": "curl_easy_perform",
      "frames": ["main", "do_http_request", "curl_easy_perform"]
    }
  ],
  "metadata": {
    "cve_id": "CVE-2023-38545",
    "purl": "pkg:generic/curl@8.4.0"
  }
}
```
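The C# fixture tests enforce this structure. Purely as an illustration (not the actual test code), a quick sanity check over one ground-truth file might look like the following Python sketch; the assumption that `frames` runs from `entry_point` to `vulnerable_function` mirrors the example above:

```python
# Minimal sketch of a ground-truth sanity check; the authoritative checks live
# in CorpusFixtureTests.cs / ReachbenchFixtureTests.cs. Field names follow the
# example document above.
import json
from pathlib import Path

EXPECTED_SCHEMA = "reachbench.reachgraph.truth/v1"

def check_ground_truth(path: Path) -> list[str]:
    """Return a list of problems found in one reachgraph.truth.json file."""
    doc = json.loads(path.read_text(encoding="utf-8"))
    problems = []
    if doc.get("schema_version") != EXPECTED_SCHEMA:
        problems.append(f"unexpected schema_version: {doc.get('schema_version')!r}")
    if doc.get("variant") not in ("reachable", "unreachable"):
        problems.append(f"unexpected variant: {doc.get('variant')!r}")
    for i, p in enumerate(doc.get("paths", [])):
        frames = p.get("frames", [])
        # In the example above, frames run from entry_point to vulnerable_function.
        if not frames or frames[0] != p.get("entry_point") or frames[-1] != p.get("vulnerable_function"):
            problems.append(f"paths[{i}]: frames do not span entry_point -> vulnerable_function")
    return problems
```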
### Running Fixture Tests

```bash
# Run all reachability fixture tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests

# Run only corpus tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~CorpusFixtureTests"

# Run only reachbench tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~ReachbenchFixtureTests"

# Cross-platform runner scripts
./scripts/reachability/run_all.sh    # Unix
./scripts/reachability/run_all.ps1   # Windows
```
## CI Integration

The reachability corpus is validated in CI via `.gitea/workflows/reachability-corpus-ci.yml`:

- `validate-corpus`: runs the fixture tests and verifies SHA-256 hashes
- `validate-ground-truths`: validates schema version and structure
- `determinism-check`: ensures JSON files have sorted keys (see the sketch below)
Triggers:

- Push/PR touching `tests/reachability/**` or `scripts/reachability/**`
- Manual workflow dispatch
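The `validate-corpus` and `determinism-check` jobs are straightforward to reproduce locally. The sketch below assumes `corpus/manifest.json` maps corpus-relative file paths to SHA-256 hex digests and that "sorted keys" means lexicographically ordered JSON object keys; both are assumptions about the workflow, not documented contracts:

```python
# Sketch of the validate-corpus / determinism-check logic; assumes
# corpus/manifest.json maps relative file paths to SHA-256 hex digests.
import hashlib
import json
from pathlib import Path

CORPUS = Path("tests/reachability/corpus")

def verify_hashes() -> list[str]:
    """Recompute SHA-256 for every file listed in manifest.json; return mismatches."""
    manifest = json.loads((CORPUS / "manifest.json").read_text(encoding="utf-8"))
    mismatches = []
    for rel_path, expected in manifest.items():
        actual = hashlib.sha256((CORPUS / rel_path).read_bytes()).hexdigest()
        if actual != expected:
            mismatches.append(rel_path)
    return mismatches

def keys_sorted(obj) -> bool:
    """True if every JSON object in the document has lexicographically sorted keys."""
    if isinstance(obj, dict):
        keys = list(obj)
        return keys == sorted(keys) and all(keys_sorted(v) for v in obj.values())
    if isinstance(obj, list):
        return all(keys_sorted(v) for v in obj)
    return True
```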
## CAS Layout Reference

### Content-Addressable Storage Paths

StellaOps uses BLAKE3 hashes for content-addressable storage:
| Artifact Type | CAS Path Pattern | Example |
|---|---|---|
| Call Graph | `cas://reachability/graphs/{blake3}` | `cas://reachability/graphs/3a7f2b...` |
| Runtime Facts | `cas://reachability/runtime-facts/{blake3}` | `cas://reachability/runtime-facts/8c4d1e...` |
| Replay Manifest | `cas://reachability/replay/{blake3}` | `cas://reachability/replay/f2e9c8...` |
| Evidence Bundle | `cas://reachability/evidence/{blake3}` | `cas://reachability/evidence/a1b2c3...` |
| DSSE Envelope | `cas://attestation/dsse/{blake3}` | `cas://attestation/dsse/d4e5f6...` |
| Symbol Manifest | `cas://symbols/manifests/{blake3}` | `cas://symbols/manifests/7a8b9c...` |
### Hash Algorithm

All CAS URIs use BLAKE3 with base16 (hex) encoding:

```
cas://{namespace}/{artifact-type}/{blake3-hex}
```
Example hash computation:

```python
# Compute a CAS content hash with BLAKE3 (requires the third-party `blake3` package)
from blake3 import blake3

content_hash = blake3(file_content).hexdigest()  # file_content: raw bytes of the artifact
```
## Replay Workflow

### Replay Manifest v2 Schema
```json
{
  "version": 2,
  "hashAlg": "blake3",
  "hash": "blake3:3a7f2b...",
  "created_at": "2025-12-14T00:00:00Z",
  "entries": [
    {
      "type": "callgraph",
      "cas_uri": "cas://reachability/graphs/3a7f2b...",
      "hash": "blake3:3a7f2b..."
    },
    {
      "type": "runtime-facts",
      "cas_uri": "cas://reachability/runtime-facts/8c4d1e...",
      "hash": "blake3:8c4d1e..."
    }
  ],
  "code_id_coverage": 0.95
}
```
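Offline tooling can resolve each entry to a local content-addressed file and re-verify it. The sketch below assumes a local mirror whose directory layout mirrors the `cas://{namespace}/{artifact-type}/{hash}` URI shape and uses the third-party `blake3` package; the mirror layout is an illustrative assumption, not a documented contract:

```python
# Sketch: map a manifest entry's cas:// URI onto a local mirror and re-verify
# its BLAKE3 hash. The mirror directory layout is an assumption.
from pathlib import Path
from blake3 import blake3  # pip install blake3

def resolve_cas_uri(cas_root: Path, cas_uri: str) -> Path:
    # cas://reachability/graphs/3a7f2b...  ->  <cas_root>/reachability/graphs/3a7f2b...
    if not cas_uri.startswith("cas://"):
        raise ValueError(f"not a CAS URI: {cas_uri}")
    return cas_root / cas_uri[len("cas://"):]

def verify_entry(cas_root: Path, entry: dict) -> bool:
    """True if the artifact behind entry['cas_uri'] matches entry['hash']."""
    artifact = resolve_cas_uri(cas_root, entry["cas_uri"])
    declared = entry["hash"].removeprefix("blake3:")
    return blake3(artifact.read_bytes()).hexdigest() == declared
```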
### Replay Steps

1. Export replay manifest: `stella replay export --scan-id <scan-id> --output replay-manifest.json`
2. Validate manifest integrity: `stella replay validate --manifest replay-manifest.json`
3. Fetch CAS artifacts (online): `stella replay fetch --manifest replay-manifest.json --output ./artifacts/`
4. Import for replay (air-gapped): `stella replay import --bundle replay-bundle.tar.gz --verify`
5. Execute replay: `stella replay run --manifest replay-manifest.json --compare-to <baseline-hash>`
### Validation Error Codes

| Code | Description |
|---|---|
| `REPLAY_MANIFEST_MISSING_VERSION` | Manifest is missing the `version` field |
| `VERSION_MISMATCH` | Unexpected manifest version |
| `MISSING_HASH_ALG` | Hash algorithm not specified |
| `UNSORTED_ENTRIES` | CAS entries not sorted (non-deterministic) |
| `CAS_NOT_FOUND` | Referenced CAS artifact missing |
| `HASH_MISMATCH` | Computed hash differs from declared hash |
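A client-side pre-check can surface several of these codes before invoking `stella replay validate`. The sketch below is illustrative rather than the CLI's actual logic, and it assumes `UNSORTED_ENTRIES` refers to entries ordered by `cas_uri`:

```python
# Illustrative pre-check reporting a subset of the validation error codes;
# ordering entries by cas_uri is an assumption about UNSORTED_ENTRIES.
def precheck_manifest(manifest: dict) -> list[str]:
    errors = []
    if "version" not in manifest:
        errors.append("REPLAY_MANIFEST_MISSING_VERSION")
    elif manifest["version"] != 2:
        errors.append("VERSION_MISMATCH")
    if "hashAlg" not in manifest:
        errors.append("MISSING_HASH_ALG")
    uris = [entry.get("cas_uri", "") for entry in manifest.get("entries", [])]
    if uris != sorted(uris):
        errors.append("UNSORTED_ENTRIES")
    return errors
```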
## Benchmark Automation

### Running Benchmarks

```bash
# Full benchmark pipeline
./scripts/bench/run-baseline.sh --all

# Individual steps
./scripts/bench/run-baseline.sh --populate   # Generate findings from fixtures
./scripts/bench/run-baseline.sh --compute    # Compute metrics

# Compare with baseline scanner
./scripts/bench/run-baseline.sh --compare baseline-results.json
```
### Benchmark Outputs

Results are written to `bench/results/`:

- `summary.csv`: per-run metrics (TP, FP, TN, FN, precision, recall, F1; definitions sketched below)
- `metrics.json`: detailed findings with evidence hashes
- `replay/`: replay outputs for verification
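The precision, recall, and F1 figures in `summary.csv` follow the standard definitions; for reference (the function name here is illustrative, not part of the benchmark scripts):

```python
# Standard precision / recall / F1 from TP, FP, FN counts; the function name
# is illustrative and not part of the benchmark tooling.
def compute_metrics(tp: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Example: 42 true positives, 3 false positives, 5 false negatives
print(compute_metrics(42, 3, 5))  # precision ≈ 0.933, recall ≈ 0.894, f1 ≈ 0.913
```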
### Verification Tools

```bash
# Online verification (DSSE + Rekor)
./bench/tools/verify.sh <finding-bundle>

# Offline verification
python3 bench/tools/verify.py --bundle <finding-dir> --offline

# Compare scanners
python3 bench/tools/compare.py --baseline <scanner-results> --json
```