up
Some checks failed
Concelier Attestation Tests / attestation-tests (push) Has been cancelled
Policy Simulation / policy-simulate (push) Has been cancelled
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Signals CI & Image / signals-ci (push) Has been cancelled
Signals Reachability Scoring & Events / reachability-smoke (push) Has been cancelled
Signals Reachability Scoring & Events / sign-and-upload (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled
Scanner Analyzers / Discover Analyzers (push) Has been cancelled
Scanner Analyzers / Build Analyzers (push) Has been cancelled
Scanner Analyzers / Test Language Analyzers (push) Has been cancelled
Scanner Analyzers / Validate Test Fixtures (push) Has been cancelled
Scanner Analyzers / Verify Deterministic Output (push) Has been cancelled
128
bench/README.md
@@ -1,7 +1,7 @@
 # Stella Ops Bench Repository

-> **Status:** Draft - aligns with `docs/benchmarks/vex-evidence-playbook.md` (Sprint 401).
-> **Purpose:** Host reproducible VEX decisions and comparison data that prove Stella Ops’ signal quality vs. baseline scanners.
+> **Status:** Active · Last updated: 2025-12-13
+> **Purpose:** Host reproducible VEX decisions, reachability evidence, and comparison data proving Stella Ops' signal quality vs. baseline scanners.

 ## Layout

@@ -11,20 +11,122 @@ bench/
   findings/                    # per CVE/product bundles
     CVE-YYYY-NNNNN/
       evidence/
-        reachability.json
-        sbom.cdx.json
-      decision.openvex.json
-      decision.dsse.json
-      rekor.txt
-      metadata.json
+        reachability.json      # richgraph-v1 excerpt
+        sbom.cdx.json          # CycloneDX SBOM
+      decision.openvex.json    # OpenVEX decision
+      decision.dsse.json       # DSSE envelope
+      rekor.txt                # Rekor log index + inclusion proof
+      metadata.json            # finding metadata (purl, CVE, version)
   tools/
-    verify.sh                  # DSSE + Rekor verifier
+    verify.sh                  # DSSE + Rekor verifier (online)
     verify.py                  # offline verifier
     compare.py                 # baseline comparison script
-    replay.sh                  # runs reachability replay manifolds
+    replay.sh                  # runs reachability replay manifests
   results/
-    summary.csv
+    summary.csv                # aggregated metrics
     runs/<date>/...            # raw outputs + replay manifests
+  reachability-benchmark/      # reachability benchmark with JDK fixtures
 ```

-Refer to `docs/benchmarks/vex-evidence-playbook.md` for artifact contracts and automation tasks. The `bench/` tree will be populated once `BENCH-AUTO-401-019` and `DOCS-VEX-401-012` land.

## Related Documentation

| Document | Purpose |
|----------|---------|
| [VEX Evidence Playbook](../docs/benchmarks/vex-evidence-playbook.md) | Proof bundle schema, justification catalog, verification workflow |
| [Hybrid Attestation](../docs/reachability/hybrid-attestation.md) | Graph-level and edge-bundle DSSE decisions |
| [Function-Level Evidence](../docs/reachability/function-level-evidence.md) | Cross-module evidence chain guide |
| [Deterministic Replay](../docs/replay/DETERMINISTIC_REPLAY.md) | Replay manifest specification |

## Verification Workflows

### Quick Verification (Online)

```bash
# Verify a VEX proof bundle with DSSE and Rekor
./tools/verify.sh findings/CVE-2021-44228/decision.dsse.json

# Output:
# ✓ DSSE signature valid
# ✓ Rekor inclusion verified (log index: 12345678)
# ✓ Evidence hashes match
# ✓ Justification catalog membership confirmed
```
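
The "Evidence hashes match" line in the output above suggests that each evidence file is re-hashed and compared with a recorded digest. The following is a minimal sketch of that idea, not the logic of `verify.sh` itself. It assumes a hypothetical mapping from file name to expected hex digest, and it uses SHA-256 (named as the secondary hash under Artifact Contracts) because BLAKE3 is not part of the Python standard library.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_evidence(evidence_dir: Path, expected: dict[str, str]) -> dict[str, bool]:
    """Compare each expected digest with the file on disk.

    `expected` maps a file name to a hex SHA-256 digest. This shape is
    hypothetical and is not taken from the proof bundle schema.
    """
    results = {}
    for name, want in expected.items():
        target = evidence_dir / name
        results[name] = target.is_file() and sha256_file(target) == want.lower()
    return results


def demo() -> dict[str, bool]:
    """Build a throwaway evidence directory and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        evidence = Path(tmp)
        content = b'{"bomFormat":"CycloneDX"}'
        (evidence / "sbom.cdx.json").write_bytes(content)
        expected = {
            "sbom.cdx.json": hashlib.sha256(content).hexdigest(),
            "reachability.json": "0" * 64,  # file is absent, so this check fails
        }
        return check_evidence(evidence, expected)


if __name__ == "__main__":
    print(demo())
```

A missing file and a mismatched digest both report `False`, so a caller can treat any `False` value as a failed bundle.
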

### Offline Verification

```bash
# Verify without network access
python tools/verify.py \
  --bundle findings/CVE-2021-44228/decision.dsse.json \
  --cas-root ./findings/CVE-2021-44228/evidence/ \
  --catalog ../docs/benchmarks/vex-justifications.catalog.json

# Or use the VEX proof bundle verifier
python ../scripts/vex/verify_proof_bundle.py \
  --bundle ../tests/Vex/ProofBundles/sample-proof-bundle.json \
  --cas-root ../tests/Vex/ProofBundles/cas/
```
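
An offline verifier has to work from the DSSE envelope alone. As a rough illustration (not the code of `tools/verify.py`), the sketch below decodes an envelope and rebuilds the pre-authentication encoding (PAE) defined by the DSSE specification, which is the byte string a signature is computed over. The envelope contents and key ID are made up for the demo, and the signature check itself is left out because it depends on the signer's key and a cryptography library.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),
        type_bytes,
        str(len(payload)).encode("ascii"),
        payload,
    ])


def decode_envelope(envelope_text: str) -> tuple[str, bytes, list]:
    """Return (payloadType, decoded payload, signatures) from envelope JSON."""
    envelope = json.loads(envelope_text)
    payload = base64.b64decode(envelope["payload"])
    return envelope["payloadType"], payload, envelope.get("signatures", [])


def demo() -> bytes:
    """Build a toy envelope and return the bytes a verifier would check."""
    envelope_text = json.dumps({
        "payloadType": "http://example.com/HelloWorld",
        "payload": base64.b64encode(b"hello world").decode("ascii"),
        # Placeholder values; a real envelope carries a base64 signature.
        "signatures": [{"keyid": "example-key", "sig": "placeholder"}],
    })
    payload_type, payload, _signatures = decode_envelope(envelope_text)
    return pae(payload_type, payload)


if __name__ == "__main__":
    print(demo())
```

Each signature in the envelope would then be verified against the PAE bytes, not against the raw payload.
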

### Reachability Graph Verification

```bash
# Verify graph DSSE
stella graph verify --hash blake3:a1b2c3d4...

# Verify with edge bundles
stella graph verify --hash blake3:a1b2c3d4... --include-bundles

# Offline with local CAS
stella graph verify --hash blake3:a1b2c3d4... --cas-root ./offline-cas/
```

### Baseline Comparison

```bash
# Compare Stella Ops findings against baseline scanners
python tools/compare.py \
  --stellaops results/runs/2025-12-13/findings.json \
  --baseline results/baselines/trivy-latest.json \
  --output results/comparison-2025-12-13.csv

# Metrics generated:
# - True positives (reachability-confirmed)
# - False positives (unreachable code paths)
# - MTTD (mean time to detect)
# - Reproducibility score
```
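
To make the first two metrics concrete, here is one way the tally could work. This is a sketch under stated assumptions, not the behaviour of `tools/compare.py`. The `Finding` record, the `(purl, cve)` key, and the `unassessed` bucket are hypothetical, as are the sample package names other than the Log4j entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A Stella Ops verdict for one package/CVE pair (hypothetical shape)."""
    purl: str
    cve: str
    reachable: bool


def compare(stella: list[Finding], baseline: set[tuple[str, str]]) -> dict[str, int]:
    """Classify each baseline alert using the reachability verdicts."""
    verdicts = {(f.purl, f.cve): f.reachable for f in stella}
    counts = {"true_positives": 0, "false_positives": 0, "unassessed": 0}
    for key in baseline:
        if key not in verdicts:
            counts["unassessed"] += 1       # no reachability verdict available
        elif verdicts[key]:
            counts["true_positives"] += 1   # reachability-confirmed
        else:
            counts["false_positives"] += 1  # unreachable code path
    return counts


def demo() -> dict[str, int]:
    log4j = "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"
    stella = [
        Finding(log4j, "CVE-2021-44228", reachable=True),
        Finding("pkg:maven/example/lib-a@1.0.0", "CVE-0000-0001", reachable=False),
    ]
    baseline = {
        (log4j, "CVE-2021-44228"),
        ("pkg:maven/example/lib-a@1.0.0", "CVE-0000-0001"),
        ("pkg:maven/example/lib-b@2.0.0", "CVE-0000-0002"),
    }
    return compare(stella, baseline)


if __name__ == "__main__":
    print(demo())
```

Keeping unassessed alerts in their own bucket avoids counting a missing verdict as either a hit or a miss.
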

## Artifact Contracts

All bench artifacts must comply with:

1. **VEX Proof Bundle Schema** (`docs/benchmarks/vex-evidence-playbook.schema.json`)
   - BLAKE3-256 primary hash, SHA-256 secondary
   - Canonical JSON with sorted keys
   - DSSE envelope with Rekor-ready digest

2. **Justification Catalog** (`docs/benchmarks/vex-justifications.catalog.json`)
   - VEX1-VEX10 justification codes
   - Required evidence types per justification
   - Expiry and re-evaluation rules

3. **Reachability Graph** (`docs/contracts/richgraph-v1.md`)
   - BLAKE3 graph_hash for content addressing
   - Deterministic node/edge ordering
   - SymbolID/EdgeID format compliance
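
The "canonical JSON with sorted keys" requirement matters because a digest is only reproducible when the serialized bytes are. The sketch below shows the general technique with the Python standard library. The exact canonicalization rules are defined by the schema and are not restated in this README, so the separator and encoding choices here are assumptions. SHA-256 stands in for BLAKE3, which needs a third-party package, and the `sha256:` prefix simply mirrors the `blake3:` style used in the graph verification commands.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_digest(value) -> str:
    """Digest of the canonical form, prefixed with the algorithm name."""
    return "sha256:" + hashlib.sha256(canonical_json(value)).hexdigest()


if __name__ == "__main__":
    first = {"cve": "CVE-2021-44228", "status": "not_affected"}
    second = {"status": "not_affected", "cve": "CVE-2021-44228"}
    # Key order in the source objects does not change the digest.
    print(sha256_digest(first) == sha256_digest(second))
```

Sorting applies to nested objects as well, so two producers that build the same document in a different key order still arrive at the same bytes.
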

## CI Integration

The bench directory is validated by:

- `.gitea/workflows/vex-proof-bundles.yml` - Verifies all proof bundles
- `.gitea/workflows/bench-determinism.yml` - Runs determinism benchmarks
- `.gitea/workflows/hybrid-attestation.yml` - Verifies graph/edge-bundle fixtures

## Contributing

1. Add new findings under `findings/CVE-YYYY-NNNNN/`
2. Include all required evidence artifacts
3. Generate DSSE envelope and Rekor proof
4. Update `results/summary.csv`
5. Run verification: `./tools/verify.sh findings/CVE-YYYY-NNNNN/decision.dsse.json`
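
Before running the verifier, a contributor may want a quick local check that a finding directory is complete. The sketch below is a hypothetical helper, not part of `tools/`. The list of required paths is taken from the layout listing, and placing `reachability.json` and `sbom.cdx.json` under `evidence/` is an assumption based on the `--cas-root` path used in the offline verification example.

```python
import tempfile
from pathlib import Path

# Assumed from the layout listing; adjust if the real tree differs.
REQUIRED = (
    "evidence/reachability.json",
    "evidence/sbom.cdx.json",
    "decision.openvex.json",
    "decision.dsse.json",
    "rekor.txt",
    "metadata.json",
)


def missing_artifacts(finding_dir: Path, required=REQUIRED) -> list[str]:
    """Return the required relative paths that are not regular files."""
    return [rel for rel in required if not (finding_dir / rel).is_file()]


def demo() -> list[str]:
    """Create a partially populated finding and report what is missing."""
    with tempfile.TemporaryDirectory() as tmp:
        finding = Path(tmp) / "CVE-2021-44228"
        (finding / "evidence").mkdir(parents=True)
        for rel in ("evidence/sbom.cdx.json", "decision.openvex.json", "metadata.json"):
            (finding / rel).write_text("{}", encoding="utf-8")
        return missing_artifacts(finding)


if __name__ == "__main__":
    print(demo())
```

An empty list means every expected file is present. It says nothing about whether the contents are valid, which is what step 5 checks.
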