up

This commit is contained in:
StellaOps Bot
2025-12-13 09:37:15 +02:00
parent e00f6365da
commit 6e45066e37
349 changed files with 17160 additions and 1867 deletions

@@ -1,7 +1,7 @@
# StellaOps Bench Repository
# Stella Ops Bench Repository
> **Status:** Draft — aligns with `docs/benchmarks/vex-evidence-playbook.md` (Sprint401).
> **Purpose:** Host reproducible VEX decisions and comparison data that prove StellaOps signal quality vs. baseline scanners.
> **Status:** Active · Last updated: 2025-12-13
> **Purpose:** Host reproducible VEX decisions, reachability evidence, and comparison data proving Stella Ops' signal quality vs. baseline scanners.
## Layout
@@ -11,20 +11,122 @@ bench/
findings/ # per CVE/product bundles
CVE-YYYY-NNNNN/
evidence/
reachability.json
sbom.cdx.json
decision.openvex.json
decision.dsse.json
rekor.txt
metadata.json
reachability.json # richgraph-v1 excerpt
sbom.cdx.json # CycloneDX SBOM
decision.openvex.json # OpenVEX decision
decision.dsse.json # DSSE envelope
rekor.txt # Rekor log index + inclusion proof
metadata.json # finding metadata (purl, CVE, version)
tools/
verify.sh # DSSE + Rekor verifier
verify.sh # DSSE + Rekor verifier (online)
verify.py # offline verifier
compare.py # baseline comparison script
replay.sh # runs reachability replay manifolds
replay.sh # runs reachability replay manifests
results/
summary.csv
summary.csv # aggregated metrics
runs/<date>/... # raw outputs + replay manifests
reachability-benchmark/ # reachability benchmark with JDK fixtures
```
Refer to `docs/benchmarks/vex-evidence-playbook.md` for artifact contracts and automation tasks. The `bench/` tree will be populated once `BENCH-AUTO-401-019` and `DOCS-VEX-401-012` land.
## Related Documentation
| Document | Purpose |
|----------|---------|
| [VEX Evidence Playbook](../docs/benchmarks/vex-evidence-playbook.md) | Proof bundle schema, justification catalog, verification workflow |
| [Hybrid Attestation](../docs/reachability/hybrid-attestation.md) | Graph-level and edge-bundle DSSE decisions |
| [Function-Level Evidence](../docs/reachability/function-level-evidence.md) | Cross-module evidence chain guide |
| [Deterministic Replay](../docs/replay/DETERMINISTIC_REPLAY.md) | Replay manifest specification |
## Verification Workflows
### Quick Verification (Online)
```bash
# Verify a VEX proof bundle with DSSE and Rekor
./tools/verify.sh findings/CVE-2021-44228/decision.dsse.json
# Output:
# ✓ DSSE signature valid
# ✓ Rekor inclusion verified (log index: 12345678)
# ✓ Evidence hashes match
# ✓ Justification catalog membership confirmed
```
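To make the "DSSE signature valid" step concrete: under the DSSE specification, a signature covers the pre-authentication encoding (PAE) of the payload type and the decoded payload, not the raw envelope JSON. The sketch below rebuilds those bytes from an envelope. It assumes the standard DSSE envelope fields `payloadType` and `payload` (base64), and the helper names are illustrative. It does not check any signature and is not necessarily how `tools/verify.sh` works.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes a DSSE signature covers."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),  # length of the type in bytes
        type_bytes,
        str(len(payload)).encode("ascii"),     # length of the payload in bytes
        payload,
    ])


def signed_bytes_from_envelope(envelope_json: str) -> bytes:
    """Hypothetical helper: decode an envelope and return the bytes to verify."""
    envelope = json.loads(envelope_json)
    # Assumes standard (not URL-safe) base64 in the "payload" field.
    payload = base64.b64decode(envelope["payload"])
    return pae(envelope["payloadType"], payload)
```

A signature verifier would then check each entry in the envelope's `signatures` list against these bytes using the signer's public key.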
### Offline Verification
```bash
# Verify without network access
python tools/verify.py \
--bundle findings/CVE-2021-44228/decision.dsse.json \
--cas-root ./findings/CVE-2021-44228/evidence/ \
--catalog ../docs/benchmarks/vex-justifications.catalog.json
# Or use the VEX proof bundle verifier
python ../scripts/vex/verify_proof_bundle.py \
--bundle ../tests/Vex/ProofBundles/sample-proof-bundle.json \
--cas-root ../tests/Vex/ProofBundles/cas/
```
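The "evidence hashes match" check can be pictured as re-hashing each evidence file under the CAS root and comparing against the digest recorded in the bundle. The following is a minimal sketch under stated assumptions: the `expected` mapping (relative path to SHA-256 hex digest) is a hypothetical input shape, not the actual bundle schema, and only the SHA-256 secondary hash is shown because BLAKE3 is not in the Python standard library.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_evidence(cas_root, expected):
    """Return relative paths that are missing or whose digest differs.

    `expected` maps a path relative to `cas_root` to a SHA-256 hex digest
    (an assumed shape, for illustration only).
    """
    root = Path(cas_root)
    failures = []
    for rel_path, want in sorted(expected.items()):
        target = root / rel_path
        if not target.is_file() or sha256_file(target) != want.lower():
            failures.append(rel_path)
    return failures
```

An empty return value means every listed file was found and matched its recorded digest.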
### Reachability Graph Verification
```bash
# Verify graph DSSE
stella graph verify --hash blake3:a1b2c3d4...
# Verify with edge bundles
stella graph verify --hash blake3:a1b2c3d4... --include-bundles
# Offline with local CAS
stella graph verify --hash blake3:a1b2c3d4... --cas-root ./offline-cas/
```
### Baseline Comparison
```bash
# Compare Stella Ops findings against baseline scanners
python tools/compare.py \
--stellaops results/runs/2025-12-13/findings.json \
--baseline results/baselines/trivy-latest.json \
--output results/comparison-2025-12-13.csv
# Metrics generated:
# - True positives (reachability-confirmed)
# - False positives (unreachable code paths)
# - MTTD (mean time to detect)
# - Reproducibility score
```
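One way to read the first two metrics: a baseline scanner finding counts as a true positive when reachability evidence confirms the vulnerable code is reachable, and as a false positive when the evidence shows it is unreachable. The sketch below applies that reading. The `Finding` type, the verdict mapping, and the `unknown` bucket are all assumptions for illustration and do not reflect the real input schema of `tools/compare.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical key for a finding: CVE id plus package URL."""
    cve: str
    purl: str


def classify(baseline, reachable):
    """Count baseline findings by reachability verdict.

    `baseline` is an iterable of Finding; `reachable` maps Finding to
    True (reachable) or False (unreachable). Findings without a verdict
    are counted as "unknown" rather than guessed.
    """
    counts = {"true_positives": 0, "false_positives": 0, "unknown": 0}
    for finding in baseline:
        verdict = reachable.get(finding)
        if verdict is None:
            counts["unknown"] += 1
        elif verdict:
            counts["true_positives"] += 1
        else:
            counts["false_positives"] += 1
    return counts
```

Keeping unknowns separate avoids inflating either metric when a finding has no reachability evidence yet.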
## Artifact Contracts
All bench artifacts must comply with:
1. **VEX Proof Bundle Schema** (`docs/benchmarks/vex-evidence-playbook.schema.json`)
- BLAKE3-256 primary hash, SHA-256 secondary
- Canonical JSON with sorted keys
- DSSE envelope with Rekor-ready digest
2. **Justification Catalog** (`docs/benchmarks/vex-justifications.catalog.json`)
- VEX1-VEX10 justification codes
- Required evidence types per justification
- Expiry and re-evaluation rules
3. **Reachability Graph** (`docs/contracts/richgraph-v1.md`)
- BLAKE3 graph_hash for content addressing
- Deterministic node/edge ordering
- SymbolID/EdgeID format compliance
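The "canonical JSON with sorted keys" requirement matters because a digest is only reproducible if the same document always serializes to the same bytes. Below is a minimal sketch of that idea. The exact canonicalization rules (compact separators, UTF-8, no ASCII escaping) are assumptions and the schema may specify something stricter; SHA-256 stands in for the digest because BLAKE3 requires a third-party package.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace (assumed rules)."""
    return json.dumps(
        obj,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")


def sha256_digest(obj) -> str:
    """Digest of the canonical form, prefixed with the algorithm name."""
    return "sha256:" + hashlib.sha256(canonical_json(obj)).hexdigest()
```

With this approach, two documents that differ only in key order produce the same digest, which is what makes content addressing stable across producers.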
## CI Integration
The bench directory is validated by:
- `.gitea/workflows/vex-proof-bundles.yml` - Verifies all proof bundles
- `.gitea/workflows/bench-determinism.yml` - Runs determinism benchmarks
- `.gitea/workflows/hybrid-attestation.yml` - Verifies graph/edge-bundle fixtures
## Contributing
1. Add new findings under `findings/CVE-YYYY-NNNNN/`
2. Include all required evidence artifacts
3. Generate DSSE envelope and Rekor proof
4. Update `results/summary.csv`
5. Run verification: `./tools/verify.sh findings/CVE-YYYY-NNNNN/decision.dsse.json`
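Before running the verifier, a contributor could check step 2 locally with a small script like the one below. It is a sketch only: the list of required paths is inferred from the Layout section and the example commands above (evidence files under `evidence/`, decision files at the finding root), and that nesting is an assumption that should be confirmed against the actual tree.

```python
from pathlib import Path

# Assumed layout, inferred from the Layout section; adjust if the tree differs.
REQUIRED_ARTIFACTS = (
    "evidence/reachability.json",
    "evidence/sbom.cdx.json",
    "decision.openvex.json",
    "decision.dsse.json",
    "rekor.txt",
    "metadata.json",
)


def missing_artifacts(finding_dir):
    """Return the required artifact paths that are absent from a finding directory."""
    root = Path(finding_dir)
    return [rel for rel in REQUIRED_ARTIFACTS if not (root / rel).is_file()]
```

For example, `missing_artifacts("findings/CVE-2021-44228")` returns an empty list when the bundle is complete under the assumed layout.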