up

2025-12-13 09:37:15 +02:00
parent e00f6365da
commit 6e45066e37
349 changed files with 17160 additions and 1867 deletions
--- a/bench/README.md
+++ b/bench/README.md
@@ -1,7 +1,7 @@
-# Stella Ops Bench Repository
+# Stella Ops Bench Repository

-> **Status:** Draft — aligns with `docs/benchmarks/vex-evidence-playbook.md` (Sprint 401).  
-> **Purpose:** Host reproducible VEX decisions and comparison data that prove Stella Ops’ signal quality vs. baseline scanners.
+> **Status:** Active · Last updated: 2025-12-13  
+> **Purpose:** Host reproducible VEX decisions, reachability evidence, and comparison data proving Stella Ops' signal quality vs. baseline scanners.

 ## Layout

@@ -11,20 +11,122 @@ bench/
  findings/                 # per CVE/product bundles
    CVE-YYYY-NNNNN/
      evidence/
-        reachability.json
-        sbom.cdx.json
-      decision.openvex.json
-      decision.dsse.json
-      rekor.txt
-      metadata.json
+        reachability.json   # richgraph-v1 excerpt
+        sbom.cdx.json       # CycloneDX SBOM
+      decision.openvex.json # OpenVEX decision
+      decision.dsse.json    # DSSE envelope
+      rekor.txt             # Rekor log index + inclusion proof
+      metadata.json         # finding metadata (purl, CVE, version)
  tools/
-    verify.sh               # DSSE + Rekor verifier
+    verify.sh               # DSSE + Rekor verifier (online)
    verify.py               # offline verifier
    compare.py              # baseline comparison script
-    replay.sh               # runs reachability replay manifolds
+    replay.sh               # runs reachability replay manifests
  results/
-    summary.csv
+    summary.csv             # aggregated metrics
    runs/<date>/...         # raw outputs + replay manifests
+  reachability-benchmark/   # reachability benchmark with JDK fixtures
 ```

-Refer to `docs/benchmarks/vex-evidence-playbook.md` for artifact contracts and automation tasks. The `bench/` tree will be populated once `BENCH-AUTO-401-019` and `DOCS-VEX-401-012` land.
+## Related Documentation
+
+| Document | Purpose |
+|----------|---------|
+| [VEX Evidence Playbook](../docs/benchmarks/vex-evidence-playbook.md) | Proof bundle schema, justification catalog, verification workflow |
+| [Hybrid Attestation](../docs/reachability/hybrid-attestation.md) | Graph-level and edge-bundle DSSE decisions |
+| [Function-Level Evidence](../docs/reachability/function-level-evidence.md) | Cross-module evidence chain guide |
+| [Deterministic Replay](../docs/replay/DETERMINISTIC_REPLAY.md) | Replay manifest specification |
+
+## Verification Workflows
+
+### Quick Verification (Online)
+
+```bash
+# Verify a VEX proof bundle with DSSE and Rekor (run from the bench/ directory)
+./tools/verify.sh findings/CVE-2021-44228/decision.dsse.json
+
+# Output:
+# ✓ DSSE signature valid
+# ✓ Rekor inclusion verified (log index: 12345678)
+# ✓ Evidence hashes match
+# ✓ Justification catalog membership confirmed
+```
+
+### Offline Verification
+
+```bash
+# Verify without network access
+python tools/verify.py \
+  --bundle findings/CVE-2021-44228/decision.dsse.json \
+  --cas-root ./findings/CVE-2021-44228/evidence/ \
+  --catalog ../docs/benchmarks/vex-justifications.catalog.json
+
+# Or use the VEX proof bundle verifier
+python ../scripts/vex/verify_proof_bundle.py \
+  --bundle ../tests/Vex/ProofBundles/sample-proof-bundle.json \
+  --cas-root ../tests/Vex/ProofBundles/cas/
+```
+
+### Reachability Graph Verification
+
+```bash
+# Verify graph DSSE
+stella graph verify --hash blake3:a1b2c3d4...
+
+# Verify with edge bundles
+stella graph verify --hash blake3:a1b2c3d4... --include-bundles
+
+# Offline with local CAS
+stella graph verify --hash blake3:a1b2c3d4... --cas-root ./offline-cas/
+```
+
+### Baseline Comparison
+
+```bash
+# Compare Stella Ops findings against baseline scanners
+python tools/compare.py \
+  --stellaops results/runs/2025-12-13/findings.json \
+  --baseline results/baselines/trivy-latest.json \
+  --output results/comparison-2025-12-13.csv
+
+# Metrics generated:
+# - True positives (reachability-confirmed)
+# - False positives (unreachable code paths)
+# - MTTD (mean time to detect)
+# - Reproducibility score
+```
+
+## Artifact Contracts
+
+All bench artifacts must comply with the following contracts (paths are relative to the repository root):
+
+1. **VEX Proof Bundle Schema** (`docs/benchmarks/vex-evidence-playbook.schema.json`)
+   - BLAKE3-256 primary hash, SHA-256 secondary
+   - Canonical JSON with sorted keys
+   - DSSE envelope with Rekor-ready digest
+
+2. **Justification Catalog** (`docs/benchmarks/vex-justifications.catalog.json`)
+   - VEX1-VEX10 justification codes
+   - Required evidence types per justification
+   - Expiry and re-evaluation rules
+
+3. **Reachability Graph** (`docs/contracts/richgraph-v1.md`)
+   - BLAKE3 graph_hash for content addressing
+   - Deterministic node/edge ordering
+   - SymbolID/EdgeID format compliance
+
+## CI Integration
+
+The bench directory is validated by these workflows (paths are relative to the repository root):
+
+- `.gitea/workflows/vex-proof-bundles.yml` - Verifies all proof bundles
+- `.gitea/workflows/bench-determinism.yml` - Runs determinism benchmarks
+- `.gitea/workflows/hybrid-attestation.yml` - Verifies graph/edge-bundle fixtures
+
+## Contributing
+
+1. Add new findings under `findings/CVE-YYYY-NNNNN/`
+2. Include all required evidence artifacts
+3. Generate DSSE envelope and Rekor proof
+4. Update `results/summary.csv`
+5. Run verification: `./tools/verify.sh findings/CVE-YYYY-NNNNN/decision.dsse.json`
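
Review note: the Artifact Contracts section added in this change asks for canonical JSON with sorted keys and a SHA-256 secondary hash. A minimal sketch of that idea follows, using only the Python standard library. The function names `canonical_json` and `sha256_digest`, the compact separators, the `sha256:` prefix (mirroring the `blake3:` prefix in the `stella graph verify` examples), and the sample metadata values are all assumptions for illustration. The authoritative canonicalization rules live in the schema file, which this diff does not show, and the BLAKE3 primary hash is omitted because it needs a third-party package.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize obj with sorted keys and no insignificant whitespace.

    Assumption: the real schema may define stricter rules (for example
    number formatting); this only demonstrates key ordering.
    """
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_digest(obj) -> str:
    """Return a prefixed SHA-256 digest of the canonical form of obj."""
    return "sha256:" + hashlib.sha256(canonical_json(obj)).hexdigest()


# Hypothetical finding metadata; field names follow the layout comment
# (purl, CVE, version), the values are illustrative only.
metadata_a = {
    "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1",
    "cve": "CVE-2021-44228",
    "version": "2.14.1",
}
# Same content, different key insertion order.
metadata_b = {
    "version": "2.14.1",
    "cve": "CVE-2021-44228",
    "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1",
}

# Key order no longer affects the serialized bytes or the digest.
assert canonical_json(metadata_a) == canonical_json(metadata_b)
assert sha256_digest(metadata_a) == sha256_digest(metadata_b)
```

Sorting keys before hashing is what makes the digest reproducible across tools that build the same object in a different order, which is the property the determinism workflow appears to rely on.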