Add Canonical JSON serialization library with tests and documentation
- Implemented CanonJson class for deterministic JSON serialization and hashing.
- Added unit tests for CanonJson covering key sorting, nested objects, arrays, and special characters.
- Created project files for the Canonical JSON library and its tests, including necessary package references.
- Added README.md for library usage and API reference.
- Introduced RabbitMqIntegrationFactAttribute for conditional RabbitMQ integration tests.
129
bench/determinism/README.md
Normal file
@@ -0,0 +1,129 @@
# Determinism Benchmark Suite

> **Purpose:** Verify that StellaOps produces bit-identical results across replays.
> **Status:** Active
> **Sprint:** SPRINT_3850_0001_0001 (Competitive Gap Closure)

## Overview

Determinism is a core differentiator for StellaOps:

- Same inputs → same outputs (bit-identical)
- Replay manifests enable audit verification
- No hidden state or environment leakage

## What Gets Tested

### Canonical JSON

- Object key ordering (alphabetical)
- Number formatting consistency
- UTF-8 encoding without BOM
- No whitespace variation
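
The canonical form can be sketched in a few lines of Python (illustrative only; the shipped CanonJson class is C#): sort keys, drop insignificant whitespace, emit UTF-8 bytes without a BOM, then hash. Key order in the input no longer matters.

```python
import hashlib
import json

def canon_hash(obj) -> str:
    # Canonical form: keys sorted, compact separators (no whitespace),
    # UTF-8 bytes without BOM (ensure_ascii=False keeps raw UTF-8).
    data = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

a = {"b": [1, 2], "a": {"y": "ü", "x": 1}}
b = {"a": {"x": 1, "y": "ü"}, "b": [1, 2]}  # same content, different key order
assert canon_hash(a) == canon_hash(b)
```

Repeating the hash any number of times yields the same digest, which is exactly what TC-001 below exercises.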

### Scan Manifests

- Same artifact + same feeds → same manifest hash
- Seed values propagate correctly
- Timestamp handling (fixed UTC)
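
The manifest-hash invariant amounts to canonically serializing every input that can affect the scan and hashing the result. A minimal Python sketch, assuming a hypothetical manifest shape (field names are illustrative, not the real schema):

```python
import hashlib
import json

def manifest_hash(artifact_digest: str, feed_snapshot: str,
                  seed: int, scan_time_utc: str) -> str:
    # Hypothetical manifest: every input that may affect scan output,
    # serialized canonically (sorted keys, compact separators).
    manifest = {
        "artifact": artifact_digest,
        "feeds": feed_snapshot,
        "seed": seed,
        "timestamp": scan_time_utc,  # fixed UTC value, never run-time wall clock
    }
    data = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()

h1 = manifest_hash("sha256:abc", "sha256:feed", 42, "2025-01-01T00:00:00Z")
h2 = manifest_hash("sha256:abc", "sha256:feed", 42, "2025-01-01T00:00:00Z")
assert h1 == h2  # same inputs → same manifest hash
```

Note the fixed timestamp: hashing a wall-clock value taken at run time would break replay, which is why timestamps are pinned to a fixed UTC value in the manifest.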

### Proof Bundles

- Root hash computation
- DSSE envelope determinism
- ProofLedger node ordering
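
Why node ordering matters for the root hash: a Merkle-style fold is only deterministic if the leaves are ordered canonically first. A sketch of the idea (an assumption for illustration; the actual ProofLedger node layout and hashing scheme are defined in the scanner codebase):

```python
import hashlib

def node_hash(payload: bytes) -> bytes:
    return hashlib.sha256(payload).digest()

def root_hash(nodes: list[bytes]) -> str:
    # Sort nodes canonically first: determinism requires that
    # insertion/arrival order never influences the root.
    level = [node_hash(n) for n in sorted(nodes)]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate last node on odd-sized levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

a = [b"stmt-1", b"stmt-2", b"stmt-3"]
b = [b"stmt-3", b"stmt-1", b"stmt-2"]  # same nodes, different arrival order
assert root_hash(a) == root_hash(b)
```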

### Score Computation

- Same manifest → same score
- Lattice merge is associative/commutative
- Policy rule ordering doesn't affect outcome

## Test Cases

### TC-001: Canonical JSON Determinism

```bash
# Run same object through CanonJson 100 times
# All hashes must match
```

### TC-002: Manifest Hash Stability

```bash
# Create manifest with identical inputs
# Verify ComputeHash() returns same value
```

### TC-003: Cross-Platform Determinism

```bash
# Run on Linux, Windows, macOS
# Compare output hashes
```

### TC-004: Feed Snapshot Determinism

```bash
# Same feed snapshot hash → same scan results
```

## Fixtures

```
fixtures/
├── sample-manifest.json
├── sample-ledger.json
├── expected-hashes.json
└── cross-platform/
    ├── linux-x64.hashes.json
    ├── windows-x64.hashes.json
    └── macos-arm64.hashes.json
```

## Running the Suite

```bash
# Run determinism tests
dotnet test tests/StellaOps.Determinism.Tests

# Run replay verification
./run-replay.sh --manifest fixtures/sample-manifest.json --runs 10

# Cross-platform verification (requires CI matrix)
./verify-cross-platform.sh
```

## Metrics

| Metric | Target | Description |
|--------|--------|-------------|
| Hash stability | 100% | All runs produce identical hash |
| Replay success | 100% | All replays match original |
| Cross-platform parity | 100% | Same hash across OS/arch |

## Integration with CI

```yaml
# .gitea/workflows/bench-determinism.yaml
name: Determinism Benchmark
on:
  push:
    paths:
      - 'src/__Libraries/StellaOps.Canonical.Json/**'
      - 'src/Scanner/__Libraries/StellaOps.Scanner.Core/**'
      - 'bench/determinism/**'

jobs:
  determinism:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run Determinism Tests
        run: dotnet test tests/StellaOps.Determinism.Tests
      - name: Capture Hashes
        run: ./bench/determinism/capture-hashes.sh
      - name: Upload Hashes
        uses: actions/upload-artifact@v4
        with:
          name: hashes-${{ matrix.os }}
          path: bench/determinism/results/
```
133
bench/determinism/run-replay.sh
Normal file
@@ -0,0 +1,133 @@
#!/usr/bin/env bash
# run-replay.sh
# Deterministic Replay Benchmark
# Sprint: SPRINT_3850_0001_0001

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
RESULTS_DIR="$SCRIPT_DIR/results/$(date -u +%Y%m%d_%H%M%S)"

# Parse arguments
MANIFEST_FILE=""
RUNS=5
VERBOSE=false

while [[ $# -gt 0 ]]; do
  case $1 in
    --manifest)
      MANIFEST_FILE="$2"
      shift 2
      ;;
    --runs)
      RUNS="$2"
      shift 2
      ;;
    --verbose|-v)
      VERBOSE=true
      shift
      ;;
    *)
      echo "Unknown option: $1"
      exit 1
      ;;
  esac
done

echo "╔════════════════════════════════════════════════╗"
echo "║         Deterministic Replay Benchmark         ║"
echo "╚════════════════════════════════════════════════╝"
echo ""
echo "Configuration:"
echo "  Manifest:    ${MANIFEST_FILE:-<default sample>}"
echo "  Runs:        $RUNS"
echo "  Results dir: $RESULTS_DIR"
echo ""

mkdir -p "$RESULTS_DIR"

# Use sample manifest if none provided
if [ -z "$MANIFEST_FILE" ] && [ -f "$SCRIPT_DIR/fixtures/sample-manifest.json" ]; then
  MANIFEST_FILE="$SCRIPT_DIR/fixtures/sample-manifest.json"
fi

# Initialize explicitly so ${#HASHES[@]} is safe under `set -u` on older bash
HASHES=()

echo "Running $RUNS iterations..."
echo ""

for i in $(seq 1 "$RUNS"); do
  echo -n "  Run $i: "

  OUTPUT_FILE="$RESULTS_DIR/run_$i.json"

  if command -v dotnet &> /dev/null; then
    # Run the replay service
    dotnet run --project "$SCRIPT_DIR/../../src/Scanner/StellaOps.Scanner.WebService" -- \
      replay \
      --manifest "$MANIFEST_FILE" \
      --output "$OUTPUT_FILE" \
      --format json 2>/dev/null || {
        echo "⊘ Skipped (replay command not available)"
        continue
      }

    if [ -f "$OUTPUT_FILE" ]; then
      HASH=$(sha256sum "$OUTPUT_FILE" | cut -d' ' -f1)
      HASHES+=("$HASH")
      echo "sha256:${HASH:0:16}..."
    else
      echo "⊘ No output generated"
    fi
  else
    echo "⊘ Skipped (dotnet not available)"
  fi
done

echo ""

# Verify all hashes match
if [ ${#HASHES[@]} -gt 1 ]; then
  FIRST_HASH="${HASHES[0]}"
  ALL_MATCH=true

  for hash in "${HASHES[@]}"; do
    if [ "$hash" != "$FIRST_HASH" ]; then
      ALL_MATCH=false
      break
    fi
  done

  echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
  echo "Results"
  echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

  if $ALL_MATCH; then
    echo "✓ PASS: All $RUNS runs produced identical output"
    echo "  Hash: sha256:$FIRST_HASH"
  else
    echo "✗ FAIL: Outputs differ between runs"
    echo ""
    echo "Hashes:"
    for i in "${!HASHES[@]}"; do
      echo "  Run $((i+1)): ${HASHES[$i]}"
    done
  fi
else
  echo "ℹ️  Insufficient runs to verify determinism"
fi

# Build the hash list separately so an empty array does not emit [""]
HASH_LIST=""
if [ ${#HASHES[@]} -gt 0 ]; then
  HASH_LIST=$(printf '"%s",' "${HASHES[@]}" | sed 's/,$//')
fi

# Create summary JSON
cat > "$RESULTS_DIR/summary.json" <<EOF
{
  "benchmark": "determinism-replay",
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "manifest": "$MANIFEST_FILE",
  "runs": $RUNS,
  "hashes": [$HASH_LIST],
  "deterministic": ${ALL_MATCH:-null}
}
EOF

echo ""
echo "Results saved to: $RESULTS_DIR"
117
bench/smart-diff/README.md
Normal file
@@ -0,0 +1,117 @@
# Smart-Diff Benchmark Suite

> **Purpose:** Prove deterministic smart-diff reduces noise compared to naive diff.
> **Status:** Active
> **Sprint:** SPRINT_3850_0001_0001 (Competitive Gap Closure)

## Overview

The Smart-Diff feature enables incremental scanning by:

1. Computing structural diffs of SBOMs/dependencies
2. Identifying only changed components
3. Avoiding redundant scanning of unchanged packages
4. Producing deterministic, reproducible diff results

## Test Cases

### TC-001: Layer-Aware Diff

Tests that Smart-Diff correctly handles container layer changes:

- Adding a layer
- Removing a layer
- Modifying a layer (same hash, different content)

### TC-002: Package Version Diff

Tests accurate detection of package version changes:

- Minor version bump
- Major version bump
- Pre-release version handling
- Epoch handling (RPM)

### TC-003: Noise Reduction

Compares smart-diff output vs naive diff for real-world images:

- Measure CVE count reduction
- Measure scanning time reduction
- Verify determinism (same inputs → same outputs)

### TC-004: Deterministic Ordering

Verifies that diff results are:

- Sorted by component PURL
- Ordered consistently across runs
- Independent of filesystem ordering
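
The ordering guarantee can be illustrated with a minimal Python sketch (illustrative only; the real implementation lives in the C# StellaOps.Scanner.SmartDiff library): changed entries are sorted by PURL so that hash-map or filesystem iteration order never leaks into the output.

```python
def smart_diff(base: dict[str, str], target: dict[str, str]) -> list[dict]:
    """base/target map component PURL to version; emit changes sorted by PURL."""
    changes = []
    for purl in base.keys() | target.keys():
        old, new = base.get(purl), target.get(purl)
        if old != new:  # added, removed, or version changed
            changes.append({"purl": purl, "old": old, "new": new})
    # Set-union iteration order is nondeterministic; sort so output is not.
    return sorted(changes, key=lambda c: c["purl"])

base = {"pkg:apk/alpine/openssl": "3.1.4", "pkg:apk/alpine/curl": "8.4.0"}
target = {"pkg:apk/alpine/openssl": "3.1.4", "pkg:apk/alpine/curl": "8.5.0",
          "pkg:apk/alpine/zlib": "1.3"}
assert smart_diff(base, target) == [
    {"purl": "pkg:apk/alpine/curl", "old": "8.4.0", "new": "8.5.0"},
    {"purl": "pkg:apk/alpine/zlib", "old": None, "new": "1.3"},
]
```

Unchanged packages (openssl here) are excluded entirely, which is the source of the noise reduction measured in TC-003.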

## Fixtures

```
fixtures/
├── base-alpine-3.18.sbom.cdx.json
├── base-alpine-3.19.sbom.cdx.json
├── layer-added.manifest.json
├── layer-removed.manifest.json
├── version-bump-minor.sbom.cdx.json
├── version-bump-major.sbom.cdx.json
└── expected/
    ├── tc001-layer-added.diff.json
    ├── tc001-layer-removed.diff.json
    ├── tc002-minor-bump.diff.json
    ├── tc002-major-bump.diff.json
    └── tc003-noise-reduction.metrics.json
```

## Running the Suite

```bash
# Run all smart-diff tests
dotnet test tests/StellaOps.Scanner.SmartDiff.Tests

# Run benchmark comparison
./run-benchmark.sh --baseline naive --compare smart

# Generate metrics report
./tools/analyze.py results/ --output metrics.csv
```

## Metrics Collected

| Metric | Description |
|--------|-------------|
| `diff_time_ms` | Time to compute diff |
| `changed_packages` | Number of packages marked as changed |
| `false_positive_rate` | Packages incorrectly flagged as changed |
| `determinism_score` | 1.0 if all runs produce identical output |
| `noise_reduction_pct` | % reduction vs naive diff |

## Expected Results

For typical Alpine base image upgrades (3.18 → 3.19):

- **Naive diff:** ~150 packages flagged as changed
- **Smart diff:** ~12 packages actually changed
- **Noise reduction:** ~92%
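
The `noise_reduction_pct` figure follows directly from the two counts:

```python
naive_flagged = 150    # packages a naive diff reports as changed
actually_changed = 12  # packages smart-diff reports (true changes)

noise_reduction_pct = (1 - actually_changed / naive_flagged) * 100
assert round(noise_reduction_pct) == 92
```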

## Integration with CI

```yaml
# .gitea/workflows/bench-smart-diff.yaml
name: Smart-Diff Benchmark
on:
  push:
    paths:
      - 'src/Scanner/__Libraries/StellaOps.Scanner.SmartDiff/**'
      - 'bench/smart-diff/**'

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Smart-Diff Benchmark
        run: ./bench/smart-diff/run-benchmark.sh
      - name: Upload Results
        uses: actions/upload-artifact@v4
        with:
          name: smart-diff-results
          path: bench/smart-diff/results/
```
135
bench/smart-diff/run-benchmark.sh
Normal file
@@ -0,0 +1,135 @@
#!/usr/bin/env bash
# run-benchmark.sh
# Smart-Diff Benchmark Runner
# Sprint: SPRINT_3850_0001_0001

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BENCH_ROOT="$SCRIPT_DIR"
RESULTS_DIR="$BENCH_ROOT/results/$(date -u +%Y%m%d_%H%M%S)"

# Parse arguments
BASELINE_MODE="naive"
COMPARE_MODE="smart"
VERBOSE=false

while [[ $# -gt 0 ]]; do
  case $1 in
    --baseline)
      BASELINE_MODE="$2"
      shift 2
      ;;
    --compare)
      COMPARE_MODE="$2"
      shift 2
      ;;
    --verbose|-v)
      VERBOSE=true
      shift
      ;;
    *)
      echo "Unknown option: $1"
      exit 1
      ;;
  esac
done

echo "╔════════════════════════════════════════════════╗"
echo "║           Smart-Diff Benchmark Suite           ║"
echo "╚════════════════════════════════════════════════╝"
echo ""
echo "Configuration:"
echo "  Baseline mode: $BASELINE_MODE"
echo "  Compare mode:  $COMPARE_MODE"
echo "  Results dir:   $RESULTS_DIR"
echo ""

mkdir -p "$RESULTS_DIR"

# Function to run a test case
run_test_case() {
  local test_id="$1"
  local description="$2"
  local base_sbom="$3"
  local target_sbom="$4"
  local expected_file="$5"

  echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
  echo "Test: $test_id - $description"
  echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

  # Declare and assign separately so a failing command substitution
  # is not masked by `local` under set -e
  local start_time
  start_time=$(date +%s%3N)

  # Run smart-diff
  if command -v dotnet &> /dev/null; then
    dotnet run --project "$SCRIPT_DIR/../../src/Scanner/__Libraries/StellaOps.Scanner.SmartDiff" -- \
      --base "$base_sbom" \
      --target "$target_sbom" \
      --output "$RESULTS_DIR/$test_id.diff.json" \
      --format json 2>/dev/null || true
  fi

  local end_time
  end_time=$(date +%s%3N)
  local elapsed=$((end_time - start_time))

  echo "  Time: ${elapsed}ms"

  # Verify determinism by running twice
  if [ -f "$RESULTS_DIR/$test_id.diff.json" ]; then
    local hash1
    hash1=$(sha256sum "$RESULTS_DIR/$test_id.diff.json" | cut -d' ' -f1)

    if command -v dotnet &> /dev/null; then
      dotnet run --project "$SCRIPT_DIR/../../src/Scanner/__Libraries/StellaOps.Scanner.SmartDiff" -- \
        --base "$base_sbom" \
        --target "$target_sbom" \
        --output "$RESULTS_DIR/$test_id.diff.run2.json" \
        --format json 2>/dev/null || true
    fi

    if [ -f "$RESULTS_DIR/$test_id.diff.run2.json" ]; then
      local hash2
      hash2=$(sha256sum "$RESULTS_DIR/$test_id.diff.run2.json" | cut -d' ' -f1)

      if [ "$hash1" = "$hash2" ]; then
        echo "  ✓ Determinism verified"
      else
        echo "  ✗ Determinism FAILED (different hashes)"
      fi
    fi
  else
    echo "  ⊘ Skipped (dotnet not available or project missing)"
  fi

  echo ""
}

# Test Case 1: Layer-Aware Diff (using fixtures)
if [ -f "$BENCH_ROOT/fixtures/base-alpine-3.18.sbom.cdx.json" ]; then
  run_test_case "TC-001-layer-added" \
    "Layer addition detection" \
    "$BENCH_ROOT/fixtures/base-alpine-3.18.sbom.cdx.json" \
    "$BENCH_ROOT/fixtures/base-alpine-3.19.sbom.cdx.json" \
    "$BENCH_ROOT/fixtures/expected/tc001-layer-added.diff.json"
else
  echo "ℹ️  Skipping TC-001: Fixtures not found"
  echo "   Run './tools/generate-fixtures.sh' to create test fixtures"
fi

# Generate summary
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Summary"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Results saved to: $RESULTS_DIR"

# Create summary JSON
cat > "$RESULTS_DIR/summary.json" <<EOF
{
  "benchmark": "smart-diff",
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "baseline_mode": "$BASELINE_MODE",
  "compare_mode": "$COMPARE_MODE",
  "results_dir": "$RESULTS_DIR"
}
EOF

echo "Done."
183
bench/unknowns/README.md
Normal file
@@ -0,0 +1,183 @@
# Unknowns Tracking Benchmark Suite

> **Purpose:** Verify epistemic uncertainty tracking and unknown state management.
> **Status:** Active
> **Sprint:** SPRINT_3850_0001_0001 (Competitive Gap Closure)

## Overview

StellaOps tracks "unknowns" - gaps in knowledge that affect confidence:

- Missing SBOM components
- Unmatched CVEs
- Stale feed data
- Zero-day windows
- Analysis limitations

## What Gets Tested

### Unknown State Lifecycle

1. Detection of unknown conditions
2. Propagation to affected findings
3. Score penalty application
4. Resolution tracking

### Unknown Categories

- `SBOM_GAP`: Component not in SBOM
- `CVE_UNMATCHED`: CVE without component mapping
- `FEED_STALE`: Feed data older than threshold
- `ZERO_DAY_WINDOW`: Time between disclosure and feed update
- `ANALYSIS_LIMIT`: Depth/timeout constraints

### Score Impact

- Each unknown type has a penalty weight
- Penalties reduce overall confidence
- Resolved unknowns restore confidence

## Test Cases

### TC-001: SBOM Gap Detection

```json
{
  "scenario": "Package in image not in SBOM",
  "input": {
    "image_packages": ["openssl@3.0.1", "curl@7.86"],
    "sbom_packages": ["openssl@3.0.1"]
  },
  "expected": {
    "unknowns": [{ "type": "SBOM_GAP", "package": "curl@7.86" }],
    "confidence_penalty": 0.15
  }
}
```

### TC-002: Zero-Day Window Tracking

```json
{
  "scenario": "CVE disclosed before feed update",
  "input": {
    "cve_disclosure": "2025-01-01T00:00:00Z",
    "feed_update": "2025-01-03T00:00:00Z",
    "scan_time": "2025-01-02T12:00:00Z"
  },
  "expected": {
    "unknowns": [{
      "type": "ZERO_DAY_WINDOW",
      "cve": "CVE-2025-0001",
      "window_hours": 36
    }],
    "risk_note": "Scan occurred during zero-day window"
  }
}
```
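
The 36-hour figure is the elapsed time from disclosure to scan, with the window still open because the feed had not yet updated at scan time. A quick check in Python:

```python
from datetime import datetime

def iso(ts: str) -> datetime:
    # fromisoformat rejects a trailing "Z" before Python 3.11; normalize it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

disclosure = iso("2025-01-01T00:00:00Z")
feed_update = iso("2025-01-03T00:00:00Z")
scan_time = iso("2025-01-02T12:00:00Z")

# Scan happened before the feed caught up → zero-day window is open,
# measured from disclosure to scan time.
assert scan_time < feed_update
window_hours = (scan_time - disclosure).total_seconds() / 3600
assert window_hours == 36.0
```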

### TC-003: Feed Staleness

```json
{
  "scenario": "NVD feed older than 24 hours",
  "input": {
    "feed_last_update": "2025-01-01T00:00:00Z",
    "scan_time": "2025-01-02T12:00:00Z",
    "staleness_threshold_hours": 24
  },
  "expected": {
    "unknowns": [{
      "type": "FEED_STALE",
      "feed": "nvd",
      "age_hours": 36
    }]
  }
}
```

### TC-004: Score Penalty Application

```json
{
  "scenario": "Multiple unknowns compound penalty",
  "input": {
    "base_confidence": 0.95,
    "unknowns": [
      { "type": "SBOM_GAP", "penalty": 0.15 },
      { "type": "FEED_STALE", "penalty": 0.10 }
    ]
  },
  "expected": {
    "final_confidence": 0.73,
    "penalty_formula": "0.95 * (1 - 0.15) * (1 - 0.10)"
  }
}
```
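
The multiplicative penalty model from TC-004, sketched in Python (illustrative; the shipped implementation is C#):

```python
def apply_penalties(base_confidence: float, penalties: list[float]) -> float:
    # Penalties compound multiplicatively: each unknown scales the
    # remaining confidence by (1 - penalty).
    confidence = base_confidence
    for p in penalties:
        confidence *= (1 - p)
    return confidence

final = apply_penalties(0.95, [0.15, 0.10])
assert abs(final - 0.72675) < 1e-9  # 0.95 * 0.85 * 0.90 ≈ 0.73
```

Because the model is multiplicative rather than additive, confidence can shrink toward zero but never goes negative, no matter how many unknowns accumulate.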

## Fixtures

```
fixtures/
├── sbom-gaps/
│   ├── single-missing.json
│   ├── multiple-missing.json
│   └── layer-specific.json
├── zero-day/
│   ├── within-window.json
│   ├── after-window.json
│   └── ongoing.json
├── feed-staleness/
│   ├── nvd-stale.json
│   ├── osv-stale.json
│   └── multiple-stale.json
└── expected/
    └── all-tests.results.json
```

## Running the Suite

```bash
# Run unknowns tests
dotnet test tests/StellaOps.Unknowns.Tests

# Run penalty calculation tests
./run-penalty-tests.sh

# Run full benchmark
./run-benchmark.sh --all
```

## Metrics

| Metric | Target | Description |
|--------|--------|-------------|
| Detection rate | 100% | All unknown conditions detected |
| Penalty accuracy | ±1% | Penalties match expected values |
| Resolution tracking | 100% | All resolutions properly logged |

## UI Integration

Unknowns appear as:

- Chips in findings table
- Warning banners on scan results
- Confidence reduction indicators
- Triage action suggestions

## Integration with CI

```yaml
# .gitea/workflows/bench-unknowns.yaml
name: Unknowns Benchmark
on:
  push:
    paths:
      - 'src/Unknowns/**'
      - 'bench/unknowns/**'

jobs:
  unknowns:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Unknowns Tests
        run: dotnet test tests/StellaOps.Unknowns.Tests
      - name: Run Benchmark
        run: ./bench/unknowns/run-benchmark.sh
```
153
bench/vex-lattice/README.md
Normal file
@@ -0,0 +1,153 @@
# VEX Lattice Benchmark Suite

> **Purpose:** Verify VEX lattice merge semantics and jurisdiction rules.
> **Status:** Active
> **Sprint:** SPRINT_3850_0001_0001 (Competitive Gap Closure)

## Overview

StellaOps implements VEX (Vulnerability Exploitability eXchange) with:

- Lattice-based merge semantics (stable outcomes)
- Jurisdiction-specific trust rules (US/EU/RU/CN)
- Source precedence and confidence weighting
- Deterministic conflict resolution

## What Gets Tested

### Lattice Properties

- Idempotency: merge(a, a) = a
- Commutativity: merge(a, b) = merge(b, a)
- Associativity: merge(merge(a, b), c) = merge(a, merge(b, c))
- Monotonicity: once "not_affected", never regresses
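
A join-on-precedence merge satisfies all four properties by construction. A minimal Python sketch, assuming the ranking from the Status Precedence list below (illustrative only; the real merge also weighs sources and jurisdictions):

```python
import itertools

# Rank statuses from weakest to strongest knowledge; merge takes the max.
RANK = {
    "affected_no_fix": 0,
    "under_investigation": 1,
    "affected_with_fix": 2,
    "not_affected": 3,
}

def merge(a: str, b: str) -> str:
    # Lattice join: the more specific (higher-ranked) status wins.
    return a if RANK[a] >= RANK[b] else b

statuses = list(RANK)
# Idempotency: merge(a, a) = a
assert all(merge(s, s) == s for s in statuses)
# Commutativity: merge(a, b) = merge(b, a)
assert all(merge(a, b) == merge(b, a)
           for a, b in itertools.product(statuses, repeat=2))
# Associativity: merge(merge(a, b), c) = merge(a, merge(b, c))
assert all(merge(merge(a, b), c) == merge(a, merge(b, c))
           for a, b, c in itertools.product(statuses, repeat=3))
# Monotonicity: once not_affected, merging anything never regresses it
assert all(merge("not_affected", s) == "not_affected" for s in statuses)
```

Any merge defined as the maximum over a total order inherits these algebraic properties, which is what makes the outcome stable regardless of the order in which VEX statements arrive.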

### Status Precedence

Order from most to least specific:

1. `not_affected` (strongest)
2. `affected` (with fix)
3. `under_investigation`
4. `affected` (no fix)

### Jurisdiction Rules

- US: FDA/NIST sources preferred
- EU: ENISA/BSI sources preferred
- RU: FSTEC sources preferred
- CN: CNVD sources preferred

## Test Cases

### TC-001: Idempotency

```json
{
  "input_a": { "status": "not_affected", "justification": "vulnerable_code_not_in_execute_path" },
  "input_b": { "status": "not_affected", "justification": "vulnerable_code_not_in_execute_path" },
  "expected": { "status": "not_affected", "justification": "vulnerable_code_not_in_execute_path" }
}
```

### TC-002: Commutativity

```json
{
  "merge_ab": "merge(vendor_vex, nvd_vex)",
  "merge_ba": "merge(nvd_vex, vendor_vex)",
  "expected": "identical_result"
}
```

### TC-003: Associativity

```json
{
  "lhs": "merge(merge(a, b), c)",
  "rhs": "merge(a, merge(b, c))",
  "expected": "identical_result"
}
```

### TC-004: Conflict Resolution

```json
{
  "vendor_says": "not_affected",
  "nvd_says": "affected",
  "expected": "not_affected",
  "reason": "vendor_has_higher_precedence"
}
```

### TC-005: Jurisdiction Override

```json
{
  "jurisdiction": "EU",
  "bsi_says": "not_affected",
  "nist_says": "affected",
  "expected": "not_affected",
  "reason": "bsi_preferred_in_eu"
}
```

## Fixtures

```
fixtures/
├── lattice-properties/
│   ├── idempotency.json
│   ├── commutativity.json
│   └── associativity.json
├── conflict-resolution/
│   ├── vendor-vs-nvd.json
│   ├── multiple-vendors.json
│   └── timestamp-tiebreaker.json
├── jurisdiction-rules/
│   ├── us-fda-nist.json
│   ├── eu-enisa-bsi.json
│   ├── ru-fstec.json
│   └── cn-cnvd.json
└── expected/
    └── all-tests.results.json
```

## Running the Suite

```bash
# Run VEX lattice tests
dotnet test tests/StellaOps.Policy.Vex.Tests

# Run lattice property verification
./run-lattice-tests.sh

# Run jurisdiction rule tests
./run-jurisdiction-tests.sh
```

## Metrics

| Metric | Target | Description |
|--------|--------|-------------|
| Lattice properties | 100% pass | All algebraic properties hold |
| Jurisdiction correctness | 100% pass | Correct source preferred by region |
| Merge determinism | 100% pass | Same inputs → same output |

## Integration with CI

```yaml
# .gitea/workflows/bench-vex-lattice.yaml
name: VEX Lattice Benchmark
on:
  push:
    paths:
      - 'src/Policy/**'
      - 'bench/vex-lattice/**'

jobs:
  lattice:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Lattice Tests
        run: dotnet test tests/StellaOps.Policy.Vex.Tests
      - name: Run Property Tests
        run: ./bench/vex-lattice/run-lattice-tests.sh
```