
StellaOps Test Infrastructure

This document describes the test infrastructure for StellaOps, including reachability corpus fixtures, benchmark automation, and CI integration.

Reachability Test Fixtures

Corpus Structure

The reachability corpus is located at tests/reachability/ and contains:

tests/reachability/
├── corpus/
│   ├── manifest.json          # SHA-256 hashes for all corpus files
│   ├── java/                  # Java test cases
│   │   └── <case-id>/
│   │       ├── project/       # Source code
│   │       ├── callgraph.json # Expected call graph
│   │       └── ground-truth.json
│   ├── dotnet/                # .NET test cases
│   └── native/                # Native (C/C++/Rust) test cases
├── fixtures/
│   └── reachbench-2025-expanded/
│       ├── INDEX.json         # Fixture index
│       └── cases/
│           └── <case-id>/
│               └── images/
│                   ├── reachable/
│                   │   └── reachgraph.truth.json
│                   └── unreachable/
│                       └── reachgraph.truth.json
└── StellaOps.Reachability.FixtureTests/
    ├── CorpusFixtureTests.cs
    └── ReachbenchFixtureTests.cs
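
The tree above notes that manifest.json records SHA-256 hashes for all corpus files. A minimal sketch of how such a check could work is shown below. The manifest layout used here (a flat map from relative path to hex digest) is an assumption made for illustration, as are the function names; the real manifest may be structured differently.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(corpus_dir: Path) -> list[str]:
    """Return relative paths that are missing or whose hash differs.

    ASSUMPTION: manifest.json is a flat {"relative/path": "<sha256 hex>"} map.
    """
    manifest_path = corpus_dir / "manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    mismatches = []
    for rel_path, expected in sorted(manifest.items()):
        target = corpus_dir / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

Calling verify_manifest(Path("tests/reachability/corpus")) would then return an empty list when every listed file matches its recorded digest.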

Ground-Truth Schema

All ground-truth files follow the reachbench.reachgraph.truth/v1 schema:

{
  "schema_version": "reachbench.reachgraph.truth/v1",
  "case_id": "CVE-2023-38545",
  "variant": "reachable",
  "paths": [
    {
      "entry_point": "main",
      "vulnerable_function": "curl_easy_perform",
      "frames": ["main", "do_http_request", "curl_easy_perform"]
    }
  ],
  "metadata": {
    "cve_id": "CVE-2023-38545",
    "purl": "pkg:generic/curl@8.4.0"
  }
}
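
To make the schema concrete, here is a hedged sketch of a structural check over a ground-truth document. Only the schema_version string comes from this document. The other rules (variant is either "reachable" or "unreachable", and each path's frames start at entry_point and end at vulnerable_function) are assumptions inferred from the example and the fixture directory names, and check_ground_truth is a hypothetical helper, not part of StellaOps.

```python
EXPECTED_SCHEMA = "reachbench.reachgraph.truth/v1"


def check_ground_truth(doc: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the doc looks valid."""
    problems = []
    if doc.get("schema_version") != EXPECTED_SCHEMA:
        problems.append("unexpected schema_version")
    # ASSUMPTION: the two variants mirror the reachable/ and unreachable/ directories.
    if doc.get("variant") not in ("reachable", "unreachable"):
        problems.append("variant must be 'reachable' or 'unreachable'")
    for index, path in enumerate(doc.get("paths", [])):
        frames = path.get("frames", [])
        if not frames:
            problems.append(f"paths[{index}]: frames is empty")
            continue
        # ASSUMPTION: frames run from the entry point to the vulnerable function.
        if frames[0] != path.get("entry_point"):
            problems.append(f"paths[{index}]: first frame is not entry_point")
        if frames[-1] != path.get("vulnerable_function"):
            problems.append(f"paths[{index}]: last frame is not vulnerable_function")
    return problems
```

The example document above passes these checks, since its single path begins at main and ends at curl_easy_perform.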

Running Fixture Tests

# Run all reachability fixture tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests

# Run only corpus tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~CorpusFixtureTests"

# Run only reachbench tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~ReachbenchFixtureTests"

# Cross-platform runner scripts
./scripts/reachability/run_all.sh       # Unix
./scripts/reachability/run_all.ps1      # Windows

CI Integration

The reachability corpus is validated in CI via .gitea/workflows/reachability-corpus-ci.yml:

  1. validate-corpus: Runs fixture tests, verifies SHA-256 hashes
  2. validate-ground-truths: Validates schema version and structure
  3. determinism-check: Ensures JSON files have sorted keys
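
As an illustration of the determinism-check step, the sketch below tests whether every object in a JSON document lists its keys in sorted order. The workflow's exact ordering rule is not stated here, so plain Python string ordering is assumed, and has_sorted_keys is a hypothetical helper rather than the script the workflow runs.

```python
import json


def has_sorted_keys(text: str) -> bool:
    """Return True if every JSON object in `text` lists its keys in sorted order."""
    unsorted_found = False

    def check_pairs(pairs):
        # json calls this hook for each object, innermost first,
        # passing (key, value) pairs in document order.
        nonlocal unsorted_found
        keys = [key for key, _ in pairs]
        if keys != sorted(keys):
            unsorted_found = True
        return dict(pairs)

    json.loads(text, object_pairs_hook=check_pairs)
    return not unsorted_found
```

Using object_pairs_hook matters because a plain json.loads followed by a comparison would lose nothing in modern Python, but the hook makes the dependence on document order explicit and covers nested objects in a single pass.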

Triggers:

  • Push/PR to paths: tests/reachability/**, scripts/reachability/**
  • Manual workflow dispatch

CAS Layout Reference

Content-Addressable Storage Paths

StellaOps uses BLAKE3 hashes for content-addressable storage:

| Artifact Type   | CAS Path Pattern                          | Example                                     |
|-----------------|-------------------------------------------|---------------------------------------------|
| Call Graph      | cas://reachability/graphs/{blake3}        | cas://reachability/graphs/3a7f2b...         |
| Runtime Facts   | cas://reachability/runtime-facts/{blake3} | cas://reachability/runtime-facts/8c4d1e...  |
| Replay Manifest | cas://reachability/replay/{blake3}        | cas://reachability/replay/f2e9c8...         |
| Evidence Bundle | cas://reachability/evidence/{blake3}      | cas://reachability/evidence/a1b2c3...       |
| DSSE Envelope   | cas://attestation/dsse/{blake3}           | cas://attestation/dsse/d4e5f6...            |
| Symbol Manifest | cas://symbols/manifests/{blake3}          | cas://symbols/manifests/7a8b9c...           |

Hash Algorithm

All CAS URIs use BLAKE3 with base16 (hex) encoding:

cas://{namespace}/{artifact-type}/{blake3-hex}

Example hash computation:

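# Use BLAKE3 for CAS hashing (requires the third-party `blake3` package)
from blake3 import blake3

with open("callgraph.json", "rb") as f:  # illustrative file name; any artifact works
    file_content = f.read()
content_hash = blake3(file_content).hexdigest()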
Replay Workflow

Replay Manifest v2 Schema

{
  "version": 2,
  "hashAlg": "blake3",
  "hash": "blake3:3a7f2b...",
  "created_at": "2025-12-14T00:00:00Z",
  "entries": [
    {
      "type": "callgraph",
      "cas_uri": "cas://reachability/graphs/3a7f2b...",
      "hash": "blake3:3a7f2b..."
    },
    {
      "type": "runtime-facts",
      "cas_uri": "cas://reachability/runtime-facts/8c4d1e...",
      "hash": "blake3:8c4d1e..."
    }
  ],
  "code_id_coverage": 0.95
}

Replay Steps

  1. Export replay manifest:

    stella replay export --scan-id <scan-id> --output replay-manifest.json
    
  2. Validate manifest integrity:

    stella replay validate --manifest replay-manifest.json
    
  3. Fetch CAS artifacts (online):

    stella replay fetch --manifest replay-manifest.json --output ./artifacts/
    
  4. Import for replay (air-gapped):

    stella replay import --bundle replay-bundle.tar.gz --verify
    
  5. Execute replay:

    stella replay run --manifest replay-manifest.json --compare-to <baseline-hash>
    

Validation Error Codes

| Code                            | Description                                |
|---------------------------------|--------------------------------------------|
| REPLAY_MANIFEST_MISSING_VERSION | Manifest missing version field             |
| VERSION_MISMATCH                | Unexpected manifest version                |
| MISSING_HASH_ALG                | Hash algorithm not specified               |
| UNSORTED_ENTRIES                | CAS entries not sorted (non-deterministic) |
| CAS_NOT_FOUND                   | Referenced CAS artifact missing            |
| HASH_MISMATCH                   | Computed hash differs from declared        |
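
The structural codes above can be illustrated with a small checker over a parsed manifest. This is a sketch under stated assumptions, not the logic behind stella replay validate: the sort key for entries is assumed to be cas_uri, the expected version is taken from the v2 schema shown earlier, and CAS_NOT_FOUND and HASH_MISMATCH are left out because they need access to the artifact store.

```python
def validate_manifest_structure(manifest: dict, expected_version: int = 2) -> list[str]:
    """Return the structural error codes that apply to a replay manifest."""
    errors = []
    if "version" not in manifest:
        errors.append("REPLAY_MANIFEST_MISSING_VERSION")
    elif manifest["version"] != expected_version:
        errors.append("VERSION_MISMATCH")
    if not manifest.get("hashAlg"):
        errors.append("MISSING_HASH_ALG")
    # ASSUMPTION: entries are expected in ascending order of cas_uri.
    uris = [entry.get("cas_uri", "") for entry in manifest.get("entries", [])]
    if uris != sorted(uris):
        errors.append("UNSORTED_ENTRIES")
    return errors
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass, which suits CI output.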

Benchmark Automation

Running Benchmarks

# Full benchmark pipeline
./scripts/bench/run-baseline.sh --all

# Individual steps
./scripts/bench/run-baseline.sh --populate   # Generate findings from fixtures
./scripts/bench/run-baseline.sh --compute    # Compute metrics

# Compare with baseline scanner
./scripts/bench/run-baseline.sh --compare baseline-results.json

Benchmark Outputs

Results are written to bench/results/:

  • summary.csv: Per-run metrics (TP, FP, TN, FN, precision, recall, F1)
  • metrics.json: Detailed findings with evidence hashes
  • replay/: Replay outputs for verification
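
For reference, the derived columns in summary.csv follow the standard definitions of precision, recall, and F1. The sketch below computes them from raw counts; TN does not enter any of the three. Returning 0.0 when a denominator is zero is a convention chosen here for illustration, and the benchmark scripts may handle that case differently.

```python
def classification_metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    # Precision: share of reported findings that are truly reachable.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of truly reachable cases that were reported.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```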

Verification Tools

# Online verification (DSSE + Rekor)
./bench/tools/verify.sh <finding-bundle>

# Offline verification
python3 bench/tools/verify.py --bundle <finding-dir> --offline

# Compare scanners
python3 bench/tools/compare.py --baseline <scanner-results> --json

References