
StellaOps Test Infrastructure

This document describes the test infrastructure for StellaOps, including reachability corpus fixtures, benchmark automation, and CI integration.

Reachability Test Fixtures

Corpus Structure

The reachability corpus is located at tests/reachability/ and contains:

tests/reachability/
├── corpus/
│   ├── manifest.json          # SHA-256 hashes for all corpus files
│   ├── java/                  # Java test cases
│   │   └── <case-id>/
│   │       ├── project/       # Source code
│   │       ├── callgraph.json # Expected call graph
│   │       └── ground-truth.json
│   ├── dotnet/                # .NET test cases
│   └── native/                # Native (C/C++/Rust) test cases
├── fixtures/
│   └── reachbench-2025-expanded/
│       ├── INDEX.json         # Fixture index
│       └── cases/
│           └── <case-id>/
│               └── images/
│                   ├── reachable/
│                   │   └── reachgraph.truth.json
│                   └── unreachable/
│                       └── reachgraph.truth.json
└── StellaOps.Reachability.FixtureTests/
    ├── CorpusFixtureTests.cs
    └── ReachbenchFixtureTests.cs
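
The tree above notes that manifest.json holds SHA-256 hashes for the corpus files, but it does not show the manifest's layout. As a rough sketch only, assuming (hypothetically) that the manifest is a flat JSON object mapping corpus-relative paths to hex digests, a verification pass could look like the following. The helper names `sha256_file` and `verify_manifest` are illustrative and are not part of the repository.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(corpus_dir: Path) -> list[str]:
    """Return the relative paths that are missing or whose hash does not match.

    ASSUMPTION: manifest.json is a flat {"relative/path": "<sha256-hex>"} map.
    The real manifest layout may differ.
    """
    manifest_path = corpus_dir / "manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    failures = []
    for rel_path, expected in sorted(manifest.items()):
        target = corpus_dir / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty result means every listed file exists and matches its recorded digest. Files present on disk but absent from the manifest are not reported by this sketch.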

Ground-Truth Schema

All ground-truth files follow the reachbench.reachgraph.truth/v1 schema:

{
  "schema_version": "reachbench.reachgraph.truth/v1",
  "case_id": "CVE-2023-38545",
  "variant": "reachable",
  "paths": [
    {
      "entry_point": "main",
      "vulnerable_function": "curl_easy_perform",
      "frames": ["main", "do_http_request", "curl_easy_perform"]
    }
  ],
  "metadata": {
    "cve_id": "CVE-2023-38545",
    "purl": "pkg:generic/curl@8.4.0"
  }
}
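
A structural check over a document like the one above might be sketched as follows. The rules here are inferred from the example alone: the allowed variants and the expectation that `frames` runs from `entry_point` to `vulnerable_function` are assumptions, and `validate_ground_truth` is a hypothetical helper rather than the validator used in CI.

```python
EXPECTED_SCHEMA = "reachbench.reachgraph.truth/v1"


def validate_ground_truth(doc: dict) -> list[str]:
    """Return human-readable problems; an empty list means the document passed."""
    problems = []
    if doc.get("schema_version") != EXPECTED_SCHEMA:
        problems.append("unexpected schema_version")
    if not isinstance(doc.get("case_id"), str):
        problems.append("case_id must be a string")
    # ASSUMPTION: only the two variants seen in the fixture tree are allowed.
    if doc.get("variant") not in ("reachable", "unreachable"):
        problems.append("variant must be 'reachable' or 'unreachable'")
    paths = doc.get("paths")
    if not isinstance(paths, list):
        problems.append("paths must be a list")
        paths = []
    for index, path in enumerate(paths):
        if not isinstance(path, dict):
            problems.append(f"paths[{index}]: must be an object")
            continue
        frames = path.get("frames") or []
        # ASSUMPTION: frames start at the entry point and end at the
        # vulnerable function, as in the example document.
        if not frames:
            problems.append(f"paths[{index}]: frames is empty")
        elif (frames[0] != path.get("entry_point")
              or frames[-1] != path.get("vulnerable_function")):
            problems.append(f"paths[{index}]: frames do not span the path")
    return problems
```

Returning a list of problems rather than raising on the first one lets a CI job report every defect in a file in a single run.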

Running Fixture Tests

# Run all reachability fixture tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests

# Run only corpus tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~CorpusFixtureTests"

# Run only reachbench tests
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~ReachbenchFixtureTests"

# Cross-platform runner scripts
./scripts/reachability/run_all.sh       # Unix
./scripts/reachability/run_all.ps1      # Windows

CI Integration

The reachability corpus is validated in CI via .gitea/workflows/reachability-corpus-ci.yml:

  1. validate-corpus: Runs fixture tests, verifies SHA-256 hashes
  2. validate-ground-truths: Validates schema version and structure
  3. determinism-check: Ensures JSON files have sorted keys

Triggers:

  • Push/PR to paths: tests/reachability/**, scripts/reachability/**
  • Manual workflow dispatch
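
To illustrate what the determinism-check step is described as doing, here is a minimal sorted-keys check. It is a sketch and not the workflow's actual script: the function name `has_sorted_keys` is hypothetical, and ordering by Unicode code point (Python's default string ordering) is an assumption.

```python
import json


def has_sorted_keys(text: str) -> bool:
    """Return True if every JSON object in `text` lists its keys in sorted order."""
    ok = True

    def check_pairs(pairs):
        # Called once per JSON object, nested objects included, with the
        # (key, value) pairs in the order they appear in the source text.
        nonlocal ok
        keys = [key for key, _ in pairs]
        if keys != sorted(keys):
            ok = False
        return dict(pairs)

    json.loads(text, object_pairs_hook=check_pairs)
    return ok
```

Using `object_pairs_hook` matters here: a plain `json.loads` followed by a comparison against `json.dumps(..., sort_keys=True)` would also flag whitespace differences, which may or may not be what the check intends.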

CAS Layout Reference

Content-Addressable Storage Paths

StellaOps uses BLAKE3 hashes for content-addressable storage:

Artifact Type     CAS Path Pattern                            Example
---------------   -----------------------------------------   ------------------------------------------
Call Graph        cas://reachability/graphs/{blake3}          cas://reachability/graphs/3a7f2b...
Runtime Facts     cas://reachability/runtime-facts/{blake3}   cas://reachability/runtime-facts/8c4d1e...
Replay Manifest   cas://reachability/replay/{blake3}          cas://reachability/replay/f2e9c8...
Evidence Bundle   cas://reachability/evidence/{blake3}        cas://reachability/evidence/a1b2c3...
DSSE Envelope     cas://attestation/dsse/{blake3}             cas://attestation/dsse/d4e5f6...
Symbol Manifest   cas://symbols/manifests/{blake3}            cas://symbols/manifests/7g8h9i...

Hash Algorithm

All CAS URIs use BLAKE3 with base16 (hex) encoding:

cas://{namespace}/{artifact-type}/{blake3-hex}

Example hash computation:

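# BLAKE3 is not in hashlib; this uses the third-party blake3 package
from blake3 import blake3

# Read the artifact's raw bytes (any artifact file; callgraph.json as an example)
with open("callgraph.json", "rb") as f:
    file_content = f.read()
content_hash = blake3(file_content).hexdigest()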
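
Once a hex digest is available, assembling and parsing URIs of the form shown above is mostly string handling. The sketch below uses only the standard library. It assumes a default-length BLAKE3 digest (32 bytes, so 64 hex characters), lowercase hex, and namespace and artifact-type segments limited to lowercase letters, digits, and hyphens; none of these constraints is stated in this document, and `build_cas_uri` and `parse_cas_uri` are hypothetical names.

```python
import re

# ASSUMPTION: 64 lowercase hex characters (default 32-byte BLAKE3 output) and
# lowercase/digit/hyphen path segments. The real rules may be looser or stricter.
_CAS_URI = re.compile(
    r"cas://(?P<namespace>[a-z0-9-]+)/(?P<artifact_type>[a-z0-9-]+)/(?P<digest>[0-9a-f]{64})"
)


def build_cas_uri(namespace: str, artifact_type: str, digest_hex: str) -> str:
    """Assemble a CAS URI and reject anything that does not fit the pattern."""
    uri = f"cas://{namespace}/{artifact_type}/{digest_hex.lower()}"
    if _CAS_URI.fullmatch(uri) is None:
        raise ValueError(f"not a valid CAS URI: {uri!r}")
    return uri


def parse_cas_uri(uri: str) -> tuple[str, str, str]:
    """Split a CAS URI into (namespace, artifact_type, digest_hex)."""
    match = _CAS_URI.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a valid CAS URI: {uri!r}")
    return match["namespace"], match["artifact_type"], match["digest"]
```

For example, `build_cas_uri("reachability", "graphs", content_hash)` would produce a URI in the Call Graph row's pattern, and `parse_cas_uri` rejects digests that are truncated or contain non-hex characters.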

Replay Workflow

Replay Manifest v2 Schema

{
  "version": 2,
  "hashAlg": "blake3",
  "hash": "blake3:3a7f2b...",
  "created_at": "2025-12-14T00:00:00Z",
  "entries": [
    {
      "type": "callgraph",
      "cas_uri": "cas://reachability/graphs/3a7f2b...",
      "hash": "blake3:3a7f2b..."
    },
    {
      "type": "runtime-facts",
      "cas_uri": "cas://reachability/runtime-facts/8c4d1e...",
      "hash": "blake3:8c4d1e..."
    }
  ],
  "code_id_coverage": 0.95
}

Replay Steps

  1. Export replay manifest:

    stella replay export --scan-id <scan-id> --output replay-manifest.json
    
  2. Validate manifest integrity:

    stella replay validate --manifest replay-manifest.json
    
  3. Fetch CAS artifacts (online):

    stella replay fetch --manifest replay-manifest.json --output ./artifacts/
    
  4. Import for replay (air-gapped):

    stella replay import --bundle replay-bundle.tar.gz --verify
    
  5. Execute replay:

    stella replay run --manifest replay-manifest.json --compare-to <baseline-hash>
    

Validation Error Codes

Code                              Description
-------------------------------   ------------------------------------------
REPLAY_MANIFEST_MISSING_VERSION   Manifest missing version field
VERSION_MISMATCH                  Unexpected manifest version
MISSING_HASH_ALG                  Hash algorithm not specified
UNSORTED_ENTRIES                  CAS entries not sorted (non-deterministic)
CAS_NOT_FOUND                     Referenced CAS artifact missing
HASH_MISMATCH                     Computed hash differs from declared
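
The following sketch shows how checks producing these codes might fit together. It is not the `stella replay validate` implementation. It assumes that version 2 is the expected version, that entries are expected to sort by `cas_uri`, that the store can be modelled as a mapping from CAS URI to bytes, and that the caller supplies a `digest` function returning strings in the `"blake3:<hex>"` form used by the manifest.

```python
from typing import Callable, Mapping


def validate_replay_manifest(
    manifest: dict,
    cas_store: Mapping[str, bytes],
    digest: Callable[[bytes], str],
) -> list[str]:
    """Return the error codes a manifest triggers; an empty list means it passed.

    ASSUMPTIONS: expected version is 2, entries sort by cas_uri, and `digest`
    returns "<alg>:<hex>" strings comparable with each entry's "hash" field.
    """
    # Version problems make the rest of the manifest unreliable, so stop early.
    if "version" not in manifest:
        return ["REPLAY_MANIFEST_MISSING_VERSION"]
    if manifest["version"] != 2:
        return ["VERSION_MISMATCH"]

    errors = []
    if not manifest.get("hashAlg"):
        errors.append("MISSING_HASH_ALG")

    entries = manifest.get("entries", [])
    uris = [entry["cas_uri"] for entry in entries]
    if uris != sorted(uris):
        errors.append("UNSORTED_ENTRIES")

    for entry in entries:
        content = cas_store.get(entry["cas_uri"])
        if content is None:
            errors.append("CAS_NOT_FOUND")
        elif digest(content) != entry["hash"]:
            errors.append("HASH_MISMATCH")
    return errors
```

Passing the digest function in keeps the sketch free of third-party imports; with the `blake3` package installed, a caller could supply `lambda data: "blake3:" + blake3(data).hexdigest()`.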

Benchmark Automation

Running Benchmarks

# Full benchmark pipeline
./scripts/bench/run-baseline.sh --all

# Individual steps
./scripts/bench/run-baseline.sh --populate   # Generate findings from fixtures
./scripts/bench/run-baseline.sh --compute    # Compute metrics

# Compare with baseline scanner
./scripts/bench/run-baseline.sh --compare baseline-results.json

Benchmark Outputs

Results are written to bench/results/:

  • summary.csv: Per-run metrics (TP, FP, TN, FN, precision, recall, F1)
  • metrics.json: Detailed findings with evidence hashes
  • replay/: Replay outputs for verification
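
For reference, the precision, recall, and F1 columns follow from the TP, FP, and FN counts by the standard definitions. The sketch below is not taken from the benchmark scripts; the function name is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def classification_metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Compute precision, recall and F1 from confusion-matrix counts.

    ASSUMPTION: an undefined ratio (zero denominator) is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, 8 true positives with 2 false positives and 2 false negatives gives precision, recall, and F1 of 0.8 each. True negatives do not enter any of the three metrics.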

Verification Tools

# Online verification (DSSE + Rekor)
./bench/tools/verify.sh <finding-bundle>

# Offline verification
python3 bench/tools/verify.py --bundle <finding-dir> --offline

# Compare scanners
python3 bench/tools/compare.py --baseline <scanner-results> --json

References