Performance Baselines

Overview

This document defines performance baselines for StellaOps scanner operations. All metrics are measured against reference images and workloads to ensure consistent, reproducible benchmarks.

Last Updated: 2025-12-14
Next Review: 2026-03-14


Reference Images

Standard images used for performance benchmarking:

| Image | Size | Components | Expected Vulns | Category |
| --- | --- | --- | --- | --- |
| alpine:3.19 | 7MB | ~15 | ~5 | Minimal |
| debian:12-slim | 75MB | ~90 | ~40 | Minimal |
| ubuntu:22.04 | 77MB | ~100 | ~50 | Standard |
| node:20-alpine | 180MB | ~200 | ~100 | Application |
| python:3.12 | 1GB | ~300 | ~150 | Application |
| mcr.microsoft.com/dotnet/aspnet:8.0 | 220MB | ~150 | ~75 | Application |
| nginx:1.25 | 190MB | ~120 | ~60 | Application |
| postgres:16-alpine | 240MB | ~140 | ~70 | Database |

Scan Performance Targets

Container Image Scanning

| Image Category | P50 Time | P95 Time | Max Memory | CPU Cores |
| --- | --- | --- | --- | --- |
| Minimal (<100MB) | < 5s | < 10s | < 256MB | 1 |
| Standard (100-500MB) | < 15s | < 30s | < 512MB | 2 |
| Large (500MB-2GB) | < 45s | < 90s | < 1.5GB | 2 |
| Very Large (>2GB) | < 120s | < 240s | < 2GB | 4 |
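
The category boundaries above can be applied mechanically when deciding which targets an image is held to. The following is a minimal sketch, not part of the StellaOps tooling: the function name `scan_category` is hypothetical, sizes are given in MB, and two details are assumptions because the table does not state them: 2GB is treated as 2048MB, and a size that sits exactly on a boundary goes to the larger category (except 2GB itself, which the table places in Large).

```shell
# Map an image size in MB to its scan category and targets.
# Boundary handling and 2GB = 2048MB are assumptions of this sketch.
scan_category() {
  awk -v mb="$1" 'BEGIN {
    if (mb < 100)        print "Minimal p50<5s p95<10s mem<256MB"
    else if (mb < 500)   print "Standard p50<15s p95<30s mem<512MB"
    else if (mb <= 2048) print "Large p50<45s p95<90s mem<1.5GB"
    else                 print "Very Large p50<120s p95<240s mem<2GB"
  }'
}

scan_category 180   # a 180MB image falls in the Standard row
```

Note that the Per-Image Targets below are set individually and do not always coincide with the category row an image falls into.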

Per-Image Targets

| Image | P50 Time | P95 Time | Max Memory |
| --- | --- | --- | --- |
| alpine:3.19 | < 3s | < 8s | < 200MB |
| debian:12-slim | < 8s | < 15s | < 300MB |
| ubuntu:22.04 | < 10s | < 20s | < 400MB |
| node:20-alpine | < 20s | < 40s | < 600MB |
| python:3.12 | < 35s | < 70s | < 1.2GB |
| dotnet/aspnet:8.0 | < 25s | < 50s | < 800MB |
| nginx:1.25 | < 18s | < 35s | < 500MB |
| postgres:16-alpine | < 22s | < 45s | < 600MB |

Reachability Analysis Targets

By Codebase Size

| Codebase Size | P50 Time | P95 Time | Memory | Notes |
| --- | --- | --- | --- | --- |
| Tiny (<5k LOC) | < 10s | < 20s | < 256MB | Single service |
| Small (5-20k LOC) | < 30s | < 60s | < 512MB | Small service |
| Medium (20-50k LOC) | < 2min | < 4min | < 1GB | Typical microservice |
| Large (50-100k LOC) | < 5min | < 10min | < 2GB | Large service |
| Very Large (100-500k LOC) | < 15min | < 30min | < 4GB | Monolith |
| Monorepo (>500k LOC) | < 45min | < 90min | < 8GB | Enterprise monorepo |

By Language

| Language | Relative Analysis Time | Notes |
| --- | --- | --- |
| Go | 1.0x (baseline) | Fast due to simple module system |
| Java | 1.2x | Maven/Gradle resolution adds overhead |
| C# | 1.3x | MSBuild/NuGet resolution |
| TypeScript | 1.5x | npm/yarn resolution, complex imports |
| Python | 1.8x | Virtual env resolution, dynamic imports |
| JavaScript | 2.0x | Complex bundler configurations |

SBOM Generation Targets

| Format | P50 Time | P95 Time | Output Size | Notes |
| --- | --- | --- | --- | --- |
| CycloneDX 1.6 (JSON) | < 1s | < 3s | ~50KB/100 components | Standard |
| CycloneDX 1.6 (XML) | < 1.5s | < 4s | ~80KB/100 components | Verbose |
| SPDX 3.0.1 (JSON) | < 1s | < 3s | ~60KB/100 components | Standard |
| SPDX 3.0.1 (Tag-Value) | < 1.2s | < 3.5s | ~70KB/100 components | Legacy format |
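
The Output Size column is stated per 100 components, so an expected size for a given image follows by linear scaling. The sketch below does that arithmetic; it assumes size grows linearly with component count, and the function name `expected_sbom_kb` and the format keys are hypothetical labels chosen for illustration, not StellaOps CLI values.

```shell
# Estimate SBOM output size in KB from a component count, scaling the
# per-100-component figures from the table linearly (an assumption).
expected_sbom_kb() {
  awk -v fmt="$1" -v n="$2" 'BEGIN {
    per100["cyclonedx-json"] = 50
    per100["cyclonedx-xml"]  = 80
    per100["spdx-json"]      = 60
    per100["spdx-tag-value"] = 70
    if (!(fmt in per100)) { print "unknown format: " fmt > "/dev/stderr"; exit 1 }
    printf "%.0f\n", per100[fmt] * n / 100
  }'
}

# node:20-alpine is listed with ~200 components
expected_sbom_kb cyclonedx-json 200   # prints 100
```

A measured file size far above the estimate is a hint to check for duplicated components or unexpectedly verbose metadata.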

Combined Operations

| Operation | P50 Time | P95 Time |
| --- | --- | --- |
| Scan + SBOM | scan_time + 1s | scan_time + 3s |
| Scan + SBOM + Reachability | scan_time + reach_time + 2s | scan_time + reach_time + 5s |
| Full attestation pipeline | total_time + 2s | total_time + 5s |
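
As a worked example of the Scan + SBOM + Reachability row, the sketch below adds the fixed overhead (2s at P50, 5s at P95) to the individual scan and reachability targets. The function name `pipeline_budget` is hypothetical, and all inputs are assumed to be in seconds.

```shell
# Compute the P50/P95 budget for Scan + SBOM + Reachability.
# Arguments (seconds): scan P50, scan P95, reachability P50, reachability P95.
pipeline_budget() {
  awk -v sp50="$1" -v sp95="$2" -v rp50="$3" -v rp95="$4" 'BEGIN {
    printf "p50=%gs p95=%gs\n", sp50 + rp50 + 2, sp95 + rp95 + 5
  }'
}

# alpine:3.19 scan targets (3s, 8s) plus Tiny codebase reachability targets (10s, 20s)
pipeline_budget 3 8 10 20   # prints p50=15s p95=33s
```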

VEX Processing Targets

| Operation | P50 Time | P95 Time | Notes |
| --- | --- | --- | --- |
| VEX document parsing | < 50ms | < 150ms | Per document |
| Lattice state computation | < 100ms | < 300ms | Per 100 vulnerabilities |
| VEX consensus merge | < 200ms | < 500ms | 3-5 sources |
| State transition | < 10ms | < 30ms | Single transition |

CVSS Scoring Targets

| Operation | P50 Time | P95 Time | Notes |
| --- | --- | --- | --- |
| MacroVector lookup | < 1μs | < 5μs | Dictionary lookup |
| CVSS v4.0 base score | < 10μs | < 50μs | Full computation |
| CVSS v4.0 full score | < 20μs | < 100μs | Base + threat + env |
| Vector parsing | < 5μs | < 20μs | String parsing |
| Receipt generation | < 100μs | < 500μs | Includes hashing |
| Batch scoring (100 vulns) | < 5ms | < 15ms | Parallel processing |

Attestation Targets

| Operation | P50 Time | P95 Time | Notes |
| --- | --- | --- | --- |
| DSSE envelope creation | < 50ms | < 150ms | Includes signing |
| DSSE verification | < 30ms | < 100ms | Signature check |
| Rekor submission | < 500ms | < 2s | Network dependent |
| Rekor verification | < 300ms | < 1s | Network dependent |
| in-toto predicate | < 20ms | < 80ms | JSON serialization |

Database Operation Targets

| Operation | P50 Time | P95 Time | Notes |
| --- | --- | --- | --- |
| Receipt insert | < 5ms | < 20ms | Single record |
| Receipt query (by ID) | < 2ms | < 10ms | Indexed lookup |
| Receipt query (by tenant) | < 10ms | < 50ms | Index scan |
| EPSS lookup (single) | < 1ms | < 5ms | Indexed |
| EPSS lookup (batch 100) | < 10ms | < 50ms | Batch query |
| Risk score insert | < 5ms | < 20ms | Single record |
| Risk score update | < 3ms | < 15ms | Single record |

Regression Thresholds

Performance regression is detected when metrics exceed these thresholds compared to baseline:

| Metric | Warning Threshold | Blocking Threshold | Action |
| --- | --- | --- | --- |
| P50 Time | > 15% increase | > 25% increase | Block release |
| P95 Time | > 20% increase | > 35% increase | Block release |
| Memory Usage | > 20% increase | > 30% increase | Block release |
| CPU Time | > 15% increase | > 25% increase | Investigate |
| Throughput | > 10% decrease | > 20% decrease | Block release |

Regression Detection Rules

  1. Warning: Alert engineering team, add to release notes
  2. Blocking: Cannot merge/release until resolved or waived
  3. Waiver: Requires documented justification and SME approval
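
To make the thresholds concrete, the sketch below classifies a single metric as `ok`, `warning`, or `blocking` from a baseline value and a current value. It is illustrative only and is not how `stellaops benchmark compare` is implemented: the function name `classify_regression` and the metric keys are hypothetical. Throughput is treated as regressing when it decreases, and every other metric when it increases, as in the table.

```shell
# Classify a metric change against the regression thresholds.
# $1 = metric (p50|p95|memory|cpu|throughput), $2 = baseline, $3 = current
classify_regression() {
  awk -v m="$1" -v base="$2" -v cur="$3" 'BEGIN {
    warn["p50"] = 15;        block["p50"] = 25
    warn["p95"] = 20;        block["p95"] = 35
    warn["memory"] = 20;     block["memory"] = 30
    warn["cpu"] = 15;        block["cpu"] = 25
    warn["throughput"] = 10; block["throughput"] = 20
    if (!(m in warn) || base <= 0) { print "error"; exit 2 }
    # Percent change in the "worse" direction for this metric.
    if (m == "throughput") change = (base - cur) * 100 / base
    else                   change = (cur - base) * 100 / base
    if (change > block[m])     print "blocking"
    else if (change > warn[m]) print "warning"
    else                       print "ok"
  }'
}

classify_regression p50 3.0 3.6   # 20% slower than baseline: warning
```

The function only reports the level; what follows from it (alerting, blocking a merge, or requesting a waiver) is governed by the rules above.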

Measurement Methodology

Environment Setup

# Standard test environment
# - CPU: 8 cores (x86_64)
# - Memory: 16GB RAM
# - Storage: NVMe SSD
# - OS: Ubuntu 22.04 LTS
# - Docker: 24.x

# Clear caches before cold start tests
docker system prune -af
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null

Scan Performance

# Cold start measurement
time stellaops scan --image alpine:3.19 --format json > /dev/null

# Warm cache measurement (run 3x, take average)
for i in {1..3}; do
  time stellaops scan --image alpine:3.19 --format json > /dev/null
done

# Memory profiling
/usr/bin/time -v stellaops scan --image alpine:3.19 --format json 2>&1 | \
  grep "Maximum resident set size"

# CPU profiling
perf stat stellaops scan --image alpine:3.19 --format json > /dev/null
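
The targets in this document are stated as P50 and P95, which three runs cannot estimate reliably; a larger sample is needed. The sketch below computes a nearest-rank percentile from one timing per line on standard input. The function name `percentile` is hypothetical, nearest-rank is one of several percentile definitions and is chosen here as an assumption, and how the timings are collected is left to the harness.

```shell
# Nearest-rank percentile: reads one numeric sample per line on stdin.
# $1 = percentile as an integer (e.g. 50 or 95)
percentile() {
  LC_ALL=C sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Five sample timings in seconds
printf '2.1\n2.4\n2.2\n3.9\n2.3\n' | percentile 50   # prints 2.3
printf '2.1\n2.4\n2.2\n3.9\n2.3\n' | percentile 95   # prints 3.9
```

With only five samples the P95 is simply the slowest run, which shows why a larger sample is needed before comparing against the P95 targets.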

Reachability Analysis

# Time measurement
time stellaops reach --project ./src --language csharp --out reach.json

# Memory profiling
/usr/bin/time -v stellaops reach --project ./src --language csharp --out reach.json 2>&1

# With detailed timing
stellaops reach --project ./src --language csharp --out reach.json --timing
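
GNU `time -v` reports peak memory as a line of the form `Maximum resident set size (kbytes): N`. The sketch below reads that report on standard input and compares the peak against a limit in MB, so the memory targets can be checked in a script. The function name `check_peak_memory` is hypothetical, and 1MB is taken as 1024KB, which is an assumption about how the targets are meant.

```shell
# Compare peak RSS from GNU time -v output (stdin) with a limit in MB.
# $1 = limit in MB. Exit status: 0 within limit, 1 over, 2 no RSS line.
check_peak_memory() {
  awk -v limit="$1" -F': ' '
    /Maximum resident set size/ { mb = $2 / 1024; found = 1 }
    END {
      if (!found) { print "no RSS line found" > "/dev/stderr"; exit 2 }
      printf "%.0fMB (limit %sMB): %s\n", mb, limit, (mb < limit ? "ok" : "over")
      exit (mb < limit ? 0 : 1)
    }'
}

# Usage against the Tiny codebase target (< 256MB), reusing the command above:
#   /usr/bin/time -v stellaops reach --project ./src --language csharp \
#     --out reach.json 2>&1 | check_peak_memory 256
```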

SBOM Generation

# Time measurement
time stellaops sbom --image node:20-alpine --format cyclonedx --out sbom.json

# Output size
stellaops sbom --image node:20-alpine --format cyclonedx --out sbom.json && \
  ls -lh sbom.json

Batch Operations

# Process multiple images in parallel
time stellaops scan --images images.txt --parallel 4 --format json --out-dir ./results

# Throughput test (images per minute)
START=$(date +%s)
for i in {1..10}; do
  stellaops scan --image alpine:3.19 --format json > /dev/null
done
END=$(date +%s)
echo "Throughput: $(( 10 * 60 / (END - START) )) images/minute"
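
The shell arithmetic in the throughput test truncates to whole images per minute and fails with a division by zero if the loop finishes within one second. A minimal sketch of the same calculation with fractional output and a guard follows; the function name `throughput_per_minute` is hypothetical.

```shell
# Images per minute from an image count and start/end epoch seconds.
# $1 = number of images, $2 = start time, $3 = end time
throughput_per_minute() {
  awk -v n="$1" -v s="$2" -v e="$3" 'BEGIN {
    elapsed = e - s
    if (elapsed <= 0) { print "elapsed time must be positive" > "/dev/stderr"; exit 1 }
    printf "%.1f\n", n * 60 / elapsed
  }'
}

throughput_per_minute 10 100 125   # 10 images in 25s: prints 24.0
```

In the loop above this would be called as `throughput_per_minute 10 "$START" "$END"`.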

CI Integration

Benchmark Workflow

# .gitea/workflows/performance-benchmark.yml
name: Performance Benchmark

on:
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * 1'  # Weekly Monday 2am

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run benchmarks
        run: make benchmark-performance

      # Threshold values below are the warning levels from the Regression Thresholds table
      - name: Check for regressions
        run: |
          stellaops benchmark compare \
            --baseline results/baseline.json \
            --current results/current.json \
            --threshold-p50 0.15 \
            --threshold-p95 0.20 \
            --threshold-memory 0.20 \
            --fail-on-regression

      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: results/

Local Testing

# Run full benchmark suite
make benchmark-performance

# Run specific image benchmark
make benchmark-image IMAGE=alpine:3.19

# Generate baseline
make benchmark-baseline

# Compare against baseline
make benchmark-compare

Optimization Guidelines

For Scan Performance

  1. Pre-pull images for consistent timing
  2. Use layered caching for repeat scans
  3. Enable parallel analysis for multi-ecosystem images
  4. Consider selective scanning for known-safe layers

For Reachability

  1. Use incremental analysis to skip unchanged files
  2. Cache resolved dependencies
  3. Use language-specific optimizations (e.g., Roslyn for C#)
  4. Limit call graph depth for very large codebases

For Memory

  1. Stream large SBOMs instead of loading fully
  2. Use batched database operations
  3. Release intermediate data structures early
  4. Configure GC appropriately for workload

Historical Baselines

Version History

| Version | Date | P50 Scan (alpine) | P50 Reach (50k LOC) | Notes |
| --- | --- | --- | --- | --- |
| 1.3.0 | 2025-12-14 | TBD | TBD | Current |
| 1.2.0 | 2025-09-01 | TBD | TBD | Previous |
| 1.1.0 | 2025-06-01 | TBD | TBD | Baseline |

Improvement Targets

| Quarter | Focus Area | Target | Status |
| --- | --- | --- | --- |
| Q1 2026 | Scan cold start | -20% | Planned |
| Q1 2026 | Reachability memory | -15% | Planned |
| Q2 2026 | SBOM generation | -10% | Planned |
