# Performance Baselines

## Overview

This document defines performance baselines for StellaOps scanner operations. All metrics are measured against reference images and workloads to ensure consistent, reproducible benchmarks.

**Last Updated:** 2025-12-14

**Next Review:** 2026-03-14

---

## Reference Images

Standard images used for performance benchmarking:

| Image | Size | Components | Expected Vulns | Category |
|-------|------|------------|----------------|----------|
| `alpine:3.19` | 7MB | ~15 | ~5 | Minimal |
| `debian:12-slim` | 75MB | ~90 | ~40 | Minimal |
| `ubuntu:22.04` | 77MB | ~100 | ~50 | Standard |
| `node:20-alpine` | 180MB | ~200 | ~100 | Application |
| `python:3.12` | 1GB | ~300 | ~150 | Application |
| `mcr.microsoft.com/dotnet/aspnet:8.0` | 220MB | ~150 | ~75 | Application |
| `nginx:1.25` | 190MB | ~120 | ~60 | Application |
| `postgres:16-alpine` | 240MB | ~140 | ~70 | Database |

---

## Scan Performance Targets

### Container Image Scanning

| Image Category | P50 Time | P95 Time | Max Memory | CPU Cores |
|---------------|----------|----------|------------|-----------|
| Minimal (<100MB) | < 5s | < 10s | < 256MB | 1 |
| Standard (100-500MB) | < 15s | < 30s | < 512MB | 2 |
| Large (500MB-2GB) | < 45s | < 90s | < 1.5GB | 2 |
| Very Large (>2GB) | < 120s | < 240s | < 2GB | 4 |
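
The size bands above can be applied mechanically when picking a budget for a new image. The following is a minimal sketch, not part of the StellaOps tooling: the function name `scan_budget_for_size` is hypothetical, and the boundary handling (100MB and 500MB fall into the larger band, 2GB is taken as 2048MB) is an assumption, since the table leaves the edges open.

```bash
# Map an image size in MB to its size category and P50/P95 scan budget (seconds).
# Output format: "<category> <p50_seconds> <p95_seconds>"
# Assumption: boundary values 100 and 500 belong to the larger band; 2GB = 2048MB.
scan_budget_for_size() {
  size_mb=$1
  if [ "$size_mb" -lt 100 ]; then
    echo "Minimal 5 10"
  elif [ "$size_mb" -lt 500 ]; then
    echo "Standard 15 30"
  elif [ "$size_mb" -le 2048 ]; then
    echo "Large 45 90"
  else
    echo "VeryLarge 120 240"
  fi
}

scan_budget_for_size 77     # ubuntu:22.04  -> Minimal 5 10
scan_budget_for_size 1024   # python:3.12   -> Large 45 90
```

This classifies purely by size, so `ubuntu:22.04` (77MB) lands in the Minimal band even though the Reference Images table labels it Standard.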

### Per-Image Targets

| Image | P50 Time | P95 Time | Max Memory |
|-------|----------|----------|------------|
| `alpine:3.19` | < 3s | < 8s | < 200MB |
| `debian:12-slim` | < 8s | < 15s | < 300MB |
| `ubuntu:22.04` | < 10s | < 20s | < 400MB |
| `node:20-alpine` | < 20s | < 40s | < 600MB |
| `python:3.12` | < 35s | < 70s | < 1.2GB |
| `mcr.microsoft.com/dotnet/aspnet:8.0` | < 25s | < 50s | < 800MB |
| `nginx:1.25` | < 18s | < 35s | < 500MB |
| `postgres:16-alpine` | < 22s | < 45s | < 600MB |

---

## Reachability Analysis Targets

### By Codebase Size

| Codebase Size | P50 Time | P95 Time | Memory | Notes |
|---------------|----------|----------|--------|-------|
| Tiny (<5k LOC) | < 10s | < 20s | < 256MB | Single service |
| Small (5-20k LOC) | < 30s | < 60s | < 512MB | Small service |
| Medium (20-50k LOC) | < 2min | < 4min | < 1GB | Typical microservice |
| Large (50-100k LOC) | < 5min | < 10min | < 2GB | Large service |
| Very Large (100-500k LOC) | < 15min | < 30min | < 4GB | Monolith |
| Monorepo (>500k LOC) | < 45min | < 90min | < 8GB | Enterprise monorepo |

### By Language

Multipliers express analysis time relative to Go; higher values mean slower analysis.

| Language | Relative Time | Notes |
|----------|---------------|-------|
| Go | 1.0x (baseline) | Fast due to simple module system |
| Java | 1.2x | Maven/Gradle resolution adds overhead |
| C# | 1.3x | MSBuild/NuGet resolution |
| TypeScript | 1.5x | npm/yarn resolution, complex imports |
| Python | 1.8x | Virtual env resolution, dynamic imports |
| JavaScript | 2.0x | Complex bundler configurations |
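
One way to use the table is to scale a measured Go analysis time into a rough estimate for another language. The sketch below assumes the multipliers can be treated as simple scale factors, which the table does not guarantee. The function names are hypothetical, and apart from `csharp` (used with `--language` elsewhere in this document) the language keys are illustrative.

```bash
# Look up the relative-time multiplier for a language (values from the table above).
language_multiplier() {
  case "$1" in
    go)         echo 1.0 ;;
    java)       echo 1.2 ;;
    csharp)     echo 1.3 ;;
    typescript) echo 1.5 ;;
    python)     echo 1.8 ;;
    javascript) echo 2.0 ;;
    *)          echo "unknown language: $1" >&2; return 1 ;;
  esac
}

# Estimate analysis time in seconds.
# $1 = measured Go time in seconds, $2 = target language key
estimate_reach_seconds() {
  mult=$(language_multiplier "$2") || return 1
  awk -v t="$1" -v m="$mult" 'BEGIN { printf "%.1f\n", t * m }'
}

estimate_reach_seconds 60 python   # 60s in Go -> 108.0
```

Treat the result as a planning figure only; the real ratio depends on dependency resolution and import structure in the project being analyzed.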

---

## SBOM Generation Targets

| Format | P50 Time | P95 Time | Output Size | Notes |
|--------|----------|----------|-------------|-------|
| CycloneDX 1.6 (JSON) | < 1s | < 3s | ~50KB/100 components | Standard |
| CycloneDX 1.6 (XML) | < 1.5s | < 4s | ~80KB/100 components | Verbose |
| SPDX 3.0.1 (JSON) | < 1s | < 3s | ~60KB/100 components | Standard |
| SPDX 3.0.1 (Tag-Value) | < 1.2s | < 3.5s | ~70KB/100 components | Legacy format |
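
The output-size column scales linearly with component count, so an expected SBOM size can be estimated before generating one. This is a rough sketch under that linearity assumption; `sbom_size_kb` and its format keys (`cyclonedx-json`, `spdx-tv`, and so on) are hypothetical names chosen for illustration, not CLI values.

```bash
# Estimate SBOM output size in KB from the per-100-component figures above.
# $1 = format key (hypothetical), $2 = component count
sbom_size_kb() {
  case "$1" in
    cyclonedx-json) per100=50 ;;
    cyclonedx-xml)  per100=80 ;;
    spdx-json)      per100=60 ;;
    spdx-tv)        per100=70 ;;
    *) echo "unknown format: $1" >&2; return 1 ;;
  esac
  # Integer arithmetic; the result is rounded down to whole KB
  echo $(( $2 * per100 / 100 ))
}

sbom_size_kb cyclonedx-json 200   # node:20-alpine, ~200 components -> 100
```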

### Combined Operations

| Operation | P50 Time | P95 Time |
|-----------|----------|----------|
| Scan + SBOM | scan_time + 1s | scan_time + 3s |
| Scan + SBOM + Reachability | scan_time + reach_time + 2s | scan_time + reach_time + 5s |
| Full attestation pipeline | total_time + 2s | total_time + 5s |
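
As a worked example of the first two rows, the combined budget is the sum of the stage budgets plus the fixed overhead. The helper below is a sketch with a hypothetical name. It does not model the full attestation row, because `total_time` is not defined in the table.

```bash
# Combined P50/P95 budget in seconds, printed as "<p50> <p95>".
# Two arguments:  scan_p50 scan_p95                      -> Scan + SBOM
# Four arguments: scan_p50 scan_p95 reach_p50 reach_p95  -> Scan + SBOM + Reachability
combined_budget() {
  if [ "$#" -eq 4 ]; then
    echo "$(( $1 + $3 + 2 )) $(( $2 + $4 + 5 ))"
  else
    echo "$(( $1 + 1 )) $(( $2 + 3 ))"
  fi
}

combined_budget 20 40           # node:20-alpine scan + SBOM      -> 21 43
combined_budget 20 40 120 240   # plus a Medium codebase (2/4min) -> 142 285
```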

---

## VEX Processing Targets

| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| VEX document parsing | < 50ms | < 150ms | Per document |
| Lattice state computation | < 100ms | < 300ms | Per 100 vulnerabilities |
| VEX consensus merge | < 200ms | < 500ms | 3-5 sources |
| State transition | < 10ms | < 30ms | Single transition |

---

## CVSS Scoring Targets

| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| MacroVector lookup | < 1μs | < 5μs | Dictionary lookup |
| CVSS v4.0 base score | < 10μs | < 50μs | Full computation |
| CVSS v4.0 full score | < 20μs | < 100μs | Base + threat + env |
| Vector parsing | < 5μs | < 20μs | String parsing |
| Receipt generation | < 100μs | < 500μs | Includes hashing |
| Batch scoring (100 vulns) | < 5ms | < 15ms | Parallel processing |

---

## Attestation Targets

| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| DSSE envelope creation | < 50ms | < 150ms | Includes signing |
| DSSE verification | < 30ms | < 100ms | Signature check |
| Rekor submission | < 500ms | < 2s | Network dependent |
| Rekor verification | < 300ms | < 1s | Network dependent |
| in-toto predicate | < 20ms | < 80ms | JSON serialization |

---

## Database Operation Targets

| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| Receipt insert | < 5ms | < 20ms | Single record |
| Receipt query (by ID) | < 2ms | < 10ms | Indexed lookup |
| Receipt query (by tenant) | < 10ms | < 50ms | Index scan |
| EPSS lookup (single) | < 1ms | < 5ms | Indexed |
| EPSS lookup (batch 100) | < 10ms | < 50ms | Batch query |
| Risk score insert | < 5ms | < 20ms | Single record |
| Risk score update | < 3ms | < 15ms | Single record |

---

## Regression Thresholds

A performance regression is flagged when a metric moves past one of these thresholds relative to its recorded baseline:

| Metric | Warning Threshold | Blocking Threshold | Action at Blocking Threshold |
|--------|------------------|-------------------|--------|
| P50 Time | > 15% increase | > 25% increase | Block release |
| P95 Time | > 20% increase | > 35% increase | Block release |
| Memory Usage | > 20% increase | > 30% increase | Block release |
| CPU Time | > 15% increase | > 25% increase | Investigate |
| Throughput | > 10% decrease | > 20% decrease | Block release |

### Regression Detection Rules

1. **Warning**: Alert engineering team, add to release notes
2. **Blocking**: Cannot merge/release until resolved or waived
3. **Waiver**: Requires documented justification and SME approval
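
The threshold check itself is a percentage comparison against the baseline. The following sketch shows one way to express it. `classify_regression` is a hypothetical helper, not the logic behind `stellaops benchmark compare`, and it assumes thresholds are given in percent.

```bash
# Classify a metric change as "ok", "warning", or "blocking".
# $1 = baseline value, $2 = current value,
# $3 = warning threshold (%), $4 = blocking threshold (%),
# $5 = "increase" (default; time, memory) or "decrease" (throughput)
classify_regression() {
  awk -v base="$1" -v cur="$2" -v warn="$3" -v block="$4" -v dir="${5:-increase}" 'BEGIN {
    if (base <= 0) { print "invalid baseline"; exit 1 }
    pct = (cur - base) / base * 100
    if (dir == "decrease") pct = -pct   # a drop in throughput counts as a regression
    if (pct > block)     print "blocking"
    else if (pct > warn) print "warning"
    else                 print "ok"
  }'
}

classify_regression 3.0 3.3 15 25            # P50 time +10%   -> ok
classify_regression 3.0 3.6 15 25            # P50 time +20%   -> warning
classify_regression 3.0 4.0 15 25            # P50 time +33%   -> blocking
classify_regression 100 85 10 20 decrease    # throughput -15% -> warning
```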

---

## Measurement Methodology

### Environment Setup

```bash
# Standard test environment
# - CPU: 8 cores (x86_64)
# - Memory: 16GB RAM
# - Storage: NVMe SSD
# - OS: Ubuntu 22.04 LTS
# - Docker: 24.x

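# Clear caches before cold start tests
# WARNING: removes all unused images, containers, networks, and build cache
docker system prune -af
# Writing to drop_caches requires root; `sudo echo 3 > ...` would fail because
# the redirect runs in the unprivileged shell, so pipe through `sudo tee` instead
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null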
```

### Scan Performance

```bash
# Cold start measurement
time stellaops scan --image alpine:3.19 --format json > /dev/null

# Warm cache measurement (run 3x, then average the reported wall-clock times)
for i in {1..3}; do
  time stellaops scan --image alpine:3.19 --format json > /dev/null
done

# Memory profiling
/usr/bin/time -v stellaops scan --image alpine:3.19 --format json 2>&1 | \
  grep "Maximum resident set size"

# CPU profiling
perf stat stellaops scan --image alpine:3.19 --format json > /dev/null
```
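
The targets in this document are stated as P50 and P95, which need more samples than a three-run average provides. The sketch below computes a nearest-rank percentile from a list of timings, one per line. The `percentile` helper is hypothetical, the sample values are made up for illustration, and nearest-rank is one of several percentile definitions; the source does not say which one the baselines use.

```bash
# Nearest-rank percentile over numeric samples read from stdin (one per line).
# $1 = integer percentile, e.g. 50 or 95
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1                # no samples
      rank = int((p * NR + 99) / 100)    # ceil(p/100 * NR) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Illustrative wall-clock timings in seconds (not real measurements)
printf '%s\n' 2.9 2.4 3.1 2.6 2.5 | percentile 50   # prints 2.6
printf '%s\n' 2.9 2.4 3.1 2.6 2.5 | percentile 95   # prints 3.1
```

In practice the input would be a file of per-run timings collected from repeated scans. With only a handful of samples the P95 is simply the slowest run, so a larger sample count gives a more meaningful figure.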

### Reachability Analysis

```bash
# Time measurement
time stellaops reach --project ./src --language csharp --out reach.json

# Memory profiling
/usr/bin/time -v stellaops reach --project ./src --language csharp --out reach.json 2>&1

# With detailed timing
stellaops reach --project ./src --language csharp --out reach.json --timing
```

### SBOM Generation

```bash
# Time measurement
time stellaops sbom --image node:20-alpine --format cyclonedx --out sbom.json

# Output size
stellaops sbom --image node:20-alpine --format cyclonedx --out sbom.json && \
  ls -lh sbom.json
```

### Batch Operations

```bash
# Process multiple images in parallel
time stellaops scan --images images.txt --parallel 4 --format json --out-dir ./results

# Throughput test (images per minute)
START=$(date +%s)
for i in {1..10}; do
  stellaops scan --image alpine:3.19 --format json > /dev/null
done
END=$(date +%s)
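ELAPSED=$(( END - START ))
# Guard against division by zero if all runs finish within the same second
echo "Throughput: $(( 10 * 60 / (ELAPSED > 0 ? ELAPSED : 1) )) images/minute"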
```

---

## CI Integration

### Benchmark Workflow

```yaml
# .gitea/workflows/performance-benchmark.yml
name: Performance Benchmark

on:
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * 1'  # Weekly Monday 2am

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run benchmarks
        run: make benchmark-performance

      - name: Check for regressions
        run: |
          # Fail only at the blocking thresholds from the Regression Thresholds table
          stellaops benchmark compare \
            --baseline results/baseline.json \
            --current results/current.json \
            --threshold-p50 0.25 \
            --threshold-p95 0.35 \
            --threshold-memory 0.30 \
            --fail-on-regression

      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: results/
```

### Local Testing

```bash
# Run full benchmark suite
make benchmark-performance

# Run specific image benchmark
make benchmark-image IMAGE=alpine:3.19

# Generate baseline
make benchmark-baseline

# Compare against baseline
make benchmark-compare
```

---

## Optimization Guidelines

### For Scan Performance

1. **Pre-pull images** for consistent timing
2. **Use layered caching** for repeat scans
3. **Enable parallel analysis** for multi-ecosystem images
4. **Consider selective scanning** for known-safe layers

### For Reachability

1. **Incremental analysis** for unchanged files
2. **Cache resolved dependencies**
3. **Use language-specific optimizations** (e.g., Roslyn for C#)
4. **Limit call graph depth** for very large codebases

### For Memory

1. **Stream large SBOMs** instead of loading fully
2. **Use batched database operations**
3. **Release intermediate data structures early**
4. **Configure GC appropriately for workload**

---

## Historical Baselines

### Version History

| Version | Date | P50 Scan (alpine) | P50 Reach (50k LOC) | Notes |
|---------|------|-------------------|---------------------|-------|
| 1.3.0 | 2025-12-14 | TBD | TBD | Current |
| 1.2.0 | 2025-09-01 | TBD | TBD | Previous |
| 1.1.0 | 2025-06-01 | TBD | TBD | Baseline |

### Improvement Targets

| Quarter | Focus Area | Target | Status |
|---------|------------|--------|--------|
| Q1 2026 | Scan cold start | -20% | Planned |
| Q1 2026 | Reachability memory | -15% | Planned |
| Q2 2026 | SBOM generation | -10% | Planned |

---

## References

- [Accuracy Metrics Framework](accuracy-metrics-framework.md)
- [Benchmark Submission Guide](submission-guide.md) (pending)
- [Scanner Architecture](../modules/scanner/architecture.md)
- [Reachability Module](../modules/scanner/reachability.md)