# Performance Baselines
## Overview
This document defines performance baselines for StellaOps scanner operations. All metrics are measured against reference images and workloads to ensure consistent, reproducible benchmarks.
**Last Updated:** 2025-12-14
**Next Review:** 2026-03-14
---
## Reference Images
Standard images used for performance benchmarking:
| Image | Size | Components | Expected Vulns | Category |
|-------|------|------------|----------------|----------|
| `alpine:3.19` | 7MB | ~15 | ~5 | Minimal |
| `debian:12-slim` | 75MB | ~90 | ~40 | Minimal |
| `ubuntu:22.04` | 77MB | ~100 | ~50 | Standard |
| `node:20-alpine` | 180MB | ~200 | ~100 | Application |
| `python:3.12` | 1GB | ~300 | ~150 | Application |
| `mcr.microsoft.com/dotnet/aspnet:8.0` | 220MB | ~150 | ~75 | Application |
| `nginx:1.25` | 190MB | ~120 | ~60 | Application |
| `postgres:16-alpine` | 240MB | ~140 | ~70 | Database |
---
## Scan Performance Targets
### Container Image Scanning
| Image Category | P50 Time | P95 Time | Max Memory | CPU Cores |
|---------------|----------|----------|------------|-----------|
| Minimal (<100MB) | < 5s | < 10s | < 256MB | 1 |
| Standard (100-500MB) | < 15s | < 30s | < 512MB | 2 |
| Large (500MB-2GB) | < 45s | < 90s | < 1.5GB | 2 |
| Very Large (>2GB) | < 120s | < 240s | < 2GB | 4 |
### Per-Image Targets
| Image | P50 Time | P95 Time | Max Memory |
|-------|----------|----------|------------|
| alpine:3.19 | < 3s | < 8s | < 200MB |
| debian:12-slim | < 8s | < 15s | < 300MB |
| ubuntu:22.04 | < 10s | < 20s | < 400MB |
| node:20-alpine | < 20s | < 40s | < 600MB |
| python:3.12 | < 35s | < 70s | < 1.2GB |
| dotnet/aspnet:8.0 | < 25s | < 50s | < 800MB |
| nginx:1.25 | < 18s | < 35s | < 500MB |
| postgres:16-alpine | < 22s | < 45s | < 600MB |
---
## Reachability Analysis Targets
### By Codebase Size
| Codebase Size | P50 Time | P95 Time | Memory | Notes |
|---------------|----------|----------|--------|-------|
| Tiny (<5k LOC) | < 10s | < 20s | < 256MB | Single service |
| Small (5-20k LOC) | < 30s | < 60s | < 512MB | Small service |
| Medium (20-50k LOC) | < 2min | < 4min | < 1GB | Typical microservice |
| Large (50-100k LOC) | < 5min | < 10min | < 2GB | Large service |
| Very Large (100-500k LOC) | < 15min | < 30min | < 4GB | Monolith |
| Monorepo (>500k LOC) | < 45min | < 90min | < 8GB | Enterprise monorepo |
### By Language
| Language | Relative Analysis Time | Notes |
|----------|---------------|-------|
| Go | 1.0x (baseline) | Fast due to simple module system |
| Java | 1.2x | Maven/Gradle resolution adds overhead |
| C# | 1.3x | MSBuild/NuGet resolution |
| TypeScript | 1.5x | npm/yarn resolution, complex imports |
| Python | 1.8x | Virtual env resolution, dynamic imports |
| JavaScript | 2.0x | Complex bundler configurations |
---
## SBOM Generation Targets
| Format | P50 Time | P95 Time | Output Size | Notes |
|--------|----------|----------|-------------|-------|
| CycloneDX 1.6 (JSON) | < 1s | < 3s | ~50KB/100 components | Standard |
| CycloneDX 1.6 (XML) | < 1.5s | < 4s | ~80KB/100 components | Verbose |
| SPDX 3.0.1 (JSON) | < 1s | < 3s | ~60KB/100 components | Standard |
| SPDX 3.0.1 (Tag-Value) | < 1.2s | < 3.5s | ~70KB/100 components | Legacy format |
### Combined Operations
| Operation | P50 Time | P95 Time |
|-----------|----------|----------|
| Scan + SBOM | scan_time + 1s | scan_time + 3s |
| Scan + SBOM + Reachability | scan_time + reach_time + 2s | scan_time + reach_time + 5s |
| Full attestation pipeline | total_time + 2s | total_time + 5s |
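
As a worked example of the formulas above, the sketch below adds the fixed overhead to the per-stage targets. The helper name `combined_budget` is hypothetical. Plugging in the `node:20-alpine` per-image targets and the Small codebase reachability targets is only an illustration; the resulting numbers are not published combined targets.

```bash
#!/usr/bin/env bash
# Hypothetical helper: sum stage targets (whole seconds) plus the fixed overhead.
# Usage: combined_budget SCAN_SECONDS OVERHEAD_SECONDS [REACH_SECONDS]
combined_budget() {
  local scan=$1 overhead=$2 reach=${3:-0}
  echo $(( scan + reach + overhead ))
}

# Scan + SBOM for node:20-alpine (scan targets: P50 < 20s, P95 < 40s)
combined_budget 20 1      # P50 budget in seconds: 21
combined_budget 40 3      # P95 budget in seconds: 43

# Scan + SBOM + Reachability with a Small codebase (P50 < 30s, P95 < 60s)
combined_budget 20 2 30   # P50 budget in seconds: 52
combined_budget 40 5 60   # P95 budget in seconds: 105
```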
---
## VEX Processing Targets
| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| VEX document parsing | < 50ms | < 150ms | Per document |
| Lattice state computation | < 100ms | < 300ms | Per 100 vulnerabilities |
| VEX consensus merge | < 200ms | < 500ms | 3-5 sources |
| State transition | < 10ms | < 30ms | Single transition |
---
## CVSS Scoring Targets
| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| MacroVector lookup | < 1μs | < 5μs | Dictionary lookup |
| CVSS v4.0 base score | < 10μs | < 50μs | Full computation |
| CVSS v4.0 full score | < 20μs | < 100μs | Base + threat + env |
| Vector parsing | < 5μs | < 20μs | String parsing |
| Receipt generation | < 100μs | < 500μs | Includes hashing |
| Batch scoring (100 vulns) | < 5ms | < 15ms | Parallel processing |
---
## Attestation Targets
| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| DSSE envelope creation | < 50ms | < 150ms | Includes signing |
| DSSE verification | < 30ms | < 100ms | Signature check |
| Rekor submission | < 500ms | < 2s | Network dependent |
| Rekor verification | < 300ms | < 1s | Network dependent |
| in-toto predicate | < 20ms | < 80ms | JSON serialization |
---
## Database Operation Targets
| Operation | P50 Time | P95 Time | Notes |
|-----------|----------|----------|-------|
| Receipt insert | < 5ms | < 20ms | Single record |
| Receipt query (by ID) | < 2ms | < 10ms | Indexed lookup |
| Receipt query (by tenant) | < 10ms | < 50ms | Index scan |
| EPSS lookup (single) | < 1ms | < 5ms | Indexed |
| EPSS lookup (batch 100) | < 10ms | < 50ms | Batch query |
| Risk score insert | < 5ms | < 20ms | Single record |
| Risk score update | < 3ms | < 15ms | Single record |
---
## Regression Thresholds
A performance regression is flagged when a metric changes from its baseline by more than the thresholds below:
| Metric | Warning Threshold | Blocking Threshold | Action |
|--------|------------------|-------------------|--------|
| P50 Time | > 15% increase | > 25% increase | Block release |
| P95 Time | > 20% increase | > 35% increase | Block release |
| Memory Usage | > 20% increase | > 30% increase | Block release |
| CPU Time | > 15% increase | > 25% increase | Investigate |
| Throughput | > 10% decrease | > 20% decrease | Block release |
### Regression Detection Rules
1. **Warning**: Alert engineering team, add to release notes
2. **Blocking**: Cannot merge/release until resolved or waived
3. **Waiver**: Requires documented justification and SME approval
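
The threshold logic above can be sketched as a small shell function. This is a minimal illustration, not the logic behind `stellaops benchmark compare`: the function name `classify_regression` is hypothetical, and it only handles metrics where an increase is bad (time, memory, CPU). Throughput, where a decrease is the regression, would need the comparison inverted.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a metric against warning/blocking thresholds.
# Usage: classify_regression BASELINE CURRENT WARN_PCT BLOCK_PCT
# Prints "ok", "warning", or "blocking" based on the percent increase.
classify_regression() {
  awk -v base="$1" -v cur="$2" -v warn="$3" -v block="$4" 'BEGIN {
    if (base <= 0) { print "invalid"; exit 1 }   # avoid dividing by zero
    pct = (cur - base) / base * 100
    if (pct > block)     print "blocking"
    else if (pct > warn) print "warning"
    else                 print "ok"
  }'
}

# P50 time thresholds from the table: 15% warning, 25% blocking.
classify_regression 3.0 3.3 15 25   # 10% slower  -> ok
classify_regression 3.0 3.6 15 25   # 20% slower  -> warning
classify_regression 3.0 4.0 15 25   # ~33% slower -> blocking
```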
---
## Measurement Methodology
### Environment Setup
```bash
# Standard test environment
# - CPU: 8 cores (x86_64)
# - Memory: 16GB RAM
# - Storage: NVMe SSD
# - OS: Ubuntu 22.04 LTS
# - Docker: 24.x
# Clear caches before cold start tests (writing to drop_caches requires root)
docker system prune -af
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
```
### Scan Performance
```bash
# Cold start measurement
time stellaops scan --image alpine:3.19 --format json > /dev/null
# Warm cache measurement (run 3x, take average)
for i in {1..3}; do
  time stellaops scan --image alpine:3.19 --format json > /dev/null
done
# Memory profiling
/usr/bin/time -v stellaops scan --image alpine:3.19 --format json 2>&1 | \
  grep "Maximum resident set size"
# CPU profiling
perf stat stellaops scan --image alpine:3.19 --format json > /dev/null
```
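
The commands above print individual timings but do not aggregate them. The sketch below shows one way to turn a list of samples into a mean and P50/P95 values. The function names are hypothetical, and nearest-rank is only one common percentile definition; this document does not specify which definition the baselines use. The commented collection loop relies on `date +%s%N`, which assumes GNU coreutils.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: aggregate timing samples read from stdin
# (one per line or whitespace-separated).

# Arithmetic mean, printed with three decimals.
mean() {
  tr -s ' \t' '\n' |
    awk 'NF { sum += $1; n++ } END { if (n) printf "%.3f\n", sum / n }'
}

# Nearest-rank percentile: the value at rank ceil(P/100 * N) after sorting.
# Usage: percentile P < samples
percentile() {
  tr -s ' \t' '\n' | awk 'NF' | LC_ALL=C sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      x = p * NR / 100
      rank = int(x); if (rank < x) rank++   # ceiling
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Collecting samples in milliseconds (not run here; needs stellaops installed):
#   for i in {1..20}; do
#     s=$(date +%s%N)
#     stellaops scan --image alpine:3.19 --format json > /dev/null
#     e=$(date +%s%N)
#     echo $(( (e - s) / 1000000 ))
#   done > samples.txt
#   percentile 50 < samples.txt
#   percentile 95 < samples.txt
```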
### Reachability Analysis
```bash
# Time measurement
time stellaops reach --project ./src --language csharp --out reach.json
# Memory profiling
/usr/bin/time -v stellaops reach --project ./src --language csharp --out reach.json 2>&1
# With detailed timing
stellaops reach --project ./src --language csharp --out reach.json --timing
```
### SBOM Generation
```bash
# Time measurement
time stellaops sbom --image node:20-alpine --format cyclonedx --out sbom.json
# Output size
stellaops sbom --image node:20-alpine --format cyclonedx --out sbom.json && \
  ls -lh sbom.json
```
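
To compare a generated file against the "KB per 100 components" figures in the SBOM Generation Targets table, the output size has to be normalized by component count. A minimal sketch follows. The function name is hypothetical, and it assumes 1 KB = 1024 bytes, which the table does not state. The commented usage assumes `jq` is installed and reads the top-level `components` array of the CycloneDX JSON document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: normalize SBOM size to KB per 100 components.
# Usage: kb_per_100_components SIZE_BYTES COMPONENT_COUNT
kb_per_100_components() {
  awk -v bytes="$1" -v n="$2" 'BEGIN {
    if (n <= 0) exit 1                      # no components, nothing to normalize
    printf "%.1f\n", bytes / 1024 * 100 / n
  }'
}

# A 100 KiB file describing 200 components works out to 50.0 KB per 100.
kb_per_100_components 102400 200

# With a real file (not run here; needs stellaops output and jq):
#   kb_per_100_components "$(wc -c < sbom.json)" \
#     "$(jq '.components | length' sbom.json)"
```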
### Batch Operations
```bash
# Process multiple images in parallel
time stellaops scan --images images.txt --parallel 4 --format json --out-dir ./results
# Throughput test (images per minute)
START=$(date +%s)
for i in {1..10}; do
  stellaops scan --image alpine:3.19 --format json > /dev/null
done
END=$(date +%s)
echo "Throughput: $(( 10 * 60 / (END - START) )) images/minute"
```
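
The throughput calculation above uses whole seconds and integer division, so it truncates the result and fails with a division-by-zero error if the loop finishes in under one second. A sketch of a more forgiving calculation follows. The function name `throughput_per_minute` is hypothetical; it accepts fractional seconds and rejects a non-positive elapsed time.

```bash
#!/usr/bin/env bash
# Hypothetical helper: images per minute from a count and elapsed seconds.
# Usage: throughput_per_minute COUNT ELAPSED_SECONDS
throughput_per_minute() {
  awk -v n="$1" -v secs="$2" 'BEGIN {
    if (secs <= 0) {
      print "elapsed time must be greater than zero" > "/dev/stderr"
      exit 1
    }
    printf "%.1f\n", n * 60 / secs
  }'
}

# 10 scans in 30 seconds is 20.0 images/minute.
throughput_per_minute 10 30

# 10 scans in 12.5 seconds is 48.0 images/minute.
throughput_per_minute 10 12.5
```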
---
## CI Integration
### Benchmark Workflow
```yaml
# .gitea/workflows/performance-benchmark.yml
name: Performance Benchmark
on:
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * 1'  # Weekly Monday 2am
jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run benchmarks
        run: make benchmark-performance
      - name: Check for regressions
        run: |
          stellaops benchmark compare \
            --baseline results/baseline.json \
            --current results/current.json \
            --threshold-p50 0.15 \
            --threshold-p95 0.20 \
            --threshold-memory 0.20 \
            --fail-on-regression
      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: results/
```
### Local Testing
```bash
# Run full benchmark suite
make benchmark-performance
# Run specific image benchmark
make benchmark-image IMAGE=alpine:3.19
# Generate baseline
make benchmark-baseline
# Compare against baseline
make benchmark-compare
```
---
## Optimization Guidelines
### For Scan Performance
1. **Pre-pull images** for consistent timing
2. **Use layered caching** for repeat scans
3. **Enable parallel analysis** for multi-ecosystem images
4. **Consider selective scanning** for known-safe layers
### For Reachability
1. **Incremental analysis** for unchanged files
2. **Cache resolved dependencies**
3. **Use language-specific optimizations** (e.g., Roslyn for C#)
4. **Limit call graph depth** for very large codebases
### For Memory
1. **Stream large SBOMs** instead of loading fully
2. **Use batched database operations**
3. **Release intermediate data structures early**
4. **Configure GC appropriately for workload**
---
## Historical Baselines
### Version History
| Version | Date | P50 Scan (alpine) | P50 Reach (50k LOC) | Notes |
|---------|------|-------------------|---------------------|-------|
| 1.3.0 | 2025-12-14 | TBD | TBD | Current |
| 1.2.0 | 2025-09-01 | TBD | TBD | Previous |
| 1.1.0 | 2025-06-01 | TBD | TBD | Baseline |
### Improvement Targets
| Quarter | Focus Area | Target | Status |
|---------|------------|--------|--------|
| Q1 2026 | Scan cold start | -20% | Planned |
| Q1 2026 | Reachability memory | -15% | Planned |
| Q2 2026 | SBOM generation | -10% | Planned |
---
## References
- [Accuracy Metrics Framework](accuracy-metrics-framework.md)
- [Benchmark Submission Guide](submission-guide.md) (pending)
- [Scanner Architecture](../modules/scanner/architecture.md)
- [Reachability Module](../modules/scanner/reachability.md)