# Scanner Language Analyzer Benchmarks This directory will capture benchmark results for language analyzers (Node, Python, Go, .NET, Rust). Pending tasks: - LA1: Node analyzer microbench CSV + flamegraph. - LA2: Python hash throughput CSV. - LA3: Go build info extraction benchmarks. - LA4: .NET RID dedupe performance matrix. - LA5: Rust heuristic coverage comparisons. Results should be committed as deterministic CSV/JSON outputs with accompanying methodology notes. ## Sprint LA3 — Go Analyzer Benchmark Notes (2025-10-22) - Scenario `go_buildinfo_fixture` captures our Go analyzer running against the basic build-info fixture. The Oct 23 baseline (`baseline.csv`) shows a mean duration of **35.03 ms** (p95 136.55 ms, max 170.16 ms) over 5 iterations on the current rig; earlier Oct 21 measurement recorded **4.02 ms** mean when the analyzer was profiled on the warm perf runner. - Comparative run against Syft v1.29.1 on the same fixture (captured 2025-10-21) reported a mean of **5.18 ms** (p95 18.64 ms, max 23.51 ms); raw measurements live in `go/syft-comparison-20251021.csv`. - Bench command (from repo root):\ `dotnet run --project bench/Scanner.Analyzers/StellaOps.Bench.ScannerAnalyzers/StellaOps.Bench.ScannerAnalyzers.csproj -- --config bench/Scanner.Analyzers/config.json --out bench/Scanner.Analyzers/baseline.csv` ## Sprint LA4 — .NET Analyzer Benchmark Notes (2025-10-23) - Scenario `dotnet_multirid_fixture` exercises the .NET analyzer against the multi-RID test fixture that merges two applications and four runtime identifiers. Latest baseline run (Release build, 5 iterations) records a mean duration of **29.19 ms** (p95 106.62 ms, max 132.30 ms) with a stable component count of 2. - Syft v1.29.1 scanning the same fixture (`syft scan dir:…`) averaged **1 546 ms** (p95 ≈2 100 ms, max ≈2 100 ms) while also reporting duplicate packages; raw numbers captured in `dotnet/syft-comparison-20251023.csv`. - The new scenario is declared in `bench/Scanner.Analyzers/config.json`; rerun the bench command above after rebuilding analyzers to refresh baselines and comparison data. ## Sprint LA2 — Python Analyzer Benchmark Notes (2025-10-23) - Added two Python scenarios to `config.json`: the virtualenv sample (`python_site_packages_scan`) and the RECORD-heavy pip cache fixture (`python_pip_cache_fixture`). - Baseline run (Release build, 5 iterations) records means of **5.64 ms** (p95 18.29 ms) for the virtualenv and **5.86 ms** (p95 13.29 ms) for the pip cache verifier; raw numbers stored in `python/hash-throughput-20251023.csv`. - The pip cache fixture exercises `PythonRecordVerifier` with 12 RECORD rows (7 hashed) and mismatched layer coverage, giving a repeatable hash-validation throughput reference for regression gating.