# Scanner Language Analyzer Benchmarks

This directory will capture benchmark results for language analyzers (Node, Python, Go, .NET, Rust).

Pending tasks:
- LA1: Node analyzer microbench CSV + flamegraph.
- LA2: Python hash throughput CSV.
- LA3: Go build info extraction benchmarks.
- LA4: .NET RID dedupe performance matrix.
- LA5: Rust heuristic coverage comparisons.

Results should be committed as deterministic CSV/JSON outputs with accompanying methodology notes.

## Sprint LA3 — Go Analyzer Benchmark Notes (2025-10-22)

- Scenario `go_buildinfo_fixture` captures our Go analyzer running against the basic build-info fixture. The Oct 23 baseline (`baseline.csv`) shows a mean duration of **35.03 ms** (p95 136.55 ms, max 170.16 ms) over 5 iterations on the current rig; earlier Oct 21 measurement recorded **4.02 ms** mean when the analyzer was profiled on the warm perf runner.
- Comparative run against Syft v1.29.1 on the same fixture (captured 2025-10-21) reported a mean of **5.18 ms** (p95 18.64 ms, max 23.51 ms); raw measurements live in `go/syft-comparison-20251021.csv`.
- Bench command (from repo root):\
  `dotnet run --project bench/Scanner.Analyzers/StellaOps.Bench.ScannerAnalyzers/StellaOps.Bench.ScannerAnalyzers.csproj -- --config bench/Scanner.Analyzers/config.json --out bench/Scanner.Analyzers/baseline.csv`

## Sprint LA4 — .NET Analyzer Benchmark Notes (2025-10-23)

- Scenario `dotnet_multirid_fixture` exercises the .NET analyzer against the multi-RID test fixture that merges two applications and four runtime identifiers. Latest baseline run (Release build, 5 iterations) records a mean duration of **29.19 ms** (p95 106.62 ms, max 132.30 ms) with a stable component count of 2.
- Syft v1.29.1 scanning the same fixture (`syft scan dir:…`) averaged **1 546 ms** (p95 ≈2 100 ms, max ≈2 100 ms) while also reporting duplicate packages; raw numbers captured in `dotnet/syft-comparison-20251023.csv`.
- The new scenario is declared in `bench/Scanner.Analyzers/config.json`; rerun the bench command above after rebuilding analyzers to refresh baselines and comparison data.

## Sprint LA2 — Python Analyzer Benchmark Notes (2025-10-23)

- Added two Python scenarios to `config.json`: the virtualenv sample (`python_site_packages_scan`) and the RECORD-heavy pip cache fixture (`python_pip_cache_fixture`).
- Baseline run (Release build, 5 iterations) records means of **5.64 ms** (p95 18.29 ms) for the virtualenv and **5.86 ms** (p95 13.29 ms) for the pip cache verifier; raw numbers stored in `python/hash-throughput-20251023.csv`.
- The pip cache fixture exercises `PythonRecordVerifier` with 12 RECORD rows (7 hashed) and mismatched layer coverage, giving a repeatable hash-validation throughput reference for regression gating.