# Scanner Analyzer Benchmarks – Operations Guide

## Purpose

Keep the language analyzer microbench under the < 5 s SBOM pledge. CI emits Prometheus metrics and JSON fixtures so trend dashboards and alerts stay in lockstep with the repository baseline.

> **Grafana note:** Import `docs/ops/scanner-analyzers-grafana-dashboard.json` into your Prometheus-backed Grafana stack to monitor `scanner_analyzer_bench_*` metrics and alert on regressions.

## Publishing workflow

1. CI (or engineers running locally) executes:

   ```bash
   dotnet run \
     --project src/StellaOps.Bench/Scanner.Analyzers/StellaOps.Bench.ScannerAnalyzers/StellaOps.Bench.ScannerAnalyzers.csproj \
     -- \
     --repo-root . \
     --out src/StellaOps.Bench/Scanner.Analyzers/baseline.csv \
     --json out/bench/scanner-analyzers/latest.json \
     --prom out/bench/scanner-analyzers/latest.prom \
     --commit "$(git rev-parse HEAD)" \
     --environment "${CI_ENVIRONMENT_NAME:-local}"
   ```

2. Publish the artefacts (`baseline.csv`, `latest.json`, `latest.prom`) to `bench-artifacts//`.
3. Promtail (or the CI job) pushes `latest.prom` into Prometheus; the JSON lands in long-term storage for workbook snapshots.
4. The harness exits non-zero if:
   - `max_ms` for any scenario breaches its configured threshold; or
   - `max_ms` regresses ≥ 20 % versus `baseline.csv`.

## Grafana dashboard

- Import `docs/ops/scanner-analyzers-grafana-dashboard.json`.
- Point the template variable `datasource` to the Prometheus instance ingesting `scanner_analyzer_bench_*` metrics.
- Panels:
  - **Max Duration (ms)** – compares live runs against the baseline.
  - **Regression Ratio vs Limit** – plots `(max / baseline_max - 1) * 100`.
  - **Breached Scenarios** – stat panel sourced from `scanner_analyzer_bench_regression_breached`.

## Alerting & on-call response

- **Primary alert**: fire when `scanner_analyzer_bench_regression_ratio{scenario=~".+"} >= 1.20` for 2 consecutive samples (10 min default).
  Suggested PromQL:

  ```
  max_over_time(scanner_analyzer_bench_regression_ratio[10m]) >= 1.20
  ```

- Suppress duplicates using the `scenario` label.
- The pager payload should include `scenario`, `max_ms`, `baseline_max_ms`, and `commit`.
- Immediate triage steps:
  1. Check the `latest.json` artefact for the failing scenario – confirm the commit and environment.
  2. Re-run the harness with `--captured-at` and `--baseline` pointing at the last known good CSV to verify determinism.
  3. If the regression persists, open an incident ticket tagged `scanner-analyzer-perf` and page the owning language guild.
  4. Roll back the offending change, or update the baseline after sign-off from the guild lead and the Perf captain.

Document the outcome in `docs/12_PERFORMANCE_WORKBOOK.md` (section 8) so trendlines reflect any accepted regressions.
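The ≥ 20 % regression gate that the harness enforces can be sketched as a standalone shell check, which is handy for reproducing a verdict during triage. This is an illustrative sketch only: the `check_regression` helper and the assumed `scenario,max_ms` CSV layout are hypothetical, not the harness's real internals.

```shell
#!/usr/bin/env sh
# Hypothetical sketch of the 20 % regression gate.
# Assumed baseline CSV columns: scenario,max_ms (illustrative layout).

check_regression() {
  scenario="$1"; live_max_ms="$2"; baseline_csv="$3"
  # Look up the recorded baseline max for this scenario.
  baseline_max_ms=$(awk -F, -v s="$scenario" '$1 == s { print $2 }' "$baseline_csv")
  # Exit 1 (fail) when the live max regresses >= 20 % versus baseline.
  awk -v live="$live_max_ms" -v base="$baseline_max_ms" \
      'BEGIN { exit (live >= base * 1.20) ? 1 : 0 }'
}
```

With a baseline row `node_modules_walk,4200`, a live `max_ms` of 4100 passes, while 5100 (a ~21 % regression) fails, mirroring the non-zero exit the CI harness produces.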