
Scanner Analyzer Microbench Harness

The bench harness exercises the language analyzers against representative filesystem layouts so that regressions are caught before they ship.

Layout

  • StellaOps.Bench.ScannerAnalyzers/: .NET 10 console harness that executes the real language analyzers (and fallback metadata walks for ecosystems whose analyzers are not yet available).
  • config.json: Declarative list of scenarios the harness executes. Each scenario points at a directory in samples/.
  • baseline.csv: Reference numbers captured on the 4 vCPU warm rig described in docs/12_PERFORMANCE_WORKBOOK.md. CI publishes fresh CSVs so perf trends stay visible.

Current scenarios

  • node_monorepo_walk → runs the Node analyzer across samples/runtime/npm-monorepo.
  • java_demo_archive → runs the Java analyzer against samples/runtime/java-demo/libs/demo.jar.
  • python_site_packages_walk → temporary metadata walk over samples/runtime/python-venv until the Python analyzer lands.

Running locally

dotnet run \
  --project bench/Scanner.Analyzers/StellaOps.Bench.ScannerAnalyzers/StellaOps.Bench.ScannerAnalyzers.csproj \
  -- \
  --repo-root . \
  --out bench/Scanner.Analyzers/baseline.csv \
  --json out/bench/scanner-analyzers/latest.json \
  --prom out/bench/scanner-analyzers/latest.prom \
  --commit "$(git rev-parse HEAD)"

The harness prints a table to stdout and writes the CSV (if --out is specified) with the following headers:

scenario,iterations,sample_count,mean_ms,p95_ms,max_ms
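
For downstream tooling, the CSV can be read with a few lines of Python. The following is a minimal sketch, not part of the harness: the column names come from the header above, while the helper name load_bench_csv, the numeric types (integers for the counts, decimals for the millisecond columns), and the sample row are assumptions made for illustration.

```python
import csv
import io

# Column names taken from the header line documented above.
EXPECTED_COLUMNS = ["scenario", "iterations", "sample_count", "mean_ms", "p95_ms", "max_ms"]


def load_bench_csv(text):
    """Parse harness CSV text into a dict keyed by scenario id (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = {}
    for row in reader:
        rows[row["scenario"]] = {
            # Assumption: the two count columns are integers.
            "iterations": int(row["iterations"]),
            "sample_count": int(row["sample_count"]),
            # Assumption: the timing columns are decimal milliseconds.
            "mean_ms": float(row["mean_ms"]),
            "p95_ms": float(row["p95_ms"]),
            "max_ms": float(row["max_ms"]),
        }
    return rows


# Illustrative numbers only; they are not measurements from the real baseline.
SAMPLE = """scenario,iterations,sample_count,mean_ms,p95_ms,max_ms
node_monorepo_walk,5,4,10.5,12.0,13.25
"""

if __name__ == "__main__":
    for scenario, stats in load_bench_csv(SAMPLE).items():
        print(scenario, stats["max_ms"])
```

To read a real run, pass the contents of the file written by --out instead of SAMPLE.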

Additional outputs:

  • --json emits a deterministic report consumable by Grafana/automation (schema 1.0, see docs/12_PERFORMANCE_WORKBOOK.md).
  • --prom exports Prometheus-compatible gauges (scanner_analyzer_bench_*), which CI uploads for dashboards and alerts.

Use --iterations to override the default of 5 passes per scenario and --threshold-ms to customize the failure budget. Budgets default to 5000 ms unless a scenario overrides the value in config.json, in line with the SBOM compose objective. Pass --baseline path/to/baseline.csv (defaults to the repo baseline) to compare against historical numbers. The run fails if max_ms regresses by 20% or more against the baseline, or if a scenario breaches its configured threshold.
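
The two failure conditions described above can be sketched in Python as follows. This is an illustration of the documented rules, not the harness's actual code. The function name evaluate_scenario is hypothetical, and two details are assumptions: that the budget is checked against max_ms, and that a breach means strictly exceeding the budget. The 5000 ms budget and the 1.20 ratio are the documented defaults.

```python
def evaluate_scenario(max_ms, baseline_max_ms=None, threshold_ms=5000.0, regression_limit=1.20):
    """Return a list of failure reasons; an empty list means the scenario passes."""
    failures = []

    # Assumption: the failure budget is compared against max_ms, and only a
    # value strictly above the budget counts as a breach.
    if max_ms > threshold_ms:
        failures.append(f"max_ms {max_ms:.2f} exceeds budget {threshold_ms:.2f}")

    # Regression guard: compare against the baseline only when one is available.
    if baseline_max_ms is not None and baseline_max_ms > 0:
        ratio = max_ms / baseline_max_ms
        # A ratio of 1.20 corresponds to a +20% regression on max_ms.
        if ratio >= regression_limit:
            failures.append(f"max_ms ratio {ratio:.2f} reaches limit {regression_limit:.2f}")

    return failures


if __name__ == "__main__":
    print(evaluate_scenario(110.0, baseline_max_ms=100.0))  # within both limits
    print(evaluate_scenario(130.0, baseline_max_ms=100.0))  # regression only
    print(evaluate_scenario(6000.0))                        # budget breach only
```

Keeping the two checks separate means a scenario can fail for either reason or for both.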

Metadata options:

  • --captured-at 2025-10-23T12:00:00Z injects a deterministic timestamp (otherwise the harness uses UtcNow).
  • --commit and --environment annotate the JSON report for dashboards.
  • --regression-limit 1.15 adjusts the ratio guard (default 1.20 ⇒ +20%).

Adding scenarios

  1. Drop the fixture tree under samples/<area>/....
  2. Append a new scenario entry to config.json describing:
    • id: snake_case scenario name (also used in CSV).
    • label: human-friendly description shown in logs.
    • root: path to the directory that will be scanned.
    • For analyzer-backed scenarios, set analyzers to the list of language analyzer ids (for example, ["node"]).
    • For temporary metadata walks (used until the analyzer ships), provide parser (node or python) and the matcher glob describing files to parse.
  3. Re-run the harness (dotnet run … --out baseline.csv --json out/.../new.json --prom out/.../new.prom).
  4. Commit both the fixture and updated baseline.
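
The scenario fields listed in step 2 can be sketched as follows. This is a hedged illustration only: the keys id, label, root, analyzers, parser, and matcher come from the list above, but the helper names, the label strings, the matcher glob, and the exact JSON shape the harness expects (including how entries are wrapped inside config.json) are assumptions. Check the existing config.json before copying any of it.

```python
import json
import re

# snake_case as described for the id field: lowercase words joined by underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def make_analyzer_scenario(scenario_id, label, root, analyzers):
    """Build an analyzer-backed scenario entry (hypothetical helper)."""
    if not SNAKE_CASE.match(scenario_id):
        raise ValueError(f"id must be snake_case: {scenario_id!r}")
    return {"id": scenario_id, "label": label, "root": root, "analyzers": list(analyzers)}


def make_metadata_walk_scenario(scenario_id, label, root, parser, matcher):
    """Build a temporary metadata-walk scenario entry (hypothetical helper)."""
    if not SNAKE_CASE.match(scenario_id):
        raise ValueError(f"id must be snake_case: {scenario_id!r}")
    # The README names node and python as the available parsers.
    if parser not in ("node", "python"):
        raise ValueError(f"unsupported parser: {parser!r}")
    return {"id": scenario_id, "label": label, "root": root, "parser": parser, "matcher": matcher}


if __name__ == "__main__":
    entries = [
        # Label text is illustrative, not copied from the real config.json.
        make_analyzer_scenario(
            "node_monorepo_walk", "Node monorepo walk", "samples/runtime/npm-monorepo", ["node"]
        ),
        # The matcher glob is a guess at a plausible pattern, not the real value.
        make_metadata_walk_scenario(
            "python_site_packages_walk",
            "Python site-packages walk",
            "samples/runtime/python-venv",
            "python",
            "**/*.dist-info/METADATA",
        ),
    ]
    print(json.dumps(entries, indent=2))
```

An entry carries either analyzers or the parser and matcher pair, matching the two scenario kinds described in step 2.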