Bench Prep — PREP-BENCH-GRAPH-21-002 (UI headless graph benchmarks)
Status: Ready for implementation (2025-11-20)
Owners: Bench Guild · UI Guild
Scope: Define the Playwright-based UI benchmark that rides on the graph harness from BENCH-GRAPH-21-001 (50k/100k node fixtures) and produces deterministic latency/FPS metrics.
Dependencies
- Harness + fixtures from BENCH-GRAPH-21-001 (must expose HTTP endpoints and data seeds for 50k/100k graphs).
- Graph API/Indexer stable query contract (per docs/modules/graph/architecture.md).
Benchmark plan
- Runner: Playwright (Chromium, headless) driven via src/Bench/StellaOps.Bench.GraphUi.
- Environment:
  - Viewport: 1920x1080, device scale 1.0, throttling disabled; CPU pinned via --disable-features=CPUThrottling.
  - Fixed session seed GRAPH_BENCH_SEED=2025-01-01T00:00:00Z for RNG use in camera jitter.
- Scenarios (each repeated 5x, median + p95 recorded):
  - Canvas load: open /graph/bench?fixture=50k → measure TTI, first contentful paint, tiles loaded count.
  - Pan/zoom loop: pan 500px x 20 iterations + zoom in/out (2x each) → record average FPS and frame jank percentage.
  - Path query: submit shortest-path query between two seeded nodes → measure query latency (client + API) and render latency.
  - Filter drill-down: apply two filters (severity=high, product=“core”) → measure time to filtered render + memory delta.
- Metrics captured to NDJSON per run: timestampUtc, scenario, fixture, p95_ms, median_ms, avg_fps, jank_pct, mem_mb, api_latency_ms (where applicable).
- Determinism:
  - All timestamps recorded in UTC ISO-8601; RNG seeded; cache cleared before each scenario; --disable-features=UseAFH disabled to avoid adaptive throttling.
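The seeding and metric bullets above can be sketched in a few pure functions. This is a minimal illustration, not the harness implementation: the function names, the 5 px jitter bound, the 60 Hz frame budget, the 1.5x jank threshold, and the nearest-rank p95 method are all assumptions that the plan does not specify.

```python
import math
import random
import statistics

FRAME_BUDGET_MS = 1000.0 / 60.0  # assumed 60 Hz target; not stated in the plan


def jitter_offsets(seed: str, count: int, max_px: float = 5.0) -> list[tuple[float, float]]:
    """Return `count` (dx, dy) camera jitter offsets derived only from the seed."""
    # A str seed is hashed deterministically, so the sequence repeats across runs.
    rng = random.Random(seed)
    return [(rng.uniform(-max_px, max_px), rng.uniform(-max_px, max_px)) for _ in range(count)]


def median_and_p95(samples_ms: list[float]) -> tuple[float, float]:
    """Median plus nearest-rank p95 of the repeated measurements."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank, 1-based
    return statistics.median(ordered), ordered[rank - 1]


def fps_and_jank(frame_times_ms: list[float]) -> tuple[float, float]:
    """Average FPS and percentage of frames slower than 1.5x the frame budget."""
    deltas = [b - a for a, b in zip(frame_times_ms, frame_times_ms[1:])]
    avg_fps = 1000.0 * len(deltas) / sum(deltas)
    janky = sum(1 for d in deltas if d > 1.5 * FRAME_BUDGET_MS)
    return avg_fps, 100.0 * janky / len(deltas)
```

With only five repetitions per scenario, a nearest-rank p95 equals the slowest sample, so the recorded p95_ms is effectively the worst run. If that is too noisy, the repetition count or the percentile method would need to change.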
Outputs
- NDJSON benchmark results stored under out/bench/graph/ui/{runId}.ndjson with a .sha256 alongside.
- Summary CSV optional, derived from NDJSON for reporting only.
- CI step publishes artifacts to out/bench/graph/ui/latest/ with write-once semantics per runId.
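One way to produce the NDJSON file, its hash, and the write-once behaviour is sketched below. The helper name write_run, the sidecar name {runId}.ndjson.sha256, and the sha256sum-style sidecar line are assumptions; the plan only states that a .sha256 sits alongside the results.

```python
import hashlib
import json
from pathlib import Path


def write_run(out_dir: Path, run_id: str, records: list[dict]) -> str:
    """Write {run_id}.ndjson plus a SHA-256 sidecar; refuse to overwrite either."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Sorted keys and compact separators keep the bytes stable for equal records.
    lines = [json.dumps(r, sort_keys=True, separators=(",", ":")) for r in records]
    payload = "".join(line + "\n" for line in lines).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()

    ndjson_path = out_dir / f"{run_id}.ndjson"
    sha_path = out_dir / f"{run_id}.ndjson.sha256"  # sidecar name is an assumption
    # Mode "x" raises FileExistsError if the file exists: write-once per run_id.
    with open(ndjson_path, "xb") as fh:
        fh.write(payload)
    with open(sha_path, "x", encoding="utf-8") as fh:
        fh.write(f"{digest}  {ndjson_path.name}\n")
    return digest
```

Exclusive-create mode makes a repeated runId fail loudly instead of silently replacing earlier results, which is the simplest reading of "write-once semantics per runId".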
Acceptance criteria
- Playwright suite reproducibly exercises the four scenarios on 50k and 100k fixtures with seeded inputs.
- Metrics include p95 and median for each scenario and fixture size; average FPS ≥ 30 on the 50k fixture baseline.
- Archive outputs are deterministic for a given fixture and seed (no wall-clock timestamps in filenames; timestamps are embedded only in content).
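The criteria above could be checked mechanically against a run's NDJSON. The sketch below is illustrative only: the scenario identifiers are hypothetical (the plan names the scenarios but not their ids), and it assumes the FPS floor applies to avg_fps of the pan/zoom scenario on the 50k fixture.

```python
import json

# Hypothetical scenario ids; the plan does not define the exact strings.
SCENARIOS = ("canvas-load", "pan-zoom", "path-query", "filter-drilldown")
FIXTURES = ("50k", "100k")
MIN_FPS_50K = 30.0


def check_acceptance(ndjson_text: str) -> list[str]:
    """Return human-readable failures; an empty list means the run passes."""
    records = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
    by_key = {(r["scenario"], r["fixture"]): r for r in records}
    failures = []
    for scenario in SCENARIOS:
        for fixture in FIXTURES:
            rec = by_key.get((scenario, fixture))
            if rec is None:
                failures.append(f"missing {scenario}/{fixture}")
                continue
            for field in ("median_ms", "p95_ms"):
                if rec.get(field) is None:
                    failures.append(f"{scenario}/{fixture} lacks {field}")
    # Assumed reading of the FPS criterion: pan/zoom average on the 50k fixture.
    pan = by_key.get(("pan-zoom", "50k"))
    if pan is not None:
        fps = pan.get("avg_fps")
        if fps is None or fps < MIN_FPS_50K:
            failures.append(f"pan-zoom/50k avg_fps {fps} below {MIN_FPS_50K}")
    return failures
```

Returning a list of failures rather than raising on the first one lets a CI step report every gap in a single pass.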
Next steps
- Wire Playwright harness into BENCH-GRAPH-21-001 pipeline once fixtures ready.
- Hook results into perf dashboard if available; otherwise store NDJSON + hashes.