git.stella-ops.org/docs/benchmarks/graph/bench-graph-21-002-prep.md
2025-11-21 06:56:36 +00:00

Bench Prep — PREP-BENCH-GRAPH-21-002 (UI headless graph benchmarks)

Status: Ready for implementation (2025-11-20)
Owners: Bench Guild · UI Guild
Scope: Define the Playwright-based UI benchmark that rides on the graph harness from BENCH-GRAPH-21-001 (50k/100k node fixtures) and produces deterministic latency/FPS metrics.

Dependencies

  • Harness + fixtures from BENCH-GRAPH-21-001 (must expose HTTP endpoints and data seeds for 50k/100k graphs).
  • Graph API/Indexer stable query contract (per docs/modules/graph/architecture.md).

Benchmark plan

  • Runner: Playwright (Chromium, headless) driven via src/Bench/StellaOps.Bench.GraphUi.
  • Environment:
    • Viewport: 1920x1080, device scale factor 1.0, throttling disabled; CPU throttling turned off via --disable-features=CPUThrottling.
    • Fixed session seed GRAPH_BENCH_SEED=2025-01-01T00:00:00Z for RNG use in camera jitter.
  • Scenarios (each repeated 5x, median + p95 recorded):
    1. Canvas load: open /graph/bench?fixture=50k → measure TTI, first contentful paint, tiles loaded count.
    2. Pan/zoom loop: pan 500 px × 20 iterations, then zoom in and out (2x each) → record average FPS and frame jank percentage.
    3. Path query: submit shortest-path query between two seeded nodes → measure query latency (client + API) and render latency.
    4. Filter drill-down: apply two filters (severity=high, product=“core”) → measure time to filtered render + memory delta.
  • Metrics captured to NDJSON per run:
    • timestampUtc, scenario, fixture, p95_ms, median_ms, avg_fps, jank_pct, mem_mb, api_latency_ms (where applicable).
  • Determinism:
    • All timestamps are recorded in UTC ISO-8601; the RNG is seeded; the cache is cleared before each scenario; Chromium is launched with --disable-features=UseAFH to avoid adaptive throttling.
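
The aggregation behind each NDJSON record can be sketched as follows. This is a minimal illustration, not the harness code: only the field names come from the metrics list above, while the helper names (`median`, `percentile`, `jankPct`, `avgFps`, `toNdjsonLine`), the nearest-rank percentile method, and the 25 ms jank budget are assumptions made for the example.

```typescript
// Field names follow the metrics list; everything else here is hypothetical.
interface BenchRecord {
  timestampUtc: string;
  scenario: string;
  fixture: string;
  p95_ms: number;
  median_ms: number;
  avg_fps?: number;
  jank_pct?: number;
  mem_mb?: number;
  api_latency_ms?: number;
}

function sortedCopy(samples: number[]): number[] {
  if (samples.length === 0) throw new Error("no samples");
  return [...samples].sort((a, b) => a - b);
}

// Median of the samples (mean of the two middle values for even counts).
function median(samples: number[]): number {
  const s = sortedCopy(samples);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Nearest-rank percentile, p in (0, 100]. Assumed method, not from the plan.
function percentile(samples: number[], p: number): number {
  const s = sortedCopy(samples);
  const rank = Math.ceil((p / 100) * s.length);
  return s[Math.min(s.length, Math.max(1, rank)) - 1];
}

// Percentage of frames slower than the budget (25 ms is an assumed threshold).
function jankPct(frameMs: number[], budgetMs = 25): number {
  if (frameMs.length === 0) throw new Error("no frames");
  const slow = frameMs.filter((ms) => ms > budgetMs).length;
  return (slow / frameMs.length) * 100;
}

// Average FPS derived from the mean frame duration in milliseconds.
function avgFps(frameMs: number[]): number {
  if (frameMs.length === 0) throw new Error("no frames");
  const mean = frameMs.reduce((sum, ms) => sum + ms, 0) / frameMs.length;
  return 1000 / mean;
}

// One JSON object per line; key order follows object construction order.
function toNdjsonLine(record: BenchRecord): string {
  return JSON.stringify(record) + "\n";
}
```

With only five repetitions per scenario, a nearest-rank p95 is the slowest run, so the p95 column effectively reports the maximum unless more repetitions are collected.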

Outputs

  • NDJSON benchmark results stored under out/bench/graph/ui/{runId}.ndjson with a .sha256 alongside.
  • An optional summary CSV may be derived from the NDJSON for reporting only.
  • CI step publishes artifacts to out/bench/graph/ui/latest/ with write-once semantics per runId.
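
A minimal sketch of how the results file and its hash could be written with write-once semantics, assuming a Node.js runner. The function names and the `sha256sum`-style sidecar layout (`<hex>  <file>`) are assumptions for illustration; the plan above only requires a .sha256 file alongside the NDJSON.

```typescript
import { createHash } from "node:crypto";
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hex-encoded SHA-256 of the exact bytes that will be written.
function sha256Hex(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Sidecar content in the layout used by `sha256sum` (assumed, not specified above).
function sidecarLine(digest: string, fileName: string): string {
  return `${digest}  ${fileName}\n`;
}

// Hypothetical helper: writes {runId}.ndjson and {runId}.ndjson.sha256.
// The "wx" flag makes each write fail with EEXIST if the file already exists,
// which gives write-once behaviour per runId.
function writeRunArtifacts(outDir: string, runId: string, ndjson: string): string {
  mkdirSync(outDir, { recursive: true });
  const fileName = `${runId}.ndjson`;
  const digest = sha256Hex(ndjson);
  writeFileSync(join(outDir, fileName), ndjson, { encoding: "utf8", flag: "wx" });
  writeFileSync(join(outDir, `${fileName}.sha256`), sidecarLine(digest, fileName), {
    encoding: "utf8",
    flag: "wx",
  });
  return digest;
}
```

Calling `writeRunArtifacts("out/bench/graph/ui", runId, ndjson)` a second time with the same runId throws instead of overwriting, so a rerun has to use a new runId.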

Acceptance criteria

  • Playwright suite reproducibly exercises the four scenarios on 50k and 100k fixtures with seeded inputs.
  • Metrics include p95 and median for each scenario and fixture size; FPS ≥ 30 on 50k fixture baseline.
  • Archived outputs are deterministic for a given fixture and seed: filenames must not contain wall-clock timestamps, and timestamps are embedded only in the file content.
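
The seeded-input requirement can be illustrated with a small deterministic generator for camera jitter. This is a sketch under stated assumptions: the plan fixes the value of GRAPH_BENCH_SEED but not how it feeds the RNG, so the FNV-1a string hash, the mulberry32 generator, and the `cameraJitter` helper are illustrative choices, not the harness's actual method.

```typescript
// FNV-1a (32-bit) over UTF-16 code units: turns the seed string into a number.
function seedFromString(seed: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < seed.length; i++) {
    h ^= seed.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// mulberry32: a small PRNG returning floats in [0, 1).
// The same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: a jitter offset in [-maxPx, maxPx) on each axis.
function cameraJitter(rng: () => number, maxPx: number): { dx: number; dy: number } {
  return {
    dx: (rng() * 2 - 1) * maxPx,
    dy: (rng() * 2 - 1) * maxPx,
  };
}

// The seed value below is the one fixed in the benchmark plan.
const rng = mulberry32(seedFromString("2025-01-01T00:00:00Z"));
const firstJitter = cameraJitter(rng, 4);
```

Because `Math.random()` cannot be seeded, routing every random choice in the scenarios through one generator like this is what keeps pan and zoom inputs identical between runs.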

Next steps

  • Wire the Playwright harness into the BENCH-GRAPH-21-001 pipeline once the fixtures are ready.
  • Hook results into perf dashboard if available; otherwise store NDJSON + hashes.