feat: Add UI benchmark driver and scenarios for graph interactions

- Introduced `ui_bench_driver.mjs` to read scenarios and fixture manifest, generating a deterministic run plan.
- Created `ui_bench_plan.md` outlining the purpose, scope, and next steps for the benchmark.
- Added `ui_bench_scenarios.json` containing various scenarios for graph UI interactions.
- Implemented tests for CLI commands, ensuring bundle verification and telemetry defaults.
- Developed schemas for orchestrator components, including replay manifests and event envelopes.
- Added mock API for risk management, including listing and statistics functionalities.
- Implemented models for risk profiles and query options to support the new API.
StellaOps Bot
2025-12-02 01:28:17 +02:00
parent 909d9b6220
commit 44171930ff
94 changed files with 3606 additions and 271 deletions


@@ -0,0 +1,80 @@
# Time-to-Evidence (TTE) Metric
Compiled: 2025-12-01 (UTC)
## What it is
**Definition:** `TTE = t_first_proof_rendered - t_open_finding`.
**Proof** = the exact artifact or path that justifies the claim (e.g., `package-lock.json: line 214 → openssl@1.1.1`, `reachability: A → B → C sink`, or `VEX: not_affected due to unreachable code`).
**Target:** **P95 ≤ 15s** (stretch: **P99 ≤ 30s**). If 95% of findings show proof within 15 seconds, the UI stays honest: evidence before opinion, low noise, fast explainability.
## Why it matters
- **Trust:** People accept decisions they can verify quickly.
- **Triage speed:** Proof-first UIs cut back-and-forth and guesswork.
- **Noise control:** If you can't surface proof fast, you probably shouldn't surface the finding yet.
## How to measure (engineering-ready)
Emit two stamps per finding view:
- `t_open_finding` (on route enter or modal open).
- `t_first_proof_rendered` (first DOM paint of SBOM line / path list / VEX clause).
Store as `tte_ms` in a lightweight events table (Postgres) with tags: `tenant`, `finding_id`, `proof_kind` (`sbom|reachability|vex`), `source` (`local|remote|cache`).
Nightly rollup: compute P50/P90/P95/P99 by proof_kind and page. Alert when **P95 > 15s** for 15 minutes.
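The two stamps above reduce to a single `tte_ms` value per finding view. The following is a minimal sketch of that reduction, not the project's implementation: `TteTracker`, `TteEvent`, and the injected clock are hypothetical names introduced here for illustration. It assumes only the first proof render per open should count, so later renders for the same finding are ignored.

```ts
// Hypothetical helper: TteTracker and TteEvent are illustrative names,
// not part of the existing codebase.
type ProofKind = 'sbom' | 'reachability' | 'vex';

interface TteEvent {
  findingId: string;
  proofKind: ProofKind;
  tteMs: number;
}

class TteTracker {
  private readonly openedAt = new Map<string, number>();
  private readonly now: () => number;

  // The clock is injected so the logic can be exercised without a browser.
  constructor(now: () => number) {
    this.now = now;
  }

  // Corresponds to t_open_finding (route enter or modal open).
  markOpen(findingId: string): void {
    this.openedAt.set(findingId, this.now());
  }

  // Corresponds to t_first_proof_rendered. Returns undefined when there is
  // no pending open for this finding, so only the first render is counted.
  markProofRendered(findingId: string, proofKind: ProofKind): TteEvent | undefined {
    const start = this.openedAt.get(findingId);
    if (start === undefined) return undefined;
    this.openedAt.delete(findingId);
    return { findingId, proofKind, tteMs: Math.round(this.now() - start) };
  }
}
```

In a browser, the tracker could be constructed with `() => performance.now()`, and the returned event forwarded to whatever writes the events table along with the `tenant` and `source` tags.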
## UI contract (keeps the UX honest)
- **Above the fold:** always show a compact **Proof panel** first (not hidden behind tabs).
- **Skeletons over spinners:** reserve space; render partial proof as soon as any piece is ready.
- **Plain text copy affordance:** “Copy SBOM line / path” button right next to the proof.
- **Defer non-proof widgets:** CVSS badges, remediation prose, and charts load *after* proof.
- **Empty-state truth:** if no proof exists, say “No proof available yet” and show the loader for *that* proof type only (don't pretend with summaries).
## Backend rules of thumb
- **Pre-index for first paint:** cache top N proof items per hot finding (e.g., first SBOM hit + shortest path).
- **Bound queries:** proof queries must be *O(log n)* on indexed columns (pkg name@version, file hash, graph node id).
- **Chunked streaming:** send first proof chunk <200 ms after backend hit; don't hold for the full set.
- **Timeout budget:** 12s backend budget + 3s UI/render margin = 15s P95.
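The chunked-streaming rule can be sketched as a generator that emits a small first chunk, then larger ones. This shows the chunking only and assumes the proof items are already ordered by relevance. `chunkProofs` and its parameters are hypothetical, and the transport (server-sent events, NDJSON, or similar) is out of scope.

```ts
// Hypothetical helper: chunkProofs is an illustrative name, not an existing API.
// Yields a small first chunk so the UI can paint proof early, then
// fixed-size chunks for the remainder.
function* chunkProofs<T>(
  items: T[],
  firstChunkSize: number,
  chunkSize: number,
): Generator<T[]> {
  if (firstChunkSize < 1 || chunkSize < 1) {
    throw new RangeError('chunk sizes must be >= 1');
  }
  if (items.length === 0) return;

  // First chunk: kept small so it can be sent without waiting for the full set.
  yield items.slice(0, firstChunkSize);

  // Remaining items: sent in larger batches.
  for (let i = firstChunkSize; i < items.length; i += chunkSize) {
    yield items.slice(i, i + chunkSize);
  }
}
```

For example, `Array.from(chunkProofs(['a', 'b', 'c', 'd', 'e', 'f'], 1, 2))` produces four chunks: one item first, then two, two, and one.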
## Minimal contract to add in your code
```ts
// Frontend: fire on open
metrics.emit('finding_open', { findingId, t: performance.now() });
// When the first real proof node/line hits the DOM:
metrics.emit('proof_rendered', { findingId, proofKind, t: performance.now() });
```
```sql
-- Rollup (hourly)
SELECT
proof_kind,
percentile_cont(0.95) WITHIN GROUP (ORDER BY tte_ms) AS p95_ms
FROM tte_events
WHERE ts >= now() - interval '1 hour'
GROUP BY proof_kind;
```
## What to put on the team dashboard
- **TTE P95 by page** (Findings list, Finding details).
- **TTE P95 by proof_kind** (sbom / reachability / vex).
- **Error budget burn**: minutes over target per day.
- **Top regressions**: last 7 days vs prior 7.
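The "error budget burn" figure can be derived from per-minute buckets of `tte_ms` samples. The sketch below assumes one bucket per minute of the day and counts the minutes whose P95 exceeds the target. `percentileCont` and `minutesOverTarget` are hypothetical helpers. The percentile uses linear interpolation, the method Postgres `percentile_cont` applies, so dashboard numbers should line up with the SQL rollup.

```ts
// Hypothetical helpers; names are illustrative, not from the codebase.

// Linear-interpolation percentile over raw samples (p in [0, 1]).
// Returns undefined for an empty sample set.
function percentileCont(values: number[], p: number): number | undefined {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = p * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
}

// Counts the minutes whose P95 exceeds the target.
// Assumption: perMinuteTteMs holds one array of tte_ms samples per minute;
// minutes with no samples are skipped rather than counted as burn.
function minutesOverTarget(perMinuteTteMs: number[][], targetMs: number): number {
  let burned = 0;
  for (const bucket of perMinuteTteMs) {
    const p95 = percentileCont(bucket, 0.95);
    if (p95 !== undefined && p95 > targetMs) burned += 1;
  }
  return burned;
}
```

Whether empty minutes should count toward the budget is a policy choice. Skipping them, as here, avoids penalising quiet periods.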
## Acceptance checklist for any finding view
- [ ] First paint shows a real proof snippet (not a summary).
- [ ] Copy proof button works within 1 click.
- [ ] TTE P95 in staging ≤ 10s; in prod ≤ 15s.
- [ ] If proof missing, explicit empty-state + retry path.
- [ ] Telemetry sampled at ≥ 50% of sessions (or 100% for internal).
## Ready-to-drop implementation notes