Here’s a simple metric that will make your security UI (and teams) radically better: Time‑to‑Evidence (TTE) — the time from opening a finding to seeing raw proof (a data‑flow edge, an SBOM line, or a VEX note), not a summary.
What it is
- Definition: TTE = t_first_proof_rendered − t_open_finding.
- Proof = the exact artifact or path that justifies the claim (e.g., package-lock.json: line 214 → openssl@1.1.1, reachability: A → B → C sink, or VEX: not_affected due to unreachable code).
- Target: P95 ≤ 15s (stretch: P99 ≤ 30s). If 95% of findings show proof within 15 seconds, the UI stays honest: evidence before opinion, low noise, fast explainability.
Why it matters
- Trust: People accept decisions they can verify quickly.
- Triage speed: Proof-first UIs cut back-and-forth and guesswork.
- Noise control: If you can’t surface proof fast, you probably shouldn’t surface the finding yet.
How to measure (engineering‑ready)
- Emit two stamps per finding view:
  - t_open_finding (on route enter or modal open).
  - t_first_proof_rendered (first DOM paint of SBOM line / path list / VEX clause).
- Store as tte_ms in a lightweight events table (Postgres) with tags: tenant, finding_id, proof_kind (sbom|reachability|vex), source (local|remote|cache).
- Nightly rollup: compute P50/P90/P95/P99 by proof_kind and page.
- Alert when P95 > 15s for 15 minutes.
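A minimal sketch of that alert check, assuming a Postgres tte_events table with tte_ms, proof_kind, and ts columns (matching the rollup query further down) and the node-postgres client; the console.warn is a stand-in for your real alerting hook:
// check-tte-slo.ts: run every few minutes (cron, worker, etc.)
import { Client } from 'pg';

const P95_TARGET_MS = 15_000;

export async function checkTteSlo(connectionString: string): Promise<void> {
  const client = new Client({ connectionString });
  await client.connect();
  try {
    // P95 of tte_ms over the last 15 minutes, per proof kind
    const { rows } = await client.query(
      `SELECT proof_kind,
              percentile_cont(0.95) WITHIN GROUP (ORDER BY tte_ms) AS p95_ms
         FROM tte_events
        WHERE ts >= now() - interval '15 minutes'
        GROUP BY proof_kind`
    );
    for (const row of rows) {
      if (Number(row.p95_ms) > P95_TARGET_MS) {
        // Stand-in for a real alerting hook (PagerDuty, Slack webhook, ...)
        console.warn(`TTE P95 breach for ${row.proof_kind}: ${row.p95_ms} ms`);
      }
    }
  } finally {
    await client.end();
  }
}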
UI contract (keeps the UX honest)
- Above the fold: always show a compact Proof panel first (not hidden behind tabs).
- Skeletons over spinners: reserve space; render partial proof as soon as any piece is ready.
- Plain text copy affordance: “Copy SBOM line / path” button right next to the proof.
- Defer non‑proof widgets: CVSS badges, remediation prose, and charts load after proof.
- Empty‑state truth: if no proof exists, say “No proof available yet” and show the loader for that proof type only (don’t pretend with summaries).
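To make the contract concrete, here is a minimal sketch of a proof panel that follows these rules, assuming React; the ProofItem shape, the minHeight, and the copy label are illustrative, not prescriptive:
// ProofPanel.tsx: proof first, skeleton over spinner, honest empty state
import React from 'react';

interface ProofItem { id: string; text: string }

export function ProofPanel({ items, isLoading }: { items: ProofItem[]; isLoading: boolean }) {
  if (items.length === 0 && isLoading) {
    // Reserve the space the evidence will occupy; no spinner, no summary text
    return <div style={{ minHeight: 120 }}>Loading evidence…</div>;
  }
  if (items.length === 0) {
    // Empty-state truth: say so instead of pretending with a summary
    return <p>No proof available yet</p>;
  }
  return (
    <ul>
      {items.map((item) => (
        <li key={item.id}>
          <code>{item.text}</code>{' '}
          <button onClick={() => navigator.clipboard.writeText(item.text)}>
            Copy SBOM line / path
          </button>
        </li>
      ))}
    </ul>
  );
}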
Backend rules of thumb
- Pre‑index for first paint: cache top N proof items per hot finding (e.g., first SBOM hit + shortest path).
- Bound queries: proof queries must be O(log n) on indexed columns (pkg name@version, file hash, graph node id).
- Chunked streaming: send first proof chunk <200 ms after backend hit; don’t hold for the full set.
- Timeout budget: 12s backend budget + 3s UI/render margin = 15s P95.
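A sketch of the chunked-streaming rule, assuming Express; findProofItems is a hypothetical stand-in for an indexed proof query that yields rows as they are found:
// proof-stream.ts: send the first proof chunk as soon as it exists
import express from 'express';

async function* findProofItems(findingId: string): AsyncGenerator<{ kind: string; text: string }> {
  // Stub: a real service would stream rows from a pre-indexed proof query here
  yield { kind: 'sbom', text: `first SBOM hit for ${findingId} (placeholder)` };
}

const app = express();

app.get('/findings/:id/proof', async (req, res) => {
  res.setHeader('Content-Type', 'application/x-ndjson');
  res.flushHeaders(); // headers go out immediately; don't wait for the full result set

  for await (const item of findProofItems(req.params.id)) {
    res.write(JSON.stringify(item) + '\n'); // each proof item is flushed as soon as it is available
  }
  res.end();
});

app.listen(3000);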
Minimal contract to add in your code
// Frontend: fire on open
metrics.emit('finding_open', { findingId, t: performance.now() });
// When the first real proof node/line hits the DOM:
metrics.emit('proof_rendered', { findingId, proofKind, t: performance.now() });
-- Rollup (hourly)
SELECT
proof_kind,
percentile_cont(0.95) WITHIN GROUP (ORDER BY tte_ms) AS p95_ms
FROM tte_events
WHERE ts >= now() - interval '1 hour'
GROUP BY proof_kind;
What to put on the team dashboard
- TTE P95 by page (Findings list, Finding details).
- TTE P95 by proof_kind (sbom / reachability / vex).
- Error budget burn: minutes over target per day.
- Top regressions: last 7 days vs prior 7.
Acceptance checklist for any finding view
- First paint shows a real proof snippet (not a summary).
- “Copy proof” button works within 1 click.
- TTE P95 in staging ≤ 10s; in prod ≤ 15s.
- If proof missing, explicit empty‑state + retry path.
- Telemetry sampled ≥ 50% of sessions (or 100% for internal).
If you want, I can turn this into a docs/UX/tte.md spec plus a Grafana panel JSON and a tiny .NET middleware to emit the metrics—just say the word and I’ll drop in ready‑to‑use snippets.
Perfect, got it — let’s turn Time‑to‑Evidence (TTE) into a proper set of “how we track UX health” guidelines.
I’ll treat this like something you can drop into an internal engineering/UX doc.
1. What exactly is Time‑to‑Evidence?
Definition
TTE = t_first_proof_rendered − t_open_finding
- t_open_finding – when the user first opens a “finding” / detail view (e.g., vulnerability, alert, ticket, log event).
- t_first_proof_rendered – when the UI first paints actual evidence that backs the finding, for example:
  - The SBOM row showing package@version.
  - The call‑graph/data‑flow path to a sink.
  - A VEX note explaining why something is (not) affected.
  - A raw log snippet that the alert is based on.
Key principle: TTE measures how long users have to trust you blindly before they can see proof with their own eyes.
2. UX health goals & targets
Treat TTE like latency SLOs:
- Primary SLO: P95 TTE ≤ 15s for all findings in normal conditions.
- Stretch SLO: P99 TTE ≤ 30s for heavy cases (big graphs, huge SBOMs, cold caches).
- Guardrail: P50 TTE should be < 3s. If the median creeps up, you’re in trouble even if P95 looks OK.
You can refine by feature:
- “Simple” proof (single SBOM row, small payload): P95 ≤ 5s.
- “Complex” proof (reachability graph, cross‑repo joins): P95 ≤ 15s.
UX rule of thumb
- < 2s: feels instant.
- 2–10s: acceptable if clearly loading something heavy.
- > 10s: needs strong feedback (progress, partial results, explanations).
- > 30s: the system should probably offer fallback (e.g., “download raw evidence” or “retry”).
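One way to encode these targets so dashboards, alerts, and release checks read from the same source of truth; the names and shape are illustrative:
// tte-slo.ts: the targets above as one shared constant
export const TTE_SLO = {
  guardrailP50Ms: 3_000,        // median creeping past this is an early warning
  primaryP95Ms: 15_000,         // all findings, normal conditions
  stretchP99Ms: 30_000,         // heavy cases: big graphs, huge SBOMs, cold caches
  byProofComplexity: {
    simple: { p95Ms: 5_000 },   // single SBOM row, small payload
    complex: { p95Ms: 15_000 }, // reachability graph, cross-repo joins
  },
} as const;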
3. Instrumentation guidelines
3.1 Event model
Emit two core events per finding view:
- finding_open
  - When the user opens the finding details (route enter / modal open).
  - Must include:
    - finding_id
    - tenant_id / org_id
    - user_role (admin, dev, triager, etc.)
    - entry_point (list, search, notification, deep link)
    - ui_version / build_sha
- proof_rendered
  - First time any qualifying proof element is painted.
  - Must include:
    - finding_id
    - proof_kind (sbom | reachability | vex | logs | other)
    - source (local_cache | backend_api | 3rd_party)
    - proof_height (e.g., pixel offset from top) – to ensure it’s actually above the fold or very close.
Derived metric
Your telemetry pipeline should compute:
tte_ms = proof_rendered.timestamp - finding_open.timestamp
If there are multiple proof_rendered events for the same finding_open, use:
- TTE (first proof) – minimum timestamp; primary SLO.
- Optionally: TTE (full evidence) – last proof in a defined “bundle” (e.g., path + SBOM row).
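In pipeline code, the derivation might look like this sketch, assuming events have already been grouped per finding view; the types and field names are illustrative:
// tte-derive.ts: TTE (first proof) plus the optional "full evidence" variant
interface TteEvent {
  name: 'finding_open' | 'proof_rendered';
  findingId: string;
  t: number; // performance.now()-style timestamp, in ms
}

export function deriveTte(events: TteEvent[]): { firstProofMs?: number; fullEvidenceMs?: number } {
  const open = events.find((e) => e.name === 'finding_open');
  const proofTimes = events.filter((e) => e.name === 'proof_rendered').map((e) => e.t);
  if (!open || proofTimes.length === 0) return {}; // no proof rendered: excluded or bucketed separately
  return {
    firstProofMs: Math.min(...proofTimes) - open.t,   // primary SLO
    fullEvidenceMs: Math.max(...proofTimes) - open.t, // last proof in the bundle, if you track it
  };
}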
3.2 Implementation notes
Frontend
- Emit finding_open as soon as:
  - The route is confirmed and
  - You know which finding_id is being displayed.
- Emit proof_rendered:
  - Not when you fetch data, but when at least one evidence component is visibly rendered.
  - Easiest approach: hook into component lifecycle / intersection observer on the evidence container.
Pseudo‑example:
// On route/mount:
metrics.emit('finding_open', {
findingId,
entryPoint,
userRole,
uiVersion,
t: performance.now()
});
// In EvidencePanel component, after first render with real data:
if (!hasEmittedProof && hasRealEvidence) {
metrics.emit('proof_rendered', {
findingId,
proofKind: 'sbom',
source: 'backend_api',
t: performance.now()
});
hasEmittedProof = true;
}
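If you want the emit gated on actual visibility rather than just mounting, the intersection-observer route mentioned above could look like this sketch; metrics, findingId, and evidenceEl are assumed from the surrounding pseudo-example:
// Assumed context, as in the pseudo-example above:
declare const metrics: { emit: (name: string, payload: Record<string, unknown>) => void };
declare const findingId: string;
declare const evidenceEl: Element; // the evidence container, already holding real data

let hasEmittedProof = false;

const observer = new IntersectionObserver((entries) => {
  if (!hasEmittedProof && entries.some((e) => e.isIntersecting)) {
    metrics.emit('proof_rendered', {
      findingId,
      proofKind: 'sbom',
      source: 'backend_api',
      t: performance.now(),
    });
    hasEmittedProof = true;
    observer.disconnect(); // one-shot: TTE only cares about the first visible proof
  }
});
observer.observe(evidenceEl);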
Backend
- No special requirement beyond:
  - Stable IDs (finding_id).
  - Knowing which API endpoints respond with evidence payloads — you’ll want to correlate backend latency with TTE later.
4. Data quality & sampling
If you want TTE to drive decisions, the data must be boringly reliable.
Guidelines
- Sample rate
  - Start with 100% in staging.
  - In production, aim for ≥ 25% of sessions for TTE events at minimum; 100% is ideal if volume is reasonable.
- Clock skew
  - Prefer frontend timestamps using performance.now() for TTE; they’re monotonic within a tab.
  - Don’t mix backend clocks into the TTE calculation.
- Bot / synthetic traffic
  - Tag synthetic tests (is_synthetic = true) and exclude them from UX health dashboards.
- Retry behavior
  - If the proof fails to load and the user hits “retry”:
    - Treat it as a separate measurement (retry = true), or
    - Log an additional proof_error event with error class (timeout, 5xx, network, parse, etc.).
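A client-side sketch of the sampling and tagging rules, reusing the hypothetical metrics client from earlier; the sample rate constant and the synthetic-traffic check are illustrative:
// sampling.ts: decide once per session whether to emit TTE events, and tag synthetic traffic
declare const metrics: { emit: (name: string, payload: Record<string, unknown>) => void };

const TTE_SAMPLE_RATE = 0.25; // production floor from the guideline above; use 1.0 in staging

const sessionSampled = Math.random() < TTE_SAMPLE_RATE;
const isSynthetic = navigator.userAgent.includes('HeadlessChrome'); // or your own synthetic marker

export function emitTteEvent(name: string, payload: Record<string, unknown>): void {
  if (!sessionSampled) return;
  // performance.now() is monotonic within the tab, so TTE is immune to wall-clock skew
  metrics.emit(name, { ...payload, is_synthetic: isSynthetic, t: performance.now() });
}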
5. Dashboards: how to watch TTE
You want a small, opinionated set of views that answer:
“Is UX getting better or worse for people trying to understand findings?”
5.1 Core widgets
- TTE distribution
  - P50 / P90 / P95 / P99 per day (or per release).
  - Split by proof_kind.
- TTE by page / surface
  - Finding list → detail.
  - Deep links from notifications.
  - Direct URLs / bookmarks.
- TTE by user segment
  - New users vs power users.
  - Different roles (security engineer vs application dev).
- Error budget panel
  - “Minutes over SLO per day” – e.g., sum of all user‑minutes where TTE > 15s.
  - Use this to prioritize work.
- Correlation with engagement
  - Scatter: TTE vs session length, or TTE vs “user clicked ‘ignore’ / ‘snooze’”.
  - Aim to confirm the obvious: long TTE → worse engagement/completion.
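For reference, the math behind the “TTE distribution” widget is just percentiles over tte_ms per day and proof_kind; a nearest-rank sketch (in practice this runs in your warehouse or observability tool, e.g., with the SQL shown earlier):
// percentiles.ts: nearest-rank percentiles over a day's worth of tte_ms values
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) return NaN;
  const idx = Math.ceil(p * sortedMs.length) - 1;
  return sortedMs[Math.min(sortedMs.length - 1, Math.max(0, idx))];
}

export function tteDistribution(ttesMs: number[]): Record<string, number> {
  const sorted = [...ttesMs].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 0.5),
    p90: percentile(sorted, 0.9),
    p95: percentile(sorted, 0.95),
    p99: percentile(sorted, 0.99),
  };
}

// Example: tteDistribution([900, 1200, 3400, 18000]) → { p50: 1200, p90: 18000, p95: 18000, p99: 18000 }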
5.2 Operational details
- Update granularity: real‑time or ≤ 15 min for on‑call/ops panels.
- Retention: at least 90 days to see trends across big releases.
- Breakdowns:
  - backend_region (to catch regional issues).
  - build_version (to spot regressions quickly).
6. UX & engineering design rules anchored in TTE
These are the behavior rules for the product that keep TTE healthy.
6.1 “Evidence first” layout rules
- Evidence above the fold
  - At least one proof element must be visible without scrolling on a typical laptop viewport.
- Summary second
  - CVSS scores, severity badges, long descriptions: all secondary. Evidence should come before opinion.
- No fake proof
  - Don’t use placeholders that look like evidence but aren’t (e.g., “example path” or generic text).
  - If evidence is still loading, show a clear skeleton/loader with “Loading evidence…”.
6.2 Loading strategy rules
- Start fetching evidence as soon as navigation begins, not after the page is fully mounted.
- Use lazy loading for non‑critical widgets until after proof is shown.
- If a call is known to be heavy:
  - Consider precomputing and caching the top evidence (shortest path, first SBOM hit).
  - Stream results: render the first proof item as soon as it arrives; don’t wait for the full list.
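On the client, consuming such a stream and painting the first item immediately could look like this sketch, assuming an NDJSON proof endpoint like the one sketched earlier; renderProofItem is a hypothetical UI callback:
// Render proof items the moment each line arrives instead of waiting for the full list
export async function streamProof(findingId: string, renderProofItem: (item: unknown) => void) {
  const res = await fetch(`/findings/${findingId}/proof`);
  if (!res.body) throw new Error('streaming not supported');
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep the trailing partial line for the next chunk
    for (const line of lines) {
      if (line.trim()) renderProofItem(JSON.parse(line)); // first item paints before the rest exist
    }
  }
}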
6.3 Empty / error state rules
- If there is genuinely no evidence:
  - Explicitly say “No supporting evidence available yet” and treat TTE as:
    - Either “no value” (excluded), or
    - A special bucket proof_kind = "none".
- If loading fails:
  - Show a clear error and a retry that re‑emits proof_rendered when successful.
  - Log proof_error with reason; track error rate alongside TTE.
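A sketch of that error contract, in the same pseudo-style as the earlier snippets; loadProof, renderEvidence, metrics, and findingId are assumed, and the error classification is illustrative:
// Assumed context:
declare const metrics: { emit: (name: string, payload: Record<string, unknown>) => void };
declare const findingId: string;
declare function loadProof(findingId: string): Promise<unknown[]>; // hypothetical evidence fetch
declare function renderEvidence(items: unknown[]): void;           // hypothetical render into the proof panel

function classifyError(err: unknown): 'timeout' | 'network' | 'parse' | 'unknown' {
  if (err instanceof SyntaxError) return 'parse';
  if (err instanceof DOMException && err.name === 'AbortError') return 'timeout';
  if (err instanceof TypeError) return 'network'; // fetch rejects with TypeError on network failure
  return 'unknown'; // 5xx is better classified where the response status is checked
}

export async function loadProofWithTelemetry(retry = false): Promise<void> {
  try {
    const items = await loadProof(findingId);
    renderEvidence(items);
    metrics.emit('proof_rendered', { findingId, proofKind: 'sbom', retry, t: performance.now() });
  } catch (err) {
    metrics.emit('proof_error', { findingId, retry, errorClass: classifyError(err), t: performance.now() });
    // The retry button should call loadProofWithTelemetry(true) so the second attempt is tagged
  }
}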
7. How to use TTE in practice
7.1 For releases
For any change that affects findings UI or evidence plumbing:
- Add a release checklist item:
  - “No regression on TTE P95 for [pages X, Y].”
- During rollout:
  - Compare pre‑ vs post‑release TTE P95 by ui_version.
  - If regression > 20%:
    - Roll back, or
    - Add a follow‑up ticket explicitly tagged with the regression.
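The rollout comparison can be a small gate in CI or a release script; a sketch using the 20% threshold above (the P95 values would come from your rollup, split by ui_version):
// release-gate.ts: compare pre- vs post-release TTE P95
const REGRESSION_THRESHOLD = 0.2; // 20%, per the rule above

export function tteRegressed(baselineP95Ms: number, candidateP95Ms: number): boolean {
  return candidateP95Ms > baselineP95Ms * (1 + REGRESSION_THRESHOLD);
}

// Example: 12s → 15s is a +25% regression, so the gate fails
console.assert(tteRegressed(12_000, 15_000) === true);
console.assert(tteRegressed(12_000, 13_000) === false);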
7.2 For experiments / A/B tests
When running UI experiments around findings:
- Always capture TTE per variant.
- Compare:
  - TTE P50/P95.
  - Task completion rate (e.g., “user changed status”).
  - Subjective UX (CSAT) if you have it.
You’re looking for patterns like:
- Variant B: +5% completion, +8% TTE → maybe OK.
- Variant C: +2% completion, +70% TTE → probably not acceptable.
7.3 For prioritization
Use TTE as a lever in planning:
- If P95 TTE is healthy and stable:
  - More room for new features / experiments.
- If P95 TTE is trending up for 2+ weeks:
  - Time to schedule a “TTE debt” story: caching, query optimization, UI re‑layout, etc.
8. Quick “TTE‑ready” checklist
You’re “tracking UX health with TTE” if you can honestly tick these:
- Instrumentation
  - finding_open + proof_rendered events exist and are correlated.
  - TTE computed in a stable pipeline (joins, dedupe, etc.).
- Targets
  - TTE SLOs defined (P95, P99) and agreed by UX + engineering.
- Dashboards
  - A dashboard shows TTE by proof kind, page, and release.
  - On‑call / ops can see TTE in near real‑time.
- UX rules
  - Evidence is visible above the fold for all main finding types.
  - Non‑critical widgets load after evidence.
  - Empty/error states are explicit about evidence availability.
- Process
  - Major UI changes check TTE pre vs post as part of release acceptance.
  - Regressions in TTE create real tickets, not just “we’ll watch it”.
If you tell me what stack you’re on (e.g., React + Next.js + OpenTelemetry + X observability tool), I can turn this into concrete code snippets and an example dashboard spec (fields, queries, charts) tailored exactly to your setup.