add advisories

This commit is contained in:
master
2025-12-01 17:50:11 +02:00
parent c11d87d252
commit 790801f329
7 changed files with 3723 additions and 0 deletions

@@ -0,0 +1,425 @@
Here's a simple metric that will make your security UI (and teams) radically better: **Time-to-Evidence (TTE)**, the time from opening a finding to seeing *raw proof* (a data-flow edge, an SBOM line, or a VEX note), not a summary.
---
### What it is
* **Definition:** TTE = `t_first_proof_rendered - t_open_finding`.
* **Proof =** the exact artifact or path that justifies the claim (e.g., `package-lock.json: line 214 → openssl@1.1.1`, `reachability: A → B → C sink`, or `VEX: not_affected due to unreachable code`).
* **Target:** **P95 ≤ 15s** (stretch: P99 ≤ 30s). If 95% of findings show proof within 15 seconds, the UI stays honest: evidence before opinion, low noise, fast explainability.
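
As a rough illustration of the definition and targets above, the following sketch computes TTE for a single view and checks a batch of samples against the P95 and P99 targets. The function names are hypothetical, and the percentile uses linear interpolation (the same convention as Postgres `percentile_cont`); a real pipeline might pick a different estimator.

```typescript
// Linear-interpolation percentile over a non-empty sample set, p in [0, 1].
function percentileCont(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  if (p < 0 || p > 1) throw new Error("p must be in [0, 1]");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = p * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
}

// TTE for one finding view, in milliseconds.
function tteMs(tOpenFinding: number, tFirstProofRendered: number): number {
  return tFirstProofRendered - tOpenFinding;
}

const P95_TARGET_MS = 15_000; // target from the text: P95 <= 15s
const P99_STRETCH_MS = 30_000; // stretch from the text: P99 <= 30s

// Reports whether a batch of TTE samples meets each target.
function meetsTargets(samplesMs: number[]): { p95Ok: boolean; p99Ok: boolean } {
  return {
    p95Ok: percentileCont(samplesMs, 0.95) <= P95_TARGET_MS,
    p99Ok: percentileCont(samplesMs, 0.99) <= P99_STRETCH_MS,
  };
}
```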
---
### Why it matters
* **Trust:** People accept decisions they can *verify* quickly.
* **Triage speed:** Proof-first UIs cut back-and-forth and guesswork.
* **Noise control:** If you can't surface proof fast, you probably shouldn't surface the finding yet.
---
### How to measure (engineering-ready)
* Emit two stamps per finding view:
  * `t_open_finding` (on route enter or modal open).
  * `t_first_proof_rendered` (first DOM paint of SBOM line / path list / VEX clause).
* Store as `tte_ms` in a lightweight events table (Postgres) with tags: `tenant`, `finding_id`, `proof_kind` (`sbom|reachability|vex`), `source` (`local|remote|cache`).
* Nightly rollup: compute P50/P90/P95/P99 by proof_kind and page.
* Alert when **P95 > 15s** for 15 minutes.
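
The alert rule in the last bullet can be sketched as a check over per-minute P95 values. This is only one possible reading: `P95Window` and `shouldAlert` are hypothetical names, and the sketch assumes the rollup yields one window per minute, oldest first, with no gaps.

```typescript
// Hypothetical rollup row: P95 of tte_ms over a one-minute window.
interface P95Window {
  minuteStartMs: number; // window start, epoch milliseconds
  p95Ms: number;
}

// Fires only when every one of the last `sustainMinutes` windows is over the threshold.
function shouldAlert(
  windows: P95Window[],
  thresholdMs = 15_000,
  sustainMinutes = 15
): boolean {
  if (windows.length < sustainMinutes) return false;
  return windows.slice(-sustainMinutes).every((w) => w.p95Ms > thresholdMs);
}
```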
---
### UI contract (keeps the UX honest)
* **Above the fold:** always show a compact **Proof panel** first (not hidden behind tabs).
* **Skeletons over spinners:** reserve space; render partial proof as soon as any piece is ready.
* **Plain text copy affordance:** “Copy SBOM line / path” button right next to the proof.
* **Defer non-proof widgets:** CVSS badges, remediation prose, and charts load *after* proof.
* **Empty-state truth:** if no proof exists, say “No proof available yet” and show the loader for *that* proof type only (don't pretend with summaries).
---
### Backend rules of thumb
* **Pre-index for first paint:** cache top N proof items per hot finding (e.g., first SBOM hit + shortest path).
* **Bound queries:** proof queries must be *O(log n)* on indexed columns (pkg name@version, file hash, graph node id).
* **Chunked streaming:** send first proof chunk <200ms after backend hit; don't hold for the full set.
* **Timeout budget:** 12s backend budget + 3s UI/render margin = 15s P95.
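
To make the budget arithmetic concrete, here is a small sketch that checks one proof load against the numbers above (200ms first chunk, 12s backend, 3s render, 15s total). The `ProofTiming` shape and the violation labels are assumptions for illustration, not an existing API.

```typescript
const FIRST_CHUNK_BUDGET_MS = 200;
const BACKEND_BUDGET_MS = 12_000;
const RENDER_MARGIN_MS = 3_000;
const TOTAL_BUDGET_MS = BACKEND_BUDGET_MS + RENDER_MARGIN_MS; // 15 000 ms

// Hypothetical timing record for a single proof load.
interface ProofTiming {
  firstChunkMs: number; // backend hit -> first proof chunk sent
  backendMs: number; // backend hit -> data needed for first paint delivered
  renderMs: number; // data received -> first proof painted
}

// Returns the list of budgets this load exceeded (empty when all are met).
function budgetViolations(t: ProofTiming): string[] {
  const violations: string[] = [];
  if (t.firstChunkMs >= FIRST_CHUNK_BUDGET_MS) violations.push("first_chunk");
  if (t.backendMs > BACKEND_BUDGET_MS) violations.push("backend");
  if (t.renderMs > RENDER_MARGIN_MS) violations.push("render");
  if (t.backendMs + t.renderMs > TOTAL_BUDGET_MS) violations.push("total");
  return violations;
}
```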
---
### Minimal contract to add in your code
```ts
// Frontend: fire on open
metrics.emit('finding_open', { findingId, t: performance.now() });
// When the first real proof node/line hits the DOM:
metrics.emit('proof_rendered', { findingId, proofKind, t: performance.now() });
```
```sql
-- Rollup (hourly)
SELECT
  proof_kind,
  percentile_cont(0.95) WITHIN GROUP (ORDER BY tte_ms) AS p95_ms
FROM tte_events
WHERE ts >= now() - interval '1 hour'
GROUP BY proof_kind;
```
---
### What to put on the team dashboard
* **TTE P95 by page** (Findings list, Finding details).
* **TTE P95 by proof_kind** (sbom / reachability / vex).
* **Error budget burn**: minutes over target per day.
* **Top regressions**: last 7 days vs prior 7.
---
### Acceptance checklist for any finding view
* [ ] First paint shows a real proof snippet (not a summary).
* [ ] Copy proof button works within 1 click.
* [ ] TTE P95 in staging ≤ 10s; in prod ≤ 15s.
* [ ] If proof missing, explicit empty-state + retry path.
* [ ] Telemetry sampled ≥ 50% of sessions (or 100% for internal).
---
If you want, I can turn this into a **`docs/UX/tte.md`** spec plus a **Grafana panel JSON** and a tiny **.NET middleware** to emit the metrics. Just say the word and I'll drop in ready-to-use snippets.
Perfect, got it. Let's turn **Time-to-Evidence (TTE)** into a proper set of *“how we track UX health”* guidelines.
I'll treat this like something you can drop into an internal engineering/UX doc.
---
## 1. What exactly is Time-to-Evidence?
**Definition**
> **TTE = t_first_proof_rendered - t_open_finding**

* **t_open_finding**: when the user first opens a finding / detail view (e.g., vulnerability, alert, ticket, log event).
* **t_first_proof_rendered**: when the UI first paints **actual evidence** that backs the finding, for example:
  * The SBOM row showing `package@version`.
  * The call-graph/data-flow path to a sink.
  * A VEX note explaining why something is (not) affected.
  * A raw log snippet that the alert is based on.

**Key principle:**
TTE measures **how long users have to trust you blindly** before they can see proof with their own eyes.
---
## 2. UX health goals & targets
Treat TTE like latency SLOs:
* **Primary SLO**:
  * **P95 TTE ≤ 15s** for all findings in normal conditions.
* **Stretch SLO**:
  * **P99 TTE ≤ 30s** for heavy cases (big graphs, huge SBOMs, cold caches).
* **Guardrail**:
  * P50 TTE should be **< 3s**. If the median creeps up, you're in trouble even if P95 looks OK.

You can refine by feature:
* Simple proof (single SBOM row, small payload):
  * P95 ≤ 5s.
* Complex proof (reachability graph, cross-repo joins):
  * P95 ≤ 15s.

**UX rule of thumb**
* < 2s: feels instant.
* 2–10s: acceptable if clearly loading something heavy.
* > 10s: needs **strong** feedback (progress, partial results, explanations).
* > 30s: the system should probably **offer fallback** (e.g., “download raw evidence” or “retry”).
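
The rule of thumb above maps naturally onto a small classifier that the UI could use to decide how much loading feedback to show. This is a sketch under assumptions: the band names are made up for illustration, and the boundary values (exactly 10s, exactly 30s) are assigned to the lower band.

```typescript
// Hypothetical band names for the four ranges in the rule of thumb.
type TteBand = "instant" | "acceptable" | "needs_strong_feedback" | "offer_fallback";

function tteBand(elapsedMs: number): TteBand {
  if (elapsedMs < 2_000) return "instant"; // < 2s
  if (elapsedMs <= 10_000) return "acceptable"; // 2-10s
  if (elapsedMs <= 30_000) return "needs_strong_feedback"; // > 10s
  return "offer_fallback"; // > 30s
}
```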
---
## 3. Instrumentation guidelines
### 3.1 Event model
Emit two core events per finding view:
1. **`finding_open`**
   * When the user opens the finding details (route enter / modal open).
   * Must include:
     * `finding_id`
     * `tenant_id` / `org_id`
     * `user_role` (admin, dev, triager, etc.)
     * `entry_point` (list, search, notification, deep link)
     * `ui_version` / `build_sha`
2. **`proof_rendered`**
   * First time *any* qualifying proof element is painted.
   * Must include:
     * `finding_id`
     * `proof_kind` (`sbom | reachability | vex | logs | other`)
     * `source` (`local_cache | backend_api | 3rd_party`)
     * `proof_height` (e.g., pixel offset from top) to ensure it's actually above the fold or very close.

**Derived metric**
Your telemetry pipeline should compute:
```text
tte_ms = proof_rendered.timestamp - finding_open.timestamp
```
If there are multiple `proof_rendered` events for the same `finding_open`, use:
* **TTE (first proof)**: minimum timestamp; primary SLO.
* Optionally, **TTE (full evidence)**: last proof in a defined “bundle” (e.g., path + SBOM row).
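
A minimal sketch of that derivation, assuming events have already been collected per finding view. The type and function names are hypothetical, and "full evidence" is interpreted here as the moment the last member of the bundle first rendered; other readings are possible.

```typescript
interface FindingOpenEvent { findingId: string; t: number }
interface ProofRenderedEvent { findingId: string; proofKind: string; t: number }
interface DerivedTte { firstProofMs: number; fullEvidenceMs: number | null }

// Returns null when no proof rendered for this finding view.
function deriveTte(
  open: FindingOpenEvent,
  proofs: ProofRenderedEvent[],
  bundle: string[] = []
): DerivedTte | null {
  // Earliest render time per proof kind for this finding view.
  const firstByKind: Record<string, number> = {};
  for (const p of proofs) {
    if (p.findingId !== open.findingId || p.t < open.t) continue;
    const seen = firstByKind[p.proofKind];
    if (seen === undefined || p.t < seen) firstByKind[p.proofKind] = p.t;
  }
  const kinds = Object.keys(firstByKind);
  if (kinds.length === 0) return null;

  // TTE (first proof): minimum timestamp across all kinds.
  const firstProofMs = Math.min(...kinds.map((k) => firstByKind[k])) - open.t;

  // TTE (full evidence): only defined once every kind in the bundle has rendered.
  const complete = bundle.length > 0 && bundle.every((k) => firstByKind[k] !== undefined);
  const fullEvidenceMs = complete
    ? Math.max(...bundle.map((k) => firstByKind[k])) - open.t
    : null;
  return { firstProofMs, fullEvidenceMs };
}
```

Calling `deriveTte(open, proofs, ['reachability', 'sbom'])` would report both values for a path + SBOM bundle; leaving the bundle empty reports only the first-proof TTE.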
### 3.2 Implementation notes
**Frontend**
* Emit `finding_open` as soon as:
  * The route is confirmed and
  * You know which `finding_id` is being displayed.
* Emit `proof_rendered`:
  * **Not** when you *fetch* data, but when at least one evidence component is **visibly rendered**.
  * Easiest approach: hook into component lifecycle / intersection observer on the evidence container.

Pseudo-example:
```ts
// On route/mount:
metrics.emit('finding_open', {
  findingId,
  entryPoint,
  userRole,
  uiVersion,
  t: performance.now()
});

// In EvidencePanel component, after first render with real data.
// hasEmittedProof is per-view state that starts as false, so the
// event fires at most once per finding view.
if (!hasEmittedProof && hasRealEvidence) {
  metrics.emit('proof_rendered', {
    findingId,
    proofKind: 'sbom',
    source: 'backend_api',
    t: performance.now()
  });
  hasEmittedProof = true;
}
```
**Backend**
* No special requirement beyond:
  * Stable IDs (`finding_id`).
  * Knowing which API endpoints respond with evidence payloads; you'll want to correlate backend latency with TTE later.
---
## 4. Data quality & sampling
If you want TTE to drive decisions, the data must be boringly reliable.
**Guidelines**
1. **Sample rate**
   * Start with **100%** in staging.
   * In production, aim for **≥ 25% of sessions** for TTE events at minimum; 100% is ideal if volume is reasonable.
2. **Clock skew**
   * Prefer **frontend timestamps** using `performance.now()` for TTE; they're monotonic within a tab.
   * Don't mix backend clocks into the TTE calculation.
3. **Bot / synthetic traffic**
   * Tag synthetic tests (`is_synthetic = true`) and exclude them from UX health dashboards.
4. **Retry behavior**
   * If the proof fails to load and the user hits “retry”:
     * Treat it as a separate measurement (`retry = true`) or
     * Log an additional `proof_error` event with error class (timeout, 5xx, network, parse, etc.).
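
One way to get a stable per-session sample rate is to hash the session ID and compare it against the rate, so a session is either fully in or fully out. The sketch below is an assumption-laden illustration: `isSampled` and `includeInUxDashboards` are hypothetical names, and the hash is an FNV-1a variant applied to UTF-16 code units, which is adequate for bucketing but not a cryptographic hash.

```typescript
// 32-bit FNV-1a over UTF-16 code units; the shift-add form multiplies
// by the FNV prime 16777619 without losing precision.
function fnv1a32(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h += (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24);
  }
  return h >>> 0;
}

// Deterministic: the same session ID always gets the same decision.
function isSampled(sessionId: string, rate: number): boolean {
  return fnv1a32(sessionId) / 0x100000000 < rate;
}

// Synthetic traffic may still be recorded, but is kept out of UX health dashboards.
function includeInUxDashboards(sessionId: string, isSynthetic: boolean, rate = 0.25): boolean {
  return !isSynthetic && isSampled(sessionId, rate);
}
```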
---
## 5. Dashboards: how to watch TTE
You want a small, opinionated set of views that answer:
> “Is UX getting better or worse for people trying to understand findings?”
### 5.1 Core widgets
1. **TTE distribution**
   * P50 / P90 / P95 / P99 per day (or per release).
   * Split by `proof_kind`.
2. **TTE by page / surface**
   * Finding list → detail.
   * Deep links from notifications.
   * Direct URLs / bookmarks.
3. **TTE by user segment**
   * New users vs power users.
   * Different roles (security engineer vs application dev).
4. **Error budget panel**
   * “Minutes over SLO per day”, e.g., sum of all user-minutes where TTE > 15s.
   * Use this to prioritize work.
5. **Correlation with engagement**
   * Scatter: TTE vs session length, or TTE vs “user clicked ignore / snooze”.
   * Aim to confirm the obvious: **long TTE → worse engagement/completion**.
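
The “minutes over SLO per day” figure for the error budget panel can be computed in more than one way. The sketch below takes one reading as an assumption: for each view whose TTE exceeds the SLO, it counts only the excess time, and it groups by UTC day. `TteSample` and `minutesOverSloPerDay` are hypothetical names.

```typescript
interface TteSample {
  tsMs: number; // when the finding was opened, epoch milliseconds
  tteMs: number;
}

// Sums, per UTC day, the minutes users waited beyond the SLO.
// Days with no overage do not appear in the result.
function minutesOverSloPerDay(samples: TteSample[], sloMs = 15_000): Record<string, number> {
  const perDay: Record<string, number> = {};
  for (const s of samples) {
    const overMs = Math.max(0, s.tteMs - sloMs);
    if (overMs === 0) continue;
    const day = new Date(s.tsMs).toISOString().slice(0, 10); // YYYY-MM-DD in UTC
    perDay[day] = (perDay[day] ?? 0) + overMs / 60_000;
  }
  return perDay;
}
```

Counting the full TTE of every breaching view instead of the excess is an equally defensible choice; whichever is used, it should stay fixed so the trend is comparable day to day.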
### 5.2 Operational details
* Update granularity: **real-time or ≤15 min** for on-call/ops panels.
* Retention: at least **90 days** to see trends across big releases.
* Breakdowns:
  * `backend_region` (to catch regional issues).
  * `build_version` (to spot regressions quickly).
---
## 6. UX & engineering design rules anchored in TTE
These are the **behavior rules** for the product that keep TTE healthy.
### 6.1 “Evidence first” layout rules
* **Evidence above the fold**
  * At least *one* proof element must be visible **without scrolling** on a typical laptop viewport.
* **Summary second**
  * CVSS scores, severity badges, long descriptions: all secondary. Evidence should come *before* opinion.
* **No fake proof**
  * Don't use placeholders that *look* like evidence but aren't (e.g., “example path” or generic text).
  * If evidence is still loading, show a clear skeleton/loader with “Loading evidence…”.
### 6.2 Loading strategy rules
* Start fetching evidence **as soon as navigation begins**, not after the page is fully mounted.
* Use **lazy loading** for non-critical widgets until after proof is shown.
* If a call is known to be heavy:
  * Consider **precomputing** and caching the top evidence (shortest path, first SBOM hit).
  * Stream results: render first proof item as soon as it arrives; don't wait for the full list.
### 6.3 Empty / error state rules
* If there is genuinely no evidence:
  * Explicitly say **“No supporting evidence available yet”** and treat TTE as:
    * Either “no value” (excluded), or
    * A special bucket `proof_kind = "none"`.
* If loading fails:
  * Show a clear error and a **retry** that re-emits `proof_rendered` when successful.
  * Log `proof_error` with reason; track error rate alongside TTE.
---
## 7. How to *use* TTE in practice
### 7.1 For releases
For any change that affects findings UI or evidence plumbing:
* Add a release checklist item:
  * “No regression on TTE P95 for [pages X, Y].”
* During rollout:
  * Compare **pre- vs post-release** TTE P95 by `ui_version`.
  * If regression > 20%:
    * Roll back, or
    * Add a follow-up ticket explicitly tagged with the regression.
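
The rollout comparison reduces to a relative-change check on P95. A minimal sketch, assuming the pre- and post-release P95 values have already been computed per `ui_version`; `checkTteRegression` is a hypothetical helper, and the 20% threshold is the one stated above.

```typescript
interface RegressionCheck {
  relativeChange: number; // e.g. 0.25 means P95 got 25% slower
  regressed: boolean;
}

function checkTteRegression(
  preP95Ms: number,
  postP95Ms: number,
  maxRelativeIncrease = 0.2
): RegressionCheck {
  if (preP95Ms <= 0) throw new Error("pre-release P95 must be positive");
  const relativeChange = (postP95Ms - preP95Ms) / preP95Ms;
  return { relativeChange, regressed: relativeChange > maxRelativeIncrease };
}
```

With a pre-release P95 of 10s, a post-release P95 of 12.5s is flagged, while 11s is not. In practice both values should come from comparable traffic (same pages, similar sample sizes) before the result is used to trigger a rollback.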
### 7.2 For experiments / A/B tests
When running UI experiments around findings:
* Always capture TTE per variant.
* Compare:
  * TTE P50/P95.
  * Task completion rate (e.g., “user changed status”).
  * Subjective UX (CSAT) if you have it.

You're looking for patterns like:
* Variant B: **+5% completion**, **+8% TTE** → maybe OK.
* Variant C: **+2% completion**, **+70% TTE** → probably not acceptable.
### 7.3 For prioritization
Use TTE as a lever in planning:
* If P95 TTE is healthy and stable:
  * More room for new features / experiments.
* If P95 TTE is trending up for 2+ weeks:
  * Time to schedule a “TTE debt” story: caching, query optimization, UI re-layout, etc.
---
## 8. Quick “TTE-ready” checklist
You're “tracking UX health with TTE” if you can honestly tick these:
1. **Instrumentation**
   * [ ] `finding_open` + `proof_rendered` events exist and are correlated.
   * [ ] TTE computed in a stable pipeline (joins, dedupe, etc.).
2. **Targets**
   * [ ] TTE SLOs defined (P95, P99) and agreed by UX + engineering.
3. **Dashboards**
   * [ ] A dashboard shows TTE by proof kind, page, and release.
   * [ ] On-call / ops can see TTE in near real-time.
4. **UX rules**
   * [ ] Evidence is visible above the fold for all main finding types.
   * [ ] Non-critical widgets load after evidence.
   * [ ] Empty/error states are explicit about evidence availability.
5. **Process**
   * [ ] Major UI changes check TTE pre vs post as part of release acceptance.
   * [ ] Regressions in TTE create real tickets, not just “we'll watch it”.
---
If you tell me what stack you're on (e.g., React + Next.js + OpenTelemetry + X observability tool), I can turn this into concrete code snippets and an example dashboard spec (fields, queries, charts) tailored exactly to your setup.