
2025-12-14 16:23:44 +02:00


Here’s a tight, practical blueprint for building (and proving) a fast, evidence‑first triage workflow—plus the power‑user affordances that make Stella Ops feel “snappy” even offline.

What “good” looks like (background in plain words)

Alert → evidence → decision in one flow: an alert should open directly onto the concrete proof (reachability, call‑stack, provenance), then offer a one‑click decision (VEX/CSAF status) with audit logging.
Time‑to‑First‑Signal (TTFS) is king: how fast a human sees the first credible piece of evidence that explains why this alert matters here.
Clicks‑to‑Closure: the number of interactions needed to reach a defensible decision recorded in the audit log.

Minimal evidence bundle per finding

Reachability proof: function‑level path or package‑level import chain (with “toggle reachability view” hotkey).
Call‑stack snippet: 5–10 frames around the sink/source with file:line anchors.
Provenance: attestation / DSSE + build ancestry (image → layer → artifact → commit).
VEX/CSAF status: affected/not‑affected/under‑investigation + reason.
Diff: what changed since last scan (SBOM or VEX delta), rendered as a small, human‑readable “smart‑diff.”

KPIs to measure in CI and UI

TTFS (p50/p95) from alert creation to first rendered evidence.
Clicks‑to‑Closure (median) per decision type.
Evidence completeness score (0–4): reachability, call‑stack, provenance, VEX/CSAF present.
Offline friendliness score: % of evidence resolvable with no network.
Audit log completeness: every decision has: evidence hash set, actor, policy context, replay token.

Power‑user affordances (keyboard first)

Jump to evidence (J): focuses the first incomplete evidence pane.
Copy DSSE (Y): copies the attestation block or Rekor entry ref.
Toggle reachability view (R): path list ↔ compact graph ↔ textual proof.
Search‑within‑graph (/): node/func/package, instant.
Deterministic sort (S): stable sort by (reachability→severity→age→component) to remove hesitation.
Quick VEX set (A, N, U): Affected / Not‑affected / Under‑investigation with templated reasons.

UX flow to implement (end‑to‑end)

Alert row shows: TTFS timer, reachability badge, “decision state,” and a diff‑dot if something changed.
Open alert lands on Evidence tab (not Details). Top strip = three proof pills:
- Reachability ✓ / Call‑stack ✓ / Provenance ✓ (click to expand inline).
Decision drawer pinned on the right:
- VEX/CSAF radio (A/N/U) → Reason presets → “Record decision.”
- Shows audit‑ready summary (hashes, timestamps, policy).
Diff tab: SBOM/VEX delta since last run, grouped by “meaningful risk shift.”
Activity tab: immutable audit log; export as a signed bundle for audits.

Graph performance on large call‑graphs

Minimal‑latency snapshots: pre‑render static PNG/SVG thumbnails server‑side; open with tiny preview then hydrate to interactive graph lazily.
Progressive neighborhood expansion: load 1‑hop first, expand on demand; keep TTFS < 500 ms.
Stable node ordering: deterministic layout with consistent anchors to avoid “graph shuffle” anxiety.
Chunked graph edges with capped fan‑out; collapse identical library paths into a reachability macro‑edge.

Offline‑friendly design

Local evidence cache: store (SBOM slices, path proofs, DSSE attestations, compiled call‑stacks) in a signed bundle beside the SARIF/VEX.
Deferred enrichment: mark fields that need internet (e.g., upstream CSAF fetch) and queue a background “enricher” when network returns.
Predictable fallbacks: if provenance server missing, show embedded DSSE and “verification pending,” never blank states.

Audit & replay

Deterministic replay token: hash(feed manifests + rules + lattice policy + inputs) → attach to every decision.
One‑click “Reproduce”: opens CLI snippet pinned to the exact versions and policies.
Evidence hash‑set: content‑address each proof artifact; the audit entry stores only hashes + signer.
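
A minimal sketch of the evidence hash‑set idea, assuming SHA-256 as the content-address function and a `sha256:<hex>` string format (both are assumptions; the text only says "content‑address"). The function names are hypothetical. The audit entry carries hashes and signer only, never the artifact bodies.

```python
import hashlib
import json


def content_hash(artifact: bytes) -> str:
    """Return a content address of the form 'sha256:<hex>' (assumed format)."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()


def evidence_hash_set(artifacts: dict[str, bytes]) -> list[str]:
    """Hash every proof artifact; sort so the result is order-independent."""
    return sorted(content_hash(blob) for blob in artifacts.values())


def audit_entry(artifacts: dict[str, bytes], signer: str) -> str:
    """Serialize an audit entry that stores only hashes + signer."""
    entry = {"evidence_hashes": evidence_hash_set(artifacts), "signer": signer}
    return json.dumps(entry, sort_keys=True)
```

Sorting the hashes means two clients that collect the same artifacts in a different order still produce the same audit entry.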

TTFS & Clicks‑to‑Closure: how to measure in code

Emit a ttfs.start at alert creation; first paint of any evidence card emits ttfs.signal.
Increment a per‑alert interaction counter; on “Record decision” emit close.clicks.
Log evidence bitset (reach, stack, prov, vex) at decision time for completeness scoring.
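
The three emit points above could be tracked per alert roughly as follows. This is a sketch under assumptions: the `AlertTelemetry` class, the bit values in `EVIDENCE_BITS`, and millisecond timestamps passed in by the caller are all hypothetical; only the event names `ttfs.start`, `ttfs.signal`, and `close.clicks` come from the text.

```python
from dataclasses import dataclass, field

# Assumed bit assignment for the evidence bitset (reach, stack, prov, vex).
EVIDENCE_BITS = {"reach": 1, "stack": 2, "prov": 4, "vex": 8}


@dataclass
class AlertTelemetry:
    alert_id: str
    events: list = field(default_factory=list)
    clicks: int = 0
    signalled: bool = False

    def start(self, now_ms: float) -> None:
        self.events.append(("ttfs.start", now_ms))

    def evidence_painted(self, now_ms: float) -> None:
        # Only the first paint of any evidence card counts as the signal.
        if not self.signalled:
            self.signalled = True
            self.events.append(("ttfs.signal", now_ms))

    def interaction(self) -> None:
        self.clicks += 1

    def record_decision(self, now_ms: float, present: set) -> int:
        # Fold the evidence present at decision time into one integer.
        bitset = sum(EVIDENCE_BITS[name] for name in set(present))
        self.events.append(("close.clicks", now_ms, self.clicks, bitset))
        return bitset
```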

Developer tasks (concrete, shippable)

Evidence API: GET /alerts/{id}/evidence returns {reachability, callstack, provenance, vex, hashes[]} with deterministic sort.
Proof renderer: tiny, no‑framework widget that can render from the offline bundle; hydrate to full only on interaction.
Keyboard map: global handler with overlay help (?); no collisions; all actions are idempotent.
Graph service: server‑side layout + snapshot PNG; client hydrates WebGL only when user expands.
Smart‑diff: diff SBOM/VEX → classify into “risk‑raising / neutral / reducing,” surface only the risk‑raising group by default.
Audit logger: append‑only stream; signed checkpoints; export .stella-audit.tgz (attestations + JSONL).

Benchmarks to run weekly

TTFS under poor network (100 ms RTT, 1% loss): p95 < 1.5 s to first evidence.
Graph hydration on 250k‑edge image: preview < 300 ms, interactive < 2.0 s.
Keyboard coverage: ≥90% of triage actions executable without mouse.
Offline replay: 100% of decisions re‑render from bundle; zero web calls required.

Why Stella’s approach reduces hesitation

Deterministic sort orders keep findings in place between refreshes.
Minimal‑latency graph snapshots show something trustworthy immediately, then refine—no “blank panel” delay.
Replayable, signed bundles make every decision auditable and reproducible, which builds operator confidence.

If you want, I can turn this into:

a UI checklist for a design review,
a .NET 10 API contract (DTOs + endpoints),
or a Cypress/Playwright test plan that measures TTFS and clicks‑to‑closure automatically.

Below is a PM‑style implementation guideline you can hand to developers. It’s written as a build spec: clear goals, “MUST/SHOULD” requirements, acceptance criteria, and the non‑functional guardrails (performance, offline, auditability) that make triage feel fast and defensible.

Stella Ops — Evidence‑First Triage Implementation Guidelines (PM Spec)

0) Assumptions and scope

Assumptions

Stella Ops ingests vulnerability findings (SCA/SAST/image scans), has SBOM context, and can compute reachability/call paths.
Triage outcomes must be recorded as VEX/CSAF‑compatible states with reasons and audit trails.
Users may operate in restricted networks and need an offline mode that still shows evidence.

In scope

Evidence‑first alert triage UI + APIs + telemetry.
Reachability proof + call stack view + provenance attestation view.
VEX/CSAF decision recording with audit export.
Offline evidence bundle and deterministic replay token.

Out of scope (for this phase)

Building the underlying static analyzer or SBOM generator (we consume their outputs).
Full CSAF publishing workflow (we store and export; publishing is separate).
Remediation automation (PRs, patching).

1) Product principles (non‑negotiables)

Evidence before detail: Opening an alert MUST show the best available evidence immediately (even partial/placeholder), not a generic “details” page.
Fast first signal: The UI MUST render a credible “first signal” quickly (reachability badge, call stack snippet, or provenance block).
Determinism reduces hesitation: Sorting, graphs, and diffs MUST be stable across refreshes. No jittery re-layout.
Offline by design: If evidence exists locally (bundle), the UI MUST render it without network access.
Audit-ready by default: Every decision MUST be reproducible, attributable, and exportable with evidence hashes.

2) Success metrics (what we ship toward)

These become acceptance criteria and dashboards.

Primary metrics (P0)

TTFS (Time‑to‑First‑Signal): p95 < 1.5s from opening an alert to first evidence card rendering (with 100ms RTT, 1% loss simulation).
Clicks‑to‑Closure: median < 6 interactions to record a VEX decision.
Evidence completeness at decision time: ≥ 90% of decisions include evidence hash set + reason + replay token.

Secondary metrics (P1)

Offline resolution rate: ≥ 95% of alerts opened with a local bundle show reachability + provenance without network.
Graph usability: preview render < 300ms, interactive hydration < 2.0s for large graphs (see §7).

3) User workflows and “Definition of Done”

Workflow A: Triage an alert to a decision

DoD: user can open an alert, see evidence, set VEX state, and the system records a signed/auditable decision event.

Steps

Alert list shows key signals (reachability badge, decision state, diff indicator).
Open alert → Evidence view loads first.
User reviews reachability/call stack/provenance.
User sets VEX status + reason preset (editable).
User records decision.
Audit log entry appears instantly and is exportable.

Workflow B: Explain “why is this flagged?”

DoD: user can show a defensible proof (path/call stack/provenance) and copy it into a ticket.

4) UI requirements (MUST/SHOULD/MAY)

4.1 Alert list page

MUST

Each row includes:
- Severity + component identifier
- Decision state (Unset / Under Investigation / Not Affected / Affected)
- Reachability badge (Reachable / Not Reachable / Unknown) where available
- Diff indicator if SBOM/VEX changed since last scan (simple dot/label)
- Age / first seen / last updated
Deterministic sort default: Reachability DESC → Severity DESC → Decision state (Unset first) → Age DESC → Component name ASC
Keyboard navigation:
- ↑/↓ move selection, Enter open alert.
- / search/filter focus.
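
The default sort order above can be expressed as a single tuple key. A sketch, assuming hypothetical field names and rank tables: the spec fixes the order of the criteria but not how "Unknown" reachability ranks against the others, how decision states order after "Unset", or the severity scale. The trailing alert id is an added tie-breaker so that no two rows ever compare equal.

```python
# Assumed rankings; only "Unset first" and the criteria order come from the spec.
REACH_RANK = {"reachable": 2, "unknown": 1, "not_reachable": 0}
SEVERITY_RANK = {"critical": 3, "high": 2, "medium": 1, "low": 0}
DECISION_RANK = {"unset": 0, "under_investigation": 1, "affected": 2, "not_affected": 3}


def sort_key(alert: dict) -> tuple:
    return (
        -REACH_RANK[alert["reachability"]],   # Reachability DESC
        -SEVERITY_RANK[alert["severity"]],    # Severity DESC
        DECISION_RANK[alert["decision"]],     # Unset first
        -alert["age_days"],                   # Age DESC
        alert["component"],                   # Component name ASC
        alert["id"],                          # assumed final tie-breaker
    )


def sort_alerts(alerts: list) -> list:
    """Return a new list; the same input set always yields the same order."""
    return sorted(alerts, key=sort_key)
```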

SHOULD

Inline “quick set” decision menu (Affected / Not affected / Under investigation) for obvious cases, usable without leaving the list; it still requires a reason and logs evidence hashes.

4.2 Alert detail — landing tab MUST be Evidence

MUST

Default landing is Evidence (not “Overview”).
Top section shows 3 “proof pills” with status:
- Reachability (✓ / ! / …)
- Call stack (✓ / ! / …)
- Provenance (✓ / ! / …)
Each pill expands inline (no navigation) into a compact evidence panel.

MUST: No blank panels

If evidence is loading, show skeleton + “what’s coming.”
If evidence missing, show a reason (“not computed”, “requires source map”, “offline – enrichment pending”).

4.3 Decision drawer

MUST

Pinned right drawer (or persistent bottom sheet on small screens).
Controls:
- VEX/CSAF status: Affected / Not affected / Under investigation
- Reason preset dropdown + editable reason text
- “Record decision” button
Preview “Audit summary” before submit:
- Evidence hashes included
- Policy context (ruleset version)
- Replay token
- Actor identity

MUST

On submit, create an append-only audit event and immediately reflect status in UI.

SHOULD

Allow attaching references: ticket URL, incident ID, PR link (stored as metadata).

4.4 Diff tab

MUST

Show delta since last scan:
- SBOM diffs (component version changes, removals/additions)
- VEX diffs (status changes)
Group diffs by risk shift:
- Risk‑raising (new reachable vuln, severity increase)
- Neutral (metadata-only)
- Risk‑reducing (fixed version, reachability removed)
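
One way to sketch the risk-shift grouping, assuming each diff entry already carries a `kind` label. The kind names and the `RISK_SHIFT` table are hypothetical, derived from the examples in the list above. Treating unknown kinds as risk‑raising is a conservative assumption of this sketch, not something the spec states.

```python
# Hypothetical change kinds mapped to the three groups named in the spec.
RISK_SHIFT = {
    "new_reachable_vuln": "risk_raising",
    "severity_increase": "risk_raising",
    "metadata_only": "neutral",
    "fixed_version": "risk_reducing",
    "reachability_removed": "risk_reducing",
}
GROUP_ORDER = ["risk_raising", "neutral", "risk_reducing"]


def group_by_risk_shift(changes: list) -> dict:
    """Group diff entries by risk shift, with stable ordering inside groups."""
    groups = {name: [] for name in GROUP_ORDER}
    for change in changes:
        # Assumption: unclassified changes are shown as risk-raising, not hidden.
        group = RISK_SHIFT.get(change["kind"], "risk_raising")
        groups[group].append(change)
    for items in groups.values():
        items.sort(key=lambda c: (c["component"], c["kind"]))
    return groups
```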

SHOULD

Provide “Copy diff summary” for change management.

4.5 Activity/Audit tab

MUST

Immutable timeline of decisions and evidence changes.
Each entry includes:
- actor, timestamp, decision, reason
- evidence hash set
- replay token
- bundle/export availability

5) Power-user and accessibility requirements

Keyboard shortcuts (MUST)

J: jump to next missing/incomplete evidence panel
R: toggle reachability view (list ↔ compact graph ↔ textual proof)
Y: copy selected evidence block (call stack / DSSE / path proof)
A: set “Affected” (opens reason preset selection)
N: set “Not affected”
U: set “Under investigation”
?: keyboard help overlay
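
The shortcut table can be modeled as a small registry that refuses duplicate bindings and generates the help overlay from the same data. This is an illustrative sketch: `KeyMap`, `default_keymap`, and the action names are hypothetical, and a real UI would bind callables in the browser rather than strings.

```python
class KeyMap:
    """Registry of single-key shortcuts that rejects collisions."""

    def __init__(self):
        self._bindings = {}

    def bind(self, key: str, action: str, description: str) -> None:
        if key in self._bindings:
            raise ValueError(f"key {key!r} is already bound to {self._bindings[key][0]!r}")
        self._bindings[key] = (action, description)

    def dispatch(self, key: str):
        """Return the action name for a key, or None if unbound."""
        binding = self._bindings.get(key)
        return binding[0] if binding else None

    def help_overlay(self) -> list:
        """Lines for the '?' overlay, in a stable order."""
        return [f"{key}: {desc}" for key, (_, desc) in sorted(self._bindings.items())]


def default_keymap() -> KeyMap:
    km = KeyMap()
    km.bind("J", "jump_to_evidence", "jump to next missing/incomplete evidence panel")
    km.bind("R", "toggle_reachability", "toggle reachability view")
    km.bind("Y", "copy_evidence", "copy selected evidence block")
    km.bind("A", "set_affected", "set Affected")
    km.bind("N", "set_not_affected", "set Not affected")
    km.bind("U", "set_under_investigation", "set Under investigation")
    km.bind("?", "show_help", "keyboard help overlay")
    return km
```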

Accessibility (MUST)

Fully navigable by keyboard
Visible focus states
Screen-reader labels for evidence pills and drawer controls
Color is never the only signal (badges must have text/icon)

6) Evidence model: what every alert should attempt to provide

Treat this as the minimum evidence bundle. Each item may be “unavailable,” but must be explicit.

MUST support:

Reachability proof
- At least one of:
  - function-level call path: entry → … → vulnerable_sink
  - package/module import chain
- Includes confidence/algorithm tag: static, dynamic, heuristic
Call stack snippet
- 5–10 frames around the relevant node with file:line anchors where possible
Provenance
- DSSE attestation or equivalent statement
- Artifact ancestry chain: image → layer → artifact → commit (as available)
- Verification status: verified / pending / failed (with reason)
Decision state
- VEX status + reason + timestamps
Evidence hash set
- Content-addressed hashes of each evidence artifact included in the decision

SHOULD

“Evidence freshness”: when computed, tool version, input revisions.

7) Performance and graph rendering requirements

TTFS budget (MUST)

When opening an alert:
- <200ms: show skeleton and cached row metadata
- <500ms: render at least one evidence pill with meaningful content OR a cached preview image
- <1.5s p95: render reachability + provenance for typical alerts

Graph rendering for large call graphs (MUST)

Two-phase rendering
1. Server-generated static snapshot (PNG/SVG) displayed immediately
2. Interactive graph hydrates lazily on user expand
Progressive expansion
- Load 1-hop neighborhood first; expand on click
Deterministic layout
- Same input produces same layout anchors (no reshuffles between refreshes)
Fan-out control
- Collapse repeated library paths into “macro edges” to keep the graph readable
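
A sketch of the progressive-expansion step, assuming the call graph is available as a list of `(caller, callee)` pairs. The function name and the `fan_out_cap` parameter are hypothetical. Neighbors are sorted so the same input always yields the same node order, and nodes beyond the cap are counted rather than silently dropped.

```python
def one_hop(edges: list, focus: str, fan_out_cap: int = 50) -> dict:
    """Return the 1-hop neighborhood of `focus` in deterministic order."""
    callees = {dst for src, dst in edges if src == focus}
    callers = {src for src, dst in edges if dst == focus}
    # Sort for stable output; drop self-loops on the focus node.
    neighbors = sorted((callees | callers) - {focus})
    shown = neighbors[:fan_out_cap]
    return {
        "focus": focus,
        "nodes": shown,
        "hidden": len(neighbors) - len(shown),  # expandable on demand
    }
```

Expanding on click would then call `one_hop` again with the clicked node as the new focus.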

8) Offline mode requirements

Offline is not “nice to have”; it is a defined mode.

Offline evidence bundle (MUST)

A single file (e.g., .stella.bundle.tgz) that contains:
- Alert metadata snapshot
- Evidence artifacts (reachability proofs, call stacks, provenance attestations)
- SBOM slice(s) necessary for diffs
- VEX decision history (if available)
- Manifest with content hashes (a Merkle-style hash tree or equivalent)
Bundle must be signed (or include signature material) and verifiable.
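
A minimal sketch of the manifest check, assuming the bundle has already been unpacked into a path-to-bytes mapping. It uses a flat manifest with one root digest rather than a full hash tree, and it leaves out signing; in practice the root digest is what would be signed. All function names are hypothetical.

```python
import hashlib
import json


def _root_digest(entries: dict) -> str:
    return hashlib.sha256(json.dumps(entries, sort_keys=True).encode("utf-8")).hexdigest()


def build_manifest(files: dict[str, bytes]) -> dict:
    """Map each path to its SHA-256 and derive one root digest over the map."""
    entries = {path: hashlib.sha256(data).hexdigest() for path, data in sorted(files.items())}
    return {"entries": entries, "root": _root_digest(entries)}


def verify_bundle(files: dict[str, bytes], manifest: dict) -> list[str]:
    """Return a sorted list of problems; an empty list means the bundle matches."""
    problems = []
    for path, expected in manifest["entries"].items():
        if path not in files:
            problems.append(f"missing: {path}")
        elif hashlib.sha256(files[path]).hexdigest() != expected:
            problems.append(f"modified: {path}")
    for path in files:
        if path not in manifest["entries"]:
            problems.append(f"unlisted: {path}")
    if _root_digest(manifest["entries"]) != manifest["root"]:
        problems.append("manifest root mismatch")
    return sorted(problems)
```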

UI behavior (MUST)

If bundle is present:
- UI loads evidence from it first
- Any missing items show “enrichment pending” (not “error”)
If network returns:
- Background refresh allowed, but must not reorder the alert list unexpectedly
- Must surface “updated evidence available” as a user-controlled refresh, not an auto-switch that changes context mid-triage
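
The bundle-first rule might look like this in a resolver. This is a sketch: `resolve_evidence` and the optional `fetch_remote` callback are hypothetical names, and the status strings reuse the values listed for the evidence payload in §10. With no network, a missing section is reported as pending rather than as an error.

```python
SECTIONS = ("reachability", "callstack", "provenance", "vex")


def resolve_evidence(bundle: dict, fetch_remote=None) -> dict:
    """Resolve each evidence section from the local bundle before the network."""
    resolved = {}
    for section in SECTIONS:
        if section in bundle:
            resolved[section] = {"status": "available", "source": "bundle", "data": bundle[section]}
        elif fetch_remote is None:
            # Offline: never an error, always an explicit pending reason.
            resolved[section] = {"status": "unavailable", "reason": "enrichment pending"}
        else:
            data = fetch_remote(section)
            if data is None:
                resolved[section] = {"status": "unavailable", "reason": "not computed"}
            else:
                resolved[section] = {"status": "available", "source": "remote", "data": data}
    return resolved
```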

9) Auditability and replay requirements

Decision event schema (MUST)

Every recorded decision must store:

alert_id, artifact_id (image digest or commit hash)
actor_id, timestamp
decision_status (Affected/Not affected/Under investigation)
reason_code (preset) + reason_text
evidence_hashes[] (content-addressed hashes)
policy_context (ruleset version, policy id)
replay_token (hash of inputs needed to reproduce)
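
The schema above maps naturally onto an immutable record. A sketch, assuming lowercase snake_case status values and an ISO 8601 timestamp string, neither of which the spec fixes; the field names follow the list above.

```python
import json
from dataclasses import asdict, dataclass

# Assumed wire encoding of the three VEX statuses.
ALLOWED_STATUS = {"affected", "not_affected", "under_investigation"}


@dataclass(frozen=True)
class DecisionEvent:
    alert_id: str
    artifact_id: str
    actor_id: str
    timestamp: str
    decision_status: str
    reason_code: str
    reason_text: str
    evidence_hashes: tuple
    policy_context: dict
    replay_token: str

    def __post_init__(self):
        if self.decision_status not in ALLOWED_STATUS:
            raise ValueError(f"unknown decision_status: {self.decision_status!r}")
        if not self.evidence_hashes:
            raise ValueError("a decision must reference at least one evidence hash")

    def to_jsonl(self) -> str:
        """One JSON object per line, with sorted keys for stable exports."""
        return json.dumps(asdict(self), sort_keys=True)
```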

Replay token (MUST)

Deterministic hash of:
- scan inputs (SBOM digest, image digest, tool versions)
- policy/rules versions
- reachability algorithm version
“Reproduce” button produces a CLI snippet (copyable) pinned to these versions.
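
A sketch of the token computation, assuming SHA-256 over a canonical JSON encoding of the three input groups. The function and parameter names are hypothetical. Canonical encoding (sorted keys, fixed separators) is what makes the hash independent of dictionary ordering.

```python
import hashlib
import json


def replay_token(scan_inputs: dict, policy_versions: dict, reachability_algo_version: str) -> str:
    """Deterministic hash of everything needed to reproduce a decision."""
    canonical = json.dumps(
        {
            "scan_inputs": scan_inputs,
            "policy": policy_versions,
            "reachability_algorithm": reachability_algo_version,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Any change to a tool version, ruleset version, or input digest produces a different token, so a stored token can be compared against a recomputed one before replay.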

Export (MUST)

Exportable audit bundle that includes:
- JSONL of decision events
- evidence artifacts referenced by hashes
- signatures/attestations
Export must be stable and verifiable later.

10) API and data contract guidelines (developer-facing)

This is an implementation guideline, not a full API spec—keep it simple and cache-friendly.

MUST endpoints (or equivalent)

GET /alerts?filters… → list view payload (small, cacheable)
GET /alerts/{id}/evidence → evidence payload (reachability, call stack, provenance, hashes)
POST /alerts/{id}/decisions → record decision event (append-only)
GET /alerts/{id}/audit → audit timeline
GET /alerts/{id}/diff?baseline=… → SBOM/VEX diff view
GET /bundles/{id} and/or POST /bundles/verify → offline bundle download/verify

Evidence payload guidelines (MUST)

Deterministic ordering for arrays and nodes (stable sorts).
Explicit status per evidence section: available | loading | unavailable | error.
Include hash per artifact for content addressing.

Example shape

{
  "alert_id": "a123",
  "reachability": { "status": "available", "hash": "sha256:…", "proof": { "type": "call_path", "nodes": [...] } },
  "callstack":     { "status": "available", "hash": "sha256:…", "frames": [...] },
  "provenance":    { "status": "loading",   "hash": null,       "dsse": { "embedded": true, "payload": "…" } },
  "vex":           { "status": "available", "current": {...}, "history": [...] },
  "hashes": ["sha256:…", "sha256:…"]
}
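
A client or contract test could check the payload guidelines with a small validator. This is a sketch: the function name is hypothetical, and treating lexicographic order as the "deterministic ordering" of `hashes` is an assumption.

```python
ALLOWED_STATUS = {"available", "loading", "unavailable", "error"}
SECTIONS = ("reachability", "callstack", "provenance", "vex")


def validate_evidence(payload: dict) -> list:
    """Return a list of guideline violations; empty means the payload passes."""
    problems = []
    for name in SECTIONS:
        section = payload.get(name)
        if section is None:
            problems.append(f"{name}: section missing")
            continue
        status = section.get("status")
        if status not in ALLOWED_STATUS:
            problems.append(f"{name}: invalid status {status!r}")
    hashes = payload.get("hashes", [])
    # Assumption: "deterministic ordering" is checked as sorted order.
    if hashes != sorted(hashes):
        problems.append("hashes: not in sorted order")
    return problems
```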

11) Telemetry requirements (how we prove it’s fast)

MUST instrument:

alert_opened (timestamp, alert_id)
evidence_first_paint (timestamp, evidence_type)
decision_recorded (timestamp, clicks_count, evidence_bitset)
bundle_loaded (hit/miss, size, verification_status)
graph_preview_paint and graph_hydrated

MUST compute:

TTFS = evidence_first_paint - alert_opened
Clicks‑to‑Closure = interaction counter per alert until decision recorded
Evidence completeness bitset at decision time: reachability/callstack/provenance/vex present
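
The TTFS formula and its p50/p95 roll-up can be sketched as below. Assumptions: each event is a dict with `name`, `alert_id`, and a millisecond `timestamp` (the event list above does not show `alert_id` on `evidence_first_paint`, so that field is assumed), and percentiles use the nearest-rank method.

```python
import math


def ttfs_by_alert(events: list) -> dict:
    """TTFS = first evidence_first_paint - first alert_opened, per alert."""
    opened, ttfs = {}, {}
    for e in sorted(events, key=lambda e: e["timestamp"]):
        alert = e["alert_id"]
        if e["name"] == "alert_opened":
            opened.setdefault(alert, e["timestamp"])
        elif e["name"] == "evidence_first_paint" and alert in opened and alert not in ttfs:
            ttfs[alert] = e["timestamp"] - opened[alert]
    return ttfs


def percentile(samples: list, p: int) -> float:
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

Comparing `percentile(list(ttfs_by_alert(events).values()), 95)` against the 1.5 s target gives a dashboard or CI check for the P0 metric.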

12) Error handling and edge cases

MUST

Never show empty states without explanation.
Distinguish between:
- “not computed yet”
- “not possible due to missing inputs”
- “blocked by permissions”
- “offline—enrichment pending”
- “verification failed”

SHOULD

Offer “Request enrichment” action when evidence missing (creates a job/task id).

13) Security, permissions, and multi-tenancy

MUST

RBAC gating for:
- viewing provenance attestations
- recording decisions
- exporting audit bundles
All decision events are immutable; corrections are new events (append-only).
Secrets and PII handling:
- Avoid storing secrets or PII in freeform reason text; warn on paste patterns that look like secrets (the warning is optional, P1).
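
The "corrections are new events" rule can be sketched with an in-memory append-only log. Note the swapped-in technique: this sketch uses a SHA-256 hash chain to make tampering detectable, which is a stand-in for, not the same as, the signed checkpoints mentioned earlier. The `AuditLog` class and the `supersedes` field are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64


def _link(prev: str, body: str) -> str:
    return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log; each entry commits to the one before it."""

    def __init__(self):
        self._entries = []

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["entry_hash"] if self._entries else GENESIS
        body = json.dumps(event, sort_keys=True)
        entry_hash = _link(prev, body)
        self._entries.append({"prev": prev, "event": body, "entry_hash": entry_hash})
        return entry_hash

    def correct(self, original_hash: str, new_event: dict) -> str:
        # A correction never edits history; it appends an event that points back.
        return self.append({**new_event, "supersedes": original_hash})

    def verify(self) -> bool:
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or _link(prev, entry["event"]) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True

    def to_jsonl(self) -> str:
        return "\n".join(json.dumps(e, sort_keys=True) for e in self._entries)
```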

14) Engineering execution plan (priorities)

P0 (ship first)

Evidence-first alert detail landing
Decision drawer + append-only audit
Deterministic alert list sort + reachability badge
Evidence API + decision POST
TTFS + clicks telemetry
Static graph preview + lazy hydration

P1

Offline bundle load/verify + offline rendering
Smart diff view (risk shift grouping)
Exportable audit bundle
Keyboard shortcuts + help overlay

P2

Inline quick decisions from list
Advanced graph search within view
Suggest reason presets based on evidence patterns

15) Acceptance criteria checklist (what QA signs off)

A build is acceptable when:

Opening an alert renders at least one evidence pill within 500ms (with cache) and TTFS p95 meets target under network simulation.
Users can record A/N/U decisions with reason and see an audit event immediately.
Decision event includes evidence hashes + replay token.
Alert list sorting is stable and deterministic across refresh.
Graph preview renders within the 300ms budget; interactive graph hydrates only on expand.
Offline bundle renders evidence without network; missing items show “enrichment pending,” not errors.
Keyboard shortcuts work; ? overlay lists them; full keyboard navigation is possible.

If you want, I can also format this into a developer-ready ticket pack (epics + user stories + acceptance tests) so engineers can implement without interpretation drift.
