Stella Ops Triage UX Guide (Narrative-First + Proof-Linked)

0. Scope

This guide specifies the user experience for Stella Ops triage and evidence workflows:

  • Narrative-first case view that answers DevOps' three questions quickly.
  • Proof-linked evidence surfaces (SBOM/VEX/provenance/reachability/replay).
  • Quiet-by-default noise controls with reversible, signed decisions.
  • Smart-Diff history that explains meaningful risk changes.

Architecture constraints:

  • Lattice/risk evaluation executes in scanner.webservice.
  • concelier and excititor must preserve prune source (every merged/pruned datum remains traceable to origin).

1. UX Contract

Every triage surface must answer, in order:

  1. Can I ship this?
  2. If not, what exactly blocks me?
  3. What's the minimum safe change to unblock?

Everything else is secondary and should be progressively disclosed.

2. Primary Objects in the UX

  • Finding/Case: a specific vuln/rule tied to an asset (image/artifact/environment).
  • Risk Result: deterministic lattice output (score/verdict/lane), computed by scanner.webservice.
  • Evidence Artifact: signed, hash-addressed proof objects (SBOM slice, VEX doc, provenance, reachability slice, replay manifest).
  • Decision: reversible user/system action that changes visibility/gating (mute/ack/exception) and is always signed/auditable.
  • Snapshot: immutable record of inputs/outputs hashes enabling Smart-Diff.
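
On the Angular side, the objects above could be modelled roughly as follows. This is a minimal sketch: every type and field name here (`FindingCase`, `contentHash`, `revokedAt`, `isDecisionActive`, and so on) is an assumption made for illustration and not the actual Stella Ops API contract.

```typescript
// Hypothetical client-side shapes for the primary triage objects.
// All names are illustrative assumptions, not the real API contract.
type EvidenceType =
  | "SBOM_SLICE" | "VEX_DOC" | "PROVENANCE" | "REACHABILITY_SLICE" | "REPLAY_MANIFEST";
type DecisionKind = "MUTE_REACH" | "MUTE_VEX" | "ACK" | "EXCEPTION";

interface EvidenceArtifact {
  type: EvidenceType;
  contentHash: string; // hash address of the proof object
  issuer: string;
  signed: boolean;
}

interface RiskResult {
  score: number;
  verdict: "SHIP" | "BLOCKED" | "NEEDS_EXCEPTION";
  lane: string;
}

interface Decision {
  kind: DecisionKind;
  reasonCode: string;
  signedBy: string;
  expiresAt?: string; // TTL expressed as an ISO timestamp (assumed encoding)
  revokedAt?: string; // set on undo; decisions are never deleted
}

interface Snapshot {
  inputsHash: string;
  outputsHash: string;
  trigger: string;
}

interface FindingCase {
  vulnId: string;
  assetRef: string;
  risk: RiskResult;
  evidence: EvidenceArtifact[];
  decisions: Decision[];
  snapshots: Snapshot[];
}

// A decision is in effect only while it is neither revoked nor expired.
function isDecisionActive(d: Decision, now: Date): boolean {
  if (d.revokedAt !== undefined) return false;
  if (d.expiresAt === undefined) return true;
  return new Date(d.expiresAt).getTime() > now.getTime();
}
```

Keeping `revokedAt` and `expiresAt` on the record, rather than removing the record, mirrors the requirement that decisions stay reversible and auditable.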

3. Global UX Principles

3.1 Narrative-first, list-second

Default view is a "Case" narrative header + evidence rail. Lists exist for scanning and sorting, but not as the primary cognitive surface.

3.2 Time-to-evidence (TTFS) target

From pipeline alert click → human-readable verdict + first evidence link:

  • p95 ≤ 30 seconds (including auth and initial fetch).
  • "Evidence" is always one click away (no deep tab chains).
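
One way to check the p95 target against collected timings is a nearest-rank percentile over the samples. The sketch below assumes timings are recorded in milliseconds; the names `percentile`, `meetsTtfsTarget`, and `TTFS_P95_BUDGET_MS` are hypothetical.

```typescript
// Nearest-rank percentile over timing samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil(p * sorted.length)); // 1-based rank
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("percentile out of range");
  return value;
}

// Budget taken from the target above: p95 of at most 30 seconds.
const TTFS_P95_BUDGET_MS = 30_000;

function meetsTtfsTarget(samplesMs: number[]): boolean {
  return percentile(samplesMs, 0.95) <= TTFS_P95_BUDGET_MS;
}
```

Nearest-rank is used here because it always returns an observed sample; an interpolating percentile would also be a reasonable choice.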

3.3 Proof-linking is mandatory

Any chip/badge that asserts a fact must link to the exact evidence object(s) that justify it.

Examples:

  • "Reachable: Yes" → call-stack slice (and/or runtime hit record)
  • "VEX: not_affected" → effective VEX assertion + signature details
  • "Blocked by Policy Gate X" → policy artifact + lattice explanation
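
The proof-linking rule lends itself to a simple check, for example in a component test or a dev-mode assertion. This is a sketch under assumed shapes: `Chip`, `EvidenceRef`, and `findUnlinkedChips` are hypothetical names, not part of any existing API.

```typescript
// Hypothetical chip model: each chip carries the evidence that justifies it.
interface EvidenceRef {
  artifactHash: string;
  kind: string;
}

interface Chip {
  label: string;
  value: string;
  evidence: EvidenceRef[];
}

// Returns the labels of chips that assert a fact without linking any proof.
function findUnlinkedChips(chips: Chip[]): string[] {
  return chips.filter((c) => c.evidence.length === 0).map((c) => c.label);
}
```

A non-empty result means at least one chip violates the rule and should not be rendered as an asserted fact.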

3.4 Quiet by default, never silent

Muted lanes are hidden by default but surfaced with counts and a toggle. Muting never deletes; it creates a signed Decision with TTL/reason and is reversible.

3.5 Deterministic and replayable

Users must be able to export an evidence bundle containing:

  • scan replay manifest (feeds/rules/policies/hashes)
  • signed artifacts
  • outputs (risk result, snapshots)

This lets auditors replay the evaluation identically.

4. Information Architecture

4.1 Screens

  1. Findings Table (global)
    • Purpose: scan, sort, filter, jump into cases
    • Default: muted lanes hidden
    • Banner: shows count of auto-muted by policy with "Show" toggle
  2. Case View (single-page narrative)
    • Purpose: decision making + proof review
    • Above fold: verdict + chips + deterministic score
    • Right rail: evidence list
    • Tabs (max 3):
      • Evidence (default)
      • Reachability & Impact
      • History (Smart-Diff)
  3. Export / Verify Bundle
    • Purpose: offline/audit verification
    • Async export job, then download DSSE-signed zip
    • Verification UI: signature status, hash tree, issuer chain

4.2 Lanes (visibility buckets)

Lanes are a UX categorization derived from deterministic risk + decisions:

  • ACTIVE
  • BLOCKED
  • NEEDS_EXCEPTION
  • MUTED_REACH (non-reachable)
  • MUTED_VEX (effective VEX says not_affected)
  • COMPENSATED (controls satisfy policy)

Default: show ACTIVE/BLOCKED/NEEDS_EXCEPTION. Muted lanes appear behind a toggle and via the banner counts.
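
The default visibility rule and the banner counts could be derived in one pass over the findings. The sketch below treats every lane outside the default set as hidden behind the toggle, which is an assumption for `COMPENSATED`, since the text only names the muted lanes explicitly. `FindingRow` and `partitionByVisibility` are hypothetical names.

```typescript
type Lane =
  | "ACTIVE" | "BLOCKED" | "NEEDS_EXCEPTION"
  | "MUTED_REACH" | "MUTED_VEX" | "COMPENSATED";

// Lanes shown without the toggle, as stated above.
const DEFAULT_VISIBLE: ReadonlySet<Lane> = new Set<Lane>([
  "ACTIVE", "BLOCKED", "NEEDS_EXCEPTION",
]);

interface FindingRow {
  id: string;
  lane: Lane;
}

function partitionByVisibility(rows: FindingRow[], showHidden: boolean) {
  const visible = showHidden ? rows : rows.filter((r) => DEFAULT_VISIBLE.has(r.lane));
  // Counts are always computed so the banner can surface hidden lanes.
  const hiddenCounts: Partial<Record<Lane, number>> = {};
  for (const r of rows) {
    if (!DEFAULT_VISIBLE.has(r.lane)) {
      hiddenCounts[r.lane] = (hiddenCounts[r.lane] ?? 0) + 1;
    }
  }
  return { visible, hiddenCounts };
}
```

Computing `hiddenCounts` regardless of the toggle state keeps the view quiet by default without ever being silent about what is hidden.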

5. Case View Layout (Required)

5.1 Top Bar

  • Asset name / Image tag / Environment
  • Last evaluated time
  • Policy profile name (e.g., "Strict CI Gate")

5.2 Verdict Banner (Above fold)

Large, unambiguous verdict:

  • SHIP
  • BLOCKED
  • NEEDS EXCEPTION

Below verdict:

  • One-line "why" summary (max 140 chars), e.g.:
    • "Reachable path observed; exploit signal present; Policy 'prod-strict' blocks."

5.3 Chips (Each chip is clickable)

Minimum set:

  • Reachability: Reachable / Not reachable / Unknown (with confidence)
  • Effective VEX: affected / not_affected / under_investigation
  • Exploit signal: yes/no + source indicator
  • Exposure: internet-exposed yes/no (if available)
  • Asset tier: tier label
  • Gate: allow/block/exception-needed (policy gate name)

Chip click behavior:

  • Opens evidence panel anchored to the proof objects
  • Shows source chain (concelier/excititor preserved sources)

5.4 Evidence Rail (Always visible right side)

List of evidence artifacts with:

  • Type icon
  • Title
  • Issuer
  • Signed/verified indicator
  • Content hash (short)
  • Created timestamp

Actions per item:

  • Preview
  • Copy hash
  • Open raw
  • "Show in bundle" marker

Primary actions:

  • Create work item
  • Acknowledge / Mute (opens Decision drawer)
  • Propose exception (Decision with TTL + approver chain)
  • Export evidence bundle

No more than 4 primary buttons. Secondary actions go into a kebab menu.

6. Decision Flows (Mute/Ack/Exception)

6.1 Decision Drawer (common UI)

Fields:

  • Decision kind: Mute reach / Mute VEX / Acknowledge / Exception
  • Reason code (dropdown) + free-text note
  • TTL (required for exceptions; optional for mutes)
  • Policy ref (auto-filled; editable only by admins)
  • "Sign and apply" (server-side DSSE signing; user identity included)

On submit:

  • Create Decision (signed)
  • Re-evaluate lane/verdict if applicable
  • Create Snapshot ("DECISION" trigger)
  • Show toast with undo link

6.2 Undo

Undo is implemented as "revoke decision" (signed revoke record or revocation fields). Never delete.
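
The "never delete" rule can be pictured as an append-only log in which undo adds a revoke entry instead of removing the decision. The following is a minimal in-memory sketch with hypothetical names (`DecisionLog`, `DecisionEntry`, `RevokeEntry`); it deliberately omits the server-side DSSE signing and snapshot creation described above.

```typescript
type DecisionKind = "MUTE_REACH" | "MUTE_VEX" | "ACK" | "EXCEPTION";

interface DecisionEntry {
  type: "decision";
  id: string;
  kind: DecisionKind;
  reasonCode: string;
  ttlSeconds?: number;
}

interface RevokeEntry {
  type: "revoke";
  id: string;
  revokes: string; // id of the decision being revoked
}

type LogEntry = DecisionEntry | RevokeEntry;

class DecisionLog {
  private readonly entries: LogEntry[] = [];

  apply(decision: Omit<DecisionEntry, "type">): void {
    // TTL is required for exceptions and optional for mutes.
    if (decision.kind === "EXCEPTION" && decision.ttlSeconds === undefined) {
      throw new Error("TTL is required for exceptions");
    }
    this.entries.push({ type: "decision", ...decision });
  }

  // Undo appends a revoke entry; the original decision stays in the log.
  revoke(revokeId: string, decisionId: string): void {
    const exists = this.entries.some((e) => e.type === "decision" && e.id === decisionId);
    if (!exists) throw new Error(`unknown decision ${decisionId}`);
    this.entries.push({ type: "revoke", id: revokeId, revokes: decisionId });
  }

  // Decisions currently in effect: those with no revoke entry pointing at them.
  active(): DecisionEntry[] {
    const revoked = new Set(
      this.entries.filter((e): e is RevokeEntry => e.type === "revoke").map((e) => e.revokes),
    );
    return this.entries.filter(
      (e): e is DecisionEntry => e.type === "decision" && !revoked.has(e.id),
    );
  }

  history(): readonly LogEntry[] {
    return this.entries;
  }
}
```

After an undo, `active()` no longer returns the decision while `history()` still contains both the decision and its revocation, which is what an audit trail needs.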

7. Smart-Diff UX

7.1 Timeline

Chronological snapshots:

  • when (timestamp)
  • trigger (feed/vex/sbom/policy/runtime/decision/rescan)
  • summary (short)

7.2 Diff panel

Two-column diff:

  • Inputs changed (with proof links): VEX assertion changed, policy version changed, runtime trace arrived, etc.
  • Outputs changed: lane, verdict, score, gates

7.3 Meaningful change definition

The UI only highlights "meaningful" changes:

  • verdict change
  • lane change
  • score crosses a policy threshold
  • reachability state changes
  • effective VEX status changes

Other changes remain in an expandable "details" section.
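
The "meaningful change" rules above can be expressed as a comparison of two snapshot outputs. This is a sketch with assumed field names (`SnapshotOutputs`, `meaningfulChanges`); it also assumes that a score "crosses" a threshold `t` when the two snapshots fall on different sides of `score >= t`, which the text does not define precisely.

```typescript
// Hypothetical view of the outputs recorded in a snapshot.
interface SnapshotOutputs {
  verdict: string;
  lane: string;
  score: number;
  reachability: string;
  effectiveVex: string;
}

// Returns which of the "meaningful" criteria differ between two snapshots.
function meaningfulChanges(
  prev: SnapshotOutputs,
  next: SnapshotOutputs,
  policyThresholds: number[],
): string[] {
  const changes: string[] = [];
  if (prev.verdict !== next.verdict) changes.push("verdict");
  if (prev.lane !== next.lane) changes.push("lane");
  // Assumed definition: crossing means landing on the other side of a threshold.
  if (policyThresholds.some((t) => (prev.score >= t) !== (next.score >= t))) {
    changes.push("score-threshold");
  }
  if (prev.reachability !== next.reachability) changes.push("reachability");
  if (prev.effectiveVex !== next.effectiveVex) changes.push("effective-vex");
  return changes;
}
```

An empty result means the diff belongs in the collapsed details rather than in the highlighted timeline entry.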

8. Performance & UI Engineering Requirements

  • Findings table uses virtual scroll and server-side pagination.
  • Case view loads in 2 steps:
    1. Header narrative (small payload)
    2. Evidence list + snapshots (lazy)
  • Evidence previews are lazy-loaded and cancellable.
  • Use ETag/If-None-Match for case and evidence list endpoints.
  • UI must remain usable under high latency (air-gapped / offline kits):
    • show cached last-known verdict with clear "stale" marker
    • allow exporting bundles from cached artifacts when permissible
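
The ETag and stale-cache requirements can be combined in a single loader. In the sketch below the transport is injected as a function so the logic stays testable; `HttpGet`, `CachedCase`, `CaseView`, and `loadCaseHeader` are hypothetical names, and a real Angular client would more likely wrap `HttpClient` or an interceptor.

```typescript
// Minimal transport abstraction (assumption): returns status, ETag, and body.
interface HttpResult {
  status: number;
  etag?: string;
  body?: string;
}
type HttpGet = (url: string, headers: Record<string, string>) => Promise<HttpResult>;

interface CachedCase {
  etag: string;
  body: string;
}

interface CaseView {
  body: string;
  stale: boolean; // true when served from cache after a failed request
}

async function loadCaseHeader(
  url: string,
  get: HttpGet,
  cache: Map<string, CachedCase>,
): Promise<CaseView> {
  const cached = cache.get(url);
  const headers: Record<string, string> = cached ? { "If-None-Match": cached.etag } : {};
  try {
    const res = await get(url, headers);
    // 304 Not Modified: the cached copy is still current.
    if (res.status === 304 && cached) return { body: cached.body, stale: false };
    if (res.status === 200 && res.body !== undefined) {
      if (res.etag !== undefined) cache.set(url, { etag: res.etag, body: res.body });
      return { body: res.body, stale: false };
    }
    throw new Error(`unexpected status ${res.status}`);
  } catch (err) {
    // Offline or failing backend: fall back to last-known data, marked stale.
    if (cached) return { body: cached.body, stale: true };
    throw err;
  }
}
```

The `stale` flag is what the UI would use to render the clear "stale" marker next to the last-known verdict.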

9. Accessibility & Operator Usability

  • Keyboard navigation: table rows, chips, evidence list
  • High contrast mode supported
  • All status is conveyed by text + shape (not color only)
  • Copy-to-clipboard for hashes, purls, CVE IDs

10. Telemetry (Must instrument)

  • TTFS: notification click → verdict banner rendered
  • Time-to-proof: click chip → proof preview shown
  • Mute reversal rate (auto-muted later becomes actionable)
  • Bundle export success/latency

11. Responsibilities by Service

  • scanner.webservice:
    • produces reachability results, risk results, snapshots
    • stores/serves case narrative header, evidence indexes, Smart-Diff
  • concelier:
    • aggregates vuln feeds and preserves per-source provenance ("preserve prune source")
  • excititor:
    • merges VEX and preserves original assertion sources ("preserve prune source")
  • notify.webservice:
    • emits first_signal / risk_changed / gate_blocked
  • scheduler.webservice:
    • re-evaluates existing images on feed/policy updates, triggers snapshots

Document Version: 1.0
Target Platform: .NET 10, PostgreSQL >= 16, Angular v17