Here’s a tight, practical blueprint for building (and proving) a fast, evidence‑first triage workflow—plus the power‑user affordances that make Stella Ops feel “snappy” even offline.

# What “good” looks like (background in plain words)

* **Alert → evidence → decision** in one flow: an alert should open directly onto the concrete proof (reachability, call‑stack, provenance), then offer a one‑click decision (VEX/CSAF status) with audit logging.
* **Time‑to‑First‑Signal (TTFS)** is king: how fast a human sees the first credible piece of evidence that explains *why this alert matters here*.
* **Clicks‑to‑Closure**: count how many interactions to reach a defensible decision recorded in the audit log.

# Minimal evidence bundle per finding

* **Reachability proof**: function‑level path or package‑level import chain (with “toggle reachability view” hotkey).
* **Call‑stack snippet**: 5–10 frames around the sink/source with file:line anchors.
* **Provenance**: attestation / DSSE + build ancestry (image → layer → artifact → commit).
* **VEX/CSAF status**: affected/not‑affected/under‑investigation + reason.
* **Diff**: what changed since last scan (SBOM or VEX delta), rendered as a small, human‑readable “smart‑diff.”

# KPIs to measure in CI and UI

* **TTFS (p50/p95)** from opening an alert to first rendered evidence.
* **Clicks‑to‑Closure (median)** per decision type.
* **Evidence completeness score** (0–4): reachability, call‑stack, provenance, VEX/CSAF present.
* **Offline friendliness score**: % of evidence resolvable with no network.
* **Audit log completeness**: every decision has: evidence hash set, actor, policy context, replay token.
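
As a rough illustration of the evidence completeness score, the four evidence types can be folded into a bitset and the score taken as the number of set bits. This is a minimal sketch; the `Evidence` flag names and the `evidence_bitset` / `completeness_score` helpers are assumptions for illustration, not part of any existing Stella Ops API.

```python
from enum import IntFlag


class Evidence(IntFlag):
    """One bit per evidence type (names are illustrative)."""
    REACHABILITY = 1
    CALLSTACK = 2
    PROVENANCE = 4
    VEX = 8


def evidence_bitset(present: dict[str, bool]) -> Evidence:
    """Fold a {name: present?} mapping into a bitset; missing keys count as absent."""
    bits = Evidence(0)
    for flag in Evidence:
        if present.get(flag.name.lower(), False):
            bits |= flag
    return bits


def completeness_score(bits: Evidence) -> int:
    """Number of evidence types present, from 0 to 4."""
    return bin(int(bits)).count("1")
```

Storing the bitset rather than only the score keeps the dashboard able to answer "which evidence type is most often missing" later.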

# Power‑user affordances (keyboard first)

* **Jump to evidence** (`J`): focuses the first incomplete evidence pane.
* **Copy DSSE** (`Y`): copies the attestation block or Rekor entry ref.
* **Toggle reachability view** (`R`): path list ↔ compact graph ↔ textual proof.
* **Search‑within‑graph** (`/`): node/func/package, instant.
* **Deterministic sort** (`S`): stable sort by (reachability→severity→age→component) to remove hesitation.
* **Quick VEX set** (`A`, `N`, `U`): Affected / Not‑affected / Under‑investigation with templated reasons.

# UX flow to implement (end‑to‑end)

1. **Alert row** shows: TTFS timer, reachability badge, “decision state,” and a diff‑dot if something changed.
2. **Open alert** lands on **Evidence tab** (not Details). Top strip = three proof pills:

   * Reachability ✓ / Call‑stack ✓ / Provenance ✓ (click to expand inline).
3. **Decision drawer** pinned on the right:

   * VEX/CSAF radio (A/N/U) → Reason presets → “Record decision.”
   * Shows **audit‑ready summary** (hashes, timestamps, policy).
4. **Diff tab**: SBOM/VEX delta since last run, grouped by “meaningful risk shift.”
5. **Activity tab**: immutable audit log; export as a signed bundle for audits.

# Graph performance on large call‑graphs

* **Minimal‑latency snapshots**: pre‑render static PNG/SVG thumbnails server‑side; open with tiny preview then hydrate to interactive graph lazily.
* **Progressive neighborhood expansion**: load 1‑hop first, expand on demand; keep the first evidence render under 500 ms.
* **Stable node ordering**: deterministic layout with consistent anchors to avoid “graph shuffle” anxiety.
* **Chunked graph edges** with capped fan‑out; collapse identical library paths into a **reachability macro‑edge**.

# Offline‑friendly design

* **Local evidence cache**: store (SBOM slices, path proofs, DSSE attestations, compiled call‑stacks) in a signed bundle beside the SARIF/VEX.
* **Deferred enrichment**: mark fields that need internet (e.g., upstream CSAF fetch) and queue a background “enricher” when network returns.
* **Predictable fallbacks**: if provenance server missing, show embedded DSSE and “verification pending,” never blank states.

# Audit & replay

* **Deterministic replay token**: hash(feed manifests + rules + lattice policy + inputs) → attach to every decision.
* **One‑click “Reproduce”**: opens CLI snippet pinned to the exact versions and policies.
* **Evidence hash‑set**: content‑address each proof artifact; the audit entry stores only hashes + signer.
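
One way to read the replay token and evidence hash-set bullets is as hashes over a canonical serialization, so that equal inputs always give equal hashes. The sketch below assumes each input is JSON-serializable and uses SHA-256 over sorted-key JSON; the function names and the `sha256:` prefix convention are illustrative assumptions.

```python
import hashlib
import json


def _canonical(obj) -> bytes:
    """Sorted keys and fixed separators, so equal content serializes identically."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def content_hash(artifact: dict) -> str:
    """Content address for a single proof artifact."""
    return "sha256:" + hashlib.sha256(_canonical(artifact)).hexdigest()


def evidence_hash_set(artifacts: list[dict]) -> list[str]:
    """Sorted, de-duplicated hashes; the result never depends on input order."""
    return sorted({content_hash(a) for a in artifacts})


def replay_token(feed_manifests: dict, rules: dict, lattice_policy: dict, inputs: dict) -> str:
    """Deterministic hash over everything needed to reproduce a decision."""
    material = {
        "feed_manifests": feed_manifests,
        "rules": rules,
        "lattice_policy": lattice_policy,
        "inputs": inputs,
    }
    return "sha256:" + hashlib.sha256(_canonical(material)).hexdigest()


def audit_entry(artifacts: list[dict], signer: str) -> dict:
    """The audit entry keeps hashes and signer only, not the artifacts themselves."""
    return {"evidence_hashes": evidence_hash_set(artifacts), "signer": signer}
```

Canonical JSON is only one option; any stable encoding works as long as every producer and verifier agrees on it.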

# TTFS & Clicks‑to‑Closure: how to measure in code

* Emit `ttfs.start` when an alert is opened; emit `ttfs.signal` on the first paint of any evidence card.
* Increment a per‑alert **interaction counter**; on “Record decision” emit `close.clicks`.
* Log **evidence bitset** (reach, stack, prov, vex) at decision time for completeness scoring.

# Developer tasks (concrete, shippable)

* **Evidence API**: `GET /alerts/{id}/evidence` returns `{reachability, callstack, provenance, vex, hashes[]}` with deterministic sort.
* **Proof renderer**: tiny, no‑framework widget that can render from the offline bundle; hydrate to full only on interaction.
* **Keyboard map**: global handler with overlay help (`?`); no collisions; all actions are idempotent.
* **Graph service**: server‑side layout + snapshot PNG; client hydrates WebGL only when user expands.
* **Smart‑diff**: diff SBOM/VEX → classify into “risk‑raising / neutral / reducing,” surface only the first item by default.
* **Audit logger**: append‑only stream; signed checkpoints; export `.stella-audit.tgz` (attestations + JSONL).

# Benchmarks to run weekly

* **TTFS under poor network** (100 ms RTT, 1% loss): p95 < 1.5 s to first evidence.
* **Graph hydration on 250k‑edge image**: preview < 300 ms, interactive < 2.0 s.
* **Keyboard coverage**: ≥90% of triage actions executable without mouse.
* **Offline replay**: 100% of decisions re‑render from bundle; zero web calls required.

# Why Stella’s approach reduces hesitation

* **Deterministic sort orders** keep findings in place between refreshes.
* **Minimal‑latency graph snapshots** show something trustworthy immediately, then refine—no “blank panel” delay.
* **Replayable, signed bundles** make every click auditable and reversible, which builds operator confidence.

If you want, I can turn this into:

* a **UI checklist** for a design review,
* a **.NET 10 API contract** (DTOs + endpoints),
* or a **Cypress/Playwright test plan** that measures TTFS and clicks‑to‑closure automatically.

Below is a PM‑style implementation guideline you can hand to developers. It’s written as a **build spec**: clear goals, “MUST/SHOULD” requirements, acceptance criteria, and the non‑functional guardrails (performance, offline, auditability) that make triage feel fast and defensible.

---

# Stella Ops — Evidence‑First Triage Implementation Guidelines (PM Spec)

## 0) Assumptions and scope

**Assumptions**

* Stella Ops ingests vulnerability findings (SCA/SAST/image scans), has SBOM context, and can compute reachability/call paths.
* Triage outcomes must be recorded as VEX/CSAF‑compatible states with reasons and audit trails.
* Users may operate in restricted networks and need an offline mode that still shows evidence.

**In scope**

* Evidence‑first alert triage UI + APIs + telemetry.
* Reachability proof + call stack view + provenance attestation view.
* VEX/CSAF decision recording with audit export.
* Offline evidence bundle and deterministic replay token.

**Out of scope (for this phase)**

* Building the underlying static analyzer or SBOM generator (we consume their outputs).
* Full CSAF publishing workflow (we store and export; publishing is separate).
* Remediation automation (PRs, patching).

---

## 1) Product principles (non‑negotiables)

1. **Evidence before detail**
   Opening an alert **MUST** show the best available evidence immediately (even partial/placeholder), not a generic “details” page.
2. **Fast first signal**
   The UI **MUST** render a credible “first signal” quickly (reachability badge, call stack snippet, or provenance block).
3. **Determinism reduces hesitation**
   Sorting, graphs, and diffs **MUST** be stable across refreshes. No jittery re-layout.
4. **Offline by design**
   If evidence exists locally (bundle), the UI **MUST** render it without network access.
5. **Audit-ready by default**
   Every decision **MUST** be reproducible, attributable, and exportable with evidence hashes.

---

## 2) Success metrics (what we ship toward)

These become acceptance criteria and dashboards.

### Primary metrics (P0)

* **TTFS (Time‑to‑First‑Signal)**: p95 < **1.5s** from opening an alert to first evidence card rendering (with 100ms RTT, 1% loss simulation).
* **Clicks‑to‑Closure**: median < **6** interactions to record a VEX decision.
* **Evidence completeness** at decision time: ≥ **90%** of decisions include evidence hash set + reason + replay token.

### Secondary metrics (P1)

* **Offline resolution rate**: ≥ **95%** of alerts opened with a local bundle show reachability + provenance without network.
* **Graph usability**: preview render < **300ms**, interactive hydration < **2.0s** for large graphs (see §7).

---

## 3) User workflows and “Definition of Done”

### Workflow A: Triage an alert to a decision

**DoD**: user can open an alert, see evidence, set VEX state, and the system records a signed/auditable decision event.

**Steps**

1. Alert list shows key signals (reachability badge, decision state, diff indicator).
2. Open alert → Evidence view loads first.
3. User reviews reachability/call stack/provenance.
4. User sets VEX status + reason preset (editable).
5. User records decision.
6. Audit log entry appears instantly and is exportable.

### Workflow B: Explain “why is this flagged?”

**DoD**: user can show a defensible proof (path/call stack/provenance) and copy it into a ticket.

---

## 4) UI requirements (MUST/SHOULD/MAY)

### 4.1 Alert list page

**MUST**

* Each row includes:

  * Severity + component identifier
  * **Decision state** (Unset / Under Investigation / Not Affected / Affected)
  * **Reachability badge** (Reachable / Not Reachable / Unknown) where available
  * **Diff indicator** if SBOM/VEX changed since last scan (simple dot/label)
  * Age / first seen / last updated
* **Deterministic sort** default:
  `Reachability DESC → Severity DESC → Decision state (Unset first) → Age DESC → Component name ASC`
* Keyboard navigation:

  * `↑/↓` move selection, `Enter` open alert.
  * `/` search/filter focus.
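
The default sort above can be sketched as a single composite key. The rank tables, the `AlertRow` fields, and the final `alert_id` tiebreaker are assumptions added for illustration; the tiebreaker is there so that two rows equal on every listed field still keep a fixed order.

```python
from dataclasses import dataclass

# Higher rank sorts first for reachability and severity (assumed orderings).
REACHABILITY_RANK = {"reachable": 2, "unknown": 1, "not_reachable": 0}
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1}
# Unset first; the order of the remaining states is an assumption.
DECISION_RANK = {"unset": 0, "under_investigation": 1, "affected": 2, "not_affected": 3}


@dataclass(frozen=True)
class AlertRow:
    alert_id: str
    reachability: str
    severity: str
    decision: str
    age_days: int
    component: str


def sort_key(row: AlertRow) -> tuple:
    """Reachability DESC, Severity DESC, Unset first, Age DESC, Component ASC, then id."""
    return (
        -REACHABILITY_RANK[row.reachability],
        -SEVERITY_RANK[row.severity],
        DECISION_RANK[row.decision],
        -row.age_days,
        row.component,
        row.alert_id,
    )


def sort_alerts(rows: list[AlertRow]) -> list[AlertRow]:
    """Return a new list; the same rows always come back in the same order."""
    return sorted(rows, key=sort_key)
```

Because the key is total, the result does not depend on the order in which the backend returns rows, which is what keeps findings in place between refreshes.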

**SHOULD**

* Inline “quick set” decision menu (Affected / Not affected / Under investigation) for obvious cases, usable without leaving the list; it still requires a reason and logs evidence hashes.

### 4.2 Alert detail: landing tab MUST be Evidence

**MUST**

* Default landing is **Evidence** (not “Overview”).
* Top section shows 3 “proof pills” with status:

  * Reachability (✓ / ! / …)
  * Call stack (✓ / ! / …)
  * Provenance (✓ / ! / …)
* Each pill expands inline (no navigation) into a compact evidence panel.

**MUST: No blank panels**

* If evidence is loading, show skeleton + “what’s coming.”
* If evidence missing, show a reason (“not computed”, “requires source map”, “offline – enrichment pending”).

### 4.3 Decision drawer

**MUST**

* Pinned right drawer (or persistent bottom sheet on small screens).
* Controls:

  * VEX/CSAF status: **Affected / Not affected / Under investigation**
  * Reason preset dropdown + editable reason text
  * “Record decision” button
* Preview “Audit summary” before submit:

  * Evidence hashes included
  * Policy context (ruleset version)
  * Replay token
  * Actor identity

**MUST**

* On submit, create an append-only audit event and immediately reflect status in UI.

**SHOULD**

* Allow attaching references: ticket URL, incident ID, PR link (stored as metadata).

### 4.4 Diff tab

**MUST**

* Show delta since last scan:

  * SBOM diffs (component version changes, removals/additions)
  * VEX diffs (status changes)
* Group diffs by **risk shift**:

  * Risk‑raising (new reachable vuln, severity increase)
  * Neutral (metadata-only)
  * Risk‑reducing (fixed version, reachability removed)
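
A minimal sketch of the risk-shift grouping, assuming each diff entry is a dict with a `kind` and a `component`. The `kind` values are hypothetical labels for the examples listed above, and routing unknown kinds to the risk-raising group (so they are reviewed rather than hidden) is a design assumption, not a stated requirement.

```python
# Hypothetical change kinds mapped to the three risk-shift groups.
RISK_SHIFT = {
    "vuln_added_reachable": "risk_raising",
    "severity_increased": "risk_raising",
    "metadata_changed": "neutral",
    "fixed_version": "risk_reducing",
    "reachability_removed": "risk_reducing",
}
GROUP_ORDER = ("risk_raising", "neutral", "risk_reducing")


def group_by_risk_shift(changes: list[dict]) -> dict[str, list[dict]]:
    """Group diff entries by risk shift, with a stable order inside each group."""
    groups: dict[str, list[dict]] = {name: [] for name in GROUP_ORDER}
    for change in changes:
        # Unknown kinds are surfaced as risk-raising (assumption, see lead-in).
        shift = RISK_SHIFT.get(change["kind"], "risk_raising")
        groups[shift].append(change)
    for items in groups.values():
        items.sort(key=lambda c: (c["component"], c["kind"]))
    return groups
```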

**SHOULD**

* Provide “Copy diff summary” for change management.

### 4.5 Activity/Audit tab

**MUST**

* Immutable timeline of decisions and evidence changes.
* Each entry includes:

  * actor, timestamp, decision, reason
  * evidence hash set
  * replay token
  * bundle/export availability

---

## 5) Power-user and accessibility requirements

### Keyboard shortcuts (MUST)

* `J`: jump to next missing/incomplete evidence panel
* `R`: toggle reachability view (list ↔ compact graph ↔ textual proof)
* `Y`: copy selected evidence block (call stack / DSSE / path proof)
* `A`: set “Affected” (opens reason preset selection)
* `N`: set “Not affected”
* `U`: set “Under investigation”
* `?`: keyboard help overlay
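
To illustrate the "no collisions" expectation for the shortcut map, here is a small, framework-free sketch of a registry that refuses duplicate bindings and can generate the help overlay text. The `KeyMap` class and its methods are hypothetical; a real UI would wire `dispatch` to its own key event handler.

```python
from typing import Callable


class KeyMap:
    """Registry of single-key shortcuts that rejects collisions at bind time."""

    def __init__(self) -> None:
        self._bindings: dict[str, tuple[str, Callable[[], None]]] = {}

    def bind(self, key: str, description: str, action: Callable[[], None]) -> None:
        if key in self._bindings:
            existing = self._bindings[key][0]
            raise ValueError(f"key {key!r} is already bound to {existing!r}")
        self._bindings[key] = (description, action)

    def dispatch(self, key: str) -> bool:
        """Run the action for a key; return False if the key is unbound."""
        binding = self._bindings.get(key)
        if binding is None:
            return False
        binding[1]()
        return True

    def help_lines(self) -> list[str]:
        """Lines for the help overlay, in a fixed (sorted) order."""
        return [f"{key}  {desc}" for key, (desc, _) in sorted(self._bindings.items())]
```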

### Accessibility (MUST)

* Fully navigable by keyboard
* Visible focus states
* Screen-reader labels for evidence pills and drawer controls
* Color is never the only signal (badges must have text/icon)

---

## 6) Evidence model: what every alert should attempt to provide

Treat this as the **minimum evidence bundle**. Each item may be “unavailable,” but must be explicit.

**MUST** support:

1. **Reachability proof**

   * At least one of:

     * function-level call path: `entry → … → vulnerable_sink`
     * package/module import chain
   * Includes confidence/algorithm tag: `static`, `dynamic`, `heuristic`
2. **Call stack snippet**

   * 5–10 frames around the relevant node with file:line anchors where possible
3. **Provenance**

   * DSSE attestation or equivalent statement
   * Artifact ancestry chain: image → layer → artifact → commit (as available)
   * Verification status: verified / pending / failed (with reason)
4. **Decision state**

   * VEX status + reason + timestamps
5. **Evidence hash set**

   * Content-addressed hashes of each evidence artifact included in the decision

**SHOULD**

* “Evidence freshness”: when computed, tool version, input revisions.

---

## 7) Performance and graph rendering requirements

### TTFS budget (MUST)

* When opening an alert:

  * **<200ms**: show skeleton and cached row metadata
  * **<500ms**: render at least one evidence pill with meaningful content OR a cached preview image
  * **<1.5s p95**: render reachability + provenance for typical alerts

### Graph rendering for large call graphs (MUST)

* **Two-phase rendering**

  1. Server-generated **static snapshot** (PNG/SVG) displayed immediately
  2. Interactive graph hydrates lazily on user expand
* **Progressive expansion**

  * Load 1-hop neighborhood first; expand on click
* **Deterministic layout**

  * Same input produces same layout anchors (no reshuffles between refreshes)
* **Fan-out control**

  * Collapse repeated library paths into “macro edges” to keep the graph readable
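
Progressive expansion with deterministic ordering can be sketched over a plain adjacency map. The `one_hop` helper, the `max_fanout` default of 25, and the `hidden_count` field are assumptions for illustration; macro-edge collapsing is not shown here.

```python
def one_hop(graph: dict[str, list[str]], node: str, max_fanout: int = 25) -> dict:
    """Return the 1-hop neighborhood of a node, sorted and capped.

    Sorting makes the result identical for identical input; the cap keeps
    very wide nodes from flooding the first render.
    """
    neighbors = sorted(set(graph.get(node, [])))
    shown = neighbors[:max_fanout]
    return {
        "node": node,
        "neighbors": shown,
        "hidden_count": len(neighbors) - len(shown),
    }


def expand(visible: dict[str, list[str]], graph: dict[str, list[str]], node: str,
           max_fanout: int = 25) -> dict[str, list[str]]:
    """Return a new visible-graph map with one more node expanded on demand."""
    updated = dict(visible)
    updated[node] = one_hop(graph, node, max_fanout)["neighbors"]
    return updated
```

A client would call `one_hop` for the entry node on open and `expand` on each click, so the payload for the first render stays small regardless of total graph size.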

---

## 8) Offline mode requirements

Offline is not “nice to have”; it is a defined mode.

### Offline evidence bundle (MUST)

* A single file (e.g., `.stella.bundle.tgz`) that contains:

  * Alert metadata snapshot
  * Evidence artifacts (reachability proofs, call stacks, provenance attestations)
  * SBOM slice(s) necessary for diffs
  * VEX decision history (if available)
  * Manifest with content hashes (for example, a Merkle-style hash tree)
* Bundle must be **signed** (or include signature material) and verifiable.
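
A hedged sketch of the manifest idea: hash every file in the bundle, derive one root hash from the sorted entries, and verify by recomputing. Files are modeled as an in-memory `path -> bytes` map to keep the example self-contained. Signature checking is deliberately left out; in a real bundle the root hash would be what gets signed, and that step is an assumption not shown here.

```python
import hashlib


def build_manifest(files: dict[str, bytes]) -> dict:
    """Per-file SHA-256 plus a root hash over the sorted 'path digest' lines."""
    entries = {path: hashlib.sha256(data).hexdigest() for path, data in sorted(files.items())}
    root_input = "\n".join(f"{path} {digest}" for path, digest in entries.items())
    root = hashlib.sha256(root_input.encode("utf-8")).hexdigest()
    return {"entries": entries, "root": root}


def verify_bundle(files: dict[str, bytes], manifest: dict) -> list[str]:
    """Return a sorted list of problems; an empty list means the content matches."""
    problems = []
    rebuilt = build_manifest(files)
    for path, digest in manifest["entries"].items():
        if path not in files:
            problems.append(f"missing: {path}")
        elif rebuilt["entries"][path] != digest:
            problems.append(f"hash mismatch: {path}")
    for path in files:
        if path not in manifest["entries"]:
            problems.append(f"unlisted: {path}")
    if not problems and rebuilt["root"] != manifest["root"]:
        problems.append("root hash mismatch")
    return sorted(problems)
```

Returning a list of named problems, rather than a boolean, lets the UI show a specific reason per evidence panel instead of a generic error.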

### UI behavior (MUST)

* If bundle is present:

  * UI loads evidence from it first
  * Any missing items show “enrichment pending” (not “error”)
* If network returns:

  * Background refresh allowed, but **must not reorder** the alert list unexpectedly
  * Must surface “updated evidence available” as a user-controlled refresh, not an auto-switch that changes context mid-triage

---

## 9) Auditability and replay requirements

### Decision event schema (MUST)

Every recorded decision must store:

* `alert_id`, `artifact_id` (image digest or commit hash)
* `actor_id`, `timestamp`
* `decision_status` (Affected/Not affected/Under investigation)
* `reason_code` (preset) + `reason_text`
* `evidence_hashes[]` (content-addressed hashes)
* `policy_context` (ruleset version, policy id)
* `replay_token` (hash of inputs needed to reproduce)
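
The fields above map naturally onto an immutable record that validates itself and serializes to one JSONL line. This is a sketch under assumptions: the lowercase status strings, the ISO-8601 string timestamp, and the specific validation rules are illustrative choices, not a fixed contract.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_STATUSES = {"affected", "not_affected", "under_investigation"}


@dataclass(frozen=True)
class DecisionEvent:
    alert_id: str
    artifact_id: str          # image digest or commit hash
    actor_id: str
    timestamp: str            # ISO-8601 string (assumed format)
    decision_status: str
    reason_code: str
    reason_text: str
    evidence_hashes: tuple[str, ...]
    policy_context: dict      # e.g. ruleset version, policy id
    replay_token: str

    def __post_init__(self) -> None:
        if self.decision_status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown decision_status: {self.decision_status!r}")
        if not self.evidence_hashes:
            raise ValueError("a decision must reference at least one evidence hash")
        if not self.replay_token:
            raise ValueError("a decision must carry a replay token")


def to_jsonl_line(event: DecisionEvent) -> str:
    """One compact JSON object per line, with sorted keys for stable exports."""
    return json.dumps(asdict(event), sort_keys=True, separators=(",", ":"))
```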

### Replay token (MUST)

* Deterministic hash of:

  * scan inputs (SBOM digest, image digest, tool versions)
  * policy/rules versions
  * reachability algorithm version
* “Reproduce” button produces a CLI snippet (copyable) pinned to these versions.

### Export (MUST)

* Exportable audit bundle that includes:

  * JSONL of decision events
  * evidence artifacts referenced by hashes
  * signatures/attestations
* Export must be stable and verifiable later.

---

## 10) API and data contract guidelines (developer-facing)

This is an implementation guideline, not a full API spec—keep it simple and cache-friendly.

### MUST endpoints (or equivalent)

* `GET /alerts?filters…` → list view payload (small, cacheable)
* `GET /alerts/{id}/evidence` → evidence payload (reachability, call stack, provenance, hashes)
* `POST /alerts/{id}/decisions` → record decision event (append-only)
* `GET /alerts/{id}/audit` → audit timeline
* `GET /alerts/{id}/diff?baseline=…` → SBOM/VEX diff view
* `GET /bundles/{id}` and/or `POST /bundles/verify` → offline bundle download/verify

### Evidence payload guidelines (MUST)

* Deterministic ordering for arrays and nodes (stable sorts).
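* Explicit `status` per evidence section: `available | loading | pending | unavailable | error`, where `pending` means the evidence is present locally but verification or enrichment has not completed.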
* Include `hash` per artifact for content addressing.

**Example shape**

```json
{
  "alert_id": "a123",
  "reachability": { "status": "available", "hash": "sha256:…", "proof": { "type": "call_path", "nodes": [...] } },
  "callstack":     { "status": "available", "hash": "sha256:…", "frames": [...] },
  "provenance":    { "status": "pending",   "hash": null,       "dsse": { "embedded": true, "payload": "…" } },
  "vex":           { "status": "available", "current": {...}, "history": [...] },
  "hashes": ["sha256:…", "sha256:…"]
}
```

---

## 11) Telemetry requirements (how we prove it’s fast)

**MUST** instrument:

* `alert_opened` (timestamp, alert_id)
* `evidence_first_paint` (timestamp, evidence_type)
* `decision_recorded` (timestamp, clicks_count, evidence_bitset)
* `bundle_loaded` (hit/miss, size, verification_status)
* `graph_preview_paint` and `graph_hydrated`

**MUST** compute:

* TTFS = `evidence_first_paint - alert_opened`
* Clicks‑to‑Closure = interaction counter per alert until decision recorded
* Evidence completeness bitset at decision time: reachability/callstack/provenance/vex present
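
The TTFS computation above can be sketched offline over a flat event list. The event shape (`name`, `alert_id`, `ts_ms`) and the choice of the nearest-rank percentile method are assumptions; if an alert is opened more than once, this sketch measures from the first open only.

```python
import math


def ttfs_per_alert(events: list[dict]) -> dict[str, int]:
    """TTFS in ms per alert: first evidence paint minus first open."""
    opened: dict[str, int] = {}
    first_paint: dict[str, int] = {}
    for event in sorted(events, key=lambda e: e["ts_ms"]):
        alert_id = event["alert_id"]
        if event["name"] == "alert_opened":
            opened.setdefault(alert_id, event["ts_ms"])
        elif event["name"] == "evidence_first_paint" and alert_id in opened:
            first_paint.setdefault(alert_id, event["ts_ms"])
    return {alert_id: first_paint[alert_id] - opened[alert_id] for alert_id in first_paint}


def percentile(values: list[int], pct: float) -> int:
    """Nearest-rank percentile; raises on an empty sample."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

Usage would look like `percentile(list(ttfs_per_alert(events).values()), 95)` compared against the 1.5 s target; alerts that were opened but never painted are absent from the result and are worth counting separately.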

---

## 12) Error handling and edge cases

**MUST**

* Never show empty states without explanation.
* Distinguish between:

  * “not computed yet”
  * “not possible due to missing inputs”
  * “blocked by permissions”
  * “offline—enrichment pending”
  * “verification failed”
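
One way to enforce "never empty without explanation" is to make the reason a required value whenever content is absent. The sketch below models the five cases above as an enum; the `MissingReason` names and the `panel_text` helper are hypothetical.

```python
from enum import Enum
from typing import Optional


class MissingReason(Enum):
    NOT_COMPUTED = "not computed yet"
    MISSING_INPUTS = "not possible due to missing inputs"
    FORBIDDEN = "blocked by permissions"
    OFFLINE_PENDING = "offline, enrichment pending"
    VERIFICATION_FAILED = "verification failed"


def panel_text(section: str, content: Optional[str], reason: Optional[MissingReason]) -> str:
    """Return what an evidence panel should display; never an empty string."""
    if content:
        return content
    if reason is None:
        # Refuse to render a blank panel: callers must say why evidence is missing.
        raise ValueError(f"{section}: missing evidence must carry a reason")
    return f"{section}: {reason.value}"
```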

**SHOULD**

* Offer “Request enrichment” action when evidence missing (creates a job/task id).

---

## 13) Security, permissions, and multi-tenancy

**MUST**

* RBAC gating for:

  * viewing provenance attestations
  * recording decisions
  * exporting audit bundles
* All decision events are immutable; corrections are new events (append-only).
* PII handling:

  * Avoid storing secrets in freeform reason text; warn when pasted text matches secret-like patterns (optional, P1).
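
The append-only rule above (corrections are new events, never edits) can be made tamper-evident with a simple hash chain, where each entry commits to the one before it. This is a sketch of that general technique, not a description of how Stella Ops stores its log; the `AuditLog` class, the genesis value, and the entry layout are assumptions, and signed checkpoints are not shown.

```python
import copy
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditLog:
    """In-memory append-only log; there is deliberately no update or delete."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {"prev": prev, "event": copy.deepcopy(event), "hash": _entry_hash(prev, event)}
        self._entries.append(entry)
        return entry["hash"]

    def entries(self) -> list[dict]:
        """Return copies so callers cannot mutate stored entries."""
        return copy.deepcopy(self._entries)


def verify_chain(entries: list[dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in entries:
        if entry["prev"] != prev or entry["hash"] != _entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

A correction is then just another `append` whose event references the earlier decision, and an export can be re-verified with `verify_chain` without access to the original system.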

---

## 14) Engineering execution plan (priorities)

### P0 (ship first)

* Evidence-first alert detail landing
* Decision drawer + append-only audit
* Deterministic alert list sort + reachability badge
* Evidence API + decision POST
* TTFS + clicks telemetry
* Static graph preview + lazy hydration

### P1

* Offline bundle load/verify + offline rendering
* Smart diff view (risk shift grouping)
* Exportable audit bundle
* Keyboard shortcuts + help overlay

### P2

* Inline quick decisions from list
* Advanced graph search within view
* Suggest reason presets based on evidence patterns

---

## 15) Acceptance criteria checklist (what QA signs off)

A build is acceptable when:

* Opening an alert renders at least one evidence pill within **500ms** (with cache) and TTFS p95 meets target under network simulation.
* Users can record A/N/U decisions with reason and see an audit event immediately.
* Decision event includes evidence hashes + replay token.
* Alert list sorting is stable and deterministic across refresh.
* Graph preview renders within **300ms**; interactive graph hydrates only on expand.
* Offline bundle renders evidence without network; missing items show “enrichment pending,” not errors.
* Keyboard shortcuts work; `?` overlay lists them; full keyboard navigation is possible.

---

If you want, I can also format this into a **developer-ready ticket pack** (epics + user stories + acceptance tests) so engineers can implement without interpretation drift.