Here’s a lightweight pattern to make failures show up instantly while keeping backends decoupled: **emit a tiny, versioned event the moment you know something failed**, and attach pointers to heavier evidence that can arrive later.

---

# Why this helps

* **UI reacts in real time**: show “Failed at Step X (E123)” immediately—no waiting for logs, SBOMs, or artifacts to upload/process.
* **Backends evolve safely**: logs, traces, SBOM/VEX, heap dumps, etc., can change format or arrive out of order without breaking the UI contract.
* **Deterministic UX**: a small, stable schema prevents flaky pipelines from blocking visibility.
* **Great for air‑gapped/offline**: the tiny event rides your internal bus/storage; bulky payloads sync or materialize when available.

---

# The event itself (keep it tiny)

**Fields (stable, versioned):**

* `v` — schema version (e.g., `1`).
* `ts` — event timestamp (UTC, ISO 8601).
* `run_id` — pipeline/execution correlation ID.
* `stage` — coarse phase (e.g., `fetch`, `build`, `scan`, `policy`, `deploy`).
* `step` — fine-grained step (e.g., `trivy-scan`, `dotnet-restore`).
* `status` — `fail|warn|pass|info` (for this pattern, you’ll use `fail`).
* `error_class` — stable classifier (e.g., `NETWORK_DNS`, `AUTH_EXPIRED`, `POLICY_BLOCK`, `VULN_REACHABLE`).
* `summary` — short human string (“Reachable vuln blocks release”).
* `pointers` — array of *opaque, resolvable references* (log offsets, artifact URIs, attestation IDs).
* `kv` — optional tiny key/values for quick filtering (e.g., `severity=A`, `package=openssl`).
* `sig` (optional) — detached/inline signature (DSSE) for integrity.

**Example**

```json
{
  "v": 1,
  "ts": "2025-12-13T12:10:03Z",
  "run_id": "run_7f3c6a8",
  "stage": "policy",
  "step": "vex-gate",
  "status": "fail",
  "error_class": "VULN_REACHABLE",
  "summary": "Reachable CVE blocks release",
  "pointers": [
    {"type":"log", "ref":"logs://scanner/7f3c6a8#L1423-L1480"},
    {"type":"attestation", "ref":"rekor://sha256:…"},
    {"type":"sbom", "ref":"artifact://sbom/cyclonedx@run_7f3c6a8.json"}
  ],
  "kv": {"cve":"CVE-2025-12345", "component":"openssl", "severity":"A"}
}
```

---

# UI behavior (instant, then enrich)

1. **Instant render** (sub-200 ms): show a red card with stage/step, `error_class`, and `summary`.
2. **Progressive hydration**: as pointers resolve, add:

   * “View log excerpt” (jump to `#L1423-L1480`)
   * “Open attestation” (verify DSSE/Rekor)
   * “Inspect SBOM diff” (component → version → call‑graph)
3. **Stable affordances**: UI never breaks if a pointer is slow/missing; it just shows a spinner or “awaiting evidence”.

---

# Backend contract

* **Publish early**: emit on first knowledge of failure (e.g., non‑zero exit, policy deny, TLS error).
* **Don’t embed heavy data**: only pointers or tiny facts for filters.
* **Pointer resolution is pluggable**: files, object storage, Postgres row, Valkey cache key, Rekor entry—whatever suits the deployment.
* **Version discipline**: bump `v` only for breaking schema changes; additive fields are fine.

---

# Minimal topic map (so teams agree on names)

* `stage`: `fetch|build|scan|policy|sign|package|deploy`
* `error_class` suggestions:

  * Infra: `NETWORK_DNS`, `NETWORK_TIMEOUT`, `REGISTRY_403`, `DISK_FULL`
  * AuthN/Z: `AUTH_EXPIRED`, `TOKEN_SCOPE_MISS`
  * Supply chain: `ATTESTATION_MISSING`, `SIGNATURE_INVALID`, `SBOM_STALE`
  * Secure build: `POLICY_BLOCK`, `VULN_REACHABLE`, `MALWARE_FLAG`
  * Runtime: `IMAGE_DRIFT`, `PROVENANCE_MISMATCH`

Keep each to a 1–2 line definition in a shared doc.

---

# Drop‑in for Stella Ops (tailored)

* **Emitter**: `StellaOps.Events` (tiny .NET lib) used by Scanner/Policy/Scheduler to publish `TinyFailureEvent`.
* **Transport**: Postgres notify (default) + Valkey pub/sub accelerator. (Matches your Postgres+Valkey architecture choice.)
* **Resolver service**: `EvidenceGateway` that turns `pointers` into viewable slices (log excerpts, SBOM component focus, Rekor proof).
* **UI**: “Failure Feed” panel shows cards from the event stream; detail drawer resolves pointers on demand.
* **Signing**: optional DSSE for events; Rekor (or mirror) for attestations—your “Proof‑Linked” moat.
* **Air‑gap**: pointers use `artifact://` and `row://` schemes resolvable entirely on‑prem.

---

# Quick implementation checklist

* Define `TinyFailureEvent` schema v1 and `error_class` registry.
* Add emit helpers for each module (`FailNow(summary, error_class, pointers, kv)`).
* Build `EvidenceGateway.Resolve(pointer)` handlers.
* UI: render card instantly; hydrate sections as resolvers return.
* Telemetry: metrics on TTFE (Time-To-Failure-Event) and pointer hydration latencies.
* Docs: 1‑page contract; examples for each error_class.

If you want, I can draft the .NET 10 interfaces (`ITinyEventEmitter`, resolvers, and a small Razor/Angular card) and a Postgres schema you can paste into your repo.

---

Below is a **PM-grade implementation spec** for “Real-time Failure Signaling” using **Tiny Failure Events** + **Evidence Pointers**, written so engineers can build it without guessing.

---

# Product: Real-time Failure Signaling (Tiny Failure Events)

## Goal

When any pipeline run fails, users must see **what failed and where** (stage/step + error class + short summary) **immediately**, even if logs/SBOM/attestations are delayed, huge, or unavailable.

The UI must render a failure card from a tiny event and then progressively enrich with evidence as it becomes resolvable.

## Outcomes we must deliver

1. **Instant visibility:** “Failed at Step X” appears within seconds of failure.
2. **Decoupling:** UI depends only on a stable tiny schema, not on log formats/artifact structures.
3. **Evidence linking:** Users can open logs/SBOM/attestations when available, via pointers.
4. **Reliability:** Duplicate/out-of-order events don’t break the UI; state remains consistent.
5. **Security:** Evidence access is authorized; pointers do not leak sensitive info.

---

# Scope

## In scope (MVP)

* Emit **TinyFailureEvent v1** on first detected failure for a step.
* Transport events in near real-time to UI.
* Store events durably and allow UI to fetch a run’s event timeline.
* Support evidence pointers for:

  * logs (excerptable)
  * artifacts (SBOM, reports)
  * attestations (provenance/signature)
* UI:

  * show run timeline
  * show failure card instantly
  * hydrate evidence sections on demand (or automatically where feasible)

## Out of scope (MVP)

* Full trace viewer / distributed tracing UI (we can link to external trace systems via pointer).
* Automated remediation (“fix it”) actions.
* Full-blown case management.

---

# Key terms and definitions

* **Run:** A single execution of a pipeline. Identified by `run_id`.
* **Stage:** Coarse lifecycle phase (`fetch`, `build`, `scan`, `policy`, `sign`, `package`, `deploy`, `runtime`).
* **Step:** A concrete activity within a stage (`dotnet-restore`, `trivy-scan`, `vex-gate`).
* **Tiny Failure Event:** A small message representing “this step failed”, including stable classification and references to evidence.
* **Pointer:** An opaque reference that can be resolved into evidence content or a link later.

---

# User stories and acceptance criteria

## Story 1: I see failure instantly

**As a** developer
**I want** to see which step failed immediately
**So that** I don’t wait on logs/artifacts

**Acceptance criteria**

* When a step fails, the UI updates within **≤ 2 seconds p95** from the time the orchestrator/runner detects failure.
* The failure card includes:

  * stage, step
  * error class
  * human summary
  * timestamp
  * (optional) primary key/value details (e.g., CVE, severity)

## Story 2: I can open evidence when available

**As a** release engineer
**I want** to click evidence links (logs/SBOM/attestation)
**So that** I can diagnose/root-cause

**Acceptance criteria**

* Failure card shows evidence sections as:

  * **Available** (clickable)
  * **Pending** (spinner / “awaiting evidence”)
  * **Unavailable** (“not produced” or “access denied”)
* Clicking log evidence opens an excerpt view, not a 500MB file download.
* Evidence access enforces authorization (same as run access).

## Story 3: Events are robust to duplicates/out-of-order

**As a** user
**I want** the timeline to remain correct
**Even if** event delivery is at-least-once

**Acceptance criteria**

* UI displays exactly one current “failed” state per step attempt.
* Duplicate events do not create duplicate cards.
* Out-of-order arrival does not revert a step from fail → pass.

---

# Functional requirements (what developers must build)

## FR1: TinyFailureEvent schema v1

### Required fields

All producers MUST emit events that validate against this schema.

```json
{
  "v": 1,
  "event_id": "evt_01J…", 
  "ts": "2025-12-13T12:10:03.123Z",
  "run_id": "run_7f3c6a8",
  "stage": "policy",
  "step": "vex-gate",
  "attempt": 1,
  "status": "fail",
  "error_class": "VULN_REACHABLE",
  "summary": "Reachable CVE blocks release",
  "pointers": [],
  "kv": {}
}
```

### Field definitions & constraints

* `v` (int, required): must be `1` for this spec.
* `event_id` (string, required): globally unique.

  * Format: `evt_<ULID>` (ULID recommended for time-sortable IDs).
* `ts` (RFC3339 UTC, required): creation timestamp.
* `run_id` (string, required): stable correlation id for run.
* `stage` (enum string, required): one of:

  * `fetch|build|scan|policy|sign|package|deploy|runtime`
* `step` (string, required): lowercase kebab-case recommended; max 80 chars.
* `attempt` (int, required): starts at 1; increments for retries.
* `status` (enum string, required for this feature): `fail` (MVP supports fail only; schema allows later expansion)
* `error_class` (string, required): stable classifier from a shared registry (see FR2).

  * max 64 chars; uppercase snake-case.
* `summary` (string, required): human readable, max 140 chars.
* `pointers` (array, optional): max 20 items; each item is a `Pointer` object (see FR3).
* `kv` (object, optional): small metadata map for filtering.

  * max 20 keys
  * key max 32 chars; value max 120 chars
  * no nested objects/arrays
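
The constraints above can be checked mechanically at ingest. The following is a minimal sketch, assuming a hypothetical `validateEventV1` helper and `CandidateEvent` type that are not part of the spec. It uses `Date.parse`, which is more lenient than RFC 3339, and it only checks the `evt_` prefix of `event_id`, so treat it as a starting point rather than a full validator.

```ts
// Hypothetical validator for the v1 constraints above; names are illustrative.
type CandidateEvent = {
  v: number;
  event_id: string;
  ts: string;
  run_id: string;
  stage: string;
  step: string;
  attempt: number;
  status: string;
  error_class: string;
  summary: string;
  pointers?: unknown[];
  kv?: Record<string, unknown>;
};

const STAGES = ['fetch', 'build', 'scan', 'policy', 'sign', 'package', 'deploy', 'runtime'];

// Returns a list of violations; an empty list means the event passes these checks.
function validateEventV1(e: CandidateEvent): string[] {
  const errors: string[] = [];
  if (e.v !== 1) errors.push('v must be 1');
  if (e.event_id.indexOf('evt_') !== 0) errors.push('event_id must start with evt_');
  // Date.parse is more lenient than RFC 3339; a production validator should be stricter.
  if (isNaN(Date.parse(e.ts))) errors.push('ts is not a parseable timestamp');
  if (e.run_id.length === 0) errors.push('run_id is required');
  if (STAGES.indexOf(e.stage) === -1) errors.push('stage is not in the enum');
  if (e.step.length === 0 || e.step.length > 80) errors.push('step must be 1-80 chars');
  if (Math.floor(e.attempt) !== e.attempt || e.attempt < 1) errors.push('attempt must be an integer >= 1');
  if (e.status !== 'fail') errors.push('status must be fail in the MVP');
  if (!/^[A-Z][A-Z0-9_]{0,63}$/.test(e.error_class)) errors.push('error_class must be uppercase snake-case, max 64 chars');
  if (e.summary.length === 0 || e.summary.length > 140) errors.push('summary must be 1-140 chars');
  if ((e.pointers ?? []).length > 20) errors.push('pointers allows at most 20 items');

  const kv = e.kv ?? {};
  const keys = Object.keys(kv);
  if (keys.length > 20) errors.push('kv allows at most 20 keys');
  for (const key of keys) {
    const value = kv[key];
    if (key.length > 32) errors.push('kv key too long: ' + key);
    if (typeof value !== 'string') errors.push('kv value must be a flat string: ' + key);
    else if (value.length > 120) errors.push('kv value too long: ' + key);
  }
  return errors;
}
```

Returning a list of violations instead of throwing lets the ingest endpoint report every problem in a single 400 response.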

### Size limits

* Entire event payload MUST be ≤ **8 KB** serialized JSON.
* If producers exceed limits, they MUST truncate `summary` and drop low-priority `kv` keys before failing emission.
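
One way a producer SDK might apply this rule is sketched below. `fitToLimit` and its `kvPriority` parameter are hypothetical names, and the order of operations (cap `summary` at its 140-character limit, then drop `kv` keys from least to most important) is an assumption, since the spec does not define key priority.

```ts
// Sketch of producer-side size enforcement; `fitToLimit` and `kvPriority` are hypothetical.
type SizedEvent = { summary: string; kv?: Record<string, string>; [field: string]: unknown };

const MAX_EVENT_BYTES = 8 * 1024;
const MAX_SUMMARY_CHARS = 140;

// UTF-8 byte length without Node- or DOM-specific APIs: each %XX escape is one byte.
function utf8Length(s: string): number {
  return encodeURIComponent(s).replace(/%[0-9A-F]{2}/g, 'x').length;
}

function serializedSize(e: SizedEvent): number {
  return utf8Length(JSON.stringify(e));
}

// kvPriority lists keys from most to least important; unlisted keys are dropped first.
// Returns null when the event still does not fit, so the caller can fail the emission.
function fitToLimit(e: SizedEvent, kvPriority: string[], maxBytes: number = MAX_EVENT_BYTES): SizedEvent | null {
  const kv: Record<string, string> = { ...(e.kv ?? {}) };
  const out: SizedEvent = { ...e, summary: e.summary.slice(0, MAX_SUMMARY_CHARS), kv };

  const rank = (key: string): number => {
    const i = kvPriority.indexOf(key);
    return i === -1 ? kvPriority.length : i;
  };
  const dropOrder = Object.keys(kv).sort((a, b) => rank(b) - rank(a));

  for (const key of dropOrder) {
    if (serializedSize(out) <= maxBytes) break;
    delete kv[key]; // out.kv is the same object, so the serialized size shrinks
  }
  return serializedSize(out) <= maxBytes ? out : null;
}
```

The input event is never mutated; the function works on copies so a caller can still log the original payload when emission fails.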

---

## FR2: Error class registry (stable contract)

We maintain a canonical list of `error_class` values in a shared repo/module.

### Requirements

* Each `error_class` MUST have:

  * name (e.g., `NETWORK_DNS`)
  * short description
  * severity mapping (optional)
  * recommended remediation hints (optional, can be UI-side)
* Producers MUST use a registry value if applicable.
* Producers MAY emit `error_class="UNKNOWN"` if no mapping exists, but MUST log a warning and increment a metric (`unknown_error_class_count`).

### Initial registry (minimum)

Infra/Network:

* `NETWORK_DNS`
* `NETWORK_TIMEOUT`
* `DISK_FULL`

Auth:

* `AUTH_EXPIRED`
* `REGISTRY_403`

Supply chain:

* `SIGNATURE_INVALID`
* `ATTESTATION_MISSING`
* `SBOM_MISSING`

Policy/Security:

* `POLICY_BLOCK`
* `VULN_REACHABLE`
* `MALWARE_FLAG`

Runner/Orchestrator:

* `STEP_TIMEOUT`
* `RUN_ABORTED`
* `WORKER_LOST`
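
A shared registry module could look roughly like the sketch below. The `ERROR_CLASSES` map, the `toRegisteredClass` helper, and the description strings are all illustrative placeholders, not agreed definitions; only a few of the classes listed above are included to keep it short.

```ts
// Illustrative registry module; the descriptions are placeholders, not agreed definitions.
type ErrorClassEntry = { name: string; description: string };

const ERROR_CLASSES: Record<string, ErrorClassEntry> = {
  NETWORK_DNS: { name: 'NETWORK_DNS', description: 'DNS lookup failed' },
  AUTH_EXPIRED: { name: 'AUTH_EXPIRED', description: 'Credential or token has expired' },
  POLICY_BLOCK: { name: 'POLICY_BLOCK', description: 'Policy decision was deny/block' },
  STEP_TIMEOUT: { name: 'STEP_TIMEOUT', description: 'Step exceeded its time budget' },
};

// Stand-in for a real metrics client.
const registryMetrics = { unknown_error_class_count: 0 };

// Returns the candidate if it is registered; otherwise warns, counts, and falls back to UNKNOWN.
function toRegisteredClass(candidate: string, warn: (message: string) => void): string {
  if (Object.prototype.hasOwnProperty.call(ERROR_CLASSES, candidate)) return candidate;
  registryMetrics.unknown_error_class_count += 1;
  warn('unregistered error_class "' + candidate + '", emitting UNKNOWN');
  return 'UNKNOWN';
}
```

Passing the `warn` callback in keeps the helper independent of any particular logging library.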

---

## FR3: Evidence pointer format and rules

### Pointer object schema

```json
{
  "type": "log|artifact|attestation|url|trace",
  "ref": "logs://scanner/run_7f3c6a8#L1423-L1480",
  "mime": "text/plain",
  "label": "Scanner log excerpt",
  "expires_at": "2025-12-20T00:00:00Z",
  "sha256": "optional hex"
}
```

### Rules

* `type` and `ref` are required.
* `ref` is opaque to UI; UI passes it to the resolver service.
* `label` is optional, but strongly recommended for UI friendliness.
* `expires_at` is optional; if present UI should show “may expire”.
* `sha256` optional for immutability verification (artifacts/attestations especially).

### Allowed schemes (MVP)

* `logs://<provider>/<run_id>#Lx-Ly`
* `artifact://<kind>/<name>@<version-or-run-id>`
* `attestation://<store>/<id-or-digest>`
* `url://<encoded>` (only internal allowed; resolver enforces)
* `trace://<system>/<trace-id>`
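
On the resolver side, refs can be checked against these schemes before any lookup happens. The sketch below assumes hypothetical `schemeOf` and `parseLogRef` helpers and treats the `#Lx-Ly` fragment as 1-based, inclusive line numbers, which the spec does not state explicitly.

```ts
// Hypothetical resolver-side helpers for the MVP pointer schemes.
const ALLOWED_SCHEMES = ['logs', 'artifact', 'attestation', 'url', 'trace'];

// Returns the scheme when it is one of the allowed MVP schemes, otherwise null.
function schemeOf(ref: string): string | null {
  const m = /^([a-z]+):\/\//.exec(ref);
  const scheme = m === null ? undefined : m[1];
  return scheme !== undefined && ALLOWED_SCHEMES.indexOf(scheme) !== -1 ? scheme : null;
}

type LogRef = { provider: string; runId: string; startLine: number; endLine: number };

// Parses logs://<provider>/<run_id>#Lx-Ly; returns null for anything malformed.
function parseLogRef(ref: string): LogRef | null {
  const m = /^logs:\/\/([^\/#]+)\/([^\/#]+)#L(\d+)-L(\d+)$/.exec(ref);
  if (m === null) return null;
  const startLine = Number(m[3]);
  const endLine = Number(m[4]);
  // Assumption: line numbers are 1-based and the range is inclusive.
  if (startLine < 1 || endLine < startLine) return null;
  return { provider: String(m[1]), runId: String(m[2]), startLine, endLine };
}
```

Only the resolver should run this parsing; the UI keeps treating `ref` as an opaque string.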

### Security constraints

* Pointers MUST NOT embed secrets (tokens, passwords).
* Any pointer that could expose sensitive data MUST be resolvable only through the Evidence Gateway (FR6), never directly client-side.
* The resolver MUST enforce authorization for the requesting user.

---

## FR4: Emission rules (when and how events are produced)

### When to emit

Producers MUST emit a TinyFailureEvent when:

1. A step exits non-zero.
2. A policy decision is “deny/block”.
3. A required artifact/attestation is missing at gate time.
4. A step times out.
5. The worker is lost (emitted by orchestrator watchdog).

### Exactly-once vs at-least-once

* Transport can be **at-least-once**.
* Consumers MUST be idempotent using `(run_id, stage, step, attempt, status)` + `event_id`.

### One failure event per step attempt

* For a given `(run_id, stage, step, attempt)`:

  * First emitted `status=fail` is canonical.
  * Later fail events for the same tuple are treated as “updates” only if they add pointers/kv (see “Updates / enrichment” below).

### Updates / enrichment

We support enrichment without breaking “tiny”:

* Producers MAY emit a second event **with the same tuple** (run_id/stage/step/attempt/status) that adds pointers or kv after the initial fail.
* Consumers MUST merge pointers (dedupe identical `type+ref`) and merge kv (new keys overwrite old keys).
* Producers MUST NOT spam; max 3 enrichment events per tuple.

---

## FR5: Event storage and aggregation

### Required services/components

1. **Event Ingest** (API or internal library endpoint)
2. **Event Store** (durable DB table)
3. **Realtime Fanout** (pub/sub channel)
4. **Run Timeline API** (query per run)

### Behavior

* On ingest:

  * Validate schema (reject invalid with 400/validation error).
  * Persist to event store.
  * Publish to realtime channel.
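
The ordering matters: an event should be stored before it is fanned out. A minimal in-memory sketch of that sequence follows, with `ingestEvent` and `IngestDeps` as hypothetical names standing in for the real endpoint, Postgres, and pub/sub. Acknowledging a repeated `event_id` without re-publishing it is an assumption, not something the spec prescribes.

```ts
// In-memory sketch of validate -> persist -> publish; all names are hypothetical.
type IngestEvent = { event_id: string; run_id: string };
type IngestOutcome =
  | { ok: true; duplicate: boolean }
  | { ok: false; httpStatus: 400; errors: string[] };

type IngestDeps = {
  validate: (e: IngestEvent) => string[];             // empty list = valid
  store: Record<string, IngestEvent>;                 // stand-in for the durable event store
  publish: (channel: string, e: IngestEvent) => void; // stand-in for the realtime fanout
};

function ingestEvent(e: IngestEvent, deps: IngestDeps): IngestOutcome {
  const errors = deps.validate(e);
  if (errors.length > 0) return { ok: false, httpStatus: 400, errors };

  // Assumption: a repeated event_id is acknowledged but not stored or fanned out again.
  if (Object.prototype.hasOwnProperty.call(deps.store, e.event_id)) {
    return { ok: true, duplicate: true };
  }

  deps.store[e.event_id] = e;                     // persist first ...
  deps.publish('run:' + e.run_id + ':events', e); // ... then fan out
  return { ok: true, duplicate: false };
}
```

In a real deployment the store write and the publish would be separate network calls, so a crash between them is still possible; the run timeline query is what lets the UI recover events it never received in realtime.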

### Suggested DB model (Postgres)

Table: `run_events`

* `event_id` PK
* `run_id` indexed
* `ts` indexed
* `stage`, `step`, `attempt`, `status` indexed composite
* `payload` jsonb
* `ingested_at`

Uniqueness constraints:

* `event_id` unique
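* Optional: unique on `(run_id, stage, step, attempt, status, hash(summary))` if you want stronger dedupe. Enrichment events share the tuple and usually the summary, so include `pointers`/`kv` in the hash or this constraint will reject them.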

### Query API

* `GET /runs/{run_id}/events` returns events sorted by `ts` ascending.
* UI should also subscribe realtime to avoid polling.

---

## FR6: Evidence Gateway (pointer resolver)

### Purpose

A single service that resolves pointers into either:

* log excerpts
* signed download URLs
* attestation display + verification data
* external trace links (sanitized)

### Endpoints (MVP)

1. **Resolve metadata**

   * `POST /evidence/resolve`
   * body: `{ "run_id": "...", "pointers": [ { "type": "...", "ref": "..." } ] }`
   * returns per pointer:

     * `status`: `available|pending|missing|denied|expired|error`
     * `kind`: `inline|link`
     * `title`
     * `mime`
     * `size_bytes` (if known)
     * `link` (if kind=link) – must be short-lived, server-generated
     * `inline_preview` (optional, small excerpt)

2. **Fetch log excerpt**

   * `GET /evidence/log-excerpt?ref=...`
   * returns:

     * `text` (max 64 KB)
     * `start_line`, `end_line`
     * `source` (provider info)

3. **Fetch artifact**

   * `GET /evidence/artifact?ref=...`
   * returns either:

     * short-lived download link
     * or 404/403/410

### AuthZ requirements

* Evidence Gateway MUST verify the caller has access to the `run_id`.
* Gateway MUST validate that the pointer belongs to that run (or is explicitly declared “global shared”).
* Gateway MUST audit-log every evidence resolution.

### Resilience

* If evidence is not ready, resolver returns `pending`, not 500.
* If pointer is unknown format, return `error` with a safe message.
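
The AuthZ and resilience rules above can be combined into a single decision function that never throws. The sketch below assumes hypothetical names (`resolvePointer`, `GatewayDeps`) and injected lookups in place of real storage. It does not produce the `expired` status, and mapping a pointer that belongs to another run to `denied` is an assumption.

```ts
// Decision sketch for one pointer; storage and authz are injected stand-ins.
type ResolveStatus = 'available' | 'pending' | 'missing' | 'denied' | 'expired' | 'error';
type ResolveResult = { ref: string; status: ResolveStatus; message?: string };

type GatewayDeps = {
  canAccessRun: (userId: string, runId: string) => boolean;
  belongsToRun: (ref: string, runId: string) => boolean;
  lookup: (ref: string) => 'ready' | 'not_ready' | 'absent';
  audit: (line: string) => void;
};

const GATEWAY_SCHEMES = ['logs', 'artifact', 'attestation', 'url', 'trace'];

function decide(userId: string, runId: string, ref: string, deps: GatewayDeps): ResolveResult {
  if (!deps.canAccessRun(userId, runId)) return { ref, status: 'denied' };
  const scheme = ref.split('://')[0];
  if (ref.indexOf('://') === -1 || GATEWAY_SCHEMES.indexOf(scheme) === -1) {
    return { ref, status: 'error', message: 'unsupported pointer format' };
  }
  // Assumption: a pointer that belongs to another run is reported as denied.
  if (!deps.belongsToRun(ref, runId)) return { ref, status: 'denied' };
  const state = deps.lookup(ref);
  if (state === 'ready') return { ref, status: 'available' };
  return { ref, status: state === 'not_ready' ? 'pending' : 'missing' };
}

// Every resolution is audit-logged, whatever the outcome.
function resolvePointer(userId: string, runId: string, ref: string, deps: GatewayDeps): ResolveResult {
  const result = decide(userId, runId, ref, deps);
  deps.audit(userId + ' ' + runId + ' ' + ref + ' -> ' + result.status);
  return result;
}
```

Checking run access before parsing the ref means a caller without access learns nothing about whether a pointer is well-formed or exists.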

---

# UI requirements (what the product must do)

## UI1: Run timeline renders from events

* The run detail page MUST show:

  * stages/steps list
  * current state per step (pass/warn/fail/running)
  * failure details if fail exists
* The failure state MUST be derived from TinyFailureEvent without requiring any log fetch.

## UI2: Failure card content (minimum)

When a fail event arrives:

* Show a red failure card with:

  * `stage` + `step`
  * `summary`
  * `error_class` badge
  * `ts` (relative + absolute on hover)
  * key kv fields (up to 4 shown; remainder behind “Show more”)

## UI3: Progressive hydration

* The card MUST include an “Evidence” section.
* For each pointer:

  * show a row with label and availability status
  * if available, show “Open”
  * if pending, show spinner + “Awaiting evidence”
  * if denied, show lock icon + “No access”
  * if missing, show “Not produced”
* Clicking “Open”:

  * logs open excerpt viewer (modal/drawer)
  * artifacts open in viewer or download (type-dependent)
  * attestations open verification view

## UI4: Realtime behavior

* UI MUST subscribe to realtime events for the run.
* UI MUST apply idempotent merge logic:

  * dedupe by `event_id`
  * merge enrichment events by tuple (run_id/stage/step/attempt/status)

## UI5: Ordering and out-of-order handling

* UI MUST sort by `ts` for display.
* UI MUST NOT regress a step state if a late “pass/info” arrives after fail.

  * Rule: `fail` is terminal for a step attempt.

---

# Non-functional requirements

## Latency

* From failure detection to UI update: **≤ 2s p95**, **≤ 5s p99** (within the same network).
* Evidence resolution:

  * `resolve` call should return in **≤ 300ms p95** for cached/known pointers.

## Reliability

* Event ingestion must be durable (stored) before fanout.
* System must tolerate:

  * duplicates
  * retries
  * out-of-order delivery
  * partial evidence availability

## Payload limits

* Event size ≤ 8 KB
* Evidence inline previews ≤ 4 KB per pointer

## Retention

* Tiny events retained ≥ 30 days (configurable).
* Evidence retention depends on provider, but resolver must surface expiry.

---

# Metrics and instrumentation (definition of success)

Producers + ingestion MUST emit:

* `ttfe_ms`: time to failure event; record which baseline is used (step start or failure detection), since the latency targets in this spec are measured from failure detection
* `event_ingest_latency_ms`
* `event_validation_fail_count`
* `unknown_error_class_count`
* `pointer_resolution_status_count{available|pending|missing|denied|expired|error}`
* `pointer_hydration_latency_ms`

UI MUST log:

* time from run page open → first event rendered
* evidence open clickthrough rate
* evidence resolution failure rate

---

# Edge cases we explicitly handle

1. **Runner killed before it can emit**

   * Orchestrator watchdog emits `WORKER_LOST` with stage/step best-effort.

2. **Logs produced after failure**

   * Initial fail event has no log pointer.
   * Later enrichment event adds log pointer (same tuple).

3. **Evidence exists but user lacks access**

   * Resolver returns `denied`; UI shows locked state.

4. **Evidence link expired**

   * Resolver returns `expired`; the UI offers a “Refresh” action that re-resolves the pointer.

5. **Multiple retries**

   * `attempt` increments; UI shows attempt number and keeps prior attempt history.

---

# Definition of Done (engineers can ship when…)

## Backend DoD

* Schema validation implemented.
* Ingest → store → fanout pipeline working.
* Enrichment merge logic implemented.
* Evidence Gateway resolves at least:

  * log excerpt pointers
  * artifact pointers
  * attestation pointers
* AuthZ enforced.

## Frontend DoD

* Run page shows failure card from TinyFailureEvent alone.
* Evidence hydration works and handles all resolver statuses.
* Realtime updates work; duplicates/out-of-order are safe.

## QA DoD (minimum test cases)

1. Step fails; event arrives; UI shows failure without logs.
2. Log pointer arrives later; UI updates evidence section.
3. Duplicate event delivery; UI shows one failure.
4. Evidence denied; UI shows lock state.
5. Out-of-order: enrichment arrives before initial fail; UI still resolves correctly.

---

# Implementation checklist (suggested division of work)

### Team A: Producer SDK

* `StellaOps.Events` library:

  * `EmitFailure(run_id, stage, step, attempt, error_class, summary, pointers, kv)`
  * ULID generation
  * payload size enforcement
  * retry w/ backoff

### Team B: Ingest + Store + Fanout

* API endpoint or internal gRPC for ingest
* Postgres `run_events` table
* Valkey pub/sub channel: `run:{run_id}:events`

### Team C: Evidence Gateway

* pointer parsing + resolvers
* log excerpt adapter(s)
* artifact download adapter(s)
* attestation viewer adapter(s)

### Team D: UI

* realtime subscription
* timeline state reducer w/ merge rules
* evidence drawer + resolver calls

---

If you want, I can also provide:

* a **formal JSON Schema** file for v1 (drop-in for validation),
* a **state reducer spec** for the UI (pure function inputs/outputs),
* and a **Postgres migration** + example queries.

---

Below is a **UI State Reducer Spec (Pure Function Inputs/Outputs)** for the Run Detail page that renders the **timeline + step statuses + failure cards + evidence hydration** using TinyFailureEvents (and future-compatible with pass/warn/info).

This is written so devs can implement it as a deterministic reducer (Redux, Zustand w/ reducer, Elm-style update, etc.).

---

# UI State Reducer Spec v1: Run Timeline + Failure Cards

## Reducer contract

### Pure function

```ts
reduceRunView(state: RunViewState, action: Action): RunViewState
```

### Guarantees

* **Pure & deterministic**: no IO, no timers, no random IDs, no Date.now() inside reducer.
* **Idempotent**: applying the same `RUN_EVENT_RECEIVED` a second time leaves the state unchanged.
* **Order-safe**: out-of-order events never “downgrade” a step attempt from `fail` → `pass`.

---

# 1) Data types

## 1.1 Event type used by reducer

```ts
type StageName =
  | 'fetch' | 'build' | 'scan' | 'policy'
  | 'sign' | 'package' | 'deploy' | 'runtime';

type StepStatus =
  // present now (MVP)
  | 'fail'
  // future-compatible
  | 'warn' | 'pass' | 'running' | 'queued' | 'info' | 'unknown';

type PointerType = 'log' | 'artifact' | 'attestation' | 'url' | 'trace';

type Pointer = {
  type: PointerType;
  ref: string;
  mime?: string;
  label?: string;
  expires_at?: string; // RFC3339
  sha256?: string;
};

type TinyEventV1 = {
  v: 1;
  event_id: string;
  ts: string;          // RFC3339 UTC
  run_id: string;
  stage: StageName;
  step: string;
  attempt: number;
  status: StepStatus;  // MVP sends 'fail' only
  error_class: string;
  summary: string;
  pointers?: Pointer[];
  kv?: Record<string, string>;
};

// Normalized for sorting and comparisons (created outside or inside reducer deterministically)
type NormalizedEvent = TinyEventV1 & {
  tsMs: number; // parse(ts) -> number, invalid => 0
};
```

---

## 1.2 Keys and comparisons

```ts
type TupleKey = string;       // `${stage}|${step}|${attempt}|${status}`
type StepAttemptKey = string; // `${stage}|${step}|${attempt}`
type StepIdentityKey = string;// `${stage}|${step}` (no attempt)
type PointerKey = string;     // `${type}|${ref}`

function tupleKey(e: TinyEventV1): TupleKey {
  return `${e.stage}|${e.step}|${e.attempt}|${e.status}`;
}
function stepAttemptKey(e: TinyEventV1): StepAttemptKey {
  return `${e.stage}|${e.step}|${e.attempt}`;
}
function stepIdentityKey(e: TinyEventV1): StepIdentityKey {
  return `${e.stage}|${e.step}`;
}
function pointerKey(p: Pointer): PointerKey {
  return `${p.type}|${p.ref}`;
}

// Sort: ts ascending, then event_id lexicographically (stable deterministic tiebreak)
function compareEvent(a: NormalizedEvent, b: NormalizedEvent): number {
  if (a.tsMs !== b.tsMs) return a.tsMs - b.tsMs;
  return a.event_id < b.event_id ? -1 : (a.event_id > b.event_id ? 1 : 0);
}
```

---

## 1.3 Status ranking rule (terminal safety)

We need a single numeric ranking so we can:

* prevent regressions (`fail` must remain terminal), and
* compute rollups.

```ts
const STATUS_RANK: Record<StepStatus, number> = {
  unknown: 0,
  queued:  1,
  running: 2,
  info:    3,
  pass:    4,
  warn:    5,
  fail:    6,
};

function isTerminal(status: StepStatus): boolean {
  return status === 'fail' || status === 'warn' || status === 'pass';
}
```

**Invariant:** A step attempt’s displayed status must never decrease in rank.

---

# 2) State shape

This state is for a single Run Detail page (one `runId` at a time). If you store multiple runs in a global store, wrap this in a `Record<runId, RunViewState>`.

```ts
type RealtimeStatus = 'idle' | 'connecting' | 'connected' | 'disconnected' | 'error';
type LoadStatus = 'idle' | 'loading' | 'loaded' | 'error';

type EvidenceResolveStatus =
  | 'unresolved'  // pointer exists but no resolver call made yet
  | 'loading'     // resolver call in-flight
  | 'available' | 'pending' | 'missing' | 'denied' | 'expired' | 'error';

type EvidenceResolution = {
  status: EvidenceResolveStatus;
  kind?: 'inline' | 'link';
  title?: string;
  mime?: string;
  size_bytes?: number;
  inline_preview?: string; // small preview
  link?: string;           // short-lived link
  error_message?: string;
};

type EvidenceState = {
  pointer: Pointer;          // latest metadata merged from events
  status: EvidenceResolveStatus;
  lastResolvedAtMs?: number; // from action payload (not Date.now)
  // for stale response protection
  seq: number;               // increments each request
  inFlightSeq?: number;      // seq currently in-flight
  resolution?: EvidenceResolution;
};

type PointerAggregate = {
  pointerKey: PointerKey;
  pointer: Pointer; // merged metadata
};

type TupleAggregate = {
  tupleKey: TupleKey;

  // all events contributing to this tuple (same stage/step/attempt/status)
  eventIdsSorted: string[];      // sorted by (tsMs, event_id)
  canonicalEventId: string;      // min by (tsMs, event_id)

  // merged view computed deterministically from eventIdsSorted
  merged: {
    summary: string;             // from canonical event
    error_class: string;         // from canonical event
    kv: Record<string, string>;  // merged by sorted order (later overwrites)
    pointers: PointerAggregate[];// dedup by pointerKey, merged by sorted order
    updatedAtMs: number;         // max tsMs among contributing events
  };
};

type StepAttemptState = {
  key: StepAttemptKey;
  stage: StageName;
  step: string;
  attempt: number;

  // all tuple aggregates for this attempt (one per status)
  tuplesByStatus: Partial<Record<StepStatus, TupleKey>>;

  // derived “best” status for this attempt
  bestStatus: StepStatus;
  bestStatusRank: number;
  updatedAtMs: number; // max of all tupleAgg.updatedAtMs for this attempt
};

type StageRollup = {
  stage: StageName;
  // worst status among latest attempts of steps in this stage
  rollupStatus: StepStatus;
  rollupRank: number;
};

type RunViewState = {
  runId: string | null;

  loading: { initialEvents: LoadStatus; error?: string };
  realtime: { status: RealtimeStatus; error?: string };

  // storage
  eventsById: Record<string, NormalizedEvent>;
  timelineEventIds: string[];  // global timeline sorted by (tsMs, event_id)

  tupleAggByKey: Record<TupleKey, TupleAggregate>;
  stepAttemptByKey: Record<StepAttemptKey, StepAttemptState>;
  latestAttemptByStep: Record<StepIdentityKey, number>; // max attempt observed

  stageRollups: Record<StageName, StageRollup>;

  evidenceByPointer: Record<PointerKey, EvidenceState>;
};
```

---

# 3) Actions (inputs to reducer)

```ts
type Action =
  | { type: 'RUN_VIEW_OPENED'; runId: string }
  | { type: 'RUN_EVENTS_LOAD_STARTED'; runId: string }
  | { type: 'RUN_EVENTS_LOADED'; runId: string; events: TinyEventV1[] }
  | { type: 'RUN_EVENTS_LOAD_FAILED'; runId: string; error: string }

  | { type: 'REALTIME_STATUS_CHANGED'; runId: string; status: RealtimeStatus; error?: string }
  | { type: 'RUN_EVENT_RECEIVED'; event: TinyEventV1 }

  // Evidence hydration lifecycle (pure reducer; side-effects happen elsewhere)
  | { type: 'EVIDENCE_RESOLVE_REQUESTED'; runId: string; pointerKey: PointerKey }
  | { type: 'EVIDENCE_RESOLVE_RESULT'; runId: string; pointerKey: PointerKey; seq: number; resolvedAtMs: number; resolution: EvidenceResolution }
  | { type: 'EVIDENCE_RESOLVE_CLEARED'; runId: string; pointerKey: PointerKey };
```

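**Reducer must ignore** any action where `action.runId !== state.runId` (except `RUN_VIEW_OPENED`, which sets it). `RUN_EVENT_RECEIVED` carries no `runId` field, so it is matched on `event.run_id` instead (see 4.4).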

---

# 4) Reducer semantics (outputs)

## 4.1 RUN_VIEW_OPENED

**Input:** `{ runId }`
**Output:** resets all run-specific state.

Rules:

* Set `state.runId = runId`
* Clear events, aggregates, evidence, timeline.
* Set `loading.initialEvents = 'loading'`
* Set `realtime.status = 'connecting'` (optional)

---

## 4.2 RUN_EVENTS_LOAD_STARTED / LOADED / FAILED

### RUN_EVENTS_LOAD_STARTED

* If runId matches, set `loading.initialEvents = 'loading'`.

### RUN_EVENTS_LOADED

* If runId matches:

  * For each event in `events`: apply the exact same logic as `RUN_EVENT_RECEIVED`.
  * Then set `loading.initialEvents = 'loaded'`.

### RUN_EVENTS_LOAD_FAILED

* If runId matches: `loading.initialEvents = 'error'`, store error string.

---

## 4.3 REALTIME_STATUS_CHANGED

* Update `realtime.status` and `realtime.error` if runId matches.

---

## 4.4 RUN_EVENT_RECEIVED (core ingestion)

### Preconditions

If `state.runId` is null, ignore (or treat as no-op).
If `event.run_id !== state.runId`, ignore.

### Step A — normalize + dedupe

* Convert to `NormalizedEvent`:

  * `tsMs = parseRFC3339ToMs(event.ts)`; if parse fails, `tsMs = 0`.
  * Default `pointers = []`, `kv = {}` if missing.
* If `eventsById[event_id]` exists: **no-op**.

### Step B — insert into global stores

* Add to `eventsById[event_id]`.
* Insert `event_id` into `timelineEventIds` keeping sorted order by `(tsMs, event_id)`.
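
Steps A and B can be sketched as one pure function. The version below uses reduced stand-in types (`RawEvent`, `EventStore`) and a hypothetical `insertEvent` helper, and finds the insertion point with a binary search. It returns the same store object for a duplicate, which makes the no-op easy to detect.

```ts
// Sketch of normalize + dedupe + sorted insert; the types are reduced stand-ins.
type RawEvent = { event_id: string; ts: string };
type StoredEvent = RawEvent & { tsMs: number };
type EventStore = { eventsById: Record<string, StoredEvent>; timelineEventIds: string[] };

function normalizeEvent(e: RawEvent): StoredEvent {
  const parsed = Date.parse(e.ts);
  return { ...e, tsMs: isNaN(parsed) ? 0 : parsed }; // unparseable ts sorts first
}

function compareStored(a: StoredEvent, b: StoredEvent): number {
  if (a.tsMs !== b.tsMs) return a.tsMs - b.tsMs;
  return a.event_id < b.event_id ? -1 : a.event_id > b.event_id ? 1 : 0;
}

// Pure: returns the same object for a duplicate, a new object otherwise.
function insertEvent(store: EventStore, raw: RawEvent): EventStore {
  if (Object.prototype.hasOwnProperty.call(store.eventsById, raw.event_id)) return store;
  const ev = normalizeEvent(raw);
  const ids = store.timelineEventIds;

  // Binary search for the first position whose event sorts after `ev`.
  let lo = 0;
  let hi = ids.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (compareStored(store.eventsById[ids[mid]], ev) <= 0) lo = mid + 1;
    else hi = mid;
  }
  return {
    eventsById: { ...store.eventsById, [ev.event_id]: ev },
    timelineEventIds: [...ids.slice(0, lo), ev.event_id, ...ids.slice(lo)],
  };
}
```

Because events mostly arrive in timestamp order, the search usually ends at the tail of the list; the binary search mainly helps when a large backlog is loaded out of order.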

### Step C — ensure evidence entries exist for pointers

For each pointer `p`:

* `pk = pointerKey(p)`
* If `evidenceByPointer[pk]` is missing:

  * create `{ pointer: p, status: 'unresolved', seq: 0 }`
* Else merge pointer metadata into `evidenceByPointer[pk].pointer` using pointer-merge rules (below).
  (Do **not** overwrite existing resolver resolution fields.)

### Step D — update tuple aggregate (merge/enrichment)

Let `tk = tupleKey(event)`.

* If `tupleAggByKey[tk]` missing, create new `TupleAggregate` with:

  * `eventIdsSorted = [event_id]`
  * `canonicalEventId = event_id`
  * `merged` from this event

* Else:

  * Insert `event_id` into `eventIdsSorted` in sorted order (using `compareEvent` via `eventsById`).
  * Recompute:

    * `canonicalEventId = min(eventIdsSorted)` by compareEvent
    * `merged` deterministically from all contributing events (see merge rules)

### Tuple merge rules (deterministic)

Given contributing events `E` sorted by `(tsMs, event_id)` ascending:

* `canonical = E[0]`
* `merged.summary = canonical.summary`
* `merged.error_class = canonical.error_class`
* `merged.kv`:

  * start empty `{}`
  * for each event `e` in order, for each `(k,v)` in `e.kv`: `merged.kv[k] = v`
    (later events overwrite earlier keys)
* `merged.pointers`:

  * maintain `map: Record<PointerKey, Pointer>`
  * for each event `e` in order, for each pointer `p`:

    * `pk = pointerKey(p)`
    * if not present: set map[pk] = p
    * else: map[pk] = mergePointerMeta(map[pk], p) (see below)
  * output pointers as an array sorted by `PointerKey` lexicographically (for stable UI lists)
* `merged.updatedAtMs = max(e.tsMs)`
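
A sketch of these merge rules as a pure function is given below, using reduced stand-in types (`Ptr`, `Contribution`) and a hypothetical `mergeTuple` name. It includes a small local pointer-metadata helper in which the later defined value wins. The function sorts its own copy of the input, so callers may pass events in arrival order.

```ts
// Sketch of the deterministic tuple merge; types are reduced stand-ins.
type Ptr = { type: string; ref: string; mime?: string; label?: string; expires_at?: string; sha256?: string };
type Contribution = {
  event_id: string; tsMs: number; summary: string; error_class: string;
  kv: Record<string, string>; pointers: Ptr[];
};
type MergedTuple = {
  summary: string; error_class: string; kv: Record<string, string>;
  pointers: Ptr[]; updatedAtMs: number;
};

// Later defined value wins; type/ref are assumed to match already.
function mergePtr(oldP: Ptr, newP: Ptr): Ptr {
  return {
    type: oldP.type, ref: oldP.ref,
    mime: newP.mime ?? oldP.mime,
    label: newP.label ?? oldP.label,
    expires_at: newP.expires_at ?? oldP.expires_at,
    sha256: newP.sha256 ?? oldP.sha256,
  };
}

function mergeTuple(events: Contribution[]): MergedTuple {
  if (events.length === 0) throw new Error('a tuple aggregate needs at least one event');
  const sorted = events.slice().sort((a, b) =>
    a.tsMs !== b.tsMs ? a.tsMs - b.tsMs : a.event_id < b.event_id ? -1 : a.event_id > b.event_id ? 1 : 0);
  const canonical = sorted[0];

  const kv: Record<string, string> = {};
  const byKey: Record<string, Ptr> = {};
  let updatedAtMs = 0;
  for (const e of sorted) {
    for (const k of Object.keys(e.kv)) kv[k] = e.kv[k]; // later events overwrite
    for (const p of e.pointers) {
      const pk = p.type + '|' + p.ref;
      byKey[pk] = Object.prototype.hasOwnProperty.call(byKey, pk) ? mergePtr(byKey[pk], p) : p;
    }
    updatedAtMs = Math.max(updatedAtMs, e.tsMs);
  }
  // Sort by pointer key so the UI list order is stable.
  const pointers = Object.keys(byKey).sort().map((pk) => byKey[pk]);
  return { summary: canonical.summary, error_class: canonical.error_class, kv, pointers, updatedAtMs };
}
```

Recomputing the merged view from all contributing events, instead of patching it incrementally, is what keeps the result independent of arrival order.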

### Pointer metadata merge rule (non-null wins, later wins)

```ts
function mergePointerMeta(oldP: Pointer, newP: Pointer): Pointer {
  // type/ref must match
  return {
    type: oldP.type,
    ref: oldP.ref,
    // later defined value wins (?? skips only null/undefined, so an empty string still overwrites)
    mime:       newP.mime       ?? oldP.mime,
    label:      newP.label      ?? oldP.label,
    expires_at: newP.expires_at ?? oldP.expires_at,
    sha256:     newP.sha256     ?? oldP.sha256,
  };
}
```

---

## 4.5 Update StepAttemptState (best status + no regression)

After tuple aggregate update, update the parent step attempt:

* Let `sak = stepAttemptKey(event)` and `sid = stepIdentityKey(event)`.

### latest attempt tracking

* `latestAttemptByStep[sid] = max(previous, event.attempt)`

### StepAttemptState update

* If missing, create:

  * `bestStatus = 'unknown'`, `bestStatusRank = 0`, `tuplesByStatus = {}`
* Set `tuplesByStatus[event.status] = tk`

### Recompute best status (never decreases)

Compute candidate best by checking all statuses present for this attempt:

```ts
candidateBest = argmax(status in tuplesByStatus) STATUS_RANK[status]
```

Then apply **no-regression rule**:

* If `STATUS_RANK[candidateBest] >= step.bestStatusRank`:

  * update `bestStatus`, `bestStatusRank`
* Else:

  * keep existing `bestStatus` (prevents fail → pass regressions)

Set `updatedAtMs = max(updatedAtMs, tupleAgg.merged.updatedAtMs)`.

**Important:** This rule guarantees “late pass/info” cannot override a prior fail.
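
A minimal sketch of the update described in this section, under stated assumptions: the `applyStatus` helper name and the `STATUS_RANK` values for `info` and `warn` are made up for illustration (only `unknown = 0`, `pass = 4`, and `fail = 6` appear in this part of the document).

```ts
type StepStatus = "unknown" | "info" | "pass" | "warn" | "fail";

// Assumed ranks: info and warn are placeholders chosen for this example.
const STATUS_RANK: Record<StepStatus, number> = {
  unknown: 0, info: 3, pass: 4, warn: 5, fail: 6,
};

type StepAttemptState = {
  bestStatus: StepStatus;
  bestStatusRank: number;
  tuplesByStatus: Partial<Record<StepStatus, string>>;
  updatedAtMs: number;
};

// Hypothetical helper: returns the next state without mutating the previous one.
function applyStatus(
  prev: StepAttemptState | undefined,
  status: StepStatus,
  tk: string,
  tupleUpdatedAtMs: number
): StepAttemptState {
  const base: StepAttemptState = prev ?? {
    bestStatus: "unknown", bestStatusRank: 0, tuplesByStatus: {}, updatedAtMs: 0,
  };
  const tuplesByStatus: Partial<Record<StepStatus, string>> = { ...base.tuplesByStatus };
  tuplesByStatus[status] = tk;

  // Candidate = highest-ranked status recorded for this attempt.
  let candidateBest: StepStatus = "unknown";
  for (const s of Object.keys(tuplesByStatus) as StepStatus[]) {
    if (STATUS_RANK[s] > STATUS_RANK[candidateBest]) candidateBest = s;
  }

  // No-regression rule: a lower-ranked candidate never replaces the stored best.
  const keepPrev = STATUS_RANK[candidateBest] < base.bestStatusRank;
  return {
    bestStatus: keepPrev ? base.bestStatus : candidateBest,
    bestStatusRank: keepPrev ? base.bestStatusRank : STATUS_RANK[candidateBest],
    tuplesByStatus,
    updatedAtMs: Math.max(base.updatedAtMs, tupleUpdatedAtMs),
  };
}
```

Applying a `fail` and then a `pass` to the same attempt leaves `bestStatus` at `fail` while still recording the `pass` tuple in `tuplesByStatus`.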

---

## 4.6 Stage rollups (optional but recommended)

Whenever any `StepAttemptState` changes, update `stageRollups[stage]` deterministically:

For each stage:

* Consider only the **latest attempt per step identity** in that stage:

  * For each `StepIdentityKey = stage|step`, find `attempt = latestAttemptByStep[stage|step]`
  * Look up `StepAttemptState` for that attempt.
* Roll up stage status as the **worst status** among those, i.e. the one with the highest rank:

  * `rollupRank = max(step.bestStatusRank)`
  * `rollupStatus = status with that rank`

If a stage has no steps yet, set `rollupStatus='unknown'`.
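
As a rough sketch, the rollup for one stage could look like this. The `rollupStage` name and the trimmed-down `StepAttemptLite` shape are assumptions for the example; the key formats `stage|step` and `stage|step|attempt` follow the ones used elsewhere in this spec.

```ts
type StepStatus = "unknown" | "info" | "pass" | "warn" | "fail";
type StepAttemptLite = { bestStatus: StepStatus; bestStatusRank: number };
type StageRollup = { rollupStatus: StepStatus; rollupRank: number };

// Hypothetical helper: rolls up one stage from the latest attempt of each step.
function rollupStage(
  stage: string,
  latestAttemptByStep: Record<string, number>,
  stepAttemptByKey: Record<string, StepAttemptLite | undefined>
): StageRollup {
  // A stage with no steps yet stays 'unknown'.
  let rollup: StageRollup = { rollupStatus: "unknown", rollupRank: 0 };
  const prefix = `${stage}|`;
  // Sorted iteration keeps the result independent of insertion order.
  for (const sid of Object.keys(latestAttemptByStep).sort()) {
    if (sid.indexOf(prefix) !== 0) continue; // step belongs to another stage
    const sa = stepAttemptByKey[`${sid}|${latestAttemptByStep[sid]}`];
    if (sa && sa.bestStatusRank > rollup.rollupRank) {
      rollup = { rollupStatus: sa.bestStatus, rollupRank: sa.bestStatusRank };
    }
  }
  return rollup;
}
```

Only the latest attempt counts, so a step that failed on attempt 1 and passed on attempt 2 contributes `pass` to its stage.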

---

# 5) Evidence hydration reducer rules

Evidence actions update `evidenceByPointer` only; they must not mutate events/aggregates.

## 5.1 EVIDENCE_RESOLVE_REQUESTED

**Input:** `{ pointerKey }`

Rules:

* If no evidence entry exists: create one with status `unresolved` and `seq=0` (should be rare).
* Increment `seq = seq + 1`
* Set `inFlightSeq = seq`
* Set `status = 'loading'`
* Keep `resolution`. Clearing it is an option if you want the UI to hide stale info, but keeping it and showing “Refreshing…” is recommended.

**Middleware/effect contract (outside reducer):**

* After dispatching `EVIDENCE_RESOLVE_REQUESTED`, the effect layer reads `inFlightSeq` from state and uses it in the API call.
* When the response returns, dispatch `EVIDENCE_RESOLVE_RESULT` with that same `seq`.

## 5.2 EVIDENCE_RESOLVE_RESULT

**Input:** `{ pointerKey, seq, resolvedAtMs, resolution }`

Rules:

* If `evidenceByPointer[pointerKey]` missing: ignore or create (implementation choice).
* If `evidence.inFlightSeq !== seq`: **ignore stale response**.
* Else:

  * `status = resolution.status`
  * `resolution = resolution`
  * `lastResolvedAtMs = resolvedAtMs`
  * `inFlightSeq = undefined`

## 5.3 EVIDENCE_RESOLVE_CLEARED

* Reset entry back to `{ status:'unresolved', resolution: undefined, inFlightSeq: undefined }`
* Keep `pointer` metadata.
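
The three evidence actions can be sketched as one small reducer. This is an illustration under assumptions: the `EvidenceEntry` and `Resolution` shapes, the `reduceEvidence` name, and the choice to ignore a result for a missing entry are all picked for the example.

```ts
// Hypothetical shapes; the spec's real types may carry more fields (e.g. `pointer`).
type Resolution = { status: string; url?: string };
type EvidenceEntry = {
  status: string;
  seq: number;
  inFlightSeq?: number;
  resolution?: Resolution;
  lastResolvedAtMs?: number;
};
type EvidenceAction =
  | { type: "EVIDENCE_RESOLVE_REQUESTED"; pointerKey: string }
  | { type: "EVIDENCE_RESOLVE_RESULT"; pointerKey: string; seq: number;
      resolvedAtMs: number; resolution: Resolution }
  | { type: "EVIDENCE_RESOLVE_CLEARED"; pointerKey: string };

type EvidenceMap = Record<string, EvidenceEntry>;

function reduceEvidence(state: EvidenceMap, action: EvidenceAction): EvidenceMap {
  const prev: EvidenceEntry | undefined = state[action.pointerKey];
  switch (action.type) {
    case "EVIDENCE_RESOLVE_REQUESTED": {
      const base: EvidenceEntry = prev ?? { status: "unresolved", seq: 0 };
      const seq = base.seq + 1;
      // Existing resolution is kept so the UI can show "Refreshing…".
      return { ...state, [action.pointerKey]: { ...base, seq, inFlightSeq: seq, status: "loading" } };
    }
    case "EVIDENCE_RESOLVE_RESULT": {
      // Missing entry (this sketch ignores it) or stale response: no change.
      if (!prev || prev.inFlightSeq !== action.seq) return state;
      return {
        ...state,
        [action.pointerKey]: {
          ...prev,
          status: action.resolution.status,
          resolution: action.resolution,
          lastResolvedAtMs: action.resolvedAtMs,
          inFlightSeq: undefined,
        },
      };
    }
    case "EVIDENCE_RESOLVE_CLEARED": {
      if (!prev) return state;
      // Spreading `prev` keeps any other metadata stored on the entry.
      return {
        ...state,
        [action.pointerKey]: { ...prev, status: "unresolved", resolution: undefined, inFlightSeq: undefined },
      };
    }
  }
}
```

If two requests are issued back to back, the response carrying the first `seq` no longer matches `inFlightSeq` and is dropped; only the response for the second request is applied.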

---

# 6) Selectors (pure outputs for rendering)

These are not reducer logic, but they define how UI consumes state deterministically.

## 6.1 Timeline view model

```ts
function selectTimeline(state): NormalizedEvent[] {
  return state.timelineEventIds.map(id => state.eventsById[id]);
}
```

## 6.2 Latest attempt cards per step identity

```ts
type StepCardVM = {
  stage: StageName;
  step: string;
  attempt: number;
  status: StepStatus;
  error_class?: string;
  summary?: string;
  kv: Record<string,string>;
  pointers: PointerAggregate[];
  updatedAtMs: number;
};

function selectLatestStepCards(state): StepCardVM[] {
  const cards: StepCardVM[] = [];
  for (const sid in state.latestAttemptByStep) {
    const attempt = state.latestAttemptByStep[sid];
    const [stage, step] = sid.split('|') as [StageName, string];
    const sak = `${stage}|${step}|${attempt}`;

    const sa = state.stepAttemptByKey[sak];
    if (!sa) continue;

    // Prefer fail tuple for details if present
    const failTk = sa.tuplesByStatus['fail'];
    const bestTk = sa.tuplesByStatus[sa.bestStatus];
    const tk = failTk ?? bestTk;
    const agg = tk ? state.tupleAggByKey[tk] : undefined;

    cards.push({
      stage, step, attempt,
      status: sa.bestStatus,
      error_class: agg?.merged.error_class,
      summary: agg?.merged.summary,
      kv: agg?.merged.kv ?? {},
      pointers: agg?.merged.pointers ?? [],
      updatedAtMs: sa.updatedAtMs,
    });
  }
  // stable ordering: by stage order, then step name
  return cards.sort((a,b) =>
    (STAGE_ORDER.indexOf(a.stage) - STAGE_ORDER.indexOf(b.stage)) ||
    a.step.localeCompare(b.step)
  );
}
```

## 6.3 Failure banner (first failure by time)

```ts
function selectFirstFailure(state): StepCardVM | null {
  const cards = selectLatestStepCards(state).filter(c => c.status === 'fail');
  if (cards.length === 0) return null;
  // updatedAtMs is the latest update time, so later enrichment can move a card back in this order
  return cards.sort((a,b) => a.updatedAtMs - b.updatedAtMs)[0];
}
```

---

# 7) Worked examples (expected reducer behavior)

## Example A: fail event arrives, then enrichment adds pointers

1. Receive fail event (no pointers)

* Step card shows `fail`, summary, error_class, evidence list empty.

2. Receive second event same tupleKey with pointers

* Same step card remains `fail` (no regression)
* Evidence section now lists pointers (status `unresolved` until resolved).

## Example B: out-of-order enrichment arrives before initial fail

* Enrichment event arrives first (later tsMs) → creates tupleAgg; that event is the canonical one for now.
* Later initial fail arrives with earlier tsMs:

  * canonical becomes the earlier event (smaller tsMs)
  * **pointers remain**, because merged pointers are union across all contributing events.

## Example C: duplicate delivery

* Same `event_id` received twice → second is ignored (idempotent).

## Example D: late pass after fail (future-proof)

* If a `pass` event arrives after a `fail` for the same step attempt:

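  * `bestStatusRank` is already `fail` (6)
  * the incoming status is `pass` (4), but `tuplesByStatus` still holds the `fail` tuple, so `candidateBest` stays `fail`
  * `bestStatus` remains `fail`; the no-regression rule acts as a second guard if a candidate ever ranks lower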

---

# 8) Implementation notes (non-binding but useful)

* Event counts per run are usually small; a simple array insert + sort is fine.
* If you expect thousands of events, use binary-search insertion for `timelineEventIds` and `eventIdsSorted`.
* Keep all “current time” reads out of the reducer. Any timestamps used in actions (e.g., `resolvedAtMs`) must be created outside.
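
For the binary-insertion note, one possible helper is sketched below. The `insertSorted` name and the `EventLite` shape are assumptions for illustration; the comparator mirrors the `(tsMs, event_id)` ordering used for `compareEvent`.

```ts
// Hypothetical helper: returns a new array with `item` inserted at its sorted position.
function insertSorted<T>(sorted: T[], item: T, compare: (a: T, b: T) => number): T[] {
  let lo = 0;
  let hi = sorted.length;
  // Binary search for the first index whose element sorts strictly after `item`.
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compare(sorted[mid], item) <= 0) lo = mid + 1;
    else hi = mid;
  }
  // Copy instead of splice so reducer state is never mutated in place.
  return sorted.slice(0, lo).concat([item], sorted.slice(lo));
}

// Assumed minimal event shape for the comparator below.
type EventLite = { event_id: string; tsMs: number };

// Builds an ID comparator that looks events up in `eventsById`.
function compareEventIds(eventsById: Record<string, EventLite>) {
  return (a: string, b: string): number => {
    const ea = eventsById[a];
    const eb = eventsById[b];
    return ea.tsMs - eb.tsMs || (ea.event_id < eb.event_id ? -1 : ea.event_id > eb.event_id ? 1 : 0);
  };
}
```

The search is O(log n), but the copy is still O(n) per insert; that trade-off keeps the reducer pure, and duplicates should already have been filtered out by `event_id` before this helper is called.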

---

If you'd like, next I can provide:

* a drop-in **TypeScript implementation** of `reduceRunView` with helper functions, and
* a set of **unit test vectors** (Given actions → expect final state) covering all edge cases above.