# Unified Triage Experience Specification

**Version:** 1.0
**Status:** Active
**Last Updated:** 2025-12-26
**Consolidated From:** 3 product advisories (see References)

## 1. Executive Summary

### The Problem
Modern container security generates overwhelming vulnerability data. Competitors offer fragmented solutions: Snyk provides reachability analysis, Anchore offers VEX annotations, Prisma Cloud delivers runtime signals. Security teams must context-switch between tools, losing precious time and context.

### The Stella Ops Solution
A **unified triage canvas** that combines:
- Rich evidence visualization with proof-carrying verdicts
- VEX decisioning as first-class policy objects
- AI-assisted analysis via AdvisoryAI
- Attestable exceptions with audit trails
- Offline-first architecture for air-gapped parity

## 2. Competitive Landscape

### Snyk — Reachability + Continuous Context
- Implements reachability analysis by building call graphs
- Factors reachability into priority scores
- Uses static analysis + AI + expert curation
- Tracks issues over time without re-scanning unchanged images

### Anchore — Vulnerability Annotations + VEX Export
- Vulnerability annotation workflows via UI/API
- Labels: "not applicable", "mitigated", "under investigation"
- Export as OpenVEX and CycloneDX VEX
- Downstream consumers receive curated exploitability state

### Prisma Cloud — Runtime Defense
- Continuous behavioral profiling
- Process, file, and network rule enforcement
- Learning models baseline expected behavior
- Runtime context during operational incidents

### Stella Ops Differentiation
| Feature | Snyk | Anchore | Prisma | Stella Ops |
|---------|------|---------|--------|------------|
| Reachability analysis | Yes | Partial | No | Yes (static + binary + runtime) |
| VEX as policy objects | No | Export only | No | **First-class** |
| Attestable exceptions | No | No | No | **Yes (DSSE)** |
| Offline replay | No | No | No | **Yes** |
| AI-assisted triage | Yes | No | No | Yes (AdvisoryAI) |
| Evidence graphs | Partial | No | Partial | **Full chain** |

## 3. Core UI Concepts

### 3.1 Visual Diff Pattern
Every policy decision or reachability change is treated as a **visual diff**, enabling quick, explainable triage.

#### Side-by-Side Panes
- **Before** (previous scan/policy) vs **After** (current)
- Show dependency/reachability subgraph
- Highlight added/removed/changed nodes/edges

#### Evidence Strip (Right Rail)
Human-readable facts used by the engine:
- Feature flag status (e.g., "feature flag OFF")
- Code path analysis (e.g., "code path unreachable")
- Runtime traces (e.g., "kernel eBPF trace absent")

#### Diff Verdict Header
```
Risk ↓ from Medium → Low (policy v1.8 → v1.9)
```
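
The header line can be derived from the before and after risk levels plus the two policy versions. The following is a minimal sketch, not the shipped implementation: the `RISK_LEVELS` ordering, the `High` and `Critical` levels, the "unchanged" wording, and the function name `formatVerdictHeader` are all assumptions made for illustration.

```typescript
// Hypothetical risk levels, ordered from lowest to highest.
// Only "Medium" and "Low" appear in the specification; the rest are assumed.
const RISK_LEVELS = ['Low', 'Medium', 'High', 'Critical'] as const;
type RiskLevel = (typeof RISK_LEVELS)[number];

// Builds the diff verdict header for a risk change between two policy versions.
function formatVerdictHeader(
  before: RiskLevel,
  after: RiskLevel,
  policyFrom: string,
  policyTo: string,
): string {
  const delta = RISK_LEVELS.indexOf(after) - RISK_LEVELS.indexOf(before);
  const policy = `(policy ${policyFrom} → ${policyTo})`;
  if (delta === 0) {
    // Assumed wording for the no-change case.
    return `Risk unchanged at ${after} ${policy}`;
  }
  const arrow = delta < 0 ? '↓' : '↑';
  return `Risk ${arrow} from ${before} → ${after} ${policy}`;
}
```

Calling `formatVerdictHeader('Medium', 'Low', 'v1.8', 'v1.9')` reproduces the example header shown above.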

#### Filter Chips
Scope by: component, package, CVE, policy rule, environment

### 3.2 Data Models

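```typescript
// GraphNode, GraphEdge, DeltaItem, RuleOutcomeDelta, and Signature are
// referenced below but not defined in this section.

// Dependency/reachability subgraph for one side of a diff (Before or After).
interface GraphSnapshot {
  nodes: GraphNode[];
  edges: GraphEdge[];
  metadata: { component: string; version: string; tags: string[] };
}

// Policy version and the inputs in effect for one side of a diff.
interface PolicySnapshot {
  version: string;
  rulesHash: string;
  inputs: { flags: Record<string, boolean>; env: string; vexSources: string[] };
}

// Differences between the Before and After snapshots.
interface Delta {
  added: DeltaItem[];
  removed: DeltaItem[];
  changed: DeltaItem[];
  ruleOutcomes: RuleOutcomeDelta[];
}

// A single human-readable fact shown in the evidence strip.
interface EvidenceItem {
  type: 'trace_hit' | 'sbom_line' | 'vex_claim' | 'config_value';
  source: string;
  digest: string;
  excerpt: string;
  timestamp: string;
}

// Signed verdict for a delta.
interface SignedDeltaVerdict {
  status: 'routine' | 'review' | 'block';
  signatures: Signature[];
  producer: string;
}
```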
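
To illustrate how the `added`, `removed`, and `changed` buckets of a `Delta` might be populated, here is a minimal sketch that diffs two node lists by id. The `SketchNode` and `SketchDelta` shapes, the `reachable` flag, and the `diffNodes` function are hypothetical stand-ins chosen for the example; the real `GraphNode` and `DeltaItem` types may carry different fields.

```typescript
// Hypothetical minimal node shape used only for this sketch.
interface SketchNode {
  id: string;
  reachable: boolean;
}

// Hypothetical simplified delta: buckets hold node ids rather than DeltaItem objects.
interface SketchDelta {
  added: string[];
  removed: string[];
  changed: string[];
}

// Compares two node lists by id and reports added, removed, and changed nodes.
// A node counts as "changed" when its reachability flag differs between snapshots.
function diffNodes(before: SketchNode[], after: SketchNode[]): SketchDelta {
  const beforeById = new Map<string, SketchNode>();
  for (const node of before) {
    beforeById.set(node.id, node);
  }
  const afterIds = new Set<string>(after.map((node) => node.id));
  const delta: SketchDelta = { added: [], removed: [], changed: [] };

  for (const node of after) {
    const previous = beforeById.get(node.id);
    if (previous === undefined) {
      delta.added.push(node.id);
    } else if (previous.reachable !== node.reachable) {
      delta.changed.push(node.id);
    }
  }
  for (const node of before) {
    if (!afterIds.has(node.id)) {
      delta.removed.push(node.id);
    }
  }
  return delta;
}
```

Keying the comparison on a stable node id keeps the diff independent of node ordering, which matters when the two snapshots come from separate scans.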

### 3.3 Micro-Interactions

| Interaction | Behavior |
|-------------|----------|
| Hover changed node | Inline badge explaining "why it changed" |
| Click rule in rail | Spotlight the exact subgraph affected |
| Toggle "explain like I'm new" | Expands jargon into plain language |
| One-click "copy audit bundle" | Exports delta + evidence as attachment |

### 3.4 Keyboard Shortcuts

| Key | Action |
|-----|--------|
| `1` | Focus changes only |
| `2` | Show full graph |
| `E` | Expand evidence |
| `A` | Export audit bundle |
| `N` | Next item in queue |
| `P` | Previous item |
| `M` | Mark not affected |
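
The shortcut table maps naturally onto a lookup from key to action. The sketch below is one possible shape, assuming case-insensitive matching and assuming shortcuts are suppressed while the user types in a text field; the action names and `resolveShortcut` are hypothetical.

```typescript
// Hypothetical action identifiers, one per row of the shortcut table.
type TriageAction =
  | 'focusChanges'
  | 'showFullGraph'
  | 'expandEvidence'
  | 'exportAuditBundle'
  | 'nextItem'
  | 'previousItem'
  | 'markNotAffected';

// Keys are stored lowercase; lookups normalize the pressed key first.
const SHORTCUTS = new Map<string, TriageAction>([
  ['1', 'focusChanges'],
  ['2', 'showFullGraph'],
  ['e', 'expandEvidence'],
  ['a', 'exportAuditBundle'],
  ['n', 'nextItem'],
  ['p', 'previousItem'],
  ['m', 'markNotAffected'],
]);

// Resolves a key press to an action. Returns undefined when the key is unbound
// or when the user is typing in a text field (assumed behavior, so that
// pressing "M" in a search box does not mark a finding as not affected).
function resolveShortcut(key: string, typingInField: boolean): TriageAction | undefined {
  if (typingInField) {
    return undefined;
  }
  return SHORTCUTS.get(key.toLowerCase());
}
```

A keydown handler would pass the event's key and a flag derived from the focused element, then dispatch on the returned action.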

## 4. Risk Budget Visualization

### 4.1 Concept
- **Risk budget** = allowable unresolved risk for a release (e.g., 100 "risk points")
- **Burn** = consumption rate as alerts appear, minus "payback" from fixes

### 4.2 Dashboard Components

#### Heatmap of Unknowns
| Component | Vulns | Compliance | Perf | Data | Supply Chain |
|-----------|-------|------------|------|------|--------------|
| Service A | 🟡 12 | 🟢 0 | 🟡 3 | 🔴 8 | 🟡 5 |
| Service B | 🔴 24 | 🟡 2 | 🟢 1 | 🟡 4 | 🟢 0 |

Cell value = unknowns count × severity weight
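
As a worked example of the cell formula, the sketch below sums count × weight over the severities present in one cell. The weights in `SEVERITY_WEIGHT` and the choice to sum across severities are assumptions for illustration; the specification does not fix either.

```typescript
type Severity = 'low' | 'medium' | 'high';

// Hypothetical severity weights; real values would come from policy.
const SEVERITY_WEIGHT: Record<Severity, number> = { low: 1, medium: 2, high: 4 };

interface UnknownsBucket {
  severity: Severity;
  count: number;
}

// Cell value = sum over severities of (unknowns count × severity weight).
function cellValue(buckets: UnknownsBucket[]): number {
  return buckets.reduce(
    (total, bucket) => total + bucket.count * SEVERITY_WEIGHT[bucket.severity],
    0,
  );
}

// Two high-severity unknowns and three low-severity unknowns: 2 × 4 + 3 × 1 = 11.
const exampleCell = cellValue([
  { severity: 'high', count: 2 },
  { severity: 'low', count: 3 },
]);
```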

#### Delta Table (Risk Decay per Release)
| Release | Before | After | Retired | Shifted | Unknowns |
|---------|--------|-------|---------|---------|----------|
| v2.3.1 | 85 | 67 | -22 | +4 | 12 |
| v2.3.0 | 92 | 85 | -15 | +8 | 18 |

#### Exception Ledger
Every accepted risk has: ID, owner, expiry, evidence note, auto-reminder.
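
A ledger entry and its reminder logic could look like the following sketch. The field names, the `reminderDaysBefore` lead time, and the three states are hypothetical; only the five attributes (ID, owner, expiry, evidence note, auto-reminder) come from the specification.

```typescript
interface ExceptionEntry {
  id: string;
  owner: string;
  expiry: string; // ISO 8601 timestamp
  evidenceNote: string;
  reminderDaysBefore: number; // hypothetical lead time for the auto-reminder
}

type ExceptionState = 'active' | 'reminder_due' | 'expired';

// Classifies an accepted risk relative to the given point in time.
function exceptionState(entry: ExceptionEntry, now: Date): ExceptionState {
  const msPerDay = 24 * 60 * 60 * 1000;
  const daysLeft = (Date.parse(entry.expiry) - now.getTime()) / msPerDay;
  if (daysLeft <= 0) {
    return 'expired';
  }
  return daysLeft <= entry.reminderDaysBefore ? 'reminder_due' : 'active';
}
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic for tests and for offline replay.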

### 4.3 Risk Budget Burn-Up Chart

```
Risk Points
    ^
100 |__________ Budget Line (flat or stepped)
    |         \
 80 |          \  ← Actual Risk (cumulative)
    |           \
 60 |            \_____ Headroom (green)
    |                 \
 40 |                  \__ Target by release
    |
    +---------------------------------> Time
         T-30    T-14   T-7   T-2   Release
```

- **X-axis:** Calendar dates leading up to the release (T-30 through release day in the chart above)
- **Y-axis:** Risk points
- **Two lines:** Budget (flat/stepped) + Actual Risk (daily)
- **Shaded area:** Headroom (green) or Overrun (red)
- **Markers:** Feature freeze, pen-test, dependency bumps
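
The shaded area follows directly from the two lines: headroom is budget minus actual risk, and a negative value is an overrun. A minimal sketch, assuming one data point per day and hypothetical `BurnPoint` and `HeadroomPoint` shapes:

```typescript
interface BurnPoint {
  date: string; // ISO 8601 date
  budget: number; // budget line value on that day
  actual: number; // actual risk points on that day
}

interface HeadroomPoint {
  date: string;
  headroom: number; // positive = green area, negative = red area
  status: 'headroom' | 'overrun';
}

// Derives the shaded-area series from the budget and actual-risk lines.
function headroomSeries(points: BurnPoint[]): HeadroomPoint[] {
  return points.map(
    (point): HeadroomPoint => ({
      date: point.date,
      headroom: point.budget - point.actual,
      status: point.actual > point.budget ? 'overrun' : 'headroom',
    }),
  );
}
```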

### 4.4 Computation Formulas

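```typescript
// Inputs for one issue.
interface Issue {
  severityWeight: number;
  exposureFactor: number;
  evidenceFreshnessPenalty: number;
  daysSinceEvidence: number;
}

// Risk points per issue
function riskPoints(issue: Issue, thresholdDays: number): number {
  let points = issue.severityWeight * issue.exposureFactor * issue.evidenceFreshnessPenalty;
  // Unknown penalty: no fresh evidence within the last `thresholdDays` days
  if (issue.daysSinceEvidence > thresholdDays) {
    points *= 1.5; // multiplier
  }
  return points;
}

// Decay on fix: points are subtracted only once the fix has landed
// and the evidence has been refreshed
function riskAfterFix(actualRisk: number, points: number,
                      fixLanded: boolean, evidenceRefreshed: boolean): number {
  return fixLanded && evidenceRefreshed ? actualRisk - points : actualRisk;
}

// Guardrails: the gate fails on too many unknowns (> k) or a budget overrun
function gateFails(unknowns: number, k: number, actualRisk: number, budget: number): boolean {
  return unknowns > k || actualRisk > budget;
}
```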

## 5. Implementation Components

### 5.1 Component Hierarchy

```
TriageCanvasComponent
├── TriageListComponent
│   ├── SeverityFilterComponent
│   ├── VulnerabilityRowComponent
│   └── BulkActionBarComponent
├── TriageDetailComponent
│   ├── AffectedPackagesPanel
│   ├── AdvisoryRefsPanel
│   ├── ReachabilityContextComponent
│   └── EvidenceProvenanceComponent
├── AiRecommendationPanel
│   ├── ReachabilityExplanation
│   ├── SuggestedJustification
│   └── SimilarVulnsComponent
├── VexDecisionModalComponent
│   ├── StatusSelector
│   ├── JustificationTypeSelector
│   ├── EvidenceRefInput
│   └── ScopeSelector
└── VexHistoryComponent

CompareViewComponent
├── BaselineSelectorComponent
├── TrustIndicatorsComponent
├── DeltaSummaryStripComponent
├── ThreePaneLayoutComponent
│   ├── CategoriesPaneComponent
│   ├── ItemsPaneComponent
│   └── ProofPaneComponent
├── ActionablesPanelComponent
└── ExportActionsComponent

RiskDashboardComponent
├── BurnUpChartComponent
├── UnknownsHeatmapComponent
├── DeltaTableComponent
├── ExceptionLedgerComponent
└── KpiTilesComponent
```

### 5.2 Service Layer

```typescript
// Core services
TriageService           // Vulnerability list + filtering
VexDecisionService      // CRUD for VEX decisions
AdvisoryAiService       // AI recommendations
CompareService          // Baseline + delta computation
RiskBudgetService       // Budget + burn tracking
EvidenceService         // Evidence retrieval
```

## 6. API Integration

### VulnExplorer Endpoints
```
GET  /api/v1/vulnerabilities                    // List with filters
GET  /api/v1/vulnerabilities/{id}               // Detail
GET  /api/v1/vulnerabilities/{id}/reachability  // Call graph slice
POST /api/v1/vex-decisions                      // Create VEX decision
PUT  /api/v1/vex-decisions/{id}                 // Update VEX decision
GET  /api/v1/vex-decisions?vulnId={id}          // History for vuln
```
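
The paths above can be assembled with small URL helpers. This sketch covers only URL construction; the filter names in `VulnFilters` are hypothetical (borrowed from the filter chips in section 3.1), since the accepted query parameters are not listed here, and the helper names are illustrative.

```typescript
// Hypothetical filter names; the real query parameters may differ.
interface VulnFilters {
  component?: string;
  cve?: string;
  environment?: string;
}

// GET /api/v1/vulnerabilities with optional filters as query parameters.
function listVulnerabilitiesUrl(filters: VulnFilters): string {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(filters)) {
    if (value !== undefined) {
      params.set(key, String(value));
    }
  }
  const query = params.toString();
  return query ? `/api/v1/vulnerabilities?${query}` : '/api/v1/vulnerabilities';
}

// GET /api/v1/vulnerabilities/{id}/reachability
function reachabilityUrl(id: string): string {
  return `/api/v1/vulnerabilities/${encodeURIComponent(id)}/reachability`;
}

// GET /api/v1/vex-decisions?vulnId={id}
function vexHistoryUrl(vulnId: string): string {
  return `/api/v1/vex-decisions?${new URLSearchParams({ vulnId }).toString()}`;
}
```

Encoding ids and filter values through `encodeURIComponent` and `URLSearchParams` avoids malformed URLs when an identifier contains reserved characters.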

### AdvisoryAI Endpoints
```
POST /api/v1/advisory/plan                      // Get analysis plan
POST /api/v1/advisory/execute                   // Execute analysis
GET  /api/v1/advisory/output/{taskId}           // Get recommendations
```

### Delta/Compare Endpoints
```
GET  /api/v1/baselines/recommendations/{digest}
POST /api/v1/delta/compute
GET  /api/v1/delta/{id}/trust-indicators
GET  /api/v1/actionables/delta/{id}
```

## 7. Implementation Status

| Component | Sprint | Status |
|-----------|--------|--------|
| Risk Dashboard Base | SPRINT_20251226_004_FE | TODO |
| Smart-Diff Compare View | SPRINT_20251226_012_FE | TODO |
| Unified Triage Canvas | SPRINT_20251226_013_FE | TODO |
| Documentation Consolidation | SPRINT_20251226_014_DOCS | TODO |
| VEX Decision Models | VulnExplorer/Models | **COMPLETE** |
| AdvisoryAI Pipeline | src/AdvisoryAI | **COMPLETE** |
| Confidence Badge | Web/shared/components | **COMPLETE** |
| Release Flow | Web/features/releases | **COMPLETE** |

## 8. Testing Strategy

### Unit Tests
- Component behavior (selection, filtering, expansion)
- Signal/computed derivations
- Role-based view switching
- Form validation (VEX decisions)

### Integration Tests
- API service calls and response handling
- Navigation and routing
- State persistence across route changes

### E2E Tests
- Full triage workflow: list → detail → VEX decision
- Comparison workflow: select baseline → compute delta → export
- Risk budget: view charts → create exception → see update

### Accessibility Tests
- Keyboard navigation completeness
- Screen reader announcements
- Color contrast compliance

## 9. Success Metrics

| Metric | Definition | Target |
|--------|------------|--------|
| Mean Time to Triage (MTTT) | Time from vuln notification to VEX decision | < 5 min |
| Mean Time to Explain (MTTE) | Time from "why did this change?" to "Understood" click | < 2 min |
| Triage Queue Throughput | Vulns triaged per hour per analyst | > 20 |
| AI Recommendation Acceptance | % of AI suggestions accepted without modification | > 60% |
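
Two of these metrics can be computed from per-vulnerability triage records. The sketch below assumes a hypothetical `TriageRecord` shape with ISO 8601 timestamps; how the records are collected is outside the scope of this example.

```typescript
// Hypothetical record of one triaged vulnerability.
interface TriageRecord {
  notifiedAt: string; // when the vuln notification was raised
  decidedAt: string; // when the VEX decision was recorded
  aiSuggested: boolean; // whether an AI suggestion was offered
  aiAcceptedUnmodified: boolean; // whether it was accepted without modification
}

// Mean Time to Triage in minutes; undefined when there are no records.
function meanTimeToTriageMinutes(records: TriageRecord[]): number | undefined {
  if (records.length === 0) {
    return undefined;
  }
  const totalMs = records.reduce(
    (sum, record) => sum + (Date.parse(record.decidedAt) - Date.parse(record.notifiedAt)),
    0,
  );
  return totalMs / records.length / 60000;
}

// Share of AI suggestions accepted without modification, as a fraction in [0, 1];
// undefined when no suggestions were offered.
function aiAcceptanceRate(records: TriageRecord[]): number | undefined {
  const suggested = records.filter((record) => record.aiSuggested);
  if (suggested.length === 0) {
    return undefined;
  }
  const accepted = suggested.filter((record) => record.aiAcceptedUnmodified);
  return accepted.length / suggested.length;
}
```

Returning `undefined` for empty input, rather than zero, keeps a dashboard from reporting a misleading "0 min" MTTT before any data exists.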

## 10. References

### Archived Advisories (Consolidated Here)
- `archived/2025-12-26-triage-advisories/25-Dec-2025 - Triage UI Lessons from Competitors.md`
- `archived/2025-12-26-triage-advisories/25-Dec-2025 - Visual Diffs for Explainable Triage.md`
- `archived/2025-12-26-triage-advisories/26-Dec-2026 - Visualizing the Risk Budget.md`

### Related Documentation
- `docs/modules/web/smart-diff-ui-architecture.md`
- `docs/implplan/SPRINT_20251226_004_FE_risk_dashboard.md`
- `docs/implplan/SPRINT_20251226_012_FE_smart_diff_compare.md`
- `docs/implplan/SPRINT_20251226_013_FE_triage_canvas.md`

### External References
- [Snyk Reachability Analysis](https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis)
- [Anchore Vulnerability Annotations](https://docs.anchore.com/current/docs/vulnerability_management/vuln_annotations/)
- [Prisma Cloud Runtime Defense](https://docs.prismacloud.io/en/compute-edition/30/admin-guide/runtime-defense/)