Unified Triage Experience Specification

Version: 1.0
Status: Active
Last Updated: 2025-12-26
Consolidated From: 3 product advisories (see References)

1. Executive Summary

The Problem

Modern container security generates overwhelming vulnerability data. Competitors offer fragmented solutions: Snyk provides reachability analysis, Anchore offers VEX annotations, Prisma Cloud delivers runtime signals. Security teams must context-switch between tools, losing precious time and context.

The Stella Ops Solution

A unified triage canvas that combines:

  • Rich evidence visualization with proof-carrying verdicts
  • VEX decisioning as first-class policy objects
  • AI-assisted analysis via AdvisoryAI
  • Attestable exceptions with audit trails
  • Offline-first architecture for air-gapped parity

2. Competitive Landscape

Snyk — Reachability + Continuous Context

  • Implements reachability analysis building call graphs
  • Factors reachability into priority scores
  • Uses static analysis + AI + expert curation
  • Tracks issues over time without re-scanning unchanged images

Anchore — Vulnerability Annotations + VEX Export

  • Vulnerability annotation workflows via UI/API
  • Labels: "not applicable", "mitigated", "under investigation"
  • Export as OpenVEX and CycloneDX VEX
  • Downstream consumers receive curated exploitability state

Prisma Cloud — Runtime Defense

  • Continuous behavioral profiling
  • Process, file, and network rule enforcement
  • Learning models baseline expected behavior
  • Runtime context during operational incidents

Stella Ops Differentiation

| Feature | Snyk | Anchore | Prisma | Stella Ops |
|---|---|---|---|---|
| Reachability analysis | Yes | Partial | No | Yes (static + binary + runtime) |
| VEX as policy objects | No | Export only | No | First-class |
| Attestable exceptions | No | No | No | Yes (DSSE) |
| Offline replay | No | No | No | Yes |
| AI-assisted triage | Yes | No | No | Yes (AdvisoryAI) |
| Evidence graphs | Partial | No | Partial | Full chain |

3. Core UI Concepts

3.1 Visual Diff Pattern

Every policy decision or reachability change is treated as a visual diff, enabling quick, explainable triage.

Side-by-Side Panes

  • Before (previous scan/policy) vs After (current)
  • Show dependency/reachability subgraph
  • Highlight added/removed/changed nodes/edges

Evidence Strip (Right Rail)

Human-readable facts used by the engine:

  • Feature flag status (e.g., "feature flag OFF")
  • Code path analysis (e.g., "code path unreachable")
  • Runtime traces (e.g., "kernel eBPF trace absent")

Diff Verdict Header

Risk ↓ from Medium → Low (policy v1.8 → v1.9)
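
A header of this form could be generated from the before/after risk levels and policy versions. The following is a minimal sketch, not the product's implementation: the `RiskLevel` values other than "Low" and "Medium", the ordering, and the function name `diffVerdictHeader` are all assumptions made for illustration.

```typescript
// Assumption: only "Low" and "Medium" appear in the spec; the other levels
// and their ordering are hypothetical.
type RiskLevel = "Low" | "Medium" | "High" | "Critical";
const RISK_ORDER: RiskLevel[] = ["Low", "Medium", "High", "Critical"];

// Build the one-line verdict header shown above the side-by-side panes.
function diffVerdictHeader(
  before: RiskLevel,
  after: RiskLevel,
  policyBefore: string,
  policyAfter: string
): string {
  const policy = "(policy " + policyBefore + " → " + policyAfter + ")";
  const change = RISK_ORDER.indexOf(after) - RISK_ORDER.indexOf(before);
  if (change === 0) {
    return "Risk unchanged at " + after + " " + policy;
  }
  // A lower index means lower risk, so a negative change is a decrease.
  const arrow = change < 0 ? "↓" : "↑";
  return "Risk " + arrow + " from " + before + " → " + after + " " + policy;
}
```

Calling `diffVerdictHeader("Medium", "Low", "v1.8", "v1.9")` reproduces the example header above.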

Filter Chips

Scope by: component, package, CVE, policy rule, environment

3.2 Data Models

interface GraphSnapshot {
  nodes: GraphNode[];
  edges: GraphEdge[];
  metadata: { component: string; version: string; tags: string[] };
}

interface PolicySnapshot {
  version: string;
  rulesHash: string;
  inputs: { flags: Record<string, boolean>; env: string; vexSources: string[] };
}

interface Delta {
  added: DeltaItem[];
  removed: DeltaItem[];
  changed: DeltaItem[];
  ruleOutcomes: RuleOutcomeDelta[];
}

interface EvidenceItem {
  type: 'trace_hit' | 'sbom_line' | 'vex_claim' | 'config_value';
  source: string;
  digest: string;
  excerpt: string;
  timestamp: string;
}

interface SignedDeltaVerdict {
  status: 'routine' | 'review' | 'block';
  signatures: Signature[];
  producer: string;
}
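
To illustrate how the `added`, `removed`, and `changed` lists of a `Delta` might be populated from two graph snapshots, the sketch below diffs node lists by identifier. The specification names `GraphNode` and `DeltaItem` but does not define them, so the fields used here (`id`, `reachable`, `before`, `after`) and the helper `diffNodes` are assumptions.

```typescript
// Hypothetical shapes: the spec references these types without defining them.
interface GraphNode { id: string; reachable: boolean }
interface DeltaItem { id: string; before?: GraphNode; after?: GraphNode }
interface NodeDelta { added: DeltaItem[]; removed: DeltaItem[]; changed: DeltaItem[] }

// Diff two node lists by id. A node counts as "changed" when its
// reachability flips between the two snapshots.
function diffNodes(before: GraphNode[], after: GraphNode[]): NodeDelta {
  const beforeById = new Map<string, GraphNode>();
  for (const node of before) beforeById.set(node.id, node);

  const afterIds = new Set<string>();
  const delta: NodeDelta = { added: [], removed: [], changed: [] };

  for (const node of after) {
    afterIds.add(node.id);
    const previous = beforeById.get(node.id);
    if (previous === undefined) {
      delta.added.push({ id: node.id, after: node });
    } else if (previous.reachable !== node.reachable) {
      delta.changed.push({ id: node.id, before: previous, after: node });
    }
  }
  for (const node of before) {
    if (!afterIds.has(node.id)) {
      delta.removed.push({ id: node.id, before: node });
    }
  }
  return delta;
}
```

Edges and `ruleOutcomes` would need an analogous comparison; they are omitted here to keep the sketch short.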

3.3 Micro-Interactions

| Interaction | Behavior |
|---|---|
| Hover changed node | Inline badge explaining "why it changed" |
| Click rule in rail | Spotlight the exact subgraph affected |
| Toggle "explain like I'm new" | Expands jargon into plain language |
| One-click "copy audit bundle" | Exports delta + evidence as attachment |

3.4 Keyboard Shortcuts

| Key | Action |
|---|---|
| 1 | Focus changes only |
| 2 | Show full graph |
| E | Expand evidence |
| A | Export audit bundle |
| N | Next item in queue |
| P | Previous item |
| M | Mark not affected |

4. Risk Budget Visualization

4.1 Concept

  • Risk budget = allowable unresolved risk for a release (e.g., 100 "risk points")
  • Burn = consumption rate as alerts appear, minus "payback" from fixes

4.2 Dashboard Components

Heatmap of Unknowns

| Component | Vulns | Compliance | Perf | Data | Supply Chain |
|---|---|---|---|---|---|
| Service A | 🟡 12 | 🟢 0 | 🟡 3 | 🔴 8 | 🟡 5 |
| Service B | 🔴 24 | 🟡 2 | 🟢 1 | 🟡 4 | 🟢 0 |

Cell value = unknowns count × severity weight

Delta Table (Risk Decay per Release)

| Release | Before | After | Retired | Shifted | Unknowns |
|---|---|---|---|---|---|
| v2.3.1 | 85 | 67 | -22 | +4 | 12 |
| v2.3.0 | 92 | 85 | -15 | +8 | 18 |
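
Both sample rows are consistent with the relationship After = Before + Retired + Shifted, where Retired is negative and Shifted is positive. The specification does not state this rule explicitly; it is inferred from the sample values. The type `ReleaseDelta` and the function `riskAfter` are illustrative names.

```typescript
// Illustrative row shape mirroring the Delta Table columns.
interface ReleaseDelta {
  release: string;
  before: number;
  retired: number; // negative: risk points removed by fixes
  shifted: number; // positive: risk points moved into this release
}

// Inferred relationship: After = Before + Retired + Shifted.
function riskAfter(row: ReleaseDelta): number {
  return row.before + row.retired + row.shifted;
}

const sampleRows: ReleaseDelta[] = [
  { release: "v2.3.1", before: 85, retired: -22, shifted: 4 },
  { release: "v2.3.0", before: 92, retired: -15, shifted: 8 },
];

// 85 - 22 + 4 = 67 and 92 - 15 + 8 = 85, matching the "After" column.
const afterValues = sampleRows.map(riskAfter);
```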

Exception Ledger

Every accepted risk has: ID, owner, expiry, evidence note, auto-reminder.
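
A ledger entry with these fields might be modeled as follows. This is a sketch under stated assumptions: the field types, the `reminderDaysBefore` lead time, and the three-state classification are not defined by the specification.

```typescript
// Field names follow the list above; the types are assumptions.
interface RiskException {
  id: string;
  owner: string;
  expiry: string;             // ISO 8601 timestamp
  evidenceNote: string;
  reminderDaysBefore: number; // hypothetical lead time for the auto-reminder
}

type ExceptionState = "active" | "reminder_due" | "expired";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Classify an exception relative to "now" so the ledger can flag entries
// that need a reminder or have lapsed.
function exceptionState(entry: RiskException, now: Date): ExceptionState {
  const remainingDays = (new Date(entry.expiry).getTime() - now.getTime()) / MS_PER_DAY;
  if (remainingDays <= 0) return "expired";
  if (remainingDays <= entry.reminderDaysBefore) return "reminder_due";
  return "active";
}
```

Passing `now` as a parameter rather than reading the clock inside the function keeps the classification deterministic, which matters for offline replay.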

4.3 Risk Budget Burn-Up Chart

Risk Points
    ^
100 |__________ Budget Line (flat or stepped)
    |         \
 80 |          \  ← Actual Risk (cumulative)
    |           \
 60 |            \_____ Headroom (green)
    |                 \
 40 |                  \__ Target by release
    |
    +---------------------------------> Time
         T-30    T-14   T-7   T-2   Release

  • X-axis: Calendar dates to code freeze
  • Y-axis: Risk points
  • Two lines: Budget (flat/stepped) + Actual Risk (daily)
  • Shaded area: Headroom (green) or Overrun (red)
  • Markers: Feature freeze, pen-test, dependency bumps

4.4 Computation Formulas

// Risk points per issue
risk_points = severity_weight × exposure_factor × evidence_freshness_penalty

// Unknown penalty: applies when the newest evidence is older than the threshold (N days)
if (days_since_evidence > threshold) {
  risk_points *= 1.5; // multiplier
}

// Decay on fix
if (fix_landed && evidence_refreshed) {
  subtract_points(issue.risk_points);
}

// Guardrails
if (unknowns > K || actual_risk > budget) {
  fail_gate();
}
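
The formulas above can be turned into a small, testable sketch. The 1.5 multiplier comes from the pseudocode; everything else (the `Issue` field names, the 14-day staleness threshold, and the function names) is an assumption chosen for illustration.

```typescript
// Illustrative issue shape; field names mirror the terms in the formula.
interface Issue {
  severityWeight: number;   // e.g. 10 for critical, 1 for low (assumed scale)
  exposureFactor: number;   // e.g. 1.0 internet-facing, 0.5 internal (assumed)
  freshnessPenalty: number; // evidence_freshness_penalty from the formula
  daysSinceEvidence: number;
}

const STALE_EVIDENCE_DAYS = 14; // hypothetical threshold
const STALE_MULTIPLIER = 1.5;   // multiplier from the pseudocode above

// risk_points = severity_weight × exposure_factor × evidence_freshness_penalty,
// multiplied by 1.5 when the evidence is older than the threshold.
function riskPoints(issue: Issue): number {
  const base = issue.severityWeight * issue.exposureFactor * issue.freshnessPenalty;
  return issue.daysSinceEvidence > STALE_EVIDENCE_DAYS ? base * STALE_MULTIPLIER : base;
}

// Guardrail: the gate fails when unknowns exceed K or actual risk exceeds budget.
function gatePasses(issues: Issue[], unknowns: number, maxUnknowns: number, budget: number): boolean {
  const actualRisk = issues.reduce((sum, issue) => sum + riskPoints(issue), 0);
  return unknowns <= maxUnknowns && actualRisk <= budget;
}
```

"Decay on fix" falls out naturally in this form: once a fix lands and evidence is refreshed, the issue is dropped from the `issues` array and its points no longer count toward `actualRisk`.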

5. Implementation Components

5.1 Component Hierarchy

TriageCanvasComponent
├── TriageListComponent
│   ├── SeverityFilterComponent
│   ├── VulnerabilityRowComponent
│   └── BulkActionBarComponent
├── TriageDetailComponent
│   ├── AffectedPackagesPanel
│   ├── AdvisoryRefsPanel
│   ├── ReachabilityContextComponent
│   └── EvidenceProvenanceComponent
├── AiRecommendationPanel
│   ├── ReachabilityExplanation
│   ├── SuggestedJustification
│   └── SimilarVulnsComponent
├── VexDecisionModalComponent
│   ├── StatusSelector
│   ├── JustificationTypeSelector
│   ├── EvidenceRefInput
│   └── ScopeSelector
└── VexHistoryComponent

CompareViewComponent
├── BaselineSelectorComponent
├── TrustIndicatorsComponent
├── DeltaSummaryStripComponent
├── ThreePaneLayoutComponent
│   ├── CategoriesPaneComponent
│   ├── ItemsPaneComponent
│   └── ProofPaneComponent
├── ActionablesPanelComponent
└── ExportActionsComponent

RiskDashboardComponent
├── BurnUpChartComponent
├── UnknownsHeatmapComponent
├── DeltaTableComponent
├── ExceptionLedgerComponent
└── KpiTilesComponent

5.2 Service Layer

// Core services
TriageService           // Vulnerability list + filtering
VexDecisionService      // CRUD for VEX decisions
AdvisoryAiService       // AI recommendations
CompareService          // Baseline + delta computation
RiskBudgetService       // Budget + burn tracking
EvidenceService         // Evidence retrieval

6. API Integration

VulnExplorer Endpoints

GET  /api/v1/vulnerabilities                    // List with filters
GET  /api/v1/vulnerabilities/{id}               // Detail
GET  /api/v1/vulnerabilities/{id}/reachability  // Call graph slice
POST /api/v1/vex-decisions                      // Create VEX decision
PUT  /api/v1/vex-decisions/{id}                 // Update VEX decision
GET  /api/v1/vex-decisions?vulnId={id}          // History for vuln
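
A client might centralize these routes in a small URL builder so that identifiers are always encoded consistently. This is a sketch, not the actual `TriageService`: the object name `vulnExplorerUrls`, the helper `buildQuery`, and any filter names other than `vulnId` are assumptions.

```typescript
const API_BASE = "/api/v1";

// Serialize defined parameters into a query string; undefined values are skipped.
function buildQuery(params: Record<string, string | undefined>): string {
  const parts: string[] = [];
  for (const key of Object.keys(params)) {
    const value = params[key];
    if (value !== undefined) {
      parts.push(encodeURIComponent(key) + "=" + encodeURIComponent(value));
    }
  }
  return parts.length > 0 ? "?" + parts.join("&") : "";
}

// Route builders for the GET endpoints listed above.
const vulnExplorerUrls = {
  list: (filters: Record<string, string | undefined> = {}): string =>
    API_BASE + "/vulnerabilities" + buildQuery(filters),
  detail: (id: string): string =>
    API_BASE + "/vulnerabilities/" + encodeURIComponent(id),
  reachability: (id: string): string =>
    API_BASE + "/vulnerabilities/" + encodeURIComponent(id) + "/reachability",
  vexHistory: (vulnId: string): string =>
    API_BASE + "/vex-decisions" + buildQuery({ vulnId }),
};
```

For example, `vulnExplorerUrls.vexHistory("CVE-2024-0001")` yields `/api/v1/vex-decisions?vulnId=CVE-2024-0001`.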

AdvisoryAI Endpoints

POST /api/v1/advisory/plan                      // Get analysis plan
POST /api/v1/advisory/execute                   // Execute analysis
GET  /api/v1/advisory/output/{taskId}           // Get recommendations

Delta/Compare Endpoints

GET  /api/v1/baselines/recommendations/{digest}
POST /api/v1/delta/compute
GET  /api/v1/delta/{id}/trust-indicators
GET  /api/v1/actionables/delta/{id}

7. Implementation Status

| Component | Sprint / Location | Status |
|---|---|---|
| Risk Dashboard Base | SPRINT_20251226_004_FE | TODO |
| Smart-Diff Compare View | SPRINT_20251226_012_FE | TODO |
| Unified Triage Canvas | SPRINT_20251226_013_FE | TODO |
| Documentation Consolidation | SPRINT_20251226_014_DOCS | TODO |
| VEX Decision Models | VulnExplorer/Models | COMPLETE |
| AdvisoryAI Pipeline | src/AdvisoryAI | COMPLETE |
| Confidence Badge | Web/shared/components | COMPLETE |
| Release Flow | Web/features/releases | COMPLETE |

8. Testing Strategy

Unit Tests

  • Component behavior (selection, filtering, expansion)
  • Signal/computed derivations
  • Role-based view switching
  • Form validation (VEX decisions)

Integration Tests

  • API service calls and response handling
  • Navigation and routing
  • State persistence across route changes

E2E Tests

  • Full triage workflow: list → detail → VEX decision
  • Comparison workflow: select baseline → compute delta → export
  • Risk budget: view charts → create exception → see update

Accessibility Tests

  • Keyboard navigation completeness
  • Screen reader announcements
  • Color contrast compliance

9. Success Metrics

| Metric | Definition | Target |
|---|---|---|
| Mean Time to Triage (MTTT) | Time from vuln notification to VEX decision | < 5 min |
| Mean Time to Explain (MTTE) | Time from "why did this change?" to "Understood" click | < 2 min |
| Triage Queue Throughput | Vulns triaged per hour per analyst | > 20 |
| AI Recommendation Acceptance | % of AI suggestions accepted without modification | > 60% |

10. References

Archived Advisories (Consolidated Here)

  • archived/2025-12-26-triage-advisories/25-Dec-2025 - Triage UI Lessons from Competitors.md
  • archived/2025-12-26-triage-advisories/25-Dec-2025 - Visual Diffs for Explainable Triage.md
  • archived/2025-12-26-triage-advisories/26-Dec-2026 - Visualizing the Risk Budget.md
  • docs/modules/web/smart-diff-ui-architecture.md
  • docs/implplan/SPRINT_20251226_004_FE_risk_dashboard.md
  • docs/implplan/SPRINT_20251226_012_FE_smart_diff_compare.md
  • docs/implplan/SPRINT_20251226_013_FE_triage_canvas.md

External References

12. 2026-02-26 Batch Delivery Update

This document is updated to reflect completed triage/risk/score parity work from:

  • SPRINT_20260226_227_FE_triage_risk_score_widget_wiring_and_parity

Delivered coverage in this batch:

  • Evidence pill interactions in triage now route through deterministic verification and explanation paths.
  • Risk dashboard parity widgets (budget, verdict, diff, exceptions) are covered by active Playwright suites.
  • Findings score interactions include breakdown and score-history panel sourced from API responses.
  • Previously skipped risk-dashboard and score-features E2E suites were replaced with active deterministic mock-backed tests.