# Explainable Triage Workflows - Implementation Plan

## Executive Summary

This document outlines the implementation plan for delivering **Explainable Triage Workflows** as defined in the product advisory dated 21-Dec-2025. The capability set enables vulnerability-first, policy-backed, reachability-informed verdicts with full explainability and auditability.

## Vision

> Every vulnerability finding must resolve to a **policy-backed, reachability-informed, runtime-corroborated verdict** that is **exportable as one signed attestation attached to the built artifact**.

## Current State Analysis

### Already Implemented (75%)

| Capability | Implementation | Completeness |
|------------|----------------|--------------|
| Reachability analysis | 10 language analyzers, binary, runtime | 95% |
| VEX processing | OpenVEX, CSAF, CycloneDX with lattice | 90% |
| Explainability | ExplainTrace with rule steps | 95% |
| Evidence generation | Path witnesses, rich graphs | 90% |
| Audit trails | Immutable ledger with chain integrity | 85% |
| Policy gates | 4-stage gate system | 95% |
| Attestations | 7 predicate types with DSSE | 90% |
| Runtime capture | eBPF, dyld, ETW | 85% |

### Already Planned (15%)

| Capability | Sprint | Status |
|------------|--------|--------|
| Risk Verdict Attestation | SPRINT_4100_0003_0001 | TODO |
| OCI Attachment | SPRINT_4100_0003_0002 | TODO |
| Counterfactuals | SPRINT_4200_0002_0005 | TODO |
| Replay Engine | SPRINT_4100_0002_0002 | TODO |
| Knowledge Snapshot | SPRINT_4100_0002_0001 | TODO |
| Audit Pack Export | SPRINT_5100_0006_0001 | TODO |
| Unknown Budgets | SPRINT_4100_0001_0002 | TODO |

### Net New Gaps (10%)

| Gap | Sprint | Story Points |
|-----|--------|--------------|
| Unified Confidence Model | SPRINT_7000_0002_0001 | 13 |
| Vulnerability-First UX API | SPRINT_7000_0002_0002 | 13 |
| Evidence Graph API | SPRINT_7000_0003_0001 | 8 |
| Reachability Mini-Map | SPRINT_7000_0003_0002 | 5 |
| Runtime Timeline | SPRINT_7000_0003_0003 | 5 |
| Progressive Fidelity | SPRINT_7000_0004_0001 | 13 |
| Evidence Size Budgets | SPRINT_7000_0004_0002 | 8 |
| Quality KPIs | SPRINT_7000_0005_0001 | 8 |

## Implementation Roadmap

### Phase 1: Foundation (Existing + New)

**Objective**: Establish core verdict and confidence infrastructure.

**Sprints**:
- SPRINT_4100_0003_0001: Risk Verdict Attestation
- SPRINT_4100_0002_0001: Knowledge Snapshot Manifest
- SPRINT_7000_0002_0001: Unified Confidence Model
- SPRINT_7000_0004_0001: Progressive Fidelity (parallelizable)
- SPRINT_7000_0004_0002: Evidence Size Budgets (parallelizable)

**Key Deliverables**:
- `RiskVerdictAttestation` model with PASS/FAIL/PASS_WITH_EXCEPTIONS/INDETERMINATE
- `ConfidenceScore` with 5-factor breakdown
- `FidelityLevel` enum with Quick/Standard/Deep modes
- `EvidenceBudget` with retention tiers
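
As an illustration only, the Phase 1 models might take roughly the following shape. The four verdict states and the three fidelity modes come from the deliverables above; the five factor names, the `[0.0, 1.0]` range, and the unweighted mean are assumptions made for this sketch, not the plan's actual design.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Dict


class Verdict(Enum):
    """Verdict states named in the Phase 1 deliverables."""
    PASS = "PASS"
    FAIL = "FAIL"
    PASS_WITH_EXCEPTIONS = "PASS_WITH_EXCEPTIONS"
    INDETERMINATE = "INDETERMINATE"


class FidelityLevel(Enum):
    """Analysis modes named in the Phase 1 deliverables."""
    QUICK = "Quick"
    STANDARD = "Standard"
    DEEP = "Deep"


@dataclass(frozen=True)
class ConfidenceScore:
    # ASSUMPTION: the plan only says "5-factor breakdown"; these five
    # factor names are placeholders chosen for the example.
    reachability: float
    runtime: float
    vex: float
    provenance: float
    policy: float

    def factors(self) -> Dict[str, float]:
        """Return the per-factor breakdown as a plain dict."""
        return asdict(self)

    def overall(self) -> float:
        """Combine the factors into one score in [0.0, 1.0]."""
        values = list(self.factors().values())
        if any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError("each factor must lie in [0.0, 1.0]")
        # ASSUMPTION: unweighted mean; a real model may weight factors.
        return sum(values) / len(values)
```

Keeping the breakdown alongside the combined score lets a consumer show why a verdict has a given confidence, not just the number.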

### Phase 2: UX Layer

**Objective**: Deliver vulnerability-first presentation layer.

**Sprints**:
- SPRINT_7000_0002_0002: Vulnerability-First UX API

**Key Deliverables**:
- `FindingSummaryResponse` with verdict chip, confidence, one-liner
- `ProofBadges` (Reachability, Runtime, Policy, Provenance)
- `GET /api/v1/findings` list endpoint
- `GET /api/v1/findings/{id}/summary` detail endpoint
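
A response for the summary endpoint could be modeled along these lines. This is a hedged sketch: the four badge names come from the deliverables above, but every field name, the boolean badge representation, and the JSON layout are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProofBadges:
    # Badge names follow the deliverable list; modeling each badge as a
    # boolean "proof present" flag is an assumption.
    reachability: bool
    runtime: bool
    policy: bool
    provenance: bool


@dataclass(frozen=True)
class FindingSummaryResponse:
    # All field names here are hypothetical.
    finding_id: str
    verdict: str        # the verdict chip, e.g. "FAIL"
    confidence: float   # combined confidence in [0.0, 1.0]
    one_liner: str      # short human-readable rationale
    badges: ProofBadges

    def to_json(self) -> str:
        """Serialize with sorted keys so the output is stable."""
        return json.dumps(asdict(self), sort_keys=True)


# Placeholder data, not a real finding.
example = FindingSummaryResponse(
    finding_id="finding-001",
    verdict="FAIL",
    confidence=0.8,
    one_liner="Vulnerable function is reachable from an entry point.",
    badges=ProofBadges(reachability=True, runtime=False, policy=True, provenance=True),
)
```

Calling `example.to_json()` yields a flat document with a nested `badges` object, which a list view could render without further lookups.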

### Phase 3: Visualization APIs

**Objective**: Enable evidence exploration and click-through.

**Sprints** (parallelizable):
- SPRINT_7000_0003_0001: Evidence Graph API
- SPRINT_7000_0003_0002: Reachability Mini-Map API
- SPRINT_7000_0003_0003: Runtime Timeline API

**Key Deliverables**:
- `GET /api/v1/findings/{id}/evidence-graph`
- `GET /api/v1/findings/{id}/reachability-map`
- `GET /api/v1/findings/{id}/runtime-timeline`
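
To make "click-through" concrete, the sketch below assumes the evidence graph is returned as a node list plus a directed edge list, and walks it breadth-first from the finding to every linked proof. The payload shape, node kinds, and identifiers are all hypothetical; the plan does not specify the response format.

```python
from collections import deque
from typing import Any, Dict, List

# Hypothetical payload for GET /api/v1/findings/{id}/evidence-graph.
EXAMPLE_GRAPH: Dict[str, List[Dict[str, Any]]] = {
    "nodes": [
        {"id": "finding", "kind": "finding"},
        {"id": "reachability", "kind": "reachability"},
        {"id": "policy", "kind": "policy"},
        {"id": "witness", "kind": "path_witness"},
    ],
    "edges": [
        {"from": "finding", "to": "reachability"},
        {"from": "finding", "to": "policy"},
        {"from": "reachability", "to": "witness"},
    ],
}


def proofs_reachable_from(graph: Dict[str, List[Dict[str, Any]]], start: str) -> List[str]:
    """Return node ids reachable from `start`, in breadth-first order."""
    adjacency: Dict[str, List[str]] = {}
    for edge in graph["edges"]:
        adjacency.setdefault(edge["from"], []).append(edge["to"])

    seen = {start}
    order: List[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:  # guard against cycles and duplicates
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order
```

Breadth-first order lists the proofs closest to the finding first, which matches how a user would drill down from a summary into supporting evidence.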

### Phase 4: Metrics & Observability

**Objective**: Track quality KPIs for continuous improvement.

**Sprints**:
- SPRINT_7000_0005_0001: Quality KPIs Tracking

**Key Deliverables**:
- `TriageQualityKpis` model
- `GET /api/v1/metrics/kpis` dashboard endpoint
- Trend tracking over time

## Architecture Changes

### New Libraries

```
src/
├── Policy/
│   └── __Libraries/
│       └── StellaOps.Policy.Confidence/     # NEW: Confidence model
│           ├── Models/
│           ├── Services/
│           └── Configuration/
├── Scanner/
│   └── __Libraries/
│       └── StellaOps.Scanner.Orchestration/ # NEW: Fidelity orchestration
│           └── Fidelity/
├── Findings/
│   └── StellaOps.Findings.WebService/       # EXTEND: UX APIs
│       ├── Contracts/
│       ├── Services/
│       └── Endpoints/
├── Evidence/                                 # NEW: Evidence management
│   └── StellaOps.Evidence/
│       ├── Budgets/
│       └── Retention/
└── Metrics/                                  # NEW: KPI tracking
    └── StellaOps.Metrics/
        └── Kpi/
```

### Database Changes

| Table | Purpose |
|-------|---------|
| `confidence_factors` | Store factor breakdown per verdict |
| `evidence_items` | Track evidence with size and tier |
| `kpi_counters` | Real-time KPI counters |
| `kpi_snapshots` | Daily KPI snapshots |

### API Surface

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/api/v1/findings` | GET | List findings with summaries |
| `/api/v1/findings/{id}/summary` | GET | Detailed finding summary |
| `/api/v1/findings/{id}/evidence-graph` | GET | Evidence graph |
| `/api/v1/findings/{id}/reachability-map` | GET | Reachability mini-map |
| `/api/v1/findings/{id}/runtime-timeline` | GET | Runtime timeline |
| `/api/v1/scan/analyze` | POST | Analyze with fidelity level |
| `/api/v1/scan/findings/{id}/upgrade` | POST | Upgrade fidelity |
| `/api/v1/metrics/kpis` | GET | Quality KPIs |
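
For client authors, the table above can be captured as a small route map. The paths and methods are copied from the table; the route names, the `build_request` helper, and the choice to percent-encode path parameters are assumptions of this sketch.

```python
from typing import Dict, Tuple
from urllib.parse import quote

# Method and path template per endpoint, as listed in the API Surface table.
# The dictionary keys are hypothetical names used only by this example.
ROUTES: Dict[str, Tuple[str, str]] = {
    "list_findings": ("GET", "/api/v1/findings"),
    "finding_summary": ("GET", "/api/v1/findings/{id}/summary"),
    "evidence_graph": ("GET", "/api/v1/findings/{id}/evidence-graph"),
    "reachability_map": ("GET", "/api/v1/findings/{id}/reachability-map"),
    "runtime_timeline": ("GET", "/api/v1/findings/{id}/runtime-timeline"),
    "analyze": ("POST", "/api/v1/scan/analyze"),
    "upgrade_fidelity": ("POST", "/api/v1/scan/findings/{id}/upgrade"),
    "quality_kpis": ("GET", "/api/v1/metrics/kpis"),
}


def build_request(name: str, **params: str) -> Tuple[str, str]:
    """Return (method, path) with path parameters percent-encoded."""
    method, template = ROUTES[name]
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    # Raises KeyError if a required path parameter such as `id` is missing.
    return method, template.format(**encoded)
```

For example, `build_request("evidence_graph", id="abc/1")` encodes the slash so the identifier cannot alter the path structure.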

## Non-Negotiables

From the advisory:

1. **Vulnerability-first UX**: Users start from CVE/finding and immediately see applicability, reachability, runtime corroboration, and policy rationale.

2. **Single canonical verdict artifact**: One built-in, signed verdict attestation per subject (OCI digest), replayable.

3. **Deterministic evidence**: Evidence objects are content-hashed and versioned.

4. **Unknowns are first-class**: "Unknown reachability/runtime/config" is not hidden; it is budgeted and policy-controlled.
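
The "content-hashed and versioned" requirement in item 3 can be illustrated with a minimal sketch. It assumes evidence is JSON-serializable and uses sorted-key compact JSON as a simple stand-in for a canonical form; the `content_address` name, the envelope layout, and the choice of SHA-256 are assumptions, and a production system may need a stricter canonicalization scheme.

```python
import hashlib
import json
from typing import Any, Dict


def content_address(evidence: Dict[str, Any], schema_version: str = "1") -> str:
    """Return a deterministic identifier for an evidence object.

    The schema version is hashed together with the payload, so the same
    payload under a different version gets a different address.
    """
    envelope = {"schema_version": schema_version, "evidence": evidence}
    # Sorted keys and fixed separators make the byte form independent of
    # dict insertion order and of whitespace choices.
    canonical = json.dumps(
        envelope, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two evidence objects with the same fields in a different order hash to the same address, which is what allows a replay to confirm it is looking at identical inputs.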

## Quality KPIs

| KPI | Target | Measurement Cadence |
|-----|--------|-------------|
| % non-UNKNOWN reachability | >80% | Weekly |
| % runtime corroboration | >50% (where sensor deployed) | Weekly |
| Explainability completeness | >95% | Weekly |
| Replay success rate | >99% | Weekly |
| Median time to verdict | <5 min | Daily |
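
Three of these KPIs can be computed directly from per-finding records, as the sketch below shows. The thresholds are taken from the table; the record fields (`reachability`, `sensor_deployed`, `runtime_corroborated`, `seconds_to_verdict`) and function names are hypothetical. Explainability completeness and replay success rate are omitted because their inputs are not described here.

```python
from statistics import median
from typing import Any, Dict, List, Optional


def compute_kpis(records: List[Dict[str, Any]]) -> Dict[str, Optional[float]]:
    """Aggregate per-finding records into KPI values."""
    if not records:
        raise ValueError("at least one record is required")
    known = sum(1 for r in records if r["reachability"] != "UNKNOWN")
    # Runtime corroboration is only measured where a sensor is deployed.
    with_sensor = [r for r in records if r["sensor_deployed"]]
    corroborated = sum(1 for r in with_sensor if r["runtime_corroborated"])
    return {
        "pct_non_unknown_reachability": 100.0 * known / len(records),
        "pct_runtime_corroboration": (
            100.0 * corroborated / len(with_sensor) if with_sensor else None
        ),
        "median_seconds_to_verdict": median(r["seconds_to_verdict"] for r in records),
    }


def meets_targets(kpis: Dict[str, Optional[float]]) -> Dict[str, bool]:
    """Compare KPI values against the targets in the table (strict bounds)."""
    runtime = kpis["pct_runtime_corroboration"]
    return {
        "pct_non_unknown_reachability": kpis["pct_non_unknown_reachability"] > 80.0,
        "pct_runtime_corroboration": runtime is not None and runtime > 50.0,
        "median_seconds_to_verdict": kpis["median_seconds_to_verdict"] < 5 * 60,
    }
```

Note that the targets are strict inequalities, so a value of exactly 80% non-UNKNOWN reachability does not meet the ">80%" target.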

## Risk Management

| Risk | Impact | Mitigation |
|------|--------|------------|
| Confidence model complexity | High | Start simple (3 factors), then iterate toward the 5-factor breakdown |
| Deep analysis performance | Medium | Progressive fidelity with timeouts |
| Evidence storage growth | Medium | Budget enforcement + tier pruning |
| API backward compatibility | Low | Versioned endpoints |
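
The "budget enforcement + tier pruning" mitigation could work roughly as follows. This is one possible policy, sketched under assumptions: tiers are modeled as integers where a higher number is pruned first, and within a tier the largest item goes first. The `EvidenceItem` fields and the `prune_to_budget` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EvidenceItem:
    digest: str
    size_bytes: int
    tier: int  # ASSUMPTION: 0 is retained longest; higher tiers are pruned first


def prune_to_budget(
    items: List[EvidenceItem], budget_bytes: int
) -> Tuple[List[EvidenceItem], List[EvidenceItem]]:
    """Split items into (kept, pruned) so that kept fits within the budget."""
    # Sort so the best pruning candidate (highest tier, then largest) is last.
    kept = sorted(items, key=lambda item: (item.tier, item.size_bytes))
    pruned: List[EvidenceItem] = []
    total = sum(item.size_bytes for item in kept)
    while kept and total > budget_bytes:
        victim = kept.pop()
        pruned.append(victim)
        total -= victim.size_bytes
    return kept, pruned
```

Pruning the largest low-priority items first frees the most space per deletion, though a real policy might also weigh evidence age or whether a signed verdict still references the item.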

## Definition of Done

Per the advisory, a release is "done" only if:

- [ ] Build produces OCI artifact with attached **signed verdict attestation**
- [ ] Each verdict is **explainable** (reason steps + proof pointers)
- [ ] Reachability evidence stored as **reproducible subgraph** (or explicitly UNKNOWN with reason)
- [ ] Replay verification reproduces same verdict with pinned inputs
- [ ] UX starts from vulnerabilities and links directly to proofs and audit export
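
The replay criterion can be sketched as a check that re-evaluates pinned inputs and compares both the input digest and the resulting verdict. Everything below is illustrative: `evaluate` is a toy stand-in policy, not the real policy engine, and the record layout and input fields are assumptions.

```python
import hashlib
import json
from typing import Any, Dict


def digest(obj: Any) -> str:
    """Hash a JSON-serializable object in a key-order-independent way."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def evaluate(inputs: Dict[str, Any]) -> str:
    # TOY POLICY (assumption): a deterministic placeholder for the real engine.
    if inputs["reachability"] == "UNKNOWN":
        return "INDETERMINATE"
    if inputs["reachability"] == "REACHABLE" and inputs["vex_status"] != "not_affected":
        return "FAIL"
    return "PASS"


def record_verdict(inputs: Dict[str, Any]) -> Dict[str, str]:
    """Store the verdict together with the digest of the inputs it used."""
    return {"inputs_digest": digest(inputs), "verdict": evaluate(inputs)}


def replay_matches(recorded: Dict[str, str], pinned_inputs: Dict[str, Any]) -> bool:
    """True only if the pinned inputs are identical and yield the same verdict."""
    if digest(pinned_inputs) != recorded["inputs_digest"]:
        return False  # inputs drifted, so the replay is not comparable
    return evaluate(pinned_inputs) == recorded["verdict"]
```

Checking the digest before re-evaluating separates two failure modes: inputs that changed since the original run, and an engine that is no longer deterministic for the same inputs.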

## References

- **Advisory**: `docs/product-advisories/archived/21-Dec-2025 - Designing Explainable Triage Workflows.md`
- **Sprint Summary**: `docs/implplan/SPRINT_7000_SUMMARY.md`
- **Individual Sprints**: `docs/implplan/SPRINT_7000_*.md`

## Revision History

| Date | Change | Author |
|------|--------|--------|
| 2025-12-22 | Initial implementation plan | Claude |