diff --git a/docs/qa/FULL_PRODUCT_DEEP_DIVE_20260316.md b/docs/qa/FULL_PRODUCT_DEEP_DIVE_20260316.md
new file mode 100644
index 000000000..4fb50b2b3
--- /dev/null
+++ b/docs/qa/FULL_PRODUCT_DEEP_DIVE_20260316.md
@@ -0,0 +1,152 @@
+# Full Product Deep Dive — Every Surface Evaluated
+
+**Date**: 2026-03-16
+**Method**: Walked through 28 product surfaces as a DevOps/Security engineer
+**Stack**: Fresh install, 63 containers, Harbor + GitHub App fixtures
+
+---
+
+## Surface-by-Surface Assessment
+
+### TIER 1: Excellent (Ship-ready, minor polish only)
+
+| # | Surface | Route | Assessment |
+|---|---------|-------|------------|
+| 1 | **Triage Artifact Workspace** | `/triage/artifacts/{name}` | 10/10 — evidence strip, reachability score, VEX decisions, policy gating, attestations, deterministic replay, evidence bundle download. Best vulnerability triage UX I've seen. |
+| 2 | **Integrations Hub** | `/setup/integrations` | 9/10 — "Suggested Setup Order" is perfect first-time guidance. Harbor + GitHub App wizards work end-to-end. Connection test confirms. |
+| 3 | **Advisory Source Catalog** | `/setup/integrations/advisory-vex-sources` | 9/10 — 75 sources across 13 categories. Category headers with descriptions. Toggle/enable per source. Check All with live progress (after our fix). Mirror context header. |
+| 4 | **Doctor Diagnostics** | `/ops/operations/doctor` | 9/10 — 90+ checks across 10 packs. Failed checks expand with likely causes, remediation steps, copy buttons, CLI verification command. Best ops diagnostics UX. |
+| 5 | **Data Integrity** | `/ops/operations/data-integrity` | 8/10 — "Data Trust Score" with feed freshness, SBOM pipeline, reachability, integrations, DLQ. Region + time window filters. "Impact: BLOCKING" when feeds are stale. Actionable. |
+| 6 | **Disposition / VEX** | `/security/disposition` | 8/10 — Providers, VEX Library, Conflicts, Issuer Trust tabs. Provider health with freshness + SLA. VEX source management. |
+| 7 | **Policy Decisioning Studio** | `/ops/policy/overview` | 8/10 — 7-tab shell (Overview, Packs, Governance, Simulation, VEX & Exceptions, Release Gates, Audit). Card-based navigation. "One operator shell" concept is clear. |
+| 8 | **Evidence Overview** | `/evidence/overview` | 8/10 — Operator/Auditor mode toggle. Find evidence by Release, Bundle Version, Environment, Approval Decision. Evidence-linked decision search. |
+| 9 | **Replay & Verify** | `/evidence/verify-replay` | 8/10 — Request replay by verdict ID or image reference. Replay runs table with status. Deterministic verification workflow. |
+| 10 | **Export Center** | `/evidence/exports` | 8/10 — Export profiles (StellaBundle OCI, tar.gz). Create Profile, Run Now, Edit, Delete. "Signed audit pack with DSSE envelope, Rekor tile receipt, and replay log." |
+
+### TIER 2: Good (Works, needs some UX improvements)
+
+| # | Surface | Route | Assessment |
+|---|---------|-------|------------|
+| 11 | **Security Posture** | `/security` | 6/10 — Cards (Risk, Blockers, VEX, SBOM, Reachability, Unknowns) are correct. But shows 0 findings when Triage has data (different data path). No scan trigger. |
+| 12 | **SBOM Lake** | `/security/sbom-lake` | 7/10 — Supplier, license, vulnerability analytics. Attestation coverage. Environment + severity + time filters. "Export backlog CSV". Rich but data-dependent. |
+| 13 | **Reachability** | `/security/reachability` | 7/10 — Coverage, witnesses, proof-of-exposure. "Healthy assets 1, 73% fleet coverage", "Stale facts 1, 0 stale observations", "Missing sensors 1, 63% sensor coverage". Good structure, data is from seed. |
+| 14 | **Supply Chain Data** | `/security/supply-chain-data` | 7/10 — "Coverage/Freshness: WARN, No SBOM components in selected scope." SBOM Viewer, SBOM Graph, SBOM Lake, Reachability, Coverage/Unknowns tabs. Empty but structured. |
+| 15 | **Deployments** | `/releases/deployments` | 7/10 — Table with Deployment ID, Release, Environment, Status, Started, Duration, Initiated By. Seed data shows RUNNING/SUCCESS/FAILED statuses. "View" action per deployment. |
+| 16 | **Hotfixes** | `/releases/hotfixes` | 7/10 — "Dedicated queue for expedited release-control promotions." Shows `platform-bundle@1.3.1-hotfix1` targeting prod-eu with Critical urgency, Gates WARN. "Review" button. |
+| 17 | **Operations Hub** | `/ops/operations` | 7/10 — Consolidated shell with tabs: Overview, Data Integrity, Jobs & Queues, Health & SLO, Feeds & Airgap, Other. "Run Doctor", "Audit Log", "Export Ops Report" actions. |
+| 18 | **Feeds & Airgap** | `/ops/operations/feeds-airgap` | 7/10 — Feed mirror freshness, airgap bundle workflows, version lock controls. "Mirrors 2, Synced 2, Stale 0, Storage 12.4 GB." Three tabs: Feed Mirrors, Airgap Bundles, Version Locks. |
+| 19 | **Promotions** | `/releases/promotions` | 7/10 — "Bundle-version anchored release promotions with decision context." Search + status/environment filters. "No promotions yet" with explanation. "Create Promotion" button. |
+| 20 | **Identity & Access** | `/setup/identity-access` | 7/10 — Users, Roles, OAuth Clients, API Tokens, Tenants tabs. "Least privilege is selected first." User creation form. |
+| 21 | **Trust & Signing** | `/setup/trust-signing` | 7/10 — 9 tabs: Overview, Signing Keys, Trusted Issuers, Certificates, Watchlist, Audit Log, Air-Gap, Incidents, Analytics. Comprehensive but empty body on first load. |
+| 22 | **Tenant Branding** | `/setup/tenant-branding` | 7/10 — Application Title, Logo Upload, Favicon Upload, Theme Tokens (CSS custom properties). "Apply Changes" button. Working customization. |
+| 23 | **Usage & Limits** | `/setup/usage` | 7/10 — Shows: Scans 6,500/10,000, Storage 42GB/100GB, Evidence Packets 2,800/10,000, API Requests 15,000/100,000. "Configure Quotas" button. Data is seed/compatibility but structure is right. |
+
+### TIER 3: Needs Work (Functional but confusing or broken)
+
+| # | Surface | Route | Assessment |
+|---|---------|-------|------------|
+| 24 | **Dashboard** | `/mission-control/board` | 5/10 — Setup guide works for empty state. But with seed environments, shows honest zeros for everything — no insight into what to do. Needs the 3-column redesign with real API data. |
+| 25 | **Security Reports** | `/security/reports` | 5/10 — Three tabs: Risk Report, VEX Ledger, Evidence Export. But Risk Report just shows the Security Posture page embedded, not a downloadable report. No "Generate PDF" or "Export" action on the report itself. |
+| 26 | **Release Health** | `/releases/health` | 5/10 — "Select an environment from Mission Control or Topology to view release health." No runs, no findings, no decision capsules in current scope. Needs data to be useful. |
+| 27 | **Unknowns** | `/security/unknowns` | 4/10 — "Unknowns data is unavailable. The scanner unknowns APIs returned an error. Retry." Error state on fresh install — scanner API returns 404. |
+| 28 | **Scheduled Jobs / JobEngine** | `/ops/operations/jobengine` | 5/10 — "Total Jobs 0, Failed Jobs 0, Quota Policies 0." Empty but structured. Tabs: Jobs, Scheduler Runs, Dead-Letter. |
+
+---
+
+## Cross-Cutting UX Findings
+
+### F-NAV: Navigation Terminology Doesn't Match User Mental Model
+
+| User Thinks | Product Says | Where |
+|-------------|-------------|-------|
+| "Vulnerabilities" | "Triage" | Sidebar |
+| "Scan my image" | (no entry point) | Missing entirely |
+| "CVE report" | "Security Reports → Risk Report" (embeds posture page) | Security Reports |
+| "Export compliance report" | "Export Center" (under Audit → Export Center) | Evidence section |
+| "Deploy to production" | "Promotions" (under Releases) | Releases section |
+| "See what's deployed" | "Deployments" | Correct! |
+| "Air-gap setup" | "Feeds & Airgap" (under Operations) | Operations |
+| "Trust configuration" | "Trust & Signing" (under Setup) | Correct! |
+
+### F-DATA: Some Pages Show Seed Data, Others Show Real Empty
+
+| Page | Data Source | Issue |
+|------|-----------|-------|
+| Security Posture | Real (empty) | Shows 0 — correct but disconnected from Triage |
+| Triage Workspace | Seed (1 artifact) | Shows CVE-2023-38545 — useful but is it real? |
+| Deployments | Seed (5 deployments) | Shows DEP-2026-050 etc. — looks like real data but isn't |
+| Hotfixes | Seed (1 hotfix) | platform-bundle@1.3.1-hotfix1 |
+| Usage & Limits | Compatibility (fake) | 6,500/10,000 scans — looks real |
+| Reachability | Seed (coverage %) | 73% fleet coverage — from seed |
+| SBOM Lake | Empty | "No SBOM components in selected scope" — honest |
+
+**The inconsistency is the real problem.** Some pages show demo data that looks operational, others show honest empty state. A user can't tell what's real.
+
+### F-FLOW: No Clear "Happy Path" Through the Product
+
+A security engineer's expected flow:
+1. Connect registry ✓ (Integrations Hub guides this)
+2. Scan an image ✗ (no scan trigger exists)
+3. See vulnerabilities ✗ (Triage has seed data, but no real scan)
+4. Triage findings ✓ (if you find the Triage workspace)
+5. Create a release ✓ (wizard works but uses mock registry)
+6. Gate the release ✗ (promotions exist but nothing connects scan → gate → release)
+7. Get audit proof ✓ (evidence workspace is excellent)
+
+Steps 2, 3, and 6 are broken. The product has the individual pieces but the end-to-end flow isn't connected.
+
+### F-SEARCH: Command Palette Doesn't Index Key Actions
+
+Searched for these terms in the command palette — all returned 0 results:
+- "scan" — no results
+- "vulnerability" — no results
+- "CVE" — no results
+- "deploy" — no results
+- "promote" — no results
+
+The command palette's quick actions (from the `>` prefix) include some actions, but the free-text search doesn't index page names, features, or common security terms.
+
+### F-EMPTY: Best Empty State vs Worst Empty State
+
+**Best**: Integrations Hub — "Suggested Setup Order" with linked steps
+**Best**: Promotions — "No promotions yet. Promotions move a release version from one environment to another through a policy-gated approval pipeline."
+**Best**: Doctor — "No Diagnostics Run Yet. Click Quick Check to run."
+
+**Worst**: Unknowns — "Unknowns data is unavailable. The scanner unknowns APIs returned an error." (API 404)
+**Worst**: Notifications — empty page, no content visible
+**Worst**: Trust & Signing — 9 tabs visible but body is empty on first load
+
+---
+
+## Top 10 UX Issues (Severity-Ordered)
+
+| # | Issue | Severity | Category |
+|---|-------|----------|----------|
+| D1 | No "Scan" entry point anywhere | CRITICAL | Missing feature |
+| D2 | Triage = "Vulnerabilities" naming mismatch | HIGH | Navigation |
+| D3 | Security Posture shows 0 when Triage has findings | HIGH | Data consistency |
+| D4 | End-to-end flow broken (scan → gate → release) | HIGH | Architecture |
+| D5 | Seed data mixed with real empty state | HIGH | Trust |
+| D6 | Command palette doesn't index security terms | MEDIUM | Search |
+| D7 | Security Reports = embedded posture, not exportable | MEDIUM | Feature gap |
+| D8 | Record Decision dialog outside viewport | MEDIUM | Accessibility |
+| D9 | Unknowns page shows API error | MEDIUM | Error handling |
+| D10 | Notifications page empty on load | LOW | Empty state |
+
+---
+
+## What I'd Tell the Product Team
+
+**The product has world-class depth.** The evidence verification strip (7/7 with Rekor + DSSE), reachability scoring, deterministic replay, and per-gate policy verdicts are unique in the market. No competitor I know shows "reachability score: 0.78" with call-stack proof next to a VEX decision form with audit hash.
+
+**The product has world-class breadth.** 28 surfaces covering releases, security, evidence, policy, operations, trust, integrations, topology, notifications, offline kits, SBOM analytics, and AI-assisted triage. Each surface is thoughtfully designed.
+
+**What stops adoption:**
+1. Nobody finds the killer feature (Triage is buried, no scan trigger)
+2. The happy path is disconnected (pieces exist but don't chain)
+3. You can't tell what's real vs demo (inconsistent data sources)
+
+**Three strategic fixes:**
+1. **Make "Scan Image" the first thing a security engineer sees** — dashboard CTA, sidebar entry, command palette action
+2. **Connect the chain**: Registry → Scan → Findings → Gate → Release → Evidence — each step should link to the next
+3. **Eliminate all demo data** — every number must be real or honestly "0"