
Dashboard Redesign Proposal — Stella Ops Mission Board

Date: 2026-03-16
Author: First-time user audit + product analysis
Status: Proposal

What Stella Ops Actually Is (The Mental Model)

Stella Ops is a release control plane that answers three questions for every deployment:

Is it safe? — Vulnerabilities, SBOM health, reachability evidence, VEX dispositions
Is it approved? — Policy gates, human approvals, compliance evidence
Is it working? — Deployment health, service connectivity, feed freshness

The dashboard must reflect these three pillars. Currently it's a single vertical scroll of hardcoded fake data that mixes all three with no hierarchy.

Current Dashboard Problems

P1: Everything is hardcoded

summary signal: { activePromotions: 3, blockedPromotions: 1, ... } — fake
resolveStatusSeed(): generates fake metrics by environment type (dev=healthy, staging=degraded, prod+us-east=blocked) — deterministic lies
reachabilityStats: { bCoverage: 72, iCoverage: 88, rCoverage: 61 } — fake
nightlyOpsSignals: 4 hardcoded items — fake
Alerts section: 3 hardcoded HTML <li> items — fake
Activity section: 3 hardcoded HTML cards — fake
Zero API calls to real backends

P2: Layout has no hierarchy

Current vertical order: Summary strip → Environment grid → Risk table → 3-card row (SBOM + Reachability + Ops Signals) → Alerts → Activity → Domain nav.

This is 7 sections stacked vertically. A user scrolls through 3+ screens of content with no visual priority. The most critical information (am I blocked?) competes with the least actionable (domain navigation links).

P3: No data source distinction

The dashboard shows "5 critical findings" and "blocked", but the security posture page shows "0 findings". The user can't tell which is real. There is no "last updated" timestamp, no data source indicator, and no "demo data" badge.
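
One way to make the source visible is to wrap every figure in a small provenance envelope. The sketch below is illustrative only: `Provenance`, `Sourced`, and `provenanceLabel` are hypothetical names, not existing Stella Ops types, and the label wording is an assumption.

```typescript
// Hypothetical provenance envelope; none of these names come from the Stella Ops codebase.
type Provenance = "live" | "cached" | "demo" | "none";

interface Sourced<T> {
  value: T | null;
  provenance: Provenance;
  fetchedAt: Date | null; // null when nothing has been fetched yet
}

// Builds the small caption shown next to a dashboard figure.
function provenanceLabel<T>(data: Sourced<T>, now: Date): string {
  if (data.provenance === "none" || data.fetchedAt === null) return "No data yet";
  if (data.provenance === "demo") return "Demo data";
  const minutes = Math.floor((now.getTime() - data.fetchedAt.getTime()) / 60000);
  const age = minutes < 1 ? "just now" : `${minutes}m ago`;
  return data.provenance === "cached" ? `Cached, updated ${age}` : `Updated ${age}`;
}
```

With an envelope like this, a card cannot render a number without also saying where it came from and how old it is.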

P4: Duplicate information

Environment grid AND risk table show the same environments with the same metrics
SBOM card recalculates stats from the same environment data shown in the grid
Reachability percentages are shown per-environment (B/I/R column) AND as aggregate (Reachability card)

Proposed Layout: 3-Column Mission Board

┌─────────────────────────────────────────────────────────────────────┐
│ Dashboard — Mission Board for [Demo Production]          [Refresh] │
│ Last updated: 15 Mar 2026, 23:15 UTC                              │
├───────────────────────┬─────────────────────────────────────────────┤
│                       │                                             │
│  SECURITY POSTURE     │  ENVIRONMENTS & ACTIONS                     │
│  (1/3 width)          │  (2/3 width)                                │
│                       │                                             │
│  ┌─────────────────┐  │  ┌─────────────────────────────────────────┐│
│  │ VULNERABILITY    │  │  │ PROMOTION PIPELINE                     ││
│  │ SUMMARY          │  │  │                                         ││
│  │                  │  │  │ [env cards in promotion order:          ││
│  │ Critical:  12    │  │  │  Dev → Stage → Prod per region]        ││
│  │ High:      34    │  │  │                                         ││
│  │ Medium:    89    │  │  │  Blocked: prod-us-east (5 crit)        ││
│  │ Low:      156    │  │  │  Degraded: staging (stale SBOM)        ││
│  │                  │  │  │  Healthy: dev, prod-eu-west             ││
│  │ [severity donut  │  │  │                                         ││
│  │  or bar chart]   │  │  └─────────────────────────────────────────┘│
│  │                  │  │                                             │
│  │ Reachable: 9     │  │  ┌─────────────────────────────────────────┐│
│  │ Unreachable: 47  │  │  │ NEEDS YOUR ATTENTION                   ││
│  │ Unknown: 23      │  │  │                                         ││
│  │                  │  │  │ ⚠ 3 approvals blocked (evidence stale) ││
│  └─────────────────┘  │  │ ⚠ 2 waivers expiring in 24h            ││
│                       │  │ ⚠ 1 promotion blocked by policy gate    ││
│  ┌─────────────────┐  │  │ 🔴 Feed freshness degraded              ││
│  │ SBOM HEALTH      │  │  │                                         ││
│  │                  │  │  │ [each item links to the action page]    ││
│  │ Components: 247  │  │  └─────────────────────────────────────────┘│
│  │ Fresh: 231       │  │                                             │
│  │ Stale: 12        │  │  ┌─────────────────────────────────────────┐│
│  │ Missing: 4       │  │  │ RECENT ACTIVITY (real events)           ││
│  │                  │  │  │                                         ││
│  │ B/I/R Coverage   │  │  │ • admin sealed "API Gateway v2.1"      ││
│  │ B: 72%           │  │  │ • Policy gate blocked prod-us-east     ││
│  │ I: 88%           │  │  │ • NVD feed synced (142 new advisories) ││
│  │ R: 61%           │  │  │ • Doctor check: 1 fail, 9 warn         ││
│  └─────────────────┘  │  │                                         ││
│                       │  │ [live event stream, not static cards]    ││
│  ┌─────────────────┐  │  └─────────────────────────────────────────┘│
│  │ ADVISORY FEEDS   │  │                                             │
│  │                  │  │                                             │
│  │ Sources: 55/75   │  │                                             │
│  │ Healthy: 55      │  │                                             │
│  │ Failed: 18       │  │                                             │
│  │ Last sync: 2m    │  │                                             │
│  │                  │  │                                             │
│  │ [Configure]      │  │                                             │
│  └─────────────────┘  │                                             │
│                       │                                             │
├───────────────────────┴─────────────────────────────────────────────┤
│ PLATFORM HEALTH (full width footer bar)                             │
│                                                                     │
│ Services: 63/63 ✓  │ DB: healthy │ Events: CONNECTED │ Doctor: 7/1/1│
│ Feed: Live         │ Evidence: ON │ Offline: OK       │ DLQ: 3      │
└─────────────────────────────────────────────────────────────────────┘

Column 1 (1/3): Security Posture At-a-Glance

Purpose: Answer "Is my estate safe?" without leaving the dashboard.

Section 1A: Vulnerability Summary

Data source: GET /api/v1/findings/summary (findings ledger) or GET /api/v1/scanner/summary (scanner service)

What to show:

Severity breakdown: Critical / High / Medium / Low counts
Reachability breakdown: Reachable / Unreachable / Unknown counts
Donut chart or horizontal stacked bar colored by severity
Trend arrow (↑/↓/→) compared to 24h ago
Link: "Open Findings" → /security/triage

Why this matters:

This is the #1 thing a security auditor looks at
Currently the dashboard shows "5 critical" per environment but it's all fake
Real data from the findings ledger makes this trustworthy
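
The trend arrow can be derived from two severity snapshots. This is a minimal sketch that assumes the summary endpoint can supply counts for now and for 24 hours ago; the `SeverityCounts` shape and the function names are hypothetical, and comparing totals (rather than, say, criticals only) is just one possible rule.

```typescript
// Hypothetical response shape; the real findings summary payload may differ.
interface SeverityCounts {
  critical: number;
  high: number;
  medium: number;
  low: number;
}

type Trend = "↑" | "↓" | "→";

function totalFindings(c: SeverityCounts): number {
  return c.critical + c.high + c.medium + c.low;
}

// Compares the current total with the total from 24h ago.
function trendArrow(current: SeverityCounts, dayAgo: SeverityCounts): Trend {
  const delta = totalFindings(current) - totalFindings(dayAgo);
  if (delta > 0) return "↑";
  if (delta < 0) return "↓";
  return "→";
}
```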

Section 1B: SBOM Health

Data source: GET /api/v1/scanner/sbom/summary or computed from environment scan state

What to show:

Total components tracked across all environments
Freshness breakdown: Fresh / Stale / Missing
B/I/R reachability coverage bars (the existing bar chart, but from real data)
Link: "View Supply Chain" → /security/supply-chain-data
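
The freshness breakdown reduces to three counts and their shares of the total. The sketch below assumes the summary returns raw counts; `SbomFreshness` and `freshnessShare` are hypothetical names.

```typescript
// Hypothetical shape for the SBOM summary counts.
interface SbomFreshness {
  fresh: number;
  stale: number;
  missing: number;
}

interface FreshnessShare {
  total: number;
  freshPct: number;
  stalePct: number;
  missingPct: number;
}

// Rounds each share independently, so the three values may not sum to exactly 100.
function freshnessShare(f: SbomFreshness): FreshnessShare {
  const total = f.fresh + f.stale + f.missing;
  if (total === 0) return { total: 0, freshPct: 0, stalePct: 0, missingPct: 0 };
  const pct = (n: number): number => Math.round((n / total) * 100);
  return { total, freshPct: pct(f.fresh), stalePct: pct(f.stale), missingPct: pct(f.missing) };
}
```

Returning zeros for an empty estate keeps the card from dividing by zero on a fresh install.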

Section 1C: Advisory Feed Status

Data source: GET /api/v1/advisory-sources/status (Concelier source management)

What to show:

Sources active: X of Y enabled
Healthy / Failed count
Last sync timestamp
Link: "Configure Sources" → /setup/integrations/advisory-vex-sources

Why this matters:

Advisory freshness directly affects vulnerability accuracy
If feeds are stale, findings are stale, decisions are wrong
The audit found only 55 of 75 sources healthy, and nothing on the dashboard surfaced it
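
A feed card needs a rule for when feeds count as stale. The sketch below assumes a status payload with counts and a last sync time; the `FeedStatus` shape and the 60-minute default threshold are assumptions, not values taken from Concelier.

```typescript
// Hypothetical status shape; field names are illustrative.
interface FeedStatus {
  enabled: number;
  total: number;
  healthy: number;
  failed: number;
  lastSync: Date | null;
}

type FeedFreshness = "Live" | "Stale";

// The threshold is an assumption; tune it to the real sync schedule.
function feedFreshness(status: FeedStatus, now: Date, staleAfterMinutes = 60): FeedFreshness {
  if (status.lastSync === null) return "Stale";
  const ageMinutes = (now.getTime() - status.lastSync.getTime()) / 60000;
  return ageMinutes <= staleAfterMinutes ? "Live" : "Stale";
}

function feedSummary(status: FeedStatus): string {
  return `${status.enabled} of ${status.total} enabled, ${status.healthy} healthy, ${status.failed} failed`;
}
```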

Column 2 (2/3): Environments & Actions

Purpose: Answer "What needs my attention?" and "What's the deployment state?"

Section 2A: Promotion Pipeline

Data source: GET /api/v1/platform/context/environments (already loads via PlatformContextStore) + real scan/promotion state

What to show:

Environment cards in promotion order (Dev → Stage → Prod, grouped by region)
Each card: name, region, deploy status badge, SBOM freshness, critical findings count, pending approvals
Blocked environments highlighted at top with red border
Actions per card: Detail, Findings, Promote, Approve
Real metrics from scan/promotion APIs, NOT resolveStatusSeed()

Layout: Same card grid as current, but driven by real data.
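
The card ordering described above can be expressed as a single comparator. This sketch reads "blocked at top, then promotion order grouped by region" as: blocked first, then by region, then Dev → Stage → Prod. The `EnvCard` shape and all names are hypothetical.

```typescript
type Stage = "dev" | "stage" | "prod";
type DeployStatus = "blocked" | "degraded" | "healthy";

// Hypothetical card model; real cards would carry more fields.
interface EnvCard {
  name: string;
  region: string;
  stage: Stage;
  status: DeployStatus;
}

const STAGE_RANK: Record<Stage, number> = { dev: 0, stage: 1, prod: 2 };

// Returns a new array: blocked cards first, then region, then promotion order.
function orderCards(cards: readonly EnvCard[]): EnvCard[] {
  return [...cards].sort((a, b) => {
    const blockedA = a.status === "blocked" ? 0 : 1;
    const blockedB = b.status === "blocked" ? 0 : 1;
    if (blockedA !== blockedB) return blockedA - blockedB;
    const byRegion = a.region.localeCompare(b.region);
    if (byRegion !== 0) return byRegion;
    return STAGE_RANK[a.stage] - STAGE_RANK[b.stage];
  });
}
```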

Section 2B: Needs Your Attention

Data source: Multiple — approvals API, waivers API, promotions API, notifications API

What to show:

Actionable items only — things the user MUST do:
- Pending approvals (with count and reason)
- Expiring waivers (with countdown)
- Blocked promotions (with blocking reason: policy gate, evidence freshness, etc.)
- Feed degradation alerts
Each item is a clickable link to the action page
NOT static HTML — real data from APIs

Why this matters:

The current "Alerts" section is 3 hardcoded <li> elements
A real operator needs to see: "You have 3 things to do today"
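
Because this section draws on several APIs, it helps to normalise their results into one list before rendering. The sketch below assumes each API has already been reduced to a count; the types are hypothetical, and every route except `/setup/integrations/advisory-vex-sources` is a placeholder.

```typescript
type AttentionKind = "approval" | "waiver" | "promotion" | "feed";

interface AttentionItem {
  kind: AttentionKind;
  message: string;
  link: string;
}

// Hypothetical input: counts already extracted from the individual APIs.
interface AttentionCounts {
  pendingApprovals: number;
  expiringWaivers: number;
  blockedPromotions: number;
  feedDegraded: boolean;
}

// Placeholder routes, except the advisory sources route named in this proposal.
const ATTENTION_LINKS: Record<AttentionKind, string> = {
  approval: "/hypothetical/approvals",
  waiver: "/hypothetical/waivers",
  promotion: "/hypothetical/promotions",
  feed: "/setup/integrations/advisory-vex-sources",
};

function plural(n: number, noun: string): string {
  return `${n} ${noun}${n === 1 ? "" : "s"}`;
}

// Emits only actionable items; zero counts produce nothing.
function buildAttentionItems(c: AttentionCounts): AttentionItem[] {
  const items: AttentionItem[] = [];
  if (c.pendingApprovals > 0) {
    items.push({ kind: "approval", message: `${plural(c.pendingApprovals, "approval")} pending`, link: ATTENTION_LINKS.approval });
  }
  if (c.expiringWaivers > 0) {
    items.push({ kind: "waiver", message: `${plural(c.expiringWaivers, "waiver")} expiring in 24h`, link: ATTENTION_LINKS.waiver });
  }
  if (c.blockedPromotions > 0) {
    items.push({ kind: "promotion", message: `${plural(c.blockedPromotions, "promotion")} blocked`, link: ATTENTION_LINKS.promotion });
  }
  if (c.feedDegraded) {
    items.push({ kind: "feed", message: "Feed freshness degraded", link: ATTENTION_LINKS.feed });
  }
  return items;
}
```

An empty result is a valid state: the section can then say there is nothing to do rather than fall back to static content.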

Section 2C: Recent Activity (Live Stream)

Data source: GET /api/v1/timeline/events or the WebSocket event stream (the topbar already shows "Events: CONNECTED")

What to show:

Real events, most recent first:
- Release sealed/promoted/rolled back
- Policy gate pass/block
- Advisory feed sync (with advisory count)
- Doctor check results
- User actions (approval, waiver, exception)
Live updates via the event stream (already connected)
Maximum 10 items, with "View full activity" link → /evidence/audit-log

Why this matters:

Current "Recent Activity" is 3 static cards with text descriptions — not actual activity
The topbar already shows "Events: CONNECTED" — the live stream is available
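
The "most recent first, maximum 10" rule can live in a small pure function that the stream handler calls for every incoming event. This is a sketch under assumptions: the `ActivityEvent` shape is hypothetical, and it presumes each event has a unique id and a parseable ISO 8601 timestamp.

```typescript
// Hypothetical event shape; the real timeline payload may differ.
interface ActivityEvent {
  id: string;
  occurredAt: string; // ISO 8601 timestamp
  summary: string;
}

// Adds one event, drops any older copy with the same id, keeps newest first, caps the list.
function pushEvent(
  buffer: readonly ActivityEvent[],
  incoming: ActivityEvent,
  max = 10,
): ActivityEvent[] {
  const withoutDuplicate = buffer.filter((e) => e.id !== incoming.id);
  return [incoming, ...withoutDuplicate]
    .sort((a, b) => Date.parse(b.occurredAt) - Date.parse(a.occurredAt))
    .slice(0, max);
}
```

Sorting on every push tolerates events that arrive out of order, which is cheap at this list size.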

Footer Bar: Platform Health

Data source: GET /api/v1/platform/health or GET /api/v1/doctor/last-run/summary

What to show (single horizontal bar, always visible):

Services: 63/63 healthy (or X unhealthy)
DB: healthy/degraded
Events: Connected/Degraded
Doctor: 7 pass / 1 warn / 1 fail (link to /ops/operations/doctor)
Feed: Live/Stale
Evidence: ON/OFF
Offline: OK/Sealed
DLQ: N items (link to /ops/operations/dead-letter)

Why this matters:

The topbar already shows some of these (Events, Policy, Evidence, Feed, Offline)
But the dashboard should also show system health — an operator wants to know "is the platform itself healthy?" at a glance
Doctor results from the last run are critical operational context
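
The footer labels follow directly from a few counts. The sketch below assumes the Doctor summary exposes pass, warn, and fail totals; `DoctorRun` and the helper names are hypothetical.

```typescript
// Hypothetical shape for the last Doctor run summary.
interface DoctorRun {
  pass: number;
  warn: number;
  fail: number;
}

type HealthLevel = "ok" | "warn" | "fail";

function doctorLabel(run: DoctorRun): string {
  return `${run.pass} pass / ${run.warn} warn / ${run.fail} fail`;
}

// The worst result wins: any failure outranks any warning.
function doctorLevel(run: DoctorRun): HealthLevel {
  if (run.fail > 0) return "fail";
  if (run.warn > 0) return "warn";
  return "ok";
}

// Shows the healthy ratio when all is well, otherwise the unhealthy count.
function servicesLabel(healthy: number, total: number): string {
  const unhealthy = total - healthy;
  return unhealthy === 0 ? `${healthy}/${total} healthy` : `${unhealthy} unhealthy`;
}
```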

What to Remove

Risk table — duplicate of the environment grid. Merge into a single view.
Domain navigation links (Release Runs, Security & Risk, Platform, Evidence, Platform Setup) — the sidebar already provides this. Dashboard space is premium.
Activity section (3 static cards) — replace with real live activity stream
All hardcoded data — every number must come from an API or show "No data yet"

Empty State (Fresh Install)

When the dashboard has no real data (no environments, no scans, no releases):

┌─────────────────────────────────────────────────────────┐
│  Welcome to Stella Ops                                   │
│                                                          │
│  Let's set up your release control plane.                │
│                                                          │
│  ① Connect a registry    [Setup Integrations →]          │
│  ② Define environments   [Topology Wizard →]             │
│  ③ Scan your first image [Start Scan →]                  │
│  ④ Create a release      [Create Release →]              │
│                                                          │
│  ─────────────────────────────────────────               │
│  Platform Health: 63/63 services ✓                       │
│  Advisory Sources: 55/75 healthy                          │
│  Doctor: Run diagnostics → [Quick Check]                 │
└─────────────────────────────────────────────────────────┘

The empty state should guide the user through setup, not show fake crisis data.
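
The switch between the welcome guide and the full board can be a single predicate over a few counts. This is a sketch under assumptions: `InstallCounts` is a hypothetical shape, and the completion rule for each step (a count greater than zero) is a guess at what "done" means.

```typescript
// Hypothetical counts gathered from the real APIs.
interface InstallCounts {
  registries: number;
  environments: number;
  scans: number;
  releases: number;
}

// Mirrors the condition above: no environments, no scans, no releases.
function isFreshInstall(c: InstallCounts): boolean {
  return c.environments === 0 && c.scans === 0 && c.releases === 0;
}

const SETUP_STEPS: { label: string; done: (c: InstallCounts) => boolean }[] = [
  { label: "Connect a registry", done: (c) => c.registries > 0 },
  { label: "Define environments", done: (c) => c.environments > 0 },
  { label: "Scan your first image", done: (c) => c.scans > 0 },
  { label: "Create a release", done: (c) => c.releases > 0 },
];

// Returns the first incomplete step, or null when setup is finished.
function nextSetupStep(c: InstallCounts): string | null {
  const step = SETUP_STEPS.find((s) => !s.done(c));
  return step ? step.label : null;
}
```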

Data Sources Required

| Dashboard Section | API Endpoint | Service |
|---|---|---|
| Vulnerability Summary | `GET /api/v1/findings/summary` | Findings Ledger |
| SBOM Health | `GET /api/v1/scanner/sbom/summary` | Scanner |
| B/I/R Reachability | `GET /api/v1/reachgraph/coverage` | ReachGraph |
| Advisory Feed Status | `GET /api/v1/advisory-sources/status` | Concelier |
| Environment Cards | `GET /api/v1/platform/context` | Platform (exists) |
| Environment Scan State | `GET /api/v1/scanner/environments/summary` | Scanner |
| Pending Approvals | `GET /api/v1/approvals?status=pending` | JobEngine |
| Expiring Waivers | `GET /api/v1/exceptions?expiresWithin=24h` | Policy |
| Blocked Promotions | `GET /api/v1/promotions?status=blocked` | JobEngine |
| Recent Activity | `GET /api/v1/timeline/events?limit=10` | Timeline |
| Platform Health | `GET /api/v1/doctor/last-run/summary` | Platform/Doctor |
| Service Count | `GET /api/v1/platform/health` | Platform |
| DLQ Count | `GET /api/v1/jobengine/dead-letter/summary` | JobEngine |

Many of these endpoints already exist (Platform context, advisory sources status, doctor, dead-letter, jobengine). Some may need summary/aggregation endpoints.
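
Keeping the endpoint paths in one typed map makes it easy to see which sections are wired and to swap in aggregation endpoints later. The paths below are copied from the table; the constant name, the keys, and `buildUrl` are hypothetical, and whether each endpoint exists in this form still has to be confirmed against the services.

```typescript
// Paths as listed in this proposal; keys are illustrative names.
const DASHBOARD_ENDPOINTS = {
  vulnerabilitySummary: "/api/v1/findings/summary",
  sbomHealth: "/api/v1/scanner/sbom/summary",
  reachabilityCoverage: "/api/v1/reachgraph/coverage",
  advisoryFeedStatus: "/api/v1/advisory-sources/status",
  environmentCards: "/api/v1/platform/context",
  environmentScanState: "/api/v1/scanner/environments/summary",
  pendingApprovals: "/api/v1/approvals?status=pending",
  expiringWaivers: "/api/v1/exceptions?expiresWithin=24h",
  blockedPromotions: "/api/v1/promotions?status=blocked",
  recentActivity: "/api/v1/timeline/events?limit=10",
  platformHealth: "/api/v1/doctor/last-run/summary",
  serviceCount: "/api/v1/platform/health",
  dlqCount: "/api/v1/jobengine/dead-letter/summary",
} as const;

type DashboardEndpoint = keyof typeof DASHBOARD_ENDPOINTS;

// Joins a base URL and an endpoint path without doubling the slash.
function buildUrl(base: string, endpoint: DashboardEndpoint): string {
  return base.replace(/\/+$/, "") + DASHBOARD_ENDPOINTS[endpoint];
}
```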

Implementation Approach

Phase 1: Honest Empty State (S effort)

Replace hardcoded data with API calls that can return empty
When empty: show the welcome/setup guide instead of fake data
When real data exists: show real data

Phase 2: 3-Column Layout (M effort)

Restructure the template from a vertical scroll to a 3-column grid (left column spans 1/3, right column spans 2/3)
Left column: security posture cards
Right column: environments + actions + activity
Footer: platform health bar

Phase 3: Real API Wiring (L effort)

Wire each section to its real API endpoint
Add loading skeletons per section
Add "last updated" timestamps
Handle API errors gracefully (show error state per section, not whole page)
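
Per-section error handling is easier when each section carries its own state. The sketch below is one possible shape, not an existing pattern in the codebase: `SectionState`, `Settled`, and `toSectionState` are hypothetical. `Settled` mirrors the structure of the results returned by `Promise.allSettled`, so those results can be passed in directly.

```typescript
// Each dashboard section is in exactly one of these states.
type SectionState<T> =
  | { status: "loading" }
  | { status: "empty" }
  | { status: "loaded"; data: T; updatedAt: Date }
  | { status: "error"; message: string };

// Structurally compatible with the entries returned by Promise.allSettled.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

// Maps one settled request to a section state without affecting other sections.
function toSectionState<T>(
  result: Settled<T>,
  now: Date,
  isEmpty: (value: T) => boolean,
): SectionState<T> {
  if (result.status === "rejected") {
    const message = result.reason instanceof Error ? result.reason.message : String(result.reason);
    return { status: "error", message };
  }
  if (isEmpty(result.value)) return { status: "empty" };
  return { status: "loaded", data: result.value, updatedAt: now };
}
```

A failed request then degrades only its own card, and the `updatedAt` field supplies the "last updated" timestamp for that section.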

Phase 4: Live Activity Stream (M effort)

Replace static activity cards with real event stream
Use the existing WebSocket connection (Events: CONNECTED)
Show 10 most recent events with live updates

Files to Modify

| File | Change |
|---|---|
| `src/Web/StellaOps.Web/src/app/features/dashboard-v3/dashboard-v3.component.ts` | Complete rewrite of template + data sources |
| `src/Web/StellaOps.Web/src/app/core/api/` | Add dashboard summary API clients |
| `src/Platform/StellaOps.Platform.WebService/Endpoints/` | Add dashboard aggregation endpoint |
| `src/Web/StellaOps.Web/src/app/features/dashboard-v3/dashboard-v3.component.spec.ts` | Update tests |
