git.stella-ops.org/docs/modules/ui/v2-rewire/pack-16.md
2026-02-18 23:03:07 +02:00

## Pack 16 — Dashboard upgrade: **SBOM findings + critical reachable by env**, **per-env deploy+SBOM status**, and **Nightly Ops/Data Integrity** signals (wired, not duplicated)
This pack keeps the re-org intact (release-centric). It upgrades the **Dashboard (formerly “Control Plane”)** so the operator immediately sees:
* **“No issues” vs “X environments with critical reachable issues”** (and which envs)
* **Per-environment status** that includes **deploy/runtime** *and* **image SBOM status**
* **Nightly jobs + data freshness** issues that directly affect gating/approvals (CVE feeds, SBOM rescan, integration health)
* **Hybrid reachability** coverage (Build / Image / Runtime) as **2nd-class** signal on the dashboard
---
# 16.1 Dashboard navigation graph (Mermaid)
```mermaid
flowchart TD
DASH["Dashboard<br/>(formerly: Control Plane)"] --> REL[Releases]
DASH --> APPR[Approvals]
DASH --> DEPLOY[Deployments / Promotion Runs]
DASH --> REGENV[Regions & Environments]
DASH --> FIND["Security Findings<br/>(filtered)"]
DASH --> RISK[Risk Overview]
DASH --> DI[Ops: Data Integrity]
DASH --> FEEDS[Ops: Feeds & AirGap]
DASH --> INT[Integrations Hub]
DASH --> EVID[Evidence & Audit]
%% What cards link to
DASH --> CARD_SBOM[Card: SBOM Findings Snapshot]
CARD_SBOM --> FIND
DASH --> CARD_HYB[Card: Hybrid Reachability Coverage]
CARD_HYB --> RISK
DASH --> CARD_DI[Card: Nightly Ops Signals]
CARD_DI --> DI
DASH --> CARD_PIPE[Regional Promotion Pipelines]
CARD_PIPE --> REGENV
CARD_PIPE --> REL
DASH --> CARD_APPR[Card: Pending Approvals]
CARD_APPR --> APPR
```
---
# 16.2 Screen — Dashboard (v3) — release-centric + “security reality” surfaced
### Formerly (where it lived pre-redesign)
* **Control Plane** was the root screen.
* It showed:
* environment pipeline (Dev/Staging/UAT/Prod) *without regions*
* pending approvals + active deployments + recent releases
* **SBOM findings / critical reachable issues** were effectively buried:
* **Security → Findings**
* **Security → Overview**
* **Nightly jobs / data freshness / integration drift** were scattered:
* **Operations → Feeds**
* **Settings → Integrations**
* **Settings → System → Background Jobs**
* **Operations → Scheduler / Dead Letter**
### Why it's changed like this
Stella Ops is about **promotion-by-digest with defensible security + evidence**.
So the home screen must answer (in <30 seconds):
1. **Can I safely approve/promote right now?** (data fresh? feeds OK? rescans OK? integrations OK?)
2. **Where are my critical reachable issues?** (which envs, which CVEs, what reachability confidence?)
3. **Are environments healthy AND do they have SBOM coverage?** (runtime + SBOM freshness/coverage together)
This keeps reachability **2nd-class** (dashboard + risk drilldowns), not a “top-level product area”.
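
As an illustration of question 1, the nightly signals could be folded into a single readiness verdict. This is a minimal sketch, not the product's logic: the `OpsSignal` type, the `promotion_readiness` function, and the rule "FAIL blocks, STALE/DEGRADED warns" are all assumptions made for the example. Only the signal names and the PASS/WARN/BLOCK vocabulary come from this pack.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SignalState(Enum):
    OK = "OK"
    STALE = "STALE"
    DEGRADED = "DEGRADED"
    FAIL = "FAIL"


@dataclass(frozen=True)
class OpsSignal:
    """One nightly job or data-freshness signal (hypothetical shape)."""
    name: str
    state: SignalState


def promotion_readiness(signals: list[OpsSignal]) -> tuple[str, list[str]]:
    """Return (verdict, reasons).

    Assumed policy: any FAIL blocks, any STALE/DEGRADED warns, otherwise pass.
    """
    failed = [s.name for s in signals if s.state is SignalState.FAIL]
    shaky = [
        s.name
        for s in signals
        if s.state in (SignalState.STALE, SignalState.DEGRADED)
    ]
    if failed:
        return "BLOCK", failed + shaky
    if shaky:
        return "WARN", shaky
    return "PASS", []


# Signal states taken from the Nightly Ops Signals card in the mock.
nightly = [
    OpsSignal("SBOM rescan", SignalState.FAIL),
    OpsSignal("NVD feed", SignalState.STALE),
    OpsSignal("Jenkins", SignalState.DEGRADED),
]
verdict, reasons = promotion_readiness(nightly)
print(verdict, reasons)
```

Returning the reasons alongside the verdict keeps the card able to say why approving is unsafe, which is the point of surfacing data freshness on the home screen.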
---
## Dashboard screen graph (Mermaid)
```mermaid
flowchart TD
A[Dashboard] --> B["Regional Promotion Pipelines<br/>(per region: Dev→Stage→UAT→Prod nodes)"]
A --> C["Environments at Risk table<br/>(deploy + SBOM + CritR + B/I/R)"]
A --> D["SBOM Findings Snapshot card<br/>(no issues vs envs with CritR)"]
A --> E["Hybrid Reachability Coverage card<br/>(Build/Image/Runtime)"]
A --> F["Nightly Ops Signals card<br/>(Data Integrity)"]
A --> G[Pending Approvals card]
A --> H[Active Deployments card]
A --> I[Recent Releases table]
```
---
## ASCII mock — Dashboard (v3)
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ DASHBOARD (Formerly: Control Plane) │
│ Purpose: release-centric status across regions: promotion + risk + proof + data freshness │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Search: [ releases, digests, CVEs...________________________ ] Org: Acme Region: All Window:24h│
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ REGIONAL PROMOTION PIPELINES (nodes show: Deploy + SBOM + CritR + Hybrid reach B/I/R) │
│ │
│ US-East [Dev: Deploy OK | SBOM OK | CritR 0 | B/I/R 3/3] → [Stage: OK | OK | 0 | 3/3] → │
│ [UAT: OK | OK | 1 | 2/3] → [Prod: DEGRADED | SBOM STALE | CritR 2 | 2/3] │
│ │
│ EU-West [Dev: OK | OK | 0 | 3/3] → [Stage: OK | OK | 0 | 3/3] → [UAT: OK | OK | 0 | 3/3] → │
│ [Prod: OK | OK | 0 | 3/3] │
│ │
│ APAC [Dev: OK | SBOM MISSING(2 imgs) | CritR 0 | 2/3] → [Stage: OK | OK | 0 | 2/3] → │
│ [UAT: UNKNOWN | OK | 0 | 2/3] → [Prod: OK | OK | 0 | 2/3] │
│ │
│ Click a node → Env Detail (deploy + SBOM status + findings + evidence + inputs) │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ENVIRONMENTS AT RISK (top 5) │
│ ┌───────────────┬───────────────┬──────────────┬─────────────┬───────────┬──────────┬──────────┐ │
│ │ Region/Env │ Deploy Health │ SBOM Status │ Crit Reach │ Hybrid B/I/R │ Last SBOM │ Action │ │
│ ├───────────────┼───────────────┼──────────────┼─────────────┼───────────┼──────────┼──────────┤ │
│ │ US-East / Prod │ DEGRADED │ STALE (26h) │ 2 │ 2/3 │ 26h │ [Open] │ │
│ │ US-East / UAT │ OK │ OK │ 1 │ 2/3 │ 2h │ [Open] │ │
│ │ APAC / Dev │ OK │ MISSING (2) │ 0 │ 2/3 │ — │ [Open] │ │
│ └───────────────┴───────────────┴──────────────┴─────────────┴───────────┴──────────┴──────────┘ │
│ Notes: SBOM Status reflects image SBOM coverage+freshness; Deploy reflects runtime/service health │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ SNAPSHOTS (cards) │
│ ┌──────────────────────────────┬──────────────────────────────┬──────────────────────────────┬──┐│
│ │ SBOM Findings Snapshot │ Hybrid Reachability Coverage │ Nightly Ops Signals (Data │Pending│
│ │ (click → details drawer) │ (Build/Image/Runtime) │ Integrity) (click → Ops) │Approvals│
│ │ │ │ │(2) │
│ │ Crit reachable envs: 2 │ Build: OK (02:10) │ SBOM rescan: FAIL │ - API │
│ │ Crit reachable total: 3 │ Image: OK (02:12) │ NVD feed: STALE (3h) │ Gateway│
│ │ No issues envs: 5 │ Runtime: WARN (APAC missing) │ Jenkins: DEGRADED │ Gate: │
│ │ Top envs: US-East Prod, UAT │ Hybrid verdict: 2/3 in US-East│ DLQ: 1,230 runtime events │ PASS/ │
│ │ [Open Findings] │ [Open Risk] │ [Open Data Integrity] │ BLOCK │
│ └──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──┘│
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ACTIVE DEPLOYMENTS │
│ Hotfix 1.2.4 → US-East Prod (RUNNING) [Open Run Timeline] │
│ │
│ RECENT RELEASES │
│ Hotfix 1.2.4 PROMOTING US-East Stage→Prod Components: 1 [Review] │
│ Platform 1.3.0-rc1 READY EU-West Stage→Prod Components: 4 [Review] │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
```
**Key dashboard upgrades vs prior Control Plane**
* Pipeline nodes now show **Deploy + SBOM status + Crit reachable + Hybrid coverage** in-line.
* A dedicated **“Environments at Risk”** table makes it explicit which environments have findings.
* **Nightly Ops Signals** is a first-class dashboard card but links to **Ops → Data Integrity** (no duplication).
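
The per-node fields and the “Environments at Risk” table could be modelled roughly as below. This is a sketch under assumptions: `EnvStatus`, `environments_at_risk`, the at-risk rule, and the sort order are hypothetical, and `reach_sources` reads B/I/R as "how many of Build/Image/Runtime are reporting", which this pack implies but does not define.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EnvStatus:
    """Fields shown on one pipeline node (hypothetical shape)."""
    region: str
    env: str
    deploy: str          # "OK" | "DEGRADED" | "UNKNOWN"
    sbom: str            # "OK" | "STALE" | "MISSING"
    crit_reachable: int  # critical reachable findings (CritR)
    reach_sources: int   # assumed: count of Build/Image/Runtime reporting, 0..3

    @property
    def bir(self) -> str:
        """Render hybrid coverage the way the mock does, e.g. '2/3'."""
        return f"{self.reach_sources}/3"

    def is_at_risk(self) -> bool:
        # Assumed rule: any non-OK deploy or SBOM state, or any CritR finding.
        return self.deploy != "OK" or self.sbom != "OK" or self.crit_reachable > 0


def environments_at_risk(envs: list[EnvStatus], limit: int = 5) -> list[EnvStatus]:
    """Return the top `limit` at-risk environments, worst first."""
    risky = [e for e in envs if e.is_at_risk()]
    # Most CritR first; ties broken by unhealthy deploy, then unhealthy SBOM.
    risky.sort(key=lambda e: (-e.crit_reachable, e.deploy == "OK", e.sbom == "OK"))
    return risky[:limit]


# Sample rows copied from the mock above.
envs = [
    EnvStatus("EU-West", "Prod", "OK", "OK", 0, 3),
    EnvStatus("APAC", "Dev", "OK", "MISSING", 0, 2),
    EnvStatus("US-East", "UAT", "OK", "OK", 1, 2),
    EnvStatus("US-East", "Prod", "DEGRADED", "STALE", 2, 2),
]
top = environments_at_risk(envs)
for e in top:
    print(f"{e.region} / {e.env}: {e.deploy}, SBOM {e.sbom}, CritR {e.crit_reachable}, B/I/R {e.bir}")
```

Keeping deploy health and SBOM status as separate fields on the same record mirrors the mock's note that the two columns report different things.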
---
# 16.3 Screen — SBOM Findings Snapshot (dashboard drawer / panel)
This satisfies your “with details” requirement **without creating a new top-level screen**.
### Formerly (where it lived)
* Details were only available by going to:
* **Security → Findings** and filtering
* sometimes **Security → Overview**
* There wasn't a dashboard-level “what's actually burning” view.
### Why it's changed like this
Operators need fast answers:
* which environments are impacted,
* how many **critical reachable**,
* what packages/CVEs,
* and what reachability evidence exists (Build / Image / Runtime).
This drawer is 2nd-class: it's a **dashboard drilldown**, not a new top menu.
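
The drawer's summary line can be derived from per-environment CritR counts. The sketch below is illustrative only: `snapshot_summary` and `headline` are hypothetical names, and an environment is counted as "no issues" when it has zero critical reachable findings, which is an assumption about how the card defines that bucket.

```python
from __future__ import annotations


def snapshot_summary(crit_reachable_by_env: dict[str, int]) -> dict[str, int]:
    """Aggregate per-environment CritR counts into the three card figures."""
    impacted = {env: n for env, n in crit_reachable_by_env.items() if n > 0}
    return {
        "crit_reachable_envs": len(impacted),
        "crit_reachable_total": sum(impacted.values()),
        "no_issue_envs": len(crit_reachable_by_env) - len(impacted),
    }


def headline(summary: dict[str, int]) -> str:
    """Produce the 'No issues' vs 'X environments ...' line for the card."""
    n = summary["crit_reachable_envs"]
    if n == 0:
        return "No issues"
    noun = "environment" if n == 1 else "environments"
    return f"{n} {noun} with critical reachable issues"


# Counts chosen to reproduce the figures in the mock (2 envs, 3 total, 5 clean).
counts = {
    "US-East / Prod": 2,
    "US-East / UAT": 1,
    "US-East / Dev": 0,
    "US-East / Stage": 0,
    "EU-West / Prod": 0,
    "EU-West / UAT": 0,
    "APAC / Prod": 0,
}
summary = snapshot_summary(counts)
print(headline(summary), summary)
```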
---
## Drawer screen graph (Mermaid)
```mermaid
flowchart TD
A[Dashboard: SBOM Findings Snapshot Drawer] --> B[Env list with CritR counts]
A --> C[Top CVEs/packages per env]
A --> D["Reachability evidence by source<br/>(Build/Image/Runtime)"]
A --> E[Actions: Open Findings filtered]
A --> F[Actions: Open Env Detail]
A --> G[Actions: Request Exception / Create Work Item]
```
---
## ASCII mock — SBOM Findings Snapshot Drawer
```text
┌───────────────────────────────────────────────────────────────────────────────────────────────┐
│ SBOM FINDINGS SNAPSHOT (Drawer) │
│ Formerly: Security ▸ Findings (manual filtering) │
│ Why: show “no issues vs critical reachable envs” + immediate details from the Dashboard │
├───────────────────────────────────────────────────────────────────────────────────────────────┤
│ Summary (24h): │
│ Critical reachable envs: 2 Total Crit reachable: 3 Envs with no findings: 5 │
│ Data confidence: WARN (NVD stale 3h, runtime ingest lag) │
├───────────────────────────────────────────────────────────────────────────────────────────────┤
│ Environments with Critical Reachable │
│ ┌──────────────────┬───────────┬───────────────┬───────────────┬───────────────┬───────────┐ │
│ │ Region/Env │ CritR │ Top CVE │ Top Package │ Reach evidence│ Actions │ │
│ ├──────────────────┼───────────┼───────────────┼───────────────┼───────────────┼───────────┤ │
│ │ US-East / Prod │ 2 │ CVE-2026-1234 │ openssl │ B/I/R: 2/3 │ [Env] [Find]│
│ │ │ │ CVE-2026-9001 │ log4j │ Runtime: WARN │ [Exception]│
│ │ US-East / UAT │ 1 │ CVE-2026-2222 │ glibc │ B/I/R: 2/3 │ [Env] [Find]│
│ └──────────────────┴───────────┴───────────────┴───────────────┴───────────────┴───────────┘ │
│ Notes: “Reach evidence” reflects hybrid sources: Build (static), Image (Dover/scan), Runtime. │
├───────────────────────────────────────────────────────────────────────────────────────────────┤
│ Quick filters: [Only CritR] [Only Reachable] [Only Prod] [Only SBOM stale/missing] │
│ Links: [Open Security Findings (filtered)] [Open Risk Overview] │
└───────────────────────────────────────────────────────────────────────────────────────────────┘
```
---
## What changes elsewhere (just wiring, no new screens)
* Clicking an **environment node** or an **at-risk table row** goes to **Regions & Environments → Env Detail**, where you already have:
* deploy/runtime state,
* image SBOM coverage/freshness,
* findings,
* reachability summary,
* evidence + inputs.
* Clicking **Nightly Ops Signals** goes to **Ops → Data Integrity Overview** (Pack 15).
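
One way to keep the "wiring, no new screens" rule checkable is to hold the card links as data and verify that every target is an existing screen. The screen and card names below come from the 16.1 navigation graph; the `CARD_LINKS` structure and the `dangling_links` helper are hypothetical and shown only as a sketch.

```python
from __future__ import annotations

# Screens that already exist, per the 16.1 navigation graph.
EXISTING_SCREENS = {
    "Releases",
    "Approvals",
    "Deployments / Promotion Runs",
    "Regions & Environments",
    "Security Findings",
    "Risk Overview",
    "Ops: Data Integrity",
    "Ops: Feeds & AirGap",
    "Integrations Hub",
    "Evidence & Audit",
}

# Dashboard element -> screens it links to (hypothetical data structure).
CARD_LINKS: dict[str, list[str]] = {
    "SBOM Findings Snapshot": ["Security Findings"],
    "Hybrid Reachability Coverage": ["Risk Overview"],
    "Nightly Ops Signals": ["Ops: Data Integrity"],
    "Regional Promotion Pipelines": ["Regions & Environments", "Releases"],
    "Pending Approvals": ["Approvals"],
}


def dangling_links(links: dict[str, list[str]], screens: set[str]) -> list[str]:
    """Return link targets that do not correspond to an existing screen."""
    targets = {t for dests in links.values() for t in dests}
    return sorted(targets - screens)


missing = dangling_links(CARD_LINKS, EXISTING_SCREENS)
print("dangling targets:", missing)
```

An empty result means every card only routes to a screen that already exists, so the dashboard adds entry points rather than duplicate views.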
---
If you confirm this dashboard direction is correct, **Pack 17** will apply the same “data health + SBOM status + Crit reachable + B/I/R” pattern to the **Approvals** detail view (so approvers see exactly why a gate is PASS/WARN/BLOCK and what data is missing) without making reachability a top-level menu.