git.stella-ops.org/docs/modules/ui/v2-rewire/pack-16.md
2026-02-18 23:03:07 +02:00


Pack 16 — Dashboard upgrade: SBOM findings + critical reachable by env, per-env deploy + SBOM status, and Nightly Ops/Data Integrity signals (wired, not duplicated)

This pack keeps the re-org intact (release-centric). It upgrades the Dashboard (formerly “Control Plane”) so the operator immediately sees:

  • “No issues” vs “X environments with critical reachable issues” (and which envs)
  • Per-environment status that includes deploy/runtime and image SBOM status
  • Nightly jobs + data freshness issues that directly affect gating/approvals (CVE feeds, SBOM rescan, integration health)
  • Hybrid reachability coverage (Build / Image / Runtime) as a 2nd-class signal on the dashboard
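
The B/I/R figures used throughout this pack (for example `2/3`) can be read as "how many of the three reachability sources currently report for an environment". Below is a minimal TypeScript sketch under that reading. The type and function names are hypothetical and are not taken from the Stella Ops codebase.

```typescript
// Hypothetical model: which reachability sources have reported for an environment.
interface HybridCoverage {
  build: boolean;   // static analysis at build time
  image: boolean;   // image scan
  runtime: boolean; // runtime observation
}

// Render coverage in the "n/3" form shown on pipeline nodes, e.g. "2/3".
function formatBir(c: HybridCoverage): string {
  const reporting = [c.build, c.image, c.runtime].filter((v) => v).length;
  return `${reporting}/3`;
}

// List the sources that are missing, so a card can say which one to chase.
function missingSources(c: HybridCoverage): string[] {
  const names: Array<keyof HybridCoverage> = ["build", "image", "runtime"];
  return names.filter((n) => !c[n]);
}
```

Keeping the three sources as separate flags (rather than storing only the count) lets the dashboard show both the compact `2/3` badge and a hint such as "Runtime missing" from the same data.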

16.1 Dashboard navigation graph (Mermaid)

flowchart TD
  DASH["Dashboard<br/>(formerly: Control Plane)"] --> REL[Releases]
  DASH --> APPR[Approvals]
  DASH --> DEPLOY[Deployments / Promotion Runs]
  DASH --> REGENV[Regions & Environments]
  DASH --> FIND["Security Findings<br/>(filtered)"]
  DASH --> RISK[Risk Overview]
  DASH --> DI[Ops: Data Integrity]
  DASH --> FEEDS[Ops: Feeds & AirGap]
  DASH --> INT[Integrations Hub]
  DASH --> EVID[Evidence & Audit]

  %% What cards link to
  DASH --> CARD_SBOM[Card: SBOM Findings Snapshot]
  CARD_SBOM --> FIND

  DASH --> CARD_HYB[Card: Hybrid Reachability Coverage]
  CARD_HYB --> RISK

  DASH --> CARD_DI[Card: Nightly Ops Signals]
  CARD_DI --> DI

  DASH --> CARD_PIPE[Regional Promotion Pipelines]
  CARD_PIPE --> REGENV
  CARD_PIPE --> REL

  DASH --> CARD_APPR[Card: Pending Approvals]
  CARD_APPR --> APPR
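
The card edges in the graph above amount to a small lookup table: every dashboard card resolves to one or more screens that already exist, which is what "wired, not duplicated" means in practice. Below is a hedged TypeScript sketch of that table. The identifiers mirror the node names in the graph but are otherwise hypothetical; no real route paths are implied.

```typescript
// Hypothetical identifiers for existing screens that cards may link to.
type Screen =
  | "Releases"
  | "Approvals"
  | "RegionsEnvironments"
  | "SecurityFindings"
  | "RiskOverview"
  | "DataIntegrity";

// Hypothetical identifiers for the dashboard cards in the graph.
type Card =
  | "SbomFindingsSnapshot"
  | "HybridReachabilityCoverage"
  | "NightlyOpsSignals"
  | "RegionalPromotionPipelines"
  | "PendingApprovals";

// One entry per card edge in the navigation graph.
const cardTargets: Record<Card, readonly Screen[]> = {
  SbomFindingsSnapshot: ["SecurityFindings"],
  HybridReachabilityCoverage: ["RiskOverview"],
  NightlyOpsSignals: ["DataIntegrity"],
  RegionalPromotionPipelines: ["RegionsEnvironments", "Releases"],
  PendingApprovals: ["Approvals"],
};

// Resolve where a card click should navigate.
function targetsFor(card: Card): readonly Screen[] {
  return cardTargets[card];
}
```

Because `cardTargets` is typed as `Record<Card, ...>`, adding a new card without a destination fails at compile time, which guards against dead-end cards.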

16.2 Screen — Dashboard (v3) — release-centric + “security reality” surfaced

Formerly (where it lived pre-redesign)

  • Control Plane was the root screen.

  • It showed:

    • environment pipeline (Dev/Staging/UAT/Prod) without regions
    • pending approvals + active deployments + recent releases
  • SBOM findings / critical reachable issues were effectively buried:

    • Security → Findings
    • Security → Overview
  • Nightly jobs / data freshness / integration drift were scattered:

    • Operations → Feeds
    • Settings → Integrations
    • Settings → System → Background Jobs
    • Operations → Scheduler / Dead Letter

Why it's changed like this

Stella Ops is about promotion-by-digest with defensible security + evidence. So the home screen must answer (in <30 seconds):

  1. Can I safely approve/promote right now? (data fresh? feeds OK? rescans OK? integrations OK?)
  2. Where are my critical reachable issues? (which envs, which CVEs, what reachability confidence?)
  3. Are environments healthy AND do they have SBOM coverage? (runtime + SBOM freshness/coverage together)

This keeps reachability 2nd-class (dashboard + risk drilldowns), not a top-level “product area”.
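
Question 1 above ("Can I safely approve/promote right now?") can be sketched as a reduction over the nightly signals. The pack does not define the actual policy, so the rule below (any FAIL blocks, any other non-OK state is a caution) is purely an assumption for illustration, and all names are hypothetical.

```typescript
// Signal states as they appear in the dashboard mock labels.
type SignalState = "OK" | "WARN" | "STALE" | "DEGRADED" | "FAIL";

interface OpsSignal {
  name: string;
  state: SignalState;
}

type Readiness = "READY" | "CAUTION" | "BLOCKED";

// Assumed policy (not defined in this pack): any FAIL blocks,
// any other non-OK state downgrades to CAUTION.
function approvalReadiness(signals: OpsSignal[]): Readiness {
  if (signals.some((s) => s.state === "FAIL")) return "BLOCKED";
  if (signals.some((s) => s.state !== "OK")) return "CAUTION";
  return "READY";
}

// Names of the signals an approver should look at first.
function signalsNeedingAttention(signals: OpsSignal[]): string[] {
  return signals.filter((s) => s.state !== "OK").map((s) => s.name);
}
```

With the values from the Nightly Ops Signals card in the mock (SBOM rescan FAIL, NVD feed STALE, Jenkins DEGRADED), this sketch would report `BLOCKED` and list all three signals.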


Dashboard screen graph (Mermaid)

flowchart TD
  A[Dashboard] --> B["Regional Promotion Pipelines<br/>(per region: Dev→Stage→UAT→Prod nodes)"]
  A --> C["Environments at Risk table<br/>(deploy + SBOM + CritR + B/I/R)"]
  A --> D["SBOM Findings Snapshot card<br/>(no issues vs envs with CritR)"]
  A --> E["Hybrid Reachability Coverage card<br/>(Build/Image/Runtime)"]
  A --> F["Nightly Ops Signals card<br/>(Data Integrity)"]
  A --> G[Pending Approvals card]
  A --> H[Active Deployments card]
  A --> I[Recent Releases table]

ASCII mock — Dashboard (v3)

┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ DASHBOARD  (Formerly: Control Plane)                                                              │
│ Purpose: release-centric status across regions: promotion + risk + proof + data freshness         │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Search: [ releases, digests, CVEs...________________________ ]  Org: Acme  Region: All  Window:24h│
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ REGIONAL PROMOTION PIPELINES  (nodes show: Deploy + SBOM + CritR + Hybrid reach B/I/R)            │
│                                                                                                   │
│ US-East   [Dev: Deploy OK | SBOM OK | CritR 0 | B/I/R 3/3] → [Stage: OK | OK | 0 | 3/3] →         │
│          [UAT: OK | OK | 1 | 2/3] → [Prod: DEGRADED | SBOM STALE | CritR 2 | 2/3]                 │
│                                                                                                   │
│ EU-West   [Dev: OK | OK | 0 | 3/3] → [Stage: OK | OK | 0 | 3/3] → [UAT: OK | OK | 0 | 3/3] →      │
│          [Prod: OK | OK | 0 | 3/3]                                                                 │
│                                                                                                   │
│ APAC      [Dev: OK | SBOM MISSING(2 imgs) | CritR 0 | 2/3] → [Stage: OK | OK | 0 | 2/3] →          │
│          [UAT: UNKNOWN | OK | 0 | 2/3] → [Prod: OK | OK | 0 | 2/3]                                 │
│                                                                                                   │
│ Click a node → Env Detail (deploy + SBOM status + findings + evidence + inputs)                   │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ENVIRONMENTS AT RISK (top 5)                                                                      │
│ ┌───────────────┬───────────────┬──────────────┬─────────────┬───────────┬──────────┬──────────┐ │
│ │ Region/Env     │ Deploy Health  │ SBOM Status  │ Crit Reach   │ Hybrid B/I/R │ Last SBOM │ Action │ │
│ ├───────────────┼───────────────┼──────────────┼─────────────┼───────────┼──────────┼──────────┤ │
│ │ US-East / Prod │ DEGRADED       │ STALE (26h)  │ 2            │ 2/3       │ 26h      │ [Open]  │ │
│ │ US-East / UAT  │ OK             │ OK           │ 1            │ 2/3       │ 2h       │ [Open]  │ │
│ │ APAC / Dev     │ OK             │ MISSING (2)  │ 0            │ 2/3       │ —        │ [Open]  │ │
│ └───────────────┴───────────────┴──────────────┴─────────────┴───────────┴──────────┴──────────┘ │
│ Notes: SBOM Status reflects image SBOM coverage+freshness; Deploy reflects runtime/service health │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ SNAPSHOTS (cards)                                                                                │
│ ┌──────────────────────────────┬──────────────────────────────┬──────────────────────────────┬──┐│
│ │ SBOM Findings Snapshot        │ Hybrid Reachability Coverage │ Nightly Ops Signals (Data    │Pending│
│ │ (click → details drawer)      │ (Build/Image/Runtime)        │ Integrity) (click → Ops)     │Approvals│
│ │                              │                              │                              │(2)   │
│ │ Crit reachable envs: 2        │ Build: OK   (02:10)          │ SBOM rescan: FAIL            │ - API  │
│ │ Crit reachable total: 3       │ Image: OK   (02:12)          │ NVD feed: STALE (3h)         │ Gateway│
│ │ No issues envs: 5             │ Runtime: WARN (APAC missing) │ Jenkins: DEGRADED            │ Gate:  │
│ │ Top envs: US-East Prod, UAT   │ Hybrid verdict: 2/3 in US-East│ DLQ: 1,230 runtime events    │ PASS/  │
│ │ [Open Findings]               │ [Open Risk]                  │ [Open Data Integrity]         │ BLOCK  │
│ └──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──┘│
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ACTIVE DEPLOYMENTS                                                                                │
│  Hotfix 1.2.4  → US-East Prod  (RUNNING)   [Open Run Timeline]                                   │
│                                                                                                   │
│ RECENT RELEASES                                                                                   │
│  Hotfix 1.2.4     PROMOTING   US-East Stage→Prod    Components: 1     [Review]                   │
│  Platform 1.3.0-rc1 READY      EU-West Stage→Prod    Components: 4     [Review]                  │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘

Key dashboard upgrades vs prior Control Plane

  • Pipeline nodes now show Deploy + SBOM status + Crit reachable + Hybrid coverage in-line.
  • A dedicated “Environments at Risk” table makes “env N and env M with findings” explicit.
  • Nightly Ops Signals is a first-class dashboard card but links to Ops → Data Integrity (no duplication).
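
The "Environments at Risk" table can be derived from the same per-node data the pipeline shows. Below is a minimal TypeScript sketch, assuming an environment is "at risk" when deploy health or SBOM status is not OK or when it has any critical reachable finding, and assuming rows are ranked by critical reachable count. Both rules and all names are illustrative assumptions, not the product's defined behavior.

```typescript
type DeployHealth = "OK" | "DEGRADED" | "UNKNOWN";
type SbomStatus = "OK" | "STALE" | "MISSING";

// Hypothetical shape of one pipeline node (one environment in one region).
interface EnvNode {
  region: string;
  env: string;
  deploy: DeployHealth;
  sbom: SbomStatus;
  critReachable: number;
}

// Assumed rule: any non-OK status or any critical reachable finding.
function isAtRisk(n: EnvNode): boolean {
  return n.deploy !== "OK" || n.sbom !== "OK" || n.critReachable > 0;
}

// Top-N at-risk environments, highest critical reachable count first.
// filter() returns a new array, so the input is not mutated by sort().
function environmentsAtRisk(nodes: EnvNode[], limit = 5): EnvNode[] {
  return nodes
    .filter(isAtRisk)
    .sort((a, b) => b.critReachable - a.critReachable)
    .slice(0, limit);
}
```

Fed with the US-East / Prod, US-East / UAT, and APAC / Dev rows from the mock plus a clean EU-West / Prod node, this sketch returns the three at-risk rows in the same order as the table and drops the clean one.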

16.3 Screen — SBOM Findings Snapshot (dashboard drawer / panel)

This satisfies your “with details” requirement without creating a new top-level screen.

Formerly (where it lived)

  • Details were only available by going to:

    • Security → Findings and filtering
    • sometimes Security → Overview
  • There wasn't a dashboard-level “what's actually burning” view.

Why it's changed like this

Operators need fast answers:

  • which environments are impacted,
  • how many critical reachable,
  • what packages/CVEs,
  • and what reachability evidence exists (Build / Image / Runtime).

This drawer is “2nd-class”: it's a dashboard drilldown, not a new top menu.
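
The summary line of the drawer (critical reachable envs, total critical reachable, envs with no findings, top envs) is a plain aggregation over per-environment counts. Below is a hedged TypeScript sketch. The field and function names are hypothetical, and "no findings" is assumed to mean zero findings of any severity.

```typescript
// Hypothetical per-environment counts feeding the drawer.
interface EnvFindings {
  env: string;           // e.g. "US-East / Prod"
  critReachable: number; // critical findings with reachability evidence
  totalFindings: number; // all findings, any severity
}

interface SnapshotSummary {
  critReachableEnvs: number;
  critReachableTotal: number;
  noFindingEnvs: number;
  topEnvs: string[];
}

// Aggregate the counts shown in the drawer's summary row.
function summarize(envs: EnvFindings[], topN = 2): SnapshotSummary {
  const impacted = envs
    .filter((e) => e.critReachable > 0)
    .sort((a, b) => b.critReachable - a.critReachable);
  return {
    critReachableEnvs: impacted.length,
    critReachableTotal: impacted.reduce((sum, e) => sum + e.critReachable, 0),
    noFindingEnvs: envs.filter((e) => e.totalFindings === 0).length,
    topEnvs: impacted.slice(0, topN).map((e) => e.env),
  };
}
```

Computing the summary from the same rows that populate the drawer table keeps the headline numbers and the per-environment details from drifting apart.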


Drawer screen graph (Mermaid)

flowchart TD
  A[Dashboard: SBOM Findings Snapshot Drawer] --> B[Env list with CritR counts]
  A --> C[Top CVEs/packages per env]
  A --> D["Reachability evidence by source<br/>(Build/Image/Runtime)"]
  A --> E[Actions: Open Findings filtered]
  A --> F[Actions: Open Env Detail]
  A --> G[Actions: Request Exception / Create Work Item]

ASCII mock — SBOM Findings Snapshot Drawer

┌───────────────────────────────────────────────────────────────────────────────────────────────┐
│ SBOM FINDINGS SNAPSHOT (Drawer)                                                                 │
│ Formerly: Security ▸ Findings (manual filtering)                                                 │
│ Why: show “no issues vs critical reachable envs” + immediate details from the Dashboard         │
├───────────────────────────────────────────────────────────────────────────────────────────────┤
│ Summary (24h):                                                                                  │
│  Critical reachable envs: 2   Total Crit reachable: 3   Envs with no findings: 5                │
│  Data confidence: WARN (NVD stale 3h, runtime ingest lag)                                        │
├───────────────────────────────────────────────────────────────────────────────────────────────┤
│ Environments with Critical Reachable                                                           │
│ ┌──────────────────┬───────────┬───────────────┬───────────────┬───────────────┬───────────┐   │
│ │ Region/Env        │ CritR     │ Top CVE       │ Top Package    │ Reach evidence│ Actions   │   │
│ ├──────────────────┼───────────┼───────────────┼───────────────┼───────────────┼───────────┤   │
│ │ US-East / Prod    │ 2         │ CVE-2026-1234 │ openssl        │ B/I/R: 2/3    │ [Env] [Find]│
│ │                  │           │ CVE-2026-9001 │ log4j          │ Runtime: WARN │ [Exception]│
│ │ US-East / UAT     │ 1         │ CVE-2026-2222 │ glibc          │ B/I/R: 2/3    │ [Env] [Find]│
│ └──────────────────┴───────────┴───────────────┴───────────────┴───────────────┴───────────┘   │
│ Notes: “Reach evidence” reflects hybrid sources: Build (static), Image (Dover/scan), Runtime.    │
├───────────────────────────────────────────────────────────────────────────────────────────────┤
│ Quick filters: [Only CritR] [Only Reachable] [Only Prod] [Only SBOM stale/missing]              │
│ Links: [Open Security Findings (filtered)] [Open Risk Overview]                                 │
└───────────────────────────────────────────────────────────────────────────────────────────────┘

What changes elsewhere (just wiring, no new screens)

  • Clicking an environment node or an at-risk table row goes to Regions & Environments → Env Detail, where you already have:

    • deploy/runtime state,
    • image SBOM coverage/freshness,
    • findings,
    • reachability summary,
    • evidence + inputs.
  • Clicking Nightly Ops Signals goes to Ops → Data Integrity Overview (Pack 15).


If you confirm this dashboard direction is correct, Pack 17 will apply the same “data health + SBOM status + Crit reachable + B/I/R” pattern into the Approvals detail view (so approvers see exactly why a gate is PASS/WARN/BLOCK and what data is missing) without making reachability a top-level menu.