## Pack 18 — Environment Detail standardized: **Deploy + SBOM + CritReachable + Hybrid B/I/R + Data Confidence** in one header (consistent everywhere)
This pack makes **Environment Detail** the single place where an operator or approver can answer:
**“Is this environment safe to promote into right now?”**
…without bouncing across Dashboard → Security → Ops → Integrations.
It keeps your IA intact:
* **Release Control** is still a root menu
* **Regions-first** environment organization remains
* **Reachability stays 2nd-class** (tab + badges), not a new top-level area
* **Data Integrity** remains owned by Ops, but is summarized here
---
# 18.1 Menu & entry graph (Mermaid)
```mermaid
flowchart TD
RC["Release Control (ROOT)"] --> RE[Regions & Environments]
RE --> RD[Region Detail]
RD --> ENV[Environment Detail]
%% Entry points
DASH[Dashboard] --> ENV
APPR[Approvals] --> ENV
REL[Releases] --> ENV
%% Cross links out of env
ENV --> BV[Bundle Version Detail]
ENV --> RUN[Promotion Run Timeline]
ENV --> FIND["Security Findings (filtered)"]
ENV --> DI["Ops: Data Integrity (filtered)"]
ENV --> INT[Integrations Hub]
ENV --> GOV[Release Control: Governance]
ENV --> EVID[Evidence Export]
```
---
# 18.2 Environment Detail (shell) — the standardized “single header truth”
### Formerly (what it was called before)
* **Control Plane pipeline node** (no dedicated environment page), plus
* **Settings → Release Control → Environments** (flat listing; not region-first)
### Why changed like this
You asked for:
* per-environment status including **docker/runtime** *and* **image SBOM status**
* dashboard surfacing of “**X envs with critical reachable issues**”
* nightly pipeline failures (rescan / feed sync / integration connectivity)
* hybrid reachability from **image/build/runtime**
All of those converge at the environment boundary, so Env Detail needs a uniform “truth header”.
---
## Environment Detail shell graph (Mermaid)
```mermaid
flowchart TD
ENV["Environment Detail (shell)"] --> O[Overview]
ENV --> DEP[Deploy Status]
ENV --> SB[SBOM & Findings]
ENV --> RCH["Reachability (Hybrid B/I/R)"]
ENV --> INP["Inputs (Vault/Consul)"]
ENV --> PR[Promotions & Approvals]
ENV --> DC[Data Confidence]
ENV --> EV[Evidence & Audit]
```
---
## ASCII mock — Environment Detail shell (header + tabs)
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Environment: us-prod Region: US-East Type: Production │
│ Formerly: Control Plane pipeline node (no dedicated page) + Settings ▸ Release Control ▸ Envs │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ STANDARD STATUS HEADER (shown consistently on every Env tab) │
│ Deploy: DEGRADED (targets 5/6 healthy) | SBOM: STALE (26h) scanned 13/14 pending 1 │
│ Findings (target env): CritR=2 HighR=0 HighNR=3 VEX=62% │
│ Hybrid reach coverage: Build 78% | Image 100% | Runtime 35% (evidence age: B 7h / I 1h / R 26h)│
│ Data Confidence: WARN (NVD stale 3h; SBOM rescan FAIL; Jenkins DEGRADED; DLQ runtime 1,230) │
│ Policy baseline: Prod-US-East Version lock: lock-2026-02-18 │
│ Deployed bundle: Platform Release 1.3.0-rc1 (manifest sha256:beef...) │
│ Quick links: [Open Deployed Bundle] [Open Findings] [Open Data Integrity] [Open Promotion Run] │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Tabs: [Overview] [Deploy Status] [SBOM & Findings] [Reachability] [Inputs] [Promotions] [Data] │
│ [Evidence & Audit] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
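One way to keep the header identical on every tab is to treat it as a single immutable value that every tab renders through the same function. The following is a minimal sketch under that assumption; `EnvStatusHeader`, `render_status_strip`, and all field names are hypothetical and not taken from the actual UI code. The sample values are copied from the mock above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvStatusHeader:
    """Hypothetical shape of the standardized header; field names are assumptions."""
    deploy_state: str        # e.g. "DEGRADED"
    targets_healthy: int
    targets_total: int
    sbom_state: str          # e.g. "STALE"
    sbom_age_hours: int
    crit_reachable: int
    high_reachable: int
    high_not_reachable: int
    vex_coverage_pct: int
    data_confidence: str     # e.g. "WARN"


def render_status_strip(h: EnvStatusHeader) -> list[str]:
    """Produce the same lines for every tab so the header cannot diverge."""
    return [
        f"Deploy: {h.deploy_state} (targets {h.targets_healthy}/{h.targets_total} healthy)",
        f"SBOM: {h.sbom_state} ({h.sbom_age_hours}h)",
        f"Findings: CritR={h.crit_reachable} HighR={h.high_reachable} "
        f"HighNR={h.high_not_reachable} VEX={h.vex_coverage_pct}%",
        f"Data Confidence: {h.data_confidence}",
    ]


# Values from the us-prod mock.
header = EnvStatusHeader("DEGRADED", 5, 6, "STALE", 26, 2, 0, 3, 62, "WARN")
strip = render_status_strip(header)
```

Because the value is frozen, a tab cannot adjust one field locally; any change has to come from whatever builds the header.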
---
# 18.3 Tab — Overview (env “situation report”)
### Formerly
* Mixed across:
  * **Control Plane** (pipeline + active deployments),
  * **Security Overview** (global),
  * **Platform Health** (platform-wide),
  * **Approvals** (per-promotion)
### Why changed like this
Overview becomes a decision “brief”:
* what is deployed,
* what is pending,
* what is blocking promotions,
* what’s changed in the last 24h.
---
## Overview graph (Mermaid)
```mermaid
flowchart TD
O[Env Overview] --> CUR[Current deployed bundle + digests]
O --> PEND[Pending approvals affecting this env]
O --> ACT[Active/Recent promotion runs]
O --> TOP["Top risks (CritR + stale SBOM + stale feeds)"]
O --> ACTIONS["Recommended actions (scan/rescan/rotate token/request exception)"]
O --> LINKS[Links: Findings, Data Integrity, Inputs, Run Timeline, Evidence]
```
---
## ASCII mock — Overview
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Overview │
│ Formerly: Control Plane summary + scattered Security/Ops context │
├───────────────────────────────────────────────────────────────────────────────┬──────────────┤
│ Current deployment │ Actions │
│ Bundle: Platform Release 1.3.0-rc1 (manifest sha256:beef...) │ [Trigger SBOM │
│ Last promoted: Feb 18, 08:33 by alice.johnson │ rescan] │
│ Components: 14 images (13 scanned, 1 pending) │ [Retry NVD │
│ │ sync] │
│ Promotion posture │ [Open Inputs]│
│ Pending approvals: 1 (BLOCK) │ [Open Run] │
│ Active runs: 0 │ [Export Env │
│ Next scheduled: nightly hotfix window 02:00 │ Snapshot] │
├───────────────────────────────────────────────────────────────────────────────┴──────────────┤
│ Top risks (last 24h) │
│ 1) Crit reachable CVE-2026-1234 (user-service) → no VEX │
│ 2) SBOM stale 26h (nightly rescan failing) │
│ 3) Runtime reachability evidence 35% (agent degraded) │
│ Links: [Open Findings filtered to env] [Open Data Integrity filtered to env] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
---
# 18.4 Tab — Deploy Status (runtime / docker / targets + services)
### Formerly
* Best approximation:
  * **Platform Health** (platform-wide),
  * dashboard pipeline node “Deploy status”,
  * and external systems.
### Why changed like this
You explicitly want env summary to include **docker/runtime**, but it must be coupled with SBOM and risk, not isolated.
---
## Deploy Status graph (Mermaid)
```mermaid
flowchart TD
DEP[Deploy Status] --> TGT[Targets health table]
DEP --> SVC[Services/Workloads status]
DEP --> DRIFT[Config drift vs expected bundle manifest]
DEP --> LOGS[Links to run logs / agent logs]
DEP --> RUN[Open latest promotion run timeline]
```
---
## ASCII mock — Deploy Status
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Deploy Status │
│ Formerly: Platform Health + implicit “docker status” in Control Plane pipeline │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Targets (US-East/us-prod) │
│ ┌───────────────┬───────────┬──────────────┬───────────────┬───────────────────────────────┐ │
│ │ Target │ Agent │ Health │ Last Heartbeat │ Notes │ │
│ ├───────────────┼───────────┼──────────────┼───────────────┼───────────────────────────────┤ │
│ │ docker-us-01 │ agent-01 │ ✓ HEALTHY │ 1m ago │ ok │ │
│ │ docker-us-02 │ agent-02 │ ✓ HEALTHY │ 2m ago │ ok │ │
│ │ docker-us-03 │ agent-03 │ ✗ DEGRADED │ 12m ago │ disk pressure │ │
│ └───────────────┴───────────┴──────────────┴───────────────┴───────────────────────────────┘ │
│ │
│ Services (from deployed bundle manifest) │
│ api-gateway RUNNING ✓ digest sha256:1111... replicas 4/4 │
│ user-service RUNNING ✓ digest sha256:2222... replicas 3/3 │
│ worker RUNNING ✓ digest sha256:4444... replicas 1/1 │
│ web-frontend WARN ⚠ digest sha256:3333... error rate 1.4% │
│ Links: [Open last Promotion Run] [Open agent logs] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
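The header's `Deploy: DEGRADED (targets 5/6 healthy)` line implies a rollup from per-target health. A minimal sketch of such a rollup follows. The state names `HEALTHY` and `DEGRADED` come from the mock, while `UNKNOWN`, `DOWN`, and the thresholds are assumptions made for illustration; `Target` and `deploy_rollup` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    healthy: bool


def deploy_rollup(targets: list[Target]) -> str:
    """Summarize target health for the header; the rule itself is an assumption."""
    total = len(targets)
    healthy = sum(1 for t in targets if t.healthy)
    if total == 0:
        state = "UNKNOWN"      # assumed: no targets registered
    elif healthy == total:
        state = "HEALTHY"
    elif healthy == 0:
        state = "DOWN"         # assumed: nothing is serving
    else:
        state = "DEGRADED"
    return f"{state} (targets {healthy}/{total} healthy)"


# The three rows shown in the targets table above.
targets = [
    Target("docker-us-01", True),
    Target("docker-us-02", True),
    Target("docker-us-03", False),
]
summary = deploy_rollup(targets)
```

With only the three rows visible in the table, the rollup reads `DEGRADED (targets 2/3 healthy)`; the header's 5/6 would come from the full target list.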
---
# 18.5 Tab — SBOM & Findings (deploy inventory + scan freshness + reachable breakdown)
### Formerly
* **Security → Overview / Findings / Vulnerabilities**, but not env-attached and not surfaced alongside SBOM freshness.
### Why changed like this
This is where you get exactly what you asked for:
* “no issues” vs “env with critical reachable issues”
* the deployed images list with **SBOM scan status** and **freshness**
* “reachable” classification remains visible but not a new product area
---
## SBOM & Findings graph (Mermaid)
```mermaid
flowchart TD
SB[SBOM & Findings] --> INV["Deployed inventory (digests)"]
SB --> SCAN[SBOM scan status/freshness per digest]
SB --> SUM[Findings summary CritR/HighR/HighNR + VEX]
SB --> TOP["Top CVEs/packages (filtered)"]
SB --> DRILL[Drill: Finding detail / Component version detail]
SB --> EX[Exceptions/VEX actions]
```
---
## ASCII mock — SBOM & Findings
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ SBOM & Findings │
│ Formerly: Security ▸ Findings / Vulnerabilities (global, not env-attached) │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Findings summary (this env) │
│ Crit reachable: 2 High reachable: 0 High not reachable: 3 VEX coverage: 62% │
│ SBOM freshness: WARN (26h) Missing SBOM: 0 Pending scan: 1 │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Deployed inventory (digest-first) │
│ ┌───────────────┬───────────────┬───────────────────────┬─────────────┬─────────────────────┐ │
│ │ Component │ Version label │ Digest │ SBOM status │ Findings (CritR) │ │
│ ├───────────────┼───────────────┼───────────────────────┼─────────────┼─────────────────────┤ │
│ │ api-gateway │ 2.1.0 │ sha256:1111... │ OK (2h) │ 0 │ │
│ │ user-service │ 3.0.0-rc1 │ sha256:2222... │ OK (26h) │ 2 │ │
│ │ worker │ 3.1.0 │ sha256:4444... │ PENDING │ — │ │
│ └───────────────┴───────────────┴───────────────────────┴─────────────┴─────────────────────┘ │
│ Top issues (click to drill) │
│ - CVE-2026-1234 openssl user-service reachable (no VEX) │
│ - CVE-2026-9001 log4j api-gateway not reachable (VEX present) │
│ Actions: [Trigger SBOM scan/rescan] [Open Findings] [Open VEX/Exceptions] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
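Per-digest freshness can be classified from scan age alone. The sketch below assumes a 24-hour staleness threshold, borrowed from the "stale > 24h" wording in the Data Confidence mock; the threshold, the function names, and the use of `None` for "no completed scan yet" are assumptions, not documented behavior.

```python
from typing import Optional

# Assumed threshold, based on the "stale > 24h" wording in the Data Confidence mock.
STALE_AFTER_HOURS = 24


def sbom_status(age_hours: Optional[int]) -> str:
    """Classify one digest; None means no completed scan exists yet."""
    if age_hours is None:
        return "PENDING"
    return "STALE" if age_hours > STALE_AFTER_HOURS else "OK"


def env_sbom_summary(ages: dict[str, Optional[int]]) -> dict[str, int]:
    """Count digests per status for the env-level summary line."""
    counts = {"OK": 0, "STALE": 0, "PENDING": 0}
    for age in ages.values():
        counts[sbom_status(age)] += 1
    return counts


# Scan ages (hours) for the three components listed in the inventory table.
inventory = {"api-gateway": 2, "user-service": 26, "worker": None}
sbom_counts = env_sbom_summary(inventory)
```

Under this assumed threshold the 26h scan of `user-service` counts as stale, which is consistent with the header reporting `SBOM: STALE (26h)`.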
---
# 18.6 Tab — Reachability (Hybrid B/I/R matrix + evidence age; still 2nd-class)
### Formerly
* Mentioned in approvals/policy but not consistently visible per environment.
### Why changed like this
You require reachability evidence from:
* **image scan (Dover)**
* **build**
* **running environment**
This tab makes the evidence **explicit**, shows coverage and age, and links to the ingest health (Ops) when missing.
---
## Reachability graph (Mermaid)
```mermaid
flowchart TD
RCH[Reachability] --> COV[Coverage Build/Image/Runtime]
RCH --> AGE[Evidence age + confidence]
RCH --> MAT[Per-component B/I/R matrix]
RCH --> DRILL[Drill: component reachability view]
RCH --> OPS[Link: Ops Data Integrity → Reachability ingest health]
```
---
## ASCII mock — Reachability
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Reachability (Hybrid) │
│ Formerly: partial signal in approvals; no consistent per-env view │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Coverage: Build 78% | Image 100% | Runtime 35% │
│ Evidence age: Build 7h | Image 1h | Runtime 26h │
│ Policy interpretation (baseline Prod-US-East): │
│ - Runtime coverage < 50% → WARN (reduces confidence) │
│ - Crit reachable requires runtime evidence OR VEX override → may BLOCK │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Component matrix │
│ api-gateway sha256:1111... Build ✓ Image ✓ Runtime ✗ │
│ user-service sha256:2222... Build ✗ Image ✓ Runtime ✗ │
│ web-frontend sha256:3333... Build ✓ Image ✓ Runtime ✓ │
│ Links: [Open Reachability Ingest Health] [Open component version details] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
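The coverage percentages and the two policy rules in the mock can be sketched as follows. This is an illustration only: `ReachEvidence`, `coverage_pct`, and `reachability_gate` are hypothetical names, the `PASS` state and the precedence of BLOCK over WARN are assumptions, and the 50% threshold is taken from the "Runtime coverage < 50%" line above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReachEvidence:
    component: str
    build: bool
    image: bool
    runtime: bool


def coverage_pct(rows: list[ReachEvidence], source: str) -> int:
    """Percent of components with evidence from "build", "image", or "runtime"."""
    if not rows:
        return 0
    covered = sum(1 for r in rows if getattr(r, source))
    return round(100 * covered / len(rows))


def reachability_gate(runtime_coverage: int, crit_reachable: int,
                      runtime_evidence: bool, vex_override: bool) -> str:
    """Apply the two rules from the mock; BLOCK taking precedence is an assumption."""
    if crit_reachable > 0 and not (runtime_evidence or vex_override):
        return "BLOCK"
    if runtime_coverage < 50:
        return "WARN"
    return "PASS"


# The three rows shown in the component matrix above.
matrix = [
    ReachEvidence("api-gateway", build=True, image=True, runtime=False),
    ReachEvidence("user-service", build=False, image=True, runtime=False),
    ReachEvidence("web-frontend", build=True, image=True, runtime=True),
]
runtime_cov = coverage_pct(matrix, "runtime")
verdict = reachability_gate(runtime_cov, crit_reachable=2,
                            runtime_evidence=False, vex_override=False)
```

The three visible rows yield 67% / 100% / 33% coverage, which differs from the header figures because the header covers every deployed component, not just the rows shown.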
---
# 18.7 Tab — Inputs (Vault/Consul bindings + materialization readiness)
### Formerly
* Split across:
  * Integrations (Vault),
  * environment setup details (not consistently visible),
  * promotion-time failures.
### Why changed like this
This is critical for the bundle organizer workflow: if bindings are missing, **materialization and promotions must block early**, not fail at deploy time.
---
## Inputs graph (Mermaid)
```mermaid
flowchart TD
INP[Inputs] --> BIND["Bindings (Vault/Consul) per required var"]
INP --> MISS[Missing bindings + suggested fixes]
INP --> OV["Overrides (env-specific)"]
INP --> MAT[Materialization readiness for bundle versions]
INP --> INT["Link: Integrations (Vault/Consul)"]
```
---
## ASCII mock — Inputs
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Inputs (Vault/Consul) │
│ Formerly: implicit env config + external Vault/Consul management │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Binding status (required vars from bundle contracts) │
│ api-gateway │
│ - RATE_LIMIT_MAX consul key: service/api-gw/rate_limit_max ✓ bound │
│ - JWT_PUBLIC_KEYS vault path: secret/api-gw/jwt_keys ✓ bound (sealed) │
│ user-service │
│ - DB_PASSWORD vault path: secret/user/db_password ✗ MISSING binding │
│ │
│ Impact: promotions using this env will BLOCK at “Materialize Inputs” │
│ Fix: [Bind missing var] (opens mapping editor) │
│ Links: [Open Vault integration] [Open Consul integration] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
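The "block early" behavior amounts to comparing the variables a bundle requires against the bindings an environment provides. The sketch below assumes a simple in-memory representation; the function names and the `consul:` / `vault:` reference prefixes are hypothetical, while the variable names and paths are taken from the mock.

```python
def missing_bindings(required: dict[str, list[str]],
                     bound: dict[tuple[str, str], str]) -> list[tuple[str, str]]:
    """Return (component, variable) pairs that have no Vault/Consul binding."""
    return [
        (component, var)
        for component, variables in required.items()
        for var in variables
        if (component, var) not in bound
    ]


def materialize_gate(missing: list[tuple[str, str]]) -> str:
    """Block before deploy time when any required variable is unbound."""
    return "BLOCK at Materialize Inputs" if missing else "READY"


# Required variables per component, as listed in the mock.
required = {
    "api-gateway": ["RATE_LIMIT_MAX", "JWT_PUBLIC_KEYS"],
    "user-service": ["DB_PASSWORD"],
}
# Existing bindings; the "consul:" / "vault:" prefixes are an assumed notation.
bound = {
    ("api-gateway", "RATE_LIMIT_MAX"): "consul:service/api-gw/rate_limit_max",
    ("api-gateway", "JWT_PUBLIC_KEYS"): "vault:secret/api-gw/jwt_keys",
}
gaps = missing_bindings(required, bound)
gate = materialize_gate(gaps)
```

Returning the full list of gaps, rather than failing on the first one, lets the tab show every missing binding with its suggested fix in a single pass.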
---
# 18.8 Tab — Promotions & Approvals (env-centric history + what’s pending)
### Formerly
* Promotions were visible under the Releases list and approvals under the Approvals list, but an env-centric “what’s pending for *this env*” view wasn’t first-class.
### Why changed like this
Operators need an env-centric view:
* what bundle versions landed here,
* what is currently running,
* what approvals are blocked,
* and what changed between deployed and proposed.
---
## Promotions & Approvals graph (Mermaid)
```mermaid
flowchart TD
PR[Promotions & Approvals] --> PEND[Pending approvals targeting this env]
PR --> RUNS["Recent promotion runs (timeline links)"]
PR --> DIFF[Diff proposed vs deployed bundle version]
PR --> EVID[Evidence links per run]
PR --> REL[Link: Releases filtered to this env]
PR --> APPR[Link: Approvals filtered to this env]
```
---
## ASCII mock — Promotions & Approvals
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Promotions & Approvals │
│ Formerly: separate Releases list + Approvals list; env-centric view missing │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Pending approvals (this env) │
│ - Platform Release 1.3.0-rc1 → us-prod Gates: BLOCK (CritR + SBOM pending) [Open Approval] │
│ │
│ Recent promotions │
│ Feb 18 08:33 Hotfix Bundle 1.2.4 Status: DEPLOYED [Open Run] [Evidence] │
│ Feb 11 02:10 Platform Release 1.2.3 Status: DEPLOYED [Open Run] [Evidence] │
│ │
│ Diff (proposed vs deployed) │
│ Proposed: Platform 1.3.0-rc1 vs Deployed: Hotfix 1.2.4 │
│ Changed components: user-service, api-gateway │
│ [Open Diff] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
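Since the inventory is digest-first, the proposed-versus-deployed diff can be computed by comparing component-to-digest maps. The following is a minimal sketch; `changed_components` is a hypothetical name, and the digest strings are obvious placeholders rather than real values from any bundle.

```python
def changed_components(deployed: dict[str, str],
                       proposed: dict[str, str]) -> dict[str, list[str]]:
    """Compare component -> digest maps and classify the differences."""
    common = deployed.keys() & proposed.keys()
    return {
        "changed": sorted(c for c in common if deployed[c] != proposed[c]),
        "added": sorted(proposed.keys() - deployed.keys()),
        "removed": sorted(deployed.keys() - proposed.keys()),
    }


# Placeholder digests for illustration only.
deployed = {
    "api-gateway": "sha256:placeholder-gw-old",
    "user-service": "sha256:placeholder-user-old",
    "worker": "sha256:placeholder-worker",
}
proposed = {
    "api-gateway": "sha256:placeholder-gw-new",
    "user-service": "sha256:placeholder-user-new",
    "worker": "sha256:placeholder-worker",
}
diff = changed_components(deployed, proposed)
```

Comparing digests instead of version labels means a rebuilt image under an unchanged label still shows up as changed.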
---
# 18.9 Tab — Data Confidence (env-scoped slice of Ops: Data Integrity)
### Formerly
* Data issues existed, but approvers/operators had to jump out to Ops/Settings.
### Why changed like this
This tab makes the environment’s security posture honest:
* If feeds are stale or rescans are failing, the env’s “SBOM status” is not reliable.
* This is *not* duplicating Ops; it’s an env-scoped summary with deep links.
---
## Data Confidence graph (Mermaid)
```mermaid
flowchart TD
DC[Data Confidence] --> FEED["Feeds freshness (env/region scoped)"]
DC --> JOB["Relevant jobs (rescan, reachability ingest)"]
DC --> INT[Integrations relevant to this env]
DC --> DLQ[DLQ counts affecting this env]
DC --> LINK["Open Ops Data Integrity (filtered)"]
```
---
## ASCII mock — Data Confidence
```text
┌───────────────────────────────────────────────────────────────────────────────┐
│ Data Confidence │
│ Formerly: Ops Feeds + System Jobs + Integrations (manual correlation) │
├───────────────────────────────────────────────────────────────────────────────┤
│ Feeds (region: US-East) │
│ OSV OK (20m) NVD WARN (3h) KEV OK (3h) │
│ Jobs impacting this env │
│ sbom-nightly-rescan: FAIL → 12 deployed digests stale > 24h │
│ reachability-runtime-ingest: WARN → runtime evidence age 26h │
│ Integrations │
│ Registry WARN (token expiry soon) Jenkins DEGRADED Vault OK Consul OK │
│ DLQ │
│ runtime-ingest bucket: 1,230 │
│ Link: [Open Ops → Data Integrity (US-East + us-prod filter)] │
└───────────────────────────────────────────────────────────────────────────────┘
```
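The header's `Data Confidence: WARN (...)` value reads like a rollup of the signals on this tab plus a list of reasons. The sketch below assumes the simplest rule consistent with the mock: any non-OK signal or a non-empty DLQ yields `WARN`. The source does not say when confidence would escalate beyond `WARN`, so no such state is modeled; `data_confidence` is a hypothetical name.

```python
def data_confidence(signals: dict[str, str], dlq_count: int = 0) -> tuple[str, list[str]]:
    """Roll feed, job, and integration states up to one value plus its reasons.

    Assumed rule: OK only when every signal is OK and the DLQ is empty.
    """
    reasons = [f"{name} {state}" for name, state in signals.items() if state != "OK"]
    if dlq_count > 0:
        reasons.append(f"DLQ runtime {dlq_count:,}")
    return ("OK" if not reasons else "WARN", reasons)


# Signal states as shown in the mock above.
signals = {
    "OSV": "OK",
    "NVD": "WARN",
    "KEV": "OK",
    "sbom-nightly-rescan": "FAIL",
    "reachability-runtime-ingest": "WARN",
    "Registry": "WARN",
    "Jenkins": "DEGRADED",
    "Vault": "OK",
    "Consul": "OK",
}
state, reasons = data_confidence(signals, dlq_count=1230)
```

Keeping the reasons alongside the state lets the header explain the rollup inline while deep links carry the detail.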
---
# 18.10 Tab — Evidence & Audit (env snapshot export + last proof chain refs)
### Formerly
* Evidence existed globally:
  * Evidence Bundles
  * Export
  * Proof Chains

But env-centric export (“give me the state of us-prod at time T”) wasn’t obvious.
### Why changed like this
Auditors often ask for evidence for a release *and* the resulting deployed state in the env.
This tab provides env snapshot exports and links to the latest promotion evidence packs.
---
## Evidence & Audit graph (Mermaid)
```mermaid
flowchart TD
EV[Evidence & Audit] --> SNAP[Export Env Snapshot]
EV --> LAST[Latest promotion evidence pack]
EV --> CHAIN["Proof chain refs (if sealed)"]
EV --> AUDIT["Env audit trail (who changed inputs/bindings/policy)"]
EV --> EXPORT[Open Evidence Export Center]
```
---
## ASCII mock — Evidence & Audit
```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Evidence & Audit │
│ Formerly: Evidence pages existed, but env-centric exports were not obvious │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Export options │
│ [Export Env Snapshot] includes: deployed bundle manifest, digests, SBOM status, findings, │
│ reachability summary, data confidence snapshot, timestamps │
│ │
│ Latest promotion evidence │
│ Hotfix Bundle 1.2.4 → us-prod evidence-pack.tar.gz (sealed) [Open] [Download] │
│ Proof chain refs: chain-9912 (valid) │
│ Audit trail (env config): │
│ - Feb 18 07:10: Vault token rotated (registry rescan recovered) │
│ - Feb 18 06:40: baseline changed Prod-US-East (gate tightened) │
│ Link: [Open Evidence Export Center] │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```
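An env snapshot export can be sketched as a deterministic JSON document containing the sections listed under "Export options". The key names, the `build_env_snapshot` function, and the choice of JSON are assumptions for illustration; the actual export format is not specified here.

```python
import json
from datetime import datetime, timezone

# Section keys mirror the "includes:" list in the mock; exact names are assumptions.
REQUIRED_SECTIONS = {
    "bundle_manifest", "digests", "sbom_status",
    "findings", "reachability_summary", "data_confidence",
}


def build_env_snapshot(env: str, taken_at: datetime, sections: dict) -> str:
    """Serialize an env snapshot with sorted keys so two exports can be compared."""
    missing = REQUIRED_SECTIONS - set(sections)
    if missing:
        raise ValueError(f"snapshot incomplete, missing: {sorted(missing)}")
    payload = {
        "environment": env,
        "taken_at": taken_at.astimezone(timezone.utc).isoformat(),
        **sections,
    }
    return json.dumps(payload, sort_keys=True, indent=2)


# Sample values loosely based on the mocks; digests stay as placeholders.
snapshot = build_env_snapshot(
    "us-prod",
    datetime(2026, 2, 18, 8, 33, tzinfo=timezone.utc),
    {
        "bundle_manifest": "placeholder-manifest-ref",
        "digests": {"api-gateway": "placeholder-digest"},
        "sbom_status": "STALE",
        "findings": {"crit_reachable": 2, "high_reachable": 0, "high_not_reachable": 3},
        "reachability_summary": {"build": 78, "image": 100, "runtime": 35},
        "data_confidence": "WARN",
    },
)
```

Rejecting incomplete snapshots up front avoids handing an auditor an export that silently omits, for example, the data confidence state at time T.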
---
## What this pack accomplishes (directly matching your requirements)
* Every environment now shows **deploy health + image SBOM status** together (not separate worlds).
* The environment header includes:
  * **Crit reachable** and reachable-class breakdown
  * **Hybrid reachability B/I/R** + evidence age
  * **Data Confidence** derived from nightly jobs, feed freshness, integrations, DLQ
* Approvals/Releases/Dashboard can link to Env Detail and always show the same standardized status strip.
---
If you want to continue, **Pack 19** can consolidate the Security area so “Findings / Vulnerabilities / SBOM Lake / SBOM Graph / VEX / Exceptions” are organized around **release decisions + audit outputs** (keeping reachability second-class and preserving all the PoC screens).