## Pack 14 — Release Run / Deployment Timeline (workflow checkpoints, logs, rollback, evidence capture, replay/verify)

This pack adds the **“run view”** that ties together everything Stella Ops promises: *promote by digest, explain every decision, evidence-backed audit, deterministic replay* — without turning reachability into a top-level area.

---

# 14.1 Menu graph (Mermaid) — where “Release Run” sits in the IA

```mermaid
flowchart TD
  ROOT[Stella Ops Console] --> REL[Releases]
  ROOT --> APPR[Approvals]
  ROOT --> EVID[Evidence]
  ROOT --> OPS[Operations]
  ROOT --> RC["Release Control (ROOT)"]
  ROOT --> INT[Integrations]
  ROOT --> SEC[Security]

  REL --> REL_LIST["Releases (Promotions)"]
  REL_LIST --> PROMO_DETAIL[Promotion Detail]
  PROMO_DETAIL --> RUN_TAB[Run / Timeline]
  RUN_TAB --> STEP_DETAIL[Step Detail: logs + artifacts + evidence]
  RUN_TAB --> ROLLBACK[Rollback / Re-run]
  RUN_TAB --> SCHEDULE[Schedule / Automation]

  STEP_DETAIL -. export evidence .-> EVID
  STEP_DETAIL -. replay policy .-> EVID
  RUN_TAB -. ops health .-> OPS

  EVID --> PKT[Packets]
  EVID --> CHAIN[Proof Chains]
  EVID --> REPLAY[Replay/Verify]
  EVID --> EXPORT[Export Center]
  EVID --> BUNDLES[Evidence Bundles]

  OPS --> ORCH[Orchestrator]
  OPS --> SCHED[Scheduler Runs]
  OPS --> DLQ[Dead Letter]
  OPS --> FEEDS[Feeds + AirGap Ops]
  OPS --> HEALTH[Platform Health]

  RUN_TAB -. links to .-> ORCH
  RUN_TAB -. links to .-> SCHED
  RUN_TAB -. links to .-> FEEDS
  RUN_TAB -. links to .-> HEALTH

  PROMO_DETAIL -. findings snapshot .-> SEC
  PROMO_DETAIL -. env inputs .-> RC
  PROMO_DETAIL -. secrets/providers .-> INT
```

---

# 14.2 Run lifecycle graph (Mermaid) — promotion execution stages + checkpoints

```mermaid
flowchart LR
  A[Promotion Created] --> B[Inputs Materialized]
  B --> C[Policy Gate Eval]
  C --> D{Approval Required?}
  D -- yes --> E[Approval Decision]
  D -- no --> F[Deploy Workflow Start]
  E --> F
  F --> G[Canary 10%]
  G --> H{SLO/Health OK?}
  H -- no --> R[Auto-Rollback / Pause]
  H -- yes --> I[Canary 50%]
  I --> J{SLO/Health OK?}
  J -- no --> R
  J -- yes --> K[100% Rollout]
  K --> L[Post-Deploy Verify]
  L --> M[Finalize + Seal Evidence]
  M --> N[Promotion Complete]

  %% Evidence capture points
  C -. DSSE policy decision .-> EV[Evidence Pack]
  F -. provenance/attestations .-> EV
  L -. runtime reachability snapshot .-> EV
  M -. Rekor/tlog receipts .-> EV
```

---

# 14.3 Screen — Run / Timeline (Promotion Run)

### Formerly (where it lived pre-redesign)

Pieces existed but were **fragmented**:

* The **Control Plane** dashboard showed *Active Deployments* (high-level only).
* **Operations → Orchestrator** (jobs access) and **Operations → Scheduler** (runs) were operational views, not a “release narrative”.
* Evidence was in **Evidence → Packets / Proof Chains / Export**, but not tied to a run timeline.
* Any detailed logs typically lived outside Stella (CI/CD, deploy system, cluster logs).

### Why changed like this

* A release promotion must be **auditable as a single storyline**:

  * what happened,
  * when,
  * what data it used,
  * what it decided,
  * what evidence was sealed at each checkpoint,
  * and what actions are safe now (pause, rollback, replay).
* This screen becomes the **single pane** that links out to specialized areas (Ops, Evidence), instead of forcing users to hunt.

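
The storyline fields listed above (what happened, when, what data, what decision, what evidence, which actions are safe) can be sketched as a small data model. This is a minimal illustration only: the class names, field names, and the rules inside `safe_actions` are assumptions made for the example, not a Stella Ops schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class StepStatus(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


@dataclass
class Step:
    """One checkpoint on the timeline (hypothetical shape)."""
    name: str                                     # what happened
    status: StepStatus = StepStatus.PENDING
    started_at: Optional[str] = None              # when
    inputs: dict = field(default_factory=dict)    # what data it used
    decision: Optional[str] = None                # what it decided
    evidence: list = field(default_factory=list)  # what was sealed here


@dataclass
class PromotionRun:
    promotion: str
    manifest_digest: str
    steps: list

    def current_step(self) -> Optional[Step]:
        """First step that is running or failed; None if nothing is in flight."""
        for step in self.steps:
            if step.status in (StepStatus.RUNNING, StepStatus.FAILED):
                return step
        return None

    def safe_actions(self) -> list:
        """Assumed rule set for which actions the screen would enable."""
        actions = []
        current = self.current_step()
        if current is not None and current.status is StepStatus.RUNNING:
            actions += ["pause", "rollback"]
        elif current is not None and current.status is StepStatus.FAILED:
            actions += ["retry", "rollback"]
        # Replay needs at least one decision with sealed evidence behind it.
        if any(s.decision and s.evidence for s in self.steps):
            actions.append("replay")
        return actions


# Sample values mirror the mock on this screen.
run = PromotionRun(
    promotion="Platform Release 1.3.0-rc1",
    manifest_digest="sha256:beef...",
    steps=[
        Step("Inputs Materialized", StepStatus.SUCCEEDED, "08:30",
             evidence=["resolved-inputs.json"]),
        Step("Gate Eval (Policy)", StepStatus.SUCCEEDED, "08:31",
             decision="PASS/WARN", evidence=["policy-decision.dsse"]),
        Step("Approval", StepStatus.SUCCEEDED, "08:32", decision="APPROVED"),
        Step("Deploy Canary 10%", StepStatus.RUNNING, "08:33"),
        Step("Deploy Canary 50%"),
    ],
)

print(run.current_step().name)
print(run.safe_actions())
```

Deriving the enabled actions from step state, rather than storing them, keeps the quick-action buttons consistent with the timeline by construction.
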
### Screen graph (Mermaid)

```mermaid
flowchart TD
  A[Run / Timeline] --> B[Stage timeline with checkpoints]
  A --> C[Current status + next step]
  A --> D[Links to logs, artifacts, evidence]
  A --> E[Actions: pause/retry/rollback]
  A --> F[Data health banner: feeds/jobs/integrations]
  A --> G[Drill into Step Detail]
```

### ASCII mock

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Promotion Run / Timeline                                                                     │
│ Legacy name/location: No single screen. Pieces were Control Plane "Active Deployments" + Ops.│
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Promotion: Platform Release 1.3.0-rc1   manifest sha256:beef...                              │
│ Target: EU-West / eu-stage → eu-prod     Workflow: Canary 10→50→100                          │
│ Status: RUNNING (Canary 10%)     Started: Feb 18, 08:30                                      │
│ Data health: WARN — NVD stale 3h | Rescan job failed (worker) | Jenkins degraded             │
│ Links: [Ops Feeds] [System Jobs] [Integrations]                                              │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Timeline (click any step)                                                                    │
│ 08:30 ✓ Inputs Materialized  (Vault/Consul resolved, 0 missing)           [View]             │
│ 08:31 ✓ Gate Eval (Policy)   PASS/WARN (reach runtime 35%)                [View]             │
│ 08:32 ✓ Approval             APPROVED by bob.smith                        [View]             │
│ 08:33 ▶ Deploy Canary 10%    RUNNING (2/10 targets healthy)               [View] [Pause]     │
│ ----  ○ Deploy Canary 50%    PENDING                                      [—]                │
│ ----  ○ Deploy 100%          PENDING                                      [—]                │
│ ----  ○ Post-Deploy Verify   PENDING                                      [—]                │
│ ----  ○ Seal Evidence        PENDING                                      [—]                │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Quick actions: [Pause] [Retry step] [Rollback] [Export evidence (partial)] [Replay policy]   │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```

---

# 14.4 Screen — Step Detail (Logs + Artifacts + Evidence captured at that checkpoint)

### Formerly

* Logs: CI/CD (e.g., Jenkins), deploy agent logs, platform logs — outside Stella.
* Evidence: visible only under **Evidence** menus and not connected to “the step that created it”.

### Why changed like this

* Step Detail is the “unit of explanation”.
* Every meaningful checkpoint should show:

  * **inputs** used,
  * **outputs** produced,
  * **logs**,
  * **evidence items** sealed (or pending),
  * and **links** to canonical storage (Evidence Packets / Proof Chains).

### Screen graph (Mermaid)

```mermaid
flowchart TD
  A[Step Detail] --> B[Overview: inputs/outputs + timestamps]
  A --> C["Logs (stream / download)"]
  A --> D["Artifacts (manifests, plans, diffs)"]
  A --> E["Evidence items (DSSE, receipts, proofs)"]
  A --> F[Actions: retry step / mark failed / pause]
  A --> G[Jump: Evidence Packet / Proof Chain]
```

### ASCII mock

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Step Detail: Gate Eval (Policy)                                                              │
│ Legacy name/location: gate result surfaced loosely on Approvals; evidence elsewhere.         │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Start: 08:31   End: 08:31:12   Duration: 12s   Result: PASS (2 WARN)                         │
│ Inputs: bundle manifest sha256:beef... | baseline Prod-EU-West | feeds: NVD stale 3h         │
│ Outputs: policy verdict id: verdict-123 | decision digest: sha256:dd77...                    │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Tabs: [Overview] [Logs] [Artifacts] [Evidence]                                               │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Evidence captured                                                                            │
│ ✓ DSSE envelope: policy-decision.dsse (digest sha256:dd77...)                                │
│ ✓ Rekor receipt: rekor-entry.json (tlog index 9918271)                                       │
│ ○ Proof chain: pending until "Seal Evidence" step                                            │
│ Links: [Open Evidence Packet] [Open Proof Chain] [Replay this Verdict]                       │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```

---

# 14.5 Screen — Deploy Stage View (targets, health, checkpoints, rollback triggers)

### Formerly

* “Active Deployments” showed minimal progress.
* Detailed rollout and target health likely lived in your deploy system (outside Stella).
* The Platform Health screen existed, but was not contextualized to a specific promotion.

### Why changed like this

* This is where “release operations” actually happen:

  * show **targets** in the region/env,
  * show **health gates** / SLO checks,
  * show **automatic rollback triggers**,
  * link to platform health and logs.

### Screen graph (Mermaid)

```mermaid
flowchart TD
  A[Deploy Stage View] --> B["Targets table (per region/env)"]
  A --> C[SLO / health checks]
  A --> D[Auto-rollback rules + trigger state]
  A --> E[Actions: pause/continue/rollback]
  A --> F[Link: Platform Health]
```

### ASCII mock

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Step Detail: Deploy Canary 10%                                                               │
│ Legacy name/location: Control Plane "Active Deployments" (summary only)                      │
│                       + external deploy logs                                                 │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Stage: Canary 10%   Policy: proceed if 95% healthy for 5m, error rate < 1%                   │
│ Current: 2/10 healthy | Error rate: 0.4% | Latency p95: 210ms | SLO: OK                      │
│ Auto-rollback trigger: NOT TRIGGERED                                                         │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Targets (EU-West / eu-prod)                                                                  │
│ ┌───────────────┬─────────────┬──────────┬──────────────┬───────────────┐                    │
│ │ Target        │ Version     │ Health   │ Notes        │ Logs          │                    │
│ ├───────────────┼─────────────┼──────────┼──────────────┼───────────────┤                    │
│ │ eu-prod-01    │ bundle@beef │ ✓        │ ok           │ [open]        │                    │
│ │ eu-prod-02    │ bundle@beef │ ✓        │ ok           │ [open]        │                    │
│ │ eu-prod-03    │ old         │ ○        │ pending      │ [open]        │                    │
│ └───────────────┴─────────────┴──────────┴──────────────┴───────────────┘                    │
│ Actions: [Pause] [Continue to 50%] (disabled until criteria met) [Rollback]                  │
│          [Open Platform Health]                                                              │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```

---

# 14.6 Screen — Rollback / Re-run (safe ops controls)

### Formerly

* Rollback existed as a **status** (“ROLLED_BACK”) in the Releases list.
* Actual rollback execution likely happened externally or via Orchestrator privileges.

### Why changed like this

* Rollback must be:

  * explicit,
  * traceable,
  * evidence-backed (what was rolled back, why, and what is the resulting state).
* Re-run is needed for transient failures (e.g., feed sync delay, rescan job retry), but must preserve determinism: a re-run records new, timestamped evidence and keeps the old evidence.

### Screen graph (Mermaid)

```mermaid
flowchart TD
  A[Rollback/Re-run] --> B[Select scope: step / stage / full rollback]
  A --> C["Preview impact (targets + versions)"]
  A --> D[Reason + ticket]
  A --> E[Execute]
  E --> F[Run Timeline updates + evidence appended]
```

### ASCII mock

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Rollback / Re-run                                                                            │
│ Legacy name/location: Release status "ROLLED_BACK" existed;                                  │
│                       rollback execution was not unified                                     │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Promotion: Platform Release 1.3.0-rc1 → EU-West/eu-prod                                      │
│ Current stage: Canary 10% (RUNNING)                                                          │
│                                                                                              │
│ Choose action:                                                                               │
│   ( ) Re-run current step (Deploy Canary 10%)                                                │
│   ( ) Pause promotion                                                                        │
│   ( ) Rollback to previously deployed bundle version (manifest sha256:prev...)               │
│                                                                                              │
│ Preview rollback impact:                                                                     │
│   - 2 targets currently on new bundle → will revert to prev bundle                           │
│   - 8 targets still old → unchanged                                                          │
│                                                                                              │
│ Reason (required): [ incident #1234: elevated latency ]                                      │
│ [Execute] [Cancel]                                                                           │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```

---

# 14.7 Screen — Evidence Timeline (what evidence exists now vs what seals at finalize)

### Formerly

* Evidence existed under:

  * **Evidence → Packets**
  * **Evidence → Proof Chains**
  * **Evidence → Export**
  * **Evidence → Evidence Bundles**

  …but the *relationship to the run stages* wasn’t visible.

### Why changed like this

* Auditors and operators need to answer:

  * “What evidence is already available mid-run?”
  * “What is pending until completion?”
  * “What exactly was sealed and when?”
* This is the bridge between *Ops timeline* and *audit artifacts*.

### Screen graph (Mermaid)

```mermaid
flowchart LR
  A["Evidence Timeline (per promotion)"] --> B[Evidence items by checkpoint]
  A --> C[Open Packet]
  A --> D[Open Proof Chain]
  A --> E[Export Evidence Pack]
  A --> F[Generate Auditor Bundle]
```

### ASCII mock

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Evidence Timeline — Promotion Run                                                            │
│ Legacy name/location: Evidence artifacts existed, but not linked to run checkpoints          │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Checkpoint → Evidence                                                                        │
│ Inputs Materialized                                                                          │
│   ✓ resolved-inputs.json (hash sha256:aa11...)                                               │
│                                                                                              │
│ Gate Eval (Policy)                                                                           │
│   ✓ policy-decision.dsse   ✓ rekor receipt   ✓ verdict-123                                   │
│                                                                                              │
│ Deploy Canary 10%                                                                            │
│   ○ deploy-attestation.dsse (pending)                                                        │
│                                                                                              │
│ Seal Evidence (final)                                                                        │
│   ○ proof-chain.json   ○ audit-pack.tar.gz   ○ evidence-bundle.zip                           │
│                                                                                              │
│ Actions: [Open Evidence Packet] [Open Proof Chain] [Export Pack (partial)]                   │
│          [Generate Auditor Bundle]                                                           │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```

---

# 14.8 Screen — Replay/Verify (contextual replay for *this run*)

### Formerly

* **Evidence → Replay/Verify** (“Verdict Replay”) existed as a standalone screen:

  * user inputs verdict id or image reference,
  * sees replay requests + determinism overview.

### Why changed like this

* Replay should be reachable from where it matters:

  * a specific policy decision checkpoint in a promotion run.
* Keep the existing Replay/Verify functionality, but add a **contextual wrapper**:

  * pre-fills verdict id + bundle digest + env baseline,
  * shows determinism status for this promotion.

### Screen graph (Mermaid)

```mermaid
flowchart TD
  A[Run → Replay/Verify] --> B[Pre-filled replay request]
  B --> C[Replay requests list]
  C --> D[Determinism metrics]
  D --> E[Link: Evidence → Replay/Verify canonical view]
```

### ASCII mock

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Replay/Verify — For this Promotion                                                           │
│ Legacy name/location: "Verdict Replay" (Evidence → Replay/Verify)                            │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Pre-filled replay request                                                                    │
│   Verdict ID: verdict-123                                                                    │
│   Bundle:     Platform Release 1.3.0-rc1 manifest sha256:beef...                             │
│   Baseline:   Prod-EU-West                                                                   │
│   Reason:     [ Audit verification / policy change test ]                                    │
│   [Request Replay]                                                                           │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Recent replay requests (for this promotion)                                                  │
│   rr-001   COMPLETED   Feb 18, 08:30   match                                                 │
│   rr-002   RUNNING     Feb 18, 07:30                                                         │
│ Determinism: total 2 | matching 1 | mismatches 1 | match rate 50%                            │
│ Link: [Open canonical Replay/Verify screen]                                                  │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```

---

# 14.9 Screen — Schedule / Automation (promotion scheduling + link to Scheduler Runs)

### Formerly

* **Operations → Scheduler** existed (“Scheduler Runs”) but was disconnected from promotions.
* The Releases list had statuses, but scheduling wasn’t first-class in the release context.

### Why changed like this

* Scheduling belongs to release operations, but we don’t want a new menu.
* This screen:

  * schedules this promotion (or a step),
  * writes a scheduler job,
  * then links to **Scheduler Runs** for execution diagnostics.

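
One way to picture the scheduler job this screen writes is a start time, a window, and a list of constraints (fresh feeds, completed scans) where some constraints only warn. The sketch below is a guess at that shape: `ScheduledPromotion`, `Precondition`, `Decision`, and `evaluate` are hypothetical names, and the year in the sample timestamps is an arbitrary placeholder.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Precondition:
    name: str
    satisfied: bool
    blocking: bool = True  # False models a "warn only" constraint


@dataclass
class ScheduledPromotion:
    """Hypothetical scheduler job payload; field names are illustrative."""
    promotion: str
    start: datetime
    window: timedelta
    preconditions: list = field(default_factory=list)


@dataclass
class Decision:
    state: str  # "wait" | "run" | "blocked" | "expired"
    blockers: list = field(default_factory=list)
    warnings: list = field(default_factory=list)


def evaluate(job: ScheduledPromotion, now: datetime) -> Decision:
    """Decide whether the promotion may start at `now`."""
    if now < job.start:
        return Decision("wait")
    if now >= job.start + job.window:
        return Decision("expired")
    unmet = [p for p in job.preconditions if not p.satisfied]
    blockers = [p.name for p in unmet if p.blocking]
    warnings = [p.name for p in unmet if not p.blocking]
    return Decision("blocked" if blockers else "run", blockers, warnings)


job = ScheduledPromotion(
    promotion="Hotfix Bundle 1.2.4",
    start=datetime(2026, 2, 19, 2, 0),  # year is an arbitrary placeholder
    window=timedelta(hours=2),
    preconditions=[
        Precondition("feeds fresh", satisfied=True),
        Precondition("rescans complete", satisfied=True),
        Precondition("integrations healthy", satisfied=False, blocking=False),
    ],
)

decision = evaluate(job, now=datetime(2026, 2, 19, 2, 30))
print(decision.state, decision.warnings)
```

Returning blockers and warnings separately lets the Run Timeline show why a scheduled promotion has not started without a trip to Scheduler Runs.
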
### Screen graph (Mermaid)

```mermaid
flowchart LR
  A[Schedule Promotion] --> B[Choose time/window]
  A --> C["Choose constraints (feeds fresh, scans complete)"]
  A --> D[Create scheduler job]
  D --> E[View Scheduler Runs]
  E --> F[Back to Run Timeline]
```

### ASCII mock

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ Schedule Promotion                                                                           │
│ Legacy name/location: Ops → Scheduler (runs), no promotion-level scheduling UI               │
├──────────────────────────────────────────────────────────────────────────────────────────────┤
│ Promotion: Hotfix Bundle 1.2.4 → US-East/us-prod                                             │
│                                                                                              │
│ Schedule: [ Feb 19, 02:00 AM ]   Window: [ 2h ]                                              │
│ Preconditions:                                                                               │
│   [x] NVD/OSV feeds fresh (< 1h)                                                             │
│   [x] SBOM rescans complete                                                                  │
│   [ ] Integrations healthy (warn only)                                                       │
│                                                                                              │
│ [Create Schedule]   Link: [Open Scheduler Runs]                                              │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
```

---

## Result: what you gain with Pack 14

* A promotion is now a **single auditable narrative**:

  * timeline + logs + checkpoints,
  * policy decision trace,
  * deploy stage health gates,
  * rollback controls,
  * evidence sealing,
  * deterministic replay.
* Hybrid reachability becomes a **second-class** signal woven into checkpoints (Policy + Post-Deploy Verify), not a top-level section.
* Existing PoC pages remain valid, but are now **linked meaningfully** from the run storyline.

---

If you want the next pack: **Pack 15** will unify **Nightly Ops Report + Data Freshness** (feeds, rescans, integration degradation) into a single **Operations “Data Integrity”** view and show how it bubbles up to Dashboard/Releases/Approvals without duplicating screens.