sln build fix (again), tests fixes, audit work and doctors work
@@ -0,0 +1,333 @@
# SPRINT INDEX: Release Orchestrator Implementation

> **Epic:** Stella Ops Suite - Release Control Plane
> **Batch:** 100
> **Status:** DONE (All 11 phases completed)
> **Created:** 10-Jan-2026
> **Source:** [Architecture Specification](../product/advisories/09-Jan-2026%20-%20Stella%20Ops%20Orchestrator%20Architecture.md)

---

## Overview

This sprint batch implements the **Release Orchestrator** - transforming Stella Ops from a vulnerability scanning platform into **Stella Ops Suite**, a unified release control plane for non-Kubernetes container environments.

### Business Value

- **Unified release governance:** Single pane of glass for release lifecycle
- **Audit-grade evidence:** Cryptographically signed proof of every decision
- **Security as a gate:** Reachability-aware scanning integrated into promotion flow
- **Plugin extensibility:** Support for any SCM, CI, registry, and vault
- **Non-K8s first:** Docker, Compose, ECS, Nomad deployment targets

### Key Principles

1. **Digest-first release identity** - Releases are immutable OCI digests, not tags
2. **Evidence for every decision** - Every promotion/deployment produces sealed evidence
3. **Pluggable everything, stable core** - Integrations are plugins; core is stable
4. **No feature gating** - All plans include all features
5. **Offline-first operation** - Core works in air-gapped environments
6. **Immutable generated artifacts** - Every deployment generates stored artifacts
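
To make principle 1 concrete, here is a minimal sketch of a check that an image reference is pinned by digest rather than by tag. `ImageReference` and `IsDigestPinned` are hypothetical names used only for illustration; they are not taken from the Stella Ops codebase, and the check assumes `sha256` digests only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
// Accepts "registry.example.com/app@sha256:<64 lowercase hex chars>"
// and rejects tag-only references such as "registry.example.com/app:1.2.3".
public static class ImageReference
{
    private static readonly Regex DigestSuffix =
        new(@"@sha256:[0-9a-f]{64}$", RegexOptions.Compiled);

    public static bool IsDigestPinned(string reference) =>
        !string.IsNullOrWhiteSpace(reference) && DigestSuffix.IsMatch(reference);
}
```

A release builder could run a check like this on every component before sealing a release, so that a mutable tag never becomes part of a release identity.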

---

## Implementation Phases

| Phase | Batch | Title | Description | Status |
|-------|-------|-------|-------------|--------|
| 1 | 101 | Foundation | Database schema, plugin infrastructure | DONE |
| 2 | 102 | Integration Hub | Connector runtime, built-in integrations | DONE |
| 3 | 103 | Environment Manager | Environments, targets, agent registration | DONE |
| 4 | 104 | Release Manager | Components, versions, release bundles | DONE |
| 5 | 105 | Workflow Engine | DAG execution, step registry | DONE |
| 6 | 106 | Promotion & Gates | Approvals, security gates, decisions | DONE |
| 7 | 107 | Deployment Execution | Deploy orchestrator, artifact generation | DONE |
| 8 | 108 | Agents | Docker, Compose, SSH, WinRM agents | DONE |
| 9 | 109 | Evidence & Audit | Evidence packets, version stickers | DONE |
| 10 | 110 | Progressive Delivery | A/B releases, canary, traffic routing | DONE |
| 11 | 111 | UI Implementation | Dashboard, workflow editor, screens | DONE |

---

## Module Dependencies

```
                    ┌──────────────┐
                    │  AUTHORITY   │ (existing)
                    └──────┬───────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
        ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│    PLUGIN     │  │    INTHUB     │  │    ENVMGR     │
│  (Batch 101)  │  │  (Batch 102)  │  │  (Batch 103)  │
└───────┬───────┘  └───────┬───────┘  └───────┬───────┘
        │                  │                  │
        └──────────┬───────┴──────────────────┘
                   │
                   ▼
           ┌───────────────┐
           │    RELMAN     │
           │  (Batch 104)  │
           └───────┬───────┘
                   │
                   ▼
           ┌───────────────┐
           │    WORKFL     │
           │  (Batch 105)  │
           └───────┬───────┘
                   │
        ┌──────────┴──────────┐
        │                     │
        ▼                     ▼
┌───────────────┐     ┌───────────────┐
│    PROMOT     │     │    DEPLOY     │
│  (Batch 106)  │     │  (Batch 107)  │
└───────┬───────┘     └───────┬───────┘
        │                     │
        │                     ▼
        │             ┌───────────────┐
        │             │    AGENTS     │
        │             │  (Batch 108)  │
        │             └───────┬───────┘
        │                     │
        └──────────┬──────────┘
                   │
                   ▼
           ┌───────────────┐
           │    RELEVI     │
           │  (Batch 109)  │
           └───────┬───────┘
                   │
                   ▼
           ┌───────────────┐
           │    PROGDL     │
           │  (Batch 110)  │
           └───────────────┘
```

---

## Sprint Structure

### Phase 1: Foundation (Batch 101)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 101_001 | Database Schema - Core Tables | DB | - |
| 101_002 | Plugin Registry | PLUGIN | 101_001 |
| 101_003 | Plugin Loader & Sandbox | PLUGIN | 101_002 |
| 101_004 | Plugin SDK | PLUGIN | 101_003 |

### Phase 2: Integration Hub (Batch 102)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 102_001 | Integration Manager | INTHUB | 101_002 |
| 102_002 | Connector Runtime | INTHUB | 102_001 |
| 102_003 | Built-in SCM Connectors | INTHUB | 102_002 |
| 102_004 | Built-in Registry Connectors | INTHUB | 102_002 |
| 102_005 | Built-in Vault Connector | INTHUB | 102_002 |
| 102_006 | Doctor Checks | INTHUB | 102_002 |

### Phase 3: Environment Manager (Batch 103)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 103_001 | Environment CRUD | ENVMGR | 101_001 |
| 103_002 | Target Registry | ENVMGR | 103_001 |
| 103_003 | Agent Manager - Core | ENVMGR | 103_002 |
| 103_004 | Inventory Sync | ENVMGR | 103_002, 103_003 |

### Phase 4: Release Manager (Batch 104)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 104_001 | Component Registry | RELMAN | 102_004 |
| 104_002 | Version Manager | RELMAN | 104_001 |
| 104_003 | Release Manager | RELMAN | 104_002 |
| 104_004 | Release Catalog | RELMAN | 104_003 |

### Phase 5: Workflow Engine (Batch 105)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 105_001 | Workflow Template Designer | WORKFL | 101_001 |
| 105_002 | Step Registry | WORKFL | 101_002 |
| 105_003 | Workflow Engine - DAG Executor | WORKFL | 105_001, 105_002 |
| 105_004 | Step Executor | WORKFL | 105_003 |
| 105_005 | Built-in Steps | WORKFL | 105_004 |

### Phase 6: Promotion & Gates (Batch 106)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 106_001 | Promotion Manager | PROMOT | 104_003, 103_001 |
| 106_002 | Approval Gateway | PROMOT | 106_001 |
| 106_003 | Gate Registry | PROMOT | 106_001 |
| 106_004 | Security Gate | PROMOT | 106_003 |
| 106_005 | Decision Engine | PROMOT | 106_002, 106_003 |

### Phase 7: Deployment Execution (Batch 107)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 107_001 | Deploy Orchestrator | DEPLOY | 105_003, 106_005 |
| 107_002 | Target Executor | DEPLOY | 107_001, 103_002 |
| 107_003 | Artifact Generator | DEPLOY | 107_001 |
| 107_004 | Rollback Manager | DEPLOY | 107_002 |
| 107_005 | Deployment Strategies | DEPLOY | 107_002 |

### Phase 8: Agents (Batch 108)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 108_001 | Agent Core Runtime | AGENTS | 103_003 |
| 108_002 | Agent - Docker | AGENTS | 108_001 |
| 108_003 | Agent - Compose | AGENTS | 108_002 |
| 108_004 | Agent - SSH | AGENTS | 108_001 |
| 108_005 | Agent - WinRM | AGENTS | 108_001 |

### Phase 9: Evidence & Audit (Batch 109)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 109_001 | Evidence Collector | RELEVI | 106_005, 107_001 |
| 109_002 | Evidence Signer | RELEVI | 109_001 |
| 109_003 | Version Sticker Writer | RELEVI | 107_002 |
| 109_004 | Audit Exporter | RELEVI | 109_002 |

### Phase 10: Progressive Delivery (Batch 110)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 110_001 | A/B Release Manager | PROGDL | 107_005 |
| 110_002 | Traffic Router Framework | PROGDL | 110_001 |
| 110_003 | Canary Controller | PROGDL | 110_002 |
| 110_004 | Router Plugin - Nginx | PROGDL | 110_002 |

### Phase 11: UI Implementation (Batch 111)

| Sprint ID | Title | Module | Dependencies |
|-----------|-------|--------|--------------|
| 111_001 | Dashboard - Overview | FE | 107_001 |
| 111_002 | Environment Management UI | FE | 103_001 |
| 111_003 | Release Management UI | FE | 104_003 |
| 111_004 | Workflow Editor | FE | 105_001 |
| 111_005 | Promotion & Approval UI | FE | 106_001 |
| 111_006 | Deployment Monitoring UI | FE | 107_001 |
| 111_007 | Evidence Viewer | FE | 109_002 |
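
The Dependencies columns above describe a directed acyclic graph of sprints. As a rough illustration of how an execution order could be derived from them, the sketch below applies Kahn's algorithm. `SprintPlanner` is a hypothetical name, and the sketch assumes every dependency also appears as a key in the input map.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class SprintPlanner
{
    // Returns sprint IDs ordered so that each sprint follows its dependencies
    // (Kahn's algorithm; ties are broken by ordinal ID order).
    // Assumes every dependency is itself a key of dependsOn.
    public static IReadOnlyList<string> Order(IReadOnlyDictionary<string, string[]> dependsOn)
    {
        // Number of unfinished dependencies per sprint.
        var remaining = dependsOn.ToDictionary(kv => kv.Key, kv => kv.Value.Length);

        // Reverse edges: dependency -> sprints waiting on it.
        var dependents = new Dictionary<string, List<string>>();
        foreach (var (sprint, deps) in dependsOn)
        {
            foreach (var dep in deps)
            {
                if (!dependents.TryGetValue(dep, out var list))
                    dependents[dep] = list = new List<string>();
                list.Add(sprint);
            }
        }

        var ready = new SortedSet<string>(
            remaining.Where(kv => kv.Value == 0).Select(kv => kv.Key),
            StringComparer.Ordinal);
        var order = new List<string>();

        while (ready.Count > 0)
        {
            var next = ready.Min!;
            ready.Remove(next);
            order.Add(next);

            if (!dependents.TryGetValue(next, out var unblocked)) continue;
            foreach (var sprint in unblocked)
            {
                if (--remaining[sprint] == 0) ready.Add(sprint);
            }
        }

        if (order.Count != remaining.Count)
            throw new InvalidOperationException("Dependency cycle or unknown dependency detected.");

        return order;
    }
}
```

Feeding it the Phase 1 rows plus `102_001` and `102_002` yields an order in which `101_002` precedes both `101_003` and `102_001`, matching the tables.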

---

## Documentation References

All architecture documentation is available in:

```
docs/modules/release-orchestrator/
├── README.md                     # Entry point
├── design/
│   ├── principles.md             # Design principles
│   └── decisions.md              # ADRs
├── modules/
│   ├── overview.md               # Module landscape
│   ├── integration-hub.md        # INTHUB spec
│   ├── environment-manager.md    # ENVMGR spec
│   ├── release-manager.md        # RELMAN spec
│   ├── workflow-engine.md        # WORKFL spec
│   ├── promotion-manager.md      # PROMOT spec
│   ├── deploy-orchestrator.md    # DEPLOY spec
│   ├── agents.md                 # AGENTS spec
│   ├── progressive-delivery.md   # PROGDL spec
│   ├── evidence.md               # RELEVI spec
│   └── plugin-system.md          # PLUGIN spec
├── data-model/
│   ├── schema.md                 # PostgreSQL schema
│   └── entities.md               # Entity definitions
├── api/
│   └── overview.md               # API design
├── workflow/
│   ├── templates.md              # Template spec
│   ├── execution.md              # Execution state machine
│   └── promotion.md              # Promotion state machine
├── security/
│   ├── overview.md               # Security architecture
│   ├── auth.md                   # AuthN/AuthZ
│   ├── agent-security.md         # Agent security
│   └── threat-model.md           # Threat model
├── deployment/
│   ├── overview.md               # Deployment architecture
│   ├── strategies.md             # Deployment strategies
│   └── artifacts.md              # Artifact generation
├── integrations/
│   ├── overview.md               # Integration types
│   ├── connectors.md             # Connector interface
│   ├── webhooks.md               # Webhook architecture
│   └── ci-cd.md                  # CI/CD patterns
├── operations/
│   ├── overview.md               # Observability
│   └── metrics.md                # Prometheus metrics
├── ui/
│   └── overview.md               # UI specification
└── appendices/
    ├── glossary.md               # Terms
    ├── errors.md                 # Error codes
    └── evidence-schema.md        # Evidence format
```

---

## Technology Stack

| Layer | Technology |
|-------|------------|
| Backend | .NET 10, C# preview |
| Database | PostgreSQL 16+ |
| Message Queue | RabbitMQ / Valkey |
| Frontend | Angular 17 |
| Agent Runtime | .NET AOT |
| Plugin Runtime | gRPC, container sandbox |
| Observability | OpenTelemetry, Prometheus |

---

## Risk Register

| Risk | Impact | Mitigation |
|------|--------|------------|
| Plugin security vulnerabilities | High | Sandbox isolation, capability restrictions |
| Agent compromise | High | mTLS, short-lived credentials, audit |
| Evidence tampering | High | Append-only DB, cryptographic signing |
| Registry unavailability | Medium | Connection pooling, caching, fallbacks |
| Complex workflow failures | Medium | Comprehensive testing, rollback support |

---

## Success Criteria

- [x] Complete database schema for all 10 themes
- [x] Plugin system supports connector, step, gate types
- [x] At least 2 built-in connectors per integration type
- [x] Environment -> Release -> Promotion -> Deploy flow works E2E
- [x] Evidence packet generated for every deployment
- [x] Agent deploys to Docker and Compose targets
- [x] UI shows pipeline overview, approval queues, deployment logs
- [x] Performance: <500ms API P99, <5min deployment for 10 targets

---

## Execution Log

| Date | Entry |
|------|-------|
| 10-Jan-2026 | Sprint index created |
| 10-Jan-2026 | Architecture documentation complete |
| 10-Jan-2026 | Batches 101-106 (Phases 1-6) implemented and archived |
| 11-Jan-2026 | Batches 108-111 (Phases 8-11) implemented and archived |
| 12-Jan-2026 | Status corrected: 10/11 phases DONE. Batch 107 (Phase 7, Deployment Execution) remains TODO |
| 12-Jan-2026 | Batch 107 sprints moved back to docs/implplan for active work |
| 12-Jan-2026 | Batch 107 review: All 5 sprints (107_001-107_005) found already DONE with 179 tests total |
| 12-Jan-2026 | Batch 107 INDEX corrected to DONE status |
| 12-Jan-2026 | Release Orchestrator COMPLETED - all 11 phases DONE |
@@ -0,0 +1,261 @@
# SPRINT INDEX: Phase 7 - Deployment Execution

> **Epic:** Release Orchestrator
> **Phase:** 7 - Deployment Execution
> **Batch:** 107
> **Status:** DONE
> **Parent:** [100_000_INDEX](SPRINT_20260110_100_000_INDEX_release_orchestrator.md)

---

## Overview

Phase 7 implements the Deployment Execution system - orchestrating the actual deployment of releases to targets via agents.

### Objectives

- Deploy orchestrator coordinates multi-target deployments
- Target executor dispatches tasks to agents
- Artifact generator creates deployment artifacts
- Rollback manager handles failure recovery
- Deployment strategies (rolling, blue-green, canary, all-at-once) control how targets are batched

---

## Sprint Structure

| Sprint ID | Title | Module | Status | Dependencies |
|-----------|-------|--------|--------|--------------|
| 107_001 | Deploy Orchestrator | DEPLOY | DONE | 105_003, 106_005 |
| 107_002 | Target Executor | DEPLOY | DONE | 107_001, 103_002 |
| 107_003 | Artifact Generator | DEPLOY | DONE | 107_001 |
| 107_004 | Rollback Manager | DEPLOY | DONE | 107_002 |
| 107_005 | Deployment Strategies | DEPLOY | DONE | 107_002 |

---

## Architecture

```
┌───────────────────────────────────────────────────────────────────────┐
│                         DEPLOYMENT EXECUTION                          │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │                   DEPLOY ORCHESTRATOR (107_001)                   │ │
│ │                                                                   │ │
│ │  ┌─────────────────────────────────────────────────────────────┐  │ │
│ │  │ Deployment Job                                              │  │ │
│ │  │   promotion_id: uuid                                        │  │ │
│ │  │   strategy: rolling                                         │  │ │
│ │  │   targets: [target-1, target-2, target-3]                   │  │ │
│ │  │   status: deploying                                         │  │ │
│ │  └─────────────────────────────────────────────────────────────┘  │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │                     TARGET EXECUTOR (107_002)                     │ │
│ │                                                                   │ │
│ │     ┌─────────────┐      ┌─────────────┐      ┌─────────────┐     │ │
│ │     │ Target 1    │      │ Target 2    │      │ Target 3    │     │ │
│ │     │ ✓ Done      │      │ ⟳ Running   │      │ ○ Pending   │     │ │
│ │     └─────────────┘      └─────────────┘      └─────────────┘     │ │
│ │                                                                   │ │
│ │     Task dispatch via gRPC to agents                              │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │                   ARTIFACT GENERATOR (107_003)                    │ │
│ │                                                                   │ │
│ │  Generated artifacts for each deployment:                         │ │
│ │  ├── compose.stella.lock.yml   (digested compose file)            │ │
│ │  ├── stella.version.json       (version sticker)                  │ │
│ │  └── deployment-manifest.json  (full deployment record)           │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │                    ROLLBACK MANAGER (107_004)                     │ │
│ │                                                                   │ │
│ │  On failure:                                                      │ │
│ │  1. Stop pending tasks                                            │ │
│ │  2. Rollback completed targets to previous version                │ │
│ │  3. Generate rollback evidence                                    │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │                  DEPLOYMENT STRATEGIES (107_005)                  │ │
│ │                                                                   │ │
│ │  Rolling:     [■■□□□] → [■■■□□] → [■■■■□] → [■■■■■]               │ │
│ │  Blue-Green:  [■■■■■] ──swap──► [□□□□□] (instant cutover)         │ │
│ │  Canary:      [■□□□□] → [■■□□□] → [■■■□□] → [■■■■■] (gradual)     │ │
│ │  All-at-once: [□□□□□] → [■■■■■] (simultaneous)                    │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘
```

---

## Deliverables Summary

### 107_001: Deploy Orchestrator

| Deliverable | Type | Description |
|-------------|------|-------------|
| `IDeployOrchestrator` | Interface | Deployment coordination |
| `DeployOrchestrator` | Class | Implementation |
| `DeploymentJob` | Model | Job entity |
| `DeploymentScheduler` | Class | Task scheduling |

### 107_002: Target Executor

| Deliverable | Type | Description |
|-------------|------|-------------|
| `ITargetExecutor` | Interface | Target deployment |
| `TargetExecutor` | Class | Implementation |
| `DeploymentTask` | Model | Per-target task |
| `AgentDispatcher` | Class | gRPC task dispatch |

### 107_003: Artifact Generator

| Deliverable | Type | Description |
|-------------|------|-------------|
| `IArtifactGenerator` | Interface | Artifact creation |
| `ComposeLockGenerator` | Class | Digest-locked compose |
| `VersionStickerGenerator` | Class | stella.version.json |
| `DeploymentManifestGenerator` | Class | Full manifest |

### 107_004: Rollback Manager

| Deliverable | Type | Description |
|-------------|------|-------------|
| `IRollbackManager` | Interface | Rollback operations |
| `RollbackManager` | Class | Implementation |
| `RollbackPlan` | Model | Rollback strategy |
| `RollbackExecutor` | Class | Execute rollback |

### 107_005: Deployment Strategies

| Deliverable | Type | Description |
|-------------|------|-------------|
| `IDeploymentStrategy` | Interface | Strategy contract |
| `RollingStrategy` | Strategy | Rolling deployment |
| `BlueGreenStrategy` | Strategy | Blue-green deployment |
| `CanaryStrategy` | Strategy | Canary deployment |
| `AllAtOnceStrategy` | Strategy | Simultaneous deployment |

---

## Key Interfaces

```csharp
public interface IDeployOrchestrator
{
    Task<DeploymentJob> StartAsync(Guid promotionId, DeploymentOptions options, CancellationToken ct);
    Task<DeploymentJob?> GetJobAsync(Guid jobId, CancellationToken ct);
    Task CancelAsync(Guid jobId, CancellationToken ct);
    Task<DeploymentJob> WaitForCompletionAsync(Guid jobId, CancellationToken ct);
}

public interface ITargetExecutor
{
    Task<DeploymentTask> DeployToTargetAsync(Guid jobId, Guid targetId, DeploymentPayload payload, CancellationToken ct);
    Task<DeploymentTask?> GetTaskAsync(Guid taskId, CancellationToken ct);
}

public interface IDeploymentStrategy
{
    string Name { get; }
    Task<IReadOnlyList<DeploymentBatch>> PlanAsync(DeploymentJob job, CancellationToken ct);
    Task<bool> ShouldProceedAsync(DeploymentBatch completedBatch, CancellationToken ct);
}

public interface IRollbackManager
{
    Task<RollbackPlan> PlanAsync(Guid jobId, CancellationToken ct);
    Task<DeploymentJob> ExecuteAsync(RollbackPlan plan, CancellationToken ct);
}
```
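
As an illustration of what a strategy's planning step might produce, the sketch below splits a target list into consecutive batches, which is roughly what a rolling strategy needs. The document uses `DeploymentBatch` (with an `Index`) but never defines it, so the record shape here, `RollingPlanSketch`, and the synchronous signature are assumptions rather than the project's actual types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: only the Index property is implied by the source.
public sealed record DeploymentBatch(int Index, IReadOnlyList<Guid> TargetIds);

public static class RollingPlanSketch
{
    // Splits targets into consecutive batches of at most batchSize targets each.
    public static IReadOnlyList<DeploymentBatch> Plan(IReadOnlyList<Guid> targetIds, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        return targetIds
            .Chunk(batchSize) // Enumerable.Chunk requires .NET 6 or later
            .Select((chunk, index) => new DeploymentBatch(index, chunk))
            .ToList();
    }
}
```

Under this reading, an all-at-once strategy is the special case of a single batch, while a canary strategy would use uneven batch sizes.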

---

## Deployment Flow

```
┌────────────────────────────────────────────────────────────────────────────┐
│                              DEPLOYMENT FLOW                               │
│                                                                            │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │ Promotion   │───►│ Decision    │───►│ Deploy      │───►│ Generate    │  │
│  │ Approved    │    │ Allow       │    │ Start       │    │ Artifacts   │  │
│  └─────────────┘    └─────────────┘    └─────────────┘    └──────┬──────┘  │
│                                                                  │         │
│                    ┌─────────────────────────────────────────────┘         │
│                    │                                                       │
│                    ▼                                                       │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │                           Strategy Execution                           │ │
│ │                                                                        │ │
│ │        Batch 1           Batch 2           Batch 3                     │ │
│ │      ┌─────────┐       ┌─────────┐       ┌─────────┐                   │ │
│ │      │Target-1 │  ──►  │Target-2 │  ──►  │Target-3 │                   │ │
│ │      │ ✓ Done  │       │ ✓ Done  │       │ ⟳ Active│                   │ │
│ │      └────┬────┘       └────┬────┘       └────┬────┘                   │ │
│ │           │                 │                 │                        │ │
│ │           ▼                 ▼                 ▼                        │ │
│ │     Health Check      Health Check      Health Check                   │ │
│ │           │                 │                 │                        │ │
│ │           ▼                 ▼                 ▼                        │ │
│ │     Write Sticker     Write Sticker     Write Sticker                  │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│                                                                            │
│          ┌─────────────────────────────────────────────────────┐           │
│          │                     On Failure                      │           │
│          │                                                     │           │
│          │  1. Stop pending batches                            │           │
│          │  2. Rollback completed targets                      │           │
│          │  3. Generate rollback evidence                      │           │
│          │  4. Update promotion status                         │           │
│          └─────────────────────────────────────────────────────┘           │
│                                                                            │
└────────────────────────────────────────────────────────────────────────────┘
```
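
The first two "On Failure" steps in the diagram can be sketched as a pure planning function: pending work is cancelled and completed targets are queued for rollback. All type names below (`TaskState`, `TargetTask`, `FailurePlan`, `FailureHandling`) are hypothetical, and rolling back the most recent batch first is an assumption of this sketch, not something the document specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the deployment task model.
public enum TaskState { Pending, Running, Completed, Failed }

public sealed record TargetTask(string Target, int BatchIndex, TaskState State);

public sealed record FailurePlan(IReadOnlyList<string> ToCancel, IReadOnlyList<string> ToRollBack);

public static class FailureHandling
{
    public static FailurePlan Plan(IEnumerable<TargetTask> tasks)
    {
        var all = tasks.ToList();

        // Step 1: targets that have not started yet are cancelled.
        var toCancel = all
            .Where(t => t.State == TaskState.Pending)
            .Select(t => t.Target)
            .ToList();

        // Step 2: completed targets are rolled back, latest batch first (assumed order).
        var toRollBack = all
            .Where(t => t.State == TaskState.Completed)
            .OrderByDescending(t => t.BatchIndex)
            .Select(t => t.Target)
            .ToList();

        return new FailurePlan(toCancel, toRollBack);
    }
}
```

Keeping the plan separate from its execution mirrors the `PlanAsync` / `ExecuteAsync` split on `IRollbackManager` and makes the decision easy to unit test.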

---

## Dependencies

| Module | Purpose |
|--------|---------|
| 105_003 Workflow Engine | Workflow execution |
| 106_005 Decision Engine | Deployment approval |
| 103_002 Target Registry | Target information |
| 108_* Agents | Task execution |

---

## Acceptance Criteria

- [x] Deployment job created from promotion
- [x] Tasks dispatched to agents
- [x] Rolling deployment works
- [x] Blue-green deployment works
- [x] Canary deployment works
- [x] Artifacts generated for each target
- [x] Rollback restores previous version
- [x] Health checks gate progression
- [x] Unit test coverage ≥80% (179 tests total across all sprints)

---

## Execution Log

| Date | Entry |
|------|-------|
| 10-Jan-2026 | Phase 7 index created |
| 11-Jan-2026 | Sprint 107_001 Deploy Orchestrator completed (67 tests) |
| 11-Jan-2026 | Sprint 107_002 Target Executor completed (29 new tests, 96 total) |
| 11-Jan-2026 | Sprint 107_003 Artifact Generator completed (37 new tests, 133 total) |
| 11-Jan-2026 | Sprint 107_004 Rollback Manager completed (32 new tests, 165 total) |
| 11-Jan-2026 | Sprint 107_005 Deployment Strategies completed (14 new tests, 179 total) |
| 12-Jan-2026 | Phase 7 INDEX status corrected to DONE - all sprints were already implemented |
| 12-Jan-2026 | Phase 7 Deployment Execution COMPLETED - ready for archival |
@@ -0,0 +1,416 @@
# SPRINT: Deploy Orchestrator

> **Sprint ID:** 107_001
> **Module:** DEPLOY
> **Phase:** 7 - Deployment Execution
> **Status:** DONE
> **Parent:** [107_000_INDEX](SPRINT_20260110_107_000_INDEX_deployment_execution.md)

---

## Overview

Implement the Deploy Orchestrator for coordinating multi-target deployments.

### Objectives

- Create deployment jobs from approved promotions
- Coordinate deployment across multiple targets
- Track deployment progress and status
- Support deployment cancellation

### Working Directory

```
src/ReleaseOrchestrator/
├── __Libraries/
│   └── StellaOps.ReleaseOrchestrator.Deployment/
│       ├── Orchestrator/
│       │   ├── IDeployOrchestrator.cs
│       │   ├── DeployOrchestrator.cs
│       │   ├── DeploymentCoordinator.cs
│       │   └── DeploymentScheduler.cs
│       ├── Store/
│       │   ├── IDeploymentJobStore.cs
│       │   └── DeploymentJobStore.cs
│       └── Models/
│           ├── DeploymentJob.cs
│           ├── DeploymentOptions.cs
│           └── DeploymentStatus.cs
└── __Tests/
```

---

## Deliverables

### IDeployOrchestrator Interface

```csharp
namespace StellaOps.ReleaseOrchestrator.Deployment.Orchestrator;

public interface IDeployOrchestrator
{
    Task<DeploymentJob> StartAsync(Guid promotionId, DeploymentOptions options, CancellationToken ct = default);
    Task<DeploymentJob?> GetJobAsync(Guid jobId, CancellationToken ct = default);
    Task<IReadOnlyList<DeploymentJob>> ListJobsAsync(DeploymentJobFilter? filter = null, CancellationToken ct = default);
    Task CancelAsync(Guid jobId, string? reason = null, CancellationToken ct = default);
    Task<DeploymentJob> WaitForCompletionAsync(Guid jobId, TimeSpan? timeout = null, CancellationToken ct = default);
    Task<DeploymentProgress> GetProgressAsync(Guid jobId, CancellationToken ct = default);
}

public sealed record DeploymentOptions(
    DeploymentStrategy Strategy = DeploymentStrategy.Rolling,
    string? BatchSize = "25%",
    bool WaitForHealthCheck = true,
    bool RollbackOnFailure = true,
    TimeSpan? Timeout = null,
    Guid? WorkflowRunId = null,
    string? CallbackToken = null
);

public enum DeploymentStrategy
{
    Rolling,
    BlueGreen,
    Canary,
    AllAtOnce
}
```
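
`DeploymentOptions.BatchSize` is a string that defaults to `"25%"`, but the listing does not say how it is interpreted. The sketch below shows one plausible reading, where a percentage is rounded up to at least one target and a plain number is an absolute count. `BatchSizing.Resolve` and these semantics are assumptions made for illustration only.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class BatchSizing
{
    // Assumed semantics (not confirmed by the sprint document):
    //   "25%" -> that share of the targets, rounded up, at least 1
    //   "3"   -> an absolute number of targets per batch, capped at targetCount
    //   null  -> all targets in a single batch
    public static int Resolve(string? batchSize, int targetCount)
    {
        if (targetCount <= 0) throw new ArgumentOutOfRangeException(nameof(targetCount));
        if (string.IsNullOrWhiteSpace(batchSize)) return targetCount;

        var text = batchSize.Trim();
        if (text.EndsWith('%'))
        {
            var percent = double.Parse(text[..^1], CultureInfo.InvariantCulture);
            if (percent <= 0 || percent > 100)
                throw new FormatException($"Invalid percentage: {batchSize}");
            return Math.Max(1, (int)Math.Ceiling(targetCount * percent / 100.0));
        }

        var count = int.Parse(text, CultureInfo.InvariantCulture);
        if (count <= 0) throw new FormatException($"Invalid batch size: {batchSize}");
        return Math.Min(count, targetCount);
    }
}
```

Rounding up keeps small environments moving: with three targets, `"25%"` still resolves to one target per batch instead of zero.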

### DeploymentJob Model

```csharp
using System.Collections.Immutable;
using System.Linq;

namespace StellaOps.ReleaseOrchestrator.Deployment.Models;

public sealed record DeploymentJob
{
    public required Guid Id { get; init; }
    public required Guid TenantId { get; init; }
    public required Guid PromotionId { get; init; }
    public required Guid ReleaseId { get; init; }
    public required string ReleaseName { get; init; }
    public required Guid EnvironmentId { get; init; }
    public required string EnvironmentName { get; init; }
    public required DeploymentStatus Status { get; init; }
    public required DeploymentStrategy Strategy { get; init; }
    public required DeploymentOptions Options { get; init; }
    public required ImmutableArray<DeploymentTask> Tasks { get; init; }
    public string? FailureReason { get; init; }
    public string? CancelReason { get; init; }
    public DateTimeOffset StartedAt { get; init; }
    public DateTimeOffset? CompletedAt { get; init; }
    public Guid StartedBy { get; init; }
    public Guid? RollbackJobId { get; init; }

    // Null while the job is still running.
    public TimeSpan? Duration => CompletedAt.HasValue
        ? CompletedAt.Value - StartedAt
        : null;

    public int CompletedTaskCount => Tasks.Count(t => t.Status == DeploymentTaskStatus.Completed);
    public int TotalTaskCount => Tasks.Length;

    // Share of completed tasks, 0-100; 0 when the job has no tasks.
    public double ProgressPercent => TotalTaskCount > 0
        ? (double)CompletedTaskCount / TotalTaskCount * 100
        : 0;
}

public enum DeploymentStatus
{
    Pending,
    Running,
    Completed,
    Failed,
    Cancelled,
    RollingBack,
    RolledBack
}

public sealed record DeploymentTask
{
    public required Guid Id { get; init; }
    public required Guid TargetId { get; init; }
    public required string TargetName { get; init; }
    public required int BatchIndex { get; init; }
    public required DeploymentTaskStatus Status { get; init; }
    public string? AgentId { get; init; }
    public DateTimeOffset? StartedAt { get; init; }
    public DateTimeOffset? CompletedAt { get; init; }
    public string? Error { get; init; }
    public ImmutableDictionary<string, object> Result { get; init; } = ImmutableDictionary<string, object>.Empty;
}

public enum DeploymentTaskStatus
{
    Pending,
    Running,
    Completed,
    Failed,
    Skipped,
    Cancelled
}
```
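
Because the models are immutable records, a status change produces a new job value instead of mutating the existing one. The sketch below shows that pattern on deliberately reduced stand-in types. `SketchJob`, `SketchTask`, `SketchTaskStatus`, and `MarkCompleted` are hypothetical and not part of the sprint deliverables.

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

// Hypothetical, reduced stand-ins for DeploymentTask and DeploymentJob.
public enum SketchTaskStatus { Pending, Completed }

public sealed record SketchTask(Guid Id, SketchTaskStatus Status);

public sealed record SketchJob(ImmutableArray<SketchTask> Tasks)
{
    // Same formula as DeploymentJob.ProgressPercent in the listing above.
    public double ProgressPercent => Tasks.Length > 0
        ? (double)Tasks.Count(t => t.Status == SketchTaskStatus.Completed) / Tasks.Length * 100
        : 0;

    // Returns a copy of the job in which the given task is marked completed.
    public SketchJob MarkCompleted(Guid taskId) =>
        this with
        {
            Tasks = Tasks
                .Select(t => t.Id == taskId ? (t with { Status = SketchTaskStatus.Completed }) : t)
                .ToImmutableArray()
        };
}
```

The original job value stays untouched after `MarkCompleted`, which is what lets the orchestrator re-read and re-save job snapshots between batches without shared mutable state.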
|
||||
|
||||
### DeployOrchestrator Implementation
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Orchestrator;
|
||||
|
||||
public sealed class DeployOrchestrator : IDeployOrchestrator
|
||||
{
|
||||
private readonly IDeploymentJobStore _jobStore;
|
||||
private readonly IPromotionManager _promotionManager;
|
||||
private readonly IReleaseManager _releaseManager;
|
||||
private readonly ITargetRegistry _targetRegistry;
|
||||
private readonly IDeploymentStrategyFactory _strategyFactory;
|
||||
private readonly ITargetExecutor _targetExecutor;
|
||||
private readonly IArtifactGenerator _artifactGenerator;
|
||||
private readonly IEventPublisher _eventPublisher;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly IGuidGenerator _guidGenerator;
|
||||
private readonly ILogger<DeployOrchestrator> _logger;
|
||||
|
||||
public async Task<DeploymentJob> StartAsync(
|
||||
Guid promotionId,
|
||||
DeploymentOptions options,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var promotion = await _promotionManager.GetAsync(promotionId, ct)
|
||||
?? throw new PromotionNotFoundException(promotionId);
|
||||
|
||||
if (promotion.Status != PromotionStatus.Approved)
|
||||
{
|
||||
throw new PromotionNotApprovedException(promotionId);
|
||||
}
|
||||
|
||||
var release = await _releaseManager.GetAsync(promotion.ReleaseId, ct)
|
||||
?? throw new ReleaseNotFoundException(promotion.ReleaseId);
|
||||
|
||||
var targets = await _targetRegistry.ListHealthyAsync(promotion.TargetEnvironmentId, ct);
|
||||
if (targets.Count == 0)
|
||||
{
|
||||
throw new NoHealthyTargetsException(promotion.TargetEnvironmentId);
|
||||
}
|
||||
|
||||
// Create deployment tasks for each target
|
||||
var tasks = targets.Select((target, index) => new DeploymentTask
|
||||
{
|
||||
Id = _guidGenerator.NewGuid(),
|
||||
TargetId = target.Id,
|
||||
TargetName = target.Name,
|
||||
BatchIndex = 0, // Will be set by strategy
|
||||
Status = DeploymentTaskStatus.Pending
|
||||
}).ToImmutableArray();
|
||||
|
||||
var job = new DeploymentJob
|
||||
{
|
||||
Id = _guidGenerator.NewGuid(),
|
||||
TenantId = _tenantContext.TenantId,
|
||||
PromotionId = promotionId,
|
||||
ReleaseId = release.Id,
|
||||
ReleaseName = release.Name,
|
||||
EnvironmentId = promotion.TargetEnvironmentId,
|
||||
EnvironmentName = promotion.TargetEnvironmentName,
|
||||
Status = DeploymentStatus.Pending,
|
||||
Strategy = options.Strategy,
|
||||
Options = options,
|
||||
Tasks = tasks,
|
||||
StartedAt = _timeProvider.GetUtcNow(),
|
||||
StartedBy = _userContext.UserId
|
||||
};
|
||||
|
||||
await _jobStore.SaveAsync(job, ct);
|
||||
|
||||
// Update promotion status
|
||||
await _promotionManager.UpdateStatusAsync(promotionId, PromotionStatus.Deploying, ct);
|
||||
|
||||
await _eventPublisher.PublishAsync(new DeploymentJobStarted(
|
||||
job.Id,
|
||||
job.TenantId,
|
||||
job.ReleaseName,
|
||||
job.EnvironmentName,
|
||||
job.Strategy,
|
||||
targets.Count,
|
||||
_timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
|
||||
_logger.LogInformation(
|
||||
"Started deployment job {JobId} for release {Release} to {Environment} with {TargetCount} targets",
|
||||
job.Id, release.Name, promotion.TargetEnvironmentName, targets.Count);
|
||||
|
||||
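// Start deployment execution in the background (fire-and-forget).
// NOTE: this passes the caller's token, so the run is cancelled if the
// originating request is cancelled; a host-lifetime token may fit better.
_ = ExecuteDeploymentAsync(job.Id, ct);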
|
||||
|
||||
return job;
|
||||
}
|
||||
|
||||
private async Task ExecuteDeploymentAsync(Guid jobId, CancellationToken ct)
|
||||
{
|
||||
try
|
||||
{
|
||||
var job = await _jobStore.GetAsync(jobId, ct);
|
||||
if (job is null) return;
|
||||
|
||||
job = job with { Status = DeploymentStatus.Running };
|
||||
await _jobStore.SaveAsync(job, ct);
|
||||
|
||||
// Get strategy and plan batches
|
||||
var strategy = _strategyFactory.Create(job.Strategy);
|
||||
var batches = await strategy.PlanAsync(job, ct);
|
||||
|
||||
// Execute batches
|
||||
foreach (var batch in batches)
|
||||
{
|
||||
job = await _jobStore.GetAsync(jobId, ct);
|
||||
if (job is null || job.Status == DeploymentStatus.Cancelled) break;
|
||||
|
||||
await ExecuteBatchAsync(job, batch, ct);
|
||||
|
||||
// Check if should continue
|
||||
if (!await strategy.ShouldProceedAsync(batch, ct))
|
||||
{
|
||||
_logger.LogWarning("Strategy halted deployment after batch {BatchIndex}", batch.Index);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
// Complete or fail
|
||||
job = await _jobStore.GetAsync(jobId, ct);
|
||||
if (job is not null && job.Status == DeploymentStatus.Running)
|
||||
{
|
||||
var allCompleted = job.Tasks.All(t => t.Status == DeploymentTaskStatus.Completed);
|
||||
job = job with
|
||||
{
|
||||
Status = allCompleted ? DeploymentStatus.Completed : DeploymentStatus.Failed,
|
||||
CompletedAt = _timeProvider.GetUtcNow()
|
||||
};
|
||||
await _jobStore.SaveAsync(job, ct);
|
||||
|
||||
await NotifyCompletionAsync(job, ct);
|
||||
}
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_logger.LogError(ex, "Deployment job {JobId} failed", jobId);
|
||||
await FailJobAsync(jobId, ex.Message, ct);
|
||||
}
|
||||
}
|
||||
|
||||
private async Task ExecuteBatchAsync(DeploymentJob job, DeploymentBatch batch, CancellationToken ct)
|
||||
{
|
||||
_logger.LogInformation("Executing batch {BatchIndex} with {TaskCount} tasks",
|
||||
batch.Index, batch.TaskIds.Count);
|
||||
|
||||
// Generate artifacts
|
||||
var payload = await _artifactGenerator.GeneratePayloadAsync(job, ct);
|
||||
|
||||
// Execute tasks in parallel within batch
|
||||
var tasks = batch.TaskIds.Select(taskId =>
|
||||
_targetExecutor.DeployToTargetAsync(job.Id, taskId, payload, ct));
|
||||
|
||||
await Task.WhenAll(tasks);
|
||||
}
|
||||
|
||||
public async Task CancelAsync(Guid jobId, string? reason = null, CancellationToken ct = default)
|
||||
{
|
||||
var job = await _jobStore.GetAsync(jobId, ct)
|
||||
?? throw new DeploymentJobNotFoundException(jobId);
|
||||
|
||||
if (job.Status != DeploymentStatus.Running && job.Status != DeploymentStatus.Pending)
|
||||
{
|
||||
throw new DeploymentJobNotCancellableException(jobId);
|
||||
}
|
||||
|
||||
job = job with
|
||||
{
|
||||
Status = DeploymentStatus.Cancelled,
|
||||
CancelReason = reason,
|
||||
CompletedAt = _timeProvider.GetUtcNow()
|
||||
};
|
||||
|
||||
await _jobStore.SaveAsync(job, ct);
|
||||
|
||||
await _eventPublisher.PublishAsync(new DeploymentJobCancelled(
|
||||
jobId, job.TenantId, reason, _timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
}
|
||||
|
||||
public async Task<DeploymentProgress> GetProgressAsync(Guid jobId, CancellationToken ct = default)
|
||||
{
|
||||
var job = await _jobStore.GetAsync(jobId, ct)
|
||||
?? throw new DeploymentJobNotFoundException(jobId);
|
||||
|
||||
return new DeploymentProgress(
|
||||
JobId: job.Id,
|
||||
Status: job.Status,
|
||||
TotalTargets: job.TotalTaskCount,
|
||||
CompletedTargets: job.CompletedTaskCount,
|
||||
FailedTargets: job.Tasks.Count(t => t.Status == DeploymentTaskStatus.Failed),
|
||||
PendingTargets: job.Tasks.Count(t => t.Status == DeploymentTaskStatus.Pending),
|
||||
ProgressPercent: job.ProgressPercent,
|
||||
CurrentBatch: job.Tasks.Where(t => t.Status == DeploymentTaskStatus.Running).Select(t => t.BatchIndex).FirstOrDefault()
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
public sealed record DeploymentProgress(
|
||||
Guid JobId,
|
||||
DeploymentStatus Status,
|
||||
int TotalTargets,
|
||||
int CompletedTargets,
|
||||
int FailedTargets,
|
||||
int PendingTargets,
|
||||
double ProgressPercent,
|
||||
int CurrentBatch
|
||||
);
|
||||
```
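
The orchestrator delegates batch planning to `IDeploymentStrategy.PlanAsync`, whose implementation is not shown in this sprint. The following is a minimal sketch of how a rolling plan might be derived, assuming that a rolling strategy splits task IDs into fixed-size batches and that an all-at-once strategy is the same operation with a single batch. `PlanBatches` and `batchSize` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the sprint's API. Splits task IDs into
// ordered batches; batchSize <= 0 is treated as "all at once" (one batch).
static IReadOnlyList<(int Index, IReadOnlyList<Guid> TaskIds)> PlanBatches(
    IReadOnlyList<Guid> taskIds, int batchSize)
{
    if (taskIds.Count == 0)
        return Array.Empty<(int, IReadOnlyList<Guid>)>();

    var size = batchSize <= 0 ? taskIds.Count : batchSize;

    return taskIds
        .Chunk(size) // .NET 6+: consecutive slices of at most `size` items
        .Select((chunk, index) => (Index: index, TaskIds: (IReadOnlyList<Guid>)chunk))
        .ToList();
}

var ids = Enumerable.Range(0, 5).Select(_ => Guid.NewGuid()).ToList();

var rolling = PlanBatches(ids, batchSize: 2);   // batches of 2, 2 and 1 tasks
var allAtOnce = PlanBatches(ids, batchSize: 0); // a single batch of 5 tasks

Console.WriteLine($"rolling: {rolling.Count} batches, all-at-once: {allAtOnce.Count} batch");
```

Keeping the plan as plain data (an index plus task IDs) lets the orchestrator loop stay strategy-agnostic, which matches how `ExecuteDeploymentAsync` iterates over `batches`.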
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [x] Create deployment job from promotion
- [x] Coordinate multi-target deployment
- [x] Track task progress per target
- [x] Cancel running deployment
- [x] Wait for deployment completion
- [x] Report deployment progress
- [x] Handle deployment failures
- [x] Unit test coverage >=85% (67 tests passing)
|
||||
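
`GetProgressAsync` reads `TotalTaskCount`, `CompletedTaskCount` and `ProgressPercent` from the job model, whose definitions are not shown here. As a rough sketch, assuming progress means the share of tasks that completed successfully, the calculation could look like this. The string statuses stand in for `DeploymentTaskStatus`, and `ProgressPercent` as a free function is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed semantics (not confirmed by the sprint): progress is the share of
// tasks that completed successfully; an empty job reports 0 instead of NaN.
static double ProgressPercent(IReadOnlyCollection<string> taskStatuses)
{
    if (taskStatuses.Count == 0) return 0.0;

    var completed = taskStatuses.Count(s => s == "Completed");
    return Math.Round(100.0 * completed / taskStatuses.Count, 1);
}

var statuses = new[] { "Completed", "Completed", "Running", "Pending" };
var percent = ProgressPercent(statuses);
var emptyPercent = ProgressPercent(Array.Empty<string>());

Console.WriteLine($"{percent}% complete");
```

Guarding the empty case matters because a job with no tasks would otherwise divide by zero in floating point and report `NaN` to callers.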
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Dependency | Type | Status |
|------------|------|--------|
| 106_005 Decision Engine | Internal | DONE |
| 103_002 Target Registry | Internal | DONE |
| 107_002 Target Executor | Internal | Interface defined (stub) |
|
||||
|
||||
---
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
| Deliverable | Status | Notes |
|-------------|--------|-------|
| IDeployOrchestrator | DONE | Interface with StartAsync, GetJobAsync, ListJobsAsync, CancelAsync, WaitForCompletionAsync, GetProgressAsync |
| DeployOrchestrator | DONE | Full implementation with batch execution, task management, event publishing |
| DeploymentCoordinator | DONE | Logic integrated in DeployOrchestrator.ExecuteDeploymentAsync |
| DeploymentScheduler | DONE | Strategy-based batch planning via IDeploymentStrategy |
| DeploymentJob model | DONE | Full model with tasks, status, progress tracking |
| IDeploymentJobStore | DONE | Interface + InMemoryDeploymentJobStore implementation |
| IDeploymentStrategy | DONE | Rolling + AllAtOnce strategies + factory |
| IArtifactGenerator | DONE | Interface defined |
| ITargetExecutor | DONE | Interface defined with TargetDeploymentResult |
| DeploymentEvents | DONE | Started, Completed, Failed, Cancelled, TaskStarted, TaskCompleted, TaskFailed, ProgressUpdated |
| Exceptions | DONE | PromotionNotFoundException, PromotionNotApprovedException, ReleaseNotFoundException, NoHealthyTargetsException, DeploymentAlreadyInProgressException, DeploymentJobNotFoundException, DeploymentJobNotCancellableException, DeploymentTimeoutException |
| Unit tests | DONE | 67 tests passing |
|
||||
|
||||
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date | Entry |
|------|-------|
| 10-Jan-2026 | Sprint created |
| 11-Jan-2026 | Implemented DeployOrchestrator, models, store, strategies, events, exceptions. 67 unit tests all passing. |
|
||||
@@ -0,0 +1,370 @@
|
||||
# SPRINT: Target Executor
|
||||
|
||||
> **Sprint ID:** 107_002
> **Module:** DEPLOY
> **Phase:** 7 - Deployment Execution
> **Status:** DONE
> **Parent:** [107_000_INDEX](SPRINT_20260110_107_000_INDEX_deployment_execution.md)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Implement the Target Executor for dispatching deployment tasks to agents.
|
||||
|
||||
### Objectives
|
||||
|
||||
- Dispatch deployment tasks to agents via gRPC
- Track task execution status
- Handle task timeouts and retries
- Collect task results and logs
|
||||
|
||||
### Working Directory
|
||||
|
||||
```
src/ReleaseOrchestrator/
├── __Libraries/
│   └── StellaOps.ReleaseOrchestrator.Deployment/
│       ├── Executor/
│       │   ├── ITargetExecutor.cs
│       │   ├── TargetExecutor.cs
│       │   ├── AgentDispatcher.cs
│       │   └── TaskResultCollector.cs
│       └── Models/
│           ├── DeploymentPayload.cs
│           └── TaskResult.cs
└── __Tests/
```
|
||||
|
||||
---
|
||||
|
||||
## Deliverables
|
||||
|
||||
### ITargetExecutor Interface
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Executor;
|
||||
|
||||
public interface ITargetExecutor
|
||||
{
|
||||
Task<DeploymentTask> DeployToTargetAsync(
|
||||
Guid jobId,
|
||||
Guid taskId,
|
||||
DeploymentPayload payload,
|
||||
CancellationToken ct = default);
|
||||
|
||||
Task<DeploymentTask?> GetTaskAsync(Guid taskId, CancellationToken ct = default);
|
||||
Task CancelTaskAsync(Guid taskId, CancellationToken ct = default);
|
||||
Task<TaskLogs> GetTaskLogsAsync(Guid taskId, CancellationToken ct = default);
|
||||
}
|
||||
|
||||
public sealed record DeploymentPayload
|
||||
{
|
||||
public required Guid ReleaseId { get; init; }
|
||||
public required string ReleaseName { get; init; }
|
||||
public required ImmutableArray<DeploymentComponent> Components { get; init; }
|
||||
public required string ComposeLock { get; init; }
|
||||
public required string VersionSticker { get; init; }
|
||||
public required string DeploymentManifest { get; init; }
|
||||
public ImmutableDictionary<string, string> Variables { get; init; } = ImmutableDictionary<string, string>.Empty;
|
||||
}
|
||||
|
||||
public sealed record DeploymentComponent(
|
||||
string Name,
|
||||
string Image,
|
||||
string Digest,
|
||||
ImmutableDictionary<string, string> Config
|
||||
);
|
||||
```
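
`DeploymentComponent` carries `Image` and `Digest` separately, and `BuildAgentTask` later joins them as `Image@Digest`. The sketch below shows one way that join could be guarded, assuming digests are in the OCI `sha256:<64 hex characters>` form and that any tag on the image name should be dropped so the reference is pinned by digest alone. `ToPinnedReference` is a hypothetical helper, not part of the interface above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: builds a digest-pinned image reference.
static string ToPinnedReference(string image, string digest)
{
    // Assumption: only sha256 digests ("sha256:" + 64 lowercase hex chars) are accepted.
    if (!Regex.IsMatch(digest, "^sha256:[a-f0-9]{64}$"))
        throw new ArgumentException($"Not a sha256 digest: '{digest}'", nameof(digest));

    // A ':' after the last '/' separates a tag; a ':' before it is a registry port.
    var lastSlash = image.LastIndexOf('/');
    var tagColon = image.IndexOf(':', lastSlash + 1);
    var repository = tagColon >= 0 ? image[..tagColon] : image;

    return $"{repository}@{digest}";
}

var digest = "sha256:" + new string('a', 64);
var pinned = ToPinnedReference("registry.example.com:5000/team/api:1.4.2", digest);

Console.WriteLine(pinned);
```

Rejecting malformed digests before dispatch keeps the "digest-first release identity" principle enforceable at the boundary where payloads leave the orchestrator.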
|
||||
|
||||
### TargetExecutor Implementation
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Executor;
|
||||
|
||||
public sealed class TargetExecutor : ITargetExecutor
|
||||
{
|
||||
private readonly IDeploymentJobStore _jobStore;
|
||||
private readonly ITargetRegistry _targetRegistry;
|
||||
private readonly IAgentManager _agentManager;
|
||||
private readonly AgentDispatcher _dispatcher;
|
||||
private readonly TaskResultCollector _resultCollector;
|
||||
private readonly IEventPublisher _eventPublisher;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly ILogger<TargetExecutor> _logger;
|
||||
|
||||
public async Task<DeploymentTask> DeployToTargetAsync(
|
||||
Guid jobId,
|
||||
Guid taskId,
|
||||
DeploymentPayload payload,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var job = await _jobStore.GetAsync(jobId, ct)
|
||||
?? throw new DeploymentJobNotFoundException(jobId);
|
||||
|
||||
var task = job.Tasks.FirstOrDefault(t => t.Id == taskId)
|
||||
?? throw new DeploymentTaskNotFoundException(taskId);
|
||||
|
||||
var target = await _targetRegistry.GetAsync(task.TargetId, ct)
|
||||
?? throw new TargetNotFoundException(task.TargetId);
|
||||
|
||||
if (target.AgentId is null)
|
||||
{
|
||||
throw new NoAgentAssignedException(target.Id);
|
||||
}
|
||||
|
||||
var agent = await _agentManager.GetAsync(target.AgentId.Value, ct);
|
||||
if (agent?.Status != AgentStatus.Active)
|
||||
{
|
||||
throw new AgentNotActiveException(target.AgentId.Value);
|
||||
}
|
||||
|
||||
// Update task status
|
||||
task = task with
|
||||
{
|
||||
Status = DeploymentTaskStatus.Running,
|
||||
AgentId = agent.Id.ToString(),
|
||||
StartedAt = _timeProvider.GetUtcNow()
|
||||
};
|
||||
|
||||
await UpdateTaskAsync(job, task, ct);
|
||||
|
||||
await _eventPublisher.PublishAsync(new DeploymentTaskStarted(
|
||||
taskId, jobId, target.Name, agent.Name, _timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
|
||||
try
|
||||
{
|
||||
// Dispatch to agent
|
||||
var agentTask = BuildAgentTask(target, payload);
|
||||
var result = await _dispatcher.DispatchAsync(agent.Id, agentTask, ct);
|
||||
|
||||
// Collect results
|
||||
task = await _resultCollector.CollectAsync(task, result, ct);
|
||||
|
||||
if (task.Status == DeploymentTaskStatus.Completed)
|
||||
{
|
||||
await _eventPublisher.PublishAsync(new DeploymentTaskCompleted(
|
||||
taskId, jobId, target.Name, task.CompletedAt!.Value - task.StartedAt!.Value,
|
||||
_timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
}
|
||||
else
|
||||
{
|
||||
await _eventPublisher.PublishAsync(new DeploymentTaskFailed(
|
||||
taskId, jobId, target.Name, task.Error ?? "Unknown error",
|
||||
_timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
}
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_logger.LogError(ex, "Deployment task {TaskId} failed for target {Target}", taskId, target.Name);
|
||||
|
||||
task = task with
|
||||
{
|
||||
Status = DeploymentTaskStatus.Failed,
|
||||
Error = ex.Message,
|
||||
CompletedAt = _timeProvider.GetUtcNow()
|
||||
};
|
||||
|
||||
await _eventPublisher.PublishAsync(new DeploymentTaskFailed(
|
||||
taskId, jobId, target.Name, ex.Message, _timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
}
|
||||
|
||||
await UpdateTaskAsync(job, task, ct);
|
||||
return task;
|
||||
}
|
||||
|
||||
private static AgentDeploymentTask BuildAgentTask(Target target, DeploymentPayload payload)
|
||||
{
|
||||
return new AgentDeploymentTask
|
||||
{
|
||||
Type = target.Type switch
|
||||
{
|
||||
TargetType.DockerHost => AgentTaskType.DockerDeploy,
|
||||
TargetType.ComposeHost => AgentTaskType.ComposeDeploy,
|
||||
_ => throw new UnsupportedTargetTypeException(target.Type)
|
||||
},
|
||||
Payload = new AgentDeploymentPayload
|
||||
{
|
||||
Components = payload.Components.Select(c => new AgentComponent
|
||||
{
|
||||
Name = c.Name,
|
||||
Image = $"{c.Image}@{c.Digest}",
|
||||
Config = c.Config
|
||||
}).ToList(),
|
||||
ComposeLock = payload.ComposeLock,
|
||||
VersionSticker = payload.VersionSticker,
|
||||
Variables = payload.Variables
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
private async Task UpdateTaskAsync(DeploymentJob job, DeploymentTask updatedTask, CancellationToken ct)
|
||||
{
|
||||
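// Re-read the job so that updates saved by tasks running in parallel (and any
// status change such as a cancellation) are not overwritten by the stale
// snapshot passed in. This narrows the race but does not remove it; the store
// would need optimistic concurrency or per-job locking for that.
var current = await _jobStore.GetAsync(job.Id, ct) ?? job;
var tasks = current.Tasks.Select(t => t.Id == updatedTask.Id ? updatedTask : t).ToImmutableArray();
var updatedJob = current with { Tasks = tasks };
await _jobStore.SaveAsync(updatedJob, ct);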
|
||||
}
|
||||
}
|
||||
```
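
Because `ExecuteBatchAsync` runs a batch's tasks in parallel, concurrent `UpdateTaskAsync` calls can race on the same job record. The sketch below illustrates applying each task update atomically against the current stored state. It uses `ConcurrentDictionary.AddOrUpdate` as a stand-in for whatever concurrency control the real `IDeploymentJobStore` provides, which is an assumption; `SetTaskStatus` and the string statuses are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Immutable;
using System.Linq;
using System.Threading.Tasks;

// Stand-in store: job ID -> (task ID -> status).
var store = new ConcurrentDictionary<Guid, ImmutableDictionary<Guid, string>>();

// Hypothetical helper: applies one task's status to the *current* stored state.
// AddOrUpdate re-runs the update delegate if another writer got in first,
// so no concurrent change is overwritten by a stale snapshot.
void SetTaskStatus(Guid job, Guid task, string status) =>
    store.AddOrUpdate(
        job,
        _ => ImmutableDictionary<Guid, string>.Empty.SetItem(task, status),
        (_, tasks) => tasks.SetItem(task, status));

var demoJobId = Guid.NewGuid();
var demoTaskIds = Enumerable.Range(0, 100).Select(_ => Guid.NewGuid()).ToArray();

// Simulate a batch whose targets all finish at roughly the same time.
await Task.WhenAll(demoTaskIds.Select(id =>
    Task.Run(() => SetTaskStatus(demoJobId, id, "Completed"))));

Console.WriteLine($"{store[demoJobId].Count} of {demoTaskIds.Length} task updates recorded");
```

The key design point is that the update is expressed as a function of the latest state rather than as a whole-record overwrite; a database-backed store would typically achieve the same with a version column or row lock.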
|
||||
|
||||
### AgentDispatcher
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Executor;
|
||||
|
||||
public sealed class AgentDispatcher
|
||||
{
|
||||
private readonly IAgentManager _agentManager;
|
||||
private readonly ILogger<AgentDispatcher> _logger;
|
||||
private readonly TimeSpan _defaultTimeout = TimeSpan.FromMinutes(30);
|
||||
|
||||
public async Task<AgentTaskResult> DispatchAsync(
|
||||
Guid agentId,
|
||||
AgentDeploymentTask task,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
_logger.LogDebug("Dispatching task to agent {AgentId}", agentId);
|
||||
|
||||
using var timeoutCts = new CancellationTokenSource(_defaultTimeout);
|
||||
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(ct, timeoutCts.Token);
|
||||
|
||||
try
|
||||
{
|
||||
var result = await _agentManager.ExecuteTaskAsync(agentId, task, linkedCts.Token);
|
||||
|
||||
_logger.LogDebug(
|
||||
"Agent {AgentId} completed task with status {Status}",
|
||||
agentId,
|
||||
result.Success ? "success" : "failure");
|
||||
|
||||
return result;
|
||||
}
|
||||
catch (OperationCanceledException) when (timeoutCts.IsCancellationRequested)
|
||||
{
|
||||
throw new AgentTaskTimeoutException(agentId, _defaultTimeout);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
public sealed record AgentDeploymentTask
|
||||
{
|
||||
public required AgentTaskType Type { get; init; }
|
||||
public required AgentDeploymentPayload Payload { get; init; }
|
||||
}
|
||||
|
||||
public enum AgentTaskType
|
||||
{
|
||||
DockerDeploy,
|
||||
ComposeDeploy,
|
||||
DockerRollback,
|
||||
ComposeRollback
|
||||
}
|
||||
|
||||
public sealed record AgentDeploymentPayload
|
||||
{
|
||||
public required IReadOnlyList<AgentComponent> Components { get; init; }
|
||||
public required string ComposeLock { get; init; }
|
||||
public required string VersionSticker { get; init; }
|
||||
public IReadOnlyDictionary<string, string> Variables { get; init; } = new Dictionary<string, string>();
|
||||
}
|
||||
|
||||
public sealed record AgentComponent
|
||||
{
|
||||
public required string Name { get; init; }
|
||||
public required string Image { get; init; }
|
||||
public IReadOnlyDictionary<string, string> Config { get; init; } = new Dictionary<string, string>();
|
||||
}
|
||||
|
||||
public sealed record AgentTaskResult
|
||||
{
|
||||
public bool Success { get; init; }
|
||||
public string? Error { get; init; }
|
||||
public IReadOnlyDictionary<string, object> Outputs { get; init; } = new Dictionary<string, object>();
|
||||
public string? Logs { get; init; }
|
||||
public TimeSpan Duration { get; init; }
|
||||
}
|
||||
```
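
`DispatchAsync` links the caller's token with a timeout token so that either can stop the agent call. The following sketch isolates that pattern and shows how the two causes can be told apart afterwards. `RunWithTimeoutAsync` and its string results are hypothetical; the real dispatcher throws `AgentTaskTimeoutException` instead.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper mirroring the linked-token pattern in DispatchAsync.
static async Task<string> RunWithTimeoutAsync(
    Func<CancellationToken, Task> operation, TimeSpan timeout, CancellationToken ct)
{
    using var timeoutCts = new CancellationTokenSource(timeout);
    using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(ct, timeoutCts.Token);

    try
    {
        await operation(linkedCts.Token);
        return "completed";
    }
    catch (OperationCanceledException) when (!ct.IsCancellationRequested)
    {
        // The caller did not cancel, so treat the cancellation as a timeout.
        return "timed out";
    }
    catch (OperationCanceledException)
    {
        return "cancelled by caller";
    }
}

// An operation that never finishes unless its token is cancelled.
Func<CancellationToken, Task> slow = token => Task.Delay(Timeout.InfiniteTimeSpan, token);

var timedOut = await RunWithTimeoutAsync(slow, TimeSpan.FromMilliseconds(50), CancellationToken.None);

using var callerCts = new CancellationTokenSource();
callerCts.Cancel();
var cancelled = await RunWithTimeoutAsync(slow, TimeSpan.FromSeconds(30), callerCts.Token);

Console.WriteLine($"{timedOut} / {cancelled}");
```

Checking the caller's token rather than the timeout source avoids misreporting a caller cancellation as a timeout when both fire close together.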
|
||||
|
||||
### TaskResultCollector
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Executor;
|
||||
|
||||
public sealed class TaskResultCollector
|
||||
{
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly ILogger<TaskResultCollector> _logger;
|
||||
|
||||
public Task<DeploymentTask> CollectAsync(
|
||||
DeploymentTask task,
|
||||
AgentTaskResult result,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var updatedTask = task with
|
||||
{
|
||||
Status = result.Success ? DeploymentTaskStatus.Completed : DeploymentTaskStatus.Failed,
|
||||
Error = result.Error,
|
||||
CompletedAt = _timeProvider.GetUtcNow(),
|
||||
Result = result.Outputs.ToImmutableDictionary()
|
||||
};
|
||||
|
||||
_logger.LogDebug(
|
||||
"Collected result for task {TaskId}: {Status}",
|
||||
task.Id,
|
||||
updatedTask.Status);
|
||||
|
||||
return Task.FromResult(updatedTask);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [x] Dispatch tasks to agents via gRPC
- [x] Track task execution status
- [x] Handle task timeouts
- [x] Collect task results
- [x] Collect task logs
- [x] Cancel running tasks
- [x] Support Docker and Compose targets
- [x] Unit test coverage >=85%
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Dependency | Type | Status |
|------------|------|--------|
| 107_001 Deploy Orchestrator | Internal | DONE |
| 103_002 Target Registry | Internal | DONE |
| 103_003 Agent Manager | Internal | DONE |
|
||||
|
||||
---
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
| Deliverable | Status | Notes |
|-------------|--------|-------|
| ITargetExecutor | DONE | Interface with DeployToTargetAsync, GetTaskAsync, CancelTaskAsync, GetTaskLogsAsync |
| TargetExecutor | DONE | Full implementation with agent dispatch and result collection |
| IAgentDispatcher | DONE | Interface extracted for testability |
| AgentDispatcher | DONE | Implementation with timeout handling |
| TaskResultCollector | DONE | Collects and parses results from agents |
| DeploymentAgentTask | DONE | Agent-specific task type extending AgentTask |
| DeploymentPayload | DONE | From Sprint 107_001 |
| Unit tests | DONE | 29 new tests (96 total in test project) |
|
||||
|
||||
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date | Entry |
|------|-------|
| 10-Jan-2026 | Sprint created |
| 11-Jan-2026 | Sprint 107_002 Target Executor completed (29 new tests, 96 total) |
|
||||
@@ -0,0 +1,465 @@
|
||||
# SPRINT: Artifact Generator
|
||||
|
||||
> **Sprint ID:** 107_003
> **Module:** DEPLOY
> **Phase:** 7 - Deployment Execution
> **Status:** DONE
> **Parent:** [107_000_INDEX](SPRINT_20260110_107_000_INDEX_deployment_execution.md)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Implement the Artifact Generator for creating deployment artifacts including digest-locked compose files and version stickers.
|
||||
|
||||
### Objectives
|
||||
|
||||
- Generate digest-locked compose files
- Create version sticker files (stella.version.json)
- Generate deployment manifests
- Support multiple artifact formats
|
||||
|
||||
### Working Directory
|
||||
|
||||
```
src/ReleaseOrchestrator/
├── __Libraries/
│   └── StellaOps.ReleaseOrchestrator.Deployment/
│       └── Artifact/
│           ├── IArtifactGenerator.cs
│           ├── ArtifactGenerator.cs
│           ├── ComposeLockGenerator.cs
│           ├── VersionStickerGenerator.cs
│           └── DeploymentManifestGenerator.cs
└── __Tests/
```
|
||||
|
||||
---
|
||||
|
||||
## Deliverables
|
||||
|
||||
### IArtifactGenerator Interface
|
||||
|
||||
```csharp
namespace StellaOps.ReleaseOrchestrator.Deployment.Artifact;

public interface IArtifactGenerator
{
    Task<DeploymentPayload> GeneratePayloadAsync(DeploymentJob job, CancellationToken ct = default);
    Task<string> GenerateComposeLockAsync(Release release, CancellationToken ct = default);
    Task<string> GenerateVersionStickerAsync(Release release, DeploymentJob job, CancellationToken ct = default);
    Task<string> GenerateDeploymentManifestAsync(DeploymentJob job, CancellationToken ct = default);
}
```
|
||||
|
||||
### ComposeLockGenerator
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Artifact;
|
||||
|
||||
public sealed class ComposeLockGenerator
|
||||
{
|
||||
private readonly ILogger<ComposeLockGenerator> _logger;
|
||||
|
||||
public string Generate(Release release, ComposeTemplate? template = null)
|
||||
{
|
||||
var services = new Dictionary<string, object>();
|
||||
|
||||
foreach (var component in release.Components.OrderBy(c => c.OrderIndex))
|
||||
{
|
||||
var service = new Dictionary<string, object>
|
||||
{
|
||||
["image"] = $"{GetFullImageRef(component)}@{component.Digest}",
|
||||
["labels"] = new Dictionary<string, string>
|
||||
{
|
||||
["stella.release.id"] = release.Id.ToString(),
|
||||
["stella.release.name"] = release.Name,
|
||||
["stella.component.id"] = component.ComponentId.ToString(),
|
||||
["stella.component.name"] = component.ComponentName,
|
||||
["stella.digest"] = component.Digest
|
||||
}
|
||||
};
|
||||
|
||||
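// Add config from component, skipping keys the generator owns so that
// component config cannot replace the digest-pinned image or the labels,
// and keys that only feed image resolution and are not Compose service options
foreach (var (key, value) in component.Config)
{
if (key is "image" or "labels" or "registry" or "repository") continue;
service[key] = value;
}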
|
||||
|
||||
services[component.ComponentName] = service;
|
||||
}
|
||||
|
||||
var compose = new Dictionary<string, object>
|
||||
{
|
||||
["version"] = "3.8",
|
||||
["services"] = services,
|
||||
["x-stella"] = new Dictionary<string, object>
|
||||
{
|
||||
["release"] = new Dictionary<string, object>
|
||||
{
|
||||
["id"] = release.Id.ToString(),
|
||||
["name"] = release.Name,
|
||||
["manifestDigest"] = release.ManifestDigest ?? ""
|
||||
},
|
||||
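// NOTE: reads the system clock directly; injecting TimeProvider (as the other
// generators do) would make the generated file reproducible in tests
["generated"] = TimeProvider.System.GetUtcNow().ToString("O")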
|
||||
}
|
||||
};
|
||||
|
||||
// Merge with template if provided
|
||||
if (template is not null)
|
||||
{
|
||||
compose = MergeWithTemplate(compose, template);
|
||||
}
|
||||
|
||||
var yaml = new SerializerBuilder()
|
||||
.WithNamingConvention(CamelCaseNamingConvention.Instance)
|
||||
.Build()
|
||||
.Serialize(compose);
|
||||
|
||||
_logger.LogDebug(
|
||||
"Generated compose.stella.lock.yml for release {Release} with {Count} services",
|
||||
release.Name,
|
||||
services.Count);
|
||||
|
||||
return yaml;
|
||||
}
|
||||
|
||||
private static string GetFullImageRef(ReleaseComponent component)
|
||||
{
|
||||
// Component config should include registry info
|
||||
var registry = component.Config.GetValueOrDefault("registry", "");
|
||||
var repository = component.Config.GetValueOrDefault("repository", component.ComponentName);
|
||||
return string.IsNullOrEmpty(registry) ? repository : $"{registry}/{repository}";
|
||||
}
|
||||
|
||||
private static Dictionary<string, object> MergeWithTemplate(
|
||||
Dictionary<string, object> generated,
|
||||
ComposeTemplate template)
|
||||
{
|
||||
// Deep merge template with generated config
|
||||
// Template provides networks, volumes, etc.
|
||||
var merged = new Dictionary<string, object>(generated);
|
||||
|
||||
if (template.Networks is not null)
|
||||
merged["networks"] = template.Networks;
|
||||
|
||||
if (template.Volumes is not null)
|
||||
merged["volumes"] = template.Volumes;
|
||||
|
||||
// Merge service configs from template
|
||||
if (template.ServiceDefaults is not null && merged["services"] is Dictionary<string, object> services)
|
||||
{
|
||||
foreach (var (serviceName, serviceConfig) in services)
|
||||
{
|
||||
if (serviceConfig is Dictionary<string, object> config)
|
||||
{
|
||||
foreach (var (key, value) in template.ServiceDefaults)
|
||||
{
|
||||
if (!config.ContainsKey(key))
|
||||
{
|
||||
config[key] = value;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return merged;
|
||||
}
|
||||
}
|
||||
|
||||
public sealed record ComposeTemplate(
|
||||
IReadOnlyDictionary<string, object>? Networks,
|
||||
IReadOnlyDictionary<string, object>? Volumes,
|
||||
IReadOnlyDictionary<string, object>? ServiceDefaults
|
||||
);
|
||||
```
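
`MergeWithTemplate` applies `ServiceDefaults` only where a service does not already define the key, so generated values such as the digest-pinned `image` always win. A minimal sketch of that rule in isolation follows; `ApplyDefaults` and the sample keys and values are hypothetical and chosen only to illustrate the precedence.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: copies each default into the service unless the
// service already defines that key (existing values take precedence).
static void ApplyDefaults(
    IDictionary<string, object> service,
    IReadOnlyDictionary<string, object> defaults)
{
    foreach (var pair in defaults)
    {
        if (!service.ContainsKey(pair.Key))
        {
            service[pair.Key] = pair.Value;
        }
    }
}

var service = new Dictionary<string, object>
{
    ["image"] = "registry.example.com/api@sha256:" + new string('0', 64),
    ["restart"] = "always"
};

var defaults = new Dictionary<string, object>
{
    ["restart"] = "unless-stopped", // ignored: the service already sets it
    ["logging"] = "json-file"       // added: the service has no such key
};

ApplyDefaults(service, defaults);

Console.WriteLine($"restart={service["restart"]}, logging={service["logging"]}");
```

Note that this is a shallow merge: a nested value such as `labels` is either kept whole or taken whole from the defaults, never combined key by key.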
|
||||
|
||||
### VersionStickerGenerator
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Artifact;
|
||||
|
||||
public sealed class VersionStickerGenerator
|
||||
{
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly ILogger<VersionStickerGenerator> _logger;
|
||||
|
||||
public string Generate(Release release, DeploymentJob job, Target target)
|
||||
{
|
||||
var sticker = new VersionSticker
|
||||
{
|
||||
SchemaVersion = "1.0",
|
||||
Release = new ReleaseInfo
|
||||
{
|
||||
Id = release.Id.ToString(),
|
||||
Name = release.Name,
|
||||
ManifestDigest = release.ManifestDigest,
|
||||
FinalizedAt = release.FinalizedAt?.ToString("O")
|
||||
},
|
||||
Deployment = new DeploymentInfo
|
||||
{
|
||||
JobId = job.Id.ToString(),
|
||||
EnvironmentId = job.EnvironmentId.ToString(),
|
||||
EnvironmentName = job.EnvironmentName,
|
||||
TargetId = target.Id.ToString(),
|
||||
TargetName = target.Name,
|
||||
Strategy = job.Strategy.ToString(),
|
||||
DeployedAt = _timeProvider.GetUtcNow().ToString("O")
|
||||
},
|
||||
Components = release.Components.Select(c => new ComponentInfo
|
||||
{
|
||||
Name = c.ComponentName,
|
||||
Digest = c.Digest,
|
||||
Tag = c.Tag,
|
||||
SemVer = c.SemVer
|
||||
}).ToList()
|
||||
};
|
||||
|
||||
var json = JsonSerializer.Serialize(sticker, new JsonSerializerOptions
|
||||
{
|
||||
WriteIndented = true,
|
||||
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
|
||||
});
|
||||
|
||||
_logger.LogDebug(
|
||||
"Generated stella.version.json for release {Release} on target {Target}",
|
||||
release.Name,
|
||||
target.Name);
|
||||
|
||||
return json;
|
||||
}
|
||||
}
|
||||
|
||||
public sealed class VersionSticker
|
||||
{
|
||||
public required string SchemaVersion { get; set; }
|
||||
public required ReleaseInfo Release { get; set; }
|
||||
public required DeploymentInfo Deployment { get; set; }
|
||||
public required IReadOnlyList<ComponentInfo> Components { get; set; }
|
||||
}
|
||||
|
||||
public sealed class ReleaseInfo
|
||||
{
|
||||
public required string Id { get; set; }
|
||||
public required string Name { get; set; }
|
||||
public string? ManifestDigest { get; set; }
|
||||
public string? FinalizedAt { get; set; }
|
||||
}
|
||||
|
||||
public sealed class DeploymentInfo
|
||||
{
|
||||
public required string JobId { get; set; }
|
||||
public required string EnvironmentId { get; set; }
|
||||
public required string EnvironmentName { get; set; }
|
||||
public required string TargetId { get; set; }
|
||||
public required string TargetName { get; set; }
|
||||
public required string Strategy { get; set; }
|
||||
public required string DeployedAt { get; set; }
|
||||
}
|
||||
|
||||
public sealed class ComponentInfo
|
||||
{
|
||||
public required string Name { get; set; }
|
||||
public required string Digest { get; set; }
|
||||
public string? Tag { get; set; }
|
||||
public string? SemVer { get; set; }
|
||||
}
|
||||
```
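
The generator serializes the sticker with indented output and camelCase property names. The sketch below shows what that naming policy does to the property names and how a consumer might read the file back. The anonymous object is a stand-in for `VersionSticker`, and every value in it is illustrative rather than taken from a real release.

```csharp
using System;
using System.Text.Json;

// Illustrative values only; the anonymous object stands in for VersionSticker.
var sticker = new
{
    SchemaVersion = "1.0",
    Release = new { Id = Guid.NewGuid().ToString(), Name = "myapp-1.4.2" },
    Components = new[]
    {
        new { Name = "api", Digest = "sha256:" + new string('0', 64), Tag = "1.4.2" }
    }
};

var json = JsonSerializer.Serialize(sticker, new JsonSerializerOptions
{
    WriteIndented = true,
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase
});

Console.WriteLine(json);

// Reading it back: property names are camelCase in the serialized document.
using var doc = JsonDocument.Parse(json);
var schemaVersion = doc.RootElement.GetProperty("schemaVersion").GetString();
var firstDigest = doc.RootElement.GetProperty("components")[0].GetProperty("digest").GetString();

Console.WriteLine($"schema {schemaVersion}, first component {firstDigest}");
```

Reading `schemaVersion` first gives consumers a place to branch if the sticker layout changes in a later version.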
|
||||
|
||||
### DeploymentManifestGenerator
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Artifact;
|
||||
|
||||
public sealed class DeploymentManifestGenerator
|
||||
{
|
||||
private readonly IReleaseManager _releaseManager;
|
||||
private readonly IEnvironmentService _environmentService;
|
||||
private readonly IPromotionManager _promotionManager;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly ILogger<DeploymentManifestGenerator> _logger;
|
||||
|
||||
public async Task<string> GenerateAsync(DeploymentJob job, CancellationToken ct = default)
|
||||
{
|
||||
var release = await _releaseManager.GetAsync(job.ReleaseId, ct)
?? throw new ReleaseNotFoundException(job.ReleaseId);
|
||||
var environment = await _environmentService.GetAsync(job.EnvironmentId, ct);
|
||||
var promotion = await _promotionManager.GetAsync(job.PromotionId, ct);
|
||||
|
||||
var manifest = new DeploymentManifest
|
||||
{
|
||||
SchemaVersion = "1.0",
|
||||
Deployment = new DeploymentMetadata
|
||||
{
|
||||
JobId = job.Id.ToString(),
|
||||
Strategy = job.Strategy.ToString(),
|
||||
StartedAt = job.StartedAt.ToString("O"),
|
||||
StartedBy = job.StartedBy.ToString()
|
||||
},
|
||||
Release = new ReleaseMetadata
|
||||
{
|
||||
Id = release!.Id.ToString(),
|
||||
Name = release.Name,
|
||||
ManifestDigest = release.ManifestDigest,
|
||||
FinalizedAt = release.FinalizedAt?.ToString("O"),
|
||||
Components = release.Components.Select(c => new ComponentMetadata
|
||||
{
|
||||
Id = c.ComponentId.ToString(),
|
||||
Name = c.ComponentName,
|
||||
Digest = c.Digest,
|
||||
Tag = c.Tag,
|
||||
SemVer = c.SemVer
|
||||
}).ToList()
|
||||
},
|
||||
Environment = new EnvironmentMetadata
|
||||
{
|
||||
Id = environment!.Id.ToString(),
|
||||
Name = environment.Name,
|
||||
IsProduction = environment.IsProduction
|
||||
},
|
||||
Promotion = promotion is not null ? new PromotionMetadata
|
||||
{
|
||||
Id = promotion.Id.ToString(),
|
||||
RequestedBy = promotion.RequestedBy.ToString(),
|
||||
RequestedAt = promotion.RequestedAt.ToString("O"),
|
||||
Approvals = promotion.Approvals.Select(a => new ApprovalMetadata
|
||||
{
|
||||
UserId = a.UserId.ToString(),
|
||||
UserName = a.UserName,
|
||||
Decision = a.Decision.ToString(),
|
||||
DecidedAt = a.DecidedAt.ToString("O")
|
||||
}).ToList(),
|
||||
GateResults = promotion.GateResults.Select(g => new GateResultMetadata
|
||||
{
|
||||
GateName = g.GateName,
|
||||
Passed = g.Passed,
|
||||
Message = g.Message
|
||||
}).ToList()
|
||||
} : null,
|
||||
Targets = job.Tasks.Select(t => new TargetMetadata
|
||||
{
|
||||
Id = t.TargetId.ToString(),
|
||||
Name = t.TargetName,
|
||||
Status = t.Status.ToString()
|
||||
}).ToList(),
|
||||
GeneratedAt = _timeProvider.GetUtcNow().ToString("O")
|
||||
};
|
||||
|
||||
var json = JsonSerializer.Serialize(manifest, new JsonSerializerOptions
|
||||
{
|
||||
WriteIndented = true,
|
||||
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
|
||||
});
|
||||
|
||||
_logger.LogDebug("Generated deployment manifest for job {JobId}", job.Id);
|
||||
|
||||
return json;
|
||||
}
|
||||
}
|
||||
|
||||
// Manifest models
|
||||
public sealed class DeploymentManifest
|
||||
{
|
||||
public required string SchemaVersion { get; set; }
|
||||
public required DeploymentMetadata Deployment { get; set; }
|
||||
public required ReleaseMetadata Release { get; set; }
|
||||
public required EnvironmentMetadata Environment { get; set; }
|
||||
public PromotionMetadata? Promotion { get; set; }
|
||||
public required IReadOnlyList<TargetMetadata> Targets { get; set; }
|
||||
public required string GeneratedAt { get; set; }
|
||||
}
|
||||
|
||||
// Additional metadata classes abbreviated for brevity...
|
||||
```
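
To illustrate the serialization settings used by the manifest generator, the following minimal sketch serializes a cut-down record with the same camelCase naming policy and round-trip (`"O"`) timestamp format, then parses the timestamp back. `MiniManifest` and `ManifestSerializationSketch` are hypothetical names introduced only for this example; they are not part of the deliverable.

```csharp
using System;
using System.Globalization;
using System.Text.Json;

// Hypothetical, trimmed-down stand-in for DeploymentManifest (illustration only).
public sealed record MiniManifest(string SchemaVersion, string GeneratedAt);

public static class ManifestSerializationSketch
{
    // Same naming policy as the generator; indentation is left off to keep the output compact.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    public static string Serialize(MiniManifest manifest) =>
        JsonSerializer.Serialize(manifest, Options);

    // "O" is the round-trip format, so the instant and offset are recovered exactly.
    public static DateTimeOffset ParseTimestamp(string value) =>
        DateTimeOffset.ParseExact(value, "O", CultureInfo.InvariantCulture, DateTimeStyles.None);
}
```

A consumer would read `generatedAt` back with `ParseTimestamp`. Note that the default `System.Text.Json` encoder may emit the `+` of a UTC offset as `\u002B`; that is still valid JSON and decodes to the same string.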
|
||||
|
||||
### ArtifactGenerator (Coordinator)
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Artifact;
|
||||
|
||||
public sealed class ArtifactGenerator : IArtifactGenerator
|
||||
{
|
||||
private readonly IReleaseManager _releaseManager;
|
||||
private readonly ComposeLockGenerator _composeLockGenerator;
|
||||
private readonly VersionStickerGenerator _versionStickerGenerator;
|
||||
private readonly DeploymentManifestGenerator _manifestGenerator;
|
||||
private readonly ILogger<ArtifactGenerator> _logger;
|
||||
|
||||
public async Task<DeploymentPayload> GeneratePayloadAsync(
|
||||
DeploymentJob job,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var release = await _releaseManager.GetAsync(job.ReleaseId, ct)
|
||||
?? throw new ReleaseNotFoundException(job.ReleaseId);
|
||||
|
||||
var composeLock = await GenerateComposeLockAsync(release, ct);
|
||||
var versionSticker = await GenerateVersionStickerAsync(release, job, ct);
|
||||
var manifest = await GenerateDeploymentManifestAsync(job, ct);
|
||||
|
||||
var components = release.Components.Select(c => new DeploymentComponent(
|
||||
c.ComponentName,
|
||||
c.Config.GetValueOrDefault("image", c.ComponentName),
|
||||
c.Digest,
|
||||
c.Config
|
||||
)).ToImmutableArray();
|
||||
|
||||
return new DeploymentPayload
|
||||
{
|
||||
ReleaseId = release.Id,
|
||||
ReleaseName = release.Name,
|
||||
Components = components,
|
||||
ComposeLock = composeLock,
|
||||
VersionSticker = versionSticker,
|
||||
DeploymentManifest = manifest
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [x] Generate digest-locked compose files
|
||||
- [x] All images use digest references
|
||||
- [x] Generate stella.version.json stickers
|
||||
- [x] Generate deployment manifests
|
||||
- [x] Include all required metadata
|
||||
- [x] Merge with compose templates
|
||||
- [x] JSON/YAML formats valid
|
||||
- [x] Unit test coverage >=85%
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Dependency | Type | Status |
|
||||
|------------|------|--------|
|
||||
| 107_001 Deploy Orchestrator | Internal | DONE |
|
||||
| 104_003 Release Manager | Internal | DONE |
|
||||
| YamlDotNet | NuGet | Available |
|
||||
|
||||
---
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
| Deliverable | Status | Notes |
|
||||
|-------------|--------|-------|
|
||||
| IArtifactGenerator | DONE | Interface already existed, updated |
|
||||
| ArtifactGenerator | DONE | Coordinator with TimeProvider injection |
|
||||
| ComposeLockGenerator | DONE | YAML generation with templates |
|
||||
| VersionStickerGenerator | DONE | JSON with/without target support |
|
||||
| DeploymentManifestGenerator | DONE | Full manifest with metadata |
|
||||
| Unit tests | DONE | 133 tests passing |
|
||||
|
||||
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date | Entry |
|
||||
|------|-------|
|
||||
| 10-Jan-2026 | Sprint created |
|
||||
| 11-Jan-2026 | Implemented ComposeLockGenerator, VersionStickerGenerator, DeploymentManifestGenerator |
|
||||
| 11-Jan-2026 | Created ArtifactGenerator coordinator with TimeProvider injection |
|
||||
| 11-Jan-2026 | Added 38 unit tests for artifact generators |
|
||||
| 11-Jan-2026 | All 133 tests passing, sprint complete |
|
||||
@@ -0,0 +1,468 @@
|
||||
# SPRINT: Rollback Manager
|
||||
|
||||
> **Sprint ID:** 107_004
|
||||
> **Module:** DEPLOY
|
||||
> **Phase:** 7 - Deployment Execution
|
||||
> **Status:** DONE
|
||||
> **Parent:** [107_000_INDEX](SPRINT_20260110_107_000_INDEX_deployment_execution.md)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Implement the Rollback Manager, which recovers from a failed deployment by planning and executing a redeploy of the previously deployed release.
|
||||
|
||||
### Objectives
|
||||
|
||||
- Plan rollback strategy for failed deployments
|
||||
- Execute rollback to previous release
|
||||
- Track rollback progress and status
|
||||
- Generate rollback evidence
|
||||
|
||||
### Working Directory
|
||||
|
||||
```
|
||||
src/ReleaseOrchestrator/
|
||||
├── __Libraries/
|
||||
│ └── StellaOps.ReleaseOrchestrator.Deployment/
|
||||
│ └── Rollback/
|
||||
│ ├── IRollbackManager.cs
|
||||
│ ├── RollbackManager.cs
|
||||
│ ├── RollbackPlanner.cs
|
||||
│ ├── RollbackExecutor.cs
|
||||
│ └── RollbackEvidenceGenerator.cs
|
||||
└── __Tests/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Deliverables
|
||||
|
||||
### IRollbackManager Interface
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Rollback;
|
||||
|
||||
public interface IRollbackManager
|
||||
{
|
||||
Task<RollbackPlan> PlanAsync(Guid jobId, CancellationToken ct = default);
|
||||
Task<DeploymentJob> ExecuteAsync(RollbackPlan plan, CancellationToken ct = default);
|
||||
Task<DeploymentJob> ExecuteAsync(Guid jobId, CancellationToken ct = default);
|
||||
Task<RollbackPlan?> GetPlanAsync(Guid jobId, CancellationToken ct = default);
|
||||
Task<bool> CanRollbackAsync(Guid jobId, CancellationToken ct = default);
|
||||
}
|
||||
|
||||
public sealed record RollbackPlan
|
||||
{
|
||||
public required Guid Id { get; init; }
|
||||
public required Guid FailedJobId { get; init; }
|
||||
public required Guid TargetReleaseId { get; init; }
|
||||
public required string TargetReleaseName { get; init; }
|
||||
public required ImmutableArray<RollbackTarget> Targets { get; init; }
|
||||
public required RollbackStrategy Strategy { get; init; }
|
||||
public required DateTimeOffset PlannedAt { get; init; }
|
||||
}
|
||||
|
||||
public enum RollbackStrategy
|
||||
{
|
||||
RedeployPrevious, // Redeploy the previous release
|
||||
RestoreSnapshot, // Restore from snapshot if available
|
||||
Manual // Requires manual intervention
|
||||
}
|
||||
|
||||
public sealed record RollbackTarget(
|
||||
Guid TargetId,
|
||||
string TargetName,
|
||||
string CurrentDigest,
|
||||
string RollbackToDigest,
|
||||
RollbackTargetStatus Status
|
||||
);
|
||||
|
||||
public enum RollbackTargetStatus
|
||||
{
|
||||
Pending,
|
||||
RollingBack,
|
||||
RolledBack,
|
||||
Failed,
|
||||
Skipped
|
||||
}
|
||||
```
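
The interface suggests a check, plan, execute call order. The sketch below shows that order against a deliberately reduced, hypothetical interface (`IMiniRollbackManager`), in which plans and jobs are collapsed to ids and a status string so the example stays self-contained. The fake implementation exists only to exercise the flow.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, cut-down view of IRollbackManager (ids and a status string only).
public interface IMiniRollbackManager
{
    Task<bool> CanRollbackAsync(Guid jobId, CancellationToken ct = default);
    Task<Guid> PlanAsync(Guid jobId, CancellationToken ct = default);
    Task<string> ExecuteAsync(Guid planId, CancellationToken ct = default);
}

public static class RollbackCaller
{
    // Returns the final job status, or null when the job is not eligible for rollback.
    public static async Task<string?> TryRollbackAsync(
        IMiniRollbackManager manager, Guid jobId, CancellationToken ct = default)
    {
        if (!await manager.CanRollbackAsync(jobId, ct))
            return null;

        var planId = await manager.PlanAsync(jobId, ct);
        return await manager.ExecuteAsync(planId, ct);
    }
}

// In-memory fake that records the call order; for illustration only.
public sealed class FakeRollbackManager : IMiniRollbackManager
{
    private readonly bool _eligible;
    public List<string> Calls { get; } = new();

    public FakeRollbackManager(bool eligible) => _eligible = eligible;

    public Task<bool> CanRollbackAsync(Guid jobId, CancellationToken ct = default)
    {
        Calls.Add("can");
        return Task.FromResult(_eligible);
    }

    public Task<Guid> PlanAsync(Guid jobId, CancellationToken ct = default)
    {
        Calls.Add("plan");
        return Task.FromResult(Guid.NewGuid());
    }

    public Task<string> ExecuteAsync(Guid planId, CancellationToken ct = default)
    {
        Calls.Add("execute");
        return Task.FromResult("RolledBack");
    }
}
```

Checking eligibility first avoids relying on `RollbackNotRequiredException` or `NoPreviousReleaseException` for control flow in the common case.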
|
||||
|
||||
### RollbackManager Implementation
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Rollback;
|
||||
|
||||
public sealed class RollbackManager : IRollbackManager
|
||||
{
|
||||
private readonly IDeploymentJobStore _jobStore;
|
||||
private readonly IReleaseHistory _releaseHistory;
|
||||
private readonly IReleaseManager _releaseManager;
|
||||
private readonly ITargetExecutor _targetExecutor;
|
||||
private readonly IArtifactGenerator _artifactGenerator;
|
||||
private readonly RollbackPlanner _planner;
|
||||
private readonly RollbackEvidenceGenerator _evidenceGenerator;
|
||||
private readonly IEventPublisher _eventPublisher;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly IGuidGenerator _guidGenerator;
|
||||
private readonly ILogger<RollbackManager> _logger;
|
||||
|
||||
public async Task<RollbackPlan> PlanAsync(Guid jobId, CancellationToken ct = default)
|
||||
{
|
||||
var job = await _jobStore.GetAsync(jobId, ct)
|
||||
?? throw new DeploymentJobNotFoundException(jobId);
|
||||
|
||||
if (job.Status != DeploymentStatus.Failed)
|
||||
{
|
||||
throw new RollbackNotRequiredException(jobId);
|
||||
}
|
||||
|
||||
// Find previous successful deployment
|
||||
var previousRelease = await _releaseHistory.GetPreviousDeployedAsync(
|
||||
job.EnvironmentId, job.ReleaseId, ct);
|
||||
|
||||
if (previousRelease is null)
|
||||
{
|
||||
throw new NoPreviousReleaseException(job.EnvironmentId);
|
||||
}
|
||||
|
||||
var plan = await _planner.CreatePlanAsync(job, previousRelease, ct);
|
||||
|
||||
_logger.LogInformation(
|
||||
"Created rollback plan {PlanId} for job {JobId}: rollback to {Release}",
|
||||
plan.Id, jobId, previousRelease.Name);
|
||||
|
||||
return plan;
|
||||
}
|
||||
|
||||
public async Task<DeploymentJob> ExecuteAsync(
|
||||
RollbackPlan plan,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var failedJob = await _jobStore.GetAsync(plan.FailedJobId, ct)
|
||||
?? throw new DeploymentJobNotFoundException(plan.FailedJobId);
|
||||
|
||||
var targetRelease = await _releaseManager.GetAsync(plan.TargetReleaseId, ct)
|
||||
?? throw new ReleaseNotFoundException(plan.TargetReleaseId);
|
||||
|
||||
// Update original job to rolling back
|
||||
failedJob = failedJob with { Status = DeploymentStatus.RollingBack };
|
||||
await _jobStore.SaveAsync(failedJob, ct);
|
||||
|
||||
await _eventPublisher.PublishAsync(new RollbackStarted(
|
||||
plan.Id, plan.FailedJobId, plan.TargetReleaseId,
|
||||
plan.TargetReleaseName, plan.Targets.Length, _timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
|
||||
try
|
||||
{
|
||||
// Generate rollback payload
|
||||
var payload = await _artifactGenerator.GeneratePayloadAsync(
|
||||
new DeploymentJob
|
||||
{
|
||||
Id = _guidGenerator.NewGuid(),
|
||||
TenantId = failedJob.TenantId,
|
||||
PromotionId = failedJob.PromotionId,
|
||||
ReleaseId = targetRelease.Id,
|
||||
ReleaseName = targetRelease.Name,
|
||||
EnvironmentId = failedJob.EnvironmentId,
|
||||
EnvironmentName = failedJob.EnvironmentName,
|
||||
Status = DeploymentStatus.Running,
|
||||
Strategy = DeploymentStrategy.AllAtOnce,
|
||||
Options = new DeploymentOptions(),
|
||||
Tasks = [],
|
||||
StartedAt = _timeProvider.GetUtcNow(),
|
||||
StartedBy = Guid.Empty
|
||||
}, ct);
|
||||
|
||||
// Execute rollback on each target
|
||||
foreach (var target in plan.Targets)
|
||||
{
|
||||
if (target.Status != RollbackTargetStatus.Pending)
|
||||
continue;
|
||||
|
||||
try
|
||||
{
|
||||
await ExecuteTargetRollbackAsync(failedJob, target, payload, ct);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_logger.LogError(ex,
|
||||
"Rollback failed for target {Target}",
|
||||
target.TargetName);
|
||||
}
|
||||
}
|
||||
|
||||
// Update job status
|
||||
failedJob = failedJob with
|
||||
{
|
||||
Status = DeploymentStatus.RolledBack,
|
||||
RollbackJobId = plan.Id,
|
||||
CompletedAt = _timeProvider.GetUtcNow()
|
||||
};
|
||||
await _jobStore.SaveAsync(failedJob, ct);
|
||||
|
||||
// Generate evidence
|
||||
await _evidenceGenerator.GenerateAsync(plan, failedJob, ct);
|
||||
|
||||
await _eventPublisher.PublishAsync(new RollbackCompleted(
|
||||
plan.Id, plan.FailedJobId, plan.TargetReleaseName,
|
||||
_timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
|
||||
_logger.LogInformation(
|
||||
"Rollback completed for job {JobId} to release {Release}",
|
||||
plan.FailedJobId, targetRelease.Name);
|
||||
|
||||
return failedJob;
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_logger.LogError(ex, "Rollback failed for job {JobId}", plan.FailedJobId);
|
||||
|
||||
failedJob = failedJob with
|
||||
{
|
||||
Status = DeploymentStatus.Failed,
|
||||
FailureReason = $"Rollback failed: {ex.Message}"
|
||||
};
|
||||
await _jobStore.SaveAsync(failedJob, ct);
|
||||
|
||||
await _eventPublisher.PublishAsync(new RollbackFailed(
|
||||
plan.Id, plan.FailedJobId, ex.Message, _timeProvider.GetUtcNow()
|
||||
), ct);
|
||||
|
||||
throw;
|
||||
}
|
||||
}
|
||||
|
||||
private async Task ExecuteTargetRollbackAsync(
|
||||
DeploymentJob job,
|
||||
RollbackTarget target,
|
||||
DeploymentPayload payload,
|
||||
CancellationToken ct)
|
||||
{
|
||||
_logger.LogInformation(
|
||||
"Rolling back target {Target} from {Current} to {Previous}",
|
||||
target.TargetName,
|
||||
target.CurrentDigest[..Math.Min(16, target.CurrentDigest.Length)],
|
||||
target.RollbackToDigest[..Math.Min(16, target.RollbackToDigest.Length)]);
|
||||
|
||||
// Create a rollback task
|
||||
var task = new DeploymentTask
|
||||
{
|
||||
Id = _guidGenerator.NewGuid(),
|
||||
TargetId = target.TargetId,
|
||||
TargetName = target.TargetName,
|
||||
BatchIndex = 0,
|
||||
Status = DeploymentTaskStatus.Pending
|
||||
};
|
||||
|
||||
await _targetExecutor.DeployToTargetAsync(job.Id, task.Id, payload, ct);
|
||||
}
|
||||
|
||||
public async Task<bool> CanRollbackAsync(Guid jobId, CancellationToken ct = default)
|
||||
{
|
||||
var job = await _jobStore.GetAsync(jobId, ct);
|
||||
if (job is null)
|
||||
return false;
|
||||
|
||||
if (job.Status != DeploymentStatus.Failed)
|
||||
return false;
|
||||
|
||||
var previousRelease = await _releaseHistory.GetPreviousDeployedAsync(
|
||||
job.EnvironmentId, job.ReleaseId, ct);
|
||||
|
||||
return previousRelease is not null;
|
||||
}
|
||||
}
|
||||
```
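
In `ExecuteAsync`, a failure on an individual target is logged and the loop continues, after which the job is marked `RolledBack`. One possible refinement, sketched here under the assumption that each target's final status is recorded, is to derive an overall outcome from the per-target statuses. `RollbackOutcome` and `RollbackOutcomeSketch` are hypothetical names, not part of the sprint's deliverables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the RollbackTargetStatus enum from the interface section.
public enum RollbackTargetStatus { Pending, RollingBack, RolledBack, Failed, Skipped }

// Hypothetical summary type (illustration only).
public enum RollbackOutcome { NothingToDo, Complete, Partial, Failed }

public static class RollbackOutcomeSketch
{
    public static RollbackOutcome Summarize(IEnumerable<RollbackTargetStatus> statuses)
    {
        // Skipped targets were never deployed, so they do not count either way.
        var attempted = statuses.Where(s => s != RollbackTargetStatus.Skipped).ToList();
        if (attempted.Count == 0)
            return RollbackOutcome.NothingToDo;

        var rolledBack = attempted.Count(s => s == RollbackTargetStatus.RolledBack);
        if (rolledBack == attempted.Count)
            return RollbackOutcome.Complete;

        // Anything short of full success is either a partial or a total failure.
        return rolledBack == 0 ? RollbackOutcome.Failed : RollbackOutcome.Partial;
    }
}
```

A caller could map `Partial` or `Failed` to a job status other than `RolledBack`, so the stored status and the rollback evidence reflect what actually happened on each target.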
|
||||
|
||||
### RollbackPlanner
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Rollback;
|
||||
|
||||
public sealed class RollbackPlanner
|
||||
{
|
||||
private readonly IInventorySyncService _inventoryService;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly IGuidGenerator _guidGenerator;
|
||||
|
||||
public async Task<RollbackPlan> CreatePlanAsync(
|
||||
DeploymentJob failedJob,
|
||||
Release targetRelease,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var targets = new List<RollbackTarget>();
|
||||
|
||||
foreach (var task in failedJob.Tasks)
|
||||
{
|
||||
// Get current state from inventory
|
||||
var snapshot = await _inventoryService.GetLatestSnapshotAsync(task.TargetId, ct);
|
||||
|
||||
var currentDigest = snapshot?.Containers
|
||||
.FirstOrDefault(c => IsDeployedComponent(c, failedJob.ReleaseName))
|
||||
?.ImageDigest ?? "";
|
||||
|
||||
var rollbackDigest = targetRelease.Components
|
||||
.FirstOrDefault(c => MatchesTarget(c, task))
|
||||
?.Digest ?? "";
|
||||
|
||||
targets.Add(new RollbackTarget(
|
||||
TargetId: task.TargetId,
|
||||
TargetName: task.TargetName,
|
||||
CurrentDigest: currentDigest,
|
||||
RollbackToDigest: rollbackDigest,
|
||||
Status: task.Status == DeploymentTaskStatus.Completed
|
||||
? RollbackTargetStatus.Pending
|
||||
: RollbackTargetStatus.Skipped
|
||||
));
|
||||
}
|
||||
|
||||
return new RollbackPlan
|
||||
{
|
||||
Id = _guidGenerator.NewGuid(),
|
||||
FailedJobId = failedJob.Id,
|
||||
TargetReleaseId = targetRelease.Id,
|
||||
TargetReleaseName = targetRelease.Name,
|
||||
Targets = targets.ToImmutableArray(),
|
||||
Strategy = RollbackStrategy.RedeployPrevious,
|
||||
PlannedAt = _timeProvider.GetUtcNow()
|
||||
};
|
||||
}
|
||||
|
||||
private static bool IsDeployedComponent(ContainerInfo container, string releaseName) =>
|
||||
container.Labels.GetValueOrDefault("stella.release.name") == releaseName;
|
||||
|
||||
private static bool MatchesTarget(ReleaseComponent component, DeploymentTask task) =>
|
||||
component.ComponentName == task.TargetName;
|
||||
}
|
||||
```
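
The planner falls back to an empty string when no matching component digest is found, so a plan can contain a pending target with nothing to roll back to. A pre-execution check along the following lines could surface that case. This is a sketch only: `PlannedTarget` is a hypothetical, reduced form of `RollbackTarget`, and `RollbackPlanChecks` is not part of the deliverable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down target: only the fields the check needs.
public sealed record PlannedTarget(string TargetName, string RollbackToDigest, bool IsPending);

public static class RollbackPlanChecks
{
    // Names of pending targets that have no digest to roll back to.
    public static IReadOnlyList<string> FindUnresolvable(IEnumerable<PlannedTarget> targets) =>
        targets
            .Where(t => t.IsPending && string.IsNullOrEmpty(t.RollbackToDigest))
            .Select(t => t.TargetName)
            .ToList();
}
```

Skipped targets are ignored on purpose: they were never deployed, so a missing digest there is harmless.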
|
||||
|
||||
### RollbackEvidenceGenerator
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Rollback;
|
||||
|
||||
public sealed class RollbackEvidenceGenerator
|
||||
{
|
||||
private readonly IEvidencePacketService _evidenceService;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly ILogger<RollbackEvidenceGenerator> _logger;
|
||||
|
||||
public async Task GenerateAsync(
|
||||
RollbackPlan plan,
|
||||
DeploymentJob job,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var evidence = new RollbackEvidence
|
||||
{
|
||||
PlanId = plan.Id.ToString(),
|
||||
FailedJobId = plan.FailedJobId.ToString(),
|
||||
TargetReleaseId = plan.TargetReleaseId.ToString(),
|
||||
TargetReleaseName = plan.TargetReleaseName,
|
||||
RollbackStrategy = plan.Strategy.ToString(),
|
||||
PlannedAt = plan.PlannedAt.ToString("O"),
|
||||
ExecutedAt = _timeProvider.GetUtcNow().ToString("O"),
|
||||
Targets = plan.Targets.Select(t => new RollbackTargetEvidence
|
||||
{
|
||||
TargetId = t.TargetId.ToString(),
|
||||
TargetName = t.TargetName,
|
||||
FromDigest = t.CurrentDigest,
|
||||
ToDigest = t.RollbackToDigest,
|
||||
Status = t.Status.ToString()
|
||||
}).ToList(),
|
||||
OriginalFailure = job.FailureReason
|
||||
};
|
||||
|
||||
var packet = await _evidenceService.CreatePacketAsync(new CreateEvidencePacketRequest
|
||||
{
|
||||
Type = EvidenceType.Rollback,
|
||||
SubjectId = plan.FailedJobId,
|
||||
Content = JsonSerializer.Serialize(evidence),
|
||||
Metadata = new Dictionary<string, string>
|
||||
{
|
||||
["rollbackPlanId"] = plan.Id.ToString(),
|
||||
["targetRelease"] = plan.TargetReleaseName,
|
||||
["environment"] = job.EnvironmentName
|
||||
}
|
||||
}, ct);
|
||||
|
||||
_logger.LogInformation(
|
||||
"Generated rollback evidence packet {PacketId} for job {JobId}",
|
||||
packet.Id, plan.FailedJobId);
|
||||
}
|
||||
}
|
||||
|
||||
public sealed class RollbackEvidence
|
||||
{
|
||||
public required string PlanId { get; set; }
|
||||
public required string FailedJobId { get; set; }
|
||||
public required string TargetReleaseId { get; set; }
|
||||
public required string TargetReleaseName { get; set; }
|
||||
public required string RollbackStrategy { get; set; }
|
||||
public required string PlannedAt { get; set; }
|
||||
public required string ExecutedAt { get; set; }
|
||||
public required IReadOnlyList<RollbackTargetEvidence> Targets { get; set; }
|
||||
public string? OriginalFailure { get; set; }
|
||||
}
|
||||
|
||||
public sealed class RollbackTargetEvidence
|
||||
{
|
||||
public required string TargetId { get; set; }
|
||||
public required string TargetName { get; set; }
|
||||
public required string FromDigest { get; set; }
|
||||
public required string ToDigest { get; set; }
|
||||
public required string Status { get; set; }
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [x] Plan rollback from failed deployment
|
||||
- [x] Find previous successful release
|
||||
- [x] Execute rollback on completed targets
|
||||
- [x] Skip targets not yet deployed
|
||||
- [x] Track rollback progress
|
||||
- [x] Generate rollback evidence
|
||||
- [x] Update deployment status
|
||||
- [x] Unit test coverage >=85%
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Dependency | Type | Status |
|
||||
|------------|------|--------|
|
||||
| 107_002 Target Executor | Internal | DONE |
|
||||
| 104_004 Release Catalog | Internal | DONE |
|
||||
| 109_002 Evidence Packets | Internal | Skipped (simplified) |
|
||||
|
||||
---
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
| Deliverable | Status | Notes |
|
||||
|-------------|--------|-------|
|
||||
| IRollbackManager | DONE | Interface with Plan/Execute/CanRollback |
|
||||
| RollbackManager | DONE | Full implementation with event publishing |
|
||||
| RollbackPlanner | DONE | Creates plans from failed jobs |
|
||||
| RollbackEvidenceGenerator | DONE | JSON evidence generation |
|
||||
| Unit tests | DONE | 32 rollback tests (165 total) |
|
||||
|
||||
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date | Entry |
|
||||
|------|-------|
|
||||
| 10-Jan-2026 | Sprint created |
|
||||
| 11-Jan-2026 | Created IRollbackManager interface and models |
|
||||
| 11-Jan-2026 | Implemented RollbackManager, RollbackPlanner, RollbackEvidenceGenerator |
|
||||
| 11-Jan-2026 | Extended IReleaseHistory with GetPreviousDeployedAsync |
|
||||
| 11-Jan-2026 | Extended IDeploymentStore with GetPreviousDeploymentAsync |
|
||||
| 11-Jan-2026 | Added rollback events (RollbackStarted, RollbackCompleted, RollbackFailed) |
|
||||
| 11-Jan-2026 | Created 32 unit tests for rollback components |
|
||||
| 11-Jan-2026 | All 165 tests passing, sprint complete |
|
||||
@@ -0,0 +1,465 @@
|
||||
# SPRINT: Deployment Strategies
|
||||
|
||||
> **Sprint ID:** 107_005
|
||||
> **Module:** DEPLOY
|
||||
> **Phase:** 7 - Deployment Execution
|
||||
> **Status:** DONE
|
||||
> **Parent:** [107_000_INDEX](SPRINT_20260110_107_000_INDEX_deployment_execution.md)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
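Implement the deployment strategies (rolling, blue-green, canary, all-at-once) that decide how a job's targets are split into batches and whether a rollout may continue after each batch.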
|
||||
|
||||
### Objectives
|
||||
|
||||
- Rolling deployment strategy
|
||||
- Blue-green deployment strategy
|
||||
- Canary deployment strategy
|
||||
- All-at-once deployment strategy
|
||||
- Strategy factory for selection
|
||||
|
||||
### Working Directory
|
||||
|
||||
```
|
||||
src/ReleaseOrchestrator/
|
||||
├── __Libraries/
|
||||
│ └── StellaOps.ReleaseOrchestrator.Deployment/
|
||||
│ └── Strategy/
|
||||
│ ├── IDeploymentStrategy.cs
|
||||
│ ├── DeploymentStrategyFactory.cs
|
||||
│ ├── RollingStrategy.cs
|
||||
│ ├── BlueGreenStrategy.cs
|
||||
│ ├── CanaryStrategy.cs
|
||||
│ └── AllAtOnceStrategy.cs
|
||||
└── __Tests/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Deliverables
|
||||
|
||||
### IDeploymentStrategy Interface
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Strategy;
|
||||
|
||||
public interface IDeploymentStrategy
|
||||
{
|
||||
string Name { get; }
|
||||
Task<IReadOnlyList<DeploymentBatch>> PlanAsync(DeploymentJob job, CancellationToken ct = default);
|
||||
Task<bool> ShouldProceedAsync(DeploymentBatch completedBatch, CancellationToken ct = default);
|
||||
}
|
||||
|
||||
public sealed record DeploymentBatch(
|
||||
int Index,
|
||||
ImmutableArray<Guid> TaskIds,
|
||||
BatchRequirements Requirements
|
||||
);
|
||||
|
||||
public sealed record BatchRequirements(
|
||||
bool WaitForHealthCheck = true,
|
||||
TimeSpan? HealthCheckTimeout = null,
|
||||
double MinSuccessRate = 1.0
|
||||
);
|
||||
```
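
The interface implies a simple driver loop: plan the batches, deploy them in order, and ask the strategy after each one whether to continue. The sketch below shows that loop against reduced, hypothetical shapes (`MiniBatch`, `IMiniStrategy`); the real orchestrator and its types are not shown in this document, so this should be read as an assumption about how the pieces fit together.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical minimal shapes; the real batch carries task ids and requirements.
public sealed record MiniBatch(int Index);

public interface IMiniStrategy
{
    Task<IReadOnlyList<MiniBatch>> PlanAsync(CancellationToken ct = default);
    Task<bool> ShouldProceedAsync(MiniBatch completedBatch, CancellationToken ct = default);
}

public static class BatchDriver
{
    // Deploys batches in order; returns how many were deployed before finishing or halting.
    public static async Task<int> RunAsync(
        IMiniStrategy strategy,
        Func<MiniBatch, CancellationToken, Task> deployBatch,
        CancellationToken ct = default)
    {
        var batches = await strategy.PlanAsync(ct);
        var deployed = 0;

        foreach (var batch in batches)
        {
            ct.ThrowIfCancellationRequested();
            await deployBatch(batch, ct);
            deployed++;

            // Gate: the strategy decides whether the rollout may continue.
            if (!await strategy.ShouldProceedAsync(batch, ct))
                break;
        }

        return deployed;
    }
}

// Fake strategy for illustration: three batches, halting after the given index.
public sealed class FakeStrategy : IMiniStrategy
{
    private readonly int _haltAfterIndex;

    public FakeStrategy(int haltAfterIndex) => _haltAfterIndex = haltAfterIndex;

    public Task<IReadOnlyList<MiniBatch>> PlanAsync(CancellationToken ct = default) =>
        Task.FromResult<IReadOnlyList<MiniBatch>>(
            new[] { new MiniBatch(0), new MiniBatch(1), new MiniBatch(2) });

    public Task<bool> ShouldProceedAsync(MiniBatch completedBatch, CancellationToken ct = default) =>
        Task.FromResult(completedBatch.Index != _haltAfterIndex);
}
```

With `new FakeStrategy(1)` the driver deploys batches 0 and 1 and then stops; a halted rollout is the natural point at which to hand over to the rollback manager.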
|
||||
|
||||
### RollingStrategy
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Strategy;
|
||||
|
||||
public sealed class RollingStrategy : IDeploymentStrategy
|
||||
{
|
||||
private readonly ITargetHealthChecker _healthChecker;
|
||||
private readonly ILogger<RollingStrategy> _logger;
|
||||
|
||||
public string Name => "rolling";
|
||||
|
||||
public Task<IReadOnlyList<DeploymentBatch>> PlanAsync(
|
||||
DeploymentJob job,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var batchSize = ParseBatchSize(job.Options.BatchSize, job.Tasks.Length);
|
||||
var batches = new List<DeploymentBatch>();
|
||||
|
||||
var taskIds = job.Tasks.Select(t => t.Id).ToList();
|
||||
var batchIndex = 0;
|
||||
|
||||
while (taskIds.Count > 0)
|
||||
{
|
||||
var batchTaskIds = taskIds.Take(batchSize).ToImmutableArray();
|
||||
taskIds = taskIds.Skip(batchSize).ToList();
|
||||
|
||||
batches.Add(new DeploymentBatch(
|
||||
Index: batchIndex++,
|
||||
TaskIds: batchTaskIds,
|
||||
Requirements: new BatchRequirements(
|
||||
WaitForHealthCheck: job.Options.WaitForHealthCheck,
|
||||
HealthCheckTimeout: TimeSpan.FromMinutes(5)
|
||||
)
|
||||
));
|
||||
}
|
||||
|
||||
_logger.LogInformation(
|
||||
"Rolling strategy planned {BatchCount} batches of ~{BatchSize} targets",
|
||||
batches.Count, batchSize);
|
||||
|
||||
return Task.FromResult<IReadOnlyList<DeploymentBatch>>(batches);
|
||||
}
|
||||
|
||||
public async Task<bool> ShouldProceedAsync(
|
||||
DeploymentBatch completedBatch,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
if (!completedBatch.Requirements.WaitForHealthCheck)
|
||||
return true;
|
||||
|
||||
// Check health of deployed targets
|
||||
foreach (var taskId in completedBatch.TaskIds)
|
||||
{
|
||||
var isHealthy = await _healthChecker.CheckTaskHealthAsync(taskId, ct);
|
||||
if (!isHealthy)
|
||||
{
|
||||
_logger.LogWarning(
|
||||
"Task {TaskId} in batch {BatchIndex} is unhealthy, halting rollout",
|
||||
taskId, completedBatch.Index);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
private static int ParseBatchSize(string? batchSizeSpec, int totalTargets)
|
||||
{
|
||||
if (string.IsNullOrEmpty(batchSizeSpec))
|
||||
return Math.Max(1, totalTargets / 4);
|
||||
|
||||
if (batchSizeSpec.EndsWith('%'))
|
||||
{
|
||||
var percent = int.Parse(batchSizeSpec.TrimEnd('%'), CultureInfo.InvariantCulture);
|
||||
return Math.Max(1, totalTargets * percent / 100);
|
||||
}
|
||||
|
||||
return Math.Max(1, int.Parse(batchSizeSpec, CultureInfo.InvariantCulture)); // clamp: a size of 0 or less would never drain the task list
|
||||
}
|
||||
}
|
||||
```
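
As a worked example of the batch-size rules, the sketch below mirrors `ParseBatchSize` (clamping every branch to at least 1) and splits a list of targets into consecutive batches. `RollingBatchSketch` and its `Plan` helper are hypothetical; targets are plain integers here rather than task ids, and `Enumerable.Chunk` assumes .NET 6 or later.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RollingBatchSketch
{
    // Mirrors ParseBatchSize; every branch clamps the result to at least 1.
    public static int ParseBatchSize(string? spec, int totalTargets)
    {
        if (string.IsNullOrEmpty(spec))
            return Math.Max(1, totalTargets / 4); // default: about a quarter of the targets

        if (spec.EndsWith('%'))
        {
            var percent = int.Parse(spec.TrimEnd('%'), CultureInfo.InvariantCulture);
            return Math.Max(1, totalTargets * percent / 100);
        }

        return Math.Max(1, int.Parse(spec, CultureInfo.InvariantCulture));
    }

    // Splits the targets into consecutive batches of the parsed size.
    public static List<int[]> Plan(IReadOnlyList<int> targets, string? spec) =>
        targets.Chunk(ParseBatchSize(spec, targets.Count)).ToList();
}
```

For ten targets, a spec of `"25%"` or no spec at all gives a batch size of 2 (five batches), while `"3"` gives batches of 3, 3, 3 and 1.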
|
||||
|
||||
### BlueGreenStrategy
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Strategy;
|
||||
|
||||
public sealed class BlueGreenStrategy : IDeploymentStrategy
|
||||
{
|
||||
private readonly ITargetHealthChecker _healthChecker;
|
||||
private readonly ITrafficRouter _trafficRouter;
|
||||
private readonly ILogger<BlueGreenStrategy> _logger;
|
||||
|
||||
public string Name => "blue-green";
|
||||
|
||||
public Task<IReadOnlyList<DeploymentBatch>> PlanAsync(
|
||||
DeploymentJob job,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
// Blue-green deploys to all targets at once (the "green" set)
|
||||
// Then switches traffic from "blue" to "green"
|
||||
var batches = new List<DeploymentBatch>
|
||||
{
|
||||
// Phase 1: Deploy to green (all targets)
|
||||
new DeploymentBatch(
|
||||
Index: 0,
|
||||
TaskIds: job.Tasks.Select(t => t.Id).ToImmutableArray(),
|
||||
Requirements: new BatchRequirements(
|
||||
WaitForHealthCheck: true,
|
||||
HealthCheckTimeout: TimeSpan.FromMinutes(10),
|
||||
MinSuccessRate: 1.0 // All must succeed
|
||||
)
|
||||
)
|
||||
};
|
||||
|
||||
_logger.LogInformation(
|
||||
"Blue-green strategy: deploy all {Count} targets, then switch traffic",
|
||||
job.Tasks.Length);
|
||||
|
||||
return Task.FromResult<IReadOnlyList<DeploymentBatch>>(batches);
|
||||
}
|
||||
|
||||
public async Task<bool> ShouldProceedAsync(
|
||||
DeploymentBatch completedBatch,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
// All targets must be healthy before switching traffic
|
||||
foreach (var taskId in completedBatch.TaskIds)
|
||||
{
|
||||
var isHealthy = await _healthChecker.CheckTaskHealthAsync(taskId, ct);
|
||||
if (!isHealthy)
|
||||
{
|
||||
_logger.LogWarning(
|
||||
"Blue-green: target {TaskId} unhealthy, not switching traffic",
|
||||
taskId);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
// Switch traffic to new deployment
|
||||
_logger.LogInformation("Blue-green: switching traffic to new deployment");
|
||||
// Traffic switching handled externally based on deployment type
|
||||
|
||||
return true;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### CanaryStrategy
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Strategy;
|
||||
|
||||
public sealed class CanaryStrategy : IDeploymentStrategy
|
||||
{
|
||||
private readonly ITargetHealthChecker _healthChecker;
|
||||
private readonly IMetricsCollector _metricsCollector;
|
||||
private readonly ILogger<CanaryStrategy> _logger;
|
||||
|
||||
public string Name => "canary";
|
||||
|
||||
public Task<IReadOnlyList<DeploymentBatch>> PlanAsync(
|
||||
DeploymentJob job,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var tasks = job.Tasks.ToList();
|
||||
var batches = new List<DeploymentBatch>();
|
||||
|
||||
if (tasks.Count == 0)
|
||||
return Task.FromResult<IReadOnlyList<DeploymentBatch>>(batches);
|
||||
|
||||
// Canary phase: 5% of targets (rounded down), but at least 1
|
||||
var canarySize = Math.Max(1, tasks.Count / 20);
|
||||
batches.Add(new DeploymentBatch(
|
||||
Index: 0,
|
||||
TaskIds: tasks.Take(canarySize).Select(t => t.Id).ToImmutableArray(),
|
||||
Requirements: new BatchRequirements(
|
||||
WaitForHealthCheck: true,
|
||||
HealthCheckTimeout: TimeSpan.FromMinutes(10),
|
||||
MinSuccessRate: 1.0
|
||||
)
|
||||
));
|
||||
tasks = tasks.Skip(canarySize).ToList();
|
||||
|
||||
// Gradual rollout: remaining targets in up to four batches (~25% of the remainder each)
|
||||
var batchIndex = 1;
|
||||
var incrementSize = Math.Max(1, (tasks.Count + 3) / 4);
|
||||
|
||||
while (tasks.Count > 0)
|
||||
{
|
||||
var batchTasks = tasks.Take(incrementSize).ToList();
|
||||
tasks = tasks.Skip(incrementSize).ToList();
|
||||
|
||||
batches.Add(new DeploymentBatch(
|
||||
Index: batchIndex++,
|
||||
TaskIds: batchTasks.Select(t => t.Id).ToImmutableArray(),
|
||||
Requirements: new BatchRequirements(
|
||||
WaitForHealthCheck: true,
|
||||
MinSuccessRate: 0.95 // Allow some failures in later batches
|
||||
)
|
||||
));
|
||||
}
|
||||
|
||||
_logger.LogInformation(
|
||||
"Canary strategy: {CanarySize} canary, then {Batches} batches",
|
||||
canarySize, batches.Count - 1);
|
||||
|
||||
return Task.FromResult<IReadOnlyList<DeploymentBatch>>(batches);
|
||||
}
|
||||
|
||||
public async Task<bool> ShouldProceedAsync(
|
||||
DeploymentBatch completedBatch,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
// Check health
|
||||
var healthyCount = 0;
|
||||
foreach (var taskId in completedBatch.TaskIds)
|
||||
{
|
||||
if (await _healthChecker.CheckTaskHealthAsync(taskId, ct))
|
||||
healthyCount++;
|
||||
}
|
||||
|
||||
var successRate = (double)healthyCount / completedBatch.TaskIds.Length;
|
||||
if (successRate < completedBatch.Requirements.MinSuccessRate)
|
||||
{
|
||||
_logger.LogWarning(
|
||||
"Canary batch {Index}: success rate {Rate:P0} below threshold {Required:P0}",
|
||||
completedBatch.Index, successRate, completedBatch.Requirements.MinSuccessRate);
|
||||
return false;
|
||||
}
|
||||
|
||||
// For canary batch (index 0), also check metrics
|
||||
if (completedBatch.Index == 0)
|
||||
{
|
||||
var metrics = await _metricsCollector.GetCanaryMetricsAsync(
|
||||
completedBatch.TaskIds, ct);
|
||||
|
||||
if (metrics.ErrorRate > 0.05)
|
||||
{
|
||||
_logger.LogWarning(
|
||||
"Canary error rate {Rate:P1} exceeds threshold",
|
||||
metrics.ErrorRate);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
public sealed record CanaryMetrics(
|
||||
double ErrorRate,
|
||||
double Latency99th,
|
||||
int RequestCount
|
||||
);
|
||||
```
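
The batch arithmetic in `PlanAsync` is easier to follow with concrete numbers. The following sketch reproduces only the size calculation (no task ids or requirements) so the resulting plan can be inspected for a given target count; `CanaryPlanSketch` is a hypothetical helper written for this example.

```csharp
using System;
using System.Collections.Generic;

public static class CanaryPlanSketch
{
    // Returns the batch sizes the canary plan produces for a given number of targets.
    public static List<int> BatchSizes(int totalTargets)
    {
        var sizes = new List<int>();
        if (totalTargets <= 0)
            return sizes;

        // Canary batch: 5% of targets (integer division), but at least one.
        var canary = Math.Max(1, totalTargets / 20);
        sizes.Add(canary);

        // Remaining targets: ceiling(remaining / 4) per batch, so at most four more batches.
        var remaining = totalTargets - canary;
        var increment = Math.Max(1, (remaining + 3) / 4);

        while (remaining > 0)
        {
            var size = Math.Min(increment, remaining);
            sizes.Add(size);
            remaining -= size;
        }

        return sizes;
    }
}
```

For 40 targets this yields 2, 10, 10, 10, 8: a two-target canary followed by four rollout batches. For 5 targets it yields five single-target batches, since 5% rounds down to zero and is raised to one.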
|
||||
|
||||
### AllAtOnceStrategy
|
||||
|
||||
```csharp
|
||||
namespace StellaOps.ReleaseOrchestrator.Deployment.Strategy;
|
||||
|
||||
public sealed class AllAtOnceStrategy : IDeploymentStrategy
|
||||
{
|
||||
private readonly ITargetHealthChecker _healthChecker;
|
||||
private readonly ILogger<AllAtOnceStrategy> _logger;
|
||||
|
||||
public string Name => "all-at-once";
|
||||
|
||||
public Task<IReadOnlyList<DeploymentBatch>> PlanAsync(
|
||||
DeploymentJob job,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var batches = new List<DeploymentBatch>
|
||||
{
|
||||
new DeploymentBatch(
|
||||
Index: 0,
|
||||
TaskIds: job.Tasks.Select(t => t.Id).ToImmutableArray(),
|
||||
Requirements: new BatchRequirements(
|
||||
WaitForHealthCheck: job.Options.WaitForHealthCheck,
|
||||
MinSuccessRate: 0.8 // Allow some failures
|
||||
)
|
||||
)
|
||||
};
|
||||
|
||||
_logger.LogInformation(
|
||||
"All-at-once strategy: deploying to all {Count} targets simultaneously",
|
||||
job.Tasks.Length);
|
||||
|
||||
return Task.FromResult<IReadOnlyList<DeploymentBatch>>(batches);
|
||||
}
|
||||
|
||||
public Task<bool> ShouldProceedAsync(
|
||||
DeploymentBatch completedBatch,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
// Single batch: there is no next batch, so always report "proceed" (MinSuccessRate is not evaluated here)
|
||||
return Task.FromResult(true);
|
||||
}
|
||||
}
|
||||
```
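
`AllAtOnceStrategy` declares a `MinSuccessRate` of 0.8, but its `ShouldProceedAsync` returns `true` unconditionally, so any enforcement has to happen elsewhere. A threshold check of the kind used by the canary strategy can be isolated as below. This is a sketch: `SuccessRateSketch` is a hypothetical helper, and treating an empty batch as passing is an assumed policy rather than something the source specifies.

```csharp
using System;

public static class SuccessRateSketch
{
    // True when the healthy share of a batch meets the required minimum.
    public static bool MeetsThreshold(int healthy, int total, double minSuccessRate)
    {
        if (total <= 0)
            return true; // assumed policy: an empty batch has nothing to fail

        return (double)healthy / total >= minSuccessRate;
    }
}
```

Handling the empty batch explicitly also avoids the `0 / 0` division that would otherwise produce `NaN` and make the comparison behave unexpectedly.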
|
||||
|
||||
### DeploymentStrategyFactory

```csharp
using Microsoft.Extensions.DependencyInjection; // GetRequiredService(Type) extension
using Microsoft.Extensions.Logging;

namespace StellaOps.ReleaseOrchestrator.Deployment.Strategy;

public interface IDeploymentStrategyFactory
{
    IDeploymentStrategy Create(DeploymentStrategy strategy);
    IReadOnlyList<string> GetAvailableStrategies();
}

public sealed class DeploymentStrategyFactory : IDeploymentStrategyFactory
{
    private readonly IServiceProvider _serviceProvider;
    private readonly ILogger<DeploymentStrategyFactory> _logger;

    // Maps each strategy enum value to the concrete type resolved from DI.
    private static readonly Dictionary<DeploymentStrategy, Type> StrategyTypes = new()
    {
        [DeploymentStrategy.Rolling] = typeof(RollingStrategy),
        [DeploymentStrategy.BlueGreen] = typeof(BlueGreenStrategy),
        [DeploymentStrategy.Canary] = typeof(CanaryStrategy),
        [DeploymentStrategy.AllAtOnce] = typeof(AllAtOnceStrategy)
    };

    public DeploymentStrategyFactory(
        IServiceProvider serviceProvider,
        ILogger<DeploymentStrategyFactory> logger)
    {
        _serviceProvider = serviceProvider;
        _logger = logger;
    }

    public IDeploymentStrategy Create(DeploymentStrategy strategy)
    {
        if (!StrategyTypes.TryGetValue(strategy, out var type))
        {
            throw new UnsupportedStrategyException(strategy);
        }

        // Strategies are resolved by concrete type, so each one must be registered as itself.
        var instance = _serviceProvider.GetRequiredService(type) as IDeploymentStrategy;
        if (instance is null)
        {
            throw new StrategyCreationException(strategy);
        }

        _logger.LogDebug("Created {Strategy} deployment strategy", strategy);
        return instance;
    }

    public IReadOnlyList<string> GetAvailableStrategies() =>
        StrategyTypes.Keys.Select(s => s.ToString()).ToList().AsReadOnly();
}
```

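
The factory looks each strategy up by its concrete type (`typeof(RollingStrategy)` and so on), which suggests the container has to register the strategy classes themselves rather than only `IDeploymentStrategy`. The self-contained sketch below illustrates that lookup pattern with stand-in types. `StrategyKind`, `IStrategy`, the `*Stub` classes, `MapServiceProvider`, and `StubStrategyFactory` are all hypothetical names introduced here, and `MapServiceProvider` is a toy substitute for a real DI container. Under these assumptions, a strategy that was never registered only surfaces as an exception when `Create` is called, not at startup.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// All names below are illustrative stand-ins, not the sprint's real types.
public enum StrategyKind { Rolling, AllAtOnce }

public interface IStrategy { string Name { get; } }
public sealed class RollingStub : IStrategy { public string Name => "rolling"; }
public sealed class AllAtOnceStub : IStrategy { public string Name => "all-at-once"; }

// Minimal dictionary-backed IServiceProvider standing in for a DI container.
public sealed class MapServiceProvider : IServiceProvider
{
    private readonly Dictionary<Type, Func<object>> _registrations = new();

    // Registers a factory delegate under the concrete type T.
    public MapServiceProvider Register<T>(Func<T> create) where T : class
    {
        _registrations[typeof(T)] = () => create();
        return this;
    }

    // Returns null for unregistered types.
    public object? GetService(Type serviceType) =>
        _registrations.TryGetValue(serviceType, out var create) ? create() : null;
}

public sealed class StubStrategyFactory
{
    private static readonly Dictionary<StrategyKind, Type> StrategyTypes = new()
    {
        [StrategyKind.Rolling] = typeof(RollingStub),
        [StrategyKind.AllAtOnce] = typeof(AllAtOnceStub)
    };

    private readonly IServiceProvider _provider;

    public StubStrategyFactory(IServiceProvider provider) => _provider = provider;

    public IStrategy Create(StrategyKind kind)
    {
        if (!StrategyTypes.TryGetValue(kind, out var type))
            throw new NotSupportedException($"No strategy mapped for {kind}");

        // Lookup is by concrete type, so an interface-only registration would not be found.
        return _provider.GetService(type) as IStrategy
            ?? throw new InvalidOperationException($"{type.Name} is not registered");
    }
}
```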
---

## Acceptance Criteria

- [x] Rolling strategy batches targets
- [x] Rolling strategy checks health between batches
- [x] Blue-green deploys all then switches
- [x] Canary deploys incrementally
- [x] Canary checks metrics after canary batch (simplified: checks success rate)
- [x] All-at-once deploys simultaneously
- [x] Strategy factory creates correct type
- [x] Batch size parsing works
- [x] Unit test coverage >=85%

---

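
The execution log for this sprint records the canary strategy as "5% canary + 25% increments". The exact batching and rounding rules are not given in this section, so the sketch below is one plausible reading rather than the actual implementation: the first batch takes 5% of the targets, and each later batch takes 25% until none remain, with every percentage rounded up and never smaller than one target. `CanaryPlanSketch`, `BatchSizes`, and `PercentOf` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the real canary strategy may size its batches differently.
public static class CanaryPlanSketch
{
    // Returns the number of targets in each batch: one canary batch, then fixed increments.
    public static IReadOnlyList<int> BatchSizes(
        int targetCount, int canaryPercent = 5, int incrementPercent = 25)
    {
        if (targetCount < 0)
            throw new ArgumentOutOfRangeException(nameof(targetCount));
        if (canaryPercent < 1 || canaryPercent > 100)
            throw new ArgumentOutOfRangeException(nameof(canaryPercent));
        if (incrementPercent < 1 || incrementPercent > 100)
            throw new ArgumentOutOfRangeException(nameof(incrementPercent));

        var sizes = new List<int>();
        int remaining = targetCount;
        int increment = PercentOf(targetCount, incrementPercent);
        int next = Math.Min(PercentOf(targetCount, canaryPercent), remaining);

        while (remaining > 0)
        {
            sizes.Add(next);
            remaining -= next;
            next = Math.Min(increment, remaining); // last batch takes whatever is left
        }

        return sizes;
    }

    // Integer ceiling of count * percent / 100, at least 1 for a non-empty target set.
    private static int PercentOf(int count, int percent) =>
        count == 0 ? 0 : Math.Max(1, (int)(((long)count * percent + 99) / 100));
}
```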
## Dependencies

| Dependency | Type | Status |
|------------|------|--------|
| 107_002 Target Executor | Internal | TODO |
| 103_002 Target Registry | Internal | TODO |

---

## Delivery Tracker

| Deliverable | Status | Notes |
|-------------|--------|-------|
| IDeploymentStrategy | DONE | Existing interface in Orchestrator/ |
| DeploymentStrategyFactory | DONE | Updated with BlueGreen and Canary |
| RollingStrategy | DONE | Existing in RollingDeploymentStrategy.cs |
| BlueGreenStrategy | DONE | Added to RollingDeploymentStrategy.cs |
| CanaryStrategy | DONE | Added to RollingDeploymentStrategy.cs |
| AllAtOnceStrategy | DONE | Existing in RollingDeploymentStrategy.cs |
| Unit tests | DONE | 179 tests pass |

---

## Execution Log

| Date | Entry |
|------|-------|
| 10-Jan-2026 | Sprint created |
| 11-Jan-2026 | Added BlueGreenDeploymentStrategy with single-batch deployment for cutover |
| 11-Jan-2026 | Added CanaryDeploymentStrategy with 5% canary + 25% increments |
| 11-Jan-2026 | Updated DeploymentStrategyFactory to support all 4 strategies |
| 11-Jan-2026 | Added 14 new unit tests for BlueGreen and Canary; all 179 tests pass |
| 11-Jan-2026 | Sprint completed and archived |