release orchestrator pivot, architecture and planning
591
docs/modules/release-orchestrator/workflow/execution.md
Normal file
@@ -0,0 +1,591 @@
# Workflow Execution

## Overview

The Workflow Engine executes workflow templates as DAGs (Directed Acyclic Graphs) of steps, managing state transitions, parallelism, retries, and failure handling.

## Execution Architecture

```
                        WORKFLOW EXECUTION ARCHITECTURE

┌─────────────────────────────────────────────────────────────────────────────┐
│                               WORKFLOW ENGINE                               │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                           WORKFLOW RUNNER                           │    │
│  │                                                                     │    │
│  │         ┌────────────┐    ┌────────────┐    ┌────────────┐          │    │
│  │         │  Template  │───►│ Execution  │───►│  Context   │          │    │
│  │         │   Parser   │    │  Planner   │    │  Builder   │          │    │
│  │         └────────────┘    └────────────┘    └────────────┘          │    │
│  │                │                 │                 │                │    │
│  │                └─────────────────┼─────────────────┘                │    │
│  │                                  ▼                                  │    │
│  │   ┌─────────────────────────────────────────────────────────────┐   │    │
│  │   │                        DAG EXECUTOR                         │   │    │
│  │   │                                                             │   │    │
│  │   │  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │   │    │
│  │   │  │  Ready   │   │ Running  │   │ Waiting  │   │Completed │  │   │    │
│  │   │  │  Queue   │   │   Set    │   │   Set    │   │   Set    │  │   │    │
│  │   │  └──────────┘   └──────────┘   └──────────┘   └──────────┘  │   │    │
│  │   │                                                             │   │    │
│  │   │  ┌───────────────────────────────────────────────────────┐  │   │    │
│  │   │  │                    STEP DISPATCHER                    │  │   │    │
│  │   │  └───────────────────────────────────────────────────────┘  │   │    │
│  │   └─────────────────────────────────────────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                     │                                       │
│                                     ▼                                       │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                         STEP EXECUTOR POOL                          │    │
│  │                                                                     │    │
│  │  ┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐  │    │
│  │  │ Executor 1 │   │ Executor 2 │   │ Executor 3 │   │ Executor N │  │    │
│  │  └────────────┘   └────────────┘   └────────────┘   └────────────┘  │    │
│  │                                                                     │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Workflow Run State Machine

```
                WORKFLOW RUN STATES

                    ┌──────────┐
                    │ CREATED  │
                    └────┬─────┘
                         │ start()
                         ▼
                    ┌──────────┐
                    │ RUNNING  │◄─────────────────────┐
                    └────┬─────┘                      │
                         │                            │
     ┌───────────────────┼───────────────────┐        │
     │                   │                   │        │
     ▼                   ▼                   ▼        │
┌──────────┐        ┌──────────┐        ┌──────────┐  │
│ WAITING  │        │  PAUSED  │        │ FAILING  │  │
│ APPROVAL │        │          │        │          │  │
└────┬─────┘        └────┬─────┘        └────┬─────┘  │
     │                   │                   │        │
     │ approve()         │ resume()          │        │
     │                   │                   │        │
     └───────────────────┼───────────────────┘        │
                         │                            │
                         └────────────────────────────┘
                         │
     ┌───────────────────┼───────────────────┐
     │                   │                   │
     ▼                   ▼                   ▼
┌──────────┐        ┌──────────┐        ┌──────────┐
│COMPLETED │        │  FAILED  │        │CANCELLED │
└──────────┘        └──────────┘        └──────────┘
```

### State Transitions

| Current State | Event | Next State | Description |
|---------------|-------|------------|-------------|
| `created` | `start()` | `running` | Begin workflow execution |
| `running` | Step requires approval | `waiting_approval` | Pause for human approval |
| `running` | `pause()` | `paused` | Manual pause requested |
| `running` | Step fails | `failing` | Handle failure path |
| `running` | All steps complete | `completed` | Workflow success |
| `waiting_approval` | `approve()` | `running` | Resume after approval |
| `waiting_approval` | `reject()` | `failed` | Rejection ends workflow |
| `paused` | `resume()` | `running` | Resume execution |
| `paused` | `cancel()` | `cancelled` | Cancel workflow |
| `failing` | Rollback complete | `failed` | Failure handling done |
| `failing` | Rollback succeeds | `running` | Resume with fallback |

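The transition table can be encoded as a lookup keyed by `(state, event)`. The following is a minimal sketch rather than the engine's implementation: the event labels (for example `step_requires_approval` and `rollback_succeeds`) are hypothetical names for the table rows, and `next_state` is an illustrative helper.

```python
# Hypothetical event labels, one per row of the transition table.
TRANSITIONS = {
    ("created", "start"): "running",
    ("running", "step_requires_approval"): "waiting_approval",
    ("running", "pause"): "paused",
    ("running", "step_fails"): "failing",
    ("running", "all_steps_complete"): "completed",
    ("waiting_approval", "approve"): "running",
    ("waiting_approval", "reject"): "failed",
    ("paused", "resume"): "running",
    ("paused", "cancel"): "cancelled",
    ("failing", "rollback_complete"): "failed",
    ("failing", "rollback_succeeds"): "running",
}


def next_state(current: str, event: str) -> str:
    """Return the next run state, or raise if the table has no such row."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"no transition from {current!r} on {event!r}") from None
```

Keeping the table as data makes it easy to reject events that arrive in the wrong state, such as `approve()` on a run that is already `completed`.
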
## Step Execution State Machine

```
                    STEP STATES

                    ┌──────────┐
                    │ PENDING  │
                    └────┬─────┘
                         │ schedule()
                         ▼
                    ┌──────────┐
                    │  QUEUED  │
                    └────┬─────┘
                         │ dispatch()
                         ▼
                    ┌──────────┐
                    │ RUNNING  │◄─────────────────────┐
                    └────┬─────┘                      │
                         │                            │ retry()
     ┌───────────────────┼───────────────────┐        │
     │                   │                   │        │
     ▼                   ▼                   ▼        │
┌──────────┐        ┌──────────┐        ┌──────────┐  │
│SUCCEEDED │        │  FAILED  │        │ RETRYING │──┘
└──────────┘        └────┬─────┘        └──────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │   FAILURE HANDLER   │
              │  ┌───────────────┐  │
              │  │ fail          │──┼─► Mark workflow failing
              │  │ continue      │──┼─► Continue to next step
              │  │ rollback      │──┼─► Trigger rollback path
              │  │ goto:{nodeId} │──┼─► Jump to specific node
              │  └───────────────┘  │
              └─────────────────────┘
```

### Step States

| State | Description |
|-------|-------------|
| `pending` | Step not yet ready (dependencies incomplete) |
| `queued` | Ready for execution, waiting for executor |
| `running` | Currently executing |
| `succeeded` | Completed successfully |
| `failed` | Failed after all retries exhausted |
| `retrying` | Failed, waiting for retry |
| `skipped` | Condition evaluated to false |

## DAG Execution Algorithm

```python
from __future__ import annotations

import asyncio
from datetime import datetime, timezone
from typing import List


class DAGExecutor:
    # WorkflowRun, StepRun, the status enums and helpers such as
    # evaluate_condition(), resolve_inputs(), is_retryable() and
    # calculate_backoff() are defined elsewhere in the engine.

    def __init__(self, workflow_run: WorkflowRun, step_registry: StepRegistry):
        self.run = workflow_run
        self.template = workflow_run.template
        self.step_registry = step_registry
        self.pending = {node.id for node in self.template.nodes}
        self.running = set()
        self.completed = set()
        self.failed = set()
        self.outputs = {}  # nodeId -> outputs
        self.tasks = set()  # in-flight node tasks (held so they are not garbage-collected)

    async def execute(self):
        """Main execution loop."""
        self.run.status = WorkflowStatus.RUNNING
        self.run.started_at = datetime.now(timezone.utc)

        while self.pending or self.running:
            # Find ready nodes (all dependencies satisfied)
            ready = self.find_ready_nodes()

            # Dispatch ready nodes
            for node_id in ready:
                self.pending.remove(node_id)
                self.running.add(node_id)
                self.tasks.add(asyncio.create_task(self.execute_node(node_id)))

            if self.running:
                # Wait for any node to complete
                await self.wait_for_completion()
            elif self.pending:
                # Nothing is running and nothing became ready: deadlock
                raise DeadlockException(self.pending)

        # Determine final status
        if self.failed:
            self.run.status = WorkflowStatus.FAILED
        else:
            self.run.status = WorkflowStatus.COMPLETED

        self.run.completed_at = datetime.now(timezone.utc)

    async def wait_for_completion(self):
        """Block until at least one in-flight node task finishes."""
        done, _ = await asyncio.wait(self.tasks, return_when=asyncio.FIRST_COMPLETED)
        self.tasks -= done

    def find_ready_nodes(self) -> List[str]:
        """Find nodes whose dependencies are all complete."""
        ready = []
        # Iterate over a copy: mark_skipped() removes nodes from self.pending
        for node_id in list(self.pending):
            node = self.template.get_node(node_id)

            # Check condition
            if node.condition:
                if not self.evaluate_condition(node.condition):
                    self.mark_skipped(node_id)
                    continue

            # Check all incoming edges
            incoming = self.template.get_incoming_edges(node_id)
            dependencies_met = all(
                edge.from_node in self.completed
                for edge in incoming
                if self.evaluate_edge_condition(edge)
            )

            if dependencies_met:
                ready.append(node_id)

        return ready

    async def execute_node(self, node_id: str, step_run: StepRun | None = None):
        """Execute a single node; step_run is passed in when retrying."""
        node = self.template.get_node(node_id)
        if step_run is None:
            step_run = StepRun(
                workflow_run_id=self.run.id,
                node_id=node_id,
                status=StepStatus.RUNNING
            )
        else:
            step_run.status = StepStatus.RUNNING

        try:
            # Resolve inputs
            inputs = self.resolve_inputs(node)

            # Get step executor
            executor = self.step_registry.get_executor(node.type)

            # Execute with timeout (asyncio.timeout requires Python 3.11+)
            async with asyncio.timeout(node.timeout):
                outputs = await executor.execute(inputs, node.config)

            # Store outputs
            self.outputs[node_id] = outputs
            step_run.outputs = outputs
            step_run.status = StepStatus.SUCCEEDED

            self.running.remove(node_id)
            self.completed.add(node_id)

        except Exception as e:
            await self.handle_step_failure(node, step_run, e)

    async def handle_step_failure(self, node, step_run, error):
        """Handle step failure according to retry and failure policies."""
        step_run.attempt_number += 1

        # Check retry policy
        if step_run.attempt_number <= node.retry_policy.max_retries:
            if self.is_retryable(error, node.retry_policy):
                step_run.status = StepStatus.RETRYING
                delay = self.calculate_backoff(node.retry_policy, step_run.attempt_number)
                await asyncio.sleep(delay)
                # Reuse step_run so the attempt counter carries over
                await self.execute_node(node.id, step_run)
                return

        # No more retries - handle failure
        step_run.status = StepStatus.FAILED
        step_run.error = str(error)

        match node.on_failure:
            case "fail":
                self.run.status = WorkflowStatus.FAILING
                self.failed.add(node.id)
            case "continue":
                self.completed.add(node.id)  # Continue as if succeeded
            case "rollback":
                await self.trigger_rollback(node)
            case _ if node.on_failure.startswith("goto:"):
                target = node.on_failure.split(":", 1)[1]
                self.pending.add(target)  # Add target to pending

        self.running.discard(node.id)
```

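The executor calls `calculate_backoff` without showing it. A minimal sketch follows, assuming the `backoffType` (`"fixed"` or `"exponential"`) and `backoffSeconds` fields of the `RetryPolicy` described in the templates document. The doubling factor and the `max_delay` cap are assumptions made for illustration, not documented engine behaviour.

```python
def calculate_backoff(backoff_type: str, backoff_seconds: float, attempt: int,
                      max_delay: float = 300.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based).

    Assumption: exponential backoff doubles on each attempt and is capped
    at `max_delay`; neither detail comes from the original design.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if backoff_type == "fixed":
        delay = backoff_seconds
    elif backoff_type == "exponential":
        delay = backoff_seconds * 2 ** (attempt - 1)
    else:
        raise ValueError(f"unknown backoff type: {backoff_type!r}")
    return min(delay, max_delay)
```

With `backoffSeconds = 2` and exponential backoff, the first four retries would wait 2, 4, 8 and 16 seconds under these assumptions.
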
## Input Resolution

Each step input is bound to one of five sources: a literal value, a path into the execution context, an output of an earlier step, a secret, or an expression.

```typescript
interface InputResolver {
  resolve(binding: InputBinding, context: ExecutionContext): any;
}

class StandardInputResolver implements InputResolver {
  resolve(binding: InputBinding, context: ExecutionContext): any {
    switch (binding.source.type) {
      case "literal":
        return binding.source.value;

      case "context":
        // Navigate context path: "release.name" -> context.release.name
        return this.navigatePath(context, binding.source.path);

      case "output": {
        // Get output from previous step
        const stepOutputs = context.stepOutputs[binding.source.nodeId];
        return stepOutputs?.[binding.source.outputName];
      }

      case "secret":
        // Fetch from vault (never cached)
        return this.secretsClient.fetch(binding.source.secretName);

      case "expression":
        // Evaluate JavaScript expression
        return this.expressionEvaluator.evaluate(
          binding.source.expression,
          context
        );
    }
  }
}
```

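The `context` source relies on a path walk such as `"release.name" -> context.release.name`. A minimal sketch of that walk over nested mappings is shown below; `navigate_path` is an illustrative helper, and returning `None` for a missing segment is an assumption chosen to mirror the optional chaining used for step outputs.

```python
from typing import Any, Mapping


def navigate_path(context: Mapping[str, Any], path: str) -> Any:
    """Follow a dotted path through nested mappings.

    Returns None as soon as a segment is missing (an assumption; the
    engine could equally raise an error here).
    """
    current: Any = context
    for key in path.split("."):
        if isinstance(current, Mapping) and key in current:
            current = current[key]
        else:
            return None
    return current
```
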
## Execution Context

The execution context provides data available to all steps:

```typescript
interface ExecutionContext {
  // Workflow identifiers
  workflowRunId: UUID;
  templateId: UUID;
  templateVersion: number;

  // Input values
  inputs: Record<string, any>;

  // Domain objects (loaded at start)
  release?: Release;
  promotion?: Promotion;
  environment?: Environment;
  targets?: Target[];

  // Step outputs (accumulated during execution)
  stepOutputs: Record<string, Record<string, any>>;

  // Tenant context
  tenantId: UUID;
  userId: UUID;

  // Metadata
  startedAt: DateTime;
  correlationId: string;
}
```

## Concurrency Control

### Parallelism Within Workflows

```typescript
interface ParallelConfig {
  maxConcurrency: number;  // Max simultaneous steps
  failFast: boolean;       // Stop all on first failure
}

// Example: Parallel deployment to multiple targets
const parallelDeploy: StepNode = {
  id: "parallel-deploy",
  type: "parallel",
  config: {
    maxConcurrency: 5,
    failFast: false
  },
  children: [
    { id: "deploy-target-1", type: "deploy-docker", ... },
    { id: "deploy-target-2", type: "deploy-docker", ... },
    { id: "deploy-target-3", type: "deploy-docker", ... },
  ]
};
```

### Global Concurrency Limits

```typescript
interface ConcurrencyLimits {
  maxWorkflowsPerTenant: number;         // Concurrent workflow runs
  maxStepsPerWorkflow: number;           // Concurrent steps per workflow
  maxDeploymentsPerEnvironment: number;  // Prevent deployment conflicts
}

// Default limits
const defaults: ConcurrencyLimits = {
  maxWorkflowsPerTenant: 10,
  maxStepsPerWorkflow: 20,
  maxDeploymentsPerEnvironment: 1  // One deployment at a time
};
```

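A limit such as `maxConcurrency` or `maxStepsPerWorkflow` can be enforced with a counting semaphore. The sketch below is one possible approach, not the engine's scheduler; `run_limited` is a hypothetical helper name.

```python
import asyncio
from typing import Any, Awaitable, Iterable, List


async def run_limited(steps: Iterable[Awaitable[Any]], max_concurrency: int) -> List[Any]:
    """Run awaitables with at most `max_concurrency` in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(step: Awaitable[Any]) -> Any:
        # Each step waits here until one of the slots is free.
        async with semaphore:
            return await step

    # gather() preserves the input order in its result list.
    return await asyncio.gather(*(guarded(s) for s in steps))
```

A `failFast` mode would additionally need to cancel the remaining tasks on the first exception, which this sketch leaves out.
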
## Checkpoint and Resume

Workflows support checkpointing for long-running executions:

```typescript
interface WorkflowCheckpoint {
  workflowRunId: UUID;
  checkpointedAt: DateTime;

  // Execution state
  pendingNodes: string[];
  completedNodes: string[];
  failedNodes: string[];

  // Accumulated data
  stepOutputs: Record<string, Record<string, any>>;

  // Context snapshot
  contextSnapshot: ExecutionContext;
}

class CheckpointManager {
  // Save checkpoint after each step completion
  async saveCheckpoint(run: WorkflowRun): Promise<void> {
    const checkpoint: WorkflowCheckpoint = {
      workflowRunId: run.id,
      checkpointedAt: new Date(),
      pendingNodes: Array.from(run.executor.pending),
      completedNodes: Array.from(run.executor.completed),
      failedNodes: Array.from(run.executor.failed),
      stepOutputs: run.executor.outputs,
      contextSnapshot: run.context
    };

    await this.repository.save(checkpoint);
  }

  // Resume from checkpoint after service restart
  async resumeFromCheckpoint(workflowRunId: UUID): Promise<WorkflowRun> {
    const checkpoint = await this.repository.get(workflowRunId);

    const run = new WorkflowRun();
    run.executor.pending = new Set(checkpoint.pendingNodes);
    run.executor.completed = new Set(checkpoint.completedNodes);
    run.executor.failed = new Set(checkpoint.failedNodes);
    run.executor.outputs = checkpoint.stepOutputs;
    run.context = checkpoint.contextSnapshot;

    // Resume execution
    await run.executor.execute();
    return run;
  }
}
```

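The checkpoint stores node sets as arrays, so a round trip has to convert between sets and lists. The sketch below shows that round trip with JSON. It also folds nodes that were running at checkpoint time back into `pendingNodes` so that they are re-dispatched on resume; that choice is an assumption (the checkpoint shape above has no field for running nodes) and is only safe for steps that can be repeated.

```python
import json
from typing import Any, Dict, Set, Tuple


def to_checkpoint(pending: Set[str], running: Set[str], completed: Set[str],
                  failed: Set[str], outputs: Dict[str, Any]) -> str:
    """Serialize executor state; sets become sorted lists for stable output."""
    return json.dumps({
        # Assumption: in-flight nodes are re-run after a restart.
        "pendingNodes": sorted(pending | running),
        "completedNodes": sorted(completed),
        "failedNodes": sorted(failed),
        "stepOutputs": outputs,
    })


def from_checkpoint(payload: str) -> Tuple[Set[str], Set[str], Set[str], Dict[str, Any]]:
    """Restore (pending, completed, failed, outputs) from a checkpoint."""
    data = json.loads(payload)
    return (set(data["pendingNodes"]), set(data["completedNodes"]),
            set(data["failedNodes"]), data["stepOutputs"])
```
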
## Timeout Handling

```typescript
interface TimeoutConfig {
  stepTimeout: number;      // Per-step timeout (seconds)
  workflowTimeout: number;  // Total workflow timeout (seconds)
}

class TimeoutHandler {
  async executeWithTimeout<T>(
    operation: (signal: AbortSignal) => Promise<T>,
    timeoutSeconds: number,
    onTimeout: () => Promise<void>
  ): Promise<T> {
    const controller = new AbortController();
    let timeoutId: ReturnType<typeof setTimeout> | undefined;

    // Rejects once the deadline passes; the signal is aborted as well so
    // the operation can stop its own work cooperatively.
    const deadline = new Promise<never>((_, reject) => {
      timeoutId = setTimeout(() => {
        controller.abort();
        reject(new TimeoutException(timeoutSeconds));
      }, timeoutSeconds * 1000);
    });

    try {
      return await Promise.race([operation(controller.signal), deadline]);
    } catch (error) {
      if (error instanceof TimeoutException) {
        await onTimeout();
      }
      throw error;
    } finally {
      // Always clear the timer, on success and on failure
      clearTimeout(timeoutId);
    }
  }
}
```

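On the Python side of the engine the same pattern can be expressed with `asyncio.wait_for`, which cancels the awaited operation when the deadline passes. This is a minimal sketch; `execute_with_timeout` and its parameters mirror the handler above but are otherwise illustrative.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def execute_with_timeout(operation: Callable[[], Awaitable[Any]],
                               timeout_seconds: float,
                               on_timeout: Callable[[], Awaitable[None]]) -> Any:
    """Run `operation`, calling `on_timeout` before re-raising on expiry."""
    try:
        return await asyncio.wait_for(operation(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the operation at this point.
        await on_timeout()
        raise
```
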
## Event Emission

The workflow engine emits events for observability:

```typescript
type WorkflowEvent =
  | { type: "workflow.started"; workflowRunId: UUID; templateId: UUID }
  | { type: "workflow.completed"; workflowRunId: UUID; status: string }
  | { type: "workflow.failed"; workflowRunId: UUID; error: string }
  | { type: "step.started"; workflowRunId: UUID; nodeId: string }
  | { type: "step.completed"; workflowRunId: UUID; nodeId: string; outputs: any }
  | { type: "step.failed"; workflowRunId: UUID; nodeId: string; error: string }
  | { type: "step.retrying"; workflowRunId: UUID; nodeId: string; attempt: number };

class WorkflowEventEmitter {
  private subscribers: Map<string, ((event: WorkflowEvent) => void)[]> = new Map();

  emit(event: WorkflowEvent): void {
    const handlers = this.subscribers.get(event.type) || [];
    for (const handler of handlers) {
      handler(event);
    }

    // Also emit to event bus for external consumers
    this.eventBus.publish("workflow.events", event);
  }
}
```

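The emitter above shows `emit` but not how handlers are registered. The sketch below is an illustrative in-process version with a hypothetical `subscribe` method. Isolating handler exceptions so that one failing subscriber does not block the others is an assumption, not documented behaviour, and the external event bus is left out.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]


class WorkflowEventEmitter:
    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        """Register a handler for one event type, e.g. "step.failed"."""
        self._subscribers[event_type].append(handler)

    def emit(self, event: Event) -> None:
        """Deliver the event to every handler registered for its type."""
        for handler in list(self._subscribers.get(event["type"], [])):
            try:
                handler(event)
            except Exception:
                # Assumption: a failing subscriber must not stop delivery.
                continue
```
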
## Execution Monitoring

### Real-time Progress

```typescript
interface WorkflowProgress {
  workflowRunId: UUID;
  status: WorkflowStatus;

  // Step progress
  totalSteps: number;
  completedSteps: number;
  runningSteps: number;
  failedSteps: number;

  // Current activity
  currentNodes: string[];

  // Timing
  startedAt: DateTime;
  estimatedCompletion?: DateTime;

  // Step details
  steps: StepProgress[];
}

interface StepProgress {
  nodeId: string;
  nodeName: string;
  status: StepStatus;
  startedAt?: DateTime;
  completedAt?: DateTime;
  attempt: number;
  logs?: string;
}
```

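The counters in `WorkflowProgress` can be derived from per-step statuses. A minimal sketch follows; `summarize_progress` is an illustrative helper, and counting `skipped` steps as completed is an assumption rather than something the document specifies.

```python
from collections import Counter
from typing import Any, Dict


def summarize_progress(step_statuses: Dict[str, str]) -> Dict[str, Any]:
    """Derive progress counters from a nodeId -> status mapping."""
    counts = Counter(step_statuses.values())
    return {
        "totalSteps": len(step_statuses),
        # Assumption: skipped steps count toward completion.
        "completedSteps": counts["succeeded"] + counts["skipped"],
        "runningSteps": counts["running"],
        "failedSteps": counts["failed"],
        "currentNodes": sorted(n for n, s in step_statuses.items() if s == "running"),
    }
```
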
### WebSocket Streaming

```typescript
// Client subscribes to workflow progress
const ws = new WebSocket(`/api/v1/workflow-runs/${runId}/stream`);

ws.onmessage = (event) => {
  const progress: WorkflowProgress = JSON.parse(event.data);
  updateUI(progress);
};

// Server streams updates
class WorkflowStreamHandler {
  async stream(runId: UUID, connection: WebSocket): Promise<void> {
    const subscription = this.eventBus.subscribe(`workflow.${runId}.*`);

    for await (const event of subscription) {
      const progress = await this.buildProgress(runId);
      connection.send(JSON.stringify(progress));

      // Stop streaming once the run reaches a terminal state
      if (
        progress.status === 'completed' ||
        progress.status === 'failed' ||
        progress.status === 'cancelled'
      ) {
        break;
      }
    }

    connection.close();
  }
}
```

## References

- [Workflow Templates](templates.md)
- [Workflow Engine Module](../modules/workflow-engine.md)
- [Promotion Manager](../modules/promotion-manager.md)

405
docs/modules/release-orchestrator/workflow/promotion.md
Normal file
@@ -0,0 +1,405 @@
# Promotion State Machine

## Overview

Promotions move releases through environments (Dev -> Staging -> Production). The promotion state machine manages the lifecycle from request to completion.

## Promotion States

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           PROMOTION STATE MACHINE                           │
│                                                                             │
│                             ┌──────────────────┐                            │
│                             │ PENDING_APPROVAL │ (initial)                  │
│                             └────────┬─────────┘                            │
│                                      │                                      │
│                   ┌──────────────────┼──────────────────┐                   │
│                   │                  │                  │                   │
│                   ▼                  ▼                  ▼                   │
│          ┌────────────────┐ ┌────────────────┐ ┌────────────────┐           │
│          │    REJECTED    │ │  PENDING_GATE  │ │   CANCELLED    │           │
│          └────────────────┘ └────────┬───────┘ └────────────────┘           │
│                                      │                                      │
│                                      │ gates pass                           │
│                                      ▼                                      │
│                             ┌────────────────┐                              │
│                             │    APPROVED    │                              │
│                             └────────┬───────┘                              │
│                                      │                                      │
│                                      │ start deployment                     │
│                                      ▼                                      │
│                             ┌────────────────┐                              │
│                             │   DEPLOYING    │                              │
│                             └────────┬───────┘                              │
│                                      │                                      │
│                   ┌──────────────────┼──────────────────┐                   │
│                   │                  │                  │                   │
│                   ▼                  ▼                  ▼                   │
│          ┌────────────────┐ ┌────────────────┐ ┌────────────────┐           │
│          │     FAILED     │ │    DEPLOYED    │ │  ROLLED_BACK   │           │
│          └────────────────┘ └────────────────┘ └────────────────┘           │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## State Definitions

| State | Description |
|-------|-------------|
| `pending_approval` | Awaiting human approval (if required) |
| `pending_gate` | Awaiting automated gate evaluation |
| `approved` | All approvals and gates satisfied; ready for deployment |
| `rejected` | Blocked by approval rejection or gate failure |
| `deploying` | Deployment in progress |
| `deployed` | Successfully deployed to target environment |
| `failed` | Deployment failed (not rolled back) |
| `cancelled` | Cancelled by user before completion |
| `rolled_back` | Deployment rolled back to previous version |

## State Transitions

### Valid Transitions

```typescript
const validTransitions: Record<PromotionStatus, PromotionStatus[]> = {
  pending_approval: ["pending_gate", "approved", "rejected", "cancelled"],
  pending_gate:     ["approved", "rejected", "cancelled"],
  approved:         ["deploying", "cancelled"],
  deploying:        ["deployed", "failed", "rolled_back"],
  rejected:         [],               // terminal
  cancelled:        [],               // terminal
  deployed:         [],               // terminal (for this promotion)
  failed:           ["rolled_back"],  // can trigger rollback
  rolled_back:      []                // terminal
};
```

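A transition guard built on this map might look like the sketch below. The map is copied from the listing above; `can_transition` and `is_terminal` are illustrative helper names, not functions from the codebase.

```python
VALID_TRANSITIONS = {
    "pending_approval": ["pending_gate", "approved", "rejected", "cancelled"],
    "pending_gate": ["approved", "rejected", "cancelled"],
    "approved": ["deploying", "cancelled"],
    "deploying": ["deployed", "failed", "rolled_back"],
    "rejected": [],
    "cancelled": [],
    "deployed": [],
    "failed": ["rolled_back"],
    "rolled_back": [],
}


def can_transition(from_state: str, to_state: str) -> bool:
    """True if the map allows moving from `from_state` to `to_state`."""
    return to_state in VALID_TRANSITIONS.get(from_state, [])


def is_terminal(state: str) -> bool:
    """A state is terminal when it has no outgoing transitions."""
    return not VALID_TRANSITIONS[state]
```

Note that `failed` is not terminal under this map, because a rollback can still be triggered from it.
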
### Transition Events

```typescript
interface PromotionTransition {
  promotionId: UUID;
  fromState: PromotionStatus;
  toState: PromotionStatus;
  trigger: TransitionTrigger;
  triggeredBy: UUID;  // user or system
  timestamp: DateTime;
  details: object;
}

type TransitionTrigger =
  | "approval_granted"
  | "approval_rejected"
  | "gate_passed"
  | "gate_failed"
  | "deployment_started"
  | "deployment_completed"
  | "deployment_failed"
  | "rollback_triggered"
  | "rollback_completed"
  | "user_cancelled";
```

## Promotion Flow

### 1. Request Promotion

```typescript
async function requestPromotion(request: PromotionRequest): Promise<Promotion> {
  // Validate release exists and is ready
  const release = await getRelease(request.releaseId);
  if (release.status !== "ready" && release.status !== "deployed") {
    throw new Error("Release not ready for promotion");
  }

  // Validate target environment
  const environment = await getEnvironment(request.targetEnvironmentId);

  // Check freeze windows
  if (await isEnvironmentFrozen(environment.id)) {
    throw new Error("Environment is frozen");
  }

  // Determine initial state
  const requiresApproval = environment.requiredApprovals > 0;
  const initialStatus = requiresApproval ? "pending_approval" : "pending_gate";

  // Create promotion
  const promotion = await createPromotion({
    releaseId: request.releaseId,
    sourceEnvironmentId: release.currentEnvironmentId,
    targetEnvironmentId: environment.id,
    status: initialStatus,
    requestedBy: request.userId,
    requestReason: request.reason
  });

  // Emit event
  await emitEvent("promotion.requested", promotion);

  return promotion;
}
```

### 2. Approval Phase

```typescript
async function processApproval(
  promotionId: UUID,
  approverId: UUID,
  action: "approve" | "reject",
  comment?: string
): Promise<Promotion> {
  const promotion = await getPromotion(promotionId);
  const environment = await getEnvironment(promotion.targetEnvironmentId);

  // Validate approver can approve
  await validateApproverPermission(approverId, environment.id);

  // Check separation of duties
  if (environment.requireSeparationOfDuties) {
    if (approverId === promotion.requestedBy) {
      throw new Error("Separation of duties violation: requester cannot approve");
    }
  }

  // Record approval
  await recordApproval({
    promotionId,
    approverId,
    action,
    comment
  });

  if (action === "reject") {
    return await transitionState(promotion, "rejected", {
      trigger: "approval_rejected",
      triggeredBy: approverId,
      details: { reason: comment }
    });
  }

  // Check if all required approvals received
  const approvalCount = await countApprovals(promotionId);
  if (approvalCount >= environment.requiredApprovals) {
    return await transitionState(promotion, "pending_gate", {
      trigger: "approval_granted",
      triggeredBy: approverId
    });
  }

  return promotion;
}
```

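The decision logic of the approval phase can be isolated as a pure function, which makes the rules easy to test. This is a sketch under stated assumptions: `approval_outcome` is a hypothetical helper, and counting distinct approvers (so one person approving twice counts once) is an assumption that goes beyond the `countApprovals` call above.

```python
from typing import Iterable


def approval_outcome(requested_by: str, approver_id: str, action: str,
                     prior_approvers: Iterable[str], required_approvals: int,
                     require_separation_of_duties: bool) -> str:
    """Return the promotion status that follows one approval action."""
    if require_separation_of_duties and approver_id == requested_by:
        raise PermissionError("requester cannot approve their own promotion")
    if action == "reject":
        return "rejected"
    if action != "approve":
        raise ValueError(f"unknown action: {action!r}")
    # Assumption: approvals are counted per distinct approver.
    approvers = set(prior_approvers) | {approver_id}
    if len(approvers) >= required_approvals:
        return "pending_gate"
    return "pending_approval"
```
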
### 3. Gate Evaluation

```typescript
async function evaluateGates(promotionId: UUID): Promise<GateEvaluationResult> {
  const promotion = await getPromotion(promotionId);
  const environment = await getEnvironment(promotion.targetEnvironmentId);
  const release = await getRelease(promotion.releaseId);

  const gateResults: GateResult[] = [];

  // Security gate
  const securityResult = await evaluateSecurityGate(release, environment);
  gateResults.push(securityResult);

  // Custom policy gates
  for (const policy of environment.policies) {
    const policyResult = await evaluatePolicyGate(release, environment, policy);
    gateResults.push(policyResult);
  }

  // Aggregate results
  const allPassed = gateResults.every(g => g.passed);
  const blockingFailures = gateResults.filter(g => !g.passed && g.blocking);

  // Create decision record
  const decisionRecord = await createDecisionRecord({
    promotionId,
    gateResults,
    decision: allPassed ? "allow" : "block",
    decidedAt: new Date()
  });

  // Transition state
  if (allPassed) {
    await transitionState(promotion, "approved", {
      trigger: "gate_passed",
      triggeredBy: "system",
      details: { decisionRecordId: decisionRecord.id }
    });
  } else {
    await transitionState(promotion, "rejected", {
      trigger: "gate_failed",
      triggeredBy: "system",
      details: { blockingGates: blockingFailures }
    });
  }

  return { passed: allPassed, gateResults, decisionRecord };
}
```

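As written, the decision is driven by `allPassed`, so a failed gate with `blocking: false` still rejects the promotion while `blockingGates` is empty. The sketch below shows an alternative aggregation in which only blocking failures block and non-blocking failures are reported as warnings. This is an assumption about the intent of the `blocking` flag, not the documented behaviour; `aggregate_gates` and the `warnings` field are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class GateResult:
    gate_name: str
    passed: bool
    blocking: bool


def aggregate_gates(results: List[GateResult]) -> Dict[str, Any]:
    """Block only when at least one blocking gate has failed."""
    blocking = [g.gate_name for g in results if not g.passed and g.blocking]
    warnings = [g.gate_name for g in results if not g.passed and not g.blocking]
    return {
        "decision": "block" if blocking else "allow",
        "blockingGates": blocking,
        "warnings": warnings,
    }
```
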
### 4. Deployment Execution

```typescript
async function executeDeployment(promotionId: UUID): Promise<DeploymentJob> {
  const promotion = await getPromotion(promotionId);

  // Transition to deploying
  await transitionState(promotion, "deploying", {
    trigger: "deployment_started",
    triggeredBy: "system"
  });

  // Generate artifacts
  const artifacts = await generateArtifacts(promotion);

  // Create deployment job
  const job = await createDeploymentJob({
    promotionId,
    releaseId: promotion.releaseId,
    environmentId: promotion.targetEnvironmentId,
    artifacts
  });

  // Execute via workflow or direct
  const workflowRun = await startDeploymentWorkflow(job);

  // Update promotion with workflow reference
  await updatePromotion(promotionId, { workflowRunId: workflowRun.id });

  return job;
}
```

### 5. Completion Handling

```typescript
async function handleDeploymentCompletion(
  jobId: UUID,
  status: "succeeded" | "failed"
): Promise<Promotion> {
  const job = await getDeploymentJob(jobId);
  const promotion = await getPromotion(job.promotionId);

  if (status === "succeeded") {
    // Generate evidence packet
    const evidence = await generateEvidencePacket(promotion, job);

    // Update release environment state
    await updateReleaseEnvironmentState({
      releaseId: promotion.releaseId,
      environmentId: promotion.targetEnvironmentId,
      status: "deployed",
      promotionId: promotion.id,
      evidenceRef: evidence.id
    });

    return await transitionState(promotion, "deployed", {
      trigger: "deployment_completed",
      triggeredBy: "system",
      details: { evidencePacketId: evidence.id }
    });
  } else {
    return await transitionState(promotion, "failed", {
      trigger: "deployment_failed",
      triggeredBy: "system",
      details: { jobId, error: job.errorMessage }
    });
  }
}
```

## Decision Record

Every promotion produces a decision record:

```typescript
interface DecisionRecord {
  id: UUID;
  promotionId: UUID;
  decision: "allow" | "block";
  decidedAt: DateTime;

  // Inputs
  release: {
    id: UUID;
    name: string;
    components: Array<{ name: string; digest: string }>;
  };
  environment: {
    id: UUID;
    name: string;
  };

  // Gate results
  gateResults: Array<{
    gateName: string;
    gateType: string;
    passed: boolean;
    blocking: boolean;
    message: string;
    details: object;
    evaluatedAt: DateTime;
  }>;

  // Approvals
  approvals: Array<{
    approverId: UUID;
    approverName: string;
    action: "approved" | "rejected";
    comment?: string;
    timestamp: DateTime;
  }>;

  // Context
  requester: {
    id: UUID;
    name: string;
  };
  requestReason: string;

  // Signature
  contentHash: string;
  signature: string;
}
```

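The document does not say how `contentHash` is computed. One common approach, shown here purely as an assumption, is to hash a canonical JSON encoding of the record with the `contentHash` and `signature` fields excluded. SHA-256, the sorted-key encoding, and the `sha256:` prefix are all illustrative choices.

```python
import hashlib
import json
from typing import Any, Dict


def content_hash(record: Dict[str, Any]) -> str:
    """Hash a decision record independently of key order.

    Assumption: SHA-256 over compact JSON with sorted keys, excluding
    the contentHash and signature fields themselves.
    """
    body = {k: v for k, v in record.items() if k not in ("contentHash", "signature")}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the encoding is canonical, two services that serialize the same record with different key order still arrive at the same hash before signing.
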
## API Endpoints

```yaml
# Request promotion
POST /api/v1/promotions
Body: { releaseId, targetEnvironmentId, reason? }
Response: Promotion

# Approve/reject promotion
POST /api/v1/promotions/{id}/approve
POST /api/v1/promotions/{id}/reject
Body: { comment? }
Response: Promotion

# Cancel promotion
POST /api/v1/promotions/{id}/cancel
Response: Promotion

# Get decision record
GET /api/v1/promotions/{id}/decision
Response: DecisionRecord

# Preview gates (dry run)
POST /api/v1/promotions/preview-gates
Body: { releaseId, targetEnvironmentId }
Response: { wouldPass: boolean, gates: GateResult[] }
```

## References

- [Workflow Templates](templates.md)
- [Workflow Execution](execution.md)
- [Evidence Schema](../appendices/evidence-schema.md)

327
docs/modules/release-orchestrator/workflow/templates.md
Normal file
@@ -0,0 +1,327 @@
# Workflow Template Structure

## Overview

Workflow templates define the DAG (Directed Acyclic Graph) of steps to execute during deployment, promotion, and other automated processes.

## Template Structure
|
||||
|
||||
```typescript
|
||||
interface WorkflowTemplate {
|
||||
id: UUID;
|
||||
tenantId: UUID;
|
||||
name: string; // "standard-deploy"
|
||||
displayName: string; // "Standard Deployment"
|
||||
description: string;
|
||||
version: number; // Auto-incremented
|
||||
|
||||
// DAG structure
|
||||
nodes: StepNode[];
|
||||
edges: StepEdge[];
|
||||
|
||||
// I/O definitions
|
||||
inputs: InputDefinition[];
|
||||
outputs: OutputDefinition[];
|
||||
|
||||
// Metadata
|
||||
tags: string[];
|
||||
isBuiltin: boolean;
|
||||
createdAt: DateTime;
|
||||
createdBy: UUID;
|
||||
}
|
||||
```

## Node Types

### Step Node

```typescript
interface StepNode {
  id: string;                    // Unique within template (e.g., "deploy-api")
  type: string;                  // Step type from registry
  name: string;                  // Display name
  config: Record<string, any>;   // Step-specific configuration
  inputs: InputBinding[];        // Input value bindings
  outputs: OutputBinding[];      // Output declarations
  position: { x: number; y: number };  // UI position

  // Execution settings
  timeout: number;               // Seconds (default from step type)
  retryPolicy: RetryPolicy;
  onFailure: FailureAction;
  condition?: string;            // JS expression for conditional execution

  // Documentation
  description?: string;
  documentation?: string;
}

// "goto:<nodeId>" jumps to the named node, e.g. "goto:notify-failure"
type FailureAction = "fail" | "continue" | "rollback" | `goto:${string}`;

interface RetryPolicy {
  maxRetries: number;
  backoffType: "fixed" | "exponential";
  backoffSeconds: number;
  retryableErrors: string[];
}
```
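
How `backoffType` and `backoffSeconds` combine into a wait time is not spelled out above. The sketch below shows one plausible reading and is not the engine's confirmed behaviour. It assumes that exponential backoff doubles the base delay on each retry, and that an empty `retryableErrors` list means every error is retryable. The function name `retryDelaySeconds` and the `errorCode` parameter are hypothetical.

```typescript
// Mirrors the RetryPolicy interface above.
interface RetryPolicy {
  maxRetries: number;
  backoffType: "fixed" | "exponential";
  backoffSeconds: number;
  retryableErrors: string[];
}

// Delay in seconds before retry number `attempt` (1-based),
// or null when the step should not be retried.
function retryDelaySeconds(
  policy: RetryPolicy,
  attempt: number,
  errorCode: string
): number | null {
  if (attempt < 1 || attempt > policy.maxRetries) return null;

  // Assumption: an empty list means "retry on any error".
  const retryable =
    policy.retryableErrors.length === 0 ||
    policy.retryableErrors.indexOf(errorCode) !== -1;
  if (!retryable) return null;

  // Assumption: exponential backoff doubles the base delay on every retry.
  return policy.backoffType === "fixed"
    ? policy.backoffSeconds
    : policy.backoffSeconds * Math.pow(2, attempt - 1);
}
```

With a policy of `maxRetries: 2`, `backoffType: "exponential"`, and `backoffSeconds: 30`, this reading gives waits of 30 and 60 seconds, after which the step's `onFailure` action would apply.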

### Input Bindings

```typescript
interface InputBinding {
  name: string;                  // Input parameter name
  source: InputSource;
}

type InputSource =
  | { type: "literal"; value: any }
  | { type: "context"; path: string }      // e.g., "release.name"
  | { type: "output"; nodeId: string; outputName: string }
  | { type: "secret"; secretName: string }
  | { type: "expression"; expression: string };  // JS expression
```
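
The sketch below shows how an `InputSource` might be resolved to a concrete value before a step runs. It is an illustration under stated assumptions, not the engine's implementation. The `ResolveEnv` shape, the dotted-path lookup, and the function names `getPath` and `resolveInput` are hypothetical, and `expression` sources are deliberately left unhandled because they need a sandboxed evaluator.

```typescript
type InputSource =
  | { type: "literal"; value: unknown }
  | { type: "context"; path: string }
  | { type: "output"; nodeId: string; outputName: string }
  | { type: "secret"; secretName: string }
  | { type: "expression"; expression: string };

// Hypothetical runtime state available to the resolver.
interface ResolveEnv {
  context: Record<string, unknown>;
  outputs: Record<string, Record<string, unknown>>; // nodeId -> outputName -> value
  secrets: Record<string, string>;
}

// Walks a dotted path such as "release.name".
// Returns undefined if any segment is missing.
function getPath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function resolveInput(source: InputSource, env: ResolveEnv): unknown {
  switch (source.type) {
    case "literal":
      return source.value;
    case "context":
      return getPath(env.context, source.path);
    case "output": {
      const nodeOutputs = env.outputs[source.nodeId];
      return nodeOutputs === undefined ? undefined : nodeOutputs[source.outputName];
    }
    case "secret":
      return env.secrets[source.secretName];
    case "expression":
      // Evaluating arbitrary JS safely is out of scope for this sketch.
      throw new Error("expression sources are not handled in this sketch");
  }
}
```

Returning `undefined` for a missing path or output leaves it to the caller to decide whether an unresolved input is an error, which depends on whether the input is required.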

### Edge Types

```typescript
interface StepEdge {
  id: string;
  from: string;                  // Source node ID
  to: string;                    // Target node ID
  condition?: string;            // Optional condition expression
  label?: string;                // Display label for conditional edges
}
```
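
The sketch below illustrates how conditional edges might route execution once a step finishes. It rests on several assumptions that the interface above does not confirm. It handles only conditions of the form `status === '<value>'`, the form used in the Standard Deployment example. It treats edges without a condition as followed only when the source step succeeded. The `StepStatus` values and the function name `nextNodes` are hypothetical.

```typescript
interface StepEdge {
  id: string;
  from: string;
  to: string;
  condition?: string;
  label?: string;
}

// Assumed terminal step statuses; only "failed" appears in the example conditions.
type StepStatus = "succeeded" | "failed";

// Matches only conditions of the form: status === '<value>'
const STATUS_CONDITION = /^status\s*===\s*'([^']+)'$/;

// Returns the IDs of the nodes to schedule after `nodeId` finished with `status`.
function nextNodes(edges: StepEdge[], nodeId: string, status: StepStatus): string[] {
  const targets: string[] = [];
  for (const edge of edges) {
    if (edge.from !== nodeId) continue;

    if (edge.condition === undefined) {
      // Assumption: unconditional edges are the success path.
      if (status === "succeeded") targets.push(edge.to);
      continue;
    }

    const match = STATUS_CONDITION.exec(edge.condition.trim());
    if (match === null) {
      throw new Error(`unsupported condition on edge ${edge.id}: ${edge.condition}`);
    }
    if (match[1] === status) targets.push(edge.to);
  }
  return targets;
}
```

A full engine would evaluate arbitrary condition expressions in a sandbox; restricting the sketch to a single pattern keeps it safe to run as written.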

## Built-in Step Types

### Control Steps

| Type | Description | Config |
|------|-------------|--------|
| `approval` | Wait for human approval | `promotionId` |
| `wait` | Wait for specified duration | `durationSeconds` |
| `condition` | Branch based on condition | `expression` |
| `parallel` | Execute children in parallel | `maxConcurrency` |

### Gate Steps

| Type | Description | Config |
|------|-------------|--------|
| `security-gate` | Evaluate security policy | `blockOnCritical`, `blockOnHigh` |
| `custom-gate` | Custom OPA policy evaluation | `policyName` |
| `freeze-check` | Check freeze windows | - |
| `approval-check` | Check approval status | `requiredCount` |

### Deploy Steps

| Type | Description | Config |
|------|-------------|--------|
| `deploy-docker` | Deploy single container | `containerName`, `strategy` |
| `deploy-compose` | Deploy Docker Compose stack | `composePath`, `strategy` |
| `deploy-ecs` | Deploy to AWS ECS | `cluster`, `service` |
| `deploy-nomad` | Deploy to HashiCorp Nomad | `jobName` |

### Verification Steps

| Type | Description | Config |
|------|-------------|--------|
| `health-check` | HTTP/TCP health check | `type`, `path`, `expectedStatus` |
| `smoke-test` | Run smoke test suite | `testSuite`, `timeout` |
| `verify-digest` | Verify deployed digest | `expectedDigest` |

### Integration Steps

| Type | Description | Config |
|------|-------------|--------|
| `webhook` | Call external webhook | `url`, `method`, `headers` |
| `trigger-ci` | Trigger CI pipeline | `integrationId`, `pipelineId` |
| `wait-ci` | Wait for CI pipeline | `runId`, `timeout` |

### Notification Steps

| Type | Description | Config |
|------|-------------|--------|
| `notify` | Send notification | `channel`, `template` |
| `slack` | Send Slack message | `channel`, `message` |
| `email` | Send email | `recipients`, `template` |

### Recovery Steps

| Type | Description | Config |
|------|-------------|--------|
| `rollback` | Rollback deployment | `strategy`, `targetReleaseId` |
| `execute-script` | Run recovery script | `scriptType`, `scriptRef` |

## Template Example: Standard Deployment

```json
{
  "id": "template-standard-deploy",
  "name": "standard-deploy",
  "displayName": "Standard Deployment",
  "version": 1,
  "inputs": [
    { "name": "releaseId", "type": "uuid", "required": true },
    { "name": "environmentId", "type": "uuid", "required": true },
    { "name": "promotionId", "type": "uuid", "required": true }
  ],
  "nodes": [
    {
      "id": "approval",
      "type": "approval",
      "name": "Approval Gate",
      "config": {},
      "inputs": [
        { "name": "promotionId", "source": { "type": "context", "path": "promotionId" } }
      ],
      "position": { "x": 100, "y": 100 }
    },
    {
      "id": "security-gate",
      "type": "security-gate",
      "name": "Security Verification",
      "config": {
        "blockOnCritical": true,
        "blockOnHigh": true
      },
      "inputs": [
        { "name": "releaseId", "source": { "type": "context", "path": "releaseId" } }
      ],
      "position": { "x": 100, "y": 200 }
    },
    {
      "id": "pre-deploy-hook",
      "type": "execute-script",
      "name": "Pre-Deploy Hook",
      "config": {
        "scriptType": "csharp",
        "scriptRef": "hooks/pre-deploy.csx"
      },
      "inputs": [
        { "name": "release", "source": { "type": "context", "path": "release" } },
        { "name": "environment", "source": { "type": "context", "path": "environment" } }
      ],
      "timeout": 300,
      "onFailure": "fail",
      "position": { "x": 100, "y": 300 }
    },
    {
      "id": "deploy-targets",
      "type": "deploy-compose",
      "name": "Deploy to Targets",
      "config": {
        "strategy": "rolling",
        "parallelism": 2
      },
      "inputs": [
        { "name": "releaseId", "source": { "type": "context", "path": "releaseId" } },
        { "name": "environmentId", "source": { "type": "context", "path": "environmentId" } }
      ],
      "timeout": 600,
      "retryPolicy": {
        "maxRetries": 2,
        "backoffType": "exponential",
        "backoffSeconds": 30
      },
      "onFailure": "rollback",
      "position": { "x": 100, "y": 400 }
    },
    {
      "id": "health-check",
      "type": "health-check",
      "name": "Health Verification",
      "config": {
        "type": "http",
        "path": "/health",
        "expectedStatus": 200,
        "timeout": 30,
        "retries": 5
      },
      "inputs": [
        { "name": "targets", "source": { "type": "output", "nodeId": "deploy-targets", "outputName": "deployedTargets" } }
      ],
      "onFailure": "rollback",
      "position": { "x": 100, "y": 500 }
    },
    {
      "id": "post-deploy-hook",
      "type": "execute-script",
      "name": "Post-Deploy Hook",
      "config": {
        "scriptType": "bash",
        "inline": "echo 'Deployment complete'"
      },
      "timeout": 300,
      "onFailure": "continue",
      "position": { "x": 100, "y": 600 }
    },
    {
      "id": "notify-success",
      "type": "notify",
      "name": "Success Notification",
      "config": {
        "channel": "slack",
        "template": "deployment-success"
      },
      "inputs": [
        { "name": "release", "source": { "type": "context", "path": "release" } },
        { "name": "environment", "source": { "type": "context", "path": "environment" } }
      ],
      "onFailure": "continue",
      "position": { "x": 100, "y": 700 }
    },
    {
      "id": "rollback-handler",
      "type": "rollback",
      "name": "Rollback Handler",
      "config": {
        "strategy": "to-previous"
      },
      "inputs": [
        { "name": "deploymentJobId", "source": { "type": "output", "nodeId": "deploy-targets", "outputName": "jobId" } }
      ],
      "position": { "x": 300, "y": 450 }
    },
    {
      "id": "notify-failure",
      "type": "notify",
      "name": "Failure Notification",
      "config": {
        "channel": "slack",
        "template": "deployment-failure"
      },
      "onFailure": "continue",
      "position": { "x": 300, "y": 550 }
    }
  ],
  "edges": [
    { "id": "e1", "from": "approval", "to": "security-gate" },
    { "id": "e2", "from": "security-gate", "to": "pre-deploy-hook" },
    { "id": "e3", "from": "pre-deploy-hook", "to": "deploy-targets" },
    { "id": "e4", "from": "deploy-targets", "to": "health-check" },
    { "id": "e5", "from": "health-check", "to": "post-deploy-hook" },
    { "id": "e6", "from": "post-deploy-hook", "to": "notify-success" },
    { "id": "e7", "from": "deploy-targets", "to": "rollback-handler", "condition": "status === 'failed'" },
    { "id": "e8", "from": "health-check", "to": "rollback-handler", "condition": "status === 'failed'" },
    { "id": "e9", "from": "rollback-handler", "to": "notify-failure" }
  ]
}
```

## Template Validation

Templates are validated for:

1. **Structural validity**: Valid JSON/YAML, required fields present
2. **DAG validity**: No cycles, all edges reference valid nodes
3. **Type validity**: All step types exist in registry
4. **Schema validity**: Step configs match type schemas
5. **Input validity**: All required inputs are bindable
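
The DAG validity check in the list above can be sketched with Kahn's algorithm, which detects cycles by repeatedly removing nodes that have no remaining incoming edges. This is an illustrative sketch, not the validator's actual code. The `NodeRef` and `EdgeRef` shapes are reduced versions of `StepNode` and `StepEdge`, and the function name `validateDag` and its error messages are hypothetical.

```typescript
interface NodeRef { id: string }
interface EdgeRef { id: string; from: string; to: string }

// Returns a list of problems; an empty list means the graph is a valid DAG.
function validateDag(nodes: NodeRef[], edges: EdgeRef[]): string[] {
  const errors: string[] = [];
  const inDegree = new Map<string, number>();
  const successors = new Map<string, string[]>();

  for (const node of nodes) {
    if (inDegree.has(node.id)) errors.push(`duplicate node id: ${node.id}`);
    inDegree.set(node.id, 0);
    successors.set(node.id, []);
  }

  for (const edge of edges) {
    if (!inDegree.has(edge.from) || !inDegree.has(edge.to)) {
      errors.push(`edge ${edge.id} references an unknown node`);
      continue;
    }
    successors.get(edge.from)!.push(edge.to);
    inDegree.set(edge.to, inDegree.get(edge.to)! + 1);
  }

  // Kahn's algorithm: start from nodes with no incoming edges.
  const ready = Array.from(inDegree.keys()).filter((id) => inDegree.get(id) === 0);
  let visited = 0;
  while (ready.length > 0) {
    const id = ready.pop()!;
    visited++;
    for (const next of successors.get(id)!) {
      const remaining = inDegree.get(next)! - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // Any node never reached still has an incoming edge, so it sits on a cycle
  // or downstream of one.
  if (visited < inDegree.size) errors.push("cycle detected");
  return errors;
}
```

Collecting all problems rather than stopping at the first lets a template author fix several issues in one pass.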

## References

- [Workflow Engine](../modules/workflow-engine.md)
- [Execution State Machine](execution.md)
- [Step Registry](../modules/workflow-engine.md#module-step-registry)