Task Runner — Simulation & Failure Policy Notes
Status: Draft (2025-11-04) — execution wiring + CLI simulate command landed; docs pending final polish
The Task Runner planning layer now materialises additional runtime metadata to unblock execution and simulation flows:
- Execution graph builder – converts `TaskPackPlan` steps (including `map` and `parallel`) into a deterministic graph with preserved enablement flags and per-step metadata (`maxParallel`, `continueOnError`, parameters, approval IDs).
- Simulation engine – walks the execution graph and classifies steps as `pending`, `skipped`, `requires-approval`, or `requires-policy`, producing a deterministic preview for CLI/UI consumers while surfacing declared outputs.
- Failure policy – pack-level `spec.failure.retries` is normalised into a `TaskPackPlanFailurePolicy` (default: `maxAttempts = 1`, `backoffSeconds = 0`). The new step state machine uses this policy to schedule retries and to determine when a run must abort.
- Simulation API + Worker – `POST /v1/task-runner/simulations` returns the deterministic preview; `GET /v1/task-runner/runs/{id}` exposes the persisted retry windows written by the worker as it honours `maxParallel`, `continueOnError`, and retry windows during execution.
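To make the simulation walk concrete, here is a minimal Python sketch of how steps might be classified into the four statuses listed above. It is illustrative only: the `Step` shape, its field names (`enabled`, `approval_id`, `policy_gate`, `children`), and the precedence between the checks are assumptions, not the actual `PackRunSimulationEngine` model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    # Hypothetical step shape; field names are illustrative, not the real plan model.
    id: str
    enabled: bool = True
    approval_id: Optional[str] = None
    policy_gate: Optional[str] = None
    children: list[Step] = field(default_factory=list)


def classify(step: Step) -> str:
    """Return a simulation status for one step (the precedence is an assumption)."""
    if not step.enabled:
        return "skipped"
    if step.approval_id is not None:
        return "requires-approval"
    if step.policy_gate is not None:
        return "requires-policy"
    return "pending"


def simulate(steps: list[Step]) -> list[tuple[str, str]]:
    """Walk steps depth-first in declaration order so the preview is deterministic."""
    preview: list[tuple[str, str]] = []
    for step in steps:
        preview.append((step.id, classify(step)))
        preview.extend(simulate(step.children))
    return preview
```

Because the walk never executes anything and always visits steps in declaration order, running it twice over the same plan yields the same preview. Whether a disabled parent should also mark its children as skipped is left open in this sketch.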
Current behaviour
- Map steps expand into child iterations (`stepId[index]::templateId`) with per-item parameters preserved for runtime reference.
- Parallel blocks honour `maxParallel` (defaults to unlimited) and the worker executes children accordingly, short-circuiting when `continueOnError` is false.
- Simulation output mirrors approvals/policy gates, allowing the WebService/CLI to show which actions must occur before execution resumes.
- File-backed state store persists `PackRunState` snapshots (nextAttemptAt, attempts, reasons) so orchestration clients and CLI can resume runs deterministically even in air-gapped environments.
- Step state machine transitions:
  - `pending → running → succeeded`
  - `running → failed` (abort) once attempts ≥ `maxAttempts`
  - `running → pending` with scheduled `nextAttemptAt` when retries remain
  - `pending → skipped` for disabled steps (e.g., `when` expressions).
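The step transitions above can be sketched as a small Python state machine. This is a hedged illustration rather than the `PackRunStepStateMachine` implementation: the `FailurePolicy` and `StepState` types and the `start`, `succeed`, and `fail` helpers are hypothetical names, attempts are assumed to be counted when a step starts running, and the backoff is assumed to be a constant delay of `backoffSeconds`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class FailurePolicy:
    # Defaults mirror the documented policy defaults (maxAttempts = 1, backoffSeconds = 0).
    max_attempts: int = 1
    backoff_seconds: int = 0


@dataclass
class StepState:
    status: str = "pending"
    attempts: int = 0
    next_attempt_at: Optional[datetime] = None


def start(state: StepState, enabled: bool = True) -> StepState:
    """pending -> running, or pending -> skipped when the step is disabled."""
    if state.status != "pending":
        raise ValueError(f"cannot start a step in state {state.status!r}")
    if not enabled:
        state.status = "skipped"
        return state
    state.status = "running"
    state.attempts += 1  # assumption: an attempt is counted when it starts
    state.next_attempt_at = None
    return state


def succeed(state: StepState) -> StepState:
    """running -> succeeded."""
    if state.status != "running":
        raise ValueError(f"cannot succeed a step in state {state.status!r}")
    state.status = "succeeded"
    return state


def fail(state: StepState, policy: FailurePolicy, now: datetime) -> StepState:
    """running -> failed once attempts are exhausted, else running -> pending with a retry window."""
    if state.status != "running":
        raise ValueError(f"cannot fail a step in state {state.status!r}")
    if state.attempts >= policy.max_attempts:
        state.status = "failed"  # the run must abort
    else:
        state.status = "pending"
        state.next_attempt_at = now + timedelta(seconds=policy.backoff_seconds)
    return state
```

With the default policy a single failure aborts immediately, since one attempt already meets `maxAttempts = 1`. Passing `now` in explicitly keeps the sketch deterministic, which is the same property the persisted `nextAttemptAt` value is meant to give resumed runs.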
CLI usage
Run the simulation without mutating state:
stella task-runner simulate \
--manifest ./packs/sample-pack.yaml \
--inputs ./inputs.json \
--format table
Use `--format json` (or `--output path.json`) to emit the raw payload produced by `POST /v1/task-runner/simulations`.
Follow-up gaps
- Fold the CLI command into the official reference/quickstart guides and capture exit-code conventions.
References:
- src/TaskRunner/StellaOps.TaskRunner/StellaOps.TaskRunner.Core/Execution/PackRunExecutionGraphBuilder.cs
- src/TaskRunner/StellaOps.TaskRunner/StellaOps.TaskRunner.Core/Execution/Simulation/PackRunSimulationEngine.cs
- src/TaskRunner/StellaOps.TaskRunner/StellaOps.TaskRunner.Core/Execution/PackRunStepStateMachine.cs
- src/TaskRunner/StellaOps.TaskRunner/StellaOps.TaskRunner.Infrastructure/Execution/FilePackRunStateStore.cs
- src/TaskRunner/StellaOps.TaskRunner/StellaOps.TaskRunner.Worker/Services/PackRunWorkerService.cs
- src/TaskRunner/StellaOps.TaskRunner/StellaOps.TaskRunner.WebService/Program.cs