Add StellaOps.Workflow engine: 14 libraries, WebService, 8 test projects

Extract product-agnostic workflow engine from Ablera.Serdica.Workflow into
standalone StellaOps.Workflow.* libraries targeting net10.0.

Libraries (14):
- Contracts, Abstractions (compiler, decompiler, expression runtime)
- Engine (execution, signaling, scheduling, projections, hosted services)
- ElkSharp (generic graph layout algorithm)
- Renderer.ElkSharp, Renderer.ElkJs, Renderer.Msagl, Renderer.Svg
- Signaling.Redis, Signaling.OracleAq
- DataStore.MongoDB, DataStore.PostgreSQL, DataStore.Oracle

WebService: ASP.NET Core Minimal API with 22 endpoints

Tests (8 projects, 109 tests pass):
- Engine.Tests (105 pass), WebService.Tests (4 E2E pass)
- Renderer.Tests, DataStore.MongoDB/Oracle/PostgreSQL.Tests
- Signaling.Redis.Tests, IntegrationTests.Shared

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:14:44 +02:00
parent e56f9a114a
commit f5b5f24d95
422 changed files with 85428 additions and 0 deletions
--- a/docs/workflow/engine/03-canonical-execution-model.md
+++ b/docs/workflow/engine/03-canonical-execution-model.md
@@ -0,0 +1,377 @@
+# 03. Canonical Execution Model
+
+## 1. Why The Engine Executes Canonical Definitions
+
+The workflow corpus is now fully declarative and canonicalizable.
+
+That changes the best runtime strategy:
+
+- authored C# remains the source of truth
+- the canonical definition becomes the runtime execution contract
+- the engine interprets canonical definitions directly
+
+This gives the platform:
+
+- deterministic runtime behavior
+- shared semantics between export/import and execution
+- less runtime coupling to workflow-specific CLR delegates
+- a clean separation between authoring and execution
+
+## 2. Definition Lifecycle
+
+### 2.1 Authoring
+
+Workflows are authored in C# through the declarative DSL.
+
+### 2.2 Normalization
+
+At service startup, each workflow registration is normalized into:
+
+1. workflow registration metadata
+2. canonical workflow definition
+3. required module set
+4. function usage metadata
+
+### 2.3 Validation
+
+The runtime should validate canonical definitions before accepting them for execution.
+
+Recommended startup modes:
+
+- `Strict`
+  Startup fails if a definition is invalid.
+- `Warn`
+  Startup succeeds, but invalid definitions are marked unavailable.
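+
+The two startup modes can be sketched as follows. This is a minimal illustration, not the engine's API: `StartupMode`, `load_definitions`, and the toy `has_steps` rule are hypothetical names, and definitions are modeled as plain dictionaries.
+
+```python
+from enum import Enum
+
+
+class StartupMode(Enum):
+    STRICT = "Strict"
+    WARN = "Warn"
+
+
+def load_definitions(definitions, validate, mode):
+    """Return (available, unavailable) workflow names.
+
+    `definitions` maps workflow name -> definition; `validate` returns a
+    list of error strings (empty when the definition is valid).
+    """
+    available, unavailable = [], []
+    for name, definition in definitions.items():
+        errors = validate(definition)
+        if not errors:
+            available.append(name)
+        elif mode is StartupMode.STRICT:
+            # Strict: a single invalid definition aborts startup.
+            raise ValueError(f"invalid workflow definition {name!r}: {errors}")
+        else:
+            # Warn: startup continues, the definition is marked unavailable.
+            unavailable.append(name)
+    return available, unavailable
+
+
+def has_steps(definition):
+    # Toy rule for the example only: a definition needs at least one step.
+    return [] if definition.get("steps") else ["no steps"]
+```
+
+In `Warn` mode the caller would be expected to reject start requests for any name in the `unavailable` list.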
+
+### 2.4 Runtime Cache
+
+The engine should cache canonical runtime definitions in memory by:
+
+- workflow name
+- workflow version
+
+This cache is immutable after startup in v1.
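+
+A cache with these properties might look like the sketch below. The dictionary-shaped definitions and the `build_definition_cache` helper are assumptions made for illustration; rejecting duplicate keys is also an assumption, not a rule stated in this document.
+
+```python
+from types import MappingProxyType
+
+
+def build_definition_cache(definitions):
+    """Index runtime definitions by (workflow name, workflow version)."""
+    cache = {}
+    for definition in definitions:
+        key = (definition["name"], definition["version"])
+        if key in cache:
+            # Assumed policy: two definitions must not share name and version.
+            raise ValueError(f"duplicate workflow definition {key}")
+        cache[key] = definition
+    # Read-only view: lookups work, item assignment raises TypeError.
+    return MappingProxyType(cache)
+```
+
+Returning a read-only view makes the "immutable after startup" rule hard to break by accident.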
+
+## 3. Canonical Runtime Definition Shape
+
+The runtime definition should be treated as a compiled, execution-ready representation of the canonical contracts, not a raw JSON document.
+
+The runtime model should contain:
+
+- definition identity
+- display metadata
+- required modules
+- step graph
+- task declarations
+- expression trees
+- transport declarations
+- subworkflow declarations
+- continue-with declarations
+
+## 4. Execution Context Model
+
+The interpreter should run every step against a single canonical execution context.
+
+Recommended execution context fields:
+
+- `WorkflowName`
+- `WorkflowVersion`
+- `WorkflowInstanceId`
+- `BusinessReference`
+- `State`
+- `StartPayload`
+- `CompletionPayload`
+- `CurrentTask`
+- `CurrentSignal`
+- `FunctionRuntime`
+- `TransportDispatcher`
+- `RuntimeMetadata`
+
+`RuntimeMetadata` should hold:
+
+- node id
+- current signal id
+- snapshot version
+- waiting token
+- execution started at
+
+## 5. Core Runtime State Model
+
+The runtime must distinguish between:
+
+- business state
+- engine state
+
+### 5.1 Business State
+
+Business state is what the workflow author reasons about.
+
+Examples:
+
+- `srPolicyId`
+- `policySubstatus`
+- customer lookup state
+- payload shaping outputs
+- subworkflow results
+
+### 5.2 Engine State
+
+Engine state is what the runtime needs to resume correctly.
+
+Examples:
+
+- current workflow status
+- current wait type
+- current wait token
+- active task identity
+- resume pointer
+- subworkflow frame stack
+- outstanding timer descriptors
+- last processed signal id
+
+Business state must remain visible in runtime inspection.
+Engine state must remain safe and deterministic for resume.
+
+## 6. Run-To-Wait Execution Model
+
+The engine uses a run-to-wait interpreter.
+
+This means:
+
+1. load snapshot
+2. execute sequentially
+3. stop when a durable wait boundary is reached
+4. persist resulting snapshot
+5. release instance
+
+Wait boundaries are:
+
+- human task activation
+- scheduled timer
+- external signal wait
+- child workflow wait
+- terminal completion
+
+This model is essential for:
+
+- multi-instance safety
+- restart recovery
+- no sticky ownership
+- no in-memory correctness assumptions
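+
+The core of the loop (steps 2 and 3) can be sketched as below; loading, persisting, and releasing the instance are left to the caller. The snapshot layout, `run_to_wait`, and `demo_step` are hypothetical and reduce the step graph to a flat list for brevity.
+
+```python
+def run_to_wait(snapshot, steps, execute_step):
+    """Execute steps from the snapshot's resume index until a wait boundary.
+
+    `execute_step(step, state)` returns None to continue, or a wait kind
+    string (for example "task" or "timer") when a wait boundary is reached.
+    """
+    state = dict(snapshot["state"])  # work on a copy, never in place
+    index = snapshot["next_step"]
+    wait = None
+    while index < len(steps) and wait is None:
+        wait = execute_step(steps[index], state)
+        index += 1
+    return {
+        "state": state,
+        "next_step": index,
+        "status": f"waiting:{wait}" if wait else "completed",
+        "version": snapshot["version"] + 1,
+    }
+
+
+def demo_step(step, state):
+    kind, arg = step
+    if kind == "set":
+        state[arg[0]] = arg[1]
+        return None
+    return kind  # in this toy model, every other step kind is a wait boundary
+```
+
+Because everything needed to continue lives in the returned snapshot, any worker can pick up the instance after the wait is satisfied.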
+
+## 7. Step Semantics
+
+### 7.1 State Assignment
+
+State assignment is immediate and local to the current execution transaction.
+
+The engine:
+
+- evaluates the assignment expression
+- writes to the business state dictionary
+- keeps changes in memory until the next durable checkpoint
+
+### 7.2 Business Reference Assignment
+
+Business reference assignment updates the canonical business reference attached to:
+
+- the runtime snapshot
+- new tasks
+- instance projection updates
+
+Business reference changes must be applied transactionally with other execution results.
+
+### 7.3 Human Task Activation
+
+A human task activation step always ends the current execution slice: it is a wait boundary, distinct from terminal completion of the workflow.
+
+The interpreter does not continue past it in the same execution.
+
+The result of task activation is:
+
+- one active task projection
+- updated instance status
+- updated runtime snapshot
+- optional runtime metadata for the active task
+
+### 7.4 Transport Call
+
+Transport calls are synchronous from the perspective of a single execution slice.
+
+The engine:
+
+- evaluates payload expressions
+- dispatches through the correct transport adapter
+- captures result payload
+- stores result under the result key when present
+- chooses the success, failure, or timeout branch
+
+No engine-specific callback registration should be required for normal synchronous transport calls.
+
+### 7.5 Conditional Branch
+
+Conditions evaluate against the current execution context.
+
+Only one branch is executed.
+
+The branch path must be reproducible in the resume pointer model.
+
+### 7.6 Repeat
+
+Repeat executes logically as:
+
+- evaluate collection or repeat source
+- for each iteration:
+  - bind iteration context
+  - execute nested sequence
+
+If an iteration hits a wait boundary, the engine snapshot must preserve:
+
+- repeat step id
+- iteration index
+- remaining resume location inside the iteration body
+
+### 7.7 Subworkflow Invocation
+
+Subworkflow invocation is a wait boundary unless the child completes inline before producing a wait.
+
+The parent snapshot must record:
+
+- child workflow identity
+- child workflow version
+- parent business reference
+- parent resume pointer
+- target result key
+- parent workflow state needed for resume
+
+### 7.8 Continue-With
+
+Continue-with creates a new workflow start request as an engine side effect.
+
+It is not a wait boundary for the current instance unless explicitly modeled that way by the workflow.
+
+## 8. Resume Model
+
+### 8.1 Resume Pointer
+
+The engine must persist a deterministic resume pointer.
+
+It should identify:
+
+- entry point kind
+- task name if resuming from task completion
+- branch path
+- next step index
+- repeat iteration where applicable
+
+The existing declarative resume model is the right conceptual baseline, but the engine should persist it inside the canonical runtime snapshot rather than inside a CLR-only execution flow.
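+
+One possible shape for a serializable resume pointer is sketched below. The field names mirror the list above, but the `ResumePointer` class, its JSON encoding, and the example values in the comments are assumptions, not the engine's actual contract.
+
+```python
+import json
+from dataclasses import asdict, dataclass
+from typing import Optional, Tuple
+
+
+@dataclass(frozen=True)
+class ResumePointer:
+    entry_point_kind: str          # e.g. "start" or "task_completion"
+    task_name: Optional[str]       # set only when resuming from a task
+    branch_path: Tuple[str, ...]   # branch ids from the root downwards
+    next_step_index: int
+    repeat_iteration: Optional[int] = None
+
+    def to_json(self) -> str:
+        # sort_keys keeps the serialized form stable across runs.
+        return json.dumps(asdict(self), sort_keys=True)
+
+    @staticmethod
+    def from_json(text: str) -> "ResumePointer":
+        data = json.loads(text)
+        data["branch_path"] = tuple(data["branch_path"])  # JSON has no tuples
+        return ResumePointer(**data)
+```
+
+A frozen dataclass with a stable serialized form means two snapshots that point at the same location compare equal, both in memory and byte for byte on disk.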
+
+### 8.2 Waiting Token
+
+Every durable wait must have a waiting token.
+
+The waiting token is how the engine prevents stale resumes.
+
+When a signal arrives and its waiting token does not match the token in the snapshot, the signal is stale and must be ignored safely.
+
+This is the primary guard for:
+
+- canceled timers
+- duplicate wake-ups
+- late child completions
+- redelivered signals
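+
+The stale-signal guard reduces to a token comparison, sketched here. The dictionary layout and the `new_waiting_token` and `should_resume` helpers are hypothetical.
+
+```python
+import secrets
+
+
+def new_waiting_token() -> str:
+    # A fresh random token is issued for every durable wait.
+    return secrets.token_hex(16)
+
+
+def should_resume(snapshot: dict, signal: dict) -> bool:
+    """Return True only when the signal targets the wait currently recorded."""
+    expected = snapshot.get("waiting_token")
+    if expected is None:
+        return False  # the instance is not waiting on anything
+    return signal.get("waiting_token") == expected
+```
+
+A canceled timer or redelivered signal carries the token of a wait that has since been replaced, so `should_resume` returns `False` and the signal can be acknowledged without touching the instance.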
+
+### 8.3 Version
+
+Every successful execution commit must increment the snapshot version.
+
+Signals may carry the expected version that created the wait.
+
+This allows the engine to detect stale work before any mutation.
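+
+A version check of this kind is commonly implemented as an optimistic compare-and-commit. The sketch below uses an in-memory dictionary as a stand-in for the snapshot store; `commit` and `StaleSnapshotError` are hypothetical names, and a real data store would perform the comparison atomically.
+
+```python
+class StaleSnapshotError(Exception):
+    """Raised when a commit refers to an outdated snapshot version."""
+
+
+def commit(store: dict, instance_id: str, loaded_version: int, new_snapshot: dict) -> dict:
+    """Persist `new_snapshot` only if nobody committed since it was loaded."""
+    current = store[instance_id]
+    if current["version"] != loaded_version:
+        # Detected before any mutation: the store is left untouched.
+        raise StaleSnapshotError(
+            f"expected version {loaded_version}, found {current['version']}"
+        )
+    committed = {**new_snapshot, "version": loaded_version + 1}
+    store[instance_id] = committed
+    return committed
+```
+
+The same comparison can be applied to a signal that carries the version that created its wait.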
+
+## 9. Human Task Model
+
+The task model remains projection-first.
+
+The runtime does not wait on an in-memory task object.
+
+Instead:
+
+- task activation writes a task projection row
+- runtime snapshot enters `WaitingForTaskCompletion`
+- task completion API provides the wake-up event
+
+Task completion is therefore an external signal into the engine.
+
+## 10. Error Model
+
+The interpreter should classify errors into:
+
+- definition errors
+- expression evaluation errors
+- transport errors
+- timeout errors
+- authorization errors
+- engine consistency errors
+
+Definition errors are startup or validation failures.
+All other classes are execution errors: runtime failures that may:
+
+- route into a failure branch
+- schedule a retry
+- fail the workflow
+- move the instance to a recoverable error state
+
+## 11. Retry Model
+
+Retries should be modeled explicitly as scheduled signals.
+
+The engine should not sleep inside a worker.
+
+A retry should:
+
+1. persist the failure context
+2. generate a new waiting token
+3. enqueue a delayed resume signal
+4. commit
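+
+The four steps can be sketched as a pure function that prepares the retry, leaving the commit to the caller. The snapshot and signal layouts, the `schedule_retry` name, and the list standing in for a delayed signal queue are all assumptions.
+
+```python
+import secrets
+from datetime import datetime, timedelta
+
+
+def schedule_retry(snapshot, error, now, delay, signal_queue):
+    """Turn a failure into a delayed resume signal instead of sleeping."""
+    token = secrets.token_hex(16)
+    updated = {
+        **snapshot,
+        # Step 1: persist the failure context.
+        "failure_context": {"error": error, "failed_at": now.isoformat()},
+        # Step 2: a new waiting token invalidates any earlier wait.
+        "waiting_token": token,
+        "version": snapshot["version"] + 1,
+    }
+    # Step 3: enqueue a delayed resume signal carrying the new token.
+    signal_queue.append({
+        "kind": "retry",
+        "waiting_token": token,
+        "due_at": (now + delay).isoformat(),
+    })
+    # Step 4: the caller commits the snapshot and the signal together.
+    return updated
+```
+
+Passing `now` in as an argument, rather than reading the clock inside the function, keeps the sketch consistent with the determinism requirements.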
+
+## 12. Completion Model
+
+A workflow completes when the interpreter reaches terminal completion with no outstanding waits.
+
+The completion result must:
+
+- mark instance projection completed
+- mark runtime state completed
+- clear stale timeout metadata
+- apply retention timing
+
+## 13. Determinism Requirements
+
+The runtime must assume:
+
+- expressions are deterministic given the execution context
+- transport calls are side effects and must be treated explicitly
+- no hidden CLR delegate behavior remains in workflow definitions
+
+The runtime should not rely on:
+
+- non-deterministic local time calls inside step execution
+- in-memory mutable workflow objects
+- ambient state outside the canonical execution context
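+
+As an illustration of the local-time rule, a timer deadline can be derived from the "execution started at" value held in the runtime metadata rather than from a clock call inside the step. The `timer_due_at` helper and the dictionary-shaped context are hypothetical.
+
+```python
+from datetime import datetime, timedelta, timezone
+
+
+def timer_due_at(context: dict, delay_seconds: int) -> datetime:
+    """Compute a timer deadline from the slice's recorded start time.
+
+    Reading the time from the context (rather than calling datetime.now()
+    here) means re-running the same step on the same context gives the
+    same result.
+    """
+    return context["execution_started_at"] + timedelta(seconds=delay_seconds)
+```
+
+The clock is read once, when the execution slice starts, and every step in that slice sees the same value.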
+
+## 14. Resulting Implementation Shape
+
+The engine kernel should be implemented as:
+
+- definition normalizer
+- canonical interpreter
+- transport dispatcher
+- execution coordinator
+- resume serializer/deserializer
+
+This produces a runtime that is small, explicit, and aligned with the already-completed full-declaration effort.
+