# Serdica Workflow Engine

A declarative, plugin-based workflow engine for long-running insurance business processes. Replaces Camunda BPMN with a native C# fluent DSL, canonical JSON schema, durable signal-based execution, and multi-backend persistence.

---

## Table of Contents

- [Architecture Overview](#architecture-overview)
- [Workflow Declaration DSL](#workflow-declaration-dsl)
- [Human Tasks](#human-tasks)
- [Transport Calls (Service Tasks)](#transport-calls-service-tasks)
- [Control Flow](#control-flow)
- [Sub-Workflows & Continuations](#sub-workflows--continuations)
- [Canonical Definition Schema](#canonical-definition-schema)
- [Expression System](#expression-system)
- [Signal System](#signal-system)
- [Timeout Architecture](#timeout-architecture)
- [Retention & Lifecycle](#retention--lifecycle)
- [Authorization](#authorization)
- [Plugin System](#plugin-system)
- [Configuration Reference](#configuration-reference)
- [Service Surface](#service-surface)
- [Diagram & Visualization](#diagram--visualization)
- [Error Handling](#error-handling)

---

## Architecture Overview

### Execution Flow

```
Workflow request
  -> Workflow runtime service
    -> Runtime orchestrator
      -> Canonical execution handler
        -> Transport adapters
        -> Projection store
        -> Runtime state store
        -> Signal and schedule buses
```

### Key Components

| Component | Responsibility |
|-----------|---------------|
| **WorkflowRuntimeService** | Engine-facing lifecycle and task operations |
| **CanonicalWorkflowExecutionHandler** | Evaluates canonical step sequences, manages fork/join state |
| **WorkflowSignalPumpHostedService** | Background consumer for durable signal processing |
| **WorkflowRetentionHostedService** | Background cleanup of stale/completed instances |
| **IWorkflowProjectionStore** | Task and instance persistence (Mongo, Oracle, Postgres) |
| **IWorkflowRuntimeStateStore** | Durable execution state snapshots |
| **IWorkflowSignalBus** | Signal publishing and delivery |
| **Transport Plugins** | HTTP, GraphQL, message-bus transports, microservice command transport |

### Runtime Providers

| Provider | Description | Use Case |
|----------|-------------|----------|
| **Serdica.InProcess** | In-memory execution | Testing, simple workflows without durability |
| **Serdica.Engine** | Canonical engine with durable state | Production workflows with signal-based resumption |

---

## Workflow Declaration DSL

Workflows are defined using a strongly-typed C# fluent builder. The DSL compiles to a canonical JSON definition at startup.

### Minimal Workflow

```csharp
using System.Collections.Generic;
using System.Text.Json;

public sealed class ApproveApplicationWorkflow
    : IDeclarativeWorkflow<ApproveRequest>
{
    public string WorkflowName => "ApproveApplication";
    public string WorkflowVersion => "1.0.0";
    public string DisplayName => "Approve Application";
    public IReadOnlyCollection<string> WorkflowRoles => ["DBA", "UR_UNDERWRITER"];

    // Seed workflow state from the start request, then activate the human task.
    public WorkflowSpec<ApproveRequest> Spec { get; } = WorkflowSpec
        .For<ApproveRequest>()
        .InitializeState(request => new Dictionary<string, JsonElement>
        {
            ["policyId"] = JsonSerializer.SerializeToElement(request.PolicyId),
        })
        .StartWith(approveTask)
        .Build();

    public IReadOnlyCollection<WorkflowTaskDescriptor> Tasks => Spec.TaskDescriptors;

    // The only task in this workflow; completing it completes the workflow.
    private static readonly WorkflowHumanTaskDefinition<ApproveRequest> approveTask =
        WorkflowHumanTask.For<ApproveRequest>(
                taskName: "Approve Application",
                taskType: "ApproveQTApproveApplication",
                route: "business/policies")
            .WithPayload(context => new Dictionary<string, JsonElement>
            {
                ["policyId"] = context.StateValues.GetRequired<long>("policyId").AsJsonElement(),
            })
            .OnComplete(flow => flow.Complete());
}
```

### State Initialization

State can be initialized from the start request using delegates or expressions:

```csharp
// Delegate-based (typed)
.InitializeState(request => new { policyId = request.PolicyId, status = "NEW" })

// Expression-based (canonical, portable)
.InitializeState(
    WorkflowExpr.Object(
        WorkflowExpr.Prop("policyId", WorkflowExpr.Path("start.policyId")),
        WorkflowExpr.Prop("status", WorkflowExpr.String("NEW"))))
```

### Business Reference

Business references provide a queryable key for workflow instances:

```csharp
flow.SetBusinessReference(new WorkflowBusinessReferenceDeclaration
{
    KeyExpression = WorkflowExpr.Path("state.policyId"),
    PartsExpressions =
    {
        ["policyId"] = WorkflowExpr.Path("state.policyId"),
        ["annexId"] = WorkflowExpr.Path("state.annexId"),
    },
})
```

---

## Human Tasks

Human tasks pause workflow execution and wait for a user action (assign, complete, release).

### Defining a Task

```csharp
var reviewTask = WorkflowHumanTask.For<MyRequest>(
        taskName: "Review Changes",
        taskType: "ReviewPolicyChanges",
        route: "business/policies",
        taskRoles: ["UR_UNDERWRITER", "UR_OPERATIONS"])
    .WithPayload(context => new Dictionary<string, JsonElement>
    {
        ["policyId"] = context.StateValues.GetRequired<long>("policyId").AsJsonElement(),
    })
    .WithTimeout(86400) // 24-hour deadline (optional; default: no deadline)
    .OnComplete(flow => flow
        .WhenExpression(
            "Approved?",
            WorkflowExpr.Eq(WorkflowExpr.Path("payload.answer"), WorkflowExpr.String("approve")),
            approved => approved
                .Call("Confirm", confirmAddress, confirmPayload,
                    WorkflowHandledBranchAction.Complete,
                    WorkflowHandledBranchAction.Complete)
                .Complete(),
            rejected => rejected.Complete()));
```

### Task Properties

| Property | Type | Description |
|----------|------|-------------|
| `TaskName` | string | Unique name within the workflow |
| `TaskType` | string | UI component type identifier |
| `Route` | string | Navigation route for the UI |
| `TaskRoles` | string[] | Roles that can interact with this task |
| `TimeoutSeconds` | int? | Optional deadline. Null = no deadline (runs until completed or purged) |
| `DeadlineUtc` | DateTime? | Computed: `CreatedOnUtc + TimeoutSeconds`. Null if no timeout set |

### Task Lifecycle

```
Created (Pending)
  -> Assigned (user claims task)
    -> Completed (user submits payload)
      -> OnComplete sequence executes
        -> Next task activated, or workflow completes
```

### Task Actions & Authorization

| Action | Who Can Perform |
|--------|----------------|
| `AssignSelf` | Any user with matching effective roles |
| `AssignOther` | Admin roles only |
| `AssignRoles` | Admin roles only |
| `Release` | Current assignee or admin |
| `Complete` | Current assignee or admin |

---

## Transport Calls (Service Tasks)

Service tasks call external services via pluggable transports. Each call has optional failure and timeout recovery branches.

### Call with Address

```csharp
flow.Call<object>(
    "Calculate Premium",
    Address.LegacyRabbit("pas_premium_calculate_for_object"),
    context => new { policyId = context.StateValues.GetRequired<long>("policyId") },
    whenFailure: fail => fail.Complete(),       // recovery on failure
    whenTimeout: timeout => timeout.Complete(), // recovery on timeout
    resultKey: "premiumResult",                 // store response in state
    timeoutSeconds: 120);                       // per-step timeout override
```

### Address Types

| Address Factory | Transport | Example |
|----------------|-----------|---------|
| `Address.Microservice(name, command)` | Microservice command transport | `Address.Microservice("PasOperations", "perform")` |
| `Address.LegacyRabbit(command)` | Legacy message-bus transport | `Address.LegacyRabbit("pas_premium_calculate")` |
| `Address.Rabbit(exchange, routingKey)` | Exchange/routing-key bus transport | `Address.Rabbit("serdica", "policy.create")` |
| `Address.Http(target, path, method?)` | HTTP REST | `Address.Http("authority", "/api/users", "GET")` |
| `Address.Graphql(target, query)` | GraphQL | `Address.Graphql("serdica", "query { ... }")` |

### Failure & Timeout Handling

Every `Call` step supports optional `whenFailure` and `whenTimeout` branches:

```csharp
.Call("Service Task", address, payload,
    whenFailure: fail => fail
        .SetState("errorOccurred", WorkflowExpr.Bool(true))
        .Complete(), // graceful completion on failure
    whenTimeout: timeout => timeout
        .Call("Retry Alternative", altAddress, altPayload,
            WorkflowHandledBranchAction.Complete,
            WorkflowHandledBranchAction.Complete)
        .Complete())
```

If neither handler is defined and the transport call fails or times out, the exception propagates and the signal is retried (up to `MaxDeliveryAttempts`).

### Shorthand Actions

```csharp
.Call("Step", address, payload,
    WorkflowHandledBranchAction.Complete,  // on failure: complete workflow
    WorkflowHandledBranchAction.Complete)  // on timeout: complete workflow
```

---

## Control Flow

### Decisions (Conditional Branching)

```csharp
flow.WhenExpression(
    "Is VIP Customer?",
    WorkflowExpr.Eq(WorkflowExpr.Path("state.customerType"), WorkflowExpr.String("VIP")),
    whenTrue: vip => vip
        .Call("VIP Processing", ...)
        .Complete(),
    whenElse: standard => standard
        .Call("Standard Processing", ...)
        .Complete());
```

### State Flag Decisions

```csharp
flow.WhenStateFlag(
    "policyExistsOnIPAL",
    expectedValue: true,
    "Policy exists on IPAL?",
    whenTrue: exists => exists.Call("Open For Change", ...),
    whenElse: notExists => notExists.Call("Create Policy", ...));
```

### Repeat (Loops)

```csharp
flow.Repeat(
    "Retry Integration",
    maxIterations: context => 5,
    body: body => body
        .Call("Integrate", integrationAddress, payload,
            WorkflowHandledBranchAction.Complete,
            WorkflowHandledBranchAction.Complete)
        .SetState("retryCount", WorkflowExpr.Func("add",
            WorkflowExpr.Path("state.retryCount"), WorkflowExpr.Number(1))),
    continueWhile: WorkflowExpr.Ne(
        WorkflowExpr.Path("state.integrationStatus"),
        WorkflowExpr.String("SUCCESS")));
```

### Fork (Parallel Branches)

```csharp
flow.Fork("Process All Objects",
    branch1 => branch1.Call("Process Object A", ...),
    branch2 => branch2.Call("Process Object B", ...),
    branch3 => branch3.Call("Process Object C", ...));
```

All branches execute concurrently. The workflow resumes after all branches complete.

### Timer (Delay)

```csharp
flow.Timer("Wait Before Retry",
    delay: context => TimeSpan.FromMinutes(5));
```

### External Signal (Wait for Event)

```csharp
flow.WaitForSignal(
    "Wait for Document Upload",
    signalName: "documents-uploaded",
    resultKey: "uploadedDocuments");
```

Signals are raised via `RaiseExternalSignalAsync` and matched by `signalName` + `WaitingToken`.

---

## Sub-Workflows & Continuations

### SubWorkflow (Inline Execution)

Executes a child workflow inline within the parent. The parent waits for the child to complete.

```csharp
flow.SubWorkflow(
    "Run Review Process",
    new WorkflowWorkflowInvocationDeclaration
    {
        WorkflowName = "ReviewPolicyChanges",
        PayloadExpression = WorkflowExpr.Object(
            WorkflowExpr.Prop("policyId", WorkflowExpr.Path("state.policyId"))),
    });
```

### ContinueWith (Signal-Based)

Starts a new workflow instance asynchronously via the signal bus. The parent completes immediately.

```csharp
flow.ContinueWith(
    "Start Transfer Process",
    new WorkflowWorkflowInvocationDeclaration
    {
        WorkflowName = "TransferPolicy",
        PayloadExpression = WorkflowExpr.Path("state"),
    });
```

**When to use which:**

- **SubWorkflow**: Child must complete before parent continues. State flows back to parent.
- **ContinueWith**: Fire-and-forget. Parent completes, child runs independently.

---

## Canonical Definition Schema

Every workflow compiles to a canonical JSON definition (`serdica.workflow.definition/v1`). This enables:

- Portable workflow definitions (JSON import/export)
- Runtime validation without C# compilation
- Visual designer support

### Step Types

| Type | JSON `$type` | Description |
|------|-------------|-------------|
| Set State | `"set-state"` | Assign a value to workflow state |
| Business Reference | `"assign-business-reference"` | Set the business reference |
| Transport Call | `"call-transport"` | Call an external service |
| Decision | `"decision"` | Conditional branch |
| Activate Task | `"activate-task"` | Pause for human task |
| Continue With | `"continue-with-workflow"` | Start child workflow (async) |
| Sub-Workflow | `"sub-workflow"` | Execute child workflow (inline) |
| Repeat | `"repeat"` | Loop with condition |
| Timer | `"timer"` | Delay execution |
| External Signal | `"external-signal"` | Wait for external event |
| Fork | `"fork"` | Parallel branches |
| Complete | `"complete"` | Terminal step |

### Transport Address Types

| Type | JSON `$type` | Properties |
|------|-------------|------------|
| Microservice | `"microservice"` | `microserviceName`, `command` |
| Rabbit | `"rabbit"` | `exchange`, `routingKey` |
| Legacy Rabbit | `"legacy-rabbit"` | `command`, `mode` |
| GraphQL | `"graphql"` | `target`, `query`, `operationName?` |
| HTTP | `"http"` | `target`, `path`, `method` |

### Example Canonical Definition

```json
{
  "$schemaVersion": "serdica.workflow.definition/v1",
  "workflowName": "ApproveApplication",
  "workflowVersion": "1.0.0",
  "displayName": "Approve Application",
  "workflowRoles": ["DBA", "UR_UNDERWRITER"],
  "start": {
    "initializeStateExpression": {
      "$type": "object",
      "properties": [
        { "name": "policyId", "expression": { "$type": "path", "path": "start.policyId" } }
      ]
    },
    "sequence": {
      "steps": [
        {
          "$type": "call-transport",
          "stepName": "Validate Policy",
          "timeoutSeconds": 60,
          "invocation": {
            "address": {
              "$type": "legacy-rabbit",
              "command": "pas_policy_validate"
            },
            "payloadExpression": {
              "$type": "object",
              "properties": [
                { "name": "policyId", "expression": { "$type": "path", "path": "state.policyId" } }
              ]
            }
          }
        },
        {
          "$type": "activate-task",
          "taskName": "Approve Application",
          "timeoutSeconds": 86400
        }
      ]
    }
  },
  "tasks": [
    {
      "taskName": "Approve Application",
      "taskType": "ApproveQTApproveApplication",
      "routeExpression": { "$type": "string", "value": "business/policies" },
      "taskRoles": [],
      "payloadExpression": { "$type": "path", "path": "state" }
    }
  ]
}
```

---

## Expression System

The expression system evaluates declarative expressions at runtime without recompilation. All expressions are JSON-serializable for canonical portability.

### Expression Types

| Type | Builder | Example |
|------|---------|---------|
| Null | `WorkflowExpr.Null()` | JSON null |
| String | `WorkflowExpr.String("value")` | `"value"` |
| Number | `WorkflowExpr.Number(42)` | `42` |
| Boolean | `WorkflowExpr.Bool(true)` | `true` |
| Path | `WorkflowExpr.Path("state.policyId")` | Navigate object graph |
| Object | `WorkflowExpr.Object(props...)` | Construct object from named props |
| Array | `WorkflowExpr.Array(items...)` | Construct array |
| Function | `WorkflowExpr.Func("name", args...)` | Call a registered function |
| Binary | `WorkflowExpr.Eq(left, right)` | Comparison/arithmetic |
| Unary | `WorkflowExpr.Not(expr)` | Logical negation |

### Path Navigation

Paths navigate the execution context:

- `start.*`: start request fields
- `state.*`: current workflow state
- `payload.*`: current task completion payload
- `result.*`: step result (when `resultKey` is set)

### Binary Operators

| Operator | Builder | Description |
|----------|---------|-------------|
| `eq` | `WorkflowExpr.Eq(a, b)` | Equal |
| `ne` | `WorkflowExpr.Ne(a, b)` | Not equal |
| `gt` | `WorkflowExpr.Gt(a, b)` | Greater than |
| `gte` | `WorkflowExpr.Gte(a, b)` | Greater or equal |
| `lt` | `WorkflowExpr.Lt(a, b)` | Less than |
| `lte` | `WorkflowExpr.Lte(a, b)` | Less or equal |
| `and` | `WorkflowExpr.And(a, b)` | Logical AND |
| `or` | `WorkflowExpr.Or(a, b)` | Logical OR |
| `add` | n/a | Arithmetic addition |
| `subtract` | n/a | Arithmetic subtraction |
| `multiply` | n/a | Arithmetic multiplication |
| `divide` | n/a | Arithmetic division |

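To make path navigation and the `eq`/`and` operators concrete, here is a minimal, self-contained sketch that evaluates them over a JSON context. It is not the engine's evaluator: `ResolvePath`, `Eq`, and the sample context are hypothetical names invented for this illustration, and comparing serialized JSON is a simplification that only suits scalar values.

```csharp
using System;
using System.Linq;
using System.Text.Json.Nodes;

// Hypothetical execution context exposing the documented roots (start, state, payload).
var context = JsonNode.Parse("""
{
  "start":   { "policyId": 42 },
  "state":   { "policyId": 42, "customerType": "VIP" },
  "payload": { "answer": "approve" }
}
""")!;

// Resolve a dotted path such as "state.policyId"; yields null when a segment is missing.
JsonNode? ResolvePath(JsonNode root, string path) =>
    path.Split('.').Aggregate((JsonNode?)root, (node, segment) => (node as JsonObject)?[segment]);

// eq: compare the serialized JSON forms, which is enough for scalars in this sketch.
bool Eq(JsonNode? left, JsonNode? right) => left?.ToJsonString() == right?.ToJsonString();

// Plays the role of And(Eq(Path(...), String(...)), Eq(Path(...), String(...))).
bool vipApproved =
    Eq(ResolvePath(context, "state.customerType"), JsonValue.Create("VIP")) &&
    Eq(ResolvePath(context, "payload.answer"), JsonValue.Create("approve"));

// A path that does not exist in the context resolves to null rather than throwing.
bool missingIsNull = ResolvePath(context, "state.annexId") is null;

Console.WriteLine($"vipApproved={vipApproved}, missingIsNull={missingIsNull}");
```

Returning null for a missing segment is an assumption of this sketch; the engine's behaviour for unresolved paths is not specified here.
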
### Built-in Functions

| Function | Signature | Description |
|----------|-----------|-------------|
| `coalesce` | `coalesce(value1, value2, ...)` | Returns first non-null argument |
| `concat` | `concat(str1, str2, ...)` | String concatenation |
| `add` | `add(num1, num2, ...)` | Sum of numeric arguments |
| `first` | `first(array)` | First element of array |
| `if` | `if(condition, whenTrue, whenFalse)` | Conditional value |
| `isNullOrWhiteSpace` | `isNullOrWhiteSpace(value)` | Check for null/empty string |
| `length` | `length(value)` | Length of string or array |
| `mergeObjects` | `mergeObjects(obj1, obj2, ...)` | Deep merge objects |
| `upper` | `upper(value)` | Uppercase string |
| `selectManyPath` | `selectManyPath(array, path)` | Map over array elements |
| `findPath` | `findPath(data, path)` | Navigate nested paths |

Custom functions can be registered via `IWorkflowFunctionProvider` plugins.

---

## Signal System

Signals enable durable, asynchronous communication within workflows. They are persisted to a message queue (Oracle AQ, MongoDB, etc.) and processed by the signal pump.

### Signal Types

| Type | Trigger | Purpose |
|------|---------|---------|
| `InternalContinue` | `ContinueWith` step | Start a child workflow asynchronously |
| `TimerDue` | Timer step delay expired | Resume workflow after delay |
| `RetryDue` | Retry delay expired | Resume after backoff |
| `ExternalSignal` | External signal submission through the workflow service surface | External event notification |
| `SubWorkflowCompleted` | Child sub-workflow finished | Resume parent workflow |

### Signal Envelope

```csharp
new WorkflowSignalEnvelope
{
    SignalId = "unique-id",
    WorkflowInstanceId = "target-instance",
    RuntimeProvider = "Serdica.Engine",
    SignalType = "ExternalSignal",
    ExpectedVersion = 5,          // Concurrency control
    WaitingToken = "wait-token",  // Match specific wait
    DueAtUtc = null,              // null = immediate, DateTime = scheduled
    Payload = { ... },
}
```

### Signal Processing Pipeline

```
Oracle AQ / MongoDB / Postgres
  -> WorkflowSignalPumpHostedService (N concurrent workers)
    -> WorkflowSignalPumpWorker.RunOnceAsync
      -> IWorkflowSignalBus.ReceiveAsync (blocking dequeue)
      -> WorkflowSignalProcessor.ProcessAsync (route by type)
        -> WorkflowSignalCommandDispatcher
          -> WorkflowRuntimeService.StartWorkflowAsync (InternalContinue)
          -> WorkflowRuntimeService.ResumeSignalAsync (all others)
```

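The routing rule at the bottom of the pipeline can be sketched as a single switch. This is an illustration only: `Dispatch` is a hypothetical helper that returns the name of the runtime-service method that would be called, and rejecting unknown signal types is an assumption of the sketch, not documented behaviour.

```csharp
using System;

// Hypothetical router: returns the name of the WorkflowRuntimeService method to invoke.
string Dispatch(string signalType) => signalType switch
{
    // ContinueWith starts a new workflow instance.
    "InternalContinue" => "StartWorkflowAsync",
    // Every other documented signal type resumes an existing instance.
    "TimerDue" or "RetryDue" or "ExternalSignal" or "SubWorkflowCompleted" => "ResumeSignalAsync",
    // Assumption: unknown types are rejected rather than silently resumed.
    _ => throw new NotSupportedException($"Unknown signal type '{signalType}'."),
};

foreach (var type in new[] { "InternalContinue", "TimerDue", "ExternalSignal" })
{
    Console.WriteLine($"{type} -> {Dispatch(type)}");
}
```
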
### Concurrency Control

Version-based optimistic concurrency prevents duplicate signal processing:

- Each signal carries `ExpectedVersion`.
- `IWorkflowRuntimeStateStore.UpsertAsync` validates that the stored version matches.
- On mismatch, `WorkflowRuntimeStateConcurrencyException` is thrown.
- The signal pump treats concurrency conflicts as successful (it completes the lease).

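The following sketch shows why a redelivered signal is harmless under this scheme. It uses an in-memory dictionary as a stand-in for `IWorkflowRuntimeStateStore`; `TryUpsert` and `ProcessSignal` are hypothetical names, and a `false` return value stands in for the `WorkflowRuntimeStateConcurrencyException` that the real store throws.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for the runtime state store: instance id -> current version.
var versions = new Dictionary<string, long> { ["instance-1"] = 5 };

// Applies the write only when the caller's expected version matches the stored one.
// Returning false models the concurrency exception thrown by the real store.
bool TryUpsert(string instanceId, long expectedVersion)
{
    if (versions[instanceId] != expectedVersion) return false;
    versions[instanceId] = expectedVersion + 1;
    return true;
}

// The lease is completed in both cases: a conflict means another delivery
// already advanced the instance, so there is nothing left to retry.
string ProcessSignal(string instanceId, long expectedVersion) =>
    TryUpsert(instanceId, expectedVersion) ? "applied" : "skipped-as-duplicate";

Console.WriteLine(ProcessSignal("instance-1", 5)); // applied
Console.WriteLine(ProcessSignal("instance-1", 5)); // skipped-as-duplicate
```

The second call carries the same `ExpectedVersion` as the first, which is what a duplicate delivery of one signal looks like.
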
### Dead Letter Queue

Signals that fail `MaxDeliveryAttempts` times are moved to the dead-letter queue. Dead letters can be inspected and replayed through the workflow service surface.

---

## Timeout Architecture

Timeouts operate at three independent levels:

### Level 1: Per-Step Timeout (Service Tasks)

Each transport call step has an optional timeout that wraps the entire call (including retries) with a `CancellationTokenSource`.

| Setting | Default | Override |
|---------|---------|----------|
| `step.TimeoutSeconds` | null | Per-step in workflow declaration |
| `DefaultTimeoutForServiceTaskCallsSeconds` | 3600s (1h) | Code constant (fallback) |

```csharp
// Per-step override in workflow DSL:
.Call("Slow Service", address, payload, fail, timeout, timeoutSeconds: 300)

// In canonical JSON:
{ "$type": "call-transport", "timeoutSeconds": 300, ... }
```

**Precedence:** `step.TimeoutSeconds` -> `DefaultTimeoutForServiceTaskCallsSeconds` (1h)

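The precedence rule amounts to a null-coalescing fallback. A minimal sketch, assuming a hypothetical helper named `ResolveStepTimeout` (the constant name and its 3600-second value come from the table above):

```csharp
using System;

// Documented fallback constant: 3600 s (1 h).
const int DefaultTimeoutForServiceTaskCallsSeconds = 3600;

// The step-level value wins when present; otherwise the code constant applies.
TimeSpan ResolveStepTimeout(int? stepTimeoutSeconds) =>
    TimeSpan.FromSeconds(stepTimeoutSeconds ?? DefaultTimeoutForServiceTaskCallsSeconds);

Console.WriteLine(ResolveStepTimeout(300));  // 00:05:00
Console.WriteLine(ResolveStepTimeout(null)); // 01:00:00
```
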
### Level 2: Per-Attempt Transport Timeout

Each individual transport attempt (single HTTP request, single RPC call) has its own timeout. This is independent of the step-level timeout.

| Transport | Default | Config Section |
|-----------|---------|---------------|
| HTTP | 30s | `WorkflowHttpTransport.TimeoutSeconds` |
| GraphQL | 30s | `WorkflowGraphqlTransport.TimeoutSeconds` |
| Legacy message-bus transport | 30s | `WorkflowLegacyRabbitTransport.DefaultTimeout` |
| Exchange/routing-key bus transport | 30s | `WorkflowRabbitTransport.DefaultTimeout` |

The step timeout wraps all attempts. Example: with a 120s step timeout and a 30s transport timeout, up to four full-length attempts fit within the step window.

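The arithmetic behind that example is an integer division of the step window by the per-attempt timeout. The helper below is a hypothetical illustration; it ignores any backoff delay between attempts and any configured `RetryCount`, both of which can only lower the number in practice.

```csharp
using System;

// Upper bound on attempts that each run to their full timeout inside the step window.
int MaxFullAttempts(int stepTimeoutSeconds, int attemptTimeoutSeconds) =>
    stepTimeoutSeconds / attemptTimeoutSeconds;

Console.WriteLine(MaxFullAttempts(120, 30)); // 4
Console.WriteLine(MaxFullAttempts(300, 30)); // 10
```

Attempts that fail quickly consume less of the window, so the bound only describes the worst case where every attempt times out.
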
### Level 3: Engine-Wide Execution Timeout

Optional global timeout per workflow operation (start, complete, resume).

| Setting | Default | Config |
|---------|---------|--------|
| `ExecutionTimeoutSeconds` | null (disabled) | `WorkflowEngine.ExecutionTimeoutSeconds` |

Set to null for long-running business processes that span days or months.

### Human Task Deadlines

| Setting | Default | Override |
|---------|---------|----------|
| `TimeoutSeconds` on activate-task | null (no deadline) | `.WithTimeout(seconds)` on task builder |
| `DeadlineUtc` on task summary | null | Computed: `CreatedOnUtc + TimeoutSeconds` |

When null, human tasks run indefinitely. Stale/orphaned tasks are cleaned up by the retention service.

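The deadline computation can be sketched as follows. `ComputeDeadlineUtc` is a hypothetical helper name; only the formula `CreatedOnUtc + TimeoutSeconds` and the null-means-no-deadline rule come from the table above.

```csharp
using System;

// DeadlineUtc = CreatedOnUtc + TimeoutSeconds; a null timeout means no deadline.
DateTime? ComputeDeadlineUtc(DateTime createdOnUtc, int? timeoutSeconds) =>
    timeoutSeconds is int seconds ? createdOnUtc.AddSeconds(seconds) : null;

var created = new DateTime(2025, 1, 1, 9, 0, 0, DateTimeKind.Utc);

// A 24-hour timeout (86400 s) moves the deadline to the same time on the next day.
Console.WriteLine(ComputeDeadlineUtc(created, 86400)?.ToString("u")); // 2025-01-02 09:00:00Z
Console.WriteLine(ComputeDeadlineUtc(created, null) is null);          // True
```
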
---

## Retention & Lifecycle

The retention system automatically manages workflow instance lifecycle.

### Configuration

| Setting | Default | Config Section |
|---------|---------|---------------|
| `OpenStaleAfterDays` | 30 | `WorkflowRetention` |
| `CompletedPurgeAfterDays` | 180 | `WorkflowRetention` |

### Retention Job

| Setting | Default | Config Section |
|---------|---------|---------------|
| `Enabled` | true | `WorkflowRetentionHostedJob` |
| `RunOnStartup` | false | `WorkflowRetentionHostedJob` |
| `InitialDelay` | 5 min | `WorkflowRetentionHostedJob` |
| `Interval` | 24 hours | `WorkflowRetentionHostedJob` |
| `LockLease` | 2 hours | `WorkflowRetentionHostedJob` |

### Lifecycle Flow

```
Instance Created (Open)
  -> StaleAfterUtc = CreatedOnUtc + OpenStaleAfterDays
     [Retention job marks as stale]

Instance Completed
  -> PurgeAfterUtc = CompletedOnUtc + CompletedPurgeAfterDays
     [Retention job deletes instance, tasks, events, runtime state]
```

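The two date formulas and the resulting sweep decision can be sketched with the documented defaults. `RetentionAction` and its return labels are hypothetical, and treating the boundary instant as already due (`>=`) is an assumption of this sketch.

```csharp
using System;

// Defaults documented for the WorkflowRetention section.
const int OpenStaleAfterDays = 30;
const int CompletedPurgeAfterDays = 180;

DateTime StaleAfterUtc(DateTime createdOnUtc) => createdOnUtc.AddDays(OpenStaleAfterDays);
DateTime PurgeAfterUtc(DateTime completedOnUtc) => completedOnUtc.AddDays(CompletedPurgeAfterDays);

// What a retention sweep would do with a single instance at time nowUtc.
string RetentionAction(DateTime createdOnUtc, DateTime? completedOnUtc, DateTime nowUtc)
{
    if (completedOnUtc is DateTime completed)
        return nowUtc >= PurgeAfterUtc(completed) ? "purge" : "keep";
    return nowUtc >= StaleAfterUtc(createdOnUtc) ? "mark-stale" : "keep";
}

var created = new DateTime(2025, 1, 1, 0, 0, 0, DateTimeKind.Utc);
Console.WriteLine(RetentionAction(created, null, created.AddDays(31)));                 // mark-stale
Console.WriteLine(RetentionAction(created, created.AddDays(10), created.AddDays(100))); // keep
Console.WriteLine(RetentionAction(created, created.AddDays(10), created.AddDays(200))); // purge
```
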
Manual trigger: `POST /workflow-retention/run`

---

## Authorization

Task authorization uses a pluggable evaluator pattern.

### Interface

```csharp
public interface IWorkflowAssignmentPermissionEvaluator
{
    WorkflowAssignmentPermissionDecision Evaluate(WorkflowAssignmentPermissionContext context);
}
```

### Default Plugin: Generic Assignment Permissions

Configured via `GenericAssignmentPermissions.AdminRoles` (appsettings).

| Action | Admin | Standard User |
|--------|-------|--------------|
| AssignSelf | Yes | Yes (if has effective role) |
| AssignOther | Yes | No |
| AssignRoles | Yes | No |
| Release | Yes | Yes (if current assignee) |
| Complete | Yes | Yes (if current assignee) |

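The permission matrix above translates into a small decision function. This sketch is not the plugin's implementation: `IsAllowed`, its parameters, and the sample admin role list are hypothetical, and the real evaluator works with `WorkflowAssignmentPermissionContext`, whose members are not shown in this document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical admin role list; the real one comes from GenericAssignmentPermissions.AdminRoles.
var adminRoles = new HashSet<string> { "DBA" };

bool IsAllowed(string action, string userId, ISet<string> userRoles,
               ISet<string> effectiveRoles, string? assignee)
{
    // Admins may perform every action in the table.
    if (userRoles.Overlaps(adminRoles)) return true;

    return action switch
    {
        "AssignSelf" => userRoles.Overlaps(effectiveRoles), // needs a matching effective role
        "Release" or "Complete" => assignee == userId,      // current assignee only
        _ => false,                                         // AssignOther, AssignRoles: admin only
    };
}

var underwriter = new HashSet<string> { "UR_UNDERWRITER" };
var taskRoles = new HashSet<string> { "UR_UNDERWRITER", "UR_OPERATIONS" };

Console.WriteLine(IsAllowed("AssignSelf", "alice", underwriter, taskRoles, null));  // True
Console.WriteLine(IsAllowed("AssignOther", "alice", underwriter, taskRoles, null)); // False
Console.WriteLine(IsAllowed("Complete", "alice", underwriter, taskRoles, "bob"));   // False
```
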
### Effective Roles

A task's `EffectiveRoles` combines:

1. `WorkflowRoles`: from the workflow definition
2. `TaskRoles`: from the task definition
3. `RuntimeRoles`: computed at runtime via expression

If `TaskRoles` are specified, they narrow the effective roles. Otherwise, `WorkflowRoles` apply.

---

## Plugin System

Plugins extend the workflow engine with backend stores, transports, signal drivers, and workflow definitions.

### Plugin Types

| Category | Example Plugins |
|----------|----------------|
| **Backend Store** | Oracle, MongoDB, Postgres |
| **Signal Driver** | Redis, Oracle AQ (native) |
| **Transport** | HTTP, GraphQL, legacy message-bus, exchange/routing-key bus, microservice command |
| **Permissions** | Generic RBAC |
| **Workflow Definitions** | Bulstrad (customer-specific) |

### Creating a Plugin

```csharp
public sealed class ServiceRegistrator : IPluginServiceRegistrator
{
    public void RegisterServices(IServiceCollection services, IConfiguration configuration)
    {
        services.AddWorkflowModule("my-module", "1.0.0");
        services.AddScoped<IMyService, MyServiceImpl>();
    }
}
```

### Loading Order

Plugins load in the order specified by `PluginsConfig.PluginsOrder` in appsettings. Backend stores must load before transport or workflow plugins.

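A configured order can be checked against that constraint with a simple position comparison. The sketch below is hypothetical: `LoadsBefore` is an invented helper, and the identifiers are the placeholder names used in the `PluginsOrder` sample in the Configuration Reference, not real plugin names.

```csharp
using System;
using System.Collections.Generic;

// Placeholder identifiers; real deployments use their own plugin names.
var order = new List<string>
{
    "assign-permissions", "workflow-store", "signal-driver", "transports", "workflow-definitions",
};

// True when both plugins are configured and 'earlier' precedes 'later'.
bool LoadsBefore(IList<string> plugins, string earlier, string later)
{
    int a = plugins.IndexOf(earlier), b = plugins.IndexOf(later);
    return a >= 0 && b >= 0 && a < b;
}

// Backend store must come before transports and before workflow definitions.
bool valid = LoadsBefore(order, "workflow-store", "transports")
          && LoadsBefore(order, "workflow-store", "workflow-definitions");

Console.WriteLine(valid); // True
```
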
### Marker Interfaces

- `IWorkflowBackendRegistrationMarker`: validates that a backend plugin is loaded
- `IWorkflowSignalDriverRegistrationMarker`: validates that a signal driver is loaded

Startup validation throws `InvalidOperationException` if a configured provider is missing its plugin.

---

## Configuration Reference

### WorkflowEngine

```json
{
  "WorkflowEngine": {
    "NodeId": "workflow-node-1",
    "MaxConcurrentExecutions": 16,
    "MaxConcurrentSignalHandlers": 16,
    "ExecutionTimeoutSeconds": 300,
    "GracefulShutdownTimeoutSeconds": 30
  }
}
```

### WorkflowRuntime

```json
{
  "WorkflowRuntime": {
    "DefaultProvider": "Serdica.Engine",
    "EnabledProviders": ["Serdica.InProcess", "Serdica.Engine"]
  }
}
```

### WorkflowAq (Signal Queue)

```json
{
  "WorkflowAq": {
    "QueueOwner": "SRD_WFKLW",
    "SignalQueueName": "WF_SIGNAL_Q",
    "ScheduleQueueName": "WF_SCHEDULE_Q",
    "DeadLetterQueueName": "WF_DLQ_Q",
    "ConsumerName": "WORKFLOW_SERVICE",
    "BlockingDequeueSeconds": 30,
    "MaxDeliveryAttempts": 10
  }
}
```

### WorkflowRetention

```json
{
  "WorkflowRetention": {
    "OpenStaleAfterDays": 30,
    "CompletedPurgeAfterDays": 180
  }
}
```

### WorkflowRetentionHostedJob

```json
{
  "WorkflowRetentionHostedJob": {
    "Enabled": true,
    "RunOnStartup": false,
    "InitialDelay": "00:05:00",
    "Interval": "1.00:00:00",
    "LockName": "workflow.retention",
    "LockLease": "02:00:00"
  }
}
```

### Transport Configuration

```json
{
  "WorkflowHttpTransport": {
    "TimeoutSeconds": 30,
    "RetryCount": 3,
    "Targets": {
      "authority": { "Url": "http://localhost:52000", "Headers": {} }
    }
  },
  "WorkflowGraphqlTransport": {
    "TimeoutSeconds": 30,
    "RetryCount": 3,
    "Targets": {
      "serdica": { "Url": "http://localhost:5100/graphql/" }
    }
  },
  "WorkflowLegacyRabbitTransport": {
    "DefaultTimeout": "00:00:30"
  },
  "WorkflowRabbitTransport": {
    "DefaultTimeout": "00:00:30",
    "DefaultUserId": "workflow-engine"
  }
}
```

### Plugin Loading

```json
{
  "PluginsConfig": {
    "PluginsDirectory": "PluginBinaries",
    "PluginsOrder": [
      "assign-permissions",
      "workflow-store",
      "signal-driver",
      "transports",
      "workflow-definitions"
    ]
  }
}
```

Substitute deployment-specific plugin identifiers and keep this relative order: durability (backend store) first, the wake mechanism (signal driver) second, transports after that, and workflow-definition bundles last.

---

## Service Surface

The engine depends on a workflow service surface, but the platform transport and command-mapping layer are intentionally out of scope for this document.

### Lifecycle Operations

- Start a workflow instance.
- List workflow instances with filtering (by name, version, status, business reference, instance ID, or multiple instance IDs). Set `IncludeDetails = true` to return each instance's active task and workflow state variables.
- Read one workflow instance with tasks, events, and runtime state.

### Task Operations

- List tasks by workflow, status, assignee, or business reference.
- Read one task.
- Assign a task to a user or role group.
- Release a task back to the pool.
- Complete a task with payload.

### Signal And Operations Management

- Raise an external signal to a waiting instance.
- Inspect dead-lettered signals.
- Replay dead-lettered signals.
- Inspect signal-pump telemetry.

### Definitions And Metadata

- List workflow definitions (filterable by name, version, or multiple names).
- Get a single definition by name with optional rendering assets (SVG/PNG/JSON).
- Render a workflow definition as a diagram.
- Render a definition in a specific format (`svg`, `png`, or `json` render graph).
- Expose the canonical schema.
- Validate canonical definitions.
- Expose the installed function catalog and engine metadata.

### Definition Deployment

- Import a canonical definition with versioned storage and content-hash deduplication.
- Export a definition with optional rendering package.
- List all versions of a definition with hash, active flag, and metadata.
- Activate a specific version as the active version for a workflow name.

### Administration

- Trigger a manual retention sweep.

---

## Diagram & Visualization
|
|
|
|
The engine can render workflow definitions as visual diagrams.

### Layout Engines

| Engine | Description |
|--------|-------------|
| **ElkSharp** | Port of Eclipse Layout Kernel (default) |
| **ElkJS** | JavaScript-based ELK via Node.js |
| **MSAGL** | Microsoft Automatic Graph Layout |

### Configuration

```json
{
  "WorkflowRendering": {
    "LayoutProvider": "ElkSharp"
  }
}
```

### Render Pipeline

```
WorkflowCanonicalDefinition
  -> WorkflowRenderGraphCompiler (nodes + edges)
  -> WorkflowRenderLayoutEngineResolver (select engine)
  -> Layout engine (compute positions)
  -> WorkflowRenderDiagramResponse (JSON for UI)
```
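
The "select engine" step can be sketched as a lookup by configured provider name. This is a hypothetical illustration, not the actual `WorkflowRenderLayoutEngineResolver`: the `ILayoutEngine` interface, the case-insensitive match, the fallback to `ElkSharp` when nothing is configured, and the exception for unknown names are all assumptions.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction; the engine's real layout interface is not shown in this document.
public interface ILayoutEngine
{
    string Name { get; }
}

public sealed record NamedLayoutEngine(string Name) : ILayoutEngine;

public static class LayoutEngineResolver
{
    // ElkSharp is documented as the default layout engine.
    public const string DefaultProvider = "ElkSharp";

    public static ILayoutEngine Resolve(IEnumerable<ILayoutEngine> engines, string? configuredProvider)
    {
        // Assumption: an empty or missing setting falls back to the default provider.
        string wanted = string.IsNullOrWhiteSpace(configuredProvider)
            ? DefaultProvider
            : configuredProvider.Trim();

        // Assumption: provider names are matched case-insensitively ("ElkJS" vs "ElkJs").
        ILayoutEngine? match = engines.FirstOrDefault(
            e => string.Equals(e.Name, wanted, StringComparison.OrdinalIgnoreCase));

        return match ?? throw new InvalidOperationException(
            $"No layout engine is registered for provider '{wanted}'.");
    }
}
```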

---

## Error Handling

### Exception Types

| Exception | Cause | Recovery |
|-----------|-------|----------|
| `WorkflowRuntimeStateConcurrencyException` | Duplicate/stale signal delivery | Auto-handled by signal pump (completes lease) |
| `BaseResultException` | Business validation failure (not found, denied) | Returns error to caller |
| `TimeoutException` | Transport or step timeout exceeded | Executes `WhenTimeout` branch if configured |
| `NotSupportedException` | Unsupported operation (e.g., null signal store) | Configuration error; check plugin loading |

### Signal Retry Behavior

| Scenario | Behavior |
|----------|----------|
| Transient error | Signal abandoned, retried on next poll |
| Concurrency conflict | Signal completed (not retried) |
| Max delivery attempts exceeded | Signal moved to dead-letter queue |
| Deserialization failure | Signal dead-lettered with error logged |
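
The retry table can be read as a small decision function. The sketch below is illustrative only: `SignalOutcome`, `SignalRetryPolicy`, and the stand-in `ConcurrencyConflictException` are hypothetical names, and two details are assumptions: deserialization failures are taken to surface as `JsonException`, and "exceeded" is taken to mean the current attempt has reached the configured maximum.

```csharp
using System;
using System.Text.Json;

// Hypothetical outcome names mirroring the table rows above.
public enum SignalOutcome
{
    Abandon,    // release the lease so the signal is retried on the next poll
    Complete,   // finish the lease without retrying
    DeadLetter  // move the signal to the dead-letter queue
}

// Stand-in for WorkflowRuntimeStateConcurrencyException so the sketch compiles on its own.
public sealed class ConcurrencyConflictException : Exception
{
}

public static class SignalRetryPolicy
{
    public static SignalOutcome Decide(Exception error, int deliveryAttempt, int maxDeliveryAttempts)
    {
        // Duplicate or stale delivery: another delivery already advanced the state.
        if (error is ConcurrencyConflictException)
        {
            return SignalOutcome.Complete;
        }

        // Assumption: an unreadable payload surfaces as JsonException and can never succeed.
        if (error is JsonException)
        {
            return SignalOutcome.DeadLetter;
        }

        // Assumption: the delivery budget is exhausted once the attempt count reaches the maximum.
        if (deliveryAttempt >= maxDeliveryAttempts)
        {
            return SignalOutcome.DeadLetter;
        }

        // Everything else is treated as transient.
        return SignalOutcome.Abandon;
    }
}
```

Checking the concurrency case first reflects the table's intent that a stale delivery is completed rather than retried, regardless of how many attempts remain.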

### Observability

- **Structured logging** via Serilog (all key operations logged with structured properties)
- **Signal pump telemetry** via `WorkflowSignalPumpTelemetryService` (in-memory counters, queryable through the workflow service surface)
- **W3C trace IDs** enabled (`Activity.DefaultIdFormat = W3C`)
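
As a minimal sketch of the trace ID setting, the snippet below configures `System.Diagnostics.Activity` for W3C identifiers at startup. The `TracingSetup` class is a hypothetical wrapper, and setting `ForceDefaultIdFormat` is an optional extra that this document does not mention; it is included only to show how a non-W3C parent format could be overridden.

```csharp
using System.Diagnostics;

// Hypothetical helper; call it once at process startup, before any Activity is created.
public static class TracingSetup
{
    public static void UseW3CTraceIds()
    {
        // Matches the documented setting: Activity.DefaultIdFormat = W3C.
        Activity.DefaultIdFormat = ActivityIdFormat.W3C;

        // Assumption (not from this document): also apply W3C to activities
        // whose parent uses a different ID format.
        Activity.ForceDefaultIdFormat = true;
    }
}
```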

---

## Compiler & Decompiler

### Forward Compiler

`WorkflowCanonicalDefinitionCompiler.Compile<TStartRequest>()` converts a C# fluent DSL workflow (`IDeclarativeWorkflow<T>`) into a canonical JSON definition (`WorkflowCanonicalDefinition`). This runs at startup for all registered workflows.

The compiler also generates a **JSON Schema** for the start request type, embedded in the canonical definition's `startRequest.schema` field. This provides a portable, CLR-independent contract for the workflow's input.

### Reverse Compiler (Decompiler)

`WorkflowCanonicalDecompiler` converts a canonical definition back to C# source code using Roslyn `SyntaxFactory`. It has two modes:

- **`Decompile(definition)`**: produces formatted C# source text, including a typed start request class generated from the JSON Schema and the full workflow class with its fluent builder chain
- **`Reconstruct(definition)`**: produces a new `WorkflowCanonicalDefinition` via deep clone (for structural comparison)

The decompiler uses `nameof()` for all type and method references (`WorkflowExpr.Obj`, `LegacyRabbitAddress`, etc.) to ensure rename safety at compile time.

### Round-Trip Verification

The test suite verifies compiler fidelity via real Roslyn dynamic compilation:

```
Original C# workflow
  -> [compile] -> canonical JSON (JSON1)
  -> [decompile] -> C# source text
  -> [Roslyn CSharpCompilation] -> in-memory assembly
  -> [reflection: instantiate workflow]
  -> [compile] -> canonical JSON (JSON2)
  -> assert JSON1 == JSON2
```

This catches any information loss in the compile/decompile cycle: missing steps, truncated expressions, wrong addresses, lost failure/timeout branches.
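
The Roslyn step in the middle of that pipeline can be sketched as follows. This is not the test suite's actual helper: `InMemoryCompiler` is a hypothetical name, the sketch assumes the `Microsoft.CodeAnalysis.CSharp` package is referenced, and it assumes the host exposes its assembly list through the `TRUSTED_PLATFORM_ASSEMBLIES` app-context value. A real round-trip test would also need references to the workflow libraries that the decompiled source uses, and would then instantiate the workflow via reflection and compile it again to obtain JSON2.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper that turns C# source text into a loaded in-memory assembly.
public static class InMemoryCompiler
{
    public static Assembly CompileToAssembly(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // Assumption: reference every assembly the host runtime trusts.
        string trusted = AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES") as string
            ?? throw new InvalidOperationException("TRUSTED_PLATFORM_ASSEMBLIES is not available.");

        var references = trusted
            .Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries)
            .Select(path => (MetadataReference)MetadataReference.CreateFromFile(path));

        CSharpCompilation compilation = CSharpCompilation.Create(
            assemblyName: "RoundTrip_" + Guid.NewGuid().ToString("N"),
            syntaxTrees: new[] { tree },
            references: references,
            options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using var stream = new MemoryStream();
        var result = compilation.Emit(stream);

        if (!result.Success)
        {
            // Surface compiler errors so a failing decompiled file is easy to diagnose.
            var errors = result.Diagnostics
                .Where(d => d.Severity == DiagnosticSeverity.Error)
                .Select(d => d.ToString());
            throw new InvalidOperationException(string.Join(Environment.NewLine, errors));
        }

        return Assembly.Load(stream.ToArray());
    }
}
```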

**Test results:**

- 177/177 decompiled C# files compile cleanly with Roslyn
- Semantic round-trip comparison identifies remaining gaps for iterative improvement

### Decompiled Output

Running the `RenderAllDecompiledOutputs` test generates human-readable output for all workflows:

```
docs/decompiled-samples/
  csharp/   177 .cs files (Roslyn-formatted C# with typed request models)
  json/     177 .json files (indented canonical definitions with JSON Schema)
```