15. Backend And Signal Driver Usage

Purpose

This document turns the current backend implementation and measured six-profile matrix into operating guidance.

It answers three practical questions:

which backend should be the durable workflow system of record
whether the signal driver should stay native or use Redis
when a given combination should or should not be used

The reference comparison data comes from:

Two Separate Choices

There are two distinct infrastructure choices in the current engine.

1. Backend

The backend is the durable correctness layer.

It owns:

runtime state
projections
durable signal persistence
delayed signal persistence
dead-letter persistence
mutation transaction boundary

The configured backend lives under:

WorkflowBackend:Provider

Supported values are defined by the engine backend identifiers.

Current values:

Oracle
Postgres
Mongo

2. Signal Driver

The signal driver is the wake mechanism.

It owns:

wake notification delivery
receive wait behavior
claim loop entry path

It does not own correctness.

The configured signal driver lives under:

WorkflowSignalDriver:Provider

Supported values are defined by the engine signal-driver identifiers.

Current values:

Native
Redis
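As a minimal sketch of how the two choices stay independent, the following Python reads both provider keys and rejects unknown values. The key names and provider values are the ones listed above; the function `read_profile` is hypothetical and not part of the engine, and it assumes both keys are set explicitly because the document does not state any defaults.

```python
# Hypothetical validator: key names and values come from this document,
# the function itself is illustrative and not engine code.
BACKEND_PROVIDERS = {"Oracle", "Postgres", "Mongo"}
SIGNAL_DRIVER_PROVIDERS = {"Native", "Redis"}


def read_profile(config: dict) -> tuple[str, str]:
    """Return (backend, signal_driver) or raise ValueError on unknown values."""
    backend = config.get("WorkflowBackend", {}).get("Provider")
    driver = config.get("WorkflowSignalDriver", {}).get("Provider")
    if backend not in BACKEND_PROVIDERS:
        raise ValueError(f"unsupported WorkflowBackend:Provider: {backend!r}")
    if driver not in SIGNAL_DRIVER_PROVIDERS:
        raise ValueError(f"unsupported WorkflowSignalDriver:Provider: {driver!r}")
    return backend, driver
```

Any of the three backends can be paired with either driver, which is where the six profiles come from.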

Core Rule

Redis is a wake driver, not a durable workflow queue.

That means:

the selected backend always remains the durable source of truth
runtime state and durable signals commit in the backend transaction boundary
Redis only publishes wake hints after commit
workers always claim from the durable backend store

Do not design or describe Redis as the place where workflow correctness lives.
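The rule can be illustrated with a small in-memory sketch. Everything here is a stand-in: `DurableStore` plays the role of the selected backend, a bounded `queue.Queue` plays the role of the Redis wake channel, and the names `mutate` and `worker_step` are hypothetical. It is not the engine's implementation, only the commit-then-hint shape described above.

```python
import queue


class DurableStore:
    """Stand-in for the backend: state and signals commit together."""

    def __init__(self):
        self.state = {}
        self.pending_signals = []

    def commit(self, instance_id, new_state, signal):
        # One "transaction": runtime state and the durable signal land together.
        self.state[instance_id] = new_state
        self.pending_signals.append(signal)

    def claim(self):
        # Workers claim from the durable store, never from the wake channel.
        return self.pending_signals.pop(0) if self.pending_signals else None


def mutate(store, wake_channel, instance_id, new_state, signal):
    store.commit(instance_id, new_state, signal)
    try:
        wake_channel.put_nowait("wake")  # hint only, published after commit
    except queue.Full:
        pass  # a lost hint is harmless: the signal is already durable


def worker_step(store, wake_channel, wait_seconds):
    try:
        wake_channel.get(timeout=wait_seconds)  # wait for a hint or time out
    except queue.Empty:
        pass
    return store.claim()  # either way, the store decides what work exists
```

In this sketch a dropped hint only delays a claim until the wait times out; it never loses the signal, because the signal was committed to the store before the hint was attempted.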

Supported Profiles

| Profile | Durable correctness layer | Wake path | Current recommendation |
| --- | --- | --- | --- |
| `Oracle + Native` | Oracle + AQ | AQ dequeue | Default production profile |
| `Oracle + Redis` | Oracle + AQ | Redis wake, AQ claim | Supported, not preferred |
| `Postgres + Native` | PostgreSQL tables | PostgreSQL native wake | Best relational portability profile |
| `Postgres + Redis` | PostgreSQL tables | Redis wake, PostgreSQL claim | Supported, optional |
| `Mongo + Native` | Mongo collections | Mongo change streams | Fastest measured profile, with operational caveats |
| `Mongo + Redis` | Mongo collections | Redis wake, Mongo claim | Supported, generally not recommended |

How To Read The Performance Data

The six-profile matrix mixes two kinds of timing: real resume timing and timing that reflects the benchmark's drain policy.

Use these rows as primary decision inputs:

Signal to first completion avg
Throughput

Treat these rows as secondary:

Signal to completion avg
Drain-to-idle overhang avg

Reason:

Signal to first completion avg measures actual wake and resume speed
Signal to completion avg also includes empty-queue drain behavior
Drain-to-idle overhang avg shows how much of Signal to completion avg is benchmark overhang rather than real resume work

The current matrix shows that clearly:

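| Metric | Oracle (Native) | PostgreSQL (Native) | Mongo (Native) | Oracle + Redis | PostgreSQL + Redis | Mongo + Redis |
| --- | --- | --- | --- | --- | --- | --- |
| Signal to first completion avg (ms) | 76.15 | 37.56 | 55.06 | 81.46 | 31.77 | 40.88 |
| Throughput (ops/s) | 24.17 | 26.28 | 119.51 | 21.88 | 25.51 | 25.14 |
| Drain-to-idle overhang avg (ms) | 2909.65 | 3047.65 | 57.86 | 3031.66 | 3033.61 | 3036.85 |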

Interpretation:

native Mongo is fast because the native change-stream wake path also has low empty-receive overhang
PostgreSQL native and PostgreSQL plus Redis are close in real resume speed
Oracle native remains slightly better than Oracle plus Redis
Mongo plus Redis loses most of native Mongo's advantage because Redis mode reintroduces the empty-wait overhang
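To make the native-versus-Redis comparison concrete, the following sketch computes the relative change per backend from the two primary rows. The numbers are copied from the matrix above; the helper name `redis_delta_percent` is illustrative only.

```python
# Values copied from the matrix above: (native, with Redis).
FIRST_COMPLETION_MS = {
    "Oracle": (76.15, 81.46),
    "PostgreSQL": (37.56, 31.77),
    "Mongo": (55.06, 40.88),
}
THROUGHPUT_OPS = {
    "Oracle": (24.17, 21.88),
    "PostgreSQL": (26.28, 25.51),
    "Mongo": (119.51, 25.14),
}


def redis_delta_percent(native: float, redis: float) -> float:
    """Percent change when switching from the native driver to Redis."""
    return (redis - native) / native * 100.0


for backend in FIRST_COMPLETION_MS:
    latency = redis_delta_percent(*FIRST_COMPLETION_MS[backend])
    throughput = redis_delta_percent(*THROUGHPUT_OPS[backend])
    print(f"{backend}: first completion {latency:+.1f}%, throughput {throughput:+.1f}%")
```

For Mongo the throughput change is roughly a 79% drop (119.51 to 25.14 ops/s), which is the lost advantage the interpretation refers to. For Oracle both rows move in the unfavorable direction, and for PostgreSQL the throughput change is only a few percent.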

Recommended Default Choices

Default Production Choice Today

Use Oracle + Native.

Use it when:

Oracle is already the platform system of record
strongest validated correctness and restart behavior matter more than portability
AQ is available and operationally acceptable
timer precision and native transactional coupling are important

Why:

it has the strongest hostile-condition coverage
it remains the semantic reference implementation
it keeps one native durable stack for state, signals, and scheduling

Best Relational Non-Oracle Choice

Use Postgres + Native.

Use it when:

a relational backend is required
Oracle is not desired
you want the cleanest portability path
you want performance close to Oracle with simpler infrastructure

Why:

it is the strongest non-Oracle backend in the current relational comparison
native PostgreSQL wake is already competitive with Redis in the current measurements
it keeps one backend-native operational story

Highest Measured Synthetic Throughput Choice

Use Mongo + Native only when its operational assumptions are acceptable.

Use it when:

throughput and low wake latency matter strongly
Mongo replica-set transactions are already an accepted platform dependency
the team is comfortable operating change streams and Mongo-specific failure modes

Why:

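it currently has the highest measured throughput of the six profiles (119.51 ops/s)
its native wake path avoids the roughly 3-second drain-to-idle overhang seen in the other five measured profiles (57.86 ms versus 2909.65 to 3047.65 ms)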

Do not treat this as the universal default.

Mongo is fast under the current engine workload, but its operational model is still less conservative than that of the relational profiles.

When Redis Should Be Used

Select Redis for operational topology reasons, not on the default assumption that it performs better.

Good reasons to use Redis:

one shared wake substrate is required across multiple backend profiles
the deployment already standardizes on Redis for fan-out and worker wake infrastructure
you want the backend-native wake path disabled intentionally and replaced by one uniform wake mechanism

Weak reasons to use Redis:

"Redis is always faster"
"Redis should hold the durable signal queue"
"Redis should replace the backend transaction boundary"

Those are not valid design assumptions for this engine.

Profile-By-Profile Guidance

Oracle + Native

Use when:

Oracle is the chosen workflow backend
AQ is available
you want the strongest native transactional semantics

Do not switch away from it just to standardize on Redis.

Current measured result:

native Oracle is slightly better than Oracle plus Redis on both first-completion latency (76.15 ms versus 81.46 ms) and throughput (24.17 versus 21.88 ops/s)

Oracle + Redis

Use only when:

Oracle remains the durable backend
Redis is required as a uniform wake topology across the environment
the small performance loss is acceptable

Do not use it as the default Oracle profile.

Current measured result:

it works correctly
it is slower than native Oracle
it does not improve timer behavior today

Postgres + Native

Use as the first portability target when leaving Oracle.

Use when:

you want a relational durable store
you want the cleanest alternative to Oracle
you want the simplest operational story for PostgreSQL

This should be the default PostgreSQL profile.

Postgres + Redis

Use when:

PostgreSQL is the durable backend
a shared Redis wake topology is required
a nearly flat performance profile versus native PostgreSQL is acceptable

Do not assume it is a speed upgrade.

Current measured result:

it is very close to native PostgreSQL (first-completion latency 31.77 ms versus 37.56 ms, throughput 25.51 versus 26.28 ops/s)
it is not a compelling performance win on its own

Mongo + Native

Use when:

MongoDB is an accepted transactional system of record for workflow runtime state
replica-set transactions are available
the team accepts Mongo operational ownership

This should be the default Mongo profile.

Mongo + Redis

Avoid as the normal Mongo profile.

Use only when:

Mongo must remain the durable backend
Redis wake standardization is mandatory for the deployment
the team accepts materially worse measured wake behavior than native Mongo

Current measured result:

native Mongo is much better overall
first-completion latency stays acceptable (40.88 ms), but throughput falls from 119.51 to 25.14 ops/s and drain-to-idle overhang rises from 57.86 ms to 3036.85 ms
Redis removes the main measured advantage of the native Mongo wake path

Timer And Delayed-Signal Guidance

Timers remain durable in the selected backend.

That means:

Oracle timers remain durable in AQ
PostgreSQL timers remain durable in PostgreSQL tables
Mongo timers remain durable in Mongo collections

Redis does not become the timer authority.

Current practical rule:

if timer behavior is a primary concern, prefer the native signal driver for the selected backend

Reason:

Redis wake currently optimizes wake notification, not durable due-time ownership
delayed messages still live in the backend store
due-time wake precision in Redis mode is still bounded by the driver wait policy rather than a separate Redis-native timer authority
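The bound on due-time precision can be illustrated with a deliberately simplified model. It assumes, without confirmation from the engine source, that a worker in Redis mode that receives no wake hint re-checks the backend once per `BlockingWaitSeconds`, starting at time zero. The functions `pickup_time` and `lateness` are hypothetical.

```python
import math


def pickup_time(due_at: float, blocking_wait_seconds: float) -> float:
    """First poll at or after due_at, with polls every blocking_wait_seconds from t=0."""
    return math.ceil(due_at / blocking_wait_seconds) * blocking_wait_seconds


def lateness(due_at: float, blocking_wait_seconds: float) -> float:
    """How long after its due time a delayed signal is noticed in this model."""
    return pickup_time(due_at, blocking_wait_seconds) - due_at
```

Under this model a delayed signal is never picked up more than one wait interval late, so a shorter `BlockingWaitSeconds` tightens timer precision at the cost of more frequent empty polls against the backend.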

What Must Not Be Mixed

Do not mix durable responsibilities across systems.

Bad combinations:

Oracle runtime state with PostgreSQL signals
PostgreSQL runtime state with Redis as the durable signal queue
Mongo runtime state with Oracle scheduling
one backend for runtime state and another backend for projections

Use one backend profile per deployment.

The only supported cross-system split is:

durable backend
optional Redis wake driver
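A deployment check for this rule could look like the sketch below. The responsibility names follow the backend ownership list earlier in this document; the function `check_single_backend` and the shape of its input are hypothetical.

```python
# Durable responsibilities owned by the backend, as listed earlier in this document.
DURABLE_RESPONSIBILITIES = (
    "runtime state",
    "projections",
    "durable signals",
    "delayed signals",
    "dead letters",
)
BACKENDS = {"Oracle", "Postgres", "Mongo"}


def check_single_backend(assignment: dict[str, str]) -> str:
    """Return the one backend that owns every durable responsibility, or raise."""
    systems = {assignment[name] for name in DURABLE_RESPONSIBILITIES}
    if len(systems) != 1:
        raise ValueError(f"durable responsibilities split across {sorted(systems)}")
    backend = systems.pop()
    if backend not in BACKENDS:
        # Redis, for example, may only act as the wake driver.
        raise ValueError(f"{backend} cannot own durable responsibilities")
    return backend
```

The wake driver is intentionally absent from the checked responsibilities, since it is the only part that may live in a different system.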

Operational Decision Matrix

| Goal | Recommended profile |
| --- | --- |
| strongest production default today | `Oracle + Native` |
| best non-Oracle relational target | `Postgres + Native` |
| one uniform wake substrate across relational backends | `Postgres + Redis` |
| highest measured synthetic wake and throughput | `Mongo + Native` |
| Mongo with forced Redis standardization | `Mongo + Redis`, only if policy requires it |
| Oracle with forced Redis standardization | `Oracle + Redis`, only if policy requires it |

Configuration Surface

Oracle + Native

{
  "WorkflowBackend": {
    "Provider": "Oracle"
  },
  "WorkflowSignalDriver": {
    "Provider": "Native"
  },
  "WorkflowAq": {
    "QueueOwner": "SRD_WFKLW",
    "SignalQueueName": "WF_SIGNAL_Q",
    "ScheduleQueueName": "WF_SCHEDULE_Q",
    "DeadLetterQueueName": "WF_DLQ_Q"
  }
}

Oracle + Redis

{
  "WorkflowBackend": {
    "Provider": "Oracle"
  },
  "WorkflowSignalDriver": {
    "Provider": "Redis",
    "Redis": {
      "ChannelName": "serdica:workflow:signals",
      "BlockingWaitSeconds": 5
    }
  },
  "WorkflowAq": {
    "QueueOwner": "SRD_WFKLW",
    "SignalQueueName": "WF_SIGNAL_Q",
    "ScheduleQueueName": "WF_SCHEDULE_Q",
    "DeadLetterQueueName": "WF_DLQ_Q"
  }
}

Postgres + Native

{
  "WorkflowBackend": {
    "Provider": "Postgres",
    "Postgres": {
      "ConnectionStringName": "WorkflowPostgres",
      "SchemaName": "srd_wfklw",
      "ClaimBatchSize": 32,
      "BlockingWaitSeconds": 30
    }
  },
  "WorkflowSignalDriver": {
    "Provider": "Native"
  }
}

Postgres + Redis

{
  "WorkflowBackend": {
    "Provider": "Postgres",
    "Postgres": {
      "ConnectionStringName": "WorkflowPostgres",
      "SchemaName": "srd_wfklw"
    }
  },
  "WorkflowSignalDriver": {
    "Provider": "Redis",
    "Redis": {
      "ChannelName": "serdica:workflow:signals",
      "BlockingWaitSeconds": 5
    }
  }
}

Mongo + Native

{
  "WorkflowBackend": {
    "Provider": "Mongo",
    "Mongo": {
      "ConnectionStringName": "WorkflowMongo",
      "DatabaseName": "serdica_workflow_store",
      "BlockingWaitSeconds": 30
    }
  },
  "WorkflowSignalDriver": {
    "Provider": "Native"
  }
}

Mongo + Redis

{
  "WorkflowBackend": {
    "Provider": "Mongo",
    "Mongo": {
      "ConnectionStringName": "WorkflowMongo",
      "DatabaseName": "serdica_workflow_store"
    }
  },
  "WorkflowSignalDriver": {
    "Provider": "Redis",
    "Redis": {
      "ChannelName": "serdica:workflow:signals",
      "BlockingWaitSeconds": 5
    }
  }
}

Plugin Registration Rule

The host stays backend-neutral.

That means the selected backend and optional Redis wake plugin must be present in PluginsConfig:PluginsOrder.

Relevant plugin categories are:

Oracle backend plugin
PostgreSQL backend plugin
MongoDB backend plugin
Redis wake-driver plugin

If Redis is not the configured signal driver, do not register the Redis wake-driver plugin just because it is available.
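The registration rule can be sketched as a function from the selected profile to the plugin categories that need to be present. The plugin identifiers below are placeholders: the actual names used in `PluginsConfig:PluginsOrder` are not given in this document.

```python
# Placeholder identifiers: the real plugin names are not stated in this document.
BACKEND_PLUGINS = {
    "Oracle": "oracle-backend",
    "Postgres": "postgresql-backend",
    "Mongo": "mongodb-backend",
}
REDIS_WAKE_PLUGIN = "redis-wake-driver"


def required_plugins(backend: str, signal_driver: str) -> list[str]:
    """Plugins needed for one profile: exactly one backend, Redis only on request."""
    plugins = [BACKEND_PLUGINS[backend]]
    if signal_driver == "Redis":
        plugins.append(REDIS_WAKE_PLUGIN)
    return plugins
```

With the native driver the list contains only the backend plugin, which matches the rule that Redis is registered only when it is actually selected.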

Recommended Decision Order

When choosing a deployment profile, use this order:

1. choose the durable backend based on correctness and platform ownership
2. choose the native signal driver first
3. add Redis only if there is a clear topology or operational reason
4. validate the choice against the six-profile matrix, not against assumptions
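The order above can be expressed as a small helper, shown here only as a sketch. The function `choose_profile` and its `redis_reason` parameter are hypothetical; the final validation step against the measured matrix remains a manual review and is not automated here.

```python
SUPPORTED_BACKENDS = ("Oracle", "Postgres", "Mongo")


def choose_profile(backend: str, redis_reason: str = "") -> str:
    """Follow the decision order: backend first, Native by default, Redis only with a reason."""
    # Step 1: the caller picks the backend on correctness and platform ownership.
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported backend: {backend!r}")
    # Step 2: start from the native signal driver.
    driver = "Native"
    # Step 3: switch to Redis only when an explicit topology or operational reason is given.
    if redis_reason.strip():
        driver = "Redis"
    # Step 4 (checking the result against the six-profile matrix) is left to the reader.
    return f"{backend} + {driver}"
```

For example, `choose_profile("Postgres")` yields `Postgres + Native`, while passing a non-empty `redis_reason` such as a shared wake topology yields the Redis variant of the same backend.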

Current Bottom Line

Today the practical recommendation is:

Oracle + Native for the strongest default production backend
Postgres + Native for the best relational portability target
Mongo + Native only when Mongo operational assumptions are explicitly accepted
Redis as an optional wake standardization layer, not as the default performance answer
