PostgreSQL Performance Baseline 2026-03-17

Purpose

This document captures the current PostgreSQL-backed load and performance baseline for the Serdica workflow engine. It is the reference point for later MongoDB backend comparisons and the final three-backend decision pack.

The durable machine-readable companion is 11-postgres-performance-baseline-2026-03-17.json.

Run Metadata

  • Date: 2026-03-17
  • Test command:
    • integration performance suite filtered to PostgresPerformance
  • Suite result:
    • 11/11 tests passed
    • total wall-clock time: 2 m 16 s
  • Raw artifact directory:
    • TestResults/workflow-performance/
  • PostgreSQL environment:
    • Docker image: postgres:16-alpine
    • database: workflow
    • version: PostgreSQL 16.13
    • backend: durable queue tables plus LISTEN/NOTIFY wake hints
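The per-scenario backend counters reported below (xact_commit, blks_hit, tup_inserted, and so on) are standard cumulative columns of PostgreSQL's pg_stat_database view, so a scenario's cost is the delta between a snapshot taken before the run and one taken after it. A minimal sketch of that delta step, assuming snapshots land in plain dicts (the query string and helper below are illustrative, not the suite's actual harness):

```python
# Illustrative only: not the suite's real collection harness.
# pg_stat_database counters are cumulative, so a scenario's cost is the
# field-by-field difference between a "before" and an "after" snapshot.

SNAPSHOT_SQL = """
SELECT xact_commit, xact_rollback, blks_hit, blks_read,
       tup_inserted, tup_updated, tup_deleted
FROM pg_stat_database
WHERE datname = 'workflow';
"""

def stat_delta(before: dict, after: dict) -> dict:
    """Subtract two cumulative pg_stat_database snapshots field by field."""
    return {key: after[key] - before[key] for key in before}

# Made-up snapshot values, chosen so the delta reproduces the c1 ladder row:
before = {"xact_commit": 1000, "blks_hit": 5000, "tup_inserted": 200}
after = {"xact_commit": 1251, "blks_hit": 6654, "tup_inserted": 248}
print(stat_delta(before, after))
# {'xact_commit': 251, 'blks_hit': 1654, 'tup_inserted': 48}
```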

Scenario Summary

| Scenario | Tier | Ops | Conc | Duration ms | Throughput/s | Avg ms | P95 ms | Max ms |
|---|---|---|---|---|---|---|---|---|
| postgres-signal-roundtrip-capacity-c1 | WorkflowPerfCapacity | 16 | 1 | 3895.54 | 4.11 | 3738.08 | 3762.51 | 3771.10 |
| postgres-signal-roundtrip-capacity-c4 | WorkflowPerfCapacity | 64 | 4 | 3700.99 | 17.29 | 3577.49 | 3583.70 | 3584.43 |
| postgres-signal-roundtrip-capacity-c8 | WorkflowPerfCapacity | 128 | 8 | 3853.89 | 33.21 | 3713.31 | 3718.66 | 3719.34 |
| postgres-signal-roundtrip-capacity-c16 | WorkflowPerfCapacity | 256 | 16 | 4488.07 | 57.04 | 4251.48 | 4287.87 | 4294.09 |
| postgres-signal-roundtrip-latency-serial | WorkflowPerfLatency | 16 | 1 | 49290.47 | 0.32 | 3079.33 | 3094.94 | 3101.71 |
| postgres-bulstrad-quotation-confirm-convert-to-policy-nightly | WorkflowPerfNightly | 12 | 4 | 3598.64 | 3.33 | 3478.52 | 3500.76 | 3503.73 |
| postgres-delayed-burst-nightly | WorkflowPerfNightly | 48 | 1 | 2449.25 | 19.60 | 2096.34 | 2152.50 | 2157.39 |
| postgres-immediate-burst-nightly | WorkflowPerfNightly | 120 | 1 | 1711.87 | 70.10 | 849.78 | 1012.13 | 1030.98 |
| postgres-synthetic-external-resume-nightly | WorkflowPerfNightly | 36 | 8 | 4162.56 | 8.65 | 4026.50 | 4048.09 | 4049.91 |
| postgres-bulstrad-quote-or-apl-cancel-smoke | WorkflowPerfSmoke | 10 | 4 | 166.99 | 59.88 | 13.51 | 23.87 | 26.35 |
| postgres-delayed-burst-smoke | WorkflowPerfSmoke | 12 | 1 | 2146.89 | 5.59 | 2032.67 | 2050.20 | 2051.30 |
| postgres-immediate-burst-smoke | WorkflowPerfSmoke | 24 | 1 | 341.84 | 70.21 | 176.19 | 197.25 | 197.91 |
| postgres-signal-roundtrip-soak | WorkflowPerfSoak | 108 | 8 | 25121.68 | 4.30 | 4164.52 | 4208.42 | 4209.96 |
| postgres-signal-roundtrip-throughput-parallel | WorkflowPerfThroughput | 96 | 16 | 3729.17 | 25.74 | 3603.54 | 3635.59 | 3649.96 |

Measurement Split

The synthetic signal round-trip workload is measured in three separate ways:

  • postgres-signal-roundtrip-latency-serial: one workflow at a time, one signal worker, used as the single-instance latency baseline.
  • postgres-signal-roundtrip-throughput-parallel: 96 workflows, 16-way workload concurrency, 8 signal workers, used as the steady-state throughput baseline.
  • postgres-signal-roundtrip-capacity-c*: batch-wave capacity ladder used to observe scaling and pressure points.

The useful PostgreSQL baseline is:

  • serial latency baseline: 3079.33 ms average end-to-end per workflow
  • steady throughput baseline: 25.74 ops/s with 16 workload concurrency and 8 signal workers
  • capacity c1: 4.11 ops/s; this is only the smallest batch-wave rung
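These headline numbers are internally consistent with the scenario summary table: the reported throughput is simply operation count divided by wall-clock duration. A quick check against the serial and parallel rows:

```python
# Sanity-check: throughput/s == ops / (duration_ms / 1000),
# using rows from the scenario summary table above.
def throughput(ops: int, duration_ms: float) -> float:
    return ops / (duration_ms / 1000.0)

# postgres-signal-roundtrip-latency-serial: 16 ops in 49290.47 ms
serial = throughput(16, 49290.47)      # ~0.32 ops/s

# postgres-signal-roundtrip-throughput-parallel: 96 ops in 3729.17 ms
parallel = throughput(96, 3729.17)     # ~25.74 ops/s

print(round(serial, 2), round(parallel, 2))
# 0.32 25.74
```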

Serial Latency Baseline

| Phase | Avg ms | P95 ms | Max ms |
|---|---|---|---|
| start | 6.12 | 9.29 | 11.26 |
| signalPublish | 5.63 | 6.82 | 7.53 |
| signalToCompletion | 3073.20 | 3086.59 | 3090.44 |

Interpretation:

  • almost all serial latency (roughly 99.8% of the 3079.33 ms end-to-end average) is in signalToCompletion
  • workflow start is very cheap on this backend
  • external signal publication is also cheap

Steady Throughput Baseline

| Phase | Avg ms | P95 ms | Max ms |
|---|---|---|---|
| start | 16.21 | 40.31 | 47.02 |
| signalPublish | 18.11 | 23.62 | 28.41 |
| signalToCompletion | 3504.24 | 3530.38 | 3531.14 |

Interpretation:

  • the engine sustained 25.74 ops/s in a 96-operation wave
  • end-to-end average stayed at 3603.54 ms
  • start and signal publication remained small compared to the resume path

PostgreSQL Observations

Dominant Waits

  • Client:ClientRead was the top observed wait class in 13/14 scenario artifacts.
  • The serial latency scenario had no distinct competing wait class because the measurement ran with effectively no backend concurrency.
  • On this local PostgreSQL profile the wake-up path is not the visible bottleneck; the dominant observed state is clients waiting on the next command while the engine completes work in short transactions.
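Wait labels such as Client:ClientRead combine the wait_event_type and wait_event columns of PostgreSQL's pg_stat_activity view. A small sketch of how periodic samples of that view could be folded into a ranked wait summary (the query string and fold are illustrative; this is not the suite's actual sampler):

```python
from collections import Counter

# Illustrative only: labels like "Client:ClientRead" come from joining
# wait_event_type and wait_event in sampled pg_stat_activity rows.
SAMPLE_SQL = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE datname = 'workflow' AND wait_event IS NOT NULL;
"""

def top_waits(samples: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Fold sampled (type, event) pairs into ranked 'Type:Event' labels."""
    return Counter(f"{t}:{e}" for t, e in samples).most_common()

# Hypothetical sample set dominated by clients idle between commands:
samples = [("Client", "ClientRead")] * 5 + [("LWLock", "WALWrite")]
print(top_waits(samples))
# [('Client:ClientRead', 5), ('LWLock:WALWrite', 1)]
```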

Capacity Ladder

| Scenario | Throughput/s | P95 ms | Xact Commits | Buffer Hits | Buffer Reads | Tuples Inserted | Tuples Updated | Tuples Deleted | Top Wait |
|---|---|---|---|---|---|---|---|---|---|
| c1 | 4.11 | 3762.51 | 251 | 1654 | 24 | 48 | 48 | 16 | Client:ClientRead |
| c4 | 17.29 | 3583.70 | 1080 | 7084 | 1 | 192 | 192 | 64 | Client:ClientRead |
| c8 | 33.21 | 3718.66 | 2348 | 17069 | 0 | 384 | 384 | 128 | Client:ClientRead |
| c16 | 57.04 | 4287.87 | 4536 | 40443 | 0 | 768 | 768 | 256 | Client:ClientRead |

Interpretation:

  • the capacity ladder scales more smoothly than the Oracle baseline on the same local machine
  • c16 is the highest-throughput rung tested and does not yet show a hard cliff, although its P95 latency is the highest on the ladder
  • the next meaningful PostgreSQL characterization step should test above c16 before declaring a saturation boundary
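One useful way to read the ladder is to normalize the backend counters per operation. Row movement is exactly linear (3 inserts, 3 updates, and 1 delete per operation at every rung), while commits per operation stay in a narrow band, which is consistent with the smooth scaling described above. A small check against the ladder table:

```python
# Normalize the capacity-ladder counters per operation to check scaling shape.
# Values are copied from the capacity ladder table.
rungs = {
    "c1":  {"ops": 16,  "commits": 251,  "ins": 48,  "upd": 48,  "dels": 16},
    "c4":  {"ops": 64,  "commits": 1080, "ins": 192, "upd": 192, "dels": 64},
    "c8":  {"ops": 128, "commits": 2348, "ins": 384, "upd": 384, "dels": 128},
    "c16": {"ops": 256, "commits": 4536, "ins": 768, "upd": 768, "dels": 256},
}

for name, r in rungs.items():
    per_op = {k: r[k] / r["ops"] for k in ("commits", "ins", "upd", "dels")}
    print(name, per_op)
# Row movement is constant (3 inserts, 3 updates, 1 delete per op);
# commits per op stay roughly between 15.7 and 18.4 across the ladder.
```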

Transport Baselines

| Scenario | Throughput/s | Xact Commits | Buffer Hits | Buffer Reads | Tuples Inserted | Tuples Updated | Top Wait |
|---|---|---|---|---|---|---|---|
| postgres-immediate-burst-nightly | 70.10 | 801 | 13207 | 4 | 570 | 162 | Client:ClientRead |
| postgres-delayed-burst-nightly | 19.60 | 269 | 11472 | 3 | 498 | 33 | Client:ClientRead |

Interpretation:

  • immediate transport remains much cheaper than full workflow resume
  • delayed transport is still dominated by the intentional delay window, not by raw dequeue speed
  • the very short smoke transport runs are useful for end-to-end timing, but they are too brief to rely on as the primary PostgreSQL stat sample

Business Flow Baselines

| Scenario | Throughput/s | Avg ms | Xact Commits | Buffer Hits | Buffer Reads | Tuples Inserted | Tuples Updated | Top Wait |
|---|---|---|---|---|---|---|---|---|
| postgres-bulstrad-quote-or-apl-cancel-smoke | 59.88 | 13.51 | 3 | 93 | 0 | 0 | 0 | Client:ClientRead |
| postgres-bulstrad-quotation-confirm-convert-to-policy-nightly | 3.33 | 3478.52 | 236 | 12028 | 270 | 546 | 75 | Client:ClientRead |

Interpretation:

  • the short Bulstrad flow is still mostly transport and orchestration overhead
  • the heavier QuotationConfirm -> ConvertToPolicy flow is a better real-workload pressure baseline because it exercises deeper projection and signal traffic

Soak Baseline

postgres-signal-roundtrip-soak completed 108 operations at concurrency 8 with:

  • throughput: 4.30 ops/s
  • average latency: 4164.52 ms
  • P95 latency: 4208.42 ms
  • 0 failures
  • 0 dead-lettered signals
  • 0 runtime conflicts
  • 0 stuck instances

PostgreSQL metrics for the soak run:

  • xact_commit: 3313
  • xact_rollback: 352
  • blks_hit: 26548
  • blks_read: 269
  • tup_inserted: 774
  • tup_updated: 339
  • tup_deleted: 108
  • top wait: Client:ClientRead
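A derived number worth carrying into the backend comparisons is the buffer cache hit ratio. Computed from the soak run's blks_hit and blks_read above:

```python
# Buffer cache hit ratio for the soak run: blks_hit / (blks_hit + blks_read),
# using the pg_stat_database counters reported above.
blks_hit, blks_read = 26548, 269
hit_ratio = blks_hit / (blks_hit + blks_read)
print(f"{hit_ratio:.1%}")
# 99.0% -- the soak working set essentially fits in shared buffers
```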

What Must Stay Constant For Future Backend Comparisons

When MongoDB is benchmarked and the final Oracle/PostgreSQL/MongoDB comparison is produced, keep these constant:

  • same scenario names
  • same operation counts
  • same concurrency levels
  • same worker counts for signal drain
  • same synthetic workflow definitions
  • same Bulstrad workflow families
  • same correctness assertions

Compare these dimensions directly:

  • throughput per second
  • latency average, P95, P99, and max
  • phase latency summaries for start, signal publish, and signal-to-completion on the synthetic signal round-trip workload
  • failures, dead letters, runtime conflicts, and stuck instances
  • commit count analogs
  • row, tuple, or document movement analogs
  • read-hit or read-amplification analogs
  • dominant waits, locks, or wake-path contention classes
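The three-backend pack is easiest to assemble if every backend emits these dimensions under the same keys. A minimal sketch of one comparison row with hypothetical field names (the real artifact schema lives in the JSON companion, not here):

```python
# Hypothetical comparison-row shape for the Oracle/PostgreSQL/MongoDB pack;
# the field names are illustrative, not the JSON companion's schema.
def comparison_row(backend: str, scenario: str, metrics: dict) -> dict:
    """Validate that a backend reports every required dimension, then tag it."""
    required = {"throughput_per_s", "avg_ms", "p95_ms", "max_ms", "failures"}
    missing = required - metrics.keys()
    if missing:
        raise ValueError(f"{backend}/{scenario} missing dimensions: {missing}")
    return {"backend": backend, "scenario": scenario, **metrics}

# Example using the PostgreSQL steady-throughput baseline from this document:
row = comparison_row(
    "postgresql",
    "signal-roundtrip-throughput-parallel",
    {"throughput_per_s": 25.74, "avg_ms": 3603.54,
     "p95_ms": 3635.59, "max_ms": 3649.96, "failures": 0},
)
print(row["backend"], row["throughput_per_s"])
# postgresql 25.74
```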

First Sizing Note

On this local PostgreSQL baseline:

  • immediate queue burst handling is comfortably above the small workflow tiers; the current nightly transport baseline is 70.10 ops/s
  • the separated steady throughput baseline is 25.74 ops/s, ahead of the current Oracle baseline on the same synthetic workflow profile
  • the ladder through c16 still looks healthy and does not yet expose a sharp pressure rung
  • the dominant observed backend state is client read waiting, which suggests the next tuning conversation should focus on queue claim cadence, notification wake-ups, and transaction shape rather than on an obvious storage stall

This is a baseline, not a production commitment. MongoDB should now reuse the same scenarios and produce the same summary tables before any backend recommendation is declared.