# PostgreSQL Performance Baseline 2026-03-17

## Purpose

This document captures the current PostgreSQL-backed load and performance baseline for the Serdica workflow engine. It is the reference point for later MongoDB backend comparisons and the final three-backend decision pack.

The durable machine-readable companion is [11-postgres-performance-baseline-2026-03-17.json](11-postgres-performance-baseline-2026-03-17.json).

## Run Metadata

- Date: `2026-03-17`
- Test command:
  - integration performance suite filtered to `PostgresPerformance`
- Suite result:
  - `11/11` tests passed
  - total wall-clock time: `2 m 16 s`
- Raw artifact directory:
  - `TestResults/workflow-performance/`
- PostgreSQL environment:
  - Docker image: `postgres:16-alpine`
  - database: `workflow`
  - version: `PostgreSQL 16.13`
  - backend: durable queue tables plus `LISTEN/NOTIFY` wake hints

## Scenario Summary

| Scenario | Tier | Ops | Conc | Duration ms | Throughput/s | Avg ms | P95 ms | Max ms |
| --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| `postgres-signal-roundtrip-capacity-c1` | `WorkflowPerfCapacity` | 16 | 1 | 3895.54 | 4.11 | 3738.08 | 3762.51 | 3771.10 |
| `postgres-signal-roundtrip-capacity-c4` | `WorkflowPerfCapacity` | 64 | 4 | 3700.99 | 17.29 | 3577.49 | 3583.70 | 3584.43 |
| `postgres-signal-roundtrip-capacity-c8` | `WorkflowPerfCapacity` | 128 | 8 | 3853.89 | 33.21 | 3713.31 | 3718.66 | 3719.34 |
| `postgres-signal-roundtrip-capacity-c16` | `WorkflowPerfCapacity` | 256 | 16 | 4488.07 | 57.04 | 4251.48 | 4287.87 | 4294.09 |
| `postgres-signal-roundtrip-latency-serial` | `WorkflowPerfLatency` | 16 | 1 | 49290.47 | 0.32 | 3079.33 | 3094.94 | 3101.71 |
| `postgres-bulstrad-quotation-confirm-convert-to-policy-nightly` | `WorkflowPerfNightly` | 12 | 4 | 3598.64 | 3.33 | 3478.52 | 3500.76 | 3503.73 |
| `postgres-delayed-burst-nightly` | `WorkflowPerfNightly` | 48 | 1 | 2449.25 | 19.60 | 2096.34 | 2152.50 | 2157.39 |
| `postgres-immediate-burst-nightly` | `WorkflowPerfNightly` | 120 | 1 | 1711.87 | 70.10 | 849.78 | 1012.13 | 1030.98 |
| `postgres-synthetic-external-resume-nightly` | `WorkflowPerfNightly` | 36 | 8 | 4162.56 | 8.65 | 4026.50 | 4048.09 | 4049.91 |
| `postgres-bulstrad-quote-or-apl-cancel-smoke` | `WorkflowPerfSmoke` | 10 | 4 | 166.99 | 59.88 | 13.51 | 23.87 | 26.35 |
| `postgres-delayed-burst-smoke` | `WorkflowPerfSmoke` | 12 | 1 | 2146.89 | 5.59 | 2032.67 | 2050.20 | 2051.30 |
| `postgres-immediate-burst-smoke` | `WorkflowPerfSmoke` | 24 | 1 | 341.84 | 70.21 | 176.19 | 197.25 | 197.91 |
| `postgres-signal-roundtrip-soak` | `WorkflowPerfSoak` | 108 | 8 | 25121.68 | 4.30 | 4164.52 | 4208.42 | 4209.96 |
| `postgres-signal-roundtrip-throughput-parallel` | `WorkflowPerfThroughput` | 96 | 16 | 3729.17 | 25.74 | 3603.54 | 3635.59 | 3649.96 |

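The `Throughput/s` column is derived directly from `Ops` and `Duration ms`; a small sketch reproducing two of the rows from the table:

```python
def throughput_ops_per_s(ops: int, duration_ms: float) -> float:
    """Derive the Throughput/s column from Ops and Duration ms."""
    return ops / (duration_ms / 1000.0)

# Reproduce two rows of the scenario summary:
serial = throughput_ops_per_s(16, 49290.47)    # latency-serial row
parallel = throughput_ops_per_s(96, 3729.17)   # throughput-parallel row

print(round(serial, 2), round(parallel, 2))    # 0.32 25.74
```

This is why the serial scenario's `0.32 ops/s` is not a meaningful throughput figure: its long wall-clock duration is the deliberate one-at-a-time pacing, not backend saturation.
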
## Measurement Split

The synthetic signal round-trip workload is measured in three separate ways:

- `postgres-signal-roundtrip-latency-serial`: one workflow at a time, one signal worker, used as the single-instance latency baseline.
- `postgres-signal-roundtrip-throughput-parallel`: `96` workflows, `16`-way workload concurrency, `8` signal workers, used as the steady-state throughput baseline.
- `postgres-signal-roundtrip-capacity-c*`: a batch-wave capacity ladder used to observe scaling and pressure points.

The useful PostgreSQL baselines are:

- serial latency baseline: `3079.33 ms` average end-to-end per workflow
- steady throughput baseline: `25.74 ops/s` with `16` workload concurrency and `8` signal workers
- capacity `c1`: `4.11 ops/s`; this is only the smallest batch-wave rung

### Serial Latency Baseline

| Phase | Avg ms | P95 ms | Max ms |
| --- | ---: | ---: | ---: |
| `start` | 6.12 | 9.29 | 11.26 |
| `signalPublish` | 5.63 | 6.82 | 7.53 |
| `signalToCompletion` | 3073.20 | 3086.59 | 3090.44 |

Interpretation:

- almost all serial latency is in `signalToCompletion`
- workflow start is very cheap on this backend
- external signal publication is also cheap

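The "almost all" claim can be quantified from the phase averages above. Note that the phases are separately measured windows, so their sum only approximates the reported `3079.33 ms` end-to-end average:

```python
# Serial-run phase averages from the table above (ms).
phases = {"start": 6.12, "signalPublish": 5.63, "signalToCompletion": 3073.20}

total = sum(phases.values())  # 3084.95 ms, close to the 3079.33 ms end-to-end average
share = phases["signalToCompletion"] / total

print(f"signalToCompletion share of serial latency: {share:.1%}")
```
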
### Steady Throughput Baseline

| Phase | Avg ms | P95 ms | Max ms |
| --- | ---: | ---: | ---: |
| `start` | 16.21 | 40.31 | 47.02 |
| `signalPublish` | 18.11 | 23.62 | 28.41 |
| `signalToCompletion` | 3504.24 | 3530.38 | 3531.14 |

Interpretation:

- the engine sustained `25.74 ops/s` in a `96`-operation wave
- the end-to-end average stayed at `3603.54 ms`
- start and signal publication remained small compared to the resume path

## PostgreSQL Observations
### Dominant Waits

- `Client:ClientRead` was the top observed wait class in `13/14` scenario artifacts.
- The serial latency scenario had no distinct competing wait class because the measurement ran with effectively no backend concurrency.
- On this local PostgreSQL profile the wake-up path is not the visible bottleneck; the dominant observed state is clients waiting on the next command while the engine completes work in short transactions.

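A "top wait" is presumably a tally over sampled PostgreSQL wait events. The sketch below shows that aggregation shape; the sampling mechanism and the `(wait_event_type, wait_event)` pair format are assumptions about how the scenario artifacts were produced, not something this document confirms:

```python
from collections import Counter

def top_wait(samples: list[tuple[str, str]]) -> str:
    """Collapse sampled (wait_event_type, wait_event) pairs into the
    dominant 'Type:Event' class, as reported in the tables below."""
    counts = Counter(f"{etype}:{event}" for etype, event in samples)
    wait_class, _count = counts.most_common(1)[0]
    return wait_class

# Hypothetical sample stream dominated by idle clients between commands:
samples = [("Client", "ClientRead")] * 9 + [("LWLock", "WALWrite")] * 2
print(top_wait(samples))  # Client:ClientRead
```
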
### Capacity Ladder

| Scenario | Throughput/s | P95 ms | Xact Commits | Buffer Hits | Buffer Reads | Tuples Inserted | Tuples Updated | Tuples Deleted | Top Wait |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| `c1` | 4.11 | 3762.51 | 251 | 1654 | 24 | 48 | 48 | 16 | `Client:ClientRead` |
| `c4` | 17.29 | 3583.70 | 1080 | 7084 | 1 | 192 | 192 | 64 | `Client:ClientRead` |
| `c8` | 33.21 | 3718.66 | 2348 | 17069 | 0 | 384 | 384 | 128 | `Client:ClientRead` |
| `c16` | 57.04 | 4287.87 | 4536 | 40443 | 0 | 768 | 768 | 256 | `Client:ClientRead` |

Interpretation:

- the capacity ladder scales more smoothly than the Oracle baseline on the same local machine
- `c16` is the fastest tested rung and does not yet show a hard cliff
- the next meaningful PostgreSQL characterization step should test above `c16` before declaring a saturation boundary

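One way to read the ladder is as parallel speedup against the `c1` rung; a quick computation over the table values:

```python
# Throughput per capacity rung, from the ladder table above.
ladder = {1: 4.11, 4: 17.29, 8: 33.21, 16: 57.04}

base = ladder[1]
speedup = {conc: tput / base for conc, tput in ladder.items()}

for conc, s in speedup.items():
    # Efficiency 1.0 would be perfectly linear scaling against c1.
    print(f"c{conc}: speedup {s:.2f}x, efficiency {s / conc:.2f}")
```

Efficiency staying near 1.0 through `c8` and only dipping at `c16` (roughly 0.87) is consistent with the "no hard cliff yet" reading above.
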
### Transport Baselines

| Scenario | Throughput/s | Xact Commits | Buffer Hits | Buffer Reads | Tuples Inserted | Tuples Updated | Top Wait |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| `postgres-immediate-burst-nightly` | 70.10 | 801 | 13207 | 4 | 570 | 162 | `Client:ClientRead` |
| `postgres-delayed-burst-nightly` | 19.60 | 269 | 11472 | 3 | 498 | 33 | `Client:ClientRead` |

Interpretation:

- immediate transport remains much cheaper than full workflow resume
- delayed transport is still dominated by the intentional delay window, not by raw dequeue speed
- the very short smoke transport runs are useful for end-to-end timing, but they are too brief to rely on as the primary PostgreSQL stat sample

### Business Flow Baselines

| Scenario | Throughput/s | Avg ms | Xact Commits | Buffer Hits | Buffer Reads | Tuples Inserted | Tuples Updated | Top Wait |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| `postgres-bulstrad-quote-or-apl-cancel-smoke` | 59.88 | 13.51 | 3 | 93 | 0 | 0 | 0 | `Client:ClientRead` |
| `postgres-bulstrad-quotation-confirm-convert-to-policy-nightly` | 3.33 | 3478.52 | 236 | 12028 | 270 | 546 | 75 | `Client:ClientRead` |

Interpretation:

- the short Bulstrad flow is still mostly transport and orchestration overhead
- the heavier `QuotationConfirm -> ConvertToPolicy` flow is a better real-workload pressure baseline because it exercises deeper projection and signal traffic

### Soak Baseline

`postgres-signal-roundtrip-soak` completed `108` operations at concurrency `8` with:

- throughput: `4.30 ops/s`
- average latency: `4164.52 ms`
- P95 latency: `4208.42 ms`
- `0` failures
- `0` dead-lettered signals
- `0` runtime conflicts
- `0` stuck instances

PostgreSQL metrics for the soak run:

PostgreSQL metrics for the soak run:

- `xact_commit`: `3313`
- `xact_rollback`: `352`
- `blks_hit`: `26548`
- `blks_read`: `269`
- `tup_inserted`: `774`
- `tup_updated`: `339`
- `tup_deleted`: `108`
- top wait:
  - `Client:ClientRead`

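The `blks_hit`/`blks_read` split above is PostgreSQL's standard buffer-cache hit ratio input; computing it for the soak run:

```python
def cache_hit_ratio(blks_hit: int, blks_read: int) -> float:
    """Fraction of block requests served from shared buffers
    rather than read from disk (or the OS cache)."""
    return blks_hit / (blks_hit + blks_read)

# Soak-run values from the list above.
ratio = cache_hit_ratio(26548, 269)
print(f"{ratio:.1%}")  # roughly 99% of block requests hit shared buffers
```

A ratio this high is consistent with the earlier observation that storage is not the visible stall on this local profile.
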
## What Must Stay Constant For Future Backend Comparisons

When MongoDB is benchmarked and the final Oracle/PostgreSQL/MongoDB comparison is produced, keep these constant:

- same scenario names
- same operation counts
- same concurrency levels
- same worker counts for signal drain
- same synthetic workflow definitions
- same Bulstrad workflow families
- same correctness assertions

Compare these dimensions directly:

Compare these dimensions directly:

- throughput per second
- latency average, P95, P99, and max
- phase latency summaries for start, signal publish, and signal-to-completion on the synthetic signal round-trip workload
- failures, dead letters, runtime conflicts, and stuck instances
- commit count analogs
- row, tuple, or document movement analogs
- read-hit or read-amplification analogs
- dominant waits, locks, or wake-path contention classes

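The "keep these constant" rules can be enforced mechanically before any cross-backend numbers are compared. The record shape below is hypothetical (field names mirror the scenario table, not an actual schema from the JSON companion):

```python
# Hypothetical per-scenario summary records for two backends.
postgres = {"scenario": "signal-roundtrip-throughput-parallel",
            "ops": 96, "conc": 16, "throughput": 25.74}
mongo = {"scenario": "signal-roundtrip-throughput-parallel",
         "ops": 96, "conc": 16, "throughput": None}  # not yet measured

def comparable(a: dict, b: dict) -> bool:
    """Backends may only be compared when the workload shape is identical."""
    return all(a[k] == b[k] for k in ("scenario", "ops", "conc"))

assert comparable(postgres, mongo)
```
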
## First Sizing Note

On this local PostgreSQL baseline:

- immediate queue burst handling is comfortably above the small workflow tiers; the current nightly transport baseline is `70.10 ops/s`
- the separated steady throughput baseline is `25.74 ops/s`, ahead of the current Oracle baseline on the same synthetic workflow profile
- the ladder through `c16` still looks healthy and does not yet expose a sharp pressure rung
- the dominant observed backend state is client read waiting, which suggests the next tuning conversation should focus on queue claim cadence, notification wake-ups, and transaction shape rather than on an obvious storage stall

This is a baseline, not a production commitment. MongoDB should now reuse the same scenarios and produce the same summary tables before any backend recommendation is declared.