Oracle Performance Baseline 2026-03-17

Purpose

This document captures the current Oracle-backed load and performance baseline for the Serdica workflow engine. It is the reference point for later PostgreSQL and MongoDB backend comparisons.

The durable machine-readable companion is 10-oracle-performance-baseline-2026-03-17.json.

Run Metadata

  • Date: 2026-03-17
  • Test command:
    • integration performance suite filtered to OracleAqPerformance
  • Suite result:
    • 12/12 tests passed
    • total wall-clock time: 2 m 40 s
  • Raw artifact directory:
    • TestResults/workflow-performance/
  • Oracle environment:
    • Docker image: gvenzl/oracle-free:23-slim
    • instance: FREE
    • version: 23.0.0.0.0
    • AQ backend: Oracle AQ with pooled connections and retry-hardened setup

Scenario Summary

| Scenario | Tier | Ops | Conc | Duration (ms) | Throughput (ops/s) | Avg (ms) | P95 (ms) | Max (ms) |
|---|---|---|---|---|---|---|---|---|
| oracle-aq-signal-roundtrip-capacity-c1 | WorkflowPerfCapacity | 16 | 1 | 4752.27 | 3.37 | 4257.28 | 4336.38 | 4359.67 |
| oracle-aq-signal-roundtrip-capacity-c4 | WorkflowPerfCapacity | 64 | 4 | 4205.24 | 15.22 | 3926.42 | 3988.33 | 3994.74 |
| oracle-aq-signal-roundtrip-capacity-c8 | WorkflowPerfCapacity | 128 | 8 | 5998.88 | 21.34 | 5226.56 | 5561.22 | 5605.59 |
| oracle-aq-signal-roundtrip-capacity-c16 | WorkflowPerfCapacity | 256 | 16 | 7523.47 | 34.03 | 6551.81 | 6710.05 | 6721.81 |
| oracle-aq-signal-roundtrip-latency-serial | WorkflowPerfLatency | 16 | 1 | 49755.52 | 0.32 | 3104.85 | 3165.04 | 3232.40 |
| oracle-aq-bulstrad-quotation-confirm-convert-to-policy-nightly | WorkflowPerfNightly | 12 | 4 | 6761.14 | 1.77 | 5679.63 | 6259.65 | 6276.32 |
| oracle-aq-delayed-burst-nightly | WorkflowPerfNightly | 48 | 1 | 4483.42 | 10.71 | 3908.13 | 3978.47 | 3991.75 |
| oracle-aq-immediate-burst-nightly | WorkflowPerfNightly | 120 | 1 | 2391.29 | 50.18 | 902.17 | 1179.59 | 1207.44 |
| oracle-aq-synthetic-external-resume-nightly | WorkflowPerfNightly | 36 | 8 | 6793.73 | 5.30 | 6238.80 | 6425.95 | 6466.75 |
| oracle-aq-bulstrad-quote-or-apl-cancel-smoke | WorkflowPerfSmoke | 10 | 4 | 507.79 | 19.69 | 28.54 | 40.05 | 42.93 |
| oracle-aq-delayed-burst-smoke | WorkflowPerfSmoke | 12 | 1 | 4202.91 | 2.86 | 4040.62 | 4083.70 | 4084.12 |
| oracle-aq-immediate-burst-smoke | WorkflowPerfSmoke | 24 | 1 | 421.48 | 56.94 | 205.87 | 209.90 | 210.16 |
| oracle-aq-synthetic-external-resume-smoke | WorkflowPerfSmoke | 12 | 4 | 3843.39 | 3.12 | 3644.91 | 3691.31 | 3696.92 |
| oracle-aq-signal-roundtrip-soak | WorkflowPerfSoak | 108 | 8 | 27620.16 | 3.91 | 4494.29 | 5589.33 | 5595.04 |
| oracle-aq-signal-roundtrip-throughput-parallel | WorkflowPerfThroughput | 96 | 16 | 4575.99 | 20.98 | 4142.13 | 4215.64 | 4233.33 |

Measurement Split

The synthetic signal round-trip workload is now measured in three separate ways so the numbers are not conflated:

  • oracle-aq-signal-roundtrip-latency-serial: one workflow at a time, one signal worker, used as the single-instance latency baseline.
  • oracle-aq-signal-roundtrip-throughput-parallel: 96 workflows, 16-way workload concurrency, 8 signal workers, used as the steady-state throughput baseline.
  • oracle-aq-signal-roundtrip-capacity-c*: batch-wave capacity ladder used to observe scaling and pressure points.

This split matters because the old low c1 figure was easy to misread as a latency number, when it actually conflated batch-wave capacity with end-to-end latency. The useful baselines now are:

  • serial latency baseline: 3104.85 ms average end-to-end per workflow
  • steady throughput baseline: 20.98 ops/s with 16 workload concurrency and 8 signal workers
  • capacity c1: 3.37 ops/s; this is now just the smallest batch-wave rung, not the headline latency number
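
The reported throughput figures follow directly from operation count over wall-clock duration. A quick sanity check in Python, using values copied from the scenario summary table:

```python
# Sanity-check reported throughput: ops / (duration_ms / 1000).
# (ops, duration_ms) pairs are copied from the scenario summary table.
scenarios = {
    "latency-serial":      (16,  49755.52),
    "throughput-parallel": (96,  4575.99),
    "capacity-c1":         (16,  4752.27),
    "capacity-c16":        (256, 7523.47),
}

for name, (ops, duration_ms) in scenarios.items():
    throughput = ops / (duration_ms / 1000.0)
    print(f"{name}: {throughput:.2f} ops/s")
```

Each computed value matches the table's Throughput column (0.32, 20.98, 3.37, and 34.03 ops/s respectively), confirming the columns are internally consistent.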

Serial Latency Baseline

| Phase | Avg (ms) | P95 (ms) | Max (ms) |
|---|---|---|---|
| start | 25.14 | 41.92 | 45.27 |
| signalPublish | 16.57 | 31.39 | 50.72 |
| signalToCompletion | 3079.70 | 3128.56 | 3203.33 |

Interpretation:

  • most of the serial latency sits in signalToCompletion (3079.70 ms average), not in start or signal publication
  • start (25.14 ms average) and signal publication (16.57 ms average) are both cheap by comparison

Steady Throughput Baseline

| Phase | Avg (ms) | P95 (ms) | Max (ms) |
|---|---|---|---|
| start | 63.15 | 122.10 | 134.45 |
| signalPublish | 18.61 | 25.99 | 29.06 |
| signalToCompletion | 3905.26 | 4007.86 | 4016.82 |

Interpretation:

  • the engine sustained 20.98 ops/s in a 96-operation wave
  • end-to-end average stayed at 4142.13 ms
  • start and signal publication remained small compared to the resume path

Oracle Observations

Dominant Waits

  • log file sync was the top wait in 14/15 scenario artifacts. Commit pressure is still the main Oracle-side cost center for this engine profile.
  • The only scenario with a different top wait was the heavier Bulstrad nightly flow:
    • oracle-aq-bulstrad-quotation-confirm-convert-to-policy-nightly -> library cache lock
  • At higher concurrency the second-order waits become visible:
    • resmgr:cpu quantum
    • row cache lock
    • buffer busy waits

Capacity Ladder

| Scenario | Throughput (ops/s) | P95 (ms) | User Commits | Session Logical Reads | Redo Size | DB Time | DB CPU | Top Wait |
|---|---|---|---|---|---|---|---|---|
| c1 | 3.37 | 4336.38 | 64 | 3609 | 232824 | 653630 | 403101 | log file sync |
| c4 | 15.22 | 3988.33 | 256 | 19710 | 913884 | 1867747 | 1070601 | log file sync |
| c8 | 21.34 | 5561.22 | 512 | 66375 | 1910412 | 14103899 | 2746786 | log file sync |
| c16 | 34.03 | 6710.05 | 1024 | 229828 | 3796688 | 17605655 | 6083523 | log file sync |

Interpretation:

  • The harness changes improved the ladder materially compared to the previous cut.
  • c1 moved to 3.37 ops/s from the earlier 1.63 ops/s, mostly because the harness no longer spends as much time in serial verifier tail behavior.
  • c16 reached 34.03 ops/s, but it is also the first rung with clearly visible CPU scheduling and contention pressure.
  • c8 is still the last comfortable rung on this local Oracle Free setup.
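
Normalizing the ladder's counters per operation makes the scaling pattern easier to see. A small Python sketch using the numbers from the capacity-ladder table (ops per rung come from the scenario summary):

```python
# Normalize capacity-ladder counters per operation.
# ops: c1=16, c4=64, c8=128, c16=256 (scenario summary table);
# commits and redo are from the capacity-ladder table.
ladder = {
    "c1":  {"ops": 16,  "commits": 64,   "redo": 232824},
    "c4":  {"ops": 64,  "commits": 256,  "redo": 913884},
    "c8":  {"ops": 128, "commits": 512,  "redo": 1910412},
    "c16": {"ops": 256, "commits": 1024, "redo": 3796688},
}

for rung, m in ladder.items():
    commits_per_op = m["commits"] / m["ops"]
    redo_per_op = m["redo"] / m["ops"]
    print(f"{rung}: {commits_per_op:.0f} commits/op, {redo_per_op:.0f} redo bytes/op")
```

Commits per operation stay fixed at 4 and redo per operation stays near 14-15 KB across all four rungs, which is consistent with the growing log file sync waits tracking operation volume rather than any increase in per-operation cost.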

Transport Baselines

| Scenario | Throughput (ops/s) | User Commits | Session Logical Reads | Redo Size | Top Wait |
|---|---|---|---|---|---|
| oracle-aq-immediate-burst-smoke | 56.94 | 48 | 973 | 88700 | log file sync |
| oracle-aq-immediate-burst-nightly | 50.18 | 240 | 12426 | 451200 | log file sync |
| oracle-aq-delayed-burst-smoke | 2.86 | 24 | 566 | 52724 | log file sync |
| oracle-aq-delayed-burst-nightly | 10.71 | 96 | 3043 | 197696 | log file sync |

Interpretation:

  • Immediate AQ transport remains much cheaper than full workflow resume.
  • Delayed AQ transport is still dominated by the intentional delay window, not raw dequeue throughput.
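
The same per-operation normalization applies to the transport rows. The sketch below takes operation counts from the scenario summary and commit counts from the transport table:

```python
# Commits per operation for the transport-only scenarios
# (ops from the scenario summary, commits from the transport table).
transport = {
    "immediate-burst-smoke":   {"ops": 24,  "commits": 48},
    "immediate-burst-nightly": {"ops": 120, "commits": 240},
    "delayed-burst-smoke":     {"ops": 12,  "commits": 24},
    "delayed-burst-nightly":   {"ops": 48,  "commits": 96},
}
for name, m in transport.items():
    print(f"{name}: {m['commits'] / m['ops']:.0f} commits/op")
```

In these runs each transport-only operation costs exactly 2 user commits, versus 4 per operation in the signal round-trip capacity ladder, which is one way to quantify how much cheaper raw transport is than a full workflow resume.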

Business Flow Baselines

| Scenario | Throughput (ops/s) | Avg (ms) | User Commits | Session Logical Reads | Redo Size | Top Wait |
|---|---|---|---|---|---|---|
| oracle-aq-bulstrad-quote-or-apl-cancel-smoke | 19.69 | 28.54 | 10 | 3411 | 40748 | log file sync |
| oracle-aq-bulstrad-quotation-confirm-convert-to-policy-nightly | 1.77 | 5679.63 | 48 | 18562 | 505656 | library cache lock |

Interpretation:

  • The short Bulstrad flow is still mostly transport-bound.
  • The heavier QuotationConfirm -> ConvertToPolicy flow remains a useful real-workflow pressure baseline because it introduces parse and library pressure that the synthetic workloads do not.

Soak Baseline

oracle-aq-signal-roundtrip-soak completed 108 operations at concurrency 8 with:

  • throughput: 3.91 ops/s
  • average latency: 4494.29 ms
  • P95 latency: 5589.33 ms
  • 0 failures
  • 0 dead-lettered signals
  • 0 runtime conflicts
  • 0 stuck instances

Oracle metrics for the soak run:

  • user commits: 432
  • user rollbacks: 54
  • session logical reads: 104711
  • redo size: 1535580
  • DB time: 15394405
  • DB CPU: 2680492
  • top waits:
    • log file sync: 10904550 us
    • resmgr:cpu quantum: 1573185 us
    • row cache lock: 719739 us
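
The top-wait figures can be put in proportion to total DB time, assuming DB time is reported in the same microsecond unit as the waits (the Oracle time model reports microseconds):

```python
# Share of soak-run DB time spent in the top waits.
# All values in µs, copied from the soak metrics above; assumes DB time
# uses the same microsecond unit as the wait figures.
db_time = 15394405
waits = {
    "log file sync":      10904550,
    "resmgr:cpu quantum": 1573185,
    "row cache lock":     719739,
}
for name, us in waits.items():
    print(f"{name}: {us / db_time:.1%} of DB time")
```

On that assumption, log file sync alone accounts for roughly 71% of soak-run DB time, reinforcing commit pressure as the dominant Oracle-side cost.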

What Must Stay Constant For Future Backend Comparisons

When PostgreSQL and MongoDB backends are benchmarked, keep these constant:

  • same scenario names
  • same operation counts
  • same concurrency levels
  • same worker counts for signal drain
  • same synthetic workflow definitions
  • same Bulstrad workflow families
  • same correctness assertions

Compare these dimensions directly:

  • throughput per second
  • latency average, P95, P99, and max
  • phase latency summaries for start, signal publish, and signal-to-completion on the synthetic signal round-trip workload
  • failures, dead letters, runtime conflicts, and stuck instances
  • commit count analogs
  • logical read or document/row read analogs
  • redo or WAL/journal write analogs
  • dominant waits, locks, or contention classes
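
A mechanical scenario-by-scenario diff helps enforce these constraints. The sketch below is illustrative only: the record shape and field names are assumptions for the example, not the schema of the JSON companion file.

```python
# Compare two backend baselines scenario-by-scenario. The record shape
# here is illustrative; it is NOT the schema of the JSON companion file.
from dataclasses import dataclass

@dataclass
class ScenarioResult:
    throughput_per_s: float
    avg_ms: float
    p95_ms: float
    failures: int

def compare(baseline: dict[str, ScenarioResult],
            candidate: dict[str, ScenarioResult]) -> dict[str, float]:
    """Return the per-scenario throughput ratio (candidate / baseline).

    Raises if the scenario sets differ, enforcing the
    'same scenario names' rule above.
    """
    if baseline.keys() != candidate.keys():
        raise ValueError("scenario sets must match before comparing")
    return {name: candidate[name].throughput_per_s / baseline[name].throughput_per_s
            for name in baseline}

# Example: Oracle number from this document, made-up PostgreSQL number.
oracle = {"signal-roundtrip-throughput-parallel": ScenarioResult(20.98, 4142.13, 4215.64, 0)}
pg = {"signal-roundtrip-throughput-parallel": ScenarioResult(25.0, 3500.0, 3700.0, 0)}
print(compare(oracle, pg))
```

Extending `ScenarioResult` with the commit, read, and redo/WAL analogs listed above would let the same comparison cover the resource dimensions as well as latency and throughput.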

First Sizing Note

On this local Oracle Free baseline:

  • Oracle AQ immediate burst handling is comfortably above the small workflow tiers, but not at the earlier near-100 ops/s level on this latest run; the current nightly transport baseline is 50.18 ops/s.
  • The first clear saturation signal is still not transport dequeue itself, but commit pressure and then CPU scheduling pressure.
  • The separated throughput baseline is the better reference for backend comparisons than the old low c1 figure.
  • c8 remains the last comfortably scaling signal-roundtrip rung on this machine.
  • c16 is still correct and faster, but it is the first pressure rung, not the default deployment target.

This is a baseline, not a production commitment. PostgreSQL and MongoDB backend work should reuse the same scenarios and produce the same summary tables before any architectural preference is declared.