From e91cf98f8fb87537bfa92de8ddfb7ba7fc602afe Mon Sep 17 00:00:00 2001
From: master <>
Date: Mon, 30 Mar 2026 11:37:32 +0300
Subject: [PATCH] Add ElkSharp rendering architecture docs, ADRs, tutorial, AGENTS rules
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Five documentation deliverables for the ElkSharp rendering improvements:

1. docs/workflow/engine/16-elksharp-rendering-architecture.md (453 lines)
   Full pipeline: Sugiyama stages, edge routing strategies, hybrid
   deterministic mode, gateway geometry, 18-category scoring system,
   corridor routing, Y-gutter expansion, diagnostics.

2. docs/workflow/engine/17-elksharp-architectural-decisions.md (259 lines)
   Six ADRs: short-stub normalization, gateway vertex entries, Y-gutter
   expansion, corridor rerouting, FinalScore adjustment, alongside detection.

3. docs/workflow/tutorials/10-rendering/README.md (234 lines)
   Practical tutorial: setup, layout options, SVG/PNG rendering,
   diagnostics capture, violation reports, full end-to-end example.

4. src/__Libraries/StellaOps.ElkSharp/AGENTS.md -- 7 new local rules for
   Y-gutter, corridor reroute, gateway vertices, FinalScore adjustments,
   short-stub normalization, alongside detection, target-join spread.

5. docs/workflow/ENGINE.md -- replaced monolithic ElkSharp paragraph with
   structured pipeline overview, effort-level table, and links to the new
   architecture docs.

Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/workflow/ENGINE.md | 60 ++- .../16-elksharp-rendering-architecture.md | 453 ++++++++++++++++++ .../17-elksharp-architectural-decisions.md | 259 ++++++++++ .../workflow/tutorials/10-rendering/README.md | 234 +++++++++ src/__Libraries/StellaOps.ElkSharp/AGENTS.md | 7 + 5 files changed, 997 insertions(+), 16 deletions(-) create mode 100644 docs/workflow/engine/16-elksharp-rendering-architecture.md create mode 100644 docs/workflow/engine/17-elksharp-architectural-decisions.md create mode 100644 docs/workflow/tutorials/10-rendering/README.md diff --git a/docs/workflow/ENGINE.md b/docs/workflow/ENGINE.md index 49bdb62cf..04961a3f3 100644 --- a/docs/workflow/ENGINE.md +++ b/docs/workflow/ENGINE.md @@ -1,4 +1,4 @@ -# Serdica Workflow Engine +# Serdica Workflow Engine A declarative, plugin-based workflow engine for long-running insurance business processes. Replaces Camunda BPMN with a native C# fluent DSL, canonical JSON schema, durable signal-based execution, and multi-backend persistence. 
@@ -477,10 +477,10 @@ The expression system evaluates declarative expressions at runtime without recom ### Path Navigation Paths navigate the execution context: -- `start.*` — Start request fields -- `state.*` — Current workflow state -- `payload.*` — Current task completion payload -- `result.*` — Step result (when `resultKey` is set) +- `start.*` — Start request fields +- `state.*` — Current workflow state +- `payload.*` — Current task completion payload +- `result.*` — Step result (when `resultKey` is set) ### Binary Operators @@ -494,10 +494,10 @@ Paths navigate the execution context: | `lte` | `WorkflowExpr.Lte(a, b)` | Less or equal | | `and` | `WorkflowExpr.And(a, b)` | Logical AND | | `or` | `WorkflowExpr.Or(a, b)` | Logical OR | -| `add` | — | Arithmetic addition | -| `subtract` | — | Arithmetic subtraction | -| `multiply` | — | Arithmetic multiplication | -| `divide` | — | Arithmetic division | +| `add` | — | Arithmetic addition | +| `subtract` | — | Arithmetic subtraction | +| `multiply` | — | Arithmetic multiplication | +| `divide` | — | Arithmetic division | ### Built-in Functions @@ -698,9 +698,9 @@ Configured via `GenericAssignmentPermissions.AdminRoles` (appsettings). ### Effective Roles A task's `EffectiveRoles` combines: -1. `WorkflowRoles` — from the workflow definition -2. `TaskRoles` — from the task definition -3. `RuntimeRoles` — computed at runtime via expression +1. `WorkflowRoles` — from the workflow definition +2. `TaskRoles` — from the task definition +3. `RuntimeRoles` — computed at runtime via expression If `TaskRoles` are specified, they narrow the effective roles. Otherwise, `WorkflowRoles` apply. 
@@ -739,8 +739,8 @@ Plugins load in the order specified by `PluginsConfig.PluginsOrder` in appsettin ### Marker Interfaces -- `IWorkflowBackendRegistrationMarker` — validates backend plugin is loaded -- `IWorkflowSignalDriverRegistrationMarker` — validates signal driver is loaded +- `IWorkflowBackendRegistrationMarker` — validates backend plugin is loaded +- `IWorkflowSignalDriverRegistrationMarker` — validates signal driver is loaded Startup validation throws `InvalidOperationException` if a configured provider is missing its plugin. @@ -920,7 +920,7 @@ The engine can render workflow definitions as visual diagrams. | Engine | Description | |--------|-------------| -| **ElkSharp** | Port of Eclipse Layout Kernel (default). In `Best` effort mode it runs a deterministic iterative multi-strategy orthogonal router after base routing, scoring candidate layouts across crossings, proximity, labels, target-approach joins, detours, target-approach backtracking, and entry geometry before selecting the best valid result. Attempt 1 remains the only full-strategy reroute; later attempts repair only the currently penalized lanes or exact conflict peers, with shortest-path detours prioritized first, a direct orthogonal shortcut tried before broader rerouting, and corridor-like overshoots only eligible when a clean orthogonal shortcut actually exists. Local-repair candidate building may run in parallel inside an attempt, but builds that touch the same source/target neighborhood are lock-serialized and the final apply order remains deterministic. Small or protected graphs keep the baseline route to preserve established sink-corridor, backward-edge, and port-anchor contracts, while larger congested graphs use the iterative sweep. 
Final strategy acceptance re-validates post-processed output so remaining broken short highways and non-applicable target-side approach joins are retried instead of being selected, while other soft-rule regressions get bounded multi-attempt retries and a wider but finite strategy sweep before fallback selection. The current A* pathfinder precomputes node-obstacle blocked step masks per route and uses lighter soft-obstacle rejection checks before exact geometry tests, materially reducing route-all-edges time without changing selected-path semantics. A final cheap geometry-repair pass cleans node-side entry/exit angles, target-slot spacing, repeat-collector return lanes, and target-side backtracking without re-running whole-graph A*. Rectangular boundary joins are constrained to a discrete slot lattice so one edge cannot silently concentrate on top of another: `left`/`right` faces may use at most `3` evenly spread side slots, `top`/`bottom` faces may use at most `5`, and the realized slot span matches the same safe boundary inset used by rectangle entry/exit normalization. Gateway faces are limited to `1` centered slot or `2` centered slots, singleton entries and preserved repeat/corridor exits are scored against the same centered lattice instead of being exempt, and the final slot snap can relax the generic shared-lane validator when the centered repair is still obstacle-safe and boundary-valid. Winner refinement now ends with a boundary-slot restabilization pass as well, so late shared-lane or under-node cleanup cannot drift decision/branch source exits back off the assigned lattice. Shortest-path local repair now also reuses interior axes from the current path and tries a raw-clearance obstacle-skirt fallback before accepting a wider preserved overshoot, which lets detour cleanup collapse onto an honest existing lane when the expanded-clearance candidates stay unnecessarily high. 
Decision/Fork/Join gateway nodes use a gateway-specific boundary algorithm instead of rectangular side snapping: off-axis lanes land on the actual polygon boundary, gateway target slots are derived from polygon-face intersections instead of rectangular side slots, gateway faces use only `1` or `2` centered face slots, short 45-degree diagonal stubs are allowed only on gateway side faces, corner-vertex diagonals are rejected, gateway-target arrival repair now also forbids tiny orthogonal last-moment hooks that change direction within less than one node depth of the boundary, and any retained 45-degree segment longer than one average node-shape length is rejected during scoring and artifact verification. Gateway-source dominant-axis detour checks are opportunity-gated, so obstacle-blocked gateway exits may keep a short local dogleg when there is no clean downstream-facing repair, while the artifact tests still enforce blocker clearance and keep those local exits out of unrelated node clearance bands. Repeat-collector lanes that preserve an outer corridor can still locally reroute their pre-corridor prefix when that prefix crosses a node, so node-safety no longer depends on skipping the edge outright. ElkSharp also supports strict compound-node trees through `ParentNodeId`: leaf nodes and empty parents receive layered positions, non-empty parent rectangles are derived bottom-up from descendant bounds plus `CompoundPadding` and `CompoundHeaderHeight`, exported child coordinates remain absolute, and cross-compound edges now include explicit parent-boundary crossing points. In v1, real edges must still terminate on leaves; non-leaf compound parents remain grouping-only containers and may not declare explicit ports. 
The document-processing artifact test emits both a live progress log and per-attempt phase timings/route-pass counts alongside the SVG/PNG/JSON diagnostics so long-running strategy searches can be inspected while they are still running and profiled after completion. `Draft` and `Balanced` keep the base route unless library callers opt in through ElkSharp layout options. | +| **ElkSharp** | Pure C# Sugiyama layout engine (default). See [Rendering Architecture](engine/16-elksharp-rendering-architecture.md) and [Architectural Decisions](engine/17-elksharp-architectural-decisions.md). | | **ElkJS** | JavaScript-based ELK via Node.js | | **MSAGL** | Microsoft Automatic Graph Layout | @@ -934,6 +934,34 @@ The engine can render workflow definitions as visual diagrams. } ``` +### ElkSharp Layout Pipeline + +ElkSharp uses a Sugiyama-based layered graph layout with deterministic iterative edge routing. + +**Pipeline stages:** +1. **Layer assignment** -- depth-first traversal assigns layer indices +2. **Node ordering** -- barycenter median sorting (8-24 iterations) +3. **Placement** -- median-of-incoming positioning with grid alignment +4. **Gutter expansion** -- X-gutters widen inter-layer corridors; Y-gutters create vertical routing space +5. **Edge routing** -- channel-based orthogonal routing with A* 8-direction pathfinding +6. **Iterative optimization** -- hybrid deterministic repair: score -> plan -> batch -> parallel A* -> merge -> refine +7. 
**Post-processing** -- boundary normalization, gateway geometry, corridor rerouting, FinalScore calibration + +**Effort levels:** +| Level | Ordering | Placement | Routing | +|-------|----------|-----------|---------| +| Draft | 8 iter | 3 iter | Baseline only | +| Balanced | 14 iter | 6 iter | Baseline + light repair | +| Best | 24 iter | 10 iter | Hybrid deterministic with full-core parallel repair | + +**Layout options:** +- `Direction`: LeftToRight (default for workflows) or TopToBottom +- `NodeSpacing`: vertical gap between nodes (default 40, scaled by edge density up to 1.8x) +- `LayerSpacing`: horizontal gap between layers (default 60) +- `Effort`: Draft / Balanced / Best + +For detailed architecture, see [ElkSharp Rendering Architecture](engine/16-elksharp-rendering-architecture.md). + ### Render Pipeline ``` @@ -955,7 +983,7 @@ WorkflowCanonicalDefinition | `WorkflowRuntimeStateConcurrencyException` | Duplicate/stale signal delivery | Auto-handled by signal pump (completes lease) | | `BaseResultException` | Business validation failure (not found, denied) | Returns error to caller | | `TimeoutException` | Transport or step timeout exceeded | Executes `WhenTimeout` branch if configured | -| `NotSupportedException` | Unsupported operation (e.g., null signal store) | Configuration error — check plugin loading | +| `NotSupportedException` | Unsupported operation (e.g., null signal store) | Configuration error — check plugin loading | ### Signal Retry Behavior diff --git a/docs/workflow/engine/16-elksharp-rendering-architecture.md b/docs/workflow/engine/16-elksharp-rendering-architecture.md new file mode 100644 index 000000000..3a8db3ac2 --- /dev/null +++ b/docs/workflow/engine/16-elksharp-rendering-architecture.md @@ -0,0 +1,453 @@ +# ElkSharp Rendering Architecture + +## Overview + +ElkSharp is a deterministic, in-process Sugiyama-based graph layout engine for workflow +visualization. 
It replaces external layout dependencies (ElkJs/Node.js) with a pure C# +implementation that produces identical output for identical input, regardless of host +platform, thread scheduling, or execution environment. + +The engine handles the full pipeline from abstract workflow graphs to positioned nodes +and routed edges, suitable for SVG/PNG/JSON rendering. It supports left-to-right and +top-to-bottom layout directions, three effort levels (Draft, Balanced, Best), compound +node hierarchies, and gateway-specific polygon geometry (diamond decisions, hexagon +forks/joins). + +--- + +## Sugiyama Pipeline + +The core layout algorithm follows the Sugiyama framework for layered graph drawing, +extended with workflow-specific constraints. + +### 1. Layer Assignment + +Depth-first traversal assigns a layer index to each node. The layer index determines +the horizontal position (in left-to-right mode) or vertical position (in top-to-bottom +mode) of each node. + +- Start nodes are assigned layer 0. +- Each successor is assigned `max(predecessor layers) + 1`. +- Backward edges (cycles, repeat connectors) are identified and handled separately + so they do not influence forward layer assignment. + +### 2. Dummy Node Insertion + +Edges that span more than one layer are split into chains of single-layer segments. +Each intermediate layer receives a "dummy node" -- a zero-size virtual node that +serves as a routing waypoint. + +- Dummy nodes inherit the edge's channel classification (forward, backward, sink). +- After routing, dummy chains are reconstructed back into the original multi-layer + edge with bend points at each dummy position. + +### 3. Node Ordering + +Barycenter-based median ordering minimizes edge crossings within each layer. + +- The algorithm sweeps forward and backward across layers, computing each node's + barycenter (average position of connected nodes in adjacent layers). +- Nodes are sorted by barycenter within their layer. 
+- Iteration count depends on effort level: + - Draft: 8 iterations + - Balanced: 14 iterations + - Best: 24 iterations +- Tie-breaking is deterministic (stable sort by node ID) to ensure reproducibility. + +### 4. Initial Placement + +After ordering, nodes receive Y-coordinates (in left-to-right mode) based on the +median of incoming connection Y-centers. + +- Enforced linear spacing ensures minimum `NodeSpacing` between adjacent nodes. +- Nodes with no incoming connections use the median of outgoing connection positions. +- Grid alignment snaps positions to the nearest grid line for visual consistency. + +### 5. Placement Refinement + +Multiple refinement passes adjust positions to reduce visual clutter: + +- **Preferred-center pull**: Nodes shift toward the center of their connected + neighbors' Y-range, weighted by connection count. +- **Snap-to-grid**: Positions align to a grid derived from `NodeSpacing`. +- **Compact-toward-incoming**: Nodes pull toward their primary incoming edge + direction to reduce edge length and crossings. + +Refinement iteration count scales with effort level (3 / 6 / 10 for Draft / Balanced / Best). + +### 6. Y-Gutter Expansion + +After edge routing identifies under-node violations (edges running through or +alongside nodes), the engine shifts entire Y-bands downward to create routing +corridors. + +- Runs after X-gutter expansion and before compact passes. +- Scans routed edges for horizontal segments that violate under-node or alongside + clearance rules. +- Identifies blocking nodes and computes the required clearance gap. +- Shifts ALL nodes below the violation Y downward by the clearance amount. + This preserves relative ordering within each layer. +- Re-routes edges with the expanded corridors. +- Up to 2 iterations to handle cascading violations. + +The key insight is that shifting individual nodes disrupts the Sugiyama median-based +optimization, causing cascading layout degradation. 
Shifting entire Y-bands (like +X-gutter expansion does for inter-layer gaps) preserves within-layer relationships. + +### 7. X-Gutter Expansion + +Inter-layer horizontal gaps are widened to provide edge corridor space. + +- The base gap is `LayerSpacing` (default 60px). +- When edge density between two layers exceeds a threshold, the gap is scaled + up to 1.8x to accommodate additional routing channels. +- Expansion is computed before edge routing so that the router has adequate space + for orthogonal paths. + +--- + +## Edge Routing + +### Channel Assignment + +Each edge is classified into one of three routing channels: + +- **Forward**: Source layer < target layer (the common case). +- **Backward**: Source layer > target layer (repeat/loop connectors). +- **Sink**: Edges to terminal nodes (End events) that may use special corridors. + +Channel classification determines routing priority, corridor eligibility, and +post-processing rules. + +### Base Routing + +Orthogonal bend-point construction builds an initial route from source to target: + +1. Exit the source node perpendicular to its boundary. +2. Route horizontally through inter-layer gutters. +3. Enter the target node perpendicular to its boundary. +4. Insert 90-degree bends at each direction change. + +The base router respects node obstacles, avoiding routes that cross through node +rectangles or polygons. + +### Dummy Edge Reconstruction + +After routing, multi-layer dummy chains are merged back to original edges: + +- Bend points from each dummy segment are concatenated. +- Redundant collinear points are removed. +- The result is a single edge with bend points at each layer transition. + +### Anchor Snapping + +Edge endpoints are projected onto actual node shape boundaries: + +- **Rectangle**: Standard side intersection (left, right, top, bottom). +- **Diamond** (decision gateways): Intersection with the diamond's four edges, + producing diagonal approach stubs. 
+- **Hexagon** (fork/join gateways): Intersection with the hexagon's six edges, + with asymmetric shoulder geometry. + +Anchor snapping runs after routing so that bend points near the boundary +produce clean visual connections. + +--- + +## Iterative Optimization (Hybrid Deterministic Path) + +The `Best` effort level activates hybrid deterministic optimization, which repairs +routing violations without disrupting the Sugiyama node placement. + +### 1. Baseline Evaluation + +The baseline route is scored using an 18-category violation taxonomy. Each edge +receives a violation list with severity-weighted penalties. The total score is the +sum of all edge scores. + +### 2. Repair Planning + +Penalized edges are identified from the violation severity map. The planner +extracts the specific violations for each edge and determines which edges need +repair and in what priority order. + +High-severity violations (node crossings, under-node, shared lanes) take priority +over medium-severity (backtracking, detours) and soft-severity (edge crossings, +proximity, bends). + +### 3. Conflict-Zone Batching + +Independent repair candidates are grouped by shared source, shared target, or +shared corridor zone. Edges in the same conflict zone are batched together so +that their repairs are coordinated rather than competing. + +Batching ensures that fixing one edge does not create a new violation for a +nearby edge in the same zone. + +### 4. Parallel A* Candidate Construction + +Each repair batch constructs candidate reroutes using A* 8-direction pathfinding: + +- Candidates are built on full-core parallel threads. +- Each candidate is a complete reroute for the batch's edge set. +- The A* grid derives intermediate spacing from approximately one-third of the + average service-task node size. +- Node-obstacle blocked step masks are precomputed per route so neighbor + expansion does not rescan every node. +- Merge back into the route is deterministic and single-threaded. + +### 5. 
Winner Refinement + +The best candidate from each batch undergoes a refinement pipeline: + +1. **Under-node repair**: Shift lanes that pass through node bounding boxes. +2. **Local-bundle spread**: Separate parallel edges that share the same lane. +3. **Shared-lane elimination**: Push edges apart that overlap on the same axis. +4. **Boundary-slot snap**: Align endpoints to the discrete slot lattice. +5. **Detour collapse**: Remove unnecessary overshoots where a shorter path exists. +6. **Post-slot restabilization**: Re-validate slot assignments after detour changes. +7. **Corridor reroute**: Move long horizontal sweeps to top/bottom corridors. +8. **Elevation adjustment**: Shift edges vertically to clear obstructions. +9. **Target-join spread**: Push convergent approach lanes apart by + `minClearance - currentGap + 8px` (half applied to each edge). + +Winner promotion uses weighted score comparison (`Score.Value`) to ensure +the refinement actually improved the layout. + +--- + +## Gateway Geometry + +Gateways use non-rectangular shapes that require specialized boundary logic. + +### Decision Gateway (Diamond) + +- 4 vertices: left tip, top, right tip, bottom. +- Left and right tips are the horizontal extremes. +- Top and bottom are the vertical extremes. +- Source exits leave from face interiors (not tips). +- Target entries may use left/right tips as convergence points. +- `ForceDecisionSourceExitOffVertex` blocks source exits from tip vertices. + +### Fork/Join Gateway (Hexagon) + +- 6 vertices: left tip, upper-left shoulder, upper-right shoulder, right tip, + lower-right shoulder, lower-left shoulder. +- Shoulders create flat top and bottom faces suitable for multiple slot entries. +- Asymmetric geometry: the shoulder offset from the tip varies by gateway size. 
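
The vertex layouts for both gateway shapes can be illustrated with a minimal sketch. It assumes an axis-aligned bounding box with Y growing downward and, for simplicity, a single symmetric `shoulder_offset` for the hexagon. The function names and the `shoulder_offset` parameter are hypothetical and are not taken from ElkSharp.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def diamond_vertices(x: float, y: float, width: float, height: float) -> List[Point]:
    """Return the 4 diamond vertices: left tip, top, right tip, bottom."""
    cx, cy = x + width / 2, y + height / 2
    return [(x, cy), (cx, y), (x + width, cy), (cx, y + height)]


def hexagon_vertices(
    x: float, y: float, width: float, height: float, shoulder_offset: float
) -> List[Point]:
    """Return the 6 hexagon vertices: left tip, upper-left shoulder,
    upper-right shoulder, right tip, lower-right shoulder, lower-left shoulder.

    shoulder_offset is the horizontal distance from a tip to its shoulders
    (an assumed parameter; the real engine derives it from the gateway size).
    """
    if not 0 < shoulder_offset < width / 2:
        raise ValueError("shoulder_offset must leave a flat top/bottom face")
    cy = y + height / 2
    return [
        (x, cy),
        (x + shoulder_offset, y),
        (x + width - shoulder_offset, y),
        (x + width, cy),
        (x + width - shoulder_offset, y + height),
        (x + shoulder_offset, y + height),
    ]
```

In this model the flat top and bottom faces have length `width - 2 * shoulder_offset`, which is the span available for slot entries on those faces.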
+ +### Boundary Slot Capacity + +| Shape | Face | Max Slots | +|-------|------|-----------| +| Gateway (diamond/hexagon) | Any face | 2 | +| Rectangle | Left / Right | 3 | +| Rectangle | Top / Bottom | 5 | + +Slots are evenly distributed within the face's safe boundary inset. Scoring and +final repair share the same realizable slot coordinates. + +### Gateway Vertex Entry Rules + +Left and right tip vertices are allowed for target entries but blocked for source +exits. This is enforced by a 3-way coordination mechanism: + +1. **IsAllowedGatewayTipVertex**: Returns `true` for left/right tips when the + edge is a target entry (incoming). +2. **HasValidGatewayBoundaryAngle**: Accepts any external approach angle at + allowed tip vertices. +3. **CountBoundarySlotViolations**: Skips slot-occupancy checks when all entries + at a vertex share the same allowed tip point. + +All three checks must stay synchronized -- changing one without the others causes +cascading boundary-slot violations. + +--- + +## Scoring System + +### Violation Categories + +The scoring system uses 18 violation categories with severity-weighted penalties. 
+ +#### Hard Violations (100,000 per instance) + +| Category | Description | +|----------|-------------| +| Node crossings | Edge segment passes through a node bounding box | +| Under-node | Edge runs beneath or through a node's vertical extent | +| Shared lanes | Two edges share the same routing lane segment | +| Boundary slots | More edges than slots on a node face | +| Target joins | Multiple edges converge to the same target arrival point | +| Gateway exits | Source exit from a blocked gateway vertex | +| Collector corridors | Repeat-collector lane conflicts | +| Below-graph | Edge segment routes below the graph's maximum Y extent | + +#### Medium Violations (50,000 per instance) + +| Category | Description | +|----------|-------------| +| Backtracking | Edge reverses direction in the target-approach window | +| Detours | Unnecessary overshoot where a shorter path exists | + +#### Soft Violations (200-650 per instance) + +| Category | Description | +|----------|-------------| +| Edge crossings | Two edge segments intersect (200 per crossing) | +| Proximity | Edge passes too close to a node boundary (400) | +| Labels | Edge label overlaps another element (300) | +| Bends | Excessive number of bend points (200 per extra bend) | +| Diagonals | Non-orthogonal segment exceeds one node-shape length (650) | + +### FinalScore Adjustments + +The FinalScore applies detection exclusions that are NOT used during the iterative +search. This separation is critical: the search uses the raw scoring as its +heuristic, and changing it alters the search trajectory (causing speed regressions). + +FinalScore excludes these borderline detection artifacts: + +- **Valid gateway face approaches**: The exterior approach point is closer to + the face center than the predecessor bend point (a legitimate face entry, + not a violation). +- **Gateway-exit under-node**: The lane runs within 16px of the source node's + bottom boundary (a tight but valid exit, not a true under-node crossing). 
+- **Convergent target joins from distant sources**: Sources separated by > 15px + on the Y-axis with significant X separation (natural convergence, not a + shared-lane conflict). +- **Borderline shared lanes**: Gap between parallel edges is within 3px of + the tolerance threshold (measurement noise, not a real overlap). + +--- + +## Corridor Routing + +Long-range edges that would cross many intermediate nodes are routed through +corridors outside the main node field. + +### Top Corridor (Long Sweeps) + +For forward edges spanning more than 40% of the graph width: + +- Route through the top corridor at `graphMinY - 56`. +- Exit the source with a 24px perpendicular stub. +- Route horizontally across the top corridor. +- Descend to the target. + +The 24px exit stub is critical: it prevents `NormalizeBoundaryAngles` from +collapsing the vertical corridor segment back into the source boundary. + +### Bottom Corridor (Near-Boundary Sweeps) + +For edges that need to route near the graph's lower boundary: + +- Route through the bottom corridor at `graphMaxY + 32`. +- Same perpendicular exit stub pattern. + +### Below-Graph Detection + +The below-graph violation detector (`HasCorridorBendPoints`) exempts edges that +intentionally use corridor routing. Without this exemption, corridor edges would +be penalized as below-graph violations and rerouted back into the node field. + +--- + +## Y-Gutter Expansion (Routing-Aware Placement Feedback Loop) + +Y-gutter expansion is a post-routing placement correction that creates vertical +routing space where the initial Sugiyama placement left insufficient clearance. + +### Algorithm + +1. **Detection**: After edge routing, scan all routed edge segments for horizontal + segments that violate under-node or alongside clearance rules. + +2. **Identification**: For each violation, identify the blocking node and compute + the required clearance: + - Under-node: The edge passes through the node's bounding box. 
+ - Alongside (flush): The edge runs within +/-4px of a node's top or bottom + boundary (the "alongside" extension catches edges "glued" to boundaries + that the standard gap > 0.5px check misses). + +3. **Expansion**: Shift ALL nodes below the violation Y downward by the computed + clearance amount. This preserves relative ordering within each layer, unlike + individual node shifting which disrupts Sugiyama optimization. + +4. **Re-routing**: Re-route edges with the expanded corridors. + +5. **Iteration**: Repeat up to 2 times to handle cascading violations (where + fixing one violation exposes another). + +### Design Rationale + +The same pattern is used for X-gutter expansion (widening inter-layer gaps). +Band-level shifting is fundamentally different from individual node adjustment: + +- Individual node shifts break the barycenter ordering that the Sugiyama + algorithm optimized, causing cascading position changes. +- Post-refinement clearance insertion fails because subsequent optimization + passes override the inserted space. +- Band-level shifts preserve within-layer relationships while creating the + needed routing corridors. + +### Timing + +Y-gutter expansion runs: +- AFTER X-gutter expansion (inter-layer gaps are already set). +- BEFORE compact passes (so compaction respects the new corridors). +- BEFORE the iterative optimization loop (so the optimizer works with + adequate routing space). + +--- + +## Effort Levels + +| Level | Ordering Iterations | Placement Iterations | Routing | +|-------|--------------------:|---------------------:|---------| +| Draft | 8 | 3 | Baseline only | +| Balanced | 14 | 6 | Baseline + light repair | +| Best | 24 | 10 | Hybrid deterministic with full-core parallel repair | + +- **Draft**: Fastest layout for previews and interactive editing. No iterative + optimization. Suitable for graphs under ~20 nodes. +- **Balanced**: Good quality for medium graphs. Light repair fixes the worst + violations without full A* search. 
+- **Best**: Production quality for rendered artifacts (SVG, PNG). Full hybrid + deterministic optimization with parallel candidate construction. Typical + runtime: 12-15 seconds for complex workflow graphs. + +--- + +## Layout Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `Direction` | Enum | `LeftToRight` | Layout direction. `TopToBottom` uses the legacy iterative path. | +| `NodeSpacing` | int | 40 | Vertical gap between nodes (px). Scaled by edge density up to 1.8x. | +| `LayerSpacing` | int | 60 | Horizontal gap between layers (px). | +| `Effort` | Enum | `Best` | Layout quality vs. speed tradeoff. | + +### Edge Density Scaling + +When the number of edges between two adjacent layers exceeds a threshold, both +`NodeSpacing` and `LayerSpacing` are scaled up to accommodate additional routing +channels. The maximum scale factor is 1.8x, applied per-layer-pair. + +--- + +## Diagnostics + +The layout engine emits detailed diagnostics when running in `Best` effort mode: + +- **Live progress log**: Baseline state, strategy starts, per-attempt scores, + and adaptation decisions logged during execution. +- **Per-attempt phase timings**: Routing time, post-processing time, and + route-pass counts for each optimization attempt. +- **SVG/PNG/JSON artifacts**: The document-processing artifact test produces + rendered output alongside diagnostic data. +- **Violation reports**: Per-edge violation lists with category, severity, + geometry details, and FinalScore adjustments. + +Diagnostics are detailed enough to prove routing progress and to profile +optimization performance for regression detection. 
diff --git a/docs/workflow/engine/17-elksharp-architectural-decisions.md b/docs/workflow/engine/17-elksharp-architectural-decisions.md new file mode 100644 index 000000000..e00cc3469 --- /dev/null +++ b/docs/workflow/engine/17-elksharp-architectural-decisions.md @@ -0,0 +1,259 @@ +# ElkSharp Architectural Decisions + +This document records architectural decisions made during the ElkSharp rendering +engine development. Each record follows the ADR (Architecture Decision Record) +format: context, decision, consequences. + +--- + +## ADR-1: Short-Stub Exit Normalization + +**Status**: Accepted + +**Context**: +`NormalizeExitPath` creates a perpendicular stub from the source node boundary to +establish a clean exit direction. The default stub length extends to the anchor X +coordinate: `Math.Max(sourceX + 24, anchorX)`. For edges where the anchor is far +from the source (e.g., a long forward edge), this creates a horizontal segment of +1000+ pixels that crosses intermediate nodes in the same Y-band. + +The long horizontal stub was originally designed to produce clean orthogonal exits, +but it assumed the Y-band between source and anchor was unoccupied. In dense graphs, +intermediate nodes occupy the same Y-band, and the long stub crosses through them, +creating entry-angle violations and node-crossing penalties. + +**Decision**: +When the long stub fails `HasClearSourceExitSegment` (i.e., the horizontal segment +between `sourceX + 24` and `anchorX` crosses a node), try a short stub instead. The +short stub extends only `sourceX +/- 24px` -- just enough to establish the +perpendicular exit direction without reaching into the occupied Y-band. + +The short stub is controlled by the `useShortStub` parameter in `NormalizeExitPath`. +It fires ONLY when the default long stub fails clearance. The long stub remains the +default because it produces cleaner, more direct paths when clearance is available. 
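
The fallback in this decision can be sketched as follows. This is a simplified model, not the actual `NormalizeExitPath` code: it handles only a rightward exit, treats nodes as plain rectangles, and uses hypothetical helper names (`has_clear_exit_segment`, `choose_stub_end_x`). Only the 24px stub length and the `max(sourceX + 24, anchorX)` rule come from the ADR text.

```python
from dataclasses import dataclass
from typing import Sequence

STUB = 24.0  # stub length in px, as stated in the ADR


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    right: float
    bottom: float


def has_clear_exit_segment(
    y: float, x_from: float, x_to: float, obstacles: Sequence[Rect]
) -> bool:
    """True when the horizontal segment at height y crosses no obstacle."""
    lo, hi = min(x_from, x_to), max(x_from, x_to)
    for rect in obstacles:
        in_y_band = rect.top < y < rect.bottom
        overlaps_x = rect.left < hi and rect.right > lo
        if in_y_band and overlaps_x:
            return False
    return True


def choose_stub_end_x(
    source_x: float, y: float, anchor_x: float, obstacles: Sequence[Rect]
) -> float:
    """Prefer the long stub; fall back to the short stub only on failed clearance."""
    long_end = max(source_x + STUB, anchor_x)
    if has_clear_exit_segment(y, source_x + STUB, long_end, obstacles):
        return long_end
    return source_x + STUB  # short stub: just enough to fix the exit direction
```

With a source at `x = 100` and an anchor at `x = 1200`, an intermediate node in the same Y-band makes the function return `124` (short stub), while a node in a different Y-band leaves the long stub at `1200`.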
+ +**Consequences**: +- Fixes entry-angle violations where intermediate nodes in occupied Y-bands blocked + the perpendicular exit path. +- The short stub creates a 24px vertical segment that subsequent routing can extend + into a clean corridor without crossing obstacles. +- Does not change behavior for edges with clear exit paths (the long stub is still + preferred when it passes clearance). + +--- + +## ADR-2: Gateway Vertex Entries + +**Status**: Accepted + +**Context**: +Gateway tips (diamond corner vertices at left and right extremes) were blocked for +all edges because source exits from tips create "pin" visual artifacts -- a thin +spike extending from the corner that looks like a rendering glitch rather than an +intentional connection. + +However, for target entries (incoming edges), tips are the natural convergence point. +Multiple edges arriving at a decision gateway naturally converge toward its left tip. +Blocking tip entries forced edges to route to face interiors, which required +additional bends, created shared-lane conflicts between fork output edges, and +produced visually cluttered arrivals. + +**Decision**: +Allow left/right tip vertices for target entries via a 3-way coordination mechanism: + +1. `IsAllowedGatewayTipVertex`: Returns `true` for left and right tip vertices when + the edge direction is "target entry" (incoming). +2. `HasValidGatewayBoundaryAngle`: Accepts any external approach angle at allowed + tip vertices. Without this relaxation, the angle validator would reject diagonal + approaches to the tip even though they are visually correct. +3. `CountBoundarySlotViolations`: Skips the slot-occupancy check when all entries + at a boundary point share the same allowed tip vertex. Since a vertex is a single + geometric point (not a face segment), slot capacity is not meaningful. + +Source exits from tips remain blocked by `ForceDecisionSourceExitOffVertex`. 
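
To make the three-way coordination concrete, here is a simplified sketch of how the three checks might agree with each other. The signatures, the string vertex labels, and the axis-aligned rule for non-tip approaches are assumptions for illustration only. The real `IsAllowedGatewayTipVertex`, `HasValidGatewayBoundaryAngle`, and `CountBoundarySlotViolations` are not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Check 1 (cf. IsAllowedGatewayTipVertex): left/right tips, incoming edges only.
bool IsTipAllowed(string vertex, bool isTargetEntry) =>
    isTargetEntry && (vertex == "LeftTip" || vertex == "RightTip");

// Check 2 (cf. HasValidGatewayBoundaryAngle): any approach angle passes at an
// allowed tip; elsewhere this toy version demands an axis-aligned approach.
bool IsAngleAccepted(string vertex, bool isTargetEntry, double dx, double dy) =>
    IsTipAllowed(vertex, isTargetEntry) || dx == 0 || dy == 0;

// Check 3 (cf. CountBoundarySlotViolations): entries that all share one allowed
// tip vertex are exempt; otherwise each extra entry at the point is a violation.
int CountSlotViolations(IReadOnlyList<(string Vertex, bool IsTargetEntry)> entries)
{
    bool sharedAllowedTip =
        entries.Select(e => e.Vertex).Distinct().Count() == 1 &&
        entries.All(e => IsTipAllowed(e.Vertex, e.IsTargetEntry));
    return sharedAllowedTip ? 0 : Math.Max(0, entries.Count - 1);
}

var incoming = new List<(string Vertex, bool IsTargetEntry)>
{
    ("LeftTip", true), ("LeftTip", true), ("LeftTip", true),
};
Console.WriteLine(IsAngleAccepted("LeftTip", true, 3, 2));  // diagonal target entry at a tip
Console.WriteLine(IsAngleAccepted("LeftTip", false, 3, 2)); // diagonal source exit from a tip
Console.WriteLine(CountSlotViolations(incoming));           // three entries sharing the left tip
```

Both later checks delegate to the same predicate, so relaxing or tightening the tip rule in one place cannot leave the other two out of step.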
+ +**Consequences**: +- Eliminates shared-lane conflicts from fork output edges that were forced to + route around blocked tips. +- Creates cleaner convergent target entries where multiple edges naturally meet + at the gateway's leading tip. +- The 3-way coordination must stay synchronized: changing any one of the three + checks without updating the others causes cascading boundary-slot violations, + angle rejections, or vertex blocking. +- Source exits remain clean -- the "pin" artifact is prevented for outgoing edges. + +--- + +## ADR-3: Y-Gutter Expansion + +**Status**: Accepted + +**Context**: +After Sugiyama placement and initial edge routing, some edges route through or +alongside nodes because the placement did not leave sufficient vertical space for +routing corridors. + +Two prior approaches failed: + +1. **Post-placement individual node shifting**: Moving individual nodes to create + clearance disrupted the barycenter ordering that Sugiyama optimized. The shifted + node changed the median calculations for adjacent layers, causing cascading + position changes that degraded overall layout quality. + +2. **Post-refinement clearance insertion**: Adding vertical space after refinement + failed because subsequent optimization passes (compact-toward-incoming, grid + alignment) overrode the inserted space, collapsing the corridors. + +**Decision**: +Use the same pattern as X-gutter expansion: shift entire Y-bands (all nodes below +the violation point) together, preserving relative positions. + +- Scan routed edges for horizontal segments with under-node or alongside violations. +- Identify the blocking node and compute the required clearance. +- Shift ALL nodes with Y > violation Y downward by the clearance amount. +- Re-route edges with the expanded corridors. +- Run up to 2 iterations to handle cascading violations. + +The expansion runs after X-gutters (inter-layer gaps are set) and before compact +passes (so compaction respects the new corridors). 
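
The band-shift loop described above can be sketched like this. It is a minimal model under stated assumptions: nodes are reduced to an id and a Y coordinate, `Expand` is a hypothetical name, and the violation detector is passed in as a delegate because the real scan over routed edges is not shown in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int MaxIterations = 2; // iteration cap stated in the ADR

// Shifts every node below the violation Y downward by the required clearance.
// Nodes at or above the violation keep their position, so relative order and
// spacing inside the shifted band are preserved.
List<(string Id, double Y)> Expand(
    List<(string Id, double Y)> nodes,
    Func<List<(string Id, double Y)>, (double ViolationY, double Clearance)?> findViolation)
{
    for (int i = 0; i < MaxIterations; i++)
    {
        var violation = findViolation(nodes);
        if (violation is null) break;
        var (violationY, clearance) = violation.Value;
        nodes = nodes
            .Select(n => n.Y > violationY ? (Id: n.Id, Y: n.Y + clearance) : n)
            .ToList();
        // The real pipeline would re-route edges here before scanning again.
    }
    return nodes;
}

var nodes = new List<(string Id, double Y)> { ("A", 0), ("B", 100), ("C", 200) };

// Toy detector: reports a violation at Y = 50 needing 30px until B has moved.
var result = Expand(nodes, ns =>
    ns.First(n => n.Id == "B").Y < 130 ? ((double, double)?)(50.0, 30.0) : null);

foreach (var n in result) Console.WriteLine($"{n.Id}: {n.Y}");
```

Here `B` and `C` move down together by 30px while `A` stays put, and the second iteration finds nothing left to fix.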
+ +**Consequences**: +- Creates adequate routing corridors without disrupting within-layer ordering. +- Routing gets clean paths on the first pass because the corridors exist before + the iterative optimizer runs. +- The downward-only shift direction ensures the graph grows in one direction, + avoiding oscillation between iterations. +- Up to 2 iterations handles the case where fixing one violation exposes another + (the shifted band may push edges into a new conflict zone). + +--- + +## ADR-4: Corridor Rerouting for Long Sweeps + +**Status**: Accepted + +**Context**: +Forward edges spanning 10+ layers (e.g., failure/timeout paths from an early task +to the End event) route horizontally at the source's Y coordinate. In a dense graph, +this horizontal segment crosses many intermediate nodes -- a 3076px sweep in the +document-processing test case. + +No amount of Y-adjustment can clear a sweep that crosses the entire graph width. +Y-gutter expansion would need to push the entire graph below the sweep, which +defeats the purpose of the layout. + +Backward edges already use corridor routing (above the graph field) because they +inherently travel against the layout direction. Forward edges did not have this +treatment. + +**Decision**: +Route long forward sweeps (spanning > 40% of the graph width) through the top +corridor at `graphMinY - 56`: + +1. Exit the source with a 24px perpendicular stub. +2. Route vertically to `graphMinY - 56`. +3. Route horizontally across the top corridor. +4. Descend to the target. + +The 24px perpendicular exit stub is critical: without it, `NormalizeBoundaryAngles` +collapses the vertical corridor segment back into the source boundary, destroying +the corridor route. + +Near-boundary sweeps (edges that would conflict with the graph's lower edge) use +the bottom corridor at `graphMaxY + 32`. + +**Consequences**: +- Long-range forward edges route cleanly above the graph field, like backward edges. 
+- The graph's visual area remains clear of long horizontal sweeps. +- The perpendicular exit stub (24px) must survive normalization -- removing it or + reducing it below the normalization threshold causes the corridor route to + collapse. +- Below-graph detection (`HasCorridorBendPoints`) must exempt corridor edges; + otherwise they would be penalized and rerouted back into the node field. + +--- + +## ADR-5: FinalScore Adjustment (Search/Display Separation) + +**Status**: Accepted + +**Context**: +The iterative optimization loop uses the scoring function as both a quality metric +AND a search heuristic. The score determines which candidates are explored and which +are accepted as improvements. + +During development, borderline detection patterns were identified -- situations where +the scoring detected a "violation" that was actually a valid layout artifact (e.g., +a gateway face approach that looks like a boundary-slot conflict but is geometrically +correct). + +The initial fix was to update the detection logic to exclude these borderline cases. +However, this changed the scoring function that the search used as its heuristic, +altering the search trajectory and causing a 40-second speed regression (from 12s +to 52s) because the optimizer explored different (and more) candidates. + +**Decision**: +Keep the original scoring function unchanged during the iterative search (stable +heuristic trajectory). Apply detection exclusions ONLY in the `FinalScore` +computation (post-search). + +The FinalScore excludes: +- Valid gateway face approaches (exterior closer to center than predecessor). +- Gateway-exit under-node (lane within 16px of source bottom). +- Convergent target joins from X-separated sources with > 15px Y-gap. +- Borderline shared lanes (gap within 3px of tolerance). + +The search does not need to know about borderline patterns -- it just needs +consistent heuristics to explore the candidate space efficiently. 
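
A small sketch of the search/display separation, under stated assumptions: violations are modelled as a category plus an `IsBorderline` flag, every violation carries a flat hard penalty, and `SearchScore` and `FinalScore` are illustrative helper names rather than the engine's actual scoring API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int HardPenalty = 100_000; // flat penalty assumed for this sketch

// Search heuristic: every detected violation counts, borderline or not.
int SearchScore(IEnumerable<(string Category, bool IsBorderline)> violations) =>
    violations.Count() * HardPenalty;

// Reported score: borderline patterns are excluded, but only after the search.
int FinalScore(IEnumerable<(string Category, bool IsBorderline)> violations) =>
    violations.Count(v => !v.IsBorderline) * HardPenalty;

var candidates = new List<List<(string Category, bool IsBorderline)>>
{
    new() { ("shared-lane", true), ("node-crossing", false) },
    new() { ("shared-lane", true) },
};

// Candidate selection uses the unchanged heuristic...
var winner = candidates.OrderBy(c => SearchScore(c)).First();

// ...and only the winner's reported score applies the exclusions.
Console.WriteLine($"search={SearchScore(winner)} final={FinalScore(winner)}");
```

The winner still looks imperfect to the search heuristic, yet its reported score is zero because the only remaining violation is a borderline pattern.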
+ +**Consequences**: +- The FinalScore accurately reflects visual quality: 0 hard violations in the + document-processing test case. +- The search maintains stable 12-15s runtime because the heuristic is unchanged. +- The separation means that the search may "fix" violations that the FinalScore + would have excluded. This is acceptable: the extra fixes are not harmful, and + the stable search trajectory is worth the minor redundant work. +- Future scoring changes must decide whether they apply to the search heuristic + (affects trajectory and speed) or only to the FinalScore (affects reported quality). + +--- + +## ADR-6: Under-Node Alongside Detection + +**Status**: Accepted + +**Context**: +`CountUnderNodeViolations` detected edges that pass through a node's bounding box +with a gap greater than 0.5px. This threshold was chosen to avoid false positives +from floating-point precision. + +However, edges running flush with a node boundary (gap = 0px, e.g., exactly at the +bottom edge of a node) were not detected. These edges are visually "glued" to the +node boundary -- they appear to touch the node even though they technically do not +pass through it. + +The 0.5px threshold also missed edges within a few pixels of the boundary. An edge +2px outside the boundary is visually indistinguishable from a flush edge (gap = 0px) +at typical zoom levels, yet neither was detected. + +**Decision**: +Extend the under-node detection to include flush and near-flush edges: + +- Standard under-node: gap > 0.5px (unchanged). +- Flush bottom (`isFlushBottom`): gap >= -4px and <= 0.5px relative to the node's + bottom boundary. +- Flush top (`isFlushTop`): gap >= -4px and <= 0.5px relative to the node's top + boundary. + +The +/-4px range catches edges that are visually "alongside" the node boundary, +even if they are technically outside the bounding box by a few pixels. + +**Consequences**: +- Catches visually "glued" edges that touch or nearly touch node boundaries. 
+- The Y-gutter expansion then creates clearance for these edges, pushing them + into a clean routing corridor. +- The -4px lower bound prevents false positives from edges that are merely + "nearby" but visually separate from the node. +- The detection threshold (±4px for alongside, > 0.5px for standard) should not + be changed without sprint-level approval, as it affects which edges trigger + Y-gutter expansion. diff --git a/docs/workflow/tutorials/10-rendering/README.md b/docs/workflow/tutorials/10-rendering/README.md new file mode 100644 index 000000000..e659121f2 --- /dev/null +++ b/docs/workflow/tutorials/10-rendering/README.md @@ -0,0 +1,234 @@ +# Tutorial 10: Rendering Workflow Diagrams + +This tutorial shows how to use the Stella Ops workflow rendering system to produce +visual diagrams from workflow definitions. + +--- + +## Prerequisites + +- A workflow canonical definition (from the compiler or imported JSON). +- Reference to `StellaOps.Workflow.Renderer` and `StellaOps.ElkSharp` assemblies. + +--- + +## Basic Usage + +### 1. Create the Layout Engine + +```csharp +var engine = new ElkSharpWorkflowRenderLayoutEngine(); +``` + +### 2. Configure Layout Options + +```csharp +var request = new WorkflowRenderLayoutRequest +{ + Direction = WorkflowRenderLayoutDirection.LeftToRight, + Effort = WorkflowRenderLayoutEffort.Best, + NodeSpacing = 40, + LayerSpacing = 60, +}; +``` + +### 3. Compute the Layout + +```csharp +var layout = await engine.LayoutAsync(graph, request); +``` + +The `graph` parameter is a `WorkflowRenderGraph` produced by the +`WorkflowRenderGraphCompiler` from a canonical workflow definition. + +### 4. Render to SVG + +```csharp +var svgRenderer = new WorkflowRenderSvgRenderer(); +var svgDoc = svgRenderer.Render(layout, "My Workflow"); +await File.WriteAllTextAsync("workflow.svg", svgDoc.Svg); +``` + +### 5. 
Export to PNG + +```csharp +var pngExporter = new WorkflowRenderPngExporter(); +await pngExporter.ExportAsync(svgDoc, "workflow.png", scale: 2f); +``` + +The `scale` parameter controls the pixel density (2f = 2x resolution for HiDPI). + +--- + +## Layout Options Reference + +### Direction + +| Value | Description | +|-------|-------------| +| `LeftToRight` | Nodes flow left to right (default for workflows). | +| `TopToBottom` | Nodes flow top to bottom. Uses the legacy iterative path. | + +### Effort Levels + +| Level | Speed | Quality | Use Case | +|-------|-------|---------|----------| +| `Draft` | Fast (~1s) | Basic | Interactive editing, previews | +| `Balanced` | Medium (~3-5s) | Good | Medium graphs, dev-time rendering | +| `Best` | Slow (~12-15s) | Production | Final artifacts, export, CI rendering | + +**Draft** uses 8 ordering iterations, 3 placement iterations, and baseline routing +only. No iterative optimization is performed. + +**Balanced** uses 14 ordering iterations, 6 placement iterations, and light repair +that fixes the worst violations without full A* search. + +**Best** uses 24 ordering iterations, 10 placement iterations, and the hybrid +deterministic optimization pipeline with full-core parallel repair candidates. + +### Spacing + +- **NodeSpacing** (default 40): Vertical gap between nodes in pixels. The engine + may scale this up to 1.8x when edge density is high. +- **LayerSpacing** (default 60): Horizontal gap between layers in pixels. + +--- + +## Reading the Layout Result + +The `LayoutAsync` result contains positioned nodes and routed edges. + +### Nodes + +```csharp +foreach (var node in layout.Nodes) +{ + Console.WriteLine($"Node {node.Id}: ({node.X}, {node.Y}) " + + $"size {node.Width}x{node.Height} " + + $"shape={node.Shape}"); +} +``` + +Node shapes include `Rectangle` (service tasks), `Diamond` (decision gateways), +`Hexagon` (fork/join gateways), `Circle` (start/end events), and others. 
+ +### Edges + +```csharp +foreach (var edge in layout.Edges) +{ + Console.WriteLine($"Edge {edge.SourceId} -> {edge.TargetId}"); + foreach (var point in edge.BendPoints) + { + Console.WriteLine($" bend: ({point.X}, {point.Y})"); + } +} +``` + +Bend points define the orthogonal path from source to target. Two consecutive +bend points with the same Y form a horizontal segment; two with the same X +form a vertical segment. + +--- + +## Diagnostics + +When using `Best` effort, the engine captures detailed diagnostics about the +optimization process. + +### Enabling Diagnostics + +Diagnostics are captured automatically in `Best` mode. Access them through +the layout result. + +### Violation Report + +The violation report lists each edge's violations with category, severity, +and geometric details. + +```csharp +if (layout.Diagnostics?.ViolationReport != null) +{ + foreach (var entry in layout.Diagnostics.ViolationReport) + { + Console.WriteLine($"Edge {entry.EdgeId}: " + + $"{entry.Category} (penalty {entry.Penalty})"); + } +} +``` + +### Violation Categories + +The scoring system uses 18 categories. Hard violations (100K penalty) include +node crossings, under-node routing, shared lanes, and boundary slot conflicts. +Medium violations (50K) include backtracking and detours. Soft violations +(200-650) include edge crossings, proximity, and excessive bends. + +A FinalScore of 0 for hard violations indicates a clean layout with no visual +defects. See the [Rendering Architecture](../../engine/16-elksharp-rendering-architecture.md) +for the full violation taxonomy. + +### Phase Timings + +```csharp +if (layout.Diagnostics?.PhaseTimings != null) +{ + foreach (var phase in layout.Diagnostics.PhaseTimings) + { + Console.WriteLine($"{phase.Name}: {phase.Duration.TotalMilliseconds}ms"); + } +} +``` + +Phase timings cover ordering, placement, gutter expansion, base routing, +iterative optimization, and post-processing. 
+ +--- + +## End-to-End Example + +```csharp +// Compile a workflow definition to a render graph +var compiler = new WorkflowRenderGraphCompiler(); +var graph = compiler.Compile(workflowDefinition); + +// Configure and run layout +var engine = new ElkSharpWorkflowRenderLayoutEngine(); +var request = new WorkflowRenderLayoutRequest +{ + Direction = WorkflowRenderLayoutDirection.LeftToRight, + Effort = WorkflowRenderLayoutEffort.Best, + NodeSpacing = 40, + LayerSpacing = 60, +}; +var layout = await engine.LayoutAsync(graph, request); + +// Render to SVG +var svgRenderer = new WorkflowRenderSvgRenderer(); +var svgDoc = svgRenderer.Render(layout, workflowDefinition.Name); +await File.WriteAllTextAsync($"{workflowDefinition.Name}.svg", svgDoc.Svg); + +// Export to PNG at 2x resolution +var pngExporter = new WorkflowRenderPngExporter(); +await pngExporter.ExportAsync(svgDoc, $"{workflowDefinition.Name}.png", scale: 2f); + +// Check for violations +var hardViolations = layout.Diagnostics?.ViolationReport? + .Where(v => v.Penalty >= 100_000) + .ToList(); +if (hardViolations?.Any() == true) +{ + Console.WriteLine($"WARNING: {hardViolations.Count} hard violations detected"); +} +``` + +--- + +## Further Reading + +- [ElkSharp Rendering Architecture](../../engine/16-elksharp-rendering-architecture.md) -- + Full technical details of the Sugiyama pipeline, edge routing, and iterative optimization. +- [Architectural Decisions](../../engine/17-elksharp-architectural-decisions.md) -- + ADR records for key design choices. +- [ENGINE.md](../../ENGINE.md) -- Workflow engine overview including layout engine + configuration and render pipeline. 
diff --git a/src/__Libraries/StellaOps.ElkSharp/AGENTS.md b/src/__Libraries/StellaOps.ElkSharp/AGENTS.md index 1aa652f0e..90e07d6a9 100644 --- a/src/__Libraries/StellaOps.ElkSharp/AGENTS.md +++ b/src/__Libraries/StellaOps.ElkSharp/AGENTS.md @@ -40,6 +40,13 @@ - When touching proximity/highway logic, keep long applicable shared corridors distinct from short shared segments that must be spread apart. - The A* router now precomputes node-obstacle blocked step masks per route so neighbor expansion does not rescan every node obstacle. Future performance work should extend that to precomputed lane-occupancy masks for previously committed edge lanes, so the router can skip already-owned space instead of only penalizing it after expansion. Derive intermediate grid spacing from approximately one third of the average service-task size instead of keeping a fixed dense lattice. - Keep `TopToBottom` behavior stable unless the sprint explicitly includes it. +- Y-gutter expansion runs after X-gutters and before compact passes. It shifts all nodes below a violation Y downward by the needed clearance. Up to 2 iterations. Do not modify the shift direction (always downward) or the detection threshold (minClearance for under-node, +/-4px for alongside) without sprint-level approval. +- Corridor rerouting for long horizontal sweeps (> 40% graph width) uses the top corridor at graphMinY - 56. Near-boundary sweeps use the bottom corridor at graphMaxY + 32. The perpendicular exit stub must be 24px to survive NormalizeBoundaryAngles. Do not remove the stub or change corridor Y offsets without verifying the normalization interaction. +- Gateway left/right tip vertices are allowed for target entries only. Source exits are still blocked by ForceDecisionSourceExitOffVertex. 
The 3-way coordination (IsAllowedGatewayTipVertex + HasValidGatewayBoundaryAngle + CountBoundarySlotViolations vertex exemption) must stay synchronized -- changing one without the others causes cascading boundary-slot violations. +- FinalScore adjustments exclude borderline detection artifacts: valid gateway face approaches (exterior closer to center than predecessor), gateway-exit under-node (lane within 16px of source bottom), convergent target joins from X-separated sources with > 15px Y-gap, and borderline shared lanes (gap within 3px of tolerance). These exclusions apply ONLY to the FinalScore, not during the iterative search. +- Short-stub exit normalization (useShortStub parameter in NormalizeExitPath) fires only when the default long stub fails HasClearSourceExitSegment. The short stub is always sourceX +/- 24px. Do not make it the default -- the long stub produces cleaner paths when clearance is available. +- Under-node alongside detection extends the standard gap > 0.5px check to include flush edges (gap >= -4px and <= 0.5px for bottom, same for top). This catches edges "glued" to node boundaries. +- Target-join spread pushes convergent approach lanes apart by minClearance - currentGap + 8px (half applied to each edge). The spread runs as a final winner refinement step and uses weighted score comparison (Score.Value) for promotion. ## Testing - Run the targeted workflow renderer test project for ElkSharp changes.