From e91cf98f8fb87537bfa92de8ddfb7ba7fc602afe Mon Sep 17 00:00:00 2001
From: master <>
Date: Mon, 30 Mar 2026 11:37:32 +0300
Subject: [PATCH] Add ElkSharp rendering architecture docs, ADRs, tutorial,
 AGENTS rules
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Five documentation deliverables for the ElkSharp rendering improvements:

1. docs/workflow/engine/16-elksharp-rendering-architecture.md (453 lines)
   Full pipeline: Sugiyama stages, edge routing strategies, hybrid
   deterministic mode, gateway geometry, 18-category scoring system,
   corridor routing, Y-gutter expansion, diagnostics.

2. docs/workflow/engine/17-elksharp-architectural-decisions.md (259 lines)
   Six ADRs: short-stub normalization, gateway vertex entries, Y-gutter
   expansion, corridor rerouting, FinalScore adjustment, alongside
   detection.

3. docs/workflow/tutorials/10-rendering/README.md (234 lines)
   Practical tutorial: setup, layout options, SVG/PNG rendering,
   diagnostics capture, violation reports, full end-to-end example.

4. src/__Libraries/StellaOps.ElkSharp/AGENTS.md — 7 new local rules
   for Y-gutter, corridor reroute, gateway vertices, FinalScore
   adjustments, short-stub normalization, alongside detection,
   target-join spread.

5. docs/workflow/ENGINE.md — replaced monolithic ElkSharp paragraph
   with structured pipeline overview, effort-level table, and links
   to the new architecture docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 docs/workflow/ENGINE.md                       |  60 ++-
 .../16-elksharp-rendering-architecture.md     | 453 ++++++++++++++++++
 .../17-elksharp-architectural-decisions.md    | 259 ++++++++++
 .../workflow/tutorials/10-rendering/README.md | 234 +++++++++
 src/__Libraries/StellaOps.ElkSharp/AGENTS.md  |   7 +
 5 files changed, 997 insertions(+), 16 deletions(-)
 create mode 100644 docs/workflow/engine/16-elksharp-rendering-architecture.md
 create mode 100644 docs/workflow/engine/17-elksharp-architectural-decisions.md
 create mode 100644 docs/workflow/tutorials/10-rendering/README.md

diff --git a/docs/workflow/ENGINE.md b/docs/workflow/ENGINE.md
index 49bdb62cf..04961a3f3 100644
--- a/docs/workflow/ENGINE.md
+++ b/docs/workflow/ENGINE.md
@@ -1,4 +1,4 @@
-﻿# Serdica Workflow Engine
+# Serdica Workflow Engine
 
 A declarative, plugin-based workflow engine for long-running insurance business processes. Replaces Camunda BPMN with a native C# fluent DSL, canonical JSON schema, durable signal-based execution, and multi-backend persistence.
 
@@ -477,10 +477,10 @@ The expression system evaluates declarative expressions at runtime without recom
 ### Path Navigation
 
 Paths navigate the execution context:
-- `start.*` â€” Start request fields
-- `state.*` â€” Current workflow state
-- `payload.*` â€” Current task completion payload
-- `result.*` â€” Step result (when `resultKey` is set)
+- `start.*` — Start request fields
+- `state.*` — Current workflow state
+- `payload.*` — Current task completion payload
+- `result.*` — Step result (when `resultKey` is set)
 
 ### Binary Operators
 
@@ -494,10 +494,10 @@ Paths navigate the execution context:
 | `lte` | `WorkflowExpr.Lte(a, b)` | Less or equal |
 | `and` | `WorkflowExpr.And(a, b)` | Logical AND |
 | `or` | `WorkflowExpr.Or(a, b)` | Logical OR |
-| `add` | â€” | Arithmetic addition |
-| `subtract` | â€” | Arithmetic subtraction |
-| `multiply` | â€” | Arithmetic multiplication |
-| `divide` | â€” | Arithmetic division |
+| `add` | — | Arithmetic addition |
+| `subtract` | — | Arithmetic subtraction |
+| `multiply` | — | Arithmetic multiplication |
+| `divide` | — | Arithmetic division |
 
 ### Built-in Functions
 
@@ -698,9 +698,9 @@ Configured via `GenericAssignmentPermissions.AdminRoles` (appsettings).
 ### Effective Roles
 
 A task's `EffectiveRoles` combines:
-1. `WorkflowRoles` â€” from the workflow definition
-2. `TaskRoles` â€” from the task definition
-3. `RuntimeRoles` â€” computed at runtime via expression
+1. `WorkflowRoles` — from the workflow definition
+2. `TaskRoles` — from the task definition
+3. `RuntimeRoles` — computed at runtime via expression
 
 If `TaskRoles` are specified, they narrow the effective roles. Otherwise, `WorkflowRoles` apply.
 
@@ -739,8 +739,8 @@ Plugins load in the order specified by `PluginsConfig.PluginsOrder` in appsettin
 
 ### Marker Interfaces
 
-- `IWorkflowBackendRegistrationMarker` â€” validates backend plugin is loaded
-- `IWorkflowSignalDriverRegistrationMarker` â€” validates signal driver is loaded
+- `IWorkflowBackendRegistrationMarker` — validates backend plugin is loaded
+- `IWorkflowSignalDriverRegistrationMarker` — validates signal driver is loaded
 
 Startup validation throws `InvalidOperationException` if a configured provider is missing its plugin.
 
@@ -920,7 +920,7 @@ The engine can render workflow definitions as visual diagrams.
 
 | Engine | Description |
 |--------|-------------|
-| **ElkSharp** | Port of Eclipse Layout Kernel (default). In `Best` effort mode it runs a deterministic iterative multi-strategy orthogonal router after base routing, scoring candidate layouts across crossings, proximity, labels, target-approach joins, detours, target-approach backtracking, and entry geometry before selecting the best valid result. Attempt 1 remains the only full-strategy reroute; later attempts repair only the currently penalized lanes or exact conflict peers, with shortest-path detours prioritized first, a direct orthogonal shortcut tried before broader rerouting, and corridor-like overshoots only eligible when a clean orthogonal shortcut actually exists. Local-repair candidate building may run in parallel inside an attempt, but builds that touch the same source/target neighborhood are lock-serialized and the final apply order remains deterministic. Small or protected graphs keep the baseline route to preserve established sink-corridor, backward-edge, and port-anchor contracts, while larger congested graphs use the iterative sweep. Final strategy acceptance re-validates post-processed output so remaining broken short highways and non-applicable target-side approach joins are retried instead of being selected, while other soft-rule regressions get bounded multi-attempt retries and a wider but finite strategy sweep before fallback selection. The current A* pathfinder precomputes node-obstacle blocked step masks per route and uses lighter soft-obstacle rejection checks before exact geometry tests, materially reducing route-all-edges time without changing selected-path semantics. A final cheap geometry-repair pass cleans node-side entry/exit angles, target-slot spacing, repeat-collector return lanes, and target-side backtracking without re-running whole-graph A*. 
Rectangular boundary joins are constrained to a discrete slot lattice so one edge cannot silently concentrate on top of another: `left`/`right` faces may use at most `3` evenly spread side slots, `top`/`bottom` faces may use at most `5`, and the realized slot span matches the same safe boundary inset used by rectangle entry/exit normalization. Gateway faces are limited to `1` centered slot or `2` centered slots, singleton entries and preserved repeat/corridor exits are scored against the same centered lattice instead of being exempt, and the final slot snap can relax the generic shared-lane validator when the centered repair is still obstacle-safe and boundary-valid. Winner refinement now ends with a boundary-slot restabilization pass as well, so late shared-lane or under-node cleanup cannot drift decision/branch source exits back off the assigned lattice. Shortest-path local repair now also reuses interior axes from the current path and tries a raw-clearance obstacle-skirt fallback before accepting a wider preserved overshoot, which lets detour cleanup collapse onto an honest existing lane when the expanded-clearance candidates stay unnecessarily high. Decision/Fork/Join gateway nodes use a gateway-specific boundary algorithm instead of rectangular side snapping: off-axis lanes land on the actual polygon boundary, gateway target slots are derived from polygon-face intersections instead of rectangular side slots, gateway faces use only `1` or `2` centered face slots, short 45-degree diagonal stubs are allowed only on gateway side faces, corner-vertex diagonals are rejected, gateway-target arrival repair now also forbids tiny orthogonal last-moment hooks that change direction within less than one node depth of the boundary, and any retained 45-degree segment longer than one average node-shape length is rejected during scoring and artifact verification. 
Gateway-source dominant-axis detour checks are opportunity-gated, so obstacle-blocked gateway exits may keep a short local dogleg when there is no clean downstream-facing repair, while the artifact tests still enforce blocker clearance and keep those local exits out of unrelated node clearance bands. Repeat-collector lanes that preserve an outer corridor can still locally reroute their pre-corridor prefix when that prefix crosses a node, so node-safety no longer depends on skipping the edge outright. ElkSharp also supports strict compound-node trees through `ParentNodeId`: leaf nodes and empty parents receive layered positions, non-empty parent rectangles are derived bottom-up from descendant bounds plus `CompoundPadding` and `CompoundHeaderHeight`, exported child coordinates remain absolute, and cross-compound edges now include explicit parent-boundary crossing points. In v1, real edges must still terminate on leaves; non-leaf compound parents remain grouping-only containers and may not declare explicit ports. The document-processing artifact test emits both a live progress log and per-attempt phase timings/route-pass counts alongside the SVG/PNG/JSON diagnostics so long-running strategy searches can be inspected while they are still running and profiled after completion. `Draft` and `Balanced` keep the base route unless library callers opt in through ElkSharp layout options. |
+| **ElkSharp** | Pure C# Sugiyama layout engine (default). See [Rendering Architecture](engine/16-elksharp-rendering-architecture.md) and [Architectural Decisions](engine/17-elksharp-architectural-decisions.md). |
 | **ElkJS** | JavaScript-based ELK via Node.js |
 | **MSAGL** | Microsoft Automatic Graph Layout |
 
@@ -934,6 +934,34 @@ The engine can render workflow definitions as visual diagrams.
 }
 ```
 
+### ElkSharp Layout Pipeline
+
+ElkSharp uses a Sugiyama-based layered graph layout with deterministic iterative edge routing.
+
+**Pipeline stages:**
+1. **Layer assignment** -- depth-first traversal assigns layer indices
+2. **Node ordering** -- barycenter median sorting (8-24 iterations)
+3. **Placement** -- median-of-incoming positioning with grid alignment
+4. **Gutter expansion** -- X-gutters widen inter-layer corridors; Y-gutters create vertical routing space
+5. **Edge routing** -- channel-based orthogonal routing with A* 8-direction pathfinding
+6. **Iterative optimization** -- hybrid deterministic repair: score -> plan -> batch -> parallel A* -> merge -> refine
+7. **Post-processing** -- boundary normalization, gateway geometry, corridor rerouting, FinalScore calibration
+
+**Effort levels:**
+| Level | Ordering | Placement | Routing |
+|-------|----------|-----------|---------|
+| Draft | 8 iter | 3 iter | Baseline only |
+| Balanced | 14 iter | 6 iter | Baseline + light repair |
+| Best | 24 iter | 10 iter | Hybrid deterministic with full-core parallel repair |
+
+**Layout options:**
+- `Direction`: LeftToRight (default for workflows) or TopToBottom
+- `NodeSpacing`: vertical gap between nodes (default 40, scaled by edge density up to 1.8x)
+- `LayerSpacing`: horizontal gap between layers (default 60)
+- `Effort`: Draft / Balanced / Best
+
+For detailed architecture, see [ElkSharp Rendering Architecture](engine/16-elksharp-rendering-architecture.md).
+
 ### Render Pipeline
 
 ```
@@ -955,7 +983,7 @@ WorkflowCanonicalDefinition
 | `WorkflowRuntimeStateConcurrencyException` | Duplicate/stale signal delivery | Auto-handled by signal pump (completes lease) |
 | `BaseResultException` | Business validation failure (not found, denied) | Returns error to caller |
 | `TimeoutException` | Transport or step timeout exceeded | Executes `WhenTimeout` branch if configured |
-| `NotSupportedException` | Unsupported operation (e.g., null signal store) | Configuration error â€” check plugin loading |
+| `NotSupportedException` | Unsupported operation (e.g., null signal store) | Configuration error — check plugin loading |
 
 ### Signal Retry Behavior
 
diff --git a/docs/workflow/engine/16-elksharp-rendering-architecture.md b/docs/workflow/engine/16-elksharp-rendering-architecture.md
new file mode 100644
index 000000000..3a8db3ac2
--- /dev/null
+++ b/docs/workflow/engine/16-elksharp-rendering-architecture.md
@@ -0,0 +1,453 @@
+# ElkSharp Rendering Architecture
+
+## Overview
+
+ElkSharp is a deterministic, in-process Sugiyama-based graph layout engine for workflow
+visualization. It replaces external layout dependencies (ElkJS/Node.js) with a pure C#
+implementation that produces identical output for identical input, regardless of host
+platform, thread scheduling, or execution environment.
+
+The engine handles the full pipeline from abstract workflow graphs to positioned nodes
+and routed edges, suitable for SVG/PNG/JSON rendering. It supports left-to-right and
+top-to-bottom layout directions, three effort levels (Draft, Balanced, Best), compound
+node hierarchies, and gateway-specific polygon geometry (diamond decisions, hexagon
+forks/joins).
+
+---
+
+## Sugiyama Pipeline
+
+The core layout algorithm follows the Sugiyama framework for layered graph drawing,
+extended with workflow-specific constraints.
+
+### 1. Layer Assignment
+
+Depth-first traversal assigns a layer index to each node. The layer index determines
+the horizontal position (in left-to-right mode) or vertical position (in top-to-bottom
+mode) of each node.
+
+- Start nodes are assigned layer 0.
+- Each successor is assigned `max(predecessor layers) + 1`.
+- Backward edges (cycles, repeat connectors) are identified and handled separately
+  so they do not influence forward layer assignment.
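+
+The layering rule above can be sketched as a memoized depth-first walk. This is a
+minimal illustration, assuming backward edges have already been filtered out so the
+input is acyclic; `LayerAssignmentSketch` and its signature are hypothetical and not
+part of the ElkSharp API.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical helper for illustration only; not ElkSharp API.
+public static class LayerAssignmentSketch
+{
+    // edges: forward (source, target) pairs. Backward edges are assumed to be
+    // removed beforehand, so the graph is acyclic.
+    public static Dictionary<string, int> Assign(
+        IReadOnlyCollection<string> nodes,
+        IReadOnlyCollection<(string Source, string Target)> edges)
+    {
+        var predecessors = nodes.ToDictionary(n => n, _ => new List<string>());
+        foreach (var (source, target) in edges)
+        {
+            predecessors[target].Add(source);
+        }
+
+        var layers = new Dictionary<string, int>();
+
+        // Depth-first walk over predecessors; results are cached in `layers`.
+        int Visit(string node)
+        {
+            if (layers.TryGetValue(node, out var known))
+            {
+                return known;
+            }
+
+            // Start nodes (no predecessors) get layer 0; every other node gets
+            // max(predecessor layers) + 1.
+            var layer = predecessors[node].Count == 0
+                ? 0
+                : predecessors[node].Max(p => Visit(p)) + 1;
+            layers[node] = layer;
+            return layer;
+        }
+
+        foreach (var node in nodes)
+        {
+            Visit(node);
+        }
+
+        return layers;
+    }
+}
+```
+
+For edges `a->b`, `b->c`, and `a->c`, node `c` lands on layer 2 because its deepest
+predecessor `b` sits on layer 1.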
+
+### 2. Dummy Node Insertion
+
+Edges that span more than one layer are split into chains of single-layer segments.
+Each intermediate layer receives a "dummy node" -- a zero-size virtual node that
+serves as a routing waypoint.
+
+- Dummy nodes inherit the edge's channel classification (forward, backward, sink).
+- After routing, dummy chains are reconstructed back into the original multi-layer
+  edge with bend points at each dummy position.
+
+### 3. Node Ordering
+
+Barycenter-based median ordering minimizes edge crossings within each layer.
+
+- The algorithm sweeps forward and backward across layers, computing each node's
+  barycenter (average position of connected nodes in adjacent layers).
+- Nodes are sorted by barycenter within their layer.
+- Iteration count depends on effort level:
+  - Draft: 8 iterations
+  - Balanced: 14 iterations
+  - Best: 24 iterations
+- Tie-breaking is deterministic (stable sort by node ID) to ensure reproducibility.
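+
+One sweep step over a single layer might look like the sketch below. It assumes the
+adjacent layer is already ordered and that unconnected nodes sort last; both the
+type name and the handling of unconnected nodes are assumptions made for
+illustration, not confirmed ElkSharp behavior.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical helper for illustration only; not ElkSharp API.
+public static class BarycenterOrderingSketch
+{
+    // fixedPositions: index of each node in the adjacent, already ordered layer.
+    // neighbors: for each node in the free layer, its connected nodes in that layer.
+    public static List<string> OrderLayer(
+        IEnumerable<string> freeLayer,
+        IReadOnlyDictionary<string, int> fixedPositions,
+        IReadOnlyDictionary<string, IReadOnlyList<string>> neighbors)
+    {
+        // Barycenter = average position of the connected nodes.
+        double Barycenter(string node) =>
+            neighbors.TryGetValue(node, out var adjacent) && adjacent.Count > 0
+                ? adjacent.Average(n => (double)fixedPositions[n])
+                : double.MaxValue; // assumption: unconnected nodes sort last
+
+        // OrderBy is a stable sort; the ordinal ID comparison breaks ties
+        // deterministically so repeated runs give the same order.
+        return freeLayer
+            .OrderBy(node => Barycenter(node))
+            .ThenBy(node => node, StringComparer.Ordinal)
+            .ToList();
+    }
+}
+```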
+
+### 4. Initial Placement
+
+After ordering, nodes receive Y-coordinates (in left-to-right mode) based on the
+median of incoming connection Y-centers.
+
+- Linear spacing enforcement guarantees at least `NodeSpacing` between adjacent nodes.
+- Nodes with no incoming connections use the median of outgoing connection positions.
+- Grid alignment snaps positions to the nearest grid line for visual consistency.
+
+### 5. Placement Refinement
+
+Multiple refinement passes adjust positions to reduce visual clutter:
+
+- **Preferred-center pull**: Nodes shift toward the center of their connected
+  neighbors' Y-range, weighted by connection count.
+- **Snap-to-grid**: Positions align to a grid derived from `NodeSpacing`.
+- **Compact-toward-incoming**: Nodes pull toward their primary incoming edge
+  direction to reduce edge length and crossings.
+
+Refinement iteration count scales with effort level (3 / 6 / 10 for Draft / Balanced / Best).
+
+### 6. Y-Gutter Expansion
+
+After edge routing identifies under-node violations (edges running through or
+alongside nodes), the engine shifts entire Y-bands downward to create routing
+corridors.
+
+- Runs after X-gutter expansion (section 7 below) and before compact passes.
+- Scans routed edges for horizontal segments that violate under-node or alongside
+  clearance rules.
+- Identifies blocking nodes and computes the required clearance gap.
+- Shifts ALL nodes below the violation Y downward by the clearance amount.
+  This preserves relative ordering within each layer.
+- Re-routes edges with the expanded corridors.
+- Up to 2 iterations to handle cascading violations.
+
+The key insight is that shifting individual nodes disrupts the Sugiyama median-based
+optimization, causing cascading layout degradation. Shifting entire Y-bands (like
+X-gutter expansion does for inter-layer gaps) preserves within-layer relationships.
+
+### 7. X-Gutter Expansion
+
+Inter-layer horizontal gaps are widened to provide edge corridor space.
+
+- The base gap is `LayerSpacing` (default 60px).
+- When edge density between two layers exceeds a threshold, the gap is scaled
+  up to 1.8x to accommodate additional routing channels.
+- Expansion is computed before edge routing so that the router has adequate space
+  for orthogonal paths.
+
+---
+
+## Edge Routing
+
+### Channel Assignment
+
+Each edge is classified into one of three routing channels:
+
+- **Forward**: Source layer < target layer (the common case).
+- **Backward**: Source layer > target layer (repeat/loop connectors).
+- **Sink**: Edges to terminal nodes (End events) that may use special corridors.
+
+Channel classification determines routing priority, corridor eligibility, and
+post-processing rules.
+
+### Base Routing
+
+Orthogonal bend-point construction builds an initial route from source to target:
+
+1. Exit the source node perpendicular to its boundary.
+2. Route horizontally through inter-layer gutters.
+3. Enter the target node perpendicular to its boundary.
+4. Insert 90-degree bends at each direction change.
+
+The base router respects node obstacles, avoiding routes that cross through node
+rectangles or polygons.
+
+### Dummy Edge Reconstruction
+
+After routing, multi-layer dummy chains are merged back to original edges:
+
+- Bend points from each dummy segment are concatenated.
+- Redundant collinear points are removed.
+- The result is a single edge with bend points at each layer transition.
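+
+The collinear-point cleanup can be sketched as follows. This is an illustrative
+helper under the assumption that bend points are plain X/Y pairs; `Point` and
+`BendPointSketch` are hypothetical names, and the real reconstruction may apply
+additional rules.
+
+```csharp
+using System;
+using System.Collections.Generic;
+
+// Hypothetical types for illustration only; not ElkSharp API.
+public readonly record struct Point(double X, double Y);
+
+public static class BendPointSketch
+{
+    // Takes the concatenated bend points of all dummy segments and drops
+    // duplicates and points that lie on a straight continuation.
+    public static List<Point> RemoveCollinear(IReadOnlyList<Point> points)
+    {
+        var result = new List<Point>();
+        foreach (var point in points)
+        {
+            // Adjacent dummy segments share their joint; skip exact repeats.
+            if (result.Count > 0 && result[^1] == point)
+            {
+                continue;
+            }
+
+            // If the last kept point sits on the straight line between its
+            // predecessor and the new point, it is redundant.
+            while (result.Count >= 2 && IsStraightContinuation(result[^2], result[^1], point))
+            {
+                result.RemoveAt(result.Count - 1);
+            }
+
+            result.Add(point);
+        }
+
+        return result;
+    }
+
+    // True when a -> b -> c keeps the same direction (zero cross product,
+    // positive dot product), so removing b does not change the drawn path.
+    private static bool IsStraightContinuation(Point a, Point b, Point c)
+    {
+        var cross = (b.X - a.X) * (c.Y - b.Y) - (b.Y - a.Y) * (c.X - b.X);
+        var dot = (b.X - a.X) * (c.X - b.X) + (b.Y - a.Y) * (c.Y - b.Y);
+        return Math.Abs(cross) < 1e-9 && dot > 0;
+    }
+}
+```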
+
+### Anchor Snapping
+
+Edge endpoints are projected onto actual node shape boundaries:
+
+- **Rectangle**: Standard side intersection (left, right, top, bottom).
+- **Diamond** (decision gateways): Intersection with the diamond's four edges,
+  producing diagonal approach stubs.
+- **Hexagon** (fork/join gateways): Intersection with the hexagon's six edges,
+  with asymmetric shoulder geometry.
+
+Anchor snapping runs after routing so that bend points near the boundary
+produce clean visual connections.
+
+---
+
+## Iterative Optimization (Hybrid Deterministic Path)
+
+The `Best` effort level activates hybrid deterministic optimization, which repairs
+routing violations without disrupting the Sugiyama node placement.
+
+### 1. Baseline Evaluation
+
+The baseline route is scored using an 18-category violation taxonomy. Each edge
+receives a violation list with severity-weighted penalties. The total score is the
+sum of all edge scores.
+
+### 2. Repair Planning
+
+Penalized edges are identified from the violation severity map. The planner
+extracts the specific violations for each edge and determines which edges need
+repair and in what priority order.
+
+High-severity violations (node crossings, under-node, shared lanes) take priority
+over medium-severity (backtracking, detours) and soft-severity (edge crossings,
+proximity, bends).
+
+### 3. Conflict-Zone Batching
+
+Independent repair candidates are grouped by shared source, shared target, or
+shared corridor zone. Edges in the same conflict zone are batched together so
+that their repairs are coordinated rather than competing.
+
+Batching ensures that fixing one edge does not create a new violation for a
+nearby edge in the same zone.
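+
+One way to form such batches is a union-find pass over the penalized edges. The
+sketch below groups only by shared source or shared target; corridor zones are
+omitted, and `ConflictBatchSketch` is a hypothetical name rather than the engine's
+actual batching code.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical helper for illustration only; not ElkSharp API.
+public static class ConflictBatchSketch
+{
+    public static List<List<string>> Batch(
+        IReadOnlyList<(string Id, string Source, string Target)> edges)
+    {
+        // parent[i] points toward the representative of edge i's batch.
+        var parent = Enumerable.Range(0, edges.Count).ToArray();
+
+        int Find(int i)
+        {
+            while (parent[i] != i)
+            {
+                parent[i] = parent[parent[i]]; // path halving
+                i = parent[i];
+            }
+
+            return i;
+        }
+
+        // The smaller index always becomes the root, which keeps the batch
+        // order independent of union order.
+        void Union(int a, int b)
+        {
+            var rootA = Find(a);
+            var rootB = Find(b);
+            if (rootA != rootB)
+            {
+                parent[Math.Max(rootA, rootB)] = Math.Min(rootA, rootB);
+            }
+        }
+
+        var firstBySource = new Dictionary<string, int>();
+        var firstByTarget = new Dictionary<string, int>();
+        for (var i = 0; i < edges.Count; i++)
+        {
+            if (firstBySource.TryGetValue(edges[i].Source, out var sameSource))
+                Union(i, sameSource);
+            else
+                firstBySource[edges[i].Source] = i;
+
+            if (firstByTarget.TryGetValue(edges[i].Target, out var sameTarget))
+                Union(i, sameTarget);
+            else
+                firstByTarget[edges[i].Target] = i;
+        }
+
+        // Batches are emitted in order of their first edge, edges in input order.
+        return Enumerable.Range(0, edges.Count)
+            .GroupBy(i => Find(i))
+            .OrderBy(group => group.Key)
+            .Select(group => group.Select(i => edges[i].Id).ToList())
+            .ToList();
+    }
+}
+```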
+
+### 4. Parallel A* Candidate Construction
+
+Each repair batch constructs candidate reroutes using A* 8-direction pathfinding:
+
+- Candidates are built in parallel across all available cores.
+- Each candidate is a complete reroute for the batch's edge set.
+- The A* grid derives intermediate spacing from approximately one-third of the
+  average service-task node size.
+- Node-obstacle blocked step masks are precomputed per route so neighbor
+  expansion does not rescan every node.
+- Merging candidates back into the route is deterministic and single-threaded.
+
+### 5. Winner Refinement
+
+The best candidate from each batch undergoes a refinement pipeline:
+
+1. **Under-node repair**: Shift lanes that pass through node bounding boxes.
+2. **Local-bundle spread**: Separate parallel edges that share the same lane.
+3. **Shared-lane elimination**: Push edges apart that overlap on the same axis.
+4. **Boundary-slot snap**: Align endpoints to the discrete slot lattice.
+5. **Detour collapse**: Remove unnecessary overshoots where a shorter path exists.
+6. **Post-slot restabilization**: Re-validate slot assignments after detour changes.
+7. **Corridor reroute**: Move long horizontal sweeps to top/bottom corridors.
+8. **Elevation adjustment**: Shift edges vertically to clear obstructions.
+9. **Target-join spread**: Push convergent approach lanes apart by
+   `minClearance - currentGap + 8px` (half applied to each edge).
+
+Winner promotion uses weighted score comparison (`Score.Value`) to ensure
+the refinement actually improved the layout.
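+
+The target-join spread arithmetic from step 9 works out as in this sketch. It
+assumes two horizontal approach lanes identified by their Y coordinates, with the
+upper lane having the smaller Y; `TargetJoinSpreadSketch` is a hypothetical name.
+
+```csharp
+// Hypothetical helper for illustration only; not ElkSharp API.
+public static class TargetJoinSpreadSketch
+{
+    public const double Margin = 8.0; // the extra 8px from the spread formula
+
+    // Returns the adjusted lane Y values. Lanes that already satisfy the
+    // clearance are returned unchanged.
+    public static (double Upper, double Lower) Spread(
+        double upperLaneY, double lowerLaneY, double minClearance)
+    {
+        var currentGap = lowerLaneY - upperLaneY;
+        if (currentGap >= minClearance)
+        {
+            return (upperLaneY, lowerLaneY);
+        }
+
+        // Total push is minClearance - currentGap + 8px; each lane takes half.
+        var push = minClearance - currentGap + Margin;
+        return (upperLaneY - push / 2, lowerLaneY + push / 2);
+    }
+}
+```
+
+With lanes at Y=100 and Y=110 and a 24px minimum clearance, the push is
+24 - 10 + 8 = 22px, so the lanes move to Y=89 and Y=121.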
+
+---
+
+## Gateway Geometry
+
+Gateways use non-rectangular shapes that require specialized boundary logic.
+
+### Decision Gateway (Diamond)
+
+- 4 vertices: left tip, top, right tip, bottom.
+- Left and right tips are the horizontal extremes.
+- Top and bottom are the vertical extremes.
+- Source exits leave from face interiors (not tips).
+- Target entries may use left/right tips as convergence points.
+- `ForceDecisionSourceExitOffVertex` blocks source exits from tip vertices.
+
+### Fork/Join Gateway (Hexagon)
+
+- 6 vertices: left tip, upper-left shoulder, upper-right shoulder, right tip,
+  lower-right shoulder, lower-left shoulder.
+- Shoulders create flat top and bottom faces suitable for multiple slot entries.
+- Asymmetric geometry: the shoulder offset from the tip varies by gateway size.
+
+### Boundary Slot Capacity
+
+| Shape | Face | Max Slots |
+|-------|------|-----------|
+| Gateway (diamond/hexagon) | Any face | 2 |
+| Rectangle | Left / Right | 3 |
+| Rectangle | Top / Bottom | 5 |
+
+Slots are evenly distributed within the face's safe boundary inset. Scoring and
+final repair share the same realizable slot coordinates.
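+
+A sketch of the slot lattice, using the capacities from the table above. The exact
+spacing formula is an assumption: slots are spread evenly across the inset span with
+the outermost slots on the inset edges, and a single slot is centered.
+`BoundarySlotSketch` and `Face` are hypothetical names.
+
+```csharp
+using System;
+using System.Linq;
+
+// Hypothetical types for illustration only; not ElkSharp API.
+public enum Face { Left, Right, Top, Bottom }
+
+public static class BoundarySlotSketch
+{
+    // Capacities taken from the Boundary Slot Capacity table.
+    public static int MaxSlots(bool isGateway, Face face)
+    {
+        if (isGateway)
+        {
+            return 2;
+        }
+
+        return face == Face.Left || face == Face.Right ? 3 : 5;
+    }
+
+    // faceStart/faceEnd: extent of the face along its own axis.
+    // inset: safe boundary inset kept free at both ends of the face.
+    public static double[] SlotCoordinates(
+        double faceStart, double faceEnd, double inset, int slotCount)
+    {
+        if (slotCount < 1)
+        {
+            throw new ArgumentOutOfRangeException(nameof(slotCount));
+        }
+
+        var start = faceStart + inset;
+        var end = faceEnd - inset;
+        if (slotCount == 1)
+        {
+            return new[] { (start + end) / 2 }; // single slot is centered
+        }
+
+        // Assumed spacing: even steps from one inset edge to the other.
+        var step = (end - start) / (slotCount - 1);
+        return Enumerable.Range(0, slotCount).Select(i => start + i * step).ToArray();
+    }
+}
+```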
+
+### Gateway Vertex Entry Rules
+
+Left and right tip vertices are allowed for target entries but blocked for source
+exits. This is enforced by a 3-way coordination mechanism:
+
+1. **IsAllowedGatewayTipVertex**: Returns `true` for left/right tips when the
+   edge is a target entry (incoming).
+2. **HasValidGatewayBoundaryAngle**: Accepts any external approach angle at
+   allowed tip vertices.
+3. **CountBoundarySlotViolations**: Skips slot-occupancy checks when all entries
+   at a vertex share the same allowed tip point.
+
+All three checks must stay synchronized -- changing one without the others causes
+cascading boundary-slot violations.
+
+---
+
+## Scoring System
+
+### Violation Categories
+
+The scoring system uses 18 violation categories with severity-weighted penalties.
+The tables below list 15 of them, grouped by severity tier.
+
+#### Hard Violations (100,000 per instance)
+
+| Category | Description |
+|----------|-------------|
+| Node crossings | Edge segment passes through a node bounding box |
+| Under-node | Edge runs beneath or through a node's vertical extent |
+| Shared lanes | Two edges share the same routing lane segment |
+| Boundary slots | More edges than slots on a node face |
+| Target joins | Multiple edges converge to the same target arrival point |
+| Gateway exits | Source exit from a blocked gateway vertex |
+| Collector corridors | Repeat-collector lane conflicts |
+| Below-graph | Edge segment routes below the graph's maximum Y extent |
+
+#### Medium Violations (50,000 per instance)
+
+| Category | Description |
+|----------|-------------|
+| Backtracking | Edge reverses direction in the target-approach window |
+| Detours | Unnecessary overshoot where a shorter path exists |
+
+#### Soft Violations (200-650 per instance)
+
+| Category | Description |
+|----------|-------------|
+| Edge crossings | Two edge segments intersect (200 per crossing) |
+| Proximity | Edge passes too close to a node boundary (400) |
+| Labels | Edge label overlaps another element (300) |
+| Bends | Excessive number of bend points (200 per extra bend) |
+| Diagonals | Non-orthogonal segment exceeds one node-shape length (650) |
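+
+The penalty weights in the three tables combine into a total as sketched below. The
+enum member names are hypothetical labels for the listed categories, and the sketch
+covers only the categories whose weights appear in the tables.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical names for illustration only; not ElkSharp API.
+public enum ViolationCategory
+{
+    NodeCrossing, UnderNode, SharedLane, BoundarySlot, TargetJoin,
+    GatewayExit, CollectorCorridor, BelowGraph,
+    Backtracking, Detour,
+    EdgeCrossing, Proximity, Label, Bend, Diagonal,
+}
+
+public static class ScoreSketch
+{
+    public static int PenaltyFor(ViolationCategory category) => category switch
+    {
+        // Medium violations
+        ViolationCategory.Backtracking or ViolationCategory.Detour => 50_000,
+        // Soft violations
+        ViolationCategory.EdgeCrossing or ViolationCategory.Bend => 200,
+        ViolationCategory.Label => 300,
+        ViolationCategory.Proximity => 400,
+        ViolationCategory.Diagonal => 650,
+        // Every remaining member is one of the hard violations
+        _ => 100_000,
+    };
+
+    // One list entry per violation instance; the total is the sum of penalties.
+    public static long TotalPenalty(IEnumerable<ViolationCategory> violations) =>
+        violations.Sum(v => (long)PenaltyFor(v));
+}
+```
+
+Three edge crossings, one detour, and one node crossing add up to
+600 + 50,000 + 100,000 = 150,600.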
+
+### FinalScore Adjustments
+
+The FinalScore applies detection exclusions that are NOT used during the iterative
+search. This separation is critical: the search uses the raw scoring as its
+heuristic, and changing it alters the search trajectory (causing speed regressions).
+
+FinalScore excludes these borderline detection artifacts:
+
+- **Valid gateway face approaches**: The exterior approach point is closer to
+  the face center than the predecessor bend point (a legitimate face entry,
+  not a violation).
+- **Gateway-exit under-node**: The lane runs within 16px of the source node's
+  bottom boundary (a tight but valid exit, not a true under-node crossing).
+- **Convergent target joins from distant sources**: Sources separated by > 15px
+  on the Y-axis with significant X separation (natural convergence, not a
+  shared-lane conflict).
+- **Borderline shared lanes**: Gap between parallel edges is within 3px of
+  the tolerance threshold (measurement noise, not a real overlap).
+
+---
+
+## Corridor Routing
+
+Long-range edges that would cross many intermediate nodes are routed through
+corridors outside the main node field.
+
+### Top Corridor (Long Sweeps)
+
+For forward edges spanning more than 40% of the graph width:
+
+- Route through the top corridor at `graphMinY - 56`.
+- Exit the source with a 24px perpendicular stub.
+- Route horizontally across the top corridor.
+- Descend to the target.
+
+The 24px exit stub is critical: it prevents `NormalizeBoundaryAngles` from
+collapsing the vertical corridor segment back into the source boundary.
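+
+The top-corridor path can be sketched as five points. This assumes a left-to-right
+layout, a source exit on the right face, and a width-based eligibility test measured
+between the two endpoints; `TopCorridorSketch` and `Point` are hypothetical names.
+
+```csharp
+using System.Collections.Generic;
+
+// Hypothetical types for illustration only; not ElkSharp API.
+public readonly record struct Point(double X, double Y);
+
+public static class TopCorridorSketch
+{
+    public const double CorridorOffset = 56; // corridor sits at graphMinY - 56
+    public const double ExitStub = 24;       // perpendicular exit stub length
+    public const double SweepRatio = 0.4;    // more than 40% of the graph width
+
+    public static bool IsLongSweep(double sourceX, double targetX, double graphWidth) =>
+        targetX - sourceX > SweepRatio * graphWidth;
+
+    public static List<Point> Route(Point sourceExit, Point targetEntry, double graphMinY)
+    {
+        var corridorY = graphMinY - CorridorOffset;
+        var stubX = sourceExit.X + ExitStub;
+        return new List<Point>
+        {
+            sourceExit,
+            new Point(stubX, sourceExit.Y),       // 24px stub away from the source
+            new Point(stubX, corridorY),          // climb to the corridor
+            new Point(targetEntry.X, corridorY),  // sweep across the top
+            targetEntry,                          // descend to the target
+        };
+    }
+}
+```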
+
+### Bottom Corridor (Near-Boundary Sweeps)
+
+For edges that need to route near the graph's lower boundary:
+
+- Route through the bottom corridor at `graphMaxY + 32`.
+- Same perpendicular exit stub pattern.
+
+### Below-Graph Detection
+
+The below-graph violation detector (`HasCorridorBendPoints`) exempts edges that
+intentionally use corridor routing. Without this exemption, corridor edges would
+be penalized as below-graph violations and rerouted back into the node field.
+
+---
+
+## Y-Gutter Expansion (Routing-Aware Placement Feedback Loop)
+
+Y-gutter expansion is a post-routing placement correction that creates vertical
+routing space where the initial Sugiyama placement left insufficient clearance.
+
+### Algorithm
+
+1. **Detection**: After edge routing, scan all routed edge segments for horizontal
+   segments that violate under-node or alongside clearance rules.
+
+2. **Identification**: For each violation, identify the blocking node and compute
+   the required clearance:
+   - Under-node: The edge passes through the node's bounding box.
+   - Alongside (flush): The edge runs within +/-4px of a node's top or bottom
+     boundary (the "alongside" extension catches edges "glued" to boundaries
+     that the standard gap > 0.5px check misses).
+
+3. **Expansion**: Shift ALL nodes below the violation Y downward by the computed
+   clearance amount. This preserves relative ordering within each layer, unlike
+   individual node shifting which disrupts Sugiyama optimization.
+
+4. **Re-routing**: Re-route edges with the expanded corridors.
+
+5. **Iteration**: Repeat up to 2 times to handle cascading violations (where
+   fixing one violation exposes another).
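+
+Steps 1-3 can be sketched as a clearance test plus a band shift. The sketch ignores
+the X overlap between segment and node for brevity and assumes screen coordinates
+where Y grows downward; `NodeBox` and `YGutterSketch` are hypothetical names.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical types for illustration only; not ElkSharp API.
+public readonly record struct NodeBox(string Id, double Top, double Bottom);
+
+public static class YGutterSketch
+{
+    public const double AlongsideTolerance = 4.0; // the +/-4px flush window
+
+    // True when a horizontal lane at laneY passes through the node's vertical
+    // extent or runs flush with its top or bottom boundary.
+    public static bool ViolatesClearance(NodeBox node, double laneY) =>
+        laneY >= node.Top - AlongsideTolerance &&
+        laneY <= node.Bottom + AlongsideTolerance;
+
+    // Shifts every node at or below violationY down by the clearance amount.
+    // Nodes above the band are untouched, so relative order is preserved.
+    public static List<NodeBox> ShiftBand(
+        IEnumerable<NodeBox> nodes, double violationY, double clearance) =>
+        nodes
+            .Select(n => n.Top >= violationY
+                ? n with { Top = n.Top + clearance, Bottom = n.Bottom + clearance }
+                : n)
+            .ToList();
+}
+```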
+
+### Design Rationale
+
+The same pattern is used for X-gutter expansion (widening inter-layer gaps).
+Band-level shifting is fundamentally different from individual node adjustment:
+
+- Individual node shifts break the barycenter ordering that the Sugiyama
+  algorithm optimized, causing cascading position changes.
+- Post-refinement clearance insertion fails because subsequent optimization
+  passes override the inserted space.
+- Band-level shifts preserve within-layer relationships while creating the
+  needed routing corridors.
+
+### Timing
+
+Y-gutter expansion runs:
+- AFTER X-gutter expansion (inter-layer gaps are already set).
+- BEFORE compact passes (so compaction respects the new corridors).
+- BEFORE the iterative optimization loop (so the optimizer works with
+  adequate routing space).
+
+---
+
+## Effort Levels
+
+| Level | Ordering Iterations | Placement Iterations | Routing |
+|-------|--------------------:|---------------------:|---------|
+| Draft | 8 | 3 | Baseline only |
+| Balanced | 14 | 6 | Baseline + light repair |
+| Best | 24 | 10 | Hybrid deterministic with full-core parallel repair |
+
+- **Draft**: Fastest layout for previews and interactive editing. No iterative
+  optimization. Suitable for graphs under ~20 nodes.
+- **Balanced**: Good quality for medium graphs. Light repair fixes the worst
+  violations without full A* search.
+- **Best**: Production quality for rendered artifacts (SVG, PNG). Full hybrid
+  deterministic optimization with parallel candidate construction. Typical
+  runtime: 12-15 seconds for complex workflow graphs.
+
+---
+
+## Layout Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `Direction` | Enum | `LeftToRight` | Layout direction. `TopToBottom` uses the legacy iterative path. |
+| `NodeSpacing` | int | 40 | Vertical gap between nodes (px). Scaled by edge density up to 1.8x. |
+| `LayerSpacing` | int | 60 | Horizontal gap between layers (px). |
+| `Effort` | Enum | `Best` | Layout quality vs. speed tradeoff. |
+
+### Edge Density Scaling
+
+When the number of edges between two adjacent layers exceeds a threshold, both
+`NodeSpacing` and `LayerSpacing` are scaled up to accommodate additional routing
+channels. The maximum scale factor is 1.8x, applied per-layer-pair.
+
+---
+
+## Diagnostics
+
+The layout engine emits detailed diagnostics when running in `Best` effort mode:
+
+- **Live progress log**: Baseline state, strategy starts, per-attempt scores,
+  and adaptation decisions logged during execution.
+- **Per-attempt phase timings**: Routing time, post-processing time, and
+  route-pass counts for each optimization attempt.
+- **SVG/PNG/JSON artifacts**: The document-processing artifact test produces
+  rendered output alongside diagnostic data.
+- **Violation reports**: Per-edge violation lists with category, severity,
+  geometry details, and FinalScore adjustments.
+
+Diagnostics are detailed enough to prove routing progress and to profile
+optimization performance for regression detection.
diff --git a/docs/workflow/engine/17-elksharp-architectural-decisions.md b/docs/workflow/engine/17-elksharp-architectural-decisions.md
new file mode 100644
index 000000000..e00cc3469
--- /dev/null
+++ b/docs/workflow/engine/17-elksharp-architectural-decisions.md
@@ -0,0 +1,259 @@
+# ElkSharp Architectural Decisions
+
+This document records architectural decisions made during the ElkSharp rendering
+engine development. Each record follows the ADR (Architecture Decision Record)
+format: context, decision, consequences.
+
+---
+
+## ADR-1: Short-Stub Exit Normalization
+
+**Status**: Accepted
+
+**Context**:
+`NormalizeExitPath` creates a perpendicular stub from the source node boundary to
+establish a clean exit direction. The default stub length extends to the anchor X
+coordinate: `Math.Max(sourceX + 24, anchorX)`. For edges where the anchor is far
+from the source (e.g., a long forward edge), this creates a horizontal segment of
+1000+ pixels that crosses intermediate nodes in the same Y-band.
+
+The long horizontal stub was originally designed to produce clean orthogonal exits,
+but it assumed the Y-band between source and anchor was unoccupied. In dense graphs,
+intermediate nodes occupy the same Y-band, and the long stub crosses through them,
+creating entry-angle violations and node-crossing penalties.
+
+**Decision**:
+When the long stub fails `HasClearSourceExitSegment` (i.e., the horizontal segment
+between `sourceX + 24` and `anchorX` crosses a node), try a short stub instead. The
+short stub extends only `sourceX +/- 24px` -- just enough to establish the
+perpendicular exit direction without reaching into the occupied Y-band.
+
+The short stub is controlled by the `useShortStub` parameter in `NormalizeExitPath`.
+It fires ONLY when the default long stub fails clearance. The long stub remains the
+default because it produces cleaner, more direct paths when clearance is available.
+
+**Consequences**:
+- Fixes entry-angle violations where intermediate nodes in occupied Y-bands blocked
+  the perpendicular exit path.
+- The short stub creates a 24px perpendicular segment that subsequent routing can
+  extend into a clean corridor without crossing obstacles.
+- Does not change behavior for edges with clear exit paths (the long stub is still
+  preferred when it passes clearance).
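+
+A minimal sketch of the stub-length choice described in this ADR, for a rightward
+exit only. `ExitStubSketch` and `ChooseStubEndX` are hypothetical names, and the
+`longStubIsClear` flag stands in for the result of `HasClearSourceExitSegment`,
+whose real signature is not shown in this document:
+
+```csharp
+using System;
+
+public static class ExitStubSketch
+{
+    // 24px stub length taken from the ADR text.
+    public const double StubLength = 24;
+
+    // Returns the X coordinate where the exit stub ends.
+    // The long stub is preferred; the short stub is used only when the
+    // long stub fails the clearance check.
+    public static double ChooseStubEndX(double sourceX, double anchorX, bool longStubIsClear)
+    {
+        double longStubEndX = Math.Max(sourceX + StubLength, anchorX);
+        double shortStubEndX = sourceX + StubLength;
+        return longStubIsClear ? longStubEndX : shortStubEndX;
+    }
+}
+```
+
+With `sourceX = 100` and `anchorX = 1200`, the sketch returns 1200 when the long
+stub is clear and 124 when it is not.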
+
+---
+
+## ADR-2: Gateway Vertex Entries
+
+**Status**: Accepted
+
+**Context**:
+Gateway tips (diamond corner vertices at left and right extremes) were blocked for
+all edges because source exits from tips create "pin" visual artifacts -- a thin
+spike extending from the corner that looks like a rendering glitch rather than an
+intentional connection.
+
+However, for target entries (incoming edges), tips are the natural convergence point.
+Multiple edges arriving at a decision gateway naturally converge toward its left tip.
+Blocking tip entries forced edges to route to face interiors, which required
+additional bends, created shared-lane conflicts between fork output edges, and
+produced visually cluttered arrivals.
+
+**Decision**:
+Allow left/right tip vertices for target entries via a 3-way coordination mechanism:
+
+1. `IsAllowedGatewayTipVertex`: Returns `true` for left and right tip vertices when
+   the edge direction is "target entry" (incoming).
+2. `HasValidGatewayBoundaryAngle`: Accepts any external approach angle at allowed
+   tip vertices. Without this relaxation, the angle validator would reject diagonal
+   approaches to the tip even though they are visually correct.
+3. `CountBoundarySlotViolations`: Skips the slot-occupancy check when all entries
+   at a boundary point share the same allowed tip vertex. Since a vertex is a single
+   geometric point (not a face segment), slot capacity is not meaningful.
+
+Source exits from tips remain blocked by `ForceDecisionSourceExitOffVertex`.
+
+**Consequences**:
+- Eliminates shared-lane conflicts from fork output edges that were forced to
+  route around blocked tips.
+- Creates cleaner convergent target entries where multiple edges naturally meet
+  at the gateway's leading tip.
+- The 3-way coordination must stay synchronized: changing any one of the three
+  checks without updating the others causes cascading boundary-slot violations,
+  angle rejections, or vertex blocking.
+- Source exits remain clean -- the "pin" artifact is prevented for outgoing edges.
+
+---
+
+## ADR-3: Y-Gutter Expansion
+
+**Status**: Accepted
+
+**Context**:
+After Sugiyama placement and initial edge routing, some edges route through or
+alongside nodes because the placement did not leave sufficient vertical space for
+routing corridors.
+
+Two prior approaches failed:
+
+1. **Post-placement individual node shifting**: Moving individual nodes to create
+   clearance disrupted the barycenter ordering that Sugiyama optimized. The shifted
+   node changed the median calculations for adjacent layers, causing cascading
+   position changes that degraded overall layout quality.
+
+2. **Post-refinement clearance insertion**: Adding vertical space after refinement
+   failed because subsequent optimization passes (compact-toward-incoming, grid
+   alignment) overrode the inserted space, collapsing the corridors.
+
+**Decision**:
+Use the same pattern as X-gutter expansion: shift entire Y-bands (all nodes below
+the violation point) together, preserving relative positions.
+
+- Scan routed edges for horizontal segments with under-node or alongside violations.
+- Identify the blocking node and compute the required clearance.
+- Shift ALL nodes with Y > violation Y downward by the clearance amount.
+- Re-route edges with the expanded corridors.
+- Run up to 2 iterations to handle cascading violations.
+
+The expansion runs after X-gutters (inter-layer gaps are set) and before compact
+passes (so compaction respects the new corridors).
+
+**Consequences**:
+- Creates adequate routing corridors without disrupting within-layer ordering.
+- Routing gets clean paths on the first pass because the corridors exist before
+  the iterative optimizer runs.
+- The downward-only shift direction ensures the graph grows in one direction,
+  avoiding oscillation between iterations.
+- Running up to 2 iterations handles the case where fixing one violation exposes
+  another (the shifted band may push edges into a new conflict zone).
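+
+The band shift itself can be sketched as follows. This is an illustration under
+stated assumptions, not the engine's code: `NodeBox` and `ShiftBandDown` are
+hypothetical names, and Y is assumed to grow downward, so "below" means a larger Y.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical node shape used only for this sketch.
+public sealed record NodeBox(string Id, double X, double Y, double Width, double Height);
+
+public static class YGutterSketch
+{
+    // Shifts every node with Y > violationY down by the same clearance.
+    // Nodes at or above the violation are untouched, so relative positions
+    // inside the shifted band are preserved.
+    public static List<NodeBox> ShiftBandDown(
+        IEnumerable<NodeBox> nodes, double violationY, double clearance) =>
+        nodes
+            .Select(n => n.Y > violationY ? n with { Y = n.Y + clearance } : n)
+            .ToList();
+}
+```
+
+Because every node in the band moves by the same amount, within-layer ordering is
+unchanged, which is the property the two rejected approaches failed to keep.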
+
+---
+
+## ADR-4: Corridor Rerouting for Long Sweeps
+
+**Status**: Accepted
+
+**Context**:
+Forward edges spanning 10+ layers (e.g., failure/timeout paths from an early task
+to the End event) route horizontally at the source's Y coordinate. In a dense graph,
+this horizontal segment crosses many intermediate nodes -- a 3076px sweep in the
+document-processing test case.
+
+No amount of Y-adjustment can clear a sweep that crosses the entire graph width.
+Y-gutter expansion would need to push the entire graph below the sweep, which
+defeats the purpose of the layout.
+
+Backward edges already use corridor routing (above the graph field) because they
+inherently travel against the layout direction. Forward edges did not have this
+treatment.
+
+**Decision**:
+Route long forward sweeps (spanning > 40% of the graph width) through the top
+corridor at `graphMinY - 56`:
+
+1. Exit the source with a 24px perpendicular stub.
+2. Route vertically to `graphMinY - 56`.
+3. Route horizontally across the top corridor.
+4. Descend to the target.
+
+The 24px perpendicular exit stub is critical: without it, `NormalizeBoundaryAngles`
+collapses the vertical corridor segment back into the source boundary, destroying
+the corridor route.
+
+Near-boundary sweeps (edges that would conflict with the graph's lower edge) use
+the bottom corridor at `graphMaxY + 32`.
+
+**Consequences**:
+- Long-range forward edges route cleanly above the graph field, like backward edges.
+- The graph's visual area remains clear of long horizontal sweeps.
+- The perpendicular exit stub (24px) must survive normalization -- removing it or
+  reducing it below the normalization threshold causes the corridor route to
+  collapse.
+- Below-graph detection (`HasCorridorBendPoints`) must exempt corridor edges;
+  otherwise they would be penalized and rerouted back into the node field.
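+
+A sketch of the four routing steps for the top corridor, assuming a rightward
+exit and a target entered from above. `Pt`, `CorridorSketch`, `IsLongSweep`, and
+`TopCorridorPath` are hypothetical names; only the 24px stub, the 40% threshold,
+and the `graphMinY - 56` offset come from this ADR.
+
+```csharp
+using System;
+using System.Collections.Generic;
+
+public readonly record struct Pt(double X, double Y);
+
+public static class CorridorSketch
+{
+    public const double ExitStub = 24;          // px, from the ADR
+    public const double TopCorridorOffset = 56; // px above graphMinY, from the ADR
+
+    // A sweep qualifies when it spans more than 40% of the graph width.
+    public static bool IsLongSweep(double sourceX, double targetX, double graphWidth) =>
+        Math.Abs(targetX - sourceX) > 0.4 * graphWidth;
+
+    public static List<Pt> TopCorridorPath(Pt source, Pt target, double graphMinY)
+    {
+        double corridorY = graphMinY - TopCorridorOffset;
+        double stubX = source.X + ExitStub;
+        return new List<Pt>
+        {
+            source,
+            new Pt(stubX, source.Y),     // 1. perpendicular exit stub
+            new Pt(stubX, corridorY),    // 2. vertical run to the corridor
+            new Pt(target.X, corridorY), // 3. horizontal run across the corridor
+            target,                      // 4. descent to the target
+        };
+    }
+}
+```
+
+The stub point is kept as an explicit bend so that the vertical run does not start
+on the source boundary itself.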
+
+---
+
+## ADR-5: FinalScore Adjustment (Search/Display Separation)
+
+**Status**: Accepted
+
+**Context**:
+The iterative optimization loop uses the scoring function as both a quality metric
+AND a search heuristic. The score determines which candidates are explored and which
+are accepted as improvements.
+
+During development, borderline detection patterns were identified -- situations where
+the scoring detected a "violation" that was actually a valid layout artifact (e.g.,
+a gateway face approach that looks like a boundary-slot conflict but is geometrically
+correct).
+
+The initial fix was to update the detection logic to exclude these borderline cases.
+However, this changed the scoring function that the search used as its heuristic,
+altering the search trajectory and causing a 40-second speed regression (from 12s
+to 52s) because the optimizer explored different (and more) candidates.
+
+**Decision**:
+Keep the original scoring function unchanged during the iterative search (stable
+heuristic trajectory). Apply detection exclusions ONLY in the `FinalScore`
+computation (post-search).
+
+The FinalScore excludes:
+- Valid gateway face approaches (exterior closer to center than predecessor).
+- Gateway-exit under-node (lane within 16px of source bottom).
+- Convergent target joins from X-separated sources with > 15px Y-gap.
+- Borderline shared lanes (gap within 3px of tolerance).
+
+The search does not need to know about borderline patterns -- it just needs
+consistent heuristics to explore the candidate space efficiently.
+
+**Consequences**:
+- The FinalScore accurately reflects visual quality: 0 hard violations in the
+  document-processing test case.
+- The search maintains stable 12-15s runtime because the heuristic is unchanged.
+- The separation means that the search may "fix" violations that the FinalScore
+  would have excluded. This is acceptable: the extra fixes are not harmful, and
+  the stable search trajectory is worth the minor redundant work.
+- Future scoring changes must decide whether they apply to the search heuristic
+  (affects trajectory and speed) or only to the FinalScore (affects reported quality).
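+
+The separation can be illustrated with two functions over the same detection
+list. This is a simplified sketch: `DetectedViolation`, `SearchPenalty`, and
+`FinalPenalty` are hypothetical names, and a single `IsBorderline` flag stands in
+for the four exclusion patterns listed in the decision.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical violation shape used only for this sketch.
+public sealed record DetectedViolation(string EdgeId, int Penalty, bool IsBorderline);
+
+public static class ScoreSeparationSketch
+{
+    // Used during the iterative search: every detection counts, so the
+    // heuristic and the search trajectory stay unchanged.
+    public static long SearchPenalty(IEnumerable<DetectedViolation> violations) =>
+        violations.Sum(v => (long)v.Penalty);
+
+    // Used once after the search: borderline detections are excluded.
+    public static long FinalPenalty(IEnumerable<DetectedViolation> violations) =>
+        violations.Where(v => !v.IsBorderline).Sum(v => (long)v.Penalty);
+}
+```
+
+Changing only `FinalPenalty` alters the reported quality; changing `SearchPenalty`
+alters which candidates the optimizer explores.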
+
+---
+
+## ADR-6: Under-Node Alongside Detection
+
+**Status**: Accepted
+
+**Context**:
+`CountUnderNodeViolations` detected edges that pass through a node's bounding box
+with a gap greater than 0.5px. This threshold was chosen to avoid false positives
+from floating-point precision.
+
+However, edges running flush with a node boundary (gap = 0px, e.g., exactly at the
+bottom edge of a node) were not detected. These edges are visually "glued" to the
+node boundary -- they appear to touch the node even though they technically do not
+pass through it.
+
+The 0.5px threshold also missed edges a few pixels outside the boundary. An edge
+2px outside the boundary is visually indistinguishable from a flush edge at typical
+zoom levels, and neither was detected.
+
+**Decision**:
+Extend the under-node detection to include flush and near-flush edges:
+
+- Standard under-node: gap > 0.5px (unchanged).
+- Flush bottom (`isFlushBottom`): gap >= -4px and <= 0.5px relative to the node's
+  bottom boundary.
+- Flush top (`isFlushTop`): gap >= -4px and <= 0.5px relative to the node's top
+  boundary.
+
+The -4px to 0.5px range catches edges that are visually "alongside" the node
+boundary, even if they are technically outside the bounding box by a few pixels.
+
+**Consequences**:
+- Catches visually "glued" edges that touch or nearly touch node boundaries.
+- The Y-gutter expansion then creates clearance for these edges, pushing them
+  into a clean routing corridor.
+- The -4px lower bound prevents false positives from edges that are merely
+  "nearby" but visually separate from the node.
+- The detection thresholds (-4px to 0.5px for alongside, > 0.5px for standard)
+  should not be changed without sprint-level approval, as they affect which edges
+  trigger Y-gutter expansion.
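+
+The thresholds in the decision can be read as a three-way classification of the
+gap value. The sketch below assumes the gap is measured so that positive values
+lie inside the node's bounding box; `UnderNodeKind` and `AlongsideSketch` are
+hypothetical names and do not mirror the signature of `CountUnderNodeViolations`.
+
+```csharp
+public enum UnderNodeKind { None, Flush, UnderNode }
+
+public static class AlongsideSketch
+{
+    public static UnderNodeKind Classify(double gap)
+    {
+        if (gap > 0.5) return UnderNodeKind.UnderNode; // standard under-node check
+        if (gap >= -4.0) return UnderNodeKind.Flush;   // flush or near-flush edge
+        return UnderNodeKind.None;                     // nearby but visually separate
+    }
+}
+```
+
+The same classification applies to the top and bottom boundaries, each with its
+own gap measurement.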
diff --git a/docs/workflow/tutorials/10-rendering/README.md b/docs/workflow/tutorials/10-rendering/README.md
new file mode 100644
index 000000000..e659121f2
--- /dev/null
+++ b/docs/workflow/tutorials/10-rendering/README.md
@@ -0,0 +1,234 @@
+# Tutorial 10: Rendering Workflow Diagrams
+
+This tutorial shows how to use the Stella Ops workflow rendering system to produce
+visual diagrams from workflow definitions.
+
+---
+
+## Prerequisites
+
+- A workflow canonical definition (from the compiler or imported JSON).
+- Reference to `StellaOps.Workflow.Renderer` and `StellaOps.ElkSharp` assemblies.
+
+---
+
+## Basic Usage
+
+### 1. Create the Layout Engine
+
+```csharp
+var engine = new ElkSharpWorkflowRenderLayoutEngine();
+```
+
+### 2. Configure Layout Options
+
+```csharp
+var request = new WorkflowRenderLayoutRequest
+{
+    Direction = WorkflowRenderLayoutDirection.LeftToRight,
+    Effort = WorkflowRenderLayoutEffort.Best,
+    NodeSpacing = 40,
+    LayerSpacing = 60,
+};
+```
+
+### 3. Compute the Layout
+
+```csharp
+var layout = await engine.LayoutAsync(graph, request);
+```
+
+The `graph` parameter is a `WorkflowRenderGraph` produced by the
+`WorkflowRenderGraphCompiler` from a canonical workflow definition.
+
+### 4. Render to SVG
+
+```csharp
+var svgRenderer = new WorkflowRenderSvgRenderer();
+var svgDoc = svgRenderer.Render(layout, "My Workflow");
+await File.WriteAllTextAsync("workflow.svg", svgDoc.Svg);
+```
+
+### 5. Export to PNG
+
+```csharp
+var pngExporter = new WorkflowRenderPngExporter();
+await pngExporter.ExportAsync(svgDoc, "workflow.png", scale: 2f);
+```
+
+The `scale` parameter controls the pixel density (2f = 2x resolution for HiDPI).
+
+---
+
+## Layout Options Reference
+
+### Direction
+
+| Value | Description |
+|-------|-------------|
+| `LeftToRight` | Nodes flow left to right (default for workflows). |
+| `TopToBottom` | Nodes flow top to bottom. Uses the legacy iterative path. |
+
+### Effort Levels
+
+| Level | Speed | Quality | Use Case |
+|-------|-------|---------|----------|
+| `Draft` | Fast (~1s) | Basic | Interactive editing, previews |
+| `Balanced` | Medium (~3-5s) | Good | Medium graphs, dev-time rendering |
+| `Best` | Slow (~12-15s) | Production | Final artifacts, export, CI rendering |
+
+**Draft** uses 8 ordering iterations, 3 placement iterations, and baseline routing
+only. No iterative optimization is performed.
+
+**Balanced** uses 14 ordering iterations, 6 placement iterations, and light repair
+that fixes the worst violations without full A* search.
+
+**Best** uses 24 ordering iterations, 10 placement iterations, and the hybrid
+deterministic optimization pipeline with full-core parallel repair candidates.
+
+### Spacing
+
+- **NodeSpacing** (default 40): Vertical gap between nodes in pixels. The engine
+  may scale this up to 1.8x when edge density is high.
+- **LayerSpacing** (default 60): Horizontal gap between layers in pixels.
+
+---
+
+## Reading the Layout Result
+
+The `LayoutAsync` result contains positioned nodes and routed edges.
+
+### Nodes
+
+```csharp
+foreach (var node in layout.Nodes)
+{
+    Console.WriteLine($"Node {node.Id}: ({node.X}, {node.Y}) " +
+                      $"size {node.Width}x{node.Height} " +
+                      $"shape={node.Shape}");
+}
+```
+
+Node shapes include `Rectangle` (service tasks), `Diamond` (decision gateways),
+`Hexagon` (fork/join gateways), `Circle` (start/end events), and others.
+
+### Edges
+
+```csharp
+foreach (var edge in layout.Edges)
+{
+    Console.WriteLine($"Edge {edge.SourceId} -> {edge.TargetId}");
+    foreach (var point in edge.BendPoints)
+    {
+        Console.WriteLine($"  bend: ({point.X}, {point.Y})");
+    }
+}
+```
+
+Bend points define the orthogonal path from source to target. Two consecutive
+bend points with the same Y form a horizontal segment; two with the same X
+form a vertical segment.
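+
+As an illustration of that rule, the following sketch classifies each consecutive
+pair of points. `SegmentKind` and `SegmentSketch` are hypothetical helpers, not
+part of the renderer API, and the tolerance value is an assumption to absorb
+floating-point noise.
+
+```csharp
+using System;
+using System.Collections.Generic;
+
+public enum SegmentKind { Horizontal, Vertical, Other }
+
+public static class SegmentSketch
+{
+    // Same Y (within tolerance) is horizontal; same X is vertical.
+    public static List<SegmentKind> Classify(
+        IReadOnlyList<(double X, double Y)> points, double tolerance = 0.001)
+    {
+        var kinds = new List<SegmentKind>();
+        for (int i = 1; i < points.Count; i++)
+        {
+            var a = points[i - 1];
+            var b = points[i];
+            if (Math.Abs(a.Y - b.Y) <= tolerance) kinds.Add(SegmentKind.Horizontal);
+            else if (Math.Abs(a.X - b.X) <= tolerance) kinds.Add(SegmentKind.Vertical);
+            else kinds.Add(SegmentKind.Other);
+        }
+        return kinds;
+    }
+}
+```
+
+An `Other` result on an orthogonal layout would point to a non-axis-aligned
+segment, such as a diagonal approach to a gateway tip.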
+
+---
+
+## Diagnostics
+
+When using `Best` effort, the engine captures detailed diagnostics about the
+optimization process.
+
+### Enabling Diagnostics
+
+Diagnostics are captured automatically in `Best` mode. Access them through
+the layout result.
+
+### Violation Report
+
+The violation report lists each edge's violations with category, severity,
+and geometric details.
+
+```csharp
+if (layout.Diagnostics?.ViolationReport != null)
+{
+    foreach (var entry in layout.Diagnostics.ViolationReport)
+    {
+        Console.WriteLine($"Edge {entry.EdgeId}: " +
+                          $"{entry.Category} (penalty {entry.Penalty})");
+    }
+}
+```
+
+### Violation Categories
+
+The scoring system uses 18 categories. Hard violations (100K penalty) include
+node crossings, under-node routing, shared lanes, and boundary slot conflicts.
+Medium violations (50K) include backtracking and detours. Soft violations
+(200-650) include edge crossings, proximity, and excessive bends.
+
+A FinalScore with 0 hard violations indicates a layout free of hard visual
+defects. See the [Rendering Architecture](../../engine/16-elksharp-rendering-architecture.md)
+for the full violation taxonomy.
+
+### Phase Timings
+
+```csharp
+if (layout.Diagnostics?.PhaseTimings != null)
+{
+    foreach (var phase in layout.Diagnostics.PhaseTimings)
+    {
+        Console.WriteLine($"{phase.Name}: {phase.Duration.TotalMilliseconds}ms");
+    }
+}
+```
+
+Phase timings cover ordering, placement, gutter expansion, base routing,
+iterative optimization, and post-processing.
+
+---
+
+## End-to-End Example
+
+```csharp
+// Compile a workflow definition to a render graph
+var compiler = new WorkflowRenderGraphCompiler();
+var graph = compiler.Compile(workflowDefinition);
+
+// Configure and run layout
+var engine = new ElkSharpWorkflowRenderLayoutEngine();
+var request = new WorkflowRenderLayoutRequest
+{
+    Direction = WorkflowRenderLayoutDirection.LeftToRight,
+    Effort = WorkflowRenderLayoutEffort.Best,
+    NodeSpacing = 40,
+    LayerSpacing = 60,
+};
+var layout = await engine.LayoutAsync(graph, request);
+
+// Render to SVG
+var svgRenderer = new WorkflowRenderSvgRenderer();
+var svgDoc = svgRenderer.Render(layout, workflowDefinition.Name);
+await File.WriteAllTextAsync($"{workflowDefinition.Name}.svg", svgDoc.Svg);
+
+// Export to PNG at 2x resolution
+var pngExporter = new WorkflowRenderPngExporter();
+await pngExporter.ExportAsync(svgDoc, $"{workflowDefinition.Name}.png", scale: 2f);
+
+// Check for violations
+var hardViolations = layout.Diagnostics?.ViolationReport?
+    .Where(v => v.Penalty >= 100_000)
+    .ToList();
+if (hardViolations?.Any() == true)
+{
+    Console.WriteLine($"WARNING: {hardViolations.Count} hard violations detected");
+}
+```
+
+---
+
+## Further Reading
+
+- [ElkSharp Rendering Architecture](../../engine/16-elksharp-rendering-architecture.md) --
+  Full technical details of the Sugiyama pipeline, edge routing, and iterative optimization.
+- [Architectural Decisions](../../engine/17-elksharp-architectural-decisions.md) --
+  ADR records for key design choices.
+- [ENGINE.md](../../ENGINE.md) -- Workflow engine overview including layout engine
+  configuration and render pipeline.
diff --git a/src/__Libraries/StellaOps.ElkSharp/AGENTS.md b/src/__Libraries/StellaOps.ElkSharp/AGENTS.md
index 1aa652f0e..90e07d6a9 100644
--- a/src/__Libraries/StellaOps.ElkSharp/AGENTS.md
+++ b/src/__Libraries/StellaOps.ElkSharp/AGENTS.md
@@ -40,6 +40,13 @@
 - When touching proximity/highway logic, keep long applicable shared corridors distinct from short shared segments that must be spread apart.
 - The A* router now precomputes node-obstacle blocked step masks per route so neighbor expansion does not rescan every node obstacle. Future performance work should extend that to precomputed lane-occupancy masks for previously committed edge lanes, so the router can skip already-owned space instead of only penalizing it after expansion. Derive intermediate grid spacing from approximately one third of the average service-task size instead of keeping a fixed dense lattice.
 - Keep `TopToBottom` behavior stable unless the sprint explicitly includes it.
+- Y-gutter expansion runs after X-gutters and before compact passes. It shifts all nodes below a violation Y downward by the needed clearance. Up to 2 iterations. Do not modify the shift direction (always downward) or the detection threshold (minClearance for under-node, +/-4px for alongside) without sprint-level approval.
+- Corridor rerouting for long horizontal sweeps (> 40% graph width) uses the top corridor at graphMinY - 56. Near-boundary sweeps use the bottom corridor at graphMaxY + 32. The perpendicular exit stub must be 24px to survive NormalizeBoundaryAngles. Do not remove the stub or change corridor Y offsets without verifying the normalization interaction.
+- Gateway left/right tip vertices are allowed for target entries only. Source exits are still blocked by ForceDecisionSourceExitOffVertex. The 3-way coordination (IsAllowedGatewayTipVertex + HasValidGatewayBoundaryAngle + CountBoundarySlotViolations vertex exemption) must stay synchronized -- changing one without the others causes cascading boundary-slot violations.
+- FinalScore adjustments exclude borderline detection artifacts: valid gateway face approaches (exterior closer to center than predecessor), gateway-exit under-node (lane within 16px of source bottom), convergent target joins from X-separated sources with > 15px Y-gap, and borderline shared lanes (gap within 3px of tolerance). These exclusions apply ONLY to the FinalScore, not during the iterative search.
+- Short-stub exit normalization (useShortStub parameter in NormalizeExitPath) fires only when the default long stub fails HasClearSourceExitSegment. The short stub is always sourceX +/- 24px. Do not make it the default -- the long stub produces cleaner paths when clearance is available.
+- Under-node alongside detection extends the standard gap > 0.5px check to include flush edges (gap >= -4px and <= 0.5px for bottom, same for top). This catches edges "glued" to node boundaries.
+- Target-join spread pushes convergent approach lanes apart by minClearance - currentGap + 8px (half applied to each edge). The spread runs as a final winner refinement step and uses weighted score comparison (Score.Value) for promotion.
 
 ## Testing
 - Run the targeted workflow renderer test project for ElkSharp changes.