# Router Architecture

This document is the canonical specification for the StellaOps Router system.

## System Architecture

### Scope

- A single HTTP ingress service (`StellaOps.Gateway.WebService`) handles all external HTTP traffic
- Microservices communicate with the Gateway using binary transports (TCP, TLS, UDP, RabbitMQ)
- HTTP is not used for internal microservice-to-gateway traffic
- Request/response bodies are opaque to the router (raw bytes/streams)

### Transport Architecture

Each transport connection carries:

- Initial registration (HELLO) and endpoint configuration
- Ongoing heartbeats
- Request/response data frames
- Streaming data frames
- Cancellation frames

```
┌─────────────────┐                           ┌─────────────────┐
│  Microservice   │                           │     Gateway     │
│                 │           HELLO           │                 │
│  Endpoints:     │ ─────────────────────────►│  Routing        │
│  - POST /items  │         HEARTBEAT         │  State          │
│  - GET /items   │ ◄────────────────────────►│                 │
│                 │                           │  Connections[]  │
│                 │    REQUEST / RESPONSE     │                 │
│                 │ ◄────────────────────────►│                 │
│                 │                           │                 │
│                 │   STREAM_DATA / CANCEL    │                 │
│                 │ ◄────────────────────────►│                 │
└─────────────────┘                           └─────────────────┘
```

---

## Service Identity

### Instance Identity

Each microservice instance is identified by:

| Field | Type | Description |
|-------|------|-------------|
| `ServiceName` | string | Logical service name (e.g., "billing") |
| `Version` | string | Semantic version (`major.minor.patch`) |
| `Region` | string | Deployment region (e.g., "us-east-1") |
| `InstanceId` | string | Unique instance identifier |

### Version Matching

- Version matching is strict semver equality
- The router only routes to instances whose version matches exactly
- A default version is used when the client does not specify one

### Region Configuration

The Gateway's region comes from `GatewayNodeConfig`:

```csharp
public sealed class GatewayNodeConfig
{
    public required string Region { get; init; }       // e.g., "eu1"
    public required string NodeId { get; init; }       // e.g., "gw-eu1-01"
    public required string Environment { get; init; }  // e.g., "prod"
}
```

Region is never derived from HTTP headers or URL hostnames.

---

## Endpoint Model

### Endpoint Identity

Endpoint identity is `(HTTP Method, Path)`:

| Field | Example |
|-------|---------|
| Method | `GET`, `POST`, `PUT`, `PATCH`, `DELETE` |
| Path | `/invoices`, `/items/{id}`, `/users/{userId}/orders` |

### Endpoint Descriptor

Each endpoint includes:

```csharp
public sealed class EndpointDescriptor
{
    public required string Method { get; init; }
    public required string Path { get; init; }
    public required string ServiceName { get; init; }
    public required string Version { get; init; }
    public TimeSpan DefaultTimeout { get; init; }
    public bool SupportsStreaming { get; init; }
    public IReadOnlyList<ClaimRequirement> RequiringClaims { get; init; } = [];
    public EndpointSchemaInfo? SchemaInfo { get; init; }
}
```

### Path Matching

- ASP.NET-style route templates
- Parameter segments: `{id}`, `{userId}`
- Case sensitivity and trailing slash handling follow ASP.NET conventions

---

## Routing Algorithm

### Instance Selection

Given `(ServiceName, Version, Method, Path)`:

1. **Filter candidates**:
   - Match `ServiceName` exactly
   - Match `Version` exactly (strict semver)
   - Health status in acceptable set (`Healthy` or `Degraded`)

2. **Region preference**:
   - Prefer instances where `Region == GatewayNodeConfig.Region`
   - Fall back to configured neighbor regions
   - Fall back to all other regions

3. **Within region tier**:
   - Prefer lower `AveragePingMs`
   - If tied, prefer more recent `LastHeartbeatUtc`
   - If still tied, use round-robin balancing

### Instance Health

```csharp
public enum InstanceHealthStatus
{
    Unknown,
    Healthy,
    Degraded,
    Draining,
    Unhealthy
}
```

Health metadata per connection:

| Field | Type | Description |
|-------|------|-------------|
| `Status` | enum | Current health status |
| `LastHeartbeatUtc` | DateTime | Last heartbeat timestamp |
| `AveragePingMs` | double | Average round-trip latency |

---

## Transport Layer

### Transport Types

| Transport | Use Case | Streaming | Notes |
|-----------|----------|-----------|-------|
| InMemory | Testing | Yes | In-process channels |
| TCP | Production | Yes | Length-prefixed frames |
| TLS | Secure | Yes | Certificate-based encryption |
| UDP | Small payloads | No | Single datagram per frame |
| RabbitMQ | Queuing | Yes | Exchange/queue routing |

### Transport Plugin Interface

```csharp
public interface ITransportServer
{
    Task StartAsync(CancellationToken ct);
    Task StopAsync(CancellationToken ct);

    event Func OnHelloReceived;
    event Func OnHeartbeatReceived;
    event Func OnConnectionClosed;
}

public interface ITransportClient
{
    Task ConnectAsync(CancellationToken ct);
    Task DisconnectAsync(CancellationToken ct);
    Task SendFrameAsync(Frame frame, CancellationToken ct);
}
```

### Frame Types

```csharp
public enum FrameType : byte
{
    Hello = 1,
    Heartbeat = 2,
    Request = 3,
    Response = 4,
    RequestStreamData = 5,
    ResponseStreamData = 6,
    Cancel = 7
}
```

---

## Gateway Pipeline

### HTTP Middleware Stack

```
Request ─►│ ForwardedHeaders   │
          │ RequestLogging     │
          │ ErrorHandling      │
          │ Authentication     │
          │ EndpointResolution │ ◄── (Method, Path) → EndpointDescriptor
          │ Authorization      │ ◄── RequiringClaims check
          │ RoutingDecision    │ ◄── Select connection/instance
          │ TransportDispatch  │ ◄── Send to microservice
          ▼
```

### Connection State

Per-connection state maintained by Gateway:

```csharp
public sealed class ConnectionState
{
    public required string ConnectionId { get; init; }
    public required InstanceDescriptor Instance { get; init; }
    public InstanceHealthStatus Status { get; set; }
    public DateTime? LastHeartbeatUtc { get; set; }
    public double AveragePingMs { get; set; }
    public TransportType TransportType { get; init; }
    public Dictionary<(string Method, string Path), EndpointDescriptor> Endpoints { get; } = new();
    public IReadOnlyDictionary Schemas { get; init; } = new Dictionary();
}
```

### Payload Handling

The Gateway treats bodies as opaque byte sequences:

- No deserialization or schema interpretation
- Headers and bytes forwarded as-is
- Schema validation is the microservice's responsibility

### Payload Limits

Configurable limits protect against resource exhaustion:

| Limit | Scope |
|-------|-------|
| `MaxRequestBytesPerCall` | Single request |
| `MaxRequestBytesPerConnection` | All requests on a connection |
| `MaxAggregateInflightBytes` | All in-flight requests across the gateway |

Exceeding a limit results in:

- Early rejection (HTTP 413) if `Content-Length` is known
- Mid-stream abort with a CANCEL frame
- Appropriate error response (413 or 503)

---

## Microservice SDK

### Configuration

```csharp
services.AddStellaMicroservice(options =>
{
    options.ServiceName = "billing";
    options.Version = "1.0.0";
    options.Region = "us-east-1";
    options.InstanceId = Guid.NewGuid().ToString();
    options.ServiceDescription = "Invoice processing service";
});
```

### Endpoint Declaration

Attributes:

```csharp
[StellaEndpoint("POST", "/invoices")]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint
```

### Handler Interfaces

**Typed handler** (JSON serialization):

```csharp
public interface IStellaEndpoint<TRequest, TResponse>
{
    Task<TResponse> HandleAsync(TRequest request, CancellationToken ct);
}

public interface IStellaEndpoint<TResponse>
{
    Task<TResponse> HandleAsync(CancellationToken ct);
}
```

**Raw handler** (streaming):

```csharp
public interface IRawStellaEndpoint
{
    Task HandleAsync(RawRequestContext ctx, CancellationToken ct);
}
```

### Endpoint Discovery

Two
mechanisms:

1. **Source Generator** (preferred): Compile-time discovery via Roslyn
2. **Reflection** (fallback): Runtime assembly scanning

### Connection Behavior

On connection:

1. Send HELLO with instance info and endpoints
2. Start heartbeat timer
3. Listen for REQUEST frames

HELLO payload:

```csharp
public sealed class HelloPayload
{
    public required InstanceDescriptor Instance { get; init; }
    public required IReadOnlyList<EndpointDescriptor> Endpoints { get; init; }
    public IReadOnlyDictionary Schemas { get; init; } = new Dictionary();
    public ServiceOpenApiInfo? OpenApiInfo { get; init; }
}
```

---

## Authorization

### Claims-based Model

Authorization uses `RequiringClaims`, not roles:

```csharp
public sealed class ClaimRequirement
{
    public required string Type { get; init; }
    public string? Value { get; init; }
}
```

### Precedence

1. Microservice provides defaults in HELLO
2. Authority can override centrally
3. Gateway enforces final effective claims

### Enforcement

Gateway `AuthorizationMiddleware`:

- Validates user principal has all required claims
- Empty claims list = authenticated access only
- Missing claim = 403 Forbidden

---

## Cancellation

### CANCEL Frame

```csharp
public sealed class CancelPayload
{
    public required string Reason { get; init; }
    // Values: "ClientDisconnected", "Timeout", "PayloadLimitExceeded", "Shutdown"
}
```

### Gateway sends CANCEL when:

- HTTP client disconnects (`HttpContext.RequestAborted`)
- Request timeout elapses
- Payload limit exceeded
- Gateway shutdown

### Microservice handles CANCEL:

- Maps correlation ID to `CancellationTokenSource`
- Calls `Cancel()` on the source
- Handler receives cancellation via `CancellationToken`

---

## Streaming

### Buffered vs Streaming

| Mode | Request Body | Response Body | Use Case |
|------|--------------|---------------|----------|
| Buffered | Full in memory | Full in memory | Small payloads |
| Streaming | Chunked frames | Chunked frames | Large payloads |

### Frame Flow (Streaming)

```
Gateway                             Microservice
   │                                      │
   │  REQUEST (headers only)              │
   │ ────────────────────────────────────►│
   │                                      │
   │  REQUEST_STREAM_DATA (chunk 1)       │
   │ ────────────────────────────────────►│
   │                                      │
   │  REQUEST_STREAM_DATA (chunk n)       │
   │ ────────────────────────────────────►│
   │                                      │
   │  REQUEST_STREAM_DATA (final=true)    │
   │ ────────────────────────────────────►│
   │                                      │
   │  RESPONSE                            │
   │◄─────────────────────────────────────│
   │                                      │
   │  RESPONSE_STREAM_DATA                │
   │◄─────────────────────────────────────│
```

---

## Heartbeat & Health

### Heartbeat Frame

Sent at regular intervals over the same connection as requests:

```csharp
public sealed class HeartbeatPayload
{
    public required InstanceHealthStatus Status { get; init; }
    public int InflightRequests { get; init; }
    public double ErrorRate { get; init; }
}
```

### Health Tracking

The Gateway:

- Tracks `LastHeartbeatUtc` per connection
- Derives status from heartbeat recency
- Marks stale instances as `Unhealthy`
- Uses health in routing decisions

---

## Configuration

### Router YAML

```yaml
# router.yaml
Gateway:
  Region: "us-east-1"
  NodeId: "gw-east-01"
  Environment: "production"

PayloadLimits:
  MaxRequestBytesPerCall: 10485760          # 10 MB
  MaxRequestBytesPerConnection: 104857600   # 100 MB
  MaxAggregateInflightBytes: 1073741824     # 1 GB

Services:
  - ServiceName: billing
    DefaultVersion: "1.0.0"
    DefaultTransport: Tcp
    Endpoints:
      - Method: POST
        Path: /invoices
        TimeoutSeconds: 30
        RequiringClaims:
          - Type: "invoices:write"

OpenApi:
  Title: "StellaOps Gateway API"
  CacheTtlSeconds: 60
```

### Hot Reload

- YAML changes are picked up at runtime
- Routing state is updated without a restart
- New services and endpoints are added dynamically

---

## Error Mapping

| Condition | HTTP Status |
|-----------|-------------|
| Version not found | 404 Not Found |
| No healthy instance | 503 Service Unavailable |
| Request timeout | 504 Gateway Timeout |
| Payload too large | 413 Payload Too Large |
| Unauthorized | 401 Unauthorized |
| Missing claims | 403 Forbidden |
| Validation error | 422 Unprocessable Entity |
| Internal error | 500 Internal Server Error |

---

## See Also

- [schema-validation.md](schema-validation.md) - JSON Schema validation
- [openapi-aggregation.md](openapi-aggregation.md) - OpenAPI document generation
- [migration-guide.md](migration-guide.md) - WebService to Microservice migration
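
---

## Appendix: Instance Selection Sketch

The three-step order described under Routing Algorithm (filter, region tier, tie-breaks) can be illustrated with a minimal sketch. The `Candidate` record, the `InstanceSelector` class, and the `neighborRegions` parameter are hypothetical names introduced only for this example; the real Gateway works on `ConnectionState` and may organize this logic differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public enum InstanceHealthStatus { Unknown, Healthy, Degraded, Draining, Unhealthy }

// Hypothetical flattened view of one connection, used only for this sketch.
public sealed record Candidate(
    string ServiceName, string Version, string Region,
    InstanceHealthStatus Status, DateTime LastHeartbeatUtc, double AveragePingMs);

public static class InstanceSelector
{
    private static int _roundRobin; // counter used only to rotate among exact ties

    public static Candidate? Select(
        IEnumerable<Candidate> connections,
        string serviceName, string version,
        string gatewayRegion, IReadOnlyList<string> neighborRegions)
    {
        // Step 1: exact service and version match, acceptable health only.
        var candidates = connections
            .Where(c => c.ServiceName == serviceName
                     && c.Version == version
                     && c.Status is InstanceHealthStatus.Healthy or InstanceHealthStatus.Degraded)
            .ToList();
        if (candidates.Count == 0) return null; // no eligible instance

        // Step 2: region tier. 0 = gateway's own region, 1 = neighbor, 2 = any other.
        int Tier(Candidate c) =>
            c.Region == gatewayRegion ? 0 :
            neighborRegions.Contains(c.Region) ? 1 : 2;

        var bestTier = candidates.Min(c => Tier(c));
        var inTier = candidates.Where(c => Tier(c) == bestTier).ToList();

        // Step 3: lowest average ping, then most recent heartbeat.
        var minPing = inTier.Min(c => c.AveragePingMs);
        var byPing = inTier.Where(c => c.AveragePingMs == minPing).ToList();
        var latest = byPing.Max(c => c.LastHeartbeatUtc);
        var tied = byPing.Where(c => c.LastHeartbeatUtc == latest).ToList();

        // Anything still tied is rotated round-robin.
        var next = (uint)Interlocked.Increment(ref _roundRobin);
        return tied[(int)(next % (uint)tied.Count)];
    }
}
```

A caller would pass the gateway's own region (from `GatewayNodeConfig.Region`) and its configured neighbor regions. A `null` result means no instance passed the filter, and the caller maps that to the appropriate error status.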