router planning

This commit is contained in:
master
2025-12-02 18:38:32 +02:00
parent 790801f329
commit 0c9e8d5d18
15 changed files with 6439 additions and 0 deletions

docs/router/07-Step.md Normal file

@@ -0,0 +1,378 @@
For this step you're wiring **request cancellation** end-to-end in the InMemory setup:
> Client / gateway gives up → gateway sends CANCEL → microservice cancels handler
No need to mix in streaming or payload limits yet; just enforce cancellation for timeouts and client disconnects.
---
## 0. Preconditions
Have in place:
* `FrameType.Cancel` in `StellaOps.Router.Common.FrameType`.
* `ITransportClient.SendCancelAsync(ConnectionState, Guid, string?)` in Common.
* Minimal InMemory path from HTTP → gateway → microservice (HELLO + REQUEST/RESPONSE) working.
If `FrameType.Cancel` or `SendCancelAsync` aren't there yet, add them first.
---
## 1. Common: cancel payload (optional, but useful)
If you want reasons attached, add a DTO in Common:
```csharp
public sealed class CancelPayload
{
    public string Reason { get; init; } = string.Empty; // e.g. "ClientDisconnected", "Timeout"
}
```
You'll serialize this into `Frame.Payload` when sending a CANCEL. If you don't care about reasons yet, you can skip the payload and just use the correlation id.
No behavior in Common, just the shape.
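
One way to turn the DTO into bytes for `Frame.Payload` is UTF-8 JSON via `System.Text.Json`. This is a minimal sketch, not the project's wire format: the `CancelPayloadCodec` helper and its method names are hypothetical, and the DTO is repeated only so the block compiles on its own. An empty payload maps to `null`, which matches sending a CANCEL without a reason.

```csharp
using System;
using System.Text.Json;

public sealed class CancelPayload
{
    public string Reason { get; init; } = string.Empty;
}

// Hypothetical helper (not part of the router code): JSON is an assumed encoding.
public static class CancelPayloadCodec
{
    // Encodes the payload as UTF-8 JSON bytes.
    public static byte[] Serialize(CancelPayload payload) =>
        JsonSerializer.SerializeToUtf8Bytes(payload);

    // Returns null for an empty payload (a CANCEL that carries no reason).
    public static CancelPayload? Deserialize(ReadOnlySpan<byte> bytes) =>
        bytes.IsEmpty ? null : JsonSerializer.Deserialize<CancelPayload>(bytes);
}
```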
---
## 2. Gateway: trigger CANCEL on client abort and timeout
### 2.1 Extend `TransportDispatchMiddleware`
You already:
* Generate a `correlationId`.
* Build a `FrameType.Request`.
* Call `ITransportClient.SendRequestAsync(...)` and await it.
Now:
1. Create a linked CTS that combines:
   * `HttpContext.RequestAborted`
   * The endpoint timeout
2. Register a callback on `RequestAborted` that sends a CANCEL with the same correlationId.
3. On `OperationCanceledException` where the HTTP token is not canceled (pure timeout), send a CANCEL once and return 504.
Sketch:
```csharp
public async Task Invoke(HttpContext context, ITransportClient transportClient)
{
    var decision = (RoutingDecision)context.Items[RouterHttpContextKeys.RoutingDecision]!;
    var correlationId = Guid.NewGuid();

    // build requestFrame as before

    var timeout = decision.EffectiveTimeout;
    using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted);
    linkedCts.CancelAfter(timeout);

    // Fire-and-forget CANCEL on client disconnect. Disposing the registration when
    // the method exits keeps the callback from firing for an already finished request.
    using var abortRegistration = context.RequestAborted.Register(() =>
    {
        _ = transportClient.SendCancelAsync(
            decision.Connection, correlationId, "ClientDisconnected");
    });

    Frame responseFrame;
    try
    {
        responseFrame = await transportClient.SendRequestAsync(
            decision.Connection,
            requestFrame,
            timeout,
            linkedCts.Token);
    }
    catch (OperationCanceledException) when (!context.RequestAborted.IsCancellationRequested)
    {
        // Internal timeout: the linked token fired while the client is still connected.
        // Client-abort cancellations are not caught here and propagate to the caller.
        await transportClient.SendCancelAsync(
            decision.Connection, correlationId, "Timeout");
        context.Response.StatusCode = StatusCodes.Status504GatewayTimeout;
        await context.Response.WriteAsync("Upstream timeout");
        return;
    }

    // existing response mapping goes here
}
```
Key points:
* The gateway sends CANCEL **as soon as**:
  * The client disconnects (RequestAborted).
  * Or the internal timeout triggers (catch branch).
* We do not need any global correlation registry on the gateway side; the middleware has the `correlationId` and `Connection`.
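
The two triggers can be told apart with nothing but the linked token source, which is why no registry is needed. The sketch below isolates that logic outside ASP.NET: `clientToken` stands in for `HttpContext.RequestAborted`, and `CancelCauseDemo`, `Outcome`, and `RunAsync` are hypothetical names used only for illustration. If both triggers fire at about the same time, this version reports the client abort.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo type; not part of the gateway.
public static class CancelCauseDemo
{
    public enum Outcome { Completed, ClientAborted, TimedOut }

    // 'clientToken' plays the role of HttpContext.RequestAborted.
    public static async Task<Outcome> RunAsync(
        Func<CancellationToken, Task> work, TimeSpan timeout, CancellationToken clientToken)
    {
        using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(clientToken);
        linkedCts.CancelAfter(timeout);
        try
        {
            await work(linkedCts.Token);
            return Outcome.Completed;
        }
        catch (OperationCanceledException) when (clientToken.IsCancellationRequested)
        {
            // The outer token is canceled, so the client went away.
            return Outcome.ClientAborted;
        }
        catch (OperationCanceledException)
        {
            // Only the linked source fired, so this is the internal timeout.
            return Outcome.TimedOut;
        }
    }
}
```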
---
## 3. InMemory transport: propagate CANCEL to microservice
### 3.1 Implement `SendCancelAsync` in `InMemoryTransportClient` (gateway side)
In your gateway InMemory implementation:
```csharp
public Task SendCancelAsync(ConnectionState connection, Guid correlationId, string? reason = null)
{
    var payload = reason is null
        ? Array.Empty<byte>()
        : SerializeCancelPayload(new CancelPayload { Reason = reason });

    var frame = new Frame
    {
        Type = FrameType.Cancel,
        CorrelationId = correlationId,
        Payload = payload
    };

    return _hub.SendFromGatewayAsync(connection.ConnectionId, frame, CancellationToken.None);
}
```
`_hub.SendFromGatewayAsync` must route the frame to the microservice's receive loop for that connection.
### 3.2 Hub routing
Ensure your `IInMemoryRouterHub` implementation:
* When `SendFromGatewayAsync(connectionId, cancelFrame, ct)` is called, enqueues that frame onto the microservice's incoming channel (the `GetFramesForMicroserviceAsync` stream).
No extra logic; just treat CANCEL like REQUEST/HELLO in terms of delivery.
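
As a rough illustration of "just deliver it", here is a minimal hub sketch built on `System.Threading.Channels`, with one unbounded channel per connection. Everything in it is an assumption made for the example: the `InMemoryRouterHubSketch` class, the string connection id, the parameter lists, and the simplified `Frame` record and `FrameType` enum that stand in for the real Common types. The only thing it shows is that CANCEL travels the same FIFO path as REQUEST.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Simplified stand-ins for the Common types, so the sketch compiles on its own.
public enum FrameType { Hello, Request, Response, Cancel }
public sealed record Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

// Hypothetical hub; the real IInMemoryRouterHub may look different.
public sealed class InMemoryRouterHubSketch
{
    private readonly ConcurrentDictionary<string, Channel<Frame>> _toMicroservice = new();

    private Channel<Frame> ChannelFor(string connectionId) =>
        _toMicroservice.GetOrAdd(connectionId, _ => Channel.CreateUnbounded<Frame>());

    // No per-type logic: every frame, CANCEL included, is enqueued in arrival order.
    public Task SendFromGatewayAsync(string connectionId, Frame frame, CancellationToken ct) =>
        ChannelFor(connectionId).Writer.WriteAsync(frame, ct).AsTask();

    // The microservice's receive loop consumes this stream.
    public IAsyncEnumerable<Frame> GetFramesForMicroserviceAsync(string connectionId, CancellationToken ct) =>
        ChannelFor(connectionId).Reader.ReadAllAsync(ct);
}
```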
---
## 4. Microservice: track in-flight requests
Now the microservice needs to know **which** request to cancel when a CANCEL arrives.
### 4.1 In-flight registry
In the microservice connection class (the one doing the receive loop):
```csharp
private readonly ConcurrentDictionary<Guid, RequestExecution> _inflight =
    new();

private sealed class RequestExecution
{
    public CancellationTokenSource Cts { get; init; } = default!;
    public Task ExecutionTask { get; init; } = default!;
}
```
When a `Request` frame arrives:
* Create a `CancellationTokenSource`.
* Start the handler using that token.
* Store both in `_inflight`.
Example pattern in `ReceiveLoopAsync`:
```csharp
private async Task ReceiveLoopAsync(CancellationToken ct)
{
    await foreach (var frame in _routerClient.GetIncomingFramesAsync(ct))
    {
        switch (frame.Type)
        {
            case FrameType.Request:
                HandleRequest(frame);
                break;
            case FrameType.Cancel:
                HandleCancel(frame);
                break;
            // other frame types...
        }
    }
}

private void HandleRequest(Frame frame)
{
    var cts = new CancellationTokenSource();
    var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cts.Token); // later link to global shutdown if needed
    var exec = new RequestExecution
    {
        Cts = cts,
        ExecutionTask = HandleRequestCoreAsync(frame, linkedCts.Token)
    };
    _inflight[frame.CorrelationId] = exec;

    // Clean up once the handler finishes, whether it succeeded, failed, or was canceled.
    // The lambda parameter is named so that `out _` below stays a discard; with a
    // single parameter called `_`, `out _` would bind to that parameter instead.
    _ = exec.ExecutionTask.ContinueWith(completedTask =>
    {
        _inflight.TryRemove(frame.CorrelationId, out _);
        cts.Dispose();
        linkedCts.Dispose();
    }, TaskScheduler.Default);
}
```
### 4.2 Wire CancellationToken into dispatcher
`HandleRequestCoreAsync` should:
* Deserialize the request payload.
* Build a `RawRequestContext` with `CancellationToken = token`.
* Pass that token through to:
  * `IRawStellaEndpoint.HandleAsync(context)` (via the context).
  * Or typed handler adapter (`IStellaEndpoint<,>` / `IStellaEndpoint<TResponse>`), passing it explicitly.
Example pattern:
```csharp
private async Task HandleRequestCoreAsync(Frame frame, CancellationToken ct)
{
    var req = DeserializeRequestPayload(frame.Payload);

    if (!_catalog.TryGetHandler(req.Method, req.Path, out var registration))
    {
        var notFound = BuildNotFoundResponse(frame.CorrelationId);
        await _routerClient.SendFrameAsync(notFound, ct);
        return;
    }

    using var bodyStream = new MemoryStream(req.Body); // minimal case
    var ctx = new RawRequestContext
    {
        Method = req.Method,
        Path = req.Path,
        Headers = req.Headers,
        Body = bodyStream,
        CancellationToken = ct
    };

    var handler = (IRawStellaEndpoint)_serviceProvider.GetRequiredService(registration.HandlerType);
    var response = await handler.HandleAsync(ctx);

    var respFrame = BuildResponseFrame(frame.CorrelationId, response);
    await _routerClient.SendFrameAsync(respFrame, ct);
}
```
Now each handler sees a token that will be canceled when a CANCEL frame arrives.
### 4.3 Handle CANCEL frames
When a `Cancel` frame arrives:
```csharp
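private void HandleCancel(Frame frame)
{
    if (_inflight.TryGetValue(frame.CorrelationId, out var exec))
    {
        try
        {
            exec.Cts.Cancel();
        }
        catch (ObjectDisposedException)
        {
            // The request finished and disposed its CTS concurrently; nothing to cancel.
        }
    }
    // Ignore if not found (e.g. already completed)
}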
```
If you care about the reason, deserialize `CancelPayload` and log it; not required for behavior.
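
The register, cancel, and clean-up steps from 4.1 to 4.3 can also be read as one small unit. The sketch below is a standalone variant under stated assumptions: `InflightRegistrySketch` and its members are hypothetical names, the handler is reduced to a `Func<CancellationToken, Task>`, and clean-up uses `try`/`finally` instead of a continuation. `Task.Yield()` hands control back to the caller before the handler body runs, so a slow synchronous prefix does not stall the receive loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical standalone registry; the real connection class keeps this state inline.
public sealed class InflightRegistrySketch
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _inflight = new();

    public int Count => _inflight.Count;

    // Registers the request first, then starts the handler with its own token.
    public Task Start(Guid correlationId, Func<CancellationToken, Task> handler)
    {
        var cts = new CancellationTokenSource();
        _inflight[correlationId] = cts;
        return RunAsync(correlationId, cts, handler);
    }

    private async Task RunAsync(Guid id, CancellationTokenSource cts, Func<CancellationToken, Task> handler)
    {
        try
        {
            await Task.Yield(); // return to the caller before running handler code
            await handler(cts.Token);
        }
        finally
        {
            // Runs on success, failure, and cancellation alike.
            _inflight.TryRemove(id, out _);
            cts.Dispose();
        }
    }

    // Returns true if a matching in-flight request was found and canceled.
    public bool Cancel(Guid correlationId)
    {
        if (!_inflight.TryGetValue(correlationId, out var cts)) return false;
        try
        {
            cts.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            return false; // the request completed while the CANCEL was being handled
        }
    }
}
```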
---
## 5. Handler guidance (for your Microservice docs)
In `Stella Ops Router Microservice.md`, add simple rules devs must follow:
* Any long-running or IO-heavy code in endpoints MUST:
  * Accept a `CancellationToken` (for typed endpoints).
  * Or use `RawRequestContext.CancellationToken` for raw endpoints.
* Always pass the token into:
  * DB calls.
  * File I/O and stream operations.
  * HTTP/gRPC calls to other services.
* Do not swallow `OperationCanceledException` unless there is a good reason; normally let it bubble or treat it as a normal cancellation.
Concrete example for devs:
```csharp
[StellaEndpoint("POST", "/billing/slow-operation")]
public sealed class SlowEndpoint : IRawStellaEndpoint
{
    public async Task<RawResponse> HandleAsync(RawRequestContext ctx)
    {
        // Correct: observe token
        await Task.Delay(TimeSpan.FromMinutes(5), ctx.CancellationToken);
        return new RawResponse { StatusCode = 204 };
    }
}
```
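
Long-running code that never awaits has no call to hand the token to, so it has to poll the token itself. A minimal sketch of that pattern follows; `CooperativeLoop.Sum` is a hypothetical helper used only to show the check, and the 1024-iteration interval is an arbitrary choice. `ThrowIfCancellationRequested` raises `OperationCanceledException`, which the rule above says to let bubble.

```csharp
using System;
using System.Threading;

// Hypothetical helper; not part of the router or microservice SDK.
public static class CooperativeLoop
{
    // Sums 0..count-1 and polls the token every 1024 iterations.
    public static long Sum(int count, CancellationToken ct)
    {
        long total = 0;
        for (var i = 0; i < count; i++)
        {
            if ((i & 1023) == 0)
            {
                ct.ThrowIfCancellationRequested();
            }
            total += i;
        }
        return total;
    }
}
```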
---
## 6. Tests
### 6.1 Client abort → CANCEL
Test outline:
* Setup:
  * Gateway + microservice wired via InMemory hub.
  * Microservice endpoint that:
    * Waits on `Task.Delay(TimeSpan.FromMinutes(5), ctx.CancellationToken)`.
* Test:
  1. Start HTTP request to `/slow`.
  2. After sending the request, cancel the client's `HttpClient` token or close the connection.
  3. Assert:
     * The gateway's InMemory transport sent a `FrameType.Cancel`.
     * The microservice's handler is canceled (e.g. no longer running after a short time).
     * No response (full or partial) is written; the HTTP side produces whatever your test harness expects when the client aborts.
### 6.2 Gateway timeout → CANCEL
* Configure a small endpoint timeout (e.g. 100 ms).
* Have the endpoint sleep for 5 seconds with the token.
* Assert:
  * Gateway returns 504.
  * Cancel frame was sent.
  * Handler is canceled (task completes early).
These tests lock in the semantics so later additions (real transports, streaming) don't regress cancellation.
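
For the "handler is canceled" assertions, a small helper keeps the tests from waiting out the full handler delay. This is a sketch independent of any test framework; `CancellationAssertions.WasCanceledWithinAsync` is a hypothetical name. It races the handler task against a grace period and reports whether the task ended in the Canceled state. A typical call would be `await CancellationAssertions.WasCanceledWithinAsync(handlerTask, TimeSpan.FromSeconds(1))`, asserted to be true.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical test helper; adapt to your test framework's assertion style.
public static class CancellationAssertions
{
    // True if 'handlerTask' ends in the Canceled state within 'grace'.
    public static async Task<bool> WasCanceledWithinAsync(Task handlerTask, TimeSpan grace)
    {
        var finished = await Task.WhenAny(handlerTask, Task.Delay(grace));
        return finished == handlerTask && handlerTask.IsCanceled;
    }
}
```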
---
## 7. Done criteria for “Add cancellation semantics (with InMemory)”
You can mark step 7 as complete when:
* For every routed request, the gateway knows its correlationId and connection.
* On client disconnect:
  * Gateway sends a `FrameType.Cancel` with that correlationId.
* On internal timeout:
  * Gateway sends a `FrameType.Cancel` and returns 504 to the client.
* InMemory hub delivers CANCEL frames to the microservice.
* Microservice:
  * Tracks in-flight requests by correlationId.
  * Cancels the proper `CancellationTokenSource` when CANCEL arrives.
  * Passes the token into handlers via `RawRequestContext` and typed adapters.
* At least one automated test proves:
  * Cancellation propagates from gateway to microservice and stops the handler.
Once this is done, you'll be in good shape to add streaming & payload-limits on top, because the cancel path is already wired end-to-end.