git.stella-ops.org/docs/router/07-Step.md
2025-12-02 18:38:32 +02:00

For this step you're wiring **request cancellation** end-to-end in the InMemory setup:

> Client / gateway gives up → gateway sends CANCEL → microservice cancels handler

No need to mix in streaming or payload limits yet; just enforce cancellation for timeouts and client disconnects.

---
## 0. Preconditions
Have in place:
* `FrameType.Cancel` in `StellaOps.Router.Common.FrameType`.
* `ITransportClient.SendCancelAsync(ConnectionState, Guid, string?)` in Common.
* Minimal InMemory path from HTTP → gateway → microservice (HELLO + REQUEST/RESPONSE) working.
If `FrameType.Cancel` or `SendCancelAsync` aren't there yet, add them first.

---
## 1. Common: cancel payload (optional, but useful)
If you want reasons attached, add a DTO in Common:
```csharp
public sealed class CancelPayload
{
    public string Reason { get; init; } = string.Empty; // e.g. "ClientDisconnected", "Timeout"
}
```
You'll serialize this into `Frame.Payload` when sending a CANCEL. If you don't care about reasons yet, you can skip the payload and just use the correlation id.
No behavior in Common, just the shape.

---
## 2. Gateway: trigger CANCEL on client abort and timeout
### 2.1 Extend `TransportDispatchMiddleware`
You already:
* Generate a `correlationId`.
* Build a `FrameType.Request`.
* Call `ITransportClient.SendRequestAsync(...)` and await it.
Now:
1. Create a linked CTS that combines:
   * `HttpContext.RequestAborted`
   * The endpoint timeout
2. Register a callback on `RequestAborted` that sends a CANCEL with the same correlationId.
3. On `OperationCanceledException` where the HTTP token is not canceled (pure timeout), send a CANCEL once and return 504.
Sketch:
```csharp
public async Task Invoke(HttpContext context, ITransportClient transportClient)
{
    var decision = (RoutingDecision)context.Items[RouterHttpContextKeys.RoutingDecision]!;
    var correlationId = Guid.NewGuid();

    // build requestFrame as before

    var timeout = decision.EffectiveTimeout;
    using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted);
    linkedCts.CancelAfter(timeout);

    // fire-and-forget cancel on client disconnect;
    // the registration is disposed when this method exits, so the callback
    // cannot fire once the request has already been handled here
    using var abortRegistration = context.RequestAborted.Register(() =>
    {
        _ = transportClient.SendCancelAsync(
            decision.Connection, correlationId, "ClientDisconnected");
    });

    Frame responseFrame;
    try
    {
        responseFrame = await transportClient.SendRequestAsync(
            decision.Connection,
            requestFrame,
            timeout,
            linkedCts.Token);
    }
    catch (OperationCanceledException) when (!context.RequestAborted.IsCancellationRequested)
    {
        // internal timeout
        await transportClient.SendCancelAsync(
            decision.Connection, correlationId, "Timeout");
        context.Response.StatusCode = StatusCodes.Status504GatewayTimeout;
        await context.Response.WriteAsync("Upstream timeout");
        return;
    }

    // existing response mapping goes here
}
```
Key points:
* The gateway sends CANCEL **as soon as**:
  * The client disconnects (RequestAborted).
  * Or the internal timeout triggers (catch branch).
* We do not need any global correlation registry on the gateway side; the middleware has the `correlationId` and `Connection`.
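
The timeout-versus-abort distinction in the catch filter above can be tried in isolation. The following is a minimal sketch, not the middleware itself: `CancelCause` and `CancelCauseDemo` are hypothetical names introduced only for illustration, and the sketch assumes the work delegate throws `OperationCanceledException` only because of the token it was given.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical names, used only in this sketch.
public enum CancelCause { None, ClientDisconnected, Timeout }

public static class CancelCauseDemo
{
    // Runs `work` under a token that fires on client abort OR after `timeout`,
    // then reports which of the two caused the cancellation.
    public static async Task<CancelCause> RunAsync(
        Func<CancellationToken, Task> work,
        CancellationToken clientAborted,
        TimeSpan timeout)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(clientAborted);
        linked.CancelAfter(timeout);
        try
        {
            await work(linked.Token);
            return CancelCause.None;
        }
        catch (OperationCanceledException)
        {
            // Same test as the catch filter: if the client token has fired,
            // treat it as a disconnect; otherwise only the timer can have fired.
            return clientAborted.IsCancellationRequested
                ? CancelCause.ClientDisconnected
                : CancelCause.Timeout;
        }
    }
}
```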
---
## 3. InMemory transport: propagate CANCEL to microservice
### 3.1 Implement `SendCancelAsync` in `InMemoryTransportClient` (gateway side)
In your gateway InMemory implementation:
```csharp
public Task SendCancelAsync(ConnectionState connection, Guid correlationId, string? reason = null)
{
    var payload = reason is null
        ? Array.Empty<byte>()
        : SerializeCancelPayload(new CancelPayload { Reason = reason });

    var frame = new Frame
    {
        Type = FrameType.Cancel,
        CorrelationId = correlationId,
        Payload = payload
    };

    return _hub.SendFromGatewayAsync(connection.ConnectionId, frame, CancellationToken.None);
}
```
`_hub.SendFromGatewayAsync` must route the frame to the microservice's receive loop for that connection.
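
`SerializeCancelPayload` is used above but not defined in this step. One possible shape is sketched below, assuming JSON via `System.Text.Json` and a `byte[]` payload; both are assumptions, and the project may already use a different wire format. `CancelPayloadCodec` is a hypothetical helper name, and `CancelPayload` is repeated only to keep the sketch self-contained.

```csharp
using System;
using System.Text.Json;

// Repeated from section 1 so the sketch compiles on its own.
public sealed class CancelPayload
{
    public string Reason { get; init; } = string.Empty;
}

// Hypothetical helper; the JSON encoding is an assumption, not the router's defined format.
public static class CancelPayloadCodec
{
    // Gateway side: turn the DTO into bytes for Frame.Payload.
    public static byte[] Serialize(CancelPayload payload) =>
        JsonSerializer.SerializeToUtf8Bytes(payload);

    // Microservice side: an empty payload means CANCEL was sent without a reason.
    public static CancelPayload? Deserialize(ReadOnlySpan<byte> payload) =>
        payload.IsEmpty ? null : JsonSerializer.Deserialize<CancelPayload>(payload);
}
```

The same `Deserialize` helper could be used on the microservice side if you decide to log the reason.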
### 3.2 Hub routing
Ensure your `IInMemoryRouterHub` implementation:
* When `SendFromGatewayAsync(connectionId, cancelFrame, ct)` is called:
  * Enqueues that frame onto the microservice's incoming channel (`GetFramesForMicroserviceAsync` stream).
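
If the hub does not exist yet, the delivery rule above can be sketched with `System.Threading.Channels`. This is only an illustration under stated assumptions: `SketchRouterHub` is a hypothetical class, the `Frame` and `FrameType` types are simplified stand-ins for the ones in Common, and connection ids are assumed to be strings.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Simplified stand-ins for the Common types, only so the sketch compiles on its own.
public enum FrameType { Hello, Request, Response, Cancel }
public sealed record Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

// Hypothetical hub: one unbounded channel per connection, gateway → microservice.
public sealed class SketchRouterHub
{
    private readonly ConcurrentDictionary<string, Channel<Frame>> _toMicroservice = new();

    private Channel<Frame> ChannelFor(string connectionId) =>
        _toMicroservice.GetOrAdd(connectionId, _ => Channel.CreateUnbounded<Frame>());

    // No per-type logic: CANCEL is enqueued exactly like HELLO or REQUEST,
    // so frames arrive in the order the gateway sent them.
    public Task SendFromGatewayAsync(string connectionId, Frame frame, CancellationToken ct) =>
        ChannelFor(connectionId).Writer.WriteAsync(frame, ct).AsTask();

    public IAsyncEnumerable<Frame> GetFramesForMicroserviceAsync(string connectionId, CancellationToken ct) =>
        ChannelFor(connectionId).Reader.ReadAllAsync(ct);
}
```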
No extra logic; just treat CANCEL like REQUEST/HELLO in terms of delivery.

---
## 4. Microservice: track in-flight requests
Now the microservice needs to know **which** request to cancel when a CANCEL arrives.
### 4.1 In-flight registry
In the microservice connection class (the one doing the receive loop):
```csharp
private readonly ConcurrentDictionary<Guid, RequestExecution> _inflight =
    new();

private sealed class RequestExecution
{
    public CancellationTokenSource Cts { get; init; } = default!;
    public Task ExecutionTask { get; init; } = default!;
}
```
When a `Request` frame arrives:
* Create a `CancellationTokenSource`.
* Start the handler using that token.
* Store both in `_inflight`.
Example pattern in `ReceiveLoopAsync`:
```csharp
private async Task ReceiveLoopAsync(CancellationToken ct)
{
    await foreach (var frame in _routerClient.GetIncomingFramesAsync(ct))
    {
        switch (frame.Type)
        {
            case FrameType.Request:
                HandleRequest(frame);
                break;
            case FrameType.Cancel:
                HandleCancel(frame);
                break;
            // other frame types...
        }
    }
}

private void HandleRequest(Frame frame)
{
    var cts = new CancellationTokenSource();
    var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cts.Token); // later link to global shutdown if needed

    var exec = new RequestExecution
    {
        Cts = cts,
        ExecutionTask = HandleRequestCoreAsync(frame, linkedCts.Token)
    };
    _inflight[frame.CorrelationId] = exec;

    _ = exec.ExecutionTask.ContinueWith(_ =>
    {
        _inflight.TryRemove(frame.CorrelationId, out _);
        cts.Dispose();
        linkedCts.Dispose();
    }, TaskScheduler.Default);
}
```
### 4.2 Wire CancellationToken into dispatcher
`HandleRequestCoreAsync` should:
* Deserialize the request payload.
* Build a `RawRequestContext` with `CancellationToken = token`.
* Pass that token through to:
  * `IRawStellaEndpoint.HandleAsync(context)` (via the context).
  * Or typed handler adapter (`IStellaEndpoint<,>` / `IStellaEndpoint<TResponse>`), passing it explicitly.
Example pattern:
```csharp
private async Task HandleRequestCoreAsync(Frame frame, CancellationToken ct)
{
    var req = DeserializeRequestPayload(frame.Payload);

    if (!_catalog.TryGetHandler(req.Method, req.Path, out var registration))
    {
        var notFound = BuildNotFoundResponse(frame.CorrelationId);
        await _routerClient.SendFrameAsync(notFound, ct);
        return;
    }

    using var bodyStream = new MemoryStream(req.Body); // minimal case
    var ctx = new RawRequestContext
    {
        Method = req.Method,
        Path = req.Path,
        Headers = req.Headers,
        Body = bodyStream,
        CancellationToken = ct
    };

    var handler = (IRawStellaEndpoint)_serviceProvider.GetRequiredService(registration.HandlerType);
    var response = await handler.HandleAsync(ctx);

    var respFrame = BuildResponseFrame(frame.CorrelationId, response);
    await _routerClient.SendFrameAsync(respFrame, ct);
}
```
Now each handler sees a token that will be canceled when a CANCEL frame arrives.
### 4.3 Handle CANCEL frames
When a `Cancel` frame arrives:
```csharp
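private void HandleCancel(Frame frame)
{
    if (_inflight.TryGetValue(frame.CorrelationId, out var exec))
    {
        try
        {
            exec.Cts.Cancel();
        }
        catch (ObjectDisposedException)
        {
            // The request finished and disposed its CTS between the lookup and Cancel().
        }
    }
    // Ignore if not found (e.g. already completed)
}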
```
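
To check the register, cancel, and clean-up semantics of sections 4.1 to 4.3 without the transport, a condensed stand-alone version can help. The sketch below is a simplification under stated assumptions: `InflightSketch` is a hypothetical class, it keeps a single `CancellationTokenSource` per request, and it cleans up in a `finally` block rather than in a `ContinueWith` continuation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, condensed stand-in for the in-flight registry.
public sealed class InflightSketch
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _inflight = new();

    public int Count => _inflight.Count;

    // REQUEST path: track a per-request CTS, then run the handler with its token.
    public Task Start(Guid correlationId, Func<CancellationToken, Task> handler)
    {
        var cts = new CancellationTokenSource();
        _inflight[correlationId] = cts;
        return RunAsync(correlationId, cts, handler);
    }

    private async Task RunAsync(Guid id, CancellationTokenSource cts, Func<CancellationToken, Task> handler)
    {
        try
        {
            await handler(cts.Token);
        }
        finally
        {
            // Runs on success, failure, and cancellation alike.
            _inflight.TryRemove(id, out _);
            cts.Dispose();
        }
    }

    // CANCEL path: returns false when the request is unknown or already finished.
    public bool Cancel(Guid correlationId)
    {
        if (!_inflight.TryGetValue(correlationId, out var cts))
        {
            return false;
        }
        try
        {
            cts.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            return false; // finished between the lookup and Cancel()
        }
    }
}
```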
If you care about the reason, deserialize `CancelPayload` and log it; not required for behavior.

---
## 5. Handler guidance (for your Microservice docs)
In `Stella Ops Router Microservice.md`, add simple rules devs must follow:
* Any long-running or IO-heavy code in endpoints MUST:
  * Accept a `CancellationToken` (for typed endpoints).
  * Or use `RawRequestContext.CancellationToken` for raw endpoints.
* Always pass the token into:
  * DB calls.
  * File I/O and stream operations.
  * HTTP/gRPC calls to other services.
* Do not swallow `OperationCanceledException` unless there is a good reason; normally let it bubble or treat it as a normal cancellation.
Concrete example for devs:
```csharp
[StellaEndpoint("POST", "/billing/slow-operation")]
public sealed class SlowEndpoint : IRawStellaEndpoint
{
    public async Task<RawResponse> HandleAsync(RawRequestContext ctx)
    {
        // Correct: observe token
        await Task.Delay(TimeSpan.FromMinutes(5), ctx.CancellationToken);
        return new RawResponse { StatusCode = 204 };
    }
}
```
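
Long-running code that never awaits a token-aware call needs to poll the token itself. The sketch below illustrates one way to do that for a CPU-bound loop; `CooperativeWork` and the polling interval of 1024 iterations are arbitrary choices made for this example, not router requirements.

```csharp
using System;
using System.Threading;

// Hypothetical helper showing cooperative cancellation in a CPU-bound loop.
public static class CooperativeWork
{
    // Sums 0..count-1, checking the token every 1024 iterations so that a
    // CANCEL is observed promptly even though nothing in the loop is awaited.
    public static long SumWithCancellation(int count, CancellationToken ct)
    {
        long total = 0;
        for (var i = 0; i < count; i++)
        {
            if ((i & 1023) == 0)
            {
                // Throws OperationCanceledException; let it bubble to the dispatcher.
                ct.ThrowIfCancellationRequested();
            }
            total += i;
        }
        return total;
    }
}
```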
---
## 6. Tests
### 6.1 Client abort → CANCEL
Test outline:
* Setup:
  * Gateway + microservice wired via InMemory hub.
  * Microservice endpoint that:
    * Waits on `Task.Delay(TimeSpan.FromMinutes(5), ctx.CancellationToken)`.
* Test:
  1. Start HTTP request to `/slow`.
  2. After sending the request, cancel the client's `HttpClient` token or close the connection.
  3. Assert:
     * The gateway's InMemory transport sent a `FrameType.Cancel`.
     * The microservice's handler is canceled (e.g. no longer running after a short time).
     * No response (or only a partial one) is written; the HTTP side produces whatever your test harness expects when the client aborts.
### 6.2 Gateway timeout → CANCEL
* Configure endpoint timeout small (e.g. 100 ms).
* Have endpoint sleep for 5 seconds with the token.
* Assert:
  * Gateway returns 504.
  * Cancel frame was sent.
  * Handler is canceled (task completes early).
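
The "handler is canceled (task completes early)" assertion can be prototyped without the gateway. The following is a rough sketch under stated assumptions: `TimeoutCancelCheck` is a hypothetical helper, the slow handler is a plain `Task.Delay`, and `CancelAfter` stands in for the whole timeout → CANCEL frame → `HandleCancel` path that the real test exercises.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; it does not involve the gateway or the hub.
public static class TimeoutCancelCheck
{
    // Stand-in for the slow endpoint: sleeps 5 s while observing the token.
    private static Task SlowHandlerAsync(CancellationToken ct) =>
        Task.Delay(TimeSpan.FromSeconds(5), ct);

    // Returns how long the handler ran before it was canceled;
    // throws if the handler finished without observing cancellation.
    public static async Task<TimeSpan> RunAsync(TimeSpan gatewayTimeout)
    {
        using var cts = new CancellationTokenSource();
        var stopwatch = Stopwatch.StartNew();
        var handler = SlowHandlerAsync(cts.Token);

        // Stands in for: gateway timeout fires, CANCEL is delivered, HandleCancel runs.
        cts.CancelAfter(gatewayTimeout);

        try
        {
            await handler;
        }
        catch (OperationCanceledException)
        {
            return stopwatch.Elapsed;
        }
        throw new InvalidOperationException("Handler completed without observing cancellation.");
    }
}
```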
These tests lock in the semantics so later additions (real transports, streaming) don't regress cancellation.

---
## 7. Done criteria for “Add cancellation semantics (with InMemory)”
You can mark step 7 as complete when:
* For every routed request, the gateway knows its correlationId and connection.
* On client disconnect:
  * Gateway sends a `FrameType.Cancel` with that correlationId.
* On internal timeout:
  * Gateway sends a `FrameType.Cancel` and returns 504 to the client.
* InMemory hub delivers CANCEL frames to the microservice.
* Microservice:
  * Tracks in-flight requests by correlationId.
  * Cancels the proper `CancellationTokenSource` when CANCEL arrives.
  * Passes the token into handlers via `RawRequestContext` and typed adapters.
* At least one automated test proves:
  * Cancellation propagates from gateway to microservice and stops the handler.
Once this is done, you'll be in good shape to add streaming & payload-limits on top, because the cancel path is already wired end-to-end.