Add integration tests for migration categories and execution

- Implemented MigrationCategoryTests to validate migration categorization for startup, release, seed, and data migrations.
- Added tests for edge cases, including null, empty, and whitespace migration names.
- Created StartupMigrationHostTests to verify the behavior of the migration host with real PostgreSQL instances using Testcontainers.
- Included tests for migration execution, schema creation, and handling of pending release migrations.
- Added SQL migration files for testing: creating a test table, adding a column, a release migration, and seeding data.
This commit is contained in:
master
2025-12-04 19:10:54 +02:00
parent 600f3a7a3c
commit 75f6942769
301 changed files with 32810 additions and 1128 deletions

946
docs/router/13-Step.md Normal file

@@ -0,0 +1,946 @@
# Step 13: InMemory Transport Implementation
**Phase 3: Transport Layer**
**Estimated Complexity:** Medium
**Dependencies:** Step 12 (Request/Response Serialization)
---
## Overview
The InMemory transport is a high-performance, in-process transport with no network hop, intended for testing, local development, and microservices hosted in the same process. It serves as the reference implementation for the transport layer and must pass all protocol tests before any network transport is implemented.
---
## Goals
1. Implement a fully functional in-process transport without network overhead
2. Serve as the reference implementation for transport protocol compliance
3. Enable fast integration tests without network dependencies
4. Support all frame types and streaming semantics
5. Provide debugging hooks for protocol validation
---
## Core Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ InMemory Transport Hub │
├─────────────────────────────────────────────────────────────┤
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Gateway Side │◄──►│ Channels │◄──►│Microservice │ │
│ │ Client │ │ (Duplex) │ │ Server │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ Connection Registry Frame Queue Handler Dispatch │
└─────────────────────────────────────────────────────────────┘
```
---
## Core Types
### InMemory Channel
```csharp
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// Bidirectional in-memory channel for frame exchange.
/// </summary>
public sealed class InMemoryChannel : IAsyncDisposable
{
private readonly Channel<Frame> _gatewayToService;
private readonly Channel<Frame> _serviceToGateway;
private readonly CancellationTokenSource _cts;
public string ChannelId { get; }
public string ServiceName { get; }
public string InstanceId { get; }
public ConnectionState State { get; private set; }
public DateTimeOffset CreatedAt { get; }
public DateTimeOffset LastActivityAt { get; private set; }
public InMemoryChannel(string serviceName, string instanceId)
{
ChannelId = Guid.NewGuid().ToString("N");
ServiceName = serviceName;
InstanceId = instanceId;
CreatedAt = DateTimeOffset.UtcNow;
LastActivityAt = CreatedAt;
State = ConnectionState.Connecting;
_cts = new CancellationTokenSource();
// Bounded channels to provide backpressure
var options = new BoundedChannelOptions(1000)
{
FullMode = BoundedChannelFullMode.Wait,
SingleReader = false,
SingleWriter = false
};
_gatewayToService = Channel.CreateBounded<Frame>(options);
_serviceToGateway = Channel.CreateBounded<Frame>(options);
}
/// <summary>
/// Gets the writer for sending frames from gateway to service.
/// </summary>
public ChannelWriter<Frame> GatewayWriter => _gatewayToService.Writer;
/// <summary>
/// Gets the reader for receiving frames from gateway (service side).
/// </summary>
public ChannelReader<Frame> ServiceReader => _gatewayToService.Reader;
/// <summary>
/// Gets the writer for sending frames from service to gateway.
/// </summary>
public ChannelWriter<Frame> ServiceWriter => _serviceToGateway.Writer;
/// <summary>
/// Gets the reader for receiving frames from service (gateway side).
/// </summary>
public ChannelReader<Frame> GatewayReader => _serviceToGateway.Reader;
public void MarkConnected()
{
State = ConnectionState.Connected;
LastActivityAt = DateTimeOffset.UtcNow;
}
public void UpdateActivity()
{
LastActivityAt = DateTimeOffset.UtcNow;
}
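public ValueTask DisposeAsync()
{
State = ConnectionState.Disconnected;
_cts.Cancel();
// Completing the writers lets pending ReadAllAsync loops on both sides finish
_gatewayToService.Writer.TryComplete();
_serviceToGateway.Writer.TryComplete();
_cts.Dispose();
// Nothing here is awaited, so return a completed ValueTask instead of using async
return ValueTask.CompletedTask;
}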
}
```
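The bounded channels in `InMemoryChannel` use `BoundedChannelFullMode.Wait`, which is what produces backpressure. The following is a minimal sketch of that behavior using only `System.Threading.Channels`; the `BackpressureDemo` class and the capacity of 2 are illustrative assumptions, not part of the transport.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class BackpressureDemo
{
    // Fills a bounded channel and reports what happens once it is at capacity.
    public static async Task<(bool ThirdAccepted, bool WriterBlocked, int FirstRead)> RunAsync()
    {
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(2)
        {
            FullMode = BoundedChannelFullMode.Wait
        });

        channel.Writer.TryWrite(1);
        channel.Writer.TryWrite(2);

        // The channel is full: TryWrite refuses rather than dropping or growing.
        bool thirdAccepted = channel.Writer.TryWrite(3);

        // WriteAsync stays pending until a reader frees a slot.
        ValueTask pending = channel.Writer.WriteAsync(3);
        bool writerBlocked = !pending.IsCompleted;

        int firstRead = await channel.Reader.ReadAsync(); // frees one slot
        await pending;                                    // the queued write now completes

        return (thirdAccepted, writerBlocked, firstRead);
    }
}
```

With a capacity of 1000 as configured above, a gateway that outpaces a slow service would suspend in `WriteAsync` instead of growing the queue without bound.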
### InMemory Hub
```csharp
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// Central hub managing all InMemory transport connections.
/// </summary>
public sealed class InMemoryTransportHub : IDisposable
{
private readonly ConcurrentDictionary<string, InMemoryChannel> _channels = new();
private readonly ConcurrentDictionary<string, List<string>> _serviceChannels = new();
private readonly ILogger<InMemoryTransportHub> _logger;
public InMemoryTransportHub(ILogger<InMemoryTransportHub> logger)
{
_logger = logger;
}
/// <summary>
/// Creates a new channel for a microservice connection.
/// </summary>
public InMemoryChannel CreateChannel(string serviceName, string instanceId)
{
var channel = new InMemoryChannel(serviceName, instanceId);
if (!_channels.TryAdd(channel.ChannelId, channel))
{
throw new InvalidOperationException($"Channel {channel.ChannelId} already exists");
}
_serviceChannels.AddOrUpdate(
serviceName,
_ => new List<string> { channel.ChannelId },
(_, list) => { lock (list) { list.Add(channel.ChannelId); } return list; }
);
_logger.LogDebug(
"Created InMemory channel {ChannelId} for {ServiceName}/{InstanceId}",
channel.ChannelId, serviceName, instanceId);
return channel;
}
/// <summary>
/// Gets a channel by ID.
/// </summary>
public InMemoryChannel? GetChannel(string channelId)
{
return _channels.TryGetValue(channelId, out var channel) ? channel : null;
}
/// <summary>
/// Gets all channels for a service.
/// </summary>
public IReadOnlyList<InMemoryChannel> GetServiceChannels(string serviceName)
{
if (!_serviceChannels.TryGetValue(serviceName, out var channelIds))
return Array.Empty<InMemoryChannel>();
var result = new List<InMemoryChannel>();
lock (channelIds)
{
foreach (var id in channelIds)
{
if (_channels.TryGetValue(id, out var channel) &&
channel.State == ConnectionState.Connected)
{
result.Add(channel);
}
}
}
return result;
}
/// <summary>
/// Removes a channel from the hub.
/// </summary>
public async Task RemoveChannelAsync(string channelId)
{
if (_channels.TryRemove(channelId, out var channel))
{
if (_serviceChannels.TryGetValue(channel.ServiceName, out var list))
{
lock (list) { list.Remove(channelId); }
}
await channel.DisposeAsync();
_logger.LogDebug("Removed InMemory channel {ChannelId}", channelId);
}
}
/// <summary>
/// Gets all active channels.
/// </summary>
public IEnumerable<InMemoryChannel> GetAllChannels()
{
return _channels.Values.Where(c => c.State == ConnectionState.Connected);
}
public void Dispose()
{
foreach (var channel in _channels.Values)
{
_ = channel.DisposeAsync();
}
_channels.Clear();
_serviceChannels.Clear();
}
}
```
---
## Gateway-Side Client
```csharp
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// Gateway-side client for InMemory transport.
/// </summary>
public sealed class InMemoryTransportClient : ITransportClient
{
private readonly InMemoryTransportHub _hub;
private readonly IPayloadSerializer _serializer;
private readonly ILogger<InMemoryTransportClient> _logger;
private readonly ConcurrentDictionary<string, TaskCompletionSource<ResponsePayload>> _pendingRequests = new();
public string TransportType => "InMemory";
public InMemoryTransportClient(
InMemoryTransportHub hub,
IPayloadSerializer serializer,
ILogger<InMemoryTransportClient> logger)
{
_hub = hub;
_serializer = serializer;
_logger = logger;
}
public async Task<ResponsePayload> SendRequestAsync(
string serviceName,
RequestPayload request,
TimeSpan timeout,
CancellationToken cancellationToken)
{
var channels = _hub.GetServiceChannels(serviceName);
if (channels.Count == 0)
{
throw new NoAvailableInstanceException(serviceName);
}
// Simple random selection (in production, use routing plugin)
var channel = channels[Random.Shared.Next(channels.Count)];
var correlationId = Guid.NewGuid().ToString("N");
var tcs = new TaskCompletionSource<ResponsePayload>(TaskCreationOptions.RunContinuationsAsynchronously);
_pendingRequests[correlationId] = tcs;
try
{
// Create and send request frame
var frame = new Frame
{
Type = FrameType.Request,
CorrelationId = correlationId,
Payload = _serializer.SerializeRequest(request)
};
await channel.GatewayWriter.WriteAsync(frame, cancellationToken);
channel.UpdateActivity();
// Combine the caller's token with the timeout
using var timeoutCts = new CancellationTokenSource(timeout);
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(
cancellationToken, timeoutCts.Token);
// Start listening for response; the linked token also stops the listener
// when the request times out, so it does not keep reading after the caller has given up
_ = ListenForResponseAsync(channel, correlationId, linkedCts.Token);
try
{
return await tcs.Task.WaitAsync(linkedCts.Token);
}
catch (OperationCanceledException) when (timeoutCts.IsCancellationRequested)
{
// Send cancel frame
await SendCancelAsync(channel, correlationId);
throw new TimeoutException($"Request to {serviceName} timed out after {timeout}");
}
}
finally
{
_pendingRequests.TryRemove(correlationId, out _);
}
}
public async IAsyncEnumerable<ResponsePayload> SendStreamingRequestAsync(
string serviceName,
IAsyncEnumerable<RequestPayload> requestChunks,
TimeSpan timeout,
[EnumeratorCancellation] CancellationToken cancellationToken)
{
var channels = _hub.GetServiceChannels(serviceName);
if (channels.Count == 0)
{
throw new NoAvailableInstanceException(serviceName);
}
var channel = channels[Random.Shared.Next(channels.Count)];
var correlationId = Guid.NewGuid().ToString("N");
// Send all request chunks
await foreach (var chunk in requestChunks.WithCancellation(cancellationToken))
{
var frame = new Frame
{
Type = FrameType.Request,
CorrelationId = correlationId,
Payload = _serializer.SerializeRequest(chunk),
Flags = chunk.IsStreaming ? FrameFlags.None : FrameFlags.Final
};
await channel.GatewayWriter.WriteAsync(frame, cancellationToken);
channel.UpdateActivity();
}
// Read response chunks
await foreach (var frame in channel.GatewayReader.ReadAllAsync(cancellationToken))
{
if (frame.CorrelationId != correlationId)
continue;
if (frame.Type == FrameType.Response)
{
var response = _serializer.DeserializeResponse(frame.Payload);
yield return response;
if (response.IsFinalChunk || frame.Flags.HasFlag(FrameFlags.Final))
yield break;
}
}
}
private async Task ListenForResponseAsync(
InMemoryChannel channel,
string correlationId,
CancellationToken cancellationToken)
{
try
{
await foreach (var frame in channel.GatewayReader.ReadAllAsync(cancellationToken))
{
if (frame.CorrelationId != correlationId)
continue;
if (frame.Type == FrameType.Response)
{
var response = _serializer.DeserializeResponse(frame.Payload);
if (_pendingRequests.TryGetValue(correlationId, out var tcs))
{
tcs.TrySetResult(response);
}
return;
}
}
}
catch (OperationCanceledException)
{
// Expected on cancellation
}
}
private async Task SendCancelAsync(InMemoryChannel channel, string correlationId)
{
try
{
var cancelFrame = new Frame
{
Type = FrameType.Cancel,
CorrelationId = correlationId,
Payload = Array.Empty<byte>()
};
await channel.GatewayWriter.WriteAsync(cancelFrame);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to send cancel frame for {CorrelationId}", correlationId);
}
}
}
```
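As written, each `SendRequestAsync` call starts its own `ListenForResponseAsync` loop on the shared `GatewayReader` and skips frames whose correlation ID does not match. Because a channel delivers each item to exactly one reader, a skipped frame is lost to the request that was actually waiting for it when several requests share a channel. One possible remedy is a single read loop per channel that dispatches by correlation ID. The sketch below assumes this design; `DemoFrame`, `ResponseDemultiplexer`, and `DemuxDemo` are hypothetical stand-ins rather than types from this step.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical stand-in for the Frame type defined in earlier steps.
public sealed record DemoFrame(string CorrelationId, string Body);

public sealed class ResponseDemultiplexer
{
    private readonly ConcurrentDictionary<string, TaskCompletionSource<DemoFrame>> _pending = new();

    // Registers interest in a correlation ID; call this before writing the request.
    public Task<DemoFrame> Expect(string correlationId)
    {
        var tcs = new TaskCompletionSource<DemoFrame>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[correlationId] = tcs;
        return tcs.Task;
    }

    // Single read loop: every frame is routed to its own waiter, so no frame is
    // consumed and dropped by a listener that was waiting for a different ID.
    public async Task PumpAsync(ChannelReader<DemoFrame> reader, CancellationToken ct = default)
    {
        await foreach (var frame in reader.ReadAllAsync(ct))
        {
            if (_pending.TryRemove(frame.CorrelationId, out var tcs))
                tcs.TrySetResult(frame);
        }
    }
}

public static class DemuxDemo
{
    // Two waiters, responses arriving out of order, one pump.
    public static async Task<(string A, string B)> RunAsync()
    {
        var channel = Channel.CreateUnbounded<DemoFrame>();
        var demux = new ResponseDemultiplexer();
        var waitA = demux.Expect("a");
        var waitB = demux.Expect("b");
        var pump = demux.PumpAsync(channel.Reader);

        await channel.Writer.WriteAsync(new DemoFrame("b", "second"));
        await channel.Writer.WriteAsync(new DemoFrame("a", "first"));
        channel.Writer.Complete();
        await pump;

        return ((await waitA).Body, (await waitB).Body);
    }
}
```

In this arrangement the pump would be started once per channel, for example when the channel becomes connected, and `SendRequestAsync` would only register a waiter and write the request frame.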
---
## Microservice-Side Server
```csharp
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// Microservice-side server for InMemory transport.
/// </summary>
public sealed class InMemoryTransportServer : ITransportServer
{
private readonly InMemoryTransportHub _hub;
private readonly IPayloadSerializer _serializer;
private readonly ILogger<InMemoryTransportServer> _logger;
private InMemoryChannel? _channel;
private CancellationTokenSource? _cts;
private Task? _processingTask;
public string TransportType => "InMemory";
public bool IsConnected => _channel?.State == ConnectionState.Connected;
public event Func<RequestPayload, CancellationToken, Task<ResponsePayload>>? OnRequest;
public event Func<string, CancellationToken, Task>? OnCancel;
public InMemoryTransportServer(
InMemoryTransportHub hub,
IPayloadSerializer serializer,
ILogger<InMemoryTransportServer> logger)
{
_hub = hub;
_serializer = serializer;
_logger = logger;
}
public async Task ConnectAsync(
string serviceName,
string instanceId,
EndpointDescriptor[] endpoints,
CancellationToken cancellationToken)
{
_channel = _hub.CreateChannel(serviceName, instanceId);
_cts = new CancellationTokenSource();
// Send HELLO frame
var helloPayload = new HelloPayload
{
ServiceName = serviceName,
InstanceId = instanceId,
Endpoints = endpoints,
Metadata = new Dictionary<string, string>
{
["transport"] = "InMemory",
["pid"] = Environment.ProcessId.ToString()
}
};
var helloFrame = new Frame
{
Type = FrameType.Hello,
CorrelationId = Guid.NewGuid().ToString("N"),
Payload = _serializer.SerializeHello(helloPayload)
};
await _channel.ServiceWriter.WriteAsync(helloFrame, cancellationToken);
// Wait for HELLO response
var response = await _channel.ServiceReader.ReadAsync(cancellationToken);
if (response.Type != FrameType.Hello)
{
throw new ProtocolException($"Expected HELLO response, got {response.Type}");
}
_channel.MarkConnected();
_logger.LogInformation(
"InMemory transport connected for {ServiceName}/{InstanceId}",
serviceName, instanceId);
// Start processing loop
_processingTask = ProcessFramesAsync(_cts.Token);
}
private async Task ProcessFramesAsync(CancellationToken cancellationToken)
{
if (_channel == null) return;
try
{
await foreach (var frame in _channel.ServiceReader.ReadAllAsync(cancellationToken))
{
_channel.UpdateActivity();
switch (frame.Type)
{
case FrameType.Request:
_ = HandleRequestAsync(frame, cancellationToken);
break;
case FrameType.Cancel:
if (OnCancel != null)
{
await OnCancel(frame.CorrelationId, cancellationToken);
}
break;
case FrameType.Heartbeat:
await HandleHeartbeatAsync(frame);
break;
}
}
}
catch (OperationCanceledException)
{
// Expected on shutdown
}
catch (Exception ex)
{
_logger.LogError(ex, "Error processing InMemory frames");
}
}
private async Task HandleRequestAsync(Frame frame, CancellationToken cancellationToken)
{
if (_channel == null || OnRequest == null) return;
try
{
var request = _serializer.DeserializeRequest(frame.Payload);
var response = await OnRequest(request, cancellationToken);
var responseFrame = new Frame
{
Type = FrameType.Response,
CorrelationId = frame.CorrelationId,
Payload = _serializer.SerializeResponse(response),
Flags = FrameFlags.Final
};
await _channel.ServiceWriter.WriteAsync(responseFrame, cancellationToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error handling request {CorrelationId}", frame.CorrelationId);
// Send error response
var errorResponse = new ResponsePayload
{
StatusCode = 500,
Headers = new Dictionary<string, string>(),
ErrorMessage = ex.Message,
IsFinalChunk = true
};
var errorFrame = new Frame
{
Type = FrameType.Response,
CorrelationId = frame.CorrelationId,
Payload = _serializer.SerializeResponse(errorResponse),
Flags = FrameFlags.Final | FrameFlags.Error
};
await _channel.ServiceWriter.WriteAsync(errorFrame, cancellationToken);
}
}
private async Task HandleHeartbeatAsync(Frame frame)
{
if (_channel == null) return;
var pongFrame = new Frame
{
Type = FrameType.Heartbeat,
CorrelationId = frame.CorrelationId,
Payload = frame.Payload // Echo back
};
await _channel.ServiceWriter.WriteAsync(pongFrame);
}
public async Task DisconnectAsync()
{
_cts?.Cancel();
if (_processingTask != null)
{
try
{
await _processingTask.WaitAsync(TimeSpan.FromSeconds(5));
}
catch (TimeoutException)
{
_logger.LogWarning("InMemory processing task did not complete in time");
}
}
if (_channel != null)
{
await _hub.RemoveChannelAsync(_channel.ChannelId);
}
_cts?.Dispose();
}
public async Task SendHeartbeatAsync(CancellationToken cancellationToken)
{
if (_channel == null || _channel.State != ConnectionState.Connected)
return;
var heartbeatFrame = new Frame
{
Type = FrameType.Heartbeat,
CorrelationId = Guid.NewGuid().ToString("N"),
Payload = BitConverter.GetBytes(DateTimeOffset.UtcNow.ToUnixTimeMilliseconds())
};
await _channel.ServiceWriter.WriteAsync(heartbeatFrame, cancellationToken);
}
}
```
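`ConnectAsync` writes a HELLO frame and then waits for a HELLO reply on `ServiceReader`, but the gateway-side code shown in this step does not include the counterpart that reads the HELLO and answers it. The following is a minimal sketch of what that acceptor might look like, assuming the reply is a HELLO frame echoing the correlation ID. `HsFrameType`, `HsFrame`, and `HelloHandshake` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical minimal stand-ins for FrameType and Frame.
public enum HsFrameType { Hello, Request, Response }
public sealed record HsFrame(HsFrameType Type, string CorrelationId);

public static class HelloHandshake
{
    // Gateway side: read the first frame from the service, require HELLO,
    // and answer with a HELLO that echoes the correlation ID.
    public static async Task AcceptAsync(
        ChannelReader<HsFrame> fromService,
        ChannelWriter<HsFrame> toService,
        CancellationToken ct = default)
    {
        var first = await fromService.ReadAsync(ct);
        if (first.Type != HsFrameType.Hello)
            throw new InvalidOperationException($"Expected HELLO, got {first.Type}");

        await toService.WriteAsync(new HsFrame(HsFrameType.Hello, first.CorrelationId), ct);
    }

    // Demo: the service sends HELLO, the gateway accepts, the service reads the reply.
    public static async Task<HsFrame> RunAsync()
    {
        var serviceToGateway = Channel.CreateUnbounded<HsFrame>();
        var gatewayToService = Channel.CreateUnbounded<HsFrame>();

        var accept = AcceptAsync(serviceToGateway.Reader, gatewayToService.Writer);
        await serviceToGateway.Writer.WriteAsync(new HsFrame(HsFrameType.Hello, "c1"));

        var reply = await gatewayToService.Reader.ReadAsync();
        await accept;
        return reply;
    }
}
```

Without some component playing this role, a server calling `ConnectAsync` would wait for the reply until its cancellation token fires.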
---
## Integration with Global Routing State
```csharp
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// InMemory transport integration with gateway routing state.
/// </summary>
public sealed class InMemoryRoutingIntegration : IHostedService
{
private readonly InMemoryTransportHub _hub;
private readonly IGlobalRoutingState _routingState;
private readonly ILogger<InMemoryRoutingIntegration> _logger;
private Timer? _syncTimer;
public InMemoryRoutingIntegration(
InMemoryTransportHub hub,
IGlobalRoutingState routingState,
ILogger<InMemoryRoutingIntegration> logger)
{
_hub = hub;
_routingState = routingState;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
// Sync InMemory channels with routing state periodically
_syncTimer = new Timer(SyncChannels, null, TimeSpan.Zero, TimeSpan.FromSeconds(5));
return Task.CompletedTask;
}
private void SyncChannels(object? state)
{
try
{
foreach (var channel in _hub.GetAllChannels())
{
var connection = new EndpointConnection
{
ServiceName = channel.ServiceName,
InstanceId = channel.InstanceId,
ConnectionId = channel.ChannelId,
Transport = "InMemory",
State = channel.State,
LastHeartbeat = channel.LastActivityAt
};
_routingState.UpdateConnection(connection);
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Error syncing InMemory channels");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_syncTimer?.Dispose();
return Task.CompletedTask;
}
}
```
---
## Service Registration
```csharp
namespace StellaOps.Router.Transport.InMemory;
public static class InMemoryTransportExtensions
{
/// <summary>
/// Adds InMemory transport to the gateway.
/// </summary>
public static IServiceCollection AddInMemoryTransport(this IServiceCollection services)
{
services.AddSingleton<InMemoryTransportHub>();
services.AddSingleton<ITransportClient, InMemoryTransportClient>();
services.AddHostedService<InMemoryRoutingIntegration>();
return services;
}
/// <summary>
/// Adds InMemory transport to a microservice.
/// </summary>
public static IServiceCollection AddInMemoryMicroserviceTransport(
this IServiceCollection services,
Action<InMemoryTransportOptions>? configure = null)
{
var options = new InMemoryTransportOptions();
configure?.Invoke(options);
services.AddSingleton(options);
services.AddSingleton<ITransportServer, InMemoryTransportServer>();
return services;
}
}
public class InMemoryTransportOptions
{
public int MaxPendingRequests { get; set; } = 1000;
public TimeSpan ConnectionTimeout { get; set; } = TimeSpan.FromSeconds(30);
}
```
---
## Testing Utilities
```csharp
namespace StellaOps.Router.Transport.InMemory.Testing;
/// <summary>
/// Test fixture for InMemory transport testing.
/// </summary>
public sealed class InMemoryTransportFixture : IAsyncDisposable
{
private readonly InMemoryTransportHub _hub;
private readonly ILoggerFactory _loggerFactory;
public InMemoryTransportHub Hub => _hub;
public InMemoryTransportFixture()
{
_loggerFactory = LoggerFactory.Create(b => b.AddConsole());
_hub = new InMemoryTransportHub(_loggerFactory.CreateLogger<InMemoryTransportHub>());
}
public InMemoryTransportClient CreateClient()
{
var serializer = new MessagePackPayloadSerializer();
return new InMemoryTransportClient(
_hub,
serializer,
_loggerFactory.CreateLogger<InMemoryTransportClient>());
}
public InMemoryTransportServer CreateServer()
{
var serializer = new MessagePackPayloadSerializer();
return new InMemoryTransportServer(
_hub,
serializer,
_loggerFactory.CreateLogger<InMemoryTransportServer>());
}
public ValueTask DisposeAsync()
{
_hub.Dispose();
_loggerFactory.Dispose();
// Both disposals are synchronous, so no async state machine is needed
return ValueTask.CompletedTask;
}
}
```
---
## Unit Tests
```csharp
public class InMemoryTransportTests
{
[Fact]
public async Task SimpleRequestResponse_Works()
{
await using var fixture = new InMemoryTransportFixture();
var client = fixture.CreateClient();
var server = fixture.CreateServer();
// Setup server
server.OnRequest += (request, ct) => Task.FromResult(new ResponsePayload
{
StatusCode = 200,
Headers = new Dictionary<string, string>(),
Body = Encoding.UTF8.GetBytes($"Hello {request.Path}")
});
await server.ConnectAsync("test-service", "instance-1", Array.Empty<EndpointDescriptor>(), default);
// Send request
var response = await client.SendRequestAsync(
"test-service",
new RequestPayload
{
Method = "GET",
Path = "/test",
Headers = new Dictionary<string, string>(),
Claims = new Dictionary<string, string>()
},
TimeSpan.FromSeconds(5),
default);
Assert.Equal(200, response.StatusCode);
Assert.Equal("Hello /test", Encoding.UTF8.GetString(response.Body!));
}
[Fact]
public async Task Cancellation_SendsCancelFrame()
{
await using var fixture = new InMemoryTransportFixture();
var client = fixture.CreateClient();
var server = fixture.CreateServer();
var cancelReceived = new TaskCompletionSource<bool>();
server.OnRequest += async (request, ct) =>
{
await Task.Delay(TimeSpan.FromSeconds(30), ct);
return new ResponsePayload { StatusCode = 200, Headers = new Dictionary<string, string>() };
};
server.OnCancel += (correlationId, ct) =>
{
cancelReceived.TrySetResult(true);
return Task.CompletedTask;
};
await server.ConnectAsync("test-service", "instance-1", Array.Empty<EndpointDescriptor>(), default);
// Send request with short timeout
await Assert.ThrowsAsync<TimeoutException>(() =>
client.SendRequestAsync(
"test-service",
new RequestPayload { Method = "GET", Path = "/slow", Headers = new Dictionary<string, string>(), Claims = new Dictionary<string, string>() },
TimeSpan.FromMilliseconds(100),
default));
// Verify cancel was received
var result = await cancelReceived.Task.WaitAsync(TimeSpan.FromSeconds(1));
Assert.True(result);
}
[Fact]
public async Task MultipleInstances_DistributesRequests()
{
await using var fixture = new InMemoryTransportFixture();
var client = fixture.CreateClient();
var server1 = fixture.CreateServer();
var server2 = fixture.CreateServer();
var server1Count = 0;
var server2Count = 0;
server1.OnRequest += (r, ct) =>
{
Interlocked.Increment(ref server1Count);
return Task.FromResult(new ResponsePayload { StatusCode = 200, Headers = new Dictionary<string, string>() });
};
server2.OnRequest += (r, ct) =>
{
Interlocked.Increment(ref server2Count);
return Task.FromResult(new ResponsePayload { StatusCode = 200, Headers = new Dictionary<string, string>() });
};
await server1.ConnectAsync("test-service", "instance-1", Array.Empty<EndpointDescriptor>(), default);
await server2.ConnectAsync("test-service", "instance-2", Array.Empty<EndpointDescriptor>(), default);
// Send multiple requests
for (int i = 0; i < 100; i++)
{
await client.SendRequestAsync(
"test-service",
new RequestPayload { Method = "GET", Path = "/test", Headers = new Dictionary<string, string>(), Claims = new Dictionary<string, string>() },
TimeSpan.FromSeconds(5),
default);
}
// Both instances should have received requests
Assert.True(server1Count > 0);
Assert.True(server2Count > 0);
Assert.Equal(100, server1Count + server2Count);
}
}
```
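The distribution test above passes because the client picks an instance at random, so with 100 requests both servers are hit with overwhelming probability. If a deterministic rotation is preferred, a round-robin selector could be sketched as follows. `RoundRobinSelector` is a hypothetical helper, not part of the transport API.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class RoundRobinSelector
{
    private int _next = -1;

    // Returns items in rotation; safe for concurrent callers.
    public T Select<T>(IReadOnlyList<T> items)
    {
        if (items.Count == 0)
            throw new InvalidOperationException("No items to select from");

        // Cast to uint so the index stays non-negative after the counter overflows.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _next));
        return items[(int)(ticket % (uint)items.Count)];
    }
}
```

With two instances, successive calls return the first, the second, then the first again, which would let a test assert an exact 50/50 split instead of "greater than zero".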
---
## Deliverables
1. `StellaOps.Router.Transport.InMemory/InMemoryChannel.cs`
2. `StellaOps.Router.Transport.InMemory/InMemoryTransportHub.cs`
3. `StellaOps.Router.Transport.InMemory/InMemoryTransportClient.cs`
4. `StellaOps.Router.Transport.InMemory/InMemoryTransportServer.cs`
5. `StellaOps.Router.Transport.InMemory/InMemoryRoutingIntegration.cs`
6. `StellaOps.Router.Transport.InMemory/InMemoryTransportExtensions.cs`
7. `StellaOps.Router.Transport.InMemory.Testing/InMemoryTransportFixture.cs`
8. Unit tests for all frame types
9. Integration tests for request/response patterns
10. Streaming tests
---
## Next Step
Proceed to [Step 14: TCP Transport Implementation](14-Step.md) to implement the primary production transport.

1054
docs/router/14-Step.md Normal file

File diff suppressed because it is too large

1156
docs/router/15-Step.md Normal file

File diff suppressed because it is too large

994
docs/router/16-Step.md Normal file

@@ -0,0 +1,994 @@
# Step 16: GraphQL Handler Implementation
**Phase 4: Handler Plugins**
**Estimated Complexity:** High
**Dependencies:** Step 10 (Microservice Handler)
---
## Overview
The GraphQL handler routes GraphQL queries, mutations, and subscriptions to appropriate microservices based on schema analysis. It supports schema stitching, query splitting, and federated execution across multiple services.
---
## Goals
1. Route GraphQL operations to appropriate backend services
2. Support schema federation/stitching across microservices
3. Handle batched queries with DataLoader patterns
4. Support subscriptions via WebSocket upgrade
5. Provide introspection proxying and schema caching
---
## Core Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│ GraphQL Handler │
├──────────────────────────────────────────────────────────────────┤
│ │
│ HTTP Request │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ Query Parser │──► Extract operation type & fields │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ ┌─────────────────┐ │
│ │ Query Planner │───►│ Schema Registry │ │
│ └───────┬───────┘ └─────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │Query Executor │──► Split & dispatch to services │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │Result Merger │──► Combine partial results │
│ └───────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────┘
```
---
## Configuration
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public class GraphQLHandlerConfig
{
/// <summary>Path prefix for GraphQL endpoint.</summary>
public string Path { get; set; } = "/graphql";
/// <summary>Whether to enable introspection queries.</summary>
public bool EnableIntrospection { get; set; } = true;
/// <summary>Whether to enable subscriptions.</summary>
public bool EnableSubscriptions { get; set; } = true;
/// <summary>Maximum query depth, to prevent denial-of-service via deeply nested queries.</summary>
public int MaxQueryDepth { get; set; } = 15;
/// <summary>Maximum query complexity score.</summary>
public int MaxQueryComplexity { get; set; } = 1000;
/// <summary>Timeout for query execution.</summary>
public TimeSpan ExecutionTimeout { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>Cache duration for schema introspection.</summary>
public TimeSpan SchemaCacheDuration { get; set; } = TimeSpan.FromMinutes(5);
/// <summary>Whether to enable query batching.</summary>
public bool EnableBatching { get; set; } = true;
/// <summary>Maximum batch size.</summary>
public int MaxBatchSize { get; set; } = 10;
/// <summary>Registered GraphQL services and their type ownership.</summary>
public Dictionary<string, GraphQLServiceConfig> Services { get; set; } = new();
}
public class GraphQLServiceConfig
{
/// <summary>Service name for routing.</summary>
public required string ServiceName { get; set; }
/// <summary>Root types this service handles (Query, Mutation, Subscription).</summary>
public HashSet<string> RootTypes { get; set; } = new();
/// <summary>Specific fields this service owns.</summary>
public Dictionary<string, HashSet<string>> OwnedFields { get; set; } = new();
/// <summary>Whether this service provides the full schema.</summary>
public bool IsSchemaProvider { get; set; }
}
```
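To illustrate how `OwnedFields` might drive routing, the sketch below resolves which service owns a given field. It assumes a simple first-match lookup over a map shaped like `GraphQLServiceConfig.OwnedFields` (type name to field names) keyed by service name; `FieldOwnership.FindOwner` is a hypothetical helper, and the actual planner may resolve ownership differently.

```csharp
using System;
using System.Collections.Generic;

public static class FieldOwnership
{
    // services: service name -> (GraphQL type name -> owned field names).
    // Returns the first service that claims the field, or null if none does.
    public static string? FindOwner(
        IReadOnlyDictionary<string, Dictionary<string, HashSet<string>>> services,
        string typeName,
        string fieldName)
    {
        foreach (var (serviceName, ownedFields) in services)
        {
            if (ownedFields.TryGetValue(typeName, out var fields) && fields.Contains(fieldName))
                return serviceName;
        }

        return null; // no service claims this field
    }
}
```

A planner built on this idea would call `FindOwner("Query", "users")` for each root field and group the fields by the returned service name before generating sub-queries.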
---
## Core Types
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
/// <summary>
/// Parsed GraphQL request.
/// </summary>
public sealed class GraphQLRequest
{
public required string Query { get; init; }
public string? OperationName { get; init; }
public Dictionary<string, object?>? Variables { get; init; }
public Dictionary<string, object?>? Extensions { get; init; }
}
/// <summary>
/// GraphQL response format.
/// </summary>
public sealed class GraphQLResponse
{
public object? Data { get; set; }
public List<GraphQLError>? Errors { get; set; }
public Dictionary<string, object?>? Extensions { get; set; }
}
public sealed class GraphQLError
{
public required string Message { get; init; }
public List<GraphQLLocation>? Locations { get; init; }
public List<object>? Path { get; init; }
public Dictionary<string, object?>? Extensions { get; init; }
}
public sealed class GraphQLLocation
{
public int Line { get; init; }
public int Column { get; init; }
}
/// <summary>
/// Represents a planned query execution.
/// </summary>
public sealed class QueryPlan
{
public GraphQLOperationType OperationType { get; init; }
public List<QueryPlanNode> Nodes { get; init; } = new();
}
public sealed class QueryPlanNode
{
public string ServiceName { get; init; } = "";
public string SubQuery { get; init; } = "";
public List<string> RequiredFields { get; init; } = new();
public List<QueryPlanNode> DependsOn { get; init; } = new();
}
public enum GraphQLOperationType
{
Query,
Mutation,
Subscription
}
```
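Because each `QueryPlanNode` lists the nodes it `DependsOn`, an executor has to run dependencies first. One way to derive that order is a depth-first topological sort, sketched below under the assumption that the plan forms a directed acyclic graph. `PlanNode` is a hypothetical trimmed-down stand-in for `QueryPlanNode`, and `PlanOrdering` is an illustrative helper.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical trimmed-down stand-in for QueryPlanNode.
public sealed class PlanNode
{
    public string ServiceName { get; init; } = "";
    public List<PlanNode> DependsOn { get; init; } = new();
}

public static class PlanOrdering
{
    // Depth-first topological sort: each node appears after all of its dependencies.
    public static List<PlanNode> InExecutionOrder(IEnumerable<PlanNode> nodes)
    {
        var ordered = new List<PlanNode>();
        var done = new HashSet<PlanNode>();
        var inProgress = new HashSet<PlanNode>();

        void Visit(PlanNode node)
        {
            if (done.Contains(node)) return;
            if (!inProgress.Add(node))
                throw new InvalidOperationException($"Cyclic dependency at {node.ServiceName}");

            foreach (var dependency in node.DependsOn)
                Visit(dependency);

            inProgress.Remove(node);
            done.Add(node);
            ordered.Add(node);
        }

        foreach (var node in nodes)
            Visit(node);

        return ordered;
    }
}
```

Nodes with no dependency relationship between them could then be dispatched in parallel, with dependent nodes waiting on the results they need.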
---
## GraphQL Handler Implementation
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public sealed class GraphQLHandler : IRouteHandler
{
public string HandlerType => "GraphQL";
public int Priority => 100;
private readonly GraphQLHandlerConfig _config;
private readonly IGraphQLParser _parser;
private readonly IQueryPlanner _planner;
private readonly IQueryExecutor _executor;
private readonly ISchemaRegistry _schemaRegistry;
private readonly ILogger<GraphQLHandler> _logger;
public GraphQLHandler(
IOptions<GraphQLHandlerConfig> config,
IGraphQLParser parser,
IQueryPlanner planner,
IQueryExecutor executor,
ISchemaRegistry schemaRegistry,
ILogger<GraphQLHandler> logger)
{
_config = config.Value;
_parser = parser;
_planner = planner;
_executor = executor;
_schemaRegistry = schemaRegistry;
_logger = logger;
}
public bool CanHandle(RouteMatchResult match)
{
return match.Handler == "GraphQL" ||
match.Route.Path.StartsWith(_config.Path, StringComparison.OrdinalIgnoreCase);
}
public async Task<RouteHandlerResult> HandleAsync(
HttpContext context,
RouteMatchResult match,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
try
{
// Handle WebSocket upgrade for subscriptions
if (context.WebSockets.IsWebSocketRequest && _config.EnableSubscriptions)
{
return await HandleSubscriptionAsync(context, claims, cancellationToken);
}
// Parse GraphQL request
var request = await ParseRequestAsync(context, cancellationToken);
// Validate query
var validationResult = _parser.Validate(
request.Query,
_config.MaxQueryDepth,
_config.MaxQueryComplexity);
if (!validationResult.IsValid)
{
return CreateErrorResponse(validationResult.Errors);
}
// Parse and analyze query
var operation = _parser.Parse(request.Query, request.OperationName);
// Check if introspection
if (operation.IsIntrospection)
{
if (!_config.EnableIntrospection)
{
return CreateErrorResponse(new[] { "Introspection is disabled" });
}
return await HandleIntrospectionAsync(request, cancellationToken);
}
// Plan query execution
var plan = _planner.CreatePlan(operation, _config.Services);
_logger.LogDebug(
"Query plan created: {NodeCount} nodes for {OperationType}",
plan.Nodes.Count, plan.OperationType);
// Execute plan
var result = await _executor.ExecuteAsync(
plan,
request,
claims,
_config.ExecutionTimeout,
cancellationToken);
return CreateSuccessResponse(result);
}
catch (GraphQLParseException ex)
{
return CreateErrorResponse(new[] { ex.Message });
}
catch (Exception ex)
{
_logger.LogError(ex, "GraphQL execution error");
return CreateErrorResponse(new[] { "Internal server error" }, 500);
}
}
private async Task<GraphQLRequest> ParseRequestAsync(
HttpContext context,
CancellationToken cancellationToken)
{
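        if (HttpMethods.IsGet(context.Request.Method))
        {
            // StringValues.ToString() yields "" for a missing parameter; normalize that to null.
            var operationName = context.Request.Query["operationName"].ToString();
            return new GraphQLRequest
            {
                Query = context.Request.Query["query"].ToString(),
                OperationName = string.IsNullOrEmpty(operationName) ? null : operationName,
                Variables = ParseVariables(context.Request.Query["variables"].ToString())
            };
        }
        try
        {
            var body = await JsonSerializer.DeserializeAsync<GraphQLRequest>(
                context.Request.Body,
                cancellationToken: cancellationToken);
            return body ?? throw new GraphQLParseException("Invalid request body");
        }
        catch (JsonException)
        {
            // Malformed JSON is a client error; report it as a parse failure rather than a 500.
            throw new GraphQLParseException("Invalid request body");
        }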
}
private Dictionary<string, object?>? ParseVariables(string? json)
{
if (string.IsNullOrEmpty(json))
return null;
return JsonSerializer.Deserialize<Dictionary<string, object?>>(json);
}
private async Task<RouteHandlerResult> HandleIntrospectionAsync(
GraphQLRequest request,
CancellationToken cancellationToken)
{
var schema = await _schemaRegistry.GetMergedSchemaAsync(cancellationToken);
var result = await _executor.ExecuteIntrospectionAsync(schema, request, cancellationToken);
return CreateSuccessResponse(result);
}
private async Task<RouteHandlerResult> HandleSubscriptionAsync(
HttpContext context,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var webSocket = await context.WebSockets.AcceptWebSocketAsync("graphql-transport-ws");
await _executor.HandleSubscriptionAsync(webSocket, claims, cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 101 // Switching Protocols
};
}
private RouteHandlerResult CreateSuccessResponse(GraphQLResponse response)
{
return new RouteHandlerResult
{
Handled = true,
StatusCode = 200,
ContentType = "application/json",
Body = JsonSerializer.SerializeToUtf8Bytes(response)
};
}
private RouteHandlerResult CreateErrorResponse(IEnumerable<string> messages, int statusCode = 200)
{
var response = new GraphQLResponse
{
Errors = messages.Select(m => new GraphQLError { Message = m }).ToList()
};
return new RouteHandlerResult
{
Handled = true,
StatusCode = statusCode,
ContentType = "application/json",
Body = JsonSerializer.SerializeToUtf8Bytes(response)
};
}
}
```
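`ParseRequestAsync` relies on `System.Text.Json` to bind the request body. By default that serializer matches property names case-sensitively, so a standard camelCase payload (`query`, `operationName`, `variables`) binds to PascalCase properties only if the type declares `JsonPropertyName` attributes or the call passes web-style options. The following is a minimal sketch of the second approach. `SketchGraphQLRequest` and `GraphQLRequestSketch` are hypothetical names used for illustration; they are not the `GraphQLRequest` type defined in this step.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-in for the GraphQLRequest type used by the handler.
public sealed class SketchGraphQLRequest
{
    public string? Query { get; set; }
    public string? OperationName { get; set; }
    public Dictionary<string, JsonElement>? Variables { get; set; }
}

public static class GraphQLRequestSketch
{
    // JsonSerializerDefaults.Web turns on camelCase naming and
    // case-insensitive property matching. The options instance is cached
    // because creating one per call is comparatively expensive.
    private static readonly JsonSerializerOptions WebOptions =
        new JsonSerializerOptions(JsonSerializerDefaults.Web);

    // Returns null when the payload is the JSON literal null.
    public static SketchGraphQLRequest? Parse(string json) =>
        JsonSerializer.Deserialize<SketchGraphQLRequest>(json, WebOptions);
}
```

Calling `GraphQLRequestSketch.Parse` with a body such as `{"query":"{ me { id } }"}` populates `Query` and leaves `OperationName` and `Variables` as `null`.
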
---
## Query Planner
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public interface IQueryPlanner
{
QueryPlan CreatePlan(
ParsedOperation operation,
Dictionary<string, GraphQLServiceConfig> services);
}
public sealed class QueryPlanner : IQueryPlanner
{
private readonly ILogger<QueryPlanner> _logger;
public QueryPlanner(ILogger<QueryPlanner> logger)
{
_logger = logger;
}
public QueryPlan CreatePlan(
ParsedOperation operation,
Dictionary<string, GraphQLServiceConfig> services)
{
var plan = new QueryPlan
{
OperationType = operation.OperationType
};
// Group fields by owning service
var fieldsByService = new Dictionary<string, List<FieldSelection>>();
foreach (var field in operation.SelectionSet)
{
var service = FindOwningService(operation.OperationType, field.Name, services);
if (!fieldsByService.ContainsKey(service))
{
fieldsByService[service] = new List<FieldSelection>();
}
fieldsByService[service].Add(field);
}
// Create execution nodes
foreach (var (serviceName, fields) in fieldsByService)
{
var subQuery = BuildSubQuery(operation, fields);
plan.Nodes.Add(new QueryPlanNode
{
ServiceName = serviceName,
SubQuery = subQuery,
RequiredFields = fields.Select(f => f.Name).ToList()
});
}
// For mutations, nodes must execute sequentially
if (operation.OperationType == GraphQLOperationType.Mutation)
{
for (int i = 1; i < plan.Nodes.Count; i++)
{
plan.Nodes[i].DependsOn.Add(plan.Nodes[i - 1]);
}
}
return plan;
}
private string FindOwningService(
GraphQLOperationType opType,
string fieldName,
Dictionary<string, GraphQLServiceConfig> services)
{
var rootType = opType switch
{
GraphQLOperationType.Query => "Query",
GraphQLOperationType.Mutation => "Mutation",
GraphQLOperationType.Subscription => "Subscription",
_ => "Query"
};
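        // Pass 1: explicit field ownership takes precedence over any root-type fallback.
        foreach (var (name, config) in services)
        {
            if (config.OwnedFields.TryGetValue(rootType, out var fields) &&
                fields.Contains(fieldName))
            {
                return name;
            }
        }
        // Pass 2: fall back to the first service that serves the whole root type.
        foreach (var (name, config) in services)
        {
            if (config.RootTypes.Contains(rootType))
            {
                return name;
            }
        }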
throw new GraphQLExecutionException($"No service found for field: {rootType}.{fieldName}");
}
private string BuildSubQuery(ParsedOperation operation, List<FieldSelection> fields)
{
var sb = new StringBuilder();
        sb.Append(operation.OperationType.ToString().ToLowerInvariant());
if (!string.IsNullOrEmpty(operation.Name))
{
sb.Append(' ').Append(operation.Name);
}
if (operation.Variables.Count > 0)
{
sb.Append('(');
sb.Append(string.Join(", ", operation.Variables.Select(v => $"${v.Name}: {v.Type}")));
sb.Append(')');
}
sb.Append(" { ");
foreach (var field in fields)
{
AppendField(sb, field);
}
sb.Append(" }");
return sb.ToString();
}
private void AppendField(StringBuilder sb, FieldSelection field)
{
if (!string.IsNullOrEmpty(field.Alias))
{
sb.Append(field.Alias).Append(": ");
}
sb.Append(field.Name);
if (field.Arguments.Count > 0)
{
sb.Append('(');
sb.Append(string.Join(", ", field.Arguments.Select(a => $"{a.Key}: {FormatValue(a.Value)}")));
sb.Append(')');
}
if (field.SelectionSet.Count > 0)
{
sb.Append(" { ");
foreach (var subField in field.SelectionSet)
{
AppendField(sb, subField);
sb.Append(' ');
}
sb.Append('}');
}
sb.Append(' ');
}
private string FormatValue(object? value)
{
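        return value switch
        {
            null => "null",
            // Escape backslashes, quotes, and line breaks so the result stays a valid GraphQL string literal.
            string s => "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"")
                .Replace("\n", "\\n").Replace("\r", "\\r") + "\"",
            bool b => b ? "true" : "false",
            // Format numbers culture-independently (no decimal commas).
            IFormattable f => f.ToString(null, System.Globalization.CultureInfo.InvariantCulture),
            _ => value.ToString() ?? "null"
        };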
}
}
```
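The ownership lookup behind `FindOwningService` can be illustrated in isolation. The sketch below assumes that an explicit `OwnedFields` entry should win over a `RootTypes` fallback regardless of the order in which services are configured. `SketchServiceConfig` and `OwnershipSketch` are hypothetical, simplified stand-ins, not the types from this step.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for GraphQLServiceConfig.
public sealed class SketchServiceConfig
{
    public HashSet<string> RootTypes { get; } = new HashSet<string>();
    public Dictionary<string, HashSet<string>> OwnedFields { get; } =
        new Dictionary<string, HashSet<string>>();
}

public static class OwnershipSketch
{
    public static string FindOwner(
        string rootType,
        string fieldName,
        IReadOnlyDictionary<string, SketchServiceConfig> services)
    {
        // Pass 1: a service that lists the field explicitly owns it.
        foreach (var (name, config) in services)
        {
            if (config.OwnedFields.TryGetValue(rootType, out var fields) &&
                fields.Contains(fieldName))
            {
                return name;
            }
        }

        // Pass 2: otherwise the first service serving the root type takes it.
        foreach (var (name, config) in services)
        {
            if (config.RootTypes.Contains(rootType))
            {
                return name;
            }
        }

        throw new InvalidOperationException(
            $"No service found for field: {rootType}.{fieldName}");
    }
}
```

With a `users` service that serves the `Query` root type and a `billing` service that owns `Query.invoices`, this lookup routes `invoices` to `billing` and any unlisted query field to `users`.
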
---
## Query Executor
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public interface IQueryExecutor
{
Task<GraphQLResponse> ExecuteAsync(
QueryPlan plan,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
TimeSpan timeout,
CancellationToken cancellationToken);
Task<GraphQLResponse> ExecuteIntrospectionAsync(
GraphQLSchema schema,
GraphQLRequest request,
CancellationToken cancellationToken);
Task HandleSubscriptionAsync(
WebSocket webSocket,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken);
}
public sealed class QueryExecutor : IQueryExecutor
{
private readonly ITransportClientFactory _transportFactory;
private readonly IPayloadSerializer _serializer;
private readonly ILogger<QueryExecutor> _logger;
public QueryExecutor(
ITransportClientFactory transportFactory,
IPayloadSerializer serializer,
ILogger<QueryExecutor> logger)
{
_transportFactory = transportFactory;
_serializer = serializer;
_logger = logger;
}
public async Task<GraphQLResponse> ExecuteAsync(
QueryPlan plan,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
TimeSpan timeout,
CancellationToken cancellationToken)
{
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(timeout);
var results = new ConcurrentDictionary<string, object?>();
var errors = new ConcurrentBag<GraphQLError>();
// Execute nodes respecting dependencies
await ExecuteNodesAsync(plan.Nodes, request, claims, results, errors, cts.Token);
// Merge results
var data = MergeResults(plan.Nodes, results);
return new GraphQLResponse
{
Data = data,
Errors = errors.Any() ? errors.ToList() : null
};
}
private async Task ExecuteNodesAsync(
List<QueryPlanNode> nodes,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
ConcurrentDictionary<string, object?> results,
ConcurrentBag<GraphQLError> errors,
CancellationToken cancellationToken)
{
// Group nodes by dependency level
var executed = new HashSet<QueryPlanNode>();
while (executed.Count < nodes.Count)
{
var ready = nodes
.Where(n => !executed.Contains(n))
.Where(n => n.DependsOn.All(d => executed.Contains(d)))
.ToList();
if (ready.Count == 0)
{
throw new GraphQLExecutionException("Circular dependency in query plan");
}
// Execute ready nodes in parallel
await Parallel.ForEachAsync(ready, cancellationToken, async (node, ct) =>
{
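                try
                {
                    var result = await ExecuteNodeAsync(node, request, claims, ct);
                    MergeNodeResult(results, result);
                    // Surface errors reported by the downstream service instead of dropping them.
                    if (result.Errors is { Count: > 0 })
                    {
                        foreach (var error in result.Errors)
                        {
                            errors.Add(error);
                        }
                    }
                }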
catch (Exception ex)
{
_logger.LogError(ex, "Error executing node for service {Service}", node.ServiceName);
errors.Add(new GraphQLError
{
Message = $"Error from {node.ServiceName}: {ex.Message}",
Path = node.RequiredFields.Cast<object>().ToList()
});
}
});
foreach (var node in ready)
{
executed.Add(node);
}
}
}
private async Task<GraphQLResponse> ExecuteNodeAsync(
QueryPlanNode node,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var client = _transportFactory.GetClient(node.ServiceName);
var payload = new RequestPayload
{
Method = "POST",
Path = "/graphql",
Headers = new Dictionary<string, string>
{
["Content-Type"] = "application/json"
},
Claims = claims.ToDictionary(x => x.Key, x => x.Value),
Body = JsonSerializer.SerializeToUtf8Bytes(new
{
query = node.SubQuery,
variables = request.Variables,
operationName = request.OperationName
})
};
var response = await client.SendRequestAsync(
node.ServiceName,
payload,
TimeSpan.FromSeconds(30),
cancellationToken);
if (response.Body == null)
{
throw new GraphQLExecutionException($"Empty response from {node.ServiceName}");
}
return JsonSerializer.Deserialize<GraphQLResponse>(response.Body)
?? throw new GraphQLExecutionException($"Invalid response from {node.ServiceName}");
}
private void MergeNodeResult(ConcurrentDictionary<string, object?> results, GraphQLResponse response)
{
if (response.Data is JsonElement element && element.ValueKind == JsonValueKind.Object)
{
foreach (var property in element.EnumerateObject())
{
results[property.Name] = property.Value.Clone();
}
}
}
private object? MergeResults(List<QueryPlanNode> nodes, ConcurrentDictionary<string, object?> results)
{
return results.ToDictionary(x => x.Key, x => x.Value);
}
public Task<GraphQLResponse> ExecuteIntrospectionAsync(
GraphQLSchema schema,
GraphQLRequest request,
CancellationToken cancellationToken)
{
// Execute introspection against merged schema
var result = schema.ExecuteIntrospection(request);
return Task.FromResult(result);
}
public async Task HandleSubscriptionAsync(
WebSocket webSocket,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var buffer = new byte[4096];
try
{
while (webSocket.State == WebSocketState.Open && !cancellationToken.IsCancellationRequested)
{
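                // A message may arrive in several frames; accumulate until EndOfMessage.
                using var messageStream = new MemoryStream();
                WebSocketReceiveResult result;
                do
                {
                    result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), cancellationToken);
                    messageStream.Write(buffer, 0, result.Count);
                }
                while (!result.EndOfMessage && result.MessageType != WebSocketMessageType.Close);
                if (result.MessageType == WebSocketMessageType.Close)
                {
                    await webSocket.CloseAsync(
                        WebSocketCloseStatus.NormalClosure,
                        "Closed by client",
                        cancellationToken);
                    break;
                }
                var message = Encoding.UTF8.GetString(messageStream.GetBuffer(), 0, (int)messageStream.Length);
                await HandleSubscriptionMessageAsync(webSocket, message, claims, cancellationToken);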
}
}
catch (WebSocketException ex)
{
_logger.LogWarning(ex, "WebSocket error in subscription");
}
}
private async Task HandleSubscriptionMessageAsync(
WebSocket webSocket,
string message,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
// Implement graphql-transport-ws protocol
var msg = JsonSerializer.Deserialize<SubscriptionMessage>(message);
switch (msg?.Type)
{
case "connection_init":
await SendAsync(webSocket, new { type = "connection_ack" }, cancellationToken);
break;
case "subscribe":
// Start subscription
break;
case "complete":
// End subscription
break;
}
}
private async Task SendAsync(WebSocket webSocket, object message, CancellationToken cancellationToken)
{
var bytes = JsonSerializer.SerializeToUtf8Bytes(message);
await webSocket.SendAsync(bytes, WebSocketMessageType.Text, true, cancellationToken);
}
}
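internal class SubscriptionMessage
{
    // graphql-transport-ws uses lowercase member names, and System.Text.Json
    // matches property names case-sensitively by default.
    [System.Text.Json.Serialization.JsonPropertyName("type")]
    public string? Type { get; set; }
    [System.Text.Json.Serialization.JsonPropertyName("id")]
    public string? Id { get; set; }
    [System.Text.Json.Serialization.JsonPropertyName("payload")]
    public GraphQLRequest? Payload { get; set; }
}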
```
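`ExecuteNodesAsync` schedules plan nodes in dependency "waves": every node whose dependencies have all completed runs in parallel, then the next wave is computed. The scheduling logic alone can be sketched as follows. `SketchNode` and `WaveSketch` are hypothetical names, and the sketch only computes the execution order; it sends no requests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for QueryPlanNode.
public sealed class SketchNode
{
    public SketchNode(string name, params SketchNode[] dependsOn)
    {
        Name = name;
        DependsOn = dependsOn.ToList();
    }

    public string Name { get; }
    public List<SketchNode> DependsOn { get; }
}

public static class WaveSketch
{
    // Groups nodes into waves. All nodes within one wave are independent
    // of each other and could be executed in parallel.
    public static List<List<string>> ComputeWaves(IReadOnlyList<SketchNode> nodes)
    {
        var done = new HashSet<SketchNode>();
        var waves = new List<List<string>>();

        while (done.Count < nodes.Count)
        {
            var ready = nodes
                .Where(n => !done.Contains(n) && n.DependsOn.All(d => done.Contains(d)))
                .ToList();

            // No progress means a cycle (or a dependency outside the plan).
            if (ready.Count == 0)
            {
                throw new InvalidOperationException("Circular dependency in query plan");
            }

            waves.Add(ready.Select(n => n.Name).ToList());
            done.UnionWith(ready);
        }

        return waves;
    }
}
```

For a query plan with no dependencies this yields a single wave. For a mutation plan, where each node depends on the previous one, it yields one node per wave, which is what makes mutations execute sequentially.
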
---
## Schema Registry
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public interface ISchemaRegistry
{
Task<GraphQLSchema> GetMergedSchemaAsync(CancellationToken cancellationToken);
void InvalidateCache();
}
public sealed class SchemaRegistry : ISchemaRegistry
{
private readonly GraphQLHandlerConfig _config;
private readonly ITransportClientFactory _transportFactory;
private readonly ILogger<SchemaRegistry> _logger;
private GraphQLSchema? _cachedSchema;
private DateTimeOffset _cacheExpiry;
private readonly SemaphoreSlim _lock = new(1, 1);
public SchemaRegistry(
IOptions<GraphQLHandlerConfig> config,
ITransportClientFactory transportFactory,
ILogger<SchemaRegistry> logger)
{
_config = config.Value;
_transportFactory = transportFactory;
_logger = logger;
}
public async Task<GraphQLSchema> GetMergedSchemaAsync(CancellationToken cancellationToken)
{
if (_cachedSchema != null && DateTimeOffset.UtcNow < _cacheExpiry)
{
return _cachedSchema;
}
await _lock.WaitAsync(cancellationToken);
try
{
if (_cachedSchema != null && DateTimeOffset.UtcNow < _cacheExpiry)
{
return _cachedSchema;
}
var schemas = new List<string>();
foreach (var (name, config) in _config.Services)
{
if (config.IsSchemaProvider)
{
var schema = await FetchSchemaAsync(config.ServiceName, cancellationToken);
schemas.Add(schema);
}
}
_cachedSchema = MergeSchemas(schemas);
_cacheExpiry = DateTimeOffset.UtcNow.Add(_config.SchemaCacheDuration);
_logger.LogInformation("Schema cache refreshed, expires at {Expiry}", _cacheExpiry);
return _cachedSchema;
}
finally
{
_lock.Release();
}
}
private async Task<string> FetchSchemaAsync(string serviceName, CancellationToken cancellationToken)
{
var client = _transportFactory.GetClient(serviceName);
var introspectionQuery = @"
query IntrospectionQuery {
__schema {
types { ...FullType }
queryType { name }
mutationType { name }
subscriptionType { name }
}
}
fragment FullType on __Type {
kind name description
fields(includeDeprecated: true) {
name description
args { ...InputValue }
type { ...TypeRef }
isDeprecated deprecationReason
}
}
fragment InputValue on __InputValue { name description type { ...TypeRef } }
fragment TypeRef on __Type {
kind name
ofType { kind name ofType { kind name ofType { kind name } } }
}";
var payload = new RequestPayload
{
Method = "POST",
Path = "/graphql",
Headers = new Dictionary<string, string> { ["Content-Type"] = "application/json" },
Claims = new Dictionary<string, string>(),
Body = JsonSerializer.SerializeToUtf8Bytes(new { query = introspectionQuery })
};
var response = await client.SendRequestAsync(
serviceName,
payload,
TimeSpan.FromSeconds(30),
cancellationToken);
return Encoding.UTF8.GetString(response.Body ?? Array.Empty<byte>());
}
private GraphQLSchema MergeSchemas(List<string> schemas)
{
// Merge multiple introspection results into unified schema
return new GraphQLSchema(schemas);
}
public void InvalidateCache()
{
_cachedSchema = null;
_cacheExpiry = DateTimeOffset.MinValue;
}
}
```
---
## YAML Configuration
```yaml
GraphQL:
  Path: "/graphql"
  EnableIntrospection: true
  EnableSubscriptions: true
  MaxQueryDepth: 15
  MaxQueryComplexity: 1000
  ExecutionTimeout: "00:00:30"
  SchemaCacheDuration: "00:05:00"
  EnableBatching: true
  MaxBatchSize: 10
  Services:
    users:
      ServiceName: "user-service"
      RootTypes:
        - Query
        - Mutation
      OwnedFields:
        Query:
          - user
          - users
          - me
        Mutation:
          - createUser
          - updateUser
      IsSchemaProvider: true
    billing:
      ServiceName: "billing-service"
      OwnedFields:
        Query:
          - invoices
          - subscription
        Mutation:
          - createInvoice
      IsSchemaProvider: true
```
---
## Deliverables
1. `StellaOps.Router.Handlers.GraphQL/GraphQLHandler.cs`
2. `StellaOps.Router.Handlers.GraphQL/GraphQLHandlerConfig.cs`
3. `StellaOps.Router.Handlers.GraphQL/IGraphQLParser.cs`
4. `StellaOps.Router.Handlers.GraphQL/IQueryPlanner.cs`
5. `StellaOps.Router.Handlers.GraphQL/QueryPlanner.cs`
6. `StellaOps.Router.Handlers.GraphQL/IQueryExecutor.cs`
7. `StellaOps.Router.Handlers.GraphQL/QueryExecutor.cs`
8. `StellaOps.Router.Handlers.GraphQL/ISchemaRegistry.cs`
9. `StellaOps.Router.Handlers.GraphQL/SchemaRegistry.cs`
10. Unit tests for query planning
11. Integration tests for federated execution
12. Subscription handling tests
---
## Next Step
Proceed to [Step 17: S3/Storage Handler Implementation](17-Step.md) to implement the storage route handler.

903
docs/router/17-Step.md Normal file

@@ -0,0 +1,903 @@
# Step 17: S3/Storage Handler Implementation
**Phase 4: Handler Plugins**
**Estimated Complexity:** Medium
**Dependencies:** Step 10 (Microservice Handler)
---
## Overview
The S3/Storage handler routes file operations to object storage backends (S3, MinIO, Azure Blob, GCS). It handles presigned URL generation, multipart uploads, and streaming downloads, and it integrates with claim-based access control.
---
## Goals
1. Route file operations to appropriate storage backends
2. Generate presigned URLs for direct client uploads/downloads
3. Support multipart uploads for large files
4. Stream files without buffering them in the gateway
5. Enforce claim-based access control on storage operations
---
## Core Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                        Storage Handler                         │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  HTTP Request                                                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌─────────────────────┐                  │
│  │ Path Resolver │───►│ Bucket/Key Mapping  │                  │
│  └───────┬───────┘    └─────────────────────┘                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌─────────────────────┐                  │
│  │Access Control │───►│ Claim-Based Policy  │                  │
│  └───────┬───────┘    └─────────────────────┘                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────────────────────────────────┐                 │
│  │              Storage Backend              │                 │
│  │  ┌─────┐  ┌───────┐  ┌──────┐  ┌─────┐    │                 │
│  │  │ S3  │  │ MinIO │  │Azure │  │ GCS │    │                 │
│  │  └─────┘  └───────┘  └──────┘  └─────┘    │                 │
│  └───────────────────────────────────────────┘                 │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
---
## Configuration
```csharp
namespace StellaOps.Router.Handlers.Storage;
public class StorageHandlerConfig
{
/// <summary>Path prefix for storage routes.</summary>
public string PathPrefix { get; set; } = "/files";
/// <summary>Default storage backend.</summary>
public string DefaultBackend { get; set; } = "s3";
/// <summary>Maximum upload size (bytes).</summary>
public long MaxUploadSize { get; set; } = 5L * 1024 * 1024 * 1024; // 5GB
/// <summary>Multipart threshold (bytes).</summary>
public long MultipartThreshold { get; set; } = 100 * 1024 * 1024; // 100MB
/// <summary>Presigned URL expiration.</summary>
public TimeSpan PresignedUrlExpiration { get; set; } = TimeSpan.FromHours(1);
/// <summary>Whether to use presigned URLs for uploads.</summary>
public bool UsePresignedUploads { get; set; } = true;
/// <summary>Whether to use presigned URLs for downloads.</summary>
public bool UsePresignedDownloads { get; set; } = true;
/// <summary>Storage backends configuration.</summary>
public Dictionary<string, StorageBackendConfig> Backends { get; set; } = new();
/// <summary>Bucket mappings (path pattern to bucket).</summary>
public List<BucketMapping> BucketMappings { get; set; } = new();
}
public class StorageBackendConfig
{
public string Type { get; set; } = "S3"; // S3, Azure, GCS
public string Endpoint { get; set; } = "";
public string Region { get; set; } = "us-east-1";
public string AccessKey { get; set; } = "";
public string SecretKey { get; set; } = "";
public bool UsePathStyle { get; set; } = false;
public bool UseSsl { get; set; } = true;
}
public class BucketMapping
{
public string PathPattern { get; set; } = "";
public string Bucket { get; set; } = "";
public string? KeyPrefix { get; set; }
public string Backend { get; set; } = "default";
public StorageAccessPolicy Policy { get; set; } = new();
}
public class StorageAccessPolicy
{
public bool RequireAuthentication { get; set; } = true;
public List<string> AllowedClaims { get; set; } = new();
public string? OwnerClaimPath { get; set; }
public bool EnforceOwnership { get; set; } = false;
}
```
---
## Storage Handler Implementation
```csharp
namespace StellaOps.Router.Handlers.Storage;
public sealed class StorageHandler : IRouteHandler
{
public string HandlerType => "Storage";
public int Priority => 90;
private readonly StorageHandlerConfig _config;
private readonly IStorageBackendFactory _backendFactory;
private readonly IAccessControlEvaluator _accessControl;
private readonly ILogger<StorageHandler> _logger;
public StorageHandler(
IOptions<StorageHandlerConfig> config,
IStorageBackendFactory backendFactory,
IAccessControlEvaluator accessControl,
ILogger<StorageHandler> logger)
{
_config = config.Value;
_backendFactory = backendFactory;
_accessControl = accessControl;
_logger = logger;
}
public bool CanHandle(RouteMatchResult match)
{
return match.Handler == "Storage" ||
match.Route.Path.StartsWith(_config.PathPrefix, StringComparison.OrdinalIgnoreCase);
}
public async Task<RouteHandlerResult> HandleAsync(
HttpContext context,
RouteMatchResult match,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
try
{
// Resolve storage location
var location = ResolveLocation(context.Request.Path, context.Request.Query);
// Check access
var accessResult = _accessControl.Evaluate(location, claims, context.Request.Method);
if (!accessResult.Allowed)
{
return new RouteHandlerResult
{
Handled = true,
StatusCode = 403,
Body = Encoding.UTF8.GetBytes(accessResult.Reason ?? "Access denied")
};
}
// Get backend
var backend = _backendFactory.GetBackend(location.Backend);
            return context.Request.Method.ToUpperInvariant() switch
{
"GET" => await HandleGetAsync(context, backend, location, cancellationToken),
"HEAD" => await HandleHeadAsync(context, backend, location, cancellationToken),
"PUT" => await HandlePutAsync(context, backend, location, claims, cancellationToken),
"POST" => await HandlePostAsync(context, backend, location, claims, cancellationToken),
"DELETE" => await HandleDeleteAsync(context, backend, location, cancellationToken),
_ => new RouteHandlerResult { Handled = true, StatusCode = 405 }
};
}
catch (StorageNotFoundException)
{
return new RouteHandlerResult { Handled = true, StatusCode = 404 };
}
catch (Exception ex)
{
_logger.LogError(ex, "Storage operation error");
return new RouteHandlerResult
{
Handled = true,
StatusCode = 500,
Body = Encoding.UTF8.GetBytes("Storage operation failed")
};
}
}
private StorageLocation ResolveLocation(PathString path, IQueryCollection query)
{
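        // Strip the configured prefix only when it is present; Substring would throw on shorter paths.
        var rawPath = path.Value ?? "";
        var relativePath = rawPath.StartsWith(_config.PathPrefix, StringComparison.OrdinalIgnoreCase)
            ? rawPath.Substring(_config.PathPrefix.Length).TrimStart('/')
            : rawPath.TrimStart('/');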
foreach (var mapping in _config.BucketMappings)
{
if (IsMatch(relativePath, mapping.PathPattern))
{
var key = ExtractKey(relativePath, mapping);
return new StorageLocation
{
Backend = mapping.Backend,
Bucket = mapping.Bucket,
Key = key,
Policy = mapping.Policy
};
}
}
// Default: first segment is bucket, rest is key
var segments = relativePath.Split('/', 2);
return new StorageLocation
{
Backend = _config.DefaultBackend,
Bucket = segments[0],
Key = segments.Length > 1 ? segments[1] : ""
};
}
private bool IsMatch(string path, string pattern)
{
var regex = new Regex("^" + Regex.Escape(pattern).Replace("\\*", ".*") + "$");
return regex.IsMatch(path);
}
private string ExtractKey(string path, BucketMapping mapping)
{
var key = path;
if (!string.IsNullOrEmpty(mapping.KeyPrefix))
{
key = mapping.KeyPrefix.TrimEnd('/') + "/" + key;
}
return key;
}
private async Task<RouteHandlerResult> HandleGetAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
// Check for presigned download
if (_config.UsePresignedDownloads && !IsRangeRequest(context.Request))
{
var presignedUrl = await backend.GetPresignedDownloadUrlAsync(
location.Bucket,
location.Key,
_config.PresignedUrlExpiration,
cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 307, // Temporary Redirect
Headers = new Dictionary<string, string>
{
["Location"] = presignedUrl,
["Cache-Control"] = "no-store"
}
};
}
// Stream directly
var metadata = await backend.GetObjectMetadataAsync(location.Bucket, location.Key, cancellationToken);
        // Dispose the backend stream once the response has been written.
        await using var stream = await backend.GetObjectStreamAsync(location.Bucket, location.Key, cancellationToken);
context.Response.StatusCode = 200;
context.Response.ContentType = metadata.ContentType;
context.Response.ContentLength = metadata.ContentLength;
if (!string.IsNullOrEmpty(metadata.ETag))
{
context.Response.Headers["ETag"] = metadata.ETag;
}
await stream.CopyToAsync(context.Response.Body, cancellationToken);
return new RouteHandlerResult { Handled = true, StatusCode = 200 };
}
private bool IsRangeRequest(HttpRequest request)
{
return request.Headers.ContainsKey("Range");
}
private async Task<RouteHandlerResult> HandleHeadAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
var metadata = await backend.GetObjectMetadataAsync(location.Bucket, location.Key, cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 200,
Headers = new Dictionary<string, string>
{
["Content-Type"] = metadata.ContentType,
["Content-Length"] = metadata.ContentLength.ToString(),
["ETag"] = metadata.ETag ?? "",
["Last-Modified"] = metadata.LastModified.ToString("R")
}
};
}
private async Task<RouteHandlerResult> HandlePutAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var contentLength = context.Request.ContentLength ?? 0;
// Validate size
if (contentLength > _config.MaxUploadSize)
{
return new RouteHandlerResult
{
Handled = true,
StatusCode = 413,
Body = Encoding.UTF8.GetBytes($"File too large. Max size: {_config.MaxUploadSize}")
};
}
        // Large files: start a multipart upload and return presigned part URLs to the client
if (_config.UsePresignedUploads && contentLength > _config.MultipartThreshold)
{
var uploadInfo = await backend.InitiateMultipartUploadAsync(
location.Bucket,
location.Key,
context.Request.ContentType ?? "application/octet-stream",
cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 200,
ContentType = "application/json",
Body = JsonSerializer.SerializeToUtf8Bytes(new
{
uploadId = uploadInfo.UploadId,
parts = uploadInfo.PresignedPartUrls
})
};
}
// Direct upload
var contentType = context.Request.ContentType ?? "application/octet-stream";
var metadata = new Dictionary<string, string>();
// Add owner metadata if enforced
if (location.Policy?.EnforceOwnership == true && location.Policy.OwnerClaimPath != null)
{
if (claims.TryGetValue(location.Policy.OwnerClaimPath, out var owner))
{
metadata["x-owner"] = owner;
}
}
await backend.PutObjectAsync(
location.Bucket,
location.Key,
context.Request.Body,
contentLength,
contentType,
metadata,
cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 201,
Headers = new Dictionary<string, string>
{
["Location"] = $"{_config.PathPrefix}/{location.Bucket}/{location.Key}"
}
};
}
private async Task<RouteHandlerResult> HandlePostAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var action = context.Request.Query["action"].ToString();
return action switch
{
"presign" => await HandlePresignRequestAsync(context, backend, location, cancellationToken),
"complete" => await HandleCompleteMultipartAsync(context, backend, location, cancellationToken),
"abort" => await HandleAbortMultipartAsync(context, backend, location, cancellationToken),
_ => await HandlePutAsync(context, backend, location, claims, cancellationToken)
};
}
private async Task<RouteHandlerResult> HandlePresignRequestAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
        var method = context.Request.Query["method"].ToString().ToUpperInvariant();
var expiration = _config.PresignedUrlExpiration;
string presignedUrl;
if (method == "PUT")
{
var contentType = context.Request.Query["contentType"].ToString();
presignedUrl = await backend.GetPresignedUploadUrlAsync(
location.Bucket,
location.Key,
contentType,
expiration,
cancellationToken);
}
else
{
presignedUrl = await backend.GetPresignedDownloadUrlAsync(
location.Bucket,
location.Key,
expiration,
cancellationToken);
}
return new RouteHandlerResult
{
Handled = true,
StatusCode = 200,
ContentType = "application/json",
Body = JsonSerializer.SerializeToUtf8Bytes(new
{
url = presignedUrl,
expiresAt = DateTimeOffset.UtcNow.Add(expiration)
})
};
}
private async Task<RouteHandlerResult> HandleCompleteMultipartAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
var body = await JsonSerializer.DeserializeAsync<CompleteMultipartRequest>(
context.Request.Body,
cancellationToken: cancellationToken);
if (body == null)
{
return new RouteHandlerResult { Handled = true, StatusCode = 400 };
}
await backend.CompleteMultipartUploadAsync(
location.Bucket,
location.Key,
body.UploadId,
body.Parts,
cancellationToken);
return new RouteHandlerResult { Handled = true, StatusCode = 200 };
}
private async Task<RouteHandlerResult> HandleAbortMultipartAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
var uploadId = context.Request.Query["uploadId"].ToString();
await backend.AbortMultipartUploadAsync(
location.Bucket,
location.Key,
uploadId,
cancellationToken);
return new RouteHandlerResult { Handled = true, StatusCode = 204 };
}
private async Task<RouteHandlerResult> HandleDeleteAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
await backend.DeleteObjectAsync(location.Bucket, location.Key, cancellationToken);
return new RouteHandlerResult { Handled = true, StatusCode = 204 };
}
}
internal class CompleteMultipartRequest
{
public string UploadId { get; set; } = "";
public List<UploadPart> Parts { get; set; } = new();
}
internal class StorageLocation
{
public string Backend { get; set; } = "";
public string Bucket { get; set; } = "";
public string Key { get; set; } = "";
public StorageAccessPolicy? Policy { get; set; }
}
```
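The bucket-mapping logic in `IsMatch` and `ExtractKey` can be tried out on its own. The sketch below mirrors that logic under hypothetical names (`PatternSketch`, `BuildKey`). Two properties are worth noticing: `*` is translated to `.*`, so it also matches across `/` separators, and the object key keeps the full relative path, with the optional prefix prepended.

```csharp
using System.Text.RegularExpressions;

public static class PatternSketch
{
    // Translates a wildcard pattern into an anchored regular expression.
    // Regex.Escape turns '*' into "\*", which is then replaced by ".*".
    public static bool IsMatch(string path, string pattern)
    {
        var expression = "^" + Regex.Escape(pattern).Replace("\\*", ".*") + "$";
        return Regex.IsMatch(path, expression, RegexOptions.CultureInvariant);
    }

    // Prepends the optional key prefix, normalizing the separator between them.
    public static string BuildKey(string relativePath, string? keyPrefix)
    {
        return string.IsNullOrEmpty(keyPrefix)
            ? relativePath
            : keyPrefix.TrimEnd('/') + "/" + relativePath;
    }
}
```

For example, the pattern `avatars/*` matches `avatars/u1/pic.png` but not `docs/readme.md`, and a prefix of `tenant-a/` turns the first path into the key `tenant-a/avatars/u1/pic.png`.
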
---
## Storage Backend Interface
```csharp
namespace StellaOps.Router.Handlers.Storage;
public interface IStorageBackend
{
Task<ObjectMetadata> GetObjectMetadataAsync(
string bucket, string key, CancellationToken cancellationToken);
Task<Stream> GetObjectStreamAsync(
string bucket, string key, CancellationToken cancellationToken);
Task PutObjectAsync(
string bucket, string key, Stream content, long contentLength,
string contentType, Dictionary<string, string>? metadata,
CancellationToken cancellationToken);
Task DeleteObjectAsync(
string bucket, string key, CancellationToken cancellationToken);
Task<string> GetPresignedDownloadUrlAsync(
string bucket, string key, TimeSpan expiration,
CancellationToken cancellationToken);
Task<string> GetPresignedUploadUrlAsync(
string bucket, string key, string contentType, TimeSpan expiration,
CancellationToken cancellationToken);
Task<MultipartUploadInfo> InitiateMultipartUploadAsync(
string bucket, string key, string contentType,
CancellationToken cancellationToken);
Task CompleteMultipartUploadAsync(
string bucket, string key, string uploadId, List<UploadPart> parts,
CancellationToken cancellationToken);
Task AbortMultipartUploadAsync(
string bucket, string key, string uploadId,
CancellationToken cancellationToken);
}
public class ObjectMetadata
{
public string ContentType { get; set; } = "application/octet-stream";
public long ContentLength { get; set; }
public string? ETag { get; set; }
public DateTimeOffset LastModified { get; set; }
public Dictionary<string, string> CustomMetadata { get; set; } = new();
}
public class MultipartUploadInfo
{
public string UploadId { get; set; } = "";
public List<PresignedPartUrl> PresignedPartUrls { get; set; } = new();
}
public class PresignedPartUrl
{
public int PartNumber { get; set; }
public string Url { get; set; } = "";
}
public class UploadPart
{
public int PartNumber { get; set; }
public string ETag { get; set; } = "";
}
```
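`InitiateMultipartUploadAsync` returns one presigned URL per part, so the backend has to decide how an upload is split. The interface leaves that open. The sketch below shows one plausible way to derive part boundaries from a total size and a fixed part size; `MultipartSketch`, `PartRange`, and `PlanParts` are hypothetical names, and the part size is an assumed input rather than a setting defined in this step.

```csharp
using System;
using System.Collections.Generic;

public static class MultipartSketch
{
    // One contiguous byte range of the upload. Part numbers start at 1,
    // matching the numbering S3 uses for multipart uploads.
    public readonly record struct PartRange(int PartNumber, long Offset, long Length);

    public static List<PartRange> PlanParts(long totalSize, long partSize)
    {
        if (totalSize <= 0) throw new ArgumentOutOfRangeException(nameof(totalSize));
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        var parts = new List<PartRange>();
        long offset = 0;
        var partNumber = 1;

        while (offset < totalSize)
        {
            // The last part may be shorter than the others.
            var length = Math.Min(partSize, totalSize - offset);
            parts.Add(new PartRange(partNumber, offset, length));
            partNumber++;
            offset += length;
        }

        return parts;
    }
}
```

When targeting S3, the chosen part size also has to respect the service limits: every part except the last must be at least 5 MiB, and an upload may have at most 10,000 parts.
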
---
## S3 Backend Implementation
```csharp
namespace StellaOps.Router.Handlers.Storage;
public sealed class S3StorageBackend : IStorageBackend
{
private readonly IAmazonS3 _client;
private readonly ILogger<S3StorageBackend> _logger;
public S3StorageBackend(IAmazonS3 client, ILogger<S3StorageBackend> logger)
{
_client = client;
_logger = logger;
}
public async Task<ObjectMetadata> GetObjectMetadataAsync(
string bucket, string key, CancellationToken cancellationToken)
{
var response = await _client.GetObjectMetadataAsync(bucket, key, cancellationToken);
return new ObjectMetadata
{
ContentType = response.Headers.ContentType,
ContentLength = response.ContentLength,
ETag = response.ETag,
LastModified = response.LastModified,
CustomMetadata = response.Metadata.Keys
.ToDictionary(k => k, k => response.Metadata[k])
};
}
public async Task<Stream> GetObjectStreamAsync(
string bucket, string key, CancellationToken cancellationToken)
{
var response = await _client.GetObjectAsync(bucket, key, cancellationToken);
return response.ResponseStream;
}
public async Task PutObjectAsync(
string bucket, string key, Stream content, long contentLength,
string contentType, Dictionary<string, string>? metadata,
CancellationToken cancellationToken)
{
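var request = new PutObjectRequest
{
BucketName = bucket,
Key = key,
InputStream = content,
ContentType = contentType
};
// Pass the declared length through so non-seekable streams can be uploaded.
request.Headers.ContentLength = contentLength;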
if (metadata != null)
{
foreach (var (k, v) in metadata)
{
request.Metadata.Add(k, v);
}
}
await _client.PutObjectAsync(request, cancellationToken);
}
public async Task DeleteObjectAsync(
string bucket, string key, CancellationToken cancellationToken)
{
await _client.DeleteObjectAsync(bucket, key, cancellationToken);
}
public Task<string> GetPresignedDownloadUrlAsync(
string bucket, string key, TimeSpan expiration,
CancellationToken cancellationToken)
{
var request = new GetPreSignedUrlRequest
{
BucketName = bucket,
Key = key,
Expires = DateTime.UtcNow.Add(expiration),
Verb = HttpVerb.GET
};
var url = _client.GetPreSignedURL(request);
return Task.FromResult(url);
}
public Task<string> GetPresignedUploadUrlAsync(
string bucket, string key, string contentType, TimeSpan expiration,
CancellationToken cancellationToken)
{
var request = new GetPreSignedUrlRequest
{
BucketName = bucket,
Key = key,
Expires = DateTime.UtcNow.Add(expiration),
Verb = HttpVerb.PUT,
ContentType = contentType
};
var url = _client.GetPreSignedURL(request);
return Task.FromResult(url);
}
public async Task<MultipartUploadInfo> InitiateMultipartUploadAsync(
string bucket, string key, string contentType,
CancellationToken cancellationToken)
{
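// Use the request object so the caller's content type is stored with the object.
var initResponse = await _client.InitiateMultipartUploadAsync(
new InitiateMultipartUploadRequest
{
BucketName = bucket,
Key = key,
ContentType = contentType
},
cancellationToken);
// Pre-generate presigned URLs for a fixed 50 parts. At 100 MiB per part this
// covers uploads up to 5000 MiB, slightly below the 5 GiB MaxUploadSize used
// in the sample configuration.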
var partUrls = new List<PresignedPartUrl>();
for (int i = 1; i <= 50; i++)
{
var url = _client.GetPreSignedURL(new GetPreSignedUrlRequest
{
BucketName = bucket,
Key = key,
Expires = DateTime.UtcNow.AddHours(24),
Verb = HttpVerb.PUT,
UploadId = initResponse.UploadId,
PartNumber = i
});
partUrls.Add(new PresignedPartUrl { PartNumber = i, Url = url });
}
return new MultipartUploadInfo
{
UploadId = initResponse.UploadId,
PresignedPartUrls = partUrls
};
}
public async Task CompleteMultipartUploadAsync(
string bucket, string key, string uploadId, List<UploadPart> parts,
CancellationToken cancellationToken)
{
var request = new CompleteMultipartUploadRequest
{
BucketName = bucket,
Key = key,
UploadId = uploadId,
PartETags = parts.Select(p => new PartETag(p.PartNumber, p.ETag)).ToList()
};
await _client.CompleteMultipartUploadAsync(request, cancellationToken);
}
public async Task AbortMultipartUploadAsync(
string bucket, string key, string uploadId,
CancellationToken cancellationToken)
{
await _client.AbortMultipartUploadAsync(bucket, key, uploadId, cancellationToken);
}
}
```
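
The fixed loop of 50 presigned part URLs above implicitly caps the upload size at 50 times the part size. The following is a minimal sketch of the ceiling-division arithmetic, assuming 100 MiB parts as stated in the code comment. `MultipartSizingSketch` and `PartCount` are hypothetical names used for illustration and are not part of the backend.

```csharp
using System;

public static class MultipartSizingSketch
{
    // Number of parts needed to upload contentLength bytes in chunks of partSize bytes.
    public static int PartCount(long contentLength, long partSize)
    {
        if (contentLength < 0) throw new ArgumentOutOfRangeException(nameof(contentLength));
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        // Ceiling division without floating point.
        return (int)((contentLength + partSize - 1) / partSize);
    }
}
```

For the sample values in this step, `PartCount(5368709120, 104857600)` returns 52, two more than the 50 URLs generated above. If uploads up to the full `MaxUploadSize` are expected, either the part size or the number of presigned URLs would need to scale with the declared content length.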
---
## Access Control Evaluator
```csharp
namespace StellaOps.Router.Handlers.Storage;
public interface IAccessControlEvaluator
{
AccessResult Evaluate(
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
string httpMethod);
}
public class AccessResult
{
public bool Allowed { get; set; }
public string? Reason { get; set; }
}
public sealed class ClaimBasedAccessControlEvaluator : IAccessControlEvaluator
{
public AccessResult Evaluate(
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
string httpMethod)
{
var policy = location.Policy ?? new StorageAccessPolicy();
// Check authentication requirement
if (policy.RequireAuthentication && !claims.Any())
{
return new AccessResult { Allowed = false, Reason = "Authentication required" };
}
// Check allowed claims
if (policy.AllowedClaims.Any())
{
var hasRequiredClaim = policy.AllowedClaims.Any(c =>
{
var parts = c.Split('=', 2);
if (parts.Length == 2)
{
return claims.TryGetValue(parts[0], out var value) && value == parts[1];
}
return claims.ContainsKey(c);
});
if (!hasRequiredClaim)
{
return new AccessResult { Allowed = false, Reason = "Required claim not present" };
}
}
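// Check ownership for write operations. This only verifies that the owner claim
// is present; the evaluator does not compare it against the object's owner.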
if (policy.EnforceOwnership && IsWriteOperation(httpMethod))
{
if (string.IsNullOrEmpty(policy.OwnerClaimPath))
{
return new AccessResult { Allowed = false, Reason = "Owner claim path not configured" };
}
if (!claims.ContainsKey(policy.OwnerClaimPath))
{
return new AccessResult { Allowed = false, Reason = "Owner claim required" };
}
}
return new AccessResult { Allowed = true };
}
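private static bool IsWriteOperation(string method)
{
// Invariant casing avoids culture-specific results when normalizing the HTTP method.
return method.ToUpperInvariant() is "PUT" or "POST" or "DELETE" or "PATCH";
}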
}
```
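
The `AllowedClaims` entries accept two forms: `name=value` requires an exact value match, while a bare `name` only requires that the claim exists. The sketch below isolates that rule so it can be tried on its own. `ClaimRuleSketch` is a hypothetical helper name; the logic mirrors the lambda in `ClaimBasedAccessControlEvaluator`.

```csharp
using System;
using System.Collections.Generic;

public static class ClaimRuleSketch
{
    // "name=value" -> the claim must exist with exactly that value.
    // "name"       -> the claim only has to be present.
    public static bool Matches(string rule, IReadOnlyDictionary<string, string> claims)
    {
        var parts = rule.Split('=', 2);
        return parts.Length == 2
            ? claims.TryGetValue(parts[0], out var value) && value == parts[1]
            : claims.ContainsKey(rule);
    }
}
```

With claims `{ sub = "user-42", role = "admin" }`, the rule `role=admin` matches, `role=viewer` does not, and the bare rule `sub` matches because only presence is checked. Note that the comparison is case-sensitive on both the claim name and the value.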
---
## YAML Configuration
```yaml
Storage:
  PathPrefix: "/files"
  DefaultBackend: "s3"
  MaxUploadSize: 5368709120 # 5GB
  MultipartThreshold: 104857600 # 100MB
  PresignedUrlExpiration: "01:00:00"
  UsePresignedUploads: true
  UsePresignedDownloads: true
  Backends:
    s3:
      Type: "S3"
      Endpoint: "https://s3.amazonaws.com"
      Region: "us-east-1"
      AccessKey: "${AWS_ACCESS_KEY}"
      SecretKey: "${AWS_SECRET_KEY}"
    minio:
      Type: "S3"
      Endpoint: "https://minio.internal:9000"
      Region: "us-east-1"
      AccessKey: "${MINIO_ACCESS_KEY}"
      SecretKey: "${MINIO_SECRET_KEY}"
      UsePathStyle: true
  BucketMappings:
    - PathPattern: "uploads/*"
      Bucket: "user-uploads"
      KeyPrefix: "files/"
      Backend: "s3"
      Policy:
        RequireAuthentication: true
        EnforceOwnership: true
        OwnerClaimPath: "sub"
    - PathPattern: "public/*"
      Bucket: "public-assets"
      Backend: "s3"
      Policy:
        RequireAuthentication: false
```
---
## Deliverables
1. `StellaOps.Router.Handlers.Storage/StorageHandler.cs`
2. `StellaOps.Router.Handlers.Storage/StorageHandlerConfig.cs`
3. `StellaOps.Router.Handlers.Storage/IStorageBackend.cs`
4. `StellaOps.Router.Handlers.Storage/S3StorageBackend.cs`
5. `StellaOps.Router.Handlers.Storage/IAccessControlEvaluator.cs`
6. `StellaOps.Router.Handlers.Storage/ClaimBasedAccessControlEvaluator.cs`
7. `StellaOps.Router.Handlers.Storage/StorageBackendFactory.cs`
8. Presigned URL generation tests
9. Multipart upload tests
10. Access control tests
---
## Next Step
Proceed to [Step 18: Reverse Proxy Handler Implementation](18-Step.md) to implement direct reverse proxy routing.

890
docs/router/18-Step.md Normal file

@@ -0,0 +1,890 @@
# Step 18: Reverse Proxy Handler Implementation
**Phase 4: Handler Plugins**
**Estimated Complexity:** Medium
**Dependencies:** Step 10 (Microservice Handler)
---
## Overview
The Reverse Proxy handler forwards requests to external HTTP services without using the internal transport protocol. It's used for legacy services, third-party APIs, and services that can't be modified to use the Stella transport layer.
---
## Goals
1. Forward HTTP requests to configurable upstream servers
2. Support connection pooling and HTTP/2 multiplexing
3. Handle request/response transformation
4. Support health checks and circuit breaking
5. Maintain correlation IDs for tracing
---
## Core Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                     Reverse Proxy Handler                      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Incoming Request                                              │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌──────────────────────┐                 │
│  │ Path Rewriter │───►│  URL Transformation  │                 │
│  └───────┬───────┘    └──────────────────────┘                 │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌──────────────────────┐                 │
│  │ Header Filter │───►│  Add/Remove Headers  │                 │
│  └───────┬───────┘    └──────────────────────┘                 │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌──────────────────────┐                 │
│  │ Load Balancer │───►│ Round Robin/Weighted │                 │
│  └───────┬───────┘    └──────────────────────┘                 │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────────────────────────────────┐                 │
│  │              HttpClient Pool              │                 │
│  │   (Connection pooling, HTTP/2, retries)   │                 │
│  └───────────────────────────────────────────┘                 │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
---
## Configuration
```csharp
namespace StellaOps.Router.Handlers.ReverseProxy;
public class ReverseProxyConfig
{
/// <summary>Upstream definitions by name.</summary>
public Dictionary<string, UpstreamConfig> Upstreams { get; set; } = new();
/// <summary>Route-to-upstream mappings.</summary>
public List<ProxyRoute> Routes { get; set; } = new();
/// <summary>Default timeout for upstream requests.</summary>
public TimeSpan DefaultTimeout { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>Whether to forward X-Forwarded-* headers.</summary>
public bool AddForwardedHeaders { get; set; } = true;
/// <summary>Whether to preserve host header.</summary>
public bool PreserveHost { get; set; } = false;
/// <summary>Connection pool settings.</summary>
public ConnectionPoolConfig ConnectionPool { get; set; } = new();
}
public class UpstreamConfig
{
/// <summary>Upstream server addresses.</summary>
public List<UpstreamServer> Servers { get; set; } = new();
/// <summary>Load balancing strategy.</summary>
public LoadBalanceStrategy LoadBalance { get; set; } = LoadBalanceStrategy.RoundRobin;
/// <summary>Health check configuration.</summary>
public HealthCheckConfig? HealthCheck { get; set; }
/// <summary>Circuit breaker configuration.</summary>
public CircuitBreakerConfig? CircuitBreaker { get; set; }
/// <summary>Retry configuration.</summary>
public RetryConfig? Retry { get; set; }
}
public class UpstreamServer
{
public string Address { get; set; } = "";
public int Weight { get; set; } = 1;
public bool Backup { get; set; } = false;
}
public class ProxyRoute
{
/// <summary>Path pattern to match.</summary>
public string PathPattern { get; set; } = "";
/// <summary>Target upstream name.</summary>
public string Upstream { get; set; } = "";
/// <summary>Path rewrite rule.</summary>
public PathRewriteRule? Rewrite { get; set; }
/// <summary>Header transformations.</summary>
public HeaderTransformConfig? Headers { get; set; }
/// <summary>Timeout override.</summary>
public TimeSpan? Timeout { get; set; }
/// <summary>Required claims for access.</summary>
public List<string>? RequiredClaims { get; set; }
}
public class PathRewriteRule
{
public string Pattern { get; set; } = "";
public string Replacement { get; set; } = "";
}
public class HeaderTransformConfig
{
public Dictionary<string, string> Add { get; set; } = new();
public List<string> Remove { get; set; } = new();
public Dictionary<string, string> Set { get; set; } = new();
public bool ForwardClaims { get; set; } = false;
public string ClaimsHeaderPrefix { get; set; } = "X-Claim-";
}
public class HealthCheckConfig
{
public string Path { get; set; } = "/health";
public TimeSpan Interval { get; set; } = TimeSpan.FromSeconds(10);
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(5);
public int UnhealthyThreshold { get; set; } = 3;
public int HealthyThreshold { get; set; } = 2;
}
public class CircuitBreakerConfig
{
public int FailureThreshold { get; set; } = 5;
public TimeSpan SamplingDuration { get; set; } = TimeSpan.FromSeconds(30);
public TimeSpan BreakDuration { get; set; } = TimeSpan.FromSeconds(30);
public double FailureRatioThreshold { get; set; } = 0.5;
}
public class RetryConfig
{
public int MaxRetries { get; set; } = 3;
public TimeSpan InitialDelay { get; set; } = TimeSpan.FromMilliseconds(100);
public double BackoffMultiplier { get; set; } = 2.0;
public List<int> RetryableStatusCodes { get; set; } = new() { 502, 503, 504 };
}
public class ConnectionPoolConfig
{
public int MaxConnectionsPerServer { get; set; } = 100;
public TimeSpan ConnectionIdleTimeout { get; set; } = TimeSpan.FromMinutes(2);
public bool EnableHttp2 { get; set; } = true;
}
public enum LoadBalanceStrategy
{
RoundRobin,
Random,
LeastConnections,
WeightedRoundRobin,
IPHash
}
```
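
`RetryConfig` describes an exponential backoff, but the handler code in this step does not show how the values are consumed. One plausible reading, sketched here as an assumption, is that the delay before retry n is `InitialDelay × BackoffMultiplier^(n-1)`. `RetryBackoffSketch` and `Delays` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class RetryBackoffSketch
{
    // Returns the wait time before each retry, in order.
    public static IReadOnlyList<TimeSpan> Delays(
        int maxRetries, TimeSpan initialDelay, double backoffMultiplier)
    {
        var delays = new List<TimeSpan>();
        for (var attempt = 0; attempt < maxRetries; attempt++)
        {
            // attempt 0 waits initialDelay, attempt 1 waits initialDelay * multiplier, and so on.
            var millis = initialDelay.TotalMilliseconds * Math.Pow(backoffMultiplier, attempt);
            delays.Add(TimeSpan.FromMilliseconds(millis));
        }
        return delays;
    }
}
```

With the defaults above (`MaxRetries = 3`, `InitialDelay = 100 ms`, `BackoffMultiplier = 2.0`) the schedule is 100 ms, 200 ms, 400 ms, so a request that exhausts all retries adds about 700 ms of waiting on top of the upstream timeouts.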
---
## Reverse Proxy Handler Implementation
```csharp
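using System.Net.Http.Headers;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
namespace StellaOps.Router.Handlers.ReverseProxy;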
public sealed class ReverseProxyHandler : IRouteHandler
{
public string HandlerType => "ReverseProxy";
public int Priority => 50;
private readonly ReverseProxyConfig _config;
private readonly IUpstreamManager _upstreamManager;
private readonly IHttpClientFactory _httpClientFactory;
private readonly ILogger<ReverseProxyHandler> _logger;
public ReverseProxyHandler(
IOptions<ReverseProxyConfig> config,
IUpstreamManager upstreamManager,
IHttpClientFactory httpClientFactory,
ILogger<ReverseProxyHandler> logger)
{
_config = config.Value;
_upstreamManager = upstreamManager;
_httpClientFactory = httpClientFactory;
_logger = logger;
}
public bool CanHandle(RouteMatchResult match)
{
if (match.Handler == "ReverseProxy")
return true;
return _config.Routes.Any(r => IsRouteMatch(match.Route.Path, r.PathPattern));
}
public async Task<RouteHandlerResult> HandleAsync(
HttpContext context,
RouteMatchResult match,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
// Find matching route
var route = _config.Routes.FirstOrDefault(r =>
IsRouteMatch(context.Request.Path, r.PathPattern));
if (route == null)
{
return new RouteHandlerResult { Handled = false };
}
// Check required claims
if (route.RequiredClaims?.Any() == true)
{
if (!route.RequiredClaims.All(c => claims.ContainsKey(c)))
{
return new RouteHandlerResult
{
Handled = true,
StatusCode = 403,
Body = Encoding.UTF8.GetBytes("Forbidden")
};
}
}
// Get upstream server
var server = await _upstreamManager.GetServerAsync(route.Upstream, context, cancellationToken);
if (server == null)
{
_logger.LogWarning("No healthy upstream for {Upstream}", route.Upstream);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 503,
Body = Encoding.UTF8.GetBytes("Service unavailable")
};
}
try
{
return await ForwardRequestAsync(context, route, server, claims, cancellationToken);
}
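// Skip client aborts: a cancelled caller should not count as an upstream failure.
catch (Exception ex) when (!cancellationToken.IsCancellationRequested)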
{
_logger.LogError(ex, "Proxy error for {Upstream}", route.Upstream);
_upstreamManager.ReportFailure(route.Upstream, server.Address);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 502,
Body = Encoding.UTF8.GetBytes("Bad gateway")
};
}
}
private bool IsRouteMatch(string path, string pattern)
{
if (pattern.EndsWith("*"))
{
return path.StartsWith(pattern.TrimEnd('*'), StringComparison.OrdinalIgnoreCase);
}
return string.Equals(path, pattern, StringComparison.OrdinalIgnoreCase);
}
private async Task<RouteHandlerResult> ForwardRequestAsync(
HttpContext context,
ProxyRoute route,
UpstreamServer server,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var request = context.Request;
// Build upstream URL
var targetUri = BuildTargetUri(server.Address, request, route.Rewrite);
// Create HTTP request
var httpRequest = new HttpRequestMessage
{
Method = new HttpMethod(request.Method),
RequestUri = targetUri
};
// Copy headers
CopyRequestHeaders(request, httpRequest, route.Headers, claims);
// Add forwarded headers
if (_config.AddForwardedHeaders)
{
AddForwardedHeaders(context, httpRequest);
}
// Copy body for non-GET/HEAD requests
if (!HttpMethods.IsGet(request.Method) && !HttpMethods.IsHead(request.Method))
{
httpRequest.Content = new StreamContent(request.Body);
if (request.ContentType != null)
{
httpRequest.Content.Headers.ContentType = MediaTypeHeaderValue.Parse(request.ContentType);
}
}
// Send request
var client = _httpClientFactory.CreateClient("proxy");
var timeout = route.Timeout ?? _config.DefaultTimeout;
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(timeout);
// Dispose the upstream response once the body has been copied to the client.
using var response = await client.SendAsync(httpRequest, HttpCompletionOption.ResponseHeadersRead, cts.Token);
// Copy response
return await BuildResponseAsync(context, response, route.Headers, cancellationToken);
}
private Uri BuildTargetUri(string serverAddress, HttpRequest request, PathRewriteRule? rewrite)
{
var path = request.Path.Value ?? "/";
if (rewrite != null)
{
path = Regex.Replace(path, rewrite.Pattern, rewrite.Replacement);
}
var query = request.QueryString.Value ?? "";
var baseUri = new Uri(serverAddress.TrimEnd('/'));
return new Uri(baseUri, path + query);
}
private void CopyRequestHeaders(
HttpRequest source,
HttpRequestMessage target,
HeaderTransformConfig? transform,
IReadOnlyDictionary<string, string> claims)
{
// Skip hop-by-hop headers
var skipHeaders = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"Connection", "Keep-Alive", "Proxy-Authenticate", "Proxy-Authorization",
"TE", "Trailer", "Transfer-Encoding", "Upgrade", "Host"
};
// Headers to remove
if (transform?.Remove != null)
{
foreach (var header in transform.Remove)
{
skipHeaders.Add(header);
}
}
foreach (var header in source.Headers)
{
if (skipHeaders.Contains(header.Key))
continue;
target.Headers.TryAddWithoutValidation(header.Key, header.Value.ToArray());
}
// Add configured headers
if (transform?.Add != null)
{
foreach (var (key, value) in transform.Add)
{
target.Headers.TryAddWithoutValidation(key, value);
}
}
// Set configured headers (overwrite)
if (transform?.Set != null)
{
foreach (var (key, value) in transform.Set)
{
target.Headers.Remove(key);
target.Headers.TryAddWithoutValidation(key, value);
}
}
// Forward claims as headers
if (transform?.ForwardClaims == true)
{
var prefix = transform.ClaimsHeaderPrefix ?? "X-Claim-";
foreach (var (key, value) in claims)
{
var headerName = prefix + key.Replace('/', '-').Replace(':', '-');
target.Headers.TryAddWithoutValidation(headerName, value);
}
}
// Preserve or set Host
if (_config.PreserveHost)
{
target.Headers.Host = source.Host.Value;
}
}
private void AddForwardedHeaders(HttpContext context, HttpRequestMessage request)
{
var connection = context.Connection;
var httpRequest = context.Request;
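// X-Forwarded-For: extend any chain received from downstream proxies with the client IP.
// CopyRequestHeaders has already copied the incoming header, so replace it here
// instead of adding a second value.
var forwardedFor = httpRequest.Headers["X-Forwarded-For"].FirstOrDefault();
var clientIp = connection.RemoteIpAddress?.ToString();
if (!string.IsNullOrEmpty(clientIp))
{
forwardedFor = string.IsNullOrEmpty(forwardedFor)
? clientIp
: $"{forwardedFor}, {clientIp}";
}
request.Headers.Remove("X-Forwarded-For");
if (!string.IsNullOrEmpty(forwardedFor))
{
request.Headers.TryAddWithoutValidation("X-Forwarded-For", forwardedFor);
}
// X-Forwarded-Proto
request.Headers.Remove("X-Forwarded-Proto");
request.Headers.TryAddWithoutValidation("X-Forwarded-Proto", httpRequest.Scheme);
// X-Forwarded-Host
request.Headers.Remove("X-Forwarded-Host");
request.Headers.TryAddWithoutValidation("X-Forwarded-Host", httpRequest.Host.Value);
// X-Real-IP
request.Headers.Remove("X-Real-IP");
if (connection.RemoteIpAddress != null)
{
request.Headers.TryAddWithoutValidation("X-Real-IP", connection.RemoteIpAddress.ToString());
}
// X-Request-ID (correlation): keep an ID supplied by the caller, otherwise use the trace identifier.
if (!request.Headers.Contains("X-Request-ID"))
{
request.Headers.TryAddWithoutValidation("X-Request-ID", context.TraceIdentifier);
}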
}
private async Task<RouteHandlerResult> BuildResponseAsync(
HttpContext context,
HttpResponseMessage response,
HeaderTransformConfig? transform,
CancellationToken cancellationToken)
{
var httpResponse = context.Response;
httpResponse.StatusCode = (int)response.StatusCode;
// Copy response headers
var skipHeaders = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"Transfer-Encoding", "Connection"
};
foreach (var header in response.Headers)
{
if (skipHeaders.Contains(header.Key))
continue;
httpResponse.Headers[header.Key] = header.Value.ToArray();
}
foreach (var header in response.Content.Headers)
{
if (skipHeaders.Contains(header.Key))
continue;
httpResponse.Headers[header.Key] = header.Value.ToArray();
}
// Stream response body
await response.Content.CopyToAsync(httpResponse.Body, cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = (int)response.StatusCode
};
}
}
```
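
`BuildTargetUri` combines three inputs: the upstream address, the (optionally rewritten) request path, and the original query string. The sketch below reproduces that combination as a standalone function so the effect of a rewrite rule can be checked in isolation. `PathRewriteSketch` and `BuildTarget` are hypothetical names, and the sample address is taken from the configuration example in this step.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathRewriteSketch
{
    // Applies the regex rewrite to the path, then resolves path + query against the upstream base.
    public static Uri BuildTarget(
        string serverAddress, string path, string query, string pattern, string replacement)
    {
        var rewritten = Regex.Replace(path, pattern, replacement);
        var baseUri = new Uri(serverAddress.TrimEnd('/'));
        return new Uri(baseUri, rewritten + query);
    }
}
```

For example, `BuildTarget("http://legacy-api-1:8080", "/legacy/users/42", "?expand=true", "^/legacy", "/api/v1")` yields `http://legacy-api-1:8080/api/v1/users/42?expand=true`. Because the rewritten path starts with `/`, any path segment in the upstream address itself is replaced rather than prefixed, which is worth keeping in mind when an upstream is hosted under a sub-path.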
---
## Upstream Manager
```csharp
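using System.Collections.Concurrent;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
namespace StellaOps.Router.Handlers.ReverseProxy;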
public interface IUpstreamManager
{
Task<UpstreamServer?> GetServerAsync(
string upstreamName,
HttpContext context,
CancellationToken cancellationToken);
void ReportSuccess(string upstreamName, string serverAddress);
void ReportFailure(string upstreamName, string serverAddress);
}
public sealed class UpstreamManager : IUpstreamManager, IHostedService
{
private readonly ReverseProxyConfig _config;
private readonly ILogger<UpstreamManager> _logger;
private readonly ConcurrentDictionary<string, ServerState> _serverStates = new();
private readonly ConcurrentDictionary<string, int> _roundRobinCounters = new();
private Timer? _healthCheckTimer;
public UpstreamManager(
IOptions<ReverseProxyConfig> config,
ILogger<UpstreamManager> logger)
{
_config = config.Value;
_logger = logger;
InitializeServerStates();
}
private void InitializeServerStates()
{
foreach (var (name, upstream) in _config.Upstreams)
{
foreach (var server in upstream.Servers)
{
var key = $"{name}:{server.Address}";
_serverStates[key] = new ServerState
{
Address = server.Address,
Weight = server.Weight,
IsHealthy = true,
IsBackup = server.Backup
};
}
}
}
public Task<UpstreamServer?> GetServerAsync(
string upstreamName,
HttpContext context,
CancellationToken cancellationToken)
{
if (!_config.Upstreams.TryGetValue(upstreamName, out var upstream))
{
return Task.FromResult<UpstreamServer?>(null);
}
var healthyServers = upstream.Servers
.Where(s => IsServerHealthy(upstreamName, s.Address) && !s.Backup)
.ToList();
// Fall back to backup servers if no primary available
if (healthyServers.Count == 0)
{
healthyServers = upstream.Servers
.Where(s => IsServerHealthy(upstreamName, s.Address) && s.Backup)
.ToList();
}
if (healthyServers.Count == 0)
{
return Task.FromResult<UpstreamServer?>(null);
}
var server = upstream.LoadBalance switch
{
LoadBalanceStrategy.RoundRobin => SelectRoundRobin(upstreamName, healthyServers),
LoadBalanceStrategy.Random => SelectRandom(healthyServers),
LoadBalanceStrategy.WeightedRoundRobin => SelectWeightedRoundRobin(upstreamName, healthyServers),
LoadBalanceStrategy.LeastConnections => SelectLeastConnections(upstreamName, healthyServers),
LoadBalanceStrategy.IPHash => SelectIPHash(context, healthyServers),
_ => healthyServers[0]
};
return Task.FromResult<UpstreamServer?>(server);
}
private bool IsServerHealthy(string upstreamName, string address)
{
var key = $"{upstreamName}:{address}";
return _serverStates.TryGetValue(key, out var state) && state.IsHealthy;
}
private UpstreamServer SelectRoundRobin(string upstreamName, List<UpstreamServer> servers)
{
var counter = _roundRobinCounters.AddOrUpdate(upstreamName, 0, (_, c) => c + 1);
// Cast to uint so the index stays non-negative after the int counter wraps around.
return servers[(int)((uint)counter % (uint)servers.Count)];
}
private UpstreamServer SelectRandom(List<UpstreamServer> servers)
{
return servers[Random.Shared.Next(servers.Count)];
}
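private UpstreamServer SelectWeightedRoundRobin(string upstreamName, List<UpstreamServer> servers)
{
var totalWeight = servers.Sum(s => s.Weight);
if (totalWeight <= 0)
{
// No positive weights configured: fall back to plain round robin instead of dividing by zero.
return SelectRoundRobin(upstreamName, servers);
}
var counter = _roundRobinCounters.AddOrUpdate(upstreamName, 0, (_, c) => c + 1);
// Cast to uint so the position stays non-negative after the int counter wraps around.
var position = (int)((uint)counter % (uint)totalWeight);
var cumulative = 0;
foreach (var server in servers)
{
cumulative += server.Weight;
if (position < cumulative)
return server;
}
return servers[^1];
}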
private UpstreamServer SelectLeastConnections(string upstreamName, List<UpstreamServer> servers)
{
return servers
.OrderBy(s =>
{
var key = $"{upstreamName}:{s.Address}";
return _serverStates.TryGetValue(key, out var state) ? state.ActiveConnections : 0;
})
.First();
}
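private UpstreamServer SelectIPHash(HttpContext context, List<UpstreamServer> servers)
{
// string.GetHashCode() is randomized per process and Math.Abs(int.MinValue) throws,
// so hash the raw address bytes with FNV-1a to get a stable, non-negative index.
var address = context.Connection.RemoteIpAddress ?? System.Net.IPAddress.Loopback;
var hash = 2166136261u;
foreach (var b in address.GetAddressBytes())
{
hash = unchecked((hash ^ b) * 16777619u);
}
return servers[(int)(hash % (uint)servers.Count)];
}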
public void ReportSuccess(string upstreamName, string serverAddress)
{
var key = $"{upstreamName}:{serverAddress}";
if (_serverStates.TryGetValue(key, out var state))
{
state.ConsecutiveFailures = 0;
state.ConsecutiveSuccesses++;
// Check circuit breaker reset
if (!state.IsHealthy && state.ConsecutiveSuccesses >= GetHealthyThreshold(upstreamName))
{
state.IsHealthy = true;
_logger.LogInformation("Server {Server} marked healthy", serverAddress);
}
}
}
public void ReportFailure(string upstreamName, string serverAddress)
{
var key = $"{upstreamName}:{serverAddress}";
if (_serverStates.TryGetValue(key, out var state))
{
state.ConsecutiveSuccesses = 0;
state.ConsecutiveFailures++;
// Check circuit breaker trip
if (state.IsHealthy && state.ConsecutiveFailures >= GetUnhealthyThreshold(upstreamName))
{
state.IsHealthy = false;
_logger.LogWarning("Server {Server} marked unhealthy after {Failures} failures",
serverAddress, state.ConsecutiveFailures);
}
}
}
private int GetUnhealthyThreshold(string upstreamName)
{
return _config.Upstreams.TryGetValue(upstreamName, out var upstream)
? upstream.HealthCheck?.UnhealthyThreshold ?? 3
: 3;
}
private int GetHealthyThreshold(string upstreamName)
{
return _config.Upstreams.TryGetValue(upstreamName, out var upstream)
? upstream.HealthCheck?.HealthyThreshold ?? 2
: 2;
}
public Task StartAsync(CancellationToken cancellationToken)
{
_healthCheckTimer = new Timer(PerformHealthChecks, null, TimeSpan.Zero, TimeSpan.FromSeconds(10));
return Task.CompletedTask;
}
private async void PerformHealthChecks(object? state)
{
foreach (var (name, upstream) in _config.Upstreams)
{
if (upstream.HealthCheck == null)
continue;
foreach (var server in upstream.Servers)
{
await CheckServerHealthAsync(name, server, upstream.HealthCheck);
}
}
}
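// Shared client: creating a new HttpClient for every probe can exhaust sockets.
private static readonly HttpClient HealthCheckClient = new();
private async Task CheckServerHealthAsync(
string upstreamName,
UpstreamServer server,
HealthCheckConfig config)
{
try
{
// Enforce the probe timeout per request, since the shared client has a single global timeout.
using var cts = new CancellationTokenSource(config.Timeout);
var uri = new Uri(new Uri(server.Address), config.Path);
using var response = await HealthCheckClient.GetAsync(uri, cts.Token);
if (response.IsSuccessStatusCode)
{
ReportSuccess(upstreamName, server.Address);
}
else
{
ReportFailure(upstreamName, server.Address);
}
}
catch
{
// Timeouts, connection errors, and malformed addresses all count as a failed probe.
ReportFailure(upstreamName, server.Address);
}
}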
public Task StopAsync(CancellationToken cancellationToken)
{
_healthCheckTimer?.Dispose();
return Task.CompletedTask;
}
}
internal class ServerState
{
public string Address { get; set; } = "";
public int Weight { get; set; } = 1;
public bool IsHealthy { get; set; } = true;
public bool IsBackup { get; set; }
public int ConsecutiveFailures { get; set; }
public int ConsecutiveSuccesses { get; set; }
public int ActiveConnections { get; set; }
}
```
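
Weighted round robin maps a monotonically increasing counter onto cumulative weight ranges. The sketch below shows the selection step on its own, using an unsigned counter so the index can never go negative. `WeightedRoundRobinSketch` is a hypothetical name, and servers are reduced to `(Address, Weight)` tuples for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightedRoundRobinSketch
{
    // Picks the server whose cumulative weight range contains (counter mod totalWeight).
    public static string Select(uint counter, IReadOnlyList<(string Address, int Weight)> servers)
    {
        var totalWeight = servers.Sum(s => s.Weight);
        if (totalWeight <= 0)
            throw new ArgumentException("At least one server needs a positive weight.", nameof(servers));

        var position = (int)(counter % (uint)totalWeight);
        var cumulative = 0;
        foreach (var (address, weight) in servers)
        {
            cumulative += weight;
            if (position < cumulative)
                return address;
        }
        return servers[servers.Count - 1].Address;
    }
}
```

With weights `A = 2` and `B = 1`, counters 0, 1, 2, 3, 4, 5 select A, A, B, A, A, B. The same server receives its whole share back to back; a smoother interleaving would need a different algorithm, which this step does not describe.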
---
## Service Registration
```csharp
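using System.Net;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
namespace StellaOps.Router.Handlers.ReverseProxy;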
public static class ReverseProxyExtensions
{
public static IServiceCollection AddReverseProxyHandler(
this IServiceCollection services,
IConfiguration configuration)
{
services.Configure<ReverseProxyConfig>(
configuration.GetSection("ReverseProxy"));
services.AddSingleton<IUpstreamManager, UpstreamManager>();
services.AddHostedService(sp => (UpstreamManager)sp.GetRequiredService<IUpstreamManager>());
services.AddHttpClient("proxy", client =>
{
client.DefaultRequestVersion = HttpVersion.Version20;
client.DefaultVersionPolicy = HttpVersionPolicy.RequestVersionOrLower;
})
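.ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
{
PooledConnectionLifetime = TimeSpan.FromMinutes(5),
MaxConnectionsPerServer = 100,
EnableMultipleHttp2Connections = true,
// A proxy should relay redirects and cookies to the client rather than act on them itself.
AllowAutoRedirect = false,
UseCookies = false
});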
services.AddSingleton<IRouteHandler, ReverseProxyHandler>();
return services;
}
}
```
---
## YAML Configuration
```yaml
ReverseProxy:
  DefaultTimeout: "00:00:30"
  AddForwardedHeaders: true
  PreserveHost: false
  ConnectionPool:
    MaxConnectionsPerServer: 100
    ConnectionIdleTimeout: "00:02:00"
    EnableHttp2: true
  Upstreams:
    legacy-api:
      # Weights are only honored by the weighted strategy; plain RoundRobin ignores them.
      LoadBalance: WeightedRoundRobin
      Servers:
        - Address: "http://legacy-api-1:8080"
          Weight: 2
        - Address: "http://legacy-api-2:8080"
          Weight: 1
        - Address: "http://legacy-api-backup:8080"
          Backup: true
      HealthCheck:
        Path: "/health"
        Interval: "00:00:10"
        Timeout: "00:00:05"
        UnhealthyThreshold: 3
        HealthyThreshold: 2
      CircuitBreaker:
        FailureThreshold: 5
        SamplingDuration: "00:00:30"
        BreakDuration: "00:00:30"
      Retry:
        MaxRetries: 3
        InitialDelay: "00:00:00.100"
        BackoffMultiplier: 2.0
        RetryableStatusCodes: [502, 503, 504]
    external-service:
      LoadBalance: LeastConnections
      Servers:
        - Address: "https://api.external-service.com"
  Routes:
    - PathPattern: "/legacy/*"
      Upstream: "legacy-api"
      Rewrite:
        Pattern: "^/legacy"
        Replacement: "/api/v1"
      Headers:
        Add:
          X-Proxy-Source: "stella-router"
        Remove:
          - "X-Internal-Token"
        ForwardClaims: true
        ClaimsHeaderPrefix: "X-User-"
      RequiredClaims:
        - "sub"
    - PathPattern: "/external/*"
      Upstream: "external-service"
      Timeout: "00:01:00"
      Headers:
        Set:
          Authorization: "Bearer ${EXTERNAL_API_KEY}"
```
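
Routes are evaluated in declaration order and the first matching `PathPattern` wins. The sketch below restates the matching rule from `IsRouteMatch` as a standalone helper so the patterns in the configuration can be checked against sample paths. `RouteMatchSketch` and `FirstMatch` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RouteMatchSketch
{
    // A trailing '*' means prefix match; anything else is an exact, case-insensitive match.
    public static bool IsRouteMatch(string path, string pattern) =>
        pattern.EndsWith('*')
            ? path.StartsWith(pattern.TrimEnd('*'), StringComparison.OrdinalIgnoreCase)
            : string.Equals(path, pattern, StringComparison.OrdinalIgnoreCase);

    // Returns the first pattern that matches, or null when none does.
    public static string? FirstMatch(string path, IEnumerable<string> patterns) =>
        patterns.FirstOrDefault(p => IsRouteMatch(path, p));
}
```

Against the patterns `/legacy/*` and `/external/*`, the path `/legacy/users` selects the first route and `/external/ping` the second. The bare path `/legacy` matches neither, because the prefix being compared is `/legacy/` including the trailing slash.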
---
## Deliverables
1. `StellaOps.Router.Handlers.ReverseProxy/ReverseProxyHandler.cs`
2. `StellaOps.Router.Handlers.ReverseProxy/ReverseProxyConfig.cs`
3. `StellaOps.Router.Handlers.ReverseProxy/IUpstreamManager.cs`
4. `StellaOps.Router.Handlers.ReverseProxy/UpstreamManager.cs`
5. `StellaOps.Router.Handlers.ReverseProxy/ReverseProxyExtensions.cs`
6. Load balancing strategy tests
7. Health check tests
8. Circuit breaker tests
9. Header transformation tests
---
## Next Step
Proceed to [Step 19: Microservice Host Builder](19-Step.md) to build the host that connects microservices to the Stella Router.

714
docs/router/19-Step.md Normal file

@@ -0,0 +1,714 @@
# Step 19: Microservice Host Builder
**Phase 5: Microservice SDK**
**Estimated Complexity:** High
**Dependencies:** Step 14 (TCP Transport), Step 15 (TLS Transport)
---
## Overview
The Microservice Host Builder provides a fluent API for building microservices that connect to the Stella Router. It handles transport configuration, endpoint registration, graceful shutdown, and integration with ASP.NET Core's hosting infrastructure.
---
## Goals
1. Provide fluent builder API for microservice configuration
2. Support both standalone and ASP.NET Core integrated hosting
3. Handle transport lifecycle (connect, reconnect, disconnect)
4. Support multiple transport configurations
5. Enable dual-exposure mode (gateway + direct HTTP)
---
## Core Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                   Microservice Host Builder                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                  StellaMicroserviceHost                  │  │
│  │ ┌─────────────────┐ ┌───────────────────┐ ┌────────────┐ │  │
│  │ │ Transport Layer │ │ Endpoint Registry │ │  Request   │ │  │
│  │ │  (TCP/TLS/etc)  │ │  (Discovery/Reg)  │ │ Dispatcher │ │  │
│  │ └─────────────────┘ └───────────────────┘ └────────────┘ │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │               Optional: ASP.NET Core Host                │  │
│  │    (Kestrel for direct HTTP access + default claims)     │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
---
## Configuration
```csharp
namespace StellaOps.Microservice;
public class StellaMicroserviceOptions
{
/// <summary>Service name for registration.</summary>
public required string ServiceName { get; set; }
/// <summary>Unique instance identifier (auto-generated if not set).</summary>
public string InstanceId { get; set; } = Guid.NewGuid().ToString("N")[..8];
/// <summary>Service version for routing.</summary>
public string Version { get; set; } = "1.0.0";
/// <summary>Region for routing affinity.</summary>
public string? Region { get; set; }
/// <summary>Tags for routing metadata.</summary>
public Dictionary<string, string> Tags { get; set; } = new();
/// <summary>Router connection pool.</summary>
public List<RouterConnectionConfig> Routers { get; set; } = new();
/// <summary>Transport configuration.</summary>
public TransportConfig Transport { get; set; } = new();
/// <summary>Endpoint discovery configuration.</summary>
public EndpointDiscoveryConfig Discovery { get; set; } = new();
/// <summary>Heartbeat configuration.</summary>
public HeartbeatConfig Heartbeat { get; set; } = new();
/// <summary>Dual exposure mode configuration.</summary>
public DualExposureConfig? DualExposure { get; set; }
/// <summary>Graceful shutdown timeout.</summary>
public TimeSpan ShutdownTimeout { get; set; } = TimeSpan.FromSeconds(30);
}
public class RouterConnectionConfig
{
public string Host { get; set; } = "localhost";
public int Port { get; set; } = 9500;
public string Transport { get; set; } = "TCP"; // TCP, TLS, InMemory
public int Priority { get; set; } = 1;
public bool Enabled { get; set; } = true;
}
public class TransportConfig
{
public string Default { get; set; } = "TCP";
public TcpClientConfig? Tcp { get; set; }
public TlsClientConfig? Tls { get; set; }
public int MaxReconnectAttempts { get; set; } = -1; // -1 = unlimited
public TimeSpan ReconnectDelay { get; set; } = TimeSpan.FromSeconds(5);
}
public class EndpointDiscoveryConfig
{
/// <summary>Assemblies to scan for endpoints.</summary>
public List<string> ScanAssemblies { get; set; } = new();
/// <summary>Path to YAML overrides file.</summary>
public string? ConfigFilePath { get; set; }
/// <summary>Base path prefix for all endpoints.</summary>
public string? BasePath { get; set; }
/// <summary>Whether to auto-discover endpoints via reflection.</summary>
public bool AutoDiscover { get; set; } = true;
}
public class HeartbeatConfig
{
public TimeSpan Interval { get; set; } = TimeSpan.FromSeconds(10);
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(5);
public int MissedHeartbeatsThreshold { get; set; } = 3;
}
public class DualExposureConfig
{
/// <summary>Enable direct HTTP access.</summary>
public bool Enabled { get; set; } = false;
/// <summary>HTTP port for direct access.</summary>
public int HttpPort { get; set; } = 8080;
/// <summary>Default claims for direct access (no JWT).</summary>
public Dictionary<string, string> DefaultClaims { get; set; } = new();
/// <summary>Whether to require JWT for direct access.</summary>
public bool RequireAuthentication { get; set; } = false;
}
```
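
To illustrate how the `HeartbeatConfig` values might combine, the sketch below computes when an instance would be treated as unhealthy. The rule it uses (unhealthy once `MissedHeartbeatsThreshold` full intervals pass without a heartbeat) is an assumption, since this step does not define how the router interprets these settings, and `HeartbeatMath` is a hypothetical helper rather than part of the SDK.

```csharp
using System;

// Hypothetical helper, not part of the SDK.
// Assumption: a peer is unhealthy after `missedThreshold` consecutive
// intervals without a heartbeat.
public static class HeartbeatMath
{
    public static TimeSpan UnhealthyAfter(TimeSpan interval, int missedThreshold)
    {
        if (interval <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(interval));
        if (missedThreshold < 1)
            throw new ArgumentOutOfRangeException(nameof(missedThreshold));

        return TimeSpan.FromTicks(interval.Ticks * missedThreshold);
    }

    public static bool IsUnhealthy(
        DateTimeOffset lastHeartbeat,
        DateTimeOffset now,
        TimeSpan interval,
        int missedThreshold)
        => now - lastHeartbeat > UnhealthyAfter(interval, missedThreshold);
}
```

Under that assumption, the defaults above (10 second interval, threshold of 3) would give a 30 second window before an instance is flagged.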
---
## Host Builder Implementation
```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
namespace StellaOps.Microservice;
public interface IStellaMicroserviceBuilder
{
IStellaMicroserviceBuilder ConfigureServices(Action<IServiceCollection> configure);
IStellaMicroserviceBuilder ConfigureTransport(Action<TransportConfig> configure);
IStellaMicroserviceBuilder ConfigureEndpoints(Action<EndpointDiscoveryConfig> configure);
IStellaMicroserviceBuilder AddRouter(string host, int port, string transport = "TCP");
IStellaMicroserviceBuilder EnableDualExposure(Action<DualExposureConfig>? configure = null);
IStellaMicroserviceBuilder UseYamlConfig(string path);
IStellaMicroserviceHost Build();
}
public sealed class StellaMicroserviceBuilder : IStellaMicroserviceBuilder
{
private readonly StellaMicroserviceOptions _options;
private readonly IServiceCollection _services;
private readonly List<Action<IServiceCollection>> _configureActions = new();
public StellaMicroserviceBuilder(string serviceName)
{
_options = new StellaMicroserviceOptions { ServiceName = serviceName };
_services = new ServiceCollection();
// Add default services
_services.AddLogging(b => b.AddConsole());
_services.AddSingleton(_options);
}
public static IStellaMicroserviceBuilder Create(string serviceName)
{
return new StellaMicroserviceBuilder(serviceName);
}
public IStellaMicroserviceBuilder ConfigureServices(Action<IServiceCollection> configure)
{
_configureActions.Add(configure);
return this;
}
public IStellaMicroserviceBuilder ConfigureTransport(Action<TransportConfig> configure)
{
configure(_options.Transport);
return this;
}
public IStellaMicroserviceBuilder ConfigureEndpoints(Action<EndpointDiscoveryConfig> configure)
{
configure(_options.Discovery);
return this;
}
public IStellaMicroserviceBuilder AddRouter(string host, int port, string transport = "TCP")
{
_options.Routers.Add(new RouterConnectionConfig
{
Host = host,
Port = port,
Transport = transport,
Priority = _options.Routers.Count + 1
});
return this;
}
public IStellaMicroserviceBuilder EnableDualExposure(Action<DualExposureConfig>? configure = null)
{
_options.DualExposure = new DualExposureConfig { Enabled = true };
configure?.Invoke(_options.DualExposure);
return this;
}
public IStellaMicroserviceBuilder UseYamlConfig(string path)
{
_options.Discovery.ConfigFilePath = path;
return this;
}
public IStellaMicroserviceHost Build()
{
// Apply custom service configuration
foreach (var action in _configureActions)
{
action(_services);
}
// Add core services
AddCoreServices();
// Add transport services
AddTransportServices();
// Add endpoint services
AddEndpointServices();
var serviceProvider = _services.BuildServiceProvider();
return serviceProvider.GetRequiredService<IStellaMicroserviceHost>();
}
private void AddCoreServices()
{
_services.AddSingleton<IStellaMicroserviceHost, StellaMicroserviceHost>();
_services.AddSingleton<IEndpointRegistry, EndpointRegistry>();
_services.AddSingleton<IRequestDispatcher, RequestDispatcher>();
_services.AddSingleton<IPayloadSerializer, MessagePackPayloadSerializer>();
}
private void AddTransportServices()
{
_services.AddSingleton<TcpFrameCodec>();
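// ToUpperInvariant avoids culture-specific casing (for example, the Turkish dotted I in "InMemory").
switch (_options.Transport.Default.ToUpperInvariant())
{
case "TCP":
_services.AddSingleton<ITransportServer, TcpTransportClient>();
break;
case "TLS":
_services.AddSingleton<ICertificateProvider, CertificateProvider>();
_services.AddSingleton<ITransportServer, TlsTransportClient>();
break;
case "INMEMORY":
// InMemory requires hub to be provided externally
_services.AddSingleton<ITransportServer, InMemoryTransportServer>();
break;
default:
// Fail fast instead of leaving ITransportServer unregistered.
throw new InvalidOperationException(
$"Unsupported transport '{_options.Transport.Default}'. Expected TCP, TLS, or InMemory.");
}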
}
private void AddEndpointServices()
{
_services.AddSingleton<IEndpointDiscovery, ReflectionEndpointDiscovery>();
if (!string.IsNullOrEmpty(_options.Discovery.ConfigFilePath))
{
_services.AddSingleton<IEndpointOverrideProvider, YamlEndpointOverrideProvider>();
}
}
}
```
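
Because `AddRouter` assigns `Priority` in registration order, the order in which routers are added determines which one the host prefers. The following sketch shows one way the failover order could be derived. `RouterCandidate` is a trimmed stand-in for `RouterConnectionConfig`, and `RouterSelection` is a hypothetical helper; it also assumes that disabled routers should be skipped and that a lower `Priority` value means more preferred.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for RouterConnectionConfig with only the fields used here.
public sealed record RouterCandidate(string Host, int Port, int Priority, bool Enabled = true);

public static class RouterSelection
{
    // Returns enabled routers, lowest Priority value first (1 = most preferred).
    // OrderBy is a stable sort, so routers with equal priority keep their
    // registration order.
    public static IReadOnlyList<RouterCandidate> FailoverOrder(IEnumerable<RouterCandidate> routers)
        => routers.Where(r => r.Enabled)
                  .OrderBy(r => r.Priority)
                  .ToList();
}
```

A host could try each entry of the returned list in turn until a connection succeeds.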
---
## Microservice Host Implementation
```csharp
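using System.Diagnostics;
using System.Text;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
namespace StellaOps.Microservice;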
public interface IStellaMicroserviceHost : IAsyncDisposable
{
StellaMicroserviceOptions Options { get; }
bool IsConnected { get; }
Task StartAsync(CancellationToken cancellationToken = default);
Task StopAsync(CancellationToken cancellationToken = default);
Task WaitForShutdownAsync(CancellationToken cancellationToken = default);
}
public sealed class StellaMicroserviceHost : IStellaMicroserviceHost, IHostedService
{
private readonly StellaMicroserviceOptions _options;
private readonly ITransportServer _transport;
private readonly IEndpointRegistry _endpointRegistry;
private readonly IRequestDispatcher _dispatcher;
private readonly ILogger<StellaMicroserviceHost> _logger;
private readonly CancellationTokenSource _shutdownCts = new();
private readonly TaskCompletionSource _shutdownComplete = new();
private Timer? _heartbeatTimer;
private IHost? _httpHost;
public StellaMicroserviceOptions Options => _options;
public bool IsConnected => _transport.IsConnected;
public StellaMicroserviceHost(
StellaMicroserviceOptions options,
ITransportServer transport,
IEndpointRegistry endpointRegistry,
IRequestDispatcher dispatcher,
ILogger<StellaMicroserviceHost> logger)
{
_options = options;
_transport = transport;
_endpointRegistry = endpointRegistry;
_dispatcher = dispatcher;
_logger = logger;
}
public async Task StartAsync(CancellationToken cancellationToken = default)
{
_logger.LogInformation(
"Starting microservice {ServiceName}/{InstanceId}",
_options.ServiceName, _options.InstanceId);
// Discover endpoints
var endpoints = await _endpointRegistry.DiscoverEndpointsAsync(cancellationToken);
_logger.LogInformation("Discovered {Count} endpoints", endpoints.Length);
// Wire up request handler
_transport.OnRequest += HandleRequestAsync;
_transport.OnCancel += HandleCancelAsync;
// Connect to router
var router = _options.Routers
.Where(r => r.Enabled) // skip routers that are configured but disabled
.OrderBy(r => r.Priority)
.FirstOrDefault()
?? throw new InvalidOperationException("No enabled routers configured");
await _transport.ConnectAsync(
_options.ServiceName,
_options.InstanceId,
endpoints,
cancellationToken);
_logger.LogInformation(
"Connected to router at {Host}:{Port}",
router.Host, router.Port);
// Start heartbeat
_heartbeatTimer = new Timer(
SendHeartbeatAsync,
null,
_options.Heartbeat.Interval,
_options.Heartbeat.Interval);
// Start dual exposure HTTP if enabled
if (_options.DualExposure?.Enabled == true)
{
await StartHttpHostAsync(cancellationToken);
}
_logger.LogInformation(
"Microservice {ServiceName} started successfully",
_options.ServiceName);
}
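// Activities are started from an ActivitySource; Activity has no static StartActivity method.
private static readonly ActivitySource RequestActivitySource = new("StellaOps.Microservice");
private async Task<ResponsePayload> HandleRequestAsync(
RequestPayload request,
CancellationToken cancellationToken)
{
using var activity = RequestActivitySource.StartActivity("HandleRequest");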
activity?.SetTag("http.method", request.Method);
activity?.SetTag("http.path", request.Path);
try
{
return await _dispatcher.DispatchAsync(request, cancellationToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error handling request {Path}", request.Path);
return new ResponsePayload
{
StatusCode = 500,
Headers = new Dictionary<string, string>(),
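// Serialize through System.Text.Json so quotes and control characters in the message are escaped.
Body = System.Text.Json.JsonSerializer.SerializeToUtf8Bytes(new { error = ex.Message }),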
IsFinalChunk = true
};
}
}
private Task HandleCancelAsync(string correlationId, CancellationToken cancellationToken)
{
_logger.LogDebug("Request {CorrelationId} cancelled", correlationId);
// Propagate cancellation to active request handling
return Task.CompletedTask;
}
private async void SendHeartbeatAsync(object? state)
{
try
{
await _transport.SendHeartbeatAsync(_shutdownCts.Token);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to send heartbeat");
}
}
private async Task StartHttpHostAsync(CancellationToken cancellationToken)
{
var config = _options.DualExposure!;
_httpHost = Host.CreateDefaultBuilder()
.ConfigureWebHostDefaults(web =>
{
web.UseKestrel(k => k.ListenAnyIP(config.HttpPort));
web.Configure(app =>
{
app.UseRouting();
app.UseEndpoints(endpoints =>
{
endpoints.MapFallback(async context =>
{
// Inject default claims for direct access
var claims = config.DefaultClaims;
var request = new RequestPayload
{
Method = context.Request.Method,
Path = context.Request.Path + context.Request.QueryString,
Host = context.Request.Host.Value,
Headers = context.Request.Headers
.ToDictionary(h => h.Key, h => h.Value.ToString()),
Claims = claims,
ClientIp = context.Connection.RemoteIpAddress?.ToString(),
TraceId = context.TraceIdentifier
};
// Read body if present
if (context.Request.ContentLength > 0)
{
using var ms = new MemoryStream();
await context.Request.Body.CopyToAsync(ms);
request = request with { Body = ms.ToArray() };
}
var response = await _dispatcher.DispatchAsync(request, context.RequestAborted);
context.Response.StatusCode = response.StatusCode;
foreach (var (key, value) in response.Headers)
{
context.Response.Headers[key] = value;
}
if (response.Body != null)
{
await context.Response.Body.WriteAsync(response.Body);
}
});
});
});
})
.Build();
await _httpHost.StartAsync(cancellationToken);
_logger.LogInformation(
"Direct HTTP access enabled on port {Port}",
config.HttpPort);
}
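private int _stopped;
public async Task StopAsync(CancellationToken cancellationToken = default)
{
// StopAsync can be reached both explicitly and through DisposeAsync; run the shutdown sequence only once.
if (Interlocked.Exchange(ref _stopped, 1) == 1)
return;
_logger.LogInformation(
"Stopping microservice {ServiceName}",
_options.ServiceName);
_shutdownCts.Cancel();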
_heartbeatTimer?.Dispose();
if (_httpHost != null)
{
await _httpHost.StopAsync(cancellationToken);
}
await _transport.DisconnectAsync();
_logger.LogInformation(
"Microservice {ServiceName} stopped",
_options.ServiceName);
_shutdownComplete.TrySetResult();
}
public Task WaitForShutdownAsync(CancellationToken cancellationToken = default)
{
return _shutdownComplete.Task.WaitAsync(cancellationToken);
}
public async ValueTask DisposeAsync()
{
await StopAsync();
_shutdownCts.Dispose();
}
// IHostedService implementation for ASP.NET Core integration
Task IHostedService.StartAsync(CancellationToken cancellationToken) => StartAsync(cancellationToken);
Task IHostedService.StopAsync(CancellationToken cancellationToken) => StopAsync(cancellationToken);
}
```
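
`HandleCancelAsync` above only logs the cancellation. One way to actually propagate it to the running handler is to keep a cancellation source per in-flight request, keyed by correlation ID. The sketch below is a minimal illustration under that assumption; `InflightRequestTracker` is a hypothetical helper, and it presumes the transport supplies a correlation ID when a request starts, which this step does not show.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: maps correlation IDs to per-request cancellation sources.
public sealed class InflightRequestTracker
{
    private readonly ConcurrentDictionary<string, CancellationTokenSource> _inflight = new();

    // Call when a request starts; pass the returned token to the dispatcher.
    // The token is also cancelled when the host shuts down.
    public CancellationToken Begin(string correlationId, CancellationToken hostShutdown)
    {
        var cts = CancellationTokenSource.CreateLinkedTokenSource(hostShutdown);
        if (!_inflight.TryAdd(correlationId, cts))
        {
            cts.Dispose();
            throw new InvalidOperationException($"Duplicate correlation ID: {correlationId}");
        }
        return cts.Token;
    }

    // Call from the cancel handler; returns false if the request already finished.
    public bool Cancel(string correlationId)
    {
        if (!_inflight.TryGetValue(correlationId, out var cts))
            return false;

        try
        {
            cts.Cancel();
        }
        catch (ObjectDisposedException)
        {
            return false; // lost a race with Complete()
        }
        return true;
    }

    // Call in a finally block once the response has been produced.
    public void Complete(string correlationId)
    {
        if (_inflight.TryRemove(correlationId, out var cts))
            cts.Dispose();
    }
}
```

With something like this in place, the request handler would wrap dispatch in `Begin`/`Complete`, and the cancel handler would call `Cancel(correlationId)`.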
---
## ASP.NET Core Integration
```csharp
namespace StellaOps.Microservice;
public static class StellaMicroserviceExtensions
{
/// <summary>
/// Adds Stella microservice to an existing ASP.NET Core host.
/// </summary>
public static IServiceCollection AddStellaMicroservice(
this IServiceCollection services,
Action<StellaMicroserviceOptions> configure)
{
var options = new StellaMicroserviceOptions { ServiceName = "unknown" };
configure(options);
services.AddSingleton(options);
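// EndpointRegistry depends on IEndpointDiscovery, so it must be registered here as well.
services.AddSingleton<IEndpointDiscovery, ReflectionEndpointDiscovery>();
services.AddSingleton<IEndpointRegistry, EndpointRegistry>();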
services.AddSingleton<IRequestDispatcher, RequestDispatcher>();
services.AddSingleton<IPayloadSerializer, MessagePackPayloadSerializer>();
services.AddSingleton<TcpFrameCodec>();
// Add transport based on configuration
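switch (options.Transport.Default.ToUpperInvariant())
{
case "TCP":
services.AddSingleton<ITransportServer, TcpTransportClient>();
break;
case "TLS":
services.AddSingleton<ICertificateProvider, CertificateProvider>();
services.AddSingleton<ITransportServer, TlsTransportClient>();
break;
case "INMEMORY":
// InMemory requires hub to be provided externally
services.AddSingleton<ITransportServer, InMemoryTransportServer>();
break;
default:
// Fail fast instead of leaving ITransportServer unregistered.
throw new InvalidOperationException(
$"Unsupported transport '{options.Transport.Default}'. Expected TCP, TLS, or InMemory.");
}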
services.AddSingleton<IStellaMicroserviceHost, StellaMicroserviceHost>();
services.AddHostedService(sp => (StellaMicroserviceHost)sp.GetRequiredService<IStellaMicroserviceHost>());
return services;
}
/// <summary>
/// Configures an endpoint handler for the microservice.
/// </summary>
public static IServiceCollection AddEndpointHandler<THandler>(
this IServiceCollection services)
where THandler : class, IEndpointHandler
{
services.AddScoped<IEndpointHandler, THandler>();
return services;
}
}
```
---
## Usage Examples
### Standalone Microservice
```csharp
var host = StellaMicroserviceBuilder
.Create("billing-service")
.AddRouter("gateway.internal", 9500, "TLS")
.ConfigureTransport(t =>
{
t.Tls = new TlsClientConfig
{
ClientCertificatePath = "/etc/certs/billing.pfx",
ClientCertificatePassword = Environment.GetEnvironmentVariable("CERT_PASSWORD")
};
})
.ConfigureEndpoints(e =>
{
e.BasePath = "/billing";
e.ScanAssemblies.Add("BillingService.Handlers");
})
.ConfigureServices(services =>
{
services.AddScoped<BillingContext>();
services.AddScoped<InvoiceHandler>();
})
.Build();
await host.StartAsync();
await host.WaitForShutdownAsync();
```
### ASP.NET Core Integration
```csharp
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddStellaMicroservice(options =>
{
options.ServiceName = "user-service";
options.Region = "us-east-1";
options.Routers.Add(new RouterConnectionConfig
{
Host = "gateway.internal",
Port = 9500
});
options.DualExposure = new DualExposureConfig
{
Enabled = true,
HttpPort = 8080,
DefaultClaims = new Dictionary<string, string>
{
["tier"] = "free"
}
};
});
builder.Services.AddEndpointHandler<UserEndpointHandler>();
var app = builder.Build();
await app.RunAsync();
```
---
## YAML Configuration
```yaml
Microservice:
  ServiceName: "billing-service"
  Version: "1.0.0"
  Region: "us-east-1"
  Tags:
    team: "payments"
    tier: "critical"
  Routers:
    - Host: "gateway-primary.internal"
      Port: 9500
      Transport: "TLS"
      Priority: 1
    - Host: "gateway-secondary.internal"
      Port: 9500
      Transport: "TLS"
      Priority: 2
  Transport:
    Default: "TLS"
    Tls:
      ClientCertificatePath: "/etc/certs/service.pfx"
      ClientCertificatePassword: "${CERT_PASSWORD}"
  Discovery:
    AutoDiscover: true
    BasePath: "/billing"
    ConfigFilePath: "/etc/stellaops/endpoints.yaml"
  Heartbeat:
    Interval: "00:00:10"
    Timeout: "00:00:05"
  DualExposure:
    Enabled: true
    HttpPort: 8080
    DefaultClaims:
      tier: "free"
  ShutdownTimeout: "00:00:30"
```
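
The `${CERT_PASSWORD}` value above implies some form of placeholder substitution, but this step does not say which component performs it. As a sketch only, assuming the host expands `${NAME}` placeholders itself after loading the file, a small helper could look like the following. `PlaceholderExpander` is hypothetical, and failing on a missing value (rather than leaving the placeholder in place) is a design choice made here for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces ${NAME} placeholders using a caller-supplied lookup.
public static class PlaceholderExpander
{
    private static readonly Regex Placeholder =
        new(@"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", RegexOptions.Compiled);

    public static string Expand(string value, Func<string, string?> lookup)
        => Placeholder.Replace(value, match =>
        {
            var name = match.Groups[1].Value;
            // Fail loudly so a missing secret is noticed at startup.
            return lookup(name)
                ?? throw new InvalidOperationException($"No value for placeholder '{name}'");
        });
}
```

For environment variables, the lookup could be `Environment.GetEnvironmentVariable`, for example `PlaceholderExpander.Expand(rawValue, Environment.GetEnvironmentVariable)`.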
---
## Deliverables
1. `StellaOps.Microservice/StellaMicroserviceOptions.cs`
2. `StellaOps.Microservice/IStellaMicroserviceBuilder.cs`
3. `StellaOps.Microservice/StellaMicroserviceBuilder.cs`
4. `StellaOps.Microservice/IStellaMicroserviceHost.cs`
5. `StellaOps.Microservice/StellaMicroserviceHost.cs`
6. `StellaOps.Microservice/StellaMicroserviceExtensions.cs`
7. Builder pattern tests
8. Lifecycle tests (start/stop/reconnect)
9. Dual exposure mode tests
---
## Next Step
Proceed to [Step 20: Endpoint Discovery & Registration](20-Step.md) to implement automatic endpoint discovery.

696
docs/router/20-Step.md Normal file

@@ -0,0 +1,696 @@
# Step 20: Endpoint Discovery & Registration
**Phase 5: Microservice SDK**
**Estimated Complexity:** Medium
**Dependencies:** Step 19 (Microservice Host Builder)
---
## Overview
Endpoint discovery automatically finds and registers HTTP endpoints from microservice code using attributes and reflection. YAML configuration provides overrides for metadata like rate limits, authentication requirements, and versioning.
---
## Goals
1. Discover endpoints via reflection and attributes
2. Support YAML-based metadata overrides
3. Generate EndpointDescriptor for router registration
4. Support endpoint versioning and deprecation
5. Validate endpoint configurations at startup
---
## Endpoint Attributes
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Marks a class as containing Stella endpoints.
/// </summary>
[AttributeUsage(AttributeTargets.Class)]
public sealed class StellaEndpointAttribute : Attribute
{
public string? BasePath { get; set; }
public string? Version { get; set; }
public string[]? Tags { get; set; }
}
/// <summary>
/// Marks a method as a Stella endpoint handler.
/// </summary>
[AttributeUsage(AttributeTargets.Method)]
// Not sealed: the convenience attributes (StellaGetAttribute, etc.) derive from this class.
public class StellaRouteAttribute : Attribute
{
public string Method { get; }
public string Path { get; }
public string? Name { get; set; }
public string? Description { get; set; }
public StellaRouteAttribute(string method, string path)
{
Method = method;
Path = path;
}
}
/// <summary>
/// Specifies authentication requirements for an endpoint.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class StellaAuthAttribute : Attribute
{
public bool Required { get; set; } = true;
public string[]? RequiredClaims { get; set; }
public string? Policy { get; set; }
}
/// <summary>
/// Specifies rate limiting for an endpoint.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class StellaRateLimitAttribute : Attribute
{
public int RequestsPerMinute { get; set; }
public string? BucketKey { get; set; } // e.g., "sub", "ip", "path"
}
/// <summary>
/// Specifies timeout for an endpoint.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class StellaTimeoutAttribute : Attribute
{
public int TimeoutMs { get; }
public StellaTimeoutAttribute(int timeoutMs)
{
TimeoutMs = timeoutMs;
}
}
/// <summary>
/// Marks an endpoint as deprecated.
/// </summary>
[AttributeUsage(AttributeTargets.Method)]
public sealed class StellaDeprecatedAttribute : Attribute
{
public string? Message { get; set; }
public string? AlternativeEndpoint { get; set; }
public string? SunsetDate { get; set; }
}
/// <summary>
/// Convenience attributes for common HTTP methods.
/// </summary>
public sealed class StellaGetAttribute : StellaRouteAttribute
{
public StellaGetAttribute(string path) : base("GET", path) { }
}
public sealed class StellaPostAttribute : StellaRouteAttribute
{
public StellaPostAttribute(string path) : base("POST", path) { }
}
public sealed class StellaPutAttribute : StellaRouteAttribute
{
public StellaPutAttribute(string path) : base("PUT", path) { }
}
public sealed class StellaDeleteAttribute : StellaRouteAttribute
{
public StellaDeleteAttribute(string path) : base("DELETE", path) { }
}
public sealed class StellaPatchAttribute : StellaRouteAttribute
{
public StellaPatchAttribute(string path) : base("PATCH", path) { }
}
```
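
As an illustration of how these attributes might be applied, the sketch below decorates a hypothetical `InvoiceEndpoints` class and resolves the effective timeout the same way discovery does (method-level attribute first, then class-level). The attribute definitions are trimmed stand-ins so the snippet compiles on its own, with `StellaRouteAttribute` left non-sealed so that `StellaGetAttribute` can derive from it. `InvoiceEndpoints` and `AttributeDemo` are assumptions for illustration and not part of the SDK.

```csharp
using System;
using System.Reflection;
using System.Threading.Tasks;

// Trimmed stand-ins for the attributes defined above.
[AttributeUsage(AttributeTargets.Class)]
public sealed class StellaEndpointAttribute : Attribute
{
    public string? BasePath { get; set; }
}

[AttributeUsage(AttributeTargets.Method)]
public class StellaRouteAttribute : Attribute
{
    public string Method { get; }
    public string Path { get; }
    public StellaRouteAttribute(string method, string path) { Method = method; Path = path; }
}

public sealed class StellaGetAttribute : StellaRouteAttribute
{
    public StellaGetAttribute(string path) : base("GET", path) { }
}

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class StellaTimeoutAttribute : Attribute
{
    public int TimeoutMs { get; }
    public StellaTimeoutAttribute(int timeoutMs) { TimeoutMs = timeoutMs; }
}

// Hypothetical handler: a class-level timeout, overridden on one method.
[StellaEndpoint(BasePath = "/invoices")]
[StellaTimeout(5000)]
public sealed class InvoiceEndpoints
{
    [StellaGet("{id}")]
    public Task<string> GetInvoice(string id) => Task.FromResult($"invoice {id}");

    [StellaGet("export")]
    [StellaTimeout(60000)]
    public Task<string> Export() => Task.FromResult("export started");
}

public static class AttributeDemo
{
    // Method-level attribute wins; otherwise fall back to the declaring class.
    public static int? EffectiveTimeoutMs(MethodInfo method)
        => (method.GetCustomAttribute<StellaTimeoutAttribute>()
            ?? method.DeclaringType?.GetCustomAttribute<StellaTimeoutAttribute>())?.TimeoutMs;
}
```

Here `GetInvoice` inherits the 5000 ms class-level timeout, while `Export` uses its own 60000 ms value.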
---
## Endpoint Descriptor
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Describes an endpoint for router registration.
/// </summary>
public sealed class EndpointDescriptor
{
/// <summary>HTTP method (GET, POST, etc.).</summary>
public required string Method { get; init; }
/// <summary>Path pattern (may include parameters like {id}).</summary>
public required string Path { get; init; }
/// <summary>Unique endpoint name.</summary>
public string? Name { get; init; }
/// <summary>Endpoint description for documentation.</summary>
public string? Description { get; init; }
/// <summary>API version.</summary>
public string? Version { get; init; }
/// <summary>Tags for grouping/filtering.</summary>
public string[]? Tags { get; init; }
/// <summary>Whether authentication is required.</summary>
public bool RequiresAuth { get; init; } = true;
/// <summary>Required claims for access.</summary>
public string[]? RequiredClaims { get; init; }
/// <summary>Authentication policy name.</summary>
public string? AuthPolicy { get; init; }
/// <summary>Rate limit configuration.</summary>
public RateLimitDescriptor? RateLimit { get; init; }
/// <summary>Request timeout in milliseconds.</summary>
public int? TimeoutMs { get; init; }
/// <summary>Deprecation information.</summary>
public DeprecationDescriptor? Deprecation { get; init; }
/// <summary>Custom metadata.</summary>
public Dictionary<string, string>? Metadata { get; init; }
}
public sealed class RateLimitDescriptor
{
public int RequestsPerMinute { get; init; }
public string BucketKey { get; init; } = "sub";
}
public sealed class DeprecationDescriptor
{
public string? Message { get; init; }
public string? AlternativeEndpoint { get; init; }
public DateOnly? SunsetDate { get; init; }
}
```
---
## Endpoint Discovery Interface
```csharp
namespace StellaOps.Microservice;
public interface IEndpointDiscovery
{
/// <summary>
/// Discovers endpoints from configured assemblies.
/// </summary>
Task<IReadOnlyList<DiscoveredEndpoint>> DiscoverAsync(CancellationToken cancellationToken);
}
public sealed class DiscoveredEndpoint
{
public required EndpointDescriptor Descriptor { get; init; }
public required Type HandlerType { get; init; }
public required MethodInfo HandlerMethod { get; init; }
}
```
---
## Reflection-Based Discovery
```csharp
using System.Reflection;
using Microsoft.Extensions.Logging;
namespace StellaOps.Microservice;
public sealed class ReflectionEndpointDiscovery : IEndpointDiscovery
{
private readonly EndpointDiscoveryConfig _config;
private readonly ILogger<ReflectionEndpointDiscovery> _logger;
public ReflectionEndpointDiscovery(
StellaMicroserviceOptions options,
ILogger<ReflectionEndpointDiscovery> logger)
{
_config = options.Discovery;
_logger = logger;
}
public Task<IReadOnlyList<DiscoveredEndpoint>> DiscoverAsync(CancellationToken cancellationToken)
{
var endpoints = new List<DiscoveredEndpoint>();
var assemblies = GetAssembliesToScan();
foreach (var assembly in assemblies)
{
foreach (var type in assembly.GetExportedTypes())
{
var classAttr = type.GetCustomAttribute<StellaEndpointAttribute>();
if (classAttr == null)
continue;
var classAuth = type.GetCustomAttribute<StellaAuthAttribute>();
var classRateLimit = type.GetCustomAttribute<StellaRateLimitAttribute>();
var classTimeout = type.GetCustomAttribute<StellaTimeoutAttribute>();
foreach (var method in type.GetMethods(BindingFlags.Public | BindingFlags.Instance))
{
var routeAttr = method.GetCustomAttribute<StellaRouteAttribute>();
if (routeAttr == null)
continue;
var endpoint = BuildEndpoint(
type, method, classAttr, routeAttr,
classAuth, classRateLimit, classTimeout);
endpoints.Add(endpoint);
_logger.LogDebug(
"Discovered endpoint: {Method} {Path}",
endpoint.Descriptor.Method, endpoint.Descriptor.Path);
}
}
}
_logger.LogInformation("Discovered {Count} endpoints", endpoints.Count);
return Task.FromResult<IReadOnlyList<DiscoveredEndpoint>>(endpoints);
}
private IEnumerable<Assembly> GetAssembliesToScan()
{
if (_config.ScanAssemblies.Any())
{
return _config.ScanAssemblies.Select(Assembly.Load);
}
// Default: scan entry assembly and referenced assemblies
var entry = Assembly.GetEntryAssembly();
if (entry == null)
return Enumerable.Empty<Assembly>();
return new[] { entry }
.Concat(entry.GetReferencedAssemblies().Select(Assembly.Load));
}
private DiscoveredEndpoint BuildEndpoint(
Type handlerType,
MethodInfo method,
StellaEndpointAttribute classAttr,
StellaRouteAttribute routeAttr,
StellaAuthAttribute? classAuth,
StellaRateLimitAttribute? classRateLimit,
StellaTimeoutAttribute? classTimeout)
{
// Method-level attributes override class-level
var methodAuth = method.GetCustomAttribute<StellaAuthAttribute>() ?? classAuth;
var methodRateLimit = method.GetCustomAttribute<StellaRateLimitAttribute>() ?? classRateLimit;
var methodTimeout = method.GetCustomAttribute<StellaTimeoutAttribute>() ?? classTimeout;
var deprecatedAttr = method.GetCustomAttribute<StellaDeprecatedAttribute>();
// Build full path
var basePath = classAttr.BasePath?.TrimEnd('/') ?? "";
if (!string.IsNullOrEmpty(_config.BasePath))
{
basePath = _config.BasePath.TrimEnd('/') + basePath;
}
var fullPath = basePath + "/" + routeAttr.Path.TrimStart('/');
var descriptor = new EndpointDescriptor
{
Method = routeAttr.Method,
Path = fullPath,
Name = routeAttr.Name ?? $"{handlerType.Name}.{method.Name}",
Description = routeAttr.Description,
Version = classAttr.Version,
Tags = classAttr.Tags,
RequiresAuth = methodAuth?.Required ?? true,
RequiredClaims = methodAuth?.RequiredClaims,
AuthPolicy = methodAuth?.Policy,
RateLimit = methodRateLimit != null ? new RateLimitDescriptor
{
RequestsPerMinute = methodRateLimit.RequestsPerMinute,
BucketKey = methodRateLimit.BucketKey ?? "sub"
} : null,
TimeoutMs = methodTimeout?.TimeoutMs,
Deprecation = deprecatedAttr != null ? new DeprecationDescriptor
{
Message = deprecatedAttr.Message,
AlternativeEndpoint = deprecatedAttr.AlternativeEndpoint,
SunsetDate = DateOnly.TryParse(deprecatedAttr.SunsetDate, out var date) ? date : null
} : null
};
return new DiscoveredEndpoint
{
Descriptor = descriptor,
HandlerType = handlerType,
HandlerMethod = method
};
}
}
```
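
The path composition in `BuildEndpoint` can be isolated as a pure function, which makes its edge cases easier to see and test. The sketch below mirrors that logic; `EndpointPaths.Combine` is a hypothetical name introduced for illustration.

```csharp
using System;

public static class EndpointPaths
{
    // Combines the optional global base path, the optional class base path,
    // and the route path, following the same steps as BuildEndpoint above.
    public static string Combine(string? globalBase, string? classBase, string routePath)
    {
        var basePath = classBase?.TrimEnd('/') ?? "";
        if (!string.IsNullOrEmpty(globalBase))
        {
            basePath = globalBase.TrimEnd('/') + basePath;
        }
        return basePath + "/" + routePath.TrimStart('/');
    }
}
```

With this logic, `Combine("/billing", "/invoices", "{id}")` yields `/billing/invoices/{id}`. One consequence worth noting: the two base paths are concatenated directly, so a class base path written without a leading slash (for example `"invoices"`) would produce `/billinginvoices/{id}`. Validating or normalizing the leading slash may be worth considering.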
---
## YAML Override Provider
```csharp
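using Microsoft.Extensions.Logging;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;
namespace StellaOps.Microservice;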
public interface IEndpointOverrideProvider
{
/// <summary>
/// Applies overrides to discovered endpoints.
/// </summary>
void ApplyOverrides(IList<DiscoveredEndpoint> endpoints);
}
public sealed class YamlEndpointOverrideProvider : IEndpointOverrideProvider
{
private readonly EndpointDiscoveryConfig _config;
private readonly ILogger<YamlEndpointOverrideProvider> _logger;
private readonly Dictionary<string, EndpointOverride> _overrides = new();
public YamlEndpointOverrideProvider(
StellaMicroserviceOptions options,
ILogger<YamlEndpointOverrideProvider> logger)
{
_config = options.Discovery;
_logger = logger;
LoadOverrides();
}
private void LoadOverrides()
{
if (string.IsNullOrEmpty(_config.ConfigFilePath))
return;
if (!File.Exists(_config.ConfigFilePath))
{
_logger.LogWarning("Endpoint config file not found: {Path}", _config.ConfigFilePath);
return;
}
var yaml = File.ReadAllText(_config.ConfigFilePath);
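// The override file uses PascalCase keys (Endpoints, RateLimit, TimeoutMs), so match that convention.
var deserializer = new DeserializerBuilder()
.WithNamingConvention(PascalCaseNamingConvention.Instance)
.Build();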
var config = deserializer.Deserialize<EndpointOverrideConfig>(yaml);
if (config?.Endpoints != null)
{
foreach (var (key, value) in config.Endpoints)
{
_overrides[key] = value;
}
}
_logger.LogInformation("Loaded {Count} endpoint overrides", _overrides.Count);
}
public void ApplyOverrides(IList<DiscoveredEndpoint> endpoints)
{
for (var i = 0; i < endpoints.Count; i++)
{
var endpoint = endpoints[i];
var key = $"{endpoint.Descriptor.Method} {endpoint.Descriptor.Path}";
// Lookup precedence: "METHOD path", then path alone, then endpoint name.
if (_overrides.TryGetValue(key, out var over) ||
_overrides.TryGetValue(endpoint.Descriptor.Path, out over) ||
(endpoint.Descriptor.Name != null && _overrides.TryGetValue(endpoint.Descriptor.Name, out over)))
{
// Descriptor is init-only, so store a rebuilt endpoint back into the list.
endpoints[i] = ApplyOverride(endpoint, over);
}
}
}
private DiscoveredEndpoint ApplyOverride(DiscoveredEndpoint endpoint, EndpointOverride over)
{
// Create new descriptor with overrides applied
var original = endpoint.Descriptor;
var updated = new EndpointDescriptor
{
Method = original.Method,
Path = original.Path,
Name = over.Name ?? original.Name,
Description = over.Description ?? original.Description,
Version = over.Version ?? original.Version,
Tags = over.Tags ?? original.Tags,
RequiresAuth = over.RequiresAuth ?? original.RequiresAuth,
RequiredClaims = over.RequiredClaims ?? original.RequiredClaims,
AuthPolicy = over.AuthPolicy ?? original.AuthPolicy,
RateLimit = over.RateLimit != null ? new RateLimitDescriptor
{
RequestsPerMinute = over.RateLimit.RequestsPerMinute,
BucketKey = over.RateLimit.BucketKey ?? "sub"
} : original.RateLimit,
TimeoutMs = over.TimeoutMs ?? original.TimeoutMs,
Deprecation = original.Deprecation, // Keep original deprecation
Metadata = MergeMetadata(original.Metadata, over.Metadata)
};
_logger.LogDebug(
"Applied override to endpoint {Method} {Path}",
original.Method, original.Path);
return new DiscoveredEndpoint
{
Descriptor = updated,
HandlerType = endpoint.HandlerType,
HandlerMethod = endpoint.HandlerMethod
};
}
private Dictionary<string, string>? MergeMetadata(
Dictionary<string, string>? original,
Dictionary<string, string>? over)
{
if (original == null && over == null)
return null;
var result = new Dictionary<string, string>(original ?? new());
if (over != null)
{
foreach (var (key, value) in over)
{
result[key] = value;
}
}
return result;
}
}
internal class EndpointOverrideConfig
{
public Dictionary<string, EndpointOverride>? Endpoints { get; set; }
}
internal class EndpointOverride
{
public string? Name { get; set; }
public string? Description { get; set; }
public string? Version { get; set; }
public string[]? Tags { get; set; }
public bool? RequiresAuth { get; set; }
public string[]? RequiredClaims { get; set; }
public string? AuthPolicy { get; set; }
public RateLimitOverride? RateLimit { get; set; }
public int? TimeoutMs { get; set; }
public Dictionary<string, string>? Metadata { get; set; }
}
internal class RateLimitOverride
{
public int RequestsPerMinute { get; set; }
public string? BucketKey { get; set; }
}
```
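
The override lookup tries three keys in a fixed order and stops at the first hit, so at most one override entry is applied per endpoint. The sketch below isolates that precedence rule; `OverrideLookup.ResolveKey` is a hypothetical helper used only to make the order explicit.

```csharp
using System.Collections.Generic;

public static class OverrideLookup
{
    // Returns the key that would be used for an endpoint, or null if none matches.
    // Precedence: "METHOD path", then path alone, then endpoint name.
    public static string? ResolveKey<TValue>(
        IReadOnlyDictionary<string, TValue> overrides,
        string method,
        string path,
        string? name)
    {
        var methodAndPath = $"{method} {path}";
        if (overrides.ContainsKey(methodAndPath)) return methodAndPath;
        if (overrides.ContainsKey(path)) return path;
        if (name != null && overrides.ContainsKey(name)) return name;
        return null;
    }
}
```

A practical consequence: if the same endpoint is targeted both by a `"METHOD path"` entry and by a name entry, only the `"METHOD path"` entry takes effect and the name entry is silently ignored. Settings for one endpoint are therefore best kept under a single key.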
---
## Endpoint Registry
```csharp
namespace StellaOps.Microservice;
public interface IEndpointRegistry
{
Task<EndpointDescriptor[]> DiscoverEndpointsAsync(CancellationToken cancellationToken);
DiscoveredEndpoint? FindEndpoint(string method, string path);
}
public sealed class EndpointRegistry : IEndpointRegistry
{
private readonly IEndpointDiscovery _discovery;
private readonly IEndpointOverrideProvider? _overrideProvider;
private readonly ILogger<EndpointRegistry> _logger;
private IReadOnlyList<DiscoveredEndpoint>? _endpoints;
private readonly Dictionary<string, DiscoveredEndpoint> _endpointLookup = new();
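public EndpointRegistry(
IEndpointDiscovery discovery,
ILogger<EndpointRegistry> logger,
// Optional: the default value lets the container construct the registry when no override provider is registered.
IEndpointOverrideProvider? overrideProvider = null)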
{
_discovery = discovery;
_overrideProvider = overrideProvider;
_logger = logger;
}
public async Task<EndpointDescriptor[]> DiscoverEndpointsAsync(CancellationToken cancellationToken)
{
_endpoints = await _discovery.DiscoverAsync(cancellationToken);
if (_overrideProvider != null)
{
var mutableList = _endpoints.ToList();
_overrideProvider.ApplyOverrides(mutableList);
_endpoints = mutableList;
}
// Build lookup table
_endpointLookup.Clear();
foreach (var endpoint in _endpoints)
{
var key = $"{endpoint.Descriptor.Method}:{endpoint.Descriptor.Path}";
_endpointLookup[key] = endpoint;
}
// Validate endpoints
ValidateEndpoints(_endpoints);
return _endpoints.Select(e => e.Descriptor).ToArray();
}
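public DiscoveredEndpoint? FindEndpoint(string method, string path)
{
// Ignore any query string before matching (the direct HTTP host passes path + query).
var queryStart = path.IndexOf('?');
if (queryStart >= 0)
path = path[..queryStart];
// Exact match
var key = $"{method}:{path}";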
if (_endpointLookup.TryGetValue(key, out var endpoint))
return endpoint;
// Pattern match for path parameters
foreach (var ep in _endpoints ?? Enumerable.Empty<DiscoveredEndpoint>())
{
if (ep.Descriptor.Method != method)
continue;
if (IsPathMatch(path, ep.Descriptor.Path))
return ep;
}
return null;
}
private bool IsPathMatch(string requestPath, string pattern)
{
var patternSegments = pattern.Split('/', StringSplitOptions.RemoveEmptyEntries);
var pathSegments = requestPath.Split('/', StringSplitOptions.RemoveEmptyEntries);
if (patternSegments.Length != pathSegments.Length)
return false;
for (int i = 0; i < patternSegments.Length; i++)
{
var patternSeg = patternSegments[i];
var pathSeg = pathSegments[i];
// Check for path parameter
if (patternSeg.StartsWith('{') && patternSeg.EndsWith('}'))
continue;
if (!string.Equals(patternSeg, pathSeg, StringComparison.OrdinalIgnoreCase))
return false;
}
return true;
}
private void ValidateEndpoints(IReadOnlyList<DiscoveredEndpoint> endpoints)
{
var duplicates = endpoints
.GroupBy(e => $"{e.Descriptor.Method}:{e.Descriptor.Path}")
.Where(g => g.Count() > 1)
.Select(g => g.Key)
.ToList();
if (duplicates.Any())
{
throw new InvalidOperationException(
$"Duplicate endpoints detected: {string.Join(", ", duplicates)}");
}
// Validate handler method signatures
foreach (var endpoint in endpoints)
{
ValidateHandlerMethod(endpoint);
}
}
private void ValidateHandlerMethod(DiscoveredEndpoint endpoint)
{
var method = endpoint.HandlerMethod;
var returnType = method.ReturnType;
// Must return Task<ResponsePayload> or Task<T> where T can be serialized
if (!typeof(Task).IsAssignableFrom(returnType))
{
throw new InvalidOperationException(
$"Handler {method.Name} must return Task or Task<T>");
}
}
}
```
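The matching rules used by `FindEndpoint` can be exercised in isolation. The following is a minimal sketch rather than the SDK's implementation: `RoutePattern` and `TryMatch` are hypothetical names, and the code assumes the same conventions as `IsPathMatch` above (segment counts must agree, `{name}` segments accept any value, literal segments compare case-insensitively). Unlike `IsPathMatch`, it also captures the parameter values, which is roughly what a dispatcher needs once a match is found.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of StellaOps.Microservice): matches a request
// path against a pattern such as "/billing/invoices/{id}" and captures parameters.
public static class RoutePattern
{
    public static bool TryMatch(
        string pattern,
        string requestPath,
        out Dictionary<string, string> parameters)
    {
        parameters = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var patternSegments = pattern.Split('/', StringSplitOptions.RemoveEmptyEntries);
        var pathSegments = requestPath.Split('/', StringSplitOptions.RemoveEmptyEntries);

        // A pattern only matches paths with the same number of segments.
        if (patternSegments.Length != pathSegments.Length)
            return false;

        for (var i = 0; i < patternSegments.Length; i++)
        {
            var segment = patternSegments[i];
            if (segment.Length > 2 && segment.StartsWith('{') && segment.EndsWith('}'))
            {
                // "{id}" captures whatever the request supplied in this position.
                parameters[segment[1..^1]] = pathSegments[i];
            }
            else if (!string.Equals(segment, pathSegments[i], StringComparison.OrdinalIgnoreCase))
            {
                // Literal segment mismatch: discard any partial captures.
                parameters.Clear();
                return false;
            }
        }

        return true;
    }
}
```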
---
## YAML Configuration Example
```yaml
# endpoints.yaml - Endpoint overrides
Endpoints:
  # Override by path
  "GET /billing/invoices":
    RateLimit:
      RequestsPerMinute: 100
      BucketKey: "sub"
    TimeoutMs: 30000
  # Override by name
  "InvoiceHandler.GetInvoice":
    RequiredClaims:
      - "billing:read"
    AuthPolicy: "billing-read"
  # Override by method + path
  "POST /billing/invoices":
    RequiredClaims:
      - "billing:write"
    RateLimit:
      RequestsPerMinute: 10
      BucketKey: "sub"
    Metadata:
      audit: "required"
```
---
## Deliverables
1. `StellaOps.Microservice/Attributes/*.cs` (all endpoint attributes)
2. `StellaOps.Microservice/EndpointDescriptor.cs`
3. `StellaOps.Microservice/IEndpointDiscovery.cs`
4. `StellaOps.Microservice/ReflectionEndpointDiscovery.cs`
5. `StellaOps.Microservice/IEndpointOverrideProvider.cs`
6. `StellaOps.Microservice/YamlEndpointOverrideProvider.cs`
7. `StellaOps.Microservice/IEndpointRegistry.cs`
8. `StellaOps.Microservice/EndpointRegistry.cs`
9. Attribute parsing tests
10. YAML override tests
11. Path matching tests
---
## Next Step
Proceed to [Step 21: Request/Response Context](21-Step.md) to implement the request handling context.

793
docs/router/21-Step.md Normal file

@@ -0,0 +1,793 @@
# Step 21: Request/Response Context
**Phase 5: Microservice SDK**
**Estimated Complexity:** Medium
**Dependencies:** Step 20 (Endpoint Discovery)
---
## Overview
The Request/Response Context provides a clean abstraction for endpoint handlers to access request data, claims, and build responses. It hides transport details while providing easy access to parsed path parameters, query strings, headers, and the request body.
---
## Goals
1. Provide clean request context abstraction
2. Support path parameter extraction
3. Provide typed body deserialization
4. Support streaming responses
5. Enable easy response building
---
## Request Context
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Context for handling a request in a microservice endpoint.
/// </summary>
public sealed class StellaRequestContext
{
private readonly RequestPayload _payload;
private readonly Dictionary<string, string> _pathParameters;
private readonly Lazy<IQueryCollection> _query;
private readonly Lazy<IHeaderDictionary> _headers;
internal StellaRequestContext(
RequestPayload payload,
Dictionary<string, string> pathParameters)
{
_payload = payload;
_pathParameters = pathParameters;
_query = new Lazy<IQueryCollection>(() => ParseQuery(payload.Path));
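// Header names are case-insensitive, so the backing dictionary must be too
_headers = new Lazy<IHeaderDictionary>(() => new HeaderDictionary(
payload.Headers.ToDictionary(
h => h.Key,
h => new StringValues(h.Value),
StringComparer.OrdinalIgnoreCase)));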
}
/// <summary>HTTP method.</summary>
public string Method => _payload.Method;
/// <summary>Request path (without query string).</summary>
public string Path => _payload.Path.Split('?')[0];
/// <summary>Full path including query string.</summary>
public string FullPath => _payload.Path;
/// <summary>Host header value.</summary>
public string? Host => _payload.Host;
/// <summary>Client IP address.</summary>
public string? ClientIp => _payload.ClientIp;
/// <summary>Trace/correlation ID.</summary>
public string? TraceId => _payload.TraceId;
/// <summary>Request headers.</summary>
public IHeaderDictionary Headers => _headers.Value;
/// <summary>Query string parameters.</summary>
public IQueryCollection Query => _query.Value;
/// <summary>Authenticated claims from JWT + hydration.</summary>
public IReadOnlyDictionary<string, string> Claims => _payload.Claims;
/// <summary>Path parameters extracted from route pattern.</summary>
public IReadOnlyDictionary<string, string> PathParameters => _pathParameters;
/// <summary>Content-Type header value.</summary>
public string? ContentType => Headers.ContentType;
/// <summary>Content-Length header value.</summary>
public long? ContentLength => _payload.ContentLength > 0 ? _payload.ContentLength : null;
/// <summary>Whether the request has a body.</summary>
public bool HasBody => _payload.Body != null && _payload.Body.Length > 0;
/// <summary>Raw request body bytes.</summary>
public byte[]? RawBody => _payload.Body;
/// <summary>
/// Gets a path parameter by name.
/// </summary>
public string? GetPathParameter(string name)
{
return _pathParameters.TryGetValue(name, out var value) ? value : null;
}
/// <summary>
/// Gets a required path parameter, throws if missing.
/// </summary>
public string RequirePathParameter(string name)
{
return _pathParameters.TryGetValue(name, out var value)
? value
: throw new ArgumentException($"Missing path parameter: {name}");
}
/// <summary>
/// Gets a query parameter by name.
/// </summary>
public string? GetQueryParameter(string name)
{
return Query.TryGetValue(name, out var values) ? values.FirstOrDefault() : null;
}
/// <summary>
/// Gets all values for a query parameter.
/// </summary>
public string[] GetQueryParameterValues(string name)
{
return Query.TryGetValue(name, out var values) ? values.ToArray() : Array.Empty<string>();
}
/// <summary>
/// Gets a header value by name.
/// </summary>
public string? GetHeader(string name)
{
return Headers.TryGetValue(name, out var values) ? values.FirstOrDefault() : null;
}
/// <summary>
/// Gets a claim value by name.
/// </summary>
public string? GetClaim(string name)
{
return Claims.TryGetValue(name, out var value) ? value : null;
}
/// <summary>
/// Gets a required claim, throws if missing.
/// </summary>
public string RequireClaim(string name)
{
return Claims.TryGetValue(name, out var value)
? value
: throw new UnauthorizedAccessException($"Missing required claim: {name}");
}
/// <summary>
/// Reads the body as a string.
/// </summary>
public string? ReadBodyAsString(Encoding? encoding = null)
{
if (_payload.Body == null || _payload.Body.Length == 0)
return null;
return (encoding ?? Encoding.UTF8).GetString(_payload.Body);
}
/// <summary>
/// Deserializes the body as JSON.
/// </summary>
public T? ReadBodyAsJson<T>(JsonSerializerOptions? options = null)
{
if (_payload.Body == null || _payload.Body.Length == 0)
return default;
return JsonSerializer.Deserialize<T>(_payload.Body, options ?? JsonDefaults.Options);
}
/// <summary>
/// Deserializes the body as JSON, throwing if null or invalid.
/// </summary>
public T RequireBodyAsJson<T>(JsonSerializerOptions? options = null) where T : class
{
var result = ReadBodyAsJson<T>(options);
return result ?? throw new ArgumentException("Request body is required");
}
/// <summary>
/// Gets a body stream for reading.
/// </summary>
public Stream GetBodyStream()
{
return new MemoryStream(_payload.Body ?? Array.Empty<byte>(), writable: false);
}
private static IQueryCollection ParseQuery(string path)
{
var queryIndex = path.IndexOf('?');
if (queryIndex < 0)
return QueryCollection.Empty;
var queryString = path[(queryIndex + 1)..];
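// ParseQuery returns Dictionary<string, StringValues>; wrap it to satisfy IQueryCollection
return new QueryCollection(QueryHelpers.ParseQuery(queryString));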
}
}
internal static class JsonDefaults
{
public static readonly JsonSerializerOptions Options = new()
{
PropertyNameCaseInsensitive = true,
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
};
}
```
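The context derives both `Path` and `Query` from the single `RequestPayload.Path` string. To make that split concrete, here is a dependency-free sketch of the same idea. It is an illustration only: `RequestTarget.Split` is a hypothetical helper, and the context above relies on `QueryHelpers.ParseQuery` from ASP.NET Core rather than hand-written parsing. Repeated keys are kept as multiple values, mirroring what `GetQueryParameterValues` exposes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, dependency-free illustration of splitting "path?query".
public static class RequestTarget
{
    public static (string Path, Dictionary<string, List<string>> Query) Split(string fullPath)
    {
        var query = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
        var queryIndex = fullPath.IndexOf('?');
        if (queryIndex < 0)
            return (fullPath, query);

        var path = fullPath[..queryIndex];
        var pairs = fullPath[(queryIndex + 1)..].Split('&', StringSplitOptions.RemoveEmptyEntries);

        foreach (var pair in pairs)
        {
            var separator = pair.IndexOf('=');
            var rawKey = separator < 0 ? pair : pair[..separator];
            var rawValue = separator < 0 ? "" : pair[(separator + 1)..];

            // '+' encodes a space in query strings; percent-escapes are decoded afterwards.
            var key = Uri.UnescapeDataString(rawKey.Replace('+', ' '));
            var value = Uri.UnescapeDataString(rawValue.Replace('+', ' '));

            if (!query.TryGetValue(key, out var values))
            {
                values = new List<string>();
                query[key] = values;
            }
            values.Add(value);
        }

        return (path, query);
    }
}
```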
---
## Response Builder
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Builder for constructing endpoint responses.
/// </summary>
public sealed class StellaResponseBuilder
{
private int _statusCode = 200;
private readonly Dictionary<string, string> _headers = new(StringComparer.OrdinalIgnoreCase);
private byte[]? _body;
private string _contentType = "application/json";
/// <summary>
/// Creates a new response builder.
/// </summary>
public static StellaResponseBuilder Create() => new();
/// <summary>
/// Sets the status code.
/// </summary>
public StellaResponseBuilder WithStatus(int statusCode)
{
_statusCode = statusCode;
return this;
}
/// <summary>
/// Sets a response header.
/// </summary>
public StellaResponseBuilder WithHeader(string name, string value)
{
_headers[name] = value;
return this;
}
/// <summary>
/// Sets multiple response headers.
/// </summary>
public StellaResponseBuilder WithHeaders(IEnumerable<KeyValuePair<string, string>> headers)
{
foreach (var (key, value) in headers)
{
_headers[key] = value;
}
return this;
}
/// <summary>
/// Sets the Content-Type header.
/// </summary>
public StellaResponseBuilder WithContentType(string contentType)
{
_contentType = contentType;
return this;
}
/// <summary>
/// Sets a JSON body.
/// </summary>
public StellaResponseBuilder WithJson<T>(T value, JsonSerializerOptions? options = null)
{
_contentType = "application/json";
_body = JsonSerializer.SerializeToUtf8Bytes(value, options ?? JsonDefaults.Options);
return this;
}
/// <summary>
/// Sets a string body.
/// </summary>
public StellaResponseBuilder WithText(string text, Encoding? encoding = null)
{
if (!_headers.ContainsKey("Content-Type") && _contentType == "application/json")
{
_contentType = "text/plain";
}
_body = (encoding ?? Encoding.UTF8).GetBytes(text);
return this;
}
/// <summary>
/// Sets raw bytes as body.
/// </summary>
public StellaResponseBuilder WithBytes(byte[] data, string? contentType = null)
{
if (contentType != null)
{
_contentType = contentType;
}
_body = data;
return this;
}
/// <summary>
/// Sets a stream as body.
/// </summary>
public StellaResponseBuilder WithStream(Stream stream, string? contentType = null)
{
if (contentType != null)
{
_contentType = contentType;
}
using var ms = new MemoryStream();
stream.CopyTo(ms);
_body = ms.ToArray();
return this;
}
/// <summary>
/// Builds the response payload.
/// </summary>
public ResponsePayload Build()
{
_headers["Content-Type"] = _contentType;
return new ResponsePayload
{
StatusCode = _statusCode,
Headers = new Dictionary<string, string>(_headers),
Body = _body,
IsFinalChunk = true
};
}
// Static factory methods for common responses
/// <summary>Creates a 200 OK response with JSON body.</summary>
public static ResponsePayload Ok<T>(T value) =>
Create().WithStatus(200).WithJson(value).Build();
/// <summary>Creates a 200 OK response with no body.</summary>
public static ResponsePayload Ok() =>
Create().WithStatus(200).Build();
/// <summary>Creates a 201 Created response with JSON body.</summary>
public static ResponsePayload Created<T>(T value, string? location = null)
{
var builder = Create().WithStatus(201).WithJson(value);
if (location != null)
{
builder.WithHeader("Location", location);
}
return builder.Build();
}
/// <summary>Creates a 204 No Content response.</summary>
public static ResponsePayload NoContent() =>
Create().WithStatus(204).Build();
/// <summary>Creates a 400 Bad Request response.</summary>
public static ResponsePayload BadRequest(string message) =>
Create().WithStatus(400).WithJson(new { error = message }).Build();
/// <summary>Creates a 400 Bad Request response with validation errors.</summary>
public static ResponsePayload BadRequest(Dictionary<string, string[]> errors) =>
Create().WithStatus(400).WithJson(new { errors }).Build();
/// <summary>Creates a 401 Unauthorized response.</summary>
public static ResponsePayload Unauthorized(string? message = null) =>
Create().WithStatus(401).WithJson(new { error = message ?? "Unauthorized" }).Build();
/// <summary>Creates a 403 Forbidden response.</summary>
public static ResponsePayload Forbidden(string? message = null) =>
Create().WithStatus(403).WithJson(new { error = message ?? "Forbidden" }).Build();
/// <summary>Creates a 404 Not Found response.</summary>
public static ResponsePayload NotFound(string? message = null) =>
Create().WithStatus(404).WithJson(new { error = message ?? "Not found" }).Build();
/// <summary>Creates a 409 Conflict response.</summary>
public static ResponsePayload Conflict(string message) =>
Create().WithStatus(409).WithJson(new { error = message }).Build();
/// <summary>Creates a 500 Internal Server Error response.</summary>
public static ResponsePayload InternalError(string? message = null) =>
Create().WithStatus(500).WithJson(new { error = message ?? "Internal server error" }).Build();
/// <summary>Creates a 503 Service Unavailable response.</summary>
public static ResponsePayload ServiceUnavailable(string? message = null) =>
Create().WithStatus(503).WithJson(new { error = message ?? "Service unavailable" }).Build();
/// <summary>Creates a redirect response.</summary>
public static ResponsePayload Redirect(string location, bool permanent = false) =>
Create()
.WithStatus(permanent ? 301 : 302)
.WithHeader("Location", location)
.Build();
}
```
---
## Endpoint Handler Interface
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Interface for endpoint handler classes.
/// </summary>
public interface IEndpointHandler
{
}
/// <summary>
/// Base class for endpoint handlers with helper methods.
/// </summary>
public abstract class EndpointHandler : IEndpointHandler
{
/// <summary>Current request context (set by dispatcher).</summary>
public StellaRequestContext Context { get; internal set; } = null!;
/// <summary>Creates a 200 OK response with JSON body.</summary>
protected ResponsePayload Ok<T>(T value) => StellaResponseBuilder.Ok(value);
/// <summary>Creates a 200 OK response with no body.</summary>
protected ResponsePayload Ok() => StellaResponseBuilder.Ok();
/// <summary>Creates a 201 Created response.</summary>
protected ResponsePayload Created<T>(T value, string? location = null) =>
StellaResponseBuilder.Created(value, location);
/// <summary>Creates a 204 No Content response.</summary>
protected ResponsePayload NoContent() => StellaResponseBuilder.NoContent();
/// <summary>Creates a 400 Bad Request response.</summary>
protected ResponsePayload BadRequest(string message) =>
StellaResponseBuilder.BadRequest(message);
/// <summary>Creates a 401 Unauthorized response.</summary>
protected ResponsePayload Unauthorized(string? message = null) =>
StellaResponseBuilder.Unauthorized(message);
/// <summary>Creates a 403 Forbidden response.</summary>
protected ResponsePayload Forbidden(string? message = null) =>
StellaResponseBuilder.Forbidden(message);
/// <summary>Creates a 404 Not Found response.</summary>
protected ResponsePayload NotFound(string? message = null) =>
StellaResponseBuilder.NotFound(message);
/// <summary>Creates a response with custom status and body.</summary>
protected StellaResponseBuilder Response() => StellaResponseBuilder.Create();
}
```
---
## Request Dispatcher
```csharp
namespace StellaOps.Microservice;
public interface IRequestDispatcher
{
Task<ResponsePayload> DispatchAsync(RequestPayload request, CancellationToken cancellationToken);
}
public sealed class RequestDispatcher : IRequestDispatcher
{
private readonly IEndpointRegistry _registry;
private readonly IServiceProvider _serviceProvider;
private readonly ILogger<RequestDispatcher> _logger;
public RequestDispatcher(
IEndpointRegistry registry,
IServiceProvider serviceProvider,
ILogger<RequestDispatcher> logger)
{
_registry = registry;
_serviceProvider = serviceProvider;
_logger = logger;
}
public async Task<ResponsePayload> DispatchAsync(
RequestPayload request,
CancellationToken cancellationToken)
{
var path = request.Path.Split('?')[0];
var endpoint = _registry.FindEndpoint(request.Method, path);
if (endpoint == null)
{
_logger.LogDebug("No endpoint found for {Method} {Path}", request.Method, path);
return StellaResponseBuilder.NotFound($"No endpoint: {request.Method} {path}");
}
// Extract path parameters
var pathParams = ExtractPathParameters(path, endpoint.Descriptor.Path);
// Create request context
var context = new StellaRequestContext(request, pathParams);
// Create handler instance
using var scope = _serviceProvider.CreateScope();
var handler = scope.ServiceProvider.GetService(endpoint.HandlerType);
if (handler == null)
{
// Try to create without DI
handler = Activator.CreateInstance(endpoint.HandlerType);
}
if (handler == null)
{
_logger.LogError("Cannot create handler {Type}", endpoint.HandlerType);
return StellaResponseBuilder.InternalError("Handler instantiation failed");
}
// Set context on base handler
if (handler is EndpointHandler baseHandler)
{
baseHandler.Context = context;
}
try
{
// Invoke handler method
var result = endpoint.HandlerMethod.Invoke(handler, BuildMethodParameters(
endpoint.HandlerMethod, context, cancellationToken));
// Handle async methods
if (result is Task<ResponsePayload> taskResponse)
{
return await taskResponse;
}
else if (result is Task task)
{
await task;
// Method returned Task without result - assume OK
return StellaResponseBuilder.Ok();
}
else if (result is ResponsePayload response)
{
return response;
}
else if (result != null)
{
// Serialize result as JSON
return StellaResponseBuilder.Ok(result);
}
else
{
return StellaResponseBuilder.NoContent();
}
}
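catch (TargetInvocationException ex) when (ex.InnerException != null)
{
// Rethrow the handler's own exception while preserving its original stack trace
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(ex.InnerException).Throw();
throw; // Unreachable; keeps the compiler's flow analysis satisfied
}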
}
private Dictionary<string, string> ExtractPathParameters(string actualPath, string pattern)
{
var result = new Dictionary<string, string>();
var patternSegments = pattern.Split('/', StringSplitOptions.RemoveEmptyEntries);
var pathSegments = actualPath.Split('/', StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < patternSegments.Length && i < pathSegments.Length; i++)
{
var patternSeg = patternSegments[i];
if (patternSeg.StartsWith('{') && patternSeg.EndsWith('}'))
{
var paramName = patternSeg[1..^1];
result[paramName] = pathSegments[i];
}
}
return result;
}
private object?[] BuildMethodParameters(
MethodInfo method,
StellaRequestContext context,
CancellationToken cancellationToken)
{
var parameters = method.GetParameters();
var args = new object?[parameters.Length];
for (int i = 0; i < parameters.Length; i++)
{
var param = parameters[i];
var paramType = param.ParameterType;
if (paramType == typeof(StellaRequestContext))
{
args[i] = context;
}
else if (paramType == typeof(CancellationToken))
{
args[i] = cancellationToken;
}
else if (param.GetCustomAttribute<FromPathAttribute>() != null)
{
var value = context.GetPathParameter(param.Name ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromQueryAttribute>() != null)
{
var value = context.GetQueryParameter(param.Name ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromHeaderAttribute>() != null)
{
var headerName = param.GetCustomAttribute<FromHeaderAttribute>()?.Name ?? param.Name;
var value = context.GetHeader(headerName ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromClaimAttribute>() != null)
{
var claimName = param.GetCustomAttribute<FromClaimAttribute>()?.Name ?? param.Name;
var value = context.GetClaim(claimName ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromBodyAttribute>() != null || IsComplexType(paramType))
{
// Deserialize body
// The context only exposes the generic ReadBodyAsJson<T>; use the Type-based helper below
args[i] = ReadBodyAsJson(context, paramType);
}
else
{
args[i] = param.HasDefaultValue ? param.DefaultValue : null;
}
}
return args;
}
private static object? ConvertParameter(string? value, Type targetType)
{
if (value == null)
return targetType.IsValueType ? Activator.CreateInstance(targetType) : null;
if (targetType == typeof(string))
return value;
if (targetType == typeof(int) || targetType == typeof(int?))
return int.TryParse(value, out var i) ? i : null;
if (targetType == typeof(long) || targetType == typeof(long?))
return long.TryParse(value, out var l) ? l : null;
if (targetType == typeof(Guid) || targetType == typeof(Guid?))
return Guid.TryParse(value, out var g) ? g : null;
if (targetType == typeof(bool) || targetType == typeof(bool?))
return bool.TryParse(value, out var b) ? b : null;
return Convert.ChangeType(value, targetType);
}
private static bool IsComplexType(Type type)
{
return !type.IsPrimitive &&
type != typeof(string) &&
type != typeof(decimal) &&
type != typeof(Guid) &&
type != typeof(DateTime) &&
type != typeof(DateTimeOffset) &&
!type.IsEnum;
}
private object? ReadBodyAsJson(StellaRequestContext context, Type targetType)
{
if (!context.HasBody)
return null;
var json = context.RawBody;
return JsonSerializer.Deserialize(json, targetType, JsonDefaults.Options);
}
}
```
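One detail of `DispatchAsync` is worth spelling out: it awaits `Task<ResponsePayload>` directly, but any other `Task` is treated as carrying no value, so a handler declared as `Task<T>` with some other `T` would produce an empty 200 response. A common way to recover the value is to read `Result` through reflection after awaiting. The sketch below shows that technique under stated assumptions: `TaskResults.UnwrapAsync` is a hypothetical helper, it takes the handler's declared return type (for example `MethodInfo.ReturnType`) because the runtime type of a plain `async Task` may still be a generic task internally, and it does not handle `ValueTask`.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: awaits whatever MethodInfo.Invoke returned and yields
// the handler's result, or null when the handler produces no value.
public static class TaskResults
{
    public static async Task<object?> UnwrapAsync(object? invocationResult, Type declaredReturnType)
    {
        // Synchronous handler: the value is already available.
        if (invocationResult is not Task task)
            return invocationResult;

        await task.ConfigureAwait(false);

        // Only a declared Task<T> carries a value worth serializing.
        if (!declaredReturnType.IsGenericType ||
            declaredReturnType.GetGenericTypeDefinition() != typeof(Task<>))
        {
            return null;
        }

        // The task has completed, so reading Result does not block.
        return declaredReturnType.GetProperty("Result")!.GetValue(task);
    }
}
```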
---
## Parameter Binding Attributes
```csharp
namespace StellaOps.Microservice;
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromPathAttribute : Attribute { }
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromQueryAttribute : Attribute { }
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromHeaderAttribute : Attribute
{
public string? Name { get; set; }
}
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromClaimAttribute : Attribute
{
public string? Name { get; set; }
}
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromBodyAttribute : Attribute { }
```
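These attributes only select where a raw string comes from; the dispatcher still has to convert that string to the parameter's type. The sketch below shows one possible conversion routine that also covers nullable value types and enums, which the type-by-type checks in `ConvertParameter` do not. `ParameterConverter.FromString` is a hypothetical name, not part of the SDK, and the routine throws on malformed input, so a caller would presumably translate that into a 400 response.

```csharp
using System;
using System.ComponentModel;

// Hypothetical conversion helper for values bound from path, query, header, or claim.
public static class ParameterConverter
{
    public static object? FromString(string? value, Type targetType)
    {
        // Nullable<T> converts as T; a missing value stays null.
        var underlying = Nullable.GetUnderlyingType(targetType);
        var effectiveType = underlying ?? targetType;

        if (value is null)
        {
            // default(T) for non-nullable value types, null for everything else.
            return underlying is null && targetType.IsValueType
                ? Activator.CreateInstance(targetType)
                : null;
        }

        if (effectiveType == typeof(string))
            return value;

        if (effectiveType.IsEnum)
            return Enum.Parse(effectiveType, value, ignoreCase: true);

        // Built-in type converters cover int, long, bool, Guid, decimal, DateTime, and more,
        // and the invariant overload avoids culture-dependent parsing.
        return TypeDescriptor.GetConverter(effectiveType).ConvertFromInvariantString(value);
    }
}
```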
---
## Usage Example
```csharp
[StellaEndpoint(BasePath = "/billing")]
public class InvoiceHandler : EndpointHandler
{
private readonly InvoiceService _service;
public InvoiceHandler(InvoiceService service)
{
_service = service;
}
[StellaGet("invoices/{id}")]
public async Task<ResponsePayload> GetInvoice(
[FromPath] Guid id,
CancellationToken cancellationToken)
{
var invoice = await _service.GetByIdAsync(id, cancellationToken);
if (invoice == null)
return NotFound($"Invoice {id} not found");
return Ok(invoice);
}
[StellaPost("invoices")]
[StellaAuth(RequiredClaims = new[] { "billing:write" })]
public async Task<ResponsePayload> CreateInvoice(
[FromBody] CreateInvoiceRequest request,
[FromClaim(Name = "sub")] string userId,
CancellationToken cancellationToken)
{
var invoice = await _service.CreateAsync(request, userId, cancellationToken);
return Created(invoice, $"/billing/invoices/{invoice.Id}");
}
[StellaGet("invoices")]
public async Task<ResponsePayload> ListInvoices(
StellaRequestContext context,
CancellationToken cancellationToken)
{
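// Fall back to the defaults when a query value is missing or not numeric
var page = int.TryParse(context.GetQueryParameter("page"), out var p) ? p : 1;
var pageSize = int.TryParse(context.GetQueryParameter("pageSize"), out var ps) ? ps : 20;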
var invoices = await _service.ListAsync(page, pageSize, cancellationToken);
return Ok(invoices);
}
}
```
---
## Deliverables
1. `StellaOps.Microservice/StellaRequestContext.cs`
2. `StellaOps.Microservice/StellaResponseBuilder.cs`
3. `StellaOps.Microservice/IEndpointHandler.cs`
4. `StellaOps.Microservice/EndpointHandler.cs`
5. `StellaOps.Microservice/IRequestDispatcher.cs`
6. `StellaOps.Microservice/RequestDispatcher.cs`
7. `StellaOps.Microservice/ParameterBindingAttributes.cs`
8. Parameter binding tests
9. Response builder tests
10. Dispatcher routing tests
---
## Next Step
Proceed to [Step 22: Logging & Tracing](22-Step.md) to implement structured logging and distributed tracing.

698
docs/router/22-Step.md Normal file

@@ -0,0 +1,698 @@
# Step 22: Logging & Tracing
**Phase 6: Observability & Resilience**
**Estimated Complexity:** Medium
**Dependencies:** Step 19 (Microservice Host Builder)
---
## Overview
Structured logging and distributed tracing provide observability across the gateway and microservices. Correlation IDs flow from HTTP requests through the transport layer to microservice handlers, enabling end-to-end request tracking.
---
## Goals
1. Implement structured logging with consistent context
2. Propagate correlation IDs across all layers
3. Integrate with OpenTelemetry for distributed tracing
4. Support log level configuration per component
5. Provide sensitive data filtering
---
## Correlation Context
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Provides correlation context for request tracking.
/// </summary>
public static class CorrelationContext
{
// Nullable: no correlation data exists outside a scope, and disposing a scope may restore null
private static readonly AsyncLocal<CorrelationData?> _current = new();
public static CorrelationData Current => _current.Value ?? CorrelationData.Empty;
public static IDisposable BeginScope(CorrelationData data)
{
var previous = _current.Value;
_current.Value = data;
return new CorrelationScope(previous);
}
public static IDisposable BeginScope(string correlationId, string? serviceName = null)
{
return BeginScope(new CorrelationData
{
CorrelationId = correlationId,
ServiceName = serviceName ?? Current.ServiceName,
ParentId = Current.CorrelationId
});
}
private sealed class CorrelationScope : IDisposable
{
private readonly CorrelationData? _previous;
public CorrelationScope(CorrelationData? previous)
{
_previous = previous;
}
public void Dispose()
{
_current.Value = _previous;
}
}
}
public sealed class CorrelationData
{
public static readonly CorrelationData Empty = new();
public string CorrelationId { get; init; } = "";
public string? ParentId { get; init; }
public string? ServiceName { get; init; }
public string? InstanceId { get; init; }
public string? Method { get; init; }
public string? Path { get; init; }
public string? UserId { get; init; }
public Dictionary<string, string> Extra { get; init; } = new();
}
```
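The scope pattern above depends on two `AsyncLocal<T>` behaviours: the value stays visible after an `await`, and disposing a scope restores whatever was current before it. The following sketch demonstrates both with a deliberately simplified stand-in. `AmbientCorrelation` is a hypothetical class that stores a single string instead of `CorrelationData`; it is meant to show the mechanism, not to replace `CorrelationContext`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Simplified, hypothetical stand-in for CorrelationContext.
public static class AmbientCorrelation
{
    private static readonly AsyncLocal<string?> Current = new();

    public static string Id => Current.Value ?? "";

    public static IDisposable Begin(string id)
    {
        var previous = Current.Value;
        Current.Value = id;
        return new Scope(previous);
    }

    // Records the ambient id at four points to show flow and restoration.
    public static async Task<string[]> DemoAsync()
    {
        var seen = new string[4];
        using (Begin("outer"))
        {
            await Task.Yield();      // the value flows across the await
            seen[0] = Id;

            using (Begin("inner"))
            {
                seen[1] = Id;        // the nested scope overrides the outer one
            }

            seen[2] = Id;            // disposing the nested scope restores the outer id
        }

        seen[3] = Id;                // no scope is active any more
        return seen;
    }

    private sealed class Scope : IDisposable
    {
        private readonly string? _previous;

        public Scope(string? previous) => _previous = previous;

        public void Dispose() => Current.Value = _previous;
    }
}
```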
---
## Structured Log Enricher
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Enriches log entries with correlation context.
/// </summary>
public sealed class CorrelationLogEnricher : ILoggerProvider
{
private readonly ILoggerProvider _inner;
public CorrelationLogEnricher(ILoggerProvider inner)
{
_inner = inner;
}
public ILogger CreateLogger(string categoryName)
{
return new CorrelationLogger(_inner.CreateLogger(categoryName));
}
public void Dispose() => _inner.Dispose();
private sealed class CorrelationLogger : ILogger
{
private readonly ILogger _inner;
public CorrelationLogger(ILogger inner)
{
_inner = inner;
}
public IDisposable? BeginScope<TState>(TState state) where TState : notnull
{
return _inner.BeginScope(state);
}
public bool IsEnabled(LogLevel logLevel) => _inner.IsEnabled(logLevel);
public void Log<TState>(
LogLevel logLevel,
EventId eventId,
TState state,
Exception? exception,
Func<TState, Exception?, string> formatter)
{
var correlation = CorrelationContext.Current;
// Create enriched state
using var scope = _inner.BeginScope(new Dictionary<string, object?>
{
["CorrelationId"] = correlation.CorrelationId,
["ServiceName"] = correlation.ServiceName,
["InstanceId"] = correlation.InstanceId,
["Method"] = correlation.Method,
["Path"] = correlation.Path,
["UserId"] = correlation.UserId
});
_inner.Log(logLevel, eventId, state, exception, formatter);
}
}
}
```
---
## Gateway Request Logging
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Middleware for request/response logging with correlation.
/// </summary>
public sealed class RequestLoggingMiddleware
{
private readonly RequestDelegate _next;
private readonly ILogger<RequestLoggingMiddleware> _logger;
private readonly RequestLoggingConfig _config;
public RequestLoggingMiddleware(
RequestDelegate next,
ILogger<RequestLoggingMiddleware> logger,
IOptions<RequestLoggingConfig> config)
{
_next = next;
_logger = logger;
_config = config.Value;
}
public async Task InvokeAsync(HttpContext context)
{
var correlationId = context.Request.Headers["X-Correlation-ID"].FirstOrDefault()
?? context.TraceIdentifier;
// Set correlation context
using var scope = CorrelationContext.BeginScope(new CorrelationData
{
CorrelationId = correlationId,
ServiceName = "gateway",
Method = context.Request.Method,
Path = context.Request.Path
});
var sw = Stopwatch.StartNew();
try
{
// Log request
if (_config.LogRequests)
{
LogRequest(context, correlationId);
}
await _next(context);
sw.Stop();
// Log response
if (_config.LogResponses)
{
LogResponse(context, correlationId, sw.ElapsedMilliseconds);
}
}
catch (Exception ex)
{
sw.Stop();
LogError(context, correlationId, sw.ElapsedMilliseconds, ex);
throw;
}
}
private void LogRequest(HttpContext context, string correlationId)
{
var request = context.Request;
_logger.LogInformation(
"HTTP {Method} {Path} started | CorrelationId={CorrelationId} ClientIP={ClientIP} UserAgent={UserAgent}",
request.Method,
request.Path + request.QueryString,
correlationId,
context.Connection.RemoteIpAddress,
SanitizeHeader(request.Headers.UserAgent));
}
private void LogResponse(HttpContext context, string correlationId, long elapsedMs)
{
var level = context.Response.StatusCode >= 500 ? LogLevel.Error
: context.Response.StatusCode >= 400 ? LogLevel.Warning
: LogLevel.Information;
_logger.Log(
level,
"HTTP {Method} {Path} completed {StatusCode} in {ElapsedMs}ms | CorrelationId={CorrelationId}",
context.Request.Method,
context.Request.Path,
context.Response.StatusCode,
elapsedMs,
correlationId);
}
private void LogError(HttpContext context, string correlationId, long elapsedMs, Exception ex)
{
_logger.LogError(
ex,
"HTTP {Method} {Path} failed after {ElapsedMs}ms | CorrelationId={CorrelationId}",
context.Request.Method,
context.Request.Path,
elapsedMs,
correlationId);
}
private static string SanitizeHeader(StringValues value)
{
var str = value.ToString();
return str.Length > 200 ? str[..200] + "..." : str;
}
}
public class RequestLoggingConfig
{
public bool LogRequests { get; set; } = true;
public bool LogResponses { get; set; } = true;
public bool LogHeaders { get; set; } = false;
public bool LogBody { get; set; } = false;
public int MaxBodyLogLength { get; set; } = 1000;
public HashSet<string> SensitiveHeaders { get; set; } = new(StringComparer.OrdinalIgnoreCase)
{
"Authorization", "Cookie", "X-API-Key"
};
}
```
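`RequestLoggingConfig` declares `LogHeaders` and `SensitiveHeaders`, but the middleware shown above does not read either of them. One way the filtering could look is sketched here, as an assumption about intent rather than the project's implementation: `HeaderRedactor` and the `[REDACTED]` placeholder are hypothetical, and the 200-character cap simply mirrors `SanitizeHeader`. Passing the config's case-insensitive `HashSet<string>` means `authorization` and `Authorization` are treated alike.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that prepares headers for logging.
public static class HeaderRedactor
{
    public const string Placeholder = "[REDACTED]";

    public static Dictionary<string, string> Redact(
        IEnumerable<KeyValuePair<string, string>> headers,
        ISet<string> sensitiveHeaders,
        int maxValueLength = 200)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        foreach (var (name, value) in headers)
        {
            if (sensitiveHeaders.Contains(name))
            {
                // Never write credentials or session identifiers to the log.
                result[name] = Placeholder;
            }
            else
            {
                // Truncate long values so a single header cannot flood the log.
                result[name] = value.Length > maxValueLength
                    ? value[..maxValueLength] + "..."
                    : value;
            }
        }

        return result;
    }
}
```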
---
## OpenTelemetry Integration
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Configures OpenTelemetry tracing for the router.
/// </summary>
public static class OpenTelemetryExtensions
{
public static IServiceCollection AddStellaTracing(
this IServiceCollection services,
IConfiguration configuration)
{
var config = configuration.GetSection("Tracing").Get<TracingConfig>()
?? new TracingConfig();
services.AddOpenTelemetry()
.WithTracing(builder =>
{
builder
.SetResourceBuilder(ResourceBuilder.CreateDefault()
.AddService(config.ServiceName))
.AddSource(StellaActivitySource.Name)
.AddAspNetCoreInstrumentation(options =>
{
options.Filter = ctx =>
!ctx.Request.Path.StartsWithSegments("/health");
options.RecordException = true;
})
.AddHttpClientInstrumentation();
// Add exporter based on config
switch (config.Exporter.ToLowerInvariant())
{
case "jaeger":
builder.AddJaegerExporter(o =>
{
o.AgentHost = config.JaegerHost;
o.AgentPort = config.JaegerPort;
});
break;
case "otlp":
builder.AddOtlpExporter(o =>
{
o.Endpoint = new Uri(config.OtlpEndpoint);
});
break;
case "console":
builder.AddConsoleExporter();
break;
}
});
return services;
}
}
public static class StellaActivitySource
{
public const string Name = "StellaOps.Router";
private static readonly ActivitySource _source = new(Name);
public static Activity? StartActivity(string name, ActivityKind kind = ActivityKind.Internal)
{
return _source.StartActivity(name, kind);
}
public static Activity? StartRequestActivity(string method, string path)
{
var activity = _source.StartActivity("HandleRequest", ActivityKind.Server);
activity?.SetTag("http.method", method);
activity?.SetTag("http.route", path);
return activity;
}
public static Activity? StartTransportActivity(string transport, string serviceName)
{
var activity = _source.StartActivity("Transport", ActivityKind.Client);
activity?.SetTag("transport.type", transport);
activity?.SetTag("service.name", serviceName);
return activity;
}
}
public class TracingConfig
{
public string ServiceName { get; set; } = "stella-router";
public string Exporter { get; set; } = "console";
public string JaegerHost { get; set; } = "localhost";
public int JaegerPort { get; set; } = 6831;
public string OtlpEndpoint { get; set; } = "http://localhost:4317";
public double SampleRate { get; set; } = 1.0;
}
```
---
## Transport Trace Propagation
```csharp
namespace StellaOps.Router.Transport;
/// <summary>
/// Propagates trace context through the transport layer.
/// </summary>
public sealed class TracePropagator
{
/// <summary>
/// Returns a copy of the request payload with trace context injected into its headers.
/// </summary>
public RequestPayload InjectContext(RequestPayload payload)
{
var activity = Activity.Current;
if (activity == null)
return payload;
var headers = new Dictionary<string, string>(payload.Headers);
// Inject W3C Trace Context
headers["traceparent"] = $"00-{activity.TraceId}-{activity.SpanId}-{(activity.Recorded ? "01" : "00")}";
if (!string.IsNullOrEmpty(activity.TraceStateString))
{
headers["tracestate"] = activity.TraceStateString;
}
// Create new payload with updated headers.
// Assumes RequestPayload is a record whose Headers property can be set in a 'with' expression.
return payload with { Headers = headers };
}
/// <summary>
/// Extracts trace context from request payload.
/// </summary>
public ActivityContext? ExtractContext(RequestPayload payload)
{
if (!payload.Headers.TryGetValue("traceparent", out var traceparent))
return null;
if (ActivityContext.TryParse(traceparent, payload.Headers.GetValueOrDefault("tracestate"), out var ctx))
{
return ctx;
}
return null;
}
}
```
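The header format used by `InjectContext` and the parsing done by `ExtractContext` can be checked in isolation. The following is a minimal sketch using only `System.Diagnostics`; the class and method names (`TraceParentSketch`, `Format`, `RoundTrips`) are hypothetical and not part of the router.

```csharp
using System;
using System.Diagnostics;

public static class TraceParentSketch
{
    // Builds the header value the same way InjectContext does:
    // version "00", 32-hex trace id, 16-hex span id, 2-hex flags.
    public static string Format(ActivityContext context)
    {
        var sampled = (context.TraceFlags & ActivityTraceFlags.Recorded) != 0;
        return $"00-{context.TraceId}-{context.SpanId}-{(sampled ? "01" : "00")}";
    }

    // Formats a fresh context and parses it back, mirroring ExtractContext.
    public static bool RoundTrips()
    {
        var original = new ActivityContext(
            ActivityTraceId.CreateRandom(),
            ActivitySpanId.CreateRandom(),
            ActivityTraceFlags.Recorded);

        var header = Format(original);

        // A null tracestate is accepted; only traceparent is required.
        return ActivityContext.TryParse(header, null, out var parsed)
            && parsed.TraceId == original.TraceId
            && parsed.SpanId == original.SpanId
            && parsed.TraceFlags == original.TraceFlags;
    }
}
```

A round-trip check of this kind is a reasonable starting point for the trace context tests listed in the deliverables.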
---
## Microservice Logging
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Request logging for microservice handlers.
/// </summary>
public sealed class HandlerLoggingDecorator : IRequestDispatcher
{
private readonly IRequestDispatcher _inner;
private readonly ILogger<HandlerLoggingDecorator> _logger;
private readonly TracePropagator _propagator;
public HandlerLoggingDecorator(
IRequestDispatcher inner,
ILogger<HandlerLoggingDecorator> logger,
TracePropagator propagator)
{
_inner = inner;
_logger = logger;
_propagator = propagator;
}
public async Task<ResponsePayload> DispatchAsync(
RequestPayload request,
CancellationToken cancellationToken)
{
// Extract and restore trace context
var parentContext = _propagator.ExtractContext(request);
using var activity = StellaActivitySource.StartActivity(
"HandleRequest",
ActivityKind.Server,
parentContext ?? default);
activity?.SetTag("http.method", request.Method);
activity?.SetTag("http.route", request.Path);
// Set correlation context
var correlationId = request.TraceId ?? activity?.TraceId.ToString() ?? Guid.NewGuid().ToString("N");
using var scope = CorrelationContext.BeginScope(new CorrelationData
{
CorrelationId = correlationId,
Method = request.Method,
Path = request.Path,
UserId = request.Claims.GetValueOrDefault("sub")
});
var sw = Stopwatch.StartNew();
try
{
_logger.LogDebug(
"Handling {Method} {Path} | CorrelationId={CorrelationId}",
request.Method, request.Path, correlationId);
var response = await _inner.DispatchAsync(request, cancellationToken);
sw.Stop();
activity?.SetTag("http.status_code", response.StatusCode);
var level = response.StatusCode >= 500 ? LogLevel.Error
: response.StatusCode >= 400 ? LogLevel.Warning
: LogLevel.Debug;
_logger.Log(
level,
"Completed {Method} {Path} with {StatusCode} in {ElapsedMs}ms | CorrelationId={CorrelationId}",
request.Method, request.Path, response.StatusCode, sw.ElapsedMilliseconds, correlationId);
return response;
}
catch (Exception ex)
{
sw.Stop();
activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
_logger.LogError(
ex,
"Failed {Method} {Path} after {ElapsedMs}ms | CorrelationId={CorrelationId}",
request.Method, request.Path, sw.ElapsedMilliseconds, correlationId);
throw;
}
}
}
```
---
## Sensitive Data Filtering
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Filters sensitive data from logs.
/// </summary>
public sealed class SensitiveDataFilter
{
private readonly HashSet<string> _sensitiveFields;
private readonly Regex _cardNumberRegex;
private readonly Regex _ssnRegex;
public SensitiveDataFilter(IOptions<SensitiveDataConfig> config)
{
var cfg = config.Value;
_sensitiveFields = new HashSet<string>(cfg.SensitiveFields, StringComparer.OrdinalIgnoreCase);
_cardNumberRegex = new Regex(@"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b");
_ssnRegex = new Regex(@"\b\d{3}-\d{2}-\d{4}\b");
}
public string Filter(string input)
{
var result = input;
// Mask card numbers
result = _cardNumberRegex.Replace(result, m =>
m.Value[..4] + "****" + m.Value[^4..]);
// Mask SSNs
result = _ssnRegex.Replace(result, "***-**-****");
return result;
}
public Dictionary<string, string> FilterHeaders(IReadOnlyDictionary<string, string> headers)
{
return headers.ToDictionary(
h => h.Key,
h => _sensitiveFields.Contains(h.Key) ? "[REDACTED]" : h.Value);
}
public object FilterObject(object obj)
{
// Deep filter for JSON objects
var json = JsonSerializer.Serialize(obj);
var filtered = FilterJsonProperties(json);
return JsonSerializer.Deserialize<object>(filtered)!;
}
private string FilterJsonProperties(string json)
{
// JsonDocument rents pooled buffers, so dispose it once filtering is done.
using var doc = JsonDocument.Parse(json);
using var stream = new MemoryStream();
using var writer = new Utf8JsonWriter(stream);
FilterElement(doc.RootElement, writer);
writer.Flush();
return Encoding.UTF8.GetString(stream.ToArray());
}
private void FilterElement(JsonElement element, Utf8JsonWriter writer)
{
switch (element.ValueKind)
{
case JsonValueKind.Object:
writer.WriteStartObject();
foreach (var property in element.EnumerateObject())
{
writer.WritePropertyName(property.Name);
if (_sensitiveFields.Contains(property.Name))
{
writer.WriteStringValue("[REDACTED]");
}
else
{
FilterElement(property.Value, writer);
}
}
writer.WriteEndObject();
break;
case JsonValueKind.Array:
writer.WriteStartArray();
foreach (var item in element.EnumerateArray())
{
FilterElement(item, writer);
}
writer.WriteEndArray();
break;
default:
element.WriteTo(writer);
break;
}
}
}
public class SensitiveDataConfig
{
public HashSet<string> SensitiveFields { get; set; } = new(StringComparer.OrdinalIgnoreCase)
{
"password", "secret", "token", "apiKey", "api_key",
"authorization", "creditCard", "credit_card", "ssn",
"socialSecurityNumber", "social_security_number"
};
}
```
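To make the masking rules concrete, here is a small standalone sketch of what `Filter` does to a log line. It copies the two regular expressions from the class above; the `MaskingSketch` name and the sample input are illustrative only.

```csharp
using System.Text.RegularExpressions;

public static class MaskingSketch
{
    // Same patterns as SensitiveDataFilter, compiled once and reused.
    private static readonly Regex CardNumber =
        new(@"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b", RegexOptions.Compiled);

    private static readonly Regex Ssn =
        new(@"\b\d{3}-\d{2}-\d{4}\b", RegexOptions.Compiled);

    public static string Mask(string input)
    {
        // Keep the first and last four digits of a card number, mask the rest.
        var result = CardNumber.Replace(
            input, m => m.Value[..4] + "****" + m.Value[^4..]);

        // SSNs are masked completely.
        return Ssn.Replace(result, "***-**-****");
    }
}
```

For the input `card 4111 1111 1111 1111 ssn 123-45-6789`, `Mask` returns `card 4111****1111 ssn ***-**-****`. Note that the first and last four card digits remain visible, so this reduces exposure in logs rather than removing the value entirely.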
---
## YAML Configuration
```yaml
Logging:
  LogLevel:
    Default: "Information"
    "StellaOps.Router": "Debug"
    "Microsoft.AspNetCore": "Warning"
RequestLogging:
  LogRequests: true
  LogResponses: true
  LogHeaders: false
  LogBody: false
  MaxBodyLogLength: 1000
  SensitiveHeaders:
    - Authorization
    - Cookie
    - X-API-Key
Tracing:
  ServiceName: "stella-router"
  Exporter: "otlp"
  OtlpEndpoint: "http://otel-collector:4317"
  SampleRate: 1.0
SensitiveData:
  SensitiveFields:
    - password
    - secret
    - token
    - apiKey
    - creditCard
    - ssn
```
---
## Deliverables
1. `StellaOps.Router.Common/CorrelationContext.cs`
2. `StellaOps.Router.Common/CorrelationLogEnricher.cs`
3. `StellaOps.Router.Gateway/RequestLoggingMiddleware.cs`
4. `StellaOps.Router.Common/OpenTelemetryExtensions.cs`
5. `StellaOps.Router.Common/StellaActivitySource.cs`
6. `StellaOps.Router.Transport/TracePropagator.cs`
7. `StellaOps.Microservice/HandlerLoggingDecorator.cs`
8. `StellaOps.Router.Common/SensitiveDataFilter.cs`
9. Correlation propagation tests
10. Trace context tests
---
## Next Step
Proceed to [Step 23: Metrics & Health Checks](23-Step.md) to implement observability metrics.

769
docs/router/23-Step.md Normal file

@@ -0,0 +1,769 @@
# Step 23: Metrics & Health Checks
**Phase 6: Observability & Resilience**
**Estimated Complexity:** Medium
**Dependencies:** Step 22 (Logging & Tracing)
---
## Overview
Metrics and health checks provide operational visibility into the router and microservices. Prometheus-compatible metrics expose request rates, latencies, error rates, and connection pool status. Health checks enable load balancers and orchestrators to route traffic appropriately.
---
## Goals
1. Expose Prometheus-compatible metrics
2. Track request/response metrics per endpoint
3. Monitor transport layer health
4. Provide liveness and readiness probes
5. Support custom health check integrations
---
## Metrics Configuration
```csharp
namespace StellaOps.Router.Common;
public class MetricsConfig
{
/// <summary>Whether to enable metrics collection.</summary>
public bool Enabled { get; set; } = true;
/// <summary>Path for metrics endpoint.</summary>
public string Path { get; set; } = "/metrics";
/// <summary>Histogram buckets for request duration.</summary>
public double[] DurationBuckets { get; set; } = new[]
{
0.001, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 10.0
};
/// <summary>Labels to include in metrics.</summary>
public HashSet<string> IncludeLabels { get; set; } = new()
{
"method", "path", "status_code", "service"
};
/// <summary>Whether to include path in labels (may cause high cardinality).</summary>
public bool IncludePathLabel { get; set; } = false;
/// <summary>Maximum unique path labels before aggregating.</summary>
public int MaxPathCardinality { get; set; } = 100;
}
```
---
## Core Metrics
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Central metrics registry for Stella Router.
/// </summary>
public sealed class StellaMetrics
{
    // Declared first: static fields initialize in textual order, so the Meter
    // must exist before any instrument below is created from it.
    private static readonly Meter Meter = new("StellaOps.Router", "1.0.0");

    // Request metrics
    public static readonly Counter<long> RequestsTotal = Meter.CreateCounter<long>(
        "stella_requests_total",
        description: "Total number of requests processed");
    public static readonly Histogram<double> RequestDuration = Meter.CreateHistogram<double>(
        "stella_request_duration_seconds",
        unit: "s",
        description: "Request processing duration in seconds");
    public static readonly Counter<long> RequestErrors = Meter.CreateCounter<long>(
        "stella_request_errors_total",
        description: "Total number of request errors");

    // Transport metrics
    public static readonly UpDownCounter<int> ActiveConnections = Meter.CreateUpDownCounter<int>(
        "stella_active_connections",
        description: "Number of active transport connections");
    public static readonly Counter<long> ConnectionsTotal = Meter.CreateCounter<long>(
        "stella_connections_total",
        description: "Total number of transport connections");
    public static readonly Counter<long> FramesSent = Meter.CreateCounter<long>(
        "stella_frames_sent_total",
        description: "Total number of frames sent");
    public static readonly Counter<long> FramesReceived = Meter.CreateCounter<long>(
        "stella_frames_received_total",
        description: "Total number of frames received");
    public static readonly Counter<long> BytesSent = Meter.CreateCounter<long>(
        "stella_bytes_sent_total",
        unit: "By",
        description: "Total bytes sent");
    public static readonly Counter<long> BytesReceived = Meter.CreateCounter<long>(
        "stella_bytes_received_total",
        unit: "By",
        description: "Total bytes received");

    // Rate limiting metrics
    public static readonly Counter<long> RateLimitHits = Meter.CreateCounter<long>(
        "stella_rate_limit_hits_total",
        description: "Number of requests that hit rate limits");
    public static readonly Gauge<int> RateLimitBuckets = Meter.CreateGauge<int>(
        "stella_rate_limit_buckets",
        description: "Number of active rate limit buckets");

    // Auth metrics
    public static readonly Counter<long> AuthSuccesses = Meter.CreateCounter<long>(
        "stella_auth_success_total",
        description: "Number of successful authentications");
    public static readonly Counter<long> AuthFailures = Meter.CreateCounter<long>(
        "stella_auth_failures_total",
        description: "Number of failed authentications");

    // Circuit breaker metrics
    public static readonly Gauge<int> CircuitBreakerState = Meter.CreateGauge<int>(
        "stella_circuit_breaker_state",
        description: "Circuit breaker state (0=closed, 1=half-open, 2=open)");
}
```
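Instruments created from a `Meter` only produce data once something listens to them. As a rough illustration of how the metrics collection tests could observe a counter without an exporter, the sketch below uses `MeterListener` from `System.Diagnostics.Metrics`. The meter and instrument names are placeholders chosen for this example, not the router's real ones.

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Metrics;

public static class MeterListenerSketch
{
    public static long CountRequests()
    {
        // Hypothetical meter and counter, used only for this sketch.
        using var meter = new Meter("Sketch.Router");
        var requests = meter.CreateCounter<long>("sketch_requests_total");

        long observed = 0;

        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            // Subscribe only to instruments that belong to our meter.
            if (ReferenceEquals(instrument.Meter, meter))
                l.EnableMeasurementEvents(instrument);
        };
        listener.SetMeasurementEventCallback<long>(
            (instrument, measurement, tags, state) => observed += measurement);
        listener.Start();

        // Measurements are delivered synchronously to the callback.
        requests.Add(1, new TagList { { "method", "GET" }, { "status_code", "200" } });
        requests.Add(2, new TagList { { "method", "POST" }, { "status_code", "500" } });

        return observed;
    }
}
```

Because the callback runs synchronously on the recording thread, a test can assert on the accumulated value immediately after calling `Add`.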
---
## Request Metrics Middleware
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Middleware to collect request metrics.
/// </summary>
public sealed class MetricsMiddleware
{
private readonly RequestDelegate _next;
private readonly MetricsConfig _config;
private readonly PathNormalizer _pathNormalizer;
public MetricsMiddleware(
RequestDelegate next,
IOptions<MetricsConfig> config)
{
_next = next;
_config = config.Value;
_pathNormalizer = new PathNormalizer(_config.MaxPathCardinality);
}
public async Task InvokeAsync(HttpContext context)
{
if (!_config.Enabled)
{
await _next(context);
return;
}
var sw = Stopwatch.StartNew();
var method = context.Request.Method;
var path = _config.IncludePathLabel
? _pathNormalizer.Normalize(context.Request.Path)
: "aggregated";
try
{
await _next(context);
}
finally
{
sw.Stop();
var tags = new TagList
{
{ "method", method },
{ "status_code", context.Response.StatusCode.ToString() }
};
if (_config.IncludePathLabel)
{
tags.Add("path", path);
}
StellaMetrics.RequestsTotal.Add(1, tags);
StellaMetrics.RequestDuration.Record(sw.Elapsed.TotalSeconds, tags);
if (context.Response.StatusCode >= 400)
{
StellaMetrics.RequestErrors.Add(1, tags);
}
}
}
}
/// <summary>
/// Normalizes paths to prevent high cardinality.
/// </summary>
internal sealed class PathNormalizer
{
    private readonly int _maxCardinality;
    // Distinct normalized templates seen so far. Tracking templates rather than
    // raw paths is what actually bounds the number of "path" label values.
    private readonly ConcurrentDictionary<string, byte> _knownTemplates = new();

    public PathNormalizer(int maxCardinality)
    {
        _maxCardinality = maxCardinality;
    }

    public string Normalize(string path)
    {
        // Replace path parameters with placeholders
        var segments = path.Split('/');
        for (int i = 0; i < segments.Length; i++)
        {
            if (Guid.TryParse(segments[i], out _) ||
                int.TryParse(segments[i], out _) ||
                segments[i].Length > 20)
            {
                segments[i] = "{id}";
            }
        }
        var normalized = string.Join("/", segments);

        // Templates that are already known always keep their own label.
        if (_knownTemplates.ContainsKey(normalized))
            return normalized;

        // New templates are accepted only while under the cap. The Count check
        // is not atomic with TryAdd, so the cap may be exceeded slightly under
        // heavy contention.
        if (_knownTemplates.Count < _maxCardinality)
        {
            _knownTemplates.TryAdd(normalized, 0);
            return normalized;
        }

        return "other";
    }
}
```
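The segment replacement rule is easiest to judge with concrete inputs. The sketch below isolates that rule (GUIDs, integers, and segments longer than 20 characters become `{id}`) in a hypothetical `PathTemplateSketch` helper, leaving the cardinality cap aside.

```csharp
using System;

public static class PathTemplateSketch
{
    public static string ToTemplate(string path)
    {
        var segments = path.Split('/');
        for (var i = 0; i < segments.Length; i++)
        {
            // Treat GUIDs, integers, and very long segments as identifiers.
            if (Guid.TryParse(segments[i], out _) ||
                int.TryParse(segments[i], out _) ||
                segments[i].Length > 20)
            {
                segments[i] = "{id}";
            }
        }
        return string.Join("/", segments);
    }
}
```

With this rule, `/api/orders/42/items` becomes `/api/orders/{id}/items`, and `/users/3f2504e0-4f89-11d3-9a0c-0305e82c3301` becomes `/users/{id}`. Short non-numeric identifiers such as slugs are left as they are, which is one reason the cardinality cap is still needed.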
---
## Transport Metrics
```csharp
namespace StellaOps.Router.Transport;
/// <summary>
/// Collects metrics for transport layer operations.
/// </summary>
public sealed class TransportMetricsCollector
{
public void RecordConnectionOpened(string transport, string serviceName)
{
var tags = new TagList
{
{ "transport", transport },
{ "service", serviceName }
};
StellaMetrics.ConnectionsTotal.Add(1, tags);
StellaMetrics.ActiveConnections.Add(1, tags);
}
public void RecordConnectionClosed(string transport, string serviceName)
{
var tags = new TagList
{
{ "transport", transport },
{ "service", serviceName }
};
StellaMetrics.ActiveConnections.Add(-1, tags);
}
public void RecordFrameSent(string transport, FrameType type, int bytes)
{
var tags = new TagList
{
{ "transport", transport },
{ "frame_type", type.ToString() }
};
StellaMetrics.FramesSent.Add(1, tags);
StellaMetrics.BytesSent.Add(bytes, new TagList { { "transport", transport } });
}
public void RecordFrameReceived(string transport, FrameType type, int bytes)
{
var tags = new TagList
{
{ "transport", transport },
{ "frame_type", type.ToString() }
};
StellaMetrics.FramesReceived.Add(1, tags);
StellaMetrics.BytesReceived.Add(bytes, new TagList { { "transport", transport } });
}
}
```
---
## Health Check System
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Health check result.
/// </summary>
// Declared as a record so HealthCheckService can use a 'with' expression to set Duration.
public sealed record HealthCheckResult
{
public HealthStatus Status { get; init; }
public string? Description { get; init; }
public TimeSpan Duration { get; init; }
public IReadOnlyDictionary<string, object>? Data { get; init; }
public Exception? Exception { get; init; }
}
public enum HealthStatus
{
Healthy,
Degraded,
Unhealthy
}
/// <summary>
/// Health check interface.
/// </summary>
public interface IHealthCheck
{
string Name { get; }
Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken);
}
/// <summary>
/// Aggregates multiple health checks.
/// </summary>
public sealed class HealthCheckService
{
private readonly IEnumerable<IHealthCheck> _checks;
private readonly ILogger<HealthCheckService> _logger;
public HealthCheckService(
IEnumerable<IHealthCheck> checks,
ILogger<HealthCheckService> logger)
{
_checks = checks;
_logger = logger;
}
public async Task<HealthReport> CheckHealthAsync(CancellationToken cancellationToken)
{
var results = new Dictionary<string, HealthCheckResult>();
var overallStatus = HealthStatus.Healthy;
foreach (var check in _checks)
{
var sw = Stopwatch.StartNew();
try
{
var result = await check.CheckAsync(cancellationToken);
result = result with { Duration = sw.Elapsed };
results[check.Name] = result;
if (result.Status > overallStatus)
{
overallStatus = result.Status;
}
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Health check {Name} failed", check.Name);
results[check.Name] = new HealthCheckResult
{
Status = HealthStatus.Unhealthy,
Description = ex.Message,
Duration = sw.Elapsed,
Exception = ex
};
overallStatus = HealthStatus.Unhealthy;
}
}
return new HealthReport
{
Status = overallStatus,
Checks = results,
TotalDuration = results.Values.Sum(r => r.Duration.TotalMilliseconds)
};
}
}
public sealed class HealthReport
{
public HealthStatus Status { get; init; }
public IReadOnlyDictionary<string, HealthCheckResult> Checks { get; init; } = new Dictionary<string, HealthCheckResult>();
public double TotalDuration { get; init; }
}
```
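`CheckHealthAsync` reports the most severe status among all checks, relying on the declaration order of `HealthStatus`. The following sketch shows that rule on its own; the nested `Status` enum is a stand-in for `HealthStatus`, and `HealthAggregationSketch` is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HealthAggregationSketch
{
    // Stand-in for HealthStatus; member order encodes severity.
    public enum Status { Healthy, Degraded, Unhealthy }

    // The overall status is the worst individual status.
    // With no checks registered, the result is Healthy.
    public static Status Worst(IEnumerable<Status> statuses) =>
        statuses.DefaultIfEmpty(Status.Healthy).Max();
}
```

Since the comparison depends on enum order, new members should be added in order of increasing severity, otherwise the aggregation silently changes meaning.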
---
## Built-in Health Checks
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Checks that at least one transport connection is active.
/// </summary>
public sealed class TransportHealthCheck : IHealthCheck
{
private readonly IGlobalRoutingState _routingState;
public string Name => "transport";
public TransportHealthCheck(IGlobalRoutingState routingState)
{
_routingState = routingState;
}
public Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken)
{
var connections = _routingState.GetAllConnections();
var activeCount = connections.Count(c => c.State == ConnectionState.Connected);
if (activeCount == 0)
{
return Task.FromResult(new HealthCheckResult
{
Status = HealthStatus.Unhealthy,
Description = "No active transport connections",
Data = new Dictionary<string, object> { ["connections"] = 0 }
});
}
return Task.FromResult(new HealthCheckResult
{
Status = HealthStatus.Healthy,
Description = $"{activeCount} active connections",
Data = new Dictionary<string, object> { ["connections"] = activeCount }
});
}
}
/// <summary>
/// Checks Authority service connectivity.
/// </summary>
public sealed class AuthorityHealthCheck : IHealthCheck
{
private readonly IAuthorityClient _authority;
private readonly TimeSpan _timeout;
public string Name => "authority";
public AuthorityHealthCheck(
IAuthorityClient authority,
IOptions<AuthorityConfig> config)
{
_authority = authority;
_timeout = config.Value.HealthCheckTimeout;
}
public async Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken)
{
try
{
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(_timeout);
var isHealthy = await _authority.CheckHealthAsync(cts.Token);
return new HealthCheckResult
{
Status = isHealthy ? HealthStatus.Healthy : HealthStatus.Degraded,
Description = isHealthy ? "Authority is responsive" : "Authority returned unhealthy"
};
}
catch (Exception ex)
{
return new HealthCheckResult
{
Status = HealthStatus.Degraded, // Degraded, not unhealthy - gateway can still work
Description = $"Authority unreachable: {ex.Message}",
Exception = ex
};
}
}
}
/// <summary>
/// Checks rate limiter backend connectivity.
/// </summary>
public sealed class RateLimiterHealthCheck : IHealthCheck
{
private readonly IRateLimiter _rateLimiter;
public string Name => "rate_limiter";
public RateLimiterHealthCheck(IRateLimiter rateLimiter)
{
_rateLimiter = rateLimiter;
}
public async Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken)
{
try
{
// Try a simple operation
await _rateLimiter.CheckLimitAsync(
new RateLimitContext { Key = "__health_check__", Tier = RateLimitTier.Free },
cancellationToken);
return new HealthCheckResult
{
Status = HealthStatus.Healthy,
Description = "Rate limiter is responsive"
};
}
catch (Exception ex)
{
return new HealthCheckResult
{
Status = HealthStatus.Degraded,
Description = $"Rate limiter error: {ex.Message}",
Exception = ex
};
}
}
}
```
---
## Health Endpoints
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Health check endpoints.
/// </summary>
public static class HealthEndpoints
{
public static IEndpointRouteBuilder MapHealthEndpoints(
this IEndpointRouteBuilder endpoints,
string basePath = "/health")
{
endpoints.MapGet(basePath + "/live", LivenessCheck);
endpoints.MapGet(basePath + "/ready", ReadinessCheck);
endpoints.MapGet(basePath, DetailedHealthCheck);
return endpoints;
}
/// <summary>
/// Liveness probe - is the process running?
/// </summary>
private static IResult LivenessCheck()
{
return Results.Ok(new { status = "alive" });
}
/// <summary>
/// Readiness probe - can the service accept traffic?
/// </summary>
private static async Task<IResult> ReadinessCheck(
HealthCheckService healthService,
CancellationToken cancellationToken)
{
var report = await healthService.CheckHealthAsync(cancellationToken);
return report.Status == HealthStatus.Unhealthy
? Results.Json(new
{
status = "not_ready",
checks = report.Checks.ToDictionary(c => c.Key, c => c.Value.Status.ToString())
}, statusCode: 503)
: Results.Ok(new { status = "ready" });
}
/// <summary>
/// Detailed health report.
/// </summary>
private static async Task<IResult> DetailedHealthCheck(
HealthCheckService healthService,
CancellationToken cancellationToken)
{
var report = await healthService.CheckHealthAsync(cancellationToken);
var response = new
{
status = report.Status.ToString().ToLower(),
totalDuration = $"{report.TotalDuration:F2}ms",
checks = report.Checks.ToDictionary(c => c.Key, c => new
{
status = c.Value.Status.ToString().ToLower(),
description = c.Value.Description,
duration = $"{c.Value.Duration.TotalMilliseconds:F2}ms",
data = c.Value.Data
})
};
var statusCode = report.Status switch
{
HealthStatus.Healthy => 200,
HealthStatus.Degraded => 200, // Still return 200 for degraded
HealthStatus.Unhealthy => 503,
_ => 200
};
return Results.Json(response, statusCode: statusCode);
}
}
```
---
## Prometheus Metrics Endpoint
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Exposes metrics in Prometheus format.
/// </summary>
public sealed class PrometheusMetricsEndpoint
{
public static void Map(IEndpointRouteBuilder endpoints, string path = "/metrics")
{
endpoints.MapGet(path, async (HttpContext context) =>
{
var exporter = context.RequestServices.GetRequiredService<PrometheusExporter>();
var metrics = await exporter.ExportAsync();
context.Response.ContentType = "text/plain; version=0.0.4";
await context.Response.WriteAsync(metrics);
});
}
}
public sealed class PrometheusExporter
{
private readonly MeterProvider _meterProvider;
public PrometheusExporter(MeterProvider meterProvider)
{
_meterProvider = meterProvider;
}
public Task<string> ExportAsync()
{
// Use OpenTelemetry's Prometheus exporter
// This is a simplified example
var sb = new StringBuilder();
// Export would iterate over all registered metrics
// Real implementation uses OpenTelemetry.Exporter.Prometheus
return Task.FromResult(sb.ToString());
}
}
```
---
## Service Registration
```csharp
namespace StellaOps.Router.Gateway;
public static class MetricsExtensions
{
public static IServiceCollection AddStellaMetrics(
this IServiceCollection services,
IConfiguration configuration)
{
services.Configure<MetricsConfig>(configuration.GetSection("Metrics"));
services.AddOpenTelemetry()
.WithMetrics(builder =>
{
builder
.AddMeter("StellaOps.Router")
.AddAspNetCoreInstrumentation()
.AddPrometheusExporter();
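});
// The custom /metrics endpoint resolves PrometheusExporter from DI, so register it here.
services.AddSingleton<PrometheusExporter>();
return services;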
}
public static IServiceCollection AddStellaHealthChecks(
this IServiceCollection services)
{
services.AddSingleton<HealthCheckService>();
services.AddSingleton<IHealthCheck, TransportHealthCheck>();
services.AddSingleton<IHealthCheck, AuthorityHealthCheck>();
services.AddSingleton<IHealthCheck, RateLimiterHealthCheck>();
return services;
}
}
```
---
## YAML Configuration
```yaml
Metrics:
  Enabled: true
  Path: "/metrics"
  IncludePathLabel: false
  MaxPathCardinality: 100
  DurationBuckets:
    - 0.005
    - 0.01
    - 0.025
    - 0.05
    - 0.1
    - 0.25
    - 0.5
    - 1
    - 2.5
    - 5
    - 10
HealthChecks:
  Enabled: true
  Path: "/health"
  CacheDuration: "00:00:05"
```
---
## Deliverables
1. `StellaOps.Router.Common/StellaMetrics.cs`
2. `StellaOps.Router.Gateway/MetricsMiddleware.cs`
3. `StellaOps.Router.Transport/TransportMetricsCollector.cs`
4. `StellaOps.Router.Common/HealthCheckService.cs`
5. `StellaOps.Router.Gateway/TransportHealthCheck.cs`
6. `StellaOps.Router.Gateway/AuthorityHealthCheck.cs`
7. `StellaOps.Router.Gateway/HealthEndpoints.cs`
8. `StellaOps.Router.Gateway/PrometheusMetricsEndpoint.cs`
9. Metrics collection tests
10. Health check tests
---
## Next Step
Proceed to [Step 24: Circuit Breaker & Retry Policies](24-Step.md) to implement resilience patterns.

856
docs/router/24-Step.md Normal file

@@ -0,0 +1,856 @@
# Step 24: Circuit Breaker & Retry Policies
**Phase 6: Observability & Resilience**
**Estimated Complexity:** High
**Dependencies:** Step 23 (Metrics & Health Checks)
---
## Overview
Circuit breakers and retry policies protect the system from cascading failures and transient errors. The circuit breaker prevents requests to failing services, while retry policies automatically retry failed requests with exponential backoff.
---
## Goals
1. Implement circuit breaker pattern for service protection
2. Support configurable retry policies
3. Enable per-service and per-endpoint policies
4. Integrate with metrics for observability
5. Provide graceful degradation strategies
---
## Circuit Breaker Configuration
```csharp
namespace StellaOps.Router.Resilience;
public class CircuitBreakerConfig
{
/// <summary>Number of failures before opening circuit.</summary>
public int FailureThreshold { get; set; } = 5;
/// <summary>Time window for counting failures.</summary>
public TimeSpan SamplingDuration { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>How long to stay open before testing.</summary>
public TimeSpan BreakDuration { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>Minimum throughput before circuit can trip.</summary>
public int MinimumThroughput { get; set; } = 10;
/// <summary>Failure ratio to trip circuit (0.0 to 1.0).</summary>
public double FailureRatioThreshold { get; set; } = 0.5;
/// <summary>HTTP status codes considered failures.</summary>
public HashSet<int> FailureStatusCodes { get; set; } = new()
{
500, 502, 503, 504
};
/// <summary>Exception types considered failures.</summary>
public HashSet<Type> FailureExceptions { get; set; } = new()
{
typeof(TimeoutException),
typeof(TaskCanceledException),
typeof(HttpRequestException)
};
}
```
---
## Circuit Breaker Implementation
```csharp
namespace StellaOps.Router.Resilience;
public enum CircuitState
{
Closed = 0, // Normal operation
HalfOpen = 1, // Testing with limited requests
Open = 2 // Blocking requests
}
/// <summary>
/// Circuit breaker for a single service or endpoint.
/// </summary>
public sealed class CircuitBreaker
{
private readonly CircuitBreakerConfig _config;
private readonly ILogger<CircuitBreaker> _logger;
private readonly SlidingWindow _window;
// Read and written from multiple request threads without a lock.
private volatile CircuitState _state = CircuitState.Closed;
private DateTimeOffset _openedAt;
private readonly SemaphoreSlim _halfOpenLock = new(1, 1);
public string Name { get; }
public CircuitState State => _state;
public DateTimeOffset LastStateChange { get; private set; }
public CircuitBreaker(
string name,
CircuitBreakerConfig config,
ILogger<CircuitBreaker> logger)
{
Name = name;
_config = config;
_logger = logger;
_window = new SlidingWindow(config.SamplingDuration);
LastStateChange = DateTimeOffset.UtcNow;
}
/// <summary>
/// Checks if request is allowed through the circuit.
/// </summary>
public async Task<bool> AllowRequestAsync(CancellationToken cancellationToken)
{
switch (_state)
{
case CircuitState.Closed:
return true;
case CircuitState.Open:
if (DateTimeOffset.UtcNow - _openedAt < _config.BreakDuration)
return false;
// Admit only the caller that wins the probe slot. Returning
// "_state == HalfOpen" would also admit concurrent callers that never
// acquired _halfOpenLock, and their completion would release it twice.
return await TryTransitionToHalfOpenAsync(cancellationToken);
case CircuitState.HalfOpen:
// Only allow one request at a time in half-open
return await _halfOpenLock.WaitAsync(0, cancellationToken);
default:
return false;
}
}
/// <summary>
/// Records a successful request.
/// </summary>
public void RecordSuccess()
{
_window.RecordSuccess();
if (_state == CircuitState.HalfOpen)
{
TransitionToClosed();
_halfOpenLock.Release();
}
}
/// <summary>
/// Records a failed request.
/// </summary>
public void RecordFailure()
{
_window.RecordFailure();
if (_state == CircuitState.HalfOpen)
{
TransitionToOpen();
_halfOpenLock.Release();
}
else if (_state == CircuitState.Closed)
{
CheckThreshold();
}
}
private void CheckThreshold()
{
var stats = _window.GetStats();
if (stats.TotalRequests < _config.MinimumThroughput)
return;
var failureRatio = (double)stats.Failures / stats.TotalRequests;
if (failureRatio >= _config.FailureRatioThreshold ||
stats.Failures >= _config.FailureThreshold)
{
TransitionToOpen();
}
}
private void TransitionToOpen()
{
_state = CircuitState.Open;
_openedAt = DateTimeOffset.UtcNow;
LastStateChange = _openedAt;
// Take one snapshot so the logged count and ratio are consistent.
var stats = _window.GetStats();
_logger.LogWarning(
"Circuit {Name} opened. Failures: {Failures}, Ratio: {Ratio:P2}",
Name, stats.Failures,
(double)stats.Failures / Math.Max(1, stats.TotalRequests));
StellaMetrics.CircuitBreakerState.Record((int)CircuitState.Open,
new TagList { { "circuit", Name } });
}
/// <summary>
/// Moves Open to HalfOpen. Returns true only for the caller that acquired the probe slot.
/// </summary>
private async Task<bool> TryTransitionToHalfOpenAsync(CancellationToken cancellationToken)
{
if (_state != CircuitState.Open)
return false;
if (!await _halfOpenLock.WaitAsync(0, cancellationToken))
return false;
if (_state != CircuitState.Open)
{
// Another caller finished a probe between the check and the acquire.
_halfOpenLock.Release();
return _state == CircuitState.Closed;
}
_state = CircuitState.HalfOpen;
LastStateChange = DateTimeOffset.UtcNow;
_window.Reset();
_logger.LogInformation("Circuit {Name} transitioning to half-open", Name);
StellaMetrics.CircuitBreakerState.Record((int)CircuitState.HalfOpen,
new TagList { { "circuit", Name } });
return true;
}
private void TransitionToClosed()
{
_state = CircuitState.Closed;
LastStateChange = DateTimeOffset.UtcNow;
_window.Reset();
_logger.LogInformation("Circuit {Name} closed", Name);
StellaMetrics.CircuitBreakerState.Record((int)CircuitState.Closed,
new TagList { { "circuit", Name } });
}
}
/// <summary>
/// Sliding window for tracking success/failure counts.
/// </summary>
internal sealed class SlidingWindow
{
private readonly TimeSpan _duration;
private readonly ConcurrentQueue<(DateTimeOffset Time, bool Success)> _events = new();
public SlidingWindow(TimeSpan duration)
{
_duration = duration;
}
public void RecordSuccess()
{
_events.Enqueue((DateTimeOffset.UtcNow, true));
Cleanup();
}
public void RecordFailure()
{
_events.Enqueue((DateTimeOffset.UtcNow, false));
Cleanup();
}
public WindowStats GetStats()
{
Cleanup();
var successes = 0;
var failures = 0;
foreach (var evt in _events)
{
if (evt.Success)
successes++;
else
failures++;
}
return new WindowStats(successes, failures);
}
public void Reset()
{
_events.Clear();
}
private void Cleanup()
{
var cutoff = DateTimeOffset.UtcNow - _duration;
while (_events.TryPeek(out var evt) && evt.Time < cutoff)
{
_events.TryDequeue(out _);
}
}
}
internal readonly record struct WindowStats(int Successes, int Failures)
{
public int TotalRequests => Successes + Failures;
}
```
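The trip condition in `CheckThreshold` combines a minimum throughput gate with two alternative thresholds. The sketch below restates it as a pure function so the interaction is easier to see; `TripDecisionSketch` is a hypothetical name, and the default parameter values are taken from `CircuitBreakerConfig`.

```csharp
public static class TripDecisionSketch
{
    public static bool ShouldTrip(
        int successes,
        int failures,
        int minimumThroughput = 10,
        double failureRatioThreshold = 0.5,
        int failureThreshold = 5)
    {
        var total = successes + failures;

        // Too little traffic in the window to judge the service.
        if (total < minimumThroughput)
            return false;

        var ratio = (double)failures / total;

        // Either threshold alone is enough to open the circuit.
        return ratio >= failureRatioThreshold || failures >= failureThreshold;
    }
}
```

With the defaults, 4 successes and 5 failures do not trip the circuit because only 9 requests were seen, while 95 successes and 5 failures do trip it: the absolute `FailureThreshold` is met even though the failure ratio is only 5%. High-traffic services may therefore want a larger `FailureThreshold` or a shorter `SamplingDuration`.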
---
## Retry Policy Configuration
```csharp
namespace StellaOps.Router.Resilience;
public class RetryPolicyConfig
{
/// <summary>Maximum number of retries.</summary>
public int MaxRetries { get; set; } = 3;
/// <summary>Initial delay before first retry.</summary>
public TimeSpan InitialDelay { get; set; } = TimeSpan.FromMilliseconds(100);
/// <summary>Maximum delay between retries.</summary>
public TimeSpan MaxDelay { get; set; } = TimeSpan.FromSeconds(10);
/// <summary>Backoff multiplier for exponential delay.</summary>
public double BackoffMultiplier { get; set; } = 2.0;
/// <summary>Whether to add jitter to delays.</summary>
public bool UseJitter { get; set; } = true;
/// <summary>Maximum jitter to add (percentage of delay).</summary>
public double MaxJitterPercent { get; set; } = 0.25;
/// <summary>HTTP status codes that trigger retry.</summary>
public HashSet<int> RetryableStatusCodes { get; set; } = new()
{
408, 429, 500, 502, 503, 504
};
/// <summary>Exception types that trigger retry.</summary>
public HashSet<Type> RetryableExceptions { get; set; } = new()
{
typeof(TimeoutException),
typeof(HttpRequestException),
typeof(IOException)
};
}
```
---
## Retry Policy Implementation
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Executes operations with retry logic.
/// </summary>
public sealed class RetryPolicy
{
private readonly RetryPolicyConfig _config;
private readonly ILogger<RetryPolicy> _logger;
public RetryPolicy(RetryPolicyConfig config, ILogger<RetryPolicy> logger)
{
_config = config;
_logger = logger;
}
/// <summary>
/// Executes an operation with retry logic.
/// </summary>
public async Task<T> ExecuteAsync<T>(
Func<CancellationToken, Task<T>> operation,
Func<T, bool> shouldRetry,
CancellationToken cancellationToken)
{
var attempt = 0;
var totalDelay = TimeSpan.Zero;
while (true)
{
try
{
attempt++;
var result = await operation(cancellationToken);
if (shouldRetry(result) && attempt <= _config.MaxRetries)
{
var delay = CalculateDelay(attempt);
totalDelay += delay;
_logger.LogDebug(
"Retrying operation (attempt {Attempt}/{MaxRetries}) after {Delay}ms",
attempt, _config.MaxRetries, delay.TotalMilliseconds);
await Task.Delay(delay, cancellationToken);
continue;
}
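if (attempt > 1)
{
// The result may still be a retryable failure if retries were exhausted,
// so this reports completion rather than success.
_logger.LogDebug(
"Operation completed after {Attempts} attempts, total delay: {TotalDelay}ms",
attempt, totalDelay.TotalMilliseconds);
}
return result;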
}
catch (Exception ex) when (ShouldRetry(ex) && attempt <= _config.MaxRetries)
{
var delay = CalculateDelay(attempt);
totalDelay += delay;
_logger.LogWarning(
ex,
"Operation failed (attempt {Attempt}/{MaxRetries}), retrying after {Delay}ms",
attempt, _config.MaxRetries, delay.TotalMilliseconds);
await Task.Delay(delay, cancellationToken);
}
}
}
/// <summary>
/// Executes an operation with retry logic (response payload variant).
/// </summary>
public Task<ResponsePayload> ExecuteAsync(
Func<CancellationToken, Task<ResponsePayload>> operation,
CancellationToken cancellationToken)
{
return ExecuteAsync(
operation,
response => _config.RetryableStatusCodes.Contains(response.StatusCode),
cancellationToken);
}
private bool ShouldRetry(Exception ex)
{
var exType = ex.GetType();
return _config.RetryableExceptions.Any(t => t.IsAssignableFrom(exType));
}
private TimeSpan CalculateDelay(int attempt)
{
// Exponential backoff
var delay = TimeSpan.FromMilliseconds(
_config.InitialDelay.TotalMilliseconds * Math.Pow(_config.BackoffMultiplier, attempt - 1));
// Cap at max delay
if (delay > _config.MaxDelay)
{
delay = _config.MaxDelay;
}
// Add jitter on top of the capped delay; the result can exceed MaxDelay by up to MaxJitterPercent
if (_config.UseJitter)
{
var jitter = delay.TotalMilliseconds * _config.MaxJitterPercent * Random.Shared.NextDouble();
delay = TimeSpan.FromMilliseconds(delay.TotalMilliseconds + jitter);
}
return delay;
}
}
```
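
To make the backoff arithmetic concrete, the sketch below reproduces only the exponential-plus-cap part of `CalculateDelay`, with jitter left out so the numbers are deterministic. `BackoffSchedule` and `DelaysFor` are illustrative names, not part of the router. With the defaults from `RetryPolicyConfig` (3 retries, 100 ms initial delay, multiplier 2.0, 10 s cap) it yields 100 ms, 200 ms, and 400 ms; the cap would first apply on the eighth retry (100 × 2^7 = 12 800 ms). With `UseJitter` enabled, each value grows by up to `MaxJitterPercent` of itself.

```csharp
using System;
using System.Collections.Generic;

public static class BackoffSchedule
{
    // Mirrors CalculateDelay without jitter:
    // initialDelay * multiplier^(attempt - 1), capped at maxDelay.
    public static IReadOnlyList<TimeSpan> DelaysFor(
        int maxRetries, TimeSpan initialDelay, TimeSpan maxDelay, double multiplier)
    {
        var delays = new List<TimeSpan>(maxRetries);
        for (var attempt = 1; attempt <= maxRetries; attempt++)
        {
            var ms = initialDelay.TotalMilliseconds * Math.Pow(multiplier, attempt - 1);
            delays.Add(TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds)));
        }
        return delays;
    }
}
```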
---
## Resilience Policy Executor
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Combines circuit breaker and retry policies.
/// </summary>
public interface IResiliencePolicy
{
Task<ResponsePayload> ExecuteAsync(
string serviceName,
Func<CancellationToken, Task<ResponsePayload>> operation,
CancellationToken cancellationToken);
}
public sealed class ResiliencePolicy : IResiliencePolicy
{
private readonly ICircuitBreakerRegistry _circuitBreakers;
private readonly RetryPolicy _retryPolicy;
private readonly ResilienceConfig _config;
private readonly ILogger<ResiliencePolicy> _logger;
public ResiliencePolicy(
ICircuitBreakerRegistry circuitBreakers,
RetryPolicy retryPolicy,
IOptions<ResilienceConfig> config,
ILogger<ResiliencePolicy> logger)
{
_circuitBreakers = circuitBreakers;
_retryPolicy = retryPolicy;
_config = config.Value;
_logger = logger;
}
public async Task<ResponsePayload> ExecuteAsync(
string serviceName,
Func<CancellationToken, Task<ResponsePayload>> operation,
CancellationToken cancellationToken)
{
var circuitBreaker = _circuitBreakers.GetOrCreate(serviceName);
// Check circuit breaker
if (!await circuitBreaker.AllowRequestAsync(cancellationToken))
{
_logger.LogWarning("Circuit breaker {Name} is open, rejecting request", serviceName);
return _config.FallbackResponse ?? new ResponsePayload
{
StatusCode = 503,
Headers = new Dictionary<string, string>
{
["X-Circuit-Breaker"] = "open",
["Retry-After"] = "30"
},
Body = Encoding.UTF8.GetBytes(JsonSerializer.Serialize(new
{
error = "Service temporarily unavailable",
service = serviceName
})),
IsFinalChunk = true
};
}
try
{
// Execute with retry
var response = await _retryPolicy.ExecuteAsync(operation, cancellationToken);
// Record result
if (IsSuccess(response))
{
circuitBreaker.RecordSuccess();
}
else if (IsFailure(response))
{
circuitBreaker.RecordFailure();
}
return response;
}
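catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
// Caller-initiated cancellation says nothing about service health
throw;
}
catch (Exception)
{
circuitBreaker.RecordFailure();
throw;
}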
}
private bool IsSuccess(ResponsePayload response)
{
return response.StatusCode >= 200 && response.StatusCode < 400;
}
private bool IsFailure(ResponsePayload response)
{
return _config.CircuitBreaker.FailureStatusCodes.Contains(response.StatusCode);
}
}
public class ResilienceConfig
{
public CircuitBreakerConfig CircuitBreaker { get; set; } = new();
public RetryPolicyConfig Retry { get; set; } = new();
public ResponsePayload? FallbackResponse { get; set; }
}
```
---
## Circuit Breaker Registry
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Registry of circuit breakers per service.
/// </summary>
public interface ICircuitBreakerRegistry
{
CircuitBreaker GetOrCreate(string name);
IReadOnlyDictionary<string, CircuitBreaker> GetAll();
void Reset(string name);
void ResetAll();
}
public sealed class CircuitBreakerRegistry : ICircuitBreakerRegistry
{
private readonly ConcurrentDictionary<string, CircuitBreaker> _breakers = new();
private readonly CircuitBreakerConfig _config;
private readonly ILoggerFactory _loggerFactory;
public CircuitBreakerRegistry(
IOptions<CircuitBreakerConfig> config,
ILoggerFactory loggerFactory)
{
_config = config.Value;
_loggerFactory = loggerFactory;
}
public CircuitBreaker GetOrCreate(string name)
{
return _breakers.GetOrAdd(name, n =>
{
var logger = _loggerFactory.CreateLogger<CircuitBreaker>();
return new CircuitBreaker(n, _config, logger);
});
}
public IReadOnlyDictionary<string, CircuitBreaker> GetAll()
{
return _breakers;
}
public void Reset(string name)
{
// Dropping the entry is enough: GetOrCreate builds a fresh breaker on the next request
_breakers.TryRemove(name, out _);
}
public void ResetAll()
{
_breakers.Clear();
}
}
```
---
## Bulkhead Pattern
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Bulkhead pattern - limits concurrent requests to a service.
/// </summary>
public sealed class Bulkhead
{
private readonly SemaphoreSlim _semaphore;
private readonly BulkheadConfig _config;
private readonly string _name;
private int _queuedRequests;
public string Name => _name;
public int ActiveRequests => _config.MaxConcurrency - _semaphore.CurrentCount;
public int QueuedRequests => _queuedRequests;
public Bulkhead(string name, BulkheadConfig config)
{
_name = name;
_config = config;
_semaphore = new SemaphoreSlim(config.MaxConcurrency, config.MaxConcurrency);
}
/// <summary>
/// Acquires a slot in the bulkhead.
/// </summary>
public async Task<IDisposable?> AcquireAsync(CancellationToken cancellationToken)
{
var queued = Interlocked.Increment(ref _queuedRequests);
if (queued > _config.MaxQueueSize)
{
Interlocked.Decrement(ref _queuedRequests);
return null; // Reject immediately
}
try
{
var acquired = await _semaphore.WaitAsync(_config.QueueTimeout, cancellationToken);
Interlocked.Decrement(ref _queuedRequests);
if (!acquired)
{
return null;
}
return new BulkheadLease(_semaphore);
}
catch
{
Interlocked.Decrement(ref _queuedRequests);
throw;
}
}
private sealed class BulkheadLease : IDisposable
{
private readonly SemaphoreSlim _semaphore;
private bool _disposed;
public BulkheadLease(SemaphoreSlim semaphore)
{
_semaphore = semaphore;
}
public void Dispose()
{
if (!_disposed)
{
_semaphore.Release();
_disposed = true;
}
}
}
}
public class BulkheadConfig
{
public int MaxConcurrency { get; set; } = 100;
public int MaxQueueSize { get; set; } = 50;
public TimeSpan QueueTimeout { get; set; } = TimeSpan.FromSeconds(10);
}
```
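
`AcquireAsync` can return `null` on a queue timeout because `SemaphoreSlim.WaitAsync(TimeSpan)` reports a timeout by returning `false` rather than throwing. The sketch below isolates that behavior with a single-slot semaphore; `BulkheadDemo` is an illustrative name and the 50 ms timeout is arbitrary. Callers of the real `Bulkhead` would check the returned lease for `null` (and typically answer 503) before wrapping the protected call in a `using`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BulkheadDemo
{
    // Tries to take the only slot twice; the second attempt waits 50 ms and gives up.
    public static async Task<(bool First, bool Second)> RunAsync()
    {
        using var slots = new SemaphoreSlim(1, 1);
        var first = await slots.WaitAsync(TimeSpan.FromMilliseconds(50));
        var second = await slots.WaitAsync(TimeSpan.FromMilliseconds(50));
        if (first)
        {
            slots.Release();
        }
        return (first, second);
    }
}
```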
---
## Resilience Middleware
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Middleware that applies resilience policies to requests.
/// </summary>
public sealed class ResilienceMiddleware
{
private readonly RequestDelegate _next;
private readonly IResiliencePolicy _policy;
public ResilienceMiddleware(RequestDelegate next, IResiliencePolicy policy)
{
_next = next;
_policy = policy;
}
public async Task InvokeAsync(HttpContext context)
{
// Get target service from route data
var serviceName = context.GetRouteValue("service")?.ToString();
if (string.IsNullOrEmpty(serviceName))
{
await _next(context);
return;
}
try
{
await _next(context);
}
catch (Exception ex) when (IsTransientException(ex) && !context.Response.HasStarted)
{
// Convert to 503 with retry information
context.Response.StatusCode = 503;
context.Response.Headers["Retry-After"] = "30";
await context.Response.WriteAsJsonAsync(new
{
error = "Service temporarily unavailable",
retryAfter = 30
});
}
}
private bool IsTransientException(Exception ex)
{
return ex is TimeoutException or
HttpRequestException or
TaskCanceledException;
}
}
```
---
## Service Registration
```csharp
namespace StellaOps.Router.Resilience;
public static class ResilienceExtensions
{
    public static IServiceCollection AddStellaResilience(
        this IServiceCollection services,
        IConfiguration configuration)
    {
        services.Configure<ResilienceConfig>(configuration.GetSection("Resilience"));
        services.Configure<CircuitBreakerConfig>(configuration.GetSection("Resilience:CircuitBreaker"));
        services.Configure<RetryPolicyConfig>(configuration.GetSection("Resilience:Retry"));
        services.Configure<BulkheadConfig>(configuration.GetSection("Resilience:Bulkhead"));

        services.AddSingleton<ICircuitBreakerRegistry, CircuitBreakerRegistry>();

        // RetryPolicy takes a RetryPolicyConfig directly (not IOptions<T>), so the
        // container cannot construct it unaided; unwrap the bound options here.
        services.AddSingleton(sp => new RetryPolicy(
            sp.GetRequiredService<IOptions<RetryPolicyConfig>>().Value,
            sp.GetRequiredService<ILogger<RetryPolicy>>()));

        services.AddSingleton<IResiliencePolicy, ResiliencePolicy>();
        return services;
    }
}
```
---
## YAML Configuration
```yaml
Resilience:
  CircuitBreaker:
    FailureThreshold: 5
    SamplingDuration: "00:00:30"
    BreakDuration: "00:00:30"
    MinimumThroughput: 10
    FailureRatioThreshold: 0.5
    FailureStatusCodes:
      - 500
      - 502
      - 503
      - 504
  Retry:
    MaxRetries: 3
    InitialDelay: "00:00:00.100"
    MaxDelay: "00:00:10"
    BackoffMultiplier: 2.0
    UseJitter: true
    MaxJitterPercent: 0.25
    RetryableStatusCodes:
      - 408
      - 429
      - 502
      - 503
      - 504
  Bulkhead:
    MaxConcurrency: 100
    MaxQueueSize: 50
    QueueTimeout: "00:00:10"
```
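
One plausible reading of `MinimumThroughput` and `FailureRatioThreshold` is sketched below: the breaker trips only once the sampling window has seen enough requests and the failure ratio has reached the threshold. This rule is an assumption made for illustration; the actual `CircuitBreaker` may combine these settings differently (for example with `FailureThreshold`), and `TripDecision` is a hypothetical name. Under this rule and the values above, 4 failures out of 5 requests do not trip the breaker (too few samples), while 5 failures out of 10 do.

```csharp
using System;

public static class TripDecision
{
    // Hypothetical rule: require enough traffic in the window,
    // then compare the failure ratio against the threshold.
    public static bool ShouldTrip(
        int successes, int failures, int minimumThroughput, double failureRatioThreshold)
    {
        var total = successes + failures;
        if (total < minimumThroughput)
        {
            return false; // too few samples to judge
        }
        return (double)failures / total >= failureRatioThreshold;
    }
}
```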
---
## Deliverables
1. `StellaOps.Router.Resilience/CircuitBreaker.cs`
2. `StellaOps.Router.Resilience/CircuitBreakerConfig.cs`
3. `StellaOps.Router.Resilience/ICircuitBreakerRegistry.cs`
4. `StellaOps.Router.Resilience/CircuitBreakerRegistry.cs`
5. `StellaOps.Router.Resilience/RetryPolicy.cs`
6. `StellaOps.Router.Resilience/RetryPolicyConfig.cs`
7. `StellaOps.Router.Resilience/IResiliencePolicy.cs`
8. `StellaOps.Router.Resilience/ResiliencePolicy.cs`
9. `StellaOps.Router.Resilience/Bulkhead.cs`
10. `StellaOps.Router.Gateway/ResilienceMiddleware.cs`
11. Circuit breaker state transition tests
12. Retry policy tests
13. Bulkhead tests
---
## Next Step
Proceed to [Step 25: Configuration Hot-Reload](25-Step.md) to implement dynamic configuration updates.

754
docs/router/25-Step.md Normal file

@@ -0,0 +1,754 @@
# Step 25: Configuration Hot-Reload
**Phase 7: Testing & Documentation**
**Estimated Complexity:** Medium
**Dependencies:** All previous configuration steps
---
## Overview
Configuration hot-reload enables dynamic updates to router and microservice configuration without restarts. This includes route definitions, rate limits, circuit breaker settings, and JWKS rotation.
---
## Goals
1. Support YAML configuration hot-reload
2. Implement file watcher for configuration changes
3. Provide atomic configuration updates
4. Support validation before applying changes
5. Enable rollback on invalid configuration
---
## Configuration Watcher
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Watches configuration files for changes and triggers reloads.
/// </summary>
public sealed class ConfigurationWatcher : IHostedService, IDisposable
{
private readonly IConfiguration _configuration;
private readonly IOptionsMonitor<RouterConfig> _routerConfig;
private readonly ILogger<ConfigurationWatcher> _logger;
private readonly List<FileSystemWatcher> _watchers = new();
private readonly Subject<ConfigurationChange> _changes = new();
private readonly TimeSpan _debounceInterval = TimeSpan.FromMilliseconds(500);
private readonly ConcurrentDictionary<string, DateTimeOffset> _lastChange = new();
public IObservable<ConfigurationChange> Changes => _changes;
public ConfigurationWatcher(
IConfiguration configuration,
IOptionsMonitor<RouterConfig> routerConfig,
ILogger<ConfigurationWatcher> logger)
{
_configuration = configuration;
_routerConfig = routerConfig;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
// Watch all YAML configuration files
var configPaths = GetConfigurationFilePaths();
foreach (var path in configPaths)
{
if (!File.Exists(path))
continue;
var directory = Path.GetDirectoryName(path)!;
var fileName = Path.GetFileName(path);
var watcher = new FileSystemWatcher(directory)
{
Filter = fileName,
NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size,
EnableRaisingEvents = true
};
watcher.Changed += OnConfigurationFileChanged;
_watchers.Add(watcher);
_logger.LogInformation("Watching configuration file: {Path}", path);
}
// Also subscribe to IOptionsMonitor for programmatic changes
_routerConfig.OnChange(config =>
{
_changes.OnNext(new ConfigurationChange
{
Section = "Router",
ChangeType = ChangeType.Modified,
Timestamp = DateTimeOffset.UtcNow
});
});
return Task.CompletedTask;
}
private void OnConfigurationFileChanged(object sender, FileSystemEventArgs e)
{
// Debounce rapid changes
var now = DateTimeOffset.UtcNow;
if (_lastChange.TryGetValue(e.FullPath, out var lastChange) &&
now - lastChange < _debounceInterval)
{
return;
}
_lastChange[e.FullPath] = now;
_logger.LogInformation("Configuration file changed: {Path}", e.FullPath);
// Delay to allow file writes to complete
Task.Delay(100).ContinueWith(_ =>
{
try
{
// Validate configuration before notifying
if (ValidateConfiguration(e.FullPath))
{
_changes.OnNext(new ConfigurationChange
{
Section = DetermineSectionFromPath(e.FullPath),
ChangeType = ChangeType.Modified,
FilePath = e.FullPath,
Timestamp = now
});
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to process configuration change for {Path}", e.FullPath);
}
});
}
private bool ValidateConfiguration(string path)
{
try
{
var yaml = File.ReadAllText(path);
var deserializer = new DeserializerBuilder()
.WithNamingConvention(CamelCaseNamingConvention.Instance)
.Build();
// Try to deserialize to validate YAML syntax
var doc = deserializer.Deserialize<Dictionary<string, object>>(yaml);
return doc != null;
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Invalid configuration file: {Path}", path);
return false;
}
}
private string DetermineSectionFromPath(string path)
{
var fileName = Path.GetFileNameWithoutExtension(path).ToLowerInvariant();
return fileName switch
{
"router" => "Router",
"routes" => "Routes",
"ratelimits" => "RateLimits",
"endpoints" => "Endpoints",
_ => "Unknown"
};
}
private IEnumerable<string> GetConfigurationFilePaths()
{
// Get paths from configuration providers
var paths = new List<string>();
if (_configuration is IConfigurationRoot root)
{
foreach (var provider in root.Providers)
{
if (provider is FileConfigurationProvider fileProvider)
{
var source = fileProvider.Source;
if (source.FileProvider?.GetFileInfo(source.Path ?? "") is { Exists: true } fileInfo)
{
paths.Add(fileInfo.PhysicalPath ?? "");
}
}
}
}
return paths.Where(p => !string.IsNullOrEmpty(p));
}
public Task StopAsync(CancellationToken cancellationToken)
{
foreach (var watcher in _watchers)
{
watcher.EnableRaisingEvents = false;
}
return Task.CompletedTask;
}
public void Dispose()
{
foreach (var watcher in _watchers)
{
watcher.Dispose();
}
_changes.Dispose();
}
}
public sealed class ConfigurationChange
{
public string Section { get; init; } = "";
public ChangeType ChangeType { get; init; }
public string? FilePath { get; init; }
public DateTimeOffset Timestamp { get; init; }
}
public enum ChangeType
{
Added,
Modified,
Removed
}
```
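
The check at the top of `OnConfigurationFileChanged` accepts the first event for a path and ignores any follow-up events that arrive within `_debounceInterval`. This matters because `FileSystemWatcher` commonly raises several `Changed` events for a single save. The sketch below isolates that check with the timestamp passed in, which makes it straightforward to unit test; `ChangeThrottle` and `ShouldProcess` are illustrative names, not part of the router.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class ChangeThrottle
{
    private readonly TimeSpan _interval;
    private readonly ConcurrentDictionary<string, DateTimeOffset> _lastAccepted = new();

    public ChangeThrottle(TimeSpan interval) => _interval = interval;

    // Returns true when the event should be processed, false when it falls inside
    // the quiet interval that follows the last accepted event for the same path.
    public bool ShouldProcess(string path, DateTimeOffset now)
    {
        if (_lastAccepted.TryGetValue(path, out var last) && now - last < _interval)
        {
            return false;
        }
        _lastAccepted[path] = now;
        return true;
    }
}
```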
---
## Route Configuration Reloader
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Handles hot-reload of route configurations.
/// </summary>
public sealed class RouteConfigurationReloader : IHostedService
{
private readonly ConfigurationWatcher _watcher;
private readonly IRouteRegistry _routeRegistry;
private readonly ILogger<RouteConfigurationReloader> _logger;
private IDisposable? _subscription;
public RouteConfigurationReloader(
ConfigurationWatcher watcher,
IRouteRegistry routeRegistry,
ILogger<RouteConfigurationReloader> logger)
{
_watcher = watcher;
_routeRegistry = routeRegistry;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
_subscription = _watcher.Changes
.Where(c => c.Section == "Routes")
.Subscribe(OnRoutesChanged);
return Task.CompletedTask;
}
private void OnRoutesChanged(ConfigurationChange change)
{
_logger.LogInformation("Reloading routes from {Path}", change.FilePath);
try
{
_routeRegistry.Reload();
_logger.LogInformation("Routes reloaded successfully");
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to reload routes, keeping previous configuration");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_subscription?.Dispose();
return Task.CompletedTask;
}
}
```
---
## Rate Limit Configuration Reloader
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Handles hot-reload of rate limit configurations.
/// </summary>
public sealed class RateLimitConfigurationReloader : IHostedService
{
private readonly ConfigurationWatcher _watcher;
private readonly IRateLimiter _rateLimiter;
private readonly IOptionsMonitor<RateLimitConfig> _config;
private readonly ILogger<RateLimitConfigurationReloader> _logger;
private IDisposable? _subscription;
public RateLimitConfigurationReloader(
ConfigurationWatcher watcher,
IRateLimiter rateLimiter,
IOptionsMonitor<RateLimitConfig> config,
ILogger<RateLimitConfigurationReloader> logger)
{
_watcher = watcher;
_rateLimiter = rateLimiter;
_config = config;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
_subscription = _watcher.Changes
.Where(c => c.Section == "RateLimits")
.Subscribe(OnRateLimitsChanged);
_config.OnChange(OnRateLimitConfigChanged);
return Task.CompletedTask;
}
private void OnRateLimitsChanged(ConfigurationChange change)
{
_logger.LogInformation("Rate limit configuration changed, applying updates");
ApplyRateLimitChanges();
}
private void OnRateLimitConfigChanged(RateLimitConfig config)
{
_logger.LogInformation("Rate limit options changed, applying updates");
ApplyRateLimitChanges();
}
private void ApplyRateLimitChanges()
{
try
{
// Rate limiter will pick up new config from IOptionsMonitor
// Clear any cached tier information
if (_rateLimiter is ICacheableRateLimiter cacheable)
{
cacheable.ClearCache();
}
_logger.LogInformation("Rate limit configuration applied successfully");
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to apply rate limit changes");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_subscription?.Dispose();
return Task.CompletedTask;
}
}
public interface ICacheableRateLimiter
{
void ClearCache();
}
```
---
## JWKS Hot-Reload
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Handles JWKS rotation and cache refresh.
/// </summary>
public sealed class JwksReloader : IHostedService
{
private readonly IJwksCache _jwksCache;
private readonly JwtAuthenticationConfig _config;
private readonly ILogger<JwksReloader> _logger;
private Timer? _refreshTimer;
public JwksReloader(
IJwksCache jwksCache,
IOptions<JwtAuthenticationConfig> config,
ILogger<JwksReloader> logger)
{
_jwksCache = jwksCache;
_config = config.Value;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
// Periodic refresh of JWKS
var interval = _config.JwksRefreshInterval;
_refreshTimer = new Timer(
RefreshJwks,
null,
interval,
interval);
_logger.LogInformation(
"JWKS refresh scheduled every {Interval}",
interval);
return Task.CompletedTask;
}
private async void RefreshJwks(object? state)
{
try
{
_logger.LogDebug("Refreshing JWKS cache");
await _jwksCache.RefreshAsync(CancellationToken.None);
_logger.LogDebug("JWKS cache refreshed successfully");
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to refresh JWKS cache, will retry");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_refreshTimer?.Dispose();
return Task.CompletedTask;
}
}
```
---
## Configuration Validation
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Validates configuration before applying changes.
/// </summary>
public interface IConfigurationValidator
{
ValidationResult Validate<T>(T config) where T : class;
}
public sealed class ConfigurationValidator : IConfigurationValidator
{
private readonly ILogger<ConfigurationValidator> _logger;
public ConfigurationValidator(ILogger<ConfigurationValidator> logger)
{
_logger = logger;
}
public ValidationResult Validate<T>(T config) where T : class
{
var errors = new List<string>();
// Use data annotations validation
var context = new ValidationContext(config);
var results = new List<System.ComponentModel.DataAnnotations.ValidationResult>();
if (!Validator.TryValidateObject(config, context, results, validateAllProperties: true))
{
errors.AddRange(results.Select(r => r.ErrorMessage ?? "Unknown validation error"));
}
// Type-specific validation
errors.AddRange(config switch
{
RouterConfig router => ValidateRouterConfig(router),
RateLimitConfig rateLimit => ValidateRateLimitConfig(rateLimit),
_ => Enumerable.Empty<string>()
});
if (errors.Any())
{
_logger.LogWarning(
"Configuration validation failed: {Errors}",
string.Join(", ", errors));
}
return new ValidationResult
{
IsValid = !errors.Any(),
Errors = errors
};
}
private IEnumerable<string> ValidateRouterConfig(RouterConfig config)
{
if (config.MaxPayloadSize <= 0)
yield return "MaxPayloadSize must be positive";
if (config.RequestTimeout <= TimeSpan.Zero)
yield return "RequestTimeout must be positive";
}
private IEnumerable<string> ValidateRateLimitConfig(RateLimitConfig config)
{
foreach (var (tier, limits) in config.Tiers)
{
if (limits.RequestsPerMinute <= 0)
yield return $"Tier {tier}: RequestsPerMinute must be positive";
}
}
}
public sealed class ValidationResult
{
public bool IsValid { get; init; }
public IReadOnlyList<string> Errors { get; init; } = Array.Empty<string>();
}
```
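
The data-annotations pass in `Validate<T>` can be tried in isolation. The sketch below uses a hypothetical `SampleLimits` type (the real `RateLimitConfig` is not shown in this step) and a helper named `AnnotationCheck`, both invented for illustration. Two details are worth knowing: with `validateAllProperties: false` only `[Required]` attributes are evaluated, and `Validator.TryValidateObject` does not descend into nested objects or collections, which is one reason the validator above adds type-specific checks such as the per-tier loop.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical config type used only for this example.
public sealed class SampleLimits
{
    [Range(1, int.MaxValue, ErrorMessage = "RequestsPerMinute must be positive")]
    public int RequestsPerMinute { get; set; }
}

public static class AnnotationCheck
{
    // Runs attribute-based validation and returns the error messages (empty when valid).
    public static IReadOnlyList<string> Errors(object config)
    {
        var results = new List<ValidationResult>();
        Validator.TryValidateObject(
            config, new ValidationContext(config), results, validateAllProperties: true);

        var messages = new List<string>();
        foreach (var result in results)
        {
            messages.Add(result.ErrorMessage ?? "Unknown validation error");
        }
        return messages;
    }
}
```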
---
## Atomic Configuration Update
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Provides atomic configuration updates with rollback support.
/// </summary>
public sealed class AtomicConfigurationUpdater
{
    private readonly IConfigurationValidator _validator;
    private readonly ILogger<AtomicConfigurationUpdater> _logger;

    // SemaphoreSlim rather than ReaderWriterLockSlim: the latter is thread-affine and
    // must not be held across an await, because the continuation may resume on a
    // different thread than the one that entered the lock.
    private readonly SemaphoreSlim _lock = new(1, 1);

    public AtomicConfigurationUpdater(
        IConfigurationValidator validator,
        ILogger<AtomicConfigurationUpdater> logger)
    {
        _validator = validator;
        _logger = logger;
    }

    /// <summary>
    /// Atomically updates configuration with validation and rollback.
    /// </summary>
    public async Task<bool> UpdateAsync<T>(
        T currentConfig,
        T newConfig,
        Func<T, Task> applyAction,
        Func<T, Task>? rollbackAction = null)
        where T : class
    {
        // Validate new configuration
        var validation = _validator.Validate(newConfig);
        if (!validation.IsValid)
        {
            _logger.LogWarning(
                "Configuration update rejected: {Errors}",
                string.Join(", ", validation.Errors));
            return false;
        }

        await _lock.WaitAsync();
        try
        {
            // Keep a reference to the current config for rollback
            var backup = currentConfig;
            try
            {
                await applyAction(newConfig);
                _logger.LogInformation("Configuration updated successfully");
                return true;
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Configuration update failed, rolling back");
                if (rollbackAction != null)
                {
                    try
                    {
                        await rollbackAction(backup);
                        _logger.LogInformation("Configuration rolled back successfully");
                    }
                    catch (Exception rollbackEx)
                    {
                        _logger.LogError(rollbackEx, "Rollback failed!");
                    }
                }
                return false;
            }
        }
        finally
        {
            _lock.Release();
        }
    }
}
```
---
## Configuration API Endpoints
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// API endpoints for configuration management.
/// </summary>
public static class ConfigurationEndpoints
{
public static IEndpointRouteBuilder MapConfigurationEndpoints(
this IEndpointRouteBuilder endpoints,
string basePath = "/api/config")
{
var group = endpoints.MapGroup(basePath)
.RequireAuthorization("admin");
group.MapGet("/", GetConfiguration);
group.MapGet("/{section}", GetConfigurationSection);
group.MapPost("/reload", ReloadConfiguration);
group.MapPost("/validate", ValidateConfiguration);
return endpoints;
}
private static IResult GetConfiguration(
IConfiguration configuration)
{
var sections = new Dictionary<string, object>();
foreach (var child in configuration.GetChildren())
{
sections[child.Key] = GetSectionValue(child);
}
return Results.Ok(sections);
}
private static object GetSectionValue(IConfigurationSection section)
{
var children = section.GetChildren().ToList();
if (!children.Any())
{
return section.Value ?? "";
}
if (children.All(c => int.TryParse(c.Key, out _)))
{
// Array
return children.Select(c => GetSectionValue(c)).ToList();
}
// Object
return children.ToDictionary(c => c.Key, c => GetSectionValue(c));
}
private static IResult GetConfigurationSection(
string section,
IConfiguration configuration)
{
var configSection = configuration.GetSection(section);
if (!configSection.Exists())
{
return Results.NotFound(new { error = $"Section '{section}' not found" });
}
return Results.Ok(GetSectionValue(configSection));
}
private static IResult ReloadConfiguration(
IConfiguration configuration,
ILogger<ConfigurationWatcher> logger)
{
logger.LogInformation("Manual configuration reload triggered");
// Re-read every provider; IOptionsMonitor subscribers are notified of any changes
if (configuration is IConfigurationRoot root)
{
root.Reload();
}
return Results.Ok(new { message = "Configuration reload triggered" });
}
private static async Task<IResult> ValidateConfiguration(
HttpRequest request,
IConfigurationValidator validator)
{
var body = await request.ReadFromJsonAsync<Dictionary<string, object>>();
if (body == null)
{
return Results.BadRequest(new { error = "Invalid request body" });
}
// Basic syntax validation
return Results.Ok(new { valid = true });
}
}
```
---
## YAML Configuration
```yaml
Configuration:
  # Enable hot-reload
  HotReload:
    Enabled: true
    DebounceInterval: "00:00:00.500"
    ValidateBeforeApply: true
  # Files to watch
  WatchPaths:
    - "/etc/stellaops/router.yaml"
    - "/etc/stellaops/routes.yaml"
    - "/etc/stellaops/ratelimits.yaml"
  # JWKS refresh settings
  Jwks:
    RefreshInterval: "00:05:00"
    RefreshOnError: true
    MaxRetries: 3
```
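
The interval values above use the standard .NET `TimeSpan` text form `hh:mm:ss[.fff]`, which is also what the configuration binder expects for `TimeSpan` properties. A quick way to sanity-check a value before putting it in a file is to parse it directly; `DurationFormat` below is an illustrative helper, not part of the router.

```csharp
using System;
using System.Globalization;

public static class DurationFormat
{
    // Parses the hh:mm:ss[.fff] form used in the YAML files, independent of the machine culture.
    public static TimeSpan Parse(string value) =>
        TimeSpan.Parse(value, CultureInfo.InvariantCulture);
}
```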
---
## Deliverables
1. `StellaOps.Router.Configuration/ConfigurationWatcher.cs`
2. `StellaOps.Router.Configuration/RouteConfigurationReloader.cs`
3. `StellaOps.Router.Configuration/RateLimitConfigurationReloader.cs`
4. `StellaOps.Router.Configuration/JwksReloader.cs`
5. `StellaOps.Router.Configuration/IConfigurationValidator.cs`
6. `StellaOps.Router.Configuration/ConfigurationValidator.cs`
7. `StellaOps.Router.Configuration/AtomicConfigurationUpdater.cs`
8. `StellaOps.Router.Gateway/ConfigurationEndpoints.cs`
9. Configuration reload tests
10. Validation tests
---
## Next Step
Proceed to [Step 26: End-to-End Testing](26-Step.md) to implement comprehensive integration tests.

683
docs/router/26-Step.md Normal file

@@ -0,0 +1,683 @@
# Step 26: End-to-End Testing
**Phase 7: Testing & Documentation**
**Estimated Complexity:** High
**Dependencies:** All implementation steps
---
## Overview
End-to-end testing validates the complete request flow from HTTP client through the gateway, transport layer, microservice, and back. Tests cover all handlers, authentication, rate limiting, streaming, and failure scenarios.
---
## Goals
1. Validate complete request/response flow
2. Test all route handlers
3. Verify authentication and authorization
4. Test rate limiting behavior
5. Validate streaming and large payloads
6. Test failure scenarios and resilience
---
## Test Infrastructure
```csharp
namespace StellaOps.Router.Tests;
/// <summary>
/// End-to-end test fixture providing gateway and microservice hosts.
/// </summary>
public sealed class EndToEndTestFixture : IAsyncLifetime
{
private IHost? _gatewayHost;
private IHost? _microserviceHost;
private InMemoryTransportHub? _transportHub;
public HttpClient GatewayClient { get; private set; } = null!;
public string GatewayBaseUrl { get; private set; } = null!;
public async Task InitializeAsync()
{
// Shared transport hub for InMemory testing
_transportHub = new InMemoryTransportHub(
NullLoggerFactory.Instance.CreateLogger<InMemoryTransportHub>());
// Start gateway
_gatewayHost = await CreateGatewayHostAsync();
await _gatewayHost.StartAsync();
GatewayBaseUrl = "http://localhost:5000";
GatewayClient = new HttpClient { BaseAddress = new Uri(GatewayBaseUrl) };
// Start test microservice
_microserviceHost = await CreateMicroserviceHostAsync();
await _microserviceHost.StartAsync();
// Wait for connection
await Task.Delay(500);
}
private async Task<IHost> CreateGatewayHostAsync()
{
return Host.CreateDefaultBuilder()
.ConfigureWebHostDefaults(web =>
{
web.UseUrls("http://localhost:5000");
web.ConfigureServices((context, services) =>
{
services.AddSingleton(_transportHub!);
services.AddStellaGateway(context.Configuration);
services.AddInMemoryTransport();
// Use in-memory rate limiter
services.AddSingleton<IRateLimiter, InMemoryRateLimiter>();
// Mock Authority
services.AddSingleton<IAuthorityClient, MockAuthorityClient>();
});
web.Configure(app =>
{
app.UseRouting();
app.UseStellaGateway();
app.UseEndpoints(endpoints =>
{
endpoints.MapStellaRoutes();
});
});
})
.Build();
}
private async Task<IHost> CreateMicroserviceHostAsync()
{
var host = StellaMicroserviceBuilder
.Create("test-service")
.ConfigureServices(services =>
{
services.AddSingleton(_transportHub!);
services.AddScoped<TestEndpointHandler>();
})
.ConfigureTransport(t => t.Default = "InMemory")
.ConfigureEndpoints(e =>
{
e.AutoDiscover = true;
e.BasePath = "/api";
})
.Build();
return (IHost)host;
}
public async Task DisposeAsync()
{
GatewayClient.Dispose();
if (_microserviceHost != null)
{
await _microserviceHost.StopAsync();
_microserviceHost.Dispose();
}
if (_gatewayHost != null)
{
await _gatewayHost.StopAsync();
_gatewayHost.Dispose();
}
_transportHub?.Dispose();
}
}
```
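
The fixed `Task.Delay(500)` in `InitializeAsync` can be too short on a loaded CI agent and is wasted time on a fast one. A common alternative is to poll a readiness condition with a timeout, as sketched below. `TestWait.UntilAsync` is an illustrative helper; which condition to poll (for instance, whether the transport hub reports a connected microservice) is an assumption, since the hub's API is not shown in this step.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class TestWait
{
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition was met, false on timeout.
    public static async Task<bool> UntilAsync(
        Func<bool> condition, TimeSpan timeout, TimeSpan pollInterval)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            if (condition())
            {
                return true;
            }
            if (clock.Elapsed >= timeout)
            {
                return false;
            }
            await Task.Delay(pollInterval);
        }
    }
}
```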
---
## Test Endpoint Handler
```csharp
namespace StellaOps.Router.Tests;
[StellaEndpoint(BasePath = "/test")]
public class TestEndpointHandler : EndpointHandler
{
[StellaGet("echo")]
public ResponsePayload Echo()
{
return Ok(new
{
method = Context.Method,
path = Context.Path,
query = Context.Query.ToDictionary(q => q.Key, q => q.Value.ToString()),
headers = Context.Headers.ToDictionary(h => h.Key, h => h.Value.ToString()),
claims = Context.Claims
});
}
[StellaPost("echo")]
public async Task<ResponsePayload> EchoBody()
{
var body = Context.ReadBodyAsString();
return Ok(new { body });
}
[StellaGet("items/{id}")]
public ResponsePayload GetItem([FromPath] string id)
{
return Ok(new { id });
}
[StellaGet("slow")]
public async Task<ResponsePayload> SlowEndpoint(CancellationToken cancellationToken)
{
await Task.Delay(5000, cancellationToken);
return Ok(new { completed = true });
}
[StellaGet("error")]
public ResponsePayload ThrowError()
{
throw new InvalidOperationException("Test error");
}
[StellaGet("status/{code}")]
public ResponsePayload ReturnStatus([FromPath] int code)
{
return Response().WithStatus(code).WithJson(new { statusCode = code }).Build();
}
[StellaGet("protected")]
[StellaAuth(RequiredClaims = new[] { "admin" })]
public ResponsePayload ProtectedEndpoint()
{
return Ok(new { message = "Access granted" });
}
[StellaPost("upload")]
public ResponsePayload HandleUpload()
{
var size = Context.ContentLength ?? Context.RawBody?.Length ?? 0;
return Ok(new { bytesReceived = size });
}
[StellaGet("stream")]
public ResponsePayload StreamResponse()
{
var data = new byte[1024 * 1024]; // 1MB
Random.Shared.NextBytes(data);
return Response()
.WithBytes(data, "application/octet-stream")
.Build();
}
}
```
---
## Basic Request/Response Tests
```csharp
namespace StellaOps.Router.Tests;
public class BasicRequestResponseTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public BasicRequestResponseTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Get_Echo_ReturnsRequestDetails()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/echo");
var content = await response.Content.ReadFromJsonAsync<EchoResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal("GET", content?.Method);
Assert.Equal("/api/test/echo", content?.Path);
}
[Fact]
public async Task Post_Echo_ReturnsBody()
{
// Arrange
var client = _fixture.GatewayClient;
var body = new StringContent("{\"test\": true}", Encoding.UTF8, "application/json");
// Act
var response = await client.PostAsync("/api/test/echo", body);
var content = await response.Content.ReadFromJsonAsync<EchoBodyResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Contains("test", content?.Body);
}
[Fact]
public async Task Get_WithPathParameter_ExtractsParameter()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/items/12345");
var content = await response.Content.ReadFromJsonAsync<ItemResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal("12345", content?.Id);
}
[Fact]
public async Task Get_NonExistentPath_Returns404()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/nonexistent");
// Assert
Assert.Equal(HttpStatusCode.NotFound, response.StatusCode);
}
private record EchoResponse(
string Method,
string Path,
Dictionary<string, string> Query,
Dictionary<string, string> Claims);
private record EchoBodyResponse(string Body);
private record ItemResponse(string Id);
}
```
---
## Authentication Tests
```csharp
namespace StellaOps.Router.Tests;
public class AuthenticationTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public AuthenticationTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Protected_WithoutToken_Returns401()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/protected");
// Assert
Assert.Equal(HttpStatusCode.Unauthorized, response.StatusCode);
}
[Fact]
public async Task Protected_WithValidToken_Returns200()
{
// Arrange
var client = _fixture.GatewayClient;
var token = CreateTestToken(new Dictionary<string, string> { ["admin"] = "true" });
// Set the header on the request; the fixture's client is shared across tests
using var request = new HttpRequestMessage(HttpMethod.Get, "/api/test/protected");
request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
// Act
var response = await client.SendAsync(request);
// Assert
Assert.True(response.IsSuccessStatusCode);
}
[Fact]
public async Task Protected_WithInvalidToken_Returns401()
{
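// Arrange
var client = _fixture.GatewayClient;
// Set the header on the request; the fixture's client is shared across tests
using var request = new HttpRequestMessage(HttpMethod.Get, "/api/test/protected");
request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", "invalid-token");
// Act
var response = await client.SendAsync(request);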
// Assert
Assert.Equal(HttpStatusCode.Unauthorized, response.StatusCode);
}
[Fact]
public async Task Protected_WithMissingClaim_Returns403()
{
// Arrange
var client = _fixture.GatewayClient;
var token = CreateTestToken(new Dictionary<string, string> { ["user"] = "true" }); // No admin claim
// Set the header on the request; the fixture's client is shared across tests
using var request = new HttpRequestMessage(HttpMethod.Get, "/api/test/protected");
request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
// Act
var response = await client.SendAsync(request);
// Assert
Assert.Equal(HttpStatusCode.Forbidden, response.StatusCode);
}
private string CreateTestToken(Dictionary<string, string> claims)
{
// Create a test JWT (would use test key in real implementation)
var handler = new JwtSecurityTokenHandler();
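// HS256 needs a key of at least 256 bits (32 bytes); shorter keys are rejected by recent Microsoft.IdentityModel versions
var key = new SymmetricSecurityKey(Encoding.UTF8.GetBytes("test-key-for-testing-only-1234567890"));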
var creds = new SigningCredentials(key, SecurityAlgorithms.HmacSha256);
var claimsList = claims.Select(c => new Claim(c.Key, c.Value)).ToList();
claimsList.Add(new Claim("sub", "test-user"));
var token = new JwtSecurityToken(
issuer: "test",
audience: "test",
claims: claimsList,
expires: DateTime.UtcNow.AddHours(1),
signingCredentials: creds);
return handler.WriteToken(token);
}
}
```
---
## Rate Limiting Tests
```csharp
namespace StellaOps.Router.Tests;
public class RateLimitingTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public RateLimitingTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task RateLimit_ExceedingLimit_Returns429()
{
// Arrange
var client = _fixture.GatewayClient;
var tasks = new List<Task<HttpResponseMessage>>();
// Act - Send 100 requests quickly
for (int i = 0; i < 100; i++)
{
tasks.Add(client.GetAsync("/api/test/echo"));
}
var responses = await Task.WhenAll(tasks);
// Assert - Some should be rate limited
var rateLimited = responses.Count(r => r.StatusCode == HttpStatusCode.TooManyRequests);
Assert.True(rateLimited > 0, "Expected some requests to be rate limited");
}
[Fact]
public async Task RateLimit_Headers_ArePresent()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/echo");
// Assert
Assert.True(response.Headers.Contains("X-RateLimit-Limit"));
Assert.True(response.Headers.Contains("X-RateLimit-Remaining"));
}
[Fact]
public async Task RateLimit_PerUser_IsolatesUsers()
{
// Arrange
using var client1 = new HttpClient { BaseAddress = new Uri(_fixture.GatewayBaseUrl) };
using var client2 = new HttpClient { BaseAddress = new Uri(_fixture.GatewayBaseUrl) };
client1.DefaultRequestHeaders.Add("X-API-Key", "user1-key");
client2.DefaultRequestHeaders.Add("X-API-Key", "user2-key");
// Act - Send more requests than the free tier's 60 per minute so user1 is actually throttled
for (int i = 0; i < 100; i++)
{
(await client1.GetAsync("/api/test/echo")).Dispose();
}
// User2 should still have quota
var response = await client2.GetAsync("/api/test/echo");
// Assert
Assert.True(response.IsSuccessStatusCode);
}
}
```
---
## Timeout and Cancellation Tests
```csharp
namespace StellaOps.Router.Tests;
public class TimeoutAndCancellationTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public TimeoutAndCancellationTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Slow_Request_TimesOut()
{
// Arrange
using var client = new HttpClient
{
BaseAddress = new Uri(_fixture.GatewayBaseUrl),
Timeout = TimeSpan.FromSeconds(1)
};
// Act & Assert
await Assert.ThrowsAsync<TaskCanceledException>(
() => client.GetAsync("/api/test/slow"));
}
[Fact]
public async Task Cancelled_Request_PropagatesCancellation()
{
// Arrange
var client = _fixture.GatewayClient;
using var cts = new CancellationTokenSource();
// Act
var task = client.GetAsync("/api/test/slow", cts.Token);
await Task.Delay(100);
cts.Cancel();
// Assert
await Assert.ThrowsAsync<TaskCanceledException>(() => task);
}
}
```
---
## Streaming and Large Payload Tests
```csharp
namespace StellaOps.Router.Tests;
public class StreamingTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public StreamingTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task LargeUpload_Succeeds()
{
// Arrange
var client = _fixture.GatewayClient;
var data = new byte[1024 * 1024]; // 1MB
Random.Shared.NextBytes(data);
var content = new ByteArrayContent(data);
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
// Act
var response = await client.PostAsync("/api/test/upload", content);
var result = await response.Content.ReadFromJsonAsync<UploadResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal(data.Length, result?.BytesReceived);
}
[Fact]
public async Task LargeDownload_Succeeds()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/stream");
var data = await response.Content.ReadAsByteArrayAsync();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal(1024 * 1024, data.Length);
}
private record UploadResponse(long BytesReceived);
}
```
---
## Error Handling Tests
```csharp
namespace StellaOps.Router.Tests;
public class ErrorHandlingTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public ErrorHandlingTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Handler_Exception_Returns500()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/error");
// Assert
Assert.Equal(HttpStatusCode.InternalServerError, response.StatusCode);
}
[Fact]
public async Task Custom_StatusCode_IsPreserved()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/status/418");
// Assert
Assert.Equal((HttpStatusCode)418, response.StatusCode);
}
[Fact]
public async Task Error_Response_HasCorrectFormat()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/nonexistent");
var content = await response.Content.ReadFromJsonAsync<ErrorResponse>();
// Assert
Assert.NotNull(content?.Error);
}
private record ErrorResponse(string Error);
}
```
---
## YAML Configuration
```yaml
# Test configuration
Router:
Transports:
- Type: InMemory
Enabled: true
RateLimiting:
Enabled: true
DefaultTier: free
Tiers:
free:
RequestsPerMinute: 60
authenticated:
RequestsPerMinute: 600
Authentication:
Enabled: true
AllowAnonymous: false
TestMode: true
```
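
The tiers above can be read as per-minute quotas tracked per client key. The following is a minimal sketch of that idea using a fixed one-minute window. `TieredRateLimiter`, `TryAcquire`, and the injectable clock are hypothetical names chosen for illustration; they are not the Router's actual rate limiter, and the real implementation may use a different algorithm.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-window limiter mirroring the Tiers section of the YAML above.
public sealed class TieredRateLimiter
{
    private readonly IReadOnlyDictionary<string, int> _requestsPerMinute;
    private readonly Func<DateTimeOffset> _clock;
    private readonly Dictionary<string, (long Window, int Count)> _counters = new();
    private readonly object _gate = new();

    public TieredRateLimiter(
        IReadOnlyDictionary<string, int> requestsPerMinute,
        Func<DateTimeOffset>? clock = null)
    {
        _requestsPerMinute = requestsPerMinute;
        _clock = clock ?? (() => DateTimeOffset.UtcNow);
    }

    // Returns true if the request fits in the current one-minute window.
    // A false result is what a gateway would translate into HTTP 429.
    public bool TryAcquire(string clientKey, string tier)
    {
        var limit = _requestsPerMinute[tier];
        var window = _clock().ToUnixTimeSeconds() / 60;

        lock (_gate)
        {
            _counters.TryGetValue(clientKey, out var entry);
            if (entry.Window != window)
                entry = (window, 0); // new minute: reset the counter

            if (entry.Count >= limit)
                return false;

            _counters[clientKey] = (window, entry.Count + 1);
            return true;
        }
    }
}
```

Counters are keyed by client, which is the property the per-user isolation test relies on: exhausting one key's quota leaves other keys untouched. Injecting the clock keeps unit tests deterministic.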
---
## Deliverables
1. `StellaOps.Router.Tests/EndToEndTestFixture.cs`
2. `StellaOps.Router.Tests/TestEndpointHandler.cs`
3. `StellaOps.Router.Tests/BasicRequestResponseTests.cs`
4. `StellaOps.Router.Tests/AuthenticationTests.cs`
5. `StellaOps.Router.Tests/RateLimitingTests.cs`
6. `StellaOps.Router.Tests/TimeoutAndCancellationTests.cs`
7. `StellaOps.Router.Tests/StreamingTests.cs`
8. `StellaOps.Router.Tests/ErrorHandlingTests.cs`
9. Mock implementations for Authority, Rate Limiter
10. CI integration configuration
---
## Next Step
Proceed to [Step 27: Reference Example & Migration Skeleton](27-Step.md) to create example implementations.

1524
docs/router/27-Step.md Normal file

File diff suppressed because it is too large

755
docs/router/28-Step.md Normal file

@@ -0,0 +1,755 @@
# Step 28: Agent Process Guidelines
## Overview
This document provides comprehensive guidelines for AI agents (Claude, Copilot, etc.) implementing the Stella Router. It establishes conventions, patterns, and decision frameworks to ensure consistent, high-quality implementations across all phases.
## Goals
1. Define clear coding standards and patterns for Router implementation
2. Establish decision frameworks for common scenarios
3. Provide checklists for implementation quality
4. Document testing requirements and coverage expectations
5. Define commit and PR conventions
## Implementation Standards
### Code Organization
```
src/Router/
├── StellaOps.Router.Core/ # Core abstractions and contracts
│ ├── Abstractions/ # Interfaces
│ ├── Configuration/ # Config models
│ ├── Extensions/ # Extension methods
│ └── Primitives/ # Value types
├── StellaOps.Router.Gateway/ # Gateway implementation
│ ├── Routing/ # Route matching
│ ├── Handlers/ # Route handlers
│ ├── Pipeline/ # Request pipeline
│ └── Middleware/ # Gateway middleware
├── StellaOps.Router.Transport/ # Transport implementations
│ ├── InMemory/ # In-process transport
│ ├── Tcp/ # TCP transport
│ └── Tls/ # TLS transport
├── StellaOps.Router.Microservice/ # Microservice SDK
│ ├── Hosting/ # Host builder
│ ├── Endpoints/ # Endpoint handling
│ └── Context/ # Request context
├── StellaOps.Router.Security/ # Security components
│ ├── Jwt/ # JWT validation
│ ├── Claims/ # Claim hydration
│ └── RateLimiting/ # Rate limiting
└── StellaOps.Router.Observability/ # Observability
├── Logging/ # Structured logging
├── Metrics/ # Prometheus metrics
└── Tracing/ # OpenTelemetry tracing
```
### Naming Conventions
| Element | Convention | Example |
|---------|------------|---------|
| Interfaces | `I` prefix, noun/adjective | `IRouteHandler`, `IConnectable` |
| Classes | PascalCase, noun | `JwtValidator`, `RouteTable` |
| Async methods | `Async` suffix | `ValidateTokenAsync`, `SendAsync` |
| Config classes | `Options` or `Configuration` suffix | `JwtValidationOptions` |
| Event handlers | `On` prefix | `OnConnectionEstablished` |
| Factory methods | `Create` prefix | `CreateHandler`, `CreateConnection` |
| Boolean properties | `Is`/`Has`/`Can` prefix | `IsValid`, `HasExpired`, `CanRetry` |
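
The conventions in the table can be seen together in one small type. This is an illustrative sketch only; `RetryOptions`, `IConnectionProbe`, and `StaticConnectionProbe` are hypothetical names and do not appear in the Router codebase.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Config class: "Options" suffix
public sealed class RetryOptions
{
    public int MaxAttempts { get; init; } = 3;
}

// Interface: "I" prefix, noun
public interface IConnectionProbe
{
    // Boolean property: "Is" prefix
    bool IsHealthy { get; }

    // Async method: "Async" suffix
    Task<bool> CheckAsync(CancellationToken cancellationToken = default);
}

// Class: PascalCase noun
public sealed class StaticConnectionProbe : IConnectionProbe
{
    private readonly RetryOptions _options;
    private int _attempts;

    private StaticConnectionProbe(RetryOptions options) => _options = options;

    // Factory method: "Create" prefix
    public static StaticConnectionProbe CreateProbe(RetryOptions options) => new(options);

    public bool IsHealthy { get; private set; }

    // Boolean property: "Can" prefix
    public bool CanRetry => _attempts < _options.MaxAttempts;

    public event EventHandler? ConnectionEstablished;

    public Task<bool> CheckAsync(CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();
        _attempts++;
        IsHealthy = true;
        OnConnectionEstablished();
        return Task.FromResult(IsHealthy);
    }

    // Event handler: "On" prefix
    private void OnConnectionEstablished() =>
        ConnectionEstablished?.Invoke(this, EventArgs.Empty);
}
```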
### File Structure
```csharp
// File: StellaOps.Router.Core/Abstractions/IRouteHandler.cs
// 1. License header (if required)
// 2. Using statements (sorted: System, Microsoft, Third-party, Internal)
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using StellaOps.Router.Core.Configuration;
// 3. Namespace (one per file, matches folder structure)
namespace StellaOps.Router.Core.Abstractions;
// 4. XML documentation
/// <summary>
/// Handles requests for a specific route type.
/// </summary>
/// <remarks>
/// Implementations must be thread-safe and support concurrent request handling.
/// </remarks>
public interface IRouteHandler
{
// 5. Interface members (properties, then methods)
/// <summary>
/// Gets the handler type identifier.
/// </summary>
string HandlerType { get; }
/// <summary>
/// Determines if this handler can process the given route.
/// </summary>
bool CanHandle(RouteConfiguration route);
/// <summary>
/// Processes an incoming request.
/// </summary>
Task<ResponsePayload> HandleAsync(
RequestPayload request,
RouteConfiguration route,
CancellationToken cancellationToken = default);
}
```
### Error Handling Patterns
```csharp
// Pattern 1: Result types for expected failures
public readonly struct Result<T>
{
public T? Value { get; }
public Error? Error { get; }
public bool IsSuccess => Error == null;
private Result(T? value, Error? error)
{
Value = value;
Error = error;
}
public static Result<T> Success(T value) => new(value, null);
public static Result<T> Failure(Error error) => new(default, error);
public Result<TNext> Map<TNext>(Func<T, TNext> map) =>
IsSuccess ? Result<TNext>.Success(map(Value!)) : Result<TNext>.Failure(Error!);
public async Task<Result<TNext>> MapAsync<TNext>(Func<T, Task<TNext>> map) =>
IsSuccess ? Result<TNext>.Success(await map(Value!)) : Result<TNext>.Failure(Error!);
}
public record Error(string Code, string Message, Exception? Inner = null);
// Usage
public async Task<Result<JwtClaims>> ValidateTokenAsync(string token)
{
try
{
var claims = await _validator.ValidateAsync(token);
return Result<JwtClaims>.Success(claims);
}
catch (SecurityTokenExpiredException ex)
{
return Result<JwtClaims>.Failure(new Error("TOKEN_EXPIRED", "JWT has expired", ex));
}
catch (SecurityTokenInvalidSignatureException ex)
{
return Result<JwtClaims>.Failure(new Error("INVALID_SIGNATURE", "JWT signature invalid", ex));
}
}
// Pattern 2: Exceptions for unexpected failures
public class RouterException : Exception
{
public string ErrorCode { get; }
public int StatusCode { get; }
public RouterException(string errorCode, string message, int statusCode = 500, Exception? inner = null)
: base(message, inner)
{
ErrorCode = errorCode;
StatusCode = statusCode;
}
}
public class ConfigurationException : RouterException
{
public ConfigurationException(string message)
: base("CONFIG_ERROR", message, 500) { }
}
public class TransportException : RouterException
{
// Pass the inner exception through so the original failure is not lost
public TransportException(string message, Exception? inner = null)
: base("TRANSPORT_ERROR", message, 503, inner) { }
}
```
### Async Patterns
```csharp
// Pattern 1: CancellationToken propagation
public async Task<ResponsePayload> HandleAsync(
RequestPayload request,
CancellationToken cancellationToken = default)
{
// Always check at start of long operations
cancellationToken.ThrowIfCancellationRequested();
// Propagate to all async calls
var validated = await _validator.ValidateAsync(request, cancellationToken);
var enriched = await _enricher.EnrichAsync(validated, cancellationToken);
var response = await _handler.ProcessAsync(enriched, cancellationToken);
return response;
}
// Pattern 2: Timeout handling
public async Task<T> WithTimeoutAsync<T>(
Func<CancellationToken, Task<T>> operation,
TimeSpan timeout,
CancellationToken cancellationToken = default)
{
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(timeout);
try
{
return await operation(cts.Token);
}
catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
{
throw new TimeoutException($"Operation timed out after {timeout}");
}
}
// Pattern 3: Fire-and-forget with logging
public void FireAndForget(Func<Task> operation, ILogger logger, string operationName)
{
_ = Task.Run(async () =>
{
try
{
await operation();
}
catch (Exception ex)
{
logger.LogError(ex, "Fire-and-forget operation {Operation} failed", operationName);
}
});
}
```
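
A caller of the timeout pattern usually needs to tell a timeout apart from a cancellation requested upstream. The sketch below repeats `WithTimeoutAsync` from Pattern 2 so that it compiles on its own; `TimeoutSketch` and `ClassifyAsync` are hypothetical names added for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutSketch
{
    // Same shape as Pattern 2 above.
    public static async Task<T> WithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(timeout);
        try
        {
            return await operation(cts.Token);
        }
        catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
        {
            throw new TimeoutException($"Operation timed out after {timeout}");
        }
    }

    // Runs simulated work and reports how it ended.
    public static async Task<string> ClassifyAsync(
        TimeSpan work,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        try
        {
            await WithTimeoutAsync(async token =>
            {
                await Task.Delay(work, token);
                return true;
            }, timeout, cancellationToken);
            return "completed";
        }
        catch (TimeoutException)
        {
            return "timed-out"; // the linked token fired, the caller's did not
        }
        catch (OperationCanceledException)
        {
            return "cancelled"; // the caller's own token was cancelled
        }
    }
}
```

The exception filter is what separates the two cases: when the caller's token is cancelled, the `OperationCanceledException` passes through unchanged instead of being converted to `TimeoutException`.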
### Dependency Injection Patterns
```csharp
// Pattern 1: Constructor injection with validation
public class JwtValidator : IJwtValidator
{
private readonly JwtValidationOptions _options;
private readonly IKeyProvider _keyProvider;
private readonly ILogger<JwtValidator> _logger;
public JwtValidator(
IOptions<JwtValidationOptions> options,
IKeyProvider keyProvider,
ILogger<JwtValidator> logger)
{
_options = options?.Value ?? throw new ArgumentNullException(nameof(options));
_keyProvider = keyProvider ?? throw new ArgumentNullException(nameof(keyProvider));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
ValidateOptions(_options);
}
private static void ValidateOptions(JwtValidationOptions options)
{
if (string.IsNullOrEmpty(options.Issuer))
throw new ConfigurationException("JWT issuer is required");
if (options.ClockSkew < TimeSpan.Zero)
throw new ConfigurationException("Clock skew cannot be negative");
}
}
// Pattern 2: Factory registration for complex objects
public static class ServiceCollectionExtensions
{
public static IServiceCollection AddStellaRouter(
this IServiceCollection services,
Action<RouterOptions> configure)
{
services.Configure(configure);
// Core services
services.AddSingleton<IRouteTable, RouteTable>();
services.AddSingleton<IRequestPipeline, RequestPipeline>();
// Keyed services for handlers
services.AddKeyedSingleton<IRouteHandler, MicroserviceHandler>("microservice");
services.AddKeyedSingleton<IRouteHandler, GraphQLHandler>("graphql");
services.AddKeyedSingleton<IRouteHandler, ReverseProxyHandler>("proxy");
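// Factory for route handler resolution. Keyed registrations are not returned by
// GetServices<T>(), so resolve each handler by its key.
services.AddSingleton<IRouteHandlerFactory>(sp => new RouteHandlerFactory(
new[] { "microservice", "graphql", "proxy" }.ToDictionary(
key => key,
key => sp.GetRequiredKeyedService<IRouteHandler>(key))));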
return services;
}
}
// Pattern 3: Scoped services for request context
public static class RequestScopeExtensions
{
public static IServiceCollection AddRequestScope(this IServiceCollection services)
{
services.AddScoped<IRequestContext, RequestContext>();
services.AddScoped(sp => sp.GetRequiredService<IRequestContext>().User);
services.AddScoped(sp => sp.GetRequiredService<IRequestContext>().CorrelationId);
return services;
}
}
```
## Decision Framework
### When to Create New Types vs. Reuse
| Scenario | Decision | Rationale |
|----------|----------|-----------|
| Similar data, different context | Create new type | Type safety, clear intent |
| Same data, same context | Reuse type | DRY, reduce cognitive load |
| Third-party type | Create wrapper | Abstraction, testability |
| Config vs. runtime | Separate types | Immutability guarantees |
```csharp
// Example: Separate types for config vs runtime
public record RouteConfiguration(
string Path,
string Method,
string HandlerType,
IReadOnlyDictionary<string, string> Metadata);
public class CompiledRoute
{
public RouteConfiguration Config { get; }
public Regex PathPattern { get; }
public IRouteHandler Handler { get; }
// Runtime-computed fields
}
```
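
The "Third-party type" row can be sketched the same way. In the example below, `VendorTokenReader` is a stand-in for a class from an external package; it is hypothetical and not a real library, as are `ITokenReader`, `VendorTokenReaderAdapter`, and `FakeTokenReader`.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a class from an external package (hypothetical, not a real library).
public sealed class VendorTokenReader
{
    public Dictionary<string, string> ReadRaw(string token) =>
        new() { ["sub"] = token.Split('.')[0] };
}

// Router-owned abstraction: callers depend on this, never on the vendor type.
public interface ITokenReader
{
    string ReadSubject(string token);
}

// Thin wrapper that confines the vendor dependency to a single class.
public sealed class VendorTokenReaderAdapter : ITokenReader
{
    private readonly VendorTokenReader _inner = new();

    public string ReadSubject(string token)
    {
        if (string.IsNullOrWhiteSpace(token))
            throw new ArgumentException("Token is required", nameof(token));
        return _inner.ReadRaw(token)["sub"];
    }
}

// Test double that needs no vendor code at all.
public sealed class FakeTokenReader : ITokenReader
{
    public string ReadSubject(string token) => "test-user";
}
```

Swapping the vendor library later then touches only the adapter, and unit tests can use the fake without loading the external package.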
### When to Use Interfaces vs. Abstract Classes
| Use Interface | Use Abstract Class |
|---------------|-------------------|
| Multiple inheritance needed | Shared implementation |
| Contract-only definition | Template method pattern |
| Third-party implementation | Internal hierarchy only |
| Mocking/testing priority | Code reuse priority |
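
The two columns are not mutually exclusive: an interface can define the contract while an abstract base supplies shared behavior through a template method. A minimal sketch, assuming a made-up one-byte length-prefixed frame; `IFrameEncoder`, `FrameEncoderBase`, and `Utf8FrameEncoder` are hypothetical and unrelated to the Router's real frame format.

```csharp
using System;
using System.Text;

// Interface: contract only, easy to mock, open to external implementations.
public interface IFrameEncoder
{
    byte[] Encode(string payload);
}

// Abstract class: shared implementation via the template method pattern.
public abstract class FrameEncoderBase : IFrameEncoder
{
    // Template method: the framing is fixed, the body encoding varies.
    public byte[] Encode(string payload)
    {
        ArgumentNullException.ThrowIfNull(payload);

        var body = EncodeBody(payload);
        if (body.Length > byte.MaxValue)
            throw new ArgumentException("Body too large for a 1-byte length prefix", nameof(payload));

        var frame = new byte[body.Length + 1];
        frame[0] = (byte)body.Length; // shared length prefix
        body.CopyTo(frame, 1);
        return frame;
    }

    // The only step subclasses provide.
    protected abstract byte[] EncodeBody(string payload);
}

public sealed class Utf8FrameEncoder : FrameEncoderBase
{
    protected override byte[] EncodeBody(string payload) =>
        Encoding.UTF8.GetBytes(payload);
}
```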
### Logging Level Guidelines
| Level | When to Use | Example |
|-------|-------------|---------|
| `Trace` | Internal flow details | `"Route matching attempt for {Path}"` |
| `Debug` | Diagnostic information | `"Cache hit for key {Key}"` |
| `Information` | Significant events | `"Request completed: {Method} {Path} → {Status}"` |
| `Warning` | Recoverable issues | `"Rate limit approaching: {Current}/{Max}"` |
| `Error` | Failures requiring attention | `"Failed to connect to Authority: {Error}"` |
| `Critical` | System-wide failures | `"Configuration invalid, router cannot start"` |
```csharp
// Structured logging patterns
_logger.LogInformation(
"Request processed: {Method} {Path} → {StatusCode} in {ElapsedMs}ms",
request.Method,
request.Path,
response.StatusCode,
stopwatch.ElapsedMilliseconds);
// Use LoggerMessage for high-performance paths
private static readonly Action<ILogger, string, string, int, long, Exception?> LogRequestComplete =
LoggerMessage.Define<string, string, int, long>(
LogLevel.Information,
new EventId(1001, "RequestComplete"),
"Request processed: {Method} {Path} → {StatusCode} in {ElapsedMs}ms");
// Usage
LogRequestComplete(_logger, method, path, statusCode, elapsed, null);
```
## Implementation Checklists
### Before Starting a Component
- [ ] Read the step documentation thoroughly
- [ ] Understand dependencies on previous steps
- [ ] Review related existing code patterns
- [ ] Identify configuration requirements
- [ ] Plan test coverage strategy
### During Implementation
- [ ] Follow naming conventions
- [ ] Add XML documentation to public APIs
- [ ] Implement `IDisposable`/`IAsyncDisposable` where needed
- [ ] Add structured logging at appropriate levels
- [ ] Handle cancellation tokens throughout
- [ ] Use result types for expected failures
- [ ] Validate all configuration at startup
### Before Marking Complete
- [ ] All public types have XML documentation
- [ ] Unit tests meet the coverage target for the component type (see Unit Test Coverage Targets)
- [ ] Integration tests cover happy path + error cases
- [ ] No compiler warnings
- [ ] Code passes all linting rules
- [ ] Configuration is validated
- [ ] README/documentation updated if needed
### Pull Request Checklist
- [ ] PR title follows convention: `feat(router): description`
- [ ] Description explains what and why
- [ ] All tests pass
- [ ] No unrelated changes
- [ ] Breaking changes documented
- [ ] Reviewable size (<500 lines preferred)
## Testing Requirements
### Unit Test Coverage Targets
| Component Type | Target Coverage |
|---------------|-----------------|
| Core logic | 90% |
| Handlers | 85% |
| Middleware | 80% |
| Configuration | 75% |
| Extensions | 70% |
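
These targets can be enforced as a simple gate over measured coverage. The sketch below is one possible shape; `CoverageGate` and `FindFailures` are hypothetical names, and how the measured percentages are collected from a coverage tool is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CoverageGate
{
    // Targets copied from the table above (percent coverage).
    public static readonly IReadOnlyDictionary<string, double> Targets =
        new Dictionary<string, double>
        {
            ["Core logic"] = 90,
            ["Handlers"] = 85,
            ["Middleware"] = 80,
            ["Configuration"] = 75,
            ["Extensions"] = 70,
        };

    // Returns the component types whose measured coverage is below target.
    // Component types missing from the measurements count as failures.
    public static IReadOnlyList<string> FindFailures(
        IReadOnlyDictionary<string, double> measured) =>
        Targets
            .Where(t => !measured.TryGetValue(t.Key, out var actual) || actual < t.Value)
            .Select(t => t.Key)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
}
```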
### Test Structure
```csharp
// Test file naming: {ClassName}Tests.cs
// Test method naming: {Method}_{Scenario}_{ExpectedResult}
public class JwtValidatorTests
{
private readonly JwtValidator _sut; // System Under Test
private readonly Mock<IKeyProvider> _keyProviderMock;
private readonly Mock<ILogger<JwtValidator>> _loggerMock;
public JwtValidatorTests()
{
_keyProviderMock = new Mock<IKeyProvider>();
_loggerMock = new Mock<ILogger<JwtValidator>>();
var options = Options.Create(new JwtValidationOptions
{
Issuer = "https://auth.example.com",
Audience = "stella-router"
});
_sut = new JwtValidator(options, _keyProviderMock.Object, _loggerMock.Object);
}
[Fact]
public async Task ValidateAsync_ValidToken_ReturnsSuccessWithClaims()
{
// Arrange
var token = GenerateValidToken();
_keyProviderMock
.Setup(x => x.GetSigningKeyAsync(It.IsAny<string>()))
.ReturnsAsync(TestKeys.ValidKey);
// Act
var result = await _sut.ValidateAsync(token);
// Assert
Assert.True(result.IsSuccess);
Assert.NotNull(result.Value);
Assert.Equal("test-user", result.Value.Subject);
}
[Fact]
public async Task ValidateAsync_ExpiredToken_ReturnsFailure()
{
// Arrange
var token = GenerateExpiredToken();
// Act
var result = await _sut.ValidateAsync(token);
// Assert
Assert.False(result.IsSuccess);
Assert.Equal("TOKEN_EXPIRED", result.Error!.Code);
}
[Theory]
[InlineData(null)]
[InlineData("")]
[InlineData(" ")]
public async Task ValidateAsync_NullOrEmptyToken_ReturnsFailure(string? token)
{
// Act
var result = await _sut.ValidateAsync(token!);
// Assert
Assert.False(result.IsSuccess);
Assert.Equal("INVALID_TOKEN", result.Error!.Code);
}
}
```
### Integration Test Patterns
```csharp
public class RouterIntegrationTests : IClassFixture<RouterTestFixture>
{
private readonly RouterTestFixture _fixture;
public RouterIntegrationTests(RouterTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task EndToEnd_AuthenticatedRequest_ReturnsSuccess()
{
// Arrange
var client = _fixture.CreateAuthenticatedClient(claims: new()
{
["sub"] = "test-user",
["role"] = "admin"
});
// Act
var response = await client.GetAsync("/api/users/123");
// Assert
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
var user = await response.Content.ReadFromJsonAsync<UserDto>();
Assert.NotNull(user);
Assert.Equal("123", user.Id);
}
}
// Test fixture
public class RouterTestFixture : IAsyncLifetime
{
private IHost? _gatewayHost;
private IHost? _microserviceHost;
public async Task InitializeAsync()
{
// Start microservice
_microserviceHost = await CreateMicroserviceHost();
await _microserviceHost.StartAsync();
// Start gateway
_gatewayHost = await CreateGatewayHost();
await _gatewayHost.StartAsync();
}
public async Task DisposeAsync()
{
if (_gatewayHost != null)
await _gatewayHost.StopAsync();
if (_microserviceHost != null)
await _microserviceHost.StopAsync();
_gatewayHost?.Dispose();
_microserviceHost?.Dispose();
}
public HttpClient CreateAuthenticatedClient(Dictionary<string, object> claims)
{
var token = GenerateTestToken(claims);
var client = new HttpClient
{
BaseAddress = new Uri("http://localhost:5000")
};
client.DefaultRequestHeaders.Authorization =
new AuthenticationHeaderValue("Bearer", token);
return client;
}
}
```
## Git and PR Conventions
### Branch Naming
```
feat/router-<step>-<description>
fix/router-<issue-number>
refactor/router-<description>
test/router-<description>
docs/router-<description>
```
### Commit Messages
```
<type>(<scope>): <description>

[optional body]

[optional footer]
```
Types: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
Examples:
```
feat(router): implement JWT validation with per-endpoint keys

- Add JwtValidator with configurable key sources
- Support RS256 and ES256 algorithms
- Add JWKS endpoint caching with TTL

Closes #123
```
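
The header format lends itself to an automated check, for example in a commit hook or CI step. A minimal sketch follows; `CommitMessageCheck` is a hypothetical helper, and restricting the scope to lowercase letters, digits, and hyphens is an assumption that the convention above does not state.

```csharp
using System.Text.RegularExpressions;

public static class CommitMessageCheck
{
    // <type>(<scope>): <description>, using the types listed above.
    // Assumption: the scope is lowercase letters, digits, and hyphens.
    private static readonly Regex Header = new(
        @"^(feat|fix|refactor|test|docs|chore)\(([a-z0-9-]+)\): (\S.*)$",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    // Validates only the first line; body and footer are free-form.
    public static bool IsValidHeader(string message)
    {
        if (string.IsNullOrEmpty(message))
            return false;

        var firstLine = message.Split('\n')[0].TrimEnd('\r');
        return Header.IsMatch(firstLine);
    }
}
```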
### PR Template
```markdown
## Summary
Brief description of what this PR does.
## Changes
- Change 1
- Change 2
- Change 3
## Testing
- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Manual testing performed
## Checklist
- [ ] Code follows project conventions
- [ ] Documentation updated
- [ ] No breaking changes (or documented if any)
- [ ] All tests pass
```
## Common Pitfalls to Avoid
### Performance
```csharp
// ❌ BAD: Allocating in hot path
public bool MatchRoute(string path)
{
var parts = path.Split('/'); // Allocation
// ...
}
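// ✅ GOOD: Use Span for parsing (span-based Split requires .NET 9 or later)
public bool MatchRoute(ReadOnlySpan<char> path)
{
// Zero-allocation parsing: Split yields Range values, not substrings
foreach (var range in path.Split('/'))
{
var segment = path[range];
// ...
}
}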
// ❌ BAD: Synchronous I/O blocking async context
public async Task ProcessAsync()
{
var config = File.ReadAllText("config.json"); // Blocking!
}
// ✅ GOOD: Async all the way
public async Task ProcessAsync()
{
var config = await File.ReadAllTextAsync("config.json");
}
```
### Thread Safety
```csharp
// ❌ BAD: Non-thread-safe collection
private readonly Dictionary<string, Route> _routes = new();
public void AddRoute(string key, Route route)
{
_routes[key] = route; // Not thread-safe!
}
// ✅ GOOD: Thread-safe collection
private readonly ConcurrentDictionary<string, Route> _routes = new();
public void AddRoute(string key, Route route)
{
_routes[key] = route; // Thread-safe
}
// ✅ GOOD: Immutable update
private ImmutableDictionary<string, Route> _routes =
ImmutableDictionary<string, Route>.Empty;
public void AddRoute(string key, Route route)
{
ImmutableInterlocked.AddOrUpdate(ref _routes, key, route, (_, _) => route);
}
```
### Resource Management
```csharp
// ❌ BAD: Not disposing resources
public async Task SendAsync(byte[] data)
{
var client = new TcpClient();
await client.ConnectAsync("host", 9100);
await client.GetStream().WriteAsync(data);
// client never disposed!
}
// ✅ GOOD: Proper disposal
public async Task SendAsync(byte[] data)
{
using var client = new TcpClient();
await client.ConnectAsync("host", 9100);
await using var stream = client.GetStream();
await stream.WriteAsync(data);
}
// ✅ GOOD: Connection pooling
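public class ConnectionPool : IDisposable
{
// Bounded so that Return disposes surplus connections (the capacity is illustrative)
private readonly Channel<TcpClient> _pool = Channel.CreateBounded<TcpClient>(16);
public async Task<TcpClient> RentAsync()
{
if (_pool.Reader.TryRead(out var client))
return client;
return await CreateNewConnectionAsync(); // connection setup omitted here
}
public void Return(TcpClient client)
{
if (!_pool.Writer.TryWrite(client))
client.Dispose();
}
public void Dispose()
{
// Stop accepting returns, then dispose whatever is still pooled
_pool.Writer.TryComplete();
while (_pool.Reader.TryRead(out var client))
client.Dispose();
}
}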
```
## Deliverables
| Artifact | Purpose |
|----------|---------|
| This document | Agent implementation guidelines |
| Code templates | Consistent starting points |
| Checklists | Quality gates |
| Test patterns | Consistent testing approach |
## Next Step
[Step 29: Integration Testing & CI →](29-Step.md)

1684
docs/router/29-Step.md Normal file

File diff suppressed because it is too large


@@ -1,41 +1,121 @@
# Sprint 7000·0001·0001 · Router Skeleton
# Sprint 7000-0001-0001 · Router Foundation · Project Skeleton
## Topic & Scope
- Stand up the dedicated StellaOps Router repo skeleton under `docs/router` as per `specs.md` / `01-Step.md`.
- Produce the empty solution structure, projects, references, and placeholder docs ready for future transport/SDK work.
- Enforce .NET 10 (`net10.0`) across all new projects; ignore prior net8 defaults.
- **Working directory:** `docs/router`.
Phase 1 of Router implementation: establish the project skeleton with all required directories, solution files, and empty stubs. This sprint creates the structural foundation that all subsequent router sprints depend on.
**Goal:** Get a clean, compiling skeleton in place that matches the spec and folder conventions, with zero real logic and minimal dependencies.
**Working directories:**
- `src/__Libraries/StellaOps.Router.Common/`
- `src/__Libraries/StellaOps.Router.Config/`
- `src/__Libraries/StellaOps.Microservice/`
- `src/__Libraries/StellaOps.Microservice.SourceGen/`
- `src/Gateway/StellaOps.Gateway.WebService/`
- `tests/StellaOps.Router.Common.Tests/`
- `tests/StellaOps.Gateway.WebService.Tests/`
- `tests/StellaOps.Microservice.Tests/`
**Isolation strategy:** Router uses a separate `StellaOps.Router.sln` solution file to enable fully independent building and testing. This prevents any impact on the main `StellaOps.sln` until the migration phase.
## Dependencies & Concurrency
- Depends on `docs/router/specs.md` remaining the authoritative requirements source.
- No upstream sprint blockers; this spin-off is self-contained.
- Can run in parallel with other repo work because it writes only under `docs/router`.
- **Upstream:** None. This is the first router sprint.
- **Downstream:** All other router sprints depend on this skeleton.
- **Parallel work:** None possible until this sprint completes.
- **Cross-module impact:** None. All work is in new directories.
## Documentation Prerequisites
- `docs/router/specs.md`
- `docs/router/implplan.md`
- `docs/router/01-Step.md`
- `docs/router/specs.md` (canonical specification - READ FIRST)
- `docs/router/implplan.md` (implementation plan overview)
- `docs/router/01-Step.md` (detailed task breakdown for this sprint)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Invariants (from specs.md)
Before coding, acknowledge these non-negotiables:
- Method + Path identity for endpoints
- Strict semver for versions
- Region from `GatewayNodeConfig.Region` (no host/header derivation)
- No HTTP transport for microservice-to-router communications
- Single connection carrying HELLO + HEARTBEAT + REQUEST/RESPONSE + CANCEL
- Router treats body as opaque bytes/streams
- `RequiringClaims` replaces any form of `AllowedRoles`
## Delivery Tracker
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
| --- | --- | --- | --- | --- | --- |
| 1 | ROUTER-SKEL-SETUP | TODO | Read specs + step docs | Skeleton Agent | Create repo folders (`src/`, `src/__Libraries/`, `tests/`, `docs/router`) & add `README.md` pointer. |
| 2 | ROUTER-SKEL-SOLUTION | TODO | Task 1 | Skeleton Agent | Generate `StellaOps.Router.sln`, add Gateway + library + test projects targeting `net10.0`. |
| 3 | ROUTER-SKEL-REFS | TODO | Task 2 | Skeleton Agent | Wire project references per plan (Gateway→Common+Config, etc.). |
| 4 | ROUTER-SKEL-BUILDPROPS | TODO | Task 2 | Infra Agent | Add repo-level `Directory.Build.props` pinning `net10.0`, nullable, implicit usings. |
| 5 | ROUTER-SKEL-STUBS | TODO | Tasks 2-4 | Common/Microservice Agents | Add placeholder types/extension methods per `01-Step.md` (no logic). |
| 6 | ROUTER-SKEL-TESTS | TODO | Task 5 | QA Agent | Create dummy `[Fact]` tests in each test project so `dotnet test` passes. |
| 7 | ROUTER-SKEL-CI | TODO | Tasks 2-6 | Infra Agent | Configure CI pipeline running `dotnet restore/build/test` on solution. |
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | SKEL-001 | TODO | Create directory structure (`src/__Libraries/`, `src/Gateway/`, `tests/`) | repo root |
| 2 | SKEL-002 | TODO | Create `StellaOps.Router.sln` solution file at repo root | repo root |
| 3 | SKEL-003 | TODO | Create `StellaOps.Router.Common` classlib project | `src/__Libraries/StellaOps.Router.Common/` |
| 4 | SKEL-004 | TODO | Create `StellaOps.Router.Config` classlib project | `src/__Libraries/StellaOps.Router.Config/` |
| 5 | SKEL-005 | TODO | Create `StellaOps.Microservice` classlib project | `src/__Libraries/StellaOps.Microservice/` |
| 6 | SKEL-006 | TODO | Create `StellaOps.Microservice.SourceGen` classlib stub | `src/__Libraries/StellaOps.Microservice.SourceGen/` |
| 7 | SKEL-007 | TODO | Create `StellaOps.Gateway.WebService` webapi project | `src/Gateway/StellaOps.Gateway.WebService/` |
| 8 | SKEL-008 | TODO | Create xunit test projects for Common, Gateway, Microservice | `tests/` |
| 9 | SKEL-009 | TODO | Wire project references per dependency graph | all projects |
| 10 | SKEL-010 | TODO | Add `Directory.Build.props` with common settings (net10.0, nullable, LangVersion) | repo root (router scope) |
| 11 | SKEL-011 | TODO | Stub empty placeholder types in each project (no logic) | all projects |
| 12 | SKEL-012 | TODO | Add dummy smoke tests so CI passes | `tests/` |
| 13 | SKEL-013 | TODO | Verify `dotnet build StellaOps.Router.sln` succeeds | repo root |
| 14 | SKEL-014 | TODO | Verify `dotnet test StellaOps.Router.sln` passes | repo root |
| 15 | SKEL-015 | TODO | Update `docs/router/README.md` with solution overview | `docs/router/` |
## Project Reference Graph
```
StellaOps.Gateway.WebService
├── StellaOps.Router.Common
└── StellaOps.Router.Config
    └── StellaOps.Router.Common

StellaOps.Microservice
└── StellaOps.Router.Common

StellaOps.Microservice.SourceGen
    (no references yet - stub only)

Test projects reference their corresponding main projects.
```
## Stub Types to Create
### StellaOps.Router.Common
- Enums: `TransportType`, `FrameType`, `InstanceHealthStatus`
- Models: `ClaimRequirement`, `EndpointDescriptor`, `InstanceDescriptor`, `ConnectionState`, `Frame`
- Interfaces: `IGlobalRoutingState`, `IRoutingPlugin`, `ITransportServer`, `ITransportClient`
### StellaOps.Router.Config
- `RouterConfig`, `ServiceConfig`, `PayloadLimits` (property-only classes)
### StellaOps.Microservice
- `StellaMicroserviceOptions`, `RouterEndpointConfig`
- `ServiceCollectionExtensions.AddStellaMicroservice()` (empty body)
### StellaOps.Gateway.WebService
- `GatewayNodeConfig` with Region, NodeId, Environment
- Minimal `Program.cs` that builds and runs (no logic)
## Exit Criteria
Before marking this sprint DONE:
1. [ ] `dotnet build StellaOps.Router.sln` succeeds with zero warnings
2. [ ] `dotnet test StellaOps.Router.sln` passes (even with dummy tests)
3. [ ] All project names match spec: `StellaOps.Gateway.WebService`, `StellaOps.Router.Common`, `StellaOps.Router.Config`, `StellaOps.Microservice`
4. [ ] No real business logic exists (no transport logic, no routing decisions, no YAML parsing)
5. [ ] `docs/router/README.md` exists and points to `specs.md`
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-02 | Created sprint skeleton per router spin-off instructions. | Planning |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Use .NET 10 baseline even though other modules still target net8; future agents must not downgrade frameworks.
- Scope intentionally limited to `docs/router` to avoid cross-repo conflicts; any shared assets must be duplicated or referenced via documentation until later alignment.
- Risk: missing AGENTS.md for this folder—future sprint should establish one if work extends beyond skeleton.
## Next Checkpoints
- 2025-12-04: Verify solution + CI scaffold committed and passing.
- Router uses a separate solution file (`StellaOps.Router.sln`) to enable isolated development. This will be merged into main `StellaOps.sln` during the migration phase.
- Target framework is `net10.0` to match the rest of StellaOps.
- `StellaOps.Microservice.SourceGen` is created as a plain classlib for now; it will be converted to a Source Generator project in a later sprint.


@@ -0,0 +1,157 @@
# Sprint 7000-0001-0002 · Router Foundation · Common Library Models
## Topic & Scope
Phase 2 of Router implementation: implement the shared core model in `StellaOps.Router.Common`. This sprint makes Common the single, stable contract layer that Gateway, Microservice SDK, and transports all depend on.
**Goal:** Lock down the domain vocabulary. Implement all data types and interfaces with **no behavior** - just shapes that match `specs.md`.
**Working directory:** `src/__Libraries/StellaOps.Router.Common/`
**Key principle:** Changes to `StellaOps.Router.Common` after this sprint must be rare and reviewed. Everything else depends on it.
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0001 (skeleton must be complete)
- **Downstream:** All other router sprints depend on these contracts
- **Parallel work:** None possible until this sprint completes
- **Cross-module impact:** None. All work is in `StellaOps.Router.Common`
## Documentation Prerequisites
- `docs/router/specs.md` (canonical specification - READ FIRST, sections 2-13)
- `docs/router/02-Step.md` (detailed task breakdown for this sprint)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | CMN-001 | TODO | Create `/Enums/TransportType.cs` with `[Udp, Tcp, Certificate, RabbitMq]` | No HTTP type per spec |
| 2 | CMN-002 | TODO | Create `/Enums/FrameType.cs` with Hello, Heartbeat, EndpointsUpdate, Request, RequestStreamData, Response, ResponseStreamData, Cancel | |
| 3 | CMN-003 | TODO | Create `/Enums/InstanceHealthStatus.cs` with Unknown, Healthy, Degraded, Draining, Unhealthy | |
| 4 | CMN-010 | TODO | Create `/Models/ClaimRequirement.cs` with Type (required) and Value (optional) | Replaces AllowedRoles |
| 5 | CMN-011 | TODO | Create `/Models/EndpointDescriptor.cs` with ServiceName, Version, Method, Path, DefaultTimeout, SupportsStreaming, RequiringClaims | |
| 6 | CMN-012 | TODO | Create `/Models/InstanceDescriptor.cs` with InstanceId, ServiceName, Version, Region | |
| 7 | CMN-013 | TODO | Create `/Models/ConnectionState.cs` with ConnectionId, Instance, Status, LastHeartbeatUtc, AveragePingMs, TransportType, Endpoints | |
| 8 | CMN-014 | TODO | Create `/Models/RoutingContext.cs` matching spec (neutral context, no ASP.NET dependency) | |
| 9 | CMN-015 | TODO | Create `/Models/RoutingDecision.cs` with Endpoint, Connection, TransportType, EffectiveTimeout | |
| 10 | CMN-016 | TODO | Create `/Models/PayloadLimits.cs` with MaxRequestBytesPerCall, MaxRequestBytesPerConnection, MaxAggregateInflightBytes | |
| 11 | CMN-020 | TODO | Create `/Models/Frame.cs` with Type, CorrelationId, Payload | |
| 12 | CMN-021 | TODO | Create `/Models/HelloPayload.cs` with InstanceDescriptor and list of EndpointDescriptors | |
| 13 | CMN-022 | TODO | Create `/Models/HeartbeatPayload.cs` with InstanceId, Status, metrics | |
| 14 | CMN-023 | TODO | Create `/Models/CancelPayload.cs` with Reason | |
| 15 | CMN-030 | TODO | Create `/Abstractions/IGlobalRoutingState.cs` interface | |
| 16 | CMN-031 | TODO | Create `/Abstractions/IRoutingPlugin.cs` interface | |
| 17 | CMN-032 | TODO | Create `/Abstractions/ITransportServer.cs` interface | |
| 18 | CMN-033 | TODO | Create `/Abstractions/ITransportClient.cs` interface | |
| 19 | CMN-034 | TODO | Create `/Abstractions/IRegionProvider.cs` interface (optional, if spec requires) | |
| 20 | CMN-040 | TODO | Write shape tests for EndpointDescriptor, ConnectionState | |
| 21 | CMN-041 | TODO | Write enum completeness tests for FrameType | |
| 22 | CMN-042 | TODO | Verify Common compiles with zero warnings (nullable enabled) | |
| 23 | CMN-043 | TODO | Verify Common only references BCL (no ASP.NET, no serializers) | |
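
A minimal sketch of a few of the shapes listed in the tracker above. Member names follow the tracker rows; the property types (`TimeSpan`, `Guid`, `byte[]`), the `required` modifiers, and the default values are assumptions rather than anything confirmed by `specs.md`.

```csharp
using System;
using System.Collections.Generic;

// Illustrative shapes only: names come from the tracker, types are assumed.
public enum TransportType { Udp, Tcp, Certificate, RabbitMq } // deliberately no Http member

public enum FrameType
{
    Hello, Heartbeat, EndpointsUpdate,
    Request, RequestStreamData,
    Response, ResponseStreamData,
    Cancel
}

public sealed class ClaimRequirement
{
    public required string Type { get; init; } // required
    public string? Value { get; init; }        // optional
}

public sealed class EndpointDescriptor
{
    public required string ServiceName { get; init; }
    public required string Version { get; init; }
    public required string Method { get; init; }
    public required string Path { get; init; }
    public TimeSpan DefaultTimeout { get; init; }
    public bool SupportsStreaming { get; init; }
    public IReadOnlyList<ClaimRequirement> RequiringClaims { get; init; }
        = Array.Empty<ClaimRequirement>();
}

public sealed class Frame
{
    public FrameType Type { get; init; }
    public Guid CorrelationId { get; init; }                    // Guid assumed from ITransportClient.SendCancelAsync
    public byte[] Payload { get; init; } = Array.Empty<byte>(); // opaque bytes, no serializer reference
}
```

Using `init`-only properties keeps the descriptors immutable after construction while still allowing object-initializer syntax in tests.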
## File Layout
```
/src/__Libraries/StellaOps.Router.Common/
  /Enums/
    TransportType.cs
    FrameType.cs
    InstanceHealthStatus.cs
  /Models/
    ClaimRequirement.cs
    EndpointDescriptor.cs
    InstanceDescriptor.cs
    ConnectionState.cs
    RoutingContext.cs
    RoutingDecision.cs
    PayloadLimits.cs
    Frame.cs
    HelloPayload.cs
    HeartbeatPayload.cs
    CancelPayload.cs
  /Abstractions/
    IGlobalRoutingState.cs
    IRoutingPlugin.cs
    ITransportClient.cs
    ITransportServer.cs
    IRegionProvider.cs
```
## Interface Signatures (from specs.md)
### IGlobalRoutingState
```csharp
public interface IGlobalRoutingState
{
    EndpointDescriptor? ResolveEndpoint(string method, string path);

    IReadOnlyList<ConnectionState> GetConnectionsFor(
        string serviceName, string version, string method, string path);
}
```
### IRoutingPlugin
```csharp
public interface IRoutingPlugin
{
    Task<RoutingDecision?> ChooseInstanceAsync(
        RoutingContext context, CancellationToken cancellationToken);
}
```
### ITransportServer
```csharp
public interface ITransportServer
{
    Task StartAsync(CancellationToken cancellationToken);
    Task StopAsync(CancellationToken cancellationToken);
}
```
### ITransportClient
```csharp
public interface ITransportClient
{
    Task<Frame> SendRequestAsync(
        ConnectionState connection, Frame requestFrame,
        TimeSpan timeout, CancellationToken cancellationToken);

    Task SendCancelAsync(
        ConnectionState connection, Guid correlationId, string? reason = null);

    Task SendStreamingAsync(
        ConnectionState connection, Frame requestHeader, Stream requestBody,
        Func<Stream, Task> readResponseBody, PayloadLimits limits,
        CancellationToken cancellationToken);
}
```
## Design Constraints
1. **No behavior:** Only shapes - no LINQ-heavy methods, no routing algorithms, no network code
2. **No serialization:** No JSON/MessagePack references; Common only defines shapes
3. **Immutability preferred:** Use `init` properties for descriptors; `ConnectionState` health fields may be mutable
4. **BCL only:** No ASP.NET or third-party package dependencies
5. **Nullable enabled:** All code must compile with zero nullable warnings
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All types from `specs.md` Common section exist with matching names and properties
2. [ ] Common compiles with zero warnings
3. [ ] Common only references BCL (verify no package references in .csproj)
4. [ ] No behavior/logic in any type (pure DTOs and interfaces)
5. [ ] `StellaOps.Router.Common.Tests` runs and passes
6. [ ] `docs/router/specs.md` is updated if any discrepancy found (or code matches spec)
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- `RoutingContext` uses a neutral model (not ASP.NET `HttpContext`) to keep Common free of web dependencies. Gateway will adapt from `HttpContext` to this neutral model.
- `ConnectionState.Endpoints` uses `(string Method, string Path)` tuple as key for dictionary lookups.
- Frame payloads are `byte[]` - serialization happens at the transport layer, not in Common.
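
The tuple-keyed lookup mentioned above can be sketched as follows. The `EndpointKeyDemo` class and the string values are hypothetical stand-ins; the real dictionary would presumably map to `EndpointDescriptor`.

```csharp
using System;
using System.Collections.Generic;

public static class EndpointKeyDemo
{
    // Hypothetical helper: builds a lookup keyed by the (Method, Path) tuple.
    // Value tuples compare structurally, so two keys with equal strings are equal.
    public static Dictionary<(string Method, string Path), string> Build()
    {
        return new Dictionary<(string Method, string Path), string>
        {
            [("GET", "/billing/invoices")] = "list-invoices",
            [("POST", "/billing/invoices")] = "create-invoice",
        };
    }
}
```

Note that the default tuple comparer uses ordinal, case-sensitive string equality, so `("post", ...)` would not find the `("POST", ...)` entry. If case-insensitive method or path matching is needed, keys would have to be normalized before insertion and lookup, or a custom comparer supplied.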


@@ -0,0 +1,121 @@
# Sprint 7000-0002-0001 · Router Transport · InMemory Plugin
## Topic & Scope
Build a fake "in-memory" transport plugin for development and testing. This transport proves the HELLO/HEARTBEAT/REQUEST/RESPONSE/CANCEL semantics and routing logic **without** yet dealing with sockets or RabbitMQ.
**Goal:** Enable unit and integration testing of the router and SDK by providing an in-process transport where frames are passed via channels/queues in memory.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.InMemory/`
**Key principle:** This plugin will never ship to production; it's only for dev tests and CI. It must fully implement all transport abstractions so that switching to real transports later requires zero changes to Gateway or Microservice SDK code.
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0002 (Common models must be complete)
- **Downstream:** SDK and Gateway sprints depend on this for testing
- **Parallel work:** Can run in parallel with CMN-040/041/042/043 test tasks if Common models are done
- **Cross-module impact:** None. Creates new directory only.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 5, 10 - Transport and Cancellation requirements)
- `docs/router/03-Step.md` (detailed task breakdown)
- `docs/router/implplan.md` (phase 3 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MEM-001 | TODO | Create `StellaOps.Router.Transport.InMemory` classlib project | Add to StellaOps.Router.sln |
| 2 | MEM-002 | TODO | Add project reference to `StellaOps.Router.Common` | |
| 3 | MEM-010 | TODO | Implement `InMemoryTransportServer` : `ITransportServer` | Gateway side |
| 4 | MEM-011 | TODO | Implement `InMemoryTransportClient` : `ITransportClient` | Microservice side |
| 5 | MEM-012 | TODO | Create shared `InMemoryConnectionRegistry` (concurrent dictionary keyed by ConnectionId) | Thread-safe |
| 6 | MEM-013 | TODO | Create `InMemoryChannel` for bidirectional frame passing | Use System.Threading.Channels |
| 7 | MEM-020 | TODO | Implement HELLO frame handling (client → server) | |
| 8 | MEM-021 | TODO | Implement HEARTBEAT frame handling (client → server) | |
| 9 | MEM-022 | TODO | Implement REQUEST frame handling (server → client) | |
| 10 | MEM-023 | TODO | Implement RESPONSE frame handling (client → server) | |
| 11 | MEM-024 | TODO | Implement CANCEL frame handling (bidirectional) | |
| 12 | MEM-025 | TODO | Implement REQUEST_STREAM_DATA / RESPONSE_STREAM_DATA frame handling | For streaming support |
| 13 | MEM-030 | TODO | Create `InMemoryTransportOptions` for configuration | Timeouts, buffer sizes |
| 14 | MEM-031 | TODO | Create DI registration extension `AddInMemoryTransport()` | |
| 15 | MEM-040 | TODO | Write integration tests for HELLO/HEARTBEAT flow | |
| 16 | MEM-041 | TODO | Write integration tests for REQUEST/RESPONSE flow | |
| 17 | MEM-042 | TODO | Write integration tests for CANCEL flow | |
| 18 | MEM-043 | TODO | Write integration tests for streaming flow | |
| 19 | MEM-050 | TODO | Create test project `StellaOps.Router.Transport.InMemory.Tests` | |
## Architecture
```
┌──────────────────────┐      InMemoryConnectionRegistry       ┌──────────────────────┐
│ Gateway              │  (ConcurrentDictionary<ConnectionId,  │ Microservice         │
│ (InMemoryTransport   │◄───────── InMemoryChannel>) ─────────►│ (InMemoryTransport   │
│  Server)             │                                       │  Client)             │
└──────────────────────┘                                       └──────────────────────┘
           │                                                               │
           │ Channel<Frame> ToMicroservice ───────────────────────────────►│
           │◄──────────────────────────────────── Channel<Frame> ToGateway │
           │                                                               │
```
## InMemoryChannel Design
```csharp
internal sealed class InMemoryChannel
{
    public string ConnectionId { get; }
    public Channel<Frame> ToMicroservice { get; } // Gateway writes, SDK reads
    public Channel<Frame> ToGateway { get; }      // SDK writes, Gateway reads
    public InstanceDescriptor? Instance { get; set; }
    public CancellationTokenSource LifetimeToken { get; }
}
```
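
One way the channel pair could be constructed and torn down is sketched below. `InMemoryChannelSketch` and the stand-in `Frame` type are hypothetical; the unbounded channels and the cancel-then-complete order on dispose are assumptions consistent with the decisions recorded for this sprint, not the confirmed implementation.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;

// Stand-in for the Common Frame type so the sketch compiles on its own.
public sealed class Frame
{
    public string Type { get; init; } = string.Empty;
}

public sealed class InMemoryChannelSketch : IDisposable
{
    public string ConnectionId { get; } = Guid.NewGuid().ToString("N");

    // Unbounded: writers never block, so there is no backpressure.
    public Channel<Frame> ToMicroservice { get; } = Channel.CreateUnbounded<Frame>();
    public Channel<Frame> ToGateway { get; } = Channel.CreateUnbounded<Frame>();

    public CancellationTokenSource LifetimeToken { get; } = new();

    public void Dispose()
    {
        // Signal readers first, then complete both writers so pending reads finish.
        LifetimeToken.Cancel();
        ToMicroservice.Writer.TryComplete();
        ToGateway.Writer.TryComplete();
        LifetimeToken.Dispose();
    }
}
```

After `Dispose`, `TryWrite` on either writer returns `false`, which gives both sides a cheap way to detect a closed connection.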
## Frame Flow Examples
### HELLO Flow
1. Microservice SDK calls `InMemoryTransportClient.ConnectAsync()`
2. Client creates `InMemoryChannel`, registers in `InMemoryConnectionRegistry`
3. Client sends HELLO frame via `ToGateway` channel
4. Server reads from `ToGateway`, processes HELLO, updates `ConnectionState`
### REQUEST/RESPONSE Flow
1. Gateway receives HTTP request
2. Gateway sends REQUEST frame via `ToMicroservice` channel
3. SDK reads from `ToMicroservice`, invokes handler
4. SDK sends RESPONSE frame via `ToGateway` channel
5. Gateway reads from `ToGateway`, returns HTTP response
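
The REQUEST/RESPONSE steps above can be sketched with two in-memory channels. `DemoFrame` and `RequestResponseDemo` are hypothetical names; the "handler" here simply echoes the payload, and the HTTP side of the Gateway is left out.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical frame shape used only for this sketch.
public sealed record DemoFrame(string Type, Guid CorrelationId, byte[] Payload);

public static class RequestResponseDemo
{
    public static async Task<DemoFrame> RoundTripAsync(byte[] body)
    {
        var toMicroservice = Channel.CreateUnbounded<DemoFrame>();
        var toGateway = Channel.CreateUnbounded<DemoFrame>();

        // SDK side: read one REQUEST, "handle" it, and reply with a RESPONSE
        // carrying the same correlation id.
        var sdk = Task.Run(async () =>
        {
            var request = await toMicroservice.Reader.ReadAsync();
            var response = new DemoFrame("Response", request.CorrelationId, request.Payload);
            await toGateway.Writer.WriteAsync(response);
        });

        // Gateway side: send the REQUEST, then wait for the matching RESPONSE.
        var correlationId = Guid.NewGuid();
        await toMicroservice.Writer.WriteAsync(new DemoFrame("Request", correlationId, body));
        var reply = await toGateway.Reader.ReadAsync();
        await sdk;

        if (reply.CorrelationId != correlationId)
            throw new InvalidOperationException("Response does not match the request.");
        return reply;
    }
}
```

With several requests in flight on one connection, the Gateway would need to match responses to pending requests by correlation id instead of reading the next frame directly as this sketch does.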
### CANCEL Flow
1. HTTP client disconnects (or timeout)
2. Gateway sends CANCEL frame via `ToMicroservice` channel
3. SDK reads CANCEL, cancels handler's CancellationToken
4. SDK optionally sends partial RESPONSE or no response
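
On the SDK side, the CANCEL steps above amount to mapping a correlation id to the handler's cancellation source. The `InflightRequests` class below is a hypothetical sketch of that bookkeeping, not the planned SDK type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class InflightRequests
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _inflight = new();

    // Called when a REQUEST frame arrives; the returned token is handed to the handler.
    public CancellationToken Track(Guid correlationId)
    {
        var cts = new CancellationTokenSource();
        _inflight[correlationId] = cts;
        return cts.Token;
    }

    // Called when a CANCEL frame arrives. Returns false if the request is
    // unknown or has already completed, which makes duplicate CANCELs harmless.
    public bool Cancel(Guid correlationId)
    {
        if (!_inflight.TryRemove(correlationId, out var cts)) return false;
        cts.Cancel();
        return true;
    }

    // Called after the RESPONSE has been sent.
    public void Complete(Guid correlationId)
    {
        if (_inflight.TryRemove(correlationId, out var cts)) cts.Dispose();
    }
}
```

Removing the entry before cancelling means a late CANCEL for a finished request is a no-op rather than an error.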
## Exit Criteria
Before marking this sprint DONE:
1. [ ] `InMemoryTransportServer` fully implements `ITransportServer`
2. [ ] `InMemoryTransportClient` fully implements `ITransportClient`
3. [ ] All frame types (HELLO, HEARTBEAT, REQUEST, RESPONSE, REQUEST_STREAM_DATA, RESPONSE_STREAM_DATA, CANCEL) are handled
4. [ ] Thread-safe concurrent access to `InMemoryConnectionRegistry`
5. [ ] All integration tests pass
6. [ ] No external dependencies (only BCL + Router.Common)
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Uses `System.Threading.Channels` for async frame passing (unbounded by default, can add backpressure later)
- InMemory transport simulates latency only if explicitly configured (default: instant)
- Connection lifetime is tied to `CancellationTokenSource`; disposing triggers cleanup
- This transport is explicitly excluded from production deployments via conditional compilation or package separation


@@ -0,0 +1,135 @@
# Sprint 7000-0003-0001 · Microservice SDK · Core Infrastructure
## Topic & Scope
Implement the core infrastructure of the Microservice SDK: options, endpoint discovery, and router connection management. After this sprint, a microservice can connect to a router and send HELLO with its endpoint list.
**Goal:** "Connect and say HELLO" - microservice connects to router(s) and registers its identity and endpoints.
**Working directory:** `src/__Libraries/StellaOps.Microservice/`
**Parallel track:** This sprint can run in parallel with Gateway sprints (7000-0004-*) once the InMemory transport is complete.
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0002 (Common), SPRINT_7000_0002_0001 (InMemory transport)
- **Downstream:** SPRINT_7000_0003_0002 (request handling)
- **Parallel work:** Can run in parallel with Gateway core sprint
- **Cross-module impact:** None. All work in `src/__Libraries/StellaOps.Microservice/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 7 - Microservice SDK requirements)
- `docs/router/04-Step.md` (detailed task breakdown)
- `docs/router/implplan.md` (phase 4 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | SDK-001 | TODO | Implement `StellaMicroserviceOptions` with all required properties | ServiceName, Version, Region, InstanceId, Routers, ConfigFilePath |
| 2 | SDK-002 | TODO | Implement `RouterEndpointConfig` (host, port, transport type) | |
| 3 | SDK-003 | TODO | Validate that Routers list is mandatory (throw if empty) | Per spec |
| 4 | SDK-010 | TODO | Create `[StellaEndpoint]` attribute for endpoint declaration | Method, Path, SupportsStreaming, Timeout |
| 5 | SDK-011 | TODO | Implement runtime reflection endpoint discovery | Scan assemblies for `[StellaEndpoint]` |
| 6 | SDK-012 | TODO | Build in-memory `EndpointDescriptor` list from discovered endpoints | |
| 7 | SDK-013 | TODO | Create `IEndpointDiscoveryProvider` abstraction | For source-gen vs reflection swap |
| 8 | SDK-020 | TODO | Implement `IRouterConnectionManager` interface | |
| 9 | SDK-021 | TODO | Implement `RouterConnectionManager` with connection pool | One connection per router endpoint |
| 10 | SDK-022 | TODO | Implement connection lifecycle (connect, reconnect on failure) | Exponential backoff |
| 11 | SDK-023 | TODO | Implement HELLO frame construction from options + endpoints | |
| 12 | SDK-024 | TODO | Send HELLO on connection establishment | |
| 13 | SDK-025 | TODO | Implement HEARTBEAT sending on timer | Configurable interval |
| 14 | SDK-030 | TODO | Implement `AddStellaMicroservice(IServiceCollection, Action<StellaMicroserviceOptions>)` | Full DI registration |
| 15 | SDK-031 | TODO | Register `IHostedService` for connection management | Start/stop with host |
| 16 | SDK-032 | TODO | Create `MicroserviceHostedService` that starts connections on app startup | |
| 17 | SDK-040 | TODO | Write unit tests for endpoint discovery | |
| 18 | SDK-041 | TODO | Write integration tests with InMemory transport | Connect, HELLO, HEARTBEAT |
## Endpoint Discovery
### Attribute-Based Declaration
```csharp
[StellaEndpoint("POST", "/billing/invoices")]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
    public Task<CreateInvoiceResponse> HandleAsync(CreateInvoiceRequest request, CancellationToken ct)
    {
        // Handler logic goes here; omitted in this declaration example.
        throw new NotImplementedException();
    }
}
```
### Discovery Flow
1. On startup, scan loaded assemblies for types with `[StellaEndpoint]`
2. For each type, verify it implements a handler interface
3. Build `EndpointDescriptor` from attribute + defaults
4. Store in `IEndpointRegistry` for lookup and HELLO construction
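
The discovery flow above can be sketched with plain reflection. The attribute constructor, the marker interface, and `EndpointScanner` below are simplified assumptions; the real SDK would also accept the typed handler interfaces and build full `EndpointDescriptor` instances.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class StellaEndpointAttribute : Attribute
{
    public StellaEndpointAttribute(string method, string path)
    {
        Method = method;
        Path = path;
    }

    public string Method { get; }
    public string Path { get; }
    public bool SupportsStreaming { get; set; }
}

// Simplified marker interface standing in for the raw handler contract.
public interface IRawStellaEndpoint { }

[StellaEndpoint("GET", "/health")]
public sealed class HealthEndpoint : IRawStellaEndpoint { }

public static class EndpointScanner
{
    // Steps 1-3: find concrete classes that carry the attribute and implement a handler interface.
    public static IReadOnlyList<(string Method, string Path, Type Handler)> Scan(Assembly assembly) =>
        assembly.GetTypes()
            .Where(t => t is { IsClass: true, IsAbstract: false })
            .Select(t => (Type: t, Attr: t.GetCustomAttribute<StellaEndpointAttribute>()))
            .Where(x => x.Attr is not null && typeof(IRawStellaEndpoint).IsAssignableFrom(x.Type))
            .Select(x => (Method: x.Attr!.Method, Path: x.Attr!.Path, Handler: x.Type))
            .ToList();
}
```

A type that carries the attribute but implements no handler interface is silently skipped here; failing fast at startup may be the better choice for a real implementation.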
### Handler Interface Detection
```csharp
// Typed with request: IStellaEndpoint<TRequest, TResponse>
typeof(IStellaEndpoint<,>)
// Typed without request: IStellaEndpoint<TResponse>
typeof(IStellaEndpoint<>)
// Raw handler
typeof(IRawStellaEndpoint)
```
## Connection Lifecycle
```
┌─────────────┐     Connect     ┌─────────────┐     HELLO      ┌─────────────┐
│ Disconnected│────────────────►│  Connected  │───────────────►│ Registered  │
└─────────────┘                 └─────────────┘                └─────────────┘
       ▲                               │                              │
       │                               │ Error                        │ Heartbeat timer
       │                               ▼                              ▼
       │                        ┌─────────────┐                ┌─────────────┐
       └────────────────────────│  Reconnect  │◄───────────────│  Heartbeat  │
              Backoff           │  (backoff)  │     Error      │   Active    │
                                └─────────────┘                └─────────────┘
```
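
The backoff used in the Reconnect state could be computed as in the sketch below. The doubling factor, the zero-based attempt counter, and the `ReconnectBackoff` name are assumptions; the source only says the backoff is exponential and capped by `ReconnectBackoffMax`.

```csharp
using System;

public static class ReconnectBackoff
{
    // delay = min(max, baseDelay * 2^attempt), with attempt starting at 0.
    public static TimeSpan Delay(int attempt, TimeSpan baseDelay, TimeSpan max)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Clamp the exponent so the multiplication stays well within double range.
        var factor = Math.Pow(2, Math.Min(attempt, 30));
        var ticks = Math.Min((double)max.Ticks, baseDelay.Ticks * factor);
        return TimeSpan.FromTicks((long)ticks);
    }
}
```

A production version would likely add random jitter so that many instances reconnecting after a router restart do not retry in lockstep.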
## StellaMicroserviceOptions
```csharp
public sealed class StellaMicroserviceOptions
{
    public string ServiceName { get; set; } = string.Empty;
    public string Version { get; set; } = string.Empty;    // Strict semver
    public string Region { get; set; } = string.Empty;
    public string InstanceId { get; set; } = string.Empty; // Auto-generate if empty
    public IList<RouterEndpointConfig> Routers { get; set; } = new List<RouterEndpointConfig>();
    public string? ConfigFilePath { get; set; }            // Optional YAML overrides
    public TimeSpan HeartbeatInterval { get; set; } = TimeSpan.FromSeconds(10);
    public TimeSpan ReconnectBackoffMax { get; set; } = TimeSpan.FromMinutes(1);
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] `StellaMicroserviceOptions` fully implemented with validation
2. [ ] Endpoint discovery works via reflection
3. [ ] Connection manager connects to configured routers
4. [ ] HELLO frame sent on connection with full endpoint list
5. [ ] HEARTBEAT sent periodically on timer
6. [ ] Reconnection with backoff on connection failure
7. [ ] Integration tests pass with InMemory transport
8. [ ] `AddStellaMicroservice()` registers all services correctly
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Endpoint discovery defaults to reflection; source generation comes in a later sprint
- InstanceId auto-generates using `Guid.NewGuid().ToString("N")` if not provided
- Version validation enforces strict semver format
- Routers list cannot be empty - throws `InvalidOperationException` on startup
- The YAML config file is optional at this stage; YAML support is covered by Sprint 7000-0007-0002
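
A minimal sketch of the startup validation these decisions imply. `OptionsSketch` is a trimmed stand-in for `StellaMicroserviceOptions` (routers are plain strings here), and the regular expression only checks the `MAJOR.MINOR.PATCH` core; whether pre-release or build suffixes count as "strict semver" is not stated in the source.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Trimmed stand-in for StellaMicroserviceOptions.
public sealed class OptionsSketch
{
    public string ServiceName { get; set; } = string.Empty;
    public string Version { get; set; } = string.Empty;
    public string InstanceId { get; set; } = string.Empty;
    public IList<string> Routers { get; set; } = new List<string>();
}

public static class OptionsValidator
{
    // Assumption: only MAJOR.MINOR.PATCH without leading zeros is accepted.
    private static readonly Regex SemVerCore =
        new(@"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$");

    public static void ValidateAndNormalize(OptionsSketch options)
    {
        if (options.Routers.Count == 0)
            throw new InvalidOperationException("At least one router endpoint must be configured.");

        if (!SemVerCore.IsMatch(options.Version))
            throw new InvalidOperationException($"Version '{options.Version}' is not MAJOR.MINOR.PATCH.");

        // Auto-generate the instance id when none was supplied.
        if (string.IsNullOrWhiteSpace(options.InstanceId))
            options.InstanceId = Guid.NewGuid().ToString("N");
    }
}
```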


@@ -0,0 +1,173 @@
# Sprint 7000-0003-0002 · Microservice SDK · Request Handling
## Topic & Scope
Implement request handling in the Microservice SDK: receiving REQUEST frames, dispatching to handlers, and sending RESPONSE frames. Supports both typed and raw handler patterns.
**Goal:** Complete the request/response flow - microservice receives requests from router and returns responses.
**Working directory:** `src/__Libraries/StellaOps.Microservice/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0003_0001 (SDK core with connection + HELLO)
- **Downstream:** SPRINT_7000_0005_0003 (cancellation), SPRINT_7000_0005_0004 (streaming)
- **Parallel work:** Can run in parallel with Gateway middleware sprint
- **Cross-module impact:** None. All work in `src/__Libraries/StellaOps.Microservice/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 7.2, 7.4, 7.5 - Endpoint definition, Connection behavior, Request handling)
- `docs/router/04-Step.md` (detailed task breakdown - request handling section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | HDL-001 | TODO | Define `IRawStellaEndpoint` interface | Takes RawRequestContext, returns RawResponse |
| 2 | HDL-002 | TODO | Define `IStellaEndpoint<TRequest, TResponse>` interface | Typed request/response |
| 3 | HDL-003 | TODO | Define `IStellaEndpoint<TResponse>` interface | No request body |
| 4 | HDL-010 | TODO | Implement `RawRequestContext` | Method, Path, Headers, Body stream, CancellationToken |
| 5 | HDL-011 | TODO | Implement `RawResponse` | StatusCode, Headers, Body stream |
| 6 | HDL-012 | TODO | Implement `IHeaderCollection` abstraction | Key-value header access |
| 7 | HDL-020 | TODO | Create `IEndpointRegistry` for handler lookup | (Method, Path) → handler instance |
| 8 | HDL-021 | TODO | Implement path template matching (ASP.NET-style routes) | Handles `{id}` parameters |
| 9 | HDL-022 | TODO | Implement path matching rules (case sensitivity, trailing slash) | Per spec |
| 10 | HDL-030 | TODO | Create `TypedEndpointAdapter` to wrap typed handlers as raw | IStellaEndpoint<T,R> → IRawStellaEndpoint |
| 11 | HDL-031 | TODO | Implement request deserialization in adapter | JSON by default |
| 12 | HDL-032 | TODO | Implement response serialization in adapter | JSON by default |
| 13 | HDL-040 | TODO | Implement `RequestDispatcher` | Frame → RawRequestContext → Handler → RawResponse → Frame |
| 14 | HDL-041 | TODO | Implement frame-to-context conversion | REQUEST frame → RawRequestContext |
| 15 | HDL-042 | TODO | Implement response-to-frame conversion | RawResponse → RESPONSE frame |
| 16 | HDL-043 | TODO | Wire dispatcher into connection read loop | Process REQUEST frames |
| 17 | HDL-050 | TODO | Implement `IServiceProvider` integration for handler instantiation | DI support |
| 18 | HDL-051 | TODO | Implement handler scoping (per-request scope) | IServiceScope per request |
| 19 | HDL-060 | TODO | Write unit tests for path matching | Various patterns |
| 20 | HDL-061 | TODO | Write unit tests for typed adapter | Serialization round-trip |
| 21 | HDL-062 | TODO | Write integration tests for full REQUEST/RESPONSE flow | With InMemory transport |
## Handler Interfaces
### Raw Handler
```csharp
public interface IRawStellaEndpoint
{
    Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken cancellationToken);
}
```
### Typed Handlers
```csharp
public interface IStellaEndpoint<TRequest, TResponse>
{
    Task<TResponse> HandleAsync(TRequest request, CancellationToken cancellationToken);
}

public interface IStellaEndpoint<TResponse>
{
    Task<TResponse> HandleAsync(CancellationToken cancellationToken);
}
```
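
The idea behind `TypedEndpointAdapter` (tasks HDL-030 to HDL-032) can be sketched with `System.Text.Json`. The adapter below works directly on streams instead of `RawRequestContext`/`RawResponse` to stay self-contained; `TypedAdapterSketch`, `Greeting`, `Reply`, and `HelloEndpoint` are hypothetical names.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public interface IStellaEndpoint<TRequest, TResponse>
{
    Task<TResponse> HandleAsync(TRequest request, CancellationToken cancellationToken);
}

public sealed class TypedAdapterSketch<TRequest, TResponse> where TRequest : class
{
    private readonly IStellaEndpoint<TRequest, TResponse> _inner;

    public TypedAdapterSketch(IStellaEndpoint<TRequest, TResponse> inner) => _inner = inner;

    // Deserialize the JSON body, invoke the typed handler, serialize the result.
    public async Task<Stream> HandleAsync(Stream requestBody, CancellationToken ct)
    {
        var request = await JsonSerializer.DeserializeAsync<TRequest>(requestBody, cancellationToken: ct)
                      ?? throw new InvalidDataException("Request body deserialized to null.");

        var response = await _inner.HandleAsync(request, ct);

        var output = new MemoryStream();
        await JsonSerializer.SerializeAsync(output, response, cancellationToken: ct);
        output.Position = 0; // rewind so the caller can read from the start
        return output;
    }
}

// Example typed handler used to exercise the adapter.
public sealed record Greeting(string Name);
public sealed record Reply(string Message);

public sealed class HelloEndpoint : IStellaEndpoint<Greeting, Reply>
{
    public Task<Reply> HandleAsync(Greeting request, CancellationToken cancellationToken) =>
        Task.FromResult(new Reply($"Hello, {request.Name}"));
}
```

Keeping serialization inside the adapter means the dispatcher only ever deals with raw handlers, which matches the requirement that the body stays opaque at the SDK level for raw endpoints.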
## RawRequestContext
```csharp
public sealed class RawRequestContext
{
    public string Method { get; init; } = string.Empty;
    public string Path { get; init; } = string.Empty;
    public IReadOnlyDictionary<string, string> PathParameters { get; init; }
        = new Dictionary<string, string>();
    public IHeaderCollection Headers { get; init; } = default!;
    public Stream Body { get; init; } = Stream.Null;
    public CancellationToken CancellationToken { get; init; }
}
```
## RawResponse
```csharp
public sealed class RawResponse
{
    public int StatusCode { get; init; } = 200;
    public IHeaderCollection Headers { get; init; } = default!;
    public Stream Body { get; init; } = Stream.Null;

    public static RawResponse Ok(Stream body) => new() { StatusCode = 200, Body = body };
    public static RawResponse NotFound() => new() { StatusCode = 404 };
    public static RawResponse Error(int statusCode, string message) => ...;
}
```
## Path Template Matching
Must use same rules as router (ASP.NET-style):
- `{id}` matches any segment, value captured in PathParameters
- `{id:int}` constraint support (optional for v1)
- Case sensitivity: configurable, default case-insensitive
- Trailing slash: configurable, default treats `/foo` and `/foo/` as equivalent
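
The default rules above (case-insensitive literals, trailing slash ignored, `{id}` capturing one segment) can be sketched as a small matcher. `PathTemplate.TryMatch` is a hypothetical helper; constraints such as `{id:int}` are parsed but not enforced, and catch-all or optional segments are out of scope.

```csharp
using System;
using System.Collections.Generic;

public static class PathTemplate
{
    public static bool TryMatch(string template, string path, out Dictionary<string, string> parameters)
    {
        parameters = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Trimming both ends makes "/foo" and "/foo/" equivalent.
        var templateSegments = template.Trim('/').Split('/');
        var pathSegments = path.Trim('/').Split('/');
        if (templateSegments.Length != pathSegments.Length) return false;

        for (var i = 0; i < templateSegments.Length; i++)
        {
            var segment = templateSegments[i];
            if (segment.Length > 2 && segment[0] == '{' && segment[^1] == '}')
            {
                if (pathSegments[i].Length == 0) return false; // parameters must be non-empty

                // Drop an optional constraint suffix such as ":int" (not enforced here).
                var name = segment[1..^1];
                var colon = name.IndexOf(':');
                if (colon >= 0) name = name[..colon];
                parameters[name] = pathSegments[i];
            }
            else if (!string.Equals(segment, pathSegments[i], StringComparison.OrdinalIgnoreCase))
            {
                parameters.Clear();
                return false;
            }
        }
        return true;
    }
}
```

Because the router and the SDK must agree on these rules, sharing one matcher between them would avoid subtle mismatches.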
## Request Flow
```
┌─────────────────┐     ┌────────────────────┐     ┌───────────────────┐
│ REQUEST Frame   │────►│ RequestDispatcher  │────►│ IEndpointRegistry │
│ (from Router)   │     │                    │     │ (Method, Path)    │
└─────────────────┘     └────────────────────┘     └───────────────────┘
                                  │                          │
                                  │                          ▼
                                  │                ┌───────────────────┐
                                  │                │ Handler Instance  │
                                  │                │ (from DI scope)   │
                                  │                └───────────────────┘
                                  │                          │
                                  │◄─────────────────────────┘
                        ┌────────────────────┐
                        │ RawRequestContext  │
                        └────────────────────┘
                        ┌────────────────────┐
                        │ Handler.HandleAsync│
                        └────────────────────┘
                        ┌────────────────────┐
                        │ RawResponse        │
                        └────────────────────┘
                        ┌────────────────────┐
                        │ RESPONSE Frame     │
                        │ (to Router)        │
                        └────────────────────┘
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All handler interfaces defined and documented
2. [ ] `RawRequestContext` and `RawResponse` implemented
3. [ ] Path template matching works for common patterns
4. [ ] Typed handlers wrapped correctly via `TypedEndpointAdapter`
5. [ ] `RequestDispatcher` processes REQUEST frames end-to-end
6. [ ] DI integration works (handlers resolved from service provider)
7. [ ] Integration tests pass with InMemory transport
8. [ ] Body treated as opaque bytes (no interpretation at SDK level for raw handlers)
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Typed handlers use JSON serialization by default; configurable via options
- Path matching is case-insensitive by default (matches ASP.NET Core default)
- Each request gets its own DI scope for handler resolution
- Body stream may be buffered or streaming depending on endpoint configuration (streaming support comes in later sprint)
- Handler exceptions are caught and converted to 500 responses with error details (configurable)

@@ -0,0 +1,135 @@
# Sprint 7000-0004-0001 · Gateway · Core Infrastructure
## Topic & Scope
Implement the core infrastructure of the Gateway: node configuration, global routing state, and basic routing plugin. This sprint creates the foundation for HTTP → transport → microservice routing.
**Goal:** Gateway can maintain routing state from connected microservices and select instances for routing decisions.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
**Parallel track:** This sprint can run in parallel with Microservice SDK sprints (7000-0003-*) once the InMemory transport is complete.
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0002 (Common), SPRINT_7000_0002_0001 (InMemory transport)
- **Downstream:** SPRINT_7000_0004_0002 (middleware), SPRINT_7000_0004_0003 (connection handling)
- **Parallel work:** Can run in parallel with SDK core sprint
- **Cross-module impact:** None. All work in `src/Gateway/StellaOps.Gateway.WebService/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 6 - Gateway requirements)
- `docs/router/05-Step.md` (detailed task breakdown)
- `docs/router/implplan.md` (phase 5 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | GW-001 | TODO | Implement `GatewayNodeConfig` | Region, NodeId, Environment |
| 2 | GW-002 | TODO | Bind `GatewayNodeConfig` from configuration | appsettings.json section |
| 3 | GW-003 | TODO | Validate GatewayNodeConfig on startup | Region required |
| 4 | GW-010 | TODO | Implement `IGlobalRoutingState` as `InMemoryRoutingState` | Thread-safe implementation |
| 5 | GW-011 | TODO | Implement `ConnectionState` storage | ConcurrentDictionary by ConnectionId |
| 6 | GW-012 | TODO | Implement endpoint-to-connections index | (Method, Path) → List<ConnectionState> |
| 7 | GW-013 | TODO | Implement `ResolveEndpoint(method, path)` | Path template matching |
| 8 | GW-014 | TODO | Implement `GetConnectionsFor(serviceName, version, method, path)` | Filter by criteria |
| 9 | GW-020 | TODO | Create `IRoutingPlugin` implementation `DefaultRoutingPlugin` | Basic instance selection |
| 10 | GW-021 | TODO | Implement version filtering (strict semver equality) | Per spec |
| 11 | GW-022 | TODO | Implement health filtering (Healthy or Degraded only) | Per spec |
| 12 | GW-023 | TODO | Implement region preference (gateway region first) | Use GatewayNodeConfig.Region |
| 13 | GW-024 | TODO | Implement basic tie-breaking (any healthy instance) | Full algorithm in later sprint |
| 14 | GW-030 | TODO | Create `RoutingOptions` for configurable behavior | Default version, neighbor regions |
| 15 | GW-031 | TODO | Register routing services in DI | IGlobalRoutingState, IRoutingPlugin |
| 16 | GW-040 | TODO | Write unit tests for InMemoryRoutingState | |
| 17 | GW-041 | TODO | Write unit tests for DefaultRoutingPlugin | Version, health, region filtering |
## GatewayNodeConfig
```csharp
public sealed class GatewayNodeConfig
{
    public string Region { get; set; } = string.Empty;        // Required, e.g. "eu1"
    public string NodeId { get; set; } = string.Empty;        // e.g. "gw-eu1-01"
    public string Environment { get; set; } = string.Empty;   // e.g. "prod"
    public IList<string> NeighborRegions { get; set; } = [];  // Fallback regions
}
```
**Configuration binding:**
```json
{
  "GatewayNode": {
    "Region": "eu1",
    "NodeId": "gw-eu1-01",
    "Environment": "prod",
    "NeighborRegions": ["eu2", "us1"]
  }
}
```
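The fail-fast rule from GW-003 (Region required) can be sketched as a plain validation step that runs once after binding. This is a minimal sketch: `GatewayNodeConfigValidator` and `ValidateOrThrow` are hypothetical names, and the config class is repeated only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Copied from the definition above so the sketch compiles on its own.
public sealed class GatewayNodeConfig
{
    public string Region { get; set; } = string.Empty;
    public string NodeId { get; set; } = string.Empty;
    public string Environment { get; set; } = string.Empty;
    public IList<string> NeighborRegions { get; set; } = new List<string>();
}

// Hypothetical validator; only the "Region required" rule comes from the sprint text.
public static class GatewayNodeConfigValidator
{
    public static void ValidateOrThrow(GatewayNodeConfig config)
    {
        ArgumentNullException.ThrowIfNull(config);

        // Routing decisions depend on the node's region, so refuse to start without it.
        if (string.IsNullOrWhiteSpace(config.Region))
        {
            throw new InvalidOperationException(
                "GatewayNode:Region is required; the gateway cannot start without it.");
        }
    }
}
```

In an ASP.NET Core host the same check could instead be attached to the options pipeline (for example via `Validate(...)` and `ValidateOnStart()` on the options builder), which keeps the failure at startup rather than at first use.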
## InMemoryRoutingState
```csharp
internal sealed class InMemoryRoutingState : IGlobalRoutingState
{
    // Live connections keyed by ConnectionId.
    private readonly ConcurrentDictionary<string, ConnectionState> _connections = new();

    // (Method, Path) -> ConnectionIds serving that endpoint.
    // Note: List<string> itself is not thread-safe; mutations of a list must be synchronized.
    private readonly ConcurrentDictionary<(string Method, string Path), List<string>> _endpointIndex = new();

    public void AddConnection(ConnectionState connection) { ... }
    public void RemoveConnection(string connectionId) { ... }
    public void UpdateConnection(string connectionId, Action<ConnectionState> update) { ... }

    public EndpointDescriptor? ResolveEndpoint(string method, string path) { ... }

    public IReadOnlyList<ConnectionState> GetConnectionsFor(
        string serviceName, string version, string method, string path) { ... }
}
```
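One way to keep the endpoint index safe under concurrent registration and cleanup is to store immutable sets and swap them atomically. The sketch below is an assumption-laden illustration, not the planned implementation: `EndpointIndex` and its members are hypothetical, keys are compared case-sensitively, and emptied entries are left in place for simplicity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

// Hypothetical stand-alone index; in the sprint this logic would sit inside InMemoryRoutingState.
public sealed class EndpointIndex
{
    private readonly ConcurrentDictionary<(string Method, string Path), ImmutableHashSet<string>> _index = new();

    // Registers a connection for an endpoint; the set is replaced, never mutated in place.
    public void Add(string method, string path, string connectionId) =>
        _index.AddOrUpdate(
            (method, path),
            _ => ImmutableHashSet.Create(connectionId),
            (_, existing) => existing.Add(connectionId));

    // Removes a connection from every endpoint it was registered for.
    public void RemoveConnection(string connectionId)
    {
        foreach (var key in _index.Keys)
        {
            // Retry until the swap succeeds against the value we observed.
            while (_index.TryGetValue(key, out var existing) && existing.Contains(connectionId))
            {
                var updated = existing.Remove(connectionId);
                if (_index.TryUpdate(key, updated, existing))
                    break;
            }
        }
    }

    // Returns a snapshot that is safe to enumerate while other threads update the index.
    public IReadOnlyCollection<string> GetConnectionIds(string method, string path) =>
        _index.TryGetValue((method, path), out var ids) ? ids : ImmutableHashSet<string>.Empty;
}
```

Readers always see a consistent snapshot, at the cost of allocating a new set on each change; for an index that changes only on connect and disconnect that trade-off seems reasonable.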
## Routing Algorithm (Phase 1 - Basic)
```
1. Filter by ServiceName (exact match)
2. Filter by Version (strict semver equality)
3. Filter by Health (Healthy or Degraded only)
4. Prefer instances in the gateway's own region (GatewayNodeConfig.Region), if any (GW-023)
5. If any remain, pick one (random for now)
6. If none, return null (503 Service Unavailable)
```
**Note:** The full routing algorithm (neighbor-region tiers, ping-based selection, fallback) is implemented in SPRINT_7000_0005_0002.
## Region Derivation
Per spec section 2:
> Routing decisions MUST use `GatewayNodeConfig.Region` as the node's region; the router MUST NOT derive region from HTTP headers or URL host names.
This is enforced by:
1. GatewayNodeConfig is bound from static configuration only
2. No code path reads region from HttpContext
3. Tests verify region is never extracted from Host header
## Exit Criteria
Before marking this sprint DONE:
1. [ ] `GatewayNodeConfig` loads and validates from configuration
2. [ ] `InMemoryRoutingState` stores and indexes connections correctly
3. [ ] `ResolveEndpoint` performs path template matching
4. [ ] `DefaultRoutingPlugin` filters by version, health, region
5. [ ] All services registered in DI container
6. [ ] Unit tests pass for routing state and plugin
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Routing state is in-memory only; no persistence or distribution (single gateway node for v1)
- Path template matching reuses logic from SDK (shared in Common or duplicated)
- DefaultRoutingPlugin is intentionally simple; full algorithm comes in SPRINT_7000_0005_0002
- Region validation: startup fails fast if Region is empty

@@ -0,0 +1,172 @@
# Sprint 7000-0004-0002 · Gateway · HTTP Middleware Pipeline
## Topic & Scope
Implement the HTTP middleware pipeline for the Gateway: endpoint resolution, authorization, routing decision, and transport dispatch. After this sprint, HTTP requests flow through the gateway to microservices via the InMemory transport.
**Goal:** Complete HTTP → transport → microservice → HTTP flow for basic buffered requests.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0004_0001 (Gateway core)
- **Downstream:** SPRINT_7000_0004_0003 (connection handling)
- **Parallel work:** Can run in parallel with SDK request handling sprint
- **Cross-module impact:** None. All work in `src/Gateway/StellaOps.Gateway.WebService/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 6.1 - HTTP ingress pipeline)
- `docs/router/05-Step.md` (middleware section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MID-001 | TODO | Create `EndpointResolutionMiddleware` | (Method, Path) → EndpointDescriptor |
| 2 | MID-002 | TODO | Store resolved endpoint in `HttpContext.Items` | For downstream middleware |
| 3 | MID-003 | TODO | Return 404 if endpoint not found | |
| 4 | MID-010 | TODO | Create `AuthorizationMiddleware` stub | Checks authenticated only (full claims later) |
| 5 | MID-011 | TODO | Wire ASP.NET Core authentication | Standard middleware order |
| 6 | MID-012 | TODO | Return 401/403 for unauthorized requests | |
| 7 | MID-020 | TODO | Create `RoutingDecisionMiddleware` | Calls IRoutingPlugin.ChooseInstanceAsync |
| 8 | MID-021 | TODO | Store RoutingDecision in `HttpContext.Items` | |
| 9 | MID-022 | TODO | Return 503 if no instance available | |
| 10 | MID-023 | TODO | Return 504 if routing times out | |
| 11 | MID-030 | TODO | Create `TransportDispatchMiddleware` | Dispatches to selected transport |
| 12 | MID-031 | TODO | Implement buffered request dispatch | Read entire body, send REQUEST frame |
| 13 | MID-032 | TODO | Implement buffered response handling | Read RESPONSE frame, write to HTTP |
| 14 | MID-033 | TODO | Map transport errors to HTTP status codes | |
| 15 | MID-040 | TODO | Create `GlobalErrorHandlerMiddleware` | Catches unhandled exceptions |
| 16 | MID-041 | TODO | Implement structured error responses | JSON error envelope |
| 17 | MID-050 | TODO | Create `RequestLoggingMiddleware` | Correlation ID, service, endpoint, region, instance |
| 18 | MID-051 | TODO | Wire forwarded headers middleware | For reverse proxy support |
| 19 | MID-060 | TODO | Configure middleware pipeline in Program.cs | Correct order |
| 20 | MID-070 | TODO | Write integration tests for full HTTP→transport flow | With InMemory transport + SDK |
| 21 | MID-071 | TODO | Write tests for error scenarios (404, 503, etc.) | |
## Middleware Pipeline Order
```csharp
app.UseForwardedHeaders(); // Reverse proxy support
app.UseMiddleware<GlobalErrorHandlerMiddleware>();
app.UseMiddleware<RequestLoggingMiddleware>();
app.UseAuthentication(); // ASP.NET Core auth
app.UseMiddleware<EndpointResolutionMiddleware>();
app.UseMiddleware<AuthorizationMiddleware>();
app.UseMiddleware<RoutingDecisionMiddleware>();
app.UseMiddleware<TransportDispatchMiddleware>();
```
## EndpointResolutionMiddleware
```csharp
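public class EndpointResolutionMiddleware
{
    private readonly RequestDelegate _next;

    public EndpointResolutionMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context, IGlobalRoutingState routingState)
    {
        var method = context.Request.Method;
        var path = context.Request.Path.Value ?? "/";

        var endpoint = routingState.ResolveEndpoint(method, path);
        if (endpoint == null)
        {
            // No registered microservice endpoint matches (Method, Path).
            context.Response.StatusCode = StatusCodes.Status404NotFound;
            await context.Response.WriteAsJsonAsync(new { error = "Endpoint not found" });
            return;
        }

        // Make the resolved endpoint available to downstream middleware.
        context.Items[ContextKeys.ResolvedEndpoint] = endpoint;
        await _next(context);
    }
}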
```
## TransportDispatchMiddleware (Buffered Mode)
```csharp
public class TransportDispatchMiddleware
{
    // Terminal middleware: UseMiddleware<T>() requires the RequestDelegate parameter,
    // but the next delegate is never invoked.
    public TransportDispatchMiddleware(RequestDelegate next) { }

    public async Task InvokeAsync(HttpContext context, ITransportClient transport)
    {
        var decision = (RoutingDecision)context.Items[ContextKeys.RoutingDecision]!;
        var endpoint = (EndpointDescriptor)context.Items[ContextKeys.ResolvedEndpoint]!;

        // Buffered mode: read the entire request body into memory
        using var bodyStream = new MemoryStream();
        await context.Request.Body.CopyToAsync(bodyStream, context.RequestAborted);

        // Build REQUEST frame
        var requestFrame = new Frame
        {
            Type = FrameType.Request,
            CorrelationId = Guid.NewGuid(),
            Payload = BuildRequestPayload(context, bodyStream.ToArray())
        };

        // Send and await response; cancel on client abort or when the timeout elapses
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(
            context.RequestAborted);
        cts.CancelAfter(decision.EffectiveTimeout);

        var responseFrame = await transport.SendRequestAsync(
            decision.Connection,
            requestFrame,
            decision.EffectiveTimeout,
            cts.Token);

        // Write response to HTTP
        await WriteHttpResponse(context, responseFrame);
    }
}
```
## Error Mapping
| Transport/Routing Error | HTTP Status |
|------------------------|-------------|
| Endpoint not found | 404 Not Found |
| No healthy instance | 503 Service Unavailable |
| Timeout | 504 Gateway Timeout |
| Microservice error (5xx) | Pass through status |
| Transport connection lost | 502 Bad Gateway |
| Payload too large | 413 Payload Too Large |
| Unauthorized | 401 Unauthorized |
| Forbidden (claims) | 403 Forbidden |
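The table above can be expressed as a single mapping function so that every middleware returns the same status for the same failure. This is a sketch under assumptions: the `GatewayError` enum and `GatewayErrorMapper` are hypothetical names, and the pass-through case for microservice 5xx responses is intentionally left out because it carries its own status code.

```csharp
using System;

// Hypothetical error categories; the real transport/routing layer may model these differently.
public enum GatewayError
{
    EndpointNotFound,
    NoHealthyInstance,
    Timeout,
    ConnectionLost,
    PayloadTooLarge,
    Unauthorized,
    Forbidden
}

public static class GatewayErrorMapper
{
    // Mirrors the error mapping table row by row.
    public static int ToHttpStatus(GatewayError error) => error switch
    {
        GatewayError.EndpointNotFound => 404,  // Not Found
        GatewayError.NoHealthyInstance => 503, // Service Unavailable
        GatewayError.Timeout => 504,           // Gateway Timeout
        GatewayError.ConnectionLost => 502,    // Bad Gateway
        GatewayError.PayloadTooLarge => 413,   // Payload Too Large
        GatewayError.Unauthorized => 401,      // Unauthorized
        GatewayError.Forbidden => 403,         // Forbidden
        _ => throw new ArgumentOutOfRangeException(nameof(error), error, "Unmapped gateway error")
    };
}
```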
## HttpContext.Items Keys
```csharp
public static class ContextKeys
{
    public const string ResolvedEndpoint = "ResolvedEndpoint";
    public const string RoutingDecision = "RoutingDecision";
    public const string CorrelationId = "CorrelationId";
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All middleware classes implemented
2. [ ] Pipeline configured in correct order
3. [ ] EndpointResolutionMiddleware resolves (Method, Path) → endpoint
4. [ ] AuthorizationMiddleware checks authentication (claims in later sprint)
5. [ ] RoutingDecisionMiddleware selects instance via IRoutingPlugin
6. [ ] TransportDispatchMiddleware sends/receives frames (buffered mode)
7. [ ] Error responses use consistent JSON envelope
8. [ ] Integration tests pass with InMemory transport
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Authorization middleware is a stub that only checks `User.Identity?.IsAuthenticated`; full RequiringClaims enforcement comes in SPRINT_7000_0008_0001
- Streaming support is not implemented in this sprint; TransportDispatchMiddleware only handles buffered mode
- Correlation ID is generated per request and logged throughout
- Request body is fully read into memory for buffered mode; streaming in SPRINT_7000_0005_0004

@@ -0,0 +1,218 @@
# Sprint 7000-0004-0003 · Gateway · Connection Handling
## Topic & Scope
Implement connection handling in the Gateway: processing HELLO frames from microservices, maintaining connection state, and updating the global routing state. After this sprint, microservices can register with the gateway and be routed to.
**Goal:** Gateway receives HELLO from microservices and maintains live routing state. Combined with previous sprints, this enables full end-to-end HTTP → microservice routing.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0004_0002 (middleware), SPRINT_7000_0003_0001 (SDK core with HELLO)
- **Downstream:** SPRINT_7000_0005_0001 (heartbeat/health)
- **Parallel work:** Should coordinate with SDK team for HELLO frame format agreement
- **Cross-module impact:** None. All work in Gateway.
## Documentation Prerequisites
- `docs/router/specs.md` (section 6.2 - Per-connection state and routing view)
- `docs/router/05-Step.md` (connection handling section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | CON-001 | TODO | Create `IConnectionHandler` interface | Processes frames per connection |
| 2 | CON-002 | TODO | Implement `ConnectionHandler` | Frame type dispatch |
| 3 | CON-010 | TODO | Implement HELLO frame processing | Parse HelloPayload, create ConnectionState |
| 4 | CON-011 | TODO | Validate HELLO payload | ServiceName, Version, InstanceId required |
| 5 | CON-012 | TODO | Register connection in IGlobalRoutingState | AddConnection |
| 6 | CON-013 | TODO | Build endpoint index from HELLO | (Method, Path) → ConnectionId |
| 7 | CON-020 | TODO | Create `TransportServerHost` hosted service | Starts ITransportServer |
| 8 | CON-021 | TODO | Wire transport server to connection handler | Frame routing |
| 9 | CON-022 | TODO | Handle new connections (InMemory: channel registration) | |
| 10 | CON-030 | TODO | Implement connection cleanup on disconnect | RemoveConnection from routing state |
| 11 | CON-031 | TODO | Clean up endpoint index on disconnect | Remove all endpoints for connection |
| 12 | CON-032 | TODO | Log connection lifecycle events | Connect, HELLO, disconnect |
| 13 | CON-040 | TODO | Implement connection ID generation | Unique per connection |
| 14 | CON-041 | TODO | Store connection metadata | Transport type, connect time |
| 15 | CON-050 | TODO | Write integration tests for HELLO flow | SDK → Gateway registration |
| 16 | CON-051 | TODO | Write tests for connection cleanup | |
| 17 | CON-052 | TODO | Write tests for multiple connections from same service | Different instances |
## Connection Lifecycle
```
┌─────────────────┐
│ New Connection  │ (Transport layer signals new connection)
└────────┬────────┘
         ▼
┌─────────────────┐
│ Awaiting HELLO  │ (Connection exists but not registered for routing)
└────────┬────────┘
         │ HELLO frame received
         ▼
┌─────────────────┐
│ Validate HELLO  │ (Check ServiceName, Version, endpoints)
└────────┬────────┘
         │ Valid
         ▼
┌─────────────────┐
│ Create          │
│ ConnectionState │ (InstanceDescriptor, endpoints, health = Unknown)
└────────┬────────┘
         ▼
┌─────────────────┐
│ Register in     │ (Add to IGlobalRoutingState, index endpoints)
│ RoutingState    │
└────────┬────────┘
         ▼
┌─────────────────┐
│ Registered      │ (Connection can receive routed requests)
└────────┬────────┘
         │ Disconnect or error
         ▼
┌─────────────────┐
│ Cleanup State   │ (Remove from routing state, clean endpoint index)
└─────────────────┘
```
## HELLO Processing
```csharp
internal sealed class ConnectionHandler : IConnectionHandler
{
    // Dependencies (_routingState, _logger, _currentTransportType) are
    // constructor-injected; omitted here for brevity.

    public async Task HandleFrameAsync(string connectionId, Frame frame)
    {
        switch (frame.Type)
        {
            case FrameType.Hello:
                await ProcessHelloAsync(connectionId, frame);
                break;

            case FrameType.Heartbeat:
                await ProcessHeartbeatAsync(connectionId, frame);
                break;

            case FrameType.Response:
            case FrameType.ResponseStreamData:
                await ProcessResponseAsync(connectionId, frame);
                break;

            default:
                _logger.LogWarning("Unknown frame type {Type} from {ConnectionId}",
                    frame.Type, connectionId);
                break;
        }
    }

    private async Task ProcessHelloAsync(string connectionId, Frame frame)
    {
        var payload = DeserializeHelloPayload(frame.Payload);

        // Validate: ServiceName, Version, and InstanceId are all required (CON-011)
        if (string.IsNullOrEmpty(payload.Instance.ServiceName))
            throw new InvalidHelloException("ServiceName required");
        if (string.IsNullOrEmpty(payload.Instance.Version))
            throw new InvalidHelloException("Version required");
        if (string.IsNullOrEmpty(payload.Instance.InstanceId))
            throw new InvalidHelloException("InstanceId required");

        // Build ConnectionState; health stays Unknown until the first heartbeat
        var connection = new ConnectionState
        {
            ConnectionId = connectionId,
            Instance = payload.Instance,
            Status = InstanceHealthStatus.Unknown,
            LastHeartbeatUtc = DateTime.UtcNow,
            TransportType = _currentTransportType,
            Endpoints = payload.Endpoints.ToDictionary(
                e => (e.Method, e.Path),
                e => e)
        };

        // Register
        _routingState.AddConnection(connection);

        _logger.LogInformation(
            "Registered {ServiceName} v{Version} instance {InstanceId} from {Region}",
            payload.Instance.ServiceName,
            payload.Instance.Version,
            payload.Instance.InstanceId,
            payload.Instance.Region);
    }
}
```
## TransportServerHost
```csharp
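internal sealed class TransportServerHost : IHostedService
{
    // Constructor injection omitted for brevity.
    private readonly ITransportServer _server;
    private readonly IConnectionHandler _handler;
    private readonly IGlobalRoutingState _routingState;
    private readonly ILogger<TransportServerHost> _logger;

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        _server.OnConnection += HandleNewConnection;
        _server.OnFrame += HandleFrame;
        _server.OnDisconnect += HandleDisconnect;

        await _server.StartAsync(cancellationToken);
    }

    public async Task StopAsync(CancellationToken cancellationToken)
    {
        // Unsubscribe first so no frames are dispatched during shutdown.
        _server.OnConnection -= HandleNewConnection;
        _server.OnFrame -= HandleFrame;
        _server.OnDisconnect -= HandleDisconnect;

        // Assumes ITransportServer exposes StopAsync alongside StartAsync.
        await _server.StopAsync(cancellationToken);
    }

    private void HandleNewConnection(string connectionId)
    {
        _logger.LogInformation("New connection: {ConnectionId}", connectionId);
    }

    private async Task HandleFrame(string connectionId, Frame frame)
    {
        await _handler.HandleFrameAsync(connectionId, frame);
    }

    private void HandleDisconnect(string connectionId)
    {
        // Removing the connection also drops it from the endpoint index.
        _routingState.RemoveConnection(connectionId);
        _logger.LogInformation("Connection closed: {ConnectionId}", connectionId);
    }
}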
```
## Multiple Instances
The gateway must handle multiple instances of the same service:
- Same ServiceName + Version from different InstanceIds
- Each instance has its own ConnectionState
- Routing algorithm selects among available instances
```
Service: billing v1.0.0
├── Instance: billing-01 (Region: eu1) → Connection abc123
├── Instance: billing-02 (Region: eu1) → Connection def456
└── Instance: billing-03 (Region: us1) → Connection ghi789
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] HELLO frames processed correctly
2. [ ] ConnectionState created and stored
3. [ ] Endpoint index updated for routing lookups
4. [ ] Connection cleanup removes all state
5. [ ] TransportServerHost starts/stops with application
6. [ ] Integration tests: SDK registers, Gateway routes, SDK handles request
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Initial health status is `Unknown` until first heartbeat
- Connection ID format: GUID for InMemory, transport-specific for real transports
- HELLO validation failure disconnects the client (logs error)
- Duplicate HELLO from same connection replaces existing state (re-registration)

@@ -0,0 +1,205 @@
# Sprint 7000-0005-0001 · Protocol Features · Heartbeat & Health
## Topic & Scope
Implement heartbeat processing and health tracking. Microservices send HEARTBEAT frames periodically; the gateway updates health status and marks stale instances as unhealthy.
**Goal:** Gateway maintains accurate health status for all connected instances, enabling health-aware routing.
**Working directories:**
- `src/__Libraries/StellaOps.Microservice/` (heartbeat sending)
- `src/Gateway/StellaOps.Gateway.WebService/` (heartbeat processing)
- `src/__Libraries/StellaOps.Router.Common/` (if payload changes needed)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0004_0003 (Gateway connection handling), SPRINT_7000_0003_0001 (SDK core)
- **Downstream:** SPRINT_7000_0005_0002 (routing algorithm uses health)
- **Parallel work:** None. Sequential after connection handling.
- **Cross-module impact:** SDK and Gateway both modified.
## Documentation Prerequisites
- `docs/router/specs.md` (section 8 - Control/health/ping requirements)
- `docs/router/06-Step.md` (heartbeat section)
- `docs/router/implplan.md` (phase 6 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory / Notes |
|---|---------|--------|-------------|-------------------|
| 1 | HB-001 | TODO | Implement HeartbeatPayload serialization | Common |
| 2 | HB-002 | TODO | Add InstanceHealthStatus to HeartbeatPayload | Common |
| 3 | HB-003 | TODO | Add optional metrics to HeartbeatPayload (inflight count, error rate) | Common |
| 4 | HB-010 | TODO | Implement heartbeat sending timer in SDK | Microservice |
| 5 | HB-011 | TODO | Report current health status in heartbeat | Microservice |
| 6 | HB-012 | TODO | Report optional metrics in heartbeat | Microservice |
| 7 | HB-013 | TODO | Make heartbeat interval configurable | Microservice |
| 8 | HB-020 | TODO | Implement HEARTBEAT frame processing in Gateway | Gateway |
| 9 | HB-021 | TODO | Update LastHeartbeatUtc on heartbeat | Gateway |
| 10 | HB-022 | TODO | Update InstanceHealthStatus from payload | Gateway |
| 11 | HB-023 | TODO | Update optional metrics from payload | Gateway |
| 12 | HB-030 | TODO | Create HealthMonitorService hosted service | Gateway |
| 13 | HB-031 | TODO | Implement stale heartbeat detection | Configurable threshold |
| 14 | HB-032 | TODO | Mark instances Unhealthy when heartbeat stale | Gateway |
| 15 | HB-033 | TODO | Implement Draining status support | For graceful shutdown |
| 16 | HB-040 | TODO | Create HealthOptions for thresholds | StaleThreshold, DegradedThreshold |
| 17 | HB-041 | TODO | Bind HealthOptions from configuration | Gateway |
| 18 | HB-050 | TODO | Implement ping latency measurement (request/response timing) | Gateway |
| 19 | HB-051 | TODO | Update AveragePingMs from timing | Exponential moving average |
| 20 | HB-060 | TODO | Write integration tests for heartbeat flow | |
| 21 | HB-061 | TODO | Write tests for health status transitions | |
| 22 | HB-062 | TODO | Write tests for stale detection | |
## HeartbeatPayload
```csharp
public sealed class HeartbeatPayload
{
    public string InstanceId { get; init; } = string.Empty;
    public InstanceHealthStatus Status { get; init; }
    public int? InflightRequestCount { get; init; }
    public double? ErrorRatePercent { get; init; }
    public DateTimeOffset Timestamp { get; init; }
}
```
## Health Status Transitions
```
                ┌─────────┐
                │ Unknown │
                └────┬────┘
                     │ First heartbeat
                     │ (status from payload)
                     ▼
                ┌─────────┐
     ┌──────────│ Healthy │◄──────────────┐
     │ Degraded └────┬────┘               │
     │ in payload    │ Stale threshold    │ Heartbeat received
     ▼               │ exceeded           │ (Healthy in payload)
┌──────────┐         ▼                    │
│ Degraded │   ┌───────────┐              │
└────┬─────┘   │ Unhealthy │──────────────┘
     │         └───────────┘
     │ Stale threshold exceeded
     ▼
┌───────────┐
│ Unhealthy │
└───────────┘
```
**Special case: Draining**
- Microservice explicitly sets status to `Draining`
- Router stops sending new requests but allows in-flight to complete
- Used for graceful shutdown
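The transitions above, including the Draining case, can be sketched as a few pure functions. This is an illustration rather than the planned design: `HealthTransitions` and its method names are hypothetical, the enum is repeated (with an assumed member order) so the example compiles on its own, and stale detection is applied to every status alike.

```csharp
using System;

// Repeated here for self-containment; member order is an assumption.
public enum InstanceHealthStatus { Unknown, Healthy, Degraded, Draining, Unhealthy }

// Hypothetical pure functions capturing the transitions described above.
public static class HealthTransitions
{
    // A heartbeat always wins: the instance reports its own status.
    public static InstanceHealthStatus OnHeartbeat(InstanceHealthStatus reported) => reported;

    // Called by the periodic stale check; only an overdue heartbeat changes the status.
    public static InstanceHealthStatus OnStaleCheck(
        InstanceHealthStatus current, TimeSpan heartbeatAge, TimeSpan staleThreshold) =>
        heartbeatAge > staleThreshold ? InstanceHealthStatus.Unhealthy : current;

    // Draining, Unknown, and Unhealthy instances keep in-flight work but get no new requests.
    public static bool AcceptsNewRequests(InstanceHealthStatus status) =>
        status is InstanceHealthStatus.Healthy or InstanceHealthStatus.Degraded;
}
```

Keeping the rules free of I/O makes the status-transition tests (HB-061) straightforward table-driven checks.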
## HealthMonitorService
```csharp
internal sealed class HealthMonitorService : BackgroundService
{
    // Constructor injection omitted for brevity.
    private readonly IGlobalRoutingState _routingState;
    private readonly IOptions<HealthOptions> _options;
    private readonly ILogger<HealthMonitorService> _logger;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        var interval = TimeSpan.FromSeconds(5); // Check frequency

        while (!stoppingToken.IsCancellationRequested)
        {
            CheckStaleConnections();
            await Task.Delay(interval, stoppingToken);
        }
    }

    private void CheckStaleConnections()
    {
        var threshold = _options.Value.StaleThreshold;
        var now = DateTime.UtcNow;

        // Requires IGlobalRoutingState to expose GetAllConnections().
        foreach (var connection in _routingState.GetAllConnections())
        {
            var age = now - connection.LastHeartbeatUtc;

            // Only transition (and log) once per stale period.
            if (age > threshold && connection.Status != InstanceHealthStatus.Unhealthy)
            {
                _routingState.UpdateConnection(connection.ConnectionId,
                    c => c.Status = InstanceHealthStatus.Unhealthy);

                _logger.LogWarning(
                    "Instance {InstanceId} marked Unhealthy: no heartbeat for {Age}",
                    connection.Instance.InstanceId, age);
            }
        }
    }
}
```
## HealthOptions
```csharp
public sealed class HealthOptions
{
    public TimeSpan StaleThreshold { get; set; } = TimeSpan.FromSeconds(30);
    public TimeSpan DegradedThreshold { get; set; } = TimeSpan.FromSeconds(15);
    public int PingHistorySize { get; set; } = 10; // For moving average
}
```
## Ping Latency Measurement
Measure round-trip time for REQUEST/RESPONSE:
1. Record timestamp when REQUEST frame sent
2. Record timestamp when RESPONSE frame received
3. Calculate RTT = response_time - request_time
4. Update exponential moving average: `avg = 0.8 * avg + 0.2 * rtt`
```csharp
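using System.Collections.Concurrent;
using System.Diagnostics;

internal sealed class PingTracker
{
    private const double Alpha = 0.2; // Weight of the newest sample in the moving average

    private readonly ConcurrentDictionary<Guid, long> _pendingRequests = new();
    private readonly object _gate = new();
    private double _averagePingMs;
    private bool _hasSample;

    public void RecordRequestSent(Guid correlationId)
    {
        _pendingRequests[correlationId] = Stopwatch.GetTimestamp();
    }

    public void RecordResponseReceived(Guid correlationId)
    {
        if (!_pendingRequests.TryRemove(correlationId, out var startTicks))
            return; // Unknown or already-completed correlation ID

        var rtt = Stopwatch.GetElapsedTime(startTicks).TotalMilliseconds;

        lock (_gate)
        {
            // Seed with the first sample so the average does not start biased toward 0.
            _averagePingMs = _hasSample
                ? (1 - Alpha) * _averagePingMs + Alpha * rtt
                : rtt;
            _hasSample = true;
        }
    }

    // Read under the same lock so concurrent updates are never observed half-applied.
    public double AveragePingMs { get { lock (_gate) { return _averagePingMs; } } }
}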
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] SDK sends HEARTBEAT frames on timer
2. [ ] Gateway processes HEARTBEAT and updates ConnectionState
3. [ ] HealthMonitorService marks stale instances Unhealthy
4. [ ] Draining status stops new requests
5. [ ] Ping latency measured and stored
6. [ ] Health thresholds configurable
7. [ ] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Heartbeat interval default: 10 seconds (configurable)
- Stale threshold default: 30 seconds (3 missed heartbeats)
- Ping measurement uses REQUEST/RESPONSE timing, not separate PING frames
- Health status changes are logged for observability

@@ -0,0 +1,217 @@
# Sprint 7000-0005-0002 · Protocol Features · Full Routing Algorithm
## Topic & Scope
Implement the complete routing algorithm as specified: region preference, ping-based selection, heartbeat recency, and fallback logic.
**Goal:** Routes prefer closest healthy instances with lowest latency, falling back through region tiers when necessary.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0001 (heartbeat/health provides the metrics)
- **Downstream:** SPRINT_7000_0005_0003 (cancellation), SPRINT_7000_0006_* (real transports)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Gateway only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 4 - Routing algorithm / instance selection)
- `docs/router/06-Step.md` (routing algorithm section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | RTG-001 | TODO | Implement full filter chain in DefaultRoutingPlugin | |
| 2 | RTG-002 | TODO | Filter by ServiceName (exact match) | |
| 3 | RTG-003 | TODO | Filter by Version (strict semver equality) | |
| 4 | RTG-004 | TODO | Filter by Health (Healthy or Degraded only) | |
| 5 | RTG-010 | TODO | Implement region tier logic | |
| 6 | RTG-011 | TODO | Tier 0: Same region as gateway | GatewayNodeConfig.Region |
| 7 | RTG-012 | TODO | Tier 1: Configured neighbor regions | NeighborRegions |
| 8 | RTG-013 | TODO | Tier 2: All other regions | Fallback |
| 9 | RTG-020 | TODO | Implement instance scoring within tier | |
| 10 | RTG-021 | TODO | Primary sort: lower AveragePingMs | |
| 11 | RTG-022 | TODO | Secondary sort: more recent LastHeartbeatUtc | |
| 12 | RTG-023 | TODO | Tie-breaker: random or round-robin | Configurable |
| 13 | RTG-030 | TODO | Implement fallback decision order | |
| 14 | RTG-031 | TODO | Fallback 1: Greater ping (latency) | |
| 15 | RTG-032 | TODO | Fallback 2: Greater heartbeat age | |
| 16 | RTG-033 | TODO | Fallback 3: Less preferred region tier | |
| 17 | RTG-040 | TODO | Create RoutingOptions for algorithm tuning | |
| 18 | RTG-041 | TODO | Add default version configuration | Per service |
| 19 | RTG-042 | TODO | Add health status acceptance set | |
| 20 | RTG-050 | TODO | Write unit tests for each filter | |
| 21 | RTG-051 | TODO | Write unit tests for region tier logic | |
| 22 | RTG-052 | TODO | Write unit tests for scoring and tie-breaking | |
| 23 | RTG-053 | TODO | Write integration tests for routing decisions | |
## Routing Algorithm
```
Input:  (ServiceName, Version, Method, Path)
Output: ConnectionState or null

1. Get all connections from IGlobalRoutingState.GetConnectionsFor(...)

2. Filter by ServiceName
   - connections.Where(c => c.Instance.ServiceName == serviceName)

3. Filter by Version (strict semver equality)
   - connections.Where(c => c.Instance.Version == version)
   - If version not specified, use DefaultVersion from config

4. Filter by Health
   - connections.Where(c => c.Status in {Healthy, Degraded})
   - Exclude Unknown, Draining, Unhealthy

5. Group by Region Tier
   - Tier 0: c.Instance.Region == GatewayNodeConfig.Region
   - Tier 1: c.Instance.Region in GatewayNodeConfig.NeighborRegions
   - Tier 2: All others

6. For each tier (0, 1, 2), if any candidates exist:
   a. Sort by AveragePingMs (ascending)
   b. For ties, sort by LastHeartbeatUtc (descending = more recent first)
   c. For remaining ties, apply tie-breaker (random or round-robin)
   d. Return first candidate

7. If no candidates in any tier, return null (503)
```
## Implementation
```csharp
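public class DefaultRoutingPlugin : IRoutingPlugin
{
    private readonly IGlobalRoutingState _routingState;
    private readonly RoutingOptions _options;

    // Constructor injection is assumed here; the options could equally arrive via IOptions<RoutingOptions>
    public DefaultRoutingPlugin(IGlobalRoutingState routingState, RoutingOptions options)
    {
        _routingState = routingState;
        _options = options;
    }

    // No awaits are needed, so the method returns completed tasks instead of being marked async
    public Task<RoutingDecision?> ChooseInstanceAsync(
        RoutingContext context, CancellationToken cancellationToken)
    {
        var endpoint = context.Endpoint;
        var gatewayRegion = context.GatewayRegion;

        // Get all connections matching service, version, method and path
        var connections = _routingState.GetConnectionsFor(
            endpoint.ServiceName,
            endpoint.Version,
            endpoint.Method,
            endpoint.Path);

        // Filter by health using the configured acceptance set (Healthy and Degraded by default)
        var healthy = connections
            .Where(c => _options.AcceptableStatuses.Contains(c.Status))
            .ToList();
        if (healthy.Count == 0)
            return Task.FromResult<RoutingDecision?>(null);

        // Group by region tier; each instance lands in exactly one tier
        var tier0 = healthy.Where(c => c.Instance.Region == gatewayRegion).ToList();
        var tier1 = healthy.Except(tier0)
            .Where(c => _options.NeighborRegions.Contains(c.Instance.Region)).ToList();
        var tier2 = healthy.Except(tier0).Except(tier1).ToList();

        // Select from the best non-empty tier
        var selected = SelectFromTier(tier0)
            ?? SelectFromTier(tier1)
            ?? SelectFromTier(tier2);
        if (selected == null)
            return Task.FromResult<RoutingDecision?>(null);

        return Task.FromResult<RoutingDecision?>(new RoutingDecision
        {
            Endpoint = endpoint,
            Connection = selected,
            TransportType = selected.TransportType,
            EffectiveTimeout = endpoint.DefaultTimeout
        });
    }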
    private int _roundRobinCounter;

    private ConnectionState? SelectFromTier(List<ConnectionState> tier)
    {
        if (tier.Count == 0)
            return null;

        // Lowest ping wins; pings within 0.1 ms of the best count as tied
        var bestPing = tier.Min(c => c.AveragePingMs);
        var pingTied = tier.Where(c => c.AveragePingMs - bestPing < 0.1).ToList();

        // Among ping ties, prefer the most recent heartbeat
        var latestHeartbeat = pingTied.Max(c => c.LastHeartbeatUtc);
        var tied = pingTied.Where(c => c.LastHeartbeatUtc == latestHeartbeat).ToList();
        if (tied.Count == 1)
            return tied[0];

        // Random or round-robin for the remaining ties
        if (_options.TieBreaker == TieBreakerMode.Random)
            return tied[Random.Shared.Next(tied.Count)];

        // Interlocked keeps the counter thread-safe; the uint cast keeps the index
        // non-negative after the int counter overflows
        var ticket = (uint)Interlocked.Increment(ref _roundRobinCounter);
        return tied[(int)(ticket % (uint)tied.Count)];
    }
}
```
## RoutingOptions
```csharp
public sealed class RoutingOptions
{
public Dictionary<string, string> DefaultVersions { get; set; } = new();
public HashSet<InstanceHealthStatus> AcceptableStatuses { get; set; }
= new() { InstanceHealthStatus.Healthy, InstanceHealthStatus.Degraded };
public TieBreakerMode TieBreaker { get; set; } = TieBreakerMode.RoundRobin;
}
public enum TieBreakerMode
{
Random,
RoundRobin
}
```
## Spec Compliance Verification
From specs.md section 4:
> * Region:
> * Prefer instances whose `Region == GatewayNodeConfig.Region`.
> * If none, fall back to configured neighbor regions.
> * If none, fall back to all other regions.
> * Within a chosen region tier:
> * Prefer lower `AveragePingMs`.
> * If several are tied, prefer more recent `LastHeartbeatUtc`.
> * If still tied, use a balancing strategy (e.g. random or round-robin).
Implementation must match exactly.
## Exit Criteria
Before marking this sprint DONE:
1. [ ] Full filter chain implemented (service, version, health)
2. [ ] Region tier logic works (same region → neighbors → others)
3. [ ] Scoring within tier (ping, heartbeat, tie-breaker)
4. [ ] RoutingOptions configurable
5. [ ] All unit tests pass
6. [ ] Integration tests verify routing decisions
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Ping tolerance for "ties": 0.1ms difference considered equal
- Round-robin counter is per-endpoint to avoid hot instances
- DefaultVersion lookup is per-service from configuration
- Degraded instances remain eligible for routing (a later revision may want to prefer Healthy instances first)
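
The per-endpoint round-robin decision above could be sketched roughly as follows. This is only an illustration under assumptions: `RoundRobinCounters`, `NextIndex`, and the string endpoint key are hypothetical names, not part of the planned API, and the sample implementation earlier in this sprint keeps a single counter instead.

```csharp
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical helper: one round-robin counter per endpoint key.
public sealed class RoundRobinCounters
{
    // StrongBox gives each key a mutable int field that Interlocked can update in place.
    private readonly ConcurrentDictionary<string, StrongBox<int>> _counters = new();

    // Returns the next index in [0, candidateCount) for the given endpoint key.
    public int NextIndex(string endpointKey, int candidateCount)
    {
        // Start at -1 so the first increment yields ticket 0.
        var box = _counters.GetOrAdd(endpointKey, _ => new StrongBox<int>(-1));
        // The uint cast keeps the ticket non-negative after the int overflows.
        var ticket = (uint)Interlocked.Increment(ref box.Value);
        return (int)(ticket % (uint)candidateCount);
    }
}
```

With a key such as `"GET /orders"`, successive calls over three tied candidates would cycle through indices 0, 1, 2, independently of traffic on other endpoints.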


@@ -0,0 +1,230 @@
# Sprint 7000-0005-0003 · Protocol Features · Cancellation Semantics
## Topic & Scope
Implement cancellation semantics on both gateway and microservice sides. When HTTP clients disconnect, timeouts occur, or payload limits are breached, CANCEL frames are sent to stop in-flight work.
**Goal:** Clean cancellation propagation from HTTP client through gateway to microservice handlers.
**Working directories:**
- `src/Gateway/StellaOps.Gateway.WebService/` (send CANCEL)
- `src/__Libraries/StellaOps.Microservice/` (receive CANCEL, cancel handler)
- `src/__Libraries/StellaOps.Router.Common/` (CancelPayload)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0002 (routing algorithm complete)
- **Downstream:** SPRINT_7000_0005_0004 (streaming uses cancellation)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** SDK and Gateway both modified.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 7.6, 10 - Cancellation requirements)
- `docs/router/07-Step.md` (cancellation section)
- `docs/router/implplan.md` (phase 7 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | CAN-001 | TODO | Define CancelPayload with Reason code | Common |
| 2 | CAN-002 | TODO | Define cancel reason constants | ClientDisconnected, Timeout, PayloadLimitExceeded, Shutdown |
| 3 | CAN-010 | TODO | Implement CANCEL frame sending in gateway | Gateway |
| 4 | CAN-011 | TODO | Wire HttpContext.RequestAborted to CANCEL | Gateway |
| 5 | CAN-012 | TODO | Implement timeout-triggered CANCEL | Gateway |
| 6 | CAN-013 | TODO | Implement payload-limit-triggered CANCEL | Gateway |
| 7 | CAN-014 | TODO | Implement shutdown-triggered CANCEL for in-flight | Gateway |
| 8 | CAN-020 | TODO | Stop forwarding REQUEST_STREAM_DATA after CANCEL | Gateway |
| 9 | CAN-021 | TODO | Ignore late RESPONSE frames for cancelled requests | Gateway |
| 10 | CAN-022 | TODO | Log cancelled requests with reason | Gateway |
| 11 | CAN-030 | TODO | Implement inflight request tracking in SDK | Microservice |
| 12 | CAN-031 | TODO | Create ConcurrentDictionary<Guid, CancellationTokenSource> | Microservice |
| 13 | CAN-032 | TODO | Add handler task to tracking map | Microservice |
| 14 | CAN-033 | TODO | Implement CANCEL frame processing | Microservice |
| 15 | CAN-034 | TODO | Call cts.Cancel() on CANCEL frame | Microservice |
| 16 | CAN-035 | TODO | Remove from tracking when handler completes | Microservice |
| 17 | CAN-040 | TODO | Implement connection-close cancellation | Microservice |
| 18 | CAN-041 | TODO | Cancel all inflight on connection loss | Microservice |
| 19 | CAN-050 | TODO | Pass CancellationToken to handler interfaces | Microservice |
| 20 | CAN-051 | TODO | Document cancellation best practices for handlers | Docs |
| 21 | CAN-060 | TODO | Write integration tests: client disconnect → handler cancelled | |
| 22 | CAN-061 | TODO | Write integration tests: timeout → handler cancelled | |
| 23 | CAN-062 | TODO | Write tests: late response ignored | |
## CancelPayload
```csharp
public sealed class CancelPayload
{
public string Reason { get; init; } = string.Empty;
}
public static class CancelReasons
{
public const string ClientDisconnected = "ClientDisconnected";
public const string Timeout = "Timeout";
public const string PayloadLimitExceeded = "PayloadLimitExceeded";
public const string Shutdown = "Shutdown";
}
```
## Gateway-Side: Sending CANCEL
### On Client Disconnect
```csharp
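// In TransportDispatchMiddleware
// The callback stays synchronous: an async lambda here would compile to async void,
// so a failure inside SendCancelAsync could go unobserved or crash the process.
using var registration = context.RequestAborted.Register(() => _ = SendCancelSafelyAsync());

async Task SendCancelSafelyAsync()
{
    try
    {
        await transport.SendCancelAsync(
            connection, correlationId, CancelReasons.ClientDisconnected);
    }
    catch (Exception ex)
    {
        _logger.LogWarning(ex, "Failed to send CANCEL for {CorrelationId}", correlationId);
    }
}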
```
### On Timeout
```csharp
using var cts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted);
cts.CancelAfter(decision.EffectiveTimeout);
try
{
var response = await transport.SendRequestAsync(..., cts.Token);
}
catch (OperationCanceledException) when (cts.IsCancellationRequested)
{
if (!context.RequestAborted.IsCancellationRequested)
{
// Timeout, not client disconnect
await transport.SendCancelAsync(connection, correlationId, CancelReasons.Timeout);
context.Response.StatusCode = 504;
return;
}
}
```
### Late Response Handling
```csharp
private readonly ConcurrentDictionary<Guid, bool> _cancelledRequests = new();
public void MarkCancelled(Guid correlationId)
{
_cancelledRequests[correlationId] = true;
}
public bool IsCancelled(Guid correlationId)
{
return _cancelledRequests.ContainsKey(correlationId);
}
// When response arrives
if (IsCancelled(frame.CorrelationId))
{
_logger.LogDebug("Ignoring late response for cancelled {CorrelationId}", frame.CorrelationId);
return; // Discard
}
```
## Microservice-Side: Receiving CANCEL
### Inflight Tracking
```csharp
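internal sealed record InflightRequest(CancellationTokenSource Cts, Task HandlerTask);

internal sealed class InflightRequestTracker
{
    private readonly ConcurrentDictionary<Guid, InflightRequest> _inflight = new();
    private readonly ILogger<InflightRequestTracker> _logger;

    public InflightRequestTracker(ILogger<InflightRequestTracker> logger) => _logger = logger;

    public CancellationToken Track(Guid correlationId, Task handlerTask)
    {
        var cts = new CancellationTokenSource();
        _inflight[correlationId] = new InflightRequest(cts, handlerTask);
        return cts.Token;
    }

    public void Cancel(Guid correlationId, string reason)
    {
        if (_inflight.TryGetValue(correlationId, out var request))
        {
            try
            {
                request.Cts.Cancel();
                _logger.LogInformation("Cancelled {CorrelationId}: {Reason}", correlationId, reason);
            }
            catch (ObjectDisposedException)
            {
                // The handler completed concurrently; nothing is left to cancel
            }
        }
    }

    public void Complete(Guid correlationId)
    {
        if (_inflight.TryRemove(correlationId, out var request))
        {
            request.Cts.Dispose();
        }
    }

    public void CancelAll(string reason)
    {
        // Entries stay tracked so Complete() can still dispose each CTS when its handler exits
        foreach (var kvp in _inflight)
        {
            try { kvp.Value.Cts.Cancel(); }
            catch (ObjectDisposedException) { /* already completed */ }
        }
        _logger.LogInformation("Cancelled all inflight requests: {Reason}", reason);
    }
}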
```
### Connection-Close Handling
```csharp
// When connection closes unexpectedly
_inflightTracker.CancelAll("ConnectionClosed");
```
## Handler Cancellation Guidelines
Handlers MUST:
1. Accept `CancellationToken` parameter
2. Pass token to all async I/O operations
3. Check `token.IsCancellationRequested` in loops
4. Stop work promptly when cancelled
```csharp
public class ProcessDataEndpoint : IStellaEndpoint<DataRequest, DataResponse>
{
public async Task<DataResponse> HandleAsync(DataRequest request, CancellationToken ct)
{
// Pass token to I/O
var data = await _database.QueryAsync(request.Id, ct);
// Check in loops
foreach (var item in data)
{
ct.ThrowIfCancellationRequested();
await ProcessItemAsync(item, ct);
}
return new DataResponse { ... };
}
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] CANCEL frames sent on client disconnect
2. [ ] CANCEL frames sent on timeout
3. [ ] SDK tracks inflight requests with CTS
4. [ ] SDK cancels handlers on CANCEL frame
5. [ ] Connection close cancels all inflight
6. [ ] Late responses are ignored/logged
7. [ ] Integration tests verify cancellation flow
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Cancellation is cooperative; handlers must honor the token
- CTS disposal happens on completion to avoid leaks
- Late response cleanup: entries expire after 60 seconds
- Shutdown CANCEL is best-effort (connections may close first)
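
The 60-second expiry for late-response bookkeeping noted above could be sketched along these lines. `CancelledRequestSet`, `Sweep`, and the injectable clock are hypothetical names introduced only for illustration; the gateway may well track this differently.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: remembers cancelled correlation IDs for a limited time.
public sealed class CancelledRequestSet
{
    private readonly ConcurrentDictionary<Guid, DateTimeOffset> _cancelledAt = new();
    private readonly TimeSpan _retention;
    private readonly Func<DateTimeOffset> _now;

    // The clock is injectable so tests do not need to wait for real time to pass.
    public CancelledRequestSet(TimeSpan retention, Func<DateTimeOffset>? now = null)
    {
        _retention = retention;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    public void MarkCancelled(Guid correlationId) => _cancelledAt[correlationId] = _now();

    // True while the entry is still inside the retention window.
    public bool IsCancelled(Guid correlationId) =>
        _cancelledAt.TryGetValue(correlationId, out var at) && _now() - at < _retention;

    // Call periodically (for example from a timer) to drop expired entries.
    public int Sweep()
    {
        var removed = 0;
        var cutoff = _now() - _retention;
        foreach (var entry in _cancelledAt)
        {
            if (entry.Value <= cutoff && _cancelledAt.TryRemove(entry.Key, out _))
                removed++;
        }
        return removed;
    }
}
```

Constructed with `TimeSpan.FromSeconds(60)`, this keeps the dictionary from growing without bound while still letting the gateway discard responses that arrive shortly after a CANCEL.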


@@ -0,0 +1,215 @@
# Sprint 7000-0005-0004 · Protocol Features · Streaming Support
## Topic & Scope
Implement streaming request/response support. Large payloads stream through the gateway as `REQUEST_STREAM_DATA` and `RESPONSE_STREAM_DATA` frames rather than being fully buffered.
**Goal:** Enable large file uploads/downloads without memory exhaustion at gateway.
**Working directories:**
- `src/Gateway/StellaOps.Gateway.WebService/` (streaming dispatch)
- `src/__Libraries/StellaOps.Microservice/` (streaming handlers)
- `src/__Libraries/StellaOps.Router.Transport.InMemory/` (streaming frames)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0003 (cancellation - streaming needs cancel support)
- **Downstream:** SPRINT_7000_0005_0005 (payload limits)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** SDK, Gateway, InMemory transport all modified.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 5.4, 6.3, 7.5 - Streaming requirements)
- `docs/router/08-Step.md` (streaming section)
- `docs/router/implplan.md` (phase 8 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | STR-001 | TODO | Add SupportsStreaming flag to EndpointDescriptor | Common |
| 2 | STR-002 | TODO | Add streaming attribute support to [StellaEndpoint] | Common |
| 3 | STR-010 | TODO | Implement REQUEST_STREAM_DATA frame handling in transport | InMemory |
| 4 | STR-011 | TODO | Implement RESPONSE_STREAM_DATA frame handling in transport | InMemory |
| 5 | STR-012 | TODO | Implement end-of-stream signaling | InMemory |
| 6 | STR-020 | TODO | Implement streaming request dispatch in gateway | Gateway |
| 7 | STR-021 | TODO | Pipe HTTP body stream → REQUEST_STREAM_DATA frames | Gateway |
| 8 | STR-022 | TODO | Implement chunking for stream data | Configurable chunk size |
| 9 | STR-023 | TODO | Honor cancellation during streaming | Gateway |
| 10 | STR-030 | TODO | Implement streaming response handling in gateway | Gateway |
| 11 | STR-031 | TODO | Pipe RESPONSE_STREAM_DATA frames → HTTP response | Gateway |
| 12 | STR-032 | TODO | Set chunked transfer encoding | Gateway |
| 13 | STR-040 | TODO | Implement streaming body in RawRequestContext | Microservice |
| 14 | STR-041 | TODO | Expose Body as async-readable stream | Microservice |
| 15 | STR-042 | TODO | Implement backpressure (slow consumer) | Microservice |
| 16 | STR-050 | TODO | Implement streaming response writing | Microservice |
| 17 | STR-051 | TODO | Expose WriteBodyAsync for streaming output | Microservice |
| 18 | STR-052 | TODO | Chunk output into RESPONSE_STREAM_DATA frames | Microservice |
| 19 | STR-060 | TODO | Implement IRawStellaEndpoint streaming pattern | Microservice |
| 20 | STR-061 | TODO | Document streaming handler guidelines | Docs |
| 21 | STR-070 | TODO | Write integration tests for upload streaming | |
| 22 | STR-071 | TODO | Write integration tests for download streaming | |
| 23 | STR-072 | TODO | Write tests for cancellation during streaming | |
## Streaming Frame Protocol
### Request Streaming
```
Gateway → Microservice:
1. REQUEST frame (headers, method, path, CorrelationId)
2. REQUEST_STREAM_DATA frame (chunk 1)
3. REQUEST_STREAM_DATA frame (chunk 2)
...
N. REQUEST_STREAM_DATA frame (final chunk, EndOfStream=true)
```
### Response Streaming
```
Microservice → Gateway:
1. RESPONSE frame (status code, headers, CorrelationId)
2. RESPONSE_STREAM_DATA frame (chunk 1)
3. RESPONSE_STREAM_DATA frame (chunk 2)
...
N. RESPONSE_STREAM_DATA frame (final chunk, EndOfStream=true)
```
## StreamDataPayload
```csharp
public sealed class StreamDataPayload
{
public Guid CorrelationId { get; init; }
public byte[] Data { get; init; } = Array.Empty<byte>();
public bool EndOfStream { get; init; }
public int SequenceNumber { get; init; }
}
```
## Gateway Streaming Dispatch
```csharp
// In TransportDispatchMiddleware
if (endpoint.SupportsStreaming)
{
await DispatchStreamingAsync(context, transport, decision, cancellationToken);
}
else
{
await DispatchBufferedAsync(context, transport, decision, cancellationToken);
}
private async Task DispatchStreamingAsync(...)
{
// Send REQUEST header
var requestFrame = BuildRequestHeaderFrame(context);
await transport.SendFrameAsync(connection, requestFrame, ct);
// Stream body chunks
var buffer = new byte[_options.StreamChunkSize];
int bytesRead;
int sequence = 0;
while ((bytesRead = await context.Request.Body.ReadAsync(buffer, ct)) > 0)
{
var streamFrame = new Frame
{
Type = FrameType.RequestStreamData,
CorrelationId = requestFrame.CorrelationId,
Payload = SerializeStreamData(buffer[..bytesRead], sequence++, endOfStream: false)
};
await transport.SendFrameAsync(connection, streamFrame, ct);
}
// Send end-of-stream
var endFrame = new Frame
{
Type = FrameType.RequestStreamData,
CorrelationId = requestFrame.CorrelationId,
Payload = SerializeStreamData(Array.Empty<byte>(), sequence, endOfStream: true)
};
await transport.SendFrameAsync(connection, endFrame, ct);
// Receive response (streaming or buffered)
await ReceiveResponseAsync(context, transport, connection, requestFrame.CorrelationId, ct);
}
```
## Microservice Streaming Handler
```csharp
[StellaEndpoint("POST", "/files/upload", SupportsStreaming = true)]
public class FileUploadEndpoint : IRawStellaEndpoint
{
public async Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken ct)
{
// Body is a stream that reads from REQUEST_STREAM_DATA frames
var tempPath = Path.GetTempFileName();
await using var fileStream = File.Create(tempPath);
await context.Body.CopyToAsync(fileStream, ct);
return RawResponse.Ok($"Uploaded {fileStream.Length} bytes");
}
}
[StellaEndpoint("GET", "/files/{id}/download", SupportsStreaming = true)]
public class FileDownloadEndpoint : IRawStellaEndpoint
{
public async Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken ct)
{
var fileId = context.PathParameters["id"];
var filePath = _storage.GetPath(fileId);
// Return streaming response
return new RawResponse
{
StatusCode = 200,
Body = File.OpenRead(filePath), // Stream, not buffered
Headers = new HeaderCollection
{
["Content-Type"] = "application/octet-stream"
}
};
}
}
```
## StreamingOptions
```csharp
public sealed class StreamingOptions
{
public int ChunkSize { get; set; } = 64 * 1024; // 64KB default
public int MaxConcurrentStreams { get; set; } = 100;
public TimeSpan StreamIdleTimeout { get; set; } = TimeSpan.FromMinutes(5);
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] REQUEST_STREAM_DATA frames implemented in transport
2. [ ] RESPONSE_STREAM_DATA frames implemented in transport
3. [ ] Gateway streams request body to microservice
4. [ ] Gateway streams response body to HTTP client
5. [ ] SDK exposes streaming Body in RawRequestContext
6. [ ] SDK can write streaming response
7. [ ] Cancellation works during streaming
8. [ ] Integration tests for upload and download streaming
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Default chunk size: 64KB (tunable)
- End-of-stream is explicit frame, not connection close
- Backpressure via channel capacity (bounded channels)
- Idle timeout cancels stuck streams
- Typed handlers don't support streaming (use IRawStellaEndpoint)
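
Backpressure through bounded channels, as listed above, might look roughly like the following sketch. `StreamChunkBuffer` and its members are hypothetical names; only `System.Threading.Channels` itself is a real API. The idea is that the transport's frame reader waits once the handler falls `capacity` chunks behind.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical per-request buffer between the frame reader and the handler's Body stream.
public sealed class StreamChunkBuffer
{
    private readonly Channel<byte[]> _channel;

    public StreamChunkBuffer(int capacity)
    {
        _channel = Channel.CreateBounded<byte[]>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.Wait, // writer waits when the handler falls behind
            SingleReader = true,
            SingleWriter = true
        });
    }

    // Called by the transport for each REQUEST_STREAM_DATA frame.
    public async ValueTask WriteChunkAsync(byte[] data, bool endOfStream, CancellationToken ct)
    {
        if (data.Length > 0)
            await _channel.Writer.WriteAsync(data, ct);
        if (endOfStream)
            _channel.Writer.TryComplete();
    }

    // Non-blocking variant; returns false when the buffer is full.
    public bool TryWriteChunk(byte[] data) => _channel.Writer.TryWrite(data);

    // Called by the handler side; returns null once the stream has ended.
    public async ValueTask<byte[]?> ReadChunkAsync(CancellationToken ct)
    {
        while (await _channel.Reader.WaitToReadAsync(ct))
        {
            if (_channel.Reader.TryRead(out var chunk))
                return chunk;
        }
        return null;
    }
}
```

Passing the request's cancellation token to `WriteChunkAsync` means a CANCEL frame or idle timeout also releases a writer that is waiting on a slow consumer.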


@@ -0,0 +1,231 @@
# Sprint 7000-0005-0005 · Protocol Features · Payload Limits
## Topic & Scope
Implement payload size limits to protect the gateway from memory exhaustion. Enforce limits per-request, per-connection, and aggregate across all connections.
**Goal:** Gateway rejects oversized payloads early and cancels streams that exceed limits mid-flight.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0004 (streaming - limits apply to streams)
- **Downstream:** SPRINT_7000_0006_* (real transports)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Gateway only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 6.5 - Payload and memory protection)
- `docs/router/08-Step.md` (payload limits section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | LIM-001 | TODO | Implement PayloadLimitsMiddleware | Before dispatch |
| 2 | LIM-002 | TODO | Check Content-Length header against MaxRequestBytesPerCall | |
| 3 | LIM-003 | TODO | Return 413 for oversized Content-Length | Early rejection |
| 4 | LIM-010 | TODO | Implement per-request byte counter | |
| 5 | LIM-011 | TODO | Track bytes read during streaming | |
| 6 | LIM-012 | TODO | Abort when MaxRequestBytesPerCall exceeded mid-stream | |
| 7 | LIM-013 | TODO | Send CANCEL frame on limit breach | |
| 8 | LIM-020 | TODO | Implement per-connection byte counter | |
| 9 | LIM-021 | TODO | Track total inflight bytes per connection | |
| 10 | LIM-022 | TODO | Throttle/reject when MaxRequestBytesPerConnection exceeded | |
| 11 | LIM-030 | TODO | Implement aggregate byte counter | |
| 12 | LIM-031 | TODO | Track total inflight bytes across all connections | |
| 13 | LIM-032 | TODO | Throttle/reject when MaxAggregateInflightBytes exceeded | |
| 14 | LIM-033 | TODO | Return 503 for aggregate limit | Service overloaded |
| 15 | LIM-040 | TODO | Implement ByteCountingStream wrapper | Counts bytes as they flow |
| 16 | LIM-041 | TODO | Wire counting stream into dispatch | |
| 17 | LIM-050 | TODO | Create PayloadLimitOptions | All three limits |
| 18 | LIM-051 | TODO | Bind PayloadLimitOptions from configuration | |
| 19 | LIM-060 | TODO | Log limit breaches with request details | |
| 20 | LIM-061 | TODO | Add metrics for payload tracking | Prometheus/OpenTelemetry |
| 21 | LIM-070 | TODO | Write tests for early rejection (Content-Length) | |
| 22 | LIM-071 | TODO | Write tests for mid-stream cancellation | |
| 23 | LIM-072 | TODO | Write tests for connection limit | |
| 24 | LIM-073 | TODO | Write tests for aggregate limit | |
## PayloadLimits
```csharp
public sealed class PayloadLimits
{
public long MaxRequestBytesPerCall { get; set; } = 10 * 1024 * 1024; // 10 MB
public long MaxRequestBytesPerConnection { get; set; } = 100 * 1024 * 1024; // 100 MB
public long MaxAggregateInflightBytes { get; set; } = 1024 * 1024 * 1024; // 1 GB
}
```
## PayloadLimitsMiddleware
```csharp
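public class PayloadLimitsMiddleware
{
    private readonly RequestDelegate _next;
    private readonly PayloadLimits _limits;
    private readonly ILogger<PayloadLimitsMiddleware> _logger;

    // PayloadLimits is assumed to be registered in DI; IOptions<PayloadLimits> would work as well
    public PayloadLimitsMiddleware(
        RequestDelegate next, PayloadLimits limits, ILogger<PayloadLimitsMiddleware> logger)
    {
        _next = next;
        _limits = limits;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context, IPayloadTracker tracker)
    {
        // Early rejection for known Content-Length
        if (context.Request.ContentLength.HasValue)
        {
            if (context.Request.ContentLength > _limits.MaxRequestBytesPerCall)
            {
                _logger.LogWarning("Request rejected: Content-Length {Length} exceeds limit {Limit}",
                    context.Request.ContentLength, _limits.MaxRequestBytesPerCall);
                context.Response.StatusCode = 413; // Payload Too Large
                await context.Response.WriteAsJsonAsync(new
                {
                    error = "Payload Too Large",
                    maxBytes = _limits.MaxRequestBytesPerCall
                });
                return;
            }
        }

        // Check aggregate capacity; requests without Content-Length reserve 0 bytes here
        var reserved = context.Request.ContentLength ?? 0;
        if (!tracker.TryReserve(reserved))
        {
            context.Response.StatusCode = 503; // Service Unavailable
            await context.Response.WriteAsJsonAsync(new
            {
                error = "Service Overloaded",
                message = "Too many concurrent requests"
            });
            return;
        }

        try
        {
            await _next(context);
        }
        finally
        {
            // Release exactly what was reserved so the aggregate counter cannot drift
            tracker.Release(reserved);
        }
    }
}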
```
## IPayloadTracker
```csharp
public interface IPayloadTracker
{
bool TryReserve(long estimatedBytes);
void Release(long actualBytes);
long CurrentInflightBytes { get; }
bool IsOverloaded { get; }
}
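internal sealed class PayloadTracker : IPayloadTracker
{
    private readonly PayloadLimits _limits;
    private long _totalInflightBytes;
    private readonly ConcurrentDictionary<string, long> _perConnectionBytes = new();

    public PayloadTracker(PayloadLimits limits) => _limits = limits;

    // Interface members that the original sketch left unimplemented
    public long CurrentInflightBytes => Interlocked.Read(ref _totalInflightBytes);
    public bool IsOverloaded => CurrentInflightBytes >= _limits.MaxAggregateInflightBytes;

    public bool TryReserve(long estimatedBytes)
    {
        var newTotal = Interlocked.Add(ref _totalInflightBytes, estimatedBytes);
        if (newTotal > _limits.MaxAggregateInflightBytes)
        {
            // Roll back the optimistic reservation
            Interlocked.Add(ref _totalInflightBytes, -estimatedBytes);
            return false;
        }
        return true;
    }

    public void Release(long actualBytes)
    {
        Interlocked.Add(ref _totalInflightBytes, -actualBytes);
    }
}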
```
## ByteCountingStream
```csharp
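internal sealed class ByteCountingStream : Stream
{
    private readonly Stream _inner;
    private readonly long _limit;
    private readonly Action _onLimitExceeded;
    private long _bytesRead;

    public ByteCountingStream(Stream inner, long limit, Action onLimitExceeded)
    {
        _inner = inner;
        _limit = limit;
        _onLimitExceeded = onLimitExceeded;
    }

    public long BytesRead => _bytesRead;

    public override async ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken ct = default)
    {
        var read = await _inner.ReadAsync(buffer, ct);
        _bytesRead += read;
        if (_bytesRead > _limit)
        {
            _onLimitExceeded();
            throw new PayloadLimitExceededException(_bytesRead, _limit);
        }
        return read;
    }

    // Remaining abstract Stream members: the wrapper is read-only, forward-only, async-only
    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position { get => _bytesRead; set => throw new NotSupportedException(); }
    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}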
```
## Mid-Stream Limit Breach Flow
```
1. Streaming request begins
2. Gateway counts bytes as they flow through ByteCountingStream
3. When _bytesRead > MaxRequestBytesPerCall:
a. Stop reading from HTTP body
b. Send CANCEL frame with reason "PayloadLimitExceeded"
c. Return 413 to client
d. Log the incident with request details
```
## Configuration
```json
{
"PayloadLimits": {
"MaxRequestBytesPerCall": 10485760,
"MaxRequestBytesPerConnection": 104857600,
"MaxAggregateInflightBytes": 1073741824
}
}
```
## Error Responses
| Condition | HTTP Status | Error Message |
|-----------|-------------|---------------|
| Content-Length exceeds per-call limit | 413 | Payload Too Large |
| Streaming exceeds per-call limit | 413 | Payload Too Large |
| Per-connection limit exceeded | 429 | Too Many Requests |
| Aggregate limit exceeded | 503 | Service Overloaded |
## Exit Criteria
Before marking this sprint DONE:
1. [ ] Early rejection for known oversized Content-Length
2. [ ] Mid-stream cancellation when limit exceeded
3. [ ] CANCEL frame sent on limit breach
4. [ ] Per-connection tracking works
5. [ ] Aggregate tracking works
6. [ ] All limit scenarios tested
7. [ ] Metrics/logging in place
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Default limits are conservative; tune for your environment
- Per-connection limit applies to inflight bytes, not lifetime total
- Aggregate limit prevents memory exhaustion but may cause 503s under load
- ByteCountingStream adds minimal overhead
- Limit breach is logged at Warning level
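
Per-connection tracking of inflight bytes (LIM-020 to LIM-022) is not shown in the snippets above. A minimal sketch, assuming a string connection ID and a hypothetical `ConnectionByteTracker` class, could use a compare-and-swap loop on a concurrent dictionary:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical per-connection counter; the limit mirrors MaxRequestBytesPerConnection.
public sealed class ConnectionByteTracker
{
    private readonly ConcurrentDictionary<string, long> _inflight = new();
    private readonly long _maxBytesPerConnection;

    public ConnectionByteTracker(long maxBytesPerConnection) =>
        _maxBytesPerConnection = maxBytesPerConnection;

    // Reserves bytes for one connection, or returns false if the limit would be exceeded.
    public bool TryReserve(string connectionId, long bytes)
    {
        while (true)
        {
            var current = _inflight.GetOrAdd(connectionId, 0L);
            var updated = current + bytes;
            if (updated > _maxBytesPerConnection)
                return false;
            // Compare-and-swap: succeeds only if no other request changed the counter meanwhile.
            if (_inflight.TryUpdate(connectionId, updated, current))
                return true;
        }
    }

    // Returns bytes to the connection's budget; the counter never drops below zero.
    public void Release(string connectionId, long bytes) =>
        _inflight.AddOrUpdate(connectionId, 0L, (_, current) => Math.Max(0L, current - bytes));

    public long InflightBytes(string connectionId) =>
        _inflight.TryGetValue(connectionId, out var value) ? value : 0L;
}
```

A failed `TryReserve` would map to the 429 row in the error table above. Because the counter covers inflight bytes only, `Release` must be called when each request finishes.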


@@ -0,0 +1,231 @@
# Sprint 7000-0006-0001 · Real Transports · TCP Plugin
## Topic & Scope
Implement the TCP transport plugin. This is the primary production transport with length-prefixed framing for reliable frame delivery.
**Goal:** Replace InMemory transport with production-grade TCP transport.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.Tcp/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0005 (all protocol features proven with InMemory)
- **Downstream:** SPRINT_7000_0006_0002 (TLS wraps TCP)
- **Parallel work:** None initially; UDP and RabbitMQ can start after TCP basics work
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - Transport plugin requirements)
- `docs/router/09-Step.md` (TCP transport section)
- `docs/router/implplan.md` (phase 9 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | TCP-001 | TODO | Create `StellaOps.Router.Transport.Tcp` classlib project | Add to solution |
| 2 | TCP-002 | TODO | Add project reference to Router.Common | |
| 3 | TCP-010 | TODO | Implement `TcpTransportServer` : `ITransportServer` | Gateway side |
| 4 | TCP-011 | TODO | Implement TCP listener with configurable bind address/port | |
| 5 | TCP-012 | TODO | Implement connection accept loop | One connection per microservice |
| 6 | TCP-013 | TODO | Implement connection ID generation | Based on endpoint |
| 7 | TCP-020 | TODO | Implement `TcpTransportClient` : `ITransportClient` | Microservice side |
| 8 | TCP-021 | TODO | Implement connection establishment | With retry |
| 9 | TCP-022 | TODO | Implement reconnection on failure | Exponential backoff |
| 10 | TCP-030 | TODO | Implement length-prefixed framing protocol | |
| 11 | TCP-031 | TODO | Frame format: [4-byte length][payload] | Big-endian length |
| 12 | TCP-032 | TODO | Implement frame reader (async, streaming) | |
| 13 | TCP-033 | TODO | Implement frame writer (async, thread-safe) | |
| 14 | TCP-040 | TODO | Implement frame multiplexing | Multiple correlations on one socket |
| 15 | TCP-041 | TODO | Route responses by CorrelationId | |
| 16 | TCP-042 | TODO | Handle out-of-order responses | |
| 17 | TCP-050 | TODO | Implement keep-alive/ping at TCP level | |
| 18 | TCP-051 | TODO | Detect dead connections | |
| 19 | TCP-052 | TODO | Clean up on connection loss | |
| 20 | TCP-060 | TODO | Create TcpTransportOptions | BindAddress, Port, BufferSize |
| 21 | TCP-061 | TODO | Create DI registration `AddTcpTransport()` | |
| 22 | TCP-070 | TODO | Write integration tests with real sockets | |
| 23 | TCP-071 | TODO | Write tests for reconnection | |
| 24 | TCP-072 | TODO | Write tests for multiplexing | |
| 25 | TCP-073 | TODO | Write load tests | Concurrent requests |
## Frame Format
```
┌─────────────────────────────────────────────────────────────┐
│ 4 bytes (big-endian) │ N bytes (payload) │
│ Payload Length │ [FrameType][CorrelationId][Data] │
└─────────────────────────────────────────────────────────────┘
```
### Payload Structure
```
Byte 0: FrameType (1 byte enum value)
Bytes 1-16: CorrelationId (16 bytes GUID)
Bytes 17+: Frame-specific data
```
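
The layout above can be exercised with a small codec sketch. `FrameCodec` is a hypothetical name, the frame type is treated as a raw byte, and the GUID byte order is an assumption (it uses .NET's default `Guid` layout, which both peers would need to share).

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical codec for the wire layout above; FrameType is represented as a raw byte.
public static class FrameCodec
{
    private const int HeaderSize = 1 + 16; // FrameType + CorrelationId

    // Returns [4-byte big-endian length][FrameType][CorrelationId][data].
    public static byte[] Encode(byte frameType, Guid correlationId, ReadOnlySpan<byte> data)
    {
        var payloadLength = HeaderSize + data.Length;
        var buffer = new byte[4 + payloadLength];
        BinaryPrimitives.WriteInt32BigEndian(buffer, payloadLength);
        buffer[4] = frameType;
        // Writes the GUID in .NET's default byte layout.
        correlationId.TryWriteBytes(buffer.AsSpan(5, 16));
        data.CopyTo(buffer.AsSpan(4 + HeaderSize));
        return buffer;
    }

    // Parses one payload (the bytes that follow the length prefix).
    public static (byte FrameType, Guid CorrelationId, byte[] Data) Decode(ReadOnlySpan<byte> payload)
    {
        if (payload.Length < HeaderSize)
            throw new ArgumentException("Payload shorter than frame header", nameof(payload));
        var frameType = payload[0];
        var correlationId = new Guid(payload.Slice(1, 16));
        var data = payload.Slice(HeaderSize).ToArray();
        return (frameType, correlationId, data);
    }
}
```

The smallest valid payload is therefore 17 bytes (a frame with no data), which gives a reader a cheap lower bound to check before allocating.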
## TcpTransportServer
```csharp
public sealed class TcpTransportServer : ITransportServer, IAsyncDisposable
{
private TcpListener? _listener;
private readonly ConcurrentDictionary<string, TcpConnection> _connections = new();
public async Task StartAsync(CancellationToken ct)
{
_listener = new TcpListener(_options.BindAddress, _options.Port);
_listener.Start();
_ = AcceptLoopAsync(ct);
}
private async Task AcceptLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var client = await _listener!.AcceptTcpClientAsync(ct);
var connectionId = GenerateConnectionId(client);
var connection = new TcpConnection(connectionId, client, this);
_connections[connectionId] = connection;
OnConnection?.Invoke(connectionId);
_ = connection.ReadLoopAsync(ct);
}
}
public async Task SendFrameAsync(string connectionId, Frame frame)
{
if (_connections.TryGetValue(connectionId, out var conn))
{
await conn.WriteFrameAsync(frame);
}
}
}
```
## TcpConnection (internal)
```csharp
internal sealed class TcpConnection : IAsyncDisposable
{
private readonly TcpClient _client;
private readonly NetworkStream _stream;
private readonly SemaphoreSlim _writeLock = new(1, 1);
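    public async Task ReadLoopAsync(CancellationToken ct)
    {
        // 16 MB cap from Decisions & Risks; a configurable option would replace this constant
        const int MaxFrameSize = 16 * 1024 * 1024;
        // 1-byte FrameType + 16-byte CorrelationId
        const int MinFrameSize = 17;

        var lengthBuffer = new byte[4];
        while (!ct.IsCancellationRequested)
        {
            // Read length prefix
            await ReadExactAsync(_stream, lengthBuffer, ct);
            var length = BinaryPrimitives.ReadInt32BigEndian(lengthBuffer);

            // Reject corrupt or hostile prefixes before allocating the payload buffer
            if (length < MinFrameSize || length > MaxFrameSize)
                throw new InvalidDataException($"Invalid frame length {length}");

            // Read payload
            var payload = new byte[length];
            await ReadExactAsync(_stream, payload, ct);

            // Parse frame
            var frame = ParseFrame(payload);
            _server.OnFrame?.Invoke(_connectionId, frame);
        }
    }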
public async Task WriteFrameAsync(Frame frame)
{
var payload = SerializeFrame(frame);
var lengthBytes = new byte[4];
BinaryPrimitives.WriteInt32BigEndian(lengthBytes, payload.Length);
await _writeLock.WaitAsync();
try
{
await _stream.WriteAsync(lengthBytes);
await _stream.WriteAsync(payload);
}
finally
{
_writeLock.Release();
}
}
}
```
## TcpTransportOptions
```csharp
public sealed class TcpTransportOptions
{
public IPAddress BindAddress { get; set; } = IPAddress.Any;
public int Port { get; set; } = 5100;
public int ReceiveBufferSize { get; set; } = 64 * 1024;
public int SendBufferSize { get; set; } = 64 * 1024;
public TimeSpan KeepAliveInterval { get; set; } = TimeSpan.FromSeconds(30);
public TimeSpan ConnectTimeout { get; set; } = TimeSpan.FromSeconds(10);
public int MaxReconnectAttempts { get; set; } = 10;
public TimeSpan MaxReconnectBackoff { get; set; } = TimeSpan.FromMinutes(1);
}
```
## Multiplexing
One TCP connection carries multiple concurrent requests:
- Each request has unique CorrelationId
- Responses can arrive in any order
- `ConcurrentDictionary<Guid, TaskCompletionSource<Frame>>` for pending requests
```csharp
internal sealed class PendingRequestTracker
{
private readonly ConcurrentDictionary<Guid, TaskCompletionSource<Frame>> _pending = new();
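    public Task<Frame> TrackRequest(Guid correlationId, CancellationToken ct)
    {
        var tcs = new TaskCompletionSource<Frame>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[correlationId] = tcs;
        // Register after adding, so an already-cancelled token still removes the entry
        var registration = ct.Register(() =>
        {
            if (_pending.TryRemove(correlationId, out var pending))
                pending.TrySetCanceled(ct);
        });
        // Drop the registration once the request finishes either way
        _ = tcs.Task.ContinueWith(_ => registration.Dispose(), TaskScheduler.Default);
        return tcs.Task;
    }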
public void CompleteRequest(Guid correlationId, Frame response)
{
if (_pending.TryRemove(correlationId, out var tcs))
{
tcs.TrySetResult(response);
}
}
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] TcpTransportServer accepts connections and reads frames
2. [ ] TcpTransportClient connects and sends frames
3. [ ] Length-prefixed framing works correctly
4. [ ] Multiplexing routes responses to correct callers
5. [ ] Reconnection with backoff works
6. [ ] Keep-alive detects dead connections
7. [ ] Integration tests pass
8. [ ] Load tests demonstrate concurrent request handling
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Big-endian length prefix for network byte order
- Maximum frame size: 16 MB (configurable)
- One socket per microservice instance (not per request)
- Write lock prevents interleaved frames
- No compression at transport level (consider adding later)
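The framing decisions above (4-byte big-endian length prefix, 16 MB ceiling) can be sketched as a small codec. This is an illustration under stated assumptions, not the planned implementation: `LengthPrefixedFraming` is a hypothetical name, the prefix is assumed to count payload bytes only, and the sketch is synchronous for brevity where the real transport is asynchronous.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical codec illustrating length-prefixed framing.
// Assumption: the 4-byte prefix holds the payload length, excluding itself.
public static class LengthPrefixedFraming
{
    public const int DefaultMaxFrameSize = 16 * 1024 * 1024; // 16 MB

    public static void WriteFrame(Stream stream, ReadOnlySpan<byte> payload,
        int maxFrameSize = DefaultMaxFrameSize)
    {
        if (payload.Length > maxFrameSize)
            throw new InvalidDataException(
                $"Frame of {payload.Length} bytes exceeds limit {maxFrameSize}.");

        Span<byte> prefix = stackalloc byte[4];
        BinaryPrimitives.WriteInt32BigEndian(prefix, payload.Length); // network byte order
        stream.Write(prefix);
        stream.Write(payload);
    }

    public static byte[] ReadFrame(Stream stream, int maxFrameSize = DefaultMaxFrameSize)
    {
        Span<byte> prefix = stackalloc byte[4];
        ReadFully(stream, prefix);
        var length = BinaryPrimitives.ReadInt32BigEndian(prefix);

        // Validate before allocating so a corrupt or hostile prefix
        // cannot force a huge buffer.
        if (length < 0 || length > maxFrameSize)
            throw new InvalidDataException($"Invalid frame length {length}.");

        var payload = new byte[length];
        ReadFully(stream, payload);
        return payload;
    }

    // A single Read may return fewer bytes than requested, so loop until full.
    private static void ReadFully(Stream stream, Span<byte> buffer)
    {
        while (buffer.Length > 0)
        {
            var read = stream.Read(buffer);
            if (read == 0) throw new EndOfStreamException("Connection closed mid-frame.");
            buffer = buffer.Slice(read);
        }
    }
}
```

Checking the length before allocating is what makes the configurable maximum frame size an effective guard rather than a convention.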


@@ -0,0 +1,227 @@
# Sprint 7000-0006-0002 · Real Transports · TLS/mTLS Plugin
## Topic & Scope
Implement the TLS transport plugin (Certificate transport). Wraps TCP with TLS encryption and supports optional mutual TLS (mTLS) for verifiable peer identity.
**Goal:** Secure transport with certificate-based authentication.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.Tls/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_0001 (TCP transport - this wraps it)
- **Downstream:** None. Parallel with UDP and RabbitMQ.
- **Parallel work:** Can run in parallel with UDP and RabbitMQ sprints.
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - Certificate transport requirements)
- `docs/router/09-Step.md` (TLS transport section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | TLS-001 | TODO | Create `StellaOps.Router.Transport.Tls` classlib project | Add to solution |
| 2 | TLS-002 | TODO | Add project reference to Router.Common and Transport.Tcp | Wraps TCP |
| 3 | TLS-010 | TODO | Implement `TlsTransportServer` : `ITransportServer` | Gateway side |
| 4 | TLS-011 | TODO | Wrap TcpListener with SslStream | |
| 5 | TLS-012 | TODO | Configure server certificate | |
| 6 | TLS-013 | TODO | Implement optional client certificate validation (mTLS) | |
| 7 | TLS-020 | TODO | Implement `TlsTransportClient` : `ITransportClient` | Microservice side |
| 8 | TLS-021 | TODO | Wrap TcpClient with SslStream | |
| 9 | TLS-022 | TODO | Implement server certificate validation | |
| 10 | TLS-023 | TODO | Implement client certificate presentation (mTLS) | |
| 11 | TLS-030 | TODO | Create TlsTransportOptions | Certificates, validation mode |
| 12 | TLS-031 | TODO | Support PEM file paths | |
| 13 | TLS-032 | TODO | Support PFX file paths with password | |
| 14 | TLS-033 | TODO | Support X509Certificate2 objects | For programmatic use |
| 15 | TLS-040 | TODO | Implement certificate chain validation | |
| 16 | TLS-041 | TODO | Implement certificate revocation checking (optional) | |
| 17 | TLS-042 | TODO | Implement hostname verification | |
| 18 | TLS-050 | TODO | Create DI registration `AddTlsTransport()` | |
| 19 | TLS-051 | TODO | Support certificate hot-reload | For rotation |
| 20 | TLS-060 | TODO | Write integration tests with self-signed certs | |
| 21 | TLS-061 | TODO | Write tests for mTLS | |
| 22 | TLS-062 | TODO | Write tests for cert validation failures | |
## TlsTransportOptions
```csharp
public sealed class TlsTransportOptions
{
// Server-side (Gateway)
public X509Certificate2? ServerCertificate { get; set; }
public string? ServerCertificatePath { get; set; } // PEM or PFX
public string? ServerCertificateKeyPath { get; set; } // PEM private key
public string? ServerCertificatePassword { get; set; } // For PFX
// Client-side (Microservice)
public X509Certificate2? ClientCertificate { get; set; }
public string? ClientCertificatePath { get; set; }
public string? ClientCertificateKeyPath { get; set; }
public string? ClientCertificatePassword { get; set; }
// Validation
public bool RequireClientCertificate { get; set; } = false; // mTLS
public bool AllowSelfSigned { get; set; } = false; // Dev only
public bool CheckCertificateRevocation { get; set; } = false;
public string? ExpectedServerHostname { get; set; } // Used for SNI and hostname verification
// Protocol
public SslProtocols EnabledProtocols { get; set; } = SslProtocols.Tls12 | SslProtocols.Tls13;
}
```
## Server Implementation
```csharp
public sealed class TlsTransportServer : ITransportServer
{
public async Task StartAsync(CancellationToken ct)
{
_listener = new TcpListener(_tcpOptions.BindAddress, _tcpOptions.Port);
_listener.Start();
_ = AcceptLoopAsync(ct);
}
private async Task AcceptLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var tcpClient = await _listener!.AcceptTcpClientAsync(ct);
var sslStream = new SslStream(
tcpClient.GetStream(),
leaveInnerStreamOpen: false,
userCertificateValidationCallback: ValidateClientCertificate);
try
{
await sslStream.AuthenticateAsServerAsync(new SslServerAuthenticationOptions
{
ServerCertificate = _options.ServerCertificate,
ClientCertificateRequired = _options.RequireClientCertificate,
EnabledSslProtocols = _options.EnabledProtocols,
CertificateRevocationCheckMode = _options.CheckCertificateRevocation
? X509RevocationMode.Online
: X509RevocationMode.NoCheck
}, ct);
// Connection authenticated, continue with frame reading
var connectionId = GenerateConnectionId(tcpClient, sslStream.RemoteCertificate);
var connection = new TlsConnection(connectionId, tcpClient, sslStream, this);
_connections[connectionId] = connection;
OnConnection?.Invoke(connectionId);
_ = connection.ReadLoopAsync(ct);
}
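catch (Exception ex) when (ex is AuthenticationException or IOException)
{
// A failed or aborted handshake must not terminate the accept loop
_logger.LogWarning(ex, "TLS handshake failed from {RemoteEndpoint}",
tcpClient.Client.RemoteEndPoint);
sslStream.Dispose();
tcpClient.Dispose();
}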
}
}
private bool ValidateClientCertificate(
object sender, X509Certificate? certificate,
X509Chain? chain, SslPolicyErrors errors)
{
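// No certificate presented: acceptable only when mTLS is not required
if (certificate == null)
return !_options.RequireClientCertificate;
if (errors == SslPolicyErrors.None)
return true;
// Dev only: tolerate an untrusted (self-signed) chain, but no other failure
return _options.AllowSelfSigned
&& errors == SslPolicyErrors.RemoteCertificateChainErrors;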
}
}
```
## Client Implementation
```csharp
public sealed class TlsTransportClient : ITransportClient
{
public async Task ConnectAsync(CancellationToken ct)
{
var tcpClient = new TcpClient();
await tcpClient.ConnectAsync(_options.Host, _options.Port, ct);
var sslStream = new SslStream(
tcpClient.GetStream(),
leaveInnerStreamOpen: false,
userCertificateValidationCallback: ValidateServerCertificate);
await sslStream.AuthenticateAsClientAsync(new SslClientAuthenticationOptions
{
TargetHost = _options.ExpectedServerHostname ?? _options.Host,
ClientCertificates = _options.ClientCertificate != null
? new X509CertificateCollection { _options.ClientCertificate }
: null,
EnabledSslProtocols = _options.EnabledProtocols,
CertificateRevocationCheckMode = _options.CheckCertificateRevocation
? X509RevocationMode.Online
: X509RevocationMode.NoCheck
}, ct);
// Connected and authenticated
_stream = sslStream;
_tcpClient = tcpClient;
}
}
```
## mTLS Identity Extraction
With mTLS, the microservice identity can be verified from the client certificate:
```csharp
internal string ExtractIdentityFromCertificate(X509Certificate2 cert)
{
// Common patterns:
// 1. Common Name (CN)
var cn = cert.GetNameInfo(X509NameType.SimpleName, forIssuer: false);
// 2. Subject Alternative Name (SAN) - DNS or URI
var san = cert.Extensions["2.5.29.17"]; // SAN OID
// 3. Custom extension for service identity
// ...
return cn;
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] TlsTransportServer accepts TLS connections
2. [ ] TlsTransportClient connects with TLS
3. [ ] Server and client certificate configuration works
4. [ ] mTLS (mutual TLS) works when enabled
5. [ ] Certificate validation works (chain, revocation, hostname)
6. [ ] AllowSelfSigned works for dev environments
7. [ ] Certificate hot-reload works
8. [ ] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- TLS 1.2 and 1.3 enabled by default (1.0/1.1 disabled)
- Certificate revocation checking is optional (online checks can slow the handshake)
- mTLS is optional (RequireClientCertificate = false by default)
- Identity extraction from cert is customizable
- Certificate hot-reload uses file system watcher


@@ -0,0 +1,221 @@
# Sprint 7000-0006-0003 · Real Transports · UDP Plugin
## Topic & Scope
Implement the UDP transport plugin for small, bounded payloads. UDP provides low-latency communication for simple operations but cannot handle streaming or large payloads.
**Goal:** Fast transport for small, idempotent operations.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.Udp/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_0001 (TCP transport for reference patterns)
- **Downstream:** None.
- **Parallel work:** Can run in parallel with TLS and RabbitMQ sprints.
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - UDP transport requirements)
- `docs/router/09-Step.md` (UDP transport section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | UDP-001 | TODO | Create `StellaOps.Router.Transport.Udp` classlib project | Add to solution |
| 2 | UDP-002 | TODO | Add project reference to Router.Common | |
| 3 | UDP-010 | TODO | Implement `UdpTransportServer` : `ITransportServer` | Gateway side |
| 4 | UDP-011 | TODO | Implement UDP socket listener | |
| 5 | UDP-012 | TODO | Implement datagram receive loop | |
| 6 | UDP-013 | TODO | Route received datagrams by source address | |
| 7 | UDP-020 | TODO | Implement `UdpTransportClient` : `ITransportClient` | Microservice side |
| 8 | UDP-021 | TODO | Implement UDP socket for sending | |
| 9 | UDP-022 | TODO | Implement receive for responses | |
| 10 | UDP-030 | TODO | Enforce MaxRequestBytesPerCall limit | Single datagram |
| 11 | UDP-031 | TODO | Reject oversized payloads | |
| 12 | UDP-032 | TODO | Set maximum datagram size from config | |
| 13 | UDP-040 | TODO | Implement request/response correlation | Per-datagram matching |
| 14 | UDP-041 | TODO | Track pending requests with timeout | |
| 15 | UDP-042 | TODO | Handle out-of-order responses | |
| 16 | UDP-050 | TODO | Implement HELLO via UDP | |
| 17 | UDP-051 | TODO | Implement HEARTBEAT via UDP | |
| 18 | UDP-052 | TODO | Implement REQUEST/RESPONSE via UDP | No streaming |
| 19 | UDP-060 | TODO | Disable streaming for UDP transport | |
| 20 | UDP-061 | TODO | Reject endpoints with SupportsStreaming | |
| 21 | UDP-062 | TODO | Log streaming attempts as errors | |
| 22 | UDP-070 | TODO | Create UdpTransportOptions | BindAddress, Port, MaxDatagramSize |
| 23 | UDP-071 | TODO | Create DI registration `AddUdpTransport()` | |
| 24 | UDP-080 | TODO | Write integration tests | |
| 25 | UDP-081 | TODO | Write tests for size limit enforcement | |
## Constraints
From specs.md:
> UDP transport:
> * MUST be used only for small/bounded payloads (no unbounded streaming).
> * MUST respect configured `MaxRequestBytesPerCall`.
- **No streaming:** REQUEST_STREAM_DATA and RESPONSE_STREAM_DATA are not supported
- **Size limit:** Entire request must fit in one datagram
- **Best for:** Ping, health checks, small queries, commands
## Datagram Format
Single UDP datagram = single frame:
```
┌─────────────────────────────────────────────────────────────┐
│ FrameType (1 byte) │ CorrelationId (16 bytes) │ Data (N) │
└─────────────────────────────────────────────────────────────┘
```
Maximum datagram size: the theoretical UDP payload limit over IPv4 is 65,507 bytes, but payloads above roughly 1,400 bytes risk IP fragmentation on a typical 1,500-byte MTU path.
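The layout above can be sketched as a small encoder/decoder. This is an illustration only: `UdpDatagramCodec` is a hypothetical name, the frame type is passed as a raw byte because the numeric values of `FrameType` are not given here, and the correlation ID is assumed to use .NET's native `Guid` byte layout on both sides.

```csharp
using System;

// Hypothetical codec for the single-datagram frame layout:
// [FrameType: 1 byte][CorrelationId: 16 bytes][Data: N bytes]
public static class UdpDatagramCodec
{
    public const int HeaderSize = 1 + 16;

    public static byte[] Encode(byte frameType, Guid correlationId,
        ReadOnlySpan<byte> data, int maxDatagramSize)
    {
        var total = HeaderSize + data.Length;
        if (total > maxDatagramSize)
            throw new ArgumentException(
                $"Datagram of {total} bytes exceeds limit {maxDatagramSize}.", nameof(data));

        var buffer = new byte[total];
        buffer[0] = frameType;

        // Assumption: sender and receiver both use Guid's native byte order.
        if (!correlationId.TryWriteBytes(buffer.AsSpan(1, 16)))
            throw new InvalidOperationException("Could not write correlation id.");

        data.CopyTo(buffer.AsSpan(HeaderSize));
        return buffer;
    }

    public static (byte FrameType, Guid CorrelationId, byte[] Data) Decode(
        ReadOnlySpan<byte> datagram)
    {
        // Reject truncated datagrams instead of reading past the buffer.
        if (datagram.Length < HeaderSize)
            throw new FormatException("Datagram shorter than the 17-byte header.");

        return (datagram[0],
                new Guid(datagram.Slice(1, 16)),
                datagram.Slice(HeaderSize).ToArray());
    }
}
```

Because the size check includes the 17-byte header, the usable payload is `maxDatagramSize - 17` bytes.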
## UdpTransportServer
```csharp
public sealed class UdpTransportServer : ITransportServer
{
private UdpClient? _listener;
private readonly ConcurrentDictionary<IPEndPoint, string> _endpointToConnectionId = new();
public async Task StartAsync(CancellationToken ct)
{
_listener = new UdpClient(new IPEndPoint(_options.BindAddress, _options.Port)); // Honor BindAddress, not just Port
_ = ReceiveLoopAsync(ct);
}
private async Task ReceiveLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var result = await _listener!.ReceiveAsync(ct);
var remoteEndpoint = result.RemoteEndPoint;
var data = result.Buffer;
// Parse frame
var frame = ParseFrame(data);
// Get or create connection ID for this endpoint
var connectionId = _endpointToConnectionId.GetOrAdd(
remoteEndpoint,
ep => $"udp-{ep}");
// Handle HELLO specially to register connection
if (frame.Type == FrameType.Hello)
{
OnConnection?.Invoke(connectionId);
}
OnFrame?.Invoke(connectionId, frame);
}
}
public async Task SendFrameAsync(string connectionId, Frame frame)
{
var endpoint = ResolveEndpoint(connectionId);
var data = SerializeFrame(frame);
if (data.Length > _options.MaxDatagramSize)
throw new PayloadTooLargeException(data.Length, _options.MaxDatagramSize);
await _listener!.SendAsync(data, data.Length, endpoint);
}
}
```
## UdpTransportClient
```csharp
public sealed class UdpTransportClient : ITransportClient
{
private UdpClient? _client;
private readonly ConcurrentDictionary<Guid, TaskCompletionSource<Frame>> _pending = new();
public async Task ConnectAsync(string host, int port, CancellationToken ct)
{
_client = new UdpClient();
_client.Connect(host, port);
_ = ReceiveLoopAsync(ct);
}
public async Task<Frame> SendRequestAsync(
ConnectionState connection, Frame request,
TimeSpan timeout, CancellationToken ct)
{
var data = SerializeFrame(request);
if (data.Length > _options.MaxDatagramSize)
throw new PayloadTooLargeException(data.Length, _options.MaxDatagramSize);
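var tcs = new TaskCompletionSource<Frame>(TaskCreationOptions.RunContinuationsAsynchronously);
using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
cts.CancelAfter(timeout);
using var registration = cts.Token.Register(() => tcs.TrySetCanceled());
_pending[request.CorrelationId] = tcs;
try
{
await _client!.SendAsync(data, data.Length);
return await tcs.Task;
}
finally
{
// Drop the entry on success, timeout, or cancellation so _pending cannot leak
_pending.TryRemove(request.CorrelationId, out _);
}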
}
// Streaming not supported
public Task SendStreamingAsync(...) => throw new NotSupportedException(
"UDP transport does not support streaming. Use TCP or TLS transport.");
}
```
## UdpTransportOptions
```csharp
public sealed class UdpTransportOptions
{
public IPAddress BindAddress { get; set; } = IPAddress.Any;
public int Port { get; set; } = 5101;
public int MaxDatagramSize { get; set; } = 8192; // Above a typical 1500-byte MTU; expect IP fragmentation near this size
public TimeSpan DefaultTimeout { get; set; } = TimeSpan.FromSeconds(5);
public bool AllowBroadcast { get; set; } = false;
}
```
## Use Cases
UDP is appropriate for:
- **Health checks:** Small, frequent, non-critical
- **Metrics collection:** Fire-and-forget updates
- **Cache invalidation:** Small notifications
- **DNS-like lookups:** Quick request/response
UDP is NOT appropriate for:
- **File uploads/downloads:** Requires streaming
- **Large requests/responses:** Exceeds datagram limit
- **Critical operations:** No delivery guarantee
- **Ordered sequences:** Out-of-order possible
## Exit Criteria
Before marking this sprint DONE:
1. [ ] UdpTransportServer receives datagrams
2. [ ] UdpTransportClient sends and receives
3. [ ] Size limits enforced
4. [ ] Streaming disabled/rejected
5. [ ] Request/response correlation works
6. [ ] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
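- Default max datagram: 8 KB (well under the 65,507-byte UDP limit, but above a typical 1,500-byte MTU, so large datagrams will be IP-fragmented)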
- No retry/reliability - UDP is fire-and-forget
- Connection is logical (based on source IP:port)
- Timeout is per-request, no keepalive needed
- CANCEL is sent but may not arrive (best effort)


@@ -0,0 +1,218 @@
# Sprint 7000-0006-0004 · Real Transports · RabbitMQ Plugin
## Topic & Scope
Implement the RabbitMQ transport plugin. Uses message queue infrastructure for reliable asynchronous communication with built-in durability options.
**Goal:** Reliable transport using existing message queue infrastructure.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.RabbitMq/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_0001 (TCP transport for reference patterns)
- **Downstream:** None.
- **Parallel work:** Can run in parallel with TLS and UDP sprints.
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - RabbitMQ transport requirements)
- `docs/router/09-Step.md` (RabbitMQ transport section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | RMQ-001 | TODO | Create `StellaOps.Router.Transport.RabbitMq` classlib project | Add to solution |
| 2 | RMQ-002 | TODO | Add project reference to Router.Common | |
| 3 | RMQ-003 | TODO | Add RabbitMQ.Client NuGet package | |
| 4 | RMQ-010 | TODO | Implement `RabbitMqTransportServer` : `ITransportServer` | Gateway side |
| 5 | RMQ-011 | TODO | Implement connection to RabbitMQ broker | |
| 6 | RMQ-012 | TODO | Create request queue per gateway node | |
| 7 | RMQ-013 | TODO | Create response exchange for routing | |
| 8 | RMQ-014 | TODO | Implement consumer for incoming frames | |
| 9 | RMQ-020 | TODO | Implement `RabbitMqTransportClient` : `ITransportClient` | Microservice side |
| 10 | RMQ-021 | TODO | Implement connection to RabbitMQ broker | |
| 11 | RMQ-022 | TODO | Create response queue per microservice instance | |
| 12 | RMQ-023 | TODO | Bind response queue to exchange | |
| 13 | RMQ-030 | TODO | Implement queue/exchange naming convention | |
| 14 | RMQ-031 | TODO | Format: `stella.router.{nodeId}.requests` | Gateway request queue |
| 15 | RMQ-032 | TODO | Format: `stella.router.responses` | Response exchange |
| 16 | RMQ-033 | TODO | Routing key: `{connectionId}` | For response routing |
| 17 | RMQ-040 | TODO | Use CorrelationId for request/response matching | BasicProperties |
| 18 | RMQ-041 | TODO | Set ReplyTo for response routing | |
| 19 | RMQ-042 | TODO | Implement pending request tracking | |
| 20 | RMQ-050 | TODO | Implement HELLO via RabbitMQ | |
| 21 | RMQ-051 | TODO | Implement HEARTBEAT via RabbitMQ | |
| 22 | RMQ-052 | TODO | Implement REQUEST/RESPONSE via RabbitMQ | |
| 23 | RMQ-053 | TODO | Implement CANCEL via RabbitMQ | |
| 24 | RMQ-060 | TODO | Implement streaming via RabbitMQ (optional) | Chunked messages |
| 25 | RMQ-061 | TODO | Consider at-most-once delivery semantics | |
| 26 | RMQ-070 | TODO | Create RabbitMqTransportOptions | Connection, queues, durability |
| 27 | RMQ-071 | TODO | Create DI registration `AddRabbitMqTransport()` | |
| 28 | RMQ-080 | TODO | Write integration tests with local RabbitMQ | |
| 29 | RMQ-081 | TODO | Write tests for connection recovery | |
## Queue/Exchange Topology
```
┌─────────────────────────┐
Microservice ──────────►│ stella.router.requests │
(HELLO, HEARTBEAT, │ (Direct Exchange) │
RESPONSE) └───────────┬─────────────┘
│ routing_key = nodeId
┌─────────────────────────┐
│ stella.gw.{nodeId}.in │◄─── Gateway consumes
│ (Queue) │
└─────────────────────────┘
Gateway ───────────────►┌─────────────────────────┐
(REQUEST, CANCEL) │ stella.router.responses │
│ (Topic Exchange) │
└───────────┬─────────────┘
│ routing_key = instanceId
┌─────────────────────────┐
│ stella.svc.{instanceId} │◄─── Microservice consumes
│ (Queue) │
└─────────────────────────┘
```
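The names in the diagram can be derived from the two prefixes in the transport options. A minimal sketch, assuming the diagram's naming is the intended one (the Delivery Tracker rows RMQ-031 and RMQ-032 list a slightly different format, so treat this as one possible convention); `RabbitMqTopologyNames` is a hypothetical helper:

```csharp
using System;

// Hypothetical naming helper that mirrors the topology diagram.
// Assumption: exchanges use ExchangePrefix and queues use QueuePrefix.
public sealed class RabbitMqTopologyNames
{
    private readonly string _exchangePrefix;
    private readonly string _queuePrefix;

    public RabbitMqTopologyNames(string exchangePrefix = "stella.router",
                                 string queuePrefix = "stella")
    {
        _exchangePrefix = Require(exchangePrefix, nameof(exchangePrefix));
        _queuePrefix = Require(queuePrefix, nameof(queuePrefix));
    }

    // Direct exchange that microservices publish HELLO/HEARTBEAT/RESPONSE to.
    public string RequestExchange => $"{_exchangePrefix}.requests";

    // Topic exchange that the gateway publishes REQUEST/CANCEL to.
    public string ResponseExchange => $"{_exchangePrefix}.responses";

    // Queue consumed by one gateway node; bound with routing key = nodeId.
    public string GatewayQueue(string nodeId) =>
        $"{_queuePrefix}.gw.{Require(nodeId, nameof(nodeId))}.in";

    // Queue consumed by one microservice instance; bound with routing key = instanceId.
    public string ServiceQueue(string instanceId) =>
        $"{_queuePrefix}.svc.{Require(instanceId, nameof(instanceId))}";

    private static string Require(string value, string name) =>
        string.IsNullOrWhiteSpace(value)
            ? throw new ArgumentException("Value must not be empty.", name)
            : value;
}
```

Centralizing the names in one type keeps the gateway and microservice sides from drifting apart when a prefix changes.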
## Message Properties
```csharp
var properties = channel.CreateBasicProperties();
properties.CorrelationId = correlationId.ToString();
properties.ReplyTo = replyQueueName;
properties.Type = frameType.ToString();
properties.Timestamp = new AmqpTimestamp(DateTimeOffset.UtcNow.ToUnixTimeSeconds());
properties.Expiration = ((long)timeout.TotalMilliseconds).ToString(CultureInfo.InvariantCulture); // Must be an integer number of milliseconds, as a string
properties.DeliveryMode = 1; // Non-persistent (or 2 for persistent)
```
## RabbitMqTransportOptions
```csharp
public sealed class RabbitMqTransportOptions
{
// Connection
public string HostName { get; set; } = "localhost";
public int Port { get; set; } = 5672;
public string VirtualHost { get; set; } = "/";
public string UserName { get; set; } = "guest";
public string Password { get; set; } = "guest";
// TLS
public bool UseSsl { get; set; } = false;
public string? SslCertPath { get; set; }
// Queues
public bool DurableQueues { get; set; } = false; // For dev, true for prod
public bool AutoDeleteQueues { get; set; } = true; // Clean up on disconnect
public int PrefetchCount { get; set; } = 10; // Concurrent messages
// Naming
public string ExchangePrefix { get; set; } = "stella.router";
public string QueuePrefix { get; set; } = "stella";
}
```
## RabbitMqTransportServer
```csharp
public sealed class RabbitMqTransportServer : ITransportServer
{
private IConnection? _connection;
private IModel? _channel;
private string? _requestQueueName; // Assigned in StartAsync, so it cannot be readonly
public async Task StartAsync(CancellationToken ct)
{
var factory = new ConnectionFactory
{
HostName = _options.HostName,
Port = _options.Port,
VirtualHost = _options.VirtualHost,
UserName = _options.UserName,
Password = _options.Password
};
_connection = factory.CreateConnection();
_channel = _connection.CreateModel();
// Exchange names derive from ExchangePrefix (the options class defines no dedicated exchange-name properties)
var requestExchange = $"{_options.ExchangePrefix}.requests";
var responseExchange = $"{_options.ExchangePrefix}.responses";
// Declare exchanges
_channel.ExchangeDeclare(requestExchange, ExchangeType.Direct, durable: true);
_channel.ExchangeDeclare(responseExchange, ExchangeType.Topic, durable: true);
// Declare and bind request queue
_requestQueueName = $"{_options.QueuePrefix}.gw.{_nodeId}.in";
_channel.QueueDeclare(_requestQueueName,
durable: _options.DurableQueues,
exclusive: false,
autoDelete: _options.AutoDeleteQueues);
_channel.QueueBind(_requestQueueName, requestExchange, routingKey: _nodeId);
// Start consuming
var consumer = new EventingBasicConsumer(_channel);
consumer.Received += OnMessageReceived;
_channel.BasicConsume(_requestQueueName, autoAck: true, consumer);
}
private void OnMessageReceived(object? sender, BasicDeliverEventArgs e)
{
var frame = ParseFrame(e.Body.ToArray(), e.BasicProperties);
var connectionId = ExtractConnectionId(e.BasicProperties);
if (frame.Type == FrameType.Hello)
{
OnConnection?.Invoke(connectionId);
}
OnFrame?.Invoke(connectionId, frame);
}
}
```
## At-Most-Once Semantics
From specs.md:
> * Guarantee at-most-once semantics where practical.
This means:
- Auto-ack messages (no redelivery on failure)
- Non-durable queues/messages by default
- Idempotent handlers are caller's responsibility
For at-least-once (if needed later):
- Manual ack after processing
- Durable queues and persistent messages
- Deduplication in handler
## Exit Criteria
Before marking this sprint DONE:
1. [ ] RabbitMqTransportServer connects and consumes
2. [ ] RabbitMqTransportClient publishes and consumes
3. [ ] Queue/exchange topology correct
4. [ ] CorrelationId matching works
5. [ ] HELLO/HEARTBEAT/REQUEST/RESPONSE flow works
6. [ ] Connection recovery works
7. [ ] Integration tests pass with local RabbitMQ
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Auto-delete queues by default (clean up on disconnect)
- Non-persistent messages by default (speed over durability)
- Prefetch count limits concurrent processing only with manual acks; in the default auto-ack mode the broker ignores it
- Connection recovery uses RabbitMQ.Client built-in recovery
- Streaming is optional (can chunk large messages)


@@ -0,0 +1,220 @@
# Sprint 7000-0007-0001 · Configuration · Router Config Library
## Topic & Scope
Implement the Router.Config library with YAML configuration support and hot-reload. Provides centralized configuration for services, endpoints, static instances, and payload limits.
**Goal:** Configuration-driven router behavior with runtime updates.
**Working directory:** `src/__Libraries/StellaOps.Router.Config/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_* (all transports - config applies to transport selection)
- **Downstream:** SPRINT_7000_0007_0002 (microservice YAML)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Gateway consumes this library.
## Documentation Prerequisites
- `docs/router/specs.md` (section 11 - Configuration and YAML requirements)
- `docs/router/10-Step.md` (configuration section)
- `docs/router/implplan.md` (phase 10 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | CFG-001 | TODO | Implement `RouterConfig` root object | |
| 2 | CFG-002 | TODO | Implement `ServiceConfig` for service definitions | |
| 3 | CFG-003 | TODO | Implement `EndpointConfig` for endpoint definitions | |
| 4 | CFG-004 | TODO | Implement `StaticInstanceConfig` for known instances | |
| 5 | CFG-010 | TODO | Implement YAML configuration binding | YamlDotNet |
| 6 | CFG-011 | TODO | Implement JSON configuration binding | System.Text.Json |
| 7 | CFG-012 | TODO | Implement environment variable overrides | |
| 8 | CFG-013 | TODO | Support configuration layering (base + overrides) | |
| 9 | CFG-020 | TODO | Implement hot-reload via IOptionsMonitor | |
| 10 | CFG-021 | TODO | Implement file system watcher for YAML | |
| 11 | CFG-022 | TODO | Trigger routing state refresh on config change | |
| 12 | CFG-023 | TODO | Handle errors in reloaded config (keep previous) | |
| 13 | CFG-030 | TODO | Implement `IRouterConfigProvider` interface | |
| 14 | CFG-031 | TODO | Implement validation on load | Required fields, format |
| 15 | CFG-032 | TODO | Log configuration changes | |
| 16 | CFG-040 | TODO | Create DI registration `AddRouterConfig()` | |
| 17 | CFG-041 | TODO | Integrate with Gateway startup | |
| 18 | CFG-050 | TODO | Write sample router.yaml | |
| 19 | CFG-051 | TODO | Write unit tests for binding | |
| 20 | CFG-052 | TODO | Write tests for hot-reload | |
## RouterConfig Structure
```csharp
public sealed class RouterConfig
{
public IList<ServiceConfig> Services { get; init; } = new List<ServiceConfig>();
public IList<StaticInstanceConfig> StaticInstances { get; init; } = new List<StaticInstanceConfig>();
public PayloadLimits PayloadLimits { get; init; } = new();
public RoutingOptions Routing { get; init; } = new();
}
public sealed class ServiceConfig
{
public string Name { get; init; } = string.Empty;
public string DefaultVersion { get; init; } = "1.0.0";
public TransportType DefaultTransport { get; init; } = TransportType.Tcp;
public IList<EndpointConfig> Endpoints { get; init; } = new List<EndpointConfig>();
}
public sealed class EndpointConfig
{
public string Method { get; init; } = "GET";
public string Path { get; init; } = string.Empty;
public TimeSpan? DefaultTimeout { get; init; }
public IList<ClaimRequirementConfig> RequiringClaims { get; init; } = new List<ClaimRequirementConfig>();
public bool? SupportsStreaming { get; init; }
}
public sealed class StaticInstanceConfig
{
public string ServiceName { get; init; } = string.Empty;
public string Version { get; init; } = string.Empty;
public string Region { get; init; } = string.Empty;
public string Host { get; init; } = string.Empty;
public int Port { get; init; }
public TransportType Transport { get; init; }
}
```
## Sample router.yaml
```yaml
# Router configuration
payloadLimits:
maxRequestBytesPerCall: 10485760 # 10 MB
maxRequestBytesPerConnection: 104857600 # 100 MB
maxAggregateInflightBytes: 1073741824 # 1 GB
routing:
neighborRegions:
- eu2
- us1
tieBreaker: roundRobin
services:
- name: billing
defaultVersion: "1.0.0"
defaultTransport: tcp
endpoints:
- method: POST
path: /invoices
defaultTimeout: 30s
requiringClaims:
- type: role
value: billing-admin
- method: GET
path: /invoices/{id}
defaultTimeout: 5s
- name: inventory
defaultVersion: "2.1.0"
defaultTransport: tls
endpoints:
- method: GET
path: /items
supportsStreaming: true
# Optional: static instances (usually discovered via HELLO)
staticInstances:
- serviceName: billing
version: "1.0.0"
region: eu1
host: billing-eu1-01.internal
port: 5100
transport: tcp
```
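The sample writes timeouts as `30s` and `5s`, which `TimeSpan.Parse` does not accept, so binding these values would likely need a custom converter. A minimal sketch of the parsing step, assuming the supported suffixes are `ms`, `s`, `m` and `h` (`DurationParser` is a hypothetical name; how it plugs into the YAML binder is left open):

```csharp
using System;
using System.Globalization;

// Hypothetical parser for duration strings such as "30s" or "250ms".
// Assumption: only the suffixes ms, s, m and h are supported.
public static class DurationParser
{
    public static TimeSpan Parse(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            throw new FormatException("Duration is empty.");
        text = text.Trim();

        // "ms" is checked before "s" and "m" so that "250ms" is not misread.
        (string Suffix, double MsPerUnit)[] units =
        {
            ("ms", 1), ("s", 1_000), ("m", 60_000), ("h", 3_600_000)
        };

        foreach (var (suffix, msPerUnit) in units)
        {
            if (!text.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
                continue;

            var number = text.Substring(0, text.Length - suffix.Length);
            if (double.TryParse(number, NumberStyles.Float,
                    CultureInfo.InvariantCulture, out var value) && value >= 0)
                return TimeSpan.FromMilliseconds(value * msPerUnit);

            break; // Suffix matched but the number did not parse.
        }

        throw new FormatException($"Unrecognized duration '{text}'.");
    }
}
```

Parsing with the invariant culture keeps `1.5s` from being read differently on hosts with a comma decimal separator.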
## Hot-Reload Implementation
```csharp
public sealed class RouterConfigProvider : IRouterConfigProvider, IDisposable
{
private volatile RouterConfig _current; // Swapped by the watcher thread, read by request threads
private readonly FileSystemWatcher? _watcher;
private readonly ILogger<RouterConfigProvider> _logger;
public RouterConfigProvider(IOptions<RouterConfigOptions> options, ILogger<RouterConfigProvider> logger)
{
_logger = logger;
_current = LoadConfig(options.Value.ConfigPath);
if (options.Value.EnableHotReload)
{
_watcher = new FileSystemWatcher(Path.GetDirectoryName(options.Value.ConfigPath)!)
{
Filter = Path.GetFileName(options.Value.ConfigPath),
NotifyFilter = NotifyFilters.LastWrite
};
_watcher.Changed += OnConfigFileChanged;
_watcher.EnableRaisingEvents = true;
}
}
private void OnConfigFileChanged(object sender, FileSystemEventArgs e)
{
try
{
var newConfig = LoadConfig(e.FullPath);
ValidateConfig(newConfig);
var previous = _current;
_current = newConfig;
_logger.LogInformation("Router configuration reloaded successfully");
ConfigurationChanged?.Invoke(this, new ConfigChangedEventArgs(previous, newConfig));
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to reload configuration, keeping previous");
}
}
public RouterConfig Current => _current;
public event EventHandler<ConfigChangedEventArgs>? ConfigurationChanged;
public void Dispose() => _watcher?.Dispose();
}
```
## Configuration Precedence
1. **Code defaults** (in Common library)
2. **YAML configuration** (router.yaml)
3. **JSON configuration** (appsettings.json)
4. **Environment variables** (STELLAOPS_ROUTER_*)
5. **Microservice HELLO** (dynamic registration)
6. **Authority overrides** (for RequiringClaims)
Later sources override earlier ones.
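The "later overrides earlier" rule can be sketched for a single setting as a fold over nullable layers. This is an illustration only: `LayeredSetting` is a hypothetical helper, and each layer is assumed to report "not set" as `null`.

```csharp
using System;

// Hypothetical helper illustrating configuration precedence for one value.
// Assumption: a layer that does not set the value contributes null.
public static class LayeredSetting
{
    // Overrides are ordered from lowest to highest precedence,
    // e.g. YAML, JSON, environment variable, HELLO, Authority.
    public static T Resolve<T>(T codeDefault, params T?[] overrides) where T : struct
    {
        var result = codeDefault;
        foreach (var layer in overrides)
        {
            // A null layer leaves the current value untouched.
            if (layer.HasValue)
                result = layer.Value;
        }
        return result;
    }
}
```

For example, a code default of 30 seconds with a YAML value of 60 seconds and no higher layer set would resolve to 60 seconds.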
## Exit Criteria
Before marking this sprint DONE:
1. [ ] RouterConfig binds from YAML correctly
2. [ ] JSON and environment variables also work
3. [ ] Hot-reload updates config without restart
4. [ ] Validation rejects invalid config
5. [ ] Sample router.yaml documents all options
6. [ ] DI integration works with Gateway
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- YamlDotNet for YAML parsing (mature, well-supported)
- File watcher should debounce change events to avoid multiple reloads per save (the `RouterConfigProvider` sketch above does not yet do this)
- Invalid hot-reload keeps previous config (fail-safe)
- Static instances are optional (most discover via HELLO)
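The debounce decision can be sketched with a timer that is re-armed on every change event. This is one possible approach, not the planned implementation; `Debouncer` is a hypothetical type and the quiet period is an assumed tuning parameter.

```csharp
using System;
using System.Threading;

// Hypothetical debouncer: runs the action once, after signals have
// stopped arriving for the configured quiet period.
public sealed class Debouncer : IDisposable
{
    private readonly Timer _timer;
    private readonly TimeSpan _quietPeriod;

    public Debouncer(TimeSpan quietPeriod, Action action)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        _quietPeriod = quietPeriod;

        // Created idle; each Signal() call arms or re-arms it.
        _timer = new Timer(_ => action(), null,
            Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
    }

    // Intended to be called from FileSystemWatcher.Changed.
    // Restarts the countdown, so a burst of events triggers one callback.
    public void Signal() => _timer.Change(_quietPeriod, Timeout.InfiniteTimeSpan);

    public void Dispose() => _timer.Dispose();
}
```

A file watcher often raises several `Changed` events for a single save, so the reload handler would call `Signal()` and perform the actual load inside the debounced action.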


@@ -0,0 +1,213 @@
# Sprint 7000-0007-0002 · Configuration · Microservice YAML Config
## Topic & Scope
Implement YAML configuration support for microservices. Allows endpoint-level overrides for timeouts, RequiringClaims, and streaming flags without code changes.
**Goal:** Microservices can customize endpoint behavior via YAML without rebuilding.
**Working directory:** `src/__Libraries/StellaOps.Microservice/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0007_0001 (Router.Config patterns)
- **Downstream:** SPRINT_7000_0008_0001 (Authority integration)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Microservice SDK only.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 7.3, 11 - Microservice config requirements)
- `docs/router/10-Step.md` (microservice YAML section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MCFG-001 | TODO | Create `MicroserviceEndpointConfig` class | |
| 2 | MCFG-002 | TODO | Create `MicroserviceYamlConfig` root object | |
| 3 | MCFG-010 | TODO | Implement YAML loading from ConfigFilePath | |
| 4 | MCFG-011 | TODO | Implement endpoint matching by (Method, Path) | |
| 5 | MCFG-012 | TODO | Implement override merge with code defaults | |
| 6 | MCFG-020 | TODO | Override DefaultTimeout per endpoint | |
| 7 | MCFG-021 | TODO | Override RequiringClaims per endpoint | |
| 8 | MCFG-022 | TODO | Override SupportsStreaming per endpoint | |
| 9 | MCFG-030 | TODO | Implement precedence: code → YAML | |
| 10 | MCFG-031 | TODO | Document that YAML cannot create endpoints (only modify) | |
| 11 | MCFG-032 | TODO | Warn on YAML entries that don't match code endpoints | |
| 12 | MCFG-040 | TODO | Integrate with endpoint discovery | |
| 13 | MCFG-041 | TODO | Apply overrides before HELLO construction | |
| 14 | MCFG-050 | TODO | Create sample microservice.yaml | |
| 15 | MCFG-051 | TODO | Write unit tests for merge logic | |
| 16 | MCFG-052 | TODO | Write tests for precedence | |
## MicroserviceYamlConfig Structure
```csharp
public sealed class MicroserviceYamlConfig
{
public IList<EndpointOverrideConfig> Endpoints { get; init; } = new List<EndpointOverrideConfig>();
}
public sealed class EndpointOverrideConfig
{
public string Method { get; init; } = string.Empty;
public string Path { get; init; } = string.Empty;
public TimeSpan? DefaultTimeout { get; init; }
public bool? SupportsStreaming { get; init; }
public IList<ClaimRequirementConfig>? RequiringClaims { get; init; }
}
```
## Sample microservice.yaml
```yaml
# Microservice endpoint overrides
# Note: Only modifies endpoints declared in code; cannot create new endpoints
endpoints:
  - method: POST
    path: /invoices
    defaultTimeout: 60s  # Override code default of 30s
    requiringClaims:
      - type: role
        value: invoice-creator
      - type: department
        value: finance
  - method: GET
    path: /invoices/{id}
    defaultTimeout: 10s
  - method: POST
    path: /reports/generate
    supportsStreaming: true  # Enable streaming for large reports
    defaultTimeout: 300s     # 5 minutes for long-running reports
```
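The sample writes timeouts as `60s` and `300s`, a form that `TimeSpan.Parse` does not accept, so the loader needs some conversion step. The sketch below is one possible approach, assuming the format is an unsigned integer followed by `ms`, `s`, `m`, or `h`. `YamlDuration` is a hypothetical helper name and the accepted units are an assumption; neither comes from the spec.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the SDK): parses "<integer><unit>" durations.
public static class YamlDuration
{
    // "ms" is listed before "s" so that "250ms" is not read as a value in seconds.
    private static readonly (string Suffix, long Ticks)[] Units =
    {
        ("ms", TimeSpan.TicksPerMillisecond),
        ("s", TimeSpan.TicksPerSecond),
        ("m", TimeSpan.TicksPerMinute),
        ("h", TimeSpan.TicksPerHour),
    };

    public static bool TryParse(string? text, out TimeSpan value)
    {
        value = default;
        if (string.IsNullOrWhiteSpace(text)) return false;
        text = text.Trim();

        foreach (var (suffix, ticks) in Units)
        {
            if (!text.EndsWith(suffix, StringComparison.OrdinalIgnoreCase)) continue;

            // Digits only: no sign, no decimals, no space between number and unit.
            var digits = text[..^suffix.Length];
            if (!long.TryParse(digits, NumberStyles.None, CultureInfo.InvariantCulture, out var n))
                return false;
            if (n > TimeSpan.MaxValue.Ticks / ticks) return false; // would overflow TimeSpan

            value = TimeSpan.FromTicks(n * ticks);
            return true;
        }
        return false; // no recognized unit suffix
    }
}
```

A loader built this way would probably treat a `false` result as a configuration error rather than silently keeping the code default, so that a typo such as `60sec` does not go unnoticed.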
## Merge Logic
```csharp
internal sealed class EndpointOverrideMerger
{
public EndpointDescriptor Merge(
EndpointDescriptor codeDefault,
EndpointOverrideConfig? yamlOverride)
{
if (yamlOverride == null)
return codeDefault;
return codeDefault with
{
DefaultTimeout = yamlOverride.DefaultTimeout ?? codeDefault.DefaultTimeout,
SupportsStreaming = yamlOverride.SupportsStreaming ?? codeDefault.SupportsStreaming,
RequiringClaims = yamlOverride.RequiringClaims?.Select(c =>
new ClaimRequirement { Type = c.Type, Value = c.Value }).ToList()
?? codeDefault.RequiringClaims
};
}
}
```
## Precedence Rules
From specs.md section 7.3:
> Precedence rules MUST be clearly defined and honored:
> * Service identity & router pool: from `StellaMicroserviceOptions` (not YAML).
> * Endpoint set: from code (attributes/source gen); YAML MAY override properties but ideally not create endpoints not present in code.
> * `RequiringClaims` and timeouts: YAML overrides defaults from code, unless overridden by central Authority.
```
┌─────────────────┐
│ Code defaults │ [StellaEndpoint] attribute values
└────────┬────────┘
│ YAML overrides (if present)
┌─────────────────┐
│ YAML config │ Endpoint-specific overrides
└────────┬────────┘
│ Authority overrides (later sprint)
┌─────────────────┐
│ Effective │ Final values sent in HELLO
└─────────────────┘
```
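The chain above can be sketched as a pair of null-coalescing lookups. This is an illustration only: `EndpointLayers` and `Precedence` are hypothetical names, `ClaimRequirement` is a simplified stand-in for the real type, and the Authority layer is shown for claims only because that is what the later sprint describes.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the SDK type of the same name.
public sealed record ClaimRequirement
{
    public string Type { get; init; } = string.Empty;
    public string? Value { get; init; }
}

// Hypothetical holder for one endpoint's values per layer; null means "this layer says nothing".
public sealed record EndpointLayers(
    TimeSpan CodeTimeout,
    IReadOnlyList<ClaimRequirement> CodeClaims,
    TimeSpan? YamlTimeout = null,
    IReadOnlyList<ClaimRequirement>? YamlClaims = null,
    IReadOnlyList<ClaimRequirement>? AuthorityClaims = null);

public static class Precedence
{
    // YAML wins over the code default when it specifies a timeout.
    public static TimeSpan EffectiveTimeout(EndpointLayers layers) =>
        layers.YamlTimeout ?? layers.CodeTimeout;

    // The highest layer that specifies claims replaces the lower lists outright (no union).
    public static IReadOnlyList<ClaimRequirement> EffectiveClaims(EndpointLayers layers) =>
        layers.AuthorityClaims ?? layers.YamlClaims ?? layers.CodeClaims;
}
```

One consequence of replace-not-merge worth testing: under this reading, an explicitly empty list at a higher layer is not null, so it would remove every requirement declared below it.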
## Integration with Discovery
```csharp
internal sealed class EndpointDiscoveryService
{
private readonly IMicroserviceYamlLoader _yamlLoader;
private readonly EndpointOverrideMerger _merger;
public IReadOnlyList<EndpointDescriptor> DiscoverEndpoints()
{
// 1. Discover from code
var codeEndpoints = DiscoverFromReflection();
// 2. Load YAML overrides
var yamlConfig = _yamlLoader.Load();
// 3. Merge
return codeEndpoints.Select(ep =>
{
var yamlOverride = yamlConfig?.Endpoints
.FirstOrDefault(y => y.Method == ep.Method && y.Path == ep.Path);
if (yamlOverride == null)
return ep;
return _merger.Merge(ep, yamlOverride);
}).ToList();
}
}
```
## Warning on Unmatched YAML
```csharp
private void WarnUnmatchedOverrides(
IEnumerable<EndpointDescriptor> codeEndpoints,
MicroserviceYamlConfig? yamlConfig)
{
if (yamlConfig == null) return;
var codeKeys = codeEndpoints.Select(e => (e.Method, e.Path)).ToHashSet();
foreach (var yamlEntry in yamlConfig.Endpoints)
{
if (!codeKeys.Contains((yamlEntry.Method, yamlEntry.Path)))
{
_logger.LogWarning(
"YAML override for {Method} {Path} does not match any code endpoint",
yamlEntry.Method, yamlEntry.Path);
}
}
}
```
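Both the merge lookup and the warning above compare `(Method, Path)` with exact string equality, so a YAML entry written as `post` or with a trailing slash would be reported as unmatched. Whether the SDK should tolerate such differences is not stated in the spec; the sketch below assumes it should and shows one way to do it. `EndpointKeyComparer` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: HTTP method is case-insensitive, one trailing slash on the path is ignored.
public sealed class EndpointKeyComparer : IEqualityComparer<(string Method, string Path)>
{
    public static readonly EndpointKeyComparer Instance = new();

    public bool Equals((string Method, string Path) x, (string Method, string Path) y) =>
        string.Equals(x.Method, y.Method, StringComparison.OrdinalIgnoreCase) &&
        string.Equals(Normalize(x.Path), Normalize(y.Path), StringComparison.Ordinal);

    // Must hash the same normalized form that Equals compares.
    public int GetHashCode((string Method, string Path) key) =>
        HashCode.Combine(
            StringComparer.OrdinalIgnoreCase.GetHashCode(key.Method),
            StringComparer.Ordinal.GetHashCode(Normalize(key.Path)));

    private static string Normalize(string path) =>
        path.Length > 1 && path.EndsWith('/') ? path[..^1] : path;
}
```

With this in place, the key set in `WarnUnmatchedOverrides` could be built as `codeEndpoints.Select(e => (e.Method, e.Path)).ToHashSet(EndpointKeyComparer.Instance)`; the merge lookup would need the same comparer so that both code paths agree on what counts as a match.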
## Exit Criteria
Before marking this sprint DONE:
1. [ ] YAML loading works from ConfigFilePath
2. [ ] Merge applies YAML overrides to code defaults
3. [ ] Precedence is code → YAML
4. [ ] Unmatched YAML entries logged as warnings
5. [ ] Sample microservice.yaml documented
6. [ ] Unit tests for merge logic
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- YAML cannot create endpoints (only modify) per spec
- Missing YAML file is not an error (optional config)
- Hot-reload of microservice YAML is not supported (restart required)
- RequiringClaims in YAML fully replaces code defaults (not merged)


@@ -0,0 +1,204 @@
# Sprint 7000-0008-0001 · Integration · Authority Claims Override
## Topic & Scope
Implement Authority integration for RequiringClaims overrides. The central Authority service can push endpoint authorization requirements that override microservice defaults.
**Goal:** Centralized authorization policy that takes precedence over microservice-defined claims.
**Working directories:**
- `src/Gateway/StellaOps.Gateway.WebService/` (apply overrides)
- `src/Authority/` (if Authority changes needed)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0007_0002 (microservice YAML - establishes precedence)
- **Downstream:** SPRINT_7000_0008_0002 (source generator)
- **Parallel work:** Can run in parallel with source generator sprint.
- **Cross-module impact:** May require Authority module changes.
## Documentation Prerequisites
- `docs/router/specs.md` (section 9 - Authorization / requiringClaims / Authority requirements)
- `docs/modules/authority/architecture.md` (Authority module design)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | AUTH-001 | TODO | Define `IAuthorityClaimsProvider` interface | Common/Gateway |
| 2 | AUTH-002 | TODO | Define `ClaimsOverride` model | Common |
| 3 | AUTH-010 | TODO | Implement Gateway startup claims fetch | Gateway |
| 4 | AUTH-011 | TODO | Request overrides from Authority on startup | |
| 5 | AUTH-012 | TODO | Wait for Authority before handling traffic (configurable) | |
| 6 | AUTH-020 | TODO | Implement runtime claims update | Gateway |
| 7 | AUTH-021 | TODO | Periodically refresh from Authority | |
| 8 | AUTH-022 | TODO | Alternatively, subscribe to Authority push notifications | |
| 9 | AUTH-030 | TODO | Merge Authority overrides with microservice defaults | Gateway |
| 10 | AUTH-031 | TODO | Authority takes precedence over YAML and code | |
| 11 | AUTH-032 | TODO | Store effective RequiringClaims per endpoint | |
| 12 | AUTH-040 | TODO | Implement AuthorizationMiddleware with claims enforcement | Gateway |
| 13 | AUTH-041 | TODO | Check user principal has all required claims | |
| 14 | AUTH-042 | TODO | Return 403 Forbidden on claim failure | |
| 15 | AUTH-050 | TODO | Create configuration for Authority connection | Gateway |
| 16 | AUTH-051 | TODO | Handle Authority unavailable (use cached/defaults) | |
| 17 | AUTH-060 | TODO | Write integration tests for claims enforcement | |
| 18 | AUTH-061 | TODO | Write tests for Authority override precedence | |
## IAuthorityClaimsProvider
```csharp
public interface IAuthorityClaimsProvider
{
Task<IReadOnlyDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>>> GetOverridesAsync(
CancellationToken cancellationToken);
event EventHandler<ClaimsOverrideChangedEventArgs>? OverridesChanged;
}
public readonly record struct EndpointKey(string ServiceName, string Method, string Path);
public sealed class ClaimsOverrideChangedEventArgs : EventArgs
{
public IReadOnlyDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> Overrides { get; init; } = new Dictionary<EndpointKey, IReadOnlyList<ClaimRequirement>>();
}
```
## Final Precedence Chain
```
┌─────────────────────┐
│ Code defaults │ [StellaEndpoint] RequiringClaims
└──────────┬──────────┘
│ YAML overrides
┌─────────────────────┐
│ Microservice YAML │ Endpoint-specific claims
└──────────┬──────────┘
│ Authority overrides (highest priority)
┌─────────────────────┐
│ Authority Policy │ Central claims requirements
└──────────┬──────────┘
┌─────────────────────┐
│ Effective Claims │ What Gateway enforces
└─────────────────────┘
```
## AuthorizationMiddleware (Updated)
```csharp
public class AuthorizationMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<AuthorizationMiddleware> _logger;

    public AuthorizationMiddleware(RequestDelegate next, ILogger<AuthorizationMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context, IEffectiveClaimsStore claimsStore)
    {
        // Set earlier in the pipeline by EndpointResolutionMiddleware
        var endpoint = (EndpointDescriptor)context.Items["ResolvedEndpoint"]!;

        // Get effective claims (already merged with Authority)
        var effectiveClaims = claimsStore.GetEffectiveClaims(
            endpoint.ServiceName, endpoint.Method, endpoint.Path);

        // Check each required claim
        var userClaims = context.User.Claims.ToList();
        foreach (var required in effectiveClaims)
        {
            // A null Value means any claim of the required type is sufficient
            bool hasClaim = required.Value == null
                ? userClaims.Any(c => c.Type == required.Type)
                : userClaims.Any(c => c.Type == required.Type && c.Value == required.Value);
            if (!hasClaim)
            {
                _logger.LogWarning(
                    "Authorization failed: user lacks claim {ClaimType}={ClaimValue}",
                    required.Type, required.Value ?? "(any)");
                context.Response.StatusCode = 403;
                await context.Response.WriteAsJsonAsync(new
                {
                    error = "Forbidden",
                    requiredClaim = new { type = required.Type, value = required.Value }
                });
                return;
            }
        }

        await _next(context);
    }
}
```
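The matching rule inside the middleware (a null `Value` accepts any claim of that type) is easier to unit-test when pulled out into a pure function. The sketch below assumes such an extraction; `RequiredClaim` and `ClaimCheck` are hypothetical names standing in for `ClaimRequirement` and whatever helper the Gateway ends up with.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

// Stand-in for ClaimRequirement: Value == null means "any value of this type".
public sealed record RequiredClaim(string Type, string? Value = null);

public static class ClaimCheck
{
    // Returns the first requirement the principal does not satisfy, or null when all are met.
    public static RequiredClaim? FirstMissing(ClaimsPrincipal user, IEnumerable<RequiredClaim> required)
    {
        var held = user.Claims.ToList();
        return required.FirstOrDefault(r => !held.Any(c =>
            c.Type == r.Type && (r.Value is null || c.Value == r.Value)));
    }
}
```

The middleware would then return 403 whenever `FirstMissing` is non-null, and the precedence tests (AUTH-060, AUTH-061) could exercise the rule without an HTTP pipeline.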
## IEffectiveClaimsStore
```csharp
public interface IEffectiveClaimsStore
{
IReadOnlyList<ClaimRequirement> GetEffectiveClaims(
string serviceName, string method, string path);
void UpdateFromMicroservice(string serviceName, IReadOnlyList<EndpointDescriptor> endpoints);
void UpdateFromAuthority(IReadOnlyDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> overrides);
}
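internal sealed class EffectiveClaimsStore : IEffectiveClaimsStore
{
    private readonly ConcurrentDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> _microserviceClaims = new();
    private readonly ConcurrentDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> _authorityClaims = new();

    public IReadOnlyList<ClaimRequirement> GetEffectiveClaims(
        string serviceName, string method, string path)
    {
        var key = new EndpointKey(serviceName, method, path);
        // Authority takes precedence
        if (_authorityClaims.TryGetValue(key, out var authorityClaims))
            return authorityClaims;
        // Fall back to microservice defaults
        if (_microserviceClaims.TryGetValue(key, out var msClaims))
            return msClaims;
        return Array.Empty<ClaimRequirement>();
    }

    // Sketch: store the claims each endpoint declared (code defaults with YAML applied)
    public void UpdateFromMicroservice(string serviceName, IReadOnlyList<EndpointDescriptor> endpoints)
    {
        foreach (var ep in endpoints)
            _microserviceClaims[new EndpointKey(serviceName, ep.Method, ep.Path)] = ep.RequiringClaims;
    }

    // Sketch: assumes each Authority update is a full snapshot, so stale keys are dropped
    public void UpdateFromAuthority(IReadOnlyDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> overrides)
    {
        foreach (var key in _authorityClaims.Keys)
            if (!overrides.ContainsKey(key)) _authorityClaims.TryRemove(key, out _);
        foreach (var (key, claims) in overrides)
            _authorityClaims[key] = claims;
    }
}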
```
## Authority Connection Options
```csharp
public sealed class AuthorityConnectionOptions
{
public string AuthorityUrl { get; set; } = string.Empty;
public bool WaitForAuthorityOnStartup { get; set; } = true;
public TimeSpan StartupTimeout { get; set; } = TimeSpan.FromSeconds(30);
public TimeSpan RefreshInterval { get; set; } = TimeSpan.FromMinutes(5);
public bool UseAuthorityPushNotifications { get; set; } = false;
}
```
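`WaitForAuthorityOnStartup` and `StartupTimeout` imply a bounded wait with a fallback to cached or default claims (AUTH-012, AUTH-051). A minimal sketch of that behavior follows, assuming the fetch is exposed as a delegate; `AuthorityStartup` and `FetchOrFallbackAsync` are hypothetical names, and `Task.WaitAsync` requires .NET 6 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AuthorityStartup
{
    // Runs fetch under a deadline; returns fallback when Authority is slow or failing.
    // Cancellation requested by the host (stopping) is not swallowed.
    public static async Task<T> FetchOrFallbackAsync<T>(
        Func<CancellationToken, Task<T>> fetch,
        TimeSpan startupTimeout,
        T fallback,
        CancellationToken stopping = default)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(stopping);
        cts.CancelAfter(startupTimeout);
        try
        {
            // WaitAsync enforces the deadline even if fetch ignores its token.
            return await fetch(cts.Token).WaitAsync(cts.Token);
        }
        catch (OperationCanceledException) when (!stopping.IsCancellationRequested)
        {
            return fallback; // deadline reached: proceed with cached/default claims
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            return fallback; // Authority unreachable or returned an error
        }
    }
}
```

In the Gateway this would presumably wrap `IAuthorityClaimsProvider.GetOverridesAsync`, with an empty dictionary as the fallback so that microservice-declared claims stay in force until the first successful refresh.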
## Exit Criteria
Before marking this sprint DONE:
1. [ ] IAuthorityClaimsProvider implemented
2. [ ] Gateway fetches overrides on startup
3. [ ] Authority overrides take precedence
4. [ ] AuthorizationMiddleware enforces effective claims
5. [ ] Graceful handling when Authority unavailable
6. [ ] Integration tests verify claims enforcement
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Authority overrides fully replace microservice claims (not merged)
- Startup can optionally wait for Authority (fail-safe mode proceeds without)
- Refresh interval is 5 minutes by default (tune for your environment)
- Authority push notifications optional (polling is default)
- This sprint assumes Authority module exists; coordinate with Authority team


@@ -0,0 +1,231 @@
# Sprint 7000-0008-0002 · Integration · Endpoint Source Generator
## Topic & Scope
Implement a Roslyn source generator for compile-time endpoint discovery. Generates endpoint metadata at build time, eliminating runtime reflection overhead.
**Goal:** Faster startup and AOT compatibility via build-time endpoint discovery.
**Working directory:** `src/__Libraries/StellaOps.Microservice.SourceGen/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0003_0001 (SDK core with reflection-based discovery)
- **Downstream:** None.
- **Parallel work:** Can run in parallel with Authority integration.
- **Cross-module impact:** Microservice SDK consumes generated code.
## Documentation Prerequisites
- `docs/router/specs.md` (section 7.2 - Endpoint definition & discovery)
- Roslyn Source Generator documentation
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | GEN-001 | TODO | Convert project to source generator | Microsoft.CodeAnalysis.CSharp |
| 2 | GEN-002 | TODO | Implement `[StellaEndpoint]` attribute detection | Syntax receiver |
| 3 | GEN-003 | TODO | Extract Method, Path, and other attribute properties | |
| 4 | GEN-010 | TODO | Detect handler interface implementation | IStellaEndpoint<T,R>, etc. |
| 5 | GEN-011 | TODO | Generate `EndpointDescriptor` instances | |
| 6 | GEN-012 | TODO | Generate `IGeneratedEndpointProvider` implementation | |
| 7 | GEN-020 | TODO | Generate registration code for DI | |
| 8 | GEN-021 | TODO | Generate handler factory methods | |
| 9 | GEN-030 | TODO | Implement incremental generation | For fast builds |
| 10 | GEN-031 | TODO | Cache compilation results | |
| 11 | GEN-040 | TODO | Add analyzer for invalid [StellaEndpoint] usage | Diagnostics |
| 12 | GEN-041 | TODO | Error on missing handler interface | |
| 13 | GEN-042 | TODO | Warning on duplicate Method+Path | |
| 14 | GEN-050 | TODO | Hook into SDK to prefer generated over reflection | |
| 15 | GEN-051 | TODO | Fall back to reflection if generation not available | |
| 16 | GEN-060 | TODO | Write unit tests for generator | |
| 17 | GEN-061 | TODO | Test generated code compiles and works | |
| 18 | GEN-062 | TODO | Test incremental generation | |
## Source Generator Output
Given this input:
```csharp
[StellaEndpoint("POST", "/invoices", DefaultTimeout = 30)]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
public Task<CreateInvoiceResponse> HandleAsync(CreateInvoiceRequest request, CancellationToken ct) => ...;
}
```
The generator produces:
```csharp
// <auto-generated/>
namespace StellaOps.Microservice.Generated
{
[global::System.CodeDom.Compiler.GeneratedCode("StellaOps.Microservice.SourceGen", "1.0.0")]
internal static class StellaEndpoints
{
public static global::System.Collections.Generic.IReadOnlyList<global::StellaOps.Router.Common.EndpointDescriptor>
GetEndpoints()
{
return new global::StellaOps.Router.Common.EndpointDescriptor[]
{
new global::StellaOps.Router.Common.EndpointDescriptor
{
Method = "POST",
Path = "/invoices",
DefaultTimeout = global::System.TimeSpan.FromSeconds(30),
SupportsStreaming = false,
RequiringClaims = global::System.Array.Empty<global::StellaOps.Router.Common.ClaimRequirement>(),
HandlerType = typeof(global::MyApp.CreateInvoiceEndpoint)
},
// ... more endpoints
};
}
public static void RegisterHandlers(
global::Microsoft.Extensions.DependencyInjection.IServiceCollection services)
{
services.AddTransient<global::MyApp.CreateInvoiceEndpoint>();
// ... more handlers
}
}
}
```
## Generator Implementation
```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Excerpt: EndpointInfo and GenerateEndpointsClass are omitted here.
[Generator]
public class StellaEndpointGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Find all classes with [StellaEndpoint]
        var endpointClasses = context.SyntaxProvider
            .ForAttributeWithMetadataName(
                "StellaOps.Microservice.StellaEndpointAttribute",
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) => GetEndpointInfo(ctx))
            .Where(static info => info is not null);

        // Combine and generate
        context.RegisterSourceOutput(
            endpointClasses.Collect(),
            static (spc, endpoints) => GenerateEndpointsClass(spc, endpoints!));
    }

    private static EndpointInfo? GetEndpointInfo(GeneratorAttributeSyntaxContext context)
    {
        var classSymbol = (INamedTypeSymbol)context.TargetSymbol;
        var attribute = context.Attributes[0];

        // Extract attribute parameters
        var method = attribute.ConstructorArguments[0].Value as string;
        var path = attribute.ConstructorArguments[1].Value as string;

        // Find timeout, streaming, etc. from named arguments (timeout in seconds, default 30)
        var timeout = attribute.NamedArguments
            .FirstOrDefault(a => a.Key == "DefaultTimeout").Value.Value as int? ?? 30;

        // Verify handler interface: IStellaEndpoint<,> (typed) or IRawStellaEndpoint (raw)
        var implementsHandler = classSymbol.AllInterfaces
            .Any(i => i.Name == "IStellaEndpoint" || i.Name == "IRawStellaEndpoint");
        if (!implementsHandler)
        {
            // Report diagnostic (STELLA001)
            return null;
        }

        return new EndpointInfo(classSymbol, method!, path!, timeout);
    }
}
```
## IGeneratedEndpointProvider
```csharp
public interface IGeneratedEndpointProvider
{
IReadOnlyList<EndpointDescriptor> GetEndpoints();
void RegisterHandlers(IServiceCollection services);
}
// Generated implementation
internal sealed class GeneratedEndpointProvider : IGeneratedEndpointProvider
{
public IReadOnlyList<EndpointDescriptor> GetEndpoints()
=> StellaEndpoints.GetEndpoints();
public void RegisterHandlers(IServiceCollection services)
=> StellaEndpoints.RegisterHandlers(services);
}
```
## SDK Integration
```csharp
internal sealed class EndpointDiscoveryService
{
public IReadOnlyList<EndpointDescriptor> DiscoverEndpoints()
{
// Prefer generated
var generated = TryGetGeneratedProvider();
if (generated != null)
{
_logger.LogDebug("Using source-generated endpoint discovery");
return generated.GetEndpoints();
}
// Fall back to reflection
_logger.LogDebug("Using reflection-based endpoint discovery");
return DiscoverFromReflection();
}
private IGeneratedEndpointProvider? TryGetGeneratedProvider()
{
// Look for generated type in entry assembly
var entryAssembly = Assembly.GetEntryAssembly();
var providerType = entryAssembly?.GetType(
"StellaOps.Microservice.Generated.GeneratedEndpointProvider");
if (providerType != null)
return (IGeneratedEndpointProvider)Activator.CreateInstance(providerType)!;
return null;
}
}
```
## Diagnostics
| ID | Severity | Message |
|----|----------|---------|
| STELLA001 | Error | Class with [StellaEndpoint] must implement IStellaEndpoint<> or IRawStellaEndpoint |
| STELLA002 | Warning | Duplicate endpoint: {Method} {Path} |
| STELLA003 | Warning | [StellaEndpoint] on abstract class is ignored |
| STELLA004 | Info | Generated {N} endpoint descriptors |
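
The duplicate check behind STELLA002 amounts to grouping declarations by `(Method, Path)`. The sketch below shows that grouping in plain LINQ, independent of Roslyn; `EndpointDecl` and `DuplicateEndpoints` are hypothetical names, and folding the method to upper case is an assumption rather than something the spec requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, Roslyn-free view of one [StellaEndpoint] declaration.
public sealed record EndpointDecl(string Method, string Path, string HandlerType);

public static class DuplicateEndpoints
{
    // Returns every group of declarations that share the same (Method, Path).
    public static IReadOnlyList<IReadOnlyList<EndpointDecl>> Find(IEnumerable<EndpointDecl> endpoints) =>
        endpoints
            .GroupBy(e => (Method: e.Method.ToUpperInvariant(), e.Path))
            .Where(g => g.Count() > 1)
            .Select(g => (IReadOnlyList<EndpointDecl>)g.ToList())
            .ToList();
}
```

In the generator, each returned group would map to one STELLA002 warning, ideally located on the second and later declarations so the first one stays clean.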
## Exit Criteria
Before marking this sprint DONE:
1. [ ] Source generator detects [StellaEndpoint] classes
2. [ ] Generates EndpointDescriptor array
3. [ ] Generates DI registration
4. [ ] Incremental generation for fast builds
5. [ ] Analyzers report invalid usage
6. [ ] SDK prefers generated over reflection
7. [ ] All tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Incremental generation is essential for large projects
- Generated code uses fully qualified names to avoid conflicts
- Fallback to reflection ensures compatibility with older projects
- AOT scenarios require source generation (no reflection)


@@ -0,0 +1,260 @@
# Sprint 7000-0009-0001 · Examples · Reference Implementation
## Topic & Scope
Build a complete reference example demonstrating the router, gateway, and microservice SDK working together. Provides templates for common patterns and validates the entire system end-to-end.
**Goal:** Working example that developers can copy and adapt.
**Working directory:** `examples/router/`
## Dependencies & Concurrency
- **Upstream:** All feature sprints complete (7000-0001 through 7000-0008)
- **Downstream:** SPRINT_7000_0009_0002 (migration docs)
- **Parallel work:** Can run in parallel with migration docs.
- **Cross-module impact:** None. Examples only.
## Documentation Prerequisites
- `docs/router/specs.md` (complete specification)
- `docs/router/implplan.md` (phase 11 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | EX-001 | TODO | Create `examples/router/` directory structure | |
| 2 | EX-002 | TODO | Create example solution `Examples.Router.sln` | |
| 3 | EX-010 | TODO | Create `Examples.Gateway` project | Full gateway setup |
| 4 | EX-011 | TODO | Configure gateway with all middleware | |
| 5 | EX-012 | TODO | Create example router.yaml | |
| 6 | EX-013 | TODO | Configure TCP and TLS transports | |
| 7 | EX-020 | TODO | Create `Examples.Billing.Microservice` project | |
| 8 | EX-021 | TODO | Implement simple GET/POST endpoints | |
| 9 | EX-022 | TODO | Implement streaming upload endpoint | IRawStellaEndpoint |
| 10 | EX-023 | TODO | Create example microservice.yaml | |
| 11 | EX-030 | TODO | Create `Examples.Inventory.Microservice` project | Second service |
| 12 | EX-031 | TODO | Demonstrate multi-service routing | |
| 13 | EX-040 | TODO | Create docker-compose.yaml | Local dev environment |
| 14 | EX-041 | TODO | Include RabbitMQ for transport option | |
| 15 | EX-042 | TODO | Include health monitoring | |
| 16 | EX-050 | TODO | Write README.md with run instructions | |
| 17 | EX-051 | TODO | Document adding new endpoints | |
| 18 | EX-052 | TODO | Document cancellation behavior | |
| 19 | EX-053 | TODO | Document payload limit testing | |
| 20 | EX-060 | TODO | Create integration test project | |
| 21 | EX-061 | TODO | Test full end-to-end flow | |
## Directory Structure
```
examples/router/
├── Examples.Router.sln
├── docker-compose.yaml
├── README.md
├── src/
│   ├── Examples.Gateway/
│   │   ├── Program.cs
│   │   ├── appsettings.json
│   │   └── router.yaml
│   ├── Examples.Billing.Microservice/
│   │   ├── Program.cs
│   │   ├── appsettings.json
│   │   ├── microservice.yaml
│   │   └── Endpoints/
│   │       ├── CreateInvoiceEndpoint.cs
│   │       ├── GetInvoiceEndpoint.cs
│   │       └── UploadAttachmentEndpoint.cs
│   └── Examples.Inventory.Microservice/
│       ├── Program.cs
│       └── Endpoints/
│           ├── ListItemsEndpoint.cs
│           └── GetItemEndpoint.cs
└── tests/
    └── Examples.Integration.Tests/
```
## Example Gateway Program.cs
```csharp
var builder = WebApplication.CreateBuilder(args);
// Router configuration
builder.Services.AddRouterConfig(options =>
{
options.ConfigPath = "router.yaml";
options.EnableHotReload = true;
});
// Gateway node configuration
builder.Services.Configure<GatewayNodeConfig>(
builder.Configuration.GetSection("GatewayNode"));
// Transports
builder.Services.AddTcpTransport(options =>
{
options.Port = 5100;
});
builder.Services.AddTlsTransport(options =>
{
options.Port = 5101;
options.ServerCertificatePath = "certs/gateway.pfx";
});
// Routing
builder.Services.AddSingleton<IGlobalRoutingState, InMemoryRoutingState>();
builder.Services.AddSingleton<IRoutingPlugin, DefaultRoutingPlugin>();
// Authority integration
builder.Services.AddAuthorityClaimsProvider(options =>
{
options.AuthorityUrl = builder.Configuration["Authority:Url"]
?? throw new InvalidOperationException("Authority:Url is not configured");
});
var app = builder.Build();
// Middleware pipeline
app.UseForwardedHeaders();
app.UseMiddleware<GlobalErrorHandlerMiddleware>();
app.UseMiddleware<RequestLoggingMiddleware>();
app.UseMiddleware<PayloadLimitsMiddleware>();
app.UseAuthentication();
app.UseMiddleware<EndpointResolutionMiddleware>();
app.UseMiddleware<AuthorizationMiddleware>();
app.UseMiddleware<RoutingDecisionMiddleware>();
app.UseMiddleware<TransportDispatchMiddleware>();
app.Run();
```
## Example Microservice Program.cs
```csharp
var builder = Host.CreateApplicationBuilder(args);
builder.Services.AddStellaMicroservice(options =>
{
options.ServiceName = "billing";
options.Version = "1.0.0";
options.Region = "eu1";
options.InstanceId = $"billing-{Environment.MachineName}";
options.ConfigFilePath = "microservice.yaml";
options.Routers = new[]
{
new RouterEndpointConfig
{
Host = "gateway.local",
Port = 5100,
TransportType = TransportType.Tcp
}
};
});
var host = builder.Build();
await host.RunAsync();
```
## Example Endpoints
### Typed Endpoint
```csharp
[StellaEndpoint("POST", "/invoices", DefaultTimeout = 30)]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
private readonly IInvoiceService _service;
public CreateInvoiceEndpoint(IInvoiceService service) => _service = service;
public async Task<CreateInvoiceResponse> HandleAsync(
CreateInvoiceRequest request,
CancellationToken ct)
{
var invoice = await _service.CreateAsync(request, ct);
return new CreateInvoiceResponse { InvoiceId = invoice.Id };
}
}
```
### Streaming Endpoint
```csharp
[StellaEndpoint("POST", "/invoices/{id}/attachments", SupportsStreaming = true)]
public sealed class UploadAttachmentEndpoint : IRawStellaEndpoint
{
    private readonly IStorageService _storage;

    public UploadAttachmentEndpoint(IStorageService storage) => _storage = storage;

    public async Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken ct)
    {
        var invoiceId = context.PathParameters["id"];

        // Stream body directly to storage
        var path = await _storage.StoreAsync(invoiceId, context.Body, ct);
        return RawResponse.Ok(JsonSerializer.Serialize(new { path }));
    }
}
```
## docker-compose.yaml
```yaml
version: '3.8'
services:
  gateway:
    build: ./src/Examples.Gateway
    ports:
      - "8080:8080"  # HTTP ingress
      - "5100:5100"  # TCP transport
      - "5101:5101"  # TLS transport
    environment:
      - GatewayNode__Region=eu1
      - GatewayNode__NodeId=gw-01
  billing:
    build: ./src/Examples.Billing.Microservice
    environment:
      - Stella__Routers__0__Host=gateway
      - Stella__Routers__0__Port=5100
    depends_on:
      - gateway
  inventory:
    build: ./src/Examples.Inventory.Microservice
    environment:
      - Stella__Routers__0__Host=gateway
      - Stella__Routers__0__Port=5100
    depends_on:
      - gateway
  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "5672:5672"
      - "15672:15672"
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All example projects build
2. [ ] docker-compose starts full environment
3. [ ] HTTP requests route through gateway to microservices
4. [ ] Streaming upload works
5. [ ] Multiple microservices register correctly
6. [ ] README documents all usage patterns
7. [ ] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Examples are separate solution from main StellaOps
- Uses Docker for easy local dev
- Includes both TCP and TLS examples
- RabbitMQ included for transport option demo


@@ -0,0 +1,267 @@
# Sprint 7000-0010-0001 · Migration · WebService to Microservice
## Topic & Scope
Define and document the migration path from existing `StellaOps.*.WebService` projects to the new microservice pattern with router. This is the final sprint that connects the router infrastructure to the rest of StellaOps.
**Goal:** Clear migration guide and tooling for converting WebServices to Microservices.
**Working directories:**
- `docs/router/` (migration documentation)
- Potentially existing WebService projects (for pilot migration)
## Dependencies & Concurrency
- **Upstream:** All router sprints complete (7000-0001 through 7000-0009)
- **Downstream:** None. Final sprint.
- **Parallel work:** None.
- **Cross-module impact:** YES - This sprint affects existing StellaOps modules.
## Documentation Prerequisites
- `docs/router/specs.md` (section 14 - Migration requirements)
- `docs/router/implplan.md` (phase 11-12 guidance)
- Existing WebService project structures
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MIG-001 | TODO | Inventory all existing WebService projects | List all services |
| 2 | MIG-002 | TODO | Document HTTP routes per service | Method + Path |
| 3 | MIG-010 | TODO | Document Strategy A: In-place adaptation | |
| 4 | MIG-011 | TODO | Add SDK to existing WebService | |
| 5 | MIG-012 | TODO | Wrap controllers in [StellaEndpoint] handlers | |
| 6 | MIG-013 | TODO | Register with router alongside HTTP | |
| 7 | MIG-014 | TODO | Gradual traffic shift from HTTP to router | |
| 8 | MIG-020 | TODO | Document Strategy B: Clean split | |
| 9 | MIG-021 | TODO | Extract domain logic to shared library | |
| 10 | MIG-022 | TODO | Create new Microservice project | |
| 11 | MIG-023 | TODO | Map routes to handlers | |
| 12 | MIG-024 | TODO | Phase out original WebService | |
| 13 | MIG-030 | TODO | Document CancellationToken wiring | |
| 14 | MIG-031 | TODO | Identify async operations needing token | |
| 15 | MIG-032 | TODO | Update DB calls, HTTP calls, etc. | |
| 16 | MIG-040 | TODO | Document streaming migration | |
| 17 | MIG-041 | TODO | Convert file upload controllers | |
| 18 | MIG-042 | TODO | Convert file download controllers | |
| 19 | MIG-050 | TODO | Create migration checklist template | |
| 20 | MIG-051 | TODO | Create automated route inventory tool | Optional |
| 21 | MIG-060 | TODO | Pilot migration: choose one WebService | |
| 22 | MIG-061 | TODO | Execute pilot migration | |
| 23 | MIG-062 | TODO | Document lessons learned | |
| 24 | MIG-070 | TODO | Merge Router.sln into StellaOps.sln | |
| 25 | MIG-071 | TODO | Update CI/CD for router components | |
## Migration Strategies
### Strategy A: In-Place Adaptation
Best for: Services that need to maintain HTTP compatibility during transition.
```
┌─────────────────────────────────────┐
│ StellaOps.Billing.WebService │
│ ┌─────────────────────────────┐ │
│ │ Existing HTTP Controllers │◄───┼──── HTTP clients (legacy)
│ └─────────────────────────────┘ │
│ ┌─────────────────────────────┐ │
│ │ [StellaEndpoint] Handlers │◄───┼──── Router (new)
│ └─────────────────────────────┘ │
│ ┌─────────────────────────────┐ │
│ │ Shared Domain Logic │ │
│ └─────────────────────────────┘ │
└─────────────────────────────────────┘
```
Steps:
1. Add `StellaOps.Microservice` package reference
2. Create handler classes for each route
3. Handlers call existing service layer
4. Register with router pool
5. Test via router
6. Shift traffic gradually
7. Remove HTTP controllers when ready
### Strategy B: Clean Split
Best for: Major refactoring, or cases where HTTP compatibility is not needed.
```
┌─────────────────────────────────────┐
│ StellaOps.Billing.Domain            │ ◄── Shared library
│ (extracted business logic)          │
└─────────────────────────────────────┘
          ▲                 ▲
          │                 │
┌─────────┴───────┐ ┌───────┴─────────┐
│ (Legacy)        │ │ (New)           │
│ Billing.Web     │ │ Billing.Micro   │
│ Service         │ │ service         │
│ HTTP only       │ │ Router only     │
└─────────────────┘ └─────────────────┘
```
Steps:
1. Extract domain logic to `.Domain` library
2. Create new `.Microservice` project
3. Implement handlers using domain library
4. Deploy alongside WebService
5. Shift traffic to router
6. Deprecate WebService
## Controller to Handler Mapping
### Before (ASP.NET Controller)
```csharp
[ApiController]
[Route("api/invoices")]
public class InvoicesController : ControllerBase
{
    private readonly IInvoiceService _service;

    [HttpPost]
    [Authorize(Roles = "billing-admin")]
    public async Task<IActionResult> Create(
        [FromBody] CreateInvoiceRequest request,
        CancellationToken ct) // <-- Often missing, or accepted but never used
    {
        var invoice = await _service.CreateAsync(request); // Token not forwarded
        return Ok(new { invoice.Id });
    }
}
```
### After (Microservice Handler)
```csharp
// Note: [Authorize(Roles = "billing-admin")] does not carry over automatically;
// declare the equivalent RequiringClaims (in code, microservice.yaml, or Authority).
[StellaEndpoint("POST", "/api/invoices")]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
    private readonly IInvoiceService _service;

    public CreateInvoiceEndpoint(IInvoiceService service) => _service = service;

    public async Task<CreateInvoiceResponse> HandleAsync(
        CreateInvoiceRequest request,
        CancellationToken ct) // <-- Required, propagated
    {
        var invoice = await _service.CreateAsync(request, ct); // Pass token!
        return new CreateInvoiceResponse { InvoiceId = invoice.Id };
    }
}
```
## CancellationToken Checklist
For each migrated handler, verify:
- [ ] Handler accepts CancellationToken parameter
- [ ] Token passed to all database calls
- [ ] Token passed to all HTTP client calls
- [ ] Token passed to all file I/O operations
- [ ] Long-running loops check `ct.IsCancellationRequested`
- [ ] Token passed to Task.Delay, WaitAsync, etc.
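The checklist can be illustrated with a small, self-contained sketch. `ProcessBatchAsync` and its item list are hypothetical names rather than part of the StellaOps SDK, and `Task.Delay` stands in for a database or HTTP call; only the token handling is the point.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical long-running operation (not an SDK API).
async Task<int> ProcessBatchAsync(IReadOnlyList<string> items, CancellationToken ct)
{
    var processed = 0;
    foreach (var item in items)
    {
        // Long-running loops observe the token on every iteration.
        ct.ThrowIfCancellationRequested();

        // Stand-in for a database or HTTP call: the token is passed through.
        await Task.Delay(TimeSpan.FromMilliseconds(10), ct);
        processed++;
    }
    return processed;
}

// Normal run: no cancellation requested.
var completed = await ProcessBatchAsync(new[] { "a", "b", "c" }, CancellationToken.None);
Console.WriteLine($"Processed {completed} items");

// Cancelled run: the token is already cancelled, so the loop stops immediately.
using var cts = new CancellationTokenSource();
cts.Cancel();
var wasCancelled = false;
try
{
    await ProcessBatchAsync(new[] { "a", "b", "c" }, cts.Token);
}
catch (OperationCanceledException)
{
    wasCancelled = true;
}
Console.WriteLine($"Cancelled: {wasCancelled}");
```

`ThrowIfCancellationRequested` is used here instead of polling `IsCancellationRequested` so that cancellation surfaces as an `OperationCanceledException`; either form satisfies the checklist item.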
## Streaming Migration
### File Upload (Before)
```csharp
[HttpPost("upload")]
public async Task<IActionResult> Upload(IFormFile file)
{
    using var stream = file.OpenReadStream();
    await _storage.SaveAsync(stream);
    return Ok();
}
```
### File Upload (After)
```csharp
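[StellaEndpoint("POST", "/upload", SupportsStreaming = true)]
public sealed class UploadEndpoint : IRawStellaEndpoint
{
    // IFileStorage is a placeholder name for the service's existing storage abstraction.
    private readonly IFileStorage _storage;

    public UploadEndpoint(IFileStorage storage) => _storage = storage;

    public async Task<RawResponse> HandleAsync(RawRequestContext ctx, CancellationToken ct)
    {
        await _storage.SaveAsync(ctx.Body, ct); // Body is already a stream
        return RawResponse.Ok();
    }
}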
```
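On the storage side, a save routine that accepts a stream and a token might look like the sketch below. It is an assumption about what sits behind `_storage.SaveAsync`, not the actual implementation: the local function `SaveAsync`, the buffer size, and the in-memory streams are all illustrative.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical storage routine: copies the body in chunks without buffering it whole,
// and passes the token to every read and write.
async Task<long> SaveAsync(Stream body, Stream destination, CancellationToken ct)
{
    var buffer = new byte[81920];
    long total = 0;
    int read;
    while ((read = await body.ReadAsync(buffer.AsMemory(), ct)) > 0)
    {
        await destination.WriteAsync(buffer.AsMemory(0, read), ct);
        total += read;
    }
    await destination.FlushAsync(ct);
    return total;
}

var payload = Encoding.UTF8.GetBytes("example upload body");
using var requestBody = new MemoryStream(payload); // stands in for ctx.Body
using var target = new MemoryStream();             // stands in for a file or blob stream

var bytesWritten = await SaveAsync(requestBody, target, CancellationToken.None);
Console.WriteLine($"Stored {bytesWritten} bytes");
```

Counting bytes in the loop avoids relying on `Stream.Length`, which non-seekable streams do not support.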
## Migration Checklist Template
```markdown
# Migration Checklist: [ServiceName]
## Inventory
- [ ] List all HTTP routes (Method + Path)
- [ ] Identify streaming endpoints
- [ ] Identify authorization requirements
- [ ] Document external dependencies
## Preparation
- [ ] Add StellaOps.Microservice package
- [ ] Configure router connection
- [ ] Set up local gateway for testing
## Per-Route Migration
For each route:
- [ ] Create [StellaEndpoint] handler class
- [ ] Map request/response types
- [ ] Wire CancellationToken throughout
- [ ] Convert to IRawStellaEndpoint if streaming
- [ ] Write unit tests
- [ ] Write integration tests
## Cutover
- [ ] Deploy alongside existing WebService
- [ ] Verify routing through the router
- [ ] Shift percentage of traffic
- [ ] Monitor for errors
- [ ] Full cutover
- [ ] Remove WebService HTTP listeners
## Cleanup
- [ ] Remove unused controller code
- [ ] Remove HTTP pipeline configuration
- [ ] Update documentation
```
## StellaOps Modules to Migrate
| Module | WebService | Priority | Complexity |
|--------|------------|----------|------------|
| Concelier | StellaOps.Concelier.WebService | High | Medium |
| Scanner | StellaOps.Scanner.WebService | High | High (streaming) |
| Authority | StellaOps.Authority.WebService | Medium | Low |
| Orchestrator | StellaOps.Orchestrator.WebService | Medium | Medium |
| Scheduler | StellaOps.Scheduler.WebService | Low | Low |
| Notify | StellaOps.Notify.WebService | Low | Low |
## Exit Criteria
Before marking this sprint DONE:
1. [ ] Migration strategies documented
2. [ ] Controller-to-handler mapping guide complete
3. [ ] CancellationToken checklist complete
4. [ ] Streaming migration guide complete
5. [ ] Migration checklist template created
6. [ ] Pilot migration executed successfully
7. [ ] Router.sln merged into StellaOps.sln
8. [ ] CI/CD updated
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- The pilot migration should target a low-risk service first
- Strategy A preferred for gradual transition
- Strategy B preferred for greenfield-like rewrites
- CancellationToken wiring is the #1 source of migration bugs
- Streaming endpoints require IRawStellaEndpoint, not typed handlers
- Authorization migrates from [Authorize(Roles)] to RequiringClaims
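
The move from role checks to claim requirements can be sketched as follows. The shape of `RequiringClaims` is not shown in this document, so the tuple of claim type plus optional value, the `SatisfiesAll` helper, and the claim names are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

// Hypothetical requirement shape: a claim type plus an optional required value.
// A null value means "any value of this claim type is acceptable".
bool SatisfiesAll(ClaimsPrincipal user, IEnumerable<(string Type, string? Value)> required) =>
    required.All(r => r.Value is null
        ? user.HasClaim(c => c.Type == r.Type)
        : user.HasClaim(r.Type, r.Value));

var user = new ClaimsPrincipal(new ClaimsIdentity(new[]
{
    new Claim("scope", "billing:write"),
    new Claim("tenant", "acme"),
}, authenticationType: "example"));

var requirements = new (string Type, string? Value)[]
{
    ("scope", "billing:write"),
    ("tenant", null),
};

var allowed = SatisfiesAll(user, requirements);
var denied = SatisfiesAll(user, new (string Type, string? Value)[] { ("scope", "billing:admin") });
Console.WriteLine($"allowed={allowed}, denied={denied}");
```

Unlike `[Authorize(Roles = "billing-admin")]`, a check of this kind does not depend on a role claim being present; each endpoint states exactly which claims it needs.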

200
docs/router/SPRINT_INDEX.md Normal file

@@ -0,0 +1,200 @@
# Stella Ops Router - Sprint Index
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
This document provides an overview of all sprints for implementing the StellaOps Router infrastructure. Sprints are organized for maximum agent independence while respecting dependencies.
## Key Documents
| Document | Purpose |
|----------|---------|
| [specs.md](./specs.md) | **Canonical specification** - READ FIRST |
| [implplan.md](./implplan.md) | High-level implementation plan |
| Step files (01-29) | Detailed task breakdowns per phase |
## Sprint Epochs
All router sprints use **Epoch 7000** to maintain isolation from existing StellaOps work.
| Batch | Focus Area | Sprints |
|-------|------------|---------|
| 0001 | Foundation | Skeleton, Common library |
| 0002 | InMemory Transport | Prove the design before real transports |
| 0003 | Microservice SDK | Core infrastructure, request handling |
| 0004 | Gateway | Core, middleware, connection handling |
| 0005 | Protocol Features | Heartbeat, routing, cancellation, streaming, limits |
| 0006 | Real Transports | TCP, TLS, UDP, RabbitMQ |
| 0007 | Configuration | Router config, microservice YAML |
| 0008 | Integration | Authority, source generator |
| 0009 | Examples | Reference implementation |
| 0010 | Migration | WebService → Microservice |
## Sprint Dependency Graph
```
                   ┌─────────────────────────────────────┐
                   │  SPRINT_7000_0001_0001              │
                   │  Router Skeleton                    │
                   └──────────────────┬──────────────────┘
                   ┌──────────────────▼──────────────────┐
                   │  SPRINT_7000_0001_0002              │
                   │  Common Library Models              │
                   └──────────────────┬──────────────────┘
                   ┌──────────────────▼──────────────────┐
                   │  SPRINT_7000_0002_0001              │
                   │  InMemory Transport                 │
                   └──────────────────┬──────────────────┘
           ┌──────────────────────────┼──────────────────────────┐
           │                          │                          │
           ▼                          │                          ▼
┌─────────────────────┐               │               ┌─────────────────────┐
│  SPRINT_7000_0003_* │               │               │  SPRINT_7000_0004_* │
│  Microservice SDK   │               │               │  Gateway            │
│  (2 sprints)        │◄──────────────┼──────────────►│  (3 sprints)        │
└──────────┬──────────┘               │               └──────────┬──────────┘
           │                          │                          │
           └──────────────────────────┼──────────────────────────┘
                   ┌──────────────────▼──────────────────┐
                   │  SPRINT_7000_0005_0001-0005         │
                   │  Protocol Features (sequential)     │
                   │  Heartbeat → Routing → Cancel       │
                   │  → Streaming → Payload Limits       │
                   └──────────────────┬──────────────────┘
           ┌──────────────────────────┼──────────────────────────┐
           │                          │                          │
           ▼                          ▼                          ▼
  ┌─────────────────┐        ┌─────────────────┐        ┌─────────────────┐
  │  TCP Transport  │        │  UDP Transport  │        │  RabbitMQ       │
  │  7000_0006_0001 │        │  7000_0006_0003 │        │  7000_0006_0004 │
  └────────┬────────┘        └─────────────────┘        └─────────────────┘
  ┌─────────────────┐
  │  TLS Transport  │
  │  7000_0006_0002 │
  └────────┬────────┘
           └──────────────────────────┬──────────────────────────┘
                   ┌──────────────────▼──────────────────┐
                   │  SPRINT_7000_0007_0001-0002         │
                   │  Configuration (sequential)         │
                   └──────────────────┬──────────────────┘
           ┌──────────────────────────┼──────────────────────────┐
           │                          │                          │
           ▼                          │                          ▼
┌─────────────────────┐               │               ┌─────────────────────┐
│Authority Integration│               │               │  Source Generator   │
│  7000_0008_0001     │◄──────────────┼──────────────►│  7000_0008_0002     │
└─────────────────────┘               │               └─────────────────────┘
                   ┌──────────────────▼──────────────────┐
                   │  SPRINT_7000_0009_0001              │
                   │  Reference Example                  │
                   └──────────────────┬──────────────────┘
                   ┌──────────────────▼──────────────────┐
                   │  SPRINT_7000_0010_0001              │
                   │  Migration                          │
                   │  (Connects to rest of StellaOps)    │
                   └─────────────────────────────────────┘
```
## Parallel Execution Opportunities
These sprints can run in parallel:
| Phase | Parallel Track A | Parallel Track B | Parallel Track C |
|-------|------------------|------------------|------------------|
| After InMemory | SDK Core (0003_0001) | Gateway Core (0004_0001) | - |
| After Protocol | TCP (0006_0001) | UDP (0006_0003) | RabbitMQ (0006_0004) |
| After TCP | TLS (0006_0002) | (continues above) | (continues above) |
| After Config | Authority (0008_0001) | Source Gen (0008_0002) | - |
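The parallel tracks can also be derived mechanically from the dependency graph. The sketch below groups sprints into waves whose members have no unmet dependencies. The dependency map is one reading of the graph (SDK, Gateway, Protocol, and Config are collapsed to batch level, and the Reference Example is assumed to wait for both integration sprints); the `Waves` helper is illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Dependencies transcribed from the sprint dependency graph (an interpretation,
// collapsed to batch level where the graph shows a single box).
var dependsOn = new Dictionary<string, string[]>
{
    ["0001_0001 Skeleton"] = Array.Empty<string>(),
    ["0001_0002 Common"] = new[] { "0001_0001 Skeleton" },
    ["0002_0001 InMemory"] = new[] { "0001_0002 Common" },
    ["0003 SDK"] = new[] { "0002_0001 InMemory" },
    ["0004 Gateway"] = new[] { "0002_0001 InMemory" },
    ["0005 Protocol"] = new[] { "0003 SDK", "0004 Gateway" },
    ["0006_0001 TCP"] = new[] { "0005 Protocol" },
    ["0006_0002 TLS"] = new[] { "0006_0001 TCP" },
    ["0006_0003 UDP"] = new[] { "0005 Protocol" },
    ["0006_0004 RabbitMQ"] = new[] { "0005 Protocol" },
    ["0007 Config"] = new[] { "0006_0002 TLS", "0006_0003 UDP", "0006_0004 RabbitMQ" },
    ["0008_0001 Authority"] = new[] { "0007 Config" },
    ["0008_0002 SourceGen"] = new[] { "0007 Config" },
    ["0009_0001 Example"] = new[] { "0008_0001 Authority", "0008_0002 SourceGen" },
    ["0010_0001 Migration"] = new[] { "0009_0001 Example" },
};

// Repeatedly collect every sprint whose dependencies are all finished.
List<List<string>> Waves(Dictionary<string, string[]> deps)
{
    var done = new HashSet<string>();
    var waves = new List<List<string>>();
    while (done.Count < deps.Count)
    {
        var ready = deps.Keys
            .Where(k => !done.Contains(k) && deps[k].All(done.Contains))
            .OrderBy(k => k, StringComparer.Ordinal)
            .ToList();
        if (ready.Count == 0)
            throw new InvalidOperationException("Cycle in sprint dependencies");
        waves.Add(ready);
        done.UnionWith(ready);
    }
    return waves;
}

var waves = Waves(dependsOn);
for (var i = 0; i < waves.Count; i++)
    Console.WriteLine($"Wave {i + 1}: {string.Join(", ", waves[i])}");
```

Waves with more than one entry correspond to the rows of the table above. The wave model is stricter than the table in one respect: it schedules TLS in its own wave rather than overlapping it with the UDP and RabbitMQ work still in progress.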
## Sprint Status Overview
| Sprint | Name | Status | Working Directory |
|--------|------|--------|-------------------|
| 7000-0001-0001 | Router Skeleton | TODO | Multiple (see sprint) |
| 7000-0001-0002 | Common Library | TODO | `src/__Libraries/StellaOps.Router.Common/` |
| 7000-0002-0001 | InMemory Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.InMemory/` |
| 7000-0003-0001 | SDK Core | TODO | `src/__Libraries/StellaOps.Microservice/` |
| 7000-0003-0002 | SDK Handlers | TODO | `src/__Libraries/StellaOps.Microservice/` |
| 7000-0004-0001 | Gateway Core | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0004-0002 | Gateway Middleware | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0004-0003 | Gateway Connections | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0005-0001 | Heartbeat & Health | TODO | SDK + Gateway |
| 7000-0005-0002 | Routing Algorithm | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0005-0003 | Cancellation | TODO | SDK + Gateway |
| 7000-0005-0004 | Streaming | TODO | SDK + Gateway + InMemory |
| 7000-0005-0005 | Payload Limits | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0006-0001 | TCP Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.Tcp/` |
| 7000-0006-0002 | TLS Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.Tls/` |
| 7000-0006-0003 | UDP Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.Udp/` |
| 7000-0006-0004 | RabbitMQ Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.RabbitMq/` |
| 7000-0007-0001 | Router Config | TODO | `src/__Libraries/StellaOps.Router.Config/` |
| 7000-0007-0002 | Microservice YAML | TODO | `src/__Libraries/StellaOps.Microservice/` |
| 7000-0008-0001 | Authority Integration | TODO | Gateway + Authority |
| 7000-0008-0002 | Source Generator | TODO | `src/__Libraries/StellaOps.Microservice.SourceGen/` |
| 7000-0009-0001 | Reference Example | TODO | `examples/router/` |
| 7000-0010-0001 | Migration | TODO | Multiple (final integration) |
## Critical Path
The minimum path to a working router:
1. **7000-0001-0001** → Skeleton
2. **7000-0001-0002** → Common models
3. **7000-0002-0001** → InMemory transport
4. **7000-0003-0001** → SDK core
5. **7000-0003-0002** → SDK handlers
6. **7000-0004-0001** → Gateway core
7. **7000-0004-0002** → Gateway middleware
8. **7000-0004-0003** → Gateway connections
After these 8 sprints, you have a working router with InMemory transport for testing.
## Isolation Strategy
The router is developed in isolation using:
1. **Separate solution file:** `StellaOps.Router.sln`
2. **Dedicated directories:** All router code in new directories
3. **No changes to existing modules:** Until migration sprint
4. **InMemory transport first:** No network dependencies during core development
This ensures:
- Router development doesn't impact existing StellaOps builds
- Agents can work independently on router without merge conflicts
- Full testing possible without real infrastructure
- Migration is a conscious, controlled step
## Agent Assignment Guidance
For maximum parallelization:
- **Foundation Agent:** Sprints 7000-0001-0001, 7000-0001-0002
- **SDK Agent:** Sprints 7000-0003-0001, 7000-0003-0002
- **Gateway Agent:** Sprints 7000-0004-0001, 7000-0004-0002, 7000-0004-0003
- **Transport Agent:** Sprints 7000-0002-0001, 7000-0006-*
- **Protocol Agent:** Sprints 7000-0005-*
- **Config Agent:** Sprints 7000-0007-*
- **Integration Agent:** Sprints 7000-0008-*, 7000-0010-0001
- **Documentation Agent:** Sprint 7000-0009-0001
## Invariants (Never Violate)
From `specs.md`, these are non-negotiable:
- **Method + Path** is the endpoint identity
- **Strict semver** for version matching
- **Region from GatewayNodeConfig.Region** (never from headers/host)
- **No HTTP transport** between gateway and microservices
- **RequiringClaims** (not AllowedRoles) for authorization
- **Opaque body handling** (router doesn't interpret payloads)
Any change to these invariants requires updating `specs.md` first.
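
As a small illustration of the first invariant, the sketch below keys a registry on the Method + Path pair. It is not the router's actual registry: the `Register` helper, the upper-casing of the method, the exact-match path comparison, and the handler names other than `CreateInvoiceEndpoint` are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Endpoint identity is the (Method, Path) pair; the value is just a handler name here.
var endpoints = new Dictionary<(string Method, string Path), string>();

// Assumption: methods are normalized to upper case and paths are compared exactly.
// Returns false when an endpoint with the same identity is already registered.
bool Register(string method, string path, string handlerName) =>
    endpoints.TryAdd((method.ToUpperInvariant(), path), handlerName);

var first = Register("POST", "/api/invoices", "CreateInvoiceEndpoint");
var sameIdentity = Register("post", "/api/invoices", "AnotherCreateEndpoint");
var differentMethod = Register("GET", "/api/invoices", "ListInvoicesEndpoint");

Console.WriteLine($"first={first}, sameIdentity={sameIdentity}, differentMethod={differentMethod}");
```

Two handlers that share a path but differ in method are distinct endpoints, while a second registration of the same pair is rejected.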