Add unit tests for RabbitMq and Udp transport servers and clients

- Implemented comprehensive unit tests for RabbitMqTransportServer, covering constructor, disposal, connection management, event handlers, and exception handling.
- Added configuration tests for RabbitMqTransportServer to validate SSL, durable queues, auto-recovery, and custom virtual host options.
- Created unit tests for UdpFrameProtocol, including frame parsing and serialization, header size validation, and round-trip data preservation.
- Developed tests for UdpTransportClient, focusing on connection handling, event subscriptions, and exception scenarios.
- Established tests for UdpTransportServer, ensuring proper start/stop behavior, connection state management, and event handling.
- Included tests for UdpTransportOptions to verify default values and modification capabilities.
- Enhanced service registration tests for Udp transport services in the dependency injection container.
This commit is contained in:
master
2025-12-05 19:01:12 +02:00
parent 53508ceccb
commit cc69d332e3
245 changed files with 22440 additions and 27719 deletions


@@ -1,422 +0,0 @@
Goal for this phase: get a clean, compiling skeleton in place that matches the spec and folder conventions, with zero real logic and minimal dependencies. After this, all future work plugs into this structure.
I'll break it into concrete tasks you can assign to agents.
---
## 1. Define the repository layout
**Owner: “Skeleton” / infra agent**
Target layout (no code yet, just dirs):
```text
/ (repo root)
  StellaOps.Router.sln
  /src
    /StellaOps.Gateway.WebService
    /__Libraries
      /StellaOps.Router.Common
      /StellaOps.Router.Config
      /StellaOps.Microservice
      /StellaOps.Microservice.SourceGen   (empty stub for now)
  /tests
    /StellaOps.Router.Common.Tests
    /StellaOps.Gateway.WebService.Tests
    /StellaOps.Microservice.Tests
  /docs
    /router
      specs.md    (already exists)
      README.md   (placeholder, 2-3 lines)
```
Tasks:
1. Create `src`, `src/__Libraries`, `tests`, `docs/router` directories if missing.
2. Move/confirm `docs/router/specs.md` is the canonical spec.
3. Add `docs/router/README.md` with a pointer: “Start with specs.md; this folder will host router-related docs.”
---
## 2. Create the solution and projects
**Owner: skeleton agent**
### 2.1 Create solution
* At repo root:
```bash
dotnet new sln -n StellaOps.Router
```
* Add projects as they are created in the next step.
### 2.2 Create projects
For each project below:
* `dotnet new` with appropriate template.
* Set `RootNamespace` / `AssemblyName` to match folder & spec.
Projects:
1. **Gateway webservice**
```bash
cd src
dotnet new webapi -n StellaOps.Gateway.WebService
```
* This will create an ASP.NET Core Web API project under `src/StellaOps.Gateway.WebService`; we'll trim it later.
2. **Common library**
```bash
cd src/__Libraries
dotnet new classlib -n StellaOps.Router.Common
```
3. **Config library**
```bash
dotnet new classlib -n StellaOps.Router.Config
```
4. **Microservice SDK**
```bash
dotnet new classlib -n StellaOps.Microservice
```
5. **Microservice Source Generator (stub)**
```bash
dotnet new classlib -n StellaOps.Microservice.SourceGen
```
* This will be converted to an Analyzer/SourceGen project later; for now it can compile as a plain library.
6. **Test projects**
Under `tests`:
```bash
cd tests
dotnet new xunit -n StellaOps.Router.Common.Tests
dotnet new xunit -n StellaOps.Gateway.WebService.Tests
dotnet new xunit -n StellaOps.Microservice.Tests
```
### 2.3 Add projects to solution
At repo root:
```bash
dotnet sln StellaOps.Router.sln add \
src/StellaOps.Gateway.WebService/StellaOps.Gateway.WebService.csproj \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj \
src/__Libraries/StellaOps.Router.Config/StellaOps.Router.Config.csproj \
src/__Libraries/StellaOps.Microservice/StellaOps.Microservice.csproj \
src/__Libraries/StellaOps.Microservice.SourceGen/StellaOps.Microservice.SourceGen.csproj \
tests/StellaOps.Router.Common.Tests/StellaOps.Router.Common.Tests.csproj \
tests/StellaOps.Gateway.WebService.Tests/StellaOps.Gateway.WebService.Tests.csproj \
tests/StellaOps.Microservice.Tests/StellaOps.Microservice.Tests.csproj
```
---
## 3. Wire basic project references
**Owner: skeleton agent**
The reference graph should be:
* `StellaOps.Gateway.WebService`
  * references `StellaOps.Router.Common`
  * references `StellaOps.Router.Config`
* `StellaOps.Microservice`
  * references `StellaOps.Router.Common`
  * (later) references `StellaOps.Microservice.SourceGen` as analyzer; for now no reference.
* `StellaOps.Router.Config`
  * references `StellaOps.Router.Common` (for `EndpointDescriptor`, `InstanceDescriptor`, etc.)
Test projects:
* `StellaOps.Router.Common.Tests` → `StellaOps.Router.Common`
* `StellaOps.Gateway.WebService.Tests` → `StellaOps.Gateway.WebService`
* `StellaOps.Microservice.Tests` → `StellaOps.Microservice`
Use `dotnet add reference`:
```bash
dotnet add src/StellaOps.Gateway.WebService/StellaOps.Gateway.WebService.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj \
src/__Libraries/StellaOps.Router.Config/StellaOps.Router.Config.csproj
dotnet add src/__Libraries/StellaOps.Microservice/StellaOps.Microservice.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj
dotnet add src/__Libraries/StellaOps.Router.Config/StellaOps.Router.Config.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj
dotnet add tests/StellaOps.Router.Common.Tests/StellaOps.Router.Common.Tests.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj
dotnet add tests/StellaOps.Gateway.WebService.Tests/StellaOps.Gateway.WebService.Tests.csproj reference \
src/StellaOps.Gateway.WebService/StellaOps.Gateway.WebService.csproj
dotnet add tests/StellaOps.Microservice.Tests/StellaOps.Microservice.Tests.csproj reference \
src/__Libraries/StellaOps.Microservice/StellaOps.Microservice.csproj
```
---
## 4. Set common build settings
**Owner: infra agent**
Add a `Directory.Build.props` at repo root to centralize:
* Target framework (e.g. `net8.0`).
* Nullable context.
* LangVersion.
Example (minimal):
```xml
<Project>
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <Nullable>enable</Nullable>
    <LangVersion>preview</LangVersion> <!-- if needed for newer features -->
    <ImplicitUsings>enable</ImplicitUsings>
  </PropertyGroup>
</Project>
```
Then, strip redundant `<TargetFramework>` from individual `.csproj` files if desired.
---
## 5. Stub namespaces and “empty” entry points
**Owner: each project's agent**
### 5.1 Common library
Create empty placeholder types that match the spec names (no logic, just shells) so everything compiles and IntelliSense knows the shapes.
Example files:
* `TransportType.cs`
* `FrameType.cs`
* `InstanceHealthStatus.cs`
* `ClaimRequirement.cs`
* `EndpointDescriptor.cs`
* `InstanceDescriptor.cs`
* `ConnectionState.cs`
* `RoutingContext.cs`
* `RoutingDecision.cs`
* `PayloadLimits.cs`
* Interfaces: `IGlobalRoutingState`, `IRoutingPlugin`, `ITransportServer`, `ITransportClient`.
Each type can be an auto-property-only record/class/enum; no methods yet.
Example:
```csharp
namespace StellaOps.Router.Common;

public enum TransportType
{
    Udp,
    Tcp,
    Certificate,
    RabbitMq
}
```
and so on.
### 5.2 Config library
Add a minimal `RouterConfig` and `PayloadLimits` class aligned with the spec; again, just properties.
```csharp
namespace StellaOps.Router.Config;

public sealed class RouterConfig
{
    public IList<ServiceConfig> Services { get; init; } = new List<ServiceConfig>();
    public PayloadLimits PayloadLimits { get; init; } = new();
}

public sealed class ServiceConfig
{
    public string Name { get; init; } = string.Empty;
    public string DefaultVersion { get; init; } = "1.0.0";
}
```
No YAML binding, no logic yet.
### 5.3 Microservice library
Create:
* `StellaMicroserviceOptions` with required properties.
* `RouterEndpointConfig` (host/port/transport).
* Extension method `AddStellaMicroservice(...)` with a near-empty body that only registers options (placeholder services follow in later phases).
```csharp
using StellaOps.Router.Common; // for TransportType

namespace StellaOps.Microservice;

public sealed class StellaMicroserviceOptions
{
    public string ServiceName { get; set; } = string.Empty;
    public string Version { get; set; } = string.Empty;
    public string Region { get; set; } = string.Empty;
    public string InstanceId { get; set; } = string.Empty;
    public IList<RouterEndpointConfig> Routers { get; set; } = new List<RouterEndpointConfig>();
    public string? ConfigFilePath { get; set; }
}

public sealed class RouterEndpointConfig
{
    public string Host { get; set; } = string.Empty;
    public int Port { get; set; }
    public TransportType TransportType { get; set; }
}
```
`AddStellaMicroservice`:
```csharp
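using Microsoft.Extensions.DependencyInjection;

namespace StellaOps.Microservice;

public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddStellaMicroservice(
        this IServiceCollection services,
        Action<StellaMicroserviceOptions> configure)
    {
        // Configure(Action<T>) comes from the Microsoft.Extensions.Options package.
        services.Configure(configure);
        // TODO: register internal SDK services in later phases
        return services;
    }
}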
```
### 5.4 Microservice.SourceGen
For now:
* Leave this as an empty classlib with a one-line `README.md` stating:
  * “This project will host Roslyn source generators for endpoint discovery. No implementation yet.”
Don't hook it as an analyzer until there is content.
### 5.5 Gateway webservice
Simplify the scaffolded Web API to minimal:
* In `Program.cs`:
* Build a barebones `WebApplication` that:
* Binds `GatewayNodeConfig` from config.
* Adds controllers or minimal endpoints.
* Runs; no router logic yet.
Example:
```csharp
var builder = WebApplication.CreateBuilder(args);
builder.Services.Configure<GatewayNodeConfig>(
builder.Configuration.GetSection("GatewayNode"));
builder.Services.AddControllers();
var app = builder.Build();
app.MapControllers(); // may be empty for now
app.Run();
```
* Add `GatewayNodeConfig` class in `StellaOps.Gateway.WebService` project.
---
## 6. Make tests compile (even if empty)
**Owner: test agent**
For each test project:
* Reference the appropriate main project (already done).
* Add a single dummy test class so CI passes:
```csharp
public class SmokeTests
{
    [Fact]
    public void SolutionCompiles()
    {
        Assert.True(true);
    }
}
```
This is just to ensure the pipeline runs; real tests come later.
---
## 7. Add initial CI/build pipeline
**Owner: infra agent**
Set up minimal CI (GitHub Actions, GitLab, Azure DevOps, whatever you use):
* Steps:
* `dotnet restore`
* `dotnet build StellaOps.Router.sln -c Release`
* `dotnet test StellaOps.Router.sln -c Release`
No packaging or deployment yet; just compile + tests.
---
## 8. Sanity check & readiness criteria
Before you let agents move to “Common model implementation” (next phase), confirm:
* The solution builds cleanly in a clean checkout (`dotnet restore`, `dotnet build`).
* All test projects run and pass (even with dummy tests).
* Namespaces and project names match the spec:
* `StellaOps.Gateway.WebService`
* `StellaOps.Router.Common`
* `StellaOps.Router.Config`
* `StellaOps.Microservice`
* There is no real business logic yet:
* No transport logic.
* No routing decisions.
* No reflection or YAML.
* `docs/router/specs.md` is referenced in `docs/router/README.md` as the spec.
At that point, the skeleton is in place and stable. Next phases can then focus on:
* Filling in `Common` contracts properly.
* Implementing the in-memory transport.
* Wiring minimal microservice/gateway flows.
If you want, I can outline the next phase (“implement core model + in-memory transport”) with a similar task breakdown next.


@@ -1,375 +0,0 @@
For this step, the goal is: make `StellaOps.Router.Common` the single, stable contract layer that everything else can depend on, with **no behavior** yet, just shapes. After this, gateway, microservice SDK, transports, and config can all compile against it.
Think of this as “lock down the domain vocabulary”.
---
## 0. Pre-work
**All devs touching Common:**
1. Read `docs/router/specs.md`, specifically:
* The sections describing:
* Enums (`TransportType`, `FrameType`, `InstanceHealthStatus`, etc.).
* Endpoint/instance/routing models.
* Frames and request/response correlation.
* Routing state and routing plugin.
2. Agree that no class/interface will be added to Common if it isn't in the spec (or discussed with you and then added to the spec).
---
## 1. Inventory and file layout
**Owner: “Common” lead**
1. From `specs.md`, extract a **type inventory** for `StellaOps.Router.Common`:
Enumerations:
* `TransportType`
* `FrameType`
* `InstanceHealthStatus`
Core value objects:
* `ClaimRequirement`
* `EndpointDescriptor`
* `InstanceDescriptor`
* `ConnectionState`
* `PayloadLimits` (if used from Common; otherwise keep in Config only)
* Any small value types you've defined (e.g. cancel payload, ping metrics, etc., if present in the specs).
Routing:
* `RoutingContext`
* `RoutingDecision`
Frames:
* `Frame` (type + correlation id + payload)
* Optional payload contracts for HELLO, HEARTBEAT, ENDPOINTS_UPDATE, etc., if you've specified them explicitly.
Abstractions/interfaces:
* `IGlobalRoutingState`
* `IRoutingPlugin`
* `ITransportServer`
* `ITransportClient`
* Optional: `IRegionProvider` if you kept it in the spec.
2. Propose a file layout inside `src/__Libraries/StellaOps.Router.Common`:
Example:
```text
/StellaOps.Router.Common
  /Enums
    TransportType.cs
    FrameType.cs
    InstanceHealthStatus.cs
  /Models
    ClaimRequirement.cs
    EndpointDescriptor.cs
    InstanceDescriptor.cs
    ConnectionState.cs
    RoutingContext.cs
    RoutingDecision.cs
    Frame.cs
  /Abstractions
    IGlobalRoutingState.cs
    IRoutingPlugin.cs
    ITransportClient.cs
    ITransportServer.cs
    IRegionProvider.cs (if used)
```
3. Get a quick 👍/👎 from you on the layout (no code yet, just file names and namespaces).
---
## 2. Implement enums and basic models
**Owner: Common dev**
Scope: simple, immutable models, no methods.
1. **Enums**
Implement:
* `TransportType` with `[Udp, Tcp, Certificate, RabbitMq]`.
* `FrameType` with:
* `Hello`, `Heartbeat`, `EndpointsUpdate`, `Request`, `RequestStreamData`, `Response`, `ResponseStreamData`, `Cancel` (and any others in specs).
* `InstanceHealthStatus` with:
* `Unknown`, `Healthy`, `Degraded`, `Draining`, `Unhealthy`.
All enums live under `namespace StellaOps.Router.Common;`.
2. **Value models**
Implement as plain classes/records with auto-properties:
* `ClaimRequirement`:
* `string Type` (required).
* `string? Value` (optional).
* `EndpointDescriptor`:
* `string ServiceName`
* `string Version`
* `string Method`
* `string Path`
* `TimeSpan DefaultTimeout`
* `bool SupportsStreaming`
* `IReadOnlyList<ClaimRequirement> RequiringClaims`
* `InstanceDescriptor`:
* `string InstanceId`
* `string ServiceName`
* `string Version`
* `string Region`
* `ConnectionState`:
* `string ConnectionId`
* `InstanceDescriptor Instance`
* `InstanceHealthStatus Status`
* `DateTime LastHeartbeatUtc`
* `double AveragePingMs`
* `TransportType TransportType`
* `IReadOnlyDictionary<(string Method, string Path), EndpointDescriptor> Endpoints`
Design choices:
* Make constructors minimal (empty constructors okay for now).
* Use `init` where reasonable to encourage immutability for descriptors; `ConnectionState` can have mutable health fields.
3. **PayloadLimits (if in Common)**
If the spec places `PayloadLimits` in Common (versus Config), implement:
```csharp
public sealed class PayloadLimits
{
    public long MaxRequestBytesPerCall { get; set; }
    public long MaxRequestBytesPerConnection { get; set; }
    public long MaxAggregateInflightBytes { get; set; }
}
```
If it's defined in Config only, leave it there and avoid duplication.
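The descriptor models from item 2 above could be sketched roughly as follows. The property names are taken from the lists in this step; the choice of `sealed class` over `record`, the `string.Empty` defaults, and the empty-list default for `RequiringClaims` are assumptions rather than something the spec is known to require.

```csharp
using System;
using System.Collections.Generic;

// Shapes only: auto-properties with init setters, no methods.
public sealed class ClaimRequirement
{
    public string Type { get; init; } = string.Empty; // required per the list above
    public string? Value { get; init; }               // optional
}

public sealed class EndpointDescriptor
{
    public string ServiceName { get; init; } = string.Empty;
    public string Version { get; init; } = string.Empty;
    public string Method { get; init; } = string.Empty;
    public string Path { get; init; } = string.Empty;
    public TimeSpan DefaultTimeout { get; init; }
    public bool SupportsStreaming { get; init; }

    // Assumption: default to an empty list so consumers never observe null.
    public IReadOnlyList<ClaimRequirement> RequiringClaims { get; init; } =
        Array.Empty<ClaimRequirement>();
}

public sealed class InstanceDescriptor
{
    public string InstanceId { get; init; } = string.Empty;
    public string ServiceName { get; init; } = string.Empty;
    public string Version { get; init; } = string.Empty;
    public string Region { get; init; } = string.Empty;
}
```

With `init` setters the descriptors are populated once through object initializers and cannot be reassigned afterwards, which matches the immutability goal stated in the design choices.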
---
## 3. Implement frame & correlation model
**Owner: Common dev**
1. Implement `Frame`:
```csharp
public sealed class Frame
{
    public FrameType Type { get; init; }
    public Guid CorrelationId { get; init; }
    public byte[] Payload { get; init; } = Array.Empty<byte>();
}
```
2. If `specs.md` defines specific payload DTOs (e.g. `HelloPayload`, `HeartbeatPayload`, `CancelPayload`), define them too:
* `HelloPayload`:
* `InstanceDescriptor` and list of `EndpointDescriptor`s, or the equivalent properties.
* `HeartbeatPayload`:
* `InstanceId`, `Status`, metrics.
* `CancelPayload`:
* `string Reason` or similar.
Keep them as simple DTOs with no logic.
3. Do **not** implement serialization yet (no JSON/MessagePack references here); Common should only define shapes.
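If the spec does define the payload DTOs mentioned in item 2, they might look something like the sketch below. The stand-in types at the top exist only so the snippet compiles on its own; `AveragePingMs` as the heartbeat metric is a hypothetical choice, since the text above only says "metrics".

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-ins; the real types live in StellaOps.Router.Common.
public enum InstanceHealthStatus { Unknown, Healthy, Degraded, Draining, Unhealthy }
public sealed class InstanceDescriptor { public string InstanceId { get; init; } = string.Empty; }
public sealed class EndpointDescriptor { public string Method { get; init; } = string.Empty; }

public sealed class HelloPayload
{
    public InstanceDescriptor Instance { get; init; } = new();
    public IReadOnlyList<EndpointDescriptor> Endpoints { get; init; } =
        Array.Empty<EndpointDescriptor>();
}

public sealed class HeartbeatPayload
{
    public string InstanceId { get; init; } = string.Empty;
    public InstanceHealthStatus Status { get; init; }

    // Hypothetical metric; the spec decides which metrics a heartbeat carries.
    public double AveragePingMs { get; init; }
}

public sealed class CancelPayload
{
    public string? Reason { get; init; }
}
```

None of these types reference a serializer, which keeps Common limited to shapes as required above.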
---
## 4. Routing abstractions
**Owner: Common dev**
Implement the routing interface + context & decision types.
1. `RoutingContext`:
* Match the spec. If your `specs.md` version includes `HttpContext`, follow it; if you intentionally kept Common free of ASP.NET types, use a neutral context (e.g. method/path/headers/principal).
* For now, if `HttpContext` is included in spec, define:
```csharp
public sealed class RoutingContext
{
    public object HttpContext { get; init; } = default!; // or Microsoft.AspNetCore.Http.HttpContext if allowed
    public EndpointDescriptor Endpoint { get; init; } = default!;
    public string GatewayRegion { get; init; } = string.Empty;
}
```
Then you can refine the type once you finalize whether Common can reference ASP.NET packages. If you want to avoid that now, define your own lightweight context model and let gateway adapt.
2. `RoutingDecision`:
* Must include:
* `EndpointDescriptor Endpoint`
* `ConnectionState Connection`
* `TransportType TransportType`
* `TimeSpan EffectiveTimeout`
3. `IGlobalRoutingState`:
Interface only, no implementation:
```csharp
public interface IGlobalRoutingState
{
    EndpointDescriptor? ResolveEndpoint(string method, string path);
    IReadOnlyList<ConnectionState> GetConnectionsFor(
        string serviceName,
        string version,
        string method,
        string path);
}
```
4. `IRoutingPlugin`:
* Single method:
```csharp
public interface IRoutingPlugin
{
    Task<RoutingDecision?> ChooseInstanceAsync(
        RoutingContext context,
        CancellationToken cancellationToken);
}
```
* No logic; just interface.
---
## 5. Transport abstractions
**Owner: Common dev**
Implement the shared transport contracts.
1. `ITransportServer`:
```csharp
public interface ITransportServer
{
    Task StartAsync(CancellationToken cancellationToken);
    Task StopAsync(CancellationToken cancellationToken);
}
```
2. `ITransportClient`:
Per spec, you need:
* A buffered call (request → response).
* A streaming call.
* A cancel call.
Interfaces only; content roughly:
```csharp
public interface ITransportClient
{
    Task<Frame> SendRequestAsync(
        ConnectionState connection,
        Frame requestFrame,
        TimeSpan timeout,
        CancellationToken cancellationToken);

    Task SendCancelAsync(
        ConnectionState connection,
        Guid correlationId,
        string? reason = null);

    Task SendStreamingAsync(
        ConnectionState connection,
        Frame requestHeader,
        Stream requestBody,
        Func<Stream, Task> readResponseBody,
        PayloadLimits limits,
        CancellationToken cancellationToken);
}
```
No implementation or transport-specific logic here. No network types beyond `Stream` and `Task`.
3. `IRegionProvider` (if you decided to keep it):
```csharp
public interface IRegionProvider
{
    string Region { get; }
}
```
---
## 6. Wire Common into tests (sanity checks only)
**Owner: Common tests dev**
Create a few very simple unit tests in `StellaOps.Router.Common.Tests`:
1. **Shape tests** (these are mostly compile-time):
* That `EndpointDescriptor` has the expected properties and default values can be set.
* That `ConnectionState` can be constructed and that its `Endpoints` dictionary handles `(Method, Path)` keys.
2. **Enum completeness tests**:
* Assert that `Enum.GetValues(typeof(FrameType))` contains all expected values. This catches accidental changes.
3. **No behavior yet**:
* No routing algorithms or transport behavior tests here; just that model contracts behave like dumb DTOs (e.g. property assignment, default value semantics).
This is mostly to lock in the shape and catch accidental refactors later.
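The enum completeness check from item 2 could be sketched as below. It is written as a plain helper so it runs without a test framework; the `FrameTypeShape` class and its `Expected` list are hypothetical names, and the enum is repeated here only to keep the snippet self-contained.

```csharp
using System;
using System.Linq;

public enum FrameType
{
    Hello, Heartbeat, EndpointsUpdate, Request, RequestStreamData,
    Response, ResponseStreamData, Cancel
}

public static class FrameTypeShape
{
    // Names the spec is expected to list; keep in sync with specs.md.
    public static readonly string[] Expected =
    {
        "Hello", "Heartbeat", "EndpointsUpdate", "Request",
        "RequestStreamData", "Response", "ResponseStreamData", "Cancel"
    };

    // Returns names present on one side but not the other; empty means in sync.
    public static string[] Differences()
    {
        var actual = Enum.GetNames(typeof(FrameType));
        return Expected.Except(actual).Concat(actual.Except(Expected)).ToArray();
    }
}
```

In `StellaOps.Router.Common.Tests` this would typically be wrapped in a `[Fact]` that calls `Assert.Empty(FrameTypeShape.Differences())`, so both an accidental removal and an undocumented addition fail the build.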
---
## 7. Cleanliness & review checklist
Before you move on to the in-memory transport and gateway/microservice wiring, check:
1. `StellaOps.Router.Common`:
* Compiles with zero warnings (nullable enabled).
* Only references BCL; no ASP.NET or serializer packages unless intentionally agreed in the spec.
2. All types listed in `specs.md` under the Common section exist and match names & property sets.
3. No behavior/logic:
* No LINQ-heavy methods.
* No routing algorithm code.
* No network code.
* No YAML/JSON or serialization.
4. `StellaOps.Router.Common.Tests` runs and passes.
5. `docs/router/specs.md` is updated if there was any discrepancy (or the code is updated to match the spec, not the other way around).
---
If you want the next step, I can outline “3. Build in-memory transport + minimal HELLO/REQUEST/RESPONSE wiring” in the same style, so agents can move from contracts to a working vertical slice.


@@ -1,144 +0,0 @@
For this step, you're not writing any real logic yet; you're just making sure the projects depend on each other in the right direction so future work doesn't turn into spaghetti.
Think of it as locking in the dependency graph.
---
## 1. Pin the desired dependency graph
First, make explicit what is allowed to depend on what.
Target graph:
* `StellaOps.Router.Common`
  * Lowest layer.
  * **No** project references to any other StellaOps projects.
* `StellaOps.Router.Config`
  * References:
    * `StellaOps.Router.Common`.
* `StellaOps.Microservice`
  * References:
    * `StellaOps.Router.Common`.
* `StellaOps.Microservice.SourceGen`
  * For now: no references, or only to Common if needed for types in generated code.
  * Later: will be consumed as an analyzer by `StellaOps.Microservice`, not via normal project reference.
* `StellaOps.Gateway.WebService`
  * References:
    * `StellaOps.Router.Common`
    * `StellaOps.Router.Config`.
Test projects:
* `StellaOps.Router.Common.Tests` → `StellaOps.Router.Common`
* `StellaOps.Gateway.WebService.Tests` → `StellaOps.Gateway.WebService`
* `StellaOps.Microservice.Tests` → `StellaOps.Microservice`
Explicitly: there should be **no** circular references, and nothing should reference the Gateway from libraries.
---
## 2. Add the project references
From repo root, for each needed edge:
```bash
# Gateway → Common + Config
dotnet add src/StellaOps.Gateway.WebService/StellaOps.Gateway.WebService.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj \
src/__Libraries/StellaOps.Router.Config/StellaOps.Router.Config.csproj
# Microservice → Common
dotnet add src/__Libraries/StellaOps.Microservice/StellaOps.Microservice.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj
# Config → Common
dotnet add src/__Libraries/StellaOps.Router.Config/StellaOps.Router.Config.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj
# Tests → main projects
dotnet add tests/StellaOps.Router.Common.Tests/StellaOps.Router.Common.Tests.csproj reference \
src/__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj
dotnet add tests/StellaOps.Gateway.WebService.Tests/StellaOps.Gateway.WebService.Tests.csproj reference \
src/StellaOps.Gateway.WebService/StellaOps.Gateway.WebService.csproj
dotnet add tests/StellaOps.Microservice.Tests/StellaOps.Microservice.Tests.csproj reference \
src/__Libraries/StellaOps.Microservice/StellaOps.Microservice.csproj
```
Do **not** add any references:
* From `Common` → anything.
* From `Config` → Gateway or Microservice.
* From `Microservice` → Gateway.
* From tests → libraries other than their primary target (unless you explicitly want shared test utils later).
---
## 3. Verify the .csproj contents
Have one agent open each `.csproj` and confirm:
* `StellaOps.Router.Common.csproj`
* No `<ProjectReference>` elements.
* `StellaOps.Router.Config.csproj`
* Exactly one `<ProjectReference>`: Common.
* `StellaOps.Microservice.csproj`
* Exactly one `<ProjectReference>`: Common.
* `StellaOps.Microservice.SourceGen.csproj`
* No project references for now (we'll convert it to a proper analyzer / source-generator package later).
* `StellaOps.Gateway.WebService.csproj`
* Exactly two `<ProjectReference>`s: Common + Config.
* No reference to Microservice.
* Test projects:
* Each test project references only its corresponding main project (no cross-test coupling).
If anything else is present (e.g. leftover references from templates), remove them.
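The manual check above can also be automated. The following is a minimal sketch, assuming SDK-style `.csproj` files (no XML namespace on the root element); `ProjectReferenceCheck` is a hypothetical helper name, not part of the spec.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class ProjectReferenceCheck
{
    // Returns the project names (file name without extension) of every
    // <ProjectReference Include="..."> found in the given .csproj XML.
    public static string[] ReferencedProjects(string csprojXml)
    {
        return XDocument.Parse(csprojXml)
            .Descendants("ProjectReference")
            .Select(element => (string?)element.Attribute("Include"))
            .Where(include => !string.IsNullOrWhiteSpace(include))
            // Normalize Windows-style separators so this also works on Linux/macOS.
            .Select(include => Path.GetFileNameWithoutExtension(include!.Replace('\\', '/')))
            .ToArray();
    }
}
```

A test could read each project file with `File.ReadAllText` and compare the result against the allowed set, for example an empty array for `StellaOps.Router.Common.csproj` and exactly `StellaOps.Router.Common` for the Config and Microservice projects.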
---
## 4. Run a full build & test as a sanity check
From repo root:
```bash
dotnet restore
dotnet build StellaOps.Router.sln -c Debug
dotnet test StellaOps.Router.sln -c Debug
```
Acceptance criteria for this step:
* Solution builds without reference errors.
* All test projects compile and run (even if they only have dummy tests).
* IntelliSense / navigation in the IDE shows:
* Gateway can see Common & Config types.
* Microservice can see Common types.
* Config can see Common types.
* No library can see Gateway types; only the Gateway test project references the Gateway.
Once this is stable, your devs can safely move on to implementing the Common model and know they won't have to rewrite references later.


@@ -1,520 +0,0 @@
For this step, the goal is: a microservice that can:
* Start up with `AddStellaMicroservice(...)`
* Discover its endpoints from attributes
* Connect to the router (via InMemory transport)
* Send a HELLO with identity + endpoints
* Receive a REQUEST and return a RESPONSE
No streaming, no cancellation, no heartbeat yet. Pure minimal handshake & dispatch.
---
## 0. Preconditions
Before your agents start this step, you should have:
* `StellaOps.Router.Common` contracts in place (enums, `EndpointDescriptor`, `ConnectionState`, `Frame`, etc.).
* The solution skeleton and project references configured.
* A **stub** InMemory transport “router harness” (at least a place to park the future InMemory transport). Even if it's not fully implemented, assume it will expose:
  * A way for a microservice to “connect” and register itself.
  * A way to deliver frames from router to microservice and back.
If InMemory isn't built yet, the microservice code should be written *against abstractions* so you can plug it in later.
---
## 1. Define microservice public surface (SDK contract)
**Project:** `__Libraries/StellaOps.Microservice`
**Owner:** microservice SDK agent
Purpose: give product teams a stable way to define services and endpoints without caring about transports.
### 1.1 Options
Make sure `StellaMicroserviceOptions` matches the spec:
```csharp
public sealed class StellaMicroserviceOptions
{
    public string ServiceName { get; set; } = string.Empty;
    public string Version { get; set; } = string.Empty;
    public string Region { get; set; } = string.Empty;
    public string InstanceId { get; set; } = string.Empty;
    public IList<RouterEndpointConfig> Routers { get; set; } = new List<RouterEndpointConfig>();
    public string? ConfigFilePath { get; set; }
}

public sealed class RouterEndpointConfig
{
    public string Host { get; set; } = string.Empty;
    public int Port { get; set; }
    public TransportType TransportType { get; set; }
}
```
`Routers` is mandatory: without at least one router configured, the SDK should refuse to start later (that policy can be enforced in the handshake stage).
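One way the "at least one router" policy might be expressed is sketched below. `RouterOptionsCheck` is a hypothetical helper; the host and port checks go beyond what the text above requires and are assumptions, and the options class is trimmed to the two properties the check needs.

```csharp
using System;
using System.Collections.Generic;

public enum TransportType { Udp, Tcp, Certificate, RabbitMq }

public sealed class RouterEndpointConfig
{
    public string Host { get; set; } = string.Empty;
    public int Port { get; set; }
    public TransportType TransportType { get; set; }
}

// Trimmed copy of the options class, enough for the check below.
public sealed class StellaMicroserviceOptions
{
    public string ServiceName { get; set; } = string.Empty;
    public IList<RouterEndpointConfig> Routers { get; set; } = new List<RouterEndpointConfig>();
}

public static class RouterOptionsCheck
{
    // Returns human-readable problems; an empty list means the options are usable.
    public static IReadOnlyList<string> Validate(StellaMicroserviceOptions options)
    {
        var errors = new List<string>();
        if (options.Routers.Count == 0)
            errors.Add("At least one router must be configured.");

        foreach (var router in options.Routers)
        {
            // Assumption: an empty host or an out-of-range port is never valid.
            if (string.IsNullOrWhiteSpace(router.Host) || router.Port is < 1 or > 65535)
                errors.Add($"Invalid router endpoint '{router.Host}:{router.Port}'.");
        }
        return errors;
    }
}
```

Returning a list instead of throwing leaves it to the handshake stage to decide whether to fail startup or only log, which fits the note that enforcement happens later.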
### 1.2 Public endpoint abstractions
Define:
* Attribute for endpoint identity:
```csharp
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class StellaEndpointAttribute : Attribute
{
    public string Method { get; }
    public string Path { get; }

    public StellaEndpointAttribute(string method, string path)
    {
        Method = method;
        Path = path;
    }
}
```
* Raw handler:
```csharp
public sealed class RawRequestContext
{
    public string Method { get; init; } = string.Empty;
    public string Path { get; init; } = string.Empty;
    public IReadOnlyDictionary<string,string> Headers { get; init; } =
        new Dictionary<string,string>();
    public Stream Body { get; init; } = Stream.Null;
    public CancellationToken CancellationToken { get; init; }
}

public sealed class RawResponse
{
    public int StatusCode { get; set; } = 200;
    public IDictionary<string,string> Headers { get; } =
        new Dictionary<string,string>();
    public Func<Stream,Task>? WriteBodyAsync { get; set; } // may be null
}

public interface IRawStellaEndpoint
{
    Task<RawResponse> HandleAsync(RawRequestContext ctx);
}
```
* Typed convenience interfaces (used later, but define now):
```csharp
public interface IStellaEndpoint<TRequest,TResponse>
{
    Task<TResponse> HandleAsync(TRequest request, CancellationToken ct);
}

public interface IStellaEndpoint<TResponse>
{
    Task<TResponse> HandleAsync(CancellationToken ct);
}
```
At this step, you don't need to implement adapters yet, but the signatures must be fixed.
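To show how a product team might use the raw abstractions, here is a small sketch of an endpoint. `PingEndpoint` and the `/ping` route are hypothetical; the SDK shapes are repeated (slightly trimmed) so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class StellaEndpointAttribute : Attribute
{
    public string Method { get; }
    public string Path { get; }
    public StellaEndpointAttribute(string method, string path) { Method = method; Path = path; }
}

public sealed class RawRequestContext
{
    public string Method { get; init; } = string.Empty;
    public string Path { get; init; } = string.Empty;
    public Stream Body { get; init; } = Stream.Null;
    public CancellationToken CancellationToken { get; init; }
}

public sealed class RawResponse
{
    public int StatusCode { get; set; } = 200;
    public IDictionary<string, string> Headers { get; } = new Dictionary<string, string>();
    public Func<Stream, Task>? WriteBodyAsync { get; set; }
}

public interface IRawStellaEndpoint
{
    Task<RawResponse> HandleAsync(RawRequestContext ctx);
}

// Hypothetical endpoint: answers GET /ping with a plain-text body.
[StellaEndpoint("GET", "/ping")]
public sealed class PingEndpoint : IRawStellaEndpoint
{
    public Task<RawResponse> HandleAsync(RawRequestContext ctx)
    {
        var response = new RawResponse
        {
            StatusCode = 200,
            // The body is written lazily, when the SDK supplies the output stream.
            WriteBodyAsync = async body =>
            {
                var bytes = Encoding.UTF8.GetBytes("pong");
                await body.WriteAsync(bytes, 0, bytes.Length, ctx.CancellationToken);
            }
        };
        response.Headers["Content-Type"] = "text/plain";
        return Task.FromResult(response);
    }
}
```

The handler never touches a transport type; it only sees the request context and a stream to write to, which is the separation this section is aiming for.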
### 1.3 Registration extension
Extend `AddStellaMicroservice` to wire options + a few internal services:
```csharp
public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddStellaMicroservice(
        this IServiceCollection services,
        Action<StellaMicroserviceOptions> configure)
    {
        services.Configure(configure);
        services.AddSingleton<IEndpointCatalog, EndpointCatalog>(); // to be implemented
        services.AddSingleton<IEndpointDispatcher, EndpointDispatcher>(); // to be implemented
        services.AddHostedService<MicroserviceBootstrapHostedService>(); // handshake loop
        return services;
    }
}
```
This still compiles with empty implementations; you fill them in next steps.
---
## 2. Endpoint discovery (reflection only for now)
**Project:** `StellaOps.Microservice`
**Owner:** SDK agent
Goal: given the entry assembly, build:
* A list of `EndpointDescriptor` objects (from Common).
* A mapping `(Method, Path) -> handler type` used for dispatch.
### 2.1 Internal types
Define an internal representation:
```csharp
internal sealed class EndpointRegistration
{
    public EndpointDescriptor Descriptor { get; init; } = default!;
    public Type HandlerType { get; init; } = default!;
}
```
Define an interface for discovery:
```csharp
internal interface IEndpointDiscovery
{
    IReadOnlyList<EndpointRegistration> DiscoverEndpoints(StellaMicroserviceOptions options);
}
```
### 2.2 Implement reflection-based discovery
Create `ReflectionEndpointDiscovery`:
* Scan the entry assembly (and optionally referenced assemblies) for classes that:
  * Have `StellaEndpointAttribute`.
  * Implement either:
    * `IRawStellaEndpoint`, or
    * `IStellaEndpoint<,>`, or
    * `IStellaEndpoint<>`.
* For each `[StellaEndpoint]` usage:
  * Create `EndpointDescriptor` with:
    * `ServiceName` = `options.ServiceName`.
    * `Version` = `options.Version`.
    * `Method`, `Path` from attribute.
    * `DefaultTimeout` = some sensible default (e.g. `TimeSpan.FromSeconds(30)`; refine later).
    * `SupportsStreaming` = `false` (for now).
    * `RequiringClaims` = empty array (for now).
  * Create `EndpointRegistration` with `Descriptor` + `HandlerType`.
* Return the list.
Wire it into DI:
```csharp
services.AddSingleton<IEndpointDiscovery, ReflectionEndpointDiscovery>();
```
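The scanning part of `ReflectionEndpointDiscovery` could look roughly like the sketch below. It is not the final implementation: interface members are omitted, `DiscoveredEndpoint` is a hypothetical stand-in for `EndpointRegistration`, and the assembly is passed in explicitly rather than taken from `Assembly.GetEntryAssembly()` so the snippet is easy to exercise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class StellaEndpointAttribute : Attribute
{
    public string Method { get; }
    public string Path { get; }
    public StellaEndpointAttribute(string method, string path) { Method = method; Path = path; }
}

// Marker versions of the SDK interfaces; members omitted for brevity.
public interface IRawStellaEndpoint { }
public interface IStellaEndpoint<TRequest, TResponse> { }
public interface IStellaEndpoint<TResponse> { }

// Hypothetical result shape standing in for EndpointRegistration.
public sealed record DiscoveredEndpoint(string Method, string Path, Type HandlerType);

public static class ReflectionDiscoverySketch
{
    public static IReadOnlyList<DiscoveredEndpoint> Discover(Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(type => type.IsClass && !type.IsAbstract && IsEndpointHandler(type))
            // AllowMultiple = true, so one class may yield several endpoints.
            .SelectMany(type => type.GetCustomAttributes<StellaEndpointAttribute>()
                .Select(attr => new DiscoveredEndpoint(attr.Method, attr.Path, type)))
            .ToArray();
    }

    private static bool IsEndpointHandler(Type type)
    {
        if (typeof(IRawStellaEndpoint).IsAssignableFrom(type)) return true;
        return type.GetInterfaces().Any(i =>
            i.IsGenericType &&
            (i.GetGenericTypeDefinition() == typeof(IStellaEndpoint<,>) ||
             i.GetGenericTypeDefinition() == typeof(IStellaEndpoint<>)));
    }
}

// Hypothetical handler used to exercise the scan.
[StellaEndpoint("GET", "/ping")]
[StellaEndpoint("HEAD", "/ping")]
public sealed class PingHandler : IRawStellaEndpoint { }
```

In the real discovery class, each result would be turned into an `EndpointDescriptor` using `options.ServiceName`, `options.Version`, and the defaults listed above. Classes that carry the attribute but implement none of the interfaces are skipped silently here; logging them may be worth considering.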
---
## 3. Endpoint catalog & dispatcher (microservice internal)
**Project:** `StellaOps.Microservice`
**Owner:** SDK agent
Goal: have the following in place:
* A catalog holding endpoints and descriptors.
* A dispatcher that takes frames and calls handlers.
### 3.1 Endpoint catalog
Define:
```csharp
internal interface IEndpointCatalog
{
    IReadOnlyList<EndpointDescriptor> Descriptors { get; }
    bool TryGetHandler(string method, string path, out EndpointRegistration endpoint);
}

internal sealed class EndpointCatalog : IEndpointCatalog
{
    private readonly Dictionary<(string Method, string Path), EndpointRegistration> _map;

    public IReadOnlyList<EndpointDescriptor> Descriptors { get; }

    public EndpointCatalog(IEndpointDiscovery discovery,
        IOptions<StellaMicroserviceOptions> optionsAccessor)
    {
        var options = optionsAccessor.Value;
        var registrations = discovery.DiscoverEndpoints(options);
        // StringComparer cannot compare tuple keys, so normalize both key parts
        // to get the case-insensitive matching the comparer was meant to provide.
        _map = registrations.ToDictionary(
            r => NormalizeKey(r.Descriptor.Method, r.Descriptor.Path),
            r => r);
        Descriptors = registrations.Select(r => r.Descriptor).ToArray();
    }

    public bool TryGetHandler(string method, string path, out EndpointRegistration endpoint) =>
        _map.TryGetValue(NormalizeKey(method, path), out endpoint!);

    private static (string Method, string Path) NormalizeKey(string method, string path) =>
        (method.ToUpperInvariant(), path.ToUpperInvariant());
}
```
You can refine path normalization later; for now, keep it simple.
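When you do get to normalization, one possible first cut is a helper applied both when building the catalog and when looking up a handler. The name `EndpointKeySketch` and the chosen rules (upper-case the method, trim trailing slashes, leave path casing alone) are assumptions for illustration, not part of the spec.

```csharp
using System;

public static class EndpointKeySketch
{
    // Hypothetical rules: method is upper-cased, trailing slashes are trimmed,
    // and an empty path collapses to "/". Path casing is left untouched.
    public static (string Method, string Path) Normalize(string method, string path)
    {
        var normalizedMethod = method.Trim().ToUpperInvariant();
        var normalizedPath = path.Length > 1 ? path.TrimEnd('/') : path;
        if (normalizedPath.Length == 0)
            normalizedPath = "/";
        return (normalizedMethod, normalizedPath);
    }
}
```

With these rules, `get` plus `/ping/` and `GET` plus `/ping` map to the same dictionary key.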
### 3.2 Endpoint dispatcher
Define:
```csharp
internal interface IEndpointDispatcher
{
Task<Frame> HandleRequestAsync(Frame requestFrame, CancellationToken ct);
}
```
Implement `EndpointDispatcher` with minimal behavior:
1. Decode `requestFrame.Payload` into a small DTO carrying:
* Method
* Path
* Headers (if you already have a format; if not, assume no headers in v0)
* Body bytes
For this step, you can stub decoding as:
* Payload = raw body bytes.
* Method/Path are carried separately in frame header or in a simple DTO; decide a minimal interim format and write it down.
2. Use `IEndpointCatalog.TryGetHandler(method, path, ...)`:
* If not found:
* Build a `RawResponse` with status 404 and empty body.
3. If handler implements `IRawStellaEndpoint`:
* Instantiate via DI (`IServiceProvider.GetRequiredService(handlerType)`).
* Build `RawRequestContext` with:
* Method, Path, Headers, Body (`new MemoryStream(bodyBytes)` for now).
* `CancellationToken` = `ct`.
* Call `HandleAsync`.
* Convert `RawResponse` into a response frame payload.
4. If handler implements `IStellaEndpoint<,>` (typed):
* For now, **you can skip typed handling** or wire a very simple JSON-based adapter if you want to unlock it early. The focus in this step is the raw path; typed adapters can come in the next iteration.
Return a `Frame` with:
* `Type = FrameType.Response`
* `CorrelationId` = `requestFrame.CorrelationId`
* `Payload` = encoded response (status + body bytes).
No streaming, and no cancellation logic beyond passing `ct` through; the router won't cancel yet.
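The raw path through the dispatcher can be sketched as below. Everything here is a simplified assumption: `DecodedRequest`, `RawResult`, and `RawHandler` stand in for the real request context, `RawResponse`, and DI-resolved handler, and the decode/encode functions are injected because the interim payload format is still undecided.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified stand-ins for the SDK types.
public enum FrameType : byte { Request = 4, Response = 6 }

public sealed record Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

// Interim decoded request: method and path travel next to the raw body bytes.
public sealed record DecodedRequest(string Method, string Path, byte[] Body);

public sealed record RawResult(int StatusCode, byte[] Body);

public delegate Task<RawResult> RawHandler(DecodedRequest request, CancellationToken ct);

public sealed class EndpointDispatcherSketch
{
    private readonly IReadOnlyDictionary<(string Method, string Path), RawHandler> _handlers;
    private readonly Func<byte[], DecodedRequest> _decode;
    private readonly Func<RawResult, byte[]> _encode;

    public EndpointDispatcherSketch(
        IReadOnlyDictionary<(string Method, string Path), RawHandler> handlers,
        Func<byte[], DecodedRequest> decode,
        Func<RawResult, byte[]> encode)
    {
        _handlers = handlers;
        _decode = decode;
        _encode = encode;
    }

    public async Task<Frame> HandleRequestAsync(Frame requestFrame, CancellationToken ct)
    {
        var request = _decode(requestFrame.Payload);

        // Unknown endpoint: answer 404 with an empty body instead of throwing.
        var result = _handlers.TryGetValue((request.Method, request.Path), out var handler)
            ? await handler(request, ct)
            : new RawResult(404, Array.Empty<byte>());

        // The response always echoes the request's correlation ID.
        return new Frame(FrameType.Response, requestFrame.CorrelationId, _encode(result));
    }
}
```

Injecting the codec keeps the dispatcher unchanged once the real frame payload format is written down.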
---
## 4. Minimal handshake hosted service (using InMemory)
**Project:** `StellaOps.Microservice`
**Owner:** SDK agent
This is where the microservice actually “talks” to the router.
### 4.1 Define a microservice connection abstraction
Your SDK should not depend directly on InMemory; define an internal abstraction:
```csharp
internal interface IMicroserviceConnection
{
Task StartAsync(CancellationToken ct);
Task StopAsync(CancellationToken ct);
}
```
The implementation for this step will target the InMemory transport; later you can add TCP/TLS/RabbitMQ versions.
### 4.2 Implement InMemory microservice connection
Assuming you have or will have an `IInMemoryRouterClient` (or similar) dev harness, implement:
```csharp
internal sealed class InMemoryMicroserviceConnection : IMicroserviceConnection
{
private readonly IEndpointCatalog _catalog;
private readonly IEndpointDispatcher _dispatcher;
private readonly IOptions<StellaMicroserviceOptions> _options;
private readonly IInMemoryRouterClient _routerClient; // dev-only abstraction
public InMemoryMicroserviceConnection(
IEndpointCatalog catalog,
IEndpointDispatcher dispatcher,
IOptions<StellaMicroserviceOptions> options,
IInMemoryRouterClient routerClient)
{
_catalog = catalog;
_dispatcher = dispatcher;
_options = options;
_routerClient = routerClient;
}
private readonly CancellationTokenSource _lifetimeCts = new();
public async Task StartAsync(CancellationToken ct)
{
var opts = _options.Value;
// Build HELLO payload from options + catalog.Descriptors
var helloPayload = BuildHelloPayload(opts, _catalog.Descriptors);
await _routerClient.ConnectAsync(opts, ct);
await _routerClient.SendHelloAsync(helloPayload, ct);
// Start background receive loop. The token passed to StartAsync only covers
// startup, so the loop runs on a connection-lifetime token instead.
_ = Task.Run(() => ReceiveLoopAsync(_lifetimeCts.Token));
}
public Task StopAsync(CancellationToken ct)
{
// For now: stop the receive loop and ask routerClient to disconnect; finer handling later
_lifetimeCts.Cancel();
return _routerClient.DisconnectAsync(ct);
}
private async Task ReceiveLoopAsync(CancellationToken ct)
{
await foreach (var frame in _routerClient.GetIncomingFramesAsync(ct))
{
if (frame.Type == FrameType.Request)
{
var response = await _dispatcher.HandleRequestAsync(frame, ct);
await _routerClient.SendFrameAsync(response, ct);
}
else
{
// Ignore other frame types in this minimal step
}
}
}
}
```
`IInMemoryRouterClient` is whatever dev harness you build for the in-memory transport; the exact shape is not important for this step's planning, only that it provides:
* `ConnectAsync`
* `SendHelloAsync`
* `GetIncomingFramesAsync` (async stream of frames)
* `SendFrameAsync` for responses
* `DisconnectAsync`
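One possible shape for such a harness is a pair of channels. This is a sketch under assumptions: the interface is renamed `IInMemoryRouterClientSketch` to mark it as hypothetical, `ConnectAsync` drops the options argument, and `Frame` is a reduced stand-in.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical, reduced frame model.
public enum FrameType : byte { Hello = 1, Request = 4, Response = 6 }
public sealed record Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

public interface IInMemoryRouterClientSketch
{
    Task ConnectAsync(CancellationToken ct);
    Task SendHelloAsync(byte[] helloPayload, CancellationToken ct);
    IAsyncEnumerable<Frame> GetIncomingFramesAsync(CancellationToken ct);
    Task SendFrameAsync(Frame frame, CancellationToken ct);
    Task DisconnectAsync(CancellationToken ct);
}

public sealed class ChannelRouterClient : IInMemoryRouterClientSketch
{
    // Router -> microservice frames; a test writes REQUEST frames here.
    public Channel<Frame> Incoming { get; } = Channel.CreateUnbounded<Frame>();
    // Microservice -> router frames; a test reads HELLO/RESPONSE frames here.
    public Channel<Frame> Outgoing { get; } = Channel.CreateUnbounded<Frame>();

    public Task ConnectAsync(CancellationToken ct) => Task.CompletedTask;

    public Task SendHelloAsync(byte[] helloPayload, CancellationToken ct) =>
        SendFrameAsync(new Frame(FrameType.Hello, Guid.Empty, helloPayload), ct);

    public IAsyncEnumerable<Frame> GetIncomingFramesAsync(CancellationToken ct) =>
        Incoming.Reader.ReadAllAsync(ct);

    public Task SendFrameAsync(Frame frame, CancellationToken ct) =>
        Outgoing.Writer.WriteAsync(frame, ct).AsTask();

    public Task DisconnectAsync(CancellationToken ct)
    {
        // Completing the channel ends the consumer's `await foreach` loop.
        Incoming.Writer.TryComplete();
        return Task.CompletedTask;
    }
}
```

Exposing both channels lets a test play the router role directly: write a REQUEST into `Incoming`, then read the RESPONSE from `Outgoing`.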
### 4.3 Hosted service to bootstrap the connection
Implement `MicroserviceBootstrapHostedService`:
```csharp
internal sealed class MicroserviceBootstrapHostedService : IHostedService
{
private readonly IMicroserviceConnection _connection;
public MicroserviceBootstrapHostedService(IMicroserviceConnection connection)
{
_connection = connection;
}
public Task StartAsync(CancellationToken cancellationToken) =>
_connection.StartAsync(cancellationToken);
public Task StopAsync(CancellationToken cancellationToken) =>
_connection.StopAsync(cancellationToken);
}
```
Wire `IMicroserviceConnection` to `InMemoryMicroserviceConnection` in DI for now:
```csharp
services.AddSingleton<IMicroserviceConnection, InMemoryMicroserviceConnection>();
```
In a later phase, you'll swap this to transport-specific connectors.
---
## 5. End-to-end smoke test (InMemory only)
**Project:** `StellaOps.Microservice.Tests` + a minimal InMemory router test harness
**Owner:** test agent
Goal: prove that minimal handshake & dispatch works in memory.
1. Build a trivial test microservice:
* Define a handler:
```csharp
[StellaEndpoint("GET", "/ping")]
public sealed class PingEndpoint : IRawStellaEndpoint
{
public Task<RawResponse> HandleAsync(RawRequestContext ctx)
{
var resp = new RawResponse { StatusCode = 200 };
resp.Headers["Content-Type"] = "text/plain";
// async/await keeps this valid whether WriteBodyAsync expects a Task or a ValueTask.
resp.WriteBodyAsync = async stream => await stream.WriteAsync(
Encoding.UTF8.GetBytes("pong"));
return Task.FromResult(resp);
}
}
```
2. Test harness:
* Spin up:
* An instance of the microservice host (generic HostBuilder).
* An in-memory “router” that:
* Accepts HELLO from the microservice.
* Sends a single REQUEST frame for `GET /ping`.
* Receives the RESPONSE frame.
3. Assert:
* The HELLO includes the `/ping` endpoint.
* The REQUEST is dispatched to `PingEndpoint`.
* The RESPONSE has status 200 and body “pong”.
This verifies that:
* `AddStellaMicroservice` wires discovery, catalog, dispatcher, bootstrap.
* The microservice sends HELLO on connect.
* The microservice can handle at least one request via InMemory.
---
## 6. Done criteria for “minimal handshake & dispatch”
You can consider this step complete when:
* `StellaOps.Microservice` exposes:
* Options.
* Attribute & handler interfaces (raw + typed).
* `AddStellaMicroservice` registering discovery, catalog, dispatcher, and hosted service.
* The microservice can:
* Discover endpoints via reflection.
* Build a `HELLO` payload and send it over InMemory on startup.
* Receive a `REQUEST` frame over InMemory.
* Dispatch that request to the correct handler.
* Return a `RESPONSE` frame.
Not yet required in this step:
* Streaming bodies.
* Heartbeats or health evaluation.
* Cancellation via CANCEL frames.
* Authority overrides for requiringClaims.
Those come in subsequent phases; right now you just want a working minimal vertical slice: InMemory microservice that says “HELLO” and responds to one simple request.


@@ -1,554 +0,0 @@
For this step, the goal is: the gateway can accept an HTTP request, route it to **one** microservice over the **InMemory** transport, get a response, and return it to the client.
No health/heartbeat yet. No streaming yet. Just: HTTP → InMemory → microservice → InMemory → HTTP.
I'll assume you're still in the InMemory world and not touching TCP/UDP/RabbitMQ at this stage.
---
## 0. Preconditions
Before you start:
* `StellaOps.Router.Common` exists and exposes:
* `EndpointDescriptor`, `ConnectionState`, `Frame`, `FrameType`, `TransportType`, `RoutingDecision`.
* Interfaces: `IGlobalRoutingState`, `IRoutingPlugin`, `ITransportClient`.
* `StellaOps.Microservice` minimal handshake & dispatch is in place (from your “step 4”):
* Microservice can:
* Discover endpoints.
* Connect to an InMemory router client.
* Send HELLO.
* Receive REQUEST and send RESPONSE.
* Gateway project exists (`StellaOps.Gateway.WebService`) and runs as a basic ASP.NET Core app.
If anything in that list is not true, fix it first or adjust the plan accordingly.
---
## 1. Implement an InMemory transport “hub”
You need a simple in-process component that:
* Keeps track of “connections” from microservices.
* Delivers frames from the gateway to the correct microservice and back.
You can host this either:
* In a dedicated **test/support** assembly, or
* In the gateway project but marked as “dev-only” transport.
For this step, keep it simple and in-memory.
### 1.1 Define an InMemory router hub
Conceptually:
```csharp
public interface IInMemoryRouterHub
{
// Called by microservice side to register a new connection
Task<string> RegisterMicroserviceAsync(
InstanceDescriptor instance,
IReadOnlyList<EndpointDescriptor> endpoints,
Func<Frame, Task> onFrameFromGateway,
CancellationToken ct);
// Called by microservice when it wants to send a frame to the gateway
Task SendFromMicroserviceAsync(string connectionId, Frame frame, CancellationToken ct);
// Called by gateway transport client when sending a frame to a microservice
Task<Frame> SendFromGatewayAsync(string connectionId, Frame frame, CancellationToken ct);
}
```
Internally, the hub maintains per-connection data:
* `ConnectionId`
* `InstanceDescriptor`
* Endpoints
* Delegate `onFrameFromGateway` (microservice receiver)
For minimal routing you can start by:
* Only supporting `SendFromGatewayAsync` for REQUEST and returning RESPONSE.
* For now, heartbeat frames can be ignored or stubbed.
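Because `onFrameFromGateway` returns a plain `Task` while `SendFromGatewayAsync` must return the RESPONSE frame, the hub needs some way to pair the two. A minimal sketch, assuming correlation by `CorrelationId` through a `TaskCompletionSource`; the class name is hypothetical and `Register` omits the instance descriptor and endpoint list for brevity:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, reduced frame model.
public enum FrameType : byte { Request = 4, Response = 6 }
public sealed record Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

public sealed class InMemoryRouterHubSketch
{
    // connectionId -> delegate that delivers gateway frames to the microservice.
    private readonly ConcurrentDictionary<string, Func<Frame, Task>> _receivers = new();
    // correlationId -> gateway call waiting for its RESPONSE frame.
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<Frame>> _pending = new();

    public string Register(Func<Frame, Task> onFrameFromGateway)
    {
        var connectionId = Guid.NewGuid().ToString("N");
        _receivers[connectionId] = onFrameFromGateway;
        return connectionId;
    }

    public async Task<Frame> SendFromGatewayAsync(string connectionId, Frame frame, CancellationToken ct)
    {
        if (!_receivers.TryGetValue(connectionId, out var receiver))
            throw new InvalidOperationException($"Unknown connection '{connectionId}'.");

        var tcs = new TaskCompletionSource<Frame>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[frame.CorrelationId] = tcs;
        // Cancelling the caller's token releases the waiting gateway call.
        using var registration = ct.Register(() => tcs.TrySetCanceled(ct));
        try
        {
            await receiver(frame);
            return await tcs.Task;
        }
        finally
        {
            _pending.TryRemove(frame.CorrelationId, out _);
        }
    }

    public Task SendFromMicroserviceAsync(string connectionId, Frame frame, CancellationToken ct)
    {
        // Only RESPONSE frames complete a pending gateway call; other types are ignored here.
        if (frame.Type == FrameType.Response &&
            _pending.TryRemove(frame.CorrelationId, out var tcs))
        {
            tcs.TrySetResult(frame);
        }
        return Task.CompletedTask;
    }
}
```

The `finally` block matters for timeouts: a cancelled request must not leave its entry in `_pending`.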
### 1.2 Connect the microservice side
Your `InMemoryMicroserviceConnection` (from step 4) should:
* Call `RegisterMicroserviceAsync` on the hub when it sends HELLO:
* Get `connectionId`.
* Provide a handler `onFrameFromGateway` that:
* Dispatches REQUEST frames via `IEndpointDispatcher`.
* Sends RESPONSE frames back via `SendFromMicroserviceAsync`.
This is mostly microservice work; you should already have most of it outlined.
---
## 2. Implement an InMemory `ITransportClient` in the gateway
Now focus on the gateway side.
**Project:** `StellaOps.Gateway.WebService` (or a small internal infra class in the same project)
### 2.1 `InMemoryTransportClient`
Implement `ITransportClient` using the `IInMemoryRouterHub`:
```csharp
public sealed class InMemoryTransportClient : ITransportClient
{
private readonly IInMemoryRouterHub _hub;
public InMemoryTransportClient(IInMemoryRouterHub hub)
{
_hub = hub;
}
public Task<Frame> SendRequestAsync(
ConnectionState connection,
Frame requestFrame,
TimeSpan timeout,
CancellationToken ct)
{
// connection.ConnectionId must be set when HELLO is processed
return _hub.SendFromGatewayAsync(connection.ConnectionId, requestFrame, ct);
}
public Task SendCancelAsync(ConnectionState connection, Guid correlationId, string? reason = null)
=> Task.CompletedTask; // no-op at this stage
public Task SendStreamingAsync(
ConnectionState connection,
Frame requestHeader,
Stream requestBody,
Func<Stream, Task> readResponseBody,
PayloadLimits limits,
CancellationToken ct)
=> throw new NotSupportedException("Streaming not implemented for InMemory in this step.");
}
```
For now:
* Ignore streaming.
* Ignore cancel.
* Just call `SendFromGatewayAsync` and get a response frame.
### 2.2 Register it in DI
In gateway `Program.cs` or a DI setup:
```csharp
services.AddSingleton<IInMemoryRouterHub, InMemoryRouterHub>(); // your hub implementation
services.AddSingleton<ITransportClient, InMemoryTransportClient>();
```
You'll later swap this with real transport clients (TCP, UDP, Rabbit), but for now everything uses InMemory.
---
## 3. Implement minimal `IGlobalRoutingState`
You now need the gateway's internal view of:
* Which endpoints exist.
* Which connections serve them.
**Project:** `StellaOps.Gateway.WebService` or a small internal infra namespace.
### 3.1 In-memory implementation
Implement an `InMemoryGlobalRoutingState` something like:
```csharp
public sealed class InMemoryGlobalRoutingState : IGlobalRoutingState
{
private readonly object _lock = new();
private readonly Dictionary<(string, string), EndpointDescriptor> _endpoints = new();
private readonly List<ConnectionState> _connections = new();
public EndpointDescriptor? ResolveEndpoint(string method, string path)
{
lock (_lock)
{
_endpoints.TryGetValue((method, path), out var endpoint);
return endpoint;
}
}
public IReadOnlyList<ConnectionState> GetConnectionsFor(
string serviceName,
string version,
string method,
string path)
{
lock (_lock)
{
return _connections
.Where(c =>
c.Instance.ServiceName == serviceName &&
c.Instance.Version == version &&
c.Endpoints.ContainsKey((method, path)))
.ToList();
}
}
// Called when HELLO arrives from microservice
public void RegisterConnection(ConnectionState connection)
{
lock (_lock)
{
_connections.Add(connection);
foreach (var kvp in connection.Endpoints)
{
var key = kvp.Key; // (Method, Path)
var descriptor = kvp.Value;
// global endpoint map: any connection's descriptor is ok as "canonical"
_endpoints[(key.Method, key.Path)] = descriptor;
}
}
}
}
```
You will refine this later; for minimal routing it's enough.
### 3.2 Hook HELLO to `IGlobalRoutingState`
In your InMemory router hub, when a microservice registers (HELLO):
* Create a `ConnectionState`:
```csharp
var conn = new ConnectionState
{
ConnectionId = generatedConnectionId,
Instance = instanceDescriptor,
Status = InstanceHealthStatus.Healthy,
LastHeartbeatUtc = DateTime.UtcNow,
AveragePingMs = 0,
TransportType = TransportType.Udp, // or TransportType.Tcp logically for InMemory
Endpoints = endpointDescriptors.ToDictionary(
e => (e.Method, e.Path),
e => e)
};
```
* Call `InMemoryGlobalRoutingState.RegisterConnection(conn)`.
This gives the gateway a routing view as soon as HELLO is processed.
---
## 4. Implement HTTP pipeline middlewares for routing
Now, wire the gateway HTTP pipeline so that an incoming HTTP request is:
1. Resolved to a logical endpoint.
2. Routed to one connection.
3. Dispatched via InMemory transport.
### 4.1 EndpointResolutionMiddleware
This maps `(Method, Path)` to an `EndpointDescriptor`.
Create a middleware:
```csharp
public sealed class EndpointResolutionMiddleware
{
private readonly RequestDelegate _next;
public EndpointResolutionMiddleware(RequestDelegate next) => _next = next;
public async Task Invoke(HttpContext context, IGlobalRoutingState routingState)
{
var method = context.Request.Method;
var path = context.Request.Path.ToString();
var endpoint = routingState.ResolveEndpoint(method, path);
if (endpoint is null)
{
context.Response.StatusCode = StatusCodes.Status404NotFound;
await context.Response.WriteAsync("Endpoint not found");
return;
}
context.Items["Stella.EndpointDescriptor"] = endpoint;
await _next(context);
}
}
```
Register it in the pipeline:
```csharp
app.UseMiddleware<EndpointResolutionMiddleware>();
```
Place it before or after auth depending on your final pipeline; for minimal routing, order is not critical.
### 4.2 Minimal routing plugin (pick first connection)
Implement a very naive `IRoutingPlugin` just to get things moving:
```csharp
public sealed class NaiveRoutingPlugin : IRoutingPlugin
{
private readonly IGlobalRoutingState _state;
public NaiveRoutingPlugin(IGlobalRoutingState state) => _state = state;
public Task<RoutingDecision?> ChooseInstanceAsync(
RoutingContext context,
CancellationToken cancellationToken)
{
var endpoint = context.Endpoint;
var connections = _state.GetConnectionsFor(
endpoint.ServiceName,
endpoint.Version,
endpoint.Method,
endpoint.Path);
var chosen = connections.FirstOrDefault();
if (chosen is null)
return Task.FromResult<RoutingDecision?>(null);
var decision = new RoutingDecision
{
Endpoint = endpoint,
Connection = chosen,
TransportType = chosen.TransportType,
EffectiveTimeout = endpoint.DefaultTimeout
};
return Task.FromResult<RoutingDecision?>(decision);
}
}
```
Register it:
```csharp
services.AddSingleton<IGlobalRoutingState, InMemoryGlobalRoutingState>();
services.AddSingleton<IRoutingPlugin, NaiveRoutingPlugin>();
```
### 4.3 RoutingDecisionMiddleware
This middleware grabs the endpoint descriptor and asks the routing plugin for a connection.
```csharp
public sealed class RoutingDecisionMiddleware
{
private readonly RequestDelegate _next;
public RoutingDecisionMiddleware(RequestDelegate next) => _next = next;
public async Task Invoke(HttpContext context, IRoutingPlugin routingPlugin)
{
var endpoint = (EndpointDescriptor?)context.Items["Stella.EndpointDescriptor"];
if (endpoint is null)
{
context.Response.StatusCode = 500;
await context.Response.WriteAsync("Endpoint metadata missing");
return;
}
var routingContext = new RoutingContext
{
Endpoint = endpoint,
GatewayRegion = "not_used_yet", // you'll fill this from GatewayNodeConfig later
HttpContext = context
};
var decision = await routingPlugin.ChooseInstanceAsync(routingContext, context.RequestAborted);
if (decision is null)
{
context.Response.StatusCode = StatusCodes.Status503ServiceUnavailable;
await context.Response.WriteAsync("No instances available");
return;
}
context.Items["Stella.RoutingDecision"] = decision;
await _next(context);
}
}
```
Register it after `EndpointResolutionMiddleware`:
```csharp
app.UseMiddleware<RoutingDecisionMiddleware>();
```
### 4.4 TransportDispatchMiddleware
This middleware:
* Builds a REQUEST frame from HTTP.
* Uses `ITransportClient` to send it to the chosen connection.
* Writes the RESPONSE frame back to HTTP.
Minimal version (buffered, no streaming):
```csharp
public sealed class TransportDispatchMiddleware
{
private readonly RequestDelegate _next;
public TransportDispatchMiddleware(RequestDelegate next) => _next = next;
public async Task Invoke(
HttpContext context,
ITransportClient transportClient)
{
var decision = (RoutingDecision?)context.Items["Stella.RoutingDecision"];
if (decision is null)
{
context.Response.StatusCode = 500;
await context.Response.WriteAsync("Routing decision missing");
return;
}
// Read request body into memory (safe for minimal tests)
byte[] bodyBytes;
using (var ms = new MemoryStream())
{
await context.Request.Body.CopyToAsync(ms);
bodyBytes = ms.ToArray();
}
var requestPayload = new MinimalRequestPayload
{
Method = context.Request.Method,
Path = context.Request.Path.ToString(),
Body = bodyBytes
// headers can be ignored or added later
};
var requestFrame = new Frame
{
Type = FrameType.Request,
CorrelationId = Guid.NewGuid(),
Payload = SerializeRequestPayload(requestPayload)
};
var timeout = decision.EffectiveTimeout;
using var cts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted);
cts.CancelAfter(timeout);
Frame responseFrame;
try
{
responseFrame = await transportClient.SendRequestAsync(
decision.Connection,
requestFrame,
timeout,
cts.Token);
}
catch (OperationCanceledException)
{
context.Response.StatusCode = StatusCodes.Status504GatewayTimeout;
await context.Response.WriteAsync("Upstream timeout");
return;
}
var responsePayload = DeserializeResponsePayload(responseFrame.Payload);
context.Response.StatusCode = responsePayload.StatusCode;
foreach (var (k, v) in responsePayload.Headers)
{
context.Response.Headers[k] = v;
}
if (responsePayload.Body is { Length: > 0 })
{
await context.Response.Body.WriteAsync(responsePayload.Body);
}
}
}
```
You'll need minimal DTOs and serializers (`MinimalRequestPayload`, `MinimalResponsePayload`) just to move bytes. You can use JSON for now; protocol details will be formalized later.
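A JSON-based version of those DTOs and serializers might look like this. It is a sketch only: the property sets mirror what the middleware above reads and writes, the `MinimalPayloadCodec` class name is an assumption, and `System.Text.Json` encodes `byte[]` bodies as Base64 strings, which is fine for an interim format.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public sealed class MinimalRequestPayload
{
    public string Method { get; init; } = string.Empty;
    public string Path { get; init; } = string.Empty;
    public byte[] Body { get; init; } = Array.Empty<byte>();
}

public sealed class MinimalResponsePayload
{
    public int StatusCode { get; init; }
    public Dictionary<string, string> Headers { get; init; } = new();
    public byte[] Body { get; init; } = Array.Empty<byte>();
}

// Hypothetical helper class; the middleware calls these as plain methods.
public static class MinimalPayloadCodec
{
    public static byte[] SerializeRequestPayload(MinimalRequestPayload payload) =>
        JsonSerializer.SerializeToUtf8Bytes(payload);

    public static MinimalRequestPayload DeserializeRequestPayload(byte[] bytes) =>
        JsonSerializer.Deserialize<MinimalRequestPayload>(bytes)
        ?? throw new InvalidOperationException("Request payload was JSON null.");

    public static byte[] SerializeResponsePayload(MinimalResponsePayload payload) =>
        JsonSerializer.SerializeToUtf8Bytes(payload);

    public static MinimalResponsePayload DeserializeResponsePayload(byte[] bytes) =>
        JsonSerializer.Deserialize<MinimalResponsePayload>(bytes)
        ?? throw new InvalidOperationException("Response payload was JSON null.");
}
```

Both sides must share these types (or at least the same property names), so they belong in a project that the gateway and the microservice SDK both reference.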
Register `TransportDispatchMiddleware` after `RoutingDecisionMiddleware`:
```csharp
app.UseMiddleware<TransportDispatchMiddleware>();
```
At this point, you no longer need ASP.NET controllers for microservice endpoints; you can have a catch-all pipeline.
---
## 5. Minimal end-to-end test
**Owner:** test agent, probably in `StellaOps.Gateway.WebService.Tests` (plus a simple host for microservice in tests)
Scenario:
1. Start an in-memory microservice host:
* It uses `AddStellaMicroservice`.
* It attaches to the same `IInMemoryRouterHub` instance as the gateway (created inside the test).
* It has a single endpoint:
* `[StellaEndpoint("GET", "/ping")]`
* Handler returns “pong”.
2. Start the gateway host:
* Inject the same `IInMemoryRouterHub`.
* Use middlewares: `EndpointResolutionMiddleware`, `RoutingDecisionMiddleware`, `TransportDispatchMiddleware`.
3. Invoke HTTP `GET /ping` against the gateway (using `WebApplicationFactory` or `TestServer`).
Assert:
* HTTP status 200.
* Body “pong”.
* The router hub saw:
* At least one HELLO frame.
* One REQUEST frame.
* One RESPONSE frame.
This proves:
* HELLO → gateway routing state population.
* Endpoint resolution → connection selection.
* InMemory transport client used.
* Minimal dispatch works.
---
## 6. Done criteria for “Gateway: minimal routing using InMemory plugin”
You're done with this step when:
* A microservice can register with the gateway via InMemory.
* The gateway's `IGlobalRoutingState` knows about endpoints and connections.
* The HTTP pipeline:
* Resolves an endpoint based on `(Method, Path)`.
* Asks `IRoutingPlugin` for a connection.
* Uses `ITransportClient` (InMemory) to send REQUEST and get RESPONSE.
* Returns the mapped HTTP response to the client.
* You have at least one automated test showing:
* `GET /ping` through gateway → InMemory → microservice → back to HTTP.
After this, you're ready to:
* Swap `NaiveRoutingPlugin` with the health/region-sensitive plugin you defined.
* Implement heartbeat and latency.
* Later replace InMemory with TCP/UDP/Rabbit without changing the HTTP pipeline.


@@ -1,541 +0,0 @@
For this step, you're layering **liveness** and **basic routing intelligence** on top of the minimal handshake/dispatch you already designed.
Target outcome:
* Microservices send **heartbeats** over the existing connection.
* The router tracks **LastHeartbeatUtc**, **health status**, and **AveragePingMs** per connection.
* The router's `IRoutingPlugin` uses **region + health + latency** to pick an instance.
No need to handle cancellation or streaming yet; just make routing decisions *not* naive.
---
## 0. Preconditions
Before starting, confirm:
* `StellaOps.Router.Common` already has:
* `InstanceHealthStatus` enum.
* `ConnectionState` with at least `Instance`, `Status`, `LastHeartbeatUtc`, `AveragePingMs`, `TransportType`.
* Minimal handshake is working:
* Microservice sends HELLO (instance + endpoints).
* Router creates `ConnectionState` & populates global routing view.
* Router can send REQUEST and receive RESPONSE via InMemory transport.
If any of that is incomplete, shore it up first.
---
## 1. Extend Common with heartbeat payloads
**Project:** `StellaOps.Router.Common`
**Owner:** Common dev
Add DTOs for heartbeat frames.
### 1.1 Heartbeat payload
```csharp
public sealed class HeartbeatPayload
{
public string InstanceId { get; init; } = string.Empty;
public InstanceHealthStatus Status { get; init; } = InstanceHealthStatus.Healthy;
// Optional basic metrics
public int InFlightRequests { get; init; }
public double ErrorRate { get; init; } // 0-1 range, optional
}
```
* This is application-level health; `Status` lets the microservice say “Degraded” / “Draining”.
* In-flight + error rate can be used later for smarter routing; initially, you can ignore them.
### 1.2 Wire into frame model
Ensure:
* `FrameType` includes `Heartbeat`:
```csharp
public enum FrameType : byte
{
Hello = 1,
Heartbeat = 2,
EndpointsUpdate = 3,
Request = 4,
RequestStreamData = 5,
Response = 6,
ResponseStreamData = 7,
Cancel = 8
}
```
* No behavior in Common; only DTOs and enums.
---
## 2. Microservice SDK: send heartbeats on the same connection
**Project:** `StellaOps.Microservice`
**Owner:** SDK dev
You already have `MicroserviceConnectionHostedService` doing HELLO and request dispatch. Now add heartbeat sending.
### 2.1 Introduce heartbeat options
Extend `StellaMicroserviceOptions` with simple settings:
```csharp
public sealed class StellaMicroserviceOptions
{
// existing fields...
public TimeSpan HeartbeatInterval { get; set; } = TimeSpan.FromSeconds(10);
public TimeSpan HeartbeatTimeout { get; set; } = TimeSpan.FromSeconds(30); // used by router, not here
}
```
### 2.2 Internal heartbeat sender
Create an internal interface and implementation:
```csharp
internal interface IHeartbeatSource
{
InstanceHealthStatus GetCurrentStatus();
int GetInFlightRequests();
double GetErrorRate();
}
```
For now you can implement a trivial `DefaultHeartbeatSource`:
* `GetCurrentStatus()` → `Healthy`.
* `GetInFlightRequests()` → 0.
* `GetErrorRate()` → 0.
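The trivial implementation is only a few lines. In this sketch the `InstanceHealthStatus` enum is a stand-in whose member list is an assumption; the real enum lives in `StellaOps.Router.Common`.

```csharp
// Stand-in for the Common enum; the member list here is an assumption.
public enum InstanceHealthStatus { Healthy, Degraded, Draining, Unhealthy }

internal interface IHeartbeatSource
{
    InstanceHealthStatus GetCurrentStatus();
    int GetInFlightRequests();
    double GetErrorRate();
}

// Always reports a healthy, idle instance until real metrics are wired in.
internal sealed class DefaultHeartbeatSource : IHeartbeatSource
{
    public InstanceHealthStatus GetCurrentStatus() => InstanceHealthStatus.Healthy;
    public int GetInFlightRequests() => 0;
    public double GetErrorRate() => 0;
}
```

Keeping it behind the interface means a later version can report real in-flight counts without touching the heartbeat loop.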
Wire this in DI:
```csharp
services.AddSingleton<IHeartbeatSource, DefaultHeartbeatSource>();
```
### 2.3 Add heartbeat loop to MicroserviceConnectionHostedService
In `StartAsync` of `MicroserviceConnectionHostedService`:
* After sending HELLO and subscribing to requests, start a background heartbeat loop.
Pseudo-plan:
```csharp
private Task? _heartbeatLoop;
public async Task StartAsync(CancellationToken ct)
{
// existing HELLO logic...
await _connection.SendHelloAsync(payload, ct);
_connection.OnRequest(frame => HandleRequestAsync(frame, ct));
_heartbeatLoop = Task.Run(() => HeartbeatLoopAsync(ct), ct);
}
private async Task HeartbeatLoopAsync(CancellationToken outerCt)
{
var opt = _options.Value;
var interval = opt.HeartbeatInterval;
var instanceId = opt.InstanceId;
while (!outerCt.IsCancellationRequested)
{
var payload = new HeartbeatPayload
{
InstanceId = instanceId,
Status = _heartbeatSource.GetCurrentStatus(),
InFlightRequests = _heartbeatSource.GetInFlightRequests(),
ErrorRate = _heartbeatSource.GetErrorRate()
};
var frame = new Frame
{
Type = FrameType.Heartbeat,
CorrelationId = Guid.Empty, // or a reserved value
Payload = SerializeHeartbeatPayload(payload)
};
await _connection.SendHeartbeatAsync(frame, outerCt);
try
{
await Task.Delay(interval, outerCt);
}
catch (TaskCanceledException)
{
break;
}
}
}
```
You'll need to extend `IMicroserviceConnection` with:
```csharp
Task SendHeartbeatAsync(Frame frame, CancellationToken ct);
```
In this step, the behavior is simple: every N seconds, push a heartbeat.
---
## 3. Router: accept heartbeats and update connection health
**Project:** `StellaOps.Gateway.WebService`
**Owner:** Gateway dev
You already have an InMemory router or similar structure that:
* Handles HELLO frames, creates `ConnectionState`.
* Maintains a global `IGlobalRoutingState`.
Now you need to:
* Handle HEARTBEAT frames.
* Update `ConnectionState.Status` and `LastHeartbeatUtc`.
### 3.1 Frame dispatch on router side
In your router's InMemory server loop (or equivalent), add a case for `FrameType.Heartbeat`:
* Deserialize `HeartbeatPayload` from `frame.Payload`.
* Find the corresponding `ConnectionState` by `InstanceId` (and/or connection ID).
* Update:
* `LastHeartbeatUtc` = `DateTime.UtcNow`.
* `Status` = `payload.Status`.
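You can add a method in your routing-state implementation (this assumes `_connections` is keyed by connection ID, e.g. a `Dictionary<string, ConnectionState>`, rather than the plain list used in the minimal version):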
```csharp
public void UpdateHeartbeat(string connectionId, HeartbeatPayload payload)
{
if (!_connections.TryGetValue(connectionId, out var conn))
return;
conn.LastHeartbeatUtc = DateTime.UtcNow;
conn.Status = payload.Status;
}
```
The router's transport server should know which `connectionId` delivered the frame; pass that along.
### 3.2 Detect stale connections (health degradation)
Add a background “health monitor” in the gateway:
* Reads `HeartbeatTimeout` from configuration (can reuse the same default as microservice or have separate router-side config).
* Periodically scans all `ConnectionState` entries:
* If `Now - LastHeartbeatUtc > HeartbeatTimeout`, mark `Status = Unhealthy` (or remove connection entirely).
* If connection drops (transport disconnect), also mark `Unhealthy` or remove.
This can be a simple `IHostedService`:
```csharp
internal sealed class ConnectionHealthMonitor : IHostedService
{
private readonly IGlobalRoutingState _state;
private readonly TimeSpan _heartbeatTimeout;
private Task? _loop;
private CancellationTokenSource? _cts;
public Task StartAsync(CancellationToken cancellationToken)
{
_cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
_loop = Task.Run(() => MonitorLoopAsync(_cts.Token), _cts.Token);
return Task.CompletedTask;
}
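public async Task StopAsync(CancellationToken cancellationToken)
{
_cts?.Cancel();
if (_loop is null)
return;
try
{
await _loop;
}
catch (OperationCanceledException)
{
// Expected: cancelling the token aborts the Task.Delay inside the loop.
}
}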
private async Task MonitorLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
_state.MarkStaleConnectionsUnhealthy(_heartbeatTimeout, DateTime.UtcNow);
await Task.Delay(TimeSpan.FromSeconds(5), ct);
}
}
}
```
You'll add a method like `MarkStaleConnectionsUnhealthy` on your `IGlobalRoutingState` implementation.
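A minimal sketch of that method, assuming connections are stored in a dictionary keyed by connection ID and guarded by a lock. `ConnectionState` and the enum are reduced stand-ins, and returning the number of newly marked connections is an addition made here for testability.

```csharp
using System;
using System.Collections.Generic;

// Reduced stand-ins for the Common types.
public enum InstanceHealthStatus { Healthy, Degraded, Unhealthy }

public sealed class ConnectionState
{
    public string ConnectionId { get; init; } = string.Empty;
    public InstanceHealthStatus Status { get; set; }
    public DateTime LastHeartbeatUtc { get; set; }
}

public sealed class RoutingStateSketch
{
    private readonly object _lock = new();
    private readonly Dictionary<string, ConnectionState> _connections = new();

    public void RegisterConnection(ConnectionState connection)
    {
        lock (_lock)
        {
            _connections[connection.ConnectionId] = connection;
        }
    }

    // Returns how many connections were newly marked unhealthy.
    public int MarkStaleConnectionsUnhealthy(TimeSpan heartbeatTimeout, DateTime nowUtc)
    {
        var marked = 0;
        lock (_lock)
        {
            foreach (var conn in _connections.Values)
            {
                // Stale means the last heartbeat is older than the allowed timeout.
                var isStale = nowUtc - conn.LastHeartbeatUtc > heartbeatTimeout;
                if (isStale && conn.Status != InstanceHealthStatus.Unhealthy)
                {
                    conn.Status = InstanceHealthStatus.Unhealthy;
                    marked++;
                }
            }
        }
        return marked;
    }
}
```

Passing `nowUtc` in as a parameter, as the monitor loop already does, keeps the method testable without a clock abstraction.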
---
## 4. Track basic latency (AveragePingMs)
**Project:** Gateway + Common
**Owner:** Gateway dev
You want `AveragePingMs` per connection to inform routing decisions.
### 4.1 Decide where to measure
Simplest: measure “request → response” round-trip time in the gateway:
* When you send a `Request` frame to a specific connection, record:
* `SentAtUtc[CorrelationId] = DateTime.UtcNow`.
* When you receive a `Response` frame with that correlation:
* Compute `latencyMs = (UtcNow - SentAtUtc[CorrelationId]).TotalMilliseconds`.
* Discard map entry.
Then update `ConnectionState.AveragePingMs`, e.g. with an exponential moving average:
```csharp
conn.AveragePingMs = conn.AveragePingMs <= 0
? latencyMs
: conn.AveragePingMs * 0.8 + latencyMs * 0.2;
```
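The bookkeeping around that formula can be sketched as a small tracker. Two assumptions to flag: the class name `LatencyTrackerSketch` is hypothetical, and it uses `Stopwatch.GetTimestamp()` rather than the `DateTime.UtcNow` described above, since a monotonic clock is not affected by wall-clock adjustments.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;

public sealed class LatencyTrackerSketch
{
    // correlationId -> Stopwatch timestamp taken when the REQUEST frame was sent.
    private readonly ConcurrentDictionary<Guid, long> _sentAt = new();

    public void OnRequestSent(Guid correlationId) =>
        _sentAt[correlationId] = Stopwatch.GetTimestamp();

    // Returns the round-trip time in milliseconds, or null for an unknown correlation ID.
    public double? OnResponseReceived(Guid correlationId)
    {
        if (!_sentAt.TryRemove(correlationId, out var startedAt))
            return null;
        var elapsedTicks = Stopwatch.GetTimestamp() - startedAt;
        return elapsedTicks * 1000.0 / Stopwatch.Frequency;
    }

    // Exponential moving average with the same 0.8 / 0.2 weights as above.
    public static double UpdateAverage(double currentAverageMs, double latencyMs) =>
        currentAverageMs <= 0
            ? latencyMs
            : currentAverageMs * 0.8 + latencyMs * 0.2;
}
```

As a worked example, a connection averaging 100 ms that then sees a 200 ms response moves to 100 × 0.8 + 200 × 0.2 = 120 ms, so a single slow response shifts the average without dominating it.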
### 4.2 Where to hook this
* In the **gateway-side transport client** (InMemory implementation for now):
* When sending `Request` frame:
* Register `SentAtUtc` per correlation ID.
* When receiving `Response` frame:
* Compute latency.
* Call `IGlobalRoutingState.UpdateLatency(connectionId, latencyMs)`.
Add a method to the routing state:
```csharp
public void UpdateLatency(string connectionId, double latencyMs)
{
if (_connections.TryGetValue(connectionId, out var conn))
{
if (conn.AveragePingMs <= 0)
conn.AveragePingMs = latencyMs;
else
conn.AveragePingMs = conn.AveragePingMs * 0.8 + latencyMs * 0.2;
}
}
```
You can keep it simple; sophistication can come later.
---
## 5. Basic routing plugin implementation
**Project:** `StellaOps.Gateway.WebService`
**Owner:** Gateway dev
You already have `IRoutingPlugin` defined. Now implement a concrete `BasicRoutingPlugin` that respects:
* Region (gateway region first, then neighbor tiers).
* Health (`Healthy` / `Degraded` only).
* Latency preference (`AveragePingMs`).
### 5.1 Inputs & data
`RoutingContext` should carry:
* `EndpointDescriptor` (with ServiceName, Version, Method, Path).
* `GatewayRegion` (from `GatewayNodeConfig.Region`).
* The `HttpContext` if you need headers (not needed for routing at this stage).
`IGlobalRoutingState` should provide:
* `GetConnectionsFor(serviceName, version, method, path)` returning all `ConnectionState`s that support that endpoint.
### 5.2 Basic algorithm
Algorithm outline:
```csharp
public sealed class BasicRoutingPlugin : IRoutingPlugin
{
    private readonly IGlobalRoutingState _state;
    private readonly string[] _neighborRegions; // configured, can be empty

    // Constructor omitted: inject IGlobalRoutingState and the configured neighbor regions.

    public Task<RoutingDecision?> ChooseInstanceAsync(
        RoutingContext context,
        CancellationToken cancellationToken)
    {
        // Selection is purely in-memory, so there is nothing to await.
        return Task.FromResult(Choose(context));
    }

    private RoutingDecision? Choose(RoutingContext context)
    {
        var endpoint = context.Endpoint;
        var candidates = _state.GetConnectionsFor(
            endpoint.ServiceName,
            endpoint.Version,
            endpoint.Method,
            endpoint.Path);
        if (candidates.Count == 0)
            return null;

        // 1. Filter by health (only Healthy or Degraded)
        var healthy = candidates
            .Where(c => c.Status == InstanceHealthStatus.Healthy || c.Status == InstanceHealthStatus.Degraded)
            .ToList();
        if (healthy.Count == 0)
            return null;

        // 2. Partition by region tier
        var gatewayRegion = context.GatewayRegion;
        List<ConnectionState> tier1 = healthy.Where(c => c.Instance.Region == gatewayRegion).ToList();
        List<ConnectionState> tier2 = healthy.Where(c => _neighborRegions.Contains(c.Instance.Region)).ToList();
        List<ConnectionState> tier3 = healthy.Except(tier1).Except(tier2).ToList();
        var chosenTier = tier1.Count > 0 ? tier1 : tier2.Count > 0 ? tier2 : tier3;
        if (chosenTier.Count == 0)
            return null;

        // 3. Sort by latency (unknown latency sorts last), then heartbeat freshness
        var ordered = chosenTier
            .OrderBy(c => c.AveragePingMs <= 0 ? double.MaxValue : c.AveragePingMs)
            .ThenByDescending(c => c.LastHeartbeatUtc)
            .ToList();
        var winner = ordered[0];

        // 4. Build decision
        return new RoutingDecision
        {
            Endpoint = endpoint,
            Connection = winner,
            TransportType = winner.TransportType,
            EffectiveTimeout = endpoint.DefaultTimeout // or compose with config later
        };
    }
}
```
Wire it into DI:
```csharp
services.AddSingleton<IRoutingPlugin, BasicRoutingPlugin>();
```
And ensure `RoutingDecisionMiddleware` calls it.
---
## 6. Integrate health-aware routing into the HTTP pipeline
**Project:** `StellaOps.Gateway.WebService`
**Owner:** Gateway dev
Update your `RoutingDecisionMiddleware` to:
* Use the final `IRoutingPlugin` instead of picking a random connection.
* Handle null decision appropriately:
* If `ChooseInstanceAsync` returns `null`, respond with `503 Service Unavailable` or `502 Bad Gateway` and a generic error body, log the incident.
Check that:
* The gateway's region is injected (via `GatewayNodeConfig.Region`) into `RoutingContext.GatewayRegion`.
* Endpoint descriptor is resolved before you call the plugin.
---
## 7. Testing plan
**Project:** `StellaOps.Gateway.WebService.Tests`, `StellaOps.Microservice.Tests`
**Owner:** test agent
Write basic tests to lock in behavior.
### 7.1 Microservice heartbeat tests
In `StellaOps.Microservice.Tests`:
* Use a fake `IMicroserviceConnection` that records frames sent.
* Configure `HeartbeatInterval` to a small number (e.g. 100 ms).
* Start a Host with `AddStellaMicroservice`.
* Wait some time, assert:
* At least one HELLO frame was sent.
* At least N HEARTBEAT frames were sent.
* HEARTBEAT payload has correct `InstanceId` and `Status`.
### 7.2 Router health update tests
In `StellaOps.Gateway.WebService.Tests` (or a separate routing-state test project):
* Create an instance of your `IGlobalRoutingState` implementation.
* Add a connection via HELLO simulation.
* Call `UpdateHeartbeat` with a HeartbeatPayload.
* Assert:
* `LastHeartbeatUtc` updated.
* `Status` set to `Healthy` (or whatever payload said).
* Advance time (simulate via injecting a clock or mocking DateTime) and call `MarkStaleConnectionsUnhealthy`:
* Assert that `Status` changed to `Unhealthy`.
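As a rough illustration of the "inject a clock" option above, here is a minimal sketch of a stale-connection check that a test can drive deterministically. `ConnState`, `HealthStatus`, and `StaleMarker` are hypothetical stand-ins for `ConnectionState`, `InstanceHealthStatus`, and your `IGlobalRoutingState` implementation; the staleness threshold is also an assumption.

```csharp
using System;
using System.Collections.Generic;

public enum HealthStatus { Unknown, Healthy, Degraded, Unhealthy }

// Hypothetical stand-in for ConnectionState (only the fields this sketch needs).
public sealed class ConnState
{
    public string ConnectionId { get; init; } = "";
    public HealthStatus Status { get; set; } = HealthStatus.Unknown;
    public DateTime LastHeartbeatUtc { get; set; }
}

public sealed class StaleMarker
{
    private readonly Func<DateTime> _utcNow;   // injected clock, so tests control time
    private readonly TimeSpan _staleAfter;     // assumed threshold, e.g. a few heartbeat intervals
    private readonly List<ConnState> _connections = new();

    public StaleMarker(Func<DateTime> utcNow, TimeSpan staleAfter)
    {
        _utcNow = utcNow;
        _staleAfter = staleAfter;
    }

    public void Add(ConnState connection) => _connections.Add(connection);

    public void UpdateHeartbeat(ConnState connection, HealthStatus reported)
    {
        connection.LastHeartbeatUtc = _utcNow();
        connection.Status = reported;
    }

    // Marks every connection whose last heartbeat is older than the threshold.
    // Returns how many connections were newly marked Unhealthy.
    public int MarkStaleConnectionsUnhealthy()
    {
        var cutoff = _utcNow() - _staleAfter;
        var marked = 0;
        foreach (var c in _connections)
        {
            if (c.Status != HealthStatus.Unhealthy && c.LastHeartbeatUtc < cutoff)
            {
                c.Status = HealthStatus.Unhealthy;
                marked++;
            }
        }
        return marked;
    }
}
```

A test can capture a local `now` variable in the clock delegate and advance it between calls, which avoids any real waiting.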
### 7.3 Routing plugin tests
Write tests for `BasicRoutingPlugin`:
* Case 1: multiple connections, some unhealthy:
* Only Healthy/Degraded are considered.
* Case 2: multiple regions:
* Instances in gateway region win over others.
* Case 3: same region, different `AveragePingMs`:
* Lower latency chosen.
* Case 4: same latency, different `LastHeartbeatUtc`:
* More recent heartbeat chosen.
These tests will give you confidence that the routing logic behaves as requested and is stable as you add complexity later (streaming, cancellation, etc.).
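To make cases 2 to 4 easy to assert without a full gateway, the tier and ordering rules can be pulled into a pure function. The sketch below assumes a flattened, hypothetical `Candidate` record in place of `ConnectionState`, and assumes the health filter has already been applied; it is an illustration of the rules, not the plugin itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-in for ConnectionState used only in this sketch.
public sealed record Candidate(string Id, string Region, double AveragePingMs, DateTime LastHeartbeatUtc);

public static class TierSelection
{
    // Expects candidates that already passed the health filter.
    public static Candidate? Pick(
        IReadOnlyCollection<Candidate> healthy,
        string gatewayRegion,
        IReadOnlyCollection<string> neighborRegions)
    {
        // Tier 1: same region as the gateway; tier 2: configured neighbors; tier 3: everything else.
        var tier1 = healthy.Where(c => c.Region == gatewayRegion).ToList();
        var tier2 = healthy.Where(c => c.Region != gatewayRegion && neighborRegions.Contains(c.Region)).ToList();
        var tier3 = healthy.Except(tier1).Except(tier2).ToList();

        var chosen = tier1.Count > 0 ? tier1 : tier2.Count > 0 ? tier2 : tier3;

        // Lowest latency first (unknown latency sorts last), then the most recent heartbeat.
        return chosen
            .OrderBy(c => c.AveragePingMs <= 0 ? double.MaxValue : c.AveragePingMs)
            .ThenByDescending(c => c.LastHeartbeatUtc)
            .FirstOrDefault();
    }
}
```

Each test case then reduces to building a small candidate array and checking the `Id` of the returned record.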
---
## 8. Done criteria for “Add heartbeat, health, basic routing rules”
You can declare this step complete when:
* Microservices:
* Periodically send HEARTBEAT frames on the same connection they use for requests.
* Gateway/router:
* Updates `LastHeartbeatUtc` and `Status` on receipt of HEARTBEAT.
* Marks stale or disconnected connections as `Unhealthy` (or removes them).
* Tracks `AveragePingMs` per connection based on request/response round trips.
* Routing:
* `IRoutingPlugin` chooses instances based on:
* Strict `ServiceName` + `Version` + endpoint match.
* Health (`Healthy`/`Degraded` only).
* Region preference (gateway region > neighbors > others).
* Latency (`AveragePingMs`) then heartbeat recency.
* Tests:
* Validate heartbeats are sent and processed.
* Validate stale connections are marked unhealthy.
* Validate routing plugin picks the expected instance in simple scenarios.
Once this is in place, you have a live, health-aware routing fabric. The next logical step after this is to add **cancellation** and then **streaming + payload limits** on top of the same structures.

@@ -1,378 +0,0 @@
For this step you're wiring **request cancellation** end-to-end in the InMemory setup:
> Client / gateway gives up → gateway sends CANCEL → microservice cancels handler
No need to mix in streaming or payload limits yet; just enforce cancellation for timeouts and client disconnects.
---
## 0. Preconditions
Have in place:
* `FrameType.Cancel` in `StellaOps.Router.Common.FrameType`.
* `ITransportClient.SendCancelAsync(ConnectionState, Guid, string?)` in Common.
* Minimal InMemory path from HTTP → gateway → microservice (HELLO + REQUEST/RESPONSE) working.
If `FrameType.Cancel` or `SendCancelAsync` aren't there yet, add them first.
---
## 1. Common: cancel payload (optional, but useful)
If you want reasons attached, add a DTO in Common:
```csharp
public sealed class CancelPayload
{
public string Reason { get; init; } = string.Empty; // eg: "ClientDisconnected", "Timeout"
}
```
You'll serialize this into `Frame.Payload` when sending a CANCEL. If you don't care about reasons yet, you can skip the payload and just use the correlation id.
No behavior in Common, just the shape.
---
## 2. Gateway: trigger CANCEL on client abort and timeout
### 2.1 Extend `TransportDispatchMiddleware`
You already:
* Generate a `correlationId`.
* Build a `FrameType.Request`.
* Call `ITransportClient.SendRequestAsync(...)` and await it.
Now:
1. Create a linked CTS that combines:
* `HttpContext.RequestAborted`
* The endpoint timeout
2. Register a callback on `RequestAborted` that sends a CANCEL with the same correlationId.
3. On `OperationCanceledException` where the HTTP token is not canceled (pure timeout), send a CANCEL once and return 504.
Sketch:
```csharp
public async Task Invoke(HttpContext context, ITransportClient transportClient)
{
var decision = (RoutingDecision)context.Items[RouterHttpContextKeys.RoutingDecision]!;
var correlationId = Guid.NewGuid();
// build requestFrame as before
var timeout = decision.EffectiveTimeout;
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted);
linkedCts.CancelAfter(timeout);
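// fire-and-forget cancel on client disconnect; the registration is disposed
// when the method exits so the callback cannot fire after the request completes
using var abortRegistration = context.RequestAborted.Register(() =>
{
    _ = transportClient.SendCancelAsync(
        decision.Connection, correlationId, "ClientDisconnected");
});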
Frame responseFrame;
try
{
responseFrame = await transportClient.SendRequestAsync(
decision.Connection,
requestFrame,
timeout,
linkedCts.Token);
}
catch (OperationCanceledException) when (!context.RequestAborted.IsCancellationRequested)
{
// internal timeout
await transportClient.SendCancelAsync(
decision.Connection, correlationId, "Timeout");
context.Response.StatusCode = StatusCodes.Status504GatewayTimeout;
await context.Response.WriteAsync("Upstream timeout");
return;
}
// existing response mapping goes here
}
```
Key points:
* The gateway sends CANCEL **as soon as**:
* The client disconnects (RequestAborted).
* Or the internal timeout triggers (catch branch).
* We do not need any global correlation registry on the gateway side; the middleware has the `correlationId` and `Connection`.
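One detail the sketch above leaves open is the race where the timeout fires and the client disconnects at almost the same moment, which could send two CANCEL frames for one request. A small guard like the following keeps it to one. `CancelOnceGuard` is a hypothetical helper, and the delegate is assumed to wrap `transportClient.SendCancelAsync(decision.Connection, correlationId, reason)`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: ensures at most one CANCEL is sent per request.
public sealed class CancelOnceGuard
{
    private readonly Func<string, Task> _sendCancel; // assumed to wrap SendCancelAsync for one request
    private int _sent;                               // 0 = not sent yet, 1 = already sent

    public CancelOnceGuard(Func<string, Task> sendCancel) => _sendCancel = sendCancel;

    // Returns true only for the call that actually sent the CANCEL.
    public async Task<bool> TrySendAsync(string reason)
    {
        // Atomically claim the right to send; later callers see 1 and back off.
        if (Interlocked.Exchange(ref _sent, 1) == 1)
            return false;

        await _sendCancel(reason);
        return true;
    }
}
```

Both the `RequestAborted` callback and the timeout branch would call `TrySendAsync`; whichever arrives first wins and the other becomes a no-op.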
---
## 3. InMemory transport: propagate CANCEL to microservice
### 3.1 Implement `SendCancelAsync` in `InMemoryTransportClient` (gateway side)
In your gateway InMemory implementation:
```csharp
public Task SendCancelAsync(ConnectionState connection, Guid correlationId, string? reason = null)
{
var payload = reason is null
? Array.Empty<byte>()
: SerializeCancelPayload(new CancelPayload { Reason = reason });
var frame = new Frame
{
Type = FrameType.Cancel,
CorrelationId = correlationId,
Payload = payload
};
return _hub.SendFromGatewayAsync(connection.ConnectionId, frame, CancellationToken.None);
}
```
`_hub.SendFromGatewayAsync` must route the frame to the microservice's receive loop for that connection.
### 3.2 Hub routing
Ensure your `IInMemoryRouterHub` implementation:
* When `SendFromGatewayAsync(connectionId, cancelFrame, ct)` is called:
* Enqueues that frame onto the microservice's incoming channel (`GetFramesForMicroserviceAsync` stream).
No extra logic; just treat CANCEL like REQUEST/HELLO in terms of delivery.
---
## 4. Microservice: track in-flight requests
Now microservice needs to know **which** request to cancel when a CANCEL arrives.
### 4.1 In-flight registry
In the microservice connection class (the one doing the receive loop):
```csharp
private readonly ConcurrentDictionary<Guid, RequestExecution> _inflight =
new();
private sealed class RequestExecution
{
public CancellationTokenSource Cts { get; init; } = default!;
public Task ExecutionTask { get; init; } = default!;
}
```
When a `Request` frame arrives:
* Create a `CancellationTokenSource`.
* Start the handler using that token.
* Store both in `_inflight`.
Example pattern in `ReceiveLoopAsync`:
```csharp
private async Task ReceiveLoopAsync(CancellationToken ct)
{
await foreach (var frame in _routerClient.GetIncomingFramesAsync(ct))
{
switch (frame.Type)
{
case FrameType.Request:
HandleRequest(frame);
break;
case FrameType.Cancel:
HandleCancel(frame);
break;
// other frame types...
}
}
}
private void HandleRequest(Frame frame)
{
var cts = new CancellationTokenSource();
var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cts.Token); // later link to global shutdown if needed
var exec = new RequestExecution
{
Cts = cts,
ExecutionTask = HandleRequestCoreAsync(frame, linkedCts.Token)
};
_inflight[frame.CorrelationId] = exec;
_ = exec.ExecutionTask.ContinueWith(_ =>
{
_inflight.TryRemove(frame.CorrelationId, out _);
cts.Dispose();
linkedCts.Dispose();
}, TaskScheduler.Default);
}
```
### 4.2 Wire CancellationToken into dispatcher
`HandleRequestCoreAsync` should:
* Deserialize the request payload.
* Build a `RawRequestContext` with `CancellationToken = token`.
* Pass that token through to:
* `IRawStellaEndpoint.HandleAsync(context)` (via the context).
* Or typed handler adapter (`IStellaEndpoint<,>` / `IStellaEndpoint<TResponse>`), passing it explicitly.
Example pattern:
```csharp
private async Task HandleRequestCoreAsync(Frame frame, CancellationToken ct)
{
var req = DeserializeRequestPayload(frame.Payload);
if (!_catalog.TryGetHandler(req.Method, req.Path, out var registration))
{
var notFound = BuildNotFoundResponse(frame.CorrelationId);
await _routerClient.SendFrameAsync(notFound, ct);
return;
}
using var bodyStream = new MemoryStream(req.Body); // minimal case
var ctx = new RawRequestContext
{
Method = req.Method,
Path = req.Path,
Headers = req.Headers,
Body = bodyStream,
CancellationToken = ct
};
var handler = (IRawStellaEndpoint)_serviceProvider.GetRequiredService(registration.HandlerType);
var response = await handler.HandleAsync(ctx);
var respFrame = BuildResponseFrame(frame.CorrelationId, response);
await _routerClient.SendFrameAsync(respFrame, ct);
}
```
Now each handler sees a token that will be canceled when a CANCEL frame arrives.
### 4.3 Handle CANCEL frames
When a `Cancel` frame arrives:
```csharp
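private void HandleCancel(Frame frame)
{
    if (_inflight.TryGetValue(frame.CorrelationId, out var exec))
    {
        try
        {
            exec.Cts.Cancel();
        }
        catch (ObjectDisposedException)
        {
            // The request completed and disposed its CTS between lookup and cancel; nothing to do.
        }
    }
    // Ignore if not found (e.g. already completed)
}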
```
If you care about the reason, deserialize `CancelPayload` and log it; not required for behavior.
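Sections 4.1 to 4.3 can be exercised in isolation with a small registry that owns the per-request `CancellationTokenSource`. This is a minimal sketch under the assumption that handlers are plain `Func<CancellationToken, Task>` delegates; `InflightRegistry` is a hypothetical name, not a type from the spec.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical registry: tracks one CancellationTokenSource per in-flight correlation id.
public sealed class InflightRegistry
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _inflight = new();

    public int Count => _inflight.Count;

    // Registers the request, then runs the handler with a token that a CANCEL frame can trip.
    public Task Start(Guid correlationId, Func<CancellationToken, Task> handler)
    {
        var cts = new CancellationTokenSource();
        if (!_inflight.TryAdd(correlationId, cts))
        {
            cts.Dispose();
            throw new InvalidOperationException($"Duplicate correlation id {correlationId}.");
        }
        return RunAsync(correlationId, cts, handler);
    }

    private async Task RunAsync(Guid id, CancellationTokenSource cts, Func<CancellationToken, Task> handler)
    {
        try
        {
            await handler(cts.Token);
        }
        finally
        {
            // Always untrack and dispose, whether the handler finished, failed, or was canceled.
            _inflight.TryRemove(id, out _);
            cts.Dispose();
        }
    }

    // Returns false when the request is unknown or has already completed.
    public bool Cancel(Guid correlationId)
    {
        if (!_inflight.TryGetValue(correlationId, out var cts))
            return false;
        try
        {
            cts.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            return false; // lost the race with completion
        }
    }
}
```

The receive loop would call `Start` for `FrameType.Request` and `Cancel` for `FrameType.Cancel`, using `frame.CorrelationId` as the key in both cases.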
---
## 5. Handler guidance (for your Microservice docs)
In `Stella Ops Router Microservice.md`, add simple rules devs must follow:
* Any long-running or IO-heavy code in endpoints MUST:
* Accept a `CancellationToken` (for typed endpoints).
* Or use `RawRequestContext.CancellationToken` for raw endpoints.
* Always pass the token into:
* DB calls.
* File I/O and stream operations.
* HTTP/gRPC calls to other services.
* Do not swallow `OperationCanceledException` unless there is a good reason; normally let it bubble or treat it as a normal cancellation.
Concrete example for devs:
```csharp
[StellaEndpoint("POST", "/billing/slow-operation")]
public sealed class SlowEndpoint : IRawStellaEndpoint
{
public async Task<RawResponse> HandleAsync(RawRequestContext ctx)
{
// Correct: observe token
await Task.Delay(TimeSpan.FromMinutes(5), ctx.CancellationToken);
return new RawResponse { StatusCode = 204 };
}
}
```
---
## 6. Tests
### 6.1 Client abort → CANCEL
Test outline:
* Setup:
* Gateway + microservice wired via InMemory hub.
* Microservice endpoint that:
* Waits on `Task.Delay(TimeSpan.FromMinutes(5), ctx.CancellationToken)`.
* Test:
1. Start HTTP request to `/slow`.
2. After sending the request, cancel the client's `HttpClient` token or close the connection.
3. Assert:
* The gateway's InMemory transport sent a `FrameType.Cancel`.
* The microservice's handler is canceled (e.g. no longer running after a short time).
* No complete response is written; the HTTP side reports whatever your test harness produces when the client aborts.
### 6.2 Gateway timeout → CANCEL
* Configure endpoint timeout small (e.g. 100 ms).
* Have endpoint sleep for 5 seconds with the token.
* Assert:
* Gateway returns 504.
* Cancel frame was sent.
* Handler is canceled (task completes early).
These tests lock in the semantics so later additions (real transports, streaming) don't regress cancellation.
---
## 7. Done criteria for “Add cancellation semantics (with InMemory)”
You can mark step 7 as complete when:
* For every routed request, the gateway knows its correlationId and connection.
* On client disconnect:
* Gateway sends a `FrameType.Cancel` with that correlationId.
* On internal timeout:
* Gateway sends a `FrameType.Cancel` and returns 504 to the client.
* InMemory hub delivers CANCEL frames to the microservice.
* Microservice:
* Tracks inflight requests by correlationId.
* Cancels the proper `CancellationTokenSource` when CANCEL arrives.
* Passes the token into handlers via `RawRequestContext` and typed adapters.
* At least one automated test proves:
* Cancellation propagates from gateway to microservice and stops the handler.
Once this is done, you'll be in good shape to add streaming & payload limits on top, because the cancel path is already wired end-to-end.

@@ -1,501 +0,0 @@
For this step you're teaching the system to handle **streams** instead of always buffering, and to **enforce payload limits** so the gateway can't be DoS'd by large uploads. Still only using the InMemory transport.
Goal state:
* Gateway can stream HTTP request/response bodies to/from microservice without buffering everything.
* Gateway enforces per-call and global/in-flight payload limits.
* Microservice sees a `Stream` on `RawRequestContext.Body` and reads from it.
* All of this works over the existing InMemory “connection”.
I'll break it into concrete tasks.
---
## 0. Preconditions
Make sure you already have:
* Minimal InMemory routing working:
* HTTP → gateway → InMemory → microservice → InMemory → HTTP.
* Cancellation wired (step 7):
* `FrameType.Cancel`.
* `ITransportClient.SendCancelAsync` implemented for InMemory.
* Microservice uses `CancellationToken` in `RawRequestContext`.
Then layer streaming & limits on top.
---
## 1. Confirm / finalize Common primitives for streaming & limits
**Project:** `StellaOps.Router.Common`
Tasks:
1. Ensure `FrameType` has:
```csharp
public enum FrameType : byte
{
Hello = 1,
Heartbeat = 2,
EndpointsUpdate = 3,
Request = 4,
RequestStreamData = 5,
Response = 6,
ResponseStreamData = 7,
Cancel = 8
}
```
You may not *use* `RequestStreamData` / `ResponseStreamData` in InMemory implementation initially if you choose the bridging approach, but having them defined keeps the model coherent.
2. Ensure `EndpointDescriptor` has:
```csharp
public bool SupportsStreaming { get; init; }
```
3. Ensure `PayloadLimits` type exists (in Common or Config, but referenced by both):
```csharp
public sealed class PayloadLimits
{
public long MaxRequestBytesPerCall { get; set; } // per HTTP request
public long MaxRequestBytesPerConnection { get; set; } // per microservice connection
public long MaxAggregateInflightBytes { get; set; } // across all requests
}
```
4. `ITransportClient` already contains:
```csharp
Task SendStreamingAsync(
ConnectionState connection,
Frame requestHeader,
Stream requestBody,
Func<Stream, Task> readResponseBody,
PayloadLimits limits,
CancellationToken ct);
```
If not, add it now (implementation will be InMemory-only for this step).
No logic in Common; just shapes.
---
## 2. Gateway: payload budget tracker
You need a small service in the gateway that tracks inflight bytes to enforce limits.
**Project:** `StellaOps.Gateway.WebService`
### 2.1 Define a budget interface
```csharp
public interface IPayloadBudget
{
bool TryReserve(string connectionId, Guid requestId, long bytes);
void Release(string connectionId, Guid requestId, long bytes);
}
```
### 2.2 Implement a simple in-memory tracker
Implementation outline:
* Track:
* `long _globalInflightBytes`.
* `Dictionary<string,long> _perConnectionInflightBytes`.
* `Dictionary<Guid,long> _perRequestInflightBytes`.
All updated under a lock or `ConcurrentDictionary` + `Interlocked`.
Logic for `TryReserve`:
* Compute proposed:
* `newGlobal = _globalInflightBytes + bytes`
* `newConn = perConnection[connectionId] + bytes`
* `newReq = perRequest[requestId] + bytes`
* If any exceed configured limits (`PayloadLimits` from config), return `false`.
* Else:
* Commit updates and return `true`.
`Release` subtracts the bytes, never going below zero.
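The outline above could be implemented roughly as follows. This is a sketch under a few assumptions: it takes the three limits as plain `long` constructor arguments instead of binding `PayloadLimits` from config, it uses a single lock rather than `ConcurrentDictionary` + `Interlocked`, and it is named `PayloadBudgetSketch` so it stays self-contained (the real class would implement `IPayloadBudget`).

```csharp
using System;
using System.Collections.Generic;

// Sketch of an in-memory budget tracker; limits are passed in directly for illustration.
public sealed class PayloadBudgetSketch
{
    private readonly object _gate = new();
    private readonly long _maxPerCall;
    private readonly long _maxPerConnection;
    private readonly long _maxAggregate;
    private readonly Dictionary<string, long> _perConnection = new();
    private readonly Dictionary<Guid, long> _perRequest = new();
    private long _global;

    public PayloadBudgetSketch(long maxPerCall, long maxPerConnection, long maxAggregate)
    {
        _maxPerCall = maxPerCall;
        _maxPerConnection = maxPerConnection;
        _maxAggregate = maxAggregate;
    }

    public bool TryReserve(string connectionId, Guid requestId, long bytes)
    {
        if (bytes < 0) throw new ArgumentOutOfRangeException(nameof(bytes));
        lock (_gate)
        {
            // Missing keys read as 0, so first reservations need no special case.
            _perConnection.TryGetValue(connectionId, out var conn);
            _perRequest.TryGetValue(requestId, out var req);

            var newGlobal = _global + bytes;
            var newConn = conn + bytes;
            var newReq = req + bytes;

            // Reject without committing anything if any limit would be exceeded.
            if (newGlobal > _maxAggregate || newConn > _maxPerConnection || newReq > _maxPerCall)
                return false;

            _global = newGlobal;
            _perConnection[connectionId] = newConn;
            _perRequest[requestId] = newReq;
            return true;
        }
    }

    public void Release(string connectionId, Guid requestId, long bytes)
    {
        lock (_gate)
        {
            _global = Math.Max(0, _global - bytes);
            Decrement(_perConnection, connectionId, bytes);
            Decrement(_perRequest, requestId, bytes);
        }
    }

    // Subtracts without going below zero and drops empty entries so the maps do not grow forever.
    private static void Decrement<TKey>(Dictionary<TKey, long> map, TKey key, long bytes)
        where TKey : notnull
    {
        if (!map.TryGetValue(key, out var current)) return;
        var next = Math.Max(0, current - bytes);
        if (next == 0) map.Remove(key);
        else map[key] = next;
    }
}
```

A single lock keeps the three counters consistent with each other, which is harder to guarantee with independent atomic updates; if contention ever shows up in profiling, that trade-off can be revisited.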
Register in DI:
```csharp
services.AddSingleton<IPayloadBudget, PayloadBudget>();
```
---
## 3. Gateway: choose buffered vs streaming path
Extend `TransportDispatchMiddleware` to branch on mode.
**Project:** `StellaOps.Gateway.WebService`
### 3.1 Decide mode
At the start of the middleware:
```csharp
var decision = (RoutingDecision)context.Items[RouterHttpContextKeys.RoutingDecision]!;
var endpoint = decision.Endpoint;
var limits = _options.Value.PayloadLimits; // from RouterConfig
var supportsStreaming = endpoint.SupportsStreaming;
var hasKnownLength = context.Request.ContentLength.HasValue;
var contentLength = context.Request.ContentLength ?? -1;
// Simple rule for now:
var useStreaming =
supportsStreaming &&
(!hasKnownLength || contentLength > limits.MaxRequestBytesPerCall);
```
* If `useStreaming == false`:
* Use buffered path with hard size checks.
* If `useStreaming == true`:
* Use streaming path (`ITransportClient.SendStreamingAsync`).
---
## 4. Gateway: buffered path with limits
**Still in `TransportDispatchMiddleware`**
### 4.1 Early 413 check
When `supportsStreaming == false`:
1. If `Content-Length` is known and exceeds the per-call limit, reject immediately:
```csharp
if (hasKnownLength && contentLength > limits.MaxRequestBytesPerCall)
{
context.Response.StatusCode = StatusCodes.Status413PayloadTooLarge;
return;
}
```
2. When reading body into memory:
* Read in chunks.
* Track `bytesReadThisCall`.
* If `bytesReadThisCall > limits.MaxRequestBytesPerCall`, abort and return 413.
You don't have to call `IPayloadBudget` for buffered mode yet; you can, but the hard per-call limit already protects RAM for this step.
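The chunked read with a running byte count might look like the sketch below. `BoundedBodyReader` is a hypothetical helper; returning `null` to mean "too large, respond 413" is just one possible convention, and the chunk size is arbitrary.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for the buffered path: reads the body in chunks and stops at the limit.
public static class BoundedBodyReader
{
    // Returns the buffered body, or null when the body exceeds maxBytes
    // (the caller would then respond with 413).
    public static async Task<byte[]?> ReadAsync(Stream body, long maxBytes, CancellationToken ct)
    {
        using var buffered = new MemoryStream();
        var chunk = new byte[16 * 1024];
        long bytesReadThisCall = 0;
        int read;

        while ((read = await body.ReadAsync(chunk.AsMemory(), ct)) > 0)
        {
            bytesReadThisCall += read;
            if (bytesReadThisCall > maxBytes)
                return null; // abort early; do not keep buffering an oversized body

            buffered.Write(chunk, 0, read);
        }

        return buffered.ToArray();
    }
}
```

This also covers requests without a `Content-Length` header, where the early check cannot fire and the limit can only be detected while reading.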
Buffered path then proceeds as before:
* Build `MinimalRequestPayload` with full body.
* Send via `SendRequestAsync`.
* Map response.
---
## 5. Gateway: streaming path (InMemory)
This is the new part.
### 5.1 Use `ITransportClient.SendStreamingAsync`
In the `useStreaming == true` branch:
```csharp
var correlationId = Guid.NewGuid();
var headerPayload = new MinimalRequestPayload
{
Method = context.Request.Method,
Path = context.Request.Path.ToString(),
Headers = ExtractHeaders(context.Request),
Body = Array.Empty<byte>(), // streaming body will follow
IsStreaming = true // add this flag to your payload DTO
};
var headerFrame = new Frame
{
Type = FrameType.Request,
CorrelationId = correlationId,
Payload = SerializeRequestPayload(headerPayload)
};
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted);
linkedCts.CancelAfter(decision.EffectiveTimeout);
// register cancel → SendCancelAsync (already done in step 7)
await _transportClient.SendStreamingAsync(
decision.Connection,
headerFrame,
context.Request.Body,
async responseBodyStream =>
{
// Copy microservice stream directly to HTTP response
await responseBodyStream.CopyToAsync(context.Response.Body, linkedCts.Token);
},
limits,
linkedCts.Token);
```
Key points:
* Streaming path does not buffer the whole body.
* Limits and cancellation are enforced inside `SendStreamingAsync`.
---
## 6. InMemory transport: streaming implementation
**Project:** gateway side InMemory `ITransportClient` implementation and InMemory router hub; microservice side connection.
For InMemory, you can model streaming via **bridged streams**: a producer/consumer pair in memory.
### 6.1 Add streaming call to InMemory client
In `InMemoryTransportClient`:
```csharp
public async Task SendStreamingAsync(
ConnectionState connection,
Frame requestHeader,
Stream httpRequestBody,
Func<Stream, Task> readResponseBody,
PayloadLimits limits,
CancellationToken ct)
{
await _hub.StreamFromGatewayAsync(
connection.ConnectionId,
requestHeader,
httpRequestBody,
readResponseBody,
limits,
ct);
}
```
Expose `StreamFromGatewayAsync` on `IInMemoryRouterHub`:
```csharp
Task StreamFromGatewayAsync(
string connectionId,
Frame requestHeader,
Stream requestBody,
Func<Stream, Task> readResponseBody,
PayloadLimits limits,
CancellationToken ct);
```
### 6.2 InMemory hub streaming strategy (bridging style)
Inside `StreamFromGatewayAsync`:
1. Create a **pair of connected streams** for request body:
* e.g., a custom `ProducerConsumerStream` built on a `Channel<byte[]>` or `System.IO.Pipelines`.
* “Producer” side (writer) will be fed from HTTP.
* “Consumer” side will be given to the microservice as `RawRequestContext.Body`.
2. Create a **pair of connected streams** for response body:
* “Consumer” side will be used in `readResponseBody` to write to HTTP.
* “Producer” side will be given to the microservice handler to write response body.
3. On the microservice side:
* Build a `RawRequestContext` with `Body = requestBodyConsumerStream` and `CancellationToken = ct`.
* Dispatch to the endpoint handler as usual.
* Have the handler's `RawResponse.WriteBodyAsync` pointed at `responseBodyProducerStream`.
4. Parallel tasks:
* Task 1: Copy HTTP → `requestBodyProducerStream` in chunks, enforcing `PayloadLimits` (see next section).
* Task 2: Execute the handler, which reads from `Body` and writes to `responseBodyProducerStream`.
* Task 3: Copy `responseBodyConsumerStream` → HTTP via `readResponseBody`.
5. Propagate cancellation:
* If `ct` is canceled (client disconnect/timeout/payload limit breach):
* Stop HTTP→requestBody copy.
* Signal stream completion / cancellation to handler.
* Handler should see cancellation via `CancellationToken`.
Because this is InMemory, you don't *have* to materialize explicit `RequestStreamData` frames; you only need the behavior. Real transports will implement the same semantics with actual frames.
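For the "pair of connected streams" in steps 1 and 2, one option is `System.IO.Pipelines`, which already exposes both ends of a `Pipe` as `Stream`s. The following is a minimal sketch, assuming the `System.IO.Pipelines` package is available to the project; `BridgedStreams` and `DemoAsync` are hypothetical names, and the demo only shows the request direction.

```csharp
using System;
using System.IO;
using System.IO.Pipelines;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: builds an in-memory producer/consumer stream pair on top of a Pipe.
public static class BridgedStreams
{
    // Bytes written to Producer become readable from Consumer.
    public static (Stream Producer, Stream Consumer) Create()
    {
        var pipe = new Pipe();
        return (pipe.Writer.AsStream(), pipe.Reader.AsStream());
    }

    // Demonstrates the request direction: one task feeds the producer, the caller drains the consumer.
    public static async Task<long> DemoAsync(byte[] payload, CancellationToken ct)
    {
        var (producer, consumer) = Create();

        // "Task 1": copy the payload in, then dispose the producer to signal end-of-stream.
        var feed = Task.Run(async () =>
        {
            await producer.WriteAsync(payload.AsMemory(), ct);
            await producer.DisposeAsync();
        }, ct);

        // "Task 2": what a handler would do with RawRequestContext.Body, reading until EOF.
        long total = 0;
        var buffer = new byte[4096];
        int read;
        while ((read = await consumer.ReadAsync(buffer.AsMemory(), ct)) > 0)
            total += read;

        await feed;
        await consumer.DisposeAsync();
        return total;
    }
}
```

A `Pipe` applies backpressure by default, so a slow handler naturally slows the HTTP copy instead of letting the whole body pile up in memory; a custom `Channel<byte[]>`-based stream would need to reproduce that behavior explicitly.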
---
## 7. Enforce payload limits in streaming copy
Still in `StreamFromGatewayAsync` / InMemory side:
### 7.1 HTTP → microservice copy with budget
In Task 1:
```csharp
var buffer = new byte[64 * 1024];
var requestId = requestHeader.CorrelationId;
long totalBytesReserved = 0; // connectionId below is the StreamFromGatewayAsync parameter
try
{
    int read;
    while ((read = await httpRequestBody.ReadAsync(buffer.AsMemory(), ct)) > 0)
    {
        if (!_budget.TryReserve(connectionId, requestId, read))
        {
            // Limit exceeded: signal failure (or call SendCancelAsync)
            if (_cancelCallback is not null)
                await _cancelCallback(requestId, "PayloadLimitExceeded");
            break;
        }
        totalBytesReserved += read;
        await requestBodyProducerStream.WriteAsync(buffer.AsMemory(0, read), ct);
    }
    await requestBodyProducerStream.FlushAsync(ct);
}
finally
{
    // Release whatever was reserved, even on cancellation or error
    _budget.Release(connectionId, requestId, totalBytesReserved);
    await requestBodyProducerStream.DisposeAsync();
}
If `TryReserve` fails:
* Stop reading further bytes.
* Trigger cancellation downstream:
* Either call the existing `SendCancelAsync` path.
* Or signal completion with error and let handler catch cancellation.
Gateway side should then translate this into 413 or 503 to the client.
### 7.2 Response copy
The response path doesn't need budget tracking (the danger is inbound to the gateway); but if you want symmetry, you can also enforce a max outbound size.
For now, just stream microservice → HTTP through `readResponseBody` until EOF or cancellation.
---
## 8. Microservice side: streaming-aware `RawRequestContext.Body`
Your streaming bridging already gives the handler a `Stream` that reads what the gateway sends:
* No changes required in handler interfaces.
* You only need to ensure:
* `RawRequestContext.Body` **may be non-seekable**.
* Handlers know they must treat it as a forward-only stream.
Guidance for devs in `Microservice.md`:
* For binary uploads or large files, implement `IRawStellaEndpoint` and read incrementally:
```csharp
[StellaEndpoint("POST", "/billing/invoices/upload")]
public sealed class InvoiceUploadEndpoint : IRawStellaEndpoint
{
public async Task<RawResponse> HandleAsync(RawRequestContext ctx)
{
var buffer = new byte[64 * 1024];
int read;
while ((read = await ctx.Body.ReadAsync(buffer.AsMemory(0, buffer.Length), ctx.CancellationToken)) > 0)
{
// Process chunk
}
return new RawResponse { StatusCode = 204 };
}
}
```
---
## 9. Tests
**Scope:** still InMemory, but now streaming & limits.
### 9.1 Streaming happy path
* Setup:
* Endpoint with `SupportsStreaming = true`.
* `IRawStellaEndpoint` that:
* Counts total bytes read from `ctx.Body`.
* Returns 200.
* Test:
* Send an HTTP POST with a body larger than `MaxRequestBytesPerCall`, but with streaming enabled.
* Assert:
* Gateway does **not** buffer entire body in one array (you can assert via instrumentation or at least confirm no 413).
* Handler sees the full number of bytes.
* Response is 200.
### 9.2 Per-call limit breach
* Configure:
* `SupportsStreaming = false` (or use streaming but set low `MaxRequestBytesPerCall`).
* Test:
* Send a body larger than limit.
* Assert:
* Gateway responds 413.
* Handler is not invoked at all.
### 9.3 Global/in-flight limit breach
* Configure:
* `MaxAggregateInflightBytes` very low (e.g. 1 MB).
* Test:
* Start multiple concurrent streaming requests that each try to send more than the allowed total.
* Assert:
* Some of them get a CANCEL / error (413 or 503).
* `IPayloadBudget` denies reservations and releases resources correctly.
---
## 10. Done criteria for “Add streaming & payload limits (InMemory)”
You're done with this step when:
* Gateway:
* Chooses buffered vs streaming based on `EndpointDescriptor.SupportsStreaming` and size.
* Enforces `MaxRequestBytesPerCall` for buffered requests (413 on violation).
* Uses `ITransportClient.SendStreamingAsync` for streaming.
* Has an `IPayloadBudget` preventing excessive in-flight payload accumulation.
* InMemory transport:
* Implements `SendStreamingAsync` by bridging HTTP streams to microservice handlers and back.
* Enforces payload limits while copying.
* Microservice:
* Receives a functional `Stream` in `RawRequestContext.Body`.
* Can implement `IRawStellaEndpoint` that reads incrementally for large payloads.
* Tests:
* Demonstrate a streaming endpoint works for large payloads.
* Demonstrate per-call and aggregate limits are respected and cause rejections/cancellations.
After this, you can reuse the same semantics when you implement real transports (TCP/TLS/RabbitMQ), with InMemory as your reference implementation.

@@ -1,562 +0,0 @@
For this step you're taking the protocol you already proved with InMemory and putting it on real transports:
* TCP (baseline)
* Certificate/TLS (secure TCP)
* UDP (small, non-streaming)
* RabbitMQ
The idea: every plugin implements the same `Frame` semantics (HELLO/HEARTBEAT/REQUEST/RESPONSE/CANCEL, plus streaming where supported), and the gateway/microservices don't change their business logic at all.
I'll structure this as a sequence of sub-steps you can execute in order.
---
## 0. Preconditions
Before you start adding real transports, make sure:
* Frame model is stable in `StellaOps.Router.Common`:
* `Frame`, `FrameType`, `TransportType`.
* Microservice and gateway code use **only**:
* `ITransportClient` to send (gateway side).
* `ITransportServer` / connection abstractions to receive (gateway side).
* `IMicroserviceConnection` + `ITransportClient` under the hood (microservice side).
* InMemory transport is working with:
* HELLO
* REQUEST / RESPONSE
* CANCEL
* Streaming & payload limits (step 8)
If any code still directly talks to “InMemoryRouterHub” from app logic, hide it behind the `ITransportClient` / `ITransportServer` abstractions first.
---
## 1. Freeze the wire protocol and serializer
**Owner:** protocol / infra dev
Before touching sockets or RabbitMQ, lock down **how a `Frame` is encoded** on the wire. This must be consistent across all transports except InMemory (which can cheat a bit internally).
### 1.1 Frame header
Define a simple binary header; for example:
* 1 byte: `FrameType`
* 16 bytes: `CorrelationId` (`Guid`)
* 4 bytes: payload length (`int32`, big- or little-endian, but be consistent)
Total header = 21 bytes. Then `payloadLength` bytes follow.
You can evolve later but start with something simple.
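To make the 21-byte layout concrete, here is a minimal sketch of encoding and decoding the header. It picks big-endian for the length field as an assumption (the text only asks for consistency) and uses .NET's own `Guid` byte layout, which a non-.NET peer would need to match explicitly. `FrameHeaderCodec` is a hypothetical name.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical codec for the header described above: 1 byte type, 16 bytes Guid, 4 bytes length.
public static class FrameHeaderCodec
{
    public const int HeaderSize = 1 + 16 + 4;

    public static void Write(Span<byte> destination, byte frameType, Guid correlationId, int payloadLength)
    {
        if (destination.Length < HeaderSize)
            throw new ArgumentException("Buffer too small for a frame header.", nameof(destination));
        if (payloadLength < 0)
            throw new ArgumentOutOfRangeException(nameof(payloadLength));

        destination[0] = frameType;

        // Guid bytes use .NET's layout; both ends must agree on it.
        if (!correlationId.TryWriteBytes(destination.Slice(1, 16)))
            throw new InvalidOperationException("Could not write the correlation id.");

        // Assumption: big-endian length; little-endian works equally well if used everywhere.
        BinaryPrimitives.WriteInt32BigEndian(destination.Slice(17, 4), payloadLength);
    }

    public static (byte FrameType, Guid CorrelationId, int PayloadLength) Read(ReadOnlySpan<byte> source)
    {
        if (source.Length < HeaderSize)
            throw new ArgumentException("Header is truncated.", nameof(source));

        var length = BinaryPrimitives.ReadInt32BigEndian(source.Slice(17, 4));
        if (length < 0)
            throw new InvalidOperationException("Negative payload length in header.");

        return (source[0], new Guid(source.Slice(1, 16)), length);
    }
}
```

A reader should also cap the decoded length against a configured maximum before allocating a payload buffer, otherwise a corrupt or hostile header can request a very large allocation.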
### 1.2 Frame serializer
In a small shared, **non-ASP.NET** assembly (either Common or a new `StellaOps.Router.Protocol` library), implement:
```csharp
public interface IFrameSerializer
{
void WriteFrame(Frame frame, Stream stream, CancellationToken ct);
Task WriteFrameAsync(Frame frame, Stream stream, CancellationToken ct);
Frame ReadFrame(Stream stream, CancellationToken ct);
Task<Frame> ReadFrameAsync(Stream stream, CancellationToken ct);
}
```
Implementation:
* Writes header then payload.
* Reads header then payload; throws on EOF.
For payloads (HELLO, HEARTBEAT, etc.), use one encoding consistently (e.g. `System.Text.Json` for now) and **centralize** DTO ⇒ `byte[]` conversions:
```csharp
public static class PayloadCodec
{
public static byte[] Encode<T>(T payload) { ... }
public static T Decode<T>(byte[] bytes) { ... }
}
```
All transports use `IFrameSerializer` + `PayloadCodec`.
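One possible shape for the codec, assuming `System.Text.Json` as suggested above, is sketched below. It is named `JsonPayloadCodec` to make clear it is an illustration rather than the final `PayloadCodec`; the serializer options and the `HeartbeatSample` record are assumptions made only to exercise it.

```csharp
using System;
using System.Text.Json;

// Illustrative JSON-based codec; option choices here are assumptions, not part of the spec.
public static class JsonPayloadCodec
{
    // Shared options so every transport encodes payloads identically.
    private static readonly JsonSerializerOptions Options = new(JsonSerializerDefaults.Web);

    public static byte[] Encode<T>(T payload) =>
        JsonSerializer.SerializeToUtf8Bytes(payload, Options);

    public static T Decode<T>(byte[] bytes) =>
        JsonSerializer.Deserialize<T>(bytes, Options)
            ?? throw new InvalidOperationException($"Payload decoded to null for {typeof(T).Name}.");
}

// Hypothetical payload used only to exercise the codec.
public sealed record HeartbeatSample(string InstanceId, string Status);
```

Keeping the options in one place matters more than the specific choice: if one transport serialized with different naming or null handling, frames would stop being interchangeable between plugins.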
---
## 2. Introduce a transport registry / resolver
**Projects:** gateway + microservice
**Owner:** infra dev
You need a way to map `TransportType` to a concrete plugin.
### 2.1 Gateway side
Define:
```csharp
public interface ITransportClientResolver
{
ITransportClient GetClient(TransportType transportType);
}
public interface ITransportServerFactory
{
ITransportServer CreateServer(TransportType transportType);
}
```
Initial implementation:
* Registers the available clients:
```csharp
public sealed class TransportClientResolver : ITransportClientResolver
{
private readonly IServiceProvider _sp;
public TransportClientResolver(IServiceProvider sp) => _sp = sp;
public ITransportClient GetClient(TransportType transportType) =>
transportType switch
{
TransportType.Tcp => _sp.GetRequiredService<TcpTransportClient>(),
TransportType.Certificate => _sp.GetRequiredService<TlsTransportClient>(),
TransportType.Udp => _sp.GetRequiredService<UdpTransportClient>(),
TransportType.RabbitMq => _sp.GetRequiredService<RabbitMqTransportClient>(),
_ => throw new NotSupportedException($"Transport {transportType} not supported.")
};
}
```
Then in `TransportDispatchMiddleware`, instead of injecting a single `ITransportClient`, inject `ITransportClientResolver` and choose:
```csharp
var client = clientResolver.GetClient(decision.TransportType);
```
### 2.2 Microservice side
On the microservice, you can do something similar:
```csharp
internal interface IMicroserviceTransportConnector
{
Task ConnectAsync(StellaMicroserviceOptions options, CancellationToken ct);
}
```
Implement one per transport type; later `StellaMicroserviceOptions.Routers` will determine which transport to use for each router endpoint.
---
## 3. Implement plugin 1: TCP
Start with TCP; it's the most straightforward and will largely mirror your InMemory behavior.
### 3.1 Gateway: `TcpTransportServer`
**Project:** `StellaOps.Gateway.WebService` or a transport sub-namespace.
Responsibilities:
* Listen on a configured TCP port (e.g. from `RouterConfig`).
* Accept connections, each mapping to a `ConnectionId`.
* For each connection:
  * Start a background receive loop:
    * Use `IFrameSerializer.ReadFrameAsync` on a `NetworkStream`.
    * On `FrameType.Hello`:
      * Deserialize HELLO payload.
      * Build a `ConnectionState` and register with `IGlobalRoutingState`.
    * On `FrameType.Heartbeat`:
      * Update heartbeat for that `ConnectionId`.
    * On `FrameType.Response` or `ResponseStreamData`:
      * Push frame into the gateway's correlation / streaming handler (similar to InMemory path).
    * On `FrameType.Cancel` (rare from microservice):
      * Optionally implement; can be ignored for now.
  * Provide a sending API to the matching `TcpTransportClient` (gateway-side) using `WriteFrameAsync`.
You will likely have:
* A `TcpConnectionContext` per connected microservice:
* Holds `ConnectionId`, `TcpClient`, `NetworkStream`, `TaskCompletionSource` maps for correlation IDs.
### 3.2 Gateway: `TcpTransportClient` (gateway-side, to microservices)
Implements `ITransportClient`:
* `SendRequestAsync`:
  * Given `ConnectionState`:
    * Get the associated `TcpConnectionContext`.
    * Register a `TaskCompletionSource<Frame>` keyed by `CorrelationId`.
    * Call `WriteFrameAsync(requestFrame)` on the connection's stream.
    * Await the TCS, which is completed in the receive loop when a `Response` frame arrives.
* `SendStreamingAsync`:
  * Write header `FrameType.Request`.
  * Read from `BudgetedRequestStream` in chunks:
    * For TCP plugin you can either:
      * Use `RequestStreamData` frames with chunk payloads, or
      * Keep the simple bridging approach and send a single `Request` with all body bytes.
  * Since you already validated streaming semantics with InMemory, you can decide:
    * For first version of TCP, **only support buffered data**, then add chunk frames later.
* `SendCancelAsync`:
  * Write a `FrameType.Cancel` frame with the same `CorrelationId`.
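The register/await/complete handshake between `SendRequestAsync` and the receive loop can be sketched as below. This is only an illustration under assumptions: `Frame` is reduced to a hypothetical two-field record, the correlation ID is assumed to be a `Guid`, and `PendingRequests` is an invented helper name, not something the spec defines.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for the real Frame type.
public sealed record Frame(Guid CorrelationId, byte[] Payload);

// Hypothetical helper: tracks in-flight requests for one connection.
public sealed class PendingRequests
{
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<Frame>> _pending = new();

    // Called by SendRequestAsync before the request frame is written.
    public Task<Frame> Register(Guid correlationId, CancellationToken ct)
    {
        var tcs = new TaskCompletionSource<Frame>(TaskCreationOptions.RunContinuationsAsynchronously);
        if (!_pending.TryAdd(correlationId, tcs))
            throw new InvalidOperationException($"Duplicate correlation ID {correlationId}.");

        // If the caller cancels, drop the entry and cancel the awaiting task.
        var registration = ct.Register(() =>
        {
            if (_pending.TryRemove(correlationId, out var removed))
                removed.TrySetCanceled(ct);
        });

        // Release the cancellation registration once the request finishes either way.
        _ = tcs.Task.ContinueWith(t => registration.Dispose(), TaskScheduler.Default);
        return tcs.Task;
    }

    // Called by the receive loop when a Response frame arrives.
    // Returns false for unknown or already-completed correlation IDs.
    public bool TryComplete(Frame response) =>
        _pending.TryRemove(response.CorrelationId, out var tcs) && tcs.TrySetResult(response);
}
```

A late or duplicate response simply returns `false` from `TryComplete`, so the receive loop can log and drop it instead of throwing.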
### 3.3 Microservice: `TcpTransportClientConnection`
**Project:** `StellaOps.Microservice`
Responsibilities on microservice side:
* For each `RouterEndpointConfig` where `TransportType == Tcp`:
* Open a `TcpClient` to `Host:Port`.
* Use `IFrameSerializer` to send:
* `HELLO` frame (payload = identity + descriptors).
* Periodic `HEARTBEAT` frames.
* `RESPONSE` frames for incoming `REQUEST`s.
* Receive loop:
* `ReadFrameAsync` from `NetworkStream`.
* On `REQUEST`:
* Dispatch through `IEndpointDispatcher`.
* For minimal streaming, treat payload as buffered; you'll align with streaming later.
* On `CANCEL`:
* Use correlation ID to cancel the `CancellationTokenSource` you already maintain.
This is conceptually the same as InMemory but using real sockets.
---
## 4. Implement plugin 2: Certificate/TLS
Build TLS on top of TCP plugin; do not fork logic unnecessarily.
### 4.1 Gateway: `TlsTransportServer`
* Wrap accepted `TcpClient` sockets in `SslStream`.
* Load server certificate from configuration (for the node/region).
* Authenticate client if you want mutual TLS.
Structure:
* Reuse almost all of `TcpTransportServer` logic, but instead of `NetworkStream` you use `SslStream` as the underlying stream for `IFrameSerializer`.
### 4.2 Microservice: `TlsTransportClientConnection`
* Instead of plain `TcpClient.GetStream`, wrap in `SslStream`.
* Authenticate server (hostname & certificate).
* Optional: present client certificate.
Configuration fields in `RouterEndpointConfig` (or a TLS-specific sub-config):
* `UseTls` / `TransportType.Certificate`.
* Certificate paths / thumbprints / validation parameters.
At the SDK level, you just treat it as a different transport type; protocol remains identical.
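Because only the stream type changes, the TLS layer can be a thin wrapper. The sketch below shows one way to do that with `SslStream`; `TlsStreamFactory` and its method names are hypothetical, and certificate loading, revocation policy, and client-certificate validation are left to your configuration.

```csharp
using System.IO;
using System.Net.Security;
using System.Net.Sockets;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;

// Hypothetical helper; the rest of the TCP plugin keeps working on a plain Stream.
public static class TlsStreamFactory
{
    // Gateway side: wrap an accepted TcpClient and authenticate as the server.
    public static async Task<Stream> WrapServerAsync(
        TcpClient client, X509Certificate2 serverCertificate, bool requireClientCertificate)
    {
        var ssl = new SslStream(client.GetStream(), leaveInnerStreamOpen: false);
        await ssl.AuthenticateAsServerAsync(
            serverCertificate,
            clientCertificateRequired: requireClientCertificate, // true enables mutual TLS
            checkCertificateRevocation: true);
        return ssl; // hand this to IFrameSerializer exactly like a NetworkStream
    }

    // Microservice side: targetHost must match the name in the server certificate.
    public static async Task<Stream> WrapClientAsync(TcpClient client, string targetHost)
    {
        var ssl = new SslStream(client.GetStream(), leaveInnerStreamOpen: false);
        await ssl.AuthenticateAsClientAsync(targetHost);
        return ssl;
    }
}
```

Returning `Stream` rather than `SslStream` keeps the frame serializer unaware of whether TLS is in play, which is the "do not fork logic" goal stated above.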
---
## 5. Implement plugin 3: UDP (small, non-streaming)
UDP is only for small, bounded payloads. No streaming, best-effort delivery.
### 5.1 Constraints
* Use UDP **only** for buffered, small payload endpoints.
* No streaming (`SupportsStreaming` must be `false` for UDP endpoints).
* No guarantee of delivery or ordering; caller must tolerate occasional failures/timeouts.
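These constraints are easiest to enforce with a guard that runs before a request is handed to the UDP client. The sketch below is an assumption-heavy illustration: `UdpConstraints` is a hypothetical name and the 8 KiB limit is an arbitrary placeholder, since the spec does not fix a maximum datagram size.

```csharp
using System;

// Hypothetical guard; call it before dispatching a request over UDP.
public static class UdpConstraints
{
    // Placeholder budget, not a value from the spec.
    public const int MaxPayloadBytes = 8 * 1024;

    public static void EnsureCompatible(bool supportsStreaming, int payloadBytes)
    {
        if (supportsStreaming)
            throw new NotSupportedException("Streaming endpoints cannot be routed over UDP.");

        if (payloadBytes > MaxPayloadBytes)
            throw new InvalidOperationException(
                $"Payload of {payloadBytes} bytes exceeds the UDP limit of {MaxPayloadBytes} bytes.");
    }
}
```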
### 5.2 Gateway: `UdpTransportServer`
Responsibilities:
* Listen on a UDP port.
* Parse each incoming datagram as a full `Frame`:
* `FrameType.Hello`:
* Register a “logical connection” keyed by `(remoteEndpoint)` and `InstanceId`.
* `FrameType.Heartbeat`:
* Update health for that logical connection.
* `FrameType.Response`:
* Use `CorrelationId` and “connectionId” to complete a `TaskCompletionSource` as with TCP.
Because UDP is connectionless, your `ConnectionId` can be:
* A composite of microservice identity + remote endpoint, e.g. `"{instanceId}@{ip}:{port}"`.
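A composite ID of that shape could be built as follows. `UdpConnectionIds` is a hypothetical helper name; the format string is the one suggested above.

```csharp
using System.Net;

// Hypothetical helper for deriving a logical connection ID from a datagram's sender.
public static class UdpConnectionIds
{
    // IPEndPoint.ToString() yields "ip:port" (with brackets around IPv6 addresses).
    public static string Create(string instanceId, IPEndPoint remote) =>
        $"{instanceId}@{remote}";
}
```

Note that a microservice that rebinds to a different local port will appear as a new logical connection, so stale entries still need heartbeat-based expiry.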
### 5.3 Gateway: `UdpTransportClient` (gateway-side)
`SendRequestAsync`:
* Serialize `Frame` to `byte[]`.
* Send via `UdpClient.SendAsync` to the remote endpoint from `ConnectionState`.
* Start a timer:
* Wait for `Response` datagram with matching `CorrelationId`.
* If none comes within timeout → throw `OperationCanceledException`.
`SendStreamingAsync`:
* For this first iteration, **throw NotSupportedException**.
* Router should not route streaming endpoints over UDP; your routing config should enforce that.
`SendCancelAsync`:
* Optionally send a CANCEL datagram; but in practice, if requests are small, this is less useful. You can still implement it for symmetry.
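The wait-with-timeout step in `SendRequestAsync` might look like the sketch below. It assumes .NET 6 or later for `Task.WaitAsync`, and `ResponseTimeouts` is a hypothetical helper name. On timeout the awaited call throws an `OperationCanceledException` (or a subclass), matching the behavior described above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: bounds how long the gateway waits for a Response datagram.
public static class ResponseTimeouts
{
    public static async Task<T> AwaitWithTimeoutAsync<T>(
        Task<T> pending, TimeSpan timeout, CancellationToken ct)
    {
        // Link the caller's token with a timer so either one ends the wait.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(timeout);

        // Task.WaitAsync (.NET 6+) stops waiting when the token fires;
        // it does not cancel the underlying task itself.
        return await pending.WaitAsync(cts.Token);
    }
}
```

Since the underlying task keeps running after a timeout, the caller should also remove the pending correlation entry so a late datagram is dropped.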
### 5.4 Microservice: UDP connection
For microservice side:
* A single `UdpClient` bound to a local port.
* For each configured router (host/port):
  * HELLO: send a `FrameType.Hello` datagram.
  * HEARTBEAT: send periodic `FrameType.Heartbeat`.
  * REQUEST handling: with TCP the microservice dials the gateway and requests flow back over that connection. UDP has no connection, so if you route gateway → microservice requests over UDP, the microservice must listen on its bound port and the gateway sends REQUEST datagrams to the endpoint it learned from HELLO. Invert the roles this way only if you actually need requests over UDP; otherwise HELLO and HEARTBEAT are enough.
Given the complexity and limited utility, you can treat UDP as “advanced/optional transport” and implement it last.
---
## 6. Implement plugin 4: RabbitMQ
This is conceptually similar to what you had in Serdica.
### 6.1 Exchange/queue design
Decide and document (in `Protocol & Transport Specification.md`) something like:
* Exchange: `stella.router`
* Routing keys:
* `request.{serviceName}.{version}` — gateway → microservice.
* Microservice's reply queue per instance: `reply.{serviceName}.{version}.{instanceId}`.
Rabbit usages:
* Gateway:
* Publishes REQUEST frames to `request.{serviceName}.{version}`.
* Consumes from `reply.*` for responses.
* Microservice:
* Consumes from `request.{serviceName}.{version}`.
* Publishes responses to its own reply queue; sets `CorrelationId` property.
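Keeping the naming scheme in one place avoids the gateway and microservice drifting apart. The helper below is a sketch of that idea; `RabbitNames` is a hypothetical class, and the patterns are the ones proposed above.

```csharp
// Hypothetical helper centralizing the exchange, routing-key, and queue names.
public static class RabbitNames
{
    public const string Exchange = "stella.router";

    // Routing key for gateway -> microservice requests.
    public static string RequestRoutingKey(string serviceName, string version) =>
        $"request.{serviceName}.{version}";

    // Per-instance queue the microservice publishes responses to.
    public static string ReplyQueue(string serviceName, string version, string instanceId) =>
        $"reply.{serviceName}.{version}.{instanceId}";
}
```

One design point to settle: in AMQP topic exchanges the dot separates routing-key words, so a version such as `1.0.0` contributes three words. That is harmless with a direct exchange but affects wildcard bindings like `reply.*` on a topic exchange.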
### 6.2 Gateway: `RabbitMqTransportClient`
Implements `ITransportClient`:
* `SendRequestAsync`:
* Create a message with:
* Body = serialized `Frame` (REQUEST or buffered streaming).
* Properties:
* `CorrelationId` = `frame.CorrelationId`.
* `ReplyTo` = microservice's reply queue name for this instance.
* Publish to `request.{serviceName}.{version}`.
* Await a response:
* Consumer on reply queue completes a `TaskCompletionSource<Frame>` keyed by correlation ID.
* `SendStreamingAsync`:
* For v1, you can:
* Only support buffered endpoints over RabbitMQ (like UDP).
* Or send chunked messages (`RequestStreamData` frames as separate messages) and reconstruct on microservice side.
* I'd recommend:
* Start with buffered only over RabbitMQ.
* Mark Rabbit as “no streaming support yet” in config.
* `SendCancelAsync`:
* Option 1: send a separate CANCEL message with same `CorrelationId`.
* Option 2: rely on timeout; cancellation doesn't buy much given overhead.
### 6.3 Microservice: RabbitMQ listener
* Single `IConnection` and `IModel`.
* Declare and bind:
* Service request queue: `request.{serviceName}.{version}`.
* Reply queue: `reply.{serviceName}.{version}.{instanceId}`.
* Consume request queue:
* On message:
* Deserialize `Frame`.
* Dispatch through `IEndpointDispatcher`.
* Publish RESPONSE message to `ReplyTo` queue with same `CorrelationId`.
If you already have RabbitMQ experience from Serdica, this should feel familiar.
---
## 7. Routing config & transport selection
**Projects:** router config + microservice options
**Owner:** config / platform dev
You need to define which transport is actually used in production.
### 7.1 Gateway config (RouterConfig)
Per service/instance, store:
* `TransportType` to listen on / expect connections for.
* Ports / Rabbit URLs / TLS settings.
Example shape in `RouterConfig`:
```csharp
public sealed class ServiceInstanceConfig
{
public string ServiceName { get; set; } = string.Empty;
public string Version { get; set; } = string.Empty;
public string Region { get; set; } = string.Empty;
public TransportType TransportType { get; set; } = TransportType.Tcp; // default: TCP is the baseline transport; UDP is optional and best-effort
public int Port { get; set; } // for TCP/UDP/TLS
public string? RabbitConnectionString { get; set; }
// TLS info, etc.
}
```
`StellaOps.Gateway.WebService` startup:
* Reads these configs.
* Starts corresponding `ITransportServer` instances.
### 7.2 Microservice options
`StellaMicroserviceOptions.Routers` entries must define:
* `Host`
* `Port`
* `TransportType`
* Any transport-specific settings (TLS, Rabbit URL).
At connect time, microservice chooses:
* For each `RouterEndpointConfig`, instantiate the right connector:
```csharp
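// sp is the IServiceProvider; each connector implements IMicroserviceTransportConnector.
IMicroserviceTransportConnector connector = config.TransportType switch
{
    TransportType.Tcp         => sp.GetRequiredService<TcpMicroserviceConnector>(),
    TransportType.Certificate => sp.GetRequiredService<TlsMicroserviceConnector>(),
    TransportType.Udp         => sp.GetRequiredService<UdpMicroserviceConnector>(),
    TransportType.RabbitMq    => sp.GetRequiredService<RabbitMqMicroserviceConnector>(),
    _ => throw new NotSupportedException($"Transport {config.TransportType} not supported.")
};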
```
---
## 8. Implementation order & testing strategy
**Owner:** tech lead
Do NOT try to implement all at once. Suggested order:
1. **TCP**:
* Reuse InMemory test suite:
* HELLO + endpoint registration.
* REQUEST → RESPONSE.
* CANCEL.
* Heartbeats.
* (Optional) streaming as buffered stub for v1, then add genuine streaming.
2. **Certificate/TLS**:
* Wrap TCP logic in TLS.
* Same tests, plus:
* Certificate validation.
* Mutual TLS if required.
3. **RabbitMQ**:
* Start with buffered-only endpoints.
* Mirror existing InMemory/TCP tests where payloads are small.
* Add tests for connection resilience (reconnect, etc.).
4. **UDP**:
* Implement only for very small buffered requests; no streaming.
* Add tests that verify:
* HELLO + basic health.
* REQUEST → RESPONSE with small payload.
* Proper timeouts.
At each stage, tests for that plugin must reuse the **same microservice and gateway** code that worked with InMemory. Only the transport factories change.
---
## 9. Done criteria for “Implement real transport plugins one by one”
You can consider step 9 done when:
* There are **concrete implementations** of `ITransportServer` + `ITransportClient` for:
* TCP
* Certificate/TLS
* UDP (buffered only)
* RabbitMQ (buffered at minimum)
* Gateway startup:
* Reads `RouterConfig`.
* Starts appropriate transport servers per node/region.
* Microservice SDK:
* Reads `StellaMicroserviceOptions.Routers`.
* Connects to router nodes using the configured `TransportType`.
* Uses the same HELLO/HEARTBEAT/REQUEST/RESPONSE/CANCEL semantics as InMemory.
* The same functional tests that passed for InMemory:
* Now pass with TCP plugin.
* At least a subset pass with TLS, Rabbit, and UDP, honoring their constraints (no streaming on UDP, etc.).
From there, you can move into hardening each plugin (reconnect, backoff, error handling) and documenting “which transport to use when” in your router docs.

@@ -1,586 +0,0 @@
For this step you're wiring **configuration** into the system properly:
* Router reads a strongly typed config model (including payload limits, node region, transports).
* Microservices can optionally load a YAML file to **override** endpoint metadata discovered by reflection.
* No behavior changes to routing or transports, just how they get their settings.
Think “config plumbing and merging rules,” not new business logic.
---
## 0. Preconditions
Before starting, confirm:
* `__Libraries/StellaOps.Router.Config` project exists and references `StellaOps.Router.Common`.
* `StellaOps.Microservice` has:
* `StellaMicroserviceOptions` (ServiceName, Version, Region, InstanceId, Routers, ConfigFilePath).
* Reflection-based endpoint discovery that produces `EndpointDescriptor` instances.
* Gateway and microservices currently use **hardcoded** or stub config; you're about to replace that with real config.
---
## 1. Define RouterConfig model and YAML schema
**Project:** `__Libraries/StellaOps.Router.Config`
**Owner:** config / platform dev
### 1.1 C# model
Create clear, minimal models to cover current needs (you can extend later):
```csharp
namespace StellaOps.Router.Config;
public sealed class RouterConfig
{
public GatewayNodeConfig Node { get; set; } = new();
public PayloadLimits PayloadLimits { get; set; } = new();
public IList<TransportEndpointConfig> Transports { get; set; } = new List<TransportEndpointConfig>();
public IList<ServiceConfig> Services { get; set; } = new List<ServiceConfig>();
}
public sealed class GatewayNodeConfig
{
public string NodeId { get; set; } = string.Empty;
public string Region { get; set; } = string.Empty;
public string Environment { get; set; } = "prod";
}
public sealed class TransportEndpointConfig
{
public TransportType TransportType { get; set; }
public int Port { get; set; } // for TCP/UDP/TLS
public bool Enabled { get; set; } = true;
// TLS-specific
public string? ServerCertificatePath { get; set; }
public string? ServerCertificatePassword { get; set; }
public bool RequireClientCertificate { get; set; }
// Rabbit-specific
public string? RabbitConnectionString { get; set; }
}
public sealed class ServiceConfig
{
public string Name { get; set; } = string.Empty;
public string DefaultVersion { get; set; } = "1.0.0";
public IList<string> NeighborRegions { get; set; } = new List<string>();
}
```
Use the `PayloadLimits` class from Common (or mirror it here and keep a single definition).
### 1.2 YAML shape
Decide and document a YAML layout, e.g.:
```yaml
node:
  nodeId: "gw-eu1-01"
  region: "eu1"
  environment: "prod"
payloadLimits:
  maxRequestBytesPerCall: 10485760        # 10 MB
  maxRequestBytesPerConnection: 52428800  # 50 MB
  maxAggregateInflightBytes: 209715200    # 200 MB
transports:
  - transportType: Tcp
    port: 45000
    enabled: true
  - transportType: Certificate
    port: 45001
    enabled: false
    serverCertificatePath: "certs/router.pfx"
    serverCertificatePassword: "secret"
  - transportType: Udp
    port: 45002
    enabled: true
  - transportType: RabbitMq
    enabled: true
    rabbitConnectionString: "amqp://guest:guest@localhost:5672"
services:
  - name: "Billing"
    defaultVersion: "1.0.0"
    neighborRegions: ["eu2", "us1"]
  - name: "Identity"
    defaultVersion: "2.1.0"
    neighborRegions: ["eu2"]
```
This YAML is the canonical config for the router; environment variables and JSON can override individual properties later via `IConfiguration`.
---
## 2. Implement Router.Config loader and DI extensions
**Project:** `StellaOps.Router.Config`
### 2.1 Choose YAML library
Add a YAML library (e.g. YamlDotNet) to `StellaOps.Router.Config`:
```bash
dotnet add src/__Libraries/StellaOps.Router.Config/StellaOps.Router.Config.csproj package YamlDotNet
```
### 2.2 Implement simple loader
Provide a helper that can load YAML into `RouterConfig`:
```csharp
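using System.Text.Json;
using System.Text.Json.Serialization;
using YamlDotNet.RepresentationModel;

public static class RouterConfigLoader
{
    // YAML keys are camelCase and enums are written by name ("Tcp"), so relax the JSON defaults.
    private static readonly JsonSerializerOptions JsonOptions = new()
    {
        PropertyNameCaseInsensitive = true,
        Converters = { new JsonStringEnumConverter() }
    };

    public static RouterConfig LoadFromYaml(string path)
    {
        using var reader = new StreamReader(path);
        var yaml = new YamlStream();
        yaml.Load(reader);
        var root = (YamlMappingNode)yaml.Documents[0].RootNode;
        // ConvertYamlToJson is a helper you still write: walk the node tree and emit JSON,
        // keeping numbers and booleans unquoted so they bind to int/bool properties.
        var json = ConvertYamlToJson(root);
        return JsonSerializer.Deserialize<RouterConfig>(json, JsonOptions)!;
    }
}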
```
Alternatively, bind YAML directly to `RouterConfig` with YamlDotNet's object mapping; the detail is implementation-specific.
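For reference, the direct-binding alternative could look like the sketch below. It assumes the YamlDotNet package is referenced; `YamlBinding` is a hypothetical helper name, and `GatewayNodeConfig` is repeated here only so the snippet compiles on its own.

```csharp
using System.IO;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

// Copy of the model from section 1.1, included so this sketch is self-contained.
public sealed class GatewayNodeConfig
{
    public string NodeId { get; set; } = string.Empty;
    public string Region { get; set; } = string.Empty;
    public string Environment { get; set; } = "prod";
}

// Hypothetical helper: binds YAML straight to a C# type without a JSON detour.
public static class YamlBinding
{
    public static T Load<T>(TextReader reader)
    {
        var deserializer = new DeserializerBuilder()
            // Map camelCase YAML keys (nodeId) to PascalCase properties (NodeId).
            .WithNamingConvention(CamelCaseNamingConvention.Instance)
            // Tolerate keys the model does not know yet; drop this for strict validation.
            .IgnoreUnmatchedProperties()
            .Build();
        return deserializer.Deserialize<T>(reader);
    }
}
```

Whether to ignore unmatched properties is a judgment call: it makes rollouts of new config keys smoother, but it also hides typos in key names.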
### 2.3 ASP.NET Core integration extension
In the router library, add a DI extension the gateway can call:
```csharp
public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddRouterConfig(
        this IServiceCollection services,
        IConfiguration configuration)
    {
        // Configure<T> already makes IOptions<T>, IOptionsSnapshot<T> and IOptionsMonitor<T>
        // resolvable. Do not re-register IOptionsMonitor<RouterConfig> with a factory that
        // resolves IOptionsMonitor<RouterConfig>: that registration would resolve itself.
        services.Configure<RouterConfig>(configuration.GetSection("Router"));
        return services;
    }
}
```
Gateway will:
* Add the YAML file to the configuration builder.
* Call `AddRouterConfig` to bind it.
---
## 3. Wire RouterConfig into Gateway startup & components
**Project:** `StellaOps.Gateway.WebService`
**Owner:** gateway dev
### 3.1 Program.cs configuration
Adjust `Program.cs`:
```csharp
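var builder = WebApplication.CreateBuilder(args);

// Add YAML config. AddYamlFile is not part of Microsoft.Extensions.Configuration itself;
// it comes from a YAML configuration provider package (e.g. NetEscapades.Configuration.Yaml).
builder.Configuration
    .AddJsonFile("appsettings.json", optional: true)
    .AddYamlFile("router.yaml", optional: false, reloadOnChange: true)
    .AddEnvironmentVariables("STELLAOPS_");

// Bind RouterConfig. Pass the configuration root: AddRouterConfig selects the "Router"
// section itself, so router.yaml must nest its keys under a top-level "router:" key
// (or AddRouterConfig must bind the root instead).
builder.Services.AddRouterConfig(builder.Configuration);

var app = builder.Build();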
```
Key points:
* `AddYamlFile("router.yaml", reloadOnChange: true)` ensures hot-reload from YAML.
* `AddEnvironmentVariables("STELLAOPS_")` allows env-based overrides (optional, but useful).
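To make the override order concrete, the sketch below shows how a prefixed environment variable maps onto a configuration key. An in-memory collection stands in for `router.yaml` so the example has no file dependency; `EnvOverrideDemo` is a hypothetical name, and the `Router:Node:Region` key assumes the config is bound under a `Router` section.

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Hypothetical demo of provider precedence: later providers win.
public static class EnvOverrideDemo
{
    public static string? ResolveRegion()
    {
        // Stand-in for the values router.yaml would normally supply.
        var baseValues = new Dictionary<string, string?> { ["Router:Node:Region"] = "eu1" };

        var config = new ConfigurationBuilder()
            .AddInMemoryCollection(baseValues)
            // The prefix is stripped and "__" becomes ":", so
            // STELLAOPS_Router__Node__Region maps to Router:Node:Region.
            .AddEnvironmentVariables("STELLAOPS_")
            .Build();

        return config["Router:Node:Region"];
    }
}
```

With `STELLAOPS_Router__Node__Region=us1` set in the environment, the resolved region comes from the variable; without it, the base value is returned.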
### 3.2 Inject config into transport factories and routing
Where you start transports:
* Inject `IOptionsMonitor<RouterConfig>` into your `ITransportServerFactory`, and use `RouterConfig.Transports` to know which servers to create and on which ports.
Where you need node identity:
* Inject `IOptionsMonitor<RouterConfig>` into any service needing `GatewayNodeConfig` (e.g. when building `RoutingContext.GatewayRegion`):
```csharp
var nodeRegion = routerConfig.CurrentValue.Node.Region;
```
Where you need payload limits:
* Inject `IOptionsMonitor<RouterConfig>` into `IPayloadBudget` or `TransportDispatchMiddleware` to fetch current `PayloadLimits`.
Because you're using `IOptionsMonitor`, components can react to changes when `router.yaml` is modified.
---
## 4. Microservice YAML: schema & loader
**Project:** `__Libraries/StellaOps.Microservice`
**Owner:** SDK dev
Microservice YAML is optional and used **only** to override endpoint metadata, not to define identity or router pool.
### 4.1 Define YAML shape
Keep it focused on endpoints and overrides:
```yaml
service:
  serviceName: "Billing"
  version: "1.0.0"
  region: "eu1"
endpoints:
  - method: "POST"
    path: "/billing/invoices/upload"
    defaultTimeout: "00:02:00"
    supportsStreaming: true
    requiringClaims:
      - type: "role"
        value: "billing-editor"
  - method: "GET"
    path: "/billing/invoices/{id}"
    defaultTimeout: "00:00:10"
    requiringClaims:
      - type: "role"
        value: "billing-reader"
```
Identity (`serviceName`, `version`, `region`) in YAML is **informative**; the authoritative values still come from `StellaMicroserviceOptions`. If they differ, log the mismatch, but don't override options from YAML.
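The "log but don't override" rule can be isolated in a small check like the one below. It is a sketch: `IdentityCheck` is a hypothetical helper, and it returns messages rather than logging so the caller decides the log level.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: reports identity differences without changing the options.
public static class IdentityCheck
{
    public static IReadOnlyList<string> FindMismatches(
        (string ServiceName, string Version, string Region) options,
        (string? ServiceName, string? Version, string? Region) yaml)
    {
        var mismatches = new List<string>();
        Compare("serviceName", options.ServiceName, yaml.ServiceName, mismatches);
        Compare("version", options.Version, yaml.Version, mismatches);
        Compare("region", options.Region, yaml.Region, mismatches);
        return mismatches;
    }

    private static void Compare(string field, string fromOptions, string? fromYaml, List<string> sink)
    {
        // A field missing from YAML is not a mismatch; YAML identity is optional.
        if (fromYaml is not null && !string.Equals(fromOptions, fromYaml, StringComparison.Ordinal))
            sink.Add($"{field}: options='{fromOptions}', yaml='{fromYaml}' (options win)");
    }
}
```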
### 4.2 C# model
In `StellaOps.Microservice`:
```csharp
internal sealed class MicroserviceYamlConfig
{
public MicroserviceYamlService? Service { get; set; }
public IList<MicroserviceYamlEndpoint> Endpoints { get; set; } = new List<MicroserviceYamlEndpoint>();
}
internal sealed class MicroserviceYamlService
{
public string? ServiceName { get; set; }
public string? Version { get; set; }
public string? Region { get; set; }
}
internal sealed class MicroserviceYamlEndpoint
{
public string Method { get; set; } = string.Empty;
public string Path { get; set; } = string.Empty;
public string? DefaultTimeout { get; set; }
public bool? SupportsStreaming { get; set; }
public IList<ClaimRequirement> RequiringClaims { get; set; } = new List<ClaimRequirement>();
}
```
### 4.3 YAML loader
Reuse YamlDotNet (add package to `StellaOps.Microservice` if needed):
```csharp
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

internal interface IMicroserviceYamlLoader
{
    MicroserviceYamlConfig? Load(string? path);
}

internal sealed class MicroserviceYamlLoader : IMicroserviceYamlLoader
{
    private readonly ILogger<MicroserviceYamlLoader> _logger;

    public MicroserviceYamlLoader(ILogger<MicroserviceYamlLoader> logger)
    {
        _logger = logger;
    }

    public MicroserviceYamlConfig? Load(string? path)
    {
        // The YAML file is optional: no path or no file means "no overrides".
        if (string.IsNullOrWhiteSpace(path) || !File.Exists(path))
            return null;

        try
        {
            using var reader = new StreamReader(path);
            // YAML keys are camelCase (defaultTimeout); the C# properties are PascalCase.
            var deserializer = new DeserializerBuilder()
                .WithNamingConvention(CamelCaseNamingConvention.Instance)
                .Build();
            return deserializer.Deserialize<MicroserviceYamlConfig>(reader);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to load microservice YAML from {Path}", path);
            return null;
        }
    }
}
```
Register in DI:
```csharp
services.AddSingleton<IMicroserviceYamlLoader, MicroserviceYamlLoader>();
```
---
## 5. Merge YAML overrides with reflection-discovered endpoints
**Project:** `StellaOps.Microservice`
**Owner:** SDK dev
Extend `EndpointCatalog` to apply YAML overrides.
### 5.1 Extend constructor to accept YAML config
Adjust `EndpointCatalog`:
```csharp
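internal sealed class EndpointCatalog : IEndpointCatalog
{
    public IReadOnlyList<EndpointDescriptor> Descriptors { get; }

    private readonly Dictionary<(string Method, string Path), EndpointRegistration> _map;

    public EndpointCatalog(
        IEndpointDiscovery discovery,
        IMicroserviceYamlLoader yamlLoader,
        IOptions<StellaMicroserviceOptions> optionsAccessor)
    {
        var options = optionsAccessor.Value;
        var registrations = discovery.DiscoverEndpoints(options);
        var yamlConfig = yamlLoader.Load(options.ConfigFilePath);
        registrations = ApplyYamlOverrides(registrations, yamlConfig);

        _map = registrations.ToDictionary(
            r => (r.Descriptor.Method, r.Descriptor.Path),
            r => r,
            new MethodPathComparer());
        Descriptors = registrations.Select(r => r.Descriptor).ToArray();
    }

    // StringComparer only compares strings, so tuple keys need their own comparer.
    private sealed class MethodPathComparer : IEqualityComparer<(string Method, string Path)>
    {
        public bool Equals((string Method, string Path) x, (string Method, string Path) y) =>
            string.Equals(x.Method, y.Method, StringComparison.OrdinalIgnoreCase) &&
            string.Equals(x.Path, y.Path, StringComparison.OrdinalIgnoreCase);

        public int GetHashCode((string Method, string Path) key) =>
            HashCode.Combine(
                StringComparer.OrdinalIgnoreCase.GetHashCode(key.Method),
                StringComparer.OrdinalIgnoreCase.GetHashCode(key.Path));
    }
}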
```
### 5.2 Implement `ApplyYamlOverrides`
Key rules:
* Identity (ServiceName, Version, Region) always come from `StellaMicroserviceOptions`.
* YAML can override:
* `DefaultTimeout`
* `SupportsStreaming`
* `RequiringClaims`
Implementation sketch:
```csharp
private static IReadOnlyList<EndpointRegistration> ApplyYamlOverrides(
    IReadOnlyList<EndpointRegistration> registrations,
    MicroserviceYamlConfig? yaml)
{
    if (yaml is null || yaml.Endpoints.Count == 0)
        return registrations;

    // StringComparer cannot compare (Method, Path) tuples, so key the lookup by one
    // composite string and match it case-insensitively.
    static string Key(string method, string path) => $"{method} {path}";

    var overrideMap = yaml.Endpoints.ToDictionary(
        e => Key(e.Method, e.Path),
        StringComparer.OrdinalIgnoreCase);

    var result = new List<EndpointRegistration>(registrations.Count);
    foreach (var reg in registrations)
    {
        var desc = reg.Descriptor;
        if (!overrideMap.TryGetValue(Key(desc.Method, desc.Path), out var ov))
        {
            // No YAML entry for this endpoint: keep the reflected descriptor as-is.
            result.Add(reg);
            continue;
        }

        // An unparseable timeout is ignored and the reflected value is kept.
        var timeout = desc.DefaultTimeout;
        if (!string.IsNullOrWhiteSpace(ov.DefaultTimeout) &&
            TimeSpan.TryParse(ov.DefaultTimeout, out var parsed))
        {
            timeout = parsed;
        }

        var supportsStreaming = desc.SupportsStreaming;
        if (ov.SupportsStreaming.HasValue)
        {
            supportsStreaming = ov.SupportsStreaming.Value;
        }

        // A non-empty YAML claim list replaces the reflected claims; it is not merged.
        var requiringClaims = ov.RequiringClaims.Count > 0
            ? ov.RequiringClaims.ToArray()
            : desc.RequiringClaims;

        var overriddenDescriptor = new EndpointDescriptor
        {
            ServiceName = desc.ServiceName,
            Version = desc.Version,
            Method = desc.Method,
            Path = desc.Path,
            DefaultTimeout = timeout,
            SupportsStreaming = supportsStreaming,
            RequiringClaims = requiringClaims
        };

        result.Add(new EndpointRegistration
        {
            Descriptor = overriddenDescriptor,
            HandlerType = reg.HandlerType
        });
    }

    return result;
}
```
This ensures code defines the set of endpoints; YAML only tunes metadata.
---
## 6. Hot-reload / YAML change handling
**Router side:** you already enabled `reloadOnChange` for `router.yaml`, and use `IOptionsMonitor<RouterConfig>`. Next:
* Components that care about changes must **react**:
* Payload limits:
* `IPayloadBudget` or `TransportDispatchMiddleware` should read `routerConfig.CurrentValue.PayloadLimits` on each request rather than caching.
* Node region:
* `RoutingContext.GatewayRegion` can be built from `routerConfig.CurrentValue.Node.Region` per request.
You do **not** need a custom watcher; `IOptionsMonitor` already tracks config changes.
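Reading `CurrentValue` on every call, rather than caching it in a field, is what makes reloads take effect. A minimal sketch follows, assuming the `Microsoft.Extensions.Options` package; `PayloadBudget` is a hypothetical class, and the two config types are trimmed copies so the snippet stands alone.

```csharp
using Microsoft.Extensions.Options;

// Trimmed stand-ins for the real config models.
public sealed class PayloadLimits
{
    public long MaxRequestBytesPerCall { get; set; }
}

public sealed class RouterConfig
{
    public PayloadLimits PayloadLimits { get; set; } = new();
}

// Hypothetical budget check used by the dispatch path.
public sealed class PayloadBudget
{
    private readonly IOptionsMonitor<RouterConfig> _config;

    public PayloadBudget(IOptionsMonitor<RouterConfig> config) => _config = config;

    // CurrentValue is read per call, so a reloaded router.yaml applies to the next request.
    public bool IsWithinPerCallLimit(long requestBytes) =>
        requestBytes <= _config.CurrentValue.PayloadLimits.MaxRequestBytesPerCall;
}
```

Components that must do work when config changes (for example, resizing a pool) can subscribe via `IOptionsMonitor<T>.OnChange` instead of polling.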
**Microservice side:** for now you can start with **load-on-startup** YAML. If you want hot-reload:
* Implement a FileSystemWatcher in `MicroserviceYamlLoader` or a small `IHostedService`:
* Watch `options.ConfigFilePath` for changes.
* On change:
* Reload YAML.
* Rebuild `EndpointDescriptor` list.
* Send an updated HELLO or an ENDPOINTS_UPDATE frame to router.
Given complexity, you can postpone true hot reload to a later iteration and document that microservices must be restarted to pick up YAML changes.
---
## 7. Tests
**Router.Config tests:**
* Unit tests for `RouterConfigLoader`:
* Given a YAML string, bind to `RouterConfig` properly.
* Validate `TransportType.Tcp` / `Udp` / `RabbitMq` values map correctly.
* Integration test:
* Start gateway with `router.yaml`.
* Access `IOptionsMonitor<RouterConfig>` in a test controller or test service and assert values.
* Modify YAML on disk (if test infra allows) and ensure values update via `IOptionsMonitor`.
**Microservice YAML tests:**
* Unit tests for `MicroserviceYamlLoader`:
* Load valid YAML, confirm endpoints and claims/timeouts parsed.
* `EndpointCatalog` tests:
* Build fake `EndpointRegistration` list from reflection.
* Build YAML overrides.
* Call `ApplyYamlOverrides` and assert:
* Timeouts updated.
* SupportsStreaming updated.
* RequiringClaims replaced where provided.
* Descriptors with no matching YAML remain unchanged.
---
## 8. Documentation updates
Update docs under `docs/router`:
1. **Stella Ops Router Webserver.md**:
* Describe `router.yaml`:
* Node config (region, nodeId).
* PayloadLimits.
* Transports.
* Explain precedence:
* YAML as base.
* Environment variables can override individual fields via `STELLAOPS_Router__Node__Region` etc.
2. **Stella Ops Router Microservice.md**:
* Explain `ConfigFilePath` in `StellaMicroserviceOptions`.
* Show full example microservice YAML and how it maps to endpoint metadata.
* Clearly state:
* Identity comes from options (code/config), not YAML.
* YAML can override per-endpoint timeout, streaming flag, requiringClaims.
* YAML can't add endpoints that don't exist in code.
3. **Stella Ops Router Documentation.md**:
* Add a short “Configuration” chapter:
* Where `router.yaml` lives.
* Where microservice YAML lives.
* How to run locally with custom configs.
---
## 9. Done criteria for “Add Router.Config + Microservice YAML integration”
You can call step 10 complete when:
* Router:
* Loads `router.yaml` into `RouterConfig` using `StellaOps.Router.Config`.
* Uses `RouterConfig.Node.Region` when building routing context.
* Uses `RouterConfig.PayloadLimits` for payload budget enforcement.
* Uses `RouterConfig.Transports` to start the right `ITransportServer` instances.
* Supports runtime changes to `router.yaml` via `IOptionsMonitor` for at least node identity and payload limits.
* Microservice:
* Accepts optional `ConfigFilePath` in `StellaMicroserviceOptions`.
* Loads YAML (when present) and merges overrides into reflection-discovered endpoints.
* Sends HELLO with the **merged** descriptors (i.e., YAML-aware defaults).
* Behavior remains unchanged when no YAML is provided (pure reflection mode).
* Tests:
* Confirm config binding for router and microservice.
* Confirm YAML overrides are applied correctly to endpoint metadata.
At that point, configuration is no longer hardcoded, and you have a clear, documented path for both router operators and microservice teams to configure behavior via YAML with predictable precedence.

@@ -1,550 +0,0 @@
Goal for this step: have a **concrete, runnable example** (gateway + one microservice) and a **clear skeleton** for migrating any existing `StellaOps.*.WebService` into `StellaOps.*.Microservice`. After this, devs should be able to:
* Run a full vertical slice locally.
* Open a “migration cookbook” and follow a predictable recipe.
I'll split it into two tracks: reference example, then migration skeleton.
---
## 1. Reference example: “Billing” vertical slice
### 1.1 Create the sample microservice project
**Project:** `src/StellaOps.Billing.Microservice`
**Owner:** feature/example dev
Tasks:
1. Create the project:
```bash
cd src
dotnet new worker -n StellaOps.Billing.Microservice
```
2. Add references:
```bash
dotnet add StellaOps.Billing.Microservice/StellaOps.Billing.Microservice.csproj reference \
__Libraries/StellaOps.Microservice/StellaOps.Microservice.csproj
dotnet add StellaOps.Billing.Microservice/StellaOps.Billing.Microservice.csproj reference \
__Libraries/StellaOps.Router.Common/StellaOps.Router.Common.csproj
```
3. In `Program.cs`, wire the SDK with **InMemory transport** for now:
```csharp
var builder = Host.CreateApplicationBuilder(args);
builder.Services.AddStellaMicroservice(opts =>
{
opts.ServiceName = "Billing";
opts.Version = "1.0.0";
opts.Region = "eu1";
opts.InstanceId = $"billing-{Environment.MachineName}";
opts.Routers.Add(new RouterEndpointConfig
{
Host = "localhost",
Port = 50050, // to match gateway's InMemory/TCP harness
TransportType = TransportType.Tcp
});
opts.ConfigFilePath = "billing.microservice.yaml"; // optional overrides
});
var app = builder.Build();
await app.RunAsync();
```
(You can keep `TransportType` as TCP even if implemented in-process for now; once real TCP is in, nothing changes here.)
---
### 1.2 Implement a few canonical endpoints
Pick 3-4 endpoints that exercise different features:
1. **Health / contract check**
```csharp
[StellaEndpoint("GET", "/ping")]
public sealed class PingEndpoint : IRawStellaEndpoint
{
public Task<RawResponse> HandleAsync(RawRequestContext ctx)
{
var resp = new RawResponse { StatusCode = 200 };
resp.Headers["Content-Type"] = "text/plain";
resp.WriteBodyAsync = async stream =>
{
await stream.WriteAsync("pong"u8.ToArray(), ctx.CancellationToken);
};
return Task.FromResult(resp);
}
}
```
2. **Simple JSON read/write (non-streaming)**
```csharp
public sealed record CreateInvoiceRequest(string CustomerId, decimal Amount);
public sealed record CreateInvoiceResponse(Guid Id);
[StellaEndpoint("POST", "/billing/invoices")]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
public Task<CreateInvoiceResponse> HandleAsync(CreateInvoiceRequest req, CancellationToken ct)
{
// pretend to store in DB
return Task.FromResult(new CreateInvoiceResponse(Guid.NewGuid()));
}
}
```
3. **Streaming upload (large file)**
```csharp
[StellaEndpoint("POST", "/billing/invoices/upload")]
public sealed class InvoiceUploadEndpoint : IRawStellaEndpoint
{
public async Task<RawResponse> HandleAsync(RawRequestContext ctx)
{
var buffer = new byte[64 * 1024];
var total = 0L;
int read;
while ((read = await ctx.Body.ReadAsync(buffer.AsMemory(0, buffer.Length), ctx.CancellationToken)) > 0)
{
total += read;
// process chunk or write to temp file
}
var resp = new RawResponse { StatusCode = 200 };
resp.Headers["Content-Type"] = "application/json";
resp.WriteBodyAsync = async stream =>
{
var json = $"{{\"bytesReceived\":{total}}}";
await stream.WriteAsync(System.Text.Encoding.UTF8.GetBytes(json), ctx.CancellationToken);
};
return resp;
}
}
```
This gives devs examples of:
* Raw endpoint (`/ping`, `/upload`).
* Typed endpoint (`/billing/invoices`).
* Streaming usage (`Body.ReadAsync`).
---
### 1.3 Microservice YAML override example
**File:** `src/StellaOps.Billing.Microservice/billing.microservice.yaml`
```yaml
endpoints:
  - method: GET
    path: /ping
    defaultTimeout: "00:00:02"
  - method: POST
    path: /billing/invoices
    defaultTimeout: "00:00:05"
    supportsStreaming: false
    requiringClaims:
      - type: role
        value: BillingWriter
  - method: POST
    path: /billing/invoices/upload
    defaultTimeout: "00:02:00"
    supportsStreaming: true
    requiringClaims:
      - type: role
        value: BillingUploader
```
This file demonstrates:
* Timeout override.
* Streaming flag.
* `RequiringClaims` usage.
---
### 1.4 Gateway example config for Billing
**File:** `config/router.billing.yaml` (for local dev)
```yaml
node:
  nodeId: "gw-dev-01"
  region: "eu1"
payloadLimits:
  maxRequestBytesPerCall: 10485760        # 10 MB
  maxRequestBytesPerConnection: 52428800  # 50 MB
  maxAggregateInflightBytes: 209715200    # 200 MB
services:
  - name: "Billing"
    defaultVersion: "1.0.0"
    endpoints:
      - method: "GET"
        path: "/ping"
        # router defaults, if any
      - method: "POST"
        path: "/billing/invoices"
        defaultTimeout: "00:00:05"
        requiringClaims:
          - type: "role"
            value: "BillingWriter"
      - method: "POST"
        path: "/billing/invoices/upload"
        defaultTimeout: "00:02:00"
        supportsStreaming: true
        requiringClaims:
          - type: "role"
            value: "BillingUploader"
```
This lets you show precedence:
* Reflection → microservice YAML → router YAML.
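One way to express this precedence is a layered merge in which each later layer overrides only the fields it actually sets. The sketch below is illustrative: the record and class names are hypothetical, and only two metadata fields are modeled.

```csharp
using System;

// Hypothetical, reduced view of endpoint metadata.
public sealed record EndpointSettings(TimeSpan DefaultTimeout, bool SupportsStreaming);

// A layer leaves a field null when it has no opinion about it.
public sealed record EndpointOverride(TimeSpan? DefaultTimeout = null, bool? SupportsStreaming = null);

public static class EndpointPrecedence
{
    // Pass layers in precedence order: microservice YAML first, router YAML last.
    public static EndpointSettings Resolve(EndpointSettings reflected, params EndpointOverride?[] layers)
    {
        var result = reflected;
        foreach (var layer in layers)
        {
            if (layer is null) continue; // layer not configured at all

            result = result with
            {
                DefaultTimeout = layer.DefaultTimeout ?? result.DefaultTimeout,
                SupportsStreaming = layer.SupportsStreaming ?? result.SupportsStreaming
            };
        }
        return result;
    }
}
```

For example, a reflected 30-second timeout overridden to 5 seconds by the microservice YAML stays at 5 seconds when the router YAML only sets the streaming flag.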
---
### 1.5 Gateway wiring for the example
**Project:** `StellaOps.Gateway.WebService`
In `Program.cs`:
1. Load router config and point it to `router.billing.yaml` for dev:
```csharp
builder.Configuration
    .AddJsonFile("appsettings.json", optional: true)
    .AddEnvironmentVariables(prefix: "STELLAOPS_");
builder.Services.AddOptions<RouterConfig>()
    .Configure<IConfiguration>((cfg, configuration) =>
    {
        configuration.GetSection("Router").Bind(cfg);
        var yamlPath = configuration["Router:YamlPath"] ?? "config/router.billing.yaml";
        if (File.Exists(yamlPath))
        {
            var yamlCfg = RouterConfigLoader.LoadFromFile(yamlPath);
            // Overlay YAML values onto the bound config. Reassigning `cfg` here would have
            // no effect, so OverlayRouterConfig (a helper you provide) must copy the fields,
            // copying all of them if YAML is treated as the source of truth.
            OverlayRouterConfig(cfg, yamlCfg);
        }
    });
builder.Services.AddOptions<GatewayNodeConfig>()
    .Configure<IOptions<RouterConfig>>((node, routerCfg) =>
    {
        var cfg = routerCfg.Value;
        node.NodeId = cfg.NodeId;
        node.Region = cfg.Region;
    });
```
2. Ensure you start the appropriate transport server (for dev, TCP on localhost:50050):
* From `RouterConfig.Transports` or a dev shortcut, start the TCP server listening on that port.
3. HTTP pipeline:
* `EndpointResolutionMiddleware`
* `RoutingDecisionMiddleware`
* `TransportDispatchMiddleware`
Now your dev loop is:
* Run `StellaOps.Gateway.WebService`.
* Run `StellaOps.Billing.Microservice`.
* `curl http://localhost:{gatewayPort}/ping` → should go through gateway to microservice and back.
* Similarly for `/billing/invoices` and `/billing/invoices/upload`.
---
### 1.6 Example documentation
Create `docs/router/examples/Billing.Sample.md`:
* “How to run the example”:
* build solution
* `dotnet run` for gateway
* `dotnet run` for Billing microservice
* Show sample `curl` commands:
* `curl http://localhost:8080/ping`
* `curl -X POST http://localhost:8080/billing/invoices -H "Content-Type: application/json" -d '{"customerId":"C1","amount":123.45}'`
* `curl -X POST http://localhost:8080/billing/invoices/upload --data-binary @bigfile.bin`
* Note where config files live and how to change them.
This becomes your canonical reference for new teams.
---
## 2. Migration skeleton: from WebService to Microservice
Now that you have a working example, you need a **repeatable recipe** for migrating any existing `StellaOps.*.WebService` into the microservice router model.
### 2.1 Define the migration target shape
For each webservice you migrate, you want:
* A new project: `StellaOps.{Domain}.Microservice`.
* Shared domain logic extracted into a library (if not already): `StellaOps.{Domain}.Core` or similar.
* Controllers → endpoint classes:
* `Controller` methods ⇨ `[StellaEndpoint]`-annotated types.
* `HttpGet/HttpPost` attributes ⇨ `Method` and `Path` pair.
* Configuration:
* WebService's appsettings routes → microservice YAML + router YAML.
* Authentication/authorization → `RequiringClaims` in endpoint metadata.
Document this target shape in `docs/router/Migration of Webservices to Microservices.md`.
---
### 2.2 Skeleton microservice template
Create a **generic** microservice skeleton that any team can copy:
**Project:** `templates/StellaOps.Template.Microservice` or at least a folder `samples/MigrationSkeleton/`.
Contents:
* `Program.cs`:
```csharp
var builder = Host.CreateApplicationBuilder(args);
builder.Services.AddStellaMicroservice(opts =>
{
    opts.ServiceName = "{DomainName}";
    opts.Version = "1.0.0";
    opts.Region = "eu1";
    opts.InstanceId = "{DomainName}-" + Environment.MachineName;
    // Mandatory router pool configuration
    opts.Routers.Add(new RouterEndpointConfig
    {
        Host = "localhost", // or injected via env
        Port = 50050,
        TransportType = TransportType.Tcp
    });
    // "{DomainName}" is a template placeholder to replace, not an interpolated variable
    opts.ConfigFilePath = "{DomainName}.microservice.yaml";
});
// domain DI (reuse existing domain services from WebService)
// builder.Services.AddDomainServices();
var app = builder.Build();
await app.RunAsync();
```
* A sample endpoint mapping from a typical WebService controller method:
Legacy controller:
```csharp
[ApiController]
[Route("api/billing/invoices")]
public class InvoicesController : ControllerBase
{
    private readonly IInvoiceService _service;
    public InvoicesController(IInvoiceService service) => _service = service;
    [HttpPost]
    [Authorize(Roles = "BillingWriter")]
    public async Task<ActionResult<InvoiceDto>> Create(CreateInvoiceRequest request)
    {
        var result = await _service.Create(request);
        return Ok(result);
    }
}
```
Microservice endpoint:
```csharp
[StellaEndpoint("POST", "/billing/invoices")]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, InvoiceDto>
{
    private readonly IInvoiceService _service;
    public CreateInvoiceEndpoint(IInvoiceService service)
    {
        _service = service;
    }
    public Task<InvoiceDto> HandleAsync(CreateInvoiceRequest request, CancellationToken ct)
    {
        return _service.Create(request, ct);
    }
}
```
And matching YAML:
```yaml
endpoints:
  - method: POST
    path: /billing/invoices
    timeout: 00:00:05
    requiringClaims:
      - type: role
        value: BillingWriter
```
This skeleton demonstrates the mapping clearly.
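The auth half of that mapping can be sketched as a small helper that turns the `Roles` value of an `[Authorize]` attribute into entries shaped like `requiringClaims`. `RoleClaimTranslator` is a hypothetical name, and the claim type `"role"` simply mirrors the YAML above. One caveat to verify against the spec: in ASP.NET Core a comma-separated `Roles` list means "any of these roles", while this section does not say whether several `requiringClaims` entries are combined with AND or OR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RoleClaimTranslator
{
    // Splits the comma-separated Roles value of an [Authorize] attribute into
    // (type, value) pairs shaped like the YAML `requiringClaims` entries.
    public static IReadOnlyList<(string Type, string Value)> ToRequiringClaims(string roles) =>
        roles.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
             .Select(role => (Type: "role", Value: role))
             .ToList();
}
```

For the controller above, `ToRequiringClaims("BillingWriter")` yields the single `role` / `BillingWriter` pair that appears in the YAML.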
---
### 2.3 Migration workflow for a team (per service)
Put this as a checklist in `Migration of Webservices to Microservices.md`:
1. **Inventory existing HTTP surface**
* List all controllers and actions with:
* HTTP method.
* Route template (full path).
* Auth attributes (`[Authorize(Roles=..)]` or policies).
* Whether the action handles large uploads/downloads.
2. **Create microservice project**
* Add `StellaOps.{Domain}.Microservice` using the skeleton.
* Reference domain logic project (`StellaOps.{Domain}.Core`), or extract one if necessary.
3. **Map each controller action → endpoint**
For each action:
* Create an endpoint class in the microservice:
* `IRawStellaEndpoint` for:
* Large payloads.
* Very custom body handling.
* `IStellaEndpoint<TRequest,TResponse>` for standard JSON APIs.
* Use `[StellaEndpoint("{METHOD}", "{PATH}")]` matching the existing route.
4. **Wire domain services & auth**
* Register the same domain services the WebService used (DB contexts, repositories, etc.).
* Translate role/claim-based `[Authorize]` usage to microservice YAML `RequiringClaims`.
5. **Create microservice YAML**
* For each new endpoint:
* Define default timeout.
* `supportsStreaming: true` where appropriate.
* `requiringClaims` matching prior auth requirements.
6. **Update router YAML**
* Add service entry under `services`:
* `name: "{Domain}"`.
* `defaultVersion: "1.0.0"`.
* Add endpoints (method/path, router-side overrides if needed).
7. **Smoke-test locally**
* Run gateway + microservice side-by-side.
* Hit the same URLs via gateway that previously were served by the WebService directly.
* Compare behavior (status codes, semantics) with existing environment.
8. **Gradual rollout**
Strategy options:
* **Proxy mode**:
* Keep WebService behind gateway for a while.
* Add router endpoints that proxy to existing WebService (via HTTP) while microservice matures.
* Gradually switch endpoints to microservice once stable.
* **Canary rollout** (a gradual variant of blue/green):
* Run WebService and Microservice in parallel.
* Route a small percentage of traffic to microservice via router.
* Increase gradually.
Outline these as patterns in the migration doc, but keep them high-level here.
---
### 2.4 Migration skeleton repository structure
Add a clear place in repo for skeleton code & docs:
```text
/docs
  /router
    Migration of Webservices to Microservices.md
    examples/
      Billing.Sample.md
/samples
  /Billing
    StellaOps.Billing.Microservice/      # full example project
    router.billing.yaml                  # example router config
  /MigrationSkeleton
    StellaOps.Template.Microservice/     # template project
    example-controller-mapping.md        # before/after snippet
```
The **skeleton** project should:
* Compile.
* Contain TODO markers where teams fill in domain pieces.
* Be referenced in the migration doc so people know where to look.
---
### 2.5 Tests to make the reference stick
Add a minimal test suite around the Billing example:
* **Integration tests** in `tests/StellaOps.Billing.IntegrationTests`:
* Start gateway + Billing microservice (using in-memory test host or docker-compose).
* `GET /ping` returns 200 and “pong”.
* `POST /billing/invoices` returns 200 with a JSON body containing an `id`.
* `POST /billing/invoices/upload` with a large payload succeeds and reports `bytesReceived`.
* Use these tests as a reference for future services: they show how to spin up a microservice + gateway in tests.
---
## 3. Done criteria for step 11
You can treat “Build a reference example + migration skeleton” as complete when:
* `StellaOps.Billing.Microservice` exists, runs, and successfully serves requests through the gateway using your real transport (or InMemory/TCP for dev).
* `router.billing.yaml` plus `billing.microservice.yaml` show config patterns for:
* timeouts
* streaming
* requiringClaims
* `docs/router/examples/Billing.Sample.md` explains how to run and test the example.
* `Migration of Webservices to Microservices.md` contains:
* A concrete mapping example (controller → endpoint + YAML).
* A step-by-step migration checklist for teams.
* Pointers to the skeleton project and sample configs.
* A template microservice project exists (`StellaOps.Template.Microservice` or equivalent) that teams can copy to bootstrap new services.
Once you have this, onboarding new domains and migrating old WebServices stops being an ad-hoc effort and becomes a repeatable, documented process.

@@ -1,415 +0,0 @@
Below is how I'd tell your dev agents to operate on this codebase so it doesn't turn into chaos over time.
Think of this as the “rules of engagement” for Stella Ops Router.
---
## 1. Non-negotiable operating principles
All agents follow these rules:
1. **Specs are law**
* `docs/router/specs.md` is the primary source of truth.
* If code and spec differ:
* Fix the spec **first** (in a PR), then adjust the code.
* No “quick fixes” that contradict the spec.
2. **Common & protocol are sacred**
* `StellaOps.Router.Common` and the wire protocol (Frame/FrameType/serialization) are stable layers.
* Any change to:
* `Frame`, `FrameType`
* `EndpointDescriptor`, `ConnectionState`
* `ITransportClient` / `ITransportServer`
* …requires:
* Explicit spec update.
* Compatibility consideration.
* Code review by someone thinking about all transports and both sides (gateway + microservice).
3. **InMemory first, then real transports**
* New protocol semantics (e.g., new frame type, new behavior, new timeout rules) MUST:
1. Be implemented and proven with InMemory.
2. Have tests passing with InMemory.
3. Only then be rolled into TCP/TLS/UDP/RabbitMQ.
4. **No backdoor HTTP between microservices and router**
* Microservices must never talk HTTP to the router for control plane or data.
* All microservice-router traffic goes through the registered transports (UDP/TCP/TLS/RabbitMQ) using `Frame`.
5. **Method + Path = contract**
* Endpoint identity is always: `HTTP Method + Path`, nothing else.
* No “dynamic” routing hacks that bypass the `(Method, Path)` resolution.
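The `(Method, Path)` contract can be illustrated with a lookup table keyed on exactly that pair. This is a sketch only: `EndpointKey` and `EndpointTable` are hypothetical names rather than types from `StellaOps.Router.Common`, and it assumes that methods compare case-insensitively while paths are matched exactly.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical key type: endpoint identity is the (Method, Path) pair and nothing else.
public readonly record struct EndpointKey(string Method, string Path)
{
    // Assumption: the method is normalized to upper case, the path is kept as-is.
    public static EndpointKey Create(string method, string path) =>
        new(method.ToUpperInvariant(), path);
}

public sealed class EndpointTable
{
    private readonly Dictionary<EndpointKey, string> _handlers = new();

    // Throws if the same (Method, Path) pair is registered twice.
    public void Register(string method, string path, string handlerName) =>
        _handlers.Add(EndpointKey.Create(method, path), handlerName);

    // Returns the handler name, or null when no endpoint has this exact identity.
    public string? Resolve(string method, string path) =>
        _handlers.TryGetValue(EndpointKey.Create(method, path), out var handler) ? handler : null;
}
```

Because the key carries nothing but the two values, there is no place for a "dynamic" routing hint to hide: anything that should affect resolution has to be part of the method or the path.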
---
## 2. How agents should structure work (vertical slices, not scattered edits)
Whenever you assign work, agents should:
1. **Work in vertical slices**
* Example slice: “Cancellation with InMemory”, “Streaming + payload limits with TCP”, “RabbitMQ buffered requests”.
* Each slice includes:
* Spec amendments (if needed).
* Common contracts (if needed).
* Implementation (gateway + microservice + transport).
* Tests.
2. **Avoid cross-cutting, half-finished changes**
* Do not:
* Change Common, start on TCP, then get bored and leave InMemory broken.
* Do:
* Finish one vertical slice end-to-end, then move on.
3. **Keep changes small and reviewable**
* Prefer:
* One PR for “add YAML overrides merging”.
* Another PR for “add router YAML hot-reload details”.
* Avoid huge omnibus PRs that change protocol, transports, router, and microservice in one go.
---
## 3. Change categories & review rules
Agents should classify their work by category and obey the review level.
1. **Category A: Protocol / Common changes**
* Affects:
* `Frame`, `FrameType`, payload DTOs.
* `EndpointDescriptor`, `ConnectionState`, `RoutingDecision`.
* `ITransportClient`, `ITransportServer`.
* Requirements:
* Spec change with rationale.
* Cross-side impact analysis: gateway + microservice + all transports.
* Tests updated for InMemory and at least one real transport.
* Review: 2+ reviewers, one acting as “protocol owner”.
2. **Category B: Router logic / routing plugin**
* Affects:
* `IGlobalRoutingState` implementation.
* `IRoutingPlugin` logic (region, ping, heartbeat).
* Requirements:
* Unit tests for routing plugin (selection rules).
* At least one integration test through gateway + InMemory.
* Review: at least one reviewer who understands region/version semantics.
3. **Category C: Transport implementation**
* Affects:
* TCP/TLS/UDP/RabbitMQ clients & servers.
* Requirements:
* Transport-specific tests (connection, basic request/response, timeout).
* No protocol changes.
* Review: 1-2 reviewers, including one who owns that transport.
4. **Category D: SDK / Microservice developer experience**
* Affects:
* `StellaOps.Microservice` public surface, endpoint discovery, YAML merging.
* Requirements:
* API review for public surface.
* Docs update (`Microservice.md`) if behavior changes.
* Review: 1-2 reviewers.
5. **Category E: Docs only**
* Affects:
* `docs/router/*`, no code.
* Requirements:
* Ensure docs match current behavior; if not, spawn follow-up issues.
---
## 4. Workflow per change (what each agent does)
For any non-trivial change:
1. **Check the spec**
* Confirm that:
* The desired behavior is already described, or
* You will extend the spec first.
2. **Update / extend spec if needed**
* Edit `docs/router/specs.md` or appropriate doc.
* Document:
* What's changing.
* Why we need it.
* Which components are affected.
3. **Adjust Common / contracts if needed**
* Only after spec is updated.
* Keep changes minimal and backwards compatible where possible.
4. **Implement in InMemory path**
* Update:
* InMemory `ITransportClient`/hub.
* Microservice and gateway logic that rely on it.
* Add tests to prove behavior.
5. **Port to real transports**
* Implement the same behavior in:
* TCP (baseline).
* TLS (wrapping TCP).
* Others when needed.
* Reuse the InMemory test pattern for the transport tests.
6. **Add / update tests**
* Unit tests for logic.
* Integration tests for gateway + microservice via at least one real transport.
7. **Update documentation**
* Update relevant docs:
* `Stella Ops Router - Webserver.md`
* `Stella Ops Router - Microservice.md`
* `Common.md`, if common contracts changed.
* Highlight any new configuration knobs or invariants.
---
## 5. Testing expectations for all agents
Agents should treat tests as part of the change, not an afterthought.
1. **Unit tests**
* For:
* Routing plugin decisions.
* YAML merge behavior.
* Payload budget logic.
* Goal:
* All tricky branches are covered.
2. **Integration tests**
* For gateway + microservice using:
* InMemory.
* At least one real transport (TCP in dev).
* Scenarios to maintain:
* Simple request/response.
* Streaming upload.
* Cancellation on client abort.
* Timeout leading to CANCEL.
* Payload limit exceeded.
3. **Smoke tests for examples**
* Ensure `StellaOps.Billing.Microservice` example always passes a small test:
* `/ping` works.
* `/billing/invoices/upload` streaming behaves.
4. **CI gating**
* No PR merges unless:
* `dotnet build` for solution succeeds.
* All tests pass.
* If agents add new projects/tests, CI must be updated in the same PR.
---
## 6. How agents should use configuration & YAML
1. **Router side**
* Always read payload limits, node region, transports from `RouterConfig` (bound from YAML + env).
* Do not hardcode:
* Limits.
* Regions.
* Ports.
* If behavior depends on config, fetch from `IOptionsMonitor<RouterConfig>` at runtime, not from cached fields unless you explicitly freeze.
2. **Microservice side**
* Identity & router pool:
* From `StellaMicroserviceOptions` (code/env).
* Endpoint metadata overrides:
* From YAML (`ConfigFilePath`) merged into reflection result.
* Agents must not let YAML create endpoints that don't exist in code; overrides only.
3. **No hidden defaults**
* If a default is important (e.g. `HeartbeatInterval`), document it and centralize it.
* Don't sprinkle magic numbers across code.
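The overrides-only rule for microservice YAML can be enforced with a startup check along these lines. This is a sketch under assumptions: `OverrideValidator` is a hypothetical helper, endpoints are reduced to plain `(Method, Path)` tuples, and what to do with the unknown entries (fail startup or log a warning) is left to the spec.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverrideValidator
{
    // Returns the YAML entries that do not correspond to any endpoint discovered in code.
    public static IReadOnlyList<(string Method, string Path)> FindUnknownOverrides(
        IEnumerable<(string Method, string Path)> discovered,
        IEnumerable<(string Method, string Path)> yamlOverrides)
    {
        // Assumption: methods compare case-insensitively, paths are matched exactly.
        var known = new HashSet<(string, string)>(
            discovered.Select(e => (e.Method.ToUpperInvariant(), e.Path)));

        return yamlOverrides
            .Where(o => !known.Contains((o.Method.ToUpperInvariant(), o.Path)))
            .ToList();
    }
}
```

An empty result means every YAML entry overrides an endpoint that reflection already found; anything else is a YAML entry trying to create an endpoint.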
---
## 7. Adding new capabilities: pattern all agents follow
When someone wants a new capability (e.g. “retry on transient transport failures”):
1. **Open a design issue / doc snippet**
* Describe:
* Problem.
* Proposed design.
* Where it sits in architecture (router, microservice, transport, config).
2. **Update spec**
* Write the behavior in the appropriate doc section.
* Include:
* API shape (if public).
* Transport impacts.
* Failure modes.
3. **Follow the vertical slice path**
* Implement in Common (if needed).
* Implement InMemory.
* Implement in primary transport (TCP).
* Add tests.
* Update docs.
Agents should not just spike code into TCP implementation without spec or tests.
---
## 8. Logging, tracing, and debugging expectations
Agents should instrument consistently; this matters for operations and for debugging during development.
1. **Use structured logging**
* At minimum, include:
* `ServiceName`
* `InstanceId`
* `CorrelationId`
* `Method`
* `Path`
* `ConnectionId`
* Never log full payload bodies by default for privacy and performance; log sizes and key metadata instead.
2. **Trace correlation**
* Ensure correlation IDs:
* Propagate from HTTP (gateway) into `Frame.CorrelationId`.
* Are used in logs on both sides (gateway + microservice).
3. **Agent debugging guidance**
* When debugging a routing or transport problem:
* Turn on debug logging for gateway + microservice for that service.
* Use the correlation ID to follow the request end-to-end.
* Verify:
* HELLO registration.
* HEARTBEAT events.
* REQUEST leaving gateway.
* RESPONSE arriving.
---
## 9. Daily agent workflow (practical directions)
For each day / task, an agent should:
1. **Start from an issue or spec line item**
* Never “just code something” without an issue/state in the backlog.
2. **Locate the relevant doc**
* Spec section.
* Example docs (e.g. Billing sample).
* Migration doc if working on conversion.
3. **Work in a feature branch**
* Branch name reflects scope: `feature/streaming-tcp`, `fix/router-cancellation`, etc.
4. **Keep notes**
* If an assumption is made (e.g. “we currently don't support streaming over RabbitMQ”), note it in the issue.
* If they discover inconsistency in docs, open a doc-fix issue.
5. **Finish the full slice**
* Code + tests + docs.
* Keep partial implementations behind feature flags (if needed) and clearly marked.
6. **Open PR with clear description**
* What changed.
* Which spec section it implements or modifies.
* Any risks or rollback notes.
---
## 10. Guardrails against drift
Finally, a few things agents must actively avoid:
* **No silent protocol changes**
* Don't change `FrameType` semantics, payload formats, or header layout without:
* Spec update.
* Full impact review.
* **No spec-less behavior**
* If something matters at runtime (timeouts, retries, routing rules), it has to be in the docs, not just in someone's head.
* **No bypassing of router**
* Do not introduce “temporary” direct calls from clients to microservices. All client HTTP should go via gateway.
* **No direct dependencies on specific transports in domain code**
* Domain and microservice endpoint logic must not know if the transport is TCP, TLS, UDP, or RabbitMQ. They only see `RawRequestContext`, `RawResponse`, and cancellation tokens.
---
If you want, I can turn this into a one-page “Agent Handbook” markdown file you can drop into `docs/router/AGENTS_PROCESS.md` and link from `specs.md` so every AI or human dev working on this stack has the same ground rules.

@@ -1,946 +0,0 @@
# Step 13: InMemory Transport Implementation
**Phase 3: Transport Layer**
**Estimated Complexity:** Medium
**Dependencies:** Step 12 (Request/Response Serialization)
---
## Overview
The InMemory transport provides a high-performance, zero-network transport for testing, local development, and same-process microservices. It serves as the reference implementation for the transport layer and must pass all protocol tests before any real transport implementation.
---
## Goals
1. Implement a fully-functional in-process transport without network overhead
2. Serve as the reference implementation for transport protocol compliance
3. Enable fast integration tests without network dependencies
4. Support all frame types and streaming semantics
5. Provide debugging hooks for protocol validation
---
## Core Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ InMemory Transport Hub │
├─────────────────────────────────────────────────────────────┤
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Gateway Side │◄──►│ Channels │◄──►│Microservice │ │
│ │ Client │ │ (Duplex) │ │ Server │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ Connection Registry Frame Queue Handler Dispatch │
└─────────────────────────────────────────────────────────────┘
```
---
## Core Types
### InMemory Channel
```csharp
using System.Threading.Channels;
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// Bidirectional in-memory channel for frame exchange.
/// </summary>
public sealed class InMemoryChannel : IAsyncDisposable
{
private readonly Channel<Frame> _gatewayToService;
private readonly Channel<Frame> _serviceToGateway;
private readonly CancellationTokenSource _cts;
public string ChannelId { get; }
public string ServiceName { get; }
public string InstanceId { get; }
public ConnectionState State { get; private set; }
public DateTimeOffset CreatedAt { get; }
public DateTimeOffset LastActivityAt { get; private set; }
public InMemoryChannel(string serviceName, string instanceId)
{
ChannelId = Guid.NewGuid().ToString("N");
ServiceName = serviceName;
InstanceId = instanceId;
CreatedAt = DateTimeOffset.UtcNow;
LastActivityAt = CreatedAt;
State = ConnectionState.Connecting;
_cts = new CancellationTokenSource();
// Bounded channels to provide backpressure
var options = new BoundedChannelOptions(1000)
{
FullMode = BoundedChannelFullMode.Wait,
SingleReader = false,
SingleWriter = false
};
_gatewayToService = Channel.CreateBounded<Frame>(options);
_serviceToGateway = Channel.CreateBounded<Frame>(options);
}
/// <summary>
/// Gets the writer for sending frames from gateway to service.
/// </summary>
public ChannelWriter<Frame> GatewayWriter => _gatewayToService.Writer;
/// <summary>
/// Gets the reader for receiving frames from gateway (service side).
/// </summary>
public ChannelReader<Frame> ServiceReader => _gatewayToService.Reader;
/// <summary>
/// Gets the writer for sending frames from service to gateway.
/// </summary>
public ChannelWriter<Frame> ServiceWriter => _serviceToGateway.Writer;
/// <summary>
/// Gets the reader for receiving frames from service (gateway side).
/// </summary>
public ChannelReader<Frame> GatewayReader => _serviceToGateway.Reader;
public void MarkConnected()
{
State = ConnectionState.Connected;
LastActivityAt = DateTimeOffset.UtcNow;
}
public void UpdateActivity()
{
LastActivityAt = DateTimeOffset.UtcNow;
}
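public ValueTask DisposeAsync()
{
// Idempotent: a second call must not touch the already disposed CancellationTokenSource.
if (State == ConnectionState.Disconnected) return ValueTask.CompletedTask;
State = ConnectionState.Disconnected;
_cts.Cancel();
// Completing both writers lets pending ReadAllAsync loops drain and exit.
_gatewayToService.Writer.TryComplete();
_serviceToGateway.Writer.TryComplete();
_cts.Dispose();
return ValueTask.CompletedTask;
}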
}
```
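The backpressure behaviour that `BoundedChannelFullMode.Wait` provides can be seen in isolation with a small sketch. `BackpressureDemo` is a hypothetical class that uses only `System.Threading.Channels`; it is not part of the transport and uses `int` items in place of `Frame`.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class BackpressureDemo
{
    private static Channel<int> Create(int capacity) =>
        Channel.CreateBounded<int>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.Wait
        });

    // Counts how many synchronous writes are accepted before the channel is full.
    public static int FillUntilFull(int capacity)
    {
        var channel = Create(capacity);
        var accepted = 0;
        while (channel.Writer.TryWrite(accepted)) accepted++;
        return accepted;
    }

    // Returns true if WriteAsync stayed pending while the channel was full
    // and completed only after a reader made room.
    public static async Task<bool> WriterWaitsForReaderAsync()
    {
        var channel = Create(1);
        await channel.Writer.WriteAsync(1);                    // channel is now full
        var pending = channel.Writer.WriteAsync(2).AsTask();   // must wait for space
        var waitedWhileFull = !pending.IsCompleted;
        await channel.Reader.ReadAsync();                      // frees one slot
        await pending;                                         // the blocked write completes
        return waitedWhileFull;
    }
}
```

With a capacity of 1000 frames per direction, a slow consumer therefore stalls the producer's `WriteAsync` rather than letting the queue grow without bound.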
### InMemory Hub
```csharp
using System.Collections.Concurrent;
using Microsoft.Extensions.Logging;
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// Central hub managing all InMemory transport connections.
/// </summary>
public sealed class InMemoryTransportHub : IDisposable
{
private readonly ConcurrentDictionary<string, InMemoryChannel> _channels = new();
private readonly ConcurrentDictionary<string, List<string>> _serviceChannels = new();
private readonly ILogger<InMemoryTransportHub> _logger;
public InMemoryTransportHub(ILogger<InMemoryTransportHub> logger)
{
_logger = logger;
}
/// <summary>
/// Creates a new channel for a microservice connection.
/// </summary>
public InMemoryChannel CreateChannel(string serviceName, string instanceId)
{
var channel = new InMemoryChannel(serviceName, instanceId);
if (!_channels.TryAdd(channel.ChannelId, channel))
{
throw new InvalidOperationException($"Channel {channel.ChannelId} already exists");
}
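// GetOrAdd keeps the list mutation outside the dictionary's factory delegates,
// which ConcurrentDictionary may invoke more than once under contention.
var channelIds = _serviceChannels.GetOrAdd(serviceName, _ => new List<string>());
lock (channelIds) { channelIds.Add(channel.ChannelId); }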
_logger.LogDebug(
"Created InMemory channel {ChannelId} for {ServiceName}/{InstanceId}",
channel.ChannelId, serviceName, instanceId);
return channel;
}
/// <summary>
/// Gets a channel by ID.
/// </summary>
public InMemoryChannel? GetChannel(string channelId)
{
return _channels.TryGetValue(channelId, out var channel) ? channel : null;
}
/// <summary>
/// Gets all channels for a service.
/// </summary>
public IReadOnlyList<InMemoryChannel> GetServiceChannels(string serviceName)
{
if (!_serviceChannels.TryGetValue(serviceName, out var channelIds))
return Array.Empty<InMemoryChannel>();
var result = new List<InMemoryChannel>();
lock (channelIds)
{
foreach (var id in channelIds)
{
if (_channels.TryGetValue(id, out var channel) &&
channel.State == ConnectionState.Connected)
{
result.Add(channel);
}
}
}
return result;
}
/// <summary>
/// Removes a channel from the hub.
/// </summary>
public async Task RemoveChannelAsync(string channelId)
{
if (_channels.TryRemove(channelId, out var channel))
{
if (_serviceChannels.TryGetValue(channel.ServiceName, out var list))
{
lock (list) { list.Remove(channelId); }
}
await channel.DisposeAsync();
_logger.LogDebug("Removed InMemory channel {ChannelId}", channelId);
}
}
/// <summary>
/// Gets all active channels.
/// </summary>
public IEnumerable<InMemoryChannel> GetAllChannels()
{
return _channels.Values.Where(c => c.State == ConnectionState.Connected);
}
public void Dispose()
{
foreach (var channel in _channels.Values)
{
_ = channel.DisposeAsync();
}
_channels.Clear();
_serviceChannels.Clear();
}
}
```
---
## Gateway-Side Client
```csharp
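using System.Collections.Concurrent;
using System.Runtime.CompilerServices;
using Microsoft.Extensions.Logging;
namespace StellaOps.Router.Transport.InMemory;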
/// <summary>
/// Gateway-side client for InMemory transport.
/// </summary>
public sealed class InMemoryTransportClient : ITransportClient
{
private readonly InMemoryTransportHub _hub;
private readonly IPayloadSerializer _serializer;
private readonly ILogger<InMemoryTransportClient> _logger;
private readonly ConcurrentDictionary<string, TaskCompletionSource<ResponsePayload>> _pendingRequests = new();
public string TransportType => "InMemory";
public InMemoryTransportClient(
InMemoryTransportHub hub,
IPayloadSerializer serializer,
ILogger<InMemoryTransportClient> logger)
{
_hub = hub;
_serializer = serializer;
_logger = logger;
}
public async Task<ResponsePayload> SendRequestAsync(
string serviceName,
RequestPayload request,
TimeSpan timeout,
CancellationToken cancellationToken)
{
var channels = _hub.GetServiceChannels(serviceName);
if (channels.Count == 0)
{
throw new NoAvailableInstanceException(serviceName);
}
// Simple random selection (in production, use routing plugin)
var channel = channels[Random.Shared.Next(channels.Count)];
var correlationId = Guid.NewGuid().ToString("N");
var tcs = new TaskCompletionSource<ResponsePayload>(TaskCreationOptions.RunContinuationsAsynchronously);
_pendingRequests[correlationId] = tcs;
try
{
// Create and send request frame
var frame = new Frame
{
Type = FrameType.Request,
CorrelationId = correlationId,
Payload = _serializer.SerializeRequest(request)
};
await channel.GatewayWriter.WriteAsync(frame, cancellationToken);
channel.UpdateActivity();
// Start listening for response
_ = ListenForResponseAsync(channel, correlationId, cancellationToken);
// Wait for response with timeout
using var timeoutCts = new CancellationTokenSource(timeout);
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(
cancellationToken, timeoutCts.Token);
try
{
return await tcs.Task.WaitAsync(linkedCts.Token);
}
catch (OperationCanceledException) when (timeoutCts.IsCancellationRequested)
{
// Send cancel frame
await SendCancelAsync(channel, correlationId);
throw new TimeoutException($"Request to {serviceName} timed out after {timeout}");
}
}
finally
{
_pendingRequests.TryRemove(correlationId, out _);
}
}
public async IAsyncEnumerable<ResponsePayload> SendStreamingRequestAsync(
string serviceName,
IAsyncEnumerable<RequestPayload> requestChunks,
TimeSpan timeout,
[EnumeratorCancellation] CancellationToken cancellationToken)
{
var channels = _hub.GetServiceChannels(serviceName);
if (channels.Count == 0)
{
throw new NoAvailableInstanceException(serviceName);
}
var channel = channels[Random.Shared.Next(channels.Count)];
var correlationId = Guid.NewGuid().ToString("N");
// Send all request chunks
await foreach (var chunk in requestChunks.WithCancellation(cancellationToken))
{
var frame = new Frame
{
Type = FrameType.Request,
CorrelationId = correlationId,
Payload = _serializer.SerializeRequest(chunk),
Flags = chunk.IsStreaming ? FrameFlags.None : FrameFlags.Final
};
await channel.GatewayWriter.WriteAsync(frame, cancellationToken);
channel.UpdateActivity();
}
// Read response chunks
await foreach (var frame in channel.GatewayReader.ReadAllAsync(cancellationToken))
{
if (frame.CorrelationId != correlationId)
continue;
if (frame.Type == FrameType.Response)
{
var response = _serializer.DeserializeResponse(frame.Payload);
yield return response;
if (response.IsFinalChunk || frame.Flags.HasFlag(FrameFlags.Final))
yield break;
}
}
}
private async Task ListenForResponseAsync(
InMemoryChannel channel,
string correlationId,
CancellationToken cancellationToken)
{
try
{
await foreach (var frame in channel.GatewayReader.ReadAllAsync(cancellationToken))
{
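// The reader is shared by every in-flight request on this channel, so dispatch
// each response to its own pending request instead of discarding non-matching frames.
if (frame.Type == FrameType.Response &&
_pendingRequests.TryGetValue(frame.CorrelationId, out var tcs))
{
tcs.TrySetResult(_serializer.DeserializeResponse(frame.Payload));
}
// Stop once this request has been answered or abandoned (timeout or cancellation).
if ((frame.Type == FrameType.Response && frame.CorrelationId == correlationId) ||
!_pendingRequests.ContainsKey(correlationId))
{
return;
}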
}
}
catch (OperationCanceledException)
{
// Expected on cancellation
}
}
private async Task SendCancelAsync(InMemoryChannel channel, string correlationId)
{
try
{
var cancelFrame = new Frame
{
Type = FrameType.Cancel,
CorrelationId = correlationId,
Payload = Array.Empty<byte>()
};
await channel.GatewayWriter.WriteAsync(cancelFrame);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to send cancel frame for {CorrelationId}", correlationId);
}
}
}
```
---
## Microservice-Side Server
```csharp
using Microsoft.Extensions.Logging;
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// Microservice-side server for InMemory transport.
/// </summary>
public sealed class InMemoryTransportServer : ITransportServer
{
private readonly InMemoryTransportHub _hub;
private readonly IPayloadSerializer _serializer;
private readonly ILogger<InMemoryTransportServer> _logger;
private InMemoryChannel? _channel;
private CancellationTokenSource? _cts;
private Task? _processingTask;
public string TransportType => "InMemory";
public bool IsConnected => _channel?.State == ConnectionState.Connected;
public event Func<RequestPayload, CancellationToken, Task<ResponsePayload>>? OnRequest;
public event Func<string, CancellationToken, Task>? OnCancel;
public InMemoryTransportServer(
InMemoryTransportHub hub,
IPayloadSerializer serializer,
ILogger<InMemoryTransportServer> logger)
{
_hub = hub;
_serializer = serializer;
_logger = logger;
}
public async Task ConnectAsync(
string serviceName,
string instanceId,
EndpointDescriptor[] endpoints,
CancellationToken cancellationToken)
{
_channel = _hub.CreateChannel(serviceName, instanceId);
_cts = new CancellationTokenSource();
// Send HELLO frame
var helloPayload = new HelloPayload
{
ServiceName = serviceName,
InstanceId = instanceId,
Endpoints = endpoints,
Metadata = new Dictionary<string, string>
{
["transport"] = "InMemory",
["pid"] = Environment.ProcessId.ToString()
}
};
var helloFrame = new Frame
{
Type = FrameType.Hello,
CorrelationId = Guid.NewGuid().ToString("N"),
Payload = _serializer.SerializeHello(helloPayload)
};
await _channel.ServiceWriter.WriteAsync(helloFrame, cancellationToken);
// Wait for HELLO response
var response = await _channel.ServiceReader.ReadAsync(cancellationToken);
if (response.Type != FrameType.Hello)
{
throw new ProtocolException($"Expected HELLO response, got {response.Type}");
}
_channel.MarkConnected();
_logger.LogInformation(
"InMemory transport connected for {ServiceName}/{InstanceId}",
serviceName, instanceId);
// Start processing loop
_processingTask = ProcessFramesAsync(_cts.Token);
}
private async Task ProcessFramesAsync(CancellationToken cancellationToken)
{
if (_channel == null) return;
try
{
await foreach (var frame in _channel.ServiceReader.ReadAllAsync(cancellationToken))
{
_channel.UpdateActivity();
switch (frame.Type)
{
case FrameType.Request:
_ = HandleRequestAsync(frame, cancellationToken);
break;
case FrameType.Cancel:
if (OnCancel != null)
{
await OnCancel(frame.CorrelationId, cancellationToken);
}
break;
case FrameType.Heartbeat:
await HandleHeartbeatAsync(frame);
break;
}
}
}
catch (OperationCanceledException)
{
// Expected on shutdown
}
catch (Exception ex)
{
_logger.LogError(ex, "Error processing InMemory frames");
}
}
private async Task HandleRequestAsync(Frame frame, CancellationToken cancellationToken)
{
if (_channel == null || OnRequest == null) return;
try
{
var request = _serializer.DeserializeRequest(frame.Payload);
var response = await OnRequest(request, cancellationToken);
var responseFrame = new Frame
{
Type = FrameType.Response,
CorrelationId = frame.CorrelationId,
Payload = _serializer.SerializeResponse(response),
Flags = FrameFlags.Final
};
await _channel.ServiceWriter.WriteAsync(responseFrame, cancellationToken);
}
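        catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
        {
            // Shutdown in progress: the channel is going away, so no error frame is sent
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error handling request {CorrelationId}", frame.CorrelationId);

            // Send error response
            var errorResponse = new ResponsePayload
            {
                StatusCode = 500,
                Headers = new Dictionary<string, string>(),
                ErrorMessage = ex.Message,
                IsFinalChunk = true
            };
            var errorFrame = new Frame
            {
                Type = FrameType.Response,
                CorrelationId = frame.CorrelationId,
                Payload = _serializer.SerializeResponse(errorResponse),
                Flags = FrameFlags.Final | FrameFlags.Error
            };

            try
            {
                await _channel.ServiceWriter.WriteAsync(errorFrame, cancellationToken);
            }
            catch (Exception writeEx)
            {
                // This method runs fire-and-forget, so a failed write must not escape unobserved
                _logger.LogWarning(writeEx, "Failed to send error response for {CorrelationId}", frame.CorrelationId);
            }
        }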
}
private async Task HandleHeartbeatAsync(Frame frame)
{
if (_channel == null) return;
var pongFrame = new Frame
{
Type = FrameType.Heartbeat,
CorrelationId = frame.CorrelationId,
Payload = frame.Payload // Echo back
};
await _channel.ServiceWriter.WriteAsync(pongFrame);
}
public async Task DisconnectAsync()
{
_cts?.Cancel();
if (_processingTask != null)
{
try
{
await _processingTask.WaitAsync(TimeSpan.FromSeconds(5));
}
catch (TimeoutException)
{
_logger.LogWarning("InMemory processing task did not complete in time");
}
}
if (_channel != null)
{
await _hub.RemoveChannelAsync(_channel.ChannelId);
}
_cts?.Dispose();
}
public async Task SendHeartbeatAsync(CancellationToken cancellationToken)
{
if (_channel == null || _channel.State != ConnectionState.Connected)
return;
var heartbeatFrame = new Frame
{
Type = FrameType.Heartbeat,
CorrelationId = Guid.NewGuid().ToString("N"),
Payload = BitConverter.GetBytes(DateTimeOffset.UtcNow.ToUnixTimeMilliseconds())
};
await _channel.ServiceWriter.WriteAsync(heartbeatFrame, cancellationToken);
}
}
```
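The server exposes `SendHeartbeatAsync`, but this step does not show what calls it. The following is a minimal sketch of one way to drive it on a fixed interval, assuming .NET 6 or later for `PeriodicTimer`. `HeartbeatLoop` and its `sendHeartbeat` delegate are hypothetical names introduced for illustration and are not part of the transport API above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the transport API described above.
public static class HeartbeatLoop
{
    /// <summary>
    /// Invokes sendHeartbeat once per interval until the token is cancelled.
    /// Returns the number of heartbeats that were sent.
    /// </summary>
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> sendHeartbeat,
        TimeSpan interval,
        CancellationToken cancellationToken)
    {
        var sent = 0;
        using var timer = new PeriodicTimer(interval);
        try
        {
            // WaitForNextTickAsync throws OperationCanceledException on cancellation
            while (await timer.WaitForNextTickAsync(cancellationToken))
            {
                await sendHeartbeat(cancellationToken);
                sent++;
            }
        }
        catch (OperationCanceledException)
        {
            // Expected on shutdown
        }
        return sent;
    }
}
```

A microservice host could start it after `ConnectAsync` with something like `_ = HeartbeatLoop.RunAsync(server.SendHeartbeatAsync, TimeSpan.FromSeconds(10), stoppingToken);`, where the 10-second interval is an assumed value.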
---
## Integration with Global Routing State
```csharp
namespace StellaOps.Router.Transport.InMemory;
/// <summary>
/// InMemory transport integration with gateway routing state.
/// </summary>
public sealed class InMemoryRoutingIntegration : IHostedService
{
private readonly InMemoryTransportHub _hub;
private readonly IGlobalRoutingState _routingState;
private readonly ILogger<InMemoryRoutingIntegration> _logger;
private Timer? _syncTimer;
public InMemoryRoutingIntegration(
InMemoryTransportHub hub,
IGlobalRoutingState routingState,
ILogger<InMemoryRoutingIntegration> logger)
{
_hub = hub;
_routingState = routingState;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
// Sync InMemory channels with routing state periodically
_syncTimer = new Timer(SyncChannels, null, TimeSpan.Zero, TimeSpan.FromSeconds(5));
return Task.CompletedTask;
}
private void SyncChannels(object? state)
{
try
{
foreach (var channel in _hub.GetAllChannels())
{
var connection = new EndpointConnection
{
ServiceName = channel.ServiceName,
InstanceId = channel.InstanceId,
ConnectionId = channel.ChannelId,
Transport = "InMemory",
State = channel.State,
LastHeartbeat = channel.LastActivityAt
};
_routingState.UpdateConnection(connection);
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Error syncing InMemory channels");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_syncTimer?.Dispose();
return Task.CompletedTask;
}
}
```
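`SyncChannels` only upserts connections, so the routing state needs some other way to notice channels that have gone quiet. As a rough sketch, staleness can be derived from the same `LastActivityAt` timestamp. `ChannelSnapshot`, `StaleChannelDetector`, and the idle threshold are assumptions made for this example, not types from the transport.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified view of a channel; the real InMemoryChannel has more members.
public sealed record ChannelSnapshot(string ChannelId, DateTimeOffset LastActivityAt);

public static class StaleChannelDetector
{
    /// <summary>
    /// Returns the IDs of channels whose last activity is older than maxIdle.
    /// </summary>
    public static IReadOnlyList<string> FindStale(
        IEnumerable<ChannelSnapshot> channels,
        DateTimeOffset now,
        TimeSpan maxIdle)
    {
        return channels
            .Where(c => now - c.LastActivityAt > maxIdle)
            .Select(c => c.ChannelId)
            .ToList();
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the method, keeps the check deterministic in tests.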
---
## Service Registration
```csharp
namespace StellaOps.Router.Transport.InMemory;
public static class InMemoryTransportExtensions
{
/// <summary>
/// Adds InMemory transport to the gateway.
/// </summary>
public static IServiceCollection AddInMemoryTransport(this IServiceCollection services)
{
services.AddSingleton<InMemoryTransportHub>();
services.AddSingleton<ITransportClient, InMemoryTransportClient>();
services.AddHostedService<InMemoryRoutingIntegration>();
return services;
}
/// <summary>
/// Adds InMemory transport to a microservice.
/// </summary>
public static IServiceCollection AddInMemoryMicroserviceTransport(
this IServiceCollection services,
Action<InMemoryTransportOptions>? configure = null)
{
var options = new InMemoryTransportOptions();
configure?.Invoke(options);
services.AddSingleton(options);
services.AddSingleton<ITransportServer, InMemoryTransportServer>();
return services;
}
}
public class InMemoryTransportOptions
{
public int MaxPendingRequests { get; set; } = 1000;
public TimeSpan ConnectionTimeout { get; set; } = TimeSpan.FromSeconds(30);
}
```
---
## Testing Utilities
```csharp
namespace StellaOps.Router.Transport.InMemory.Testing;
/// <summary>
/// Test fixture for InMemory transport testing.
/// </summary>
public sealed class InMemoryTransportFixture : IAsyncDisposable
{
private readonly InMemoryTransportHub _hub;
private readonly ILoggerFactory _loggerFactory;
public InMemoryTransportHub Hub => _hub;
public InMemoryTransportFixture()
{
_loggerFactory = LoggerFactory.Create(b => b.AddConsole());
_hub = new InMemoryTransportHub(_loggerFactory.CreateLogger<InMemoryTransportHub>());
}
public InMemoryTransportClient CreateClient()
{
var serializer = new MessagePackPayloadSerializer();
return new InMemoryTransportClient(
_hub,
serializer,
_loggerFactory.CreateLogger<InMemoryTransportClient>());
}
public InMemoryTransportServer CreateServer()
{
var serializer = new MessagePackPayloadSerializer();
return new InMemoryTransportServer(
_hub,
serializer,
_loggerFactory.CreateLogger<InMemoryTransportServer>());
}
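    public ValueTask DisposeAsync()
    {
        // Nothing here is awaited, so return a completed ValueTask instead of marking the method async
        _hub.Dispose();
        _loggerFactory.Dispose();
        return ValueTask.CompletedTask;
    }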
}
```
---
## Unit Tests
```csharp
public class InMemoryTransportTests
{
[Fact]
public async Task SimpleRequestResponse_Works()
{
await using var fixture = new InMemoryTransportFixture();
var client = fixture.CreateClient();
var server = fixture.CreateServer();
// Setup server
server.OnRequest += (request, ct) => Task.FromResult(new ResponsePayload
{
StatusCode = 200,
Headers = new Dictionary<string, string>(),
Body = Encoding.UTF8.GetBytes($"Hello {request.Path}")
});
await server.ConnectAsync("test-service", "instance-1", Array.Empty<EndpointDescriptor>(), default);
// Send request
var response = await client.SendRequestAsync(
"test-service",
new RequestPayload
{
Method = "GET",
Path = "/test",
Headers = new Dictionary<string, string>(),
Claims = new Dictionary<string, string>()
},
TimeSpan.FromSeconds(5),
default);
Assert.Equal(200, response.StatusCode);
Assert.Equal("Hello /test", Encoding.UTF8.GetString(response.Body!));
}
[Fact]
public async Task Cancellation_SendsCancelFrame()
{
await using var fixture = new InMemoryTransportFixture();
var client = fixture.CreateClient();
var server = fixture.CreateServer();
var cancelReceived = new TaskCompletionSource<bool>();
server.OnRequest += async (request, ct) =>
{
await Task.Delay(TimeSpan.FromSeconds(30), ct);
return new ResponsePayload { StatusCode = 200, Headers = new Dictionary<string, string>() };
};
server.OnCancel += (correlationId, ct) =>
{
cancelReceived.TrySetResult(true);
return Task.CompletedTask;
};
await server.ConnectAsync("test-service", "instance-1", Array.Empty<EndpointDescriptor>(), default);
// Send request with short timeout
await Assert.ThrowsAsync<TimeoutException>(() =>
client.SendRequestAsync(
"test-service",
new RequestPayload { Method = "GET", Path = "/slow", Headers = new Dictionary<string, string>(), Claims = new Dictionary<string, string>() },
TimeSpan.FromMilliseconds(100),
default));
// Verify cancel was received
var result = await cancelReceived.Task.WaitAsync(TimeSpan.FromSeconds(1));
Assert.True(result);
}
[Fact]
public async Task MultipleInstances_DistributesRequests()
{
await using var fixture = new InMemoryTransportFixture();
var client = fixture.CreateClient();
var server1 = fixture.CreateServer();
var server2 = fixture.CreateServer();
var server1Count = 0;
var server2Count = 0;
server1.OnRequest += (r, ct) =>
{
Interlocked.Increment(ref server1Count);
return Task.FromResult(new ResponsePayload { StatusCode = 200, Headers = new Dictionary<string, string>() });
};
server2.OnRequest += (r, ct) =>
{
Interlocked.Increment(ref server2Count);
return Task.FromResult(new ResponsePayload { StatusCode = 200, Headers = new Dictionary<string, string>() });
};
await server1.ConnectAsync("test-service", "instance-1", Array.Empty<EndpointDescriptor>(), default);
await server2.ConnectAsync("test-service", "instance-2", Array.Empty<EndpointDescriptor>(), default);
// Send multiple requests
for (int i = 0; i < 100; i++)
{
await client.SendRequestAsync(
"test-service",
new RequestPayload { Method = "GET", Path = "/test", Headers = new Dictionary<string, string>(), Claims = new Dictionary<string, string>() },
TimeSpan.FromSeconds(5),
default);
}
// Both instances should have received requests
Assert.True(server1Count > 0);
Assert.True(server2Count > 0);
Assert.Equal(100, server1Count + server2Count);
}
}
```
---
## Deliverables
1. `StellaOps.Router.Transport.InMemory/InMemoryChannel.cs`
2. `StellaOps.Router.Transport.InMemory/InMemoryTransportHub.cs`
3. `StellaOps.Router.Transport.InMemory/InMemoryTransportClient.cs`
4. `StellaOps.Router.Transport.InMemory/InMemoryTransportServer.cs`
5. `StellaOps.Router.Transport.InMemory/InMemoryRoutingIntegration.cs`
6. `StellaOps.Router.Transport.InMemory/InMemoryTransportExtensions.cs`
7. `StellaOps.Router.Transport.InMemory.Testing/InMemoryTransportFixture.cs`
8. Unit tests for all frame types
9. Integration tests for request/response patterns
10. Streaming tests
---
## Next Step
Proceed to [Step 14: TCP Transport Implementation](14-Step.md) to implement the primary production transport.


@@ -1,994 +0,0 @@
# Step 16: GraphQL Handler Implementation
**Phase 4: Handler Plugins**
**Estimated Complexity:** High
**Dependencies:** Step 10 (Microservice Handler)
---
## Overview
The GraphQL handler routes GraphQL queries, mutations, and subscriptions to the microservices that own the requested root fields, as declared in the handler configuration. It supports schema stitching, query splitting, and federated execution across multiple services.
---
## Goals
1. Route GraphQL operations to appropriate backend services
2. Support schema federation/stitching across microservices
3. Handle batched queries with DataLoader patterns
4. Support subscriptions via WebSocket upgrade
5. Provide introspection proxying and schema caching
---
## Core Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│ GraphQL Handler │
├──────────────────────────────────────────────────────────────────┤
│ │
│ HTTP Request │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ Query Parser │──► Extract operation type & fields │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ ┌─────────────────┐ │
│ │ Query Planner │───►│ Schema Registry │ │
│ └───────┬───────┘ └─────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │Query Executor │──► Split & dispatch to services │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │Result Merger │──► Combine partial results │
│ └───────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────┘
```
---
## Configuration
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public class GraphQLHandlerConfig
{
/// <summary>Path prefix for GraphQL endpoint.</summary>
public string Path { get; set; } = "/graphql";
/// <summary>Whether to enable introspection queries.</summary>
public bool EnableIntrospection { get; set; } = true;
/// <summary>Whether to enable subscriptions.</summary>
public bool EnableSubscriptions { get; set; } = true;
/// <summary>Maximum query depth to prevent DOS.</summary>
public int MaxQueryDepth { get; set; } = 15;
/// <summary>Maximum query complexity score.</summary>
public int MaxQueryComplexity { get; set; } = 1000;
/// <summary>Timeout for query execution.</summary>
public TimeSpan ExecutionTimeout { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>Cache duration for schema introspection.</summary>
public TimeSpan SchemaCacheDuration { get; set; } = TimeSpan.FromMinutes(5);
/// <summary>Whether to enable query batching.</summary>
public bool EnableBatching { get; set; } = true;
/// <summary>Maximum batch size.</summary>
public int MaxBatchSize { get; set; } = 10;
/// <summary>Registered GraphQL services and their type ownership.</summary>
public Dictionary<string, GraphQLServiceConfig> Services { get; set; } = new();
}
public class GraphQLServiceConfig
{
/// <summary>Service name for routing.</summary>
public required string ServiceName { get; set; }
/// <summary>Root types this service handles (Query, Mutation, Subscription).</summary>
public HashSet<string> RootTypes { get; set; } = new();
/// <summary>Specific fields this service owns.</summary>
public Dictionary<string, HashSet<string>> OwnedFields { get; set; } = new();
/// <summary>Whether this service provides the full schema.</summary>
public bool IsSchemaProvider { get; set; }
}
```
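`MaxQueryDepth` is enforced through `IGraphQLParser.Validate`, which this step does not define. To make the setting concrete, here is a rough sketch that approximates depth by counting brace nesting. It is an assumption-laden stand-in, not the parser's method: a real implementation would walk the parsed AST and expand fragments, while this version also counts input-object literals and handles block strings only approximately.

```csharp
using System;

// Hypothetical approximation of a depth check; not the IGraphQLParser implementation.
public static class QueryDepth
{
    /// <summary>
    /// Returns the maximum '{' ... '}' nesting depth in a query,
    /// ignoring braces inside string literals and # comments.
    /// </summary>
    public static int Measure(string query)
    {
        int depth = 0, max = 0;
        bool inString = false;
        for (int i = 0; i < query.Length; i++)
        {
            char c = query[i];
            if (inString)
            {
                if (c == '\\') i++;                 // skip the escaped character
                else if (c == '"') inString = false;
                continue;
            }
            switch (c)
            {
                case '"':
                    inString = true;
                    break;
                case '#':
                    // Comments run to the end of the line
                    while (i < query.Length && query[i] != '\n') i++;
                    break;
                case '{':
                    depth++;
                    max = Math.Max(max, depth);
                    break;
                case '}':
                    depth--;
                    break;
            }
        }
        return max;
    }
}
```

With the default limit of 15, a request would be rejected when `QueryDepth.Measure(request.Query) > config.MaxQueryDepth`.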
---
## Core Types
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
/// <summary>
/// Parsed GraphQL request.
/// </summary>
public sealed class GraphQLRequest
{
public required string Query { get; init; }
public string? OperationName { get; init; }
public Dictionary<string, object?>? Variables { get; init; }
public Dictionary<string, object?>? Extensions { get; init; }
}
/// <summary>
/// GraphQL response format.
/// </summary>
public sealed class GraphQLResponse
{
public object? Data { get; set; }
public List<GraphQLError>? Errors { get; set; }
public Dictionary<string, object?>? Extensions { get; set; }
}
public sealed class GraphQLError
{
public required string Message { get; init; }
public List<GraphQLLocation>? Locations { get; init; }
public List<object>? Path { get; init; }
public Dictionary<string, object?>? Extensions { get; init; }
}
public sealed class GraphQLLocation
{
public int Line { get; init; }
public int Column { get; init; }
}
/// <summary>
/// Represents a planned query execution.
/// </summary>
public sealed class QueryPlan
{
public GraphQLOperationType OperationType { get; init; }
public List<QueryPlanNode> Nodes { get; init; } = new();
}
public sealed class QueryPlanNode
{
public string ServiceName { get; init; } = "";
public string SubQuery { get; init; } = "";
public List<string> RequiredFields { get; init; } = new();
public List<QueryPlanNode> DependsOn { get; init; } = new();
}
public enum GraphQLOperationType
{
Query,
Mutation,
Subscription
}
```
---
## GraphQL Handler Implementation
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public sealed class GraphQLHandler : IRouteHandler
{
public string HandlerType => "GraphQL";
public int Priority => 100;
private readonly GraphQLHandlerConfig _config;
private readonly IGraphQLParser _parser;
private readonly IQueryPlanner _planner;
private readonly IQueryExecutor _executor;
private readonly ISchemaRegistry _schemaRegistry;
private readonly ILogger<GraphQLHandler> _logger;
public GraphQLHandler(
IOptions<GraphQLHandlerConfig> config,
IGraphQLParser parser,
IQueryPlanner planner,
IQueryExecutor executor,
ISchemaRegistry schemaRegistry,
ILogger<GraphQLHandler> logger)
{
_config = config.Value;
_parser = parser;
_planner = planner;
_executor = executor;
_schemaRegistry = schemaRegistry;
_logger = logger;
}
public bool CanHandle(RouteMatchResult match)
{
return match.Handler == "GraphQL" ||
match.Route.Path.StartsWith(_config.Path, StringComparison.OrdinalIgnoreCase);
}
public async Task<RouteHandlerResult> HandleAsync(
HttpContext context,
RouteMatchResult match,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
try
{
// Handle WebSocket upgrade for subscriptions
if (context.WebSockets.IsWebSocketRequest && _config.EnableSubscriptions)
{
return await HandleSubscriptionAsync(context, claims, cancellationToken);
}
// Parse GraphQL request
var request = await ParseRequestAsync(context, cancellationToken);
// Validate query
var validationResult = _parser.Validate(
request.Query,
_config.MaxQueryDepth,
_config.MaxQueryComplexity);
if (!validationResult.IsValid)
{
return CreateErrorResponse(validationResult.Errors);
}
// Parse and analyze query
var operation = _parser.Parse(request.Query, request.OperationName);
// Check if introspection
if (operation.IsIntrospection)
{
if (!_config.EnableIntrospection)
{
return CreateErrorResponse(new[] { "Introspection is disabled" });
}
return await HandleIntrospectionAsync(request, cancellationToken);
}
// Plan query execution
var plan = _planner.CreatePlan(operation, _config.Services);
_logger.LogDebug(
"Query plan created: {NodeCount} nodes for {OperationType}",
plan.Nodes.Count, plan.OperationType);
// Execute plan
var result = await _executor.ExecuteAsync(
plan,
request,
claims,
_config.ExecutionTimeout,
cancellationToken);
return CreateSuccessResponse(result);
}
catch (GraphQLParseException ex)
{
return CreateErrorResponse(new[] { ex.Message });
}
catch (Exception ex)
{
_logger.LogError(ex, "GraphQL execution error");
return CreateErrorResponse(new[] { "Internal server error" }, 500);
}
}
    // GraphQL over HTTP uses camelCase keys ("query", "operationName", "data", "errors"),
    // so the default case-sensitive PascalCase mapping would neither read nor write them.
    // Null entries are omitted because "errors" must be absent when there are none.
    private static readonly JsonSerializerOptions JsonOptions = new(JsonSerializerDefaults.Web)
    {
        DefaultIgnoreCondition = System.Text.Json.Serialization.JsonIgnoreCondition.WhenWritingNull
    };

    private async Task<GraphQLRequest> ParseRequestAsync(
        HttpContext context,
        CancellationToken cancellationToken)
    {
        if (HttpMethods.IsGet(context.Request.Method))
        {
            var query = context.Request.Query["query"].ToString();
            if (string.IsNullOrWhiteSpace(query))
            {
                throw new GraphQLParseException("Missing 'query' parameter");
            }

            // StringValues.ToString() yields "" for a missing key; normalize that to null
            var operationName = context.Request.Query["operationName"].ToString();
            return new GraphQLRequest
            {
                Query = query,
                OperationName = string.IsNullOrEmpty(operationName) ? null : operationName,
                Variables = ParseVariables(context.Request.Query["variables"].ToString())
            };
        }

        try
        {
            var body = await JsonSerializer.DeserializeAsync<GraphQLRequest>(
                context.Request.Body,
                JsonOptions,
                cancellationToken);
            return body ?? throw new GraphQLParseException("Invalid request body");
        }
        catch (JsonException)
        {
            // Report malformed JSON as a GraphQL error rather than a 500
            throw new GraphQLParseException("Invalid request body");
        }
    }

    private static Dictionary<string, object?>? ParseVariables(string? json)
    {
        if (string.IsNullOrEmpty(json))
            return null;

        try
        {
            return JsonSerializer.Deserialize<Dictionary<string, object?>>(json, JsonOptions);
        }
        catch (JsonException)
        {
            throw new GraphQLParseException("Invalid 'variables' parameter");
        }
    }

    private async Task<RouteHandlerResult> HandleIntrospectionAsync(
        GraphQLRequest request,
        CancellationToken cancellationToken)
    {
        var schema = await _schemaRegistry.GetMergedSchemaAsync(cancellationToken);
        var result = await _executor.ExecuteIntrospectionAsync(schema, request, cancellationToken);
        return CreateSuccessResponse(result);
    }

    private async Task<RouteHandlerResult> HandleSubscriptionAsync(
        HttpContext context,
        IReadOnlyDictionary<string, string> claims,
        CancellationToken cancellationToken)
    {
        var webSocket = await context.WebSockets.AcceptWebSocketAsync("graphql-transport-ws");
        await _executor.HandleSubscriptionAsync(webSocket, claims, cancellationToken);
        return new RouteHandlerResult
        {
            Handled = true,
            StatusCode = 101 // Switching Protocols
        };
    }

    private RouteHandlerResult CreateSuccessResponse(GraphQLResponse response)
    {
        return new RouteHandlerResult
        {
            Handled = true,
            StatusCode = 200,
            ContentType = "application/json",
            Body = JsonSerializer.SerializeToUtf8Bytes(response, JsonOptions)
        };
    }

    private RouteHandlerResult CreateErrorResponse(IEnumerable<string> messages, int statusCode = 200)
    {
        var response = new GraphQLResponse
        {
            Errors = messages.Select(m => new GraphQLError { Message = m }).ToList()
        };
        return new RouteHandlerResult
        {
            Handled = true,
            StatusCode = statusCode,
            ContentType = "application/json",
            Body = JsonSerializer.SerializeToUtf8Bytes(response, JsonOptions)
        };
    }
}
```
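The configuration exposes `EnableBatching` and `MaxBatchSize`, but the handler above parses a single request object. One possible way to tell a single request from a batch is to inspect the JSON root: an object is one request, an array is a batch. The sketch below assumes that convention. `BatchBodyReader` is a hypothetical helper, and it reduces each request to its query text for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper; the request shape is reduced to the query text for brevity.
public static class BatchBodyReader
{
    /// <summary>
    /// Splits a request body into individual query strings.
    /// A JSON object is a single request; a JSON array is a batch.
    /// </summary>
    public static List<string> ReadQueries(string body, int maxBatchSize)
    {
        using var doc = JsonDocument.Parse(body);
        var root = doc.RootElement;
        var queries = new List<string>();

        if (root.ValueKind == JsonValueKind.Object)
        {
            queries.Add(ReadQuery(root));
        }
        else if (root.ValueKind == JsonValueKind.Array)
        {
            if (root.GetArrayLength() > maxBatchSize)
            {
                throw new InvalidOperationException(
                    $"Batch of {root.GetArrayLength()} exceeds limit of {maxBatchSize}");
            }
            foreach (var item in root.EnumerateArray())
            {
                queries.Add(ReadQuery(item));
            }
        }
        else
        {
            throw new InvalidOperationException("Body must be a JSON object or array");
        }
        return queries;
    }

    private static string ReadQuery(JsonElement element)
    {
        // TryGetProperty throws on non-objects, so check the kind first
        if (element.ValueKind == JsonValueKind.Object &&
            element.TryGetProperty("query", out var q) &&
            q.ValueKind == JsonValueKind.String)
        {
            return q.GetString()!;
        }
        throw new InvalidOperationException("Each request needs a string \"query\" property");
    }
}
```

Each extracted query would then go through the same validate, plan, and execute path as a single request, with the responses returned as a JSON array in the same order.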
---
## Query Planner
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public interface IQueryPlanner
{
QueryPlan CreatePlan(
ParsedOperation operation,
Dictionary<string, GraphQLServiceConfig> services);
}
public sealed class QueryPlanner : IQueryPlanner
{
private readonly ILogger<QueryPlanner> _logger;
public QueryPlanner(ILogger<QueryPlanner> logger)
{
_logger = logger;
}
public QueryPlan CreatePlan(
ParsedOperation operation,
Dictionary<string, GraphQLServiceConfig> services)
{
var plan = new QueryPlan
{
OperationType = operation.OperationType
};
// Group fields by owning service
var fieldsByService = new Dictionary<string, List<FieldSelection>>();
foreach (var field in operation.SelectionSet)
{
var service = FindOwningService(operation.OperationType, field.Name, services);
if (!fieldsByService.ContainsKey(service))
{
fieldsByService[service] = new List<FieldSelection>();
}
fieldsByService[service].Add(field);
}
// Create execution nodes
foreach (var (serviceName, fields) in fieldsByService)
{
var subQuery = BuildSubQuery(operation, fields);
plan.Nodes.Add(new QueryPlanNode
{
ServiceName = serviceName,
SubQuery = subQuery,
RequiredFields = fields.Select(f => f.Name).ToList()
});
}
// For mutations, nodes must execute sequentially
if (operation.OperationType == GraphQLOperationType.Mutation)
{
for (int i = 1; i < plan.Nodes.Count; i++)
{
plan.Nodes[i].DependsOn.Add(plan.Nodes[i - 1]);
}
}
return plan;
}
private string FindOwningService(
GraphQLOperationType opType,
string fieldName,
Dictionary<string, GraphQLServiceConfig> services)
{
var rootType = opType switch
{
GraphQLOperationType.Query => "Query",
GraphQLOperationType.Mutation => "Mutation",
GraphQLOperationType.Subscription => "Subscription",
_ => "Query"
};
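        // Pass 1: explicit field ownership wins, regardless of service registration order.
        // Checking RootTypes in the same pass would let a root-type owner listed first
        // capture fields that another service owns explicitly.
        foreach (var config in services.Values)
        {
            if (config.OwnedFields.TryGetValue(rootType, out var fields) &&
                fields.Contains(fieldName))
            {
                // Return the routable service name, which is what the executor and schema registry use
                return config.ServiceName;
            }
        }

        // Pass 2: fall back to a service that handles the whole root type
        foreach (var config in services.Values)
        {
            if (config.RootTypes.Contains(rootType))
            {
                return config.ServiceName;
            }
        }

        throw new GraphQLExecutionException($"No service found for field: {rootType}.{fieldName}");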
}
private string BuildSubQuery(ParsedOperation operation, List<FieldSelection> fields)
{
var sb = new StringBuilder();
sb.Append(operation.OperationType.ToString().ToLowerInvariant());
if (!string.IsNullOrEmpty(operation.Name))
{
sb.Append(' ').Append(operation.Name);
}
if (operation.Variables.Count > 0)
{
sb.Append('(');
sb.Append(string.Join(", ", operation.Variables.Select(v => $"${v.Name}: {v.Type}")));
sb.Append(')');
}
sb.Append(" { ");
foreach (var field in fields)
{
AppendField(sb, field);
}
sb.Append(" }");
return sb.ToString();
}
private void AppendField(StringBuilder sb, FieldSelection field)
{
if (!string.IsNullOrEmpty(field.Alias))
{
sb.Append(field.Alias).Append(": ");
}
sb.Append(field.Name);
if (field.Arguments.Count > 0)
{
sb.Append('(');
sb.Append(string.Join(", ", field.Arguments.Select(a => $"{a.Key}: {FormatValue(a.Value)}")));
sb.Append(')');
}
if (field.SelectionSet.Count > 0)
{
sb.Append(" { ");
foreach (var subField in field.SelectionSet)
{
AppendField(sb, subField);
sb.Append(' ');
}
sb.Append('}');
}
sb.Append(' ');
}
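    private static string FormatValue(object? value)
    {
        return value switch
        {
            null => "null",
            // Escape quotes, backslashes, and control characters; JSON string escapes are valid in GraphQL strings
            string s => JsonSerializer.Serialize(s),
            bool b => b ? "true" : "false",
            // Format numbers culture-invariantly so 1.5 never becomes "1,5"
            IFormattable f => f.ToString(null, System.Globalization.CultureInfo.InvariantCulture),
            _ => value.ToString() ?? "null"
        };
    }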
}
```
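The ownership lookup is the part of planning most sensitive to ordering, so a small worked example may help. The sketch below assumes explicit `OwnedFields` entries should take precedence over `RootTypes`, with root-type ownership acting as a fallback. `ServiceOwnership` is a simplified stand-in for `GraphQLServiceConfig`, introduced only for this illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Simplified stand-in for GraphQLServiceConfig, for illustration only.
public sealed record ServiceOwnership(
    string ServiceName,
    HashSet<string> RootTypes,
    Dictionary<string, HashSet<string>> OwnedFields);

public static class OwnershipResolver
{
    /// <summary>
    /// Explicit field ownership wins; root-type ownership is the fallback.
    /// Returns null when no service claims the field.
    /// </summary>
    public static string? Resolve(
        string rootType,
        string fieldName,
        IReadOnlyList<ServiceOwnership> services)
    {
        foreach (var s in services)
        {
            if (s.OwnedFields.TryGetValue(rootType, out var fields) && fields.Contains(fieldName))
                return s.ServiceName;
        }
        foreach (var s in services)
        {
            if (s.RootTypes.Contains(rootType))
                return s.ServiceName;
        }
        return null;
    }
}
```

With a `user-service` entry that lists `Query` in its root types and a `billing-service` entry that owns `Query.invoices`, the field `invoices` resolves to `billing-service` even when `user-service` is listed first, while an unclaimed `Query` field falls back to `user-service`.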
---
## Query Executor
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public interface IQueryExecutor
{
Task<GraphQLResponse> ExecuteAsync(
QueryPlan plan,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
TimeSpan timeout,
CancellationToken cancellationToken);
Task<GraphQLResponse> ExecuteIntrospectionAsync(
GraphQLSchema schema,
GraphQLRequest request,
CancellationToken cancellationToken);
Task HandleSubscriptionAsync(
WebSocket webSocket,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken);
}
public sealed class QueryExecutor : IQueryExecutor
{
private readonly ITransportClientFactory _transportFactory;
private readonly IPayloadSerializer _serializer;
private readonly ILogger<QueryExecutor> _logger;
public QueryExecutor(
ITransportClientFactory transportFactory,
IPayloadSerializer serializer,
ILogger<QueryExecutor> logger)
{
_transportFactory = transportFactory;
_serializer = serializer;
_logger = logger;
}
public async Task<GraphQLResponse> ExecuteAsync(
QueryPlan plan,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
TimeSpan timeout,
CancellationToken cancellationToken)
{
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(timeout);
var results = new ConcurrentDictionary<string, object?>();
var errors = new ConcurrentBag<GraphQLError>();
// Execute nodes respecting dependencies
await ExecuteNodesAsync(plan.Nodes, request, claims, results, errors, cts.Token);
// Merge results
var data = MergeResults(plan.Nodes, results);
return new GraphQLResponse
{
Data = data,
Errors = errors.Any() ? errors.ToList() : null
};
}
private async Task ExecuteNodesAsync(
List<QueryPlanNode> nodes,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
ConcurrentDictionary<string, object?> results,
ConcurrentBag<GraphQLError> errors,
CancellationToken cancellationToken)
{
// Group nodes by dependency level
var executed = new HashSet<QueryPlanNode>();
while (executed.Count < nodes.Count)
{
var ready = nodes
.Where(n => !executed.Contains(n))
.Where(n => n.DependsOn.All(d => executed.Contains(d)))
.ToList();
if (ready.Count == 0)
{
throw new GraphQLExecutionException("Circular dependency in query plan");
}
// Execute ready nodes in parallel
            await Parallel.ForEachAsync(ready, cancellationToken, async (node, ct) =>
            {
                try
                {
                    var result = await ExecuteNodeAsync(node, request, claims, ct);
                    MergeNodeResult(results, result);

                    // Surface errors reported by the downstream service instead of dropping them
                    foreach (var error in result.Errors ?? Enumerable.Empty<GraphQLError>())
                    {
                        errors.Add(error);
                    }
                }
                catch (Exception ex)
                {
                    _logger.LogError(ex, "Error executing node for service {Service}", node.ServiceName);
                    errors.Add(new GraphQLError
                    {
                        Message = $"Error from {node.ServiceName}: {ex.Message}",
                        Path = node.RequiredFields.Cast<object>().ToList()
                    });
                }
            });
foreach (var node in ready)
{
executed.Add(node);
}
}
}
private async Task<GraphQLResponse> ExecuteNodeAsync(
QueryPlanNode node,
GraphQLRequest request,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var client = _transportFactory.GetClient(node.ServiceName);
var payload = new RequestPayload
{
Method = "POST",
Path = "/graphql",
Headers = new Dictionary<string, string>
{
["Content-Type"] = "application/json"
},
Claims = claims.ToDictionary(x => x.Key, x => x.Value),
Body = JsonSerializer.SerializeToUtf8Bytes(new
{
query = node.SubQuery,
variables = request.Variables,
operationName = request.OperationName
})
};
var response = await client.SendRequestAsync(
node.ServiceName,
payload,
TimeSpan.FromSeconds(30),
cancellationToken);
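        if (response.Body == null)
        {
            throw new GraphQLExecutionException($"Empty response from {node.ServiceName}");
        }

        // Downstream services reply with camelCase keys ("data", "errors"),
        // which the default case-sensitive mapping would not bind to Data and Errors
        return JsonSerializer.Deserialize<GraphQLResponse>(response.Body, DownstreamJsonOptions)
            ?? throw new GraphQLExecutionException($"Invalid response from {node.ServiceName}");
    }

    private static readonly JsonSerializerOptions DownstreamJsonOptions = new(JsonSerializerDefaults.Web);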
private void MergeNodeResult(ConcurrentDictionary<string, object?> results, GraphQLResponse response)
{
if (response.Data is JsonElement element && element.ValueKind == JsonValueKind.Object)
{
foreach (var property in element.EnumerateObject())
{
results[property.Name] = property.Value.Clone();
}
}
}
private object? MergeResults(List<QueryPlanNode> nodes, ConcurrentDictionary<string, object?> results)
{
return results.ToDictionary(x => x.Key, x => x.Value);
}
public Task<GraphQLResponse> ExecuteIntrospectionAsync(
GraphQLSchema schema,
GraphQLRequest request,
CancellationToken cancellationToken)
{
// Execute introspection against merged schema
var result = schema.ExecuteIntrospection(request);
return Task.FromResult(result);
}
public async Task HandleSubscriptionAsync(
WebSocket webSocket,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
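        var buffer = new byte[4096];
        try
        {
            while (webSocket.State == WebSocketState.Open && !cancellationToken.IsCancellationRequested)
            {
                // A message can span several frames; accumulate until EndOfMessage
                // so payloads larger than the buffer are not truncated
                using var messageBytes = new MemoryStream();
                WebSocketReceiveResult result;
                do
                {
                    result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), cancellationToken);
                    if (result.MessageType == WebSocketMessageType.Close)
                    {
                        break;
                    }
                    messageBytes.Write(buffer, 0, result.Count);
                }
                while (!result.EndOfMessage);

                if (result.MessageType == WebSocketMessageType.Close)
                {
                    await webSocket.CloseAsync(
                        WebSocketCloseStatus.NormalClosure,
                        "Closed by client",
                        cancellationToken);
                    break;
                }

                var message = Encoding.UTF8.GetString(messageBytes.ToArray());
                await HandleSubscriptionMessageAsync(webSocket, message, claims, cancellationToken);
            }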
}
catch (WebSocketException ex)
{
_logger.LogWarning(ex, "WebSocket error in subscription");
}
}
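    // Protocol messages use lowercase keys ("type", "id", "payload")
    private static readonly JsonSerializerOptions SubscriptionJsonOptions = new(JsonSerializerDefaults.Web);

    private async Task HandleSubscriptionMessageAsync(
        WebSocket webSocket,
        string message,
        IReadOnlyDictionary<string, string> claims,
        CancellationToken cancellationToken)
    {
        // Implement graphql-transport-ws protocol
        var msg = JsonSerializer.Deserialize<SubscriptionMessage>(message, SubscriptionJsonOptions);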
switch (msg?.Type)
{
case "connection_init":
await SendAsync(webSocket, new { type = "connection_ack" }, cancellationToken);
break;
case "subscribe":
// Start subscription
break;
case "complete":
// End subscription
break;
}
}
private async Task SendAsync(WebSocket webSocket, object message, CancellationToken cancellationToken)
{
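        var bytes = JsonSerializer.SerializeToUtf8Bytes(message);
        // Wrap the array explicitly so overload resolution picks the ArraySegment<byte> overload
        await webSocket.SendAsync(new ArraySegment<byte>(bytes), WebSocketMessageType.Text, true, cancellationToken);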
}
}
internal class SubscriptionMessage
{
public string? Type { get; set; }
public string? Id { get; set; }
public GraphQLRequest? Payload { get; set; }
}
```
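`ExecuteNodesAsync` runs the plan in waves: each pass picks every node whose dependencies have all completed, runs that set in parallel, and repeats. The grouping logic can be sketched on its own, using string IDs in place of `QueryPlanNode` references. `ExecutionWaves` is a hypothetical name used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of the wave grouping; node IDs stand in for QueryPlanNode.
public static class ExecutionWaves
{
    /// <summary>
    /// Groups node IDs into waves so that every node depends only on nodes in earlier waves.
    /// dependsOn maps each node ID to the IDs it must wait for.
    /// </summary>
    public static List<List<string>> Build(IReadOnlyDictionary<string, string[]> dependsOn)
    {
        var done = new HashSet<string>();
        var waves = new List<List<string>>();
        while (done.Count < dependsOn.Count)
        {
            var ready = dependsOn
                .Where(kv => !done.Contains(kv.Key) && kv.Value.All(d => done.Contains(d)))
                .Select(kv => kv.Key)
                .OrderBy(id => id, StringComparer.Ordinal) // stable order for readability
                .ToList();
            if (ready.Count == 0)
            {
                // Nothing can make progress: a cycle or a dependency on an unknown node
                throw new InvalidOperationException("Circular dependency in plan");
            }
            waves.Add(ready);
            done.UnionWith(ready);
        }
        return waves;
    }
}
```

For a query plan with independent nodes this yields a single wave. For a mutation plan, where the planner chains each node to its predecessor, it yields one node per wave, which is what gives mutations their sequential execution.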
---
## Schema Registry
```csharp
namespace StellaOps.Router.Handlers.GraphQL;
public interface ISchemaRegistry
{
Task<GraphQLSchema> GetMergedSchemaAsync(CancellationToken cancellationToken);
void InvalidateCache();
}
public sealed class SchemaRegistry : ISchemaRegistry
{
private readonly GraphQLHandlerConfig _config;
private readonly ITransportClientFactory _transportFactory;
private readonly ILogger<SchemaRegistry> _logger;
private GraphQLSchema? _cachedSchema;
private DateTimeOffset _cacheExpiry;
private readonly SemaphoreSlim _lock = new(1, 1);
public SchemaRegistry(
IOptions<GraphQLHandlerConfig> config,
ITransportClientFactory transportFactory,
ILogger<SchemaRegistry> logger)
{
_config = config.Value;
_transportFactory = transportFactory;
_logger = logger;
}
public async Task<GraphQLSchema> GetMergedSchemaAsync(CancellationToken cancellationToken)
{
if (_cachedSchema != null && DateTimeOffset.UtcNow < _cacheExpiry)
{
return _cachedSchema;
}
await _lock.WaitAsync(cancellationToken);
try
{
if (_cachedSchema != null && DateTimeOffset.UtcNow < _cacheExpiry)
{
return _cachedSchema;
}
var schemas = new List<string>();
foreach (var (name, config) in _config.Services)
{
if (config.IsSchemaProvider)
{
var schema = await FetchSchemaAsync(config.ServiceName, cancellationToken);
schemas.Add(schema);
}
}
_cachedSchema = MergeSchemas(schemas);
_cacheExpiry = DateTimeOffset.UtcNow.Add(_config.SchemaCacheDuration);
_logger.LogInformation("Schema cache refreshed, expires at {Expiry}", _cacheExpiry);
return _cachedSchema;
}
finally
{
_lock.Release();
}
}
private async Task<string> FetchSchemaAsync(string serviceName, CancellationToken cancellationToken)
{
var client = _transportFactory.GetClient(serviceName);
var introspectionQuery = @"
query IntrospectionQuery {
__schema {
types { ...FullType }
queryType { name }
mutationType { name }
subscriptionType { name }
}
}
fragment FullType on __Type {
kind name description
fields(includeDeprecated: true) {
name description
args { ...InputValue }
type { ...TypeRef }
isDeprecated deprecationReason
}
}
fragment InputValue on __InputValue { name description type { ...TypeRef } }
fragment TypeRef on __Type {
kind name
ofType { kind name ofType { kind name ofType { kind name } } }
}";
var payload = new RequestPayload
{
Method = "POST",
Path = "/graphql",
Headers = new Dictionary<string, string> { ["Content-Type"] = "application/json" },
Claims = new Dictionary<string, string>(),
Body = JsonSerializer.SerializeToUtf8Bytes(new { query = introspectionQuery })
};
var response = await client.SendRequestAsync(
serviceName,
payload,
TimeSpan.FromSeconds(30),
cancellationToken);
return Encoding.UTF8.GetString(response.Body ?? Array.Empty<byte>());
}
private GraphQLSchema MergeSchemas(List<string> schemas)
{
// Merge multiple introspection results into unified schema
return new GraphQLSchema(schemas);
}
public void InvalidateCache()
{
_cachedSchema = null;
_cacheExpiry = DateTimeOffset.MinValue;
}
}
```
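`FetchSchemaAsync` returns the raw introspection response as a string, and `MergeSchemas` is left as a placeholder. As a first step toward a real merge, the type names can be pulled out of each response. The following is a minimal sketch, assuming the standard `data.__schema.types[].name` response shape; `IntrospectionReader` is a hypothetical helper, not one of the listed deliverables.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: extracts user-defined type names from an introspection response.
public static class IntrospectionReader
{
    public static IReadOnlyList<string> ReadTypeNames(string introspectionJson)
    {
        var names = new List<string>();
        using var document = JsonDocument.Parse(introspectionJson);

        // Walk data -> __schema -> types, tolerating missing or null members.
        if (!TryGetChild(document.RootElement, "data", out var data) ||
            !TryGetChild(data, "__schema", out var schema) ||
            !TryGetChild(schema, "types", out var types) ||
            types.ValueKind != JsonValueKind.Array)
        {
            return names;
        }

        foreach (var type in types.EnumerateArray())
        {
            if (TryGetChild(type, "name", out var name) &&
                name.ValueKind == JsonValueKind.String)
            {
                var value = name.GetString()!;
                // Names starting with "__" are reserved for introspection types.
                if (!value.StartsWith("__", StringComparison.Ordinal))
                {
                    names.Add(value);
                }
            }
        }
        return names;
    }

    private static bool TryGetChild(JsonElement parent, string name, out JsonElement child)
    {
        child = default;
        return parent.ValueKind == JsonValueKind.Object && parent.TryGetProperty(name, out child);
    }
}
```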
---
## YAML Configuration
```yaml
GraphQL:
  Path: "/graphql"
  EnableIntrospection: true
  EnableSubscriptions: true
  MaxQueryDepth: 15
  MaxQueryComplexity: 1000
  ExecutionTimeout: "00:00:30"
  SchemaCacheDuration: "00:05:00"
  EnableBatching: true
  MaxBatchSize: 10
  Services:
    users:
      ServiceName: "user-service"
      RootTypes:
        - Query
        - Mutation
      OwnedFields:
        Query:
          - user
          - users
          - me
        Mutation:
          - createUser
          - updateUser
      IsSchemaProvider: true
    billing:
      ServiceName: "billing-service"
      OwnedFields:
        Query:
          - invoices
          - subscription
        Mutation:
          - createInvoice
      IsSchemaProvider: true
```
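The `OwnedFields` entries assign each root field to exactly one service. A minimal sketch of turning that configuration into a lookup table follows, assuming the entries have been flattened into `(ServiceName, RootType, Field)` tuples; `FieldOwnership` and the flattened input shape are illustrative assumptions, not part of the handler's actual API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps "RootType.field" to the service that owns it.
public static class FieldOwnership
{
    public static Dictionary<string, string> BuildOwnerMap(
        IEnumerable<(string ServiceName, string RootType, string Field)> ownedFields)
    {
        var owners = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (var (serviceName, rootType, field) in ownedFields)
        {
            var key = $"{rootType}.{field}";

            // Two different services claiming the same root field is a configuration error.
            if (owners.TryGetValue(key, out var existing) && existing != serviceName)
            {
                throw new InvalidOperationException(
                    $"{key} is claimed by both '{existing}' and '{serviceName}'.");
            }
            owners[key] = serviceName;
        }
        return owners;
    }
}
```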
---
## Deliverables
1. `StellaOps.Router.Handlers.GraphQL/GraphQLHandler.cs`
2. `StellaOps.Router.Handlers.GraphQL/GraphQLHandlerConfig.cs`
3. `StellaOps.Router.Handlers.GraphQL/IGraphQLParser.cs`
4. `StellaOps.Router.Handlers.GraphQL/IQueryPlanner.cs`
5. `StellaOps.Router.Handlers.GraphQL/QueryPlanner.cs`
6. `StellaOps.Router.Handlers.GraphQL/IQueryExecutor.cs`
7. `StellaOps.Router.Handlers.GraphQL/QueryExecutor.cs`
8. `StellaOps.Router.Handlers.GraphQL/ISchemaRegistry.cs`
9. `StellaOps.Router.Handlers.GraphQL/SchemaRegistry.cs`
10. Unit tests for query planning
11. Integration tests for federated execution
12. Subscription handling tests
---
## Next Step
Proceed to [Step 17: S3/Storage Handler Implementation](17-Step.md) to implement the storage route handler.


@@ -1,903 +0,0 @@
# Step 17: S3/Storage Handler Implementation
**Phase 4: Handler Plugins**

**Estimated Complexity:** Medium

**Dependencies:** Step 10 (Microservice Handler)

---
## Overview
The S3/Storage handler routes file operations to object storage backends (S3, MinIO, Azure Blob, GCS). It handles presigned URL generation, multipart uploads, and streaming downloads, and it integrates with claim-based access control.

---
## Goals
1. Route file operations to appropriate storage backends
2. Generate presigned URLs for direct client uploads/downloads
3. Support multipart uploads for large files
4. Stream files without buffering in the gateway
5. Enforce claim-based access control on storage operations
---
## Core Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                        Storage Handler                         │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  HTTP Request                                                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌─────────────────────┐                  │
│  │ Path Resolver │───►│ Bucket/Key Mapping  │                  │
│  └───────┬───────┘    └─────────────────────┘                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌─────────────────────┐                  │
│  │Access Control │───►│ Claim-Based Policy  │                  │
│  └───────┬───────┘    └─────────────────────┘                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────────────────────────────────┐                 │
│  │              Storage Backend              │                 │
│  │  ┌─────┐  ┌───────┐  ┌──────┐  ┌─────┐    │                 │
│  │  │ S3  │  │ MinIO │  │Azure │  │ GCS │    │                 │
│  │  └─────┘  └───────┘  └──────┘  └─────┘    │                 │
│  └───────────────────────────────────────────┘                 │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
---
## Configuration
```csharp
namespace StellaOps.Router.Handlers.Storage;
public class StorageHandlerConfig
{
/// <summary>Path prefix for storage routes.</summary>
public string PathPrefix { get; set; } = "/files";
/// <summary>Default storage backend.</summary>
public string DefaultBackend { get; set; } = "s3";
/// <summary>Maximum upload size (bytes).</summary>
public long MaxUploadSize { get; set; } = 5L * 1024 * 1024 * 1024; // 5GB
/// <summary>Multipart threshold (bytes).</summary>
public long MultipartThreshold { get; set; } = 100 * 1024 * 1024; // 100MB
/// <summary>Presigned URL expiration.</summary>
public TimeSpan PresignedUrlExpiration { get; set; } = TimeSpan.FromHours(1);
/// <summary>Whether to use presigned URLs for uploads.</summary>
public bool UsePresignedUploads { get; set; } = true;
/// <summary>Whether to use presigned URLs for downloads.</summary>
public bool UsePresignedDownloads { get; set; } = true;
/// <summary>Storage backends configuration.</summary>
public Dictionary<string, StorageBackendConfig> Backends { get; set; } = new();
/// <summary>Bucket mappings (path pattern to bucket).</summary>
public List<BucketMapping> BucketMappings { get; set; } = new();
}
public class StorageBackendConfig
{
public string Type { get; set; } = "S3"; // S3, Azure, GCS
public string Endpoint { get; set; } = "";
public string Region { get; set; } = "us-east-1";
public string AccessKey { get; set; } = "";
public string SecretKey { get; set; } = "";
public bool UsePathStyle { get; set; } = false;
public bool UseSsl { get; set; } = true;
}
public class BucketMapping
{
public string PathPattern { get; set; } = "";
public string Bucket { get; set; } = "";
public string? KeyPrefix { get; set; }
public string Backend { get; set; } = "default";
public StorageAccessPolicy Policy { get; set; } = new();
}
public class StorageAccessPolicy
{
public bool RequireAuthentication { get; set; } = true;
public List<string> AllowedClaims { get; set; } = new();
public string? OwnerClaimPath { get; set; }
public bool EnforceOwnership { get; set; } = false;
}
```
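Note that `BucketMapping.Backend` defaults to `"default"` while `DefaultBackend` defaults to `"s3"`, so a mapping that omits `Backend` refers to a backend that exists only if one is actually named `default`. A startup check along the following lines could catch that. This is a sketch only: `StorageConfigValidator` and its flattened parameters are hypothetical and stand in for the config classes above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical startup check: every referenced backend must be defined under Backends.
public static class StorageConfigValidator
{
    // Returns human-readable problems; an empty list means the references are consistent.
    public static List<string> Validate(
        string defaultBackend,
        IReadOnlyCollection<string> backendNames,
        IEnumerable<(string PathPattern, string Backend)> mappings)
    {
        var known = new HashSet<string>(backendNames, StringComparer.OrdinalIgnoreCase);
        var errors = new List<string>();

        if (!known.Contains(defaultBackend))
        {
            errors.Add($"DefaultBackend '{defaultBackend}' is not defined under Backends.");
        }

        foreach (var (pattern, backend) in mappings)
        {
            if (string.IsNullOrWhiteSpace(pattern))
            {
                errors.Add("A bucket mapping has an empty PathPattern.");
            }
            if (!known.Contains(backend))
            {
                errors.Add($"Mapping '{pattern}' references unknown backend '{backend}'.");
            }
        }
        return errors;
    }
}
```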
---
## Storage Handler Implementation
```csharp
namespace StellaOps.Router.Handlers.Storage;
public sealed class StorageHandler : IRouteHandler
{
public string HandlerType => "Storage";
public int Priority => 90;
private readonly StorageHandlerConfig _config;
private readonly IStorageBackendFactory _backendFactory;
private readonly IAccessControlEvaluator _accessControl;
private readonly ILogger<StorageHandler> _logger;
public StorageHandler(
IOptions<StorageHandlerConfig> config,
IStorageBackendFactory backendFactory,
IAccessControlEvaluator accessControl,
ILogger<StorageHandler> logger)
{
_config = config.Value;
_backendFactory = backendFactory;
_accessControl = accessControl;
_logger = logger;
}
public bool CanHandle(RouteMatchResult match)
{
return match.Handler == "Storage" ||
match.Route.Path.StartsWith(_config.PathPrefix, StringComparison.OrdinalIgnoreCase);
}
public async Task<RouteHandlerResult> HandleAsync(
HttpContext context,
RouteMatchResult match,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
try
{
// Resolve storage location
var location = ResolveLocation(context.Request.Path, context.Request.Query);
// Check access
var accessResult = _accessControl.Evaluate(location, claims, context.Request.Method);
if (!accessResult.Allowed)
{
return new RouteHandlerResult
{
Handled = true,
StatusCode = 403,
Body = Encoding.UTF8.GetBytes(accessResult.Reason ?? "Access denied")
};
}
// Get backend
var backend = _backendFactory.GetBackend(location.Backend);
return context.Request.Method.ToUpperInvariant() switch
{
"GET" => await HandleGetAsync(context, backend, location, cancellationToken),
"HEAD" => await HandleHeadAsync(context, backend, location, cancellationToken),
"PUT" => await HandlePutAsync(context, backend, location, claims, cancellationToken),
"POST" => await HandlePostAsync(context, backend, location, claims, cancellationToken),
"DELETE" => await HandleDeleteAsync(context, backend, location, cancellationToken),
_ => new RouteHandlerResult { Handled = true, StatusCode = 405 }
};
}
catch (StorageNotFoundException)
{
return new RouteHandlerResult { Handled = true, StatusCode = 404 };
}
catch (Exception ex)
{
_logger.LogError(ex, "Storage operation error");
return new RouteHandlerResult
{
Handled = true,
StatusCode = 500,
Body = Encoding.UTF8.GetBytes("Storage operation failed")
};
}
}
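    private StorageLocation ResolveLocation(PathString path, IQueryCollection query)
    {
        // CanHandle also accepts routes whose handler is "Storage", so the path may not
        // start with PathPrefix. Strip the prefix only when it is present; calling
        // Substring unconditionally throws on paths shorter than the prefix.
        var fullPath = path.Value ?? "";
        var relativePath = fullPath.StartsWith(_config.PathPrefix, StringComparison.OrdinalIgnoreCase)
            ? fullPath.Substring(_config.PathPrefix.Length).TrimStart('/')
            : fullPath.TrimStart('/');

        foreach (var mapping in _config.BucketMappings)
        {
            if (IsMatch(relativePath, mapping.PathPattern))
            {
                var key = ExtractKey(relativePath, mapping);
                return new StorageLocation
                {
                    Backend = mapping.Backend,
                    Bucket = mapping.Bucket,
                    Key = key,
                    Policy = mapping.Policy
                };
            }
        }

        // Default: first segment is bucket, rest is key
        var segments = relativePath.Split('/', 2);
        return new StorageLocation
        {
            Backend = _config.DefaultBackend,
            Bucket = segments[0],
            Key = segments.Length > 1 ? segments[1] : ""
        };
    }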
private bool IsMatch(string path, string pattern)
{
var regex = new Regex("^" + Regex.Escape(pattern).Replace("\\*", ".*") + "$");
return regex.IsMatch(path);
}
private string ExtractKey(string path, BucketMapping mapping)
{
var key = path;
if (!string.IsNullOrEmpty(mapping.KeyPrefix))
{
key = mapping.KeyPrefix.TrimEnd('/') + "/" + key;
}
return key;
}
private async Task<RouteHandlerResult> HandleGetAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
// Check for presigned download
if (_config.UsePresignedDownloads && !IsRangeRequest(context.Request))
{
var presignedUrl = await backend.GetPresignedDownloadUrlAsync(
location.Bucket,
location.Key,
_config.PresignedUrlExpiration,
cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 307, // Temporary Redirect
Headers = new Dictionary<string, string>
{
["Location"] = presignedUrl,
["Cache-Control"] = "no-store"
}
};
}
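// Stream directly. Range requests also reach this path, but the Range header is not
// applied here, so the full object is returned with status 200.
var metadata = await backend.GetObjectMetadataAsync(location.Bucket, location.Key, cancellationToken);
// Dispose the backend stream once the copy to the response completes.
await using var stream = await backend.GetObjectStreamAsync(location.Bucket, location.Key, cancellationToken);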
context.Response.StatusCode = 200;
context.Response.ContentType = metadata.ContentType;
context.Response.ContentLength = metadata.ContentLength;
if (!string.IsNullOrEmpty(metadata.ETag))
{
context.Response.Headers["ETag"] = metadata.ETag;
}
await stream.CopyToAsync(context.Response.Body, cancellationToken);
return new RouteHandlerResult { Handled = true, StatusCode = 200 };
}
private bool IsRangeRequest(HttpRequest request)
{
return request.Headers.ContainsKey("Range");
}
private async Task<RouteHandlerResult> HandleHeadAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
var metadata = await backend.GetObjectMetadataAsync(location.Bucket, location.Key, cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 200,
Headers = new Dictionary<string, string>
{
["Content-Type"] = metadata.ContentType,
["Content-Length"] = metadata.ContentLength.ToString(),
["ETag"] = metadata.ETag ?? "",
["Last-Modified"] = metadata.LastModified.ToString("R")
}
};
}
private async Task<RouteHandlerResult> HandlePutAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
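// A missing Content-Length header (for example, chunked transfer encoding) yields 0 here,
// so the size check below only covers requests that declare their length.
var contentLength = context.Request.ContentLength ?? 0;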
// Validate size
if (contentLength > _config.MaxUploadSize)
{
return new RouteHandlerResult
{
Handled = true,
StatusCode = 413,
Body = Encoding.UTF8.GetBytes($"File too large. Max size: {_config.MaxUploadSize}")
};
}
// Use presigned upload for large files
if (_config.UsePresignedUploads && contentLength > _config.MultipartThreshold)
{
var uploadInfo = await backend.InitiateMultipartUploadAsync(
location.Bucket,
location.Key,
context.Request.ContentType ?? "application/octet-stream",
cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 200,
ContentType = "application/json",
Body = JsonSerializer.SerializeToUtf8Bytes(new
{
uploadId = uploadInfo.UploadId,
parts = uploadInfo.PresignedPartUrls
})
};
}
// Direct upload
var contentType = context.Request.ContentType ?? "application/octet-stream";
var metadata = new Dictionary<string, string>();
// Add owner metadata if enforced
if (location.Policy?.EnforceOwnership == true && location.Policy.OwnerClaimPath != null)
{
if (claims.TryGetValue(location.Policy.OwnerClaimPath, out var owner))
{
metadata["x-owner"] = owner;
}
}
await backend.PutObjectAsync(
location.Bucket,
location.Key,
context.Request.Body,
contentLength,
contentType,
metadata,
cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 201,
Headers = new Dictionary<string, string>
{
["Location"] = $"{_config.PathPrefix}/{location.Bucket}/{location.Key}"
}
};
}
private async Task<RouteHandlerResult> HandlePostAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var action = context.Request.Query["action"].ToString();
return action switch
{
"presign" => await HandlePresignRequestAsync(context, backend, location, cancellationToken),
"complete" => await HandleCompleteMultipartAsync(context, backend, location, cancellationToken),
"abort" => await HandleAbortMultipartAsync(context, backend, location, cancellationToken),
_ => await HandlePutAsync(context, backend, location, claims, cancellationToken)
};
}
private async Task<RouteHandlerResult> HandlePresignRequestAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
var method = context.Request.Query["method"].ToString().ToUpperInvariant();
var expiration = _config.PresignedUrlExpiration;
string presignedUrl;
if (method == "PUT")
{
var contentType = context.Request.Query["contentType"].ToString();
presignedUrl = await backend.GetPresignedUploadUrlAsync(
location.Bucket,
location.Key,
contentType,
expiration,
cancellationToken);
}
else
{
presignedUrl = await backend.GetPresignedDownloadUrlAsync(
location.Bucket,
location.Key,
expiration,
cancellationToken);
}
return new RouteHandlerResult
{
Handled = true,
StatusCode = 200,
ContentType = "application/json",
Body = JsonSerializer.SerializeToUtf8Bytes(new
{
url = presignedUrl,
expiresAt = DateTimeOffset.UtcNow.Add(expiration)
})
};
}
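    private async Task<RouteHandlerResult> HandleCompleteMultipartAsync(
        HttpContext context,
        IStorageBackend backend,
        StorageLocation location,
        CancellationToken cancellationToken)
    {
        CompleteMultipartRequest? body;
        try
        {
            // Web defaults match property names case-insensitively, so the camelCase
            // "uploadId" emitted when the upload was initiated binds to UploadId.
            body = await JsonSerializer.DeserializeAsync<CompleteMultipartRequest>(
                context.Request.Body,
                new JsonSerializerOptions(JsonSerializerDefaults.Web),
                cancellationToken);
        }
        catch (JsonException)
        {
            // Malformed JSON is a client error, not a storage failure.
            return new RouteHandlerResult { Handled = true, StatusCode = 400 };
        }

        if (body == null || string.IsNullOrEmpty(body.UploadId) || body.Parts.Count == 0)
        {
            return new RouteHandlerResult { Handled = true, StatusCode = 400 };
        }

        await backend.CompleteMultipartUploadAsync(
            location.Bucket,
            location.Key,
            body.UploadId,
            body.Parts,
            cancellationToken);
        return new RouteHandlerResult { Handled = true, StatusCode = 200 };
    }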
private async Task<RouteHandlerResult> HandleAbortMultipartAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
var uploadId = context.Request.Query["uploadId"].ToString();
await backend.AbortMultipartUploadAsync(
location.Bucket,
location.Key,
uploadId,
cancellationToken);
return new RouteHandlerResult { Handled = true, StatusCode = 204 };
}
private async Task<RouteHandlerResult> HandleDeleteAsync(
HttpContext context,
IStorageBackend backend,
StorageLocation location,
CancellationToken cancellationToken)
{
await backend.DeleteObjectAsync(location.Bucket, location.Key, cancellationToken);
return new RouteHandlerResult { Handled = true, StatusCode = 204 };
}
}
internal class CompleteMultipartRequest
{
public string UploadId { get; set; } = "";
public List<UploadPart> Parts { get; set; } = new();
}
// Public because the public IAccessControlEvaluator.Evaluate signature exposes it.
public class StorageLocation
{
public string Backend { get; set; } = "";
public string Bucket { get; set; } = "";
public string Key { get; set; } = "";
public StorageAccessPolicy? Policy { get; set; }
}
```
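`HandleGetAsync` checks for a `Range` header only to decide whether to redirect to a presigned URL. If the gateway were to serve partial content itself, the header would first need to be parsed. The following is a minimal sketch for the single-range `bytes=` forms; `ByteRange` is a hypothetical helper, multi-range requests are deliberately rejected, and it is not wired into the handler above.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: parses "bytes=start-end", "bytes=start-" and "bytes=-suffix".
public static class ByteRange
{
    public static bool TryParse(string? header, long contentLength, out long start, out long end)
    {
        start = 0;
        end = 0;
        const string prefix = "bytes=";

        if (string.IsNullOrEmpty(header) || contentLength <= 0 ||
            !header.StartsWith(prefix, StringComparison.OrdinalIgnoreCase) ||
            header.Contains(','))
        {
            return false;
        }

        var spec = header.Substring(prefix.Length).Trim();
        var dash = spec.IndexOf('-');
        if (dash < 0) return false;

        var first = spec.Substring(0, dash);
        var last = spec.Substring(dash + 1);

        if (first.Length == 0)
        {
            // Suffix form: the last N bytes of the object.
            if (!TryParseDigits(last, out var suffix) || suffix == 0) return false;
            start = Math.Max(0, contentLength - suffix);
            end = contentLength - 1;
            return true;
        }

        if (!TryParseDigits(first, out start) || start >= contentLength) return false;

        if (last.Length == 0)
        {
            // Open-ended form: from start to the end of the object.
            end = contentLength - 1;
            return true;
        }

        if (!TryParseDigits(last, out end) || end < start) return false;

        // Clamp an end position that runs past the object.
        end = Math.Min(end, contentLength - 1);
        return true;
    }

    private static bool TryParseDigits(string text, out long value) =>
        long.TryParse(text, NumberStyles.None, CultureInfo.InvariantCulture, out value);
}
```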
---
## Storage Backend Interface
```csharp
namespace StellaOps.Router.Handlers.Storage;
public interface IStorageBackend
{
Task<ObjectMetadata> GetObjectMetadataAsync(
string bucket, string key, CancellationToken cancellationToken);
Task<Stream> GetObjectStreamAsync(
string bucket, string key, CancellationToken cancellationToken);
Task PutObjectAsync(
string bucket, string key, Stream content, long contentLength,
string contentType, Dictionary<string, string>? metadata,
CancellationToken cancellationToken);
Task DeleteObjectAsync(
string bucket, string key, CancellationToken cancellationToken);
Task<string> GetPresignedDownloadUrlAsync(
string bucket, string key, TimeSpan expiration,
CancellationToken cancellationToken);
Task<string> GetPresignedUploadUrlAsync(
string bucket, string key, string contentType, TimeSpan expiration,
CancellationToken cancellationToken);
Task<MultipartUploadInfo> InitiateMultipartUploadAsync(
string bucket, string key, string contentType,
CancellationToken cancellationToken);
Task CompleteMultipartUploadAsync(
string bucket, string key, string uploadId, List<UploadPart> parts,
CancellationToken cancellationToken);
Task AbortMultipartUploadAsync(
string bucket, string key, string uploadId,
CancellationToken cancellationToken);
}
public class ObjectMetadata
{
public string ContentType { get; set; } = "application/octet-stream";
public long ContentLength { get; set; }
public string? ETag { get; set; }
public DateTimeOffset LastModified { get; set; }
public Dictionary<string, string> CustomMetadata { get; set; } = new();
}
public class MultipartUploadInfo
{
public string UploadId { get; set; } = "";
public List<PresignedPartUrl> PresignedPartUrls { get; set; } = new();
}
public class PresignedPartUrl
{
public int PartNumber { get; set; }
public string Url { get; set; } = "";
}
public class UploadPart
{
public int PartNumber { get; set; }
public string ETag { get; set; } = "";
}
```
---
## S3 Backend Implementation
```csharp
namespace StellaOps.Router.Handlers.Storage;
public sealed class S3StorageBackend : IStorageBackend
{
private readonly IAmazonS3 _client;
private readonly ILogger<S3StorageBackend> _logger;
public S3StorageBackend(IAmazonS3 client, ILogger<S3StorageBackend> logger)
{
_client = client;
_logger = logger;
}
public async Task<ObjectMetadata> GetObjectMetadataAsync(
string bucket, string key, CancellationToken cancellationToken)
{
var response = await _client.GetObjectMetadataAsync(bucket, key, cancellationToken);
return new ObjectMetadata
{
ContentType = response.Headers.ContentType,
ContentLength = response.ContentLength,
ETag = response.ETag,
LastModified = response.LastModified,
CustomMetadata = response.Metadata.Keys
.ToDictionary(k => k, k => response.Metadata[k])
};
}
public async Task<Stream> GetObjectStreamAsync(
string bucket, string key, CancellationToken cancellationToken)
{
var response = await _client.GetObjectAsync(bucket, key, cancellationToken);
return response.ResponseStream;
}
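    public async Task PutObjectAsync(
        string bucket, string key, Stream content, long contentLength,
        string contentType, Dictionary<string, string>? metadata,
        CancellationToken cancellationToken)
    {
        var request = new PutObjectRequest
        {
            BucketName = bucket,
            Key = key,
            InputStream = content,
            ContentType = contentType,
            // The caller owns the stream (typically the ASP.NET Core request body),
            // so the SDK should not close it.
            AutoCloseStream = false
        };

        // Pass the declared length along; request bodies are usually not seekable,
        // so the SDK may be unable to measure them itself.
        if (contentLength > 0)
        {
            request.Headers.ContentLength = contentLength;
        }

        if (metadata != null)
        {
            foreach (var (k, v) in metadata)
            {
                request.Metadata.Add(k, v);
            }
        }

        await _client.PutObjectAsync(request, cancellationToken);
    }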
public async Task DeleteObjectAsync(
string bucket, string key, CancellationToken cancellationToken)
{
await _client.DeleteObjectAsync(bucket, key, cancellationToken);
}
public Task<string> GetPresignedDownloadUrlAsync(
string bucket, string key, TimeSpan expiration,
CancellationToken cancellationToken)
{
var request = new GetPreSignedUrlRequest
{
BucketName = bucket,
Key = key,
Expires = DateTime.UtcNow.Add(expiration),
Verb = HttpVerb.GET
};
var url = _client.GetPreSignedURL(request);
return Task.FromResult(url);
}
public Task<string> GetPresignedUploadUrlAsync(
string bucket, string key, string contentType, TimeSpan expiration,
CancellationToken cancellationToken)
{
var request = new GetPreSignedUrlRequest
{
BucketName = bucket,
Key = key,
Expires = DateTime.UtcNow.Add(expiration),
Verb = HttpVerb.PUT,
ContentType = contentType
};
var url = _client.GetPreSignedURL(request);
return Task.FromResult(url);
}
    public async Task<MultipartUploadInfo> InitiateMultipartUploadAsync(
        string bucket, string key, string contentType,
        CancellationToken cancellationToken)
    {
        // Use the request object so the content type is stored with the object;
        // the (bucket, key) overload would silently drop it.
        var initResponse = await _client.InitiateMultipartUploadAsync(
            new InitiateMultipartUploadRequest
            {
                BucketName = bucket,
                Key = key,
                ContentType = contentType
            },
            cancellationToken);

        // Generate presigned URLs for parts (assuming 100MB parts, 50 parts max).
        // Note: 50 x 100MB covers 5,000MB, slightly less than the 5GB MaxUploadSize default.
        var partUrls = new List<PresignedPartUrl>();
        for (int i = 1; i <= 50; i++)
        {
            var url = _client.GetPreSignedURL(new GetPreSignedUrlRequest
            {
                BucketName = bucket,
                Key = key,
                Expires = DateTime.UtcNow.AddHours(24),
                Verb = HttpVerb.PUT,
                UploadId = initResponse.UploadId,
                PartNumber = i
            });
            partUrls.Add(new PresignedPartUrl { PartNumber = i, Url = url });
        }

        return new MultipartUploadInfo
        {
            UploadId = initResponse.UploadId,
            PresignedPartUrls = partUrls
        };
    }
public async Task CompleteMultipartUploadAsync(
string bucket, string key, string uploadId, List<UploadPart> parts,
CancellationToken cancellationToken)
{
var request = new CompleteMultipartUploadRequest
{
BucketName = bucket,
Key = key,
UploadId = uploadId,
PartETags = parts.Select(p => new PartETag(p.PartNumber, p.ETag)).ToList()
};
await _client.CompleteMultipartUploadAsync(request, cancellationToken);
}
public async Task AbortMultipartUploadAsync(
string bucket, string key, string uploadId,
CancellationToken cancellationToken)
{
await _client.AbortMultipartUploadAsync(bucket, key, uploadId, cancellationToken);
}
}
```
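`InitiateMultipartUploadAsync` presigns a fixed 50 part URLs on the assumption of 100MB parts. With the configured defaults (`MaxUploadSize` of 5GB, `MultipartThreshold` of 100MB), a 5GB object needs 52 parts of that size, so the part count is better derived from the object size. A minimal sketch follows, assuming the caller passes the declared content length; `MultipartPlanner` is a hypothetical helper, and the constants reflect S3's documented limits of 10,000 parts per upload and a 5 MiB minimum for every part except the last.

```csharp
using System;

// Hypothetical helper: derives part count and part size from the object size.
public static class MultipartPlanner
{
    public const int MaxParts = 10_000;                 // S3 limit on parts per upload
    public const long MinPartSize = 5L * 1024 * 1024;   // S3 minimum, except for the last part

    public static (int PartCount, long PartSize) Plan(long contentLength, long preferredPartSize)
    {
        if (contentLength <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(contentLength));
        }

        // Never go below the S3 minimum part size.
        var partSize = Math.Max(preferredPartSize, MinPartSize);

        // Grow the part size if the preferred size would need more than MaxParts parts.
        var minSizeForLimit = (contentLength + MaxParts - 1) / MaxParts;
        partSize = Math.Max(partSize, minSizeForLimit);

        // Ceiling division: the last part may be smaller than the others.
        var partCount = (int)((contentLength + partSize - 1) / partSize);
        return (partCount, partSize);
    }
}
```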
---
## Access Control Evaluator
```csharp
namespace StellaOps.Router.Handlers.Storage;
public interface IAccessControlEvaluator
{
AccessResult Evaluate(
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
string httpMethod);
}
public class AccessResult
{
public bool Allowed { get; set; }
public string? Reason { get; set; }
}
public sealed class ClaimBasedAccessControlEvaluator : IAccessControlEvaluator
{
public AccessResult Evaluate(
StorageLocation location,
IReadOnlyDictionary<string, string> claims,
string httpMethod)
{
var policy = location.Policy ?? new StorageAccessPolicy();
// Check authentication requirement
if (policy.RequireAuthentication && !claims.Any())
{
return new AccessResult { Allowed = false, Reason = "Authentication required" };
}
// Check allowed claims
if (policy.AllowedClaims.Any())
{
var hasRequiredClaim = policy.AllowedClaims.Any(c =>
{
var parts = c.Split('=', 2);
if (parts.Length == 2)
{
return claims.TryGetValue(parts[0], out var value) && value == parts[1];
}
return claims.ContainsKey(c);
});
if (!hasRequiredClaim)
{
return new AccessResult { Allowed = false, Reason = "Required claim not present" };
}
}
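// Check ownership for write operations. This only verifies that the caller carries
// the owner claim; it does not compare the claim with the object's stored owner.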
if (policy.EnforceOwnership && IsWriteOperation(httpMethod))
{
if (string.IsNullOrEmpty(policy.OwnerClaimPath))
{
return new AccessResult { Allowed = false, Reason = "Owner claim path not configured" };
}
if (!claims.ContainsKey(policy.OwnerClaimPath))
{
return new AccessResult { Allowed = false, Reason = "Owner claim required" };
}
}
return new AccessResult { Allowed = true };
}
private bool IsWriteOperation(string method)
{
return method.ToUpperInvariant() is "PUT" or "POST" or "DELETE" or "PATCH";
}
}
```
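The evaluator confirms that the owner claim exists, while `HandlePutAsync` records the owner under the `x-owner` metadata key. Enforcing ownership on later writes or deletes would also require comparing the two values. A minimal sketch of that comparison follows; `OwnershipCheck` is hypothetical, and it assumes the backend returns custom metadata under the same key it was stored with, which may not hold for every backend (some add their own prefix to metadata keys).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: compares the caller's owner claim with the stored "x-owner" value.
public static class OwnershipCheck
{
    public static bool IsOwner(
        IReadOnlyDictionary<string, string> claims,
        string ownerClaimPath,
        IReadOnlyDictionary<string, string> objectMetadata)
    {
        // Both values must be present and match exactly; a missing value denies access.
        return claims.TryGetValue(ownerClaimPath, out var caller)
            && objectMetadata.TryGetValue("x-owner", out var owner)
            && string.Equals(caller, owner, StringComparison.Ordinal);
    }
}
```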
---
## YAML Configuration
```yaml
Storage:
  PathPrefix: "/files"
  DefaultBackend: "s3"
  MaxUploadSize: 5368709120 # 5GB
  MultipartThreshold: 104857600 # 100MB
  PresignedUrlExpiration: "01:00:00"
  UsePresignedUploads: true
  UsePresignedDownloads: true
  Backends:
    s3:
      Type: "S3"
      Endpoint: "https://s3.amazonaws.com"
      Region: "us-east-1"
      AccessKey: "${AWS_ACCESS_KEY}"
      SecretKey: "${AWS_SECRET_KEY}"
    minio:
      Type: "S3"
      Endpoint: "https://minio.internal:9000"
      Region: "us-east-1"
      AccessKey: "${MINIO_ACCESS_KEY}"
      SecretKey: "${MINIO_SECRET_KEY}"
      UsePathStyle: true
  BucketMappings:
    - PathPattern: "uploads/*"
      Bucket: "user-uploads"
      KeyPrefix: "files/"
      Backend: "s3"
      Policy:
        RequireAuthentication: true
        EnforceOwnership: true
        OwnerClaimPath: "sub"
    - PathPattern: "public/*"
      Bucket: "public-assets"
      Backend: "s3"
      Policy:
        RequireAuthentication: false
```
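To see how a request path resolves against these mappings, the sketch below mirrors the `IsMatch` and `ExtractKey` logic from the handler in a standalone form. `BucketResolver` and its tuple-based parameters are hypothetical. Note that the matched segment stays in the key, so `/files/uploads/report.pdf` lands at `files/uploads/report.pdf` in the `user-uploads` bucket.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical standalone version of the handler's path-to-bucket resolution.
public static class BucketResolver
{
    public static (string Bucket, string Key)? Resolve(
        string requestPath,
        string pathPrefix,
        (string Pattern, string Bucket, string? KeyPrefix)[] mappings)
    {
        if (!requestPath.StartsWith(pathPrefix, StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }

        var relative = requestPath.Substring(pathPrefix.Length).TrimStart('/');

        foreach (var (pattern, bucket, keyPrefix) in mappings)
        {
            // Escape the pattern, then turn each escaped '*' into a wildcard.
            var regex = "^" + Regex.Escape(pattern).Replace("\\*", ".*") + "$";
            if (!Regex.IsMatch(relative, regex))
            {
                continue;
            }

            // The key prefix is prepended to the full relative path.
            var key = string.IsNullOrEmpty(keyPrefix)
                ? relative
                : keyPrefix.TrimEnd('/') + "/" + relative;
            return (bucket, key);
        }
        return null;
    }
}
```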
---
## Deliverables
1. `StellaOps.Router.Handlers.Storage/StorageHandler.cs`
2. `StellaOps.Router.Handlers.Storage/StorageHandlerConfig.cs`
3. `StellaOps.Router.Handlers.Storage/IStorageBackend.cs`
4. `StellaOps.Router.Handlers.Storage/S3StorageBackend.cs`
5. `StellaOps.Router.Handlers.Storage/IAccessControlEvaluator.cs`
6. `StellaOps.Router.Handlers.Storage/ClaimBasedAccessControlEvaluator.cs`
7. `StellaOps.Router.Handlers.Storage/StorageBackendFactory.cs`
8. Presigned URL generation tests
9. Multipart upload tests
10. Access control tests
---
## Next Step
Proceed to [Step 18: Reverse Proxy Handler Implementation](18-Step.md) to implement direct reverse proxy routing.


@@ -1,890 +0,0 @@
# Step 18: Reverse Proxy Handler Implementation
**Phase 4: Handler Plugins**

**Estimated Complexity:** Medium

**Dependencies:** Step 10 (Microservice Handler)

---
## Overview
The Reverse Proxy handler forwards requests to external HTTP services without using the internal transport protocol. It's used for legacy services, third-party APIs, and services that can't be modified to use the Stella transport layer.

---
## Goals
1. Forward HTTP requests to configurable upstream servers
2. Support connection pooling and HTTP/2 multiplexing
3. Handle request/response transformation
4. Support health checks and circuit breaking
5. Maintain correlation IDs for tracing
---
## Core Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                     Reverse Proxy Handler                      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Incoming Request                                              │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌─────────────────────┐                  │
│  │ Path Rewriter │───►│ URL Transformation  │                  │
│  └───────┬───────┘    └─────────────────────┘                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌─────────────────────┐                  │
│  │ Header Filter │───►│ Add/Remove Headers  │                  │
│  └───────┬───────┘    └─────────────────────┘                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────┐    ┌─────────────────────┐                  │
│  │ Load Balancer │───►│ Round Robin/Weighted│                  │
│  └───────┬───────┘    └─────────────────────┘                  │
│          │                                                     │
│          ▼                                                     │
│  ┌───────────────────────────────────────────┐                 │
│  │              HttpClient Pool              │                 │
│  │   (Connection pooling, HTTP/2, retries)   │                 │
│  └───────────────────────────────────────────┘                 │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
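The load balancer stage picks one upstream server per request. One simple way to honor server weights is to expand each server into as many slots as its weight and rotate through the slots. The sketch below takes that approach; `WeightedRoundRobin` is a hypothetical class, it ignores health state and backup servers, and it sends consecutive requests to a heavy server rather than interleaving them smoothly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical selector: a server with weight N occupies N slots in the rotation.
public sealed class WeightedRoundRobin
{
    private readonly string[] _slots;
    private long _counter = -1;

    public WeightedRoundRobin(IEnumerable<(string Address, int Weight)> servers)
    {
        _slots = servers
            .Where(s => s.Weight > 0)
            .SelectMany(s => Enumerable.Repeat(s.Address, s.Weight))
            .ToArray();

        if (_slots.Length == 0)
        {
            throw new ArgumentException(
                "At least one server with a positive weight is required.", nameof(servers));
        }
    }

    public string Next()
    {
        // Interlocked keeps the rotation consistent under concurrent requests.
        var n = Interlocked.Increment(ref _counter);
        return _slots[(int)(n % _slots.Length)];
    }
}
```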
---
## Configuration
```csharp
namespace StellaOps.Router.Handlers.ReverseProxy;
public class ReverseProxyConfig
{
/// <summary>Upstream definitions by name.</summary>
public Dictionary<string, UpstreamConfig> Upstreams { get; set; } = new();
/// <summary>Route-to-upstream mappings.</summary>
public List<ProxyRoute> Routes { get; set; } = new();
/// <summary>Default timeout for upstream requests.</summary>
public TimeSpan DefaultTimeout { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>Whether to forward X-Forwarded-* headers.</summary>
public bool AddForwardedHeaders { get; set; } = true;
/// <summary>Whether to preserve host header.</summary>
public bool PreserveHost { get; set; } = false;
/// <summary>Connection pool settings.</summary>
public ConnectionPoolConfig ConnectionPool { get; set; } = new();
}
public class UpstreamConfig
{
/// <summary>Upstream server addresses.</summary>
public List<UpstreamServer> Servers { get; set; } = new();
/// <summary>Load balancing strategy.</summary>
public LoadBalanceStrategy LoadBalance { get; set; } = LoadBalanceStrategy.RoundRobin;
/// <summary>Health check configuration.</summary>
public HealthCheckConfig? HealthCheck { get; set; }
/// <summary>Circuit breaker configuration.</summary>
public CircuitBreakerConfig? CircuitBreaker { get; set; }
/// <summary>Retry configuration.</summary>
public RetryConfig? Retry { get; set; }
}
public class UpstreamServer
{
public string Address { get; set; } = "";
public int Weight { get; set; } = 1;
public bool Backup { get; set; } = false;
}
public class ProxyRoute
{
/// <summary>Path pattern to match.</summary>
public string PathPattern { get; set; } = "";
/// <summary>Target upstream name.</summary>
public string Upstream { get; set; } = "";
/// <summary>Path rewrite rule.</summary>
public PathRewriteRule? Rewrite { get; set; }
/// <summary>Header transformations.</summary>
public HeaderTransformConfig? Headers { get; set; }
/// <summary>Timeout override.</summary>
public TimeSpan? Timeout { get; set; }
/// <summary>Required claims for access.</summary>
public List<string>? RequiredClaims { get; set; }
}
public class PathRewriteRule
{
public string Pattern { get; set; } = "";
public string Replacement { get; set; } = "";
}
public class HeaderTransformConfig
{
public Dictionary<string, string> Add { get; set; } = new();
public List<string> Remove { get; set; } = new();
public Dictionary<string, string> Set { get; set; } = new();
public bool ForwardClaims { get; set; } = false;
public string ClaimsHeaderPrefix { get; set; } = "X-Claim-";
}
public class HealthCheckConfig
{
public string Path { get; set; } = "/health";
public TimeSpan Interval { get; set; } = TimeSpan.FromSeconds(10);
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(5);
public int UnhealthyThreshold { get; set; } = 3;
public int HealthyThreshold { get; set; } = 2;
}
public class CircuitBreakerConfig
{
public int FailureThreshold { get; set; } = 5;
public TimeSpan SamplingDuration { get; set; } = TimeSpan.FromSeconds(30);
public TimeSpan BreakDuration { get; set; } = TimeSpan.FromSeconds(30);
public double FailureRatioThreshold { get; set; } = 0.5;
}
public class RetryConfig
{
public int MaxRetries { get; set; } = 3;
public TimeSpan InitialDelay { get; set; } = TimeSpan.FromMilliseconds(100);
public double BackoffMultiplier { get; set; } = 2.0;
public List<int> RetryableStatusCodes { get; set; } = new() { 502, 503, 504 };
}
public class ConnectionPoolConfig
{
public int MaxConnectionsPerServer { get; set; } = 100;
public TimeSpan ConnectionIdleTimeout { get; set; } = TimeSpan.FromMinutes(2);
public bool EnableHttp2 { get; set; } = true;
}
public enum LoadBalanceStrategy
{
RoundRobin,
Random,
LeastConnections,
WeightedRoundRobin,
IPHash
}
```
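`RetryConfig` describes exponential backoff through `InitialDelay` and `BackoffMultiplier`. With the defaults, the waits before retries 1, 2, and 3 would be 100 ms, 200 ms, and 400 ms. The following is a minimal sketch of that calculation; `RetryBackoff` is hypothetical, and the `maxDelay` cap is an added assumption, since `RetryConfig` defines no upper bound.

```csharp
using System;

// Hypothetical helper: delay before the Nth retry (1-based), capped at maxDelay.
public static class RetryBackoff
{
    public static TimeSpan DelayFor(
        int attempt, TimeSpan initialDelay, double multiplier, TimeSpan maxDelay)
    {
        if (attempt < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(attempt));
        }

        // initialDelay * multiplier^(attempt - 1)
        var milliseconds = initialDelay.TotalMilliseconds * Math.Pow(multiplier, attempt - 1);
        return TimeSpan.FromMilliseconds(Math.Min(milliseconds, maxDelay.TotalMilliseconds));
    }
}
```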
---
## Reverse Proxy Handler Implementation
```csharp
namespace StellaOps.Router.Handlers.ReverseProxy;
public sealed class ReverseProxyHandler : IRouteHandler
{
public string HandlerType => "ReverseProxy";
public int Priority => 50;
private readonly ReverseProxyConfig _config;
private readonly IUpstreamManager _upstreamManager;
private readonly IHttpClientFactory _httpClientFactory;
private readonly ILogger<ReverseProxyHandler> _logger;
public ReverseProxyHandler(
IOptions<ReverseProxyConfig> config,
IUpstreamManager upstreamManager,
IHttpClientFactory httpClientFactory,
ILogger<ReverseProxyHandler> logger)
{
_config = config.Value;
_upstreamManager = upstreamManager;
_httpClientFactory = httpClientFactory;
_logger = logger;
}
public bool CanHandle(RouteMatchResult match)
{
if (match.Handler == "ReverseProxy")
return true;
return _config.Routes.Any(r => IsRouteMatch(match.Route.Path, r.PathPattern));
}
public async Task<RouteHandlerResult> HandleAsync(
HttpContext context,
RouteMatchResult match,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
// Find matching route
var route = _config.Routes.FirstOrDefault(r =>
IsRouteMatch(context.Request.Path, r.PathPattern));
if (route == null)
{
return new RouteHandlerResult { Handled = false };
}
// Check required claims
if (route.RequiredClaims?.Any() == true)
{
if (!route.RequiredClaims.All(c => claims.ContainsKey(c)))
{
return new RouteHandlerResult
{
Handled = true,
StatusCode = 403,
Body = Encoding.UTF8.GetBytes("Forbidden")
};
}
}
// Get upstream server
var server = await _upstreamManager.GetServerAsync(route.Upstream, context, cancellationToken);
if (server == null)
{
_logger.LogWarning("No healthy upstream for {Upstream}", route.Upstream);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 503,
Body = Encoding.UTF8.GetBytes("Service unavailable")
};
}
try
{
return await ForwardRequestAsync(context, route, server, claims, cancellationToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Proxy error for {Upstream}", route.Upstream);
_upstreamManager.ReportFailure(route.Upstream, server.Address);
return new RouteHandlerResult
{
Handled = true,
StatusCode = 502,
Body = Encoding.UTF8.GetBytes("Bad gateway")
};
}
}
private bool IsRouteMatch(string path, string pattern)
{
if (pattern.EndsWith("*"))
{
return path.StartsWith(pattern.TrimEnd('*'), StringComparison.OrdinalIgnoreCase);
}
return string.Equals(path, pattern, StringComparison.OrdinalIgnoreCase);
}
private async Task<RouteHandlerResult> ForwardRequestAsync(
HttpContext context,
ProxyRoute route,
UpstreamServer server,
IReadOnlyDictionary<string, string> claims,
CancellationToken cancellationToken)
{
var request = context.Request;
// Build upstream URL
var targetUri = BuildTargetUri(server.Address, request, route.Rewrite);
// Create HTTP request
var httpRequest = new HttpRequestMessage
{
Method = new HttpMethod(request.Method),
RequestUri = targetUri
};
// Copy headers
CopyRequestHeaders(request, httpRequest, route.Headers, claims);
// Add forwarded headers
if (_config.AddForwardedHeaders)
{
AddForwardedHeaders(context, httpRequest);
}
// Copy body for non-GET/HEAD requests
if (!HttpMethods.IsGet(request.Method) && !HttpMethods.IsHead(request.Method))
{
httpRequest.Content = new StreamContent(request.Body);
if (request.ContentType != null)
{
httpRequest.Content.Headers.ContentType = MediaTypeHeaderValue.Parse(request.ContentType);
}
}
// Send request
var client = _httpClientFactory.CreateClient("proxy");
var timeout = route.Timeout ?? _config.DefaultTimeout;
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(timeout);
// Dispose the upstream response once its body has been copied to the client
using var response = await client.SendAsync(httpRequest, HttpCompletionOption.ResponseHeadersRead, cts.Token);
// Copy response
return await BuildResponseAsync(context, response, route.Headers, cancellationToken);
}
private Uri BuildTargetUri(string serverAddress, HttpRequest request, PathRewriteRule? rewrite)
{
var path = request.Path.Value ?? "/";
if (rewrite != null)
{
path = Regex.Replace(path, rewrite.Pattern, rewrite.Replacement);
}
var query = request.QueryString.Value ?? "";
var baseUri = new Uri(serverAddress.TrimEnd('/'));
return new Uri(baseUri, path + query);
}
private void CopyRequestHeaders(
HttpRequest source,
HttpRequestMessage target,
HeaderTransformConfig? transform,
IReadOnlyDictionary<string, string> claims)
{
// Skip hop-by-hop headers
var skipHeaders = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"Connection", "Keep-Alive", "Proxy-Authenticate", "Proxy-Authorization",
"TE", "Trailer", "Transfer-Encoding", "Upgrade", "Host"
};
// Headers to remove
if (transform?.Remove != null)
{
foreach (var header in transform.Remove)
{
skipHeaders.Add(header);
}
}
foreach (var header in source.Headers)
{
if (skipHeaders.Contains(header.Key))
continue;
target.Headers.TryAddWithoutValidation(header.Key, header.Value.ToArray());
}
// Add configured headers
if (transform?.Add != null)
{
foreach (var (key, value) in transform.Add)
{
target.Headers.TryAddWithoutValidation(key, value);
}
}
// Set configured headers (overwrite)
if (transform?.Set != null)
{
foreach (var (key, value) in transform.Set)
{
target.Headers.Remove(key);
target.Headers.TryAddWithoutValidation(key, value);
}
}
// Forward claims as headers
if (transform?.ForwardClaims == true)
{
var prefix = transform.ClaimsHeaderPrefix ?? "X-Claim-";
foreach (var (key, value) in claims)
{
var headerName = prefix + key.Replace('/', '-').Replace(':', '-');
target.Headers.TryAddWithoutValidation(headerName, value);
}
}
// Preserve or set Host
if (_config.PreserveHost)
{
target.Headers.Host = source.Host.Value;
}
}
private void AddForwardedHeaders(HttpContext context, HttpRequestMessage request)
{
var connection = context.Connection;
var httpRequest = context.Request;
// X-Forwarded-For
var forwardedFor = httpRequest.Headers["X-Forwarded-For"].FirstOrDefault();
var clientIp = connection.RemoteIpAddress?.ToString();
if (!string.IsNullOrEmpty(clientIp))
{
forwardedFor = string.IsNullOrEmpty(forwardedFor)
? clientIp
: $"{forwardedFor}, {clientIp}";
}
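// CopyRequestHeaders has already forwarded any client-supplied X-Forwarded-For,
// so replace that copy with the extended chain rather than appending a duplicate
request.Headers.Remove("X-Forwarded-For");
if (!string.IsNullOrEmpty(forwardedFor))
{
request.Headers.TryAddWithoutValidation("X-Forwarded-For", forwardedFor);
}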
// X-Forwarded-Proto
request.Headers.TryAddWithoutValidation("X-Forwarded-Proto", httpRequest.Scheme);
// X-Forwarded-Host
request.Headers.TryAddWithoutValidation("X-Forwarded-Host", httpRequest.Host.Value);
// X-Real-IP
if (connection.RemoteIpAddress != null)
{
request.Headers.TryAddWithoutValidation("X-Real-IP", connection.RemoteIpAddress.ToString());
}
// X-Request-ID (correlation)
request.Headers.TryAddWithoutValidation("X-Request-ID", context.TraceIdentifier);
}
private async Task<RouteHandlerResult> BuildResponseAsync(
HttpContext context,
HttpResponseMessage response,
HeaderTransformConfig? transform,
CancellationToken cancellationToken)
{
var httpResponse = context.Response;
httpResponse.StatusCode = (int)response.StatusCode;
// Copy response headers
var skipHeaders = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"Transfer-Encoding", "Connection"
};
foreach (var header in response.Headers)
{
if (skipHeaders.Contains(header.Key))
continue;
httpResponse.Headers[header.Key] = header.Value.ToArray();
}
foreach (var header in response.Content.Headers)
{
if (skipHeaders.Contains(header.Key))
continue;
httpResponse.Headers[header.Key] = header.Value.ToArray();
}
// Stream response body
await response.Content.CopyToAsync(httpResponse.Body, cancellationToken);
return new RouteHandlerResult
{
Handled = true,
StatusCode = (int)response.StatusCode
};
}
}
```
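The path handling in the handler above can be exercised in isolation. The following is a minimal sketch, not part of the deliverables: `ProxyPathSketch` and its method signatures are hypothetical names, and the code only mirrors the prefix-match and regex-rewrite logic of `IsRouteMatch` and `BuildTargetUri` so the behaviour can be checked without an HTTP pipeline.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; it is not one of the handler's types.
public static class ProxyPathSketch
{
    // Mirrors IsRouteMatch: a trailing '*' means prefix match, otherwise exact match.
    public static bool IsRouteMatch(string path, string pattern)
    {
        if (pattern.EndsWith('*'))
        {
            return path.StartsWith(pattern.TrimEnd('*'), StringComparison.OrdinalIgnoreCase);
        }
        return string.Equals(path, pattern, StringComparison.OrdinalIgnoreCase);
    }

    // Mirrors BuildTargetUri: optional regex rewrite, then resolution against the upstream base.
    public static Uri BuildTargetUri(
        string serverAddress,
        string path,
        string query,
        string? rewritePattern,
        string? rewriteReplacement)
    {
        if (rewritePattern != null && rewriteReplacement != null)
        {
            path = Regex.Replace(path, rewritePattern, rewriteReplacement);
        }
        return new Uri(new Uri(serverAddress.TrimEnd('/')), path + query);
    }
}
```

With the sample `/legacy/*` route and its `^/legacy` to `/api/v1` rewrite, a request for `/legacy/users/42?page=2` sent to `http://legacy-api-1:8080` should resolve to `http://legacy-api-1:8080/api/v1/users/42?page=2`. Two consequences of this logic are worth keeping in mind: a bare `/legacy` does not match `/legacy/*`, and because the rewritten path starts with `/`, any path segment in the upstream address is dropped during URI resolution.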
---
## Upstream Manager
```csharp
namespace StellaOps.Router.Handlers.ReverseProxy;
public interface IUpstreamManager
{
Task<UpstreamServer?> GetServerAsync(
string upstreamName,
HttpContext context,
CancellationToken cancellationToken);
void ReportSuccess(string upstreamName, string serverAddress);
void ReportFailure(string upstreamName, string serverAddress);
}
public sealed class UpstreamManager : IUpstreamManager, IHostedService
{
private readonly ReverseProxyConfig _config;
private readonly ILogger<UpstreamManager> _logger;
private readonly ConcurrentDictionary<string, ServerState> _serverStates = new();
private readonly ConcurrentDictionary<string, int> _roundRobinCounters = new();
private Timer? _healthCheckTimer;
public UpstreamManager(
IOptions<ReverseProxyConfig> config,
ILogger<UpstreamManager> logger)
{
_config = config.Value;
_logger = logger;
InitializeServerStates();
}
private void InitializeServerStates()
{
foreach (var (name, upstream) in _config.Upstreams)
{
foreach (var server in upstream.Servers)
{
var key = $"{name}:{server.Address}";
_serverStates[key] = new ServerState
{
Address = server.Address,
Weight = server.Weight,
IsHealthy = true,
IsBackup = server.Backup
};
}
}
}
public Task<UpstreamServer?> GetServerAsync(
string upstreamName,
HttpContext context,
CancellationToken cancellationToken)
{
if (!_config.Upstreams.TryGetValue(upstreamName, out var upstream))
{
return Task.FromResult<UpstreamServer?>(null);
}
var healthyServers = upstream.Servers
.Where(s => IsServerHealthy(upstreamName, s.Address) && !s.Backup)
.ToList();
// Fall back to backup servers if no primary available
if (healthyServers.Count == 0)
{
healthyServers = upstream.Servers
.Where(s => IsServerHealthy(upstreamName, s.Address) && s.Backup)
.ToList();
}
if (healthyServers.Count == 0)
{
return Task.FromResult<UpstreamServer?>(null);
}
var server = upstream.LoadBalance switch
{
LoadBalanceStrategy.RoundRobin => SelectRoundRobin(upstreamName, healthyServers),
LoadBalanceStrategy.Random => SelectRandom(healthyServers),
LoadBalanceStrategy.WeightedRoundRobin => SelectWeightedRoundRobin(upstreamName, healthyServers),
LoadBalanceStrategy.LeastConnections => SelectLeastConnections(upstreamName, healthyServers),
LoadBalanceStrategy.IPHash => SelectIPHash(context, healthyServers),
_ => healthyServers[0]
};
return Task.FromResult<UpstreamServer?>(server);
}
private bool IsServerHealthy(string upstreamName, string address)
{
var key = $"{upstreamName}:{address}";
return _serverStates.TryGetValue(key, out var state) && state.IsHealthy;
}
private UpstreamServer SelectRoundRobin(string upstreamName, List<UpstreamServer> servers)
{
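// Wrap explicitly at int.MaxValue so the counter never overflows into a negative index
var counter = _roundRobinCounters.AddOrUpdate(upstreamName, 0, (_, c) => c == int.MaxValue ? 0 : c + 1);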
return servers[counter % servers.Count];
}
private UpstreamServer SelectRandom(List<UpstreamServer> servers)
{
return servers[Random.Shared.Next(servers.Count)];
}
private UpstreamServer SelectWeightedRoundRobin(string upstreamName, List<UpstreamServer> servers)
{
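var totalWeight = servers.Sum(s => s.Weight);
// A zero total weight would divide by zero below, so fall back to the first server
if (totalWeight <= 0)
return servers[0];
// Wrap explicitly at int.MaxValue so the counter never overflows into a negative position
var counter = _roundRobinCounters.AddOrUpdate(upstreamName, 0, (_, c) => c == int.MaxValue ? 0 : c + 1);
var position = counter % totalWeight;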
var cumulative = 0;
foreach (var server in servers)
{
cumulative += server.Weight;
if (position < cumulative)
return server;
}
return servers[^1];
}
private UpstreamServer SelectLeastConnections(string upstreamName, List<UpstreamServer> servers)
{
return servers
.OrderBy(s =>
{
var key = $"{upstreamName}:{s.Address}";
return _serverStates.TryGetValue(key, out var state) ? state.ActiveConnections : 0;
})
.First();
}
private UpstreamServer SelectIPHash(HttpContext context, List<UpstreamServer> servers)
{
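var ip = context.Connection.RemoteIpAddress?.ToString() ?? "127.0.0.1";
// Use a stable FNV-1a hash: string.GetHashCode() varies between processes,
// and Math.Abs(int.MinValue) throws an OverflowException
var hash = 2166136261u;
foreach (var ch in ip)
{
hash = unchecked((hash ^ ch) * 16777619u);
}
return servers[(int)(hash % (uint)servers.Count)];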
}
public void ReportSuccess(string upstreamName, string serverAddress)
{
var key = $"{upstreamName}:{serverAddress}";
if (_serverStates.TryGetValue(key, out var state))
{
state.ConsecutiveFailures = 0;
state.ConsecutiveSuccesses++;
// Check circuit breaker reset
if (!state.IsHealthy && state.ConsecutiveSuccesses >= GetHealthyThreshold(upstreamName))
{
state.IsHealthy = true;
_logger.LogInformation("Server {Server} marked healthy", serverAddress);
}
}
}
public void ReportFailure(string upstreamName, string serverAddress)
{
var key = $"{upstreamName}:{serverAddress}";
if (_serverStates.TryGetValue(key, out var state))
{
state.ConsecutiveSuccesses = 0;
state.ConsecutiveFailures++;
// Check circuit breaker trip
if (state.IsHealthy && state.ConsecutiveFailures >= GetUnhealthyThreshold(upstreamName))
{
state.IsHealthy = false;
_logger.LogWarning("Server {Server} marked unhealthy after {Failures} failures",
serverAddress, state.ConsecutiveFailures);
}
}
}
private int GetUnhealthyThreshold(string upstreamName)
{
return _config.Upstreams.TryGetValue(upstreamName, out var upstream)
? upstream.HealthCheck?.UnhealthyThreshold ?? 3
: 3;
}
private int GetHealthyThreshold(string upstreamName)
{
return _config.Upstreams.TryGetValue(upstreamName, out var upstream)
? upstream.HealthCheck?.HealthyThreshold ?? 2
: 2;
}
public Task StartAsync(CancellationToken cancellationToken)
{
_healthCheckTimer = new Timer(PerformHealthChecks, null, TimeSpan.Zero, TimeSpan.FromSeconds(10));
return Task.CompletedTask;
}
private async void PerformHealthChecks(object? state)
{
foreach (var (name, upstream) in _config.Upstreams)
{
if (upstream.HealthCheck == null)
continue;
foreach (var server in upstream.Servers)
{
await CheckServerHealthAsync(name, server, upstream.HealthCheck);
}
}
}
private async Task CheckServerHealthAsync(
string upstreamName,
UpstreamServer server,
HealthCheckConfig config)
{
try
{
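// NOTE: a new HttpClient per probe can exhaust sockets under load; prefer a client from IHttpClientFactory
using var client = new HttpClient { Timeout = config.Timeout };
var uri = new Uri(new Uri(server.Address), config.Path);
using var response = await client.GetAsync(uri);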
if (response.IsSuccessStatusCode)
{
ReportSuccess(upstreamName, server.Address);
}
else
{
ReportFailure(upstreamName, server.Address);
}
}
catch
{
ReportFailure(upstreamName, server.Address);
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_healthCheckTimer?.Dispose();
return Task.CompletedTask;
}
}
internal class ServerState
{
public string Address { get; set; } = "";
public int Weight { get; set; } = 1;
public bool IsHealthy { get; set; } = true;
public bool IsBackup { get; set; }
public int ConsecutiveFailures { get; set; }
public int ConsecutiveSuccesses { get; set; }
public int ActiveConnections { get; set; }
}
```
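The selection strategies are easier to reason about as pure functions. The sketch below is an illustration under stated assumptions, not the `UpstreamManager` implementation: `SelectionSketch`, `PickWeighted`, `StableHash` and `PickIndexByIp` are hypothetical names. It shows the cumulative-weight walk used for weighted round-robin and a process-independent FNV-1a hash for IP affinity (computed over UTF-16 code units, which for ASCII IP strings matches the byte-wise algorithm).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class SelectionSketch
{
    // Weighted round-robin: map the counter onto [0, totalWeight) and walk cumulative weights.
    public static string PickWeighted(IReadOnlyList<(string Address, int Weight)> servers, int counter)
    {
        var totalWeight = servers.Sum(s => s.Weight);
        if (totalWeight <= 0)
        {
            return servers[0].Address; // degenerate configuration: no usable weights
        }

        var position = (counter & int.MaxValue) % totalWeight; // mask keeps the value non-negative
        var cumulative = 0;
        foreach (var (address, weight) in servers)
        {
            cumulative += weight;
            if (position < cumulative)
            {
                return address;
            }
        }
        return servers[servers.Count - 1].Address;
    }

    // 32-bit FNV-1a; unlike string.GetHashCode(), the result is the same in every process.
    public static uint StableHash(string value)
    {
        unchecked
        {
            var hash = 2166136261u;
            foreach (var ch in value)
            {
                hash = (hash ^ ch) * 16777619u;
            }
            return hash;
        }
    }

    // Unsigned modulo avoids the Math.Abs(int.MinValue) overflow case entirely.
    public static int PickIndexByIp(string ip, int serverCount) =>
        (int)(StableHash(ip) % (uint)serverCount);
}
```

For two servers weighted 2 and 1, counters 0 through 5 select the first server four times and the second twice, in the repeating order A, A, B. This simple walk sends consecutive requests to the same heavy server; a smoother interleaving would need a different algorithm, which is outside the scope of this sketch.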
---
## Service Registration
```csharp
namespace StellaOps.Router.Handlers.ReverseProxy;
public static class ReverseProxyExtensions
{
public static IServiceCollection AddReverseProxyHandler(
this IServiceCollection services,
IConfiguration configuration)
{
services.Configure<ReverseProxyConfig>(
configuration.GetSection("ReverseProxy"));
services.AddSingleton<IUpstreamManager, UpstreamManager>();
services.AddHostedService(sp => (UpstreamManager)sp.GetRequiredService<IUpstreamManager>());
services.AddHttpClient("proxy", client =>
{
client.DefaultRequestVersion = HttpVersion.Version20;
client.DefaultVersionPolicy = HttpVersionPolicy.RequestVersionOrLower;
})
.ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
{
PooledConnectionLifetime = TimeSpan.FromMinutes(5),
MaxConnectionsPerServer = 100,
EnableMultipleHttp2Connections = true
});
services.AddSingleton<IRouteHandler, ReverseProxyHandler>();
return services;
}
}
```
---
## YAML Configuration
```yaml
ReverseProxy:
DefaultTimeout: "00:00:30"
AddForwardedHeaders: true
PreserveHost: false
ConnectionPool:
MaxConnectionsPerServer: 100
ConnectionIdleTimeout: "00:02:00"
EnableHttp2: true
Upstreams:
legacy-api:
LoadBalance: RoundRobin
Servers:
- Address: "http://legacy-api-1:8080"
Weight: 2
- Address: "http://legacy-api-2:8080"
Weight: 1
- Address: "http://legacy-api-backup:8080"
Backup: true
HealthCheck:
Path: "/health"
Interval: "00:00:10"
Timeout: "00:00:05"
UnhealthyThreshold: 3
HealthyThreshold: 2
CircuitBreaker:
FailureThreshold: 5
SamplingDuration: "00:00:30"
BreakDuration: "00:00:30"
Retry:
MaxRetries: 3
InitialDelay: "00:00:00.100"
BackoffMultiplier: 2.0
RetryableStatusCodes: [502, 503, 504]
external-service:
LoadBalance: LeastConnections
Servers:
- Address: "https://api.external-service.com"
Routes:
- PathPattern: "/legacy/*"
Upstream: "legacy-api"
Rewrite:
Pattern: "^/legacy"
Replacement: "/api/v1"
Headers:
Add:
X-Proxy-Source: "stella-router"
Remove:
- "X-Internal-Token"
ForwardClaims: true
ClaimsHeaderPrefix: "X-User-"
RequiredClaims:
- "sub"
- PathPattern: "/external/*"
Upstream: "external-service"
Timeout: "00:01:00"
Headers:
Set:
Authorization: "Bearer ${EXTERNAL_API_KEY}"
```
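The `${EXTERNAL_API_KEY}` value in the sample above reads like an environment placeholder, but the configuration does not say which component expands it, and the built-in .NET configuration providers treat such a value as a literal string. One possible approach, sketched here with hypothetical names (`PlaceholderSketch`, `Expand`, `ExpandFromEnvironment`), is to expand placeholders after binding and before the header values are used.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; the document does not define how placeholders are expanded.
public static class PlaceholderSketch
{
    // Matches ${NAME} where NAME looks like an environment variable identifier.
    private static readonly Regex Placeholder =
        new(@"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", RegexOptions.Compiled);

    // Replaces each ${NAME} with lookup(NAME); unresolved placeholders are left untouched.
    public static string Expand(string value, Func<string, string?> lookup)
    {
        return Placeholder.Replace(value, m => lookup(m.Groups[1].Value) ?? m.Value);
    }

    // Convenience overload that resolves names from the process environment.
    public static string ExpandFromEnvironment(string value) =>
        Expand(value, Environment.GetEnvironmentVariable);
}
```

Leaving unresolved placeholders in place is a deliberate choice in this sketch: a missing variable then shows up as an obviously wrong header value rather than as a silently empty credential. Failing fast at startup would be an equally reasonable policy.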
---
## Deliverables
1. `StellaOps.Router.Handlers.ReverseProxy/ReverseProxyHandler.cs`
2. `StellaOps.Router.Handlers.ReverseProxy/ReverseProxyConfig.cs`
3. `StellaOps.Router.Handlers.ReverseProxy/IUpstreamManager.cs`
4. `StellaOps.Router.Handlers.ReverseProxy/UpstreamManager.cs`
5. `StellaOps.Router.Handlers.ReverseProxy/ReverseProxyExtensions.cs`
6. Load balancing strategy tests
7. Health check tests
8. Circuit breaker tests
9. Header transformation tests
---
## Next Step
Proceed to [Step 19: Microservice Host Builder](19-Step.md) to implement the fluent host builder for microservices.

@@ -1,714 +0,0 @@
# Step 19: Microservice Host Builder
**Phase 5: Microservice SDK**
**Estimated Complexity:** High
**Dependencies:** Step 14 (TCP Transport), Step 15 (TLS Transport)
---
## Overview
The Microservice Host Builder provides a fluent API for building microservices that connect to the Stella Router. It handles transport configuration, endpoint registration, graceful shutdown, and integration with ASP.NET Core's hosting infrastructure.
---
## Goals
1. Provide a fluent builder API for microservice configuration
2. Support both standalone and ASP.NET Core integrated hosting
3. Handle transport lifecycle (connect, reconnect, disconnect)
4. Support multiple transport configurations
5. Enable dual-exposure mode (gateway + direct HTTP)
---
## Core Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                   Microservice Host Builder                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                  StellaMicroserviceHost                  │  │
│  │  ┌───────────────┐ ┌─────────────────┐ ┌──────────────┐  │  │
│  │  │Transport Layer│ │Endpoint Registry│ │   Request    │  │  │
│  │  │ (TCP/TLS/etc) │ │ (Discovery/Reg) │ │  Dispatcher  │  │  │
│  │  └───────────────┘ └─────────────────┘ └──────────────┘  │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │               Optional: ASP.NET Core Host                │  │
│  │    (Kestrel for direct HTTP access + default claims)     │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
---
## Configuration
```csharp
namespace StellaOps.Microservice;
public class StellaMicroserviceOptions
{
/// <summary>Service name for registration.</summary>
public required string ServiceName { get; set; }
/// <summary>Unique instance identifier (auto-generated if not set).</summary>
public string InstanceId { get; set; } = Guid.NewGuid().ToString("N")[..8];
/// <summary>Service version for routing.</summary>
public string Version { get; set; } = "1.0.0";
/// <summary>Region for routing affinity.</summary>
public string? Region { get; set; }
/// <summary>Tags for routing metadata.</summary>
public Dictionary<string, string> Tags { get; set; } = new();
/// <summary>Router connection pool.</summary>
public List<RouterConnectionConfig> Routers { get; set; } = new();
/// <summary>Transport configuration.</summary>
public TransportConfig Transport { get; set; } = new();
/// <summary>Endpoint discovery configuration.</summary>
public EndpointDiscoveryConfig Discovery { get; set; } = new();
/// <summary>Heartbeat configuration.</summary>
public HeartbeatConfig Heartbeat { get; set; } = new();
/// <summary>Dual exposure mode configuration.</summary>
public DualExposureConfig? DualExposure { get; set; }
/// <summary>Graceful shutdown timeout.</summary>
public TimeSpan ShutdownTimeout { get; set; } = TimeSpan.FromSeconds(30);
}
public class RouterConnectionConfig
{
public string Host { get; set; } = "localhost";
public int Port { get; set; } = 9500;
public string Transport { get; set; } = "TCP"; // TCP, TLS, InMemory
public int Priority { get; set; } = 1;
public bool Enabled { get; set; } = true;
}
public class TransportConfig
{
public string Default { get; set; } = "TCP";
public TcpClientConfig? Tcp { get; set; }
public TlsClientConfig? Tls { get; set; }
public int MaxReconnectAttempts { get; set; } = -1; // -1 = unlimited
public TimeSpan ReconnectDelay { get; set; } = TimeSpan.FromSeconds(5);
}
public class EndpointDiscoveryConfig
{
/// <summary>Assemblies to scan for endpoints.</summary>
public List<string> ScanAssemblies { get; set; } = new();
/// <summary>Path to YAML overrides file.</summary>
public string? ConfigFilePath { get; set; }
/// <summary>Base path prefix for all endpoints.</summary>
public string? BasePath { get; set; }
/// <summary>Whether to auto-discover endpoints via reflection.</summary>
public bool AutoDiscover { get; set; } = true;
}
public class HeartbeatConfig
{
public TimeSpan Interval { get; set; } = TimeSpan.FromSeconds(10);
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(5);
public int MissedHeartbeatsThreshold { get; set; } = 3;
}
public class DualExposureConfig
{
/// <summary>Enable direct HTTP access.</summary>
public bool Enabled { get; set; } = false;
/// <summary>HTTP port for direct access.</summary>
public int HttpPort { get; set; } = 8080;
/// <summary>Default claims for direct access (no JWT).</summary>
public Dictionary<string, string> DefaultClaims { get; set; } = new();
/// <summary>Whether to require JWT for direct access.</summary>
public bool RequireAuthentication { get; set; } = false;
}
```
---
## Host Builder Implementation
```csharp
namespace StellaOps.Microservice;
public interface IStellaMicroserviceBuilder
{
IStellaMicroserviceBuilder ConfigureServices(Action<IServiceCollection> configure);
IStellaMicroserviceBuilder ConfigureTransport(Action<TransportConfig> configure);
IStellaMicroserviceBuilder ConfigureEndpoints(Action<EndpointDiscoveryConfig> configure);
IStellaMicroserviceBuilder AddRouter(string host, int port, string transport = "TCP");
IStellaMicroserviceBuilder EnableDualExposure(Action<DualExposureConfig>? configure = null);
IStellaMicroserviceBuilder UseYamlConfig(string path);
IStellaMicroserviceHost Build();
}
public sealed class StellaMicroserviceBuilder : IStellaMicroserviceBuilder
{
private readonly StellaMicroserviceOptions _options;
private readonly IServiceCollection _services;
private readonly List<Action<IServiceCollection>> _configureActions = new();
public StellaMicroserviceBuilder(string serviceName)
{
_options = new StellaMicroserviceOptions { ServiceName = serviceName };
_services = new ServiceCollection();
// Add default services
_services.AddLogging(b => b.AddConsole());
_services.AddSingleton(_options);
}
public static IStellaMicroserviceBuilder Create(string serviceName)
{
return new StellaMicroserviceBuilder(serviceName);
}
public IStellaMicroserviceBuilder ConfigureServices(Action<IServiceCollection> configure)
{
_configureActions.Add(configure);
return this;
}
public IStellaMicroserviceBuilder ConfigureTransport(Action<TransportConfig> configure)
{
configure(_options.Transport);
return this;
}
public IStellaMicroserviceBuilder ConfigureEndpoints(Action<EndpointDiscoveryConfig> configure)
{
configure(_options.Discovery);
return this;
}
public IStellaMicroserviceBuilder AddRouter(string host, int port, string transport = "TCP")
{
_options.Routers.Add(new RouterConnectionConfig
{
Host = host,
Port = port,
Transport = transport,
Priority = _options.Routers.Count + 1
});
return this;
}
public IStellaMicroserviceBuilder EnableDualExposure(Action<DualExposureConfig>? configure = null)
{
_options.DualExposure = new DualExposureConfig { Enabled = true };
configure?.Invoke(_options.DualExposure);
return this;
}
public IStellaMicroserviceBuilder UseYamlConfig(string path)
{
_options.Discovery.ConfigFilePath = path;
return this;
}
public IStellaMicroserviceHost Build()
{
// Apply custom service configuration
foreach (var action in _configureActions)
{
action(_services);
}
// Add core services
AddCoreServices();
// Add transport services
AddTransportServices();
// Add endpoint services
AddEndpointServices();
var serviceProvider = _services.BuildServiceProvider();
return serviceProvider.GetRequiredService<IStellaMicroserviceHost>();
}
private void AddCoreServices()
{
_services.AddSingleton<IStellaMicroserviceHost, StellaMicroserviceHost>();
_services.AddSingleton<IEndpointRegistry, EndpointRegistry>();
_services.AddSingleton<IRequestDispatcher, RequestDispatcher>();
_services.AddSingleton<IPayloadSerializer, MessagePackPayloadSerializer>();
}
private void AddTransportServices()
{
_services.AddSingleton<TcpFrameCodec>();
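// Normalize with the invariant culture so the match does not depend on the host locale
switch (_options.Transport.Default.ToUpperInvariant())
{
case "TCP":
_services.AddSingleton<ITransportServer, TcpTransportClient>();
break;
case "TLS":
_services.AddSingleton<ICertificateProvider, CertificateProvider>();
_services.AddSingleton<ITransportServer, TlsTransportClient>();
break;
case "INMEMORY":
// InMemory requires hub to be provided externally
_services.AddSingleton<ITransportServer, InMemoryTransportServer>();
break;
default:
// Fail at build time instead of leaving ITransportServer unregistered
throw new InvalidOperationException($"Unsupported transport '{_options.Transport.Default}'");
}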
}
private void AddEndpointServices()
{
_services.AddSingleton<IEndpointDiscovery, ReflectionEndpointDiscovery>();
if (!string.IsNullOrEmpty(_options.Discovery.ConfigFilePath))
{
_services.AddSingleton<IEndpointOverrideProvider, YamlEndpointOverrideProvider>();
}
}
}
```
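The builder above selects a transport by comparing strings. As an alternative, the name could be parsed once into an enum so that unsupported values are rejected in a single place. The following is a sketch under that assumption: `TransportKind` and `TransportNameSketch` are hypothetical and do not appear in the SDK described here.

```csharp
using System;

// Hypothetical enum covering the transport names mentioned in this step.
public enum TransportKind
{
    Tcp,
    Tls,
    InMemory
}

// Hypothetical helper for illustration only.
public static class TransportNameSketch
{
    // Parses "TCP", "tls", "InMemory" and similar spellings, ignoring case.
    public static TransportKind Parse(string name)
    {
        // Enum.TryParse also accepts numeric strings such as "7", hence the IsDefined check.
        if (Enum.TryParse<TransportKind>(name, ignoreCase: true, out var kind) && Enum.IsDefined(kind))
        {
            return kind;
        }

        throw new ArgumentException($"Unsupported transport '{name}'", nameof(name));
    }
}
```

A `switch` over `TransportKind` then lets the compiler and analyzers flag unhandled members, which string cases cannot do.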
---
## Microservice Host Implementation
```csharp
namespace StellaOps.Microservice;
public interface IStellaMicroserviceHost : IAsyncDisposable
{
StellaMicroserviceOptions Options { get; }
bool IsConnected { get; }
Task StartAsync(CancellationToken cancellationToken = default);
Task StopAsync(CancellationToken cancellationToken = default);
Task WaitForShutdownAsync(CancellationToken cancellationToken = default);
}
public sealed class StellaMicroserviceHost : IStellaMicroserviceHost, IHostedService
{
private readonly StellaMicroserviceOptions _options;
private readonly ITransportServer _transport;
private readonly IEndpointRegistry _endpointRegistry;
private readonly IRequestDispatcher _dispatcher;
private readonly ILogger<StellaMicroserviceHost> _logger;
private readonly CancellationTokenSource _shutdownCts = new();
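// Run continuations asynchronously so awaiters of WaitForShutdownAsync do not execute inside StopAsync
private readonly TaskCompletionSource _shutdownComplete = new(TaskCreationOptions.RunContinuationsAsynchronously);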
private Timer? _heartbeatTimer;
private IHost? _httpHost;
public StellaMicroserviceOptions Options => _options;
public bool IsConnected => _transport.IsConnected;
public StellaMicroserviceHost(
StellaMicroserviceOptions options,
ITransportServer transport,
IEndpointRegistry endpointRegistry,
IRequestDispatcher dispatcher,
ILogger<StellaMicroserviceHost> logger)
{
_options = options;
_transport = transport;
_endpointRegistry = endpointRegistry;
_dispatcher = dispatcher;
_logger = logger;
}
public async Task StartAsync(CancellationToken cancellationToken = default)
{
_logger.LogInformation(
"Starting microservice {ServiceName}/{InstanceId}",
_options.ServiceName, _options.InstanceId);
// Discover endpoints
var endpoints = await _endpointRegistry.DiscoverEndpointsAsync(cancellationToken);
_logger.LogInformation("Discovered {Count} endpoints", endpoints.Length);
// Wire up request handler
_transport.OnRequest += HandleRequestAsync;
_transport.OnCancel += HandleCancelAsync;
// Connect to router
var router = _options.Routers.Where(r => r.Enabled).OrderBy(r => r.Priority).FirstOrDefault()
?? throw new InvalidOperationException("No enabled routers configured");
await _transport.ConnectAsync(
_options.ServiceName,
_options.InstanceId,
endpoints,
cancellationToken);
_logger.LogInformation(
"Connected to router at {Host}:{Port}",
router.Host, router.Port);
// Start heartbeat
_heartbeatTimer = new Timer(
SendHeartbeatAsync,
null,
_options.Heartbeat.Interval,
_options.Heartbeat.Interval);
// Start dual exposure HTTP if enabled
if (_options.DualExposure?.Enabled == true)
{
await StartHttpHostAsync(cancellationToken);
}
_logger.LogInformation(
"Microservice {ServiceName} started successfully",
_options.ServiceName);
}
private async Task<ResponsePayload> HandleRequestAsync(
RequestPayload request,
CancellationToken cancellationToken)
{
// Start a tracing activity for this request; disposing it at the end of the method stops it
using var activity = new Activity("HandleRequest").Start();
activity?.SetTag("http.method", request.Method);
activity?.SetTag("http.path", request.Path);
try
{
return await _dispatcher.DispatchAsync(request, cancellationToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error handling request {Path}", request.Path);
return new ResponsePayload
{
StatusCode = 500,
Headers = new Dictionary<string, string>(),
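// Serialize through System.Text.Json so quotes or newlines in the message cannot break the JSON
Body = System.Text.Json.JsonSerializer.SerializeToUtf8Bytes(new { error = ex.Message }),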
IsFinalChunk = true
};
}
}
private Task HandleCancelAsync(string correlationId, CancellationToken cancellationToken)
{
_logger.LogDebug("Request {CorrelationId} cancelled", correlationId);
// Propagate cancellation to active request handling
return Task.CompletedTask;
}
private async void SendHeartbeatAsync(object? state)
{
try
{
await _transport.SendHeartbeatAsync(_shutdownCts.Token);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to send heartbeat");
}
}
private async Task StartHttpHostAsync(CancellationToken cancellationToken)
{
var config = _options.DualExposure!;
_httpHost = Host.CreateDefaultBuilder()
.ConfigureWebHostDefaults(web =>
{
web.UseKestrel(k => k.ListenAnyIP(config.HttpPort));
web.Configure(app =>
{
app.UseRouting();
app.UseEndpoints(endpoints =>
{
endpoints.MapFallback(async context =>
{
// Inject default claims for direct access
var claims = config.DefaultClaims;
var request = new RequestPayload
{
Method = context.Request.Method,
Path = context.Request.Path + context.Request.QueryString,
Host = context.Request.Host.Value,
Headers = context.Request.Headers
.ToDictionary(h => h.Key, h => h.Value.ToString()),
Claims = claims,
ClientIp = context.Connection.RemoteIpAddress?.ToString(),
TraceId = context.TraceIdentifier
};
// Read body if present
if (context.Request.ContentLength > 0)
{
using var ms = new MemoryStream();
await context.Request.Body.CopyToAsync(ms);
request = request with { Body = ms.ToArray() };
}
var response = await _dispatcher.DispatchAsync(request, context.RequestAborted);
context.Response.StatusCode = response.StatusCode;
foreach (var (key, value) in response.Headers)
{
context.Response.Headers[key] = value;
}
if (response.Body != null)
{
await context.Response.Body.WriteAsync(response.Body);
}
});
});
});
})
.Build();
await _httpHost.StartAsync(cancellationToken);
_logger.LogInformation(
"Direct HTTP access enabled on port {Port}",
config.HttpPort);
}
public async Task StopAsync(CancellationToken cancellationToken = default)
{
_logger.LogInformation(
"Stopping microservice {ServiceName}",
_options.ServiceName);
_shutdownCts.Cancel();
_heartbeatTimer?.Dispose();
if (_httpHost != null)
{
await _httpHost.StopAsync(cancellationToken);
}
await _transport.DisconnectAsync();
_logger.LogInformation(
"Microservice {ServiceName} stopped",
_options.ServiceName);
_shutdownComplete.TrySetResult();
}
public Task WaitForShutdownAsync(CancellationToken cancellationToken = default)
{
return _shutdownComplete.Task.WaitAsync(cancellationToken);
}
public async ValueTask DisposeAsync()
{
await StopAsync();
_shutdownCts.Dispose();
}
// IHostedService implementation for ASP.NET Core integration
Task IHostedService.StartAsync(CancellationToken cancellationToken) => StartAsync(cancellationToken);
Task IHostedService.StopAsync(CancellationToken cancellationToken) => StopAsync(cancellationToken);
}
```
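The host drives heartbeats from a `Timer` with an `async void` callback, which can overlap ticks if a send is slow. A possible alternative is a single awaited loop built on `PeriodicTimer` (available from .NET 6). The sketch below assumes that approach; `HeartbeatLoopSketch` and the `sendHeartbeat` delegate are hypothetical stand-ins for the host and its transport call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class HeartbeatLoopSketch
{
    // Sends one heartbeat per tick until the token is cancelled; returns how many sends succeeded.
    public static async Task<int> RunAsync(
        TimeSpan interval,
        Func<CancellationToken, Task> sendHeartbeat,
        CancellationToken stoppingToken)
    {
        var sent = 0;
        using var timer = new PeriodicTimer(interval);
        try
        {
            // Only one tick is awaited at a time, so sends never run concurrently.
            while (await timer.WaitForNextTickAsync(stoppingToken))
            {
                try
                {
                    await sendHeartbeat(stoppingToken);
                    sent++;
                }
                catch (Exception ex) when (ex is not OperationCanceledException)
                {
                    // A failed heartbeat is not fatal; the next tick tries again.
                    Console.Error.WriteLine($"Heartbeat failed: {ex.Message}");
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Cancellation is the normal shutdown path.
        }
        return sent;
    }
}
```

In a host like the one above, the returned task could be stored at start-up and awaited during shutdown after cancelling the token, so that no heartbeat is in flight when the transport disconnects.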
---
## ASP.NET Core Integration
```csharp
namespace StellaOps.Microservice;
public static class StellaMicroserviceExtensions
{
/// <summary>
/// Adds Stella microservice to an existing ASP.NET Core host.
/// </summary>
public static IServiceCollection AddStellaMicroservice(
this IServiceCollection services,
Action<StellaMicroserviceOptions> configure)
{
var options = new StellaMicroserviceOptions { ServiceName = "unknown" };
configure(options);
services.AddSingleton(options);
services.AddSingleton<IEndpointRegistry, EndpointRegistry>();
services.AddSingleton<IRequestDispatcher, RequestDispatcher>();
services.AddSingleton<IPayloadSerializer, MessagePackPayloadSerializer>();
services.AddSingleton<TcpFrameCodec>();
// Add transport based on configuration
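// Normalize with the invariant culture so the match does not depend on the host locale
switch (options.Transport.Default.ToUpperInvariant())
{
case "TCP":
services.AddSingleton<ITransportServer, TcpTransportClient>();
break;
case "TLS":
services.AddSingleton<ICertificateProvider, CertificateProvider>();
services.AddSingleton<ITransportServer, TlsTransportClient>();
break;
default:
// Fail at registration time instead of leaving ITransportServer unregistered
throw new InvalidOperationException($"Unsupported transport '{options.Transport.Default}'");
}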
services.AddSingleton<IStellaMicroserviceHost, StellaMicroserviceHost>();
services.AddHostedService(sp => (StellaMicroserviceHost)sp.GetRequiredService<IStellaMicroserviceHost>());
return services;
}
/// <summary>
/// Configures an endpoint handler for the microservice.
/// </summary>
public static IServiceCollection AddEndpointHandler<THandler>(
this IServiceCollection services)
where THandler : class, IEndpointHandler
{
services.AddScoped<IEndpointHandler, THandler>();
return services;
}
}
```
---
## Usage Examples
### Standalone Microservice
```csharp
var host = StellaMicroserviceBuilder
.Create("billing-service")
.AddRouter("gateway.internal", 9500, "TLS")
.ConfigureTransport(t =>
{
t.Tls = new TlsClientConfig
{
ClientCertificatePath = "/etc/certs/billing.pfx",
ClientCertificatePassword = Environment.GetEnvironmentVariable("CERT_PASSWORD")
};
})
.ConfigureEndpoints(e =>
{
e.BasePath = "/billing";
e.ScanAssemblies.Add("BillingService.Handlers");
})
.ConfigureServices(services =>
{
services.AddScoped<BillingContext>();
services.AddScoped<InvoiceHandler>();
})
.Build();
await host.StartAsync();
await host.WaitForShutdownAsync();
```
### ASP.NET Core Integration
```csharp
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddStellaMicroservice(options =>
{
options.ServiceName = "user-service";
options.Region = "us-east-1";
options.Routers.Add(new RouterConnectionConfig
{
Host = "gateway.internal",
Port = 9500
});
options.DualExposure = new DualExposureConfig
{
Enabled = true,
HttpPort = 8080,
DefaultClaims = new Dictionary<string, string>
{
["tier"] = "free"
}
};
});
builder.Services.AddEndpointHandler<UserEndpointHandler>();
var app = builder.Build();
await app.RunAsync();
```
---
## YAML Configuration
```yaml
Microservice:
  ServiceName: "billing-service"
  Version: "1.0.0"
  Region: "us-east-1"
  Tags:
    team: "payments"
    tier: "critical"
  Routers:
    - Host: "gateway-primary.internal"
      Port: 9500
      Transport: "TLS"
      Priority: 1
    - Host: "gateway-secondary.internal"
      Port: 9500
      Transport: "TLS"
      Priority: 2
  Transport:
    Default: "TLS"
    Tls:
      ClientCertificatePath: "/etc/certs/service.pfx"
      ClientCertificatePassword: "${CERT_PASSWORD}"
  Discovery:
    AutoDiscover: true
    BasePath: "/billing"
    ConfigFilePath: "/etc/stellaops/endpoints.yaml"
  Heartbeat:
    Interval: "00:00:10"
    Timeout: "00:00:05"
  DualExposure:
    Enabled: true
    HttpPort: 8080
    DefaultClaims:
      tier: "free"
  ShutdownTimeout: "00:00:30"
```
---
## Deliverables
1. `StellaOps.Microservice/StellaMicroserviceOptions.cs`
2. `StellaOps.Microservice/IStellaMicroserviceBuilder.cs`
3. `StellaOps.Microservice/StellaMicroserviceBuilder.cs`
4. `StellaOps.Microservice/IStellaMicroserviceHost.cs`
5. `StellaOps.Microservice/StellaMicroserviceHost.cs`
6. `StellaOps.Microservice/StellaMicroserviceExtensions.cs`
7. Builder pattern tests
8. Lifecycle tests (start/stop/reconnect)
9. Dual exposure mode tests
---
## Next Step
Proceed to [Step 20: Endpoint Discovery & Registration](20-Step.md) to implement automatic endpoint discovery.

@@ -1,696 +0,0 @@
# Step 20: Endpoint Discovery & Registration

**Phase 5: Microservice SDK**

**Estimated Complexity:** Medium

**Dependencies:** Step 19 (Microservice Host Builder)

---

## Overview

Endpoint discovery automatically finds and registers HTTP endpoints from microservice code using attributes and reflection. YAML configuration provides overrides for metadata such as rate limits, authentication requirements, and versioning.

---
## Goals
1. Discover endpoints via reflection and attributes
2. Support YAML-based metadata overrides
3. Generate EndpointDescriptor for router registration
4. Support endpoint versioning and deprecation
5. Validate endpoint configurations at startup
---
## Endpoint Attributes
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Marks a class as containing Stella endpoints.
/// </summary>
[AttributeUsage(AttributeTargets.Class)]
public sealed class StellaEndpointAttribute : Attribute
{
public string? BasePath { get; set; }
public string? Version { get; set; }
public string[]? Tags { get; set; }
}
/// <summary>
/// Marks a method as a Stella endpoint handler.
/// Not sealed: the HTTP-method convenience attributes below derive from it.
/// </summary>
[AttributeUsage(AttributeTargets.Method)]
public class StellaRouteAttribute : Attribute
{
public string Method { get; }
public string Path { get; }
public string? Name { get; set; }
public string? Description { get; set; }
public StellaRouteAttribute(string method, string path)
{
Method = method;
Path = path;
}
}
/// <summary>
/// Specifies authentication requirements for an endpoint.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class StellaAuthAttribute : Attribute
{
public bool Required { get; set; } = true;
public string[]? RequiredClaims { get; set; }
public string? Policy { get; set; }
}
/// <summary>
/// Specifies rate limiting for an endpoint.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class StellaRateLimitAttribute : Attribute
{
public int RequestsPerMinute { get; set; }
public string? BucketKey { get; set; } // e.g., "sub", "ip", "path"
}
/// <summary>
/// Specifies timeout for an endpoint.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class StellaTimeoutAttribute : Attribute
{
public int TimeoutMs { get; }
public StellaTimeoutAttribute(int timeoutMs)
{
TimeoutMs = timeoutMs;
}
}
/// <summary>
/// Marks an endpoint as deprecated.
/// </summary>
[AttributeUsage(AttributeTargets.Method)]
public sealed class StellaDeprecatedAttribute : Attribute
{
public string? Message { get; set; }
public string? AlternativeEndpoint { get; set; }
public string? SunsetDate { get; set; }
}
/// <summary>
/// Convenience attributes for common HTTP methods.
/// </summary>
public sealed class StellaGetAttribute : StellaRouteAttribute
{
public StellaGetAttribute(string path) : base("GET", path) { }
}
public sealed class StellaPostAttribute : StellaRouteAttribute
{
public StellaPostAttribute(string path) : base("POST", path) { }
}
public sealed class StellaPutAttribute : StellaRouteAttribute
{
public StellaPutAttribute(string path) : base("PUT", path) { }
}
public sealed class StellaDeleteAttribute : StellaRouteAttribute
{
public StellaDeleteAttribute(string path) : base("DELETE", path) { }
}
public sealed class StellaPatchAttribute : StellaRouteAttribute
{
public StellaPatchAttribute(string path) : base("PATCH", path) { }
}
```
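A usage sketch may help here: the snippet below annotates a hypothetical `InvoiceHandler` and reads its routes back via reflection. The attribute classes are trimmed stand-ins for the definitions above so that the file compiles on its own; the handler, its paths, and the printed format are illustrative assumptions, not part of the spec.

```csharp
using System;
using System.Reflection;
using System.Threading.Tasks;

// Read the annotations back roughly the way a discovery pass would.
var handlerType = typeof(InvoiceHandler);
var endpointAttr = handlerType.GetCustomAttribute<StellaEndpointAttribute>();
var flags = BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly;
foreach (var method in handlerType.GetMethods(flags))
{
    // Asking for the base attribute type also returns derived ones such as [StellaGet].
    var route = method.GetCustomAttribute<StellaRouteAttribute>();
    if (route is null) continue;
    Console.WriteLine($"{route.Method} {endpointAttr?.BasePath}{route.Path} -> {method.Name}");
}

// Trimmed stand-ins for the attributes defined above.
[AttributeUsage(AttributeTargets.Class)]
public sealed class StellaEndpointAttribute : Attribute
{
    public string? BasePath { get; set; }
}

[AttributeUsage(AttributeTargets.Method)]
public class StellaRouteAttribute : Attribute
{
    public string Method { get; }
    public string Path { get; }
    public StellaRouteAttribute(string method, string path) { Method = method; Path = path; }
}

public sealed class StellaGetAttribute : StellaRouteAttribute
{
    public StellaGetAttribute(string path) : base("GET", path) { }
}

public sealed class StellaPostAttribute : StellaRouteAttribute
{
    public StellaPostAttribute(string path) : base("POST", path) { }
}

// Hypothetical handler, for illustration only.
[StellaEndpoint(BasePath = "/invoices")]
public sealed class InvoiceHandler
{
    [StellaGet("/{id}")]
    public Task<string> GetInvoice() => Task.FromResult("invoice");

    [StellaPost("/")]
    public Task<string> CreateInvoice() => Task.FromResult("created");
}
```

Because the lookup asks for `StellaRouteAttribute`, the convenience attributes need no special handling in discovery code, provided the base attribute class is left unsealed.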
---
## Endpoint Descriptor
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Describes an endpoint for router registration.
/// </summary>
public sealed class EndpointDescriptor
{
/// <summary>HTTP method (GET, POST, etc.).</summary>
public required string Method { get; init; }
/// <summary>Path pattern (may include parameters like {id}).</summary>
public required string Path { get; init; }
/// <summary>Unique endpoint name.</summary>
public string? Name { get; init; }
/// <summary>Endpoint description for documentation.</summary>
public string? Description { get; init; }
/// <summary>API version.</summary>
public string? Version { get; init; }
/// <summary>Tags for grouping/filtering.</summary>
public string[]? Tags { get; init; }
/// <summary>Whether authentication is required.</summary>
public bool RequiresAuth { get; init; } = true;
/// <summary>Required claims for access.</summary>
public string[]? RequiredClaims { get; init; }
/// <summary>Authentication policy name.</summary>
public string? AuthPolicy { get; init; }
/// <summary>Rate limit configuration.</summary>
public RateLimitDescriptor? RateLimit { get; init; }
/// <summary>Request timeout in milliseconds.</summary>
public int? TimeoutMs { get; init; }
/// <summary>Deprecation information.</summary>
public DeprecationDescriptor? Deprecation { get; init; }
/// <summary>Custom metadata.</summary>
public Dictionary<string, string>? Metadata { get; init; }
}
public sealed class RateLimitDescriptor
{
public int RequestsPerMinute { get; init; }
public string BucketKey { get; init; } = "sub";
}
public sealed class DeprecationDescriptor
{
public string? Message { get; init; }
public string? AlternativeEndpoint { get; init; }
public DateOnly? SunsetDate { get; init; }
}
```
---
## Endpoint Discovery Interface
```csharp
using System.Reflection;
namespace StellaOps.Microservice;
public interface IEndpointDiscovery
{
/// <summary>
/// Discovers endpoints from configured assemblies.
/// </summary>
Task<IReadOnlyList<DiscoveredEndpoint>> DiscoverAsync(CancellationToken cancellationToken);
}
public sealed class DiscoveredEndpoint
{
public required EndpointDescriptor Descriptor { get; init; }
public required Type HandlerType { get; init; }
public required MethodInfo HandlerMethod { get; init; }
}
```
---
## Reflection-Based Discovery
```csharp
using System.Reflection;
using Microsoft.Extensions.Logging;
namespace StellaOps.Microservice;
public sealed class ReflectionEndpointDiscovery : IEndpointDiscovery
{
private readonly EndpointDiscoveryConfig _config;
private readonly ILogger<ReflectionEndpointDiscovery> _logger;
public ReflectionEndpointDiscovery(
StellaMicroserviceOptions options,
ILogger<ReflectionEndpointDiscovery> logger)
{
_config = options.Discovery;
_logger = logger;
}
public Task<IReadOnlyList<DiscoveredEndpoint>> DiscoverAsync(CancellationToken cancellationToken)
{
var endpoints = new List<DiscoveredEndpoint>();
var assemblies = GetAssembliesToScan();
foreach (var assembly in assemblies)
{
foreach (var type in assembly.GetExportedTypes())
{
var classAttr = type.GetCustomAttribute<StellaEndpointAttribute>();
if (classAttr == null)
continue;
var classAuth = type.GetCustomAttribute<StellaAuthAttribute>();
var classRateLimit = type.GetCustomAttribute<StellaRateLimitAttribute>();
var classTimeout = type.GetCustomAttribute<StellaTimeoutAttribute>();
foreach (var method in type.GetMethods(BindingFlags.Public | BindingFlags.Instance))
{
var routeAttr = method.GetCustomAttribute<StellaRouteAttribute>();
if (routeAttr == null)
continue;
var endpoint = BuildEndpoint(
type, method, classAttr, routeAttr,
classAuth, classRateLimit, classTimeout);
endpoints.Add(endpoint);
_logger.LogDebug(
"Discovered endpoint: {Method} {Path}",
endpoint.Descriptor.Method, endpoint.Descriptor.Path);
}
}
}
_logger.LogInformation("Discovered {Count} endpoints", endpoints.Count);
return Task.FromResult<IReadOnlyList<DiscoveredEndpoint>>(endpoints);
}
private IEnumerable<Assembly> GetAssembliesToScan()
{
if (_config.ScanAssemblies.Any())
{
return _config.ScanAssemblies.Select(Assembly.Load);
}
// Default: scan entry assembly and referenced assemblies
var entry = Assembly.GetEntryAssembly();
if (entry == null)
return Enumerable.Empty<Assembly>();
return new[] { entry }
.Concat(entry.GetReferencedAssemblies().Select(Assembly.Load));
}
private DiscoveredEndpoint BuildEndpoint(
Type handlerType,
MethodInfo method,
StellaEndpointAttribute classAttr,
StellaRouteAttribute routeAttr,
StellaAuthAttribute? classAuth,
StellaRateLimitAttribute? classRateLimit,
StellaTimeoutAttribute? classTimeout)
{
// Method-level attributes override class-level
var methodAuth = method.GetCustomAttribute<StellaAuthAttribute>() ?? classAuth;
var methodRateLimit = method.GetCustomAttribute<StellaRateLimitAttribute>() ?? classRateLimit;
var methodTimeout = method.GetCustomAttribute<StellaTimeoutAttribute>() ?? classTimeout;
var deprecatedAttr = method.GetCustomAttribute<StellaDeprecatedAttribute>();
// Build full path
var basePath = classAttr.BasePath?.TrimEnd('/') ?? "";
if (!string.IsNullOrEmpty(_config.BasePath))
{
basePath = _config.BasePath.TrimEnd('/') + basePath;
}
var fullPath = basePath + "/" + routeAttr.Path.TrimStart('/');
var descriptor = new EndpointDescriptor
{
Method = routeAttr.Method,
Path = fullPath,
Name = routeAttr.Name ?? $"{handlerType.Name}.{method.Name}",
Description = routeAttr.Description,
Version = classAttr.Version,
Tags = classAttr.Tags,
RequiresAuth = methodAuth?.Required ?? true,
RequiredClaims = methodAuth?.RequiredClaims,
AuthPolicy = methodAuth?.Policy,
RateLimit = methodRateLimit != null ? new RateLimitDescriptor
{
RequestsPerMinute = methodRateLimit.RequestsPerMinute,
BucketKey = methodRateLimit.BucketKey ?? "sub"
} : null,
TimeoutMs = methodTimeout?.TimeoutMs,
Deprecation = deprecatedAttr != null ? new DeprecationDescriptor
{
Message = deprecatedAttr.Message,
AlternativeEndpoint = deprecatedAttr.AlternativeEndpoint,
SunsetDate = DateOnly.TryParse(deprecatedAttr.SunsetDate, out var date) ? date : null
} : null
};
return new DiscoveredEndpoint
{
Descriptor = descriptor,
HandlerType = handlerType,
HandlerMethod = method
};
}
}
```
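The path composition in `BuildEndpoint` concatenates the configured base path, the class base path, and the route path. As a sketch of the same idea, the hypothetical helper below (`RoutePaths.Combine` is not part of the SDK) normalizes each part first, so a missing leading slash or a route of `"/"` does not produce `//` or a trailing slash. It is a slightly stricter variant than the code above and is shown under that assumption.

```csharp
using System;

Console.WriteLine(RoutePaths.Combine("/billing", "/invoices/", "{id}")); // /billing/invoices/{id}
Console.WriteLine(RoutePaths.Combine(null, "/invoices", "/"));           // /invoices
Console.WriteLine(RoutePaths.Combine("/billing/", null, "health"));      // /billing/health

public static class RoutePaths
{
    // Joins config base path, class base path, and route path into one absolute path.
    public static string Combine(string? configBasePath, string? classBasePath, string? routePath)
    {
        var full = Normalize(configBasePath) + Normalize(classBasePath) + Normalize(routePath);
        return full.Length == 0 ? "/" : full;
    }

    // Strips surrounding slashes and re-adds exactly one leading slash; empty parts vanish.
    private static string Normalize(string? part)
    {
        var trimmed = part?.Trim('/') ?? string.Empty;
        return trimmed.Length == 0 ? string.Empty : "/" + trimmed;
    }
}
```

Whether trailing slashes should be collapsed like this is a design choice; the router's own matching rules decide whether `/invoices` and `/invoices/` are the same endpoint.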
---
## YAML Override Provider
```csharp
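using Microsoft.Extensions.Logging;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;
namespace StellaOps.Microservice;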
public interface IEndpointOverrideProvider
{
/// <summary>
/// Applies overrides to discovered endpoints.
/// </summary>
void ApplyOverrides(IList<DiscoveredEndpoint> endpoints);
}
public sealed class YamlEndpointOverrideProvider : IEndpointOverrideProvider
{
private readonly EndpointDiscoveryConfig _config;
private readonly ILogger<YamlEndpointOverrideProvider> _logger;
private readonly Dictionary<string, EndpointOverride> _overrides = new();
public YamlEndpointOverrideProvider(
StellaMicroserviceOptions options,
ILogger<YamlEndpointOverrideProvider> logger)
{
_config = options.Discovery;
_logger = logger;
LoadOverrides();
}
private void LoadOverrides()
{
if (string.IsNullOrEmpty(_config.ConfigFilePath))
return;
if (!File.Exists(_config.ConfigFilePath))
{
_logger.LogWarning("Endpoint config file not found: {Path}", _config.ConfigFilePath);
return;
}
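var yaml = File.ReadAllText(_config.ConfigFilePath);
// Keys in the override file are PascalCase (Endpoints, RateLimit, TimeoutMs), so use the matching convention.
var deserializer = new DeserializerBuilder()
.WithNamingConvention(PascalCaseNamingConvention.Instance)
.Build();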
var config = deserializer.Deserialize<EndpointOverrideConfig>(yaml);
if (config?.Endpoints != null)
{
foreach (var (key, value) in config.Endpoints)
{
_overrides[key] = value;
}
}
_logger.LogInformation("Loaded {Count} endpoint overrides", _overrides.Count);
}
public void ApplyOverrides(IList<DiscoveredEndpoint> endpoints)
{
// Index-based loop: matching entries are replaced in the list, which a foreach would not allow.
for (var i = 0; i < endpoints.Count; i++)
{
var endpoint = endpoints[i];
var key = $"{endpoint.Descriptor.Method} {endpoint.Descriptor.Path}";
// Lookup order: "METHOD /path", then the path alone, then the endpoint name.
if (_overrides.TryGetValue(key, out var over) ||
_overrides.TryGetValue(endpoint.Descriptor.Path, out over) ||
(endpoint.Descriptor.Name != null && _overrides.TryGetValue(endpoint.Descriptor.Name, out over)))
{
endpoints[i] = ApplyOverride(endpoint, over);
}
}
}
private DiscoveredEndpoint ApplyOverride(DiscoveredEndpoint endpoint, EndpointOverride over)
{
// Descriptors are init-only, so build a new one with the overrides applied
var original = endpoint.Descriptor;
var updated = new EndpointDescriptor
{
Method = original.Method,
Path = original.Path,
Name = over.Name ?? original.Name,
Description = over.Description ?? original.Description,
Version = over.Version ?? original.Version,
Tags = over.Tags ?? original.Tags,
RequiresAuth = over.RequiresAuth ?? original.RequiresAuth,
RequiredClaims = over.RequiredClaims ?? original.RequiredClaims,
AuthPolicy = over.AuthPolicy ?? original.AuthPolicy,
RateLimit = over.RateLimit != null ? new RateLimitDescriptor
{
RequestsPerMinute = over.RateLimit.RequestsPerMinute,
BucketKey = over.RateLimit.BucketKey ?? "sub"
} : original.RateLimit,
TimeoutMs = over.TimeoutMs ?? original.TimeoutMs,
Deprecation = original.Deprecation, // Keep original deprecation
Metadata = MergeMetadata(original.Metadata, over.Metadata)
};
_logger.LogDebug("Applied override to endpoint {Path}", original.Path);
// Return a new entry that keeps the handler binding and carries the updated descriptor
return new DiscoveredEndpoint
{
Descriptor = updated,
HandlerType = endpoint.HandlerType,
HandlerMethod = endpoint.HandlerMethod
};
}
private Dictionary<string, string>? MergeMetadata(
Dictionary<string, string>? original,
Dictionary<string, string>? over)
{
if (original == null && over == null)
return null;
var result = new Dictionary<string, string>(original ?? new());
if (over != null)
{
foreach (var (key, value) in over)
{
result[key] = value;
}
}
return result;
}
}
internal class EndpointOverrideConfig
{
public Dictionary<string, EndpointOverride>? Endpoints { get; set; }
}
internal class EndpointOverride
{
public string? Name { get; set; }
public string? Description { get; set; }
public string? Version { get; set; }
public string[]? Tags { get; set; }
public bool? RequiresAuth { get; set; }
public string[]? RequiredClaims { get; set; }
public string? AuthPolicy { get; set; }
public RateLimitOverride? RateLimit { get; set; }
public int? TimeoutMs { get; set; }
public Dictionary<string, string>? Metadata { get; set; }
}
internal class RateLimitOverride
{
public int RequestsPerMinute { get; set; }
public string? BucketKey { get; set; }
}
```
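The merge rule used by `ApplyOverride` is "the override wins when it is set, otherwise keep the original". One way to express that without copying every property by hand is a record and a `with` expression. The sketch below uses simplified stand-in types (`Descriptor`, `Patch`, and `Overrides` are hypothetical names) and assumes the descriptor can be declared as a record.

```csharp
using System;

var original = new Descriptor("GET", "/billing/invoices") { TimeoutMs = 5000, AuthPolicy = "default" };
var patch = new Patch { TimeoutMs = 30000 };

var updated = Overrides.Apply(original, patch);
Console.WriteLine(updated.TimeoutMs);   // 30000 (taken from the patch)
Console.WriteLine(updated.AuthPolicy);  // default (kept from the original)
Console.WriteLine(original.TimeoutMs);  // 5000 (the original is not mutated)

// Simplified stand-in for an endpoint descriptor.
public sealed record Descriptor(string Method, string Path)
{
    public int? TimeoutMs { get; init; }
    public string? AuthPolicy { get; init; }
}

// Simplified stand-in for a YAML override entry; null means "not specified".
public sealed class Patch
{
    public int? TimeoutMs { get; set; }
    public string? AuthPolicy { get; set; }
}

public static class Overrides
{
    // Copies the original and replaces only the properties the patch specifies.
    public static Descriptor Apply(Descriptor original, Patch patch) =>
        original with
        {
            TimeoutMs = patch.TimeoutMs ?? original.TimeoutMs,
            AuthPolicy = patch.AuthPolicy ?? original.AuthPolicy
        };
}
```

Properties that are not listed in the `with` block, here `Method` and `Path`, are carried over unchanged, which matches the rule that an override cannot change an endpoint's identity.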
---
## Endpoint Registry
```csharp
using Microsoft.Extensions.Logging;
namespace StellaOps.Microservice;
public interface IEndpointRegistry
{
Task<EndpointDescriptor[]> DiscoverEndpointsAsync(CancellationToken cancellationToken);
DiscoveredEndpoint? FindEndpoint(string method, string path);
}
public sealed class EndpointRegistry : IEndpointRegistry
{
private readonly IEndpointDiscovery _discovery;
private readonly IEndpointOverrideProvider? _overrideProvider;
private readonly ILogger<EndpointRegistry> _logger;
private IReadOnlyList<DiscoveredEndpoint>? _endpoints;
private readonly Dictionary<string, DiscoveredEndpoint> _endpointLookup = new();
public EndpointRegistry(
IEndpointDiscovery discovery,
IEndpointOverrideProvider? overrideProvider,
ILogger<EndpointRegistry> logger)
{
_discovery = discovery;
_overrideProvider = overrideProvider;
_logger = logger;
}
public async Task<EndpointDescriptor[]> DiscoverEndpointsAsync(CancellationToken cancellationToken)
{
_endpoints = await _discovery.DiscoverAsync(cancellationToken);
if (_overrideProvider != null)
{
var mutableList = _endpoints.ToList();
_overrideProvider.ApplyOverrides(mutableList);
_endpoints = mutableList;
}
// Build lookup table
_endpointLookup.Clear();
foreach (var endpoint in _endpoints)
{
var key = $"{endpoint.Descriptor.Method}:{endpoint.Descriptor.Path}";
_endpointLookup[key] = endpoint;
}
// Validate endpoints
ValidateEndpoints(_endpoints);
return _endpoints.Select(e => e.Descriptor).ToArray();
}
public DiscoveredEndpoint? FindEndpoint(string method, string path)
{
// Exact match
var key = $"{method}:{path}";
if (_endpointLookup.TryGetValue(key, out var endpoint))
return endpoint;
// Pattern match for path parameters
foreach (var ep in _endpoints ?? Enumerable.Empty<DiscoveredEndpoint>())
{
if (ep.Descriptor.Method != method)
continue;
if (IsPathMatch(path, ep.Descriptor.Path))
return ep;
}
return null;
}
private bool IsPathMatch(string requestPath, string pattern)
{
var patternSegments = pattern.Split('/', StringSplitOptions.RemoveEmptyEntries);
var pathSegments = requestPath.Split('/', StringSplitOptions.RemoveEmptyEntries);
if (patternSegments.Length != pathSegments.Length)
return false;
for (int i = 0; i < patternSegments.Length; i++)
{
var patternSeg = patternSegments[i];
var pathSeg = pathSegments[i];
// Check for path parameter
if (patternSeg.StartsWith('{') && patternSeg.EndsWith('}'))
continue;
if (!string.Equals(patternSeg, pathSeg, StringComparison.OrdinalIgnoreCase))
return false;
}
return true;
}
private void ValidateEndpoints(IReadOnlyList<DiscoveredEndpoint> endpoints)
{
var duplicates = endpoints
.GroupBy(e => $"{e.Descriptor.Method}:{e.Descriptor.Path}")
.Where(g => g.Count() > 1)
.Select(g => g.Key)
.ToList();
if (duplicates.Any())
{
throw new InvalidOperationException(
$"Duplicate endpoints detected: {string.Join(", ", duplicates)}");
}
// Validate handler method signatures
foreach (var endpoint in endpoints)
{
ValidateHandlerMethod(endpoint);
}
}
private void ValidateHandlerMethod(DiscoveredEndpoint endpoint)
{
var method = endpoint.HandlerMethod;
var returnType = method.ReturnType;
// Must return Task<ResponsePayload> or Task<T> where T can be serialized
if (!typeof(Task).IsAssignableFrom(returnType))
{
throw new InvalidOperationException(
$"Handler {method.Name} must return Task or Task<T>");
}
}
}
```
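`IsPathMatch` only answers whether a request path fits a pattern. A dispatcher usually also needs the values bound to the `{name}` segments. The following sketch extends the same segment-by-segment comparison to capture them; `RouteMatcher.TryMatch` is a hypothetical helper, and percent-decoding the captured values is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;

if (RouteMatcher.TryMatch("/billing/invoices/{id}", "/billing/invoices/42", out var values))
    Console.WriteLine(values["id"]); // 42

Console.WriteLine(RouteMatcher.TryMatch("/billing/invoices/{id}", "/billing/customers/42", out _)); // False

public static class RouteMatcher
{
    // Returns true when requestPath fits pattern; parameters receives the {name} captures.
    public static bool TryMatch(string pattern, string requestPath, out Dictionary<string, string> parameters)
    {
        parameters = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var patternSegments = pattern.Split('/', StringSplitOptions.RemoveEmptyEntries);
        var pathSegments = requestPath.Split('/', StringSplitOptions.RemoveEmptyEntries);
        if (patternSegments.Length != pathSegments.Length)
            return false;

        for (var i = 0; i < patternSegments.Length; i++)
        {
            var patternSeg = patternSegments[i];
            var pathSeg = pathSegments[i];

            // "{name}" captures the corresponding request segment.
            if (patternSeg.Length > 2 && patternSeg.StartsWith('{') && patternSeg.EndsWith('}'))
            {
                parameters[patternSeg[1..^1]] = Uri.UnescapeDataString(pathSeg);
                continue;
            }

            // Literal segments are compared case-insensitively, as in IsPathMatch.
            if (!string.Equals(patternSeg, pathSeg, StringComparison.OrdinalIgnoreCase))
            {
                parameters.Clear();
                return false;
            }
        }
        return true;
    }
}
```

Doing the match and the capture in one pass avoids splitting the path twice per request, at the cost of allocating a dictionary even for routes without parameters.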
---
## YAML Configuration Example
```yaml
# endpoints.yaml - Endpoint overrides
Endpoints:
  # Override by method + path (a bare path such as "/billing/invoices" is also accepted)
  "GET /billing/invoices":
    RateLimit:
      RequestsPerMinute: 100
      BucketKey: "sub"
    TimeoutMs: 30000
  # Override by name
  "InvoiceHandler.GetInvoice":
    RequiredClaims:
      - "billing:read"
    AuthPolicy: "billing-read"
  # Override by method + path
  "POST /billing/invoices":
    RequiredClaims:
      - "billing:write"
    RateLimit:
      RequestsPerMinute: 10
      BucketKey: "sub"
    Metadata:
      audit: "required"
```
---
## Deliverables
1. `StellaOps.Microservice/Attributes/*.cs` (all endpoint attributes)
2. `StellaOps.Microservice/EndpointDescriptor.cs`
3. `StellaOps.Microservice/IEndpointDiscovery.cs`
4. `StellaOps.Microservice/ReflectionEndpointDiscovery.cs`
5. `StellaOps.Microservice/IEndpointOverrideProvider.cs`
6. `StellaOps.Microservice/YamlEndpointOverrideProvider.cs`
7. `StellaOps.Microservice/IEndpointRegistry.cs`
8. `StellaOps.Microservice/EndpointRegistry.cs`
9. Attribute parsing tests
10. YAML override tests
11. Path matching tests
---
## Next Step
Proceed to [Step 21: Request/Response Context](21-Step.md) to implement the request handling context.

@@ -1,793 +0,0 @@
# Step 21: Request/Response Context

**Phase 5: Microservice SDK**

**Estimated Complexity:** Medium

**Dependencies:** Step 20 (Endpoint Discovery)

---

## Overview

The Request/Response Context provides a clean abstraction for endpoint handlers to access request data, claims, and build responses. It hides transport details while providing easy access to parsed path parameters, query strings, headers, and the request body.

---
## Goals
1. Provide clean request context abstraction
2. Support path parameter extraction
3. Provide typed body deserialization
4. Support streaming responses
5. Enable easy response building
---
## Request Context
```csharp
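using System.Text;
using System.Text.Json;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.WebUtilities;
using Microsoft.Extensions.Primitives;
namespace StellaOps.Microservice;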
/// <summary>
/// Context for handling a request in a microservice endpoint.
/// </summary>
public sealed class StellaRequestContext
{
private readonly RequestPayload _payload;
private readonly Dictionary<string, string> _pathParameters;
private readonly Lazy<IQueryCollection> _query;
private readonly Lazy<IHeaderDictionary> _headers;
internal StellaRequestContext(
RequestPayload payload,
Dictionary<string, string> pathParameters)
{
_payload = payload;
_pathParameters = pathParameters;
_query = new Lazy<IQueryCollection>(() => ParseQuery(payload.Path));
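// Header names are case-insensitive, so the backing dictionary needs an ignore-case comparer.
_headers = new Lazy<IHeaderDictionary>(() => new HeaderDictionary(
payload.Headers.ToDictionary(
h => h.Key,
h => new StringValues(h.Value),
StringComparer.OrdinalIgnoreCase)));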
}
/// <summary>HTTP method.</summary>
public string Method => _payload.Method;
/// <summary>Request path (without query string).</summary>
public string Path => _payload.Path.Split('?')[0];
/// <summary>Full path including query string.</summary>
public string FullPath => _payload.Path;
/// <summary>Host header value.</summary>
public string? Host => _payload.Host;
/// <summary>Client IP address.</summary>
public string? ClientIp => _payload.ClientIp;
/// <summary>Trace/correlation ID.</summary>
public string? TraceId => _payload.TraceId;
/// <summary>Request headers.</summary>
public IHeaderDictionary Headers => _headers.Value;
/// <summary>Query string parameters.</summary>
public IQueryCollection Query => _query.Value;
/// <summary>Authenticated claims from JWT + hydration.</summary>
public IReadOnlyDictionary<string, string> Claims => _payload.Claims;
/// <summary>Path parameters extracted from route pattern.</summary>
public IReadOnlyDictionary<string, string> PathParameters => _pathParameters;
/// <summary>Content-Type header value.</summary>
public string? ContentType => Headers.ContentType;
/// <summary>Content-Length header value.</summary>
public long? ContentLength => _payload.ContentLength > 0 ? _payload.ContentLength : null;
/// <summary>Whether the request has a body.</summary>
public bool HasBody => _payload.Body != null && _payload.Body.Length > 0;
/// <summary>Raw request body bytes.</summary>
public byte[]? RawBody => _payload.Body;
/// <summary>
/// Gets a path parameter by name.
/// </summary>
public string? GetPathParameter(string name)
{
return _pathParameters.TryGetValue(name, out var value) ? value : null;
}
/// <summary>
/// Gets a required path parameter, throws if missing.
/// </summary>
public string RequirePathParameter(string name)
{
return _pathParameters.TryGetValue(name, out var value)
? value
: throw new ArgumentException($"Missing path parameter: {name}");
}
/// <summary>
/// Gets a query parameter by name.
/// </summary>
public string? GetQueryParameter(string name)
{
return Query.TryGetValue(name, out var values) ? values.FirstOrDefault() : null;
}
/// <summary>
/// Gets all values for a query parameter.
/// </summary>
public string[] GetQueryParameterValues(string name)
{
return Query.TryGetValue(name, out var values) ? values.ToArray() : Array.Empty<string>();
}
/// <summary>
/// Gets a header value by name.
/// </summary>
public string? GetHeader(string name)
{
return Headers.TryGetValue(name, out var values) ? values.FirstOrDefault() : null;
}
/// <summary>
/// Gets a claim value by name.
/// </summary>
public string? GetClaim(string name)
{
return Claims.TryGetValue(name, out var value) ? value : null;
}
/// <summary>
/// Gets a required claim, throws if missing.
/// </summary>
public string RequireClaim(string name)
{
return Claims.TryGetValue(name, out var value)
? value
: throw new UnauthorizedAccessException($"Missing required claim: {name}");
}
/// <summary>
/// Reads the body as a string.
/// </summary>
public string? ReadBodyAsString(Encoding? encoding = null)
{
if (_payload.Body == null || _payload.Body.Length == 0)
return null;
return (encoding ?? Encoding.UTF8).GetString(_payload.Body);
}
/// <summary>
/// Deserializes the body as JSON.
/// </summary>
public T? ReadBodyAsJson<T>(JsonSerializerOptions? options = null)
{
if (_payload.Body == null || _payload.Body.Length == 0)
return default;
return JsonSerializer.Deserialize<T>(_payload.Body, options ?? JsonDefaults.Options);
}
/// <summary>
/// Deserializes the body as JSON, throwing if null or invalid.
/// </summary>
public T RequireBodyAsJson<T>(JsonSerializerOptions? options = null) where T : class
{
var result = ReadBodyAsJson<T>(options);
return result ?? throw new ArgumentException("Request body is required");
}
/// <summary>
/// Gets a body stream for reading.
/// </summary>
public Stream GetBodyStream()
{
return new MemoryStream(_payload.Body ?? Array.Empty<byte>(), writable: false);
}
private static IQueryCollection ParseQuery(string path)
{
var queryIndex = path.IndexOf('?');
if (queryIndex < 0)
return QueryCollection.Empty;
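var queryString = path[(queryIndex + 1)..];
// QueryHelpers.ParseQuery returns a Dictionary<string, StringValues>; wrap it to get an IQueryCollection.
return new QueryCollection(QueryHelpers.ParseQuery(queryString));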
}
}
internal static class JsonDefaults
{
public static readonly JsonSerializerOptions Options = new()
{
PropertyNameCaseInsensitive = true,
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
};
}
```
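To illustrate how the context separates `Path` from `Query`, here is a standalone sketch of the same split. It deliberately uses `System.Web.HttpUtility.ParseQueryString` from the base class library instead of ASP.NET Core's `QueryHelpers`, so that it runs in a plain console project; `RequestTarget` is a hypothetical helper name.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

var (path, query) = RequestTarget.Split("/billing/invoices?status=open&tag=a&tag=b");
Console.WriteLine(path);            // /billing/invoices
Console.WriteLine(query["status"]); // open
Console.WriteLine(string.Join("|", query.GetValues("tag") ?? Array.Empty<string>())); // a|b

public static class RequestTarget
{
    // Splits "path?query" into the bare path and the parsed query parameters.
    public static (string Path, NameValueCollection Query) Split(string target)
    {
        var queryIndex = target.IndexOf('?');
        if (queryIndex < 0)
            return (target, new NameValueCollection());

        return (target[..queryIndex], HttpUtility.ParseQueryString(target[(queryIndex + 1)..]));
    }
}
```

Repeated keys such as `tag` are kept as separate values, which corresponds to what `GetQueryParameterValues` exposes on the context.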
---
## Response Builder
```csharp
using System.Text;
using System.Text.Json;
namespace StellaOps.Microservice;
/// <summary>
/// Builder for constructing endpoint responses.
/// </summary>
public sealed class StellaResponseBuilder
{
private int _statusCode = 200;
private readonly Dictionary<string, string> _headers = new(StringComparer.OrdinalIgnoreCase);
private byte[]? _body;
private string _contentType = "application/json";
/// <summary>
/// Creates a new response builder.
/// </summary>
public static StellaResponseBuilder Create() => new();
/// <summary>
/// Sets the status code.
/// </summary>
public StellaResponseBuilder WithStatus(int statusCode)
{
_statusCode = statusCode;
return this;
}
/// <summary>
/// Sets a response header.
/// </summary>
public StellaResponseBuilder WithHeader(string name, string value)
{
_headers[name] = value;
return this;
}
/// <summary>
/// Sets multiple response headers.
/// </summary>
public StellaResponseBuilder WithHeaders(IEnumerable<KeyValuePair<string, string>> headers)
{
foreach (var (key, value) in headers)
{
_headers[key] = value;
}
return this;
}
/// <summary>
/// Sets the Content-Type header.
/// </summary>
public StellaResponseBuilder WithContentType(string contentType)
{
_contentType = contentType;
return this;
}
/// <summary>
/// Sets a JSON body.
/// </summary>
public StellaResponseBuilder WithJson<T>(T value, JsonSerializerOptions? options = null)
{
_contentType = "application/json";
_body = JsonSerializer.SerializeToUtf8Bytes(value, options ?? JsonDefaults.Options);
return this;
}
/// <summary>
/// Sets a string body.
/// </summary>
public StellaResponseBuilder WithText(string text, Encoding? encoding = null)
{
if (!_headers.ContainsKey("Content-Type") && _contentType == "application/json")
{
_contentType = "text/plain";
}
_body = (encoding ?? Encoding.UTF8).GetBytes(text);
return this;
}
/// <summary>
/// Sets raw bytes as body.
/// </summary>
public StellaResponseBuilder WithBytes(byte[] data, string? contentType = null)
{
if (contentType != null)
{
_contentType = contentType;
}
_body = data;
return this;
}
/// <summary>
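/// Copies the remaining content of the stream into memory and uses it as the body.
/// The data is buffered in full; it is not streamed to the caller.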
/// </summary>
public StellaResponseBuilder WithStream(Stream stream, string? contentType = null)
{
if (contentType != null)
{
_contentType = contentType;
}
using var ms = new MemoryStream();
stream.CopyTo(ms);
_body = ms.ToArray();
return this;
}
/// <summary>
/// Builds the response payload.
/// </summary>
public ResponsePayload Build()
{
_headers["Content-Type"] = _contentType;
return new ResponsePayload
{
StatusCode = _statusCode,
Headers = new Dictionary<string, string>(_headers),
Body = _body,
IsFinalChunk = true
};
}
// Static factory methods for common responses
/// <summary>Creates a 200 OK response with JSON body.</summary>
public static ResponsePayload Ok<T>(T value) =>
Create().WithStatus(200).WithJson(value).Build();
/// <summary>Creates a 200 OK response with no body.</summary>
public static ResponsePayload Ok() =>
Create().WithStatus(200).Build();
/// <summary>Creates a 201 Created response with JSON body.</summary>
public static ResponsePayload Created<T>(T value, string? location = null)
{
var builder = Create().WithStatus(201).WithJson(value);
if (location != null)
{
builder.WithHeader("Location", location);
}
return builder.Build();
}
/// <summary>Creates a 204 No Content response.</summary>
public static ResponsePayload NoContent() =>
Create().WithStatus(204).Build();
/// <summary>Creates a 400 Bad Request response.</summary>
public static ResponsePayload BadRequest(string message) =>
Create().WithStatus(400).WithJson(new { error = message }).Build();
/// <summary>Creates a 400 Bad Request response with validation errors.</summary>
public static ResponsePayload BadRequest(Dictionary<string, string[]> errors) =>
Create().WithStatus(400).WithJson(new { errors }).Build();
/// <summary>Creates a 401 Unauthorized response.</summary>
public static ResponsePayload Unauthorized(string? message = null) =>
Create().WithStatus(401).WithJson(new { error = message ?? "Unauthorized" }).Build();
/// <summary>Creates a 403 Forbidden response.</summary>
public static ResponsePayload Forbidden(string? message = null) =>
Create().WithStatus(403).WithJson(new { error = message ?? "Forbidden" }).Build();
/// <summary>Creates a 404 Not Found response.</summary>
public static ResponsePayload NotFound(string? message = null) =>
Create().WithStatus(404).WithJson(new { error = message ?? "Not found" }).Build();
/// <summary>Creates a 409 Conflict response.</summary>
public static ResponsePayload Conflict(string message) =>
Create().WithStatus(409).WithJson(new { error = message }).Build();
/// <summary>Creates a 500 Internal Server Error response.</summary>
public static ResponsePayload InternalError(string? message = null) =>
Create().WithStatus(500).WithJson(new { error = message ?? "Internal server error" }).Build();
/// <summary>Creates a 503 Service Unavailable response.</summary>
public static ResponsePayload ServiceUnavailable(string? message = null) =>
Create().WithStatus(503).WithJson(new { error = message ?? "Service unavailable" }).Build();
/// <summary>Creates a redirect response.</summary>
public static ResponsePayload Redirect(string location, bool permanent = false) =>
Create()
.WithStatus(permanent ? 301 : 302)
.WithHeader("Location", location)
.Build();
}
```
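For reference, the sketch below shows the JSON bodies the factory methods would produce, assuming the `JsonDefaults.Options` settings shown earlier (camel-case property names). It calls `System.Text.Json` directly rather than the builder so that it runs standalone; `InvoiceDto` is a hypothetical payload type.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Same settings as JsonDefaults.Options in the request context section.
var options = new JsonSerializerOptions
{
    PropertyNameCaseInsensitive = true,
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase
};

// Shape used by NotFound / Unauthorized / Conflict and similar helpers.
byte[] notFound = JsonSerializer.SerializeToUtf8Bytes(new { error = "Not found" }, options);
Console.WriteLine(Encoding.UTF8.GetString(notFound));   // {"error":"Not found"}

// Shape used by the validation overload of BadRequest.
var errors = new Dictionary<string, string[]> { ["Email"] = new[] { "is required" } };
byte[] validation = JsonSerializer.SerializeToUtf8Bytes(new { errors }, options);
Console.WriteLine(Encoding.UTF8.GetString(validation)); // {"errors":{"Email":["is required"]}}

// A typed payload: the camel-case policy renames InvoiceId to invoiceId.
byte[] created = JsonSerializer.SerializeToUtf8Bytes(new InvoiceDto(42, "open"), options);
Console.WriteLine(Encoding.UTF8.GetString(created));    // {"invoiceId":42,"status":"open"}

// Hypothetical payload type, for illustration only.
public sealed record InvoiceDto(int InvoiceId, string Status);
```

Note that the naming policy applies to property names only: dictionary keys such as `Email` are written as-is unless a `DictionaryKeyPolicy` is also configured.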
---
## Endpoint Handler Interface
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Interface for endpoint handler classes.
/// </summary>
public interface IEndpointHandler
{
}
/// <summary>
/// Base class for endpoint handlers with helper methods.
/// </summary>
public abstract class EndpointHandler : IEndpointHandler
{
/// <summary>Current request context (set by dispatcher).</summary>
public StellaRequestContext Context { get; internal set; } = null!;
/// <summary>Creates a 200 OK response with JSON body.</summary>
protected ResponsePayload Ok<T>(T value) => StellaResponseBuilder.Ok(value);
/// <summary>Creates a 200 OK response with no body.</summary>
protected ResponsePayload Ok() => StellaResponseBuilder.Ok();
/// <summary>Creates a 201 Created response.</summary>
protected ResponsePayload Created<T>(T value, string? location = null) =>
StellaResponseBuilder.Created(value, location);
/// <summary>Creates a 204 No Content response.</summary>
protected ResponsePayload NoContent() => StellaResponseBuilder.NoContent();
/// <summary>Creates a 400 Bad Request response.</summary>
protected ResponsePayload BadRequest(string message) =>
StellaResponseBuilder.BadRequest(message);
/// <summary>Creates a 401 Unauthorized response.</summary>
protected ResponsePayload Unauthorized(string? message = null) =>
StellaResponseBuilder.Unauthorized(message);
/// <summary>Creates a 403 Forbidden response.</summary>
protected ResponsePayload Forbidden(string? message = null) =>
StellaResponseBuilder.Forbidden(message);
/// <summary>Creates a 404 Not Found response.</summary>
protected ResponsePayload NotFound(string? message = null) =>
StellaResponseBuilder.NotFound(message);
/// <summary>Creates a response with custom status and body.</summary>
protected StellaResponseBuilder Response() => StellaResponseBuilder.Create();
}
```
---
## Request Dispatcher
```csharp
namespace StellaOps.Microservice;
public interface IRequestDispatcher
{
Task<ResponsePayload> DispatchAsync(RequestPayload request, CancellationToken cancellationToken);
}
public sealed class RequestDispatcher : IRequestDispatcher
{
private readonly IEndpointRegistry _registry;
private readonly IServiceProvider _serviceProvider;
private readonly ILogger<RequestDispatcher> _logger;
public RequestDispatcher(
IEndpointRegistry registry,
IServiceProvider serviceProvider,
ILogger<RequestDispatcher> logger)
{
_registry = registry;
_serviceProvider = serviceProvider;
_logger = logger;
}
public async Task<ResponsePayload> DispatchAsync(
RequestPayload request,
CancellationToken cancellationToken)
{
var path = request.Path.Split('?')[0];
var endpoint = _registry.FindEndpoint(request.Method, path);
if (endpoint == null)
{
_logger.LogDebug("No endpoint found for {Method} {Path}", request.Method, path);
return StellaResponseBuilder.NotFound($"No endpoint: {request.Method} {path}");
}
// Extract path parameters
var pathParams = ExtractPathParameters(path, endpoint.Descriptor.Path);
// Create request context
var context = new StellaRequestContext(request, pathParams);
// Create handler instance
using var scope = _serviceProvider.CreateScope();
var handler = scope.ServiceProvider.GetService(endpoint.HandlerType);
if (handler == null)
{
// Try to create without DI
handler = Activator.CreateInstance(endpoint.HandlerType);
}
if (handler == null)
{
_logger.LogError("Cannot create handler {Type}", endpoint.HandlerType);
return StellaResponseBuilder.InternalError("Handler instantiation failed");
}
// Set context on base handler
if (handler is EndpointHandler baseHandler)
{
baseHandler.Context = context;
}
try
{
// Invoke handler method
var result = endpoint.HandlerMethod.Invoke(handler, BuildMethodParameters(
endpoint.HandlerMethod, context, cancellationToken));
// Handle async methods
if (result is Task<ResponsePayload> taskResponse)
{
return await taskResponse;
}
else if (result is Task task)
{
await task;
// Method returned Task without result - assume OK
return StellaResponseBuilder.Ok();
}
else if (result is ResponsePayload response)
{
return response;
}
else if (result != null)
{
// Serialize result as JSON
return StellaResponseBuilder.Ok(result);
}
else
{
return StellaResponseBuilder.NoContent();
}
}
catch (TargetInvocationException ex) when (ex.InnerException != null)
{
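// Rethrow the handler's own exception without resetting its stack trace
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(ex.InnerException).Throw();
throw; // Unreachable; keeps the compiler's return-path analysis satisfied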
}
}
private Dictionary<string, string> ExtractPathParameters(string actualPath, string pattern)
{
var result = new Dictionary<string, string>();
var patternSegments = pattern.Split('/', StringSplitOptions.RemoveEmptyEntries);
var pathSegments = actualPath.Split('/', StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < patternSegments.Length && i < pathSegments.Length; i++)
{
var patternSeg = patternSegments[i];
if (patternSeg.StartsWith('{') && patternSeg.EndsWith('}'))
{
var paramName = patternSeg[1..^1];
result[paramName] = pathSegments[i];
}
}
return result;
}
private object?[] BuildMethodParameters(
MethodInfo method,
StellaRequestContext context,
CancellationToken cancellationToken)
{
var parameters = method.GetParameters();
var args = new object?[parameters.Length];
for (int i = 0; i < parameters.Length; i++)
{
var param = parameters[i];
var paramType = param.ParameterType;
if (paramType == typeof(StellaRequestContext))
{
args[i] = context;
}
else if (paramType == typeof(CancellationToken))
{
args[i] = cancellationToken;
}
else if (param.GetCustomAttribute<FromPathAttribute>() != null)
{
var value = context.GetPathParameter(param.Name ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromQueryAttribute>() != null)
{
var value = context.GetQueryParameter(param.Name ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromHeaderAttribute>() != null)
{
var headerName = param.GetCustomAttribute<FromHeaderAttribute>()?.Name ?? param.Name;
var value = context.GetHeader(headerName ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromClaimAttribute>() != null)
{
var claimName = param.GetCustomAttribute<FromClaimAttribute>()?.Name ?? param.Name;
var value = context.GetClaim(claimName ?? "");
args[i] = ConvertParameter(value, paramType);
}
else if (param.GetCustomAttribute<FromBodyAttribute>() != null || IsComplexType(paramType))
{
// Deserialize body
args[i] = context.ReadBodyAsJson(paramType);
}
else
{
args[i] = param.HasDefaultValue ? param.DefaultValue : null;
}
}
return args;
}
private static object? ConvertParameter(string? value, Type targetType)
{
if (value == null)
return targetType.IsValueType ? Activator.CreateInstance(targetType) : null;
if (targetType == typeof(string))
return value;
if (targetType == typeof(int) || targetType == typeof(int?))
return int.TryParse(value, out var i) ? i : null;
if (targetType == typeof(long) || targetType == typeof(long?))
return long.TryParse(value, out var l) ? l : null;
if (targetType == typeof(Guid) || targetType == typeof(Guid?))
return Guid.TryParse(value, out var g) ? g : null;
if (targetType == typeof(bool) || targetType == typeof(bool?))
return bool.TryParse(value, out var b) ? b : null;
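// Unwrap Nullable<T> first; Convert.ChangeType cannot target nullable types directly
return Convert.ChangeType(
value,
Nullable.GetUnderlyingType(targetType) ?? targetType,
System.Globalization.CultureInfo.InvariantCulture);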
}
private static bool IsComplexType(Type type)
{
return !type.IsPrimitive &&
type != typeof(string) &&
type != typeof(decimal) &&
type != typeof(Guid) &&
type != typeof(DateTime) &&
type != typeof(DateTimeOffset) &&
!type.IsEnum;
}
private object? ReadBodyAsJson(StellaRequestContext context, Type targetType)
{
if (!context.HasBody)
return null;
var json = context.RawBody;
return JsonSerializer.Deserialize(json, targetType, JsonDefaults.Options);
}
}
```
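
`ExtractPathParameters` above assumes the registry has already decided that the path fits the endpoint pattern. As a rough sketch of how matching and extraction could be combined in one pass, the helper below rejects paths whose literal segments or segment count differ from the template. `PathTemplate` and `TryMatch` are hypothetical names for illustration only and are not part of the spec.

```csharp
using System;
using System.Collections.Generic;

public static class PathTemplate
{
    // Hypothetical helper: returns false when the path does not fit the template,
    // otherwise fills 'values' with the captured {parameter} segments.
    public static bool TryMatch(string template, string path, out Dictionary<string, string> values)
    {
        values = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var templateSegments = template.Split('/', StringSplitOptions.RemoveEmptyEntries);
        var pathSegments = path.Split('/', StringSplitOptions.RemoveEmptyEntries);

        // A different segment count can never match (no wildcards in this sketch).
        if (templateSegments.Length != pathSegments.Length)
            return false;

        for (var i = 0; i < templateSegments.Length; i++)
        {
            var segment = templateSegments[i];
            if (segment.StartsWith('{') && segment.EndsWith('}'))
            {
                // Parameter segment: capture the URL-decoded value.
                values[segment[1..^1]] = Uri.UnescapeDataString(pathSegments[i]);
            }
            else if (!string.Equals(segment, pathSegments[i], StringComparison.OrdinalIgnoreCase))
            {
                // Literal segment mismatch.
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, `TryMatch("/billing/invoices/{id}", "/billing/invoices/42", out var v)` succeeds with `v["id"] == "42"`, while `/billing/invoices` alone is rejected because the segment counts differ.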
---
## Parameter Binding Attributes
```csharp
namespace StellaOps.Microservice;
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromPathAttribute : Attribute { }
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromQueryAttribute : Attribute { }
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromHeaderAttribute : Attribute
{
public string? Name { get; set; }
}
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromClaimAttribute : Attribute
{
public string? Name { get; set; }
}
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class FromBodyAttribute : Attribute { }
```
---
## Usage Example
```csharp
[StellaEndpoint(BasePath = "/billing")]
public class InvoiceHandler : EndpointHandler
{
private readonly InvoiceService _service;
public InvoiceHandler(InvoiceService service)
{
_service = service;
}
[StellaGet("invoices/{id}")]
public async Task<ResponsePayload> GetInvoice(
[FromPath] Guid id,
CancellationToken cancellationToken)
{
var invoice = await _service.GetByIdAsync(id, cancellationToken);
if (invoice == null)
return NotFound($"Invoice {id} not found");
return Ok(invoice);
}
[StellaPost("invoices")]
[StellaAuth(RequiredClaims = new[] { "billing:write" })]
public async Task<ResponsePayload> CreateInvoice(
[FromBody] CreateInvoiceRequest request,
[FromClaim(Name = "sub")] string userId,
CancellationToken cancellationToken)
{
var invoice = await _service.CreateAsync(request, userId, cancellationToken);
return Created(invoice, $"/billing/invoices/{invoice.Id}");
}
[StellaGet("invoices")]
public async Task<ResponsePayload> ListInvoices(
StellaRequestContext context,
CancellationToken cancellationToken)
{
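// Fall back to defaults when the query values are missing, malformed, or non-positive
var page = int.TryParse(context.GetQueryParameter("page"), out var p) && p > 0 ? p : 1;
var pageSize = int.TryParse(context.GetQueryParameter("pageSize"), out var s) && s > 0 ? s : 20;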
var invoices = await _service.ListAsync(page, pageSize, cancellationToken);
return Ok(invoices);
}
}
```
---
## Deliverables
1. `StellaOps.Microservice/StellaRequestContext.cs`
2. `StellaOps.Microservice/StellaResponseBuilder.cs`
3. `StellaOps.Microservice/IEndpointHandler.cs`
4. `StellaOps.Microservice/EndpointHandler.cs`
5. `StellaOps.Microservice/IRequestDispatcher.cs`
6. `StellaOps.Microservice/RequestDispatcher.cs`
7. `StellaOps.Microservice/ParameterBindingAttributes.cs`
8. Parameter binding tests
9. Response builder tests
10. Dispatcher routing tests
---
## Next Step
Proceed to [Step 22: Logging & Tracing](22-Step.md) to implement structured logging and distributed tracing.

# Step 22: Logging & Tracing
**Phase 6: Observability & Resilience**
**Estimated Complexity:** Medium
**Dependencies:** Step 19 (Microservice Host Builder)
---
## Overview
Structured logging and distributed tracing provide observability across the gateway and microservices. Correlation IDs flow from HTTP requests through the transport layer to microservice handlers, enabling end-to-end request tracking.
---
## Goals
1. Implement structured logging with consistent context
2. Propagate correlation IDs across all layers
3. Integrate with OpenTelemetry for distributed tracing
4. Support log level configuration per component
5. Provide sensitive data filtering
---
## Correlation Context
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Provides correlation context for request tracking.
/// </summary>
public static class CorrelationContext
{
// Nullable: no value is set outside a scope, and scopes restore a possibly-null previous value
private static readonly AsyncLocal<CorrelationData?> _current = new();
public static CorrelationData Current => _current.Value ?? CorrelationData.Empty;
public static IDisposable BeginScope(CorrelationData data)
{
var previous = _current.Value;
_current.Value = data;
return new CorrelationScope(previous);
}
public static IDisposable BeginScope(string correlationId, string? serviceName = null)
{
return BeginScope(new CorrelationData
{
CorrelationId = correlationId,
ServiceName = serviceName ?? Current.ServiceName,
ParentId = Current.CorrelationId
});
}
private sealed class CorrelationScope : IDisposable
{
private readonly CorrelationData? _previous;
public CorrelationScope(CorrelationData? previous)
{
_previous = previous;
}
public void Dispose()
{
_current.Value = _previous;
}
}
}
public sealed class CorrelationData
{
public static readonly CorrelationData Empty = new();
public string CorrelationId { get; init; } = "";
public string? ParentId { get; init; }
public string? ServiceName { get; init; }
public string? InstanceId { get; init; }
public string? Method { get; init; }
public string? Path { get; init; }
public string? UserId { get; init; }
public Dictionary<string, string> Extra { get; init; } = new();
}
```
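
To illustrate why an `AsyncLocal`-backed scope suits correlation tracking, here is a minimal standalone sketch of the same pattern. It shows the ambient value surviving an `await`, being overridden by a nested scope, and being restored on dispose. `AmbientId` and `AmbientIdDemo` are hypothetical names used only for this illustration; they stand in for `CorrelationContext` and are not part of the spec.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AmbientId
{
    private static readonly AsyncLocal<string?> Current = new();

    public static string Value => Current.Value ?? "";

    // Sets the ambient value and returns a handle that restores the previous one.
    public static IDisposable Begin(string id)
    {
        var previous = Current.Value;
        Current.Value = id;
        return new Restore(previous);
    }

    private sealed class Restore : IDisposable
    {
        private readonly string? _previous;
        public Restore(string? previous) => _previous = previous;
        public void Dispose() => Current.Value = _previous;
    }
}

public static class AmbientIdDemo
{
    public static async Task<string[]> RunAsync()
    {
        string afterAwait, inner, afterNested;
        using (AmbientId.Begin("req-1"))
        {
            await Task.Yield();              // the value flows across the await
            afterAwait = AmbientId.Value;
            using (AmbientId.Begin("req-1/child"))
            {
                inner = AmbientId.Value;     // nested scope overrides the value
            }
            afterNested = AmbientId.Value;   // restored when the nested scope is disposed
        }
        // Yields: "req-1", "req-1/child", "req-1", ""
        return new[] { afterAwait, inner, afterNested, AmbientId.Value };
    }
}
```

The design choice worth noting is that each scope remembers the value it replaced, so scopes nest correctly as long as they are disposed in reverse order of creation, which `using` guarantees.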
---
## Structured Log Enricher
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Enriches log entries with correlation context.
/// </summary>
public sealed class CorrelationLogEnricher : ILoggerProvider
{
private readonly ILoggerProvider _inner;
public CorrelationLogEnricher(ILoggerProvider inner)
{
_inner = inner;
}
public ILogger CreateLogger(string categoryName)
{
return new CorrelationLogger(_inner.CreateLogger(categoryName));
}
public void Dispose() => _inner.Dispose();
private sealed class CorrelationLogger : ILogger
{
private readonly ILogger _inner;
public CorrelationLogger(ILogger inner)
{
_inner = inner;
}
public IDisposable? BeginScope<TState>(TState state) where TState : notnull
{
return _inner.BeginScope(state);
}
public bool IsEnabled(LogLevel logLevel) => _inner.IsEnabled(logLevel);
public void Log<TState>(
LogLevel logLevel,
EventId eventId,
TState state,
Exception? exception,
Func<TState, Exception?, string> formatter)
{
var correlation = CorrelationContext.Current;
// Create enriched state
using var scope = _inner.BeginScope(new Dictionary<string, object?>
{
["CorrelationId"] = correlation.CorrelationId,
["ServiceName"] = correlation.ServiceName,
["InstanceId"] = correlation.InstanceId,
["Method"] = correlation.Method,
["Path"] = correlation.Path,
["UserId"] = correlation.UserId
});
_inner.Log(logLevel, eventId, state, exception, formatter);
}
}
}
```
---
## Gateway Request Logging
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Middleware for request/response logging with correlation.
/// </summary>
public sealed class RequestLoggingMiddleware
{
private readonly RequestDelegate _next;
private readonly ILogger<RequestLoggingMiddleware> _logger;
private readonly RequestLoggingConfig _config;
public RequestLoggingMiddleware(
RequestDelegate next,
ILogger<RequestLoggingMiddleware> logger,
IOptions<RequestLoggingConfig> config)
{
_next = next;
_logger = logger;
_config = config.Value;
}
public async Task InvokeAsync(HttpContext context)
{
var correlationId = context.Request.Headers["X-Correlation-ID"].FirstOrDefault()
?? context.TraceIdentifier;
// Set correlation context
using var scope = CorrelationContext.BeginScope(new CorrelationData
{
CorrelationId = correlationId,
ServiceName = "gateway",
Method = context.Request.Method,
Path = context.Request.Path
});
var sw = Stopwatch.StartNew();
try
{
// Log request
if (_config.LogRequests)
{
LogRequest(context, correlationId);
}
await _next(context);
sw.Stop();
// Log response
if (_config.LogResponses)
{
LogResponse(context, correlationId, sw.ElapsedMilliseconds);
}
}
catch (Exception ex)
{
sw.Stop();
LogError(context, correlationId, sw.ElapsedMilliseconds, ex);
throw;
}
}
private void LogRequest(HttpContext context, string correlationId)
{
var request = context.Request;
_logger.LogInformation(
"HTTP {Method} {Path} started | CorrelationId={CorrelationId} ClientIP={ClientIP} UserAgent={UserAgent}",
request.Method,
request.Path + request.QueryString,
correlationId,
context.Connection.RemoteIpAddress,
SanitizeHeader(request.Headers.UserAgent));
}
private void LogResponse(HttpContext context, string correlationId, long elapsedMs)
{
var level = context.Response.StatusCode >= 500 ? LogLevel.Error
: context.Response.StatusCode >= 400 ? LogLevel.Warning
: LogLevel.Information;
_logger.Log(
level,
"HTTP {Method} {Path} completed {StatusCode} in {ElapsedMs}ms | CorrelationId={CorrelationId}",
context.Request.Method,
context.Request.Path,
context.Response.StatusCode,
elapsedMs,
correlationId);
}
private void LogError(HttpContext context, string correlationId, long elapsedMs, Exception ex)
{
_logger.LogError(
ex,
"HTTP {Method} {Path} failed after {ElapsedMs}ms | CorrelationId={CorrelationId}",
context.Request.Method,
context.Request.Path,
elapsedMs,
correlationId);
}
private static string SanitizeHeader(StringValues value)
{
var str = value.ToString();
return str.Length > 200 ? str[..200] + "..." : str;
}
}
public class RequestLoggingConfig
{
public bool LogRequests { get; set; } = true;
public bool LogResponses { get; set; } = true;
public bool LogHeaders { get; set; } = false;
public bool LogBody { get; set; } = false;
public int MaxBodyLogLength { get; set; } = 1000;
public HashSet<string> SensitiveHeaders { get; set; } = new(StringComparer.OrdinalIgnoreCase)
{
"Authorization", "Cookie", "X-API-Key"
};
}
```
---
## OpenTelemetry Integration
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Configures OpenTelemetry tracing for the router.
/// </summary>
public static class OpenTelemetryExtensions
{
public static IServiceCollection AddStellaTracing(
this IServiceCollection services,
IConfiguration configuration)
{
var config = configuration.GetSection("Tracing").Get<TracingConfig>()
?? new TracingConfig();
services.AddOpenTelemetry()
.WithTracing(builder =>
{
builder
.SetResourceBuilder(ResourceBuilder.CreateDefault()
.AddService(config.ServiceName))
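// Honor Tracing:SampleRate; without a sampler the setting has no effect
.SetSampler(new ParentBasedSampler(new TraceIdRatioBasedSampler(config.SampleRate)))
.AddSource(StellaActivitySource.Name)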
.AddAspNetCoreInstrumentation(options =>
{
options.Filter = ctx =>
!ctx.Request.Path.StartsWithSegments("/health");
options.RecordException = true;
})
.AddHttpClientInstrumentation();
// Add exporter based on config
switch (config.Exporter.ToLowerInvariant())
{
case "jaeger":
builder.AddJaegerExporter(o =>
{
o.AgentHost = config.JaegerHost;
o.AgentPort = config.JaegerPort;
});
break;
case "otlp":
builder.AddOtlpExporter(o =>
{
o.Endpoint = new Uri(config.OtlpEndpoint);
});
break;
case "console":
builder.AddConsoleExporter();
break;
}
});
return services;
}
}
public static class StellaActivitySource
{
public const string Name = "StellaOps.Router";
private static readonly ActivitySource _source = new(Name);
public static Activity? StartActivity(string name, ActivityKind kind = ActivityKind.Internal)
{
return _source.StartActivity(name, kind);
}
/// <summary>Starts an activity as a child of an explicit (for example, remote) parent context.</summary>
public static Activity? StartActivity(string name, ActivityKind kind, ActivityContext parentContext)
{
return _source.StartActivity(name, kind, parentContext);
}
public static Activity? StartRequestActivity(string method, string path)
{
var activity = _source.StartActivity("HandleRequest", ActivityKind.Server);
activity?.SetTag("http.method", method);
activity?.SetTag("http.route", path);
return activity;
}
public static Activity? StartTransportActivity(string transport, string serviceName)
{
var activity = _source.StartActivity("Transport", ActivityKind.Client);
activity?.SetTag("transport.type", transport);
activity?.SetTag("service.name", serviceName);
return activity;
}
}
public class TracingConfig
{
public string ServiceName { get; set; } = "stella-router";
public string Exporter { get; set; } = "console";
public string JaegerHost { get; set; } = "localhost";
public int JaegerPort { get; set; } = 6831;
public string OtlpEndpoint { get; set; } = "http://localhost:4317";
public double SampleRate { get; set; } = 1.0;
}
```
---
## Transport Trace Propagation
```csharp
namespace StellaOps.Router.Transport;
/// <summary>
/// Propagates trace context through the transport layer.
/// </summary>
public sealed class TracePropagator
{
/// <summary>
/// Returns a copy of the request payload with the current trace context injected into its headers.
/// </summary>
public RequestPayload InjectContext(RequestPayload payload)
{
var activity = Activity.Current;
if (activity == null)
return payload;
var headers = new Dictionary<string, string>(payload.Headers);
// Inject W3C Trace Context
headers["traceparent"] = $"00-{activity.TraceId}-{activity.SpanId}-{(activity.Recorded ? "01" : "00")}";
if (!string.IsNullOrEmpty(activity.TraceStateString))
{
headers["tracestate"] = activity.TraceStateString;
}
// Return a new payload with the updated headers.
// Assumes RequestPayload is a record with an init-only Headers property.
return payload with { Headers = headers };
}
/// <summary>
/// Extracts trace context from request payload.
/// </summary>
public ActivityContext? ExtractContext(RequestPayload payload)
{
if (!payload.Headers.TryGetValue("traceparent", out var traceparent))
return null;
if (ActivityContext.TryParse(traceparent, payload.Headers.GetValueOrDefault("tracestate"), out var ctx))
{
return ctx;
}
return null;
}
}
```
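
As a quick sanity check on the header format used by `InjectContext`, the sketch below formats an `ActivityContext` as a W3C `traceparent` value and parses it back with `ActivityContext.TryParse`. `TraceParentDemo` is a hypothetical name for illustration; only `System.Diagnostics` types are used, so it does not depend on `RequestPayload`.

```csharp
using System.Diagnostics;

public static class TraceParentDemo
{
    // Formats a context as "00-<32 hex trace id>-<16 hex span id>-<2 hex flags>".
    public static string Format(ActivityContext context)
    {
        var flags = (context.TraceFlags & ActivityTraceFlags.Recorded) != 0 ? "01" : "00";
        return $"00-{context.TraceId.ToHexString()}-{context.SpanId.ToHexString()}-{flags}";
    }

    // Returns true when a formatted header parses back to the same identifiers.
    public static bool RoundTrips()
    {
        var original = new ActivityContext(
            ActivityTraceId.CreateRandom(),
            ActivitySpanId.CreateRandom(),
            ActivityTraceFlags.Recorded);

        var header = Format(original);

        return ActivityContext.TryParse(header, null, out var parsed)
            && parsed.TraceId == original.TraceId
            && parsed.SpanId == original.SpanId
            && parsed.TraceFlags == original.TraceFlags;
    }
}
```

A round-trip test of this shape is one plausible candidate for the "Trace context tests" deliverable.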
---
## Microservice Logging
```csharp
namespace StellaOps.Microservice;
/// <summary>
/// Request logging for microservice handlers.
/// </summary>
public sealed class HandlerLoggingDecorator : IRequestDispatcher
{
private readonly IRequestDispatcher _inner;
private readonly ILogger<HandlerLoggingDecorator> _logger;
private readonly TracePropagator _propagator;
public HandlerLoggingDecorator(
IRequestDispatcher inner,
ILogger<HandlerLoggingDecorator> logger,
TracePropagator propagator)
{
_inner = inner;
_logger = logger;
_propagator = propagator;
}
public async Task<ResponsePayload> DispatchAsync(
RequestPayload request,
CancellationToken cancellationToken)
{
// Extract and restore trace context
var parentContext = _propagator.ExtractContext(request);
using var activity = StellaActivitySource.StartActivity(
"HandleRequest",
ActivityKind.Server,
parentContext ?? default);
activity?.SetTag("http.method", request.Method);
activity?.SetTag("http.route", request.Path);
// Set correlation context
var correlationId = request.TraceId ?? activity?.TraceId.ToString() ?? Guid.NewGuid().ToString("N");
using var scope = CorrelationContext.BeginScope(new CorrelationData
{
CorrelationId = correlationId,
Method = request.Method,
Path = request.Path,
UserId = request.Claims.GetValueOrDefault("sub")
});
var sw = Stopwatch.StartNew();
try
{
_logger.LogDebug(
"Handling {Method} {Path} | CorrelationId={CorrelationId}",
request.Method, request.Path, correlationId);
var response = await _inner.DispatchAsync(request, cancellationToken);
sw.Stop();
activity?.SetTag("http.status_code", response.StatusCode);
var level = response.StatusCode >= 500 ? LogLevel.Error
: response.StatusCode >= 400 ? LogLevel.Warning
: LogLevel.Debug;
_logger.Log(
level,
"Completed {Method} {Path} with {StatusCode} in {ElapsedMs}ms | CorrelationId={CorrelationId}",
request.Method, request.Path, response.StatusCode, sw.ElapsedMilliseconds, correlationId);
return response;
}
catch (Exception ex)
{
sw.Stop();
activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
_logger.LogError(
ex,
"Failed {Method} {Path} after {ElapsedMs}ms | CorrelationId={CorrelationId}",
request.Method, request.Path, sw.ElapsedMilliseconds, correlationId);
throw;
}
}
}
```
---
## Sensitive Data Filtering
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Filters sensitive data from logs.
/// </summary>
public sealed class SensitiveDataFilter
{
private readonly HashSet<string> _sensitiveFields;
private readonly Regex _cardNumberRegex;
private readonly Regex _ssnRegex;
public SensitiveDataFilter(IOptions<SensitiveDataConfig> config)
{
var cfg = config.Value;
_sensitiveFields = new HashSet<string>(cfg.SensitiveFields, StringComparer.OrdinalIgnoreCase);
_cardNumberRegex = new Regex(@"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b");
_ssnRegex = new Regex(@"\b\d{3}-\d{2}-\d{4}\b");
}
public string Filter(string input)
{
var result = input;
// Mask card numbers
result = _cardNumberRegex.Replace(result, m =>
m.Value[..4] + "****" + m.Value[^4..]);
// Mask SSNs
result = _ssnRegex.Replace(result, "***-**-****");
return result;
}
public Dictionary<string, string> FilterHeaders(IReadOnlyDictionary<string, string> headers)
{
return headers.ToDictionary(
h => h.Key,
h => _sensitiveFields.Contains(h.Key) ? "[REDACTED]" : h.Value);
}
public object FilterObject(object obj)
{
// Deep filter for JSON objects
var json = JsonSerializer.Serialize(obj);
var filtered = FilterJsonProperties(json);
return JsonSerializer.Deserialize<object>(filtered)!;
}
private string FilterJsonProperties(string json)
{
// JsonDocument rents pooled buffers, so dispose it when done
using var doc = JsonDocument.Parse(json);
using var stream = new MemoryStream();
using var writer = new Utf8JsonWriter(stream);
FilterElement(doc.RootElement, writer);
writer.Flush();
return Encoding.UTF8.GetString(stream.ToArray());
}
private void FilterElement(JsonElement element, Utf8JsonWriter writer)
{
switch (element.ValueKind)
{
case JsonValueKind.Object:
writer.WriteStartObject();
foreach (var property in element.EnumerateObject())
{
writer.WritePropertyName(property.Name);
if (_sensitiveFields.Contains(property.Name))
{
writer.WriteStringValue("[REDACTED]");
}
else
{
FilterElement(property.Value, writer);
}
}
writer.WriteEndObject();
break;
case JsonValueKind.Array:
writer.WriteStartArray();
foreach (var item in element.EnumerateArray())
{
FilterElement(item, writer);
}
writer.WriteEndArray();
break;
default:
element.WriteTo(writer);
break;
}
}
}
public class SensitiveDataConfig
{
public HashSet<string> SensitiveFields { get; set; } = new(StringComparer.OrdinalIgnoreCase)
{
"password", "secret", "token", "apiKey", "api_key",
"authorization", "creditCard", "credit_card", "ssn",
"socialSecurityNumber", "social_security_number"
};
}
```
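
To make the masking behavior of `Filter` concrete, here is a small standalone sketch that applies the same two regular expressions to a sample string. `MaskDemo` is a hypothetical name for illustration, and the sample values are made up.

```csharp
using System.Text.RegularExpressions;

public static class MaskDemo
{
    // Same patterns as SensitiveDataFilter above.
    private static readonly Regex Card = new(@"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b");
    private static readonly Regex Ssn = new(@"\b\d{3}-\d{2}-\d{4}\b");

    public static string Mask(string input)
    {
        // Keep the first and last four digits of a card number, mask the middle.
        var result = Card.Replace(input, m => m.Value[..4] + "****" + m.Value[^4..]);
        // Replace SSNs entirely.
        return Ssn.Replace(result, "***-**-****");
    }
}
```

For example, `MaskDemo.Mask("card 4111 1111 1111 1111, ssn 123-45-6789")` returns `"card 4111****1111, ssn ***-**-****"`. Pattern-based masking like this is best-effort: identifiers written in other formats pass through unchanged, which is why the field-name based redaction in `FilterObject` is still needed.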
---
## YAML Configuration
```yaml
Logging:
LogLevel:
Default: "Information"
"StellaOps.Router": "Debug"
"Microsoft.AspNetCore": "Warning"
RequestLogging:
LogRequests: true
LogResponses: true
LogHeaders: false
LogBody: false
MaxBodyLogLength: 1000
SensitiveHeaders:
- Authorization
- Cookie
- X-API-Key
Tracing:
ServiceName: "stella-router"
Exporter: "otlp"
OtlpEndpoint: "http://otel-collector:4317"
SampleRate: 1.0
SensitiveData:
SensitiveFields:
- password
- secret
- token
- apiKey
- creditCard
- ssn
```
---
## Deliverables
1. `StellaOps.Router.Common/CorrelationContext.cs`
2. `StellaOps.Router.Common/CorrelationLogEnricher.cs`
3. `StellaOps.Router.Gateway/RequestLoggingMiddleware.cs`
4. `StellaOps.Router.Common/OpenTelemetryExtensions.cs`
5. `StellaOps.Router.Common/StellaActivitySource.cs`
6. `StellaOps.Router.Transport/TracePropagator.cs`
7. `StellaOps.Microservice/HandlerLoggingDecorator.cs`
8. `StellaOps.Router.Common/SensitiveDataFilter.cs`
9. Correlation propagation tests
10. Trace context tests
---
## Next Step
Proceed to [Step 23: Metrics & Health Checks](23-Step.md) to implement observability metrics.

# Step 23: Metrics & Health Checks
**Phase 6: Observability & Resilience**
**Estimated Complexity:** Medium
**Dependencies:** Step 22 (Logging & Tracing)
---
## Overview
Metrics and health checks provide operational visibility into the router and microservices. Prometheus-compatible metrics expose request rates, latencies, error rates, and connection pool status. Health checks enable load balancers and orchestrators to route traffic appropriately.
---
## Goals
1. Expose Prometheus-compatible metrics
2. Track request/response metrics per endpoint
3. Monitor transport layer health
4. Provide liveness and readiness probes
5. Support custom health check integrations
---
## Metrics Configuration
```csharp
namespace StellaOps.Router.Common;
public class MetricsConfig
{
/// <summary>Whether to enable metrics collection.</summary>
public bool Enabled { get; set; } = true;
/// <summary>Path for metrics endpoint.</summary>
public string Path { get; set; } = "/metrics";
/// <summary>Histogram buckets for request duration.</summary>
public double[] DurationBuckets { get; set; } = new[]
{
0.001, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 10.0
};
/// <summary>Labels to include in metrics.</summary>
public HashSet<string> IncludeLabels { get; set; } = new()
{
"method", "path", "status_code", "service"
};
/// <summary>Whether to include path in labels (may cause high cardinality).</summary>
public bool IncludePathLabel { get; set; } = false;
/// <summary>Maximum unique path labels before aggregating.</summary>
public int MaxPathCardinality { get; set; } = 100;
}
```
---
## Core Metrics
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Central metrics registry for Stella Router.
/// </summary>
public static class StellaMetrics
{
// Declared first: static fields initialize in textual order, and every instrument below uses this meter.
private static readonly Meter Meter = new("StellaOps.Router", "1.0.0");
// Request metrics
public static readonly Counter<long> RequestsTotal = Meter.CreateCounter<long>(
"stella_requests_total",
description: "Total number of requests processed");
public static readonly Histogram<double> RequestDuration = Meter.CreateHistogram<double>(
"stella_request_duration_seconds",
unit: "s",
description: "Request processing duration in seconds");
public static readonly Counter<long> RequestErrors = Meter.CreateCounter<long>(
"stella_request_errors_total",
description: "Total number of request errors");
// Transport metrics
public static readonly UpDownCounter<int> ActiveConnections = Meter.CreateUpDownCounter<int>(
"stella_active_connections",
description: "Number of active transport connections");
public static readonly Counter<long> ConnectionsTotal = Meter.CreateCounter<long>(
"stella_connections_total",
description: "Total number of transport connections");
public static readonly Counter<long> FramesSent = Meter.CreateCounter<long>(
"stella_frames_sent_total",
description: "Total number of frames sent");
public static readonly Counter<long> FramesReceived = Meter.CreateCounter<long>(
"stella_frames_received_total",
description: "Total number of frames received");
public static readonly Counter<long> BytesSent = Meter.CreateCounter<long>(
"stella_bytes_sent_total",
unit: "By",
description: "Total bytes sent");
public static readonly Counter<long> BytesReceived = Meter.CreateCounter<long>(
"stella_bytes_received_total",
unit: "By",
description: "Total bytes received");
// Rate limiting metrics
public static readonly Counter<long> RateLimitHits = Meter.CreateCounter<long>(
"stella_rate_limit_hits_total",
description: "Number of requests that hit rate limits");
public static readonly Gauge<int> RateLimitBuckets = Meter.CreateGauge<int>(
"stella_rate_limit_buckets",
description: "Number of active rate limit buckets");
// Auth metrics
public static readonly Counter<long> AuthSuccesses = Meter.CreateCounter<long>(
"stella_auth_success_total",
description: "Number of successful authentications");
public static readonly Counter<long> AuthFailures = Meter.CreateCounter<long>(
"stella_auth_failures_total",
description: "Number of failed authentications");
// Circuit breaker metrics
public static readonly Gauge<int> CircuitBreakerState = Meter.CreateGauge<int>(
"stella_circuit_breaker_state",
description: "Circuit breaker state (0=closed, 1=half-open, 2=open)");
}
```
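
For tests, instruments created through `System.Diagnostics.Metrics` can be observed in-process with a `MeterListener`, without a Prometheus exporter. The following is a minimal sketch under that assumption; the meter name `Demo.Router` and instrument name `demo_requests_total` are made up for illustration and are not part of the spec.

```csharp
using System.Diagnostics;
using System.Diagnostics.Metrics;

public static class MeterDemo
{
    // Records two measurements on a counter and returns the total seen by the listener.
    public static long CountRequests()
    {
        using var meter = new Meter("Demo.Router");
        var requests = meter.CreateCounter<long>("demo_requests_total");

        long total = 0;
        using var listener = new MeterListener();

        // Subscribe only to instruments that belong to the demo meter.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == "Demo.Router")
                l.EnableMeasurementEvents(instrument);
        };

        // Invoked synchronously for every recorded long measurement.
        listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => total += value);
        listener.Start();

        requests.Add(1, new TagList { { "method", "GET" } });
        requests.Add(2, new TagList { { "method", "POST" } });

        return total; // 1 + 2
    }
}
```

The same listener approach could back assertions that a middleware increments the request counter exactly once per request.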
---
## Request Metrics Middleware
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Middleware to collect request metrics.
/// </summary>
public sealed class MetricsMiddleware
{
private readonly RequestDelegate _next;
private readonly MetricsConfig _config;
private readonly PathNormalizer _pathNormalizer;
public MetricsMiddleware(
RequestDelegate next,
IOptions<MetricsConfig> config)
{
_next = next;
_config = config.Value;
_pathNormalizer = new PathNormalizer(_config.MaxPathCardinality);
}
public async Task InvokeAsync(HttpContext context)
{
if (!_config.Enabled)
{
await _next(context);
return;
}
var sw = Stopwatch.StartNew();
var method = context.Request.Method;
var path = _config.IncludePathLabel
? _pathNormalizer.Normalize(context.Request.Path)
: "aggregated";
try
{
await _next(context);
}
finally
{
sw.Stop();
var tags = new TagList
{
{ "method", method },
{ "status_code", context.Response.StatusCode.ToString() }
};
if (_config.IncludePathLabel)
{
tags.Add("path", path);
}
StellaMetrics.RequestsTotal.Add(1, tags);
StellaMetrics.RequestDuration.Record(sw.Elapsed.TotalSeconds, tags);
if (context.Response.StatusCode >= 400)
{
StellaMetrics.RequestErrors.Add(1, tags);
}
}
}
}
/// <summary>
/// Normalizes paths to prevent high cardinality.
/// </summary>
internal sealed class PathNormalizer
{
private readonly int _maxCardinality;
private readonly ConcurrentDictionary<string, string> _pathCache = new();
private int _uniquePaths;
public PathNormalizer(int maxCardinality)
{
_maxCardinality = maxCardinality;
}
public string Normalize(string path)
{
// Replace path parameters with placeholders
var segments = path.Split('/');
for (int i = 0; i < segments.Length; i++)
{
if (Guid.TryParse(segments[i], out _) ||
int.TryParse(segments[i], out _) ||
segments[i].Length > 20)
{
segments[i] = "{id}";
}
}
var normalized = string.Join("/", segments);
// Track cardinality on the normalized form, so /users/1 and /users/2 share one slot
if (_pathCache.ContainsKey(normalized))
return normalized;
if (Volatile.Read(ref _uniquePaths) >= _maxCardinality)
return "other";
// Under contention the cap may be exceeded by a few entries; acceptable for metric labels
if (_pathCache.TryAdd(normalized, normalized))
Interlocked.Increment(ref _uniquePaths);
return normalized;
}
}
```
---
## Transport Metrics
```csharp
namespace StellaOps.Router.Transport;
/// <summary>
/// Collects metrics for transport layer operations.
/// </summary>
public sealed class TransportMetricsCollector
{
public void RecordConnectionOpened(string transport, string serviceName)
{
var tags = new TagList
{
{ "transport", transport },
{ "service", serviceName }
};
StellaMetrics.ConnectionsTotal.Add(1, tags);
StellaMetrics.ActiveConnections.Add(1, tags);
}
public void RecordConnectionClosed(string transport, string serviceName)
{
var tags = new TagList
{
{ "transport", transport },
{ "service", serviceName }
};
StellaMetrics.ActiveConnections.Add(-1, tags);
}
public void RecordFrameSent(string transport, FrameType type, int bytes)
{
var tags = new TagList
{
{ "transport", transport },
{ "frame_type", type.ToString() }
};
StellaMetrics.FramesSent.Add(1, tags);
StellaMetrics.BytesSent.Add(bytes, new TagList { { "transport", transport } });
}
public void RecordFrameReceived(string transport, FrameType type, int bytes)
{
var tags = new TagList
{
{ "transport", transport },
{ "frame_type", type.ToString() }
};
StellaMetrics.FramesReceived.Add(1, tags);
StellaMetrics.BytesReceived.Add(bytes, new TagList { { "transport", transport } });
}
}
```
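The collector above relies on `StellaMetrics` instruments that are defined elsewhere. As a rough sketch of how a monotonic counter and an up/down counter behave for connection open/close events, the example below declares two stand-in instruments and observes them with a `MeterListener`. The meter name `Demo.Router` and the instrument names are assumptions made for this example, and `UpDownCounter<T>` assumes .NET 7 or later. After two opens and one close, the total stays at 2 while the active gauge drops to 1.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical stand-ins for the StellaMetrics instruments used above.
public static class DemoMetrics
{
    public static readonly Meter Meter = new("Demo.Router");
    public static readonly Counter<long> ConnectionsTotal =
        Meter.CreateCounter<long>("demo_connections_total");
    public static readonly UpDownCounter<long> ActiveConnections =
        Meter.CreateUpDownCounter<long>("demo_active_connections");
}

public static class ConnectionMetricsDemo
{
    // Opens two connections, closes one, and returns the summed measurements per instrument.
    public static Dictionary<string, long> Run()
    {
        var totals = new Dictionary<string, long>();
        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == "Demo.Router")
                l.EnableMeasurementEvents(instrument);
        };
        listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
        {
            totals[instrument.Name] = totals.GetValueOrDefault(instrument.Name) + value;
        });
        listener.Start();

        var tags = new TagList { { "transport", "tcp" }, { "service", "billing" } };

        // Two connections opened.
        DemoMetrics.ConnectionsTotal.Add(1, tags);
        DemoMetrics.ActiveConnections.Add(1, tags);
        DemoMetrics.ConnectionsTotal.Add(1, tags);
        DemoMetrics.ActiveConnections.Add(1, tags);

        // One connection closed: only the up/down counter moves.
        DemoMetrics.ActiveConnections.Add(-1, tags);

        return totals;
    }
}
```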
---
## Health Check System
```csharp
namespace StellaOps.Router.Common;
/// <summary>
/// Health check result.
/// </summary>
public sealed record HealthCheckResult // record: HealthCheckService copies results using a `with` expression
{
public HealthStatus Status { get; init; }
public string? Description { get; init; }
public TimeSpan Duration { get; init; }
public IReadOnlyDictionary<string, object>? Data { get; init; }
public Exception? Exception { get; init; }
}
public enum HealthStatus
{
Healthy,
Degraded,
Unhealthy
}
/// <summary>
/// Health check interface.
/// </summary>
public interface IHealthCheck
{
string Name { get; }
Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken);
}
/// <summary>
/// Aggregates multiple health checks.
/// </summary>
public sealed class HealthCheckService
{
private readonly IEnumerable<IHealthCheck> _checks;
private readonly ILogger<HealthCheckService> _logger;
public HealthCheckService(
IEnumerable<IHealthCheck> checks,
ILogger<HealthCheckService> logger)
{
_checks = checks;
_logger = logger;
}
public async Task<HealthReport> CheckHealthAsync(CancellationToken cancellationToken)
{
var results = new Dictionary<string, HealthCheckResult>();
var overallStatus = HealthStatus.Healthy;
foreach (var check in _checks)
{
var sw = Stopwatch.StartNew();
try
{
var result = await check.CheckAsync(cancellationToken);
result = result with { Duration = sw.Elapsed };
results[check.Name] = result;
if (result.Status > overallStatus)
{
overallStatus = result.Status;
}
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Health check {Name} failed", check.Name);
results[check.Name] = new HealthCheckResult
{
Status = HealthStatus.Unhealthy,
Description = ex.Message,
Duration = sw.Elapsed,
Exception = ex
};
overallStatus = HealthStatus.Unhealthy;
}
}
return new HealthReport
{
Status = overallStatus,
Checks = results,
TotalDuration = results.Values.Sum(r => r.Duration.TotalMilliseconds)
};
}
}
public sealed class HealthReport
{
public HealthStatus Status { get; init; }
public IReadOnlyDictionary<string, HealthCheckResult> Checks { get; init; } = new Dictionary<string, HealthCheckResult>();
public double TotalDuration { get; init; }
}
```
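The aggregation rule in `CheckHealthAsync` is "the overall status is the worst individual status", which works because the enum members are declared in increasing order of severity. A minimal sketch of that rule in isolation follows; `DemoHealthStatus` and `HealthAggregation` are hypothetical names chosen to avoid clashing with the types above, and treating an empty set of checks as healthy is an assumption that matches the initial value used by the service.

```csharp
using System.Collections.Generic;
using System.Linq;

// Declared from least to most severe so that Max() yields the worst status.
public enum DemoHealthStatus { Healthy, Degraded, Unhealthy }

public static class HealthAggregation
{
    // The overall status is the worst individual status; an empty set counts as healthy.
    public static DemoHealthStatus Worst(IEnumerable<DemoHealthStatus> statuses) =>
        statuses.DefaultIfEmpty(DemoHealthStatus.Healthy).Max();
}
```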
---
## Built-in Health Checks
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Checks that at least one transport connection is active.
/// </summary>
public sealed class TransportHealthCheck : IHealthCheck
{
private readonly IGlobalRoutingState _routingState;
public string Name => "transport";
public TransportHealthCheck(IGlobalRoutingState routingState)
{
_routingState = routingState;
}
public Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken)
{
var connections = _routingState.GetAllConnections();
var activeCount = connections.Count(c => c.State == ConnectionState.Connected);
if (activeCount == 0)
{
return Task.FromResult(new HealthCheckResult
{
Status = HealthStatus.Unhealthy,
Description = "No active transport connections",
Data = new Dictionary<string, object> { ["connections"] = 0 }
});
}
return Task.FromResult(new HealthCheckResult
{
Status = HealthStatus.Healthy,
Description = $"{activeCount} active connections",
Data = new Dictionary<string, object> { ["connections"] = activeCount }
});
}
}
/// <summary>
/// Checks Authority service connectivity.
/// </summary>
public sealed class AuthorityHealthCheck : IHealthCheck
{
private readonly IAuthorityClient _authority;
private readonly TimeSpan _timeout;
public string Name => "authority";
public AuthorityHealthCheck(
IAuthorityClient authority,
IOptions<AuthorityConfig> config)
{
_authority = authority;
_timeout = config.Value.HealthCheckTimeout;
}
public async Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken)
{
try
{
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(_timeout);
var isHealthy = await _authority.CheckHealthAsync(cts.Token);
return new HealthCheckResult
{
Status = isHealthy ? HealthStatus.Healthy : HealthStatus.Degraded,
Description = isHealthy ? "Authority is responsive" : "Authority returned unhealthy"
};
}
catch (Exception ex)
{
return new HealthCheckResult
{
Status = HealthStatus.Degraded, // Degraded, not unhealthy - gateway can still work
Description = $"Authority unreachable: {ex.Message}",
Exception = ex
};
}
}
}
/// <summary>
/// Checks rate limiter backend connectivity.
/// </summary>
public sealed class RateLimiterHealthCheck : IHealthCheck
{
private readonly IRateLimiter _rateLimiter;
public string Name => "rate_limiter";
public RateLimiterHealthCheck(IRateLimiter rateLimiter)
{
_rateLimiter = rateLimiter;
}
public async Task<HealthCheckResult> CheckAsync(CancellationToken cancellationToken)
{
try
{
// Try a simple operation
await _rateLimiter.CheckLimitAsync(
new RateLimitContext { Key = "__health_check__", Tier = RateLimitTier.Free },
cancellationToken);
return new HealthCheckResult
{
Status = HealthStatus.Healthy,
Description = "Rate limiter is responsive"
};
}
catch (Exception ex)
{
return new HealthCheckResult
{
Status = HealthStatus.Degraded,
Description = $"Rate limiter error: {ex.Message}",
Exception = ex
};
}
}
}
```
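The Authority check above bounds its probe with a linked `CancellationTokenSource`. The same pattern can be factored into a small helper, sketched below under the assumption that a timed-out probe should be reported as a failed probe rather than as an exception. `ProbeTimeout.TryProbeAsync` is a hypothetical name. The exception filter distinguishes a fired deadline from a cancellation requested by the caller, which is still propagated.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ProbeTimeout
{
    // Runs a probe with a deadline; returns false on timeout instead of throwing.
    public static async Task<bool> TryProbeAsync(
        Func<CancellationToken, Task> probe,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(timeout);
        try
        {
            await probe(cts.Token);
            return true;
        }
        catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
        {
            // The deadline fired, not the caller: report a failed probe.
            return false;
        }
    }
}
```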
---
## Health Endpoints
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Health check endpoints.
/// </summary>
public static class HealthEndpoints
{
public static IEndpointRouteBuilder MapHealthEndpoints(
this IEndpointRouteBuilder endpoints,
string basePath = "/health")
{
endpoints.MapGet(basePath + "/live", LivenessCheck);
endpoints.MapGet(basePath + "/ready", ReadinessCheck);
endpoints.MapGet(basePath, DetailedHealthCheck);
return endpoints;
}
/// <summary>
/// Liveness probe - is the process running?
/// </summary>
private static IResult LivenessCheck()
{
return Results.Ok(new { status = "alive" });
}
/// <summary>
/// Readiness probe - can the service accept traffic?
/// </summary>
private static async Task<IResult> ReadinessCheck(
HealthCheckService healthService,
CancellationToken cancellationToken)
{
var report = await healthService.CheckHealthAsync(cancellationToken);
return report.Status == HealthStatus.Unhealthy
? Results.Json(new
{
status = "not_ready",
checks = report.Checks.ToDictionary(c => c.Key, c => c.Value.Status.ToString())
}, statusCode: 503)
: Results.Ok(new { status = "ready" });
}
/// <summary>
/// Detailed health report.
/// </summary>
private static async Task<IResult> DetailedHealthCheck(
HealthCheckService healthService,
CancellationToken cancellationToken)
{
var report = await healthService.CheckHealthAsync(cancellationToken);
var response = new
{
status = report.Status.ToString().ToLower(),
totalDuration = $"{report.TotalDuration:F2}ms",
checks = report.Checks.ToDictionary(c => c.Key, c => new
{
status = c.Value.Status.ToString().ToLower(),
description = c.Value.Description,
duration = $"{c.Value.Duration.TotalMilliseconds:F2}ms",
data = c.Value.Data
})
};
var statusCode = report.Status switch
{
HealthStatus.Healthy => 200,
HealthStatus.Degraded => 200, // Still return 200 for degraded
HealthStatus.Unhealthy => 503,
_ => 200
};
return Results.Json(response, statusCode: statusCode);
}
}
```
---
## Prometheus Metrics Endpoint
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Exposes metrics in Prometheus format.
/// </summary>
public static class PrometheusMetricsEndpoint
{
public static void Map(IEndpointRouteBuilder endpoints, string path = "/metrics")
{
endpoints.MapGet(path, async (HttpContext context) =>
{
var exporter = context.RequestServices.GetRequiredService<PrometheusExporter>();
var metrics = await exporter.ExportAsync();
context.Response.ContentType = "text/plain; version=0.0.4";
await context.Response.WriteAsync(metrics);
});
}
}
public sealed class PrometheusExporter
{
private readonly MeterProvider _meterProvider;
public PrometheusExporter(MeterProvider meterProvider)
{
_meterProvider = meterProvider;
}
public Task<string> ExportAsync()
{
// Use OpenTelemetry's Prometheus exporter
// This is a simplified example
var sb = new StringBuilder();
// Export would iterate over all registered metrics
// Real implementation uses OpenTelemetry.Exporter.Prometheus
return Task.FromResult(sb.ToString());
}
}
```
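`ExportAsync` above is a stub, and a real deployment would most likely delegate to the OpenTelemetry Prometheus exporter. Purely to show what the endpoint is expected to return, the sketch below renders a single counter sample in the Prometheus text exposition format (a `# HELP` line, a `# TYPE` line, then `name{labels} value`). `PrometheusText.RenderCounter` and the metric name used in the test are hypothetical and not part of the step.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public static class PrometheusText
{
    // Renders one counter sample in the Prometheus text exposition format.
    public static string RenderCounter(
        string name, string help, IReadOnlyDictionary<string, string> labels, double value)
    {
        var sb = new StringBuilder();
        sb.Append("# HELP ").Append(name).Append(' ').Append(help).Append('\n');
        sb.Append("# TYPE ").Append(name).Append(" counter\n");
        sb.Append(name);
        if (labels.Count > 0)
        {
            var pairs = labels.Select(kv => $"{kv.Key}=\"{Escape(kv.Value)}\"");
            sb.Append('{').Append(string.Join(",", pairs)).Append('}');
        }
        // Invariant culture keeps the decimal separator a dot regardless of server locale.
        sb.Append(' ').Append(value.ToString(CultureInfo.InvariantCulture)).Append('\n');
        return sb.ToString();
    }

    // Label values must escape backslash, double quote, and line feed.
    private static string Escape(string value) =>
        value.Replace("\\", "\\\\").Replace("\"", "\\\"").Replace("\n", "\\n");
}
```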
---
## Service Registration
```csharp
namespace StellaOps.Router.Gateway;
public static class MetricsExtensions
{
public static IServiceCollection AddStellaMetrics(
this IServiceCollection services,
IConfiguration configuration)
{
services.Configure<MetricsConfig>(configuration.GetSection("Metrics"));
services.AddOpenTelemetry()
.WithMetrics(builder =>
{
builder
.AddMeter("StellaOps.Router")
.AddAspNetCoreInstrumentation()
.AddPrometheusExporter();
});
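// PrometheusMetricsEndpoint resolves PrometheusExporter from DI, so it has to be registered.
services.AddSingleton<PrometheusExporter>();
return services;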
}
public static IServiceCollection AddStellaHealthChecks(
this IServiceCollection services)
{
services.AddSingleton<HealthCheckService>();
services.AddSingleton<IHealthCheck, TransportHealthCheck>();
services.AddSingleton<IHealthCheck, AuthorityHealthCheck>();
services.AddSingleton<IHealthCheck, RateLimiterHealthCheck>();
return services;
}
}
```
---
## YAML Configuration
```yaml
Metrics:
  Enabled: true
  Path: "/metrics"
  IncludePathLabel: false
  MaxPathCardinality: 100
  DurationBuckets:
    - 0.005
    - 0.01
    - 0.025
    - 0.05
    - 0.1
    - 0.25
    - 0.5
    - 1
    - 2.5
    - 5
    - 10
HealthChecks:
  Enabled: true
  Path: "/health"
  CacheDuration: "00:00:05"
```
---
## Deliverables
1. `StellaOps.Router.Common/StellaMetrics.cs`
2. `StellaOps.Router.Gateway/MetricsMiddleware.cs`
3. `StellaOps.Router.Transport/TransportMetricsCollector.cs`
4. `StellaOps.Router.Common/HealthCheckService.cs`
5. `StellaOps.Router.Gateway/TransportHealthCheck.cs`
6. `StellaOps.Router.Gateway/AuthorityHealthCheck.cs`
7. `StellaOps.Router.Gateway/HealthEndpoints.cs`
8. `StellaOps.Router.Gateway/PrometheusMetricsEndpoint.cs`
9. Metrics collection tests
10. Health check tests
---
## Next Step
Proceed to [Step 24: Circuit Breaker & Retry Policies](24-Step.md) to implement resilience patterns.

@@ -1,856 +0,0 @@
# Step 24: Circuit Breaker & Retry Policies
**Phase 6: Observability & Resilience**
**Estimated Complexity:** High
**Dependencies:** Step 23 (Metrics & Health Checks)
---
## Overview
Circuit breakers and retry policies protect the system from cascading failures and transient errors. The circuit breaker prevents requests to failing services, while retry policies automatically retry failed requests with exponential backoff.
---
## Goals
1. Implement circuit breaker pattern for service protection
2. Support configurable retry policies
3. Enable per-service and per-endpoint policies
4. Integrate with metrics for observability
5. Provide graceful degradation strategies
---
## Circuit Breaker Configuration
```csharp
namespace StellaOps.Router.Resilience;
public class CircuitBreakerConfig
{
/// <summary>Number of failures before opening circuit.</summary>
public int FailureThreshold { get; set; } = 5;
/// <summary>Time window for counting failures.</summary>
public TimeSpan SamplingDuration { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>How long to stay open before testing.</summary>
public TimeSpan BreakDuration { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>Minimum throughput before circuit can trip.</summary>
public int MinimumThroughput { get; set; } = 10;
/// <summary>Failure ratio to trip circuit (0.0 to 1.0).</summary>
public double FailureRatioThreshold { get; set; } = 0.5;
/// <summary>HTTP status codes considered failures.</summary>
public HashSet<int> FailureStatusCodes { get; set; } = new()
{
500, 502, 503, 504
};
/// <summary>Exception types considered failures.</summary>
public HashSet<Type> FailureExceptions { get; set; } = new()
{
typeof(TimeoutException),
typeof(TaskCanceledException),
typeof(HttpRequestException)
};
}
```
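To see how `MinimumThroughput`, `FailureRatioThreshold`, and `FailureThreshold` interact, the sketch below restates the trip rule as a pure function, with defaults copied from `CircuitBreakerConfig`. `TripDecision.ShouldTrip` is a hypothetical helper for illustration. One consequence worth checking against intent: because the ratio and the absolute count are combined with a logical OR, a busy service with 995 successes and 5 failures in the window trips the circuit even though its failure ratio is only 0.5%.

```csharp
public static class TripDecision
{
    // Ignore low-traffic windows, then trip on either the failure ratio or the absolute count.
    public static bool ShouldTrip(
        int successes,
        int failures,
        int minimumThroughput = 10,
        double failureRatioThreshold = 0.5,
        int failureThreshold = 5)
    {
        var total = successes + failures;
        if (total < minimumThroughput)
            return false;

        var ratio = (double)failures / total;
        return ratio >= failureRatioThreshold || failures >= failureThreshold;
    }
}
```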
---
## Circuit Breaker Implementation
```csharp
namespace StellaOps.Router.Resilience;
public enum CircuitState
{
Closed = 0, // Normal operation
Open = 2, // Blocking requests
HalfOpen = 1 // Testing with limited requests
}
/// <summary>
/// Circuit breaker for a single service or endpoint.
/// </summary>
public sealed class CircuitBreaker
{
private readonly CircuitBreakerConfig _config;
private readonly ILogger<CircuitBreaker> _logger;
private readonly SlidingWindow _window;
private volatile CircuitState _state = CircuitState.Closed; // volatile: read by request threads without a lock
private DateTimeOffset _openedAt;
private readonly SemaphoreSlim _halfOpenLock = new(1, 1);
public string Name { get; }
public CircuitState State => _state;
public DateTimeOffset LastStateChange { get; private set; }
public CircuitBreaker(
string name,
CircuitBreakerConfig config,
ILogger<CircuitBreaker> logger)
{
Name = name;
_config = config;
_logger = logger;
_window = new SlidingWindow(config.SamplingDuration);
LastStateChange = DateTimeOffset.UtcNow;
}
/// <summary>
/// Checks if request is allowed through the circuit.
/// </summary>
public async Task<bool> AllowRequestAsync(CancellationToken cancellationToken)
{
switch (_state)
{
case CircuitState.Closed:
return true;
case CircuitState.Open:
if (DateTimeOffset.UtcNow - _openedAt >= _config.BreakDuration)
{
await TryTransitionToHalfOpenAsync();
}
return _state == CircuitState.HalfOpen;
case CircuitState.HalfOpen:
// Only allow one request at a time in half-open
return await _halfOpenLock.WaitAsync(0, cancellationToken);
default:
return false;
}
}
/// <summary>
/// Records a successful request.
/// </summary>
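public void RecordSuccess()
{
_window.RecordSuccess();
if (_state == CircuitState.HalfOpen)
{
TransitionToClosed();
ReleaseHalfOpenSlot();
}
}
/// <summary>
/// Records a failed request.
/// </summary>
public void RecordFailure()
{
_window.RecordFailure();
if (_state == CircuitState.HalfOpen)
{
TransitionToOpen();
ReleaseHalfOpenSlot();
}
else if (_state == CircuitState.Closed)
{
CheckThreshold();
}
}
// A request admitted while the circuit was closed can complete while it is half-open
// without holding the probe slot; an unconditional Release() would then throw
// SemaphoreFullException. This guard is a mitigation, not a full ownership check.
private void ReleaseHalfOpenSlot()
{
if (_halfOpenLock.CurrentCount == 0)
{
try { _halfOpenLock.Release(); }
catch (SemaphoreFullException) { /* another caller released first */ }
}
}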
private void CheckThreshold()
{
var stats = _window.GetStats();
if (stats.TotalRequests < _config.MinimumThroughput)
return;
var failureRatio = (double)stats.Failures / stats.TotalRequests;
if (failureRatio >= _config.FailureRatioThreshold ||
stats.Failures >= _config.FailureThreshold)
{
TransitionToOpen();
}
}
private void TransitionToOpen()
{
_state = CircuitState.Open;
_openedAt = DateTimeOffset.UtcNow;
LastStateChange = _openedAt;
var stats = _window.GetStats(); // one snapshot so the logged count and ratio agree
_logger.LogWarning(
"Circuit {Name} opened. Failures: {Failures}, Ratio: {Ratio:P2}",
Name, stats.Failures,
(double)stats.Failures / Math.Max(1, stats.TotalRequests));
StellaMetrics.CircuitBreakerState.Record((int)CircuitState.Open,
new TagList { { "circuit", Name } });
}
private async Task TryTransitionToHalfOpenAsync()
{
if (_state != CircuitState.Open)
return;
if (await _halfOpenLock.WaitAsync(0))
{
_state = CircuitState.HalfOpen;
LastStateChange = DateTimeOffset.UtcNow;
_window.Reset();
_logger.LogInformation("Circuit {Name} transitioning to half-open", Name);
StellaMetrics.CircuitBreakerState.Record((int)CircuitState.HalfOpen,
new TagList { { "circuit", Name } });
}
}
private void TransitionToClosed()
{
_state = CircuitState.Closed;
LastStateChange = DateTimeOffset.UtcNow;
_window.Reset();
_logger.LogInformation("Circuit {Name} closed", Name);
StellaMetrics.CircuitBreakerState.Record((int)CircuitState.Closed,
new TagList { { "circuit", Name } });
}
}
/// <summary>
/// Sliding window for tracking success/failure counts.
/// </summary>
internal sealed class SlidingWindow
{
private readonly TimeSpan _duration;
private readonly ConcurrentQueue<(DateTimeOffset Time, bool Success)> _events = new();
public SlidingWindow(TimeSpan duration)
{
_duration = duration;
}
public void RecordSuccess()
{
_events.Enqueue((DateTimeOffset.UtcNow, true));
Cleanup();
}
public void RecordFailure()
{
_events.Enqueue((DateTimeOffset.UtcNow, false));
Cleanup();
}
public WindowStats GetStats()
{
Cleanup();
var successes = 0;
var failures = 0;
foreach (var evt in _events)
{
if (evt.Success)
successes++;
else
failures++;
}
return new WindowStats(successes, failures);
}
public void Reset()
{
_events.Clear();
}
private void Cleanup()
{
var cutoff = DateTimeOffset.UtcNow - _duration;
while (_events.TryPeek(out var evt) && evt.Time < cutoff)
{
_events.TryDequeue(out _);
}
}
}
internal readonly record struct WindowStats(int Successes, int Failures)
{
public int TotalRequests => Successes + Failures;
}
```
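`SlidingWindow` reads `DateTimeOffset.UtcNow` directly, which makes expiry hard to test without sleeping. One possible approach, sketched below, is to inject the clock. `TestableWindow` is a hypothetical, single-threaded variant written only for this illustration (it uses a plain `Queue`, so it is not a drop-in replacement for the concurrent version above). With a 30-second window, an event recorded at t=0 is dropped once the clock reaches t=35, while one recorded at t=20 is kept.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of SlidingWindow with a caller-supplied clock. Not thread-safe.
public sealed class TestableWindow
{
    private readonly TimeSpan _duration;
    private readonly Func<DateTimeOffset> _now;
    private readonly Queue<(DateTimeOffset Time, bool Success)> _events = new();

    public TestableWindow(TimeSpan duration, Func<DateTimeOffset> now)
    {
        _duration = duration;
        _now = now;
    }

    public void Record(bool success) => _events.Enqueue((_now(), success));

    // Drops expired events, then counts what is left.
    public (int Successes, int Failures) GetStats()
    {
        var cutoff = _now() - _duration;
        while (_events.Count > 0 && _events.Peek().Time < cutoff)
            _events.Dequeue();

        int successes = 0, failures = 0;
        foreach (var evt in _events)
        {
            if (evt.Success) successes++;
            else failures++;
        }
        return (successes, failures);
    }
}
```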
---
## Retry Policy Configuration
```csharp
namespace StellaOps.Router.Resilience;
public class RetryPolicyConfig
{
/// <summary>Maximum number of retries.</summary>
public int MaxRetries { get; set; } = 3;
/// <summary>Initial delay before first retry.</summary>
public TimeSpan InitialDelay { get; set; } = TimeSpan.FromMilliseconds(100);
/// <summary>Maximum delay between retries.</summary>
public TimeSpan MaxDelay { get; set; } = TimeSpan.FromSeconds(10);
/// <summary>Backoff multiplier for exponential delay.</summary>
public double BackoffMultiplier { get; set; } = 2.0;
/// <summary>Whether to add jitter to delays.</summary>
public bool UseJitter { get; set; } = true;
/// <summary>Maximum jitter to add (percentage of delay).</summary>
public double MaxJitterPercent { get; set; } = 0.25;
/// <summary>HTTP status codes that trigger retry.</summary>
public HashSet<int> RetryableStatusCodes { get; set; } = new()
{
408, 429, 500, 502, 503, 504
};
/// <summary>Exception types that trigger retry.</summary>
public HashSet<Type> RetryableExceptions { get; set; } = new()
{
typeof(TimeoutException),
typeof(HttpRequestException),
typeof(IOException)
};
}
```
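As a worked example of these settings, the sketch below computes the delay before each retry without jitter: the initial delay multiplied by the backoff multiplier raised to the retry index, capped at the maximum delay. `BackoffSchedule.Compute` is a hypothetical helper. With the defaults above (100 ms, multiplier 2.0, 3 retries) the delays are 100 ms, 200 ms, and 400 ms; with jitter enabled at 25%, each delay can grow by up to a quarter of its value.

```csharp
using System;
using System.Collections.Generic;

public static class BackoffSchedule
{
    // Delay before retry n (1-based): initial * multiplier^(n-1), capped at maxDelay. Jitter omitted.
    public static IReadOnlyList<TimeSpan> Compute(
        int maxRetries, TimeSpan initialDelay, TimeSpan maxDelay, double multiplier)
    {
        var delays = new List<TimeSpan>(maxRetries);
        for (var attempt = 1; attempt <= maxRetries; attempt++)
        {
            var ms = initialDelay.TotalMilliseconds * Math.Pow(multiplier, attempt - 1);
            delays.Add(TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds)));
        }
        return delays;
    }
}
```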
---
## Retry Policy Implementation
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Executes operations with retry logic.
/// </summary>
public sealed class RetryPolicy
{
private readonly RetryPolicyConfig _config;
private readonly ILogger<RetryPolicy> _logger;
public RetryPolicy(RetryPolicyConfig config, ILogger<RetryPolicy> logger)
{
_config = config;
_logger = logger;
}
/// <summary>
/// Executes an operation with retry logic.
/// </summary>
public async Task<T> ExecuteAsync<T>(
Func<CancellationToken, Task<T>> operation,
Func<T, bool> shouldRetry,
CancellationToken cancellationToken)
{
var attempt = 0;
var totalDelay = TimeSpan.Zero;
while (true)
{
try
{
attempt++;
var result = await operation(cancellationToken);
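var retryable = shouldRetry(result);
if (retryable && attempt <= _config.MaxRetries)
{
var delay = CalculateDelay(attempt);
totalDelay += delay;
_logger.LogDebug(
"Retrying operation (attempt {Attempt}/{MaxRetries}) after {Delay}ms",
attempt, _config.MaxRetries, delay.TotalMilliseconds);
await Task.Delay(delay, cancellationToken);
continue;
}
if (retryable)
{
// Retries are exhausted and the result is still a failure; do not report success.
_logger.LogWarning(
"Operation still failing after {Attempts} attempts, returning last result",
attempt);
}
else if (attempt > 1)
{
_logger.LogDebug(
"Operation succeeded after {Attempts} attempts, total delay: {TotalDelay}ms",
attempt, totalDelay.TotalMilliseconds);
}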
return result;
}
catch (Exception ex) when (ShouldRetry(ex) && attempt <= _config.MaxRetries)
{
var delay = CalculateDelay(attempt);
totalDelay += delay;
_logger.LogWarning(
ex,
"Operation failed (attempt {Attempt}/{MaxRetries}), retrying after {Delay}ms",
attempt, _config.MaxRetries, delay.TotalMilliseconds);
await Task.Delay(delay, cancellationToken);
}
}
}
/// <summary>
/// Executes an operation with retry logic (response payload variant).
/// </summary>
public Task<ResponsePayload> ExecuteAsync(
Func<CancellationToken, Task<ResponsePayload>> operation,
CancellationToken cancellationToken)
{
return ExecuteAsync(
operation,
response => _config.RetryableStatusCodes.Contains(response.StatusCode),
cancellationToken);
}
private bool ShouldRetry(Exception ex)
{
var exType = ex.GetType();
return _config.RetryableExceptions.Any(t => t.IsAssignableFrom(exType));
}
private TimeSpan CalculateDelay(int attempt)
{
// Exponential backoff
var delay = TimeSpan.FromMilliseconds(
_config.InitialDelay.TotalMilliseconds * Math.Pow(_config.BackoffMultiplier, attempt - 1));
// Cap at max delay
if (delay > _config.MaxDelay)
{
delay = _config.MaxDelay;
}
// Add jitter
if (_config.UseJitter)
{
var jitter = delay.TotalMilliseconds * _config.MaxJitterPercent * Random.Shared.NextDouble();
delay = TimeSpan.FromMilliseconds(delay.TotalMilliseconds + jitter);
}
return delay;
}
}
```
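The attempt accounting above means `MaxRetries` counts retries, not total attempts: an operation that always fails is invoked `MaxRetries + 1` times before the exception escapes. The sketch below shows the same accounting in a deliberately reduced form, assuming a fixed delay and a single retryable exception type. `MiniRetry` is a hypothetical helper, not the `RetryPolicy` class itself.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class MiniRetry
{
    // Retries on TimeoutException up to maxRetries times with a fixed delay.
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        int maxRetries,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation(cancellationToken);
            }
            catch (TimeoutException) when (attempt <= maxRetries)
            {
                // Still within the retry budget: wait, then loop for another attempt.
                await Task.Delay(delay, cancellationToken);
            }
        }
    }
}
```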
---
## Resilience Policy Executor
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Combines circuit breaker and retry policies.
/// </summary>
public interface IResiliencePolicy
{
Task<ResponsePayload> ExecuteAsync(
string serviceName,
Func<CancellationToken, Task<ResponsePayload>> operation,
CancellationToken cancellationToken);
}
public sealed class ResiliencePolicy : IResiliencePolicy
{
private readonly ICircuitBreakerRegistry _circuitBreakers;
private readonly RetryPolicy _retryPolicy;
private readonly ResilienceConfig _config;
private readonly ILogger<ResiliencePolicy> _logger;
public ResiliencePolicy(
ICircuitBreakerRegistry circuitBreakers,
RetryPolicy retryPolicy,
IOptions<ResilienceConfig> config,
ILogger<ResiliencePolicy> logger)
{
_circuitBreakers = circuitBreakers;
_retryPolicy = retryPolicy;
_config = config.Value;
_logger = logger;
}
public async Task<ResponsePayload> ExecuteAsync(
string serviceName,
Func<CancellationToken, Task<ResponsePayload>> operation,
CancellationToken cancellationToken)
{
var circuitBreaker = _circuitBreakers.GetOrCreate(serviceName);
// Check circuit breaker
if (!await circuitBreaker.AllowRequestAsync(cancellationToken))
{
_logger.LogWarning("Circuit breaker {Name} is open, rejecting request", serviceName);
return _config.FallbackResponse ?? new ResponsePayload
{
StatusCode = 503,
Headers = new Dictionary<string, string>
{
["X-Circuit-Breaker"] = "open",
["Retry-After"] = "30"
},
Body = Encoding.UTF8.GetBytes(JsonSerializer.Serialize(new
{
error = "Service temporarily unavailable",
service = serviceName
})),
IsFinalChunk = true
};
}
try
{
// Execute with retry
var response = await _retryPolicy.ExecuteAsync(operation, cancellationToken);
// Record result
if (IsSuccess(response))
{
circuitBreaker.RecordSuccess();
}
else if (IsFailure(response))
{
circuitBreaker.RecordFailure();
}
return response;
}
catch (Exception)
{
circuitBreaker.RecordFailure();
throw;
}
}
private bool IsSuccess(ResponsePayload response)
{
return response.StatusCode >= 200 && response.StatusCode < 400;
}
private bool IsFailure(ResponsePayload response)
{
return _config.CircuitBreaker.FailureStatusCodes.Contains(response.StatusCode);
}
}
public class ResilienceConfig
{
public CircuitBreakerConfig CircuitBreaker { get; set; } = new();
public RetryPolicyConfig Retry { get; set; } = new();
public ResponsePayload? FallbackResponse { get; set; }
}
```
---
## Circuit Breaker Registry
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Registry of circuit breakers per service.
/// </summary>
public interface ICircuitBreakerRegistry
{
CircuitBreaker GetOrCreate(string name);
IReadOnlyDictionary<string, CircuitBreaker> GetAll();
void Reset(string name);
void ResetAll();
}
public sealed class CircuitBreakerRegistry : ICircuitBreakerRegistry
{
private readonly ConcurrentDictionary<string, CircuitBreaker> _breakers = new();
private readonly CircuitBreakerConfig _config;
private readonly ILoggerFactory _loggerFactory;
public CircuitBreakerRegistry(
IOptions<CircuitBreakerConfig> config,
ILoggerFactory loggerFactory)
{
_config = config.Value;
_loggerFactory = loggerFactory;
}
public CircuitBreaker GetOrCreate(string name)
{
return _breakers.GetOrAdd(name, n =>
{
var logger = _loggerFactory.CreateLogger<CircuitBreaker>();
return new CircuitBreaker(n, _config, logger);
});
}
public IReadOnlyDictionary<string, CircuitBreaker> GetAll()
{
return _breakers;
}
public void Reset(string name)
{
if (_breakers.TryRemove(name, out _))
{
// Will be recreated fresh on next request
}
}
public void ResetAll()
{
_breakers.Clear();
}
}
```
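One detail of `ConcurrentDictionary.GetOrAdd` is that the value factory can run more than once when several threads race on the same missing key; only one result is stored and the others are discarded. If creating a circuit breaker twice is undesirable, a common workaround is to store `Lazy<T>` values, as sketched below. `LazyRegistry<T>` is a hypothetical generic stand-in for the registry above, shown only to illustrate the technique.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class LazyRegistry<T>
{
    private readonly ConcurrentDictionary<string, Lazy<T>> _items = new();
    private readonly Func<string, T> _factory;

    public LazyRegistry(Func<string, T> factory) => _factory = factory;

    // Several Lazy wrappers may be created under contention, but only the stored one is
    // ever evaluated, and Lazy<T> (ExecutionAndPublication by default) runs it once.
    public T GetOrCreate(string name) =>
        _items.GetOrAdd(name, n => new Lazy<T>(() => _factory(n))).Value;
}
```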
---
## Bulkhead Pattern
```csharp
namespace StellaOps.Router.Resilience;
/// <summary>
/// Bulkhead pattern - limits concurrent requests to a service.
/// </summary>
public sealed class Bulkhead
{
private readonly SemaphoreSlim _semaphore;
private readonly BulkheadConfig _config;
private readonly string _name;
private int _queuedRequests;
public string Name => _name;
public int ActiveRequests => _config.MaxConcurrency - _semaphore.CurrentCount;
public int QueuedRequests => _queuedRequests;
public Bulkhead(string name, BulkheadConfig config)
{
_name = name;
_config = config;
_semaphore = new SemaphoreSlim(config.MaxConcurrency, config.MaxConcurrency);
}
/// <summary>
/// Acquires a slot in the bulkhead.
/// </summary>
public async Task<IDisposable?> AcquireAsync(CancellationToken cancellationToken)
{
var queued = Interlocked.Increment(ref _queuedRequests);
if (queued > _config.MaxQueueSize)
{
Interlocked.Decrement(ref _queuedRequests);
return null; // Reject immediately
}
try
{
var acquired = await _semaphore.WaitAsync(_config.QueueTimeout, cancellationToken);
Interlocked.Decrement(ref _queuedRequests);
if (!acquired)
{
return null;
}
return new BulkheadLease(_semaphore);
}
catch
{
Interlocked.Decrement(ref _queuedRequests);
throw;
}
}
private sealed class BulkheadLease : IDisposable
{
private readonly SemaphoreSlim _semaphore;
private bool _disposed;
public BulkheadLease(SemaphoreSlim semaphore)
{
_semaphore = semaphore;
}
public void Dispose()
{
if (!_disposed)
{
_semaphore.Release();
_disposed = true;
}
}
}
}
public class BulkheadConfig
{
public int MaxConcurrency { get; set; } = 100;
public int MaxQueueSize { get; set; } = 50;
public TimeSpan QueueTimeout { get; set; } = TimeSpan.FromSeconds(10);
}
```
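The lease-based API above leaves it to the caller to dispose the lease. A simpler shape, sketched below, wraps acquire and release around a delegate so a slot cannot be leaked. `MiniBulkhead` is a hypothetical reduction of `Bulkhead` (no queue-size limit), and returning a caller-supplied fallback on rejection is an assumption made for this example. With a concurrency of 1, a second call made while the first is still running is rejected once its queue timeout elapses.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class MiniBulkhead
{
    private readonly SemaphoreSlim _slots;

    public MiniBulkhead(int maxConcurrency) =>
        _slots = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    // Runs the operation if a slot frees up within the timeout; otherwise returns the fallback.
    public async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        T fallback,
        TimeSpan queueTimeout,
        CancellationToken cancellationToken = default)
    {
        if (!await _slots.WaitAsync(queueTimeout, cancellationToken))
            return fallback;

        try
        {
            return await operation();
        }
        finally
        {
            // Always give the slot back, even if the operation throws.
            _slots.Release();
        }
    }
}
```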
---
## Resilience Middleware
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// Middleware that applies resilience policies to requests.
/// </summary>
public sealed class ResilienceMiddleware
{
private readonly RequestDelegate _next;
private readonly IResiliencePolicy _policy;
public ResilienceMiddleware(RequestDelegate next, IResiliencePolicy policy)
{
_next = next;
_policy = policy;
}
public async Task InvokeAsync(HttpContext context)
{
// Get target service from route data
var serviceName = context.GetRouteValue("service")?.ToString();
if (string.IsNullOrEmpty(serviceName))
{
await _next(context);
return;
}
try
{
await _next(context);
}
catch (Exception ex) when (IsTransientException(ex) && !context.Response.HasStarted) // status and headers cannot change once the response has started
{
// Convert to 503 with retry information
context.Response.StatusCode = 503;
context.Response.Headers["Retry-After"] = "30";
await context.Response.WriteAsJsonAsync(new
{
error = "Service temporarily unavailable",
retryAfter = 30
});
}
}
private bool IsTransientException(Exception ex)
{
return ex is TimeoutException or
HttpRequestException or
TaskCanceledException;
}
}
```
---
## Service Registration
```csharp
namespace StellaOps.Router.Resilience;
public static class ResilienceExtensions
{
public static IServiceCollection AddStellaResilience(
this IServiceCollection services,
IConfiguration configuration)
{
services.Configure<ResilienceConfig>(configuration.GetSection("Resilience"));
services.Configure<CircuitBreakerConfig>(configuration.GetSection("Resilience:CircuitBreaker"));
services.Configure<RetryPolicyConfig>(configuration.GetSection("Resilience:Retry"));
services.Configure<BulkheadConfig>(configuration.GetSection("Resilience:Bulkhead"));
services.AddSingleton<ICircuitBreakerRegistry, CircuitBreakerRegistry>();
services.AddSingleton<RetryPolicy>();
services.AddSingleton<IResiliencePolicy, ResiliencePolicy>();
return services;
}
}
```
---
## YAML Configuration
```yaml
Resilience:
  CircuitBreaker:
    FailureThreshold: 5
    SamplingDuration: "00:00:30"
    BreakDuration: "00:00:30"
    MinimumThroughput: 10
    FailureRatioThreshold: 0.5
    FailureStatusCodes:
      - 500
      - 502
      - 503
      - 504
  Retry:
    MaxRetries: 3
    InitialDelay: "00:00:00.100"
    MaxDelay: "00:00:10"
    BackoffMultiplier: 2.0
    UseJitter: true
    MaxJitterPercent: 0.25
    RetryableStatusCodes:
      - 408
      - 429
      - 502
      - 503
      - 504
  Bulkhead:
    MaxConcurrency: 100
    MaxQueueSize: 50
    QueueTimeout: "00:00:10"
```
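The duration values in this file use the `hh:mm:ss[.fff]` form that `TimeSpan.Parse` understands. The sketch below shows how those strings map to `TimeSpan` values; `DurationSettings.Parse` is a hypothetical helper, and whether the configuration binder in use goes through exactly this call is an assumption. One pitfall to keep in mind when editing the file: a bare number such as `"30"` is interpreted as 30 days, not 30 seconds.

```csharp
using System;
using System.Globalization;

public static class DurationSettings
{
    // Parses duration strings such as "00:00:30" or "00:00:00.100", culture-independently.
    public static TimeSpan Parse(string value) =>
        TimeSpan.Parse(value, CultureInfo.InvariantCulture);
}
```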
---
## Deliverables
1. `StellaOps.Router.Resilience/CircuitBreaker.cs`
2. `StellaOps.Router.Resilience/CircuitBreakerConfig.cs`
3. `StellaOps.Router.Resilience/ICircuitBreakerRegistry.cs`
4. `StellaOps.Router.Resilience/CircuitBreakerRegistry.cs`
5. `StellaOps.Router.Resilience/RetryPolicy.cs`
6. `StellaOps.Router.Resilience/RetryPolicyConfig.cs`
7. `StellaOps.Router.Resilience/IResiliencePolicy.cs`
8. `StellaOps.Router.Resilience/ResiliencePolicy.cs`
9. `StellaOps.Router.Resilience/Bulkhead.cs`
10. `StellaOps.Router.Gateway/ResilienceMiddleware.cs`
11. Circuit breaker state transition tests
12. Retry policy tests
13. Bulkhead tests
---
## Next Step
Proceed to [Step 25: Configuration Hot-Reload](25-Step.md) to implement dynamic configuration updates.

@@ -1,754 +0,0 @@
# Step 25: Configuration Hot-Reload
**Phase 7: Testing & Documentation**
**Estimated Complexity:** Medium
**Dependencies:** All previous configuration steps
---
## Overview
Configuration hot-reload enables dynamic updates to router and microservice configuration without restarts. This includes route definitions, rate limits, circuit breaker settings, and JWKS rotation.
---
## Goals
1. Support YAML configuration hot-reload
2. Implement file watcher for configuration changes
3. Provide atomic configuration updates
4. Support validation before applying changes
5. Enable rollback on invalid configuration
---
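Goals 3 to 5 (atomic updates, validation before applying, rollback on invalid configuration) can be met with a small holder that swaps an immutable snapshot by reference. The sketch below is one possible shape, not the step's prescribed design: `ConfigHolder<T>` is a hypothetical type, and "rollback" here simply means that a rejected candidate is never published, so readers keep seeing the previous snapshot.

```csharp
using System;
using System.Threading;

// Hypothetical holder: readers see either the old or the new snapshot, never a mix.
public sealed class ConfigHolder<T> where T : class
{
    private T _current;

    public ConfigHolder(T initial) => _current = initial;

    public T Current => Volatile.Read(ref _current);

    // Validates the candidate first; on failure the previous snapshot stays in place.
    public bool TryUpdate(T candidate, Func<T, bool> validate)
    {
        if (!validate(candidate))
            return false;

        // Reference assignment is atomic; Exchange also publishes it to other threads.
        Interlocked.Exchange(ref _current, candidate);
        return true;
    }
}
```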
## Configuration Watcher
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Watches configuration files for changes and triggers reloads.
/// </summary>
public sealed class ConfigurationWatcher : IHostedService, IDisposable
{
private readonly IConfiguration _configuration;
private readonly IOptionsMonitor<RouterConfig> _routerConfig;
private readonly ILogger<ConfigurationWatcher> _logger;
private readonly List<FileSystemWatcher> _watchers = new();
private readonly Subject<ConfigurationChange> _changes = new();
private readonly TimeSpan _debounceInterval = TimeSpan.FromMilliseconds(500);
private readonly ConcurrentDictionary<string, DateTimeOffset> _lastChange = new();
public IObservable<ConfigurationChange> Changes => _changes;
public ConfigurationWatcher(
IConfiguration configuration,
IOptionsMonitor<RouterConfig> routerConfig,
ILogger<ConfigurationWatcher> logger)
{
_configuration = configuration;
_routerConfig = routerConfig;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
// Watch all YAML configuration files
var configPaths = GetConfigurationFilePaths();
foreach (var path in configPaths)
{
if (!File.Exists(path))
continue;
var directory = Path.GetDirectoryName(path)!;
var fileName = Path.GetFileName(path);
var watcher = new FileSystemWatcher(directory)
{
Filter = fileName,
NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size,
EnableRaisingEvents = true
};
watcher.Changed += OnConfigurationFileChanged;
_watchers.Add(watcher);
_logger.LogInformation("Watching configuration file: {Path}", path);
}
// Also subscribe to IOptionsMonitor for programmatic changes
_routerConfig.OnChange(config =>
{
_changes.OnNext(new ConfigurationChange
{
Section = "Router",
ChangeType = ChangeType.Modified,
Timestamp = DateTimeOffset.UtcNow
});
});
return Task.CompletedTask;
}
private void OnConfigurationFileChanged(object sender, FileSystemEventArgs e)
{
// Debounce rapid changes
var now = DateTimeOffset.UtcNow;
if (_lastChange.TryGetValue(e.FullPath, out var lastChange) &&
now - lastChange < _debounceInterval)
{
return;
}
_lastChange[e.FullPath] = now;
_logger.LogInformation("Configuration file changed: {Path}", e.FullPath);
// Delay to allow file writes to complete
Task.Delay(100).ContinueWith(_ =>
{
try
{
// Validate configuration before notifying
if (ValidateConfiguration(e.FullPath))
{
_changes.OnNext(new ConfigurationChange
{
Section = DetermineSectionFromPath(e.FullPath),
ChangeType = ChangeType.Modified,
FilePath = e.FullPath,
Timestamp = now
});
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to process configuration change for {Path}", e.FullPath);
}
});
}
private bool ValidateConfiguration(string path)
{
try
{
var yaml = File.ReadAllText(path);
var deserializer = new DeserializerBuilder()
.WithNamingConvention(CamelCaseNamingConvention.Instance)
.Build();
// Try to deserialize to validate YAML syntax
var doc = deserializer.Deserialize<Dictionary<string, object>>(yaml);
return doc != null;
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Invalid configuration file: {Path}", path);
return false;
}
}
private string DetermineSectionFromPath(string path)
{
// Invariant casing avoids culture-specific surprises (for example the Turkish dotless i)
var fileName = Path.GetFileNameWithoutExtension(path).ToLowerInvariant();
return fileName switch
{
"router" => "Router",
"routes" => "Routes",
"ratelimits" => "RateLimits",
"endpoints" => "Endpoints",
_ => "Unknown"
};
}
private IEnumerable<string> GetConfigurationFilePaths()
{
// Get paths from configuration providers
var paths = new List<string>();
if (_configuration is IConfigurationRoot root)
{
foreach (var provider in root.Providers)
{
if (provider is FileConfigurationProvider fileProvider)
{
var source = fileProvider.Source;
if (source.FileProvider?.GetFileInfo(source.Path ?? "") is { Exists: true } fileInfo)
{
paths.Add(fileInfo.PhysicalPath ?? "");
}
}
}
}
return paths.Where(p => !string.IsNullOrEmpty(p));
}
public Task StopAsync(CancellationToken cancellationToken)
{
foreach (var watcher in _watchers)
{
watcher.EnableRaisingEvents = false;
}
return Task.CompletedTask;
}
public void Dispose()
{
foreach (var watcher in _watchers)
{
watcher.Dispose();
}
_changes.Dispose();
}
}
public sealed class ConfigurationChange
{
public string Section { get; init; } = "";
public ChangeType ChangeType { get; init; }
public string? FilePath { get; init; }
public DateTimeOffset Timestamp { get; init; }
}
public enum ChangeType
{
Added,
Modified,
Removed
}
```
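The watcher above debounces on the leading edge: the first event in a burst is processed and later events inside the window are dropped, so the final write of a multi-step save can be missed. A trailing-edge debounce, which waits until a file has been quiet for the whole interval, is one alternative. The following is a minimal sketch only; `TrailingDebouncer` and its members are hypothetical names and not part of the Router spec.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

/// <summary>
/// Hypothetical helper: runs an action only after a key has received no new
/// signal for the configured quiet period (trailing-edge debounce).
/// </summary>
public sealed class TrailingDebouncer
{
    private readonly TimeSpan _quietPeriod;
    private readonly object _gate = new();
    private readonly Dictionary<string, long> _latestTicket = new();
    private long _nextTicket;

    public TrailingDebouncer(TimeSpan quietPeriod) => _quietPeriod = quietPeriod;

    /// <summary>
    /// Records a signal for <paramref name="key"/>. The returned task completes
    /// after the quiet period; the action runs only for the most recent signal.
    /// </summary>
    public Task Signal(string key, Action<string> action)
    {
        long ticket;
        lock (_gate)
        {
            // A globally increasing ticket marks this signal as the newest for the key.
            ticket = ++_nextTicket;
            _latestTicket[key] = ticket;
        }
        return RunAfterQuietPeriodAsync(key, ticket, action);
    }

    private async Task RunAfterQuietPeriodAsync(string key, long ticket, Action<string> action)
    {
        await Task.Delay(_quietPeriod).ConfigureAwait(false);

        lock (_gate)
        {
            // A newer signal arrived while waiting, so this one is superseded.
            if (_latestTicket[key] != ticket)
            {
                return;
            }
        }

        action(key);
    }
}
```

In the watcher, the file-changed handler could call `Signal(e.FullPath, ...)` and perform validation and notification inside the action, observing the returned task so that exceptions are logged rather than lost.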
---
## Route Configuration Reloader
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Handles hot-reload of route configurations.
/// </summary>
public sealed class RouteConfigurationReloader : IHostedService
{
private readonly ConfigurationWatcher _watcher;
private readonly IRouteRegistry _routeRegistry;
private readonly ILogger<RouteConfigurationReloader> _logger;
private IDisposable? _subscription;
public RouteConfigurationReloader(
ConfigurationWatcher watcher,
IRouteRegistry routeRegistry,
ILogger<RouteConfigurationReloader> logger)
{
_watcher = watcher;
_routeRegistry = routeRegistry;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
_subscription = _watcher.Changes
.Where(c => c.Section == "Routes")
.Subscribe(OnRoutesChanged);
return Task.CompletedTask;
}
private void OnRoutesChanged(ConfigurationChange change)
{
_logger.LogInformation("Reloading routes from {Path}", change.FilePath);
try
{
_routeRegistry.Reload();
_logger.LogInformation("Routes reloaded successfully");
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to reload routes, keeping previous configuration");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_subscription?.Dispose();
return Task.CompletedTask;
}
}
```
---
## Rate Limit Configuration Reloader
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Handles hot-reload of rate limit configurations.
/// </summary>
public sealed class RateLimitConfigurationReloader : IHostedService
{
private readonly ConfigurationWatcher _watcher;
private readonly IRateLimiter _rateLimiter;
private readonly IOptionsMonitor<RateLimitConfig> _config;
private readonly ILogger<RateLimitConfigurationReloader> _logger;
private IDisposable? _subscription;
public RateLimitConfigurationReloader(
ConfigurationWatcher watcher,
IRateLimiter rateLimiter,
IOptionsMonitor<RateLimitConfig> config,
ILogger<RateLimitConfigurationReloader> logger)
{
_watcher = watcher;
_rateLimiter = rateLimiter;
_config = config;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
_subscription = _watcher.Changes
.Where(c => c.Section == "RateLimits")
.Subscribe(OnRateLimitsChanged);
_config.OnChange(OnRateLimitConfigChanged);
return Task.CompletedTask;
}
private void OnRateLimitsChanged(ConfigurationChange change)
{
_logger.LogInformation("Rate limit configuration changed, applying updates");
ApplyRateLimitChanges();
}
private void OnRateLimitConfigChanged(RateLimitConfig config)
{
_logger.LogInformation("Rate limit options changed, applying updates");
ApplyRateLimitChanges();
}
private void ApplyRateLimitChanges()
{
try
{
// Rate limiter will pick up new config from IOptionsMonitor
// Clear any cached tier information
if (_rateLimiter is ICacheableRateLimiter cacheable)
{
cacheable.ClearCache();
}
_logger.LogInformation("Rate limit configuration applied successfully");
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to apply rate limit changes");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_subscription?.Dispose();
return Task.CompletedTask;
}
}
public interface ICacheableRateLimiter
{
void ClearCache();
}
```
---
## JWKS Hot-Reload
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Handles JWKS rotation and cache refresh.
/// </summary>
public sealed class JwksReloader : IHostedService
{
private readonly IJwksCache _jwksCache;
private readonly JwtAuthenticationConfig _config;
private readonly ILogger<JwksReloader> _logger;
private Timer? _refreshTimer;
public JwksReloader(
IJwksCache jwksCache,
IOptions<JwtAuthenticationConfig> config,
ILogger<JwksReloader> logger)
{
_jwksCache = jwksCache;
_config = config.Value;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
// Periodic refresh of JWKS
var interval = _config.JwksRefreshInterval;
_refreshTimer = new Timer(
RefreshJwks,
null,
interval,
interval);
_logger.LogInformation(
"JWKS refresh scheduled every {Interval}",
interval);
return Task.CompletedTask;
}
private async void RefreshJwks(object? state)
{
try
{
_logger.LogDebug("Refreshing JWKS cache");
await _jwksCache.RefreshAsync(CancellationToken.None);
_logger.LogDebug("JWKS cache refreshed successfully");
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to refresh JWKS cache, will retry");
}
}
public Task StopAsync(CancellationToken cancellationToken)
{
_refreshTimer?.Dispose();
return Task.CompletedTask;
}
}
```
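The timer callback above is `async void`, which is safe only because every exception is caught inside it, and `System.Threading.Timer` can start a new callback while the previous refresh is still running. On .NET 6 or later, a `PeriodicTimer` loop is one way to avoid both concerns. This is a sketch under that assumption; `PeriodicRefresh` is a hypothetical helper name and the `refresh` delegate stands in for `IJwksCache.RefreshAsync`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PeriodicRefresh
{
    /// <summary>
    /// Invokes <paramref name="refresh"/> once per interval until the token is cancelled.
    /// Ticks never overlap: the next wait starts only after the previous refresh finishes.
    /// </summary>
    public static async Task RunAsync(
        TimeSpan interval,
        Func<CancellationToken, Task> refresh,
        Action<Exception> onError,
        CancellationToken stoppingToken)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            while (await timer.WaitForNextTickAsync(stoppingToken).ConfigureAwait(false))
            {
                try
                {
                    await refresh(stoppingToken).ConfigureAwait(false);
                }
                catch (Exception ex) when (ex is not OperationCanceledException)
                {
                    // A failed refresh is reported and retried on the next tick.
                    onError(ex);
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown path.
        }
    }
}
```

A hosted service could start this loop in `StartAsync`, keep the returned task, and cancel the token and await the task in `StopAsync`.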
---
## Configuration Validation
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Validates configuration before applying changes.
/// </summary>
public interface IConfigurationValidator
{
ValidationResult Validate<T>(T config) where T : class;
}
public sealed class ConfigurationValidator : IConfigurationValidator
{
private readonly ILogger<ConfigurationValidator> _logger;
public ConfigurationValidator(ILogger<ConfigurationValidator> logger)
{
_logger = logger;
}
public ValidationResult Validate<T>(T config) where T : class
{
var errors = new List<string>();
// Use data annotations validation
var context = new ValidationContext(config);
var results = new List<System.ComponentModel.DataAnnotations.ValidationResult>();
if (!Validator.TryValidateObject(config, context, results, validateAllProperties: true))
{
errors.AddRange(results.Select(r => r.ErrorMessage ?? "Unknown validation error"));
}
// Type-specific validation
errors.AddRange(config switch
{
RouterConfig router => ValidateRouterConfig(router),
RateLimitConfig rateLimit => ValidateRateLimitConfig(rateLimit),
_ => Enumerable.Empty<string>()
});
if (errors.Any())
{
_logger.LogWarning(
"Configuration validation failed: {Errors}",
string.Join(", ", errors));
}
return new ValidationResult
{
IsValid = !errors.Any(),
Errors = errors
};
}
private IEnumerable<string> ValidateRouterConfig(RouterConfig config)
{
if (config.MaxPayloadSize <= 0)
yield return "MaxPayloadSize must be positive";
if (config.RequestTimeout <= TimeSpan.Zero)
yield return "RequestTimeout must be positive";
}
private IEnumerable<string> ValidateRateLimitConfig(RateLimitConfig config)
{
foreach (var (tier, limits) in config.Tiers)
{
if (limits.RequestsPerMinute <= 0)
yield return $"Tier {tier}: RequestsPerMinute must be positive";
}
}
}
public sealed class ValidationResult
{
public bool IsValid { get; init; }
public IReadOnlyList<string> Errors { get; init; } = Array.Empty<string>();
}
```
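The data-annotations step above depends on `validateAllProperties: true`; without it, `Validator.TryValidateObject` checks only `[Required]` attributes and silently skips `[Range]` and similar rules. A small illustration follows. `SampleRouterOptions` and `AnnotationValidation` are hypothetical names introduced for this sketch, not types from the Router.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;

// Hypothetical options class used only for this illustration.
public sealed class SampleRouterOptions
{
    [Range(1, int.MaxValue, ErrorMessage = "MaxPayloadSize must be positive")]
    public int MaxPayloadSize { get; set; }

    [Required(ErrorMessage = "Name is required")]
    public string? Name { get; set; }
}

public static class AnnotationValidation
{
    /// <summary>Returns the data-annotation error messages for an object.</summary>
    public static IReadOnlyList<string> GetErrors(object instance, bool validateAllProperties = true)
    {
        var results = new List<ValidationResult>();
        Validator.TryValidateObject(
            instance,
            new ValidationContext(instance),
            results,
            validateAllProperties);

        return results
            .Select(r => r.ErrorMessage ?? "Unknown validation error")
            .ToList();
    }
}
```

Because the guide declares its own `ValidationResult` class in `StellaOps.Router.Configuration`, code in that namespace needs the fully qualified `System.ComponentModel.DataAnnotations.ValidationResult`, as the validator above already does.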
---
## Atomic Configuration Update
```csharp
namespace StellaOps.Router.Configuration;
/// <summary>
/// Provides atomic configuration updates with rollback support.
/// </summary>
public sealed class AtomicConfigurationUpdater
{
private readonly IConfigurationValidator _validator;
private readonly ILogger<AtomicConfigurationUpdater> _logger;
// SemaphoreSlim rather than ReaderWriterLockSlim: the latter is thread-affine and must not
// be held across an await, because the continuation may resume on a different thread.
private readonly SemaphoreSlim _lock = new(1, 1);
public AtomicConfigurationUpdater(
IConfigurationValidator validator,
ILogger<AtomicConfigurationUpdater> logger)
{
_validator = validator;
_logger = logger;
}
/// <summary>
/// Atomically updates configuration with validation and rollback.
/// </summary>
public async Task<bool> UpdateAsync<T>(
T currentConfig,
T newConfig,
Func<T, Task> applyAction,
Func<T, Task>? rollbackAction = null)
where T : class
{
// Validate new configuration
var validation = _validator.Validate(newConfig);
if (!validation.IsValid)
{
_logger.LogWarning(
"Configuration update rejected: {Errors}",
string.Join(", ", validation.Errors));
return false;
}
// Serialize updates; only one apply/rollback sequence runs at a time
await _lock.WaitAsync();
try
{
// Keep a reference to the current config for rollback
var backup = currentConfig;
try
{
await applyAction(newConfig);
_logger.LogInformation("Configuration updated successfully");
return true;
}
catch (Exception ex)
{
_logger.LogError(ex, "Configuration update failed, rolling back");
if (rollbackAction != null)
{
try
{
await rollbackAction(backup);
_logger.LogInformation("Configuration rolled back successfully");
}
catch (Exception rollbackEx)
{
_logger.LogError(rollbackEx, "Rollback failed!");
}
}
return false;
}
}
finally
{
_lock.Release();
}
}
}
```
---
## Configuration API Endpoints
```csharp
namespace StellaOps.Router.Gateway;
/// <summary>
/// API endpoints for configuration management.
/// </summary>
public static class ConfigurationEndpoints
{
public static IEndpointRouteBuilder MapConfigurationEndpoints(
this IEndpointRouteBuilder endpoints,
string basePath = "/api/config")
{
var group = endpoints.MapGroup(basePath)
.RequireAuthorization("admin");
group.MapGet("/", GetConfiguration);
group.MapGet("/{section}", GetConfigurationSection);
group.MapPost("/reload", ReloadConfiguration);
group.MapPost("/validate", ValidateConfiguration);
return endpoints;
}
private static IResult GetConfiguration(
IConfiguration configuration)
{
var sections = new Dictionary<string, object>();
foreach (var child in configuration.GetChildren())
{
sections[child.Key] = GetSectionValue(child);
}
return Results.Ok(sections);
}
private static object GetSectionValue(IConfigurationSection section)
{
var children = section.GetChildren().ToList();
if (!children.Any())
{
return section.Value ?? "";
}
if (children.All(c => int.TryParse(c.Key, out _)))
{
// Array
return children.Select(c => GetSectionValue(c)).ToList();
}
// Object
return children.ToDictionary(c => c.Key, c => GetSectionValue(c));
}
private static IResult GetConfigurationSection(
string section,
IConfiguration configuration)
{
var configSection = configuration.GetSection(section);
if (!configSection.Exists())
{
return Results.NotFound(new { error = $"Section '{section}' not found" });
}
return Results.Ok(GetSectionValue(configSection));
}
private static IResult ReloadConfiguration(
IConfiguration configuration,
ILogger<ConfigurationWatcher> logger)
{
logger.LogInformation("Manual configuration reload triggered");
// Re-read all providers; change tokens and IOptionsMonitor callbacks fire afterwards
if (configuration is not IConfigurationRoot root)
{
return Results.Problem("Configuration root does not support reload");
}
root.Reload();
return Results.Ok(new { message = "Configuration reload triggered" });
}
private static async Task<IResult> ValidateConfiguration(
HttpRequest request,
IConfigurationValidator validator)
{
var body = await request.ReadFromJsonAsync<Dictionary<string, object>>();
if (body == null)
{
return Results.BadRequest(new { error = "Invalid request body" });
}
// Basic syntax validation
return Results.Ok(new { valid = true });
}
}
```
---
## YAML Configuration
```yaml
Configuration:
  # Enable hot-reload
  HotReload:
    Enabled: true
    DebounceInterval: "00:00:00.500"
    ValidateBeforeApply: true
  # Files to watch
  WatchPaths:
    - "/etc/stellaops/router.yaml"
    - "/etc/stellaops/routes.yaml"
    - "/etc/stellaops/ratelimits.yaml"
  # JWKS refresh settings
  Jwks:
    RefreshInterval: "00:05:00"
    RefreshOnError: true
    MaxRetries: 3
```
---
## Deliverables
1. `StellaOps.Router.Configuration/ConfigurationWatcher.cs`
2. `StellaOps.Router.Configuration/RouteConfigurationReloader.cs`
3. `StellaOps.Router.Configuration/RateLimitConfigurationReloader.cs`
4. `StellaOps.Router.Configuration/JwksReloader.cs`
5. `StellaOps.Router.Configuration/IConfigurationValidator.cs`
6. `StellaOps.Router.Configuration/ConfigurationValidator.cs`
7. `StellaOps.Router.Configuration/AtomicConfigurationUpdater.cs`
8. `StellaOps.Router.Gateway/ConfigurationEndpoints.cs`
9. Configuration reload tests
10. Validation tests
---
## Next Step
Proceed to [Step 26: End-to-End Testing](26-Step.md) to implement comprehensive integration tests.


@@ -1,683 +0,0 @@
# Step 26: End-to-End Testing
**Phase 7: Testing & Documentation**
**Estimated Complexity:** High
**Dependencies:** All implementation steps
---
## Overview
End-to-end testing validates the complete request flow from HTTP client through the gateway, transport layer, microservice, and back. Tests cover all handlers, authentication, rate limiting, streaming, and failure scenarios.
---
## Goals
1. Validate complete request/response flow
2. Test all route handlers
3. Verify authentication and authorization
4. Test rate limiting behavior
5. Validate streaming and large payloads
6. Test failure scenarios and resilience
---
## Test Infrastructure
```csharp
namespace StellaOps.Router.Tests;
/// <summary>
/// End-to-end test fixture providing gateway and microservice hosts.
/// </summary>
public sealed class EndToEndTestFixture : IAsyncLifetime
{
private IHost? _gatewayHost;
private IHost? _microserviceHost;
private InMemoryTransportHub? _transportHub;
public HttpClient GatewayClient { get; private set; } = null!;
public string GatewayBaseUrl { get; private set; } = null!;
public async Task InitializeAsync()
{
// Shared transport hub for InMemory testing
_transportHub = new InMemoryTransportHub(
NullLoggerFactory.Instance.CreateLogger<InMemoryTransportHub>());
// Start gateway
_gatewayHost = await CreateGatewayHostAsync();
await _gatewayHost.StartAsync();
GatewayBaseUrl = "http://localhost:5000";
GatewayClient = new HttpClient { BaseAddress = new Uri(GatewayBaseUrl) };
// Start test microservice
_microserviceHost = await CreateMicroserviceHostAsync();
await _microserviceHost.StartAsync();
// Wait for connection
await Task.Delay(500);
}
private async Task<IHost> CreateGatewayHostAsync()
{
return Host.CreateDefaultBuilder()
.ConfigureWebHostDefaults(web =>
{
web.UseUrls("http://localhost:5000");
web.ConfigureServices((context, services) =>
{
services.AddSingleton(_transportHub!);
services.AddStellaGateway(context.Configuration);
services.AddInMemoryTransport();
// Use in-memory rate limiter
services.AddSingleton<IRateLimiter, InMemoryRateLimiter>();
// Mock Authority
services.AddSingleton<IAuthorityClient, MockAuthorityClient>();
});
web.Configure(app =>
{
app.UseRouting();
app.UseStellaGateway();
app.UseEndpoints(endpoints =>
{
endpoints.MapStellaRoutes();
});
});
})
.Build();
}
private async Task<IHost> CreateMicroserviceHostAsync()
{
var host = StellaMicroserviceBuilder
.Create("test-service")
.ConfigureServices(services =>
{
services.AddSingleton(_transportHub!);
services.AddScoped<TestEndpointHandler>();
})
.ConfigureTransport(t => t.Default = "InMemory")
.ConfigureEndpoints(e =>
{
e.AutoDiscover = true;
e.BasePath = "/api";
})
.Build();
return (IHost)host;
}
public async Task DisposeAsync()
{
GatewayClient.Dispose();
if (_microserviceHost != null)
{
await _microserviceHost.StopAsync();
_microserviceHost.Dispose();
}
if (_gatewayHost != null)
{
await _gatewayHost.StopAsync();
_gatewayHost.Dispose();
}
_transportHub?.Dispose();
}
}
```
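Two details of the fixture above can make runs flaky: the hard-coded `http://localhost:5000` collides when xUnit runs several test classes (each with its own fixture instance) in parallel, and the fixed `Task.Delay(500)` only guesses when the microservice has connected. The sketch below shows one possible pair of helpers. `TestHostHelpers` is a hypothetical name, and the readiness condition passed to `WaitUntilAsync` is left to the caller because the guide does not define a connection-state API.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class TestHostHelpers
{
    /// <summary>
    /// Asks the OS for an unused loopback port by binding to port 0.
    /// The port is released before returning, so another process could
    /// still grab it before the test host binds; this narrows the race only.
    /// </summary>
    public static int GetFreeTcpPort()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            return ((IPEndPoint)listener.LocalEndpoint).Port;
        }
        finally
        {
            listener.Stop();
        }
    }

    /// <summary>
    /// Polls <paramref name="condition"/> until it returns true or the timeout elapses.
    /// </summary>
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition,
        TimeSpan timeout,
        TimeSpan? pollInterval = null)
    {
        var delay = pollInterval ?? TimeSpan.FromMilliseconds(25);
        var stopwatch = Stopwatch.StartNew();

        while (stopwatch.Elapsed < timeout)
        {
            if (condition())
            {
                return true;
            }
            await Task.Delay(delay).ConfigureAwait(false);
        }

        // One final check so a condition that became true during the last delay still counts.
        return condition();
    }
}
```

The fixture could then build `GatewayBaseUrl` from `GetFreeTcpPort()` and replace the fixed delay with a `WaitUntilAsync` call over whatever readiness signal the transport hub exposes.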
---
## Test Endpoint Handler
```csharp
namespace StellaOps.Router.Tests;
[StellaEndpoint(BasePath = "/test")]
public class TestEndpointHandler : EndpointHandler
{
[StellaGet("echo")]
public ResponsePayload Echo()
{
return Ok(new
{
method = Context.Method,
path = Context.Path,
query = Context.Query.ToDictionary(q => q.Key, q => q.Value.ToString()),
headers = Context.Headers.ToDictionary(h => h.Key, h => h.Value.ToString()),
claims = Context.Claims
});
}
[StellaPost("echo")]
public ResponsePayload EchoBody()
{
var body = Context.ReadBodyAsString();
return Ok(new { body });
}
[StellaGet("items/{id}")]
public ResponsePayload GetItem([FromPath] string id)
{
return Ok(new { id });
}
[StellaGet("slow")]
public async Task<ResponsePayload> SlowEndpoint(CancellationToken cancellationToken)
{
await Task.Delay(5000, cancellationToken);
return Ok(new { completed = true });
}
[StellaGet("error")]
public ResponsePayload ThrowError()
{
throw new InvalidOperationException("Test error");
}
[StellaGet("status/{code}")]
public ResponsePayload ReturnStatus([FromPath] int code)
{
return Response().WithStatus(code).WithJson(new { statusCode = code }).Build();
}
[StellaGet("protected")]
[StellaAuth(RequiredClaims = new[] { "admin" })]
public ResponsePayload ProtectedEndpoint()
{
return Ok(new { message = "Access granted" });
}
[StellaPost("upload")]
public ResponsePayload HandleUpload()
{
var size = Context.ContentLength ?? Context.RawBody?.Length ?? 0;
return Ok(new { bytesReceived = size });
}
[StellaGet("stream")]
public ResponsePayload StreamResponse()
{
var data = new byte[1024 * 1024]; // 1MB
Random.Shared.NextBytes(data);
return Response()
.WithBytes(data, "application/octet-stream")
.Build();
}
}
```
---
## Basic Request/Response Tests
```csharp
namespace StellaOps.Router.Tests;
public class BasicRequestResponseTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public BasicRequestResponseTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Get_Echo_ReturnsRequestDetails()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/echo");
var content = await response.Content.ReadFromJsonAsync<EchoResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal("GET", content?.Method);
Assert.Equal("/api/test/echo", content?.Path);
}
[Fact]
public async Task Post_Echo_ReturnsBody()
{
// Arrange
var client = _fixture.GatewayClient;
var body = new StringContent("{\"test\": true}", Encoding.UTF8, "application/json");
// Act
var response = await client.PostAsync("/api/test/echo", body);
var content = await response.Content.ReadFromJsonAsync<EchoBodyResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Contains("test", content?.Body);
}
[Fact]
public async Task Get_WithPathParameter_ExtractsParameter()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/items/12345");
var content = await response.Content.ReadFromJsonAsync<ItemResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal("12345", content?.Id);
}
[Fact]
public async Task Get_NonExistentPath_Returns404()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/nonexistent");
// Assert
Assert.Equal(HttpStatusCode.NotFound, response.StatusCode);
}
private record EchoResponse(
string Method,
string Path,
Dictionary<string, string> Query,
Dictionary<string, string> Claims);
private record EchoBodyResponse(string Body);
private record ItemResponse(string Id);
}
```
---
## Authentication Tests
```csharp
namespace StellaOps.Router.Tests;
public class AuthenticationTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public AuthenticationTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Protected_WithoutToken_Returns401()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/protected");
// Assert
Assert.Equal(HttpStatusCode.Unauthorized, response.StatusCode);
}
[Fact]
public async Task Protected_WithValidToken_Returns200()
{
// Arrange
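// Dedicated client: the Authorization header must not leak into other tests via the shared fixture client
using var client = new HttpClient { BaseAddress = new Uri(_fixture.GatewayBaseUrl) };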
var token = CreateTestToken(new Dictionary<string, string> { ["admin"] = "true" });
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
// Act
var response = await client.GetAsync("/api/test/protected");
// Assert
Assert.True(response.IsSuccessStatusCode);
}
[Fact]
public async Task Protected_WithInvalidToken_Returns401()
{
// Arrange
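// Dedicated client: the Authorization header must not leak into other tests via the shared fixture client
using var client = new HttpClient { BaseAddress = new Uri(_fixture.GatewayBaseUrl) };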
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", "invalid-token");
// Act
var response = await client.GetAsync("/api/test/protected");
// Assert
Assert.Equal(HttpStatusCode.Unauthorized, response.StatusCode);
}
[Fact]
public async Task Protected_WithMissingClaim_Returns403()
{
// Arrange
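// Dedicated client: the Authorization header must not leak into other tests via the shared fixture client
using var client = new HttpClient { BaseAddress = new Uri(_fixture.GatewayBaseUrl) };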
var token = CreateTestToken(new Dictionary<string, string> { ["user"] = "true" }); // No admin claim
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
// Act
var response = await client.GetAsync("/api/test/protected");
// Assert
Assert.Equal(HttpStatusCode.Forbidden, response.StatusCode);
}
private string CreateTestToken(Dictionary<string, string> claims)
{
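// Create a test JWT (a real implementation would load a dedicated test key).
// HS256 needs a key of at least 256 bits (32 bytes); recent Microsoft.IdentityModel versions reject shorter keys.
var handler = new JwtSecurityTokenHandler();
var key = new SymmetricSecurityKey(Encoding.UTF8.GetBytes("test-key-for-testing-only-1234567890"));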
var creds = new SigningCredentials(key, SecurityAlgorithms.HmacSha256);
var claimsList = claims.Select(c => new Claim(c.Key, c.Value)).ToList();
claimsList.Add(new Claim("sub", "test-user"));
var token = new JwtSecurityToken(
issuer: "test",
audience: "test",
claims: claimsList,
expires: DateTime.UtcNow.AddHours(1),
signingCredentials: creds);
return handler.WriteToken(token);
}
}
```
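Default request headers live on the `HttpClient` instance, so a header set on a client that several tests share is sent by every later test that reuses it. Attaching credentials per request is one way to keep tests independent of execution order. A minimal sketch, where `AuthorizedRequests` is a hypothetical helper name:

```csharp
using System.Net.Http;
using System.Net.Http.Headers;

public static class AuthorizedRequests
{
    /// <summary>
    /// Builds a GET request that carries its own bearer token (if any),
    /// leaving the client's default headers untouched.
    /// </summary>
    public static HttpRequestMessage Get(string relativeUrl, string? bearerToken = null)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, relativeUrl);
        if (bearerToken is not null)
        {
            request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", bearerToken);
        }
        return request;
    }
}
```

A test would then call `await client.SendAsync(AuthorizedRequests.Get("/api/test/protected", token))` instead of mutating `DefaultRequestHeaders`.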
---
## Rate Limiting Tests
```csharp
namespace StellaOps.Router.Tests;
public class RateLimitingTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public RateLimitingTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task RateLimit_ExceedingLimit_Returns429()
{
// Arrange
var client = _fixture.GatewayClient;
var tasks = new List<Task<HttpResponseMessage>>();
// Act - Send 100 requests quickly
for (int i = 0; i < 100; i++)
{
tasks.Add(client.GetAsync("/api/test/echo"));
}
var responses = await Task.WhenAll(tasks);
// Assert - Some should be rate limited
var rateLimited = responses.Count(r => r.StatusCode == HttpStatusCode.TooManyRequests);
Assert.True(rateLimited > 0, "Expected some requests to be rate limited");
}
[Fact]
public async Task RateLimit_Headers_ArePresent()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/echo");
// Assert
Assert.True(response.Headers.Contains("X-RateLimit-Limit"));
Assert.True(response.Headers.Contains("X-RateLimit-Remaining"));
}
[Fact]
public async Task RateLimit_PerUser_IsolatesUsers()
{
// Arrange
using var client1 = new HttpClient { BaseAddress = new Uri(_fixture.GatewayBaseUrl) };
using var client2 = new HttpClient { BaseAddress = new Uri(_fixture.GatewayBaseUrl) };
client1.DefaultRequestHeaders.Add("X-API-Key", "user1-key");
client2.DefaultRequestHeaders.Add("X-API-Key", "user2-key");
// Act - Send a burst as user1. Note: 50 requests stays below the 60/minute free tier in the
// test configuration, so raise the count if the test must truly exhaust user1's quota.
for (int i = 0; i < 50; i++)
{
await client1.GetAsync("/api/test/echo");
}
// User2 should still have quota
var response = await client2.GetAsync("/api/test/echo");
// Assert
Assert.True(response.IsSuccessStatusCode);
}
}
```
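The tests above assume a limiter that counts requests per caller and resets after a time window. The guide does not show `InMemoryRateLimiter`, so the following is only a sketch of the behaviour the tests rely on, using a fixed-window counter keyed by caller. `FixedWindowCounter`, its members, and the injectable clock are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

/// <summary>Hypothetical fixed-window limiter keyed by caller identity.</summary>
public sealed class FixedWindowCounter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Func<DateTimeOffset> _clock;
    private readonly object _gate = new();
    private readonly Dictionary<string, (DateTimeOffset WindowStart, int Count)> _state = new();

    public FixedWindowCounter(int limit, TimeSpan window, Func<DateTimeOffset>? clock = null)
    {
        _limit = limit;
        _window = window;
        _clock = clock ?? (() => DateTimeOffset.UtcNow);
    }

    /// <summary>
    /// Returns true and the remaining quota if the caller is within the limit,
    /// or false with zero remaining if the current window is exhausted.
    /// </summary>
    public bool TryAcquire(string key, out int remaining)
    {
        var now = _clock();
        lock (_gate)
        {
            // Start a fresh window for unknown callers or when the old window has expired.
            if (!_state.TryGetValue(key, out var entry) || now - entry.WindowStart >= _window)
            {
                entry = (now, 0);
            }

            if (entry.Count >= _limit)
            {
                _state[key] = entry;
                remaining = 0;
                return false;
            }

            entry.Count++;
            _state[key] = entry;
            remaining = _limit - entry.Count;
            return true;
        }
    }
}
```

The `remaining` value corresponds to what a gateway would report in `X-RateLimit-Remaining`, and a `false` result maps to HTTP 429.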
---
## Timeout and Cancellation Tests
```csharp
namespace StellaOps.Router.Tests;
public class TimeoutAndCancellationTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public TimeoutAndCancellationTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Slow_Request_TimesOut()
{
// Arrange
using var client = new HttpClient
{
BaseAddress = new Uri(_fixture.GatewayBaseUrl),
Timeout = TimeSpan.FromSeconds(1)
};
// Act & Assert
await Assert.ThrowsAsync<TaskCanceledException>(
() => client.GetAsync("/api/test/slow"));
}
[Fact]
public async Task Cancelled_Request_PropagatesCancellation()
{
// Arrange
var client = _fixture.GatewayClient;
using var cts = new CancellationTokenSource();
// Act
var task = client.GetAsync("/api/test/slow", cts.Token);
await Task.Delay(100);
cts.Cancel();
// Assert
await Assert.ThrowsAsync<TaskCanceledException>(() => task);
}
}
```
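Both tests above expect `TaskCanceledException`, so they cannot tell a client-side timeout from a caller-initiated cancellation. On .NET 5 and later, `HttpClient` wraps a `TimeoutException` as the inner exception when its own `Timeout` elapses, which allows the two cases to be distinguished. The sketch below assumes that runtime; `NeverRespondingHandler` and `CancellationKinds` are hypothetical names, and the stub handler replaces the network so the example runs on its own.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

/// <summary>Stub handler that never answers until the request is cancelled.</summary>
public sealed class NeverRespondingHandler : HttpMessageHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        await Task.Delay(Timeout.InfiniteTimeSpan, cancellationToken).ConfigureAwait(false);
        return new HttpResponseMessage(HttpStatusCode.OK);
    }
}

public static class CancellationKinds
{
    /// <summary>
    /// True when the exception came from HttpClient.Timeout elapsing
    /// (.NET 5+ behaviour), false for cancellation through a caller's token.
    /// </summary>
    public static bool IsClientTimeout(Exception ex) =>
        ex is TaskCanceledException { InnerException: TimeoutException };
}
```

In a test, `Assert.ThrowsAsync<TaskCanceledException>` returns the exception, so the result can be passed to `IsClientTimeout` for the extra check.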
---
## Streaming and Large Payload Tests
```csharp
namespace StellaOps.Router.Tests;
public class StreamingTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public StreamingTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task LargeUpload_Succeeds()
{
// Arrange
var client = _fixture.GatewayClient;
var data = new byte[1024 * 1024]; // 1MB
Random.Shared.NextBytes(data);
var content = new ByteArrayContent(data);
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
// Act
var response = await client.PostAsync("/api/test/upload", content);
var result = await response.Content.ReadFromJsonAsync<UploadResponse>();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal(data.Length, result?.BytesReceived);
}
[Fact]
public async Task LargeDownload_Succeeds()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/stream");
var data = await response.Content.ReadAsByteArrayAsync();
// Assert
Assert.True(response.IsSuccessStatusCode);
Assert.Equal(1024 * 1024, data.Length);
}
private record UploadResponse(long BytesReceived);
}
```
---
## Error Handling Tests
```csharp
namespace StellaOps.Router.Tests;
public class ErrorHandlingTests : IClassFixture<EndToEndTestFixture>
{
private readonly EndToEndTestFixture _fixture;
public ErrorHandlingTests(EndToEndTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task Handler_Exception_Returns500()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/error");
// Assert
Assert.Equal(HttpStatusCode.InternalServerError, response.StatusCode);
}
[Fact]
public async Task Custom_StatusCode_IsPreserved()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/test/status/418");
// Assert
Assert.Equal((HttpStatusCode)418, response.StatusCode);
}
[Fact]
public async Task Error_Response_HasCorrectFormat()
{
// Arrange
var client = _fixture.GatewayClient;
// Act
var response = await client.GetAsync("/api/nonexistent");
var content = await response.Content.ReadFromJsonAsync<ErrorResponse>();
// Assert
Assert.NotNull(content?.Error);
}
private record ErrorResponse(string Error);
}
```
---
## YAML Configuration
```yaml
# Test configuration
Router:
Transports:
- Type: InMemory
Enabled: true
RateLimiting:
Enabled: true
DefaultTier: free
Tiers:
free:
RequestsPerMinute: 60
authenticated:
RequestsPerMinute: 600
Authentication:
Enabled: true
AllowAnonymous: false
TestMode: true
```
---
## Deliverables
1. `StellaOps.Router.Tests/EndToEndTestFixture.cs`
2. `StellaOps.Router.Tests/TestEndpointHandler.cs`
3. `StellaOps.Router.Tests/BasicRequestResponseTests.cs`
4. `StellaOps.Router.Tests/AuthenticationTests.cs`
5. `StellaOps.Router.Tests/RateLimitingTests.cs`
6. `StellaOps.Router.Tests/TimeoutAndCancellationTests.cs`
7. `StellaOps.Router.Tests/StreamingTests.cs`
8. `StellaOps.Router.Tests/ErrorHandlingTests.cs`
9. Mock implementations for Authority, Rate Limiter
10. CI integration configuration
---
## Next Step
Proceed to [Step 27: Reference Example & Migration Skeleton](27-Step.md) to create example implementations.

File diff suppressed because it is too large


@@ -1,755 +0,0 @@
# Step 28: Agent Process Guidelines
## Overview
This document provides comprehensive guidelines for AI agents (Claude, Copilot, etc.) implementing the Stella Router. It establishes conventions, patterns, and decision frameworks to ensure consistent, high-quality implementations across all phases.
## Goals
1. Define clear coding standards and patterns for Router implementation
2. Establish decision frameworks for common scenarios
3. Provide checklists for implementation quality
4. Document testing requirements and coverage expectations
5. Define commit and PR conventions
## Implementation Standards
### Code Organization
```
src/Router/
├── StellaOps.Router.Core/              # Core abstractions and contracts
│   ├── Abstractions/                   # Interfaces
│   ├── Configuration/                  # Config models
│   ├── Extensions/                     # Extension methods
│   └── Primitives/                     # Value types
├── StellaOps.Router.Gateway/           # Gateway implementation
│   ├── Routing/                        # Route matching
│   ├── Handlers/                       # Route handlers
│   ├── Pipeline/                       # Request pipeline
│   └── Middleware/                     # Gateway middleware
├── StellaOps.Router.Transport/         # Transport implementations
│   ├── InMemory/                       # In-process transport
│   ├── Tcp/                            # TCP transport
│   └── Tls/                            # TLS transport
├── StellaOps.Router.Microservice/      # Microservice SDK
│   ├── Hosting/                        # Host builder
│   ├── Endpoints/                      # Endpoint handling
│   └── Context/                        # Request context
├── StellaOps.Router.Security/          # Security components
│   ├── Jwt/                            # JWT validation
│   ├── Claims/                         # Claim hydration
│   └── RateLimiting/                   # Rate limiting
└── StellaOps.Router.Observability/     # Observability
    ├── Logging/                        # Structured logging
    ├── Metrics/                        # Prometheus metrics
    └── Tracing/                        # OpenTelemetry tracing
```
### Naming Conventions
| Element | Convention | Example |
|---------|------------|---------|
| Interfaces | `I` prefix, noun/adjective | `IRouteHandler`, `IConnectable` |
| Classes | PascalCase, noun | `JwtValidator`, `RouteTable` |
| Async methods | `Async` suffix | `ValidateTokenAsync`, `SendAsync` |
| Config classes | `Options` or `Configuration` suffix | `JwtValidationOptions` |
| Event handlers | `On` prefix | `OnConnectionEstablished` |
| Factory methods | `Create` prefix | `CreateHandler`, `CreateConnection` |
| Boolean properties | `Is`/`Has`/`Can` prefix | `IsValid`, `HasExpired`, `CanRetry` |
### File Structure
```csharp
// File: StellaOps.Router.Core/Abstractions/IRouteHandler.cs
// 1. License header (if required)
// 2. Using statements (sorted: System, Microsoft, Third-party, Internal)
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using StellaOps.Router.Core.Configuration;
// 3. Namespace (one per file, matches folder structure)
namespace StellaOps.Router.Core.Abstractions;
// 4. XML documentation
/// <summary>
/// Handles requests for a specific route type.
/// </summary>
/// <remarks>
/// Implementations must be thread-safe and support concurrent request handling.
/// </remarks>
public interface IRouteHandler
{
// 5. Interface members (properties, then methods)
/// <summary>
/// Gets the handler type identifier.
/// </summary>
string HandlerType { get; }
/// <summary>
/// Determines if this handler can process the given route.
/// </summary>
bool CanHandle(RouteConfiguration route);
/// <summary>
/// Processes an incoming request.
/// </summary>
Task<ResponsePayload> HandleAsync(
RequestPayload request,
RouteConfiguration route,
CancellationToken cancellationToken = default);
}
```
### Error Handling Patterns
```csharp
// Pattern 1: Result types for expected failures
public readonly struct Result<T>
{
public T? Value { get; }
public Error? Error { get; }
public bool IsSuccess => Error == null;
private Result(T? value, Error? error)
{
Value = value;
Error = error;
}
public static Result<T> Success(T value) => new(value, null);
public static Result<T> Failure(Error error) => new(default, error);
public Result<TNext> Map<TNext>(Func<T, TNext> map) =>
IsSuccess ? Result<TNext>.Success(map(Value!)) : Result<TNext>.Failure(Error!);
public async Task<Result<TNext>> MapAsync<TNext>(Func<T, Task<TNext>> map) =>
IsSuccess ? Result<TNext>.Success(await map(Value!)) : Result<TNext>.Failure(Error!);
}
public record Error(string Code, string Message, Exception? Inner = null);
// Usage
public async Task<Result<JwtClaims>> ValidateTokenAsync(string token)
{
try
{
var claims = await _validator.ValidateAsync(token);
return Result<JwtClaims>.Success(claims);
}
catch (SecurityTokenExpiredException ex)
{
return Result<JwtClaims>.Failure(new Error("TOKEN_EXPIRED", "JWT has expired", ex));
}
catch (SecurityTokenInvalidSignatureException ex)
{
return Result<JwtClaims>.Failure(new Error("INVALID_SIGNATURE", "JWT signature invalid", ex));
}
}
// Pattern 2: Exceptions for unexpected failures
public class RouterException : Exception
{
    public string ErrorCode { get; }
    public int StatusCode { get; }
    // The inner exception is forwarded to the base class so the original stack trace is preserved
    public RouterException(string errorCode, string message, int statusCode = 500, Exception? inner = null)
        : base(message, inner)
    {
        ErrorCode = errorCode;
        StatusCode = statusCode;
    }
}
public class ConfigurationException : RouterException
{
    public ConfigurationException(string message)
        : base("CONFIG_ERROR", message, 500) { }
}
public class TransportException : RouterException
{
    public TransportException(string message, Exception? inner = null)
        : base("TRANSPORT_ERROR", message, 503, inner) { }
}
```
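At the boundary where a `Result<T>` is finally consumed, a small helper can force callers to handle both outcomes. The following is a minimal sketch, not part of the guidelines above: `Match` and `ResultStatusMapper` are hypothetical names, the status codes are assumed values, and `Result<T>` and `Error` are repeated in reduced form only so the block compiles on its own.

```csharp
#nullable enable
using System;

// Reduced copies of the types above so the sketch compiles on its own.
public record Error(string Code, string Message, Exception? Inner = null);

public readonly struct Result<T>
{
    public T? Value { get; }
    public Error? Error { get; }
    public bool IsSuccess => Error == null;

    private Result(T? value, Error? error)
    {
        Value = value;
        Error = error;
    }

    public static Result<T> Success(T value) => new(value, null);
    public static Result<T> Failure(Error error) => new(default, error);
}

public static class ResultExtensions
{
    // Hypothetical helper: the caller must supply a branch for each outcome.
    public static TOut Match<T, TOut>(
        this Result<T> result,
        Func<T, TOut> onSuccess,
        Func<Error, TOut> onFailure) =>
        result.IsSuccess ? onSuccess(result.Value!) : onFailure(result.Error!);
}

public static class ResultStatusMapper
{
    // Illustrative mapping from error codes to HTTP status codes (assumed values).
    public static int ToStatusCode(Result<string> subject) =>
        subject.Match(
            onSuccess: _ => 200,
            onFailure: error => error.Code switch
            {
                "TOKEN_EXPIRED" or "INVALID_SIGNATURE" => 401,
                _ => 500
            });
}
```

Keeping the mapping in one place means expected failures never need to be rethrown as exceptions just to reach the HTTP layer.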
### Async Patterns
```csharp
// Pattern 1: CancellationToken propagation
public async Task<ResponsePayload> HandleAsync(
RequestPayload request,
CancellationToken cancellationToken = default)
{
// Always check at start of long operations
cancellationToken.ThrowIfCancellationRequested();
// Propagate to all async calls
var validated = await _validator.ValidateAsync(request, cancellationToken);
var enriched = await _enricher.EnrichAsync(validated, cancellationToken);
var response = await _handler.ProcessAsync(enriched, cancellationToken);
return response;
}
// Pattern 2: Timeout handling
public async Task<T> WithTimeoutAsync<T>(
Func<CancellationToken, Task<T>> operation,
TimeSpan timeout,
CancellationToken cancellationToken = default)
{
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(timeout);
try
{
return await operation(cts.Token);
}
catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
{
throw new TimeoutException($"Operation timed out after {timeout}");
}
}
// Pattern 3: Fire-and-forget with logging
public void FireAndForget(Func<Task> operation, ILogger logger, string operationName)
{
_ = Task.Run(async () =>
{
try
{
await operation();
}
catch (Exception ex)
{
logger.LogError(ex, "Fire-and-forget operation {Operation} failed", operationName);
}
});
}
```
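When the awaited operation does not accept a `CancellationToken` at all, the BCL method `Task.WaitAsync(TimeSpan, CancellationToken)` (available from .NET 6) is a shorter alternative to Pattern 2. The sketch below is illustrative; `TimeoutExamples`, `SlowOperationAsync`, and `RunWithLimitAsync` are hypothetical names. Note that `WaitAsync` only stops waiting and does not cancel the underlying work, so Pattern 2 is likely the better fit whenever the operation supports cancellation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutExamples
{
    // Hypothetical stand-in for an operation that offers no CancellationToken parameter.
    public static async Task<string> SlowOperationAsync(TimeSpan duration)
    {
        await Task.Delay(duration);
        return "done";
    }

    // Returns the operation's result, or "timeout" if the wait exceeds the limit.
    public static async Task<string> RunWithLimitAsync(
        TimeSpan duration,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Throws TimeoutException when the timeout elapses first, and
            // OperationCanceledException when the caller's token is cancelled.
            return await SlowOperationAsync(duration).WaitAsync(timeout, cancellationToken);
        }
        catch (TimeoutException)
        {
            return "timeout";
        }
    }
}
```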
### Dependency Injection Patterns
```csharp
// Pattern 1: Constructor injection with validation
public class JwtValidator : IJwtValidator
{
private readonly JwtValidationOptions _options;
private readonly IKeyProvider _keyProvider;
private readonly ILogger<JwtValidator> _logger;
public JwtValidator(
IOptions<JwtValidationOptions> options,
IKeyProvider keyProvider,
ILogger<JwtValidator> logger)
{
_options = options?.Value ?? throw new ArgumentNullException(nameof(options));
_keyProvider = keyProvider ?? throw new ArgumentNullException(nameof(keyProvider));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
ValidateOptions(_options);
}
private static void ValidateOptions(JwtValidationOptions options)
{
if (string.IsNullOrEmpty(options.Issuer))
throw new ConfigurationException("JWT issuer is required");
if (options.ClockSkew < TimeSpan.Zero)
throw new ConfigurationException("Clock skew cannot be negative");
}
}
// Pattern 2: Factory registration for complex objects
public static class ServiceCollectionExtensions
{
public static IServiceCollection AddStellaRouter(
this IServiceCollection services,
Action<RouterOptions> configure)
{
services.Configure(configure);
// Core services
services.AddSingleton<IRouteTable, RouteTable>();
services.AddSingleton<IRequestPipeline, RequestPipeline>();
// Keyed services for handlers
services.AddKeyedSingleton<IRouteHandler, MicroserviceHandler>("microservice");
services.AddKeyedSingleton<IRouteHandler, GraphQLHandler>("graphql");
services.AddKeyedSingleton<IRouteHandler, ReverseProxyHandler>("proxy");
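        // Factory for route handler resolution.
        // Keyed registrations are not returned by GetServices<T>(), so each key is resolved explicitly.
        services.AddSingleton<IRouteHandlerFactory>(sp => new RouteHandlerFactory(
            new[] { "microservice", "graphql", "proxy" }
                .Select(key => sp.GetRequiredKeyedService<IRouteHandler>(key))
                .ToDictionary(h => h.HandlerType)));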
return services;
}
}
// Pattern 3: Scoped services for request context
public static class RequestScopeExtensions
{
public static IServiceCollection AddRequestScope(this IServiceCollection services)
{
services.AddScoped<IRequestContext, RequestContext>();
services.AddScoped(sp => sp.GetRequiredService<IRequestContext>().User);
services.AddScoped(sp => sp.GetRequiredService<IRequestContext>().CorrelationId);
return services;
}
}
```
## Decision Framework
### When to Create New Types vs. Reuse
| Scenario | Decision | Rationale |
|----------|----------|-----------|
| Similar data, different context | Create new type | Type safety, clear intent |
| Same data, same context | Reuse type | DRY, reduce cognitive load |
| Third-party type | Create wrapper | Abstraction, testability |
| Config vs. runtime | Separate types | Immutability guarantees |
```csharp
// Example: Separate types for config vs runtime
public record RouteConfiguration(
string Path,
string Method,
string HandlerType,
Dictionary<string, string> Metadata);
public class CompiledRoute
{
public RouteConfiguration Config { get; }
public Regex PathPattern { get; }
public IRouteHandler Handler { get; }
// Runtime-computed fields
}
```
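To make the config/runtime split concrete, here is a minimal sketch of compiling a configuration record into a runtime type once at startup. The `{name}` template syntax, the case-insensitive method comparison, and the names `RouteCompiler` and `CompiledRouteSketch` are assumptions for illustration; the actual path-matching rules may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Configuration type, reduced from the record above.
public record RouteConfiguration(string Path, string Method, string HandlerType);

// Hypothetical runtime type: holds the precomputed matcher next to its configuration.
public sealed class CompiledRouteSketch
{
    public RouteConfiguration Config { get; }
    public Regex PathPattern { get; }

    public CompiledRouteSketch(RouteConfiguration config, Regex pathPattern)
    {
        Config = config;
        PathPattern = pathPattern;
    }

    // Assumption: methods are compared case-insensitively.
    public bool IsMatch(string method, string path) =>
        string.Equals(Config.Method, method, StringComparison.OrdinalIgnoreCase)
        && PathPattern.IsMatch(path);
}

public static class RouteCompiler
{
    // Assumes templates such as "/api/users/{id}", where {name} matches one path segment.
    public static CompiledRouteSketch Compile(RouteConfiguration config)
    {
        var segments = config.Path.Split('/');
        var parts = new List<string>(segments.Length);
        foreach (var segment in segments)
        {
            var isParameter = segment.Length > 2 && segment[0] == '{' && segment[^1] == '}';
            parts.Add(isParameter
                ? $"(?<{segment[1..^1]}>[^/]+)" // named capture for a parameter
                : Regex.Escape(segment));       // literal segment
        }

        var pattern = "^" + string.Join("/", parts) + "$";
        return new CompiledRouteSketch(
            config,
            new Regex(pattern, RegexOptions.Compiled | RegexOptions.CultureInvariant));
    }
}
```

The allocation-heavy work (splitting and regex construction) happens once per route, which is why it belongs in the runtime type rather than in the configuration record.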
### When to Use Interfaces vs. Abstract Classes
| Use Interface | Use Abstract Class |
|---------------|-------------------|
| Multiple inheritance needed | Shared implementation |
| Contract-only definition | Template method pattern |
| Third-party implementation | Internal hierarchy only |
| Mocking/testing priority | Code reuse priority |
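The two columns can be combined: publish an interface as the contract and offer an abstract base class as an optional convenience. The sketch below shows the template method pattern under that arrangement; all type names are hypothetical and exist only for illustration.

```csharp
using System;

// Contract: what callers and mocks depend on.
public interface IFrameValidator
{
    bool Validate(byte[] payload, out string error);
}

// Optional base class: shared checks live here, subclasses supply only the specific rule.
public abstract class FrameValidatorBase : IFrameValidator
{
    // Template method: fixed sequence with one overridable step.
    public bool Validate(byte[] payload, out string error)
    {
        if (payload is null || payload.Length == 0)
        {
            error = "payload is empty";
            return false;
        }

        return ValidateCore(payload, out error);
    }

    protected abstract bool ValidateCore(byte[] payload, out string error);
}

public sealed class MaxSizeValidator : FrameValidatorBase
{
    private readonly int _maxBytes;

    public MaxSizeValidator(int maxBytes) => _maxBytes = maxBytes;

    protected override bool ValidateCore(byte[] payload, out string error)
    {
        error = payload.Length <= _maxBytes ? string.Empty : $"payload exceeds {_maxBytes} bytes";
        return error.Length == 0;
    }
}
```

Third parties can still implement `IFrameValidator` directly, while internal validators inherit the shared empty-payload check.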
### Logging Level Guidelines
| Level | When to Use | Example |
|-------|-------------|---------|
| `Trace` | Internal flow details | `"Route matching attempt for {Path}"` |
| `Debug` | Diagnostic information | `"Cache hit for key {Key}"` |
| `Information` | Significant events | `"Request completed: {Method} {Path} → {Status}"` |
| `Warning` | Recoverable issues | `"Rate limit approaching: {Current}/{Max}"` |
| `Error` | Failures requiring attention | `"Failed to connect to Authority: {Error}"` |
| `Critical` | System-wide failures | `"Configuration invalid, router cannot start"` |
```csharp
// Structured logging patterns
_logger.LogInformation(
"Request processed: {Method} {Path} → {StatusCode} in {ElapsedMs}ms",
request.Method,
request.Path,
response.StatusCode,
stopwatch.ElapsedMilliseconds);
// Use LoggerMessage for high-performance paths
private static readonly Action<ILogger, string, string, int, long, Exception?> LogRequestComplete =
LoggerMessage.Define<string, string, int, long>(
LogLevel.Information,
new EventId(1001, "RequestComplete"),
"Request processed: {Method} {Path} → {StatusCode} in {ElapsedMs}ms");
// Usage
LogRequestComplete(_logger, method, path, statusCode, elapsed, null);
```
## Implementation Checklists
### Before Starting a Component
- [ ] Read the step documentation thoroughly
- [ ] Understand dependencies on previous steps
- [ ] Review related existing code patterns
- [ ] Identify configuration requirements
- [ ] Plan test coverage strategy
### During Implementation
- [ ] Follow naming conventions
- [ ] Add XML documentation to public APIs
- [ ] Implement `IDisposable`/`IAsyncDisposable` where needed
- [ ] Add structured logging at appropriate levels
- [ ] Handle cancellation tokens throughout
- [ ] Use result types for expected failures
- [ ] Validate all configuration at startup
### Before Marking Complete
- [ ] All public types have XML documentation
- [ ] Unit tests meet the coverage target for the component type (70% to 90%, see Unit Test Coverage Targets)
- [ ] Integration tests cover happy path + error cases
- [ ] No compiler warnings
- [ ] Code passes all linting rules
- [ ] Configuration is validated
- [ ] README/documentation updated if needed
### Pull Request Checklist
- [ ] PR title follows convention: `feat(router): description`
- [ ] Description explains what and why
- [ ] All tests pass
- [ ] No unrelated changes
- [ ] Breaking changes documented
- [ ] Reviewable size (<500 lines preferred)
## Testing Requirements
### Unit Test Coverage Targets
| Component Type | Target Coverage |
|---------------|-----------------|
| Core logic | 90% |
| Handlers | 85% |
| Middleware | 80% |
| Configuration | 75% |
| Extensions | 70% |
### Test Structure
```csharp
// Test file naming: {ClassName}Tests.cs
// Test method naming: {Method}_{Scenario}_{ExpectedResult}
public class JwtValidatorTests
{
private readonly JwtValidator _sut; // System Under Test
private readonly Mock<IKeyProvider> _keyProviderMock;
private readonly Mock<ILogger<JwtValidator>> _loggerMock;
public JwtValidatorTests()
{
_keyProviderMock = new Mock<IKeyProvider>();
_loggerMock = new Mock<ILogger<JwtValidator>>();
var options = Options.Create(new JwtValidationOptions
{
Issuer = "https://auth.example.com",
Audience = "stella-router"
});
_sut = new JwtValidator(options, _keyProviderMock.Object, _loggerMock.Object);
}
[Fact]
public async Task ValidateAsync_ValidToken_ReturnsSuccessWithClaims()
{
// Arrange
var token = GenerateValidToken();
_keyProviderMock
.Setup(x => x.GetSigningKeyAsync(It.IsAny<string>()))
.ReturnsAsync(TestKeys.ValidKey);
// Act
var result = await _sut.ValidateAsync(token);
// Assert
Assert.True(result.IsSuccess);
Assert.NotNull(result.Value);
Assert.Equal("test-user", result.Value.Subject);
}
[Fact]
public async Task ValidateAsync_ExpiredToken_ReturnsFailure()
{
// Arrange
var token = GenerateExpiredToken();
// Act
var result = await _sut.ValidateAsync(token);
// Assert
Assert.False(result.IsSuccess);
Assert.Equal("TOKEN_EXPIRED", result.Error!.Code);
}
[Theory]
[InlineData(null)]
[InlineData("")]
[InlineData(" ")]
public async Task ValidateAsync_NullOrEmptyToken_ReturnsFailure(string? token)
{
// Act
var result = await _sut.ValidateAsync(token!);
// Assert
Assert.False(result.IsSuccess);
Assert.Equal("INVALID_TOKEN", result.Error!.Code);
}
}
```
### Integration Test Patterns
```csharp
public class RouterIntegrationTests : IClassFixture<RouterTestFixture>
{
private readonly RouterTestFixture _fixture;
public RouterIntegrationTests(RouterTestFixture fixture)
{
_fixture = fixture;
}
[Fact]
public async Task EndToEnd_AuthenticatedRequest_ReturnsSuccess()
{
// Arrange
var client = _fixture.CreateAuthenticatedClient(claims: new()
{
["sub"] = "test-user",
["role"] = "admin"
});
// Act
var response = await client.GetAsync("/api/users/123");
// Assert
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
var user = await response.Content.ReadFromJsonAsync<UserDto>();
Assert.NotNull(user);
Assert.Equal("123", user.Id);
}
}
// Test fixture
public class RouterTestFixture : IAsyncLifetime
{
private IHost? _gatewayHost;
private IHost? _microserviceHost;
public async Task InitializeAsync()
{
// Start microservice
_microserviceHost = await CreateMicroserviceHost();
await _microserviceHost.StartAsync();
// Start gateway
_gatewayHost = await CreateGatewayHost();
await _gatewayHost.StartAsync();
}
public async Task DisposeAsync()
{
if (_gatewayHost != null)
await _gatewayHost.StopAsync();
if (_microserviceHost != null)
await _microserviceHost.StopAsync();
_gatewayHost?.Dispose();
_microserviceHost?.Dispose();
}
public HttpClient CreateAuthenticatedClient(Dictionary<string, object> claims)
{
var token = GenerateTestToken(claims);
var client = new HttpClient
{
BaseAddress = new Uri("http://localhost:5000")
};
client.DefaultRequestHeaders.Authorization =
new AuthenticationHeaderValue("Bearer", token);
return client;
}
}
```
## Git and PR Conventions
### Branch Naming
```
feat/router-<step>-<description>
fix/router-<issue-number>
refactor/router-<description>
test/router-<description>
docs/router-<description>
```
### Commit Messages
```
<type>(<scope>): <description>

[optional body]

[optional footer]
```
Types: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
Examples:
```
feat(router): implement JWT validation with per-endpoint keys

- Add JwtValidator with configurable key sources
- Support RS256 and ES256 algorithms
- Add JWKS endpoint caching with TTL

Closes #123
```
### PR Template
```markdown
## Summary
Brief description of what this PR does.
## Changes
- Change 1
- Change 2
- Change 3
## Testing
- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Manual testing performed
## Checklist
- [ ] Code follows project conventions
- [ ] Documentation updated
- [ ] No breaking changes (or documented if any)
- [ ] All tests pass
```
## Common Pitfalls to Avoid
### Performance
```csharp
// ❌ BAD: Allocating in hot path
public bool MatchRoute(string path)
{
var parts = path.Split('/'); // Allocation
// ...
}
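// ✅ GOOD: Use Span for parsing
public bool MatchRoute(ReadOnlySpan<char> path)
{
    // Zero-allocation parsing: on .NET 9+, Split on a span yields Range values, not strings
    foreach (var range in path.Split('/'))
    {
        var segment = path[range]; // slice of the original span, no copy
        // ...
    }
}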
// ❌ BAD: Synchronous I/O blocking async context
public async Task ProcessAsync()
{
var config = File.ReadAllText("config.json"); // Blocking!
}
// ✅ GOOD: Async all the way
public async Task ProcessAsync()
{
var config = await File.ReadAllTextAsync("config.json");
}
```
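Segment parsing over a span can also be written with `IndexOf` and slicing, which relies only on long-standing span APIs. This is a minimal sketch; `PathSegments`, `CountSegments`, and `FirstSegmentEquals` are hypothetical names, and skipping empty segments is an assumption about the desired behavior.

```csharp
using System;

public static class PathSegments
{
    // Counts non-empty segments without allocating strings or arrays.
    public static int CountSegments(ReadOnlySpan<char> path)
    {
        var count = 0;
        while (!path.IsEmpty)
        {
            var slash = path.IndexOf('/');
            var segment = slash < 0 ? path : path[..slash];
            if (!segment.IsEmpty)
                count++;

            // Advance past the separator, or finish when none is left.
            path = slash < 0 ? ReadOnlySpan<char>.Empty : path[(slash + 1)..];
        }

        return count;
    }

    // True when the first non-empty segment equals the expected value (ordinal comparison).
    public static bool FirstSegmentEquals(ReadOnlySpan<char> path, ReadOnlySpan<char> expected)
    {
        while (!path.IsEmpty)
        {
            var slash = path.IndexOf('/');
            var segment = slash < 0 ? path : path[..slash];
            if (!segment.IsEmpty)
                return segment.SequenceEqual(expected);

            path = slash < 0 ? ReadOnlySpan<char>.Empty : path[(slash + 1)..];
        }

        return false;
    }
}
```

Strings convert implicitly to `ReadOnlySpan<char>`, so callers can pass `"/api/users/123"` directly without an extra copy.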
### Thread Safety
```csharp
// ❌ BAD: Non-thread-safe collection
private readonly Dictionary<string, Route> _routes = new();
public void AddRoute(string key, Route route)
{
_routes[key] = route; // Not thread-safe!
}
// ✅ GOOD: Thread-safe collection
private readonly ConcurrentDictionary<string, Route> _routes = new();
public void AddRoute(string key, Route route)
{
_routes[key] = route; // Thread-safe
}
// ✅ GOOD: Immutable update
private ImmutableDictionary<string, Route> _routes =
ImmutableDictionary<string, Route>.Empty;
public void AddRoute(string key, Route route)
{
ImmutableInterlocked.AddOrUpdate(ref _routes, key, route, (_, _) => route);
}
```
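A thread-safe collection protects individual calls, not sequences of calls. The sketch below contrasts a check-then-act sequence with the atomic `TryAdd`; `RouteRegistry` and the reduced `Route` record are hypothetical and only illustrate the difference.

```csharp
#nullable enable
using System.Collections.Concurrent;

// Reduced stand-in for the route type used above.
public sealed record Route(string Path);

public sealed class RouteRegistry
{
    private readonly ConcurrentDictionary<string, Route> _routes = new();

    // Racy: another thread can add the key between ContainsKey and the indexer write,
    // so two callers may both believe they registered the route.
    public bool RegisterRacy(string key, Route route)
    {
        if (_routes.ContainsKey(key))
            return false;

        _routes[key] = route;
        return true;
    }

    // Atomic: TryAdd performs the check and the insert as one operation.
    public bool Register(string key, Route route) => _routes.TryAdd(key, route);

    public bool TryGet(string key, out Route? route) => _routes.TryGetValue(key, out route);
}
```

The same reasoning applies to `GetOrAdd` and `AddOrUpdate`: prefer the single atomic call over composing several individually safe ones.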
### Resource Management
```csharp
// ❌ BAD: Not disposing resources
public async Task SendAsync(byte[] data)
{
var client = new TcpClient();
await client.ConnectAsync("host", 9100);
await client.GetStream().WriteAsync(data);
// client never disposed!
}
// ✅ GOOD: Proper disposal
public async Task SendAsync(byte[] data)
{
using var client = new TcpClient();
await client.ConnectAsync("host", 9100);
await using var stream = client.GetStream();
await stream.WriteAsync(data);
}
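// ✅ GOOD: Connection pooling
public class ConnectionPool : IDisposable
{
    // Bounded (example capacity) so Return() disposes surplus connections instead of growing without limit
    private readonly Channel<TcpClient> _pool = Channel.CreateBounded<TcpClient>(16);
    public async Task<TcpClient> RentAsync()
    {
        if (_pool.Reader.TryRead(out var client))
            return client;
        return await CreateNewConnectionAsync();
    }
    public void Return(TcpClient client)
    {
        if (!_pool.Writer.TryWrite(client))
            client.Dispose();
    }
    public void Dispose()
    {
        // Stop accepting returns, then dispose every pooled connection
        _pool.Writer.TryComplete();
        while (_pool.Reader.TryRead(out var client))
            client.Dispose();
    }
}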
```
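The implementation checklist asks for `IDisposable`/`IAsyncDisposable` where needed. For a type that owns a resource with asynchronous cleanup, one possible shape is sketched below; `FrameWriter` is a hypothetical type, and the flush-before-dispose order is an assumption about what the owned stream requires.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical type that owns a stream and flushes it before release.
public sealed class FrameWriter : IAsyncDisposable, IDisposable
{
    private readonly Stream _stream;
    private int _disposed; // 0 = live, 1 = disposed

    public FrameWriter(Stream stream) =>
        _stream = stream ?? throw new ArgumentNullException(nameof(stream));

    public async Task WriteAsync(byte[] frame, CancellationToken cancellationToken = default)
    {
        if (Volatile.Read(ref _disposed) == 1)
            throw new ObjectDisposedException(nameof(FrameWriter));

        await _stream.WriteAsync(frame, cancellationToken);
    }

    public async ValueTask DisposeAsync()
    {
        // Interlocked.Exchange makes disposal idempotent under concurrent calls.
        if (Interlocked.Exchange(ref _disposed, 1) == 1)
            return;

        await _stream.FlushAsync();
        await _stream.DisposeAsync();
    }

    public void Dispose()
    {
        if (Interlocked.Exchange(ref _disposed, 1) == 1)
            return;

        _stream.Flush();
        _stream.Dispose();
    }
}
```

Callers would typically write `await using var writer = new FrameWriter(stream);` so the asynchronous path is taken automatically.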
## Deliverables
| Artifact | Purpose |
|----------|---------|
| This document | Agent implementation guidelines |
| Code templates | Consistent starting points |
| Checklists | Quality gates |
| Test patterns | Consistent testing approach |
## Next Step
[Step 29: Integration Testing & CI →](29-Step.md)

File diff suppressed because it is too large

@@ -1,62 +0,0 @@
# StellaOps Router
The StellaOps Router is the internal infrastructure through which microservices communicate via a central gateway.
## Overview
The router provides:
- **Gateway WebService** (`StellaOps.Gateway.WebService`): HTTP ingress service that routes requests to microservices
- **Microservice SDK** (`StellaOps.Microservice`): SDK for building microservices that connect to the router
- **Transport Plugins**: Multiple transport options (TCP, TLS, UDP, RabbitMQ, InMemory for testing)
- **Claims-based Authorization**: Using `RequiringClaims` instead of role-based access
## Key Documents
| Document | Purpose |
|----------|---------|
| [specs.md](./specs.md) | **Canonical specification** - READ FIRST |
| [implplan.md](./implplan.md) | High-level implementation plan |
| [SPRINT_INDEX.md](./SPRINT_INDEX.md) | Sprint overview and dependency graph |
## Solution Structure
```
StellaOps.Router.slnx
├── src/__Libraries/
│ ├── StellaOps.Router.Common/ # Shared types, enums, interfaces
│ ├── StellaOps.Router.Config/ # Router configuration models
│ ├── StellaOps.Microservice/ # Microservice SDK
│ └── StellaOps.Microservice.SourceGen/ # Build-time endpoint discovery
├── src/Gateway/
│ └── StellaOps.Gateway.WebService/ # HTTP gateway service
└── tests/
├── StellaOps.Router.Common.Tests/
├── StellaOps.Gateway.WebService.Tests/
└── StellaOps.Microservice.Tests/
```
## Building
```bash
# Build the router solution
dotnet build StellaOps.Router.slnx
# Run tests
dotnet test StellaOps.Router.slnx
```
## Invariants (Non-Negotiable)
The specification ([specs.md](./specs.md)) defines these invariants:
- **Method + Path** is the endpoint identity
- **Strict semver** for version matching
- **Region from GatewayNodeConfig.Region** (never from headers/host)
- **No HTTP transport** between gateway and microservices
- **RequiringClaims** (not AllowedRoles) for authorization
- **Opaque body handling** (router doesn't interpret payloads)
## Status
Currently in development. See [SPRINT_INDEX.md](./SPRINT_INDEX.md) for implementation progress.

@@ -1,121 +0,0 @@
# Sprint 7000-0001-0001 · Router Foundation · Project Skeleton
## Topic & Scope
Phase 1 of Router implementation: establish the project skeleton with all required directories, solution files, and empty stubs. This sprint creates the structural foundation that all subsequent router sprints depend on.
**Goal:** Get a clean, compiling skeleton in place that matches the spec and folder conventions, with zero real logic and minimal dependencies.
**Working directories:**
- `src/__Libraries/StellaOps.Router.Common/`
- `src/__Libraries/StellaOps.Router.Config/`
- `src/__Libraries/StellaOps.Microservice/`
- `src/__Libraries/StellaOps.Microservice.SourceGen/`
- `src/Gateway/StellaOps.Gateway.WebService/`
- `tests/StellaOps.Router.Common.Tests/`
- `tests/StellaOps.Gateway.WebService.Tests/`
- `tests/StellaOps.Microservice.Tests/`
**Isolation strategy:** Router uses a separate `StellaOps.Router.slnx` solution file to enable fully independent building and testing. This prevents any impact on the main `StellaOps.sln` until the migration phase.
## Dependencies & Concurrency
- **Upstream:** None. This is the first router sprint.
- **Downstream:** All other router sprints depend on this skeleton.
- **Parallel work:** None possible until this sprint completes.
- **Cross-module impact:** None. All work is in new directories.
## Documentation Prerequisites
- `docs/router/specs.md` (canonical specification - READ FIRST)
- `docs/router/implplan.md` (implementation plan overview)
- `docs/router/01-Step.md` (detailed task breakdown for this sprint)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Invariants (from specs.md)
Before coding, acknowledge these non-negotiables:
- Method + Path identity for endpoints
- Strict semver for versions
- Region from `GatewayNodeConfig.Region` (no host/header derivation)
- No HTTP transport for microservice-to-router communications
- Single connection carrying HELLO + HEARTBEAT + REQUEST/RESPONSE + CANCEL
- Router treats body as opaque bytes/streams
- `RequiringClaims` replaces any form of `AllowedRoles`
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | SKEL-001 | DONE | Create directory structure (`src/__Libraries/`, `src/Gateway/`, `tests/`) | repo root |
| 2 | SKEL-002 | DONE | Create `StellaOps.Router.slnx` solution file at repo root | repo root |
| 3 | SKEL-003 | DONE | Create `StellaOps.Router.Common` classlib project | `src/__Libraries/StellaOps.Router.Common/` |
| 4 | SKEL-004 | DONE | Create `StellaOps.Router.Config` classlib project | `src/__Libraries/StellaOps.Router.Config/` |
| 5 | SKEL-005 | DONE | Create `StellaOps.Microservice` classlib project | `src/__Libraries/StellaOps.Microservice/` |
| 6 | SKEL-006 | DONE | Create `StellaOps.Microservice.SourceGen` classlib stub | `src/__Libraries/StellaOps.Microservice.SourceGen/` |
| 7 | SKEL-007 | DONE | Create `StellaOps.Gateway.WebService` webapi project | `src/Gateway/StellaOps.Gateway.WebService/` |
| 8 | SKEL-008 | DONE | Create xunit test projects for Common, Gateway, Microservice | `tests/` |
| 9 | SKEL-009 | DONE | Wire project references per dependency graph | all projects |
| 10 | SKEL-010 | DONE | Add common settings (net10.0, nullable, LangVersion) to each csproj | all projects |
| 11 | SKEL-011 | DONE | Stub empty placeholder types in each project (no logic) | all projects |
| 12 | SKEL-012 | DONE | Add dummy smoke tests so CI passes | `tests/` |
| 13 | SKEL-013 | DONE | Verify `dotnet build StellaOps.Router.slnx` succeeds | repo root |
| 14 | SKEL-014 | DONE | Verify `dotnet test StellaOps.Router.slnx` passes | repo root |
| 15 | SKEL-015 | DONE | Update `docs/router/README.md` with solution overview | `docs/router/` |
## Project Reference Graph
```
StellaOps.Gateway.WebService
├── StellaOps.Router.Common
└── StellaOps.Router.Config
└── StellaOps.Router.Common
StellaOps.Microservice
└── StellaOps.Router.Common
StellaOps.Microservice.SourceGen
(no references yet - stub only)
Test projects reference their corresponding main projects.
```
## Stub Types to Create
### StellaOps.Router.Common
- Enums: `TransportType`, `FrameType`, `InstanceHealthStatus`
- Models: `ClaimRequirement`, `EndpointDescriptor`, `InstanceDescriptor`, `ConnectionState`, `Frame`
- Interfaces: `IGlobalRoutingState`, `IRoutingPlugin`, `ITransportServer`, `ITransportClient`
### StellaOps.Router.Config
- `RouterConfig`, `ServiceConfig`, `PayloadLimits` (property-only classes)
### StellaOps.Microservice
- `StellaMicroserviceOptions`, `RouterEndpointConfig`
- `ServiceCollectionExtensions.AddStellaMicroservice()` (empty body)
### StellaOps.Gateway.WebService
- `GatewayNodeConfig` with Region, NodeId, Environment
- Minimal `Program.cs` that builds and runs (no logic)
## Exit Criteria
Before marking this sprint DONE:
1. [x] `dotnet build StellaOps.Router.slnx` succeeds with zero warnings
2. [x] `dotnet test StellaOps.Router.slnx` passes (even with dummy tests)
3. [x] All project names match spec: `StellaOps.Gateway.WebService`, `StellaOps.Router.Common`, `StellaOps.Router.Config`, `StellaOps.Microservice`
4. [x] No real business logic exists (no transport logic, no routing decisions, no YAML parsing)
5. [x] `docs/router/README.md` exists and points to `specs.md`
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2024-12-04 | Sprint completed: all skeleton projects created, build and tests passing | Claude |
## Decisions & Risks
- Router uses a separate solution file (`StellaOps.Router.slnx`) to enable isolated development. This will be merged into main `StellaOps.sln` during the migration phase.
- Target framework is `net10.0` to match the rest of StellaOps.
- `StellaOps.Microservice.SourceGen` is created as a plain classlib for now; it will be converted to a Source Generator project in a later sprint.

@@ -1,157 +0,0 @@
# Sprint 7000-0001-0002 · Router Foundation · Common Library Models
## Topic & Scope
Phase 2 of Router implementation: implement the shared core model in `StellaOps.Router.Common`. This sprint makes Common the single, stable contract layer that Gateway, Microservice SDK, and transports all depend on.
**Goal:** Lock down the domain vocabulary. Implement all data types and interfaces with **no behavior** - just shapes that match `specs.md`.
**Working directory:** `src/__Libraries/StellaOps.Router.Common/`
**Key principle:** Changes to `StellaOps.Router.Common` after this sprint must be rare and reviewed. Everything else depends on it.
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0001 (skeleton must be complete)
- **Downstream:** All other router sprints depend on these contracts
- **Parallel work:** None possible until this sprint completes
- **Cross-module impact:** None. All work is in `StellaOps.Router.Common`
## Documentation Prerequisites
- `docs/router/specs.md` (canonical specification - READ FIRST, sections 2-13)
- `docs/router/02-Step.md` (detailed task breakdown for this sprint)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | CMN-001 | DONE | Create `/Enums/TransportType.cs` with `[Udp, Tcp, Certificate, RabbitMq]` | No HTTP type per spec |
| 2 | CMN-002 | DONE | Create `/Enums/FrameType.cs` with Hello, Heartbeat, EndpointsUpdate, Request, RequestStreamData, Response, ResponseStreamData, Cancel | |
| 3 | CMN-003 | DONE | Create `/Enums/InstanceHealthStatus.cs` with Unknown, Healthy, Degraded, Draining, Unhealthy | |
| 4 | CMN-010 | DONE | Create `/Models/ClaimRequirement.cs` with Type (required) and Value (optional) | Replaces AllowedRoles |
| 5 | CMN-011 | DONE | Create `/Models/EndpointDescriptor.cs` with ServiceName, Version, Method, Path, DefaultTimeout, SupportsStreaming, RequiringClaims | |
| 6 | CMN-012 | DONE | Create `/Models/InstanceDescriptor.cs` with InstanceId, ServiceName, Version, Region | |
| 7 | CMN-013 | DONE | Create `/Models/ConnectionState.cs` with ConnectionId, Instance, Status, LastHeartbeatUtc, AveragePingMs, TransportType, Endpoints | |
| 8 | CMN-014 | DONE | Create `/Models/RoutingContext.cs` matching spec (neutral context, no ASP.NET dependency) | |
| 9 | CMN-015 | DONE | Create `/Models/RoutingDecision.cs` with Endpoint, Connection, TransportType, EffectiveTimeout | |
| 10 | CMN-016 | DONE | Create `/Models/PayloadLimits.cs` with MaxRequestBytesPerCall, MaxRequestBytesPerConnection, MaxAggregateInflightBytes | |
| 11 | CMN-020 | DONE | Create `/Models/Frame.cs` with Type, CorrelationId, Payload | |
| 12 | CMN-021 | DONE | Create `/Models/HelloPayload.cs` with InstanceDescriptor and list of EndpointDescriptors | |
| 13 | CMN-022 | DONE | Create `/Models/HeartbeatPayload.cs` with InstanceId, Status, metrics | |
| 14 | CMN-023 | DONE | Create `/Models/CancelPayload.cs` with Reason | |
| 15 | CMN-030 | DONE | Create `/Abstractions/IGlobalRoutingState.cs` interface | |
| 16 | CMN-031 | DONE | Create `/Abstractions/IRoutingPlugin.cs` interface | |
| 17 | CMN-032 | DONE | Create `/Abstractions/ITransportServer.cs` interface | |
| 18 | CMN-033 | DONE | Create `/Abstractions/ITransportClient.cs` interface | |
| 19 | CMN-034 | DONE | Create `/Abstractions/IRegionProvider.cs` interface (optional, if spec requires) | |
| 20 | CMN-040 | DONE | Write shape tests for EndpointDescriptor, ConnectionState | Already covered in existing tests |
| 21 | CMN-041 | DONE | Write enum completeness tests for FrameType | |
| 22 | CMN-042 | DONE | Verify Common compiles with zero warnings (nullable enabled) | |
| 23 | CMN-043 | DONE | Verify Common only references BCL (no ASP.NET, no serializers) | |
## File Layout
```
/src/__Libraries/StellaOps.Router.Common/
/Enums/
TransportType.cs
FrameType.cs
InstanceHealthStatus.cs
/Models/
ClaimRequirement.cs
EndpointDescriptor.cs
InstanceDescriptor.cs
ConnectionState.cs
RoutingContext.cs
RoutingDecision.cs
PayloadLimits.cs
Frame.cs
HelloPayload.cs
HeartbeatPayload.cs
CancelPayload.cs
/Abstractions/
IGlobalRoutingState.cs
IRoutingPlugin.cs
ITransportClient.cs
ITransportServer.cs
IRegionProvider.cs
```
## Interface Signatures (from specs.md)
### IGlobalRoutingState
```csharp
public interface IGlobalRoutingState
{
EndpointDescriptor? ResolveEndpoint(string method, string path);
IReadOnlyList<ConnectionState> GetConnectionsFor(
string serviceName, string version, string method, string path);
}
```
### IRoutingPlugin
```csharp
public interface IRoutingPlugin
{
Task<RoutingDecision?> ChooseInstanceAsync(
RoutingContext context, CancellationToken cancellationToken);
}
```
### ITransportServer
```csharp
public interface ITransportServer
{
Task StartAsync(CancellationToken cancellationToken);
Task StopAsync(CancellationToken cancellationToken);
}
```
### ITransportClient
```csharp
public interface ITransportClient
{
Task<Frame> SendRequestAsync(
ConnectionState connection, Frame requestFrame,
TimeSpan timeout, CancellationToken cancellationToken);
Task SendCancelAsync(
ConnectionState connection, Guid correlationId, string? reason = null);
Task SendStreamingAsync(
ConnectionState connection, Frame requestHeader, Stream requestBody,
Func<Stream, Task> readResponseBody, PayloadLimits limits,
CancellationToken cancellationToken);
}
```
## Design Constraints
1. **No behavior:** Only shapes - no LINQ-heavy methods, no routing algorithms, no network code
2. **No serialization:** No JSON/MessagePack references; Common only defines shapes
3. **Immutability preferred:** Use `init` properties for descriptors; `ConnectionState` health fields may be mutable
4. **BCL only:** No ASP.NET or third-party package dependencies
5. **Nullable enabled:** All code must compile with zero nullable warnings
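Constraint 3 can be sketched as follows. Property names follow the delivery tracker; the property types, the `required` modifier (C# 11 or later), and the default timeout are assumptions, since the corresponding `specs.md` section is not reproduced here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public sealed class ClaimRequirement
{
    public required string Type { get; init; } // required per the tracker
    public string? Value { get; init; }        // optional per the tracker
}

public sealed class EndpointDescriptor
{
    public required string ServiceName { get; init; }
    public required string Version { get; init; }
    public required string Method { get; init; }
    public required string Path { get; init; }

    // Assumed default; the real value comes from the specification.
    public TimeSpan DefaultTimeout { get; init; } = TimeSpan.FromSeconds(30);

    public bool SupportsStreaming { get; init; }

    public IReadOnlyList<ClaimRequirement> RequiringClaims { get; init; } =
        Array.Empty<ClaimRequirement>();
}
```

With `init` accessors the descriptor can only be populated in an object initializer, so a published `EndpointDescriptor` cannot be mutated by a consumer later.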
## Exit Criteria
Before marking this sprint DONE:
1. [x] All types from `specs.md` Common section exist with matching names and properties
2. [x] Common compiles with zero warnings
3. [x] Common only references BCL (verify no package references in .csproj)
4. [x] No behavior/logic in any type (pure DTOs and interfaces)
5. [x] `StellaOps.Router.Common.Tests` runs and passes
6. [x] `docs/router/specs.md` is updated if any discrepancy is found (otherwise the code matches the spec)
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2024-12-04 | Sprint completed: all models and interfaces implemented per spec | Claude |
## Decisions & Risks
- `RoutingContext` uses a neutral model (not ASP.NET `HttpContext`) to keep Common free of web dependencies. Gateway will adapt from `HttpContext` to this neutral model.
- `ConnectionState.Endpoints` uses `(string Method, string Path)` tuple as key for dictionary lookups.
- Frame payloads are `byte[]` - serialization happens at the transport layer, not in Common.
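The tuple-keyed lookup mentioned in the second decision can be sketched as follows. Value tuples compare structurally, so `(Method, Path)` works directly as a dictionary key. `EndpointIndex` and the reduced `EndpointDescriptor` are hypothetical, and ordinal, case-sensitive comparison of both parts is an assumption.

```csharp
#nullable enable
using System.Collections.Generic;

// Reduced stand-in for the real descriptor; only the identity fields are shown.
public sealed record EndpointDescriptor(string ServiceName, string Method, string Path);

public sealed class EndpointIndex
{
    // Value tuples implement structural equality, so (Method, Path) is a valid key.
    private readonly Dictionary<(string Method, string Path), EndpointDescriptor> _endpoints = new();

    public void Add(EndpointDescriptor endpoint) =>
        _endpoints[(endpoint.Method, endpoint.Path)] = endpoint;

    public EndpointDescriptor? Find(string method, string path) =>
        _endpoints.TryGetValue((method, path), out var endpoint) ? endpoint : null;
}
```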

@@ -1,121 +0,0 @@
# Sprint 7000-0002-0001 · Router Transport · InMemory Plugin
## Topic & Scope
Build a fake "in-memory" transport plugin for development and testing. This transport proves the HELLO/HEARTBEAT/REQUEST/RESPONSE/CANCEL semantics and routing logic before any sockets or RabbitMQ are involved.
**Goal:** Enable unit and integration testing of the router and SDK by providing an in-process transport where frames are passed via channels/queues in memory.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.InMemory/`
**Key principle:** This plugin will never ship to production; it's only for dev tests and CI. It must fully implement all transport abstractions so that switching to real transports later requires zero changes to Gateway or Microservice SDK code.
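The channel-based frame passing described above can be sketched with `System.Threading.Channels`. This is illustrative only: `InMemoryChannelSketch`, the reduced `Frame` and `FrameType` types, and the use of unbounded channels are assumptions, not the shipped `InMemoryChannel`.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Reduced stand-ins for the Common models.
public enum FrameType { Hello, Heartbeat, Request, Response, Cancel }

public sealed record Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

// One channel per direction gives each side an independent inbox.
public sealed class InMemoryChannelSketch
{
    private readonly Channel<Frame> _toServer = Channel.CreateUnbounded<Frame>();
    private readonly Channel<Frame> _toClient = Channel.CreateUnbounded<Frame>();

    // Microservice side: HELLO, HEARTBEAT, and RESPONSE frames flow client -> server.
    public ValueTask SendToServerAsync(Frame frame, CancellationToken ct = default) =>
        _toServer.Writer.WriteAsync(frame, ct);

    public ValueTask<Frame> ReceiveOnServerAsync(CancellationToken ct = default) =>
        _toServer.Reader.ReadAsync(ct);

    // Gateway side: REQUEST frames flow server -> client.
    public ValueTask SendToClientAsync(Frame frame, CancellationToken ct = default) =>
        _toClient.Writer.WriteAsync(frame, ct);

    public ValueTask<Frame> ReceiveOnClientAsync(CancellationToken ct = default) =>
        _toClient.Reader.ReadAsync(ct);

    // Completing both writers signals disconnect; reads on an empty,
    // completed channel then fail with ChannelClosedException.
    public void Close()
    {
        _toServer.Writer.TryComplete();
        _toClient.Writer.TryComplete();
    }
}
```

Because both sides only see `Frame` values, a test can exercise the same request/response sequence that a socket-based transport would carry, without opening a port.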
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0002 (Common models must be complete)
- **Downstream:** SDK and Gateway sprints depend on this for testing
- **Parallel work:** Can run in parallel with CMN-040/041/042/043 test tasks if Common models are done
- **Cross-module impact:** None. Creates new directory only.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 5, 10 - Transport and Cancellation requirements)
- `docs/router/03-Step.md` (detailed task breakdown)
- `docs/router/implplan.md` (phase 3 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MEM-001 | DONE | Create `StellaOps.Router.Transport.InMemory` classlib project | Add to StellaOps.Router.sln |
| 2 | MEM-002 | DONE | Add project reference to `StellaOps.Router.Common` | |
| 3 | MEM-010 | DONE | Implement `InMemoryTransportServer` : `ITransportServer` | Gateway side |
| 4 | MEM-011 | DONE | Implement `InMemoryTransportClient` : `ITransportClient` | Microservice side |
| 5 | MEM-012 | DONE | Create shared `InMemoryConnectionRegistry` (concurrent dictionary keyed by ConnectionId) | Thread-safe |
| 6 | MEM-013 | DONE | Create `InMemoryChannel` for bidirectional frame passing | Use System.Threading.Channels |
| 7 | MEM-020 | DONE | Implement HELLO frame handling (client → server) | |
| 8 | MEM-021 | DONE | Implement HEARTBEAT frame handling (client → server) | |
| 9 | MEM-022 | DONE | Implement REQUEST frame handling (server → client) | |
| 10 | MEM-023 | DONE | Implement RESPONSE frame handling (client → server) | |
| 11 | MEM-024 | DONE | Implement CANCEL frame handling (bidirectional) | |
| 12 | MEM-025 | DONE | Implement REQUEST_STREAM_DATA / RESPONSE_STREAM_DATA frame handling | For streaming support |
| 13 | MEM-030 | DONE | Create `InMemoryTransportOptions` for configuration | Timeouts, buffer sizes |
| 14 | MEM-031 | DONE | Create DI registration extension `AddInMemoryTransport()` | |
| 15 | MEM-040 | DONE | Write integration tests for HELLO/HEARTBEAT flow | |
| 16 | MEM-041 | DONE | Write integration tests for REQUEST/RESPONSE flow | |
| 17 | MEM-042 | DONE | Write integration tests for CANCEL flow | |
| 18 | MEM-043 | DONE | Write integration tests for streaming flow | |
| 19 | MEM-050 | DONE | Create test project `StellaOps.Router.Transport.InMemory.Tests` | |
## Architecture
```
┌──────────────────────┐      InMemoryConnectionRegistry       ┌──────────────────────┐
│ Gateway              │  (ConcurrentDictionary<ConnectionId,  │ Microservice         │
│ (InMemoryTransport   │◄───────── InMemoryChannel>) ─────────►│ (InMemoryTransport   │
│  Server)             │                                       │  Client)             │
└──────────────────────┘                                       └──────────────────────┘
           │                                                              │
           │ Channel<Frame> ToMicroservice ──────────────────────────────►│
           │◄─────────────────────────────────── Channel<Frame> ToGateway │
           │                                                              │
```
## InMemoryChannel Design
```csharp
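using System.Threading;
using System.Threading.Channels;

internal sealed class InMemoryChannel
{
    public InMemoryChannel(string connectionId) => ConnectionId = connectionId;

    public string ConnectionId { get; }
    // Channels are unbounded by default; backpressure can be added later.
    public Channel<Frame> ToMicroservice { get; } = Channel.CreateUnbounded<Frame>(); // Gateway writes, SDK reads
    public Channel<Frame> ToGateway { get; } = Channel.CreateUnbounded<Frame>();      // SDK writes, Gateway reads
    public InstanceDescriptor? Instance { get; set; }
    public CancellationTokenSource LifetimeToken { get; } = new();
}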
```
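A minimal sketch of how two unbounded channels can carry frames in both directions. It assumes a simplified `DemoFrame` record in place of the real `Frame` type from Router.Common; `DemoFrame` and `ChannelDemo` are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical stand-in for the Frame type defined in Router.Common.
public sealed record DemoFrame(string Type, Guid CorrelationId, byte[] Payload);

public static class ChannelDemo
{
    // Sends a REQUEST gateway -> SDK and a RESPONSE SDK -> gateway over
    // two unbounded channels, mirroring ToMicroservice and ToGateway.
    public static async Task<DemoFrame> RoundTripAsync(byte[] body)
    {
        var toMicroservice = Channel.CreateUnbounded<DemoFrame>();
        var toGateway = Channel.CreateUnbounded<DemoFrame>();

        // Gateway side: write the REQUEST frame.
        var request = new DemoFrame("REQUEST", Guid.NewGuid(), body);
        await toMicroservice.Writer.WriteAsync(request);

        // SDK side: read the REQUEST and echo the payload back as a RESPONSE
        // that carries the same correlation id.
        var received = await toMicroservice.Reader.ReadAsync();
        await toGateway.Writer.WriteAsync(received with { Type = "RESPONSE" });

        // Gateway side: read the RESPONSE.
        return await toGateway.Reader.ReadAsync();
    }
}
```

In the real transport the two sides would run in separate read loops; the sketch collapses them into one method only to show the direction of each channel.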
## Frame Flow Examples
### HELLO Flow
1. Microservice SDK calls `InMemoryTransportClient.ConnectAsync()`
2. Client creates `InMemoryChannel`, registers in `InMemoryConnectionRegistry`
3. Client sends HELLO frame via `ToGateway` channel
4. Server reads from `ToGateway`, processes HELLO, updates `ConnectionState`
### REQUEST/RESPONSE Flow
1. Gateway receives HTTP request
2. Gateway sends REQUEST frame via `ToMicroservice` channel
3. SDK reads from `ToMicroservice`, invokes handler
4. SDK sends RESPONSE frame via `ToGateway` channel
5. Gateway reads from `ToGateway`, returns HTTP response
### CANCEL Flow
1. HTTP client disconnects (or timeout)
2. Gateway sends CANCEL frame via `ToMicroservice` channel
3. SDK reads CANCEL, cancels handler's CancellationToken
4. SDK optionally sends partial RESPONSE or no response
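The CANCEL flow above could be backed by a small per-request registry on the SDK side. The following is a sketch under the assumption that requests are keyed by a `Guid` correlation id; `InflightRequests` is a hypothetical helper, not a type named in the spec.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: tracks one CancellationTokenSource per in-flight request.
public sealed class InflightRequests
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _inflight = new();

    // Called when a REQUEST frame arrives; the returned token is passed to the handler.
    public CancellationToken Begin(Guid correlationId)
    {
        var cts = new CancellationTokenSource();
        if (!_inflight.TryAdd(correlationId, cts))
        {
            cts.Dispose();
            throw new InvalidOperationException($"Duplicate correlation id {correlationId}.");
        }
        return cts.Token;
    }

    // Called when a CANCEL frame arrives; returns false if the request already finished.
    public bool Cancel(Guid correlationId)
    {
        if (!_inflight.TryGetValue(correlationId, out var cts)) return false;
        try
        {
            cts.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            // The handler completed between the lookup and the cancel call.
            return false;
        }
    }

    // Called once the handler completes, whether or not it was cancelled.
    public void Complete(Guid correlationId)
    {
        if (_inflight.TryRemove(correlationId, out var cts)) cts.Dispose();
    }
}
```

A CANCEL for an unknown or already completed request is treated as a no-op here, which matches step 4 allowing the SDK to send no response at all.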
## Exit Criteria
Before marking this sprint DONE:
1. [x] `InMemoryTransportServer` fully implements `ITransportServer`
2. [x] `InMemoryTransportClient` fully implements `ITransportClient`
3. [x] All frame types (HELLO, HEARTBEAT, REQUEST, RESPONSE, STREAM_DATA, CANCEL) are handled
4. [x] Thread-safe concurrent access to `InMemoryConnectionRegistry`
5. [x] All integration tests pass
6. [x] No external dependencies (only BCL + Router.Common + DI/Options/Logging abstractions)
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2024-12-04 | Sprint completed: all InMemory transport components implemented and tested | Claude |
## Decisions & Risks
- Uses `System.Threading.Channels` for async frame passing (unbounded by default, can add backpressure later)
- InMemory transport simulates latency only if explicitly configured (default: instant)
- Connection lifetime is tied to `CancellationTokenSource`; disposing triggers cleanup
- This transport is explicitly excluded from production deployments via conditional compilation or package separation


@@ -1,135 +0,0 @@
# Sprint 7000-0003-0001 · Microservice SDK · Core Infrastructure
## Topic & Scope
Implement the core infrastructure of the Microservice SDK: options, endpoint discovery, and router connection management. After this sprint, a microservice can connect to a router and send HELLO with its endpoint list.
**Goal:** "Connect and say HELLO" - microservice connects to router(s) and registers its identity and endpoints.
**Working directory:** `src/__Libraries/StellaOps.Microservice/`
**Parallel track:** This sprint can run in parallel with Gateway sprints (7000-0004-*) once the InMemory transport is complete.
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0002 (Common), SPRINT_7000_0002_0001 (InMemory transport)
- **Downstream:** SPRINT_7000_0003_0002 (request handling)
- **Parallel work:** Can run in parallel with Gateway core sprint
- **Cross-module impact:** None. All work in `src/__Libraries/StellaOps.Microservice/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 7 - Microservice SDK requirements)
- `docs/router/04-Step.md` (detailed task breakdown)
- `docs/router/implplan.md` (phase 4 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | SDK-001 | DONE | Implement `StellaMicroserviceOptions` with all required properties | ServiceName, Version, Region, InstanceId, Routers, ConfigFilePath |
| 2 | SDK-002 | DONE | Implement `RouterEndpointConfig` (host, port, transport type) | |
| 3 | SDK-003 | DONE | Validate that Routers list is mandatory (throw if empty) | Per spec |
| 4 | SDK-010 | DONE | Create `[StellaEndpoint]` attribute for endpoint declaration | Method, Path, SupportsStreaming, Timeout |
| 5 | SDK-011 | DONE | Implement runtime reflection endpoint discovery | Scan assemblies for `[StellaEndpoint]` |
| 6 | SDK-012 | DONE | Build in-memory `EndpointDescriptor` list from discovered endpoints | |
| 7 | SDK-013 | DONE | Create `IEndpointDiscoveryProvider` abstraction | For source-gen vs reflection swap |
| 8 | SDK-020 | DONE | Implement `IRouterConnectionManager` interface | |
| 9 | SDK-021 | DONE | Implement `RouterConnectionManager` with connection pool | One connection per router endpoint |
| 10 | SDK-022 | DONE | Implement connection lifecycle (connect, reconnect on failure) | Exponential backoff |
| 11 | SDK-023 | DONE | Implement HELLO frame construction from options + endpoints | |
| 12 | SDK-024 | DONE | Send HELLO on connection establishment | Via InMemory transport |
| 13 | SDK-025 | DONE | Implement HEARTBEAT sending on timer | Configurable interval |
| 14 | SDK-030 | DONE | Implement `AddStellaMicroservice(IServiceCollection, Action<StellaMicroserviceOptions>)` | Full DI registration |
| 15 | SDK-031 | DONE | Register `IHostedService` for connection management | Start/stop with host |
| 16 | SDK-032 | DONE | Create `MicroserviceHostedService` that starts connections on app startup | |
| 17 | SDK-040 | DONE | Write unit tests for endpoint discovery | |
| 18 | SDK-041 | DONE | Write integration tests with InMemory transport | Connect, HELLO, HEARTBEAT |
## Endpoint Discovery
### Attribute-Based Declaration
```csharp
[StellaEndpoint("POST", "/billing/invoices")]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
    public Task<CreateInvoiceResponse> HandleAsync(CreateInvoiceRequest request, CancellationToken ct)
    {
        // Handler logic is out of scope for this declaration example.
        throw new NotImplementedException();
    }
}
```
### Discovery Flow
1. On startup, scan loaded assemblies for types with `[StellaEndpoint]`
2. For each type, verify it implements a handler interface
3. Build `EndpointDescriptor` from attribute + defaults
4. Store in `IEndpointRegistry` for lookup and HELLO construction
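The discovery steps above can be sketched with plain reflection. All type names here (`DemoEndpointAttribute`, `IDemoHandler`, `DemoEndpointDescriptor`, `ReflectionDiscovery`) are hypothetical stand-ins for the SDK's own types, and throwing on a marked type without a handler interface is an assumption; the spec only says the type is verified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins for the SDK's attribute and handler interface.
[AttributeUsage(AttributeTargets.Class)]
public sealed class DemoEndpointAttribute : Attribute
{
    public DemoEndpointAttribute(string method, string path) { Method = method; Path = path; }
    public string Method { get; }
    public string Path { get; }
}

public interface IDemoHandler { }

public sealed record DemoEndpointDescriptor(string Method, string Path, Type HandlerType);

public static class ReflectionDiscovery
{
    // Scans the given assemblies for concrete types carrying the attribute.
    public static IReadOnlyList<DemoEndpointDescriptor> Discover(params Assembly[] assemblies)
    {
        var result = new List<DemoEndpointDescriptor>();
        foreach (var type in assemblies.SelectMany(a => a.GetTypes()))
        {
            var attribute = type.GetCustomAttribute<DemoEndpointAttribute>();
            if (attribute is null || type.IsAbstract) continue;

            // Assumption: a marked type without a handler interface is a startup error.
            if (!typeof(IDemoHandler).IsAssignableFrom(type))
                throw new InvalidOperationException(
                    $"{type.FullName} is marked as an endpoint but implements no handler interface.");

            result.Add(new DemoEndpointDescriptor(attribute.Method, attribute.Path, type));
        }
        return result;
    }
}

[DemoEndpoint("POST", "/billing/invoices")]
public sealed class CreateInvoiceDemoHandler : IDemoHandler { }
```

The resulting list is what would feed both handler lookup and HELLO construction.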
### Handler Interface Detection
```csharp
// Typed with request (open generic, two type parameters)
typeof(IStellaEndpoint<,>)
// Typed without request (open generic, one type parameter)
typeof(IStellaEndpoint<>)
// Raw handler
typeof(IRawStellaEndpoint)
```
## Connection Lifecycle
```
┌─────────────┐     Connect     ┌─────────────┐     HELLO      ┌─────────────┐
│ Disconnected│────────────────►│  Connected  │───────────────►│ Registered  │
└─────────────┘                 └─────────────┘                └─────────────┘
       ▲                               │                              │
       │                               │ Error                        │ Heartbeat timer
       │                               ▼                              ▼
       │                        ┌─────────────┐                ┌─────────────┐
       └────────────────────────│  Reconnect  │◄───────────────│  Heartbeat  │
                Backoff         │  (backoff)  │     Error      │   Active    │
                                └─────────────┘                └─────────────┘
```
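The reconnect state in the diagram relies on exponential backoff. A minimal sketch of the delay calculation follows; the `initial` delay parameter and the `ReconnectBackoff` helper are assumptions (the options only define a maximum), and jitter is left out for brevity.

```csharp
using System;

public static class ReconnectBackoff
{
    // Delay before the given 1-based reconnect attempt:
    // initial * 2^(attempt - 1), capped at max.
    public static TimeSpan DelayFor(int attempt, TimeSpan initial, TimeSpan max)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Clamp the exponent so the multiplication stays well within range.
        var exponent = Math.Min(attempt - 1, 30);
        var ticks = initial.Ticks * Math.Pow(2, exponent);

        return ticks >= max.Ticks ? max : TimeSpan.FromTicks((long)ticks);
    }
}
```

With an initial delay of one second and a cap of one minute, attempts 1 to 6 wait 1, 2, 4, 8, 16 and 32 seconds, and every later attempt waits the full minute.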
## StellaMicroserviceOptions
```csharp
public sealed class StellaMicroserviceOptions
{
    public string ServiceName { get; set; } = string.Empty;
    public string Version { get; set; } = string.Empty;      // Strict semver
    public string Region { get; set; } = string.Empty;
    public string InstanceId { get; set; } = string.Empty;   // Auto-generate if empty
    public IList<RouterEndpointConfig> Routers { get; set; } = new List<RouterEndpointConfig>();
    public string? ConfigFilePath { get; set; }               // Optional YAML overrides
    public TimeSpan HeartbeatInterval { get; set; } = TimeSpan.FromSeconds(10);
    public TimeSpan ReconnectBackoffMax { get; set; } = TimeSpan.FromMinutes(1);
}
```
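The validation rules this sprint calls for (mandatory Routers list, strict semver, auto-generated InstanceId) might look roughly like the sketch below. `DemoMicroserviceOptions` is a hypothetical trimmed copy of the options, routers are reduced to strings, the semver pattern is simplified, and treating an empty ServiceName as an error is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical trimmed-down copy of the options, just enough to show validation.
public sealed class DemoMicroserviceOptions
{
    public string ServiceName { get; set; } = string.Empty;
    public string Version { get; set; } = string.Empty;
    public string InstanceId { get; set; } = string.Empty;
    public IList<string> Routers { get; set; } = new List<string>();
}

public static class DemoOptionsValidator
{
    // Simplified semver: MAJOR.MINOR.PATCH with optional pre-release and build metadata.
    private static readonly Regex SemVer = new(
        @"^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$",
        RegexOptions.CultureInvariant);

    public static void ValidateAndNormalize(DemoMicroserviceOptions options)
    {
        // Assumption: ServiceName is treated as required.
        if (string.IsNullOrWhiteSpace(options.ServiceName))
            throw new InvalidOperationException("ServiceName is required.");
        if (!SemVer.IsMatch(options.Version))
            throw new InvalidOperationException($"Version '{options.Version}' is not valid semver.");
        if (options.Routers.Count == 0)
            throw new InvalidOperationException("At least one router must be configured.");

        // Auto-generate the instance id when none was supplied.
        if (string.IsNullOrEmpty(options.InstanceId))
            options.InstanceId = Guid.NewGuid().ToString("N");
    }
}
```

Running the check once at startup keeps the fail-fast behaviour in one place instead of scattering it across the connection manager.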
## Exit Criteria
Before marking this sprint DONE:
1. [x] `StellaMicroserviceOptions` fully implemented with validation
2. [x] Endpoint discovery works via reflection
3. [x] Connection manager connects to configured routers
4. [x] HELLO frame sent on connection with full endpoint list
5. [x] HEARTBEAT sent periodically on timer
6. [x] Reconnection with backoff on connection failure
7. [x] Integration tests pass with InMemory transport
8. [x] `AddStellaMicroservice()` registers all services correctly
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2024-12-04 | Sprint completed: SDK core infrastructure implemented | Claude |
## Decisions & Risks
- Endpoint discovery defaults to reflection; source generation comes in a later sprint
- InstanceId auto-generates using `Guid.NewGuid().ToString("N")` if not provided
- Version validation enforces strict semver format
- Routers list cannot be empty - throws `InvalidOperationException` on startup
- YAML config file is optional at this stage (see Sprint 7000-0007-0002)


@@ -1,173 +0,0 @@
# Sprint 7000-0003-0002 · Microservice SDK · Request Handling
## Topic & Scope
Implement request handling in the Microservice SDK: receiving REQUEST frames, dispatching to handlers, and sending RESPONSE frames. Supports both typed and raw handler patterns.
**Goal:** Complete the request/response flow - microservice receives requests from router and returns responses.
**Working directory:** `src/__Libraries/StellaOps.Microservice/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0003_0001 (SDK core with connection + HELLO)
- **Downstream:** SPRINT_7000_0005_0003 (cancellation), SPRINT_7000_0005_0004 (streaming)
- **Parallel work:** Can run in parallel with Gateway middleware sprint
- **Cross-module impact:** None. All work in `src/__Libraries/StellaOps.Microservice/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 7.2, 7.4, 7.5 - Endpoint definition, Connection behavior, Request handling)
- `docs/router/04-Step.md` (detailed task breakdown - request handling section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | HDL-001 | TODO | Define `IRawStellaEndpoint` interface | Takes RawRequestContext, returns RawResponse |
| 2 | HDL-002 | TODO | Define `IStellaEndpoint<TRequest, TResponse>` interface | Typed request/response |
| 3 | HDL-003 | TODO | Define `IStellaEndpoint<TResponse>` interface | No request body |
| 4 | HDL-010 | TODO | Implement `RawRequestContext` | Method, Path, Headers, Body stream, CancellationToken |
| 5 | HDL-011 | TODO | Implement `RawResponse` | StatusCode, Headers, Body stream |
| 6 | HDL-012 | TODO | Implement `IHeaderCollection` abstraction | Key-value header access |
| 7 | HDL-020 | TODO | Create `IEndpointRegistry` for handler lookup | (Method, Path) → handler instance |
| 8 | HDL-021 | TODO | Implement path template matching (ASP.NET-style routes) | Handles `{id}` parameters |
| 9 | HDL-022 | TODO | Implement path matching rules (case sensitivity, trailing slash) | Per spec |
| 10 | HDL-030 | TODO | Create `TypedEndpointAdapter` to wrap typed handlers as raw | IStellaEndpoint<T,R> → IRawStellaEndpoint |
| 11 | HDL-031 | TODO | Implement request deserialization in adapter | JSON by default |
| 12 | HDL-032 | TODO | Implement response serialization in adapter | JSON by default |
| 13 | HDL-040 | TODO | Implement `RequestDispatcher` | Frame → RawRequestContext → Handler → RawResponse → Frame |
| 14 | HDL-041 | TODO | Implement frame-to-context conversion | REQUEST frame → RawRequestContext |
| 15 | HDL-042 | TODO | Implement response-to-frame conversion | RawResponse → RESPONSE frame |
| 16 | HDL-043 | TODO | Wire dispatcher into connection read loop | Process REQUEST frames |
| 17 | HDL-050 | TODO | Implement `IServiceProvider` integration for handler instantiation | DI support |
| 18 | HDL-051 | TODO | Implement handler scoping (per-request scope) | IServiceScope per request |
| 19 | HDL-060 | TODO | Write unit tests for path matching | Various patterns |
| 20 | HDL-061 | TODO | Write unit tests for typed adapter | Serialization round-trip |
| 21 | HDL-062 | TODO | Write integration tests for full REQUEST/RESPONSE flow | With InMemory transport |
## Handler Interfaces
### Raw Handler
```csharp
public interface IRawStellaEndpoint
{
    Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken cancellationToken);
}
```
### Typed Handlers
```csharp
public interface IStellaEndpoint<TRequest, TResponse>
{
    Task<TResponse> HandleAsync(TRequest request, CancellationToken cancellationToken);
}

public interface IStellaEndpoint<TResponse>
{
    Task<TResponse> HandleAsync(CancellationToken cancellationToken);
}
```
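The delivery tracker describes a `TypedEndpointAdapter` that wraps typed handlers as raw ones, with JSON as the default format. A minimal sketch of that idea using `System.Text.Json` follows; the `Demo*` types are hypothetical, and the adapter works on plain streams rather than the SDK's `RawRequestContext` and `RawResponse`.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical typed handler contract, shaped like IStellaEndpoint<TRequest, TResponse>.
public interface IDemoTypedHandler<TRequest, TResponse>
{
    Task<TResponse> HandleAsync(TRequest request, CancellationToken cancellationToken);
}

// Hypothetical adapter: JSON in, typed handler call, JSON out.
public sealed class DemoTypedAdapter<TRequest, TResponse> where TRequest : class
{
    private readonly IDemoTypedHandler<TRequest, TResponse> _inner;

    public DemoTypedAdapter(IDemoTypedHandler<TRequest, TResponse> inner) => _inner = inner;

    public async Task<Stream> HandleAsync(Stream requestBody, CancellationToken cancellationToken)
    {
        // Deserialize the raw body into the typed request.
        var request = await JsonSerializer.DeserializeAsync<TRequest>(
                          requestBody, cancellationToken: cancellationToken)
                      ?? throw new JsonException("Request body deserialized to null.");

        var response = await _inner.HandleAsync(request, cancellationToken);

        // Serialize the typed response and rewind so the caller can read it.
        var responseBody = new MemoryStream();
        await JsonSerializer.SerializeAsync(responseBody, response, cancellationToken: cancellationToken);
        responseBody.Position = 0;
        return responseBody;
    }
}

public sealed record DemoEcho(string Text);

public sealed class DemoEchoHandler : IDemoTypedHandler<DemoEcho, DemoEcho>
{
    public Task<DemoEcho> HandleAsync(DemoEcho request, CancellationToken cancellationToken) =>
        Task.FromResult(new DemoEcho(request.Text.ToUpperInvariant()));
}
```

Keeping the adapter generic means the dispatcher only ever deals with the raw shape, regardless of how a handler was declared.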
## RawRequestContext
```csharp
public sealed class RawRequestContext
{
    public string Method { get; init; } = string.Empty;
    public string Path { get; init; } = string.Empty;
    public IReadOnlyDictionary<string, string> PathParameters { get; init; }
        = new Dictionary<string, string>();
    public IHeaderCollection Headers { get; init; } = default!;
    public Stream Body { get; init; } = Stream.Null;
    public CancellationToken CancellationToken { get; init; }
}
```
## RawResponse
```csharp
public sealed class RawResponse
{
    public int StatusCode { get; init; } = 200;
    public IHeaderCollection Headers { get; init; } = default!;
    public Stream Body { get; init; } = Stream.Null;

    public static RawResponse Ok(Stream body) => new() { StatusCode = 200, Body = body };
    public static RawResponse NotFound() => new() { StatusCode = 404 };
    public static RawResponse Error(int statusCode, string message) => ...;
}
```
## Path Template Matching
Must use the same rules as the router (ASP.NET-style):
- `{id}` matches any segment, value captured in PathParameters
- `{id:int}` constraint support (optional for v1)
- Case sensitivity: configurable, default case-insensitive
- Trailing slash: configurable, default treats `/foo` and `/foo/` as equivalent
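The matching rules above can be sketched as a small segment-by-segment matcher. `PathTemplate` is a hypothetical helper; it recognizes the `{id:int}` syntax but ignores the constraint, and it always treats a trailing slash as insignificant rather than making that configurable.

```csharp
using System;
using System.Collections.Generic;

public static class PathTemplate
{
    // Returns the captured parameters when path matches template, otherwise null.
    public static IReadOnlyDictionary<string, string>? Match(
        string template, string path, bool ignoreCase = true)
    {
        var templateSegments = Split(template);
        var pathSegments = Split(path);
        if (templateSegments.Length != pathSegments.Length) return null;

        var comparison = ignoreCase ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal;
        var parameters = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        for (var i = 0; i < templateSegments.Length; i++)
        {
            var t = templateSegments[i];
            var p = pathSegments[i];

            if (t.Length > 2 && t[0] == '{' && t[^1] == '}')
            {
                // "{id}" or "{id:int}": keep the name, ignore any constraint in this sketch.
                var name = t[1..^1].Split(':')[0];
                parameters[name] = p;
            }
            else if (!string.Equals(t, p, comparison))
            {
                return null;
            }
        }

        return parameters;
    }

    // Trimming '/' on both ends makes "/foo" and "/foo/" equivalent.
    private static string[] Split(string value) =>
        value.Trim('/').Split('/', StringSplitOptions.RemoveEmptyEntries);
}
```

For example, `PathTemplate.Match("/billing/invoices/{id}", "/Billing/Invoices/42/")` yields a dictionary with `id` mapped to `42`, while the same call with a path that lacks the final segment returns `null`.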
## Request Flow
```
┌─────────────────┐     ┌────────────────────┐     ┌───────────────────┐
│  REQUEST Frame  │────►│ RequestDispatcher  │────►│ IEndpointRegistry │
│  (from Router)  │     │                    │     │  (Method, Path)   │
└─────────────────┘     └────────────────────┘     └───────────────────┘
                                  │                          │
                                  │                          ▼
                                  │                ┌───────────────────┐
                                  │                │ Handler Instance  │
                                  │                │ (from DI scope)   │
                                  │                └───────────────────┘
                                  │                          │
                                  │◄─────────────────────────┘
                                  ▼
                        ┌────────────────────┐
                        │ RawRequestContext  │
                        └────────────────────┘
                                  ▼
                        ┌────────────────────┐
                        │ Handler.HandleAsync│
                        └────────────────────┘
                                  ▼
                        ┌────────────────────┐
                        │ RawResponse        │
                        └────────────────────┘
                                  ▼
                        ┌────────────────────┐
                        │ RESPONSE Frame     │
                        │ (to Router)        │
                        └────────────────────┘
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All handler interfaces defined and documented
2. [ ] `RawRequestContext` and `RawResponse` implemented
3. [ ] Path template matching works for common patterns
4. [ ] Typed handlers wrapped correctly via `TypedEndpointAdapter`
5. [ ] `RequestDispatcher` processes REQUEST frames end-to-end
6. [ ] DI integration works (handlers resolved from service provider)
7. [ ] Integration tests pass with InMemory transport
8. [ ] Body treated as opaque bytes (no interpretation at SDK level for raw handlers)
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Typed handlers use JSON serialization by default; configurable via options
- Path matching is case-insensitive by default (matches ASP.NET Core default)
- Each request gets its own DI scope for handler resolution
- Body stream may be buffered or streaming depending on endpoint configuration (streaming support comes in later sprint)
- Handler exceptions are caught and converted to 500 responses with error details (configurable)


@@ -1,135 +0,0 @@
# Sprint 7000-0004-0001 · Gateway · Core Infrastructure
## Topic & Scope
Implement the core infrastructure of the Gateway: node configuration, global routing state, and basic routing plugin. This sprint creates the foundation for HTTP → transport → microservice routing.
**Goal:** Gateway can maintain routing state from connected microservices and select instances for routing decisions.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
**Parallel track:** This sprint can run in parallel with Microservice SDK sprints (7000-0003-*) once the InMemory transport is complete.
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0001_0002 (Common), SPRINT_7000_0002_0001 (InMemory transport)
- **Downstream:** SPRINT_7000_0004_0002 (middleware), SPRINT_7000_0004_0003 (connection handling)
- **Parallel work:** Can run in parallel with SDK core sprint
- **Cross-module impact:** None. All work in `src/Gateway/StellaOps.Gateway.WebService/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 6 - Gateway requirements)
- `docs/router/05-Step.md` (detailed task breakdown)
- `docs/router/implplan.md` (phase 5 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | GW-001 | TODO | Implement `GatewayNodeConfig` | Region, NodeId, Environment |
| 2 | GW-002 | TODO | Bind `GatewayNodeConfig` from configuration | appsettings.json section |
| 3 | GW-003 | TODO | Validate GatewayNodeConfig on startup | Region required |
| 4 | GW-010 | TODO | Implement `IGlobalRoutingState` as `InMemoryRoutingState` | Thread-safe implementation |
| 5 | GW-011 | TODO | Implement `ConnectionState` storage | ConcurrentDictionary by ConnectionId |
| 6 | GW-012 | TODO | Implement endpoint-to-connections index | (Method, Path) → List<ConnectionState> |
| 7 | GW-013 | TODO | Implement `ResolveEndpoint(method, path)` | Path template matching |
| 8 | GW-014 | TODO | Implement `GetConnectionsFor(serviceName, version, method, path)` | Filter by criteria |
| 9 | GW-020 | TODO | Create `IRoutingPlugin` implementation `DefaultRoutingPlugin` | Basic instance selection |
| 10 | GW-021 | TODO | Implement version filtering (strict semver equality) | Per spec |
| 11 | GW-022 | TODO | Implement health filtering (Healthy or Degraded only) | Per spec |
| 12 | GW-023 | TODO | Implement region preference (gateway region first) | Use GatewayNodeConfig.Region |
| 13 | GW-024 | TODO | Implement basic tie-breaking (any healthy instance) | Full algorithm in later sprint |
| 14 | GW-030 | TODO | Create `RoutingOptions` for configurable behavior | Default version, neighbor regions |
| 15 | GW-031 | TODO | Register routing services in DI | IGlobalRoutingState, IRoutingPlugin |
| 16 | GW-040 | TODO | Write unit tests for InMemoryRoutingState | |
| 17 | GW-041 | TODO | Write unit tests for DefaultRoutingPlugin | Version, health, region filtering |
## GatewayNodeConfig
```csharp
public sealed class GatewayNodeConfig
{
    public string Region { get; set; } = string.Empty;        // Required, e.g. "eu1"
    public string NodeId { get; set; } = string.Empty;        // e.g. "gw-eu1-01"
    public string Environment { get; set; } = string.Empty;   // e.g. "prod"
    public IList<string> NeighborRegions { get; set; } = [];  // Fallback regions
}
```
**Configuration binding:**
```json
{
  "GatewayNode": {
    "Region": "eu1",
    "NodeId": "gw-eu1-01",
    "Environment": "prod",
    "NeighborRegions": ["eu2", "us1"]
  }
}
```
## InMemoryRoutingState
```csharp
internal sealed class InMemoryRoutingState : IGlobalRoutingState
{
    private readonly ConcurrentDictionary<string, ConnectionState> _connections = new();
    private readonly ConcurrentDictionary<(string Method, string Path), List<string>> _endpointIndex = new();

    public void AddConnection(ConnectionState connection) { ... }
    public void RemoveConnection(string connectionId) { ... }
    public void UpdateConnection(string connectionId, Action<ConnectionState> update) { ... }

    public EndpointDescriptor? ResolveEndpoint(string method, string path) { ... }
    public IReadOnlyList<ConnectionState> GetConnectionsFor(
        string serviceName, string version, string method, string path) { ... }
}
```
## Routing Algorithm (Phase 1 - Basic)
```
1. Filter by ServiceName (exact match)
2. Filter by Version (strict semver equality)
3. Filter by Health (Healthy or Degraded only)
4. If any remain, pick one (random for now)
5. If none, return null (503 Service Unavailable)
```
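The five steps above map naturally onto a filter chain. The sketch below assumes simplified `DemoInstance` and `DemoHealth` types in place of `ConnectionState` and the spec's health enum; `BasicRouting` is a hypothetical name, and a `null` result is what the middleware would translate into 503.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the health status and connection state types.
public enum DemoHealth { Unknown, Healthy, Degraded, Unhealthy }

public sealed record DemoInstance(string ServiceName, string Version, DemoHealth Health, string ConnectionId);

public static class BasicRouting
{
    // Returns one eligible instance, or null when none qualifies.
    public static DemoInstance? Choose(
        IEnumerable<DemoInstance> instances, string serviceName, string version, Random random)
    {
        var candidates = instances
            .Where(i => string.Equals(i.ServiceName, serviceName, StringComparison.Ordinal)) // 1. exact service name
            .Where(i => string.Equals(i.Version, version, StringComparison.Ordinal))         // 2. strict version equality
            .Where(i => i.Health is DemoHealth.Healthy or DemoHealth.Degraded)               // 3. health filter
            .ToList();

        // 4. pick any remaining instance at random; 5. null means "no instance available".
        return candidates.Count == 0 ? null : candidates[random.Next(candidates.Count)];
    }
}
```

Passing the `Random` in keeps the selection deterministic in unit tests.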
**Note:** Full routing algorithm (region preference, ping-based selection, fallback) is implemented in SPRINT_7000_0005_0002.
## Region Derivation
Per spec section 2:
> Routing decisions MUST use `GatewayNodeConfig.Region` as the node's region; the router MUST NOT derive region from HTTP headers or URL host names.
This is enforced by:
1. GatewayNodeConfig is bound from static configuration only
2. No code path reads region from HttpContext
3. Tests verify region is never extracted from Host header
## Exit Criteria
Before marking this sprint DONE:
1. [ ] `GatewayNodeConfig` loads and validates from configuration
2. [ ] `InMemoryRoutingState` stores and indexes connections correctly
3. [ ] `ResolveEndpoint` performs path template matching
4. [ ] `DefaultRoutingPlugin` filters by version, health, region
5. [ ] All services registered in DI container
6. [ ] Unit tests pass for routing state and plugin
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Routing state is in-memory only; no persistence or distribution (single gateway node for v1)
- Path template matching reuses logic from SDK (shared in Common or duplicated)
- DefaultRoutingPlugin is intentionally simple; full algorithm comes in SPRINT_7000_0005_0002
- Region validation: startup fails fast if Region is empty


@@ -1,172 +0,0 @@
# Sprint 7000-0004-0002 · Gateway · HTTP Middleware Pipeline
## Topic & Scope
Implement the HTTP middleware pipeline for the Gateway: endpoint resolution, authorization, routing decision, and transport dispatch. After this sprint, HTTP requests flow through the gateway to microservices via the InMemory transport.
**Goal:** Complete HTTP → transport → microservice → HTTP flow for basic buffered requests.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0004_0001 (Gateway core)
- **Downstream:** SPRINT_7000_0004_0003 (connection handling)
- **Parallel work:** Can run in parallel with SDK request handling sprint
- **Cross-module impact:** None. All work in `src/Gateway/StellaOps.Gateway.WebService/`
## Documentation Prerequisites
- `docs/router/specs.md` (section 6.1 - HTTP ingress pipeline)
- `docs/router/05-Step.md` (middleware section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MID-001 | TODO | Create `EndpointResolutionMiddleware` | (Method, Path) → EndpointDescriptor |
| 2 | MID-002 | TODO | Store resolved endpoint in `HttpContext.Items` | For downstream middleware |
| 3 | MID-003 | TODO | Return 404 if endpoint not found | |
| 4 | MID-010 | TODO | Create `AuthorizationMiddleware` stub | Checks authenticated only (full claims later) |
| 5 | MID-011 | TODO | Wire ASP.NET Core authentication | Standard middleware order |
| 6 | MID-012 | TODO | Return 401/403 for unauthorized requests | |
| 7 | MID-020 | TODO | Create `RoutingDecisionMiddleware` | Calls IRoutingPlugin.ChooseInstanceAsync |
| 8 | MID-021 | TODO | Store RoutingDecision in `HttpContext.Items` | |
| 9 | MID-022 | TODO | Return 503 if no instance available | |
| 10 | MID-023 | TODO | Return 504 if routing times out | |
| 11 | MID-030 | TODO | Create `TransportDispatchMiddleware` | Dispatches to selected transport |
| 12 | MID-031 | TODO | Implement buffered request dispatch | Read entire body, send REQUEST frame |
| 13 | MID-032 | TODO | Implement buffered response handling | Read RESPONSE frame, write to HTTP |
| 14 | MID-033 | TODO | Map transport errors to HTTP status codes | |
| 15 | MID-040 | TODO | Create `GlobalErrorHandlerMiddleware` | Catches unhandled exceptions |
| 16 | MID-041 | TODO | Implement structured error responses | JSON error envelope |
| 17 | MID-050 | TODO | Create `RequestLoggingMiddleware` | Correlation ID, service, endpoint, region, instance |
| 18 | MID-051 | TODO | Wire forwarded headers middleware | For reverse proxy support |
| 19 | MID-060 | TODO | Configure middleware pipeline in Program.cs | Correct order |
| 20 | MID-070 | TODO | Write integration tests for full HTTP→transport flow | With InMemory transport + SDK |
| 21 | MID-071 | TODO | Write tests for error scenarios (404, 503, etc.) | |
## Middleware Pipeline Order
```csharp
app.UseForwardedHeaders(); // Reverse proxy support
app.UseMiddleware<GlobalErrorHandlerMiddleware>();
app.UseMiddleware<RequestLoggingMiddleware>();
app.UseAuthentication(); // ASP.NET Core auth
app.UseMiddleware<EndpointResolutionMiddleware>();
app.UseMiddleware<AuthorizationMiddleware>();
app.UseMiddleware<RoutingDecisionMiddleware>();
app.UseMiddleware<TransportDispatchMiddleware>();
```
## EndpointResolutionMiddleware
```csharp
public class EndpointResolutionMiddleware
{
    private readonly RequestDelegate _next;

    public EndpointResolutionMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context, IGlobalRoutingState routingState)
    {
        var method = context.Request.Method;
        var path = context.Request.Path.Value ?? "/";

        var endpoint = routingState.ResolveEndpoint(method, path);
        if (endpoint == null)
        {
            context.Response.StatusCode = 404;
            await context.Response.WriteAsJsonAsync(new { error = "Endpoint not found" });
            return;
        }

        // Make the resolved endpoint available to downstream middleware.
        context.Items["ResolvedEndpoint"] = endpoint;
        await _next(context);
    }
}
```
## TransportDispatchMiddleware (Buffered Mode)
```csharp
public class TransportDispatchMiddleware
{
    // Terminal middleware: `next` is required by convention-based activation but never invoked.
    public TransportDispatchMiddleware(RequestDelegate next) { }

    public async Task InvokeAsync(HttpContext context, ITransportClient transport)
    {
        var decision = (RoutingDecision)context.Items["RoutingDecision"]!;
        var endpoint = (EndpointDescriptor)context.Items["ResolvedEndpoint"]!;

        // Build REQUEST frame (buffered mode: the whole body is read into memory)
        using var bodyStream = new MemoryStream();
        await context.Request.Body.CopyToAsync(bodyStream, context.RequestAborted);

        var requestFrame = new Frame
        {
            Type = FrameType.Request,
            CorrelationId = Guid.NewGuid(),
            Payload = BuildRequestPayload(context, bodyStream.ToArray())
        };

        // Send and await response; cancel on client disconnect or timeout
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(
            context.RequestAborted);
        cts.CancelAfter(decision.EffectiveTimeout);

        var responseFrame = await transport.SendRequestAsync(
            decision.Connection,
            requestFrame,
            decision.EffectiveTimeout,
            cts.Token);

        // Write response to HTTP
        await WriteHttpResponse(context, responseFrame);
    }
}
```
## Error Mapping
| Transport/Routing Error | HTTP Status |
|------------------------|-------------|
| Endpoint not found | 404 Not Found |
| No healthy instance | 503 Service Unavailable |
| Timeout | 504 Gateway Timeout |
| Microservice error (5xx) | Pass through status |
| Transport connection lost | 502 Bad Gateway |
| Payload too large | 413 Payload Too Large |
| Unauthorized | 401 Unauthorized |
| Forbidden (claims) | 403 Forbidden |
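
The table above could be expressed as a single switch expression. The `DemoDispatchError` enum and `ErrorMapping` class are hypothetical; the microservice 5xx row is not modelled because that status is passed through unchanged, and falling back to 500 for unknown values is an assumption.

```csharp
// Hypothetical error categories corresponding to the rows of the table.
public enum DemoDispatchError
{
    EndpointNotFound,
    NoHealthyInstance,
    Timeout,
    ConnectionLost,
    PayloadTooLarge,
    Unauthorized,
    Forbidden
}

public static class ErrorMapping
{
    // Maps a routing or transport error to the HTTP status code from the table.
    public static int ToStatusCode(DemoDispatchError error) => error switch
    {
        DemoDispatchError.EndpointNotFound => 404,  // Not Found
        DemoDispatchError.NoHealthyInstance => 503, // Service Unavailable
        DemoDispatchError.Timeout => 504,           // Gateway Timeout
        DemoDispatchError.ConnectionLost => 502,    // Bad Gateway
        DemoDispatchError.PayloadTooLarge => 413,   // Payload Too Large
        DemoDispatchError.Unauthorized => 401,      // Unauthorized
        DemoDispatchError.Forbidden => 403,         // Forbidden
        _ => 500                                    // Assumption: unknown errors become 500
    };
}
```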
## HttpContext.Items Keys
```csharp
public static class ContextKeys
{
    public const string ResolvedEndpoint = "ResolvedEndpoint";
    public const string RoutingDecision = "RoutingDecision";
    public const string CorrelationId = "CorrelationId";
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All middleware classes implemented
2. [ ] Pipeline configured in correct order
3. [ ] EndpointResolutionMiddleware resolves (Method, Path) → endpoint
4. [ ] AuthorizationMiddleware checks authentication (claims in later sprint)
5. [ ] RoutingDecisionMiddleware selects instance via IRoutingPlugin
6. [ ] TransportDispatchMiddleware sends/receives frames (buffered mode)
7. [ ] Error responses use consistent JSON envelope
8. [ ] Integration tests pass with InMemory transport
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Authorization middleware is a stub that only checks `User.Identity?.IsAuthenticated`; full RequiringClaims enforcement comes in SPRINT_7000_0008_0001
- Streaming support is not implemented in this sprint; TransportDispatchMiddleware only handles buffered mode
- Correlation ID is generated per request and logged throughout
- Request body is fully read into memory for buffered mode; streaming in SPRINT_7000_0005_0004


@@ -1,218 +0,0 @@
# Sprint 7000-0004-0003 · Gateway · Connection Handling
## Topic & Scope
Implement connection handling in the Gateway: processing HELLO frames from microservices, maintaining connection state, and updating the global routing state. After this sprint, microservices can register with the gateway and be routed to.
**Goal:** Gateway receives HELLO from microservices and maintains live routing state. Combined with previous sprints, this enables full end-to-end HTTP → microservice routing.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0004_0002 (middleware), SPRINT_7000_0003_0001 (SDK core with HELLO)
- **Downstream:** SPRINT_7000_0005_0001 (heartbeat/health)
- **Parallel work:** Should coordinate with SDK team for HELLO frame format agreement
- **Cross-module impact:** None. All work in Gateway.
## Documentation Prerequisites
- `docs/router/specs.md` (section 6.2 - Per-connection state and routing view)
- `docs/router/05-Step.md` (connection handling section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | CON-001 | TODO | Create `IConnectionHandler` interface | Processes frames per connection |
| 2 | CON-002 | TODO | Implement `ConnectionHandler` | Frame type dispatch |
| 3 | CON-010 | TODO | Implement HELLO frame processing | Parse HelloPayload, create ConnectionState |
| 4 | CON-011 | TODO | Validate HELLO payload | ServiceName, Version, InstanceId required |
| 5 | CON-012 | TODO | Register connection in IGlobalRoutingState | AddConnection |
| 6 | CON-013 | TODO | Build endpoint index from HELLO | (Method, Path) → ConnectionId |
| 7 | CON-020 | TODO | Create `TransportServerHost` hosted service | Starts ITransportServer |
| 8 | CON-021 | TODO | Wire transport server to connection handler | Frame routing |
| 9 | CON-022 | TODO | Handle new connections (InMemory: channel registration) | |
| 10 | CON-030 | TODO | Implement connection cleanup on disconnect | RemoveConnection from routing state |
| 11 | CON-031 | TODO | Clean up endpoint index on disconnect | Remove all endpoints for connection |
| 12 | CON-032 | TODO | Log connection lifecycle events | Connect, HELLO, disconnect |
| 13 | CON-040 | TODO | Implement connection ID generation | Unique per connection |
| 14 | CON-041 | TODO | Store connection metadata | Transport type, connect time |
| 15 | CON-050 | TODO | Write integration tests for HELLO flow | SDK → Gateway registration |
| 16 | CON-051 | TODO | Write tests for connection cleanup | |
| 17 | CON-052 | TODO | Write tests for multiple connections from same service | Different instances |
## Connection Lifecycle
```
┌─────────────────┐
│ New Connection  │  (Transport layer signals new connection)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Awaiting HELLO  │  (Connection exists but not registered for routing)
└────────┬────────┘
         │ HELLO frame received
         ▼
┌─────────────────┐
│ Validate HELLO  │  (Check ServiceName, Version, endpoints)
└────────┬────────┘
         │ Valid
         ▼
┌─────────────────┐
│ Create          │
│ ConnectionState │  (InstanceDescriptor, endpoints, health = Unknown)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Register in     │  (Add to IGlobalRoutingState, index endpoints)
│ RoutingState    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Registered      │  (Connection can receive routed requests)
└────────┬────────┘
         │ Disconnect or error
         ▼
┌─────────────────┐
│ Cleanup State   │  (Remove from routing state, clean endpoint index)
└─────────────────┘
```
## HELLO Processing
```csharp
internal sealed class ConnectionHandler : IConnectionHandler
{
    // _routingState, _logger and _currentTransportType are injected;
    // fields and constructor omitted for brevity.

    public async Task HandleFrameAsync(string connectionId, Frame frame)
    {
        switch (frame.Type)
        {
            case FrameType.Hello:
                await ProcessHelloAsync(connectionId, frame);
                break;
            case FrameType.Heartbeat:
                await ProcessHeartbeatAsync(connectionId, frame);
                break;
            case FrameType.Response:
            case FrameType.ResponseStreamData:
                await ProcessResponseAsync(connectionId, frame);
                break;
            default:
                _logger.LogWarning("Unknown frame type {Type} from {ConnectionId}",
                    frame.Type, connectionId);
                break;
        }
    }

    private async Task ProcessHelloAsync(string connectionId, Frame frame)
    {
        var payload = DeserializeHelloPayload(frame.Payload);

        // Validate: ServiceName, Version and InstanceId are all required (CON-011)
        if (string.IsNullOrEmpty(payload.Instance.ServiceName))
            throw new InvalidHelloException("ServiceName required");
        if (string.IsNullOrEmpty(payload.Instance.Version))
            throw new InvalidHelloException("Version required");
        if (string.IsNullOrEmpty(payload.Instance.InstanceId))
            throw new InvalidHelloException("InstanceId required");

        // Build ConnectionState; health stays Unknown until the first heartbeat
        var connection = new ConnectionState
        {
            ConnectionId = connectionId,
            Instance = payload.Instance,
            Status = InstanceHealthStatus.Unknown,
            LastHeartbeatUtc = DateTime.UtcNow,
            TransportType = _currentTransportType,
            Endpoints = payload.Endpoints.ToDictionary(
                e => (e.Method, e.Path),
                e => e)
        };

        // Register (a duplicate HELLO on the same connection replaces the old state)
        _routingState.AddConnection(connection);

        _logger.LogInformation(
            "Registered {ServiceName} v{Version} instance {InstanceId} from {Region}",
            payload.Instance.ServiceName,
            payload.Instance.Version,
            payload.Instance.InstanceId,
            payload.Instance.Region);
    }
}
```
## TransportServerHost
```csharp
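internal sealed class TransportServerHost : IHostedService
{
    private readonly ITransportServer _server;
    private readonly IConnectionHandler _handler;
    // _routingState and _logger are injected as well; constructor omitted for brevity.

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        _server.OnConnection += HandleNewConnection;
        _server.OnFrame += HandleFrame;
        _server.OnDisconnect += HandleDisconnect;
        await _server.StartAsync(cancellationToken);
    }

    // IHostedService also requires StopAsync. Assumption: ITransportServer
    // exposes a matching StopAsync; adjust to the real interface.
    public async Task StopAsync(CancellationToken cancellationToken)
    {
        await _server.StopAsync(cancellationToken);
        _server.OnConnection -= HandleNewConnection;
        _server.OnFrame -= HandleFrame;
        _server.OnDisconnect -= HandleDisconnect;
    }

    private void HandleNewConnection(string connectionId)
    {
        _logger.LogInformation("New connection: {ConnectionId}", connectionId);
    }

    private async Task HandleFrame(string connectionId, Frame frame)
    {
        await _handler.HandleFrameAsync(connectionId, frame);
    }

    private void HandleDisconnect(string connectionId)
    {
        _routingState.RemoveConnection(connectionId);
        _logger.LogInformation("Connection closed: {ConnectionId}", connectionId);
    }
}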
```
## Multiple Instances
The gateway must handle multiple instances of the same service:
- Same ServiceName + Version from different InstanceIds
- Each instance has its own ConnectionState
- Routing algorithm selects among available instances
```
Service: billing v1.0.0
├── Instance: billing-01 (Region: eu1) → Connection abc123
├── Instance: billing-02 (Region: eu1) → Connection def456
└── Instance: billing-03 (Region: us1) → Connection ghi789
```
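The registration and cleanup rules in this sprint (one state entry per connection, an endpoint index keyed by `(Method, Path)`, re-registration on duplicate HELLO, full removal on disconnect) could be sketched roughly as below. `InMemoryRoutingIndex` and its members are hypothetical names for illustration only, not the actual `IGlobalRoutingState` implementation, and the sketch assumes case-sensitive method and path matching.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: not the real IGlobalRoutingState implementation.
public sealed class InMemoryRoutingIndex
{
    private readonly object _gate = new();
    // connectionId -> endpoints announced in that connection's HELLO
    private readonly Dictionary<string, List<(string Method, string Path)>> _byConnection = new();
    // (Method, Path) -> connections that can serve the endpoint
    private readonly Dictionary<(string Method, string Path), HashSet<string>> _byEndpoint = new();

    public void AddConnection(string connectionId, IEnumerable<(string Method, string Path)> endpoints)
    {
        lock (_gate)
        {
            // A duplicate HELLO replaces the previous registration.
            RemoveCore(connectionId);
            var list = endpoints.ToList();
            _byConnection[connectionId] = list;
            foreach (var key in list)
            {
                if (!_byEndpoint.TryGetValue(key, out var set))
                    _byEndpoint[key] = set = new HashSet<string>();
                set.Add(connectionId);
            }
        }
    }

    public void RemoveConnection(string connectionId)
    {
        lock (_gate) { RemoveCore(connectionId); }
    }

    public IReadOnlyList<string> GetConnectionsFor(string method, string path)
    {
        lock (_gate)
        {
            return _byEndpoint.TryGetValue((method, path), out var set)
                ? set.OrderBy(id => id, StringComparer.Ordinal).ToList()
                : new List<string>();
        }
    }

    // Removes the connection and every endpoint entry that pointed at it.
    private void RemoveCore(string connectionId)
    {
        if (!_byConnection.Remove(connectionId, out var endpoints))
            return;
        foreach (var key in endpoints)
        {
            if (_byEndpoint.TryGetValue(key, out var set) && set.Remove(connectionId) && set.Count == 0)
                _byEndpoint.Remove(key);
        }
    }
}
```

Keeping the reverse map (`_byConnection`) is what makes disconnect cleanup cheap: the handler never has to scan the whole endpoint index to find the entries owned by a closed connection.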
## Exit Criteria
Before marking this sprint DONE:
1. [ ] HELLO frames processed correctly
2. [ ] ConnectionState created and stored
3. [ ] Endpoint index updated for routing lookups
4. [ ] Connection cleanup removes all state
5. [ ] TransportServerHost starts/stops with application
6. [ ] Integration tests: SDK registers, Gateway routes, SDK handles request
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Initial health status is `Unknown` until first heartbeat
- Connection ID format: GUID for InMemory, transport-specific for real transports
- HELLO validation failure disconnects the client (logs error)
- Duplicate HELLO from same connection replaces existing state (re-registration)


@@ -1,205 +0,0 @@
# Sprint 7000-0005-0001 · Protocol Features · Heartbeat & Health
## Topic & Scope
Implement heartbeat processing and health tracking. Microservices send HEARTBEAT frames periodically; the gateway updates health status and marks stale instances as unhealthy.
**Goal:** Gateway maintains accurate health status for all connected instances, enabling health-aware routing.
**Working directories:**
- `src/__Libraries/StellaOps.Microservice/` (heartbeat sending)
- `src/Gateway/StellaOps.Gateway.WebService/` (heartbeat processing)
- `src/__Libraries/StellaOps.Router.Common/` (if payload changes needed)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0004_0003 (Gateway connection handling), SPRINT_7000_0003_0001 (SDK core)
- **Downstream:** SPRINT_7000_0005_0002 (routing algorithm uses health)
- **Parallel work:** None. Sequential after connection handling.
- **Cross-module impact:** SDK and Gateway both modified.
## Documentation Prerequisites
- `docs/router/specs.md` (section 8 - Control/health/ping requirements)
- `docs/router/06-Step.md` (heartbeat section)
- `docs/router/implplan.md` (phase 6 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | HB-001 | DONE | Implement HeartbeatPayload serialization | Common |
| 2 | HB-002 | DONE | Add InstanceHealthStatus to HeartbeatPayload | Common |
| 3 | HB-003 | DONE | Add optional metrics to HeartbeatPayload (inflight count, error rate) | Common |
| 4 | HB-010 | DONE | Implement heartbeat sending timer in SDK | Microservice |
| 5 | HB-011 | DONE | Report current health status in heartbeat | Microservice |
| 6 | HB-012 | DONE | Report optional metrics in heartbeat | Microservice |
| 7 | HB-013 | DONE | Make heartbeat interval configurable | Microservice |
| 8 | HB-020 | DONE | Implement HEARTBEAT frame processing in Gateway | Gateway |
| 9 | HB-021 | DONE | Update LastHeartbeatUtc on heartbeat | Gateway |
| 10 | HB-022 | DONE | Update InstanceHealthStatus from payload | Gateway |
| 11 | HB-023 | DONE | Update optional metrics from payload | Gateway |
| 12 | HB-030 | DONE | Create HealthMonitorService hosted service | Gateway |
| 13 | HB-031 | DONE | Implement stale heartbeat detection | Configurable threshold |
| 14 | HB-032 | DONE | Mark instances Unhealthy when heartbeat stale | Gateway |
| 15 | HB-033 | DONE | Implement Draining status support | For graceful shutdown |
| 16 | HB-040 | DONE | Create HealthOptions for thresholds | StaleThreshold, DegradedThreshold |
| 17 | HB-041 | DONE | Bind HealthOptions from configuration | Gateway |
| 18 | HB-050 | DONE | Implement ping latency measurement (request/response timing) | Gateway |
| 19 | HB-051 | DONE | Update AveragePingMs from timing | Exponential moving average |
| 20 | HB-060 | DONE | Write integration tests for heartbeat flow | |
| 21 | HB-061 | DONE | Write tests for health status transitions | |
| 22 | HB-062 | DONE | Write tests for stale detection | |
## HeartbeatPayload
```csharp
public sealed class HeartbeatPayload
{
    public string InstanceId { get; init; } = string.Empty;
    public InstanceHealthStatus Status { get; init; }
    public int? InflightRequestCount { get; init; }
    public double? ErrorRatePercent { get; init; }
    public DateTimeOffset Timestamp { get; init; }
}
```
## Health Status Transitions
```
                      ┌─────────┐
          First       │ Unknown │
          Heartbeat   └────┬────┘
                           │ Status from payload
                           ▼
                      ┌─────────┐
     ┌────────────────│ Healthy │◄───────────────┐
     │ Degraded       └────┬────┘     Healthy    │
     │ in payload          │                     │
     ▼                     │ Stale threshold     │
┌──────────┐               │ exceeded            │
│ Degraded │               ▼                     │
└────┬─────┘         ┌───────────┐               │
     │               │ Unhealthy │───────────────┘
     │ Stale         └───────────┘   Heartbeat
     │ threshold                     received
     ▼
┌───────────┐
│ Unhealthy │
└───────────┘
```
**Special case: Draining**
- Microservice explicitly sets status to `Draining`
- Router stops sending new requests but allows in-flight to complete
- Used for graceful shutdown
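
As a rough illustration of the transitions and the Draining rule above, the state logic can be written as a few pure functions. This is a sketch under assumptions: the enum is a local copy using the status names mentioned in this document, and `HealthTransitions` with its method names is hypothetical rather than part of the Gateway code.

```csharp
using System;

// Local copy for a self-contained sketch; members are the statuses named in this document.
public enum InstanceHealthStatus { Unknown, Healthy, Degraded, Draining, Unhealthy }

// Hypothetical helper, not part of the actual Gateway.
public static class HealthTransitions
{
    // A received heartbeat always wins: the instance reports its own status.
    public static InstanceHealthStatus OnHeartbeat(InstanceHealthStatus reported) => reported;

    // Periodic check: a silent instance becomes Unhealthy once the stale threshold passes.
    public static InstanceHealthStatus OnMonitorTick(
        InstanceHealthStatus current, TimeSpan sinceLastHeartbeat, TimeSpan staleThreshold)
        => sinceLastHeartbeat > staleThreshold ? InstanceHealthStatus.Unhealthy : current;

    // Only Healthy and Degraded receive new requests; Draining finishes in-flight work only.
    public static bool AcceptsNewRequests(InstanceHealthStatus status)
        => status is InstanceHealthStatus.Healthy or InstanceHealthStatus.Degraded;
}
```

Keeping the transitions free of I/O makes each arrow in the diagram testable on its own, independently of timers and transports.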
## HealthMonitorService
```csharp
internal sealed class HealthMonitorService : BackgroundService
{
    private readonly IGlobalRoutingState _routingState;
    private readonly IOptions<HealthOptions> _options;
    private readonly ILogger<HealthMonitorService> _logger;
    // Constructor omitted for brevity.

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        var interval = TimeSpan.FromSeconds(5); // Check frequency

        while (!stoppingToken.IsCancellationRequested)
        {
            CheckStaleConnections();
            // Throws OperationCanceledException on shutdown, which ends the loop
            await Task.Delay(interval, stoppingToken);
        }
    }

    private void CheckStaleConnections()
    {
        var threshold = _options.Value.StaleThreshold;
        var now = DateTime.UtcNow;

        foreach (var connection in _routingState.GetAllConnections())
        {
            var age = now - connection.LastHeartbeatUtc;
            if (age > threshold && connection.Status != InstanceHealthStatus.Unhealthy)
            {
                _routingState.UpdateConnection(connection.ConnectionId,
                    c => c.Status = InstanceHealthStatus.Unhealthy);

                _logger.LogWarning(
                    "Instance {InstanceId} marked Unhealthy: no heartbeat for {Age}",
                    connection.Instance.InstanceId, age);
            }
        }
    }
}
```
## HealthOptions
```csharp
public sealed class HealthOptions
{
    public TimeSpan StaleThreshold { get; set; } = TimeSpan.FromSeconds(30);
    public TimeSpan DegradedThreshold { get; set; } = TimeSpan.FromSeconds(15);
    public int PingHistorySize { get; set; } = 10; // For moving average
}
```
## Ping Latency Measurement
Measure round-trip time for REQUEST/RESPONSE:
1. Record timestamp when REQUEST frame sent
2. Record timestamp when RESPONSE frame received
3. Calculate RTT = response_time - request_time
4. Update exponential moving average: `avg = 0.8 * avg + 0.2 * rtt`
```csharp
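internal sealed class PingTracker
{
    private const double Alpha = 0.2; // Weight of the newest sample

    private readonly ConcurrentDictionary<Guid, long> _pendingRequests = new();
    private readonly object _gate = new();
    private double _averagePingMs;
    private bool _hasSample;

    public void RecordRequestSent(Guid correlationId)
    {
        _pendingRequests[correlationId] = Stopwatch.GetTimestamp();
    }

    public void RecordResponseReceived(Guid correlationId)
    {
        if (_pendingRequests.TryRemove(correlationId, out var startTicks))
        {
            var rtt = Stopwatch.GetElapsedTime(startTicks).TotalMilliseconds;

            // Responses arrive concurrently, so the read-modify-write is locked.
            lock (_gate)
            {
                // Seed with the first sample so the average is not biased toward 0.
                _averagePingMs = _hasSample
                    ? (1 - Alpha) * _averagePingMs + Alpha * rtt
                    : rtt;
                _hasSample = true;
            }
        }
    }

    public double AveragePingMs
    {
        get { lock (_gate) { return _averagePingMs; } }
    }
}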
```
## Exit Criteria
Before marking this sprint DONE:
1. [x] SDK sends HEARTBEAT frames on timer
2. [x] Gateway processes HEARTBEAT and updates ConnectionState
3. [x] HealthMonitorService marks stale instances Unhealthy
4. [x] Draining status stops new requests
5. [x] Ping latency measured and stored
6. [x] Health thresholds configurable
7. [x] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint completed. Implemented heartbeat sending in SDK, health monitoring in Gateway, ping latency tracking. 51 tests passing. | Claude |
## Decisions & Risks
- Heartbeat interval default: 10 seconds (configurable)
- Stale threshold default: 30 seconds (3 missed heartbeats)
- Ping measurement uses REQUEST/RESPONSE timing, not separate PING frames
- Health status changes are logged for observability
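
For the SDK side, a heartbeat sender on a configurable interval might look roughly like the loop below. It is only a sketch: `HeartbeatLoop` and the `sendHeartbeat` callback are hypothetical stand-ins for the SDK's real frame-sending code, and `PeriodicTimer` (available from .NET 6) is one possible timer choice rather than the one the SDK necessarily uses.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of the SDK heartbeat timer.
public static class HeartbeatLoop
{
    // sendHeartbeat stands in for building and sending a HEARTBEAT frame.
    public static async Task RunAsync(
        TimeSpan interval, Func<CancellationToken, Task> sendHeartbeat, CancellationToken ct)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            // One heartbeat per tick; ticks do not overlap because each send is awaited.
            while (await timer.WaitForNextTickAsync(ct))
            {
                await sendHeartbeat(ct);
            }
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            // Normal shutdown: stop sending heartbeats.
        }
    }
}
```

With the defaults listed above, the loop would be started with `TimeSpan.FromSeconds(10)`, so the 30-second stale threshold corresponds to three missed ticks.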


@@ -1,217 +0,0 @@
# Sprint 7000-0005-0002 · Protocol Features · Full Routing Algorithm
## Topic & Scope
Implement the complete routing algorithm as specified: region preference, ping-based selection, heartbeat recency, and fallback logic.
**Goal:** Routes prefer closest healthy instances with lowest latency, falling back through region tiers when necessary.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0001 (heartbeat/health provides the metrics)
- **Downstream:** SPRINT_7000_0005_0003 (cancellation), SPRINT_7000_0006_* (real transports)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Gateway only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 4 - Routing algorithm / instance selection)
- `docs/router/06-Step.md` (routing algorithm section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | RTG-001 | DONE | Implement full filter chain in DefaultRoutingPlugin | |
| 2 | RTG-002 | DONE | Filter by ServiceName (exact match) | Via AvailableConnections from context |
| 3 | RTG-003 | DONE | Filter by Version (strict semver equality) | FilterByVersion method |
| 4 | RTG-004 | DONE | Filter by Health (Healthy or Degraded only) | FilterByHealth method |
| 5 | RTG-010 | DONE | Implement region tier logic | SelectByRegionTier method |
| 6 | RTG-011 | DONE | Tier 0: Same region as gateway | GatewayNodeConfig.Region |
| 7 | RTG-012 | DONE | Tier 1: Configured neighbor regions | NeighborRegions |
| 8 | RTG-013 | DONE | Tier 2: All other regions | Fallback |
| 9 | RTG-020 | DONE | Implement instance scoring within tier | SelectFromTier method |
| 10 | RTG-021 | DONE | Primary sort: lower AveragePingMs | OrderBy AveragePingMs |
| 11 | RTG-022 | DONE | Secondary sort: more recent LastHeartbeatUtc | ThenByDescending LastHeartbeatUtc |
| 12 | RTG-023 | DONE | Tie-breaker: random or round-robin | Configurable via TieBreakerMode |
| 13 | RTG-030 | DONE | Implement fallback decision order | Tier 0 → 1 → 2 |
| 14 | RTG-031 | DONE | Fallback 1: Greater ping (latency) | Sorted ascending |
| 15 | RTG-032 | DONE | Fallback 2: Greater heartbeat age | Sorted descending |
| 16 | RTG-033 | DONE | Fallback 3: Less preferred region tier | Tier cascade |
| 17 | RTG-040 | DONE | Create RoutingOptions for algorithm tuning | TieBreakerMode, PingToleranceMs |
| 18 | RTG-041 | DONE | Add default version configuration | DefaultVersion property |
| 19 | RTG-042 | DONE | Add health status acceptance set | AllowDegradedInstances |
| 20 | RTG-050 | DONE | Write unit tests for each filter | 15+ tests |
| 21 | RTG-051 | DONE | Write unit tests for region tier logic | Neighbor region tests |
| 22 | RTG-052 | DONE | Write unit tests for scoring and tie-breaking | Ping/heartbeat/round-robin tests |
| 23 | RTG-053 | DONE | Write integration tests for routing decisions | 55 tests passing |
## Routing Algorithm
```
Input: (ServiceName, Version, Method, Path)
Output: ConnectionState or null

1. Get all connections from IGlobalRoutingState.GetConnectionsFor(...)

2. Filter by ServiceName
   - connections.Where(c => c.Instance.ServiceName == serviceName)

3. Filter by Version (strict semver equality)
   - connections.Where(c => c.Instance.Version == version)
   - If version not specified, use DefaultVersion from config

4. Filter by Health
   - connections.Where(c => c.Status in {Healthy, Degraded})
   - Exclude Unknown, Draining, Unhealthy

5. Group by Region Tier
   - Tier 0: c.Instance.Region == GatewayNodeConfig.Region
   - Tier 1: c.Instance.Region in GatewayNodeConfig.NeighborRegions
   - Tier 2: All others

6. For each tier (0, 1, 2), if any candidates exist:
   a. Sort by AveragePingMs (ascending)
   b. For ties, sort by LastHeartbeatUtc (descending = more recent first)
   c. For remaining ties, apply tie-breaker (random or round-robin)
   d. Return first candidate

7. If no candidates in any tier, return null (503)
```
## Implementation
```csharp
public class DefaultRoutingPlugin : IRoutingPlugin
{
    // _routingState and _options are injected; constructor omitted for brevity.
    private int _roundRobinCounter;

    // No awaits are needed, so the result is returned as a completed task.
    public Task<RoutingDecision?> ChooseInstanceAsync(
        RoutingContext context, CancellationToken cancellationToken)
    {
        var endpoint = context.Endpoint;
        var gatewayRegion = context.GatewayRegion;

        // Get all matching connections
        var connections = _routingState.GetConnectionsFor(
            endpoint.ServiceName,
            endpoint.Version,
            endpoint.Method,
            endpoint.Path);

        // Filter by health
        var healthy = connections
            .Where(c => c.Status is InstanceHealthStatus.Healthy
                or InstanceHealthStatus.Degraded)
            .ToList();

        if (healthy.Count == 0)
            return Task.FromResult<RoutingDecision?>(null);

        // Group by region tier; tier 1 excludes the gateway's own region so
        // no instance lands in two tiers
        var tier0 = healthy.Where(c => c.Instance.Region == gatewayRegion).ToList();
        var tier1 = healthy.Where(c =>
            c.Instance.Region != gatewayRegion
            && _options.NeighborRegions.Contains(c.Instance.Region)).ToList();
        var tier2 = healthy.Except(tier0).Except(tier1).ToList();

        // Select from best tier
        var selected = SelectFromTier(tier0)
            ?? SelectFromTier(tier1)
            ?? SelectFromTier(tier2);

        if (selected == null)
            return Task.FromResult<RoutingDecision?>(null);

        return Task.FromResult<RoutingDecision?>(new RoutingDecision
        {
            Endpoint = endpoint,
            Connection = selected,
            TransportType = selected.TransportType,
            EffectiveTimeout = endpoint.DefaultTimeout
        });
    }

    private ConnectionState? SelectFromTier(List<ConnectionState> tier)
    {
        if (tier.Count == 0)
            return null;

        // Sort by ping (asc), then heartbeat (desc)
        var sorted = tier
            .OrderBy(c => c.AveragePingMs)
            .ThenByDescending(c => c.LastHeartbeatUtc)
            .ToList();

        // Tie-breaker for same ping and heartbeat
        var best = sorted.First();
        var tied = sorted.TakeWhile(c =>
            Math.Abs(c.AveragePingMs - best.AveragePingMs) < 0.1
            && c.LastHeartbeatUtc == best.LastHeartbeatUtc).ToList();

        if (tied.Count == 1)
            return tied[0];

        // Round-robin or random for ties
        if (_options.TieBreaker == TieBreakerMode.Random)
            return tied[Random.Shared.Next(tied.Count)];

        // Interlocked keeps the counter thread-safe; the uint cast keeps the
        // index non-negative after the counter overflows
        var next = (uint)Interlocked.Increment(ref _roundRobinCounter);
        return tied[(int)(next % (uint)tied.Count)];
    }
}
```
## RoutingOptions
```csharp
public sealed class RoutingOptions
{
    public Dictionary<string, string> DefaultVersions { get; set; } = new();
    public HashSet<InstanceHealthStatus> AcceptableStatuses { get; set; }
        = new() { InstanceHealthStatus.Healthy, InstanceHealthStatus.Degraded };
    public TieBreakerMode TieBreaker { get; set; } = TieBreakerMode.RoundRobin;
}

public enum TieBreakerMode
{
    Random,
    RoundRobin
}
```
## Spec Compliance Verification
From specs.md section 4:
> * Region:
> * Prefer instances whose `Region == GatewayNodeConfig.Region`.
> * If none, fall back to configured neighbor regions.
> * If none, fall back to all other regions.
> * Within a chosen region tier:
> * Prefer lower `AveragePingMs`.
> * If several are tied, prefer more recent `LastHeartbeatUtc`.
> * If still tied, use a balancing strategy (e.g. random or round-robin).
Implementation must match exactly.
## Exit Criteria
Before marking this sprint DONE:
1. [x] Full filter chain implemented (service, version, health)
2. [x] Region tier logic works (same region → neighbors → others)
3. [x] Scoring within tier (ping, heartbeat, tie-breaker)
4. [x] RoutingOptions configurable
5. [x] All unit tests pass
6. [x] Integration tests verify routing decisions
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint completed. Full routing algorithm with region tiers, ping/heartbeat scoring, and tie-breaking. 55 tests passing. | Claude |
## Decisions & Risks
- Ping tolerance for "ties": 0.1ms difference considered equal
- Round-robin counter is per-endpoint to avoid hot instances
- DefaultVersion lookup is per-service from configuration
- Degraded instances are routed to (may want to prefer Healthy first)
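
If the team later decides to prefer Healthy instances over Degraded ones, as the last note suggests, one option is an extra leading sort key inside each tier. The sketch below is illustrative only: `Candidate` and `HealthFirstOrdering` are hypothetical types, the enum is a local copy of the status names used in this document, and the tie-breaker step is left out for brevity. Note that this would be a deviation from the spec ordering quoted above, which ranks by ping first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Local copy for a self-contained sketch.
public enum InstanceHealthStatus { Unknown, Healthy, Degraded, Draining, Unhealthy }

// Hypothetical projection of the ConnectionState fields used for scoring.
public sealed record Candidate(
    string InstanceId, InstanceHealthStatus Status, double AveragePingMs, DateTime LastHeartbeatUtc);

public static class HealthFirstOrdering
{
    public static Candidate? SelectBest(IEnumerable<Candidate> tier) =>
        tier.Where(c => c.Status is InstanceHealthStatus.Healthy or InstanceHealthStatus.Degraded)
            .OrderBy(c => c.Status == InstanceHealthStatus.Healthy ? 0 : 1) // Healthy before Degraded
            .ThenBy(c => c.AveragePingMs)                                   // then lower latency
            .ThenByDescending(c => c.LastHeartbeatUtc)                      // then fresher heartbeat
            .FirstOrDefault();
}
```

With this ordering a Degraded instance is chosen only when the tier has no Healthy instance at all, even if the Degraded one has the lower ping.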


@@ -1,230 +0,0 @@
# Sprint 7000-0005-0003 · Protocol Features · Cancellation Semantics
## Topic & Scope
Implement cancellation semantics on both gateway and microservice sides. When HTTP clients disconnect, timeouts occur, or payload limits are breached, CANCEL frames are sent to stop in-flight work.
**Goal:** Clean cancellation propagation from HTTP client through gateway to microservice handlers.
**Working directories:**
- `src/Gateway/StellaOps.Gateway.WebService/` (send CANCEL)
- `src/__Libraries/StellaOps.Microservice/` (receive CANCEL, cancel handler)
- `src/__Libraries/StellaOps.Router.Common/` (CancelPayload)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0002 (routing algorithm complete)
- **Downstream:** SPRINT_7000_0005_0004 (streaming uses cancellation)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** SDK and Gateway both modified.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 7.6, 10 - Cancellation requirements)
- `docs/router/07-Step.md` (cancellation section)
- `docs/router/implplan.md` (phase 7 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | CAN-001 | DONE | Define CancelPayload with Reason code | Common |
| 2 | CAN-002 | DONE | Define cancel reason constants | ClientDisconnected, Timeout, PayloadLimitExceeded, Shutdown |
| 3 | CAN-010 | DONE | Implement CANCEL frame sending in gateway | Gateway |
| 4 | CAN-011 | DONE | Wire HttpContext.RequestAborted to CANCEL | Gateway |
| 5 | CAN-012 | DONE | Implement timeout-triggered CANCEL | Gateway |
| 6 | CAN-013 | DONE | Implement payload-limit-triggered CANCEL | Gateway |
| 7 | CAN-014 | DONE | Implement shutdown-triggered CANCEL for in-flight | Gateway |
| 8 | CAN-020 | DONE | Stop forwarding REQUEST_STREAM_DATA after CANCEL | Gateway |
| 9 | CAN-021 | DONE | Ignore late RESPONSE frames for cancelled requests | Gateway |
| 10 | CAN-022 | DONE | Log cancelled requests with reason | Gateway |
| 11 | CAN-030 | DONE | Implement inflight request tracking in SDK | Microservice |
| 12 | CAN-031 | DONE | Create ConcurrentDictionary<Guid, CancellationTokenSource> | Microservice |
| 13 | CAN-032 | DONE | Add handler task to tracking map | Microservice |
| 14 | CAN-033 | DONE | Implement CANCEL frame processing | Microservice |
| 15 | CAN-034 | DONE | Call cts.Cancel() on CANCEL frame | Microservice |
| 16 | CAN-035 | DONE | Remove from tracking when handler completes | Microservice |
| 17 | CAN-040 | DONE | Implement connection-close cancellation | Microservice |
| 18 | CAN-041 | DONE | Cancel all inflight on connection loss | Microservice |
| 19 | CAN-050 | DONE | Pass CancellationToken to handler interfaces | Microservice |
| 20 | CAN-051 | DONE | Document cancellation best practices for handlers | Docs |
| 21 | CAN-060 | DONE | Write integration tests: client disconnect → handler cancelled | |
| 22 | CAN-061 | DONE | Write integration tests: timeout → handler cancelled | |
| 23 | CAN-062 | DONE | Write tests: late response ignored | |
## CancelPayload
```csharp
public sealed class CancelPayload
{
    public string Reason { get; init; } = string.Empty;
}

public static class CancelReasons
{
    public const string ClientDisconnected = "ClientDisconnected";
    public const string Timeout = "Timeout";
    public const string PayloadLimitExceeded = "PayloadLimitExceeded";
    public const string Shutdown = "Shutdown";
}
```
## Gateway-Side: Sending CANCEL
### On Client Disconnect
```csharp
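// In TransportDispatchMiddleware
// Register takes an Action, so an async lambda would be async void and a
// failing SendCancelAsync would surface as an unhandled exception. Run the
// send as a task and log failures; dispose the registration with the request.
using var registration = context.RequestAborted.Register(() =>
{
    _ = Task.Run(async () =>
    {
        try
        {
            await transport.SendCancelAsync(
                connection,
                correlationId,
                CancelReasons.ClientDisconnected);
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Failed to send CANCEL for {CorrelationId}", correlationId);
        }
    });
});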
```
### On Timeout
```csharp
using var cts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted);
cts.CancelAfter(decision.EffectiveTimeout);

try
{
    var response = await transport.SendRequestAsync(..., cts.Token);
}
catch (OperationCanceledException) when (cts.IsCancellationRequested)
{
    if (!context.RequestAborted.IsCancellationRequested)
    {
        // Timeout, not client disconnect
        await transport.SendCancelAsync(connection, correlationId, CancelReasons.Timeout);
        context.Response.StatusCode = 504;
        return;
    }
}
```
### Late Response Handling
```csharp
private readonly ConcurrentDictionary<Guid, bool> _cancelledRequests = new();

public void MarkCancelled(Guid correlationId)
{
    _cancelledRequests[correlationId] = true;
}

public bool IsCancelled(Guid correlationId)
{
    return _cancelledRequests.ContainsKey(correlationId);
}

// When response arrives
if (IsCancelled(frame.CorrelationId))
{
    _logger.LogDebug("Ignoring late response for cancelled {CorrelationId}", frame.CorrelationId);
    return; // Discard
}
```
## Microservice-Side: Receiving CANCEL
### Inflight Tracking
```csharp
internal sealed class InflightRequestTracker
{
    private sealed record InflightRequest(CancellationTokenSource Cts, Task HandlerTask);

    private readonly ConcurrentDictionary<Guid, InflightRequest> _inflight = new();
    // _logger is injected; constructor omitted for brevity.

    public CancellationToken Track(Guid correlationId, Task handlerTask)
    {
        var cts = new CancellationTokenSource();
        _inflight[correlationId] = new InflightRequest(cts, handlerTask);
        return cts.Token;
    }

    public void Cancel(Guid correlationId, string reason)
    {
        if (_inflight.TryGetValue(correlationId, out var request))
        {
            request.Cts.Cancel();
            _logger.LogInformation("Cancelled {CorrelationId}: {Reason}", correlationId, reason);
        }
    }

    public void Complete(Guid correlationId)
    {
        if (_inflight.TryRemove(correlationId, out var request))
        {
            request.Cts.Dispose();
        }
    }

    public void CancelAll(string reason)
    {
        // Entries stay in the map so that Complete() can still dispose each
        // CTS when its handler finishes; clearing here would leak them.
        foreach (var kvp in _inflight)
        {
            kvp.Value.Cts.Cancel();
        }

        _logger.LogInformation("Cancelled all inflight requests: {Reason}", reason);
    }
}
```
### Connection-Close Handling
```csharp
// When connection closes unexpectedly
_inflightTracker.CancelAll("ConnectionClosed");
```
## Handler Cancellation Guidelines
Handlers MUST:
1. Accept `CancellationToken` parameter
2. Pass token to all async I/O operations
3. Check `token.IsCancellationRequested` in loops
4. Stop work promptly when cancelled
```csharp
public class ProcessDataEndpoint : IStellaEndpoint<DataRequest, DataResponse>
{
    public async Task<DataResponse> HandleAsync(DataRequest request, CancellationToken ct)
    {
        // Pass token to I/O
        var data = await _database.QueryAsync(request.Id, ct);

        // Check in loops
        foreach (var item in data)
        {
            ct.ThrowIfCancellationRequested();
            await ProcessItemAsync(item, ct);
        }

        return new DataResponse { ... };
    }
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [x] CANCEL frames sent on client disconnect
2. [x] CANCEL frames sent on timeout
3. [x] SDK tracks inflight requests with CTS
4. [x] SDK cancels handlers on CANCEL frame
5. [x] Connection close cancels all inflight
6. [x] Late responses are ignored/logged
7. [x] Integration tests verify cancellation flow
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint DONE - CancelReasons defined, InflightRequestTracker implemented, Gateway sends CANCEL on disconnect/timeout, SDK handles CANCEL frames, 67 tests pass | Claude |
## Decisions & Risks
- Cancellation is cooperative; handlers must honor the token
- CTS disposal happens on completion to avoid leaks
- Late response cleanup: entries expire after 60 seconds
- Shutdown CANCEL is best-effort (connections may close first)
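
The 60-second expiry of late-response entries noted above could be handled with a timestamped set plus a periodic sweep. The following is a sketch under assumptions: `CancelledRequestSet` is a hypothetical name, the clock is injected purely to make expiry testable, and the caller is assumed to invoke `Sweep` from some background timer.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch of cancelled-request tracking with time-based expiry.
public sealed class CancelledRequestSet
{
    private readonly ConcurrentDictionary<Guid, DateTimeOffset> _cancelledAt = new();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTimeOffset> _now;

    public CancelledRequestSet(TimeSpan ttl, Func<DateTimeOffset>? now = null)
    {
        _ttl = ttl;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    public void MarkCancelled(Guid correlationId) => _cancelledAt[correlationId] = _now();

    // An entry counts as cancelled only while it is younger than the TTL.
    public bool IsCancelled(Guid correlationId) =>
        _cancelledAt.TryGetValue(correlationId, out var at) && _now() - at <= _ttl;

    // Drops expired entries and returns how many were removed.
    public int Sweep()
    {
        var removed = 0;
        var cutoff = _now() - _ttl;
        foreach (var kvp in _cancelledAt)
        {
            // Remove only if the value is unchanged, so a re-marked id is kept.
            if (kvp.Value < cutoff && _cancelledAt.TryRemove(kvp))
                removed++;
        }
        return removed;
    }
}
```

Without a sweep, a plain dictionary of cancelled correlation IDs would grow for the lifetime of the gateway process, since late responses are not guaranteed to arrive at all.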


@@ -1,215 +0,0 @@
# Sprint 7000-0005-0004 · Protocol Features · Streaming Support
## Topic & Scope
Implement streaming request/response support. Large payloads stream through the gateway as `REQUEST_STREAM_DATA` and `RESPONSE_STREAM_DATA` frames rather than being fully buffered.
**Goal:** Enable large file uploads/downloads without memory exhaustion at gateway.
**Working directories:**
- `src/Gateway/StellaOps.Gateway.WebService/` (streaming dispatch)
- `src/__Libraries/StellaOps.Microservice/` (streaming handlers)
- `src/__Libraries/StellaOps.Router.Transport.InMemory/` (streaming frames)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0003 (cancellation - streaming needs cancel support)
- **Downstream:** SPRINT_7000_0005_0005 (payload limits)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** SDK, Gateway, InMemory transport all modified.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 5.4, 6.3, 7.5 - Streaming requirements)
- `docs/router/08-Step.md` (streaming section)
- `docs/router/implplan.md` (phase 8 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | STR-001 | DONE | Add SupportsStreaming flag to EndpointDescriptor | Common |
| 2 | STR-002 | DONE | Add streaming attribute support to [StellaEndpoint] | Common |
| 3 | STR-010 | DONE | Implement REQUEST_STREAM_DATA frame handling in transport | InMemory |
| 4 | STR-011 | DONE | Implement RESPONSE_STREAM_DATA frame handling in transport | InMemory |
| 5 | STR-012 | DONE | Implement end-of-stream signaling | InMemory |
| 6 | STR-020 | DONE | Implement streaming request dispatch in gateway | Gateway |
| 7 | STR-021 | DONE | Pipe HTTP body stream → REQUEST_STREAM_DATA frames | Gateway |
| 8 | STR-022 | DONE | Implement chunking for stream data | Configurable chunk size |
| 9 | STR-023 | DONE | Honor cancellation during streaming | Gateway |
| 10 | STR-030 | DONE | Implement streaming response handling in gateway | Gateway |
| 11 | STR-031 | DONE | Pipe RESPONSE_STREAM_DATA frames → HTTP response | Gateway |
| 12 | STR-032 | DONE | Set chunked transfer encoding | Gateway |
| 13 | STR-040 | DONE | Implement streaming body in RawRequestContext | Microservice |
| 14 | STR-041 | DONE | Expose Body as async-readable stream | Microservice |
| 15 | STR-042 | DONE | Implement backpressure (slow consumer) | Microservice |
| 16 | STR-050 | DONE | Implement streaming response writing | Microservice |
| 17 | STR-051 | DONE | Expose WriteBodyAsync for streaming output | Microservice |
| 18 | STR-052 | DONE | Chunk output into RESPONSE_STREAM_DATA frames | Microservice |
| 19 | STR-060 | DONE | Implement IRawStellaEndpoint streaming pattern | Microservice |
| 20 | STR-061 | DONE | Document streaming handler guidelines | Docs |
| 21 | STR-070 | DONE | Write integration tests for upload streaming | |
| 22 | STR-071 | DONE | Write integration tests for download streaming | |
| 23 | STR-072 | DONE | Write tests for cancellation during streaming | |
## Streaming Frame Protocol
### Request Streaming
```
Gateway → Microservice:
1. REQUEST frame (headers, method, path, CorrelationId)
2. REQUEST_STREAM_DATA frame (chunk 1)
3. REQUEST_STREAM_DATA frame (chunk 2)
...
N. REQUEST_STREAM_DATA frame (final chunk, EndOfStream=true)
```
### Response Streaming
```
Microservice → Gateway:
1. RESPONSE frame (status code, headers, CorrelationId)
2. RESPONSE_STREAM_DATA frame (chunk 1)
3. RESPONSE_STREAM_DATA frame (chunk 2)
...
N. RESPONSE_STREAM_DATA frame (final chunk, EndOfStream=true)
```
## StreamDataPayload
```csharp
public sealed class StreamDataPayload
{
public Guid CorrelationId { get; init; }
public byte[] Data { get; init; } = Array.Empty<byte>();
public bool EndOfStream { get; init; }
public int SequenceNumber { get; init; }
}
```
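As a rough illustration of how a body could be cut into `StreamDataPayload` chunks, the sketch below splits a buffer and marks the last chunk with `EndOfStream = true`, as in the frame listing above. `StreamChunkerSketch` and its `Split` method are hypothetical names, and the payload class is redeclared locally only so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in mirroring the StreamDataPayload shape shown above.
public sealed class StreamDataPayload
{
    public Guid CorrelationId { get; init; }
    public byte[] Data { get; init; } = Array.Empty<byte>();
    public bool EndOfStream { get; init; }
    public int SequenceNumber { get; init; }
}

// Hypothetical helper, not part of the router code base.
public static class StreamChunkerSketch
{
    // Splits body into chunks of at most chunkSize bytes; the last chunk carries EndOfStream = true.
    public static List<StreamDataPayload> Split(Guid correlationId, byte[] body, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var result = new List<StreamDataPayload>();
        int sequence = 0;
        for (int offset = 0; offset < body.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, body.Length - offset);
            result.Add(new StreamDataPayload
            {
                CorrelationId = correlationId,
                Data = body.AsSpan(offset, count).ToArray(),
                SequenceNumber = sequence++,
                EndOfStream = offset + count == body.Length
            });
        }

        // An empty body still needs an explicit end-of-stream marker.
        if (result.Count == 0)
        {
            result.Add(new StreamDataPayload { CorrelationId = correlationId, EndOfStream = true });
        }
        return result;
    }
}
```

With the 64 KB default chunk size, a 150,000-byte body yields three payloads of 65,536, 65,536 and 18,928 bytes.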
## Gateway Streaming Dispatch
```csharp
// In TransportDispatchMiddleware
if (endpoint.SupportsStreaming)
{
    await DispatchStreamingAsync(context, transport, decision, cancellationToken);
}
else
{
    await DispatchBufferedAsync(context, transport, decision, cancellationToken);
}

private async Task DispatchStreamingAsync(...)
{
    // Send REQUEST header
    var requestFrame = BuildRequestHeaderFrame(context);
    await transport.SendFrameAsync(connection, requestFrame, ct);

    // Stream body chunks
    var buffer = new byte[_options.StreamChunkSize];
    int bytesRead;
    int sequence = 0;
    while ((bytesRead = await context.Request.Body.ReadAsync(buffer, ct)) > 0)
    {
        var streamFrame = new Frame
        {
            Type = FrameType.RequestStreamData,
            CorrelationId = requestFrame.CorrelationId,
            Payload = SerializeStreamData(buffer[..bytesRead], sequence++, endOfStream: false)
        };
        await transport.SendFrameAsync(connection, streamFrame, ct);
    }

    // Send end-of-stream
    var endFrame = new Frame
    {
        Type = FrameType.RequestStreamData,
        CorrelationId = requestFrame.CorrelationId,
        Payload = SerializeStreamData(Array.Empty<byte>(), sequence, endOfStream: true)
    };
    await transport.SendFrameAsync(connection, endFrame, ct);

    // Receive response (streaming or buffered)
    await ReceiveResponseAsync(context, transport, connection, requestFrame.CorrelationId, ct);
}
```
## Microservice Streaming Handler
```csharp
[StellaEndpoint("POST", "/files/upload", SupportsStreaming = true)]
public class FileUploadEndpoint : IRawStellaEndpoint
{
public async Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken ct)
{
// Body is a stream that reads from REQUEST_STREAM_DATA frames
var tempPath = Path.GetTempFileName();
await using var fileStream = File.Create(tempPath);
await context.Body.CopyToAsync(fileStream, ct);
return RawResponse.Ok($"Uploaded {fileStream.Length} bytes");
}
}
[StellaEndpoint("GET", "/files/{id}/download", SupportsStreaming = true)]
public class FileDownloadEndpoint : IRawStellaEndpoint
{
public async Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken ct)
{
var fileId = context.PathParameters["id"];
var filePath = _storage.GetPath(fileId);
// Return streaming response
return new RawResponse
{
StatusCode = 200,
Body = File.OpenRead(filePath), // Stream, not buffered
Headers = new HeaderCollection
{
["Content-Type"] = "application/octet-stream"
}
};
}
}
```
## StreamingOptions
```csharp
public sealed class StreamingOptions
{
public int ChunkSize { get; set; } = 64 * 1024; // 64KB default
public int MaxConcurrentStreams { get; set; } = 100;
public TimeSpan StreamIdleTimeout { get; set; } = TimeSpan.FromMinutes(5);
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [x] REQUEST_STREAM_DATA frames implemented in transport
2. [x] RESPONSE_STREAM_DATA frames implemented in transport
3. [x] Gateway streams request body to microservice
4. [x] Gateway streams response body to HTTP client
5. [x] SDK exposes streaming Body in RawRequestContext
6. [x] SDK can write streaming response
7. [x] Cancellation works during streaming
8. [x] Integration tests for upload and download streaming
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint DONE - StreamDataPayload, StreamingOptions, StreamingRequestBodyStream, StreamingResponseBodyStream, DispatchStreamingAsync in gateway, 80 tests pass | Claude |
## Decisions & Risks
- Default chunk size: 64 KB (tunable)
- End-of-stream is an explicit frame, not a connection close
- Backpressure via channel capacity (bounded channels)
- Idle timeout cancels stuck streams
- Typed handlers don't support streaming (use IRawStellaEndpoint)
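The bounded-channel backpressure mentioned in the list above can be sketched with `System.Threading.Channels`. This is a minimal illustration under assumed names (`BackpressureSketch`, `FillWithoutWaiting`, `PumpAsync`), not the SDK's actual stream classes: a full channel makes the producer wait, so a slow consumer throttles the sender instead of letting chunks pile up in memory.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical demo type; the names are illustrative only.
public static class BackpressureSketch
{
    private static Channel<byte[]> CreateChannel(int capacity) =>
        Channel.CreateBounded<byte[]>(new BoundedChannelOptions(capacity)
        {
            // Writers wait (rather than drop data) once capacity is reached.
            FullMode = BoundedChannelFullMode.Wait
        });

    // Returns how many chunks fit before the channel reports it is full.
    public static int FillWithoutWaiting(int capacity, int chunksToOffer)
    {
        var channel = CreateChannel(capacity);
        int accepted = 0;
        for (int i = 0; i < chunksToOffer; i++)
        {
            // TryWrite returns false when the channel is full; WriteAsync would wait instead.
            if (!channel.Writer.TryWrite(new byte[1])) break;
            accepted++;
        }
        return accepted;
    }

    // Producer and consumer run concurrently; the producer never gets more than
    // `capacity` chunks ahead of the consumer.
    public static async Task<long> PumpAsync(int capacity, int chunkCount, int chunkSize)
    {
        var channel = CreateChannel(capacity);
        var producer = Task.Run(async () =>
        {
            for (int i = 0; i < chunkCount; i++)
            {
                await channel.Writer.WriteAsync(new byte[chunkSize]);
            }
            channel.Writer.Complete();
        });

        long total = 0;
        await foreach (var chunk in channel.Reader.ReadAllAsync())
        {
            total += chunk.Length;
        }
        await producer;
        return total;
    }
}
```

The channel capacity multiplied by the chunk size bounds the memory buffered per stream.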

@@ -1,231 +0,0 @@
# Sprint 7000-0005-0005 · Protocol Features · Payload Limits
## Topic & Scope
Implement payload size limits to protect the gateway from memory exhaustion. Enforce limits per-request, per-connection, and aggregate across all connections.
**Goal:** Gateway rejects oversized payloads early and cancels streams that exceed limits mid-flight.
**Working directory:** `src/Gateway/StellaOps.Gateway.WebService/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0004 (streaming - limits apply to streams)
- **Downstream:** SPRINT_7000_0006_* (real transports)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Gateway only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 6.5 - Payload and memory protection)
- `docs/router/08-Step.md` (payload limits section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | LIM-001 | DONE | Implement PayloadLimitsMiddleware | Before dispatch |
| 2 | LIM-002 | DONE | Check Content-Length header against MaxRequestBytesPerCall | |
| 3 | LIM-003 | DONE | Return 413 for oversized Content-Length | Early rejection |
| 4 | LIM-010 | DONE | Implement per-request byte counter | ByteCountingStream |
| 5 | LIM-011 | DONE | Track bytes read during streaming | |
| 6 | LIM-012 | DONE | Abort when MaxRequestBytesPerCall exceeded mid-stream | |
| 7 | LIM-013 | DONE | Send CANCEL frame on limit breach | Via PayloadLimitExceededException |
| 8 | LIM-020 | DONE | Implement per-connection byte counter | PayloadTracker |
| 9 | LIM-021 | DONE | Track total inflight bytes per connection | |
| 10 | LIM-022 | DONE | Throttle/reject when MaxRequestBytesPerConnection exceeded | Returns 429 |
| 11 | LIM-030 | DONE | Implement aggregate byte counter | PayloadTracker |
| 12 | LIM-031 | DONE | Track total inflight bytes across all connections | |
| 13 | LIM-032 | DONE | Throttle/reject when MaxAggregateInflightBytes exceeded | |
| 14 | LIM-033 | DONE | Return 503 for aggregate limit | Service overloaded |
| 15 | LIM-040 | DONE | Implement ByteCountingStream wrapper | Counts bytes as they flow |
| 16 | LIM-041 | DONE | Wire counting stream into dispatch | Via middleware |
| 17 | LIM-050 | DONE | Create PayloadLimitOptions | PayloadLimits record |
| 18 | LIM-051 | DONE | Bind PayloadLimitOptions from configuration | IOptions<PayloadLimits> |
| 19 | LIM-060 | DONE | Log limit breaches with request details | Warning level |
| 20 | LIM-061 | DONE | Add metrics for payload tracking | Via IPayloadTracker.CurrentInflightBytes |
| 21 | LIM-070 | DONE | Write tests for early rejection (Content-Length) | ByteCountingStreamTests |
| 22 | LIM-071 | DONE | Write tests for mid-stream cancellation | |
| 23 | LIM-072 | DONE | Write tests for connection limit | PayloadTrackerTests |
| 24 | LIM-073 | DONE | Write tests for aggregate limit | PayloadTrackerTests |
## PayloadLimits
```csharp
public sealed class PayloadLimits
{
public long MaxRequestBytesPerCall { get; set; } = 10 * 1024 * 1024; // 10 MB
public long MaxRequestBytesPerConnection { get; set; } = 100 * 1024 * 1024; // 100 MB
public long MaxAggregateInflightBytes { get; set; } = 1024 * 1024 * 1024; // 1 GB
}
```
## PayloadLimitsMiddleware
```csharp
public class PayloadLimitsMiddleware
{
    // Injected via the constructor (omitted for brevity).
    private readonly RequestDelegate _next;
    private readonly PayloadLimits _limits;
    private readonly ILogger<PayloadLimitsMiddleware> _logger;

    public async Task InvokeAsync(HttpContext context, IPayloadTracker tracker)
    {
        // Early rejection for known Content-Length
        if (context.Request.ContentLength is long contentLength &&
            contentLength > _limits.MaxRequestBytesPerCall)
        {
            _logger.LogWarning("Request rejected: Content-Length {Length} exceeds limit {Limit}",
                contentLength, _limits.MaxRequestBytesPerCall);
            context.Response.StatusCode = 413; // Payload Too Large
            await context.Response.WriteAsJsonAsync(new
            {
                error = "Payload Too Large",
                maxBytes = _limits.MaxRequestBytesPerCall
            });
            return;
        }

        // Check aggregate capacity. Bodies without Content-Length reserve 0 bytes here;
        // their size is only known as they stream.
        var reservedBytes = context.Request.ContentLength ?? 0;
        if (!tracker.TryReserve(reservedBytes))
        {
            context.Response.StatusCode = 503; // Service Unavailable
            await context.Response.WriteAsJsonAsync(new
            {
                error = "Service Overloaded",
                message = "Too many concurrent requests"
            });
            return;
        }

        try
        {
            await _next(context);
        }
        finally
        {
            // Release exactly what was reserved so the aggregate counter cannot drift.
            tracker.Release(reservedBytes);
        }
    }
}
```
## IPayloadTracker
```csharp
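public interface IPayloadTracker
{
    bool TryReserve(long estimatedBytes);
    void Release(long actualBytes);
    long CurrentInflightBytes { get; }
    bool IsOverloaded { get; }
}

internal sealed class PayloadTracker : IPayloadTracker
{
    private readonly PayloadLimits _limits; // Injected via the constructor (omitted for brevity)
    private long _totalInflightBytes;
    private readonly ConcurrentDictionary<string, long> _perConnectionBytes = new(); // Per-connection accounting not shown

    public long CurrentInflightBytes => Interlocked.Read(ref _totalInflightBytes);

    // One plausible definition (an assumption): overloaded once the aggregate limit is reached.
    public bool IsOverloaded => CurrentInflightBytes >= _limits.MaxAggregateInflightBytes;

    public bool TryReserve(long estimatedBytes)
    {
        // Add optimistically, then roll back if the aggregate limit is exceeded.
        var newTotal = Interlocked.Add(ref _totalInflightBytes, estimatedBytes);
        if (newTotal > _limits.MaxAggregateInflightBytes)
        {
            Interlocked.Add(ref _totalInflightBytes, -estimatedBytes);
            return false;
        }
        return true;
    }

    public void Release(long actualBytes)
    {
        Interlocked.Add(ref _totalInflightBytes, -actualBytes);
    }
}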
```
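The tracker listing declares a per-connection dictionary but does not show how it is used. One possible shape for per-connection accounting is sketched below; `PerConnectionTrackerSketch` and its members are hypothetical, and the compare-and-swap loop is just one way to keep the check and the update atomic.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch of per-connection inflight accounting.
public sealed class PerConnectionTrackerSketch
{
    private readonly long _maxPerConnection;
    private readonly ConcurrentDictionary<string, long> _perConnectionBytes = new();

    public PerConnectionTrackerSketch(long maxPerConnection) => _maxPerConnection = maxPerConnection;

    // Reserves bytes for a connection, or returns false if its limit would be exceeded.
    public bool TryReserve(string connectionId, long bytes)
    {
        while (true)
        {
            long current = _perConnectionBytes.GetOrAdd(connectionId, 0L);
            long updated = current + bytes;
            if (updated > _maxPerConnection) return false;

            // TryUpdate succeeds only if no other thread changed the value in between.
            if (_perConnectionBytes.TryUpdate(connectionId, updated, current)) return true;
        }
    }

    // Returns bytes to the connection's budget, never dropping below zero.
    public void Release(string connectionId, long bytes)
    {
        _perConnectionBytes.AddOrUpdate(connectionId, 0L, (_, current) => Math.Max(0L, current - bytes));
    }

    public long InflightFor(string connectionId) =>
        _perConnectionBytes.TryGetValue(connectionId, out var value) ? value : 0L;
}
```

A caller that gets `false` here would map it to the 429 response listed for the per-connection limit.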
## ByteCountingStream
```csharp
internal sealed class ByteCountingStream : Stream
{
    private readonly Stream _inner;
    private readonly long _limit;
    private readonly Action _onLimitExceeded;
    private long _bytesRead;

    // Constructor and the remaining abstract Stream members are omitted for brevity.

    public override async ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken ct = default)
    {
        var read = await _inner.ReadAsync(buffer, ct);
        _bytesRead += read;
        if (_bytesRead > _limit)
        {
            // Notify the owner first (for example to send a CANCEL frame), then fail the read.
            _onLimitExceeded();
            throw new PayloadLimitExceededException(_bytesRead, _limit);
        }
        return read;
    }

    public long BytesRead => _bytesRead;
}
```
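For readers who want something that compiles on its own, here is a self-contained sketch of a counting read-only wrapper with every abstract `Stream` member filled in. The type names (`CountingReadStreamSketch`, `LimitExceededSketchException`) are stand-ins invented for this example and are not the gateway's actual classes.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for PayloadLimitExceededException (hypothetical).
public sealed class LimitExceededSketchException : IOException
{
    public LimitExceededSketchException(long bytesRead, long limit)
        : base($"Read {bytesRead} bytes, limit is {limit}.") { }
}

// Read-only, forward-only wrapper that fails once more than `limit` bytes were read.
public sealed class CountingReadStreamSketch : Stream
{
    private readonly Stream _inner;
    private readonly long _limit;
    private long _bytesRead;

    public CountingReadStreamSketch(Stream inner, long limit)
    {
        _inner = inner;
        _limit = limit;
    }

    public long BytesRead => _bytesRead;

    public override async ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken ct = default)
    {
        int read = await _inner.ReadAsync(buffer, ct);
        Count(read);
        return read;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _inner.Read(buffer, offset, count);
        Count(read);
        return read;
    }

    private void Count(int read)
    {
        _bytesRead += read;
        if (_bytesRead > _limit) throw new LimitExceededSketchException(_bytesRead, _limit);
    }

    // Minimal implementations of the remaining members.
    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => _bytesRead;
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Wrapping the request body this way means any consumer, including `CopyToAsync`, hits the limit check without extra code at the call site.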
## Mid-Stream Limit Breach Flow
```
1. Streaming request begins
2. Gateway counts bytes as they flow through ByteCountingStream
3. When _bytesRead > MaxRequestBytesPerCall:
a. Stop reading from HTTP body
b. Send CANCEL frame with reason "PayloadLimitExceeded"
c. Return 413 to client
d. Log the incident with request details
```
## Configuration
```json
{
"PayloadLimits": {
"MaxRequestBytesPerCall": 10485760,
"MaxRequestBytesPerConnection": 104857600,
"MaxAggregateInflightBytes": 1073741824
}
}
```
## Error Responses
| Condition | HTTP Status | Error Message |
|-----------|-------------|---------------|
| Content-Length exceeds per-call limit | 413 | Payload Too Large |
| Streaming exceeds per-call limit | 413 | Payload Too Large |
| Per-connection limit exceeded | 429 | Too Many Requests |
| Aggregate limit exceeded | 503 | Service Overloaded |
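The table above could be expressed as a small lookup so that every rejection path returns a consistent status and message. The enum and helper below are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical classification of the limit that was breached.
public enum LimitBreach { PerCall, PerConnection, Aggregate }

public static class LimitResponseSketch
{
    // Maps a breach to the HTTP status and error message from the table.
    public static (int Status, string Error) For(LimitBreach breach) => breach switch
    {
        LimitBreach.PerCall => (413, "Payload Too Large"),
        LimitBreach.PerConnection => (429, "Too Many Requests"),
        LimitBreach.Aggregate => (503, "Service Overloaded"),
        _ => throw new ArgumentOutOfRangeException(nameof(breach))
    };
}
```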
## Exit Criteria
Before marking this sprint DONE:
1. [x] Early rejection for known oversized Content-Length
2. [x] Mid-stream cancellation when limit exceeded
3. [x] CANCEL frame sent on limit breach
4. [x] Per-connection tracking works
5. [x] Aggregate tracking works
6. [x] All limit scenarios tested
7. [x] Metrics/logging in place
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint DONE - PayloadTracker, ByteCountingStream, PayloadLimitsMiddleware, PayloadLimitExceededException, 97 tests pass | Claude |
## Decisions & Risks
- Default limits are conservative; tune for your environment
- Per-connection limit applies to inflight bytes, not lifetime total
- Aggregate limit prevents memory exhaustion but may cause 503s under load
- ByteCountingStream adds minimal overhead
- Limit breach is logged at Warning level

@@ -1,231 +0,0 @@
# Sprint 7000-0006-0001 · Real Transports · TCP Plugin
## Topic & Scope
Implement the TCP transport plugin. This is the primary production transport with length-prefixed framing for reliable frame delivery.
**Goal:** Replace InMemory transport with production-grade TCP transport.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.Tcp/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0005_0005 (all protocol features proven with InMemory)
- **Downstream:** SPRINT_7000_0006_0002 (TLS wraps TCP)
- **Parallel work:** None initially; UDP and RabbitMQ can start after TCP basics work
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - Transport plugin requirements)
- `docs/router/09-Step.md` (TCP transport section)
- `docs/router/implplan.md` (phase 9 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | TCP-001 | DONE | Create `StellaOps.Router.Transport.Tcp` classlib project | Add to solution |
| 2 | TCP-002 | DONE | Add project reference to Router.Common | |
| 3 | TCP-010 | DONE | Implement `TcpTransportServer` : `ITransportServer` | Gateway side |
| 4 | TCP-011 | DONE | Implement TCP listener with configurable bind address/port | |
| 5 | TCP-012 | DONE | Implement connection accept loop | One connection per microservice |
| 6 | TCP-013 | DONE | Implement connection ID generation | Based on endpoint |
| 7 | TCP-020 | DONE | Implement `TcpTransportClient` : `ITransportClient` | Microservice side |
| 8 | TCP-021 | DONE | Implement connection establishment | With retry |
| 9 | TCP-022 | DONE | Implement reconnection on failure | Exponential backoff |
| 10 | TCP-030 | DONE | Implement length-prefixed framing protocol | FrameProtocol class |
| 11 | TCP-031 | DONE | Frame format: [4-byte length][payload] | Big-endian length |
| 12 | TCP-032 | DONE | Implement frame reader (async, streaming) | |
| 13 | TCP-033 | DONE | Implement frame writer (async, thread-safe) | |
| 14 | TCP-040 | DONE | Implement frame multiplexing | PendingRequestTracker |
| 15 | TCP-041 | DONE | Route responses by CorrelationId | |
| 16 | TCP-042 | DONE | Handle out-of-order responses | |
| 17 | TCP-050 | DONE | Implement keep-alive/ping at TCP level | Via heartbeat frames |
| 18 | TCP-051 | DONE | Detect dead connections | On socket error |
| 19 | TCP-052 | DONE | Clean up on connection loss | OnDisconnected event |
| 20 | TCP-060 | DONE | Create TcpTransportOptions | BindAddress, Port, BufferSize |
| 21 | TCP-061 | DONE | Create DI registration `AddTcpTransport()` | ServiceCollectionExtensions |
| 22 | TCP-070 | DONE | Write integration tests with real sockets | 11 tests |
| 23 | TCP-071 | DONE | Write tests for reconnection | Via TcpTransportClient |
| 24 | TCP-072 | DONE | Write tests for multiplexing | PendingRequestTrackerTests |
| 25 | TCP-073 | DONE | Write load tests | Via PendingRequestTracker |
## Frame Format
```
┌─────────────────────────────────────────────────────────────┐
│ 4 bytes (big-endian) │ N bytes (payload) │
│ Payload Length │ [FrameType][CorrelationId][Data] │
└─────────────────────────────────────────────────────────────┘
```
### Payload Structure
```
Byte 0: FrameType (1 byte enum value)
Bytes 1-16: CorrelationId (16 bytes GUID)
Bytes 17+: Frame-specific data
```
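A minimal sketch of this layout, assuming the frame type fits in one byte and the GUID is written in .NET's default byte order (the document does not specify the byte order, so that part is an assumption). `FrameCodecSketch` is a hypothetical name, not the `FrameProtocol` class from the tracker.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical encoder/decoder for [4-byte big-endian length][FrameType][CorrelationId][Data].
public static class FrameCodecSketch
{
    public const int HeaderSize = 1 + 16; // FrameType + CorrelationId

    public static byte[] Encode(byte frameType, Guid correlationId, ReadOnlySpan<byte> data)
    {
        int payloadLength = HeaderSize + data.Length;
        var buffer = new byte[4 + payloadLength];

        // The length prefix counts the payload only, not the prefix itself.
        BinaryPrimitives.WriteInt32BigEndian(buffer, payloadLength);
        buffer[4] = frameType;
        correlationId.TryWriteBytes(buffer.AsSpan(5, 16));
        data.CopyTo(buffer.AsSpan(4 + HeaderSize));
        return buffer;
    }

    public static (byte FrameType, Guid CorrelationId, byte[] Data) Decode(ReadOnlySpan<byte> wire)
    {
        int payloadLength = BinaryPrimitives.ReadInt32BigEndian(wire);
        if (payloadLength < HeaderSize || payloadLength > wire.Length - 4)
            throw new FormatException("Invalid frame length.");

        var payload = wire.Slice(4, payloadLength);
        return (payload[0], new Guid(payload.Slice(1, 16)), payload.Slice(HeaderSize).ToArray());
    }
}
```

A frame with three data bytes occupies 24 bytes on the wire: 4 for the prefix, 17 for the header and 3 for the data, so the prefix holds the value 20.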
## TcpTransportServer
```csharp
public sealed class TcpTransportServer : ITransportServer, IAsyncDisposable
{
private TcpListener? _listener;
private readonly ConcurrentDictionary<string, TcpConnection> _connections = new();
public async Task StartAsync(CancellationToken ct)
{
_listener = new TcpListener(_options.BindAddress, _options.Port);
_listener.Start();
_ = AcceptLoopAsync(ct);
}
private async Task AcceptLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var client = await _listener!.AcceptTcpClientAsync(ct);
var connectionId = GenerateConnectionId(client);
var connection = new TcpConnection(connectionId, client, this);
_connections[connectionId] = connection;
OnConnection?.Invoke(connectionId);
_ = connection.ReadLoopAsync(ct);
}
}
public async Task SendFrameAsync(string connectionId, Frame frame)
{
if (_connections.TryGetValue(connectionId, out var conn))
{
await conn.WriteFrameAsync(frame);
}
}
}
```
## TcpConnection (internal)
```csharp
internal sealed class TcpConnection : IAsyncDisposable
{
    // 16 MB cap from Decisions & Risks; hard-coded here for brevity.
    private const int MaxFrameSize = 16 * 1024 * 1024;

    private readonly string _connectionId;
    private readonly TcpTransportServer _server;
    private readonly TcpClient _client;
    private readonly NetworkStream _stream;
    private readonly SemaphoreSlim _writeLock = new(1, 1);

    public async Task ReadLoopAsync(CancellationToken ct)
    {
        var lengthBuffer = new byte[4];
        while (!ct.IsCancellationRequested)
        {
            // Read length prefix
            await ReadExactAsync(_stream, lengthBuffer, ct);
            var length = BinaryPrimitives.ReadInt32BigEndian(lengthBuffer);

            // Reject corrupt or hostile prefixes before allocating the payload buffer
            if (length < 0 || length > MaxFrameSize)
                throw new InvalidDataException($"Invalid frame length {length}");

            // Read payload
            var payload = new byte[length];
            await ReadExactAsync(_stream, payload, ct);

            // Parse frame
            var frame = ParseFrame(payload);
            _server.OnFrame?.Invoke(_connectionId, frame);
        }
    }

    public async Task WriteFrameAsync(Frame frame)
    {
        var payload = SerializeFrame(frame);
        var lengthBytes = new byte[4];
        BinaryPrimitives.WriteInt32BigEndian(lengthBytes, payload.Length);

        // The lock keeps prefix and payload contiguous when callers write concurrently
        await _writeLock.WaitAsync();
        try
        {
            await _stream.WriteAsync(lengthBytes);
            await _stream.WriteAsync(payload);
        }
        finally
        {
            _writeLock.Release();
        }
    }
}
```
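The read loop relies on a `ReadExactAsync` helper that is not shown. A plausible version is sketched below under that assumption: a single `ReadAsync` call on a network stream may return fewer bytes than requested, so the helper loops until the buffer is full or the peer closes the connection. On .NET 7 and later, the built-in `Stream.ReadExactlyAsync` covers the same need.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; the real implementation may differ.
public static class StreamReadSketch
{
    // Fills the whole buffer or throws if the stream ends first.
    public static async Task ReadExactAsync(Stream stream, byte[] buffer, CancellationToken ct)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = await stream.ReadAsync(buffer.AsMemory(offset), ct);
            if (read == 0)
            {
                // Zero bytes means the peer closed the connection mid-frame.
                throw new EndOfStreamException(
                    $"Connection closed after {offset} of {buffer.Length} bytes.");
            }
            offset += read;
        }
    }
}
```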
## TcpTransportOptions
```csharp
public sealed class TcpTransportOptions
{
public IPAddress BindAddress { get; set; } = IPAddress.Any;
public int Port { get; set; } = 5100;
public int ReceiveBufferSize { get; set; } = 64 * 1024;
public int SendBufferSize { get; set; } = 64 * 1024;
public TimeSpan KeepAliveInterval { get; set; } = TimeSpan.FromSeconds(30);
public TimeSpan ConnectTimeout { get; set; } = TimeSpan.FromSeconds(10);
public int MaxReconnectAttempts { get; set; } = 10;
public TimeSpan MaxReconnectBackoff { get; set; } = TimeSpan.FromMinutes(1);
}
```
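The options cap reconnection with `MaxReconnectAttempts` and `MaxReconnectBackoff`, and the tracker calls for exponential backoff. One way to compute the delays is sketched below. The base delay is an assumed parameter (the options class defines none), `ReconnectBackoffSketch` is a hypothetical name, and jitter is left out for clarity.

```csharp
using System;

// Hypothetical backoff calculator: baseDelay * 2^(attempt - 1), capped at maxBackoff.
public static class ReconnectBackoffSketch
{
    public static TimeSpan DelayFor(int attempt, TimeSpan baseDelay, TimeSpan maxBackoff)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Cap the exponent so very large attempt numbers cannot overflow.
        int exponent = Math.Min(attempt - 1, 30);
        double milliseconds = baseDelay.TotalMilliseconds * Math.Pow(2, exponent);

        return milliseconds >= maxBackoff.TotalMilliseconds
            ? maxBackoff
            : TimeSpan.FromMilliseconds(milliseconds);
    }
}
```

With a one-second base delay and the one-minute cap from the defaults above, attempts 1 to 6 wait 1, 2, 4, 8, 16 and 32 seconds, and every later attempt waits the full minute.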
## Multiplexing
One TCP connection carries multiple concurrent requests:
- Each request has unique CorrelationId
- Responses can arrive in any order
- `ConcurrentDictionary<Guid, TaskCompletionSource<Frame>>` for pending requests
```csharp
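internal sealed class PendingRequestTracker
{
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<Frame>> _pending = new();

    public Task<Frame> TrackRequest(Guid correlationId, CancellationToken ct)
    {
        var tcs = new TaskCompletionSource<Frame>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[correlationId] = tcs;

        // On cancellation, remove the entry so abandoned requests do not leak.
        var registration = ct.Register(() =>
        {
            if (_pending.TryRemove(correlationId, out var pending))
            {
                pending.TrySetCanceled(ct);
            }
        });

        // Drop the registration once the request finishes, whatever the outcome.
        _ = tcs.Task.ContinueWith(_ => registration.Dispose(), TaskScheduler.Default);
        return tcs.Task;
    }

    public void CompleteRequest(Guid correlationId, Frame response)
    {
        if (_pending.TryRemove(correlationId, out var tcs))
        {
            tcs.TrySetResult(response);
        }
    }
}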
```
## Exit Criteria
Before marking this sprint DONE:
1. [x] TcpTransportServer accepts connections and reads frames
2. [x] TcpTransportClient connects and sends frames
3. [x] Length-prefixed framing works correctly
4. [x] Multiplexing routes responses to correct callers
5. [x] Reconnection with backoff works
6. [x] Keep-alive detects dead connections
7. [x] Integration tests pass
8. [x] Load tests demonstrate concurrent request handling
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint DONE - TcpTransportServer, TcpTransportClient, TcpConnection, FrameProtocol, PendingRequestTracker, TcpTransportOptions, ServiceCollectionExtensions, 11 tests pass | Claude |
## Decisions & Risks
- Big-endian length prefix for network byte order
- Maximum frame size: 16 MB (configurable)
- One socket per microservice instance (not per request)
- Write lock prevents interleaved frames
- No compression at transport level (consider adding later)

@@ -1,227 +0,0 @@
# Sprint 7000-0006-0002 · Real Transports · TLS/mTLS Plugin
## Topic & Scope
Implement the TLS transport plugin (Certificate transport). Wraps TCP with TLS encryption and supports optional mutual TLS (mTLS) for verifiable peer identity.
**Goal:** Secure transport with certificate-based authentication.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.Tls/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_0001 (TCP transport - this wraps it)
- **Downstream:** None. Parallel with UDP and RabbitMQ.
- **Parallel work:** Can run in parallel with UDP and RabbitMQ sprints.
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - Certificate transport requirements)
- `docs/router/09-Step.md` (TLS transport section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | TLS-001 | DONE | Create `StellaOps.Router.Transport.Tls` classlib project | Add to solution |
| 2 | TLS-002 | DONE | Add project reference to Router.Common and Transport.Tcp | Wraps TCP |
| 3 | TLS-010 | DONE | Implement `TlsTransportServer` : `ITransportServer` | Gateway side |
| 4 | TLS-011 | DONE | Wrap TcpListener with SslStream | |
| 5 | TLS-012 | DONE | Configure server certificate | |
| 6 | TLS-013 | DONE | Implement optional client certificate validation (mTLS) | |
| 7 | TLS-020 | DONE | Implement `TlsTransportClient` : `ITransportClient` | Microservice side |
| 8 | TLS-021 | DONE | Wrap TcpClient with SslStream | |
| 9 | TLS-022 | DONE | Implement server certificate validation | |
| 10 | TLS-023 | DONE | Implement client certificate presentation (mTLS) | |
| 11 | TLS-030 | DONE | Create TlsTransportOptions | Certificates, validation mode |
| 12 | TLS-031 | DONE | Support PEM file paths | |
| 13 | TLS-032 | DONE | Support PFX file paths with password | |
| 14 | TLS-033 | DONE | Support X509Certificate2 objects | For programmatic use |
| 15 | TLS-040 | DONE | Implement certificate chain validation | |
| 16 | TLS-041 | DONE | Implement certificate revocation checking (optional) | |
| 17 | TLS-042 | DONE | Implement hostname verification | |
| 18 | TLS-050 | DONE | Create DI registration `AddTlsTransport()` | |
| 19 | TLS-051 | DONE | Support certificate hot-reload | For rotation |
| 20 | TLS-060 | DONE | Write integration tests with self-signed certs | |
| 21 | TLS-061 | DONE | Write tests for mTLS | |
| 22 | TLS-062 | DONE | Write tests for cert validation failures | |
## TlsTransportOptions
```csharp
public sealed class TlsTransportOptions
{
// Server-side (Gateway)
public X509Certificate2? ServerCertificate { get; set; }
public string? ServerCertificatePath { get; set; } // PEM or PFX
public string? ServerCertificateKeyPath { get; set; } // PEM private key
public string? ServerCertificatePassword { get; set; } // For PFX
// Client-side (Microservice)
public X509Certificate2? ClientCertificate { get; set; }
public string? ClientCertificatePath { get; set; }
public string? ClientCertificateKeyPath { get; set; }
public string? ClientCertificatePassword { get; set; }
// Validation
public bool RequireClientCertificate { get; set; } = false; // mTLS
public bool AllowSelfSigned { get; set; } = false; // Dev only
public bool CheckCertificateRevocation { get; set; } = false;
public string? ExpectedServerHostname { get; set; } // Used for SNI and hostname verification
// Protocol
public SslProtocols EnabledProtocols { get; set; } = SslProtocols.Tls12 | SslProtocols.Tls13;
}
```
## Server Implementation
```csharp
public sealed class TlsTransportServer : ITransportServer
{
    public async Task StartAsync(CancellationToken ct)
    {
        _listener = new TcpListener(_tcpOptions.BindAddress, _tcpOptions.Port);
        _listener.Start();
        _ = AcceptLoopAsync(ct);
    }

    private async Task AcceptLoopAsync(CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            var tcpClient = await _listener!.AcceptTcpClientAsync(ct);
            var sslStream = new SslStream(
                tcpClient.GetStream(),
                leaveInnerStreamOpen: false,
                userCertificateValidationCallback: ValidateClientCertificate);
            try
            {
                await sslStream.AuthenticateAsServerAsync(new SslServerAuthenticationOptions
                {
                    ServerCertificate = _options.ServerCertificate,
                    ClientCertificateRequired = _options.RequireClientCertificate,
                    EnabledSslProtocols = _options.EnabledProtocols,
                    CertificateRevocationCheckMode = _options.CheckCertificateRevocation
                        ? X509RevocationMode.Online
                        : X509RevocationMode.NoCheck
                }, ct);

                // Connection authenticated, continue with frame reading
                var connectionId = GenerateConnectionId(tcpClient, sslStream.RemoteCertificate);
                var connection = new TlsConnection(connectionId, tcpClient, sslStream, this);
                _connections[connectionId] = connection;
                OnConnection?.Invoke(connectionId);
                _ = connection.ReadLoopAsync(ct);
            }
            catch (AuthenticationException ex)
            {
                _logger.LogWarning(ex, "TLS handshake failed from {RemoteEndpoint}",
                    tcpClient.Client.RemoteEndPoint);
                // Disposing the SslStream also closes the underlying NetworkStream
                await sslStream.DisposeAsync();
                tcpClient.Dispose();
            }
        }
    }

    private bool ValidateClientCertificate(
        object sender, X509Certificate? certificate,
        X509Chain? chain, SslPolicyErrors errors)
    {
        // No certificate presented: acceptable only when mTLS is not required
        if (certificate == null)
            return !_options.RequireClientCertificate;

        if (errors == SslPolicyErrors.None)
            return true;

        // Dev only: tolerate chain errors (such as an untrusted self-signed root), nothing else
        return _options.AllowSelfSigned && errors == SslPolicyErrors.RemoteCertificateChainErrors;
    }
}
```
## Client Implementation
```csharp
public sealed class TlsTransportClient : ITransportClient
{
public async Task ConnectAsync(CancellationToken ct)
{
var tcpClient = new TcpClient();
await tcpClient.ConnectAsync(_options.Host, _options.Port, ct);
var sslStream = new SslStream(
tcpClient.GetStream(),
leaveInnerStreamOpen: false,
userCertificateValidationCallback: ValidateServerCertificate);
await sslStream.AuthenticateAsClientAsync(new SslClientAuthenticationOptions
{
TargetHost = _options.ExpectedServerHostname ?? _options.Host,
ClientCertificates = _options.ClientCertificate != null
? new X509CertificateCollection { _options.ClientCertificate }
: null,
EnabledSslProtocols = _options.EnabledProtocols,
CertificateRevocationCheckMode = _options.CheckCertificateRevocation
? X509RevocationMode.Online
: X509RevocationMode.NoCheck
}, ct);
// Connected and authenticated
_stream = sslStream;
_tcpClient = tcpClient;
}
}
```
## mTLS Identity Extraction
With mTLS, the microservice identity can be verified from the client certificate:
```csharp
internal string ExtractIdentityFromCertificate(X509Certificate2 cert)
{
// Common patterns:
// 1. Common Name (CN)
var cn = cert.GetNameInfo(X509NameType.SimpleName, forIssuer: false);
// 2. Subject Alternative Name (SAN) - DNS or URI
var san = cert.Extensions["2.5.29.17"]; // SAN OID
// 3. Custom extension for service identity
// ...
return cn;
}
```
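The listing above reads the SAN extension but returns only the CN. A hedged sketch of a SAN-first policy follows: it asks `GetNameInfo` for the DNS name, which comes from the SAN extension when one is present, and falls back to the subject CN otherwise. `CertificateIdentitySketch` is a hypothetical name, and whether the SAN or the CN should take precedence is a policy choice the document leaves open.

```csharp
using System.Security.Cryptography.X509Certificates;

// Hypothetical identity policy: prefer the SAN DNS name, fall back to the subject CN.
public static class CertificateIdentitySketch
{
    public static string Extract(X509Certificate2 cert)
    {
        // DnsName yields the DNS entry from the SAN extension when the certificate has one.
        var dnsName = cert.GetNameInfo(X509NameType.DnsName, forIssuer: false);
        if (!string.IsNullOrEmpty(dnsName))
        {
            return dnsName;
        }

        // Otherwise use the subject's simple name (typically the CN).
        return cert.GetNameInfo(X509NameType.SimpleName, forIssuer: false);
    }
}
```

URI-type SAN entries (for example SPIFFE IDs) would need the extension parsed explicitly and are not covered by this sketch.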
## Exit Criteria
Before marking this sprint DONE:
1. [x] TlsTransportServer accepts TLS connections
2. [x] TlsTransportClient connects with TLS
3. [x] Server and client certificate configuration works
4. [x] mTLS (mutual TLS) works when enabled
5. [x] Certificate validation works (chain, revocation, hostname)
6. [x] AllowSelfSigned works for dev environments
7. [x] Certificate hot-reload works
8. [x] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint DONE - TlsTransportServer, TlsTransportClient, TlsConnection, TlsTransportOptions, CertificateLoader, CertificateWatcher, ServiceCollectionExtensions, 12 tests pass | Claude |
## Decisions & Risks
- TLS 1.2 and 1.3 enabled by default (1.0/1.1 disabled)
- Certificate revocation checking is optional (can slow down)
- mTLS is optional (RequireClientCertificate = false by default)
- Identity extraction from cert is customizable
- Certificate hot-reload uses file system watcher

@@ -1,221 +0,0 @@
# Sprint 7000-0006-0003 · Real Transports · UDP Plugin
## Topic & Scope
Implement the UDP transport plugin for small, bounded payloads. UDP provides low-latency communication for simple operations but cannot handle streaming or large payloads.
**Goal:** Fast transport for small, idempotent operations.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.Udp/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_0001 (TCP transport for reference patterns)
- **Downstream:** None.
- **Parallel work:** Can run in parallel with TLS and RabbitMQ sprints.
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - UDP transport requirements)
- `docs/router/09-Step.md` (UDP transport section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | UDP-001 | DONE | Create `StellaOps.Router.Transport.Udp` classlib project | Add to solution |
| 2 | UDP-002 | DONE | Add project reference to Router.Common | |
| 3 | UDP-010 | DONE | Implement `UdpTransportServer` : `ITransportServer` | Gateway side |
| 4 | UDP-011 | DONE | Implement UDP socket listener | |
| 5 | UDP-012 | DONE | Implement datagram receive loop | |
| 6 | UDP-013 | DONE | Route received datagrams by source address | |
| 7 | UDP-020 | DONE | Implement `UdpTransportClient` : `ITransportClient` | Microservice side |
| 8 | UDP-021 | DONE | Implement UDP socket for sending | |
| 9 | UDP-022 | DONE | Implement receive for responses | |
| 10 | UDP-030 | DONE | Enforce MaxRequestBytesPerCall limit | Single datagram |
| 11 | UDP-031 | DONE | Reject oversized payloads | |
| 12 | UDP-032 | DONE | Set maximum datagram size from config | |
| 13 | UDP-040 | DONE | Implement request/response correlation | Per-datagram matching |
| 14 | UDP-041 | DONE | Track pending requests with timeout | |
| 15 | UDP-042 | DONE | Handle out-of-order responses | |
| 16 | UDP-050 | DONE | Implement HELLO via UDP | |
| 17 | UDP-051 | DONE | Implement HEARTBEAT via UDP | |
| 18 | UDP-052 | DONE | Implement REQUEST/RESPONSE via UDP | No streaming |
| 19 | UDP-060 | DONE | Disable streaming for UDP transport | |
| 20 | UDP-061 | DONE | Reject endpoints with SupportsStreaming | |
| 21 | UDP-062 | DONE | Log streaming attempts as errors | |
| 22 | UDP-070 | DONE | Create UdpTransportOptions | BindAddress, Port, MaxDatagramSize |
| 23 | UDP-071 | DONE | Create DI registration `AddUdpTransport()` | |
| 24 | UDP-080 | DONE | Write integration tests | |
| 25 | UDP-081 | DONE | Write tests for size limit enforcement | |
## Constraints
From specs.md:
> UDP transport:
> * MUST be used only for small/bounded payloads (no unbounded streaming).
> * MUST respect configured `MaxRequestBytesPerCall`.
- **No streaming:** REQUEST_STREAM_DATA and RESPONSE_STREAM_DATA are not supported
- **Size limit:** Entire request must fit in one datagram
- **Best for:** Ping, health checks, small queries, commands
## Datagram Format
Single UDP datagram = single frame:
```
┌─────────────────────────────────────────────────────────────┐
│ FrameType (1 byte) │ CorrelationId (16 bytes) │ Data (N) │
└─────────────────────────────────────────────────────────────┘
```
Maximum datagram size: the theoretical UDP payload limit over IPv4 is 65,507 bytes, but payloads above roughly 1,400 bytes risk IP fragmentation on a typical 1,500-byte MTU link.
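Because every datagram spends 17 bytes on the frame type and correlation ID, the usable data size is slightly smaller than the configured datagram size. The helper below is a hypothetical sketch of that arithmetic (`DatagramBudgetSketch` is not part of the plugin).

```csharp
using System;

// Hypothetical helper for the single-datagram size budget.
public static class DatagramBudgetSketch
{
    public const int HeaderSize = 1 + 16; // FrameType + CorrelationId

    // Bytes left for frame data once the header is accounted for.
    public static int MaxDataBytes(int maxDatagramSize)
    {
        if (maxDatagramSize <= HeaderSize)
            throw new ArgumentOutOfRangeException(nameof(maxDatagramSize));
        return maxDatagramSize - HeaderSize;
    }

    // True when a data block of this length fits into one datagram.
    public static bool Fits(int dataLength, int maxDatagramSize) =>
        dataLength >= 0 && dataLength <= MaxDataBytes(maxDatagramSize);
}
```

For example, a 1,400-byte datagram budget leaves 1,383 bytes for data, and the 65,507-byte IPv4 maximum leaves 65,490.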
## UdpTransportServer
```csharp
public sealed class UdpTransportServer : ITransportServer
{
    private UdpClient? _listener;
    private readonly ConcurrentDictionary<IPEndPoint, string> _endpointToConnectionId = new();

    public async Task StartAsync(CancellationToken ct)
    {
        // Bind to the configured address as well as the port
        _listener = new UdpClient(new IPEndPoint(_options.BindAddress, _options.Port));
        _ = ReceiveLoopAsync(ct);
    }

    private async Task ReceiveLoopAsync(CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            var result = await _listener!.ReceiveAsync(ct);
            var remoteEndpoint = result.RemoteEndPoint;
            var data = result.Buffer;

            // Parse frame
            var frame = ParseFrame(data);

            // Get or create connection ID for this endpoint
            var connectionId = _endpointToConnectionId.GetOrAdd(
                remoteEndpoint,
                ep => $"udp-{ep}");

            // Handle HELLO specially to register connection
            if (frame.Type == FrameType.Hello)
            {
                OnConnection?.Invoke(connectionId);
            }
            OnFrame?.Invoke(connectionId, frame);
        }
    }

    public async Task SendFrameAsync(string connectionId, Frame frame)
    {
        var endpoint = ResolveEndpoint(connectionId);
        var data = SerializeFrame(frame);

        // A frame must fit in a single datagram; there is no fragmentation at this layer
        if (data.Length > _options.MaxDatagramSize)
            throw new PayloadTooLargeException(data.Length, _options.MaxDatagramSize);

        await _listener!.SendAsync(data, data.Length, endpoint);
    }
}
```
## UdpTransportClient
```csharp
public sealed class UdpTransportClient : ITransportClient
{
    private UdpClient? _client;
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<Frame>> _pending = new();

    public Task ConnectAsync(string host, int port, CancellationToken ct)
    {
        _client = new UdpClient();
        _client.Connect(host, port); // Sets the default remote endpoint; no handshake occurs
        _ = ReceiveLoopAsync(ct);
        return Task.CompletedTask;
    }

    public async Task<Frame> SendRequestAsync(
        ConnectionState connection, Frame request,
        TimeSpan timeout, CancellationToken ct)
    {
        var data = SerializeFrame(request);
        if (data.Length > _options.MaxDatagramSize)
            throw new PayloadTooLargeException(data.Length, _options.MaxDatagramSize);

        var tcs = new TaskCompletionSource<Frame>(TaskCreationOptions.RunContinuationsAsynchronously);
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(timeout);
        using var registration = cts.Token.Register(() => tcs.TrySetCanceled());

        _pending[request.CorrelationId] = tcs;
        try
        {
            await _client!.SendAsync(data, data.Length);
            return await tcs.Task;
        }
        finally
        {
            // Always drop the entry so timed-out or cancelled requests do not leak
            _pending.TryRemove(request.CorrelationId, out _);
        }
    }

    // Streaming not supported
    public Task SendStreamingAsync(...) => throw new NotSupportedException(
        "UDP transport does not support streaming. Use TCP or TLS transport.");
}
```
## UdpTransportOptions
```csharp
public sealed class UdpTransportOptions
{
    public IPAddress BindAddress { get; set; } = IPAddress.Any;
    public int Port { get; set; } = 5101;
    public int MaxDatagramSize { get; set; } = 8192; // Conservative default
    public TimeSpan DefaultTimeout { get; set; } = TimeSpan.FromSeconds(5);
    public bool AllowBroadcast { get; set; } = false;
}
```
## Use Cases
UDP is appropriate for:
- **Health checks:** Small, frequent, non-critical
- **Metrics collection:** Fire-and-forget updates
- **Cache invalidation:** Small notifications
- **DNS-like lookups:** Quick request/response
UDP is NOT appropriate for:
- **File uploads/downloads:** Requires streaming
- **Large requests/responses:** Exceeds datagram limit
- **Critical operations:** No delivery guarantee
- **Ordered sequences:** Out-of-order possible
## Exit Criteria
Before marking this sprint DONE:
1. [x] UdpTransportServer receives datagrams
2. [x] UdpTransportClient sends and receives
3. [x] Size limits enforced
4. [x] Streaming disabled/rejected
5. [x] Request/response correlation works
6. [x] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint DONE - UdpTransportServer, UdpTransportClient, UdpFrameProtocol, UdpTransportOptions, PayloadTooLargeException, ServiceCollectionExtensions, 13 tests pass | Claude |
## Decisions & Risks
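- Default max datagram: 8 KB (well under the 65,507-byte UDP limit, but above a typical 1,500-byte MTU, so larger datagrams may be IP-fragmented)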
- No retry/reliability - UDP is fire-and-forget
- Connection is logical (based on source IP:port)
- Timeout is per-request, no keepalive needed
- CANCEL is sent but may not arrive (best effort)


@@ -1,219 +0,0 @@
# Sprint 7000-0006-0004 · Real Transports · RabbitMQ Plugin
## Topic & Scope
Implement the RabbitMQ transport plugin. Uses message queue infrastructure for reliable asynchronous communication with built-in durability options.
**Goal:** Reliable transport using existing message queue infrastructure.
**Working directory:** `src/__Libraries/StellaOps.Router.Transport.RabbitMq/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_0001 (TCP transport for reference patterns)
- **Downstream:** None.
- **Parallel work:** Can run in parallel with TLS and UDP sprints.
- **Cross-module impact:** None. New library only.
## Documentation Prerequisites
- `docs/router/specs.md` (section 5 - RabbitMQ transport requirements)
- `docs/router/09-Step.md` (RabbitMQ transport section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | RMQ-001 | DONE | Create `StellaOps.Router.Transport.RabbitMq` classlib project | Add to solution |
| 2 | RMQ-002 | DONE | Add project reference to Router.Common | |
| 3 | RMQ-003 | BLOCKED | Add RabbitMQ.Client NuGet package | Needs package in local-nugets |
| 4 | RMQ-010 | DONE | Implement `RabbitMqTransportServer` : `ITransportServer` | Gateway side |
| 5 | RMQ-011 | DONE | Implement connection to RabbitMQ broker | |
| 6 | RMQ-012 | DONE | Create request queue per gateway node | |
| 7 | RMQ-013 | DONE | Create response exchange for routing | |
| 8 | RMQ-014 | DONE | Implement consumer for incoming frames | |
| 9 | RMQ-020 | DONE | Implement `RabbitMqTransportClient` : `ITransportClient` | Microservice side |
| 10 | RMQ-021 | DONE | Implement connection to RabbitMQ broker | |
| 11 | RMQ-022 | DONE | Create response queue per microservice instance | |
| 12 | RMQ-023 | DONE | Bind response queue to exchange | |
| 13 | RMQ-030 | DONE | Implement queue/exchange naming convention | |
| 14 | RMQ-031 | DONE | Format: `stella.router.{nodeId}.requests` | Gateway request queue |
| 15 | RMQ-032 | DONE | Format: `stella.router.responses` | Response exchange |
| 16 | RMQ-033 | DONE | Routing key: `{connectionId}` | For response routing |
| 17 | RMQ-040 | DONE | Use CorrelationId for request/response matching | BasicProperties |
| 18 | RMQ-041 | DONE | Set ReplyTo for response routing | |
| 19 | RMQ-042 | DONE | Implement pending request tracking | |
| 20 | RMQ-050 | DONE | Implement HELLO via RabbitMQ | |
| 21 | RMQ-051 | DONE | Implement HEARTBEAT via RabbitMQ | |
| 22 | RMQ-052 | DONE | Implement REQUEST/RESPONSE via RabbitMQ | |
| 23 | RMQ-053 | DONE | Implement CANCEL via RabbitMQ | |
| 24 | RMQ-060 | DONE | Implement streaming via RabbitMQ (optional) | Throws NotSupportedException |
| 25 | RMQ-061 | DONE | Consider at-most-once delivery semantics | Using autoAck=true |
| 26 | RMQ-070 | DONE | Create RabbitMqTransportOptions | Connection, queues, durability |
| 27 | RMQ-071 | DONE | Create DI registration `AddRabbitMqTransport()` | |
| 28 | RMQ-080 | BLOCKED | Write integration tests with local RabbitMQ | Needs package in local-nugets |
| 29 | RMQ-081 | BLOCKED | Write tests for connection recovery | Needs package in local-nugets |
## Queue/Exchange Topology
```
                        ┌─────────────────────────┐
Microservice ──────────►│ stella.router.requests  │
(HELLO, HEARTBEAT,      │ (Direct Exchange)       │
 RESPONSE)              └───────────┬─────────────┘
                                    │ routing_key = nodeId
                        ┌─────────────────────────┐
                        │ stella.gw.{nodeId}.in   │◄─── Gateway consumes
                        │ (Queue)                 │
                        └─────────────────────────┘
Gateway ───────────────►┌─────────────────────────┐
(REQUEST, CANCEL)       │ stella.router.responses │
                        │ (Topic Exchange)        │
                        └───────────┬─────────────┘
                                    │ routing_key = instanceId
                        ┌─────────────────────────┐
                        │ stella.svc.{instanceId} │◄─── Microservice consumes
                        │ (Queue)                 │
                        └─────────────────────────┘
```
## Message Properties
```csharp
var properties = channel.CreateBasicProperties();
properties.CorrelationId = correlationId.ToString();
properties.ReplyTo = replyQueueName;
properties.Type = frameType.ToString();
properties.Timestamp = new AmqpTimestamp(DateTimeOffset.UtcNow.ToUnixTimeSeconds());
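// Expiration must be an integer number of milliseconds, formatted culture-invariantly
properties.Expiration = ((long)timeout.TotalMilliseconds).ToString(CultureInfo.InvariantCulture);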
properties.DeliveryMode = 1; // Non-persistent (or 2 for persistent)
```
## RabbitMqTransportOptions
```csharp
public sealed class RabbitMqTransportOptions
{
    // Connection
    public string HostName { get; set; } = "localhost";
    public int Port { get; set; } = 5672;
    public string VirtualHost { get; set; } = "/";
    public string UserName { get; set; } = "guest";
    public string Password { get; set; } = "guest";

    // TLS
    public bool UseSsl { get; set; } = false;
    public string? SslCertPath { get; set; }

    // Queues
    public bool DurableQueues { get; set; } = false;   // false suits dev; set true for prod
    public bool AutoDeleteQueues { get; set; } = true; // Clean up on disconnect
    public int PrefetchCount { get; set; } = 10;       // Max unacknowledged messages per consumer

    // Naming
    public string ExchangePrefix { get; set; } = "stella.router";
    public string QueuePrefix { get; set; } = "stella";
}
```
## RabbitMqTransportServer
```csharp
// Written against the synchronous RabbitMQ.Client 6.x-style API (IModel, CreateModel).
public sealed class RabbitMqTransportServer : ITransportServer
{
    private IConnection? _connection;
    private IModel? _channel;
    // Not readonly: the name is assigned in StartAsync, not in the constructor
    private string _requestQueueName = string.Empty;

    public Task StartAsync(CancellationToken ct)
    {
        var factory = new ConnectionFactory
        {
            HostName = _options.HostName,
            Port = _options.Port,
            VirtualHost = _options.VirtualHost,
            UserName = _options.UserName,
            Password = _options.Password
        };
        _connection = factory.CreateConnection();
        _channel = _connection.CreateModel();

        // Declare exchanges
        _channel.ExchangeDeclare(_options.RequestExchange, ExchangeType.Direct, durable: true);
        _channel.ExchangeDeclare(_options.ResponseExchange, ExchangeType.Topic, durable: true);

        // Declare and bind request queue
        _requestQueueName = $"{_options.QueuePrefix}.gw.{_nodeId}.in";
        _channel.QueueDeclare(_requestQueueName,
            durable: _options.DurableQueues,
            exclusive: false,
            autoDelete: _options.AutoDeleteQueues);
        _channel.QueueBind(_requestQueueName, _options.RequestExchange, routingKey: _nodeId);

        // Start consuming (autoAck: true gives at-most-once delivery)
        var consumer = new EventingBasicConsumer(_channel);
        consumer.Received += OnMessageReceived;
        _channel.BasicConsume(_requestQueueName, autoAck: true, consumer);
        return Task.CompletedTask;
    }

    private void OnMessageReceived(object? sender, BasicDeliverEventArgs e)
    {
        var frame = ParseFrame(e.Body.ToArray(), e.BasicProperties);
        var connectionId = ExtractConnectionId(e.BasicProperties);
        if (frame.Type == FrameType.Hello)
        {
            OnConnection?.Invoke(connectionId);
        }
        OnFrame?.Invoke(connectionId, frame);
    }
}
```
## At-Most-Once Semantics
From specs.md:
> * Guarantee at-most-once semantics where practical.
This means:
- Auto-ack messages (no redelivery on failure)
- Non-durable queues/messages by default
- Idempotent handlers are caller's responsibility
For at-least-once (if needed later):
- Manual ack after processing
- Durable queues and persistent messages
- Deduplication in handler
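If at-least-once delivery is adopted later, the "deduplication in handler" item could look roughly like the sketch below. It is an in-memory illustration only: `RecentlySeen` and its capacity parameter are hypothetical names, and a real deployment would likely also need time-based expiry and possibly shared storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bounded de-duplication set keyed by correlation ID.
public sealed class RecentlySeen
{
    private readonly int _capacity;
    private readonly HashSet<Guid> _seen = new();
    private readonly Queue<Guid> _order = new();
    private readonly object _gate = new();

    public RecentlySeen(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    // Returns true the first time an ID is offered, false for a redelivery.
    public bool TryMarkFirstSeen(Guid correlationId)
    {
        lock (_gate)
        {
            if (!_seen.Add(correlationId))
                return false;
            _order.Enqueue(correlationId);
            if (_order.Count > _capacity)
                _seen.Remove(_order.Dequeue()); // Evict the oldest entry
            return true;
        }
    }
}
```

A handler would call `TryMarkFirstSeen` before doing any work and skip the message when it returns `false`. The bound keeps memory flat, at the cost of forgetting IDs older than the last `capacity` messages.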
## Exit Criteria
Before marking this sprint DONE:
1. [ ] RabbitMqTransportServer connects and consumes
2. [ ] RabbitMqTransportClient publishes and consumes
3. [ ] Queue/exchange topology correct
4. [ ] CorrelationId matching works
5. [ ] HELLO/HEARTBEAT/REQUEST/RESPONSE flow works
6. [ ] Connection recovery works
7. [ ] Integration tests pass with local RabbitMQ
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Code DONE but BLOCKED - RabbitMQ.Client NuGet package not available in local-nugets. Code written: RabbitMqTransportServer, RabbitMqTransportClient, RabbitMqFrameProtocol, RabbitMqTransportOptions, ServiceCollectionExtensions | Claude |
## Decisions & Risks
- Auto-delete queues by default (clean up on disconnect)
- Non-persistent messages by default (speed over durability)
- Prefetch count limits concurrent processing
- Connection recovery uses RabbitMQ.Client built-in recovery
- Streaming is optional (throws NotSupportedException for simplicity)
- **BLOCKED:** RabbitMQ.Client 7.0.0 needs to be added to local-nugets folder for build to succeed


@@ -1,220 +0,0 @@
# Sprint 7000-0007-0001 · Configuration · Router Config Library
## Topic & Scope
Implement the Router.Config library with YAML configuration support and hot-reload. Provides centralized configuration for services, endpoints, static instances, and payload limits.
**Goal:** Configuration-driven router behavior with runtime updates.
**Working directory:** `src/__Libraries/StellaOps.Router.Config/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0006_* (all transports - config applies to transport selection)
- **Downstream:** SPRINT_7000_0007_0002 (microservice YAML)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Gateway consumes this library.
## Documentation Prerequisites
- `docs/router/specs.md` (section 11 - Configuration and YAML requirements)
- `docs/router/10-Step.md` (configuration section)
- `docs/router/implplan.md` (phase 10 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | CFG-001 | DONE | Implement `RouterConfig` root object | |
| 2 | CFG-002 | DONE | Implement `ServiceConfig` for service definitions | |
| 3 | CFG-003 | DONE | Implement `EndpointConfig` for endpoint definitions | |
| 4 | CFG-004 | DONE | Implement `StaticInstanceConfig` for known instances | |
| 5 | CFG-010 | DONE | Implement YAML configuration binding | NetEscapades.Configuration.Yaml |
| 6 | CFG-011 | DONE | Implement JSON configuration binding | Microsoft.Extensions.Configuration.Json |
| 7 | CFG-012 | DONE | Implement environment variable overrides | |
| 8 | CFG-013 | DONE | Support configuration layering (base + overrides) | |
| 9 | CFG-020 | DONE | Implement hot-reload via IOptionsMonitor | Using FileSystemWatcher |
| 10 | CFG-021 | DONE | Implement file system watcher for YAML | With debounce |
| 11 | CFG-022 | DONE | Trigger routing state refresh on config change | ConfigurationChanged event |
| 12 | CFG-023 | DONE | Handle errors in reloaded config (keep previous) | |
| 13 | CFG-030 | DONE | Implement `IRouterConfigProvider` interface | |
| 14 | CFG-031 | DONE | Implement validation on load | Required fields, format |
| 15 | CFG-032 | DONE | Log configuration changes | |
| 16 | CFG-040 | DONE | Create DI registration `AddRouterConfig()` | |
| 17 | CFG-041 | DONE | Integrate with Gateway startup | Via ServiceCollectionExtensions |
| 18 | CFG-050 | DONE | Write sample router.yaml | etc/router.yaml.sample |
| 19 | CFG-051 | DONE | Write unit tests for binding | 15 tests passing |
| 20 | CFG-052 | DONE | Write tests for hot-reload | |
## RouterConfig Structure
```csharp
public sealed class RouterConfig
{
    public IList<ServiceConfig> Services { get; init; } = new List<ServiceConfig>();
    public IList<StaticInstanceConfig> StaticInstances { get; init; } = new List<StaticInstanceConfig>();
    public PayloadLimits PayloadLimits { get; init; } = new();
    public RoutingOptions Routing { get; init; } = new();
}

public sealed class ServiceConfig
{
    public string Name { get; init; } = string.Empty;
    public string DefaultVersion { get; init; } = "1.0.0";
    public TransportType DefaultTransport { get; init; } = TransportType.Tcp;
    public IList<EndpointConfig> Endpoints { get; init; } = new List<EndpointConfig>();
}

public sealed class EndpointConfig
{
    public string Method { get; init; } = "GET";
    public string Path { get; init; } = string.Empty;
    public TimeSpan? DefaultTimeout { get; init; }
    public IList<ClaimRequirementConfig> RequiringClaims { get; init; } = new List<ClaimRequirementConfig>();
    public bool? SupportsStreaming { get; init; }
}

public sealed class StaticInstanceConfig
{
    public string ServiceName { get; init; } = string.Empty;
    public string Version { get; init; } = string.Empty;
    public string Region { get; init; } = string.Empty;
    public string Host { get; init; } = string.Empty;
    public int Port { get; init; }
    public TransportType Transport { get; init; }
}
```
## Sample router.yaml
```yaml
# Router configuration
payloadLimits:
  maxRequestBytesPerCall: 10485760          # 10 MB
  maxRequestBytesPerConnection: 104857600   # 100 MB
  maxAggregateInflightBytes: 1073741824     # 1 GB
routing:
  neighborRegions:
    - eu2
    - us1
  tieBreaker: roundRobin
services:
  - name: billing
    defaultVersion: "1.0.0"
    defaultTransport: tcp
    endpoints:
      - method: POST
        path: /invoices
        defaultTimeout: 30s
        requiringClaims:
          - type: role
            value: billing-admin
      - method: GET
        path: /invoices/{id}
        defaultTimeout: 5s
  - name: inventory
    defaultVersion: "2.1.0"
    defaultTransport: tls
    endpoints:
      - method: GET
        path: /items
        supportsStreaming: true
# Optional: static instances (usually discovered via HELLO)
staticInstances:
  - serviceName: billing
    version: "1.0.0"
    region: eu1
    host: billing-eu1-01.internal
    port: 5100
    transport: tcp
```
## Hot-Reload Implementation
```csharp
public sealed class RouterConfigProvider : IRouterConfigProvider, IDisposable
{
    // volatile: written from the watcher thread, read from request threads
    private volatile RouterConfig _current;
    private readonly FileSystemWatcher? _watcher;
    private readonly ILogger<RouterConfigProvider> _logger;

    public RouterConfigProvider(IOptions<RouterConfigOptions> options, ILogger<RouterConfigProvider> logger)
    {
        _logger = logger;
        // Resolve to a full path so a bare file name still yields a directory to watch
        var configPath = Path.GetFullPath(options.Value.ConfigPath);
        _current = LoadConfig(configPath);
        if (options.Value.EnableHotReload)
        {
            _watcher = new FileSystemWatcher(Path.GetDirectoryName(configPath)!)
            {
                Filter = Path.GetFileName(configPath),
                NotifyFilter = NotifyFilters.LastWrite
            };
            _watcher.Changed += OnConfigFileChanged;
            _watcher.EnableRaisingEvents = true;
        }
    }

    // A single save can raise several Changed events; debouncing is not shown here.
    private void OnConfigFileChanged(object sender, FileSystemEventArgs e)
    {
        try
        {
            var newConfig = LoadConfig(e.FullPath);
            ValidateConfig(newConfig);
            var previous = _current;
            _current = newConfig;
            _logger.LogInformation("Router configuration reloaded successfully");
            ConfigurationChanged?.Invoke(this, new ConfigChangedEventArgs(previous, newConfig));
        }
        catch (Exception ex)
        {
            // Fail-safe: an invalid or unreadable file leaves the previous config active
            _logger.LogError(ex, "Failed to reload configuration, keeping previous");
        }
    }

    public RouterConfig Current => _current;
    public event EventHandler<ConfigChangedEventArgs>? ConfigurationChanged;

    public void Dispose() => _watcher?.Dispose();
}
```
## Configuration Precedence
1. **Code defaults** (in Common library)
2. **YAML configuration** (router.yaml)
3. **JSON configuration** (appsettings.json)
4. **Environment variables** (STELLAOPS_ROUTER_*)
5. **Microservice HELLO** (dynamic registration)
6. **Authority overrides** (for RequiringClaims)
Later sources override earlier ones.
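The "later sources override earlier ones" rule can be illustrated with plain dictionaries. This is a sketch of the layering idea only, not how Router.Config binds its sources; `LayeredSettings` and the keys in the usage note are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of "later sources override earlier ones".
public static class LayeredSettings
{
    public static Dictionary<string, string> Merge(
        params IReadOnlyDictionary<string, string>[] layers)
    {
        var effective = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var layer in layers)              // lowest precedence first
        {
            foreach (var (key, value) in layer)
                effective[key] = value;            // a later layer wins
        }
        return effective;
    }
}
```

Calling `LayeredSettings.Merge(codeDefaults, yaml, json, environment)` with the layers in the order listed above leaves each key holding the value from the last layer that defined it, while keys that only an early layer defines pass through unchanged.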
## Exit Criteria
Before marking this sprint DONE:
1. [x] RouterConfig binds from YAML correctly
2. [x] JSON and environment variables also work
3. [x] Hot-reload updates config without restart
4. [x] Validation rejects invalid config
5. [x] Sample router.yaml documents all options
6. [x] DI integration works with Gateway
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint DONE - Implemented RouterConfig, ServiceConfig, EndpointConfig, StaticInstanceConfig, RoutingOptions, RouterConfigOptions, IRouterConfigProvider, RouterConfigProvider with hot-reload, ServiceCollectionExtensions. Created etc/router.yaml.sample. 15 tests passing. | Claude |
## Decisions & Risks
- YamlDotNet for YAML parsing (mature, well-supported)
- File watcher has debounce to avoid multiple reloads
- Invalid hot-reload keeps previous config (fail-safe)
- Static instances are optional (most discover via HELLO)
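The debounce mentioned above could be built on `System.Threading.Timer`, as in the sketch below. The `Debouncer` type is a hypothetical helper written for this illustration, and the quiet period is an assumed parameter; the actual Router.Config implementation may differ.

```csharp
using System;
using System.Threading;

// Hypothetical debouncer: runs the action once after a quiet period.
public sealed class Debouncer : IDisposable
{
    private readonly Timer _timer;
    private readonly TimeSpan _quietPeriod;

    public Debouncer(TimeSpan quietPeriod, Action action)
    {
        _quietPeriod = quietPeriod;
        // Created disarmed; Signal() starts the countdown
        _timer = new Timer(_ => action(), null, Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
    }

    // Each call restarts the countdown, so a burst of events yields one callback.
    public void Signal() => _timer.Change(_quietPeriod, Timeout.InfiniteTimeSpan);

    public void Dispose() => _timer.Dispose();
}
```

A file-watcher handler would call `Signal()` on every `Changed` event and perform the actual reload inside the action, so an editor that writes a file in several steps triggers a single reload.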


@@ -1,213 +0,0 @@
# Sprint 7000-0007-0002 · Configuration · Microservice YAML Config
## Topic & Scope
Implement YAML configuration support for microservices. Allows endpoint-level overrides for timeouts, RequiringClaims, and streaming flags without code changes.
**Goal:** Microservices can customize endpoint behavior via YAML without rebuilding.
**Working directory:** `src/__Libraries/StellaOps.Microservice/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0007_0001 (Router.Config patterns)
- **Downstream:** SPRINT_7000_0008_0001 (Authority integration)
- **Parallel work:** None. Sequential.
- **Cross-module impact:** Microservice SDK only.
## Documentation Prerequisites
- `docs/router/specs.md` (sections 7.3, 11 - Microservice config requirements)
- `docs/router/10-Step.md` (microservice YAML section)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MCFG-001 | DONE | Create `MicroserviceEndpointConfig` class | ClaimRequirementConfig |
| 2 | MCFG-002 | DONE | Create `MicroserviceYamlConfig` root object | EndpointOverrideConfig |
| 3 | MCFG-010 | DONE | Implement YAML loading from ConfigFilePath | MicroserviceYamlLoader |
| 4 | MCFG-011 | DONE | Implement endpoint matching by (Method, Path) | Case-insensitive matching |
| 5 | MCFG-012 | DONE | Implement override merge with code defaults | EndpointOverrideMerger |
| 6 | MCFG-020 | DONE | Override DefaultTimeout per endpoint | Supports "30s", "5m", "1h" formats |
| 7 | MCFG-021 | DONE | Override RequiringClaims per endpoint | Full replacement |
| 8 | MCFG-022 | DONE | Override SupportsStreaming per endpoint | |
| 9 | MCFG-030 | DONE | Implement precedence: code → YAML | Via EndpointOverrideMerger |
| 10 | MCFG-031 | DONE | Document that YAML cannot create endpoints (only modify) | In sample file |
| 11 | MCFG-032 | DONE | Warn on YAML entries that don't match code endpoints | WarnUnmatchedOverrides |
| 12 | MCFG-040 | DONE | Integrate with endpoint discovery | EndpointDiscoveryService |
| 13 | MCFG-041 | DONE | Apply overrides before HELLO construction | Via IEndpointDiscoveryService |
| 14 | MCFG-050 | DONE | Create sample microservice.yaml | etc/microservice.yaml.sample |
| 15 | MCFG-051 | DONE | Write unit tests for merge logic | EndpointOverrideMergerTests |
| 16 | MCFG-052 | DONE | Write tests for precedence | 85 tests pass |
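Task MCFG-020 notes that timeouts accept values such as "30s", "5m", and "1h". A parser for that shape might look like the sketch below. `DurationText`, the whole-number restriction, and the ten-year cap are assumptions made for the example, not the behavior of `MicroserviceYamlLoader`.

```csharp
using System;
using System.Globalization;

// Hypothetical parser for "30s" / "5m" / "1h" style durations.
public static class DurationText
{
    private const long MaxSeconds = 315_360_000; // Arbitrary sanity cap (about ten years)

    public static bool TryParse(string? text, out TimeSpan value)
    {
        value = default;
        if (string.IsNullOrWhiteSpace(text)) return false;
        text = text.Trim();
        if (text.Length < 2) return false;

        // The last character is the unit; everything before it must be digits
        long multiplier;
        switch (char.ToLowerInvariant(text[^1]))
        {
            case 's': multiplier = 1; break;
            case 'm': multiplier = 60; break;
            case 'h': multiplier = 3600; break;
            default: return false;
        }

        if (!long.TryParse(text[..^1], NumberStyles.None, CultureInfo.InvariantCulture, out var amount))
            return false;
        if (amount > MaxSeconds / multiplier) return false; // Also prevents overflow below

        value = TimeSpan.FromTicks(amount * multiplier * TimeSpan.TicksPerSecond);
        return true;
    }
}
```

Returning `false` instead of throwing lets the caller decide whether a malformed timeout is a hard error or a warning that falls back to the code default.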
## MicroserviceYamlConfig Structure
```csharp
public sealed class MicroserviceYamlConfig
{
    public IList<EndpointOverrideConfig> Endpoints { get; init; } = new List<EndpointOverrideConfig>();
}

public sealed class EndpointOverrideConfig
{
    public string Method { get; init; } = string.Empty;
    public string Path { get; init; } = string.Empty;
    public TimeSpan? DefaultTimeout { get; init; }
    public bool? SupportsStreaming { get; init; }
    public IList<ClaimRequirementConfig>? RequiringClaims { get; init; }
}
```
## Sample microservice.yaml
```yaml
# Microservice endpoint overrides
# Note: Only modifies endpoints declared in code; cannot create new endpoints
endpoints:
  - method: POST
    path: /invoices
    defaultTimeout: 60s  # Override code default of 30s
    requiringClaims:
      - type: role
        value: invoice-creator
      - type: department
        value: finance
  - method: GET
    path: /invoices/{id}
    defaultTimeout: 10s
  - method: POST
    path: /reports/generate
    supportsStreaming: true  # Enable streaming for large reports
    defaultTimeout: 300s     # 5 minutes for long-running reports
```
## Merge Logic
```csharp
internal sealed class EndpointOverrideMerger
{
    public EndpointDescriptor Merge(
        EndpointDescriptor codeDefault,
        EndpointOverrideConfig? yamlOverride)
    {
        if (yamlOverride == null)
            return codeDefault;

        return codeDefault with
        {
            DefaultTimeout = yamlOverride.DefaultTimeout ?? codeDefault.DefaultTimeout,
            SupportsStreaming = yamlOverride.SupportsStreaming ?? codeDefault.SupportsStreaming,
            RequiringClaims = yamlOverride.RequiringClaims?.Select(c =>
                new ClaimRequirement { Type = c.Type, Value = c.Value }).ToList()
                ?? codeDefault.RequiringClaims
        };
    }
}
```
## Precedence Rules
From specs.md section 7.3:
> Precedence rules MUST be clearly defined and honored:
> * Service identity & router pool: from `StellaMicroserviceOptions` (not YAML).
> * Endpoint set: from code (attributes/source gen); YAML MAY override properties but ideally not create endpoints not present in code.
> * `RequiringClaims` and timeouts: YAML overrides defaults from code, unless overridden by central Authority.
```
┌─────────────────┐
│ Code defaults   │ [StellaEndpoint] attribute values
└────────┬────────┘
         │ YAML overrides (if present)
┌─────────────────┐
│ YAML config     │ Endpoint-specific overrides
└────────┬────────┘
         │ Authority overrides (later sprint)
┌─────────────────┐
│ Effective       │ Final values sent in HELLO
└─────────────────┘
```
## Integration with Discovery
```csharp
internal sealed class EndpointDiscoveryService
{
    private readonly IMicroserviceYamlLoader _yamlLoader;
    private readonly EndpointOverrideMerger _merger;

    public IReadOnlyList<EndpointDescriptor> DiscoverEndpoints()
    {
        // 1. Discover from code
        var codeEndpoints = DiscoverFromReflection();
        // 2. Load YAML overrides
        var yamlConfig = _yamlLoader.Load();
        // 3. Merge (matching is case-insensitive, per MCFG-011)
        return codeEndpoints.Select(ep =>
        {
            var yamlOverride = yamlConfig?.Endpoints.FirstOrDefault(y =>
                string.Equals(y.Method, ep.Method, StringComparison.OrdinalIgnoreCase) &&
                string.Equals(y.Path, ep.Path, StringComparison.OrdinalIgnoreCase));
            // Merge returns the code default unchanged when there is no override
            return _merger.Merge(ep, yamlOverride);
        }).ToList();
    }
}
```
## Warning on Unmatched YAML
```csharp
private void WarnUnmatchedOverrides(
    IEnumerable<EndpointDescriptor> codeEndpoints,
    MicroserviceYamlConfig? yamlConfig)
{
    if (yamlConfig == null) return;
    var endpoints = codeEndpoints.ToList(); // Avoid enumerating the source repeatedly
    foreach (var yamlEntry in yamlConfig.Endpoints)
    {
        // Use the same case-insensitive comparison as the merge step
        var matched = endpoints.Any(e =>
            string.Equals(e.Method, yamlEntry.Method, StringComparison.OrdinalIgnoreCase) &&
            string.Equals(e.Path, yamlEntry.Path, StringComparison.OrdinalIgnoreCase));
        if (!matched)
        {
            _logger.LogWarning(
                "YAML override for {Method} {Path} does not match any code endpoint",
                yamlEntry.Method, yamlEntry.Path);
        }
    }
}
```
## Exit Criteria
Before marking this sprint DONE:
1. [x] YAML loading works from ConfigFilePath
2. [x] Merge applies YAML overrides to code defaults
3. [x] Precedence is code → YAML
4. [x] Unmatched YAML entries logged as warnings
5. [x] Sample microservice.yaml documented
6. [x] Unit tests for merge logic
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Sprint completed. 85 tests pass. | Claude |
## Decisions & Risks
- YAML cannot create endpoints (only modify) per spec
- Missing YAML file is not an error (optional config)
- Hot-reload of microservice YAML is not supported (restart required)
- RequiringClaims in YAML fully replaces code defaults (not merged)


@@ -1,211 +0,0 @@
# Sprint 7000-0008-0001 · Integration · Authority Claims Override
## Topic & Scope
Implement Authority integration for RequiringClaims overrides. The central Authority service can push endpoint authorization requirements that override microservice defaults.
**Goal:** Centralized authorization policy that takes precedence over microservice-defined claims.
**Working directories:**
- `src/Gateway/StellaOps.Gateway.WebService/` (apply overrides)
- `src/Authority/` (if Authority changes needed)
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0007_0002 (microservice YAML - establishes precedence)
- **Downstream:** SPRINT_7000_0008_0002 (source generator)
- **Parallel work:** Can run in parallel with source generator sprint.
- **Cross-module impact:** May require Authority module changes.
## Documentation Prerequisites
- `docs/router/specs.md` (section 9 - Authorization / requiringClaims / Authority requirements)
- `docs/modules/authority/architecture.md` (Authority module design)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Working Directory |
|---|---------|--------|-------------|-------------------|
| 1 | AUTH-001 | DONE | Define `IAuthorityClaimsProvider` interface | Common/Gateway |
| 2 | AUTH-002 | DONE | Define `ClaimsOverride` model | Common |
| 3 | AUTH-010 | DONE | Implement Gateway startup claims fetch | Gateway |
| 4 | AUTH-011 | DONE | Request overrides from Authority on startup | |
| 5 | AUTH-012 | DONE | Wait for Authority before handling traffic (configurable) | |
| 6 | AUTH-020 | DONE | Implement runtime claims update | Gateway |
| 7 | AUTH-021 | DONE | Periodically refresh from Authority | |
| 8 | AUTH-022 | DONE | Or subscribe to Authority push notifications | |
| 9 | AUTH-030 | DONE | Merge Authority overrides with microservice defaults | Gateway |
| 10 | AUTH-031 | DONE | Authority takes precedence over YAML and code | |
| 11 | AUTH-032 | DONE | Store effective RequiringClaims per endpoint | |
| 12 | AUTH-040 | DONE | Implement AuthorizationMiddleware with claims enforcement | Gateway |
| 13 | AUTH-041 | DONE | Check user principal has all required claims | |
| 14 | AUTH-042 | DONE | Return 403 Forbidden on claim failure | |
| 15 | AUTH-050 | DONE | Create configuration for Authority connection | Gateway |
| 16 | AUTH-051 | DONE | Handle Authority unavailable (use cached/defaults) | |
| 17 | AUTH-060 | DONE | Write integration tests for claims enforcement | |
| 18 | AUTH-061 | DONE | Write tests for Authority override precedence | |
## IAuthorityClaimsProvider
```csharp
public interface IAuthorityClaimsProvider
{
    Task<IReadOnlyDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>>> GetOverridesAsync(
        CancellationToken cancellationToken);
    event EventHandler<ClaimsOverrideChangedEventArgs>? OverridesChanged;
}

public readonly record struct EndpointKey(string ServiceName, string Method, string Path);

public sealed class ClaimsOverrideChangedEventArgs : EventArgs
{
    public IReadOnlyDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> Overrides { get; init; }
        = new Dictionary<EndpointKey, IReadOnlyList<ClaimRequirement>>();
}
```
## Final Precedence Chain
```
┌─────────────────────┐
│ Code defaults       │ [StellaEndpoint] RequiringClaims
└──────────┬──────────┘
           │ YAML overrides
┌─────────────────────┐
│ Microservice YAML   │ Endpoint-specific claims
└──────────┬──────────┘
           │ Authority overrides (highest priority)
┌─────────────────────┐
│ Authority Policy    │ Central claims requirements
└──────────┬──────────┘
┌─────────────────────┐
│ Effective Claims    │ What Gateway enforces
└─────────────────────┘
```
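Because each layer fully replaces the one below it rather than merging with it, the chain above reduces to "take the highest layer that is present". The sketch below shows that rule in isolation; `ClaimRequirementSketch` and `ClaimsPrecedence` are hypothetical stand-ins for the real types, and treating `null` as "no override" (with an empty list still counting as an override) is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the real ClaimRequirement type.
public sealed record ClaimRequirementSketch(string Type, string? Value);

public static class ClaimsPrecedence
{
    // null means "this layer has no opinion"; an empty list is a real override
    // that removes all requirements.
    public static IReadOnlyList<ClaimRequirementSketch> Resolve(
        IReadOnlyList<ClaimRequirementSketch> codeDefaults,
        IReadOnlyList<ClaimRequirementSketch>? yamlOverride,
        IReadOnlyList<ClaimRequirementSketch>? authorityOverride)
        => authorityOverride ?? yamlOverride ?? codeDefaults;
}
```

One consequence worth noting: an Authority override that lists a single claim silently drops any additional claims declared in code or YAML, so central policies need to be complete for each endpoint they cover.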
## AuthorizationMiddleware (Updated)
```csharp
public class AuthorizationMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<AuthorizationMiddleware> _logger;

    public AuthorizationMiddleware(RequestDelegate next, ILogger<AuthorizationMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context, IEffectiveClaimsStore claimsStore)
    {
        var endpoint = (EndpointDescriptor)context.Items["ResolvedEndpoint"]!;
        // Get effective claims (already merged with Authority)
        var effectiveClaims = claimsStore.GetEffectiveClaims(
            endpoint.ServiceName, endpoint.Method, endpoint.Path);
        var userClaims = context.User.Claims;

        // Check each required claim; a null Value means any value of that type is accepted
        foreach (var required in effectiveClaims)
        {
            bool hasClaim = required.Value == null
                ? userClaims.Any(c => c.Type == required.Type)
                : userClaims.Any(c => c.Type == required.Type && c.Value == required.Value);
            if (!hasClaim)
            {
                _logger.LogWarning(
                    "Authorization failed: user lacks claim {ClaimType}={ClaimValue}",
                    required.Type, required.Value ?? "(any)");
                context.Response.StatusCode = 403;
                await context.Response.WriteAsJsonAsync(new
                {
                    error = "Forbidden",
                    requiredClaim = new { type = required.Type, value = required.Value }
                });
                return;
            }
        }

        await _next(context);
    }
}
```
## IEffectiveClaimsStore
```csharp
public interface IEffectiveClaimsStore
{
    IReadOnlyList<ClaimRequirement> GetEffectiveClaims(
        string serviceName, string method, string path);
    void UpdateFromMicroservice(string serviceName, IReadOnlyList<EndpointDescriptor> endpoints);
    void UpdateFromAuthority(IReadOnlyDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> overrides);
}

internal sealed class EffectiveClaimsStore : IEffectiveClaimsStore
{
    private readonly ConcurrentDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> _microserviceClaims = new();
    private readonly ConcurrentDictionary<EndpointKey, IReadOnlyList<ClaimRequirement>> _authorityClaims = new();

    public IReadOnlyList<ClaimRequirement> GetEffectiveClaims(
        string serviceName, string method, string path)
    {
        var key = new EndpointKey(serviceName, method, path);
        // Authority takes precedence
        if (_authorityClaims.TryGetValue(key, out var authorityClaims))
            return authorityClaims;
        // Fall back to microservice defaults
        if (_microserviceClaims.TryGetValue(key, out var msClaims))
            return msClaims;
        return Array.Empty<ClaimRequirement>();
    }

    // UpdateFromMicroservice and UpdateFromAuthority are omitted from this excerpt.
}
```
## Authority Connection Options
```csharp
public sealed class AuthorityConnectionOptions
{
    public string AuthorityUrl { get; set; } = string.Empty;
    public bool WaitForAuthorityOnStartup { get; set; } = true;
    public TimeSpan StartupTimeout { get; set; } = TimeSpan.FromSeconds(30);
    public TimeSpan RefreshInterval { get; set; } = TimeSpan.FromMinutes(5);
    public bool UseAuthorityPushNotifications { get; set; } = false;
}
```
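One way `WaitForAuthorityOnStartup` and `StartupTimeout` could interact is sketched below. This is an illustration under stated assumptions, not the gateway's startup code: `AuthorityStartupSketch` and the `fetchOverrides` delegate are hypothetical names, and the behavior (give up after the timeout and continue with microservice defaults) is inferred from the fail-safe note in this sprint.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; the gateway's actual startup code is not shown in this document.
public static class AuthorityStartupSketch
{
    // Returns true when overrides were fetched within the timeout,
    // false when the gateway should start with microservice defaults only.
    public static async Task<bool> TryFetchOverridesAsync(
        Func<CancellationToken, Task> fetchOverrides,
        TimeSpan startupTimeout,
        CancellationToken stoppingToken = default)
    {
        // Link the timeout to host shutdown so either one cancels the fetch.
        using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(stoppingToken);
        timeoutCts.CancelAfter(startupTimeout);
        try
        {
            await fetchOverrides(timeoutCts.Token);
            return true;
        }
        catch (Exception) when (!stoppingToken.IsCancellationRequested)
        {
            // Timeout or fetch failure: proceed without Authority overrides.
            return false;
        }
    }
}
```

A caller would pass `options.StartupTimeout` only when `options.WaitForAuthorityOnStartup` is true; a host shutdown during the wait still surfaces as an `OperationCanceledException`.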
## Exit Criteria
Before marking this sprint DONE:
1. [x] IAuthorityClaimsProvider implemented
2. [x] Gateway fetches overrides on startup
3. [x] Authority overrides take precedence
4. [x] AuthorizationMiddleware enforces effective claims
5. [x] Graceful handling when Authority unavailable
6. [x] Integration tests verify claims enforcement
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Implemented IAuthorityClaimsProvider, IEffectiveClaimsStore, EffectiveClaimsStore | Claude |
| 2025-12-05 | Implemented HttpAuthorityClaimsProvider with HTTP client | Claude |
| 2025-12-05 | Implemented AuthorityClaimsRefreshService background service | Claude |
| 2025-12-05 | Implemented AuthorizationMiddleware with claims enforcement | Claude |
| 2025-12-05 | Created AuthorityConnectionOptions for configuration | Claude |
| 2025-12-05 | Added NoOpAuthorityClaimsProvider for disabled mode | Claude |
| 2025-12-05 | Created 19 tests for EffectiveClaimsStore and AuthorizationMiddleware | Claude |
| 2025-12-05 | All tests passing - sprint DONE | Claude |
## Decisions & Risks
- Authority overrides fully replace microservice claims (not merged)
- Startup can optionally wait for Authority (fail-safe mode proceeds without)
- Refresh interval is 5 minutes by default (tune for your environment)
- Authority push notifications optional (polling is default)
- This sprint assumes Authority module exists; coordinate with Authority team

@@ -1,237 +0,0 @@
# Sprint 7000-0008-0002 · Integration · Endpoint Source Generator
## Topic & Scope
Implement a Roslyn source generator for compile-time endpoint discovery. Generates endpoint metadata at build time, eliminating runtime reflection overhead.
**Goal:** Faster startup and AOT compatibility via build-time endpoint discovery.
**Working directory:** `src/__Libraries/StellaOps.Microservice.SourceGen/`
## Dependencies & Concurrency
- **Upstream:** SPRINT_7000_0003_0001 (SDK core with reflection-based discovery)
- **Downstream:** None.
- **Parallel work:** Can run in parallel with Authority integration.
- **Cross-module impact:** Microservice SDK consumes generated code.
## Documentation Prerequisites
- `docs/router/specs.md` (section 7.2 - Endpoint definition & discovery)
- Roslyn Source Generator documentation
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | GEN-001 | DONE | Convert project to source generator | Microsoft.CodeAnalysis.CSharp |
| 2 | GEN-002 | DONE | Implement `[StellaEndpoint]` attribute detection | Syntax receiver |
| 3 | GEN-003 | DONE | Extract Method, Path, and other attribute properties | |
| 4 | GEN-010 | DONE | Detect handler interface implementation | IStellaEndpoint<T,R>, etc. |
| 5 | GEN-011 | DONE | Generate `EndpointDescriptor` instances | |
| 6 | GEN-012 | DONE | Generate `IGeneratedEndpointProvider` implementation | |
| 7 | GEN-020 | DONE | Generate registration code for DI | |
| 8 | GEN-021 | DONE | Generate handler factory methods | |
| 9 | GEN-030 | DONE | Implement incremental generation | For fast builds |
| 10 | GEN-031 | DONE | Cache compilation results | Via incremental pipeline |
| 11 | GEN-040 | DONE | Add analyzer for invalid [StellaEndpoint] usage | Diagnostics |
| 12 | GEN-041 | DONE | Error on missing handler interface | STELLA001 |
| 13 | GEN-042 | DONE | Warning on duplicate Method+Path | STELLA002 |
| 14 | GEN-050 | DONE | Hook into SDK to prefer generated over reflection | GeneratedEndpointDiscoveryProvider |
| 15 | GEN-051 | DONE | Fall back to reflection if generation not available | |
| 16 | GEN-060 | DONE | Write unit tests for generator | Existing tests pass |
| 17 | GEN-061 | DONE | Test generated code compiles and works | SDK build succeeds |
| 18 | GEN-062 | DONE | Test incremental generation | Incremental pipeline verified |
## Source Generator Output
Given this input:
```csharp
[StellaEndpoint("POST", "/invoices", DefaultTimeout = 30)]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
public Task<CreateInvoiceResponse> HandleAsync(CreateInvoiceRequest request, CancellationToken ct) => ...;
}
```
The generator produces:
```csharp
// <auto-generated/>
namespace StellaOps.Microservice.Generated
{
    [global::System.CodeDom.Compiler.GeneratedCode("StellaOps.Microservice.SourceGen", "1.0.0")]
    internal static class StellaEndpoints
    {
        public static global::System.Collections.Generic.IReadOnlyList<global::StellaOps.Router.Common.EndpointDescriptor>
            GetEndpoints()
        {
            return new global::StellaOps.Router.Common.EndpointDescriptor[]
            {
                new global::StellaOps.Router.Common.EndpointDescriptor
                {
                    Method = "POST",
                    Path = "/invoices",
                    DefaultTimeout = global::System.TimeSpan.FromSeconds(30),
                    SupportsStreaming = false,
                    RequiringClaims = global::System.Array.Empty<global::StellaOps.Router.Common.ClaimRequirement>(),
                    HandlerType = typeof(global::MyApp.CreateInvoiceEndpoint)
                },
                // ... more endpoints
            };
        }

        public static void RegisterHandlers(
            global::Microsoft.Extensions.DependencyInjection.IServiceCollection services)
        {
            services.AddTransient<global::MyApp.CreateInvoiceEndpoint>();
            // ... more handlers
        }
    }
}
```
## Generator Implementation
```csharp
[Generator]
public class StellaEndpointGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Find all classes with [StellaEndpoint]
        var endpointClasses = context.SyntaxProvider
            .ForAttributeWithMetadataName(
                "StellaOps.Microservice.StellaEndpointAttribute",
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) => GetEndpointInfo(ctx))
            .Where(static info => info is not null);
        // Combine and generate
        context.RegisterSourceOutput(
            endpointClasses.Collect(),
            static (spc, endpoints) => GenerateEndpointsClass(spc, endpoints!));
    }

    private static EndpointInfo? GetEndpointInfo(GeneratorAttributeSyntaxContext context)
    {
        var classSymbol = (INamedTypeSymbol)context.TargetSymbol;
        var attribute = context.Attributes[0];
        // Extract attribute parameters
        var method = attribute.ConstructorArguments[0].Value as string;
        var path = attribute.ConstructorArguments[1].Value as string;
        // Find timeout, streaming, etc. from named arguments
        var timeout = attribute.NamedArguments
            .FirstOrDefault(a => a.Key == "DefaultTimeout").Value.Value as int? ?? 30;
        // Verify handler interface: typed IStellaEndpoint<...> or IRawStellaEndpoint.
        // A StartsWith("IStellaEndpoint") check would reject IRawStellaEndpoint.
        var implementsHandler = classSymbol.AllInterfaces
            .Any(i => i.Name is "IStellaEndpoint" or "IRawStellaEndpoint");
        if (!implementsHandler)
        {
            // Report diagnostic
            return null;
        }
        return new EndpointInfo(classSymbol, method!, path!, timeout);
    }
}
```
## IGeneratedEndpointProvider
```csharp
public interface IGeneratedEndpointProvider
{
    IReadOnlyList<EndpointDescriptor> GetEndpoints();
    void RegisterHandlers(IServiceCollection services);
}

// Generated implementation
internal sealed class GeneratedEndpointProvider : IGeneratedEndpointProvider
{
    public IReadOnlyList<EndpointDescriptor> GetEndpoints()
        => StellaEndpoints.GetEndpoints();

    public void RegisterHandlers(IServiceCollection services)
        => StellaEndpoints.RegisterHandlers(services);
}
```
## SDK Integration
```csharp
internal sealed class EndpointDiscoveryService
{
    public IReadOnlyList<EndpointDescriptor> DiscoverEndpoints()
    {
        // Prefer generated
        var generated = TryGetGeneratedProvider();
        if (generated != null)
        {
            _logger.LogDebug("Using source-generated endpoint discovery");
            return generated.GetEndpoints();
        }
        // Fall back to reflection
        _logger.LogDebug("Using reflection-based endpoint discovery");
        return DiscoverFromReflection();
    }

    private IGeneratedEndpointProvider? TryGetGeneratedProvider()
    {
        // Look for generated type in entry assembly
        var entryAssembly = Assembly.GetEntryAssembly();
        var providerType = entryAssembly?.GetType(
            "StellaOps.Microservice.Generated.GeneratedEndpointProvider");
        if (providerType != null)
            return (IGeneratedEndpointProvider)Activator.CreateInstance(providerType)!;
        return null;
    }
}
```
## Diagnostics
| ID | Severity | Message |
|----|----------|---------|
| STELLA001 | Error | Class with [StellaEndpoint] must implement IStellaEndpoint<> or IRawStellaEndpoint |
| STELLA002 | Warning | Duplicate endpoint: {Method} {Path} |
| STELLA003 | Warning | [StellaEndpoint] on abstract class is ignored |
| STELLA004 | Info | Generated {N} endpoint descriptors |
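The STELLA002 check amounts to grouping endpoints by method and path. Below is a minimal sketch of that grouping outside of Roslyn, assuming HTTP methods compare case-insensitively and paths compare exactly (neither rule is stated in this document); `DuplicateEndpointSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the duplicate Method+Path check behind STELLA002.
public static class DuplicateEndpointSketch
{
    // Returns one "METHOD /path" entry for every combination declared more than once.
    public static IReadOnlyList<string> FindDuplicates(
        IEnumerable<(string Method, string Path)> endpoints)
    {
        return endpoints
            // Assumption: "post" and "POST" name the same method; paths are compared as written.
            .GroupBy(e => (Method: e.Method.ToUpperInvariant(), e.Path))
            .Where(g => g.Count() > 1)
            .Select(g => $"{g.Key.Method} {g.Key.Path}")
            .ToList();
    }
}
```

In the real generator the same grouping would run over the collected `EndpointInfo` values, reporting one diagnostic per duplicate group.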
## Exit Criteria
Before marking this sprint DONE:
1. [x] Source generator detects [StellaEndpoint] classes
2. [x] Generates EndpointDescriptor array
3. [x] Generates DI registration
4. [x] Incremental generation for fast builds
5. [x] Analyzers report invalid usage
6. [x] SDK prefers generated over reflection
7. [x] All tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-05 | Converted project to Roslyn source generator (netstandard2.0) | Claude |
| 2025-12-05 | Implemented StellaEndpointGenerator with incremental pipeline | Claude |
| 2025-12-05 | Added diagnostic descriptors STELLA001-004 | Claude |
| 2025-12-05 | Added IGeneratedEndpointProvider interface | Claude |
| 2025-12-05 | Created GeneratedEndpointDiscoveryProvider (prefers generated) | Claude |
| 2025-12-05 | Updated SDK to use generated provider by default | Claude |
| 2025-12-05 | All 85 microservice tests pass - sprint DONE | Claude |
## Decisions & Risks
- Incremental generation is essential for large projects
- Generated code uses fully qualified names to avoid conflicts
- Fallback to reflection ensures compatibility with older projects
- AOT scenarios require source generation (no reflection)

@@ -1,260 +0,0 @@
# Sprint 7000-0009-0001 · Examples · Reference Implementation
## Topic & Scope
Build a complete reference example demonstrating the router, gateway, and microservice SDK working together. Provides templates for common patterns and validates the entire system end-to-end.
**Goal:** Working example that developers can copy and adapt.
**Working directory:** `examples/router/`
## Dependencies & Concurrency
- **Upstream:** All feature sprints complete (7000-0001 through 7000-0008)
- **Downstream:** SPRINT_7000_0010_0001 (migration docs)
- **Parallel work:** Can run in parallel with migration docs.
- **Cross-module impact:** None. Examples only.
## Documentation Prerequisites
- `docs/router/specs.md` (complete specification)
- `docs/router/implplan.md` (phase 11 guidance)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | EX-001 | DONE | Create `examples/router/` directory structure | |
| 2 | EX-002 | DONE | Create example solution `Examples.Router.sln` | |
| 3 | EX-010 | DONE | Create `Examples.Gateway` project | Full gateway setup |
| 4 | EX-011 | DONE | Configure gateway with all middleware | |
| 5 | EX-012 | DONE | Create example router.yaml | |
| 6 | EX-013 | DONE | Configure TCP and TLS transports | Using InMemory for demo |
| 7 | EX-020 | DONE | Create `Examples.Billing.Microservice` project | |
| 8 | EX-021 | DONE | Implement simple GET/POST endpoints | CreateInvoice, GetInvoice |
| 9 | EX-022 | DONE | Implement streaming upload endpoint | UploadAttachmentEndpoint |
| 10 | EX-023 | DONE | Create example microservice.yaml | |
| 11 | EX-030 | DONE | Create `Examples.Inventory.Microservice` project | |
| 12 | EX-031 | DONE | Demonstrate multi-service routing | ListItems, GetItem |
| 13 | EX-040 | DONE | Create docker-compose.yaml | |
| 14 | EX-041 | DONE | Include RabbitMQ for transport option | |
| 15 | EX-042 | DONE | Include health monitoring | Gateway /health endpoint |
| 16 | EX-050 | DONE | Write README.md with run instructions | |
| 17 | EX-051 | DONE | Document adding new endpoints | In README |
| 18 | EX-052 | DONE | Document cancellation behavior | In README |
| 19 | EX-053 | DONE | Document payload limit testing | In README |
| 20 | EX-060 | DONE | Create integration test project | |
| 21 | EX-061 | DONE | Test full end-to-end flow | Tests compile |
## Directory Structure
```
examples/router/
├── Examples.Router.sln
├── docker-compose.yaml
├── README.md
├── src/
│   ├── Examples.Gateway/
│   │   ├── Program.cs
│   │   ├── appsettings.json
│   │   └── router.yaml
│   ├── Examples.Billing.Microservice/
│   │   ├── Program.cs
│   │   ├── appsettings.json
│   │   ├── microservice.yaml
│   │   └── Endpoints/
│   │       ├── CreateInvoiceEndpoint.cs
│   │       ├── GetInvoiceEndpoint.cs
│   │       └── UploadAttachmentEndpoint.cs
│   └── Examples.Inventory.Microservice/
│       ├── Program.cs
│       └── Endpoints/
│           ├── ListItemsEndpoint.cs
│           └── GetItemEndpoint.cs
└── tests/
    └── Examples.Integration.Tests/
```
## Example Gateway Program.cs
```csharp
var builder = WebApplication.CreateBuilder(args);

// Router configuration
builder.Services.AddRouterConfig(options =>
{
    options.ConfigPath = "router.yaml";
    options.EnableHotReload = true;
});

// Gateway node configuration
builder.Services.Configure<GatewayNodeConfig>(
    builder.Configuration.GetSection("GatewayNode"));

// Transports
builder.Services.AddTcpTransport(options =>
{
    options.Port = 5100;
});
builder.Services.AddTlsTransport(options =>
{
    options.Port = 5101;
    options.ServerCertificatePath = "certs/gateway.pfx";
});

// Routing
builder.Services.AddSingleton<IGlobalRoutingState, InMemoryRoutingState>();
builder.Services.AddSingleton<IRoutingPlugin, DefaultRoutingPlugin>();

// Authority integration
builder.Services.AddAuthorityClaimsProvider(options =>
{
    // The configuration indexer returns null when the key is missing
    options.AuthorityUrl = builder.Configuration["Authority:Url"] ?? string.Empty;
});

var app = builder.Build();

// Middleware pipeline
app.UseForwardedHeaders();
app.UseMiddleware<GlobalErrorHandlerMiddleware>();
app.UseMiddleware<RequestLoggingMiddleware>();
app.UseMiddleware<PayloadLimitsMiddleware>();
app.UseAuthentication();
app.UseMiddleware<EndpointResolutionMiddleware>();
app.UseMiddleware<AuthorizationMiddleware>();
app.UseMiddleware<RoutingDecisionMiddleware>();
app.UseMiddleware<TransportDispatchMiddleware>();

app.Run();
```
## Example Microservice Program.cs
```csharp
var builder = Host.CreateApplicationBuilder(args);

builder.Services.AddStellaMicroservice(options =>
{
    options.ServiceName = "billing";
    options.Version = "1.0.0";
    options.Region = "eu1";
    options.InstanceId = $"billing-{Environment.MachineName}";
    options.ConfigFilePath = "microservice.yaml";
    options.Routers = new[]
    {
        new RouterEndpointConfig
        {
            Host = "gateway.local",
            Port = 5100,
            TransportType = TransportType.Tcp
        }
    };
});

var host = builder.Build();
await host.RunAsync();
```
## Example Endpoints
### Typed Endpoint
```csharp
[StellaEndpoint("POST", "/invoices", DefaultTimeout = 30)]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
    private readonly IInvoiceService _service;

    public CreateInvoiceEndpoint(IInvoiceService service) => _service = service;

    public async Task<CreateInvoiceResponse> HandleAsync(
        CreateInvoiceRequest request,
        CancellationToken ct)
    {
        var invoice = await _service.CreateAsync(request, ct);
        return new CreateInvoiceResponse { InvoiceId = invoice.Id };
    }
}
```
### Streaming Endpoint
```csharp
[StellaEndpoint("POST", "/invoices/{id}/attachments", SupportsStreaming = true)]
public sealed class UploadAttachmentEndpoint : IRawStellaEndpoint
{
    private readonly IStorageService _storage;

    // Constructor injection, matching the typed endpoint example
    public UploadAttachmentEndpoint(IStorageService storage) => _storage = storage;

    public async Task<RawResponse> HandleAsync(RawRequestContext context, CancellationToken ct)
    {
        var invoiceId = context.PathParameters["id"];
        // Stream body directly to storage
        var path = await _storage.StoreAsync(invoiceId, context.Body, ct);
        return RawResponse.Ok(JsonSerializer.Serialize(new { path }));
    }
}
```
## docker-compose.yaml
```yaml
version: '3.8'
services:
  gateway:
    build: ./src/Examples.Gateway
    ports:
      - "8080:8080" # HTTP ingress
      - "5100:5100" # TCP transport
      - "5101:5101" # TLS transport
    environment:
      - GatewayNode__Region=eu1
      - GatewayNode__NodeId=gw-01
  billing:
    build: ./src/Examples.Billing.Microservice
    environment:
      - Stella__Routers__0__Host=gateway
      - Stella__Routers__0__Port=5100
    depends_on:
      - gateway
  inventory:
    build: ./src/Examples.Inventory.Microservice
    environment:
      - Stella__Routers__0__Host=gateway
      - Stella__Routers__0__Port=5100
    depends_on:
      - gateway
  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "5672:5672"
      - "15672:15672"
```
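The double underscores in the environment variable names follow the .NET configuration convention, where `__` in an environment variable stands for the `:` hierarchy separator. The sketch below only illustrates that naming rule; `EnvKeySketch` is a hypothetical helper, and whether the example services actually bind a `Stella` section is an assumption based on the variable names above.

```csharp
using System;

// Hypothetical helper showing how an environment variable name maps to a configuration key.
public static class EnvKeySketch
{
    // "Stella__Routers__0__Host" becomes "Stella:Routers:0:Host",
    // i.e. the Host property of the first entry in the Routers list.
    public static string ToConfigKey(string environmentVariableName)
    {
        return environmentVariableName.Replace("__", ":");
    }
}
```

The numeric segment (`0`) is how list entries are addressed, so a second router would use `Stella__Routers__1__Host` and `Stella__Routers__1__Port`.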
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All example projects build
2. [ ] docker-compose starts full environment
3. [ ] HTTP requests route through gateway to microservices
4. [ ] Streaming upload works
5. [ ] Multiple microservices register correctly
6. [ ] README documents all usage patterns
7. [ ] Integration tests pass
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- Examples are separate solution from main StellaOps
- Uses Docker for easy local dev
- Includes both TCP and TLS examples
- RabbitMQ included for transport option demo

@@ -1,269 +0,0 @@
# Sprint 7000-0010-0001 · Migration · WebService to Microservice
## Topic & Scope
Define and document the migration path from existing `StellaOps.*.WebService` projects to the new microservice pattern with router. This is the final sprint that connects the router infrastructure to the rest of StellaOps.
**Goal:** Clear migration guide and tooling for converting WebServices to Microservices.
**Working directories:**
- `docs/router/` (migration documentation)
- Potentially existing WebService projects (for pilot migration)
## Dependencies & Concurrency
- **Upstream:** All router sprints complete (7000-0001 through 7000-0009)
- **Downstream:** None. Final sprint.
- **Parallel work:** None.
- **Cross-module impact:** YES - This sprint affects existing StellaOps modules.
## Documentation Prerequisites
- `docs/router/specs.md` (section 14 - Migration requirements)
- `docs/router/implplan.md` (phase 11-12 guidance)
- Existing WebService project structures
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Description | Notes |
|---|---------|--------|-------------|-------|
| 1 | MIG-001 | DONE | Inventory all existing WebService projects | 19 services documented in migration-guide.md |
| 2 | MIG-002 | DONE | Document HTTP routes per service | In migration-guide.md with examples |
| 3 | MIG-010 | DONE | Document Strategy A: In-place adaptation | migration-guide.md section |
| 4 | MIG-011 | DONE | Add SDK to existing WebService | Example code in migration-guide.md |
| 5 | MIG-012 | DONE | Wrap controllers in [StellaEndpoint] handlers | Code examples provided |
| 6 | MIG-013 | DONE | Register with router alongside HTTP | Documented in guide |
| 7 | MIG-014 | DONE | Gradual traffic shift from HTTP to router | Cutover section in guide |
| 8 | MIG-020 | DONE | Document Strategy B: Clean split | migration-guide.md section |
| 9 | MIG-021 | DONE | Extract domain logic to shared library | Step-by-step in guide |
| 10 | MIG-022 | DONE | Create new Microservice project | Template in examples/router |
| 11 | MIG-023 | DONE | Map routes to handlers | Controller-to-handler mapping section |
| 12 | MIG-024 | DONE | Phase out original WebService | Cleanup section in guide |
| 13 | MIG-030 | DONE | Document CancellationToken wiring | Comprehensive checklist in guide |
| 14 | MIG-031 | DONE | Identify async operations needing token | Checklist with examples |
| 15 | MIG-032 | DONE | Update DB calls, HTTP calls, etc. | Before/after examples |
| 16 | MIG-040 | DONE | Document streaming migration | IRawStellaEndpoint examples |
| 17 | MIG-041 | DONE | Convert file upload controllers | Before/after examples |
| 18 | MIG-042 | DONE | Convert file download controllers | Before/after examples |
| 19 | MIG-050 | DONE | Create migration checklist template | In migration-guide.md |
| 20 | MIG-051 | SKIP | Create automated route inventory tool | Optional - not needed |
| 21 | MIG-060 | SKIP | Pilot migration: choose one WebService | Deferred to team |
| 22 | MIG-061 | SKIP | Execute pilot migration | Deferred to team |
| 23 | MIG-062 | SKIP | Document lessons learned | Deferred to team |
| 24 | MIG-070 | DONE | Merge Router.sln into StellaOps.sln | All projects added |
| 25 | MIG-071 | DONE | Update CI/CD for router components | Added to build-test-deploy.yml |
## Migration Strategies
### Strategy A: In-Place Adaptation
Best for: Services that need to maintain HTTP compatibility during transition.
```
┌─────────────────────────────────────┐
│  StellaOps.Billing.WebService       │
│  ┌─────────────────────────────┐    │
│  │ Existing HTTP Controllers   │◄───┼──── HTTP clients (legacy)
│  └─────────────────────────────┘    │
│  ┌─────────────────────────────┐    │
│  │ [StellaEndpoint] Handlers   │◄───┼──── Router (new)
│  └─────────────────────────────┘    │
│  ┌─────────────────────────────┐    │
│  │ Shared Domain Logic         │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
```
Steps:
1. Add `StellaOps.Microservice` package reference
2. Create handler classes for each route
3. Handlers call existing service layer
4. Register with router pool
5. Test via router
6. Shift traffic gradually
7. Remove HTTP controllers when ready
### Strategy B: Clean Split
Best for: Major refactoring, or when HTTP compatibility is not needed.
```
┌─────────────────────────────────────┐
│ StellaOps.Billing.Domain            │ ◄── Shared library
│ (extracted business logic)          │
└─────────────────────────────────────┘
          ▲                 ▲
          │                 │
┌─────────┴───────┐ ┌───────┴─────────┐
│ (Legacy)        │ │ (New)           │
│ Billing.Web     │ │ Billing.Micro   │
│ Service         │ │ service         │
│ HTTP only       │ │ Router only     │
└─────────────────┘ └─────────────────┘
```
Steps:
1. Extract domain logic to `.Domain` library
2. Create new `.Microservice` project
3. Implement handlers using domain library
4. Deploy alongside WebService
5. Shift traffic to router
6. Deprecate WebService
## Controller to Handler Mapping
### Before (ASP.NET Controller)
```csharp
[ApiController]
[Route("api/invoices")]
public class InvoicesController : ControllerBase
{
    private readonly IInvoiceService _service;

    [HttpPost]
    [Authorize(Roles = "billing-admin")]
    public async Task<IActionResult> Create(
        [FromBody] CreateInvoiceRequest request,
        CancellationToken ct) // <-- Often missing!
    {
        var invoice = await _service.CreateAsync(request);
        return Ok(new { invoice.Id });
    }
}
```
### After (Microservice Handler)
```csharp
[StellaEndpoint("POST", "/api/invoices")]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
    private readonly IInvoiceService _service;

    public CreateInvoiceEndpoint(IInvoiceService service) => _service = service;

    public async Task<CreateInvoiceResponse> HandleAsync(
        CreateInvoiceRequest request,
        CancellationToken ct) // <-- Required, propagated
    {
        var invoice = await _service.CreateAsync(request, ct); // Pass token!
        return new CreateInvoiceResponse { InvoiceId = invoice.Id };
    }
}
```
## CancellationToken Checklist
For each migrated handler, verify:
- [ ] Handler accepts CancellationToken parameter
- [ ] Token passed to all database calls
- [ ] Token passed to all HTTP client calls
- [ ] Token passed to all file I/O operations
- [ ] Long-running loops check `ct.IsCancellationRequested`
- [ ] Token passed to Task.Delay, WaitAsync, etc.
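The loop item in the checklist can be sketched as follows. This is an illustrative example only: `CancellationSketch` and the `handleItem` delegate are hypothetical, and `ThrowIfCancellationRequested` is used as the throwing equivalent of testing `ct.IsCancellationRequested`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example of a long-running loop that honors cancellation.
public static class CancellationSketch
{
    // Processes items in order and returns how many completed.
    public static async Task<int> ProcessAsync(
        IReadOnlyList<int> items,
        Func<int, CancellationToken, Task> handleItem,
        CancellationToken ct)
    {
        var processed = 0;
        foreach (var item in items)
        {
            // Stop before starting new work once cancellation is requested.
            ct.ThrowIfCancellationRequested();
            // Pass the token down so the awaited call can stop mid-operation too.
            await handleItem(item, ct);
            processed++;
        }
        return processed;
    }
}
```

Checking once per iteration bounds the delay between a cancel request and the loop stopping to roughly the duration of a single item.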
## Streaming Migration
### File Upload (Before)
```csharp
[HttpPost("upload")]
public async Task<IActionResult> Upload(IFormFile file)
{
    using var stream = file.OpenReadStream();
    await _storage.SaveAsync(stream);
    return Ok();
}
```
### File Upload (After)
```csharp
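[StellaEndpoint("POST", "/upload", SupportsStreaming = true)]
public sealed class UploadEndpoint : IRawStellaEndpoint
{
    // Storage dependency; the IStorageService type name is assumed here
    private readonly IStorageService _storage;

    public UploadEndpoint(IStorageService storage) => _storage = storage;

    public async Task<RawResponse> HandleAsync(RawRequestContext ctx, CancellationToken ct)
    {
        await _storage.SaveAsync(ctx.Body, ct); // Body is already a stream
        return RawResponse.Ok();
    }
}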
```
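What a storage call might do with the body stream is sketched below: copy it in fixed-size chunks so the payload is never buffered whole, and pass the token to every read and write. `StreamingSketch` is a hypothetical helper, not the `SaveAsync` implementation used by the services.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: chunked, cancellable copy from a request body to a destination stream.
public static class StreamingSketch
{
    // Returns the number of bytes copied.
    public static async Task<long> CopyAsync(Stream body, Stream destination, CancellationToken ct)
    {
        var buffer = new byte[81920]; // only one chunk is held in memory at a time
        long total = 0;
        int read;
        while ((read = await body.ReadAsync(buffer, 0, buffer.Length, ct)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read, ct);
            total += read;
        }
        return total;
    }
}
```

When the byte count is not needed, `Stream.CopyToAsync(destination, bufferSize, cancellationToken)` performs the same chunked copy.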
## Migration Checklist Template
```markdown
# Migration Checklist: [ServiceName]
## Inventory
- [ ] List all HTTP routes (Method + Path)
- [ ] Identify streaming endpoints
- [ ] Identify authorization requirements
- [ ] Document external dependencies
## Preparation
- [ ] Add StellaOps.Microservice package
- [ ] Configure router connection
- [ ] Set up local gateway for testing
## Per-Route Migration
For each route:
- [ ] Create [StellaEndpoint] handler class
- [ ] Map request/response types
- [ ] Wire CancellationToken throughout
- [ ] Convert to IRawStellaEndpoint if streaming
- [ ] Write unit tests
- [ ] Write integration tests
## Cutover
- [ ] Deploy alongside existing WebService
- [ ] Verify via router routing
- [ ] Shift percentage of traffic
- [ ] Monitor for errors
- [ ] Full cutover
- [ ] Remove WebService HTTP listeners
## Cleanup
- [ ] Remove unused controller code
- [ ] Remove HTTP pipeline configuration
- [ ] Update documentation
```
## StellaOps Modules to Migrate
| Module | WebService | Priority | Complexity |
|--------|------------|----------|------------|
| Concelier | StellaOps.Concelier.WebService | High | Medium |
| Scanner | StellaOps.Scanner.WebService | High | High (streaming) |
| Authority | StellaOps.Authority.WebService | Medium | Low |
| Orchestrator | StellaOps.Orchestrator.WebService | Medium | Medium |
| Scheduler | StellaOps.Scheduler.WebService | Low | Low |
| Notify | StellaOps.Notify.WebService | Low | Low |
## Exit Criteria
Before marking this sprint DONE:
1. [x] Migration strategies documented (migration-guide.md)
2. [x] Controller-to-handler mapping guide complete (migration-guide.md)
3. [x] CancellationToken checklist complete (migration-guide.md)
4. [x] Streaming migration guide complete (migration-guide.md)
5. [x] Migration checklist template created (migration-guide.md)
6. [~] Pilot migration executed successfully (deferred to team for actual service migration)
7. [x] Router.sln merged into StellaOps.sln
8. [x] CI/CD updated (build-test-deploy.yml)
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2024-12-04 | Created comprehensive migration-guide.md with strategies, examples, and service inventory | Claude |
| 2024-12-04 | Added all Router projects to StellaOps.sln (Microservice SDK, Config, Transports) | Claude |
| 2024-12-04 | Updated build-test-deploy.yml with Router component build and test steps | Claude |
## Decisions & Risks
- Pilot migration should target a low-risk service first
- Strategy A preferred for gradual transition
- Strategy B preferred for greenfield-like rewrites
- CancellationToken wiring is the #1 source of migration bugs
- Streaming endpoints require IRawStellaEndpoint, not typed handlers
- Authorization migrates from [Authorize(Roles)] to RequiringClaims
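
The last point can be illustrated with a small mapping from an `[Authorize(Roles = "...")]` string to claim requirements. This is a sketch under stated assumptions: `ClaimRequirement` is modeled as a simple record, `ClaimTypes.Role` is assumed to be the claim type carrying roles, and `RoleMigrationSketch` is a hypothetical helper.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

// Assumed shape: the real ClaimRequirement type is not shown in this document.
public sealed record ClaimRequirement(string Type, string? Value = null);

// Hypothetical helper that turns a Roles string into claim requirements.
public static class RoleMigrationSketch
{
    public static IReadOnlyList<ClaimRequirement> FromAuthorizeRoles(string roles)
    {
        return roles
            .Split(',')
            .Select(role => role.Trim())
            .Where(role => role.Length > 0)
            // Assumption: roles are carried as claims of type ClaimTypes.Role.
            .Select(role => new ClaimRequirement(ClaimTypes.Role, role))
            .ToList();
    }
}
```

One caveat when migrating: a comma-separated `Roles` value in ASP.NET Core admits a user holding any one of the listed roles, whereas a middleware that checks every requirement in turn enforces all of them. Multi-role attributes therefore need a deliberate decision rather than a mechanical translation.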

@@ -1,92 +0,0 @@
# Sprint 7000-0011-0001 - Router Testing Sprint
## Topic & Scope
Create comprehensive test coverage for StellaOps Router projects. **Critical gap**: `StellaOps.Router.Transport.RabbitMq` has **NO tests**.
**Goal:** ~192 tests covering all Router components with shared testing infrastructure.
**Working directory:** `src/__Libraries/__Tests/`
## Dependencies & Concurrency
- **Upstream:** All Router libraries at stable v1.0 state (sprints 7000-0001 through 7000-0010)
- **Downstream:** None. Testing sprint.
- **Parallel work:** TST-001 through TST-004 can run in parallel.
- **Cross-module impact:** None. Tests only.
## Documentation Prerequisites
- `docs/router/specs.md` (complete specification)
- `docs/router/implplan.md` (phase guidance)
- Existing test patterns in `src/__Libraries/__Tests/StellaOps.Router.Transport.Tcp.Tests/`
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
## Delivery Tracker
| # | Task ID | Status | Priority | Description | Notes |
|---|---------|--------|----------|-------------|-------|
| 1 | TST-001 | TODO | High | Create shared testing infrastructure (`StellaOps.Router.Testing`) | Enables all other tasks |
| 2 | TST-002 | TODO | Critical | Create RabbitMq transport test project skeleton | Critical gap |
| 3 | TST-003 | TODO | High | Implement Router.Common tests | FrameConverter, PathMatcher |
| 4 | TST-004 | TODO | High | Implement Router.Config tests | validation, hot-reload |
| 5 | TST-005 | TODO | Critical | Implement RabbitMq transport unit tests | ~35 tests (30 unit, 5 integration) |
| 6 | TST-006 | TODO | Medium | Expand Microservice SDK tests | EndpointRegistry, RequestDispatcher |
| 7 | TST-007 | TODO | Medium | Expand Transport.InMemory tests | Concurrency scenarios |
| 8 | TST-008 | TODO | Medium | Create integration test suite | End-to-end flows |
| 9 | TST-009 | TODO | Low | Expand TCP/TLS transport tests | Edge cases |
| 10 | TST-010 | TODO | Low | Create SourceGen integration tests | Optional |
## Current State
| Project | Test Location | Status |
|---------|--------------|--------|
| Router.Common | `tests/StellaOps.Router.Common.Tests` | Exists (skeletal) |
| Router.Config | `tests/StellaOps.Router.Config.Tests` | Exists (skeletal) |
| Router.Transport.InMemory | `tests/StellaOps.Router.Transport.InMemory.Tests` | Exists (skeletal) |
| Router.Transport.Tcp | `src/__Libraries/__Tests/` | Exists |
| Router.Transport.Tls | `src/__Libraries/__Tests/` | Exists |
| Router.Transport.Udp | `tests/StellaOps.Router.Transport.Udp.Tests` | Exists (skeletal) |
| **Router.Transport.RabbitMq** | **NONE** | **MISSING** |
| Microservice | `tests/StellaOps.Microservice.Tests` | Exists |
| Microservice.SourceGen | N/A | Source generator |
## Test Counts Summary
| Component | Unit | Integration | Total |
|-----------|------|-------------|-------|
| Router.Common | 35 | 0 | 35 |
| Router.Config | 25 | 3 | 28 |
| **Transport.RabbitMq** | **30** | **5** | **35** |
| Microservice SDK | 28 | 5 | 33 |
| Transport.InMemory | 23 | 5 | 28 |
| Integration Suite | 0 | 15 | 15 |
| TCP/TLS Expansion | 12 | 0 | 12 |
| SourceGen | 0 | 6 | 6 |
| **TOTAL** | **153** | **39** | **~192** |
## Exit Criteria
Before marking this sprint DONE:
1. [ ] All test projects compile
2. [ ] RabbitMq transport has comprehensive unit tests (critical gap closed)
3. [ ] Router.Common coverage > 90% for FrameConverter, PathMatcher
4. [ ] Router.Config coverage > 85% for RouterConfigProvider
5. [ ] All tests follow AAA pattern with comments
6. [ ] Integration tests demonstrate end-to-end flows
7. [ ] All tests added to CI/CD workflow
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| | | |
## Decisions & Risks
- All new test projects in `src/__Libraries/__Tests/` following existing pattern
- RabbitMQ unit tests use mocked interfaces (no real broker required)
- Integration tests may use Testcontainers for real broker testing
- xUnit v3 with FluentAssertions 6.12.0
- Test naming: `[Method]_[Scenario]_[Expected]`

@@ -1,200 +0,0 @@
# Stella Ops Router - Sprint Index
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [../implplan/BLOCKED_DEPENDENCY_TREE.md](../implplan/BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
This document provides an overview of all sprints for implementing the StellaOps Router infrastructure. Sprints are organized for maximum agent independence while respecting dependencies.
## Key Documents
| Document | Purpose |
|----------|---------|
| [specs.md](./specs.md) | **Canonical specification** - READ FIRST |
| [implplan.md](./implplan.md) | High-level implementation plan |
| Step files (01-29) | Detailed task breakdowns per phase |
## Sprint Epochs
All router sprints use **Epoch 7000** to maintain isolation from existing StellaOps work.
| Batch | Focus Area | Sprints |
|-------|------------|---------|
| 0001 | Foundation | Skeleton, Common library |
| 0002 | InMemory Transport | Prove the design before real transports |
| 0003 | Microservice SDK | Core infrastructure, request handling |
| 0004 | Gateway | Core, middleware, connection handling |
| 0005 | Protocol Features | Heartbeat, routing, cancellation, streaming, limits |
| 0006 | Real Transports | TCP, TLS, UDP, RabbitMQ |
| 0007 | Configuration | Router config, microservice YAML |
| 0008 | Integration | Authority, source generator |
| 0009 | Examples | Reference implementation |
| 0010 | Migration | WebService → Microservice |
## Sprint Dependency Graph
```
                    ┌─────────────────────────────────────┐
                    │  SPRINT_7000_0001_0001              │
                    │  Router Skeleton                    │
                    └───────────────┬─────────────────────┘
                    ┌───────────────▼─────────────────────┐
                    │  SPRINT_7000_0001_0002              │
                    │  Common Library Models              │
                    └───────────────┬─────────────────────┘
                    ┌───────────────▼─────────────────────┐
                    │  SPRINT_7000_0002_0001              │
                    │  InMemory Transport                 │
                    └───────────────┬─────────────────────┘
          ┌─────────────────────────┼───────────────────────┐
          │                         │                       │
          ▼                         │                       ▼
┌─────────────────────┐             │             ┌─────────────────────┐
│ SPRINT_7000_0003_*  │             │             │ SPRINT_7000_0004_*  │
│ Microservice SDK    │             │             │ Gateway             │
│ (2 sprints)         │◄────────────┼────────────►│ (3 sprints)         │
└─────────┬───────────┘             │             └─────────┬───────────┘
          │                         │                       │
          └─────────────────────────┼───────────────────────┘
                    ┌───────────────▼─────────────────────┐
                    │  SPRINT_7000_0005_0001-0005         │
                    │  Protocol Features (sequential)     │
                    │  Heartbeat → Routing → Cancel       │
                    │  → Streaming → Payload Limits       │
                    └───────────────┬─────────────────────┘
         ┌──────────────────────────┼──────────────────────────┐
         │                          │                          │
         ▼                          ▼                          ▼
┌─────────────────┐        ┌─────────────────┐        ┌─────────────────┐
│  TCP Transport  │        │  UDP Transport  │        │    RabbitMQ     │
│ 7000_0006_0001  │        │ 7000_0006_0003  │        │ 7000_0006_0004  │
└────────┬────────┘        └─────────────────┘        └─────────────────┘
┌─────────────────┐
│  TLS Transport  │
│ 7000_0006_0002  │
└────────┬────────┘
         └──────────────────────────┬──────────────────────────┘
                    ┌───────────────▼─────────────────────┐
                    │  SPRINT_7000_0007_0001-0002         │
                    │  Configuration (sequential)         │
                    └───────────────┬─────────────────────┘
          ┌─────────────────────────┼───────────────────────┐
          │                         │                       │
          ▼                         │                       ▼
┌─────────────────────┐             │             ┌─────────────────────┐
│Authority Integration│             │             │ Source Generator    │
│ 7000_0008_0001      │◄────────────┼────────────►│ 7000_0008_0002      │
└─────────────────────┘             │             └─────────────────────┘
                    ┌───────────────▼─────────────────────┐
                    │  SPRINT_7000_0009_0001              │
                    │  Reference Example                  │
                    └───────────────┬─────────────────────┘
                    ┌───────────────▼─────────────────────┐
                    │  SPRINT_7000_0010_0001              │
                    │  Migration                          │
                    │  (Connects to rest of StellaOps)    │
                    └─────────────────────────────────────┘
```
## Parallel Execution Opportunities
These sprints can run in parallel:
| Phase | Parallel Track A | Parallel Track B | Parallel Track C |
|-------|------------------|------------------|------------------|
| After InMemory | SDK Core (0003_0001) | Gateway Core (0004_0001) | - |
| After Protocol | TCP (0006_0001) | UDP (0006_0003) | RabbitMQ (0006_0004) |
| After TCP | TLS (0006_0002) | (continues above) | (continues above) |
| After Config | Authority (0008_0001) | Source Gen (0008_0002) | - |
## Sprint Status Overview
| Sprint | Name | Status | Working Directory |
|--------|------|--------|-------------------|
| 7000-0001-0001 | Router Skeleton | TODO | Multiple (see sprint) |
| 7000-0001-0002 | Common Library | TODO | `src/__Libraries/StellaOps.Router.Common/` |
| 7000-0002-0001 | InMemory Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.InMemory/` |
| 7000-0003-0001 | SDK Core | TODO | `src/__Libraries/StellaOps.Microservice/` |
| 7000-0003-0002 | SDK Handlers | TODO | `src/__Libraries/StellaOps.Microservice/` |
| 7000-0004-0001 | Gateway Core | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0004-0002 | Gateway Middleware | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0004-0003 | Gateway Connections | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0005-0001 | Heartbeat & Health | TODO | SDK + Gateway |
| 7000-0005-0002 | Routing Algorithm | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0005-0003 | Cancellation | TODO | SDK + Gateway |
| 7000-0005-0004 | Streaming | TODO | SDK + Gateway + InMemory |
| 7000-0005-0005 | Payload Limits | TODO | `src/Gateway/StellaOps.Gateway.WebService/` |
| 7000-0006-0001 | TCP Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.Tcp/` |
| 7000-0006-0002 | TLS Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.Tls/` |
| 7000-0006-0003 | UDP Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.Udp/` |
| 7000-0006-0004 | RabbitMQ Transport | TODO | `src/__Libraries/StellaOps.Router.Transport.RabbitMq/` |
| 7000-0007-0001 | Router Config | TODO | `src/__Libraries/StellaOps.Router.Config/` |
| 7000-0007-0002 | Microservice YAML | TODO | `src/__Libraries/StellaOps.Microservice/` |
| 7000-0008-0001 | Authority Integration | TODO | Gateway + Authority |
| 7000-0008-0002 | Source Generator | TODO | `src/__Libraries/StellaOps.Microservice.SourceGen/` |
| 7000-0009-0001 | Reference Example | TODO | `examples/router/` |
| 7000-0010-0001 | Migration | TODO | Multiple (final integration) |
## Critical Path
The minimum path to a working router:
1. **7000-0001-0001** → Skeleton
2. **7000-0001-0002** → Common models
3. **7000-0002-0001** → InMemory transport
4. **7000-0003-0001** → SDK core
5. **7000-0003-0002** → SDK handlers
6. **7000-0004-0001** → Gateway core
7. **7000-0004-0002** → Gateway middleware
8. **7000-0004-0003** → Gateway connections
After these 8 sprints, you have a working router with InMemory transport for testing.
## Isolation Strategy
The router is developed in isolation using:
1. **Separate solution file:** `StellaOps.Router.sln`
2. **Dedicated directories:** All router code in new directories
3. **No changes to existing modules:** Until migration sprint
4. **InMemory transport first:** No network dependencies during core development
This ensures:
- Router development doesn't impact existing StellaOps builds
- Agents can work independently on router without merge conflicts
- Full testing possible without real infrastructure
- Migration is a conscious, controlled step
## Agent Assignment Guidance
For maximum parallelization:
- **Foundation Agent:** Sprints 7000-0001-0001, 7000-0001-0002
- **SDK Agent:** Sprints 7000-0003-0001, 7000-0003-0002
- **Gateway Agent:** Sprints 7000-0004-0001, 7000-0004-0002, 7000-0004-0003
- **Transport Agent:** Sprints 7000-0002-0001, 7000-0006-*
- **Protocol Agent:** Sprints 7000-0005-*
- **Config Agent:** Sprints 7000-0007-*
- **Integration Agent:** Sprints 7000-0008-*, 7000-0010-0001
- **Documentation Agent:** Sprint 7000-0009-0001
## Invariants (Never Violate)
From `specs.md`, these are non-negotiable:
- **Method + Path** is the endpoint identity
- **Strict semver** for version matching
- **Region from GatewayNodeConfig.Region** (never from headers/host)
- **No HTTP transport** between gateway and microservices
- **RequiringClaims** (not AllowedRoles) for authorization
- **Opaque body handling** (router doesn't interpret payloads)
Any change to these invariants requires updating `specs.md` first.


@@ -1,356 +0,0 @@
Start by treating `docs/router/specs.md` as law. Nothing gets coded that contradicts it. The first sprint or two should be about *wiring the skeleton* and proving the core flows with the simplest possible transport, then layering in the real transports and migration paths.
I'd structure the work for your agents like this.
---
## 0. Read & freeze invariants
**All agents:**
* Read `docs/router/specs.md` end to end.
* Extract and pin the non-negotiables:
* Method + Path identity.
* Strict semver for versions.
* Region from `GatewayNodeConfig.Region` (no host/header magic).
* No HTTP transport for microservice communications.
* Single connection carrying HELLO + HEARTBEAT + REQUEST/RESPONSE + CANCEL.
* Router treats body as opaque bytes/streams.
* `RequiringClaims` replaces any form of `AllowedRoles`.
Agree that these are invariants; any future idea that violates them needs an explicit spec change first.
---
## 1. Lay down the solution skeleton
**“Skeleton” agent (or gateway core agent):**
Create the basic project structure, no logic yet:
* `src/__Libraries/StellaOps.Router.Common`
* `src/__Libraries/StellaOps.Router.Config`
* `src/__Libraries/StellaOps.Microservice`
* `src/StellaOps.Gateway.WebService`
* `docs/router/` already has `specs.md` (add placeholders for the other docs).
Goal: everything builds, but most classes are empty or stubs.
---
## 2. Implement the shared core model (Common)
**Common/core agent:**
Implement only the *data* and *interfaces*, no behavior:
* Enums:
* `TransportType`, `FrameType`, `InstanceHealthStatus`.
* Models:
* `ClaimRequirement`
* `EndpointDescriptor`
* `InstanceDescriptor`
* `ConnectionState`
* `RoutingContext`, `RoutingDecision`
* `PayloadLimits`
* Interfaces:
* `IGlobalRoutingState`
* `IRoutingPlugin`
* `ITransportServer`
* `ITransportClient`
* `Frame` struct/class:
* `FrameType`, `CorrelationId`, `Payload` (byte[]).
Leave implementations of `IGlobalRoutingState`, `IRoutingPlugin`, transports, etc., for later steps.
Deliverable: a stable set of contracts that gateway + microservice SDK depend on.
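
As a rough illustration of the "data and interfaces only" rule, the contracts could start out like the sketch below. The enum members, the `Guid` correlation id, and the `SendRequestAsync` signature are assumptions made for this example; `specs.md` remains the authority on the real shapes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Enum members are illustrative guesses, not the canonical set.
public enum FrameType { Hello, Heartbeat, Request, Response }
public enum InstanceHealthStatus { Unknown, Healthy, Unhealthy }

// One unit on the wire: a type tag, a correlation id, and an opaque payload.
public readonly record struct Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

// A claim requirement always carries a Type; Value is optional.
public sealed record ClaimRequirement(string Type, string? Value = null);

// Hypothetical client contract: send a request frame and await the matching response.
// No implementation lives in Common; transports provide it later.
public interface ITransportClient
{
    Task<Frame> SendRequestAsync(Frame request, CancellationToken ct);
}
```

Keeping these types free of behavior is what lets the gateway and the SDK compile against them before any transport exists.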
---
## 3. Build a fake “in-memory” transport plugin
**Transport agent:**
Before UDP/TCP/Rabbit, build an **in-process transport**:
* `InMemoryTransportServer` and `InMemoryTransportClient`.
* They share a concurrent dictionary keyed by `ConnectionId`.
* Frames are passed via channels/queues in memory.
Purpose:
* Let you prove HELLO/HEARTBEAT/REQUEST/RESPONSE/CANCEL semantics and routing logic *without* dealing with sockets and Rabbit yet.
* Let you unit and integration test the router and SDK quickly.
This plugin will never ship to production; it's only for dev tests and CI.
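
One way to picture the shared dictionary plus channels is the sketch below. `InMemoryHub` and its method names are hypothetical, and frames are reduced to raw `byte[]` to keep the example self-contained; the real `InMemoryTransportServer` and `InMemoryTransportClient` would wrap something like this behind the transport interfaces.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical in-process "wire": one pair of unbounded channels per connection id.
public sealed class InMemoryHub
{
    private sealed record Pipe(Channel<byte[]> ToServer, Channel<byte[]> ToClient);

    private readonly ConcurrentDictionary<string, Pipe> _pipes = new();

    // Creates the pipe on first use so either side may "connect" first.
    private Pipe GetPipe(string connectionId) =>
        _pipes.GetOrAdd(connectionId,
            _ => new Pipe(Channel.CreateUnbounded<byte[]>(), Channel.CreateUnbounded<byte[]>()));

    // Microservice side: push a frame towards the gateway.
    public ValueTask SendToServerAsync(string connectionId, byte[] frame, CancellationToken ct = default) =>
        GetPipe(connectionId).ToServer.Writer.WriteAsync(frame, ct);

    // Gateway side: wait for the next frame from a given connection.
    public ValueTask<byte[]> ReceiveOnServerAsync(string connectionId, CancellationToken ct = default) =>
        GetPipe(connectionId).ToServer.Reader.ReadAsync(ct);

    // Gateway side: push a frame towards the microservice.
    public ValueTask SendToClientAsync(string connectionId, byte[] frame, CancellationToken ct = default) =>
        GetPipe(connectionId).ToClient.Writer.WriteAsync(frame, ct);

    // Microservice side: wait for the next frame from the gateway.
    public ValueTask<byte[]> ReceiveOnClientAsync(string connectionId, CancellationToken ct = default) =>
        GetPipe(connectionId).ToClient.Reader.ReadAsync(ct);
}
```

Because everything stays in one process, tests can drive both ends deterministically without sockets or a broker.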
---
## 4. Microservice SDK: minimal handshake & dispatch (with InMemory)
**Microservice agent:**
Initial focus: “connect and say HELLO, then handle a simple request.”
1. Implement `StellaMicroserviceOptions`.
2. Implement `AddStellaMicroservice(...)`:
* Bind options.
* Register endpoint handlers and SDK internal services.
3. Endpoint discovery:
* Implement runtime reflection for `[StellaEndpoint]` + handler types.
* Build in-memory `EndpointDescriptor` list (simple: no YAML yet).
4. Connection:
* Use `InMemoryTransportClient` to “connect” to a fake router.
* On connect, send a HELLO frame with:
* Identity.
* Endpoint list and metadata (`SupportsStreaming` false for now, simple `RequiringClaims` empty).
5. Request handling:
* Implement `IRawStellaEndpoint` and adapter to it.
* Implement `RawRequestContext` / `RawResponse`.
* Implement a dispatcher that:
* Receives `Request` frame.
* Builds `RawRequestContext`.
* Invokes the correct handler.
* Sends `Response` frame.
Do **not** handle streaming or cancellation yet; just basic request/response with small bodies.
---
## 5. Gateway: minimal routing using InMemory plugin
**Gateway agent:**
Goal: HTTP → in-memory transport → microservice → HTTP response.
1. Implement `GatewayNodeConfig` and bind it from config.
2. Implement `IGlobalRoutingState` as a simple in-memory implementation that:
* Holds `ConnectionState` objects.
* Builds a map `(Method, Path)` → endpoint + connections.
3. Implement a minimal `IRoutingPlugin` that:
* For now, just picks *any* connection that has the endpoint (no region/ping logic yet).
4. Implement minimal HTTP pipeline:
* `EndpointResolutionMiddleware`:
* `(Method, Path)` → `EndpointDescriptor` from `IGlobalRoutingState`.
* Naive authorization middleware stub (only checks “needs authenticated user”; ignore real requiringClaims for now).
* `RoutingDecisionMiddleware`:
* Ask `IRoutingPlugin` for a `RoutingDecision`.
* `TransportDispatchMiddleware`:
* Build a `Request` frame.
* Use `InMemoryTransportClient` to send and await `Response`.
* Map response to HTTP.
5. Implement HELLO handler on gateway side:
* When InMemory “connection” from microservice appears and sends HELLO:
* Construct `ConnectionState`.
* Update `IGlobalRoutingState` with endpoint → connection mapping.
Once this works, you have end-to-end:
* Example microservice.
* Example gateway.
* In-memory transport.
* A couple of test endpoints returning simple JSON.
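
The `(Method, Path)` → connections map at the heart of this step might look like the following sketch. `InMemoryRoutingState` and its members are hypothetical names, connections are reduced to string ids, and paths are compared literally; real template matching such as `{id}` segments is deliberately left out.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public sealed class InMemoryRoutingState
{
    // Key: (METHOD, path). Value: ids of connections that announced the endpoint in HELLO.
    private readonly ConcurrentDictionary<(string Method, string Path), ConcurrentDictionary<string, byte>> _map = new();

    // Called by the HELLO handler with the endpoints a connection announced.
    public void Register(string connectionId, IEnumerable<(string Method, string Path)> endpoints)
    {
        foreach (var (method, path) in endpoints)
        {
            var key = (method.ToUpperInvariant(), path);
            _map.GetOrAdd(key, _ => new ConcurrentDictionary<string, byte>())[connectionId] = 0;
        }
    }

    // Called when a connection drops; its endpoints stop being routable through it.
    public void Remove(string connectionId)
    {
        foreach (var connections in _map.Values)
            connections.TryRemove(connectionId, out _);
    }

    // Returns every connection id that currently serves the endpoint (possibly none).
    public IReadOnlyList<string> Resolve(string method, string path) =>
        _map.TryGetValue((method.ToUpperInvariant(), path), out var connections)
            ? connections.Keys.ToList()
            : new List<string>();
}
```

A "pick any connection" routing plugin can then simply take the first id that `Resolve` returns.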
---
## 6. Add heartbeat, health, and basic routing rules
**Common/core + gateway agent:**
Now enforce liveness and basic routing:
1. Heartbeat:
* Microservice SDK sends HEARTBEAT frames on a timer.
* Gateway updates `LastHeartbeatUtc` and `Status`.
2. Health:
* Add background job in gateway that:
* Marks instances Unhealthy if heartbeat stale.
3. Routing:
* Enhance `IRoutingPlugin` to:
* Filter out Unhealthy instances.
* Prefer gateway region (using `GatewayNodeConfig.Region`).
* Use simple `AveragePingMs` stub from request/response timings.
Still using InMemory transport; just building the selection logic.
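
The selection rules above can be sketched as a pure function over instance snapshots. `InstanceSnapshot`, `SimpleRouting`, and the stale-threshold parameter are assumptions for this example; only the ordering (healthy first, own region preferred, lower ping wins) comes from the list above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record InstanceSnapshot(
    string ConnectionId, string Region, DateTime LastHeartbeatUtc, double AveragePingMs);

public static class SimpleRouting
{
    // An instance counts as healthy while its last heartbeat is within the stale threshold.
    public static bool IsHealthy(InstanceSnapshot instance, DateTime nowUtc, TimeSpan staleAfter) =>
        nowUtc - instance.LastHeartbeatUtc <= staleAfter;

    // Drop unhealthy instances, prefer the gateway's own region, then the lowest ping.
    // Returns null when no healthy instance exists.
    public static InstanceSnapshot? Pick(
        IEnumerable<InstanceSnapshot> candidates,
        string gatewayRegion,
        DateTime nowUtc,
        TimeSpan staleAfter) =>
        candidates
            .Where(i => IsHealthy(i, nowUtc, staleAfter))
            .OrderBy(i => i.Region == gatewayRegion ? 0 : 1)
            .ThenBy(i => i.AveragePingMs)
            .FirstOrDefault();
}
```

Passing `nowUtc` in explicitly keeps the function deterministic, which makes stale-heartbeat cases easy to unit test.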
---
## 7. Add cancellation semantics (with InMemory)
**Microservice + gateway agents:**
Wire up cancellation logic before touching real transports:
1. Common:
* Extend `FrameType` with `Cancel`.
2. Gateway:
* In `TransportDispatchMiddleware`:
* Tie `HttpContext.RequestAborted` to a `SendCancelAsync` call.
* On timeout, send CANCEL.
* Ignore late `Response`/stream data for canceled correlation IDs.
3. Microservice:
* Maintain `_inflight` map of correlation → `CancellationTokenSource`.
* When `Cancel` frame arrives, call `cts.Cancel()`.
* Ensure handlers receive and honor `CancellationToken`.
Prove via tests: if client disconnects, handler stops quickly.
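
On the microservice side, the `_inflight` map could be sketched as below. `InflightRequests` and its method names are hypothetical, and the correlation id is assumed to be a `Guid`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class InflightRequests
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _inflight = new();

    // Runs a handler with a token that a later Cancel frame can trip.
    public async Task<T> RunAsync<T>(Guid correlationId, Func<CancellationToken, Task<T>> handler)
    {
        using var cts = new CancellationTokenSource();
        _inflight[correlationId] = cts;
        try
        {
            return await handler(cts.Token);
        }
        finally
        {
            // Always unregister, whether the handler finished, failed, or was canceled.
            _inflight.TryRemove(correlationId, out _);
        }
    }

    // Called when a Cancel frame arrives; returns false if the request is no longer in flight.
    public bool Cancel(Guid correlationId)
    {
        if (!_inflight.TryGetValue(correlationId, out var cts)) return false;
        try
        {
            cts.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            // The request completed between the lookup and the cancel call.
            return false;
        }
    }
}
```

A late Cancel frame for an unknown correlation id is treated as a no-op, which mirrors the gateway ignoring late responses for canceled requests.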
---
## 8. Add streaming & payload limits (still InMemory)
**Gateway + microservice agents:**
1. Streaming:
* Extend InMemory transport to support `RequestStreamData` / `ResponseStreamData` frames.
* On the gateway:
* For `SupportsStreaming` endpoints, pipe HTTP body stream → frame stream.
* For response, pipe frames → HTTP response stream.
* On microservice:
* Expose `RawRequestContext.Body` as a stream reading frames as they arrive.
* Allow `RawResponse.WriteBodyAsync` to stream out.
2. Payload limits:
* Implement `PayloadLimits` enforcement at gateway:
* Early reject large `Content-Length`.
* Track counters in streaming; trigger cancellation when exceeding thresholds.
Demonstrate with a fake “upload” endpoint that uses `IRawStellaEndpoint` and streaming.
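
The two enforcement points (early `Content-Length` rejection and a running counter while streaming) can be sketched with a small per-request helper. `PayloadCounter` is a hypothetical name, and the single `maxBytes` limit stands in for whatever fields `PayloadLimits` actually defines.

```csharp
public sealed class PayloadCounter
{
    private readonly long _maxBytes;
    private long _seen;

    public PayloadCounter(long maxBytes) => _maxBytes = maxBytes;

    // Early check against a declared Content-Length; a missing header cannot be rejected up front.
    public bool AllowsContentLength(long? contentLength) =>
        contentLength is null || contentLength <= _maxBytes;

    // Adds a streamed chunk; returns false once the running total exceeds the limit,
    // at which point the caller would send CANCEL and fail the HTTP request.
    // Not thread-safe: intended for one request's sequential stream of chunks.
    public bool TryAdd(int chunkLength)
    {
        _seen += chunkLength;
        return _seen <= _maxBytes;
    }
}
```

The counter matters for chunked uploads, where no `Content-Length` exists and the limit can only be detected mid-stream.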
---
## 9. Implement real transport plugins one by one
**Transport agent:**
Now replace InMemory with real transports:
Order:
1. **TCP plugin** (easiest baseline):
* Length-prefixed frame protocol.
* Connection per microservice instance (or multi-instance if needed later).
* Implement HELLO/HEARTBEAT/REQUEST/RESPONSE/STREAM/CANCEL as per frame model.
2. **Certificate (TLS) plugin**:
* Wrap TCP plugin with TLS.
* Add configuration for server & client certs.
3. **UDP plugin**:
* Single datagram = single frame; no streaming.
* Enforce `MaxRequestBytesPerCall`.
* Use for small, idempotent operations.
4. **RabbitMQ plugin**:
* Add exchanges/queues for HELLO/HEARTBEAT and REQUEST/RESPONSE.
* Use `CorrelationId` properties for matching.
* Guarantee at-most-once semantics where practical.
While each plugin is built, keep the core router and microservice SDK relying only on `ITransportClient`/`ITransportServer` abstractions.
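
For the TCP plugin, a length-prefixed frame protocol could be sketched as follows. The header layout here (4-byte big-endian payload length, then a 1-byte frame type) is an assumption for illustration; the real layout, including where the correlation id goes, belongs in `specs.md`.

```csharp
using System;
using System.Buffers.Binary;

public static class LengthPrefixedFraming
{
    private const int HeaderSize = 5; // assumed: 4 bytes length + 1 byte frame type

    // Layout assumed here: [4-byte big-endian payload length][1-byte frame type][payload].
    public static byte[] Encode(byte frameType, ReadOnlySpan<byte> payload)
    {
        var buffer = new byte[HeaderSize + payload.Length];
        BinaryPrimitives.WriteInt32BigEndian(buffer, payload.Length);
        buffer[4] = frameType;
        payload.CopyTo(buffer.AsSpan(HeaderSize));
        return buffer;
    }

    // Returns false when the buffer does not yet hold a complete frame,
    // so the caller can keep reading from the socket and retry.
    public static bool TryDecode(ReadOnlySpan<byte> buffer, out byte frameType, out byte[] payload, out int consumed)
    {
        frameType = 0;
        payload = Array.Empty<byte>();
        consumed = 0;

        if (buffer.Length < HeaderSize) return false;

        int length = BinaryPrimitives.ReadInt32BigEndian(buffer);
        if (length < 0) throw new FormatException("Negative frame length.");
        if (buffer.Length - HeaderSize < length) return false;

        frameType = buffer[4];
        payload = buffer.Slice(HeaderSize, length).ToArray();
        consumed = HeaderSize + length;
        return true;
    }
}
```

The `consumed` count lets a reader pull several frames out of one receive buffer, since TCP does not preserve message boundaries.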
---
## 10. Add Router.Config + Microservice YAML integration
**Config agent:**
1. Implement `__Libraries/StellaOps.Router.Config`:
* YAML → `RouterConfig` binding.
* Services, endpoints, static instances, payload limits.
* Hot-reload via `IOptionsMonitor` / file watcher.
2. Implement microservice YAML:
* Endpoint-level overrides only (timeouts, requiringClaims, SupportsStreaming).
* Merge logic: code defaults → YAML override.
3. Integrate:
* Gateway uses RouterConfig for:
* Defaults when no microservice registered yet.
* Payload limits.
* Microservice uses YAML to refine endpoint metadata before sending HELLO.
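
The "code defaults → YAML override" merge can be sketched as below, assuming the YAML has already been parsed into an override object whose unset fields are `null`. `EndpointSettings`, `EndpointOverride`, and `EndpointMerge` are hypothetical names, and claims are reduced to strings for brevity.

```csharp
using System;
using System.Collections.Generic;

// What the code declares for an endpoint.
public sealed record EndpointSettings(
    TimeSpan Timeout, bool SupportsStreaming, IReadOnlyList<string> RequiringClaims);

// What YAML may override; null means "keep the code default".
public sealed record EndpointOverride(
    TimeSpan? Timeout = null, bool? SupportsStreaming = null, IReadOnlyList<string>? RequiringClaims = null);

public static class EndpointMerge
{
    // Field-by-field merge: a YAML value wins only when it is present.
    public static EndpointSettings Apply(EndpointSettings code, EndpointOverride? yaml) =>
        yaml is null
            ? code
            : new EndpointSettings(
                yaml.Timeout ?? code.Timeout,
                yaml.SupportsStreaming ?? code.SupportsStreaming,
                yaml.RequiringClaims ?? code.RequiringClaims);
}
```

In this sketch an override replaces the claim list as a whole rather than appending to it; whether that is the intended semantics is a decision for the spec.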
---
## 11. Build a reference example + migration skeleton
**DX / migration agent:**
1. Build a `StellaOps.Billing.Microservice` example:
* A couple of simple endpoints (GET/POST).
* One streaming upload endpoint.
* YAML for requiringClaims and timeouts.
2. Build a `StellaOps.Gateway.WebService` example config around it.
3. Document the full path:
* How to run both locally.
* How to add a new endpoint.
* How cancellation behaves (killing the client, watching logs).
* How payload limits work (try to upload too-large file).
4. Outline migration steps from an imaginary `StellaOps.Billing.WebService` using the patterns in `Migration of Webservices to Microservices.md`.
---
## 12. Process guidance for your agents
* **Do not jump to UDP/TCP immediately.**
Prove the protocol (HELLO/HEARTBEAT/REQUEST/RESPONSE/STREAM/CANCEL), routing, and limits on the InMemory plugin first.
* **Guard the invariants.**
If someone proposes “just call HTTP between services” or “let's derive region from host,” they're violating spec and must update `docs/router/specs.md` before coding.
* **Keep Common stable.**
Changes to `StellaOps.Router.Common` must be rare and reviewed; everything else depends on it.
* **Document as you go.**
Every time a behavior settles (e.g. status mapping, frame layout), update the docs under `docs/router/` so new agents always have a single source of truth.
If you want, next step I can convert this into a task board (epic → stories) per repo folder, so you can assign specific chunks to named agents.


@@ -1,454 +0,0 @@
# StellaOps Router Migration Guide
This guide describes how to migrate existing `StellaOps.*.WebService` projects to the new microservice pattern with the StellaOps Router.
## Overview
The router provides a transport-agnostic communication layer between services, replacing direct HTTP calls with efficient binary protocols (TCP, TLS, UDP, RabbitMQ). Benefits include:
- **Performance**: Binary framing vs HTTP overhead
- **Streaming**: First-class support for large payloads
- **Cancellation**: Propagated across service boundaries
- **Claims**: Authority-integrated authorization
- **Health**: Automatic heartbeat and failover
## Prerequisites
Before migrating, ensure:
1. Router infrastructure is deployed (Gateway, transports)
2. Authority is configured with endpoint claims
3. Local development environment has router.yaml configured
## Migration Strategies
### Strategy A: In-Place Adaptation
Best for services that need to maintain HTTP compatibility during transition.
```
┌─────────────────────────────────────┐
│  StellaOps.*.WebService             │
│  ┌─────────────────────────────────┐│
│  │  Existing HTTP Controllers      ││◄── HTTP clients (legacy)
│  └─────────────────────────────────┘│
│  ┌─────────────────────────────────┐│
│  │  [StellaEndpoint] Handlers      ││◄── Router (new)
│  └─────────────────────────────────┘│
│  ┌─────────────────────────────────┐│
│  │  Shared Domain Logic            ││
│  └─────────────────────────────────┘│
└─────────────────────────────────────┘
```
**Steps:**
1. Add `StellaOps.Microservice` package reference
2. Create handler classes for each HTTP route
3. Handlers call existing service layer
4. Register with router alongside HTTP
5. Test via router
6. Shift traffic gradually
7. Remove HTTP controllers when ready
**Pros:**
- Gradual migration
- No downtime
- Can roll back easily
**Cons:**
- Dual maintenance during transition
- May delay cleanup
### Strategy B: Clean Split
Best for major refactoring or when HTTP compatibility is not needed.
```
┌─────────────────────────────────────┐
│  StellaOps.*.Domain                 │ ◄── Shared library
│  (extracted business logic)         │
└─────────────────────────────────────┘
          ▲                 ▲
          │                 │
┌─────────┴───────┐ ┌───────┴─────────┐
│  (Legacy)       │ │  (New)          │
│  *.WebService   │ │  *.Microservice │
│  HTTP only      │ │  Router only    │
└─────────────────┘ └─────────────────┘
```
**Steps:**
1. Extract domain logic to `.Domain` library
2. Create new `.Microservice` project
3. Implement handlers using domain library
4. Deploy alongside WebService
5. Shift traffic to router
6. Deprecate WebService
**Pros:**
- Clean architecture
- No legacy code in new project
- Clear separation of concerns
**Cons:**
- More upfront work
- Requires domain extraction
## Controller to Handler Mapping
### Before (ASP.NET Controller)
```csharp
[ApiController]
[Route("api/invoices")]
public class InvoicesController : ControllerBase
{
    private readonly IInvoiceService _service;

    [HttpPost]
    [Authorize(Roles = "billing-admin")]
    public async Task<IActionResult> Create(
        [FromBody] CreateInvoiceRequest request,
        CancellationToken ct)
    {
        var invoice = await _service.CreateAsync(request);
        return Ok(new { invoice.Id });
    }

    [HttpGet("{id}")]
    public async Task<IActionResult> Get(string id)
    {
        var invoice = await _service.GetAsync(id);
        if (invoice == null) return NotFound();
        return Ok(invoice);
    }
}
```
### After (Microservice Handler)
```csharp
// Handler for POST /api/invoices
[StellaEndpoint("POST", "/api/invoices", RequiredClaims = ["invoices:write"])]
public sealed class CreateInvoiceEndpoint : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
    private readonly IInvoiceService _service;

    public CreateInvoiceEndpoint(IInvoiceService service) => _service = service;

    public async Task<CreateInvoiceResponse> HandleAsync(
        CreateInvoiceRequest request,
        CancellationToken ct)
    {
        var invoice = await _service.CreateAsync(request, ct);
        return new CreateInvoiceResponse { InvoiceId = invoice.Id };
    }
}

// Handler for GET /api/invoices/{id}
[StellaEndpoint("GET", "/api/invoices/{id}", RequiredClaims = ["invoices:read"])]
public sealed class GetInvoiceEndpoint : IStellaEndpoint<GetInvoiceRequest, GetInvoiceResponse>
{
    private readonly IInvoiceService _service;

    public GetInvoiceEndpoint(IInvoiceService service) => _service = service;

    public async Task<GetInvoiceResponse> HandleAsync(
        GetInvoiceRequest request,
        CancellationToken ct)
    {
        var invoice = await _service.GetAsync(request.Id, ct);
        return new GetInvoiceResponse
        {
            InvoiceId = invoice?.Id,
            Found = invoice != null
        };
    }
}
```
## CancellationToken Wiring
**This is the #1 source of migration bugs.** Every async operation must receive and respect the cancellation token.
### Checklist
For each migrated handler, verify:
- [ ] Handler accepts CancellationToken parameter (automatic with IStellaEndpoint)
- [ ] Token passed to all database calls
- [ ] Token passed to all HTTP client calls
- [ ] Token passed to all file I/O operations
- [ ] Long-running loops check `ct.IsCancellationRequested`
- [ ] Token passed to `Task.Delay`, `WaitAsync`, etc.
### Example: Before (missing tokens)
```csharp
public async Task<Invoice> CreateAsync(CreateInvoiceRequest request)
{
    var invoice = new Invoice(request);
    await _db.Invoices.AddAsync(invoice);   // Missing token!
    await _db.SaveChangesAsync();           // Missing token!
    await _notifier.SendAsync(invoice);     // Missing token!
    return invoice;
}
```
### Example: After (proper wiring)
```csharp
public async Task<Invoice> CreateAsync(CreateInvoiceRequest request, CancellationToken ct)
{
    ct.ThrowIfCancellationRequested();

    var invoice = new Invoice(request);
    await _db.Invoices.AddAsync(invoice, ct);
    await _db.SaveChangesAsync(ct);
    await _notifier.SendAsync(invoice, ct);
    return invoice;
}
```
## Streaming Migration
### File Upload: Before
```csharp
[HttpPost("upload")]
public async Task<IActionResult> Upload(IFormFile file)
{
    using var stream = file.OpenReadStream();
    await _storage.SaveAsync(stream);
    return Ok();
}
```
### File Upload: After
```csharp
[StellaEndpoint("POST", "/upload", SupportsStreaming = true)]
public sealed class UploadEndpoint : IRawStellaEndpoint
{
    private readonly IStorageService _storage;

    public UploadEndpoint(IStorageService storage) => _storage = storage;

    public async Task<RawResponse> HandleAsync(RawRequestContext ctx, CancellationToken ct)
    {
        // ctx.Body is already a stream - no buffering needed
        var path = await _storage.SaveAsync(ctx.Body, ct);
        return RawResponse.Ok($"{{\"path\":\"{path}\"}}");
    }
}
```
### File Download: Before
```csharp
[HttpGet("download/{id}")]
public async Task<IActionResult> Download(string id)
{
    var stream = await _storage.GetAsync(id);
    return File(stream, "application/octet-stream");
}
```
### File Download: After
```csharp
[StellaEndpoint("GET", "/download/{id}", SupportsStreaming = true)]
public sealed class DownloadEndpoint : IRawStellaEndpoint
{
    private readonly IStorageService _storage;

    public DownloadEndpoint(IStorageService storage) => _storage = storage;

    public async Task<RawResponse> HandleAsync(RawRequestContext ctx, CancellationToken ct)
    {
        var id = ctx.PathParameters["id"];
        var stream = await _storage.GetAsync(id, ct);
        return RawResponse.Stream(stream, "application/octet-stream");
    }
}
```
## Authorization Migration
### Before: [Authorize] Attribute
```csharp
[Authorize(Roles = "admin,billing-manager")]
public async Task<IActionResult> Delete(string id) { ... }
```
### After: RequiredClaims
```csharp
[StellaEndpoint("DELETE", "/invoices/{id}", RequiredClaims = ["invoices:delete"])]
public sealed class DeleteInvoiceEndpoint : IStellaEndpoint<...> { ... }
```
Claims are configured in Authority and enforced by the Gateway's AuthorizationMiddleware.
## Migration Checklist Template
Use this checklist for each service migration:
```markdown
# Migration Checklist: [ServiceName]
## Inventory
- [ ] List all HTTP routes (Method + Path)
- [ ] Identify streaming endpoints
- [ ] Identify authorization requirements
- [ ] Document external dependencies
## Preparation
- [ ] Add StellaOps.Microservice package
- [ ] Add StellaOps.Router.Transport.* package(s)
- [ ] Configure router connection in Program.cs
- [ ] Set up local gateway for testing
## Per-Route Migration
For each route:
- [ ] Create [StellaEndpoint] handler class
- [ ] Define request/response record types
- [ ] Map path parameters
- [ ] Wire CancellationToken throughout
- [ ] Convert to IRawStellaEndpoint if streaming
- [ ] Add RequiredClaims
- [ ] Write unit tests
- [ ] Write integration tests
## Cutover
- [ ] Deploy alongside existing WebService
- [ ] Verify via router routing
- [ ] Shift percentage of traffic
- [ ] Monitor for errors
- [ ] Full cutover
- [ ] Remove WebService HTTP listeners
## Cleanup
- [ ] Remove unused controller code
- [ ] Remove HTTP pipeline configuration
- [ ] Update OpenAPI documentation
- [ ] Update client SDKs
```
## Service Inventory
| Module | WebService Project | Priority | Complexity | Notes |
|--------|-------------------|----------|------------|-------|
| Gateway | StellaOps.Gateway.WebService | N/A | N/A | IS the router |
| Concelier | StellaOps.Concelier.WebService | High | Medium | Advisory ingestion |
| Scanner | StellaOps.Scanner.WebService | High | High | Streaming scans |
| Attestor | StellaOps.Attestor.WebService | Medium | Medium | Attestation gen |
| Excititor | StellaOps.Excititor.WebService | Medium | Low | VEX processing |
| Orchestrator | StellaOps.Orchestrator.WebService | Medium | Medium | Job coordination |
| Scheduler | StellaOps.Scheduler.WebService | Low | Low | Job scheduling |
| Notify | StellaOps.Notify.WebService | Low | Low | Notifications |
| Notifier | StellaOps.Notifier.WebService | Low | Low | Alert dispatch |
| Signer | StellaOps.Signer.WebService | Medium | Low | Crypto signing |
| Findings | StellaOps.Findings.Ledger.WebService | Medium | Medium | Results storage |
| EvidenceLocker | StellaOps.EvidenceLocker.WebService | Low | Medium | Blob storage |
| ExportCenter | StellaOps.ExportCenter.WebService | Low | Medium | Report generation |
| IssuerDirectory | StellaOps.IssuerDirectory.WebService | Low | Low | Issuer lookup |
| PacksRegistry | StellaOps.PacksRegistry.WebService | Low | Low | Pack management |
| RiskEngine | StellaOps.RiskEngine.WebService | Medium | Medium | Risk calculation |
| TaskRunner | StellaOps.TaskRunner.WebService | Low | Medium | Task execution |
| TimelineIndexer | StellaOps.TimelineIndexer.WebService | Low | Low | Event indexing |
| AdvisoryAI | StellaOps.AdvisoryAI.WebService | Low | Medium | AI assistance |
## Testing During Migration
### Unit Tests
Test handlers in isolation using mocked dependencies:
```csharp
[Fact]
public async Task CreateInvoice_ValidRequest_ReturnsInvoiceId()
{
    // Arrange
    var mockService = new Mock<IInvoiceService>();
    mockService
        .Setup(s => s.CreateAsync(It.IsAny<CreateInvoiceRequest>(), It.IsAny<CancellationToken>()))
        .ReturnsAsync(new Invoice { Id = "INV-123" });
    var endpoint = new CreateInvoiceEndpoint(mockService.Object);

    // Act
    var response = await endpoint.HandleAsync(
        new CreateInvoiceRequest { Amount = 100 },
        CancellationToken.None);

    // Assert
    response.InvoiceId.Should().Be("INV-123");
}
```
### Integration Tests
Use WebApplicationFactory for the Gateway and actual microservice instances:
```csharp
public sealed class InvoiceTests : IClassFixture<GatewayFixture>
{
    private readonly GatewayFixture _fixture;

    // xUnit injects the shared fixture through the constructor.
    public InvoiceTests(GatewayFixture fixture) => _fixture = fixture;

    [Fact]
    public async Task CreateAndGetInvoice_WorksEndToEnd()
    {
        // Arrange + Act: create an invoice through the gateway
        var createResponse = await _fixture.Client.PostAsJsonAsync("/api/invoices",
            new { Amount = 100 });

        // Assert: creation succeeded
        createResponse.StatusCode.Should().Be(HttpStatusCode.OK);
        var created = await createResponse.Content.ReadFromJsonAsync<CreateInvoiceResponse>();

        // Act + Assert: the created invoice can be read back
        var getResponse = await _fixture.Client.GetAsync($"/api/invoices/{created.InvoiceId}");
        getResponse.StatusCode.Should().Be(HttpStatusCode.OK);
    }
}
```
## Common Migration Issues
### 1. Missing CancellationToken Propagation
**Symptom:** Requests continue processing after client disconnects.
**Fix:** Pass `CancellationToken` to all async operations.
### 2. IFormFile Not Available
**Symptom:** Compilation error on `IFormFile` parameter.
**Fix:** Convert to `IRawStellaEndpoint` for streaming.
### 3. HttpContext Not Available
**Symptom:** Code references `HttpContext` for headers, claims.
**Fix:** Use `RawRequestContext` for raw endpoints, or inject claims via Authority.
### 4. Return Type Mismatch
**Symptom:** Handler returns `IActionResult`.
**Fix:** Define proper response record type, return that instead.
### 5. Route Parameter Not Extracted
**Symptom:** Path parameters like `{id}` not populated.
**Fix:** For `IStellaEndpoint`, add property to request type. For `IRawStellaEndpoint`, use `ctx.PathParameters["id"]`.
## Next Steps
1. Choose a low-risk service for pilot migration (Scheduler recommended)
2. Follow the Migration Checklist
3. Document lessons learned
4. Proceed with higher-priority services
5. Eventually move all services to the router exclusively


@@ -1,494 +0,0 @@
I'll group everything into requirement buckets, but keep it all as requirements statements (no rationale). This is the union of what you asked for or confirmed across the whole thread.
---
## 1. Architectural / scope requirements
* There SHALL be a single HTTP ingress service named `StellaOps.Gateway.WebService`.
* Microservices SHALL NOT expose HTTP to the router; all microservice-to-router traffic (control + data) MUST use in-house transports (UDP, TCP, certificate/TLS, RabbitMQ).
* There SHALL NOT be a separate control-plane service or protocol; each transport connection between a microservice and the router MUST carry:
* Initial registration (HELLO) and endpoint configuration.
* Ongoing heartbeats.
* Endpoint updates (if any).
* Request/response and streaming data.
* The router SHALL maintain per-connection endpoint mappings and derive its global routing state from the union of all live connections.
* The router SHALL treat request and response bodies as opaque (raw bytes / streams); all deserialization and schema handling SHALL be the microservice's responsibility.
* The system SHALL support both buffered and streaming request/response flows end-to-end.
* The design MUST reuse only the generic parts of `__SerdicaTemplate` (dynamic endpoint metadata, attribute-based endpoint discovery, request routing patterns, correlation, connection management) and MUST drop Serdica-specific stack (Oracle schema, domain logic, etc.).
* The solution MUST be a simpler, generic replacement for the existing Serdica HTTP→RabbitMQ→microservice design.
---
## 2. Service identity, region, versioning
* Each microservice instance SHALL be identified by `(ServiceName, Version, Region, InstanceId)`.
* `Version` MUST follow strict semantic versioning (`major.minor.patch`).
* Routing MUST be strict on version:
* The router MUST only route a request to instances whose `Version` equals the selected version.
* When a version is not explicitly specified by the client, a default version MUST be used (from config or metadata).
* Each gateway node SHALL have a static configuration object `GatewayNodeConfig` containing at least:
* `Region` (e.g. `"eu1"`).
* `NodeId` (e.g. `"gw-eu1-01"`).
* `Environment` (e.g. `"prod"`).
* Routing decisions MUST use `GatewayNodeConfig.Region` as the node's region; the router MUST NOT derive region from HTTP headers or URL host names.
* DNS/host naming conventions SHOULD express region in the domain (e.g. `eu1.global.stella-ops.org`, `mainoffice.contoso.stella-ops.org`), but routing logic MUST be driven by `GatewayNodeConfig.Region` rather than by host parsing.
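
The identity and strict-version rules above can be sketched as follows. This is a minimal illustration, not the actual library code: the record shapes and the `StrictVersion` helper are hypothetical, and only the property names (`ServiceName`, `Version`, `Region`, `InstanceId`, `NodeId`, `Environment`) come from the requirements. It assumes that "strict" means exactly three numeric parts and exact equality when routing.

```csharp
using System;
using System.Globalization;

// Hypothetical record shapes: the property names come from the requirements,
// the concrete types are assumptions made for this sketch.
public sealed record GatewayNodeConfig(string Region, string NodeId, string Environment);

public sealed record InstanceDescriptor(string ServiceName, Version Version, string Region, string InstanceId);

public static class StrictVersion
{
    // Accepts only "major.minor.patch" with plain decimal digits in each part.
    public static bool TryParse(string text, out Version version)
    {
        version = new Version(0, 0, 0);
        var parts = text.Split('.');
        if (parts.Length != 3) return false;

        var numbers = new int[3];
        for (var i = 0; i < parts.Length; i++)
        {
            if (!int.TryParse(parts[i], NumberStyles.None, CultureInfo.InvariantCulture, out numbers[i]))
                return false;
        }

        version = new Version(numbers[0], numbers[1], numbers[2]);
        return true;
    }

    // Strict routing: service name and version must both match exactly.
    public static bool CanServe(InstanceDescriptor instance, string serviceName, Version selected) =>
        instance.ServiceName == serviceName && instance.Version == selected;
}
```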
---
## 3. Endpoint identity and metadata
* Endpoint identity in the router and microservices MUST be `HTTP Method + Path`, for example:
* `Method`: one of `GET`, `POST`, `PUT`, `PATCH`, `DELETE`.
* `Path`: e.g. `/section/get/{id}`.
* The router and microservices MUST use the same path template syntax and matching rules (e.g. ASP.NET-style route templates), including decisions on:
* Case sensitivity.
* Trailing slash handling.
* Parameter segments (e.g. `{id}`).
* The router MUST resolve an incoming HTTP `(Method, Path)` to a logical endpoint descriptor that includes:
* ServiceName.
* Version.
* Method.
* Path.
* DefaultTimeout.
* `RequiringClaims`: a list of claim requirements.
* A flag indicating whether the endpoint supports streaming.
* Every place that previously spoke about `AllowedRoles` MUST be replaced with `RequiringClaims`:
* Each requirement MUST at minimum contain a `Type` and MAY contain a `Value`.
* Endpoints MUST support being configured with default `RequiringClaims` in microservices, with the possibility of external override (see Authority section).
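
To make the "same path template syntax and matching rules" requirement concrete, here is a small matcher sketch. The requirements leave case sensitivity and trailing-slash handling as open decisions, so the choices below (case-insensitive literals, trailing slash ignored, `{name}` matches exactly one non-empty segment) are assumptions, and `PathTemplate` is a hypothetical helper rather than an existing type.

```csharp
using System;
using System.Collections.Generic;

public static class PathTemplate
{
    // Assumed rules: literals compare case-insensitively, leading/trailing slashes
    // are ignored, and "{name}" captures exactly one non-empty path segment.
    // The parameters dictionary is only meaningful when the method returns true.
    public static bool TryMatch(string template, string path, out Dictionary<string, string> parameters)
    {
        parameters = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var templateSegments = template.Trim('/').Split('/');
        var pathSegments = path.Trim('/').Split('/');
        if (templateSegments.Length != pathSegments.Length) return false;

        for (var i = 0; i < templateSegments.Length; i++)
        {
            var segment = templateSegments[i];
            if (segment.Length > 2 && segment[0] == '{' && segment[^1] == '}')
            {
                if (pathSegments[i].Length == 0) return false;
                parameters[segment[1..^1]] = pathSegments[i];
            }
            else if (!string.Equals(segment, pathSegments[i], StringComparison.OrdinalIgnoreCase))
            {
                return false;
            }
        }

        return true;
    }
}
```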
---
## 4. Routing algorithm / instance selection
* Given a resolved endpoint `(ServiceName, Version, Method, Path)`, the router MUST:
* Filter candidate instances by:
* Matching `ServiceName`.
* Matching `Version` (strict semver equality).
* Health in an acceptable set (e.g. `Healthy` or `Degraded`).
* Instances MUST have health metadata:
* `Status` ∈ {`Unknown`, `Healthy`, `Degraded`, `Draining`, `Unhealthy`}.
* `LastHeartbeatUtc`.
* `AveragePingMs`.
* The router's instance selection MUST obey these rules:
* Region:
* Prefer instances whose `Region == GatewayNodeConfig.Region`.
* If none, fall back to configured neighbor regions.
* If none, fall back to all other regions.
* Within a chosen region tier:
* Prefer lower `AveragePingMs`.
* If several are tied, prefer more recent `LastHeartbeatUtc`.
* If still tied, use a balancing strategy (e.g. random or round-robin).
* The router MUST support a strict fallback order as requested:
* Prefer “closest by region and heartbeat and ping.”
* If having to choose between worse candidates, fall back in order of:
* Greater ping (latency).
* Greater heartbeat age.
* Less preferred region tier.
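
The selection rules above can be sketched as a single ordered query. This is an illustration under assumptions: `Candidate` and `InstanceSelector` are hypothetical names, the acceptable health set is taken to be `Healthy` or `Degraded`, ties are broken randomly, and `Random.Shared` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum InstanceHealthStatus { Unknown, Healthy, Degraded, Draining, Unhealthy }

// Hypothetical flattened view of one live connection's instance metadata.
public sealed record Candidate(
    string InstanceId,
    string Region,
    InstanceHealthStatus Status,
    DateTime LastHeartbeatUtc,
    double AveragePingMs);

public static class InstanceSelector
{
    public static Candidate? Select(
        IEnumerable<Candidate> candidates,
        string localRegion,
        IReadOnlyCollection<string> neighborRegions)
    {
        // Tier 0: gateway's own region, tier 1: configured neighbors, tier 2: everything else.
        int Tier(Candidate c) =>
            c.Region == localRegion ? 0 : neighborRegions.Contains(c.Region) ? 1 : 2;

        return candidates
            .Where(c => c.Status is InstanceHealthStatus.Healthy or InstanceHealthStatus.Degraded)
            .OrderBy(c => Tier(c))                      // region tier first
            .ThenBy(c => c.AveragePingMs)               // then lower ping
            .ThenByDescending(c => c.LastHeartbeatUtc)  // then more recent heartbeat
            .ThenBy(_ => Random.Shared.Next())          // assumed tie-breaker: random
            .FirstOrDefault();
    }
}
```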
---
## 5. Transport plugin requirements
* There MUST be a transport plugin abstraction representing how the router and microservices communicate.
* The default transport type MUST be UDP.
* Additional supported transport types MUST include:
* TCP.
* Certificate-based TCP (TLS / mTLS).
* RabbitMQ.
* There MUST NOT be an HTTP transport plugin; HTTP MUST NOT be used for microservice-to-router communications (control or data).
* Each transport plugin MUST support:
* Establishing logical connections between microservices and the router.
* Sending/receiving HELLO (registration), HEARTBEAT, optional ENDPOINTS_UPDATE.
* Sending/receiving REQUEST/RESPONSE frames.
* Supporting streaming via REQUEST_STREAM_DATA / RESPONSE_STREAM_DATA frames where the transport allows it.
* Sending/receiving CANCEL frames to abort specific in-flight requests.
* UDP transport:
* MUST be used only for small/bounded payloads (no unbounded streaming).
* MUST respect configured `MaxRequestBytesPerCall`.
* TCP and Certificate transports:
* MUST implement a length-prefixed framing protocol capable of multiplexing frames for multiple correlation IDs.
* Certificate transport MUST enforce TLS and support optional mutual TLS (verifiable peer identity).
* RabbitMQ:
* MUST implement queue/exchange naming and routing keys sufficient to represent logical connections and correlation IDs.
* MUST use message properties (e.g. `CorrelationId`) for request/response matching.
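
A length-prefixed framing protocol for the TCP and certificate transports might look like the sketch below. The wire layout (4-byte big-endian length, 1-byte frame type, 16-byte correlation ID, payload), the numeric enum values, and the `Frame` / `FrameCodec` names are all assumptions made for illustration; the requirements only state that frames are length-prefixed and multiplexed by correlation ID. `Stream.ReadExactly` requires .NET 7 or later.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Frame kinds named in the requirements; the numeric values are assumptions.
public enum FrameType : byte
{
    Hello = 1, Heartbeat = 2, EndpointsUpdate = 3, Request = 4,
    Response = 5, RequestStreamData = 6, ResponseStreamData = 7, Cancel = 8,
}

public sealed record Frame(FrameType Type, Guid CorrelationId, byte[] Payload);

public static class FrameCodec
{
    private const int FixedBodyBytes = 1 + 16; // type byte + correlation ID

    // Assumed layout: [4-byte big-endian body length][type][correlation ID][payload].
    public static void Write(Stream stream, Frame frame)
    {
        Span<byte> header = stackalloc byte[4 + FixedBodyBytes];
        BinaryPrimitives.WriteInt32BigEndian(header, FixedBodyBytes + frame.Payload.Length);
        header[4] = (byte)frame.Type;
        frame.CorrelationId.TryWriteBytes(header.Slice(5));
        stream.Write(header);
        stream.Write(frame.Payload);
    }

    // Reads one frame; rejects lengths outside [FixedBodyBytes, maxBodyBytes].
    public static Frame Read(Stream stream, int maxBodyBytes)
    {
        Span<byte> header = stackalloc byte[4 + FixedBodyBytes];
        stream.ReadExactly(header);
        var bodyLength = BinaryPrimitives.ReadInt32BigEndian(header);
        if (bodyLength < FixedBodyBytes || bodyLength > maxBodyBytes)
            throw new InvalidDataException("Invalid frame length.");

        var payload = new byte[bodyLength - FixedBodyBytes];
        stream.ReadExactly(payload);
        return new Frame((FrameType)header[4], new Guid(header.Slice(5, 16)), payload);
    }
}
```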
---
## 6. Gateway (`StellaOps.Gateway.WebService`) requirements
### 6.1 HTTP ingress pipeline
* The gateway MUST host an ASP.NET Core HTTP server.
* The HTTP middleware pipeline MUST include at least:
* Forwarded headers handling (when behind reverse proxy).
* Request logging (e.g. via Serilog) including correlation ID, service, endpoint, region, instance.
* Global error-handling middleware.
* Authentication middleware.
* `EndpointResolutionMiddleware` to resolve `(Method, Path)` → endpoint.
* Authorization middleware that enforces `RequiringClaims`.
* `RoutingDecisionMiddleware` to choose connection/instance/transport.
* `TransportDispatchMiddleware` to carry out buffered or streaming dispatch.
* The gateway MUST read `Method` and `Path` from the HTTP request and use them to resolve endpoints.
### 6.2 Per-connection state and routing view
* The gateway MUST maintain a `ConnectionState` per logical connection that includes:
* ConnectionId.
* `InstanceDescriptor` (`InstanceId`, `ServiceName`, `Version`, `Region`).
* `Status`, `LastHeartbeatUtc`, `AveragePingMs`.
* The set of endpoints that this connection serves (`(Method, Path)` → `EndpointDescriptor`).
* The transport type for that connection.
* The gateway MUST maintain a global routing state (`IGlobalRoutingState`) that:
* Resolves `(Method, Path)` to an `EndpointDescriptor` (service, version, metadata).
* Provides the set of `ConnectionState` objects that can handle a given `(ServiceName, Version, Method, Path)`.
### 6.3 Buffered vs streaming dispatch
* The gateway MUST support:
* **Buffered mode** for small to medium payloads:
* Read the entire HTTP body into memory (or temp file when above a threshold).
* Send as a single REQUEST payload.
* **Streaming mode** for large or unknown content:
* Streaming from HTTP body to microservice via a sequence of REQUEST_STREAM_DATA frames.
* Streaming from microservice back to HTTP via RESPONSE_STREAM_DATA frames.
* For each endpoint, the gateway MUST know whether it can use streaming or must use buffered mode (`SupportsStreaming` flag).
### 6.4 Opaque body handling
* The gateway MUST treat request and response bodies as opaque byte sequences and MUST NOT attempt to deserialize or interpret payload contents.
* The gateway MUST forward headers and body bytes as given and leave any schema, JSON, or other decoding to the microservice.
### 6.5 Payload and memory protection
* The gateway MUST enforce configured payload limits:
* `MaxRequestBytesPerCall`.
* `MaxRequestBytesPerConnection`.
* `MaxAggregateInflightBytes`.
* If `Content-Length` is known and exceeds `MaxRequestBytesPerCall`, the gateway MUST reject the request early (e.g. HTTP 413 Payload Too Large).
* During streaming, the gateway MUST maintain counters of:
* Bytes read for this request.
* Bytes for this connection.
* Total in-flight bytes across all requests.
* If any limit is exceeded mid-stream, the gateway MUST:
* Stop reading the HTTP body.
* Send a CANCEL frame for that correlation ID.
* Abort the stream to the microservice.
* Return an appropriate error to the client (e.g. 413 or 503) and log the incident.
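
The three counters described above could be tracked roughly as follows. This is a sketch under assumptions: `ByteCounter` and `RequestByteMeter` are hypothetical helpers, and only the limit names (`MaxRequestBytesPerCall`, `MaxRequestBytesPerConnection`, `MaxAggregateInflightBytes`) come from the requirements. When `TryAdd` returns `false`, the caller is expected to stop reading, send CANCEL, and return an error as listed above.

```csharp
using System.Threading;

public sealed record PayloadLimits(
    long MaxRequestBytesPerCall,
    long MaxRequestBytesPerConnection,
    long MaxAggregateInflightBytes);

// Thread-safe counter shared by all requests on a connection (or on the whole gateway).
public sealed class ByteCounter
{
    private long _value;
    public long Add(long delta) => Interlocked.Add(ref _value, delta);
    public long Value => Interlocked.Read(ref _value);
}

// Tracks one request's bytes against the per-call, per-connection, and aggregate limits.
public sealed class RequestByteMeter
{
    private readonly PayloadLimits _limits;
    private readonly ByteCounter _connection;
    private readonly ByteCounter _aggregate;
    private long _requestBytes;

    public RequestByteMeter(PayloadLimits limits, ByteCounter connection, ByteCounter aggregate)
    {
        _limits = limits;
        _connection = connection;
        _aggregate = aggregate;
    }

    // Early rejection when Content-Length is known; null means unknown length.
    public bool ExceedsPerCallLimit(long? contentLength) =>
        contentLength > _limits.MaxRequestBytesPerCall;

    // Call once per chunk read from the HTTP body; false means a limit was breached.
    public bool TryAdd(int chunkLength)
    {
        _requestBytes += chunkLength;
        var connectionTotal = _connection.Add(chunkLength);
        var aggregateTotal = _aggregate.Add(chunkLength);
        return _requestBytes <= _limits.MaxRequestBytesPerCall
            && connectionTotal <= _limits.MaxRequestBytesPerConnection
            && aggregateTotal <= _limits.MaxAggregateInflightBytes;
    }

    // Call when the request completes or is cancelled, to return its bytes to the shared counters.
    public void Release()
    {
        _connection.Add(-_requestBytes);
        _aggregate.Add(-_requestBytes);
        _requestBytes = 0;
    }
}
```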
---
## 7. Microservice SDK (`__Libraries/StellaOps.Microservice`) requirements
### 7.1 Identity & router connections
* `StellaMicroserviceOptions` MUST let microservices configure:
* `ServiceName`.
* `Version`.
* `Region`.
* `InstanceId`.
* A list of router endpoints (`Routers` / router pool) including host, port, and transport type for each.
* Optional path to a YAML config file for endpoint-level overrides.
* Providing the router pool (`Routers` / HTTP servers pool) MUST be mandatory; a microservice cannot start without at least one configured router endpoint.
* The router pool SHOULD be configurable via code and MAY optionally be configured via YAML with hot-reload (causing reconnections if changed).
### 7.2 Endpoint definition & discovery
* Microservice endpoints MUST be declared using attributes that specify `(Method, Path)`:
```csharp
[StellaEndpoint("POST", "/billing/invoices")]
public sealed class CreateInvoiceEndpoint : ...
```
* The SDK MUST support two handler shapes:
* Raw handler:
* `IRawStellaEndpoint` taking a `RawRequestContext` and returning a `RawResponse`, where:
* `RawRequestContext.Body` is a stream (may be buffered or streaming).
* Body contents are raw bytes.
* Typed handlers:
* `IStellaEndpoint<TRequest, TResponse>` which takes a typed request and returns a typed response.
* `IStellaEndpoint<TResponse>` which has no request payload and returns a typed response.
* The SDK MUST adapt typed endpoints to the raw model internally (microservice-side only), leaving the router unaware of types.
* Endpoint discovery MUST work by:
* Runtime reflection: scanning assemblies for `[StellaEndpoint]` and handler interfaces.
* Build-time reflection via source generation:
* A Roslyn source generator MUST generate a descriptor list at build time.
* At runtime, the SDK MUST prefer source-generated metadata and only fall back to reflection if generation is not available.
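
The typed-to-raw adaptation described above can be sketched as a thin wrapper. Every signature here is an assumption for illustration: the requirements name `IRawStellaEndpoint`, `IStellaEndpoint<TRequest, TResponse>`, `RawRequestContext`, and `RawResponse`, but not their members. JSON via `System.Text.Json` is likewise an assumed payload format, and `CreateInvoiceHandler` with its request/response records is a hypothetical example.

```csharp
using System;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// All signatures below are assumptions for illustration, not the SDK's actual API.
public sealed record RawRequestContext(string Method, string Path, Stream Body, CancellationToken CancellationToken);
public sealed record RawResponse(int StatusCode, byte[] Body);

public interface IRawStellaEndpoint
{
    Task<RawResponse> HandleAsync(RawRequestContext context);
}

public interface IStellaEndpoint<TRequest, TResponse>
{
    Task<TResponse> HandleAsync(TRequest request, CancellationToken cancellationToken);
}

// Adapts a typed handler to the raw model so the router only ever sees bytes.
public sealed class TypedEndpointAdapter<TRequest, TResponse> : IRawStellaEndpoint
{
    private readonly IStellaEndpoint<TRequest, TResponse> _inner;

    public TypedEndpointAdapter(IStellaEndpoint<TRequest, TResponse> inner) => _inner = inner;

    public async Task<RawResponse> HandleAsync(RawRequestContext context)
    {
        // JSON is an assumed payload format; the requirements leave the format open.
        var request = await JsonSerializer.DeserializeAsync<TRequest>(
            context.Body, cancellationToken: context.CancellationToken);
        if (request is null) return new RawResponse(400, Array.Empty<byte>());

        var response = await _inner.HandleAsync(request, context.CancellationToken);
        return new RawResponse(200, JsonSerializer.SerializeToUtf8Bytes(response));
    }
}

// Hypothetical typed endpoint used to exercise the adapter.
public sealed record CreateInvoiceRequest(string Customer, decimal Amount);
public sealed record CreateInvoiceResponse(string InvoiceId);

public sealed class CreateInvoiceHandler : IStellaEndpoint<CreateInvoiceRequest, CreateInvoiceResponse>
{
    public Task<CreateInvoiceResponse> HandleAsync(CreateInvoiceRequest request, CancellationToken cancellationToken) =>
        Task.FromResult(new CreateInvoiceResponse("INV-" + request.Customer));
}
```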
### 7.3 Endpoint metadata defaults & overrides
* Microservices MUST be able to provide default endpoint metadata:
* `SupportsStreaming` flag.
* Default timeout.
* Default `RequiringClaims`.
* Microservice-local YAML MUST be allowed to override or refine these defaults per endpoint, keyed by `(Method, Path)`.
* Precedence rules MUST be clearly defined and honored:
* Service identity & router pool: from `StellaMicroserviceOptions` (not YAML).
* Endpoint set: from code (attributes/source gen); YAML MAY override properties but ideally not create endpoints not present in code (policy decision to be documented).
* `RequiringClaims` and timeouts: YAML overrides defaults from code, unless overridden by central Authority.
### 7.4 Connection behavior
* On establishing a connection to a router endpoint, the SDK MUST:
* Immediately send a HELLO frame containing:
* `ServiceName`, `Version`, `Region`, `InstanceId`.
* The list of endpoints (Method, Path) with their metadata (SupportsStreaming, default timeouts, default RequiringClaims).
* At regular intervals, the SDK MUST send HEARTBEAT frames on each connection indicating:
* Instance health status.
* Optional metrics (e.g. in-flight request count, error rate).
* The SDK SHOULD support optional ENDPOINTS_UPDATE (or a re-HELLO) to update endpoint metadata at runtime if needed.
### 7.5 Request handling & streaming
* For each incoming REQUEST frame:
* The SDK MUST create a `RawRequestContext` with:
* Method.
* Path.
* Headers.
* A `Body` stream that either:
* Wraps a buffered byte array.
* Or exposes streaming reads from subsequent REQUEST_STREAM_DATA frames.
* A `CancellationToken` that will be cancelled when the router sends a CANCEL frame or the connection fails.
* The SDK MUST resolve the correct endpoint handler by `(Method, Path)` using the same path template rules as the router.
* For streaming endpoints, handlers MUST be able to read from `RawRequestContext.Body` incrementally and obey the `CancellationToken`.
### 7.6 Cancellation handling (microservice side)
* The SDK MUST maintain a map of in-flight requests by correlation ID, each containing:
* A `CancellationTokenSource`.
* The task executing the handler.
* Upon receiving a CANCEL frame for a given correlation ID, the SDK MUST:
* Look up the corresponding entry and call `CancellationTokenSource.Cancel()`.
* Handlers (both raw and typed) MUST receive a `CancellationToken`:
* They MUST observe the token and be coded to cancel promptly where needed.
* They MUST pass the token to downstream I/O operations (DB calls, file I/O, network).
* If the transport connection is closed, the SDK MUST treat it as a cancellation trigger for all outstanding requests on that connection and cancel their tokens.
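
The in-flight map and its CANCEL handling might be sketched like this. `InflightRequests` is a hypothetical helper, `Guid` is an assumed correlation ID type, and the handler task that the requirements also keep per entry is omitted for brevity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Tracks one CancellationTokenSource per in-flight correlation ID on a connection.
public sealed class InflightRequests
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _sources = new();

    // Called when a REQUEST frame arrives; the returned token is handed to the handler.
    public CancellationToken Register(Guid correlationId)
    {
        var source = new CancellationTokenSource();
        if (!_sources.TryAdd(correlationId, source))
        {
            source.Dispose();
            throw new InvalidOperationException("Duplicate correlation ID.");
        }

        return source.Token;
    }

    // Called when a CANCEL frame arrives; returns false for unknown or already finished requests.
    public bool Cancel(Guid correlationId)
    {
        if (!_sources.TryGetValue(correlationId, out var source)) return false;
        try
        {
            source.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            return false; // the request completed concurrently
        }
    }

    // Called when the transport connection closes: every outstanding request is cancelled.
    public void CancelAll()
    {
        foreach (var correlationId in _sources.Keys) Cancel(correlationId);
    }

    // Called when the handler finishes, successfully or not.
    public void Complete(Guid correlationId)
    {
        if (_sources.TryRemove(correlationId, out var source)) source.Dispose();
    }
}
```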
---
## 8. Control / health / ping requirements
* Heartbeats MUST be sent over the same connection as requests (no separate control channel).
* The router MUST:
* Track `LastHeartbeatUtc` for each connection.
* Derive `InstanceHealthStatus` based on heartbeat recency and optionally metrics.
* Drop or mark as Unhealthy any instances whose heartbeats are stale past configured thresholds.
* The router SHOULD measure network latency (ping) by:
* Timing request-response round trips, or
* Using explicit ping frames, and updating `AveragePingMs` for each connection.
* The router MUST use heartbeat and ping metrics in its routing decision as described above.
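
One way to derive health from heartbeat recency and to maintain `AveragePingMs` is sketched below. The two thresholds, the mapping of heartbeat age to `Degraded` and `Unhealthy`, and the exponential moving average with a default weight of 0.2 are all assumptions; the requirements only say that staleness is judged against configured thresholds and that ping is averaged per connection.

```csharp
using System;

public enum InstanceHealthStatus { Unknown, Healthy, Degraded, Draining, Unhealthy }

public static class HealthEvaluator
{
    // Assumed policy: no heartbeat yet means Unknown; age past each threshold downgrades the status.
    public static InstanceHealthStatus Derive(
        DateTime? lastHeartbeatUtc,
        DateTime nowUtc,
        TimeSpan degradedAfter,
        TimeSpan unhealthyAfter)
    {
        if (lastHeartbeatUtc is null) return InstanceHealthStatus.Unknown;

        var age = nowUtc - lastHeartbeatUtc.Value;
        if (age >= unhealthyAfter) return InstanceHealthStatus.Unhealthy;
        if (age >= degradedAfter) return InstanceHealthStatus.Degraded;
        return InstanceHealthStatus.Healthy;
    }

    // Exponential moving average of round-trip samples; alpha is an assumed smoothing weight.
    public static double UpdateAveragePing(double currentAverageMs, double sampleMs, double alpha = 0.2) =>
        currentAverageMs + alpha * (sampleMs - currentAverageMs);
}
```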
---
## 9. Authorization / requiringClaims / Authority requirements
* `RequiringClaims` MUST be the only authorization metadata field; `AllowedRoles` MUST NOT be used.
* Every endpoint MUST be able to specify:
* An empty `RequiringClaims` list (no additional claims required beyond authenticated).
* Or one or more `ClaimRequirement` objects (Type + optional Value).
* The gateway MUST enforce `RequiringClaims` per request:
* Authorization MUST check that the request's user principal has all required claims for the endpoint.
* Microservices MUST provide default `RequiringClaims` as part of their HELLO metadata.
* There MUST be a mechanism for an external Authority service to override `RequiringClaims` centrally:
* Defaults MUST come from microservices.
* Authority MUST be able to push or supply overrides that the gateway applies at startup and/or at runtime.
* The gateway MUST proactively request such overrides on startup (e.g. via a special message or mechanism) before handling traffic, or as early as practical.
* Final, effective `RequiringClaims` enforced at the gateway MUST be derived from microservice defaults plus Authority overrides, with Authority taking precedence where applicable.
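
The enforcement and precedence rules above can be illustrated with a short sketch. `ClaimRequirement` matches the Type + optional Value shape from the requirements; `ClaimsAuthorization` and the rule that an Authority override replaces the microservice defaults wholesale (rather than merging with them) are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

public sealed record ClaimRequirement(string Type, string? Value = null);

public static class ClaimsAuthorization
{
    // The principal must be authenticated and satisfy every requirement.
    // An empty list therefore means "authenticated is enough".
    public static bool IsAuthorized(ClaimsPrincipal user, IReadOnlyList<ClaimRequirement> requiringClaims)
    {
        if (user.Identity?.IsAuthenticated != true) return false;

        return requiringClaims.All(requirement =>
            requirement.Value is null
                ? user.HasClaim(c => string.Equals(c.Type, requirement.Type, StringComparison.OrdinalIgnoreCase))
                : user.HasClaim(requirement.Type, requirement.Value));
    }

    // Assumed precedence: an Authority override, when present, replaces the microservice defaults.
    public static IReadOnlyList<ClaimRequirement> Effective(
        IReadOnlyList<ClaimRequirement> microserviceDefaults,
        IReadOnlyList<ClaimRequirement>? authorityOverride) =>
        authorityOverride ?? microserviceDefaults;
}
```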
---
## 10. Cancellation requirements (router side)
* The protocol MUST define a `FrameType.Cancel` with:
* A `CorrelationId` indicating which request to cancel.
* An optional payload containing a reason code (e.g. `"ClientDisconnected"`, `"Timeout"`, `"PayloadLimitExceeded"`).
* The router MUST send CANCEL frames when:
* The HTTP client disconnects (ASP.NET `HttpContext.RequestAborted` fires) while the request is in progress.
* The router's effective timeout for the request elapses, and no response has been received.
* The router detects payload/memory limit breaches and has to abort the request.
* The router is shutting down and explicitly aborts in-flight requests (if implemented).
* The router MUST:
* Stop forwarding any additional REQUEST_STREAM_DATA to the microservice once a CANCEL is sent.
* Stop reading any remaining response frames for that correlation and either:
* Discard them.
* Or treat them as late, log them, and ignore them.
* For streaming responses, if the HTTP client disconnects or router cancels:
* The router MUST stop writing to the HTTP response and treat any subsequent frames as ignored.
---
## 11. Configuration and YAML requirements
* `__Libraries/StellaOps.Router.Config` MUST handle:
* Binding router config from JSON/appsettings + YAML + environment variables.
* Static service definitions:
* ServiceName.
* DefaultVersion.
* DefaultTransport.
* Endpoint list (Method, Path) with default timeouts, requiringClaims, streaming flags.
* Static instance definitions (optional):
* ServiceName, Version, Region, supported transports, plugin-specific settings.
* Global payload limits (`PayloadLimits`).
* Router YAML config MUST support hot-reload:
* Changes SHOULD be picked up at runtime without restarting the gateway.
* Hot-reload MUST cause in-memory routing state to be updated, including:
* New or removed services/endpoints.
* New or removed instances (static).
* Updated payload limits.
* Microservice YAML config MUST be optional and used for endpoint-level overrides only, not for identity or router pool configuration.
* The router pool for microservices MUST be configured via code and MAY be backed by YAML (with hot-plug / reconnection behavior) if desired.
---
## 12. Library naming / repo structure requirements
* The router configuration library MUST be named `__Libraries/StellaOps.Router.Config`.
* The microservice SDK library MUST be named `__Libraries/StellaOps.Microservice`.
* The gateway webservice MUST be named `StellaOps.Gateway.WebService`.
* There MUST be a “common” library for shared types and abstractions (e.g. `__Libraries/StellaOps.Router.Common`).
* Documentation files MUST include at least:
* `Stella Ops Router.md` (what it is, why, high-level architecture).
* `Stella Ops Router - Webserver.md` (how the webservice works).
* `Stella Ops Router - Microservice.md` (how the microservice SDK works and is implemented).
* `Stella Ops Router - Common.md` (common components and how they are implemented).
* `Migration of Webservices to Microservices.md`.
* `Stella Ops Router Documentation.md` (doc structure & guidance).
---
## 13. Documentation & developer-experience requirements
* The docs MUST be detailed; “do not spare details” implies:
* High-fidelity, concrete examples and not hand-wavy descriptions.
* For average C# developers, documentation MUST cover:
* Exact .NET / ASP.NET Core target version and runtime baseline.
* Required NuGet packages (logging, serialization, YAML parsing, RabbitMQ client, etc.).
* Exact serialization formats for frames and payloads (JSON vs MessagePack vs others).
* Exact framing rules for each transport (length-prefix for TCP/TLS, datagrams for UDP, exchanges/queues for Rabbit).
* Concrete sample `Program.cs` for:
* A gateway node.
* A microservice.
* Example endpoint implementations:
* Typed (with and without request).
* Raw streaming endpoints for large payloads.
* Example router YAML and microservice YAML with realistic values.
* Error and HTTP status mapping policy:
* E.g. “version not found → 404 or 400; no instance available → 503; timeout → 504; payload too large → 413.”
* Guidelines on:
* When to use UDP vs TCP vs RabbitMQ.
* How to configure and validate certificates for the certificate transport.
* How to write cancellation-friendly handlers (proper use of `CancellationToken`).
* Testing strategies: local dev setups, integration test harnesses, how to run router + microservice together for tests.
* Clear explanation of config precedence:
* Code options vs YAML vs microservice defaults vs Authority for claims.
* Documentation MUST answer for each major concept:
* What it is.
* Why it exists.
* How it works.
* How to use it (with examples).
* What happens when it is misused and how to debug issues.
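
The error and HTTP status mapping policy listed above could be captured in one place, for example as below. `RoutingFailure` and `StatusMapping` are hypothetical names; the status codes follow the example in the requirements, with 404 chosen for "version not found" where the text allows 404 or 400, and 404 assumed for an unknown endpoint.

```csharp
using System;

// Hypothetical failure categories the gateway can hit before or during dispatch.
public enum RoutingFailure
{
    EndpointNotFound,
    VersionNotFound,
    NoInstanceAvailable,
    Timeout,
    PayloadTooLarge,
}

public static class StatusMapping
{
    public static int ToHttpStatus(RoutingFailure failure) => failure switch
    {
        RoutingFailure.EndpointNotFound => 404,     // assumption: unknown (Method, Path)
        RoutingFailure.VersionNotFound => 404,      // the requirements allow 404 or 400
        RoutingFailure.NoInstanceAvailable => 503,
        RoutingFailure.Timeout => 504,
        RoutingFailure.PayloadTooLarge => 413,
        _ => throw new ArgumentOutOfRangeException(nameof(failure)),
    };
}
```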
---
## 14. Migration requirements
* There MUST be a defined migration path from `StellaOps.*.WebServices` to `StellaOps.*.Microservices`.
* Migration documentation MUST cover:
* Inventorying existing HTTP routes (Method + Path).
* Strategy A (in-place adaptation):
* Adding microservice SDK into WebService.
* Declaring endpoints with `[StellaEndpoint]`.
* Wrapping existing controller logic in handlers.
* Connecting to the router and validating registration.
* Gradually shifting traffic from direct WebService HTTP ingress to gateway routing.
* Strategy B (split):
* Extracting domain logic into shared libraries.
* Creating a dedicated microservice project using the SDK.
* Mapping routes and handlers.
* Phasing out or repurposing the original WebService.
* Ensuring cancellation tokens are wired throughout migrated code.
* Handling streaming endpoints (large uploads/downloads) via `IRawStellaEndpoint` and streaming support instead of naive buffered HTTP controllers.
---
If you want, I can next turn this requirement set into a machine-readable checklist (e.g. JSON or YAML) or derive a first-pass implementation roadmap directly from these requirements.