Start by treating `docs/router/specs.md` as law. Nothing gets coded that contradicts it. The first sprint or two should be about *wiring the skeleton* and proving the core flows with the simplest possible transport, then layering in the real transports and migration paths.

I’d structure the work for your agents like this.

---

## 0. Read & freeze invariants

**All agents:**

* Read `docs/router/specs.md` end to end.
* Extract and pin the non-negotiables:
  * Method + Path identity.
  * Strict semver for versions.
  * Region from `GatewayNodeConfig.Region` (no host/header magic).
  * No HTTP transport for microservice communications.
  * Single connection carrying HELLO + HEARTBEAT + REQUEST/RESPONSE + CANCEL.
  * Router treats body as opaque bytes/streams.
  * `RequiringClaims` replaces any form of `AllowedRoles`.

Agree that these are invariants; any future idea that violates them needs an explicit spec change first.

---

## 1. Lay down the solution skeleton

**“Skeleton” agent (or gateway core agent):**

Create the basic project structure, no logic yet:

* `src/__Libraries/StellaOps.Router.Common`
* `src/__Libraries/StellaOps.Router.Config`
* `src/__Libraries/StellaOps.Microservice`
* `src/StellaOps.Gateway.WebService`
* `docs/router/` already has `specs.md` (add placeholders for the other docs).

Goal: everything builds, but most classes are empty or stubs.

---

## 2. Implement the shared core model (Common)

**Common/core agent:**

Implement only the *data* and *interfaces*, no behavior:

* Enums:
  * `TransportType`, `FrameType`, `InstanceHealthStatus`.
* Models:
  * `ClaimRequirement`
  * `EndpointDescriptor`
  * `InstanceDescriptor`
  * `ConnectionState`
  * `RoutingContext`, `RoutingDecision`
  * `PayloadLimits`
* Interfaces:
  * `IGlobalRoutingState`
  * `IRoutingPlugin`
  * `ITransportServer`
  * `ITransportClient`
* `Frame` struct/class:
  * `FrameType`, `CorrelationId`, `Payload` (byte[]).

Leave implementations of `IGlobalRoutingState`, `IRoutingPlugin`, transports, etc., for later steps.

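As a rough illustration of how small these contracts can stay, a few of them might look like the C# sketch below. It is not the spec: the enum members, the `Guid` type for `CorrelationId`, and the property names on `ClaimRequirement` and `EndpointDescriptor` are assumptions made for the example, and `docs/router/specs.md` remains the authority.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Sketch only. In the real solution these types would live in
// StellaOps.Router.Common; member names and types here are assumptions.

// Cancel and the stream-data frame types are added in later steps.
public enum FrameType { Hello, Heartbeat, Request, Response }

// Assumed members; the plan names only the enum itself.
public enum InstanceHealthStatus { Unknown, Healthy, Unhealthy }

// Hypothetical shape: a claim type plus an optional required value.
public sealed record ClaimRequirement(string Type, string? Value = null);

// Identity is Method + Path; Version is expected to be strict semver.
public sealed record EndpointDescriptor(
    string Method,
    string Path,
    string Version,
    bool SupportsStreaming,
    IReadOnlyList<ClaimRequirement> RequiringClaims);

// The router treats Payload as opaque bytes and never inspects it.
public sealed record Frame(FrameType FrameType, Guid CorrelationId, byte[] Payload);
```

Using immutable records here is a design choice, not a requirement: they keep the model free of behavior, which is the point of this step.
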
Deliverable: a stable set of contracts that gateway + microservice SDK depend on.

---

## 3. Build a fake “in-memory” transport plugin

**Transport agent:**

Before UDP/TCP/Rabbit, build an **in-process transport**:

* `InMemoryTransportServer` and `InMemoryTransportClient`.
* They share a concurrent dictionary keyed by `ConnectionId`.
* Frames are passed via channels/queues in memory.

Purpose:

* Let you prove HELLO/HEARTBEAT/REQUEST/RESPONSE/CANCEL semantics and routing logic *without* dealing with sockets and Rabbit yet.
* Let you unit and integration test the router and SDK quickly.

This plugin will never ship to production; it’s only for dev tests and CI.

---

## 4. Microservice SDK: minimal handshake & dispatch (with InMemory)

**Microservice agent:**

Initial focus: “connect and say HELLO, then handle a simple request.”

1. Implement `StellaMicroserviceOptions`.
2. Implement `AddStellaMicroservice(...)`:
   * Bind options.
   * Register endpoint handlers and SDK internal services.
3. Endpoint discovery:
   * Implement runtime reflection for `[StellaEndpoint]` + handler types.
   * Build in-memory `EndpointDescriptor` list (simple: no YAML yet).
4. Connection:
   * Use `InMemoryTransportClient` to “connect” to a fake router.
   * On connect, send a HELLO frame with:
     * Identity.
     * Endpoint list and metadata (`SupportsStreaming` false for now, simple `RequiringClaims` empty).
5. Request handling:
   * Implement `IRawStellaEndpoint` and adapter to it.
   * Implement `RawRequestContext` / `RawResponse`.
   * Implement a dispatcher that:
     * Receives `Request` frame.
     * Builds `RawRequestContext`.
     * Invokes the correct handler.
     * Sends `Response` frame.

Do **not** handle streaming or cancellation yet; just basic request/response with small bodies.

---

## 5. Gateway: minimal routing using InMemory plugin

**Gateway agent:**

Goal: HTTP → in-memory transport → microservice → HTTP response.

1. Implement `GatewayNodeConfig` and bind it from config.
2. Implement `IGlobalRoutingState` as a simple in-memory implementation that:
   * Holds `ConnectionState` objects.
   * Builds a map `(Method, Path)` → endpoint + connections.
3. Implement a minimal `IRoutingPlugin` that:
   * For now, just picks *any* connection that has the endpoint (no region/ping logic yet).
4. Implement minimal HTTP pipeline:
   * `EndpointResolutionMiddleware`:
     * `(Method, Path)` → `EndpointDescriptor` from `IGlobalRoutingState`.
   * Naive authorization middleware stub (only checks “needs authenticated user”; ignore real requiringClaims for now).
   * `RoutingDecisionMiddleware`:
     * Ask `IRoutingPlugin` for a `RoutingDecision`.
   * `TransportDispatchMiddleware`:
     * Build a `Request` frame.
     * Use `InMemoryTransportClient` to send and await `Response`.
     * Map response to HTTP.
5. Implement HELLO handler on gateway side:
   * When InMemory “connection” from microservice appears and sends HELLO:
     * Construct `ConnectionState`.
     * Update `IGlobalRoutingState` with endpoint → connection mapping.

Once this works, you have end-to-end:

* Example microservice.
* Example gateway.
* In-memory transport.
* A couple of test endpoints returning simple JSON.

---

## 6. Add heartbeat, health, and basic routing rules

**Common/core + gateway agent:**

Now enforce liveness and basic routing:

1. Heartbeat:
   * Microservice SDK sends HEARTBEAT frames on a timer.
   * Gateway updates `LastHeartbeatUtc` and `Status`.
2. Health:
   * Add background job in gateway that:
     * Marks instances Unhealthy if heartbeat stale.
3. Routing:
   * Enhance `IRoutingPlugin` to:
     * Filter out Unhealthy instances.
     * Prefer gateway region (using `GatewayNodeConfig.Region`).
     * Use simple `AveragePingMs` stub from request/response timings.

Still using InMemory transport; just building the selection logic.

---

## 7. Add cancellation semantics (with InMemory)

**Microservice + gateway agents:**

Wire up cancellation logic before touching real transports:

1. Common:
   * Extend `FrameType` with `Cancel`.
2. Gateway:
   * In `TransportDispatchMiddleware`:
     * Tie `HttpContext.RequestAborted` to a `SendCancelAsync` call.
     * On timeout, send CANCEL.
     * Ignore late `Response`/stream data for canceled correlation IDs.
3. Microservice:
   * Maintain `_inflight` map of correlation → `CancellationTokenSource`.
   * When `Cancel` frame arrives, call `cts.Cancel()`.
   * Ensure handlers receive and honor `CancellationToken`.

Prove via tests: if client disconnects, handler stops quickly.

---

## 8. Add streaming & payload limits (still InMemory)

**Gateway + microservice agents:**

1. Streaming:
   * Extend InMemory transport to support `RequestStreamData` / `ResponseStreamData` frames.
   * On the gateway:
     * For `SupportsStreaming` endpoints, pipe HTTP body stream → frame stream.
     * For response, pipe frames → HTTP response stream.
   * On microservice:
     * Expose `RawRequestContext.Body` as a stream reading frames as they arrive.
     * Allow `RawResponse.WriteBodyAsync` to stream out.
2. Payload limits:
   * Implement `PayloadLimits` enforcement at gateway:
     * Early reject large `Content-Length`.
     * Track counters in streaming; trigger cancellation when exceeding thresholds.

Demonstrate with a fake “upload” endpoint that uses `IRawStellaEndpoint` and streaming.

---

## 9. Implement real transport plugins one by one

**Transport agent:**

Now replace InMemory with real transports:

Order:

1. **TCP plugin** (easiest baseline):
   * Length-prefixed frame protocol.
   * Connection per microservice instance (or multi-instance if needed later).
   * Implement HELLO/HEARTBEAT/REQUEST/RESPONSE/STREAM/CANCEL as per frame model.
2. **Certificate (TLS) plugin**:
   * Wrap TCP plugin with TLS.
   * Add configuration for server & client certs.
3. **UDP plugin**:
   * Single datagram = single frame; no streaming.
   * Enforce `MaxRequestBytesPerCall`.
   * Use for small, idempotent operations.
4. **RabbitMQ plugin**:
   * Add exchanges/queues for HELLO/HEARTBEAT and REQUEST/RESPONSE.
   * Use `CorrelationId` properties for matching.
   * Guarantee at-most-once semantics where practical.

While each plugin is built, keep the core router and microservice SDK relying only on `ITransportClient`/`ITransportServer` abstractions.

---

## 10. Add Router.Config + Microservice YAML integration

**Config agent:**

1. Implement `__Libraries/StellaOps.Router.Config`:
   * YAML → `RouterConfig` binding.
   * Services, endpoints, static instances, payload limits.
   * Hot-reload via `IOptionsMonitor` / file watcher.
2. Implement microservice YAML:
   * Endpoint-level overrides only (timeouts, requiringClaims, SupportsStreaming).
   * Merge logic: code defaults → YAML override.
3. Integrate:
   * Gateway uses RouterConfig for:
     * Defaults when no microservice registered yet.
     * Payload limits.
   * Microservice uses YAML to refine endpoint metadata before sending HELLO.

---

## 11. Build a reference example + migration skeleton

**DX / migration agent:**

1. Build a `StellaOps.Billing.Microservice` example:
   * A couple of simple endpoints (GET/POST).
   * One streaming upload endpoint.
   * YAML for requiringClaims and timeouts.
2. Build a `StellaOps.Gateway.WebService` example config around it.
3. Document the full path:
   * How to run both locally.
   * How to add a new endpoint.
   * How cancellation behaves (killing the client, watching logs).
   * How payload limits work (try to upload too-large file).
4. Outline migration steps from an imaginary `StellaOps.Billing.WebService` using the patterns in `Migration of Webservices to Microservices.md`.

---

## 12. Process guidance for your agents

* **Do not jump to UDP/TCP immediately.** Prove the protocol (HELLO/HEARTBEAT/REQUEST/RESPONSE/STREAM/CANCEL), routing, and limits on the InMemory plugin first.
* **Guard the invariants.** If someone proposes “just call HTTP between services” or “let’s derive region from host,” they’re violating spec and must update `docs/router/specs.md` before coding.
* **Keep Common stable.** Changes to `StellaOps.Router.Common` must be rare and reviewed; everything else depends on it.
* **Document as you go.** Every time a behavior settles (e.g. status mapping, frame layout), update the docs under `docs/router/` so new agents always have a single source of truth.

If you want, as a next step I can convert this into a task board (epic → stories) per repo folder, so you can assign specific chunks to named agents.