Below is how I’d tell your dev agents to operate on this codebase so it doesn’t turn into chaos over time.
Think of this as the “rules of engagement” for Stella Ops Router.
1. Non‑negotiable operating principles
All agents follow these rules:
- Specs are law
  - `docs/router/specs.md` is the primary source of truth.
  - If code and spec differ:
    - Fix the spec first (in a PR), then adjust the code.
  - No “quick fixes” that contradict the spec.
- Common & protocol are sacred
  - `StellaOps.Router.Common` and the wire protocol (`Frame` / `FrameType` / serialization) are stable layers.
  - Any change to:
    - `Frame`, `FrameType`
    - `EndpointDescriptor`, `ConnectionState`
    - `ITransportClient` / `ITransportServer`
  - …requires:
    - Explicit spec update.
    - Compatibility consideration.
    - Code review by someone thinking about all transports and both sides (gateway + microservice).
- InMemory first, then real transports
  - New protocol semantics (e.g., new frame type, new behavior, new timeout rules) MUST:
    - Be implemented and proven with InMemory.
    - Have tests passing with InMemory.
    - Only then be rolled into TCP/TLS/UDP/RabbitMQ.
- No backdoor HTTP between microservices and router
  - Microservices must never talk HTTP to the router for control plane or data.
  - All microservice–router traffic goes through the registered transports (UDP/TCP/TLS/RabbitMQ) using `Frame`.
- Method + Path = contract
  - Endpoint identity is always: HTTP Method + Path, nothing else.
  - No “dynamic” routing hacks that bypass the `(Method, Path)` resolution.
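The “Method + Path = contract” rule can be pictured as a lookup keyed on exactly those two values. The following is a minimal sketch, not the router's actual code: `EndpointKey`, `EndpointTable`, and the normalization choices (upper-cased method, trailing slash trimmed) are assumptions made for illustration, and the spec remains the authority on the real rules.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical key type: an endpoint is identified by HTTP method + path only.
public readonly record struct EndpointKey(string Method, string Path)
{
    // Assumed normalization: upper-case method, no trailing slash (except root).
    public static EndpointKey Create(string method, string path)
    {
        var normalizedPath = path.Length > 1 ? path.TrimEnd('/') : path;
        return new EndpointKey(method.ToUpperInvariant(), normalizedPath);
    }
}

// Hypothetical routing table: resolution goes through the key and nothing else.
public sealed class EndpointTable
{
    private readonly Dictionary<EndpointKey, string> _owners = new();

    public void Register(string method, string path, string serviceName) =>
        _owners[EndpointKey.Create(method, path)] = serviceName;

    // Returns the owning service name, or null when no endpoint matches.
    public string? Resolve(string method, string path) =>
        _owners.TryGetValue(EndpointKey.Create(method, path), out var serviceName)
            ? serviceName
            : null;
}

public static class Program
{
    public static void Main()
    {
        var table = new EndpointTable();
        table.Register("POST", "/billing/invoices/upload", "billing");

        // Same method and path (after normalization): resolves to "billing".
        Console.WriteLine(table.Resolve("post", "/billing/invoices/upload/") ?? "(no match)");

        // Same path, different method: a different endpoint, so no match.
        Console.WriteLine(table.Resolve("GET", "/billing/invoices/upload") ?? "(no match)");
    }
}
```

Because the key is a value type with built-in equality, there is no place for header-based or payload-based routing to sneak in without changing the key itself, which would be a visible contract change.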
2. How agents should structure work (vertical slices, not scattered edits)
Whenever you assign work, agents should:
- Work in vertical slices
  - Example slices: “Cancellation with InMemory”, “Streaming + payload limits with TCP”, “RabbitMQ buffered requests”.
  - Each slice includes:
    - Spec amendments (if needed).
    - Common contracts (if needed).
    - Implementation (gateway + microservice + transport).
    - Tests.
- Avoid cross‑cutting, half‑finished changes
  - Do not:
    - Change Common, start on TCP, then get bored and leave InMemory broken.
  - Do:
    - Finish one vertical slice end‑to‑end, then move on.
- Keep changes small and reviewable
  - Prefer:
    - One PR for “add YAML overrides merging”.
    - Another PR for “add router YAML hot‑reload details”.
  - Avoid huge omnibus PRs that change protocol, transports, router, and microservice in one go.
3. Change categories & review rules
Agents should classify their work by category and obey the review level.
- Category A – Protocol / Common changes
  - Affects:
    - `Frame`, `FrameType`, payload DTOs.
    - `EndpointDescriptor`, `ConnectionState`, `RoutingDecision`.
    - `ITransportClient`, `ITransportServer`.
  - Requirements:
    - Spec change with rationale.
    - Cross‑side impact analysis: gateway + microservice + all transports.
    - Tests updated for InMemory and at least one real transport.
  - Review: 2+ reviewers, one acting as “protocol owner”.
- Category B – Router logic / routing plugin
  - Affects:
    - `IGlobalRoutingState` implementation.
    - `IRoutingPlugin` logic (region, ping, heartbeat).
  - Requirements:
    - Unit tests for routing plugin (selection rules).
    - At least one integration test through gateway + InMemory.
  - Review: at least one reviewer who understands region/version semantics.
- Category C – Transport implementation
  - Affects:
    - TCP/TLS/UDP/RabbitMQ clients & servers.
  - Requirements:
    - Transport‑specific tests (connection, basic request/response, timeout).
    - No protocol changes.
  - Review: 1–2 reviewers, including one who owns that transport.
- Category D – SDK / Microservice developer experience
  - Affects:
    - `StellaOps.Microservice` public surface, endpoint discovery, YAML merging.
  - Requirements:
    - API review for public surface.
    - Docs update (`Microservice.md`) if behavior changes.
  - Review: 1–2 reviewers.
- Category E – Docs only
  - Affects:
    - `docs/router/*`, no code.
  - Requirements:
    - Ensure docs match current behavior; if not, spawn follow‑up issues.
4. Workflow per change (what each agent does)
For any non‑trivial change:
- Check the spec
  - Confirm that:
    - The desired behavior is already described, or
    - You will extend the spec first.
- Update / extend spec if needed
  - Edit `docs/router/specs.md` or the appropriate doc.
  - Document:
    - What’s changing.
    - Why we need it.
    - Which components are affected.
- Adjust Common / contracts if needed
  - Only after spec is updated.
  - Keep changes minimal and backwards compatible where possible.
- Implement in InMemory path
  - Update:
    - InMemory `ITransportClient` / hub.
    - Microservice and gateway logic that rely on it.
  - Add tests to prove behavior.
- Port to real transports
  - Implement the same behavior in:
    - TCP (baseline).
    - TLS (wrapping TCP).
    - Others when needed.
  - Reuse the InMemory test pattern for the transport tests.
- Add / update tests
  - Unit tests for logic.
  - Integration tests for gateway + microservice via at least one real transport.
- Update documentation
  - Update relevant docs:
    - `Stella Ops Router - Webserver.md`
    - `Stella Ops Router - Microservice.md`
    - `Common.md`, if common contracts changed.
  - Highlight any new configuration knobs or invariants.
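One way to read “reuse the InMemory test pattern for transport tests” is to write each scenario once against an interface and run it per transport. The sketch below assumes a deliberately simplified `ISketchTransport` (a hypothetical stand-in, since the real `ITransportClient` shape is defined in Common and not reproduced here) and uses plain boolean checks rather than any particular test framework.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for the real ITransportClient contract.
public interface ISketchTransport
{
    Task<byte[]> SendAsync(byte[] request, CancellationToken cancellationToken);
}

// InMemory variant: hands the request straight to a handler, no sockets involved.
public sealed class InMemorySketchTransport : ISketchTransport
{
    private readonly Func<byte[], byte[]> _handler;

    public InMemorySketchTransport(Func<byte[], byte[]> handler) => _handler = handler;

    public Task<byte[]> SendAsync(byte[] request, CancellationToken cancellationToken)
    {
        cancellationToken.ThrowIfCancellationRequested();
        return Task.FromResult(_handler(request));
    }
}

// Scenarios written once against the interface, then run for every transport.
public static class TransportContract
{
    public static async Task<bool> RoundTripsPayloadAsync(ISketchTransport transport)
    {
        var request = new byte[] { 1, 2, 3 };
        var response = await transport.SendAsync(request, CancellationToken.None);
        return response.SequenceEqual(request);
    }

    public static async Task<bool> HonoursCancellationAsync(ISketchTransport transport)
    {
        using var cts = new CancellationTokenSource();
        cts.Cancel();
        try
        {
            await transport.SendAsync(new byte[] { 1 }, cts.Token);
            return false; // The call completed even though it was already cancelled.
        }
        catch (OperationCanceledException)
        {
            return true;
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        // An echo handler stands in for a microservice endpoint.
        ISketchTransport inMemory = new InMemorySketchTransport(request => request);

        Console.WriteLine(await TransportContract.RoundTripsPayloadAsync(inMemory));
        Console.WriteLine(await TransportContract.HonoursCancellationAsync(inMemory));
    }
}
```

A TCP or TLS implementation would be passed to the same two methods, so a behavioral difference between transports shows up as a failing shared scenario instead of a missing test.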
5. Testing expectations for all agents
Agents should treat tests as part of the change, not an afterthought.
- Unit tests
  - For:
    - Routing plugin decisions.
    - YAML merge behavior.
    - Payload budget logic.
  - Goal:
    - All tricky branches are covered.
- Integration tests
  - For gateway + microservice using:
    - InMemory.
    - At least one real transport (TCP in dev).
  - Scenarios to maintain:
    - Simple request/response.
    - Streaming upload.
    - Cancellation on client abort.
    - Timeout leading to CANCEL.
    - Payload limit exceeded.
- Smoke tests for examples
  - Ensure the `StellaOps.Billing.Microservice` example always passes a small test:
    - `/billing/health` works.
    - `/billing/invoices/upload` streaming behaves.
- CI gating
  - No PR merges unless:
    - `dotnet build` for solution succeeds.
    - All tests pass.
  - If agents add new projects/tests, CI must be updated in the same PR.
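For payload budget logic, “tricky branches” mostly means the boundaries: exactly at the limit, one byte over, and a rejected chunk not being counted. Here is a small sketch of what such logic and its checks could look like. `PayloadBudget` is a hypothetical type invented for this example (the real budget rules live in the spec), and it is deliberately not thread-safe.

```csharp
using System;

// Hypothetical per-request budget: tracks streamed bytes against a configured limit.
// Not thread-safe; a real implementation would need to decide how chunks are serialized.
public sealed class PayloadBudget
{
    private readonly long _limitBytes;
    private long _usedBytes;

    public PayloadBudget(long limitBytes)
    {
        if (limitBytes < 0) throw new ArgumentOutOfRangeException(nameof(limitBytes));
        _limitBytes = limitBytes;
    }

    public long RemainingBytes => _limitBytes - _usedBytes;

    // Returns false, and consumes nothing, if the chunk would exceed the limit.
    public bool TryConsume(int chunkLength)
    {
        if (chunkLength < 0) throw new ArgumentOutOfRangeException(nameof(chunkLength));
        if (chunkLength > RemainingBytes) return false;

        _usedBytes += chunkLength;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var budget = new PayloadBudget(limitBytes: 10);

        Console.WriteLine(budget.TryConsume(6)); // True: 6 of 10 used.
        Console.WriteLine(budget.TryConsume(4)); // True: exactly at the limit.
        Console.WriteLine(budget.TryConsume(1)); // False: one byte over.
        Console.WriteLine(budget.RemainingBytes); // 0: the rejected chunk was not counted.
    }
}
```

The same boundary cases translate directly into the “payload limit exceeded” integration scenario, where the rejected chunk would be expected to end the request according to whatever the spec prescribes.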
6. How agents should use configuration & YAML
- Router side
  - Always read payload limits, node region, transports from `RouterConfig` (bound from YAML + env).
  - Do not hardcode:
    - Limits.
    - Regions.
    - Ports.
  - If behavior depends on config, fetch it from `IOptionsMonitor<RouterConfig>` at runtime, not from cached fields, unless you deliberately freeze the value.
- Microservice side
  - Identity & router pool:
    - From `StellaMicroserviceOptions` (code/env).
  - Endpoint metadata overrides:
    - From YAML (`ConfigFilePath`) merged into reflection result.
  - Agents must not let YAML create endpoints that don’t exist in code; overrides only.
- No hidden defaults
  - If a default is important (e.g. `HeartbeatInterval`), document it and centralize it.
  - Don’t sprinkle magic numbers across code.
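The “overrides only” rule for microservice YAML can be sketched as a merge that starts from the endpoints discovered in code and never adds to that set. Everything in this example is an assumption for illustration: `EndpointInfo`, `EndpointOverride`, the choice of `Timeout` as the overridable field, and the idea of returning rejected entries so the caller can log or fail on them. YAML parsing itself is left out; the overrides are assumed to be already parsed into a dictionary.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical shapes: what reflection discovered, and what YAML may override.
public sealed record EndpointInfo(string Method, string Path, TimeSpan Timeout);
public sealed record EndpointOverride(TimeSpan? Timeout);

public static class EndpointOverrideMerger
{
    // Applies overrides to discovered endpoints. Overrides that match no
    // discovered endpoint are reported back and never turned into endpoints.
    public static (IReadOnlyList<EndpointInfo> Endpoints, IReadOnlyList<string> Rejected) Merge(
        IEnumerable<EndpointInfo> discovered,
        IReadOnlyDictionary<(string Method, string Path), EndpointOverride> overrides)
    {
        var merged = new List<EndpointInfo>();
        var seen = new HashSet<(string, string)>();

        foreach (var endpoint in discovered)
        {
            var key = (endpoint.Method, endpoint.Path);
            seen.Add(key);

            merged.Add(overrides.TryGetValue(key, out var o) && o.Timeout is { } timeout
                ? endpoint with { Timeout = timeout }
                : endpoint);
        }

        var rejected = new List<string>();
        foreach (var key in overrides.Keys)
        {
            if (!seen.Contains(key)) rejected.Add($"{key.Method} {key.Path}");
        }

        return (merged, rejected);
    }
}

public static class Program
{
    public static void Main()
    {
        var discovered = new[]
        {
            new EndpointInfo("GET", "/billing/health", TimeSpan.FromSeconds(30)),
        };
        var overrides = new Dictionary<(string Method, string Path), EndpointOverride>
        {
            [("GET", "/billing/health")] = new EndpointOverride(TimeSpan.FromSeconds(5)),
            [("POST", "/billing/ghost")] = new EndpointOverride(TimeSpan.FromSeconds(1)),
        };

        var result = EndpointOverrideMerger.Merge(discovered, overrides);

        foreach (var e in result.Endpoints) Console.WriteLine($"{e.Method} {e.Path} timeout={e.Timeout}");
        foreach (var r in result.Rejected) Console.WriteLine($"rejected override: {r}");
    }
}
```

Whether a rejected override should be a warning or a startup failure is a policy decision for the spec; the sketch only makes sure the situation is visible rather than silently ignored.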
7. Adding new capabilities: pattern all agents follow
When someone wants a new capability (e.g. “retry on transient transport failures”):
- Open a design issue / doc snippet
  - Describe:
    - Problem.
    - Proposed design.
    - Where it sits in architecture (router, microservice, transport, config).
- Update spec
  - Write the behavior in the appropriate doc section.
  - Include:
    - API shape (if public).
    - Transport impacts.
    - Failure modes.
- Follow the vertical slice path
  - Implement in Common (if needed).
  - Implement InMemory.
  - Implement in primary transport (TCP).
  - Add tests.
  - Update docs.
Agents should not just spike code into TCP implementation without spec or tests.
8. Logging, tracing, and debugging expectations
Agents should instrument consistently; this matters for operations and for debugging during development.
- Use structured logging
  - At minimum, include:
    - `ServiceName`
    - `InstanceId`
    - `CorrelationId`
    - `Method`
    - `Path`
    - `ConnectionId`
  - Do not log full payload bodies by default, for privacy and performance reasons; log sizes and key metadata instead.
- Trace correlation
  - Ensure correlation IDs:
    - Propagate from HTTP (gateway) into `Frame.CorrelationId`.
    - Are used in logs on both sides (gateway + microservice).
- Agent debugging guidance
  - When debugging a routing or transport problem:
    - Turn on debug logging for gateway + microservice for that service.
    - Use the correlation ID to follow the request end‑to‑end.
    - Verify:
      - HELLO registration.
      - HEARTBEAT events.
      - REQUEST leaving gateway.
      - RESPONSE arriving.
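The logging and correlation rules above can be illustrated with a small sketch that carries a correlation ID from HTTP headers into a frame and builds log fields containing the payload size rather than the body. The names here are assumptions: `SketchFrame` is a trimmed-down stand-in for the real `Frame`, the `X-Correlation-Id` header name is a common convention rather than something the source specifies, and the field dictionary stands in for whatever structured logging API the project uses.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down frame: only the fields this sketch needs.
public sealed record SketchFrame(string CorrelationId, string Method, string Path, byte[] Payload);

public static class Correlation
{
    // Assumed header name; the actual header is whatever the spec defines.
    public const string HeaderName = "X-Correlation-Id";

    // Reuses the caller's correlation ID when present, otherwise creates one.
    // Callers should pass a case-insensitive dictionary, as HTTP header names are.
    public static string FromHeaders(IReadOnlyDictionary<string, string> headers) =>
        headers.TryGetValue(HeaderName, out var id) && !string.IsNullOrWhiteSpace(id)
            ? id
            : Guid.NewGuid().ToString("N");
}

public static class FrameLogFields
{
    // Builds the structured fields for one log event: metadata and size, never the body.
    public static IReadOnlyDictionary<string, object> For(
        string serviceName, string instanceId, string connectionId, SketchFrame frame) =>
        new Dictionary<string, object>
        {
            ["ServiceName"] = serviceName,
            ["InstanceId"] = instanceId,
            ["CorrelationId"] = frame.CorrelationId,
            ["Method"] = frame.Method,
            ["Path"] = frame.Path,
            ["ConnectionId"] = connectionId,
            ["PayloadBytes"] = frame.Payload.Length,
        };
}

public static class Program
{
    public static void Main()
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["x-correlation-id"] = "abc-123",
        };

        var frame = new SketchFrame(
            Correlation.FromHeaders(headers), "POST", "/billing/invoices/upload", new byte[2048]);

        foreach (var field in FrameLogFields.For("billing", "billing-1", "conn-7", frame))
        {
            Console.WriteLine($"{field.Key}={field.Value}");
        }
    }
}
```

If the gateway and the microservice both build their log fields from the frame in this way, searching logs for one correlation ID yields the HELLO, REQUEST, and RESPONSE events for a single request on both sides.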
9. Daily agent workflow (practical directions)
For each day / task, an agent should:
- Start from an issue or spec line item
  - Never “just code something” without an issue/state in the backlog.
- Locate the relevant doc
  - Spec section.
  - Example docs (e.g. Billing sample).
  - Migration doc if working on conversion.
- Work in a feature branch
  - Branch name reflects scope: `feature/streaming-tcp`, `fix/router-cancellation`, etc.
- Keep notes
  - If an assumption is made (e.g. “we currently don’t support streaming over RabbitMQ”), note it in the issue.
  - If you discover an inconsistency in the docs, open a doc‑fix issue.
- Finish the full slice
  - Code + tests + docs.
  - Keep partial implementations behind feature flags (if needed) and clearly marked.
- Open PR with clear description
  - What changed.
  - Which spec section it implements or modifies.
  - Any risks or roll‑back notes.
10. Guardrails against drift
Finally, a few things agents must actively avoid:
- No silent protocol changes
  - Don’t change `FrameType` semantics, payload formats, or header layout without:
    - Spec update.
    - Full impact review.
- No specless behavior
  - If something matters at runtime (timeouts, retries, routing rules), it has to be in the docs, not just in someone’s head.
- No bypassing of router
  - Do not introduce “temporary” direct calls from clients to microservices. All client HTTP should go via gateway.
- No direct dependencies on specific transports in domain code
  - Domain and microservice endpoint logic must not know if the transport is TCP, TLS, UDP, or RabbitMQ. They only see `RawRequestContext`, `RawResponse`, and cancellation tokens.
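To make the last guardrail concrete, here is a sketch of an endpoint handler that depends only on a request context, a response type, and a cancellation token. The types `SketchRequestContext`, `SketchResponse`, and `ISketchEndpoint` are hypothetical stand-ins with assumed members; the real `RawRequestContext` and `RawResponse` shapes are defined by the SDK and may differ.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for RawRequestContext / RawResponse; members are assumed.
public sealed record SketchRequestContext(string Method, string Path, Stream Body);
public sealed record SketchResponse(int StatusCode, byte[] Body);

// The handler contract mentions no transport: no sockets, queues, or TLS types.
public interface ISketchEndpoint
{
    Task<SketchResponse> HandleAsync(SketchRequestContext request, CancellationToken cancellationToken);
}

// Example endpoint: counts the bytes of a streamed upload.
public sealed class ByteCountEndpoint : ISketchEndpoint
{
    public async Task<SketchResponse> HandleAsync(
        SketchRequestContext request, CancellationToken cancellationToken)
    {
        var buffer = new byte[8192];
        long total = 0;
        int read;

        // Reading stops early if the token is cancelled, whichever transport signalled it.
        while ((read = await request.Body.ReadAsync(buffer.AsMemory(), cancellationToken)) > 0)
        {
            total += read;
        }

        var text = total.ToString(CultureInfo.InvariantCulture);
        return new SketchResponse(200, Encoding.UTF8.GetBytes(text));
    }
}

public static class Program
{
    public static async Task Main()
    {
        ISketchEndpoint endpoint = new ByteCountEndpoint();
        using var body = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 });

        var response = await endpoint.HandleAsync(
            new SketchRequestContext("POST", "/billing/invoices/upload", body),
            CancellationToken.None);

        Console.WriteLine($"{response.StatusCode} {Encoding.UTF8.GetString(response.Body)}");
    }
}
```

A quick review heuristic follows from this shape: if an endpoint class needs a `using` for a socket, TLS, or message-queue namespace, transport knowledge has leaked into domain code.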
If you want, I can turn this into a one‑page “Agent Handbook” markdown file you can drop into docs/router/AGENTS_PROCESS.md and link from specs.md so every AI or human dev working on this stack has the same ground rules.