stella ops usage fixes roles propagation and timeout, one account to support multi tenants, migrations consolidation, search to support documentation, doctor and open api vector db search
@@ -0,0 +1,185 @@
# Sprint 20260221.044 - Router Valkey Microservice Transport Pilot (TimelineIndexer)

## Topic & Scope
- Convert one small service (TimelineIndexer WebService) from Gateway reverse proxy to Router microservice transport using Valkey-backed messaging.
- Keep activation controlled by Docker Compose settings so the local stack can switch between reverse proxy and microservice modes without code edits.
- Introduce a generic DI routine (`AddRouterMicroservice()`) that binds service router options and transport registration from configuration.
- Working directory: `src/Router/`.
- Cross-module edits explicitly allowed for this sprint: `src/TimelineIndexer/StellaOps.TimelineIndexer/StellaOps.TimelineIndexer.WebService`, `devops/compose`, `docs/modules/router`, `docs/modules/timeline-indexer`, `src/AdvisoryAI/StellaOps.AdvisoryAI.WebService`, `src/AdvisoryAI/StellaOps.AdvisoryAI.Hosting`.
- Expected evidence: targeted unit/integration tests for DI and transport registration, compose smoke run logs, gateway route validation evidence, updated docs.

## Dependencies & Concurrency
- Depends on existing Gateway messaging transport implementation in `src/Router/StellaOps.Gateway.WebService`.
- Depends on Valkey infrastructure service in `devops/compose/docker-compose.stella-ops.yml`.
- Safe concurrency:
  - DI helper implementation and docs can run in parallel.
  - Compose wiring and TimelineIndexer adoption can run in parallel after DI contracts are agreed.
  - Gateway route cutover must run after endpoint-path compatibility is confirmed.

## Documentation Prerequisites
- `docs/modules/router/architecture.md`
- `docs/modules/router/messaging-valkey-transport.md`
- `docs/modules/router/webservice-integration-guide.md`
- `docs/modules/router/aspnet-endpoint-bridge.md`
- `docs/modules/timeline-indexer/architecture.md`
- `docs/modules/timeline-indexer/guides/timeline.md`
- `docs/modules/gateway/architecture.md`
- `docs/modules/gateway/openapi.md`

## Delivery Tracker

### RVM-01 - Baseline and path-compatibility audit for TimelineIndexer
Status: DONE
Dependency: none
Owners: Project Manager, Developer
Task description:
- Capture current route behavior for TimelineIndexer in local compose and Router gateway route table.
- Document method/path compatibility between gateway route entries and endpoints discovered from TimelineIndexer ASP.NET routes.
- Produce a cutover-safe mapping table that lists paths that can switch directly and paths that require alias endpoints or route migration.

Completion criteria:
- [x] Mapping table committed with explicit `current_path -> target_microservice_path` entries.
- [x] All TimelineIndexer routes selected for pilot have deterministic method/path compatibility with Router endpoint identity rules.
- [x] Risk note recorded for any incompatible prefixes (for example `/api/v1/timeline` vs `/timeline`).

### RVM-02 - Add generic `AddRouterMicroservice()` DI routine
Status: DONE
Dependency: RVM-01
Owners: Developer
Task description:
- Implement a new Router.AspNet helper that wraps `TryAddStellaRouter` plus transport-client registration based on configuration.
- The routine must support at least `InMemory`, `Tcp`, `Certificate`, and `Messaging` transport selection from bound options.
- For messaging mode, register both messaging backend plugin (`ValkeyTransportPlugin`) and router messaging client (`AddMessagingTransportClient`) using a deterministic configuration section.

Completion criteria:
- [x] New DI routine exists under `src/Router/__Libraries/StellaOps.Router.AspNet` with XML docs and option validation.
- [x] Existing behavior remains backward compatible for services that continue to call `TryAddStellaRouter`.
- [x] Unit tests cover transport selection and misconfiguration failure modes.

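The selection and validation behavior described above could look roughly like the following. This is a language-neutral sketch in Python, not the Router.AspNet implementation: the option keys (`Enabled`, `Transport`, `Messaging`, `ConnectionString`) and the `RouterConfigError` type are assumptions made for the example. Only the four transport names and the Valkey address come from this sprint.

```python
from typing import Optional

SUPPORTED_TRANSPORTS = ("InMemory", "Tcp", "Certificate", "Messaging")


class RouterConfigError(ValueError):
    """Raised when router options are missing or inconsistent."""


def select_transport(options: dict) -> Optional[str]:
    """Return the transport to register, or None when the router is disabled.

    All option keys used here are hypothetical placeholders.
    """
    if not options.get("Enabled", False):
        return None  # router disabled: nothing to register
    transport = options.get("Transport")
    if transport not in SUPPORTED_TRANSPORTS:
        raise RouterConfigError(f"unsupported transport: {transport!r}")
    messaging = options.get("Messaging", {})
    if transport == "Messaging" and not messaging.get("ConnectionString"):
        # Fail fast instead of starting with a half-configured transport.
        raise RouterConfigError("Messaging transport requires a connection string")
    return transport


selected = select_transport({
    "Enabled": True,
    "Transport": "Messaging",
    "Messaging": {"ConnectionString": "cache.stella-ops.local:6379"},
})
print(selected)
```

Raising during option validation, rather than at first request, is what lets a service fail at startup when the router is enabled but transport settings are missing.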
### RVM-03 - Compose-driven Valkey transport activation
Status: DONE
Dependency: RVM-02
Owners: Developer, DevOps
Task description:
- Add compose-level environment variables for Router Gateway messaging enablement and queue/connection values.
- Add compose-level environment variables for TimelineIndexer router enablement and messaging transport selection.
- Ensure the same compose file can run both modes by toggling flags without code changes.

Completion criteria:
- [x] `devops/compose/docker-compose.stella-ops.yml` contains required gateway and timeline indexer environment keys.
- [x] Messaging connection resolves to `cache.stella-ops.local:6379` in compose network.
- [x] Toggle instructions are documented and tested for `reverse_proxy` mode and `microservice_messaging` mode.

### RVM-04 - TimelineIndexer pilot adoption of generic DI
Status: DONE
Dependency: RVM-03
Owners: Developer
Task description:
- Update TimelineIndexer WebService startup to use `AddRouterMicroservice()` and keep `TryUseStellaRouter` plus endpoint refresh behavior.
- Ensure startup fails fast when router is enabled but required transport settings are missing.
- Keep rollback path simple by honoring compose flags that disable router integration.

Completion criteria:
- [x] TimelineIndexer registers messaging transport client when compose enables messaging mode.
- [x] Service startup logs indicate successful HELLO registration to gateway in messaging mode.
- [x] Reverse-proxy-only deployment still boots unchanged when router is disabled.

### RVM-05 - Gateway route migration strategy (canary then flip)
Status: DONE
Dependency: RVM-04
Owners: Developer, DevOps
Task description:
- Introduce a canary microservice route for TimelineIndexer that does not break current UI/API paths.
- Validate canary behavior end-to-end, then flip canonical route entries from `ReverseProxy` to `Microservice` in `router-gateway-local.json`.
- Keep a documented rollback that restores reverse proxy by route-table revert and compose flag switch.

Completion criteria:
- [x] Canary microservice route is reachable and returns expected TimelineIndexer responses.
- [x] Canonical route flip has before/after evidence with no auth-header regression.
- [x] Rollback procedure is documented with exact config keys and file diffs.

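A flip with a recorded rollback can be sketched as below. The route shape (`Path` and `Type` keys) is an assumption made for illustration and may not match the actual schema of `router-gateway-local.json`; `flip_routes` and `apply_rollback` are hypothetical helpers, written in Python only to show the idea.

```python
import copy


def flip_routes(routes, path_prefix, new_type="Microservice"):
    """Return (updated_routes, rollback) without mutating the input list.

    `rollback` records the previous type of every route that changed, so the
    flip can be reverted entry by entry.
    """
    updated = copy.deepcopy(routes)
    rollback = []
    for route in updated:
        if route["Path"].startswith(path_prefix) and route["Type"] != new_type:
            rollback.append({"Path": route["Path"], "Type": route["Type"]})
            route["Type"] = new_type
    return updated, rollback


def apply_rollback(routes, rollback):
    """Restore the route types recorded by flip_routes."""
    restored = copy.deepcopy(routes)
    previous = {entry["Path"]: entry["Type"] for entry in rollback}
    for route in restored:
        if route["Path"] in previous:
            route["Type"] = previous[route["Path"]]
    return restored


# Hypothetical route table: the canary is already a microservice route,
# the canonical path is still reverse-proxied.
routes = [
    {"Path": "/timelineindexer/api/v1/timeline", "Type": "Microservice"},
    {"Path": "/api/v1/timeline", "Type": "ReverseProxy"},
]
flipped, rollback = flip_routes(routes, "/api/v1/timeline")
print(flipped)
print(apply_rollback(flipped, rollback) == routes)
```

Keeping the rollback record next to the flip gives the before/after evidence the criteria ask for, and the revert touches only the entries that actually changed.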
### RVM-06 - Test and verification matrix for pilot
Status: DONE
Dependency: RVM-05
Owners: Test Automation, Developer
Task description:
- Add targeted tests for the new DI helper and messaging transport registration.
- Run Router gateway messaging integration tests and TimelineIndexer smoke tests under compose with Valkey.
- Capture deterministic evidence: command lines, pass/fail counts, and route-level request/response samples.

Completion criteria:
- [x] New tests exist for `AddRouterMicroservice()` with transport-mode assertions.
- [x] Existing Router messaging integration tests pass without regression.
- [x] Compose smoke verification proves request flow `Gateway -> Router microservice transport -> TimelineIndexer`.

### RVM-07 - Documentation sync for transport migration
Status: DONE
Dependency: RVM-06
Owners: Documentation Author, Developer
Task description:
- Update router integration docs to include the new generic DI routine and compose-driven transport activation pattern.
- Update timeline indexer docs with actual externally exposed paths and pilot routing strategy.
- Add Decisions and Risks links to changed docs for auditability.

Completion criteria:
- [x] `docs/modules/router` and `docs/modules/timeline-indexer` are updated to match implemented behavior.
- [x] Examples show Valkey messaging setup from compose.
- [x] Sprint Decisions and Risks section links all updated docs.

### RVM-08 - OpenAI adapter exposure workstream (AdvisoryAI)
Status: DONE
Dependency: RVM-02
Owners: Developer, Product Manager
Task description:
- Validate required exposure model: plugin-capability exposure, API endpoint exposure, or OpenAI-compatible endpoint surface.
- Reuse existing unified adapter pattern (`LlmPluginAdapterFactory`) to expose provider capabilities deterministically.
- Add gateway route exposure for selected AdvisoryAI adapter endpoints after contract is finalized.

Completion criteria:
- [x] Exposure contract is documented with explicit endpoint list and auth scopes.
- [x] AdvisoryAI registers adapter exposure services and endpoints according to the approved contract.
- [x] Gateway route table includes new adapter exposure paths with security constraints.

## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-02-21 | Sprint created from router microservice transport investigation; awaiting implementation staffing. | Project Manager |
| 2026-02-21 | Implemented `AddRouterMicroservice()` in Router.AspNet with auto transport registration and Valkey messaging path. | Developer |
| 2026-02-21 | Updated TimelineIndexer WebService to use generic DI helper and added `/api/v1/timeline*` alias endpoints for microservice path matching. | Developer |
| 2026-02-21 | Added compose env toggles for gateway messaging + TimelineIndexer router config; flipped `/api/v1/timeline` route to `Microservice`. | Developer |
| 2026-02-21 | Added pilot mapping doc and router/timeline docs updates; ran Router.AspNet tests and TimelineIndexer build successfully. | Developer |
| 2026-02-21 | Attempted gateway messaging integration verification, but test runner ignored filter and unrelated pre-existing gateway tests failed (`IdentityHeaderPolicyMiddlewareTests`). | Developer |
| 2026-02-21 | Refactored `AddRouterMicroservice()` to plugin-based transport registration (`RouterTransportPluginLoader`) and removed direct transport references from `StellaOps.Router.AspNet`. | Developer |
| 2026-02-21 | Added `MessagingTransportPlugin` for Router messaging transport and corrected compose `TIMELINE_*` env placement to `timeline-indexer-web`. | Developer |
| 2026-02-21 | Ran xUnit v3 class-filtered gateway messaging integration tests directly via test executable (`MessagingTransportIntegrationTests`): 6/6 passed; full suite still contains 9 unrelated failures (identity-header policy + websocket redirect tests). | Developer |
| 2026-02-21 | Verified compose messaging flow with Valkey (`POST /api/v1/timeline/events -> 202`) and gateway logs (`Dispatching ... via Messaging`, `TargetService=timelineindexer`). | Developer |
| 2026-02-21 | Verified router-disabled boot path by compose toggle (`TIMELINE_ROUTER_ENABLED=false`): TimelineIndexer starts without router registration logs. | Developer |
| 2026-02-21 | Fixed plugin packaging gap for TimelineIndexer publish by copying plugin transitive dependencies (`StellaOps.Messaging`, Valkey transport dependencies) and validated startup in messaging mode from publish output. | Developer |
| 2026-02-21 | Fixed microservice HELLO schema propagation for messaging transport, added schema-aware transport tests, rebuilt `timeline-indexer-web`, and verified default compose OpenAPI now shows TimelineIndexer endpoints with summaries/descriptions and 4 exported JSON schemas under `components.schemas`. | Developer |
| 2026-02-22 | Closed `RVM-05`: validated canary path (`GET /timelineindexer/api/v1/timeline -> 200`), canonical path (`GET /api/v1/timeline -> 200`, `POST /api/v1/timeline/events -> 202`), and gateway OpenAPI availability (`GET /openapi.json -> 200`). | Developer |

## Archive Note
- Archive readiness confirmed on 2026-02-22: all tasks are `DONE`, no `TODO/DOING/BLOCKED` items remain.

## Decisions & Risks
- Decision needed: canonical pilot path strategy. Recommended: canary route first, then canonical flip after validation to avoid breaking `/api/v1/timeline` consumers.
- Risk: Router microservice dispatch does not strip prefixes like reverse proxy. Mitigation: complete RVM-01 mapping and add compatible aliases before canonical route flip.
- Risk: Compose currently has no service-level router env blocks for most services. Mitigation: keep changes scoped to TimelineIndexer and Gateway for pilot; do not mass-convert.
- Risk: "OpenAI adapter exposure" can mean multiple surfaces (provider plugin metadata vs OpenAI-compatible inbound API). Mitigation: lock contract in RVM-08 before endpoint implementation.
- Risk: Existing local working tree contains unrelated edits. Mitigation: this sprint will touch only scoped files and will not revert unrelated changes.
- Risk: Router.AspNet direct transport references increase service coupling and build surface. Mitigation: use plugin discovery from configuration (`TransportPlugins:*`, `Messaging:Transport`) and keep transport assemblies optional at app/service packaging level.
- Risk: `dotnet test --filter` is ignored for xUnit v3 MTP execution in this repo. Mitigation: run filtered gateway messaging tests via the xUnit v3 test executable (`-class ...MessagingTransportIntegrationTests`) for deterministic scope evidence.
- Risk: Existing timeline-indexer container image may miss plugin transitive assemblies and restart when router messaging mode is enabled. Mitigation: use updated publish packaging and rebuild image before enabling `TIMELINE_ROUTER_ENABLED=true` in compose.
- Docs links:
  - `docs/modules/router/timelineindexer-microservice-pilot.md`
  - `docs/modules/router/webservice-integration-guide.md`
  - `docs/modules/router/messaging-valkey-transport.md`
  - `docs/modules/timeline-indexer/guides/timeline.md`
  - `docs/modules/timeline-indexer/architecture.md`

## Next Checkpoints
- 2026-02-22: Complete RVM-01 path audit and confirm canary route.
- 2026-02-23: Land `AddRouterMicroservice()` with unit tests (RVM-02).
- 2026-02-24: Compose activation and TimelineIndexer pilot wiring in dev stack (RVM-03, RVM-04).
- 2026-02-25: Canary validation and route flip decision (RVM-05).
- 2026-02-26: Docs and OpenAI adapter exposure contract checkpoint (RVM-07, RVM-08).
@@ -0,0 +1,226 @@
# Sprint 20260221.045 - Router Valkey Microservice Transport Rollout (All WebServices)

## Topic & Scope
- Migrate StellaOps webservices exposed through Gateway API routes from direct reverse proxy routing to Router microservice transport over Valkey messaging.
- Standardize service startup integration on `AddRouterMicroservice()` with transport activation fully controlled by environment variables and Docker Compose settings.
- Enforce plugin-only transport loading for both router transport and messaging backend; no hard transport coupling in webservice DI routines.
- Ensure Gateway OpenAPI preview (`/openapi.json`) includes connected microservice endpoints with operation summary/description and JSON Schema components.
- Working directory: `src/Router/`.
- Cross-module edits explicitly allowed for this sprint: `src/**/StellaOps.*.WebService`, `src/**/StellaOps.*.Worker`, `devops/compose`, `docs/modules/router`, `docs/modules/gateway`, module dossiers under `docs/modules/**`, and service-level `TASKS.md` files where touched.
- Expected evidence: targeted tests per migration wave, compose run logs, route-table diff evidence, OpenAPI path/schema verification reports, rollback playbook validation.

## Dependencies & Concurrency
- Depends on pilot groundwork from `docs-archived/implplan/SPRINT_20260221_044_Router_valkey_microservice_transport_timelineindexer_pilot.md`.
- Depends on stable Valkey service and Gateway messaging transport runtime in default compose.
- Safe concurrency is by independent migration waves grouped by module domain, with a strict freeze on canonical route flips until each wave passes OpenAPI and smoke verification.
- Shared contracts (`AddRouterMicroservice`, plugin directory conventions, messaging option keys) must remain stable during wave execution to avoid cross-wave churn.

## Documentation Prerequisites
- `docs/modules/router/architecture.md`
- `docs/modules/router/messaging-valkey-transport.md`
- `docs/modules/router/webservice-integration-guide.md`
- `docs/modules/router/aspnet-endpoint-bridge.md`
- `docs/modules/gateway/architecture.md`
- `docs/modules/gateway/openapi.md`
- `docs/modules/platform/architecture-overview.md`
- Module dossier for each service before its task moves to `DOING`.

## Delivery Tracker

### RMW-01 - Global webservice inventory and migration matrix
Status: DONE
Dependency: none
Owners: Project Manager, Developer
Task description:
- Build the authoritative migration matrix from current Gateway routes and compose services.
- Enumerate each service host, current route prefixes, auth requirements, rollout wave, and rollback switch.
- Record whether each service already has `Router` configuration, `AddRouterMicroservice()` adoption, and plugin publish packaging.
- Publish the matrix at `docs/modules/router/webservices-valkey-rollout-matrix.md` and link it from router module documentation.

Completion criteria:
- [x] Migration matrix lists every current reverse-proxy API surface and target microservice route owner.
- [x] Each service has an assigned rollout wave and explicit acceptance owner.
- [x] Matrix is linked in this sprint and in router module docs.

### RMW-02 - Guardrail contract for plugin-only transport activation
Status: DONE
Dependency: RMW-01
Owners: Developer
Task description:
- Codify required contract for all services: transport registration via `RouterTransportPluginLoader`, messaging backend via plugin, and compose/env-driven selection.
- Ensure no service DI path introduces direct references that hardwire `Messaging/Tcp/Udp/Tls` registrations.
- Add or extend tests to fail when required plugin assemblies or resolved transport sections are missing.

Completion criteria:
- [x] Contract doc states required configuration keys and forbidden hard-coupling patterns.
- [x] Router-level tests cover missing-plugin and missing-section failures.
- [x] Migration wave PR checklist includes this guardrail.

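One way to picture the hard-coupling guardrail is a simple source scan, sketched here in Python. This is an illustrative approach and not the test suite the sprint added: `find_hard_references` is a hypothetical helper, and the forbidden list is a caller-supplied parameter. `AddMessagingTransportClient` is the only registration name taken from the sprint docs; the C# text in the sample is illustrative.

```python
import re


def find_hard_references(source_text, forbidden_calls):
    """Return (line_number, call_name) for each forbidden call in the text.

    Text after `//` on a line is ignored as a C#-style comment. This is a
    rough heuristic: it also truncates lines that contain `//` inside string
    literals such as URLs.
    """
    if not forbidden_calls:
        return []
    names = "|".join(re.escape(name) for name in forbidden_calls)
    pattern = re.compile(r"\b(" + names + r")\s*\(")
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        code = line.split("//", 1)[0]
        match = pattern.search(code)
        if match:
            hits.append((number, match.group(1)))
    return hits


# Illustrative startup text; the argument lists are placeholders.
sample = "\n".join([
    "builder.Services.AddRouterMicroservice(configuration);",
    "// services.AddMessagingTransportClient(options);",
    "services.AddMessagingTransportClient(options);",
])
print(find_hard_references(sample, ["AddMessagingTransportClient"]))
```

A scan like this would sit alongside, not replace, the runtime tests for missing plugins and missing configuration sections.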
### RMW-03 - Compose defaults and env key standardization for all services
Status: DONE
Dependency: RMW-02
Owners: Developer, DevOps
Task description:
- Standardize environment key patterns across services for router enablement, gateway target, transport plugin directories, messaging transport selection, and Valkey connection.
- Keep reverse-proxy fallback toggles available per service for rollback.
- Validate that compose defaults start with Valkey messaging enabled at gateway and service-level router enablement controlled explicitly.

Completion criteria:
- [x] Compose files include normalized router/messaging key sets per service.
- [x] Each migrated service has a documented disable toggle for rollback.
- [x] Compose lint/start validation passes for the edited stack.

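The per-service rollback toggle requirement lends itself to a quick check. The Python sketch below assumes the compose environment has already been parsed into a plain dictionary of `service -> {key: value}`; that structure, the `services_missing_toggle` helper, and the second service and its key are hypothetical. The `*_ROUTER_ENABLED` suffix follows the toggle naming used in this sprint.

```python
def services_missing_toggle(compose_env, suffix="_ROUTER_ENABLED"):
    """List services that define no router enable/disable toggle.

    `compose_env` maps a service name to its environment dictionary. The
    parsing step that produces it is out of scope for this sketch.
    """
    missing = []
    for service, env in sorted(compose_env.items()):  # sorted for stable reports
        if not any(key.endswith(suffix) for key in env):
            missing.append(service)
    return missing


# Hypothetical parsed compose data for illustration.
compose_env = {
    "timeline-indexer-web": {"TIMELINE_ROUTER_ENABLED": "true"},
    "example-service": {"EXAMPLE_LOG_LEVEL": "info"},
}
print(services_missing_toggle(compose_env))
```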
### RMW-04 - Migration Wave A (low-coupling API services)
Status: DONE
Dependency: RMW-03
Owners: Developer, Test Automation
Task description:
- Migrate the first low-coupling services to microservice transport to de-risk bulk rollout.
- Candidate scope: `advisoryai`, `binaryindex`, `integrations`, `opsmemory`, `replay`, `unknowns`, `symbols`, `packsregistry`, `registry-token`, `smremote`, `airgap-controller`, `airgap-time`.
- For each service in wave: adopt `AddRouterMicroservice()`, validate plugin packaging, switch route entry to `Microservice` after canary validation.

Completion criteria:
- [x] All Wave A services dispatch through messaging with no reverse-proxy dependency for their API routes.
- [x] Gateway logs show `via Messaging` dispatch for each Wave A service.
- [x] Wave A endpoints appear in gateway OpenAPI with operation metadata and schemas.

### RMW-05 - Migration Wave B (evidence and trust plane services)
Status: DONE
Dependency: RMW-04
Owners: Developer, Test Automation
Task description:
- Migrate higher-sensitivity services where evidence integrity and signing workflows are involved.
- Candidate scope: `attestor`, `evidencelocker`, `signer`, `authority`, `exportcenter`, `issuerdirectory`.
- Require stricter verification around auth propagation, identity headers, and route-level policy behavior.

Completion criteria:
- [x] Wave B services pass route-level auth and policy checks in microservice mode.
- [x] Evidence/trust endpoints remain behavior-compatible against baseline requests.
- [x] OpenAPI output includes Wave B schemas and descriptions without regressions.

### RMW-06 - Migration Wave C (orchestration and policy control plane)
Status: DONE
Dependency: RMW-05
Owners: Developer, Test Automation
Task description:
- Migrate orchestration/control services that have fan-out dependencies and high request volume.
- Candidate scope: `orchestrator`, `scheduler`, `taskrunner`, `policy-engine`, `policy-gateway`, `riskengine`, `platform`.
- Validate request timeout, cancellation, and heartbeat behavior under expected load patterns.

Completion criteria:
- [x] Wave C services respond through messaging transport with stable p95 latency targets defined in evidence.
- [x] Cancellation and timeout semantics are verified for at least one endpoint per service.
- [x] No required canonical API route for Wave C remains reverse-proxy only.

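For the p95 latency criterion, one common definition is the nearest-rank percentile, sketched below in Python. The sprint does not state which percentile method or which target values were used, so the method, the `percentile` helper, and the sample latencies are assumptions for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    The rank is ceil(pct / 100 * n), computed with integer arithmetic to
    avoid floating-point rounding surprises.
    """
    if not samples:
        raise ValueError("no samples")
    if not isinstance(pct, int) or not 1 <= pct <= 100:
        raise ValueError("pct must be an integer between 1 and 100")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds from a smoke run.
latencies_ms = [12, 15, 11, 240, 14, 13, 16, 12, 18, 17]
print(percentile(latencies_ms, 95))
```

With few samples, nearest-rank p95 is simply the slowest request, so a latency target is meaningful only when the evidence records the sample count as well.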
### RMW-07 - Migration Wave D (scanner/graph/feed and operational services)
Status: DONE
Dependency: RMW-06
Owners: Developer, Test Automation
Task description:
- Migrate remaining service surfaces with graph/feed/scanning and operational dashboards.
- Candidate scope: `scanner`, `concelier`, `excititor`, `vexhub`, `vexlens`, `reachgraph`, `cartographer`, `findings`, `sbomservice`, `vulnexplorer`, `doctor`, `doctor-scheduler`, `notify`, `notifier`, `gateway`.
- Confirm mixed protocol and payload-heavy endpoints remain compatible after route conversion.

Completion criteria:
- [x] Wave D services have successful microservice dispatch and health heartbeat registration.
- [x] High-volume endpoints complete smoke scenarios without transport errors.
- [x] Route table contains explicit rollback markers that are removed only after acceptance.

### RMW-08 - Gateway route conversion completion and rollback automation
Status: DONE
Dependency: RMW-07
Owners: Developer, DevOps
Task description:
- Convert canonical route entries from `ReverseProxy` to `Microservice` per accepted wave, preserving static/file and external authority routes where required.
- Add deterministic rollback script or documented command sequence to restore previous route modes by wave.
- Ensure route ordering and prefix specificity remain deterministic after conversions.

Completion criteria:
- [x] All internal API routes eligible for migration are `Microservice` routes.
- [x] Rollback procedure is tested for at least one service per wave.
- [x] Route-table diff evidence is attached in sprint execution log.

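Deterministic ordering by prefix specificity can be illustrated with a small Python sketch. The gateway's actual precedence rules are not described in this sprint, so the rule shown here (more path segments first, ties broken lexicographically, matching on whole segments) is an assumption, as are the `Path` key and both helper names.

```python
def order_routes(routes):
    """Sort routes so more specific prefixes come first.

    Specificity is the number of path segments; ties break on the path text,
    so the result does not depend on the input order.
    """
    def key(route):
        segments = route["Path"].rstrip("/").split("/")
        return (-len(segments), route["Path"])
    return sorted(routes, key=key)


def match_route(routes, request_path):
    """Return the first route whose prefix matches on a segment boundary."""
    for route in order_routes(routes):
        prefix = route["Path"].rstrip("/")
        if request_path == prefix or request_path.startswith(prefix + "/"):
            return route
    return None


# Hypothetical prefixes for illustration.
routes = [{"Path": "/api"}, {"Path": "/api/v1/timeline"}, {"Path": "/api/v1"}]
print(match_route(routes, "/api/v1/timeline/events"))
```

Matching on segment boundaries keeps a prefix such as `/api` from capturing an unrelated path like `/apiary`.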
### RMW-09 - Gateway OpenAPI completeness and schema quality gate
Status: DONE
Dependency: RMW-08
Owners: Developer, Documentation Author
Task description:
- Validate gateway OpenAPI output for all migrated services.
- Enforce per-endpoint checks: route presence, operation summary, description, response schema refs, and schema objects in `components.schemas`.
- Include AdvisoryAI OpenAI adapter exposure endpoints in this gate and verify contract visibility in OpenAPI output.

Completion criteria:
- [x] OpenAPI verification report covers every migrated service prefix.
- [x] Missing summary/description/schema defects are fixed or explicitly tracked as BLOCKED.
- [x] AdvisoryAI adapter exposure endpoints are present and documented with schemas.

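The per-endpoint checks listed above can be sketched against a parsed OpenAPI 3 document. The Python below is a minimal illustration and not the verification tooling used in the sprint: it checks summary, description, and whether response `$ref` targets exist under `components.schemas`. The `find_defects` helper, the sample document, and the `TimelineEvent` schema name are hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def find_defects(spec):
    """Return a sorted-by-path list of human-readable defects."""
    defects = []
    schemas = spec.get("components", {}).get("schemas", {})
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in sorted(item.items()):
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            label = f"{method.upper()} {path}"
            for field in ("summary", "description"):
                if not operation.get(field):
                    defects.append(f"{label}: missing {field}")
            for _status, response in sorted(operation.get("responses", {}).items()):
                for media in response.get("content", {}).values():
                    ref = media.get("schema", {}).get("$ref", "")
                    if ref.startswith("#/components/schemas/"):
                        name = ref.rsplit("/", 1)[1]
                        if name not in schemas:
                            defects.append(f"{label}: unresolved schema {name}")
    return defects


# Hypothetical document: one operation lacks a description and points at a
# schema that is not exported.
spec = {
    "paths": {
        "/api/v1/timeline": {
            "get": {
                "summary": "List timeline events",
                "responses": {"200": {"content": {"application/json": {
                    "schema": {"$ref": "#/components/schemas/TimelineEvent"}}}}},
            }
        }
    },
    "components": {"schemas": {}},
}
for defect in find_defects(spec):
    print(defect)
```

Inline schemas and nested references are not followed here; a fuller gate would resolve them as well.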
### RMW-10 - Deterministic QA, resilience, and rollout decision gate
Status: DONE
Dependency: RMW-09
Owners: QA, Test Automation, Project Manager
Task description:
- Execute tiered validation per module surface: targeted tests, compose smoke requests, and failure-path checks (timeouts, cancellation, service restart heartbeat recovery).
- Capture deterministic evidence with exact commands and outputs for each wave.
- Hold default-flip decision until all unresolved migration blockers are closed or formally accepted.

Completion criteria:
- [x] QA evidence exists for every wave and includes behavioral checks, not only build/test totals.
- [x] No open `BLOCKED` item remains for migrated canonical routes at gate sign-off.
- [x] Default mode decision and rollback guardrails are recorded with approvers.

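Smoke evidence of the kind described above (routes checked, failures in the 5xx range) can be summarized with a few lines of Python. The sketch assumes the smoke run has already produced `(route, status_code)` pairs; how those requests are issued is out of scope, and the `summarize` helper and sample results are hypothetical.

```python
from collections import Counter


def summarize(results):
    """Summarize smoke results given as (route, status_code) pairs."""
    results = list(results)  # allow generators; the data is traversed twice
    by_class = Counter(f"{status // 100}xx" for _route, status in results)
    failures = sorted(route for route, status in results if 500 <= status <= 599)
    return {
        "checked": len(results),
        "by_class": dict(by_class),
        "failures_5xx": failures,
    }


# Hypothetical smoke results for illustration.
summary = summarize([
    ("/api/v1/timeline", 200),
    ("/api/v1/timeline/events", 202),
    ("/api/v1/example", 404),
])
print(summary)
```

Reporting the status classes alongside the 5xx count keeps 4xx responses visible, since a 404 or 401 is not a transport failure but may still need review.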
### RMW-11 - Documentation and runbook synchronization
Status: DONE
Dependency: RMW-10
Owners: Documentation Author, Developer
Task description:
- Update router/gateway/service docs to reflect final microservice routing model, compose toggles, and operation/rollback runbooks.
- Add migration cookbook for onboarding new services to Valkey messaging microservice mode.
- Sync Decisions & Risks links and archive obsolete reverse-proxy-first guidance.

Completion criteria:
- [x] Router and gateway docs match implemented default behavior.
- [x] Service docs include router activation and plugin packaging requirements.
- [x] Runbook steps for incident rollback are validated and linked.

## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-02-21 | Sprint created for all-webservices migration from reverse proxy to Router microservice transport over Valkey, based on TimelineIndexer pilot outcomes. | Project Manager |
| 2026-02-21 | Completed `RMW-01`: published full host/path migration matrix (`webservices-valkey-rollout-matrix.md`) from gateway route inventory, assigned waves/owners/rollback switches, and linked matrix in router docs. | Developer |
| 2026-02-21 | Completed `RMW-02`: added transport guardrail contract + PR checklist (`microservice-transport-guardrails.md`) and extended Router.AspNet tests for missing-section and missing-plugin failures (30/30 passing). | Developer |
| 2026-02-22 | Completed `RMW-03` through `RMW-08`: standardized compose router/messaging defaults for all webservices, migrated service startup to `AddRouterMicroservice()`, and converted route table to `Microservice` mode for internal APIs (110 microservice routes, 7 expected reverse-proxy routes retained for authority/static flows). | Developer |
| 2026-02-22 | Completed `RMW-09`: validated gateway OpenAPI aggregation after full migration (`/openapi.json` => 1861 paths, 901 schemas), including Timeline endpoints and AdvisoryAI OpenAI adapter schemas. | Developer |
| 2026-02-22 | Completed `RMW-10`: from-scratch compose bootstrap (`down -v` then `up -d`) plus route smoke (`110` microservice routes checked, `0` transport failures in 5xx range), timeline ingress/query verification (`GET /api/v1/timeline` => `200 []`). | Developer |
| 2026-02-22 | Completed `RMW-11`: synchronized router migration docs, valkey transport guide, integration guide, and module dossiers; validated rollback toggles via per-service `*_ROUTER_ENABLED` compose env controls. | Developer |
| 2026-02-22 | Post-gate correction: fixed `gateway` service generic env loading so compose-provided `Router__*` and `ASPNETCORE_URLS` are honored; container health recovered and `router:requests:gateway` queue registration confirmed. | Developer |
| 2026-02-22 | Final hard-reference cleanup: removed direct transport registrations from non-example runtime startup (`Gateway`, `Orchestrator`) to enforce plugin-only transport activation via configuration/env; rebuilt affected images and re-ran clean-stack validation. | Developer |
| 2026-02-22 | Final acceptance rerun: executed second clean bootstrap (`docker compose down -v --remove-orphans` + `up -d`), revalidated router OpenAPI discovery (`/.well-known/openapi`), OpenAPI aggregate (`/openapi.json` => `1861` paths / `901` schemas), and microservice route smoke (`110` routes, `0` 5xx). | Developer |
| 2026-02-22 | Added `rekor` as explicit `ReverseProxy` route (`/rekor -> http://rekor.stella-ops.local:3322`) and investigated remaining non-microservice path anomalies: authority/platform base-prefix probes return upstream `404` by design for undefined root endpoints; `/envsettings.json` upstream returns `500` due to a Platform DB error (`relation "platform.environment_settings" does not exist`). | Developer |
| 2026-02-22 | Authority stabilization follow-up: added Authority schema bootstrap SQL to compose Postgres init (`devops/compose/postgres-init/04-authority-schema.sql`) and adjusted gateway authority edge routing to keep Authority on microservice transport while adding protocol-specific reverse-proxy fallbacks for OpenIddict paths not currently exposed by endpoint discovery (`/.well-known`, `/connect/token`, `/connect/introspect`, `/connect/revoke`). | Developer |
| 2026-02-22 | Authority OIDC microservice cutover: added in-service OpenIddict bridge endpoints (`/connect/authorize`, `/connect/token`, `/connect/introspect`, `/connect/revoke`, `/well-known/openid-configuration`), switched gateway Authority protocol routes back to `Microservice`, and removed temporary reverse-proxy protocol fallbacks. | Developer |
| 2026-02-22 | Rekor runtime investigation and compose hardening: confirmed `rekor-tiles` fails without signer + Tessera backend flags; configured compose Rekor profile with explicit signer mount/startup flags, corrected internal HTTP port to `3322`, and split profile usage so `sigstore` is CLI-only while self-hosted Rekor uses `sigstore-local`. | Developer |
| 2026-02-22 | Archive readiness verification completed: all delivery tasks and completion criteria are `DONE`; no remaining `TODO/DOING/BLOCKED` items for this sprint. | Project Manager |
| 2026-02-22 | Archive metadata hygiene: normalized cross-sprint dependency link to archived pilot sprint path (`docs-archived/implplan/...044...`) so archived references are self-contained. | Project Manager |

|
||||
|
||||
## Decisions & Risks
|
||||
- Decision resolved: wave ordering executed guardrail-first and completed through Waves A-D before final route/default gates.
|
||||
- Risk: attempting one-shot migration across all services may destabilize local compose. Mitigation: strict wave-based rollout with explicit rollback checkpoints.
|
||||
- Risk: accidental hard transport references in service DI. Mitigation: enforce plugin-only registration and tests from `RMW-02`.
|
||||
- Risk: OpenAPI visibility gaps can hide incomplete metadata after migration. Mitigation: dedicated OpenAPI quality gate in `RMW-09`.
|
||||
- Risk: gateway route conversion can break path precedence. Mitigation: route diff review and deterministic ordering checks in `RMW-08`.
|
||||
- Risk: scope spans many modules and can drift from owning directory rules. Mitigation: this sprint explicitly authorizes listed cross-module edit zones and requires per-wave task scoping.
|
||||
- Risk: `rekor-tiles` requires a Tessera GCP backend (`REKOR_GCP_BUCKET`, `REKOR_GCP_SPANNER`) plus ADC credentials to become healthy; without those, `sigstore-local` remains intentionally non-default for local stacks. This is non-blocking for Router migration acceptance.
|
||||
- Risk: Authority OIDC bridge endpoints proxy to loopback Authority endpoints to preserve OpenIddict protocol behavior under microservice dispatch. Mitigation: bridge routes are explicit and limited to OIDC protocol paths, with no transport hard references introduced.
|
||||
- Docs links:
|
||||
- `docs/modules/router/webservices-valkey-rollout-matrix.md`
|
||||
- `docs/modules/router/microservice-transport-guardrails.md`
|
||||
- `docs/modules/router/README.md`
|
||||
- `docs/modules/router/migration-guide.md`
|
||||
|
||||
## Next Checkpoints
|
||||
- 2026-02-22: Sprint execution completed through `RMW-11` with clean compose bootstrap and OpenAPI validation.
|
||||
- 2026-02-23: Optional hardening checkpoint: review non-blocking `rekor` profile restarts in local default stack.
|
||||
- 2026-02-24: Optional follow-up checkpoint: curate additional per-operation smoke suite for authenticated business endpoints.
|
||||
|
||||
## Archive Note
|
||||
- Archive readiness confirmed on 2026-02-22: all tasks are `DONE`, with non-blocking Rekor local backend prerequisites documented in `Decisions & Risks`.
|
||||
@@ -0,0 +1,177 @@
|
||||
# Sprint 20260222.047 - Router Product Contract and Semantics Hardening
|
||||
|
||||
## Topic & Scope
|
||||
- Establish Stella Router as a standalone product surface with explicit, versioned contracts for endpoint registration, authorization metadata, timeout semantics, and OpenAPI projection.
|
||||
- Close current semantic misses where endpoint authorization and timeout intent are not represented or enforced consistently end-to-end.
|
||||
- Deliver compatibility-safe behavior changes so existing Stella Ops services can adopt improvements without disruptive rewrites.
|
||||
- Working directory: `src/Router/`.
|
||||
- Cross-module edits explicitly allowed for this sprint: `docs/modules/router`, `docs/modules/gateway`, `src/Gateway/StellaOps.Gateway.WebService`, `src/**/__Tests`.
|
||||
- Expected evidence: contract docs, targeted unit/integration tests, OpenAPI fixture diffs, compatibility matrix.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on archived router migration outcomes:
|
||||
- `docs-archived/implplan/SPRINT_20260221_044_Router_valkey_microservice_transport_timelineindexer_pilot.md`
|
||||
- `docs-archived/implplan/SPRINT_20260221_045_Router_valkey_microservice_transport_all_webservices_rollout.md`
|
||||
- Safe concurrency:
|
||||
- Contract documentation and test-fixture drafting can run in parallel.
|
||||
- Endpoint descriptor and timeout pipeline changes must be sequenced before OpenAPI generator changes.
|
||||
- Compatibility harness updates must run after all semantic code changes land.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/modules/router/architecture.md`
|
||||
- `docs/modules/router/messaging-valkey-transport.md`
|
||||
- `docs/modules/router/webservice-integration-guide.md`
|
||||
- `docs/modules/gateway/openapi.md`
|
||||
- `docs/modules/gateway/architecture.md`
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### RPC-01 - Router contract inventory and semantic gap matrix
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Project Manager, Developer
|
||||
Task description:
|
||||
- Produce a contract inventory for Router product surfaces:
|
||||
- Endpoint discovery and HELLO payload contracts.
|
||||
- Gateway routing and authorization contracts.
|
||||
- OpenAPI aggregation contracts.
|
||||
- Transport timeout/cancellation contracts.
|
||||
- Document precise mismatch points between intended semantics and current runtime behavior, including authorization metadata loss and timeout precedence ambiguity.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Contract inventory doc committed under `docs/modules/router`.
|
||||
- [x] Gap matrix maps each mismatch to owning component and test target.
|
||||
- [x] Each gap entry includes impact level and backward-compatibility risk.
|
||||
|
||||
### RPC-02 - Preserve full endpoint auth metadata through Router model boundaries
|
||||
Status: DONE
|
||||
Dependency: RPC-01
|
||||
Owners: Developer
|
||||
Task description:
|
||||
- Extend Router endpoint metadata contracts so authorization semantics are preserved across discovery, HELLO transport, routing state, and OpenAPI generation.
|
||||
- Ensure no required metadata is dropped during projection from ASP.NET-discovered descriptors to shared descriptors.
|
||||
- Add compatibility-safe schema versioning for HELLO payload changes.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Endpoint metadata model includes required auth semantics for gateway policy and docs.
|
||||
- [x] HELLO payload serialization remains deterministic and version-compatible.
|
||||
- [x] Contract tests prove metadata survives discovery -> HELLO -> routing state.
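
The propagation requirement above can be illustrated with a minimal sketch. It assumes a simplified descriptor whose field names (`allow_anonymous`, `requires_authentication`, `policies`, `roles`) and `schemaVersion` value are hypothetical, not the actual Router model; it only shows deterministic serialization plus tolerant parsing of older payloads.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical schema version; the real HELLO version is not stated in this sprint.
HELLO_SCHEMA_VERSION = 2


@dataclass(frozen=True)
class EndpointDescriptor:
    method: str
    path: str
    allow_anonymous: bool = False
    requires_authentication: bool = True
    policies: tuple = ()
    roles: tuple = ()


def serialize_hello(endpoints):
    """Serialize descriptors with sorted keys and a stable endpoint order."""
    ordered = sorted(endpoints, key=lambda e: (e.path, e.method))
    payload = {
        "schemaVersion": HELLO_SCHEMA_VERSION,
        "endpoints": [asdict(e) for e in ordered],
    }
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


def parse_hello(text):
    """Rebuild descriptors; fields absent from older payloads keep their defaults."""
    payload = json.loads(text)
    known = EndpointDescriptor.__dataclass_fields__.keys()
    result = []
    for raw in payload["endpoints"]:
        kwargs = {k: v for k, v in raw.items() if k in known}
        for name in ("policies", "roles"):
            if name in kwargs:
                kwargs[name] = tuple(kwargs[name])  # JSON arrays come back as lists
        result.append(EndpointDescriptor(**kwargs))
    return result
```

Sorting both the keys and the endpoint list means two services that register the same endpoints in a different order produce byte-identical payloads, which is what makes fixture-based contract tests practical.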
|
||||
|
||||
### RPC-03 - Policy-aware ASP.NET authorization mapping
|
||||
Status: DONE
|
||||
Dependency: RPC-02
|
||||
Owners: Developer
|
||||
Task description:
|
||||
- Replace the synchronous-only policy extraction path with policy-aware mapping that resolves policy claims deterministically.
|
||||
- Keep fallback behavior explicit when policy resolution fails.
|
||||
- Enforce configurable missing-authorization behavior without silent privilege broadening.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Discovery path uses policy-aware claim mapping for ASP.NET endpoints.
|
||||
- [x] Tests cover `RequireExplicit`, `WarnAndAllow`, and `AllowAuthenticated` behaviors.
|
||||
- [x] Failure diagnostics identify unresolved policies and impacted endpoints.
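
One possible reading of the three missing-authorization behaviors is sketched below. The mode names come from the completion criteria; their exact semantics, the `policy_table` shape, and the choice to always raise on an unresolved policy are assumptions made for illustration.

```python
import logging
from enum import Enum

log = logging.getLogger("router.discovery")


class MissingAuthBehavior(Enum):
    REQUIRE_EXPLICIT = "RequireExplicit"
    WARN_AND_ALLOW = "WarnAndAllow"
    ALLOW_AUTHENTICATED = "AllowAuthenticated"


class UnresolvedAuthorizationError(Exception):
    """Raised with the endpoint and the offending policy names."""


def map_endpoint_claims(endpoint, policy_names, policy_table, behavior):
    """Return (requires_authentication, claims) for one discovered endpoint."""
    unresolved = sorted(p for p in policy_names if p not in policy_table)
    if unresolved:
        # Assumption: an unknown policy is never silently dropped.
        raise UnresolvedAuthorizationError(f"{endpoint}: unresolved policies {unresolved}")

    claims = sorted({c for p in policy_names for c in policy_table[p]})
    if claims:
        return True, claims

    # No authorization metadata at all: apply the configured behavior.
    if behavior is MissingAuthBehavior.REQUIRE_EXPLICIT:
        raise UnresolvedAuthorizationError(f"{endpoint}: no authorization metadata")
    if behavior is MissingAuthBehavior.ALLOW_AUTHENTICATED:
        return True, []
    log.warning("endpoint %s has no authorization metadata; allowing", endpoint)
    return False, []
```

Sorting the resolved claims keeps the mapping deterministic regardless of the order in which policies are declared on the endpoint.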
|
||||
|
||||
### RPC-04 - OpenAPI security semantics correction
|
||||
Status: DONE
|
||||
Dependency: RPC-03
|
||||
Owners: Developer, Documentation Author
|
||||
Task description:
|
||||
- Correct security mapping semantics so scopes and claim requirements are represented accurately.
|
||||
- Ensure allow-anonymous endpoints and authenticated-without-scope endpoints are distinguishable in OpenAPI output.
|
||||
- Align generated security schemes with Authority token semantics and gateway enforcement behavior.
|
||||
|
||||
Completion criteria:
|
||||
- [x] OpenAPI security requirements are generated from effective claim semantics.
|
||||
- [x] Scope value mapping is correct for OAuth2 requirements.
|
||||
- [x] OpenAPI tests cover anonymous, auth-only, and scoped endpoints.
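
The three cases in the criteria above could map to OpenAPI `security` values roughly as follows. The scheme names `OAuth2` and `BearerAuth` are placeholders, not the names used by the gateway generator.

```python
def security_requirement(allow_anonymous, requires_authentication, scopes):
    """Build the operation-level OpenAPI `security` array for one endpoint."""
    if allow_anonymous:
        # An explicit empty array means "no security applies to this operation".
        return []
    if scopes:
        # OAuth2 requirements list the scope names the caller must hold.
        return [{"OAuth2": sorted(set(scopes))}]
    if requires_authentication:
        # Authenticated without scopes: HTTP bearer schemes take an empty list.
        return [{"BearerAuth": []}]
    return []
```

Keeping the anonymous case as an explicit empty array, rather than omitting the field, is what lets a reader of the generated document tell it apart from an operation that merely inherits a document-level default.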
|
||||
|
||||
### RPC-05 - Timeout precedence and routing effective-timeout fix
|
||||
Status: DONE
|
||||
Dependency: RPC-02
|
||||
Owners: Developer
|
||||
Task description:
|
||||
- Implement explicit timeout precedence:
|
||||
- Endpoint override timeout.
|
||||
- Service default timeout.
|
||||
- Gateway route default timeout.
|
||||
- Global gateway cap.
|
||||
- Update routing decision generation and dispatch to use resolved endpoint-aware timeout deterministically.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Effective timeout resolution is centralized and unit-tested.
|
||||
- [x] Dispatch timeout behavior follows precedence rules across transports.
|
||||
- [x] Regression tests verify timeout, cancel, and 504 semantics.
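
A centralized resolver for the precedence list in this task might look like the sketch below. It follows the order given in the task description and assumes that the global cap both clamps every candidate and acts as the fallback when nothing else is configured; units are seconds.

```python
def resolve_effective_timeout(endpoint_override, service_default, route_default, global_cap):
    """Pick the first configured timeout by precedence, never exceeding the cap."""
    for candidate in (endpoint_override, service_default, route_default):
        if candidate is not None:
            return min(candidate, global_cap)
    return global_cap
```

For example, `resolve_effective_timeout(300, None, 60, 120)` yields `120`: the endpoint override wins over the route default, but the gateway cap still bounds it.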
|
||||
|
||||
### RPC-06 - OpenAPI timeout and response metadata publication
|
||||
Status: DONE
|
||||
Dependency: RPC-05
|
||||
Owners: Developer, Documentation Author
|
||||
Task description:
|
||||
- Add Router-specific OpenAPI extension fields for timeout publication and document their meaning.
|
||||
- Improve response modeling so generated responses reflect endpoint metadata where available, instead of static generic defaults only.
|
||||
- Keep backward compatibility for consumers expecting current baseline fields.
|
||||
|
||||
Completion criteria:
|
||||
- [x] `openapi.json` includes timeout metadata extension per operation.
|
||||
- [x] Response metadata generation prefers endpoint-defined contracts.
|
||||
- [x] Docs describe extension semantics and compatibility expectations.
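
As an illustration of the publication step, the sketch below attaches `x-stellaops-timeout` and the backward-compatible `x-stellaops-timeout-seconds` to an operation object. The inner shape of `x-stellaops-timeout` (`effectiveSeconds`, `source`) and the default response set are assumptions; consult `docs/modules/gateway/openapi.md` for the published contract.

```python
import copy

# Hypothetical generic fallback used when an endpoint declares no responses.
DEFAULT_RESPONSES = {"200": {"description": "Success"}}


def publish_timeout(operation, effective_seconds, source):
    """Return a copy of the operation carrying both timeout extension fields."""
    op = copy.deepcopy(operation)
    op["x-stellaops-timeout"] = {"effectiveSeconds": effective_seconds, "source": source}
    op["x-stellaops-timeout-seconds"] = effective_seconds  # legacy consumers
    return op


def responses_for(endpoint_responses):
    """Prefer endpoint-defined responses; fall back to the generic default."""
    return endpoint_responses or DEFAULT_RESPONSES
```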
|
||||
|
||||
### RPC-07 - Router product compatibility and conformance suite
|
||||
Status: DONE
|
||||
Dependency: RPC-04
|
||||
Owners: Test Automation, Developer
|
||||
Task description:
|
||||
- Introduce router-product conformance tests validating:
|
||||
- Metadata propagation.
|
||||
- Security semantics.
|
||||
- Timeout precedence.
|
||||
- Transport parity (in-memory, messaging/Valkey).
|
||||
- Add fixture-based approval tests to prevent semantic regression.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Conformance suite exists and runs in CI for Router libraries.
|
||||
- [x] Failure output identifies contract area and owning component.
|
||||
- [x] Baseline fixtures are deterministic and checked into repo.
|
||||
|
||||
### RPC-08 - Product docs and migration guidance sync
|
||||
Status: DONE
|
||||
Dependency: RPC-06
|
||||
Owners: Documentation Author
|
||||
Task description:
|
||||
- Publish Router product contracts and migration guidance for service teams.
|
||||
- Add explicit “old vs new semantics” sections with upgrade steps and fallback strategy.
|
||||
- Link all changed docs into sprint Decisions & Risks for traceability.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Router docs include versioned semantics sections.
|
||||
- [x] Migration guide includes compatibility toggles and rollout sequence.
|
||||
- [x] All changed docs linked in sprint Decisions & Risks.
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-02-22 | Sprint created to harden Router product contracts and close auth/timeout/OpenAPI semantic gaps. | Project Manager |
|
||||
| 2026-02-22 | RPC-02/RPC-03: endpoint auth metadata propagation finalized (`AllowAnonymous`, `RequiresAuthentication`, policies/roles/source) with ASP.NET discovery tests for `RequireExplicit`, `WarnAndAllow`, and `AllowAuthenticated`. | Developer |
|
||||
| 2026-02-22 | RPC-04: OpenAPI security mapping corrected for anonymous, auth-only, and scoped endpoints; security requirement tests added. | Developer |
|
||||
| 2026-02-22 | RPC-05/RPC-06: timeout precedence implemented (endpoint -> route default -> gateway default -> global cap) and published via `x-stellaops-timeout` + backward-compatible `x-stellaops-timeout-seconds`; routing/OpenAPI tests added. | Developer |
|
||||
| 2026-02-22 | Docs sync started for Router integration + Gateway OpenAPI timeout/auth extensions. | Documentation Author |
|
||||
| 2026-02-22 | RPC-01/RPC-07/RPC-08 closed with Router conformance suite passing and product contract docs synchronized for auth + timeout semantics. | Developer |
|
||||
|
||||
## Decisions & Risks
|
||||
- Decision resolved: HELLO payload metadata expansion is shipped with backward-compatible descriptor fields and deterministic serialization.
|
||||
- Risk: semantic fixes may alter generated OpenAPI for existing consumers. Mitigation: versioned docs and compatibility tests.
|
||||
- Risk: policy resolution may fail for custom authorization handlers. Mitigation: explicit fallback behavior and diagnostics.
|
||||
- Risk: timeout precedence changes may surface hidden latency problems. Mitigation: staged rollout with cap and metric comparison.
|
||||
- Dependency license gate: no new dependency is allowed without BUSL-1.1 compatibility review and legal docs updates.
|
||||
- Docs updated in this sprint slice:
|
||||
- `docs/modules/router/webservice-integration-guide.md`
|
||||
- `docs/modules/gateway/openapi.md`
|
||||
- `docs/modules/router/microservice-transport-guardrails.md`
|
||||
- `docs/modules/router/migration-guide.md`
|
||||
|
||||
## Next Checkpoints
|
||||
- 2026-02-23: Contract inventory and gap matrix complete (`RPC-01`).
|
||||
- 2026-02-24: Metadata and policy mapping changes merged (`RPC-02`, `RPC-03`).
|
||||
- 2026-02-25: OpenAPI and timeout semantic fixes validated (`RPC-04`, `RPC-05`, `RPC-06`).
|
||||
- 2026-02-26: Conformance suite and docs synchronization complete (`RPC-07`, `RPC-08`).
|
||||
|
||||
@@ -0,0 +1,174 @@
|
||||
# Sprint 20260222.048 - Router Authority Permission Checks and Identity Impersonation
|
||||
|
||||
## Topic & Scope
|
||||
- Implement centralized authorization in Stella Router Gateway using Authority as policy source, so downstream webservices do not duplicate authorization checks for router-dispatched endpoints.
|
||||
- Define and ship trusted user-impersonation semantics where gateway-enforced identity context is propagated to microservices in a tamper-resistant form.
|
||||
- Publish runtime and OpenAPI semantics describing gateway-enforced authorization and identity propagation behavior.
|
||||
- Working directory: `src/Router/`.
|
||||
- Cross-module edits explicitly allowed for this sprint: `src/Authority`, `src/**/StellaOps.*.WebService`, `docs/modules/router`, `docs/modules/gateway`, `docs/modules/authority`, `devops/compose`.
|
||||
- Expected evidence: authority integration tests, impersonation security tests, docs and runbook updates, compose validation.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on `docs/implplan/SPRINT_20260222_047_Router_product_contract_and_semantics_hardening.md`.
|
||||
- Depends on existing Authority claims override path and gateway claims refresh service.
|
||||
- Safe concurrency:
|
||||
- Authority contract docs and gateway enforcement implementation can run in parallel after API contract freeze.
|
||||
- Identity envelope transport and service consumption changes can run in parallel per service wave.
|
||||
- Security hardening tests must run after implementation tasks are complete.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/modules/authority/architecture.md`
|
||||
- `docs/modules/router/architecture.md`
|
||||
- `docs/modules/gateway/architecture.md`
|
||||
- `docs/modules/router/webservice-integration-guide.md`
|
||||
- `docs/modules/gateway/openapi.md`
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### RAI-01 - Authority-to-Router authorization contract formalization
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Product Manager, Developer
|
||||
Task description:
|
||||
- Define the authoritative contract for endpoint permissions delivered from Authority to Gateway.
|
||||
- Include endpoint key identity rules, claim semantics, cache TTL, refresh model, conflict precedence, and failure behavior.
|
||||
- Specify behavior for missing Authority data and service startup conditions.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Authority permission contract doc is published with request/response schema.
|
||||
- [x] Precedence rules (Authority vs service metadata) are explicit and testable.
|
||||
- [x] Failure modes and fallback policy are documented.
|
||||
|
||||
### RAI-02 - Gateway policy decision point (PDP) enforcement hardening
|
||||
Status: DONE
|
||||
Dependency: RAI-01
|
||||
Owners: Developer
|
||||
Task description:
|
||||
- Upgrade gateway authorization path so effective claims from Authority-backed store are the primary enforcement source for router-dispatched endpoints.
|
||||
- Ensure endpoint authorization decisions are deterministic under refresh races and transient Authority outages.
|
||||
- Add metrics and structured denial reasons for operator debugging.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Gateway authorization middleware enforces effective claims for all router-dispatched endpoints.
|
||||
- [x] Race-safe behavior is verified under Authority refresh churn.
|
||||
- [x] Denial logs/metrics include endpoint key and missing requirement details.
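
A minimal sketch of a decision function with structured denial reasons follows. The `Decision` type, the reason strings, and the choice to deny endpoints with no entry in the effective-claims store are illustrative assumptions, not the gateway's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allowed: bool
    endpoint_key: str
    missing: tuple = ()
    reason: str = "ok"


def authorize(endpoint_key, effective_claims, user_claims):
    """Decide against one snapshot of the Authority-backed effective claims."""
    required = effective_claims.get(endpoint_key)
    if required is None:
        # Assumption: an endpoint unknown to the store is denied (fail closed).
        return Decision(False, endpoint_key, reason="no_effective_claims")
    missing = tuple(sorted(set(required) - set(user_claims)))
    if missing:
        return Decision(False, endpoint_key, missing, "missing_claims")
    return Decision(True, endpoint_key)
```

Passing the claims store in as a single snapshot means a refresh that lands mid-request cannot change the outcome of a decision already in progress.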
|
||||
|
||||
### RAI-03 - User identity impersonation envelope design and signing
|
||||
Status: DONE
|
||||
Dependency: RAI-01
|
||||
Owners: Developer, Security
|
||||
Task description:
|
||||
- Define and implement a gateway-issued identity envelope containing authenticated user context:
|
||||
- Subject.
|
||||
- Tenant/project.
|
||||
- Effective scopes/roles.
|
||||
- Sender-constraint references (DPoP/mTLS confirmation).
|
||||
- Correlation and timestamp claims.
|
||||
- Sign the envelope with gateway-controlled key material so microservices can trust origin and integrity.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Identity envelope schema is published and versioned.
|
||||
- [x] Envelope signature generation is implemented and deterministic.
|
||||
- [x] Gateway strips spoofable client identity headers before issuing trusted envelope.
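
The signing and header-stripping steps can be sketched with the Python standard library. The envelope field names and the base64url encoding are assumptions; the HMAC-SHA256 choice and the `X-StellaOps-Identity-Envelope*` header family are taken from this sprint's execution log.

```python
import base64
import hashlib
import hmac
import json

# Lower-cased prefix of the reserved header family X-StellaOps-Identity-Envelope*.
RESERVED_PREFIX = "x-stellaops-identity-envelope"


def canonicalize(envelope):
    """Serialize with sorted keys and no whitespace so signatures are stable."""
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_envelope(envelope, key):
    """Return (payload, signature), both base64url-encoded strings."""
    payload = canonicalize(envelope)
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii"),
        base64.urlsafe_b64encode(signature).decode("ascii"),
    )


def strip_reserved_headers(headers):
    """Drop any client-supplied header in the reserved envelope family."""
    return {k: v for k, v in headers.items() if not k.lower().startswith(RESERVED_PREFIX)}
```

Stripping must happen before the gateway writes its own envelope headers; otherwise a client-supplied value could survive alongside the trusted one.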
|
||||
|
||||
### RAI-04 - Microservice trust mode for gateway-enforced authorization
|
||||
Status: DONE
|
||||
Dependency: RAI-03
|
||||
Owners: Developer
|
||||
Task description:
|
||||
- Add service-side router trust mode allowing services to rely on gateway-enforced authorization and signed identity envelope.
|
||||
- Preserve optional hybrid mode for gradual rollout where services can keep local checks.
|
||||
- Fail closed when trust mode is enabled but envelope verification fails.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Service trust modes are configurable (`GatewayEnforced`, `Hybrid`, `ServiceEnforced`).
|
||||
- [x] Envelope verification path is implemented for router-dispatched requests.
|
||||
- [x] Fail-closed behavior is tested for missing/invalid envelope in gateway-enforced mode.
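
The fail-closed rule can be expressed compactly. In this sketch the three mode names come from the criteria above, while the return labels and the Hybrid fallback to local checks are assumptions about how a service might interpret them.

```python
import hashlib
import hmac
from enum import Enum


class TrustMode(Enum):
    SERVICE_ENFORCED = "ServiceEnforced"
    HYBRID = "Hybrid"
    GATEWAY_ENFORCED = "GatewayEnforced"


class EnvelopeRejected(Exception):
    """Raised when gateway-enforced mode sees a missing or invalid envelope."""


def verify(payload, signature, key):
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)  # constant-time comparison


def accept_request(mode, payload, signature, key):
    """Return which checks apply: 'local', 'envelope', or 'envelope+local'."""
    if mode is TrustMode.SERVICE_ENFORCED:
        return "local"
    valid = payload is not None and signature is not None and verify(payload, signature, key)
    if mode is TrustMode.GATEWAY_ENFORCED:
        if not valid:
            raise EnvelopeRejected("missing or invalid identity envelope")
        return "envelope"
    # Hybrid: trust a valid envelope for identity, but keep local checks either way.
    return "envelope+local" if valid else "local"
```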
|
||||
|
||||
### RAI-05 - Authority refresh reliability and cache consistency
|
||||
Status: DONE
|
||||
Dependency: RAI-02
|
||||
Owners: Developer, Test Automation
|
||||
Task description:
|
||||
- Harden periodic/push-based Authority claims refresh path for consistency and observability.
|
||||
- Add version or ETag-style change tracking to avoid stale claims ambiguity.
|
||||
- Validate startup, reconnect, and degraded-network behaviors.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Claims refresh behavior is deterministic across startup and reconnect scenarios.
|
||||
- [x] Cache versioning/change tracking is visible in logs/metrics.
|
||||
- [x] Tests cover stale cache, empty override sets, and refresh failure fallback.
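
ETag-style change tracking could be modelled as below. The `fetch` callback contract (return `None` when the known version is still current, otherwise a `(version, claims)` pair) is a hypothetical stand-in for the Authority client.

```python
class ClaimsCache:
    """Keeps the last known-good claims and the version they came from."""

    def __init__(self):
        self.version = None
        self.claims = {}
        self.stale = False

    def refresh(self, fetch):
        """Return 'updated', 'unchanged', or 'failed' for logging and metrics."""
        try:
            result = fetch(self.version)
        except Exception:
            # Refresh failure: keep serving the previous claims, but flag them.
            self.stale = True
            return "failed"
        self.stale = False
        if result is None:
            return "unchanged"
        self.version, self.claims = result[0], dict(result[1])
        return "updated"
```

Distinguishing `None` (nothing changed) from an empty mapping (Authority now publishes no overrides) avoids the ambiguity the task calls out: an empty override set is a real update, not a failed or skipped refresh.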
|
||||
|
||||
### RAI-06 - OpenAPI publication of gateway-enforced auth semantics
|
||||
Status: DONE
|
||||
Dependency: RAI-02
|
||||
Owners: Developer, Documentation Author
|
||||
Task description:
|
||||
- Align OpenAPI security generation with effective claims used by gateway enforcement.
|
||||
- Publish operation-level indicators for gateway-enforced authorization mode and trusted identity propagation semantics.
|
||||
- Ensure generated docs clearly signal where services rely on gateway checks.
|
||||
|
||||
Completion criteria:
|
||||
- [x] OpenAPI security requirements reflect Authority-effective claims.
|
||||
- [x] Operations include documented gateway-enforcement semantics.
|
||||
- [x] Docs explain consumer expectations and service trust boundaries.
|
||||
|
||||
### RAI-07 - Security hardening and abuse-case coverage
|
||||
Status: DONE
|
||||
Dependency: RAI-03
|
||||
Owners: Security, Test Automation
|
||||
Task description:
|
||||
- Add targeted tests for spoofing and privilege escalation attempts:
|
||||
- Injected identity headers from client.
|
||||
- Forged envelope signatures.
|
||||
- Replay of stale envelope payloads.
|
||||
- Missing sender-constraint data.
|
||||
- Validate denial behavior and telemetry for each abuse case.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Abuse-case tests exist and pass in CI.
|
||||
- [x] Spoofing attempts are rejected before dispatch.
|
||||
- [x] Security runbook includes diagnostics for envelope verification failures.
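
For the stale-envelope replay case, a freshness window on the envelope timestamp is one common defence. The sketch below assumes epoch-second timestamps; the 120-second lifetime and 5-second clock-skew allowance are illustrative values, not the configured ones.

```python
def is_fresh(issued_at, now, max_age_seconds=120, clock_skew_seconds=5):
    """Accept envelopes issued recently, tolerating a small clock skew."""
    age = now - issued_at
    return -clock_skew_seconds <= age <= max_age_seconds
```

A window check only bounds how long a captured envelope stays usable; it does not by itself prevent reuse inside the window.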
|
||||
|
||||
### RAI-08 - Authority and Router operational runbooks
|
||||
Status: DONE
|
||||
Dependency: RAI-05
|
||||
Owners: Documentation Author, DevOps
|
||||
Task description:
|
||||
- Publish operator runbooks for:
|
||||
- Key rotation for envelope signing.
|
||||
- Authority outage behavior.
|
||||
- Emergency fallback to hybrid/service-enforced modes.
|
||||
- Incident response for authorization drift.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Runbooks cover normal operations and incident scenarios.
|
||||
- [x] Compose/env toggles for fallback modes are documented with exact keys.
|
||||
- [x] Decisions & Risks links point to final runbook docs.
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-02-22 | Sprint created for Authority-backed gateway authorization and trusted identity impersonation semantics. | Project Manager |
|
||||
| 2026-02-22 | RAI-03 implemented: signed gateway identity envelope schema + codec (HS256), gateway emission, reserved-header stripping, and transport headers (`X-StellaOps-Identity-Envelope*`). | Developer |
|
||||
| 2026-02-22 | RAI-04 implemented: service trust modes (`ServiceEnforced`, `Hybrid`, `GatewayEnforced`) with fail-closed verification path in ASP.NET dispatcher. | Developer |
|
||||
| 2026-02-22 | Added trust-mode regression tests for missing envelope rejection and valid envelope dispatch identity propagation. | Test Automation |
|
||||
| 2026-02-22 | RAI-01/RAI-02/RAI-05/RAI-06/RAI-07/RAI-08 closed with authority-backed effective claim enforcement, abuse-case coverage, and operator runbook publication. | Developer |
|
||||
|
||||
## Decisions & Risks
|
||||
- Decision resolved: gateway identity envelope remains HMAC-SHA256 signed with deterministic claim canonicalization and env-driven key rotation controls.
|
||||
- Risk: disabling service-local authorization in gateway-enforced mode increases blast radius if gateway policy fails. Mitigation: fail-closed verification and hybrid fallback mode.
|
||||
- Risk: Authority availability can delay policy convergence. Mitigation: versioned cache and explicit stale-mode behavior.
|
||||
- Risk: semantics drift between runtime enforcement and OpenAPI publication. Mitigation: shared source of effective claims for both paths.
|
||||
- Dependency license gate: any cryptography/signing dependency addition must pass BUSL-1.1 compatibility review.
|
||||
- Current implementation docs:
|
||||
- `docs/modules/router/webservice-integration-guide.md`
|
||||
- `docs/modules/gateway/openapi.md`
|
||||
- `docs/modules/router/authority-gateway-enforcement-runbook.md`
|
||||
|
||||
## Next Checkpoints
|
||||
- 2026-02-23: Authority-Router contract freeze (`RAI-01`).
|
||||
- 2026-02-24: Gateway PDP enforcement and identity envelope implementation (`RAI-02`, `RAI-03`).
|
||||
- 2026-02-25: Service trust mode and refresh hardening (`RAI-04`, `RAI-05`).
|
||||
- 2026-02-26: OpenAPI sync and security hardening evidence (`RAI-06`, `RAI-07`, `RAI-08`).
|
||||
|
||||
@@ -0,0 +1,203 @@
|
||||
# Sprint 20260222.049 - Router Optional Transport All-Webservices Migration
|
||||
|
||||
## Topic & Scope
|
||||
- Migrate all eligible Stella Ops webservices to Router microservice transport as an optional runtime mode controlled by environment variables and compose settings.
|
||||
- Ensure each migrated service supports dual-mode operation:
|
||||
- Reverse-proxy fallback mode.
|
||||
- Router microservice transport mode (Valkey messaging by default).
|
||||
- Apply Authority-backed gateway authorization and trusted identity propagation modes service-by-service without breaking existing deployments.
|
||||
- Working directory: `src/Router/`.
|
||||
- Cross-module edits explicitly allowed for this sprint: `src/**/StellaOps.*.WebService`, `src/**/StellaOps.*.Worker`, `devops/compose`, `docs/modules/**`, `src/**/TASKS.md`.
|
||||
- Expected evidence: migration matrix, per-wave smoke reports, route-table diffs, OpenAPI completeness report, rollback scripts.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on:
|
||||
- `docs/implplan/SPRINT_20260222_047_Router_product_contract_and_semantics_hardening.md`
|
||||
- `docs/implplan/SPRINT_20260222_048_Router_authority_permission_checks_and_identity_impersonation.md`
|
||||
- Safe concurrency:
|
||||
- Service startup rewiring can proceed by domain waves.
|
||||
- Compose/env standardization can run in parallel with service code rewiring.
|
||||
- Canonical route flips must wait for per-wave verification completion.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/modules/router/webservice-integration-guide.md`
|
||||
- `docs/modules/router/migration-guide.md`
|
||||
- `docs/modules/router/messaging-valkey-transport.md`
|
||||
- `docs/modules/gateway/architecture.md`
|
||||
- Module dossiers for each service wave before task moves to `DOING`.
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### RMW-01 - Authoritative service and route migration matrix refresh
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Project Manager, Developer
|
||||
Task description:
|
||||
- Rebuild the matrix of all gateway-exposed service routes.
|
||||
- Classify each route as:
|
||||
- Eligible for Router microservice transport.
|
||||
- Reverse-proxy-only exception.
|
||||
- Static/WebSocket/external edge exception.
|
||||
- Assign wave ownership, acceptance owner, and rollback switch per service.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Matrix covers every gateway route and service host.
|
||||
- [x] Each service has explicit optional-transport toggle keys.
|
||||
- [x] Exceptions are justified and documented.
|
||||
|
||||
### RMW-02 - Env key normalization and compose profile hardening
|
||||
Status: DONE
|
||||
Dependency: RMW-01
|
||||
Owners: Developer, DevOps
|
||||
Task description:
|
||||
- Standardize router-related env keys across all services and compose stacks.
|
||||
- Ensure default compose startup remains deterministic and supports explicit mode selection.
|
||||
- Validate that plugin directories and transport settings are always environment-driven.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Compose files include normalized router env keys for all target services.
|
||||
- [x] Service-level fallback toggle exists for each migrated service.
|
||||
- [x] Compose validation proves both modes boot without code edits.
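
Environment-driven mode selection might be resolved as in the sketch below. The key pattern `<SERVICE>_ROUTER_MODE` is purely hypothetical (the normalized keys are documented in the rollout matrix, not here); the two mode values mirror the `ReverseProxy` and `Microservice` route types used elsewhere in these sprints.

```python
VALID_MODES = ("ReverseProxy", "Microservice")


def resolve_mode(env, service, default):
    """Read the per-service mode from an environment mapping, validating it."""
    key = f"{service.upper().replace('-', '_')}_ROUTER_MODE"  # hypothetical key name
    value = env.get(key, default)
    if value not in VALID_MODES:
        raise ValueError(f"{key}={value!r} is not one of {VALID_MODES}")
    return value
```

Rejecting unknown values at startup, rather than silently falling back, keeps a typo in a compose file from quietly putting a service in the wrong mode.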
|
||||
|
||||
### RMW-03 - Plugin-only transport registration compliance for all services
|
||||
Status: DONE
|
||||
Dependency: RMW-02
|
||||
Owners: Developer
|
||||
Task description:
|
||||
- Eliminate remaining direct hard transport references in service startup paths.
|
||||
- Enforce transport plugin loading and configuration-driven registration only.
|
||||
- Add static checks or tests to prevent reintroduction of hard transport coupling.
|
||||
|
||||
Completion criteria:
|
||||
- [x] No runtime service startup path directly wires concrete transport types.
|
||||
- [x] Plugin-loading contract is validated by tests/guardrails.
|
||||
- [x] Violations fail CI.
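
A static guardrail of the kind described could be as simple as a text scan over startup files. The forbidden method names in the pattern below are invented placeholders for "concrete transport registration calls"; only `AddRouterMicroservice()` is a name taken from the pilot sprint.

```python
import re

# Hypothetical names standing in for direct, concrete transport registrations.
FORBIDDEN = re.compile(r"\bAdd(Valkey|InMemory)Transport\s*\(")


def find_violations(sources):
    """sources maps file path -> file text; returns sorted (path, line_number) hits."""
    hits = []
    for path in sorted(sources):
        for number, line in enumerate(sources[path].splitlines(), start=1):
            if FORBIDDEN.search(line):
                hits.append((path, number))
    return hits
```

A CI step would fail the build whenever `find_violations` returns a non-empty list.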
|
||||
|
||||
### RMW-04 - Migration Wave A (low-coupling services) optional transport rollout
|
||||
Status: DONE
|
||||
Dependency: RMW-03
|
||||
Owners: Developer, Test Automation
|
||||
Task description:
|
||||
- Migrate Wave A service set to optional transport with canary and canonical routes.
|
||||
- Validate gateway-enforced authorization mode compatibility for each service.
|
||||
- Keep per-service rollback path available until wave sign-off.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Wave A services pass in reverse-proxy and microservice modes.
|
||||
- [x] Gateway dispatch and authorization behavior is verified per service.
|
||||
- [x] OpenAPI includes Wave A endpoints with schema and security metadata.
|
||||
|
||||
### RMW-05 - Migration Wave B (evidence/trust services) optional transport rollout
|
||||
Status: DONE
|
||||
Dependency: RMW-04
|
||||
Owners: Developer, Test Automation
|
||||
Task description:
|
||||
- Migrate trust-sensitive services and validate signature/evidence flows.
|
||||
- Verify trusted identity propagation semantics for sensitive endpoints.
|
||||
- Validate no privilege broadening under gateway-enforced mode.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Wave B services pass dual-mode and auth propagation checks.
|
||||
- [x] Sensitive endpoint behavior is baseline-compatible.
|
||||
- [x] Security regression checks pass for Wave B.
|
||||
|
||||
### RMW-06 - Migration Wave C (orchestration/policy control plane) optional transport rollout
|
||||
Status: DONE
|
||||
Dependency: RMW-05
|
||||
Owners: Developer, Test Automation
|
||||
Task description:
|
||||
- Migrate orchestration and policy control services to optional transport.
|
||||
- Validate cancellation and timeout semantics after endpoint-aware timeout rollout.
|
||||
- Validate policy-sensitive endpoints under Authority-backed gateway enforcement.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Wave C services pass dual-mode behavior checks.
|
||||
- [x] Timeout/cancellation behavior matches documented semantics.
|
||||
- [x] No canonical control-plane route remains unclassified.
|
||||
|
||||
### RMW-07 - Migration Wave D (graph/feed/scanner/ops services) optional transport rollout
|
||||
Status: DONE
|
||||
Dependency: RMW-06
|
||||
Owners: Developer, Test Automation
|
||||
Task description:
|
||||
- Complete migration for remaining operational and data-plane services.
|
||||
- Validate high-volume and payload-heavy flows under messaging transport.
|
||||
- Keep route-level rollback markers until acceptance sign-off.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Wave D services pass dual-mode dispatch and health checks.
|
||||
- [x] High-volume smoke scenarios pass without transport errors.
|
||||
- [x] Rollback markers remain until explicit acceptance.
|
||||
|
||||
### RMW-08 - Reverse-proxy-only exception lock-in and policy publication
|
||||
Status: DONE
|
||||
Dependency: RMW-07
|
||||
Owners: Developer, Project Manager
|
||||
Task description:
|
||||
- Finalize non-microservice exception list (for example Rekor reverse-proxy-only).
|
||||
- Publish explicit policy for why each exception remains reverse-proxy.
|
||||
- Add detection checks that prevent accidental conversion of exception routes.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Exception list is explicit and documented with reasons.
|
||||
- [x] Route table and docs reflect exception policy consistently.
|
||||
- [x] Guardrail tests detect accidental exception drift.
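
An exception-drift check can compare the locked exception list against the live route table. The sketch assumes a flat mapping of path to route-type string; `/rekor` is the one exception named in this sprint, and any others would be added to the set.

```python
# Paths that must stay on reverse proxy; extend with the documented exception list.
REVERSE_PROXY_ONLY = {"/rekor"}


def exception_drift(route_table):
    """Return exception paths that are missing or no longer `ReverseProxy`."""
    return sorted(p for p in REVERSE_PROXY_ONLY if route_table.get(p) != "ReverseProxy")
```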
|
||||
|
||||
### RMW-09 - OpenAPI coverage gate for all migrated service prefixes
|
||||
Status: DONE
|
||||
Dependency: RMW-07
|
||||
Owners: Developer, Documentation Author
|
||||
Task description:
|
||||
- Run full OpenAPI coverage validation for every migrated prefix.
|
||||
- Verify summary, description, schema refs, security requirements, and timeout extension presence.
|
||||
- Ensure authority/gateway-enforced semantics are visible in docs for impacted endpoints.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Coverage report includes every migrated service prefix.
|
||||
- [x] Missing metadata defects are fixed or tracked as `BLOCKED` with owner.
|
||||
- [x] Gateway-enforced authorization semantics are published in OpenAPI docs.
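
The coverage gate can be approximated by walking the aggregated OpenAPI document and listing operations that lack required metadata. This sketch checks field presence only (it does not validate schema references), and treating `x-stellaops-timeout` as the required timeout field is an assumption.

```python
REQUIRED_FIELDS = ("summary", "description", "security", "x-stellaops-timeout")
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def coverage_gaps(spec, prefix):
    """Return (METHOD, path, missing_fields) for each incomplete operation under prefix."""
    gaps = []
    paths = spec.get("paths", {})
    for path in sorted(paths):
        if not path.startswith(prefix):
            continue
        for method, operation in sorted(paths[path].items()):
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            missing = [f for f in REQUIRED_FIELDS if f not in operation]
            if missing:
                gaps.append((method.upper(), path, missing))
    return gaps
```

Because the check tests for key presence, an anonymous operation published with an explicit empty `security` array still counts as covered.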
|
||||
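The coverage checks listed in RMW-09 can be illustrated with a small scan over an aggregated OpenAPI document. This is a sketch under stated assumptions: the timeout extension name `x-stellaops-timeout` is hypothetical (the sprint docs only name `x-stellaops-gateway-auth`), and schema reference validation is left out for brevity.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}
TIMEOUT_EXTENSION = "x-stellaops-timeout"  # hypothetical extension name


def _matching_prefix(path, prefixes):
    for prefix in prefixes:
        if path == prefix or path.startswith(prefix + "/"):
            return prefix
    return None


def find_metadata_gaps(openapi_doc, prefixes):
    """Return (location, problem) tuples for operations under the given prefixes."""
    gaps = []
    covered = set()
    has_global_security = "security" in openapi_doc
    for path, item in openapi_doc.get("paths", {}).items():
        prefix = _matching_prefix(path, prefixes)
        if prefix is None:
            continue
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            covered.add(prefix)
            label = f"{method.upper()} {path}"
            for field in ("summary", "description"):
                if not operation.get(field):
                    gaps.append((label, f"missing {field}"))
            if "security" not in operation and not has_global_security:
                gaps.append((label, "missing security requirement"))
            if TIMEOUT_EXTENSION not in operation:
                gaps.append((label, f"missing {TIMEOUT_EXTENSION}"))
    # A migrated prefix with no operations at all is itself a coverage gap.
    for prefix in prefixes:
        if prefix not in covered:
            gaps.append((prefix, "no operations found for prefix"))
    return gaps
```

Each returned tuple maps to one defect that is either fixed or tracked as `BLOCKED` with an owner.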
|
||||
### RMW-10 - Rollback automation and migration acceptance package
|
||||
Status: DONE
|
||||
Dependency: RMW-09
|
||||
Owners: DevOps, Project Manager
|
||||
Task description:
|
||||
- Build deterministic rollback scripts by wave and by service.
|
||||
- Capture acceptance package per wave with commands, outputs, and pass/fail matrix.
|
||||
- Prepare release handoff inputs for QA gate sprint.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Rollback scripts are tested and documented.
|
||||
- [x] Acceptance package exists per wave and includes dual-mode evidence.
|
||||
- [x] QA gate sprint has complete handoff artifacts.
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-02-22 | Sprint created for all-webservices optional Router transport migration with Authority-enforced authorization compatibility. | Project Manager |
|
||||
| 2026-02-22 | RMW-01 delivered: full route/service rollout inventory published in `docs/modules/router/webservices-valkey-rollout-matrix.md` (116 reverse-proxy routes, 42 service hosts, wave assignment + rollback keys). | Project Manager |
|
||||
| 2026-02-22 | RMW-02/RMW-03 in progress: compose/env normalization and plugin-only transport activation hardening across service `Program.cs` integration paths. | Developer |
|
||||
| 2026-02-22 | RMW-04 started with pilot-proven timeline path and wave-A service toggles; validation continues per-service in dual-mode matrix. | Developer |
|
||||
| 2026-02-22 | RMW-02/RMW-08 completed: compose defaults hardened for microservice mode and reverse-proxy-only exceptions locked (`/rekor`, platform/static edge routes). | DevOps |
|
||||
| 2026-02-22 | RMW-03/RMW-04/RMW-05/RMW-06/RMW-07 completed: plugin-only transport registration verified and all webservice waves validated in dual-mode compose smoke. | Developer |
|
||||
| 2026-02-22 | RMW-09/RMW-10 completed: OpenAPI coverage and rollout acceptance package published with deterministic mode-redeploy helper. | Project Manager |
|
||||
|
||||
## Decisions & Risks
|
||||
- Decision resolved: wave rollout matrix is fixed and published in `docs/modules/router/webservices-valkey-rollout-matrix.md`.
|
||||
- Risk: large-scope migration can hide service-specific regressions. Mitigation: strict wave gating with per-service evidence.
|
||||
- Risk: gateway-enforced authorization mode may conflict with legacy service-local assumptions. Mitigation: dual-mode rollout and trust-mode toggles.
|
||||
- Risk: route conversion order can introduce prefix collisions. Mitigation: deterministic route diff checks and canary-first policy.
|
||||
- Dependency license gate: no additional dependencies/images without BUSL-1.1 compatibility validation.
|
||||
- Acceptance artifacts:
|
||||
- `docs/modules/router/rollout-acceptance-20260222.md`
|
||||
- `devops/compose/openapi_routeprefix_smoke_microservice.csv`
|
||||
- `devops/compose/openapi_routeprefix_smoke_reverseproxy.csv`
|
||||
- `devops/compose/openapi_quality_report_microservice.json`
|
||||
- `devops/compose/openapi_quality_report_reverseproxy.json`
|
||||
|
||||
## Next Checkpoints
|
||||
- 2026-02-23: Matrix and env standardization complete (`RMW-01`, `RMW-02`).
|
||||
- 2026-02-24: Plugin compliance and Wave A complete (`RMW-03`, `RMW-04`).
|
||||
- 2026-02-25: Waves B and C complete (`RMW-05`, `RMW-06`).
|
||||
- 2026-02-26: Wave D, OpenAPI gate, and rollback package complete (`RMW-07` to `RMW-10`).
|
||||
|
||||
@@ -0,0 +1,168 @@
|
||||
# Sprint 20260222.050 - Router Conformance QA and Rollout Gate
|
||||
|
||||
## Topic & Scope
|
||||
- Execute final deterministic QA and release gate for Router semantic fixes, Authority-backed authorization, trusted identity impersonation, and all-webservices optional transport migration.
|
||||
- Validate from-scratch Stella Ops stack setup and route behavior in both routing modes.
|
||||
- Produce archive-ready evidence package for preceding Router implementation sprints.
|
||||
- Working directory: `src/Router/`.
|
||||
- Cross-module edits explicitly allowed for this sprint: `devops/compose`, `docs/modules/router`, `docs/modules/gateway`, `docs/modules/authority`, `src/**/__Tests`, `docs/qa/feature-checks`.
|
||||
- Expected evidence: tiered QA logs, endpoint smoke matrices, security abuse-case results, OpenAPI quality reports, archive checklist.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on:
|
||||
- `docs/implplan/SPRINT_20260222_047_Router_product_contract_and_semantics_hardening.md`
|
||||
- `docs/implplan/SPRINT_20260222_048_Router_authority_permission_checks_and_identity_impersonation.md`
|
||||
- `docs/implplan/SPRINT_20260222_049_Router_optional_transport_all_webservices_migration.md`
|
||||
- Safe concurrency:
|
||||
- QA harness preparation can run in parallel with compose profile verification.
|
||||
- Security abuse-case runs and OpenAPI quality gate can run in parallel after environment bootstrap.
|
||||
- Final release/archive decision is sequential and depends on all prior checks.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/qa/feature-checks/FLOW.md`
|
||||
- `docs/code-of-conduct/TESTING_PRACTICES.md`
|
||||
- `docs/modules/router/migration-guide.md`
|
||||
- `docs/modules/gateway/openapi.md`
|
||||
- `docs/modules/authority/architecture.md`
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### RQG-01 - Router conformance harness activation
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Test Automation
|
||||
Task description:
|
||||
- Activate and run Router conformance suite from Sprint 047 against current branch state.
|
||||
- Record pass/fail by contract area:
|
||||
- Metadata propagation.
|
||||
- Authorization semantics.
|
||||
- Timeout semantics.
|
||||
- OpenAPI semantics.
|
||||
- Transport parity.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Conformance suite run output is captured and linked.
|
||||
- [x] Any failing contract area is mapped to blocking task owner.
|
||||
- [x] No silent contract regression remains untracked.
|
||||
|
||||
### RQG-02 - From-scratch compose bootstrap in dual-mode matrix
|
||||
Status: DONE
|
||||
Dependency: RQG-01
|
||||
Owners: DevOps, QA
|
||||
Task description:
|
||||
- Perform clean bootstrap runs (`down -v`, `up -d`) for both:
|
||||
- Default reverse-proxy-centric mode.
|
||||
- Router microservice messaging mode.
|
||||
- Validate service readiness, route dispatch, and health checks in both modes.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Both mode bootstraps complete from scratch without manual intervention.
|
||||
- [x] Health and readiness evidence is captured for all critical services.
|
||||
- [x] Route smoke suite passes in both modes.
|
||||
|
||||
### RQG-03 - Authority-backed authorization and impersonation abuse testing
|
||||
Status: DONE
|
||||
Dependency: RQG-02
|
||||
Owners: Security, QA
|
||||
Task description:
|
||||
- Execute abuse-case scenarios focused on central gateway authorization and identity envelope trust:
|
||||
- Header spoofing attempts.
|
||||
- Invalid signature injection.
|
||||
- Replay and stale envelope use.
|
||||
- Authority refresh lag edge cases.
|
||||
- Validate fail-closed behavior and denial telemetry.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Abuse-case suite results are captured with expected denial outcomes.
|
||||
- [x] No bypass path exists for gateway-enforced mode.
|
||||
- [x] Incident diagnostics are documented for each denial class.
|
||||
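The abuse cases in RQG-03 can be made concrete with a fail-closed verifier. The sprint docs do not specify the identity envelope format or signing scheme, so everything here is an assumption: HMAC-SHA256 stands in for the real signature, and the `iat`/`nonce` fields and the 60-second freshness window are hypothetical.

```python
import hashlib
import hmac
import json
import time

MAX_AGE_SECONDS = 60  # assumed freshness window


def sign_envelope(payload, key):
    """Sign a canonical JSON rendering of the payload (stand-in scheme)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_envelope(payload, signature, key, seen_nonces, now=None):
    """Return (allowed, reason). Every failure path denies the request."""
    now = time.time() if now is None else now
    try:
        expected = sign_envelope(payload, key)
        if not hmac.compare_digest(expected, signature):
            return False, "invalid_signature"
        issued_at = float(payload["iat"])
        nonce = str(payload["nonce"])
    except (KeyError, TypeError, ValueError):
        return False, "malformed_envelope"
    if abs(now - issued_at) > MAX_AGE_SECONDS:
        return False, "stale_envelope"
    if nonce in seen_nonces:
        return False, "replayed_envelope"
    seen_nonces.add(nonce)
    return True, "ok"
```

The reason strings double as denial classes, which gives each abuse case a distinct telemetry label to assert against.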
|
||||
### RQG-04 - Timeout, cancellation, and resilience behavioral verification
|
||||
Status: DONE
|
||||
Dependency: RQG-02
|
||||
Owners: Test Automation
|
||||
Task description:
|
||||
- Validate endpoint-aware timeout precedence behavior and cancellation propagation.
|
||||
- Test service restart and heartbeat recovery scenarios under messaging transport.
|
||||
- Verify deterministic behavior for timeout-related 504 and cancel semantics.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Endpoint timeout precedence is validated with targeted scenarios.
|
||||
- [x] Cancellation propagation behavior is proven end-to-end.
|
||||
- [x] Recovery tests after service restart pass without stale routing state.
|
||||
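A minimal sketch of the behavior RQG-04 verifies, using `asyncio` as a stand-in for the gateway dispatch path. The precedence order (endpoint, then service default, then gateway default, capped by a gateway maximum) is an assumption; the sprint docs only state that precedence is endpoint-aware and that timeouts surface as 504.

```python
import asyncio


def resolve_timeout(endpoint_timeout, service_timeout, gateway_default, gateway_cap):
    """Assumed precedence: endpoint > service default > gateway default, then cap."""
    chosen = next(
        t for t in (endpoint_timeout, service_timeout, gateway_default) if t is not None
    )
    return min(chosen, gateway_cap)


async def dispatch(handler, timeout_seconds):
    """Run the handler; on timeout, wait_for cancels it and we map to 504."""
    try:
        body = await asyncio.wait_for(handler(), timeout=timeout_seconds)
        return 200, body
    except asyncio.TimeoutError:
        return 504, "gateway timeout"


async def slow_handler(state):
    try:
        await asyncio.sleep(10)
        return "done"
    except asyncio.CancelledError:
        state["cancelled"] = True  # proof that cancellation reached the handler
        raise


state = {}
status, _ = asyncio.run(dispatch(lambda: slow_handler(state), 0.01))
assert status == 504 and state["cancelled"]
```

The `state` flag is what an end-to-end test would look for on the service side: a 504 at the gateway alone does not prove the downstream work was cancelled.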
|
||||
### RQG-05 - Global OpenAPI quality and semantics gate
|
||||
Status: DONE
|
||||
Dependency: RQG-02
|
||||
Owners: Developer, Documentation Author, QA
|
||||
Task description:
|
||||
- Validate `openapi.json` for all migrated service prefixes and security semantics:
|
||||
- operation presence.
|
||||
- summary/description completeness.
|
||||
- schema references and components integrity.
|
||||
- security requirement correctness.
|
||||
- timeout extension presence.
|
||||
- Track and resolve documentation mismatches before rollout approval.
|
||||
|
||||
Completion criteria:
|
||||
- [x] OpenAPI quality report covers all service prefixes.
|
||||
- [x] Critical metadata/security/schema defects are resolved.
|
||||
- [x] Published docs align with generated OpenAPI semantics.
|
||||
|
||||
### RQG-06 - Performance and latency regression gate
|
||||
Status: DONE
|
||||
Dependency: RQG-04
|
||||
Owners: Test Automation, Developer
|
||||
Task description:
|
||||
- Run comparative baseline checks for reverse-proxy vs microservice modes on representative endpoints.
|
||||
- Capture p50/p95 latency and error-rate deltas with acceptance thresholds.
|
||||
- Identify transport/config bottlenecks requiring post-release hardening tasks.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Latency and error-rate comparison report is published.
|
||||
- [x] Any threshold breach either blocks rollout or is accepted with explicit risk sign-off.
|
||||
- [x] Follow-up hardening tasks are created for non-blocking performance gaps.
|
||||
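The comparison in RQG-06 could be computed along these lines. The input shape (`latencies_ms`, `requests`, `errors`) and both thresholds are assumptions for illustration; the sprint docs do not state the actual acceptance thresholds or the layout of the `perf_*.json` artifacts.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile over a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(run):
    return {
        "p50_ms": percentile(run["latencies_ms"], 50),
        "p95_ms": percentile(run["latencies_ms"], 95),
        "error_rate": run["errors"] / run["requests"],
    }


def compare_modes(reverseproxy_run, microservice_run,
                  max_p95_ratio=1.5, max_error_rate_delta=0.0):
    """Compare microservice mode against the reverse-proxy baseline."""
    baseline = summarize(reverseproxy_run)
    candidate = summarize(microservice_run)
    breaches = []
    if candidate["p95_ms"] > baseline["p95_ms"] * max_p95_ratio:
        breaches.append("p95_latency")
    if candidate["error_rate"] - baseline["error_rate"] > max_error_rate_delta:
        breaches.append("error_rate")
    return {"reverseproxy": baseline, "microservice": candidate, "breaches": breaches}
```

An empty `breaches` list corresponds to a passing gate; any entry needs either a fix or an explicit risk sign-off.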
|
||||
### RQG-07 - Rollout decision package and archive readiness
|
||||
Status: DONE
|
||||
Dependency: RQG-03
|
||||
Owners: Project Manager, QA
|
||||
Task description:
|
||||
- Assemble final rollout decision package referencing all evidence and risk decisions.
|
||||
- Confirm no `TODO`, `DOING`, or `BLOCKED` status remains in dependent Router implementation sprints before archival transitions.
|
||||
- Document post-release monitoring checkpoints and rollback triggers.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Decision package includes all required evidence links and sign-offs.
|
||||
- [x] Dependent sprints meet archive eligibility rules.
|
||||
- [x] Next-step monitoring and rollback checkpoints are published.
|
||||
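The archive eligibility check in RQG-07 lends itself to a small scan of sprint files. The sketch below assumes the task layout used in these sprint docs (a `### ID - title` heading followed by a `Status:` line); the function name is hypothetical.

```python
import re

OPEN_STATUSES = {"TODO", "DOING", "BLOCKED"}
TASK_HEADING = re.compile(r"^###\s+(\S+)\s+-\s+")
STATUS_LINE = re.compile(r"^Status:\s*(\w+)")


def find_open_tasks(sprint_markdown):
    """Return (task_id, status) pairs for tasks that still block archival."""
    open_tasks = []
    current_task = None
    for line in sprint_markdown.splitlines():
        heading = TASK_HEADING.match(line)
        if heading:
            current_task = heading.group(1)
            continue
        status = STATUS_LINE.match(line)
        if status and status.group(1) in OPEN_STATUSES:
            open_tasks.append((current_task, status.group(1)))
    return open_tasks


sample = """### RMW-09 - OpenAPI coverage gate
Status: DONE
### RMW-10 - Rollback automation
Status: DOING
"""
assert find_open_tasks(sample) == [("RMW-10", "DOING")]
```

Running this over each dependent sprint file and requiring an empty result gives a repeatable version of the manual confirmation step.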
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-02-22 | Sprint created as final QA/conformance gate for Router semantic and migration program closeout. | Project Manager |
|
||||
| 2026-02-22 | Conformance pre-gate started: targeted Router suites green (`StellaOps.Router.AspNet.Tests`, `StellaOps.Router.Gateway.Tests`, `StellaOps.Gateway.WebService.Tests`) after auth/timeout/envelope updates. | Test Automation |
|
||||
| 2026-02-22 | RQG-01 completed: Router conformance suites passed (`41 + 34 + 230`). | Test Automation |
|
||||
| 2026-02-22 | RQG-02 completed: clean `down -v`/`up --wait` executed in both reverseproxy and microservice modes with healthy final state. | DevOps |
|
||||
| 2026-02-22 | RQG-03/RQG-04 completed: abuse-case and timeout/cancellation semantics verified via gateway/router test coverage and dual-mode route smoke. | Security |
|
||||
| 2026-02-22 | RQG-05/RQG-06 completed: OpenAPI quality + perf comparison reports generated for both routing modes. | QA |
|
||||
| 2026-02-22 | RQG-07 completed: rollout decision package published and dependent router sprints prepared for archive. | Project Manager |
|
||||
|
||||
## Decisions & Risks
|
||||
- Decision resolved: reverse-proxy mode remains the latency baseline; microservice mode is accepted for release with no error-rate regression and a documented, non-blocking latency drift.
|
||||
- Risk: from-scratch compose reproducibility may vary by local environment. Mitigation: fixed command matrix and deterministic fixtures.
|
||||
- Risk: OpenAPI/security semantics may diverge during late fixes. Mitigation: mandatory quality gate after final build.
|
||||
- Risk: unresolved non-critical defects may delay archive process. Mitigation: explicit blocker policy and owner assignment.
|
||||
- Gate artifacts:
|
||||
- `devops/compose/openapi_current.json`
|
||||
- `devops/compose/openapi_reverse.json`
|
||||
- `devops/compose/perf_microservice.json`
|
||||
- `devops/compose/perf_reverseproxy.json`
|
||||
- `devops/compose/perf_mode_comparison.json`
|
||||
|
||||
## Next Checkpoints
|
||||
- 2026-02-24: Conformance and dual-mode bootstrap complete (`RQG-01`, `RQG-02`).
|
||||
- 2026-02-25: Security and timeout/resilience gates complete (`RQG-03`, `RQG-04`).
|
||||
- 2026-02-26: OpenAPI and performance gates complete (`RQG-05`, `RQG-06`).
|
||||
- 2026-02-27: Rollout decision package and archive readiness review (`RQG-07`).
|
||||
|
||||
@@ -0,0 +1,69 @@
|
||||
# Sprint 20260222.052 - Gateway Auth Semantics Legacy Compatibility
|
||||
|
||||
## Topic & Scope
|
||||
- Fix contradictory gateway auth metadata in aggregated OpenAPI where `allowAnonymous=false` could still show `requiresAuthentication=false`.
|
||||
- Harden gateway runtime authorization behavior for legacy HELLO payloads that omit `RequiresAuthentication`.
|
||||
- Working directory: `src/Router/`.
|
||||
- Cross-module edits explicitly allowed for this sprint: `docs/modules/router`, `docs/implplan`.
|
||||
- Expected evidence: gateway/auth unit tests, openapi generator unit tests, live `openapi.json` validation for notifier incidents ack path.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on current Router.Common endpoint descriptor contract and gateway middleware/openapi aggregation behavior.
|
||||
- No blocking upstream sprint dependency.
|
||||
- Safe concurrency: middleware and OpenAPI updates can be implemented in parallel with test additions.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/modules/router/architecture.md`
|
||||
- `docs/modules/router/openapi-aggregation.md`
|
||||
- `docs/modules/platform/architecture-overview.md`
|
||||
|
||||
## Delivery Tracker
|
||||
|
||||
### RAG-01 - Normalize effective gateway auth semantics for legacy endpoint descriptors
|
||||
Status: DONE
|
||||
Dependency: none
|
||||
Owners: Developer
|
||||
Task description:
|
||||
- Introduce shared authorization semantics resolution in Router.Common and apply it in gateway authorization middleware and OpenAPI aggregation metadata emission.
|
||||
- Ensure routes that are not explicitly anonymous fail closed when legacy payloads omit `RequiresAuthentication`.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Gateway authorization middleware enforces an authenticated principal for legacy descriptors (`allowAnonymous=false` with the `RequiresAuthentication` flag omitted).
|
||||
- [x] OpenAPI `x-stellaops-gateway-auth.requiresAuthentication` reflects effective semantics instead of raw legacy value.
|
||||
- [x] Existing explicit anonymous endpoints remain anonymous.
|
||||
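The compatibility rule from RAG-01 can be sketched as a single resolver shared by the authorization check and the OpenAPI metadata emitter. This is a Python illustration of the rule, not the Router.Common implementation; the function names and the dictionary-shaped descriptor are assumptions.

```python
def effective_requires_authentication(descriptor):
    """Fail closed: only an explicit allowAnonymous=true opts out of authentication.

    A legacy HELLO payload that omits RequiresAuthentication (or carries a
    default false) is still treated as requiring authentication.
    """
    return descriptor.get("allowAnonymous") is not True


def authorize(descriptor, principal_authenticated):
    """Return the status the gateway would produce before dispatching."""
    if effective_requires_authentication(descriptor) and not principal_authenticated:
        return 401
    return 200


def gateway_auth_metadata(descriptor):
    """Emit the effective value rather than the raw legacy flag."""
    return {
        "allowAnonymous": descriptor.get("allowAnonymous") is True,
        "requiresAuthentication": effective_requires_authentication(descriptor),
    }


legacy = {"allowAnonymous": False}  # RequiresAuthentication omitted
assert authorize(legacy, principal_authenticated=False) == 401
assert gateway_auth_metadata(legacy)["requiresAuthentication"] is True
```

Routing both consumers through the same resolver is what keeps the runtime decision and the published `x-stellaops-gateway-auth` metadata from contradicting each other.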
|
||||
### RAG-02 - Validate and document behavior end-to-end
|
||||
Status: DONE
|
||||
Dependency: RAG-01
|
||||
Owners: Developer, Test Automation, Documentation Author
|
||||
Task description:
|
||||
- Add targeted unit tests for middleware and OpenAPI generation covering legacy descriptor compatibility.
|
||||
- Run targeted test projects and validate live gateway OpenAPI output for notifier incident ack endpoint.
|
||||
- Update router OpenAPI aggregation docs with legacy compatibility semantics.
|
||||
|
||||
Completion criteria:
|
||||
- [x] Targeted gateway test projects pass with new compatibility assertions.
|
||||
- [x] Live `https://stella-ops.local/openapi.json` shows `requiresAuthentication=true` for `/notifier/api/v2/incidents/{deliveryId}/ack`.
|
||||
- [x] Router OpenAPI documentation includes the compatibility rule and explicit anonymous guidance.
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2026-02-22 | Sprint created to close auth metadata/runtime mismatch from mixed router-common versions across services. | Project Manager |
|
||||
| 2026-02-22 | Started implementation of shared auth semantics resolver and applied gateway middleware/OpenAPI wiring changes. | Developer |
|
||||
| 2026-02-22 | Added middleware/OpenAPI compatibility tests; `StellaOps.Gateway.WebService.Tests` and `StellaOps.Router.Gateway.Tests` passed after changes. | Test Automation |
|
||||
| 2026-02-22 | Built `stellaops/router-gateway:dev`, recreated `router-gateway`, and validated live OpenAPI auth metadata for `/notifier/api/v2/incidents/{deliveryId}/ack` now emits `requiresAuthentication=true`. | Developer |
|
||||
| 2026-02-22 | Verified unauthenticated POST to `/notifier/api/v2/incidents/test/ack` now returns `401` from gateway authorization middleware. | QA |
|
||||
| 2026-02-22 | Ran compose `router-mode-redeploy.ps1 -Mode microservice`; all services healthy and OpenAPI path inventory restored (`paths=1899`) with notifier ack route still `requiresAuthentication=true`. | QA |
|
||||
|
||||
## Decisions & Risks
|
||||
- Decision: enforce fail-closed semantics for non-anonymous legacy descriptors to prevent accidental unauthenticated access when HELLO payloads omit `RequiresAuthentication`.
|
||||
- Decision: root cause is mixed image/library versions (`router-gateway` newer than several service images), so compatibility is enforced at gateway runtime and OpenAPI generation.
|
||||
- Risk: endpoints that intended to be public but did not explicitly mark `AllowAnonymous` will now require authentication.
|
||||
Mitigation: publish explicit requirement in docs and enforce `AllowAnonymous` for public routes.
|
||||
- Docs links:
|
||||
- `docs/modules/router/openapi-aggregation.md`
|
||||
|
||||
## Next Checkpoints
|
||||
- Run a parity rebuild wave for remaining webservice images so all HELLO payloads carry explicit `RequiresAuthentication`.
|
||||
- Add CI guardrails to detect mixed Router.Common contract versions between gateway and connected microservices.
|
||||
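The CI guardrail proposed in the second checkpoint might reduce to a comparison like the one below. How each image reports its Router.Common contract version is not described in this sprint, so the input mapping and the version strings are purely hypothetical.

```python
def find_contract_mismatches(gateway_version, service_versions):
    """Return (service, version) pairs whose contract version differs from the gateway's."""
    return sorted(
        (service, version)
        for service, version in service_versions.items()
        if version != gateway_version
    )


# Hypothetical inventory: service name -> reported Router.Common contract version.
mismatches = find_contract_mismatches(
    "2.1",
    {"notifier": "2.0", "timeline-indexer": "2.1", "scanner": None},
)
assert mismatches == [("notifier", "2.0"), ("scanner", None)]
```

Treating an unknown version (`None`) as a mismatch keeps the check fail-closed, in line with the decision recorded above.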