Add unit tests and logging infrastructure for InMemory and RabbitMQ transports

- Implemented RecordingLogger and RecordingLoggerFactory for capturing log entries in tests.
- Added unit tests for InMemoryChannel, covering constructor behavior, property assignments, channel communication, and disposal.
- Created InMemoryTransportOptionsTests to validate default values and customizable options for InMemory transport.
- Developed RabbitMqFrameProtocolTests to ensure correct parsing and property creation for RabbitMQ frames.
- Added RabbitMqTransportOptionsTests to verify default settings and customization options for RabbitMQ transport.
- Updated project files for testing libraries and dependencies.
StellaOps Bot
2025-12-05 09:38:45 +02:00
parent 6a299d231f
commit 53508ceccb
98 changed files with 10868 additions and 663 deletions

@@ -1,146 +1,8 @@
# Sprint 170 - Notifications & Telemetry
# Sprint 170 - Notifications & Telemetry (legacy stub)
> **BLOCKED Tasks:** Before working on BLOCKED tasks, review [BLOCKED_DEPENDENCY_TREE.md](./BLOCKED_DEPENDENCY_TREE.md) for root blockers and dependencies.
This sprint was normalized and renamed to `SPRINT_0170_0001_0001_notifications_telemetry.md` on 2025-11-19 and fully merged on 2025-12-05. Use the canonical file for status, risks, and logs.
Active items only. Completed/historic work now resides in docs/implplan/archived/tasks.md (updated 2025-11-08).
- For BLOCKED task handling, see `BLOCKED_DEPENDENCY_TREE.md`.
- Active backlog and evidence live in the canonical sprint file and the downstream Sprint 0171/0174 trackers.
This file now only tracks the notifications & telemetry status snapshot. Active backlog lives in Sprint 171+ files.
# Wave coordination
| Wave | Guild owners | Shared prerequisites | Status | Notes |
| --- | --- | --- | --- | --- |
| 170.A Notifier | Notifications Service Guild · Attestor Service Guild · Observability Guild | Sprint 150.A Orchestrator | **DONE (2025-12-04)** | All 14 tasks DONE (NOTIFY-GAPS-171-014 signed with dev key `notify-dev-hmac-001`; production HSM re-signing deferred). Tracked in `SPRINT_0171_0001_0001_notifier_i.md`. |
| 170.B Telemetry | Telemetry Core Guild · Observability Guild · Security Guild | Sprint 150.A Orchestrator | **DONE (2025-11-27)** | All 6 tasks complete (TELEMETRY-OBS-50-001 through 56-001). Tracked in `SPRINT_0174_0001_0001_telemetry.md`. |
# Sprint 170 - Notifications & Telemetry
## Wave 170.A Notifier readiness
### Scope & goals
- Deliver attestation/key-rotation alert templates plus routing so Attestor/Signer incidents surface immediately (NOTIFY-ATTEST-74-001/002).
- Refresh Notifier OpenAPI/SDK surface (`NOTIFY-OAS-61-001` → `NOTIFY-OAS-63-001`) so Console/CLI teams can self-serve the new endpoints.
- Wire SLO/incident inputs into rules (NOTIFY-OBS-51-001/55-001) and extend risk-profile routing (NOTIFY-RISK-66-001 → NOTIFY-RISK-68-001) without regressing quiet-hours/dedup.
- Preserve Offline Kit and documentation parity (NOTIFY-DOC-70-001 — done, NOTIFY-AIRGAP-56-002 — done) while adding the new rule surfaces.
### Entry criteria
- Orchestrator job attest events flowing to Notify bus (Sprint 150.A dependency) with test fixtures approved by Attestor Guild.
- Quiet-hours/digest backlog reconciled (no pending blockers in `docs/notifications/*.md`).
- Observability Guild sign-off on telemetry fields reused by Notifier SLO webhooks.
### Exit criteria
- All NOTIFY-ATTEST/OAS/OBS/RISK tasks in `SPRINT_171_notifier_i.md` moved to DONE with accompanying doc updates.
- Templates promoted to Offline Kit manifests and sample payloads stored under `docs/notifications/templates.md`.
- Incident mode notifications exercised in staging with audit logs + DSSE evidence attached.
### Task clusters & owners
| Cluster | Linked tasks | Owners | Status snapshot | Notes |
| --- | --- | --- | --- | --- |
| Attestation / key lifecycle alerts | NOTIFY-ATTEST-74-001/74-002 | Notifications Service Guild · Attestor Service Guild | TODO → DOING (prep) | Template scaffolding drafted; awaiting Rekor witness payload contract freeze. |
| API/OAS refresh & SDK parity | NOTIFY-OAS-61-001 → NOTIFY-OAS-63-001 | Notifications Service Guild · API Contracts Guild · SDK Generator Guild | TODO | Contract doc outline in review; SDK generator blocked on `/notifications/rules` schema finalize date (target 2025-11-15). |
| Observability-driven triggers | NOTIFY-OBS-51-001/55-001 | Notifications Service Guild · Observability Guild | TODO | Depends on Telemetry team exposing SLO webhook payload shape (see TELEMETRY-OBS-51-001). |
| Risk profile routing | NOTIFY-RISK-66-001 → NOTIFY-RISK-68-001 | Notifications Service Guild · Risk Engine Guild · Policy Guild | TODO | Requires Policy's risk profile metadata (POLICY-RISK-40-002) export; follow up in Sprint 175. |
| Docs & offline parity | NOTIFY-DOC-70-001, NOTIFY-AIRGAP-56-002 | Notifications Service Guild · DevOps Guild | DONE | Remains reference for GA checklists; keep untouched unless new surfaces appear. |
### Observability checkpoints
- Align metric names/labels with `docs/notifications/architecture.md#12-observability-prometheus--otel` before promoting new dashboards.
- Ensure Notifier spans/logs include tenant, ruleId, actionId, and `attestation_event_id` for attestation-triggered templates.
- Capture incident notification smoke tests via `ops/devops/telemetry/tenant_isolation_smoke.py` once Telemetry wave lands.
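
The span/log field checkpoint above could be exercised in a smoke test roughly as follows. This is a minimal Python sketch, not part of the repository: it assumes log or span records are available as dictionaries, and it reads the checkpoint as requiring `attestation_event_id` only for attestation-triggered templates. The function name `missing_notifier_fields` and the record shape are hypothetical; only the four field names come from the checklist.

```python
# Field names are taken from the checkpoint; everything else is illustrative.
REQUIRED_FIELDS = ("tenant", "ruleId", "actionId")
ATTESTATION_FIELD = "attestation_event_id"


def missing_notifier_fields(record: dict, attestation_triggered: bool = False) -> list[str]:
    """Return required field names that are absent or empty in a log/span record.

    Assumption: attestation_event_id is only mandatory when the record was
    produced by an attestation-triggered template.
    """
    required = list(REQUIRED_FIELDS)
    if attestation_triggered:
        required.append(ATTESTATION_FIELD)
    # A field counts as missing if the key is absent or its value is falsy (e.g. "").
    return [name for name in required if not record.get(name)]


if __name__ == "__main__":
    sample = {"tenant": "tenant-a", "ruleId": "rule-1", "actionId": "act-1"}
    print(missing_notifier_fields(sample, attestation_triggered=True))
```

A check like this could run against captured log entries in a test, failing when the returned list is non-empty.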
## Wave 170.B Telemetry bootstrap
### Scope & goals
- Ship `StellaOps.Telemetry.Core` bootstrap + propagation helpers (TELEMETRY-OBS-50-001/50-002).
- Provide golden-signal helpers + scrubbing/PII safety nets (TELEMETRY-OBS-51-001/51-002) so service teams can onboard without bespoke plumbing.
- Implement incident + sealed-mode toggles (TELEMETRY-OBS-55-001/56-001) and document the integration contract for Orchestrator, Policy, Task Runner, Gateway (`WEB-OBS-50-001`).
### Entry criteria
- Orchestrator + Policy hosts expose extension points for telemetry bootstrap (tracked via Sprint 150.A and IDs ORCH-OBS-50-001 / POLICY-OBS-50-001).
- Observability Guild reviewed storage footprint impacts for Prometheus/Tempo/Loki per module (docs/modules/telemetry/architecture.md §2).
- Security Guild signs off on redaction defaults + tenant override audit logging.
### Exit criteria
- Core library published to `/local-nugets` and referenced by at least Orchestrator & Policy in integration branches.
- Context propagation middleware validated through HTTP/gRPC/job smoke tests with deterministic trace IDs.
- Incident/sealed-mode toggles wired into CLI + Notify hooks (NOTIFY-OBS-55-001) with runbooks updated under `docs/notifications/architecture.md`.
### Task clusters & owners
| Cluster | Linked tasks | Owners | Status snapshot | Notes |
| --- | --- | --- | --- | --- |
| Bootstrap & propagation | TELEMETRY-OBS-50-001/50-002 | Telemetry Core Guild | TODO → DOING (scaffolding) | Collector profile templates staged; need service metadata detector + sample host integration PRs. |
| Metrics helpers + scrubbing | TELEMETRY-OBS-51-001/51-002 | Telemetry Core Guild · Observability Guild · Security Guild | TODO | Roslyn analyzer spec drafted; waiting on scrub policy from Security (POLICY-SEC-42-003). |
| Incident & sealed-mode controls | TELEMETRY-OBS-55-001/56-001 | Telemetry Core Guild · Observability Guild | TODO | Requires CLI toggle contract (CLI-OBS-12-001) and Notify incident payload spec (NOTIFY-OBS-55-001). |
### Tooling & validation
- Smoke: `ops/devops/telemetry/smoke_otel_collector.py` + `tenant_isolation_smoke.py` to run for each profile (default/forensic/airgap).
- Offline bundle packaging: `ops/devops/telemetry/package_offline_bundle.py` to include updated collectors, dashboards, manifest digests.
- Incident simulation: reuse `ops/devops/telemetry/generate_dev_tls.sh` for local collector certs during sealed-mode testing.
## Shared milestones & dependencies
| Target date | Milestone | Owners | Dependency notes |
| --- | --- | --- | --- |
| 2025-11-13 | Finalize attestation payload schema + template variables | Notifications Service Guild · Attestor Service Guild | Unblocks NOTIFY-ATTEST-74-001/002 + Telemetry incident span labels. |
| 2025-11-15 | Publish draft Notifier OAS + SDK snippets | Notifications Service Guild · API Contracts Guild | Required for CLI/UI adoption; prereq for NOTIFY-OAS-61/62 series. |
| 2025-11-18 | Land Telemetry.Core bootstrap sample in Orchestrator | Telemetry Core Guild · Orchestrator Guild | Demonstrates TELEMETRY-OBS-50-001 viability; prerequisite for Policy adoption + Notify SLO hooks. |
| 2025-11-20 | Incident/quiet-hour end-to-end rehearsal | Notifications Service Guild · Telemetry Core Guild · Observability Guild | Validates TELEMETRY-OBS-55-001 + NOTIFY-OBS-55-001 + CLI toggle contract. |
| 2025-11-22 | Offline kit bundle refresh (notifications + telemetry assets) | DevOps Guild · Notifications Service Guild · Telemetry Core Guild | Ensure docs/ops/offline-kit manifests reference new templates/configs. |
## Risks & mitigations
- **Telemetry data drift in sealed mode.** Mitigate by enforcing `IEgressPolicy` checks (TELEMETRY-OBS-56-001) and documenting fallback exporters; schedule smoke runs after each config change.
- **Template/API divergence.** Maintain single source of truth in `SPRINT_171_notifier_i.md` tasks; require API Contracts review before merging SDK updates to avoid drift with UI consumers.
- **Observability storage overhead.** Coordinate with Ops Guild to project Prometheus/Tempo growth when SLO webhooks + incident toggles increase cardinality; adjust retention per docs/modules/telemetry/architecture.md §2.
- **Cross-sprint dependency churn.** Track ORCH-OBS-50-001, POLICY-OBS-50-001, WEB-OBS-50-001 weekly; if they slip, re-baseline Telemetry wave deliverables or gate Notifier observability triggers accordingly.
## Task mirror snapshot (reference: Sprint 171 & 174 trackers)
### Wave 170.A Notifier (Sprint 171 mirror)
- **Open tasks:** 0.
- **Done tasks:** 14 (all NOTIFY-ATTEST, NOTIFY-OAS, NOTIFY-OBS, NOTIFY-RISK, NOTIFY-DOC, NOTIFY-AIRGAP, NOTIFY-GAPS series complete).
| Category | Task IDs | Current state | Notes |
| --- | --- | --- | --- |
| Attestation + key lifecycle | NOTIFY-ATTEST-74-001/002 | **DONE** | Templates and wiring complete (2025-11-16/27). |
| API/OAS + SDK refresh | NOTIFY-OAS-61-001 → 63-001 | **DONE** | All OAS/SDK tasks complete (2025-11-17). |
| Observability-driven triggers | NOTIFY-OBS-51-001/55-001 | **DONE** | SLO webhook + incident mode templates shipped (2025-11-22). |
| Risk routing | NOTIFY-RISK-66-001 → 68-001 | **DONE** | Risk-events endpoint + routing seeds shipped (2025-11-24); POLICY-RISK-40-002 metadata export now available. |
| Gap remediation | NOTIFY-GAPS-171-014 | **DONE** | NR1-NR10 artifacts complete; DSSE signed with dev key `notify-dev-hmac-001` (2025-12-04). |
| Completed prerequisites | NOTIFY-DOC-70-001, NOTIFY-AIRGAP-56-002 | **DONE** | Documentation and offline-kit parity complete. |
### Wave 170.B Telemetry (Sprint 174 mirror)
- **Open tasks:** 0.
- **Done tasks:** 6 (TELEMETRY-OBS-50/51/55/56 series all complete as of 2025-11-27).
| Category | Task IDs | Current state | Notes |
| --- | --- | --- | --- |
| Bootstrap & propagation | TELEMETRY-OBS-50-001/002 | **DONE** | Core bootstrap (50-001) and propagation middleware (50-002) complete (2025-11-19/27). |
| Metrics helpers & scrubbing | TELEMETRY-OBS-51-001/002 | **DONE** | Golden signal metrics with cardinality guards + scrubbing filters complete (2025-11-27). |
| Incident & sealed-mode controls | TELEMETRY-OBS-55-001/56-001 | **DONE** | Incident mode toggle and sealed-mode helpers complete (2025-11-27). |
## External dependency tracker
| Dependency | Source sprint / doc | Current state (as of 2025-11-12) | Impact on waves |
| --- | --- | --- | --- |
| Sprint 150.A Orchestrator (wave table) | `SPRINT_150_scheduling_automation.md` | TODO | Blocks Notifier template wiring + Telemetry consumption of job events until orchestration telemetry lands. |
| ORCH-OBS-50-001 `orchestrator instrumentation` | `docs/implplan/archived/tasks.md` excerpt / Sprint 150 backlog | TODO | Needed for Telemetry.Core sample + Notify SLO hooks; monitor for slip. |
| POLICY-OBS-50-001 `policy instrumentation` | Sprint 150 backlog | TODO | Required before Telemetry helpers can be adopted by Policy + risk routing. |
| WEB-OBS-50-001 `gateway telemetry core adoption` | Sprint 214/215 backlogs | TODO | Ensures web/gateway emits trace IDs that Notify incident payload references. |
| POLICY-RISK-40-002 `risk profile metadata export` | Sprint 215+ (Policy) | DONE (2025-12-04) | Implemented `GET /api/risk/profiles/{id}/metadata` endpoint for notification enrichment. |
## Coordination log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-04 | Sprint 170 FULLY COMPLETE: Created dev signing key (`etc/secrets/dsse-dev.signing.json`) and signing utility (`scripts/notifications/sign-dsse.py`); signed DSSE files with `notify-dev-hmac-001`. NOTIFY-GAPS-171-014 now DONE. All 14 Notifier + 6 Telemetry tasks complete. | Implementer |
| 2025-12-04 | Sprint 170 complete: Wave 170.A marked DONE (12/13 tasks); Wave 170.B already DONE (6/6 tasks). Only NOTIFY-GAPS-171-014 remains BLOCKED on security infra (signing keys). | Implementer |
| 2025-12-04 | Implemented POLICY-RISK-40-002: Added `GET /api/risk/profiles/{id}/metadata` endpoint for notification enrichment. NOTIFY-RISK tasks unblocked. Only NOTIFY-GAPS-171-014 remains BLOCKED (signing keys). | Implementer |
| 2025-12-04 | Status refresh: Wave 170.B (Telemetry) marked DONE (all 6 tasks complete); Wave 170.A (Notifier) updated to show 9/13 done with 4 BLOCKED on external dependencies (POLICY-RISK-40-002, signing keys). Updated task mirror snapshots. | Project Mgmt |
| 2025-11-12 10:15 | Wave rows flipped to DOING; baseline scope/entry/exit criteria recorded for both waves. | Observability Guild · Notifications Service Guild |
| 2025-11-12 14:40 | Added task mirror + dependency tracker + milestone table to keep Sprint 170 snapshot aligned with Sprint 171/174 execution plans. | Observability Guild |
| 2025-11-12 18:05 | Marked NOTIFY-ATTEST-74-001, NOTIFY-OAS-61-001, and TELEMETRY-OBS-50-001 as DOING in their sprint trackers; added status notes reflecting in-flight work vs. gated follow-ups. | Notifications Service Guild · Telemetry Core Guild |
| 2025-11-12 19:20 | Documented attestation template suite (Section 7 in `docs/notifications/templates.md`) to unblock NOTIFY-ATTEST-74-001 deliverables and updated sprint mirrors accordingly. | Notifications Service Guild |
| 2025-11-12 19:32 | Synced notifications architecture doc to reference the new attestation template suite so downstream teams see the dependency in one place. | Notifications Service Guild |
| 2025-11-12 19:45 | Updated notifications overview + rules docs with `tmpl-attest-*` requirements so rule authors/operators share the same contract. | Notifications Service Guild |
| 2025-11-12 20:05 | Published baseline Offline Kit templates under `offline/notifier/templates/attestation/` for Slack/Email/Webhook so NOTIFY-ATTEST-74-002 wiring has ready-made artefacts. | Notifications Service Guild |
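
The 2025-12-04 log entry mentions signing DSSE files with the dev key `notify-dev-hmac-001`. As a rough illustration of what HMAC-based DSSE signing can look like, here is a Python sketch. It is not the contents of `scripts/notifications/sign-dsse.py`: it assumes HMAC-SHA256 computed over the DSSE pre-authentication encoding (PAE), and the payload type, secret, and function names are all hypothetical.

```python
import base64
import hashlib
import hmac


def pae(payload_type: bytes, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: 'DSSEv1 <len> <type> <len> <body>'."""
    return b"DSSEv1 %d %b %d %b" % (len(payload_type), payload_type, len(payload), payload)


def sign_envelope(payload: bytes, payload_type: str, key_id: str, secret: bytes) -> dict:
    """Build a DSSE-style envelope with one HMAC-SHA256 signature (assumed scheme)."""
    mac = hmac.new(secret, pae(payload_type.encode(), payload), hashlib.sha256).digest()
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode(),
        "signatures": [{"keyid": key_id, "sig": base64.b64encode(mac).decode()}],
    }


def verify_envelope(envelope: dict, secret: bytes) -> bool:
    """Recompute the HMAC over the PAE and compare against each signature in constant time."""
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(
        secret, pae(envelope["payloadType"].encode(), payload), hashlib.sha256
    ).digest()
    return any(
        hmac.compare_digest(expected, base64.b64decode(entry["sig"]))
        for entry in envelope["signatures"]
    )


if __name__ == "__main__":
    # Hypothetical payload type and secret, for illustration only.
    env = sign_envelope(b'{"rule":"demo"}', "application/vnd.example.notify+json",
                        "notify-dev-hmac-001", b"dev-only-secret")
    print(verify_envelope(env, b"dev-only-secret"))
```

An HMAC key is symmetric, so anyone who can verify can also sign, which is consistent with the log's note that production HSM re-signing was deferred.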
→ Open `SPRINT_0170_0001_0001_notifications_telemetry.md` for the current snapshot.