Files
git.stella-ops.org/docs-archived/implplan/SPRINT_0174_0001_0001_telemetry.md
2026-01-05 16:02:11 +02:00

10 KiB

Sprint 0174 - Telemetry (Notifications & Telemetry 170.B)

Topic & Scope

  • Deliver StellaOps.Telemetry.Core bootstrap, propagation middleware, metrics helpers, scrubbing, incident/sealed-mode toggles.
  • Provide sample host integrations while keeping deterministic, offline-friendly telemetry with redaction and tenant awareness.
  • Working directory: src/Telemetry/StellaOps.Telemetry.Core.

Dependencies & Concurrency

  • Upstream: Sprint 0150 (Orchestrator) host integration, CLI incident toggle contract (CLI-OBS-12-001), Notify incident payload spec (NOTIFY-OBS-55-001), Security scrub policy (POLICY-SEC-42-003) - all landed and referenced in prep docs; telemetry tests rerun after Moq restore on 2025-12-05.
  • Concurrency: executed sequential chain 50-001 -> 50-002 -> 51-001/51-002 -> 55-001 -> 56-001; no remaining interlocks.

Documentation Prerequisites

  • docs/README.md
  • docs/07_HIGH_LEVEL_ARCHITECTURE.md
  • docs/modules/platform/architecture-overview.md
  • docs/modules/telemetry/architecture.md
  • src/Telemetry/StellaOps.Telemetry.Core/AGENTS.md

Delivery Tracker

# Task ID Status Key dependency / next step Owners Task Definition
P1 PREP-TELEMETRY-OBS-50-002-AWAIT-PUBLISHED-50 DONE (2025-11-19) Bootstrap doc docs/observability/telemetry-bootstrap.md published; package available for downstream hosts. Telemetry Core Guild Bootstrap package published; reference doc docs/observability/telemetry-bootstrap.md provides wiring + config.
P2 PREP-TELEMETRY-OBS-51-001-TELEMETRY-PROPAGATI DONE (2025-11-20) Doc published at docs/observability/telemetry-propagation-51-001.md; downstream unblocked. Telemetry Core Guild + Observability Guild Telemetry propagation guidance documented for TELEMETRY-OBS-51-001.
P3 PREP-TELEMETRY-OBS-51-002-DEPENDS-ON-51-001 DONE (2025-11-20) Doc published at docs/observability/telemetry-scrub-51-002.md; downstream unblocked. Telemetry Core Guild + Security Guild Scrub policy and wiring documented for TELEMETRY-OBS-51-002.
P4 PREP-TELEMETRY-OBS-56-001-DEPENDS-ON-55-001 DONE (2025-11-20) Doc published at docs/observability/telemetry-sealed-56-001.md; downstream unblocked. Telemetry Core Guild Sealed-mode helper guidance documented for TELEMETRY-OBS-56-001.
P5 PREP-CLI-OBS-12-001-INCIDENT-TOGGLE-CONTRACT DONE (2025-11-20) Doc published at docs/observability/cli-incident-toggle-12-001.md; downstream unblocked. CLI Guild + Notifications Service Guild + Telemetry Core Guild CLI incident toggle contract (CLI-OBS-12-001) published; required for TELEMETRY-OBS-55-001/56-001.
1 TELEMETRY-OBS-50-001 DONE (2025-11-19) Finalize bootstrap + sample host integration. Telemetry Core Guild (src/Telemetry/StellaOps.Telemetry.Core) Telemetry Core helper in place; sample host wiring + config published in docs/observability/telemetry-bootstrap.md.
2 TELEMETRY-OBS-50-002 DONE (2025-11-27) Implementation complete; tests restored 2025-12-05. Telemetry Core Guild Context propagation middleware/adapters for HTTP, gRPC, background jobs, CLI; carry trace_id, tenant_id, actor, imposed-rule metadata; async resume harness. Prep artefact: docs/modules/telemetry/prep/2025-11-20-obs-50-002-prep.md.
3 TELEMETRY-OBS-51-001 DONE (2025-11-27) Implementation complete; tests restored 2025-12-05. Telemetry Core Guild + Observability Guild Metrics helpers for golden signals with exemplar support and cardinality guards; Roslyn analyzer preventing unsanitised labels. Prep artefact: docs/modules/telemetry/prep/2025-11-20-obs-51-001-prep.md.
4 TELEMETRY-OBS-51-002 DONE (2025-11-27) Implemented scrubbing with LogRedactor, per-tenant config, audit overrides, determinism tests. Telemetry Core Guild + Security Guild Redaction/scrubbing filters for secrets/PII at logger sink; per-tenant config with TTL; audit overrides; determinism tests.
5 TELEMETRY-OBS-55-001 DONE (2025-11-27) Implementation complete with unit tests. Telemetry Core Guild Incident mode toggle API adjusting sampling, retention tags; activation trail; honored by hosting templates + feature flags.
6 TELEMETRY-OBS-56-001 DONE (2025-11-27) Implementation complete with unit tests. Telemetry Core Guild Sealed-mode telemetry helpers (drift metrics, seal/unseal spans, offline exporters); disable external exporters when sealed.

Execution Log

Date (UTC) Update Owner
2025-11-27 Implemented TELEMETRY-OBS-56-001: Added ISealedModeTelemetryService with drift metrics, seal/unseal activity spans, external export blocking. Telemetry Core Guild
2025-11-27 Implemented TELEMETRY-OBS-55-001: Added IIncidentModeService with activation/deactivation/TTL extension methods. Telemetry Core Guild
2025-11-27 Implemented TELEMETRY-OBS-50-002: Added TelemetryContext, TelemetryContextAccessor, propagation middleware. Telemetry Core Guild
2025-11-27 Implemented TELEMETRY-OBS-51-001: Added GoldenSignalMetrics with cardinality guards and exemplar support. Telemetry Core Guild
2025-11-27 Added unit tests for context propagation and golden signal metrics. Build/test blocked by NuGet restore; implementation validated by code review. Telemetry Core Guild
2025-11-20 Published telemetry prep docs (context propagation + metrics helpers); set TELEMETRY-OBS-50-002/51-001 to DOING. Project Mgmt
2025-11-20 Added sealed-mode helper prep doc (telemetry-sealed-56-001.md); marked PREP-TELEMETRY-OBS-56-001 DONE. Implementer
2025-11-20 Published propagation and scrubbing prep docs (telemetry-propagation-51-001.md, telemetry-scrub-51-002.md) and CLI incident toggle contract; marked corresponding PREP tasks DONE and moved TELEMETRY-OBS-51-001 to TODO. Implementer
2025-11-20 Added PREP-CLI-OBS-12-001-INCIDENT-TOGGLE-CONTRACT and cleaned PREP-TELEMETRY-OBS-50-002 Task ID; updated TELEMETRY-OBS-55-001 dependency accordingly. Project Mgmt
2025-11-19 Assigned PREP owners/dates; see Delivery Tracker. Planning
2025-11-12 Marked TELEMETRY-OBS-50-001 as DOING; branch feature/telemetry-core-bootstrap with resource detector/profile manifest in review; host sample slated 2025-11-18. Telemetry Core Guild
2025-11-19 Normalized sprint to standard template and renamed from SPRINT_174_telemetry.md to SPRINT_0174_0001_0001_telemetry.md; content preserved. Implementer
2025-11-19 Added legacy-file redirect stub to avoid divergent updates. Implementer
2025-11-20 Marked tasks 50-002..56-001 BLOCKED: waiting on 50-001 package publication, Security scrub policy, and CLI incident-toggle contract; no executable work until upstream artefacts land. Implementer
2025-11-19 PREP-TELEMETRY-OBS-50-002-AWAIT-PUBLISHED-50 completed; bootstrap doc published. Downstream tasks remain blocked on propagation/scrub/toggle contracts. DONE (2025-11-22)
2025-11-19 TELEMETRY-OBS-50-001 set to DONE; TELEMETRY-OBS-50-002 moved to TODO now that bootstrap package is documented. Implementer
2025-11-19 Completed TELEMETRY-OBS-50-001: published bootstrap sample at docs/observability/telemetry-bootstrap.md; library already present. Implementer
2025-11-22 Marked all PREP tasks to DONE per directive; evidence to be verified. Project Mgmt
2025-12-05 Attempted dotnet test src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core.Tests/StellaOps.Telemetry.Core.Tests.csproj -c Deterministic --logger "trx;LogFileName=TestResults/telemetry-tests.trx"; compilation failed: Moq references missing (packages not restored), so tests did not execute. Requires restoring Moq from curated feed or vendor mirror and re-running. Implementer
2025-12-05 Re-ran telemetry tests after adding Moq + fixes (TestResults/telemetry-tests.trx); 1 test still failing: TelemetryPropagationMiddlewareTests.Middleware_Populates_Accessor_And_Activity_Tags (accessor.Current null inside middleware). Other suites now pass. Implementer
2025-12-05 Telemetry suite GREEN: dotnet test src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core.Tests/StellaOps.Telemetry.Core.Tests.csproj -c Deterministic --logger "trx;LogFileName=TestResults/telemetry-tests.trx" completed with only warnings (NU1510/NU1900/CS0618/CS8633/xUnit1030). TRX evidence stored at src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core.Tests/TestResults/TestResults/telemetry-tests.trx. Implementer
2025-12-06 Cleared Moq restore risk; telemetry tests validated with curated feed. Updated Decisions & Risks and closed checkpoints. Telemetry Core Guild
2025-12-10 Hardened propagation: HTTP handler now falls back to current Activity trace when no context is set, with regression test added (TelemetryPropagationHandlerTests.Handler_Propagates_Trace_When_Context_Missing). Implementer
2025-12-10 Propagation middleware now keeps Activity.Current visible to callers; sealed-mode file exporter tests adjusted to dispose before reads. Full telemetry suite rerun (dotnet test ...StellaOps.Telemetry.Core.Tests.csproj -c Deterministic, TRX at src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core.Tests/TestResults/TestResults/telemetry-full.trx). Implementer
2025-12-10 Sprint archived; all TELEMETRY-OBS-50/51/55/56 tasks and prep tracks DONE with tests restored (2025-12-05 evidence). Project Mgmt

Decisions & Risks

  • All upstream contracts (bootstrap, propagation, scrub, CLI toggle, Notify payload) delivered; telemetry helpers shipped with tests.
  • Determinism/offline posture enforced: sealed mode disables external exporters; propagation carries trace_id, tenant_id, actor, imposed_rule, correlation_id; golden signals guard label cardinality.
  • Telemetry test suite validated on 2025-12-05 using curated Moq package; rerun CI lane if package cache changes or new adapters are added. Full suite revalidated 2025-12-10 after propagation and sealed-mode exporter fixes.
  • Sprint archived 2025-12-10; no open risks.

Next Checkpoints

Date (UTC) Milestone Owner(s)
None Sprint archived 2025-12-10; rerun telemetry test lane if scrub policy or CLI toggle contract changes. Telemetry Core Guild