# Sprint 174 - Notifications & Telemetry · 170.B) Telemetry
Active items only. Completed/historic work now resides in docs/implplan/archived/tasks.md (updated 2025-11-08).
[Notifications & Telemetry] 170.B) Telemetry
Depends on: Sprint 150.A - Orchestrator
Summary: Notifications & Telemetry (focus on Telemetry).
Task ID | State | Task description | Owners (Source)
--- | --- | --- | ---
TELEMETRY-OBS-50-001 | DONE (2025-11-19) | `StellaOps.Telemetry.Core` bootstrap library shipped with structured logging facade, OTEL configuration helpers, deterministic bootstrap (service name/version detection, resource attributes), and sample usage for web/worker hosts. Evidence: `docs/observability/telemetry-bootstrap.md`. | Telemetry Core Guild (src/Telemetry/StellaOps.Telemetry.Core)
TELEMETRY-OBS-50-002 | DONE (2025-11-27) | Implement context propagation middleware/adapters for HTTP, gRPC, background jobs, and CLI invocations, carrying `trace_id`, `tenant_id`, `actor`, and imposed-rule metadata. Provide test harness covering async resume scenarios. Dependencies: TELEMETRY-OBS-50-001. | Telemetry Core Guild (src/Telemetry/StellaOps.Telemetry.Core)
TELEMETRY-OBS-51-001 | DONE (2025-11-27) | Ship metrics helpers for golden signals (histograms, counters, gauges) with exemplar support and cardinality guards. Provide Roslyn analyzer preventing unsanitised labels. Dependencies: TELEMETRY-OBS-50-002. Evidence: `GoldenSignalMetrics.cs` + `StellaOps.Telemetry.Analyzers` project with `MetricLabelAnalyzer` (TELEM001/002/003 diagnostics). | Telemetry Core Guild, Observability Guild (src/Telemetry/StellaOps.Telemetry.Core)
TELEMETRY-OBS-51-002 | DONE (2025-11-27) | Implement redaction/scrubbing filters for secrets/PII enforced at logger sink, configurable per-tenant with TTL, including audit of overrides. Add determinism tests verifying stable field order and timestamp normalization. Dependencies: TELEMETRY-OBS-51-001. Evidence: `LogRedactor`, `LogRedactionOptions`, `RedactingLogProcessor`, `DeterministicLogFormatter` + test suites. | Telemetry Core Guild, Security Guild (src/Telemetry/StellaOps.Telemetry.Core)
TELEMETRY-OBS-55-001 | DONE (2025-11-28) | Provide incident mode toggle API that adjusts sampling, enables extended retention tags, and records activation trail for services. Ensure the toggle is honored by all hosting templates and integrates with Config/FeatureFlag providers. Dependencies: TELEMETRY-OBS-51-002. Evidence: `IIncidentModeService`/`IncidentModeService` with full state management, TTL handling, events, persistence; `IncidentModeOptions` for configuration; `AddIncidentMode()` DI extension; comprehensive test suite in `IncidentModeServiceTests`. | Telemetry Core Guild (src/Telemetry/StellaOps.Telemetry.Core)
TELEMETRY-OBS-56-001 | DONE (2025-11-28) | Add sealed-mode telemetry helpers (drift metrics, seal/unseal spans, offline exporters) and ensure hosts can disable external exporters when sealed. Dependencies: TELEMETRY-OBS-55-001. Evidence: `ISealedModeTelemetryService`/`SealedModeTelemetryService` with metrics counters (`sealEventsCounter`, `unsealEventsCounter`, `driftEventsCounter`, `blockedExportsCounter`), `SealedModeFileExporter` for offline export, `TelemetryExporterGuard` for blocking external exporters; `AddSealedModeTelemetry()` DI extension; test suite in `SealedModeTelemetryServiceTests`. | Telemetry Core Guild (src/Telemetry/StellaOps.Telemetry.Core)
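
The redaction and determinism requirements of TELEMETRY-OBS-51-002 (scrub secrets at the sink, keep field order stable, normalize timestamps) can be illustrated with a small language-neutral sketch. The Python below is not the `LogRedactor` or `DeterministicLogFormatter` implementation. The sensitive field names, the bearer-token pattern, the mask string, and the function names `redact` and `format_entry` are all assumptions made for illustration.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Hypothetical defaults for illustration; the real LogRedactionOptions may differ.
SENSITIVE_FIELDS = {"password", "token", "api_key"}
SECRET_PATTERN = re.compile(r"Bearer\s+[A-Za-z0-9._-]+")
MASK = "[REDACTED]"


def redact(fields):
    """Mask values by field name first, then by pattern inside string values."""
    out = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_FIELDS:
            out[key] = MASK
        elif isinstance(value, str):
            out[key] = SECRET_PATTERN.sub(MASK, value)
        else:
            out[key] = value
    return out


def format_entry(timestamp, fields):
    """Render one log entry with sorted keys and a UTC, millisecond timestamp.

    `timestamp` must be timezone-aware so the UTC conversion is unambiguous.
    """
    utc = timestamp.astimezone(timezone.utc)
    payload = redact(fields)
    payload["timestamp"] = utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    # sort_keys makes the output independent of the caller's insertion order.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


if __name__ == "__main__":
    local = datetime(2025, 11, 28, 18, 21, 46, tzinfo=timezone(timedelta(hours=2)))
    print(format_entry(local, {"user": "alice", "password": "hunter2"}))
```

A determinism test in this style feeds the same logical entry with different field orders and time zones and asserts that the rendered output is byte-identical.
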
## Status notes (2025-11-28 UTC)
- **TELEMETRY-OBS-50-001** DONE. Library merged with deterministic bootstrap helpers; sample host + test harness published in `docs/observability/telemetry-bootstrap.md`.
- **TELEMETRY-OBS-50-002** DONE. Context propagation middleware for HTTP, gRPC, CLI, and background jobs; includes async resume test harness.
- **TELEMETRY-OBS-51-001** DONE. Golden signal metrics (`GoldenSignalMetrics.cs`) with exemplar support and cardinality guards. Roslyn analyzer project (`StellaOps.Telemetry.Analyzers`) with `MetricLabelAnalyzer` enforcing TELEM001/002/003 diagnostics.
- **TELEMETRY-OBS-51-002** DONE. `ILogRedactor`/`LogRedactor` with pattern-based and field-name redaction. Per-tenant overrides with TTL and audit logging. `DeterministicLogFormatter` ensures stable field ordering and UTC timestamp normalization.
- **TELEMETRY-OBS-55-001** DONE. Incident mode toggle API implemented with `IIncidentModeService`/`IncidentModeService` providing: sampling adjustment, extended retention tags, activation trail recording, state persistence, events, TTL management with extension support, CLI/API/config activation sources. DI registration via `AddIncidentMode()`. Full test suite.
- **TELEMETRY-OBS-56-001** DONE. Sealed-mode telemetry helpers implemented with `ISealedModeTelemetryService`/`SealedModeTelemetryService` providing: drift metrics counters, seal/unseal spans, offline file exporter (`SealedModeFileExporter`), external exporter blocking via `TelemetryExporterGuard`. DI registration via `AddSealedModeTelemetry()`. Full test suite.
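
The incident mode behaviour summarized for TELEMETRY-OBS-55-001 (sampling adjustment, TTL with extension, activation trail) can be sketched as a small state holder. This is an illustrative Python model, not the `IIncidentModeService` contract: the class name `IncidentModeSketch`, its method names, and the sampling ratios are assumptions, and persistence, events, and retention tags are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass
class IncidentModeSketch:
    # Hypothetical ratios; the real IncidentModeOptions may use other values.
    normal_sampling: float = 0.1
    incident_sampling: float = 1.0
    expires_at: Optional[datetime] = None
    # Activation trail: (when, actor, source), where source is e.g. "cli" or "api".
    trail: List[Tuple[datetime, str, str]] = field(default_factory=list)

    def activate(self, now, actor, source, ttl):
        """Enter incident mode until now + ttl and record who activated it."""
        self.expires_at = now + ttl
        self.trail.append((now, actor, source))

    def extend(self, now, extra):
        """Push the expiry out; only allowed while incident mode is active."""
        if not self.is_active(now):
            raise RuntimeError("incident mode is not active")
        self.expires_at += extra

    def is_active(self, now):
        return self.expires_at is not None and now < self.expires_at

    def sampling_ratio(self, now):
        """Raised ratio while active, normal ratio once the TTL has lapsed."""
        return self.incident_sampling if self.is_active(now) else self.normal_sampling


if __name__ == "__main__":
    t0 = datetime(2025, 11, 28, tzinfo=timezone.utc)
    mode = IncidentModeSketch()
    mode.activate(t0, actor="oncall", source="cli", ttl=timedelta(minutes=30))
    print(mode.sampling_ratio(t0 + timedelta(minutes=5)))
```

Passing `now` explicitly, rather than reading the clock inside the class, keeps TTL expiry testable without sleeping.
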
## Milestones & dependencies
| Target date | Milestone | Owner(s) | Notes / dependencies |
| --- | --- | --- | --- |
| 2025-11-18 | Land Telemetry.Core bootstrap sample in Orchestrator | Telemetry Core Guild · Orchestrator Guild | Demonstrates TELEMETRY-OBS-50-001 deliverable; prerequisite for propagation middleware adoption. |
| 2025-11-19 | Publish propagation adapter API draft | Telemetry Core Guild | Needed for TELEMETRY-OBS-50-002 and downstream service adoption. |
| 2025-11-21 | Security sign-off on scrub policy (POLICY-SEC-42-003) | Telemetry Core Guild · Security Guild | Unlocks TELEMETRY-OBS-51-001/51-002 implementation. |
| 2025-11-22 | Incident/CLI toggle contract agreed (CLI-OBS-12-001 + NOTIFY-OBS-55-001) | Telemetry Core Guild · Notifications Service Guild · CLI Guild | Required before TELEMETRY-OBS-55-001/56-001 can advance. |
## Coordination log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-11-12 18:05 | Marked TELEMETRY-OBS-50-001 as DOING and captured branch/progress details in status notes. | Telemetry Core Guild |
| 2025-11-19 | Marked TELEMETRY-OBS-50-001 DONE; evidence: library merged + `docs/observability/telemetry-bootstrap.md` with sample host integration. | Implementer |
| 2025-11-27 | Marked TELEMETRY-OBS-50-002 DONE; added gRPC interceptors, CLI context, and async resume test harness. | Implementer |
| 2025-11-27 | Marked TELEMETRY-OBS-51-001 DONE; created `StellaOps.Telemetry.Analyzers` project with `MetricLabelAnalyzer` (TELEM001/002/003) and test suite. | Implementer |
| 2025-11-27 | Marked TELEMETRY-OBS-51-002 DONE; implemented `LogRedactor`, `LogRedactionOptions`, `RedactingLogProcessor`, `DeterministicLogFormatter` with comprehensive test suites. | Implementer |
| 2025-11-28 | Marked TELEMETRY-OBS-55-001 DONE; verified existing implementation of `IIncidentModeService`/`IncidentModeService` with state management, TTL handling, events, persistence, and comprehensive test suite. | Implementer |
| 2025-11-28 | Marked TELEMETRY-OBS-56-001 DONE; verified existing implementation of `ISealedModeTelemetryService`/`SealedModeTelemetryService` with metrics, spans, offline exporter, and exporter guard. Sprint 174 Telemetry complete. | Implementer |