Files
git.stella-ops.org/docs/modules/telemetry/contracts/telemetry-gaps-remediation.md
2026-01-08 09:06:03 +02:00

30 lines
2.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Telemetry Gap Remediation (TO1TO10) — v1 · 2025-12-01
Source: `docs/product/advisories/31-Nov-2025 FINDINGS.md` (Telemetry gaps TO1TO10).
Scope: telemetry core (collectors/SDK defaults/bundles) across services; applicable to default/forensic/airgap profiles.
## Decisions (mapped to gaps)
- **TO1 Canonical schemas & hashing**: Published versioned schemas
- `telemetry-config.schema.json` for collector/SDK profile configs (signed, canonical JSON, stable ordering)
- `telemetry-bundle.schema.json` for offline bundle manifests
- Hash recipe: SHA-256 over normalized (UTF-8, LF, sorted keys) JSON; test vectors to follow.
- **TO2 Provenance & DSSE**: Bundles and profile activations must include DSSE envelope (`*.dsse.json`) with predicate fields: profileHash, collectorVersion, exporters, redactionPolicyUri, cryptoProfile.
- **TO3 Determinism & sampling stability**: Sampling policies must declare deterministic seed, ordered rules, and backpressure policy. Logs/traces ordered by (timestamp, traceId). Multi-run hash check recommended in CI.
- **TO4 Sealed mode / egress guards**: Sealed mode blocks all non-loopback exporters unless explicitly allowlisted; DNS pinning required; failure is fail-closed. Seal status recorded as DSSE event.
- **TO5 Redaction policy & PII tests**: Redaction catalog/allowlist required; bundle must include `redaction-manifest.json` listing rules applied and violations=0. CI must run PII/secret test suite before export.
- **TO6 Tenant isolation & quotas**: OTLP signals include `tenant.id` and `project.id`; collector routes by tenant pipeline; per-tenant quotas/limits enforced with counters and alerts.
- **TO7 Forensic triggers governance**: Forensic mode requires dual approval, DSSE activation record, expiry timestamp, and auto-rollback; alert if forensic mode active > configured window.
- **TO8 Offline bundle schema & verify**: Bundles must follow `telemetry-bundle.schema.json`, created with deterministic tar flags, include hash manifest + DSSE + RFC3161 time-anchor; verifier script provided (`ops/devops/telemetry/verify-telemetry-bundle.sh`).
- **TO9 Observability of observability**: Add SLOs + alerts for collector/exporter health, queue backpressure, bundle success rate; scheduled self-test emits DSSE result.
- **TO10 CLI/pack contracts**: CLI/pack contract tracked in `cli-spec-v1.yaml`; telemetry exports must respect exit codes and checksum policy (reuse 21/22 for checksum missing/mismatch).
## Artifacts
- Schemas: `docs/modules/telemetry/schemas/telemetry-config.schema.json`, `telemetry-bundle.schema.json`.
- Hash recipe: in-line within schemas (canonical JSON, SHA-256).
- Verify script: `ops/devops/telemetry/verify-telemetry-bundle.sh`.
## Adoption notes
- Profile and bundle producers must validate against schemas and sign DSSE envelopes before distribution.
- Air-gap/forensic profiles MUST set sealed mode and include redaction manifest.
- CI should add a multi-run hash test for telemetry exporter output and fail on drift.