Files
git.stella-ops.org/docs/modules/telemetry/contracts/telemetry-gaps-remediation.md
2026-01-08 09:06:03 +02:00

2.9 KiB
Raw Blame History

Telemetry Gap Remediation (TO1TO10) — v1 · 2025-12-01

Source: docs/product/advisories/31-Nov-2025 FINDINGS.md (Telemetry gaps TO1TO10). Scope: telemetry core (collectors/SDK defaults/bundles) across services; applicable to default/forensic/airgap profiles.

Decisions (mapped to gaps)

  • TO1 Canonical schemas & hashing: Published versioned schemas
    • telemetry-config.schema.json for collector/SDK profile configs (signed, canonical JSON, stable ordering)
    • telemetry-bundle.schema.json for offline bundle manifests
    • Hash recipe: SHA-256 over normalized (UTF-8, LF, sorted keys) JSON; test vectors to follow.
  • TO2 Provenance & DSSE: Bundles and profile activations must include DSSE envelope (*.dsse.json) with predicate fields: profileHash, collectorVersion, exporters, redactionPolicyUri, cryptoProfile.
  • TO3 Determinism & sampling stability: Sampling policies must declare deterministic seed, ordered rules, and backpressure policy. Logs/traces ordered by (timestamp, traceId). Multi-run hash check recommended in CI.
  • TO4 Sealed mode / egress guards: Sealed mode blocks all non-loopback exporters unless explicitly allowlisted; DNS pinning required; failure is fail-closed. Seal status recorded as DSSE event.
  • TO5 Redaction policy & PII tests: Redaction catalog/allowlist required; bundle must include redaction-manifest.json listing rules applied and violations=0. CI must run PII/secret test suite before export.
  • TO6 Tenant isolation & quotas: OTLP signals include tenant.id and project.id; collector routes by tenant pipeline; per-tenant quotas/limits enforced with counters and alerts.
  • TO7 Forensic triggers governance: Forensic mode requires dual approval, DSSE activation record, expiry timestamp, and auto-rollback; alert if forensic mode active > configured window.
  • TO8 Offline bundle schema & verify: Bundles must follow telemetry-bundle.schema.json, created with deterministic tar flags, include hash manifest + DSSE + RFC3161 time-anchor; verifier script provided (ops/devops/telemetry/verify-telemetry-bundle.sh).
  • TO9 Observability of observability: Add SLOs + alerts for collector/exporter health, queue backpressure, bundle success rate; scheduled self-test emits DSSE result.
  • TO10 CLI/pack contracts: CLI/pack contract tracked in cli-spec-v1.yaml; telemetry exports must respect exit codes and checksum policy (reuse 21/22 for checksum missing/mismatch).

Artifacts

  • Schemas: docs/modules/telemetry/schemas/telemetry-config.schema.json, telemetry-bundle.schema.json.
  • Hash recipe: in-line within schemas (canonical JSON, SHA-256).
  • Verify script: ops/devops/telemetry/verify-telemetry-bundle.sh.

Adoption notes

  • Profile and bundle producers must validate against schemas and sign DSSE envelopes before distribution.
  • Air-gap/forensic profiles MUST set sealed mode and include redaction manifest.
  • CI should add a multi-run hash test for telemetry exporter output and fail on drift.