# Telemetry Standards (DOCS-OBS-50-002)

Last updated: 2025-11-25 (Docs Tasks Md.VI)

## Common envelope

- **Trace context**: `trace_id`, `span_id`, `trace_flags`; propagate W3C `traceparent` and `baggage` end to end.
- **Tenant & workload**: `tenant`, `workload` (service name), `region`, `env` (dev/stage/prod), `version` (git sha or semver).
- **Subject**: `component` (module), `operation` (verb/name), `resource` (purl/uri/subject id when safe).
- **Timing**: UTC ISO-8601 `timestamp`; durations in integer milliseconds.
- **Outcome**: `status` (`ok|error|fault|throttle`), `error.code` (machine), `error.message` (human, redacted), `retryable` (bool).

## Scrubbing policy

- Denylist PII/secrets before emitting: emails, tokens, Authorization headers, bearer fragments, private keys, passwords, session IDs.
- Redact fields to `"[redacted]"` and add `redaction.reason` (`secret|pii|tenant_policy`).
- Hash low-cardinality identifiers when needed (`sha256` lowercase hex) and mark `hashed=true`.
- Logs must not contain full request/response bodies; store hashes plus lengths. For NDJSON exports, allow hashes + selected headers only.

## Sampling defaults

- **Traces**: 10% head sampling non-prod; 100% for `status=error|fault` and for spans tagged `audit=true`. Prod default 5% with the same error/audit boost.
- **Logs**: info logs rate-limited per component (default 100/s); warn/error never sampled. Structured JSON only.
- **Metrics**: never sampled; counters/gauges/histograms use deterministic bucket boundaries documented in component specs.

## Redaction override procedure

- Overrides are rare and must be auditable.
- To allow a field temporarily, set `telemetry.redaction.overrides=` in service config together with the change-ticket id; emit the `redaction.override=true` tag on affected spans/logs.
- Overrides expire automatically after `telemetry.redaction.override_ttl` (default 24h); services refuse to start with expired overrides.
- All overrides are logged to the `telemetry.redaction.audit` channel with actor, ticket, fields, TTL.

## Determinism & offline posture

- No external enrichers; all enrichment data must be preloaded bundles (e.g., service map, tenant metadata).
- Sorting for exports: by `timestamp`, then `workload`, then `operation`.
- Time always UTC; avoid locale-specific formats.

## Validation checklist

- [ ] `traceparent` propagated and present on inbound/outbound.
- [ ] Required fields present (`tenant`, `workload`, `operation`, `status`).
- [ ] Scrubbing tests cover auth headers and bodies.
- [ ] Sampling knobs configurable via env vars with documented defaults.
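
As an illustration of the scrubbing policy, the following is a minimal sketch rather than a reference implementation. The key denylist, the email pattern, the per-field shape of `redaction.reason`, and the `body.sha256` / `body.length` field names are all assumptions made for the example; only the `"[redacted]"` marker, the `secret|pii` reasons, and the lowercase-hex `sha256` come from this document.

```python
import hashlib
import re

REDACTED = "[redacted]"

# Hypothetical denylist of field names treated as secrets (compared lowercase).
SECRET_KEYS = {"authorization", "password", "token", "private_key", "session_id"}

# Deliberately simple email pattern; a real scrubber would need a broader rule set.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def hash_identifier(value: str) -> str:
    """Return the lowercase hex SHA-256 digest of an identifier."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def summarize_body(body: bytes) -> dict:
    """Replace a request/response body with its hash and length."""
    return {
        "body.sha256": hashlib.sha256(body).hexdigest(),  # assumed field name
        "body.length": len(body),                         # assumed field name
    }


def scrub(record: dict) -> dict:
    """Redact denylisted fields and record why each one was redacted."""
    out = {}
    reasons = {}
    for key, value in record.items():
        if key.lower() in SECRET_KEYS:
            out[key] = REDACTED
            reasons[key] = "secret"
        elif isinstance(value, str) and EMAIL_RE.search(value):
            out[key] = REDACTED
            reasons[key] = "pii"
        else:
            out[key] = value
    if reasons:
        # Assumption: redaction.reason is stored as a field-to-reason mapping.
        out["redaction.reason"] = reasons
    return out
```

A scrubbing test in the spirit of the validation checklist would pass a record containing an `Authorization` header and a body through `scrub` and `summarize_body`, then assert that neither the header value nor the raw body appears in the output.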
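
The trace sampling defaults can be read as a single decision function. The sketch below assumes the decision is derived deterministically from the leading hex digits of `trace_id`, which this document does not specify, and that `status` and the `audit` tag are known when the decision is made; the function name `should_sample` is hypothetical.

```python
def should_sample(trace_id: str, status: str, audit: bool, env: str) -> bool:
    """Decide whether to keep a trace under the documented defaults."""
    # Errors, faults, and audit-tagged spans are always kept.
    if status in ("error", "fault") or audit:
        return True

    # 5% in prod, 10% everywhere else.
    rate = 0.05 if env == "prod" else 0.10

    # Assumption: map the first 8 hex digits of the trace id onto [0, 1)
    # so that every service reaches the same decision for the same trace.
    bucket = int(trace_id[:8], 16) / 0x100000000
    return bucket < rate
```

Deriving the bucket from the trace id, rather than from a random draw, keeps the outcome reproducible, which fits the determinism posture described above.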
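
The export ordering from the determinism section (`timestamp`, then `workload`, then `operation`) could be sketched as follows. The example assumes every `timestamp` is a UTC ISO-8601 string in one fixed-width format (for instance ending in `Z`), so that lexicographic order matches chronological order; the helper names are hypothetical.

```python
import json


def export_sort_key(record: dict) -> tuple:
    """Sort key matching the documented export order."""
    return (record["timestamp"], record["workload"], record["operation"])


def to_ndjson(records: list) -> str:
    """Serialize records as NDJSON in deterministic order, one object per line."""
    ordered = sorted(records, key=export_sort_key)
    # sort_keys and fixed separators keep the bytes stable across runs.
    return "".join(
        json.dumps(r, sort_keys=True, separators=(",", ":")) + "\n"
        for r in ordered
    )
```

If timestamps with mixed precision or offsets can occur, they would need to be normalized before sorting, since plain string comparison would no longer be reliable.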