
master 152c1b1357 doctor: complete runtime check documentation sprint

Signed-off-by: master <>

2026-03-31 23:26:24 +03:00

1.6 KiB


checkId, plugin, severity, tags


Distributed Tracing

What It Checks

Validates trace enablement, propagator, sampling ratio, exporter type, and whether HTTP and database instrumentation are turned on.

The check reports info when tracing is explicitly disabled. It warns when the sampling ratio is invalid or too low, or when HTTP or database instrumentation is turned off.

Why It Matters

Tracing is usually the fastest way to understand cross-service latency and identify the exact hop that is failing. Disabling HTTP or database instrumentation removes the spans that provide that evidence.

Common Causes

Sampling ratio set to 0 during load testing and never restored
Only outbound HTTP traces are enabled while database spans remain off
Propagator or exporter defaults differ between services

How to Fix

Docker Compose

services:
  doctor-web:
    environment:
      Tracing__Enabled: "true"
      Tracing__SamplingRatio: "1.0"
      Tracing__Instrumentation__Http: "true"
      Tracing__Instrumentation__Database: "true"
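
Configuration drift between services is one of the common causes listed above. The following is a minimal sketch of one way to reduce it in Compose, assuming a second service should share the same tracing settings. The service name other-service and the extension field name x-tracing-env are hypothetical; only the four Tracing__ keys come from the example above. Compose extension fields (top-level keys starting with x-) combined with a YAML anchor let the block be defined once and merged into each service:

```yaml
# Shared tracing settings, defined once.
# "x-tracing-env" is an illustrative name; Compose ignores top-level x-* keys.
x-tracing-env: &tracing-env
  Tracing__Enabled: "true"
  Tracing__SamplingRatio: "1.0"
  Tracing__Instrumentation__Http: "true"
  Tracing__Instrumentation__Database: "true"

services:
  doctor-web:
    environment:
      <<: *tracing-env   # merge the shared block into this service
  other-service:         # hypothetical second service, for illustration only
    environment:
      <<: *tracing-env
```

With this layout, a change to the sampling ratio is made in one place and applies to every service that merges the anchor.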

Bare Metal / systemd

Keep Tracing:SamplingRatio between 0.01 and 1.0 unless you are deliberately suppressing traces for a benchmark.

Kubernetes / Helm

Propagate the same trace configuration across all services in the release path so correlation IDs remain intact.
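
As a hedged sketch of what that could look like, the fragment below keeps the tracing settings in a single ConfigMap and loads them into a container with envFrom. It assumes the services read the same environment variable names shown in the Docker Compose example. The ConfigMap name tracing-config is hypothetical, the Deployment name is borrowed from the Compose service, and fields unrelated to tracing (image, selector, labels) are omitted, so this is not a complete manifest:

```yaml
# Shared tracing settings; "tracing-config" is an illustrative name.
apiVersion: v1
kind: ConfigMap
metadata:
  name: tracing-config
data:
  Tracing__Enabled: "true"
  Tracing__SamplingRatio: "1.0"
  Tracing__Instrumentation__Http: "true"
  Tracing__Instrumentation__Database: "true"
---
# Fragment of a Deployment; image, selector, and labels are left out for brevity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: doctor-web
spec:
  template:
    spec:
      containers:
        - name: doctor-web
          envFrom:
            - configMapRef:
                name: tracing-config   # every service references the same ConfigMap
```

Referencing one ConfigMap from every Deployment in the release path keeps the values identical across services. In a Helm chart the same values would typically be templated from a shared values entry, but the chart layout is not described here.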

Verification

stella doctor --check check.observability.tracing

Related Checks

check.observability.otel - exporter connectivity must work before traces leave the process
check.servicegraph.timeouts - tracing is most useful when diagnosing timeout-related issues
