git.stella-ops.org/docs/doctor/articles/servicegraph/servicegraph-timeouts.md
2026-03-31 23:26:24 +03:00

checkId: check.servicegraph.timeouts
plugin: stellaops.doctor.servicegraph
severity: warn
tags: servicegraph, timeouts, configuration

Service Timeouts

What It Checks

Validates HttpClient:Timeout, Database:CommandTimeout, Cache:OperationTimeout, and HealthChecks:Timeout.

The check warns when the HTTP timeout is below 5s or above 300s, the database timeout is below 5s or above 120s, the cache timeout exceeds 30s, or the health-check timeout exceeds the HTTP timeout.
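
The rules above can be sketched as a small function. This is only an illustration of the documented thresholds, not the check's actual implementation. The function name `validate_timeouts` and its parameters are hypothetical, and all values are assumed to be in seconds.

```python
from typing import List


def validate_timeouts(http_s: float, db_s: float, cache_s: float, health_s: float) -> List[str]:
    """Return one warning per violated rule; an empty list means no warnings.

    Hypothetical helper: it mirrors the thresholds described in this article
    and assumes every value is expressed in seconds.
    """
    warnings: List[str] = []
    if http_s < 5 or http_s > 300:
        warnings.append(f"HttpClient:Timeout {http_s}s is outside 5-300s")
    if db_s < 5 or db_s > 120:
        warnings.append(f"Database:CommandTimeout {db_s}s is outside 5-120s")
    if cache_s > 30:
        warnings.append(f"Cache:OperationTimeout {cache_s}s exceeds 30s")
    if health_s > http_s:
        warnings.append(
            f"HealthChecks:Timeout {health_s}s exceeds HttpClient:Timeout {http_s}s"
        )
    return warnings


if __name__ == "__main__":
    # Example values: HTTP 100s, database 30s, cache 5s, health check 10s.
    for message in validate_timeouts(100, 30, 5, 10):
        print(message)
```

Under these rules, values such as HTTP 100s, database 30s, cache 5s, and health check 10s produce no warnings.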

Why It Matters

Timeouts define how quickly failures surface and how long stuck work ties up resources. Poor values cause either premature failures or prolonged resource exhaustion.

Common Causes

  • Defaults from one environment were copied into another with very different latency
  • Health-check timeout was set higher than the main request timeout
  • Cache or database timeouts were raised to hide underlying performance problems

How to Fix

Docker Compose

services:
  doctor-web:
    environment:
      HttpClient__Timeout: "100"
      Database__CommandTimeout: "30"
      Cache__OperationTimeout: "5"
      HealthChecks__Timeout: "10"

Bare Metal / systemd

Tune timeouts from measured service latencies, not from guesswork. Raise values only after understanding the slower dependency.
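
One possible way to turn measured latencies into a starting value is to take a high percentile of observed durations, add headroom, and clamp the result to the range the check accepts. The sketch below assumes a nearest-rank percentile and a 2x headroom factor. Both are illustrative choices, and the names `percentile` and `suggest_timeout` are hypothetical.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample set."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def suggest_timeout(
    samples_s: Sequence[float],
    pct: float = 99.0,       # assumption: size the timeout from the p99 latency
    headroom: float = 2.0,   # assumption: allow twice the observed p99
    floor_s: float = 5.0,    # lower bound the check accepts for HTTP and database
    ceiling_s: float = 300.0,  # upper bound the check accepts for HTTP
) -> float:
    """Suggest a timeout in seconds from measured latencies in seconds."""
    if not samples_s:
        raise ValueError("at least one latency sample is required")
    candidate = percentile(samples_s, pct) * headroom
    # Clamp so the suggestion stays inside the range the check accepts.
    return min(max(candidate, floor_s), ceiling_s)
```

For a database timeout, pass `ceiling_s=120.0` to match the database upper bound. Treat the result as a starting point to review against the slower dependency, not as a final value.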

Kubernetes / Helm

Keep application timeouts lower than ingress, service-mesh, and job-level deadlines so failures happen in the component that owns the retry policy.
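
As a rough illustration of that ordering, the sketch below compares an application timeout against a set of outer deadlines and reports any layer that would time out first. The function name and the layer names are hypothetical. The deadlines would have to come from your own ingress, mesh, and job configuration.

```python
from typing import Dict, List


def outer_deadline_violations(app_timeout_s: float, outer_deadlines_s: Dict[str, float]) -> List[str]:
    """Return the names of outer layers whose deadline is not above the app timeout.

    Hypothetical helper: an outer layer with a deadline at or below the
    application timeout would cut the request off before the application does.
    """
    return sorted(
        name
        for name, deadline_s in outer_deadlines_s.items()
        if deadline_s <= app_timeout_s
    )


if __name__ == "__main__":
    # Example values only; real deadlines depend on the cluster configuration.
    print(outer_deadline_violations(100, {"ingress": 120, "mesh": 90, "job": 600}))
```

With these example values only the `mesh` layer is reported, because its 90s deadline expires before the 100s application timeout.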

Verification

stella doctor --check check.servicegraph.timeouts

Related Checks

  • check.servicegraph.backend - timeout misconfiguration often shows up as backend failures first
  • check.db.latency - high database latency can force operators to revisit timeout values