master 152c1b1357 doctor: complete runtime check documentation sprint

Signed-off-by: master <>

2026-03-31 23:26:24 +03:00

checkId, plugin, severity, tags

Query Latency

What It Checks

Runs two warmup queries and then measures five SELECT 1 probes plus five temporary-table INSERT probes against PostgreSQL.

The check warns when the p95 latency exceeds 50 ms and fails when it exceeds 200 ms.
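
The thresholds above can be sketched as a small shell helper. This is an illustration only: `classify_p95` is a hypothetical name, and the nearest-rank percentile method is an assumption, since this page does not say how the check computes p95.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify latency samples (in ms) against the
# thresholds documented on this page (warn above 50 ms, fail above 200 ms).
# Assumption: nearest-rank p95; the check's real percentile method may differ.
classify_p95() {
  printf '%s\n' "$@" | sort -n | awk '
    { v[NR] = $1 }
    END {
      # ceil(0.95 * n) using integer arithmetic
      rank = int((NR * 95 + 99) / 100)
      if (rank < 1) rank = 1
      p95 = v[rank] + 0
      status = "pass"
      if (p95 > 200) status = "fail"
      else if (p95 > 50) status = "warn"
      printf "%s p95=%sms\n", status, p95
    }'
}
```

For example, `classify_p95 3 4 4 5 5 6 6 7 8 120` prints `warn p95=120ms`. Under the nearest-rank assumption, ten samples put the p95 at the slowest probe, so a single outlier is enough to cross a threshold.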

Why It Matters

Healthy connectivity is not enough if the database path is slow. Elevated query latency turns into slow UI pages, delayed releases, and queue backlogs across the platform.

Common Causes

CPU, memory, or I/O pressure on the PostgreSQL host
Cross-host or cross-region latency between Doctor and PostgreSQL
Lock contention or long-running transactions
Shared infrastructure saturation in the default compose stack

How to Fix

Docker Compose

docker compose -f devops/compose/docker-compose.stella-ops.yml exec postgres psql -U stellaops -d stellaops -c "SELECT * FROM pg_stat_activity WHERE state = 'active';"
docker compose -f devops/compose/docker-compose.stella-ops.yml exec postgres psql -U stellaops -d stellaops -c "SELECT * FROM pg_locks WHERE NOT granted;"
docker compose -f devops/compose/docker-compose.stella-ops.yml stats postgres

Tune connection placement and storage before raising thresholds. If the database is remote, keep doctor-web and PostgreSQL on the same low-latency network segment.

Bare Metal / systemd

psql -h <db-host> -U <db-user> -d <db-name> -c "SELECT * FROM pg_stat_activity WHERE state = 'active';"
psql -h <db-host> -U <db-user> -d <db-name> -c "SELECT * FROM pg_locks WHERE NOT granted;"
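
When the `SELECT *` output above is too wide to read, a narrower query can help spot long-running transactions and the sessions blocking them. The sketch below is not part of Doctor: `blocked_sessions_sql` is a hypothetical helper, and it assumes PostgreSQL 9.6 or later, where `pg_blocking_pids()` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a diagnostic query listing the ten oldest
# open transactions and which backend PIDs (if any) are blocking each one.
# Assumes PostgreSQL 9.6+ for pg_blocking_pids().
blocked_sessions_sql() {
  cat <<'SQL'
SELECT pid,
       state,
       now() - xact_start AS xact_age,
       pg_blocking_pids(pid) AS blocked_by,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND xact_start IS NOT NULL
ORDER BY xact_age DESC
LIMIT 10;
SQL
}
```

Pass the query to any of the `psql` invocations on this page, for example `psql -h <db-host> -U <db-user> -d <db-name> -c "$(blocked_sessions_sql)"`.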

Kubernetes / Helm

kubectl top pod -n <namespace> <postgres-pod>
kubectl exec -n <namespace> <postgres-pod> -- psql -U <db-user> -d <db-name> -c "SELECT now();"
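
To get rough per-query timings by hand, psql's `\timing` meta-command can be combined with a small filter. This is a sketch under assumptions: `extract_timings_ms` is a hypothetical helper, and the numbers are measured by `psql` where it runs (inside the pod in the Kubernetes case), so they do not include the network path between Doctor and PostgreSQL.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read psql output on stdin and print the millisecond
# value from each "Time: <n> ms" line produced by \timing.
extract_timings_ms() {
  awk '/^Time: / { print $2 }'
}
```

A possible invocation is `kubectl exec -n <namespace> <postgres-pod> -- psql -U <db-user> -d <db-name> -c '\timing on' -c 'SELECT 1;' -c 'SELECT 1;' | extract_timings_ms`, which prints one value per timed statement.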

Verification

stella doctor --check check.db.latency

Related Checks

check.db.connection - basic reachability must pass before latency numbers are meaningful
check.db.pool.health - pool saturation often shows up as latency first
