doctor: complete runtime check documentation sprint

Signed-off-by: master <>
master
2026-03-31 23:26:24 +03:00
parent 404d50bcb7
commit 152c1b1357
54 changed files with 2210 additions and 258 deletions


@@ -0,0 +1,53 @@
---
checkId: check.db.latency
plugin: stellaops.doctor.database
severity: fail
tags: [database, postgres, latency, performance]
---
# Query Latency
## What It Checks
Runs two warmup queries and then measures five `SELECT 1` probes plus five temporary-table `INSERT` probes against PostgreSQL.
The check warns when the p95 latency exceeds `50ms` and fails when the p95 latency exceeds `200ms`.
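The threshold logic can be sketched in shell as follows. This is a minimal illustration, not the check's implementation: the nearest-rank percentile method and the helper names `p95` and `classify_latency` are assumptions, and whether the `SELECT` and `INSERT` samples are pooled into a single p95 is not stated here. Only the `50ms` and `200ms` thresholds come from the description above.

```bash
# Sketch only: nearest-rank p95 is an assumption, not the check's confirmed method.
# The 50 ms / 200 ms thresholds come from the description above.
p95() {
  # Print the nearest-rank 95th percentile of the millisecond samples given as arguments.
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | sort -n | awk '
    { v[NR] = $1 }
    END {
      rank = int((NR * 95 + 99) / 100)   # ceil(0.95 * NR) with integer math
      print v[rank]
    }'
}

classify_latency() {
  # Map the p95 value to pass / warn / fail.
  local value
  value=$(p95 "$@") || return 1
  awk -v p="$value" 'BEGIN {
    if (p > 200)     print "fail"
    else if (p > 50) print "warn"
    else             print "pass"
  }'
}

classify_latency 4 5 5 6 6 7 8 9 12 61   # prints "warn"
```

Under the nearest-rank assumption, the p95 of ten samples is the slowest sample, so a single slow probe is enough to move the result to warn or fail.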
## Why It Matters
Healthy connectivity is not enough if the database path is slow. Elevated query latency turns into slow UI pages, delayed releases, and queue backlogs across the platform.
## Common Causes
- CPU, memory, or I/O pressure on the PostgreSQL host
- Cross-host or cross-region latency between Doctor and PostgreSQL
- Lock contention or long-running transactions
- Shared infrastructure saturation in the default compose stack
## How to Fix
### Docker Compose
```bash
docker compose -f devops/compose/docker-compose.stella-ops.yml exec postgres psql -U stellaops -d stellaops -c "SELECT * FROM pg_stat_activity WHERE state = 'active';"
docker compose -f devops/compose/docker-compose.stella-ops.yml exec postgres psql -U stellaops -d stellaops -c "SELECT * FROM pg_locks WHERE NOT granted;"
docker compose -f devops/compose/docker-compose.stella-ops.yml stats postgres
```
Fix network placement and storage performance before raising the thresholds. If the database is remote, keep `doctor-web` and PostgreSQL on the same low-latency network segment.
### Bare Metal / systemd
```bash
psql -h <db-host> -U <db-user> -d <db-name> -c "SELECT * FROM pg_stat_activity WHERE state = 'active';"
psql -h <db-host> -U <db-user> -d <db-name> -c "SELECT * FROM pg_locks WHERE NOT granted;"
```
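For a rough client-side comparison between hosts, a small timing wrapper around `psql` may help. This is a sketch under stated assumptions: `probe_ms` is a hypothetical helper, not part of Stella Ops, and it relies on GNU `date` for nanosecond timestamps.

```bash
# Sketch only: probe_ms is a hypothetical helper; %N requires GNU date (not BSD/macOS).
probe_ms() {
  # Run the given command, discard its output, and print elapsed wall time in ms.
  local start end
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Example (fill in the placeholders first):
#   for i in 1 2 3 4 5; do probe_ms psql -h <db-host> -U <db-user> -d <db-name> -c "SELECT 1;"; done
```

Each `psql` invocation includes process startup, connection setup, and authentication, so these numbers will be higher than in-session query latency. They are useful for comparing one client host against another, not for comparing directly against the check's thresholds.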
### Kubernetes / Helm
```bash
kubectl top pod -n <namespace> <postgres-pod>
kubectl exec -n <namespace> <postgres-pod> -- psql -U <db-user> -d <db-name> -c "SELECT now();"
```
## Verification
```bash
stella doctor --check check.db.latency
```
## Related Checks
- `check.db.connection` - basic reachability must pass before latency numbers are meaningful
- `check.db.pool.health` - pool saturation often shows up as latency first