---
checkId: check.postgres.pool
plugin: stellaops.doctor.postgres
severity: warn
tags: [database, postgres, pool, connections]
---
# PostgreSQL Connection Pool

## What It Checks
Connects to PostgreSQL and queries `pg_stat_activity` and `pg_settings` to evaluate connection pool health:

- **Pool usage ratio**: warn above 70%, fail above 90% (active connections / max connections).
- **Waiting connections**: warn if any connections are waiting for a pool slot.

Evidence collected: `ActiveConnections`, `IdleConnections`, `MaxConnections`, `UsageRatio`, `ConfiguredMaxPoolSize`, `ConfiguredMinPoolSize`, `WaitingConnections`.

The check requires `ConnectionStrings:StellaOps` or `Database:ConnectionString` to be configured.

The SQL query executed:

```sql
SELECT
    (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') as active,
    (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle') as idle,
    (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') as max_conn,
    (SELECT count(*) FROM pg_stat_activity WHERE wait_event_type = 'Client') as waiting
```

## Why It Matters
PostgreSQL connection exhaustion is one of the most common causes of service outages in Stella Ops. When the connection pool is exhausted, all services that need the database start timing out, causing cascading failures across the platform. Waiting connections indicate that requests are already queuing for database access, which translates directly to increased latency for end users. Connection leaks, if not caught early, will eventually exhaust the pool completely.
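
The usage-ratio thresholds listed under "What It Checks" can be sketched as a small shell function. This is an illustrative sketch only, not the plugin's implementation: `pool_status` is a hypothetical helper name, and the assumption is that "above" means strictly greater than the threshold.

```bash
# Hypothetical helper (not part of Stella Ops): classify pool usage using
# the documented thresholds, warn above 70% and fail above 90%.
# Usage: pool_status <active_connections> <max_connections>
# Note: POSIX sh has no local variables, so active and max are globals.
pool_status() {
  active=$1
  max=$2
  # Guard against a missing or zero max_connections value.
  if [ "$max" -le 0 ]; then
    echo "unknown"
    return 1
  fi
  # Integer arithmetic avoids floating point:
  # active / max > 0.90 is equivalent to active * 100 > max * 90.
  if [ $((active * 100)) -gt $((max * 90)) ]; then
    echo "fail"
  elif [ $((active * 100)) -gt $((max * 70)) ]; then
    echo "warn"
  else
    echo "pass"
  fi
}
```

For example, `pool_status 150 200` prints `warn`, because 150 of 200 connections is 75% usage. The two arguments correspond to the `active` and `max_conn` columns of the query above.
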
## Common Causes
- Connection leak in application code (connections opened but not returned to pool)
- Long-running queries holding connections open
- Pool size too small for the workload (too many services sharing a single pool)
- Sudden spike in database requests (bulk scan, CI surge)
- All pool connections in use during peak load
- Connection timeout configured too long, allowing stale connections to occupy slots
- Requests arriving faster than connections are released

## How to Fix

### Docker Compose
```bash
# Check active database connections
docker compose -f docker-compose.stella-ops.yml exec postgres \
  psql -U stellaops -d stellaops_platform -c \
  "SELECT state, count(*) FROM pg_stat_activity GROUP BY state;"

# Terminate idle connections whose last query started more than 10 minutes ago
docker compose -f docker-compose.stella-ops.yml exec postgres \
  psql -U stellaops -d stellaops_platform -c \
  "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND query_start < now() - interval '10 minutes';"
```

Increase max connections in PostgreSQL, in `docker-compose.stella-ops.yml`:

```yaml
services:
  postgres:
    command: >
      postgres
      -c max_connections=200
      -c shared_buffers=256MB
```

Increase Npgsql pool size via connection string:

```yaml
services:
  platform:
    environment:
      ConnectionStrings__StellaOps: "Host=postgres;Database=stellaops_platform;Username=stellaops;Password=stellaops;Maximum Pool Size=50;Minimum Pool Size=5"
```

### Bare Metal / systemd
```bash
# Check connection statistics
psql -U stellaops -d stellaops_platform -c \
  "SELECT state, count(*) FROM pg_stat_activity GROUP BY state;"

# Check for long-running queries
psql -U stellaops -d stellaops_platform -c \
  "SELECT pid, now() - query_start AS duration, query FROM pg_stat_activity WHERE state = 'active' ORDER BY duration DESC LIMIT 10;"

# Increase max connections (takes effect only after a PostgreSQL restart)
sudo -u postgres psql -c "ALTER SYSTEM SET max_connections = 200;"
sudo systemctl restart postgresql
```

### Kubernetes / Helm
```bash
# Check connection pool from inside a pod
# (<postgres-pod> is a placeholder for the name of your PostgreSQL pod)
kubectl exec -it <postgres-pod> -- psql -U stellaops -d stellaops_platform -c \
  "SELECT state, count(*) FROM pg_stat_activity GROUP BY state;"

# Terminate idle connections whose last query started more than 10 minutes ago
kubectl exec -it <postgres-pod> -- psql -U stellaops -d stellaops_platform -c \
  "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND query_start < now() - interval '10 minutes';"
```

Set in Helm `values.yaml`:

```yaml
postgresql:
  maxConnections: 200
  sharedBuffers: 256MB

platform:
  database:
    connectionString: "Host=postgres;Database=stellaops_platform;Username=stellaops;Password=stellaops;Maximum Pool Size=50;Minimum Pool Size=5"
```

## Verification
```
stella doctor run --check check.postgres.pool
```

## Related Checks
- `check.postgres.connectivity` -- connectivity issues compound pool problems
- `check.postgres.migrations` -- schema issues can cause queries to hang, consuming connections
- `check.operations.job-queue` -- database bottleneck slows job processing