| checkId | plugin | severity | tags |
|---|---|---|---|
| check.postgres.connectivity | stellaops.doctor.postgres | fail | |
PostgreSQL Connectivity
What It Checks
Opens a connection to PostgreSQL and executes SELECT version(), current_timestamp to verify the database is accessible and responsive. Measures round-trip latency:
- Critical latency: fail if response time exceeds 500ms.
- Warning latency: warn if response time exceeds 100ms.
- Connection timeout: fail if the connection attempt exceeds 10 seconds.
- Connection failure: fail on authentication errors, DNS failures, or network issues.
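The thresholds above can be sketched as a small classification function. This is a minimal illustration and not the check's actual implementation: the function name `classify_latency`, the constant names, and the returned strings are assumptions. Only the 100 ms and 500 ms values come from the list above.

```python
from typing import Optional

# Threshold values from the list above; the constant names are illustrative.
WARN_LATENCY_MS = 100
CRITICAL_LATENCY_MS = 500


def classify_latency(latency_ms: Optional[float], connected: bool = True) -> str:
    """Map a measured round trip to 'pass', 'warn', or 'fail'."""
    # A connection failure or timeout yields no latency measurement at all.
    if not connected or latency_ms is None:
        return "fail"
    # "Exceeds" is read as strictly greater than the threshold.
    if latency_ms > CRITICAL_LATENCY_MS:
        return "fail"
    if latency_ms > WARN_LATENCY_MS:
        return "warn"
    return "pass"
```

Under this reading, a response at exactly 100 ms still passes and one at exactly 500 ms only warns; whether the real check treats the boundaries this way is not stated here.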
The connection string password is masked in all evidence output.
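One way such masking could work is a regular-expression substitution over the `Key=Value;` pairs. The sketch below is an assumption about the approach, not the Doctor plugin's code: the helper name `mask_connection_string`, the mask text, and the `Pwd` alias are all hypothetical, and quoted passwords containing `;` are not handled.

```python
import re

# Matches "Password=<value>" or "Pwd=<value>" up to the next ';'.
_PASSWORD_RE = re.compile(r"\b(password|pwd)\s*=\s*[^;]*", re.IGNORECASE)


def mask_connection_string(conn: str, mask: str = "********") -> str:
    """Return the connection string with the password value replaced by a mask."""
    # Keep the key exactly as written and swap only the value.
    return _PASSWORD_RE.sub(lambda m: f"{m.group(1)}={mask}", conn)
```

Masking before the string reaches evidence output keeps credentials out of logs and dashboards while leaving host, port, and database visible for troubleshooting.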
Evidence collected: ConnectionString (masked), LatencyMs, Version, ServerTime, Status, Threshold, ErrorCode, ErrorMessage, TimeoutSeconds.
The check requires ConnectionStrings:StellaOps or Database:ConnectionString to be configured.
Why It Matters
PostgreSQL is the primary data store for the entire Stella Ops platform. Every service depends on it for configuration, state, and transactional data. If the database is unreachable, the platform is effectively down. High latency propagates through every database operation, degrading the performance of all services, API endpoints, and background jobs simultaneously. This is the most fundamental infrastructure health check.
Common Causes
- Database server not running or crashed
- Network connectivity issues between the application and database
- Firewall blocking the database port (5432)
- DNS resolution failure for the database hostname
- Invalid connection string (wrong host, port, or database name)
- Authentication failure (wrong username or password)
- Database does not exist
- Database server overloaded (high CPU, memory pressure, I/O saturation)
- Network latency between application and database hosts
- Slow queries blocking connections
- SSL/TLS certificate issues
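Several of the causes above (DNS failure, a blocked port, a server that is not running) can be told apart before involving PostgreSQL at all. The following is a rough triage sketch using only the Python standard library; the function name `probe_tcp` and the returned labels are hypothetical, and the probe does not speak the PostgreSQL protocol, so it cannot detect authentication, missing-database, or TLS problems.

```python
import socket


def probe_tcp(host: str, port: int = 5432, timeout_s: float = 3.0) -> str:
    """Classify basic reachability of host:port."""
    try:
        # Name resolution happens here; a DNS problem surfaces as gaierror.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"
    family, socktype, proto, _, addr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout_s)
            sock.connect(addr)
        return "open"
    except ConnectionRefusedError:
        return "refused"      # host reachable, nothing listening on the port
    except socket.timeout:
        return "timeout"      # often a firewall silently dropping packets
    except OSError:
        return "unreachable"  # routing or other network-level error
```

A result of `open` followed by a failing health check points toward credentials, the database name, or TLS settings rather than the network.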
How to Fix
Docker Compose
# Check postgres container status
docker compose -f docker-compose.stella-ops.yml ps postgres
# Test direct connection
docker compose -f docker-compose.stella-ops.yml exec postgres \
pg_isready -U stellaops -d stellaops_platform
# View postgres logs
docker compose -f docker-compose.stella-ops.yml logs --tail 100 postgres
# Restart postgres if needed
docker compose -f docker-compose.stella-ops.yml restart postgres
Verify connection string in environment:
services:
  platform:
    environment:
      ConnectionStrings__StellaOps: "Host=postgres;Port=5432;Database=stellaops_platform;Username=stellaops;Password=stellaops"
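When checking a connection string by eye, it can help to split it into its parts and confirm nothing is missing. The sketch below assumes the simple `Key=Value;` format shown above; the helper names and the set of required keys are illustrative assumptions, and real connection strings may use other key spellings or quoted values that this does not handle.

```python
def parse_connection_string(conn: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs into a dict with lower-cased keys."""
    parts = {}
    for segment in conn.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing ';'
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip().lower()] = value.strip()
    return parts


def missing_keys(conn: str, required=("host", "port", "database", "username")) -> list:
    """Return the required keys that are absent or empty."""
    parts = parse_connection_string(conn)
    return [key for key in required if not parts.get(key)]
```

An empty list from `missing_keys` only shows that the expected fields are present; it says nothing about whether the host resolves or the credentials are valid.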
Bare Metal / systemd
# Check PostgreSQL service status
sudo systemctl status postgresql
# Test connectivity
pg_isready -h localhost -p 5432 -U stellaops -d stellaops_platform
# Check PostgreSQL logs
sudo tail -100 /var/log/postgresql/postgresql-*.log
# Verify connection string
stella config get ConnectionStrings:StellaOps
# Test connection manually
psql -h localhost -p 5432 -U stellaops -d stellaops_platform -c "SELECT 1;"
Kubernetes / Helm
# Check PostgreSQL pod status
kubectl get pods -l app=postgresql
# Test connectivity from an application pod
kubectl exec -it <platform-pod> -- pg_isready -h postgres -p 5432
# View PostgreSQL pod logs
kubectl logs -l app=postgresql --tail=100
# Check service DNS resolution
kubectl exec -it <platform-pod> -- nslookup postgres
Verify connection string in secret:
kubectl get secret stellaops-db-credentials -o jsonpath='{.data.connection-string}' | base64 -d
Set in Helm values.yaml:
postgresql:
  host: postgres
  port: 5432
  database: stellaops_platform
  auth:
    existingSecret: stellaops-db-credentials
Verification
stella doctor run --check check.postgres.connectivity
Related Checks
- check.postgres.pool: pool exhaustion can masquerade as connectivity issues
- check.postgres.migrations: migration checks depend on connectivity
- check.operations.job-queue: database issues cause job queue failures