# PostgreSQL 16 Cluster (staging / production)
This directory provisions StellaOps PostgreSQL clusters with **CloudNativePG (CNPG)**. It is pinned to Postgres 16.x, includes connection pooling (PgBouncer), Prometheus scraping, and S3-compatible backups. Everything is air-gap friendly: fetch the operator and images once, then render/apply manifests offline.
## Targets
- **Staging:** `stellaops-pg-stg` (2 instances, 200Gi data, WAL 64Gi, PgBouncer x2)
- **Production:** `stellaops-pg-prod` (3 instances, 500Gi data, WAL 128Gi, PgBouncer x3)
- **Namespace:** `platform-postgres`
## Prerequisites
- Kubernetes ≥ 1.27 with CSI storage classes `fast-ssd` (data) and `fast-wal` (WAL) available.
- CloudNativePG operator 1.23.x mirrored or downloaded to `artifacts/cloudnative-pg-1.23.0.yaml`.
- Images mirrored to your registry (example tags):
  - `ghcr.io/cloudnative-pg/postgresql:16.4`
  - `ghcr.io/cloudnative-pg/postgresql-operator:1.23.0`
  - `ghcr.io/cloudnative-pg/pgbouncer:1.23.0`
- Secrets created from the templates under `ops/devops/postgres/secrets/` (superuser, app user, backup credentials).
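
The prerequisites above can be checked before anything is applied. The following is a minimal sketch, not part of the repository: the function names `k8s_version_ok` and `preflight_commands` are hypothetical, and the second function only prints the checks so they can be reviewed first.

```bash
# Hypothetical preflight helpers (illustrative names, not from the repository).

# Succeeds when a version string such as "v1.29.3" is at least 1.27.
k8s_version_ok() (
  ver=${1#v}                 # drop a leading "v"
  major=${ver%%.*}           # text before the first dot
  rest=${ver#*.}             # text after the first dot
  minor=${rest%%.*}          # text before the next dot
  minor=${minor%%[!0-9]*}    # strip vendor suffixes such as "27+"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 27 ]; }
)

# Prints (does not run) checks for the storage classes and the pinned manifest.
preflight_commands() (
  printf 'kubectl get storageclass %s\n' fast-ssd fast-wal
  printf 'test -f %s\n' artifacts/cloudnative-pg-1.23.0.yaml
)
```

A possible use is `preflight_commands | sh -e`, plus `k8s_version_ok "$(kubectl version -o json | jq -r .serverVersion.gitVersion)"`; the latter assumes `jq` is installed on the workstation.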
## Render & Apply (deterministic)
```bash
# 1) Create namespace
kubectl apply -f ops/devops/postgres/namespace.yaml
# 2) Install operator (offline-friendly: use the pinned manifest you mirrored)
kubectl apply -f artifacts/cloudnative-pg-1.23.0.yaml
# 3) Create secrets (replace passwords/keys first)
kubectl apply -f ops/devops/postgres/secrets/example-superuser.yaml
kubectl apply -f ops/devops/postgres/secrets/example-app.yaml
kubectl apply -f ops/devops/postgres/secrets/example-backup-credentials.yaml
# 4) Apply the cluster and pooler for the target environment
kubectl apply -f ops/devops/postgres/cluster-staging.yaml
kubectl apply -f ops/devops/postgres/pooler-staging.yaml
# or
kubectl apply -f ops/devops/postgres/cluster-production.yaml
kubectl apply -f ops/devops/postgres/pooler-production.yaml
```
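
Step 4 differs only in the environment name, so it could be wrapped in a small helper. This is a sketch under the assumption that the file naming shown above (`cluster-<env>.yaml`, `pooler-<env>.yaml`) holds; `manifests_for_env` and `apply_env` are hypothetical names.

```bash
# Hypothetical wrapper around step 4 (illustrative, not from the repository).
POSTGRES_DIR=ops/devops/postgres

# Prints the cluster and pooler manifest paths for "staging" or "production".
manifests_for_env() (
  case "$1" in
    staging|production) ;;
    *) echo "unknown environment: $1" >&2; exit 2 ;;
  esac
  printf '%s/cluster-%s.yaml\n' "$POSTGRES_DIR" "$1"
  printf '%s/pooler-%s.yaml\n' "$POSTGRES_DIR" "$1"
)

# Applies the cluster first, then the pooler, stopping at the first failure.
apply_env() (
  files=$(manifests_for_env "$1") || exit 2
  for manifest in $files; do
    kubectl apply -f "$manifest" || exit 1
  done
)
```

With these definitions loaded, `apply_env staging` would run the same two `kubectl apply` commands as the staging branch of step 4.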
## Connection Endpoints
- RW service: `<cluster>-rw` (e.g., `stellaops-pg-stg-rw:5432`)
- RO service: `<cluster>-ro`
- PgBouncer pooler: `<pooler-name>` (e.g., `stellaops-pg-stg-pooler:6432`)

**Application connection string (matches library defaults):**
`Host=stellaops-pg-stg-pooler;Port=6432;Username=stellaops_app;Password=<app-password>;Database=stellaops;Pooling=true;Timeout=15;CommandTimeout=30;Ssl Mode=Require;`
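
To keep the string consistent across environments, it can be generated rather than hand-edited. The sketch below assumes the pooler is named `<cluster>-pooler`, as in the staging example; `app_connection_string` is a hypothetical helper, and it does not quote passwords that contain `;` or quote characters.

```bash
# Hypothetical helper (illustrative): builds the application connection string.
# Assumes the pooler service is "<cluster>-pooler", as in the staging example.
# Passwords containing ';' or quotes would need extra quoting, not handled here.
app_connection_string() (
  cluster=$1
  password=$2
  printf 'Host=%s-pooler;Port=6432;Username=stellaops_app;Password=%s;' \
    "$cluster" "$password"
  printf 'Database=stellaops;Pooling=true;Timeout=15;CommandTimeout=30;Ssl Mode=Require;'
)
```

For example, `app_connection_string stellaops-pg-prod "$APP_PASSWORD"` yields the production equivalent of the string above, where `APP_PASSWORD` is an assumed variable holding the value from the app-user secret.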
## Monitoring & Backups
- `monitoring.enablePodMonitor: true` creates a PodMonitor for the Prometheus Operator to scrape.
- Barman/S3 backups are enabled by default; set `backup.barmanObjectStore.destinationPath` per env and populate `stellaops-pg-backup` credentials.
- WAL compression is `gzip`; retention is operator-managed (configure via Barman bucket policies).
## Alignment with code defaults
- Session settings: UTC timezone, 30s `statement_timeout`, tenant context via `set_config('app.current_tenant', ...)`.
- Connection pooler uses **transaction** mode with a `server_reset_query` that clears session state, keeping RepositoryBase deterministic.
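
Because transaction pooling does not pin a client to one server connection, one way to apply these session settings is to scope them to a single transaction. The sketch below is an assumption about how this could be done, not a description of what `RepositoryBase` does: `tenant_session_sql` is a hypothetical helper that prints a transaction preamble using `SET LOCAL` and the transaction-local form of `set_config`.

```bash
# Hypothetical helper (illustrative): prints a transaction preamble that sets
# UTC, a 30s statement timeout, and the tenant, all scoped to the transaction.
tenant_session_sql() (
  # Double any single quotes so the tenant id is a valid SQL string literal.
  tenant=$(printf '%s' "$1" | sed "s/'/''/g")
  printf "BEGIN;\n"
  printf "SET LOCAL TIME ZONE 'UTC';\n"
  printf "SET LOCAL statement_timeout = '30s';\n"
  # Third argument "true" makes the setting local to the current transaction.
  printf "SELECT set_config('app.current_tenant', '%s', true);\n" "$tenant"
)
```

The caller appends its own statements and a final `COMMIT;`, for instance `{ tenant_session_sql tenant-a; echo 'SELECT 1; COMMIT;'; } | psql "<rw-connection-string>"`.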
## Verification checklist
- `kubectl get cluster -n platform-postgres` shows `Ready` replicas matching `instances`.
- `kubectl logs deploy/cnpg-controller-manager -n cnpg-system` has no failing webhooks.
- `kubectl get podmonitor -n platform-postgres` returns entries for the cluster and pooler.
- `psql "<rw-connection-string>" -c 'select 1'` works from the CI runner subnet.
- `barman-cloud-backup-list` for the `cnpg` cluster shows successful full and WAL backups.
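
The first checklist item can be scripted. This is a rough sketch: `instances_ready` and `cluster_ready_pair` are hypothetical names, and the JSONPath assumes the Cluster resource exposes `status.readyInstances` alongside `spec.instances`.

```bash
# Hypothetical helpers (illustrative) for the "Ready replicas" check.

# Succeeds when a "ready/desired" pair such as "3/3" shows every instance ready.
instances_ready() (
  ready=${1%%/*}      # part before the slash
  desired=${1##*/}    # part after the slash
  [ -n "$ready" ] && [ -n "$desired" ] &&
    [ "$desired" -gt 0 ] && [ "$ready" -eq "$desired" ]
)

# Prints "ready/desired" for a cluster in the platform-postgres namespace.
# Assumes the field names status.readyInstances and spec.instances.
cluster_ready_pair() (
  kubectl get cluster "$1" -n platform-postgres \
    -o jsonpath='{.status.readyInstances}/{.spec.instances}'
)
```

A CI step might then run `instances_ready "$(cluster_ready_pair stellaops-pg-stg)"` and fail the job on a non-zero exit status.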
## Offline notes
- Mirror the operator manifest and container images to the approved registry first; no live downloads occur at runtime.
- If Prometheus is not present, leave PodMonitor applied; it is inert without the CRD.
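
Mirroring the three example images could look like the sketch below. It assumes `skopeo` is the copy tool and uses `registry.internal.example` as a placeholder registry; both are assumptions, as are the helper names `mirror_ref` and `mirror_commands`. The commands are printed rather than executed so they can be reviewed on the connected side first.

```bash
# Hypothetical helpers (illustrative): plan the image mirroring step.

# Rewrites "ghcr.io/cloudnative-pg/postgresql:16.4" to
# "<mirror>/cloudnative-pg/postgresql:16.4" by swapping the registry host.
mirror_ref() (
  mirror=$1
  image=$2
  printf '%s/%s\n' "$mirror" "${image#*/}"
)

# Prints one "skopeo copy" command per example image from the prerequisites.
mirror_commands() (
  mirror=$1
  for image in \
    ghcr.io/cloudnative-pg/postgresql:16.4 \
    ghcr.io/cloudnative-pg/postgresql-operator:1.23.0 \
    ghcr.io/cloudnative-pg/pgbouncer:1.23.0
  do
    printf 'skopeo copy --all docker://%s docker://%s\n' \
      "$image" "$(mirror_ref "$mirror" "$image")"
  done
)
```

Running `mirror_commands registry.internal.example` prints three commands; piping them to `sh -e` on a machine with registry access would perform the copy.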