up

StellaOps Bot
2025-12-03 00:10:19 +02:00
parent ea1d58a89b
commit 37cba83708
158 changed files with 147438 additions and 867 deletions


@@ -0,0 +1,66 @@
# PostgreSQL 16 Cluster (staging / production)
This directory provisions StellaOps PostgreSQL clusters with **CloudNativePG (CNPG)**. It is pinned to Postgres 16.x, includes connection pooling (PgBouncer), Prometheus scraping, and S3-compatible backups. Everything is air-gap friendly: fetch the operator and images once, then render/apply manifests offline.
## Targets
- **Staging:** `stellaops-pg-stg` (2 instances, 200Gi data, WAL 64Gi, PgBouncer x2)
- **Production:** `stellaops-pg-prod` (3 instances, 500Gi data, WAL 128Gi, PgBouncer x3)
- **Namespace:** `platform-postgres`
## Prerequisites
- Kubernetes ≥ 1.27 with CSI storage classes `fast-ssd` (data) and `fast-wal` (WAL) available.
- CloudNativePG operator 1.23.x mirrored or downloaded to `artifacts/cloudnative-pg-1.23.0.yaml`.
- Images mirrored to your registry (example tags):
  - `ghcr.io/cloudnative-pg/postgresql:16.4`
  - `ghcr.io/cloudnative-pg/cloudnative-pg:1.23.0` (operator)
  - `ghcr.io/cloudnative-pg/pgbouncer:1.23.0`
- Secrets created from the templates under `ops/devops/postgres/secrets/` (superuser, app user, backup credentials).
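
As a rough sketch of the mirroring prerequisite, the helper below rewrites an upstream image reference to a mirror registry and prints the copy commands without running them. The registry host `registry.example.internal` and the choice of `skopeo` are assumptions for illustration; use whatever tooling and registry your environment has approved.

```bash
# Rewrite an upstream image reference so it points at a mirror registry.
# Drops the first path component (the upstream host) and prepends the mirror host.
mirror_ref() {
  local mirror="$1" image="$2"
  printf '%s/%s\n' "$mirror" "${image#*/}"
}

# Print (do not run) one copy command per image passed after the mirror host.
# skopeo is an assumed tool here; the commands are only echoed for review.
print_mirror_plan() {
  local mirror="$1" image
  shift
  for image in "$@"; do
    printf 'skopeo copy docker://%s docker://%s\n' \
      "$image" "$(mirror_ref "$mirror" "$image")"
  done
}

# Example (hypothetical mirror host):
#   print_mirror_plan registry.example.internal ghcr.io/cloudnative-pg/postgresql:16.4
```

Printing the plan first keeps the step reviewable; the output can be piped to `sh` on a connected host once the list looks right.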
## Render & Apply (deterministic)
```bash
# 1) Create namespace
kubectl apply -f ops/devops/postgres/namespace.yaml
# 2) Install operator (offline-friendly: use the pinned manifest you mirrored)
kubectl apply -f artifacts/cloudnative-pg-1.23.0.yaml
# 3) Create secrets (replace passwords/keys first)
kubectl apply -f ops/devops/postgres/secrets/example-superuser.yaml
kubectl apply -f ops/devops/postgres/secrets/example-app.yaml
kubectl apply -f ops/devops/postgres/secrets/example-backup-credentials.yaml
# 4) Apply the cluster and pooler for the target environment
kubectl apply -f ops/devops/postgres/cluster-staging.yaml
kubectl apply -f ops/devops/postgres/pooler-staging.yaml
# or
kubectl apply -f ops/devops/postgres/cluster-production.yaml
kubectl apply -f ops/devops/postgres/pooler-production.yaml
```
## Connection Endpoints
- RW service: `<cluster>-rw` (e.g., `stellaops-pg-stg-rw:5432`)
- RO service: `<cluster>-ro`
- PgBouncer pooler: `<pooler-name>` (e.g., `stellaops-pg-stg-pooler:6432`)
**Application connection string (matches library defaults):**
`Host=stellaops-pg-stg-pooler;Port=6432;Username=stellaops_app;Password=<app-password>;Database=stellaops;Pooling=true;Timeout=15;CommandTimeout=30;Ssl Mode=Require;`
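
The naming convention above can be sketched as a few shell helpers that derive the service names from a cluster name and assemble the connection string for a pooler host. The function names and the `PG_APP_PASSWORD` variable are hypothetical; the connection string fields are copied from the example above.

```bash
# Derive the read-write and read-only service names from a cluster name.
rw_service() { printf '%s-rw\n' "$1"; }
ro_service() { printf '%s-ro\n' "$1"; }

# Assemble the application connection string for a given pooler host.
# The password comes from PG_APP_PASSWORD (a hypothetical variable name) so it
# does not appear on the command line; the call aborts if it is unset.
app_connection_string() {
  local host="$1" password="${PG_APP_PASSWORD:?set PG_APP_PASSWORD first}"
  printf 'Host=%s;Port=6432;Username=stellaops_app;Password=%s;Database=stellaops;Pooling=true;Timeout=15;CommandTimeout=30;Ssl Mode=Require;\n' \
    "$host" "$password"
}

# Example:
#   PG_APP_PASSWORD=... app_connection_string stellaops-pg-stg-pooler
```

A password containing `;` would need quoting according to the driver's connection string rules, which this sketch does not handle.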
## Monitoring & Backups
- `monitoring.enablePodMonitor: true` exposes PodMonitor for Prometheus Operator.
- Barman/S3 backups are enabled by default; set `backup.barmanObjectStore.destinationPath` per env and populate `stellaops-pg-backup` credentials.
- WAL compression is `gzip`; retention is operator-managed (configure via Barman bucket policies).
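
To make the per-environment `destinationPath` setting concrete, the sketch below builds a JSON merge patch for that field. The bucket path is a placeholder and the function name is hypothetical. Editing the cluster manifest and re-applying it remains the deterministic route; a live patch like this is only a convenience for trying a value out.

```bash
# Build a JSON merge patch that sets the Barman object store destination.
# The destination is expected to be a plain URL without quotes or backslashes.
backup_destination_patch() {
  local destination="$1"
  printf '{"spec":{"backup":{"barmanObjectStore":{"destinationPath":"%s"}}}}\n' \
    "$destination"
}

# Example (placeholder bucket):
#   kubectl patch cluster stellaops-pg-stg -n platform-postgres --type merge \
#     -p "$(backup_destination_patch s3://example-bucket/stellaops-pg-stg)"
```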
## Alignment with code defaults
- Session settings: UTC timezone, 30s `statement_timeout`, tenant context via `set_config('app.current_tenant', ...)`.
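- The connection pooler runs in **transaction** mode with a `server_reset_query` that clears session state, which keeps `RepositoryBase` deterministic. Note that PgBouncer runs `server_reset_query` outside session mode only when `server_reset_query_always` is enabled.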
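
The session settings above can be illustrated with a small helper that prints a SQL preamble for one transaction. This is a sketch, not the actual `RepositoryBase` logic: it assumes the transaction-local forms (`SET LOCAL`, and `set_config` with its third argument set to `true`), which fit transaction pooling because nothing outlives the transaction. The function name and the tenant id format are hypothetical.

```bash
# Print a transaction preamble applying the settings listed above.
# Assumption: transaction-local settings are used; the real code may differ.
tenant_preamble() {
  local tenant="$1"
  # Accept only a conservative identifier so the value is safe to inline in SQL.
  case "$tenant" in
    ''|*[!A-Za-z0-9_-]*) echo "invalid tenant id" >&2; return 1 ;;
  esac
  cat <<SQL
BEGIN;
SET LOCAL TIME ZONE 'UTC';
SET LOCAL statement_timeout = '30s';
SELECT set_config('app.current_tenant', '${tenant}', true);
SQL
}

# Example: run a query inside the same transaction, then commit.
#   { tenant_preamble tenant_a; echo "SELECT current_setting('app.current_tenant'); COMMIT;"; } \
#     | psql "<rw-connection-string>"
```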
## Verification checklist
- `kubectl get cluster -n platform-postgres` shows `Ready` replicas matching `instances`.
- `kubectl logs deploy/cnpg-controller-manager -n cnpg-system` shows no webhook failures.
- `kubectl get podmonitor -n platform-postgres` returns entries for the cluster and pooler.
- `psql "<rw-connection-string>" -c 'select 1'` succeeds from the CI runner subnet.
- The CNPG `barman-cloud-backup-list` output shows successful full and WAL backups.
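
The first checklist item can be scripted roughly as follows. The sketch assumes the CNPG `Cluster` resource exposes `.spec.instances` and `.status.readyInstances`; confirm the field names against the operator version you mirrored. The function names are hypothetical.

```bash
# Succeed only when the desired count is non-empty and equals the ready count.
instances_ready() {
  local desired="$1" ready="$2"
  [ -n "$desired" ] && [ "$desired" = "$ready" ]
}

# Read both counts from the Cluster resource and compare them.
# Assumption: .spec.instances and .status.readyInstances exist on the resource.
check_cluster() {
  local name="$1" ns="${2:-platform-postgres}" desired ready
  desired="$(kubectl get cluster "$name" -n "$ns" -o jsonpath='{.spec.instances}')" || return 1
  ready="$(kubectl get cluster "$name" -n "$ns" -o jsonpath='{.status.readyInstances}')" || return 1
  if instances_ready "$desired" "$ready"; then
    echo "OK: $name has $ready/$desired instances ready"
  else
    echo "FAIL: $name has ${ready:-0}/${desired:-?} instances ready" >&2
    return 1
  fi
}

# Example:
#   check_cluster stellaops-pg-stg
```

Keeping the comparison separate from the `kubectl` calls makes it possible to exercise the logic without a cluster.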
## Offline notes
- Mirror the operator manifest and container images to the approved registry first; no live downloads occur at runtime.
- If the Prometheus Operator is not installed, leave the PodMonitor setting applied; it is inert without the PodMonitor CRD.