- Deleted obsolete test files for SchedulerAuditService and SchedulerMongoSessionFactory. - Removed unused TestDataFactory class. - Updated project files for Mongo.Tests to remove references to deleted files. - Upgraded BouncyCastle.Cryptography package to version 2.6.2 across multiple projects. - Replaced Microsoft.Extensions.Http.Polly with Microsoft.Extensions.Http.Resilience in Zastava.Webhook project. - Updated NetEscapades.Configuration.Yaml package to version 3.1.0 in Configuration library. - Upgraded Pkcs11Interop package to version 5.1.2 in Cryptography libraries. - Refactored Argon2idPasswordHasher to use BouncyCastle for hashing instead of Konscious. - Updated JsonSchema.Net package to version 7.3.2 in Microservice project. - Updated global.json to use .NET SDK version 10.0.101.
746 lines
20 KiB
Markdown
746 lines
20 KiB
Markdown
# PostgreSQL Operations Guide
|
|
|
|
**Version:** 1.0.0
|
|
**Last Updated:** 2025-12-10
|
|
**Status:** Active
|
|
|
|
This guide covers PostgreSQL operations for StellaOps, including setup, performance tuning, monitoring, backup/restore, and scaling recommendations.
|
|
|
|
---
|
|
|
|
## 1. Overview
|
|
|
|
StellaOps uses PostgreSQL (≥16) as the primary control-plane database with per-module schema isolation. MongoDB is retained only for legacy modules not yet converted.
|
|
|
|
### 1.1 Schema Topology
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────┐
|
|
│ PostgreSQL Cluster │
|
|
│ ┌─────────────────────────────────────────────────────────────┐│
|
|
│ │ stellaops (database) ││
|
|
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ││
|
|
│ │ │authority│ │ vuln │ │ vex │ │scheduler│ ││
|
|
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ ││
|
|
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ││
|
|
│ │ │ notify │ │ policy │ │ packs │ │ issuer │ ││
|
|
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ ││
|
|
│ │ ┌─────────┐ ││
|
|
│ │ │ audit │ (cross-cutting audit schema) ││
|
|
│ │ └─────────┘ ││
|
|
│ └─────────────────────────────────────────────────────────────┘│
|
|
└─────────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
### 1.2 Module Schema Ownership
|
|
|
|
| Schema | Owner Module | Primary Tables |
|
|
|--------|--------------|----------------|
|
|
| `authority` | Authority | tenants, users, roles, tokens, licenses |
|
|
| `vuln` | Concelier | sources, advisories, advisory_affected, kev_flags |
|
|
| `vex` | Excititor | projects, graph_revisions, statements, observations |
|
|
| `scheduler` | Scheduler | schedules, runs, graph_jobs, workers, locks |
|
|
| `notify` | Notify | channels, templates, rules, deliveries |
|
|
| `policy` | Policy | packs, rules, evaluations, exceptions |
|
|
| `concelier` | Concelier | documents, dtos, states, exports |
|
|
| `audit` | Shared | audit_log (cross-cutting) |
|
|
|
|
---
|
|
|
|
## 2. Performance Configuration
|
|
|
|
### 2.1 Enable pg_stat_statements
|
|
|
|
The `pg_stat_statements` extension is essential for query performance analysis. Enable it in your PostgreSQL configuration:
|
|
|
|
**postgresql.conf:**
|
|
```ini
|
|
# Load the extension at startup
|
|
shared_preload_libraries = 'pg_stat_statements'
|
|
|
|
# Configuration
|
|
pg_stat_statements.max = 10000
|
|
pg_stat_statements.track = all
|
|
pg_stat_statements.track_utility = on
|
|
pg_stat_statements.track_planning = on
|
|
```
|
|
|
|
**Enable in database:**
|
|
```sql
|
|
-- Create the extension (requires superuser)
|
|
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
|
|
|
|
-- Verify installation
|
|
SELECT * FROM pg_stat_statements LIMIT 1;
|
|
|
|
-- Reset statistics (useful after configuration changes)
|
|
SELECT pg_stat_statements_reset();
|
|
```
|
|
|
|
### 2.2 Recommended PostgreSQL Settings
|
|
|
|
**Memory Configuration (adjust based on available RAM):**
|
|
```ini
|
|
# For a server with 16GB RAM dedicated to PostgreSQL:
|
|
shared_buffers = 4GB # 25% of RAM
|
|
effective_cache_size = 12GB # 75% of RAM
|
|
maintenance_work_mem = 1GB # For VACUUM, CREATE INDEX
|
|
work_mem = 64MB # Per-operation sort memory
|
|
|
|
# Connection management
|
|
max_connections = 200 # Adjust based on pooling
|
|
```
|
|
|
|
**Write-Ahead Log (WAL):**
|
|
```ini
|
|
wal_buffers = 64MB
|
|
checkpoint_completion_target = 0.9
|
|
max_wal_size = 4GB
|
|
min_wal_size = 1GB
|
|
```
|
|
|
|
**Query Planner:**
|
|
```ini
|
|
random_page_cost = 1.1 # For SSDs (default 4.0 is for HDDs)
|
|
effective_io_concurrency = 200 # For SSDs
|
|
default_statistics_target = 100 # Increase for complex queries
|
|
```
|
|
|
|
**Parallel Query:**
|
|
```ini
|
|
max_parallel_workers_per_gather = 4
|
|
max_parallel_workers = 8
|
|
max_parallel_maintenance_workers = 4
|
|
```
|
|
|
|
### 2.3 Connection Pooling (PgBouncer)
|
|
|
|
**Recommended PgBouncer configuration:**
|
|
```ini
|
|
[pgbouncer]
|
|
pool_mode = transaction
|
|
max_client_conn = 1000
|
|
default_pool_size = 20
|
|
reserve_pool_size = 5
|
|
reserve_pool_timeout = 3
|
|
server_idle_timeout = 60
|
|
query_timeout = 30
|
|
```
|
|
|
|
**Session configuration (set on connection open):**
|
|
```sql
|
|
SET app.tenant_id = '<tenant-uuid>';
|
|
SET timezone = 'UTC';
|
|
SET statement_timeout = '30s';
|
|
```
|
|
|
|
---
|
|
|
|
## 3. Query Performance Analysis
|
|
|
|
### 3.1 Identifying Slow Queries
|
|
|
|
**Top queries by total time:**
|
|
```sql
|
|
SELECT
|
|
substring(query, 1, 100) as query_preview,
|
|
calls,
|
|
round(total_exec_time::numeric, 2) as total_ms,
|
|
round(mean_exec_time::numeric, 2) as mean_ms,
|
|
round((100 * total_exec_time / sum(total_exec_time) over())::numeric, 2) as percent_total
|
|
FROM pg_stat_statements
|
|
ORDER BY total_exec_time DESC
|
|
LIMIT 20;
|
|
```
|
|
|
|
**Queries with high mean execution time:**
|
|
```sql
|
|
SELECT
|
|
substring(query, 1, 100) as query_preview,
|
|
calls,
|
|
round(mean_exec_time::numeric, 2) as mean_ms,
|
|
round(stddev_exec_time::numeric, 2) as stddev_ms,
|
|
rows
|
|
FROM pg_stat_statements
|
|
WHERE calls > 10
|
|
ORDER BY mean_exec_time DESC
|
|
LIMIT 20;
|
|
```
|
|
|
|
**Queries with high buffer usage (I/O intensive):**
|
|
```sql
|
|
SELECT
|
|
substring(query, 1, 100) as query_preview,
|
|
calls,
|
|
shared_blks_hit + shared_blks_read as total_blks,
|
|
round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 2) as hit_ratio
|
|
FROM pg_stat_statements
|
|
WHERE shared_blks_hit + shared_blks_read > 1000
|
|
ORDER BY shared_blks_read DESC
|
|
LIMIT 20;
|
|
```
|
|
|
|
### 3.2 Using EXPLAIN ANALYZE
|
|
|
|
**Basic usage:**
|
|
```sql
|
|
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
|
|
SELECT * FROM vuln.advisories
|
|
WHERE state = 'active' AND severity = 'critical'
|
|
ORDER BY modified_at DESC
|
|
LIMIT 100;
|
|
```
|
|
|
|
**Understanding output - key indicators:**
|
|
- **Seq Scan** on large tables = missing index
|
|
- **Hash Join** vs **Nested Loop** - consider data sizes
|
|
- **Rows** estimate vs actual - statistics accuracy
|
|
- **Buffers: shared hit/read** - cache effectiveness
|
|
|
|
**Example analysis:**
|
|
```sql
|
|
-- Bad: Sequential scan on large table
|
|
Seq Scan on advisories (cost=0.00..50000.00 rows=1000 width=100)
|
|
Filter: ((state = 'active') AND (severity = 'critical'))
|
|
Rows Removed by Filter: 99000
|
|
|
|
-- Good: Index scan
|
|
Index Scan using idx_advisories_state_severity on advisories
|
|
Index Cond: ((state = 'active') AND (severity = 'critical'))
|
|
```
|
|
|
|
### 3.3 Index Analysis
|
|
|
|
**Find unused indexes:**
|
|
```sql
|
|
SELECT
|
|
schemaname || '.' || relname as table,
|
|
indexrelname as index,
|
|
pg_size_pretty(pg_relation_size(indexrelid)) as size,
|
|
idx_scan as scans
|
|
FROM pg_stat_user_indexes
|
|
WHERE idx_scan = 0
|
|
AND schemaname NOT IN ('pg_catalog', 'pg_toast')
|
|
ORDER BY pg_relation_size(indexrelid) DESC;
|
|
```
|
|
|
|
**Find missing indexes (tables with high sequential scans):**
|
|
```sql
|
|
SELECT
|
|
schemaname || '.' || relname as table,
|
|
seq_scan,
|
|
seq_tup_read,
|
|
idx_scan,
|
|
round(100.0 * idx_scan / nullif(seq_scan + idx_scan, 0), 2) as idx_usage_pct
|
|
FROM pg_stat_user_tables
|
|
WHERE seq_scan > 100
|
|
ORDER BY seq_tup_read DESC
|
|
LIMIT 20;
|
|
```
|
|
|
|
**Duplicate indexes:**
|
|
```sql
|
|
SELECT
|
|
pg_size_pretty(sum(pg_relation_size(idx))::bigint) as size,
|
|
array_agg(idx) as indexes,
|
|
indrelid::regclass as table,
|
|
indkey as columns
|
|
FROM (
|
|
SELECT indexrelid::regclass as idx, indrelid, indkey
|
|
FROM pg_index
|
|
) sub
|
|
GROUP BY indrelid, indkey
|
|
HAVING count(*) > 1;
|
|
```
|
|
|
|
---
|
|
|
|
## 4. Index Guidelines for StellaOps
|
|
|
|
### 4.1 Standard Index Patterns
|
|
|
|
All tenant-scoped tables should have composite indexes starting with `tenant_id`:
|
|
|
|
```sql
|
|
-- Standard tenant + primary lookup pattern
|
|
CREATE INDEX idx_<table>_tenant_<field> ON <schema>.<table>(tenant_id, <field>);
|
|
|
|
-- Time-based queries
|
|
CREATE INDEX idx_<table>_tenant_time ON <schema>.<table>(tenant_id, created_at DESC);
|
|
|
|
-- State/status filtering
|
|
CREATE INDEX idx_<table>_tenant_state ON <schema>.<table>(tenant_id, state)
|
|
WHERE state IN ('active', 'pending');
|
|
```
|
|
|
|
### 4.2 Module-Specific Indexes
|
|
|
|
**Authority schema:**
|
|
```sql
|
|
CREATE INDEX idx_users_tenant ON authority.users(tenant_id);
|
|
CREATE INDEX idx_users_email ON authority.users(email) WHERE email IS NOT NULL;
|
|
CREATE INDEX idx_tokens_expires ON authority.tokens(expires_at) WHERE revoked_at IS NULL;
|
|
```
|
|
|
|
**Vuln schema:**
|
|
```sql
|
|
CREATE INDEX idx_advisories_primary_vuln ON vuln.advisories(primary_vuln_id);
|
|
CREATE INDEX idx_advisories_modified ON vuln.advisories(modified_at DESC);
|
|
CREATE INDEX idx_advisory_aliases_value ON vuln.advisory_aliases(alias_value);
|
|
CREATE INDEX idx_advisory_affected_purl ON vuln.advisory_affected(package_purl)
|
|
WHERE package_purl IS NOT NULL;
|
|
```
|
|
|
|
**Scheduler schema:**
|
|
```sql
|
|
CREATE INDEX idx_runs_tenant_state ON scheduler.runs(tenant_id, state);
|
|
CREATE INDEX idx_runs_state_created ON scheduler.runs(state, created_at)
|
|
WHERE state IN ('pending', 'queued', 'running');
|
|
CREATE INDEX idx_graph_jobs_tenant_status ON scheduler.graph_jobs(tenant_id, status);
|
|
```
|
|
|
|
### 4.3 JSONB Indexes
|
|
|
|
```sql
|
|
-- GIN index for containment queries (@>, ?, ?&, ?|)
|
|
CREATE INDEX idx_<table>_<column>_gin ON <schema>.<table> USING GIN (<column>);
|
|
|
|
-- Expression index for specific JSON paths
|
|
CREATE INDEX idx_<table>_<column>_path ON <schema>.<table> ((<column>->>'specific_key'));
|
|
```
|
|
|
|
---
|
|
|
|
## 5. Monitoring Setup
|
|
|
|
### 5.1 Key Metrics to Monitor
|
|
|
|
**Connection metrics:**
|
|
```sql
|
|
-- Current connections by state
|
|
SELECT state, count(*)
|
|
FROM pg_stat_activity
|
|
GROUP BY state;
|
|
|
|
-- Connections by database/user
|
|
SELECT datname, usename, count(*)
|
|
FROM pg_stat_activity
|
|
GROUP BY datname, usename;
|
|
```
|
|
|
|
**Cache effectiveness:**
|
|
```sql
|
|
-- Database-level cache hit ratio (should be >99%)
|
|
SELECT
|
|
datname,
|
|
round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) as cache_hit_ratio
|
|
FROM pg_stat_database
|
|
WHERE datname = 'stellaops';
|
|
```
|
|
|
|
**Table bloat and maintenance:**
|
|
```sql
|
|
-- Tables needing VACUUM
|
|
SELECT
|
|
schemaname || '.' || relname as table,
|
|
n_dead_tup,
|
|
n_live_tup,
|
|
round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) as dead_pct,
|
|
last_vacuum,
|
|
last_autovacuum
|
|
FROM pg_stat_user_tables
|
|
WHERE n_dead_tup > 10000
|
|
ORDER BY n_dead_tup DESC;
|
|
```
|
|
|
|
### 5.2 Prometheus Metrics
|
|
|
|
Use `postgres_exporter` for Prometheus integration. Key metrics:
|
|
|
|
```yaml
|
|
# Alert rules for PostgreSQL
|
|
groups:
|
|
- name: postgresql
|
|
rules:
|
|
- alert: PostgreSQLHighConnections
|
|
expr: pg_stat_activity_count > (pg_settings_max_connections * 0.8)
|
|
for: 5m
|
|
labels:
|
|
severity: warning
|
|
annotations:
|
|
summary: "PostgreSQL connections at {{ $value | humanizePercentage }} of max"
|
|
|
|
- alert: PostgreSQLLowCacheHitRatio
|
|
expr: pg_stat_database_blks_hit / (pg_stat_database_blks_hit + pg_stat_database_blks_read) < 0.95
|
|
for: 15m
|
|
labels:
|
|
severity: warning
|
|
annotations:
|
|
summary: "PostgreSQL cache hit ratio below 95%"
|
|
|
|
- alert: PostgreSQLDeadlocks
|
|
expr: rate(pg_stat_database_deadlocks[5m]) > 0
|
|
for: 5m
|
|
labels:
|
|
severity: warning
|
|
annotations:
|
|
summary: "PostgreSQL deadlocks detected"
|
|
|
|
- alert: PostgreSQLSlowQueries
|
|
expr: pg_stat_activity_max_tx_duration > 300
|
|
for: 5m
|
|
labels:
|
|
severity: warning
|
|
annotations:
|
|
summary: "Long-running transaction detected (>5min)"
|
|
```
|
|
|
|
### 5.3 Grafana Dashboard
|
|
|
|
Import the PostgreSQL dashboard (ID: 9628) or create custom panels for:
|
|
|
|
1. **Connection Pool** - Active/idle/waiting connections
|
|
2. **Query Performance** - QPS, latency percentiles
|
|
3. **Cache Hit Ratio** - Database and table level
|
|
4. **Disk I/O** - Read/write IOPS and throughput
|
|
5. **Replication Lag** - For HA setups
|
|
6. **Lock Waits** - Blocked queries count
|
|
|
|
---
|
|
|
|
## 6. Performance Baselines
|
|
|
|
### 6.1 Expected Performance Targets
|
|
|
|
| Operation | Target P95 | Notes |
|
|
|-----------|------------|-------|
|
|
| Simple key lookup | < 5ms | Single row by UUID |
|
|
| Tenant-filtered list | < 50ms | 100 rows with pagination |
|
|
| Advisory search | < 100ms | With FTS and filters |
|
|
| VEX statement insert | < 20ms | Single statement |
|
|
| Scheduler job enqueue | < 10ms | With lock acquisition |
|
|
| Report generation | < 500ms | Full SBOM evaluation |
|
|
|
|
### 6.2 Baseline Queries
|
|
|
|
Run these periodically to establish baselines:
|
|
|
|
```sql
|
|
-- Authority: User lookup
|
|
EXPLAIN (ANALYZE, BUFFERS)
|
|
SELECT * FROM authority.users
|
|
WHERE tenant_id = '<uuid>' AND normalized_username = 'testuser';
|
|
|
|
-- Vuln: Advisory search
|
|
EXPLAIN (ANALYZE, BUFFERS)
|
|
SELECT * FROM vuln.advisories
|
|
WHERE state = 'active'
|
|
AND to_tsvector('english', title || ' ' || coalesce(summary, '')) @@ plainto_tsquery('critical vulnerability')
|
|
ORDER BY modified_at DESC
|
|
LIMIT 50;
|
|
|
|
-- Scheduler: Pending jobs
|
|
EXPLAIN (ANALYZE, BUFFERS)
|
|
SELECT * FROM scheduler.runs
|
|
WHERE tenant_id = '<uuid>' AND state = 'pending'
|
|
ORDER BY created_at
|
|
LIMIT 100;
|
|
```
|
|
|
|
### 6.3 Load Testing
|
|
|
|
Use `pgbench` for baseline load testing:
|
|
|
|
```bash
|
|
# Initialize test data
|
|
pgbench -i -s 50 stellaops
|
|
|
|
# Run benchmark (60 seconds, 10 clients)
|
|
pgbench -c 10 -j 4 -T 60 stellaops
|
|
|
|
# Custom script benchmark
|
|
pgbench -c 10 -j 4 -T 60 -f custom_workload.sql stellaops
|
|
```
|
|
|
|
---
|
|
|
|
## 7. Backup and Restore
|
|
|
|
### 7.1 Backup Strategy
|
|
|
|
**Daily full backup with pg_dump:**
|
|
```bash
|
|
#!/bin/bash
|
|
DATE=$(date +%Y%m%d_%H%M%S)
|
|
BACKUP_DIR=/var/backups/postgresql
|
|
|
|
pg_dump -Fc -Z 9 \
|
|
--host="${PGHOST}" \
|
|
--port="${PGPORT}" \
|
|
--username="${PGUSER}" \
|
|
--dbname=stellaops \
|
|
--file="${BACKUP_DIR}/stellaops_${DATE}.dump"
|
|
|
|
# Retain last 7 days
|
|
find ${BACKUP_DIR} -name "*.dump" -mtime +7 -delete
|
|
```
|
|
|
|
**Continuous WAL archiving:**
|
|
```ini
|
|
# postgresql.conf
|
|
archive_mode = on
|
|
archive_command = 'cp %p /var/lib/postgresql/wal_archive/%f'
|
|
```
|
|
|
|
### 7.2 Point-in-Time Recovery
|
|
|
|
```bash
|
|
# Stop PostgreSQL
|
|
systemctl stop postgresql
|
|
|
|
# Restore base backup
|
|
pg_restore -C -d postgres /var/backups/postgresql/stellaops_backup.dump
|
|
|
|
# Create recovery.conf (PostgreSQL 12+: recovery.signal + postgresql.conf)
|
|
cat > ${PGDATA}/postgresql.auto.conf << EOF
|
|
restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'
|
|
recovery_target_time = '2025-12-10 14:30:00 UTC'
|
|
EOF
|
|
|
|
touch ${PGDATA}/recovery.signal
|
|
|
|
# Start PostgreSQL
|
|
systemctl start postgresql
|
|
```
|
|
|
|
### 7.3 Backup Verification
|
|
|
|
```bash
|
|
# Test restore to a different database
|
|
pg_restore -C -d postgres --dbname=stellaops_test /var/backups/postgresql/stellaops_backup.dump
|
|
|
|
# Verify data integrity
|
|
psql -d stellaops_test -c "SELECT count(*) FROM authority.users;"
|
|
psql -d stellaops_test -c "SELECT count(*) FROM vuln.advisories;"
|
|
|
|
# Cleanup
|
|
dropdb stellaops_test
|
|
```
|
|
|
|
---
|
|
|
|
## 8. Scaling Recommendations
|
|
|
|
### 8.1 Vertical Scaling
|
|
|
|
| Load Level | vCPUs | RAM | Storage | Connections |
|
|
|------------|-------|-----|---------|-------------|
|
|
| Development | 2 | 4GB | 50GB SSD | 50 |
|
|
| Small (<1k images) | 4 | 16GB | 200GB SSD | 100 |
|
|
| Medium (1k-10k images) | 8 | 32GB | 500GB SSD | 200 |
|
|
| Large (10k+ images) | 16 | 64GB | 1TB+ NVMe | 500 |
|
|
|
|
### 8.2 Horizontal Scaling
|
|
|
|
**Read replicas for reporting:**
|
|
```yaml
|
|
# Primary for writes
|
|
primary:
|
|
host: postgres-primary.internal
|
|
port: 5432
|
|
|
|
# Replicas for reads (round-robin)
|
|
replicas:
|
|
- host: postgres-replica-1.internal
|
|
port: 5432
|
|
- host: postgres-replica-2.internal
|
|
port: 5432
|
|
```
|
|
|
|
**Connection routing in application:**
|
|
- Writes → Primary
|
|
- Heavy reads (reports, dashboards) → Replicas
|
|
- Scheduler impact queries → Replicas with acceptable lag
|
|
|
|
### 8.3 Table Partitioning
|
|
|
|
For high-volume tables (>100M rows), consider partitioning:
|
|
|
|
```sql
|
|
-- Partition scheduler.runs by created_at
|
|
CREATE TABLE scheduler.runs_partitioned (
|
|
LIKE scheduler.runs INCLUDING ALL
|
|
) PARTITION BY RANGE (created_at);
|
|
|
|
-- Monthly partitions
|
|
CREATE TABLE scheduler.runs_y2025m12
|
|
PARTITION OF scheduler.runs_partitioned
|
|
FOR VALUES FROM ('2025-12-01') TO ('2026-01-01');
|
|
|
|
-- Automate partition creation
|
|
-- See: pg_partman extension
|
|
```
|
|
|
|
### 8.4 Connection Pooling at Scale
|
|
|
|
For >1000 concurrent connections, deploy PgBouncer as a sidecar or dedicated service:
|
|
|
|
```yaml
|
|
# Kubernetes deployment with PgBouncer sidecar
|
|
containers:
|
|
- name: app
|
|
env:
|
|
- name: DATABASE_URL
|
|
value: "postgresql://localhost:6432/stellaops"
|
|
- name: pgbouncer
|
|
image: pgbouncer/pgbouncer:1.21.0
|
|
ports:
|
|
- containerPort: 6432
|
|
```
|
|
|
|
---
|
|
|
|
## 9. Troubleshooting
|
|
|
|
### 9.1 Common Issues
|
|
|
|
**High connection count:**
|
|
```sql
|
|
-- Identify connection sources
|
|
SELECT client_addr, usename, state, count(*)
|
|
FROM pg_stat_activity
|
|
GROUP BY 1, 2, 3
|
|
ORDER BY 4 DESC;
|
|
|
|
-- Terminate idle connections
|
|
SELECT pg_terminate_backend(pid)
|
|
FROM pg_stat_activity
|
|
WHERE state = 'idle'
|
|
AND state_change < now() - interval '30 minutes';
|
|
```
|
|
|
|
**Lock contention:**
|
|
```sql
|
|
-- Find blocking queries
|
|
SELECT
|
|
blocked.pid as blocked_pid,
|
|
blocked.query as blocked_query,
|
|
blocking.pid as blocking_pid,
|
|
blocking.query as blocking_query
|
|
FROM pg_stat_activity blocked
|
|
JOIN pg_stat_activity blocking ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
|
|
WHERE blocked.wait_event_type = 'Lock';
|
|
```
|
|
|
|
**Table bloat:**
|
|
```sql
|
|
-- Check table and index sizes
|
|
SELECT
|
|
schemaname || '.' || relname as table,
|
|
pg_size_pretty(pg_total_relation_size(relid)) as total_size,
|
|
pg_size_pretty(pg_table_size(relid)) as table_size,
|
|
pg_size_pretty(pg_indexes_size(relid)) as index_size
|
|
FROM pg_stat_user_tables
|
|
ORDER BY pg_total_relation_size(relid) DESC
|
|
LIMIT 20;
|
|
|
|
-- Manual VACUUM FULL for severe bloat (blocks writes!)
|
|
VACUUM (FULL, ANALYZE) scheduler.runs;
|
|
```
|
|
|
|
### 9.2 Emergency Procedures
|
|
|
|
**Kill long-running queries:**
|
|
```sql
|
|
SELECT pg_terminate_backend(pid)
|
|
FROM pg_stat_activity
|
|
WHERE state = 'active'
|
|
AND query_start < now() - interval '10 minutes'
|
|
AND query NOT LIKE '%pg_stat%';
|
|
```
|
|
|
|
**Force checkpoint (before maintenance):**
|
|
```sql
|
|
CHECKPOINT;
|
|
```
|
|
|
|
**Emergency read-only mode:**
|
|
```sql
|
|
ALTER DATABASE stellaops SET default_transaction_read_only = on;
|
|
```
|
|
|
|
---
|
|
|
|
## 10. Air-Gap Considerations
|
|
|
|
### 10.1 Offline Setup
|
|
|
|
PostgreSQL 16+ is bundled in the air-gap kit. See `docs/24_OFFLINE_KIT.md` for import instructions.
|
|
|
|
**Docker image digest (pinned):**
|
|
```yaml
|
|
postgres:
|
|
image: docker.io/library/postgres:16@sha256:<pinned-digest>
|
|
```
|
|
|
|
### 10.2 Migrations in Air-Gap
|
|
|
|
All migrations are embedded in application assemblies. No network access required:
|
|
|
|
```bash
|
|
# Run migrations manually
|
|
dotnet run --project src/Tools/MigrationRunner -- \
|
|
--connection "Host=postgres;Database=stellaops;..." \
|
|
--schema all
|
|
```
|
|
|
|
### 10.3 Backup in Air-Gap
|
|
|
|
```bash
|
|
# Local backup with encryption
|
|
pg_dump -Fc stellaops | gpg --encrypt -r backup@stellaops.local > backup.dump.gpg
|
|
|
|
# Restore
|
|
gpg --decrypt backup.dump.gpg | pg_restore -d stellaops
|
|
```
|
|
|
|
---
|
|
|
|
## Appendix A: Quick Reference
|
|
|
|
### Connection String Template
|
|
```
|
|
Host=<host>;Port=5432;Database=stellaops;Username=<user>;Password=<pass>;
|
|
Pooling=true;MinPoolSize=5;MaxPoolSize=20;ConnectionIdleLifetime=300;
|
|
CommandTimeout=30;Timeout=15;
|
|
```
|
|
|
|
### Essential Commands
|
|
```bash
|
|
# Connect to database
|
|
psql -h localhost -U stellaops -d stellaops
|
|
|
|
# Check version
|
|
psql -c "SELECT version();"
|
|
|
|
# List schemas
|
|
psql -c "\dn"
|
|
|
|
# List tables in schema
|
|
psql -c "\dt vuln.*"
|
|
|
|
# Table structure
|
|
psql -c "\d vuln.advisories"
|
|
|
|
# Current activity
|
|
psql -c "SELECT * FROM pg_stat_activity;"
|
|
```
|
|
|
|
### Useful Extensions
|
|
```sql
|
|
CREATE EXTENSION IF NOT EXISTS pg_stat_statements; -- Query statistics
|
|
CREATE EXTENSION IF NOT EXISTS pg_trgm; -- Fuzzy text search
|
|
CREATE EXTENSION IF NOT EXISTS btree_gin; -- GIN for scalars
|
|
CREATE EXTENSION IF NOT EXISTS pgcrypto; -- Cryptographic functions
|
|
```
|