426 lines
10 KiB
Markdown
426 lines
10 KiB
Markdown
# PostgreSQL Patterns Operational Runbook
|
|
|
|
**Version:** 1.0
|
|
**Last Updated:** 2025-12-14
|
|
**Related Sprints:** SPRINT_3420, SPRINT_3421, SPRINT_3422, SPRINT_3423
|
|
|
|
---
|
|
|
|
## Table of Contents
|
|
|
|
1. [Row-Level Security (RLS)](#1-row-level-security-rls)
|
|
2. [Bitemporal Unknowns](#2-bitemporal-unknowns)
|
|
3. [Time-Based Partitioning](#3-time-based-partitioning)
|
|
4. [Generated Columns](#4-generated-columns)
|
|
5. [Troubleshooting](#5-troubleshooting)
|
|
|
|
---
|
|
|
|
## 1. Row-Level Security (RLS)
|
|
|
|
### 1.1 Overview
|
|
|
|
RLS provides database-level tenant isolation. All tenant-scoped tables enforce that queries can only access rows matching the current tenant context.
|
|
|
|
### 1.2 Setting Tenant Context
|
|
|
|
Before executing queries, the application must set the tenant context:
|
|
|
|
```sql
|
|
-- Set tenant context for the session
|
|
SELECT set_config('app.tenant_id', 'my-tenant-id', false);
|
|
|
|
-- Verify tenant context
|
|
SELECT current_setting('app.tenant_id', true);
|
|
```
|
|
|
|
### 1.3 Validating RLS Configuration
|
|
|
|
Run the validation script to check RLS status:
|
|
|
|
```bash
|
|
psql -f deploy/postgres-validation/001_validate_rls.sql
|
|
```
|
|
|
|
Expected output shows:
|
|
- All tenant-scoped tables with `rls_enabled = true`
|
|
- All tenant-scoped tables with `rls_forced = true`
|
|
- One policy per table named `*_tenant_isolation`
|
|
|
|
### 1.4 Bypassing RLS (Admin Operations)
|
|
|
|
For migrations or cross-tenant operations, use the admin role:
|
|
|
|
```sql
|
|
-- Grant admin role to current user (one-time setup)
|
|
GRANT scheduler_admin TO your_migration_user;
|
|
|
|
-- Execute as admin (bypasses RLS)
|
|
SET ROLE scheduler_admin;
|
|
SELECT * FROM scheduler.runs; -- Returns all tenants
|
|
|
|
-- Return to normal role
|
|
RESET ROLE;
|
|
```
|
|
|
|
### 1.5 Common Issues
|
|
|
|
**Issue:** Query returns no rows unexpectedly
|
|
```sql
|
|
-- Check if tenant context is set
|
|
SELECT current_setting('app.tenant_id', true);
|
|
-- If NULL or empty, set it before querying
|
|
```
|
|
|
|
**Issue:** "app.tenant_id session variable not set" error
|
|
```sql
|
|
-- Set the tenant context
|
|
SELECT set_config('app.tenant_id', 'tenant-123', false);
|
|
```
|
|
|
|
**Issue:** Need to query across tenants
|
|
```sql
|
|
-- Use admin bypass role
|
|
SET ROLE scheduler_admin;
|
|
-- Run your cross-tenant query
|
|
RESET ROLE;
|
|
```
|
|
|
|
---
|
|
|
|
## 2. Bitemporal Unknowns
|
|
|
|
### 2.1 Overview
|
|
|
|
The `unknowns` schema tracks ambiguities discovered during vulnerability scanning using bitemporal semantics:
|
|
- **Valid time** (`valid_from`, `valid_to`): When the unknown was relevant in the real world
|
|
- **System time** (`sys_from`, `sys_to`): When the system knew about the unknown
|
|
|
|
### 2.2 Querying Current Unknowns
|
|
|
|
```sql
|
|
-- Get all current open unknowns for a tenant
|
|
SELECT * FROM unknowns.current
|
|
WHERE tenant_id = 'my-tenant';
|
|
|
|
-- Or use the view with tenant context set
|
|
SELECT set_config('app.tenant_id', 'my-tenant', false);
|
|
SELECT * FROM unknowns.current;
|
|
```
|
|
|
|
### 2.3 Point-in-Time Queries
|
|
|
|
Query the state of unknowns at any point in history:
|
|
|
|
```sql
|
|
-- What unknowns existed on January 1, 2025?
|
|
SELECT * FROM unknowns.as_of(
|
|
'my-tenant',
|
|
'2025-01-01 00:00:00+00'::TIMESTAMPTZ
|
|
);
|
|
|
|
-- What did we know about unknowns on Jan 1, as of Jan 15?
|
|
SELECT * FROM unknowns.as_of(
|
|
'my-tenant',
|
|
'2025-01-01 00:00:00+00'::TIMESTAMPTZ, -- valid_at
|
|
'2025-01-15 00:00:00+00'::TIMESTAMPTZ -- sys_at
|
|
);
|
|
```
|
|
|
|
### 2.4 Resolving Unknowns
|
|
|
|
```sql
|
|
-- Resolve an unknown (sets valid_to and resolution fields)
|
|
UPDATE unknowns.unknown
|
|
SET resolved_at = NOW(),
|
|
resolution_type = 'feed_updated',
|
|
resolution_ref = 'nvd-feed-2025-01',
|
|
resolution_notes = 'NVD now covers this package',
|
|
valid_to = NOW()
|
|
WHERE id = 'unknown-uuid'
|
|
AND tenant_id = 'my-tenant';
|
|
```
|
|
|
|
### 2.5 Metrics Queries
|
|
|
|
```sql
|
|
-- Count unknowns by kind
|
|
SELECT * FROM unknowns.count_by_kind('my-tenant');
|
|
|
|
-- Count unknowns by severity
|
|
SELECT * FROM unknowns.count_by_severity('my-tenant');
|
|
```
|
|
|
|
---
|
|
|
|
## 3. Time-Based Partitioning
|
|
|
|
### 3.1 Overview
|
|
|
|
High-volume tables are partitioned by month for:
|
|
- Efficient time-range queries (partition pruning)
|
|
- O(1) data retention (drop partition vs DELETE)
|
|
- Improved vacuum performance
|
|
|
|
### 3.2 Checking Partition Status
|
|
|
|
Run the validation script:
|
|
|
|
```bash
|
|
psql -f deploy/postgres-validation/002_validate_partitions.sql
|
|
```
|
|
|
|
Or query directly:
|
|
|
|
```sql
|
|
-- View all partitions with sizes
|
|
SELECT * FROM partition_mgmt.partition_stats
|
|
WHERE schema_name = 'scheduler'
|
|
ORDER BY table_name, partition_name;
|
|
```
|
|
|
|
### 3.3 Creating Future Partitions
|
|
|
|
Partitions should be created ahead of time. Run monthly:
|
|
|
|
```sql
|
|
-- Create partitions for next 3 months
|
|
SELECT partition_mgmt.create_monthly_partitions(
|
|
'scheduler', -- schema
|
|
'audit', -- table
|
|
'created_at', -- partition column
|
|
CURRENT_DATE, -- start from now
|
|
3 -- months ahead
|
|
);
|
|
```
|
|
|
|
### 3.4 Archiving Old Partitions
|
|
|
|
Preview what would be archived:
|
|
|
|
```sql
|
|
-- Dry run: see what would be archived (12 month retention)
|
|
SELECT * FROM partition_mgmt.cleanup_old_partitions(
|
|
'scheduler', 'audit', 12, 'archive', TRUE
|
|
);
|
|
```
|
|
|
|
Execute archiving:
|
|
|
|
```sql
|
|
-- Actually archive partitions older than 12 months
|
|
SELECT * FROM partition_mgmt.cleanup_old_partitions(
|
|
'scheduler', 'audit', 12, 'archive', FALSE
|
|
);
|
|
```
|
|
|
|
### 3.5 Dropping Archived Partitions
|
|
|
|
After verifying archived data is backed up:
|
|
|
|
```sql
|
|
-- Drop archived partition
|
|
DROP TABLE archive.audit_2024_01;
|
|
```
|
|
|
|
### 3.6 Maintenance Checklist
|
|
|
|
**Weekly:**
|
|
- [ ] Check `partition_mgmt.partition_stats` for unexpected growth
|
|
- [ ] Verify default partitions are empty
|
|
|
|
**Monthly:**
|
|
- [ ] Create partitions for 3 months ahead
|
|
- [ ] Review partitions approaching retention cutoff
|
|
|
|
**Quarterly:**
|
|
- [ ] Archive partitions beyond retention period
|
|
- [ ] Verify archived data is backed up
|
|
- [ ] Drop archived partitions if space needed
|
|
|
|
---
|
|
|
|
## 4. Generated Columns
|
|
|
|
### 4.1 Overview
|
|
|
|
Generated columns extract frequently-queried JSONB fields into indexed columns for 10-50x query speedup.
|
|
|
|
### 4.2 Affected Tables
|
|
|
|
| Table | Generated Columns | Source |
|
|
|-------|-------------------|--------|
|
|
| `scheduler.runs` | `finding_count`, `critical_count`, `high_count`, `medium_count`, `low_count` | `stats` JSONB |
|
|
| `vuln.advisories` | `provenance_source_key`, `provenance_feed_id`, `provenance_ingested_at` | `provenance` JSONB |
|
|
| `vuln.advisory_affected` | `purl_type`, `purl_name` | `purl` TEXT |
|
|
|
|
### 4.3 Query Optimization
|
|
|
|
**Before (slow):**
|
|
```sql
|
|
SELECT * FROM scheduler.runs
|
|
WHERE (stats->>'findingCount')::int > 100;
|
|
```
|
|
|
|
**After (fast):**
|
|
```sql
|
|
SELECT * FROM scheduler.runs
|
|
WHERE finding_count > 100;
|
|
```
|
|
|
|
### 4.4 Adding New Generated Columns
|
|
|
|
```sql
|
|
-- Add a new generated column
|
|
ALTER TABLE your_table
|
|
ADD COLUMN new_field TEXT
|
|
GENERATED ALWAYS AS (jsonb_column->>'field_name') STORED;
|
|
|
|
-- Create index
|
|
CREATE INDEX ix_your_table_new_field
|
|
ON your_table (new_field)
|
|
WHERE new_field IS NOT NULL;
|
|
```
|
|
|
|
### 4.5 Monitoring
|
|
|
|
Check column usage in queries:
|
|
|
|
```sql
|
|
-- See which indexes are being used
|
|
SELECT indexrelname, idx_scan, idx_tup_read
|
|
FROM pg_stat_user_indexes
|
|
WHERE schemaname = 'scheduler'
|
|
ORDER BY idx_scan DESC;
|
|
```
|
|
|
|
---
|
|
|
|
## 5. Troubleshooting
|
|
|
|
### 5.1 RLS Issues
|
|
|
|
**Symptom:** Application gets "permission denied" or empty results
|
|
|
|
**Diagnosis:**
|
|
```sql
|
|
-- Check if RLS is enabled
|
|
SELECT tablename, rowsecurity, forcerowsecurity
|
|
FROM pg_tables WHERE schemaname = 'scheduler';
|
|
|
|
-- Check policies
|
|
SELECT * FROM pg_policies WHERE schemaname = 'scheduler';
|
|
|
|
-- Check tenant context
|
|
SELECT current_setting('app.tenant_id', true);
|
|
```
|
|
|
|
**Resolution:**
|
|
1. Ensure `set_config('app.tenant_id', ...)` is called before queries
|
|
2. Verify the tenant ID matches data in the table
|
|
3. For admin operations, use bypass role
|
|
|
|
### 5.2 Partition Issues
|
|
|
|
**Symptom:** Inserts fail with "no partition of relation"
|
|
|
|
**Diagnosis:**
|
|
```sql
|
|
-- Check existing partitions
|
|
SELECT * FROM partition_mgmt.partition_stats
|
|
WHERE table_name = 'audit';
|
|
|
|
-- Check default partition for orphaned data
|
|
SELECT COUNT(*) FROM scheduler.audit_default;
|
|
```
|
|
|
|
**Resolution:**
|
|
```sql
|
|
-- Create missing partition
|
|
SELECT partition_mgmt.create_partition(
|
|
'scheduler', 'audit', 'created_at',
|
|
'2025-12-01'::DATE, '2026-01-01'::DATE
|
|
);
|
|
```
|
|
|
|
**Symptom:** Queries scanning all partitions (slow)
|
|
|
|
**Diagnosis:**
|
|
```sql
|
|
-- Check query plan
|
|
EXPLAIN (ANALYZE, BUFFERS)
|
|
SELECT * FROM scheduler.audit
|
|
WHERE created_at >= '2025-01-01' AND created_at < '2025-02-01';
|
|
```
|
|
|
|
**Resolution:**
|
|
- Ensure WHERE clause includes partition key
|
|
- Use `>=` and `<` instead of BETWEEN for clearer pruning
|
|
|
|
### 5.3 Bitemporal Query Issues
|
|
|
|
**Symptom:** `as_of()` returns unexpected results
|
|
|
|
**Diagnosis:**
|
|
```sql
|
|
-- Check raw temporal data
|
|
SELECT id, valid_from, valid_to, sys_from, sys_to
|
|
FROM unknowns.unknown
|
|
WHERE tenant_id = 'my-tenant'
|
|
ORDER BY valid_from;
|
|
```
|
|
|
|
**Resolution:**
|
|
- Verify timestamps are in UTC
|
|
- Check that `valid_from <= query_time < valid_to`
|
|
- Check that `sys_from <= query_time < sys_to`
|
|
|
|
### 5.4 Generated Column Issues
|
|
|
|
**Symptom:** Generated column has NULL when source field exists
|
|
|
|
**Diagnosis:**
|
|
```sql
|
|
-- Check source field format
|
|
SELECT provenance->>'source_key', provenance_source_key
|
|
FROM vuln.advisories
|
|
WHERE id = 'advisory-uuid';
|
|
```
|
|
|
|
**Resolution:**
|
|
- Verify JSONB field name matches exactly (case-sensitive)
|
|
- Check for type casting issues in generation expression
|
|
|
|
---
|
|
|
|
## Appendix A: Quick Reference
|
|
|
|
### Schema Admin Roles
|
|
|
|
| Schema | Admin Role | Bypass RLS |
|
|
|--------|------------|------------|
|
|
| scheduler | `scheduler_admin` | Yes |
|
|
| notify | `notify_admin` | Yes |
|
|
| authority | `authority_admin` | Yes |
|
|
| vex | `vex_admin` | Yes |
|
|
| policy | `policy_admin` | Yes |
|
|
| unknowns | `unknowns_admin` | Yes |
|
|
|
|
### Helper Functions
|
|
|
|
| Function | Purpose |
|
|
|----------|---------|
|
|
| `*_app.require_current_tenant()` | Returns tenant from session, raises if not set |
|
|
| `unknowns.as_of(tenant, valid_at, sys_at)` | Point-in-time temporal query |
|
|
| `unknowns.count_by_kind(tenant)` | Aggregate unknowns by kind |
|
|
| `unknowns.count_by_severity(tenant)` | Aggregate unknowns by severity |
|
|
| `partition_mgmt.create_monthly_partitions(...)` | Create partitions ahead |
|
|
| `partition_mgmt.cleanup_old_partitions(...)` | Archive old partitions |
|
|
|
|
### Validation Scripts
|
|
|
|
| Script | Purpose |
|
|
|--------|---------|
|
|
| `deploy/postgres-validation/001_validate_rls.sql` | Verify RLS configuration |
|
|
| `deploy/postgres-validation/002_validate_partitions.sql` | Verify partition health |
|