# PostgreSQL Patterns Operational Runbook **Version:** 1.0 **Last Updated:** 2025-12-14 **Related Sprints:** SPRINT_3420, SPRINT_3421, SPRINT_3422, SPRINT_3423 --- ## Table of Contents 1. [Row-Level Security (RLS)](#1-row-level-security-rls) 2. [Bitemporal Unknowns](#2-bitemporal-unknowns) 3. [Time-Based Partitioning](#3-time-based-partitioning) 4. [Generated Columns](#4-generated-columns) 5. [Troubleshooting](#5-troubleshooting) --- ## 1. Row-Level Security (RLS) ### 1.1 Overview RLS provides database-level tenant isolation. All tenant-scoped tables enforce that queries can only access rows matching the current tenant context. ### 1.2 Setting Tenant Context Before executing queries, the application must set the tenant context: ```sql -- Set tenant context for the session SELECT set_config('app.tenant_id', 'my-tenant-id', false); -- Verify tenant context SELECT current_setting('app.tenant_id', true); ``` ### 1.3 Validating RLS Configuration Run the validation script to check RLS status: ```bash psql -f deploy/postgres-validation/001_validate_rls.sql ``` Expected output shows: - All tenant-scoped tables with `rls_enabled = true` - All tenant-scoped tables with `rls_forced = true` - One policy per table named `*_tenant_isolation` ### 1.4 Bypassing RLS (Admin Operations) For migrations or cross-tenant operations, use the admin role: ```sql -- Grant admin role to current user (one-time setup) GRANT scheduler_admin TO your_migration_user; -- Execute as admin (bypasses RLS) SET ROLE scheduler_admin; SELECT * FROM scheduler.runs; -- Returns all tenants -- Return to normal role RESET ROLE; ``` ### 1.5 Common Issues **Issue:** Query returns no rows unexpectedly ```sql -- Check if tenant context is set SELECT current_setting('app.tenant_id', true); -- If NULL or empty, set it before querying ``` **Issue:** "app.tenant_id session variable not set" error ```sql -- Set the tenant context SELECT set_config('app.tenant_id', 'tenant-123', false); ``` **Issue:** Need to query across tenants ```sql -- Use admin bypass role SET ROLE scheduler_admin; -- Run your cross-tenant query RESET ROLE; ``` --- ## 2. Bitemporal Unknowns ### 2.1 Overview The `unknowns` schema tracks ambiguities discovered during vulnerability scanning using bitemporal semantics: - **Valid time** (`valid_from`, `valid_to`): When the unknown was relevant in the real world - **System time** (`sys_from`, `sys_to`): When the system knew about the unknown ### 2.2 Querying Current Unknowns ```sql -- Get all current open unknowns for a tenant SELECT * FROM unknowns.current WHERE tenant_id = 'my-tenant'; -- Or use the view with tenant context set SELECT set_config('app.tenant_id', 'my-tenant', false); SELECT * FROM unknowns.current; ``` ### 2.3 Point-in-Time Queries Query the state of unknowns at any point in history: ```sql -- What unknowns existed on January 1, 2025? SELECT * FROM unknowns.as_of( 'my-tenant', '2025-01-01 00:00:00+00'::TIMESTAMPTZ ); -- What did we know about unknowns on Jan 1, as of Jan 15? SELECT * FROM unknowns.as_of( 'my-tenant', '2025-01-01 00:00:00+00'::TIMESTAMPTZ, -- valid_at '2025-01-15 00:00:00+00'::TIMESTAMPTZ -- sys_at ); ``` ### 2.4 Resolving Unknowns ```sql -- Resolve an unknown (sets valid_to and resolution fields) UPDATE unknowns.unknown SET resolved_at = NOW(), resolution_type = 'feed_updated', resolution_ref = 'nvd-feed-2025-01', resolution_notes = 'NVD now covers this package', valid_to = NOW() WHERE id = 'unknown-uuid' AND tenant_id = 'my-tenant'; ``` ### 2.5 Metrics Queries ```sql -- Count unknowns by kind SELECT * FROM unknowns.count_by_kind('my-tenant'); -- Count unknowns by severity SELECT * FROM unknowns.count_by_severity('my-tenant'); ``` --- ## 3. Time-Based Partitioning ### 3.1 Overview High-volume tables are partitioned by month for: - Efficient time-range queries (partition pruning) - O(1) data retention (drop partition vs DELETE) - Improved vacuum performance ### 3.2 Checking Partition Status Run the validation script: ```bash psql -f deploy/postgres-validation/002_validate_partitions.sql ``` Or query directly: ```sql -- View all partitions with sizes SELECT * FROM partition_mgmt.partition_stats WHERE schema_name = 'scheduler' ORDER BY table_name, partition_name; ``` ### 3.3 Creating Future Partitions Partitions should be created ahead of time. Run monthly: ```sql -- Create partitions for next 3 months SELECT partition_mgmt.create_monthly_partitions( 'scheduler', -- schema 'audit', -- table 'created_at', -- partition column CURRENT_DATE, -- start from now 3 -- months ahead ); ``` ### 3.4 Archiving Old Partitions Preview what would be archived: ```sql -- Dry run: see what would be archived (12 month retention) SELECT * FROM partition_mgmt.cleanup_old_partitions( 'scheduler', 'audit', 12, 'archive', TRUE ); ``` Execute archiving: ```sql -- Actually archive partitions older than 12 months SELECT * FROM partition_mgmt.cleanup_old_partitions( 'scheduler', 'audit', 12, 'archive', FALSE ); ``` ### 3.5 Dropping Archived Partitions After verifying archived data is backed up: ```sql -- Drop archived partition DROP TABLE archive.audit_2024_01; ``` ### 3.6 Maintenance Checklist **Weekly:** - [ ] Check `partition_mgmt.partition_stats` for unexpected growth - [ ] Verify default partitions are empty **Monthly:** - [ ] Create partitions for 3 months ahead - [ ] Review partitions approaching retention cutoff **Quarterly:** - [ ] Archive partitions beyond retention period - [ ] Verify archived data is backed up - [ ] Drop archived partitions if space needed --- ## 4. Generated Columns ### 4.1 Overview Generated columns extract frequently-queried JSONB fields into indexed columns for 10-50x query speedup. ### 4.2 Affected Tables | Table | Generated Columns | Source | |-------|-------------------|--------| | `scheduler.runs` | `finding_count`, `critical_count`, `high_count`, `medium_count`, `low_count` | `stats` JSONB | | `vuln.advisories` | `provenance_source_key`, `provenance_feed_id`, `provenance_ingested_at` | `provenance` JSONB | | `vuln.advisory_affected` | `purl_type`, `purl_name` | `purl` TEXT | ### 4.3 Query Optimization **Before (slow):** ```sql SELECT * FROM scheduler.runs WHERE (stats->>'findingCount')::int > 100; ``` **After (fast):** ```sql SELECT * FROM scheduler.runs WHERE finding_count > 100; ``` ### 4.4 Adding New Generated Columns ```sql -- Add a new generated column ALTER TABLE your_table ADD COLUMN new_field TEXT GENERATED ALWAYS AS (jsonb_column->>'field_name') STORED; -- Create index CREATE INDEX ix_your_table_new_field ON your_table (new_field) WHERE new_field IS NOT NULL; ``` ### 4.5 Monitoring Check column usage in queries: ```sql -- See which indexes are being used SELECT indexrelname, idx_scan, idx_tup_read FROM pg_stat_user_indexes WHERE schemaname = 'scheduler' ORDER BY idx_scan DESC; ``` --- ## 5. Troubleshooting ### 5.1 RLS Issues **Symptom:** Application gets "permission denied" or empty results **Diagnosis:** ```sql -- Check if RLS is enabled SELECT tablename, rowsecurity, forcerowsecurity FROM pg_tables WHERE schemaname = 'scheduler'; -- Check policies SELECT * FROM pg_policies WHERE schemaname = 'scheduler'; -- Check tenant context SELECT current_setting('app.tenant_id', true); ``` **Resolution:** 1. Ensure `set_config('app.tenant_id', ...)` is called before queries 2. Verify the tenant ID matches data in the table 3. For admin operations, use bypass role ### 5.2 Partition Issues **Symptom:** Inserts fail with "no partition of relation" **Diagnosis:** ```sql -- Check existing partitions SELECT * FROM partition_mgmt.partition_stats WHERE table_name = 'audit'; -- Check default partition for orphaned data SELECT COUNT(*) FROM scheduler.audit_default; ``` **Resolution:** ```sql -- Create missing partition SELECT partition_mgmt.create_partition( 'scheduler', 'audit', 'created_at', '2025-12-01'::DATE, '2026-01-01'::DATE ); ``` **Symptom:** Queries scanning all partitions (slow) **Diagnosis:** ```sql -- Check query plan EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM scheduler.audit WHERE created_at >= '2025-01-01' AND created_at < '2025-02-01'; ``` **Resolution:** - Ensure WHERE clause includes partition key - Use `>=` and `<` instead of BETWEEN for clearer pruning ### 5.3 Bitemporal Query Issues **Symptom:** `as_of()` returns unexpected results **Diagnosis:** ```sql -- Check raw temporal data SELECT id, valid_from, valid_to, sys_from, sys_to FROM unknowns.unknown WHERE tenant_id = 'my-tenant' ORDER BY valid_from; ``` **Resolution:** - Verify timestamps are in UTC - Check that `valid_from <= query_time < valid_to` - Check that `sys_from <= query_time < sys_to` ### 5.4 Generated Column Issues **Symptom:** Generated column has NULL when source field exists **Diagnosis:** ```sql -- Check source field format SELECT provenance->>'source_key', provenance_source_key FROM vuln.advisories WHERE id = 'advisory-uuid'; ``` **Resolution:** - Verify JSONB field name matches exactly (case-sensitive) - Check for type casting issues in generation expression --- ## Appendix A: Quick Reference ### Schema Admin Roles | Schema | Admin Role | Bypass RLS | |--------|------------|------------| | scheduler | `scheduler_admin` | Yes | | notify | `notify_admin` | Yes | | authority | `authority_admin` | Yes | | vex | `vex_admin` | Yes | | policy | `policy_admin` | Yes | | unknowns | `unknowns_admin` | Yes | ### Helper Functions | Function | Purpose | |----------|---------| | `*_app.require_current_tenant()` | Returns tenant from session, raises if not set | | `unknowns.as_of(tenant, valid_at, sys_at)` | Point-in-time temporal query | | `unknowns.count_by_kind(tenant)` | Aggregate unknowns by kind | | `unknowns.count_by_severity(tenant)` | Aggregate unknowns by severity | | `partition_mgmt.create_monthly_partitions(...)` | Create partitions ahead | | `partition_mgmt.cleanup_old_partitions(...)` | Archive old partitions | ### Validation Scripts | Script | Purpose | |--------|---------| | `deploy/postgres-validation/001_validate_rls.sql` | Verify RLS configuration | | `deploy/postgres-validation/002_validate_partitions.sql` | Verify partition health |