Files
git.stella-ops.org/docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md
2025-12-14 21:29:44 +02:00

16 KiB
Raw Blame History

PostgreSQL Patterns Technical Reference

Source Advisories:

  • 01-Dec-2025 - PostgreSQL Patterns for Each StellaOps Module
  • 14-Dec-2025 - Evaluate PostgreSQL vs MongoDB for StellaOps

Last Updated: 2025-12-14


0. POSTGRESQL VS MONGODB (DECISION RULE)

Default posture:

  • System of record: PostgreSQL (JSONB-first, per-module schema isolation).
  • Queues & coordination: PostgreSQL (SKIP LOCKED, advisory locks when needed).
  • Cache/acceleration only: Valkey/Redis (ephemeral).
  • MongoDB: only when you have a clear need for very large, read-optimized snapshot workloads (e.g., extremely large historical graphs), and you can regenerate those snapshots deterministically from the Postgres source-of-truth.

When MongoDB is justified:

  • Interactive exploration over hundreds of millions of nodes/edges where denormalized reads beat relational joins.
  • Snapshot cadence is batchy (hourly/daily) and you can re-emit snapshots deterministically.
  • You need to isolate read spikes from transactional control-plane writes.

1. MODULE-SCHEMA MAPPING

Module Schema Primary Tables
Authority authority user, role, grant, oauth_client, oauth_token, audit_log
Routing routing feature_flag, instance, rate_limit_config
VEX vex vuln_fact, package, vex_decision, mv_triage_queue
Unknowns unknowns unknown (bitemporal)
Artifact artifact artifact, signature, tag
Core core outbox

2. CORE POSTGRESQL CONVENTIONS

2.1 Required Columns (All Tables)

id uuid primary key default gen_random_uuid()
tenant_id uuid not null
created_at timestamptz not null default now()
updated_at timestamptz not null default now()

2.2 Multi-Tenancy RLS Pattern

alter table <table> enable row level security;

create policy p_<table>_tenant on <table>
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);

2.3 Session Configuration (Set Per Request)

select set_config('app.user_id',   '<uuid>', false);
select set_config('app.tenant_id', '<uuid>', false);
select set_config('app.roles',     'role1,role2', false);

3. TABLE TAXONOMY AND PATTERNS

3.1 Source-of-Truth (SOR) Tables

create table <module>.<entity> (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  external_id uuid,
  content_hash bytea not null,
  doc jsonb not null,
  schema_version int not null,
  created_at timestamptz not null default now(),
  supersedes_id bigint null
);

create unique index on <entity>(tenant_id, content_hash);

3.2 JSONB Facts + Relational Decisions

Facts (Immutable):

create table vex.vuln_fact (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  source text not null,
  external_id text,
  payload jsonb not null,
  schema_version int not null,
  received_at timestamptz not null default now()
);

Decisions (Relational):

create table vex.vex_decision (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  package_id uuid not null,
  vuln_id text not null,
  status text check (status in ('not_affected','affected','fixed','under_investigation')),
  rationale text,
  proof_ref text,
  decided_by uuid,
  decided_at timestamptz not null default now(),
  unique (tenant_id, package_id, vuln_id)
);

3.3 Queue Pattern (SKIP LOCKED)

create table job_queue (
  id bigserial primary key,
  tenant_id uuid,
  kind text not null,
  payload jsonb not null,
  run_after timestamptz default now(),
  attempts int default 0,
  locked_at timestamptz,
  locked_by text
);

create index ix_job_ready
  on job_queue(kind, run_after, id)
  where locked_at is null;

-- Claim job
with cte as (
  select id from job_queue
  where kind = $1
    and run_after <= now()
    and locked_at is null
  order by id
  for update skip locked
  limit 1
)
update job_queue j
set locked_at = now(), locked_by = $2
from cte
where j.id = cte.id
returning j.*;

3.3.1 Advisory Locks (coordination / idempotency guards)

Use advisory locks for per-tenant singleton work or "at-most-once" critical sections (do not hold them while doing long-running work):

-- Acquire (per tenant, per artifact) for the duration of the transaction
select pg_try_advisory_xact_lock(hashtextextended('recalc:' || $1 || ':' || $2, 0));

3.3.2 LISTEN/NOTIFY (nudge, not a durable queue)

Use LISTEN/NOTIFY to wake workers quickly after inserting work into a durable table:

notify stella_scan, json_build_object('purl', $1, 'priority', 5)::text;

3.4 Temporal Pattern (Unknowns)

create table unknowns.unknown (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  subject_hash text not null,
  kind text not null,
  context jsonb not null,
  valid_from timestamptz not null default now(),
  valid_to timestamptz,
  sys_from timestamptz not null default now(),
  sys_to timestamptz,
  created_at timestamptz not null default now()
);

create unique index unknown_one_open_per_subject
  on unknowns.unknown (tenant_id, subject_hash, kind)
  where valid_to is null;

create view unknowns.current as
  select * from unknowns.unknown
  where valid_to is null;

3.5 Audit Log Pattern

create table authority.audit_log (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  actor_id uuid,
  action text not null,
  entity_type text not null,
  entity_id uuid,
  at timestamptz not null default now(),
  diff jsonb not null
);

3.6 Outbox Pattern (Exactly-Once Side Effects)

create table core.outbox (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid,
  aggregate_type text not null,
  aggregate_id uuid,
  topic text not null,
  payload jsonb not null,
  created_at timestamptz not null default now(),
  dispatched_at timestamptz,
  dispatch_attempts int not null default 0,
  error text
);

4. JSONB WITH GENERATED COLUMNS

create table sbom_document (
  id bigserial primary key,
  tenant_id uuid not null,
  artifact_purl text not null,
  content_hash bytea not null,
  doc jsonb not null,
  created_at timestamptz not null default now(),

  -- hot keys as generated columns
  bom_format text generated always as ((doc->>'bomFormat')) stored,
  spec_version text generated always as ((doc->>'specVersion')) stored
);

create unique index ux_sbom_doc_hash on sbom_document(tenant_id, content_hash);
create index ix_sbom_doc_tenant_artifact on sbom_document(tenant_id, artifact_purl, created_at desc);
create index ix_sbom_doc_json_gin on sbom_document using gin (doc jsonb_path_ops);
create index ix_sbom_doc_bomformat on sbom_document(tenant_id, bom_format);

5. MATERIALIZED VIEWS FOR HOT READS

create materialized view mv_artifact_risk as
select tenant_id, artifact_purl, max(score) as risk_score
from open_findings
group by tenant_id, artifact_purl;

create unique index ux_mv_artifact_risk
  on mv_artifact_risk(tenant_id, artifact_purl);

-- Refresh
refresh materialized view concurrently mv_artifact_risk;

6. PARTITIONING (TIME-BASED EVENTS)

create table scan_run_event (
  tenant_id uuid not null,
  scan_run_id bigint not null,
  occurred_at timestamptz not null,
  event_type text not null,
  payload jsonb not null
) partition by range (occurred_at);

create index brin_scan_events_time
  on scan_run_event using brin (occurred_at);

7. INDEX PATTERNS

Use Case Index Pattern
Tenant-scoped queries INDEX(tenant_id, ...)
Latest version lookup INDEX(tenant_id, artifact_purl, created_at DESC)
Queue readiness INDEX(kind, run_after, id) WHERE locked_at IS NULL
JSONB containment INDEX USING GIN (doc jsonb_path_ops)
JSONB key lookup INDEX((doc->>'key'))
Time-series scan INDEX USING BRIN (occurred_at)

8. PERFORMANCE REQUIREMENTS

8.1 Query Performance Standards

Required per PR:

  • Provide SQL query + intended parameters
  • Provide EXPLAIN (ANALYZE, BUFFERS) from staging-sized dataset
  • Identify serving index(es)
  • Confirm row estimates not wildly wrong
  • Confirm tenant-scoped and uses tenant-leading index

8.2 Index Performance Standards

Pattern Requirement
Tenant queries INDEX(tenant_id, ...) leading column
Sort ordering Index must end with ORDER BY column + direction
Queue claims Partial index WHERE locked_at IS NULL
Time-series BRIN index on timestamp columns for partitioned tables
JSONB containment GIN jsonb_path_ops for @> queries

8.3 General Performance Rules

  • Every hot query must have an index story
  • Write path stays simple: prefer append-only versioning
  • Multi-tenant explicit: all core tables include tenant_id
  • Derived data modeled as projection tables or materialized views
  • Idempotency enforced in DB: unique keys for imports/jobs/results

8.4 Materialized Views vs Projection Tables

Materialized views are acceptable when:

  • You can refresh them deterministically at a defined cadence (owned by a specific worker/job).
  • You can afford full refresh cost, or the dataset is bounded.
  • You provide a unique index to enable REFRESH MATERIALIZED VIEW CONCURRENTLY.

Prefer projection tables when:

  • You need incremental updates (on import/scan completion).
  • You need deterministic point-in-time snapshots per scan manifest (replay/audit).
  • Refresh cost would scale with the entire dataset on every change.

Checklist:

  • Every derived read model declares: owner, refresh cadence/trigger, retention, and idempotency key.
  • No UI/API endpoint depends on a heavy non-materialized view for hot paths.

8.5 Queue + Outbox Rules (avoid deadlocks)

Queue claim rules:

  • Claim in a short transaction (commit immediately after lock acquisition).
  • Do work outside the transaction.
  • On failure: increment attempts, compute backoff into run_after, and release locks.
  • Define a DLQ condition (attempts > N) that is queryable and observable.

Outbox dispatch rules:

  • Dispatch is idempotent (consumer must tolerate duplicates).
  • The dispatcher writes a stable delivery attempt record (dispatched_at, dispatch_attempts, error).

8.6 Migration Safety Rules

  • Create/drop indexes concurrently on large tables (CREATE INDEX CONCURRENTLY, DROP INDEX CONCURRENTLY).
  • Add NOT NULL in stages: add nullable column → backfill in batches → enforce constraint → then add default (if needed).
  • Avoid long-running ALTER TABLE on high-volume tables without a lock plan.

8.7 Definition of Done (new table/view)

A PR adding a table/view is incomplete unless it includes:

  • Table classification (SoR / projection / queue / event).
  • Primary key + idempotency unique key.
  • Tenant scoping strategy (and RLS policy when applicable).
  • Index plan mapped to the top 13 query patterns (include EXPLAIN (ANALYZE, BUFFERS) output).
  • Retention plan (partitioning and drop policy for high-volume tables).
  • Refresh/update plan for derived models (owner + cadence).

9. FEATURE FLAG SCHEMA

create table routing.feature_flag (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null default 1,
  is_enabled boolean not null default true,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);

create table routing.feature_flag_history (
  id uuid primary key default gen_random_uuid(),
  feature_flag_id uuid not null,
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null,
  changed_at timestamptz not null default now(),
  changed_by uuid
);

Redis Cache Pattern:

SETEX flag:{key}:{version} <ttl> <json>

10. RATE LIMIT CONFIGURATION

create table routing.rate_limit_config (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  limit_per_interval int not null,
  interval_seconds int not null,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);

Redis Counter Pattern:

INCR rl:{bucket}:{window}

11. INSTANCE REGISTRY

create table routing.instance (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  instance_key text not null,
  domain text not null,
  last_heartbeat timestamptz not null default now(),
  status text not null check (status in ('active','draining','offline')),
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, instance_key),
  unique (tenant_id, domain)
);

12. MIGRATION PATTERNS

12.1 Schema Versioning

create table core.schema_version (
  module text primary key,
  version int not null,
  applied_at timestamptz not null default now(),
  migration_hash text not null
);

12.2 Migration Script Template

-- Migration: <module>_v<version>_<description>
-- Dependencies: <module>_v<previous_version>

begin;

-- Schema changes
create table if not exists <module>.<table> (...);

-- Data migrations (if needed)

-- Update version
insert into core.schema_version (module, version, migration_hash)
values ('<module>', <version>, '<hash>')
on conflict (module) do update
set version = excluded.version,
    applied_at = now(),
    migration_hash = excluded.migration_hash;

commit;

13. CONNECTION POOLING

database:
  host: postgres.local
  port: 5432
  database: stellaops
  username: stellaops_app
  password: <from secrets>
  pool:
    min_size: 5
    max_size: 20
    connection_timeout: 5000  # ms
    idle_timeout: 600000      # ms (10 min)
    max_lifetime: 1800000     # ms (30 min)

13.2 .NET Configuration

services.AddNpgsqlDataSource(connectionString, builder =>
{
    builder.MaxConnections = 20;
    builder.MinConnections = 5;
    builder.ConnectionIdleLifetime = TimeSpan.FromMinutes(10);
    builder.ConnectionLifetime = TimeSpan.FromMinutes(30);
});

14. MONITORING & OBSERVABILITY

14.1 Essential Metrics

postgres_connections_active
postgres_connections_idle
postgres_transaction_duration_seconds
postgres_query_duration_seconds
postgres_cache_hit_ratio
postgres_table_size_bytes
postgres_index_size_bytes
postgres_slow_queries_total

14.2 Query Performance Monitoring

-- Enable pg_stat_statements
create extension if not exists pg_stat_statements;

-- Top 10 slowest queries
select
  substring(query, 1, 100) as query_snippet,
  calls,
  total_exec_time / 1000 as total_time_sec,
  mean_exec_time as mean_time_ms,
  max_exec_time as max_time_ms
from pg_stat_statements
order by total_exec_time desc
limit 10;

15. BACKUP & RECOVERY

15.1 Backup Strategy

  • Point-in-time recovery (PITR): Enabled via WAL archiving
  • Daily full backups: Automated via pg_basebackup
  • Retention: 30 days for compliance
  • Testing: Monthly restore drills

15.2 Backup Commands

# Full backup
pg_basebackup -h postgres.local -D /backup/$(date +%Y%m%d) -Ft -z -P

# WAL archiving (postgresql.conf)
# archive_mode = on
# archive_command = 'cp %p /archive/%f'

16. SECURITY BEST PRACTICES

16.1 Access Control

  • Use RLS for multi-tenancy isolation
  • Grant minimal privileges per role
  • Separate read-only and read-write users
  • Use connection pooler with separate credentials

16.2 Encryption

  • TLS for connections: sslmode=require
  • Transparent data encryption (TDE) for data at rest
  • Encrypted backups

16.3 Audit Logging

-- Enable audit logging
create extension if not exists pgaudit;

-- Configure audit (postgresql.conf)
-- pgaudit.log = 'write, ddl'
-- pgaudit.log_catalog = off

Document Version: 1.0 Target Platform: .NET 10, PostgreSQL ≥16, Angular v17