stella-ops.org/git.stella-ops.org

Fork 0

Files

StellaOps Bot 3411e825cd themesd advisories enhanced

2025-12-14 21:29:44 +02:00

16 KiB

Raw Blame History

PostgreSQL Patterns Technical Reference

Source Advisories:

01-Dec-2025 - PostgreSQL Patterns for Each StellaOps Module
14-Dec-2025 - Evaluate PostgreSQL vs MongoDB for StellaOps

Last Updated: 2025-12-14

0. POSTGRESQL VS MONGODB (DECISION RULE)

Default posture:

System of record: PostgreSQL (JSONB-first, per-module schema isolation).
Queues & coordination: PostgreSQL (SKIP LOCKED, advisory locks when needed).
Cache/acceleration only: Valkey/Redis (ephemeral).
MongoDB: only when you have a clear need for very large, read-optimized snapshot workloads (e.g., extremely large historical graphs), and you can regenerate those snapshots deterministically from the Postgres source-of-truth.

When MongoDB is justified:

Interactive exploration over hundreds of millions of nodes/edges where denormalized reads beat relational joins.
Snapshot cadence is batchy (hourly/daily) and you can re-emit snapshots deterministically.
You need to isolate read spikes from transactional control-plane writes.

1. MODULE-SCHEMA MAPPING

Module	Schema	Primary Tables
Authority	`authority`	`user`, `role`, `grant`, `oauth_client`, `oauth_token`, `audit_log`
Routing	`routing`	`feature_flag`, `instance`, `rate_limit_config`
VEX	`vex`	`vuln_fact`, `package`, `vex_decision`, `mv_triage_queue`
Unknowns	`unknowns`	`unknown` (bitemporal)
Artifact	`artifact`	`artifact`, `signature`, `tag`
Core	`core`	`outbox`

2. CORE POSTGRESQL CONVENTIONS

2.1 Required Columns (All Tables)

id uuid primary key default gen_random_uuid()
tenant_id uuid not null
created_at timestamptz not null default now()
updated_at timestamptz not null default now()

2.2 Multi-Tenancy RLS Pattern

alter table <table> enable row level security;

create policy p_<table>_tenant on <table>
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);

2.3 Session Configuration (Set Per Request)

select set_config('app.user_id',   '<uuid>', false);
select set_config('app.tenant_id', '<uuid>', false);
select set_config('app.roles',     'role1,role2', false);

3. TABLE TAXONOMY AND PATTERNS

3.1 Source-of-Truth (SOR) Tables

create table <module>.<entity> (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  external_id uuid,
  content_hash bytea not null,
  doc jsonb not null,
  schema_version int not null,
  created_at timestamptz not null default now(),
  supersedes_id bigint null
);

create unique index on <entity>(tenant_id, content_hash);

3.2 JSONB Facts + Relational Decisions

Facts (Immutable):

create table vex.vuln_fact (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  source text not null,
  external_id text,
  payload jsonb not null,
  schema_version int not null,
  received_at timestamptz not null default now()
);

Decisions (Relational):

create table vex.vex_decision (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  package_id uuid not null,
  vuln_id text not null,
  status text check (status in ('not_affected','affected','fixed','under_investigation')),
  rationale text,
  proof_ref text,
  decided_by uuid,
  decided_at timestamptz not null default now(),
  unique (tenant_id, package_id, vuln_id)
);

3.3 Queue Pattern (SKIP LOCKED)

create table job_queue (
  id bigserial primary key,
  tenant_id uuid,
  kind text not null,
  payload jsonb not null,
  run_after timestamptz default now(),
  attempts int default 0,
  locked_at timestamptz,
  locked_by text
);

create index ix_job_ready
  on job_queue(kind, run_after, id)
  where locked_at is null;

-- Claim job
with cte as (
  select id from job_queue
  where kind = $1
    and run_after <= now()
    and locked_at is null
  order by id
  for update skip locked
  limit 1
)
update job_queue j
set locked_at = now(), locked_by = $2
from cte
where j.id = cte.id
returning j.*;

3.3.1 Advisory Locks (coordination / idempotency guards)

Use advisory locks for per-tenant singleton work or "at-most-once" critical sections (do not hold them while doing long-running work):

-- Acquire (per tenant, per artifact) for the duration of the transaction
select pg_try_advisory_xact_lock(hashtextextended('recalc:' || $1 || ':' || $2, 0));

3.3.2 LISTEN/NOTIFY (nudge, not a durable queue)

Use LISTEN/NOTIFY to wake workers quickly after inserting work into a durable table:

notify stella_scan, json_build_object('purl', $1, 'priority', 5)::text;

3.4 Temporal Pattern (Unknowns)

create table unknowns.unknown (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  subject_hash text not null,
  kind text not null,
  context jsonb not null,
  valid_from timestamptz not null default now(),
  valid_to timestamptz,
  sys_from timestamptz not null default now(),
  sys_to timestamptz,
  created_at timestamptz not null default now()
);

create unique index unknown_one_open_per_subject
  on unknowns.unknown (tenant_id, subject_hash, kind)
  where valid_to is null;

create view unknowns.current as
  select * from unknowns.unknown
  where valid_to is null;

3.5 Audit Log Pattern

create table authority.audit_log (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  actor_id uuid,
  action text not null,
  entity_type text not null,
  entity_id uuid,
  at timestamptz not null default now(),
  diff jsonb not null
);

3.6 Outbox Pattern (Exactly-Once Side Effects)

create table core.outbox (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid,
  aggregate_type text not null,
  aggregate_id uuid,
  topic text not null,
  payload jsonb not null,
  created_at timestamptz not null default now(),
  dispatched_at timestamptz,
  dispatch_attempts int not null default 0,
  error text
);

4. JSONB WITH GENERATED COLUMNS

create table sbom_document (
  id bigserial primary key,
  tenant_id uuid not null,
  artifact_purl text not null,
  content_hash bytea not null,
  doc jsonb not null,
  created_at timestamptz not null default now(),

  -- hot keys as generated columns
  bom_format text generated always as ((doc->>'bomFormat')) stored,
  spec_version text generated always as ((doc->>'specVersion')) stored
);

create unique index ux_sbom_doc_hash on sbom_document(tenant_id, content_hash);
create index ix_sbom_doc_tenant_artifact on sbom_document(tenant_id, artifact_purl, created_at desc);
create index ix_sbom_doc_json_gin on sbom_document using gin (doc jsonb_path_ops);
create index ix_sbom_doc_bomformat on sbom_document(tenant_id, bom_format);

5. MATERIALIZED VIEWS FOR HOT READS

create materialized view mv_artifact_risk as
select tenant_id, artifact_purl, max(score) as risk_score
from open_findings
group by tenant_id, artifact_purl;

create unique index ux_mv_artifact_risk
  on mv_artifact_risk(tenant_id, artifact_purl);

-- Refresh
refresh materialized view concurrently mv_artifact_risk;

6. PARTITIONING (TIME-BASED EVENTS)

create table scan_run_event (
  tenant_id uuid not null,
  scan_run_id bigint not null,
  occurred_at timestamptz not null,
  event_type text not null,
  payload jsonb not null
) partition by range (occurred_at);

create index brin_scan_events_time
  on scan_run_event using brin (occurred_at);

7. INDEX PATTERNS

Use Case	Index Pattern
Tenant-scoped queries	`INDEX(tenant_id, ...)`
Latest version lookup	`INDEX(tenant_id, artifact_purl, created_at DESC)`
Queue readiness	`INDEX(kind, run_after, id) WHERE locked_at IS NULL`
JSONB containment	`INDEX USING GIN (doc jsonb_path_ops)`
JSONB key lookup	`INDEX((doc->>'key'))`
Time-series scan	`INDEX USING BRIN (occurred_at)`

8. PERFORMANCE REQUIREMENTS

8.1 Query Performance Standards

Required per PR:

Provide SQL query + intended parameters
Provide EXPLAIN (ANALYZE, BUFFERS) from staging-sized dataset
Identify serving index(es)
Confirm row estimates not wildly wrong
Confirm tenant-scoped and uses tenant-leading index

8.2 Index Performance Standards

Pattern	Requirement
Tenant queries	`INDEX(tenant_id, ...)` leading column
Sort ordering	Index must end with `ORDER BY` column + direction
Queue claims	Partial index `WHERE locked_at IS NULL`
Time-series	BRIN index on timestamp columns for partitioned tables
JSONB containment	GIN `jsonb_path_ops` for `@>` queries

8.3 General Performance Rules

Every hot query must have an index story
Write path stays simple: prefer append-only versioning
Multi-tenant explicit: all core tables include tenant_id
Derived data modeled as projection tables or materialized views
Idempotency enforced in DB: unique keys for imports/jobs/results

8.4 Materialized Views vs Projection Tables

Materialized views are acceptable when:

You can refresh them deterministically at a defined cadence (owned by a specific worker/job).
You can afford full refresh cost, or the dataset is bounded.
You provide a unique index to enable REFRESH MATERIALIZED VIEW CONCURRENTLY.

Prefer projection tables when:

You need incremental updates (on import/scan completion).
You need deterministic point-in-time snapshots per scan manifest (replay/audit).
Refresh cost would scale with the entire dataset on every change.

Checklist:

Every derived read model declares: owner, refresh cadence/trigger, retention, and idempotency key.
No UI/API endpoint depends on a heavy non-materialized view for hot paths.

8.5 Queue + Outbox Rules (avoid deadlocks)

Queue claim rules:

Claim in a short transaction (commit immediately after lock acquisition).
Do work outside the transaction.
On failure: increment attempts, compute backoff into run_after, and release locks.
Define a DLQ condition (attempts > N) that is queryable and observable.

Outbox dispatch rules:

Dispatch is idempotent (consumer must tolerate duplicates).
The dispatcher writes a stable delivery attempt record (dispatched_at, dispatch_attempts, error).

8.6 Migration Safety Rules

Create/drop indexes concurrently on large tables (CREATE INDEX CONCURRENTLY, DROP INDEX CONCURRENTLY).
Add NOT NULL in stages: add nullable column → backfill in batches → enforce constraint → then add default (if needed).
Avoid long-running ALTER TABLE on high-volume tables without a lock plan.

8.7 Definition of Done (new table/view)

A PR adding a table/view is incomplete unless it includes:

Table classification (SoR / projection / queue / event).
Primary key + idempotency unique key.
Tenant scoping strategy (and RLS policy when applicable).
Index plan mapped to the top 1–3 query patterns (include EXPLAIN (ANALYZE, BUFFERS) output).
Retention plan (partitioning and drop policy for high-volume tables).
Refresh/update plan for derived models (owner + cadence).

9. FEATURE FLAG SCHEMA

create table routing.feature_flag (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null default 1,
  is_enabled boolean not null default true,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);

create table routing.feature_flag_history (
  id uuid primary key default gen_random_uuid(),
  feature_flag_id uuid not null,
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null,
  changed_at timestamptz not null default now(),
  changed_by uuid
);

Redis Cache Pattern:

SETEX flag:{key}:{version} <ttl> <json>

10. RATE LIMIT CONFIGURATION

create table routing.rate_limit_config (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  limit_per_interval int not null,
  interval_seconds int not null,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);

Redis Counter Pattern:

INCR rl:{bucket}:{window}

11. INSTANCE REGISTRY

create table routing.instance (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  instance_key text not null,
  domain text not null,
  last_heartbeat timestamptz not null default now(),
  status text not null check (status in ('active','draining','offline')),
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, instance_key),
  unique (tenant_id, domain)
);

12. MIGRATION PATTERNS

12.1 Schema Versioning

create table core.schema_version (
  module text primary key,
  version int not null,
  applied_at timestamptz not null default now(),
  migration_hash text not null
);

12.2 Migration Script Template

-- Migration: <module>_v<version>_<description>
-- Dependencies: <module>_v<previous_version>

begin;

-- Schema changes
create table if not exists <module>.<table> (...);

-- Data migrations (if needed)

-- Update version
insert into core.schema_version (module, version, migration_hash)
values ('<module>', <version>, '<hash>')
on conflict (module) do update
set version = excluded.version,
    applied_at = now(),
    migration_hash = excluded.migration_hash;

commit;

13. CONNECTION POOLING

13.1 Recommended Settings

database:
  host: postgres.local
  port: 5432
  database: stellaops
  username: stellaops_app
  password: <from secrets>
  pool:
    min_size: 5
    max_size: 20
    connection_timeout: 5000  # ms
    idle_timeout: 600000      # ms (10 min)
    max_lifetime: 1800000     # ms (30 min)

13.2 .NET Configuration

services.AddNpgsqlDataSource(connectionString, builder =>
{
    builder.MaxConnections = 20;
    builder.MinConnections = 5;
    builder.ConnectionIdleLifetime = TimeSpan.FromMinutes(10);
    builder.ConnectionLifetime = TimeSpan.FromMinutes(30);
});

14. MONITORING & OBSERVABILITY

14.1 Essential Metrics

postgres_connections_active
postgres_connections_idle
postgres_transaction_duration_seconds
postgres_query_duration_seconds
postgres_cache_hit_ratio
postgres_table_size_bytes
postgres_index_size_bytes
postgres_slow_queries_total

14.2 Query Performance Monitoring

-- Enable pg_stat_statements
create extension if not exists pg_stat_statements;

-- Top 10 slowest queries
select
  substring(query, 1, 100) as query_snippet,
  calls,
  total_exec_time / 1000 as total_time_sec,
  mean_exec_time as mean_time_ms,
  max_exec_time as max_time_ms
from pg_stat_statements
order by total_exec_time desc
limit 10;

15. BACKUP & RECOVERY

15.1 Backup Strategy

Point-in-time recovery (PITR): Enabled via WAL archiving
Daily full backups: Automated via pg_basebackup
Retention: 30 days for compliance
Testing: Monthly restore drills

15.2 Backup Commands

# Full backup
pg_basebackup -h postgres.local -D /backup/$(date +%Y%m%d) -Ft -z -P

# WAL archiving (postgresql.conf)
# archive_mode = on
# archive_command = 'cp %p /archive/%f'

16. SECURITY BEST PRACTICES

16.1 Access Control

Use RLS for multi-tenancy isolation
Grant minimal privileges per role
Separate read-only and read-write users
Use connection pooler with separate credentials

16.2 Encryption

TLS for connections: sslmode=require
Transparent data encryption (TDE) for data at rest
Encrypted backups

16.3 Audit Logging

-- Enable audit logging
create extension if not exists pgaudit;

-- Configure audit (postgresql.conf)
-- pgaudit.log = 'write, ddl'
-- pgaudit.log_catalog = off

Document Version: 1.0 Target Platform: .NET 10, PostgreSQL ≥16, Angular v17

16 KiB Raw Blame History Unescape Escape