Evaluate PostgreSQL vs MongoDB for StellaOps

Here's a quick, practical cheat sheet on choosing PostgreSQL vs MongoDB for security/DevOps apps, plus how I'd model SBOM/VEX and queues in StellaOps without adding moving parts.


PostgreSQL you can lean on (why it often wins for ops apps)

  • JSONB that flies: Store documents yet query like SQL. Add GIN indexes on JSONB fields for fast lookups (jsonb_ops general; jsonb_path_ops great for @> containment).
  • Queue pattern built in: SELECT … FOR UPDATE SKIP LOCKED lets multiple workers pop jobs from the same table safely, with no head-of-line blocking and no extra broker.
  • Cooperative locks: advisory locks (session- or transaction-scoped) for "at-most-once" sections or leader election.
  • Lightweight pub/sub: LISTEN/NOTIFY for async nudges between services (poke a worker to rescan, refresh cache, etc.).
  • Search included: full-text search (tsvector/tsquery) is native, so moderate needs don't require a separate search service.
  • Serious backups: PITR with WAL archiving / pg_basebackup for deterministic rollbacks and offline bundles.

MongoDB facts to factor in

  • Flexible ingest: Schemaless docs make it easy to absorb varied telemetry and vendor feeds.
  • Horizontal scale: sharding is mature for huge, read-heavy datasets.
  • Consistency is a choice: design embedding vs. references deliberately, and decide when multi-document transactions are worth their cost.

A simple rule of thumb (StellaOps-style)

  • System of record: PostgreSQL (JSONB first).
  • Hot paths: Materialized views + JSONB GIN indexes.
  • Queues & coordination: PostgreSQL (SKIP LOCKED + advisory locks).
  • Cache/accel only: Valkey (ephemeral).
  • MongoDB: Optional for very large, read-optimized graph snapshots (e.g., periodically baked reachability graphs) if Postgres starts to strain.

Concrete patterns you can drop in today

1) SBOM/VEX storage (Postgres JSONB)

-- Documents
CREATE TABLE sbom (
  id BIGSERIAL PRIMARY KEY,
  artifact_purl TEXT NOT NULL,
  doc JSONB NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX sbom_purl_idx ON sbom(artifact_purl);
CREATE INDEX sbom_doc_gin ON sbom USING GIN (doc jsonb_path_ops);

-- Common queries
-- find components by name/version:
-- SELECT * FROM sbom WHERE doc @> '{"components":[{"name":"openssl","version":"3.0.14"}]}';

-- VEX
CREATE TABLE vex (
  id BIGSERIAL PRIMARY KEY,
  subject_purl TEXT NOT NULL,
  vex_doc JSONB NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX vex_subject_idx ON vex(subject_purl);
CREATE INDEX vex_doc_gin ON vex USING GIN (vex_doc jsonb_path_ops);

2) Hot reads via materialized views

CREATE MATERIALIZED VIEW mv_open_findings AS
SELECT
  s.artifact_purl,
  c->>'name' AS comp,
  c->>'version' AS ver,
  v.vex_doc
FROM sbom s
CROSS JOIN LATERAL jsonb_array_elements(s.doc->'components') c
LEFT JOIN vex v ON v.subject_purl = s.artifact_purl
-- add WHERE clauses to prefilter only actionable rows
;
CREATE INDEX mv_open_findings_idx ON mv_open_findings(artifact_purl, comp);

Refresh cadence: on feed import or via a scheduler, e.g. REFRESH MATERIALIZED VIEW CONCURRENTLY mv_open_findings; note that CONCURRENTLY requires a unique index on the view (see §6 of the guide below).

3) Queue without a broker

CREATE TABLE job_queue(
  id BIGSERIAL PRIMARY KEY,
  kind TEXT NOT NULL,          -- e.g., 'scan', 'sbom-diff'
  payload JSONB NOT NULL,
  run_after TIMESTAMPTZ DEFAULT now(),
  attempts INT DEFAULT 0,
  locked_at TIMESTAMPTZ,
  locked_by TEXT
);
CREATE INDEX job_queue_ready_idx ON job_queue(kind, run_after);

-- Worker loop
WITH cte AS (
  SELECT id FROM job_queue
  WHERE kind = $1 AND run_after <= now() AND locked_at IS NULL
  ORDER BY id
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(), locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;

Release the claim on failure (locked_at = NULL, locked_by = NULL, attempts = attempts + 1) and delete or mark done on success, as sketched below.
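A minimal sketch of both release paths (the 30-second linear backoff is an illustrative policy, not a project standard):

-- On failure: release the claim and back off before retrying
UPDATE job_queue
SET locked_at = NULL,
    locked_by = NULL,
    attempts  = attempts + 1,
    run_after = now() + interval '30 seconds' * (attempts + 1)  -- illustrative linear backoff
WHERE id = $1;

-- On success: delete for compactness (or update a status column to done)
DELETE FROM job_queue WHERE id = $1;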

4) Advisory lock for singletons

-- Acquire (per tenant, per artifact)
SELECT pg_try_advisory_xact_lock(hashtextextended('recalc:'||tenant||':'||artifact, 0));
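-- Returns true if the lock was acquired (skip the critical section on false);
-- transaction-scoped locks release automatically at COMMIT/ROLLBACK.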

5) Nudge workers without a bus

-- NOTIFY only accepts a string-literal payload, so use pg_notify() for dynamic values:
SELECT pg_notify('stella_scan', json_build_object('purl', $1, 'priority', 5)::text);
-- workers LISTEN stella_scan and enqueue quickly

When to add MongoDB

  • You need interactive exploration over hundreds of millions of nodes/edges (e.g., historical "proof-of-integrity" graphs) where document fan-out and denormalized reads beat relational joins.
  • Snapshot cadence is batchy (hourly/daily), and you can re-emit snapshots deterministically from Postgres (the single source of truth).
  • You want to isolate read spikes from the transactional core.

Snapshot pipe: Postgres → (ETL) → MongoDB collection {graph_id, node, edges[], attrs} with compound shard keys tuned to your UI traversal.
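
A hedged sketch of the export side, assuming a hypothetical graph_node projection table baked from Postgres; it emits one JSON document per node in the collection shape above:

-- graph_node is an illustrative projection table, not an existing StellaOps schema
SELECT json_build_object(
  'graph_id', n.graph_id,
  'node',     n.node_id,
  'edges',    n.edges,   -- jsonb array of adjacent node ids
  'attrs',    n.attrs    -- jsonb attribute bag
) AS mongo_doc
FROM graph_node n
WHERE n.graph_id = $1;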


Why this fits StellaOps

  • Fewer moving parts on-prem/air-gapped.
  • Deterministic replays (PITR + immutable imports).
  • Clear performance levers (GIN indexes, MVs, SKIP LOCKED queues).
  • MongoDB stays optional, purpose-built for giant read graphs, not a default dependency.

The patterns above can be turned into ready-to-run .sql migrations and a small .NET 10 worker (Dapper/EF Core) that implements the queue loop, advisory locks, and LISTEN/NOTIFY hooks. Below is a handoff-ready set of PostgreSQL tables/views engineering guidelines intended for developer review. It is written as a gap-finding checklist with concrete DDL patterns and performance red flags (Postgres as system of record, JSONB where useful, derived projections where needed).


PostgreSQL Tables & Views Engineering Guide

0) Non-negotiable principles

  1. Every hot query must have an index story. If you cannot name the index that serves it, you have a performance gap.
  2. Write path stays simple. Prefer append-only versioning to large updates (especially for JSONB).
  3. Multi-tenant must be explicit. Every core table includes tenant_id and indexes are tenant-prefixed.
  4. Derived data is a product. If the UI needs it fast, model it as a projection table or materialized view, not as an ad-hoc mega-join.
  5. Idempotency is enforced in the DB. Unique keys for imports/jobs/results; no “best effort” dedupe in application only.

1) Table taxonomy and what to look for

Use this to classify every table; each class has different indexing/retention/locking rules.

A. Source-of-truth (SOR) tables

Examples: sbom_document, vex_document, feed_import, scan_manifest, attestation.

  • Expect: immutable rows, versioning via new row inserts.
  • Gaps: frequent updates to large JSONB; missing content_hash; no unique idempotency key.

B. Projection tables (query-optimized)

Examples: open_findings, artifact_risk_summary, component_index.

  • Expect: denormalized, indexed for UI/API; refresh/update strategy defined.
  • Gaps: projections rebuilt from scratch too often; missing incremental update plan; no retention plan.

C. Queue/outbox tables

Examples: job_queue, outbox_events.

  • Expect: SKIP LOCKED claim pattern; retry + DLQ; minimal lock duration.
  • Gaps: holding row locks while doing work; missing partial index for “ready” jobs.

D. Audit/event tables

Examples: scan_run_event, decision_event, access_audit.

  • Expect: append-only; partitioned by time; BRIN on timestamps.
  • Gaps: single huge table without partitioning; slow deletes instead of partition drops.

2) Naming, keys, and required columns

Required columns per class

SOR documents (SBOM/VEX/Attestations)

  • tenant_id uuid
  • id bigserial (internal PK)
  • external_id uuid (optional API-facing id)
  • content_hash bytea (sha256) NOT NULL
  • doc jsonb NOT NULL
  • created_at timestamptz NOT NULL default now()
  • supersedes_id bigint NULL (version chain) OR version int

Checklist

  • Unique constraint exists: (tenant_id, content_hash)
  • Version strategy exists (supersedes/version) and is queryable
  • “Latest” access is index-backed (see §4 and the sketch below)
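
As a sketch, a vex_document table that satisfies this contract (mirroring the sbom_document DDL in §3; column names beyond the required set are assumptions):

CREATE TABLE vex_document (
  id            bigserial PRIMARY KEY,
  tenant_id     uuid NOT NULL,
  external_id   uuid,                          -- optional API-facing id
  subject_purl  text NOT NULL,
  content_hash  bytea NOT NULL,                -- sha256 of the raw doc
  doc           jsonb NOT NULL,
  created_at    timestamptz NOT NULL DEFAULT now(),
  supersedes_id bigint                         -- version chain
);
CREATE UNIQUE INDEX ux_vex_doc_hash ON vex_document(tenant_id, content_hash);
CREATE INDEX ix_vex_doc_latest ON vex_document(tenant_id, subject_purl, created_at DESC);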

Queue

  • tenant_id uuid (if multi-tenant)
  • id bigserial
  • kind text
  • payload jsonb
  • run_after timestamptz
  • attempts int
  • locked_at timestamptz NULL
  • locked_by text NULL
  • status smallint (optional; e.g., ready/running/done/dead)

Checklist

  • “Ready to claim” has a partial index (see §4)
  • Claim transaction is short (claim+commit; work outside lock)

3) JSONB rules that prevent “looks fine → melts in prod”

When JSONB is appropriate

  • Storing signed envelopes (DSSE), SBOM/VEX raw docs, vendor payloads.
  • Ingest-first scenarios where schema evolves.

When JSONB is a performance hazard

  • You frequently query deep keys/arrays (components, vulnerabilities, call paths).
  • You need sorting/aggregations on doc fields.

Mandatory pattern for hot JSON fields

  1. Keep the raw JSONB for fidelity.
  2. Extract hot keys into stored generated columns (or real columns), index those.
  3. Extract hot arrays into child tables (components, vulnerabilities).

Example:

CREATE TABLE sbom_document (
  id           bigserial PRIMARY KEY,
  tenant_id    uuid NOT NULL,
  artifact_purl text NOT NULL,
  content_hash bytea NOT NULL,
  doc          jsonb NOT NULL,
  created_at   timestamptz NOT NULL DEFAULT now(),

  -- hot keys as generated columns
  bom_format   text GENERATED ALWAYS AS ((doc->>'bomFormat')) STORED,
  spec_version text GENERATED ALWAYS AS ((doc->>'specVersion')) STORED
);

CREATE UNIQUE INDEX ux_sbom_doc_hash ON sbom_document(tenant_id, content_hash);
CREATE INDEX ix_sbom_doc_tenant_artifact ON sbom_document(tenant_id, artifact_purl, created_at DESC);
CREATE INDEX ix_sbom_doc_json_gin ON sbom_document USING GIN (doc jsonb_path_ops);
CREATE INDEX ix_sbom_doc_bomformat ON sbom_document(tenant_id, bom_format);
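
For step 3 (hot arrays into child tables), a sketch of the sbom_component table referenced in §11; the column set is an assumption, populated at ingest time from doc->'components':

CREATE TABLE sbom_component (
  sbom_id   bigint NOT NULL REFERENCES sbom_document(id),
  tenant_id uuid   NOT NULL,
  name      text   NOT NULL,
  version   text,
  purl      text
);
CREATE INDEX ix_sbom_component_lookup ON sbom_component(tenant_id, name, version);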

Checklist

  • Any query using doc->> in WHERE has either an expression index or a generated column index
  • Any query using jsonb_array_elements(...) in hot path has been replaced by a normalized child table or a projection table

4) Indexing standards (what devs must justify)

Core rules

  1. Tenant-first: INDEX(tenant_id, …) for anything read per tenant.
  2. Sort support: if query uses ORDER BY created_at DESC, index must end with created_at DESC.
  3. Partial indexes for sparse predicates (status/locked flags).
  4. BRIN for massive append-only time series.
  5. GIN jsonb_path_ops for containment (@>) on JSONB; avoid GIN for everything.

Required index patterns by use case

“Latest version per artifact”

If you store versions as rows:

-- supports: WHERE tenant_id=? AND artifact_purl=? ORDER BY created_at DESC LIMIT 1
CREATE INDEX ix_sbom_latest ON sbom_document(tenant_id, artifact_purl, created_at DESC);

Ready queue claims

CREATE INDEX ix_job_ready
ON job_queue(kind, run_after, id)
WHERE locked_at IS NULL;

-- Optional: tenant scoped
CREATE INDEX ix_job_ready_tenant
ON job_queue(tenant_id, kind, run_after, id)
WHERE locked_at IS NULL;

JSON key lookup (expression index)

-- supports: WHERE (doc->>'subject') = ?
CREATE INDEX ix_vex_subject_expr
ON vex_document(tenant_id, (doc->>'subject'));

Massive event table time filtering

CREATE INDEX brin_scan_events_time
ON scan_run_event USING BRIN (occurred_at);

Red flags

  • GIN index on a JSONB column + frequent updates = bloat and write amplification.
  • No partial index for queue readiness → sequential scans under load.
  • Composite indexes with wrong leading column order (e.g., created_at, tenant_id) → not used.

5) Partitioning and retention (avoid “infinite tables”)

Use partitioning for:

  • audit/events
  • scan run logs
  • large finding histories
  • anything beyond tens of millions of rows with time-based access

Standard approach

  • Partition by occurred_at (monthly) for event/audit tables.
  • Retention by dropping partitions (fast and vacuum-free).

Example:

CREATE TABLE scan_run_event (
  tenant_id uuid NOT NULL,
  scan_run_id bigint NOT NULL,
  occurred_at timestamptz NOT NULL,
  event_type text NOT NULL,
  payload jsonb NOT NULL
) PARTITION BY RANGE (occurred_at);
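
A sketch of rollover and retention against this table (partition names and dates are illustrative):

-- Rollover: create next month's partition ahead of time (scheduler or migration)
CREATE TABLE scan_run_event_2026_01
  PARTITION OF scan_run_event
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE INDEX ON scan_run_event_2026_01 USING BRIN (occurred_at);

-- Retention: drop whole partitions instead of DELETE
DROP TABLE scan_run_event_2025_01;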

Checklist

  • Partition creation/rollover process exists (migration or scheduler)
  • Retention is “DROP PARTITION”, not “DELETE WHERE occurred_at < …”
  • Each partition has needed local indexes (BRIN/time + tenant filters)

6) Views vs Materialized Views vs Projection Tables

Use a normal VIEW when

  • It's thin (renaming columns, simple joins) and not used in hot paths.

Use a MATERIALIZED VIEW when

  • It accelerates complex joins/aggregations and can be refreshed on a schedule.
  • You can tolerate refresh lag.

Materialized view requirements

  • Must have a unique index to use REFRESH … CONCURRENTLY.
  • Refresh must be outside an explicit transaction block.

Example:

CREATE MATERIALIZED VIEW mv_artifact_risk AS
SELECT tenant_id, artifact_purl, max(score) AS risk_score
FROM open_findings
GROUP BY tenant_id, artifact_purl;

CREATE UNIQUE INDEX ux_mv_artifact_risk
ON mv_artifact_risk(tenant_id, artifact_purl);
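
-- With the unique index in place, refresh without blocking readers:
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_artifact_risk;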

Prefer projection tables over MV when

  • You need incremental updates (on import/scan completion).
  • You need deterministic “point-in-time” snapshots per manifest.

Checklist

  • Every MV has refresh cadence + owner (which worker/job triggers it)
  • UI/API queries do not depend on a heavy non-materialized view
  • If “refresh cost” scales with whole dataset, projection table exists instead

7) Queue and outbox patterns that do not deadlock

Claim pattern (short transaction)

WITH cte AS (
  SELECT id
  FROM job_queue
  WHERE kind = $1
    AND run_after <= now()
    AND locked_at IS NULL
  ORDER BY id
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(),
    locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;

Rules

  • Claim + commit quickly.
  • Do work outside the lock.
  • On completion: update row to done (or delete if you want compactness).
  • On failure: increment attempts, set run_after = now() + backoff, release lock.

Checklist

  • Worker does not keep transaction open while scanning/importing
  • Backoff policy is encoded (in DB columns) and observable
  • DLQ condition exists (attempts > N) and is queryable (sketched below)
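
To make the DLQ condition queryable, a minimal sketch (the threshold N = 10 is an illustrative choice):

CREATE VIEW job_queue_dead AS
SELECT id, kind, payload, attempts, locked_by
FROM job_queue
WHERE attempts > 10;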

8) Query performance review checklist (what to require in PRs)

For each new endpoint/query:

  • Provide the query (SQL) and the intended parameters.
  • Provide EXPLAIN (ANALYZE, BUFFERS) from a dataset size that resembles staging.
  • Identify the serving index(es).
  • Confirm row estimates are not wildly wrong (if they are: stats or predicate mismatch).
  • Confirm it is tenant-scoped and uses the tenant-leading index.

Common fixes

  • Replace IN (SELECT …) with EXISTS for correlated checks (sketched below).
  • Back ORDER BY … LIMIT queries with an index that matches the ordering.
  • Avoid exploding joins with JSON arrays; pre-extract into child tables.
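
For the first fix, a sketch of the IN → EXISTS rewrite using the §3/§4 tables:

-- Correlated EXISTS lets the planner stop at the first matching row:
SELECT s.artifact_purl
FROM sbom_document s
WHERE EXISTS (
  SELECT 1
  FROM vex_document v
  WHERE v.tenant_id = s.tenant_id
    AND (v.doc->>'subject') = s.artifact_purl
);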

9) Vacuum, bloat, and “why is disk growing”

Design to avoid bloat

  • Append-only for large docs and events.
  • If frequent updates are needed, isolate hot-updated columns into a smaller table.

Example split (DDL sketched below):

  • job_queue_payload (stable)
  • job_queue_state (locked/status/attempts updated frequently)
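
Sketched DDL for that split (columns mirror the queue contract in §2):

-- Stable payload: written once, never updated
CREATE TABLE job_queue_payload (
  id      bigserial PRIMARY KEY,
  kind    text  NOT NULL,
  payload jsonb NOT NULL
);

-- Hot state: small rows, frequent updates, cheap to vacuum
CREATE TABLE job_queue_state (
  job_id    bigint PRIMARY KEY REFERENCES job_queue_payload(id),
  run_after timestamptz NOT NULL DEFAULT now(),
  attempts  int NOT NULL DEFAULT 0,
  locked_at timestamptz,
  locked_by text
);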

Checklist

  • Large frequently-updated JSONB tables have been questioned
  • Updates do not rewrite big TOAST values repeatedly
  • Retention is partition-drop where possible

10) Migration safety rules (prevent production locks)

  • Index creation: CREATE INDEX CONCURRENTLY.

  • Dropping indexes: DROP INDEX CONCURRENTLY.

  • New column with default on large table (staged as sketched below):

    1. ADD COLUMN nullable
    2. backfill in batches
    3. ALTER COLUMN SET NOT NULL
    4. add default if needed
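
An illustrative staging of those steps (external_id is a hypothetical column name):

ALTER TABLE sbom_document ADD COLUMN external_id uuid;  -- step 1: nullable, metadata-only

-- step 2: backfill in bounded batches; repeat until 0 rows are updated
UPDATE sbom_document
SET external_id = gen_random_uuid()  -- gen_random_uuid(): core in PG13+, else pgcrypto
WHERE id IN (
  SELECT id FROM sbom_document
  WHERE external_id IS NULL
  LIMIT 10000
);

-- step 3: brief exclusive lock plus a validating scan
ALTER TABLE sbom_document ALTER COLUMN external_id SET NOT NULL;
-- step 4: adding a default afterwards is metadata-only
ALTER TABLE sbom_document ALTER COLUMN external_id SET DEFAULT gen_random_uuid();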

Checklist

  • No long-running ALTER TABLE on huge tables without plan
  • Any new NOT NULL constraint is staged safely

11) StellaOps-specific schema guidance (SBOM/VEX/Finding)

Even if you keep raw SBOM/VEX JSON:

  • sbom_document (raw, immutable)
  • sbom_component (extracted components)
  • vex_document (raw, immutable)
  • vex_statement (extracted statements per CVE/component)
  • finding (facts: CVE ↔ component ↔ artifact ↔ scan_run)
  • scan_manifest (determinism: feed versions/hashes, policy hash)
  • scan_run (links results to manifest)

Key gap detectors

  • If “find all artifacts affected by CVE X” is slow → missing finding indexing (see the sketch below).
  • If “component search” is slow → missing sbom_component and its indexes.
  • If “replay this scan” is not exact → missing scan_manifest + feed import hashes.
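
A sketch of the finding fact table and the index behind the first gap detector (the column set is an assumed minimum):

-- Minimal finding fact table; serves "all artifacts affected by CVE X"
CREATE TABLE finding (
  tenant_id     uuid   NOT NULL,
  cve_id        text   NOT NULL,
  component_id  bigint NOT NULL,  -- FK to sbom_component (assumed)
  artifact_purl text   NOT NULL,
  scan_run_id   bigint NOT NULL   -- FK to scan_run (assumed)
);
CREATE INDEX ix_finding_by_cve ON finding(tenant_id, cve_id, artifact_purl);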

12) Minimal “definition of done” for a new table/view

A PR adding a table/view is incomplete unless it includes:

  • Table classification (SOR / projection / queue / event)
  • Primary key and idempotency unique key
  • Tenant scoping strategy
  • Index plan mapped to known queries
  • Retention plan (especially for event/projection tables)
  • Refresh/update plan if derived
  • Example query + EXPLAIN for the top 1-3 access patterns

This guide is intended to land as a single drop-in repo document (e.g., /docs/platform/postgres-table-view-guidelines.md), formatted as a team-facing guideline and accompanied by a one-page “Architecture/Performance Gaps” review form that engineers can paste into PR descriptions.