17 KiB
Here’s a quick, practical cheat‑sheet on choosing PostgreSQL vs MongoDB for security/DevOps apps—plus how I’d model SBOM/VEX and queues in Stella Ops without adding moving parts.
PostgreSQL you can lean on (why it often wins for ops apps)
- JSONB that flies: Store documents yet query like SQL. Add GIN indexes on JSONB fields for fast lookups (
jsonb_opsgeneral;jsonb_path_opsgreat for@>containment). - Queue pattern built‑in:
SELECT … FOR UPDATE SKIP LOCKEDlets multiple workers pop jobs from the same table safely—no head‑of‑line blocking, no extra broker. - Cooperative locks: Advisory locks (session/transaction) for “at‑most‑once” sections or leader election.
- Lightweight pub/sub: LISTEN/NOTIFY for async nudges between services (poke a worker to re‑scan, refresh cache, etc.).
- Search included: Full‑text search (tsvector/tsquery) is native—no separate search service for moderate needs.
- Serious backups: PITR with WAL archiving /
pg_basebackupfor deterministic rollbacks and offline bundles.
MongoDB facts to factor in
- Flexible ingest: Schemaless docs make it easy to absorb varied telemetry and vendor feeds.
- Horizontal scale: Sharding is mature for huge, read‑heavy datasets.
- Consistency is a choice: Design embedding vs refs and when to use multi‑document transactions.
A simple rule of thumb (Stella Ops‑style)
- System of record: PostgreSQL (JSONB first).
- Hot paths: Materialized views + JSONB GIN indexes.
- Queues & coordination: PostgreSQL (skip‑locked + advisory locks).
- Cache/accel only: Valkey (ephemeral).
- MongoDB: Optional for very large, read‑optimized graph snapshots (e.g., periodically baked reachability graphs) if Postgres starts to strain.
Concrete patterns you can drop in today
1) SBOM/VEX storage (Postgres JSONB)
-- Documents
CREATE TABLE sbom (
id BIGSERIAL PRIMARY KEY,
artifact_purl TEXT NOT NULL,
doc JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX sbom_purl_idx ON sbom(artifact_purl);
CREATE INDEX sbom_doc_gin ON sbom USING GIN (doc jsonb_path_ops);
-- Common queries
-- find components by name/version:
-- SELECT * FROM sbom WHERE doc @> '{"components":[{"name":"openssl","version":"3.0.14"}]}';
-- VEX
CREATE TABLE vex (
id BIGSERIAL PRIMARY KEY,
subject_purl TEXT NOT NULL,
vex_doc JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX vex_subject_idx ON vex(subject_purl);
CREATE INDEX vex_doc_gin ON vex USING GIN (vex_doc jsonb_path_ops);
2) Hot reads via materialized views
CREATE MATERIALIZED VIEW mv_open_findings AS
SELECT
s.artifact_purl,
c->>'name' AS comp,
c->>'version' AS ver,
v.vex_doc
FROM sbom s
CROSS JOIN LATERAL jsonb_array_elements(s.doc->'components') c
LEFT JOIN vex v ON v.subject_purl = s.artifact_purl
-- add WHERE clauses to pre‑filter only actionable rows
;
CREATE INDEX mv_open_findings_idx ON mv_open_findings(artifact_purl, comp);
Refresh cadence: on feed import or via a scheduler; REFRESH MATERIALIZED VIEW CONCURRENTLY mv_open_findings;
3) Queue without a broker
CREATE TABLE job_queue(
id BIGSERIAL PRIMARY KEY,
kind TEXT NOT NULL, -- e.g., 'scan', 'sbom-diff'
payload JSONB NOT NULL,
run_after TIMESTAMPTZ DEFAULT now(),
attempts INT DEFAULT 0,
locked_at TIMESTAMPTZ,
locked_by TEXT
);
CREATE INDEX job_queue_ready_idx ON job_queue(kind, run_after);
-- Worker loop
WITH cte AS (
SELECT id FROM job_queue
WHERE kind = $1 AND run_after <= now() AND locked_at IS NULL
ORDER BY id
FOR UPDATE SKIP LOCKED
LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(), locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;
Release/fail with: set locked_at=NULL, locked_by=NULL, attempts=attempts+1 or delete on success.
4) Advisory lock for singletons
-- Acquire (per tenant, per artifact)
SELECT pg_try_advisory_xact_lock(hashtextextended('recalc:'||tenant||':'||artifact, 0));
5) Nudge workers without a bus
NOTIFY stella_scan, json_build_object('purl', $1, 'priority', 5)::TEXT;
-- workers LISTEN stella_scan and enqueue quickly
When to add MongoDB
- You need interactive exploration over hundreds of millions of nodes/edges (e.g., historical “proof‑of‑integrity” graphs) where document fan‑out and denormalized reads beat relational joins.
- Snapshot cadence is batchy (hourly/daily), and you can re‑emit snapshots deterministically from Postgres (single source of truth).
- You want to isolate read spikes from the transactional core.
Snapshot pipe: Postgres → (ETL) → MongoDB collection {graph_id, node, edges[], attrs} with compound shard keys tuned to your UI traversal.
Why this fits Stella Ops
- Fewer moving parts on‑prem/air‑gapped.
- Deterministic replays (PITR + immutable imports).
- Clear performance levers (GIN indexes, MVs, skip‑locked queues).
- MongoDB stays optional, purpose‑built for giant read graphs—not a default dependency.
If you want, I can turn the above into ready‑to‑run .sql migrations and a small .NET 10 worker (Dapper/EF Core) that implements the queue loop + advisory locks + LISTEN/NOTIFY hooks.
Below is a handoff-ready set of PostgreSQL tables/views engineering guidelines intended for developer review. It is written as a gap-finding checklist with concrete DDL patterns and performance red flags (Postgres as system of record, JSONB where useful, derived projections where needed).
PostgreSQL Tables & Views Engineering Guide
0) Non-negotiable principles
- Every hot query must have an index story. If you cannot name the index that serves it, you have a performance gap.
- Write path stays simple. Prefer append-only versioning to large updates (especially for JSONB).
- Multi-tenant must be explicit. Every core table includes
tenant_idand indexes are tenant-prefixed. - Derived data is a product. If the UI needs it fast, model it as a projection table or materialized view, not as an ad-hoc mega-join.
- Idempotency is enforced in the DB. Unique keys for imports/jobs/results; no “best effort” dedupe in application only.
1) Table taxonomy and what to look for
Use this to classify every table; each class has different indexing/retention/locking rules.
A. Source-of-truth (SOR) tables
Examples: sbom_document, vex_document, feed_import, scan_manifest, attestation.
- Expect: immutable rows, versioning via new row inserts.
- Gaps: frequent updates to large JSONB; missing
content_hash; no unique idempotency key.
B. Projection tables (query-optimized)
Examples: open_findings, artifact_risk_summary, component_index.
- Expect: denormalized, indexed for UI/API; refresh/update strategy defined.
- Gaps: projections rebuilt from scratch too often; missing incremental update plan; no retention plan.
C. Queue/outbox tables
Examples: job_queue, outbox_events.
- Expect:
SKIP LOCKEDclaim pattern; retry + DLQ; minimal lock duration. - Gaps: holding row locks while doing work; missing partial index for “ready” jobs.
D. Audit/event tables
Examples: scan_run_event, decision_event, access_audit.
- Expect: append-only; partitioned by time; BRIN on timestamps.
- Gaps: single huge table without partitioning; slow deletes instead of partition drops.
2) Naming, keys, and required columns
Required columns per class
SOR documents (SBOM/VEX/Attestations)
tenant_id uuidid bigserial(internal PK)external_id uuid(optional API-facing id)content_hash bytea(sha256) NOT NULLdoc jsonbNOT NULLcreated_at timestamptzNOT NULL default now()supersedes_id bigint NULL(version chain) ORversion int
Checklist
- Unique constraint exists:
(tenant_id, content_hash) - Version strategy exists (supersedes/version) and is queryable
- “Latest” access is index-backed (see §4)
Queue
tenant_id uuid(if multi-tenant)id bigserialkind textpayload jsonbrun_after timestamptzattempts intlocked_at timestamptz NULLlocked_by text NULLstatus smallint(optional; e.g., ready/running/done/dead)
Checklist
- “Ready to claim” has a partial index (see §4)
- Claim transaction is short (claim+commit; work outside lock)
3) JSONB rules that prevent “looks fine → melts in prod”
When JSONB is appropriate
- Storing signed envelopes (DSSE), SBOM/VEX raw docs, vendor payloads.
- Ingest-first scenarios where schema evolves.
When JSONB is a performance hazard
- You frequently query deep keys/arrays (components, vulnerabilities, call paths).
- You need sorting/aggregations on doc fields.
Mandatory pattern for hot JSON fields
- Keep the raw JSONB for fidelity.
- Extract hot keys into stored generated columns (or real columns), index those.
- Extract hot arrays into child tables (components, vulnerabilities).
Example:
CREATE TABLE sbom_document (
id bigserial PRIMARY KEY,
tenant_id uuid NOT NULL,
artifact_purl text NOT NULL,
content_hash bytea NOT NULL,
doc jsonb NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
-- hot keys as generated columns
bom_format text GENERATED ALWAYS AS ((doc->>'bomFormat')) STORED,
spec_version text GENERATED ALWAYS AS ((doc->>'specVersion')) STORED
);
CREATE UNIQUE INDEX ux_sbom_doc_hash ON sbom_document(tenant_id, content_hash);
CREATE INDEX ix_sbom_doc_tenant_artifact ON sbom_document(tenant_id, artifact_purl, created_at DESC);
CREATE INDEX ix_sbom_doc_json_gin ON sbom_document USING GIN (doc jsonb_path_ops);
CREATE INDEX ix_sbom_doc_bomformat ON sbom_document(tenant_id, bom_format);
Checklist
- Any query using
doc->>in WHERE has either an expression index or a generated column index - Any query using
jsonb_array_elements(...)in hot path has been replaced by a normalized child table or a projection table
4) Indexing standards (what devs must justify)
Core rules
- Tenant-first:
INDEX(tenant_id, …)for anything read per tenant. - Sort support: if query uses
ORDER BY created_at DESC, index must end withcreated_at DESC. - Partial indexes for sparse predicates (status/locked flags).
- BRIN for massive append-only time series.
- GIN jsonb_path_ops for containment (
@>) on JSONB; avoid GIN for everything.
Required index patterns by use case
“Latest version per artifact”
If you store versions as rows:
-- supports: WHERE tenant_id=? AND artifact_purl=? ORDER BY created_at DESC LIMIT 1
CREATE INDEX ix_sbom_latest ON sbom_document(tenant_id, artifact_purl, created_at DESC);
Ready queue claims
CREATE INDEX ix_job_ready
ON job_queue(kind, run_after, id)
WHERE locked_at IS NULL;
-- Optional: tenant scoped
CREATE INDEX ix_job_ready_tenant
ON job_queue(tenant_id, kind, run_after, id)
WHERE locked_at IS NULL;
JSON key lookup (expression index)
-- supports: WHERE (doc->>'subject') = ?
CREATE INDEX ix_vex_subject_expr
ON vex_document(tenant_id, (doc->>'subject'));
Massive event table time filtering
CREATE INDEX brin_scan_events_time
ON scan_run_event USING BRIN (occurred_at);
Red flags
- GIN index on a JSONB column + frequent updates = bloat and write amplification.
- No partial index for queue readiness → sequential scans under load.
- Composite indexes with wrong leading column order (e.g.,
created_at, tenant_id) → not used.
5) Partitioning and retention (avoid “infinite tables”)
Use partitioning for:
- audit/events
- scan run logs
- large finding histories
- anything > tens of millions rows with time-based access
Standard approach
- Partition by
occurred_at(monthly) for event/audit tables. - Retention by dropping partitions (fast and vacuum-free).
Example:
CREATE TABLE scan_run_event (
tenant_id uuid NOT NULL,
scan_run_id bigint NOT NULL,
occurred_at timestamptz NOT NULL,
event_type text NOT NULL,
payload jsonb NOT NULL
) PARTITION BY RANGE (occurred_at);
Checklist
- Partition creation/rollover process exists (migration or scheduler)
- Retention is “DROP PARTITION”, not “DELETE WHERE occurred_at < …”
- Each partition has needed local indexes (BRIN/time + tenant filters)
6) Views vs Materialized Views vs Projection Tables
Use a normal VIEW when
- It’s thin (renaming columns, simple joins) and not used in hot paths.
Use a MATERIALIZED VIEW when
- It accelerates complex joins/aggregations and can be refreshed on a schedule.
- You can tolerate refresh lag.
Materialized view requirements
- Must have a unique index to use
REFRESH … CONCURRENTLY. - Refresh must be outside an explicit transaction block.
Example:
CREATE MATERIALIZED VIEW mv_artifact_risk AS
SELECT tenant_id, artifact_purl, max(score) AS risk_score
FROM open_findings
GROUP BY tenant_id, artifact_purl;
CREATE UNIQUE INDEX ux_mv_artifact_risk
ON mv_artifact_risk(tenant_id, artifact_purl);
Prefer projection tables over MV when
- You need incremental updates (on import/scan completion).
- You need deterministic “point-in-time” snapshots per manifest.
Checklist
- Every MV has refresh cadence + owner (which worker/job triggers it)
- UI/API queries do not depend on a heavy non-materialized view
- If “refresh cost” scales with whole dataset, projection table exists instead
7) Queue and outbox patterns that do not deadlock
Claim pattern (short transaction)
WITH cte AS (
SELECT id
FROM job_queue
WHERE kind = $1
AND run_after <= now()
AND locked_at IS NULL
ORDER BY id
FOR UPDATE SKIP LOCKED
LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(),
locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;
Rules
- Claim + commit quickly.
- Do work outside the lock.
- On completion: update row to done (or delete if you want compactness).
- On failure: increment attempts, set
run_after = now() + backoff, release lock.
Checklist
- Worker does not keep transaction open while scanning/importing
- Backoff policy is encoded (in DB columns) and observable
- DLQ condition exists (attempts > N) and is queryable
8) Query performance review checklist (what to require in PRs)
For each new endpoint/query:
- Provide the query (SQL) and the intended parameters.
- Provide
EXPLAIN (ANALYZE, BUFFERS)from a dataset size that resembles staging. - Identify the serving index(es).
- Confirm row estimates are not wildly wrong (if they are: stats or predicate mismatch).
- Confirm it is tenant-scoped and uses the tenant-leading index.
Common fixes
- Replace
IN (SELECT …)withEXISTSfor correlated checks. - Replace
ORDER BY … LIMITwithout index with an index that matches ordering. - Avoid exploding joins with JSON arrays; pre-extract.
9) Vacuum, bloat, and “why is disk growing”
Design to avoid bloat
- Append-only for large docs and events.
- If frequent updates are needed, isolate hot-updated columns into a smaller table.
Example split:
job_queue_payload(stable)job_queue_state(locked/status/attempts updated frequently)
Checklist
- Large frequently-updated JSONB tables have been questioned
- Updates do not rewrite big TOAST values repeatedly
- Retention is partition-drop where possible
10) Migration safety rules (prevent production locks)
-
Index creation:
CREATE INDEX CONCURRENTLY. -
Dropping indexes:
DROP INDEX CONCURRENTLY. -
New column with default on large table:
ADD COLUMNnullable- backfill in batches
ALTER COLUMN SET NOT NULL- add default if needed
Checklist
- No long-running
ALTER TABLEon huge tables without plan - Any new NOT NULL constraint is staged safely
11) Stella Ops-specific schema guidance (SBOM/VEX/Finding)
Minimum recommended normalized tables
Even if you keep raw SBOM/VEX JSON:
sbom_document(raw, immutable)sbom_component(extracted components)vex_document(raw, immutable)vex_statement(extracted statements per CVE/component)finding(facts: CVE ↔ component ↔ artifact ↔ scan_run)scan_manifest(determinism: feed versions/hashes, policy hash)scan_run(links results to manifest)
Key gap detectors
- If “find all artifacts affected by CVE X” is slow → missing
findingindexing. - If “component search” is slow → missing
sbom_componentand its indexes. - If “replay this scan” is not exact → missing
scan_manifest+ feed import hashes.
12) Minimal “definition of done” for a new table/view
A PR adding a table/view is incomplete unless it includes:
- Table classification (SOR / projection / queue / event)
- Primary key and idempotency unique key
- Tenant scoping strategy
- Index plan mapped to known queries
- Retention plan (especially for event/projection tables)
- Refresh/update plan if derived
- Example query +
EXPLAINfor the top 1–3 access patterns
If you want this as a single drop-in repo document, tell me the target path (e.g., /docs/platform/postgres-table-view-guidelines.md) and I will format it exactly as a team-facing guideline, including a one-page “Architecture/Performance Gaps” review form that engineers can paste into PR descriptions.