Here’s a quick, practical cheat‑sheet on choosing **PostgreSQL vs MongoDB** for security/DevOps apps—plus how I’d model SBOM/VEX and queues in Stella Ops without adding moving parts. --- # PostgreSQL you can lean on (why it often wins for ops apps) * **JSONB that flies:** Store documents yet query like SQL. Add **GIN indexes** on JSONB fields for fast lookups (`jsonb_ops` general; `jsonb_path_ops` great for `@>` containment). * **Queue pattern built‑in:** `SELECT … FOR UPDATE SKIP LOCKED` lets multiple workers pop jobs from the same table safely—no head‑of‑line blocking, no extra broker. * **Cooperative locks:** **Advisory locks** (session/transaction) for “at‑most‑once” sections or leader election. * **Lightweight pub/sub:** **LISTEN/NOTIFY** for async nudges between services (poke a worker to re‑scan, refresh cache, etc.). * **Search included:** **Full‑text search** (tsvector/tsquery) is native—no separate search service for moderate needs. * **Serious backups:** **PITR** with WAL archiving / `pg_basebackup` for deterministic rollbacks and offline bundles. # MongoDB facts to factor in * **Flexible ingest:** Schemaless docs make it easy to absorb varied telemetry and vendor feeds. * **Horizontal scale:** Sharding is mature for huge, read‑heavy datasets. * **Consistency is a choice:** Design embedding vs refs and when to use multi‑document transactions. --- # A simple rule of thumb (Stella Ops‑style) * **System of record:** PostgreSQL (JSONB first). * **Hot paths:** Materialized views + JSONB GIN indexes. * **Queues & coordination:** PostgreSQL (skip‑locked + advisory locks). * **Cache/accel only:** Valkey (ephemeral). * **MongoDB:** Optional for **very large, read‑optimized graph snapshots** (e.g., periodically baked reachability graphs) if Postgres starts to strain. --- # Concrete patterns you can drop in today **1) SBOM/VEX storage (Postgres JSONB)** ```sql -- Documents CREATE TABLE sbom ( id BIGSERIAL PRIMARY KEY, artifact_purl TEXT NOT NULL, doc JSONB NOT NULL, created_at TIMESTAMPTZ DEFAULT now() ); CREATE INDEX sbom_purl_idx ON sbom(artifact_purl); CREATE INDEX sbom_doc_gin ON sbom USING GIN (doc jsonb_path_ops); -- Common queries -- find components by name/version: -- SELECT * FROM sbom WHERE doc @> '{"components":[{"name":"openssl","version":"3.0.14"}]}'; -- VEX CREATE TABLE vex ( id BIGSERIAL PRIMARY KEY, subject_purl TEXT NOT NULL, vex_doc JSONB NOT NULL, created_at TIMESTAMPTZ DEFAULT now() ); CREATE INDEX vex_subject_idx ON vex(subject_purl); CREATE INDEX vex_doc_gin ON vex USING GIN (vex_doc jsonb_path_ops); ``` **2) Hot reads via materialized views** ```sql CREATE MATERIALIZED VIEW mv_open_findings AS SELECT s.artifact_purl, c->>'name' AS comp, c->>'version' AS ver, v.vex_doc FROM sbom s CROSS JOIN LATERAL jsonb_array_elements(s.doc->'components') c LEFT JOIN vex v ON v.subject_purl = s.artifact_purl -- add WHERE clauses to pre‑filter only actionable rows ; CREATE INDEX mv_open_findings_idx ON mv_open_findings(artifact_purl, comp); ``` Refresh cadence: on feed import or via a scheduler; `REFRESH MATERIALIZED VIEW CONCURRENTLY mv_open_findings;` **3) Queue without a broker** ```sql CREATE TABLE job_queue( id BIGSERIAL PRIMARY KEY, kind TEXT NOT NULL, -- e.g., 'scan', 'sbom-diff' payload JSONB NOT NULL, run_after TIMESTAMPTZ DEFAULT now(), attempts INT DEFAULT 0, locked_at TIMESTAMPTZ, locked_by TEXT ); CREATE INDEX job_queue_ready_idx ON job_queue(kind, run_after); -- Worker loop WITH cte AS ( SELECT id FROM job_queue WHERE kind = $1 AND run_after <= now() AND locked_at IS NULL ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1 ) UPDATE job_queue j SET locked_at = now(), locked_by = $2 FROM cte WHERE j.id = cte.id RETURNING j.*; ``` Release/fail with: set `locked_at=NULL, locked_by=NULL, attempts=attempts+1` or delete on success. **4) Advisory lock for singletons** ```sql -- Acquire (per tenant, per artifact) SELECT pg_try_advisory_xact_lock(hashtextextended('recalc:'||tenant||':'||artifact, 0)); ``` **5) Nudge workers without a bus** ```sql NOTIFY stella_scan, json_build_object('purl', $1, 'priority', 5)::TEXT; -- workers LISTEN stella_scan and enqueue quickly ``` --- # When to add MongoDB * You need **interactive exploration** over **hundreds of millions of nodes/edges** (e.g., historical “proof‑of‑integrity” graphs) where document fan‑out and denormalized reads beat relational joins. * Snapshot cadence is **batchy** (hourly/daily), and you can **re‑emit** snapshots deterministically from Postgres (single source of truth). * You want to isolate read spikes from the transactional core. **Snapshot pipe:** Postgres → (ETL) → MongoDB collection `{graph_id, node, edges[], attrs}` with **compound shard keys** tuned to your UI traversal. --- # Why this fits Stella Ops * Fewer moving parts on‑prem/air‑gapped. * Deterministic replays (PITR + immutable imports). * Clear performance levers (GIN indexes, MVs, skip‑locked queues). * MongoDB stays optional, purpose‑built for giant read graphs—not a default dependency. If you want, I can turn the above into ready‑to‑run `.sql` migrations and a small **.NET 10** worker (Dapper/EF Core) that implements the queue loop + advisory locks + LISTEN/NOTIFY hooks. Below is a handoff-ready set of **PostgreSQL tables/views engineering guidelines** intended for developer review. It is written as a **gap-finding checklist** with **concrete DDL patterns** and **performance red flags** (Postgres as system of record, JSONB where useful, derived projections where needed). --- # PostgreSQL Tables & Views Engineering Guide ## 0) Non-negotiable principles 1. **Every hot query must have an index story.** If you cannot name the index that serves it, you have a performance gap. 2. **Write path stays simple.** Prefer **append-only** versioning to large updates (especially for JSONB). 3. **Multi-tenant must be explicit.** Every core table includes `tenant_id` and indexes are tenant-prefixed. 4. **Derived data is a product.** If the UI needs it fast, model it as a **projection table or materialized view**, not as an ad-hoc mega-join. 5. **Idempotency is enforced in the DB.** Unique keys for imports/jobs/results; no “best effort” dedupe in application only. --- # 1) Table taxonomy and what to look for Use this to classify every table; each class has different indexing/retention/locking rules. ### A. Source-of-truth (SOR) tables Examples: `sbom_document`, `vex_document`, `feed_import`, `scan_manifest`, `attestation`. * **Expect:** immutable rows, versioning via new row inserts. * **Gaps:** frequent updates to large JSONB; missing `content_hash`; no unique idempotency key. ### B. Projection tables (query-optimized) Examples: `open_findings`, `artifact_risk_summary`, `component_index`. * **Expect:** denormalized, indexed for UI/API; refresh/update strategy defined. * **Gaps:** projections rebuilt from scratch too often; missing incremental update plan; no retention plan. ### C. Queue/outbox tables Examples: `job_queue`, `outbox_events`. * **Expect:** `SKIP LOCKED` claim pattern; retry + DLQ; minimal lock duration. * **Gaps:** holding row locks while doing work; missing partial index for “ready” jobs. ### D. Audit/event tables Examples: `scan_run_event`, `decision_event`, `access_audit`. * **Expect:** append-only; partitioned by time; BRIN on timestamps. * **Gaps:** single huge table without partitioning; slow deletes instead of partition drops. --- # 2) Naming, keys, and required columns ## Required columns per class ### SOR documents (SBOM/VEX/Attestations) * `tenant_id uuid` * `id bigserial` (internal PK) * `external_id uuid` (optional API-facing id) * `content_hash bytea` (sha256) **NOT NULL** * `doc jsonb` **NOT NULL** * `created_at timestamptz` **NOT NULL default now()** * `supersedes_id bigint NULL` (version chain) OR `version int` **Checklist** * [ ] Unique constraint exists: `(tenant_id, content_hash)` * [ ] Version strategy exists (supersedes/version) and is queryable * [ ] “Latest” access is index-backed (see §4) ### Queue * `tenant_id uuid` (if multi-tenant) * `id bigserial` * `kind text` * `payload jsonb` * `run_after timestamptz` * `attempts int` * `locked_at timestamptz NULL` * `locked_by text NULL` * `status smallint` (optional; e.g., ready/running/done/dead) **Checklist** * [ ] “Ready to claim” has a partial index (see §4) * [ ] Claim transaction is short (claim+commit; work outside lock) --- # 3) JSONB rules that prevent “looks fine → melts in prod” ## When JSONB is appropriate * Storing signed envelopes (DSSE), SBOM/VEX raw docs, vendor payloads. * Ingest-first scenarios where schema evolves. ## When JSONB is a performance hazard * You frequently query deep keys/arrays (components, vulnerabilities, call paths). * You need sorting/aggregations on doc fields. **Mandatory pattern for hot JSON fields** 1. Keep the raw JSONB for fidelity. 2. Extract **hot keys** into **stored generated columns** (or real columns), index those. 3. Extract **hot arrays** into child tables (components, vulnerabilities). Example: ```sql CREATE TABLE sbom_document ( id bigserial PRIMARY KEY, tenant_id uuid NOT NULL, artifact_purl text NOT NULL, content_hash bytea NOT NULL, doc jsonb NOT NULL, created_at timestamptz NOT NULL DEFAULT now(), -- hot keys as generated columns bom_format text GENERATED ALWAYS AS ((doc->>'bomFormat')) STORED, spec_version text GENERATED ALWAYS AS ((doc->>'specVersion')) STORED ); CREATE UNIQUE INDEX ux_sbom_doc_hash ON sbom_document(tenant_id, content_hash); CREATE INDEX ix_sbom_doc_tenant_artifact ON sbom_document(tenant_id, artifact_purl, created_at DESC); CREATE INDEX ix_sbom_doc_json_gin ON sbom_document USING GIN (doc jsonb_path_ops); CREATE INDEX ix_sbom_doc_bomformat ON sbom_document(tenant_id, bom_format); ``` **Checklist** * [ ] Any query using `doc->>` in WHERE has either an expression index or a generated column index * [ ] Any query using `jsonb_array_elements(...)` in hot path has been replaced by a normalized child table or a projection table --- # 4) Indexing standards (what devs must justify) ## Core rules 1. **Tenant-first**: `INDEX(tenant_id, …)` for anything read per tenant. 2. **Sort support**: if query uses `ORDER BY created_at DESC`, index must end with `created_at DESC`. 3. **Partial indexes** for sparse predicates (status/locked flags). 4. **BRIN** for massive append-only time series. 5. **GIN jsonb_path_ops** for containment (`@>`) on JSONB; avoid GIN for everything. ## Required index patterns by use case ### “Latest version per artifact” If you store versions as rows: ```sql -- supports: WHERE tenant_id=? AND artifact_purl=? ORDER BY created_at DESC LIMIT 1 CREATE INDEX ix_sbom_latest ON sbom_document(tenant_id, artifact_purl, created_at DESC); ``` ### Ready queue claims ```sql CREATE INDEX ix_job_ready ON job_queue(kind, run_after, id) WHERE locked_at IS NULL; -- Optional: tenant scoped CREATE INDEX ix_job_ready_tenant ON job_queue(tenant_id, kind, run_after, id) WHERE locked_at IS NULL; ``` ### JSON key lookup (expression index) ```sql -- supports: WHERE (doc->>'subject') = ? CREATE INDEX ix_vex_subject_expr ON vex_document(tenant_id, (doc->>'subject')); ``` ### Massive event table time filtering ```sql CREATE INDEX brin_scan_events_time ON scan_run_event USING BRIN (occurred_at); ``` **Red flags** * GIN index on a JSONB column + frequent updates = bloat and write amplification. * No partial index for queue readiness → sequential scans under load. * Composite indexes with wrong leading column order (e.g., `created_at, tenant_id`) → not used. --- # 5) Partitioning and retention (avoid “infinite tables”) Use partitioning for: * audit/events * scan run logs * large finding histories * anything > tens of millions rows with time-based access ## Standard approach * Partition by `occurred_at` (monthly) for event/audit tables. * Retention by dropping partitions (fast and vacuum-free). Example: ```sql CREATE TABLE scan_run_event ( tenant_id uuid NOT NULL, scan_run_id bigint NOT NULL, occurred_at timestamptz NOT NULL, event_type text NOT NULL, payload jsonb NOT NULL ) PARTITION BY RANGE (occurred_at); ``` **Checklist** * [ ] Partition creation/rollover process exists (migration or scheduler) * [ ] Retention is “DROP PARTITION”, not “DELETE WHERE occurred_at < …” * [ ] Each partition has needed local indexes (BRIN/time + tenant filters) --- # 6) Views vs Materialized Views vs Projection Tables ## Use a normal VIEW when * It’s thin (renaming columns, simple joins) and not used in hot paths. ## Use a MATERIALIZED VIEW when * It accelerates complex joins/aggregations and can be refreshed on a schedule. * You can tolerate refresh lag. **Materialized view requirements** * Must have a **unique index** to use `REFRESH … CONCURRENTLY`. * Refresh must be **outside** an explicit transaction block. Example: ```sql CREATE MATERIALIZED VIEW mv_artifact_risk AS SELECT tenant_id, artifact_purl, max(score) AS risk_score FROM open_findings GROUP BY tenant_id, artifact_purl; CREATE UNIQUE INDEX ux_mv_artifact_risk ON mv_artifact_risk(tenant_id, artifact_purl); ``` ## Prefer projection tables over MV when * You need **incremental updates** (on import/scan completion). * You need deterministic “point-in-time” snapshots per manifest. **Checklist** * [ ] Every MV has refresh cadence + owner (which worker/job triggers it) * [ ] UI/API queries do not depend on a heavy non-materialized view * [ ] If “refresh cost” scales with whole dataset, projection table exists instead --- # 7) Queue and outbox patterns that do not deadlock ## Claim pattern (short transaction) ```sql WITH cte AS ( SELECT id FROM job_queue WHERE kind = $1 AND run_after <= now() AND locked_at IS NULL ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1 ) UPDATE job_queue j SET locked_at = now(), locked_by = $2 FROM cte WHERE j.id = cte.id RETURNING j.*; ``` **Rules** * Claim + commit quickly. * Do work outside the lock. * On completion: update row to done (or delete if you want compactness). * On failure: increment attempts, set `run_after = now() + backoff`, release lock. **Checklist** * [ ] Worker does not keep transaction open while scanning/importing * [ ] Backoff policy is encoded (in DB columns) and observable * [ ] DLQ condition exists (attempts > N) and is queryable --- # 8) Query performance review checklist (what to require in PRs) For each new endpoint/query: * [ ] Provide the query (SQL) and the intended parameters. * [ ] Provide `EXPLAIN (ANALYZE, BUFFERS)` from a dataset size that resembles staging. * [ ] Identify the serving index(es). * [ ] Confirm row estimates are not wildly wrong (if they are: stats or predicate mismatch). * [ ] Confirm it is tenant-scoped and uses the tenant-leading index. **Common fixes** * Replace `IN (SELECT …)` with `EXISTS` for correlated checks. * Replace `ORDER BY … LIMIT` without index with an index that matches ordering. * Avoid exploding joins with JSON arrays; pre-extract. --- # 9) Vacuum, bloat, and “why is disk growing” ## Design to avoid bloat * Append-only for large docs and events. * If frequent updates are needed, isolate hot-updated columns into a smaller table. Example split: * `job_queue_payload` (stable) * `job_queue_state` (locked/status/attempts updated frequently) **Checklist** * [ ] Large frequently-updated JSONB tables have been questioned * [ ] Updates do not rewrite big TOAST values repeatedly * [ ] Retention is partition-drop where possible --- # 10) Migration safety rules (prevent production locks) * Index creation: `CREATE INDEX CONCURRENTLY`. * Dropping indexes: `DROP INDEX CONCURRENTLY`. * New column with default on large table: 1. `ADD COLUMN` nullable 2. backfill in batches 3. `ALTER COLUMN SET NOT NULL` 4. add default if needed **Checklist** * [ ] No long-running `ALTER TABLE` on huge tables without plan * [ ] Any new NOT NULL constraint is staged safely --- # 11) Stella Ops-specific schema guidance (SBOM/VEX/Finding) ## Minimum recommended normalized tables Even if you keep raw SBOM/VEX JSON: * `sbom_document` (raw, immutable) * `sbom_component` (extracted components) * `vex_document` (raw, immutable) * `vex_statement` (extracted statements per CVE/component) * `finding` (facts: CVE ↔ component ↔ artifact ↔ scan_run) * `scan_manifest` (determinism: feed versions/hashes, policy hash) * `scan_run` (links results to manifest) **Key gap detectors** * If “find all artifacts affected by CVE X” is slow → missing `finding` indexing. * If “component search” is slow → missing `sbom_component` and its indexes. * If “replay this scan” is not exact → missing `scan_manifest` + feed import hashes. --- # 12) Minimal “definition of done” for a new table/view A PR adding a table/view is incomplete unless it includes: * [ ] Table classification (SOR / projection / queue / event) * [ ] Primary key and idempotency unique key * [ ] Tenant scoping strategy * [ ] Index plan mapped to known queries * [ ] Retention plan (especially for event/projection tables) * [ ] Refresh/update plan if derived * [ ] Example query + `EXPLAIN` for the top 1–3 access patterns --- If you want this as a single drop-in repo document, tell me the target path (e.g., `/docs/platform/postgres-table-view-guidelines.md`) and I will format it exactly as a team-facing guideline, including a one-page “Architecture/Performance Gaps” review form that engineers can paste into PR descriptions.