Here's a quick, practical cheatsheet on choosing **PostgreSQL vs MongoDB** for security/DevOps apps, plus how I'd model SBOM/VEX and queues in StellaOps without adding moving parts.
---
# PostgreSQL you can lean on (why it often wins for ops apps)
* **JSONB that flies:** Store documents yet query like SQL. Add **GIN indexes** on JSONB fields for fast lookups (`jsonb_ops` general; `jsonb_path_ops` great for `@>` containment).
* **Queue pattern built in:** `SELECT … FOR UPDATE SKIP LOCKED` lets multiple workers pop jobs from the same table safely: no head-of-line blocking, no extra broker.
* **Cooperative locks:** **Advisory locks** (session/transaction) for “at-most-once” sections or leader election.
* **Lightweight pub/sub:** **LISTEN/NOTIFY** for async nudges between services (poke a worker to rescan, refresh cache, etc.).
* **Search included:** **Full-text search** (tsvector/tsquery) is native; no separate search service for moderate needs (a minimal example follows this list).
* **Serious backups:** **PITR** with WAL archiving / `pg_basebackup` for deterministic rollbacks and offline bundles.
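A minimal full-text sketch, assuming an illustrative `advisory` table with a `summary` text column (not part of the schemas below):
```sql
-- Expression index so text searches stay index-backed.
CREATE INDEX ix_advisory_fts
    ON advisory USING GIN (to_tsvector('english', summary));

-- Rank-free lookup; add ts_rank() ordering if relevance matters.
SELECT id, summary
FROM advisory
WHERE to_tsvector('english', summary) @@ plainto_tsquery('english', 'buffer overflow');
```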
# MongoDB facts to factor in
* **Flexible ingest:** Schemaless docs make it easy to absorb varied telemetry and vendor feeds.
* **Horizontal scale:** Sharding is mature for huge, read-heavy datasets.
* **Consistency is a choice:** decide between embedding and references, and when to use multi-document transactions.
---
# A simple rule of thumb (StellaOps-style)
* **System of record:** PostgreSQL (JSONB first).
* **Hot paths:** Materialized views + JSONB GIN indexes.
* **Queues & coordination:** PostgreSQL (`SKIP LOCKED` + advisory locks).
* **Cache/accel only:** Valkey (ephemeral).
* **MongoDB:** Optional for **very large, read-optimized graph snapshots** (e.g., periodically baked reachability graphs) if Postgres starts to strain.
---
# Concrete patterns you can drop in today
**1) SBOM/VEX storage (Postgres JSONB)**
```sql
-- Documents
CREATE TABLE sbom (
id BIGSERIAL PRIMARY KEY,
artifact_purl TEXT NOT NULL,
doc JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX sbom_purl_idx ON sbom(artifact_purl);
CREATE INDEX sbom_doc_gin ON sbom USING GIN (doc jsonb_path_ops);
-- Common queries
-- find components by name/version:
-- SELECT * FROM sbom WHERE doc @> '{"components":[{"name":"openssl","version":"3.0.14"}]}';
-- VEX
CREATE TABLE vex (
id BIGSERIAL PRIMARY KEY,
subject_purl TEXT NOT NULL,
vex_doc JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX vex_subject_idx ON vex(subject_purl);
CREATE INDEX vex_doc_gin ON vex USING GIN (vex_doc jsonb_path_ops);
```
**2) Hot reads via materialized views**
```sql
CREATE MATERIALIZED VIEW mv_open_findings AS
SELECT
s.artifact_purl,
c->>'name' AS comp,
c->>'version' AS ver,
v.vex_doc
FROM sbom s
CROSS JOIN LATERAL jsonb_array_elements(s.doc->'components') c
LEFT JOIN vex v ON v.subject_purl = s.artifact_purl
-- add WHERE clauses to prefilter only actionable rows
;
CREATE INDEX mv_open_findings_idx ON mv_open_findings(artifact_purl, comp);
```
Refresh cadence: on feed import or via a scheduler, e.g. `REFRESH MATERIALIZED VIEW CONCURRENTLY mv_open_findings;`. Note that `CONCURRENTLY` requires a unique index on the materialized view (sketch below).
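A minimal sketch of that unique index, assuming `(artifact_purl, comp, ver)` uniquely identifies a row in this view; if it does not (e.g., multiple SBOMs or VEX docs per artifact), add a discriminating column such as the SBOM id:
```sql
-- Required before REFRESH ... CONCURRENTLY can be used.
CREATE UNIQUE INDEX ux_mv_open_findings
    ON mv_open_findings(artifact_purl, comp, ver);

-- Run outside an explicit transaction block.
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_open_findings;
```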
**3) Queue without a broker**
```sql
CREATE TABLE job_queue(
id BIGSERIAL PRIMARY KEY,
kind TEXT NOT NULL, -- e.g., 'scan', 'sbom-diff'
payload JSONB NOT NULL,
run_after TIMESTAMPTZ DEFAULT now(),
attempts INT DEFAULT 0,
locked_at TIMESTAMPTZ,
locked_by TEXT
);
CREATE INDEX job_queue_ready_idx ON job_queue(kind, run_after);
-- Worker loop
WITH cte AS (
SELECT id FROM job_queue
WHERE kind = $1 AND run_after <= now() AND locked_at IS NULL
ORDER BY id
FOR UPDATE SKIP LOCKED
LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(), locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;
```
Release/fail with: set `locked_at=NULL, locked_by=NULL, attempts=attempts+1` or delete on success.
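For concreteness, a minimal sketch of the success/failure statements (the backoff interval is an illustrative choice):
```sql
-- Success: remove the job (or flip a status column if you keep history).
DELETE FROM job_queue WHERE id = $1;

-- Failure: release the lock, count the attempt, retry after a delay.
UPDATE job_queue
SET locked_at = NULL,
    locked_by = NULL,
    attempts  = attempts + 1,
    run_after = now() + interval '1 minute'
WHERE id = $1;
```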
**4) Advisory lock for singletons**
```sql
-- Acquire (per tenant, per artifact); $1/$2 are the tenant and artifact identifiers.
-- Returns true if this transaction won the lock; released automatically at COMMIT/ROLLBACK.
SELECT pg_try_advisory_xact_lock(hashtextextended('recalc:' || $1 || ':' || $2, 0));
```
**5) Nudge workers without a bus**
```sql
-- NOTIFY accepts only a string literal payload, so use pg_notify() for dynamic payloads:
SELECT pg_notify('stella_scan', json_build_object('purl', $1, 'priority', 5)::text);
-- workers LISTEN stella_scan and enqueue quickly
```
---
# When to add MongoDB
* You need **interactive exploration** over **hundreds of millions of nodes/edges** (e.g., historical “proof-of-integrity” graphs) where document fan-out and denormalized reads beat relational joins.
* Snapshot cadence is **batchy** (hourly/daily), and you can **re-emit** snapshots deterministically from Postgres (single source of truth).
* You want to isolate read spikes from the transactional core.
**Snapshot pipe:** Postgres → (ETL) → MongoDB collection `{graph_id, node, edges[], attrs}` with **compound shard keys** tuned to your UI traversal.
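A minimal sketch of the Postgres side of that export. The `graph_node`/`graph_edge` tables and their columns are hypothetical; each emitted row matches the `{graph_id, node, edges[], attrs}` document shape:
```sql
-- One JSON document per node, with outgoing edges embedded,
-- ready to bulk-load into the MongoDB snapshot collection.
SELECT json_build_object(
         'graph_id', n.graph_id,
         'node',     n.node_id,
         'edges',    COALESCE(
                       json_agg(e.dst_node_id) FILTER (WHERE e.dst_node_id IS NOT NULL),
                       '[]'::json),
         'attrs',    n.attrs
       ) AS doc
FROM graph_node n
LEFT JOIN graph_edge e
       ON e.graph_id = n.graph_id AND e.src_node_id = n.node_id
GROUP BY n.graph_id, n.node_id, n.attrs;
```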
---
# Why this fits StellaOps
* Fewer moving parts on-prem/air-gapped.
* Deterministic replays (PITR + immutable imports).
* Clear performance levers (GIN indexes, MVs, skip-locked queues).
* MongoDB stays optional, purpose-built for giant read graphs, not a default dependency.
If you want, I can turn the above into ready-to-run `.sql` migrations and a small **.NET 10** worker (Dapper/EF Core) that implements the queue loop + advisory locks + LISTEN/NOTIFY hooks.
Below is a handoff-ready set of **PostgreSQL tables/views engineering guidelines** intended for developer review. It is written as a **gap-finding checklist** with **concrete DDL patterns** and **performance red flags** (Postgres as system of record, JSONB where useful, derived projections where needed).
---
# PostgreSQL Tables & Views Engineering Guide
## 0) Non-negotiable principles
1. **Every hot query must have an index story.** If you cannot name the index that serves it, you have a performance gap.
2. **Write path stays simple.** Prefer **append-only** versioning to large updates (especially for JSONB).
3. **Multi-tenant must be explicit.** Every core table includes `tenant_id` and indexes are tenant-prefixed.
4. **Derived data is a product.** If the UI needs it fast, model it as a **projection table or materialized view**, not as an ad-hoc mega-join.
5. **Idempotency is enforced in the DB.** Unique keys for imports/jobs/results; no “best effort” dedupe in application only.
---
# 1) Table taxonomy and what to look for
Use this to classify every table; each class has different indexing/retention/locking rules.
### A. Source-of-truth (SOR) tables
Examples: `sbom_document`, `vex_document`, `feed_import`, `scan_manifest`, `attestation`.
* **Expect:** immutable rows, versioning via new row inserts.
* **Gaps:** frequent updates to large JSONB; missing `content_hash`; no unique idempotency key.
### B. Projection tables (query-optimized)
Examples: `open_findings`, `artifact_risk_summary`, `component_index`.
* **Expect:** denormalized, indexed for UI/API; refresh/update strategy defined.
* **Gaps:** projections rebuilt from scratch too often; missing incremental update plan; no retention plan.
### C. Queue/outbox tables
Examples: `job_queue`, `outbox_events`.
* **Expect:** `SKIP LOCKED` claim pattern; retry + DLQ; minimal lock duration.
* **Gaps:** holding row locks while doing work; missing partial index for “ready” jobs.
### D. Audit/event tables
Examples: `scan_run_event`, `decision_event`, `access_audit`.
* **Expect:** append-only; partitioned by time; BRIN on timestamps.
* **Gaps:** single huge table without partitioning; slow deletes instead of partition drops.
---
# 2) Naming, keys, and required columns
## Required columns per class
### SOR documents (SBOM/VEX/Attestations)
* `tenant_id uuid`
* `id bigserial` (internal PK)
* `external_id uuid` (optional API-facing id)
* `content_hash bytea` (sha256) **NOT NULL**
* `doc jsonb` **NOT NULL**
* `created_at timestamptz` **NOT NULL default now()**
* `supersedes_id bigint NULL` (version chain) OR `version int`
**Checklist**
* [ ] Unique constraint exists: `(tenant_id, content_hash)`
* [ ] Version strategy exists (supersedes/version) and is queryable
* [ ] “Latest” access is index-backed (see §4)
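As an illustration of the required columns above, a minimal `vex_document` sketch (the column set follows this guide; index names and the version-chain choice are assumptions, not an existing schema):
```sql
CREATE TABLE vex_document (
    tenant_id     uuid        NOT NULL,
    id            bigserial   PRIMARY KEY,
    external_id   uuid,                               -- optional API-facing id
    content_hash  bytea       NOT NULL,               -- sha256 of the raw document
    doc           jsonb       NOT NULL,
    created_at    timestamptz NOT NULL DEFAULT now(),
    supersedes_id bigint                              -- version chain; NULL for the first version
);

-- Idempotent imports: the same document body is stored once per tenant.
CREATE UNIQUE INDEX ux_vex_doc_hash ON vex_document(tenant_id, content_hash);
-- Keeps "latest" access index-backed (see §4).
CREATE INDEX ix_vex_doc_tenant_created ON vex_document(tenant_id, created_at DESC);
```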
### Queue
* `tenant_id uuid` (if multi-tenant)
* `id bigserial`
* `kind text`
* `payload jsonb`
* `run_after timestamptz`
* `attempts int`
* `locked_at timestamptz NULL`
* `locked_by text NULL`
* `status smallint` (optional; e.g., ready/running/done/dead)
**Checklist**
* [ ] “Ready to claim” has a partial index (see §4)
* [ ] Claim transaction is short (claim+commit; work outside lock)
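A matching DDL sketch for the queue class; the `status` encoding and index name are illustrative, and it extends the earlier cheatsheet's `job_queue` with tenant scoping (the claim pattern itself is in §7):
```sql
CREATE TABLE job_queue (
    tenant_id  uuid        NOT NULL,
    id         bigserial   PRIMARY KEY,
    kind       text        NOT NULL,
    payload    jsonb       NOT NULL,
    run_after  timestamptz NOT NULL DEFAULT now(),
    attempts   int         NOT NULL DEFAULT 0,
    locked_at  timestamptz,
    locked_by  text,
    status     smallint    NOT NULL DEFAULT 0   -- 0=ready, 1=running, 2=done, 3=dead
);

-- Partial index so "ready to claim" never touches locked or finished rows (see §4).
CREATE INDEX ix_job_queue_ready
    ON job_queue(tenant_id, kind, run_after, id)
    WHERE locked_at IS NULL AND status = 0;
```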
---
# 3) JSONB rules that prevent “looks fine → melts in prod”
## When JSONB is appropriate
* Storing signed envelopes (DSSE), SBOM/VEX raw docs, vendor payloads.
* Ingest-first scenarios where schema evolves.
## When JSONB is a performance hazard
* You frequently query deep keys/arrays (components, vulnerabilities, call paths).
* You need sorting/aggregations on doc fields.
**Mandatory pattern for hot JSON fields**
1. Keep the raw JSONB for fidelity.
2. Extract **hot keys** into **stored generated columns** (or real columns), index those.
3. Extract **hot arrays** into child tables (components, vulnerabilities); a child-table sketch follows the example below.
Example:
```sql
CREATE TABLE sbom_document (
id bigserial PRIMARY KEY,
tenant_id uuid NOT NULL,
artifact_purl text NOT NULL,
content_hash bytea NOT NULL,
doc jsonb NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
-- hot keys as generated columns
bom_format text GENERATED ALWAYS AS ((doc->>'bomFormat')) STORED,
spec_version text GENERATED ALWAYS AS ((doc->>'specVersion')) STORED
);
CREATE UNIQUE INDEX ux_sbom_doc_hash ON sbom_document(tenant_id, content_hash);
CREATE INDEX ix_sbom_doc_tenant_artifact ON sbom_document(tenant_id, artifact_purl, created_at DESC);
CREATE INDEX ix_sbom_doc_json_gin ON sbom_document USING GIN (doc jsonb_path_ops);
CREATE INDEX ix_sbom_doc_bomformat ON sbom_document(tenant_id, bom_format);
```
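For the hot-array rule (point 3 above), a sketch of the extracted child table and its population query; the component fields are illustrative (CycloneDX-style names):
```sql
CREATE TABLE sbom_component (
    tenant_id        uuid      NOT NULL,
    id               bigserial PRIMARY KEY,
    sbom_document_id bigint    NOT NULL REFERENCES sbom_document(id),
    name             text      NOT NULL,
    version          text,
    purl             text
);

CREATE INDEX ix_sbom_component_lookup ON sbom_component(tenant_id, name, version);
CREATE INDEX ix_sbom_component_doc    ON sbom_component(sbom_document_id);

-- Populate once per imported document (append-only, like the parent row).
INSERT INTO sbom_component (tenant_id, sbom_document_id, name, version, purl)
SELECT d.tenant_id, d.id, c->>'name', c->>'version', c->>'purl'
FROM sbom_document d
CROSS JOIN LATERAL jsonb_array_elements(d.doc->'components') AS c
WHERE d.id = $1;
```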
**Checklist**
* [ ] Any query using `doc->>` in WHERE has either an expression index or a generated column index
* [ ] Any query using `jsonb_array_elements(...)` in hot path has been replaced by a normalized child table or a projection table
---
# 4) Indexing standards (what devs must justify)
## Core rules
1. **Tenant-first**: `INDEX(tenant_id, …)` for anything read per tenant.
2. **Sort support**: if query uses `ORDER BY created_at DESC`, index must end with `created_at DESC`.
3. **Partial indexes** for sparse predicates (status/locked flags).
4. **BRIN** for massive append-only time series.
5. **GIN jsonb_path_ops** for containment (`@>`) on JSONB; avoid GIN for everything.
## Required index patterns by use case
### “Latest version per artifact”
If you store versions as rows:
```sql
-- supports: WHERE tenant_id=? AND artifact_purl=? ORDER BY created_at DESC LIMIT 1
CREATE INDEX ix_sbom_latest ON sbom_document(tenant_id, artifact_purl, created_at DESC);
```
### Ready queue claims
```sql
CREATE INDEX ix_job_ready
ON job_queue(kind, run_after, id)
WHERE locked_at IS NULL;
-- Optional: tenant scoped
CREATE INDEX ix_job_ready_tenant
ON job_queue(tenant_id, kind, run_after, id)
WHERE locked_at IS NULL;
```
### JSON key lookup (expression index)
```sql
-- supports: WHERE (doc->>'subject') = ?
CREATE INDEX ix_vex_subject_expr
ON vex_document(tenant_id, (doc->>'subject'));
```
### Massive event table time filtering
```sql
CREATE INDEX brin_scan_events_time
ON scan_run_event USING BRIN (occurred_at);
```
**Red flags**
* GIN index on a JSONB column + frequent updates = bloat and write amplification.
* No partial index for queue readiness → sequential scans under load.
* Composite indexes with wrong leading column order (e.g., `created_at, tenant_id`) → not used.
---
# 5) Partitioning and retention (avoid “infinite tables”)
Use partitioning for:
* audit/events
* scan run logs
* large finding histories
* anything > tens of millions of rows with time-based access
## Standard approach
* Partition by `occurred_at` (monthly) for event/audit tables.
* Retention by dropping partitions (fast and vacuum-free).
Example:
```sql
CREATE TABLE scan_run_event (
tenant_id uuid NOT NULL,
scan_run_id bigint NOT NULL,
occurred_at timestamptz NOT NULL,
event_type text NOT NULL,
payload jsonb NOT NULL
) PARTITION BY RANGE (occurred_at);
```
**Checklist**
* [ ] Partition creation/rollover process exists (migration or scheduler)
* [ ] Retention is “DROP PARTITION”, not “DELETE WHERE occurred_at < …”
* [ ] Each partition has needed local indexes (BRIN/time + tenant filters)
---
# 6) Views vs Materialized Views vs Projection Tables
## Use a normal VIEW when
* It's thin (renaming columns, simple joins) and not used in hot paths.
## Use a MATERIALIZED VIEW when
* It accelerates complex joins/aggregations and can be refreshed on a schedule.
* You can tolerate refresh lag.
**Materialized view requirements**
* Must have a **unique index** to use `REFRESH … CONCURRENTLY`.
* Refresh must be **outside** an explicit transaction block.
Example:
```sql
CREATE MATERIALIZED VIEW mv_artifact_risk AS
SELECT tenant_id, artifact_purl, max(score) AS risk_score
FROM open_findings
GROUP BY tenant_id, artifact_purl;
CREATE UNIQUE INDEX ux_mv_artifact_risk
ON mv_artifact_risk(tenant_id, artifact_purl);
```
## Prefer projection tables over MV when
* You need **incremental updates** (on import/scan completion).
* You need deterministic “point-in-time” snapshots per manifest.
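A sketch of an incrementally maintained projection (table and column names are illustrative); a worker upserts affected rows as scans or imports complete instead of rebuilding the whole set:
```sql
CREATE TABLE artifact_risk_summary (
    tenant_id     uuid        NOT NULL,
    artifact_purl text        NOT NULL,
    risk_score    numeric     NOT NULL,
    updated_at    timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (tenant_id, artifact_purl)
);

-- Incremental update for one artifact after a scan completes.
INSERT INTO artifact_risk_summary (tenant_id, artifact_purl, risk_score)
SELECT tenant_id, artifact_purl, max(score)
FROM open_findings
WHERE tenant_id = $1 AND artifact_purl = $2
GROUP BY tenant_id, artifact_purl
ON CONFLICT (tenant_id, artifact_purl)
DO UPDATE SET risk_score = EXCLUDED.risk_score,
              updated_at = now();
```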
**Checklist**
* [ ] Every MV has refresh cadence + owner (which worker/job triggers it)
* [ ] UI/API queries do not depend on a heavy non-materialized view
* [ ] If “refresh cost” scales with whole dataset, projection table exists instead
---
# 7) Queue and outbox patterns that do not deadlock
## Claim pattern (short transaction)
```sql
WITH cte AS (
SELECT id
FROM job_queue
WHERE kind = $1
AND run_after <= now()
AND locked_at IS NULL
ORDER BY id
FOR UPDATE SKIP LOCKED
LIMIT 1
)
UPDATE job_queue j
SET locked_at = now(),
locked_by = $2
FROM cte
WHERE j.id = cte.id
RETURNING j.*;
```
**Rules**
* Claim + commit quickly.
* Do work outside the lock.
* On completion: update row to done (or delete if you want compactness).
* On failure: increment attempts, set `run_after = now() + backoff`, release lock.
**Checklist**
* [ ] Worker does not keep transaction open while scanning/importing
* [ ] Backoff policy is encoded (in DB columns) and observable
* [ ] DLQ condition exists (attempts > N) and is queryable
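A sketch of the DLQ condition as queryable state, assuming the optional `status` column from §2 (the attempt threshold and status values are illustrative):
```sql
-- Move exhausted jobs to a dead-letter state instead of retrying forever.
UPDATE job_queue
SET status = 3                       -- dead
WHERE attempts > 5
  AND locked_at IS NULL
  AND status = 0;

-- Operators can then inspect and explicitly re-queue them.
SELECT id, kind, attempts, payload
FROM job_queue
WHERE status = 3
ORDER BY id;
```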
---
# 8) Query performance review checklist (what to require in PRs)
For each new endpoint/query:
* [ ] Provide the query (SQL) and the intended parameters.
* [ ] Provide `EXPLAIN (ANALYZE, BUFFERS)` from a dataset size that resembles staging.
* [ ] Identify the serving index(es).
* [ ] Confirm row estimates are not wildly wrong (if they are: stats or predicate mismatch).
* [ ] Confirm it is tenant-scoped and uses the tenant-leading index.
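The shape of the evidence to attach, using the §3 `sbom_document` example (parameter values are placeholders):
```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, artifact_purl, created_at
FROM sbom_document
WHERE tenant_id = '00000000-0000-0000-0000-000000000001'
  AND artifact_purl = 'pkg:deb/debian/openssl@3.0.14'
ORDER BY created_at DESC
LIMIT 1;
-- Expect an index scan on ix_sbom_doc_tenant_artifact, not a sequential scan.
```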
**Common fixes**
* Replace `IN (SELECT …)` with `EXISTS` for correlated checks (example after this list).
* Replace `ORDER BY … LIMIT` without index with an index that matches ordering.
* Avoid exploding joins with JSON arrays; pre-extract.
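An illustrative before/after for the first fix; `finding` and `sbom_component` are the §11 tables and serve only as placeholders here:
```sql
-- Before: membership check via IN (SELECT ...)
SELECT f.*
FROM finding f
WHERE f.component_id IN (SELECT c.id FROM sbom_component c WHERE c.name = 'openssl');

-- After: the same check expressed as a correlated EXISTS, as recommended above.
SELECT f.*
FROM finding f
WHERE EXISTS (
    SELECT 1
    FROM sbom_component c
    WHERE c.id = f.component_id
      AND c.name = 'openssl'
);
```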
---
# 9) Vacuum, bloat, and “why is disk growing”
## Design to avoid bloat
* Append-only for large docs and events.
* If frequent updates are needed, isolate hot-updated columns into a smaller table.
Example split:
* `job_queue_payload` (stable)
* `job_queue_state` (locked/status/attempts updated frequently)
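A sketch of that split, following the names above (columns are illustrative):
```sql
-- Stable, write-once part: the payload never changes after enqueue.
CREATE TABLE job_queue_payload (
    id      bigserial PRIMARY KEY,
    kind    text  NOT NULL,
    payload jsonb NOT NULL
);

-- Hot part: small rows updated on every claim/retry, keeping update churn
-- away from the (potentially large, TOASTed) payload.
CREATE TABLE job_queue_state (
    job_id    bigint PRIMARY KEY REFERENCES job_queue_payload(id),
    run_after timestamptz NOT NULL DEFAULT now(),
    attempts  int NOT NULL DEFAULT 0,
    locked_at timestamptz,
    locked_by text
);
```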
**Checklist**
* [ ] Large frequently-updated JSONB tables have been questioned
* [ ] Updates do not rewrite big TOAST values repeatedly
* [ ] Retention is partition-drop where possible
---
# 10) Migration safety rules (prevent production locks)
* Index creation: `CREATE INDEX CONCURRENTLY`.
* Dropping indexes: `DROP INDEX CONCURRENTLY`.
* New column with default on large table:
1. `ADD COLUMN` nullable
2. backfill in batches (sketch after this list)
3. `ALTER COLUMN SET NOT NULL`
4. add default if needed
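A sketch of the batched backfill in step 2 (`big_table`, `new_col`, and the batch size are placeholders); run each statement in its own transaction and repeat until it updates zero rows:
```sql
UPDATE big_table
SET new_col = 'default-value'
WHERE id IN (
    SELECT id
    FROM big_table
    WHERE new_col IS NULL
    ORDER BY id
    LIMIT 10000
);
```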
**Checklist**
* [ ] No long-running `ALTER TABLE` on huge tables without plan
* [ ] Any new NOT NULL constraint is staged safely
---
# 11) Stella Ops-specific schema guidance (SBOM/VEX/Finding)
## Minimum recommended normalized tables
Even if you keep raw SBOM/VEX JSON:
* `sbom_document` (raw, immutable)
* `sbom_component` (extracted components)
* `vex_document` (raw, immutable)
* `vex_statement` (extracted statements per CVE/component)
* `finding` (facts: CVE ↔ component ↔ artifact ↔ scan_run)
* `scan_manifest` (determinism: feed versions/hashes, policy hash)
* `scan_run` (links results to manifest)
**Key gap detectors**
* If “find all artifacts affected by CVE X” is slow → missing `finding` indexing.
* If “component search” is slow → missing `sbom_component` and its indexes.
* If “replay this scan” is not exact → missing `scan_manifest` + feed import hashes.
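To make the first gap detector concrete, a `finding` DDL sketch with the index that serves “all artifacts affected by CVE X” (columns are illustrative):
```sql
CREATE TABLE finding (
    tenant_id     uuid      NOT NULL,
    id            bigserial PRIMARY KEY,
    scan_run_id   bigint    NOT NULL,
    artifact_purl text      NOT NULL,
    component_id  bigint    NOT NULL,   -- references sbom_component(id)
    cve_id        text      NOT NULL,
    score         numeric
);

-- Serves: WHERE tenant_id = ? AND cve_id = ?   (→ affected artifacts/components)
CREATE INDEX ix_finding_by_cve ON finding(tenant_id, cve_id, artifact_purl);
-- Serves: WHERE tenant_id = ? AND artifact_purl = ?   (→ findings per artifact)
CREATE INDEX ix_finding_by_artifact ON finding(tenant_id, artifact_purl);
```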
---
# 12) Minimal “definition of done” for a new table/view
A PR adding a table/view is incomplete unless it includes:
* [ ] Table classification (SOR / projection / queue / event)
* [ ] Primary key and idempotency unique key
* [ ] Tenant scoping strategy
* [ ] Index plan mapped to known queries
* [ ] Retention plan (especially for event/projection tables)
* [ ] Refresh/update plan if derived
* [ ] Example query + `EXPLAIN` for the top 1-3 access patterns
---
If you want this as a single drop-in repo document, tell me the target path (e.g., `/docs/platform/postgres-table-view-guidelines.md`) and I will format it exactly as a team-facing guideline, including a one-page “Architecture/Performance Gaps” review form that engineers can paste into PR descriptions.