23 KiB
Here’s a crisp, opinionated storage blueprint you can hand to your Stella Ops devs right now, plus zero‑downtime conversion tactics so you can keep prototyping fast without painting yourself into a corner.
Module → store map (deterministic by default)
-
Authority / OAuth / Accounts & Audit
- PostgreSQL as the primary source of truth.
- Tables:
users,clients,oauth_tokens,roles,grants,audit_log. - Row‑Level Security (RLS) on
users,grants,audit_log; STRICT FK + CHECK constraints; immutable UUID PKs. - Audit:
audit_log(actor_id, action, entity, entity_id, at timestamptz default now(), diff jsonb). - Why: ACID + RLS keeps authz decisions and audit trails deterministic and reviewable.
-
VEX & Vulnerability Writes
- PostgreSQL with JSONB facts + relational decisions.
- Tables:
vuln_fact(jsonb),vex_decision(package_id, vuln_id, status, rationale, proof_ref, updated_at). - Materialized views for triage queues, e.g.
mv_triage_hotset(refresh on commit or scheduled). - Why: JSONB lets you ingest vendor‑shaped docs; decisions stay relational for joins, integrity, and explainability.
-
Routing / Feature Flags / Rate‑limits
- PostgreSQL (truth) + Redis (cache).
- Tables:
feature_flag(key, rules jsonb, version),route(domain, service, instance_id, last_heartbeat),rate_limiter(bucket, quota, interval). - Redis keys:
flag:{key}:{version},route:{domain},rl:{bucket}with short TTLs. - Why: one canonical RDBMS for consistency; Redis for hot path latency.
-
Unknowns Registry (ambiguity tracker)
- PostgreSQL with temporal tables (bitemporal pattern via
valid_from/valid_to,sys_from/sys_to). - Table:
unknowns(subject_hash, kind, context jsonb, valid_from, valid_to, sys_from default now(), sys_to). - Views:
unknowns_currentwherevalid_to is null. - Why: preserves how/when uncertainty changed (critical for proofs and audits).
- PostgreSQL with temporal tables (bitemporal pattern via
-
Artifacts / SBOM / VEX files
- OCI‑compatible CAS (e.g., self‑hosted registry or MinIO bucket as content‑addressable store).
- Keys by digest (
sha256:...), metadata in Postgresartifact(index)withdigest,media_type,size,signatures. - Why: blobs don’t belong in your RDBMS; use CAS for scale + cryptographic addressing.
PostgreSQL implementation essentials (copy/paste starters)
-
RLS scaffold (Authority):
alter table audit_log enable row level security; create policy p_audit_read_self on audit_log for select using (actor_id = current_setting('app.user_id')::uuid or exists (select 1 from grants g where g.user_id = current_setting('app.user_id')::uuid and g.role = 'auditor')); -
JSONB facts + relational decisions:
create table vuln_fact ( id uuid primary key default gen_random_uuid(), source text not null, payload jsonb not null, received_at timestamptz default now() ); create table vex_decision ( package_id uuid not null, vuln_id text not null, status text check (status in ('not_affected','affected','fixed','under_investigation')), rationale text, proof_ref text, decided_at timestamptz default now(), primary key (package_id, vuln_id) ); -
Materialized view for triage:
create materialized view mv_triage_hotset as select v.id as fact_id, v.payload->>'vuln' as vuln, v.received_at from vuln_fact v where (now() - v.received_at) < interval '7 days'; -- refresh concurrently via job -
Temporal pattern (Unknowns):
create table unknowns ( id uuid primary key default gen_random_uuid(), subject_hash text not null, kind text not null, context jsonb not null, valid_from timestamptz not null default now(), valid_to timestamptz, sys_from timestamptz not null default now(), sys_to timestamptz ); create view unknowns_current as select * from unknowns where valid_to is null;
Conversion (not migration): zero‑downtime, prototype‑friendly
Even if you’re “not migrating anything yet,” set these rails now so cutting over later is painless.
- Encode Mongo‑shaped docs into JSONB with versioned schemas
- Ingest pipeline writes to
*_fact(payload jsonb, schema_version int). - Add a
validate(schema_version, payload)step in your service layer (JSON Schema or SQL checks). - Keep a forward‑compatible view that projects stable columns from JSONB (e.g.,
payload->>'id' as vendor_id) so downstream code doesn’t break when payload evolves.
- Outbox pattern for exactly‑once side‑effects
- Add
outbox(id, topic, key, payload jsonb, created_at, dispatched bool default false). - On the same transaction as your write, insert the outbox row.
- A background dispatcher reads
dispatched=false, publishes to MQ/Webhook, then marksdispatched=true. - Guarantees: no lost events, no duplicates to external systems.
- Parallel read adapters behind feature flags
- Keep old readers (e.g., Mongo driver) and new Postgres readers in the same service.
- Gate by
feature_flag('pg_reads')per tenant or env; flip gradually. - Add a read‑diff monitor that compares results and logs mismatches to
audit_log(diff).
- CDC for analytics without coupling
- Enable logical replication (pgoutput) on your key tables.
- Stream changes into analyzers (reachability, heuristics) without hitting primaries.
- This lets you keep OLTP clean and still power dashboards/tests.
- Materialized views & job cadence
- Refresh
mv_*on a fixed cadence (e.g., every 2–5 minutes) or post‑commit for hot paths. - Keep “cold path” analytics in separate schemas (
analytics.*) sourced from CDC.
- Cutover playbook (phased)
- Phase A (Dark Read): write Postgres, still serve from Mongo; compare results silently.
- Phase B (Shadow Serve): 5–10% traffic from Postgres via flag; auto‑rollback switch.
- Phase C (Authoritative): Postgres becomes source; Mongo path left for emergency read‑only.
- Phase D (Retire): freeze Mongo, back up, remove writes, delete code paths after 2 stable sprints.
Rate‑limits & flags: single truth, fast edges
-
Truth in Postgres with versioned flag docs:
create table feature_flag ( key text primary key, rules jsonb not null, version int not null default 1, updated_at timestamptz default now() ); -
Edge cache in Redis:
SETEX flag:{key}:{version} <ttl> <json>- On update, bump
version; readers compose cache key with version (cache‑busting without deletes).
-
Rate limiting: Persist quotas in Postgres; counters in Redis (
INCR rl:{bucket}:{window}), with periodic reconciliation jobs writing summaries back to Postgres for audits.
CAS for SBOM/VEX/attestations
-
Push blobs to OCI/MinIO by digest; store only pointers in Postgres:
create table artifact_index ( digest text primary key, media_type text not null, size bigint not null, created_at timestamptz default now(), signature_refs jsonb ); -
Benefits: immutable, deduped, easy to mirror into offline kits.
Guardrails your team should follow
- Always wrap multi‑table writes (facts + outbox + decisions) in a single transaction.
- Prefer
jsonb_path_queryfor targeted reads; avoid scanning entire payloads. - Enforce RLS + least‑privilege roles; application sets
app.user_idat session start. - Version everything: schemas, flags, materialized views; never “change in place” without bumping version.
- Observability: expose
pg_stat_statements, refresh latency formv_*, outbox lag, Redis hit ratio, and RLS policy hits.
If you want, I can turn this into:
- ready‑to‑run EF Core 10 migrations,
- a /docs/architecture/store-map.md for your repo,
- and a tiny dev seed (Docker + sample data) so the team can poke it immediately. Here’s a focused “PostgreSQL patterns per module” doc you can hand straight to your StellaOps devs.
StellaOps – PostgreSQL Patterns per Module
Scope: How each StellaOps module should use PostgreSQL: schema patterns, constraints, RLS, indexing, and transaction rules.
0. Cross‑cutting PostgreSQL Rules
These apply everywhere unless explicitly overridden.
0.1 Core conventions
-
Schemas
- Use one logical schema per module:
authority,routing,vex,unknowns,artifact. - Shared utilities (e.g.,
outbox) live in acoreschema.
- Use one logical schema per module:
-
Naming
-
Tables:
snake_case, singular:user,feature_flag,vuln_fact. -
PK:
id uuid primary key. -
FKs:
<referenced_table>_id(e.g.,user_id,tenant_id). -
Timestamps:
created_at timestamptz not null default now()updated_at timestamptz not null default now()
-
-
Multi‑tenancy
- All tenant‑scoped tables must have
tenant_id uuid not null. - Enforce tenant isolation with RLS on
tenant_id.
- All tenant‑scoped tables must have
-
Time & timezones
- Always
timestamptz, always store UTC, let the DB defaultnow().
- Always
0.2 RLS & security
-
RLS must be enabled on any table reachable from a user‑initiated path.
-
Every session must set:
select set_config('app.user_id', '<uuid>', false); select set_config('app.tenant_id', '<uuid>', false); select set_config('app.roles', 'role1,role2', false); -
RLS policies:
- Base policy:
tenant_id = current_setting('app.tenant_id')::uuid. - Extra predicates for per‑user privacy (e.g., only see own tokens, only own API clients).
- Base policy:
-
DB users:
- Each module’s service has its own role with access only to its schema +
core.outbox.
- Each module’s service has its own role with access only to its schema +
0.3 JSONB & versioning
-
Any JSONB column must have:
payload jsonb not null,schema_version int not null.
-
Always index:
- by source (
source/origin), - by a small set of projected fields used in WHERE clauses.
- by source (
0.4 Migrations
-
All schema changes via migrations, forward‑only.
-
Backwards‑compat pattern:
- Add new columns / tables.
- Backfill.
- Flip code to use new structure (behind a feature flag).
- After stability, remove old columns/paths.
1. Authority Module (auth, accounts, audit)
Schema: authority.*
Mission: identity, OAuth, roles, grants, audit.
1.1 Core tables & patterns
-
authority.usercreate table authority.user ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, email text not null, display_name text not null, is_disabled boolean not null default false, created_at timestamptz not null default now(), updated_at timestamptz not null default now(), unique (tenant_id, email) );- Never hard‑delete users: use
is_disabled(and optionallydisabled_at).
- Never hard‑delete users: use
-
authority.rolecreate table authority.role ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, name text not null, description text, created_at timestamptz not null default now(), updated_at timestamptz not null default now(), unique (tenant_id, name) ); -
authority.grantcreate table authority.grant ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, user_id uuid not null references authority.user(id), role_id uuid not null references authority.role(id), created_at timestamptz not null default now(), unique (tenant_id, user_id, role_id) ); -
authority.oauth_client,authority.oauth_token-
Enforce token uniqueness:
create table authority.oauth_token ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, user_id uuid not null references authority.user(id), client_id uuid not null references authority.oauth_client(id), token_hash text not null, -- hash, never raw expires_at timestamptz not null, created_at timestamptz not null default now(), revoked_at timestamptz, unique (token_hash) );
-
1.2 Audit log pattern
-
authority.audit_logcreate table authority.audit_log ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, actor_id uuid, -- null for system action text not null, entity_type text not null, entity_id uuid, at timestamptz not null default now(), diff jsonb not null ); -
Insert audit rows in the same transaction as the change.
1.3 RLS patterns
-
Base RLS:
alter table authority.user enable row level security; create policy p_user_tenant on authority.user for all using (tenant_id = current_setting('app.tenant_id')::uuid); -
Extra policies:
-
Audit log is visible only to:
- actor themself, or
- users with an
auditororadminrole.
-
2. Routing & Feature Flags Module
Schema: routing.*
Mission: where instances live, what features are on, rate‑limit configuration.
2.1 Feature flags
-
routing.feature_flagcreate table routing.feature_flag ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, key text not null, rules jsonb not null, version int not null default 1, is_enabled boolean not null default true, created_at timestamptz not null default now(), updated_at timestamptz not null default now(), unique (tenant_id, key) ); -
Immutability by version:
-
On update, increment
version, don’t overwrite historical data. -
Mirror changes into a history table via trigger:
create table routing.feature_flag_history ( id uuid primary key default gen_random_uuid(), feature_flag_id uuid not null references routing.feature_flag(id), tenant_id uuid not null, key text not null, rules jsonb not null, version int not null, changed_at timestamptz not null default now(), changed_by uuid );
-
2.2 Instance registry
-
routing.instancecreate table routing.instance ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, instance_key text not null, domain text not null, last_heartbeat timestamptz not null default now(), status text not null check (status in ('active','draining','offline')), created_at timestamptz not null default now(), updated_at timestamptz not null default now(), unique (tenant_id, instance_key), unique (tenant_id, domain) ); -
Pattern:
- Heartbeats use
update ... set last_heartbeat = now()without touching other fields. - Routing logic filters by
status='active'and heartbeat recency.
- Heartbeats use
2.3 Rate‑limit configuration
-
Config in Postgres, counters in Redis:
create table routing.rate_limit_config ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, key text not null, limit_per_interval int not null, interval_seconds int not null, created_at timestamptz not null default now(), updated_at timestamptz not null default now(), unique (tenant_id, key) );
3. VEX & Vulnerability Module
Schema: vex.*
Mission: ingest vulnerability facts, keep decisions & triage state.
3.1 Facts as JSONB
-
vex.vuln_factcreate table vex.vuln_fact ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, source text not null, -- e.g. "nvd", "vendor_x_vex" external_id text, -- e.g. CVE, advisory id payload jsonb not null, schema_version int not null, received_at timestamptz not null default now() ); -
Index patterns:
create index on vex.vuln_fact (tenant_id, source); create index on vex.vuln_fact (tenant_id, external_id); create index vuln_fact_payload_gin on vex.vuln_fact using gin (payload);
3.2 Decisions as relational data
-
vex.packagecreate table vex.package ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, name text not null, version text not null, ecosystem text not null, -- e.g. "pypi", "npm" created_at timestamptz not null default now(), unique (tenant_id, name, version, ecosystem) ); -
vex.vex_decisioncreate table vex.vex_decision ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, package_id uuid not null references vex.package(id), vuln_id text not null, status text not null check (status in ( 'not_affected', 'affected', 'fixed', 'under_investigation' )), rationale text, proof_ref text, -- CAS digest or URL decided_by uuid, decided_at timestamptz not null default now(), created_at timestamptz not null default now(), updated_at timestamptz not null default now(), unique (tenant_id, package_id, vuln_id) ); -
For history:
- Keep current state in
vex_decision. - Mirror previous versions into
vex_decision_historytable (similar to feature flags).
- Keep current state in
3.3 Triage queues with materialized views
-
Example triage view:
create materialized view vex.mv_triage_queue as select d.tenant_id, p.name, p.version, d.vuln_id, d.status, d.decided_at from vex.vex_decision d join vex.package p on p.id = d.package_id where d.status = 'under_investigation'; -
Refresh options:
- Scheduled refresh (cron/worker).
- Or incremental via triggers (more complex; use only when needed).
3.4 RLS for VEX
-
All tables scoped by
tenant_id. -
Typical policy:
alter table vex.vex_decision enable row level security; create policy p_vex_tenant on vex.vex_decision for all using (tenant_id = current_setting('app.tenant_id')::uuid);
4. Unknowns Module
Schema: unknowns.*
Mission: represent uncertainty and how it changes over time.
4.1 Bitemporal unknowns table
-
unknowns.unknowncreate table unknowns.unknown ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, subject_hash text not null, -- stable identifier for "thing" being reasoned about kind text not null, -- e.g. "reachability", "version_inferred" context jsonb not null, -- extra info: call graph node, evidence, etc. valid_from timestamptz not null default now(), valid_to timestamptz, sys_from timestamptz not null default now(), sys_to timestamptz, created_at timestamptz not null default now() ); -
“Exactly one open unknown per subject/kind” pattern:
create unique index unknown_one_open_per_subject on unknowns.unknown (tenant_id, subject_hash, kind) where valid_to is null;
4.2 Closing an unknown
-
Close by setting
valid_toandsys_to:update unknowns.unknown set valid_to = now(), sys_to = now() where id = :id and valid_to is null; -
Never hard-delete; keep all rows for audit/explanation.
4.3 Convenience views
-
Current unknowns:
create view unknowns.current as select * from unknowns.unknown where valid_to is null;
4.4 RLS
- Same tenant policy as other modules; unknowns are tenant‑scoped.
5. Artifact Index / CAS Module
Schema: artifact.*
Mission: index of immutable blobs stored in OCI / S3 / MinIO etc.
5.1 Artifact index
-
artifact.artifactcreate table artifact.artifact ( digest text primary key, -- e.g. "sha256:..." tenant_id uuid not null, media_type text not null, size_bytes bigint not null, created_at timestamptz not null default now(), created_by uuid ); -
Validate digest shape with a CHECK:
alter table artifact.artifact add constraint chk_digest_format check (digest ~ '^sha[0-9]+:[0-9a-fA-F]{32,}$');
5.2 Signatures and tags
-
artifact.signaturecreate table artifact.signature ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, artifact_digest text not null references artifact.artifact(digest), signer text not null, signature_payload jsonb not null, created_at timestamptz not null default now() ); -
artifact.tagcreate table artifact.tag ( id uuid primary key default gen_random_uuid(), tenant_id uuid not null, name text not null, artifact_digest text not null references artifact.artifact(digest), created_at timestamptz not null default now(), unique (tenant_id, name) );
5.3 RLS
-
Ensure that tenants cannot see each other’s digests, even if the CAS backing store is shared:
alter table artifact.artifact enable row level security; create policy p_artifact_tenant on artifact.artifact for all using (tenant_id = current_setting('app.tenant_id')::uuid);
6. Shared Outbox / Event Pattern
Schema: core.*
Mission: reliable events for external side‑effects.
6.1 Outbox table
-
core.outboxcreate table core.outbox ( id uuid primary key default gen_random_uuid(), tenant_id uuid, aggregate_type text not null, -- e.g. "vex_decision", "feature_flag" aggregate_id uuid, topic text not null, payload jsonb not null, created_at timestamptz not null default now(), dispatched_at timestamptz, dispatch_attempts int not null default 0, error text );
6.2 Usage rule
-
For anything that must emit an event (webhook, Kafka, notifications):
-
In the same transaction as the change:
- write primary data (e.g.
vex.vex_decision), - insert an
outboxrow.
- write primary data (e.g.
-
A background worker:
- pulls undelivered rows,
- sends to external system,
- updates
dispatched_at/dispatch_attempts/error.
-
7. Indexing & Query Patterns per Module
7.1 Authority
-
Index:
user(tenant_id, email)grant(tenant_id, user_id)oauth_token(token_hash)
-
Typical query patterns:
- Look up user by
tenant_id + email. - All roles/grants for a user; design composite indexes accordingly.
- Look up user by
7.2 Routing & Flags
-
Index:
-
feature_flag(tenant_id, key) -
partial index on enabled flags:
create index on routing.feature_flag (tenant_id, key) where is_enabled; -
instance(tenant_id, status),instance(tenant_id, domain).
-
7.3 VEX
-
Index:
package(tenant_id, name, version, ecosystem)vex_decision(tenant_id, package_id, vuln_id)- GIN on
vuln_fact.payloadfor flexible querying.
7.4 Unknowns
-
Index:
- unique open unknown per subject/kind (shown above).
unknown(tenant_id, kind)for filtering by kind.
7.5 Artifact
-
Index:
- PK on
digest. signature(tenant_id, artifact_digest).tag(tenant_id, name).
- PK on
8. Transaction & Isolation Guidelines
-
Default isolation: READ COMMITTED.
-
For critical sequences (e.g., provisioning a tenant, bulk role updates):
- consider REPEATABLE READ or SERIALIZABLE and keep transactions short.
-
Pattern:
- One transaction per logical user action (e.g., “set flag”, “record decision”).
- Never do long‑running external calls inside a database transaction.
If you’d like, next step I can turn this into:
- concrete
CREATE SCHEMA+CREATE TABLEmigration files, and - a short “How to write queries in each module” cheat‑sheet for devs (with example SELECT/INSERT/UPDATE patterns).