Files

master 790801f329 add advisories

2025-12-01 17:50:11 +02:00

23 KiB

Raw Blame History

Here’s a crisp, opinionated storage blueprint you can hand to your Stella Ops devs right now, plus zero‑downtime conversion tactics so you can keep prototyping fast without painting yourself into a corner.

Module → store map (deterministic by default)

Authority / OAuth / Accounts & Audit
- PostgreSQL as the primary source of truth.
- Tables: users, clients, oauth_tokens, roles, grants, audit_log.
- Row‑Level Security (RLS) on users, grants, audit_log; STRICT FK + CHECK constraints; immutable UUID PKs.
- Audit: audit_log(actor_id, action, entity, entity_id, at timestamptz default now(), diff jsonb).
- Why: ACID + RLS keeps authz decisions and audit trails deterministic and reviewable.
VEX & Vulnerability Writes
- PostgreSQL with JSONB facts + relational decisions.
- Tables: vuln_fact(jsonb), vex_decision(package_id, vuln_id, status, rationale, proof_ref, updated_at).
- Materialized views for triage queues, e.g. mv_triage_hotset (refresh on commit or scheduled).
- Why: JSONB lets you ingest vendor‑shaped docs; decisions stay relational for joins, integrity, and explainability.
Routing / Feature Flags / Rate‑limits
- PostgreSQL (truth) + Redis (cache).
- Tables: feature_flag(key, rules jsonb, version), route(domain, service, instance_id, last_heartbeat), rate_limiter(bucket, quota, interval).
- Redis keys: flag:{key}:{version}, route:{domain}, rl:{bucket} with short TTLs.
- Why: one canonical RDBMS for consistency; Redis for hot path latency.
Unknowns Registry (ambiguity tracker)
- PostgreSQL with temporal tables (bitemporal pattern via valid_from/valid_to, sys_from/sys_to).
- Table: unknowns(subject_hash, kind, context jsonb, valid_from, valid_to, sys_from default now(), sys_to).
- Views: unknowns_current where valid_to is null.
- Why: preserves how/when uncertainty changed (critical for proofs and audits).
Artifacts / SBOM / VEX files
- OCI‑compatible CAS (e.g., self‑hosted registry or MinIO bucket as content‑addressable store).
- Keys by digest (sha256:...), metadata in Postgres artifact(index) with digest, media_type, size, signatures.
- Why: blobs don’t belong in your RDBMS; use CAS for scale + cryptographic addressing.

PostgreSQL implementation essentials (copy/paste starters)

RLS scaffold (Authority):

alter table audit_log enable row level security;
create policy p_audit_read_self
  on audit_log for select
  using (actor_id = current_setting('app.user_id')::uuid or
         exists (select 1 from grants g where g.user_id = current_setting('app.user_id')::uuid and g.role = 'auditor'));

JSONB facts + relational decisions:

create table vuln_fact (
  id uuid primary key default gen_random_uuid(),
  source text not null,
  payload jsonb not null,
  received_at timestamptz default now()
);

create table vex_decision (
  package_id uuid not null,
  vuln_id text not null,
  status text check (status in ('not_affected','affected','fixed','under_investigation')),
  rationale text,
  proof_ref text,
  decided_at timestamptz default now(),
  primary key (package_id, vuln_id)
);

Materialized view for triage:

create materialized view mv_triage_hotset as
select v.id as fact_id, v.payload->>'vuln' as vuln, v.received_at
from vuln_fact v
where (now() - v.received_at) < interval '7 days';
-- refresh concurrently via job

Temporal pattern (Unknowns):

create table unknowns (
  id uuid primary key default gen_random_uuid(),
  subject_hash text not null,
  kind text not null,
  context jsonb not null,
  valid_from timestamptz not null default now(),
  valid_to   timestamptz,
  sys_from   timestamptz not null default now(),
  sys_to     timestamptz
);

create view unknowns_current as
select * from unknowns where valid_to is null;

Conversion (not migration): zero‑downtime, prototype‑friendly

Even if you’re “not migrating anything yet,” set these rails now so cutting over later is painless.

Encode Mongo‑shaped docs into JSONB with versioned schemas

Ingest pipeline writes to *_fact(payload jsonb, schema_version int).
Add a validate(schema_version, payload) step in your service layer (JSON Schema or SQL checks).
Keep a forward‑compatible view that projects stable columns from JSONB (e.g., payload->>'id' as vendor_id) so downstream code doesn’t break when payload evolves.

Outbox pattern for exactly‑once side‑effects

Add outbox(id, topic, key, payload jsonb, created_at, dispatched bool default false).
On the same transaction as your write, insert the outbox row.
A background dispatcher reads dispatched=false, publishes to MQ/Webhook, then marks dispatched=true.
Guarantees: no lost events, no duplicates to external systems.

Parallel read adapters behind feature flags

Keep old readers (e.g., Mongo driver) and new Postgres readers in the same service.
Gate by feature_flag('pg_reads') per tenant or env; flip gradually.
Add a read‑diff monitor that compares results and logs mismatches to audit_log(diff).

CDC for analytics without coupling

Enable logical replication (pgoutput) on your key tables.
Stream changes into analyzers (reachability, heuristics) without hitting primaries.
This lets you keep OLTP clean and still power dashboards/tests.

Materialized views & job cadence

Refresh mv_* on a fixed cadence (e.g., every 2–5 minutes) or post‑commit for hot paths.
Keep “cold path” analytics in separate schemas (analytics.*) sourced from CDC.

Cutover playbook (phased)

Phase A (Dark Read): write Postgres, still serve from Mongo; compare results silently.
Phase B (Shadow Serve): 5–10% traffic from Postgres via flag; auto‑rollback switch.
Phase C (Authoritative): Postgres becomes source; Mongo path left for emergency read‑only.
Phase D (Retire): freeze Mongo, back up, remove writes, delete code paths after 2 stable sprints.

Rate‑limits & flags: single truth, fast edges

Truth in Postgres with versioned flag docs:

create table feature_flag (
  key text primary key,
  rules jsonb not null,
  version int not null default 1,
  updated_at timestamptz default now()
);

Edge cache in Redis:
- SETEX flag:{key}:{version} <ttl> <json>
- On update, bump version; readers compose cache key with version (cache‑busting without deletes).
Rate limiting: Persist quotas in Postgres; counters in Redis (INCR rl:{bucket}:{window}), with periodic reconciliation jobs writing summaries back to Postgres for audits.

CAS for SBOM/VEX/attestations

Push blobs to OCI/MinIO by digest; store only pointers in Postgres:

create table artifact_index (
  digest text primary key,
  media_type text not null,
  size bigint not null,
  created_at timestamptz default now(),
  signature_refs jsonb
);

Benefits: immutable, deduped, easy to mirror into offline kits.

Guardrails your team should follow

Always wrap multi‑table writes (facts + outbox + decisions) in a single transaction.
Prefer jsonb_path_query for targeted reads; avoid scanning entire payloads.
Enforce RLS + least‑privilege roles; application sets app.user_id at session start.
Version everything: schemas, flags, materialized views; never “change in place” without bumping version.
Observability: expose pg_stat_statements, refresh latency for mv_*, outbox lag, Redis hit ratio, and RLS policy hits.

If you want, I can turn this into:

ready‑to‑run EF Core 10 migrations,
a /docs/architecture/store-map.md for your repo,
and a tiny dev seed (Docker + sample data) so the team can poke it immediately. Here’s a focused “PostgreSQL patterns per module” doc you can hand straight to your StellaOps devs.

StellaOps – PostgreSQL Patterns per Module

Scope: How each StellaOps module should use PostgreSQL: schema patterns, constraints, RLS, indexing, and transaction rules.

0. Cross‑cutting PostgreSQL Rules

These apply everywhere unless explicitly overridden.

0.1 Core conventions

Schemas
- Use one logical schema per module: authority, routing, vex, unknowns, artifact.
- Shared utilities (e.g., outbox) live in a core schema.
Naming
- Tables: snake_case, singular: user, feature_flag, vuln_fact.
- PK: id uuid primary key.
- FKs: <referenced_table>_id (e.g., user_id, tenant_id).
- Timestamps:
  - created_at timestamptz not null default now()
  - updated_at timestamptz not null default now()
Multi‑tenancy
- All tenant‑scoped tables must have tenant_id uuid not null.
- Enforce tenant isolation with RLS on tenant_id.
Time & timezones
- Always timestamptz, always store UTC, let the DB default now().

0.2 RLS & security

RLS must be enabled on any table reachable from a user‑initiated path.

Every session must set:

select set_config('app.user_id',   '<uuid>', false);
select set_config('app.tenant_id', '<uuid>', false);
select set_config('app.roles',     'role1,role2', false);

RLS policies:
- Base policy: tenant_id = current_setting('app.tenant_id')::uuid.
- Extra predicates for per‑user privacy (e.g., only see own tokens, only own API clients).
DB users:
- Each module’s service has its own role with access only to its schema + core.outbox.

0.3 JSONB & versioning

Any JSONB column must have:
- payload jsonb not null,
- schema_version int not null.
Always index:
- by source (source / origin),
- by a small set of projected fields used in WHERE clauses.

0.4 Migrations

All schema changes via migrations, forward‑only.
Backwards‑compat pattern:
1. Add new columns / tables.
2. Backfill.
3. Flip code to use new structure (behind a feature flag).
4. After stability, remove old columns/paths.

1. Authority Module (auth, accounts, audit)

Schema: authority.* Mission: identity, OAuth, roles, grants, audit.

1.1 Core tables & patterns

authority.user

create table authority.user (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  email text not null,
  display_name text not null,
  is_disabled boolean not null default false,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, email)
);

Never hard‑delete users: use is_disabled (and optionally disabled_at).

authority.role

create table authority.role (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  name text not null,
  description text,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, name)
);

authority.grant

create table authority.grant (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  user_id uuid not null references authority.user(id),
  role_id uuid not null references authority.role(id),
  created_at timestamptz not null default now(),
  unique (tenant_id, user_id, role_id)
);

authority.oauth_client, authority.oauth_token

Enforce token uniqueness:

create table authority.oauth_token (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  user_id uuid not null references authority.user(id),
  client_id uuid not null references authority.oauth_client(id),
  token_hash text not null, -- hash, never raw
  expires_at timestamptz not null,
  created_at timestamptz not null default now(),
  revoked_at timestamptz,
  unique (token_hash)
);

1.2 Audit log pattern

authority.audit_log

create table authority.audit_log (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  actor_id uuid, -- null for system
  action text not null,
  entity_type text not null,
  entity_id uuid,
  at timestamptz not null default now(),
  diff jsonb not null
);

Insert audit rows in the same transaction as the change.

1.3 RLS patterns

Base RLS:

alter table authority.user enable row level security;

create policy p_user_tenant on authority.user
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);

Extra policies:
- Audit log is visible only to:
  - actor themself, or
  - users with an auditor or admin role.

2. Routing & Feature Flags Module

Schema: routing.* Mission: where instances live, what features are on, rate‑limit configuration.

2.1 Feature flags

routing.feature_flag

create table routing.feature_flag (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null default 1,
  is_enabled boolean not null default true,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);

Immutability by version:

On update, increment version, don’t overwrite historical data.

Mirror changes into a history table via trigger:

create table routing.feature_flag_history (
  id uuid primary key default gen_random_uuid(),
  feature_flag_id uuid not null references routing.feature_flag(id),
  tenant_id uuid not null,
  key text not null,
  rules jsonb not null,
  version int not null,
  changed_at timestamptz not null default now(),
  changed_by uuid
);

2.2 Instance registry

routing.instance

create table routing.instance (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  instance_key text not null,
  domain text not null,
  last_heartbeat timestamptz not null default now(),
  status text not null check (status in ('active','draining','offline')),
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, instance_key),
  unique (tenant_id, domain)
);

Pattern:
- Heartbeats use update ... set last_heartbeat = now() without touching other fields.
- Routing logic filters by status='active' and heartbeat recency.

2.3 Rate‑limit configuration

Config in Postgres, counters in Redis:

create table routing.rate_limit_config (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  key text not null,
  limit_per_interval int not null,
  interval_seconds int not null,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, key)
);

3. VEX & Vulnerability Module

Schema: vex.* Mission: ingest vulnerability facts, keep decisions & triage state.

3.1 Facts as JSONB

vex.vuln_fact

create table vex.vuln_fact (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  source text not null,               -- e.g. "nvd", "vendor_x_vex"
  external_id text,                   -- e.g. CVE, advisory id
  payload jsonb not null,
  schema_version int not null,
  received_at timestamptz not null default now()
);

Index patterns:

create index on vex.vuln_fact (tenant_id, source);
create index on vex.vuln_fact (tenant_id, external_id);
create index vuln_fact_payload_gin on vex.vuln_fact using gin (payload);

3.2 Decisions as relational data

vex.package

create table vex.package (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  name text not null,
  version text not null,
  ecosystem text not null, -- e.g. "pypi", "npm"
  created_at timestamptz not null default now(),
  unique (tenant_id, name, version, ecosystem)
);

vex.vex_decision

create table vex.vex_decision (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  package_id uuid not null references vex.package(id),
  vuln_id text not null,
  status text not null check (status in (
    'not_affected', 'affected', 'fixed', 'under_investigation'
  )),
  rationale text,
  proof_ref text, -- CAS digest or URL
  decided_by uuid,
  decided_at timestamptz not null default now(),
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now(),
  unique (tenant_id, package_id, vuln_id)
);

For history:
- Keep current state in vex_decision.
- Mirror previous versions into vex_decision_history table (similar to feature flags).

3.3 Triage queues with materialized views

Example triage view:

create materialized view vex.mv_triage_queue as
select
  d.tenant_id,
  p.name,
  p.version,
  d.vuln_id,
  d.status,
  d.decided_at
from vex.vex_decision d
join vex.package p on p.id = d.package_id
where d.status = 'under_investigation';

Refresh options:
- Scheduled refresh (cron/worker).
- Or incremental via triggers (more complex; use only when needed).

3.4 RLS for VEX

All tables scoped by tenant_id.

Typical policy:

alter table vex.vex_decision enable row level security;

create policy p_vex_tenant on vex.vex_decision
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);

4. Unknowns Module

Schema: unknowns.* Mission: represent uncertainty and how it changes over time.

4.1 Bitemporal unknowns table

unknowns.unknown

create table unknowns.unknown (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  subject_hash text not null, -- stable identifier for "thing" being reasoned about
  kind text not null,         -- e.g. "reachability", "version_inferred"
  context jsonb not null,     -- extra info: call graph node, evidence, etc.
  valid_from timestamptz not null default now(),
  valid_to timestamptz,
  sys_from timestamptz not null default now(),
  sys_to timestamptz,
  created_at timestamptz not null default now()
);

“Exactly one open unknown per subject/kind” pattern:

create unique index unknown_one_open_per_subject
  on unknowns.unknown (tenant_id, subject_hash, kind)
  where valid_to is null;

4.2 Closing an unknown

Close by setting valid_to and sys_to:

update unknowns.unknown
set valid_to = now(), sys_to = now()
where id = :id and valid_to is null;

Never hard-delete; keep all rows for audit/explanation.

4.3 Convenience views

Current unknowns:

create view unknowns.current as
select *
from unknowns.unknown
where valid_to is null;

4.4 RLS

Same tenant policy as other modules; unknowns are tenant‑scoped.

5. Artifact Index / CAS Module

Schema: artifact.* Mission: index of immutable blobs stored in OCI / S3 / MinIO etc.

5.1 Artifact index

artifact.artifact

create table artifact.artifact (
  digest text primary key, -- e.g. "sha256:..."
  tenant_id uuid not null,
  media_type text not null,
  size_bytes bigint not null,
  created_at timestamptz not null default now(),
  created_by uuid
);

Validate digest shape with a CHECK:

alter table artifact.artifact
  add constraint chk_digest_format
  check (digest ~ '^sha[0-9]+:[0-9a-fA-F]{32,}$');

5.2 Signatures and tags

artifact.signature

create table artifact.signature (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  artifact_digest text not null references artifact.artifact(digest),
  signer text not null,
  signature_payload jsonb not null,
  created_at timestamptz not null default now()
);

artifact.tag

create table artifact.tag (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid not null,
  name text not null,
  artifact_digest text not null references artifact.artifact(digest),
  created_at timestamptz not null default now(),
  unique (tenant_id, name)
);

5.3 RLS

Ensure that tenants cannot see each other’s digests, even if the CAS backing store is shared:

alter table artifact.artifact enable row level security;

create policy p_artifact_tenant on artifact.artifact
  for all using (tenant_id = current_setting('app.tenant_id')::uuid);

6. Shared Outbox / Event Pattern

Schema: core.* Mission: reliable events for external side‑effects.

6.1 Outbox table

core.outbox

create table core.outbox (
  id uuid primary key default gen_random_uuid(),
  tenant_id uuid,
  aggregate_type text not null, -- e.g. "vex_decision", "feature_flag"
  aggregate_id uuid,
  topic text not null,
  payload jsonb not null,
  created_at timestamptz not null default now(),
  dispatched_at timestamptz,
  dispatch_attempts int not null default 0,
  error text
);

6.2 Usage rule

For anything that must emit an event (webhook, Kafka, notifications):
- In the same transaction as the change:
  - write primary data (e.g. vex.vex_decision),
  - insert an outbox row.
- A background worker:
  - pulls undelivered rows,
  - sends to external system,
  - updates dispatched_at/dispatch_attempts/error.

7. Indexing & Query Patterns per Module

7.1 Authority

Index:
- user(tenant_id, email)
- grant(tenant_id, user_id)
- oauth_token(token_hash)
Typical query patterns:
- Look up user by tenant_id + email.
- All roles/grants for a user; design composite indexes accordingly.

7.2 Routing & Flags

Index:
- feature_flag(tenant_id, key)
- partial index on enabled flags:
```
create index on routing.feature_flag (tenant_id, key)
  where is_enabled;
```
- instance(tenant_id, status), instance(tenant_id, domain).

7.3 VEX

Index:
- package(tenant_id, name, version, ecosystem)
- vex_decision(tenant_id, package_id, vuln_id)
- GIN on vuln_fact.payload for flexible querying.

7.4 Unknowns

Index:
- unique open unknown per subject/kind (shown above).
- unknown(tenant_id, kind) for filtering by kind.

7.5 Artifact

Index:
- PK on digest.
- signature(tenant_id, artifact_digest).
- tag(tenant_id, name).

8. Transaction & Isolation Guidelines

Default isolation: READ COMMITTED.
For critical sequences (e.g., provisioning a tenant, bulk role updates):
- consider REPEATABLE READ or SERIALIZABLE and keep transactions short.
Pattern:
- One transaction per logical user action (e.g., “set flag”, “record decision”).
- Never do long‑running external calls inside a database transaction.

If you’d like, next step I can turn this into:

concrete CREATE SCHEMA + CREATE TABLE migration files, and
a short “How to write queries in each module” cheat‑sheet for devs (with example SELECT/INSERT/UPDATE patterns).

23 KiB Raw Blame History Unescape Escape