# Database Specification **Version:** 1.0.0 **Status:** DRAFT **Last Updated:** 2025-11-28 --- ## 1. Overview This document specifies the PostgreSQL database design for StellaOps control-plane domains. It defines schemas, naming conventions, data types, indexing strategies, and design patterns that all database work must follow. ## 2. Database Architecture ### 2.1 Database Topology ``` ┌─────────────────────────────────────────────────────────────────┐ │ PostgreSQL Cluster │ │ ┌─────────────────────────────────────────────────────────────┐│ │ │ stellaops (database) ││ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ││ │ │ │authority│ │ vuln │ │ vex │ │scheduler│ ││ │ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ ││ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ││ │ │ │ notify │ │ policy │ │ packs │ │ issuer │ ││ │ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ ││ │ │ ┌─────────┐ ││ │ │ │ audit │ (cross-cutting audit schema) ││ │ │ └─────────┘ ││ │ └─────────────────────────────────────────────────────────────┘│ └─────────────────────────────────────────────────────────────────┘ ``` ### 2.2 Schema Ownership | Schema | Owner Module | Purpose | |--------|--------------|---------| | `authority` | Authority | Identity, authentication, authorization, licensing | | `vuln` | Concelier | Vulnerability advisories, CVSS, affected packages | | `vex` | Excititor | VEX statements, graphs, observations, evidence | | `scheduler` | Scheduler | Job definitions, triggers, execution history | | `notify` | Notify | Channels, rules, deliveries, escalations | | `policy` | Policy | Policy packs, rules, risk profiles, evaluations | | `packs` | PacksRegistry | Package attestations, mirrors, lifecycle | | `issuer` | IssuerDirectory | Trust anchors, issuer keys, certificates | | `unknowns` | Unknowns | Bitemporal ambiguity tracking for scan gaps | | `audit` | Shared | Cross-cutting audit log (optional) | ### 2.3 Multi-Tenancy Model **Strategy:** Single database, single schema set, `tenant_id` column on all tenant-scoped tables with **mandatory Row-Level Security (RLS)**. ```sql -- Every tenant-scoped table includes: tenant_id UUID NOT NULL, -- Session-level tenant context (MUST be set on connection open): SET app.tenant_id = ''; -- Row-level security policy (MANDATORY for all tenant-scoped tables): ALTER TABLE . ENABLE ROW LEVEL SECURITY; ALTER TABLE .
FORCE ROW LEVEL SECURITY; CREATE POLICY
_tenant_isolation ON .
FOR ALL USING (tenant_id = _app.require_current_tenant()) WITH CHECK (tenant_id = _app.require_current_tenant()); ``` **RLS Helper Function Pattern:** Each schema with tenant-scoped tables has a companion `_app` schema containing a `require_current_tenant()` function that validates `app.tenant_id` is set. ```sql CREATE SCHEMA IF NOT EXISTS _app; CREATE OR REPLACE FUNCTION _app.require_current_tenant() RETURNS TEXT LANGUAGE plpgsql STABLE SECURITY DEFINER AS $$ DECLARE v_tenant TEXT; BEGIN v_tenant := current_setting('app.tenant_id', true); IF v_tenant IS NULL OR v_tenant = '' THEN RAISE EXCEPTION 'app.tenant_id session variable not set'; END IF; RETURN v_tenant; END; $$; ``` **Rationale:** - Defense-in-depth tenant isolation at the database level - Prevents data leakage even if application bugs bypass tenant checks - Shared connection pooling compatible - Admin bypass via `BYPASSRLS` roles for cross-tenant operations - Composite indexes on `(tenant_id, ...)` for query performance --- ## 3. Naming Conventions ### 3.1 Schema Names - Lowercase, singular noun - Match module name where applicable - Examples: `authority`, `vuln`, `vex`, `scheduler`, `notify`, `policy` ### 3.2 Table Names | Convention | Example | |------------|---------| | Lowercase with underscores | `advisory_aliases` | | Plural nouns for collections | `users`, `advisories`, `runs` | | Singular for junction tables | `user_role`, `role_permission` | | Prefix with schema context if ambiguous | `vex_statements` (not just `statements`) | ### 3.3 Column Names | Convention | Example | |------------|---------| | Lowercase with underscores | `created_at`, `tenant_id` | | Primary keys | `id` (UUID) | | Foreign keys | `
_id` (e.g., `user_id`, `advisory_id`) | | Timestamps | `*_at` suffix (e.g., `created_at`, `updated_at`, `deleted_at`) | | Booleans | `is_*` or `has_*` prefix, or adjective (e.g., `enabled`, `is_primary`) | | Counts | `*_count` suffix | | JSONB columns | Descriptive noun (e.g., `attributes`, `metadata`, `config`) | ### 3.4 Index Names ``` idx_
_ idx_
_ ``` Examples: - `idx_users_tenant` - Index on tenant_id - `idx_users_email` - Index on email - `idx_advisories_fts` - Full-text search index - `idx_runs_tenant_state` - Composite index ### 3.5 Constraint Names ```
__
__ ``` | Type | Suffix | Example | |------|--------|---------| | Primary key | `_pkey` | `users_pkey` | | Foreign key | `_fkey` | `users_tenant_id_fkey` | | Unique | `_key` | `users_email_key` | | Check | `_check` | `users_status_check` | | Exclusion | `_excl` | `schedules_time_excl` | --- ## 4. Data Types ### 4.1 Standard Type Mappings | Domain Concept | PostgreSQL Type | Notes | |----------------|-----------------|-------| | Identifiers | `UUID` | Use `gen_random_uuid()` for generation | | Timestamps | `TIMESTAMPTZ` | Always UTC, never `TIMESTAMP` | | Short strings | `TEXT` | No `VARCHAR(n)` unless hard limit required | | Enumerations | `TEXT` with `CHECK` | Not `ENUM` type (easier migrations) | | Booleans | `BOOLEAN` | Never `INTEGER` or `TEXT` | | Counts/quantities | `INTEGER` or `BIGINT` | Use `BIGINT` for counters that may exceed 2B | | Scores/decimals | `NUMERIC(p,s)` | Explicit precision for CVSS, percentages | | Arrays | `TEXT[]`, `UUID[]` | PostgreSQL native arrays | | Semi-structured | `JSONB` | Never `JSON` (always use binary) | | IP addresses | `INET` | For IP storage | | Large text | `TEXT` | No `CLOB` equivalent needed | ### 4.2 Identifier Strategy **Primary Keys:** ```sql id UUID PRIMARY KEY DEFAULT gen_random_uuid() ``` **Alternative: ULID for time-ordered IDs:** ```sql -- If time-ordering in ID is needed (e.g., for pagination) id TEXT PRIMARY KEY DEFAULT generate_ulid() ``` **Surrogate vs Natural Keys:** - Use UUID surrogate keys for all tables - Natural keys (e.g., `advisory_key`, `username`) as unique constraints, not primary keys - Exception: Junction tables use composite primary keys ### 4.3 Timestamp Conventions ```sql -- Standard audit columns (on every mutable table): created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), -- Optional soft-delete: deleted_at TIMESTAMPTZ, deleted_by TEXT, -- Trigger for updated_at (optional, can be application-managed): CREATE OR REPLACE FUNCTION update_updated_at() RETURNS TRIGGER AS $$ BEGIN NEW.updated_at = NOW(); RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER trg_
_updated_at BEFORE UPDATE ON
FOR EACH ROW EXECUTE FUNCTION update_updated_at(); ``` ### 4.4 JSONB Usage Guidelines **When to use JSONB:** - Semi-structured data with variable schema - Audit/provenance trails - External system payloads that must be preserved exactly - Configuration objects with optional fields - Nested arrays of complex objects **When NOT to use JSONB:** - Data that will be frequently queried/filtered - Data that requires referential integrity - Simple key-value pairs (use separate columns) - Data that will be aggregated **JSONB Indexing:** ```sql -- GIN index for containment queries (@>, ?, ?&, ?|) CREATE INDEX idx_
__gin ON
USING GIN (); -- Expression index for specific JSON path CREATE INDEX idx_
__ ON
((->>'path')); ``` ### 4.5 Generated Columns for JSONB Hot Fields When JSONB fields are frequently queried with equality or range filters, use **generated columns** to extract them as first-class columns. This enables: - B-tree indexes with accurate statistics - Index-only scans via covering indexes - Proper cardinality estimates for query planning **Pattern:** ```sql -- Extract hot field as generated column ALTER TABLE .
ADD COLUMN IF NOT EXISTS GENERATED ALWAYS AS ((->>'')::) STORED; -- Create B-tree index on generated column CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_
_ ON .
() WHERE IS NOT NULL; -- Covering index for dashboard queries CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_
_listing ON .
(tenant_id, created_at DESC) INCLUDE (, , ); -- Update statistics ANALYZE .
; ``` **Example (scheduler.runs stats extraction):** ```sql ALTER TABLE scheduler.runs ADD COLUMN IF NOT EXISTS finding_count INT GENERATED ALWAYS AS (NULLIF((stats->>'findingCount'), '')::int) STORED; CREATE INDEX ix_runs_with_findings ON scheduler.runs (tenant_id, created_at DESC) WHERE finding_count > 0; ``` **Guidelines:** - Use `NULLIF(, '')` before casting to handle empty strings - Add `WHERE IS NOT NULL` to partial indexes for sparse data - Use `INCLUDE` clause for covering indexes that return multiple generated columns - Run `ANALYZE` after adding generated columns to populate statistics --- ## 5. Schema Definitions ### 5.1 Authority Schema ```sql CREATE SCHEMA IF NOT EXISTS authority; -- Core identity tables CREATE TABLE authority.tenants ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), code TEXT NOT NULL UNIQUE, display_name TEXT NOT NULL, status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active', 'suspended', 'trial', 'terminated')), settings JSONB DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE authority.users ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL REFERENCES authority.tenants(id), subject_id UUID NOT NULL UNIQUE, username TEXT NOT NULL, normalized_username TEXT NOT NULL, display_name TEXT, email TEXT, email_verified BOOLEAN NOT NULL DEFAULT FALSE, disabled BOOLEAN NOT NULL DEFAULT FALSE, plugin TEXT, attributes JSONB DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), UNIQUE (tenant_id, normalized_username) ); CREATE TABLE authority.roles ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID REFERENCES authority.tenants(id), name TEXT NOT NULL, description TEXT, is_system BOOLEAN NOT NULL DEFAULT FALSE, permissions TEXT[] DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), UNIQUE (tenant_id, name) ); CREATE TABLE authority.user_roles ( user_id UUID NOT NULL REFERENCES authority.users(id) ON DELETE CASCADE, role_id UUID NOT NULL REFERENCES authority.roles(id) ON DELETE CASCADE, granted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), granted_by TEXT, PRIMARY KEY (user_id, role_id) ); CREATE TABLE authority.service_accounts ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL REFERENCES authority.tenants(id), account_id TEXT NOT NULL, display_name TEXT NOT NULL, description TEXT, enabled BOOLEAN NOT NULL DEFAULT TRUE, allowed_scopes TEXT[] DEFAULT '{}', authorized_clients TEXT[] DEFAULT '{}', attributes JSONB DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), UNIQUE (tenant_id, account_id) ); CREATE TABLE authority.clients ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), client_id TEXT NOT NULL UNIQUE, client_secret_hash TEXT, display_name TEXT, type TEXT NOT NULL DEFAULT 'confidential' CHECK (type IN ('public', 'confidential')), redirect_uris TEXT[] DEFAULT '{}', post_logout_redirect_uris TEXT[] DEFAULT '{}', permissions TEXT[] DEFAULT '{}', requirements TEXT[] DEFAULT '{}', settings JSONB DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE authority.scopes ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), name TEXT NOT NULL UNIQUE, display_name TEXT, description TEXT, resources TEXT[] DEFAULT '{}' ); CREATE TABLE authority.tokens ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), subject_id UUID NOT NULL, client_id TEXT, token_type TEXT NOT NULL CHECK (token_type IN ('access', 'refresh', 'authorization_code')), token_hash TEXT NOT NULL UNIQUE, scopes TEXT[] DEFAULT '{}', issued_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), expires_at TIMESTAMPTZ NOT NULL, revoked_at TIMESTAMPTZ, revocation_reason TEXT, metadata JSONB DEFAULT '{}' ); CREATE TABLE authority.revocations ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), token_id UUID REFERENCES authority.tokens(id), jti TEXT, revoked_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), reason TEXT, revoked_by TEXT ); CREATE TABLE authority.login_attempts ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID REFERENCES authority.tenants(id), username TEXT NOT NULL, ip_address INET, user_agent TEXT, success BOOLEAN NOT NULL, failure_reason TEXT, attempted_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE authority.licenses ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL REFERENCES authority.tenants(id), license_key TEXT NOT NULL UNIQUE, edition TEXT NOT NULL CHECK (edition IN ('community', 'standard', 'enterprise', 'sovereign')), max_nodes INT, max_projects INT, features JSONB DEFAULT '{}', start_date DATE NOT NULL, end_date DATE, issued_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), issued_by TEXT, revoked_at TIMESTAMPTZ, revocation_reason TEXT ); CREATE TABLE authority.license_usage ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), license_id UUID NOT NULL REFERENCES authority.licenses(id), scanner_node_id TEXT NOT NULL, project_id TEXT, scanner_version TEXT, first_seen_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), last_seen_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), UNIQUE (license_id, scanner_node_id) ); -- Indexes CREATE INDEX idx_users_tenant ON authority.users(tenant_id); CREATE INDEX idx_users_email ON authority.users(email) WHERE email IS NOT NULL; CREATE INDEX idx_users_subject ON authority.users(subject_id); CREATE INDEX idx_service_accounts_tenant ON authority.service_accounts(tenant_id); CREATE INDEX idx_tokens_subject ON authority.tokens(subject_id); CREATE INDEX idx_tokens_expires ON authority.tokens(expires_at) WHERE revoked_at IS NULL; CREATE INDEX idx_tokens_hash ON authority.tokens(token_hash); CREATE INDEX idx_login_attempts_tenant_time ON authority.login_attempts(tenant_id, attempted_at DESC); CREATE INDEX idx_licenses_tenant ON authority.licenses(tenant_id); ``` ### 5.2 Vulnerability Schema (vuln) ```sql CREATE SCHEMA IF NOT EXISTS vuln; CREATE TABLE vuln.sources ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), key TEXT NOT NULL UNIQUE, display_name TEXT NOT NULL, url TEXT, source_type TEXT NOT NULL CHECK (source_type IN ('nvd', 'osv', 'ghsa', 'vendor', 'oval', 'custom')), enabled BOOLEAN NOT NULL DEFAULT TRUE, priority INT NOT NULL DEFAULT 100, config JSONB DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE vuln.feed_snapshots ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), source_id UUID NOT NULL REFERENCES vuln.sources(id), snapshot_id TEXT NOT NULL, taken_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), completed_at TIMESTAMPTZ, status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'processing', 'completed', 'failed')), stats JSONB DEFAULT '{}', checksum TEXT, error TEXT, UNIQUE (source_id, snapshot_id) ); CREATE TABLE vuln.advisory_snapshots ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), source_id UUID NOT NULL REFERENCES vuln.sources(id), source_advisory_id TEXT NOT NULL, feed_snapshot_id UUID REFERENCES vuln.feed_snapshots(id), imported_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), raw_payload JSONB NOT NULL, payload_hash TEXT NOT NULL, is_latest BOOLEAN NOT NULL DEFAULT TRUE, UNIQUE (source_id, source_advisory_id, payload_hash) ); CREATE TABLE vuln.advisories ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_key TEXT NOT NULL UNIQUE, primary_vuln_id TEXT NOT NULL, source_id UUID REFERENCES vuln.sources(id), title TEXT, summary TEXT, description TEXT, language TEXT DEFAULT 'en', severity TEXT CHECK (severity IN ('critical', 'high', 'medium', 'low', 'none', 'unknown')), exploit_known BOOLEAN NOT NULL DEFAULT FALSE, state TEXT NOT NULL DEFAULT 'active' CHECK (state IN ('active', 'rejected', 'withdrawn', 'disputed')), published_at TIMESTAMPTZ, modified_at TIMESTAMPTZ, withdrawn_at TIMESTAMPTZ, current_snapshot_id UUID REFERENCES vuln.advisory_snapshots(id), canonical_metric_id UUID, provenance JSONB DEFAULT '[]', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE vuln.advisory_aliases ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_id UUID NOT NULL REFERENCES vuln.advisories(id) ON DELETE CASCADE, alias_type TEXT NOT NULL CHECK (alias_type IN ('cve', 'ghsa', 'osv', 'vendor', 'internal', 'other')), alias_value TEXT NOT NULL, provenance JSONB DEFAULT '{}', UNIQUE (alias_type, alias_value) ); CREATE TABLE vuln.advisory_cvss ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_id UUID NOT NULL REFERENCES vuln.advisories(id) ON DELETE CASCADE, version TEXT NOT NULL CHECK (version IN ('2.0', '3.0', '3.1', '4.0')), vector TEXT NOT NULL, base_score NUMERIC(3,1) NOT NULL CHECK (base_score >= 0 AND base_score <= 10), base_severity TEXT, temporal_score NUMERIC(3,1) CHECK (temporal_score >= 0 AND temporal_score <= 10), environmental_score NUMERIC(3,1) CHECK (environmental_score >= 0 AND environmental_score <= 10), source TEXT, is_primary BOOLEAN NOT NULL DEFAULT FALSE, provenance JSONB DEFAULT '{}' ); CREATE TABLE vuln.advisory_affected ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_id UUID NOT NULL REFERENCES vuln.advisories(id) ON DELETE CASCADE, package_type TEXT NOT NULL CHECK (package_type IN ('rpm', 'deb', 'cpe', 'semver', 'vendor', 'ics-vendor', 'generic')), ecosystem TEXT, package_name TEXT NOT NULL, package_purl TEXT, platform TEXT, version_ranges JSONB NOT NULL DEFAULT '[]', statuses JSONB DEFAULT '[]', normalized_versions JSONB DEFAULT '[]', provenance JSONB DEFAULT '[]' ); CREATE TABLE vuln.advisory_references ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_id UUID NOT NULL REFERENCES vuln.advisories(id) ON DELETE CASCADE, url TEXT NOT NULL, title TEXT, ref_type TEXT, provenance JSONB DEFAULT '{}' ); CREATE TABLE vuln.advisory_credits ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_id UUID NOT NULL REFERENCES vuln.advisories(id) ON DELETE CASCADE, name TEXT NOT NULL, contact TEXT, credit_type TEXT, provenance JSONB DEFAULT '{}' ); CREATE TABLE vuln.advisory_weaknesses ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_id UUID NOT NULL REFERENCES vuln.advisories(id) ON DELETE CASCADE, cwe_id TEXT NOT NULL, description TEXT, provenance JSONB DEFAULT '{}' ); CREATE TABLE vuln.kev_flags ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), cve_id TEXT NOT NULL UNIQUE, advisory_id UUID REFERENCES vuln.advisories(id), added_date DATE NOT NULL, due_date DATE, vendor_project TEXT, product TEXT, vulnerability_name TEXT, short_description TEXT, required_action TEXT, notes TEXT, known_ransomware_campaign BOOLEAN DEFAULT FALSE, updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE vuln.source_states ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), source_id UUID NOT NULL REFERENCES vuln.sources(id) UNIQUE, cursor TEXT, last_fetch_at TIMESTAMPTZ, last_success_at TIMESTAMPTZ, consecutive_failures INT DEFAULT 0, last_error TEXT, last_error_at TIMESTAMPTZ, metadata JSONB DEFAULT '{}' ); CREATE TABLE vuln.merge_events ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), advisory_id UUID NOT NULL REFERENCES vuln.advisories(id), event_type TEXT NOT NULL CHECK (event_type IN ('created', 'updated', 'merged', 'superseded', 'withdrawn')), source_id UUID REFERENCES vuln.sources(id), changes JSONB DEFAULT '{}', occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- Indexes CREATE INDEX idx_advisories_primary_vuln ON vuln.advisories(primary_vuln_id); CREATE INDEX idx_advisories_modified ON vuln.advisories(modified_at DESC); CREATE INDEX idx_advisories_published ON vuln.advisories(published_at DESC); CREATE INDEX idx_advisories_severity ON vuln.advisories(severity) WHERE state = 'active'; CREATE INDEX idx_advisories_state ON vuln.advisories(state); CREATE INDEX idx_advisory_aliases_value ON vuln.advisory_aliases(alias_value); CREATE INDEX idx_advisory_aliases_advisory ON vuln.advisory_aliases(advisory_id); CREATE INDEX idx_advisory_affected_purl ON vuln.advisory_affected(package_purl) WHERE package_purl IS NOT NULL; CREATE INDEX idx_advisory_affected_name ON vuln.advisory_affected(ecosystem, package_name); CREATE INDEX idx_advisory_affected_advisory ON vuln.advisory_affected(advisory_id); CREATE INDEX idx_advisory_snapshots_latest ON vuln.advisory_snapshots(source_id, source_advisory_id) WHERE is_latest = TRUE; CREATE INDEX idx_kev_flags_cve ON vuln.kev_flags(cve_id); CREATE INDEX idx_merge_events_advisory ON vuln.merge_events(advisory_id, occurred_at DESC); -- Full-text search CREATE INDEX idx_advisories_fts ON vuln.advisories USING GIN ( to_tsvector('english', COALESCE(title, '') || ' ' || COALESCE(summary, '') || ' ' || COALESCE(description, '')) ); ``` ### 5.3 VEX & Graph Schema (vex) ```sql CREATE SCHEMA IF NOT EXISTS vex; CREATE TABLE vex.projects ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, key TEXT NOT NULL, display_name TEXT NOT NULL, description TEXT, settings JSONB DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), UNIQUE (tenant_id, key) ); CREATE TABLE vex.graph_revisions ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), project_id UUID NOT NULL REFERENCES vex.projects(id), revision_id TEXT NOT NULL UNIQUE, parent_revision_id TEXT, sbom_hash TEXT NOT NULL, sbom_format TEXT NOT NULL CHECK (sbom_format IN ('cyclonedx', 'spdx', 'syft', 'other')), sbom_location TEXT, feed_snapshot_id UUID, lattice_policy_version TEXT, unknowns_snapshot_id UUID, node_count INT NOT NULL DEFAULT 0, edge_count INT NOT NULL DEFAULT 0, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), created_by TEXT, notes TEXT ); CREATE TABLE vex.graph_nodes ( id BIGSERIAL PRIMARY KEY, graph_revision_id UUID NOT NULL REFERENCES vex.graph_revisions(id) ON DELETE CASCADE, node_key TEXT NOT NULL, node_type TEXT NOT NULL CHECK (node_type IN ('component', 'vulnerability', 'runtime_entity', 'file', 'package', 'service')), purl TEXT, name TEXT, version TEXT, attributes JSONB DEFAULT '{}', UNIQUE (graph_revision_id, node_key) ); CREATE TABLE vex.graph_edges ( id BIGSERIAL PRIMARY KEY, graph_revision_id UUID NOT NULL REFERENCES vex.graph_revisions(id) ON DELETE CASCADE, from_node_id BIGINT NOT NULL REFERENCES vex.graph_nodes(id) ON DELETE CASCADE, to_node_id BIGINT NOT NULL REFERENCES vex.graph_nodes(id) ON DELETE CASCADE, edge_type TEXT NOT NULL CHECK (edge_type IN ( 'depends_on', 'dev_depends_on', 'optional_depends_on', 'contains', 'introduces', 'mitigates', 'affects', 'build_tool', 'test_dependency' )), attributes JSONB DEFAULT '{}' ); CREATE TABLE vex.statements ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, project_id UUID REFERENCES vex.projects(id), graph_revision_id UUID REFERENCES vex.graph_revisions(id), advisory_id UUID, vulnerability_id TEXT NOT NULL, subject_node_id BIGINT REFERENCES vex.graph_nodes(id), product_key TEXT, status TEXT NOT NULL CHECK (status IN ('affected', 'not_affected', 'under_investigation', 'fixed')), status_justification TEXT CHECK (status_justification IN ( 'component_not_present', 'vulnerable_code_not_present', 'vulnerable_code_not_in_execute_path', 'vulnerable_code_cannot_be_controlled_by_adversary', 'inline_mitigations_already_exist', NULL )), impact_statement TEXT, action_statement TEXT, action_statement_timestamp TIMESTAMPTZ, evidence JSONB DEFAULT '{}', provenance JSONB DEFAULT '{}', evaluated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), evaluated_by TEXT, superseded_by UUID REFERENCES vex.statements(id), created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE vex.observations ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, provider_id TEXT NOT NULL, vulnerability_id TEXT NOT NULL, product_key TEXT NOT NULL, status TEXT NOT NULL CHECK (status IN ('affected', 'not_affected', 'under_investigation', 'fixed')), status_justification TEXT, content_hash TEXT NOT NULL, linkset_id UUID, dsse_envelope_hash TEXT, provenance JSONB DEFAULT '{}', observed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), expires_at TIMESTAMPTZ, UNIQUE (tenant_id, provider_id, vulnerability_id, product_key, content_hash) ); CREATE TABLE vex.linksets ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, linkset_id TEXT NOT NULL, provider_id TEXT NOT NULL, sbom_digest TEXT, vex_digest TEXT, sbom_location TEXT, vex_location TEXT, status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active', 'superseded', 'revoked')), created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), metadata JSONB DEFAULT '{}', UNIQUE (tenant_id, linkset_id) ); CREATE TABLE vex.linkset_events ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), linkset_id UUID NOT NULL REFERENCES vex.linksets(id), event_type TEXT NOT NULL CHECK (event_type IN ('created', 'updated', 'superseded', 'revoked')), details JSONB DEFAULT '{}', occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE vex.consensus ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, vulnerability_id TEXT NOT NULL, product_key TEXT NOT NULL, computed_status TEXT NOT NULL CHECK (computed_status IN ('affected', 'not_affected', 'under_investigation', 'fixed', 'conflict')), confidence_score NUMERIC(3,2) CHECK (confidence_score >= 0 AND confidence_score <= 1), contributing_observations UUID[] DEFAULT '{}', conflict_details JSONB, computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), UNIQUE (tenant_id, vulnerability_id, product_key) ); CREATE TABLE vex.consensus_holds ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), consensus_id UUID NOT NULL REFERENCES vex.consensus(id), hold_type TEXT NOT NULL CHECK (hold_type IN ('manual_review', 'conflict_resolution', 'policy_override')), reason TEXT NOT NULL, placed_by TEXT NOT NULL, placed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), released_at TIMESTAMPTZ, released_by TEXT ); CREATE TABLE vex.unknowns_snapshots ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), project_id UUID NOT NULL REFERENCES vex.projects(id), graph_revision_id UUID REFERENCES vex.graph_revisions(id), created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), created_by TEXT, rationale TEXT, item_count INT NOT NULL DEFAULT 0 ); CREATE TABLE vex.unknown_items ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), snapshot_id UUID NOT NULL REFERENCES vex.unknowns_snapshots(id) ON DELETE CASCADE, item_key TEXT NOT NULL, item_type TEXT NOT NULL CHECK (item_type IN ( 'missing_sbom', 'ambiguous_package', 'missing_feed', 'unresolved_edge', 'no_version_info', 'unknown_ecosystem' )), severity TEXT CHECK (severity IN ('critical', 'high', 'medium', 'low', 'info')), details JSONB DEFAULT '{}', resolved_at TIMESTAMPTZ, resolution TEXT ); CREATE TABLE vex.evidence_manifests ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, manifest_id TEXT NOT NULL UNIQUE, merkle_root TEXT NOT NULL, signature TEXT, signer_id TEXT, sealed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), item_count INT NOT NULL DEFAULT 0, items JSONB NOT NULL DEFAULT '[]', metadata JSONB DEFAULT '{}' ); CREATE TABLE vex.cvss_receipts ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), statement_id UUID NOT NULL REFERENCES vex.statements(id), cvss_metric_id UUID, cvss_version TEXT NOT NULL, vector TEXT NOT NULL, score_used NUMERIC(3,1) NOT NULL, context JSONB DEFAULT '{}', scored_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE vex.attestations ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, statement_id UUID REFERENCES vex.statements(id), graph_revision_id UUID REFERENCES vex.graph_revisions(id), attestation_type TEXT NOT NULL CHECK (attestation_type IN ('in-toto', 'dsse', 'sigstore')), envelope_hash TEXT NOT NULL, rekor_log_id TEXT, rekor_log_index BIGINT, signer_id TEXT, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), metadata JSONB DEFAULT '{}' ); CREATE TABLE vex.timeline_events ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, project_id UUID REFERENCES vex.projects(id), event_type TEXT NOT NULL, entity_type TEXT NOT NULL, entity_id UUID NOT NULL, actor TEXT, details JSONB DEFAULT '{}', occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- Indexes CREATE INDEX idx_projects_tenant ON vex.projects(tenant_id); CREATE INDEX idx_graph_revisions_project ON vex.graph_revisions(project_id); CREATE INDEX idx_graph_revisions_sbom ON vex.graph_revisions(sbom_hash); CREATE INDEX idx_graph_nodes_revision ON vex.graph_nodes(graph_revision_id); CREATE INDEX idx_graph_nodes_purl ON vex.graph_nodes(purl) WHERE purl IS NOT NULL; CREATE INDEX idx_graph_edges_revision ON vex.graph_edges(graph_revision_id); CREATE INDEX idx_graph_edges_from ON vex.graph_edges(from_node_id); CREATE INDEX idx_graph_edges_to ON vex.graph_edges(to_node_id); CREATE INDEX idx_statements_tenant_vuln ON vex.statements(tenant_id, vulnerability_id); CREATE INDEX idx_statements_project ON vex.statements(project_id); CREATE INDEX idx_statements_graph ON vex.statements(graph_revision_id); CREATE INDEX idx_observations_tenant_vuln ON vex.observations(tenant_id, vulnerability_id); CREATE INDEX idx_observations_provider ON vex.observations(provider_id); CREATE INDEX idx_linksets_tenant ON vex.linksets(tenant_id); CREATE INDEX idx_consensus_tenant_vuln ON vex.consensus(tenant_id, vulnerability_id); CREATE INDEX idx_unknowns_project ON vex.unknowns_snapshots(project_id); CREATE INDEX idx_attestations_tenant ON vex.attestations(tenant_id); CREATE INDEX idx_attestations_rekor ON vex.attestations(rekor_log_id) WHERE rekor_log_id IS NOT NULL; CREATE INDEX idx_timeline_tenant_time ON vex.timeline_events(tenant_id, occurred_at DESC); CREATE INDEX idx_timeline_entity ON vex.timeline_events(entity_type, entity_id); ``` ### 5.4 Scheduler Schema ```sql CREATE SCHEMA IF NOT EXISTS scheduler; CREATE TABLE scheduler.schedules ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, name TEXT NOT NULL, description TEXT, enabled BOOLEAN NOT NULL DEFAULT TRUE, cron_expression TEXT, timezone TEXT NOT NULL DEFAULT 'UTC', mode TEXT NOT NULL CHECK (mode IN ('scheduled', 'manual', 'on_event', 'continuous')), selection JSONB NOT NULL DEFAULT '{}', only_if JSONB DEFAULT '{}', notify JSONB DEFAULT '{}', limits JSONB DEFAULT '{}', subscribers TEXT[] DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), created_by TEXT, updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_by TEXT, deleted_at TIMESTAMPTZ, deleted_by TEXT, UNIQUE (tenant_id, name) WHERE deleted_at IS NULL ); CREATE TABLE scheduler.triggers ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), schedule_id UUID NOT NULL REFERENCES scheduler.schedules(id) ON DELETE CASCADE, trigger_type TEXT NOT NULL CHECK (trigger_type IN ('cron', 'fixed_delay', 'manual', 'on_event', 'webhook')), cron_expression TEXT, fixed_delay_seconds INT, event_filter JSONB, timezone TEXT DEFAULT 'UTC', next_fire_time TIMESTAMPTZ, last_fire_time TIMESTAMPTZ, misfire_policy TEXT DEFAULT 'skip' CHECK (misfire_policy IN ('skip', 'fire_now', 'queue')), enabled BOOLEAN NOT NULL DEFAULT TRUE ); CREATE TABLE scheduler.runs ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, schedule_id UUID REFERENCES scheduler.schedules(id), trigger_id UUID REFERENCES scheduler.triggers(id), state TEXT NOT NULL CHECK (state IN ('pending', 'queued', 'running', 'completed', 'failed', 'cancelled', 'stale', 'timeout')), reason JSONB DEFAULT '{}', stats JSONB DEFAULT '{}', deltas JSONB DEFAULT '[]', worker_id UUID, retry_of UUID REFERENCES scheduler.runs(id), retry_count INT NOT NULL DEFAULT 0, error TEXT, error_details JSONB, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), started_at TIMESTAMPTZ, finished_at TIMESTAMPTZ, timeout_at TIMESTAMPTZ ); CREATE TABLE scheduler.graph_jobs ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, sbom_id TEXT NOT NULL, sbom_version_id TEXT, sbom_digest TEXT NOT NULL, graph_snapshot_id TEXT, status TEXT NOT NULL CHECK (status IN ('pending', 'running', 'completed', 'failed', 'cancelled')), trigger TEXT NOT NULL CHECK (trigger IN ('manual', 'scheduled', 'on_sbom_change', 'on_feed_update')), priority INT NOT NULL DEFAULT 100, attempts INT NOT NULL DEFAULT 0, max_attempts INT NOT NULL DEFAULT 3, cartographer_job_id TEXT, correlation_id TEXT, metadata JSONB DEFAULT '{}', created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), started_at TIMESTAMPTZ, completed_at TIMESTAMPTZ, error TEXT, error_details JSONB ); CREATE TABLE scheduler.policy_jobs ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, policy_pack_id TEXT NOT NULL, policy_version INT, target_type TEXT NOT NULL CHECK (target_type IN ('image', 'sbom', 'project', 'artifact')), target_id TEXT NOT NULL, status TEXT NOT NULL CHECK (status IN ('pending', 'running', 'completed', 'failed')), priority INT NOT NULL DEFAULT 100, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), started_at TIMESTAMPTZ, completed_at TIMESTAMPTZ, result JSONB DEFAULT '{}', error TEXT ); CREATE TABLE scheduler.impact_snapshots ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, run_id UUID NOT NULL REFERENCES scheduler.runs(id), image_digest TEXT NOT NULL, image_reference TEXT, new_findings INT NOT NULL DEFAULT 0, new_criticals INT NOT NULL DEFAULT 0, new_high INT NOT NULL DEFAULT 0, new_medium INT NOT NULL DEFAULT 0, new_low INT NOT NULL DEFAULT 0, total_findings INT NOT NULL DEFAULT 0, kev_hits TEXT[] DEFAULT '{}', top_findings JSONB DEFAULT '[]', report_url TEXT, attestation JSONB DEFAULT '{}', detected_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); CREATE TABLE scheduler.workers ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), node_id TEXT NOT NULL UNIQUE, hostname TEXT, capabilities TEXT[] DEFAULT '{}', max_concurrent_jobs INT NOT NULL DEFAULT 1, current_jobs INT NOT NULL DEFAULT 0, version TEXT, last_heartbeat_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), registered_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active', 'draining', 'paused', 'dead')) ); CREATE TABLE scheduler.execution_logs ( id BIGSERIAL PRIMARY KEY, run_id UUID NOT NULL REFERENCES scheduler.runs(id) ON DELETE CASCADE, logged_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), level TEXT NOT NULL CHECK (level IN ('trace', 'debug', 'info', 'warn', 'error', 'fatal')), message TEXT NOT NULL, logger TEXT, data JSONB DEFAULT '{}' ); CREATE TABLE scheduler.locks ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), lock_key TEXT NOT NULL UNIQUE, lock_type TEXT NOT NULL DEFAULT 'exclusive' CHECK (lock_type IN ('exclusive', 'shared')), holder_id TEXT NOT NULL, holder_info JSONB DEFAULT '{}', acquired_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), expires_at TIMESTAMPTZ NOT NULL, renewed_at TIMESTAMPTZ ); CREATE TABLE scheduler.run_summaries ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, schedule_id UUID REFERENCES scheduler.schedules(id), period_start TIMESTAMPTZ NOT NULL, period_end TIMESTAMPTZ NOT NULL, total_runs INT NOT NULL DEFAULT 0, successful_runs INT NOT NULL DEFAULT 0, failed_runs INT NOT NULL DEFAULT 0, cancelled_runs INT NOT NULL DEFAULT 0, avg_duration_seconds NUMERIC(10,2), max_duration_seconds INT, min_duration_seconds INT, total_findings_detected INT NOT NULL DEFAULT 0, new_criticals INT NOT NULL DEFAULT 0, computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), UNIQUE (tenant_id, schedule_id, period_start) ); CREATE TABLE scheduler.audit ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), tenant_id UUID NOT NULL, action TEXT NOT NULL, entity_type TEXT NOT NULL, entity_id UUID NOT NULL, actor TEXT, actor_type TEXT CHECK (actor_type IN ('user', 'service', 'system')), old_value JSONB, new_value JSONB, details JSONB DEFAULT '{}', ip_address INET, occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- Indexes CREATE INDEX idx_schedules_tenant ON scheduler.schedules(tenant_id) WHERE deleted_at IS NULL; CREATE INDEX idx_schedules_enabled ON scheduler.schedules(tenant_id, enabled) WHERE deleted_at IS NULL; CREATE INDEX idx_triggers_schedule ON scheduler.triggers(schedule_id); CREATE INDEX idx_triggers_next_fire ON scheduler.triggers(next_fire_time) WHERE enabled = TRUE; CREATE INDEX idx_runs_tenant_state ON scheduler.runs(tenant_id, state); CREATE INDEX idx_runs_schedule ON scheduler.runs(schedule_id); CREATE INDEX idx_runs_created ON scheduler.runs(created_at DESC); CREATE INDEX idx_runs_state_created ON scheduler.runs(state, created_at) WHERE state IN ('pending', 'queued', 'running'); CREATE INDEX idx_graph_jobs_tenant_status ON scheduler.graph_jobs(tenant_id, status); CREATE INDEX idx_graph_jobs_sbom ON scheduler.graph_jobs(sbom_digest); CREATE INDEX idx_policy_jobs_tenant_status ON scheduler.policy_jobs(tenant_id, status); CREATE INDEX idx_impact_snapshots_run ON scheduler.impact_snapshots(run_id); CREATE INDEX idx_impact_snapshots_tenant ON scheduler.impact_snapshots(tenant_id, detected_at DESC); CREATE INDEX idx_workers_status ON scheduler.workers(status); CREATE INDEX idx_workers_heartbeat ON scheduler.workers(last_heartbeat_at); CREATE INDEX idx_execution_logs_run ON scheduler.execution_logs(run_id); CREATE INDEX idx_locks_expires ON scheduler.locks(expires_at); CREATE INDEX idx_run_summaries_tenant ON scheduler.run_summaries(tenant_id, period_start DESC); CREATE INDEX idx_audit_tenant_time ON scheduler.audit(tenant_id, occurred_at DESC); CREATE INDEX idx_audit_entity ON scheduler.audit(entity_type, entity_id); -- Partitioning for high-volume tables (optional) -- CREATE TABLE scheduler.runs_partitioned (...) PARTITION BY RANGE (created_at); -- CREATE TABLE scheduler.execution_logs_partitioned (...) PARTITION BY RANGE (logged_at); ``` ### 5.5 Notify Schema See [schemas/notify.sql](./schemas/notify.sql) for the complete schema definition. ### 5.6 Policy Schema See [schemas/policy.sql](./schemas/policy.sql) for the complete schema definition. --- ## 6. Indexing Strategy ### 6.1 Index Types | Index Type | Use Case | Example | |------------|----------|---------| | B-tree (default) | Equality, range, sorting | `CREATE INDEX idx_x ON t(col)` | | GIN | JSONB containment, arrays, full-text | `CREATE INDEX idx_x ON t USING GIN (col)` | | GiST | Geometric, range types | `CREATE INDEX idx_x ON t USING GiST (col)` | | Hash | Equality only (rare) | `CREATE INDEX idx_x ON t USING HASH (col)` | | BRIN | Large tables with natural ordering | `CREATE INDEX idx_x ON t USING BRIN (col)` | ### 6.2 Composite Index Guidelines ```sql -- Order columns by: -- 1. Equality conditions first -- 2. Range conditions second -- 3. Most selective columns first within each group -- Good: tenant_id always equality, created_at often range CREATE INDEX idx_runs_tenant_created ON scheduler.runs(tenant_id, created_at DESC); -- Good: Partial index for active records only CREATE INDEX idx_schedules_active ON scheduler.schedules(tenant_id, name) WHERE deleted_at IS NULL AND enabled = TRUE; ``` ### 6.3 JSONB Indexing ```sql -- GIN index for general JSONB queries CREATE INDEX idx_advisories_provenance_gin ON vuln.advisories USING GIN (provenance); -- Expression index for specific paths CREATE INDEX idx_affected_ecosystem ON vuln.advisory_affected ((attributes->>'ecosystem')); -- Partial GIN for specific conditions CREATE INDEX idx_metadata_active ON scheduler.runs USING GIN (stats) WHERE state = 'completed'; ``` --- ## 7. Partitioning Strategy ### 7.1 When to Partition - Tables exceeding 100M rows - Time-series data with clear retention windows - Append-heavy tables with date-based queries ### 7.2 Partition Schemes **Time-based (RANGE):** ```sql CREATE TABLE scheduler.runs ( -- columns ) PARTITION BY RANGE (created_at); CREATE TABLE scheduler.runs_y2024m01 PARTITION OF scheduler.runs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01'); ``` **Tenant-based (LIST):** ```sql CREATE TABLE vex.statements ( -- columns ) PARTITION BY LIST (tenant_id); -- Only for very large tenants CREATE TABLE vex.statements_tenant_abc PARTITION OF vex.statements FOR VALUES IN ('abc-uuid'); ``` ### 7.3 Retention via Partition Drops ```sql -- Monthly cleanup job DROP TABLE scheduler.runs_y2023m01; DROP TABLE scheduler.execution_logs_y2023m01; ``` --- ## 8. Connection Management ### 8.1 Connection Pooling **Recommended: PgBouncer in transaction mode** ```ini [pgbouncer] pool_mode = transaction max_client_conn = 1000 default_pool_size = 20 reserve_pool_size = 5 ``` ### 8.2 Session Configuration Every connection must configure: ```sql -- Set on connection open (via DataSource) SET app.tenant_id = ''; SET timezone = 'UTC'; SET statement_timeout = '30s'; -- Adjust per use case ``` ### 8.3 Connection String Template ``` Host=;Port=5432;Database=stellaops;Username=;Password=; Pooling=true;MinPoolSize=5;MaxPoolSize=20;ConnectionIdleLifetime=300; CommandTimeout=30;Timeout=15; ``` --- ## 9. Migration Strategy ### 9.1 Migration Naming ``` V__.sql Examples: V001__create_authority_schema.sql V002__create_vuln_schema.sql V003__add_kev_flags_index.sql ``` ### 9.2 Migration Rules 1. **Idempotent**: Use `IF NOT EXISTS`, `IF EXISTS` 2. **Backward compatible**: Add columns as nullable first 3. **No data loss**: Never drop columns without migration path 4. **Testable**: Each migration runs in CI against test database 5. **Reversible**: Include down migration where possible ### 9.3 Migration Template ```sql -- V001__create_authority_schema.sql -- Description: Create initial authority schema -- Author: -- Date: 2025-XX-XX BEGIN; CREATE SCHEMA IF NOT EXISTS authority; CREATE TABLE IF NOT EXISTS authority.tenants ( -- ... ); -- Add indexes CREATE INDEX IF NOT EXISTS idx_tenants_code ON authority.tenants(code); COMMIT; ``` --- ## 10. Performance Guidelines ### 10.1 Query Patterns | Pattern | Guidance | |---------|----------| | Pagination | Use keyset pagination (`WHERE id > :last_id`), not `OFFSET` | | Bulk inserts | Use `COPY` or multi-value `INSERT` | | Existence checks | Use `EXISTS`, not `COUNT(*)` | | Aggregations | Pre-aggregate in summary tables where possible | ### 10.2 EXPLAIN ANALYZE All queries in production code should have `EXPLAIN ANALYZE` output documented for: - Expected row counts - Index usage - Scan types ### 10.3 Monitoring Queries ```sql -- Slow queries SELECT * FROM pg_stat_statements ORDER BY total_time DESC LIMIT 20; -- Index usage SELECT * FROM pg_stat_user_indexes WHERE idx_scan = 0; -- Table bloat SELECT * FROM pgstattuple('schema.table'); ``` --- ## Appendix A: Type Reference ### A.1 Custom Types (if needed) ```sql -- Advisory status type (use CHECK constraint instead for flexibility) -- CREATE TYPE advisory_status AS ENUM ('active', 'rejected', 'withdrawn'); -- Prefer CHECK constraints: status TEXT NOT NULL CHECK (status IN ('active', 'rejected', 'withdrawn')) ``` ### A.2 Extension Dependencies ```sql -- Required extensions CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; -- UUID generation (optional, gen_random_uuid() is built-in) CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- Trigram similarity for fuzzy search CREATE EXTENSION IF NOT EXISTS "btree_gin"; -- GIN indexes for scalar types ``` --- ## Appendix B: Schema Diagram ``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ AUTHORITY SCHEMA │ │ ┌─────────┐ ┌───────┐ ┌──────────────────┐ ┌─────────────────┐ │ │ │ tenants │───<│ users │───<│ user_roles │>───│ roles │ │ │ └─────────┘ └───────┘ └──────────────────┘ └─────────────────┘ │ │ │ │ │ │ │ ┌────┴─────┐ │ │ └────────<│ service_ │ ┌─────────┐ ┌────────┐ │ │ │ accounts │ │ clients │ │ scopes │ │ │ └──────────┘ └─────────┘ └────────┘ │ │ │ │ ┌─────────┐ ┌─────────────┐ ┌──────────┐ ┌────────────────┐ │ │ │ tokens │ │ revocations │ │ licenses │ │ license_usage │ │ │ └─────────┘ └─────────────┘ └──────────┘ └────────────────┘ │ └─────────────────────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────────────────────────────────┐ │ VULN SCHEMA │ │ ┌─────────┐ ┌────────────────┐ ┌───────────────────┐ │ │ │ sources │───<│ feed_snapshots │───<│ advisory_snapshots│ │ │ └─────────┘ └────────────────┘ └───────────────────┘ │ │ │ │ │ │ └──────────────────┬─────────────────────┘ │ │ ▼ │ │ ┌────────────┐ │ │ │ advisories │ │ │ └────────────┘ │ │ │ │ │ ┌─────────────────────┼─────────────────────┬──────────────────┐ │ │ ▼ ▼ ▼ ▼ │ │ ┌─────────────┐ ┌──────────────┐ ┌────────────────┐ ┌───────────┐ │ │ │ aliases │ │advisory_cvss │ │advisory_affected│ │ kev_flags │ │ │ └─────────────┘ └──────────────┘ └────────────────┘ └───────────┘ │ └─────────────────────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────────────────────────────────┐ │ VEX SCHEMA │ │ ┌──────────┐ ┌─────────────────┐ ┌─────────────┐ ┌───────────┐ │ │ │ projects │───<│ graph_revisions │───<│ graph_nodes │───<│graph_edges│ │ │ └──────────┘ └─────────────────┘ └─────────────┘ └───────────┘ │ │ │ │ │ │ │ │ │ │ │ │ ▼ ▼ ▼ │ │ ┌────────────┐ ┌────────────────┐ ┌──────────────┐ │ │ │ statements │───<│ cvss_receipts │ │ observations │ │ │ └────────────┘ └────────────────┘ └──────────────┘ │ │ │ │ │ ┌──────────┐ ┌───────────┐ ┌───────────────┴─────┐ │ │ │ linksets │───<│ linkset_ │ │ consensus │ holds │ │ │ └──────────┘ │ events │ └─────────────────────┘ │ │ └───────────┘ │ └─────────────────────────────────────────────────────────────────────────────┘ ``` --- *Document Version: 1.0.0* *Last Updated: 2025-11-28*