# Schema Mapping Reference This document provides a comprehensive mapping of all data storage schemas across PostgreSQL, Valkey, and RustFS (S3), organized by module ownership. ## Table of Contents - [1. Storage Overview](#1-storage-overview) - [2. PostgreSQL Schema Ownership](#2-postgresql-schema-ownership) - [3. Valkey Keyspace Reference](#3-valkey-keyspace-reference) - [4. RustFS (S3) Path Conventions](#4-rustfs-s3-path-conventions) - [5. Module-to-Storage Mapping](#5-module-to-storage-mapping) - [6. Cross-References](#6-cross-references) --- ## 1. Storage Overview StellaOps uses three primary storage systems: | Storage | Purpose | Technology | Notes | |---------|---------|------------|-------| | **PostgreSQL** | Canonical persistent store | PostgreSQL v16+ | Per-module schema isolation | | **Valkey** | Cache, queues, events | Valkey v8.0 (Redis-compatible) | DPoP nonces, task streams | | **RustFS** | Object storage | S3-compatible API | Content-addressed blobs | ### Storage Principles 1. **Schema Isolation**: Each module owns its PostgreSQL schema 2. **Append-Only for Evidence**: Advisory, VEX, and SBOM raw data is immutable (AOC) 3. **Content-Addressable**: Blob storage uses SHA256 digest prefixes 4. **Deterministic Keys**: Valkey keys follow predictable patterns 5. **TTL Management**: Cache entries have explicit TTLs --- ## 2. PostgreSQL Schema Ownership ### Schema-to-Module Mapping | Schema | Owner Module | Purpose | Key Tables | |--------|--------------|---------|------------| | `authority` | Authority | Identity, clients, keys, auth audit | `clients`, `keys`, `tokens`, `audit_trail` | | `scanner` | Scanner | Scan manifests, triage, metadata | `scans`, `artifacts`, `images`, `layers` | | `vuln` | Concelier | Advisory raw documents, linksets | `advisory_raw`, `linksets`, `observations` | | `vex` | Excititor | VEX raw statements, consensus | `vex_raw`, `consensus`, `issuer_trust` | | `policy` | Policy | Policies, exceptions, findings | `policies`, `exceptions`, `effective_finding_*` | | `scheduler` | Scheduler | Jobs, runs, schedules | `schedules`, `runs`, `impact_snapshots` | | `notify` | Notifier | Rules, channels, delivery history | `rules`, `channels`, `templates`, `delivery_log` | | `orchestrator` | Orchestrator | Workflows, tasks | `workflows`, `tasks`, `task_runs` | | `registry` | Registry | Token service, image metadata | `tokens`, `repositories` | | `symbols` | Symbols | Symbol resolution | `symbols`, `debug_info` | | `unknowns` | Unknowns | Unknown components | `unknown_components`, `tracking` | ### Detailed Schema Definitions #### Schema: `authority` ```sql -- Core identity tables clients ( client_id UUID PRIMARY KEY, tenant_id UUID NOT NULL, name VARCHAR(255), client_type VARCHAR(50), -- confidential, public, service redirect_uris TEXT[], scopes TEXT[], created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ ); keys ( key_id UUID PRIMARY KEY, client_id UUID REFERENCES clients, algorithm VARCHAR(50), -- RS256, ES256, EdDSA public_key TEXT, created_at TIMESTAMPTZ, expires_at TIMESTAMPTZ ); tokens ( token_id UUID PRIMARY KEY, client_id UUID REFERENCES clients, subject VARCHAR(255), scopes TEXT[], issued_at TIMESTAMPTZ, expires_at TIMESTAMPTZ, revoked_at TIMESTAMPTZ ); audit_trail ( audit_id UUID PRIMARY KEY, action VARCHAR(100), actor_id VARCHAR(255), resource_type VARCHAR(100), resource_id VARCHAR(255), occurred_at TIMESTAMPTZ, metadata JSONB ); ``` #### Schema: `scanner` ```sql -- Scan lifecycle tables scans ( scan_id UUID PRIMARY KEY, tenant_id UUID NOT NULL, image_digest VARCHAR(100), image_reference TEXT, state VARCHAR(50), -- pending, acquired, running, completed, failed created_at TIMESTAMPTZ, started_at TIMESTAMPTZ, completed_at TIMESTAMPTZ, metadata JSONB ); artifacts ( artifact_id VARCHAR(100) PRIMARY KEY, -- sha256:... scan_id UUID REFERENCES scans, format VARCHAR(50), -- cdx-json, spdx-json, cdx-pb created_at TIMESTAMPTZ, rekor_proof JSONB ); images ( image_digest VARCHAR(100) PRIMARY KEY, repository TEXT, tag TEXT, architecture VARCHAR(50), os VARCHAR(50), created_at TIMESTAMPTZ ); layers ( layer_digest VARCHAR(100) PRIMARY KEY, media_type VARCHAR(255), size BIGINT, created_at TIMESTAMPTZ ); scan_artifacts ( scan_id UUID REFERENCES scans, artifact_id VARCHAR(100) REFERENCES artifacts, PRIMARY KEY (scan_id, artifact_id) ); ``` #### Schema: `vuln` ```sql -- Advisory storage (append-only, AOC enforced) advisory_raw ( raw_id UUID PRIMARY KEY, advisory_id VARCHAR(100), -- CVE-2024-xxxx source VARCHAR(50), -- NVD, RED_HAT, OSV, GHSA raw_document JSONB NOT NULL, -- Original JSON as-received published_at TIMESTAMPTZ, revision INTEGER, created_at TIMESTAMPTZ, UNIQUE (advisory_id, source, revision) ); linksets ( linkset_id UUID PRIMARY KEY, advisory_id VARCHAR(100), purl TEXT, version_range JSONB, created_at TIMESTAMPTZ ); observations ( observation_id UUID PRIMARY KEY, advisory_id VARCHAR(100), severity_cvss3 JSONB, severity_cvss4 JSONB, description TEXT, references JSONB, created_at TIMESTAMPTZ ); ``` #### Schema: `vex` ```sql -- VEX storage (append-only, AOC enforced) vex_raw ( raw_id UUID PRIMARY KEY, issuer_id VARCHAR(255), component_purl TEXT, vulnerability_id VARCHAR(100), status VARCHAR(50), -- not_affected, affected, under_investigation justification VARCHAR(100), raw_statement JSONB NOT NULL, published_at TIMESTAMPTZ, signature JSONB, -- DSSE envelope if signed created_at TIMESTAMPTZ ); consensus ( consensus_id UUID PRIMARY KEY, component_purl TEXT, vulnerability_id VARCHAR(100), resolved_status VARCHAR(50), conflict_detected BOOLEAN, contributing_vex_ids UUID[], computed_at TIMESTAMPTZ ); issuer_trust ( issuer_id VARCHAR(255) PRIMARY KEY, trust_score DECIMAL(3,2), -- 0.00 to 1.00 priority INTEGER, -- Lower = higher priority issuer_type VARCHAR(50), -- vendor, distro, researcher, community updated_at TIMESTAMPTZ ); ``` #### Schema: `policy` ```sql -- Policy definitions and lifecycle policies ( policy_id UUID PRIMARY KEY, tenant_id UUID NOT NULL, name VARCHAR(255), version INTEGER, state VARCHAR(50), -- DRAFT, SHADOW, ACTIVE, ENFORCING yaml_content TEXT, compiled_ir JSONB, ir_hash VARCHAR(100), created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ ); policy_runs ( run_id UUID PRIMARY KEY, policy_id UUID REFERENCES policies, tenant_id UUID NOT NULL, cursor JSONB, stats JSONB, determinism_hash VARCHAR(100), started_at TIMESTAMPTZ, completed_at TIMESTAMPTZ ); exceptions ( exception_id UUID PRIMARY KEY, tenant_id UUID NOT NULL, policy_id UUID, component_purl TEXT, vulnerability_id VARCHAR(100), effect VARCHAR(50), -- suppress, defer, downgrade, require_control approval_level VARCHAR(10), -- G0, G1, G2, G3, G4 expires_at TIMESTAMPTZ, created_at TIMESTAMPTZ, created_by VARCHAR(255) ); exception_approval_audit ( audit_id UUID PRIMARY KEY, exception_id UUID REFERENCES exceptions, approval_level VARCHAR(10), approver_id VARCHAR(255), approved_at TIMESTAMPTZ, comment TEXT ); -- Dynamic per-policy finding tables (created dynamically) -- effective_finding_{policyId} - Current snapshot -- effective_finding_{policyId}_history - Audit trail ``` #### Schema: `scheduler` ```sql -- Job scheduling schedules ( schedule_id VARCHAR(100) PRIMARY KEY, tenant_id UUID NOT NULL, name VARCHAR(255), enabled BOOLEAN, cron_expression VARCHAR(100), timezone VARCHAR(50), mode VARCHAR(50), selection JSONB, notify JSONB, limits JSONB, created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ ); runs ( run_id VARCHAR(100) PRIMARY KEY, tenant_id UUID NOT NULL, schedule_id VARCHAR(100) REFERENCES schedules, trigger VARCHAR(50), state VARCHAR(50), stats JSONB, deltas JSONB, created_at TIMESTAMPTZ, started_at TIMESTAMPTZ, finished_at TIMESTAMPTZ, error TEXT ); run_summaries ( summary_id VARCHAR(200) PRIMARY KEY, -- tenant:schedule tenant_id UUID NOT NULL, schedule_id VARCHAR(100), last_run JSONB, recent JSONB[], counters JSONB, updated_at TIMESTAMPTZ ); ``` #### Schema: `notify` ```sql -- Notification routing rules ( rule_id UUID PRIMARY KEY, tenant_id UUID NOT NULL, name VARCHAR(255), match JSONB, actions JSONB, enabled BOOLEAN, created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ, deleted_at TIMESTAMPTZ ); channels ( channel_id UUID PRIMARY KEY, tenant_id UUID NOT NULL, name VARCHAR(255), type VARCHAR(50), -- slack, teams, email, webhook config JSONB, created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ, deleted_at TIMESTAMPTZ ); templates ( template_id UUID PRIMARY KEY, channel_type VARCHAR(50), key VARCHAR(100), locale VARCHAR(10), render_mode VARCHAR(50), body TEXT, created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ ); delivery_log ( delivery_id UUID PRIMARY KEY, rule_id UUID REFERENCES rules, channel_id UUID REFERENCES channels, event_id UUID, status VARCHAR(50), delivered_at TIMESTAMPTZ, error TEXT ); ``` --- ## 3. Valkey Keyspace Reference ### Key Patterns by Module | Module | Key Pattern | Type | TTL | Purpose | |--------|-------------|------|-----|---------| | **Authority** | `dpop:{jti}` | string | 5m | DPoP nonce cache (RFC 9449) | | **Scanner** | `scan:{digest}` | string | Infinite | Last scan JSON result | | **Scanner** | `layers:{digest}` | set | 90d | Layers with SBOMs (delta cache) | | **Scanner** | `locator:{imageDigest}` | string | 30d | Image digest to sbomBlobId mapping | | **Policy** | `policy:active` | string | Infinite | Active policy YAML/Rego | | **Policy** | `policy:history` | list | Infinite | Change audit IDs | | **Concelier** | `feed:nvd:json` | string | 24h | Normalized feed snapshot | | **General** | `quota:{token}` | string | Until UTC midnight | Per-token scan counter | | **Scheduler** | `scheduler:jobs` | stream | - | Job queue | | **Notifier** | `notify:delivery` | stream | 7d | Delivery events | | **All** | `events:*` | stream | 7d | Event streams | | **All** | `queue:*` | stream | - | Task queues | | **Telemetry** | `metrics:*` | various | - | Runtime metrics | ### Stream Definitions | Stream | Producers | Consumers | Events | |--------|-----------|-----------|--------| | `scanner:events` | Scanner.Worker | Policy, Notify, TimelineIndexer, ExportCenter | scan.submitted, scan.completed, scan.failed | | `concelier:drift` | Concelier.Worker | Scheduler, Policy, Notify | advisory.new, advisory.updated | | `policy:evaluated` | Policy.Worker | Notify, Findings, ExportCenter | evaluation.completed, verdict.changed | | `scheduler:jobs` | Scheduler | Scanner, Policy | run.started, rescan.triggered | | `notify:delivery` | Notifier | Audit, TimelineIndexer | notification.sent, notification.failed | ### Valkey Configuration ```yaml # Recommended Valkey configuration for StellaOps maxmemory: 2gb maxmemory-policy: volatile-lru stream-node-max-bytes: 4096 stream-node-max-entries: 100 # Consumer groups for job processing scanner:jobs: consumer_group: scanner-workers idle_timeout: 60s notify:delivery: consumer_group: notify-workers idle_timeout: 30s ``` --- ## 4. RustFS (S3) Path Conventions ### Blob Storage Layout ``` blobs/ +-- {sha256_prefix}/ # First 4 chars of digest +-- {full_digest}/ +-- sbom.json # SBOM payload (any format) +-- sbom.meta.json # Wrapper envelope +-- sbom.cdx.pb # CycloneDX Protobuf (compact) +-- attestation.dsse # DSSE envelope +-- evidence.bundle # Evidence package images/ +-- {imageDigest}/ +-- inventory.cdx.json # Inventory SBOM +-- inventory.cdx.pb # Inventory (Protobuf) +-- usage.cdx.json # Usage SBOM (entrypoint closure) +-- usage.cdx.pb # Usage (Protobuf) +-- call-graph.json # Call graph data +-- reachability.json # Reachability analysis evidence/ +-- {bundleId}/ +-- manifest.json # Bundle manifest +-- manifest.dsse # Signed manifest +-- sboms/ # SBOM files +-- attestations/ # Attestation files +-- proofs/ # Verification proofs offline-kits/ +-- {kitId}/ +-- feeds/ # Advisory snapshots +-- images/ # Container images +-- signatures/ # DSSE signatures +-- trust-roots/ # CA certificates +-- manifest.json # Kit manifest ``` ### SBOM Wrapper Envelope ```json { "id": "sha256:417f...", "imageDigest": "sha256:e2b9...", "created": "2025-01-02T15:30:00Z", "format": "cdx-json", "layers": [ "sha256:d38b...", "sha256:af45..." ], "partial": false, "provenanceId": "prov_0291" } ``` ### Content Types | Extension | Content-Type | Description | |-----------|--------------|-------------| | `.json` | `application/json` | JSON documents | | `.cdx.json` | `application/vnd.cyclonedx+json` | CycloneDX JSON | | `.cdx.pb` | `application/vnd.cyclonedx+protobuf` | CycloneDX Protobuf | | `.spdx.json` | `application/spdx+json` | SPDX JSON | | `.dsse` | `application/vnd.dsse+json` | DSSE envelope | | `.bundle` | `application/zip` | Evidence bundle | --- ## 5. Module-to-Storage Mapping ### Complete Reference Table | Module | PostgreSQL Schema | Valkey Keys | RustFS Paths | |--------|------------------|-------------|--------------| | **Authority** | `authority` | `dpop:{jti}` | - | | **Gateway** | - (stateless) | - | - | | **Router** | - | connection state | - | | **Scanner** | `scanner` | `scan:{digest}`, `layers:{digest}`, `locator:{imageDigest}` | `blobs/`, `images/` | | **Concelier** | `vuln` | `feed:*`, `concelier:drift` stream | - | | **Excititor** | `vex` | - | - | | **VexLens** | - (reads `vex`) | - | - | | **VexHub** | `vex` (extension) | - | - | | **IssuerDirectory** | - (reads `vex.issuer_trust`) | - | - | | **Policy** | `policy` | `policy:active`, `policy:history` | - | | **RiskEngine** | - (reads `policy`) | - | - | | **Scheduler** | `scheduler` | `scheduler:jobs` stream | - | | **Notifier** | `notify` | `notify:delivery` stream | - | | **Orchestrator** | `orchestrator` | `orchestrator:*` streams | - | | **Attestor** | - (uses `scanner`) | - | `blobs/*/attestation.dsse` | | **Signer** | - (uses `authority`) | - | - | | **SbomService** | - (reads `scanner`) | - | `blobs/`, `images/` | | **EvidenceLocker** | - | - | `evidence/` | | **ExportCenter** | - | - | `evidence/`, `offline-kits/` | | **AirGap** | - | - | `offline-kits/` | | **Registry** | `registry` | - | - | | **Symbols** | `symbols` | - | - | | **Unknowns** | `unknowns` | - | - | | **TimelineIndexer** | - (writes to `scanner`, etc.) | - | - | --- ## 6. Cross-References ### Data Flow Dependencies ``` +---------+ +-----------+ +--------+ +--------+ | Scanner |---->| SbomService|---->| Policy |---->| Notify | +---------+ +-----------+ +--------+ +--------+ | | | | v v v v +----------+ +-----------+ +-----------+ +---------+ | scanner | | blobs/ | | policy | | notify | | (PG) | | (RustFS) | | (PG) | | (PG) | +----------+ +-----------+ +-----------+ +---------+ +-----------+ +----------+ | Concelier |---->| Policy | +-----------+ +----------+ | ^ v | +----------+ +-----------+ | vuln | | Excititor | | (PG) | +-----------+ +----------+ | v +----------+ | vex | | (PG) | +----------+ ``` ### Schema Version Tracking All schemas support versioning: ```sql -- Every schema has a version tracking table schema_migrations ( version INTEGER PRIMARY KEY, description TEXT, applied_at TIMESTAMPTZ, checksum VARCHAR(64) ); ``` ### Backup Considerations | Storage | Backup Strategy | Retention | |---------|-----------------|-----------| | PostgreSQL | pg_dump + WAL archiving | 30 days | | Valkey | RDB snapshots + AOF | 7 days | | RustFS | Object versioning | 90 days | --- ## Related Documentation - [Data Schemas](../../11_DATA_SCHEMAS.md) - Detailed schema definitions - [Data Flows](data-flows.md) - How data moves through the system - [Data Isolation](data-isolation.md) - Per-tenant isolation model - [Module Matrix](module-matrix.md) - Complete module inventory - [Request Flows](request-flows.md) - HTTP/Binary protocol flows