Files
StellaOps Bot ca578801fd save progress
2026-01-03 00:49:19 +02:00

16 KiB

Schema Mapping Reference

This document provides a comprehensive mapping of all data storage schemas across PostgreSQL, Valkey, and RustFS (S3), organized by module ownership.

Table of Contents


1. Storage Overview

StellaOps uses three primary storage systems:

Storage Purpose Technology Notes
PostgreSQL Canonical persistent store PostgreSQL v16+ Per-module schema isolation
Valkey Cache, queues, events Valkey v8.0 (Redis-compatible) DPoP nonces, task streams
RustFS Object storage S3-compatible API Content-addressed blobs

Storage Principles

  1. Schema Isolation: Each module owns its PostgreSQL schema
  2. Append-Only for Evidence: Advisory, VEX, and SBOM raw data is immutable (AOC)
  3. Content-Addressable: Blob storage uses SHA256 digest prefixes
  4. Deterministic Keys: Valkey keys follow predictable patterns
  5. TTL Management: Cache entries have explicit TTLs

2. PostgreSQL Schema Ownership

Schema-to-Module Mapping

Schema Owner Module Purpose Key Tables
authority Authority Identity, clients, keys, auth audit clients, keys, tokens, audit_trail
scanner Scanner Scan manifests, triage, metadata scans, artifacts, images, layers
vuln Concelier Advisory raw documents, linksets advisory_raw, linksets, observations
vex Excititor VEX raw statements, consensus vex_raw, consensus, issuer_trust
policy Policy Policies, exceptions, findings policies, exceptions, effective_finding_*
scheduler Scheduler Jobs, runs, schedules schedules, runs, impact_snapshots
notify Notifier Rules, channels, delivery history rules, channels, templates, delivery_log
orchestrator Orchestrator Workflows, tasks workflows, tasks, task_runs
registry Registry Token service, image metadata tokens, repositories
symbols Symbols Symbol resolution symbols, debug_info
unknowns Unknowns Unknown components unknown_components, tracking

Detailed Schema Definitions

Schema: authority

-- Core identity tables
clients (
  client_id UUID PRIMARY KEY,
  tenant_id UUID NOT NULL,
  name VARCHAR(255),
  client_type VARCHAR(50),  -- confidential, public, service
  redirect_uris TEXT[],
  scopes TEXT[],
  created_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ
);

keys (
  key_id UUID PRIMARY KEY,
  client_id UUID REFERENCES clients,
  algorithm VARCHAR(50),  -- RS256, ES256, EdDSA
  public_key TEXT,
  created_at TIMESTAMPTZ,
  expires_at TIMESTAMPTZ
);

tokens (
  token_id UUID PRIMARY KEY,
  client_id UUID REFERENCES clients,
  subject VARCHAR(255),
  scopes TEXT[],
  issued_at TIMESTAMPTZ,
  expires_at TIMESTAMPTZ,
  revoked_at TIMESTAMPTZ
);

audit_trail (
  audit_id UUID PRIMARY KEY,
  action VARCHAR(100),
  actor_id VARCHAR(255),
  resource_type VARCHAR(100),
  resource_id VARCHAR(255),
  occurred_at TIMESTAMPTZ,
  metadata JSONB
);

Schema: scanner

-- Scan lifecycle tables
scans (
  scan_id UUID PRIMARY KEY,
  tenant_id UUID NOT NULL,
  image_digest VARCHAR(100),
  image_reference TEXT,
  state VARCHAR(50),  -- pending, acquired, running, completed, failed
  created_at TIMESTAMPTZ,
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ,
  metadata JSONB
);

artifacts (
  artifact_id VARCHAR(100) PRIMARY KEY,  -- sha256:...
  scan_id UUID REFERENCES scans,
  format VARCHAR(50),  -- cdx-json, spdx-json, cdx-pb
  created_at TIMESTAMPTZ,
  rekor_proof JSONB
);

images (
  image_digest VARCHAR(100) PRIMARY KEY,
  repository TEXT,
  tag TEXT,
  architecture VARCHAR(50),
  os VARCHAR(50),
  created_at TIMESTAMPTZ
);

layers (
  layer_digest VARCHAR(100) PRIMARY KEY,
  media_type VARCHAR(255),
  size BIGINT,
  created_at TIMESTAMPTZ
);

scan_artifacts (
  scan_id UUID REFERENCES scans,
  artifact_id VARCHAR(100) REFERENCES artifacts,
  PRIMARY KEY (scan_id, artifact_id)
);

Schema: vuln

-- Advisory storage (append-only, AOC enforced)
advisory_raw (
  raw_id UUID PRIMARY KEY,
  advisory_id VARCHAR(100),  -- CVE-2024-xxxx
  source VARCHAR(50),  -- NVD, RED_HAT, OSV, GHSA
  raw_document JSONB NOT NULL,  -- Original JSON as-received
  published_at TIMESTAMPTZ,
  revision INTEGER,
  created_at TIMESTAMPTZ,
  UNIQUE (advisory_id, source, revision)
);

linksets (
  linkset_id UUID PRIMARY KEY,
  advisory_id VARCHAR(100),
  purl TEXT,
  version_range JSONB,
  created_at TIMESTAMPTZ
);

observations (
  observation_id UUID PRIMARY KEY,
  advisory_id VARCHAR(100),
  severity_cvss3 JSONB,
  severity_cvss4 JSONB,
  description TEXT,
  references JSONB,
  created_at TIMESTAMPTZ
);

Schema: vex

-- VEX storage (append-only, AOC enforced)
vex_raw (
  raw_id UUID PRIMARY KEY,
  issuer_id VARCHAR(255),
  component_purl TEXT,
  vulnerability_id VARCHAR(100),
  status VARCHAR(50),  -- not_affected, affected, under_investigation
  justification VARCHAR(100),
  raw_statement JSONB NOT NULL,
  published_at TIMESTAMPTZ,
  signature JSONB,  -- DSSE envelope if signed
  created_at TIMESTAMPTZ
);

consensus (
  consensus_id UUID PRIMARY KEY,
  component_purl TEXT,
  vulnerability_id VARCHAR(100),
  resolved_status VARCHAR(50),
  conflict_detected BOOLEAN,
  contributing_vex_ids UUID[],
  computed_at TIMESTAMPTZ
);

issuer_trust (
  issuer_id VARCHAR(255) PRIMARY KEY,
  trust_score DECIMAL(3,2),  -- 0.00 to 1.00
  priority INTEGER,  -- Lower = higher priority
  issuer_type VARCHAR(50),  -- vendor, distro, researcher, community
  updated_at TIMESTAMPTZ
);

Schema: policy

-- Policy definitions and lifecycle
policies (
  policy_id UUID PRIMARY KEY,
  tenant_id UUID NOT NULL,
  name VARCHAR(255),
  version INTEGER,
  state VARCHAR(50),  -- DRAFT, SHADOW, ACTIVE, ENFORCING
  yaml_content TEXT,
  compiled_ir JSONB,
  ir_hash VARCHAR(100),
  created_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ
);

policy_runs (
  run_id UUID PRIMARY KEY,
  policy_id UUID REFERENCES policies,
  tenant_id UUID NOT NULL,
  cursor JSONB,
  stats JSONB,
  determinism_hash VARCHAR(100),
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ
);

exceptions (
  exception_id UUID PRIMARY KEY,
  tenant_id UUID NOT NULL,
  policy_id UUID,
  component_purl TEXT,
  vulnerability_id VARCHAR(100),
  effect VARCHAR(50),  -- suppress, defer, downgrade, require_control
  approval_level VARCHAR(10),  -- G0, G1, G2, G3, G4
  expires_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ,
  created_by VARCHAR(255)
);

exception_approval_audit (
  audit_id UUID PRIMARY KEY,
  exception_id UUID REFERENCES exceptions,
  approval_level VARCHAR(10),
  approver_id VARCHAR(255),
  approved_at TIMESTAMPTZ,
  comment TEXT
);

-- Dynamic per-policy finding tables (created dynamically)
-- effective_finding_{policyId} - Current snapshot
-- effective_finding_{policyId}_history - Audit trail

Schema: scheduler

-- Job scheduling
schedules (
  schedule_id VARCHAR(100) PRIMARY KEY,
  tenant_id UUID NOT NULL,
  name VARCHAR(255),
  enabled BOOLEAN,
  cron_expression VARCHAR(100),
  timezone VARCHAR(50),
  mode VARCHAR(50),
  selection JSONB,
  notify JSONB,
  limits JSONB,
  created_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ
);

runs (
  run_id VARCHAR(100) PRIMARY KEY,
  tenant_id UUID NOT NULL,
  schedule_id VARCHAR(100) REFERENCES schedules,
  trigger VARCHAR(50),
  state VARCHAR(50),
  stats JSONB,
  deltas JSONB,
  created_at TIMESTAMPTZ,
  started_at TIMESTAMPTZ,
  finished_at TIMESTAMPTZ,
  error TEXT
);

run_summaries (
  summary_id VARCHAR(200) PRIMARY KEY,  -- tenant:schedule
  tenant_id UUID NOT NULL,
  schedule_id VARCHAR(100),
  last_run JSONB,
  recent JSONB[],
  counters JSONB,
  updated_at TIMESTAMPTZ
);

Schema: notify

-- Notification routing
rules (
  rule_id UUID PRIMARY KEY,
  tenant_id UUID NOT NULL,
  name VARCHAR(255),
  match JSONB,
  actions JSONB,
  enabled BOOLEAN,
  created_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ,
  deleted_at TIMESTAMPTZ
);

channels (
  channel_id UUID PRIMARY KEY,
  tenant_id UUID NOT NULL,
  name VARCHAR(255),
  type VARCHAR(50),  -- slack, teams, email, webhook
  config JSONB,
  created_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ,
  deleted_at TIMESTAMPTZ
);

templates (
  template_id UUID PRIMARY KEY,
  channel_type VARCHAR(50),
  key VARCHAR(100),
  locale VARCHAR(10),
  render_mode VARCHAR(50),
  body TEXT,
  created_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ
);

delivery_log (
  delivery_id UUID PRIMARY KEY,
  rule_id UUID REFERENCES rules,
  channel_id UUID REFERENCES channels,
  event_id UUID,
  status VARCHAR(50),
  delivered_at TIMESTAMPTZ,
  error TEXT
);

3. Valkey Keyspace Reference

Key Patterns by Module

Module Key Pattern Type TTL Purpose
Authority dpop:{jti} string 5m DPoP nonce cache (RFC 9449)
Scanner scan:{digest} string Infinite Last scan JSON result
Scanner layers:{digest} set 90d Layers with SBOMs (delta cache)
Scanner locator:{imageDigest} string 30d Image digest to sbomBlobId mapping
Policy policy:active string Infinite Active policy YAML/Rego
Policy policy:history list Infinite Change audit IDs
Concelier feed:nvd:json string 24h Normalized feed snapshot
General quota:{token} string Until UTC midnight Per-token scan counter
Scheduler scheduler:jobs stream - Job queue
Notifier notify:delivery stream 7d Delivery events
All events:* stream 7d Event streams
All queue:* stream - Task queues
Telemetry metrics:* various - Runtime metrics

Stream Definitions

Stream Producers Consumers Events
scanner:events Scanner.Worker Policy, Notify, TimelineIndexer, ExportCenter scan.submitted, scan.completed, scan.failed
concelier:drift Concelier.Worker Scheduler, Policy, Notify advisory.new, advisory.updated
policy:evaluated Policy.Worker Notify, Findings, ExportCenter evaluation.completed, verdict.changed
scheduler:jobs Scheduler Scanner, Policy run.started, rescan.triggered
notify:delivery Notifier Audit, TimelineIndexer notification.sent, notification.failed

Valkey Configuration

# Recommended Valkey configuration for StellaOps
maxmemory: 2gb
maxmemory-policy: volatile-lru
stream-node-max-bytes: 4096
stream-node-max-entries: 100

# Consumer groups for job processing
scanner:jobs:
  consumer_group: scanner-workers
  idle_timeout: 60s

notify:delivery:
  consumer_group: notify-workers
  idle_timeout: 30s

4. RustFS (S3) Path Conventions

Blob Storage Layout

blobs/
+-- {sha256_prefix}/               # First 4 chars of digest
    +-- {full_digest}/
        +-- sbom.json              # SBOM payload (any format)
        +-- sbom.meta.json         # Wrapper envelope
        +-- sbom.cdx.pb            # CycloneDX Protobuf (compact)
        +-- attestation.dsse       # DSSE envelope
        +-- evidence.bundle        # Evidence package

images/
+-- {imageDigest}/
    +-- inventory.cdx.json         # Inventory SBOM
    +-- inventory.cdx.pb           # Inventory (Protobuf)
    +-- usage.cdx.json             # Usage SBOM (entrypoint closure)
    +-- usage.cdx.pb               # Usage (Protobuf)
    +-- call-graph.json            # Call graph data
    +-- reachability.json          # Reachability analysis

evidence/
+-- {bundleId}/
    +-- manifest.json              # Bundle manifest
    +-- manifest.dsse              # Signed manifest
    +-- sboms/                     # SBOM files
    +-- attestations/              # Attestation files
    +-- proofs/                    # Verification proofs

offline-kits/
+-- {kitId}/
    +-- feeds/                     # Advisory snapshots
    +-- images/                    # Container images
    +-- signatures/                # DSSE signatures
    +-- trust-roots/               # CA certificates
    +-- manifest.json              # Kit manifest

SBOM Wrapper Envelope

{
  "id": "sha256:417f...",
  "imageDigest": "sha256:e2b9...",
  "created": "2025-01-02T15:30:00Z",
  "format": "cdx-json",
  "layers": [
    "sha256:d38b...",
    "sha256:af45..."
  ],
  "partial": false,
  "provenanceId": "prov_0291"
}

Content Types

Extension Content-Type Description
.json application/json JSON documents
.cdx.json application/vnd.cyclonedx+json CycloneDX JSON
.cdx.pb application/vnd.cyclonedx+protobuf CycloneDX Protobuf
.spdx.json application/spdx+json SPDX JSON
.dsse application/vnd.dsse+json DSSE envelope
.bundle application/zip Evidence bundle

5. Module-to-Storage Mapping

Complete Reference Table

Module PostgreSQL Schema Valkey Keys RustFS Paths
Authority authority dpop:{jti} -
Gateway - (stateless) - -
Router - connection state -
Scanner scanner scan:{digest}, layers:{digest}, locator:{imageDigest} blobs/, images/
Concelier vuln feed:*, concelier:drift stream -
Excititor vex - -
VexLens - (reads vex) - -
VexHub vex (extension) - -
IssuerDirectory - (reads vex.issuer_trust) - -
Policy policy policy:active, policy:history -
RiskEngine - (reads policy) - -
Scheduler scheduler scheduler:jobs stream -
Notifier notify notify:delivery stream -
Orchestrator orchestrator orchestrator:* streams -
Attestor - (uses scanner) - blobs/*/attestation.dsse
Signer - (uses authority) - -
SbomService - (reads scanner) - blobs/, images/
EvidenceLocker - - evidence/
ExportCenter - - evidence/, offline-kits/
AirGap - - offline-kits/
Registry registry - -
Symbols symbols - -
Unknowns unknowns - -
TimelineIndexer - (writes to scanner, etc.) - -

6. Cross-References

Data Flow Dependencies

+---------+     +-----------+     +--------+     +--------+
| Scanner |---->| SbomService|---->| Policy |---->| Notify |
+---------+     +-----------+     +--------+     +--------+
     |                |                |              |
     v                v                v              v
+----------+    +-----------+    +-----------+   +---------+
| scanner  |    |  blobs/   |    |  policy   |   | notify  |
| (PG)     |    |  (RustFS) |    |  (PG)     |   | (PG)    |
+----------+    +-----------+    +-----------+   +---------+

+-----------+     +----------+
| Concelier |---->|  Policy  |
+-----------+     +----------+
     |                 ^
     v                 |
+----------+     +-----------+
|   vuln   |     | Excititor |
|  (PG)    |     +-----------+
+----------+           |
                       v
                 +----------+
                 |   vex    |
                 |  (PG)    |
                 +----------+

Schema Version Tracking

All schemas support versioning:

-- Every schema has a version tracking table
schema_migrations (
  version INTEGER PRIMARY KEY,
  description TEXT,
  applied_at TIMESTAMPTZ,
  checksum VARCHAR(64)
);

Backup Considerations

Storage Backup Strategy Retention
PostgreSQL pg_dump + WAL archiving 30 days
Valkey RDB snapshots + AOF 7 days
RustFS Object versioning 90 days