# StellaOps Platform - Detailed Architecture

**Last Updated:** 2025-12-23
**Purpose:** Comprehensive component architecture with communication patterns and data flows

## Table of Contents

1. [Component Topology](#component-topology)
2. [Infrastructure Layer](#infrastructure-layer)
3. [Service Catalog](#service-catalog)
4. [Communication Patterns](#communication-patterns)
5. [Data Flow Diagrams](#data-flow-diagrams)
6. [Database Schema Isolation](#database-schema-isolation)
7. [Security Boundaries](#security-boundaries)

---

## Component Topology

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                                CLIENT LAYER                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐                     │
│  │  stella  │  │  Web UI  │  │  CI/CD   │  │ Zastava  │                     │
│  │   CLI    │  │ Angular  │  │ Pipeline │  │ Observer │                     │
│  └─────┬────┘  └─────┬────┘  └─────┬────┘  └─────┬────┘                     │
│        │             │             │             │                          │
└────────┼─────────────┼─────────────┼─────────────┼──────────────────────────┘
         │             │             │             │
         └─────────────┴─────────────┴─────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                GATEWAY LAYER                                │
│  ┌────────────────────────────────────────────────────────────────┐         │
│  │                       Gateway.WebService                       │         │
│  │  • JWT validation            • Rate limiting                   │         │
│  │  • DPoP verification         • Request routing                 │         │
│  │  • Tenant resolution         • Correlation tracking            │         │
│  └───┬────────────────────────────────────────────────┬───────────┘         │
│      │                                                │                     │
└──────┼────────────────────────────────────────────────┼─────────────────────┘
       │                                                │
       ▼                                                ▼
┌─────────────────┐                            ┌─────────────────┐
│    AUTHORITY    │◄───────────────────────────│  ALL SERVICES   │
│                 │  OpTok validation          │   (Resource     │
│ • OAuth2/OIDC   │  DPoP nonce verification   │   servers)      │
│ • DPoP binding  │                            │                 │
│ • OpTok issue   │                            └─────────────────┘
│ • mTLS verify   │
└────────┬────────┘
         │ stores tokens,
         │ audit trails
         ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                             CORE SERVICES LAYER                             │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                         SCANNING ENGINE                         │        │
│  │                                                                 │        │
│  │  ┌────────────────────┐         ┌────────────────────┐          │        │
│  │  │ Scanner.WebService │────────▶│ Scanner.Worker     │          │        │
│  │  │                    │  Valkey │                    │          │        │
│  │  │ • Scan orchestrate │  queue  │ • Layer analysis   │          │        │
│  │  │ • Report catalog   │         │ • SBOM generation  │          │        │
│  │  │ • Policy eval      │         │ • Reachability     │          │        │
│  │  └─────┬──────────────┘         └────────┬───────────┘          │        │
│  │        │                                 │                      │        │
│  │        │ linkset                         │ artifact             │        │
│  │        │ query                           │ upload               │        │
│  │        ▼                                 ▼                      │        │
│  │  ┌──────────────┐                 ┌──────────────┐              │        │
│  │  │  Concelier   │                 │    RustFS    │              │        │
│  │  │  WebService  │                 │   (S3 API)   │              │        │
│  │  └──────────────┘                 └──────────────┘              │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                    ADVISORY INGESTION ENGINE                    │        │
│  │                                                                 │        │
│  │  ┌────────────────────┐         ┌────────────────────┐          │        │
│  │  │Concelier.WebService│────────▶│ Concelier.Worker   │          │        │
│  │  │                    │  Jobs   │                    │          │        │
│  │  │ • Ingest advisories│         │ • Connector fetch  │          │        │
│  │  │ • Compute linksets │         │ • Normalize data   │          │        │
│  │  │ • AOC enforcement  │         │ • Delta detection  │          │        │
│  │  └─────┬──────────────┘         └────────────────────┘          │        │
│  │        │                                                        │        │
│  │        │ webhook: advisory delta events                         │        │
│  │        ▼                                                        │        │
│  │  ┌──────────────┐                                               │        │
│  │  │  Scheduler   │                                               │        │
│  │  │  WebService  │                                               │        │
│  │  └──────────────┘                                               │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                      VEX INGESTION ENGINE                       │        │
│  │                                                                 │        │
│  │  ┌────────────────────┐         ┌────────────────────┐          │        │
│  │  │Excititor.WebService│────────▶│ Excititor.Worker   │          │        │
│  │  │                    │  Jobs   │                    │          │        │
│  │  │ • Ingest VEX       │         │ • Fetch VEX feeds  │          │        │
│  │  │ • DSSE verify      │         │ • Trust verify     │          │        │
│  │  │ • Consensus calc   │         │ • Signature check  │          │        │
│  │  └─────┬──────────────┘         └──────┬─────────────┘          │        │
│  │        │                               │                        │        │
│  │        │ webhook: VEX delta            │ trust lookup           │        │
│  │        ▼                               ▼                        │        │
│  │  ┌──────────────┐               ┌──────────────┐                │        │
│  │  │  Scheduler   │               │    Issuer    │                │        │
│  │  │  WebService  │               │  Directory   │                │        │
│  │  └──────────────┘               └──────────────┘                │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                   ORCHESTRATION & SCHEDULING                    │        │
│  │                                                                 │        │
│  │  ┌────────────────────┐         ┌────────────────────┐          │        │
│  │  │Scheduler.WebService│────────▶│ Scheduler.Worker   │          │        │
│  │  │                    │  Jobs   │                    │          │        │
│  │  │ • Impact select    │         │ • Re-scan trigger  │          │        │
│  │  │ • Rate limit       │         │ • Batch enforce    │          │        │
│  │  │ • Maintenance win  │         │ • Progress track   │          │        │
│  │  └─────┬──────────────┘         └──────┬─────────────┘          │        │
│  │        │                               │                        │        │
│  │        │                               │ HTTP: enqueue scan     │        │
│  │        │                               ▼                        │        │
│  │        │                        ┌──────────────┐                │        │
│  │        │                        │ Scanner.Web  │                │        │
│  │        │                        └──────────────┘                │        │
│  │        │                                                        │        │
│  │  ┌─────▼──────────────┐                                         │        │
│  │  │ Orchestrator.Web   │                                         │        │
│  │  │                    │                                         │        │
│  │  │ • DAG workflows    │                                         │        │
│  │  │ • Pack runs        │                                         │        │
│  │  │ • Job streaming    │                                         │        │
│  │  └────────────────────┘                                         │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                       NOTIFICATION ENGINE                       │        │
│  │                                                                 │        │
│  │  ┌────────────────────┐         ┌────────────────────┐          │        │
│  │  │ Notify.WebService  │────────▶│ Notify.Worker      │          │        │
│  │  │                    │  Valkey │                    │          │        │
│  │  │ • Channel mgmt     │ Streams │ • Slack delivery   │          │        │
│  │  │ • Template engine  │  XADD/  │ • Teams delivery   │          │        │
│  │  │ • Throttle/digest  │  XREAD  │ • Email delivery   │          │        │
│  │  └─────▲──────────────┘         └──────┬─────────────┘          │        │
│  │        │                               │                        │        │
│  │        │ report.ready events           │ External HTTP/SMTP     │        │
│  │        │                               ▼                        │        │
│  │  ┌─────┴──────────────┐         ┌──────────────┐                │        │
│  │  │ Scanner.Web        │         │  Slack API   │                │        │
│  │  │ (events)           │         │  Teams API   │                │        │
│  │  └────────────────────┘         │  SMTP        │                │        │
│  │                                 └──────────────┘                │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                     CRYPTOGRAPHIC SERVICES                      │        │
│  │                                                                 │        │
│  │  ┌────────────────────┐         ┌────────────────────┐          │        │
│  │  │ Signer.WebService  │────────▶│ Attestor.WebService│          │        │
│  │  │                    │  mTLS   │                    │          │        │
│  │  │ • DSSE signing     │  OpTok  │ • Rekor v2 submit  │          │        │
│  │  │ • PoE validation   │         │ • Receipt verify   │          │        │
│  │  │ • Multi-profile    │         │ • Offline bundles  │          │        │
│  │  │   FIPS/GOST/SM     │         │                    │          │        │
│  │  └─────┬──────────────┘         └──────┬─────────────┘          │        │
│  │        │                               │                        │        │
│  │        │ KMS/PKCS11                    │ External               │        │
│  │        ▼                               ▼                        │        │
│  │  ┌──────────────┐               ┌──────────────┐                │        │
│  │  │ External KMS │               │   Rekor v2   │                │        │
│  │  │  (AWS/GCP)   │               │  (Sigstore)  │                │        │
│  │  └──────────────┘               └──────────────┘                │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                          POLICY ENGINE                          │        │
│  │                                                                 │        │
│  │  ┌────────────────────┐         ┌────────────────────┐          │        │
│  │  │ Policy.Gateway     │────────▶│ Policy Engine      │          │        │
│  │  │                    │  HTTP   │ (OPA/Rego)         │          │        │
│  │  │ • Exception mgmt   │         │                    │          │        │
│  │  │ • Approval flow    │         │ • Rule eval        │          │        │
│  │  │ • Delta compute    │         │ • Verdict compute  │          │        │
│  │  └─────▲──────────────┘         └──────▲─────────────┘          │        │
│  │        │                               │                        │        │
│  │        │ policy eval request           │ PostgreSQL             │        │
│  │        │                               │ logical replication    │        │
│  │  ┌─────┴──────────────┐                │                        │        │
│  │  │ Scanner.Web        │          ┌─────┴──────────┐             │        │
│  │  │ (verdict request)  │          │ advisory_raw   │             │        │
│  │  └────────────────────┘          │ vex_raw        │             │        │
│  │                                  │ (streams)      │             │        │
│  │                                  └────────────────┘             │        │
│  └─────────────────────────────────────────────────────────────────┘        │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                            INFRASTRUCTURE LAYER                             │
│                                                                             │
│  ┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐       │
│  │    PostgreSQL    │    │      Valkey      │    │      RustFS      │       │
│  │       v16+       │    │       v8.0       │    │ (S3-compatible)  │       │
│  │                  │    │                  │    │                  │       │
│  │ • Per-service    │    │ • DPoP nonces    │    │ • SBOM artifacts │       │
│  │   schemas        │    │ • Event streams  │    │ • Proof bundles  │       │
│  │ • Logical        │    │ • Job queues     │    │ • CAS storage    │       │
│  │   replication    │    │ • Cache          │    │                  │       │
│  │ • REQUIRED       │    │ • REQUIRED       │    │ • REQUIRED       │       │
│  └──────────────────┘    └──────────────────┘    └──────────────────┘       │
│                                                                             │
│  ┌──────────────────┐                                                       │
│  │       NATS       │                                                       │
│  │    JetStream     │                                                       │
│  │                  │                                                       │
│  │ • Message queue  │                                                       │
│  │ • Optional       │                                                       │
│  │   (Valkey is     │                                                       │
│  │    default)      │                                                       │
│  └──────────────────┘                                                       │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## Infrastructure Layer

### PostgreSQL v16+ (REQUIRED)

**Purpose:** Primary database for ALL persistent data

**Schema Isolation:**

| Schema | Owner Service | Purpose |
|--------|---------------|---------|
| `authority` | Authority | Users, clients, tenants, keys, audit trails |
| `scanner` | Scanner | Scan manifests, triage, EPSS, reachability graphs |
| `vuln` | Concelier | Advisory raw documents, linksets, observations |
| `vex` | Excititor | VEX raw documents, consensus, provider state |
| `scheduler` | Scheduler | Graph jobs, runs, schedules, impact snapshots |
| `notify` | Notify | Channels, templates, delivery history, digests |
| `policy` | Policy.Gateway | Exception objects, snapshots, unknowns |
| `orchestrator` | Orchestrator | Sources, runs, jobs, DAGs, pack runs |

**Special Features:**
- **Logical Replication:** `advisory_raw_stream`, `vex_raw_stream` → Policy Engine
- **Per-tenant isolation:** Tenant ID in all tables for row-level security
- **Append-only patterns:** AOC (Aggregation-Only Contract) for advisory/VEX immutability

### Valkey v8.0 (REQUIRED)

**Purpose:** Cache, DPoP security, event streams, job queues

**Use Cases:**

| Pattern | Services | Purpose |
|---------|----------|---------|
| DPoP nonces | Authority | RFC 9449 nonce storage (30s TTL) |
| Event streams | Scanner, Notify, Scheduler | XADD for `report.ready`, drift events |
| Job queues | Scanner, Notify | XREADGROUP for worker coordination |
| Cache | All services | Distributed caching with tenant prefixes |
| Rate limiting | Gateway, Authority | Token bucket counters |

**Default Transport:** Valkey Streams preferred over NATS for queuing

### RustFS (REQUIRED)

**Purpose:** S3-compatible object storage for artifacts

**Buckets:**

| Bucket | Services | Content |
|--------|----------|---------|
| `scanner-artifacts` | Scanner | Layer SBOMs, composed SBOMs, proof bundles |
| `surface-cache` | Scanner.Worker | Extracted filesystem surfaces |
| `evidence-locker` | Evidence Locker | Immutable audit evidence |
| `cas-replay` | Replay Engine | Content-addressed snapshots |

**API:** HTTP/S3 with optional API key authentication

### NATS JetStream (OPTIONAL)

**Purpose:** Alternative messaging transport (not default)

**When to Use:**
- High-throughput environments requiring persistent streams
- Multi-datacenter replication scenarios
- When Valkey Streams are insufficient for scale

**Default:** Valkey is preferred; NATS opt-in via configuration

---

## Service Catalog

### Gateway Layer

#### Gateway.WebService
**Port:** 8080 (HTTP), 8443 (HTTPS)
**Dependencies:** Authority (JWT validation), Backend services (routing)

**Responsibilities:**
- **Authentication:** JWT + DPoP verification on all requests
- **Authorization:** Scope-based access control (RBAC claims)
- **Tenant Resolution:** Multi-tenant routing via `X-Tenant-Id` header or JWT
- **Rate Limiting:** Per-client token bucket (Valkey-backed)
- **Request Routing:** Routes to Scanner, Concelier, Policy, Scheduler, Notify
- **Correlation Tracking:** Injects `X-Correlation-Id` for distributed tracing

**Security Boundaries:**
- TLS termination (mutual TLS optional)
- DPoP sender constraint validation
- OpTok refresh on expiry

---

### Authentication & Security

#### Authority
**Port:** 8440 (HTTPS)
**Database:** `authority` schema
**Dependencies:** Valkey (DPoP nonces), External LDAP/OIDC (plugins)

**Responsibilities:**
- **OAuth 2.1 Server:** Issues OpToks (operational tokens) with DPoP binding
- **Client Credentials Flow:** Machine-to-machine authentication
- **Resource Owner Password Flow:** User authentication with LDAP/OIDC
- **DPoP (RFC 9449):** Sender-constrained
tokens with nonce validation
- **mTLS:** Certificate-based client authentication
- **Audit Trails:** All authentication events logged to PostgreSQL
- **Multi-Tenancy:** Tenant-scoped token issuance

**Token Types:**
- **OpTok:** Short-lived (15 min), DPoP-bound, scoped access token
- **Refresh Token:** Rotation-protected, 7-day expiry
- **ID Token:** OIDC identity claims

**Security:**
- DPoP nonces stored in Valkey with 30s TTL
- OpTok signatures verified by all resource servers
- Rate limiting on failed login attempts

#### Signer.WebService
**Port:** 8441 (HTTPS with mTLS)
**Dependencies:** Authority (PoE validation), External KMS (optional), OCI Registry (scanner digest verification)

**Responsibilities:**
- **DSSE Signing:** Signs in-toto envelopes for SBOMs, VEX, attestations
- **PoE Validation:** Validates Proof-of-Entitlement (license check)
- **Multi-Profile Keys:** FIPS, GOST (CryptoPro), SM (Chinese national crypto)
- **Scanner Authenticity:** Verifies scanner image digest is Stella Ops-signed
- **Key Management:** HSM/KMS integration (AWS, GCP, PKCS11)

**Hard Gates (Reject on Failure):**
1. OpTok validation (DPoP + mTLS)
2. PoE license check
3. Scanner image digest verification (cosign signature)

**Key Profiles:**

| Profile | Algorithm | Use Case |
|---------|-----------|----------|
| `default` | ECDSA P-256 | Standard signing |
| `fips` | ECDSA P-384 | FIPS 140-2 compliance |
| `gost` | GOST R 34.10-2012 | Russian regulations |
| `sm` | SM2 | Chinese regulations |

#### Attestor.WebService
**Port:** 8442 (HTTPS)
**Dependencies:** Signer (DSSE signing), Rekor v2 (transparency log)

**Responsibilities:**
- **Rekor Submission:** Posts DSSE bundles to Sigstore Rekor v2
- **Receipt Retrieval:** Fetches inclusion proofs from Rekor
- **Offline Bundles:** Generates offline verification bundles
- **Verification:** Validates Rekor receipts for CLI/CI

**Workflow:**
1. Receive DSSE envelope from Scanner/Excititor
2. Call Signer for signature (mTLS)
3. Submit signed DSSE to Rekor v2
4. Retrieve inclusion proof
5. Return receipt to caller

**Offline Mode:**
- Optional: Can operate without Rekor if `OFFLINEKIT_ENABLED=true`
- Uses local timestamp service for non-repudiation

---

### Scanning Engine

#### Scanner.WebService
**Port:** 8444 (HTTP)
**Database:** `scanner` schema
**Object Storage:** RustFS `scanner-artifacts` bucket
**Dependencies:** Authority (auth), Concelier (linkset queries), Policy (evaluation), Signer (DSSE), Attestor (Rekor)

**Responsibilities:**
- **Scan Orchestration:** Enqueues scan jobs to Scanner.Worker via Valkey
- **Report Catalog:** Maintains scan history, triage data, policy verdicts
- **Linkset Enrichment:** Queries Concelier for advisory linksets by PURL/CPE
- **Policy Evaluation:** Calls Policy.Gateway for verdict computation
- **SBOM Export:** Generates SPDX 3.0.1 and CycloneDX 1.6 SBOMs
- **VEX Export:** Calls Excititor for VEX statement generation
- **Proof Bundles:** Assembles DSSE envelopes with signatures + Rekor receipts
- **Event Publishing:** Emits `report.ready` events to Notify via Valkey Streams

**API Endpoints:**

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/v1/scans` | POST | Enqueue scan job |
| `/v1/scans/{id}` | GET | Retrieve scan report |
| `/v1/scans/{id}/sbom` | GET | Download SBOM (SPDX/CycloneDX) |
| `/v1/scans/{id}/vex` | GET | Download VEX document |
| `/v1/scans/{id}/proof` | GET | Download proof bundle (DSSE + Rekor receipt) |
| `/v1/triage` | POST | Mark finding as false positive |

**Queue Pattern:**
- Publishes to Valkey Stream: `scanner:jobs`
- Scanner.Worker consumes via `XREADGROUP`

#### Scanner.Worker
**Database:** `scanner` schema (read EPSS, write inventory)
**Object Storage:** RustFS `scanner-artifacts`, `surface-cache`
**Dependencies:** Scanner.WebService (internal API), RustFS (upload)

**Responsibilities:**
- **Image Pull:** OCI image download and layer extraction
- **Layer Analysis:** Runs OS/language/native analyzers per layer
- **SBOM Generation:** Per-layer SBOMs in SPDX 3.0.1 format
- **Composition:** Merges layer SBOMs into final composed SBOM
- **Reachability Analysis:** Call-graph extraction for Java/Node/Go/Python
- **Artifact Upload:** Uploads SBOMs to RustFS
- **Progress Reporting:** Heartbeat to Scanner.WebService every 10s

**Analyzers:**

| Analyzer | Ecosystem | Method |
|----------|-----------|--------|
| `distro-debian` | Debian/Ubuntu | `dpkg-query`, `apt-cache` |
| `distro-rpm` | RHEL/Fedora/CentOS | RPM database |
| `distro-alpine` | Alpine Linux | APK database |
| `lang-java` | Java/Maven/Gradle | JAR manifests, `pom.xml`, `build.gradle` |
| `lang-node` | Node.js/npm | `package.json`, `package-lock.json` |
| `lang-python` | Python/pip | `requirements.txt`, `Pipfile`, wheel metadata |
| `lang-go` | Golang | `go.mod`, binary parsing |
| `native` | C/C++ | ELF symbol tables, version symbols |

**Reachability:**
- Builds call graph (CG) with nodes/edges in PostgreSQL `cg_node`, `cg_edge`
- Determines if vulnerable functions are callable from entrypoints
- Flags findings as `REACHABLE`/`UNREACHABLE`/`UNKNOWN`

---

### Advisory Ingestion

#### Concelier.WebService
**Port:** 8445 (HTTP)
**Database:** `vuln` schema (`advisory_raw`, `linksets`)
**Dependencies:** Scheduler (webhook for delta events), Upstream sources (connectors)

**Responsibilities:**
- **Advisory Ingestion:** Fetches vulnerabilities from NVD, Red Hat, Debian, Ubuntu, GitHub, etc.
- **Normalization:** Converts vendor formats to canonical Concelier advisory JSON
- **Linkset Computation:** Maps CVE IDs to PURLs/CPEs with version ranges
- **AOC Enforcement:** Append-only writes to `advisory_raw` (immutable after insert)
- **Delta Detection:** Detects new advisories and emits webhook to Scheduler
- **Merge Engine:** Deduplicates advisories across sources with priority rules

**Connectors:**

| Connector | Source | Update Frequency |
|-----------|--------|------------------|
| `nvd` | NVD CVE JSON | Hourly |
| `redhat` | Red Hat OVAL | Every 6 hours |
| `debian` | Debian Security Tracker | Every 6 hours |
| `ubuntu` | Ubuntu CVE Tracker | Every 6 hours |
| `github` | GitHub Advisory Database | Hourly |
| `alpine` | Alpine SecDB | Every 6 hours |
| `osv` | OSV.dev | Hourly |

**Linkset API:**
- `/v1/lnm/linksets/{advisoryId}` - Returns PURL/CPE mappings for a CVE
- Consumed by Scanner for enrichment

**PostgreSQL Logical Replication:**
- `advisory_raw_stream` → Policy Engine (tenant-scoped replication)

#### Concelier.Worker
**Dependencies:** Concelier.WebService (internal API), Upstream advisory sources

**Responsibilities:**
- **Scheduled Fetching:** Polls connectors on cron schedules
- **Delta Computation:** Compares fetched data with last snapshot
- **Advisory Normalization:** Parses OVAL, JSON, XML into canonical format
- **Database Insert:** Writes to `advisory_raw` via Concelier.WebService API

---

### VEX Ingestion

#### Excititor.WebService
**Port:** 8446 (HTTP)
**Database:** `vex` schema (`vex_raw`, `consensus`)
**Dependencies:** IssuerDirectory (trust verification), Scheduler (webhook for delta events)

**Responsibilities:**
- **VEX Ingestion:** Fetches OpenVEX and CSAF VEX documents from vendors
- **DSSE Verification:** Validates in-toto signatures on VEX statements
- **Trust Scoring:** Applies trust weights to issuers from IssuerDirectory
- **Consensus Computation:** Resolves conflicts between multiple VEX statements
- **AOC Enforcement:** Append-only writes to `vex_raw` (immutable after insert)
- **Delta Detection:** Detects new VEX statements and emits webhook to Scheduler

**VEX Sources:**

| Source | Format | Signature |
|--------|--------|-----------|
| Red Hat VEX | CSAF VEX | PGP-signed |
| CISA VEX | OpenVEX | DSSE in-toto |
| Vendor VEX | OpenVEX | DSSE in-toto |

**Consensus Algorithm:**
- Weighted voting based on issuer trust scores
- Tie-breaking: Most conservative status wins (e.g., `affected` > `not_affected`)
- Result stored in `consensus` table with provenance

**PostgreSQL Logical Replication:**
- `vex_raw_stream` → Policy Engine (tenant-scoped replication)

#### Excititor.Worker
**Dependencies:** Excititor.WebService (internal API), IssuerDirectory (trust lookup)

**Responsibilities:**
- **Scheduled Fetching:** Polls VEX sources on cron schedules
- **Signature Verification:** Validates DSSE envelopes via IssuerDirectory
- **Trust Verification:** Checks issuer is in trusted list
- **Database Insert:** Writes to `vex_raw` via Excititor.WebService API

---

### Policy Engine

#### Policy.Gateway
**Port:** 8447 (HTTP)
**Database:** `policy` schema (`exception_objects`, `snapshots`, `unknowns`)
**Dependencies:** Policy Engine (OPA/Rego), Authority (auth)

**Responsibilities:**
- **Policy Evaluation Gateway:** Proxies requests to OPA/Rego engine
- **Exception Management:** Stores approved false positives, waivers
- **Approval Workflows:** Multi-stage approval for policy exceptions
- **Delta Computation:** Compares baseline vs.
current scan for policy drift
- **Unknowns Tracking:** Records unresolved CVEs (no fix available)

**API Endpoints:**

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/v1/policy/evaluate` | POST | Evaluate policy against scan results |
| `/v1/policy/exceptions` | POST | Create exception request |
| `/v1/policy/exceptions/{id}/approve` | POST | Approve exception |
| `/v1/policy/unknowns` | GET | List unresolved findings |

**Policy Data Sources:**
- PostgreSQL logical replication from `advisory_raw_stream`, `vex_raw_stream`
- Real-time advisory and VEX data for policy eval

#### Policy Engine (OPA/Rego)
**Container:** Separate OPA container, called via HTTP by Policy.Gateway
**Language:** Rego policies
**Data Sources:** PostgreSQL logical replication streams

**Policies:**
- `unknowns-budget.rego` - Limits unresolved CVEs (no fix available)
- `severity-gates.rego` - Blocks based on CVSS severity
- `reachability-gates.rego` - Allows unreachable findings
- `vex-override.rego` - Applies VEX `not_affected` status

---

### Orchestration & Scheduling

#### Scheduler.WebService
**Port:** 8448 (HTTP)
**Database:** `scheduler` schema (`graph_jobs`, `runs`, `schedules`, `impact_snapshots`)
**Dependencies:** Scanner (re-scan requests), Cartographer (export notifications, optional)

**Responsibilities:**
- **Impact Selection:** When advisories/VEX change, identifies affected images via BOM-Index
- **Re-scan Orchestration:** Enqueues re-scan jobs to Scanner.WebService
- **Rate Limiting:** Enforces max concurrent scans, maintenance windows
- **Schedule Management:** Manages periodic scan schedules (cron)
- **Webhook Ingestion:** Receives delta events from Concelier, Excititor

**Webhook Endpoints:**

| Endpoint | Source | Payload |
|----------|--------|---------|
| `/webhooks/concelier` | Concelier | Advisory delta event |
| `/webhooks/excititor` | Excititor | VEX delta event |

**Impact Selection Algorithm:**
1. Receive advisory delta (CVE IDs added)
2. Query BOM-Index for images containing affected PURLs
3. Batch impacted images (max 100 per run)
4. Enforce rate limits and maintenance windows
5. Enqueue re-scans to Scanner.WebService

#### Scheduler.Worker
**Dependencies:** Scheduler.WebService (internal API), Scanner.WebService (HTTP)

**Responsibilities:**
- **Job Execution:** Claims jobs from Scheduler.WebService
- **Batch Processing:** Processes impacted image batches
- **Re-scan Trigger:** HTTP POST to Scanner `/v1/scans` with `rescan=true`
- **Progress Reporting:** Heartbeat to Scheduler every 10s

#### Orchestrator.WebService
**Port:** 8449 (HTTP)
**Database:** `orchestrator` schema (`sources`, `runs`, `jobs`, `dags`, `pack_runs`)

**Responsibilities:**
- **DAG Workflows:** Manages directed acyclic graph job dependencies
- **Pack Runs:** Bundles multiple jobs into atomic runs
- **Job Streaming:** WebSocket endpoints for real-time job status
- **Worker Coordination:** Job claim, heartbeat, completion tracking

**Use Cases:**
- Complex multi-step workflows (e.g., scan → policy → VEX → attest)
- Batch operations (e.g., scan all images in namespace)

---

### Notification Engine

#### Notify.WebService
**Port:** 8450 (HTTP)
**Database:** `notify` schema (`channels`, `templates`, `delivery_history`, `digest_state`)
**Dependencies:** Valkey (delivery queue), Scanner (event subscription)

**Responsibilities:**
- **Channel Management:** Configures Slack, Teams, Email, Webhook channels
- **Template Engine:** Renders notification templates with Liquid syntax
- **Throttling:** Rate limits notifications (max N per hour per channel)
- **Digest Mode:** Batches notifications into hourly/daily digests
- **Event Subscription:** Subscribes to `report.ready` events from Scanner

**Channel Types:**

| Channel | Protocol | Configuration |
|---------|----------|---------------|
| Slack | HTTP (Slack API) | Bot token, channel ID |
| Teams | HTTP (webhook) | Webhook URL |
| Email | SMTP | SMTP server, credentials |
| Webhook | HTTP | URL, auth headers |

**Delivery Queue:**
- Publishes to Valkey Stream: `notify:delivery`
- Notify.Worker consumes via `XREADGROUP`

#### Notify.Worker
**Dependencies:** Notify.WebService (internal API), External services (Slack/Teams/SMTP)

**Responsibilities:**
- **Job Claim:** Claims delivery jobs from Valkey queue
- **Template Rendering:** Renders Liquid templates with event data
- **Delivery Execution:** HTTP/SMTP delivery with retries (exponential backoff)
- **Idempotency:** Tracks delivery IDs to prevent duplicates
- **SLO Tracking:** Records delivery latency for P95 monitoring

**Retry Policy:**
- Max 3 retries
- Backoff: 1s, 5s, 15s
- Dead-letter queue after exhaustion

---

### Cryptographic Services

#### Signer.WebService
**Port:** 8441 (HTTPS with mTLS)
**Dependencies:** Authority (PoE validation), External KMS (AWS/GCP/PKCS11), OCI Registry (digest verification)

**Responsibilities:**
- **DSSE Signing:** Signs in-toto envelopes (SBOMs, VEX, attestations)
- **PoE Validation:** License check via Authority introspection
- **Scanner Authenticity:** Verifies scanner image digest is Stella Ops-signed
- **Multi-Profile Keys:** FIPS, GOST, SM for regulatory compliance
- **Key Rotation:** Automated key rotation with overlap period

**Hard Gates:**
1. OpTok validation (DPoP + mTLS)
2. PoE license check (fails if expired)
3. Scanner image digest verification (must be cosign-signed by Stella Ops)

**Key Storage:**

| Storage | Use Case |
|---------|----------|
| In-memory | Development |
| PKCS11 HSM | On-prem production |
| AWS KMS | AWS cloud deployments |
| GCP KMS | GCP cloud deployments |

#### Attestor.WebService
**Port:** 8442 (HTTPS)
**Dependencies:** Signer (DSSE signing), Rekor v2 (transparency log)

**Responsibilities:**
- **Rekor Submission:** Posts DSSE bundles to Sigstore Rekor v2
- **Receipt Retrieval:** Fetches inclusion proofs (Merkle tree path)
- **Offline Bundles:** Packages DSSE + Rekor receipt for airgap verification
- **Verification API:** Validates Rekor receipts for CLI/CI

**Workflow:**
1. Receive DSSE envelope from Scanner/Excititor
2. Call Signer for signature (mTLS with OpTok)
3. Submit signed DSSE to Rekor v2 (`/api/v2/entries`)
4. Retrieve inclusion proof from Rekor
5. Return proof bundle to caller

**Offline Mode:**
- When `OFFLINEKIT_ENABLED=true`:
  - Uses local timestamp service (no Rekor)
  - Bundles DSSE + TSA timestamp + trust anchors
  - Suitable for airgap deployments

---

### Supporting Services

#### IssuerDirectory.WebService
**Port:** 8451 (HTTP)
**Database:** None (read-only configuration)

**Responsibilities:**
- **Trusted Issuer Registry:** Maintains list of authorized VEX/SBOM signers
- **Trust Weights:** Assigns numerical trust scores (0.0 - 1.0) to issuers
- **Seed Data:** CSAF trusted providers from official lists

**Issuer Manifest:**
```json
{
  "issuers": [
    {
      "id": "redhat",
      "name": "Red Hat Product Security",
      "publicKey": "-----BEGIN PUBLIC KEY-----...",
      "trustWeight": 0.95
    },
    {
      "id": "cisa",
      "name": "CISA Cybersecurity",
      "publicKey": "-----BEGIN PUBLIC KEY-----...",
      "trustWeight": 1.0
    }
  ]
}
```

**API:**
- `/v1/issuers` - List all trusted issuers
- `/v1/issuers/{id}` - Get issuer details

---

## Communication Patterns

### 1. Scan Request Flow

```
CLI/UI
  │
  │ POST /v1/scans
  │ { "imageRef": "alpine:latest" }
  ▼
Gateway.WebService
  │
  │ 1. Validate JWT + DPoP
  │ 2. Check rate limits
  │ 3. Route to Scanner
  ▼
Scanner.WebService
  │
  │ 1. Create scan record in PostgreSQL
  │ 2. XADD to Valkey: scanner:jobs
  │
  │ ◄──────────────────────────┐
  │                            │
  ▼                            │
Valkey Stream                  │
scanner:jobs                   │
  │                            │
  │ XREADGROUP (consumer group)│
  ▼                            │
Scanner.Worker                 │
  │                            │
  │ 1. Pull OCI image          │
  │ 2. Extract layers          │
  │ 3. Run analyzers           │
  │ 4. Generate SBOMs          │
  │ 5. Upload to RustFS        │
  │ 6. Query Concelier for linksets
  │    └─► HTTP GET /v1/lnm/linksets/{cveId}
  │                            │
  │ 7. Heartbeat ──────────────┘
  │    POST /internal/jobs/{id}/heartbeat
  │
  │ 8. Complete
  │    POST /internal/jobs/{id}/complete
  ▼
Scanner.WebService
  │
  │ 1. Update scan record (status=completed)
  │ 2. Call Policy.Gateway for verdict
  │    POST /v1/policy/evaluate
  │    └─► Policy.Gateway
  │        └─► Policy Engine (OPA)
  │            └─► PostgreSQL (advisory_raw_stream)
  │
  │ 3. Call Signer for DSSE signature
  │    POST /v1/sign (mTLS + OpTok)
  │    └─► Signer.WebService
  │        ├─► Validate PoE (license)
  │        ├─► Verify scanner digest (cosign)
  │        └─► Sign DSSE envelope
  │
  │ 4. Call Attestor for Rekor submission
  │    POST /v1/attest
  │    └─► Attestor.WebService
  │        ├─► Submit to Rekor v2
  │        └─► Retrieve inclusion proof
  │
  │ 5. Store proof bundle in RustFS
  │ 6. XADD to Valkey: events:report.ready
  │
  ▼
Valkey Stream
events:report.ready
  │
  │ XREADGROUP
  ▼
Notify.WebService
  │
  │ 1. Render template
  │ 2. XADD to Valkey: notify:delivery
  │
  ▼
Valkey Stream
notify:delivery
  │
  │ XREADGROUP
  ▼
Notify.Worker
  │
  │ 1. Claim delivery job
  │ 2. HTTP POST to Slack API
  │ 3. Mark complete
  ▼
Slack Channel
```

**Communication Summary:**
1. **Gateway → Scanner:** HTTP POST (JWT + DPoP auth)
2. **Scanner → Valkey:** XADD (queue job)
3. **Worker → Valkey:** XREADGROUP (consume job)
4. **Worker → Scanner:** HTTP POST (heartbeat, completion)
5. **Worker → Concelier:** HTTP GET (linkset query)
6. **Scanner → Policy:** HTTP POST (policy eval)
7. **Scanner → Signer:** HTTP POST mTLS (DSSE signing)
8. **Scanner → Attestor:** HTTP POST (Rekor submission)
9. **Scanner → Valkey:** XADD (event publish)
10. **Notify → Valkey:** XREADGROUP (event consume)
11. **Notify Worker → Slack:** HTTP POST (delivery)

---

### 2. Advisory Update Flow

```
Concelier.Worker (cron: every hour)
  │
  │ 1. Fetch NVD CVE JSON feed
  │    HTTPS GET https://services.nvd.nist.gov/rest/json/cves/2.0
  │
  │ 2. Parse and normalize
  │ 3. POST to Concelier.WebService
  │    POST /internal/ingest
  ▼
Concelier.WebService
  │
  │ 1. Validate advisory format
  │ 2. Compute linksets (CVE → PURL/CPE)
  │ 3. INSERT INTO vuln.advisory_raw (AOC: append-only)
  │ 4. Detect delta (new CVEs)
  │ 5. Webhook POST to Scheduler
  │    POST /webhooks/concelier
  │    {
  │      "cveIds": ["CVE-2024-1234"],
  │      "timestamp": "2025-12-23T12:00:00Z"
  │    }
  │
  ▼
Scheduler.WebService
  │
  │ 1. Query BOM-Index for impacted images
  │    SELECT DISTINCT image_ref
  │    FROM scanner.inventory
  │    WHERE purl IN (
  │      SELECT purl FROM vuln.linksets
  │      WHERE cve_id IN ('CVE-2024-1234')
  │    )
  │
  │ 2. Batch results (max 100 images/run)
  │ 3. Enforce rate limits (max 10 scans/min)
  │ 4. Enqueue to Scheduler.Worker
  │
  ▼
Scheduler.Worker
  │
  │ 1. Claim job from Scheduler
  │ 2. For each image:
  │    POST /v1/scans
  │    {
  │      "imageRef": "alpine:latest",
  │      "rescan": true,
  │      "reason": "advisory-delta"
  │    }
  │    └─► Scanner.WebService
  │        └─► [Standard scan flow]
  │
  │ 3. Heartbeat to Scheduler
  │ 4. Complete job
  ▼
Scanner.WebService
(Re-scan executes, new report generated)
```

**Communication Summary:**
1. **Concelier.Worker → NVD:** HTTPS GET (fetch advisories)
2. **Concelier.Worker → Concelier.Web:** HTTP POST (ingest)
3. **Concelier.Web → PostgreSQL:** INSERT (advisory storage)
4. **Concelier.Web → Scheduler:** HTTP POST webhook (delta event)
5. **Scheduler → PostgreSQL:** SELECT (BOM-Index query for impacted images)
6. **Scheduler.Worker → Scanner:** HTTP POST (re-scan requests)

---

### 3. VEX Update Flow

```
Excititor.Worker (cron: every 6 hours)
  │
  │ 1. Fetch Red Hat CSAF VEX feed
  │    HTTPS GET https://www.redhat.com/security/data/csaf/
  │
  │ 2. Parse CSAF JSON
  │ 3. Verify PGP signature
  │ 4. POST to Excititor.WebService
  │    POST /internal/ingest
  ▼
Excititor.WebService
  │
  │ 1. Verify DSSE signature
  │    └─► IssuerDirectory.WebService
  │        GET /v1/issuers/{issuerId}
  │        (Retrieve public key + trust weight)
  │
  │ 2. Validate signature with issuer public key
  │ 3. INSERT INTO vex.vex_raw (AOC: append-only)
  │ 4. Compute consensus (if multiple VEX for same CVE)
  │    UPDATE vex.consensus
  │ 5. Detect delta (new VEX statements)
  │ 6. Webhook POST to Scheduler
  │    POST /webhooks/excititor
  │    {
  │      "cveIds": ["CVE-2024-5678"],
  │      "status": "not_affected",
  │      "timestamp": "2025-12-23T18:00:00Z"
  │    }
  │
  ▼
Scheduler.WebService
  │
  │ 1. Query BOM-Index for impacted images
  │    (Same as advisory flow, but for VEX changes)
  │
  │ 2. Enqueue analysis-only jobs
  │    (No full re-scan, just re-evaluate policy with new VEX)
  │
  ▼
Scheduler.Worker
  │
  │ For each image:
  │   POST /v1/scans/{scanId}/reanalyze
  │   └─► Scanner.WebService
  │       └─► Policy.Gateway (re-evaluate with new VEX)
  ▼
Scanner.WebService
(Policy re-evaluation, verdict updated)
```

**Communication Summary:**
1. **Excititor.Worker → VEX source:** HTTPS GET (fetch VEX)
2. **Excititor.Worker → Excititor.Web:** HTTP POST (ingest)
3. **Excititor.Web → IssuerDirectory:** HTTP GET (trust verification)
4. **Excititor.Web → PostgreSQL:** INSERT (VEX storage)
5. **Excititor.Web → Scheduler:** HTTP POST webhook (delta event)
6. **Scheduler.Worker → Scanner:** HTTP POST (re-analyze request)

---

### 4. Notification Delivery Flow

```
Scanner.WebService
(Scan completed, verdict computed)
  │
  │ XADD to Valkey Stream
  │ events:report.ready
  │ {
  │   "scanId": "scan-123",
  │   "imageRef": "alpine:latest",
  │   "verdict": "FAIL",
  │   "criticalCount": 3
  │ }
  ▼
Valkey Stream: events:report.ready
  │
  │ XREADGROUP (consumer group: notify-delivery)
  ▼
Notify.WebService
  │
  │ 1. SELECT channel config from PostgreSQL
  │    (Slack, Teams, Email channels)
  │
  │ 2. SELECT template from PostgreSQL
  │    (Liquid template: "New vulnerabilities found...")
  │
  │ 3. Check throttle limits
  │    (Max 10 notifications/hour per channel)
  │
  │ 4. Render template with event data
  │
  │ 5. XADD to Valkey Stream
  │    notify:delivery
  │    {
  │      "channelId": "slack-security",
  │      "renderedMessage": "🚨 Critical: 3 vulns in alpine:latest",
  │      "deliveryId": "delivery-456"
  │    }
  ▼
Valkey Stream: notify:delivery
  │
  │ XREADGROUP (consumer group: notify-workers)
  ▼
Notify.Worker
  │
  │ 1. Claim delivery job
  │ 2. Check idempotency (deliveryId seen before?)
  │ 3. HTTP POST to Slack API
  │    POST https://slack.com/api/chat.postMessage
  │    {
  │      "channel": "#security",
  │      "text": "🚨 Critical: 3 vulns in alpine:latest",
  │      "attachments": [...]
  │    }
  │
  │ 4. Record delivery in PostgreSQL
  │    INSERT INTO notify.delivery_history
  │ 5. XACK to Valkey (mark complete)
  ▼
Slack Channel #security
```

**Communication Summary:**
1. **Scanner → Valkey:** XADD (publish event)
2. **Notify.Web → Valkey:** XREADGROUP (consume event)
3. **Notify.Web → PostgreSQL:** SELECT (channel config, template)
4. **Notify.Web → Valkey:** XADD (queue delivery job)
5. **Notify.Worker → Valkey:** XREADGROUP (consume delivery job)
6. **Notify.Worker → Slack API:** HTTP POST (deliver notification)
7. **Notify.Worker → PostgreSQL:** INSERT (delivery history)
8. **Notify.Worker → Valkey:** XACK (acknowledge completion)

---

### 5. Policy Evaluation Flow

```
Scanner.WebService
(Scan completed, SBOM generated)
  │
  │ POST /v1/policy/evaluate
  │ {
  │   "scanId": "scan-123",
  │   "findings": [
  │     {
  │       "cveId": "CVE-2024-1234",
  │       "purl": "pkg:alpine/openssl@3.0.1",
  │       "severity": "CRITICAL",
  │       "reachability": "REACHABLE"
  │     }
  │   ]
  │ }
  ▼
Policy.Gateway
  │
  │ 1. SELECT exceptions from PostgreSQL
  │    (Check for approved false positives)
  │
  │ 2. 
POST /v1/policy/eval to Policy Engine │ └─► Policy Engine (OPA/Rego) │ │ │ │ Data sources: │ ├─► PostgreSQL logical replication │ │ • advisory_raw_stream (advisory data) │ │ • vex_raw_stream (VEX data) │ │ │ │ Policy rules: │ ├─► unknowns-budget.rego │ │ (Limit unresolved CVEs to max 10) │ ├─► severity-gates.rego │ │ (Block CRITICAL, allow HIGH with approval) │ ├─► reachability-gates.rego │ │ (Allow UNREACHABLE findings) │ └─► vex-override.rego │ (If VEX status = not_affected, allow) │ │ 3. Return verdict │ { │ "verdict": "FAIL", │ "blockedFindings": [ │ { │ "cveId": "CVE-2024-1234", │ "reason": "CRITICAL severity + REACHABLE" │ } │ ], │ "allowedFindings": [ │ { │ "cveId": "CVE-2024-5678", │ "reason": "VEX not_affected" │ } │ ] │ } ▼ Scanner.WebService │ │ 1. Store verdict in PostgreSQL │ UPDATE scanner.scan_manifests │ SET verdict = 'FAIL' │ 2. Return to caller ▼ CLI/UI ``` **Communication Summary:** 1. **Scanner → Policy.Gateway:** HTTP POST (policy eval request) 2. **Policy.Gateway → PostgreSQL:** SELECT (exceptions) 3. **Policy.Gateway → Policy Engine:** HTTP POST (OPA eval) 4. **Policy Engine → PostgreSQL:** Logical replication read (advisory/VEX data) 5. **Policy.Gateway → Scanner:** HTTP 200 (verdict response) 6. 
**Scanner → PostgreSQL:** UPDATE (store verdict) --- ## Database Schema Isolation Each service has a dedicated PostgreSQL schema for strict isolation: ### authority **Owner:** Authority.WebService **Tables:** | Table | Purpose | |-------|---------| | `users` | User accounts (LDAP-synced or local) | | `clients` | OAuth2 clients (service accounts) | | `tenants` | Multi-tenant organization data | | `keys` | Signing keys (JWK format) | | `tokens` | OpTok refresh tokens (rotation-protected) | | `audit_log` | Authentication/authorization events | | `dpop_nonces` | (Migrated to Valkey for performance) | **Indexes:** - `users.email` (unique) - `clients.client_id` (unique) - `tenants.slug` (unique) - `audit_log.timestamp, tenant_id` (composite) ### scanner **Owner:** Scanner.WebService **Tables:** | Table | Purpose | |-------|---------| | `scan_manifests` | Scan metadata, status, verdicts | | `proof_bundles` | DSSE envelopes + Rekor receipts | | `triage` | False positives, waiver approvals | | `epss` | EPSS scores (daily refresh from FIRST.org) | | `cg_node` | Call graph nodes (functions) | | `cg_edge` | Call graph edges (function calls) | | `inventory` | Package inventory (PURL → scan mapping) | **Indexes:** - `scan_manifests.image_ref, created_at` (composite) - `inventory.purl` (GIN index for LIKE queries) - `cg_node.function_signature` (unique) - `cg_edge.source_id, target_id` (composite) ### vuln **Owner:** Concelier.WebService **Tables:** | Table | Purpose | |-------|---------| | `advisory_raw` | Immutable advisory documents (AOC) | | `linksets` | CVE → PURL/CPE mappings with version ranges | | `observations` | Merge conflicts, priority overrides | **Logical Replication:** - `advisory_raw_stream` → Policy Engine (tenant-scoped) **Indexes:** - `advisory_raw.cve_id` (GIN array index) - `linksets.cve_id, purl` (composite) ### vex **Owner:** Excititor.WebService **Tables:** | Table | Purpose | |-------|---------| | `vex_raw` | Immutable VEX statements (AOC) | | `consensus` 
| Resolved VEX status (weighted voting) | | `provider_state` | Last-fetch timestamps per VEX source | **Logical Replication:** - `vex_raw_stream` → Policy Engine (tenant-scoped) **Indexes:** - `vex_raw.cve_id, issuer_id` (composite) - `consensus.cve_id` (unique) ### scheduler **Owner:** Scheduler.WebService **Tables:** | Table | Purpose | |-------|---------| | `graph_jobs` | Re-scan job definitions (advisory/VEX delta) | | `runs` | Job run instances (status, progress) | | `schedules` | Cron schedules for periodic scans | | `impact_snapshots` | BOM-Index query results (cached) | **Indexes:** - `runs.job_id, created_at` (composite) - `impact_snapshots.cve_id` (GIN array index) ### notify **Owner:** Notify.WebService **Tables:** | Table | Purpose | |-------|---------| | `channels` | Slack, Teams, Email, Webhook configs | | `templates` | Liquid templates for notifications | | `delivery_history` | Sent notifications (idempotency, SLO tracking) | | `digest_state` | Digest accumulation (hourly/daily batches) | **Indexes:** - `delivery_history.delivery_id` (unique) - `delivery_history.channel_id, created_at` (composite) ### policy **Owner:** Policy.Gateway **Tables:** | Table | Purpose | |-------|---------| | `exception_objects` | Approved false positives, waivers | | `snapshots` | Policy baseline snapshots for delta | | `unknowns` | Unresolved CVEs (no fix available) | **Indexes:** - `exception_objects.cve_id, image_ref` (composite) - `unknowns.cve_id` (unique) ### orchestrator **Owner:** Orchestrator.WebService **Tables:** | Table | Purpose | |-------|---------| | `sources` | Job sources (Git repos, webhooks) | | `runs` | Orchestrated run instances | | `jobs` | Individual jobs within runs | | `dags` | Job dependency graphs | | `pack_runs` | Atomic multi-job bundles | **Indexes:** - `jobs.run_id, status` (composite) - `dags.parent_job_id, child_job_id` (composite) --- ## Security Boundaries ### Authentication & Authorization **All services** enforce: 1. 
**JWT Validation:** OpTok signature verification (RS256/ES256) 2. **DPoP Verification:** Sender constraint validation (RFC 9449) 3. **Scope-Based Access:** RBAC claims in OpTok (`scan:read`, `policy:write`, etc.) 4. **Tenant Isolation:** All queries filtered by `tenant_id` from OpTok **Authority Hard Gates:** - DPoP nonce must be unused (30s TTL in Valkey) - OpTok expiry < 15 minutes from issue - mTLS certificate must match client_id **Signer Hard Gates:** - PoE (Proof of Entitlement) must be valid license - Scanner image digest must be cosign-signed by Stella Ops - OpTok must have `sign:dsse` scope ### Network Segmentation **Production Deployment:** ``` ┌─────────────────────────────────────────────────────────────┐ │ PUBLIC INTERNET │ └──────────────────────┬──────────────────────────────────────┘ │ │ HTTPS (TLS 1.3) ▼ ┌─────────────────────────────────────────────────────────────┐ │ LOAD BALANCER / WAF │ │ • Rate limiting (IP-based) │ │ • DDoS protection │ │ • TLS termination │ └──────────────────────┬──────────────────────────────────────┘ │ │ Internal HTTP ▼ ┌─────────────────────────────────────────────────────────────┐ │ DMZ - Gateway Layer │ │ ┌────────────────────────────────────────┐ │ │ │ Gateway.WebService │ │ │ │ • JWT + DPoP validation │ │ │ │ • Tenant resolution │ │ │ └────────────────────────────────────────┘ │ └──────────────────────┬──────────────────────────────────────┘ │ │ Internal mTLS (optional) ▼ ┌─────────────────────────────────────────────────────────────┐ │ APPLICATION LAYER (Internal) │ │ • Scanner.WebService │ │ • Concelier.WebService │ │ • Policy.Gateway │ │ • Scheduler.WebService │ │ • Notify.WebService │ │ • Orchestrator.WebService │ │ │ │ Network Policy: Only Gateway can initiate connections │ └──────────────────────┬──────────────────────────────────────┘ │ │ PostgreSQL protocol (TLS) │ Valkey protocol (TLS optional) │ S3 API (HTTPS) ▼ ┌─────────────────────────────────────────────────────────────┐ │ DATA LAYER (Isolated Subnet) │ 
│  • PostgreSQL (private IP only)                             │
│  • Valkey (private IP only)                                 │
│  • RustFS (private IP only)                                 │
│                                                             │
│  Network Policy: No outbound internet, inbound from app     │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│            PRIVILEGED SERVICES (Separate Subnet)            │
│  • Authority (TLS 8440)                                     │
│  • Signer (mTLS 8441)                                       │
│  • Attestor (HTTPS 8442)                                    │
│                                                             │
│  Network Policy: mTLS required, audit all access            │
└─────────────────────────────────────────────────────────────┘
```

### Data Encryption

**At Rest:**

- PostgreSQL: Transparent Data Encryption (TDE) or LUKS full-disk
- Valkey: No encryption (ephemeral data only, 30s max TTL for DPoP nonces)
- RustFS: Server-side encryption (SSE-S3 or AES-256)

**In Transit:**

- External: TLS 1.3 (Gateway → clients)
- Internal: Optional mTLS (Gateway → services)
- PostgreSQL: TLS (required in production)
- Valkey: TLS optional (recommend enabled)
- RustFS: HTTPS (required)

### Audit Logging

**All services log to PostgreSQL:**

| Event | Service | Table |
|-------|---------|-------|
| Authentication | Authority | `authority.audit_log` |
| Authorization denials | Gateway | `authority.audit_log` |
| DSSE signing | Signer | `authority.audit_log` (via OpTok validation) |
| Policy exceptions | Policy.Gateway | `policy.exception_objects` (approval trail) |
| Scan triggers | Scanner | `scanner.scan_manifests` (audit columns) |

**Audit Trail Requirements (SOC 2):**

- Who (user/client ID)
- What (action performed)
- When (ISO 8601 timestamp)
- Where (tenant ID, IP address)
- Result (success/failure, reason)

**Retention:**

- Audit logs: 90 days minimum (configurable per tenant)
- Compliance mode: 7 years retention for regulated industries

---

## Summary

**Key Architectural Principles:**

1. **Schema Isolation:** Each service owns its PostgreSQL schema, no cross-schema foreign keys
2. **Event-Driven:** Valkey Streams for async communication (scan jobs, notifications)
3. **Webhook Integration:** Concelier/Excititor → Scheduler for delta events
4. **Append-Only Data:** AOC for advisories and VEX (immutable, audit-friendly)
5. **Strong Authentication:** JWT + DPoP for all API calls, OpTok for service-to-service
6. **Hard Gates:** Signer enforces licensing and scanner authenticity
7. **Multi-Tenancy:** Tenant ID in all data, tenant-scoped logical replication
8. **Transparency:** Rekor v2 for public auditability, offline bundles for airgap

**Communication Patterns:**

| Pattern | Technology | Use Case |
|---------|------------|----------|
| Synchronous HTTP | REST APIs | Scanner → Concelier linkset queries |
| Asynchronous Queue | Valkey Streams | Scanner jobs, Notify delivery |
| Event Publishing | Valkey Streams | `report.ready`, `drift.detected` |
| Webhooks | HTTP POST | Concelier/Excititor → Scheduler |
| Database Replication | PostgreSQL Logical Replication | Policy Engine advisory/VEX data |
| Object Storage | S3 API (RustFS) | SBOM artifacts, proof bundles |

**Security Model:**

- **Gateway:** Enforces authentication, authorization, rate limiting
- **Authority:** Issues OpToks with DPoP binding (sender constraint)
- **Signer:** Hard gates on PoE and scanner authenticity
- **Tenant Isolation:** All queries filtered by `tenant_id`
- **Audit Trails:** All privileged actions logged to PostgreSQL

This architecture provides **deterministic, reproducible vulnerability scanning** with **strong cryptographic provenance** (DSSE + Rekor), **multi-tenant isolation**, and **VEX-first decisioning** for exploitability analysis.

---

**For More Information:**

- [Developer Onboarding](./DEVELOPER_ONBOARDING.md) - Quick start guide
- [High-Level Architecture](./07_HIGH_LEVEL_ARCHITECTURE.md) - Business-level overview
- [API/CLI Reference](./09_API_CLI_REFERENCE.md) - Endpoint documentation
- [Offline Kit](./24_OFFLINE_KIT.md) - Airgap deployment guide
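
**Illustrative sketch: VEX-first gating**

To make the VEX-first decisioning described in the Policy Evaluation Flow more concrete, the gates can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the platform evaluates these rules in OPA/Rego, not Python. The `Finding` type, the `approved` flag, the `decide` and `evaluate` helpers, the rule precedence (VEX override first, then reachability, then severity), and the `"PASS"` verdict value are all hypothetical and chosen for illustration. The unknowns-budget rule is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    cve_id: str
    severity: str                      # e.g. "CRITICAL", "HIGH"
    reachability: str                  # e.g. "REACHABLE", "UNREACHABLE"
    vex_status: Optional[str] = None   # e.g. "not_affected"; None when no VEX exists
    approved: bool = False             # hypothetical: an approved exception exists


def decide(f: Finding) -> tuple[bool, str]:
    """Return (allowed, reason). The rule order is an assumption of this sketch."""
    if f.vex_status == "not_affected":      # mirrors vex-override.rego
        return True, "VEX not_affected"
    if f.reachability == "UNREACHABLE":     # mirrors reachability-gates.rego
        return True, "UNREACHABLE"
    if f.severity == "CRITICAL":            # mirrors severity-gates.rego
        return False, f"CRITICAL severity + {f.reachability}"
    if f.severity == "HIGH" and not f.approved:
        return False, "HIGH severity without approval"
    return True, "within policy"


def evaluate(findings: list[Finding]) -> dict:
    """Build a verdict shaped like the response shown in the Policy Evaluation Flow."""
    blocked, allowed = [], []
    for f in findings:
        ok, reason = decide(f)
        (allowed if ok else blocked).append({"cveId": f.cve_id, "reason": reason})
    return {
        # "PASS" is an assumed value; the document only shows "FAIL".
        "verdict": "FAIL" if blocked else "PASS",
        "blockedFindings": blocked,
        "allowedFindings": allowed,
    }
```

Under these assumptions, a CRITICAL and REACHABLE finding without VEX data is blocked, while the same finding with a `not_affected` VEX statement is allowed. This matches the example verdict in the flow.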