# VexHub Architecture

> **Scope.** Architecture and operational contract for the VexHub aggregation service that normalizes, validates, and distributes VEX statements with deterministic, offline-friendly outputs.

## 1) Purpose
VexHub collects VEX statements from multiple upstream sources, validates and normalizes them, detects conflicts, and exposes a distribution API for internal services and external tools (Trivy/Grype). It is the canonical aggregation layer that feeds VexLens trust scoring and Policy Engine decisioning.

## 2) Responsibilities
- Scheduled ingestion of upstream VEX sources (connectors + mirrored feeds).
- Canonical normalization to OpenVEX-compatible structures.
- Validation pipeline (schema + signature/provenance checks).
- Conflict detection and provenance capture.
- Distribution API for CVE/PURL/source queries and bulk exports.

Non-goals: policy decisioning (Policy Engine), consensus computation (VexLens), raw ingestion guardrails (Excititor AOC).

## 3) Component Model
- **VexHub.WebService**: Minimal API host for distribution endpoints and admin controls.
- **VexHub.Worker**: Background workers for ingestion schedules and validation pipelines.
- **Normalization Pipeline**: Canonicalizes statements, deduplicates, and links provenance.
- **Validation Pipeline**: Schema validation (OpenVEX/CycloneDX/CSAF) and signature checks.
- **Storage**: PostgreSQL schema `vexhub` for normalized statements, provenance, conflicts, and export cursors.

## 4) Data Model (Draft)
- `vexhub.statement`
  - `id`, `source_id`, `vuln_id`, `product_key`, `status`, `justification`, `timestamp`, `statement_hash`
- `vexhub.provenance`
  - `statement_id`, `issuer`, `signature_valid`, `signature_ref`, `source_uri`, `ingested_at`
- `vexhub.conflict`
  - `vuln_id`, `product_key`, `statement_ids[]`, `detected_at`, `reason`
- `vexhub.export_cursor`
  - `source_id`, `last_exported_at`, `snapshot_hash`

All tables must include `tenant_id`, UTC timestamps, and deterministic ordering keys.

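
As an illustration of the draft data model, the sketch below shows one way a normalized statement and a conflict record could relate. It is a minimal sketch, not the service's implementation: the `Statement` class, the `detect_conflicts` helper, the hashing recipe (SHA-256 over sorted-key JSON of the content fields), and the conflict rule (two or more statements for the same tenant, vulnerability, and product that disagree on `status`) are all assumptions introduced here. The document names the columns but does not say how `statement_hash` is computed or what counts as a conflict.

```python
import hashlib
import json
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """Hypothetical in-memory shape of a `vexhub.statement` row."""
    tenant_id: str
    source_id: str
    vuln_id: str
    product_key: str
    status: str
    justification: Optional[str]
    timestamp: datetime  # expected to be timezone-aware

    @property
    def statement_hash(self) -> str:
        # Assumption: hash the content fields as sorted-key, compact JSON.
        payload = {
            "tenant_id": self.tenant_id,
            "source_id": self.source_id,
            "vuln_id": self.vuln_id,
            "product_key": self.product_key,
            "status": self.status,
            "justification": self.justification,
            "timestamp": self.timestamp.astimezone(timezone.utc).isoformat(),
        }
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_conflicts(statements):
    """Flag (tenant, vuln, product) groups whose statements disagree on status.

    Assumed rule for illustration only. Output is sorted so that the same
    input set always yields the same list, regardless of input order.
    """
    groups = defaultdict(list)
    for s in statements:
        groups[(s.tenant_id, s.vuln_id, s.product_key)].append(s)

    conflicts = []
    for key in sorted(groups):
        statuses = sorted({m.status for m in groups[key]})
        if len(statuses) > 1:
            conflicts.append({
                "tenant_id": key[0],
                "vuln_id": key[1],
                "product_key": key[2],
                # The draft table stores statement ids; hashes stand in here.
                "statement_hashes": sorted(m.statement_hash for m in groups[key]),
                "reason": "status_mismatch: " + ", ".join(statuses),
            })
    return conflicts
```

Sorting both the group keys and the member hashes keeps the conflict list independent of ingestion order, which matches the "deterministic ordering keys" requirement stated above.
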
## 5) API Surface (Draft)
- `GET /api/v1/vex/cve/{cve-id}`
- `GET /api/v1/vex/package/{purl}`
- `GET /api/v1/vex/source/{source-id}`
- `GET /api/v1/vex/export` (bulk OpenVEX feed)
- `GET /api/v1/vex/index` (vex-index.json)

Responses are deterministic: stable ordering by `timestamp DESC`, then `source_id ASC`, then `statement_hash ASC`.

## 6) Determinism & Offline Posture
- Ingestion runs against frozen snapshots where possible; all outputs include `snapshot_hash`.
- Canonical JSON serialization with stable key ordering.
- No network egress outside configured connectors (sealed mode supported).
- Bulk exports are immutable and content-addressed.

## 7) Security & Auth
- API access requires Authority scopes (`vexhub.read`, `vexhub.admin`).
- Signature verification follows issuer registry rules; failures are surfaced as metadata, not silent drops.
- Rate limiting enforced at API gateway and per-client tokens.

## 8) Observability
- Metrics: `vexhub_ingest_total`, `vexhub_validation_failures_total`, `vexhub_conflicts_total`, `vexhub_export_duration_seconds`.
- Logs: include `tenant_id`, `source_id`, `statement_hash`, and `trace_id`.
- Traces: spans for ingestion, normalization, validation, export.

## 9) Integration Points
- **Excititor**: upstream connectors provide source payloads and trust hints.
- **VexLens**: consumes normalized statements and provenance for trust scoring and consensus.
- **Policy Engine**: reads VexLens consensus results; VexHub provides external distribution.
- **UI**: VEX conflict studio consumes conflict API once available.

## 10) Testing Strategy
- Unit tests for normalization and validation pipelines.
- Integration tests with Postgres for ingestion and API outputs.
- Determinism tests comparing repeated exports with identical inputs.

*Last updated: 2025-12-22.*
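
The response ordering in section 5 and the canonical, content-addressed exports in section 6 can be sketched together, since a determinism test (section 10) needs both. The following is a minimal sketch under stated assumptions: `order_statements`, `canonical_json`, and `build_export` are hypothetical names, rows are plain dicts, timestamps are assumed to be UTC ISO 8601 strings in a single fixed format (so string comparison matches chronological order), and the `sha256:` prefix on `snapshot_hash` is illustrative. Python's `json.dumps(sort_keys=True)` gives stable key ordering but is not a full RFC 8785 canonicalizer.

```python
import hashlib
import json


def order_statements(rows):
    """Order by timestamp DESC, then source_id ASC, then statement_hash ASC."""
    # Python's sort is stable (also with reverse=True), so sort by the
    # ascending tie-breakers first, then by timestamp descending.
    by_tiebreak = sorted(rows, key=lambda r: (r["source_id"], r["statement_hash"]))
    return sorted(by_tiebreak, key=lambda r: r["timestamp"], reverse=True)


def canonical_json(value) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def build_export(rows):
    """Return (snapshot_hash, body) for a set of statement rows.

    The hash is derived only from the serialized body, so identical inputs
    give identical hashes regardless of the order rows arrive in.
    """
    body = canonical_json(order_statements(rows))
    snapshot_hash = "sha256:" + hashlib.sha256(body).hexdigest()
    return snapshot_hash, body
```

A determinism test in this style would call `build_export` twice on the same rows in different input orders and assert that both the body bytes and the `snapshot_hash` match.
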