VexHub Architecture
Scope. Architecture and operational contract for the VexHub aggregation service that normalizes, validates, and distributes VEX statements with deterministic, offline-friendly outputs.
1) Purpose
VexHub collects VEX statements from multiple upstream sources, validates and normalizes them, detects conflicts, and exposes a distribution API for internal services and external tools (Trivy/Grype). It is the canonical aggregation layer that feeds VexLens trust scoring and Policy Engine decisioning.
2) Responsibilities
- Scheduled ingestion of upstream VEX sources (connectors + mirrored feeds).
- Canonical normalization to OpenVEX-compatible structures.
- Validation pipeline (schema + signature/provenance checks).
- Conflict detection and provenance capture.
- Distribution API for CVE/PURL/source queries and bulk exports.
Non-goals: policy decisioning (Policy Engine), consensus computation (VexLens), raw ingestion guardrails (Excititor AOC).
3) Component Model
- VexHub.WebService: Minimal API host for distribution endpoints and admin controls.
- VexHub.Worker: Background workers for ingestion schedules and validation pipelines.
- Normalization Pipeline: Canonicalizes statements, deduplicates, and links provenance.
- Validation Pipeline: Schema validation (OpenVEX/CycloneDX/CSAF) and signature checks.
- Storage: PostgreSQL schema vexhub for normalized statements, provenance, conflicts, and export cursors.
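As an illustration of the deduplication step in the Normalization Pipeline, the sketch below derives a statement_hash and drops repeats. The set of hashed fields, the use of SHA-256, and the function names are assumptions made for this example; the document does not specify how statement_hash is computed.

```python
import hashlib
import json

# Hypothetical choice of fields that define a statement's identity.
HASHED_FIELDS = ("source_id", "vuln_id", "product_key", "status", "justification", "timestamp")


def statement_hash(stmt: dict) -> str:
    """SHA-256 over a key-sorted JSON rendering of the identity fields (assumed scheme)."""
    subset = {name: stmt.get(name) for name in HASHED_FIELDS}
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(statements: list) -> list:
    """Keep the first statement seen for each hash, preserving input order."""
    seen = set()
    unique = []
    for stmt in statements:
        digest = statement_hash(stmt)
        if digest in seen:
            continue
        seen.add(digest)
        # Attach the hash so later stages can use it as an ordering key.
        unique.append({**stmt, "statement_hash": digest})
    return unique
```

Because the hash is taken over key-sorted JSON, two payloads that differ only in key order collapse to one statement.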
4) Data Model (Draft)
- vexhub.statement: id, source_id, vuln_id, product_key, status, justification, timestamp, statement_hash
- vexhub.provenance: statement_id, issuer, signature_valid, signature_ref, source_uri, ingested_at
- vexhub.conflict: vuln_id, product_key, statement_ids[], detected_at, reason
- vexhub.export_cursor: source_id, last_exported_at, snapshot_hash
All tables must include tenant_id, UTC timestamps, and deterministic ordering keys.
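To make the conflict record concrete, here is a minimal sketch that groups statements by tenant, vulnerability, and product, then emits a conflict row when statuses disagree. The disagreement rule and the reason string format are assumptions for illustration; the draft model only names the columns.

```python
from collections import defaultdict
from datetime import datetime, timezone


def detect_conflicts(statements, now=None):
    """Return conflict rows shaped like vexhub.conflict (plus tenant_id)."""
    now = now or datetime.now(timezone.utc)
    groups = defaultdict(list)
    for stmt in statements:
        key = (stmt["tenant_id"], stmt["vuln_id"], stmt["product_key"])
        groups[key].append(stmt)

    conflicts = []
    # Iterate keys in sorted order so the output order is deterministic.
    for key in sorted(groups):
        members = groups[key]
        statuses = sorted({m["status"] for m in members})
        if len(statuses) < 2:
            continue  # all sources agree, nothing to report
        tenant_id, vuln_id, product_key = key
        conflicts.append({
            "tenant_id": tenant_id,
            "vuln_id": vuln_id,
            "product_key": product_key,
            "statement_ids": sorted(m["id"] for m in members),
            "detected_at": now.isoformat(),
            # Hypothetical reason format; the real vocabulary is not specified.
            "reason": "status_disagreement: " + ",".join(statuses),
        })
    return conflicts
```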
5) API Surface (Draft)
- GET /api/v1/vex/cve/{cve-id}
- GET /api/v1/vex/package/{purl}
- GET /api/v1/vex/source/{source-id}
- GET /api/v1/vex/export (bulk OpenVEX feed)
- GET /api/v1/vex/index (vex-index.json)
Responses are deterministic: stable ordering by timestamp DESC, then source_id ASC, then statement_hash ASC.
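The mixed-direction ordering can be sketched with two stable sort passes. This example assumes each statement is a dict whose timestamp is a timezone-aware datetime; the function name is hypothetical.

```python
from datetime import datetime, timezone


def order_statements(statements):
    """Order by timestamp DESC, then source_id ASC, then statement_hash ASC."""
    # Pass 1: secondary and tertiary keys, ascending.
    ordered = sorted(statements, key=lambda s: (s["source_id"], s["statement_hash"]))
    # Pass 2: primary key, descending. Python's sort is stable even with
    # reverse=True, so ties on timestamp keep the ascending order from pass 1.
    ordered.sort(key=lambda s: s["timestamp"], reverse=True)
    return ordered
```

Sorting in two passes avoids negating or inverting string keys, and the result does not depend on the order in which rows arrive.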
6) Determinism & Offline Posture
- Ingestion runs against frozen snapshots where possible; all outputs include snapshot_hash.
- Canonical JSON serialization with stable key ordering.
- No network egress outside configured connectors (sealed mode supported).
- Bulk exports are immutable and content-addressed.
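A minimal sketch of canonical serialization and content addressing follows. It assumes key-sorted compact JSON as the canonical form and a sha256: prefix for the address; neither is stated in this document, and key sorting alone is simpler than a full canonicalization scheme such as RFC 8785. Statements are assumed to arrive already in their deterministic order.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def build_export(statements, snapshot_hash):
    """Return (content_address, bytes) for a bulk export body.

    The envelope shape and the address prefix are hypothetical.
    """
    body = {"snapshot_hash": snapshot_hash, "statements": list(statements)}
    data = canonical_json(body)
    address = "sha256:" + hashlib.sha256(data).hexdigest()
    return address, data
```

Since the address is derived from the bytes, any change to an export produces a new address rather than mutating an existing one.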
7) Security & Auth
- API access requires Authority scopes (vexhub.read, vexhub.admin).
- Signature verification follows issuer registry rules; failures are surfaced as metadata, not silent drops.
- Rate limiting enforced at API gateway and per-client tokens.
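The "surfaced as metadata, not silent drops" rule might look like the sketch below, where a failed or missing signature still yields a provenance record. The verify callback stands in for the issuer registry check, and failure_reason is a hypothetical field that is not part of the draft data model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Provenance:
    issuer: str
    signature_valid: bool
    signature_ref: Optional[str]
    source_uri: str
    failure_reason: Optional[str] = None  # hypothetical, for illustration only


def record_provenance(
    issuer: str,
    signature_ref: Optional[str],
    source_uri: str,
    verify: Callable[[str, str], bool],
) -> Provenance:
    """Always return a record; never discard a statement on verification failure."""
    if signature_ref is None:
        return Provenance(issuer, False, None, source_uri, "unsigned")
    try:
        ok = verify(issuer, signature_ref)
    except Exception as exc:  # verifier faults are recorded, not raised
        return Provenance(issuer, False, signature_ref, source_uri, f"verifier_error: {exc}")
    reason = None if ok else "signature_mismatch"
    return Provenance(issuer, ok, signature_ref, source_uri, reason)
```

Downstream consumers can then decide how much weight to give a statement whose signature_valid is false.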
8) Observability
- Metrics: vexhub_ingest_total, vexhub_validation_failures_total, vexhub_conflicts_total, vexhub_export_duration_seconds.
- Logs: include tenant_id, source_id, statement_hash, and trace_id.
- Traces: spans for ingestion, normalization, validation, export.
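One way to guarantee the required log fields is a formatter that always emits them, shown here with Python's standard logging module. The JSON layout and the class name are assumptions; the document only lists which fields must be present.

```python
import json
import logging

# Correlation fields named in the observability contract.
REQUIRED_FIELDS = ("tenant_id", "source_id", "statement_hash", "trace_id")


class ContextFormatter(logging.Formatter):
    """Render each record as key-sorted JSON that always carries the required fields."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {"level": record.levelname, "message": record.getMessage()}
        for name in REQUIRED_FIELDS:
            # Values arrive via logger.info(..., extra={...}); missing ones become null.
            entry[name] = getattr(record, name, None)
        return json.dumps(entry, sort_keys=True)
```

A caller would attach the formatter to a handler and log with, for example, logger.info("statement ingested", extra={"tenant_id": "t1", "source_id": "s1", "statement_hash": "abc", "trace_id": "tr-1"}).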
9) Integration Points
- Excititor: upstream connectors provide source payloads and trust hints.
- VexLens: consumes normalized statements and provenance for trust scoring and consensus.
- Policy Engine: reads VexLens consensus results; VexHub provides external distribution.
- UI: VEX conflict studio consumes conflict API once available.
10) Testing Strategy
- Unit tests for normalization and validation pipelines.
- Integration tests with Postgres for ingestion and API outputs.
- Determinism tests comparing repeated exports with identical inputs.
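A determinism test can be sketched as a helper that runs an export function repeatedly over the same inputs in shuffled order and compares digests. toy_export is a stand-in for the real exporter, and the helper name is hypothetical.

```python
import hashlib
import json
import random


def toy_export(statements):
    """Stand-in exporter: order by statement_hash, then serialize with sorted keys."""
    ordered = sorted(statements, key=lambda s: s["statement_hash"])
    return json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode("utf-8")


def check_deterministic(export_fn, statements, runs=5, seed=0):
    """Return True if export_fn yields identical bytes regardless of input order."""
    rng = random.Random(seed)  # seeded so the test itself is reproducible
    baseline = hashlib.sha256(export_fn(list(statements))).hexdigest()
    for _ in range(runs):
        shuffled = list(statements)
        rng.shuffle(shuffled)
        if hashlib.sha256(export_fn(shuffled)).hexdigest() != baseline:
            return False
    return True
```

Shuffling the inputs catches exporters that are only accidentally stable because their test fixtures always arrive in the same order.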
Last updated: 2025-12-22.