
VexHub Architecture

Scope. Architecture and operational contract for the VexHub aggregation service that normalizes, validates, and distributes VEX statements with deterministic, offline-friendly outputs.

1) Purpose

VexHub collects VEX statements from multiple upstream sources, validates and normalizes them, detects conflicts, and exposes a distribution API for internal services and external tools (Trivy/Grype). It is the canonical aggregation layer that feeds VexLens trust scoring and Policy Engine decisioning.

2) Responsibilities

  • Scheduled ingestion of upstream VEX sources (connectors + mirrored feeds).
  • Canonical normalization to OpenVEX-compatible structures.
  • Validation pipeline (schema + signature/provenance checks).
  • Conflict detection and provenance capture.
  • Distribution API for CVE/PURL/source queries and bulk exports.

Non-goals: policy decisioning (Policy Engine), consensus computation (VexLens), raw ingestion guardrails (Excititor AOC).

3) Component Model

  • VexHub.WebService: Minimal API host for distribution endpoints and admin controls.
  • VexHub.Worker: Background workers for ingestion schedules and validation pipelines.
  • Normalization Pipeline: Canonicalizes statements, deduplicates, and links provenance.
  • Validation Pipeline: Schema validation (OpenVEX/CycloneDX/CSAF) and signature checks.
  • Storage: PostgreSQL schema vexhub for sources, normalized statements, provenance, conflicts, ingestion jobs, and export cursors.

4) Data Model (Draft)

  • vexhub.statement
    • id, source_id, vuln_id, product_key, status, justification, timestamp, statement_hash
  • vexhub.provenance
    • statement_id, issuer, signature_valid, signature_ref, source_uri, ingested_at
  • vexhub.conflict
    • vuln_id, product_key, statement_ids[], detected_at, reason
  • vexhub.export_cursor
    • source_id, last_exported_at, snapshot_hash

All tables must include tenant_id, UTC timestamps, and deterministic ordering keys.
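
As a rough illustration of how vexhub.statement rows could feed vexhub.conflict, the sketch below groups statements by (vuln_id, product_key) and records a conflict when their statuses disagree. The definition of a conflict as "status disagreement", the `Statement` class, the `detect_conflicts` helper, and the wording of `reason` are all assumptions made for this example; the actual detection rules are not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    # Subset of the vexhub.statement columns needed for this sketch.
    id: str
    vuln_id: str
    product_key: str
    status: str


def detect_conflicts(statements):
    """Return one conflict record per (vuln_id, product_key) with differing statuses."""
    groups = defaultdict(list)
    for stmt in statements:
        groups[(stmt.vuln_id, stmt.product_key)].append(stmt)

    conflicts = []
    # Iterate in sorted key order so the output is deterministic.
    for (vuln_id, product_key), group in sorted(groups.items()):
        statuses = sorted({stmt.status for stmt in group})
        if len(statuses) > 1:
            conflicts.append({
                "vuln_id": vuln_id,
                "product_key": product_key,
                "statement_ids": sorted(stmt.id for stmt in group),
                # Hypothetical reason format, for illustration only.
                "reason": "status disagreement: " + ", ".join(statuses),
            })
    return conflicts
```

Sorting both the group keys and the statement ids keeps the result independent of ingestion order, which matches the deterministic-ordering requirement above.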

5) API Surface

  • GET /api/v1/vex/cve/{cve-id}
  • GET /api/v1/vex/package/{purl}
  • GET /api/v1/vex/source/{source-id}
  • GET /api/v1/vex/search
  • GET /api/v1/vex/statement/{id}
  • POST /api/v1/vex/conflicts/resolve
  • GET /api/v1/vex/stats
  • GET /api/v1/vex/export (bulk OpenVEX feed)
  • GET /api/v1/vex/index (vex-index.json)
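
A client calling the lookup routes has to place identifiers into a single path segment. The sketch below shows one way to build those URLs. The host name is hypothetical, and percent-encoding the whole PURL (including "/" and "@") is an assumption about how the service parses the path, so check it against the actual routing before relying on it.

```python
from urllib.parse import quote

# Hypothetical base URL; substitute the real VexHub host.
BASE_URL = "https://vexhub.example.internal"


def cve_url(cve_id):
    """Build the CVE lookup URL for a single CVE identifier."""
    return f"{BASE_URL}/api/v1/vex/cve/{quote(cve_id, safe='')}"


def package_url(purl):
    """Build the package lookup URL.

    A PURL contains ':', '/' and '@', so it is fully percent-encoded here
    to keep it inside one path segment (an assumption, see above).
    """
    return f"{BASE_URL}/api/v1/vex/package/{quote(purl, safe='')}"
```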

Responses are deterministic: stable ordering by timestamp DESC, then source_id ASC, then statement_hash ASC.
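
The ordering rule can be sketched with a stable multi-pass sort. The dictionary row shape and the `sort_statements` name are illustrative assumptions; in the service itself this ordering would more likely be expressed in the SQL query.

```python
from datetime import datetime, timezone


def sort_statements(rows):
    """Order rows by timestamp DESC, then source_id ASC, then statement_hash ASC.

    Python's sort is stable (also with reverse=True), so sorting by the
    least significant key first and the most significant key last yields
    the combined ordering.
    """
    rows = sorted(rows, key=lambda r: r["statement_hash"])
    rows = sorted(rows, key=lambda r: r["source_id"])
    rows = sorted(rows, key=lambda r: r["timestamp"], reverse=True)
    return rows


if __name__ == "__main__":
    newer = datetime(2026, 4, 14, tzinfo=timezone.utc)
    older = datetime(2026, 4, 13, tzinfo=timezone.utc)
    sample = [
        {"timestamp": older, "source_id": "a", "statement_hash": "h1"},
        {"timestamp": newer, "source_id": "b", "statement_hash": "h1"},
        {"timestamp": newer, "source_id": "a", "statement_hash": "h2"},
    ]
    for row in sort_statements(sample):
        print(row["timestamp"].date(), row["source_id"], row["statement_hash"])
```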

GET /api/v1/vex/stats returns the dashboard contract consumed by the console VEX surfaces:

  • totalStatements
  • verifiedStatements
  • flaggedStatements
  • byStatus
  • bySource
  • recentActivity
  • trends
  • generatedAt
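
A contract test for this payload might only assert that every field is present, since the value types are not specified here. The helper below is a minimal sketch of such a check; `REQUIRED_STATS_KEYS` and `missing_stats_keys` are names invented for this example.

```python
# Field names taken from the dashboard contract listed above.
REQUIRED_STATS_KEYS = (
    "totalStatements",
    "verifiedStatements",
    "flaggedStatements",
    "byStatus",
    "bySource",
    "recentActivity",
    "trends",
    "generatedAt",
)


def missing_stats_keys(payload):
    """Return the contract fields absent from a decoded stats response."""
    return [key for key in REQUIRED_STATS_KEYS if key not in payload]
```

An empty return value means the response carries every field the console expects, which is a useful smoke check against a fresh install.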

The stats endpoint must keep working on fresh installs even when a committed EF compiled-model stub is empty; runtime model fallback is required until a real optimized model is generated. The service must also auto-apply embedded SQL migrations for schema vexhub on startup so wiped volumes converge without manual SQL bootstrap.

Console VEX Runtime Contract

  • The browser search and statement-detail surfaces read from GET /api/v1/vex/search and GET /api/v1/vex/statement/{id}.
  • Consensus and conflict analysis are computed through POST /api/v1/vexlens/consensus.
  • Conflict resolution is a real backend mutation through POST /api/v1/vex/conflicts/resolve.

6) Determinism & Offline Posture

  • Ingestion runs against frozen snapshots where possible; all outputs include snapshot_hash.
  • Canonical JSON serialization with stable key ordering.
  • No network egress outside configured connectors (sealed mode supported).
  • Bulk exports are immutable and content-addressed.
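
Canonical serialization and content addressing can be illustrated as follows. This is a sketch under assumptions: it uses sorted keys and compact separators from the standard library rather than a full canonicalization scheme such as RFC 8785, and the `sha256:` prefix is a hypothetical address format, not necessarily the one VexHub uses for snapshot_hash.

```python
import hashlib
import json


def canonical_json(document):
    """Serialize with stable key ordering and no insignificant whitespace."""
    text = json.dumps(
        document,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


def content_address(document):
    """Derive an address from the canonical bytes (hypothetical format)."""
    return "sha256:" + hashlib.sha256(canonical_json(document)).hexdigest()
```

Because the address depends only on the canonical bytes, two exports built from identical inputs get the same address regardless of the key order in which the documents were assembled.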

7) Security & Auth

  • First-party StellaOps callers authenticate with Authority bearer tokens and canonical scopes vexhub:read and vexhub:admin.
  • External tooling may still authenticate with explicit API keys, but VexHub normalizes any legacy API-key scope values onto the same canonical Authority scopes before authorization.
  • Signature verification follows issuer registry rules; failures are surfaced as metadata, not silent drops.
  • Rate limiting is enforced at the API gateway and per client token.
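
The scope normalization step might look like the sketch below. Only vexhub:read and vexhub:admin come from this document; the legacy scope spellings in `LEGACY_SCOPE_ALIASES` are hypothetical placeholders, and dropping unrecognized scopes is an assumption about the desired behavior.

```python
CANONICAL_SCOPES = frozenset({"vexhub:read", "vexhub:admin"})

# Hypothetical legacy API-key scope values; the real ones are not listed here.
LEGACY_SCOPE_ALIASES = {
    "VexHub.Read": "vexhub:read",
    "VexHub.Admin": "vexhub:admin",
}


def normalize_scopes(raw_scopes):
    """Map legacy values onto canonical scopes and drop anything unknown."""
    normalized = set()
    for scope in raw_scopes:
        scope = LEGACY_SCOPE_ALIASES.get(scope, scope)
        if scope in CANONICAL_SCOPES:
            normalized.add(scope)
    return frozenset(normalized)


def is_authorized(raw_scopes, required_scope):
    """Authorize against canonical scopes only, after normalization."""
    return required_scope in normalize_scopes(raw_scopes)
```

Normalizing before the authorization check means policies only ever compare canonical scope names, whichever credential type the caller used. Whether vexhub:admin should imply vexhub:read is not stated here, so the sketch does not assume it.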

7.1) Export Contract

  • GET /api/v1/vex/export returns deterministic OpenVEX JSON with content type application/vnd.openvex+json when the backend export succeeds.
  • Export generation failures must surface as HTTP 500 responses with an application/problem+json body that reports the failure truthfully; the service must not fabricate empty OpenVEX success documents to mask backend state or persistence failures.
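
The two outcomes of the export contract can be sketched as a framework-free handler. Everything here is illustrative: `export_response`, the `build_export` callable, and the problem document's field values are assumptions, and the real service is a Minimal API host rather than Python.

```python
import json

OPENVEX_CONTENT_TYPE = "application/vnd.openvex+json"
PROBLEM_CONTENT_TYPE = "application/problem+json"


def export_response(build_export):
    """Return (status, headers, body) for an export request.

    build_export is a hypothetical callable that returns the OpenVEX
    document as a dict or raises when the backend or persistence fails.
    """
    try:
        document = build_export()
    except Exception:
        # Report the failure; never substitute an empty success document.
        problem = {
            "type": "about:blank",
            "title": "VEX export failed",
            "status": 500,
            "detail": "The export could not be generated from backend state.",
        }
        body = json.dumps(problem, sort_keys=True).encode("utf-8")
        return 500, {"Content-Type": PROBLEM_CONTENT_TYPE}, body

    body = json.dumps(document, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return 200, {"Content-Type": OPENVEX_CONTENT_TYPE}, body
```

The point of the sketch is the error branch: a client that receives a 200 response can trust that the document reflects real backend state.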

8) Observability

  • Metrics: vexhub_ingest_total, vexhub_validation_failures_total, vexhub_conflicts_total, vexhub_export_duration_seconds.
  • Logs: include tenant_id, source_id, statement_hash, and trace_id.
  • Traces: spans for ingestion, normalization, validation, export.

9) Integration Points

  • Excititor: upstream connectors provide source payloads and trust hints.
  • VexLens: consumes normalized statements and provenance for trust scoring and consensus.
  • Policy Engine: reads VexLens consensus results; VexHub provides external distribution.
  • UI: the VEX conflict studio consumes the conflict API once it is available.
  • Console UI: combines VexHub statement/search data with VexLens consensus results over the live HTTP contract.

10) Testing Strategy

  • Unit tests for normalization and validation pipelines.
  • Integration tests with Postgres for ingestion and API outputs.
  • Persistence registration and runtime-model tests that prove source/conflict/ingestion-job repositories and startup migrations are wired on the service path.
  • Determinism tests comparing repeated exports with identical inputs.

Last updated: 2026-04-14.