
component_architecture_vexer.md — StellaOps Vexer (2025Q4)

Scope. This document specifies the Vexer service: its purpose, trust model, data structures, APIs, plugin contracts, storage schema, normalization/consensus algorithms, performance budgets, testing matrix, and how it integrates with Scanner, Policy, Feedser, and the attestation chain. It is implementation-ready.


0) Mission & role in the platform

Mission. Convert heterogeneous VEX statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into canonical, queryable claims; compute deterministic consensus per (vuln, product); preserve conflicts with provenance; publish stable, attestable exports that the backend uses to suppress non-exploitable findings, prioritize remaining risk, and explain decisions.

Boundaries.

  • Vexer does not decide PASS/FAIL. It supplies evidence (statuses + justifications + provenance weights).
  • Vexer preserves conflicting claims unchanged; consensus encodes how we would pick, but the raw set is always exportable.
  • VEX consumption is backend-only: Scanner never applies VEX. The backend's Policy Engine asks Vexer for status evidence and then decides what to show.

1) Inputs, outputs & canonical domain

1.1 Accepted input formats (ingest)

  • OpenVEX JSON documents (attested or raw).
  • CSAF VEX 2.x (vendor PSIRTs and distros commonly publish CSAF).
  • CycloneDX VEX 1.4+ (standalone VEX or embedded VEX blocks).
  • OCI-attached attestations (VEX statements shipped as OCI referrers) — optional connectors.

All connectors register source metadata: provider identity, trust tier, signature expectations (PGP/cosign/PKI), fetch windows, rate limits, and time anchors.

1.2 Canonical model (normalized)

Every incoming statement becomes a set of VexClaim records:

VexClaim
- providerId           // 'redhat', 'suse', 'ubuntu', 'github', 'vendorX'
- vulnId               // 'CVE-2025-12345', 'GHSA-xxxx', canonicalized
- productKey           // canonical product identity (see §2.2)
- status               // affected | not_affected | fixed | under_investigation
- justification?       // for 'not_affected'/'affected' where provided
- introducedVersion?   // semantics per provider (range or exact)
- fixedVersion?        // where provided (range or exact)
- lastObserved         // timestamp from source or fetch time
- provenance           // doc digest, signature status, fetch URI, line/offset anchors
- evidence[]           // raw source snippets for explainability
- supersedes?          // optional cross-doc chain (docDigest → docDigest)

1.3 Exports (consumption)

  • VexConsensus per (vulnId, productKey) with:

    • rollupStatus (after policy weights/justification gates),
    • sources[] (winning + losing claims with weights & reasons),
    • policyRevisionId (identifier of the Vexer policy used),
    • consensusDigest (stable SHA256 over canonical JSON).
  • Raw claims export for auditing (unchanged, with provenance).

  • Provider snapshots (per source, last N days) for operator debugging.

  • Index optimized for backend joins: (productKey, vulnId) → (status, confidence, sourceSet).

All exports are deterministic, and (optionally) attested via DSSE and logged to Rekor v2.


2) Identity model — products & joins

2.1 Vuln identity

  • Accepts CVE, GHSA, vendor IDs (MSRC, RHSA…), distro IDs (DSA/USN/RHSA…) — normalized to vulnId with alias sets.
  • Alias graph maintained (from Feedser) to map vendor/distro IDs → CVE (primary) and to GHSA where applicable.

2.2 Product identity (productKey)

  • Primary: purl (Package URL).
  • Secondary links: cpe, OS package NVRA/EVR, NuGet/Maven/Golang identity, and OS package name when purl unavailable.
  • Fallback: oci:<registry>/<repo>@<digest> for image-level VEX.
  • Special cases: kernel modules, firmware, platforms → provider-specific mapping helpers (the connector captures the provider's product taxonomy → canonical productKey).

Vexer does not invent identities. If a provider's product cannot be mapped to purl/CPE/NVRA deterministically, we keep the native product string and mark the claim as non-joinable; the backend will ignore it unless a policy explicitly whitelists that provider mapping.
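
A minimal sketch of the fallback order above, assuming hypothetical helper names (ProductIdentity and ProductKeyResolver are illustrative, not part of the actual codebase):

// Illustrative only: productKey resolution following the §2.2 fallback order.
public sealed record ProductIdentity(
    string? Purl,            // primary: Package URL
    string? Cpe,             // secondary: CPE
    string? OciDigestRef,    // e.g. "oci:<registry>/<repo>@<digest>"
    string? NativeName);     // provider's native product string

public static class ProductKeyResolver
{
    // Returns (productKey, joinable); unmapped products keep the native string
    // and are flagged non-joinable so the backend ignores them by default.
    public static (string Key, bool Joinable) Resolve(ProductIdentity id)
    {
        if (!string.IsNullOrEmpty(id.Purl))         return (id.Purl!, true);
        if (!string.IsNullOrEmpty(id.Cpe))          return (id.Cpe!, true);
        if (!string.IsNullOrEmpty(id.OciDigestRef)) return (id.OciDigestRef!, true);
        return ($"native:{id.NativeName}", false);
    }
}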


3) Storage schema (MongoDB)

Database: vexer

3.1 Collections

vex.providers

_id: providerId
name, homepage, contact
trustTier: enum {vendor, distro, platform, hub, attestation}
signaturePolicy: { type: pgp|cosign|x509|none, keys[], certs[], cosignKeylessRoots[] }
fetch: { baseUrl, kind: http|oci|file, rateLimit, etagSupport, windowDays }
enabled: bool
createdAt, modifiedAt

vex.raw (immutable raw documents)

_id: sha256(doc bytes)
providerId
uri
ingestedAt
contentType
sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
payload: GridFS pointer (if large)
disposition: kept|replaced|superseded
correlation: { replaces?: sha256, replacedBy?: sha256 }

vex.claims (normalized rows; dedupe on providerId+vulnId+productKey+docDigest)

_id
providerId
vulnId
productKey
status
justification?
introducedVersion?
fixedVersion?
lastObserved
docDigest
provenance { uri, line?, pointer?, signatureState }
evidence[] { key, value, locator }
indices: 
  - {vulnId:1, productKey:1}
  - {providerId:1, lastObserved:-1}
  - {status:1}
  - text index (optional) on evidence.value for debugging

vex.consensus (rollups)

_id: sha256(canonical(vulnId, productKey, policyRevision))
vulnId
productKey
rollupStatus
sources[]: [
  { providerId, status, justification?, weight, lastObserved, accepted:bool, reason }
]
policyRevisionId
evaluatedAt
consensusDigest  // same as _id
indices:
  - {vulnId:1, productKey:1}
  - {policyRevisionId:1, evaluatedAt:-1}

vex.exports (manifest of emitted artifacts)

_id
querySignature
format: raw|consensus|index
artifactSha256
rekor { uuid, index, url }?
createdAt
policyRevisionId
cacheable: bool

vex.cache

querySignature -> exportId (for fast reuse)
ttl, hits

vex.migrations

  • ordered migrations applied at bootstrap to ensure indexes.

3.2 Indexing strategy

  • Hot-path queries use exact (vulnId, productKey) and time-bounded windows; compound indexes cover both.
  • The providers list view sorts by lastObserved so operators can monitor staleness.
  • vex.consensus keyed by (vulnId, productKey, policyRevision) for deterministic reuse.
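
A sketch of how a bootstrap migration (§3.1 vex.migrations) might create the vex.claims indexes, assuming the MongoDB .NET driver; the class name and migration shape are illustrative only:

using System.Threading;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class ClaimsIndexMigration
{
    public static async Task ApplyAsync(IMongoDatabase db, CancellationToken ct)
    {
        var claims = db.GetCollection<BsonDocument>("vex.claims");
        var keys = Builders<BsonDocument>.IndexKeys;

        await claims.Indexes.CreateManyAsync(new[]
        {
            // Hot path: exact (vulnId, productKey) lookups for joins and consensus.
            new CreateIndexModel<BsonDocument>(keys.Ascending("vulnId").Ascending("productKey")),
            // Staleness monitoring: newest claims per provider first.
            new CreateIndexModel<BsonDocument>(keys.Ascending("providerId").Descending("lastObserved")),
            new CreateIndexModel<BsonDocument>(keys.Ascending("status")),
        }, cancellationToken: ct);
    }
}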

4) Ingestion pipeline

4.1 Connector contract

public interface IVexConnector
{
    string ProviderId { get; }
    Task FetchAsync(VexConnectorContext ctx, CancellationToken ct);   // raw docs
    Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct); // raw -> VexClaim[]
}
  • Fetch must implement: window scheduling, conditional GET (ETag/If-Modified-Since), rate limiting, retry/backoff.
  • Normalize parses the format, validates the schema, maps product identities deterministically, and emits VexClaim records with provenance.
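
A skeleton connector against this contract; VexConnectorContext members are not specified in this document, so the method bodies below only annotate the required steps rather than calling concrete context APIs:

using System.Threading;
using System.Threading.Tasks;

public sealed class ExampleOpenVexConnector : IVexConnector
{
    public string ProviderId => "vendorX";

    public Task FetchAsync(VexConnectorContext ctx, CancellationToken ct)
    {
        // 1. Determine the fetch window (windowDays) and resume from the stored cursor.
        // 2. Issue conditional GETs (ETag / If-Modified-Since) with rate limiting and backoff.
        // 3. Persist new or changed documents into vex.raw keyed by sha256(doc bytes),
        //    recording signature verification state per the provider's signaturePolicy.
        return Task.CompletedTask;
    }

    public Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct)
    {
        // 1. Load raw documents for this provider that have not been normalized yet.
        // 2. Validate the OpenVEX schema; reject only on invalid identity or status.
        // 3. Map product identities deterministically (§2.2, §5.1).
        // 4. Emit VexClaim rows with provenance (doc digest, URI, line/pointer anchors).
        return Task.CompletedTask;
    }
}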

4.2 Signature verification (per provider)

  • cosign (keyless or keyful) for OCI referrers or HTTP-served JSON with Sigstore bundles.
  • PGP (provider keyrings) for distro/vendor feeds that sign docs.
  • x509 (mutual TLS / provider-pinned certs) where applicable.
  • Signature state is stored on vex.raw.sig and copied into provenance.signatureState on claims.

Claims from sources failing signature policy are marked "signatureState.verified=false" and policy can downweight or ignore them.

4.3 Time discipline

  • For each doc, prefer the provider's document timestamp; if absent, use the fetch time.
  • Claims carry lastObserved, which drives tie-breaking within equal weight tiers.

5) Normalization: product & status semantics

5.1 Product mapping

  • purl first; cpe second; OS package NVRA/EVR mapping helpers (distro connectors) produce purls via canonical tables (e.g., rpm→purl:rpm, deb→purl:deb); a small conversion sketch follows this list.
  • Where a provider publishes platform-level VEX (e.g., “RHEL 9 not affected”), connectors expand it via known product-inventory rules (e.g., mapping to the sets of packages/components shipped in the platform). Expansion tables are versioned and kept per provider; every expansion emits evidence indicating the rule applied.
  • If expansion would be speculative, the claim remains platform-scoped with productKey="platform:redhat:rhel:9" and is flagged non-joinable; the backend can decide to use platform VEX only when Scanner proves the platform runtime.
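
The conversion sketch referenced above, illustrating a deterministic NVRA → purl:rpm mapping (distro namespace and epoch handling omitted; the helper name is illustrative):

public static class RpmPurlMapper
{
    // e.g. ToPurl("redhat", "bash", "5.1.8", "6.el9", "x86_64")
    //   → "pkg:rpm/redhat/bash@5.1.8-6.el9?arch=x86_64"
    public static string ToPurl(string distro, string name, string version, string release, string arch)
        => $"pkg:rpm/{distro}/{name}@{version}-{release}?arch={arch}";
}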

5.2 Status + justification mapping

  • Canonical status: affected | not_affected | fixed | under_investigation.

  • Justifications normalized to a controlled vocabulary (CISA-aligned), e.g.:

    • component_not_present
    • vulnerable_code_not_in_execute_path
    • vulnerable_configuration_unused
    • inline_mitigation_applied
    • fix_available (with fixedVersion)
    • under_investigation
  • Providers with free-text justifications are mapped by deterministic tables; raw text preserved as evidence.
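
A sketch of such a deterministic mapping table; the free-text keys shown are invented examples, and real tables are provider-specific and versioned alongside the connector:

using System;
using System.Collections.Generic;

public static class JustificationMapper
{
    // Example entries only; not taken from any actual provider feed.
    private static readonly Dictionary<string, string> Table = new(StringComparer.OrdinalIgnoreCase)
    {
        ["Component is not shipped in this product"]  = "component_not_present",
        ["Vulnerable code is never executed"]         = "vulnerable_code_not_in_execute_path",
        ["Feature disabled in default configuration"] = "vulnerable_configuration_unused",
        ["Mitigated by compile-time hardening"]       = "inline_mitigation_applied",
    };

    // Returns the canonical vocabulary entry, or null when no deterministic mapping exists;
    // the raw free-text is preserved as evidence in either case.
    public static string? Normalize(string freeText)
        => Table.TryGetValue(freeText.Trim(), out var canonical) ? canonical : null;
}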


6) Consensus algorithm

Goal: produce a stable, explainable rollupStatus per (vulnId, productKey) given possibly conflicting claims.

6.1 Inputs

  • Set S of VexClaim for the key.

  • Vexer policy snapshot:

    • weights per provider tier and per provider overrides.
    • justification gates (e.g., require justification for not_affected to be acceptable).
    • minEvidence rules (e.g., not_affected must come from ≥1 vendor or 2 distros).
    • signature requirements (e.g., require verified signature for fixed to be considered).

6.2 Steps

  1. Filter invalid claims by signature policy & justification gates → set S'.

  2. Score each claim: score = weight(provider) * freshnessFactor(lastObserved) where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect).

  3. Aggregate scores per status: W(status) = Σ score(claims with that status).

  4. Pick rollupStatus = argmax_status W(status).

  5. Tie-breakers (in order):

    • Higher max single provider score wins (vendor > distro > platform > hub).
    • More recent lastObserved wins.
    • Deterministic lexicographic order of status (fixed > not_affected > under_investigation > affected) as final tiebreaker.
  6. Explain: mark accepted sources (accepted=true; reason="weight"/"freshness"), mark rejected sources with explicit reason ("insufficient_justification", "signature_unverified", "lower_weight").

The algorithm is pure given S and policy snapshot; result is reproducible and hashed into consensusDigest.
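
A minimal sketch of steps 2–5, assuming step 1 has already filtered the claim set; the record and helper delegates are illustrative, not the actual implementation:

using System;
using System.Collections.Generic;
using System.Linq;

public sealed record ScoredClaimInput(string ProviderId, string Status, DateTimeOffset LastObserved);

public static class ConsensusRollup
{
    public static string ComputeRollupStatus(
        IReadOnlyList<ScoredClaimInput> claims,
        Func<string, double> providerWeight,           // policy weight incl. provider overrides
        Func<DateTimeOffset, double> freshnessFactor)  // staleness decay in [0.8, 1.0]
    {
        // Final deterministic tie-break order: fixed > not_affected > under_investigation > affected.
        var statusOrder = new[] { "fixed", "not_affected", "under_investigation", "affected" };

        return claims
            .Select(c => (c.Status, c.LastObserved,
                          Score: providerWeight(c.ProviderId) * freshnessFactor(c.LastObserved)))
            .GroupBy(x => x.Status)
            .Select(g => new
            {
                Status    = g.Key,
                Total     = g.Sum(x => x.Score),        // W(status)
                MaxSingle = g.Max(x => x.Score),        // tie-break 1: strongest single source
                Newest    = g.Max(x => x.LastObserved)  // tie-break 2: most recent observation
            })
            .OrderByDescending(s => s.Total)
            .ThenByDescending(s => s.MaxSingle)
            .ThenByDescending(s => s.Newest)
            .ThenBy(s => Array.IndexOf(statusOrder, s.Status))
            .First()
            .Status;
    }
}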


7) Query & export APIs

All endpoints are versioned under /api/v1/vex.

7.1 Query (online)

POST /claims/search
  body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string }
  → { claims[], nextPageToken? }

POST /consensus/search
  body: { vulnIds?: string[], productKeys?: string[], policyRevisionId?: string, since?: timestamp, limit?: int, pageToken?: string }
  → { entries[], nextPageToken? }

POST /excititor/resolve (scope: vex.read)
  body: { productKeys?: string[], purls?: string[], vulnerabilityIds: string[], policyRevisionId?: string }
  → { policy, resolvedAt, results: [ { vulnerabilityId, productKey, status, sources[], conflicts[], decisions[], signals?, summary?, envelope: { artifact, contentSignature?, attestation?, attestationEnvelope?, attestationSignature? } } ] }
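
An illustrative request body for /excititor/resolve (the values are hypothetical):

{
  "purls": ["pkg:rpm/redhat/openssl@3.0.7-18.el9?arch=x86_64"],
  "vulnerabilityIds": ["CVE-2025-12345"],
  "policyRevisionId": "policy-rev-42"
}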

7.2 Exports (cacheable snapshots)

POST /exports
  body: { signature: { vulnFilter?, productFilter?, providers?, since? }, format: raw|consensus|index, policyRevisionId?: string, force?: bool }
  → { exportId, artifactSha256, rekor? }

GET  /exports/{exportId}        → bytes (application/json or binary index)
GET  /exports/{exportId}/meta   → { signature, policyRevisionId, createdAt, artifactSha256, rekor? }

7.3 Provider operations

GET  /providers                  → provider list & signature policy
POST /providers/{id}/refresh     → trigger fetch/normalize window
GET  /providers/{id}/status      → last fetch, doc counts, signature stats

Auth: service-to-service via Authority tokens; operator actions via UI/CLI with RBAC.


8) Attestation integration

  • Exports can be DSSE-signed via Signer and logged to Rekor v2 via Attestor (optional but recommended for regulated pipelines).

  • vex.exports.rekor stores {uuid, index, url} when present.

  • Predicate type: https://stella-ops.org/attestations/vex-export/1 with fields:

    • querySignature, policyRevisionId, artifactSha256, createdAt.
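
A hypothetical predicate payload carrying the fields above (all values are placeholders):

{
  "querySignature": "sha256:…",
  "policyRevisionId": "policy-rev-42",
  "artifactSha256": "…",
  "createdAt": "2025-10-01T00:00:00Z"
}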

9) Configuration (YAML)

vexer:
  mongo: { uri: "mongodb://mongo/vexer" }
  s3:
    endpoint: http://minio:9000
    bucket: stellaops
  policy:
    weights:
      vendor: 1.0
      distro: 0.9
      platform: 0.7
      hub: 0.5
      attestation: 0.6
    providerOverrides:
      redhat: 1.0
      suse: 0.95
    requireJustificationForNotAffected: true
    signatureRequiredForFixed: true
    minEvidence:
      not_affected:
        vendorOrTwoDistros: true
  connectors:
    - providerId: redhat
      kind: csaf
      baseUrl: https://access.redhat.com/security/data/csaf/v2/
      signaturePolicy: { type: pgp, keys: [ "…redhat-pgp-key…" ] }
      windowDays: 7
    - providerId: suse
      kind: csaf
      baseUrl: https://ftp.suse.com/pub/projects/security/csaf/
      signaturePolicy: { type: pgp, keys: [ "…suse-pgp-key…" ] }
    - providerId: ubuntu
      kind: openvex
      baseUrl: https://…/vex/
      signaturePolicy: { type: none }
    - providerId: vendorX
      kind: cyclonedx-vex
      ociRef: ghcr.io/vendorx/vex@sha256:…
      signaturePolicy: { type: cosign, cosignKeylessRoots: [ "sigstore-root" ] }

10) Security model

  • Input signature verification enforced per provider policy (PGP, cosign, x509).
  • Connector allowlists: outbound fetch constrained to configured domains.
  • Tenant isolation: per-tenant DB prefixes or separate DBs; per-tenant S3 prefixes; per-tenant policies.
  • AuthN/Z: Authority-issued OpToks; RBAC roles (vex.read, vex.admin, vex.export).
  • No secrets in logs; deterministic logging contexts include providerId, docDigest, claim keys.

11) Performance & scale

  • Targets:

    • Normalize 10k VEX claims/minute/core.
    • Consensus compute ≤50ms for 1k unique (vuln, product) pairs in hot cache.
    • Export (consensus) 1M rows in ≤60s on 8 cores with streaming writer.
  • Scaling:

    • WebService handles control APIs; Worker background services (same image) execute fetch/normalize in parallel with rate limits; Mongo writes batched; upserts by natural keys.
    • Exports stream straight to S3 (MinIO) with rolling buffers.
  • Caching:

    • vex.cache maps query signatures → export; TTL to avoid stampedes; optimistic reuse unless force.
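
A sketch of how a query signature could be derived for cache lookups; the field set and canonical ordering here are illustrative, not the real querySignature format:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class QuerySignatureSketch
{
    // Canonicalize the export parameters, then hash, so logically equal queries
    // map to the same cache entry.
    public static string Compute(string format, string policyRevisionId,
                                 IEnumerable<string> vulnFilter, IEnumerable<string> productFilter)
    {
        var canonical = string.Join("|",
            format,
            policyRevisionId,
            string.Join(",", vulnFilter.OrderBy(v => v, StringComparer.Ordinal)),
            string.Join(",", productFilter.OrderBy(p => p, StringComparer.Ordinal)));

        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}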

12) Observability

  • Metrics:

    • vex.ingest.docs_total{provider}
    • vex.normalize.claims_total{provider}
    • vex.signature.failures_total{provider,method}
    • vex.consensus.conflicts_total{vulnId}
    • vex.exports.bytes{format} / vex.exports.latency_seconds
  • Tracing: spans for fetch, verify, parse, map, consensus, export.

  • Dashboards: provider staleness, top conflicting vulns/components, signature posture, export cache hit rate.


13) Testing matrix

  • Connectors: golden raw docs → deterministic claims (fixtures per provider/format).
  • Signature policies: valid/invalid PGP/cosign/x509 samples; ensure rejects are recorded but not accepted.
  • Normalization edge cases: platform-only claims, free-text justifications, non-purl products.
  • Consensus: conflict scenarios across tiers; check tie-breakers; justification gates.
  • Performance: 1M-row export timing; memory ceilings; stream correctness.
  • Determinism: same inputs + policy → identical consensusDigest and export bytes.
  • API contract tests: pagination, filters, RBAC, rate limits.

14) Integration points

  • Backend Policy Engine (in Scanner.WebService): calls POST /excititor/resolve (scope vex.read) with batched (purl, vulnId) pairs to fetch rollupStatus + sources.
  • Feedser: provides alias graph (CVE↔vendor IDs) and may supply VEX-adjacent metadata (e.g., KEV flag) for policy escalation.
  • UI: VEX explorer screens use /claims/search and /consensus/search; show conflicts & provenance.
  • CLI: stellaops vex export --consensus --since 7d --out vex.json for audits.

15) Failure modes & fallback

  • Provider unreachable: stale thresholds trigger warnings; policy can downweight stale providers automatically (freshness factor).
  • Signature outage: continue to ingest but mark signatureState.verified=false; consensus will likely exclude or downweight per policy.
  • Schema drift: unknown fields are preserved as evidence; normalization rejects only on invalid identity or status.

16) Rollout plan (incremental)

  1. MVP: OpenVEX + CSAF connectors for 3 major providers (e.g., Red Hat/SUSE/Ubuntu), normalization + consensus + /excititor/resolve.
  2. Signature policies: PGP for distros; cosign for OCI.
  3. Exports + optional attestation.
  4. CycloneDX VEX connectors; platform claim expansion tables; UI explorer.
  5. Scale hardening: export indexes; conflict analytics.

17) Appendix — canonical JSON (stable ordering)

All exports and consensus entries are serialized via VexCanonicalJsonSerializer:

  • UTF-8 without BOM;
  • keys sorted (ASCII);
  • arrays sorted by (providerId, vulnId, productKey, lastObserved) unless semantic order mandated;
  • timestamps in YYYY-MM-DDThh:mm:ssZ;
  • no insignificant whitespace.
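
A simplified sketch of these ordering rules (the real VexCanonicalJsonSerializer is not shown); it handles only string-keyed maps, lists, and a few primitives for illustration:

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text.Json;

public static class CanonicalJsonSketch
{
    public static byte[] Serialize(object? value)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))   // default options: no insignificant whitespace
        {
            Write(writer, value);
        }
        return stream.ToArray();                          // UTF-8 bytes, no BOM
    }

    private static void Write(Utf8JsonWriter w, object? value)
    {
        switch (value)
        {
            case null:     w.WriteNullValue(); break;
            case string s: w.WriteStringValue(s); break;
            case bool b:   w.WriteBooleanValue(b); break;
            case int i:    w.WriteNumberValue(i); break;
            case double d: w.WriteNumberValue(d); break;
            case DateTimeOffset ts:
                // Normalized timestamp format per the rules above.
                w.WriteStringValue(ts.UtcDateTime.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture));
                break;
            case IDictionary<string, object?> map:
                w.WriteStartObject();
                foreach (var key in map.Keys.OrderBy(k => k, StringComparer.Ordinal))  // keys sorted (ASCII)
                {
                    w.WritePropertyName(key);
                    Write(w, map[key]);
                }
                w.WriteEndObject();
                break;
            case IEnumerable<object?> list:
                w.WriteStartArray();
                foreach (var item in list) Write(w, item);
                w.WriteEndArray();
                break;
            default:
                w.WriteStringValue(value.ToString());
                break;
        }
    }
}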