Rewrite architecture docs and add Vexer connector template
@@ -1,85 +1,463 @@
|
||||
# StellaOps Vexer Architecture
|
||||
# component_architecture_vexer.md — **Stella Ops Vexer** (2025Q4)
|
||||
|
||||
Vexer is StellaOps' vulnerability-exploitability (VEX) platform. It ingests VEX statements from multiple providers, normalizes them into canonical claims, projects trust-weighted consensus, and delivers deterministic export artifacts with signed attestations. This document summarizes the target architecture and how the current implementation maps to those goals.
|
||||
> **Scope.** This document specifies the **Vexer** service: its purpose, trust model, data structures, APIs, plug‑in contracts, storage schema, normalization/consensus algorithms, performance budgets, testing matrix, and how it integrates with Scanner, Policy, Feedser, and the attestation chain. It is implementation‑ready.
|
||||
|
||||
## 1. Solution topology
|
||||
---
|
||||
|
||||
| Module | Purpose | Key contracts |
|
||||
| --- | --- | --- |
|
||||
| `StellaOps.Vexer.Core` | Domain models (`VexClaim`, `VexConsensus`, `VexExportManifest`), deterministic JSON helpers, shared abstractions (connectors, exporters, attestations). | `IVexConnector`, `IVexExporter`, `IVexAttestationClient`, `VexCanonicalJsonSerializer` |
|
||||
| `StellaOps.Vexer.Policy` | Loads operator policy (weights, overrides, justification gates) and exposes snapshots for consensus. | `IVexPolicyProvider`, `IVexPolicyEvaluator`, `VexPolicyOptions` |
|
||||
| `StellaOps.Vexer.Storage.Mongo` | Persistence layer for providers, raw docs, claims, consensus, exports, cache. | `IVexRawStore`, `IVexExportStore`, Mongo class maps |
|
||||
| `StellaOps.Vexer.Export` | Orchestrates export pipeline (query signature → cache lookup → snapshot build → attestation handoff). | `IExportEngine`, `IVexExportDataSource` |
|
||||
| `StellaOps.Vexer.Attestation` *(planned)* | Builds in-toto/DSSE envelopes and communicates with Sigstore/Rekor. | `IVexAttestationClient` |
|
||||
| `StellaOps.Vexer.WebService` *(planned)* | Minimal API host for ingest/export endpoints. | `AddVexerWebService()` |
|
||||
| `StellaOps.Vexer.Worker` *(planned)* | Background executor for scheduled pulls, verification, reconciliation, cache GC. | Hosted services |
|
||||
## 0) Mission & role in the platform
|
||||
|
||||
All modules target .NET 10 preview and follow the same deterministic logging and serialization conventions as Feedser.
|
||||
**Mission.** Convert heterogeneous **VEX** statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into **canonical, queryable claims**; compute **deterministic consensus** per *(vuln, product)*; preserve **conflicts with provenance**; publish **stable, attestable exports** that the backend uses to suppress non‑exploitable findings, prioritize remaining risk, and explain decisions.
|
||||
|
||||
## 2. Data model
|
||||
**Boundaries.**
|
||||
|
||||
MongoDB acts as the canonical store; collections (with logical responsibilities) are:
|
||||
* Vexer **does not** decide PASS/FAIL. It supplies **evidence** (statuses + justifications + provenance weights).
|
||||
* Vexer preserves **conflicting claims** unchanged; consensus encodes how we would pick, but the raw set is always exportable.
|
||||
* VEX consumption is **backend‑only**: Scanner never applies VEX. The backend’s **Policy Engine** asks Vexer for status evidence and then decides what to show.
|
||||
|
||||
- `vex.providers` – provider metadata, trust tiers, discovery endpoints, and cosign/PGP details.
|
||||
- `vex.raw` – immutable raw documents (CSAF, CycloneDX VEX, OpenVEX, OCI attestations) with digests, retrieval metadata, and signature state.
|
||||
- `vex.claims` – normalized `VexClaim` rows; deduped on `(providerId, vulnId, productKey, docDigest)`.
|
||||
- `vex.consensus` – consensus projections per `(vulnId, productKey)` capturing rollup status, source weights, conflicts, and policy revision.
|
||||
- `vex.exports` – export manifests containing artifact digests, cache metadata, and attestation pointers.
|
||||
- `vex.cache` – index from `querySignature`/`format` to export digest for fast reuse.
|
||||
- `vex.migrations` – tracks applied storage migrations (index bootstrap, future schema updates).
|
||||
---
|
||||
|
||||
GridFS is used for large raw payloads when necessary, and artifact stores (S3/MinIO/file) hold serialized exports referenced by `vex.exports`.
|
||||
## 1) Inputs, outputs & canonical domain
|
||||
|
||||
## 3. Ingestion and reconciliation flow
|
||||
### 1.1 Accepted input formats (ingest)
|
||||
|
||||
1. **Discovery & configuration** – connectors load YAML/JSON settings via `StellaOps.Vexer.Policy` (provider enablement, trust overrides).
|
||||
2. **Fetch** – each `IVexConnector` pulls source windows, writing raw documents through `IVexRawDocumentSink` (Mongo-backed) with dedupe on digest.
|
||||
3. **Verification** – signatures/attestations validated through `IVexSignatureVerifier`; metadata stored alongside raw records.
|
||||
4. **Normalization** – format-specific `IVexNormalizer` instances translate raw payloads to canonical `VexClaim` batches.
|
||||
5. **Consensus** – `VexConsensusResolver` (Core) consumes claims with policy weights supplied by `IVexPolicyEvaluator`, producing deterministic consensus entries and conflict annotations.
|
||||
6. **Export** – query requests pass through `VexExportEngine`, generating `VexExportManifest` instances, caching by `VexQuerySignature`, and emitting artifacts for attestation/signature.
|
||||
7. **Attestation & transparency** *(planned)* – `IVexAttestationClient` signs exports (in-toto/DSSE) and records bundles in Rekor v2.
|
||||
* **OpenVEX** JSON documents (attested or raw).
|
||||
* **CSAF VEX** 2.x (vendor PSIRTs and distros commonly publish CSAF).
|
||||
* **CycloneDX VEX** 1.4+ (standalone VEX or embedded VEX blocks).
|
||||
* **OCI‑attached attestations** (VEX statements shipped as OCI referrers) — optional connectors.
|
||||
|
||||
The Worker coordinates the long-running steps (fetch/verify/normalize/export), while the WebService exposes synchronous APIs for on-demand operations and status lookups.
|
||||
All connectors register **source metadata**: provider identity, trust tier, signature expectations (PGP/cosign/PKI), fetch windows, rate limits, and time anchors.
|
||||
|
||||
## 4. Policy semantics
|
||||
### 1.2 Canonical model (normalized)
|
||||
|
||||
- **Weights** – default tiers (`vendor=1.0`, `distro=0.9`, `platform=0.7`, `hub=0.5`, `attestation=0.6`) loaded via `VexPolicyOptions.Weights`, with per-provider overrides.
|
||||
- **Justification gates** – policy enforces that `not_affected` claims must provide a recognized justification; rejected claims are preserved as conflicts with reason metadata.
|
||||
- **Diagnostics** – policy snapshots carry structured issues for misconfigurations (out-of-range weights, empty overrides) surfaced to operators via logs and future CLI/Web endpoints.
|
||||
Every incoming statement becomes a set of **VexClaim** records:
|
||||
|
||||
Policy snapshots are immutable and versioned so consensus records capture the policy revision used during evaluation.
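The weight rules above can be sketched as a small lookup. This is only an illustration: `WeightResolver` and its members are hypothetical names, not the `VexPolicyOptions` API, and clamping overrides to `[0, 1]` is an assumption (the document only says that out-of-range weights are reported as diagnostics).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of StellaOps.Vexer.Policy.
public static class WeightResolver
{
    // Default tier weights as listed in this section.
    private static readonly IReadOnlyDictionary<string, double> TierDefaults =
        new Dictionary<string, double>(StringComparer.Ordinal)
        {
            ["vendor"] = 1.0,
            ["distro"] = 0.9,
            ["platform"] = 0.7,
            ["hub"] = 0.5,
            ["attestation"] = 0.6,
        };

    // A per-provider override wins over the tier default.
    // Assumption: overrides are clamped to [0, 1]; unknown tiers weigh 0.
    public static double Resolve(
        string providerId,
        string trustTier,
        IReadOnlyDictionary<string, double> providerOverrides)
    {
        if (providerOverrides.TryGetValue(providerId, out var overridden))
        {
            return Math.Clamp(overridden, 0.0, 1.0);
        }

        return TierDefaults.TryGetValue(trustTier, out var tierWeight) ? tierWeight : 0.0;
    }
}
```

With an override of `suse: 0.95`, a SUSE claim would score 0.95 while any other distro falls back to the tier default of 0.9.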
|
||||
```
|
||||
VexClaim
|
||||
- providerId // 'redhat', 'suse', 'ubuntu', 'github', 'vendorX'
|
||||
- vulnId // 'CVE-2025-12345', 'GHSA-xxxx', canonicalized
|
||||
- productKey // canonical product identity (see §2.2)
|
||||
- status // affected | not_affected | fixed | under_investigation
|
||||
- justification? // for 'not_affected'/'affected' where provided
|
||||
- introducedVersion? // semantics per provider (range or exact)
|
||||
- fixedVersion? // where provided (range or exact)
|
||||
- lastObserved // timestamp from source or fetch time
|
||||
- provenance // doc digest, signature status, fetch URI, line/offset anchors
|
||||
- evidence[] // raw source snippets for explainability
|
||||
- supersedes? // optional cross-doc chain (docDigest → docDigest)
|
||||
```
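For readers who prefer types to field lists, the model above might map to a C# record roughly as follows. The names `VexClaimSketch`, `VexClaimStatus`, and `VexProvenance` are illustrative assumptions; the actual `VexClaim` type in `StellaOps.Vexer.Core` is not shown in this document and may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Illustrative shape only; not the real StellaOps.Vexer.Core types.
public enum VexClaimStatus { Affected, NotAffected, Fixed, UnderInvestigation }

public sealed record VexProvenance(string DocDigest, string FetchUri, bool SignatureVerified);

public sealed record VexClaimSketch(
    string ProviderId,
    string VulnId,
    string ProductKey,
    VexClaimStatus Status,
    DateTimeOffset LastObserved,
    VexProvenance Provenance)
{
    // Optional fields from the model above.
    public string? Justification { get; init; }
    public string? IntroducedVersion { get; init; }
    public string? FixedVersion { get; init; }
    public IReadOnlyList<string> Evidence { get; init; } = Array.Empty<string>();
    public string? Supersedes { get; init; }

    // Dedupe key used by vex.claims: (providerId, vulnId, productKey, docDigest).
    public (string ProviderId, string VulnId, string ProductKey, string DocDigest) DedupeKey =>
        (ProviderId, VulnId, ProductKey, Provenance.DocDigest);
}
```

Two claims that differ only in status or justification share a `DedupeKey`, which is what lets storage upsert rather than duplicate rows from the same document.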
|
||||
|
||||
## 5. Determinism & caching
|
||||
### 1.3 Exports (consumption)
|
||||
|
||||
- JSON serialization uses `VexCanonicalJsonSerializer`, enforcing property ordering and camelCase naming for reproducible snapshots and test fixtures.
|
||||
- `VexQuerySignature` produces canonical filter/order strings and SHA-256 digests, enabling cache keys shared across services.
|
||||
- Export manifests reuse cached artifacts when the same signature/format is requested unless `ForceRefresh` is explicitly set.
|
||||
- For scoring multiple sources on the same VEX topic, see `VEXER_SCORRING.md`.
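To make the cache-key idea concrete, here is a minimal sketch of a canonical filter string hashed with SHA-256. `QuerySignatureSketch` is a hypothetical stand-in; the canonical form that `VexQuerySignature` really emits is not specified here, and the `key=value&...` layout is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical stand-in for VexQuerySignature; the real canonical form may differ.
public static class QuerySignatureSketch
{
    // Builds "key=value&key=value" with pairs in ordinal order, so logically
    // identical queries yield the same string. Values are not escaped here.
    public static string Canonicalize(IEnumerable<KeyValuePair<string, string>> filters) =>
        string.Join("&", filters
            .OrderBy(f => f.Key, StringComparer.Ordinal)
            .ThenBy(f => f.Value, StringComparer.Ordinal)
            .Select(f => $"{f.Key}={f.Value}"));

    // Cache key: lowercase hex SHA-256 over the UTF-8 bytes of "format|filters".
    public static string Digest(string format, IEnumerable<KeyValuePair<string, string>> filters)
    {
        var canonical = $"{format}|{Canonicalize(filters)}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```

Because the pairs are sorted before hashing, the order in which a caller lists filters does not change the digest, which is what allows different services to share cache entries.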
|
||||
* **VexConsensus** per `(vulnId, productKey)` with:
|
||||
|
||||
## 6. Observability & offline posture
|
||||
* `rollupStatus` (after policy weights/justification gates),
|
||||
* `sources[]` (winning + losing claims with weights & reasons),
|
||||
* `policyRevisionId` (identifier of the Vexer policy used),
|
||||
* `consensusDigest` (stable SHA‑256 over canonical JSON).
|
||||
* **Raw claims** export for auditing (unchanged, with provenance).
|
||||
* **Provider snapshots** (per source, last N days) for operator debugging.
|
||||
* **Index** optimized for backend joins: `(productKey, vulnId) → (status, confidence, sourceSet)`.
|
||||
|
||||
- Structured logs (`ILogger`) capture correlation IDs, query signatures, provider IDs, and policy revisions. Metrics/OTel instrumentation will mirror Feedser once tracing hooks are added.
|
||||
- Offline-first: connectors, policy bundles, and export caches can be bundled inside the Offline Kit; no mandatory outbound calls beyond configured provider allowlists.
|
||||
- Operator tooling (CLI/WebService) will expose diagnostics (policy issues, verification failures, cache status) so air-gapped deployments maintain visibility without external telemetry.
|
||||
All exports are **deterministic**, and (optionally) **attested** via DSSE and logged to Rekor v2.
|
||||
|
||||
## 7. Roadmap highlights
|
||||
---
|
||||
|
||||
- Complete storage mappings for providers/consensus/cache and add migrations/indices per collection.
|
||||
- Implement Rekor/in-toto attestation clients and wire export engine to produce signed bundles.
|
||||
- Build WebService endpoints (`/vexer/status`, `/vexer/claims`, `/vexer/exports`) plus CLI verbs mirroring Feedser patterns.
|
||||
- Provide CSAF, CycloneDX VEX, and OpenVEX normalizers along with vendor-specific connectors (Red Hat, Cisco, SUSE, MSRC, Oracle, Ubuntu, OCI attestation).
|
||||
- Extend policy diagnostics with schema validation, change tracking, and operator-facing diff reports.
|
||||
- Mongo bootstrapper runs ordered migrations (`vex.migrations`) to ensure indexes for raw documents, providers, consensus snapshots, exports, and cache entries.
|
||||
## 2) Identity model — products & joins
|
||||
|
||||
## Appendix A – Policy diagnostics workflow
|
||||
### 2.1 Vuln identity
|
||||
|
||||
- `StellaOps.Vexer.Policy` now exposes `IVexPolicyDiagnostics`, producing deterministic diagnostics reports with timestamp, severity counts, active provider overrides, and the full issue list surfaced by `IVexPolicyProvider`.
|
||||
- CLI/WebService layers should call `IVexPolicyDiagnostics.GetDiagnostics()` to display operator-friendly summaries (`vexer policy diagnostics` and `/vexer/policy/diagnostics` are the planned entry points).
|
||||
- Recommendations in the report guide operators to resolve blocking errors, review warnings, and audit override usage before consensus runs—embed them directly in UX copy instead of re-deriving logic.
|
||||
- Export/consensus telemetry should log the diagnostic `Version` alongside `policyRevisionId` so dashboards can correlate policy changes with consensus decisions.
|
||||
- Offline installations can persist the diagnostics report (JSON) in the Offline Kit to document policy headroom during audits; the output is deterministic and diff-friendly.
|
||||
- Use `VexPolicyBinder` when ingesting operator-supplied YAML/JSON bundles; it normalizes weight/override values, reports deterministic issues, and returns the consensus-ready `VexConsensusPolicyOptions` used by `VexPolicyProvider`.
|
||||
- Reload telemetry emits `vex.policy.reloads` (tags: `revision`, `version`, `issues`) whenever a new digest is observed—feed this into dashboards to correlate policy changes with consensus outcomes.
|
||||
* Accepts **CVE**, **GHSA**, vendor IDs (MSRC, RHSA…), distro IDs (DSA/USN/RHSA…) — normalized to `vulnId` with alias sets.
|
||||
* **Alias graph** maintained (from Feedser) to map vendor/distro IDs → CVE (primary) and to **GHSA** where applicable.
|
||||
|
||||
### 2.2 Product identity (`productKey`)
|
||||
|
||||
* **Primary:** `purl` (Package URL).
|
||||
* **Secondary links:** `cpe`, **OS package NVRA/EVR**, NuGet/Maven/Golang identity, and **OS package name** when purl unavailable.
|
||||
* **Fallback:** `oci:<registry>/<repo>@<digest>` for image‑level VEX.
|
||||
* **Special cases:** kernel modules, firmware, platforms → provider‑specific mapping helpers (connector captures provider’s product taxonomy → canonical `productKey`).
|
||||
|
||||
> Vexer does not invent identities. If a provider cannot be mapped to purl/CPE/NVRA deterministically, we keep the native **product string** and mark the claim as **non‑joinable**; the backend will ignore it unless a policy explicitly whitelists that provider mapping.
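The preference order described in this subsection could be expressed as a simple fallback chain. This is a sketch under assumptions: `ProductIdentity` and `ProductKeySketch` are hypothetical names, and real connectors would also handle NVRA/EVR and ecosystem-specific identities that are omitted here.

```csharp
#nullable enable

// Hypothetical result type: the chosen key plus whether the backend can join on it.
public sealed record ProductIdentity(string ProductKey, bool Joinable);

public static class ProductKeySketch
{
    // purl first, then cpe, then an image-level OCI reference; otherwise keep
    // the provider's native product string and flag the claim as non-joinable.
    public static ProductIdentity Resolve(string? purl, string? cpe, string? ociRef, string nativeProduct)
    {
        if (!string.IsNullOrWhiteSpace(purl)) return new(purl, true);
        if (!string.IsNullOrWhiteSpace(cpe)) return new(cpe, true);
        if (!string.IsNullOrWhiteSpace(ociRef)) return new($"oci:{ociRef}", true);
        return new(nativeProduct, false);
    }
}
```

The key point is the last branch: nothing is guessed, so an unmappable product keeps its original string and is simply marked as not joinable.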
|
||||
|
||||
---
|
||||
|
||||
## 3) Storage schema (MongoDB)
|
||||
|
||||
Database: `vexer`
|
||||
|
||||
### 3.1 Collections
|
||||
|
||||
**`vex.providers`**
|
||||
|
||||
```
|
||||
_id: providerId
|
||||
name, homepage, contact
|
||||
trustTier: enum {vendor, distro, platform, hub, attestation}
|
||||
signaturePolicy: { type: pgp|cosign|x509|none, keys[], certs[], cosignKeylessRoots[] }
|
||||
fetch: { baseUrl, kind: http|oci|file, rateLimit, etagSupport, windowDays }
|
||||
enabled: bool
|
||||
createdAt, modifiedAt
|
||||
```
|
||||
|
||||
**`vex.raw`** (immutable raw documents)
|
||||
|
||||
```
|
||||
_id: sha256(doc bytes)
|
||||
providerId
|
||||
uri
|
||||
ingestedAt
|
||||
contentType
|
||||
sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
|
||||
payload: GridFS pointer (if large)
|
||||
disposition: kept|replaced|superseded
|
||||
correlation: { replaces?: sha256, replacedBy?: sha256 }
|
||||
```
|
||||
|
||||
**`vex.claims`** (normalized rows; dedupe on providerId+vulnId+productKey+docDigest)
|
||||
|
||||
```
|
||||
_id
|
||||
providerId
|
||||
vulnId
|
||||
productKey
|
||||
status
|
||||
justification?
|
||||
introducedVersion?
|
||||
fixedVersion?
|
||||
lastObserved
|
||||
docDigest
|
||||
provenance { uri, line?, pointer?, signatureState }
|
||||
evidence[] { key, value, locator }
|
||||
indices:
|
||||
- {vulnId:1, productKey:1}
|
||||
- {providerId:1, lastObserved:-1}
|
||||
- {status:1}
|
||||
- text index (optional) on evidence.value for debugging
|
||||
```
|
||||
|
||||
**`vex.consensus`** (rollups)
|
||||
|
||||
```
|
||||
_id: sha256(canonical(vulnId, productKey, policyRevision))
|
||||
vulnId
|
||||
productKey
|
||||
rollupStatus
|
||||
sources[]: [
|
||||
{ providerId, status, justification?, weight, lastObserved, accepted:bool, reason }
|
||||
]
|
||||
policyRevisionId
|
||||
evaluatedAt
|
||||
consensusDigest // same as _id
|
||||
indices:
|
||||
- {vulnId:1, productKey:1}
|
||||
- {policyRevisionId:1, evaluatedAt:-1}
|
||||
```
|
||||
|
||||
**`vex.exports`** (manifest of emitted artifacts)
|
||||
|
||||
```
|
||||
_id
|
||||
querySignature
|
||||
format: raw|consensus|index
|
||||
artifactSha256
|
||||
rekor { uuid, index, url }?
|
||||
createdAt
|
||||
policyRevisionId
|
||||
cacheable: bool
|
||||
```
|
||||
|
||||
**`vex.cache`**
|
||||
|
||||
```
|
||||
querySignature -> exportId (for fast reuse)
|
||||
ttl, hits
|
||||
```
|
||||
|
||||
**`vex.migrations`**
|
||||
|
||||
* ordered migrations applied at bootstrap to ensure indexes.
|
||||
|
||||
### 3.2 Indexing strategy
|
||||
|
||||
* Hot path queries use exact `(vulnId, productKey)` and time‑bounded windows; compound indexes cover both.
|
||||
* The providers list view orders by `lastObserved` to monitor staleness.
|
||||
* `vex.consensus` keyed by `(vulnId, productKey, policyRevision)` for deterministic reuse.
|
||||
|
||||
---
|
||||
|
||||
## 4) Ingestion pipeline
|
||||
|
||||
### 4.1 Connector contract
|
||||
|
||||
```csharp
|
||||
public interface IVexConnector
|
||||
{
|
||||
string ProviderId { get; }
|
||||
Task FetchAsync(VexConnectorContext ctx, CancellationToken ct); // raw docs
|
||||
Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct); // raw -> VexClaim[]
|
||||
}
|
||||
```
|
||||
|
||||
* **Fetch** must implement: window scheduling, conditional GET (ETag/If‑Modified‑Since), rate limiting, retry/backoff.
|
||||
* **Normalize** parses the format, validates schema, maps product identities deterministically, emits `VexClaim` records with **provenance**.
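A connector skeleton against this contract might look like the sketch below. `VexConnectorContext` is not defined in this document, so the version here is a stand-in with two in-memory lists, and the raw JSON shape is invented purely for illustration. The interface is repeated only so the snippet compiles on its own.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Stand-in so the sketch compiles alone; the real context type is not shown here.
public sealed class VexConnectorContext
{
    public List<string> RawDocuments { get; } = new();
    public List<string> Claims { get; } = new();
}

// Contract as given above.
public interface IVexConnector
{
    string ProviderId { get; }
    Task FetchAsync(VexConnectorContext ctx, CancellationToken ct);
    Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct);
}

// Hypothetical connector; a real one talks to a provider endpoint.
public sealed class ExampleConnector : IVexConnector
{
    public string ProviderId => "example";

    public Task FetchAsync(VexConnectorContext ctx, CancellationToken ct)
    {
        ct.ThrowIfCancellationRequested();
        // A real connector performs windowed, conditional GETs with rate limiting and retry.
        ctx.RawDocuments.Add("{\"vulnId\":\"CVE-2025-12345\",\"status\":\"not_affected\"}");
        return Task.CompletedTask;
    }

    public Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct)
    {
        foreach (var raw in ctx.RawDocuments)
        {
            ct.ThrowIfCancellationRequested();
            // Placeholder mapping: real normalization validates the schema and emits VexClaim records.
            using var doc = JsonDocument.Parse(raw);
            var root = doc.RootElement;
            ctx.Claims.Add($"{ProviderId}|{root.GetProperty("vulnId").GetString()}|{root.GetProperty("status").GetString()}");
        }

        return Task.CompletedTask;
    }
}
```

Keeping fetch and normalize as separate passes means a normalizer fix can be replayed over stored raw documents without contacting the provider again.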
|
||||
|
||||
### 4.2 Signature verification (per provider)
|
||||
|
||||
* **cosign (keyless or keyful)** for OCI referrers or HTTP‑served JSON with Sigstore bundles.
|
||||
* **PGP** (provider keyrings) for distro/vendor feeds that sign docs.
|
||||
* **x509** (mutual TLS / provider‑pinned certs) where applicable.
|
||||
* Signature state is stored on **vex.raw.sig** and copied into **provenance.signatureState** on claims.
|
||||
|
||||
> Claims from sources failing signature policy are marked `"signatureState.verified=false"` and **policy** can down‑weight or ignore them.
|
||||
|
||||
### 4.3 Time discipline
|
||||
|
||||
* For each doc, prefer **provider’s document timestamp**; if absent, use fetch time.
|
||||
* Claims carry `lastObserved` which drives **tie‑breaking** within equal weight tiers.
|
||||
|
||||
---
|
||||
|
||||
## 5) Normalization: product & status semantics
|
||||
|
||||
### 5.1 Product mapping
|
||||
|
||||
* **purl** first; **cpe** second; OS package NVRA/EVR mapping helpers (distro connectors) produce purls via canonical tables (e.g., rpm→purl:rpm, deb→purl:deb).
|
||||
* Where a provider publishes **platform‑level** VEX (e.g., “RHEL 9 not affected”), connectors expand to known product inventory rules (e.g., map to sets of packages/components shipped in the platform). Expansion tables are versioned and kept per provider; every expansion emits **evidence** indicating the rule applied.
|
||||
* If expansion would be speculative, the claim remains **platform‑scoped** with `productKey="platform:redhat:rhel:9"` and is flagged **non‑joinable**; backend can decide to use platform VEX only when Scanner proves the platform runtime.
|
||||
|
||||
### 5.2 Status + justification mapping
|
||||
|
||||
* Canonical **status**: `affected | not_affected | fixed | under_investigation`.
|
||||
* **Justifications** normalized to a controlled vocabulary (CISA‑aligned), e.g.:
|
||||
|
||||
* `component_not_present`
|
||||
* `vulnerable_code_not_in_execute_path`
|
||||
* `vulnerable_configuration_unused`
|
||||
* `inline_mitigation_applied`
|
||||
* `fix_available` (with `fixedVersion`)
|
||||
* `under_investigation`
|
||||
* Providers with free‑text justifications are mapped by deterministic tables; raw text preserved as `evidence`.
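A deterministic mapping table for free-text justifications could be as small as a case-insensitive dictionary. The phrases below are invented examples, not taken from any provider, and `JustificationMapSketch` is a hypothetical name.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class JustificationMapSketch
{
    // Hypothetical provider phrases; real tables would be versioned per provider.
    private static readonly Dictionary<string, string> Table = new(StringComparer.OrdinalIgnoreCase)
    {
        ["package not shipped"] = "component_not_present",
        ["code path not reachable"] = "vulnerable_code_not_in_execute_path",
        ["mitigation already in place"] = "inline_mitigation_applied",
    };

    // Returns the canonical justification (or null when unmapped) and always
    // hands back the raw text so it can be preserved as evidence.
    public static (string? Justification, string Evidence) Map(string freeText)
    {
        if (Table.TryGetValue(freeText.Trim(), out var canonical))
        {
            return (canonical, freeText);
        }

        return (null, freeText);
    }
}
```

An unmapped phrase yields `null` rather than a best guess, which leaves the decision to the justification gate in policy.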
|
||||
|
||||
---
|
||||
|
||||
## 6) Consensus algorithm
|
||||
|
||||
**Goal:** produce a **stable**, explainable `rollupStatus` per `(vulnId, productKey)` given possibly conflicting claims.
|
||||
|
||||
### 6.1 Inputs
|
||||
|
||||
* Set **S** of `VexClaim` for the key.
|
||||
* **Vexer policy snapshot**:
|
||||
|
||||
* **weights** per provider tier and per provider overrides.
|
||||
* **justification gates** (e.g., require justification for `not_affected` to be acceptable).
|
||||
* **minEvidence** rules (e.g., `not_affected` must come from ≥1 vendor or 2 distros).
|
||||
* **signature requirements** (e.g., require verified signature for ‘fixed’ to be considered).
|
||||
|
||||
### 6.2 Steps
|
||||
|
||||
1. **Filter invalid** claims by signature policy & justification gates → set `S'`.
|
||||
2. **Score** each claim:
|
||||
`score = weight(provider) * freshnessFactor(lastObserved)` where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect).
|
||||
3. **Aggregate** scores per status: `W(status) = Σ score(claims with that status)`.
|
||||
4. **Pick** `rollupStatus = argmax_status W(status)`.
|
||||
5. **Tie‑breakers** (in order):
|
||||
|
||||
* Higher **max single** provider score wins (vendor > distro > platform > hub).
|
||||
* More **recent** lastObserved wins.
|
||||
* Deterministic fixed precedence of status (`fixed` > `not_affected` > `under_investigation` > `affected`) as final tiebreaker. Note that this is an explicit ranking, not lexicographic order.
|
||||
6. **Explain**: mark accepted sources (`accepted=true; reason="weight"`/`"freshness"`), mark rejected sources with explicit `reason` (`"insufficient_justification"`, `"signature_unverified"`, `"lower_weight"`).
|
||||
|
||||
> The algorithm is **pure** given S and policy snapshot; result is reproducible and hashed into `consensusDigest`.
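The steps in 6.2 can be sketched in a few lines of LINQ. This is not the `VexConsensusResolver` implementation: `ClaimInput` and `ConsensusSketch` are hypothetical, the linear freshness decay over 365 days is an assumption (the document only fixes the range [0.8, 1.0]), and only two gates are modelled, mirroring `requireJustificationForNotAffected` and `signatureRequiredForFixed`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened input; Weight is the already-resolved provider weight.
public sealed record ClaimInput(
    string ProviderId, string Status, double Weight,
    DateTimeOffset LastObserved, bool SignatureVerified, bool HasJustification);

public static class ConsensusSketch
{
    // Final tie-break precedence from step 5 (lower index wins).
    private static readonly string[] Precedence =
        { "fixed", "not_affected", "under_investigation", "affected" };

    // Assumed linear decay from 1.0 down to 0.8 over maxAgeDays.
    public static double FreshnessFactor(DateTimeOffset lastObserved, DateTimeOffset now, double maxAgeDays = 365)
    {
        var ageDays = Math.Max(0, (now - lastObserved).TotalDays);
        return 1.0 - 0.2 * Math.Min(1.0, ageDays / maxAgeDays);
    }

    // Returns the rollup status, or null when no claim survives the gates.
    // Statuses are assumed to be canonical already.
    public static string? Resolve(IEnumerable<ClaimInput> claims, DateTimeOffset now)
    {
        // Step 1: drop unsigned 'fixed' and unjustified 'not_affected' claims.
        var valid = claims.Where(c =>
            !(c.Status == "fixed" && !c.SignatureVerified) &&
            !(c.Status == "not_affected" && !c.HasJustification));

        // Steps 2-3: score each claim, then aggregate per status.
        var perStatus = valid
            .Select(c => (Claim: c, Score: c.Weight * FreshnessFactor(c.LastObserved, now)))
            .GroupBy(x => x.Claim.Status, StringComparer.Ordinal)
            .Select(g => new
            {
                Status = g.Key,
                Total = g.Sum(x => x.Score),
                MaxSingle = g.Max(x => x.Score),
                Latest = g.Max(x => x.Claim.LastObserved),
            });

        // Steps 4-5: argmax with the ordered tie-breakers.
        return perStatus
            .OrderByDescending(s => s.Total)
            .ThenByDescending(s => s.MaxSingle)
            .ThenByDescending(s => s.Latest)
            .ThenBy(s => Array.IndexOf(Precedence, s.Status))
            .Select(s => s.Status)
            .FirstOrDefault();
    }
}
```

Passing `now` in explicitly, instead of reading the clock, keeps the function pure. A production resolver would likely also sort its inputs first so that floating-point sums do not depend on arrival order.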
|
||||
|
||||
---
|
||||
|
||||
## 7) Query & export APIs
|
||||
|
||||
All endpoints are versioned under `/api/v1/vex`.
|
||||
|
||||
### 7.1 Query (online)
|
||||
|
||||
```
|
||||
POST /claims/search
|
||||
body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string }
|
||||
→ { claims[], nextPageToken? }
|
||||
|
||||
POST /consensus/search
|
||||
body: { vulnIds?: string[], productKeys?: string[], policyRevisionId?: string, since?: timestamp, limit?: int, pageToken?: string }
|
||||
→ { entries[], nextPageToken? }
|
||||
|
||||
POST /resolve
|
||||
body: { purls: string[], vulnIds: string[], policyRevisionId?: string }
|
||||
→ { results: [ { vulnId, productKey, rollupStatus, sources[] } ] }
|
||||
```
|
||||
|
||||
### 7.2 Exports (cacheable snapshots)
|
||||
|
||||
```
|
||||
POST /exports
|
||||
body: { signature: { vulnFilter?, productFilter?, providers?, since? }, format: raw|consensus|index, policyRevisionId?: string, force?: bool }
|
||||
→ { exportId, artifactSha256, rekor? }
|
||||
|
||||
GET /exports/{exportId} → bytes (application/json or binary index)
|
||||
GET /exports/{exportId}/meta → { signature, policyRevisionId, createdAt, artifactSha256, rekor? }
|
||||
```
|
||||
|
||||
### 7.3 Provider operations
|
||||
|
||||
```
|
||||
GET /providers → provider list & signature policy
|
||||
POST /providers/{id}/refresh → trigger fetch/normalize window
|
||||
GET /providers/{id}/status → last fetch, doc counts, signature stats
|
||||
```
|
||||
|
||||
**Auth:** service‑to‑service via Authority tokens; operator operations via UI/CLI with RBAC.
|
||||
|
||||
---
|
||||
|
||||
## 8) Attestation integration
|
||||
|
||||
* Exports can be **DSSE‑signed** via **Signer** and logged to **Rekor v2** via **Attestor** (optional but recommended for regulated pipelines).
|
||||
* `vex.exports.rekor` stores `{uuid, index, url}` when present.
|
||||
* **Predicate type**: `https://stella-ops.org/attestations/vex-export/1` with fields:
|
||||
|
||||
* `querySignature`, `policyRevisionId`, `artifactSha256`, `createdAt`.
|
||||
|
||||
---
|
||||
|
||||
## 9) Configuration (YAML)
|
||||
|
||||
```yaml
|
||||
vexer:
|
||||
mongo: { uri: "mongodb://mongo/vexer" }
|
||||
s3:
|
||||
endpoint: http://minio:9000
|
||||
bucket: stellaops
|
||||
policy:
|
||||
weights:
|
||||
vendor: 1.0
|
||||
distro: 0.9
|
||||
platform: 0.7
|
||||
hub: 0.5
|
||||
attestation: 0.6
|
||||
providerOverrides:
|
||||
redhat: 1.0
|
||||
suse: 0.95
|
||||
requireJustificationForNotAffected: true
|
||||
signatureRequiredForFixed: true
|
||||
minEvidence:
|
||||
not_affected:
|
||||
vendorOrTwoDistros: true
|
||||
connectors:
|
||||
- providerId: redhat
|
||||
kind: csaf
|
||||
baseUrl: https://access.redhat.com/security/data/csaf/v2/
|
||||
signaturePolicy: { type: pgp, keys: [ "…redhat-pgp-key…" ] }
|
||||
windowDays: 7
|
||||
- providerId: suse
|
||||
kind: csaf
|
||||
baseUrl: https://ftp.suse.com/pub/projects/security/csaf/
|
||||
signaturePolicy: { type: pgp, keys: [ "…suse-pgp-key…" ] }
|
||||
- providerId: ubuntu
|
||||
kind: openvex
|
||||
baseUrl: https://…/vex/
|
||||
signaturePolicy: { type: none }
|
||||
- providerId: vendorX
|
||||
kind: cyclonedx-vex
|
||||
ociRef: ghcr.io/vendorx/vex@sha256:…
|
||||
signaturePolicy: { type: cosign, cosignKeylessRoots: [ "sigstore-root" ] }
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 10) Security model
|
||||
|
||||
* **Input signature verification** enforced per provider policy (PGP, cosign, x509).
|
||||
* **Connector allowlists**: outbound fetch constrained to configured domains.
|
||||
* **Tenant isolation**: per‑tenant DB prefixes or separate DBs; per‑tenant S3 prefixes; per‑tenant policies.
|
||||
* **AuthN/Z**: Authority‑issued OpToks; RBAC roles (`vex.read`, `vex.admin`, `vex.export`).
|
||||
* **No secrets in logs**; deterministic logging contexts include providerId, docDigest, claim keys.
|
||||
|
||||
---
|
||||
|
||||
## 11) Performance & scale
|
||||
|
||||
* **Targets:**
|
||||
|
||||
* Normalize 10k VEX claims/minute/core.
|
||||
* Consensus compute ≤ 50 ms for 1k unique `(vuln, product)` pairs in hot cache.
|
||||
* Export (consensus) 1M rows in ≤ 60 s on 8 cores with streaming writer.
|
||||
|
||||
* **Scaling:**
|
||||
|
||||
* WebService handles control APIs; **Worker** background services (same image) execute fetch/normalize in parallel with rate‑limits; Mongo writes batched; upserts by natural keys.
|
||||
* Exports stream straight to S3 (MinIO) with rolling buffers.
|
||||
|
||||
* **Caching:**
|
||||
|
||||
* `vex.cache` maps query signatures → export; TTL to avoid stampedes; optimistic reuse unless `force`.
|
||||
|
||||
---
|
||||
|
||||
## 12) Observability
|
||||
|
||||
* **Metrics:**
|
||||
|
||||
* `vex.ingest.docs_total{provider}`
|
||||
* `vex.normalize.claims_total{provider}`
|
||||
* `vex.signature.failures_total{provider,method}`
|
||||
* `vex.consensus.conflicts_total{vulnId}`
|
||||
* `vex.exports.bytes{format}` / `vex.exports.latency_seconds`
|
||||
* **Tracing:** spans for fetch, verify, parse, map, consensus, export.
|
||||
* **Dashboards:** provider staleness, top conflicting vulns/components, signature posture, export cache hit‑rate.
|
||||
|
||||
---
|
||||
|
||||
## 13) Testing matrix
|
||||
|
||||
* **Connectors:** golden raw docs → deterministic claims (fixtures per provider/format).
|
||||
* **Signature policies:** valid/invalid PGP/cosign/x509 samples; ensure rejects are recorded but not accepted.
|
||||
* **Normalization edge cases:** platform‑only claims, free‑text justifications, non‑purl products.
|
||||
* **Consensus:** conflict scenarios across tiers; check tie‑breakers; justification gates.
|
||||
* **Performance:** 1M‑row export timing; memory ceilings; stream correctness.
|
||||
* **Determinism:** same inputs + policy → identical `consensusDigest` and export bytes.
|
||||
* **API contract tests:** pagination, filters, RBAC, rate limits.
|
||||
|
||||
---
|
||||
|
||||
## 14) Integration points
|
||||
|
||||
* **Backend Policy Engine** (in Scanner.WebService): calls `POST /resolve` with batched `(purl, vulnId)` pairs to fetch `rollupStatus + sources`.
|
||||
* **Feedser**: provides alias graph (CVE↔vendor IDs) and may supply VEX‑adjacent metadata (e.g., KEV flag) for policy escalation.
|
||||
* **UI**: VEX explorer screens use `/claims/search` and `/consensus/search`; show conflicts & provenance.
|
||||
* **CLI**: `stellaops vex export --consensus --since 7d --out vex.json` for audits.
|
||||
|
||||
---
|
||||
|
||||
## 15) Failure modes & fallback
|
||||
|
||||
* **Provider unreachable:** stale thresholds trigger warnings; policy can down‑weight stale providers automatically (freshness factor).
|
||||
* **Signature outage:** continue to ingest but mark `signatureState.verified=false`; consensus will likely exclude or down‑weight per policy.
|
||||
* **Schema drift:** unknown fields are preserved as `evidence`; normalization rejects only on **invalid identity** or **status**.
|
||||
|
||||
---
|
||||
|
||||
## 16) Rollout plan (incremental)
|
||||
|
||||
1. **MVP**: OpenVEX + CSAF connectors for 3 major providers (e.g., Red Hat/SUSE/Ubuntu), normalization + consensus + `/resolve`.
|
||||
2. **Signature policies**: PGP for distros; cosign for OCI.
|
||||
3. **Exports + optional attestation**.
|
||||
4. **CycloneDX VEX** connectors; platform claim expansion tables; UI explorer.
|
||||
5. **Scale hardening**: export indexes; conflict analytics.
|
||||
|
||||
---
|
||||
|
||||
## 17) Appendix — canonical JSON (stable ordering)
|
||||
|
||||
All exports and consensus entries are serialized via `VexCanonicalJsonSerializer`:
|
||||
|
||||
* UTF‑8 without BOM;
|
||||
* keys sorted (ASCII);
|
||||
* arrays sorted by `(providerId, vulnId, productKey, lastObserved)` unless semantic order mandated;
|
||||
* timestamps in `YYYY‑MM‑DDThh:mm:ssZ`;
|
||||
* no insignificant whitespace.
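As a rough illustration of these rules, the sketch below sorts object keys ordinally and emits compact JSON using `System.Text.Json.Nodes`. It is not `VexCanonicalJsonSerializer`: `CanonicalJsonSketch` is a hypothetical name, array sorting is deliberately left out because it needs knowledge of the element type, and `DeepClone` assumes .NET 8 or later.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.Json.Nodes;

public static class CanonicalJsonSketch
{
    // Re-emits JSON with object keys in ordinal order and no indentation.
    public static string Canonicalize(string json) =>
        Sort(JsonNode.Parse(json))?.ToJsonString() ?? "null";

    // Rebuilds the tree so every node is fresh (a JsonNode can have only one parent).
    private static JsonNode? Sort(JsonNode? node) => node switch
    {
        JsonObject obj => new JsonObject(obj
            .OrderBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => KeyValuePair.Create(p.Key, Sort(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(n => Sort(n)).ToArray()),
        _ => node?.DeepClone(),
    };

    // Timestamps as UTC with second precision and a literal 'Z' suffix.
    public static string FormatTimestamp(DateTimeOffset value) =>
        value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
}
```

Since the output depends only on the parsed content, two documents that differ in key order or whitespace canonicalize to the same string and therefore hash to the same digest.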
|
||||
|
||||
This architecture keeps Vexer aligned with StellaOps' deterministic, offline-operable design while layering VEX-specific consensus and attestation capabilities on top of the Feedser foundations.
|
||||
|
||||