Vladimir Moushkov c65061602b
2025-10-16 19:44:10 +03:00
# StellaOps Vexer Architecture
Vexer is StellaOps' vulnerability-exploitability (VEX) platform. It ingests VEX statements from multiple providers, normalizes them into canonical claims, projects trust-weighted consensus, and delivers deterministic export artifacts with signed attestations. This document summarizes the target architecture and how the current implementation maps to those goals.
## 1. Solution topology
| Module | Purpose | Key contracts |
| --- | --- | --- |
| `StellaOps.Vexer.Core` | Domain models (`VexClaim`, `VexConsensus`, `VexExportManifest`), deterministic JSON helpers, shared abstractions (connectors, exporters, attestations). | `IVexConnector`, `IVexExporter`, `IVexAttestationClient`, `VexCanonicalJsonSerializer` |
| `StellaOps.Vexer.Policy` | Loads operator policy (weights, overrides, justification gates) and exposes snapshots for consensus. | `IVexPolicyProvider`, `IVexPolicyEvaluator`, `VexPolicyOptions` |
| `StellaOps.Vexer.Storage.Mongo` | Persistence layer for providers, raw docs, claims, consensus, exports, cache. | `IVexRawStore`, `IVexExportStore`, Mongo class maps |
| `StellaOps.Vexer.Export` | Orchestrates export pipeline (query signature → cache lookup → snapshot build → attestation handoff). | `IExportEngine`, `IVexExportDataSource` |
| `StellaOps.Vexer.Attestation` *(planned)* | Builds in-toto/DSSE envelopes and communicates with Sigstore/Rekor. | `IVexAttestationClient` |
| `StellaOps.Vexer.WebService` *(planned)* | Minimal API host for ingest/export endpoints. | `AddVexerWebService()` |
| `StellaOps.Vexer.Worker` *(planned)* | Background executor for scheduled pulls, verification, reconciliation, cache GC. | Hosted services |

All modules target .NET 10 preview and follow the same deterministic logging and serialization conventions as Feedser.
## 2. Data model
MongoDB acts as the canonical store; collections (with logical responsibilities) are:
- `vex.providers`: provider metadata, trust tiers, discovery endpoints, and cosign/PGP details.
- `vex.raw`: immutable raw documents (CSAF, CycloneDX VEX, OpenVEX, OCI attestations) with digests, retrieval metadata, and signature state.
- `vex.claims`: normalized `VexClaim` rows; deduped on `(providerId, vulnId, productKey, docDigest)`.
- `vex.consensus`: consensus projections per `(vulnId, productKey)` capturing rollup status, source weights, conflicts, and policy revision.
- `vex.exports`: export manifests containing artifact digests, cache metadata, and attestation pointers.
- `vex.cache`: index from `querySignature`/`format` to export digest for fast reuse.
- `vex.migrations`: tracks applied storage migrations (index bootstrap, future schema updates).

GridFS is used for large raw payloads when necessary, and artifact stores (S3/MinIO/file) hold serialized exports referenced by `vex.exports`.
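
The dedupe rule for `vex.claims` can be illustrated with a minimal sketch. It is written in Python for brevity and is not the .NET implementation. The `ClaimKey` type, the `dedupe_claims` helper, and the "first claim wins" behavior are assumptions made for illustration; only the composite key `(providerId, vulnId, productKey, docDigest)` comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimKey:
    """Hypothetical composite key mirroring the vex.claims dedupe tuple."""
    provider_id: str
    vuln_id: str
    product_key: str
    doc_digest: str


def dedupe_claims(claims):
    """Keep the first claim seen per composite key, preserving input order.

    "First wins" is an assumption; the document only names the key.
    """
    seen = {}
    for claim in claims:
        key = ClaimKey(
            claim["providerId"],
            claim["vulnId"],
            claim["productKey"],
            claim["docDigest"],
        )
        seen.setdefault(key, claim)
    return list(seen.values())
```

A claim re-ingested from the same document collapses into one row, while the same statement arriving in a document with a different digest is kept as a separate row.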
## 3. Ingestion and reconciliation flow
1. **Discovery & configuration**: connectors load YAML/JSON settings via `StellaOps.Vexer.Policy` (provider enablement, trust overrides).
2. **Fetch**: each `IVexConnector` pulls source windows, writing raw documents through `IVexRawDocumentSink` (Mongo-backed) with dedupe on digest.
3. **Verification**: signatures/attestations validated through `IVexSignatureVerifier`; metadata stored alongside raw records.
4. **Normalization**: format-specific `IVexNormalizer` instances translate raw payloads to canonical `VexClaim` batches.
5. **Consensus**: `VexConsensusResolver` (Core) consumes claims with policy weights supplied by `IVexPolicyEvaluator`, producing deterministic consensus entries and conflict annotations.
6. **Export**: query requests pass through `VexExportEngine`, generating `VexExportManifest` instances, caching by `VexQuerySignature`, and emitting artifacts for attestation/signature.
7. **Attestation & transparency** *(planned)*: `IVexAttestationClient` signs exports (in-toto/DSSE) and records bundles in Rekor v2.

The Worker coordinates the long-running steps (fetch/verify/normalize/export), while the WebService exposes synchronous APIs for on-demand operations and status lookups.
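
The "dedupe on digest" behavior of the fetch step can be sketched as follows. This is an in-memory Python stand-in, not the Mongo-backed `IVexRawDocumentSink`. The class name `InMemoryRawSink`, its `store` signature, the `sha256:` prefix, and the choice of SHA-256 over the raw bytes are all assumptions for illustration.

```python
import hashlib


class InMemoryRawSink:
    """Hypothetical stand-in for a raw document sink that dedupes on digest."""

    def __init__(self):
        self._by_digest = {}

    def store(self, provider_id, source_uri, payload):
        """Store raw bytes once per digest.

        Returns (digest, stored), where stored is False for a duplicate.
        """
        digest = "sha256:" + hashlib.sha256(payload).hexdigest()
        if digest in self._by_digest:
            return digest, False
        self._by_digest[digest] = {
            "providerId": provider_id,
            "sourceUri": source_uri,
            "payload": payload,
        }
        return digest, True

    def __len__(self):
        return len(self._by_digest)
```

With this shape, a connector that re-fetches an overlapping source window does not produce a second raw record for an unchanged document.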
## 4. Policy semantics
- **Weights**: default tiers (`vendor=1.0`, `distro=0.9`, `platform=0.7`, `hub=0.5`, `attestation=0.6`) loaded via `VexPolicyOptions.Weights`, with per-provider overrides.
- **Justification gates**: policy enforces that `not_affected` claims must provide a recognized justification; rejected claims are preserved as conflicts with reason metadata.
- **Diagnostics**: policy snapshots carry structured issues for misconfigurations (out-of-range weights, empty overrides) surfaced to operators via logs and future CLI/Web endpoints.

Policy snapshots are immutable and versioned so consensus records capture the policy revision used during evaluation.
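
To make the interaction between weights and justification gates concrete, here is a minimal Python sketch. It is not the `VexConsensusResolver` algorithm, which this document does not specify. It assumes that weights are summed per status, that the highest total wins, and that ties are broken by status name. The justification set is borrowed from the OpenVEX vocabulary as an assumption, and the claim field names and the `missing_justification` reason string are hypothetical.

```python
from collections import defaultdict

# Default tier weights as listed in the policy section.
DEFAULT_WEIGHTS = {
    "vendor": 1.0, "distro": 0.9, "platform": 0.7, "hub": 0.5, "attestation": 0.6,
}

# Assumption: recognized justifications follow the OpenVEX labels.
RECOGNIZED_JUSTIFICATIONS = frozenset({
    "component_not_present",
    "vulnerable_code_not_present",
    "vulnerable_code_not_in_execute_path",
    "vulnerable_code_cannot_be_controlled_by_adversary",
    "inline_mitigations_already_exist",
})


def resolve_consensus(claims, overrides=None):
    """Sum weights per status; gate not_affected claims on justification."""
    overrides = overrides or {}
    totals = defaultdict(float)
    conflicts = []
    for claim in claims:
        gated = (
            claim["status"] == "not_affected"
            and claim.get("justification") not in RECOGNIZED_JUSTIFICATIONS
        )
        if gated:
            # Rejected claims are kept as conflicts with a reason.
            conflicts.append(
                {"providerId": claim["providerId"], "reason": "missing_justification"}
            )
            continue
        # A per-provider override takes precedence over the tier default.
        weight = overrides.get(claim["providerId"], DEFAULT_WEIGHTS[claim["tier"]])
        totals[claim["status"]] += weight
    if not totals:
        return {"status": None, "weights": {}, "conflicts": conflicts}
    # Highest total wins; ties are broken by status name for determinism.
    status = min(totals, key=lambda s: (-totals[s], s))
    return {"status": status, "weights": dict(sorted(totals.items())), "conflicts": conflicts}
```

In this sketch a `hub` claim of `not_affected` without a justification never contributes weight; it only appears in `conflicts`, which matches the gate described above.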
## 5. Determinism & caching
- JSON serialization uses `VexCanonicalJsonSerializer`, enforcing property ordering and camelCase naming for reproducible snapshots and test fixtures.
- `VexQuerySignature` produces canonical filter/order strings and SHA-256 digests, enabling cache keys shared across services.
- Export manifests reuse cached artifacts when the same signature/format is requested unless `ForceRefresh` is explicitly set.
- For scoring multiple sources on the same VEX topic, see `VEXER_SCORRING.md`.
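
The signature-and-cache behavior described in this section can be sketched in Python. The sketch assumes alphabetical key ordering and compact separators for canonical JSON; the actual ordering rules of `VexCanonicalJsonSerializer` and the string layout of `VexQuerySignature` are not given here. The helper names (`canonical_json`, `query_signature`, `ExportCache`) and the sample filter values are hypothetical.

```python
import hashlib
import json


def canonical_json(value):
    """Serialize with sorted keys and no insignificant whitespace (assumed rules)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def query_signature(filters, order_by):
    """Return (canonical string, SHA-256 digest) for a query."""
    canonical = canonical_json({"filters": filters, "orderBy": order_by})
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return canonical, "sha256:" + digest


class ExportCache:
    """Hypothetical cache keyed by (signature digest, format)."""

    def __init__(self):
        self._entries = {}

    def get_or_build(self, signature_digest, fmt, build, force_refresh=False):
        key = (signature_digest, fmt)
        # Rebuild only on a cache miss or when a refresh is forced.
        if force_refresh or key not in self._entries:
            self._entries[key] = build()
        return self._entries[key]
```

Because the digest is computed over the canonical string, two services that build the same filter set in a different key order arrive at the same cache key.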
## 6. Observability & offline posture
- Structured logs (`ILogger`) capture correlation IDs, query signatures, provider IDs, and policy revisions. Metrics/OTel instrumentation will mirror Feedser once tracing hooks are added.
- Offline-first: connectors, policy bundles, and export caches can be bundled inside the Offline Kit; no mandatory outbound calls beyond configured provider allowlists.
- Operator tooling (CLI/WebService) will expose diagnostics (policy issues, verification failures, cache status) so air-gapped deployments maintain visibility without external telemetry.
## 7. Roadmap highlights
- Complete storage mappings for providers/consensus/cache and add migrations/indices per collection.
- Implement Rekor/in-toto attestation clients and wire export engine to produce signed bundles.
- Build WebService endpoints (`/vexer/status`, `/vexer/claims`, `/vexer/exports`) plus CLI verbs mirroring Feedser patterns.
- Provide CSAF, CycloneDX VEX, and OpenVEX normalizers along with vendor-specific connectors (Red Hat, Cisco, SUSE, MSRC, Oracle, Ubuntu, OCI attestation).
- Extend policy diagnostics with schema validation, change tracking, and operator-facing diff reports.
- Mongo bootstrapper runs ordered migrations (`vex.migrations`) to ensure indexes for raw documents, providers, consensus snapshots, exports, and cache entries.
## Appendix A: Policy diagnostics workflow
- `StellaOps.Vexer.Policy` now exposes `IVexPolicyDiagnostics`, producing deterministic diagnostics reports with timestamp, severity counts, active provider overrides, and the full issue list surfaced by `IVexPolicyProvider`.
- CLI/WebService layers should call `IVexPolicyDiagnostics.GetDiagnostics()` to display operator-friendly summaries (`vexer policy diagnostics` and `/vexer/policy/diagnostics` are the planned entry points).
- Recommendations in the report guide operators to resolve blocking errors, review warnings, and audit override usage before consensus runs—embed them directly in UX copy instead of re-deriving logic.
- Export/consensus telemetry should log the diagnostic `Version` alongside `policyRevisionId` so dashboards can correlate policy changes with consensus decisions.
- Offline installations can persist the diagnostics report (JSON) in the Offline Kit to document policy headroom during audits; the output is deterministic and diff-friendly.
- Use `VexPolicyBinder` when ingesting operator-supplied YAML/JSON bundles; it normalizes weight/override values, reports deterministic issues, and returns the consensus-ready `VexConsensusPolicyOptions` used by `VexPolicyProvider`.
- Reload telemetry emits `vex.policy.reloads` (tags: `revision`, `version`, `issues`) whenever a new digest is observed—feed this into dashboards to correlate policy changes with consensus outcomes.
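
The binder-plus-diagnostics flow above can be approximated with a short Python sketch. It is not the `VexPolicyBinder` or `IVexPolicyDiagnostics` implementation. The valid weight range of 0.0 to 1.0, the issue codes, the `warning` severity, and the report field names are assumptions, and the timestamp mentioned above is left out so the sketch output stays reproducible.

```python
from collections import Counter

DEFAULT_WEIGHTS = {
    "vendor": 1.0, "distro": 0.9, "platform": 0.7, "hub": 0.5, "attestation": 0.6,
}


def _in_range(value):
    """Assumed valid range for weights: a number between 0.0 and 1.0."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and 0.0 <= value <= 1.0
    )


def bind_policy(raw):
    """Normalize an operator bundle into (weights, overrides, issues)."""
    weights = dict(DEFAULT_WEIGHTS)
    overrides = {}
    issues = []
    # Iterate in sorted order so the issue list is deterministic.
    for tier, value in sorted(raw.get("weights", {}).items()):
        if tier not in DEFAULT_WEIGHTS:
            issues.append(("warning", "unknown_tier", tier))
        elif not _in_range(value):
            # Out-of-range values are reported and the default is kept.
            issues.append(("warning", "weight_out_of_range", tier))
        else:
            weights[tier] = float(value)
    for provider, value in sorted(raw.get("overrides", {}).items()):
        if value is None:
            issues.append(("warning", "empty_override", provider))
        elif not _in_range(value):
            issues.append(("warning", "weight_out_of_range", provider))
        else:
            overrides[provider] = float(value)
    return weights, overrides, issues


def diagnostics_report(overrides, issues):
    """Build a diff-friendly report: severity counts, overrides, issue list."""
    counts = Counter(severity for severity, _, _ in issues)
    return {
        "severityCounts": dict(sorted(counts.items())),
        "activeOverrides": dict(sorted(overrides.items())),
        "issues": [
            {"severity": s, "code": c, "target": t} for s, c, t in issues
        ],
    }
```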

This architecture keeps Vexer aligned with StellaOps' deterministic, offline-operable design while layering VEX-specific consensus and attestation capabilities on top of the Feedser foundations.