feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules
- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
										22
									
								
								docs/modules/vexer/AGENTS.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,22 @@ | ||||
| # Vexer agent guide | ||||
|  | ||||
| ## Mission | ||||
| Vexer computes deterministic consensus across VEX claims, preserving conflicts and producing attestable evidence for policy suppression. | ||||
|  | ||||
| ## Key docs | ||||
| - [Module README](./README.md) | ||||
| - [Architecture](./architecture.md) | ||||
| - [Implementation plan](./implementation_plan.md) | ||||
| - [Task board](./TASKS.md) | ||||
|  | ||||
| ## How to get started | ||||
| 1. Open ../../implplan/SPRINTS.md and locate the stories referencing this module. | ||||
| 2. Review ./TASKS.md for local follow-ups and confirm status transitions (TODO → DOING → DONE/BLOCKED). | ||||
| 3. Read the architecture and README for domain context before editing code or docs. | ||||
| 4. Coordinate cross-module changes in the main /AGENTS.md description and through the sprint plan. | ||||
|  | ||||
| ## Guardrails | ||||
| - Honour the Aggregation-Only Contract where applicable (see ../../ingestion/aggregation-only-contract.md). | ||||
| - Preserve determinism: sort outputs, normalise timestamps (UTC ISO-8601), and avoid machine-specific artefacts. | ||||
| - Keep Offline Kit parity in mind—document air-gapped workflows for any new feature. | ||||
| - Update runbooks/observability assets when operational characteristics change. | ||||
							
								
								
									
										34
									
								
								docs/modules/vexer/README.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,34 @@ | ||||
| # StellaOps Vexer | ||||
|  | ||||
| Vexer computes deterministic consensus across VEX claims, preserving conflicts and producing attestable evidence for policy suppression. | ||||
|  | ||||
| ## Responsibilities | ||||
| - Ingest Excititor observations and compute per-product consensus snapshots. | ||||
| - Provide APIs for querying canonical VEX positions and conflict sets. | ||||
| - Publish exports and DSSE-ready digests for downstream consumption. | ||||
| - Keep provenance weights and disagreement metadata. | ||||
|  | ||||
| ## Key components | ||||
| - Consensus engine and API host in `StellaOps.Vexer.*` (to-be-implemented). | ||||
| - Storage schema for consensus graphs. | ||||
| - Integration hooks for Policy Engine suppression logic. | ||||
|  | ||||
| ## Integrations & dependencies | ||||
| - Excititor for raw observations. | ||||
| - Policy Engine and UI for suppression stories. | ||||
| - CLI for evidence inspection. | ||||
|  | ||||
| ## Operational notes | ||||
| - Deterministic consensus algorithms (see architecture). | ||||
| - Planned telemetry for disagreement counts and freshness. | ||||
| - Offline exports aligning with Concelier/Excititor timelines. | ||||
|  | ||||
| ## Related resources | ||||
| - ./scoring.md | ||||
|  | ||||
| ## Backlog references | ||||
| - DOCS-VEXER backlog referenced in architecture doc. | ||||
| - CLI parity tracked in ../../TASKS.md (CLI-GRAPH/VEX stories). | ||||
|  | ||||
| ## Epic alignment | ||||
| - **Epic 7 – VEX Consensus Lens:** deliver trust-weighted consensus snapshots, disagreement metadata, and explain APIs. | ||||
							
								
								
									
										9
									
								
								docs/modules/vexer/TASKS.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,9 @@ | ||||
| # Task board — Vexer | ||||
|  | ||||
| > Local tasks should link back to ./AGENTS.md and mirror status updates into ../../TASKS.md when applicable. | ||||
|  | ||||
| | ID | Status | Owner(s) | Description | Notes | | ||||
| |----|--------|----------|-------------|-------| | ||||
| | VEXER-DOCS-0001 | DOING (2025-10-29) | Docs Guild | Validate that ./README.md aligns with the latest release notes. | See ./AGENTS.md | | ||||
| | VEXER-OPS-0001 | TODO | Ops Guild | Review runbooks/observability assets after next sprint demo. | Sync outcomes back to ../../TASKS.md | | ||||
| | VEXER-ENG-0001 | TODO | Module Team | Cross-check implementation plan milestones against ../../implplan/SPRINTS.md. | Update status via ./AGENTS.md workflow | | ||||
							
								
								
									
										465
									
								
								docs/modules/vexer/architecture.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,465 @@ | ||||
| # component_architecture_vexer.md — **Stella Ops Vexer** (2025Q4) | ||||
|  | ||||
| > Built to satisfy Epic 7 – VEX Consensus Lens requirements. | ||||
|  | ||||
| > **Scope.** This document specifies the **Vexer** service: its purpose, trust model, data structures, APIs, plug‑in contracts, storage schema, normalization/consensus algorithms, performance budgets, testing matrix, and how it integrates with Scanner, Policy, Feedser, and the attestation chain. It is implementation‑ready. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 0) Mission & role in the platform | ||||
|  | ||||
| **Mission.** Convert heterogeneous **VEX** statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into **canonical, queryable claims**; compute **deterministic consensus** per *(vuln, product)*; preserve **conflicts with provenance**; publish **stable, attestable exports** that the backend uses to suppress non‑exploitable findings, prioritize remaining risk, and explain decisions. | ||||
|  | ||||
| **Boundaries.** | ||||
|  | ||||
| * Vexer **does not** decide PASS/FAIL. It supplies **evidence** (statuses + justifications + provenance weights). | ||||
| * Vexer preserves **conflicting claims** unchanged; consensus encodes how we would pick, but the raw set is always exportable. | ||||
| * VEX consumption is **backend‑only**: Scanner never applies VEX. The backend’s **Policy Engine** asks Vexer for status evidence and then decides what to show. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 1) Inputs, outputs & canonical domain | ||||
|  | ||||
| ### 1.1 Accepted input formats (ingest) | ||||
|  | ||||
| * **OpenVEX** JSON documents (attested or raw). | ||||
| * **CSAF VEX** 2.x (vendor PSIRTs and distros commonly publish CSAF). | ||||
| * **CycloneDX VEX** 1.4+ (standalone VEX or embedded VEX blocks). | ||||
| * **OCI‑attached attestations** (VEX statements shipped as OCI referrers) — optional connectors. | ||||
|  | ||||
| All connectors register **source metadata**: provider identity, trust tier, signature expectations (PGP/cosign/PKI), fetch windows, rate limits, and time anchors. | ||||
|  | ||||
| ### 1.2 Canonical model (normalized) | ||||
|  | ||||
| Every incoming statement becomes a set of **VexClaim** records: | ||||
|  | ||||
| ``` | ||||
| VexClaim | ||||
| - providerId           // 'redhat', 'suse', 'ubuntu', 'github', 'vendorX' | ||||
| - vulnId               // 'CVE-2025-12345', 'GHSA-xxxx', canonicalized | ||||
| - productKey           // canonical product identity (see §2.2) | ||||
| - status               // affected | not_affected | fixed | under_investigation | ||||
| - justification?       // for 'not_affected'/'affected' where provided | ||||
| - introducedVersion?   // semantics per provider (range or exact) | ||||
| - fixedVersion?        // where provided (range or exact) | ||||
| - lastObserved         // timestamp from source or fetch time | ||||
| - provenance           // doc digest, signature status, fetch URI, line/offset anchors | ||||
| - evidence[]           // raw source snippets for explainability | ||||
| - supersedes?          // optional cross-doc chain (docDigest → docDigest) | ||||
| ``` | ||||
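As a language-neutral illustration of the record above, here is a minimal Python sketch of a `VexClaim` value type. It is not the StellaOps .NET type: the snake_case field names, the validation rules, and the `dedupe_key` helper are assumptions, with the dedupe tuple modelled on the `providerId+vulnId+productKey+docDigest` key described for `vex.claims`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Canonical statuses as listed in the model above.
STATUSES = ("affected", "not_affected", "fixed", "under_investigation")


@dataclass(frozen=True)
class VexClaim:
    """Illustrative, immutable claim record (field names are assumptions)."""
    provider_id: str
    vuln_id: str
    product_key: str
    status: str
    last_observed: datetime
    doc_digest: str                      # stands in for the provenance doc digest
    justification: Optional[str] = None
    introduced_version: Optional[str] = None
    fixed_version: Optional[str] = None
    evidence: tuple = ()                 # raw source snippets
    supersedes: Optional[str] = None     # docDigest of the superseded document

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.last_observed.tzinfo is None:
            raise ValueError("last_observed must be timezone-aware")

    def dedupe_key(self):
        # Hypothetical helper: the tuple used to detect duplicate rows.
        return (self.provider_id, self.vuln_id, self.product_key, self.doc_digest)
```

Making the record frozen keeps claims hashable and prevents accidental mutation after normalization, which helps when the same claim set must always produce the same digest.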
|  | ||||
| ### 1.3 Exports (consumption) | ||||
|  | ||||
| * **VexConsensus** per `(vulnId, productKey)` with: | ||||
|  | ||||
|   * `rollupStatus` (after policy weights/justification gates), | ||||
|   * `sources[]` (winning + losing claims with weights & reasons), | ||||
|   * `policyRevisionId` (identifier of the Vexer policy used), | ||||
|   * `consensusDigest` (stable SHA‑256 over canonical JSON). | ||||
| * **Raw claims** export for auditing (unchanged, with provenance). | ||||
| * **Provider snapshots** (per source, last N days) for operator debugging. | ||||
| * **Index** optimized for backend joins: `(productKey, vulnId) → (status, confidence, sourceSet)`. | ||||
|  | ||||
| All exports are **deterministic**, and (optionally) **attested** via DSSE and logged to Rekor v2. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 2) Identity model — products & joins | ||||
|  | ||||
| ### 2.1 Vuln identity | ||||
|  | ||||
| * Accepts **CVE**, **GHSA**, vendor IDs (MSRC, RHSA…), distro IDs (DSA/USN/RHSA…) — normalized to `vulnId` with alias sets. | ||||
| * **Alias graph** maintained (from Feedser) to map vendor/distro IDs → CVE (primary) and to **GHSA** where applicable. | ||||
|  | ||||
| ### 2.2 Product identity (`productKey`) | ||||
|  | ||||
| * **Primary:** `purl` (Package URL). | ||||
| * **Secondary links:** `cpe`, **OS package NVRA/EVR**, NuGet/Maven/Golang identity, and **OS package name** when purl unavailable. | ||||
| * **Fallback:** `oci:<registry>/<repo>@<digest>` for image‑level VEX. | ||||
| * **Special cases:** kernel modules, firmware, platforms → provider‑specific mapping helpers (connector captures provider’s product taxonomy → canonical `productKey`). | ||||
|  | ||||
| > Vexer does not invent identities. If a provider cannot be mapped to purl/CPE/NVRA deterministically, we keep the native **product string** and mark the claim as **non‑joinable**; the backend will ignore it unless a policy explicitly whitelists that provider mapping. | ||||
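The precedence in §2.2 can be sketched as a small resolver. This is a simplified illustration under stated assumptions: the function name and parameters are hypothetical, NVRA/EVR mapping is left out, and treating the `oci:` fallback as joinable is an assumption the text does not confirm.

```python
def resolve_product_key(purl=None, cpe=None, oci_ref=None, native=None):
    """Return (product_key, joinable) using purl > cpe > oci > native string.

    Hypothetical helper: only the ordering comes from the section above.
    """
    if purl:
        return purl, True
    if cpe:
        return cpe, True
    if oci_ref:
        # oci_ref is expected as "<registry>/<repo>@<digest>".
        return f"oci:{oci_ref}", True
    if native:
        # No deterministic mapping: keep the provider's string, mark non-joinable.
        return native, False
    raise ValueError("no product identity available")
```

A caller would typically store the `joinable` flag next to the claim so the backend can skip non-joinable rows unless a policy explicitly allows that provider mapping.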
|  | ||||
| --- | ||||
|  | ||||
| ## 3) Storage schema (MongoDB) | ||||
|  | ||||
| Database: `vexer` | ||||
|  | ||||
| ### 3.1 Collections | ||||
|  | ||||
| **`vex.providers`** | ||||
|  | ||||
| ``` | ||||
| _id: providerId | ||||
| name, homepage, contact | ||||
| trustTier: enum {vendor, distro, platform, hub, attestation} | ||||
| signaturePolicy: { type: pgp|cosign|x509|none, keys[], certs[], cosignKeylessRoots[] } | ||||
| fetch: { baseUrl, kind: http|oci|file, rateLimit, etagSupport, windowDays } | ||||
| enabled: bool | ||||
| createdAt, modifiedAt | ||||
| ``` | ||||
|  | ||||
| **`vex.raw`** (immutable raw documents) | ||||
|  | ||||
| ``` | ||||
| _id: sha256(doc bytes) | ||||
| providerId | ||||
| uri | ||||
| ingestedAt | ||||
| contentType | ||||
| sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? } | ||||
| payload: GridFS pointer (if large) | ||||
| disposition: kept|replaced|superseded | ||||
| correlation: { replaces?: sha256, replacedBy?: sha256 } | ||||
| ``` | ||||
|  | ||||
| **`vex.claims`** (normalized rows; dedupe on providerId+vulnId+productKey+docDigest) | ||||
|  | ||||
| ``` | ||||
| _id | ||||
| providerId | ||||
| vulnId | ||||
| productKey | ||||
| status | ||||
| justification? | ||||
| introducedVersion? | ||||
| fixedVersion? | ||||
| lastObserved | ||||
| docDigest | ||||
| provenance { uri, line?, pointer?, signatureState } | ||||
| evidence[] { key, value, locator } | ||||
| indices:  | ||||
|   - {vulnId:1, productKey:1} | ||||
|   - {providerId:1, lastObserved:-1} | ||||
|   - {status:1} | ||||
|   - text index (optional) on evidence.value for debugging | ||||
| ``` | ||||
|  | ||||
| **`vex.consensus`** (rollups) | ||||
|  | ||||
| ``` | ||||
| _id: sha256(canonical(vulnId, productKey, policyRevision)) | ||||
| vulnId | ||||
| productKey | ||||
| rollupStatus | ||||
| sources[]: [ | ||||
|   { providerId, status, justification?, weight, lastObserved, accepted:bool, reason } | ||||
| ] | ||||
| policyRevisionId | ||||
| evaluatedAt | ||||
| consensusDigest  // same as _id | ||||
| indices: | ||||
|   - {vulnId:1, productKey:1} | ||||
|   - {policyRevisionId:1, evaluatedAt:-1} | ||||
| ``` | ||||
|  | ||||
| **`vex.exports`** (manifest of emitted artifacts) | ||||
|  | ||||
| ``` | ||||
| _id | ||||
| querySignature | ||||
| format: raw|consensus|index | ||||
| artifactSha256 | ||||
| rekor { uuid, index, url }? | ||||
| createdAt | ||||
| policyRevisionId | ||||
| cacheable: bool | ||||
| ``` | ||||
|  | ||||
| **`vex.cache`** | ||||
|  | ||||
| ``` | ||||
| querySignature -> exportId (for fast reuse) | ||||
| ttl, hits | ||||
| ``` | ||||
|  | ||||
| **`vex.migrations`** | ||||
|  | ||||
| * ordered migrations applied at bootstrap to ensure indexes. | ||||
|  | ||||
| ### 3.2 Indexing strategy | ||||
|  | ||||
| * Hot path queries use exact `(vulnId, productKey)` and time‑bounded windows; compound indexes cover both. | ||||
| * The providers list view orders by `lastObserved` to monitor staleness. | ||||
| * `vex.consensus` keyed by `(vulnId, productKey, policyRevision)` for deterministic reuse. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 4) Ingestion pipeline | ||||
|  | ||||
| ### 4.1 Connector contract | ||||
|  | ||||
| ```csharp | ||||
| public interface IVexConnector | ||||
| { | ||||
|     string ProviderId { get; } | ||||
|     Task FetchAsync(VexConnectorContext ctx, CancellationToken ct);   // raw docs | ||||
|     Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct); // raw -> VexClaim[] | ||||
| } | ||||
| ``` | ||||
|  | ||||
| * **Fetch** must implement: window scheduling, conditional GET (ETag/If‑Modified‑Since), rate limiting, retry/backoff. | ||||
| * **Normalize** parses the format, validates schema, maps product identities deterministically, emits `VexClaim` records with **provenance**. | ||||
|  | ||||
| ### 4.2 Signature verification (per provider) | ||||
|  | ||||
| * **cosign (keyless or keyful)** for OCI referrers or HTTP‑served JSON with Sigstore bundles. | ||||
| * **PGP** (provider keyrings) for distro/vendor feeds that sign docs. | ||||
| * **x509** (mutual TLS / provider‑pinned certs) where applicable. | ||||
| * Signature state is stored on **vex.raw.sig** and copied into **provenance.signatureState** on claims. | ||||
|  | ||||
| > Claims from sources failing signature policy are marked `"signatureState.verified=false"` and **policy** can down‑weight or ignore them. | ||||
|  | ||||
| ### 4.3 Time discipline | ||||
|  | ||||
| * For each doc, prefer **provider’s document timestamp**; if absent, use fetch time. | ||||
| * Claims carry `lastObserved` which drives **tie‑breaking** within equal weight tiers. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 5) Normalization: product & status semantics | ||||
|  | ||||
| ### 5.1 Product mapping | ||||
|  | ||||
| * **purl** first; **cpe** second; OS package NVRA/EVR mapping helpers (distro connectors) produce purls via canonical tables (e.g., rpm→purl:rpm, deb→purl:deb). | ||||
| * Where a provider publishes **platform‑level** VEX (e.g., “RHEL 9 not affected”), connectors expand to known product inventory rules (e.g., map to sets of packages/components shipped in the platform). Expansion tables are versioned and kept per provider; every expansion emits **evidence** indicating the rule applied. | ||||
| * If expansion would be speculative, the claim remains **platform‑scoped** with `productKey="platform:redhat:rhel:9"` and is flagged **non‑joinable**; backend can decide to use platform VEX only when Scanner proves the platform runtime. | ||||
|  | ||||
| ### 5.2 Status + justification mapping | ||||
|  | ||||
| * Canonical **status**: `affected | not_affected | fixed | under_investigation`. | ||||
| * **Justifications** normalized to a controlled vocabulary (CISA‑aligned), e.g.: | ||||
|  | ||||
|   * `component_not_present` | ||||
|   * `vulnerable_code_not_in_execute_path` | ||||
|   * `vulnerable_configuration_unused` | ||||
|   * `inline_mitigation_applied` | ||||
|   * `fix_available` (with `fixedVersion`) | ||||
|   * `under_investigation` | ||||
| * Providers with free‑text justifications are mapped by deterministic tables; raw text preserved as `evidence`. | ||||
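A deterministic mapping table for free-text justifications might look like the sketch below. The phrases in the table are invented for illustration and are not taken from any provider feed; only the canonical vocabulary values come from the list above.

```python
# Hypothetical phrase table: left side is provider free text (normalized),
# right side is the controlled vocabulary from this section.
JUSTIFICATION_TABLE = {
    "component not present": "component_not_present",
    "vulnerable code not in execute path": "vulnerable_code_not_in_execute_path",
    "vulnerable configuration unused": "vulnerable_configuration_unused",
    "inline mitigation applied": "inline_mitigation_applied",
}


def normalize_justification(raw_text):
    """Map free text to a canonical justification and keep the raw text as evidence."""
    # Case-fold and collapse whitespace so the lookup does not depend on formatting.
    key = " ".join(raw_text.lower().split())
    canonical = JUSTIFICATION_TABLE.get(key)  # None when no table entry matches
    evidence = {"key": "justification.raw", "value": raw_text}
    return canonical, evidence
```

Unmapped text yields `None` rather than a guess, so the justification gate in the consensus policy can decide how to treat it.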
|  | ||||
| --- | ||||
|  | ||||
| ## 6) Consensus algorithm | ||||
|  | ||||
| **Goal:** produce a **stable**, explainable `rollupStatus` per `(vulnId, productKey)` given possibly conflicting claims. | ||||
|  | ||||
| ### 6.1 Inputs | ||||
|  | ||||
| * Set **S** of `VexClaim` for the key. | ||||
| * **Vexer policy snapshot**: | ||||
|  | ||||
|   * **weights** per provider tier and per provider overrides. | ||||
|   * **justification gates** (e.g., require justification for `not_affected` to be acceptable). | ||||
|   * **minEvidence** rules (e.g., `not_affected` must come from ≥1 vendor or 2 distros). | ||||
|   * **signature requirements** (e.g., require verified signature for ‘fixed’ to be considered). | ||||
|  | ||||
| ### 6.2 Steps | ||||
|  | ||||
| 1. **Filter invalid** claims by signature policy & justification gates → set `S'`. | ||||
| 2. **Score** each claim: | ||||
|    `score = weight(provider) * freshnessFactor(lastObserved)` where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect). | ||||
| 3. **Aggregate** scores per status: `W(status) = Σ score(claims with that status)`. | ||||
| 4. **Pick** `rollupStatus = argmax_status W(status)`. | ||||
| 5. **Tie‑breakers** (in order): | ||||
|  | ||||
|    * Higher **max single** provider score wins (vendor > distro > platform > hub). | ||||
|    * More **recent** lastObserved wins. | ||||
|    * Fixed precedence order of status (`fixed` > `not_affected` > `under_investigation` > `affected`) as the final, deterministic tiebreaker. | ||||
| 6. **Explain**: mark accepted sources (`accepted=true; reason="weight"`/`"freshness"`), mark rejected sources with explicit `reason` (`"insufficient_justification"`, `"signature_unverified"`, `"lower_weight"`). | ||||
|  | ||||
| > The algorithm is **pure** given S and policy snapshot; result is reproducible and hashed into `consensusDigest`. | ||||
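The steps in §6.2 can be sketched in Python as follows. This is a minimal illustration, not the Vexer implementation. Assumptions: claims are plain dicts with `providerId`, `tier`, `status`, `verified`, and `lastObserved` keys; freshness decays linearly over a hypothetical 30-day window; `minEvidence` rules and per-provider overrides are omitted.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

TIER_WEIGHTS = {"vendor": 1.0, "distro": 0.9, "platform": 0.7, "hub": 0.5}
STATUS_PRECEDENCE = ["fixed", "not_affected", "under_investigation", "affected"]


def freshness_factor(last_observed, now, max_age_days=30.0):
    """Linear decay from 1.0 (fresh) down to 0.8 (max_age_days old or older)."""
    age_days = max((now - last_observed).total_seconds() / 86400.0, 0.0)
    return 1.0 - 0.2 * min(age_days / max_age_days, 1.0)


def reject_reason(claim):
    """Step 1: justification gate and signature requirement (simplified)."""
    if claim["status"] == "not_affected" and not claim.get("justification"):
        return "insufficient_justification"
    if claim["status"] == "fixed" and not claim.get("verified", False):
        return "signature_unverified"
    return None


def compute_consensus(claims, now):
    # Sort first so float sums do not depend on input order.
    ordered = sorted(claims, key=lambda c: (c["providerId"], c["status"], c["lastObserved"]))
    scored, sources = [], []
    for c in ordered:
        reason = reject_reason(c)
        if reason:
            sources.append({"providerId": c["providerId"], "status": c["status"],
                            "accepted": False, "reason": reason})
            continue
        # Step 2: score = weight(provider) * freshnessFactor(lastObserved).
        scored.append((c, TIER_WEIGHTS[c["tier"]] * freshness_factor(c["lastObserved"], now)))
    if not scored:
        return None, sources

    # Step 3: aggregate per status; also track tie-breaker inputs.
    total, best, newest = defaultdict(float), defaultdict(float), {}
    for c, score in scored:
        s = c["status"]
        total[s] += score
        best[s] = max(best[s], score)
        newest[s] = max(newest.get(s, c["lastObserved"]), c["lastObserved"])

    # Steps 4-5: argmax with tie-breakers applied in the documented order.
    rollup = max(total, key=lambda s: (total[s], best[s], newest[s],
                                       -STATUS_PRECEDENCE.index(s)))

    # Step 6: explain each scored source.
    for c, score in scored:
        won = c["status"] == rollup
        sources.append({"providerId": c["providerId"], "status": c["status"], "weight": score,
                        "accepted": won, "reason": "weight" if won else "lower_weight"})
    sources.sort(key=lambda s: (s["providerId"], s["status"]))
    return rollup, sources
```

Sorting the claims before summing is a design choice of this sketch: floating-point addition is not associative, so a fixed summation order is one way to keep the result independent of the order in which claims arrive.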
|  | ||||
| --- | ||||
|  | ||||
| ## 7) Query & export APIs | ||||
|  | ||||
| All endpoints are versioned under `/api/v1/vex`. | ||||
|  | ||||
| ### 7.1 Query (online) | ||||
|  | ||||
| ``` | ||||
| POST /claims/search | ||||
|   body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string } | ||||
|   → { claims[], nextPageToken? } | ||||
|  | ||||
| POST /consensus/search | ||||
|   body: { vulnIds?: string[], productKeys?: string[], policyRevisionId?: string, since?: timestamp, limit?: int, pageToken?: string } | ||||
|   → { entries[], nextPageToken? } | ||||
|  | ||||
| POST /excititor/resolve (scope: vex.read) | ||||
|   body: { productKeys?: string[], purls?: string[], vulnerabilityIds: string[], policyRevisionId?: string } | ||||
|   → { policy, resolvedAt, results: [ { vulnerabilityId, productKey, status, sources[], conflicts[], decisions[], signals?, summary?, envelope: { artifact, contentSignature?, attestation?, attestationEnvelope?, attestationSignature? } } ] } | ||||
| ``` | ||||
|  | ||||
| ### 7.2 Exports (cacheable snapshots) | ||||
|  | ||||
| ``` | ||||
| POST /exports | ||||
|   body: { signature: { vulnFilter?, productFilter?, providers?, since? }, format: raw|consensus|index, policyRevisionId?: string, force?: bool } | ||||
|   → { exportId, artifactSha256, rekor? } | ||||
|  | ||||
| GET  /exports/{exportId}        → bytes (application/json or binary index) | ||||
| GET  /exports/{exportId}/meta   → { signature, policyRevisionId, createdAt, artifactSha256, rekor? } | ||||
| ``` | ||||
|  | ||||
| ### 7.3 Provider operations | ||||
|  | ||||
| ``` | ||||
| GET  /providers                  → provider list & signature policy | ||||
| POST /providers/{id}/refresh     → trigger fetch/normalize window | ||||
| GET  /providers/{id}/status      → last fetch, doc counts, signature stats | ||||
| ``` | ||||
|  | ||||
| **Auth:** service‑to‑service via Authority tokens; operator operations via UI/CLI with RBAC. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 8) Attestation integration | ||||
|  | ||||
| * Exports can be **DSSE‑signed** via **Signer** and logged to **Rekor v2** via **Attestor** (optional but recommended for regulated pipelines). | ||||
| * `vex.exports.rekor` stores `{uuid, index, url}` when present. | ||||
| * **Predicate type**: `https://stella-ops.org/attestations/vex-export/1` with fields: | ||||
|  | ||||
|   * `querySignature`, `policyRevisionId`, `artifactSha256`, `createdAt`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 9) Configuration (YAML) | ||||
|  | ||||
| ```yaml | ||||
| vexer: | ||||
|   mongo: { uri: "mongodb://mongo/vexer" } | ||||
|   s3: | ||||
|     endpoint: http://minio:9000 | ||||
|     bucket: stellaops | ||||
|   policy: | ||||
|     weights: | ||||
|       vendor: 1.0 | ||||
|       distro: 0.9 | ||||
|       platform: 0.7 | ||||
|       hub: 0.5 | ||||
|       attestation: 0.6 | ||||
|     providerOverrides: | ||||
|       redhat: 1.0 | ||||
|       suse: 0.95 | ||||
|     requireJustificationForNotAffected: true | ||||
|     signatureRequiredForFixed: true | ||||
|     minEvidence: | ||||
|       not_affected: | ||||
|         vendorOrTwoDistros: true | ||||
|   connectors: | ||||
|     - providerId: redhat | ||||
|       kind: csaf | ||||
|       baseUrl: https://access.redhat.com/security/data/csaf/v2/ | ||||
|       signaturePolicy: { type: pgp, keys: [ "…redhat-pgp-key…" ] } | ||||
|       windowDays: 7 | ||||
|     - providerId: suse | ||||
|       kind: csaf | ||||
|       baseUrl: https://ftp.suse.com/pub/projects/security/csaf/ | ||||
|       signaturePolicy: { type: pgp, keys: [ "…suse-pgp-key…" ] } | ||||
|     - providerId: ubuntu | ||||
|       kind: openvex | ||||
|       baseUrl: https://…/vex/ | ||||
|       signaturePolicy: { type: none } | ||||
|     - providerId: vendorX | ||||
|       kind: cyclonedx-vex | ||||
|       ociRef: ghcr.io/vendorx/vex@sha256:… | ||||
|       signaturePolicy: { type: cosign, cosignKeylessRoots: [ "sigstore-root" ] } | ||||
| ``` | ||||
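To illustrate how the `weights` and `providerOverrides` keys above might combine, here is a small Python sketch. It assumes an override replaces the tier weight outright; the configuration shows both keys but does not state how they interact, so treat that rule and the function name as hypothetical.

```python
# Policy values copied from the YAML example above.
POLICY = {
    "weights": {"vendor": 1.0, "distro": 0.9, "platform": 0.7, "hub": 0.5, "attestation": 0.6},
    "providerOverrides": {"redhat": 1.0, "suse": 0.95},
}


def effective_weight(policy, provider_id, trust_tier):
    """Return the provider override if present, else the weight of its trust tier."""
    overrides = policy.get("providerOverrides", {})
    if provider_id in overrides:
        return overrides[provider_id]
    return policy["weights"][trust_tier]
```

With the values above, `suse` resolves to its override while a provider without an override, such as `ubuntu`, falls back to its tier weight.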
|  | ||||
| --- | ||||
|  | ||||
| ## 10) Security model | ||||
|  | ||||
| * **Input signature verification** enforced per provider policy (PGP, cosign, x509). | ||||
| * **Connector allowlists**: outbound fetch constrained to configured domains. | ||||
| * **Tenant isolation**: per‑tenant DB prefixes or separate DBs; per‑tenant S3 prefixes; per‑tenant policies. | ||||
| * **AuthN/Z**: Authority‑issued OpToks; RBAC roles (`vex.read`, `vex.admin`, `vex.export`). | ||||
| * **No secrets in logs**; deterministic logging contexts include providerId, docDigest, claim keys. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 11) Performance & scale | ||||
|  | ||||
| * **Targets:** | ||||
|  | ||||
|   * Normalize 10k VEX claims/minute/core. | ||||
|   * Consensus compute ≤ 50 ms for 1k unique `(vuln, product)` pairs in hot cache. | ||||
|   * Export (consensus) 1M rows in ≤ 60 s on 8 cores with streaming writer. | ||||
|  | ||||
| * **Scaling:** | ||||
|  | ||||
|   * WebService handles control APIs; **Worker** background services (same image) execute fetch/normalize in parallel with rate‑limits; Mongo writes batched; upserts by natural keys. | ||||
|   * Exports stream straight to S3 (MinIO) with rolling buffers. | ||||
|  | ||||
| * **Caching:** | ||||
|  | ||||
|   * `vex.cache` maps query signatures → export; TTL to avoid stampedes; optimistic reuse unless `force`. | ||||
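The caching behaviour described above (signature to export mapping, TTL, reuse unless `force`) can be sketched with an in-memory stand-in. The way the query signature is derived here, a SHA-256 over sorted-key JSON, is an assumption, as are the class and method names; the real service stores this mapping in `vex.cache`.

```python
import hashlib
import json
import time


def query_signature(signature, fmt, policy_revision_id):
    """Hypothetical derivation: hash a sorted-key JSON form of the export request."""
    payload = {"signature": signature, "format": fmt, "policyRevisionId": policy_revision_id}
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


class ExportCache:
    """In-memory sketch of a querySignature -> exportId cache with TTL and hit counts."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get_or_build(self, key, build, force=False):
        now = self._clock()
        entry = self._entries.get(key)
        if not force and entry is not None and now - entry["createdAt"] < self._ttl:
            entry["hits"] += 1
            return entry["exportId"]
        # Miss, expired, or forced: build a fresh export and replace the entry.
        export_id = build()
        self._entries[key] = {"exportId": export_id, "createdAt": now, "hits": 0}
        return export_id
```

Injecting the clock keeps the TTL logic testable without sleeping. A production version would also need per-key locking so concurrent misses do not build the same export twice.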
|  | ||||
| --- | ||||
|  | ||||
| ## 12) Observability | ||||
|  | ||||
| * **Metrics:** | ||||
|  | ||||
|   * `vex.ingest.docs_total{provider}` | ||||
|   * `vex.normalize.claims_total{provider}` | ||||
|   * `vex.signature.failures_total{provider,method}` | ||||
|   * `vex.consensus.conflicts_total{vulnId}` | ||||
|   * `vex.exports.bytes{format}` / `vex.exports.latency_seconds` | ||||
| * **Tracing:** spans for fetch, verify, parse, map, consensus, export. | ||||
| * **Dashboards:** provider staleness, top conflicting vulns/components, signature posture, export cache hit‑rate. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 13) Testing matrix | ||||
|  | ||||
| * **Connectors:** golden raw docs → deterministic claims (fixtures per provider/format). | ||||
| * **Signature policies:** valid/invalid PGP/cosign/x509 samples; ensure rejects are recorded but not accepted. | ||||
| * **Normalization edge cases:** platform‑only claims, free‑text justifications, non‑purl products. | ||||
| * **Consensus:** conflict scenarios across tiers; check tie‑breakers; justification gates. | ||||
| * **Performance:** 1M‑row export timing; memory ceilings; stream correctness. | ||||
| * **Determinism:** same inputs + policy → identical `consensusDigest` and export bytes. | ||||
| * **API contract tests:** pagination, filters, RBAC, rate limits. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 14) Integration points | ||||
|  | ||||
| * **Backend Policy Engine** (in Scanner.WebService): calls `POST /excititor/resolve` (scope `vex.read`) with batched `(purl, vulnId)` pairs to fetch `rollupStatus + sources`. | ||||
| * **Feedser**: provides alias graph (CVE↔vendor IDs) and may supply VEX‑adjacent metadata (e.g., KEV flag) for policy escalation. | ||||
| * **UI**: VEX explorer screens use `/claims/search` and `/consensus/search`; show conflicts & provenance. | ||||
| * **CLI**: `stellaops vex export --consensus --since 7d --out vex.json` for audits. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 15) Failure modes & fallback | ||||
|  | ||||
| * **Provider unreachable:** stale thresholds trigger warnings; policy can down‑weight stale providers automatically (freshness factor). | ||||
| * **Signature outage:** continue to ingest but mark `signatureState.verified=false`; consensus will likely exclude or down‑weight per policy. | ||||
| * **Schema drift:** unknown fields are preserved as `evidence`; normalization rejects only on **invalid identity** or **status**. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 16) Rollout plan (incremental) | ||||
|  | ||||
| 1. **MVP**: OpenVEX + CSAF connectors for 3 major providers (e.g., Red Hat/SUSE/Ubuntu), normalization + consensus + `/excititor/resolve`. | ||||
| 2. **Signature policies**: PGP for distros; cosign for OCI. | ||||
| 3. **Exports + optional attestation**. | ||||
| 4. **CycloneDX VEX** connectors; platform claim expansion tables; UI explorer. | ||||
| 5. **Scale hardening**: export indexes; conflict analytics. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 17) Appendix — canonical JSON (stable ordering) | ||||
|  | ||||
| All exports and consensus entries are serialized via `VexCanonicalJsonSerializer`: | ||||
|  | ||||
| * UTF‑8 without BOM; | ||||
| * keys sorted (ASCII); | ||||
| * arrays sorted by `(providerId, vulnId, productKey, lastObserved)` unless semantic order mandated; | ||||
| * timestamps in `YYYY‑MM‑DDThh:mm:ssZ`; | ||||
| * no insignificant whitespace. | ||||
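The serialization rules above can be approximated with the Python standard library. This is a sketch of the rules as listed, not a port of `VexCanonicalJsonSerializer`; the function names are hypothetical, and array sorting is shown as a separate helper because callers must decide when semantic order applies.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonical_timestamp(dt):
    """Format an aware datetime as YYYY-MM-DDThh:mm:ssZ in UTC."""
    if dt.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def sort_claims(claims):
    """Order claim dicts by (providerId, vulnId, productKey, lastObserved)."""
    return sorted(claims, key=lambda c: (c["providerId"], c["vulnId"],
                                         c["productKey"], c["lastObserved"]))


def canonical_json(value):
    """UTF-8 bytes, no BOM, sorted keys, no insignificant whitespace."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_digest(value):
    """Stable SHA-256 hex digest over the canonical JSON form."""
    return hashlib.sha256(canonical_json(value)).hexdigest()
```

Because keys are sorted and whitespace is fixed, two entries that differ only in key order or formatting hash to the same digest.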
|  | ||||
							
								
								
									
										65
									
								
								docs/modules/vexer/implementation_plan.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,65 @@ | ||||
| # Implementation plan — Vexer | ||||
|  | ||||
| ## Delivery phases | ||||
| - **Phase 1 – Connectors & normalization**   | ||||
|   Build connectors for OpenVEX, CSAF VEX, CycloneDX VEX, OCI attestations; capture provenance, signatures, and source metadata; normalise into `VexClaim`. | ||||
| - **Phase 2 – Mapping & trust registry**   | ||||
|   Implement product mapping (CPE → purl/version), issuer registry (trust tiers, signatures), scope scoring, and justification taxonomy. | ||||
| - **Phase 3 – Consensus & projections**   | ||||
|   Deliver consensus computation, conflict preservation, projections (`vex_consensus`, history, provider snapshots), and DSSE events. | ||||
| - **Phase 4 – APIs & integrations**   | ||||
|   Expose REST/CLI endpoints for claims, consensus, conflicts, exports; integrate Policy Engine, Vuln Explorer, Advisory AI, Export Center. | ||||
| - **Phase 5 – Observability & offline**   | ||||
|   Ship metrics, logs, traces, dashboards, incident runbooks, Offline Kit bundles, and performance tuning (10M claims/tenant). | ||||
|  | ||||
| ## Work breakdown | ||||
| - **Connectors** | ||||
|   - Fetchers for vendor feeds, CSAF repositories, OpenVEX docs, OCI referrers. | ||||
|   - Signature verification (PGP, cosign, PKI) per source; schema validation; rate limiting. | ||||
|   - Source configuration (trust tier, fetch cadence, blackout windows) stored in metadata registry. | ||||
| - **Normalization** | ||||
|   - Canonical `VexClaim` schema with deterministic IDs, provenance, supersedes chains. | ||||
|   - Product tree parsing, mapping to canonical product keys and environments. | ||||
|   - Justification and scope scoring derived from source semantics. | ||||
| - **Consensus & projections** | ||||
|   - Lattice join with precedence rules, conflict tracking, confidence scores, recency decay. | ||||
|   - Append-only history, conflict queue, DSSE events (`vex.consensus.updated`). | ||||
|   - Export-ready JSONL & DSSE bundles for Offline Kit and Export Center. | ||||
| - **APIs & UX** | ||||
|   - REST endpoints (`/claims`, `/consensus`, `/conflicts`, `/providers`) with tenant RBAC. | ||||
|   - CLI commands `stella vex claims|consensus|conflicts|export`. | ||||
|   - Console modules (list/detail, conflict diagnostics, provider health, simulation hooks). | ||||
| - **Integrations** | ||||
|   - Policy Engine trust knobs, Vuln Explorer consensus badges, Advisory AI narrative generation, Notify alerts for conflicts. | ||||
|   - Orchestrator jobs for recompute/backfill triggered by Excititor deltas. | ||||
| - **Observability & Ops** | ||||
|   - Metrics (ingest latency, signature failure rate, conflict rate, consensus latency). | ||||
|   - Logs/traces with tenant/issuer/provenance context. | ||||
|   - Runbooks for mapping failures, signature errors, recompute storms, quota exhaustion. | ||||
|  | ||||
| ## Acceptance criteria | ||||
| - Connectors ingest validated VEX statements with signed provenance, deterministic mapping, and tenant isolation. | ||||
| - Consensus outputs are reproducible, include conflicts, and integrate with Policy Engine/Vuln Explorer/Export Center. | ||||
| - CLI/Console provide evidence inspection, conflict analysis, and exports; Offline Kit bundles replay verification offline. | ||||
| - Observability dashboards/alerts capture ingest health, trust anomalies, conflict spikes, and performance budgets. | ||||
| - Recompute pipeline handles policy changes and new evidence without losing determinism: the same inputs always yield the same consensus. | ||||
|  | ||||
| ## Risks & mitigations | ||||
| - **Mapping ambiguity:** maintain scope scores, manual overrides, highlight warnings. | ||||
| - **Signature trust gaps:** issuer registry with auditing, fallback trust policies, tenant overrides. | ||||
| - **Evidence surges:** orchestrator backpressure, prioritised queues, shardable workers. | ||||
| - **Performance regressions:** indexing, caching, load tests, budget enforcement. | ||||
| - **Tenant leakage:** strict RBAC/filters, fuzz tests, compliance reviews. | ||||
|  | ||||
| ## Test strategy | ||||
| - **Unit:** connector parsers, normalization, mapping conversions, lattice operations. | ||||
| - **Property:** randomised evidence ensuring commutative consensus and deterministic digests. | ||||
| - **Integration:** end-to-end pipeline from Excitator to consensus export, policy simulation, conflict handling. | ||||
| - **Performance:** large feed ingestion, recompute stress, CLI export throughput. | ||||
| - **Security:** signature tampering, issuer revocation, RBAC. | ||||
| - **Offline:** export/import verification, DSSE bundle validation. | ||||
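A minimal sketch of the property test described above, checking that consensus does not depend on the order in which claims arrive. The precedence order is a hypothetical stand-in, since this plan does not spell out Vexer's actual lattice or precedence rules; `join`, `consensus`, and `check_order_independence` are illustrative names only.

```python
import random
from functools import reduce

# Hypothetical precedence order (an assumption, not Vexer's real lattice):
# later entries win when two claims disagree.
PRECEDENCE = ["not_affected", "fixed", "under_investigation", "affected"]
RANK = {status: i for i, status in enumerate(PRECEDENCE)}


def join(a, b):
    """Join two statuses by taking the higher-ranked one (commutative, associative)."""
    return a if RANK[a] >= RANK[b] else b


def consensus(claims):
    """Fold a non-empty list of claim statuses into one consensus status."""
    return reduce(join, claims)


def check_order_independence(trials=200, seed=7):
    """Shuffle randomised claim sets and assert the consensus never changes."""
    rng = random.Random(seed)  # fixed seed keeps the test itself deterministic
    for _ in range(trials):
        claims = [rng.choice(PRECEDENCE) for _ in range(rng.randint(1, 8))]
        expected = consensus(claims)
        shuffled = claims[:]
        rng.shuffle(shuffled)
        assert consensus(shuffled) == expected
    return True
```

A real suite would likely use a property-testing library and the production join; the fixed seed here only keeps failures reproducible.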
|  | ||||
| ## Definition of done | ||||
| - Connectors, normalization, consensus, APIs, and integrations deployed with telemetry, runbooks, and Offline Kit parity. | ||||
| - Documentation (overview, architecture, algorithm, issuer registry, API/CLI, runbooks) updated with imposed rule compliance. | ||||
| - ./TASKS.md and ../../TASKS.md reflect active status and dependencies. | ||||
							
								
								
									
										83
									
								
								docs/modules/vexer/scoring.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							| @@ -0,0 +1,83 @@ | ||||
| ## Status | ||||
|  | ||||
| This document tracks the future-looking risk scoring model for Vexer. The calculation below is not active yet; Sprint 7 work will add the required schema fields, policy controls, and services. Until that ships, Vexer emits consensus statuses without numeric scores. | ||||
|  | ||||
| ## Scoring model (target state) | ||||
|  | ||||
| **S = Gate(VEX_status) × W_trust(source) × [Severity_base × (1 + α·KEV + β·EPSS)]** | ||||
|  | ||||
| * **Gate(VEX_status)**: `affected`/`under_investigation` → 1, `not_affected`/`fixed` → 0. A trusted “not affected” or “fixed” still zeroes the score. | ||||
| * **W_trust(source)**: normalized policy weight (baseline 0‒1). Policies may opt into >1 boosts for signed vendor feeds once Phase 1 closes. | ||||
| * **Severity_base**: canonical numeric severity from Feedser (CVSS or org-defined scale). | ||||
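| * **KEV**: flag set to 1 when the vulnerability is listed in the CISA Known Exploited Vulnerabilities catalog, otherwise 0; when set, it adds α to the boost term. | ||||
| * **EPSS**: exploit probability in [0,1]; because it is bounded, it adds at most β to the boost term. | ||||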
| * **α, β**: configurable coefficients (default α=0.25, β=0.5) stored in policy. | ||||
|  | ||||
| Safeguards: freeze boosts when product identity is unknown, clamp outputs ≥0, and log every factor in the audit trail. | ||||
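As a rough illustration of how the gate and the boost term combine, the sketch below evaluates the formula for two made-up inputs. The `GATE` table mirrors the mapping listed above; the function names and the sample numbers are assumptions for illustration, not part of any shipped Vexer API.

```python
# Gate values taken from the bullet list above.
GATE = {
    "affected": 1,
    "under_investigation": 1,
    "not_affected": 0,
    "fixed": 0,
}


def gate(vex_status):
    """Map a consensus status to its 0/1 gate; unknown statuses are rejected."""
    if vex_status not in GATE:
        raise ValueError(f"unknown VEX status: {vex_status!r}")
    return GATE[vex_status]


def score(vex_status, weight, severity, kev, epss, alpha=0.25, beta=0.5):
    """S = Gate x W_trust x Severity_base x (1 + alpha*KEV + beta*EPSS), clamped at 0."""
    boost = 1 + alpha * (1 if kev else 0) + beta * epss
    return max(0.0, gate(vex_status) * weight * severity * boost)


# Illustrative inputs only: weight 0.8, severity 9.8, KEV listed, EPSS 0.2.
# boost = 1 + 0.25 + 0.1 = 1.35, so S = 0.8 * 9.8 * 1.35, roughly 10.58.
example = score("affected", 0.8, 9.8, True, 0.2)

# A gated-out status zeroes the score no matter how severe the signals are.
suppressed = score("fixed", 1.0, 9.8, True, 0.9)
```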
|  | ||||
| ## Implementation roadmap | ||||
|  | ||||
| | Phase | Scope | Artifacts | | ||||
| | --- | --- | --- | | ||||
| | **Phase 1 – Schema foundations** | Extend Vexer consensus/claims and Feedser canonical advisories with severity, KEV, EPSS, and expose α/β + weight ceilings in policy. | Sprint 7 tasks `VEXER-CORE-02-001`, `VEXER-POLICY-02-001`, `VEXER-STORAGE-02-001`, `FEEDCORE-ENGINE-07-001`. | | ||||
| | **Phase 2 – Deterministic score engine** | Implement a scoring component that executes alongside consensus and persists score envelopes with hashes. | Planned task `VEXER-CORE-02-002` (backlog). | | ||||
| | **Phase 3 – Surfacing & enforcement** | Expose scores via WebService/CLI, integrate with Feedser noise priors, and enforce policy-based suppressions. | To be scheduled after Phase 2. | | ||||
|  | ||||
| ## Data model (after Phase 1) | ||||
|  | ||||
| ```json | ||||
| { | ||||
|   "vulnerabilityId": "CVE-2025-12345", | ||||
|   "product": "pkg:name@version", | ||||
|   "consensus": { | ||||
|     "status": "affected", | ||||
|     "policyRevisionId": "rev-12", | ||||
|     "policyDigest": "0D9AEC…" | ||||
|   }, | ||||
|   "signals": { | ||||
|     "severity": {"scheme": "CVSS:3.1", "score": 7.5}, | ||||
|     "kev": true, | ||||
|     "epss": 0.40 | ||||
|   }, | ||||
|   "policy": { | ||||
|     "weight": 1.15, | ||||
|     "alpha": 0.25, | ||||
|     "beta": 0.5 | ||||
|   }, | ||||
|   "score": { | ||||
|     "value": 12.51, | ||||
|     "generatedAt": "2025-11-05T14:12:30Z", | ||||
|     "audit": [ | ||||
|       "gate:affected", | ||||
|       "weight:1.15", | ||||
|       "severity:7.5", | ||||
|       "kev:1", | ||||
|       "epss:0.40" | ||||
|     ] | ||||
|   } | ||||
| } | ||||
| ``` | ||||
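One way a consumer might cross-check such an envelope is to rebuild the audit entries and the score from the recorded factors. The sketch below assumes the field names shown in the sample and assumes audit strings take the `name:value` form seen there; `audit_entries` and `recompute` are hypothetical helpers, not existing Vexer functions.

```python
import json

# Input portion of the sample envelope above (the "score" block is omitted
# because it is what we want to re-derive).
ENVELOPE = json.loads("""
{
  "consensus": {"status": "affected"},
  "signals": {
    "severity": {"scheme": "CVSS:3.1", "score": 7.5},
    "kev": true,
    "epss": 0.40
  },
  "policy": {"weight": 1.15, "alpha": 0.25, "beta": 0.5}
}
""")


def audit_entries(envelope):
    """Rebuild the audit list in the factor order used by the sample."""
    signals, policy = envelope["signals"], envelope["policy"]
    return [
        f"gate:{envelope['consensus']['status']}",
        f"weight:{policy['weight']}",
        f"severity:{signals['severity']['score']}",
        f"kev:{int(signals['kev'])}",
        f"epss:{signals['epss']:.2f}",  # two decimals, as in the sample
    ]


def recompute(envelope):
    """Re-derive the score from the recorded factors using the target-state formula."""
    signals, policy = envelope["signals"], envelope["policy"]
    status = envelope["consensus"]["status"]
    gate = 1 if status in ("affected", "under_investigation") else 0
    boost = 1 + policy["alpha"] * int(signals["kev"]) + policy["beta"] * signals["epss"]
    return max(0.0, gate * policy["weight"] * signals["severity"]["score"] * boost)
```

Comparing `recompute(envelope)` against the stored `score.value` (after agreeing on a rounding rule) gives a cheap consistency check before trusting an envelope.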
|  | ||||
| ## Operational guidance | ||||
|  | ||||
| * **Inputs**: Feedser delivers severity/KEV/EPSS via the advisory event log; Vexer connectors load VEX statements. Policy owns trust tiers and coefficients. | ||||
| * **Processing**: the scoring engine (Phase 2) runs next to consensus, storing results with deterministic hashes so exports and attestations can reference them. | ||||
| * **Consumption**: WebService/CLI will return consensus plus score; scanners may suppress findings only when policy-authorized VEX gating and signed score envelopes agree. | ||||
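The "deterministic hashes" mentioned under Processing could be produced along these lines. This is a sketch under stated assumptions: the canonicalization (sorted keys, compact separators, UTF-8) and the choice of SHA-256 are illustrative, and `envelope_digest` is a hypothetical name; the source does not specify the actual scheme.

```python
import hashlib
import json


def envelope_digest(envelope):
    """Hash a score envelope so that key order and whitespace cannot change the digest."""
    canonical = json.dumps(
        envelope,
        sort_keys=True,          # stable key order
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order: the digests match.
a = {"value": 7.5, "audit": ["gate:affected", "severity:7.5"]}
b = {"audit": ["gate:affected", "severity:7.5"], "value": 7.5}
same = envelope_digest(a) == envelope_digest(b)
```

Note that list order is still significant, and float formatting can differ between runtimes, so a production scheme would need to pin number serialization as well.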
|  | ||||
| ## Pseudocode (Phase 2 preview) | ||||
|  | ||||
| ```python | ||||
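| def risk_score(gate, weight, severity, kev, epss, alpha, beta, freeze_boosts=False): | ||||
|     """Return gate * weight * severity * (1 + alpha*kev + beta*epss), clamped at 0.""" | ||||
|     # A gate of 0 (not_affected/fixed) zeroes the score regardless of other factors. | ||||
|     if gate == 0: | ||||
|         return 0.0 | ||||
|     # Safeguard: drop the KEV/EPSS boosts when product identity is unknown. | ||||
|     if freeze_boosts: | ||||
|         kev, epss = 0, 0.0 | ||||
|     boost = 1 + alpha * kev + beta * epss | ||||
|     return max(0.0, weight * severity * boost) | ||||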
| ``` | ||||
|  | ||||
| ## FAQ | ||||
|  | ||||
| * **Can operators opt out?** Set α=β=0 or keep weights ≤1.0 via policy. | ||||
| * **What about missing signals?** Treat them as zero and log the omission. | ||||
| * **When will this ship?** Phase 1 is planned for Sprint 7; later phases depend on connector coverage and attestation delivery. | ||||
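The "treat them as zero and log the omission" answer could look roughly like this. The flat `severity`/`kev`/`epss` keys, the logger name, and `normalize_signals` itself are assumptions made for the sketch, not a description of existing code.

```python
import logging

logger = logging.getLogger("vexer.scoring.sketch")  # hypothetical logger name

# Zero defaults for every signal the score formula reads.
DEFAULTS = {"severity": 0.0, "kev": 0, "epss": 0.0}


def normalize_signals(signals):
    """Fill absent or null signals with zero and log each omission."""
    normalized = {}
    for name, default in DEFAULTS.items():
        value = signals.get(name)
        if value is None:
            logger.warning("signal %s missing; treating as zero", name)
            value = default
        normalized[name] = value
    return normalized
```

With a missing severity the score collapses to zero, so a real implementation may want to surface that case more prominently than a log line.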