feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
2025-10-30 00:09:39 +02:00
parent 3154c67978
commit 7b5bdcf4d3
503 changed files with 16136 additions and 54638 deletions


@@ -0,0 +1,22 @@
# Vexer agent guide
## Mission
Vexer computes deterministic consensus across VEX claims, preserving conflicts and producing attestable evidence for policy suppression.
## Key docs
- [Module README](./README.md)
- [Architecture](./architecture.md)
- [Implementation plan](./implementation_plan.md)
- [Task board](./TASKS.md)
## How to get started
1. Open ../../implplan/SPRINTS.md and locate the stories referencing this module.
2. Review ./TASKS.md for local follow-ups and confirm status transitions (TODO → DOING → DONE/BLOCKED).
3. Read the architecture and README for domain context before editing code or docs.
4. Coordinate cross-module changes in the main /AGENTS.md description and through the sprint plan.
## Guardrails
- Honour the Aggregation-Only Contract where applicable (see ../../ingestion/aggregation-only-contract.md).
- Preserve determinism: sort outputs, normalise timestamps (UTC ISO-8601), and avoid machine-specific artefacts.
- Keep Offline Kit parity in mind—document air-gapped workflows for any new feature.
- Update runbooks/observability assets when operational characteristics change.


@@ -0,0 +1,34 @@
# StellaOps Vexer
Vexer computes deterministic consensus across VEX claims, preserving conflicts and producing attestable evidence for policy suppression.
## Responsibilities
- Ingest Excititor observations and compute per-product consensus snapshots.
- Provide APIs for querying canonical VEX positions and conflict sets.
- Publish exports and DSSE-ready digests for downstream consumption.
- Keep provenance weights and disagreement metadata.
## Key components
- Consensus engine and API host in `StellaOps.Vexer.*` (to-be-implemented).
- Storage schema for consensus graphs.
- Integration hooks for Policy Engine suppression logic.
## Integrations & dependencies
- Excititor for raw observations.
- Policy Engine and UI for suppression stories.
- CLI for evidence inspection.
## Operational notes
- Deterministic consensus algorithms (see architecture).
- Planned telemetry for disagreement counts and freshness.
- Offline exports aligning with Concelier/Excititor timelines.
## Related resources
- ./scoring.md
## Backlog references
- DOCS-VEXER backlog referenced in architecture doc.
- CLI parity tracked in ../../TASKS.md (CLI-GRAPH/VEX stories).
## Epic alignment
- **Epic 7 (VEX Consensus Lens):** deliver trust-weighted consensus snapshots, disagreement metadata, and explain APIs.


@@ -0,0 +1,9 @@
# Task board — Vexer
> Local tasks should link back to ./AGENTS.md and mirror status updates into ../../TASKS.md when applicable.
| ID | Status | Owner(s) | Description | Notes |
|----|--------|----------|-------------|-------|
| VEXER-DOCS-0001 | DOING (2025-10-29) | Docs Guild | Validate that ./README.md aligns with the latest release notes. | See ./AGENTS.md |
| VEXER-OPS-0001 | TODO | Ops Guild | Review runbooks/observability assets after next sprint demo. | Sync outcomes back to ../../TASKS.md |
| VEXER-ENG-0001 | TODO | Module Team | Cross-check implementation plan milestones against ../../implplan/SPRINTS.md. | Update status via ./AGENTS.md workflow |


@@ -0,0 +1,465 @@
# component_architecture_vexer.md — **StellaOps Vexer** (2025Q4)
> Built to satisfy Epic 7 (VEX Consensus Lens) requirements.
> **Scope.** This document specifies the **Vexer** service: its purpose, trust model, data structures, APIs, plugin contracts, storage schema, normalization/consensus algorithms, performance budgets, testing matrix, and how it integrates with Scanner, Policy, Feedser, and the attestation chain. It is implementation-ready.
---
## 0) Mission & role in the platform
**Mission.** Convert heterogeneous **VEX** statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into **canonical, queryable claims**; compute **deterministic consensus** per *(vuln, product)*; preserve **conflicts with provenance**; publish **stable, attestable exports** that the backend uses to suppress non-exploitable findings, prioritize remaining risk, and explain decisions.
**Boundaries.**
* Vexer **does not** decide PASS/FAIL. It supplies **evidence** (statuses + justifications + provenance weights).
* Vexer preserves **conflicting claims** unchanged; consensus encodes how we would pick, but the raw set is always exportable.
* VEX consumption is **backend-only**: Scanner never applies VEX. The backend's **Policy Engine** asks Vexer for status evidence and then decides what to show.
---
## 1) Inputs, outputs & canonical domain
### 1.1 Accepted input formats (ingest)
* **OpenVEX** JSON documents (attested or raw).
* **CSAF VEX** 2.x (vendor PSIRTs and distros commonly publish CSAF).
* **CycloneDX VEX** 1.4+ (standalone VEX or embedded VEX blocks).
* **OCI-attached attestations** (VEX statements shipped as OCI referrers), handled by optional connectors.
All connectors register **source metadata**: provider identity, trust tier, signature expectations (PGP/cosign/PKI), fetch windows, rate limits, and time anchors.
### 1.2 Canonical model (normalized)
Every incoming statement becomes a set of **VexClaim** records:
```
VexClaim
- providerId // 'redhat', 'suse', 'ubuntu', 'github', 'vendorX'
- vulnId // 'CVE-2025-12345', 'GHSA-xxxx', canonicalized
- productKey // canonical product identity (see §2.2)
- status // affected | not_affected | fixed | under_investigation
- justification? // for 'not_affected'/'affected' where provided
- introducedVersion? // semantics per provider (range or exact)
- fixedVersion? // where provided (range or exact)
- lastObserved // timestamp from source or fetch time
- provenance // doc digest, signature status, fetch URI, line/offset anchors
- evidence[] // raw source snippets for explainability
- supersedes? // optional cross-doc chain (docDigest → docDigest)
```
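
As a rough illustration of the record above, the following Python sketch models `VexClaim` as an immutable dataclass. The snake_case field names, the flattening of `provenance` into a single `doc_digest`, and the validation rules are assumptions made for this example; the real schema may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple

# Canonical statuses listed in the model above.
STATUSES = ("affected", "not_affected", "fixed", "under_investigation")


@dataclass(frozen=True)
class VexClaim:
    """Hypothetical in-memory shape of one normalized claim."""
    provider_id: str
    vuln_id: str
    product_key: str
    status: str
    last_observed: datetime
    doc_digest: str                      # stands in for the provenance block
    justification: Optional[str] = None
    introduced_version: Optional[str] = None
    fixed_version: Optional[str] = None
    evidence: Tuple[str, ...] = ()
    supersedes: Optional[str] = None

    def __post_init__(self):
        # Reject statuses outside the canonical vocabulary.
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        # Naive timestamps would make ordering machine-dependent.
        if self.last_observed.tzinfo is None:
            raise ValueError("last_observed must be timezone-aware")


claim = VexClaim(
    provider_id="redhat",
    vuln_id="CVE-2025-12345",
    product_key="pkg:rpm/redhat/openssl@3.0.7",
    status="not_affected",
    last_observed=datetime(2025, 10, 29, tzinfo=timezone.utc),
    doc_digest="sha256:example",
    justification="component_not_present",
)
```

Freezing the dataclass keeps claims hashable and prevents accidental mutation after normalization, which fits the determinism goals stated elsewhere in this document.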
### 1.3 Exports (consumption)
* **VexConsensus** per `(vulnId, productKey)` with:
  * `rollupStatus` (after policy weights/justification gates),
  * `sources[]` (winning + losing claims with weights & reasons),
  * `policyRevisionId` (identifier of the Vexer policy used),
  * `consensusDigest` (stable SHA-256 over canonical JSON).
* **Raw claims** export for auditing (unchanged, with provenance).
* **Provider snapshots** (per source, last N days) for operator debugging.
* **Index** optimized for backend joins: `(productKey, vulnId) → (status, confidence, sourceSet)`.
All exports are **deterministic**, and (optionally) **attested** via DSSE and logged to Rekor v2.
---
## 2) Identity model — products & joins
### 2.1 Vuln identity
* Accepts **CVE**, **GHSA**, vendor IDs (MSRC, RHSA…), distro IDs (DSA/USN/RHSA…) — normalized to `vulnId` with alias sets.
* **Alias graph** maintained (from Feedser) to map vendor/distro IDs → CVE (primary) and to **GHSA** where applicable.
### 2.2 Product identity (`productKey`)
* **Primary:** `purl` (Package URL).
* **Secondary links:** `cpe`, **OS package NVRA/EVR**, NuGet/Maven/Golang identity, and **OS package name** when purl unavailable.
* **Fallback:** `oci:<registry>/<repo>@<digest>` for image-level VEX.
* **Special cases:** kernel modules, firmware, platforms → provider-specific mapping helpers (connector captures provider's product taxonomy → canonical `productKey`).
> Vexer does not invent identities. If a provider cannot be mapped to purl/CPE/NVRA deterministically, we keep the native **product string** and mark the claim as **non-joinable**; the backend will ignore it unless a policy explicitly whitelists that provider mapping.
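
The precedence described in this section can be sketched as a small resolver. This is only an illustration: the function name, the `nvra:` prefix, and the decision to treat CPE, NVRA, and OCI keys as joinable are assumptions, not behavior confirmed by this document.

```python
from typing import Optional, Tuple


def resolve_product_key(
    purl: Optional[str] = None,
    cpe: Optional[str] = None,
    nvra: Optional[str] = None,
    oci_ref: Optional[str] = None,
    native: Optional[str] = None,
) -> Tuple[str, bool]:
    """Return (productKey, joinable) using purl > cpe > NVRA > OCI > native."""
    if purl:
        return purl, True
    if cpe:
        return cpe, True
    if nvra:
        # Hypothetical prefix; the real key format is not specified here.
        return "nvra:" + nvra, True
    if oci_ref:
        # Matches the documented fallback shape oci:<registry>/<repo>@<digest>.
        return "oci:" + oci_ref, True
    if native:
        # No deterministic mapping: keep the provider string, mark non-joinable.
        return native, False
    raise ValueError("no product identity supplied")
```

Returning the joinable flag alongside the key lets callers store non-joinable claims without ever feeding them into backend joins by accident.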
---
## 3) Storage schema (MongoDB)
Database: `vexer`
### 3.1 Collections
**`vex.providers`**
```
_id: providerId
name, homepage, contact
trustTier: enum {vendor, distro, platform, hub, attestation}
signaturePolicy: { type: pgp|cosign|x509|none, keys[], certs[], cosignKeylessRoots[] }
fetch: { baseUrl, kind: http|oci|file, rateLimit, etagSupport, windowDays }
enabled: bool
createdAt, modifiedAt
```
**`vex.raw`** (immutable raw documents)
```
_id: sha256(doc bytes)
providerId
uri
ingestedAt
contentType
sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
payload: GridFS pointer (if large)
disposition: kept|replaced|superseded
correlation: { replaces?: sha256, replacedBy?: sha256 }
```
**`vex.claims`** (normalized rows; dedupe on providerId+vulnId+productKey+docDigest)
```
_id
providerId
vulnId
productKey
status
justification?
introducedVersion?
fixedVersion?
lastObserved
docDigest
provenance { uri, line?, pointer?, signatureState }
evidence[] { key, value, locator }
indices:
- {vulnId:1, productKey:1}
- {providerId:1, lastObserved:-1}
- {status:1}
- text index (optional) on evidence.value for debugging
```
**`vex.consensus`** (rollups)
```
_id: sha256(canonical(vulnId, productKey, policyRevision))
vulnId
productKey
rollupStatus
sources[]: [
{ providerId, status, justification?, weight, lastObserved, accepted:bool, reason }
]
policyRevisionId
evaluatedAt
consensusDigest // same as _id
indices:
- {vulnId:1, productKey:1}
- {policyRevisionId:1, evaluatedAt:-1}
```
**`vex.exports`** (manifest of emitted artifacts)
```
_id
querySignature
format: raw|consensus|index
artifactSha256
rekor { uuid, index, url }?
createdAt
policyRevisionId
cacheable: bool
```
**`vex.cache`**
```
querySignature -> exportId (for fast reuse)
ttl, hits
```
**`vex.migrations`**
* ordered migrations applied at bootstrap to ensure indexes.
### 3.2 Indexing strategy
* Hot path queries use exact `(vulnId, productKey)` and time-bounded windows; compound indexes cover both.
* The provider list view sorts by `lastObserved` so that stale providers are easy to spot.
* `vex.consensus` keyed by `(vulnId, productKey, policyRevision)` for deterministic reuse.
---
## 4) Ingestion pipeline
### 4.1 Connector contract
```csharp
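using System.Threading;
using System.Threading.Tasks;

public interface IVexConnector
{
    // Stable identifier of the provider this connector serves.
    string ProviderId { get; }
    Task FetchAsync(VexConnectorContext ctx, CancellationToken ct);     // raw docs
    Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct); // raw -> VexClaim[]
}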
```
* **Fetch** must implement: window scheduling, conditional GET (ETag/If-Modified-Since), rate limiting, retry/backoff.
* **Normalize** parses the format, validates schema, maps product identities deterministically, emits `VexClaim` records with **provenance**.
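
To make the fetch requirements more concrete, here is a minimal Python sketch of two of them: building conditional GET headers from a previously stored validator, and producing a capped exponential backoff schedule. The helper names and default values are assumptions for illustration; only the `If-None-Match` and `If-Modified-Since` header names come from standard HTTP.

```python
from typing import Dict, List, Optional


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers for a conditional GET from stored validators."""
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   attempts: int = 5, cap: float = 60.0) -> List[float]:
    """Return the wait (seconds) before each retry, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]
```

A server that answers `304 Not Modified` to such a request lets the connector skip re-ingesting an unchanged document, which keeps `vex.raw` free of redundant writes.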
### 4.2 Signature verification (per provider)
* **cosign (keyless or keyful)** for OCI referrers or HTTP-served JSON with Sigstore bundles.
* **PGP** (provider keyrings) for distro/vendor feeds that sign docs.
* **x509** (mutual TLS / provider-pinned certs) where applicable.
* Signature state is stored on **vex.raw.sig** and copied into **provenance.signatureState** on claims.
> Claims from sources failing signature policy are marked `"signatureState.verified=false"` and **policy** can downweight or ignore them.
### 4.3 Time discipline
* For each doc, prefer the **provider's document timestamp**; if absent, use fetch time.
* Claims carry `lastObserved`, which drives **tie-breaking** within equal weight tiers.
---
## 5) Normalization: product & status semantics
### 5.1 Product mapping
* **purl** first; **cpe** second; OS package NVRA/EVR mapping helpers (distro connectors) produce purls via canonical tables (e.g., rpm→purl:rpm, deb→purl:deb).
* Where a provider publishes **platform-level** VEX (e.g., “RHEL 9 not affected”), connectors expand to known product inventory rules (e.g., map to sets of packages/components shipped in the platform). Expansion tables are versioned and kept per provider; every expansion emits **evidence** indicating the rule applied.
* If expansion would be speculative, the claim remains **platform-scoped** with `productKey="platform:redhat:rhel:9"` and is flagged **non-joinable**; backend can decide to use platform VEX only when Scanner proves the platform runtime.
### 5.2 Status + justification mapping
* Canonical **status**: `affected | not_affected | fixed | under_investigation`.
* **Justifications** normalized to a controlled vocabulary (CISA-aligned), e.g.:
  * `component_not_present`
  * `vulnerable_code_not_in_execute_path`
  * `vulnerable_configuration_unused`
  * `inline_mitigation_applied`
  * `fix_available` (with `fixedVersion`)
  * `under_investigation`
* Providers with free-text justifications are mapped by deterministic tables; raw text preserved as `evidence`.
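
A deterministic mapping table for free-text justifications might look like the sketch below. The phrases in the table are invented for illustration; real tables would be curated per provider, and the raw text would still be stored as `evidence`.

```python
import re
from typing import Optional

# Hypothetical phrase table; order matters and is fixed, so results are stable.
_JUSTIFICATION_TABLE = (
    ("component not present", "component_not_present"),
    ("not in execute path", "vulnerable_code_not_in_execute_path"),
    ("configuration unused", "vulnerable_configuration_unused"),
    ("mitigation applied", "inline_mitigation_applied"),
)


def normalize_justification(raw: str) -> Optional[str]:
    """Map provider free text to the controlled vocabulary, or None if unknown."""
    # Collapse whitespace and case so trivial formatting differences do not matter.
    text = re.sub(r"\s+", " ", raw.strip().lower())
    for needle, canonical in _JUSTIFICATION_TABLE:
        if needle in text:
            return canonical
    return None
```

Returning `None` for unmatched text, rather than guessing, keeps the mapping auditable: an unmapped justification can then fail a justification gate instead of silently passing it.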
---
## 6) Consensus algorithm
**Goal:** produce a **stable**, explainable `rollupStatus` per `(vulnId, productKey)` given possibly conflicting claims.
### 6.1 Inputs
* Set **S** of `VexClaim` for the key.
* **Vexer policy snapshot**:
  * **weights** per provider tier and per provider overrides.
  * **justification gates** (e.g., require justification for `not_affected` to be acceptable).
  * **minEvidence** rules (e.g., `not_affected` must come from ≥1 vendor or 2 distros).
  * **signature requirements** (e.g., require verified signature for `fixed` to be considered).
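
As an example of how a **minEvidence** rule could be evaluated, the sketch below checks the "at least one vendor or two distros" condition for a status. Counting distinct providers rather than individual claims is an assumption of this example, as is the function name.

```python
from typing import Iterable, Tuple


def meets_vendor_or_two_distros(supporters: Iterable[Tuple[str, str]]) -> bool:
    """supporters: (providerId, trustTier) pairs backing a given status."""
    pairs = list(supporters)
    # Use sets so several claims from one provider count only once.
    vendors = {provider for provider, tier in pairs if tier == "vendor"}
    distros = {provider for provider, tier in pairs if tier == "distro"}
    return len(vendors) >= 1 or len(distros) >= 2
```

For instance, two `not_affected` claims that both come from `ubuntu` would not satisfy the rule under this reading, while one from `ubuntu` plus one from `suse` would.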
### 6.2 Steps
1. **Filter invalid** claims by signature policy & justification gates → set `S'`.
2. **Score** each claim:
`score = weight(provider) * freshnessFactor(lastObserved)` where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect).
3. **Aggregate** scores per status: `W(status) = Σ score(claims with that status)`.
4. **Pick** `rollupStatus = argmax_status W(status)`.
5. **Tie-breakers** (in order):
   * Higher **max single** provider score wins (vendor > distro > platform > hub).
   * More **recent** lastObserved wins.
   * Fixed, deterministic status precedence (`fixed` > `not_affected` > `under_investigation` > `affected`) as the final tie-breaker.
6. **Explain**: mark accepted sources (`accepted=true; reason="weight"`/`"freshness"`), mark rejected sources with explicit `reason` (`"insufficient_justification"`, `"signature_unverified"`, `"lower_weight"`).
> The algorithm is **pure** given S and policy snapshot; result is reproducible and hashed into `consensusDigest`.
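
The steps above can be sketched in Python as follows. This is a simplified illustration under several assumptions: the `Claim` shape, the linear 30-day freshness decay, and the two hard-coded gates are invented for the example, and provider overrides, minEvidence rules, and the explain step are omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional

TIER_WEIGHTS = {"vendor": 1.0, "distro": 0.9, "platform": 0.7, "hub": 0.5, "attestation": 0.6}
STATUS_PRECEDENCE = ("fixed", "not_affected", "under_investigation", "affected")


@dataclass(frozen=True)
class Claim:
    provider_id: str
    tier: str
    status: str
    last_observed: datetime
    justification: Optional[str] = None
    signature_verified: bool = True


def freshness_factor(last_observed: datetime, now: datetime, max_age_days: float = 30.0) -> float:
    """Assumed linear decay from 1.0 (fresh) down to 0.8 (max_age_days or older)."""
    age_days = max((now - last_observed).total_seconds(), 0.0) / 86400.0
    return 1.0 - 0.2 * min(age_days / max_age_days, 1.0)


def rejection_reason(claim: Claim) -> Optional[str]:
    """Step 1: justification and signature gates (assumed policy settings)."""
    if claim.status == "not_affected" and not claim.justification:
        return "insufficient_justification"
    if claim.status == "fixed" and not claim.signature_verified:
        return "signature_unverified"
    return None


def rollup_status(claims: Iterable[Claim], now: datetime) -> Optional[str]:
    totals, best, newest = {}, {}, {}
    # Sort first so floating-point sums do not depend on input order.
    ordered = sorted(claims, key=lambda c: (c.provider_id, c.status, c.last_observed))
    for c in ordered:
        if rejection_reason(c) is not None:
            continue
        score = TIER_WEIGHTS[c.tier] * freshness_factor(c.last_observed, now)  # step 2
        totals[c.status] = totals.get(c.status, 0.0) + score                   # step 3
        best[c.status] = max(best.get(c.status, 0.0), score)
        newest[c.status] = max(newest.get(c.status, c.last_observed), c.last_observed)
    if not totals:
        return None
    # Steps 4 and 5: highest total, then max single score, recency, precedence.
    return max(
        totals,
        key=lambda s: (totals[s], best[s], newest[s], -STATUS_PRECEDENCE.index(s)),
    )
```

Because the function reads only its arguments and module constants, calling it twice with the same claims and the same `now` returns the same status, which is the purity property the note above relies on.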
---
## 7) Query & export APIs
All endpoints are versioned under `/api/v1/vex`.
### 7.1 Query (online)
```
POST /claims/search
body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string }
→ { claims[], nextPageToken? }
POST /consensus/search
body: { vulnIds?: string[], productKeys?: string[], policyRevisionId?: string, since?: timestamp, limit?: int, pageToken?: string }
→ { entries[], nextPageToken? }
POST /excititor/resolve (scope: vex.read)
body: { productKeys?: string[], purls?: string[], vulnerabilityIds: string[], policyRevisionId?: string }
→ { policy, resolvedAt, results: [ { vulnerabilityId, productKey, status, sources[], conflicts[], decisions[], signals?, summary?, envelope: { artifact, contentSignature?, attestation?, attestationEnvelope?, attestationSignature? } } ] }
```
### 7.2 Exports (cacheable snapshots)
```
POST /exports
body: { signature: { vulnFilter?, productFilter?, providers?, since? }, format: raw|consensus|index, policyRevisionId?: string, force?: bool }
→ { exportId, artifactSha256, rekor? }
GET /exports/{exportId} → bytes (application/json or binary index)
GET /exports/{exportId}/meta → { signature, policyRevisionId, createdAt, artifactSha256, rekor? }
```
### 7.3 Provider operations
```
GET /providers → provider list & signature policy
POST /providers/{id}/refresh → trigger fetch/normalize window
GET /providers/{id}/status → last fetch, doc counts, signature stats
```
**Auth:** service-to-service via Authority tokens; operator operations via UI/CLI with RBAC.
---
## 8) Attestation integration
* Exports can be **DSSE-signed** via **Signer** and logged to **Rekor v2** via **Attestor** (optional but recommended for regulated pipelines).
* `vex.exports.rekor` stores `{uuid, index, url}` when present.
* **Predicate type**: `https://stella-ops.org/attestations/vex-export/1` with fields:
  * `querySignature`, `policyRevisionId`, `artifactSha256`, `createdAt`.
---
## 9) Configuration (YAML)
```yaml
vexer:
  mongo: { uri: "mongodb://mongo/vexer" }
  s3:
    endpoint: http://minio:9000
    bucket: stellaops
  policy:
    weights:
      vendor: 1.0
      distro: 0.9
      platform: 0.7
      hub: 0.5
      attestation: 0.6
    providerOverrides:
      redhat: 1.0
      suse: 0.95
    requireJustificationForNotAffected: true
    signatureRequiredForFixed: true
    minEvidence:
      not_affected:
        vendorOrTwoDistros: true
  connectors:
    - providerId: redhat
      kind: csaf
      baseUrl: https://access.redhat.com/security/data/csaf/v2/
      signaturePolicy: { type: pgp, keys: [ "…redhat-pgp-key…" ] }
      windowDays: 7
    - providerId: suse
      kind: csaf
      baseUrl: https://ftp.suse.com/pub/projects/security/csaf/
      signaturePolicy: { type: pgp, keys: [ "…suse-pgp-key…" ] }
    - providerId: ubuntu
      kind: openvex
      baseUrl: https://…/vex/
      signaturePolicy: { type: none }
    - providerId: vendorX
      kind: cyclonedx-vex
      ociRef: ghcr.io/vendorx/vex@sha256:…
      signaturePolicy: { type: cosign, cosignKeylessRoots: [ "sigstore-root" ] }
```
---
## 10) Security model
* **Input signature verification** enforced per provider policy (PGP, cosign, x509).
* **Connector allowlists**: outbound fetch constrained to configured domains.
* **Tenant isolation**: per-tenant DB prefixes or separate DBs; per-tenant S3 prefixes; per-tenant policies.
* **AuthN/Z**: Authority-issued OpToks; RBAC roles (`vex.read`, `vex.admin`, `vex.export`).
* **No secrets in logs**; deterministic logging contexts include providerId, docDigest, claim keys.
---
## 11) Performance & scale
* **Targets:**
  * Normalize 10k VEX claims/minute/core.
  * Consensus compute ≤50ms for 1k unique `(vuln, product)` pairs in hot cache.
  * Export (consensus) 1M rows in ≤60s on 8 cores with streaming writer.
* **Scaling:**
  * WebService handles control APIs; **Worker** background services (same image) execute fetch/normalize in parallel with rate limits; Mongo writes batched; upserts by natural keys.
  * Exports stream straight to S3 (MinIO) with rolling buffers.
* **Caching:**
  * `vex.cache` maps query signatures → export; TTL to avoid stampedes; optimistic reuse unless `force`.
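
The caching behavior can be illustrated with a small in-memory stand-in for `vex.cache`. This is a single-threaded sketch with hypothetical names; it shows TTL-based reuse and the `force` bypass, but not the locking a real service would need to stop concurrent callers from rebuilding the same export.

```python
import time
from typing import Callable, Dict, Tuple


class ExportCache:
    """Maps a query signature to an export id until the entry's TTL expires."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # signature -> (exportId, expiresAt)
        self.hits = 0

    def get_or_build(self, query_signature: str, build: Callable[[], str], force: bool = False) -> str:
        now = self._clock()
        entry = self._entries.get(query_signature)
        # Optimistic reuse: serve the cached export unless forced or expired.
        if not force and entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        export_id = build()
        self._entries[query_signature] = (export_id, now + self._ttl)
        return export_id
```

Injecting the clock keeps the class testable without sleeping, and it mirrors the document's general preference for avoiding machine-specific behavior.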
---
## 12) Observability
* **Metrics:**
  * `vex.ingest.docs_total{provider}`
  * `vex.normalize.claims_total{provider}`
  * `vex.signature.failures_total{provider,method}`
  * `vex.consensus.conflicts_total{vulnId}`
  * `vex.exports.bytes{format}` / `vex.exports.latency_seconds`
* **Tracing:** spans for fetch, verify, parse, map, consensus, export.
* **Dashboards:** provider staleness, top conflicting vulns/components, signature posture, export cache hit rate.
---
## 13) Testing matrix
* **Connectors:** golden raw docs → deterministic claims (fixtures per provider/format).
* **Signature policies:** valid/invalid PGP/cosign/x509 samples; ensure rejects are recorded but not accepted.
* **Normalization edge cases:** platform-only claims, free-text justifications, non-purl products.
* **Consensus:** conflict scenarios across tiers; check tie-breakers; justification gates.
* **Performance:** 1M-row export timing; memory ceilings; stream correctness.
* **Determinism:** same inputs + policy → identical `consensusDigest` and export bytes.
* **API contract tests:** pagination, filters, RBAC, rate limits.
---
## 14) Integration points
* **Backend Policy Engine** (in Scanner.WebService): calls `POST /excititor/resolve` (scope `vex.read`) with batched `(purl, vulnId)` pairs to fetch `rollupStatus + sources`.
* **Feedser**: provides alias graph (CVE↔vendor IDs) and may supply VEX-adjacent metadata (e.g., KEV flag) for policy escalation.
* **UI**: VEX explorer screens use `/claims/search` and `/consensus/search`; show conflicts & provenance.
* **CLI**: `stellaops vex export --consensus --since 7d --out vex.json` for audits.
---
## 15) Failure modes & fallback
* **Provider unreachable:** stale thresholds trigger warnings; policy can downweight stale providers automatically (freshness factor).
* **Signature outage:** continue to ingest but mark `signatureState.verified=false`; consensus will likely exclude or downweight per policy.
* **Schema drift:** unknown fields are preserved as `evidence`; normalization rejects only on **invalid identity** or **status**.
---
## 16) Rollout plan (incremental)
1. **MVP**: OpenVEX + CSAF connectors for 3 major providers (e.g., Red Hat/SUSE/Ubuntu), normalization + consensus + `/excititor/resolve`.
2. **Signature policies**: PGP for distros; cosign for OCI.
3. **Exports + optional attestation**.
4. **CycloneDX VEX** connectors; platform claim expansion tables; UI explorer.
5. **Scale hardening**: export indexes; conflict analytics.
---
## 17) Appendix — canonical JSON (stable ordering)
All exports and consensus entries are serialized via `VexCanonicalJsonSerializer`:
* UTF-8 without BOM;
* keys sorted (ASCII);
* arrays sorted by `(providerId, vulnId, productKey, lastObserved)` unless semantic order mandated;
* timestamps in `YYYY-MM-DDThh:mm:ssZ`;
* no insignificant whitespace.
* no insignificant whitespace.
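
The rules in this appendix can be approximated with the Python standard library, as sketched below. This is not `VexCanonicalJsonSerializer` itself; the helper names are hypothetical, and the sketch assumes timestamps use the ISO-8601 extended UTC form (for example `2025-11-05T14:12:30Z`).

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def canonical_timestamp(dt: datetime) -> str:
    """Render an aware datetime as UTC ISO-8601 with second precision."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def sort_rows(rows):
    """Order rows by (providerId, vulnId, productKey, lastObserved)."""
    return sorted(
        rows,
        key=lambda r: (
            r.get("providerId", ""),
            r.get("vulnId", ""),
            r.get("productKey", ""),
            r.get("lastObserved", ""),
        ),
    )


def canonical_json(value) -> bytes:
    """Sorted keys, no insignificant whitespace, UTF-8 without BOM."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(value) -> str:
    """SHA-256 hex digest over the canonical bytes."""
    return hashlib.sha256(canonical_json(value)).hexdigest()
```

Since key order and whitespace are fixed, two logically equal documents hash to the same digest regardless of how their dictionaries were built, which is what makes `consensusDigest` comparable across machines.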


@@ -0,0 +1,65 @@
# Implementation plan — Vexer
## Delivery phases
- **Phase 1: Connectors & normalization**
  Build connectors for OpenVEX, CSAF VEX, CycloneDX VEX, OCI attestations; capture provenance, signatures, and source metadata; normalise into `VexClaim`.
- **Phase 2: Mapping & trust registry**
  Implement product mapping (CPE → purl/version), issuer registry (trust tiers, signatures), scope scoring, and justification taxonomy.
- **Phase 3: Consensus & projections**
  Deliver consensus computation, conflict preservation, projections (`vex_consensus`, history, provider snapshots), and DSSE events.
- **Phase 4: APIs & integrations**
  Expose REST/CLI endpoints for claims, consensus, conflicts, exports; integrate Policy Engine, Vuln Explorer, Advisory AI, Export Center.
- **Phase 5: Observability & offline**
  Ship metrics, logs, traces, dashboards, incident runbooks, Offline Kit bundles, and performance tuning (10M claims/tenant).
## Work breakdown
- **Connectors**
  - Fetchers for vendor feeds, CSAF repositories, OpenVEX docs, OCI referrers.
  - Signature verification (PGP, cosign, PKI) per source; schema validation; rate limiting.
  - Source configuration (trust tier, fetch cadence, blackout windows) stored in metadata registry.
- **Normalization**
  - Canonical `VexClaim` schema with deterministic IDs, provenance, supersedes chains.
  - Product tree parsing, mapping to canonical product keys and environments.
  - Justification and scope scoring derived from source semantics.
- **Consensus & projections**
  - Lattice join with precedence rules, conflict tracking, confidence scores, recency decay.
  - Append-only history, conflict queue, DSSE events (`vex.consensus.updated`).
  - Export-ready JSONL & DSSE bundles for Offline Kit and Export Center.
- **APIs & UX**
  - REST endpoints (`/claims`, `/consensus`, `/conflicts`, `/providers`) with tenant RBAC.
  - CLI commands `stella vex claims|consensus|conflicts|export`.
  - Console modules (list/detail, conflict diagnostics, provider health, simulation hooks).
- **Integrations**
  - Policy Engine trust knobs, Vuln Explorer consensus badges, Advisory AI narrative generation, Notify alerts for conflicts.
  - Orchestrator jobs for recompute/backfill triggered by Excititor deltas.
- **Observability & Ops**
  - Metrics (ingest latency, signature failure rate, conflict rate, consensus latency).
  - Logs/traces with tenant/issuer/provenance context.
  - Runbooks for mapping failures, signature errors, recompute storms, quota exhaustion.
## Acceptance criteria
- Connectors ingest validated VEX statements with signed provenance, deterministic mapping, and tenant isolation.
- Consensus outputs are reproducible, include conflicts, and integrate with Policy Engine/Vuln Explorer/Export Center.
- CLI/Console provide evidence inspection, conflict analysis, and exports; Offline Kit bundles replay verification offline.
- Observability dashboards/alerts capture ingest health, trust anomalies, conflict spikes, and performance budgets.
- Recompute pipeline handles policy changes and new evidence without dropping deterministic outcomes.
## Risks & mitigations
- **Mapping ambiguity:** maintain scope scores, manual overrides, highlight warnings.
- **Signature trust gaps:** issuer registry with auditing, fallback trust policies, tenant overrides.
- **Evidence surges:** orchestrator backpressure, prioritised queues, shardable workers.
- **Performance regressions:** indexing, caching, load tests, budget enforcement.
- **Tenant leakage:** strict RBAC/filters, fuzz tests, compliance reviews.
## Test strategy
- **Unit:** connector parsers, normalization, mapping conversions, lattice operations.
- **Property:** randomised evidence ensuring commutative consensus and deterministic digests.
- **Integration:** end-to-end pipeline from Excititor to consensus export, policy simulation, conflict handling.
- **Performance:** large feed ingestion, recompute stress, CLI export throughput.
- **Security:** signature tampering, issuer revocation, RBAC.
- **Offline:** export/import verification, DSSE bundle validation.
## Definition of done
- Connectors, normalization, consensus, APIs, and integrations deployed with telemetry, runbooks, and Offline Kit parity.
- Documentation (overview, architecture, algorithm, issuer registry, API/CLI, runbooks) updated with imposed rule compliance.
- ./TASKS.md and ../../TASKS.md reflect active status and dependencies.


@@ -0,0 +1,83 @@
## Status
This document tracks the future-looking risk scoring model for Vexer. The calculation below is not active yet; Sprint 7 work will add the required schema fields, policy controls, and services. Until that ships, Vexer emits consensus statuses without numeric scores.
## Scoring model (target state)
**S = Gate(VEX_status) × W_trust(source) × [Severity_base × (1 + α·KEV + β·EPSS)]**
* **Gate(VEX_status)**: `affected`/`under_investigation` → 1, `not_affected`/`fixed` → 0. A trusted “not affected” or “fixed” still zeroes the score.
* **W_trust(source)**: normalized policy weight (baseline 0 to 1). Policies may opt into >1 boosts for signed vendor feeds once Phase 1 closes.
* **Severity_base**: canonical numeric severity from Feedser (CVSS or org-defined scale).
* **KEV flag**: 0/1 boost when CISA Known Exploited Vulnerabilities applies.
* **EPSS**: probability [0,1]; bounded multiplier.
* **α, β**: configurable coefficients (default α=0.25, β=0.5) stored in policy.
Safeguards: freeze boosts when product identity is unknown, clamp outputs ≥0, and log every factor in the audit trail.
## Implementation roadmap
| Phase | Scope | Artifacts |
| --- | --- | --- |
| **Phase 1: Schema foundations** | Extend Vexer consensus/claims and Feedser canonical advisories with severity, KEV, EPSS, and expose α/β + weight ceilings in policy. | Sprint 7 tasks `VEXER-CORE-02-001`, `VEXER-POLICY-02-001`, `VEXER-STORAGE-02-001`, `FEEDCORE-ENGINE-07-001`. |
| **Phase 2: Deterministic score engine** | Implement a scoring component that executes alongside consensus and persists score envelopes with hashes. | Planned task `VEXER-CORE-02-002` (backlog). |
| **Phase 3: Surfacing & enforcement** | Expose scores via WebService/CLI, integrate with Feedser noise priors, and enforce policy-based suppressions. | To be scheduled after Phase 2. |
## Data model (after Phase 1)
```json
{
  "vulnerabilityId": "CVE-2025-12345",
  "product": "pkg:name@version",
  "consensus": {
    "status": "affected",
    "policyRevisionId": "rev-12",
    "policyDigest": "0D9AEC…"
  },
  "signals": {
    "severity": {"scheme": "CVSS:3.1", "score": 7.5},
    "kev": true,
    "epss": 0.40
  },
  "policy": {
    "weight": 1.15,
    "alpha": 0.25,
    "beta": 0.5
  },
  "score": {
    "value": 12.51,
    "generatedAt": "2025-11-05T14:12:30Z",
    "audit": [
      "gate:affected",
      "weight:1.15",
      "severity:7.5",
      "kev:1",
      "epss:0.40"
    ]
  }
}
```
## Operational guidance
* **Inputs**: Feedser delivers severity/KEV/EPSS via the advisory event log; Vexer connectors load VEX statements. Policy owns trust tiers and coefficients.
* **Processing**: the scoring engine (Phase 2) runs next to consensus, storing results with deterministic hashes so exports and attestations can reference them.
* **Consumption**: WebService/CLI will return consensus plus score; scanners may suppress findings only when policy-authorized VEX gating and signed score envelopes agree.
## Pseudocode (Phase 2 preview)
```python
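def risk_score(gate, weight, severity, kev, epss, alpha, beta, freeze_boosts=False):
    # Gate is 0 for not_affected/fixed, which zeroes the score outright.
    if gate == 0:
        return 0
    # Safeguard: drop KEV/EPSS boosts when product identity is unknown.
    if freeze_boosts:
        kev, epss = 0, 0
    boost = 1 + alpha * kev + beta * epss
    # Clamp so the score never goes negative.
    return max(0, weight * severity * boost)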
```
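
As a worked example of the target formula, the sketch below derives the gate from the consensus status and applies the factors to a record shaped like the data model sample (weight 1.15, severity 7.5, KEV set, EPSS 0.40, α=0.25, β=0.5). The function name and the record-access pattern are assumptions; missing signals are treated as zero, as the FAQ suggests.

```python
GATE_OPEN = {"affected", "under_investigation"}


def score_from_record(record, freeze_boosts=False):
    """Compute S = Gate x W_trust x Severity x (1 + alpha*KEV + beta*EPSS)."""
    if record["consensus"]["status"] not in GATE_OPEN:
        return 0.0  # not_affected / fixed zero the score
    signals = record.get("signals", {})
    policy = record["policy"]
    # Missing signals fall back to zero.
    severity = (signals.get("severity") or {}).get("score", 0.0)
    kev = 0 if freeze_boosts else int(bool(signals.get("kev", False)))
    epss = 0.0 if freeze_boosts else float(signals.get("epss") or 0.0)
    boost = 1 + policy["alpha"] * kev + policy["beta"] * epss
    return max(0.0, policy["weight"] * severity * boost)


sample = {
    "consensus": {"status": "affected"},
    "signals": {"severity": {"scheme": "CVSS:3.1", "score": 7.5}, "kev": True, "epss": 0.40},
    "policy": {"weight": 1.15, "alpha": 0.25, "beta": 0.5},
}

# boost = 1 + 0.25*1 + 0.5*0.40 = 1.45, so S = 1.15 * 7.5 * 1.45 = 12.50625
score = score_from_record(sample)
```

With `freeze_boosts=True` the same record scores 1.15 × 7.5 = 8.625, which shows how much of the value comes from the KEV and EPSS boosts.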
## FAQ
* **Can operators opt out?** Set α=β=0 or keep weights ≤1.0 via policy.
* **What about missing signals?** Treat them as zero and log the omission.
* **When will this ship?** Phase 1 is planned for Sprint 7; later phases depend on connector coverage and attestation delivery.