feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
2025-10-30 00:09:39 +02:00
parent 3154c67978
commit 7b5bdcf4d3
503 changed files with 16136 additions and 54638 deletions
--- a/docs/modules/excititor/architecture.md
+++ b/docs/modules/excititor/architecture.md
@@ -0,0 +1,749 @@
+# component_architecture_excititor.md — **Stella Ops Excititor** (Sprint 22)
+
+> Consolidates the VEX ingestion guardrails from Epic 1 with consensus and AI-facing requirements from Epics 7 and 8. This is the authoritative architecture record for Excititor.
+
+> **Scope.** This document specifies the **Excititor** service: its purpose, trust model, data structures, observation/linkset pipelines, APIs, plug-in contracts, storage schema, performance budgets, testing matrix, and how it integrates with Concelier, Policy Engine, and evidence surfaces. It is implementation-ready.
+
+---
+
+## 0) Mission & role in the platform
+
+**Mission.** Convert heterogeneous **VEX** statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into immutable **VEX observations**, correlate them into **linksets** that retain provenance/conflicts without precedence, and publish deterministic evidence exports and events that Policy Engine, Console, and CLI use to suppress or explain findings.
+
+**Boundaries.**
+
+* Excititor **does not** decide PASS/FAIL. It supplies **evidence** (statuses + justifications + provenance weights).
+* Excititor preserves **conflicting observations** unchanged; consensus (when enabled) merely annotates how policy might choose, but raw evidence remains exportable.
+* VEX consumption is **backend-only**: Scanner never applies VEX. The backend’s **Policy Engine** asks Excititor for status evidence and then decides what to show.
+
+---
+
+## 1) Aggregation guardrails (AOC baseline)
+
+Excititor enforces the same ingestion covenant as Concelier, tailored to VEX payloads:
+
+1. **Immutable `vex_raw` documents.** Upstream OpenVEX/CSAF/CycloneDX files are stored verbatim (`content.raw`) with provenance (`issuer`, `statement_id`, timestamps, signatures). Revisions append new versions linked by `supersedes`.
+2. **No derived consensus at ingest time.** Fields such as `effective_status`, `merged_state`, `severity`, or reachability are forbidden. Roslyn analyzers and runtime guards block violations before writes.
+3. **Linkset-only joins.** Product aliases, CVE keys, SBOM hints, and references live under `linkset`; ingestion must never mutate the underlying statement.
+4. **Deterministic canonicalisation.** Writers sort JSON keys/arrays, normalize timestamps (UTC ISO‑8601), and hash content for reproducible exports.
+5. **AOC verifier.** `StellaOps.AOC.Verifier` runs in CI and production, checking schema compliance, provenance completeness, sorted collections, and signature metadata.
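
Guardrail 2 can be pictured as a recursive check that runs before a write. The following is a minimal Python sketch, not the Roslyn analyzer or `StellaOps.AOC.Verifier`; `find_forbidden_fields` and the choice to skip `content.raw` (which holds the verbatim upstream payload) are assumptions, and only the three field names spelled out in the list above are covered.

```python
# Field names taken from guardrail 2; the traversal itself is illustrative.
FORBIDDEN_FIELDS = {"effective_status", "merged_state", "severity"}


def find_forbidden_fields(doc, path=""):
    """Return sorted paths of forbidden derived fields in a document about to be written."""
    hits = []
    if isinstance(doc, dict):
        for key, value in doc.items():
            child = f"{path}.{key}" if path else key
            if child == "content.raw":
                continue  # assumption: the verbatim upstream payload is not inspected
            if key in FORBIDDEN_FIELDS:
                hits.append(child)
            hits.extend(find_forbidden_fields(value, child))
    elif isinstance(doc, list):
        for index, item in enumerate(doc):
            hits.extend(find_forbidden_fields(item, f"{path}[{index}]"))
    return sorted(hits)
```

A writer would refuse the document whenever the returned list is non-empty and log the offending paths.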
+
+### 1.1 VEX raw document shape
+
+```json
+{
+  "_id": "vex_raw:openvex:VEX-2025-00001:v2",
+  "source": {
+    "issuer": "vendor:redhat",
+    "stream": "openvex",
+    "api": "https://vendor/api/vex/VEX-2025-00001.json",
+    "collector_version": "excititor/0.9.4"
+  },
+  "upstream": {
+    "statement_id": "VEX-2025-00001",
+    "document_version": "2025-08-30T12:00:00Z",
+    "fetched_at": "2025-08-30T12:05:00Z",
+    "received_at": "2025-08-30T12:05:01Z",
+    "content_hash": "sha256:...",
+    "signature": {
+      "present": true,
+      "format": "dsse",
+      "key_id": "rekor:uuid",
+      "sig": "base64..."
+    }
+  },
+  "content": {
+    "format": "openvex",
+    "spec_version": "1.0",
+    "raw": { /* upstream statement */ }
+  },
+  "identifiers": {
+    "cve": ["CVE-2025-13579"],
+    "products": [
+      {"purl": "pkg:rpm/redhat/openssl@3.0.9", "component": "openssl"}
+    ]
+  },
+  "linkset": {
+    "aliases": ["REDHAT:RHSA-2025:1234"],
+    "sbom_products": ["pkg:rpm/redhat/openssl@3.0.9"],
+    "justifications": ["reasonable_worst_case_assumption"],
+    "references": [
+      {"type": "advisory", "url": "https://..."}
+    ]
+  },
+  "supersedes": "vex_raw:openvex:VEX-2025-00001:v1",
+  "tenant": "default"
+}
+```
+
+### 1.2 Issuer trust registry
+
+To enable Epic 7’s consensus lens, Excititor maintains `vex_issuer_registry` documents containing:
+
+- `issuer_id`, canonical name, and allowed domains.
+- `trust.tier` (`critical`, `high`, `medium`, `low`), `trust.confidence` (0–1).
+- `products` PURL patterns the issuer is authoritative for.
+- `signing_keys` with key IDs and expiry.
+- `last_validated_at`, `revocation_status`.
+
+The registry is distributed as a signed bundle and cached locally; ingestion rejects statements from issuers without registry entries or valid signatures.
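
The rejection rule above could look roughly like the sketch below. It only checks registry membership, revocation, and key expiry; cryptographic signature verification is out of scope here. The dictionary layout, the `key_id`/`expires_at` field names, the `"revoked"` value, and the reason strings are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def accept_statement(registry, issuer_id, key_id, now):
    """Return (accepted, reason) for a statement signed with key_id by issuer_id."""
    entry = registry.get(issuer_id)
    if entry is None:
        return False, "unknown_issuer"
    if entry.get("revocation_status") == "revoked":  # hypothetical status value
        return False, "issuer_revoked"
    for key in entry.get("signing_keys", []):
        # A key counts only if it is registered for the issuer and not yet expired.
        if key["key_id"] == key_id and key["expires_at"] > now:
            return True, "ok"
    return False, "no_valid_signing_key"
```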
+
+### 1.3 Normalised tuple store
+
+Excititor derives `vex_normalized` tuples (without making decisions) for downstream consumers:
+
+```json
+{
+  "advisory_key": "CVE-2025-13579",
+  "artifact": "pkg:rpm/redhat/openssl@3.0.9",
+  "issuer": "vendor:redhat",
+  "status": "not_affected",
+  "justification": "component_not_present",
+  "scope": "runtime_path",
+  "timestamp": "2025-08-30T12:00:00Z",
+  "trust": {"tier": "high", "confidence": 0.95},
+  "statement_id": "VEX-2025-00001:v2",
+  "content_hash": "sha256:..."
+}
+```
+
+These tuples allow VEX Lens to compute deterministic consensus without re-parsing heavy upstream documents.
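
As an illustration of how a consumer might use these tuples, the sketch below groups them by `(advisory_key, artifact)` and orders each group deterministically. The function name and the sort key are assumptions; VEX Lens may organise its input differently.

```python
from collections import defaultdict


def group_tuples(tuples):
    """Group normalised tuples per (advisory_key, artifact) in a stable order."""
    groups = defaultdict(list)
    for item in tuples:
        groups[(item["advisory_key"], item["artifact"])].append(item)
    # Sort groups by key and members by (issuer, statement_id) so repeated
    # runs over the same input always yield the same structure.
    return {
        key: sorted(members, key=lambda t: (t["issuer"], t["statement_id"]))
        for key, members in sorted(groups.items())
    }
```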
+
+### 1.4 AI-ready citations
+
+`GET /v1/vex/statements/{advisory_key}` produces sorted JSON responses containing raw statement metadata (`issuer`, `content_hash`, `signature`), normalised tuples, and provenance pointers. Advisory AI consumes this endpoint to build retrieval contexts with explicit citations.
+
+---
+
+## 2) Inputs, outputs & canonical domain
+
+### 2.1 Accepted input formats (ingest)
+
+* **OpenVEX** JSON documents (attested or raw).
+* **CSAF VEX** 2.x (vendor PSIRTs and distros commonly publish CSAF).
+* **CycloneDX VEX** 1.4+ (standalone VEX or embedded VEX blocks).
+* **OCI‑attached attestations** (VEX statements shipped as OCI referrers) — optional connectors.
+
+All connectors register **source metadata**: provider identity, trust tier, signature expectations (PGP/cosign/PKI), fetch windows, rate limits, and time anchors.
+
+### 2.2 Canonical model (observations & linksets)
+
+#### VexObservation
+
+```jsonc
+observationId       // {tenant}:{providerId}:{upstreamId}:{revision}
+tenant
+providerId          // e.g., redhat, suse, ubuntu, osv
+streamId            // connector stream (csaf, openvex, cyclonedx, attestation)
+upstream{
+    upstreamId,
+    documentVersion?,
+    fetchedAt,
+    receivedAt,
+    contentHash,
+    signature{present, format?, keyId?, signature?}
+}
+statements[
+  {
+    vulnerabilityId,
+    productKey,
+    status,                    // affected | not_affected | fixed | under_investigation
+    justification?,
+    introducedVersion?,
+    fixedVersion?,
+    lastObserved,
+    locator?,                  // JSON Pointer/line for provenance
+    evidence?[]
+  }
+]
+content{
+    format,
+    specVersion?,
+    raw
+}
+linkset{
+    aliases[],                 // CVE/GHSA/vendor IDs
+    purls[],
+    cpes[],
+    references[{type,url}],
+    reconciledFrom[]
+}
+supersedes?
+createdAt
+attributes?
+```
+
+#### VexLinkset
+
+```jsonc
+linksetId           // sha256 over sorted (tenant, vulnId, productKey, observationIds)
+tenant
+key{
+    vulnerabilityId,
+    productKey,
+    confidence          // low|medium|high
+}
+observations[] = [
+  {
+    observationId,
+    providerId,
+    status,
+    justification?,
+    introducedVersion?,
+    fixedVersion?,
+    evidence?,
+    collectedAt
+  }
+]
+aliases{
+    primary,
+    others[]
+}
+purls[]
+cpes[]
+conflicts[]?        // see VexLinksetConflict
+createdAt
+updatedAt
+```
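
The `linksetId` comment can be read as a hash over the key fields plus the sorted observation IDs. A minimal sketch under that reading follows; the newline separator and the `sha256:` prefix are assumptions, since the exact byte layout is not specified here.

```python
import hashlib


def linkset_id(tenant, vulnerability_id, product_key, observation_ids):
    """Derive a linkset ID that is independent of observation arrival order."""
    # Sorting the observation IDs makes the digest insensitive to input order.
    parts = [tenant, vulnerability_id, product_key, *sorted(observation_ids)]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
    return f"sha256:{digest}"
```

Because the observation IDs feed the hash, adding an observation yields a new `linksetId` under this reading.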
+
+#### VexLinksetConflict
+
+```jsonc
+conflictId
+type                // status-mismatch | justification-divergence | version-range-clash | non-joinable-overlap | metadata-gap
+field?              // optional pointer for UI rendering
+statements[]        // per-observation values with providerId + status/justification/version data
+confidence
+detectedAt
+```
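
To make the conflict record concrete, here is a hedged sketch that detects only the `status-mismatch` type from a linkset's observations. The function name and the returned shape are illustrative; `conflictId`, `confidence`, and `detectedAt` are left out because their derivation is not described here.

```python
def detect_status_mismatch(observations):
    """Return a status-mismatch conflict dict, or None when all observations agree."""
    statuses = {obs["status"] for obs in observations}
    if len(statuses) <= 1:
        return None
    # Sort per-observation values so the conflict payload is stable across runs.
    statements = sorted(
        ({"providerId": obs["providerId"], "status": obs["status"]} for obs in observations),
        key=lambda s: (s["providerId"], s["status"]),
    )
    return {"type": "status-mismatch", "field": "status", "statements": statements}
```

Both observations stay in the linkset unchanged; the conflict only annotates the disagreement.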
+
+#### VexConsensus (optional)
+
+```jsonc
+consensusId         // sha256(vulnerabilityId, productKey, policyRevisionId)
+vulnerabilityId
+productKey
+rollupStatus        // derived by Excititor policy adapter (linkset aware)
+sources[]           // observation references with weight, accepted flag, reason
+policyRevisionId
+evaluatedAt
+consensusDigest
+```
+
+Consensus persists only when Excititor policy adapters require pre-computed rollups (e.g., Offline Kit). Policy Engine can also compute consensus on demand from linksets.
+
+### 2.3 Exports & evidence bundles
+
+* **Raw observations** — JSON tree per observation for auditing/offline.
+* **Linksets** — grouped evidence for policy/Console/CLI consumption.
+* **Consensus (optional)** — if enabled, mirrors existing API contracts.
+* **Provider snapshots** — last N days of observations per provider to support diagnostics.
+* **Index** — `(productKey, vulnerabilityId) → {status candidates, confidence, observationIds}` for high-speed joins.
+
+All exports remain deterministic and, when configured, attested via DSSE + Rekor v2.
+
+---
+
+## 3) Identity model — products & joins
+
+### 3.1 Vuln identity
+
+* Accepts **CVE**, **GHSA**, vendor IDs (MSRC, RHSA…), distro IDs (DSA/USN/RHSA…) — normalized to `vulnId` with alias sets.
+* **Alias graph** maintained (from Concelier) to map vendor/distro IDs → CVE (primary) and to **GHSA** where applicable.
+
+### 3.2 Product identity (`productKey`)
+
+* **Primary:** `purl` (Package URL).
+* **Secondary links:** `cpe`, **OS package NVRA/EVR**, NuGet/Maven/Golang identity, and **OS package name** when purl unavailable.
+* **Fallback:** `oci:<registry>/<repo>@<digest>` for image‑level VEX.
+* **Special cases:** kernel modules, firmware, platforms → provider‑specific mapping helpers (connector captures provider’s product taxonomy → canonical `productKey`).
+
+> Excititor does not invent identities. If a provider cannot be mapped to purl/CPE/NVRA deterministically, we keep the native **product string** and mark the claim as **non‑joinable**; the backend will ignore it unless a policy explicitly allowlists that provider mapping.
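
The precedence described in this section can be sketched as a simple fallback chain. The input record and its field names (`purl`, `cpe`, `image_ref`, `native`) are hypothetical, and the NVRA/EVR mapping tables are omitted; the point is only that nothing is guessed when no deterministic identity exists.

```python
def resolve_product_key(product):
    """Map a provider product record to (productKey, joinable)."""
    if product.get("purl"):
        return product["purl"], True  # primary identity
    if product.get("cpe"):
        return product["cpe"], True  # secondary link
    if product.get("image_ref"):  # e.g. "registry.example/repo@sha256:..."
        return f"oci:{product['image_ref']}", True  # image-level fallback
    # No deterministic mapping: keep the native string and flag it non-joinable.
    return product["native"], False
```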
+
+---
+
+## 4) Storage schema (MongoDB)
+
+Database: `excititor`
+
+### 4.1 Collections
+
+**`vex.providers`**
+
+```
+_id: providerId
+name, homepage, contact
+trustTier: enum {vendor, distro, platform, hub, attestation}
+signaturePolicy: { type: pgp|cosign|x509|none, keys[], certs[], cosignKeylessRoots[] }
+fetch: { baseUrl, kind: http|oci|file, rateLimit, etagSupport, windowDays }
+enabled: bool
+createdAt, modifiedAt
+```
+
+**`vex.raw`** (immutable raw documents)
+
+```
+_id: sha256(doc bytes)
+providerId
+uri
+ingestedAt
+contentType
+sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
+payload: GridFS pointer (if large)
+disposition: kept|replaced|superseded
+correlation: { replaces?: sha256, replacedBy?: sha256 }
+```
+
+**`vex.observations`**
+
+```
+{
+  _id: "tenant:providerId:upstreamId:revision",
+  tenant,
+  providerId,
+  streamId,
+  upstream: { upstreamId, documentVersion?, fetchedAt, receivedAt, contentHash, signature },
+  statements: [
+    {
+      vulnerabilityId,
+      productKey,
+      status,
+      justification?,
+      introducedVersion?,
+      fixedVersion?,
+      lastObserved,
+      locator?,
+      evidence?
+    }
+  ],
+  content: { format, specVersion?, raw },
+  linkset: { aliases[], purls[], cpes[], references[], reconciledFrom[] },
+  supersedes?,
+  createdAt,
+  attributes?
+}
+```
+
+  * Indexes: `{tenant:1, providerId:1, upstream.upstreamId:1}`, `{tenant:1, statements.vulnerabilityId:1}`, `{tenant:1, linkset.purls:1}`, `{tenant:1, createdAt:-1}`.
+
+**`vex.linksets`**
+
+```
+{
+  _id: "sha256:...",
+  tenant,
+  key: { vulnerabilityId, productKey, confidence },
+  observations: [
+    { observationId, providerId, status, justification?, introducedVersion?, fixedVersion?, evidence?, collectedAt }
+  ],
+  aliases: { primary, others: [] },
+  purls: [],
+  cpes: [],
+  conflicts: [],
+  createdAt,
+  updatedAt
+}
+```
+
+  * Indexes: `{tenant:1, key.vulnerabilityId:1, key.productKey:1}`, `{tenant:1, purls:1}`, `{tenant:1, updatedAt:-1}`.
+
+**`vex.events`** (observation/linkset events, optional long retention)
+
+```
+{
+  _id: ObjectId,
+  tenant,
+  type: "vex.observation.updated" | "vex.linkset.updated",
+  key,
+  delta,
+  hash,
+  occurredAt
+}
+```
+
+  * Indexes: `{type:1, occurredAt:-1}`, TTL on `occurredAt` for configurable retention.
+
+**`vex.consensus`** (optional rollups)
+
+```
+_id: sha256(canonical(vulnerabilityId, productKey, policyRevisionId))
+vulnerabilityId
+productKey
+rollupStatus
+sources[]      // observation references with weights/reasons
+policyRevisionId
+evaluatedAt
+signals?       // optional severity/kev/epss hints
+consensusDigest
+```
+
+  * Indexes: `{vulnerabilityId:1, productKey:1}`, `{policyRevisionId:1, evaluatedAt:-1}`.
+
+**`vex.exports`** (manifest of emitted artifacts)
+
+```
+_id
+querySignature
+format: raw|consensus|index
+artifactSha256
+rekor { uuid, index, url }?
+createdAt
+policyRevisionId
+cacheable: bool
+```
+
+**`vex.cache`** — observation/linkset export cache: `{querySignature, exportId, ttl, hits}`.
+
+**`vex.migrations`** — ordered migrations ensuring new indexes (`20251027-linksets-introduced`, etc.).
+
+### 4.2 Indexing strategy
+
+* Hot path queries rely on `{tenant, key.vulnerabilityId, key.productKey}` covering linkset lookup.
+* Observability queries use `{tenant, updatedAt}` to monitor staleness.
+* Consensus (if enabled) keyed by `{vulnerabilityId, productKey, policyRevisionId}` for deterministic reuse.
+
+---
+
+## 5) Ingestion pipeline
+
+### 5.1 Connector contract
+
+```csharp
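+using System.Threading;
+using System.Threading.Tasks;
+
+public interface IVexConnector
+{
+    // Stable provider identifier (the `_id` used in `vex.providers`).
+    string ProviderId { get; }
+    // Fetches raw upstream documents for the scheduled window.
+    Task FetchAsync(VexConnectorContext ctx, CancellationToken ct);
+    // Normalizes fetched raw documents into ObservationStatements[].
+    Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct);
+}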
+```
+
+* **Fetch** must implement: window scheduling, conditional GET (ETag/If‑Modified‑Since), rate limiting, retry/backoff.
+* **Normalize** parses the format, validates schema, maps product identities deterministically, emits observation statements with **provenance** metadata (locator, justification, version ranges).
+
+### 5.2 Signature verification (per provider)
+
+* **cosign (keyless or keyful)** for OCI referrers or HTTP‑served JSON with Sigstore bundles.
+* **PGP** (provider keyrings) for distro/vendor feeds that sign docs.
+* **x509** (mutual TLS / provider‑pinned certs) where applicable.
+* Signature state is stored on **vex.raw.sig** and copied into `statements[].signatureState` so downstream policy can gate by verification result.
+
+> Observation statements from sources failing signature policy are marked `"signatureState.verified=false"` and policy can down-weight or ignore them.
+
+### 5.3 Time discipline
+
+* For each doc, prefer **provider’s document timestamp**; if absent, use fetch time.
+* Statements carry `lastObserved` which drives **tie-breaking** within equal weight tiers.
+
+---
+
+## 6) Normalization: product & status semantics
+
+### 6.1 Product mapping
+
+* **purl** first; **cpe** second; OS package NVRA/EVR mapping helpers (distro connectors) produce purls via canonical tables (e.g., rpm→purl:rpm, deb→purl:deb).
+* Where a provider publishes **platform‑level** VEX (e.g., “RHEL 9 not affected”), connectors expand to known product inventory rules (e.g., map to sets of packages/components shipped in the platform). Expansion tables are versioned and kept per provider; every expansion emits **evidence** indicating the rule applied.
+* If expansion would be speculative, the statement remains **platform-scoped** with `productKey="platform:redhat:rhel:9"` and is flagged **non-joinable**; backend can decide to use platform VEX only when Scanner proves the platform runtime.
+
+### 6.2 Status + justification mapping
+
+* Canonical **status**: `affected | not_affected | fixed | under_investigation`.
+* **Justifications** normalized to a controlled vocabulary (CISA‑aligned), e.g.:
+
+  * `component_not_present`
+  * `vulnerable_code_not_in_execute_path`
+  * `vulnerable_configuration_unused`
+  * `inline_mitigation_applied`
+  * `fix_available` (with `fixedVersion`)
+  * `under_investigation`
+* Providers with free‑text justifications are mapped by deterministic tables; raw text preserved as `evidence`.
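
A deterministic mapping table for free-text justifications might look like the sketch below. The table entries and the whitespace/case folding are assumptions; real tables would be maintained per provider. The raw text is always carried along as `evidence`, and unmapped text yields no justification rather than a guess.

```python
# Hypothetical mapping entries; keys are lower-cased, whitespace-collapsed text.
JUSTIFICATION_TABLE = {
    "component not present": "component_not_present",
    "vulnerable code not in execute path": "vulnerable_code_not_in_execute_path",
    "vulnerable configuration unused": "vulnerable_configuration_unused",
    "inline mitigation applied": "inline_mitigation_applied",
}


def normalize_justification(raw_text):
    """Map provider free text to the controlled vocabulary, keeping the raw text."""
    key = " ".join(raw_text.lower().split())
    return {"justification": JUSTIFICATION_TABLE.get(key), "evidence": [raw_text]}
```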
+
+---
+
+## 7) Consensus algorithm
+
+**Goal:** produce a **stable**, explainable `rollupStatus` per `(vulnId, productKey)` when consumers opt into Excititor-managed consensus derived from linksets.
+
+### 7.1 Inputs
+
+* Set **S** of observation statements drawn from the current `VexLinkset` for `(tenant, vulnId, productKey)`.
+* **Excititor policy snapshot**:
+
+  * **weights** per provider tier and per provider overrides.
+  * **justification gates** (e.g., require justification for `not_affected` to be acceptable).
+  * **minEvidence** rules (e.g., `not_affected` must come from ≥1 vendor or 2 distros).
+  * **signature requirements** (e.g., require verified signature for ‘fixed’ to be considered).
+
+### 7.2 Steps
+
+1. **Filter invalid** statements by signature policy & justification gates → set `S'`.
+2. **Score** each statement:
+   `score = weight(provider) * freshnessFactor(lastObserved)` where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect). Observations lacking verified signatures receive policy-configured penalties.
+3. **Aggregate** scores per status: `W(status) = Σ score(statements with that status)`.
+4. **Pick** `rollupStatus = argmax_status W(status)`.
+5. **Tie‑breakers** (in order):
+
+   * Higher **max single** provider score wins (vendor > distro > platform > hub).
+   * More **recent** lastObserved wins.
+   * Fixed, deterministic precedence of status (`fixed` > `not_affected` > `under_investigation` > `affected`) as final tiebreaker. Note that this is an explicit ordering, not alphabetical order.
+6. **Explain**: mark accepted observations (`accepted=true; reason="weight"`/`"freshness"`/`"confidence"`) and rejected ones with explicit `reason` (`"insufficient_justification"`, `"signature_unverified"`, `"lower_weight"`, `"low_confidence_linkset"`).
+
+> The algorithm is **pure** given `S` and policy snapshot; result is reproducible and hashed into `consensusDigest`.
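
The steps above can be sketched in a few lines. This is an illustrative reduction, not Excititor's implementation: the linear 30-day decay curve, the `tier` field on each statement, and the function names are assumptions; signature gates, `minEvidence` rules, and the explain step are omitted.

```python
from datetime import datetime, timedelta, timezone

STATUS_PRECEDENCE = ["fixed", "not_affected", "under_investigation", "affected"]


def freshness_factor(last_observed, now, horizon_days=30.0):
    """Decay linearly from 1.0 to 0.8 over horizon_days (the curve is assumed)."""
    age_days = max((now - last_observed).total_seconds() / 86400.0, 0.0)
    return 1.0 - 0.2 * min(age_days / horizon_days, 1.0)


def rollup(statements, weights, now, require_justification=True):
    """Return the rollup status, or None when no statement survives filtering."""
    # Step 1: justification gate for not_affected.
    valid = [
        s for s in statements
        if not (require_justification and s["status"] == "not_affected"
                and not s.get("justification"))
    ]
    if not valid:
        return None
    # Steps 2-3: score each statement and aggregate per status.
    totals, best, latest = {}, {}, {}
    for s in valid:
        status = s["status"]
        score = weights[s["tier"]] * freshness_factor(s["lastObserved"], now)
        totals[status] = totals.get(status, 0.0) + score
        best[status] = max(best.get(status, 0.0), score)
        latest[status] = max(latest.get(status, s["lastObserved"]), s["lastObserved"])
    # Steps 4-5: argmax, then max single score, recency, and fixed precedence.
    return max(
        totals,
        key=lambda st: (totals[st], best[st], latest[st], -STATUS_PRECEDENCE.index(st)),
    )
```

Ties are compared with exact floating-point equality in this sketch; a production version would more likely compare rounded or fixed-point scores so that the result stays reproducible across platforms.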
+
+---
+
+## 8) Query & export APIs
+
+All endpoints are versioned under `/api/v1/vex`.
+
+### 8.1 Query (online)
+
+```
+POST /observations/search
+  body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string }
+  → { observations[], nextPageToken? }
+
+POST /linksets/search
+  body: { vulnIds?: string[], productKeys?: string[], confidence?: string[], since?: timestamp, limit?: int, pageToken?: string }
+  → { linksets[], nextPageToken? }
+
+POST /consensus/search
+  body: { vulnIds?: string[], productKeys?: string[], policyRevisionId?: string, since?: timestamp, limit?: int, pageToken?: string }
+  → { entries[], nextPageToken? }
+
+POST /excititor/resolve (scope: vex.read)
+  body: { productKeys?: string[], purls?: string[], vulnerabilityIds: string[], policyRevisionId?: string }
+  → { policy, resolvedAt, results: [ { vulnerabilityId, productKey, status, observations[], conflicts[], linksetConfidence, consensus?, signals?, envelope? } ] }
+```
+
+### 8.2 Exports (cacheable snapshots)
+
+```
+POST /exports
+  body: { signature: { vulnFilter?, productFilter?, providers?, since? }, format: raw|consensus|index, policyRevisionId?: string, force?: bool }
+  → { exportId, artifactSha256, rekor? }
+
+GET  /exports/{exportId}        → bytes (application/json or binary index)
+GET  /exports/{exportId}/meta   → { signature, policyRevisionId, createdAt, artifactSha256, rekor? }
+```
+
+### 8.3 Provider operations
+
+```
+GET  /providers                  → provider list & signature policy
+POST /providers/{id}/refresh     → trigger fetch/normalize window
+GET  /providers/{id}/status      → last fetch, doc counts, signature stats
+```
+
+**Auth:** service‑to‑service via Authority tokens; operator operations via UI/CLI with RBAC.
+
+---
+
+## 9) Attestation integration
+
+* Exports can be **DSSE‑signed** via **Signer** and logged to **Rekor v2** via **Attestor** (optional but recommended for regulated pipelines).
+* `vex.exports.rekor` stores `{uuid, index, url}` when present.
+* **Predicate type**: `https://stella-ops.org/attestations/vex-export/1` with fields:
+
+  * `querySignature`, `policyRevisionId`, `artifactSha256`, `createdAt`.
+
+---
+
+## 10) Configuration (YAML)
+
+```yaml
+excititor:
+  mongo: { uri: "mongodb://mongo/excititor" }
+  s3:
+    endpoint: http://minio:9000
+    bucket: stellaops
+  policy:
+    weights:
+      vendor: 1.0
+      distro: 0.9
+      platform: 0.7
+      hub: 0.5
+      attestation: 0.6
+      ceiling: 1.25
+    scoring:
+      alpha: 0.25
+      beta: 0.5
+    providerOverrides:
+      redhat: 1.0
+      suse: 0.95
+    requireJustificationForNotAffected: true
+    signatureRequiredForFixed: true
+    minEvidence:
+      not_affected:
+        vendorOrTwoDistros: true
+  connectors:
+    - providerId: redhat
+      kind: csaf
+      baseUrl: https://access.redhat.com/security/data/csaf/v2/
+      signaturePolicy: { type: pgp, keys: [ "…redhat-pgp-key…" ] }
+      windowDays: 7
+    - providerId: suse
+      kind: csaf
+      baseUrl: https://ftp.suse.com/pub/projects/security/csaf/
+      signaturePolicy: { type: pgp, keys: [ "…suse-pgp-key…" ] }
+    - providerId: ubuntu
+      kind: openvex
+      baseUrl: https://…/vex/
+      signaturePolicy: { type: none }
+    - providerId: vendorX
+      kind: cyclonedx-vex
+      ociRef: ghcr.io/vendorx/vex@sha256:…
+      signaturePolicy: { type: cosign, cosignKeylessRoots: [ "sigstore-root" ] }
+```
+
+### 10.1 WebService endpoints
+
+With storage configured, the WebService exposes the following ingress and diagnostic APIs:
+
+* `GET /excititor/status` – returns the active storage configuration and registered artifact stores.
+* `GET /excititor/health` – simple liveness probe.
+* `POST /excititor/statements` – accepts normalized VEX statements and persists them via `IVexClaimStore`; use this for migrations/backfills.
+* `GET /excititor/statements/{vulnId}/{productKey}?since=` – returns the immutable statement log for a vulnerability/product pair.
+* `POST /excititor/resolve` – requires `vex.read` scope; accepts up to 256 `(vulnId, productKey)` pairs via `productKeys` or `purls` and returns deterministic consensus results, decision telemetry, and a signed envelope (`artifact` digest, optional signer signature, optional attestation metadata + DSSE envelope). Returns **409 Conflict** when the requested `policyRevisionId` mismatches the active snapshot.
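
Because `/excititor/resolve` accepts at most 256 pairs per call, a client with a larger workload has to batch. A minimal client-side sketch follows; `chunk_pairs` is a hypothetical helper, and how each batch is turned into a request body is left out.

```python
def chunk_pairs(pairs, limit=256):
    """Yield consecutive slices of at most `limit` (vulnId, productKey) pairs."""
    for start in range(0, len(pairs), limit):
        yield pairs[start:start + limit]
```

Keeping the pair order stable before chunking helps the batches, and therefore the responses, remain reproducible between runs.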
+
+Run the ingestion endpoint once after applying migration `20251019-consensus-signals-statements` to repopulate historical statements with the new severity/KEV/EPSS signal fields.
+
+* `weights.ceiling` raises the deterministic clamp applied to provider tiers/overrides (range 1.0‒5.0). Values outside the range are clamped with warnings so operators can spot typos.
+* `scoring.alpha` / `scoring.beta` configure KEV/EPSS boosts for the Phase 1 → Phase 2 scoring pipeline. Defaults (0.25, 0.5) preserve prior behaviour; negative or excessively large values fall back with diagnostics.
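
The clamping behaviour described for `weights.ceiling` could be sketched as follows. The logger name, message text, and helper names are assumptions; the 1.0 to 5.0 range comes from the note above.

```python
import logging

log = logging.getLogger("excititor.config")  # hypothetical logger name


def clamp_ceiling(value, low=1.0, high=5.0):
    """Clamp a configured ceiling into [low, high], warning when it was out of range."""
    clamped = min(max(value, low), high)
    if clamped != value:
        log.warning("weights.ceiling %s out of range; clamped to %s", value, clamped)
    return clamped


def effective_weight(weight, ceiling):
    """Apply the (clamped) ceiling to a provider tier weight or override."""
    return min(weight, clamp_ceiling(ceiling))
```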
+
+---
+
+## 11) Security model
+
+* **Input signature verification** enforced per provider policy (PGP, cosign, x509).
+* **Connector allowlists**: outbound fetch constrained to configured domains.
+* **Tenant isolation**: per‑tenant DB prefixes or separate DBs; per‑tenant S3 prefixes; per‑tenant policies.
+* **AuthN/Z**: Authority‑issued OpToks; RBAC roles (`vex.read`, `vex.admin`, `vex.export`).
+* **No secrets in logs**; deterministic logging contexts include providerId, docDigest, observationId, and linksetId.
+
+---
+
+## 12) Performance & scale
+
+* **Targets:**
+
+  * Normalize 10k observation statements/minute/core.
+  * Linkset rebuild ≤ 20 ms P95 for 1k unique `(vuln, product)` pairs in hot cache.
+  * Consensus (when enabled) compute ≤ 50 ms for 1k unique `(vuln, product)` pairs.
+  * Export (observations + linksets) 1M rows in ≤ 60 s on 8 cores with streaming writer.
+
+* **Scaling:**
+
+  * WebService handles control APIs; **Worker** background services (same image) execute fetch/normalize in parallel with rate‑limits; Mongo writes batched; upserts by natural keys.
+  * Exports stream straight to S3 (MinIO) with rolling buffers.
+
+* **Caching:**
+
+  * `vex.cache` maps query signatures → export; TTL to avoid stampedes; optimistic reuse unless `force`.
+
+### 12.1 Worker TTL refresh controls
+
+Excititor.Worker ships with a background refresh service that re-evaluates stale consensus rows and applies stability dampers before publishing status flips. Operators can tune its behaviour through the following configuration (shown in `appsettings.json` syntax):
+
+```jsonc
+{
+  "Excititor": {
+    "Worker": {
+      "Refresh": {
+        "Enabled": true,
+        "ConsensusTtl": "02:00:00",       // refresh consensus older than 2 hours
+        "ScanInterval": "00:10:00",       // sweep cadence
+        "ScanBatchSize": 250,              // max documents examined per sweep
+        "Damper": {
+          "Minimum": "1.00:00:00",       // lower bound before status flip publishes
+          "Maximum": "2.00:00:00",       // upper bound guardrail
+          "DefaultDuration": "1.12:00:00",
+          "Rules": [
+            { "MinWeight": 0.90, "Duration": "1.00:00:00" },
+            { "MinWeight": 0.75, "Duration": "1.06:00:00" },
+            { "MinWeight": 0.50, "Duration": "1.12:00:00" }
+          ]
+        }
+      }
+    }
+  }
+}
+```
+
+* `ConsensusTtl` governs when the worker issues a fresh resolve for cached consensus data.
+* `Damper` lengths are clamped between `Minimum`/`Maximum`; duration is bypassed when component fingerprints (`VexProduct.ComponentIdentifiers`) change.
+* The same keys are available through environment variables (e.g., `Excititor__Worker__Refresh__ConsensusTtl=02:00:00`).
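
One plausible reading of the damper rules is "use the duration of the highest `MinWeight` rule the consensus weight satisfies, else `DefaultDuration`, then clamp". The sketch below implements that reading only; the rule-selection order is an assumption, and the fingerprint bypass is not modelled.

```python
from datetime import timedelta


def damper_duration(weight, rules, default, minimum, maximum):
    """Pick the damper duration for a consensus weight and clamp it to the bounds."""
    duration = default
    # Assumed semantics: the rule with the highest satisfied MinWeight wins.
    for rule in sorted(rules, key=lambda r: r["MinWeight"], reverse=True):
        if weight >= rule["MinWeight"]:
            duration = rule["Duration"]
            break
    return min(max(duration, minimum), maximum)
```

With the sample configuration above, a weight of 0.8 would map to 30 hours (`1.06:00:00`) under this interpretation.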
+
+---
+
+## 13) Observability
+
+* **Metrics:**
+
+  * `vex.fetch.requests_total{provider}` / `vex.fetch.bytes_total{provider}`
+  * `vex.fetch.failures_total{provider,reason}` / `vex.signature.failures_total{provider,method}`
+  * `vex.normalize.statements_total{provider}`
+  * `vex.observations.write_total{result}`
+  * `vex.linksets.updated_total{result}` / `vex.linksets.conflicts_total{type}`
+  * `vex.consensus.rollup_total{status}` (when enabled)
+  * `vex.exports.bytes_total{format}` / `vex.exports.latency_seconds{format}`
+* **Tracing:** spans for fetch, verify, parse, map, observe, linkset, consensus, export.
+* **Dashboards:** provider staleness, linkset conflict hot spots, signature posture, export cache hit-rate.
+
+---
+
+## 14) Testing matrix
+
+* **Connectors:** golden raw docs → deterministic observation statements (fixtures per provider/format).
+* **Signature policies:** valid/invalid PGP/cosign/x509 samples; ensure rejects are recorded but not accepted.
+* **Normalization edge cases:** platform-scoped statements, free-text justifications, non-purl products.
+* **Linksets:** conflict scenarios across tiers; verify confidence scoring + conflict payload stability.
+* **Consensus (optional):** ensure tie-breakers honour policy weights/justification gates.
+* **Performance:** 1M-row observation/linkset export timing; memory ceilings; stream correctness.
+* **Determinism:** same inputs + policy → identical linkset hashes, conflict payloads, optional `consensusDigest`, and export bytes.
+* **API contract tests:** pagination, filters, RBAC, rate limits.
+
+---
+
+## 15) Integration points
+
+* **Backend Policy Engine** (in Scanner.WebService): calls `POST /excititor/resolve` (scope `vex.read`) with batched `(purl, vulnId)` pairs to fetch `rollupStatus + sources`.
+* **Concelier**: provides alias graph (CVE↔vendor IDs) and may supply VEX‑adjacent metadata (e.g., KEV flag) for policy escalation.
+* **UI**: VEX explorer screens use `/observations/search`, `/linksets/search`, and `/consensus/search`; show conflicts & provenance.
+* **CLI**: `stella vex linksets export --since 7d --out vex-linksets.json` (optionally `--include-consensus`) for audits and Offline Kit parity.
+
+---
+
+## 16) Failure modes & fallback
+
+* **Provider unreachable:** stale thresholds trigger warnings; policy can down‑weight stale providers automatically (freshness factor).
+* **Signature outage:** continue to ingest but mark `signatureState.verified=false`; consensus will likely exclude or down‑weight per policy.
+* **Schema drift:** unknown fields are preserved as `evidence`; normalization rejects only on **invalid identity** or **status**.
+
+---
+
+## 17) Rollout plan (incremental)
+
+1. **MVP**: OpenVEX + CSAF connectors for 3 major providers (e.g., Red Hat/SUSE/Ubuntu), normalization + consensus + `/excititor/resolve`.
+2. **Signature policies**: PGP for distros; cosign for OCI.
+3. **Exports + optional attestation**.
+4. **CycloneDX VEX** connectors; platform claim expansion tables; UI explorer.
+5. **Scale hardening**: export indexes; conflict analytics.
+
+---
+
+## 18) Operational runbooks
+
+* **Statement backfill** — see `docs/dev/EXCITITOR_STATEMENT_BACKFILL.md` for the CLI workflow, required permissions, observability guidance, and rollback steps.
+
+---
+
+## 19) Appendix — canonical JSON (stable ordering)
+
+All exports and consensus entries are serialized via `VexCanonicalJsonSerializer`:
+
+* UTF‑8 without BOM;
+* keys sorted (ASCII);
+* arrays sorted by `(providerId, vulnId, productKey, lastObserved)` unless semantic order mandated;
+* timestamps in `YYYY-MM-DDThh:mm:ssZ`;
+* no insignificant whitespace.
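
For readers without access to `VexCanonicalJsonSerializer`, the rules above can be approximated with the Python standard library. This is a sketch of the same idea, not a byte-compatible reimplementation: array ordering is left to the caller, and Python sorts keys by Unicode code point, which matches ASCII ordering only for ASCII keys.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def canonical_bytes(value):
    """Serialize with sorted keys, no insignificant whitespace, UTF-8 without BOM."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_digest(value):
    """Hash the canonical form so equal content always yields the same digest."""
    return "sha256:" + hashlib.sha256(canonical_bytes(value)).hexdigest()


def format_timestamp(moment):
    """Render an aware datetime as UTC in YYYY-MM-DDThh:mm:ssZ."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```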
+