Restructure solution layout by module
		| @@ -1,229 +1,229 @@ | ||||
| # VEX Observations & Linksets | ||||
|  | ||||
| > Imposed rule: Work of this type or tasks of this type on this component must | ||||
| > also be applied everywhere else it should be applied. | ||||
|  | ||||
| Link-Not-Merge brings the same immutable observation model to Excititor that | ||||
| Concelier now uses for advisories. VEX statements are stored as append-only | ||||
| observations; linksets correlate them, capture conflicts, and keep provenance so | ||||
| Policy Engine and UI surfaces can explain decisions without collapsing sources. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 1. Model overview | ||||
|  | ||||
| ### 1.1 Observation lifecycle | ||||
|  | ||||
| 1. **Ingest** – Connectors fetch OpenVEX, CSAF VEX, CycloneDX VEX, or VEX | ||||
|    attestations, validate signatures, and strip any derived consensus data | ||||
|    forbidden by the Aggregation-Only Contract (AOC). | ||||
| 2. **Persist** – Excititor writes immutable `vex_observations` keyed by tenant, | ||||
|    provider, upstream identifier, and `contentHash`. Supersedes chains record | ||||
|    revisions; the original payload is never mutated. | ||||
| 3. **Expose** – WebService will surface paginated observation APIs, and Offline | ||||
|    Kit snapshots will mirror the same data for air-gapped sites. | ||||
|  | ||||
| Observation schema sketch (final shape lands with `EXCITITOR-LNM-21-001`): | ||||
|  | ||||
| ```text | ||||
| observationId = {tenant}:{providerId}:{upstreamId}:{revision} | ||||
| tenant, providerId, streamId | ||||
| upstream{ upstreamId, documentVersion, fetchedAt, receivedAt, | ||||
|           contentHash, signature{present, format?, keyId?, signature?} } | ||||
| content{ format, specVersion, raw } | ||||
| statements[ | ||||
|   { vulnerabilityId, productKey, status, justification?, | ||||
|     introducedVersion?, fixedVersion?, locator } | ||||
| ] | ||||
| linkset{ purls[], cpes[], aliases[], references[], | ||||
|          reconciledFrom[], conflicts[]? } | ||||
| attributes{ batchId?, replayCursor? } | ||||
| createdAt | ||||
| ``` | ||||
|  | ||||
| - **Raw payload** (`content.raw`) remains lossless (Relaxed Extended JSON). | ||||
| - **Statements** provide normalized tuples for each claim contained in the | ||||
|   document, including justification and version hints. | ||||
| - **Linkset** mirrors identifiers extracted during ingestion, retaining JSON | ||||
|   pointer metadata so audits can trace back to the source fragment. | ||||
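
The persist step above can be sketched as an in-memory, append-only store. This is a minimal illustration, not the Excititor implementation: the `ObservationStore` class, its `append` method, and the choice to start revisions at 1 are assumptions made for the example; only the key fields and the `observationId` layout come from the schema sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    observation_id: str
    tenant: str
    provider_id: str
    upstream_id: str
    content_hash: str
    revision: int
    supersedes: Optional[str]  # observationId of the previous revision, if any


class ObservationStore:
    """Hypothetical in-memory stand-in for the vex_observations collection."""

    def __init__(self):
        self._by_key = {}   # (tenant, provider, upstreamId, contentHash) -> Observation
        self._latest = {}   # (tenant, provider, upstreamId) -> newest Observation

    def append(self, tenant, provider_id, upstream_id, content_hash):
        key = (tenant, provider_id, upstream_id, content_hash)
        existing = self._by_key.get(key)
        if existing is not None:
            # Same payload seen before: the write is a no-op.
            return existing

        stream = (tenant, provider_id, upstream_id)
        previous = self._latest.get(stream)
        # Assumption: revisions are numbered from 1 per upstream document.
        revision = 1 if previous is None else previous.revision + 1
        observation = Observation(
            observation_id=f"{tenant}:{provider_id}:{upstream_id}:{revision}",
            tenant=tenant,
            provider_id=provider_id,
            upstream_id=upstream_id,
            content_hash=content_hash,
            revision=revision,
            supersedes=None if previous is None else previous.observation_id,
        )
        # Earlier revisions stay in _by_key untouched; only the pointer moves.
        self._by_key[key] = observation
        self._latest[stream] = observation
        return observation
```

Calling `append` twice with the same content hash returns the stored record unchanged, while a new hash for the same upstream document yields a new revision whose `supersedes` field points at the previous one.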
|  | ||||
| ### 1.2 Linkset lifecycle | ||||
|  | ||||
| Linksets correlate claims referring to the same `(vulnerabilityId, productKey)` | ||||
| pair across providers. | ||||
|  | ||||
| 1. **Seed** – Observations push normalized identifiers (CVE, GHSA, vendor IDs) | ||||
|    plus canonical product keys (purl preferred, cpe fallback). Platform-scoped | ||||
|    statements remain marked `non_joinable`. | ||||
| 2. **Correlate** – The linkset builder groups statements by tenant and identity, | ||||
|    combines alias graphs from Concelier, and uses justification/product overlap | ||||
|    to assign correlation confidence. | ||||
| 3. **Annotate** – Conflicts (status disagreement, justification mismatch, range | ||||
|    inconsistencies) are recorded as structured entries. | ||||
| 4. **Persist** – Results land in `vex_linksets` with deterministic IDs (hash of | ||||
|    sorted `(vulnerabilityId, productKey, observationIds)`) and append-only | ||||
|    history for replay/debugging. | ||||
|  | ||||
| Linksets never override statements or invent consensus; they simply align | ||||
| evidence for Policy Engine and consumers. | ||||
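
A rough sketch of the correlate and persist steps, under stated assumptions: the document only says the ID is a hash of the sorted `(vulnerabilityId, productKey, observationIds)` tuple, so the use of SHA-256, the compact JSON encoding, and the function names `group_statements` and `linkset_id` are illustrative choices rather than the shipped contract. The `tenant`, `observationId`, and `nonJoinable` statement fields are likewise hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def group_statements(statements):
    """Group joinable statements by (tenant, vulnerabilityId, productKey)."""
    groups = defaultdict(list)
    for statement in statements:
        if statement.get("nonJoinable"):
            # Platform-scoped statements are kept out of the join.
            continue
        key = (
            statement["tenant"],
            statement["vulnerabilityId"],
            statement["productKey"],
        )
        groups[key].append(statement)
    return groups


def linkset_id(vulnerability_id, product_key, observation_ids):
    """Derive a stable ID that does not depend on input order."""
    canonical = json.dumps(
        {
            "vulnerabilityId": vulnerability_id,
            "productKey": product_key,
            # Sorting and de-duplicating makes the hash order-independent.
            "observationIds": sorted(set(observation_ids)),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the observation IDs are sorted before hashing, replaying the same observations in a different arrival order produces the same linkset ID.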
|  | ||||
| --- | ||||
|  | ||||
| ## 2. Observation vs. linkset | ||||
|  | ||||
| - **Purpose** | ||||
|   - Observation: Immutable record of a single upstream VEX document. | ||||
|   - Linkset: Correlated evidence spanning observations that describe the same | ||||
|     product-vulnerability pair. | ||||
| - **Mutation** | ||||
|   - Observation: Append-only via supersedes. | ||||
|   - Linkset: Regenerated deterministically by correlation jobs. | ||||
| - **Allowed fields** | ||||
|   - Observation: Raw payload, provenance, normalized statement tuples, join | ||||
|     hints. | ||||
|   - Linkset: Observation references, statement IDs, confidence metrics, conflict | ||||
|     annotations. | ||||
| - **Forbidden fields** | ||||
|   - Observation: Derived consensus, suppression flags, risk scores. | ||||
|   - Linkset: Derived severity or policy decisions (only evidence + conflicts). | ||||
| - **Consumers** | ||||
|   - Observation: Evidence exports, Offline Kit mirrors, CLI raw dumps. | ||||
|   - Linkset: Policy Engine VEX overlay, Console evidence panes, Vuln Explorer. | ||||
|  | ||||
| ### 2.1 Example sequence | ||||
|  | ||||
| 1. The canonical vendor (Red Hat in this example) issues an attested OpenVEX | ||||
|    document declaring `CVE-2025-2222` as `not_affected` for | ||||
|    `pkg:rpm/redhat/openssl@1.1.1w-12`. Excititor inserts a new observation | ||||
|    referencing that statement. | ||||
| 2. An upstream CycloneDX VEX document from a distro reports the same product as | ||||
|    `affected` with an `under_investigation` justification. | ||||
| 3. The linkset builder groups both statements by alias overlap and product key, | ||||
|    setting confidence `high` because the CVE and purl match. | ||||
| 4. The conflict annotation records `status-mismatch` and retains both | ||||
|    justifications; Policy Engine uses this to explain why suppression cannot | ||||
|    proceed without a policy override. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 3. Conflict handling | ||||
|  | ||||
| Structured conflicts capture disagreements without mutating source statements. | ||||
|  | ||||
| ```json | ||||
| { | ||||
|   "type": "status-mismatch", | ||||
|   "vulnerabilityId": "CVE-2025-2222", | ||||
|   "productKey": "pkg:rpm/redhat/openssl@1.1.1w-12", | ||||
|   "statements": [ | ||||
|     { | ||||
|       "observationId": "tenant:redhat:openvex:3", | ||||
|       "providerId": "redhat", | ||||
|       "status": "not_affected", | ||||
|       "justification": "component_not_present" | ||||
|     }, | ||||
|     { | ||||
|       "observationId": "tenant:ubuntu:cyclonedx:12", | ||||
|       "providerId": "ubuntu", | ||||
|       "status": "affected", | ||||
|       "justification": "under_investigation" | ||||
|     } | ||||
|   ], | ||||
|   "confidence": "medium", | ||||
|   "detectedAt": "2025-10-27T14:30:00Z" | ||||
| } | ||||
| ``` | ||||
|  | ||||
| Conflict classes (tracked via `EXCITITOR-LNM-21-003`): | ||||
|  | ||||
| - `status-mismatch` – Different statuses for the same pair (affected vs | ||||
|   not_affected vs fixed vs under_investigation). | ||||
| - `justification-divergence` – Same status but incompatible justifications or | ||||
|   missing justification where policy requires it. | ||||
| - `version-range-clash` – Introduced/fixed ranges contradict each other. | ||||
| - `non-joinable-overlap` – Platform-scoped statements collide with package | ||||
|   statements; flagged as warning but retained. | ||||
| - `metadata-gap` – Missing provenance/signature field on specific statements. | ||||
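
As a simplified illustration of the first two classes, the sketch below labels a group of statements that already share one `(vulnerabilityId, productKey)` pair. The function name `classify_conflicts` is hypothetical, and treating any two different justifications as incompatible is an assumption made for brevity; the real compatibility rules are tracked in `EXCITITOR-LNM-21-003`.

```python
def classify_conflicts(statements):
    """Return conflict type names for statements about one product-vulnerability pair."""
    conflicts = []

    statuses = {statement["status"] for statement in statements}
    if len(statuses) > 1:
        # Providers disagree on the status itself.
        conflicts.append("status-mismatch")
    else:
        # Same status everywhere: compare justifications (None means missing).
        justifications = {statement.get("justification") for statement in statements}
        if len(justifications) > 1:
            conflicts.append("justification-divergence")

    return conflicts


example = [
    {"providerId": "redhat", "status": "not_affected",
     "justification": "component_not_present"},
    {"providerId": "ubuntu", "status": "affected",
     "justification": "under_investigation"},
]
print(classify_conflicts(example))
```

The source statements are only read, never modified, which matches the rule that conflicts are annotations on top of the evidence.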
|  | ||||
| Conflicts surface through: | ||||
|  | ||||
| - `/vex/linksets/{id}` APIs (`conflicts[]` payload). | ||||
| - Console evidence panels (badges + drawer detail). | ||||
| - CLI exports (`stella vex linkset …` planned in `CLI-LNM-22-002`). | ||||
| - Metrics dashboards (`vex_linkset_conflicts_total{type}`). | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 4. AOC alignment | ||||
|  | ||||
| - **Raw-first** – `content.raw` and `statements[]` mirror upstream input; no | ||||
|   derived consensus or suppression values are written by ingestion. | ||||
| - **No merges** – Each upstream statement persists independently; linksets refer | ||||
|   back via `observationId`. | ||||
| - **Provenance mandatory** – Missing signature or source metadata yields | ||||
|   `ERR_AOC_004`; ingestion blocks until connectors fix the feed. | ||||
| - **Idempotent writes** – Duplicate `(providerId, upstreamId, contentHash)` | ||||
|   results in a no-op; revisions append with a `supersedes` pointer. | ||||
| - **Deterministic output** – Correlator sorts identifiers, normalizes timestamps | ||||
|   (UTC ISO-8601), and hashes canonical JSON to generate stable linkset IDs. | ||||
| - **Scope-aware** – Tenant claims enforced on write/read; Authority scopes | ||||
|   `vex:ingest` / `vex:read` are required (see `AUTH-AOC-22-001`). | ||||
|  | ||||
| Violations raise `ERR_AOC_00x`, emit `aoc_violation_total`, and prevent the data | ||||
| from landing downstream. | ||||
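
The provenance guardrail could look roughly like the following. This is a sketch under assumptions: the `AocViolation` exception, the `check_provenance` function, and the exact set of required fields are invented for illustration; only the `ERR_AOC_004` code and the `upstream` field names come from this document.

```python
class AocViolation(Exception):
    """Hypothetical exception carrying an AOC error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_provenance(observation):
    """Reject observations whose upstream provenance block is incomplete."""
    upstream = observation.get("upstream") or {}
    signature = upstream.get("signature")

    missing = [
        name for name in ("upstreamId", "contentHash") if not upstream.get(name)
    ]
    # Assumption: the signature block must exist and state whether a
    # signature is present, even when the document is unsigned.
    if not isinstance(signature, dict) or "present" not in signature:
        missing.append("signature")

    if missing:
        raise AocViolation(
            "ERR_AOC_004", "missing provenance: " + ", ".join(missing)
        )
```

A caller would run this check before the write, so a failing document never reaches the observation collection.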
|  | ||||
| --- | ||||
|  | ||||
| ## 5. Downstream consumption | ||||
|  | ||||
| - **Policy Engine** – Evaluates VEX evidence alongside advisory linksets to gate | ||||
|   suppression and severity downgrades, and to explain the resulting decisions. | ||||
| - **Console UI** – Evidence panel renders VEX statements grouped by provider and | ||||
|   highlights conflicts or missing signatures. | ||||
| - **CLI** – Planned commands export observations/linksets for offline analysis | ||||
|   (`CLI-LNM-22-002`). | ||||
| - **Offline Kit** – Bundled snapshots keep VEX data aligned with advisory | ||||
|   observations for air-gapped parity. | ||||
| - **Observability** – Dashboards track ingestion latency, conflict counts, and | ||||
|   supersedes depth per provider. | ||||
|  | ||||
| New consumers must treat both collections as read-only and preserve deterministic | ||||
| ordering when caching. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 6. Validation & testing | ||||
|  | ||||
| - **Unit tests** (`StellaOps.Excititor.Core.Tests`) to cover schema guards, | ||||
|   deterministic linkset hashing, conflict classification, and supersedes | ||||
|   behaviour. | ||||
| - **Mongo integration tests** (`StellaOps.Excititor.Storage.Mongo.Tests`) to | ||||
|   verify indexes, shard keys, and idempotent writes across tenants. | ||||
| - **CLI smoke suites** (`stella vex observations`, `stella vex linksets`) for | ||||
|   JSON determinism and exit code coverage. | ||||
| - **Replay determinism** – Feed identical upstream payloads twice and ensure | ||||
|   observation/linkset hashes match across runs. | ||||
| - **Offline kit verification** – Validate VEX exports packaged in Offline Kit | ||||
|   snapshots against live service outputs. | ||||
| - **Fixture refresh** – Samples (`SAMPLES-LNM-22-002`) must include multi-source | ||||
|   conflicts and justification variants used by docs and UI tests. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 7. Reviewer checklist | ||||
|  | ||||
| - Observation schema aligns with `EXCITITOR-LNM-21-001` once the schema lands; | ||||
|   update references as soon as the final contract is published. | ||||
| - Linkset lifecycle covers correlation signals (alias graphs, product keys, | ||||
|   justification rules) and deterministic ID strategy. | ||||
| - Conflict classes include status, justification, version range, and platform | ||||
|   overlap scenarios. | ||||
| - AOC guardrails called out with relevant error codes and Authority scopes. | ||||
| - Downstream consumer list matches active APIs/CLI features (update when | ||||
|   `CLI-LNM-22-002` and WebService endpoints ship). | ||||
| - Validation section references Core, Storage, CLI, and Offline test suites plus | ||||
|   fixture requirements. | ||||
| - Imposed rule reminder retained at top. | ||||
|  | ||||
| Dependencies outstanding (2025-10-27): `EXCITITOR-LNM-21-001..005` and | ||||
| `EXCITITOR-LNM-21-101..102` are still TODO; revisit this document once schemas, | ||||
| APIs, and fixtures are implemented. | ||||