# Advisory Observations & Linksets

> Imposed rule: Work of this type or tasks of this type on this component must also
> be applied everywhere else it should be applied.

The Link-Not-Merge (LNM) initiative replaces the legacy "merge" pipeline with
immutable observations and correlation linksets. This guide explains how
Concelier ingests advisory statements, preserves upstream truth, and produces
linksets that downstream services (Policy Engine, Vuln Explorer, Console) can
use without collapsing sources together.

---

## 1. Model overview

### 1.1 Observation lifecycle

1. **Ingest** – Connectors fetch upstream payloads (CSAF, OSV, vendor feeds),
   validate signatures, and drop any derived fields prohibited by the
   Aggregation-Only Contract (AOC).
2. **Persist** – Concelier writes immutable `advisory_observations` scoped by
   `tenant`, `(source.vendor, upstreamId)`, and `contentHash`. Supersedes chains
   capture revisions without mutating history.
3. **Expose** – WebService surfaces paged/read APIs; Offline Kit snapshots
   include the same documents for air-gapped installs.

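
The persist step can be pictured with a small in-memory sketch. It is not Concelier's storage code: the `ObservationStore` class, the `supersedes` field name, SHA-256 as the `contentHash` algorithm, and a per-chain counter as `revision` are all assumptions made for illustration. The ID follows the `{tenant}:{source.vendor}:{upstreamId}:{revision}` layout from the observation schema.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a stored observation is never mutated
class Observation:
    observation_id: str
    content_hash: str
    raw: bytes
    supersedes: Optional[str]  # hypothetical field: ID of the previous revision


class ObservationStore:
    """Hypothetical in-memory stand-in for the `advisory_observations` collection."""

    def __init__(self) -> None:
        # One append-only chain per (tenant, source.vendor, upstreamId).
        self._chains: dict[tuple[str, str, str], list[Observation]] = {}

    def persist(self, tenant: str, vendor: str, upstream_id: str, raw: bytes) -> Observation:
        content_hash = "sha256:" + hashlib.sha256(raw).hexdigest()  # assumed algorithm
        chain = self._chains.setdefault((tenant, vendor, upstream_id), [])
        for existing in chain:
            if existing.content_hash == content_hash:
                return existing  # duplicate contentHash: no-op
        revision = len(chain) + 1  # assumed: revision is a per-chain counter
        observation = Observation(
            observation_id=f"{tenant}:{vendor}:{upstream_id}:{revision}",
            content_hash=content_hash,
            raw=raw,
            supersedes=chain[-1].observation_id if chain else None,
        )
        chain.append(observation)  # earlier revisions stay untouched
        return observation
```
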
Observation schema highlights:

```text
observationId = {tenant}:{source.vendor}:{upstreamId}:{revision}
tenant, source{vendor, stream, api, collectorVersion}
upstream{upstreamId, documentVersion, fetchedAt, receivedAt,
         contentHash, signature{present, format, keyId, signature}}
content{format, specVersion, raw}
identifiers{cve?, ghsa?, aliases[], osvIds[]}
linkset{purls[], cpes[], aliases[], references[], conflicts[]?}
createdAt, attributes{batchId?, replayCursor?}
```

- **Immutable raw** (`content.raw`) mirrors upstream payloads exactly.
- **Provenance** (`source.*`, `upstream.*`) satisfies AOC guardrails and enables
  cryptographic attestations.
- **Identifiers** retain lossless extracts (CVE, GHSA, vendor aliases) that seed
  linksets.
- **Linkset** captures join hints but never merges or adds derived severity.

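
To make the split between `content.raw`, `identifiers`, and `linkset` concrete, here is a rough sketch that reads join hints out of an OSV payload. The function name `extract_join_hints` is hypothetical, and picking the first sorted CVE or GHSA for the singular `cve?`/`ghsa?` slots is an assumption; only the OSV field names (`id`, `aliases`, `affected[].package.purl`, `references[].url`) come from the OSV format itself.

```python
import json


def extract_join_hints(raw: bytes) -> dict:
    """Read identifiers from an OSV payload; `raw` itself is left untouched."""
    doc = json.loads(raw)
    ids = sorted({doc["id"], *doc.get("aliases", [])})
    purls = {
        affected["package"]["purl"]
        for affected in doc.get("affected", [])
        if "purl" in affected.get("package", {})
    }
    references = {ref["url"] for ref in doc.get("references", []) if "url" in ref}
    cves = [i for i in ids if i.startswith("CVE-")]
    ghsas = [i for i in ids if i.startswith("GHSA-")]
    return {
        "identifiers": {
            "cve": cves[0] if cves else None,    # assumed tie-break: first sorted
            "ghsa": ghsas[0] if ghsas else None,
            "aliases": ids,
        },
        # Join hints only: no severity or other derived field is copied here.
        "linkset": {
            "purls": sorted(purls),
            "aliases": ids,
            "references": sorted(references),
        },
    }
```

Any upstream severity stays inside `content.raw`; the sketch deliberately copies none of it into the hints.
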
### 1.2 Linkset lifecycle

Linksets correlate observations that describe the same vulnerable product while
keeping each source intact.

1. **Seed** – Observations emit normalized identifiers (`purl`, `cpe`,
   `alias`) during ingestion.
2. **Correlate** – Linkset builder groups observations by tenant, product
   coordinates, and equivalence signals (PURL alias graph, CVE overlap, CVSS
   vector equality, fuzzy titles).
3. **Annotate** – Detected conflicts (severity disagreements, affected-range
   mismatch, incompatible references) are recorded with structured payloads and
   preserved for UI/API export.
4. **Persist** – Results land in `advisory_linksets` with deterministic IDs
   (`linksetId = {tenant}:{hash(aliases+purls+seedIds)}`) and append-only history
   for reproducibility.

Linksets never suppress or prefer one source; they provide aligned evidence so
other services can apply policy.

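
The deterministic ID from the persist step, `{tenant}:{hash(aliases+purls+seedIds)}`, could be computed along the following lines. This is a sketch under stated assumptions: the guide does not name the hash function or the encoding, so SHA-256 over a sorted, de-duplicated JSON rendering is a stand-in, and `linkset_id` is a hypothetical helper name. The point it shows is that input order and duplicates must not change the result.

```python
import hashlib
import json
from typing import Iterable


def linkset_id(tenant: str, aliases: Iterable[str], purls: Iterable[str],
               seed_ids: Iterable[str]) -> str:
    """Derive an order-independent ID from the linkset's identifying inputs."""
    canonical = json.dumps(
        {
            "aliases": sorted(set(aliases)),
            "purls": sorted(set(purls)),
            "seedIds": sorted(set(seed_ids)),
        },
        sort_keys=True,
        separators=(",", ":"),  # no insignificant whitespace
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()  # assumed hash
    return f"{tenant}:{digest}"
```
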
---

## 2. Observation vs. linkset

- **Purpose**
  - Observation: Immutable record per vendor and revision.
  - Linkset: Correlates observations that share product identity.
- **Mutation**
  - Observation: Append-only via supersedes chain.
  - Linkset: Rebuilt deterministically from canonical signals.
- **Allowed fields**
  - Observation: Raw payload, provenance, identifiers, join hints.
  - Linkset: Observation references, normalized product metadata, conflicts.
- **Forbidden fields**
  - Observation: Derived severity, policy status, opinionated dedupe.
  - Linkset: Derived severity (conflicts recorded but unresolved).
- **Consumers**
  - Observation: Evidence API, Offline Kit, CLI exports.
  - Linkset: Policy Engine overlay, UI evidence panel, Vuln Explorer.

### 2.1 Example sequence

1. Red Hat PSIRT publishes RHSA-2025:1234 for OpenSSL; Concelier inserts an
   observation for vendor `redhat` with `pkg:rpm/redhat/openssl@1.1.1w-12`.
2. NVD issues CVE-2025-0001; a second observation is inserted for vendor `nvd`.
3. Linkset builder runs, groups the two observations, records alias and PURL
   overlap, and flags a CVSS disagreement (`7.5` vs `7.2`).
4. Policy Engine reads the linkset, recognises the severity variance, and relies
   on configured rules to decide the effective output.

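
Step 3 of the sequence can be approximated with a union-find over shared identifiers. This sketch uses only two of the signals named earlier (alias overlap and PURL overlap, scoped by tenant) and ignores the rest; the dictionary shape and the `correlate` name are hypothetical, and the sample data assumes the Red Hat advisory lists CVE-2025-0001 as an alias.

```python
from collections import defaultdict


def correlate(observations: list[dict]) -> list[list[str]]:
    """Group observation IDs that share a tenant plus at least one alias or PURL."""
    parent = {obs["observationId"]: obs["observationId"] for obs in observations}

    def find(node: str) -> str:
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen: dict[tuple[str, str, str], str] = {}
    for obs in observations:
        keys = [("alias", a) for a in obs["aliases"]] + [("purl", p) for p in obs["purls"]]
        for kind, value in keys:
            key = (obs["tenant"], kind, value)
            if key in first_seen:
                union(first_seen[key], obs["observationId"])
            else:
                first_seen[key] = obs["observationId"]

    groups: dict[str, list[str]] = defaultdict(list)
    for obs in observations:
        groups[find(obs["observationId"])].append(obs["observationId"])
    return sorted(sorted(members) for members in groups.values())


sample = [
    {"observationId": "acme:redhat:RHSA-2025:1234:1", "tenant": "acme",
     "aliases": ["RHSA-2025:1234", "CVE-2025-0001"],
     "purls": ["pkg:rpm/redhat/openssl@1.1.1w-12"]},
    {"observationId": "acme:nvd:CVE-2025-0001:1", "tenant": "acme",
     "aliases": ["CVE-2025-0001"], "purls": []},
]
groups = correlate(sample)  # both observations end up in one group
```
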
---

## 3. Conflict handling

Conflicts record disagreements without altering source payloads. The builder
emits structured entries:

```json
{
  "type": "severity-mismatch",
  "field": "cvss.baseScore",
  "observations": [
    {
      "source": "redhat",
      "value": "7.5",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"
    },
    {
      "source": "nvd",
      "value": "7.2",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"
    }
  ],
  "confidence": "medium",
  "detectedAt": "2025-10-27T14:00:00Z"
}
```

Supported conflict classes:

- `severity-mismatch` – CVSS or qualitative severities differ.
- `affected-range-divergence` – Product ranges, fixed versions, or platforms
  disagree.
- `statement-disagreement` – One observation declares `not_affected` while
  another states `affected`.
- `reference-clash` – URL or classifier collisions (for example, exploit URL vs
  conflicting advisory).
- `alias-inconsistency` – Aliases map to different canonical IDs (GHSA vs CVE).
- `metadata-gap` – Required provenance missing on one source; logged as a
  warning.

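
A detector for the `severity-mismatch` class might look roughly like the sketch below, which produces an entry shaped like the JSON example above. The input dictionary keys (`source`, `baseScore`, `vector`) and the function name are hypothetical, and `confidence` is hard-coded because this guide does not say how it is derived. Entries keep the caller's order, so a real builder would need to supply a deterministic ordering.

```python
from datetime import datetime, timezone
from typing import Optional


def detect_severity_mismatch(observations: list[dict],
                             detected_at: Optional[datetime] = None) -> Optional[dict]:
    """Return a `severity-mismatch` entry when base scores differ, else None."""
    scored = [o for o in observations if o.get("baseScore") is not None]
    if len({o["baseScore"] for o in scored}) < 2:
        return None  # zero or one distinct score: nothing to report
    # `detected_at` is expected to be timezone-aware; default to the current time.
    detected_at = detected_at or datetime.now(timezone.utc)
    return {
        "type": "severity-mismatch",
        "field": "cvss.baseScore",
        # Source values are copied verbatim; nothing is averaged or preferred.
        "observations": [
            {"source": o["source"], "value": o["baseScore"], "vector": o["vector"]}
            for o in scored
        ],
        "confidence": "medium",  # placeholder: derivation not described here
        "detectedAt": detected_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```
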
Conflict surfaces:

- WebService endpoints (`GET /advisories/linksets/{id}` → `conflicts[]`).
- UI evidence panel chips and conflict badges.
- CLI exports (JSON/OSV) exposed through LNM commands.
- Observability metrics (`advisory_linkset_conflicts_total{type}`).

---

## 4. AOC alignment

Observations and linksets must satisfy Aggregation-Only Contract invariants:

- **No derived severity** – `content.raw` may include upstream severity, but the
  observation body never injects or edits severity.
- **No merges** – Each upstream document stays separate; linksets reference
  observations via deterministic IDs.
- **Provenance mandatory** – Missing `signature` or `source` metadata is an AOC
  violation (`ERR_AOC_004`).
- **Idempotent writes** – Duplicate `contentHash` yields a no-op; supersedes
  pointer captures new revisions.
- **Deterministic output** – Linkset builder sorts keys, normalizes timestamps
  (UTC ISO-8601), and uses canonical JSON hashing.

Violations trigger guard errors (`ERR_AOC_00x`), emit `aoc_violation_total`
metrics, and block persistence until corrected.

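
As an illustration of how such a guard blocks persistence, the sketch below raises before anything is written. Only `ERR_AOC_004` is tied to a specific invariant in this guide, so the derived-field branch reuses the `ERR_AOC_00x` family placeholder rather than guessing a number. The `AocViolation` class, the `guard_observation` function, and the deny-listed field names are hypothetical.

```python
class AocViolation(Exception):
    """Raised when a document breaks an Aggregation-Only Contract invariant."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


# Hypothetical field names; the real guard's deny-list is not given in this guide.
DERIVED_FIELDS = frozenset({"severity", "effectiveSeverity", "policyStatus"})


def guard_observation(observation: dict) -> None:
    """Raise AocViolation instead of letting an invalid observation be persisted."""
    source = observation.get("source") or {}
    upstream = observation.get("upstream") or {}
    # Provenance mandatory: both the source block and the signature block must exist.
    if not source or "signature" not in upstream:
        raise AocViolation("ERR_AOC_004", "source or signature metadata is missing")
    # No derived severity: reject top-level fields the observation body must not carry.
    derived = sorted(DERIVED_FIELDS.intersection(observation))
    if derived:
        # The guide does not name the code for this case, so the family
        # placeholder is used as-is.
        raise AocViolation("ERR_AOC_00x", f"derived fields present: {derived}")
```
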
---

## 5. Downstream consumption

- **Policy Engine** – Computes effective severity and risk overlays from linkset
  evidence and conflicts.
- **Console UI** – Renders per-source statements, signed hashes, and conflict
  banners inside the evidence panel.
- **CLI (`stella advisories linkset …`)** – Exports observations and linksets as
  JSON or OSV for offline triage.
- **Offline Kit** – Shipping snapshots include observation and linkset
  collections for air-gap parity.
- **Observability** – Dashboards track ingestion latency, conflict counts, and
  supersedes depth.

When adding new consumers, ensure they honour append-only semantics and do not
mutate observation or linkset collections.

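
A consumer that respects these semantics reads the linkset and writes its decision somewhere else. The sketch below is a purely hypothetical rule, not the Policy Engine's actual logic: it resolves a `severity-mismatch` by a caller-supplied source precedence and returns a separate overlay document. The `effective_score` name and the overlay keys are made up for illustration.

```python
from typing import Optional, Sequence


def effective_score(linkset: dict, source_precedence: Sequence[str]) -> Optional[dict]:
    """Pick a score for a severity conflict by source precedence; never edits `linkset`."""
    for conflict in linkset.get("conflicts", []):
        if conflict["type"] != "severity-mismatch":
            continue
        by_source = {entry["source"]: entry["value"] for entry in conflict["observations"]}
        for source in source_precedence:
            if source in by_source:
                # The decision lives in a new overlay document, not in the linkset.
                return {
                    "linksetId": linkset["linksetId"],
                    "effectiveScore": by_source[source],
                    "chosenSource": source,
                }
    return None  # no severity conflict, or no preferred source took part in it
```
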
---

## 6. Validation & testing

- **Unit tests** (`StellaOps.Concelier.Core.Tests`) validate schema guards,
  deterministic linkset hashing, conflict detection fixtures, and supersedes
  chains.
- **Mongo integration tests** (`StellaOps.Concelier.Storage.Mongo.Tests`) verify
  indexes and idempotent writes under concurrency.
- **CLI smoke suites** confirm `stella advisories observations` and
  `stella advisories linksets` export stable JSON.
- **Determinism checks** replay identical upstream payloads and assert that the
  resulting observation and linkset documents match byte for byte.
- **Offline kit verification** simulates air-gapped bootstrap to confirm that
  snapshots align with live data.

Add fixtures whenever a new conflict type or correlation signal is introduced.
Ensure canonical JSON serialization remains stable across .NET runtime updates.

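
The idea behind the determinism checks can be shown in a few lines. The actual suites live in the .NET test projects listed above; this Python sketch only illustrates the replay-and-compare pattern. Sorted keys with fixed separators stand in for whatever canonical JSON form the project uses, and `canonical_json` and `replay_digest` are hypothetical helper names.

```python
import hashlib
import json
from typing import Callable


def canonical_json(document: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal documents give equal bytes."""
    text = json.dumps(document, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def replay_digest(build: Callable[[bytes], dict], payload: bytes, runs: int = 3) -> str:
    """Replay `payload` through `build` several times and demand identical bytes."""
    outputs = {canonical_json(build(payload)) for _ in range(runs)}
    if len(outputs) != 1:
        raise AssertionError(f"{len(outputs)} distinct outputs from identical input")
    # A stable digest can be stored as a fixture and compared across runtime updates.
    return hashlib.sha256(outputs.pop()).hexdigest()
```
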
---

## 7. Reviewer checklist

- Observation schema segment matches the latest `StellaOps.Concelier.Models`
  contract.
- Linkset lifecycle covers correlation signals, conflict classes, and
  deterministic IDs.
- AOC invariants are explicitly called out with violation codes.
- Examples include multi-source correlation plus conflict annotation.
- Downstream consumer guidance reflects active APIs and CLI features.
- Testing section lists required suites (Core, Storage, CLI, Offline).
- Imposed rule reminder is present at the top of the document.

Confirmed against Concelier Link-Not-Merge tasks:
`CONCELIER-LNM-21-001..005`, `CONCELIER-LNM-21-101..103`,
`CONCELIER-LNM-21-201..203`.