# Advisory Observations & Linksets

> Imposed rule: Work of this type or tasks of this type on this component must also
> be applied everywhere else it should be applied.

The Link-Not-Merge (LNM) initiative replaces the legacy "merge" pipeline with
immutable observations and correlation linksets. This guide explains how
Concelier ingests advisory statements, preserves upstream truth, and produces
linksets that downstream services (Policy Engine, Vuln Explorer, Console) can
use without collapsing sources together.

---

## 1. Model overview

### 1.1 Observation lifecycle

1. **Ingest** – Connectors fetch upstream payloads (CSAF, OSV, vendor feeds),
   validate signatures, and drop any derived fields prohibited by the
   Aggregation-Only Contract (AOC).
2. **Persist** – Concelier writes immutable `advisory_observations` scoped by
   `tenant`, `(source.vendor, upstreamId)`, and `contentHash`. Supersedes chains
   capture revisions without mutating history.
3. **Expose** – WebService surfaces paged/read APIs; Offline Kit snapshots
   include the same documents for air-gapped installs.

Observation schema highlights:

```text
observationId = {tenant}:{source.vendor}:{upstreamId}:{revision}
tenant, source{vendor, stream, api, collectorVersion}
upstream{upstreamId, documentVersion, fetchedAt, receivedAt,
         contentHash, signature{present, format, keyId, signature}}
content{format, specVersion, raw}
identifiers{cve?, ghsa?, aliases[], osvIds[]}
linkset{purls[], cpes[], aliases[], references[], conflicts[]?}
createdAt, attributes{batchId?, replayCursor?}
```

- **Immutable raw** (`content.raw`) mirrors upstream payloads exactly.
- **Provenance** (`source.*`, `upstream.*`) satisfies AOC guardrails and enables
  cryptographic attestations.
- **Identifiers** retain lossless extracts (CVE, GHSA, vendor aliases) that seed
  linksets.
- **Linkset** captures join hints but never merges or adds derived severity.

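The Persist step can be illustrated with a small in-memory sketch. It is written in Python for brevity and is not Concelier's .NET implementation; `ObservationStore`, the SHA-256 digest, and the integer revision counter are assumptions made for the example, since this guide does not specify the hash algorithm or the revision format.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Observation:
    observation_id: str
    content_hash: str
    revision: int
    supersedes: Optional[str]  # observationId of the revision this one replaces


class ObservationStore:
    """Hypothetical in-memory stand-in for the advisory_observations collection."""

    def __init__(self) -> None:
        self._chains: Dict[Tuple[str, str, str], List[Observation]] = {}

    def persist(self, tenant: str, vendor: str, upstream_id: str, raw: bytes) -> Observation:
        chain = self._chains.setdefault((tenant, vendor, upstream_id), [])
        # Hash the payload exactly as received; SHA-256 is an assumption.
        digest = "sha256:" + hashlib.sha256(raw).hexdigest()
        for existing in chain:
            if existing.content_hash == digest:
                return existing  # duplicate contentHash: no-op
        revision = len(chain) + 1  # assumed integer revision counter
        observation = Observation(
            observation_id=f"{tenant}:{vendor}:{upstream_id}:{revision}",
            content_hash=digest,
            revision=revision,
            supersedes=chain[-1].observation_id if chain else None,
        )
        chain.append(observation)  # history is appended, never rewritten
        return observation


store = ObservationStore()
first = store.persist("acme", "redhat", "RHSA-2025:1234", b'{"rev": 1}')
again = store.persist("acme", "redhat", "RHSA-2025:1234", b'{"rev": 1}')
second = store.persist("acme", "redhat", "RHSA-2025:1234", b'{"rev": 2}')
```

Here `again` is the same record as `first`, while `second` is a new revision whose `supersedes` field points back at `first`.
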
### 1.2 Linkset lifecycle

Linksets correlate observations that describe the same vulnerable product while
keeping each source intact.

1. **Seed** – Observations emit normalized identifiers (`purl`, `cpe`,
   `alias`) during ingestion.
2. **Correlate** – Linkset builder groups observations by tenant, product
   coordinates, and equivalence signals (PURL alias graph, CVE overlap, CVSS
   vector equality, fuzzy titles).
3. **Annotate** – Detected conflicts (severity disagreements, affected-range
   mismatch, incompatible references) are recorded with structured payloads and
   preserved for UI/API export.
4. **Persist** – Results land in `advisory_linksets` with deterministic IDs
   (`linksetId = {tenant}:{hash(aliases+purls+seedIds)}`) and append-only history
   for reproducibility.

Linksets never suppress or prefer one source; they provide aligned evidence so
other services can apply policy.

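One way to read the `linksetId` formula is sketched below in Python. The hash function (SHA-256), the canonical JSON encoding, and the de-duplicate-then-sort normalization are assumptions; this guide only states that the ID is a deterministic hash over aliases, PURLs, and seed IDs, and does not define what `seedIds` contains, so the sketch accepts any list of strings.

```python
import hashlib
import json
from typing import Iterable


def linkset_id(tenant: str, aliases: Iterable[str], purls: Iterable[str],
               seed_ids: Iterable[str]) -> str:
    """Derive an order-independent ID from the correlation inputs."""
    canonical = json.dumps(
        {
            # De-duplicate and sort so input order never changes the hash.
            "aliases": sorted(set(aliases)),
            "purls": sorted(set(purls)),
            "seedIds": sorted(set(seed_ids)),
        },
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tenant}:{digest}"


id_a = linkset_id(
    "acme",
    ["CVE-2025-0001", "RHSA-2025:1234"],
    ["pkg:rpm/redhat/openssl@1.1.1w-12"],
    ["seed-1", "seed-2"],
)
id_b = linkset_id(
    "acme",
    ["RHSA-2025:1234", "CVE-2025-0001", "CVE-2025-0001"],
    ["pkg:rpm/redhat/openssl@1.1.1w-12"],
    ["seed-2", "seed-1"],
)
```

Because the inputs are normalized before hashing, `id_a` and `id_b` are equal even though the lists arrive in a different order and with a duplicate alias.
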

---

## 2. Observation vs. linkset

- **Purpose**
  - Observation: Immutable record per vendor and revision.
  - Linkset: Correlates observations that share product identity.
- **Mutation**
  - Observation: Append-only via supersedes chain.
  - Linkset: Rebuilt deterministically from canonical signals.
- **Allowed fields**
  - Observation: Raw payload, provenance, identifiers, join hints.
  - Linkset: Observation references, normalized product metadata, conflicts.
- **Forbidden fields**
  - Observation: Derived severity, policy status, opinionated dedupe.
  - Linkset: Derived severity (conflicts recorded but unresolved).
- **Consumers**
  - Observation: Evidence API, Offline Kit, CLI exports.
  - Linkset: Policy Engine overlay, UI evidence panel, Vuln Explorer.

### 2.1 Example sequence

1. Red Hat PSIRT publishes RHSA-2025:1234 for OpenSSL; Concelier inserts an
   observation for vendor `redhat` with `pkg:rpm/redhat/openssl@1.1.1w-12`.
2. NVD issues CVE-2025-0001; a second observation is inserted for vendor `nvd`.
3. Linkset builder runs, groups the two observations, records alias and PURL
   overlap, and flags a CVSS disagreement (`7.5` vs `7.2`).
4. Policy Engine reads the linkset, recognises the severity variance, and relies
   on configured rules to decide the effective output.

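The grouping in step 3 can be sketched as a union-find over shared signals. This is a simplified Python illustration, not the actual builder: it uses only alias and PURL overlap (no CVSS vector equality or fuzzy titles), the dictionary shape is hypothetical, and it assumes the Red Hat observation lists CVE-2025-0001 among its aliases.

```python
from collections import defaultdict


def correlate(observations):
    """Group observation IDs that share an alias or PURL within one tenant."""
    parent = {o["observationId"]: o["observationId"] for o in observations}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root to the smaller one for stable output.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}
    for o in observations:
        signals = [("alias", a) for a in o["aliases"]] + [("purl", p) for p in o["purls"]]
        for signal in signals:
            key = (o["tenant"], *signal)  # the tenant is part of every join key
            if key in first_seen:
                union(first_seen[key], o["observationId"])
            else:
                first_seen[key] = o["observationId"]

    groups = defaultdict(list)
    for observation_id in parent:
        groups[find(observation_id)].append(observation_id)
    return sorted(sorted(group) for group in groups.values())


sample = [
    {"observationId": "acme:redhat:RHSA-2025:1234:1", "tenant": "acme",
     "aliases": ["RHSA-2025:1234", "CVE-2025-0001"],  # assumed alias list
     "purls": ["pkg:rpm/redhat/openssl@1.1.1w-12"]},
    {"observationId": "acme:nvd:CVE-2025-0001:1", "tenant": "acme",
     "aliases": ["CVE-2025-0001"], "purls": []},
    {"observationId": "acme:ghsa:GHSA-aaaa-bbbb-cccc:1", "tenant": "acme",
     "aliases": ["GHSA-aaaa-bbbb-cccc"], "purls": ["pkg:npm/example@1.0.0"]},
]
groups = correlate(sample)
```

The Red Hat and NVD observations end up in one group through the shared CVE alias, and the unrelated GHSA observation stays in a group of its own.
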

---

## 3. Conflict handling

Conflicts record disagreements without altering source payloads. The builder
emits structured entries:

```json
{
  "type": "severity-mismatch",
  "field": "cvss.baseScore",
  "observations": [
    {
      "source": "redhat",
      "value": "7.5",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"
    },
    {
      "source": "nvd",
      "value": "7.2",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"
    }
  ],
  "confidence": "medium",
  "detectedAt": "2025-10-27T14:00:00Z"
}
```

Supported conflict classes:

- `severity-mismatch` – CVSS or qualitative severities differ.
- `affected-range-divergence` – Product ranges, fixed versions, or platforms
  disagree.
- `statement-disagreement` – One observation declares `not_affected` while
  another states `affected`.
- `reference-clash` – URL or classifier collisions (for example, exploit URL vs
  conflicting advisory).
- `alias-inconsistency` – Aliases map to different canonical IDs (GHSA vs CVE).
- `metadata-gap` – Required provenance missing on one source; logged as a
  warning.

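As an illustration of how a `severity-mismatch` entry could be produced, the Python sketch below compares the per-source base scores and emits the structure shown earlier. The function name is hypothetical, sorting entries by source is an assumption made for deterministic output, and `confidence` is left out because this guide does not say how it is derived.

```python
from datetime import datetime, timezone


def detect_severity_mismatch(entries, detected_at=None):
    """Return a conflict entry when sources report different base scores."""
    if len({entry["value"] for entry in entries}) <= 1:
        return None  # all sources agree, nothing to record
    moment = detected_at or datetime.now(timezone.utc)
    return {
        "type": "severity-mismatch",
        "field": "cvss.baseScore",
        # Source payloads are copied, never edited or ranked.
        "observations": sorted((dict(e) for e in entries), key=lambda e: e["source"]),
        "detectedAt": moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


conflict = detect_severity_mismatch(
    [
        {"source": "redhat", "value": "7.5", "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"},
        {"source": "nvd", "value": "7.2", "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"},
    ],
    detected_at=datetime(2025, 10, 27, 14, 0, tzinfo=timezone.utc),
)
```

The function only records the disagreement; choosing between the two scores is left to downstream policy.
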
Conflict surfaces:

- WebService endpoints (`GET /advisories/linksets/{id}` → `conflicts[]`).
- UI evidence panel chips and conflict badges.
- CLI exports (JSON/OSV) exposed through LNM commands.
- Observability metrics (`advisory_linkset_conflicts_total{type}`).

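A consumer that already holds a linkset document could tally its `conflicts[]` by type along the lines of the metric above. The Python sketch below is illustrative only: the helper names are hypothetical and the Prometheus-style text rendering is an assumption about the exposition format.

```python
from collections import Counter


def conflicts_by_type(linkset):
    """Count conflict entries per conflict class."""
    return Counter(conflict["type"] for conflict in linkset.get("conflicts", []))


def render_metric(counts):
    """Render counts as labelled sample lines, sorted by conflict type."""
    return [
        f'advisory_linkset_conflicts_total{{type="{kind}"}} {total}'
        for kind, total in sorted(counts.items())
    ]


linkset = {
    "conflicts": [
        {"type": "severity-mismatch"},
        {"type": "reference-clash"},
        {"type": "severity-mismatch"},
    ]
}
lines = render_metric(conflicts_by_type(linkset))
```
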

---

## 4. AOC alignment

Observations and linksets must satisfy Aggregation-Only Contract invariants:

- **No derived severity** – `content.raw` may include upstream severity, but the
  observation body never injects or edits severity.
- **No merges** – Each upstream document stays separate; linksets reference
  observations via deterministic IDs.
- **Provenance mandatory** – Missing `signature` or `source` metadata is an AOC
  violation (`ERR_AOC_004`).
- **Idempotent writes** – Duplicate `contentHash` yields a no-op; supersedes
  pointer captures new revisions.
- **Deterministic output** – Linkset builder sorts keys, normalizes timestamps
  (UTC ISO-8601), and uses canonical JSON hashing.

Violations trigger guard errors (`ERR_AOC_00x`), emit `aoc_violation_total`
metrics, and block persistence until corrected.

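A write-time guard for two of these invariants might look like the following Python sketch. `AocViolation`, `guard_observation`, and the forbidden field names are hypothetical; only `ERR_AOC_004` is taken from this guide, and the derived-field case keeps the generic `ERR_AOC_00x` placeholder because its specific code is not given here.

```python
class AocViolation(Exception):
    """Raised when a document would break an Aggregation-Only Contract invariant."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


# Hypothetical names for derived fields; the real guard list may differ.
FORBIDDEN_FIELDS = {"severity", "effectiveSeverity", "policyStatus"}


def guard_observation(document):
    """Validate a candidate observation before it is persisted."""
    has_source = bool(document.get("source"))
    has_signature = document.get("upstream", {}).get("signature") is not None
    if not (has_source and has_signature):
        raise AocViolation("ERR_AOC_004", "missing source or signature metadata")
    derived = FORBIDDEN_FIELDS.intersection(document)
    if derived:
        # Upstream severity may live inside content.raw, never at the top level.
        raise AocViolation("ERR_AOC_00x", f"derived fields present: {sorted(derived)}")
    return document


valid = {
    "source": {"vendor": "redhat"},
    "upstream": {"upstreamId": "RHSA-2025:1234", "signature": {"present": False}},
    "content": {"raw": {"severity": "Important"}},
}
checked = guard_observation(valid)
```

Raising before the write mirrors the rule that violations block persistence until corrected.
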

---

## 5. Downstream consumption

- **Policy Engine** – Computes effective severity and risk overlays from linkset
  evidence and conflicts.
- **Console UI** – Renders per-source statements, signed hashes, and conflict
  banners inside the evidence panel.
- **CLI (`stella advisories linkset …`)** – Exports observations and linksets as
  JSON or OSV for offline triage.
- **Offline Kit** – Shipped snapshots include observation and linkset
  collections for air-gap parity.
- **Observability** – Dashboards track ingestion latency, conflict counts, and
  supersedes depth.

When adding new consumers, ensure they honour append-only semantics and do not
mutate observation or linkset collections.

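To make the division of labour concrete, here is a Python sketch of a consumer resolving a `severity-mismatch` conflict without touching the linkset. The rule names `highest` and `prefer-source` are invented for the example and are not Policy Engine configuration; the point is only that the decision is computed on the consumer side and returned as a new value.

```python
def effective_severity(conflict, rule="highest", preferred_source=None):
    """Pick an effective score from a conflict entry; the entry is only read."""
    scores = {o["source"]: float(o["value"]) for o in conflict["observations"]}
    if rule == "highest":
        # Iterate sources in sorted order so ties resolve deterministically.
        source = max(sorted(scores), key=scores.get)
    elif rule == "prefer-source":
        if preferred_source not in scores:
            raise KeyError(f"no observation from source {preferred_source!r}")
        source = preferred_source
    else:
        raise ValueError(f"unknown rule {rule!r}")
    return {"score": scores[source], "source": source, "decidedBy": rule}


conflict = {
    "type": "severity-mismatch",
    "observations": [
        {"source": "redhat", "value": "7.5"},
        {"source": "nvd", "value": "7.2"},
    ],
}
by_highest = effective_severity(conflict)
by_vendor = effective_severity(conflict, rule="prefer-source", preferred_source="nvd")
```
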

---

## 6. Validation & testing

- **Unit tests** (`StellaOps.Concelier.Core.Tests`) validate schema guards,
  deterministic linkset hashing, conflict detection fixtures, and supersedes
  chains.
- **Mongo integration tests** (`StellaOps.Concelier.Storage.Mongo.Tests`) verify
  indexes and idempotent writes under concurrency.
- **CLI smoke suites** confirm `stella advisories observations` and
  `stella advisories linksets` export stable JSON.
- **Determinism checks** replay identical upstream payloads and assert that the
  resulting observation and linkset documents match byte for byte.
- **Offline kit verification** simulates air-gapped bootstrap to confirm that
  snapshots align with live data.

Add fixtures whenever a new conflict type or correlation signal is introduced.
Ensure canonical JSON serialization remains stable across .NET runtime updates.

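The determinism check can be approximated as follows. This Python sketch is not the project's .NET test suite: `build_linkset` is a toy stand-in for the real builder, and the canonical form (sorted keys, compact separators, UTF-8) is an assumption. It goes slightly beyond a plain replay by also reversing the arrival order of the inputs.

```python
import json


def canonical_json(document) -> bytes:
    """Serialize with sorted keys and compact separators for byte comparison."""
    text = json.dumps(document, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def build_linkset(observations):
    """Toy builder (hypothetical): sorts everything it emits."""
    return {
        "observationIds": sorted(o["observationId"] for o in observations),
        "purls": sorted({p for o in observations for p in o["purls"]}),
    }


def replay_is_stable(build, observations) -> bool:
    """Rebuild from the same inputs in reversed order and compare bytes."""
    baseline = canonical_json(build(observations))
    replayed = canonical_json(build(list(reversed(observations))))
    return baseline == replayed


fixture = [
    {"observationId": "acme:redhat:RHSA-2025:1234:1",
     "purls": ["pkg:rpm/redhat/openssl@1.1.1w-12"]},
    {"observationId": "acme:nvd:CVE-2025-0001:1", "purls": []},
]
stable = replay_is_stable(build_linkset, fixture)
```

A builder that emits IDs in arrival order would fail this check, which is the kind of regression such a fixture is meant to catch.
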

---

## 7. Reviewer checklist

- Observation schema segment matches the latest `StellaOps.Concelier.Models`
  contract.
- Linkset lifecycle covers correlation signals, conflict classes, and
  deterministic IDs.
- AOC invariants are explicitly called out with violation codes.
- Examples include multi-source correlation plus conflict annotation.
- Downstream consumer guidance reflects active APIs and CLI features.
- Testing section lists required suites (Core, Storage, CLI, Offline).
- Imposed rule reminder is present at the top of the document.

Confirmed against Concelier Link-Not-Merge tasks:
`CONCELIER-LNM-21-001..005`, `CONCELIER-LNM-21-101..103`,
`CONCELIER-LNM-21-201..203`.