# CONCELIER-LNM-21-002 · Linkset correlation rules (v1)

Purpose: unblock CONCELIER-LNM-21-002 by freezing correlation/precedence rules and providing fixtures so builders and downstream consumers can proceed.

## Scope
- Applies to linksets produced from `advisory_observations` (LNM v1).
- Correlation is aggregation-only: no value synthesis or merge; emit conflicts instead of collapsing fields.
- Output persists in `advisory_linksets` and drives `advisory.linkset.updated@1` events.

## Deterministic confidence calculation (0–1)
```
confidence = clamp(
  0.40 * alias_score +
  0.25 * purl_overlap_score +
  0.15 * cpe_overlap_score +
  0.10 * severity_agreement +
  0.05 * reference_overlap +
  0.05 * freshness_score
)
```
- `alias_score`: 1 if any alias exact-match across observations; 0.5 if vendor ID prefixes match; else 0.
- `purl_overlap_score`: 1 if same pkg+version range intersects; 0.6 if same pkg family but disjoint ranges; 0 otherwise. Use semver/rpm/deb comparers as in LNM v1.
- `cpe_overlap_score`: 1 if any CPE exact-match; 0.5 if same vendor/product, any version; else 0.
- `severity_agreement`: 1 if CVSS base score delta ≤ 0.1; 0.5 if ≤ 1.0; else 0. Use max of available CVSS per observation.
- `reference_overlap`: fraction of shared reference URLs (case-normalized) between the pair with the highest overlap across the set.
- `freshness_score`: 1 when `fetchedAt` spread ≤ 48h; linearly decays to 0 at 14 days.
- Sort observations before scoring by `(source.vendor, advisoryId, fetchedAt)`; reuse that order for hashing and for output arrays.

## Conflict emission (add-only)
Emit a conflict entry per divergent field group:
- `severity-mismatch`: CVSS base score delta > 1.0 or vector differs.
- `affected-range-divergence`: version ranges do not intersect.
- `reference-clash`: no shared references and source vendors differ.
- `alias-inconsistency`: aliases disjoint across observations.
- `metadata-gap`: required fields missing on any observation.
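
Two of the conflict rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Observation` shape, the `detect_conflicts` helper, and the `field` values `"severity"` and `"aliases"` are hypothetical and not taken from the LNM schema. The sketch also assumes that "aliases disjoint" means no alias is shared by all observations, and it rounds the CVSS delta to one decimal place so that floating-point noise does not push a delta of exactly 1.0 over the threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    # Hypothetical, simplified observation shape (not the real LNM schema).
    source: str
    cvss_score: Optional[float] = None
    cvss_vector: Optional[str] = None
    aliases: frozenset = frozenset()


def detect_conflicts(observations):
    """Return conflict records sorted by (field, reason); empty list, never None."""
    # Deterministic input order; the real key is (source.vendor, advisoryId, fetchedAt).
    ordered = sorted(observations, key=lambda o: o.source)
    conflicts = []

    # severity-mismatch: CVSS base score delta > 1.0 or vector differs.
    scored = [o for o in ordered if o.cvss_score is not None]
    if len(scored) >= 2:
        scores = [o.cvss_score for o in scored]
        vectors = {o.cvss_vector for o in scored if o.cvss_vector}
        delta = round(max(scores) - min(scores), 1)  # assumes one-decimal CVSS scores
        if delta > 1.0 or len(vectors) > 1:
            conflicts.append({
                "field": "severity",  # hypothetical field name
                "reason": "severity-mismatch",
                "values": [f"{o.source}: {o.cvss_score}" for o in scored],
                "sourceIds": sorted({o.source for o in scored}),
            })

    # alias-inconsistency: assumed here to mean no alias common to all observations.
    with_aliases = [o for o in ordered if o.aliases]
    if len(with_aliases) >= 2:
        shared = set.intersection(*(set(o.aliases) for o in with_aliases))
        if not shared:
            conflicts.append({
                "field": "aliases",  # hypothetical field name
                "reason": "alias-inconsistency",
                "values": [f"{o.source}: {','.join(sorted(o.aliases))}" for o in with_aliases],
                "sourceIds": sorted({o.source for o in with_aliases}),
            })

    # sorted() is stable, so equal (field, reason) keys keep insertion order.
    return sorted(conflicts, key=lambda c: (c["field"], c["reason"]))
```

The remaining codes (`affected-range-divergence`, `reference-clash`, `metadata-gap`) would follow the same pattern but depend on the range comparers and required-field list, which this sketch does not model.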

Each conflict includes `field`, `reason`, and `values` (array of `source: value` strings) and is stable-sorted by `field` then `reason`.

## Linkset output shape additions
- `key.confidence`: populated from formula above.
- `conflicts[]`: as defined; may be empty but never null. Each conflict also carries `sourceIds[]` (vendors/sources that produced the values) for provenance.
- `normalized` retains add-only fields from `link-not-merge-schema.md`; do not drop raw ranges even when disjoint.
- `provenance.hashes`: sorted list of `observationHash` values; used by replay bundles.

## Fixtures
- `docs/samples/lnm/linkset-lnm-21-002-sample.json`: two-source agreement (high confidence, no conflicts).
- `docs/samples/lnm/linkset-lnm-21-002-conflict.json`: three-source disagreement showing conflict records and confidence < 0.7.

All fixtures use ASCII ordering and ISO-8601 UTC timestamps and may be used as golden outputs in tests.

## Implementation checklist
- Builder must refuse to overwrite existing linkset when incoming hash list unchanged.
- Correlation job idempotency key: `hash(tenantId|aliasSet|purlSet|fetchedAtBucket)`.
- Telemetry: counter `concelier.linkset.builder.conflict_total{field,reason}` and histogram `concelier.linkset.builder.confidence` (0–1 buckets).
- Event emission: include `confidence` and `conflicts` summary in `advisory.linkset.updated@1`; keep arrays sorted as above.

## Change control
- Add-only. Adjusting weights or conflict codes requires new version `advisory.linkset.updated@2` and a sprint note.
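
For reference, the weights frozen by this change-control rule are those of the confidence formula. A minimal Python sketch of that calculation is given below, with two of the sub-scores spelled out. The function names and the dict-based input are hypothetical. The sketch assumes the `freshness_score` decay starts at 48h and reaches 0 at 14 days, that `clamp` bounds the result to the 0–1 range, and that CVSS base scores carry one decimal place (hence the rounding of the delta).

```python
from datetime import timedelta

# Weights copied from the confidence formula; they sum to 1.0.
WEIGHTS = {
    "alias_score": 0.40,
    "purl_overlap_score": 0.25,
    "cpe_overlap_score": 0.15,
    "severity_agreement": 0.10,
    "reference_overlap": 0.05,
    "freshness_score": 0.05,
}


def clamp(value, low=0.0, high=1.0):
    """Bound value to [low, high]; the 0-1 bounds are an assumption."""
    return max(low, min(high, value))


def severity_agreement(max_cvss_per_observation):
    """1 if CVSS base score delta <= 0.1, 0.5 if <= 1.0, else 0."""
    scores = list(max_cvss_per_observation)
    # Round to one decimal so float noise (e.g. 9.8 - 9.7) does not cross a threshold.
    delta = round(max(scores) - min(scores), 1)
    if delta <= 0.1:
        return 1.0
    if delta <= 1.0:
        return 0.5
    return 0.0


def freshness_score(fetched_at_values):
    """1 when the fetchedAt spread is <= 48h, decaying linearly to 0 at 14 days."""
    times = list(fetched_at_values)
    spread = max(times) - min(times)
    full, zero = timedelta(hours=48), timedelta(days=14)
    if spread <= full:
        return 1.0
    if spread >= zero:
        return 0.0
    # Dividing two timedeltas yields a float ratio.
    return 1.0 - (spread - full) / (zero - full)


def confidence(sub_scores):
    """Weighted sum of the six sub-scores, clamped to 0-1."""
    total = sum(weight * sub_scores[name] for name, weight in WEIGHTS.items())
    return clamp(total)
```

Keeping the weights in a single table makes a weight change an explicit, reviewable diff, which fits the requirement that such a change ships as a new event version.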