# CONCELIER-LNM-21-002 · Linkset correlation rules (v1)

Purpose: unblock CONCELIER-LNM-21-002 by freezing correlation/precedence rules and providing fixtures so builders and downstream consumers can proceed.

## Scope
- Applies to linksets produced from `advisory_observations` (LNM v1).
- Correlation is aggregation-only: no value synthesis or merge; emit conflicts instead of collapsing fields.
- Output persists in `advisory_linksets` and drives `advisory.linkset.updated@1` events.

## Deterministic confidence calculation (0–1)
```
confidence = clamp(
  0.40 * alias_score +
  0.25 * purl_overlap_score +
  0.15 * cpe_overlap_score +
  0.10 * severity_agreement +
  0.05 * reference_overlap +
  0.05 * freshness_score,
  0, 1
)
```
- `alias_score`: 1 if any alias matches exactly across observations; 0.5 if only vendor ID prefixes match; else 0.
- `purl_overlap_score`: 1 if the same package appears with intersecting version ranges; 0.6 if the same package family appears with disjoint ranges; else 0. Use the semver/rpm/deb comparers from LNM v1.
- `cpe_overlap_score`: 1 if any CPE matches exactly; 0.5 if vendor and product match regardless of version; else 0.
- `severity_agreement`: 1 if the CVSS base score delta is ≤ 0.1; 0.5 if it is ≤ 1.0; else 0. Use the highest available CVSS score for each observation.
- `reference_overlap`: fraction of shared reference URLs (case-normalized) between the pair with the highest overlap across the set.
- `freshness_score`: 1 when `fetchedAt` spread ≤ 48h; linearly decays to 0 at 14 days.
- Sort observations before scoring by `(source.vendor, advisoryId, fetchedAt)`; reuse that order for hashing and for output arrays.
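
The formula and two of the component definitions above can be sketched in Python as follows. This is a minimal illustration, not the builder's implementation: all function names and the observation dict shape are hypothetical, the freshness decay is assumed to start at 48 h and reach 0 at 14 days, and the CVSS delta is rounded to one decimal to sidestep floating-point noise. The remaining component scores (alias, PURL, CPE, reference overlap) are taken as already-computed inputs.

```python
from datetime import datetime, timedelta, timezone

# Weights copied from the formula above; they sum to 1.0.
WEIGHTS = {
    "alias_score": 0.40,
    "purl_overlap_score": 0.25,
    "cpe_overlap_score": 0.15,
    "severity_agreement": 0.10,
    "reference_overlap": 0.05,
    "freshness_score": 0.05,
}


def sort_observations(observations):
    # Assumed dict shape: {"source": {"vendor": ...}, "advisoryId": ..., "fetchedAt": ...}
    return sorted(
        observations,
        key=lambda o: (o["source"]["vendor"], o["advisoryId"], o["fetchedAt"]),
    )


def severity_agreement(max_cvss_per_observation):
    # Input: the highest available CVSS base score of each observation.
    scores = list(max_cvss_per_observation)
    # CVSS base scores carry one decimal, so round the delta before comparing.
    delta = round(max(scores) - min(scores), 1)
    if delta <= 0.1:
        return 1.0
    if delta <= 1.0:
        return 0.5
    return 0.0


def freshness_score(fetched_at_values):
    # 1 while the spread is within 48 h, then a straight line down to 0 at 14 days.
    spread = max(fetched_at_values) - min(fetched_at_values)
    full, zero = timedelta(hours=48), timedelta(days=14)
    if spread <= full:
        return 1.0
    if spread >= zero:
        return 0.0
    return 1.0 - (spread - full) / (zero - full)


def confidence(scores):
    # Weighted sum of the six component scores, clamped to [0, 1].
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return max(0.0, min(1.0, total))


# Illustrative inputs only; these values do not come from the fixtures.
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
example = confidence({
    "alias_score": 1.0,
    "purl_overlap_score": 0.6,
    "cpe_overlap_score": 0.5,
    "severity_agreement": severity_agreement([9.8, 9.0]),   # delta 0.8, so 0.5
    "reference_overlap": 0.25,
    "freshness_score": freshness_score([t0, t0 + timedelta(hours=12)]),  # 1.0
})  # about 0.7375
```

Because the weights sum to 1.0 and every component lies in [0, 1], the clamp only guards against floating-point drift and out-of-range inputs.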

## Conflict emission (add-only)
Emit a conflict entry per divergent field group:
- `severity-mismatch`: CVSS base score delta > 1.0 or vector differs.
- `affected-range-divergence`: version ranges do not intersect.
- `reference-clash`: no shared references and source vendors differ.
- `alias-inconsistency`: aliases disjoint across observations.
- `metadata-gap`: required fields missing on any observation.

Each conflict includes `field`, `reason`, and `values` (an array of `source: value` strings). The conflict list is stable-sorted by `field`, then `reason`.
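
As a rough illustration of the rules above, the sketch below emits a `severity-mismatch` entry and sorts a conflict list. It rests on several assumptions that this note does not confirm: the conflict code is carried in `reason`, the `field` value is named `severity`, and each source contributes one `(base score, vector)` pair. The helper names and sample values are hypothetical.

```python
def make_conflict(field, reason, values_by_source):
    # "values" holds "source: value" strings, ordered by source for determinism.
    return {
        "field": field,
        "reason": reason,
        "values": [f"{src}: {values_by_source[src]}" for src in sorted(values_by_source)],
    }


def severity_mismatch(cvss_by_source):
    # cvss_by_source maps a source name to an assumed (base_score, vector) pair.
    scores = [score for score, _vector in cvss_by_source.values()]
    vectors = {vector for _score, vector in cvss_by_source.values()}
    delta = round(max(scores) - min(scores), 1)
    if delta > 1.0 or len(vectors) > 1:
        rendered = {src: f"{score} {vector}" for src, (score, vector) in cvss_by_source.items()}
        # Assumption: the field is called "severity"; the code goes in "reason".
        return make_conflict("severity", "severity-mismatch", rendered)
    return None  # add-only: nothing is emitted when the sources agree


def sort_conflicts(conflicts):
    # sorted() is stable, so entries with equal (field, reason) keep their input order.
    return sorted(conflicts, key=lambda c: (c["field"], c["reason"]))


# Illustrative values only.
conflict = severity_mismatch({
    "vendor-b": (7.5, "AV:N/AC:L"),
    "vendor-a": (9.8, "AV:N/AC:L"),
})
```

Returning `None` instead of a merged value keeps the builder aggregation-only: disagreement is recorded, never resolved.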

## Linkset output shape additions
- `key.confidence`: populated from formula above.
- `conflicts[]`: as defined; may be empty but never null. Each conflict also carries `sourceIds[]` (vendors/sources that produced the values) for provenance.
- `normalized` retains add-only fields from `link-not-merge-schema.md`; do not drop raw ranges even when disjoint.
- `provenance.hashes`: sorted list of `observationHash` values; used by replay bundles.
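
The additions above could be assembled along these lines. This is a sketch under assumptions: the function name is hypothetical, the nesting is inferred from the dotted field names in the list, and `normalized` is left out because its shape is defined in `link-not-merge-schema.md`.

```python
def linkset_additions(confidence, conflicts, observation_hashes):
    # conflicts may be empty but never null, so None is coerced to an empty list.
    ordered_conflicts = sorted(conflicts or [], key=lambda c: (c["field"], c["reason"]))
    return {
        "key": {"confidence": confidence},
        "conflicts": ordered_conflicts,
        # Python compares str by code point, which matches ASCII ordering for ASCII hashes.
        "provenance": {"hashes": sorted(observation_hashes)},
    }


# Illustrative values only; the hash strings are placeholders.
additions = linkset_additions(
    confidence=0.92,
    conflicts=None,
    observation_hashes=["sha256:bbb", "sha256:aaa"],
)
```

Sorting the hashes at assembly time means two builds over the same observations produce byte-identical `provenance.hashes`, which is what a replay comparison would rely on.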

## Fixtures
- `docs/samples/lnm/linkset-lnm-21-002-sample.json`: two-source agreement (high confidence, no conflicts).
- `docs/samples/lnm/linkset-lnm-21-002-conflict.json`: three-source disagreement showing conflict records and confidence < 0.7.

All fixtures use ASCII ordering and ISO-8601 UTC timestamps and may be used as golden outputs in tests.

## Implementation checklist
- The builder must refuse to overwrite an existing linkset when the incoming hash list is unchanged.
- Correlation job idempotency key: `hash(tenantId|aliasSet|purlSet|fetchedAtBucket)`.
- Telemetry: counter `concelier.linkset.builder.conflict_total{field,reason}` and histogram `concelier.linkset.builder.confidence` (0–1 buckets).
- Event emission: include `confidence` and `conflicts` summary in `advisory.linkset.updated@1`; keep arrays sorted as above.
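
The first two checklist items can be sketched as below. Several details are assumptions, since this note does not pin them down: SHA-256 as the hash, sets serialized as sorted comma-joined strings, and `fetchedAtBucket` supplied as a precomputed string. Both function names are hypothetical.

```python
import hashlib


def idempotency_key(tenant_id, alias_set, purl_set, fetched_at_bucket):
    # Sets are sorted before joining so the key does not depend on iteration order.
    parts = [
        tenant_id,
        ",".join(sorted(alias_set)),
        ",".join(sorted(purl_set)),
        fetched_at_bucket,
    ]
    # Assumption: SHA-256 over the UTF-8 bytes of the "|"-joined parts.
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def should_write(existing_hashes, incoming_hashes):
    # A missing linkset is always written; an unchanged hash list is a no-op.
    if existing_hashes is None:
        return True
    return sorted(existing_hashes) != sorted(incoming_hashes)


# Illustrative identifiers only.
key = idempotency_key(
    "tenant-a",
    {"CVE-2024-0001", "GHSA-xxxx-yyyy-zzzz"},
    {"pkg:npm/example"},
    "2024-01-01",
)
```

Comparing sorted hash lists lets the builder skip the write, and any resulting `advisory.linkset.updated@1` event, when a re-run sees the same observations.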

## Change control
- Add-only. Adjusting weights or conflict codes requires a new version, `advisory.linkset.updated@2`, and a sprint note.