# Link-Not-Merge (LNM) Observation & Linkset Schema

_Frozen v1 (add-only), approved 2025-11-17 for CONCELIER-LNM-21-001/002/101._

## Goals
- Immutable storage of raw advisory observations per source/tenant.
- Deterministic linksets built from observations without merging or mutating originals.
- Stable across online/offline deployments; replayable from raw inputs.

## Status
- Frozen v1 as of 2025-11-17; further schema changes must go through ADR + sprint gating (CONCELIER-LNM-22x+).

## Observation document (Mongo JSON Schema excerpt)
```json
{
  "bsonType": "object",
  "required": ["_id","tenantId","source","advisoryId","affected","provenance","ingestedAt"],
  "properties": {
    "_id": {"bsonType": "objectId"},
    "tenantId": {"bsonType": "string"},
    "source": {"bsonType": "string", "description": "Adapter id, e.g., ghsa, nvd, cert-bund"},
    "advisoryId": {"bsonType": "string"},
    "title": {"bsonType": "string"},
    "summary": {"bsonType": "string"},
    "severities": {
      "bsonType": "array",
      "items": {"bsonType": "object", "required": ["system","score"], "properties": {"system":{"bsonType":"string"},"score":{"bsonType":"double"},"vector":{"bsonType":"string"}}}
    },
    "affected": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["purl"],
        "properties": {
          "purl": {"bsonType":"string"},
          "package": {"bsonType":"string"},
          "versions": {"bsonType":"array","items":{"bsonType":"string"}},
          "ranges": {"bsonType":"array","items":{"bsonType":"object",
            "required":["type","events"],
            "properties": {"type":{"bsonType":"string"},"events":{"bsonType":"array","items":{"bsonType":"object"}}}}},
          "ecosystem": {"bsonType":"string"},
          "cpe": {"bsonType":"array","items":{"bsonType":"string"}},
          "cpes": {"bsonType":"array","items":{"bsonType":"string"}}
        }
      }
    },
    "references": {"bsonType": "array", "items": {"bsonType":"string"}},
    "scopes": {"bsonType":"array","items":{"bsonType":"string"}},
    "relationships": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["type","source","target"],
"properties": { "type":{"bsonType":"string"}, "source":{"bsonType":"string"}, "target":{"bsonType":"string"}, "provenance":{"bsonType":"string"} }} }, "weaknesses": {"bsonType":"array","items":{"bsonType":"string"}}, "published": {"bsonType": "date"}, "modified": {"bsonType": "date"}, "provenance": { "bsonType": "object", "required": ["sourceArtifactSha","fetchedAt"], "properties": { "sourceArtifactSha": {"bsonType":"string"}, "fetchedAt": {"bsonType":"date"}, "ingestJobId": {"bsonType":"string"}, "signature": {"bsonType":"object"} } }, "ingestedAt": {"bsonType": "date"} } } ``` ### Observation invariants - **Immutable:** no in-place updates; new revision → new document with `supersedesId` optional pointer. - **Deterministic keying:** `_id` derived from `hash(tenantId|source|advisoryId|provenance.sourceArtifactSha)` to keep inserts idempotent in replay. - **Normalization guardrails:** version ranges must be stored as raw-from-source; no inferred merges. ## Linkset document ```json { "bsonType":"object", "required":["_id","tenantId","advisoryId","source","observations","createdAt"], "properties":{ "_id":{"bsonType":"objectId"}, "tenantId":{"bsonType":"string"}, "advisoryId":{"bsonType":"string"}, "source":{"bsonType":"string"}, "observations":{"bsonType":"array","items":{"bsonType":"objectId"}}, "normalized": { "bsonType":"object", "properties":{ "purls":{"bsonType":"array","items":{"bsonType":"string"}}, "versions":{"bsonType":"array","items":{"bsonType":"string"}}, "ranges": {"bsonType":"array","items":{"bsonType":"object"}}, "severities": {"bsonType":"array","items":{"bsonType":"object"}} } }, "confidence": {"bsonType":"double", "description":"Optional correlation confidence (0–1)"}, "conflicts": {"bsonType":"array","items":{"bsonType":"object", "required":["field","reason"], "properties":{ "field":{"bsonType":"string"}, "reason":{"bsonType":"string"}, "values":{"bsonType":"array","items":{"bsonType":"string"}}, 
"sourceIds":{"bsonType":"array","items":{"bsonType":"string"}} }}}, "createdAt":{"bsonType":"date"}, "builtByJobId":{"bsonType":"string"}, "provenance": {"bsonType":"object","properties":{ "observationHashes":{"bsonType":"array","items":{"bsonType":"string"}}, "toolVersion" : {"bsonType":"string"}, "policyHash" : {"bsonType":"string"} }} } } ``` ### Linkset invariants - Built from a set of observation IDs; never overwrites observations. - Carries the hash list of source observations for audit/replay. - Deterministic sort: observations sorted by `source, advisoryId, fetchedAt` before hashing. - Conflicts are additive only and now carry optional `sourceIds[]` to trace which upstream sources produced divergent values. ## Indexes (Mongo) - Observations: `{ tenantId:1, source:1, advisoryId:1, provenance.fetchedAt:-1 }` (compound for ingest); `{ provenance.sourceArtifactSha:1 }` unique to avoid dup writes. - Linksets: `{ tenantId:1, advisoryId:1, source:1 }` unique; `{ observations:1 }` sparse for reverse lookups. ## Collections - `advisory_observations` — raw per-source docs (immutable). - `advisory_linksets` — derived normalized aggregates with observation pointers and hashes. ## Determinism & replay - Replay rebuild: order observations by fetchedAt, recompute linkset hash list, ensure byte-identical linkset JSON. - All timestamps UTC ISO-8601; no server-local time. - String normalization: lowercase `source`, trim/normalize PURLs, stable sort arrays. ## Sample documents See `docs/samples/lnm/observation-ghsa.json` and `docs/samples/lnm/linkset-ghsa.json` (added with this draft) for concrete payloads. ## Approval path 1) Architecture + Concelier Core review this document. 2) If accepted, freeze JSON Schema and roll into `src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo` migrations. 3) Update consumers (policy/CLI/export) to read from linksets only; deprecate Merge endpoints. 

---

Tracking: CONCELIER-LNM-21-001/002/101; Sprint 110 blockers (Concelier/Excititor waves).