# Link-Not-Merge (LNM) Observation & Linkset Schema

_Frozen v1 (add-only), approved 2025-11-17 for CONCELIER-LNM-21-001/002/101._

## Goals
- Immutable storage of raw advisory observations per source/tenant.
- Deterministic linksets built from observations without merging or mutating originals.
- Stable across online/offline deployments; replayable from raw inputs.

## Status
- Frozen v1 as of 2025-11-17; further schema changes must go through ADR + sprint gating (CONCELIER-LNM-22x+).

## Observation document (Mongo JSON Schema excerpt)
```json
{
  "bsonType": "object",
  "required": ["_id","tenantId","source","advisoryId","affected","provenance","ingestedAt"],
  "properties": {
    "_id": {"bsonType": "objectId"},
    "tenantId": {"bsonType": "string"},
    "source": {"bsonType": "string", "description": "Adapter id, e.g., ghsa, nvd, cert-bund"},
    "advisoryId": {"bsonType": "string"},
    "title": {"bsonType": "string"},
    "summary": {"bsonType": "string"},
    "severities": {
      "bsonType": "array",
      "items": {"bsonType": "object", "required": ["system","score"],
        "properties": {"system":{"bsonType":"string"},"score":{"bsonType":"double"},"vector":{"bsonType":"string"}}}
    },
    "affected": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["purl"],
        "properties": {
          "purl": {"bsonType":"string"},
          "package": {"bsonType":"string"},
          "versions": {"bsonType":"array","items":{"bsonType":"string"}},
          "ranges": {"bsonType":"array","items":{"bsonType":"object",
            "required":["type","events"],
            "properties": {"type":{"bsonType":"string"},"events":{"bsonType":"array","items":{"bsonType":"object"}}}}},
          "ecosystem": {"bsonType":"string"},
          "cpe": {"bsonType":"array","items":{"bsonType":"string"}},
          "cpes": {"bsonType":"array","items":{"bsonType":"string"}}
        }
      }
    },
    "references": {"bsonType": "array", "items": {"bsonType":"string"}},
    "scopes": {"bsonType":"array","items":{"bsonType":"string"}},
    "relationships": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["type","source","target"],
        "properties": {
          "type":{"bsonType":"string"},
          "source":{"bsonType":"string"},
          "target":{"bsonType":"string"},
          "provenance":{"bsonType":"string"}
        }}
    },
    "weaknesses": {"bsonType":"array","items":{"bsonType":"string"}},
    "published": {"bsonType": "date"},
    "modified": {"bsonType": "date"},
    "provenance": {
      "bsonType": "object",
      "required": ["sourceArtifactSha","fetchedAt"],
      "properties": {
        "sourceArtifactSha": {"bsonType":"string"},
        "fetchedAt": {"bsonType":"date"},
        "ingestJobId": {"bsonType":"string"},
        "signature": {"bsonType":"object"}
      }
    },
    "ingestedAt": {"bsonType": "date"}
  }
}
```

### Observation invariants
- **Immutable:** no in-place updates; a new revision is written as a new document, optionally pointing at the one it replaces via `supersedesId`.
- **Deterministic keying:** `_id` is derived from `hash(tenantId|source|advisoryId|provenance.sourceArtifactSha)` so that inserts stay idempotent on replay.
- **Normalization guardrails:** version ranges must be stored exactly as received from the source; no inferred merges.

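The deterministic keying rule can be illustrated with a minimal sketch. The schema does not name the hash function or say how the digest maps onto an `objectId`, so SHA-256 and truncation to 12 bytes (the size of an ObjectId) are assumptions made here for illustration, as is the helper name `observation_id`.

```python
import hashlib


def observation_id(tenant_id: str, source: str, advisory_id: str,
                   source_artifact_sha: str) -> str:
    """Return a 24-hex-character id derived from the observation's identity fields.

    Assumptions (not stated in the schema): SHA-256 as the hash, and
    truncation of the digest to 12 bytes so it fits an ObjectId.
    """
    # Join the fields with '|' exactly as written in the invariant.
    # `source` is lowercased, following the string normalization rule.
    material = "|".join([tenant_id, source.lower(), advisory_id, source_artifact_sha])
    digest = hashlib.sha256(material.encode("utf-8")).digest()
    return digest[:12].hex()
```

Because the id depends only on the four identity fields, replaying the same raw artifact yields the same `_id`, so a repeated insert collides with the existing document instead of creating a duplicate.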
## Linkset document
```json
{
  "bsonType":"object",
  "required":["_id","tenantId","advisoryId","source","observations","createdAt"],
  "properties":{
    "_id":{"bsonType":"objectId"},
    "tenantId":{"bsonType":"string"},
    "advisoryId":{"bsonType":"string"},
    "source":{"bsonType":"string"},
    "observations":{"bsonType":"array","items":{"bsonType":"objectId"}},
    "normalized": {
      "bsonType":"object",
      "properties":{
        "purls":{"bsonType":"array","items":{"bsonType":"string"}},
        "versions":{"bsonType":"array","items":{"bsonType":"string"}},
        "ranges": {"bsonType":"array","items":{"bsonType":"object"}},
        "severities": {"bsonType":"array","items":{"bsonType":"object"}}
      }
    },
    "confidence": {"bsonType":"double", "description":"Optional correlation confidence (0–1)"},
    "conflicts": {"bsonType":"array","items":{"bsonType":"object",
      "required":["field","reason"],
      "properties":{
        "field":{"bsonType":"string"},
        "reason":{"bsonType":"string"},
        "values":{"bsonType":"array","items":{"bsonType":"string"}},
        "sourceIds":{"bsonType":"array","items":{"bsonType":"string"}}
      }}},
    "createdAt":{"bsonType":"date"},
    "builtByJobId":{"bsonType":"string"},
    "provenance": {"bsonType":"object","properties":{
      "observationHashes":{"bsonType":"array","items":{"bsonType":"string"}},
      "toolVersion": {"bsonType":"string"},
      "policyHash": {"bsonType":"string"}
    }}
  }
}
```

### Linkset invariants
- Built from a set of observation IDs; never overwrites observations.
- Carries the hash list of source observations for audit/replay.
- Deterministic sort: observations sorted by `source, advisoryId, fetchedAt` before hashing.
- Conflicts are additive only and now carry optional `sourceIds[]` to trace which upstream sources produced divergent values.

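The sort-then-hash rule might look like the sketch below. The document does not define what a per-observation hash covers, so `observation_hash` (SHA-256 over a compact JSON rendering of three identity fields) is a hypothetical stand-in; only the sort key `source, advisoryId, fetchedAt` comes from the invariant itself. The sample advisory id and artifact hashes are placeholders.

```python
import hashlib
import json
from datetime import datetime, timezone


def observation_hash(obs: dict) -> str:
    """Hypothetical per-observation hash; the real definition is not given here."""
    payload = json.dumps(
        {
            "source": obs["source"],
            "advisoryId": obs["advisoryId"],
            "sourceArtifactSha": obs["provenance"]["sourceArtifactSha"],
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def observation_hashes(observations: list) -> list:
    """Sort by (source, advisoryId, fetchedAt), then hash each observation in order."""
    ordered = sorted(
        observations,
        key=lambda o: (o["source"], o["advisoryId"], o["provenance"]["fetchedAt"]),
    )
    return [observation_hash(o) for o in ordered]


# Placeholder observations: same advisory, fetched on two different days.
older = {
    "source": "ghsa",
    "advisoryId": "GHSA-xxxx-xxxx-xxxx",
    "provenance": {
        "sourceArtifactSha": "aa",
        "fetchedAt": datetime(2025, 1, 1, tzinfo=timezone.utc),
    },
}
newer = {
    "source": "ghsa",
    "advisoryId": "GHSA-xxxx-xxxx-xxxx",
    "provenance": {
        "sourceArtifactSha": "bb",
        "fetchedAt": datetime(2025, 1, 2, tzinfo=timezone.utc),
    },
}
```

Sorting before hashing means the resulting `observationHashes` list does not depend on the order in which observations were read from storage.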
## Indexes (Mongo)
- Observations: `{ tenantId:1, source:1, advisoryId:1, "provenance.fetchedAt":-1 }` (compound, for ingest); `{ "provenance.sourceArtifactSha":1 }`, unique, to avoid duplicate writes.
- Linksets: `{ tenantId:1, advisoryId:1, source:1 }`, unique; `{ observations:1 }`, sparse, for reverse lookups.

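As a sketch of how these index definitions could be applied from a migration script: the helper `ensure_indexes` and the `INDEXES` table are hypothetical, and `db` is assumed to behave like a PyMongo `Database` (item access returns a collection whose `create_index` accepts a list of `(field, direction)` pairs plus `unique`/`sparse` options). The collection names are the ones listed under Collections.

```python
ASCENDING, DESCENDING = 1, -1  # same values as pymongo.ASCENDING / pymongo.DESCENDING

# (keys, options) pairs per collection, mirroring the index list above.
INDEXES = {
    "advisory_observations": [
        (
            [
                ("tenantId", ASCENDING),
                ("source", ASCENDING),
                ("advisoryId", ASCENDING),
                ("provenance.fetchedAt", DESCENDING),
            ],
            {},
        ),
        ([("provenance.sourceArtifactSha", ASCENDING)], {"unique": True}),
    ],
    "advisory_linksets": [
        (
            [("tenantId", ASCENDING), ("advisoryId", ASCENDING), ("source", ASCENDING)],
            {"unique": True},
        ),
        ([("observations", ASCENDING)], {"sparse": True}),
    ],
}


def ensure_indexes(db) -> None:
    """Create every index in INDEXES on the given database-like object."""
    for collection_name, specs in INDEXES.items():
        for keys, options in specs:
            db[collection_name].create_index(keys, **options)
```

Keeping the definitions in a plain data table makes them easy to diff against this document when the schema is reviewed.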
## Collections
- `advisory_observations`: raw per-source docs (immutable).
- `advisory_linksets`: derived normalized aggregates with observation pointers and hashes.

## Determinism & replay
- Replay rebuild: order observations by `fetchedAt`, recompute the linkset hash list, and ensure the resulting linkset JSON is byte-identical.
- All timestamps are UTC ISO-8601; no server-local time.
- String normalization: lowercase `source`, trim/normalize PURLs, stable sort arrays.

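A minimal sketch of these rules, under stated assumptions: the function names are hypothetical, second-level timestamp precision with a trailing `Z` is an assumption, and PURL handling is limited to trimming whitespace (full PURL normalization is out of scope here). Sorted keys and compact separators are one common way to get a byte-stable JSON rendering; the document does not prescribe a specific canonical form.

```python
import json
from datetime import datetime, timezone


def utc_iso(ts: datetime) -> str:
    """Render a timezone-aware timestamp as UTC ISO-8601 (second precision)."""
    if ts.tzinfo is None:
        # Naive datetimes would silently pick up server-local time; reject them.
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_linkset(linkset: dict) -> dict:
    """Lowercase `source`, trim string arrays under `normalized`, and sort them."""
    out = dict(linkset)
    out["source"] = linkset["source"].strip().lower()
    if "normalized" in linkset:
        normalized = dict(linkset["normalized"])
        for field in ("purls", "versions"):
            if field in normalized:
                normalized[field] = sorted(v.strip() for v in normalized[field])
        out["normalized"] = normalized
    return out


def canonical_json(doc: dict) -> bytes:
    """Serialize with sorted keys and no optional whitespace, encoded as UTF-8."""
    return json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
```

A replay check can then compare `canonical_json(normalize_linkset(rebuilt))` against the stored bytes; any difference indicates a non-deterministic step in the rebuild.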
## Sample documents
See `docs/samples/lnm/observation-ghsa.json` and `docs/samples/lnm/linkset-ghsa.json` (added with this draft) for concrete payloads.

## Approval path
1) Architecture + Concelier Core review this document.
2) If accepted, freeze JSON Schema and roll into `src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo` migrations.
3) Update consumers (policy/CLI/export) to read from linksets only; deprecate Merge endpoints.

---
Tracking: CONCELIER-LNM-21-001/002/101; Sprint 110 blockers (Concelier/Excititor waves).