Files
git.stella-ops.org/docs/modules/concelier/link-not-merge-schema.md
StellaOps Bot dc7c75b496
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
feat: Add MongoIdempotencyStoreOptions for MongoDB configuration
feat: Implement BsonJsonConverter for converting BsonDocument and BsonArray to JSON

fix: Update project file to include MongoDB.Bson package

test: Add GraphOverlayExporterTests to validate NDJSON export functionality

refactor: Refactor Program.cs in Attestation Tool for improved argument parsing and error handling

docs: Update README for stella-forensic-verify with usage instructions and exit codes

feat: Enhance HmacVerifier with clock skew and not-after checks

feat: Add MerkleRootVerifier and ChainOfCustodyVerifier for additional verification methods

fix: Update DenoRuntimeShim to correctly handle file paths

feat: Introduce ComposerAutoloadData and related parsing in ComposerLockReader

test: Add tests for Deno runtime execution and verification

test: Enhance PHP package tests to include autoload data verification

test: Add unit tests for HmacVerifier and verification logic
2025-11-22 16:42:56 +02:00

150 lines
6.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Link-Not-Merge (LNM) Observation & Linkset Schema
_Frozen v1 (add-only) — approved 2025-11-17 for CONCELIER-LNM-21-001/002/101._
## Goals
- Immutable storage of raw advisory observations per source/tenant.
- Deterministic linksets built from observations without merging or mutating originals.
- Stable across online/offline deployments; replayable from raw inputs.
## Status
- Frozen v1 as of 2025-11-17; further schema changes must go through ADR + sprint gating (CONCELIER-LNM-22x+).
## Observation document (Mongo JSON Schema excerpt)
```json
{
"bsonType": "object",
"required": ["_id","tenantId","source","advisoryId","affected","provenance","ingestedAt"],
"properties": {
"_id": {"bsonType": "objectId"},
"tenantId": {"bsonType": "string"},
"source": {"bsonType": "string", "description": "Adapter id, e.g., ghsa, nvd, cert-bund"},
"advisoryId": {"bsonType": "string"},
"title": {"bsonType": "string"},
"summary": {"bsonType": "string"},
"severities": {
"bsonType": "array",
"items": {"bsonType": "object", "required": ["system","score"],
"properties": {"system":{"bsonType":"string"},"score":{"bsonType":"double"},"vector":{"bsonType":"string"}}}
},
"affected": {
"bsonType": "array",
"items": {"bsonType":"object","required":["purl"],
"properties": {
"purl": {"bsonType":"string"},
"package": {"bsonType":"string"},
"versions": {"bsonType":"array","items":{"bsonType":"string"}},
"ranges": {"bsonType":"array","items":{"bsonType":"object",
"required":["type","events"],
"properties": {"type":{"bsonType":"string"},"events":{"bsonType":"array","items":{"bsonType":"object"}}}}},
"ecosystem": {"bsonType":"string"},
"cpe": {"bsonType":"array","items":{"bsonType":"string"}},
"cpes": {"bsonType":"array","items":{"bsonType":"string"}}
}
}
},
"references": {"bsonType": "array", "items": {"bsonType":"string"}},
"scopes": {"bsonType":"array","items":{"bsonType":"string"}},
"relationships": {
"bsonType": "array",
"items": {"bsonType":"object","required":["type","source","target"],
"properties": {
"type":{"bsonType":"string"},
"source":{"bsonType":"string"},
"target":{"bsonType":"string"},
"provenance":{"bsonType":"string"}
}}
},
"weaknesses": {"bsonType":"array","items":{"bsonType":"string"}},
"published": {"bsonType": "date"},
"modified": {"bsonType": "date"},
"provenance": {
"bsonType": "object",
"required": ["sourceArtifactSha","fetchedAt"],
"properties": {
"sourceArtifactSha": {"bsonType":"string"},
"fetchedAt": {"bsonType":"date"},
"ingestJobId": {"bsonType":"string"},
"signature": {"bsonType":"object"}
}
},
"ingestedAt": {"bsonType": "date"}
}
}
```
### Observation invariants
- **Immutable:** no in-place updates; new revision → new document with `supersedesId` optional pointer.
- **Deterministic keying:** `_id` derived from `hash(tenantId|source|advisoryId|provenance.sourceArtifactSha)` to keep inserts idempotent in replay.
- **Normalization guardrails:** version ranges must be stored as raw-from-source; no inferred merges.
## Linkset document
```json
{
"bsonType":"object",
"required":["_id","tenantId","advisoryId","source","observations","createdAt"],
"properties":{
"_id":{"bsonType":"objectId"},
"tenantId":{"bsonType":"string"},
"advisoryId":{"bsonType":"string"},
"source":{"bsonType":"string"},
"observations":{"bsonType":"array","items":{"bsonType":"objectId"}},
"normalized": {
"bsonType":"object",
"properties":{
"purls":{"bsonType":"array","items":{"bsonType":"string"}},
"versions":{"bsonType":"array","items":{"bsonType":"string"}},
"ranges": {"bsonType":"array","items":{"bsonType":"object"}},
"severities": {"bsonType":"array","items":{"bsonType":"object"}}
}
},
"confidence": {"bsonType":"double", "description":"Optional correlation confidence (01)"},
"conflicts": {"bsonType":"array","items":{"bsonType":"object",
"required":["field","reason"],
"properties":{
"field":{"bsonType":"string"},
"reason":{"bsonType":"string"},
"values":{"bsonType":"array","items":{"bsonType":"string"}},
"sourceIds":{"bsonType":"array","items":{"bsonType":"string"}}
}}},
"createdAt":{"bsonType":"date"},
"builtByJobId":{"bsonType":"string"},
"provenance": {"bsonType":"object","properties":{
"observationHashes":{"bsonType":"array","items":{"bsonType":"string"}},
"toolVersion" : {"bsonType":"string"},
"policyHash" : {"bsonType":"string"}
}}
}
}
```
### Linkset invariants
- Built from a set of observation IDs; never overwrites observations.
- Carries the hash list of source observations for audit/replay.
- Deterministic sort: observations sorted by `source, advisoryId, fetchedAt` before hashing.
- Conflicts are additive only and now carry optional `sourceIds[]` to trace which upstream sources produced divergent values.
## Indexes (Mongo)
- Observations: `{ tenantId:1, source:1, advisoryId:1, provenance.fetchedAt:-1 }` (compound for ingest); `{ provenance.sourceArtifactSha:1 }` unique to avoid dup writes.
- Linksets: `{ tenantId:1, advisoryId:1, source:1 }` unique; `{ observations:1 }` sparse for reverse lookups.
## Collections
- `advisory_observations` — raw per-source docs (immutable).
- `advisory_linksets` — derived normalized aggregates with observation pointers and hashes.
## Determinism & replay
- Replay rebuild: order observations by fetchedAt, recompute linkset hash list, ensure byte-identical linkset JSON.
- All timestamps UTC ISO-8601; no server-local time.
- String normalization: lowercase `source`, trim/normalize PURLs, stable sort arrays.
## Sample documents
See `docs/samples/lnm/observation-ghsa.json` and `docs/samples/lnm/linkset-ghsa.json` (added with this draft) for concrete payloads.
## Approval path
1) Architecture + Concelier Core review this document.
2) If accepted, freeze JSON Schema and roll into `src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo` migrations.
3) Update consumers (policy/CLI/export) to read from linksets only; deprecate Merge endpoints.
---
Tracking: CONCELIER-LNM-21-001/002/101; Sprint 110 blockers (Concelier/Excititor waves).