
# Link-Not-Merge (LNM) Observation & Linkset Schema
_Frozen v1 (add-only) — approved 2025-11-17 for CONCELIER-LNM-21-001/002/101._
## Goals
- Immutable storage of raw advisory observations per source/tenant.
- Deterministic linksets built from observations without merging or mutating originals.
- Stable across online/offline deployments; replayable from raw inputs.
## Status
- Frozen v1 as of 2025-11-17; further schema changes must go through ADR + sprint gating (CONCELIER-LNM-22x+).
- Canonical JSON Schemas + signed manifest live in `docs/modules/concelier/schemas/` (advisory observation, linkset, offline bundle). Verify with `openssl dgst -sha256 -verify schema-signing-pub.pem -signature schema.manifest.sig schema.manifest.json`.
## Observation document (Mongo JSON Schema excerpt)
```json
{
  "bsonType": "object",
  "required": ["_id","tenantId","source","advisoryId","affected","provenance","ingestedAt"],
  "properties": {
    "_id": {"bsonType": "objectId"},
    "tenantId": {"bsonType": "string"},
    "source": {"bsonType": "string", "description": "Adapter id, e.g., ghsa, nvd, cert-bund"},
    "advisoryId": {"bsonType": "string"},
    "title": {"bsonType": "string"},
    "summary": {"bsonType": "string"},
    "severities": {
      "bsonType": "array",
      "items": {"bsonType": "object", "required": ["system","score"],
        "properties": {"system":{"bsonType":"string"},"score":{"bsonType":"double"},"vector":{"bsonType":"string"}}}
    },
    "affected": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["purl"],
        "properties": {
          "purl": {"bsonType":"string"},
          "package": {"bsonType":"string"},
          "versions": {"bsonType":"array","items":{"bsonType":"string"}},
          "ranges": {"bsonType":"array","items":{"bsonType":"object",
            "required":["type","events"],
            "properties": {"type":{"bsonType":"string"},"events":{"bsonType":"array","items":{"bsonType":"object"}}}}},
          "ecosystem": {"bsonType":"string"},
          "cpe": {"bsonType":"array","items":{"bsonType":"string"}},
          "cpes": {"bsonType":"array","items":{"bsonType":"string"}}
        }
      }
    },
    "references": {"bsonType": "array", "items": {"bsonType":"string"}},
    "scopes": {"bsonType":"array","items":{"bsonType":"string"}},
    "relationships": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["type","source","target"],
        "properties": {
          "type":{"bsonType":"string"},
          "source":{"bsonType":"string"},
          "target":{"bsonType":"string"},
          "provenance":{"bsonType":"string"}
        }}
    },
    "weaknesses": {"bsonType":"array","items":{"bsonType":"string"}},
    "published": {"bsonType": "date"},
    "modified": {"bsonType": "date"},
    "provenance": {
      "bsonType": "object",
      "required": ["sourceArtifactSha","fetchedAt"],
      "properties": {
        "sourceArtifactSha": {"bsonType":"string"},
        "fetchedAt": {"bsonType":"date"},
        "ingestJobId": {"bsonType":"string"},
        "signature": {"bsonType":"object"}
      }
    },
    "ingestedAt": {"bsonType": "date"}
  }
}
```
### Observation invariants
- **Immutable:** no in-place updates; a new revision is stored as a new document, optionally carrying a `supersedesId` pointer to the document it replaces.
- **Deterministic keying:** `_id` derived from `hash(tenantId|source|advisoryId|provenance.sourceArtifactSha)` to keep inserts idempotent in replay.
- **Normalization guardrails:** version ranges must be stored exactly as received from the source; ranges are never merged or inferred.
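
The deterministic keying rule can be illustrated with a minimal Python sketch. The invariant names only `hash(...)` over four fields; the choice of SHA-256, the UTF-8 encoding, and the function name `observation_key` are assumptions made for illustration. How the digest maps onto the `objectId` type declared in the schema is not specified here, so the sketch simply returns a hex string.

```python
import hashlib

def observation_key(tenant_id: str, source: str, advisory_id: str, source_artifact_sha: str) -> str:
    """Derive a stable key from the four fields named in the keying invariant."""
    # Fields are joined with "|" as written in the invariant.
    # SHA-256 and UTF-8 are assumptions; the document only says hash(...).
    material = "|".join([tenant_id, source, advisory_id, source_artifact_sha])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

# Replaying the same raw artifact yields the same key, so the insert is idempotent.
key_a = observation_key("tenant-a", "ghsa", "GHSA-xxxx-yyyy-zzzz", "abc123")
key_b = observation_key("tenant-a", "ghsa", "GHSA-xxxx-yyyy-zzzz", "abc123")
# A revised artifact (different sourceArtifactSha) yields a different key.
key_c = observation_key("tenant-a", "ghsa", "GHSA-xxxx-yyyy-zzzz", "def456")
```

Because the key depends only on stored inputs, a replay from raw artifacts addresses the same documents as the original ingest.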
## Aggregation-Only Contract (AOC): LNM-21-004
The Aggregation-Only Contract (AOC) ensures observations are immutable after creation: writes are append-only. This is enforced by `IAdvisoryObservationWriteGuard`.
### Write disposition rules
| Existing Hash | New Hash | Disposition | Action |
|--------------|----------|-------------|--------|
| null/empty | any | `Proceed` | Insert new observation |
| X | X (identical) | `SkipIdentical` | Idempotent re-insert, no write |
| X | Y (different) | `RejectMutation` | Reject with `AppendOnlyViolationException` |
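
The disposition table can be read as a small decision function. The following is a Python sketch of that logic only, not the actual `AdvisoryObservationWriteGuard`; `validate_write`, `guarded_insert`, `AppendOnlyViolationError`, and the in-memory `store` dict are hypothetical stand-ins, and exact string equality of hashes is an assumption.

```python
from enum import Enum
from typing import Optional

class WriteDisposition(Enum):
    PROCEED = "Proceed"
    SKIP_IDENTICAL = "SkipIdentical"
    REJECT_MUTATION = "RejectMutation"

class AppendOnlyViolationError(Exception):
    """Hypothetical Python stand-in for AppendOnlyViolationException."""

def validate_write(new_hash: str, existing_hash: Optional[str]) -> WriteDisposition:
    """Map (existing hash, new hash) to a disposition, row by row of the table."""
    if not existing_hash:               # null/empty: nothing stored yet
        return WriteDisposition.PROCEED
    if existing_hash == new_hash:       # identical content: idempotent re-insert
        return WriteDisposition.SKIP_IDENTICAL
    return WriteDisposition.REJECT_MUTATION

def guarded_insert(store: dict, key: str, new_hash: str) -> WriteDisposition:
    """Apply the guard before writing to a toy in-memory store (key -> content hash)."""
    disposition = validate_write(new_hash, store.get(key))
    if disposition is WriteDisposition.REJECT_MUTATION:
        raise AppendOnlyViolationError(f"observation {key} already exists with a different content hash")
    if disposition is WriteDisposition.PROCEED:
        store[key] = new_hash
    return disposition
```

Calling `guarded_insert` twice with the same key and hash writes once and then skips; a third call with a different hash raises and leaves the stored value unchanged.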
### Supersession model
When an advisory source publishes a revised version of an advisory:
1. A **new observation** is created with its own unique `observationId` and `contentHash`.
2. The new observation MAY carry a `supersedesId` pointing to the previous observation.
3. The **original observation remains immutable** — it is never updated or deleted.
4. Linksets are rebuilt to include all non-superseded observations; superseded observations remain queryable for audit but are excluded from active linkset aggregation.
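
Step 4 can be sketched as a filter over the stored observations. This is an illustrative Python sketch, assuming each observation is a dict carrying `observationId` and an optional `supersedesId`; the function name `active_observations` is hypothetical.

```python
def active_observations(observations: list[dict]) -> list[dict]:
    """Return observations that no other observation supersedes.

    Nothing is deleted or mutated: the input list still holds every
    observation for audit queries.
    """
    # Collect the ids that some newer observation points at.
    superseded_ids = {o["supersedesId"] for o in observations if o.get("supersedesId")}
    return [o for o in observations if o["observationId"] not in superseded_ids]

stored = [
    {"observationId": "obs-1"},
    {"observationId": "obs-2", "supersedesId": "obs-1"},  # revision of obs-1
    {"observationId": "obs-3"},
]
active = active_observations(stored)
```

Here `active` contains `obs-2` and `obs-3`, while `stored` still contains all three documents.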
### Implementation checklist (LNM-21-004)
- [x] `IAdvisoryObservationWriteGuard` interface with `ValidateWrite(observation, existingContentHash)` method.
- [x] `AdvisoryObservationWriteGuard` implementation enforcing append-only semantics.
- [x] `AppendOnlyViolationException` for mutation rejections.
- [x] DI registration via `AddConcelierAocGuards()` extension.
- [x] Unit tests covering Proceed/SkipIdentical/RejectMutation scenarios.
- [x] Legacy merge logic deprecated with `[Obsolete]` and gated by `NoMergeEnabled` feature flag (defaults to `true`).
- [x] Roslyn analyzer `StellaOps.Concelier.Analyzers.NoMergeApiAnalyzer` emits warnings for merge API usage.
## Linkset document
```json
{
  "bsonType":"object",
  "required":["_id","tenantId","advisoryId","source","observations","createdAt"],
  "properties":{
    "_id":{"bsonType":"objectId"},
    "tenantId":{"bsonType":"string"},
    "advisoryId":{"bsonType":"string"},
    "source":{"bsonType":"string"},
    "observations":{"bsonType":"array","items":{"bsonType":"objectId"}},
    "normalized": {
      "bsonType":"object",
      "properties":{
        "purls":{"bsonType":"array","items":{"bsonType":"string"}},
        "versions":{"bsonType":"array","items":{"bsonType":"string"}},
        "ranges": {"bsonType":"array","items":{"bsonType":"object"}},
        "severities": {"bsonType":"array","items":{"bsonType":"object"}}
      }
    },
    "confidence": {"bsonType":"double", "description":"Optional correlation confidence (0-1)"},
    "conflicts": {"bsonType":"array","items":{"bsonType":"object",
      "required":["field","reason"],
      "properties":{
        "field":{"bsonType":"string"},
        "reason":{"bsonType":"string"},
        "values":{"bsonType":"array","items":{"bsonType":"string"}},
        "sourceIds":{"bsonType":"array","items":{"bsonType":"string"}}
      }}},
    "createdAt":{"bsonType":"date"},
    "builtByJobId":{"bsonType":"string"},
    "provenance": {"bsonType":"object","properties":{
      "observationHashes":{"bsonType":"array","items":{"bsonType":"string"}},
      "toolVersion" : {"bsonType":"string"},
      "policyHash" : {"bsonType":"string"}
    }}
  }
}
```
### Linkset invariants
- Built from a set of observation IDs; never overwrites observations.
- Carries the hash list of source observations for audit/replay.
- Deterministic sort: observations sorted by `source, advisoryId, fetchedAt` before hashing.
- Conflicts are additive only and now carry optional `sourceIds[]` to trace which upstream sources produced divergent values.
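
The deterministic sort and hash list can be sketched in Python as follows. The observation dicts are flattened for brevity (`fetchedAt` stands for `provenance.fetchedAt`), `contentHash` is the per-observation hash mentioned in the supersession model, and using it as a final tie-breaker is an assumption added here so that equal sort keys cannot depend on input order.

```python
from datetime import datetime, timezone

def ordered_hash_list(observations):
    """Return content hashes ordered by (source, advisoryId, fetchedAt)."""
    ordered = sorted(
        observations,
        # contentHash as the last key is an assumed tie-breaker, not part of the stated invariant.
        key=lambda o: (o["source"], o["advisoryId"], o["fetchedAt"], o["contentHash"]),
    )
    return [o["contentHash"] for o in ordered]

def at(hour):
    """Helper for the sample data: a UTC timestamp on a fixed day."""
    return datetime(2025, 11, 17, hour, tzinfo=timezone.utc)

sample = [
    {"source": "nvd", "advisoryId": "CVE-2025-0001", "fetchedAt": at(9), "contentHash": "h3"},
    {"source": "ghsa", "advisoryId": "GHSA-xxxx-yyyy-zzzz", "fetchedAt": at(11), "contentHash": "h2"},
    {"source": "ghsa", "advisoryId": "GHSA-xxxx-yyyy-zzzz", "fetchedAt": at(10), "contentHash": "h1"},
]
hashes = ordered_hash_list(sample)
```

Any permutation of `sample` produces the same `hashes`, which is the property that keeps `provenance.observationHashes` stable across rebuilds.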
## Indexes (Mongo)
- Observations: `{ tenantId:1, source:1, advisoryId:1, "provenance.fetchedAt":-1 }` (compound for ingest); `{ "provenance.sourceArtifactSha":1 }` unique to avoid duplicate writes.
- Linksets: `{ tenantId:1, advisoryId:1, source:1 }` unique; `{ observations:1 }` sparse for reverse lookups.
## Collections
- `advisory_observations` — raw per-source docs (immutable).
- `advisory_linksets` — derived normalized aggregates with observation pointers and hashes.
## Determinism & replay
- Replay rebuild: order observations by `provenance.fetchedAt`, recompute the linkset hash list, and verify that the resulting linkset JSON is byte-identical.
- All timestamps UTC ISO-8601; no server-local time.
- String normalization: lowercase `source`, trim/normalize PURLs, stable sort arrays.
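
The rules above can be combined into a small canonicalization sketch in Python. It is illustrative only: sorted keys with compact separators is one assumed way to get byte-identical JSON, PURL handling is limited to trimming (full PURL normalization is not shown), and the names `to_utc_iso` and `canonical_linkset_bytes` are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

def to_utc_iso(ts: datetime) -> str:
    """Render a timezone-aware timestamp as UTC ISO-8601 with second precision."""
    if ts.tzinfo is None:
        raise ValueError("naive timestamps are rejected; server-local time is not allowed")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def canonical_linkset_bytes(linkset: dict) -> bytes:
    """Normalize a linkset-like dict and serialize it deterministically."""
    doc = dict(linkset)
    doc["source"] = doc["source"].strip().lower()             # lowercase source
    normalized = dict(doc.get("normalized", {}))
    normalized["purls"] = sorted(p.strip() for p in normalized.get("purls", []))  # trim, then sort
    doc["normalized"] = normalized
    doc["createdAt"] = to_utc_iso(doc["createdAt"])
    # Sorted keys and fixed separators make the output independent of dict insertion order.
    return json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

first = canonical_linkset_bytes({
    "source": " GHSA ",
    "normalized": {"purls": ["pkg:npm/b@1 ", "pkg:npm/a@1"]},
    "createdAt": datetime(2025, 11, 17, 12, 0, tzinfo=timezone(timedelta(hours=2))),
})
second = canonical_linkset_bytes({
    "createdAt": datetime(2025, 11, 17, 10, 0, tzinfo=timezone.utc),
    "normalized": {"purls": ["pkg:npm/a@1", "pkg:npm/b@1"]},
    "source": "ghsa",
})
```

The two inputs differ in key order, casing, whitespace, array order, and timezone offset, yet `first` and `second` are the same bytes, which is what a replay check would compare.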
## Sample documents
See `docs/samples/lnm/observation-ghsa.json` and `docs/samples/lnm/linkset-ghsa.json` (added with this draft) for concrete payloads.
## Approval path
1) Architecture + Concelier Core review this document.
2) If accepted, freeze JSON Schema and roll into `src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo` migrations.
3) Update consumers (policy/CLI/export) to read from linksets only; deprecate Merge endpoints.
---
Tracking: CONCELIER-LNM-21-001/002/101; Sprint 110 blockers (Concelier/Excititor waves).