git.stella-ops.org/docs/modules/concelier/link-not-merge-schema.md

Link-Not-Merge (LNM) Observation & Linkset Schema

Frozen v1 (add-only) — approved 2025-11-17 for CONCELIER-LNM-21-001/002/101.

Goals

  • Immutable storage of raw advisory observations per source/tenant.
  • Deterministic linksets built from observations without merging or mutating originals.
  • Stable across online/offline deployments; replayable from raw inputs.

Status

  • Frozen v1 as of 2025-11-17; further schema changes must go through ADR + sprint gating (CONCELIER-LNM-22x+).
  • Canonical JSON Schemas + signed manifest live in docs/modules/concelier/schemas/ (advisory observation, linkset, offline bundle). Verify with openssl dgst -sha256 -verify schema-signing-pub.pem -signature schema.manifest.sig schema.manifest.json.

Observation document (Mongo JSON Schema excerpt)

{
  "bsonType": "object",
  "required": ["_id","tenantId","source","advisoryId","affected","provenance","ingestedAt"],
  "properties": {
    "_id": {"bsonType": "objectId"},
    "tenantId": {"bsonType": "string"},
    "source": {"bsonType": "string", "description": "Adapter id, e.g., ghsa, nvd, cert-bund"},
    "advisoryId": {"bsonType": "string"},
    "title": {"bsonType": "string"},
    "summary": {"bsonType": "string"},
    "severities": {
      "bsonType": "array",
      "items": {"bsonType": "object", "required": ["system","score"],
        "properties": {"system":{"bsonType":"string"},"score":{"bsonType":"double"},"vector":{"bsonType":"string"}}}
    },
    "affected": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["purl"],
        "properties": {
          "purl": {"bsonType":"string"},
          "package": {"bsonType":"string"},
          "versions": {"bsonType":"array","items":{"bsonType":"string"}},
          "ranges": {"bsonType":"array","items":{"bsonType":"object",
            "required":["type","events"],
            "properties": {"type":{"bsonType":"string"},"events":{"bsonType":"array","items":{"bsonType":"object"}}}}},
          "ecosystem": {"bsonType":"string"},
          "cpe": {"bsonType":"array","items":{"bsonType":"string"}},
          "cpes": {"bsonType":"array","items":{"bsonType":"string"}}
        }
      }
    },
    "references": {"bsonType": "array", "items": {"bsonType":"string"}},
    "scopes": {"bsonType":"array","items":{"bsonType":"string"}},
    "relationships": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["type","source","target"],
        "properties": {
          "type":{"bsonType":"string"},
          "source":{"bsonType":"string"},
          "target":{"bsonType":"string"},
          "provenance":{"bsonType":"string"}
        }}
    },
    "weaknesses": {"bsonType":"array","items":{"bsonType":"string"}},
    "published": {"bsonType": "date"},
    "modified": {"bsonType": "date"},
    "provenance": {
      "bsonType": "object",
      "required": ["sourceArtifactSha","fetchedAt"],
      "properties": {
        "sourceArtifactSha": {"bsonType":"string"},
        "fetchedAt": {"bsonType":"date"},
        "ingestJobId": {"bsonType":"string"},
        "signature": {"bsonType":"object"}
      }
    },
    "ingestedAt": {"bsonType": "date"}
  }
}
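As a rough illustration of the required fields in the excerpt above, the following Python sketch reports which required keys are absent from an observation-shaped dict. The helper name `missing_required_fields` is hypothetical and is not part of Concelier; the sketch checks only key presence, not BSON types, and does not replace the schema.

```python
# Required keys copied from the schema excerpt above.
OBSERVATION_REQUIRED = ("_id", "tenantId", "source", "advisoryId",
                        "affected", "provenance", "ingestedAt")
PROVENANCE_REQUIRED = ("sourceArtifactSha", "fetchedAt")


def missing_required_fields(observation: dict) -> list[str]:
    """Return dotted paths of required fields absent from the document."""
    missing = [key for key in OBSERVATION_REQUIRED if key not in observation]
    provenance = observation.get("provenance")
    if isinstance(provenance, dict):
        missing += [f"provenance.{key}" for key in PROVENANCE_REQUIRED
                    if key not in provenance]
    # Every entry in "affected" must carry a purl.
    for index, entry in enumerate(observation.get("affected", [])):
        if "purl" not in entry:
            missing.append(f"affected[{index}].purl")
    return missing
```

An empty list means every required key in the excerpt is present.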

Observation invariants

  • Immutable: no in-place updates; a new revision produces a new document with an optional supersedesId pointer to the previous one.
  • Deterministic keying: _id derived from hash(tenantId|source|advisoryId|provenance.sourceArtifactSha) to keep inserts idempotent in replay.
  • Normalization guardrails: version ranges must be stored as raw-from-source; no inferred merges.
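The deterministic keying rule can be sketched as follows. The document only states hash(tenantId|source|advisoryId|provenance.sourceArtifactSha); the choice of SHA-256, UTF-8 encoding, and hex output here is an assumption for illustration, and `observation_key` is a hypothetical helper name.

```python
import hashlib


def observation_key(tenant_id: str, source: str, advisory_id: str,
                    source_artifact_sha: str) -> str:
    """Derive a stable key from the four identity fields.

    Assumption: SHA-256 over the UTF-8 bytes of the '|'-joined fields,
    rendered as lowercase hex. The schema text does not name the hash.
    """
    material = "|".join((tenant_id, source, advisory_id, source_artifact_sha))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Because the key depends only on these four inputs, replaying the same raw artifact yields the same key, which is what keeps inserts idempotent.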

Append-Only Contract (AOC) — LNM-21-004

The AOC ensures that observations are immutable after creation. It is enforced by IAdvisoryObservationWriteGuard.

Write disposition rules

Existing Hash | New Hash      | Disposition    | Action
null/empty    | any           | Proceed        | Insert new observation
X             | X (identical) | SkipIdentical  | Idempotent re-insert, no write
X             | Y (different) | RejectMutation | Reject with AppendOnlyViolationException
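The three dispositions can be sketched in Python as below. This is an illustration of the rules only, not the C# guard itself: `validate_write`, `write_observation`, and the in-memory `store` dict are hypothetical stand-ins, and exact string equality of hashes is assumed.

```python
from enum import Enum
from typing import Optional


class Disposition(Enum):
    PROCEED = "Proceed"
    SKIP_IDENTICAL = "SkipIdentical"
    REJECT_MUTATION = "RejectMutation"


class AppendOnlyViolation(Exception):
    """Python stand-in for AppendOnlyViolationException."""


def validate_write(new_hash: str, existing_hash: Optional[str]) -> Disposition:
    """Apply the write disposition rules to a pair of content hashes."""
    if not existing_hash:                 # null/empty: nothing stored yet
        return Disposition.PROCEED
    if existing_hash == new_hash:         # identical content: idempotent replay
        return Disposition.SKIP_IDENTICAL
    return Disposition.REJECT_MUTATION    # different content under the same key


def write_observation(store: dict, key: str, new_hash: str) -> Disposition:
    """Insert only when allowed; `store` maps key -> content hash (assumption)."""
    disposition = validate_write(new_hash, store.get(key))
    if disposition is Disposition.REJECT_MUTATION:
        raise AppendOnlyViolation(f"observation {key} already exists with different content")
    if disposition is Disposition.PROCEED:
        store[key] = new_hash
    return disposition
```

Note that SkipIdentical performs no write at all, so a replayed ingest leaves the store untouched.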

Supersession model

When an advisory source publishes a revised version of an advisory:

  1. A new observation is created with its own unique observationId and contentHash.
  2. The new observation MAY carry a supersedesId pointing to the previous observation.
  3. The original observation remains immutable — it is never updated or deleted.
  4. Linksets are rebuilt to include all non-superseded observations; superseded observations remain queryable for audit but are excluded from active linkset aggregation.
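A minimal sketch of how the active set might be selected under this model, assuming each observation is a dict carrying the observationId and optional supersedesId fields named above. The function name `active_observations` is hypothetical.

```python
def active_observations(observations: list[dict]) -> list[dict]:
    """Return observations that no other observation supersedes.

    Superseded documents are left untouched in the input; they are only
    filtered out of the result used for linkset aggregation.
    """
    superseded_ids = {
        obs["supersedesId"] for obs in observations if obs.get("supersedesId")
    }
    return [obs for obs in observations
            if obs["observationId"] not in superseded_ids]
```

The input list is never mutated, which mirrors the rule that the original observation is neither updated nor deleted.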

Implementation checklist (LNM-21-004)

  • IAdvisoryObservationWriteGuard interface with ValidateWrite(observation, existingContentHash) method.
  • AdvisoryObservationWriteGuard implementation enforcing append-only semantics.
  • AppendOnlyViolationException for mutation rejections.
  • DI registration via AddConcelierAocGuards() extension.
  • Unit tests covering Proceed/SkipIdentical/RejectMutation scenarios.
  • Legacy merge logic deprecated with [Obsolete] and gated by NoMergeEnabled feature flag (defaults to true).
  • Roslyn analyzer StellaOps.Concelier.Analyzers.NoMergeApiAnalyzer emits warnings for merge API usage.

Linkset document

{
  "bsonType":"object",
  "required":["_id","tenantId","advisoryId","source","observations","createdAt"],
  "properties":{
    "_id":{"bsonType":"objectId"},
    "tenantId":{"bsonType":"string"},
    "advisoryId":{"bsonType":"string"},
    "source":{"bsonType":"string"},
    "observations":{"bsonType":"array","items":{"bsonType":"objectId"}},
    "normalized": {
      "bsonType":"object",
      "properties":{
        "purls":{"bsonType":"array","items":{"bsonType":"string"}},
        "versions":{"bsonType":"array","items":{"bsonType":"string"}},
        "ranges": {"bsonType":"array","items":{"bsonType":"object"}},
        "severities": {"bsonType":"array","items":{"bsonType":"object"}}
      }
    },
    "confidence": {"bsonType":"double", "description":"Optional correlation confidence (0-1)"},
    "conflicts": {"bsonType":"array","items":{"bsonType":"object",
      "required":["field","reason"],
      "properties":{
        "field":{"bsonType":"string"},
        "reason":{"bsonType":"string"},
        "values":{"bsonType":"array","items":{"bsonType":"string"}},
        "sourceIds":{"bsonType":"array","items":{"bsonType":"string"}}
      }}},
    "createdAt":{"bsonType":"date"},
    "builtByJobId":{"bsonType":"string"},
    "provenance": {"bsonType":"object","properties":{
      "observationHashes":{"bsonType":"array","items":{"bsonType":"string"}},
      "toolVersion" : {"bsonType":"string"},
      "policyHash" : {"bsonType":"string"}
    }}
  }
}

Linkset invariants

  • Built from a set of observation IDs; never overwrites observations.
  • Carries the hash list of source observations for audit/replay.
  • Deterministic sort: observations sorted by source, advisoryId, fetchedAt before hashing.
  • Conflicts are additive only and now carry optional sourceIds[] to trace which upstream sources produced divergent values.
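The deterministic sort and hash list can be sketched as follows. It assumes each observation dict exposes source, advisoryId, provenance.fetchedAt (as a uniformly formatted UTC ISO-8601 string or a datetime), and the contentHash mentioned in the supersession model. The digest over the list is an assumption (SHA-256 of the newline-joined hashes); the document does not specify how the list itself is hashed.

```python
import hashlib


def ordered_observation_hashes(observations: list[dict]) -> list[str]:
    """Sort by (source, advisoryId, fetchedAt), then return content hashes."""
    ordered = sorted(
        observations,
        key=lambda o: (o["source"], o["advisoryId"], o["provenance"]["fetchedAt"]),
    )
    return [o["contentHash"] for o in ordered]


def hash_list_digest(hashes: list[str]) -> str:
    """Assumption: SHA-256 over the newline-joined, already ordered hash list."""
    return hashlib.sha256("\n".join(hashes).encode("utf-8")).hexdigest()
```

Sorting before hashing means the result does not depend on the order in which observations were read from storage.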

Indexes (Mongo)

  • Observations: { tenantId:1, source:1, advisoryId:1, provenance.fetchedAt:-1 } (compound for ingest); { provenance.sourceArtifactSha:1 } unique to avoid duplicate writes.
  • Linksets: { tenantId:1, advisoryId:1, source:1 } unique; { observations:1 } sparse for reverse lookups.
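For illustration, the index definitions above could be expressed as data and applied through any collection object that offers a PyMongo-style `create_index(keys, **options)` method. The names `OBSERVATION_INDEXES`, `LINKSET_INDEXES`, and `ensure_indexes` are hypothetical; the actual migrations live in the Storage.Mongo library and may differ.

```python
# (keys, options) pairs mirroring the index list above.
OBSERVATION_INDEXES = [
    ([("tenantId", 1), ("source", 1), ("advisoryId", 1),
      ("provenance.fetchedAt", -1)], {}),
    ([("provenance.sourceArtifactSha", 1)], {"unique": True}),
]

LINKSET_INDEXES = [
    ([("tenantId", 1), ("advisoryId", 1), ("source", 1)], {"unique": True}),
    ([("observations", 1)], {"sparse": True}),
]


def ensure_indexes(collection, specs) -> list:
    """Create each index; `collection` needs a create_index(keys, **options) method."""
    return [collection.create_index(keys, **options) for keys, options in specs]
```

With PyMongo this would be called as `ensure_indexes(db["advisory_observations"], OBSERVATION_INDEXES)`, where `db` is an assumed database handle.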

Collections

  • advisory_observations — raw per-source docs (immutable).
  • advisory_linksets — derived normalized aggregates with observation pointers and hashes.

Determinism & replay

  • Replay rebuild: order observations by fetchedAt, recompute linkset hash list, ensure byte-identical linkset JSON.
  • All timestamps UTC ISO-8601; no server-local time.
  • String normalization: lowercase source, trim/normalize PURLs, stable sort arrays.
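The normalization and byte-identical output rules can be sketched as below. This is one possible reading, with assumptions: PURL normalization is reduced to whitespace trimming, canonical JSON is taken to mean sorted keys with compact separators, and timestamps use a "Z" suffix. `utc_iso` and `canonical_linkset_json` are hypothetical helper names.

```python
import json
from datetime import datetime, timezone


def utc_iso(ts: datetime) -> str:
    """Render a timezone-aware datetime as UTC ISO-8601 with a 'Z' suffix."""
    if ts.tzinfo is None:
        raise ValueError("naive datetime: an explicit timezone is required")
    return ts.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


def canonical_linkset_json(linkset: dict) -> str:
    """Serialize a linkset so equal content yields identical bytes."""
    doc = dict(linkset)
    doc["source"] = doc["source"].lower()          # lowercase source
    if "normalized" in doc:
        inner = dict(doc["normalized"])
        if "purls" in inner:
            # Trim and sort PURLs; fuller PURL normalization is out of scope here.
            inner["purls"] = sorted(p.strip() for p in inner["purls"])
        doc["normalized"] = inner
    # Sorted keys and fixed separators remove dict-order and spacing variance.
    return json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
```

Rejecting naive datetimes is a deliberate choice in this sketch: it prevents server-local time from entering a document silently.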

Sample documents

See docs/samples/lnm/observation-ghsa.json and docs/samples/lnm/linkset-ghsa.json (added with this draft) for concrete payloads.

Approval path

  1. Architecture + Concelier Core review this document.
  2. If accepted, freeze JSON Schema and roll into src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo migrations.
  3. Update consumers (policy/CLI/export) to read from linksets only; deprecate Merge endpoints.

Tracking: CONCELIER-LNM-21-001/002/101; Sprint 110 blockers (Concelier/Excititor waves).