Link-Not-Merge (LNM) Observation & Linkset Schema

Frozen v1 (add-only) — approved 2025-11-17 for CONCELIER-LNM-21-001/002/101.

Goals

  • Immutable storage of raw advisory observations per source/tenant.
  • Deterministic linksets built from observations without merging or mutating originals.
  • Stable across online/offline deployments; replayable from raw inputs.

Status

  • Frozen v1 as of 2025-11-17; further schema changes must go through ADR + sprint gating (CONCELIER-LNM-22x+).
  • Canonical JSON Schemas + signed manifest live in docs/modules/concelier/schemas/ (advisory observation, linkset, offline bundle). Verify with openssl dgst -sha256 -verify schema-signing-pub.pem -signature schema.manifest.sig schema.manifest.json.

Observation document (PostgreSQL JSON Schema excerpt)

{
  "bsonType": "object",
  "required": ["_id","tenantId","source","advisoryId","affected","provenance","ingestedAt"],
  "properties": {
    "_id": {"bsonType": "objectId"},
    "tenantId": {"bsonType": "string"},
    "source": {"bsonType": "string", "description": "Adapter id, e.g., ghsa, nvd, cert-de"},
    "advisoryId": {"bsonType": "string"},
    "title": {"bsonType": "string"},
    "summary": {"bsonType": "string"},
    "severities": {
      "bsonType": "array",
      "items": {"bsonType": "object", "required": ["system","score"],
        "properties": {"system":{"bsonType":"string"},"score":{"bsonType":"double"},"vector":{"bsonType":"string"}}}
    },
    "affected": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["purl"],
        "properties": {
          "purl": {"bsonType":"string"},
          "package": {"bsonType":"string"},
          "versions": {"bsonType":"array","items":{"bsonType":"string"}},
          "ranges": {"bsonType":"array","items":{"bsonType":"object",
            "required":["type","events"],
            "properties": {"type":{"bsonType":"string"},"events":{"bsonType":"array","items":{"bsonType":"object"}}}}},
          "ecosystem": {"bsonType":"string"},
          "cpe": {"bsonType":"array","items":{"bsonType":"string"}},
          "cpes": {"bsonType":"array","items":{"bsonType":"string"}}
        }
      }
    },
    "references": {"bsonType": "array", "items": {"bsonType":"string"}},
    "scopes": {"bsonType":"array","items":{"bsonType":"string"}},
    "relationships": {
      "bsonType": "array",
      "items": {"bsonType":"object","required":["type","source","target"],
        "properties": {
          "type":{"bsonType":"string"},
          "source":{"bsonType":"string"},
          "target":{"bsonType":"string"},
          "provenance":{"bsonType":"string"}
        }}
    },
    "weaknesses": {"bsonType":"array","items":{"bsonType":"string"}},
    "published": {"bsonType": "date"},
    "modified": {"bsonType": "date"},
    "provenance": {
      "bsonType": "object",
      "required": ["sourceArtifactSha","fetchedAt"],
      "properties": {
        "sourceArtifactSha": {"bsonType":"string"},
        "fetchedAt": {"bsonType":"date"},
        "ingestJobId": {"bsonType":"string"},
        "signature": {"bsonType":"object"}
      }
    },
    "ingestedAt": {"bsonType": "date"}
  }
}

Observation invariants

  • Immutable: no in-place updates; a new revision becomes a new document, optionally carrying a supersedesId pointer to the observation it replaces.
  • Deterministic keying: _id derived from hash(tenantId|source|advisoryId|provenance.sourceArtifactSha) to keep inserts idempotent in replay.
  • Normalization guardrails: version ranges must be stored as raw-from-source; no inferred merges.
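The deterministic keying rule can be sketched as follows. This is a minimal illustration rather than the Concelier implementation: the document does not name the hash function or the encoding, so SHA-256 over the UTF-8 bytes of the `|`-joined fields is an assumption, and `observation_key` is a hypothetical helper name. How the digest maps onto the stored _id type is not covered here.

```python
import hashlib

def observation_key(tenant_id: str, source: str, advisory_id: str,
                    source_artifact_sha: str) -> str:
    """Derive a stable key from the four identifying fields.

    Assumption: SHA-256 over 'tenantId|source|advisoryId|sourceArtifactSha'
    encoded as UTF-8. The schema document does not specify the hash.
    """
    material = "|".join([tenant_id, source, advisory_id, source_artifact_sha])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

# Placeholder values, for illustration only.
first = observation_key("tenant-a", "ghsa", "GHSA-xxxx-yyyy-zzzz", "abc123")
replay = observation_key("tenant-a", "ghsa", "GHSA-xxxx-yyyy-zzzz", "abc123")

# Replaying the same raw input yields the same key, so the insert is idempotent.
assert first == replay
```

A revised upstream artifact has a different sourceArtifactSha, so it produces a different key and lands as a new document instead of overwriting the old one.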

Append-Only Contract (AOC) — LNM-21-004

The AOC ensures observations are immutable after creation: a stored observation can be re-inserted unchanged or superseded by a new document, but never modified. This is enforced by IAdvisoryObservationWriteGuard.

Write disposition rules

| Existing hash | New hash | Disposition | Action |
|---|---|---|---|
| null/empty | any | Proceed | Insert new observation |
| X | X (identical) | SkipIdentical | Idempotent re-insert, no write |
| X | Y (different) | RejectMutation | Reject with AppendOnlyViolationException |
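The disposition rules amount to a three-way comparison of content hashes. The sketch below mirrors them in Python for illustration only; the real guard is the IAdvisoryObservationWriteGuard interface named above, and `validate_write`, `write`, and the in-memory `store` dictionary are hypothetical stand-ins. Exact string comparison of hashes is also an assumption.

```python
from enum import Enum
from typing import Dict, Optional

class Disposition(Enum):
    PROCEED = "Proceed"
    SKIP_IDENTICAL = "SkipIdentical"
    REJECT_MUTATION = "RejectMutation"

class AppendOnlyViolation(Exception):
    """Stand-in for AppendOnlyViolationException."""

def validate_write(new_hash: str, existing_hash: Optional[str]) -> Disposition:
    """Apply the write disposition rules to a pair of content hashes."""
    if not existing_hash:            # null or empty: nothing stored yet
        return Disposition.PROCEED
    if existing_hash == new_hash:    # identical content: idempotent re-insert
        return Disposition.SKIP_IDENTICAL
    return Disposition.REJECT_MUTATION

def write(store: Dict[str, str], key: str, new_hash: str) -> Disposition:
    """Write to a hypothetical key -> contentHash store, honoring the guard."""
    disposition = validate_write(new_hash, store.get(key))
    if disposition is Disposition.REJECT_MUTATION:
        raise AppendOnlyViolation(f"observation {key} already exists with different content")
    if disposition is Disposition.PROCEED:
        store[key] = new_hash
    return disposition
```

Keeping the decision (`validate_write`) separate from the side effect (`write`) makes each of the three rows testable without a database.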

Supersession model

When an advisory source publishes a revised version of an advisory:

  1. A new observation is created with its own unique observationId and contentHash.
  2. The new observation MAY carry a supersedesId pointing to the previous observation.
  3. The original observation remains immutable — it is never updated or deleted.
  4. Linksets are rebuilt to include all non-superseded observations; superseded observations remain queryable for audit but are excluded from active linkset aggregation.
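Step 4 can be illustrated with a small filter that drops every observation some other observation points at through supersedesId. This is a sketch under assumptions: the `Observation` dataclass and the `active_observations` helper are hypothetical, and only the three fields named in the steps above are modeled.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass(frozen=True)  # frozen mirrors the immutability of stored observations
class Observation:
    observation_id: str
    content_hash: str
    supersedes_id: Optional[str] = None

def active_observations(observations: Iterable[Observation]) -> List[Observation]:
    """Return observations that no other observation supersedes, in input order."""
    all_obs = list(observations)
    superseded = {o.supersedes_id for o in all_obs if o.supersedes_id is not None}
    return [o for o in all_obs if o.observation_id not in superseded]

history = [
    Observation("obs-1", "hash-a"),
    Observation("obs-2", "hash-b", supersedes_id="obs-1"),
    Observation("obs-3", "hash-c", supersedes_id="obs-2"),
    Observation("obs-4", "hash-d"),  # unrelated, never superseded
]
active = active_observations(history)
```

Here `active` holds obs-3 and obs-4, while obs-1 and obs-2 stay in `history` untouched and remain available for audit queries.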

Implementation checklist (LNM-21-004)

  • IAdvisoryObservationWriteGuard interface with ValidateWrite(observation, existingContentHash) method.
  • AdvisoryObservationWriteGuard implementation enforcing append-only semantics.
  • AppendOnlyViolationException for mutation rejections.
  • DI registration via AddConcelierAocGuards() extension.
  • Unit tests covering Proceed/SkipIdentical/RejectMutation scenarios.
  • Legacy merge logic deprecated with [Obsolete] and gated by NoMergeEnabled feature flag (defaults to true).
  • Roslyn analyzer StellaOps.Concelier.Analyzers.NoMergeApiAnalyzer emits warnings for merge API usage.

Linkset document

{
  "bsonType":"object",
  "required":["_id","tenantId","advisoryId","source","observations","createdAt"],
  "properties":{
    "_id":{"bsonType":"objectId"},
    "tenantId":{"bsonType":"string"},
    "advisoryId":{"bsonType":"string"},
    "source":{"bsonType":"string"},
    "observations":{"bsonType":"array","items":{"bsonType":"objectId"}},
    "normalized": {
      "bsonType":"object",
      "properties":{
        "purls":{"bsonType":"array","items":{"bsonType":"string"}},
        "versions":{"bsonType":"array","items":{"bsonType":"string"}},
        "ranges": {"bsonType":"array","items":{"bsonType":"object"}},
        "severities": {"bsonType":"array","items":{"bsonType":"object"}}
      }
    },
    "confidence": {"bsonType":"double", "description":"Optional correlation confidence (0-1)"},
    "conflicts": {"bsonType":"array","items":{"bsonType":"object",
      "required":["field","reason"],
      "properties":{
        "field":{"bsonType":"string"},
        "reason":{"bsonType":"string"},
        "values":{"bsonType":"array","items":{"bsonType":"string"}},
        "sourceIds":{"bsonType":"array","items":{"bsonType":"string"}}
      }}},
    "createdAt":{"bsonType":"date"},
    "builtByJobId":{"bsonType":"string"},
    "provenance": {"bsonType":"object","properties":{
      "observationHashes":{"bsonType":"array","items":{"bsonType":"string"}},
      "toolVersion": {"bsonType":"string"},
      "policyHash": {"bsonType":"string"}
    }}
  }
}

Linkset invariants

  • Built from a set of observation IDs; never overwrites observations.
  • Carries the hash list of source observations for audit/replay.
  • Deterministic sort: observations sorted by source, advisoryId, fetchedAt before hashing.
  • Conflicts are additive only and now carry optional sourceIds[] to trace which upstream sources produced divergent values.
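The deterministic sort can be sketched as below: whatever order the observations arrive in, the hash list comes out the same. This is an illustration, not the builder itself. `linkset_hash_list` is a hypothetical helper, the observations are plain dictionaries, `contentHash` is assumed to be the per-observation hash that gets recorded, and `fetchedAt` is assumed to be a uniformly formatted UTC ISO-8601 string so that string order matches time order.

```python
from typing import Dict, List, Tuple

def sort_key(obs: Dict[str, str]) -> Tuple[str, str, str]:
    """Order by source, then advisoryId, then fetchedAt."""
    return (obs["source"], obs["advisoryId"], obs["fetchedAt"])

def linkset_hash_list(observations: List[Dict[str, str]]) -> List[str]:
    """Return observation hashes in the deterministic order, without mutating input."""
    return [o["contentHash"] for o in sorted(observations, key=sort_key)]

# Placeholder observations, for illustration only.
obs = [
    {"source": "ghsa", "advisoryId": "ADV-1",
     "fetchedAt": "2025-11-02T00:00:00Z", "contentHash": "h2"},
    {"source": "ghsa", "advisoryId": "ADV-1",
     "fetchedAt": "2025-11-01T00:00:00Z", "contentHash": "h1"},
]

# Arrival order does not change the recorded hash list.
assert linkset_hash_list(obs) == linkset_hash_list(list(reversed(obs)))
```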

Indexes (PostgreSQL)

  • Observations: { tenantId:1, source:1, advisoryId:1, provenance.fetchedAt:-1 } (compound for ingest); { provenance.sourceArtifactSha:1 } unique to avoid dup writes.
  • Linksets: { tenantId:1, advisoryId:1, source:1 } unique; { observations:1 } sparse for reverse lookups.

Tables

  • advisory_observations — raw per-source docs (immutable).
  • advisory_linksets — derived normalized aggregates with observation pointers and hashes.

Determinism & replay

  • Replay rebuild: order observations by fetchedAt, recompute the linkset hash list, and confirm the rebuilt linkset JSON is byte-identical.
  • All timestamps are UTC ISO-8601; server-local time is never used.
  • String normalization: lowercase source, trim/normalize PURLs, stable sort arrays.
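A rough sketch of what these rules imply for serialization follows. It is not the Concelier canonicalizer: `to_utc_iso`, `normalize_linkset`, and `canonical_json` are hypothetical helpers, PURL handling is reduced to trimming whitespace, and sorted-key compact JSON is an assumed canonical form that the document does not specify.

```python
import json
from datetime import datetime, timedelta, timezone

def to_utc_iso(ts: datetime) -> str:
    """Render an aware datetime as UTC ISO-8601 with a 'Z' suffix."""
    if ts.tzinfo is None:
        raise ValueError("naive timestamps are ambiguous; attach a timezone first")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def normalize_linkset(linkset: dict) -> dict:
    """Lowercase the source and trim plus stable-sort PURLs; the input is not mutated."""
    out = dict(linkset)
    out["source"] = linkset["source"].lower()
    normalized = dict(linkset.get("normalized", {}))
    normalized["purls"] = sorted(p.strip() for p in normalized.get("purls", []))
    out["normalized"] = normalized
    return out

def canonical_json(doc: dict) -> bytes:
    """Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

# Two logically equal linksets that differ only in casing, whitespace, and order.
a = {"source": "GHSA", "normalized": {"purls": ["pkg:npm/b@1.0.0 ", "pkg:npm/a@2.0.0"]}}
b = {"normalized": {"purls": ["pkg:npm/a@2.0.0", " pkg:npm/b@1.0.0"]}, "source": "ghsa"}
assert canonical_json(normalize_linkset(a)) == canonical_json(normalize_linkset(b))

# A +03:00 local time is stored as its UTC equivalent.
local = datetime(2026, 4, 22, 16, 5, 53, tzinfo=timezone(timedelta(hours=3)))
utc_text = to_utc_iso(local)
```

Rejecting naive datetimes outright, instead of guessing a zone, is one way to keep server-local time from leaking into stored documents.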

Sample documents

See docs/modules/concelier/samples/observation-ghsa.json and docs/modules/concelier/samples/linkset-ghsa.json (added with this draft) for concrete payloads.

Approval path

  1. Architecture + Concelier Core review this document.
  2. If accepted, freeze JSON Schema and roll into src/Concelier/__Libraries/StellaOps.Concelier.Storage.Postgres migrations.
  3. Update consumers (policy/CLI/export) to read from linksets only; deprecate Merge endpoints.

Tracking: CONCELIER-LNM-21-001/002/101; Sprint 110 blockers (Concelier/Excititor waves).