Advisory Observations & Linksets
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
The Link-Not-Merge (LNM) initiative replaces the legacy "merge" pipeline with immutable observations and correlation linksets. This guide explains how Concelier ingests advisory statements, preserves upstream truth, and produces linksets that downstream services (Policy Engine, Vuln Explorer, Console) can use without collapsing sources together.
1. Model overview
1.1 Observation lifecycle
- Ingest – Connectors fetch upstream payloads (CSAF, OSV, vendor feeds), validate signatures, and drop any derived fields prohibited by the Aggregation-Only Contract (AOC).
- Persist – Concelier writes immutable advisory_observations scoped by tenant, (source.vendor, upstreamId), and contentHash. Supersedes chains capture revisions without mutating history.
- Expose – WebService surfaces paged/read APIs; Offline Kit snapshots include the same documents for air-gapped installs.
Observation schema highlights:
observationId = {tenant}:{source.vendor}:{upstreamId}:{revision}
tenant, source{vendor, stream, api, collectorVersion}
upstream{upstreamId, documentVersion, fetchedAt, receivedAt,
contentHash, signature{present, format, keyId, signature}}
content{format, specVersion, raw}
identifiers{cve?, ghsa?, aliases[], osvIds[]}
linkset{purls[], cpes[], aliases[], references[], conflicts[]?}
createdAt, attributes{batchId?, replayCursor?}
- Immutable raw (content.raw) mirrors upstream payloads exactly.
- Provenance (source.*, upstream.*) satisfies AOC guardrails and enables cryptographic attestations.
- Identifiers retain lossless extracts (CVE, GHSA, vendor aliases) that seed linksets.
- Linkset captures join hints but never merges or adds derived severity.
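The identifier format in the schema can be illustrated with a minimal Python sketch. The class name ObservationKey, the integer revision, and the use of SHA-256 with a "sha256:" prefix for contentHash are assumptions made for illustration; the schema above only fixes the observationId layout.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen mirrors the immutability of stored observations
class ObservationKey:
    """Hypothetical holder for the fields that make up an observationId."""
    tenant: str
    vendor: str
    upstream_id: str
    revision: int  # assumption: revisions are numbered; the schema does not say

    def observation_id(self) -> str:
        # Layout from the schema: {tenant}:{source.vendor}:{upstreamId}:{revision}
        return f"{self.tenant}:{self.vendor}:{self.upstream_id}:{self.revision}"


def content_hash(raw: bytes) -> str:
    # Assumption: SHA-256 over the untouched upstream bytes.
    return "sha256:" + hashlib.sha256(raw).hexdigest()


key = ObservationKey("tenant-a", "redhat", "RHSA-2025:1234", 1)
print(key.observation_id())
print(content_hash(b'{"document": "raw upstream payload"}'))
```

Hashing the raw bytes rather than a parsed form keeps the hash tied to exactly what upstream published, which is what the content.raw guarantee relies on.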
1.2 Linkset lifecycle
Linksets correlate observations that describe the same vulnerable product while keeping each source intact.
- Seed – Observations emit normalized identifiers (purl, cpe, alias) during ingestion.
- Correlate – Linkset builder groups observations by tenant, product coordinates, and equivalence signals (PURL alias graph, CVE overlap, CVSS vector equality, fuzzy titles).
- Annotate – Detected conflicts (severity disagreements, affected-range mismatch, incompatible references) are recorded with structured payloads and preserved for UI/API export.
- Persist – Results land in advisory_linksets with deterministic IDs (linksetId = {tenant}:{hash(aliases+purls+seedIds)}) and append-only history for reproducibility.
Linksets never suppress or prefer one source; they provide aligned evidence so other services can apply policy.
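The deterministic ID from the Persist step could be computed roughly as follows. This is a sketch under stated assumptions: the function name linkset_id, SHA-256 as the hash, and key-sorted compact JSON as the canonical form are all illustrative choices, since the guide only specifies linksetId = {tenant}:{hash(aliases+purls+seedIds)}.

```python
import hashlib
import json


def linkset_id(tenant, aliases, purls, seed_ids):
    """Hypothetical helper: derive a linkset ID that ignores input order."""
    # Sort and de-duplicate each list so arrival order never changes the hash.
    payload = {
        "aliases": sorted(set(aliases)),
        "purls": sorted(set(purls)),
        "seedIds": sorted(set(seed_ids)),
    }
    # Compact, key-sorted JSON stands in for the canonical serialization.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tenant}:{digest}"


first = linkset_id(
    "tenant-a",
    ["CVE-2025-0001", "RHSA-2025:1234"],
    ["pkg:rpm/redhat/openssl@1.1.1w-12"],
    ["tenant-a:redhat:RHSA-2025:1234:1", "tenant-a:nvd:CVE-2025-0001:1"],
)
second = linkset_id(
    "tenant-a",
    ["RHSA-2025:1234", "CVE-2025-0001"],  # same aliases, different order
    ["pkg:rpm/redhat/openssl@1.1.1w-12"],
    ["tenant-a:nvd:CVE-2025-0001:1", "tenant-a:redhat:RHSA-2025:1234:1"],
)
assert first == second  # rebuilding from the same signals yields the same ID
```

Because the ID depends only on the sorted signal sets, a rebuild from the same observations lands on the same document key, which supports the reproducibility goal.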
2. Observation vs. linkset
- Purpose
- Observation: Immutable record per vendor and revision.
- Linkset: Correlates observations that share product identity.
- Mutation
- Observation: Append-only via supersedes chain.
- Linkset: Rebuilt deterministically from canonical signals.
- Allowed fields
- Observation: Raw payload, provenance, identifiers, join hints.
- Linkset: Observation references, normalized product metadata, conflicts.
- Forbidden fields
- Observation: Derived severity, policy status, opinionated dedupe.
- Linkset: Derived severity (conflicts recorded but unresolved).
- Consumers
- Observation: Evidence API, Offline Kit, CLI exports.
- Linkset: Policy Engine overlay, UI evidence panel, Vuln Explorer.
2.1 Example sequence
- Red Hat PSIRT publishes RHSA-2025:1234 for OpenSSL; Concelier inserts an observation for vendor redhat with pkg:rpm/redhat/openssl@1.1.1w-12.
- NVD issues CVE-2025-0001; a second observation is inserted for vendor nvd.
- Linkset builder runs, groups the two observations, records alias and PURL overlap, and flags a CVSS disagreement (7.5 vs 7.2).
- Policy Engine reads the linkset, recognises the severity variance, and relies on configured rules to decide the effective output.
3. Conflict handling
Conflicts record disagreements without altering source payloads. The builder emits structured entries:
{
"type": "severity-mismatch",
"field": "cvss.baseScore",
"observations": [
{
"source": "redhat",
"value": "7.5",
"vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"
},
{
"source": "nvd",
"value": "7.2",
"vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"
}
],
"confidence": "medium",
"detectedAt": "2025-10-27T14:00:00Z"
}
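A conflict entry of this shape might be produced by a check like the one below. The function name, the flattened input fields (source, baseScore, vector), and the ordering by source name are assumptions for illustration. The confidence field is left out because this guide does not describe how it is derived.

```python
def detect_severity_mismatch(observations, detected_at):
    """Hypothetical check: return a conflict entry when base scores differ, else None."""
    scored = [o for o in observations if o.get("baseScore") is not None]
    if len({o["baseScore"] for o in scored}) < 2:
        return None  # zero or one distinct score means nothing to report
    return {
        "type": "severity-mismatch",
        "field": "cvss.baseScore",
        # Assumption: entries are ordered by source name to keep output deterministic.
        "observations": [
            {"source": o["source"], "value": o["baseScore"], "vector": o.get("vector")}
            for o in sorted(scored, key=lambda o: o["source"])
        ],
        "detectedAt": detected_at,
    }


inputs = [
    {"source": "redhat", "baseScore": "7.5", "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"},
    {"source": "nvd", "baseScore": "7.2", "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"},
]
conflict = detect_severity_mismatch(inputs, "2025-10-27T14:00:00Z")
print(conflict["type"], [entry["source"] for entry in conflict["observations"]])
```

The check reads the inputs and builds a new entry; it never edits the observations, in keeping with the rule that conflicts are recorded but left unresolved.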
Supported conflict classes:
- severity-mismatch – CVSS or qualitative severities differ.
- affected-range-divergence – Product ranges, fixed versions, or platforms disagree.
- statement-disagreement – One observation declares not_affected while another states affected.
- reference-clash – URL or classifier collisions (for example, exploit URL vs conflicting advisory).
- alias-inconsistency – Aliases map to different canonical IDs (GHSA vs CVE).
- metadata-gap – Required provenance missing on one source; logged as a warning.
Conflict surfaces:
- WebService endpoints (GET /advisories/linksets/{id} → conflicts[]).
- UI evidence panel chips and conflict badges.
- CLI exports (JSON/OSV) exposed through LNM commands.
- Observability metrics (advisory_linkset_conflicts_total{type}).
4. AOC alignment
Observations and linksets must satisfy Aggregation-Only Contract invariants:
- No derived severity – content.raw may include upstream severity, but the observation body never injects or edits severity.
- No merges – Each upstream document stays separate; linksets reference observations via deterministic IDs.
- Provenance mandatory – Missing signature or source metadata is an AOC violation (ERR_AOC_004).
- Idempotent writes – Duplicate contentHash yields a no-op; supersedes pointer captures new revisions.
- Deterministic output – Linkset builder sorts keys, normalizes timestamps (UTC ISO-8601), and uses canonical JSON hashing.
Violations trigger guard errors (ERR_AOC_00x), emit aoc_violation_total
metrics, and block persistence until corrected.
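The provenance and idempotency invariants can be sketched with an in-memory stand-in for the observation collection. AocViolation and ObservationStore are hypothetical names, the dictionaries replace real storage, and the per-tenant scoping of the duplicate check is an assumption; only the ERR_AOC_004 code and the no-op/supersedes behaviour come from the list above.

```python
class AocViolation(Exception):
    """Hypothetical guard error carrying an AOC violation code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


class ObservationStore:
    """In-memory stand-in for the advisory_observations collection."""

    def __init__(self):
        self._by_hash = {}  # (tenant, contentHash) -> stored observation
        self._latest = {}   # (tenant, vendor, upstreamId) -> newest observationId

    def write(self, obs):
        # Provenance mandatory: reject before anything is persisted.
        # A signature block with present=False still counts as metadata.
        if not obs.get("source") or not obs.get("upstream", {}).get("signature"):
            raise AocViolation("ERR_AOC_004", "missing source or signature metadata")
        tenant = obs["tenant"]
        hash_key = (tenant, obs["upstream"]["contentHash"])
        if hash_key in self._by_hash:
            return self._by_hash[hash_key]  # duplicate contentHash: no-op
        chain_key = (tenant, obs["source"]["vendor"], obs["upstream"]["upstreamId"])
        # Store a copy that points at the previous revision; the old one is untouched.
        stored = dict(obs, supersedes=self._latest.get(chain_key))
        self._by_hash[hash_key] = stored
        self._latest[chain_key] = stored["observationId"]
        return stored


def make_obs(revision, content_hash):
    return {
        "observationId": f"tenant-a:redhat:RHSA-2025:1234:{revision}",
        "tenant": "tenant-a",
        "source": {"vendor": "redhat"},
        "upstream": {
            "upstreamId": "RHSA-2025:1234",
            "contentHash": content_hash,
            "signature": {"present": False},
        },
    }


store = ObservationStore()
rev1 = store.write(make_obs(1, "sha256:aaa"))
replay = store.write(make_obs(1, "sha256:aaa"))  # same payload fetched again
rev2 = store.write(make_obs(2, "sha256:bbb"))    # upstream published a revision
print(rev1["supersedes"], replay is rev1, rev2["supersedes"])
```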
5. Downstream consumption
- Policy Engine – Computes effective severity and risk overlays from linkset evidence and conflicts.
- Console UI – Renders per-source statements, signed hashes, and conflict banners inside the evidence panel.
- CLI (stella advisories linkset …) – Exports observations and linksets as JSON or OSV for offline triage.
- Offline Kit – Shipping snapshots include observation and linkset collections for air-gap parity.
- Observability – Dashboards track ingestion latency, conflict counts, and supersedes depth.
When adding new consumers, ensure they honour append-only semantics and do not mutate observation or linkset collections.
6. Validation & testing
- Unit tests (StellaOps.Concelier.Core.Tests) validate schema guards, deterministic linkset hashing, conflict detection fixtures, and supersedes chains.
- Mongo integration tests (StellaOps.Concelier.Storage.Mongo.Tests) verify indexes and idempotent writes under concurrency.
- CLI smoke suites confirm stella advisories observations and stella advisories linksets export stable JSON.
- Determinism checks replay identical upstream payloads and assert that the resulting observation and linkset documents match byte for byte.
- Offline kit verification simulates air-gapped bootstrap to confirm that snapshots align with live data.
Add fixtures whenever a new conflict type or correlation signal is introduced. Ensure canonical JSON serialization remains stable across .NET runtime updates.
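A determinism check in the spirit of the replay tests above might look like the following Python sketch. The real suites are .NET projects, so this shows only the idea; normalize_timestamp, canonical_bytes, and build_linkset are hypothetical helpers, and the sketch assumes every input timestamp carries an explicit UTC offset or a Z suffix.

```python
import json
from datetime import datetime, timezone


def normalize_timestamp(value: str) -> str:
    # Re-emit an offset-aware ISO-8601 timestamp in UTC with a "Z" suffix.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def canonical_bytes(document: dict) -> bytes:
    # Key-sorted, compact JSON as a stand-in for canonical serialization.
    text = json.dumps(document, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def build_linkset(observation_ids, detected_at):
    # Toy builder: sorts references and normalizes the timestamp before output.
    return {
        "observations": sorted(observation_ids),
        "detectedAt": normalize_timestamp(detected_at),
    }


# Replay the same inputs with a different order and an equivalent local timestamp.
first = build_linkset(["obs-b", "obs-a"], "2025-10-27T16:00:00+02:00")
second = build_linkset(["obs-a", "obs-b"], "2025-10-27T14:00:00Z")
assert canonical_bytes(first) == canonical_bytes(second)  # byte-for-byte match
```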
7. Reviewer checklist
- Observation schema segment matches the latest StellaOps.Concelier.Models contract.
- Linkset lifecycle covers correlation signals, conflict classes, and deterministic IDs.
- AOC invariants are explicitly called out with violation codes.
- Examples include multi-source correlation plus conflict annotation.
- Downstream consumer guidance reflects active APIs and CLI features.
- Testing section lists required suites (Core, Storage, CLI, Offline).
- Imposed rule reminder is present at the top of the document.
Confirmed against Concelier Link-Not-Merge tasks:
CONCELIER-LNM-21-001..005, CONCELIER-LNM-21-101..103,
CONCELIER-LNM-21-201..203.