Advisory Observations & Linksets

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

The Link-Not-Merge (LNM) initiative replaces the legacy "merge" pipeline with immutable observations and correlation linksets. This guide explains how Concelier ingests advisory statements, preserves upstream truth, and produces linksets that downstream services (Policy Engine, Vuln Explorer, Console) can use without collapsing sources together.


1. Model overview

1.1 Observation lifecycle

  1. Ingest: Connectors fetch upstream payloads (CSAF, OSV, vendor feeds), validate signatures, and drop any derived fields prohibited by the Aggregation-Only Contract (AOC).
  2. Persist: Concelier writes immutable advisory_observations scoped by tenant, (source.vendor, upstreamId), and contentHash. Supersedes chains capture revisions without mutating history.
  3. Expose: WebService surfaces paged/read APIs; Offline Kit snapshots include the same documents for air-gapped installs.

Observation schema highlights:

observationId = {tenant}:{source.vendor}:{upstreamId}:{revision}
tenant, source{vendor, stream, api, collectorVersion}
upstream{upstreamId, documentVersion, fetchedAt, receivedAt,
         contentHash, signature{present, format, keyId, signature}}
content{format, specVersion, raw}
identifiers{cve?, ghsa?, aliases[], osvIds[]}
linkset{purls[], cpes[], aliases[], references[], conflicts[]?}
createdAt, attributes{batchId?, replayCursor?}

  • Immutable raw (content.raw) mirrors upstream payloads exactly.
  • Provenance (source.*, upstream.*) satisfies AOC guardrails and enables cryptographic attestations.
  • Identifiers retain lossless extracts (CVE, GHSA, vendor aliases) that seed linksets.
  • Linkset captures join hints but never merges or adds derived severity.
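
The ID format, duplicate handling, and supersedes chain described above can be sketched as follows. This is an illustration only, not Concelier's implementation: the `ObservationStore` class, the SHA-256 content hash, and the integer revision counter starting at 1 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def content_hash(raw: bytes) -> str:
    """Hash the upstream payload exactly as received (SHA-256 is an assumption)."""
    return "sha256:" + hashlib.sha256(raw).hexdigest()


@dataclass(frozen=True)  # frozen: an observation is never mutated after creation
class Observation:
    tenant: str
    vendor: str
    upstream_id: str
    revision: int
    content_hash: str
    raw: bytes
    supersedes: Optional[str] = None  # observationId of the previous revision

    @property
    def observation_id(self) -> str:
        # observationId = {tenant}:{source.vendor}:{upstreamId}:{revision}
        return f"{self.tenant}:{self.vendor}:{self.upstream_id}:{self.revision}"


class ObservationStore:
    """Hypothetical in-memory stand-in for the advisory_observations collection."""

    def __init__(self) -> None:
        self._chains: dict[tuple[str, str, str], list[Observation]] = {}

    def ingest(self, tenant: str, vendor: str, upstream_id: str, raw: bytes) -> Observation:
        chain = self._chains.setdefault((tenant, vendor, upstream_id), [])
        digest = content_hash(raw)
        for existing in chain:
            if existing.content_hash == digest:
                return existing  # duplicate contentHash: no-op
        previous = chain[-1] if chain else None
        observation = Observation(
            tenant=tenant,
            vendor=vendor,
            upstream_id=upstream_id,
            revision=len(chain) + 1,
            content_hash=digest,
            raw=raw,
            supersedes=previous.observation_id if previous else None,
        )
        chain.append(observation)  # append-only: earlier revisions stay untouched
        return observation
```

Ingesting the same payload twice returns the existing record, while a changed payload appends a new revision whose `supersedes` field points at the previous observationId.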

1.2 Linkset lifecycle

Linksets correlate observations that describe the same vulnerable product while keeping each source intact.

  1. Seed: Observations emit normalized identifiers (purl, cpe, alias) during ingestion.
  2. Correlate: Linkset builder groups observations by tenant, product coordinates, and equivalence signals (PURL alias graph, CVE overlap, CVSS vector equality, fuzzy titles).
  3. Annotate: Detected conflicts (severity disagreements, affected-range mismatch, incompatible references) are recorded with structured payloads and preserved for UI/API export.
  4. Persist: Results land in advisory_linksets with deterministic IDs (linksetId = {tenant}:{hash(aliases+purls+seedIds)}) and append-only history for reproducibility.

Linksets never suppress or prefer one source; they provide aligned evidence so other services can apply policy.
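
A deterministic linksetId of the shape `{tenant}:{hash(aliases+purls+seedIds)}` could be derived roughly as below. The guide does not specify the hash function or the exact canonical form, so SHA-256 over compact, key-sorted JSON is an assumption here, as is the `linkset_id` helper name.

```python
import hashlib
import json
from typing import Iterable


def linkset_id(tenant: str, aliases: Iterable[str], purls: Iterable[str],
               seed_ids: Iterable[str]) -> str:
    """Build an order-independent ID from the linkset's identifying inputs."""
    canonical = json.dumps(
        {
            # Sorting and de-duplicating makes the ID independent of input order.
            "aliases": sorted(set(aliases)),
            "purls": sorted(set(purls)),
            "seedIds": sorted(set(seed_ids)),
        },
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tenant}:{digest}"
```

Because the inputs are sorted before hashing, rebuilding a linkset from the same observations in a different arrival order yields the same ID.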


2. Observation vs. linkset

  • Purpose
    • Observation: Immutable record per vendor and revision.
    • Linkset: Correlates observations that share product identity.
  • Mutation
    • Observation: Append-only via supersedes chain.
    • Linkset: Rebuilt deterministically from canonical signals.
  • Allowed fields
    • Observation: Raw payload, provenance, identifiers, join hints.
    • Linkset: Observation references, normalized product metadata, conflicts.
  • Forbidden fields
    • Observation: Derived severity, policy status, opinionated dedupe.
    • Linkset: Derived severity (conflicts recorded but unresolved).
  • Consumers
    • Observation: Evidence API, Offline Kit, CLI exports.
    • Linkset: Policy Engine overlay, UI evidence panel, Vuln Explorer.

2.1 Example sequence

  1. Red Hat PSIRT publishes RHSA-2025:1234 for OpenSSL; Concelier inserts an observation for vendor redhat with pkg:rpm/redhat/openssl@1.1.1w-12.
  2. NVD issues CVE-2025-0001; a second observation is inserted for vendor nvd.
  3. Linkset builder runs, groups the two observations, records alias and PURL overlap, and flags a CVSS disagreement (7.5 vs 7.2).
  4. Policy Engine reads the linkset, recognises the severity variance, and relies on configured rules to decide the effective output.
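
The grouping in step 3 can be illustrated with a small union-find over shared join keys. This sketch covers only exact alias and PURL overlap for a single tenant; the other signals (CVSS vector equality, fuzzy titles) are left out, and `group_observations` is a hypothetical helper. It also assumes the Red Hat observation lists CVE-2025-0001 among its aliases, which the sequence above implies but does not state.

```python
from collections import defaultdict


def group_observations(join_keys: dict[str, set[str]]) -> list[list[str]]:
    """Group observation IDs that share at least one alias or PURL.

    join_keys maps observationId -> identifiers emitted during ingestion.
    """
    parent = {oid: oid for oid in join_keys}

    def find(node: str) -> str:
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Always keep the smaller ID as root so results are deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen: dict[str, str] = {}
    for oid in sorted(join_keys):
        for key in sorted(join_keys[oid]):
            if key in first_seen:
                union(oid, first_seen[key])
            else:
                first_seen[key] = oid

    groups: dict[str, list[str]] = defaultdict(list)
    for oid in sorted(join_keys):
        groups[find(oid)].append(oid)
    return sorted(groups.values())
```

For the example sequence, the `redhat` and `nvd` observations end up in one group because both carry CVE-2025-0001, while each observation document itself stays separate.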

3. Conflict handling

Conflicts record disagreements without altering source payloads. The builder emits structured entries:

{
  "type": "severity-mismatch",
  "field": "cvss.baseScore",
  "observations": [
    {
      "source": "redhat",
      "value": "7.5",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"
    },
    {
      "source": "nvd",
      "value": "7.2",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"
    }
  ],
  "confidence": "medium",
  "detectedAt": "2025-10-27T14:00:00Z"
}

Supported conflict classes:

  • severity-mismatch: CVSS or qualitative severities differ.
  • affected-range-divergence: Product ranges, fixed versions, or platforms disagree.
  • statement-disagreement: One observation declares not_affected while another states affected.
  • reference-clash: URL or classifier collisions (for example, exploit URL vs conflicting advisory).
  • alias-inconsistency: Aliases map to different canonical IDs (GHSA vs CVE).
  • metadata-gap: Required provenance missing on one source; logged as a warning.
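
As an illustration of how a severity-mismatch entry might be produced, the sketch below compares the base scores reported by each source and emits a record shaped like the structured entry shown earlier. The `detect_severity_mismatch` function is hypothetical. The `confidence` field is omitted because this guide does not say how it is computed, and sorting entries by source is an assumption made to keep the output deterministic.

```python
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional


def detect_severity_mismatch(scores: list[dict],
                             detected_at: Optional[datetime] = None) -> Optional[dict]:
    """Return a conflict entry if sources report different CVSS base scores.

    Each item in `scores` carries "source", "value" (string), and "vector".
    Source payloads are only read, never modified.
    """
    # Compare numerically so "7.5" and "7.50" are not treated as a disagreement.
    distinct = {Decimal(item["value"]) for item in scores}
    if len(distinct) < 2:
        return None
    moment = detected_at or datetime.now(timezone.utc)
    return {
        "type": "severity-mismatch",
        "field": "cvss.baseScore",
        "observations": sorted(scores, key=lambda item: item["source"]),
        "detectedAt": moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

The function records the disagreement and leaves resolution to downstream policy, in line with the rule that linksets never prefer one source.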

Conflict surfaces:

  • WebService endpoints (GET /advisories/linksets/{id} returns conflicts[]).
  • UI evidence panel chips and conflict badges.
  • CLI exports (JSON/OSV) exposed through LNM commands.
  • Observability metrics (advisory_linkset_conflicts_total{type}).

4. AOC alignment

Observations and linksets must satisfy Aggregation-Only Contract invariants:

  • No derived severity: content.raw may include upstream severity, but the observation body never injects or edits severity.
  • No merges: Each upstream document stays separate; linksets reference observations via deterministic IDs.
  • Provenance mandatory: Missing signature or source metadata is an AOC violation (ERR_AOC_004).
  • Idempotent writes: Duplicate contentHash yields a no-op; supersedes pointer captures new revisions.
  • Deterministic output: Linkset builder sorts keys, normalizes timestamps (UTC ISO-8601), and uses canonical JSON hashing.

Violations trigger guard errors (ERR_AOC_00x), emit aoc_violation_total metrics, and block persistence until corrected.
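
A guard that blocks persistence could look roughly like the sketch below. Only ERR_AOC_004 is named in this guide. The forbidden field names, the list of required provenance paths, and the `AocViolation` type are assumptions for illustration, and the derived-field check reuses the generic ERR_AOC_00x placeholder because its specific code is not given here.

```python
class AocViolation(Exception):
    """Raised when a document breaks an Aggregation-Only Contract invariant."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


# Assumed provenance paths; the authoritative list lives in the schema contract.
REQUIRED_PROVENANCE = (
    ("source", "vendor"),
    ("upstream", "upstreamId"),
    ("upstream", "contentHash"),
    ("upstream", "signature"),
)

# Hypothetical names for derived fields that must not appear on an observation.
FORBIDDEN_TOP_LEVEL = {"severity", "effectiveSeverity", "policyStatus"}


def guard_observation(doc: dict) -> None:
    """Raise AocViolation before persistence if an invariant is broken."""
    derived = sorted(FORBIDDEN_TOP_LEVEL.intersection(doc))
    if derived:
        # Placeholder code: the guide only states the ERR_AOC_00x pattern.
        raise AocViolation("ERR_AOC_00x", f"derived fields present: {derived}")
    for section, key in REQUIRED_PROVENANCE:
        if (doc.get(section) or {}).get(key) is None:
            raise AocViolation("ERR_AOC_004", f"missing {section}.{key}")
```

A caller would run the guard before every write and increment aoc_violation_total when it raises, so that nothing invalid reaches the collection.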


5. Downstream consumption

  • Policy Engine: Computes effective severity and risk overlays from linkset evidence and conflicts.
  • Console UI: Renders per-source statements, signed hashes, and conflict banners inside the evidence panel.
  • CLI (stella advisories linkset …): Exports observations and linksets as JSON or OSV for offline triage.
  • Offline Kit: Shipped snapshots include observation and linkset collections for air-gap parity.
  • Observability: Dashboards track ingestion latency, conflict counts, and supersedes depth.

When adding new consumers, ensure they honour append-only semantics and do not mutate observation or linkset collections.


6. Validation & testing

  • Unit tests (StellaOps.Concelier.Core.Tests) validate schema guards, deterministic linkset hashing, conflict detection fixtures, and supersedes chains.
  • Mongo integration tests (StellaOps.Concelier.Storage.Mongo.Tests) verify indexes and idempotent writes under concurrency.
  • CLI smoke suites confirm stella advisories observations and stella advisories linksets export stable JSON.
  • Determinism checks replay identical upstream payloads and assert that the resulting observation and linkset documents match byte for byte.
  • Offline kit verification simulates air-gapped bootstrap to confirm that snapshots align with live data.

Add fixtures whenever a new conflict type or correlation signal is introduced. Ensure canonical JSON serialization remains stable across .NET runtime updates.
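
A determinism check of the kind described above can be reduced to two helpers: one that normalizes timestamps to UTC ISO-8601 and one that serializes with sorted keys, followed by a byte-for-byte comparison. The real suites are .NET tests; this Python sketch only illustrates the idea, and the helper names are assumptions.

```python
import json
from datetime import datetime, timezone


def normalize_timestamp(value: str) -> str:
    """Convert an ISO-8601 timestamp with an offset to UTC, second precision."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Refuse naive timestamps rather than guessing a local time zone.
        raise ValueError(f"timestamp has no UTC offset: {value!r}")
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def canonical_json(doc: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace."""
    return json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def assert_deterministic(first: dict, second: dict) -> None:
    """Fail if two replays of the same input differ by even one byte."""
    assert canonical_json(first) == canonical_json(second), "replay output differs"
```

A replay test would build the same document twice from identical upstream payloads and pass both results to `assert_deterministic`.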


7. Reviewer checklist

  • Observation schema segment matches the latest StellaOps.Concelier.Models contract.
  • Linkset lifecycle covers correlation signals, conflict classes, and deterministic IDs.
  • AOC invariants are explicitly called out with violation codes.
  • Examples include multi-source correlation plus conflict annotation.
  • Downstream consumer guidance reflects active APIs and CLI features.
  • Testing section lists required suites (Core, Storage, CLI, Offline).
  • Imposed rule reminder is present at the top of the document.

Confirmed against Concelier Link-Not-Merge tasks: CONCELIER-LNM-21-001..005, CONCELIER-LNM-21-101..103, CONCELIER-LNM-21-201..203.