# Advisory Observations & Linksets

> Imposed rule: Work of this type or tasks of this type on this component must also
> be applied everywhere else it should be applied.

The Link-Not-Merge (LNM) initiative replaces the legacy "merge" pipeline with
immutable observations and correlation linksets. This guide explains how
Concelier ingests advisory statements, preserves upstream truth, and produces
linksets that downstream services (Policy Engine, Vuln Explorer, Console) can
use without collapsing sources together.

---

## 1. Model overview

### 1.1 Observation lifecycle

1. **Ingest** – Connectors fetch upstream payloads (CSAF, OSV, vendor feeds),
   validate signatures, and drop any derived fields prohibited by the
   Aggregation-Only Contract (AOC).
2. **Persist** – Concelier writes immutable `advisory_observations` scoped by
   `tenant`, `(source.vendor, upstreamId)`, and `contentHash`. Supersedes chains
   capture revisions without mutating history.
3. **Expose** – WebService surfaces paged/read APIs; Offline Kit snapshots
   include the same documents for air-gapped installs.

Observation schema highlights:

```text
observationId = {tenant}:{source.vendor}:{upstreamId}:{revision}
tenant, source{vendor, stream, api, collectorVersion}
upstream{upstreamId, documentVersion, fetchedAt, receivedAt,
         contentHash, signature{present, format, keyId, signature}}
content{format, specVersion, raw}
identifiers{cve?, ghsa?, aliases[], osvIds[]}
linkset{purls[], cpes[], aliases[], references[], conflicts[]?}
createdAt, attributes{batchId?, replayCursor?}
```

- **Immutable raw** (`content.raw`) mirrors upstream payloads exactly.
- **Provenance** (`source.*`, `upstream.*`) satisfies AOC guardrails and enables
  cryptographic attestations.
- **Identifiers** retain lossless extracts (CVE, GHSA, vendor aliases) that seed
  linksets.
- **Linkset** captures join hints but never merges or adds derived severity.
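
As an illustration of the identifier and immutability rules above, the following is a minimal Python sketch, not the Concelier implementation (which lives in the .NET `StellaOps.Concelier.*` projects). The `Observation` class, its field names, the integer `revision`, and the use of SHA-256 for `contentHash` are assumptions made for the example; only the `observationId` layout comes from the schema highlights.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: assigning to a field after construction raises
class Observation:
    tenant: str
    vendor: str        # source.vendor
    upstream_id: str   # upstream.upstreamId
    revision: int      # assumption: a simple counter per upstream document
    raw: bytes         # content.raw, kept exactly as received

    @property
    def content_hash(self) -> str:
        # Assumption: SHA-256 over the raw bytes; the guide does not name the algorithm.
        return "sha256:" + hashlib.sha256(self.raw).hexdigest()

    @property
    def observation_id(self) -> str:
        # Layout from the schema: {tenant}:{source.vendor}:{upstreamId}:{revision}
        return f"{self.tenant}:{self.vendor}:{self.upstream_id}:{self.revision}"


obs = Observation("acme", "redhat", "RHSA-2025:1234", 1, b'{"document": {}}')
print(obs.observation_id)  # acme:redhat:RHSA-2025:1234:1
```

A new upstream revision would be modelled as a second `Observation` with a higher `revision`, leaving the first instance untouched.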

### 1.2 Linkset lifecycle

Linksets correlate observations that describe the same vulnerable product while
keeping each source intact.

1. **Seed** – Observations emit normalized identifiers (`purl`, `cpe`,
   `alias`) during ingestion.
2. **Correlate** – Linkset builder groups observations by tenant, product
   coordinates, and equivalence signals (PURL alias graph, CVE overlap, CVSS
   vector equality, fuzzy titles).
3. **Annotate** – Detected conflicts (severity disagreements, affected-range
   mismatch, incompatible references) are recorded with structured payloads and
   preserved for UI/API export.
4. **Persist** – Results land in `advisory_linksets` with deterministic IDs
   (`linksetId = {tenant}:{hash(aliases+purls+seedIds)}`) and append-only history
   for reproducibility.

Linksets never suppress or prefer one source; they provide aligned evidence so
other services can apply policy.
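
The deterministic ID in the Persist step can be sketched as follows. This is a hedged illustration only: the function name `linkset_id`, the choice of SHA-256, the de-duplication of inputs, and the compact sorted-key JSON used as the hash input are all assumptions; the guide specifies only that the ID is `{tenant}:{hash(aliases+purls+seedIds)}`.

```python
import hashlib
import json


def linkset_id(tenant, aliases, purls, seed_ids):
    """Derive an ID that does not depend on the order of the inputs."""
    # Sort and de-duplicate each list so equivalent inputs serialize identically.
    canonical = json.dumps(
        {
            "aliases": sorted(set(aliases)),
            "purls": sorted(set(purls)),
            "seedIds": sorted(set(seed_ids)),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    # Assumption: SHA-256; the guide does not name the hash function.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tenant}:{digest}"


purl = "pkg:rpm/redhat/openssl@1.1.1w-12"
seeds = ["acme:redhat:RHSA-2025:1234:1", "acme:nvd:CVE-2025-0001:1"]
first = linkset_id("acme", ["CVE-2025-0001", "RHSA-2025:1234"], [purl], seeds)
second = linkset_id("acme", ["RHSA-2025:1234", "CVE-2025-0001"], [purl], seeds[::-1])
assert first == second  # input order does not change the ID
```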

---

## 2. Observation vs. linkset

- **Purpose**
  - Observation: Immutable record per vendor and revision.
  - Linkset: Correlates observations that share product identity.
- **Mutation**
  - Observation: Append-only via supersedes chain.
  - Linkset: Rebuilt deterministically from canonical signals.
- **Allowed fields**
  - Observation: Raw payload, provenance, identifiers, join hints.
  - Linkset: Observation references, normalized product metadata, conflicts.
- **Forbidden fields**
  - Observation: Derived severity, policy status, opinionated dedupe.
  - Linkset: Derived severity (conflicts recorded but unresolved).
- **Consumers**
  - Observation: Evidence API, Offline Kit, CLI exports.
  - Linkset: Policy Engine overlay, UI evidence panel, Vuln Explorer.

### 2.1 Example sequence

1. Red Hat PSIRT publishes RHSA-2025:1234 for OpenSSL; Concelier inserts an
   observation for vendor `redhat` with `pkg:rpm/redhat/openssl@1.1.1w-12`.
2. NVD issues CVE-2025-0001; a second observation is inserted for vendor `nvd`.
3. Linkset builder runs, groups the two observations, records alias and PURL
   overlap, and flags a CVSS disagreement (`7.5` vs `7.2`).
4. Policy Engine reads the linkset, recognises the severity variance, and relies
   on configured rules to decide the effective output.
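
Steps 1 to 3 of this sequence can be walked through with a small Python sketch. It is a simplification under stated assumptions: the `link` function and the dictionary shapes are hypothetical, the Red Hat observation is assumed to list CVE-2025-0001 as an alias, and the NVD observation is assumed to carry the same PURL. The real builder uses more signals than the two shown here.

```python
def link(obs_a, obs_b):
    """Correlate two observations and record, but never resolve, a score mismatch."""
    shared_aliases = sorted(set(obs_a["aliases"]) & set(obs_b["aliases"]))
    shared_purls = sorted(set(obs_a["purls"]) & set(obs_b["purls"]))
    if not shared_aliases and not shared_purls:
        return None  # nothing ties the two observations together

    conflicts = []
    if obs_a["cvss"] != obs_b["cvss"]:
        conflicts.append({
            "type": "severity-mismatch",
            "field": "cvss.baseScore",
            "observations": [
                {"source": o["vendor"], "value": o["cvss"]} for o in (obs_a, obs_b)
            ],
        })
    return {
        "observations": [obs_a["id"], obs_b["id"]],
        "sharedAliases": shared_aliases,
        "sharedPurls": shared_purls,
        "conflicts": conflicts,
    }


redhat = {
    "id": "acme:redhat:RHSA-2025:1234:1", "vendor": "redhat", "cvss": "7.5",
    "aliases": ["RHSA-2025:1234", "CVE-2025-0001"],  # CVE alias is assumed
    "purls": ["pkg:rpm/redhat/openssl@1.1.1w-12"],
}
nvd = {
    "id": "acme:nvd:CVE-2025-0001:1", "vendor": "nvd", "cvss": "7.2",
    "aliases": ["CVE-2025-0001"],
    "purls": ["pkg:rpm/redhat/openssl@1.1.1w-12"],   # same PURL is assumed
}
linkset = link(redhat, nvd)
print(linkset["conflicts"][0]["type"])  # severity-mismatch
```

Both scores survive in the output; choosing between them is left to the Policy Engine in step 4.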

---

## 3. Conflict handling

Conflicts record disagreements without altering source payloads. The builder
emits structured entries:

```json
{
  "type": "severity-mismatch",
  "field": "cvss.baseScore",
  "observations": [
    {
      "source": "redhat",
      "value": "7.5",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"
    },
    {
      "source": "nvd",
      "value": "7.2",
      "vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"
    }
  ],
  "confidence": "medium",
  "detectedAt": "2025-10-27T14:00:00Z"
}
```

Supported conflict classes:

- `severity-mismatch` – CVSS or qualitative severities differ.
- `affected-range-divergence` – Product ranges, fixed versions, or platforms
  disagree.
- `statement-disagreement` – One observation declares `not_affected` while
  another states `affected`.
- `reference-clash` – URL or classifier collisions (for example, exploit URL vs
  conflicting advisory).
- `alias-inconsistency` – Aliases map to different canonical IDs (GHSA vs CVE).
- `metadata-gap` – Required provenance missing on one source; logged as a
  warning.

Conflict surfaces:

- WebService endpoints (`GET /advisories/linksets/{id}` → `conflicts[]`).
- UI evidence panel chips and conflict badges.
- CLI exports (JSON/OSV) exposed through LNM commands.
- Observability metrics (`advisory_linkset_conflicts_total{type}`).
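
To show how the conflict classes relate to the per-type metric, here is a hedged Python sketch that tallies `conflicts[]` entries by `type` and formats them as label/value lines. The helper names `count_conflicts` and `render_metric` are hypothetical, and the text output format is an assumption; only the class names and the metric name come from this section.

```python
from collections import Counter

SUPPORTED_CLASSES = frozenset({
    "severity-mismatch",
    "affected-range-divergence",
    "statement-disagreement",
    "reference-clash",
    "alias-inconsistency",
    "metadata-gap",
})


def count_conflicts(linksets):
    """Tally conflicts per class, rejecting classes this guide does not list."""
    counts = Counter()
    for linkset in linksets:
        for conflict in linkset.get("conflicts", []):
            kind = conflict["type"]
            if kind not in SUPPORTED_CLASSES:
                raise ValueError(f"unknown conflict class: {kind}")
            counts[kind] += 1
    return counts


def render_metric(counts):
    # One line per class, sorted so the output is stable between runs.
    return [
        f'advisory_linkset_conflicts_total{{type="{kind}"}} {total}'
        for kind, total in sorted(counts.items())
    ]


sample = [
    {"conflicts": [{"type": "severity-mismatch"}, {"type": "metadata-gap"}]},
    {"conflicts": [{"type": "severity-mismatch"}]},
    {},  # a linkset without conflicts
]
lines = render_metric(count_conflicts(sample))
```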

---

## 4. AOC alignment

Observations and linksets must satisfy Aggregation-Only Contract invariants:

- **No derived severity** – `content.raw` may include upstream severity, but the
  observation body never injects or edits severity.
- **No merges** – Each upstream document stays separate; linksets reference
  observations via deterministic IDs.
- **Provenance mandatory** – Missing `signature` or `source` metadata is an AOC
  violation (`ERR_AOC_004`).
- **Idempotent writes** – Duplicate `contentHash` yields a no-op; supersedes
  pointer captures new revisions.
- **Deterministic output** – Linkset builder sorts keys, normalizes timestamps
  (UTC ISO-8601), and uses canonical JSON hashing.

Violations trigger guard errors (`ERR_AOC_00x`), emit `aoc_violation_total`
metrics, and block persistence until corrected.
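
Two of these invariants, mandatory provenance and idempotent writes, can be sketched in a few lines of Python. The `ObservationStore` and `AocViolation` classes are hypothetical stand-ins for the real guard and storage layers, and the in-memory dictionary replaces the database; only the `ERR_AOC_004` code and the `aoc_violation_total` metric name come from this section.

```python
class AocViolation(Exception):
    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


class ObservationStore:
    """In-memory stand-in for the observation collection (illustrative only)."""

    def __init__(self):
        self._docs = {}
        self.aoc_violation_total = 0  # stands in for the metric counter

    def write(self, doc):
        upstream = doc.get("upstream", {})
        # Provenance mandatory: reject before anything is persisted.
        if not doc.get("source") or "signature" not in upstream:
            self.aoc_violation_total += 1
            raise AocViolation("ERR_AOC_004", "missing source or signature metadata")

        key = (doc["tenant"], doc["source"]["vendor"],
               upstream["upstreamId"], upstream["contentHash"])
        if key in self._docs:
            return False  # duplicate contentHash: no-op
        self._docs[key] = doc
        return True

    def __len__(self):
        return len(self._docs)


store = ObservationStore()
doc = {
    "tenant": "acme",
    "source": {"vendor": "redhat"},
    "upstream": {"upstreamId": "RHSA-2025:1234", "contentHash": "sha256:abc",
                 "signature": {"present": False}},
}
assert store.write(doc) is True
assert store.write(doc) is False  # second write of the same content is ignored
```

Note that an unsigned document still carries a `signature` object (`present: false`); the guard in this sketch only rejects documents where the field is absent altogether.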

---

## 5. Downstream consumption

- **Policy Engine** – Computes effective severity and risk overlays from linkset
  evidence and conflicts.
- **Console UI** – Renders per-source statements, signed hashes, and conflict
  banners inside the evidence panel.
- **CLI (`stella advisories linkset …`)** – Exports observations and linksets as
  JSON or OSV for offline triage.
- **Offline Kit** – Shipping snapshots include observation and linkset
  collections for air-gap parity.
- **Observability** – Dashboards track ingestion latency, conflict counts, and
  supersedes depth.

When adding new consumers, ensure they honour append-only semantics and do not
mutate observation or linkset collections.
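
A consumer that honours this rule derives its result without writing anything back to the linkset. The Python sketch below illustrates the idea with a deliberately simple, hypothetical policy ("take the highest reported base score"); the actual Policy Engine rules are configurable and are not described in this guide.

```python
import copy


def effective_score(linkset):
    """Hypothetical policy: return the highest base score found in the conflicts."""
    values = []
    for conflict in linkset.get("conflicts", []):
        if conflict["type"] == "severity-mismatch":
            values.extend(float(o["value"]) for o in conflict["observations"])
    # The result is returned to the caller and never stored on the linkset.
    return max(values) if values else None


linkset = {
    "conflicts": [{
        "type": "severity-mismatch",
        "field": "cvss.baseScore",
        "observations": [
            {"source": "redhat", "value": "7.5"},
            {"source": "nvd", "value": "7.2"},
        ],
    }],
}
before = copy.deepcopy(linkset)
score = effective_score(linkset)
assert linkset == before  # the consumer left the linkset unchanged
```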

---

## 6. Validation & testing

- **Unit tests** (`StellaOps.Concelier.Core.Tests`) validate schema guards,
  deterministic linkset hashing, conflict detection fixtures, and supersedes
  chains.
- **Mongo integration tests** (`StellaOps.Concelier.Storage.Mongo.Tests`) verify
  indexes and idempotent writes under concurrency.
- **CLI smoke suites** confirm `stella advisories observations` and
  `stella advisories linksets` export stable JSON.
- **Determinism checks** replay identical upstream payloads and assert that the
  resulting observation and linkset documents match byte for byte.
- **Offline kit verification** simulates air-gapped bootstrap to confirm that
  snapshots align with live data.

Add fixtures whenever a new conflict type or correlation signal is introduced.
Ensure canonical JSON serialization remains stable across .NET runtime updates.
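
The byte-for-byte determinism check can be illustrated with a short Python sketch. It is not the project's test code (the real suites are .NET); `canonical_bytes` and `normalize_timestamp` are hypothetical helpers, and sorted keys with compact separators are an assumed canonical form rather than the serializer the project actually uses.

```python
import json
from datetime import datetime, timezone


def normalize_timestamp(value):
    """Convert an ISO-8601 timestamp with an explicit offset to UTC."""
    parsed = datetime.fromisoformat(value)  # input must carry an offset, e.g. +02:00
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def canonical_bytes(document):
    # Sorted keys and fixed separators make the output independent of dict order.
    text = json.dumps(document, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


# The same logical document, built twice with different key order and
# different (but equivalent) timestamp offsets.
first = canonical_bytes({
    "tenant": "acme",
    "detectedAt": normalize_timestamp("2025-10-27T14:00:00+00:00"),
})
replay = canonical_bytes({
    "detectedAt": normalize_timestamp("2025-10-27T16:00:00+02:00"),
    "tenant": "acme",
})
assert first == replay  # byte-for-byte identical
```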

---

## 7. Reviewer checklist

- Observation schema segment matches the latest `StellaOps.Concelier.Models`
  contract.
- Linkset lifecycle covers correlation signals, conflict classes, and
  deterministic IDs.
- AOC invariants are explicitly called out with violation codes.
- Examples include multi-source correlation plus conflict annotation.
- Downstream consumer guidance reflects active APIs and CLI features.
- Testing section lists required suites (Core, Storage, CLI, Offline).
- Imposed rule reminder is present at the top of the document.

Confirmed against Concelier Link-Not-Merge tasks:
`CONCELIER-LNM-21-001..005`, `CONCELIER-LNM-21-101..103`,
`CONCELIER-LNM-21-201..203`.