git.stella-ops.org/docs/modules/concelier/linkset-correlation-21-002.md
2025-11-22 14:02:49 +02:00


CONCELIER-LNM-21-002 · Linkset correlation rules (v1)

Purpose: unblock CONCELIER-LNM-21-002 by freezing correlation/precedence rules and providing fixtures so builders and downstream consumers can proceed.

Scope

  • Applies to linksets produced from advisory_observations (LNM v1).
  • Correlation is aggregation-only: no value synthesis or merge; emit conflicts instead of collapsing fields.
  • Output persists in advisory_linksets and drives advisory.linkset.updated@1 events.

Deterministic confidence calculation (0–1)

confidence = clamp(
  0.40 * alias_score +
  0.25 * purl_overlap_score +
  0.15 * cpe_overlap_score +
  0.10 * severity_agreement +
  0.05 * reference_overlap +
  0.05 * freshness_score
)
  • alias_score: 1 if any alias exact-match across observations; 0.5 if vendor ID prefixes match; else 0.
  • purl_overlap_score: 1 if the same package appears with intersecting version ranges; 0.6 if the same package family appears but the ranges are disjoint; 0 otherwise. Use the semver/rpm/deb comparers as in LNM v1.
  • cpe_overlap_score: 1 if any CPE exact-match; 0.5 if same vendor/product, any version; else 0.
  • severity_agreement: 1 if CVSS base score delta ≤ 0.1; 0.5 if ≤ 1.0; else 0. Use max of available CVSS per observation.
  • reference_overlap: fraction of reference URLs (case-normalized) shared by the pair of observations with the highest overlap in the set.
  • freshness_score: 1 when fetchedAt spread ≤ 48h; linearly decays to 0 at 14 days.
  • Sort observations before scoring by (source.vendor, advisoryId, fetchedAt); reuse that order for hashing and for output arrays.
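The weighted sum and the two threshold-based scores above can be sketched as follows. This is a minimal illustration, not the builder's code: the function names, the dictionary-shaped observations, the rounding of the CVSS delta to one decimal (to avoid floating-point noise at the 0.1 boundary), and the choice to start the freshness decay at the 48h mark are all assumptions.

```python
from datetime import datetime, timedelta

# Weights copied from the formula above.
WEIGHTS = {
    "alias_score": 0.40,
    "purl_overlap_score": 0.25,
    "cpe_overlap_score": 0.15,
    "severity_agreement": 0.10,
    "reference_overlap": 0.05,
    "freshness_score": 0.05,
}

def clamp(value, low=0.0, high=1.0):
    """Restrict value to the [low, high] interval."""
    return max(low, min(high, value))

def severity_agreement(max_cvss_per_observation):
    """1 if the CVSS base score delta is <= 0.1, 0.5 if <= 1.0, else 0."""
    # Assumption: round to one decimal so 9.8 - 9.7 compares as exactly 0.1.
    delta = round(max(max_cvss_per_observation) - min(max_cvss_per_observation), 1)
    if delta <= 0.1:
        return 1.0
    if delta <= 1.0:
        return 0.5
    return 0.0

def freshness_score(fetched_at_values):
    """1 when the fetchedAt spread is <= 48h, decaying linearly to 0 at 14 days."""
    spread = max(fetched_at_values) - min(fetched_at_values)
    full, zero = timedelta(hours=48), timedelta(days=14)
    if spread <= full:
        return 1.0
    if spread >= zero:
        return 0.0
    # Assumption: the linear decay runs from the 48h mark to the 14-day mark.
    return 1.0 - (spread - full) / (zero - full)

def sort_observations(observations):
    """Order by (source.vendor, advisoryId, fetchedAt); dict shape is assumed."""
    return sorted(
        observations,
        key=lambda o: (o["source"]["vendor"], o["advisoryId"], o["fetchedAt"]),
    )

def confidence(scores):
    """Weighted sum of the six component scores, clamped to 0..1."""
    return clamp(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS))

example = {
    "alias_score": 1.0,
    "purl_overlap_score": 0.6,
    "cpe_overlap_score": 0.5,
    "severity_agreement": 0.5,
    "reference_overlap": 0.0,
    "freshness_score": 1.0,
}
print(round(confidence(example), 3))  # 0.725
```

With an exact alias match but disjoint ranges and no shared references, the example lands at 0.725, which shows how heavily the alias weight dominates the result.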

Conflict emission (add-only)

Emit a conflict entry per divergent field group:

  • severity-mismatch: CVSS base score delta > 1.0 or vector differs.
  • affected-range-divergence: version ranges do not intersect.
  • reference-clash: no shared references and source vendors differ.
  • alias-inconsistency: aliases disjoint across observations.
  • metadata-gap: required fields missing on any observation.

Each conflict includes field, reason, and values (an array of source: value strings); conflicts are stable-sorted by field, then reason.
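A minimal sketch of two of these rules and the final ordering, under stated assumptions: the field labels ("severity", "aliases"), the input shapes, and the reading of "aliases disjoint" as "at least one pair of observations shares no alias" are illustrative choices, not part of the frozen rules. The source names and identifiers in the usage note are made up.

```python
from itertools import combinations

def severity_conflict(cvss_by_source):
    """cvss_by_source maps a source name to (base_score, vector)."""
    scores = [score for score, _ in cvss_by_source.values()]
    vectors = {vector for _, vector in cvss_by_source.values()}
    # Assumption: round to one decimal to avoid float noise at the 1.0 boundary.
    delta = round(max(scores) - min(scores), 1)
    if delta <= 1.0 and len(vectors) <= 1:
        return None
    sources = sorted(cvss_by_source)
    return {
        "field": "severity",  # hypothetical field label
        "reason": "severity-mismatch",
        "values": [f"{src}: {cvss_by_source[src][0]}" for src in sources],
        "sourceIds": sources,
    }

def alias_conflict(aliases_by_source):
    """aliases_by_source maps a source name to an iterable of alias strings."""
    sources = sorted(aliases_by_source)
    # Assumption: "disjoint" means some pair of observations shares no alias.
    has_disjoint_pair = any(
        set(aliases_by_source[a]).isdisjoint(aliases_by_source[b])
        for a, b in combinations(sources, 2)
    )
    if not has_disjoint_pair:
        return None
    return {
        "field": "aliases",  # hypothetical field label
        "reason": "alias-inconsistency",
        "values": [
            f"{src}: {','.join(sorted(aliases_by_source[src]))}" for src in sources
        ],
        "sourceIds": sources,
    }

def collect_conflicts(cvss_by_source, aliases_by_source):
    """Add-only: every detected conflict is emitted; nothing is collapsed."""
    found = [
        conflict
        for conflict in (
            severity_conflict(cvss_by_source),
            alias_conflict(aliases_by_source),
        )
        if conflict is not None
    ]
    # sorted() is stable, so equal (field, reason) keys keep detection order.
    return sorted(found, key=lambda c: (c["field"], c["reason"]))
```

Calling `collect_conflicts({"nvd": (9.8, "AV:N"), "ghsa": (7.5, "AV:N")}, {"nvd": ["CVE-2025-0001"], "ghsa": ["GHSA-xxxx"]})` yields the alias conflict first and the severity conflict second, because "aliases" sorts before "severity".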

Linkset output shape additions

  • key.confidence: populated from formula above.
  • conflicts[]: as defined; may be empty but never null. Each conflict also carries sourceIds[] (vendors/sources that produced the values) for provenance.
  • normalized retains add-only fields from link-not-merge-schema.md; do not drop raw ranges even when disjoint.
  • provenance.hashes: sorted list of observationHash values; used by replay bundles.
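The additions can be assembled along these lines. This is a sketch only: the nesting is inferred from the dotted names above, and the function name and argument shapes are hypothetical.

```python
def linkset_additions(confidence, conflicts, observation_hashes):
    """Build the v1 output additions; nesting inferred from the dotted names."""
    return {
        "key": {"confidence": confidence},
        # conflicts may be empty but must never be null.
        "conflicts": list(conflicts or []),
        # Sorted so replay bundles see a deterministic hash list.
        "provenance": {"hashes": sorted(observation_hashes)},
    }

additions = linkset_additions(0.725, None, ["b2", "a1"])
print(additions["conflicts"], additions["provenance"]["hashes"])  # [] ['a1', 'b2']
```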

Fixtures

  • docs/samples/lnm/linkset-lnm-21-002-sample.json: two-source agreement (high confidence, no conflicts).
  • docs/samples/lnm/linkset-lnm-21-002-conflict.json: three-source disagreement showing conflict records and confidence < 0.7.

All fixtures use ASCII ordering and ISO-8601 UTC timestamps and may be used as golden outputs in tests.

Implementation checklist

  • The builder must refuse to overwrite an existing linkset when the incoming hash list is unchanged.
  • Correlation job idempotency key: hash(tenantId|aliasSet|purlSet|fetchedAtBucket).
  • Telemetry: counter concelier.linkset.builder.conflict_total{field,reason} and histogram concelier.linkset.builder.confidence (0–1 buckets).
  • Event emission: include confidence and conflicts summary in advisory.linkset.updated@1; keep arrays sorted as above.
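The idempotency key and the no-op overwrite guard from the checklist might look like the sketch below. The checklist does not name a hash function or a canonical encoding, so SHA-256, the comma-joined sorted sets, and the string form of the fetchedAt bucket are assumptions; the function names are hypothetical.

```python
import hashlib

def idempotency_key(tenant_id, aliases, purls, fetched_at_bucket):
    """hash(tenantId|aliasSet|purlSet|fetchedAtBucket), with SHA-256 assumed."""
    # Sets are deduplicated and sorted so input order cannot change the key.
    alias_part = ",".join(sorted(set(aliases)))
    purl_part = ",".join(sorted(set(purls)))
    material = "|".join([tenant_id, alias_part, purl_part, fetched_at_bucket])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

def should_write(existing_hashes, incoming_hashes):
    """Return False when the incoming observation hash list is unchanged."""
    if existing_hashes is None:  # no linkset stored yet
        return True
    return sorted(existing_hashes) != sorted(incoming_hashes)
```

Because both sets are sorted before hashing, `idempotency_key("t1", ["B", "A"], ["p"], "2025-11-22")` and `idempotency_key("t1", ["A", "B"], ["p"], "2025-11-22")` produce the same key.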

Change control

  • Add-only. Adjusting weights or conflict codes requires new version advisory.linkset.updated@2 and a sprint note.