# Concelier SemVer Merge Playbook (Sprint 1–2)

This playbook describes how the merge layer and connector teams should emit the new SemVer primitives introduced in Sprint 1–2, how those primitives become normalized version rules, and how downstream jobs query them deterministically.

## 1. What landed in Sprint 1–2

- `RangePrimitives.SemVer` now infers a canonical `style` (`range`, `exact`, `lt`, `lte`, `gt`, `gte`) and captures `exactValue` when the constraint is a single version.
- `NormalizedVersionRule` documents the analytics-friendly projection of each `AffectedPackage` coverage entry and is persisted alongside legacy `versionRanges`.
- `AdvisoryProvenance.decisionReason` records whether merge resolution favored precedence, freshness, or a tie-breaker comparison.

See `src/Concelier/__Libraries/StellaOps.Concelier.Models/CANONICAL_RECORDS.md` for the full schema and field descriptions.

## 2. Mapper pattern

Connectors should emit SemVer primitives as soon as they can normalize a vendor constraint. The helper `SemVerPrimitiveExtensions.ToNormalizedVersionRule` turns those primitives into the persisted rules:

```csharp
// `fixed` is a reserved C# keyword, so the named argument needs the @ prefix.
var primitive = new SemVerPrimitive(
    introduced: "1.2.3",
    introducedInclusive: true,
    @fixed: "2.0.0",
    fixedInclusive: false,
    lastAffected: null,
    lastAffectedInclusive: false,
    constraintExpression: ">=1.2.3 <2.0.0",
    exactValue: null);

var rule = primitive.ToNormalizedVersionRule(notes: "nvd:CVE-2025-1234");
// rule => scheme=semver, type=range, min=1.2.3, minInclusive=true, max=2.0.0, maxInclusive=false
```

If you omit the optional `notes` argument, `ToNormalizedVersionRule` now falls back to the primitive’s `ConstraintExpression`, so the original comparator expression is preserved for provenance and audit queries.

Emit the resulting rule inside `AffectedPackage.NormalizedVersions` while continuing to populate `AffectedVersionRange.RangeExpression` for backward compatibility.

## 3. Merge dedupe flow

During merge, feed all package candidates through `NormalizedVersionRuleComparer.Instance` prior to persistence. The comparer orders by scheme → type → min → minInclusive → max → maxInclusive → value → notes, which guarantees a consistent document layout and makes `$unwind` pipelines deterministic.

If multiple connectors emit identical constraints, the merge layer should:

1. Combine provenance entries (preserving one per source).
2. Preserve a single normalized rule instance (thanks to `NormalizedVersionRuleEqualityComparer.Instance`).
3. Attach `decisionReason="precedence"` if one source overrides another.

## 4. Example Mongo pipeline

Use the following aggregation to locate advisories that affect a specific SemVer:

```javascript
db.advisories.aggregate([
  // Narrow to the package before unwinding.
  { $match: { "affectedPackages.type": "semver", "affectedPackages.identifier": "pkg:npm/lodash" } },
  { $unwind: "$affectedPackages" },
  { $unwind: "$affectedPackages.normalizedVersions" },
  // Keep only rules that cover 4.17.21.
  { $match: {
      $or: [
        { "affectedPackages.normalizedVersions.type": "exact",
          "affectedPackages.normalizedVersions.value": "4.17.21" },
        { "affectedPackages.normalizedVersions.type": "range",
          "affectedPackages.normalizedVersions.min": { $lte: "4.17.21" },
          "affectedPackages.normalizedVersions.max": { $gt: "4.17.21" } },
        { "affectedPackages.normalizedVersions.type": "gte",
          "affectedPackages.normalizedVersions.min": { $lte: "4.17.21" } },
        { "affectedPackages.normalizedVersions.type": "lte",
          "affectedPackages.normalizedVersions.max": { $gte: "4.17.21" } }
      ]
  }},
  { $project: { advisoryKey: 1, title: 1, "affectedPackages.identifier": 1 } }
]);
```

Pair this query with the indexes listed in [Normalized Versions Query Guide](mongo_indices.md).

## 5. Recommended indexes

| Collection | Index | Purpose |
|------------|-------|---------|
| `advisory` | `{ "affectedPackages.identifier": 1, "affectedPackages.normalizedVersions.scheme": 1, "affectedPackages.normalizedVersions.type": 1 }` (compound, multikey) | Speeds up `$match` on identifier + rule style. |
| `advisory` | `{ "affectedPackages.normalizedVersions.value": 1 }` (sparse) | Optimizes lookups for exact version hits. |

Coordinate with the Storage team when enabling these indexes so deployment windows account for collection size.

## 6. Dual-write rollout

Follow the operational checklist in `docs/modules/devops/migrations/semver-style.md`. The summary:

1. **Dual write (now)** – emit both legacy `versionRanges` and the new `normalizedVersions`.
2. **Backfill** – follow the storage migration in `docs/modules/devops/migrations/semver-style.md` to rewrite historical advisories before switching consumers.
3. **Verify** – run the aggregation above (with `explain("executionStats")`) to ensure the new indexes are used.
4. **Cutover** – after consumers switch to normalized rules, mark the old `rangeExpression` as deprecated.

## 7. Checklist for connectors & merge

- [ ] Populate `SemVerPrimitive` for every SemVer-friendly constraint.
- [ ] Call `ToNormalizedVersionRule` and store the result.
- [ ] Emit provenance masks covering both `versionRanges[].primitives.semver` and `normalizedVersions[]`.
- [ ] Ensure merge deduping relies on the canonical comparer.
- [ ] Capture merge decisions via `decisionReason`.
- [ ] Confirm integration tests include fixtures with normalized rules and SemVer styles.

For deeper query examples and maintenance tasks, continue with [Normalized Versions Query Guide](mongo_indices.md).

## 8. Storage projection reference

`NormalizedVersionDocumentFactory` copies each normalized rule into MongoDB using the shape below.
Use this as a contract when reviewing connector fixtures or diagnosing merge/storage diffs:

```json
{
  "packageId": "pkg:npm/example",
  "packageType": "npm",
  "scheme": "semver",
  "type": "range",
  "style": "range",
  "min": "1.2.3",
  "minInclusive": true,
  "max": "2.0.0",
  "maxInclusive": false,
  "value": null,
  "notes": "ghsa:GHSA-xxxx-yyyy",
  "decisionReason": "ghsa-precedence-over-nvd",
  "constraint": ">= 1.2.3 < 2.0.0",
  "source": "ghsa",
  "recordedAt": "2025-10-11T00:00:00Z"
}
```

For distro-specific ranges (`nevra`, `evr`) the same envelope applies with `scheme` switched accordingly. Example:

```json
{
  "packageId": "bash",
  "packageType": "rpm",
  "scheme": "nevra",
  "type": "range",
  "style": "range",
  "min": "0:4.4.18-2.el7",
  "minInclusive": true,
  "max": "0:4.4.20-1.el7",
  "maxInclusive": false,
  "value": null,
  "notes": "redhat:RHSA-2025:1234",
  "decisionReason": "rhel-priority-over-nvd",
  "constraint": "<= 0:4.4.20-1.el7",
  "source": "redhat",
  "recordedAt": "2025-10-11T00:00:00Z"
}
```

If a new scheme is required (for example, `apple.build` or `ios.semver`), raise it with the Models team before emitting documents so merge comparers and hashing logic can incorporate the change deterministically.

## 9. Observability signals

- `concelier.merge.normalized_rules` (counter, tags: `package_type`, `scheme`) – increments once per normalized rule retained after precedence merge.
- `concelier.merge.normalized_rules_missing` (counter, tags: `package_type`) – increments when a merged package still carries version ranges but no normalized rules; watch for spikes to catch connectors that have not emitted normalized arrays yet.
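
When reviewing fixtures against the storage projection in section 8, it can help to check by hand whether a given version falls inside a projected rule. MongoDB compares strings lexicographically under its default collation, so `"4.17.21"` sorts before `"4.2.0"`; a numeric comparison on the client is one way to cross-check query results. The sketch below is illustrative only: `parseCore`, `compareSemVer`, and `ruleMatches` are hypothetical names, not part of Concelier. It assumes plain `MAJOR.MINOR.PATCH` versions (pre-release and build metadata are rejected) and assumes that `lt`/`lte` rules carry their bound in `max` and `gt`/`gte` rules in `min`, by analogy with the pipeline in section 4.

```javascript
// Hypothetical helpers; only the field names (scheme, type, min, max,
// minInclusive, maxInclusive, value) come from the projection in section 8.

// Parse "MAJOR.MINOR.PATCH" into three numbers.
// Assumption: pre-release and build metadata are out of scope, so reject them.
function parseCore(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`unsupported version: ${version}`);
  }
  return match.slice(1).map(Number);
}

// Numeric, component-wise comparison: negative, zero, or positive.
function compareSemVer(a, b) {
  const left = parseCore(a);
  const right = parseCore(b);
  for (let i = 0; i < 3; i += 1) {
    if (left[i] !== right[i]) {
      return left[i] - right[i];
    }
  }
  return 0;
}

// Decide whether `version` is covered by one projected rule document.
function ruleMatches(rule, version) {
  if (rule.scheme !== "semver") {
    return false; // nevra/evr need their own comparers
  }
  switch (rule.type) {
    case "exact":
      return compareSemVer(version, rule.value) === 0;
    case "range": {
      const lower = compareSemVer(version, rule.min);
      const upper = compareSemVer(version, rule.max);
      const lowerOk = rule.minInclusive ? lower >= 0 : lower > 0;
      const upperOk = rule.maxInclusive ? upper <= 0 : upper < 0;
      return lowerOk && upperOk;
    }
    case "gte":
      return compareSemVer(version, rule.min) >= 0;
    case "gt":
      return compareSemVer(version, rule.min) > 0;
    case "lte":
      return compareSemVer(version, rule.max) <= 0;
    case "lt":
      return compareSemVer(version, rule.max) < 0;
    default:
      return false;
  }
}

// Usage: the range rule from the section 8 example.
const exampleRule = {
  scheme: "semver", type: "range",
  min: "1.2.3", minInclusive: true,
  max: "2.0.0", maxInclusive: false,
};
console.log(ruleMatches(exampleRule, "1.10.0")); // numeric compare places 1.10.0 above 1.2.3
```

The `range` branch reads the inclusive flags from the document, while the single-bound types derive inclusivity from `type` itself, so the check does not depend on connectors populating flags for `gt`/`gte`/`lt`/`lte` rules.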