Resolve Concelier/Excititor merge conflicts

This commit is contained in:
root
2025-10-20 14:19:25 +03:00
2687 changed files with 212646 additions and 85913 deletions

View File

@@ -1,154 +1,154 @@
# Feedser SemVer Merge Playbook (Sprint 12)
This playbook describes how the merge layer and connector teams should emit the new SemVer primitives introduced in Sprint12, how those primitives become normalized version rules, and how downstream jobs query them deterministically.
## 1. What landed in Sprint12
- `RangePrimitives.SemVer` now infers a canonical `style` (`range`, `exact`, `lt`, `lte`, `gt`, `gte`) and captures `exactValue` when the constraint is a single version.
- `NormalizedVersionRule` documents the analytics-friendly projection of each `AffectedPackage` coverage entry and is persisted alongside legacy `versionRanges`.
- `AdvisoryProvenance.decisionReason` records whether merge resolution favored precedence, freshness, or a tie-breaker comparison.
See `src/StellaOps.Feedser.Models/CANONICAL_RECORDS.md` for the full schema and field descriptions.
## 2. Mapper pattern
Connectors should emit SemVer primitives as soon as they can normalize a vendor constraint. The helper `SemVerPrimitiveExtensions.ToNormalizedVersionRule` turns those primitives into the persisted rules:
```csharp
var primitive = new SemVerPrimitive(
introduced: "1.2.3",
introducedInclusive: true,
fixed: "2.0.0",
fixedInclusive: false,
lastAffected: null,
lastAffectedInclusive: false,
constraintExpression: ">=1.2.3 <2.0.0",
exactValue: null);
var rule = primitive.ToNormalizedVersionRule(notes: "nvd:CVE-2025-1234");
// rule => scheme=semver, type=range, min=1.2.3, minInclusive=true, max=2.0.0, maxInclusive=false
```
If you omit the optional `notes` argument, `ToNormalizedVersionRule` now falls back to the primitives `ConstraintExpression`, ensuring the original comparator expression is preserved for provenance/audit queries.
Emit the resulting rule inside `AffectedPackage.NormalizedVersions` while continuing to populate `AffectedVersionRange.RangeExpression` for backward compatibility.
## 3. Merge dedupe flow
During merge, feed all package candidates through `NormalizedVersionRuleComparer.Instance` prior to persistence. The comparer orders by scheme → type → min → minInclusive → max → maxInclusive → value → notes, guaranteeing consistent document layout and making `$unwind` pipelines deterministic.
If multiple connectors emit identical constraints, the merge layer should:
1. Combine provenance entries (preserving one per source).
2. Preserve a single normalized rule instance (thanks to `NormalizedVersionRuleEqualityComparer.Instance`).
3. Attach `decisionReason="precedence"` if one source overrides another.
## 4. Example Mongo pipeline
Use the following aggregation to locate advisories that affect a specific SemVer:
```javascript
db.advisories.aggregate([
{ $match: { "affectedPackages.type": "semver", "affectedPackages.identifier": "pkg:npm/lodash" } },
{ $unwind: "$affectedPackages" },
{ $unwind: "$affectedPackages.normalizedVersions" },
{ $match: {
$or: [
{ "affectedPackages.normalizedVersions.type": "exact",
"affectedPackages.normalizedVersions.value": "4.17.21" },
{ "affectedPackages.normalizedVersions.type": "range",
"affectedPackages.normalizedVersions.min": { $lte: "4.17.21" },
"affectedPackages.normalizedVersions.max": { $gt: "4.17.21" } },
{ "affectedPackages.normalizedVersions.type": "gte",
"affectedPackages.normalizedVersions.min": { $lte: "4.17.21" } },
{ "affectedPackages.normalizedVersions.type": "lte",
"affectedPackages.normalizedVersions.max": { $gte: "4.17.21" } }
]
}},
{ $project: { advisoryKey: 1, title: 1, "affectedPackages.identifier": 1 } }
]);
```
Pair this query with the indexes listed in [Normalized Versions Query Guide](mongo_indices.md).
## 5. Recommended indexes
| Collection | Index | Purpose |
|------------|-------|---------|
| `advisory` | `{ "affectedPackages.identifier": 1, "affectedPackages.normalizedVersions.scheme": 1, "affectedPackages.normalizedVersions.type": 1 }` (compound, multikey) | Speeds up `$match` on identifier + rule style. |
| `advisory` | `{ "affectedPackages.normalizedVersions.value": 1 }` (sparse) | Optimizes lookups for exact version hits. |
Coordinate with the Storage team when enabling these indexes so deployment windows account for collection size.
## 6. Dual-write rollout
Follow the operational checklist in `docs/ops/migrations/SEMVER_STYLE.md`. The summary:
1. **Dual write (now)** emit both legacy `versionRanges` and the new `normalizedVersions`.
2. **Backfill** follow the storage migration in `docs/ops/migrations/SEMVER_STYLE.md` to rewrite historical advisories before switching consumers.
3. **Verify** run the aggregation above (with `explain("executionStats")`) to ensure the new indexes are used.
4. **Cutover** after consumers switch to normalized rules, mark the old `rangeExpression` as deprecated.
## 7. Checklist for connectors & merge
- [ ] Populate `SemVerPrimitive` for every SemVer-friendly constraint.
- [ ] Call `ToNormalizedVersionRule` and store the result.
- [ ] Emit provenance masks covering both `versionRanges[].primitives.semver` and `normalizedVersions[]`.
- [ ] Ensure merge deduping relies on the canonical comparer.
- [ ] Capture merge decisions via `decisionReason`.
- [ ] Confirm integration tests include fixtures with normalized rules and SemVer styles.
For deeper query examples and maintenance tasks, continue with [Normalized Versions Query Guide](mongo_indices.md).
## 8. Storage projection reference
`NormalizedVersionDocumentFactory` copies each normalized rule into MongoDB using the shape below. Use this as a contract when reviewing connector fixtures or diagnosing merge/storage diffs:
```json
{
"packageId": "pkg:npm/example",
"packageType": "npm",
"scheme": "semver",
"type": "range",
"style": "range",
"min": "1.2.3",
"minInclusive": true,
"max": "2.0.0",
"maxInclusive": false,
"value": null,
"notes": "ghsa:GHSA-xxxx-yyyy",
"decisionReason": "ghsa-precedence-over-nvd",
"constraint": ">= 1.2.3 < 2.0.0",
"source": "ghsa",
"recordedAt": "2025-10-11T00:00:00Z"
}
```
For distro-specific ranges (`nevra`, `evr`) the same envelope applies with `scheme` switched accordingly. Example:
```json
{
"packageId": "bash",
"packageType": "rpm",
"scheme": "nevra",
"type": "range",
"style": "range",
"min": "0:4.4.18-2.el7",
"minInclusive": true,
"max": "0:4.4.20-1.el7",
"maxInclusive": false,
"value": null,
"notes": "redhat:RHSA-2025:1234",
"decisionReason": "rhel-priority-over-nvd",
"constraint": "<= 0:4.4.20-1.el7",
"source": "redhat",
"recordedAt": "2025-10-11T00:00:00Z"
}
```
If a new scheme is required (for example, `apple.build` or `ios.semver`), raise it with the Models team before emitting documents so merge comparers and hashing logic can incorporate the change deterministically.
## 9. Observability signals
- `feedser.merge.normalized_rules` (counter, tags: `package_type`, `scheme`) increments once per normalized rule retained after precedence merge.
- `feedser.merge.normalized_rules_missing` (counter, tags: `package_type`) increments when a merged package still carries version ranges but no normalized rules; watch for spikes to catch connectors that have not emitted normalized arrays yet.
# Concelier SemVer Merge Playbook (Sprint 12)
This playbook describes how the merge layer and connector teams should emit the new SemVer primitives introduced in Sprint12, how those primitives become normalized version rules, and how downstream jobs query them deterministically.
## 1. What landed in Sprint12
- `RangePrimitives.SemVer` now infers a canonical `style` (`range`, `exact`, `lt`, `lte`, `gt`, `gte`) and captures `exactValue` when the constraint is a single version.
- `NormalizedVersionRule` documents the analytics-friendly projection of each `AffectedPackage` coverage entry and is persisted alongside legacy `versionRanges`.
- `AdvisoryProvenance.decisionReason` records whether merge resolution favored precedence, freshness, or a tie-breaker comparison.
See `src/StellaOps.Concelier.Models/CANONICAL_RECORDS.md` for the full schema and field descriptions.
## 2. Mapper pattern
Connectors should emit SemVer primitives as soon as they can normalize a vendor constraint. The helper `SemVerPrimitiveExtensions.ToNormalizedVersionRule` turns those primitives into the persisted rules:
```csharp
var primitive = new SemVerPrimitive(
introduced: "1.2.3",
introducedInclusive: true,
fixed: "2.0.0",
fixedInclusive: false,
lastAffected: null,
lastAffectedInclusive: false,
constraintExpression: ">=1.2.3 <2.0.0",
exactValue: null);
var rule = primitive.ToNormalizedVersionRule(notes: "nvd:CVE-2025-1234");
// rule => scheme=semver, type=range, min=1.2.3, minInclusive=true, max=2.0.0, maxInclusive=false
```
If you omit the optional `notes` argument, `ToNormalizedVersionRule` now falls back to the primitives `ConstraintExpression`, ensuring the original comparator expression is preserved for provenance/audit queries.
Emit the resulting rule inside `AffectedPackage.NormalizedVersions` while continuing to populate `AffectedVersionRange.RangeExpression` for backward compatibility.
## 3. Merge dedupe flow
During merge, feed all package candidates through `NormalizedVersionRuleComparer.Instance` prior to persistence. The comparer orders by scheme → type → min → minInclusive → max → maxInclusive → value → notes, guaranteeing consistent document layout and making `$unwind` pipelines deterministic.
If multiple connectors emit identical constraints, the merge layer should:
1. Combine provenance entries (preserving one per source).
2. Preserve a single normalized rule instance (thanks to `NormalizedVersionRuleEqualityComparer.Instance`).
3. Attach `decisionReason="precedence"` if one source overrides another.
## 4. Example Mongo pipeline
Use the following aggregation to locate advisories that affect a specific SemVer:
```javascript
db.advisories.aggregate([
{ $match: { "affectedPackages.type": "semver", "affectedPackages.identifier": "pkg:npm/lodash" } },
{ $unwind: "$affectedPackages" },
{ $unwind: "$affectedPackages.normalizedVersions" },
{ $match: {
$or: [
{ "affectedPackages.normalizedVersions.type": "exact",
"affectedPackages.normalizedVersions.value": "4.17.21" },
{ "affectedPackages.normalizedVersions.type": "range",
"affectedPackages.normalizedVersions.min": { $lte: "4.17.21" },
"affectedPackages.normalizedVersions.max": { $gt: "4.17.21" } },
{ "affectedPackages.normalizedVersions.type": "gte",
"affectedPackages.normalizedVersions.min": { $lte: "4.17.21" } },
{ "affectedPackages.normalizedVersions.type": "lte",
"affectedPackages.normalizedVersions.max": { $gte: "4.17.21" } }
]
}},
{ $project: { advisoryKey: 1, title: 1, "affectedPackages.identifier": 1 } }
]);
```
Pair this query with the indexes listed in [Normalized Versions Query Guide](mongo_indices.md).
## 5. Recommended indexes
| Collection | Index | Purpose |
|------------|-------|---------|
| `advisory` | `{ "affectedPackages.identifier": 1, "affectedPackages.normalizedVersions.scheme": 1, "affectedPackages.normalizedVersions.type": 1 }` (compound, multikey) | Speeds up `$match` on identifier + rule style. |
| `advisory` | `{ "affectedPackages.normalizedVersions.value": 1 }` (sparse) | Optimizes lookups for exact version hits. |
Coordinate with the Storage team when enabling these indexes so deployment windows account for collection size.
## 6. Dual-write rollout
Follow the operational checklist in `docs/ops/migrations/SEMVER_STYLE.md`. The summary:
1. **Dual write (now)** emit both legacy `versionRanges` and the new `normalizedVersions`.
2. **Backfill** follow the storage migration in `docs/ops/migrations/SEMVER_STYLE.md` to rewrite historical advisories before switching consumers.
3. **Verify** run the aggregation above (with `explain("executionStats")`) to ensure the new indexes are used.
4. **Cutover** after consumers switch to normalized rules, mark the old `rangeExpression` as deprecated.
## 7. Checklist for connectors & merge
- [ ] Populate `SemVerPrimitive` for every SemVer-friendly constraint.
- [ ] Call `ToNormalizedVersionRule` and store the result.
- [ ] Emit provenance masks covering both `versionRanges[].primitives.semver` and `normalizedVersions[]`.
- [ ] Ensure merge deduping relies on the canonical comparer.
- [ ] Capture merge decisions via `decisionReason`.
- [ ] Confirm integration tests include fixtures with normalized rules and SemVer styles.
For deeper query examples and maintenance tasks, continue with [Normalized Versions Query Guide](mongo_indices.md).
## 8. Storage projection reference
`NormalizedVersionDocumentFactory` copies each normalized rule into MongoDB using the shape below. Use this as a contract when reviewing connector fixtures or diagnosing merge/storage diffs:
```json
{
"packageId": "pkg:npm/example",
"packageType": "npm",
"scheme": "semver",
"type": "range",
"style": "range",
"min": "1.2.3",
"minInclusive": true,
"max": "2.0.0",
"maxInclusive": false,
"value": null,
"notes": "ghsa:GHSA-xxxx-yyyy",
"decisionReason": "ghsa-precedence-over-nvd",
"constraint": ">= 1.2.3 < 2.0.0",
"source": "ghsa",
"recordedAt": "2025-10-11T00:00:00Z"
}
```
For distro-specific ranges (`nevra`, `evr`) the same envelope applies with `scheme` switched accordingly. Example:
```json
{
"packageId": "bash",
"packageType": "rpm",
"scheme": "nevra",
"type": "range",
"style": "range",
"min": "0:4.4.18-2.el7",
"minInclusive": true,
"max": "0:4.4.20-1.el7",
"maxInclusive": false,
"value": null,
"notes": "redhat:RHSA-2025:1234",
"decisionReason": "rhel-priority-over-nvd",
"constraint": "<= 0:4.4.20-1.el7",
"source": "redhat",
"recordedAt": "2025-10-11T00:00:00Z"
}
```
If a new scheme is required (for example, `apple.build` or `ios.semver`), raise it with the Models team before emitting documents so merge comparers and hashing logic can incorporate the change deterministically.
## 9. Observability signals
- `concelier.merge.normalized_rules` (counter, tags: `package_type`, `scheme`) increments once per normalized rule retained after precedence merge.
- `concelier.merge.normalized_rules_missing` (counter, tags: `package_type`) increments when a merged package still carries version ranges but no normalized rules; watch for spikes to catch connectors that have not emitted normalized arrays yet.