# Normalized Versions Query Guide

This guide complements the Sprint 1–2 normalized versions rollout. It documents recommended indexes and aggregation patterns for querying `AffectedPackage.normalizedVersions`.

For a field-by-field look at how normalized rules persist in MongoDB (including provenance metadata), see Section 8 of the [Concelier SemVer Merge Playbook](merge_semver_playbook.md).

## 1. Recommended indexes

When `concelier.storage.enableSemVerStyle` is enabled, advisories expose a flattened
`normalizedVersions` array at the document root. Create these indexes in `mongosh`
after the migration completes (adjust the collection name if you use a prefix):

```javascript
// Compound index: package identifier, then scheme, then rule type.
db.advisories.createIndex(
  {
    "normalizedVersions.packageId": 1,
    "normalizedVersions.scheme": 1,
    "normalizedVersions.type": 1
  },
  { name: "advisory_normalizedVersions_pkg_scheme_type" }
);

// Sparse index: only documents that carry normalizedVersions.value are indexed.
db.advisories.createIndex(
  { "normalizedVersions.value": 1 },
  { name: "advisory_normalizedVersions_value", sparse: true }
);
```

- The compound index accelerates `$match` stages that filter by package identifier and rule style without unwinding `affectedPackages`.
- The sparse index keeps storage costs low while supporting pure exact-version lookups (type `exact`).

The storage bootstrapper creates the same indexes automatically when the feature flag is enabled.

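To confirm that the bootstrapper (or a manual run) produced both indexes, the output of `db.advisories.getIndexes()` can be compared against the expected names. The following is a minimal sketch: the helper `missingIndexes` and the sample array are illustrative and not part of Concelier.

```javascript
// Index names taken from the createIndex calls in this section.
const EXPECTED_INDEXES = [
  "advisory_normalizedVersions_pkg_scheme_type",
  "advisory_normalizedVersions_value"
];

// Hypothetical helper: returns the expected index names that are absent
// from an array shaped like the result of db.advisories.getIndexes().
function missingIndexes(indexSpecs, expected = EXPECTED_INDEXES) {
  const present = new Set(indexSpecs.map((spec) => spec.name));
  return expected.filter((name) => !present.has(name));
}

// Illustrative input; in mongosh, pass db.advisories.getIndexes() instead.
const sampleSpecs = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  {
    v: 2,
    key: { "normalizedVersions.value": 1 },
    name: "advisory_normalizedVersions_value",
    sparse: true
  }
];

const missing = missingIndexes(sampleSpecs);
console.log(missing); // [ 'advisory_normalizedVersions_pkg_scheme_type' ]
```

An empty result means both indexes are present under the names used in this guide.
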
## 2. Query patterns

### 2.1 Determine if a specific version is affected

```javascript
db.advisories.aggregate([
  // Narrow to advisories that mention the package (prefix of the compound index).
  { $match: { "normalizedVersions.packageId": "pkg:npm/lodash" } },
  { $unwind: "$normalizedVersions" },
  // Repeat the package filter so rules for other packages in the same advisory are skipped.
  { $match: {
      "normalizedVersions.packageId": "pkg:npm/lodash",
      $or: [
        { "normalizedVersions.type": "exact",
          "normalizedVersions.value": "4.17.21" },
        { "normalizedVersions.type": "range",
          "normalizedVersions.min": { $lte: "4.17.21" },
          "normalizedVersions.max": { $gt: "4.17.21" } },
        { "normalizedVersions.type": "gte",
          "normalizedVersions.min": { $lte: "4.17.21" } },
        { "normalizedVersions.type": "lte",
          "normalizedVersions.max": { $gte: "4.17.21" } }
      ]
  }},
  { $project: { advisoryKey: 1, title: 1, "normalizedVersions.packageId": 1 } }
]);
```

Use this pipeline during Sprint 2 staging validation runs. Invoke `explain("executionStats")` to confirm the compound index is selected.

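Explain output nests differently depending on the server version and on whether the pipeline runs through a `$cursor` stage, so a recursive search for `IXSCAN` nodes is one way to check the plan without hard-coding a path. This is a minimal sketch: `indexesUsed` and the sample plan are hypothetical, and real explain documents contain many more fields.

```javascript
// Hypothetical helper: collect the indexName of every IXSCAN node in an
// explain document, ignoring plans the optimizer rejected.
function indexesUsed(node, found = new Set()) {
  if (Array.isArray(node)) {
    node.forEach((child) => indexesUsed(child, found));
  } else if (node !== null && typeof node === "object") {
    if (node.stage === "IXSCAN" && typeof node.indexName === "string") {
      found.add(node.indexName);
    }
    for (const [key, child] of Object.entries(node)) {
      if (key !== "rejectedPlans") indexesUsed(child, found);
    }
  }
  return found;
}

// Trimmed, illustrative plan fragment (not captured from a real server).
const samplePlan = {
  stages: [
    { $cursor: { queryPlanner: {
        winningPlan: {
          stage: "FETCH",
          inputStage: {
            stage: "IXSCAN",
            indexName: "advisory_normalizedVersions_pkg_scheme_type"
          }
        },
        rejectedPlans: [
          { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "some_other_index" } }
        ]
    } } },
    { $unwind: { path: "$normalizedVersions" } }
  ]
};

const used = indexesUsed(samplePlan);
console.log(used.has("advisory_normalizedVersions_pkg_scheme_type")); // true
```

In mongosh, the input would be the document returned by `db.advisories.explain("executionStats").aggregate(pipeline)`.
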
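The `$lte`, `$gte`, and `$gt` operators in the 2.1 pipeline compare `min` and `max` as strings, and string comparison is lexicographic: `"4.9.0"` sorts after `"4.17.21"`. Unless the stored bounds are in a form that sorts correctly as text (which this guide does not state), it may be worth re-checking candidates in application code. The sketch below assumes plain `major.minor.patch` versions without pre-release tags, assumes that `minInclusive` and `maxInclusive` (the fields that appear in the grouping key in section 2.3) are booleans, and falls back to the inclusivity implied by the pipeline when they are absent. `compareVersions` and `ruleMatches` are hypothetical helpers.

```javascript
// Compare two "major.minor.patch" strings numerically.
// Returns a negative number, zero, or a positive number.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical re-check of one unwound normalizedVersions rule.
// Defaults mirror the 2.1 pipeline (min inclusive; max exclusive for
// "range", inclusive for "lte") and are assumptions.
function ruleMatches(rule, version) {
  const aboveMin = () => {
    const c = compareVersions(version, rule.min);
    return rule.minInclusive === false ? c > 0 : c >= 0;
  };
  const belowMax = (inclusiveByDefault) => {
    const c = compareVersions(version, rule.max);
    const inclusive = rule.maxInclusive ?? inclusiveByDefault;
    return inclusive ? c <= 0 : c < 0;
  };
  switch (rule.type) {
    case "exact": return compareVersions(version, rule.value) === 0;
    case "range": return aboveMin() && belowMax(false);
    case "gte":   return aboveMin();
    case "lte":   return belowMax(true);
    default:      return false;
  }
}

const rule = { type: "range", min: "4.9.0", max: "4.17.22" };
console.log(ruleMatches(rule, "4.17.21")); // true (numeric comparison)
console.log("4.9.0" <= "4.17.21");         // false (string comparison)
```
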
### 2.2 Locate advisories missing normalized rules

```javascript
db.advisories.aggregate([
  { $match: { $or: [
      { "normalizedVersions": { $exists: false } },
      { "normalizedVersions": { $size: 0 } }
    ] } },
  { $project: { advisoryKey: 1, affectedPackages: 1 } }
]);
```

Run this query after backfill jobs to identify gaps that still rely solely on `rangeExpression`.

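One detail worth knowing: `$exists: false` matches only documents that lack the field entirely, and `$size: 0` matches only arrays, so a document that stores `normalizedVersions: null` would be returned by neither branch. Whether such documents can occur in this collection is not stated here. The hypothetical predicate below mirrors the query for use against exported fixtures and also treats `null` as missing.

```javascript
// Hypothetical predicate: true when an advisory document carries no
// normalized rules. Unlike the $match in 2.2, it also flags null.
function lacksNormalizedRules(advisory) {
  const rules = advisory.normalizedVersions;
  if (rules === undefined || rules === null) return true;
  return Array.isArray(rules) && rules.length === 0;
}

// Illustrative documents, not real advisories.
const fixtures = [
  { advisoryKey: "A-1", normalizedVersions: [{ type: "exact", value: "1.0.0" }] },
  { advisoryKey: "A-2", normalizedVersions: [] },
  { advisoryKey: "A-3" },
  { advisoryKey: "A-4", normalizedVersions: null }
];

const gaps = fixtures.filter(lacksNormalizedRules).map((doc) => doc.advisoryKey);
console.log(gaps); // [ 'A-2', 'A-3', 'A-4' ]
```

If the null case matters for the query itself, adding a third `$or` branch such as `{ "normalizedVersions": null }` is one option, since an equality match on `null` covers both null and absent fields.
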
### 2.3 Deduplicate overlapping rules

```javascript
db.advisories.aggregate([
  { $unwind: "$normalizedVersions" },
  // Group by every field that defines the constraint.
  { $group: {
      _id: {
        identifier: "$normalizedVersions.packageId",
        scheme: "$normalizedVersions.scheme",
        type: "$normalizedVersions.type",
        min: "$normalizedVersions.min",
        minInclusive: "$normalizedVersions.minInclusive",
        max: "$normalizedVersions.max",
        maxInclusive: "$normalizedVersions.maxInclusive",
        value: "$normalizedVersions.value"
      },
      advisories: { $addToSet: "$advisoryKey" },
      notes: { $addToSet: "$normalizedVersions.notes" }
  }},
  // Keep only constraints that appear in two or more advisories.
  { $match: { "advisories.1": { $exists: true } } },
  { $sort: { "_id.identifier": 1, "_id.type": 1 } }
]);
```

Use this to confirm the merge dedupe logic keeps only one normalized rule per unique constraint.

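The same grouping can be reproduced in memory, for example against export fixtures in a unit test. The sketch below builds a key from the fields used in the `$group` stage and reports keys shared by more than one advisory. `ruleKey`, `sharedRules`, the sample documents, and the `"semver"` scheme value are hypothetical. Unlike `$group`, the sketch treats an absent field and an explicit `null` as the same key component.

```javascript
// Fields that make up the grouping key, in the same order as the $group stage.
const KEY_FIELDS = [
  "packageId", "scheme", "type",
  "min", "minInclusive", "max", "maxInclusive", "value"
];

// Serialize a rule into a stable key; absent fields become null.
function ruleKey(rule) {
  return JSON.stringify(KEY_FIELDS.map((field) => rule[field] ?? null));
}

// Return a Map of key -> sorted advisory keys, keeping only constraints
// that appear in two or more advisories (the "advisories.1" check).
function sharedRules(advisories) {
  const groups = new Map();
  for (const advisory of advisories) {
    for (const rule of advisory.normalizedVersions ?? []) {
      const key = ruleKey(rule);
      if (!groups.has(key)) groups.set(key, new Set());
      groups.get(key).add(advisory.advisoryKey);
    }
  }
  const shared = new Map();
  for (const [key, keys] of groups) {
    if (keys.size > 1) shared.set(key, [...keys].sort());
  }
  return shared;
}

// Illustrative documents, not real advisories.
const exactRule = { packageId: "pkg:npm/lodash", scheme: "semver", type: "exact", value: "4.17.21" };
const sample = [
  { advisoryKey: "A-1", normalizedVersions: [{ ...exactRule }] },
  { advisoryKey: "A-2", normalizedVersions: [{ ...exactRule }] },
  { advisoryKey: "A-3", normalizedVersions: [
      { packageId: "pkg:npm/lodash", scheme: "semver", type: "gte", min: "5.0.0" }
  ] }
];

const shared = sharedRules(sample);
console.log([...shared.values()]); // [ [ 'A-1', 'A-2' ] ]
```
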
## 3. Operational checklist

- [ ] Create the indexes in staging before toggling dual-write in production.
- [ ] Capture explain plans and attach them to the release notes.
- [ ] Notify downstream services that consume advisory snapshots about the new `normalizedVersions` array.
- [ ] Update export fixtures once dedupe verification passes.

Additional background and mapper examples live in the [Concelier SemVer Merge Playbook](merge_semver_playbook.md).