more features checks. setup improvements
@@ -0,0 +1,39 @@
# Deterministic Semantic Merge Hash for Advisory Deduplication

## Module

Concelier

## Status

VERIFIED

## Description

Computes an identity-based semantic hash from the tuple (CVE + PURL/CPE + version-range + CWE + patch_lineage) for cross-distro advisory deduplication. Includes normalizers (PURL, CPE, version range, CWE, patch lineage), golden corpus validation (Debian/RHEL/SUSE/Alpine), fuzzing tests (1000 random inputs), a shadow-write migration mode, and a backfill service. This feature is distinct from "Advisory Ingestion with Canonical Deduplication", which covers the overall dedup concept; this entry covers the specific merge_hash identity algorithm.
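
The identity tuple described above can be illustrated with a minimal Python sketch. This is not the `MergeHashCalculator` implementation (which lives in the C# library); the normalization rules, the `|` field separator, the SHA-256 digest, the `sha256:` prefix, and every name below are assumptions chosen only to show how normalizing each field before hashing makes the result independent of source formatting.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AdvisoryIdentity:
    """Hypothetical input record; field names are illustrative only."""
    cve: str
    package: str                      # PURL or CPE string
    version_range: str
    cwes: Iterable[str] = ()
    patch_lineage: Optional[str] = None


def normalize_cve(cve: str) -> str:
    # Assumed rule: CVE identifiers compare case-insensitively.
    return cve.strip().upper()


def normalize_package(package: str) -> str:
    # Assumed rule: trim and lowercase; for PURLs, drop qualifiers and subpath.
    value = package.strip().lower()
    if value.startswith("pkg:"):
        value = re.split(r"[?#]", value, maxsplit=1)[0]
    return value


def normalize_version_range(version_range: str) -> str:
    # Assumed rule: whitespace inside a range expression is insignificant.
    return re.sub(r"\s+", "", version_range)


def normalize_cwes(cwes: Iterable[str]) -> str:
    # Assumed rule: canonical "CWE-<n>" form, de-duplicated, sorted numerically.
    numbers = set()
    for cwe in cwes:
        match = re.search(r"(\d+)", cwe)
        if match:
            numbers.add(int(match.group(1)))
    return ",".join(f"CWE-{n}" for n in sorted(numbers))


def normalize_patch_lineage(lineage: Optional[str]) -> str:
    # Assumed rule: a missing lineage hashes as the empty string.
    return (lineage or "").strip().lower()


def compute_merge_hash(identity: AdvisoryIdentity) -> str:
    """Join the normalized fields in a fixed order and hash the result."""
    canonical = "|".join([
        normalize_cve(identity.cve),
        normalize_package(identity.package),
        normalize_version_range(identity.version_range),
        normalize_cwes(identity.cwes),
        normalize_patch_lineage(identity.patch_lineage),
    ])
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


if __name__ == "__main__":
    # Made-up values: the same advisory as two sources might format it.
    source_a = AdvisoryIdentity("cve-2099-0001", "pkg:deb/debian/examplelib?arch=amd64",
                                "< 1.2.3", ["cwe-79", "CWE-079"], "ABC123")
    source_b = AdvisoryIdentity("CVE-2099-0001", "pkg:deb/debian/examplelib",
                                "<1.2.3", ["CWE-79"], "abc123")
    print(compute_merge_hash(source_a) == compute_merge_hash(source_b))
```

Because the fields are joined in a fixed order after normalization, repeated calls with the same input always yield the same string, which is the property the determinism and golden corpus checks rely on.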

## Implementation Details

- **Modules**: `src/Concelier/__Libraries/StellaOps.Concelier.Merge/Identity/`, `src/Concelier/__Libraries/StellaOps.Concelier.Merge/Services/`, `src/Concelier/__Libraries/StellaOps.Concelier.Merge/Jobs/`
- **Key Classes**:
  - `MergeHashCalculator` (`src/Concelier/__Libraries/StellaOps.Concelier.Merge/Identity/MergeHashCalculator.cs`) - computes deterministic semantic hash from (CVE + PURL/CPE + version-range + CWE + patch_lineage) with input normalizers
  - `MergeHashShadowWriteService` (`src/Concelier/__Libraries/StellaOps.Concelier.Merge/Identity/MergeHashShadowWriteService.cs`) - shadow-write mode for migration validation
  - `MergeHashBackfillService` (`src/Concelier/__Libraries/StellaOps.Concelier.Merge/Services/MergeHashBackfillService.cs`) - retroactive backfill of merge hashes for existing advisories
  - `MergeHashBackfillJob` (`src/Concelier/__Libraries/StellaOps.Concelier.Merge/Jobs/MergeHashBackfillJob.cs`) - scheduled `IJob` for backfill execution
- **Interfaces**: `IMergeHashCalculator`
- **Source**: `SPRINT_8200_0012_0001_CONCEL_merge_hash_library.md`

## Verification Evidence

- **Run**: run-002 (2026-02-13)
- **Test project**: StellaOps.Concelier.Merge.Tests (731/731 pass)
- **Baseline**: 687 existing tests + 44 new tests
- **New test files**:
  - `MergeHashShadowWriteServiceTests.cs` (16 tests): backfill-all, backfill-one, skip-if-hash-exists, force recompute, error resilience, cancellation, field preservation
  - `MergeHashBackfillServiceTests.cs` (18 tests): dry-run mode, skip-if-hash-exists, error counting, cancellation, duration, SuccessRate/AvgTimePerAdvisoryMs metrics
  - `MergeHashBackfillJobTests.cs` (10 tests): IJob parameter parsing (seed/force routing, empty seed fallback, type-safe force)
- **Existing coverage**: MergeHashCalculatorTests (20), GoldenCorpusTests (10), FuzzingTests (5) - all assertions verified meaningful
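
The backfill behaviors named in the test list (dry-run, skip-if-hash-exists, force recompute, error counting, cancellation, and the SuccessRate/AvgTimePerAdvisoryMs metrics) can be sketched in Python as follows. This is an illustration only, not the `MergeHashBackfillService` code: the record shape, the parameter names, and the metric formulas (success rate as a percentage of processed advisories that did not fail) are assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Advisory:
    """Hypothetical stored advisory; only the fields the sketch needs."""
    advisory_id: str
    merge_hash: Optional[str] = None


@dataclass
class BackfillResult:
    processed: int = 0
    updated: int = 0      # in dry-run mode: advisories that would be updated
    skipped: int = 0
    errors: int = 0
    duration_ms: float = 0.0

    @property
    def success_rate(self) -> float:
        # Assumed formula: percentage of processed advisories without an error.
        if self.processed == 0:
            return 0.0
        return 100.0 * (self.processed - self.errors) / self.processed

    @property
    def avg_time_per_advisory_ms(self) -> float:
        return self.duration_ms / self.processed if self.processed else 0.0


def backfill(
    advisories: Iterable[Advisory],
    compute_hash: Callable[[Advisory], str],
    *,
    dry_run: bool = False,
    force: bool = False,
    is_cancelled: Callable[[], bool] = lambda: False,
) -> BackfillResult:
    result = BackfillResult()
    started = time.perf_counter()
    for advisory in advisories:
        if is_cancelled():
            break                       # stop early, keep the partial counts
        result.processed += 1
        if advisory.merge_hash and not force:
            result.skipped += 1         # skip-if-hash-exists
            continue
        try:
            new_hash = compute_hash(advisory)
        except Exception:
            result.errors += 1          # count the failure and keep going
            continue
        if not dry_run:
            advisory.merge_hash = new_hash
        result.updated += 1
    result.duration_ms = (time.perf_counter() - started) * 1000.0
    return result
```

Counting errors instead of raising lets one malformed advisory fail without aborting the whole run, which matches the "error resilience" and "error counting" items above.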

## E2E Test Plan

- [x] Compute merge hash for two semantically identical advisories from different sources (e.g., Debian and RHEL for same CVE) and verify identical hash output
- [x] Verify PURL normalization: different PURL formats for the same package produce the same merge hash
- [x] Verify CPE normalization: equivalent CPE strings produce identical hashes
- [x] Verify determinism: same input produces the same hash across 1000 repeated computations
- [x] Verify golden corpus: validate merge hash against the golden corpus of known Debian/RHEL/SUSE/Alpine advisories
- [x] Verify shadow-write mode: enable shadow writes and confirm both old and new hashes are persisted for comparison
- [x] Verify backfill: run `MergeHashBackfillJob` and confirm pre-existing advisories receive computed merge hashes
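
The shadow-write check in the plan above persists the new hash next to the existing one so the two can be compared before cutover. A minimal Python sketch of that idea follows, using an in-memory dict as the store. The field names `legacy_hash` and `shadow_merge_hash`, the `enabled` flag, and the pairwise grouping comparison are all hypothetical and are not taken from `MergeHashShadowWriteService`.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class StoredAdvisory:
    """Hypothetical record holding both identities during the migration."""
    advisory_id: str
    legacy_hash: str                          # identity readers still use
    shadow_merge_hash: Optional[str] = None   # new hash, written but not read


def shadow_write(
    store: Dict[str, StoredAdvisory],
    advisory_id: str,
    compute_merge_hash: Callable[[StoredAdvisory], str],
    enabled: bool,
) -> StoredAdvisory:
    """Write the new hash alongside the old one; never touch legacy_hash."""
    record = store[advisory_id]
    if enabled:
        record.shadow_merge_hash = compute_merge_hash(record)
    return record


def find_grouping_differences(
    store: Dict[str, StoredAdvisory],
) -> List[Tuple[str, str]]:
    """Return pairs that one hash groups together and the other does not.

    The two hash values differ as strings by construction, so the useful
    comparison is whether they agree on which advisories are duplicates.
    """
    records = sorted(
        (r for r in store.values() if r.shadow_merge_hash is not None),
        key=lambda r: r.advisory_id,
    )
    differences = []
    for left, right in combinations(records, 2):
        same_legacy = left.legacy_hash == right.legacy_hash
        same_shadow = left.shadow_merge_hash == right.shadow_merge_hash
        if same_legacy != same_shadow:
            differences.append((left.advisory_id, right.advisory_id))
    return differences
```

Keeping reads on the legacy field while only writing the new one means the comparison can run against production data without changing deduplication behavior until the differences have been reviewed.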