# Phase 5: Vulnerability Index Conversion (Concelier)

**Sprint:** 6-7
**Duration:** 2 sprints
**Status:** TODO
**Dependencies:** Phase 0 (Foundations)

---

## Objectives

1. Create `StellaOps.Concelier.Storage.Postgres` project
2. Implement full vulnerability schema in PostgreSQL
3. Build advisory conversion pipeline
4. Maintain deterministic vulnerability matching

---

## Deliverables

| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| Vuln schema | All tables created with indexes |
| Conversion pipeline | MongoDB advisories converted to PostgreSQL |
| Matching verification | Same CVEs found for identical SBOMs |
| Integration tests | 100% coverage of query operations |

---

## Schema Reference

See [SPECIFICATION.md](../SPECIFICATION.md) Section 5.2 for complete vulnerability schema.

**Tables:**
- `vuln.sources`
- `vuln.feed_snapshots`
- `vuln.advisory_snapshots`
- `vuln.advisories`
- `vuln.advisory_aliases`
- `vuln.advisory_cvss`
- `vuln.advisory_affected`
- `vuln.advisory_references`
- `vuln.advisory_credits`
- `vuln.advisory_weaknesses`
- `vuln.kev_flags`
- `vuln.source_states`
- `vuln.merge_events`

---

## Sprint 5a: Schema & Repositories

### T5a.1: Create Concelier.Storage.Postgres Project

**Status:** TODO
**Estimate:** 0.5 days

**Subtasks:**
- [ ] Create project structure
- [ ] Add NuGet references
- [ ] Create `ConcelierDataSource` class
- [ ] Create `ServiceCollectionExtensions.cs`

---

### T5a.2: Implement Schema Migrations

**Status:** TODO
**Estimate:** 1.5 days

**Subtasks:**
- [ ] Create schema migration
- [ ] Include all tables
- [ ] Add full-text search index
- [ ] Add PURL lookup index
- [ ] Test migration idempotency

---

### T5a.3: Implement Source Repository

**Status:** TODO
**Estimate:** 0.5 days

**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Implement GetByKeyAsync
- [ ] Write integration tests

---

### T5a.4: Implement Advisory Repository

**Status:** TODO
**Estimate:** 2 days

**Interface:**
```csharp
public interface IAdvisoryRepository
{
    // Single-advisory lookups return null when nothing matches.
    Task<Advisory?> GetByKeyAsync(string advisoryKey, CancellationToken ct);
    Task<Advisory?> GetByAliasAsync(string aliasType, string aliasValue, CancellationToken ct);

    // Full-text search over advisories.
    Task<IReadOnlyList<Advisory>> SearchAsync(AdvisorySearchQuery query, CancellationToken ct);

    // Writes the advisory together with all of its child tables.
    Task UpsertAsync(Advisory advisory, CancellationToken ct);

    // Package matching, by exact PURL or by ecosystem and name.
    Task<IReadOnlyList<Advisory>> GetAffectingPackageAsync(string purl, CancellationToken ct);
    Task<IReadOnlyList<Advisory>> GetAffectingPackageNameAsync(string ecosystem, string name, CancellationToken ct);
}
```

**Subtasks:**
- [ ] Implement GetByKeyAsync
- [ ] Implement GetByAliasAsync (CVE lookup)
- [ ] Implement SearchAsync with full-text search
- [ ] Implement UpsertAsync with all child tables
- [ ] Implement GetAffectingPackageAsync (PURL match)
- [ ] Implement GetAffectingPackageNameAsync
- [ ] Write integration tests

---

### T5a.5: Implement Child Table Repositories

**Status:** TODO
**Estimate:** 2 days

**Subtasks:**
- [ ] Implement Alias repository
- [ ] Implement CVSS repository
- [ ] Implement Affected repository
- [ ] Implement Reference repository
- [ ] Implement Credit repository
- [ ] Implement Weakness repository
- [ ] Implement KEV repository
- [ ] Write integration tests

---

### T5a.6: Implement Source State Repository

**Status:** TODO
**Estimate:** 0.5 days

**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Implement cursor management
- [ ] Write integration tests

---

## Sprint 5b: Conversion & Verification

### T5b.1: Build Advisory Conversion Service

**Status:** TODO
**Estimate:** 2 days

**Description:**
Create a service to convert MongoDB advisory documents to the PostgreSQL relational structure.


**Subtasks:**
- [ ] Parse MongoDB `AdvisoryDocument` structure
- [ ] Map to `vuln.advisories` table
- [ ] Extract and normalize aliases
- [ ] Extract and normalize CVSS metrics
- [ ] Extract and normalize affected packages
- [ ] Preserve provenance JSONB
- [ ] Handle version ranges (keep as JSONB)
- [ ] Handle normalized versions (keep as JSONB)

**Conversion Logic:**
```csharp
public sealed class AdvisoryConverter
{
    // Streams every MongoDB advisory document and upserts it into PostgreSQL.
    public async Task ConvertAsync(
        IMongoCollection<AdvisoryDocument> source,
        IAdvisoryRepository target,
        CancellationToken ct)
    {
        await foreach (var doc in source.AsAsyncEnumerable(ct))
        {
            var advisory = MapToAdvisory(doc);
            await target.UpsertAsync(advisory, ct);
        }
    }

    private Advisory MapToAdvisory(AdvisoryDocument doc)
    {
        // Extract from BsonDocument payload
        var payload = doc.Payload;
        return new Advisory
        {
            AdvisoryKey = doc.Id,
            PrimaryVulnId = payload["primaryVulnId"].AsString,
            Title = payload["title"]?.AsString,
            Summary = payload["summary"]?.AsString,
            // ... etc
            Provenance = BsonSerializer.Deserialize(payload["provenance"]),
        };
    }
}
```

---

### T5b.2: Build Feed Import Pipeline

**Status:** TODO
**Estimate:** 1 day

**Description:**
Modify feed import to write directly to PostgreSQL.

**Subtasks:**
- [ ] Update NVD importer to use PostgreSQL
- [ ] Update OSV importer to use PostgreSQL
- [ ] Update GHSA importer to use PostgreSQL
- [ ] Update vendor feed importers
- [ ] Test incremental imports

---

### T5b.3: Run Parallel Import

**Status:** TODO
**Estimate:** 1 day

**Description:**
Run imports to both MongoDB and PostgreSQL simultaneously.

**Subtasks:**
- [ ] Configure dual-import mode
- [ ] Run import cycle
- [ ] Compare record counts
- [ ] Sample comparison checks

---

### T5b.4: Verify Vulnerability Matching

**Status:** TODO
**Estimate:** 2 days

**Description:**
Verify that vulnerability matching produces identical results.

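
When the two backends disagree, a pass/fail assertion alone does not say which findings diverged. The following is a minimal sketch of a diff helper that could support documenting differences. The `Finding` record and the `FindingDiff` class are hypothetical names introduced for illustration; the real scanner result type is not shown in this plan. It checks set membership only, so it complements an ordering check rather than replacing one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical finding shape (assumption): one package PURL matched to one advisory key.
public sealed record Finding(string Purl, string AdvisoryKey);

public static class FindingDiff
{
    // Returns the findings that appear in only one backend's result set.
    public static (IReadOnlyList<Finding> OnlyInMongo, IReadOnlyList<Finding> OnlyInPostgres) Compare(
        IEnumerable<Finding> mongo,
        IEnumerable<Finding> postgres)
    {
        // Records compare by value, so HashSet membership works without a custom comparer.
        var mongoSet = new HashSet<Finding>(mongo);
        var postgresSet = new HashSet<Finding>(postgres);

        // Ordinal sorting keeps the diff report itself stable across runs and cultures.
        static IReadOnlyList<Finding> Sorted(IEnumerable<Finding> items) =>
            items.OrderBy(f => f.Purl, StringComparer.Ordinal)
                 .ThenBy(f => f.AdvisoryKey, StringComparer.Ordinal)
                 .ToList();

        return (Sorted(mongoSet.Except(postgresSet)), Sorted(postgresSet.Except(mongoSet)));
    }
}
```

Calling `FindingDiff.Compare(mongoFindings, postgresFindings)` yields two lists; both being empty means the backends agree on which findings exist, and any entries can be copied directly into the differences log.
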

**Subtasks:**
- [ ] Select sample SBOMs (various ecosystems)
- [ ] Run matching with MongoDB backend
- [ ] Run matching with PostgreSQL backend
- [ ] Compare findings (must be identical)
- [ ] Document any differences
- [ ] Fix any issues found

**Verification Tests:**
```csharp
[Theory]
[MemberData(nameof(GetSampleSboms))]
public async Task Scanner_Should_Find_Same_Vulns(string sbomPath)
{
    var sbom = await LoadSbom(sbomPath);

    _config["Persistence:Concelier"] = "Mongo";
    var mongoFindings = await _scanner.ScanAsync(sbom);

    _config["Persistence:Concelier"] = "Postgres";
    var postgresFindings = await _scanner.ScanAsync(sbom);

    // Strict ordering for determinism
    postgresFindings.Should().BeEquivalentTo(mongoFindings, options =>
        options.WithStrictOrdering());
}
```

---

### T5b.5: Performance Optimization

**Status:** TODO
**Estimate:** 1 day

**Subtasks:**
- [ ] Analyze slow queries with EXPLAIN ANALYZE
- [ ] Optimize indexes for common queries
- [ ] Consider partial indexes for active advisories
- [ ] Benchmark PostgreSQL vs MongoDB performance

---

### T5b.6: Switch Scanner to PostgreSQL

**Status:** TODO
**Estimate:** 0.5 days

**Subtasks:**
- [ ] Update configuration
- [ ] Deploy to staging
- [ ] Run full scan suite
- [ ] Deploy to production

---

## Exit Criteria

- [ ] All repository interfaces implemented
- [ ] Advisory conversion pipeline working
- [ ] Vulnerability matching produces identical results
- [ ] Feed imports working on PostgreSQL
- [ ] Concelier running on PostgreSQL in production

---

## Risks & Mitigations

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Matching discrepancies | Medium | High | Extensive comparison testing |
| Performance regression on queries | Medium | Medium | Index optimization, query tuning |
| Data loss during conversion | Low | High | Verify counts, sample checks |

---

## Data Volume Estimates

| Table | Estimated Rows | Growth Rate |
|-------|----------------|-------------|
| advisories | 300,000+ | ~100/day |
| advisory_aliases | 600,000+ | ~200/day |
| advisory_affected | 2,000,000+ | ~1000/day |
| advisory_cvss | 400,000+ | ~150/day |

---

*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*