
Phase 5: Vulnerability Index Conversion (Concelier)

Sprint: 6-7 | Duration: 2 sprints | Status: TODO | Dependencies: Phase 0 (Foundations)


Objectives

  1. Create StellaOps.Concelier.Storage.Postgres project
  2. Implement full vulnerability schema in PostgreSQL
  3. Build advisory conversion pipeline
  4. Maintain deterministic vulnerability matching

Deliverables

| Deliverable | Acceptance Criteria |
|---|---|
| Vuln schema | All tables created with indexes |
| Conversion pipeline | MongoDB advisories converted to PostgreSQL |
| Matching verification | Same CVEs found for identical SBOMs |
| Integration tests | 100% coverage of query operations |

Schema Reference

See SPECIFICATION.md Section 5.2 for complete vulnerability schema.

Tables:

  • vuln.sources
  • vuln.feed_snapshots
  • vuln.advisory_snapshots
  • vuln.advisories
  • vuln.advisory_aliases
  • vuln.advisory_cvss
  • vuln.advisory_affected
  • vuln.advisory_references
  • vuln.advisory_credits
  • vuln.advisory_weaknesses
  • vuln.kev_flags
  • vuln.source_states
  • vuln.merge_events

Sprint 5a: Schema & Repositories

T5a.1: Create Concelier.Storage.Postgres Project

Status: TODO | Estimate: 0.5 days

Subtasks:

  • Create project structure
  • Add NuGet references
  • Create ConcelierDataSource class
  • Create ServiceCollectionExtensions.cs

T5a.2: Implement Schema Migrations

Status: TODO | Estimate: 1.5 days

Subtasks:

  • Create schema migration
  • Include all tables
  • Add full-text search index
  • Add PURL lookup index
  • Test migration idempotency

T5a.3: Implement Source Repository

Status: TODO | Estimate: 0.5 days

Subtasks:

  • Implement CRUD operations
  • Implement GetByKeyAsync
  • Write integration tests

T5a.4: Implement Advisory Repository

Status: TODO | Estimate: 2 days

Interface:

public interface IAdvisoryRepository
{
    Task<Advisory?> GetByKeyAsync(string advisoryKey, CancellationToken ct);
    Task<Advisory?> GetByAliasAsync(string aliasType, string aliasValue, CancellationToken ct);
    Task<IReadOnlyList<Advisory>> SearchAsync(AdvisorySearchQuery query, CancellationToken ct);
    Task<Advisory> UpsertAsync(Advisory advisory, CancellationToken ct);
    Task<IReadOnlyList<Advisory>> GetAffectingPackageAsync(string purl, CancellationToken ct);
    Task<IReadOnlyList<Advisory>> GetAffectingPackageNameAsync(string ecosystem, string name, CancellationToken ct);
}

Subtasks:

  • Implement GetByKeyAsync
  • Implement GetByAliasAsync (CVE lookup)
  • Implement SearchAsync with full-text search
  • Implement UpsertAsync with all child tables
  • Implement GetAffectingPackageAsync (PURL match)
  • Implement GetAffectingPackageNameAsync
  • Write integration tests

T5a.5: Implement Child Table Repositories

Status: TODO | Estimate: 2 days

Subtasks:

  • Implement Alias repository
  • Implement CVSS repository
  • Implement Affected repository
  • Implement Reference repository
  • Implement Credit repository
  • Implement Weakness repository
  • Implement KEV repository
  • Write integration tests

T5a.6: Implement Source State Repository

Status: TODO | Estimate: 0.5 days

Subtasks:

  • Implement CRUD operations
  • Implement cursor management
  • Write integration tests

Sprint 5b: Conversion & Verification

T5b.1: Build Advisory Conversion Service

Status: TODO | Estimate: 2 days

Description: Create a service that converts MongoDB advisory documents to the PostgreSQL relational structure.

Subtasks:

  • Parse MongoDB AdvisoryDocument structure
  • Map to vuln.advisories table
  • Extract and normalize aliases
  • Extract and normalize CVSS metrics
  • Extract and normalize affected packages
  • Preserve provenance JSONB
  • Handle version ranges (keep as JSONB)
  • Handle normalized versions (keep as JSONB)

Conversion Logic:

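using System.Text.Json;
using MongoDB.Bson;
using MongoDB.Bson.IO;
using MongoDB.Driver;

public sealed class AdvisoryConverter
{
    // Relaxed Extended JSON is valid JSON; the driver's default shell output mode is not.
    private static readonly JsonWriterSettings JsonSettings =
        new() { OutputMode = JsonOutputMode.RelaxedExtendedJson };

    public async Task ConvertAsync(
        IMongoCollection<AdvisoryDocument> source,
        IAdvisoryRepository target,
        CancellationToken ct)
    {
        // Stream the collection batch by batch instead of loading it all into memory.
        using var cursor = await source.FindAsync(
            Builders<AdvisoryDocument>.Filter.Empty, cancellationToken: ct);

        while (await cursor.MoveNextAsync(ct))
        {
            foreach (var doc in cursor.Current)
            {
                var advisory = MapToAdvisory(doc);
                await target.UpsertAsync(advisory, ct);
            }
        }
    }

    private static Advisory MapToAdvisory(AdvisoryDocument doc)
    {
        // Extract from BsonDocument payload
        var payload = doc.Payload;
        return new Advisory
        {
            AdvisoryKey = doc.Id,
            PrimaryVulnId = payload["primaryVulnId"].AsString,
            Title = GetOptionalString(payload, "title"),
            Summary = GetOptionalString(payload, "summary"),
            // ... etc
            // Round-trip through JSON text; Clone() detaches the element from the parsed document.
            Provenance = JsonDocument.Parse(payload["provenance"].ToJson(JsonSettings))
                .RootElement.Clone(),
        };
    }

    // The BsonDocument indexer throws on a missing key, and AsString throws on BsonNull,
    // so optional fields need an explicit lookup with a default.
    private static string? GetOptionalString(BsonDocument payload, string name)
    {
        var value = payload.GetValue(name, BsonNull.Value);
        return value.IsString ? value.AsString : null;
    }
}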
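
The "extract and normalize aliases" subtask is not spelled out above. As a minimal sketch of one way it could work, assuming aliases arrive as raw strings: trim them, drop blanks, classify by prefix, upper-case CVE IDs, de-duplicate, and sort ordinally so repeated conversions emit rows in the same order. The AdvisoryAlias record, the AliasNormalizer class, and the "CVE" / "GHSA" / "OTHER" type labels are hypothetical names chosen for illustration; the real column values are defined in SPECIFICATION.md.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for one vuln.advisory_aliases row.
public sealed record AdvisoryAlias(string AliasType, string AliasValue);

public static class AliasNormalizer
{
    public static IReadOnlyList<AdvisoryAlias> Normalize(IEnumerable<string?> rawAliases)
    {
        return rawAliases
            .Where(a => !string.IsNullOrWhiteSpace(a))
            .Select(a => a!.Trim())
            .Select(a => new AdvisoryAlias(Classify(a), Canonicalize(a)))
            .Distinct() // records compare by value, so duplicates collapse
            .OrderBy(a => a.AliasType, StringComparer.Ordinal)
            .ThenBy(a => a.AliasValue, StringComparer.Ordinal)
            .ToList();
    }

    // Assumed type labels; only the prefix is inspected.
    private static string Classify(string alias) =>
        alias.StartsWith("CVE-", StringComparison.OrdinalIgnoreCase) ? "CVE"
        : alias.StartsWith("GHSA-", StringComparison.OrdinalIgnoreCase) ? "GHSA"
        : "OTHER";

    // CVE IDs are upper-cased; other identifiers are left as received.
    private static string Canonicalize(string alias) =>
        alias.StartsWith("CVE-", StringComparison.OrdinalIgnoreCase)
            ? alias.ToUpperInvariant()
            : alias;
}
```

Sorting with an ordinal comparer keeps the output independent of the host culture, which matters for the determinism objective.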

T5b.2: Build Feed Import Pipeline

Status: TODO | Estimate: 1 day

Description: Modify feed import to write directly to PostgreSQL.

Subtasks:

  • Update NVD importer to use PostgreSQL
  • Update OSV importer to use PostgreSQL
  • Update GHSA importer to use PostgreSQL
  • Update vendor feed importers
  • Test incremental imports

T5b.3: Run Parallel Import

Status: TODO | Estimate: 1 day

Description: Run imports to both MongoDB and PostgreSQL simultaneously.

Subtasks:

  • Configure dual-import mode
  • Run import cycle
  • Compare record counts
  • Run sample comparison checks
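
The count and sample comparisons above could be sketched as follows. This is an illustration under assumptions, not the planned tooling: it assumes each store can be summarized as a map from advisory key to a content hash computed the same way on both sides. ImportComparison and DualImportChecker are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record ImportComparison(
    int MongoCount,
    int PostgresCount,
    IReadOnlyList<string> MissingInPostgres,
    IReadOnlyList<string> MissingInMongo,
    IReadOnlyList<string> ContentMismatches)
{
    public bool IsClean =>
        MongoCount == PostgresCount
        && MissingInPostgres.Count == 0
        && MissingInMongo.Count == 0
        && ContentMismatches.Count == 0;
}

public static class DualImportChecker
{
    // Each map goes from advisory key to a content hash (assumed input shape).
    public static ImportComparison Compare(
        IReadOnlyDictionary<string, string> mongo,
        IReadOnlyDictionary<string, string> postgres)
    {
        var missingInPostgres = Sorted(mongo.Keys.Where(k => !postgres.ContainsKey(k)));
        var missingInMongo = Sorted(postgres.Keys.Where(k => !mongo.ContainsKey(k)));

        // Keys present on both sides whose hashes differ.
        var mismatches = Sorted(mongo
            .Where(kv => postgres.TryGetValue(kv.Key, out var hash)
                         && !string.Equals(hash, kv.Value, StringComparison.Ordinal))
            .Select(kv => kv.Key));

        return new ImportComparison(
            mongo.Count, postgres.Count, missingInPostgres, missingInMongo, mismatches);
    }

    // Ordinal sort so reports are stable between runs.
    private static IReadOnlyList<string> Sorted(IEnumerable<string> keys) =>
        keys.OrderBy(k => k, StringComparer.Ordinal).ToList();
}
```

Equal record counts alone would not catch one missing advisory offset by one extra, which is why the sketch also reports per-key differences.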

T5b.4: Verify Vulnerability Matching

Status: TODO | Estimate: 2 days

Description: Verify that vulnerability matching produces identical results.

Subtasks:

  • Select sample SBOMs (various ecosystems)
  • Run matching with MongoDB backend
  • Run matching with PostgreSQL backend
  • Compare findings (must be identical)
  • Document any differences
  • Fix any issues found

Verification Tests:

[Theory]
[MemberData(nameof(GetSampleSboms))]
public async Task Scanner_Should_Find_Same_Vulns(string sbomPath)
{
    var sbom = await LoadSbom(sbomPath);

    _config["Persistence:Concelier"] = "Mongo";
    var mongoFindings = await _scanner.ScanAsync(sbom);

    _config["Persistence:Concelier"] = "Postgres";
    var postgresFindings = await _scanner.ScanAsync(sbom);

    // Strict ordering for determinism
    postgresFindings.Should().BeEquivalentTo(mongoFindings,
        options => options.WithStrictOrdering());
}
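
A strict-ordering comparison only passes if both backends return findings in the same order, and neither store guarantees a result order unless the query asks for one. One possible approach, sketched here under assumptions, is to canonicalize the order in application code before results leave the matching layer. The Finding record and its two fields are hypothetical; the scanner's real finding type is not shown in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical finding shape used only for this illustration.
public sealed record Finding(string Purl, string VulnId);

public static class FindingOrder
{
    // Sort by PURL, then vulnerability ID, using ordinal comparison so the
    // result does not depend on host culture or database collation.
    // Duplicates are kept on purpose: a duplicate from one backend is a
    // discrepancy that the comparison should surface.
    public static IReadOnlyList<Finding> Canonicalize(IEnumerable<Finding> findings) =>
        findings
            .OrderBy(f => f.Purl, StringComparer.Ordinal)
            .ThenBy(f => f.VulnId, StringComparer.Ordinal)
            .ToList();
}
```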

T5b.5: Performance Optimization

Status: TODO | Estimate: 1 day

Subtasks:

  • Analyze slow queries with EXPLAIN ANALYZE
  • Optimize indexes for common queries
  • Consider partial indexes for active advisories
  • Benchmark PostgreSQL vs MongoDB performance
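
For the benchmark subtask, a small harness along these lines might be enough to compare the two backends on the same query. It is a sketch, not the planned tooling: QueryBenchmark and LatencySummary are hypothetical names, the iteration and warm-up counts are arbitrary defaults, and the query is passed in as a delegate so the harness itself depends on neither database.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public sealed record LatencySummary(double MedianMs, double P95Ms);

public static class QueryBenchmark
{
    public static async Task<LatencySummary> MeasureAsync(
        Func<Task> query, int iterations = 50, int warmup = 5)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException(nameof(iterations));

        // Warm-up runs let connection pools and plan caches settle; they are not timed.
        for (var i = 0; i < warmup; i++) await query();

        var samples = new double[iterations];
        for (var i = 0; i < iterations; i++)
        {
            var sw = Stopwatch.StartNew();
            await query();
            sw.Stop();
            samples[i] = sw.Elapsed.TotalMilliseconds;
        }

        Array.Sort(samples);
        return new LatencySummary(Percentile(samples, 0.50), Percentile(samples, 0.95));
    }

    // Nearest-rank percentile over an already sorted array.
    private static double Percentile(double[] sorted, double p)
    {
        var rank = (int)Math.Ceiling(p * sorted.Length) - 1;
        return sorted[Math.Clamp(rank, 0, sorted.Length - 1)];
    }
}
```

Reporting the median and 95th percentile rather than the mean keeps a single slow outlier from dominating the comparison.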

T5b.6: Switch Scanner to PostgreSQL

Status: TODO | Estimate: 0.5 days

Subtasks:

  • Update configuration
  • Deploy to staging
  • Run full scan suite
  • Deploy to production

Exit Criteria

  • All repository interfaces implemented
  • Advisory conversion pipeline working
  • Vulnerability matching produces identical results
  • Feed imports working on PostgreSQL
  • Concelier running on PostgreSQL in production

Risks & Mitigations

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Matching discrepancies | Medium | High | Extensive comparison testing |
| Performance regression on queries | Medium | Medium | Index optimization, query tuning |
| Data loss during conversion | Low | High | Verify counts, sample checks |

Data Volume Estimates

| Table | Estimated Rows | Growth Rate |
|---|---|---|
| advisories | 300,000+ | ~100/day |
| advisory_aliases | 600,000+ | ~200/day |
| advisory_affected | 2,000,000+ | ~1000/day |
| advisory_cvss | 400,000+ | ~150/day |
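
For capacity planning, the figures above can be turned into a rough projection. The sketch below assumes growth stays linear at the listed daily rate, which is a simplification; GrowthProjection is a hypothetical helper, and because the row counts are given as lower bounds ("+"), the results are lower bounds too.

```csharp
using System;

public static class GrowthProjection
{
    // Linear projection: current rows plus a constant daily increment.
    public static long Project(long currentRows, long rowsPerDay, int days)
    {
        if (days < 0) throw new ArgumentOutOfRangeException(nameof(days));
        return checked(currentRows + rowsPerDay * days);
    }
}
```

With the table's numbers, GrowthProjection.Project(300_000, 100, 365) gives at least 336,500 rows in advisories after one year, and GrowthProjection.Project(2_000_000, 1_000, 365) gives at least 2,365,000 rows in advisory_affected.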

Phase Version: 1.0.0 | Last Updated: 2025-11-28