Rename Concelier Source modules to Connector

2025-10-18 20:11:18 +03:00
parent 0137856fdb
commit 6524626230
789 changed files with 1489 additions and 1489 deletions


@@ -0,0 +1,28 @@
# AGENTS
## Role
Vendor feed connector for Chromium/Chrome that parses Stable Channel Update posts. It provides authoritative vendor context for Chrome/Chromium versions and CVE lists, and maps fixed versions to affected ranges.
## Scope
- Crawl Chrome Releases blog list; window by publish date; fetch detail posts; identify "Stable Channel Update" and security fix sections.
- Validate HTML; extract version trains, platform notes (Windows/macOS/Linux/Android), CVEs, acknowledgements; map fixed versions.
- Persist raw docs and maintain source_state cursor; idempotent mapping.
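
The publish-date windowing above can be sketched as follows. This is a minimal illustration, not the connector's API: the helper name `CalculateWindow`, its parameter list, and the sample values are assumptions that mirror the `InitialBackfill` and `WindowOverlap` options.

```csharp
using System;

// Hypothetical helper: computes the fetch window from the last published timestamp.
// The window starts at (lastPublished - overlap), clamped to the backfill horizon,
// and ends at "now"; a degenerate window is widened to one hour.
static (DateTimeOffset Start, DateTimeOffset End) CalculateWindow(
    DateTimeOffset? lastPublished,
    DateTimeOffset now,
    TimeSpan initialBackfill,
    TimeSpan windowOverlap)
{
    var horizon = now - initialBackfill;
    var start = (lastPublished ?? horizon) - windowOverlap;
    if (start < horizon)
    {
        start = horizon;
    }

    var end = now;
    if (end <= start)
    {
        end = start.AddHours(1);
    }

    return (start, end);
}

// Sample values are illustrative only.
var sampleNow = new DateTimeOffset(2025, 10, 18, 0, 0, 0, TimeSpan.Zero);
var sampleWindow = CalculateWindow(sampleNow.AddDays(-5), sampleNow, TimeSpan.FromDays(30), TimeSpan.FromDays(2));
Console.WriteLine($"{sampleWindow.Start:O} .. {sampleWindow.End:O}");
```

Re-reading a small overlap on every run trades a few duplicate fetches for resilience against posts that are published late or edited after the previous run.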
## Participants
- Connector.Common (HTTP, HTML helpers, validators).
- Storage.Mongo (document, dto, advisory, alias, affected, reference, psirt_flags, source_state).
- Models (canonical; affected ranges by product/version).
- Core/WebService (jobs: source:chromium:fetch|parse|map).
- Merge engine (later) to respect vendor PSIRT precedence for Chrome.
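
The hand-off between the `parse` and `map` jobs can be pictured as two pending queues kept in the cursor. The sketch below is an assumption-laden illustration: `AdvanceAfterParse` and its `parseSucceeded` callback are hypothetical names, and only the queue bookkeeping is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: after a parse pass, every processed document leaves the
// "pending documents" queue; successes are queued for mapping, failures are dropped.
static (List<Guid> PendingDocuments, List<Guid> PendingMappings) AdvanceAfterParse(
    IReadOnlyCollection<Guid> pendingDocuments,
    IReadOnlyCollection<Guid> pendingMappings,
    Func<Guid, bool> parseSucceeded)
{
    var documents = pendingDocuments.ToList();
    var mappings = pendingMappings.ToList();

    foreach (var id in pendingDocuments)
    {
        documents.Remove(id);
        if (parseSucceeded(id))
        {
            // Avoid queueing the same document twice when a run is repeated.
            if (!mappings.Contains(id))
            {
                mappings.Add(id);
            }
        }
        else
        {
            mappings.Remove(id);
        }
    }

    return (documents, mappings);
}

// Illustrative run: one document parses, the other fails.
var sampleOk = Guid.NewGuid();
var sampleBad = Guid.NewGuid();
var sampleQueues = AdvanceAfterParse(new[] { sampleOk, sampleBad }, Array.Empty<Guid>(), id => id == sampleOk);
Console.WriteLine($"{sampleQueues.PendingDocuments.Count} pending documents, {sampleQueues.PendingMappings.Count} pending mappings");
```

Because each stage only reads and rewrites its own queue, a job that is re-run after a crash picks up where the cursor left off instead of starting over.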
## Interfaces & contracts
- Aliases: CHROMIUM-POST:<yyyy-mm-dd or slug> plus CVE ids.
- Affected: Vendor=Google, Product=Chrome/Chromium (platform tags), Type=vendor; Versions carry an optional introduced version (often unknown) and the fixed version (for example 127.0.6533.88); tags mark platforms.
- References: advisory (post URL), release notes, bug links; kind set appropriately.
- Provenance: method=parser; value=post slug; recordedAt=fetch time.
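
As a rough illustration of the alias contract above, the sketch below builds an alias list from a post slug and its CVE ids. `BuildAliases` is a hypothetical helper, the normalization rules (trim, upper-case, de-duplicate, ordinal sort) are assumptions rather than the mapper's confirmed behavior, and the CVE ids are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: the post alias comes first, followed by normalized CVE ids.
static IReadOnlyList<string> BuildAliases(string postId, IEnumerable<string> cves)
{
    var aliases = new List<string> { $"CHROMIUM-POST:{postId}" };

    aliases.AddRange(cves
        .Where(static cve => !string.IsNullOrWhiteSpace(cve))
        .Select(static cve => cve.Trim().ToUpperInvariant())
        .Distinct(StringComparer.Ordinal)
        .OrderBy(static cve => cve, StringComparer.Ordinal));

    return aliases;
}

// Placeholder identifiers, not real advisories.
var sampleAliases = BuildAliases("sample-post", new[] { "cve-2099-0002", " CVE-2099-0001 ", "CVE-2099-0002" });
Console.WriteLine(string.Join(", ", sampleAliases));
```

A stable ordering keeps repeated mapping runs byte-for-byte identical, which is what makes the mapping step idempotent in practice.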
## In/Out of scope
In: vendor advisory mapping, fixed version emission per platform, psirt_flags vendor context.
Out: OS distro packaging semantics; bug bounty details beyond references.
## Observability & security expectations
- Metrics: SourceDiagnostics exports the shared `concelier.source.http.*` counters/histograms tagged `concelier.source=chromium`, enabling dashboards to observe fetch volumes, parse failures, and map affected counts via tag filters.
- Logs: post slugs, extracted versions, platform coverage, timing.
- Security: allowlist the blog host for outbound HTTP.
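
To make the tag-filtered metrics concrete, the sketch below uses `System.Diagnostics.Metrics.MeterListener` to sum only the measurements carrying a given tag. The meter name, counter name, and the `CountTaggedMeasurements` helper are illustrative assumptions; only the `concelier.source=chromium` tag comes from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical helper: creates a throwaway meter and counter, lets the caller emit
// measurements, and returns the sum of those tagged with tagKey=tagValue.
static long CountTaggedMeasurements(string meterName, string tagKey, string tagValue, Action<Counter<long>> emit)
{
    long total = 0;

    using var meter = new Meter(meterName, "1.0.0");
    using var listener = new MeterListener();

    // Subscribe only to instruments that belong to the meter under test.
    listener.InstrumentPublished = (instrument, l) =>
    {
        if (instrument.Meter.Name == meterName)
        {
            l.EnableMeasurementEvents(instrument);
        }
    };

    listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
    {
        foreach (var tag in tags)
        {
            if (tag.Key == tagKey && string.Equals(tag.Value as string, tagValue, StringComparison.Ordinal))
            {
                total += value;
            }
        }
    });

    listener.Start();

    var counter = meter.CreateCounter<long>("demo.fetch.documents");
    emit(counter);
    return total;
}

var sampleTotal = CountTaggedMeasurements("Demo.Concelier.Source", "concelier.source", "chromium", counter =>
{
    counter.Add(2, new KeyValuePair<string, object?>("concelier.source", "chromium"));
    counter.Add(5, new KeyValuePair<string, object?>("concelier.source", "other"));
});
Console.WriteLine(sampleTotal);
```

A dashboard applies the same idea on the exporter side: one shared instrument name, with the per-connector view selected by tag.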
## Tests
- Author and review coverage in `../StellaOps.Concelier.Connector.Vndr.Chromium.Tests`.
- Shared fixtures (e.g., `MongoIntegrationFixture`, `ConnectorTestHarness`) live in `../StellaOps.Concelier.Testing`.
- Keep fixtures deterministic; match new cases to real-world advisories or regression scenarios.


@@ -0,0 +1,366 @@
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using MongoDB.Bson;
using MongoDB.Bson.IO;
using StellaOps.Concelier.Models;
using StellaOps.Concelier.Connector.Common;
using StellaOps.Concelier.Connector.Common.Fetch;
using StellaOps.Concelier.Connector.Common.Json;
using StellaOps.Concelier.Connector.Vndr.Chromium.Configuration;
using StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
using StellaOps.Concelier.Storage.Mongo;
using StellaOps.Concelier.Storage.Mongo.Advisories;
using StellaOps.Concelier.Storage.Mongo.Documents;
using StellaOps.Concelier.Storage.Mongo.Dtos;
using StellaOps.Concelier.Storage.Mongo.PsirtFlags;
using StellaOps.Plugin;
using Json.Schema;
namespace StellaOps.Concelier.Connector.Vndr.Chromium;
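/// <summary>
/// Feed connector for the Chrome Releases blog. Runs in three stages (fetch, parse, map),
/// each of which resumes from the cursor persisted in the source state repository.
/// </summary>
public sealed class ChromiumConnector : IFeedConnector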
{
private static readonly JsonSchema Schema = ChromiumSchemaProvider.Schema;
private static readonly JsonSerializerOptions SerializerOptions = new()
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
DefaultIgnoreCondition = System.Text.Json.Serialization.JsonIgnoreCondition.WhenWritingNull,
};
private readonly ChromiumFeedLoader _feedLoader;
private readonly SourceFetchService _fetchService;
private readonly RawDocumentStorage _rawDocumentStorage;
private readonly IDocumentStore _documentStore;
private readonly IDtoStore _dtoStore;
private readonly IAdvisoryStore _advisoryStore;
private readonly IPsirtFlagStore _psirtFlagStore;
private readonly ISourceStateRepository _stateRepository;
private readonly IJsonSchemaValidator _schemaValidator;
private readonly ChromiumOptions _options;
private readonly TimeProvider _timeProvider;
private readonly ChromiumDiagnostics _diagnostics;
private readonly ILogger<ChromiumConnector> _logger;
public ChromiumConnector(
ChromiumFeedLoader feedLoader,
SourceFetchService fetchService,
RawDocumentStorage rawDocumentStorage,
IDocumentStore documentStore,
IDtoStore dtoStore,
IAdvisoryStore advisoryStore,
IPsirtFlagStore psirtFlagStore,
ISourceStateRepository stateRepository,
IJsonSchemaValidator schemaValidator,
IOptions<ChromiumOptions> options,
TimeProvider? timeProvider,
ChromiumDiagnostics diagnostics,
ILogger<ChromiumConnector> logger)
{
_feedLoader = feedLoader ?? throw new ArgumentNullException(nameof(feedLoader));
_fetchService = fetchService ?? throw new ArgumentNullException(nameof(fetchService));
_rawDocumentStorage = rawDocumentStorage ?? throw new ArgumentNullException(nameof(rawDocumentStorage));
_documentStore = documentStore ?? throw new ArgumentNullException(nameof(documentStore));
_dtoStore = dtoStore ?? throw new ArgumentNullException(nameof(dtoStore));
_advisoryStore = advisoryStore ?? throw new ArgumentNullException(nameof(advisoryStore));
_psirtFlagStore = psirtFlagStore ?? throw new ArgumentNullException(nameof(psirtFlagStore));
_stateRepository = stateRepository ?? throw new ArgumentNullException(nameof(stateRepository));
_schemaValidator = schemaValidator ?? throw new ArgumentNullException(nameof(schemaValidator));
_options = options?.Value ?? throw new ArgumentNullException(nameof(options));
_options.Validate();
_timeProvider = timeProvider ?? TimeProvider.System;
_diagnostics = diagnostics ?? throw new ArgumentNullException(nameof(diagnostics));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public string SourceName => VndrChromiumConnectorPlugin.SourceName;
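// Fetch stage: loads feed entries inside the resume window, downloads security-update posts,
// and queues new or changed documents for parsing.
public async Task FetchAsync(IServiceProvider services, CancellationToken cancellationToken)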
{
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
var now = _timeProvider.GetUtcNow();
var (windowStart, windowEnd) = CalculateWindow(cursor, now);
ProvenanceDiagnostics.ReportResumeWindow(SourceName, windowStart, _logger);
IReadOnlyList<ChromiumFeedEntry> feedEntries;
_diagnostics.FetchAttempt();
try
{
feedEntries = await _feedLoader.LoadAsync(windowStart, windowEnd, cancellationToken).ConfigureAwait(false);
}
catch (Exception ex)
{
_logger.LogError(ex, "Chromium feed load failed {Start}-{End}", windowStart, windowEnd);
_diagnostics.FetchFailure();
await _stateRepository.MarkFailureAsync(SourceName, now, TimeSpan.FromMinutes(10), ex.Message, cancellationToken).ConfigureAwait(false);
throw;
}
var fetchCache = new Dictionary<string, ChromiumFetchCacheEntry>(cursor.FetchCache, StringComparer.Ordinal);
var touchedResources = new HashSet<string>(StringComparer.Ordinal);
var candidates = feedEntries
.Where(static entry => entry.IsSecurityUpdate())
.OrderBy(static entry => entry.Published)
.ToArray();
if (candidates.Length == 0)
{
var untouched = cursor
.WithLastPublished(cursor.LastPublished ?? windowEnd)
.WithFetchCache(fetchCache);
await UpdateCursorAsync(untouched, cancellationToken).ConfigureAwait(false);
return;
}
var pendingDocuments = cursor.PendingDocuments.ToList();
var maxPublished = cursor.LastPublished;
foreach (var entry in candidates)
{
try
{
var cacheKey = entry.DetailUri.ToString();
touchedResources.Add(cacheKey);
var metadata = ChromiumDocumentMetadata.CreateMetadata(entry.PostId, entry.Title, entry.Published, entry.Updated, entry.Summary);
var request = new SourceFetchRequest(ChromiumOptions.HttpClientName, SourceName, entry.DetailUri)
{
Metadata = metadata,
AcceptHeaders = new[] { "text/html", "application/xhtml+xml", "text/plain;q=0.5" },
};
var result = await _fetchService.FetchAsync(request, cancellationToken).ConfigureAwait(false);
if (!result.IsSuccess || result.Document is null)
{
continue;
}
// Content is unchanged when the SHA-256 matches the value cached for this URL; skip re-parsing.
if (cursor.TryGetFetchCache(cacheKey, out var cached) && string.Equals(cached.Sha256, result.Document.Sha256, StringComparison.OrdinalIgnoreCase))
{
_diagnostics.FetchUnchanged();
fetchCache[cacheKey] = new ChromiumFetchCacheEntry(result.Document.Sha256);
await _documentStore.UpdateStatusAsync(result.Document.Id, DocumentStatuses.Mapped, cancellationToken).ConfigureAwait(false);
if (!maxPublished.HasValue || entry.Published > maxPublished)
{
maxPublished = entry.Published;
}
continue;
}
_diagnostics.FetchDocument();
if (!pendingDocuments.Contains(result.Document.Id))
{
pendingDocuments.Add(result.Document.Id);
}
if (!maxPublished.HasValue || entry.Published > maxPublished)
{
maxPublished = entry.Published;
}
fetchCache[cacheKey] = new ChromiumFetchCacheEntry(result.Document.Sha256);
}
catch (Exception ex)
{
_logger.LogError(ex, "Chromium fetch failed for {Uri}", entry.DetailUri);
_diagnostics.FetchFailure();
await _stateRepository.MarkFailureAsync(SourceName, now, TimeSpan.FromMinutes(5), ex.Message, cancellationToken).ConfigureAwait(false);
throw;
}
}
// Prune cache entries for URLs that were not seen in this run.
if (touchedResources.Count > 0)
{
var keysToRemove = fetchCache.Keys.Where(key => !touchedResources.Contains(key)).ToArray();
foreach (var key in keysToRemove)
{
fetchCache.Remove(key);
}
}
var updatedCursor = cursor
.WithPendingDocuments(pendingDocuments)
.WithPendingMappings(cursor.PendingMappings)
.WithLastPublished(maxPublished ?? cursor.LastPublished ?? windowEnd)
.WithFetchCache(fetchCache);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
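// Parse stage: converts each pending raw HTML document into a schema-validated DTO
// and queues it for mapping; failed documents are marked Failed and dropped from both queues.
public async Task ParseAsync(IServiceProvider services, CancellationToken cancellationToken)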
{
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
if (cursor.PendingDocuments.Count == 0)
{
return;
}
var pendingDocuments = cursor.PendingDocuments.ToList();
var pendingMappings = cursor.PendingMappings.ToList();
foreach (var documentId in cursor.PendingDocuments)
{
var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);
if (document is null)
{
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
if (!document.GridFsId.HasValue)
{
_logger.LogWarning("Chromium document {DocumentId} missing GridFS payload", document.Id);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
ChromiumDto dto;
try
{
var metadata = ChromiumDocumentMetadata.FromDocument(document);
var content = await _rawDocumentStorage.DownloadAsync(document.GridFsId.Value, cancellationToken).ConfigureAwait(false);
var html = Encoding.UTF8.GetString(content);
dto = ChromiumParser.Parse(html, metadata);
}
catch (Exception ex)
{
_logger.LogError(ex, "Chromium parse failed for {Uri}", document.Uri);
_diagnostics.ParseFailure();
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
var json = JsonSerializer.Serialize(dto, SerializerOptions);
using var jsonDocument = JsonDocument.Parse(json);
try
{
_schemaValidator.Validate(jsonDocument, Schema, dto.PostId);
}
catch (StellaOps.Concelier.Connector.Common.Json.JsonSchemaValidationException ex)
{
_logger.LogError(ex, "Chromium schema validation failed for {DocumentId}", document.Id);
_diagnostics.ParseFailure();
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
var payload = BsonDocument.Parse(json);
var existingDto = await _dtoStore.FindByDocumentIdAsync(document.Id, cancellationToken).ConfigureAwait(false);
var validatedAt = _timeProvider.GetUtcNow();
var dtoRecord = existingDto is null
? new DtoRecord(Guid.NewGuid(), document.Id, SourceName, "chromium.post.v1", payload, validatedAt)
: existingDto with
{
Payload = payload,
SchemaVersion = "chromium.post.v1",
ValidatedAt = validatedAt,
};
await _dtoStore.UpsertAsync(dtoRecord, cancellationToken).ConfigureAwait(false);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.PendingMap, cancellationToken).ConfigureAwait(false);
_diagnostics.ParseSuccess();
pendingDocuments.Remove(documentId);
if (!pendingMappings.Contains(documentId))
{
pendingMappings.Add(documentId);
}
}
var updatedCursor = cursor
.WithPendingDocuments(pendingDocuments)
.WithPendingMappings(pendingMappings);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
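// Map stage: turns each pending DTO into a canonical advisory plus PSIRT flag
// and marks the source document as mapped.
public async Task MapAsync(IServiceProvider services, CancellationToken cancellationToken)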
{
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
if (cursor.PendingMappings.Count == 0)
{
return;
}
var pendingMappings = cursor.PendingMappings.ToList();
foreach (var documentId in cursor.PendingMappings)
{
var dtoRecord = await _dtoStore.FindByDocumentIdAsync(documentId, cancellationToken).ConfigureAwait(false);
var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);
if (dtoRecord is null || document is null)
{
pendingMappings.Remove(documentId);
continue;
}
var json = dtoRecord.Payload.ToJson(new JsonWriterSettings { OutputMode = JsonOutputMode.RelaxedExtendedJson });
var dto = JsonSerializer.Deserialize<ChromiumDto>(json, SerializerOptions);
if (dto is null)
{
_logger.LogWarning("Chromium DTO deserialization failed for {DocumentId}", documentId);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingMappings.Remove(documentId);
continue;
}
var recordedAt = _timeProvider.GetUtcNow();
var (advisory, flag) = ChromiumMapper.Map(dto, SourceName, recordedAt);
await _advisoryStore.UpsertAsync(advisory, cancellationToken).ConfigureAwait(false);
await _psirtFlagStore.UpsertAsync(flag, cancellationToken).ConfigureAwait(false);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Mapped, cancellationToken).ConfigureAwait(false);
_diagnostics.MapSuccess();
pendingMappings.Remove(documentId);
}
var updatedCursor = cursor.WithPendingMappings(pendingMappings);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
private async Task<ChromiumCursor> GetCursorAsync(CancellationToken cancellationToken)
{
var record = await _stateRepository.TryGetAsync(SourceName, cancellationToken).ConfigureAwait(false);
return ChromiumCursor.FromBsonDocument(record?.Cursor);
}
private async Task UpdateCursorAsync(ChromiumCursor cursor, CancellationToken cancellationToken)
{
var completedAt = _timeProvider.GetUtcNow();
await _stateRepository.UpdateCursorAsync(SourceName, cursor.ToBsonDocument(), completedAt, cancellationToken).ConfigureAwait(false);
}
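// Resume window: starts at the last published timestamp minus the overlap, clamped to the
// initial backfill horizon, and ends at "now" (widened to one hour if it would be empty).
private (DateTimeOffset start, DateTimeOffset end) CalculateWindow(ChromiumCursor cursor, DateTimeOffset now)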
{
var lastPublished = cursor.LastPublished ?? now - _options.InitialBackfill;
var start = lastPublished - _options.WindowOverlap;
var backfill = now - _options.InitialBackfill;
if (start < backfill)
{
start = backfill;
}
var end = now;
if (end <= start)
{
end = start.AddHours(1);
}
return (start, end);
}
}


@@ -0,0 +1,20 @@
using System;
using Microsoft.Extensions.DependencyInjection;
using StellaOps.Plugin;
namespace StellaOps.Concelier.Connector.Vndr.Chromium;
public sealed class VndrChromiumConnectorPlugin : IConnectorPlugin
{
public const string SourceName = "vndr-chromium";
public string Name => SourceName;
public bool IsAvailable(IServiceProvider services) => services.GetService<ChromiumConnector>() is not null;
public IFeedConnector Create(IServiceProvider services)
{
ArgumentNullException.ThrowIfNull(services);
return services.GetRequiredService<ChromiumConnector>();
}
}


@@ -0,0 +1,69 @@
using System.Diagnostics.Metrics;
namespace StellaOps.Concelier.Connector.Vndr.Chromium;
public sealed class ChromiumDiagnostics : IDisposable
{
public const string MeterName = "StellaOps.Concelier.Connector.Vndr.Chromium";
public const string MeterVersion = "1.0.0";
private readonly Meter _meter;
private readonly Counter<long> _fetchAttempts;
private readonly Counter<long> _fetchDocuments;
private readonly Counter<long> _fetchFailures;
private readonly Counter<long> _fetchUnchanged;
private readonly Counter<long> _parseSuccess;
private readonly Counter<long> _parseFailures;
private readonly Counter<long> _mapSuccess;
public ChromiumDiagnostics()
{
_meter = new Meter(MeterName, MeterVersion);
_fetchAttempts = _meter.CreateCounter<long>(
name: "chromium.fetch.attempts",
unit: "operations",
description: "Number of Chromium fetch operations executed.");
_fetchDocuments = _meter.CreateCounter<long>(
name: "chromium.fetch.documents",
unit: "documents",
description: "Count of Chromium advisory documents fetched successfully.");
_fetchFailures = _meter.CreateCounter<long>(
name: "chromium.fetch.failures",
unit: "operations",
description: "Count of Chromium fetch failures.");
_fetchUnchanged = _meter.CreateCounter<long>(
name: "chromium.fetch.unchanged",
unit: "documents",
description: "Count of Chromium documents skipped due to unchanged content.");
_parseSuccess = _meter.CreateCounter<long>(
name: "chromium.parse.success",
unit: "documents",
description: "Count of Chromium documents parsed successfully.");
_parseFailures = _meter.CreateCounter<long>(
name: "chromium.parse.failures",
unit: "documents",
description: "Count of Chromium documents that failed to parse.");
_mapSuccess = _meter.CreateCounter<long>(
name: "chromium.map.success",
unit: "advisories",
description: "Count of Chromium advisories mapped successfully.");
}
public void FetchAttempt() => _fetchAttempts.Add(1);
public void FetchDocument() => _fetchDocuments.Add(1);
public void FetchFailure() => _fetchFailures.Add(1);
public void FetchUnchanged() => _fetchUnchanged.Add(1);
public void ParseSuccess() => _parseSuccess.Add(1);
public void ParseFailure() => _parseFailures.Add(1);
public void MapSuccess() => _mapSuccess.Add(1);
public Meter Meter => _meter;
public void Dispose() => _meter.Dispose();
}


@@ -0,0 +1,37 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
using StellaOps.Concelier.Connector.Common.Http;
using StellaOps.Concelier.Connector.Vndr.Chromium.Configuration;
using StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
namespace StellaOps.Concelier.Connector.Vndr.Chromium;
public static class ChromiumServiceCollectionExtensions
{
public static IServiceCollection AddChromiumConnector(this IServiceCollection services, Action<ChromiumOptions> configure)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(configure);
services.AddOptions<ChromiumOptions>()
.Configure(configure)
.PostConfigure(static opts => opts.Validate());
services.AddSingleton(static sp => sp.GetRequiredService<IOptions<ChromiumOptions>>().Value);
services.AddSourceHttpClient(ChromiumOptions.HttpClientName, static (sp, clientOptions) =>
{
var options = sp.GetRequiredService<IOptions<ChromiumOptions>>().Value;
clientOptions.BaseAddress = new Uri(options.FeedUri.GetLeftPart(UriPartial.Authority));
clientOptions.Timeout = TimeSpan.FromSeconds(20);
clientOptions.UserAgent = "StellaOps.Concelier.VndrChromium/1.0";
clientOptions.AllowedHosts.Clear();
clientOptions.AllowedHosts.Add(options.FeedUri.Host);
});
services.AddSingleton<ChromiumDiagnostics>();
services.AddTransient<ChromiumFeedLoader>();
services.AddTransient<ChromiumConnector>();
return services;
}
}


@@ -0,0 +1,44 @@
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Configuration;
public sealed class ChromiumOptions
{
public const string HttpClientName = "source-vndr-chromium";
public Uri FeedUri { get; set; } = new("https://chromereleases.googleblog.com/atom.xml");
public TimeSpan InitialBackfill { get; set; } = TimeSpan.FromDays(30);
public TimeSpan WindowOverlap { get; set; } = TimeSpan.FromDays(2);
public int MaxFeedPages { get; set; } = 4;
public int MaxEntriesPerPage { get; set; } = 50;
public void Validate()
{
if (FeedUri is null || !FeedUri.IsAbsoluteUri)
{
throw new ArgumentException("FeedUri must be an absolute URI.", nameof(FeedUri));
}
if (InitialBackfill <= TimeSpan.Zero)
{
throw new ArgumentException("InitialBackfill must be positive.", nameof(InitialBackfill));
}
if (WindowOverlap < TimeSpan.Zero)
{
throw new ArgumentException("WindowOverlap cannot be negative.", nameof(WindowOverlap));
}
if (MaxFeedPages <= 0)
{
throw new ArgumentException("MaxFeedPages must be positive.", nameof(MaxFeedPages));
}
if (MaxEntriesPerPage <= 0 || MaxEntriesPerPage > 100)
{
throw new ArgumentException("MaxEntriesPerPage must be between 1 and 100.", nameof(MaxEntriesPerPage));
}
}
}


@@ -0,0 +1,143 @@
using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
internal sealed record ChromiumCursor(
DateTimeOffset? LastPublished,
IReadOnlyCollection<Guid> PendingDocuments,
IReadOnlyCollection<Guid> PendingMappings,
IReadOnlyDictionary<string, ChromiumFetchCacheEntry> FetchCache)
{
public static ChromiumCursor Empty { get; } = new(null, Array.Empty<Guid>(), Array.Empty<Guid>(), new Dictionary<string, ChromiumFetchCacheEntry>(StringComparer.Ordinal));
public BsonDocument ToBsonDocument()
{
var document = new BsonDocument();
if (LastPublished.HasValue)
{
document["lastPublished"] = LastPublished.Value.UtcDateTime;
}
document["pendingDocuments"] = new BsonArray(PendingDocuments.Select(id => id.ToString()));
document["pendingMappings"] = new BsonArray(PendingMappings.Select(id => id.ToString()));
if (FetchCache.Count > 0)
{
var cacheDocument = new BsonDocument();
foreach (var (key, entry) in FetchCache)
{
cacheDocument[key] = entry.ToBson();
}
document["fetchCache"] = cacheDocument;
}
return document;
}
public static ChromiumCursor FromBsonDocument(BsonDocument? document)
{
if (document is null || document.ElementCount == 0)
{
return Empty;
}
DateTimeOffset? lastPublished = null;
if (document.TryGetValue("lastPublished", out var lastPublishedValue))
{
lastPublished = ReadDateTime(lastPublishedValue);
}
var pendingDocuments = ReadGuidArray(document, "pendingDocuments");
var pendingMappings = ReadGuidArray(document, "pendingMappings");
var fetchCache = ReadFetchCache(document);
return new ChromiumCursor(lastPublished, pendingDocuments, pendingMappings, fetchCache);
}
public ChromiumCursor WithLastPublished(DateTimeOffset? lastPublished)
=> this with { LastPublished = lastPublished?.ToUniversalTime() };
public ChromiumCursor WithPendingDocuments(IEnumerable<Guid> ids)
=> this with { PendingDocuments = ids?.Distinct().ToArray() ?? Array.Empty<Guid>() };
public ChromiumCursor WithPendingMappings(IEnumerable<Guid> ids)
=> this with { PendingMappings = ids?.Distinct().ToArray() ?? Array.Empty<Guid>() };
public ChromiumCursor WithFetchCache(IDictionary<string, ChromiumFetchCacheEntry> cache)
=> this with { FetchCache = cache is null ? new Dictionary<string, ChromiumFetchCacheEntry>(StringComparer.Ordinal) : new Dictionary<string, ChromiumFetchCacheEntry>(cache, StringComparer.Ordinal) };
public bool TryGetFetchCache(string key, out ChromiumFetchCacheEntry entry)
=> FetchCache.TryGetValue(key, out entry);
private static DateTimeOffset? ReadDateTime(BsonValue value)
{
return value.BsonType switch
{
BsonType.DateTime => DateTime.SpecifyKind(value.ToUniversalTime(), DateTimeKind.Utc),
BsonType.String when DateTimeOffset.TryParse(value.AsString, out var parsed) => parsed.ToUniversalTime(),
_ => null,
};
}
private static IReadOnlyCollection<Guid> ReadGuidArray(BsonDocument document, string field)
{
if (!document.TryGetValue(field, out var value) || value is not BsonArray array)
{
return Array.Empty<Guid>();
}
var list = new List<Guid>(array.Count);
foreach (var element in array)
{
if (Guid.TryParse(element.ToString(), out var guid))
{
list.Add(guid);
}
}
return list;
}
private static IReadOnlyDictionary<string, ChromiumFetchCacheEntry> ReadFetchCache(BsonDocument document)
{
if (!document.TryGetValue("fetchCache", out var value) || value is not BsonDocument cacheDocument)
{
return new Dictionary<string, ChromiumFetchCacheEntry>(StringComparer.Ordinal);
}
var dictionary = new Dictionary<string, ChromiumFetchCacheEntry>(StringComparer.Ordinal);
foreach (var element in cacheDocument.Elements)
{
if (element.Value is BsonDocument entryDocument)
{
dictionary[element.Name] = ChromiumFetchCacheEntry.FromBson(entryDocument);
}
}
return dictionary;
}
}
internal sealed record ChromiumFetchCacheEntry(string Sha256)
{
public static ChromiumFetchCacheEntry Empty { get; } = new(string.Empty);
public BsonDocument ToBson()
{
var document = new BsonDocument
{
["sha256"] = Sha256,
};
return document;
}
public static ChromiumFetchCacheEntry FromBson(BsonDocument document)
{
// Tolerate a missing or non-string value instead of throwing on AsString.
var sha = document.TryGetValue("sha256", out var shaValue) && shaValue.IsString ? shaValue.AsString : string.Empty;
return new ChromiumFetchCacheEntry(sha);
}
}


@@ -0,0 +1,78 @@
using StellaOps.Concelier.Storage.Mongo.Documents;
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
internal sealed record ChromiumDocumentMetadata(
string PostId,
string Title,
Uri DetailUrl,
DateTimeOffset Published,
DateTimeOffset? Updated,
string? Summary)
{
private const string PostIdKey = "postId";
private const string TitleKey = "title";
private const string PublishedKey = "published";
private const string UpdatedKey = "updated";
private const string SummaryKey = "summary";
public static ChromiumDocumentMetadata FromDocument(DocumentRecord document)
{
ArgumentNullException.ThrowIfNull(document);
var metadata = document.Metadata ?? throw new InvalidOperationException("Chromium document metadata missing.");
if (!metadata.TryGetValue(PostIdKey, out var postId) || string.IsNullOrWhiteSpace(postId))
{
throw new InvalidOperationException("Chromium document metadata missing postId.");
}
if (!metadata.TryGetValue(TitleKey, out var title) || string.IsNullOrWhiteSpace(title))
{
throw new InvalidOperationException("Chromium document metadata missing title.");
}
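// Timestamps are written in the round-trip ("O") format, so parse them culture-invariantly.
if (!metadata.TryGetValue(PublishedKey, out var publishedString)
|| !DateTimeOffset.TryParse(publishedString, System.Globalization.CultureInfo.InvariantCulture, System.Globalization.DateTimeStyles.AssumeUniversal, out var published))
{
throw new InvalidOperationException("Chromium document metadata missing published timestamp.");
}
DateTimeOffset? updated = null;
if (metadata.TryGetValue(UpdatedKey, out var updatedString)
&& DateTimeOffset.TryParse(updatedString, System.Globalization.CultureInfo.InvariantCulture, System.Globalization.DateTimeStyles.AssumeUniversal, out var updatedValue))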
{
updated = updatedValue;
}
metadata.TryGetValue(SummaryKey, out var summary);
return new ChromiumDocumentMetadata(
postId.Trim(),
title.Trim(),
new Uri(document.Uri, UriKind.Absolute),
published.ToUniversalTime(),
updated?.ToUniversalTime(),
string.IsNullOrWhiteSpace(summary) ? null : summary.Trim());
}
public static IReadOnlyDictionary<string, string> CreateMetadata(string postId, string title, DateTimeOffset published, DateTimeOffset? updated, string? summary)
{
var dictionary = new Dictionary<string, string>(StringComparer.Ordinal)
{
[PostIdKey] = postId,
[TitleKey] = title,
[PublishedKey] = published.ToUniversalTime().ToString("O"),
};
if (updated.HasValue)
{
dictionary[UpdatedKey] = updated.Value.ToUniversalTime().ToString("O");
}
if (!string.IsNullOrWhiteSpace(summary))
{
dictionary[SummaryKey] = summary.Trim();
}
return dictionary;
}
}


@@ -0,0 +1,39 @@
using System.Text.Json.Serialization;
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
internal sealed record ChromiumDto(
[property: JsonPropertyName("postId")] string PostId,
[property: JsonPropertyName("title")] string Title,
[property: JsonPropertyName("detailUrl")] string DetailUrl,
[property: JsonPropertyName("published")] DateTimeOffset Published,
[property: JsonPropertyName("updated")] DateTimeOffset? Updated,
[property: JsonPropertyName("summary")] string? Summary,
[property: JsonPropertyName("cves")] IReadOnlyList<string> Cves,
[property: JsonPropertyName("platforms")] IReadOnlyList<string> Platforms,
[property: JsonPropertyName("versions")] IReadOnlyList<ChromiumVersionInfo> Versions,
[property: JsonPropertyName("references")] IReadOnlyList<ChromiumReference> References)
{
public static ChromiumDto From(ChromiumDocumentMetadata metadata, IReadOnlyList<string> cves, IReadOnlyList<string> platforms, IReadOnlyList<ChromiumVersionInfo> versions, IReadOnlyList<ChromiumReference> references)
=> new(
metadata.PostId,
metadata.Title,
metadata.DetailUrl.ToString(),
metadata.Published,
metadata.Updated,
metadata.Summary,
cves,
platforms,
versions,
references);
}
internal sealed record ChromiumVersionInfo(
[property: JsonPropertyName("platform")] string Platform,
[property: JsonPropertyName("channel")] string Channel,
[property: JsonPropertyName("version")] string Version);
internal sealed record ChromiumReference(
[property: JsonPropertyName("url")] string Url,
[property: JsonPropertyName("kind")] string Kind,
[property: JsonPropertyName("label")] string? Label);


@@ -0,0 +1,24 @@
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
public sealed record ChromiumFeedEntry(
string EntryId,
string PostId,
string Title,
Uri DetailUri,
DateTimeOffset Published,
DateTimeOffset? Updated,
string? Summary,
IReadOnlyCollection<string> Categories)
{
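// An entry counts as a security update when it is categorised as "Stable updates"
// or its title names a stable-channel release.
public bool IsSecurityUpdate()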
{
if (Categories.Count > 0 && Categories.Contains("Stable updates", StringComparer.OrdinalIgnoreCase))
{
return true;
}
return Title.Contains("Stable Channel Update", StringComparison.OrdinalIgnoreCase)
|| Title.Contains("Extended Stable", StringComparison.OrdinalIgnoreCase)
|| Title.Contains("Stable Channel Desktop", StringComparison.OrdinalIgnoreCase);
}
}


@@ -0,0 +1,147 @@
using System.ServiceModel.Syndication;
using System.Xml;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using StellaOps.Concelier.Connector.Vndr.Chromium.Configuration;
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
public sealed class ChromiumFeedLoader
{
private readonly IHttpClientFactory _httpClientFactory;
private readonly ChromiumOptions _options;
private readonly ILogger<ChromiumFeedLoader> _logger;
public ChromiumFeedLoader(IHttpClientFactory httpClientFactory, IOptions<ChromiumOptions> options, ILogger<ChromiumFeedLoader> logger)
{
_httpClientFactory = httpClientFactory ?? throw new ArgumentNullException(nameof(httpClientFactory));
_options = options?.Value ?? throw new ArgumentNullException(nameof(options));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
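// Pages through the Atom feed (up to MaxFeedPages) and returns the entries published inside
// [windowStart, windowEnd], de-duplicated by detail URI and ordered by publish date.
public async Task<IReadOnlyList<ChromiumFeedEntry>> LoadAsync(DateTimeOffset windowStart, DateTimeOffset windowEnd, CancellationToken cancellationToken)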
{
var client = _httpClientFactory.CreateClient(ChromiumOptions.HttpClientName);
var results = new List<ChromiumFeedEntry>();
var startIndex = 1;
for (var page = 0; page < _options.MaxFeedPages; page++)
{
var requestUri = BuildRequestUri(startIndex);
using var response = await client.GetAsync(requestUri, cancellationToken).ConfigureAwait(false);
response.EnsureSuccessStatusCode();
await using var stream = await response.Content.ReadAsStreamAsync(cancellationToken).ConfigureAwait(false);
using var reader = XmlReader.Create(stream);
var feed = SyndicationFeed.Load(reader);
if (feed is null || feed.Items is null)
{
break;
}
var pageEntries = new List<ChromiumFeedEntry>();
foreach (var entry in feed.Items)
{
var published = entry.PublishDate != DateTimeOffset.MinValue
? entry.PublishDate.ToUniversalTime()
: entry.LastUpdatedTime.ToUniversalTime();
if (published > windowEnd || published < windowStart - _options.WindowOverlap)
{
continue;
}
var detailUri = entry.Links.FirstOrDefault(link => string.Equals(link.RelationshipType, "alternate", StringComparison.OrdinalIgnoreCase))?.Uri;
if (detailUri is null)
{
continue;
}
var postId = ExtractPostId(detailUri);
if (string.IsNullOrEmpty(postId))
{
continue;
}
var categories = entry.Categories.Select(static cat => cat.Name).Where(static name => !string.IsNullOrWhiteSpace(name)).ToArray();
var chromiumEntry = new ChromiumFeedEntry(
entry.Id ?? detailUri.ToString(),
postId,
entry.Title?.Text?.Trim() ?? postId,
detailUri,
published,
entry.LastUpdatedTime == DateTimeOffset.MinValue ? null : entry.LastUpdatedTime.ToUniversalTime(),
entry.Summary?.Text?.Trim(),
categories);
if (chromiumEntry.Published >= windowStart && chromiumEntry.Published <= windowEnd)
{
pageEntries.Add(chromiumEntry);
}
}
if (pageEntries.Count == 0)
{
var oldest = feed.Items?.Select(static item => item.PublishDate).Where(static dt => dt != DateTimeOffset.MinValue).OrderBy(static dt => dt).FirstOrDefault();
if (oldest.HasValue && oldest.Value.ToUniversalTime() < windowStart)
{
break;
}
}
results.AddRange(pageEntries);
if (feed.Items?.Any() != true)
{
break;
}
var nextLink = feed.Links?.FirstOrDefault(link => string.Equals(link.RelationshipType, "next", StringComparison.OrdinalIgnoreCase))?.Uri;
if (nextLink is null)
{
break;
}
startIndex += _options.MaxEntriesPerPage;
}
return results
.DistinctBy(static entry => entry.DetailUri)
.OrderBy(static entry => entry.Published)
.ToArray();
}
private Uri BuildRequestUri(int startIndex)
{
var builder = new UriBuilder(_options.FeedUri);
var query = new List<string>();
if (!string.IsNullOrEmpty(builder.Query))
{
query.Add(builder.Query.TrimStart('?'));
}
query.Add($"max-results={_options.MaxEntriesPerPage}");
query.Add($"start-index={startIndex}");
query.Add("redirect=false");
builder.Query = string.Join('&', query);
return builder.Uri;
}
private static string ExtractPostId(Uri detailUri)
{
var segments = detailUri.Segments;
if (segments.Length == 0)
{
return detailUri.AbsoluteUri;
}
var last = segments[^1].Trim('/');
if (last.EndsWith(".html", StringComparison.OrdinalIgnoreCase))
{
last = last[..^5];
}
return last.Replace('/', '-');
}
}
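
The paging performed by `LoadAsync` and `BuildRequestUri` above can be sketched in isolation. This is a minimal illustration, not the connector's code: the feed URI is a made-up placeholder (the real value comes from `ChromiumOptions.FeedUri`), and the early-exit conditions (missing `next` link, page older than the window) are deliberately left out so that only the `start-index` / `max-results` arithmetic is visible.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical feed URI used only for this sketch.
var sampleFeedUri = new Uri("https://example.test/feeds/posts/default?alt=atom");

foreach (var pageUri in EnumeratePageUris(sampleFeedUri, maxPages: 3, pageSize: 25))
{
    // Prints the query string of each page request; start-index advances 1, 26, 51.
    Console.WriteLine(pageUri.Query);
}

// Appends the paging parameters while keeping any query already present on the base URI.
static Uri BuildPageUri(Uri baseUri, int startIndex, int pageSize)
{
    var builder = new UriBuilder(baseUri);
    var query = new List<string>();
    if (!string.IsNullOrEmpty(builder.Query))
    {
        query.Add(builder.Query.TrimStart('?'));
    }

    query.Add($"max-results={pageSize}");
    query.Add($"start-index={startIndex}");
    query.Add("redirect=false");
    builder.Query = string.Join('&', query);
    return builder.Uri;
}

// Yields one request URI per page; the index is 1-based and grows by the page size.
static IEnumerable<Uri> EnumeratePageUris(Uri baseUri, int maxPages, int pageSize)
{
    var startIndex = 1;
    for (var page = 0; page < maxPages; page++)
    {
        yield return BuildPageUri(baseUri, startIndex, pageSize);
        startIndex += pageSize;
    }
}
```

Because the index advances by the configured page size rather than by the number of entries actually returned, a short page does not shift later requests; the loader relies on the `next` link and the window check to stop early.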


@@ -0,0 +1,174 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Globalization;
using StellaOps.Concelier.Models;
using StellaOps.Concelier.Storage.Mongo.PsirtFlags;
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
internal static class ChromiumMapper
{
private const string VendorIdentifier = "google:chrome";
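/// <summary>
/// Maps a parsed post to a canonical advisory and its PSIRT flag record.
/// Each version entry becomes a vendor range with only the fixed version set.
/// </summary>
public static (Advisory Advisory, PsirtFlagRecord Flag) Map(ChromiumDto dto, string sourceName, DateTimeOffset recordedAt)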
{
ArgumentNullException.ThrowIfNull(dto);
ArgumentException.ThrowIfNullOrEmpty(sourceName);
var advisoryKey = $"chromium/post/{dto.PostId}";
var provenance = new AdvisoryProvenance(sourceName, "document", dto.PostId, recordedAt.ToUniversalTime());
var aliases = BuildAliases(dto).ToArray();
var references = BuildReferences(dto, provenance).ToArray();
var affectedPackages = BuildAffected(dto, provenance).ToArray();
var advisory = new Advisory(
advisoryKey,
dto.Title,
dto.Summary,
language: "en",
dto.Published.ToUniversalTime(),
dto.Updated?.ToUniversalTime(),
severity: null,
exploitKnown: false,
aliases,
references,
affectedPackages,
Array.Empty<CvssMetric>(),
new[] { provenance });
var flag = new PsirtFlagRecord(
advisoryKey,
"Google",
sourceName,
dto.PostId,
recordedAt.ToUniversalTime());
return (advisory, flag);
}
private static IEnumerable<string> BuildAliases(ChromiumDto dto)
{
yield return $"CHROMIUM-POST:{dto.PostId}";
yield return $"CHROMIUM-POST:{dto.Published:yyyy-MM-dd}";
foreach (var cve in dto.Cves)
{
yield return cve;
}
}
private static IEnumerable<AdvisoryReference> BuildReferences(ChromiumDto dto, AdvisoryProvenance provenance)
{
var comparer = StringComparer.OrdinalIgnoreCase;
var references = new List<(AdvisoryReference Reference, int Priority)>
{
(new AdvisoryReference(dto.DetailUrl, "advisory", "chromium-blog", summary: null, provenance), 0),
};
foreach (var reference in dto.References)
{
var summary = string.IsNullOrWhiteSpace(reference.Label) ? null : reference.Label;
var sourceTag = string.IsNullOrWhiteSpace(reference.Kind) ? null : reference.Kind;
var advisoryReference = new AdvisoryReference(reference.Url, reference.Kind, sourceTag, summary, provenance);
references.Add((advisoryReference, 1));
}
return references
.GroupBy(tuple => tuple.Reference.Url, comparer)
.Select(group => group
.OrderBy(t => t.Priority)
.ThenBy(t => t.Reference.Kind ?? string.Empty, comparer)
.ThenBy(t => t.Reference.SourceTag ?? string.Empty, comparer)
.ThenBy(t => t.Reference.Url, comparer)
.First())
.OrderBy(t => t.Priority)
.ThenBy(t => t.Reference.Kind ?? string.Empty, comparer)
.ThenBy(t => t.Reference.Url, comparer)
.Select(t => t.Reference);
}
private static IEnumerable<AffectedPackage> BuildAffected(ChromiumDto dto, AdvisoryProvenance provenance)
{
foreach (var version in dto.Versions)
{
var identifier = version.Channel switch
{
"extended-stable" => $"{VendorIdentifier}:extended-stable",
"beta" => $"{VendorIdentifier}:beta",
"dev" => $"{VendorIdentifier}:dev",
_ => VendorIdentifier,
};
var range = new AffectedVersionRange(
rangeKind: "vendor",
introducedVersion: null,
fixedVersion: version.Version,
lastAffectedVersion: null,
rangeExpression: null,
provenance,
primitives: BuildRangePrimitives(version));
yield return new AffectedPackage(
AffectedPackageTypes.Vendor,
identifier,
version.Platform,
new[] { range },
statuses: Array.Empty<AffectedPackageStatus>(),
provenance: new[] { provenance });
}
}
private static RangePrimitives? BuildRangePrimitives(ChromiumVersionInfo version)
{
var extensions = new Dictionary<string, string>(StringComparer.Ordinal);
AddExtension(extensions, "chromium.channel", version.Channel);
AddExtension(extensions, "chromium.platform", version.Platform);
AddExtension(extensions, "chromium.version.raw", version.Version);
if (Version.TryParse(version.Version, out var parsed))
{
AddExtension(extensions, "chromium.version.normalized", BuildNormalizedVersion(parsed));
extensions["chromium.version.major"] = parsed.Major.ToString(CultureInfo.InvariantCulture);
extensions["chromium.version.minor"] = parsed.Minor.ToString(CultureInfo.InvariantCulture);
if (parsed.Build >= 0)
{
extensions["chromium.version.build"] = parsed.Build.ToString(CultureInfo.InvariantCulture);
}
if (parsed.Revision >= 0)
{
extensions["chromium.version.patch"] = parsed.Revision.ToString(CultureInfo.InvariantCulture);
}
}
return extensions.Count == 0 ? null : new RangePrimitives(null, null, null, extensions);
}
private static string BuildNormalizedVersion(Version version)
{
if (version.Build >= 0 && version.Revision >= 0)
{
return $"{version.Major}.{version.Minor}.{version.Build}.{version.Revision}";
}
if (version.Build >= 0)
{
return $"{version.Major}.{version.Minor}.{version.Build}";
}
return $"{version.Major}.{version.Minor}";
}
private static void AddExtension(Dictionary<string, string> extensions, string key, string? value)
{
if (string.IsNullOrWhiteSpace(value))
{
return;
}
extensions[key] = value.Trim();
}
}
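
How `BuildRangePrimitives` breaks a Chrome version into components can be shown with `System.Version` alone. The sketch below is an illustration under simplified assumptions: the short keys (`raw`, `major`, ...) stand in for the `chromium.version.*` extension keys, and the `Decompose` helper is hypothetical. The point it demonstrates is that `Version` reports `-1` for a missing `Build` or `Revision`, which is why the mapper guards both with `>= 0`.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

foreach (var sample in new[] { "127.0.6533.88", "127.0", "not-a-version" })
{
    var parts = Decompose(sample);
    Console.WriteLine($"{sample}: {parts.Count} key(s)");
}

// Splits a version string into components; unparseable input keeps only the raw value.
static Dictionary<string, string> Decompose(string input)
{
    var parts = new Dictionary<string, string>(StringComparer.Ordinal) { ["raw"] = input };
    if (!Version.TryParse(input, out var parsed))
    {
        return parts;
    }

    parts["major"] = parsed.Major.ToString(CultureInfo.InvariantCulture);
    parts["minor"] = parsed.Minor.ToString(CultureInfo.InvariantCulture);
    if (parsed.Build >= 0)
    {
        parts["build"] = parsed.Build.ToString(CultureInfo.InvariantCulture);
    }

    if (parsed.Revision >= 0)
    {
        // The mapper stores the fourth component under the "patch" key.
        parts["patch"] = parsed.Revision.ToString(CultureInfo.InvariantCulture);
    }

    return parts;
}
```

A four-part Chrome version yields all components, a two-part string yields only major and minor, and text that `Version.TryParse` rejects leaves just the raw value.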


@@ -0,0 +1,282 @@
using System.Text.RegularExpressions;
using AngleSharp.Dom;
using AngleSharp.Html.Parser;
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
internal static class ChromiumParser
{
private static readonly HtmlParser HtmlParser = new();
private static readonly Regex CveRegex = new("CVE-\\d{4}-\\d{4,}", RegexOptions.Compiled | RegexOptions.IgnoreCase);
private static readonly Regex VersionRegex = new("(?<version>\\d+\\.\\d+\\.\\d+\\.\\d+)", RegexOptions.Compiled);
public static ChromiumDto Parse(string html, ChromiumDocumentMetadata metadata)
{
ArgumentException.ThrowIfNullOrEmpty(html);
ArgumentNullException.ThrowIfNull(metadata);
var document = HtmlParser.ParseDocument(html);
var body = document.QuerySelector("div.post-body") ?? document.Body;
if (body is null)
{
throw new InvalidOperationException("Chromium post body not found.");
}
var cves = ExtractCves(body);
var versions = ExtractVersions(body);
var platforms = versions.Select(static v => v.Platform).Distinct(StringComparer.OrdinalIgnoreCase).ToArray();
var references = ExtractReferences(body, metadata.DetailUrl);
return ChromiumDto.From(metadata, cves, platforms, versions, references);
}
private static IReadOnlyList<string> ExtractCves(IElement body)
{
var matches = CveRegex.Matches(body.TextContent ?? string.Empty);
return matches
.Select(static match => match.Value.ToUpperInvariant())
.Distinct(StringComparer.Ordinal)
.OrderBy(static cve => cve, StringComparer.Ordinal)
.ToArray();
}
private static IReadOnlyList<ChromiumVersionInfo> ExtractVersions(IElement body)
{
var results = new Dictionary<string, ChromiumVersionInfo>(StringComparer.OrdinalIgnoreCase);
var elements = body.QuerySelectorAll("p,li");
if (elements.Length == 0)
{
elements = body.QuerySelectorAll("div,span");
}
foreach (var element in elements)
{
var text = element.TextContent?.Trim();
if (string.IsNullOrEmpty(text))
{
continue;
}
var channel = DetermineChannel(text);
foreach (Match match in VersionRegex.Matches(text))
{
var version = match.Groups["version"].Value;
var platform = DeterminePlatform(text, match);
var key = string.Join('|', platform.ToLowerInvariant(), channel.ToLowerInvariant(), version);
results.TryAdd(key, new ChromiumVersionInfo(platform, channel, version));
}
}
return results.Values
.OrderBy(static v => v.Platform, StringComparer.OrdinalIgnoreCase)
.ThenBy(static v => v.Channel, StringComparer.OrdinalIgnoreCase)
.ThenBy(static v => v.Version, StringComparer.Ordinal)
.ToArray();
}
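// Infers the platform for a version match from the "for <platform>" phrase that follows it,
// then from the surrounding text, and finally falls back to "desktop".
private static string DeterminePlatform(string text, Match match)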
{
var after = ExtractSlice(text, match.Index + match.Length, Math.Min(120, text.Length - (match.Index + match.Length)));
var segment = ExtractPlatformSegment(after);
var normalized = NormalizePlatform(segment);
if (!string.IsNullOrEmpty(normalized))
{
return normalized!;
}
var before = ExtractSlice(text, Math.Max(0, match.Index - 80), Math.Min(80, match.Index));
normalized = NormalizePlatform(before + " " + after);
return string.IsNullOrEmpty(normalized) ? "desktop" : normalized!;
}
private static string DetermineChannel(string text)
{
if (text.Contains("Extended Stable", StringComparison.OrdinalIgnoreCase))
{
return "extended-stable";
}
if (text.Contains("Beta", StringComparison.OrdinalIgnoreCase))
{
return "beta";
}
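// Match "Dev" as a whole word so that words such as "devices" or "developer" do not select the dev channel.
if (Regex.IsMatch(text, @"\bDev\b", RegexOptions.IgnoreCase))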
{
return "dev";
}
return "stable";
}
private static string ExtractSlice(string text, int start, int length)
{
if (length <= 0)
{
return string.Empty;
}
return text.Substring(start, length);
}
private static string ExtractPlatformSegment(string after)
{
if (string.IsNullOrEmpty(after))
{
return string.Empty;
}
var forIndex = after.IndexOf("for ", StringComparison.OrdinalIgnoreCase);
if (forIndex < 0)
{
return string.Empty;
}
var remainder = after[(forIndex + 4)..];
var terminatorIndex = remainder.IndexOfAny(new[] { '.', ';', '\n', '(', ')' });
if (terminatorIndex >= 0)
{
remainder = remainder[..terminatorIndex];
}
var digitIndex = remainder.IndexOfAny("0123456789".ToCharArray());
if (digitIndex >= 0)
{
remainder = remainder[..digitIndex];
}
var whichIndex = remainder.IndexOf(" which", StringComparison.OrdinalIgnoreCase);
if (whichIndex >= 0)
{
remainder = remainder[..whichIndex];
}
return remainder.Trim();
}
private static string? NormalizePlatform(string? value)
{
if (string.IsNullOrWhiteSpace(value))
{
return null;
}
var normalized = value.Replace("/", " ", StringComparison.OrdinalIgnoreCase)
.Replace(" and ", " ", StringComparison.OrdinalIgnoreCase)
.Replace("&", " ", StringComparison.OrdinalIgnoreCase)
.Trim();
if (normalized.Contains("android", StringComparison.OrdinalIgnoreCase))
{
return "android";
}
if (normalized.Contains("chromeos flex", StringComparison.OrdinalIgnoreCase))
{
return "chromeos-flex";
}
if (normalized.Contains("chromeos", StringComparison.OrdinalIgnoreCase) || normalized.Contains("chrome os", StringComparison.OrdinalIgnoreCase))
{
return "chromeos";
}
if (normalized.Contains("linux", StringComparison.OrdinalIgnoreCase))
{
return "linux";
}
var hasWindows = normalized.Contains("windows", StringComparison.OrdinalIgnoreCase);
var hasMac = normalized.Contains("mac", StringComparison.OrdinalIgnoreCase);
if (hasWindows && hasMac)
{
return "windows-mac";
}
if (hasWindows)
{
return "windows";
}
if (hasMac)
{
return "mac";
}
return null;
}
private static IReadOnlyList<ChromiumReference> ExtractReferences(IElement body, Uri detailUri)
{
var references = new Dictionary<string, ChromiumReference>(StringComparer.OrdinalIgnoreCase);
foreach (var anchor in body.QuerySelectorAll("a[href]"))
{
var href = anchor.GetAttribute("href");
if (string.IsNullOrWhiteSpace(href))
{
continue;
}
if (!Uri.TryCreate(href.Trim(), UriKind.Absolute, out var linkUri))
{
continue;
}
if (string.Equals(linkUri.AbsoluteUri, detailUri.AbsoluteUri, StringComparison.OrdinalIgnoreCase))
{
continue;
}
if (!string.Equals(linkUri.Scheme, Uri.UriSchemeHttp, StringComparison.OrdinalIgnoreCase)
&& !string.Equals(linkUri.Scheme, Uri.UriSchemeHttps, StringComparison.OrdinalIgnoreCase))
{
continue;
}
var kind = ClassifyReference(linkUri);
var label = anchor.TextContent?.Trim();
references.TryAdd(linkUri.AbsoluteUri, new ChromiumReference(linkUri.AbsoluteUri, kind, string.IsNullOrWhiteSpace(label) ? null : label));
}
return references.Values
.OrderBy(static r => r.Url, StringComparer.Ordinal)
.ThenBy(static r => r.Kind, StringComparer.Ordinal)
.ToArray();
}
private static string ClassifyReference(Uri uri)
{
var host = uri.Host;
if (host.Contains("googlesource.com", StringComparison.OrdinalIgnoreCase))
{
return "changelog";
}
if (host.Contains("issues.chromium.org", StringComparison.OrdinalIgnoreCase)
|| host.Contains("bugs.chromium.org", StringComparison.OrdinalIgnoreCase)
|| host.Contains("crbug.com", StringComparison.OrdinalIgnoreCase))
{
return "bug";
}
if (host.Contains("chromium.org", StringComparison.OrdinalIgnoreCase))
{
return "doc";
}
if (host.Contains("google.com", StringComparison.OrdinalIgnoreCase))
{
return "google";
}
return "reference";
}
}
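
The two regular expressions used by `ChromiumParser` can be exercised on plain text. The sketch below assumes a made-up sentence in the style of a Stable Channel Update post; the CVE identifiers are placeholders, and the helper names are hypothetical. It shows that CVE ids are upper-cased, de-duplicated, and sorted, and that a shorthand such as `/.89` is not captured because the version pattern requires four dot-separated numbers.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Illustrative text only; the CVE ids are placeholders, not real advisories.
var sample = "The Stable channel has been updated to 127.0.6533.88/.89 for Windows, Mac "
           + "and 127.0.6533.88 for Linux. Fixes cve-2099-0001 and CVE-2099-0002 (see also CVE-2099-0001).";

Console.WriteLine(string.Join(", ", ExtractCveIds(sample)));      // CVE-2099-0001, CVE-2099-0002
Console.WriteLine(string.Join(", ", ExtractFullVersions(sample))); // 127.0.6533.88

// Case-insensitive match, then normalize to upper case, de-duplicate, and sort ordinally.
static string[] ExtractCveIds(string text) =>
    Regex.Matches(text, @"CVE-\d{4}-\d{4,}", RegexOptions.IgnoreCase)
        .Select(match => match.Value.ToUpperInvariant())
        .Distinct(StringComparer.Ordinal)
        .OrderBy(cve => cve, StringComparer.Ordinal)
        .ToArray();

// Only complete four-part versions match; abbreviated suffixes such as "/.89" are skipped.
static string[] ExtractFullVersions(string text) =>
    Regex.Matches(text, @"\d+\.\d+\.\d+\.\d+")
        .Select(match => match.Value)
        .Distinct(StringComparer.Ordinal)
        .ToArray();
```

One consequence worth keeping in mind when reading mapped advisories: a post that lists `127.0.6533.88/.89` contributes only the first version as a fixed version.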


@@ -0,0 +1,25 @@
using System.IO;
using System.Reflection;
using System.Threading;
using Json.Schema;
namespace StellaOps.Concelier.Connector.Vndr.Chromium.Internal;
internal static class ChromiumSchemaProvider
{
private static readonly Lazy<JsonSchema> Cached = new(Load, LazyThreadSafetyMode.ExecutionAndPublication);
public static JsonSchema Schema => Cached.Value;
private static JsonSchema Load()
{
var assembly = typeof(ChromiumSchemaProvider).GetTypeInfo().Assembly;
const string resourceName = "StellaOps.Concelier.Connector.Vndr.Chromium.Schemas.chromium-post.schema.json";
using var stream = assembly.GetManifestResourceStream(resourceName)
?? throw new InvalidOperationException($"Embedded schema '{resourceName}' not found.");
using var reader = new StreamReader(stream);
var schemaText = reader.ReadToEnd();
return JsonSchema.FromText(schemaText);
}
}
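
`ChromiumSchemaProvider` relies on `Lazy<T>` with `LazyThreadSafetyMode.ExecutionAndPublication` so that the embedded schema is read and parsed once, even when several threads ask for it at the same time. The following sketch isolates that behavior with a placeholder string instead of a real schema; the counter exists only to make the single execution observable.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var loadCount = 0;
var cached = new Lazy<string>(
    () =>
    {
        // Stand-in for reading the embedded resource and parsing it.
        Interlocked.Increment(ref loadCount);
        return "{ \"type\": \"object\" }";
    },
    LazyThreadSafetyMode.ExecutionAndPublication);

// Many concurrent readers; the factory still runs exactly once.
Parallel.For(0, 16, i => { _ = cached.Value; });

Console.WriteLine(Volatile.Read(ref loadCount)); // 1
```

With this mode an exception thrown by the factory is cached as well, so a missing embedded resource would fail on every access rather than being retried.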


@@ -0,0 +1,3 @@
using System.Runtime.CompilerServices;
[assembly: InternalsVisibleTo("StellaOps.Concelier.Connector.Vndr.Chromium.Tests")]


@@ -0,0 +1,97 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://stellaops.example/schemas/chromium-post.schema.json",
"type": "object",
"required": [
"postId",
"title",
"detailUrl",
"published",
"cves",
"platforms",
"versions",
"references"
],
"properties": {
"postId": {
"type": "string",
"minLength": 1
},
"title": {
"type": "string",
"minLength": 1
},
"detailUrl": {
"type": "string",
"format": "uri"
},
"published": {
"type": "string",
"format": "date-time"
},
"updated": {
"type": ["string", "null"],
"format": "date-time"
},
"summary": {
"type": ["string", "null"]
},
"cves": {
"type": "array",
"uniqueItems": true,
"items": {
"type": "string",
"pattern": "^CVE-\\d{4}-\\d{4,}$"
}
},
"platforms": {
"type": "array",
"items": {
"type": "string",
"minLength": 1
}
},
"versions": {
"type": "array",
"minItems": 1,
"items": {
"type": "object",
"required": ["platform", "channel", "version"],
"properties": {
"platform": {
"type": "string",
"minLength": 1
},
"channel": {
"type": "string",
"minLength": 1
},
"version": {
"type": "string",
"minLength": 4
}
}
}
},
"references": {
"type": "array",
"items": {
"type": "object",
"required": ["url", "kind"],
"properties": {
"url": {
"type": "string",
"format": "uri"
},
"kind": {
"type": "string",
"minLength": 1
},
"label": {
"type": ["string", "null"]
}
}
}
}
}
}


@@ -0,0 +1,32 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="AngleSharp" Version="1.1.1" />
<PackageReference Include="System.ServiceModel.Syndication" Version="8.0.0" />
</ItemGroup>
<ItemGroup>
<EmbeddedResource Include="Schemas\chromium-post.schema.json" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="../StellaOps.Plugin/StellaOps.Plugin.csproj" />
<ProjectReference Include="../StellaOps.Concelier.Connector.Common/StellaOps.Concelier.Connector.Common.csproj" />
<ProjectReference Include="../StellaOps.Concelier.Models/StellaOps.Concelier.Models.csproj" />
<ProjectReference Include="../StellaOps.Concelier.Storage.Mongo/StellaOps.Concelier.Storage.Mongo.csproj" />
</ItemGroup>
<ItemGroup>
<AssemblyAttribute Include="System.Runtime.CompilerServices.InternalsVisibleTo">
<_Parameter1>StellaOps.Concelier.Connector.Vndr.Chromium.Tests</_Parameter1>
</AssemblyAttribute>
</ItemGroup>
</Project>


@@ -0,0 +1,17 @@
# Source.Vndr.Chromium — Task Board
| ID | Task | Owner | Status | Depends On | Notes |
|------|-----------------------------------------------|-------|--------|------------|-------|
| CH1 | Blog crawl + cursor | Conn | DONE | Common | Sliding window feed reader with cursor persisted. |
| CH2 | Post parser → DTO (CVEs, versions, refs) | QA | DONE | | AngleSharp parser normalizes CVEs, versions, references. |
| CH3 | Canonical mapping (aliases/refs/affected-hint)| Conn | DONE | Models | Deterministic advisory mapping with psirt flags. |
| CH4 | Snapshot tests + resume | QA | DONE | Storage | Deterministic snapshot plus resume scenario via Mongo state. |
| CH5 | Observability | QA | DONE | | Metered fetch/parse/map counters. |
| CH6 | SourceState + SHA dedupe | Conn | DONE | Storage | Cursor tracks SHA cache to skip unchanged posts. |
| CH7 | Stabilize resume integration (preserve pending docs across provider instances) | QA | DONE | Storage.Mongo | Resume integration test exercises pending docs across providers via shared Mongo. |
| CH8 | Mark failed parse documents | Conn | DONE | Storage.Mongo | Parse pipeline marks failures; unit tests assert status transitions. |
| CH9 | Reference dedupe & ordering | Conn | DONE | Models | Mapper groups references by URL and sorts deterministically. |
| CH10 | Range primitives + provenance instrumentation | Conn | DONE | Models, Storage.Mongo | Vendor primitives + logging in place, resume metrics updated, snapshots refreshed. |
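
The reference de-duplication and ordering noted for CH9 can be sketched without the model types. This is a simplified illustration, assuming plain tuples in place of `AdvisoryReference` and made-up URLs: references are grouped by URL (case-insensitively), the lowest-priority entry of each group wins, and the survivors are sorted by priority, kind, and URL so that output is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical candidates; priority 0 marks the post itself, 1 marks links found in the post body.
var candidates = new List<(string Url, string Kind, int Priority)>
{
    ("https://example.test/post", "advisory", 0),
    ("https://EXAMPLE.test/post", "reference", 1),
    ("https://bugs.example.test/1", "bug", 1),
    ("https://git.example.test/log", "changelog", 1),
};

foreach (var reference in Dedupe(candidates))
{
    Console.WriteLine($"{reference.Kind} {reference.Url}");
}

// Keeps one entry per URL (preferring the lowest priority), then orders the result deterministically.
static IReadOnlyList<(string Url, string Kind)> Dedupe(IEnumerable<(string Url, string Kind, int Priority)> items)
{
    var comparer = StringComparer.OrdinalIgnoreCase;
    return items
        .GroupBy(item => item.Url, comparer)
        .Select(matches => matches
            .OrderBy(item => item.Priority)
            .ThenBy(item => item.Kind, comparer)
            .First())
        .OrderBy(item => item.Priority)
        .ThenBy(item => item.Kind, comparer)
        .ThenBy(item => item.Url, comparer)
        .Select(item => (item.Url, item.Kind))
        .ToArray();
}
```

In this sample the second entry collapses into the first because the URLs differ only by case, leaving the advisory link first, followed by the bug and changelog links.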
## Changelog
- YYYY-MM-DD: Created.