Initial commit (history squashed)
2025-10-07 10:14:21 +03:00
commit b97fc7685a
1132 changed files with 117842 additions and 0 deletions


@@ -0,0 +1,27 @@
# AGENTS
## Role
ANSSI CERT-FR advisories connector (avis/alertes) providing national enrichment: advisory metadata, CVE links, mitigation notes, and references.
## Scope
- Harvest CERT-FR items via RSS and/or list pages; follow item pages for detail; window by publish/update date.
- Validate HTML or JSON payloads; extract structured fields; map to canonical aliases, references, severity text.
- Maintain watermarks and de-duplication by content hash; idempotent processing.
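
The date windowing in the first bullet can be sketched as follows. This is a minimal illustration, assuming a last-published watermark, an initial backfill span, and an overlap span (the defaults of 30 and 2 days in `CertFrOptions` are one such configuration); `FetchWindow` is a hypothetical helper name, not part of the connector.

```csharp
using System;

public static class FetchWindow
{
    // Hypothetical helper: computes the [Start, End] window for one fetch run.
    public static (DateTimeOffset Start, DateTimeOffset End) Compute(
        DateTimeOffset now,
        DateTimeOffset? lastPublished,
        TimeSpan initialBackfill,
        TimeSpan windowOverlap)
    {
        // Never look further back than the configured backfill.
        var floor = now - initialBackfill;

        // Re-read a small overlap before the watermark to catch late updates.
        var start = (lastPublished ?? floor) - windowOverlap;
        if (start < floor)
        {
            start = floor;
        }

        return (start, now);
    }
}
```

Clamping to the backfill floor keeps a first run, or a run after a long outage, from requesting an unbounded history.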
## Participants
- Source.Common (HTTP, HTML parsing helpers, validators).
- Storage.Mongo (document, dto, advisory, reference, source_state).
- Models (canonical).
- Core/WebService (jobs: source:cert-fr:fetch|parse|map).
- Merge engine (later), which consumes this connector's output for enrichment only.
## Interfaces & contracts
- Treat CERT-FR as enrichment; never override distro or PSIRT version ranges absent concrete evidence.
- References must include primary bulletin URL and vendor links; tag kind=bulletin/vendor/mitigation appropriately.
- Provenance records cite "cert-fr" with method=parser and source URL.
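
One way to read the reference contract is sketched below. It is an illustration only: `TaggedReference` and `ReferenceTagger` are hypothetical names, and treating every non-bulletin link as `vendor` is a simplifying assumption. Tagging `mitigation` links would need page context that this sketch does not model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for a tagged reference (not the canonical model type).
public sealed record TaggedReference(string Url, string Kind);

public static class ReferenceTagger
{
    public static IReadOnlyList<TaggedReference> Tag(string bulletinUrl, IEnumerable<string> linkedUrls)
    {
        // The primary bulletin URL always comes first and is never duplicated.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { bulletinUrl };
        var result = new List<TaggedReference> { new(bulletinUrl, "bulletin") };

        // Sort the remaining links so the output is deterministic.
        foreach (var url in linkedUrls.OrderBy(u => u, StringComparer.OrdinalIgnoreCase))
        {
            if (seen.Add(url))
            {
                result.Add(new TaggedReference(url, "vendor"));
            }
        }

        return result;
    }
}
```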
## In/Out of scope
In: advisory metadata extraction, references, severity text, watermarking.
Out: OVAL or package-level authority.
## Observability & security expectations
- Metrics: SourceDiagnostics emits shared `feedser.source.http.*` counters/histograms tagged `feedser.source=certfr`, covering fetch counts, parse failures, and map activity.
- Logs: feed URL(s), item ids/urls, extraction durations; no PII; allowlist hostnames.
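
The hostname allowlist mentioned above can be checked with a small predicate. This is a sketch under assumptions: `HostAllowlist` is a hypothetical helper, and the set of allowed hosts is supplied by the caller rather than read from the connector's HTTP client options.

```csharp
using System;
using System.Collections.Generic;

public static class HostAllowlist
{
    // Returns true only for absolute URIs whose host is in the allowlist.
    public static bool IsAllowed(Uri uri, IReadOnlySet<string> allowedHosts)
    {
        ArgumentNullException.ThrowIfNull(uri);
        ArgumentNullException.ThrowIfNull(allowedHosts);
        return uri.IsAbsoluteUri && allowedHosts.Contains(uri.Host);
    }
}
```

Building the set with `StringComparer.OrdinalIgnoreCase` keeps the comparison independent of how the host was written in configuration.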
## Tests
- Author and review coverage in `../StellaOps.Feedser.Source.CertFr.Tests`.
- Shared fixtures (e.g., `MongoIntegrationFixture`, `ConnectorTestHarness`) live in `../StellaOps.Feedser.Testing`.
- Keep fixtures deterministic; match new cases to real-world advisories or regression scenarios.
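
One common way to keep fixtures deterministic is to pin the clock. The connector constructor accepts an optional `TimeProvider`, so a test can pass a fixed one; `FixedTimeProvider` below is a hypothetical test helper, not something the shared testing project is known to contain.

```csharp
using System;

// Hypothetical test helper: a clock that always reports the same instant.
public sealed class FixedTimeProvider : TimeProvider
{
    private readonly DateTimeOffset _utcNow;

    public FixedTimeProvider(DateTimeOffset utcNow)
        => _utcNow = utcNow.ToUniversalTime();

    public override DateTimeOffset GetUtcNow() => _utcNow;
}
```

With a pinned clock, timestamps derived from `GetUtcNow()` are identical across runs, so snapshot comparisons do not drift.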


@@ -0,0 +1,337 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using MongoDB.Bson;
using StellaOps.Feedser.Source.CertFr.Configuration;
using StellaOps.Feedser.Source.CertFr.Internal;
using StellaOps.Feedser.Source.Common;
using StellaOps.Feedser.Source.Common.Fetch;
using StellaOps.Feedser.Storage.Mongo;
using StellaOps.Feedser.Storage.Mongo.Advisories;
using StellaOps.Feedser.Storage.Mongo.Documents;
using StellaOps.Feedser.Storage.Mongo.Dtos;
using StellaOps.Plugin;
namespace StellaOps.Feedser.Source.CertFr;
public sealed class CertFrConnector : IFeedConnector
{
private static readonly JsonSerializerOptions SerializerOptions = new()
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
DefaultIgnoreCondition = System.Text.Json.Serialization.JsonIgnoreCondition.WhenWritingNull,
};
private readonly CertFrFeedClient _feedClient;
private readonly SourceFetchService _fetchService;
private readonly RawDocumentStorage _rawDocumentStorage;
private readonly IDocumentStore _documentStore;
private readonly IDtoStore _dtoStore;
private readonly IAdvisoryStore _advisoryStore;
private readonly ISourceStateRepository _stateRepository;
private readonly CertFrOptions _options;
private readonly TimeProvider _timeProvider;
private readonly ILogger<CertFrConnector> _logger;
public CertFrConnector(
CertFrFeedClient feedClient,
SourceFetchService fetchService,
RawDocumentStorage rawDocumentStorage,
IDocumentStore documentStore,
IDtoStore dtoStore,
IAdvisoryStore advisoryStore,
ISourceStateRepository stateRepository,
IOptions<CertFrOptions> options,
TimeProvider? timeProvider,
ILogger<CertFrConnector> logger)
{
_feedClient = feedClient ?? throw new ArgumentNullException(nameof(feedClient));
_fetchService = fetchService ?? throw new ArgumentNullException(nameof(fetchService));
_rawDocumentStorage = rawDocumentStorage ?? throw new ArgumentNullException(nameof(rawDocumentStorage));
_documentStore = documentStore ?? throw new ArgumentNullException(nameof(documentStore));
_dtoStore = dtoStore ?? throw new ArgumentNullException(nameof(dtoStore));
_advisoryStore = advisoryStore ?? throw new ArgumentNullException(nameof(advisoryStore));
_stateRepository = stateRepository ?? throw new ArgumentNullException(nameof(stateRepository));
_options = (options ?? throw new ArgumentNullException(nameof(options))).Value ?? throw new ArgumentNullException(nameof(options));
_options.Validate();
_timeProvider = timeProvider ?? TimeProvider.System;
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public string SourceName => CertFrConnectorPlugin.SourceName;
public async Task FetchAsync(IServiceProvider services, CancellationToken cancellationToken)
{
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
var now = _timeProvider.GetUtcNow();
var windowEnd = now;
var lastPublished = cursor.LastPublished ?? now - _options.InitialBackfill;
var windowStart = lastPublished - _options.WindowOverlap;
var minStart = now - _options.InitialBackfill;
if (windowStart < minStart)
{
windowStart = minStart;
}
IReadOnlyList<CertFrFeedItem> items;
try
{
items = await _feedClient.LoadAsync(windowStart, windowEnd, cancellationToken).ConfigureAwait(false);
}
catch (Exception ex)
{
_logger.LogError(ex, "Cert-FR feed load failed {Start:o}-{End:o}", windowStart, windowEnd);
await _stateRepository.MarkFailureAsync(SourceName, now, TimeSpan.FromMinutes(10), ex.Message, cancellationToken).ConfigureAwait(false);
throw;
}
if (items.Count == 0)
{
await UpdateCursorAsync(cursor.WithLastPublished(windowEnd), cancellationToken).ConfigureAwait(false);
return;
}
var pendingDocuments = cursor.PendingDocuments.ToList();
var pendingMappings = cursor.PendingMappings.ToList();
var maxPublished = cursor.LastPublished ?? DateTimeOffset.MinValue;
foreach (var item in items)
{
cancellationToken.ThrowIfCancellationRequested();
try
{
var existing = await _documentStore.FindBySourceAndUriAsync(SourceName, item.DetailUri.ToString(), cancellationToken).ConfigureAwait(false);
var request = new SourceFetchRequest(CertFrOptions.HttpClientName, SourceName, item.DetailUri)
{
Metadata = CertFrDocumentMetadata.CreateMetadata(item),
ETag = existing?.Etag,
LastModified = existing?.LastModified,
AcceptHeaders = new[] { "text/html", "application/xhtml+xml", "text/plain;q=0.5" },
};
var result = await _fetchService.FetchAsync(request, cancellationToken).ConfigureAwait(false);
if (result.IsNotModified || !result.IsSuccess || result.Document is null)
{
if (item.Published > maxPublished)
{
maxPublished = item.Published;
}
continue;
}
if (existing is not null
&& string.Equals(existing.Sha256, result.Document.Sha256, StringComparison.OrdinalIgnoreCase)
&& string.Equals(existing.Status, DocumentStatuses.Mapped, StringComparison.Ordinal))
{
await _documentStore.UpdateStatusAsync(result.Document.Id, existing.Status, cancellationToken).ConfigureAwait(false);
if (item.Published > maxPublished)
{
maxPublished = item.Published;
}
continue;
}
if (!pendingDocuments.Contains(result.Document.Id))
{
pendingDocuments.Add(result.Document.Id);
}
if (item.Published > maxPublished)
{
maxPublished = item.Published;
}
if (_options.RequestDelay > TimeSpan.Zero)
{
await Task.Delay(_options.RequestDelay, cancellationToken).ConfigureAwait(false);
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Cert-FR fetch failed for {Uri}", item.DetailUri);
await _stateRepository.MarkFailureAsync(SourceName, _timeProvider.GetUtcNow(), TimeSpan.FromMinutes(5), ex.Message, cancellationToken).ConfigureAwait(false);
throw;
}
}
if (maxPublished == DateTimeOffset.MinValue)
{
maxPublished = cursor.LastPublished ?? windowEnd;
}
var updatedCursor = cursor
.WithPendingDocuments(pendingDocuments)
.WithPendingMappings(pendingMappings)
.WithLastPublished(maxPublished);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
public async Task ParseAsync(IServiceProvider services, CancellationToken cancellationToken)
{
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
if (cursor.PendingDocuments.Count == 0)
{
return;
}
var pendingDocuments = cursor.PendingDocuments.ToList();
var pendingMappings = cursor.PendingMappings.ToList();
foreach (var documentId in cursor.PendingDocuments)
{
cancellationToken.ThrowIfCancellationRequested();
var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);
if (document is null)
{
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
if (!document.GridFsId.HasValue)
{
_logger.LogWarning("Cert-FR document {DocumentId} missing GridFS payload", document.Id);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
CertFrDocumentMetadata metadata;
try
{
metadata = CertFrDocumentMetadata.FromDocument(document);
}
catch (Exception ex)
{
_logger.LogError(ex, "Cert-FR metadata parse failed for document {DocumentId}", document.Id);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
CertFrDto dto;
try
{
var content = await _rawDocumentStorage.DownloadAsync(document.GridFsId.Value, cancellationToken).ConfigureAwait(false);
var html = System.Text.Encoding.UTF8.GetString(content);
dto = CertFrParser.Parse(html, metadata);
}
catch (Exception ex)
{
_logger.LogError(ex, "Cert-FR parse failed for advisory {AdvisoryId} ({Uri})", metadata.AdvisoryId, document.Uri);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
var json = JsonSerializer.Serialize(dto, SerializerOptions);
var payload = BsonDocument.Parse(json);
var validatedAt = _timeProvider.GetUtcNow();
var existingDto = await _dtoStore.FindByDocumentIdAsync(document.Id, cancellationToken).ConfigureAwait(false);
var dtoRecord = existingDto is null
? new DtoRecord(Guid.NewGuid(), document.Id, SourceName, "certfr.detail.v1", payload, validatedAt)
: existingDto with
{
Payload = payload,
SchemaVersion = "certfr.detail.v1",
ValidatedAt = validatedAt,
};
await _dtoStore.UpsertAsync(dtoRecord, cancellationToken).ConfigureAwait(false);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.PendingMap, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
if (!pendingMappings.Contains(documentId))
{
pendingMappings.Add(documentId);
}
}
var updatedCursor = cursor
.WithPendingDocuments(pendingDocuments)
.WithPendingMappings(pendingMappings);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
public async Task MapAsync(IServiceProvider services, CancellationToken cancellationToken)
{
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
if (cursor.PendingMappings.Count == 0)
{
return;
}
var pendingMappings = cursor.PendingMappings.ToList();
foreach (var documentId in cursor.PendingMappings)
{
cancellationToken.ThrowIfCancellationRequested();
var dtoRecord = await _dtoStore.FindByDocumentIdAsync(documentId, cancellationToken).ConfigureAwait(false);
var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);
if (dtoRecord is null || document is null)
{
pendingMappings.Remove(documentId);
continue;
}
CertFrDto? dto;
try
{
var json = dtoRecord.Payload.ToJson();
dto = JsonSerializer.Deserialize<CertFrDto>(json, SerializerOptions);
}
catch (Exception ex)
{
_logger.LogError(ex, "Cert-FR DTO deserialization failed for document {DocumentId}", documentId);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingMappings.Remove(documentId);
continue;
}
if (dto is null)
{
_logger.LogWarning("Cert-FR DTO payload deserialized as null for document {DocumentId}", documentId);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingMappings.Remove(documentId);
continue;
}
var mappedAt = _timeProvider.GetUtcNow();
var advisory = CertFrMapper.Map(dto, SourceName, mappedAt);
await _advisoryStore.UpsertAsync(advisory, cancellationToken).ConfigureAwait(false);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Mapped, cancellationToken).ConfigureAwait(false);
pendingMappings.Remove(documentId);
}
var updatedCursor = cursor.WithPendingMappings(pendingMappings);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
private async Task<CertFrCursor> GetCursorAsync(CancellationToken cancellationToken)
{
var record = await _stateRepository.TryGetAsync(SourceName, cancellationToken).ConfigureAwait(false);
return CertFrCursor.FromBson(record?.Cursor);
}
private async Task UpdateCursorAsync(CertFrCursor cursor, CancellationToken cancellationToken)
{
var completedAt = _timeProvider.GetUtcNow();
await _stateRepository.UpdateCursorAsync(SourceName, cursor.ToBsonDocument(), completedAt, cancellationToken).ConfigureAwait(false);
}
}


@@ -0,0 +1,21 @@
using System;
using Microsoft.Extensions.DependencyInjection;
using StellaOps.Plugin;
namespace StellaOps.Feedser.Source.CertFr;
public sealed class CertFrConnectorPlugin : IConnectorPlugin
{
public const string SourceName = "cert-fr";
public string Name => SourceName;
public bool IsAvailable(IServiceProvider services)
=> services.GetService<CertFrConnector>() is not null;
public IFeedConnector Create(IServiceProvider services)
{
ArgumentNullException.ThrowIfNull(services);
return services.GetRequiredService<CertFrConnector>();
}
}


@@ -0,0 +1,54 @@
using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using StellaOps.DependencyInjection;
using StellaOps.Feedser.Core.Jobs;
using StellaOps.Feedser.Source.CertFr.Configuration;
namespace StellaOps.Feedser.Source.CertFr;
public sealed class CertFrDependencyInjectionRoutine : IDependencyInjectionRoutine
{
private const string ConfigurationSection = "feedser:sources:cert-fr";
public IServiceCollection Register(IServiceCollection services, IConfiguration configuration)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(configuration);
services.AddCertFrConnector(options =>
{
configuration.GetSection(ConfigurationSection).Bind(options);
options.Validate();
});
services.AddTransient<CertFrFetchJob>();
services.AddTransient<CertFrParseJob>();
services.AddTransient<CertFrMapJob>();
services.PostConfigure<JobSchedulerOptions>(options =>
{
EnsureJob(options, CertFrJobKinds.Fetch, typeof(CertFrFetchJob));
EnsureJob(options, CertFrJobKinds.Parse, typeof(CertFrParseJob));
EnsureJob(options, CertFrJobKinds.Map, typeof(CertFrMapJob));
});
return services;
}
private static void EnsureJob(JobSchedulerOptions options, string kind, Type jobType)
{
if (options.Definitions.ContainsKey(kind))
{
return;
}
options.Definitions[kind] = new JobDefinition(
kind,
jobType,
options.DefaultTimeout,
options.DefaultLeaseDuration,
CronExpression: null,
Enabled: true);
}
}


@@ -0,0 +1,36 @@
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Options;
using StellaOps.Feedser.Source.CertFr.Configuration;
using StellaOps.Feedser.Source.CertFr.Internal;
using StellaOps.Feedser.Source.Common.Http;
namespace StellaOps.Feedser.Source.CertFr;
public static class CertFrServiceCollectionExtensions
{
public static IServiceCollection AddCertFrConnector(this IServiceCollection services, Action<CertFrOptions> configure)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(configure);
services.AddOptions<CertFrOptions>()
.Configure(configure)
.PostConfigure(static options => options.Validate());
services.AddSourceHttpClient(CertFrOptions.HttpClientName, static (sp, clientOptions) =>
{
var options = sp.GetRequiredService<IOptions<CertFrOptions>>().Value;
clientOptions.BaseAddress = options.FeedUri;
clientOptions.UserAgent = "StellaOps.Feedser.CertFr/1.0";
clientOptions.Timeout = TimeSpan.FromSeconds(20);
clientOptions.AllowedHosts.Clear();
clientOptions.AllowedHosts.Add(options.FeedUri.Host);
});
services.TryAddSingleton<CertFrFeedClient>();
services.AddTransient<CertFrConnector>();
return services;
}
}


@@ -0,0 +1,46 @@
using System;
namespace StellaOps.Feedser.Source.CertFr.Configuration;
public sealed class CertFrOptions
{
public const string HttpClientName = "cert-fr";
public Uri FeedUri { get; set; } = new("https://www.cert.ssi.gouv.fr/feed/alertes/");
public TimeSpan InitialBackfill { get; set; } = TimeSpan.FromDays(30);
public TimeSpan WindowOverlap { get; set; } = TimeSpan.FromDays(2);
public int MaxItemsPerFetch { get; set; } = 100;
public TimeSpan RequestDelay { get; set; } = TimeSpan.Zero;
public void Validate()
{
if (FeedUri is null || !FeedUri.IsAbsoluteUri)
{
throw new InvalidOperationException("Cert-FR FeedUri must be an absolute URI.");
}
if (InitialBackfill <= TimeSpan.Zero)
{
throw new InvalidOperationException("InitialBackfill must be a positive duration.");
}
if (WindowOverlap < TimeSpan.Zero)
{
throw new InvalidOperationException("WindowOverlap cannot be negative.");
}
if (MaxItemsPerFetch <= 0)
{
throw new InvalidOperationException("MaxItemsPerFetch must be positive.");
}
if (RequestDelay < TimeSpan.Zero)
{
throw new InvalidOperationException("RequestDelay cannot be negative.");
}
}
}


@@ -0,0 +1,88 @@
using System;
using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;
namespace StellaOps.Feedser.Source.CertFr.Internal;
internal sealed record CertFrCursor(
DateTimeOffset? LastPublished,
IReadOnlyCollection<Guid> PendingDocuments,
IReadOnlyCollection<Guid> PendingMappings)
{
public static CertFrCursor Empty { get; } = new(null, Array.Empty<Guid>(), Array.Empty<Guid>());
public BsonDocument ToBsonDocument()
{
var document = new BsonDocument
{
["pendingDocuments"] = new BsonArray(PendingDocuments.Select(id => id.ToString())),
["pendingMappings"] = new BsonArray(PendingMappings.Select(id => id.ToString())),
};
if (LastPublished.HasValue)
{
document["lastPublished"] = LastPublished.Value.UtcDateTime;
}
return document;
}
public static CertFrCursor FromBson(BsonDocument? document)
{
if (document is null || document.ElementCount == 0)
{
return Empty;
}
var lastPublished = document.TryGetValue("lastPublished", out var value)
? ParseDate(value)
: null;
return new CertFrCursor(
lastPublished,
ReadGuidArray(document, "pendingDocuments"),
ReadGuidArray(document, "pendingMappings"));
}
public CertFrCursor WithLastPublished(DateTimeOffset? timestamp)
=> this with { LastPublished = timestamp };
public CertFrCursor WithPendingDocuments(IEnumerable<Guid> ids)
=> this with { PendingDocuments = ids?.Distinct().ToArray() ?? Array.Empty<Guid>() };
public CertFrCursor WithPendingMappings(IEnumerable<Guid> ids)
=> this with { PendingMappings = ids?.Distinct().ToArray() ?? Array.Empty<Guid>() };
private static DateTimeOffset? ParseDate(BsonValue value)
=> value.BsonType switch
{
BsonType.DateTime => DateTime.SpecifyKind(value.ToUniversalTime(), DateTimeKind.Utc),
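// Parse with the invariant culture so stored cursor values do not depend on the host locale.
BsonType.String when DateTimeOffset.TryParse(value.AsString, System.Globalization.CultureInfo.InvariantCulture, System.Globalization.DateTimeStyles.AssumeUniversal, out var parsed) => parsed.ToUniversalTime(),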
_ => null,
};
private static IReadOnlyCollection<Guid> ReadGuidArray(BsonDocument document, string field)
{
if (!document.TryGetValue(field, out var raw) || raw is not BsonArray array)
{
return Array.Empty<Guid>();
}
var result = new List<Guid>(array.Count);
foreach (var element in array)
{
if (element is null)
{
continue;
}
if (Guid.TryParse(element.ToString(), out var guid))
{
result.Add(guid);
}
}
return result;
}
}


@@ -0,0 +1,77 @@
using System;
using System.Collections.Generic;
using StellaOps.Feedser.Storage.Mongo.Documents;
namespace StellaOps.Feedser.Source.CertFr.Internal;
internal sealed record CertFrDocumentMetadata(
string AdvisoryId,
string Title,
DateTimeOffset Published,
Uri DetailUri,
string? Summary)
{
private const string AdvisoryIdKey = "certfr.advisoryId";
private const string TitleKey = "certfr.title";
private const string PublishedKey = "certfr.published";
private const string SummaryKey = "certfr.summary";
public static CertFrDocumentMetadata FromDocument(DocumentRecord document)
{
ArgumentNullException.ThrowIfNull(document);
if (document.Metadata is null)
{
throw new InvalidOperationException("Cert-FR document metadata is missing.");
}
var metadata = document.Metadata;
if (!metadata.TryGetValue(AdvisoryIdKey, out var advisoryId) || string.IsNullOrWhiteSpace(advisoryId))
{
throw new InvalidOperationException("Cert-FR advisory id metadata missing.");
}
if (!metadata.TryGetValue(TitleKey, out var title) || string.IsNullOrWhiteSpace(title))
{
throw new InvalidOperationException("Cert-FR title metadata missing.");
}
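// The value is written with the round-trip "O" format, so parse it with the invariant culture.
if (!metadata.TryGetValue(PublishedKey, out var publishedRaw) || !DateTimeOffset.TryParse(publishedRaw, System.Globalization.CultureInfo.InvariantCulture, System.Globalization.DateTimeStyles.AssumeUniversal, out var published))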
{
throw new InvalidOperationException("Cert-FR published metadata invalid.");
}
if (!Uri.TryCreate(document.Uri, UriKind.Absolute, out var detailUri))
{
throw new InvalidOperationException("Cert-FR document URI invalid.");
}
metadata.TryGetValue(SummaryKey, out var summary);
return new CertFrDocumentMetadata(
advisoryId.Trim(),
title.Trim(),
published.ToUniversalTime(),
detailUri,
string.IsNullOrWhiteSpace(summary) ? null : summary.Trim());
}
public static IReadOnlyDictionary<string, string> CreateMetadata(CertFrFeedItem item)
{
ArgumentNullException.ThrowIfNull(item);
var metadata = new Dictionary<string, string>(StringComparer.Ordinal)
{
[AdvisoryIdKey] = item.AdvisoryId,
[TitleKey] = item.Title ?? item.AdvisoryId,
[PublishedKey] = item.Published.ToString("O"),
};
if (!string.IsNullOrWhiteSpace(item.Summary))
{
metadata[SummaryKey] = item.Summary!;
}
return metadata;
}
}


@@ -0,0 +1,14 @@
using System;
using System.Collections.Generic;
using System.Text.Json.Serialization;
namespace StellaOps.Feedser.Source.CertFr.Internal;
internal sealed record CertFrDto(
[property: JsonPropertyName("advisoryId")] string AdvisoryId,
[property: JsonPropertyName("title")] string Title,
[property: JsonPropertyName("detailUrl")] string DetailUrl,
[property: JsonPropertyName("published")] DateTimeOffset Published,
[property: JsonPropertyName("summary")] string? Summary,
[property: JsonPropertyName("content")] string Content,
[property: JsonPropertyName("references")] IReadOnlyList<string> References);


@@ -0,0 +1,109 @@
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using System.Xml.Linq;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using StellaOps.Feedser.Source.CertFr.Configuration;
namespace StellaOps.Feedser.Source.CertFr.Internal;
public sealed class CertFrFeedClient
{
private readonly IHttpClientFactory _httpClientFactory;
private readonly CertFrOptions _options;
private readonly ILogger<CertFrFeedClient> _logger;
public CertFrFeedClient(IHttpClientFactory httpClientFactory, IOptions<CertFrOptions> options, ILogger<CertFrFeedClient> logger)
{
_httpClientFactory = httpClientFactory ?? throw new ArgumentNullException(nameof(httpClientFactory));
_options = (options ?? throw new ArgumentNullException(nameof(options))).Value ?? throw new ArgumentNullException(nameof(options));
_options.Validate();
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public async Task<IReadOnlyList<CertFrFeedItem>> LoadAsync(DateTimeOffset windowStart, DateTimeOffset windowEnd, CancellationToken cancellationToken)
{
var client = _httpClientFactory.CreateClient(CertFrOptions.HttpClientName);
using var response = await client.GetAsync(_options.FeedUri, cancellationToken).ConfigureAwait(false);
response.EnsureSuccessStatusCode();
await using var stream = await response.Content.ReadAsStreamAsync(cancellationToken).ConfigureAwait(false);
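// Load asynchronously so the network stream is not read with blocking I/O and cancellation is honoured.
var document = await XDocument.LoadAsync(stream, LoadOptions.None, cancellationToken).ConfigureAwait(false);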
var items = new List<CertFrFeedItem>();
var now = DateTimeOffset.UtcNow;
foreach (var itemElement in document.Descendants("item"))
{
var link = itemElement.Element("link")?.Value;
if (string.IsNullOrWhiteSpace(link) || !Uri.TryCreate(link.Trim(), UriKind.Absolute, out var detailUri))
{
continue;
}
var title = itemElement.Element("title")?.Value?.Trim();
var summary = itemElement.Element("description")?.Value?.Trim();
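// Items without a parseable pubDate fall back to the current time so the window filter below does not drop them.
var published = ParsePublished(itemElement.Element("pubDate")?.Value) ?? now;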
if (published < windowStart)
{
continue;
}
if (published > windowEnd)
{
published = windowEnd;
}
var advisoryId = ResolveAdvisoryId(itemElement, detailUri);
items.Add(new CertFrFeedItem(advisoryId, detailUri, published.ToUniversalTime(), title, summary));
}
return items
.OrderBy(item => item.Published)
.Take(_options.MaxItemsPerFetch)
.ToArray();
}
private static DateTimeOffset? ParsePublished(string? value)
{
if (string.IsNullOrWhiteSpace(value))
{
return null;
}
if (DateTimeOffset.TryParse(value, CultureInfo.InvariantCulture, DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal, out var parsed))
{
return parsed;
}
return null;
}
private static string ResolveAdvisoryId(XElement itemElement, Uri detailUri)
{
var guid = itemElement.Element("guid")?.Value;
if (!string.IsNullOrWhiteSpace(guid))
{
return guid.Trim();
}
var segments = detailUri.Segments;
if (segments.Length > 0)
{
var slug = segments[^1].Trim('/');
if (!string.IsNullOrWhiteSpace(slug))
{
return slug;
}
}
return detailUri.AbsoluteUri;
}
}


@@ -0,0 +1,10 @@
using System;
namespace StellaOps.Feedser.Source.CertFr.Internal;
public sealed record CertFrFeedItem(
string AdvisoryId,
Uri DetailUri,
DateTimeOffset Published,
string? Title,
string? Summary);


@@ -0,0 +1,116 @@
using System;
using System.Collections.Generic;
using System.Linq;
using StellaOps.Feedser.Models;
namespace StellaOps.Feedser.Source.CertFr.Internal;
internal static class CertFrMapper
{
public static Advisory Map(CertFrDto dto, string sourceName, DateTimeOffset recordedAt)
{
ArgumentNullException.ThrowIfNull(dto);
ArgumentException.ThrowIfNullOrEmpty(sourceName);
var advisoryKey = $"cert-fr/{dto.AdvisoryId}";
var provenance = new AdvisoryProvenance(sourceName, "document", dto.DetailUrl, recordedAt.ToUniversalTime());
var aliases = new List<string>
{
$"CERT-FR:{dto.AdvisoryId}",
};
var references = BuildReferences(dto, provenance).ToArray();
var affectedPackages = BuildAffectedPackages(dto, provenance).ToArray();
return new Advisory(
advisoryKey,
dto.Title,
dto.Summary ?? dto.Title,
language: "fr",
published: dto.Published.ToUniversalTime(),
modified: null,
severity: null,
exploitKnown: false,
aliases: aliases,
references: references,
affectedPackages: affectedPackages,
cvssMetrics: Array.Empty<CvssMetric>(),
provenance: new[] { provenance });
}
private static IEnumerable<AdvisoryReference> BuildReferences(CertFrDto dto, AdvisoryProvenance provenance)
{
var comparer = StringComparer.OrdinalIgnoreCase;
var entries = new List<(AdvisoryReference Reference, int Priority)>
{
(new AdvisoryReference(dto.DetailUrl, "advisory", "cert-fr", dto.Summary, provenance), 0),
};
foreach (var url in dto.References)
{
entries.Add((new AdvisoryReference(url, "reference", null, null, provenance), 1));
}
return entries
.GroupBy(tuple => tuple.Reference.Url, comparer)
.Select(group => group
.OrderBy(t => t.Priority)
.ThenBy(t => t.Reference.Kind ?? string.Empty, comparer)
.ThenBy(t => t.Reference.Url, comparer)
.First())
.OrderBy(t => t.Priority)
.ThenBy(t => t.Reference.Kind ?? string.Empty, comparer)
.ThenBy(t => t.Reference.Url, comparer)
.Select(t => t.Reference);
}
private static IEnumerable<AffectedPackage> BuildAffectedPackages(CertFrDto dto, AdvisoryProvenance provenance)
{
var extensions = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
if (!string.IsNullOrWhiteSpace(dto.Summary))
{
extensions["certfr.summary"] = dto.Summary.Trim();
}
if (!string.IsNullOrWhiteSpace(dto.Content))
{
var trimmed = dto.Content.Length > 1024 ? dto.Content[..1024].Trim() : dto.Content.Trim();
if (trimmed.Length > 0)
{
extensions["certfr.content"] = trimmed;
}
}
if (dto.References.Count > 0)
{
extensions["certfr.reference.count"] = dto.References.Count.ToString(System.Globalization.CultureInfo.InvariantCulture);
}
if (extensions.Count == 0)
{
return Array.Empty<AffectedPackage>();
}
var range = new AffectedVersionRange(
rangeKind: "vendor",
introducedVersion: null,
fixedVersion: null,
lastAffectedVersion: null,
rangeExpression: null,
provenance: provenance,
primitives: new RangePrimitives(null, null, null, extensions));
return new[]
{
new AffectedPackage(
AffectedPackageTypes.Vendor,
identifier: dto.AdvisoryId,
platform: null,
versionRanges: new[] { range },
statuses: Array.Empty<AffectedPackageStatus>(),
provenance: new[] { provenance })
};
}
}


@@ -0,0 +1,80 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
namespace StellaOps.Feedser.Source.CertFr.Internal;
internal static class CertFrParser
{
private static readonly Regex AnchorRegex = new("<a[^>]+href=\"(?<url>https?://[^\"]+)\"", RegexOptions.IgnoreCase | RegexOptions.Compiled);
private static readonly Regex ScriptRegex = new("<script[\\s\\S]*?</script>", RegexOptions.IgnoreCase | RegexOptions.Compiled);
private static readonly Regex StyleRegex = new("<style[\\s\\S]*?</style>", RegexOptions.IgnoreCase | RegexOptions.Compiled);
private static readonly Regex TagRegex = new("<[^>]+>", RegexOptions.Compiled);
private static readonly Regex WhitespaceRegex = new("\\s+", RegexOptions.Compiled);
public static CertFrDto Parse(string html, CertFrDocumentMetadata metadata)
{
ArgumentException.ThrowIfNullOrEmpty(html);
ArgumentNullException.ThrowIfNull(metadata);
var sanitized = SanitizeHtml(html);
var summary = BuildSummary(metadata.Summary, sanitized);
var references = ExtractReferences(html);
return new CertFrDto(
metadata.AdvisoryId,
metadata.Title,
metadata.DetailUri.ToString(),
metadata.Published,
summary,
sanitized,
references);
}
private static string SanitizeHtml(string html)
{
var withoutScripts = ScriptRegex.Replace(html, string.Empty);
var withoutStyles = StyleRegex.Replace(withoutScripts, string.Empty);
var withoutTags = TagRegex.Replace(withoutStyles, " ");
var decoded = System.Net.WebUtility.HtmlDecode(withoutTags) ?? string.Empty;
return WhitespaceRegex.Replace(decoded, " ").Trim();
}
private static string? BuildSummary(string? metadataSummary, string content)
{
if (!string.IsNullOrWhiteSpace(metadataSummary))
{
return metadataSummary.Trim();
}
if (string.IsNullOrWhiteSpace(content))
{
return null;
}
var sentences = content.Split(new[] { '.','!','?' }, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
if (sentences.Length > 0)
{
return sentences[0].Trim();
}
return content.Length > 280 ? content[..280].Trim() : content;
}
private static IReadOnlyList<string> ExtractReferences(string html)
{
var references = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
foreach (Match match in AnchorRegex.Matches(html))
{
if (match.Success)
{
references.Add(match.Groups["url"].Value.Trim());
}
}
return references.Count == 0
? Array.Empty<string>()
: references.OrderBy(url => url, StringComparer.OrdinalIgnoreCase).ToArray();
}
}


@@ -0,0 +1,46 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using StellaOps.Feedser.Core.Jobs;
namespace StellaOps.Feedser.Source.CertFr;
internal static class CertFrJobKinds
{
public const string Fetch = "source:cert-fr:fetch";
public const string Parse = "source:cert-fr:parse";
public const string Map = "source:cert-fr:map";
}
internal sealed class CertFrFetchJob : IJob
{
private readonly CertFrConnector _connector;
public CertFrFetchJob(CertFrConnector connector)
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
=> _connector.FetchAsync(context.Services, cancellationToken);
}
internal sealed class CertFrParseJob : IJob
{
private readonly CertFrConnector _connector;
public CertFrParseJob(CertFrConnector connector)
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
=> _connector.ParseAsync(context.Services, cancellationToken);
}
internal sealed class CertFrMapJob : IJob
{
private readonly CertFrConnector _connector;
public CertFrMapJob(CertFrConnector connector)
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
=> _connector.MapAsync(context.Services, cancellationToken);
}


@@ -0,0 +1,13 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<ProjectReference Include="../StellaOps.Plugin/StellaOps.Plugin.csproj" />
<ProjectReference Include="..\StellaOps.Feedser.Source.Common\StellaOps.Feedser.Source.Common.csproj" />
<ProjectReference Include="..\StellaOps.Feedser.Storage.Mongo\StellaOps.Feedser.Storage.Mongo.csproj" />
<ProjectReference Include="..\StellaOps.Feedser.Models\StellaOps.Feedser.Models.csproj" />
</ItemGroup>
</Project>


@@ -0,0 +1,11 @@
# TASKS
| Task | Owner(s) | Depends on | Notes |
|---|---|---|---|
|RSS/list fetcher with sliding window|BE-Conn-CertFr|Source.Common|**DONE** RSS/list ingestion implemented with sliding date cursor.|
|Detail page fetch and sanitizer|BE-Conn-CertFr|Source.Common|**DONE** HTML sanitizer trims boilerplate prior to DTO mapping.|
|Extractor and schema validation of DTO|BE-Conn-CertFr, QA|Source.Common|**DONE** DTO parsing validates structure before persistence.|
|Canonical mapping (aliases, refs, severity text)|BE-Conn-CertFr|Models|**DONE** mapper emits enrichment references with severity text.|
|Watermark plus dedupe by sha256|BE-Conn-CertFr|Storage.Mongo|**DONE** SHA comparisons skip unchanged docs; covered by duplicate/not-modified connector tests.|
|Golden fixtures and determinism tests|QA|Source.CertFr|**DONE** snapshot fixtures added in `CertFrConnectorTests` to enforce deterministic output.|
|Mark failure/backoff on fetch errors|BE-Conn-CertFr|Storage.Mongo|**DONE** fetch path now marks failures/backoff and tests assert state repository updates.|
|Conditional fetch caching|BE-Conn-CertFr|Source.Common|**DONE** ETag/Last-Modified support wired via `SourceFetchService` and verified in not-modified test.|
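
The SHA-256 dedupe row can be illustrated with a short sketch. It assumes the stored digest is a hex string, which matches the case-insensitive comparison in the connector; `ContentHash` is a hypothetical helper, and how the fetch service actually computes its digest is not shown in this commit.

```csharp
using System;
using System.Security.Cryptography;

public static class ContentHash
{
    // Lowercase hex SHA-256 digest of the raw payload bytes.
    public static string Sha256Hex(byte[] payload)
        => Convert.ToHexString(SHA256.HashData(payload)).ToLowerInvariant();

    // True when the stored digest matches the payload, ignoring hex casing.
    public static bool IsUnchanged(string? storedSha256, byte[] payload)
        => storedSha256 is not null
           && string.Equals(storedSha256, Sha256Hex(payload), StringComparison.OrdinalIgnoreCase);
}
```

A document whose digest is unchanged and which has already been mapped can then be skipped without re-parsing.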