Rename Concelier Source modules to Connector
Some checks failed
Build Test Deploy / build-test (push) Has been cancelled
Build Test Deploy / authority-container (push) Has been cancelled
Build Test Deploy / docs (push) Has been cancelled
Build Test Deploy / deploy (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
40
src/StellaOps.Concelier.Connector.Acsc/AGENTS.md
Normal file
@@ -0,0 +1,40 @@
# AGENTS
## Role
Bootstrap the ACSC (Australian Cyber Security Centre) advisories connector so the Concelier pipeline can ingest, normalise, and enrich ACSC security bulletins.

## Scope
- Research the authoritative ACSC advisory feed (RSS/Atom, JSON API, or HTML).
- Implement fetch windowing, cursor persistence, and retry strategy consistent with other external connectors.
- Parse advisory content (summary, affected products, mitigation guidance, references).
- Map advisories into canonical `Advisory` records with aliases, references, affected packages, and provenance metadata.
- Provide deterministic fixtures and regression tests that cover fetch/parse/map flows.

## Participants
- `Source.Common` for HTTP client creation, fetch service, and DTO persistence helpers.
- `Storage.Mongo` for raw/document/DTO/advisory storage plus cursor management.
- `Concelier.Models` for canonical advisory structures and provenance utilities.
- `Concelier.Testing` for integration harnesses and snapshot helpers.

## Interfaces & Contracts
- Job kinds should follow the pattern `acsc:fetch`, `acsc:parse`, `acsc:map`.
- Documents persisted to Mongo must include ETag/Last-Modified metadata when the source exposes it.
- Canonical advisories must emit aliases (ACSC ID + CVE IDs) and references (official bulletin + vendor notices).

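
The ETag/Last-Modified contract above matters because stored validators let a later fetch be conditional. The following is a minimal sketch, not the connector's implementation: `ConditionalRequest.Build` is a hypothetical helper introduced only for illustration (the connector itself hands `ETag` and `LastModified` to `SourceFetchService`), and it shows how such validators could be replayed with plain `System.Net.Http`.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper for illustration only; the real connector delegates
// conditional requests to SourceFetchService rather than building them itself.
public static class ConditionalRequest
{
    public static HttpRequestMessage Build(Uri uri, string? etag, DateTimeOffset? lastModified)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, uri);

        // EntityTagHeaderValue expects the quoted wire form, e.g. "\"abc\"" or "W/\"abc\"".
        // Values that do not parse are skipped instead of failing the request.
        if (!string.IsNullOrWhiteSpace(etag) && EntityTagHeaderValue.TryParse(etag, out var tag))
        {
            request.Headers.IfNoneMatch.Add(tag);
        }

        // Replay the stored Last-Modified value as If-Modified-Since.
        if (lastModified.HasValue)
        {
            request.Headers.IfModifiedSince = lastModified.Value;
        }

        return request;
    }
}
```

A server that honours these headers can answer `304 Not Modified`, which the fetch job can treat as "nothing new" without downloading the feed again.
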
## In/Out of scope
In scope:
- Initial end-to-end connector implementation with tests, fixtures, and range primitive coverage.
- Minimal telemetry (logging + diagnostics counters) consistent with other connectors.

Out of scope:
- Upstream remediation automation or vendor-specific enrichment beyond ACSC data.
- Export-related changes (handled by exporter teams).

## Observability & Security Expectations
- Log key lifecycle events (fetch/page processed, parse success/error counts, mapping stats).
- Sanitise HTML safely and avoid persisting external scripts or embedded media.
- Handle transient fetch failures gracefully with exponential backoff and mark failures in source state.

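
The exponential backoff expectation above can be illustrated with a small delay calculator. This is a sketch under stated assumptions: the `Backoff` class, its parameters, and the doubling-with-cap policy are hypothetical and are not taken from the connector, which passes a configured `FailureBackoff` value to the source state repository.

```csharp
using System;

// Hypothetical policy for illustration: delay doubles per attempt and is capped.
public static class Backoff
{
    // attempt is 1-based: attempt 1 waits baseDelay, attempt 2 waits 2x, attempt 3 waits 4x, and so on.
    public static TimeSpan Compute(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        if (attempt < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(attempt));
        }

        // Clamp the exponent so the multiplier stays within a sane range.
        var exponent = Math.Min(attempt - 1, 30);
        var ticks = baseDelay.Ticks * Math.Pow(2, exponent);

        // Never wait longer than the configured ceiling.
        return ticks >= maxDelay.Ticks ? maxDelay : TimeSpan.FromTicks((long)ticks);
    }
}
```

With a one-second base and a one-minute cap, attempts 1, 2, and 3 wait 1 s, 2 s, and 4 s, and any attempt from the seventh onward waits the full minute.
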
## Tests
- Add integration-style tests under `StellaOps.Concelier.Connector.Acsc.Tests` covering fetch/parse/map with canned fixtures.
- Snapshot canonical advisories; provide UPDATE flag flow for regeneration.
- Validate determinism (ordering, casing, timestamps) to satisfy pipeline reproducibility requirements.
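
The determinism requirement in the Tests section (ordering, casing) can be made concrete with a small normaliser. This is only a sketch: `AliasNormalizer` is a hypothetical name, and upper-casing every alias is an assumption made for the example; the real mapper may preserve the source casing of ACSC identifiers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration: produces the same alias list
// regardless of input order, casing, padding, or duplicates.
public static class AliasNormalizer
{
    public static IReadOnlyList<string> Normalize(IEnumerable<string?> aliases)
        => aliases
            .Where(alias => !string.IsNullOrWhiteSpace(alias))
            .Select(alias => alias!.Trim().ToUpperInvariant())   // culture-independent casing
            .Distinct(StringComparer.Ordinal)                    // drop duplicates after normalising
            .OrderBy(alias => alias, StringComparer.Ordinal)     // stable, culture-independent order
            .ToArray();
}
```

Snapshot tests become much less brittle when every collection in the canonical advisory passes through a step like this before serialisation, because two runs over the same fixture then emit identical output.
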
699
src/StellaOps.Concelier.Connector.Acsc/AcscConnector.cs
Normal file
@@ -0,0 +1,699 @@
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Xml.Linq;
using System.Text.Json;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using MongoDB.Bson;
using MongoDB.Bson.IO;
using StellaOps.Concelier.Connector.Acsc.Configuration;
using StellaOps.Concelier.Connector.Acsc.Internal;
using StellaOps.Concelier.Connector.Common.Fetch;
using StellaOps.Concelier.Connector.Common.Html;
using StellaOps.Concelier.Connector.Common;
using StellaOps.Concelier.Storage.Mongo;
using StellaOps.Concelier.Storage.Mongo.Documents;
using StellaOps.Concelier.Storage.Mongo.Dtos;
using StellaOps.Concelier.Storage.Mongo.Advisories;
using StellaOps.Plugin;

namespace StellaOps.Concelier.Connector.Acsc;

public sealed class AcscConnector : IFeedConnector
{
    private static readonly string[] AcceptHeaders =
    {
        "application/rss+xml",
        "application/atom+xml;q=0.9",
        "application/xml;q=0.8",
        "text/xml;q=0.7",
    };

    private static readonly JsonSerializerOptions SerializerOptions = new(JsonSerializerDefaults.Web)
    {
        PropertyNameCaseInsensitive = true,
        WriteIndented = false,
    };

    private readonly SourceFetchService _fetchService;
    private readonly RawDocumentStorage _rawDocumentStorage;
    private readonly IDocumentStore _documentStore;
    private readonly IDtoStore _dtoStore;
    private readonly IAdvisoryStore _advisoryStore;
    private readonly ISourceStateRepository _stateRepository;
    private readonly IHttpClientFactory _httpClientFactory;
    private readonly AcscOptions _options;
    private readonly AcscDiagnostics _diagnostics;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<AcscConnector> _logger;
    private readonly HtmlContentSanitizer _htmlSanitizer = new();

    public AcscConnector(
        SourceFetchService fetchService,
        RawDocumentStorage rawDocumentStorage,
        IDocumentStore documentStore,
        IDtoStore dtoStore,
        IAdvisoryStore advisoryStore,
        ISourceStateRepository stateRepository,
        IHttpClientFactory httpClientFactory,
        IOptions<AcscOptions> options,
        AcscDiagnostics diagnostics,
        TimeProvider? timeProvider,
        ILogger<AcscConnector> logger)
    {
        _fetchService = fetchService ?? throw new ArgumentNullException(nameof(fetchService));
        _rawDocumentStorage = rawDocumentStorage ?? throw new ArgumentNullException(nameof(rawDocumentStorage));
        _documentStore = documentStore ?? throw new ArgumentNullException(nameof(documentStore));
        _dtoStore = dtoStore ?? throw new ArgumentNullException(nameof(dtoStore));
        _advisoryStore = advisoryStore ?? throw new ArgumentNullException(nameof(advisoryStore));
        _stateRepository = stateRepository ?? throw new ArgumentNullException(nameof(stateRepository));
        _httpClientFactory = httpClientFactory ?? throw new ArgumentNullException(nameof(httpClientFactory));
        _options = (options ?? throw new ArgumentNullException(nameof(options))).Value ?? throw new ArgumentNullException(nameof(options));
        _options.Validate();
        _diagnostics = diagnostics ?? throw new ArgumentNullException(nameof(diagnostics));
        _timeProvider = timeProvider ?? TimeProvider.System;
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    public string SourceName => AcscConnectorPlugin.SourceName;

    public async Task FetchAsync(IServiceProvider services, CancellationToken cancellationToken)
    {
        ArgumentNullException.ThrowIfNull(services);

        var now = _timeProvider.GetUtcNow();
        var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);

        var lastPublished = new Dictionary<string, DateTimeOffset?>(cursor.LastPublishedByFeed, StringComparer.OrdinalIgnoreCase);
        var pendingDocuments = cursor.PendingDocuments.ToHashSet();
        var pendingMappings = cursor.PendingMappings.ToHashSet();
        var failures = new List<(AcscFeedOptions Feed, Exception Error)>();

        var preferredEndpoint = ResolveInitialPreference(cursor);
        AcscEndpointPreference? successPreference = null;

        foreach (var feed in GetEnabledFeeds())
        {
            cancellationToken.ThrowIfCancellationRequested();

            Exception? lastError = null;
            bool handled = false;

            foreach (var mode in BuildFetchOrder(preferredEndpoint))
            {
                cancellationToken.ThrowIfCancellationRequested();
                if (mode == AcscFetchMode.Relay && !IsRelayConfigured)
                {
                    continue;
                }

                var modeName = ModeName(mode);
                var targetUri = BuildFeedUri(feed, mode);

                var metadata = CreateMetadata(feed, cursor, modeName);
                var existing = await _documentStore.FindBySourceAndUriAsync(SourceName, targetUri.ToString(), cancellationToken).ConfigureAwait(false);

                var request = new SourceFetchRequest(AcscOptions.HttpClientName, SourceName, targetUri)
                {
                    Metadata = metadata,
                    ETag = existing?.Etag,
                    LastModified = existing?.LastModified,
                    AcceptHeaders = AcceptHeaders,
                    TimeoutOverride = _options.RequestTimeout,
                };

                try
                {
                    _diagnostics.FetchAttempt(feed.Slug, modeName);
                    var result = await _fetchService.FetchAsync(request, cancellationToken).ConfigureAwait(false);

                    if (result.IsNotModified)
                    {
                        _diagnostics.FetchUnchanged(feed.Slug, modeName);
                        successPreference ??= mode switch
                        {
                            AcscFetchMode.Relay => AcscEndpointPreference.Relay,
                            _ => AcscEndpointPreference.Direct,
                        };
                        handled = true;
                        _logger.LogDebug("ACSC feed {Feed} returned 304 via {Mode}", feed.Slug, modeName);
                        break;
                    }

                    if (!result.IsSuccess || result.Document is null)
                    {
                        _diagnostics.FetchFailure(feed.Slug, modeName);
                        lastError = new InvalidOperationException($"Fetch returned no document for {targetUri}");
                        continue;
                    }

                    pendingDocuments.Add(result.Document.Id);
                    successPreference = mode switch
                    {
                        AcscFetchMode.Relay => AcscEndpointPreference.Relay,
                        _ => AcscEndpointPreference.Direct,
                    };
                    handled = true;
                    _diagnostics.FetchSuccess(feed.Slug, modeName);
                    _logger.LogInformation("ACSC fetched {Feed} via {Mode} (documentId={DocumentId})", feed.Slug, modeName, result.Document.Id);

                    var latestPublished = await TryComputeLatestPublishedAsync(result.Document, cancellationToken).ConfigureAwait(false);
                    if (latestPublished.HasValue)
                    {
                        if (!lastPublished.TryGetValue(feed.Slug, out var existingPublished) || latestPublished.Value > existingPublished)
                        {
                            lastPublished[feed.Slug] = latestPublished.Value;
                            _diagnostics.CursorUpdated(feed.Slug);
                            _logger.LogDebug("ACSC feed {Feed} advanced published cursor to {Timestamp:O}", feed.Slug, latestPublished.Value);
                        }
                    }

                    break;
                }
                catch (HttpRequestException ex) when (ShouldRetryWithRelay(mode))
                {
                    lastError = ex;
                    _diagnostics.FetchFallback(feed.Slug, modeName, "http-request");
                    _logger.LogWarning(ex, "ACSC fetch via {Mode} failed for {Feed}; attempting relay fallback.", modeName, feed.Slug);
                    continue;
                }
                catch (TaskCanceledException ex) when (ShouldRetryWithRelay(mode))
                {
                    lastError = ex;
                    _diagnostics.FetchFallback(feed.Slug, modeName, "timeout");
                    _logger.LogWarning(ex, "ACSC fetch via {Mode} timed out for {Feed}; attempting relay fallback.", modeName, feed.Slug);
                    continue;
                }
                catch (Exception ex)
                {
                    lastError = ex;
                    _diagnostics.FetchFailure(feed.Slug, modeName);
                    _logger.LogError(ex, "ACSC fetch failed for {Feed} via {Mode}", feed.Slug, modeName);
                    break;
                }
            }

            if (!handled && lastError is not null)
            {
                failures.Add((feed, lastError));
            }
        }

        if (failures.Count > 0)
        {
            var failureReason = string.Join("; ", failures.Select(f => $"{f.Feed.Slug}: {f.Error.Message}"));
            await _stateRepository.MarkFailureAsync(SourceName, now, _options.FailureBackoff, failureReason, cancellationToken).ConfigureAwait(false);
            throw new AggregateException($"ACSC fetch failed for {failures.Count} feed(s): {failureReason}", failures.Select(f => f.Error));
        }

        var updatedPreference = successPreference ?? preferredEndpoint;
        if (_options.ForceRelay)
        {
            updatedPreference = AcscEndpointPreference.Relay;
        }
        else if (!IsRelayConfigured)
        {
            updatedPreference = AcscEndpointPreference.Direct;
        }

        var updatedCursor = cursor
            .WithPreferredEndpoint(updatedPreference)
            .WithPendingDocuments(pendingDocuments)
            .WithPendingMappings(pendingMappings)
            .WithLastPublished(lastPublished);

        await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
    }

    public async Task ParseAsync(IServiceProvider services, CancellationToken cancellationToken)
    {
        ArgumentNullException.ThrowIfNull(services);

        var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
        if (cursor.PendingDocuments.Count == 0)
        {
            return;
        }

        var pendingDocuments = cursor.PendingDocuments.ToList();
        var pendingMappings = cursor.PendingMappings.ToHashSet();

        foreach (var documentId in cursor.PendingDocuments)
        {
            cancellationToken.ThrowIfCancellationRequested();

            var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);
            if (document is null)
            {
                pendingDocuments.Remove(documentId);
                pendingMappings.Remove(documentId);
                continue;
            }

            var metadata = AcscDocumentMetadata.FromDocument(document);
            var feedTag = string.IsNullOrWhiteSpace(metadata.FeedSlug) ? "(unknown)" : metadata.FeedSlug;

            _diagnostics.ParseAttempt(feedTag);

            if (!document.GridFsId.HasValue)
            {
                _diagnostics.ParseFailure(feedTag, "missingPayload");
                _logger.LogWarning("ACSC document {DocumentId} missing GridFS payload (feed={Feed})", document.Id, feedTag);
                await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
                pendingDocuments.Remove(documentId);
                pendingMappings.Remove(documentId);
                continue;
            }

            byte[] rawBytes;
            try
            {
                rawBytes = await _rawDocumentStorage.DownloadAsync(document.GridFsId.Value, cancellationToken).ConfigureAwait(false);
            }
            catch (Exception ex)
            {
                _diagnostics.ParseFailure(feedTag, "download");
                _logger.LogError(ex, "ACSC failed to download payload for document {DocumentId} (feed={Feed})", document.Id, feedTag);
                await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
                pendingDocuments.Remove(documentId);
                pendingMappings.Remove(documentId);
                continue;
            }

            try
            {
                var parsedAt = _timeProvider.GetUtcNow();
                var dto = AcscFeedParser.Parse(rawBytes, metadata.FeedSlug, parsedAt, _htmlSanitizer);

                var json = JsonSerializer.Serialize(dto, SerializerOptions);
                var payload = BsonDocument.Parse(json);

                var existingDto = await _dtoStore.FindByDocumentIdAsync(document.Id, cancellationToken).ConfigureAwait(false);
                var dtoRecord = existingDto is null
                    ? new DtoRecord(Guid.NewGuid(), document.Id, SourceName, "acsc.feed.v1", payload, parsedAt)
                    : existingDto with
                    {
                        Payload = payload,
                        SchemaVersion = "acsc.feed.v1",
                        ValidatedAt = parsedAt,
                    };

                await _dtoStore.UpsertAsync(dtoRecord, cancellationToken).ConfigureAwait(false);
                await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.PendingMap, cancellationToken).ConfigureAwait(false);

                pendingDocuments.Remove(documentId);
                pendingMappings.Add(document.Id);

                _diagnostics.ParseSuccess(feedTag);
                _logger.LogInformation("ACSC parsed document {DocumentId} (feed={Feed}, entries={EntryCount})", document.Id, feedTag, dto.Entries.Count);
            }
            catch (Exception ex)
            {
                _diagnostics.ParseFailure(feedTag, "parse");
                _logger.LogError(ex, "ACSC parse failed for document {DocumentId} (feed={Feed})", document.Id, feedTag);
                await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
                pendingDocuments.Remove(documentId);
                pendingMappings.Remove(documentId);
            }
        }

        var updatedCursor = cursor
            .WithPendingDocuments(pendingDocuments)
            .WithPendingMappings(pendingMappings);

        await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
    }

    public async Task MapAsync(IServiceProvider services, CancellationToken cancellationToken)
    {
        ArgumentNullException.ThrowIfNull(services);

        var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
        if (cursor.PendingMappings.Count == 0)
        {
            return;
        }

        var pendingMappings = cursor.PendingMappings.ToHashSet();
        var documentIds = cursor.PendingMappings.ToList();

        foreach (var documentId in documentIds)
        {
            cancellationToken.ThrowIfCancellationRequested();

            var dtoRecord = await _dtoStore.FindByDocumentIdAsync(documentId, cancellationToken).ConfigureAwait(false);
            var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);

            if (dtoRecord is null || document is null)
            {
                pendingMappings.Remove(documentId);
                continue;
            }

            AcscFeedDto? feed;
            try
            {
                var dtoJson = dtoRecord.Payload.ToJson(new JsonWriterSettings
                {
                    OutputMode = JsonOutputMode.RelaxedExtendedJson,
                });

                feed = JsonSerializer.Deserialize<AcscFeedDto>(dtoJson, SerializerOptions);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "ACSC mapping failed to deserialize DTO for document {DocumentId}", document.Id);
                await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
                pendingMappings.Remove(documentId);
                continue;
            }

            if (feed is null)
            {
                _logger.LogWarning("ACSC mapping encountered null DTO payload for document {DocumentId}", document.Id);
                await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
                pendingMappings.Remove(documentId);
                continue;
            }

            var mappedAt = _timeProvider.GetUtcNow();
            var advisories = AcscMapper.Map(feed, document, dtoRecord, SourceName, mappedAt);

            if (advisories.Count > 0)
            {
                foreach (var advisory in advisories)
                {
                    await _advisoryStore.UpsertAsync(advisory, cancellationToken).ConfigureAwait(false);
                }

                _diagnostics.MapSuccess(advisories.Count);
                _logger.LogInformation(
                    "ACSC mapped {Count} advisories from document {DocumentId} (feed={Feed})",
                    advisories.Count,
                    document.Id,
                    feed.FeedSlug ?? "(unknown)");
            }
            else
            {
                _logger.LogInformation(
                    "ACSC mapping produced no advisories for document {DocumentId} (feed={Feed})",
                    document.Id,
                    feed.FeedSlug ?? "(unknown)");
            }

            await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Mapped, cancellationToken).ConfigureAwait(false);
            pendingMappings.Remove(documentId);
        }

        var updatedCursor = cursor.WithPendingMappings(pendingMappings);
        await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
    }

    public async Task ProbeAsync(CancellationToken cancellationToken)
    {
        var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);

        if (_options.ForceRelay)
        {
            if (cursor.PreferredEndpoint != AcscEndpointPreference.Relay)
            {
                await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Relay), cancellationToken).ConfigureAwait(false);
            }
            return;
        }

        if (!IsRelayConfigured)
        {
            if (cursor.PreferredEndpoint != AcscEndpointPreference.Direct)
            {
                await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Direct), cancellationToken).ConfigureAwait(false);
            }
            return;
        }

        var feed = GetEnabledFeeds().FirstOrDefault();
        if (feed is null)
        {
            return;
        }

        var httpClient = _httpClientFactory.CreateClient(AcscOptions.HttpClientName);
        httpClient.Timeout = TimeSpan.FromSeconds(15);

        var directUri = BuildFeedUri(feed, AcscFetchMode.Direct);

        try
        {
            using var headRequest = new HttpRequestMessage(HttpMethod.Head, directUri);
            using var response = await httpClient.SendAsync(headRequest, HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false);
            if (response.IsSuccessStatusCode)
            {
                if (cursor.PreferredEndpoint != AcscEndpointPreference.Direct)
                {
                    await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Direct), cancellationToken).ConfigureAwait(false);
                    _logger.LogInformation("ACSC probe succeeded via direct endpoint ({StatusCode}); relay preference cleared.", (int)response.StatusCode);
                }
                return;
            }

            if (response.StatusCode == HttpStatusCode.MethodNotAllowed)
            {
                using var probeRequest = new HttpRequestMessage(HttpMethod.Get, directUri);
                using var probeResponse = await httpClient.SendAsync(probeRequest, HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false);
                if (probeResponse.IsSuccessStatusCode)
                {
                    if (cursor.PreferredEndpoint != AcscEndpointPreference.Direct)
                    {
                        await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Direct), cancellationToken).ConfigureAwait(false);
                        _logger.LogInformation("ACSC probe succeeded via direct endpoint after GET fallback ({StatusCode}).", (int)probeResponse.StatusCode);
                    }
                    return;
                }
            }

            _logger.LogWarning("ACSC direct probe returned HTTP {StatusCode}; relay preference enabled.", (int)response.StatusCode);
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "ACSC direct probe failed; relay preference will be enabled.");
        }

        if (cursor.PreferredEndpoint != AcscEndpointPreference.Relay)
        {
            await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Relay), cancellationToken).ConfigureAwait(false);
        }
    }

    private bool ShouldRetryWithRelay(AcscFetchMode mode)
        => mode == AcscFetchMode.Direct && _options.EnableRelayFallback && IsRelayConfigured && !_options.ForceRelay;

    private IEnumerable<AcscFetchMode> BuildFetchOrder(AcscEndpointPreference preference)
    {
        if (_options.ForceRelay)
        {
            if (IsRelayConfigured)
            {
                yield return AcscFetchMode.Relay;
            }
            yield break;
        }

        if (!IsRelayConfigured)
        {
            yield return AcscFetchMode.Direct;
            yield break;
        }

        var preferRelay = preference == AcscEndpointPreference.Relay;
        if (preference == AcscEndpointPreference.Auto)
        {
            preferRelay = _options.PreferRelayByDefault;
        }

        if (preferRelay)
        {
            yield return AcscFetchMode.Relay;
            if (_options.EnableRelayFallback)
            {
                yield return AcscFetchMode.Direct;
            }
        }
        else
        {
            yield return AcscFetchMode.Direct;
            if (_options.EnableRelayFallback)
            {
                yield return AcscFetchMode.Relay;
            }
        }
    }

    private AcscEndpointPreference ResolveInitialPreference(AcscCursor cursor)
    {
        if (_options.ForceRelay)
        {
            return AcscEndpointPreference.Relay;
        }

        if (!IsRelayConfigured)
        {
            return AcscEndpointPreference.Direct;
        }

        if (cursor.PreferredEndpoint != AcscEndpointPreference.Auto)
        {
            return cursor.PreferredEndpoint;
        }

        return _options.PreferRelayByDefault ? AcscEndpointPreference.Relay : AcscEndpointPreference.Direct;
    }

    private async Task<DateTimeOffset?> TryComputeLatestPublishedAsync(DocumentRecord document, CancellationToken cancellationToken)
    {
        if (!document.GridFsId.HasValue)
        {
            return null;
        }

        var rawBytes = await _rawDocumentStorage.DownloadAsync(document.GridFsId.Value, cancellationToken).ConfigureAwait(false);
        if (rawBytes.Length == 0)
        {
            return null;
        }

        try
        {
            using var memoryStream = new MemoryStream(rawBytes, writable: false);
            var xml = XDocument.Load(memoryStream, LoadOptions.None);

            DateTimeOffset? latest = null;
            foreach (var element in xml.Descendants())
            {
                if (!IsEntryElement(element.Name.LocalName))
                {
                    continue;
                }

                var published = ExtractPublished(element);
                if (!published.HasValue)
                {
                    continue;
                }

                if (latest is null || published.Value > latest.Value)
                {
                    latest = published;
                }
            }

            return latest;
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "ACSC failed to derive published cursor for document {DocumentId} ({Uri})", document.Id, document.Uri);
            return null;
        }
    }

    private static bool IsEntryElement(string localName)
        => string.Equals(localName, "item", StringComparison.OrdinalIgnoreCase)
            || string.Equals(localName, "entry", StringComparison.OrdinalIgnoreCase);

    private static DateTimeOffset? ExtractPublished(XElement element)
    {
        foreach (var name in EnumerateTimestampNames(element))
        {
            if (DateTimeOffset.TryParse(
                    name.Value,
                    CultureInfo.InvariantCulture,
                    DateTimeStyles.AllowWhiteSpaces | DateTimeStyles.AssumeUniversal,
                    out var parsed))
            {
                return parsed.ToUniversalTime();
            }
        }

        return null;
    }

    private static IEnumerable<XElement> EnumerateTimestampNames(XElement element)
    {
        foreach (var child in element.Elements())
        {
            var localName = child.Name.LocalName;
            if (string.Equals(localName, "pubDate", StringComparison.OrdinalIgnoreCase) ||
                string.Equals(localName, "published", StringComparison.OrdinalIgnoreCase) ||
                string.Equals(localName, "updated", StringComparison.OrdinalIgnoreCase) ||
                string.Equals(localName, "date", StringComparison.OrdinalIgnoreCase))
            {
                yield return child;
            }
        }
    }

    private Dictionary<string, string> CreateMetadata(AcscFeedOptions feed, AcscCursor cursor, string mode)
    {
        var metadata = new Dictionary<string, string>(StringComparer.Ordinal)
        {
            ["acsc.feed.slug"] = feed.Slug,
            ["acsc.fetch.mode"] = mode,
        };

        if (cursor.LastPublishedByFeed.TryGetValue(feed.Slug, out var published) && published.HasValue)
        {
            metadata["acsc.cursor.lastPublished"] = published.Value.ToString("O");
        }

        return metadata;
    }

    private Uri BuildFeedUri(AcscFeedOptions feed, AcscFetchMode mode)
    {
        var baseUri = mode switch
        {
            AcscFetchMode.Relay when IsRelayConfigured => _options.RelayEndpoint!,
            _ => _options.BaseEndpoint,
        };

        return new Uri(baseUri, feed.RelativePath);
    }

    private IEnumerable<AcscFeedOptions> GetEnabledFeeds()
        => _options.Feeds.Where(feed => feed is { Enabled: true });

    private Task<AcscCursor> GetCursorAsync(CancellationToken cancellationToken)
        => GetCursorCoreAsync(cancellationToken);

    private async Task<AcscCursor> GetCursorCoreAsync(CancellationToken cancellationToken)
    {
        var state = await _stateRepository.TryGetAsync(SourceName, cancellationToken).ConfigureAwait(false);
        return state is null ? AcscCursor.Empty : AcscCursor.FromBson(state.Cursor);
    }

    private Task UpdateCursorAsync(AcscCursor cursor, CancellationToken cancellationToken)
    {
        var document = cursor.ToBsonDocument();
        var completedAt = _timeProvider.GetUtcNow();
        return _stateRepository.UpdateCursorAsync(SourceName, document, completedAt, cancellationToken);
    }

    private bool IsRelayConfigured => _options.RelayEndpoint is not null;

    private static string ModeName(AcscFetchMode mode) => mode switch
    {
        AcscFetchMode.Relay => "relay",
        _ => "direct",
    };

    private enum AcscFetchMode
    {
        Direct = 0,
        Relay = 1,
    }
}
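
The published-cursor helpers at the end of `AcscConnector` (`TryComputeLatestPublishedAsync`, `IsEntryElement`, `ExtractPublished`) can be exercised in isolation. The following is a standalone sketch of the same idea, assuming a hypothetical `FeedTimestamps` class that takes the feed as a string instead of reading it from GridFS; it is not the connector's code and omits its logging and error handling.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical standalone variant for illustration: scans RSS <item> and
// Atom <entry> elements and returns the newest parseable timestamp in UTC.
public static class FeedTimestamps
{
    private static readonly string[] TimestampNames = { "pubDate", "published", "updated", "date" };

    public static DateTimeOffset? Latest(string xml)
    {
        var document = XDocument.Parse(xml);
        DateTimeOffset? latest = null;

        foreach (var entry in document.Descendants().Where(e => IsEntry(e.Name.LocalName)))
        {
            foreach (var child in entry.Elements())
            {
                // Match on the local name so namespaced Atom elements are found too.
                if (!TimestampNames.Contains(child.Name.LocalName, StringComparer.OrdinalIgnoreCase))
                {
                    continue;
                }

                if (!DateTimeOffset.TryParse(
                        child.Value,
                        CultureInfo.InvariantCulture,
                        DateTimeStyles.AllowWhiteSpaces | DateTimeStyles.AssumeUniversal,
                        out var parsed))
                {
                    continue;
                }

                var utc = parsed.ToUniversalTime();
                if (latest is null || utc > latest.Value)
                {
                    latest = utc;
                }

                // The first parseable timestamp of an entry wins, mirroring ExtractPublished.
                break;
            }
        }

        return latest;
    }

    private static bool IsEntry(string localName)
        => string.Equals(localName, "item", StringComparison.OrdinalIgnoreCase)
            || string.Equals(localName, "entry", StringComparison.OrdinalIgnoreCase);
}
```

Normalising to UTC before comparing means entries published with different offsets are ordered by instant rather than by their local clock time.
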
@@ -0,0 +1,19 @@
using Microsoft.Extensions.DependencyInjection;
using StellaOps.Plugin;

namespace StellaOps.Concelier.Connector.Acsc;

public sealed class AcscConnectorPlugin : IConnectorPlugin
{
    public const string SourceName = "acsc";

    public string Name => SourceName;

    public bool IsAvailable(IServiceProvider services) => services is not null;

    public IFeedConnector Create(IServiceProvider services)
    {
        ArgumentNullException.ThrowIfNull(services);
        return ActivatorUtilities.CreateInstance<AcscConnector>(services);
    }
}
@@ -0,0 +1,44 @@
|
||||
using Microsoft.Extensions.Configuration;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using StellaOps.DependencyInjection;
|
||||
using StellaOps.Concelier.Core.Jobs;
|
||||
using StellaOps.Concelier.Connector.Acsc.Configuration;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc;
|
||||
|
||||
public sealed class AcscDependencyInjectionRoutine : IDependencyInjectionRoutine
|
||||
{
|
||||
private const string ConfigurationSection = "concelier:sources:acsc";
|
||||
|
||||
private const string FetchCron = "7,37 * * * *";
|
||||
private const string ParseCron = "12,42 * * * *";
|
||||
private const string MapCron = "17,47 * * * *";
|
||||
private const string ProbeCron = "25,55 * * * *";
|
||||
|
||||
private static readonly TimeSpan FetchTimeout = TimeSpan.FromMinutes(4);
|
||||
private static readonly TimeSpan ParseTimeout = TimeSpan.FromMinutes(3);
|
||||
private static readonly TimeSpan MapTimeout = TimeSpan.FromMinutes(3);
|
||||
private static readonly TimeSpan ProbeTimeout = TimeSpan.FromMinutes(1);
|
||||
private static readonly TimeSpan LeaseDuration = TimeSpan.FromMinutes(3);
|
||||
|
||||
public IServiceCollection Register(IServiceCollection services, IConfiguration configuration)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(services);
|
||||
ArgumentNullException.ThrowIfNull(configuration);
|
||||
|
||||
services.AddAcscConnector(options =>
|
||||
{
|
||||
configuration.GetSection(ConfigurationSection).Bind(options);
|
||||
options.Validate();
|
||||
});
|
||||
|
||||
var scheduler = new JobSchedulerBuilder(services);
|
||||
scheduler
|
||||
.AddJob<AcscFetchJob>(AcscJobKinds.Fetch, FetchCron, FetchTimeout, LeaseDuration)
|
||||
.AddJob<AcscParseJob>(AcscJobKinds.Parse, ParseCron, ParseTimeout, LeaseDuration)
|
||||
.AddJob<AcscMapJob>(AcscJobKinds.Map, MapCron, MapTimeout, LeaseDuration)
|
||||
.AddJob<AcscProbeJob>(AcscJobKinds.Probe, ProbeCron, ProbeTimeout, LeaseDuration);
|
||||
|
||||
return services;
|
||||
}
|
||||
}
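
The four cron expressions above differ only in their minute field, which staggers the jobs within each half hour. The following is a minimal sketch, not connector code, that lays the minute slots out on a timeline; the `MinutesOf` helper and the short job labels are hypothetical names introduced for illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Cron strings copied from AcscDependencyInjectionRoutine. Only the minute
// field (the first token) varies, so that is all this sketch inspects.
var schedules = new (string Job, string Cron)[]
{
    ("fetch", "7,37 * * * *"),
    ("parse", "12,42 * * * *"),
    ("map", "17,47 * * * *"),
    ("probe", "25,55 * * * *"),
};

// Hypothetical helper: split the comma-separated minute list of a cron string.
static int[] MinutesOf(string cron) =>
    cron.Split(' ')[0]
        .Split(',')
        .Select(m => int.Parse(m, CultureInfo.InvariantCulture))
        .ToArray();

// Flatten to (minute, job) pairs and order them within the hour.
var timeline = schedules
    .SelectMany(s => MinutesOf(s.Cron).Select(m => (Minute: m, s.Job)))
    .OrderBy(slot => slot.Minute)
    .ToList();

foreach (var (minute, job) in timeline)
{
    Console.WriteLine($":{minute:00} {job}");
}
```

Read this way, fetch, parse, and map start five minutes apart, and the configured timeouts (4, 3, and 3 minutes) are each shorter than that gap, so one stage would normally finish before the next begins.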
|
||||
@@ -0,0 +1,56 @@
|
||||
using System.Net;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Options;
|
||||
using StellaOps.Concelier.Connector.Acsc.Configuration;
|
||||
using StellaOps.Concelier.Connector.Acsc.Internal;
|
||||
using StellaOps.Concelier.Connector.Common.Http;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc;
|
||||
|
||||
public static class AcscServiceCollectionExtensions
|
||||
{
|
||||
public static IServiceCollection AddAcscConnector(this IServiceCollection services, Action<AcscOptions> configure)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(services);
|
||||
ArgumentNullException.ThrowIfNull(configure);
|
||||
|
||||
services.AddOptions<AcscOptions>()
|
||||
.Configure(configure)
|
||||
.PostConfigure(static options => options.Validate());
|
||||
|
||||
services.AddSourceHttpClient(AcscOptions.HttpClientName, (sp, clientOptions) =>
|
||||
{
|
||||
var options = sp.GetRequiredService<IOptions<AcscOptions>>().Value;
|
||||
clientOptions.Timeout = options.RequestTimeout;
|
||||
clientOptions.UserAgent = options.UserAgent;
|
||||
clientOptions.RequestVersion = options.RequestVersion;
|
||||
clientOptions.VersionPolicy = options.VersionPolicy;
|
||||
clientOptions.AllowAutoRedirect = true;
|
||||
clientOptions.ConfigureHandler = handler =>
|
||||
{
|
||||
handler.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
|
||||
handler.AllowAutoRedirect = true;
|
||||
};
|
||||
|
||||
clientOptions.AllowedHosts.Clear();
|
||||
clientOptions.AllowedHosts.Add(options.BaseEndpoint.Host);
|
||||
if (options.RelayEndpoint is not null)
|
||||
{
|
||||
clientOptions.AllowedHosts.Add(options.RelayEndpoint.Host);
|
||||
}
|
||||
|
||||
clientOptions.DefaultRequestHeaders["Accept"] = string.Join(", ", new[]
|
||||
{
|
||||
"application/rss+xml",
|
||||
"application/atom+xml;q=0.9",
|
||||
"application/xml;q=0.8",
|
||||
"text/xml;q=0.7",
|
||||
});
|
||||
});
|
||||
|
||||
services.AddSingleton<AcscDiagnostics>();
|
||||
services.AddTransient<AcscConnector>();
|
||||
|
||||
return services;
|
||||
}
|
||||
}
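
The `Accept` header built above ranks media types with `q` weights. As a rough illustration only, the sketch below rebuilds the same header string and parses it back with `MediaTypeWithQualityHeaderValue` to show the resulting preference order; treating a missing `q` as 1.0 follows the usual HTTP convention and is an assumption of this sketch, not something the connector code states.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Net.Http.Headers;

// Same media types and weights as in AddAcscConnector.
var accept = string.Join(", ", new[]
{
    "application/rss+xml",
    "application/atom+xml;q=0.9",
    "application/xml;q=0.8",
    "text/xml;q=0.7",
});

// Parse each entry and rank by quality; an absent q parameter counts as 1.0.
var ranked = accept.Split(',')
    .Select(part => MediaTypeWithQualityHeaderValue.Parse(part.Trim()))
    .OrderByDescending(value => value.Quality ?? 1.0)
    .Select(value =>
        $"{value.MediaType} (q={(value.Quality ?? 1.0).ToString("0.0", CultureInfo.InvariantCulture)})")
    .ToList();

Console.WriteLine(accept);
foreach (var entry in ranked)
{
    Console.WriteLine(entry);
}
```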
|
||||
@@ -0,0 +1,54 @@
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Configuration;
|
||||
|
||||
/// <summary>
|
||||
/// Defines a single ACSC RSS feed endpoint.
|
||||
/// </summary>
|
||||
public sealed class AcscFeedOptions
|
||||
{
|
||||
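// \z anchors at the absolute end of input; '$' would also accept a slug followed by a trailing newline.
private static readonly Regex SlugPattern = new("^[a-z0-9][a-z0-9\\-]*\\z", RegexOptions.Compiled | RegexOptions.CultureInvariant);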
|
||||
|
||||
/// <summary>
|
||||
/// Logical slug for the feed (alerts, advisories, threats, etc.).
|
||||
/// </summary>
|
||||
public string Slug { get; set; } = "alerts";
|
||||
|
||||
/// <summary>
|
||||
/// Relative path (under <see cref="AcscOptions.BaseEndpoint"/>) for the RSS feed.
|
||||
/// </summary>
|
||||
public string RelativePath { get; set; } = "/acsc/view-all-content/alerts/rss";
|
||||
|
||||
/// <summary>
|
||||
/// Indicates whether the feed is active.
|
||||
/// </summary>
|
||||
public bool Enabled { get; set; } = true;
|
||||
|
||||
/// <summary>
|
||||
/// Optional display name for logging.
|
||||
/// </summary>
|
||||
public string? DisplayName { get; set; }
|
||||
|
||||
internal void Validate(int index)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(Slug))
|
||||
{
|
||||
throw new InvalidOperationException($"ACSC feed entry #{index} must define a slug.");
|
||||
}
|
||||
|
||||
if (!SlugPattern.IsMatch(Slug))
|
||||
{
|
||||
throw new InvalidOperationException($"ACSC feed slug '{Slug}' is invalid. Slugs must be lower-case alphanumeric with optional hyphen separators.");
|
||||
}
|
||||
|
||||
if (string.IsNullOrWhiteSpace(RelativePath))
|
||||
{
|
||||
throw new InvalidOperationException($"ACSC feed '{Slug}' must specify a relative path.");
|
||||
}
|
||||
|
||||
if (!RelativePath.StartsWith("/", StringComparison.Ordinal))
|
||||
{
|
||||
throw new InvalidOperationException($"ACSC feed '{Slug}' relative path must begin with '/' (value: '{RelativePath}').");
|
||||
}
|
||||
}
|
||||
}
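
To make the slug rule concrete, here is a small sketch that runs a few candidate slugs through a pattern with the same character rules (a lower-case alphanumeric first character followed by alphanumerics or hyphens). It is written with a `\z` end anchor, which gives the same answers as the connector's pattern for the single-line inputs used here; the candidate values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Lower-case alphanumeric start, then alphanumerics or hyphens, to end of input.
var slugPattern = new Regex(@"^[a-z0-9][a-z0-9\-]*\z", RegexOptions.CultureInvariant);

// Illustrative candidates only; none of these come from the connector's configuration.
var candidates = new[] { "alerts", "high-alerts", "2024-threats", "Alerts", "-alerts", "news_feed", "" };

var results = new Dictionary<string, bool>();
foreach (var candidate in candidates)
{
    results[candidate] = slugPattern.IsMatch(candidate);
    var verdict = results[candidate] ? "accepted" : "rejected";
    Console.WriteLine($"'{candidate}': {verdict}");
}
```

The first three candidates are accepted; the upper-case, leading-hyphen, underscore, and empty values are rejected.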
|
||||
@@ -0,0 +1,153 @@
|
||||
using System.Net;
|
||||
using System.Net.Http;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Configuration;
|
||||
|
||||
/// <summary>
|
||||
/// Connector options governing ACSC feed access and retry behaviour.
|
||||
/// </summary>
|
||||
public sealed class AcscOptions
|
||||
{
|
||||
public const string HttpClientName = "acsc";
|
||||
|
||||
private static readonly TimeSpan DefaultRequestTimeout = TimeSpan.FromSeconds(45);
|
||||
private static readonly TimeSpan DefaultFailureBackoff = TimeSpan.FromMinutes(5);
|
||||
private static readonly TimeSpan DefaultInitialBackfill = TimeSpan.FromDays(120);
|
||||
|
||||
public AcscOptions()
|
||||
{
|
||||
Feeds = new List<AcscFeedOptions>
|
||||
{
|
||||
new() { Slug = "alerts", RelativePath = "/acsc/view-all-content/alerts/rss" },
|
||||
new() { Slug = "advisories", RelativePath = "/acsc/view-all-content/advisories/rss" },
|
||||
new() { Slug = "news", RelativePath = "/acsc/view-all-content/news/rss", Enabled = false },
|
||||
new() { Slug = "publications", RelativePath = "/acsc/view-all-content/publications/rss", Enabled = false },
|
||||
new() { Slug = "threats", RelativePath = "/acsc/view-all-content/threats/rss", Enabled = false },
|
||||
};
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Base endpoint for direct ACSC fetches.
|
||||
/// </summary>
|
||||
public Uri BaseEndpoint { get; set; } = new("https://www.cyber.gov.au/", UriKind.Absolute);
|
||||
|
||||
/// <summary>
|
||||
/// Optional relay endpoint used when Akamai terminates direct HTTP/2 connections.
|
||||
/// </summary>
|
||||
public Uri? RelayEndpoint { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// Default mode when no preference has been captured in connector state. When <c>true</c>, the relay will be preferred for initial fetches.
|
||||
/// </summary>
|
||||
public bool PreferRelayByDefault { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// If enabled, the connector may switch to the relay endpoint when direct fetches fail.
|
||||
/// </summary>
|
||||
public bool EnableRelayFallback { get; set; } = true;
|
||||
|
||||
/// <summary>
|
||||
/// If set, the connector will always use the relay endpoint and skip direct attempts.
|
||||
/// </summary>
|
||||
public bool ForceRelay { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// Timeout applied to fetch requests (overrides HttpClient default).
|
||||
/// </summary>
|
||||
public TimeSpan RequestTimeout { get; set; } = DefaultRequestTimeout;
|
||||
|
||||
/// <summary>
|
||||
/// Backoff applied when marking fetch failures.
|
||||
/// </summary>
|
||||
public TimeSpan FailureBackoff { get; set; } = DefaultFailureBackoff;
|
||||
|
||||
/// <summary>
|
||||
/// Look-back period used when deriving initial published cursors.
|
||||
/// </summary>
|
||||
public TimeSpan InitialBackfill { get; set; } = DefaultInitialBackfill;
|
||||
|
||||
/// <summary>
|
||||
/// User-agent header sent with outbound requests.
|
||||
/// </summary>
|
||||
public string UserAgent { get; set; } = "StellaOps/Concelier (+https://stella-ops.org)";
|
||||
|
||||
/// <summary>
|
||||
/// RSS feeds requested during fetch.
|
||||
/// </summary>
|
||||
public IList<AcscFeedOptions> Feeds { get; }
|
||||
|
||||
/// <summary>
|
||||
/// HTTP version policy requested for outbound requests.
|
||||
/// </summary>
|
||||
public HttpVersionPolicy VersionPolicy { get; set; } = HttpVersionPolicy.RequestVersionOrLower;
|
||||
|
||||
/// <summary>
|
||||
/// Default HTTP version requested when connecting to ACSC (defaults to HTTP/2 but allows downgrade).
|
||||
/// </summary>
|
||||
public Version RequestVersion { get; set; } = HttpVersion.Version20;
|
||||
|
||||
public void Validate()
|
||||
{
|
||||
if (BaseEndpoint is null || !BaseEndpoint.IsAbsoluteUri)
|
||||
{
|
||||
throw new InvalidOperationException("ACSC BaseEndpoint must be an absolute URI.");
|
||||
}
|
||||
|
||||
if (!BaseEndpoint.AbsoluteUri.EndsWith("/", StringComparison.Ordinal))
|
||||
{
|
||||
throw new InvalidOperationException("ACSC BaseEndpoint must include a trailing slash.");
|
||||
}
|
||||
|
||||
if (RelayEndpoint is not null && !RelayEndpoint.IsAbsoluteUri)
|
||||
{
|
||||
throw new InvalidOperationException("ACSC RelayEndpoint must be an absolute URI when specified.");
|
||||
}
|
||||
|
||||
if (RelayEndpoint is not null && !RelayEndpoint.AbsoluteUri.EndsWith("/", StringComparison.Ordinal))
|
||||
{
|
||||
throw new InvalidOperationException("ACSC RelayEndpoint must include a trailing slash when specified.");
|
||||
}
|
||||
|
||||
if (RequestTimeout <= TimeSpan.Zero)
|
||||
{
|
||||
throw new InvalidOperationException("ACSC RequestTimeout must be positive.");
|
||||
}
|
||||
|
||||
if (FailureBackoff < TimeSpan.Zero)
|
||||
{
|
||||
throw new InvalidOperationException("ACSC FailureBackoff cannot be negative.");
|
||||
}
|
||||
|
||||
if (InitialBackfill <= TimeSpan.Zero)
|
||||
{
|
||||
throw new InvalidOperationException("ACSC InitialBackfill must be positive.");
|
||||
}
|
||||
|
||||
if (string.IsNullOrWhiteSpace(UserAgent))
|
||||
{
|
||||
throw new InvalidOperationException("ACSC UserAgent cannot be empty.");
|
||||
}
|
||||
|
||||
if (Feeds.Count == 0)
|
||||
{
|
||||
throw new InvalidOperationException("At least one ACSC feed must be configured.");
|
||||
}
|
||||
|
||||
var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
|
||||
for (var i = 0; i < Feeds.Count; i++)
|
||||
{
|
||||
var feed = Feeds[i];
|
||||
feed.Validate(i);
|
||||
|
||||
if (!feed.Enabled)
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
if (!seen.Add(feed.Slug))
|
||||
{
|
||||
throw new InvalidOperationException($"Duplicate ACSC feed slug '{feed.Slug}' detected. Slugs must be unique (case-insensitive).");
|
||||
}
|
||||
}
|
||||
}
|
||||
}
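
The validation above requires absolute endpoints with a trailing slash, while feed paths must begin with `/`. This diff does not show how the connector joins the two, so the sketch below only demonstrates how `System.Uri` resolves such a pair if they were combined with `new Uri(Uri, string)`. The relay host `relay.example` and its `acsc-proxy/` sub-path are placeholders, not real configuration.

```csharp
using System;

var baseEndpoint = new Uri("https://www.cyber.gov.au/", UriKind.Absolute);
var relativePath = "/acsc/view-all-content/alerts/rss";

// A reference that starts with '/' replaces the whole path of the base URI.
var direct = new Uri(baseEndpoint, relativePath);
Console.WriteLine(direct.AbsoluteUri);

// Hypothetical relay hosted under a sub-path: the leading '/' discards that sub-path.
var relay = new Uri("https://relay.example/acsc-proxy/", UriKind.Absolute);
var viaRelay = new Uri(relay, relativePath);
Console.WriteLine(viaRelay.AbsoluteUri);

// Trimming the leading slash resolves relative to the relay's sub-path instead.
var keepPrefix = new Uri(relay, relativePath.TrimStart('/'));
Console.WriteLine(keepPrefix.AbsoluteUri);
```

If a relay is ever deployed under a sub-path, this difference decides whether the sub-path survives in the final request URI, so it may be worth checking against the actual composition logic.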
|
||||
141
src/StellaOps.Concelier.Connector.Acsc/Internal/AcscCursor.cs
Normal file
@@ -0,0 +1,141 @@
|
||||
using MongoDB.Bson;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Internal;
|
||||
|
||||
internal enum AcscEndpointPreference
|
||||
{
|
||||
Auto = 0,
|
||||
Direct = 1,
|
||||
Relay = 2,
|
||||
}
|
||||
|
||||
internal sealed record AcscCursor(
|
||||
AcscEndpointPreference PreferredEndpoint,
|
||||
IReadOnlyDictionary<string, DateTimeOffset?> LastPublishedByFeed,
|
||||
IReadOnlyCollection<Guid> PendingDocuments,
|
||||
IReadOnlyCollection<Guid> PendingMappings)
|
||||
{
|
||||
private static readonly IReadOnlyCollection<Guid> EmptyGuidList = Array.Empty<Guid>();
|
||||
private static readonly IReadOnlyDictionary<string, DateTimeOffset?> EmptyFeedDictionary =
|
||||
new Dictionary<string, DateTimeOffset?>(StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
public static AcscCursor Empty { get; } = new(
|
||||
AcscEndpointPreference.Auto,
|
||||
EmptyFeedDictionary,
|
||||
EmptyGuidList,
|
||||
EmptyGuidList);
|
||||
|
||||
public AcscCursor WithPendingDocuments(IEnumerable<Guid> documents)
|
||||
=> this with { PendingDocuments = documents?.Distinct().ToArray() ?? EmptyGuidList };
|
||||
|
||||
public AcscCursor WithPendingMappings(IEnumerable<Guid> mappings)
|
||||
=> this with { PendingMappings = mappings?.Distinct().ToArray() ?? EmptyGuidList };
|
||||
|
||||
public AcscCursor WithPreferredEndpoint(AcscEndpointPreference preference)
|
||||
=> this with { PreferredEndpoint = preference };
|
||||
|
||||
public AcscCursor WithLastPublished(IDictionary<string, DateTimeOffset?> values)
|
||||
{
|
||||
var snapshot = new Dictionary<string, DateTimeOffset?>(StringComparer.OrdinalIgnoreCase);
|
||||
if (values is not null)
|
||||
{
|
||||
foreach (var kvp in values)
|
||||
{
|
||||
snapshot[kvp.Key] = kvp.Value;
|
||||
}
|
||||
}
|
||||
|
||||
return this with { LastPublishedByFeed = snapshot };
|
||||
}
|
||||
|
||||
public BsonDocument ToBsonDocument()
|
||||
{
|
||||
var document = new BsonDocument
|
||||
{
|
||||
["preferredEndpoint"] = PreferredEndpoint.ToString(),
|
||||
["pendingDocuments"] = new BsonArray(PendingDocuments.Select(id => id.ToString())),
|
||||
["pendingMappings"] = new BsonArray(PendingMappings.Select(id => id.ToString())),
|
||||
};
|
||||
|
||||
var feedsDocument = new BsonDocument();
|
||||
foreach (var kvp in LastPublishedByFeed)
|
||||
{
|
||||
if (kvp.Value.HasValue)
|
||||
{
|
||||
feedsDocument[kvp.Key] = kvp.Value.Value.UtcDateTime;
|
||||
}
|
||||
}
|
||||
|
||||
document["feeds"] = feedsDocument;
|
||||
return document;
|
||||
}
|
||||
|
||||
public static AcscCursor FromBson(BsonDocument? document)
|
||||
{
|
||||
if (document is null || document.ElementCount == 0)
|
||||
{
|
||||
return Empty;
|
||||
}
|
||||
|
||||
var preferredEndpoint = document.TryGetValue("preferredEndpoint", out var endpointValue)
|
||||
? ParseEndpointPreference(endpointValue.IsString ? endpointValue.AsString : null)
|
||||
: AcscEndpointPreference.Auto;
|
||||
|
||||
var feeds = new Dictionary<string, DateTimeOffset?>(StringComparer.OrdinalIgnoreCase);
|
||||
if (document.TryGetValue("feeds", out var feedsValue) && feedsValue is BsonDocument feedsDocument)
|
||||
{
|
||||
foreach (var element in feedsDocument.Elements)
|
||||
{
|
||||
feeds[element.Name] = ParseDate(element.Value);
|
||||
}
|
||||
}
|
||||
|
||||
var pendingDocuments = ReadGuidArray(document, "pendingDocuments");
|
||||
var pendingMappings = ReadGuidArray(document, "pendingMappings");
|
||||
|
||||
return new AcscCursor(
|
||||
preferredEndpoint,
|
||||
feeds,
|
||||
pendingDocuments,
|
||||
pendingMappings);
|
||||
}
|
||||
|
||||
private static IReadOnlyCollection<Guid> ReadGuidArray(BsonDocument document, string field)
|
||||
{
|
||||
if (!document.TryGetValue(field, out var value) || value is not BsonArray array)
|
||||
{
|
||||
return EmptyGuidList;
|
||||
}
|
||||
|
||||
var list = new List<Guid>(array.Count);
|
||||
foreach (var element in array)
|
||||
{
|
||||
if (Guid.TryParse(element?.ToString(), out var guid))
|
||||
{
|
||||
list.Add(guid);
|
||||
}
|
||||
}
|
||||
|
||||
return list;
|
||||
}
|
||||
|
||||
private static DateTimeOffset? ParseDate(BsonValue value)
|
||||
{
|
||||
return value.BsonType switch
|
||||
{
|
||||
BsonType.DateTime => DateTime.SpecifyKind(value.ToUniversalTime(), DateTimeKind.Utc),
|
||||
BsonType.String when DateTimeOffset.TryParse(value.AsString, out var parsed) => parsed.ToUniversalTime(),
|
||||
_ => null,
|
||||
};
|
||||
}
|
||||
|
||||
private static AcscEndpointPreference ParseEndpointPreference(string? value)
|
||||
{
|
||||
if (Enum.TryParse<AcscEndpointPreference>(value, ignoreCase: true, out var parsed))
|
||||
{
|
||||
return parsed;
|
||||
}
|
||||
|
||||
return AcscEndpointPreference.Auto;
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,97 @@
|
||||
using System.Diagnostics.Metrics;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Internal;
|
||||
|
||||
public sealed class AcscDiagnostics : IDisposable
|
||||
{
|
||||
private const string MeterName = "StellaOps.Concelier.Connector.Acsc";
|
||||
private const string MeterVersion = "1.0.0";
|
||||
|
||||
private readonly Meter _meter;
|
||||
private readonly Counter<long> _fetchAttempts;
|
||||
private readonly Counter<long> _fetchSuccess;
|
||||
private readonly Counter<long> _fetchFailures;
|
||||
private readonly Counter<long> _fetchUnchanged;
|
||||
private readonly Counter<long> _fetchFallbacks;
|
||||
private readonly Counter<long> _cursorUpdates;
|
||||
private readonly Counter<long> _parseAttempts;
|
||||
private readonly Counter<long> _parseSuccess;
|
||||
private readonly Counter<long> _parseFailures;
|
||||
private readonly Counter<long> _mapSuccess;
|
||||
|
||||
public AcscDiagnostics()
|
||||
{
|
||||
_meter = new Meter(MeterName, MeterVersion);
|
||||
_fetchAttempts = _meter.CreateCounter<long>("acsc.fetch.attempts", unit: "operations");
|
||||
_fetchSuccess = _meter.CreateCounter<long>("acsc.fetch.success", unit: "operations");
|
||||
_fetchFailures = _meter.CreateCounter<long>("acsc.fetch.failures", unit: "operations");
|
||||
_fetchUnchanged = _meter.CreateCounter<long>("acsc.fetch.unchanged", unit: "operations");
|
||||
_fetchFallbacks = _meter.CreateCounter<long>("acsc.fetch.fallbacks", unit: "operations");
|
||||
_cursorUpdates = _meter.CreateCounter<long>("acsc.cursor.published_updates", unit: "feeds");
|
||||
_parseAttempts = _meter.CreateCounter<long>("acsc.parse.attempts", unit: "documents");
|
||||
_parseSuccess = _meter.CreateCounter<long>("acsc.parse.success", unit: "documents");
|
||||
_parseFailures = _meter.CreateCounter<long>("acsc.parse.failures", unit: "documents");
|
||||
_mapSuccess = _meter.CreateCounter<long>("acsc.map.success", unit: "advisories");
|
||||
}
|
||||
|
||||
public void FetchAttempt(string feed, string mode)
|
||||
=> _fetchAttempts.Add(1, GetTags(feed, mode));
|
||||
|
||||
public void FetchSuccess(string feed, string mode)
|
||||
=> _fetchSuccess.Add(1, GetTags(feed, mode));
|
||||
|
||||
public void FetchFailure(string feed, string mode)
|
||||
=> _fetchFailures.Add(1, GetTags(feed, mode));
|
||||
|
||||
public void FetchUnchanged(string feed, string mode)
|
||||
=> _fetchUnchanged.Add(1, GetTags(feed, mode));
|
||||
|
||||
public void FetchFallback(string feed, string mode, string reason)
|
||||
=> _fetchFallbacks.Add(1, GetTags(feed, mode, new KeyValuePair<string, object?>("reason", reason)));
|
||||
|
||||
public void CursorUpdated(string feed)
|
||||
=> _cursorUpdates.Add(1, new KeyValuePair<string, object?>("feed", feed));
|
||||
|
||||
public void ParseAttempt(string feed)
|
||||
=> _parseAttempts.Add(1, new KeyValuePair<string, object?>("feed", feed));
|
||||
|
||||
public void ParseSuccess(string feed)
|
||||
=> _parseSuccess.Add(1, new KeyValuePair<string, object?>("feed", feed));
|
||||
|
||||
public void ParseFailure(string feed, string reason)
|
||||
=> _parseFailures.Add(1, new KeyValuePair<string, object?>[]
|
||||
{
|
||||
new("feed", feed),
|
||||
new("reason", reason),
|
||||
});
|
||||
|
||||
public void MapSuccess(int advisoryCount)
|
||||
{
|
||||
if (advisoryCount <= 0)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
_mapSuccess.Add(advisoryCount);
|
||||
}
|
||||
|
||||
private static KeyValuePair<string, object?>[] GetTags(string feed, string mode)
|
||||
=> new[]
|
||||
{
|
||||
new KeyValuePair<string, object?>("feed", feed),
|
||||
new KeyValuePair<string, object?>("mode", mode),
|
||||
};
|
||||
|
||||
private static KeyValuePair<string, object?>[] GetTags(string feed, string mode, KeyValuePair<string, object?> extra)
|
||||
=> new[]
|
||||
{
|
||||
new KeyValuePair<string, object?>("feed", feed),
|
||||
new KeyValuePair<string, object?>("mode", mode),
|
||||
extra,
|
||||
};
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
_meter.Dispose();
|
||||
}
|
||||
}
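
The counters above are only visible once something subscribes to the meter. As a hedged sketch of how they could be observed in a test, the code below attaches a `MeterListener` to a stand-in meter with the same name and one counter with the same instrument name. The stand-in meter, the `observed` list, and the tag value `"direct"` are assumptions for illustration; the real mode strings are not shown in this diff.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Stand-in meter and counter that reuse the names from AcscDiagnostics.
using var meter = new Meter("StellaOps.Concelier.Connector.Acsc", "1.0.0");
var fetchAttempts = meter.CreateCounter<long>("acsc.fetch.attempts", unit: "operations");

var observed = new List<string>();

using var listener = new MeterListener();

// Subscribe only to instruments published by the ACSC meter.
listener.InstrumentPublished = (instrument, meterListener) =>
{
    if (instrument.Meter.Name == "StellaOps.Concelier.Connector.Acsc")
    {
        meterListener.EnableMeasurementEvents(instrument);
    }
};

// Record each long-valued measurement together with its tags.
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    var parts = new List<string>();
    foreach (var tag in tags)
    {
        parts.Add($"{tag.Key}={tag.Value}");
    }

    observed.Add($"{instrument.Name} +{value} [{string.Join(", ", parts)}]");
});
listener.Start();

// Equivalent to what FetchAttempt(feed, mode) records; "direct" is an assumed mode value.
fetchAttempts.Add(1,
    new KeyValuePair<string, object?>("feed", "alerts"),
    new KeyValuePair<string, object?>("mode", "direct"));

foreach (var line in observed)
{
    Console.WriteLine(line);
}
```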
|
||||
@@ -0,0 +1,20 @@
|
||||
using StellaOps.Concelier.Storage.Mongo.Documents;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Internal;
|
||||
|
||||
internal readonly record struct AcscDocumentMetadata(string FeedSlug, string FetchMode)
|
||||
{
|
||||
public static AcscDocumentMetadata FromDocument(DocumentRecord document)
|
||||
{
|
||||
if (document.Metadata is null)
|
||||
{
|
||||
return new AcscDocumentMetadata(string.Empty, string.Empty);
|
||||
}
|
||||
|
||||
document.Metadata.TryGetValue("acsc.feed.slug", out var slug);
|
||||
document.Metadata.TryGetValue("acsc.fetch.mode", out var mode);
|
||||
return new AcscDocumentMetadata(
|
||||
string.IsNullOrWhiteSpace(slug) ? string.Empty : slug.Trim(),
|
||||
string.IsNullOrWhiteSpace(mode) ? string.Empty : mode.Trim());
|
||||
}
|
||||
}
|
||||
58
src/StellaOps.Concelier.Connector.Acsc/Internal/AcscDto.cs
Normal file
@@ -0,0 +1,58 @@
|
||||
using System.Text.Json.Serialization;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Internal;
|
||||
|
||||
internal sealed record AcscFeedDto(
|
||||
[property: JsonPropertyName("feedSlug")] string FeedSlug,
|
||||
[property: JsonPropertyName("feedTitle")] string? FeedTitle,
|
||||
[property: JsonPropertyName("feedLink")] string? FeedLink,
|
||||
[property: JsonPropertyName("feedUpdated")] DateTimeOffset? FeedUpdated,
|
||||
[property: JsonPropertyName("parsedAt")] DateTimeOffset ParsedAt,
|
||||
[property: JsonPropertyName("entries")] IReadOnlyList<AcscEntryDto> Entries)
|
||||
{
|
||||
public static AcscFeedDto Empty { get; } = new(
|
||||
FeedSlug: string.Empty,
|
||||
FeedTitle: null,
|
||||
FeedLink: null,
|
||||
FeedUpdated: null,
|
||||
ParsedAt: DateTimeOffset.UnixEpoch,
|
||||
Entries: Array.Empty<AcscEntryDto>());
|
||||
}
|
||||
|
||||
internal sealed record AcscEntryDto(
|
||||
[property: JsonPropertyName("entryId")] string EntryId,
|
||||
[property: JsonPropertyName("title")] string Title,
|
||||
[property: JsonPropertyName("link")] string? Link,
|
||||
[property: JsonPropertyName("feedSlug")] string FeedSlug,
|
||||
[property: JsonPropertyName("published")] DateTimeOffset? Published,
|
||||
[property: JsonPropertyName("updated")] DateTimeOffset? Updated,
|
||||
[property: JsonPropertyName("summary")] string Summary,
|
||||
[property: JsonPropertyName("contentHtml")] string ContentHtml,
|
||||
[property: JsonPropertyName("contentText")] string ContentText,
|
||||
[property: JsonPropertyName("references")] IReadOnlyList<AcscReferenceDto> References,
|
||||
[property: JsonPropertyName("aliases")] IReadOnlyList<string> Aliases,
|
||||
[property: JsonPropertyName("fields")] IReadOnlyDictionary<string, string> Fields)
|
||||
{
|
||||
public static AcscEntryDto Empty { get; } = new(
|
||||
EntryId: string.Empty,
|
||||
Title: string.Empty,
|
||||
Link: null,
|
||||
FeedSlug: string.Empty,
|
||||
Published: null,
|
||||
Updated: null,
|
||||
Summary: string.Empty,
|
||||
ContentHtml: string.Empty,
|
||||
ContentText: string.Empty,
|
||||
References: Array.Empty<AcscReferenceDto>(),
|
||||
Aliases: Array.Empty<string>(),
|
||||
Fields: new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase));
|
||||
}
|
||||
|
||||
internal sealed record AcscReferenceDto(
|
||||
[property: JsonPropertyName("title")] string Title,
|
||||
[property: JsonPropertyName("url")] string Url)
|
||||
{
|
||||
public static AcscReferenceDto Empty { get; } = new(
|
||||
Title: string.Empty,
|
||||
Url: string.Empty);
|
||||
}
|
||||
@@ -0,0 +1,594 @@
|
||||
using System.Globalization;
|
||||
using System.Text;
|
||||
using System.Xml.Linq;
|
||||
using AngleSharp.Dom;
|
||||
using AngleSharp.Html.Parser;
|
||||
using System.Security.Cryptography;
|
||||
using StellaOps.Concelier.Connector.Common.Html;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Internal;
|
||||
|
||||
internal static class AcscFeedParser
|
||||
{
|
||||
private static readonly XNamespace AtomNamespace = "http://www.w3.org/2005/Atom";
|
||||
private static readonly XNamespace ContentNamespace = "http://purl.org/rss/1.0/modules/content/";
|
||||
public static AcscFeedDto Parse(byte[] payload, string feedSlug, DateTimeOffset parsedAt, HtmlContentSanitizer sanitizer)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(payload);
|
||||
ArgumentNullException.ThrowIfNull(sanitizer);
|
||||
|
||||
if (payload.Length == 0)
|
||||
{
|
||||
return AcscFeedDto.Empty with
|
||||
{
|
||||
FeedSlug = feedSlug ?? string.Empty,
|
||||
ParsedAt = parsedAt,
|
||||
Entries = Array.Empty<AcscEntryDto>(),
|
||||
};
|
||||
}
|
||||
|
||||
var xml = XDocument.Parse(Encoding.UTF8.GetString(payload));
|
||||
|
||||
var (feedTitle, feedLink, feedUpdated) = ExtractFeedMetadata(xml);
|
||||
var items = ExtractEntries(xml).ToArray();
|
||||
|
||||
var entries = new List<AcscEntryDto>(items.Length);
|
||||
foreach (var item in items)
|
||||
{
|
||||
var entryId = ExtractEntryId(item);
|
||||
if (string.IsNullOrWhiteSpace(entryId))
|
||||
{
|
||||
// Fall back to hash of title + link to avoid duplicates.
|
||||
entryId = GenerateFallbackId(item);
|
||||
}
|
||||
|
||||
var title = ExtractTitle(item);
|
||||
var link = ExtractLink(item);
|
||||
var published = ExtractDate(item, "pubDate") ?? ExtractAtomDate(item, "published") ?? ExtractDcDate(item);
|
||||
var updated = ExtractAtomDate(item, "updated");
|
||||
|
||||
var rawHtml = ExtractContent(item);
|
||||
var baseUri = TryCreateUri(link);
|
||||
var sanitizedHtml = sanitizer.Sanitize(rawHtml, baseUri);
|
||||
var htmlFragment = ParseHtmlFragment(sanitizedHtml);
|
||||
|
||||
var summary = BuildSummary(htmlFragment) ?? string.Empty;
|
||||
var contentText = NormalizeWhitespace(htmlFragment?.TextContent ?? string.Empty);
|
||||
|
||||
var references = ExtractReferences(htmlFragment);
|
||||
var fields = ExtractFields(htmlFragment, out var serialNumber, out var advisoryType);
|
||||
var aliases = BuildAliases(serialNumber, advisoryType);
|
||||
|
||||
var entry = new AcscEntryDto(
|
||||
EntryId: entryId,
|
||||
Title: title,
|
||||
Link: link,
|
||||
FeedSlug: feedSlug ?? string.Empty,
|
||||
Published: published,
|
||||
Updated: updated,
|
||||
Summary: summary,
|
||||
ContentHtml: sanitizedHtml,
|
||||
ContentText: contentText,
|
||||
References: references,
|
||||
Aliases: aliases,
|
||||
Fields: fields);
|
||||
|
||||
entries.Add(entry);
|
||||
}
|
||||
|
||||
return new AcscFeedDto(
|
||||
FeedSlug: feedSlug ?? string.Empty,
|
||||
FeedTitle: feedTitle,
|
||||
FeedLink: feedLink,
|
||||
FeedUpdated: feedUpdated,
|
||||
ParsedAt: parsedAt,
|
||||
Entries: entries);
|
||||
}
|
||||
|
||||
private static (string? Title, string? Link, DateTimeOffset? Updated) ExtractFeedMetadata(XDocument xml)
|
||||
{
|
||||
var root = xml.Root;
|
||||
if (root is null)
|
||||
{
|
||||
return (null, null, null);
|
||||
}
|
||||
|
||||
if (string.Equals(root.Name.LocalName, "rss", StringComparison.OrdinalIgnoreCase))
|
||||
{
|
||||
var channel = root.Element("channel");
|
||||
var title = channel?.Element("title")?.Value?.Trim();
|
||||
var link = channel?.Element("link")?.Value?.Trim();
|
||||
var updated = TryParseDate(channel?.Element("lastBuildDate")?.Value);
|
||||
return (title, link, updated);
|
||||
}
|
||||
|
||||
if (root.Name == AtomNamespace + "feed")
|
||||
{
|
||||
var title = root.Element(AtomNamespace + "title")?.Value?.Trim();
|
||||
var link = root.Elements(AtomNamespace + "link")
|
||||
.FirstOrDefault(static element =>
|
||||
string.Equals(element.Attribute("rel")?.Value, "alternate", StringComparison.OrdinalIgnoreCase))
|
||||
?.Attribute("href")?.Value?.Trim()
|
||||
?? root.Element(AtomNamespace + "link")?.Attribute("href")?.Value?.Trim();
|
||||
var updated = TryParseDate(root.Element(AtomNamespace + "updated")?.Value);
|
||||
return (title, link, updated);
|
||||
}
|
||||
|
||||
return (null, null, null);
|
||||
}
|
||||
|
||||
private static IEnumerable<XElement> ExtractEntries(XDocument xml)
|
||||
{
|
||||
var root = xml.Root;
|
||||
if (root is null)
|
||||
{
|
||||
yield break;
|
||||
}
|
||||
|
||||
if (string.Equals(root.Name.LocalName, "rss", StringComparison.OrdinalIgnoreCase))
|
||||
{
|
||||
var channel = root.Element("channel");
|
||||
if (channel is null)
|
||||
{
|
||||
yield break;
|
||||
}
|
||||
|
||||
foreach (var item in channel.Elements("item"))
|
||||
{
|
||||
yield return item;
|
||||
}
|
||||
yield break;
|
||||
}
|
||||
|
||||
if (root.Name == AtomNamespace + "feed")
|
||||
{
|
||||
foreach (var entry in root.Elements(AtomNamespace + "entry"))
|
||||
{
|
||||
yield return entry;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private static string ExtractTitle(XElement element)
|
||||
{
|
||||
var title = element.Element("title")?.Value
|
||||
?? element.Element(AtomNamespace + "title")?.Value
|
||||
?? string.Empty;
|
||||
return title.Trim();
|
||||
}
|
||||
|
||||
private static string? ExtractLink(XElement element)
|
||||
{
|
||||
var linkValue = element.Element("link")?.Value;
|
||||
if (!string.IsNullOrWhiteSpace(linkValue))
|
||||
{
|
||||
return linkValue.Trim();
|
||||
}
|
||||
|
||||
var atomLink = element.Elements(AtomNamespace + "link")
|
||||
.FirstOrDefault(static el =>
|
||||
string.Equals(el.Attribute("rel")?.Value, "alternate", StringComparison.OrdinalIgnoreCase))
|
||||
?? element.Element(AtomNamespace + "link");
|
||||
|
||||
if (atomLink is not null)
|
||||
{
|
||||
var href = atomLink.Attribute("href")?.Value;
|
||||
if (!string.IsNullOrWhiteSpace(href))
|
||||
{
|
||||
return href.Trim();
|
||||
}
|
||||
}
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
private static string ExtractEntryId(XElement element)
|
||||
{
|
||||
var guid = element.Element("guid")?.Value;
|
||||
if (!string.IsNullOrWhiteSpace(guid))
|
||||
{
|
||||
return guid.Trim();
|
||||
}
|
||||
|
||||
var atomId = element.Element(AtomNamespace + "id")?.Value;
|
||||
if (!string.IsNullOrWhiteSpace(atomId))
|
||||
{
|
||||
return atomId.Trim();
|
||||
}
|
||||
|
||||
if (!string.IsNullOrWhiteSpace(element.Element("link")?.Value))
|
||||
{
|
||||
return element.Element("link")!.Value.Trim();
|
||||
}
|
||||
|
||||
if (!string.IsNullOrWhiteSpace(element.Element("title")?.Value))
|
||||
{
|
||||
return GenerateStableKey(element.Element("title")!.Value);
|
||||
}
|
||||
|
||||
return string.Empty;
|
||||
}
|
||||
|
||||
private static string GenerateFallbackId(XElement element)
|
||||
{
|
||||
var builder = new StringBuilder();
|
||||
// ExtractTitle covers both the RSS <title> element and the Atom-namespaced title.
var title = ExtractTitle(element);
|
||||
if (!string.IsNullOrWhiteSpace(title))
|
||||
{
|
||||
builder.Append(title.Trim());
|
||||
}
|
||||
|
||||
var link = ExtractLink(element);
|
||||
if (!string.IsNullOrWhiteSpace(link))
|
||||
{
|
||||
if (builder.Length > 0)
|
||||
{
|
||||
builder.Append("::");
|
||||
}
|
||||
builder.Append(link);
|
||||
}
|
||||
|
||||
if (builder.Length == 0)
|
||||
{
|
||||
return Guid.NewGuid().ToString("n");
|
||||
}
|
||||
|
||||
return GenerateStableKey(builder.ToString());
|
||||
}
|
||||
|
||||
private static string GenerateStableKey(string value)
|
||||
{
|
||||
var bytes = Encoding.UTF8.GetBytes(value);
// The static SHA256.HashData helper (.NET 5+) avoids creating a disposable hasher instance.
return Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
|
||||
}
|
||||
|
||||
private static string ExtractContent(XElement element)
|
||||
{
|
||||
var encoded = element.Element(ContentNamespace + "encoded")?.Value;
|
||||
if (!string.IsNullOrWhiteSpace(encoded))
|
||||
{
|
||||
return encoded;
|
||||
}
|
||||
|
||||
var description = element.Element("description")?.Value;
|
||||
if (!string.IsNullOrWhiteSpace(description))
|
||||
{
|
||||
return description;
|
||||
}
|
||||
|
||||
var summary = element.Element(AtomNamespace + "summary")?.Value;
|
||||
if (!string.IsNullOrWhiteSpace(summary))
|
||||
{
|
||||
return summary;
|
||||
}
|
||||
|
||||
return string.Empty;
|
||||
}
|
||||
|
||||
private static DateTimeOffset? ExtractDate(XElement element, string name)
|
||||
{
|
||||
var value = element.Element(name)?.Value;
|
||||
return TryParseDate(value);
|
||||
}
|
||||
|
||||
private static DateTimeOffset? ExtractAtomDate(XElement element, string name)
|
||||
{
|
||||
var value = element.Element(AtomNamespace + name)?.Value;
|
||||
return TryParseDate(value);
|
||||
}
|
||||
|
||||
private static DateTimeOffset? ExtractDcDate(XElement element)
|
||||
{
|
||||
var value = element.Element(XName.Get("date", "http://purl.org/dc/elements/1.1/"))?.Value;
|
||||
return TryParseDate(value);
|
||||
}
|
||||
|
||||
private static DateTimeOffset? TryParseDate(string? value)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(value))
|
||||
{
|
||||
return null;
|
||||
}
|
||||
|
||||
if (DateTimeOffset.TryParse(value, CultureInfo.InvariantCulture, DateTimeStyles.AssumeUniversal | DateTimeStyles.AllowWhiteSpaces, out var result))
|
||||
{
|
||||
return result.ToUniversalTime();
|
||||
}
|
||||
|
||||
if (DateTimeOffset.TryParse(value, CultureInfo.CurrentCulture, DateTimeStyles.AssumeUniversal | DateTimeStyles.AllowWhiteSpaces, out result))
|
||||
{
|
||||
return result.ToUniversalTime();
|
||||
}
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
private static Uri? TryCreateUri(string? value)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(value))
|
||||
{
|
||||
return null;
|
||||
}
|
||||
|
||||
return Uri.TryCreate(value, UriKind.Absolute, out var uri) ? uri : null;
|
||||
}
|
||||
|
||||
private static IElement? ParseHtmlFragment(string html)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(html))
|
||||
{
|
||||
return null;
|
||||
}
|
||||
|
||||
var parser = new HtmlParser(new HtmlParserOptions
|
||||
{
|
||||
IsKeepingSourceReferences = false,
|
||||
});
|
||||
var document = parser.ParseDocument($"<body>{html}</body>");
|
||||
return document.Body;
|
||||
}
|
||||
|
||||
private static string? BuildSummary(IElement? root)
|
||||
{
|
||||
if (root is null || !root.HasChildNodes)
|
||||
{
|
||||
return root?.TextContent is { Length: > 0 } text
|
||||
? NormalizeWhitespace(text)
|
||||
: string.Empty;
|
||||
}
|
||||
|
||||
var segments = new List<string>();
|
||||
foreach (var child in root.Children)
|
||||
{
|
||||
var text = NormalizeWhitespace(child.TextContent);
|
||||
if (string.IsNullOrEmpty(text))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
if (string.Equals(child.NodeName, "LI", StringComparison.OrdinalIgnoreCase))
|
||||
{
|
||||
segments.Add($"- {text}");
|
||||
continue;
|
||||
}
|
||||
|
||||
segments.Add(text);
|
||||
}
|
||||
|
||||
if (segments.Count == 0)
|
||||
{
|
||||
var fallback = NormalizeWhitespace(root.TextContent);
|
||||
return fallback;
|
||||
}
|
||||
|
||||
return string.Join("\n\n", segments);
|
||||
}
|
||||
|
||||
private static IReadOnlyList<AcscReferenceDto> ExtractReferences(IElement? root)
|
||||
{
|
||||
if (root is null)
|
||||
{
|
||||
return Array.Empty<AcscReferenceDto>();
|
||||
}
|
||||
|
||||
var anchors = root.QuerySelectorAll("a");
|
||||
if (anchors.Length == 0)
|
||||
{
|
||||
return Array.Empty<AcscReferenceDto>();
|
||||
}
|
||||
|
||||
var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
|
||||
var references = new List<AcscReferenceDto>(anchors.Length);
|
||||
|
||||
foreach (var anchor in anchors)
|
||||
{
|
||||
var href = anchor.GetAttribute("href");
|
||||
if (string.IsNullOrWhiteSpace(href))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
if (!seen.Add(href))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
var text = NormalizeWhitespace(anchor.TextContent);
|
||||
if (string.IsNullOrEmpty(text))
|
||||
{
|
||||
text = href;
|
||||
}
|
||||
|
||||
references.Add(new AcscReferenceDto(text, href));
|
||||
}
|
||||
|
||||
return references;
|
||||
}
|
||||
|
||||
private static IReadOnlyDictionary<string, string> ExtractFields(IElement? root, out string? serialNumber, out string? advisoryType)
|
||||
{
|
||||
serialNumber = null;
|
||||
advisoryType = null;
|
||||
|
||||
if (root is null)
|
||||
{
|
||||
return EmptyFields;
|
||||
}
|
||||
|
||||
var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
foreach (var element in root.QuerySelectorAll("strong"))
|
||||
{
|
||||
var labelRaw = NormalizeWhitespace(element.TextContent);
|
||||
if (string.IsNullOrEmpty(labelRaw))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
var label = labelRaw.TrimEnd(':').Trim();
|
||||
if (string.IsNullOrEmpty(label))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
var key = NormalizeFieldKey(label);
|
||||
if (string.IsNullOrEmpty(key))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
var value = ExtractFieldValue(element);
|
||||
if (string.IsNullOrEmpty(value))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
if (!map.ContainsKey(key))
|
||||
{
|
||||
map[key] = value;
|
||||
}
|
||||
|
||||
if (string.Equals(key, "serialNumber", StringComparison.OrdinalIgnoreCase))
|
||||
{
|
||||
serialNumber ??= value;
|
||||
}
|
||||
else if (string.Equals(key, "advisoryType", StringComparison.OrdinalIgnoreCase))
|
||||
{
|
||||
advisoryType ??= value;
|
||||
}
|
||||
}
|
||||
|
||||
return map.Count == 0
|
||||
? EmptyFields
|
||||
: map;
|
||||
}
|
||||
|
||||
private static string? ExtractFieldValue(IElement strongElement)
|
||||
{
|
||||
var builder = new StringBuilder();
|
||||
var node = strongElement.NextSibling;
|
||||
|
||||
while (node is not null)
|
||||
{
|
||||
if (node.NodeType == NodeType.Text)
|
||||
{
|
||||
builder.Append(node.TextContent);
|
||||
}
|
||||
else if (node is IElement element)
|
||||
{
|
||||
builder.Append(element.TextContent);
|
||||
}
|
||||
|
||||
node = node.NextSibling;
|
||||
}
|
||||
|
||||
var value = builder.ToString();
|
||||
if (string.IsNullOrWhiteSpace(value))
|
||||
{
|
||||
var parent = strongElement.ParentElement;
|
||||
if (parent is not null)
|
||||
{
|
||||
var parentText = parent.TextContent ?? string.Empty;
|
||||
var trimmed = parentText.Replace(strongElement.TextContent ?? string.Empty, string.Empty, StringComparison.OrdinalIgnoreCase);
|
||||
value = trimmed;
|
||||
}
|
||||
}
|
||||
|
||||
value = NormalizeWhitespace(value);
|
||||
if (string.IsNullOrEmpty(value))
|
||||
{
|
||||
return null;
|
||||
}
|
||||
|
||||
value = value.TrimStart(':', '-', '–', '—', ' ');
|
||||
return value.Trim();
|
||||
}
|
||||
|
||||
private static IReadOnlyList<string> BuildAliases(string? serialNumber, string? advisoryType)
|
||||
{
|
||||
var aliases = new List<string>(capacity: 2);
|
||||
if (!string.IsNullOrWhiteSpace(serialNumber))
|
||||
{
|
||||
aliases.Add(serialNumber.Trim());
|
||||
}
|
||||
|
||||
if (!string.IsNullOrWhiteSpace(advisoryType))
|
||||
{
|
||||
aliases.Add(advisoryType.Trim());
|
||||
}
|
||||
|
||||
return aliases.Count == 0 ? Array.Empty<string>() : aliases;
|
||||
}
|
||||
|
||||
private static string NormalizeFieldKey(string label)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(label))
|
||||
{
|
||||
return string.Empty;
|
||||
}
|
||||
|
||||
var builder = new StringBuilder(label.Length);
|
||||
var upperNext = false;
|
||||
|
||||
foreach (var c in label)
|
||||
{
|
||||
if (char.IsLetterOrDigit(c))
|
||||
{
|
||||
if (builder.Length == 0)
|
||||
{
|
||||
builder.Append(char.ToLowerInvariant(c));
|
||||
}
|
||||
else if (upperNext)
|
||||
{
|
||||
builder.Append(char.ToUpperInvariant(c));
|
||||
upperNext = false;
|
||||
}
|
||||
else
|
||||
{
|
||||
builder.Append(char.ToLowerInvariant(c));
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
if (builder.Length > 0)
|
||||
{
|
||||
upperNext = true;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return builder.Length == 0 ? label.Trim() : builder.ToString();
|
||||
}
|
||||
|
||||
private static string NormalizeWhitespace(string? value)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(value))
|
||||
{
|
||||
return string.Empty;
|
||||
}
|
||||
|
||||
var builder = new StringBuilder(value.Length);
|
||||
var previousIsWhitespace = false;
|
||||
|
||||
foreach (var ch in value)
|
||||
{
|
||||
if (char.IsWhiteSpace(ch))
|
||||
{
|
||||
if (!previousIsWhitespace)
|
||||
{
|
||||
builder.Append(' ');
|
||||
previousIsWhitespace = true;
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
builder.Append(ch);
|
||||
previousIsWhitespace = false;
|
||||
}
|
||||
|
||||
return builder.ToString().Trim();
|
||||
}
|
||||
private static readonly IReadOnlyDictionary<string, string> EmptyFields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
|
||||
}
|
||||
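The label-to-key normalisation performed by `NormalizeFieldKey` above is easier to follow in isolation. The following is a minimal sketch that restates the same camel-casing rule; the type name `FieldKeySketch` is hypothetical and does not exist in the connector.

```csharp
using System;
using System.Text;

// Hypothetical stand-alone restatement of the NormalizeFieldKey logic shown in the diff.
public static class FieldKeySketch
{
    public static string Normalize(string label)
    {
        if (string.IsNullOrWhiteSpace(label))
        {
            return string.Empty;
        }

        var builder = new StringBuilder(label.Length);
        var upperNext = false;

        foreach (var c in label)
        {
            if (!char.IsLetterOrDigit(c))
            {
                // Separators only count once the key has started.
                if (builder.Length > 0)
                {
                    upperNext = true;
                }

                continue;
            }

            if (builder.Length > 0 && upperNext)
            {
                // First letter or digit after a separator starts a new camel-case word.
                builder.Append(char.ToUpperInvariant(c));
                upperNext = false;
            }
            else
            {
                builder.Append(char.ToLowerInvariant(c));
            }
        }

        // Labels without any letters or digits fall back to the trimmed input.
        return builder.Length == 0 ? label.Trim() : builder.ToString();
    }
}
```

With this rule a `<strong>Serial number:</strong>` label becomes the key `serialNumber`, which is the key `ExtractFields` compares against when it fills its `serialNumber` out parameter.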
312
src/StellaOps.Concelier.Connector.Acsc/Internal/AcscMapper.cs
Normal file
@@ -0,0 +1,312 @@
|
||||
using System.Security.Cryptography;
|
||||
using System.Text;
|
||||
using System.Text.RegularExpressions;
|
||||
using StellaOps.Concelier.Models;
|
||||
using StellaOps.Concelier.Storage.Mongo.Documents;
|
||||
using StellaOps.Concelier.Storage.Mongo.Dtos;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc.Internal;
|
||||
|
||||
internal static class AcscMapper
|
||||
{
|
||||
private static readonly Regex CveRegex = new("CVE-\\d{4}-\\d{4,7}", RegexOptions.IgnoreCase | RegexOptions.Compiled);
|
||||
|
||||
public static IReadOnlyList<Advisory> Map(
|
||||
AcscFeedDto feed,
|
||||
DocumentRecord document,
|
||||
DtoRecord dtoRecord,
|
||||
string sourceName,
|
||||
DateTimeOffset mappedAt)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(feed);
|
||||
ArgumentNullException.ThrowIfNull(document);
|
||||
ArgumentNullException.ThrowIfNull(dtoRecord);
|
||||
ArgumentException.ThrowIfNullOrEmpty(sourceName);
|
||||
|
||||
if (feed.Entries is null || feed.Entries.Count == 0)
|
||||
{
|
||||
return Array.Empty<Advisory>();
|
||||
}
|
||||
|
||||
var advisories = new List<Advisory>(feed.Entries.Count);
|
||||
foreach (var entry in feed.Entries)
|
||||
{
|
||||
if (entry is null)
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
var advisoryKey = CreateAdvisoryKey(sourceName, feed.FeedSlug, entry);
|
||||
var fetchProvenance = new AdvisoryProvenance(
|
||||
sourceName,
|
||||
"document",
|
||||
document.Uri,
|
||||
document.FetchedAt.ToUniversalTime(),
|
||||
fieldMask: new[] { "summary", "aliases", "references", "affectedpackages" });
|
||||
|
||||
var feedProvenance = new AdvisoryProvenance(
|
||||
sourceName,
|
||||
"feed",
|
||||
feed.FeedSlug ?? string.Empty,
|
||||
feed.ParsedAt.ToUniversalTime(),
|
||||
fieldMask: new[] { "summary" });
|
||||
|
||||
var mappingProvenance = new AdvisoryProvenance(
|
||||
sourceName,
|
||||
"mapping",
|
||||
entry.EntryId ?? entry.Link ?? advisoryKey,
|
||||
mappedAt.ToUniversalTime(),
|
||||
fieldMask: new[] { "summary", "aliases", "references", "affectedpackages" });
|
||||
|
||||
var provenance = new[]
|
||||
{
|
||||
fetchProvenance,
|
||||
feedProvenance,
|
||||
mappingProvenance,
|
||||
};
|
||||
|
||||
var aliases = BuildAliases(entry);
|
||||
var severity = TryGetSeverity(entry.Fields);
|
||||
var references = BuildReferences(entry, sourceName, mappedAt);
|
||||
var affectedPackages = BuildAffectedPackages(entry, sourceName, mappedAt);
|
||||
|
||||
var advisory = new Advisory(
|
||||
advisoryKey,
|
||||
string.IsNullOrWhiteSpace(entry.Title) ? $"ACSC Advisory {entry.EntryId}" : entry.Title,
|
||||
string.IsNullOrWhiteSpace(entry.Summary) ? null : entry.Summary,
|
||||
language: "en",
|
||||
published: entry.Published?.ToUniversalTime() ?? feed.FeedUpdated?.ToUniversalTime() ?? document.FetchedAt.ToUniversalTime(),
|
||||
modified: entry.Updated?.ToUniversalTime(),
|
||||
severity: severity,
|
||||
exploitKnown: false,
|
||||
aliases: aliases,
|
||||
references: references,
|
||||
affectedPackages: affectedPackages,
|
||||
cvssMetrics: Array.Empty<CvssMetric>(),
|
||||
provenance: provenance);
|
||||
|
||||
advisories.Add(advisory);
|
||||
}
|
||||
|
||||
return advisories;
|
||||
}
|
||||
|
||||
private static IReadOnlyList<string> BuildAliases(AcscEntryDto entry)
|
||||
{
|
||||
var aliases = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
if (!string.IsNullOrWhiteSpace(entry.EntryId))
|
||||
{
|
||||
aliases.Add(entry.EntryId.Trim());
|
||||
}
|
||||
|
||||
foreach (var alias in entry.Aliases ?? Array.Empty<string>())
|
||||
{
|
||||
if (!string.IsNullOrWhiteSpace(alias))
|
||||
{
|
||||
aliases.Add(alias.Trim());
|
||||
}
|
||||
}
|
||||
|
||||
foreach (var match in CveRegex.Matches(entry.Summary ?? string.Empty).Cast<Match>())
|
||||
{
|
||||
var value = match.Value.ToUpperInvariant();
|
||||
aliases.Add(value);
|
||||
}
|
||||
|
||||
foreach (var match in CveRegex.Matches(entry.ContentText ?? string.Empty).Cast<Match>())
|
||||
{
|
||||
var value = match.Value.ToUpperInvariant();
|
||||
aliases.Add(value);
|
||||
}
|
||||
|
||||
return aliases.Count == 0
|
||||
? Array.Empty<string>()
|
||||
: aliases.OrderBy(static value => value, StringComparer.OrdinalIgnoreCase).ToArray();
|
||||
}
|
||||
|
||||
private static IReadOnlyList<AdvisoryReference> BuildReferences(AcscEntryDto entry, string sourceName, DateTimeOffset recordedAt)
|
||||
{
|
||||
var references = new List<AdvisoryReference>();
|
||||
var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
void AddReference(string? url, string? kind, string? sourceTag, string? summary)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(url))
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
if (!Validation.LooksLikeHttpUrl(url))
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
if (!seen.Add(url))
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
references.Add(new AdvisoryReference(
|
||||
url,
|
||||
kind,
|
||||
sourceTag,
|
||||
summary,
|
||||
new AdvisoryProvenance(sourceName, "reference", url, recordedAt.ToUniversalTime())));
|
||||
}
|
||||
|
||||
AddReference(entry.Link, "advisory", entry.FeedSlug, entry.Title);
|
||||
|
||||
foreach (var reference in entry.References ?? Array.Empty<AcscReferenceDto>())
|
||||
{
|
||||
if (reference is null)
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
AddReference(reference.Url, "reference", null, reference.Title);
|
||||
}
|
||||
|
||||
return references.Count == 0
|
||||
? Array.Empty<AdvisoryReference>()
|
||||
: references
|
||||
.OrderBy(static reference => reference.Url, StringComparer.OrdinalIgnoreCase)
|
||||
.ToArray();
|
||||
}
|
||||
|
||||
private static IReadOnlyList<AffectedPackage> BuildAffectedPackages(AcscEntryDto entry, string sourceName, DateTimeOffset recordedAt)
|
||||
{
|
||||
if (entry.Fields is null || entry.Fields.Count == 0)
|
||||
{
|
||||
return Array.Empty<AffectedPackage>();
|
||||
}
|
||||
|
||||
if (!entry.Fields.TryGetValue("systemsAffected", out var systemsAffected) && !entry.Fields.TryGetValue("productsAffected", out systemsAffected))
|
||||
{
|
||||
return Array.Empty<AffectedPackage>();
|
||||
}
|
||||
|
||||
if (string.IsNullOrWhiteSpace(systemsAffected))
|
||||
{
|
||||
return Array.Empty<AffectedPackage>();
|
||||
}
|
||||
|
||||
var identifiers = systemsAffected
|
||||
.Split(new[] { ',', ';', '\n' }, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries) // TrimEntries plus RemoveEmptyEntries already trims entries and drops whitespace-only ones
|
||||
.Distinct(StringComparer.OrdinalIgnoreCase)
|
||||
.ToArray();
|
||||
|
||||
if (identifiers.Length == 0)
|
||||
{
|
||||
return Array.Empty<AffectedPackage>();
|
||||
}
|
||||
|
||||
var packages = new List<AffectedPackage>(identifiers.Length);
|
||||
foreach (var identifier in identifiers)
|
||||
{
|
||||
var provenance = new[]
|
||||
{
|
||||
new AdvisoryProvenance(sourceName, "affected", identifier, recordedAt.ToUniversalTime(), fieldMask: new[] { "affectedpackages" }),
|
||||
};
|
||||
|
||||
packages.Add(new AffectedPackage(
|
||||
AffectedPackageTypes.Vendor,
|
||||
identifier,
|
||||
platform: null,
|
||||
versionRanges: Array.Empty<AffectedVersionRange>(),
|
||||
statuses: Array.Empty<AffectedPackageStatus>(),
|
||||
provenance: provenance,
|
||||
normalizedVersions: Array.Empty<NormalizedVersionRule>()));
|
||||
}
|
||||
|
||||
return packages
|
||||
.OrderBy(static package => package.Identifier, StringComparer.OrdinalIgnoreCase)
|
||||
.ToArray();
|
||||
}
|
||||
|
||||
private static string? TryGetSeverity(IReadOnlyDictionary<string, string> fields)
|
||||
{
|
||||
if (fields is null || fields.Count == 0)
|
||||
{
|
||||
return null;
|
||||
}
|
||||
|
||||
var keys = new[]
|
||||
{
|
||||
"severity",
|
||||
"riskLevel",
|
||||
"threatLevel",
|
||||
"impact",
|
||||
};
|
||||
|
||||
foreach (var key in keys)
|
||||
{
|
||||
if (fields.TryGetValue(key, out var value) && !string.IsNullOrWhiteSpace(value))
|
||||
{
|
||||
return value.Trim();
|
||||
}
|
||||
}
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
private static string CreateAdvisoryKey(string sourceName, string? feedSlug, AcscEntryDto entry)
|
||||
{
|
||||
var slug = string.IsNullOrWhiteSpace(feedSlug) ? "general" : ToSlug(feedSlug);
|
||||
var candidate = !string.IsNullOrWhiteSpace(entry.EntryId)
|
||||
? entry.EntryId
|
||||
: !string.IsNullOrWhiteSpace(entry.Link)
|
||||
? entry.Link
|
||||
: entry.Title;
|
||||
|
||||
var identifier = !string.IsNullOrWhiteSpace(candidate) ? ToSlug(candidate!) : null;
|
||||
if (string.IsNullOrEmpty(identifier))
|
||||
{
|
||||
identifier = CreateHash(entry.Title ?? Guid.NewGuid().ToString());
|
||||
}
|
||||
|
||||
return $"{sourceName}/{slug}/{identifier}";
|
||||
}
|
||||
|
||||
private static string ToSlug(string value)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(value))
|
||||
{
|
||||
return "unknown";
|
||||
}
|
||||
|
||||
var builder = new StringBuilder(value.Length);
|
||||
var previousDash = false;
|
||||
|
||||
foreach (var ch in value)
|
||||
{
|
||||
if (char.IsLetterOrDigit(ch))
|
||||
{
|
||||
builder.Append(char.ToLowerInvariant(ch));
|
||||
previousDash = false;
|
||||
}
|
||||
else if (!previousDash)
|
||||
{
|
||||
builder.Append('-');
|
||||
previousDash = true;
|
||||
}
|
||||
}
|
||||
|
||||
var slug = builder.ToString().Trim('-');
|
||||
if (string.IsNullOrEmpty(slug))
|
||||
{
|
||||
slug = CreateHash(value);
|
||||
}
|
||||
|
||||
return slug.Length <= 64 ? slug : slug[..64];
|
||||
}
|
||||
|
||||
private static string CreateHash(string value)
|
||||
{
|
||||
var bytes = Encoding.UTF8.GetBytes(value);
|
||||
var hash = SHA256.HashData(bytes);
|
||||
return Convert.ToHexString(hash).ToLowerInvariant()[..16];
|
||||
}
|
||||
}
|
||||
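The advisory key built by `CreateAdvisoryKey` above has the shape `{sourceName}/{feedSlug}/{identifier}`, with both trailing segments passed through `ToSlug`. Below is a minimal sketch of that composition, assuming the same slug rules as the diff (lower-cased letters and digits, runs of other characters collapsed to one dash, a 64-character cap). `AdvisoryKeySketch` and its `CreateKey` signature are hypothetical names used only for illustration.

```csharp
#nullable enable
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical stand-alone restatement of ToSlug / CreateAdvisoryKey from the diff.
public static class AdvisoryKeySketch
{
    public static string ToSlug(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return "unknown";
        }

        var builder = new StringBuilder(value.Length);
        var previousDash = false;

        foreach (var ch in value)
        {
            if (char.IsLetterOrDigit(ch))
            {
                builder.Append(char.ToLowerInvariant(ch));
                previousDash = false;
            }
            else if (!previousDash)
            {
                // Collapse any run of non-alphanumeric characters into a single dash.
                builder.Append('-');
                previousDash = true;
            }
        }

        var slug = builder.ToString().Trim('-');
        if (slug.Length == 0)
        {
            // Input had no letters or digits: fall back to a short SHA-256 prefix.
            var hash = SHA256.HashData(Encoding.UTF8.GetBytes(value));
            slug = Convert.ToHexString(hash).ToLowerInvariant()[..16];
        }

        return slug.Length <= 64 ? slug : slug[..64];
    }

    public static string CreateKey(string sourceName, string? feedSlug, string candidate)
    {
        var slug = string.IsNullOrWhiteSpace(feedSlug) ? "general" : ToSlug(feedSlug);
        var identifier = ToSlug(candidate);
        return $"{sourceName}/{slug}/{identifier}";
    }
}
```

For example, a source named `acsc`, a feed slug `alerts`, and an entry id `ACSC-2025-010` (an invented id) would yield `acsc/alerts/acsc-2025-010`.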
55
src/StellaOps.Concelier.Connector.Acsc/Jobs.cs
Normal file
@@ -0,0 +1,55 @@
|
||||
using StellaOps.Concelier.Core.Jobs;
|
||||
|
||||
namespace StellaOps.Concelier.Connector.Acsc;
|
||||
|
||||
internal static class AcscJobKinds
|
||||
{
|
||||
public const string Fetch = "source:acsc:fetch";
|
||||
public const string Parse = "source:acsc:parse";
|
||||
public const string Map = "source:acsc:map";
|
||||
public const string Probe = "source:acsc:probe";
|
||||
}
|
||||
|
||||
internal sealed class AcscFetchJob : IJob
|
||||
{
|
||||
private readonly AcscConnector _connector;
|
||||
|
||||
public AcscFetchJob(AcscConnector connector)
|
||||
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
|
||||
|
||||
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
|
||||
=> _connector.FetchAsync(context.Services, cancellationToken);
|
||||
}
|
||||
|
||||
internal sealed class AcscParseJob : IJob
|
||||
{
|
||||
private readonly AcscConnector _connector;
|
||||
|
||||
public AcscParseJob(AcscConnector connector)
|
||||
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
|
||||
|
||||
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
|
||||
=> _connector.ParseAsync(context.Services, cancellationToken);
|
||||
}
|
||||
|
||||
internal sealed class AcscMapJob : IJob
|
||||
{
|
||||
private readonly AcscConnector _connector;
|
||||
|
||||
public AcscMapJob(AcscConnector connector)
|
||||
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
|
||||
|
||||
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
|
||||
=> _connector.MapAsync(context.Services, cancellationToken);
|
||||
}
|
||||
|
||||
internal sealed class AcscProbeJob : IJob
|
||||
{
|
||||
private readonly AcscConnector _connector;
|
||||
|
||||
public AcscProbeJob(AcscConnector connector)
|
||||
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
|
||||
|
||||
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
|
||||
=> _connector.ProbeAsync(cancellationToken);
|
||||
}
|
||||
@@ -0,0 +1,4 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
|
||||
[assembly: InternalsVisibleTo("FixtureUpdater")]
|
||||
[assembly: InternalsVisibleTo("StellaOps.Concelier.Connector.Acsc.Tests")]
|
||||
68
src/StellaOps.Concelier.Connector.Acsc/README.md
Normal file
@@ -0,0 +1,68 @@
|
||||
## StellaOps.Concelier.Connector.Acsc
|
||||
|
||||
Australian Cyber Security Centre (ACSC) connector that ingests RSS/Atom advisories, sanitises embedded HTML, and maps entries into canonical `Advisory` records for Concelier.
|
||||
|
||||
### Configuration
|
||||
Settings live under `concelier:sources:acsc` (see `AcscOptions`):
|
||||
|
||||
| Setting | Description | Default |
| --- | --- | --- |
| `baseEndpoint` | Base URI for direct ACSC requests (trailing slash required). | `https://www.cyber.gov.au/` |
| `relayEndpoint` | Optional relay host to fall back to when Akamai refuses HTTP/2. | empty |
| `preferRelayByDefault` | Default endpoint preference when no cursor state exists. | `false` |
| `enableRelayFallback` | Allows automatic relay fallback when direct fetch fails. | `true` |
| `forceRelay` | Forces all fetches through the relay (skips direct attempts). | `false` |
| `feeds` | Array of feed descriptors (`slug`, `relativePath`, `enabled`). | alerts/advisories enabled |
| `requestTimeout` | Per-request timeout override. | 45 seconds |
| `failureBackoff` | Backoff window when fetch fails. | 5 minutes |
| `initialBackfill` | Sliding window used to seed published cursors. | 120 days |
| `userAgent` | Outbound `User-Agent` header. | `StellaOps/Concelier (+https://stella-ops.org)` |
| `requestVersion`/`versionPolicy` | HTTP version negotiation knobs. | HTTP/2 with downgrade |
|
||||
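To make the defaults and the trailing-slash requirement concrete, here is a minimal sketch of an options type that mirrors the table. It is an assumption-laden illustration: `AcscOptionsSketch`, `ResolveFeedUri`, and the sample relative path are hypothetical, and the real `AcscOptions` type may be shaped differently.

```csharp
#nullable enable
using System;

// Hypothetical mirror of the settings table; not the connector's actual AcscOptions.
public sealed class AcscOptionsSketch
{
    public Uri BaseEndpoint { get; set; } = new Uri("https://www.cyber.gov.au/");
    public Uri? RelayEndpoint { get; set; }
    public bool PreferRelayByDefault { get; set; }
    public bool EnableRelayFallback { get; set; } = true;
    public bool ForceRelay { get; set; }
    public TimeSpan RequestTimeout { get; set; } = TimeSpan.FromSeconds(45);
    public TimeSpan FailureBackoff { get; set; } = TimeSpan.FromMinutes(5);
    public TimeSpan InitialBackfill { get; set; } = TimeSpan.FromDays(120);
    public string UserAgent { get; set; } = "StellaOps/Concelier (+https://stella-ops.org)";

    // Combines a feed's relative path with the selected endpoint.
    // Falls back to the direct endpoint when no relay is configured.
    public Uri ResolveFeedUri(string relativePath, bool useRelay)
    {
        var root = useRelay && RelayEndpoint is not null ? RelayEndpoint : BaseEndpoint;
        return new Uri(root, relativePath);
    }
}
```

The trailing slash matters because `new Uri(baseUri, relativePath)` follows standard relative-reference resolution: a base of `https://host/proxy` would drop the `proxy` segment, whereas `https://host/proxy/` keeps it.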
|
||||
The dependency injection routine registers the connector plus scheduled jobs:
|
||||
|
||||
| Job | Cron | Purpose |
| --- | --- | --- |
| `source:acsc:fetch` | `7,37 * * * *` | Fetch RSS/Atom feeds (direct + relay fallback). |
| `source:acsc:parse` | `12,42 * * * *` | Persist sanitised DTOs (`acsc.feed.v1`). |
| `source:acsc:map` | `17,47 * * * *` | Map DTO entries into canonical advisories. |
| `source:acsc:probe` | `25,55 * * * *` | Verify direct endpoint health and adjust cursor preference. |
|
||||
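Reading the cron column: each job fires twice an hour, and the stages are staggered so that parse runs five minutes after fetch and map five minutes after parse. The sketch below only illustrates how a `m1,m2 * * * *` expression resolves to the next firing time; `AcscScheduleSketch` is a hypothetical helper, not part of the scheduler.

```csharp
using System;
using System.Linq;

// Hypothetical helper illustrating the "minute list, every hour" cron shape used above.
public static class AcscScheduleSketch
{
    // Minutes past the hour, copied from the cron column of the job table.
    public static readonly (string Kind, int[] Minutes)[] Jobs =
    {
        ("source:acsc:fetch", new[] { 7, 37 }),
        ("source:acsc:parse", new[] { 12, 42 }),
        ("source:acsc:map", new[] { 17, 47 }),
        ("source:acsc:probe", new[] { 25, 55 }),
    };

    // Returns the first firing time strictly after `now`.
    public static DateTimeOffset NextRun(int[] minutes, DateTimeOffset now)
    {
        var hour = new DateTimeOffset(now.Year, now.Month, now.Day, now.Hour, 0, 0, now.Offset);

        foreach (var minute in minutes.OrderBy(m => m))
        {
            var candidate = hour.AddMinutes(minute);
            if (candidate > now)
            {
                return candidate;
            }
        }

        // Every slot in this hour has passed: use the earliest slot of the next hour.
        return hour.AddHours(1).AddMinutes(minutes.Min());
    }
}
```

Under this reading, a document fetched at minute 07 is normally parsed at minute 12 and mapped at minute 17 of the same hour.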
|
||||
### Metrics
|
||||
Emitted via `AcscDiagnostics` (`Meter` = `StellaOps.Concelier.Connector.Acsc`):
|
||||
|
||||
| Instrument | Unit | Description |
| --- | --- | --- |
| `acsc.fetch.attempts` | operations | Feed fetch attempts (tags: `feed`, `mode`). |
| `acsc.fetch.success` | operations | Successful fetches. |
| `acsc.fetch.failures` | operations | Failed fetches before retry backoff. |
| `acsc.fetch.unchanged` | operations | 304 Not Modified responses. |
| `acsc.fetch.fallbacks` | operations | Relay fallbacks triggered (`reason` tag). |
| `acsc.cursor.published_updates` | feeds | Published cursor updates per feed slug. |
| `acsc.parse.attempts` | documents | Parse attempts per feed. |
| `acsc.parse.success` | documents | Successful RSS → DTO conversions. |
| `acsc.parse.failures` | documents | Parse failures (tags: `feed`, `reason`). |
| `acsc.map.success` | advisories | Advisories emitted from a mapping pass. |
|
||||
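As a rough illustration of how such instruments are typically declared with `System.Diagnostics.Metrics`, the sketch below wires two of the counters from the table. The meter name, instrument names, units, and tag names come from the table; the class `AcscDiagnosticsSketch` and its method names are hypothetical, and the real `AcscDiagnostics` may be organised differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical sketch; only the meter, instrument, unit, and tag names come from the README table.
public sealed class AcscDiagnosticsSketch : IDisposable
{
    public const string MeterName = "StellaOps.Concelier.Connector.Acsc";

    private readonly Meter _meter = new Meter(MeterName);
    private readonly Counter<long> _fetchAttempts;
    private readonly Counter<long> _parseFailures;

    public AcscDiagnosticsSketch()
    {
        _fetchAttempts = _meter.CreateCounter<long>("acsc.fetch.attempts", unit: "operations");
        _parseFailures = _meter.CreateCounter<long>("acsc.parse.failures", unit: "documents");
    }

    // Records one fetch attempt, tagged with the feed slug and the fetch mode.
    public void FetchAttempt(string feed, string mode)
        => _fetchAttempts.Add(
            1,
            new KeyValuePair<string, object?>("feed", feed),
            new KeyValuePair<string, object?>("mode", mode));

    // Records one parse failure, tagged with the feed slug and the failure reason.
    public void ParseFailure(string feed, string reason)
        => _parseFailures.Add(
            1,
            new KeyValuePair<string, object?>("feed", feed),
            new KeyValuePair<string, object?>("reason", reason));

    public void Dispose() => _meter.Dispose();
}
```

A collector would subscribe to the meter by name, for example through a `MeterListener` in tests or an OpenTelemetry exporter in production.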
|
||||
### Logging
|
||||
Key log messages include:
|
||||
- Fetch successes/failures, HTTP status codes, and relay fallbacks.
- Parse failures with reasons (download, schema, sanitisation).
- Mapping summaries showing advisory counts per document.
- Probe results toggling relay usage.
|
||||
|
||||
Logs include feed slug metadata for troubleshooting parallel ingestion.
|
||||
|
||||
### Tests & fixtures
|
||||
`StellaOps.Concelier.Connector.Acsc.Tests` exercises the fetch→parse→map pipeline using canned RSS content. Deterministic snapshots live in `Acsc/Fixtures`. To refresh them after intentional behavioural changes:
|
||||
|
||||
```bash
UPDATE_ACSC_FIXTURES=1 dotnet test src/StellaOps.Concelier.Connector.Acsc.Tests/StellaOps.Concelier.Connector.Acsc.Tests.csproj
```
|
||||
|
||||
If assertions fail and you have not refreshed the fixtures, review the generated `.actual.json` files to see how the output differs from the snapshot.
|
||||
|
||||
### Operational notes
|
||||
- Keep the relay endpoint allowlisted for air-gapped deployments; the probe job will automatically switch back to direct fetching when Akamai stabilises.
- Mapping currently emits vendor `affectedPackages` from “Systems/Products affected” fields; expand range primitives once structured version data appears in ACSC feeds.
- The connector is offline-friendly—no outbound calls beyond the configured feeds.
|
||||
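The vendor `affectedPackages` note can be illustrated with a small sketch of how a free-text "Systems affected" value is split into identifiers. It follows the same steps as `BuildAffectedPackages` in `AcscMapper` (split on commas, semicolons, and newlines; case-insensitive de-duplication; ordinal case-insensitive ordering). `AffectedSystemsSketch` is a hypothetical name and the sample input is invented.

```csharp
#nullable enable
using System;
using System.Linq;

// Hypothetical stand-alone version of the identifier split used by the mapper.
public static class AffectedSystemsSketch
{
    public static string[] Split(string? systemsAffected)
    {
        if (string.IsNullOrWhiteSpace(systemsAffected))
        {
            return Array.Empty<string>();
        }

        return systemsAffected
            // TrimEntries together with RemoveEmptyEntries also drops whitespace-only items.
            .Split(new[] { ',', ';', '\n' }, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .OrderBy(static value => value, StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }
}
```

Each resulting identifier becomes one vendor-typed package with empty version ranges, which is why the note above calls for range primitives once the feeds carry structured version data.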
@@ -0,0 +1,18 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<TargetFramework>net10.0</TargetFramework>
|
||||
<ImplicitUsings>enable</ImplicitUsings>
|
||||
<Nullable>enable</Nullable>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="../StellaOps.Plugin/StellaOps.Plugin.csproj" />
|
||||
|
||||
<ProjectReference Include="../StellaOps.Concelier.Connector.Common/StellaOps.Concelier.Connector.Common.csproj" />
|
||||
<ProjectReference Include="../StellaOps.Concelier.Models/StellaOps.Concelier.Models.csproj" />
|
||||
<ProjectReference Include="../StellaOps.Concelier.Storage.Mongo/StellaOps.Concelier.Storage.Mongo.csproj" />
|
||||
<ProjectReference Include="../StellaOps.Concelier.Core/StellaOps.Concelier.Core.csproj" />
|
||||
</ItemGroup>
|
||||
</Project>
|
||||
|
||||
11
src/StellaOps.Concelier.Connector.Acsc/TASKS.md
Normal file
@@ -0,0 +1,11 @@
|
||||
# TASKS
|
||||
| Task | Owner(s) | Depends on | Notes |
|---|---|---|---|
|FEEDCONN-ACSC-02-001 Source discovery & feed contract|BE-Conn-ACSC|Research|**DONE (2025-10-11)** – Catalogued feed slugs `/acsc/view-all-content/{alerts,advisories,news,publications,threats}/rss`; every endpoint currently negotiates HTTP/2, then aborts with `INTERNAL_ERROR` (curl exit 92), and hangs for more than 600 s when forced to `--http1.1`. Documented traces + mitigations in `docs/concelier-connector-research-20251011.md` and opened `FEEDCONN-SHARED-HTTP2-001` for shared handler tweaks (force `RequestVersionOrLower`, jittered retries, relay option).|
|FEEDCONN-ACSC-02-002 Fetch pipeline & cursor persistence|BE-Conn-ACSC|Source.Common, Storage.Mongo|**DONE (2025-10-12)** – HTTP client now pins `HttpRequestMessage.VersionPolicy = RequestVersionOrLower`, forces `AutomaticDecompression = GZip \| Deflate`, and sends `User-Agent: StellaOps/Concelier (+https://stella-ops.org)` via `AddAcscConnector`. Fetch pipeline implemented in `AcscConnector` with relay-aware fallback (`AcscProbeJob` seeds preference), deterministic cursor updates (`preferredEndpoint`, published timestamp per feed), and metadata-deduped documents. Unit tests `AcscConnectorFetchTests` + `AcscHttpClientConfigurationTests` cover direct/relay flows and client wiring.|
|FEEDCONN-ACSC-02-003 Parser & DTO sanitiser|BE-Conn-ACSC|Source.Common|**DONE (2025-10-12)** – Added `AcscFeedParser` to sanitise RSS payloads, collapse multi-paragraph summaries, dedupe references, and surface `serialNumber`/`advisoryType` fields as structured metadata + alias candidates. `ParseAsync` now materialises `acsc.feed.v1` DTOs, promotes documents to `pending-map`, and advances cursor state. Covered by `AcscConnectorParseTests`.|
|FEEDCONN-ACSC-02-004 Canonical mapper + range primitives|BE-Conn-ACSC|Models|**DONE (2025-10-12)** – Introduced `AcscMapper` and wired `MapAsync` to emit canonical advisories with normalized aliases, source-tagged references, and optional vendor `affectedPackages` derived from “Systems/Products affected” fields. Documents transition to `mapped`, advisories persist via `IAdvisoryStore`, and metrics/logging capture mapped counts. `AcscConnectorParseTests` exercise fetch→parse→map flow.|
|FEEDCONN-ACSC-02-005 Deterministic fixtures & regression tests|QA|Testing|**DONE (2025-10-12)** – `AcscConnectorParseTests` now snapshots fetch→parse→map output via `Acsc/Fixtures/acsc-advisories.snapshot.json`; set `UPDATE_ACSC_FIXTURES=1` to regenerate. Tests assert DTO status transitions, advisory persistence, and state cleanup.|
|FEEDCONN-ACSC-02-006 Diagnostics & documentation|DevEx|Docs|**DONE (2025-10-12)** – Added module README describing configuration, job schedules, metrics (including new `acsc.map.success` counter), relay behaviour, and fixture workflow. Diagnostics updated to count map successes alongside existing fetch/parse metrics.|
|FEEDCONN-ACSC-02-007 Feed retention & pagination validation|BE-Conn-ACSC|Research|**DONE (2025-10-11)** – Relay sampling shows retention ≥ July 2025; the sampling must be re-run once the direct HTTP/2 path is stable to confirm whether the feed caps at ~50 items and whether a `?page=` parameter exists. The pending action is tracked in the shared HTTP downgrade task.|
|FEEDCONN-ACSC-02-008 HTTP client compatibility plan|BE-Conn-ACSC|Source.Common|**DONE (2025-10-11)** – Reproduced Akamai resets, drafted downgrade plan (two-stage HTTP/2 retry + relay fallback), and filed `FEEDCONN-SHARED-HTTP2-001`; module README TODO will host the per-environment knob matrix.|
|
||||
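The HTTP client settings described in FEEDCONN-ACSC-02-002 can be sketched as follows. This is only an approximation under stated assumptions: it sets client-wide defaults rather than per-request `HttpRequestMessage.VersionPolicy`, the 45-second timeout is taken from the README default, and `AcscHttpClientSketch` is a hypothetical name rather than what `AddAcscConnector` actually registers.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical sketch of the client wiring described in the task notes.
public static class AcscHttpClientSketch
{
    public static HttpClient Create()
    {
        var handler = new SocketsHttpHandler
        {
            // Accept gzip- and deflate-compressed responses transparently.
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
        };

        var client = new HttpClient(handler)
        {
            // Ask for HTTP/2 but allow the negotiation to settle on a lower version.
            DefaultRequestVersion = HttpVersion.Version20,
            DefaultVersionPolicy = HttpVersionPolicy.RequestVersionOrLower,
            Timeout = TimeSpan.FromSeconds(45),
        };

        client.DefaultRequestHeaders.UserAgent.ParseAdd("StellaOps/Concelier (+https://stella-ops.org)");
        return client;
    }
}
```

`RequestVersionOrLower` is the policy that permits the downgrade path mentioned in the tasks: the request starts at HTTP/2 and may fall back to HTTP/1.1 when the server does not negotiate it.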