Rename Concelier Source modules to Connector

This commit is contained in:
master
2025-10-18 20:11:18 +03:00
parent 9af1fd6bf0
commit d97779eed6
789 changed files with 1489 additions and 1489 deletions

View File

@@ -0,0 +1,40 @@
# AGENTS
## Role
Bootstrap the ACSC (Australian Cyber Security Centre) advisories connector so the Concelier pipeline can ingest, normalise, and enrich ACSC security bulletins.
## Scope
- Research the authoritative ACSC advisory feed (RSS/Atom, JSON API, or HTML).
- Implement fetch windowing, cursor persistence, and retry strategy consistent with other external connectors (a cursor sketch follows this list).
- Parse advisory content (summary, affected products, mitigation guidance, references).
- Map advisories into canonical `Advisory` records with aliases, references, affected packages, and provenance metadata.
- Provide deterministic fixtures and regression tests that cover fetch/parse/map flows.
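The cursor persistence bullet above boils down to a small per-source record stored between runs. A minimal sketch (field names mirror the `AcscCursor` record added later in this commit; the real type also tracks endpoint preference and handles BSON round-tripping):

```csharp
// Simplified shape of the state the connector persists between runs.
internal sealed record AcscCursorSketch(
    IReadOnlyDictionary<string, DateTimeOffset?> LastPublishedByFeed, // per-feed published watermark
    IReadOnlyCollection<Guid> PendingDocuments,                       // fetched, awaiting parse
    IReadOnlyCollection<Guid> PendingMappings);                       // parsed, awaiting map
```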
## Participants
- `Connector.Common` for HTTP client creation, fetch service, and DTO persistence helpers.
- `Storage.Mongo` for raw/document/DTO/advisory storage plus cursor management.
- `Concelier.Models` for canonical advisory structures and provenance utilities.
- `Concelier.Testing` for integration harnesses and snapshot helpers.
## Interfaces & Contracts
- Job kinds should follow the pattern `acsc:fetch`, `acsc:parse`, `acsc:map` (constants sketched after this list).
- Documents persisted to Mongo must include ETag/Last-Modified metadata when the source exposes it.
- Canonical advisories must emit aliases (ACSC ID + CVE IDs) and references (official bulletin + vendor notices).
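A minimal sketch of the job-kind constants implied by the contract above (the actual `AcscJobKinds` class ships with the DI routine in this commit; the `acsc:probe` value is an assumption extrapolated from the same pattern):

```csharp
internal static class AcscJobKinds
{
    public const string Fetch = "acsc:fetch";
    public const string Parse = "acsc:parse";
    public const string Map   = "acsc:map";
    public const string Probe = "acsc:probe"; // assumed: follows the same naming pattern
}
```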
## In/Out of scope
In scope:
- Initial end-to-end connector implementation with tests, fixtures, and range primitive coverage.
- Minimal telemetry (logging + diagnostics counters) consistent with other connectors.
Out of scope:
- Upstream remediation automation or vendor-specific enrichment beyond ACSC data.
- Export-related changes (handled by exporter teams).
## Observability & Security Expectations
- Log key lifecycle events (fetch/page processed, parse success/error counts, mapping stats).
- Sanitise HTML safely and avoid persisting external scripts or embedded media.
- Handle transient fetch failures gracefully with exponential backoff and mark failures in source state.
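The backoff expectation in the last bullet maps onto the shared source-state repository. A hedged sketch, not the literal connector code (the call shape mirrors the `MarkFailureAsync` usage in `AcscConnector.FetchAsync`; namespaces are assumed from that file's using list):

```csharp
using StellaOps.Concelier.Connector.Acsc;
using StellaOps.Concelier.Connector.Acsc.Configuration;
using StellaOps.Concelier.Storage.Mongo;

internal static class FetchFailureSketch
{
    // Record the failure with the configured backoff so the scheduler eases off the source.
    public static Task MarkFetchFailureAsync(
        ISourceStateRepository stateRepository,
        AcscOptions options,
        TimeProvider timeProvider,
        string reason,
        CancellationToken cancellationToken)
        => stateRepository.MarkFailureAsync(
            AcscConnectorPlugin.SourceName,
            timeProvider.GetUtcNow(),
            options.FailureBackoff,
            reason,
            cancellationToken);
}
```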
## Tests
- Add integration-style tests under `StellaOps.Concelier.Connector.Acsc.Tests` covering fetch/parse/map with canned fixtures.
- Snapshot canonical advisories; provide UPDATE flag flow for regeneration (see the sketch after this list).
- Validate determinism (ordering, casing, timestamps) to satisfy pipeline reproducibility requirements.
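A sketch of the snapshot/UPDATE flow (xUnit style; the `UPDATE_ACSC_FIXTURES` flag name and the inline `actual` payload are placeholders — the real suite should reuse whatever flag and snapshot helper the other connector test projects use):

```csharp
using System;
using System.IO;
using Xunit;

public sealed class AcscSnapshotFlowSketch
{
    [Fact]
    public void MappedAdvisoriesMatchSnapshot()
    {
        // In the real test, "actual" is the canonical advisory JSON produced by running
        // fetch/parse/map against canned fixtures with a fixed TimeProvider.
        var actual = "{ \"advisories\": [] }";
        var snapshotPath = Path.Combine("Fixtures", "acsc-advisories.snapshot.json");

        if (Environment.GetEnvironmentVariable("UPDATE_ACSC_FIXTURES") == "1")
        {
            File.WriteAllText(snapshotPath, actual); // regenerate the snapshot on demand
        }

        Assert.Equal(File.ReadAllText(snapshotPath), actual);
    }
}
```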

View File

@@ -0,0 +1,699 @@
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Xml.Linq;
using System.Text.Json;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using MongoDB.Bson;
using MongoDB.Bson.IO;
using StellaOps.Concelier.Connector.Acsc.Configuration;
using StellaOps.Concelier.Connector.Acsc.Internal;
using StellaOps.Concelier.Connector.Common.Fetch;
using StellaOps.Concelier.Connector.Common.Html;
using StellaOps.Concelier.Connector.Common;
using StellaOps.Concelier.Storage.Mongo;
using StellaOps.Concelier.Storage.Mongo.Documents;
using StellaOps.Concelier.Storage.Mongo.Dtos;
using StellaOps.Concelier.Storage.Mongo.Advisories;
using StellaOps.Plugin;
namespace StellaOps.Concelier.Connector.Acsc;
public sealed class AcscConnector : IFeedConnector
{
private static readonly string[] AcceptHeaders =
{
"application/rss+xml",
"application/atom+xml;q=0.9",
"application/xml;q=0.8",
"text/xml;q=0.7",
};
private static readonly JsonSerializerOptions SerializerOptions = new(JsonSerializerDefaults.Web)
{
PropertyNameCaseInsensitive = true,
WriteIndented = false,
};
private readonly SourceFetchService _fetchService;
private readonly RawDocumentStorage _rawDocumentStorage;
private readonly IDocumentStore _documentStore;
private readonly IDtoStore _dtoStore;
private readonly IAdvisoryStore _advisoryStore;
private readonly ISourceStateRepository _stateRepository;
private readonly IHttpClientFactory _httpClientFactory;
private readonly AcscOptions _options;
private readonly AcscDiagnostics _diagnostics;
private readonly TimeProvider _timeProvider;
private readonly ILogger<AcscConnector> _logger;
private readonly HtmlContentSanitizer _htmlSanitizer = new();
public AcscConnector(
SourceFetchService fetchService,
RawDocumentStorage rawDocumentStorage,
IDocumentStore documentStore,
IDtoStore dtoStore,
IAdvisoryStore advisoryStore,
ISourceStateRepository stateRepository,
IHttpClientFactory httpClientFactory,
IOptions<AcscOptions> options,
AcscDiagnostics diagnostics,
TimeProvider? timeProvider,
ILogger<AcscConnector> logger)
{
_fetchService = fetchService ?? throw new ArgumentNullException(nameof(fetchService));
_rawDocumentStorage = rawDocumentStorage ?? throw new ArgumentNullException(nameof(rawDocumentStorage));
_documentStore = documentStore ?? throw new ArgumentNullException(nameof(documentStore));
_dtoStore = dtoStore ?? throw new ArgumentNullException(nameof(dtoStore));
_advisoryStore = advisoryStore ?? throw new ArgumentNullException(nameof(advisoryStore));
_stateRepository = stateRepository ?? throw new ArgumentNullException(nameof(stateRepository));
_httpClientFactory = httpClientFactory ?? throw new ArgumentNullException(nameof(httpClientFactory));
_options = (options ?? throw new ArgumentNullException(nameof(options))).Value ?? throw new ArgumentNullException(nameof(options));
_options.Validate();
_diagnostics = diagnostics ?? throw new ArgumentNullException(nameof(diagnostics));
_timeProvider = timeProvider ?? TimeProvider.System;
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public string SourceName => AcscConnectorPlugin.SourceName;
public async Task FetchAsync(IServiceProvider services, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(services);
var now = _timeProvider.GetUtcNow();
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
var lastPublished = new Dictionary<string, DateTimeOffset?>(cursor.LastPublishedByFeed, StringComparer.OrdinalIgnoreCase);
var pendingDocuments = cursor.PendingDocuments.ToHashSet();
var pendingMappings = cursor.PendingMappings.ToHashSet();
var failures = new List<(AcscFeedOptions Feed, Exception Error)>();
var preferredEndpoint = ResolveInitialPreference(cursor);
AcscEndpointPreference? successPreference = null;
foreach (var feed in GetEnabledFeeds())
{
cancellationToken.ThrowIfCancellationRequested();
Exception? lastError = null;
bool handled = false;
foreach (var mode in BuildFetchOrder(preferredEndpoint))
{
cancellationToken.ThrowIfCancellationRequested();
if (mode == AcscFetchMode.Relay && !IsRelayConfigured)
{
continue;
}
var modeName = ModeName(mode);
var targetUri = BuildFeedUri(feed, mode);
var metadata = CreateMetadata(feed, cursor, modeName);
var existing = await _documentStore.FindBySourceAndUriAsync(SourceName, targetUri.ToString(), cancellationToken).ConfigureAwait(false);
var request = new SourceFetchRequest(AcscOptions.HttpClientName, SourceName, targetUri)
{
Metadata = metadata,
ETag = existing?.Etag,
LastModified = existing?.LastModified,
AcceptHeaders = AcceptHeaders,
TimeoutOverride = _options.RequestTimeout,
};
try
{
_diagnostics.FetchAttempt(feed.Slug, modeName);
var result = await _fetchService.FetchAsync(request, cancellationToken).ConfigureAwait(false);
if (result.IsNotModified)
{
_diagnostics.FetchUnchanged(feed.Slug, modeName);
successPreference ??= mode switch
{
AcscFetchMode.Relay => AcscEndpointPreference.Relay,
_ => AcscEndpointPreference.Direct,
};
handled = true;
_logger.LogDebug("ACSC feed {Feed} returned 304 via {Mode}", feed.Slug, modeName);
break;
}
if (!result.IsSuccess || result.Document is null)
{
_diagnostics.FetchFailure(feed.Slug, modeName);
lastError = new InvalidOperationException($"Fetch returned no document for {targetUri}");
continue;
}
pendingDocuments.Add(result.Document.Id);
successPreference = mode switch
{
AcscFetchMode.Relay => AcscEndpointPreference.Relay,
_ => AcscEndpointPreference.Direct,
};
handled = true;
_diagnostics.FetchSuccess(feed.Slug, modeName);
_logger.LogInformation("ACSC fetched {Feed} via {Mode} (documentId={DocumentId})", feed.Slug, modeName, result.Document.Id);
var latestPublished = await TryComputeLatestPublishedAsync(result.Document, cancellationToken).ConfigureAwait(false);
if (latestPublished.HasValue)
{
if (!lastPublished.TryGetValue(feed.Slug, out var existingPublished) || latestPublished.Value > existingPublished)
{
lastPublished[feed.Slug] = latestPublished.Value;
_diagnostics.CursorUpdated(feed.Slug);
_logger.LogDebug("ACSC feed {Feed} advanced published cursor to {Timestamp:O}", feed.Slug, latestPublished.Value);
}
}
break;
}
catch (HttpRequestException ex) when (ShouldRetryWithRelay(mode))
{
lastError = ex;
_diagnostics.FetchFallback(feed.Slug, modeName, "http-request");
_logger.LogWarning(ex, "ACSC fetch via {Mode} failed for {Feed}; attempting relay fallback.", modeName, feed.Slug);
continue;
}
catch (TaskCanceledException ex) when (ShouldRetryWithRelay(mode))
{
lastError = ex;
_diagnostics.FetchFallback(feed.Slug, modeName, "timeout");
_logger.LogWarning(ex, "ACSC fetch via {Mode} timed out for {Feed}; attempting relay fallback.", modeName, feed.Slug);
continue;
}
catch (Exception ex)
{
lastError = ex;
_diagnostics.FetchFailure(feed.Slug, modeName);
_logger.LogError(ex, "ACSC fetch failed for {Feed} via {Mode}", feed.Slug, modeName);
break;
}
}
if (!handled && lastError is not null)
{
failures.Add((feed, lastError));
}
}
if (failures.Count > 0)
{
var failureReason = string.Join("; ", failures.Select(f => $"{f.Feed.Slug}: {f.Error.Message}"));
await _stateRepository.MarkFailureAsync(SourceName, now, _options.FailureBackoff, failureReason, cancellationToken).ConfigureAwait(false);
throw new AggregateException($"ACSC fetch failed for {failures.Count} feed(s): {failureReason}", failures.Select(f => f.Error));
}
var updatedPreference = successPreference ?? preferredEndpoint;
if (_options.ForceRelay)
{
updatedPreference = AcscEndpointPreference.Relay;
}
else if (!IsRelayConfigured)
{
updatedPreference = AcscEndpointPreference.Direct;
}
var updatedCursor = cursor
.WithPreferredEndpoint(updatedPreference)
.WithPendingDocuments(pendingDocuments)
.WithPendingMappings(pendingMappings)
.WithLastPublished(lastPublished);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
public async Task ParseAsync(IServiceProvider services, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(services);
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
if (cursor.PendingDocuments.Count == 0)
{
return;
}
var pendingDocuments = cursor.PendingDocuments.ToList();
var pendingMappings = cursor.PendingMappings.ToHashSet();
foreach (var documentId in cursor.PendingDocuments)
{
cancellationToken.ThrowIfCancellationRequested();
var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);
if (document is null)
{
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
var metadata = AcscDocumentMetadata.FromDocument(document);
var feedTag = string.IsNullOrWhiteSpace(metadata.FeedSlug) ? "(unknown)" : metadata.FeedSlug;
_diagnostics.ParseAttempt(feedTag);
if (!document.GridFsId.HasValue)
{
_diagnostics.ParseFailure(feedTag, "missingPayload");
_logger.LogWarning("ACSC document {DocumentId} missing GridFS payload (feed={Feed})", document.Id, feedTag);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
byte[] rawBytes;
try
{
rawBytes = await _rawDocumentStorage.DownloadAsync(document.GridFsId.Value, cancellationToken).ConfigureAwait(false);
}
catch (Exception ex)
{
_diagnostics.ParseFailure(feedTag, "download");
_logger.LogError(ex, "ACSC failed to download payload for document {DocumentId} (feed={Feed})", document.Id, feedTag);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
continue;
}
try
{
var parsedAt = _timeProvider.GetUtcNow();
var dto = AcscFeedParser.Parse(rawBytes, metadata.FeedSlug, parsedAt, _htmlSanitizer);
var json = JsonSerializer.Serialize(dto, SerializerOptions);
var payload = BsonDocument.Parse(json);
var existingDto = await _dtoStore.FindByDocumentIdAsync(document.Id, cancellationToken).ConfigureAwait(false);
var dtoRecord = existingDto is null
? new DtoRecord(Guid.NewGuid(), document.Id, SourceName, "acsc.feed.v1", payload, parsedAt)
: existingDto with
{
Payload = payload,
SchemaVersion = "acsc.feed.v1",
ValidatedAt = parsedAt,
};
await _dtoStore.UpsertAsync(dtoRecord, cancellationToken).ConfigureAwait(false);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.PendingMap, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Add(document.Id);
_diagnostics.ParseSuccess(feedTag);
_logger.LogInformation("ACSC parsed document {DocumentId} (feed={Feed}, entries={EntryCount})", document.Id, feedTag, dto.Entries.Count);
}
catch (Exception ex)
{
_diagnostics.ParseFailure(feedTag, "parse");
_logger.LogError(ex, "ACSC parse failed for document {DocumentId} (feed={Feed})", document.Id, feedTag);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingDocuments.Remove(documentId);
pendingMappings.Remove(documentId);
}
}
var updatedCursor = cursor
.WithPendingDocuments(pendingDocuments)
.WithPendingMappings(pendingMappings);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
public async Task MapAsync(IServiceProvider services, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(services);
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
if (cursor.PendingMappings.Count == 0)
{
return;
}
var pendingMappings = cursor.PendingMappings.ToHashSet();
var documentIds = cursor.PendingMappings.ToList();
foreach (var documentId in documentIds)
{
cancellationToken.ThrowIfCancellationRequested();
var dtoRecord = await _dtoStore.FindByDocumentIdAsync(documentId, cancellationToken).ConfigureAwait(false);
var document = await _documentStore.FindAsync(documentId, cancellationToken).ConfigureAwait(false);
if (dtoRecord is null || document is null)
{
pendingMappings.Remove(documentId);
continue;
}
AcscFeedDto? feed;
try
{
var dtoJson = dtoRecord.Payload.ToJson(new JsonWriterSettings
{
OutputMode = JsonOutputMode.RelaxedExtendedJson,
});
feed = JsonSerializer.Deserialize<AcscFeedDto>(dtoJson, SerializerOptions);
}
catch (Exception ex)
{
_logger.LogError(ex, "ACSC mapping failed to deserialize DTO for document {DocumentId}", document.Id);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingMappings.Remove(documentId);
continue;
}
if (feed is null)
{
_logger.LogWarning("ACSC mapping encountered null DTO payload for document {DocumentId}", document.Id);
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Failed, cancellationToken).ConfigureAwait(false);
pendingMappings.Remove(documentId);
continue;
}
var mappedAt = _timeProvider.GetUtcNow();
var advisories = AcscMapper.Map(feed, document, dtoRecord, SourceName, mappedAt);
if (advisories.Count > 0)
{
foreach (var advisory in advisories)
{
await _advisoryStore.UpsertAsync(advisory, cancellationToken).ConfigureAwait(false);
}
_diagnostics.MapSuccess(advisories.Count);
_logger.LogInformation(
"ACSC mapped {Count} advisories from document {DocumentId} (feed={Feed})",
advisories.Count,
document.Id,
feed.FeedSlug ?? "(unknown)");
}
else
{
_logger.LogInformation(
"ACSC mapping produced no advisories for document {DocumentId} (feed={Feed})",
document.Id,
feed.FeedSlug ?? "(unknown)");
}
await _documentStore.UpdateStatusAsync(document.Id, DocumentStatuses.Mapped, cancellationToken).ConfigureAwait(false);
pendingMappings.Remove(documentId);
}
var updatedCursor = cursor.WithPendingMappings(pendingMappings);
await UpdateCursorAsync(updatedCursor, cancellationToken).ConfigureAwait(false);
}
public async Task ProbeAsync(CancellationToken cancellationToken)
{
var cursor = await GetCursorAsync(cancellationToken).ConfigureAwait(false);
if (_options.ForceRelay)
{
if (cursor.PreferredEndpoint != AcscEndpointPreference.Relay)
{
await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Relay), cancellationToken).ConfigureAwait(false);
}
return;
}
if (!IsRelayConfigured)
{
if (cursor.PreferredEndpoint != AcscEndpointPreference.Direct)
{
await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Direct), cancellationToken).ConfigureAwait(false);
}
return;
}
var feed = GetEnabledFeeds().FirstOrDefault();
if (feed is null)
{
return;
}
var httpClient = _httpClientFactory.CreateClient(AcscOptions.HttpClientName);
httpClient.Timeout = TimeSpan.FromSeconds(15);
var directUri = BuildFeedUri(feed, AcscFetchMode.Direct);
try
{
using var headRequest = new HttpRequestMessage(HttpMethod.Head, directUri);
using var response = await httpClient.SendAsync(headRequest, HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false);
if (response.IsSuccessStatusCode)
{
if (cursor.PreferredEndpoint != AcscEndpointPreference.Direct)
{
await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Direct), cancellationToken).ConfigureAwait(false);
_logger.LogInformation("ACSC probe succeeded via direct endpoint ({StatusCode}); relay preference cleared.", (int)response.StatusCode);
}
return;
}
if (response.StatusCode == HttpStatusCode.MethodNotAllowed)
{
using var probeRequest = new HttpRequestMessage(HttpMethod.Get, directUri);
using var probeResponse = await httpClient.SendAsync(probeRequest, HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false);
if (probeResponse.IsSuccessStatusCode)
{
if (cursor.PreferredEndpoint != AcscEndpointPreference.Direct)
{
await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Direct), cancellationToken).ConfigureAwait(false);
_logger.LogInformation("ACSC probe succeeded via direct endpoint after GET fallback ({StatusCode}).", (int)probeResponse.StatusCode);
}
return;
}
}
_logger.LogWarning("ACSC direct probe returned HTTP {StatusCode}; relay preference enabled.", (int)response.StatusCode);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "ACSC direct probe failed; relay preference will be enabled.");
}
if (cursor.PreferredEndpoint != AcscEndpointPreference.Relay)
{
await UpdateCursorAsync(cursor.WithPreferredEndpoint(AcscEndpointPreference.Relay), cancellationToken).ConfigureAwait(false);
}
}
private bool ShouldRetryWithRelay(AcscFetchMode mode)
=> mode == AcscFetchMode.Direct && _options.EnableRelayFallback && IsRelayConfigured && !_options.ForceRelay;
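// Yields the endpoints to try, in order: ForceRelay restricts to the relay (when configured),
// a missing relay restricts to direct, otherwise the preferred endpoint goes first and the
// alternate is offered as a fallback only when EnableRelayFallback is set.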
private IEnumerable<AcscFetchMode> BuildFetchOrder(AcscEndpointPreference preference)
{
if (_options.ForceRelay)
{
if (IsRelayConfigured)
{
yield return AcscFetchMode.Relay;
}
yield break;
}
if (!IsRelayConfigured)
{
yield return AcscFetchMode.Direct;
yield break;
}
var preferRelay = preference == AcscEndpointPreference.Relay;
if (preference == AcscEndpointPreference.Auto)
{
preferRelay = _options.PreferRelayByDefault;
}
if (preferRelay)
{
yield return AcscFetchMode.Relay;
if (_options.EnableRelayFallback)
{
yield return AcscFetchMode.Direct;
}
}
else
{
yield return AcscFetchMode.Direct;
if (_options.EnableRelayFallback)
{
yield return AcscFetchMode.Relay;
}
}
}
private AcscEndpointPreference ResolveInitialPreference(AcscCursor cursor)
{
if (_options.ForceRelay)
{
return AcscEndpointPreference.Relay;
}
if (!IsRelayConfigured)
{
return AcscEndpointPreference.Direct;
}
if (cursor.PreferredEndpoint != AcscEndpointPreference.Auto)
{
return cursor.PreferredEndpoint;
}
return _options.PreferRelayByDefault ? AcscEndpointPreference.Relay : AcscEndpointPreference.Direct;
}
private async Task<DateTimeOffset?> TryComputeLatestPublishedAsync(DocumentRecord document, CancellationToken cancellationToken)
{
if (!document.GridFsId.HasValue)
{
return null;
}
var rawBytes = await _rawDocumentStorage.DownloadAsync(document.GridFsId.Value, cancellationToken).ConfigureAwait(false);
if (rawBytes.Length == 0)
{
return null;
}
try
{
using var memoryStream = new MemoryStream(rawBytes, writable: false);
var xml = XDocument.Load(memoryStream, LoadOptions.None);
DateTimeOffset? latest = null;
foreach (var element in xml.Descendants())
{
if (!IsEntryElement(element.Name.LocalName))
{
continue;
}
var published = ExtractPublished(element);
if (!published.HasValue)
{
continue;
}
if (latest is null || published.Value > latest.Value)
{
latest = published;
}
}
return latest;
}
catch (Exception ex)
{
_logger.LogWarning(ex, "ACSC failed to derive published cursor for document {DocumentId} ({Uri})", document.Id, document.Uri);
return null;
}
}
private static bool IsEntryElement(string localName)
=> string.Equals(localName, "item", StringComparison.OrdinalIgnoreCase)
|| string.Equals(localName, "entry", StringComparison.OrdinalIgnoreCase);
private static DateTimeOffset? ExtractPublished(XElement element)
{
foreach (var name in EnumerateTimestampNames(element))
{
if (DateTimeOffset.TryParse(
name.Value,
CultureInfo.InvariantCulture,
DateTimeStyles.AllowWhiteSpaces | DateTimeStyles.AssumeUniversal,
out var parsed))
{
return parsed.ToUniversalTime();
}
}
return null;
}
private static IEnumerable<XElement> EnumerateTimestampNames(XElement element)
{
foreach (var child in element.Elements())
{
var localName = child.Name.LocalName;
if (string.Equals(localName, "pubDate", StringComparison.OrdinalIgnoreCase) ||
string.Equals(localName, "published", StringComparison.OrdinalIgnoreCase) ||
string.Equals(localName, "updated", StringComparison.OrdinalIgnoreCase) ||
string.Equals(localName, "date", StringComparison.OrdinalIgnoreCase))
{
yield return child;
}
}
}
private Dictionary<string, string> CreateMetadata(AcscFeedOptions feed, AcscCursor cursor, string mode)
{
var metadata = new Dictionary<string, string>(StringComparer.Ordinal)
{
["acsc.feed.slug"] = feed.Slug,
["acsc.fetch.mode"] = mode,
};
if (cursor.LastPublishedByFeed.TryGetValue(feed.Slug, out var published) && published.HasValue)
{
metadata["acsc.cursor.lastPublished"] = published.Value.ToString("O");
}
return metadata;
}
private Uri BuildFeedUri(AcscFeedOptions feed, AcscFetchMode mode)
{
var baseUri = mode switch
{
AcscFetchMode.Relay when IsRelayConfigured => _options.RelayEndpoint!,
_ => _options.BaseEndpoint,
};
return new Uri(baseUri, feed.RelativePath);
}
private IEnumerable<AcscFeedOptions> GetEnabledFeeds()
=> _options.Feeds.Where(feed => feed is { Enabled: true });
private Task<AcscCursor> GetCursorAsync(CancellationToken cancellationToken)
=> GetCursorCoreAsync(cancellationToken);
private async Task<AcscCursor> GetCursorCoreAsync(CancellationToken cancellationToken)
{
var state = await _stateRepository.TryGetAsync(SourceName, cancellationToken).ConfigureAwait(false);
return state is null ? AcscCursor.Empty : AcscCursor.FromBson(state.Cursor);
}
private Task UpdateCursorAsync(AcscCursor cursor, CancellationToken cancellationToken)
{
var document = cursor.ToBsonDocument();
var completedAt = _timeProvider.GetUtcNow();
return _stateRepository.UpdateCursorAsync(SourceName, document, completedAt, cancellationToken);
}
private bool IsRelayConfigured => _options.RelayEndpoint is not null;
private static string ModeName(AcscFetchMode mode) => mode switch
{
AcscFetchMode.Relay => "relay",
_ => "direct",
};
private enum AcscFetchMode
{
Direct = 0,
Relay = 1,
}
}

View File

@@ -0,0 +1,19 @@
using Microsoft.Extensions.DependencyInjection;
using StellaOps.Plugin;
namespace StellaOps.Concelier.Connector.Acsc;
public sealed class AcscConnectorPlugin : IConnectorPlugin
{
public const string SourceName = "acsc";
public string Name => SourceName;
public bool IsAvailable(IServiceProvider services) => services is not null;
public IFeedConnector Create(IServiceProvider services)
{
ArgumentNullException.ThrowIfNull(services);
return ActivatorUtilities.CreateInstance<AcscConnector>(services);
}
}

View File

@@ -0,0 +1,44 @@
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using StellaOps.DependencyInjection;
using StellaOps.Concelier.Core.Jobs;
using StellaOps.Concelier.Connector.Acsc.Configuration;
namespace StellaOps.Concelier.Connector.Acsc;
public sealed class AcscDependencyInjectionRoutine : IDependencyInjectionRoutine
{
private const string ConfigurationSection = "concelier:sources:acsc";
private const string FetchCron = "7,37 * * * *";
private const string ParseCron = "12,42 * * * *";
private const string MapCron = "17,47 * * * *";
private const string ProbeCron = "25,55 * * * *";
private static readonly TimeSpan FetchTimeout = TimeSpan.FromMinutes(4);
private static readonly TimeSpan ParseTimeout = TimeSpan.FromMinutes(3);
private static readonly TimeSpan MapTimeout = TimeSpan.FromMinutes(3);
private static readonly TimeSpan ProbeTimeout = TimeSpan.FromMinutes(1);
private static readonly TimeSpan LeaseDuration = TimeSpan.FromMinutes(3);
public IServiceCollection Register(IServiceCollection services, IConfiguration configuration)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(configuration);
services.AddAcscConnector(options =>
{
configuration.GetSection(ConfigurationSection).Bind(options);
options.Validate();
});
var scheduler = new JobSchedulerBuilder(services);
scheduler
.AddJob<AcscFetchJob>(AcscJobKinds.Fetch, FetchCron, FetchTimeout, LeaseDuration)
.AddJob<AcscParseJob>(AcscJobKinds.Parse, ParseCron, ParseTimeout, LeaseDuration)
.AddJob<AcscMapJob>(AcscJobKinds.Map, MapCron, MapTimeout, LeaseDuration)
.AddJob<AcscProbeJob>(AcscJobKinds.Probe, ProbeCron, ProbeTimeout, LeaseDuration);
return services;
}
}

View File

@@ -0,0 +1,56 @@
using System.Net;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
using StellaOps.Concelier.Connector.Acsc.Configuration;
using StellaOps.Concelier.Connector.Acsc.Internal;
using StellaOps.Concelier.Connector.Common.Http;
namespace StellaOps.Concelier.Connector.Acsc;
public static class AcscServiceCollectionExtensions
{
public static IServiceCollection AddAcscConnector(this IServiceCollection services, Action<AcscOptions> configure)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(configure);
services.AddOptions<AcscOptions>()
.Configure(configure)
.PostConfigure(static options => options.Validate());
services.AddSourceHttpClient(AcscOptions.HttpClientName, (sp, clientOptions) =>
{
var options = sp.GetRequiredService<IOptions<AcscOptions>>().Value;
clientOptions.Timeout = options.RequestTimeout;
clientOptions.UserAgent = options.UserAgent;
clientOptions.RequestVersion = options.RequestVersion;
clientOptions.VersionPolicy = options.VersionPolicy;
clientOptions.AllowAutoRedirect = true;
clientOptions.ConfigureHandler = handler =>
{
handler.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
handler.AllowAutoRedirect = true;
};
clientOptions.AllowedHosts.Clear();
clientOptions.AllowedHosts.Add(options.BaseEndpoint.Host);
if (options.RelayEndpoint is not null)
{
clientOptions.AllowedHosts.Add(options.RelayEndpoint.Host);
}
clientOptions.DefaultRequestHeaders["Accept"] = string.Join(", ", new[]
{
"application/rss+xml",
"application/atom+xml;q=0.9",
"application/xml;q=0.8",
"text/xml;q=0.7",
});
});
services.AddSingleton<AcscDiagnostics>();
services.AddTransient<AcscConnector>();
return services;
}
}

View File

@@ -0,0 +1,54 @@
using System.Text.RegularExpressions;
namespace StellaOps.Concelier.Connector.Acsc.Configuration;
/// <summary>
/// Defines a single ACSC RSS feed endpoint.
/// </summary>
public sealed class AcscFeedOptions
{
private static readonly Regex SlugPattern = new("^[a-z0-9][a-z0-9\\-]*$", RegexOptions.Compiled | RegexOptions.CultureInvariant);
/// <summary>
/// Logical slug for the feed (alerts, advisories, threats, etc.).
/// </summary>
public string Slug { get; set; } = "alerts";
/// <summary>
/// Relative path (under <see cref="AcscOptions.BaseEndpoint"/>) for the RSS feed.
/// </summary>
public string RelativePath { get; set; } = "/acsc/view-all-content/alerts/rss";
/// <summary>
/// Indicates whether the feed is active.
/// </summary>
public bool Enabled { get; set; } = true;
/// <summary>
/// Optional display name for logging.
/// </summary>
public string? DisplayName { get; set; }
internal void Validate(int index)
{
if (string.IsNullOrWhiteSpace(Slug))
{
throw new InvalidOperationException($"ACSC feed entry #{index} must define a slug.");
}
if (!SlugPattern.IsMatch(Slug))
{
throw new InvalidOperationException($"ACSC feed slug '{Slug}' is invalid. Slugs must be lower-case alphanumeric with optional hyphen separators.");
}
if (string.IsNullOrWhiteSpace(RelativePath))
{
throw new InvalidOperationException($"ACSC feed '{Slug}' must specify a relative path.");
}
if (!RelativePath.StartsWith("/", StringComparison.Ordinal))
{
throw new InvalidOperationException($"ACSC feed '{Slug}' relative path must begin with '/' (value: '{RelativePath}').");
}
}
}

View File

@@ -0,0 +1,153 @@
using System.Net;
using System.Net.Http;
namespace StellaOps.Concelier.Connector.Acsc.Configuration;
/// <summary>
/// Connector options governing ACSC feed access and retry behaviour.
/// </summary>
public sealed class AcscOptions
{
public const string HttpClientName = "acsc";
private static readonly TimeSpan DefaultRequestTimeout = TimeSpan.FromSeconds(45);
private static readonly TimeSpan DefaultFailureBackoff = TimeSpan.FromMinutes(5);
private static readonly TimeSpan DefaultInitialBackfill = TimeSpan.FromDays(120);
public AcscOptions()
{
Feeds = new List<AcscFeedOptions>
{
new() { Slug = "alerts", RelativePath = "/acsc/view-all-content/alerts/rss" },
new() { Slug = "advisories", RelativePath = "/acsc/view-all-content/advisories/rss" },
new() { Slug = "news", RelativePath = "/acsc/view-all-content/news/rss", Enabled = false },
new() { Slug = "publications", RelativePath = "/acsc/view-all-content/publications/rss", Enabled = false },
new() { Slug = "threats", RelativePath = "/acsc/view-all-content/threats/rss", Enabled = false },
};
}
/// <summary>
/// Base endpoint for direct ACSC fetches.
/// </summary>
public Uri BaseEndpoint { get; set; } = new("https://www.cyber.gov.au/", UriKind.Absolute);
/// <summary>
/// Optional relay endpoint used when Akamai terminates direct HTTP/2 connections.
/// </summary>
public Uri? RelayEndpoint { get; set; }
/// <summary>
/// Default mode when no preference has been captured in connector state. When <c>true</c>, the relay will be preferred for initial fetches.
/// </summary>
public bool PreferRelayByDefault { get; set; }
/// <summary>
/// If enabled, the connector may switch to the relay endpoint when direct fetches fail.
/// </summary>
public bool EnableRelayFallback { get; set; } = true;
/// <summary>
/// If set, the connector will always use the relay endpoint and skip direct attempts.
/// </summary>
public bool ForceRelay { get; set; }
/// <summary>
/// Timeout applied to fetch requests (overrides HttpClient default).
/// </summary>
public TimeSpan RequestTimeout { get; set; } = DefaultRequestTimeout;
/// <summary>
/// Backoff applied when marking fetch failures.
/// </summary>
public TimeSpan FailureBackoff { get; set; } = DefaultFailureBackoff;
/// <summary>
/// Look-back period used when deriving initial published cursors.
/// </summary>
public TimeSpan InitialBackfill { get; set; } = DefaultInitialBackfill;
/// <summary>
/// User-agent header sent with outbound requests.
/// </summary>
public string UserAgent { get; set; } = "StellaOps/Concelier (+https://stella-ops.org)";
/// <summary>
/// RSS feeds requested during fetch.
/// </summary>
public IList<AcscFeedOptions> Feeds { get; }
/// <summary>
/// HTTP version policy requested for outbound requests.
/// </summary>
public HttpVersionPolicy VersionPolicy { get; set; } = HttpVersionPolicy.RequestVersionOrLower;
/// <summary>
/// Default HTTP version requested when connecting to ACSC (defaults to HTTP/2 but allows downgrade).
/// </summary>
public Version RequestVersion { get; set; } = HttpVersion.Version20;
public void Validate()
{
if (BaseEndpoint is null || !BaseEndpoint.IsAbsoluteUri)
{
throw new InvalidOperationException("ACSC BaseEndpoint must be an absolute URI.");
}
if (!BaseEndpoint.AbsoluteUri.EndsWith("/", StringComparison.Ordinal))
{
throw new InvalidOperationException("ACSC BaseEndpoint must include a trailing slash.");
}
if (RelayEndpoint is not null && !RelayEndpoint.IsAbsoluteUri)
{
throw new InvalidOperationException("ACSC RelayEndpoint must be an absolute URI when specified.");
}
if (RelayEndpoint is not null && !RelayEndpoint.AbsoluteUri.EndsWith("/", StringComparison.Ordinal))
{
throw new InvalidOperationException("ACSC RelayEndpoint must include a trailing slash when specified.");
}
if (RequestTimeout <= TimeSpan.Zero)
{
throw new InvalidOperationException("ACSC RequestTimeout must be positive.");
}
if (FailureBackoff < TimeSpan.Zero)
{
throw new InvalidOperationException("ACSC FailureBackoff cannot be negative.");
}
if (InitialBackfill <= TimeSpan.Zero)
{
throw new InvalidOperationException("ACSC InitialBackfill must be positive.");
}
if (string.IsNullOrWhiteSpace(UserAgent))
{
throw new InvalidOperationException("ACSC UserAgent cannot be empty.");
}
if (Feeds.Count == 0)
{
throw new InvalidOperationException("At least one ACSC feed must be configured.");
}
var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
for (var i = 0; i < Feeds.Count; i++)
{
var feed = Feeds[i];
feed.Validate(i);
if (!feed.Enabled)
{
continue;
}
if (!seen.Add(feed.Slug))
{
throw new InvalidOperationException($"Duplicate ACSC feed slug '{feed.Slug}' detected. Slugs must be unique (case-insensitive).");
}
}
}
}

View File

@@ -0,0 +1,141 @@
using MongoDB.Bson;
namespace StellaOps.Concelier.Connector.Acsc.Internal;
internal enum AcscEndpointPreference
{
Auto = 0,
Direct = 1,
Relay = 2,
}
internal sealed record AcscCursor(
AcscEndpointPreference PreferredEndpoint,
IReadOnlyDictionary<string, DateTimeOffset?> LastPublishedByFeed,
IReadOnlyCollection<Guid> PendingDocuments,
IReadOnlyCollection<Guid> PendingMappings)
{
private static readonly IReadOnlyCollection<Guid> EmptyGuidList = Array.Empty<Guid>();
private static readonly IReadOnlyDictionary<string, DateTimeOffset?> EmptyFeedDictionary =
new Dictionary<string, DateTimeOffset?>(StringComparer.OrdinalIgnoreCase);
public static AcscCursor Empty { get; } = new(
AcscEndpointPreference.Auto,
EmptyFeedDictionary,
EmptyGuidList,
EmptyGuidList);
public AcscCursor WithPendingDocuments(IEnumerable<Guid> documents)
=> this with { PendingDocuments = documents?.Distinct().ToArray() ?? EmptyGuidList };
public AcscCursor WithPendingMappings(IEnumerable<Guid> mappings)
=> this with { PendingMappings = mappings?.Distinct().ToArray() ?? EmptyGuidList };
public AcscCursor WithPreferredEndpoint(AcscEndpointPreference preference)
=> this with { PreferredEndpoint = preference };
public AcscCursor WithLastPublished(IDictionary<string, DateTimeOffset?> values)
{
var snapshot = new Dictionary<string, DateTimeOffset?>(StringComparer.OrdinalIgnoreCase);
if (values is not null)
{
foreach (var kvp in values)
{
snapshot[kvp.Key] = kvp.Value;
}
}
return this with { LastPublishedByFeed = snapshot };
}
public BsonDocument ToBsonDocument()
{
var document = new BsonDocument
{
["preferredEndpoint"] = PreferredEndpoint.ToString(),
["pendingDocuments"] = new BsonArray(PendingDocuments.Select(id => id.ToString())),
["pendingMappings"] = new BsonArray(PendingMappings.Select(id => id.ToString())),
};
var feedsDocument = new BsonDocument();
foreach (var kvp in LastPublishedByFeed)
{
if (kvp.Value.HasValue)
{
feedsDocument[kvp.Key] = kvp.Value.Value.UtcDateTime;
}
}
document["feeds"] = feedsDocument;
return document;
}
public static AcscCursor FromBson(BsonDocument? document)
{
if (document is null || document.ElementCount == 0)
{
return Empty;
}
var preferredEndpoint = document.TryGetValue("preferredEndpoint", out var endpointValue)
? ParseEndpointPreference(endpointValue.AsString)
: AcscEndpointPreference.Auto;
var feeds = new Dictionary<string, DateTimeOffset?>(StringComparer.OrdinalIgnoreCase);
if (document.TryGetValue("feeds", out var feedsValue) && feedsValue is BsonDocument feedsDocument)
{
foreach (var element in feedsDocument.Elements)
{
feeds[element.Name] = ParseDate(element.Value);
}
}
var pendingDocuments = ReadGuidArray(document, "pendingDocuments");
var pendingMappings = ReadGuidArray(document, "pendingMappings");
return new AcscCursor(
preferredEndpoint,
feeds,
pendingDocuments,
pendingMappings);
}
private static IReadOnlyCollection<Guid> ReadGuidArray(BsonDocument document, string field)
{
if (!document.TryGetValue(field, out var value) || value is not BsonArray array)
{
return EmptyGuidList;
}
var list = new List<Guid>(array.Count);
foreach (var element in array)
{
if (Guid.TryParse(element?.ToString(), out var guid))
{
list.Add(guid);
}
}
return list;
}
private static DateTimeOffset? ParseDate(BsonValue value)
{
return value.BsonType switch
{
BsonType.DateTime => DateTime.SpecifyKind(value.ToUniversalTime(), DateTimeKind.Utc),
BsonType.String when DateTimeOffset.TryParse(value.AsString, out var parsed) => parsed.ToUniversalTime(),
_ => null,
};
}
private static AcscEndpointPreference ParseEndpointPreference(string? value)
{
if (Enum.TryParse<AcscEndpointPreference>(value, ignoreCase: true, out var parsed))
{
return parsed;
}
return AcscEndpointPreference.Auto;
}
}

View File

@@ -0,0 +1,97 @@
using System.Diagnostics.Metrics;
namespace StellaOps.Concelier.Connector.Acsc.Internal;
public sealed class AcscDiagnostics : IDisposable
{
private const string MeterName = "StellaOps.Concelier.Connector.Acsc";
private const string MeterVersion = "1.0.0";
private readonly Meter _meter;
private readonly Counter<long> _fetchAttempts;
private readonly Counter<long> _fetchSuccess;
private readonly Counter<long> _fetchFailures;
private readonly Counter<long> _fetchUnchanged;
private readonly Counter<long> _fetchFallbacks;
private readonly Counter<long> _cursorUpdates;
private readonly Counter<long> _parseAttempts;
private readonly Counter<long> _parseSuccess;
private readonly Counter<long> _parseFailures;
private readonly Counter<long> _mapSuccess;
public AcscDiagnostics()
{
_meter = new Meter(MeterName, MeterVersion);
_fetchAttempts = _meter.CreateCounter<long>("acsc.fetch.attempts", unit: "operations");
_fetchSuccess = _meter.CreateCounter<long>("acsc.fetch.success", unit: "operations");
_fetchFailures = _meter.CreateCounter<long>("acsc.fetch.failures", unit: "operations");
_fetchUnchanged = _meter.CreateCounter<long>("acsc.fetch.unchanged", unit: "operations");
_fetchFallbacks = _meter.CreateCounter<long>("acsc.fetch.fallbacks", unit: "operations");
_cursorUpdates = _meter.CreateCounter<long>("acsc.cursor.published_updates", unit: "feeds");
_parseAttempts = _meter.CreateCounter<long>("acsc.parse.attempts", unit: "documents");
_parseSuccess = _meter.CreateCounter<long>("acsc.parse.success", unit: "documents");
_parseFailures = _meter.CreateCounter<long>("acsc.parse.failures", unit: "documents");
_mapSuccess = _meter.CreateCounter<long>("acsc.map.success", unit: "advisories");
}
public void FetchAttempt(string feed, string mode)
=> _fetchAttempts.Add(1, GetTags(feed, mode));
public void FetchSuccess(string feed, string mode)
=> _fetchSuccess.Add(1, GetTags(feed, mode));
public void FetchFailure(string feed, string mode)
=> _fetchFailures.Add(1, GetTags(feed, mode));
public void FetchUnchanged(string feed, string mode)
=> _fetchUnchanged.Add(1, GetTags(feed, mode));
public void FetchFallback(string feed, string mode, string reason)
=> _fetchFallbacks.Add(1, GetTags(feed, mode, new KeyValuePair<string, object?>("reason", reason)));
public void CursorUpdated(string feed)
=> _cursorUpdates.Add(1, new KeyValuePair<string, object?>("feed", feed));
public void ParseAttempt(string feed)
=> _parseAttempts.Add(1, new KeyValuePair<string, object?>("feed", feed));
public void ParseSuccess(string feed)
=> _parseSuccess.Add(1, new KeyValuePair<string, object?>("feed", feed));
public void ParseFailure(string feed, string reason)
=> _parseFailures.Add(1, new KeyValuePair<string, object?>[]
{
new("feed", feed),
new("reason", reason),
});
public void MapSuccess(int advisoryCount)
{
if (advisoryCount <= 0)
{
return;
}
_mapSuccess.Add(advisoryCount);
}
private static KeyValuePair<string, object?>[] GetTags(string feed, string mode)
=> new[]
{
new KeyValuePair<string, object?>("feed", feed),
new KeyValuePair<string, object?>("mode", mode),
};
private static KeyValuePair<string, object?>[] GetTags(string feed, string mode, KeyValuePair<string, object?> extra)
=> new[]
{
new KeyValuePair<string, object?>("feed", feed),
new KeyValuePair<string, object?>("mode", mode),
extra,
};
public void Dispose()
{
_meter.Dispose();
}
}

View File

@@ -0,0 +1,20 @@
using StellaOps.Concelier.Storage.Mongo.Documents;
namespace StellaOps.Concelier.Connector.Acsc.Internal;
internal readonly record struct AcscDocumentMetadata(string FeedSlug, string FetchMode)
{
public static AcscDocumentMetadata FromDocument(DocumentRecord document)
{
if (document.Metadata is null)
{
return new AcscDocumentMetadata(string.Empty, string.Empty);
}
document.Metadata.TryGetValue("acsc.feed.slug", out var slug);
document.Metadata.TryGetValue("acsc.fetch.mode", out var mode);
return new AcscDocumentMetadata(
string.IsNullOrWhiteSpace(slug) ? string.Empty : slug.Trim(),
string.IsNullOrWhiteSpace(mode) ? string.Empty : mode.Trim());
}
}

View File

@@ -0,0 +1,58 @@
using System.Text.Json.Serialization;
namespace StellaOps.Concelier.Connector.Acsc.Internal;
internal sealed record AcscFeedDto(
[property: JsonPropertyName("feedSlug")] string FeedSlug,
[property: JsonPropertyName("feedTitle")] string? FeedTitle,
[property: JsonPropertyName("feedLink")] string? FeedLink,
[property: JsonPropertyName("feedUpdated")] DateTimeOffset? FeedUpdated,
[property: JsonPropertyName("parsedAt")] DateTimeOffset ParsedAt,
[property: JsonPropertyName("entries")] IReadOnlyList<AcscEntryDto> Entries)
{
public static AcscFeedDto Empty { get; } = new(
FeedSlug: string.Empty,
FeedTitle: null,
FeedLink: null,
FeedUpdated: null,
ParsedAt: DateTimeOffset.UnixEpoch,
Entries: Array.Empty<AcscEntryDto>());
}
internal sealed record AcscEntryDto(
[property: JsonPropertyName("entryId")] string EntryId,
[property: JsonPropertyName("title")] string Title,
[property: JsonPropertyName("link")] string? Link,
[property: JsonPropertyName("feedSlug")] string FeedSlug,
[property: JsonPropertyName("published")] DateTimeOffset? Published,
[property: JsonPropertyName("updated")] DateTimeOffset? Updated,
[property: JsonPropertyName("summary")] string Summary,
[property: JsonPropertyName("contentHtml")] string ContentHtml,
[property: JsonPropertyName("contentText")] string ContentText,
[property: JsonPropertyName("references")] IReadOnlyList<AcscReferenceDto> References,
[property: JsonPropertyName("aliases")] IReadOnlyList<string> Aliases,
[property: JsonPropertyName("fields")] IReadOnlyDictionary<string, string> Fields)
{
public static AcscEntryDto Empty { get; } = new(
EntryId: string.Empty,
Title: string.Empty,
Link: null,
FeedSlug: string.Empty,
Published: null,
Updated: null,
Summary: string.Empty,
ContentHtml: string.Empty,
ContentText: string.Empty,
References: Array.Empty<AcscReferenceDto>(),
Aliases: Array.Empty<string>(),
Fields: new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase));
}
internal sealed record AcscReferenceDto(
[property: JsonPropertyName("title")] string Title,
[property: JsonPropertyName("url")] string Url)
{
public static AcscReferenceDto Empty { get; } = new(
Title: string.Empty,
Url: string.Empty);
}

View File

@@ -0,0 +1,594 @@
using System.Globalization;
using System.Text;
using System.Xml.Linq;
using AngleSharp.Dom;
using AngleSharp.Html.Parser;
using System.Security.Cryptography;
using StellaOps.Concelier.Connector.Common.Html;
namespace StellaOps.Concelier.Connector.Acsc.Internal;
internal static class AcscFeedParser
{
private static readonly XNamespace AtomNamespace = "http://www.w3.org/2005/Atom";
private static readonly XNamespace ContentNamespace = "http://purl.org/rss/1.0/modules/content/";
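// Parses an RSS or Atom payload into the connector DTO: entry HTML is sanitised, a stable entry id
// is derived (guid, Atom id, link, or a SHA-256 of title + link as a fallback), and summaries,
// references, labelled fields, and aliases are extracted from the sanitised fragment.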
public static AcscFeedDto Parse(byte[] payload, string feedSlug, DateTimeOffset parsedAt, HtmlContentSanitizer sanitizer)
{
ArgumentNullException.ThrowIfNull(payload);
ArgumentNullException.ThrowIfNull(sanitizer);
if (payload.Length == 0)
{
return AcscFeedDto.Empty with
{
FeedSlug = feedSlug ?? string.Empty,
ParsedAt = parsedAt,
Entries = Array.Empty<AcscEntryDto>(),
};
}
var xml = XDocument.Parse(Encoding.UTF8.GetString(payload));
var (feedTitle, feedLink, feedUpdated) = ExtractFeedMetadata(xml);
var items = ExtractEntries(xml).ToArray();
var entries = new List<AcscEntryDto>(items.Length);
foreach (var item in items)
{
var entryId = ExtractEntryId(item);
if (string.IsNullOrWhiteSpace(entryId))
{
// Fall back to hash of title + link to avoid duplicates.
entryId = GenerateFallbackId(item);
}
var title = ExtractTitle(item);
var link = ExtractLink(item);
var published = ExtractDate(item, "pubDate") ?? ExtractAtomDate(item, "published") ?? ExtractDcDate(item);
var updated = ExtractAtomDate(item, "updated");
var rawHtml = ExtractContent(item);
var baseUri = TryCreateUri(link);
var sanitizedHtml = sanitizer.Sanitize(rawHtml, baseUri);
var htmlFragment = ParseHtmlFragment(sanitizedHtml);
var summary = BuildSummary(htmlFragment) ?? string.Empty;
var contentText = NormalizeWhitespace(htmlFragment?.TextContent ?? string.Empty);
var references = ExtractReferences(htmlFragment);
var fields = ExtractFields(htmlFragment, out var serialNumber, out var advisoryType);
var aliases = BuildAliases(serialNumber, advisoryType);
var entry = new AcscEntryDto(
EntryId: entryId,
Title: title,
Link: link,
FeedSlug: feedSlug ?? string.Empty,
Published: published,
Updated: updated,
Summary: summary,
ContentHtml: sanitizedHtml,
ContentText: contentText,
References: references,
Aliases: aliases,
Fields: fields);
entries.Add(entry);
}
return new AcscFeedDto(
FeedSlug: feedSlug ?? string.Empty,
FeedTitle: feedTitle,
FeedLink: feedLink,
FeedUpdated: feedUpdated,
ParsedAt: parsedAt,
Entries: entries);
}
private static (string? Title, string? Link, DateTimeOffset? Updated) ExtractFeedMetadata(XDocument xml)
{
var root = xml.Root;
if (root is null)
{
return (null, null, null);
}
if (string.Equals(root.Name.LocalName, "rss", StringComparison.OrdinalIgnoreCase))
{
var channel = root.Element("channel");
var title = channel?.Element("title")?.Value?.Trim();
var link = channel?.Element("link")?.Value?.Trim();
var updated = TryParseDate(channel?.Element("lastBuildDate")?.Value);
return (title, link, updated);
}
if (root.Name == AtomNamespace + "feed")
{
var title = root.Element(AtomNamespace + "title")?.Value?.Trim();
var link = root.Elements(AtomNamespace + "link")
.FirstOrDefault(static element =>
string.Equals(element.Attribute("rel")?.Value, "alternate", StringComparison.OrdinalIgnoreCase))
?.Attribute("href")?.Value?.Trim()
?? root.Element(AtomNamespace + "link")?.Attribute("href")?.Value?.Trim();
var updated = TryParseDate(root.Element(AtomNamespace + "updated")?.Value);
return (title, link, updated);
}
return (null, null, null);
}
private static IEnumerable<XElement> ExtractEntries(XDocument xml)
{
var root = xml.Root;
if (root is null)
{
yield break;
}
if (string.Equals(root.Name.LocalName, "rss", StringComparison.OrdinalIgnoreCase))
{
var channel = root.Element("channel");
if (channel is null)
{
yield break;
}
foreach (var item in channel.Elements("item"))
{
yield return item;
}
yield break;
}
if (root.Name == AtomNamespace + "feed")
{
foreach (var entry in root.Elements(AtomNamespace + "entry"))
{
yield return entry;
}
}
}
private static string ExtractTitle(XElement element)
{
var title = element.Element("title")?.Value
?? element.Element(AtomNamespace + "title")?.Value
?? string.Empty;
return title.Trim();
}
private static string? ExtractLink(XElement element)
{
var linkValue = element.Element("link")?.Value;
if (!string.IsNullOrWhiteSpace(linkValue))
{
return linkValue.Trim();
}
var atomLink = element.Elements(AtomNamespace + "link")
.FirstOrDefault(static el =>
string.Equals(el.Attribute("rel")?.Value, "alternate", StringComparison.OrdinalIgnoreCase))
?? element.Element(AtomNamespace + "link");
if (atomLink is not null)
{
var href = atomLink.Attribute("href")?.Value;
if (!string.IsNullOrWhiteSpace(href))
{
return href.Trim();
}
}
return null;
}
private static string ExtractEntryId(XElement element)
{
var guid = element.Element("guid")?.Value;
if (!string.IsNullOrWhiteSpace(guid))
{
return guid.Trim();
}
var atomId = element.Element(AtomNamespace + "id")?.Value;
if (!string.IsNullOrWhiteSpace(atomId))
{
return atomId.Trim();
}
if (!string.IsNullOrWhiteSpace(element.Element("link")?.Value))
{
return element.Element("link")!.Value.Trim();
}
if (!string.IsNullOrWhiteSpace(element.Element("title")?.Value))
{
return GenerateStableKey(element.Element("title")!.Value);
}
return string.Empty;
}
private static string GenerateFallbackId(XElement element)
{
var builder = new StringBuilder();
var title = element.Element("title")?.Value;
if (!string.IsNullOrWhiteSpace(title))
{
builder.Append(title.Trim());
}
var link = ExtractLink(element);
if (!string.IsNullOrWhiteSpace(link))
{
if (builder.Length > 0)
{
builder.Append("::");
}
builder.Append(link);
}
if (builder.Length == 0)
{
return Guid.NewGuid().ToString("n");
}
return GenerateStableKey(builder.ToString());
}
private static string GenerateStableKey(string value)
{
using var sha = SHA256.Create();
var bytes = Encoding.UTF8.GetBytes(value);
var hash = sha.ComputeHash(bytes);
return Convert.ToHexString(hash).ToLowerInvariant();
}
private static string ExtractContent(XElement element)
{
var encoded = element.Element(ContentNamespace + "encoded")?.Value;
if (!string.IsNullOrWhiteSpace(encoded))
{
return encoded;
}
var description = element.Element("description")?.Value;
if (!string.IsNullOrWhiteSpace(description))
{
return description;
}
var summary = element.Element(AtomNamespace + "summary")?.Value;
if (!string.IsNullOrWhiteSpace(summary))
{
return summary;
}
return string.Empty;
}
private static DateTimeOffset? ExtractDate(XElement element, string name)
{
var value = element.Element(name)?.Value;
return TryParseDate(value);
}
private static DateTimeOffset? ExtractAtomDate(XElement element, string name)
{
var value = element.Element(AtomNamespace + name)?.Value;
return TryParseDate(value);
}
private static DateTimeOffset? ExtractDcDate(XElement element)
{
var value = element.Element(XName.Get("date", "http://purl.org/dc/elements/1.1/"))?.Value;
return TryParseDate(value);
}
private static DateTimeOffset? TryParseDate(string? value)
{
if (string.IsNullOrWhiteSpace(value))
{
return null;
}
if (DateTimeOffset.TryParse(value, CultureInfo.InvariantCulture, DateTimeStyles.AssumeUniversal | DateTimeStyles.AllowWhiteSpaces, out var result))
{
return result.ToUniversalTime();
}
if (DateTimeOffset.TryParse(value, CultureInfo.CurrentCulture, DateTimeStyles.AssumeUniversal | DateTimeStyles.AllowWhiteSpaces, out result))
{
return result.ToUniversalTime();
}
return null;
}
private static Uri? TryCreateUri(string? value)
{
if (string.IsNullOrWhiteSpace(value))
{
return null;
}
return Uri.TryCreate(value, UriKind.Absolute, out var uri) ? uri : null;
}
private static IElement? ParseHtmlFragment(string html)
{
if (string.IsNullOrWhiteSpace(html))
{
return null;
}
var parser = new HtmlParser(new HtmlParserOptions
{
IsKeepingSourceReferences = false,
});
var document = parser.ParseDocument($"<body>{html}</body>");
return document.Body;
}
private static string? BuildSummary(IElement? root)
{
if (root is null || !root.HasChildNodes)
{
return root?.TextContent is { Length: > 0 } text
? NormalizeWhitespace(text)
: string.Empty;
}
var segments = new List<string>();
foreach (var child in root.Children)
{
var text = NormalizeWhitespace(child.TextContent);
if (string.IsNullOrEmpty(text))
{
continue;
}
if (string.Equals(child.NodeName, "LI", StringComparison.OrdinalIgnoreCase))
{
segments.Add($"- {text}");
continue;
}
segments.Add(text);
}
if (segments.Count == 0)
{
var fallback = NormalizeWhitespace(root.TextContent);
return fallback;
}
return string.Join("\n\n", segments);
}
private static IReadOnlyList<AcscReferenceDto> ExtractReferences(IElement? root)
{
if (root is null)
{
return Array.Empty<AcscReferenceDto>();
}
var anchors = root.QuerySelectorAll("a");
if (anchors.Length == 0)
{
return Array.Empty<AcscReferenceDto>();
}
var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var references = new List<AcscReferenceDto>(anchors.Length);
foreach (var anchor in anchors)
{
var href = anchor.GetAttribute("href");
if (string.IsNullOrWhiteSpace(href))
{
continue;
}
if (!seen.Add(href))
{
continue;
}
var text = NormalizeWhitespace(anchor.TextContent);
if (string.IsNullOrEmpty(text))
{
text = href;
}
references.Add(new AcscReferenceDto(text, href));
}
return references;
}
private static IReadOnlyDictionary<string, string> ExtractFields(IElement? root, out string? serialNumber, out string? advisoryType)
{
serialNumber = null;
advisoryType = null;
if (root is null)
{
return EmptyFields;
}
var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
foreach (var element in root.QuerySelectorAll("strong"))
{
var labelRaw = NormalizeWhitespace(element.TextContent);
if (string.IsNullOrEmpty(labelRaw))
{
continue;
}
var label = labelRaw.TrimEnd(':').Trim();
if (string.IsNullOrEmpty(label))
{
continue;
}
var key = NormalizeFieldKey(label);
if (string.IsNullOrEmpty(key))
{
continue;
}
var value = ExtractFieldValue(element);
if (string.IsNullOrEmpty(value))
{
continue;
}
if (!map.ContainsKey(key))
{
map[key] = value;
}
if (string.Equals(key, "serialNumber", StringComparison.OrdinalIgnoreCase))
{
serialNumber ??= value;
}
else if (string.Equals(key, "advisoryType", StringComparison.OrdinalIgnoreCase))
{
advisoryType ??= value;
}
}
return map.Count == 0
? EmptyFields
: map;
}
private static string? ExtractFieldValue(IElement strongElement)
{
var builder = new StringBuilder();
var node = strongElement.NextSibling;
while (node is not null)
{
if (node.NodeType == NodeType.Text)
{
builder.Append(node.TextContent);
}
else if (node is IElement element)
{
builder.Append(element.TextContent);
}
node = node.NextSibling;
}
var value = builder.ToString();
if (string.IsNullOrWhiteSpace(value))
{
var parent = strongElement.ParentElement;
if (parent is not null)
{
var parentText = parent.TextContent ?? string.Empty;
var trimmed = parentText.Replace(strongElement.TextContent ?? string.Empty, string.Empty, StringComparison.OrdinalIgnoreCase);
value = trimmed;
}
}
value = NormalizeWhitespace(value);
if (string.IsNullOrEmpty(value))
{
return null;
}
value = value.TrimStart(':', '-', '–', '—', ' ');
return value.Trim();
}
private static IReadOnlyList<string> BuildAliases(string? serialNumber, string? advisoryType)
{
var aliases = new List<string>(capacity: 2);
if (!string.IsNullOrWhiteSpace(serialNumber))
{
aliases.Add(serialNumber.Trim());
}
if (!string.IsNullOrWhiteSpace(advisoryType))
{
aliases.Add(advisoryType.Trim());
}
return aliases.Count == 0 ? Array.Empty<string>() : aliases;
}
private static string NormalizeFieldKey(string label)
{
if (string.IsNullOrWhiteSpace(label))
{
return string.Empty;
}
var builder = new StringBuilder(label.Length);
var upperNext = false;
foreach (var c in label)
{
if (char.IsLetterOrDigit(c))
{
if (builder.Length == 0)
{
builder.Append(char.ToLowerInvariant(c));
}
else if (upperNext)
{
builder.Append(char.ToUpperInvariant(c));
upperNext = false;
}
else
{
builder.Append(char.ToLowerInvariant(c));
}
}
else
{
if (builder.Length > 0)
{
upperNext = true;
}
}
}
return builder.Length == 0 ? label.Trim() : builder.ToString();
}
private static string NormalizeWhitespace(string? value)
{
if (string.IsNullOrWhiteSpace(value))
{
return string.Empty;
}
var builder = new StringBuilder(value.Length);
var previousIsWhitespace = false;
foreach (var ch in value)
{
if (char.IsWhiteSpace(ch))
{
if (!previousIsWhitespace)
{
builder.Append(' ');
previousIsWhitespace = true;
}
continue;
}
builder.Append(ch);
previousIsWhitespace = false;
}
return builder.ToString().Trim();
}
private static readonly IReadOnlyDictionary<string, string> EmptyFields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
}

View File

@@ -0,0 +1,312 @@
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;
using StellaOps.Concelier.Connector.Common;
using StellaOps.Concelier.Models;
using StellaOps.Concelier.Storage.Mongo.Documents;
using StellaOps.Concelier.Storage.Mongo.Dtos;
namespace StellaOps.Concelier.Connector.Acsc.Internal;
internal static class AcscMapper
{
private static readonly Regex CveRegex = new("CVE-\\d{4}-\\d{4,7}", RegexOptions.IgnoreCase | RegexOptions.Compiled);
public static IReadOnlyList<Advisory> Map(
AcscFeedDto feed,
DocumentRecord document,
DtoRecord dtoRecord,
string sourceName,
DateTimeOffset mappedAt)
{
ArgumentNullException.ThrowIfNull(feed);
ArgumentNullException.ThrowIfNull(document);
ArgumentNullException.ThrowIfNull(dtoRecord);
ArgumentException.ThrowIfNullOrEmpty(sourceName);
if (feed.Entries is null || feed.Entries.Count == 0)
{
return Array.Empty<Advisory>();
}
var advisories = new List<Advisory>(feed.Entries.Count);
foreach (var entry in feed.Entries)
{
if (entry is null)
{
continue;
}
var advisoryKey = CreateAdvisoryKey(sourceName, feed.FeedSlug, entry);
var fetchProvenance = new AdvisoryProvenance(
sourceName,
"document",
document.Uri,
document.FetchedAt.ToUniversalTime(),
fieldMask: new[] { "summary", "aliases", "references", "affectedPackages" });
var feedProvenance = new AdvisoryProvenance(
sourceName,
"feed",
feed.FeedSlug ?? string.Empty,
feed.ParsedAt.ToUniversalTime(),
fieldMask: new[] { "summary" });
var mappingProvenance = new AdvisoryProvenance(
sourceName,
"mapping",
entry.EntryId ?? entry.Link ?? advisoryKey,
mappedAt.ToUniversalTime(),
fieldMask: new[] { "summary", "aliases", "references", "affectedpackages" });
var provenance = new[]
{
fetchProvenance,
feedProvenance,
mappingProvenance,
};
var aliases = BuildAliases(entry);
var severity = TryGetSeverity(entry.Fields);
var references = BuildReferences(entry, sourceName, mappedAt);
var affectedPackages = BuildAffectedPackages(entry, sourceName, mappedAt);
var advisory = new Advisory(
advisoryKey,
string.IsNullOrWhiteSpace(entry.Title) ? $"ACSC Advisory {entry.EntryId}" : entry.Title,
string.IsNullOrWhiteSpace(entry.Summary) ? null : entry.Summary,
language: "en",
published: entry.Published?.ToUniversalTime() ?? feed.FeedUpdated?.ToUniversalTime() ?? document.FetchedAt.ToUniversalTime(),
modified: entry.Updated?.ToUniversalTime(),
severity: severity,
exploitKnown: false,
aliases: aliases,
references: references,
affectedPackages: affectedPackages,
cvssMetrics: Array.Empty<CvssMetric>(),
provenance: provenance);
advisories.Add(advisory);
}
return advisories;
}
private static IReadOnlyList<string> BuildAliases(AcscEntryDto entry)
{
var aliases = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
if (!string.IsNullOrWhiteSpace(entry.EntryId))
{
aliases.Add(entry.EntryId.Trim());
}
foreach (var alias in entry.Aliases ?? Array.Empty<string>())
{
if (!string.IsNullOrWhiteSpace(alias))
{
aliases.Add(alias.Trim());
}
}
foreach (var match in CveRegex.Matches(entry.Summary ?? string.Empty).Cast<Match>())
{
var value = match.Value.ToUpperInvariant();
aliases.Add(value);
}
foreach (var match in CveRegex.Matches(entry.ContentText ?? string.Empty).Cast<Match>())
{
var value = match.Value.ToUpperInvariant();
aliases.Add(value);
}
return aliases.Count == 0
? Array.Empty<string>()
: aliases.OrderBy(static value => value, StringComparer.OrdinalIgnoreCase).ToArray();
}
private static IReadOnlyList<AdvisoryReference> BuildReferences(AcscEntryDto entry, string sourceName, DateTimeOffset recordedAt)
{
var references = new List<AdvisoryReference>();
var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
void AddReference(string? url, string? kind, string? sourceTag, string? summary)
{
if (string.IsNullOrWhiteSpace(url))
{
return;
}
if (!Validation.LooksLikeHttpUrl(url))
{
return;
}
if (!seen.Add(url))
{
return;
}
references.Add(new AdvisoryReference(
url,
kind,
sourceTag,
summary,
new AdvisoryProvenance(sourceName, "reference", url, recordedAt.ToUniversalTime())));
}
AddReference(entry.Link, "advisory", entry.FeedSlug, entry.Title);
foreach (var reference in entry.References ?? Array.Empty<AcscReferenceDto>())
{
if (reference is null)
{
continue;
}
AddReference(reference.Url, "reference", null, reference.Title);
}
return references.Count == 0
? Array.Empty<AdvisoryReference>()
: references
.OrderBy(static reference => reference.Url, StringComparer.OrdinalIgnoreCase)
.ToArray();
}
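// Derives vendor-scoped affected packages from the "systemsAffected"/"productsAffected" metadata fields;
// the delimited value is split, de-duplicated, and ordered deterministically.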
private static IReadOnlyList<AffectedPackage> BuildAffectedPackages(AcscEntryDto entry, string sourceName, DateTimeOffset recordedAt)
{
if (entry.Fields is null || entry.Fields.Count == 0)
{
return Array.Empty<AffectedPackage>();
}
if (!entry.Fields.TryGetValue("systemsAffected", out var systemsAffected) && !entry.Fields.TryGetValue("productsAffected", out systemsAffected))
{
return Array.Empty<AffectedPackage>();
}
if (string.IsNullOrWhiteSpace(systemsAffected))
{
return Array.Empty<AffectedPackage>();
}
var identifiers = systemsAffected
.Split(new[] { ',', ';', '\n' }, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.Select(static value => value.Trim())
.Where(static value => !string.IsNullOrWhiteSpace(value))
.Distinct(StringComparer.OrdinalIgnoreCase)
.ToArray();
if (identifiers.Length == 0)
{
return Array.Empty<AffectedPackage>();
}
var packages = new List<AffectedPackage>(identifiers.Length);
foreach (var identifier in identifiers)
{
var provenance = new[]
{
new AdvisoryProvenance(sourceName, "affected", identifier, recordedAt.ToUniversalTime(), fieldMask: new[] { "affectedpackages" }),
};
packages.Add(new AffectedPackage(
AffectedPackageTypes.Vendor,
identifier,
platform: null,
versionRanges: Array.Empty<AffectedVersionRange>(),
statuses: Array.Empty<AffectedPackageStatus>(),
provenance: provenance,
normalizedVersions: Array.Empty<NormalizedVersionRule>()));
}
return packages
.OrderBy(static package => package.Identifier, StringComparer.OrdinalIgnoreCase)
.ToArray();
}
private static string? TryGetSeverity(IReadOnlyDictionary<string, string> fields)
{
if (fields is null || fields.Count == 0)
{
return null;
}
var keys = new[]
{
"severity",
"riskLevel",
"threatLevel",
"impact",
};
foreach (var key in keys)
{
if (fields.TryGetValue(key, out var value) && !string.IsNullOrWhiteSpace(value))
{
return value.Trim();
}
}
return null;
}
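// Advisory keys follow the shape "{source}/{feed-slug}/{identifier}", preferring the entry id, then the
// link, then the title, and finally a content hash when no stable identifier is available.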
private static string CreateAdvisoryKey(string sourceName, string? feedSlug, AcscEntryDto entry)
{
var slug = string.IsNullOrWhiteSpace(feedSlug) ? "general" : ToSlug(feedSlug);
var candidate = !string.IsNullOrWhiteSpace(entry.EntryId)
? entry.EntryId
: !string.IsNullOrWhiteSpace(entry.Link)
? entry.Link
: entry.Title;
var identifier = !string.IsNullOrWhiteSpace(candidate) ? ToSlug(candidate!) : null;
if (string.IsNullOrEmpty(identifier))
{
identifier = CreateHash(entry.Title ?? Guid.NewGuid().ToString());
}
return $"{sourceName}/{slug}/{identifier}";
}
private static string ToSlug(string value)
{
if (string.IsNullOrWhiteSpace(value))
{
return "unknown";
}
var builder = new StringBuilder(value.Length);
var previousDash = false;
foreach (var ch in value)
{
if (char.IsLetterOrDigit(ch))
{
builder.Append(char.ToLowerInvariant(ch));
previousDash = false;
}
else if (!previousDash)
{
builder.Append('-');
previousDash = true;
}
}
var slug = builder.ToString().Trim('-');
if (string.IsNullOrEmpty(slug))
{
slug = CreateHash(value);
}
return slug.Length <= 64 ? slug : slug[..64];
}
private static string CreateHash(string value)
{
var bytes = Encoding.UTF8.GetBytes(value);
var hash = SHA256.HashData(bytes);
return Convert.ToHexString(hash).ToLowerInvariant()[..16];
}
}

View File

@@ -0,0 +1,55 @@
using StellaOps.Concelier.Core.Jobs;
namespace StellaOps.Concelier.Connector.Acsc;
internal static class AcscJobKinds
{
public const string Fetch = "source:acsc:fetch";
public const string Parse = "source:acsc:parse";
public const string Map = "source:acsc:map";
public const string Probe = "source:acsc:probe";
}
internal sealed class AcscFetchJob : IJob
{
private readonly AcscConnector _connector;
public AcscFetchJob(AcscConnector connector)
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
=> _connector.FetchAsync(context.Services, cancellationToken);
}
internal sealed class AcscParseJob : IJob
{
private readonly AcscConnector _connector;
public AcscParseJob(AcscConnector connector)
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
=> _connector.ParseAsync(context.Services, cancellationToken);
}
internal sealed class AcscMapJob : IJob
{
private readonly AcscConnector _connector;
public AcscMapJob(AcscConnector connector)
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
=> _connector.MapAsync(context.Services, cancellationToken);
}
internal sealed class AcscProbeJob : IJob
{
private readonly AcscConnector _connector;
public AcscProbeJob(AcscConnector connector)
=> _connector = connector ?? throw new ArgumentNullException(nameof(connector));
public Task ExecuteAsync(JobExecutionContext context, CancellationToken cancellationToken)
=> _connector.ProbeAsync(cancellationToken);
}

View File

@@ -0,0 +1,4 @@
using System.Runtime.CompilerServices;
[assembly: InternalsVisibleTo("FixtureUpdater")]
[assembly: InternalsVisibleTo("StellaOps.Concelier.Connector.Acsc.Tests")]

View File

@@ -0,0 +1,68 @@
## StellaOps.Concelier.Connector.Acsc
Australian Cyber Security Centre (ACSC) connector that ingests RSS/Atom advisories, sanitises embedded HTML, and maps entries into canonical `Advisory` records for Concelier.
### Configuration
Settings live under `concelier:sources:acsc` (see `AcscOptions`):
| Setting | Description | Default |
| --- | --- | --- |
| `baseEndpoint` | Base URI for direct ACSC requests (trailing slash required). | `https://www.cyber.gov.au/` |
| `relayEndpoint` | Optional relay host to fall back to when Akamai refuses HTTP/2. | empty |
| `preferRelayByDefault` | Default endpoint preference when no cursor state exists. | `false` |
| `enableRelayFallback` | Allows automatic relay fallback when direct fetch fails. | `true` |
| `forceRelay` | Forces all fetches through the relay (skips direct attempts). | `false` |
| `feeds` | Array of feed descriptors (`slug`, `relativePath`, `enabled`). | alerts/advisories enabled |
| `requestTimeout` | Per-request timeout override. | 45 seconds |
| `failureBackoff` | Backoff window when fetch fails. | 5 minutes |
| `initialBackfill` | Sliding window used to seed published cursors. | 120 days |
| `userAgent` | Outbound `User-Agent` header. | `StellaOps/Concelier (+https://stella-ops.org)` |
| `requestVersion`/`versionPolicy` | HTTP version negotiation knobs. | HTTP/2 with downgrade |
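For local experiments the same settings can be assembled in memory. The sketch below is illustrative only: the composite key paths and the `TimeSpan` literal for `initialBackfill` are assumptions derived from the `concelier:sources:acsc` prefix plus the table above, not a documented contract.
```csharp
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Minimal sketch: in-memory equivalent of the settings table above.
// Key paths are assumptions derived from the documented `concelier:sources:acsc` prefix.
var configuration = new ConfigurationBuilder()
    .AddInMemoryCollection(new Dictionary<string, string?>
    {
        ["concelier:sources:acsc:baseEndpoint"] = "https://www.cyber.gov.au/",
        ["concelier:sources:acsc:enableRelayFallback"] = "true",
        ["concelier:sources:acsc:initialBackfill"] = "120.00:00:00",
        ["concelier:sources:acsc:feeds:0:slug"] = "alerts",
        ["concelier:sources:acsc:feeds:0:relativePath"] = "/acsc/view-all-content/alerts/rss",
        ["concelier:sources:acsc:feeds:0:enabled"] = "true",
    })
    .Build();
```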
The dependency injection routine (`AddAcscConnector`) registers the connector plus the scheduled jobs below:
| Job | Cron | Purpose |
| --- | --- | --- |
| `source:acsc:fetch` | `7,37 * * * *` | Fetch RSS/Atom feeds (direct + relay fallback). |
| `source:acsc:parse` | `12,42 * * * *` | Persist sanitised DTOs (`acsc.feed.v1`). |
| `source:acsc:map` | `17,47 * * * *` | Map DTO entries into canonical advisories. |
| `source:acsc:probe` | `25,55 * * * *` | Verify direct endpoint health and adjust cursor preference. |
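The registration itself is a one-liner. The call shape below is hypothetical (only the `AddAcscConnector` extension name is referenced elsewhere in this module), so treat it as a sketch of the wiring rather than the connector's actual signature.
```csharp
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// Hypothetical call shape: the real extension may instead take an options delegate or a named
// configuration section; only the AddAcscConnector name itself is documented here.
services.AddAcscConnector(configuration);
```
The jobs in the table above are then triggered by the scheduler according to their cron expressions.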
### Metrics
Emitted via `AcscDiagnostics` (`Meter` = `StellaOps.Concelier.Connector.Acsc`):
| Instrument | Unit | Description |
| --- | --- | --- |
| `acsc.fetch.attempts` | operations | Feed fetch attempts (tags: `feed`, `mode`). |
| `acsc.fetch.success` | operations | Successful fetches. |
| `acsc.fetch.failures` | operations | Failed fetches before retry backoff. |
| `acsc.fetch.unchanged` | operations | 304 Not Modified responses. |
| `acsc.fetch.fallbacks` | operations | Relay fallbacks triggered (`reason` tag). |
| `acsc.cursor.published_updates` | feeds | Published cursor updates per feed slug. |
| `acsc.parse.attempts` | documents | Parse attempts per feed. |
| `acsc.parse.success` | documents | Successful RSS → DTO conversions. |
| `acsc.parse.failures` | documents | Parse failures (tags: `feed`, `reason`). |
| `acsc.map.success` | advisories | Advisories emitted from a mapping pass. |
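To observe these counters locally (for example in a smoke test or console harness), a standard `System.Diagnostics.Metrics.MeterListener` can subscribe to the meter; the sketch below assumes the instruments record `long` measurements.
```csharp
using System;
using System.Diagnostics.Metrics;

using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    // Only subscribe to instruments published by the ACSC connector meter.
    if (instrument.Meter.Name == "StellaOps.Concelier.Connector.Acsc")
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((instrument, measurement, tags, state) =>
    Console.WriteLine($"{instrument.Name} += {measurement}"));
listener.Start();
```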
### Logging
Key log messages include:
- Fetch successes/failures, HTTP status codes, and relay fallbacks.
- Parse failures with reasons (download, schema, sanitisation).
- Mapping summaries showing advisory counts per document.
- Probe results toggling relay usage.
Logs include feed slug metadata for troubleshooting parallel ingestion.
### Tests & fixtures
`StellaOps.Concelier.Connector.Acsc.Tests` exercises the fetch→parse→map pipeline using canned RSS content. Deterministic snapshots live in `Acsc/Fixtures`. To refresh them after intentional behavioural changes:
```bash
UPDATE_ACSC_FIXTURES=1 dotnet test src/StellaOps.Concelier.Connector.Acsc.Tests/StellaOps.Concelier.Connector.Acsc.Tests.csproj
```
When snapshot assertions fail and the update flag is not set, review the generated `.actual.json` files to understand the drift before deciding whether to regenerate the fixtures.
### Operational notes
- Keep the relay endpoint allowlisted for air-gapped deployments; the probe job will automatically switch back to direct fetching when Akamai stabilises.
- Mapping currently emits vendor `affectedPackages` from “Systems/Products affected” fields; expand range primitives once structured version data appears in ACSC feeds.
- The connector is offline-friendly—no outbound calls beyond the configured feeds.

View File

@@ -0,0 +1,18 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<ProjectReference Include="../StellaOps.Plugin/StellaOps.Plugin.csproj" />
<ProjectReference Include="../StellaOps.Concelier.Connector.Common/StellaOps.Concelier.Connector.Common.csproj" />
<ProjectReference Include="../StellaOps.Concelier.Models/StellaOps.Concelier.Models.csproj" />
<ProjectReference Include="../StellaOps.Concelier.Storage.Mongo/StellaOps.Concelier.Storage.Mongo.csproj" />
<ProjectReference Include="../StellaOps.Concelier.Core/StellaOps.Concelier.Core.csproj" />
</ItemGroup>
</Project>

View File

@@ -0,0 +1,11 @@
# TASKS
| Task | Owner(s) | Depends on | Notes |
|---|---|---|---|
|FEEDCONN-ACSC-02-001 Source discovery & feed contract|BE-Conn-ACSC|Research|**DONE (2025-10-11)** Catalogued feed slugs `/acsc/view-all-content/{alerts,advisories,news,publications,threats}/rss`; every endpoint currently negotiates HTTP/2 and then aborts with `INTERNAL_ERROR` (curl exit 92), and hangs for >600 s when forced to `--http1.1`. Documented traces + mitigations in `docs/concelier-connector-research-20251011.md` and opened `FEEDCONN-SHARED-HTTP2-001` for shared handler tweaks (force `RequestVersionOrLower`, jittered retries, relay option).|
|FEEDCONN-ACSC-02-002 Fetch pipeline & cursor persistence|BE-Conn-ACSC|Source.Common, Storage.Mongo|**DONE (2025-10-12)** HTTP client now pins `HttpRequestMessage.VersionPolicy = RequestVersionOrLower`, forces `AutomaticDecompression = GZip | Deflate`, and sends `User-Agent: StellaOps/Concelier (+https://stella-ops.org)` via `AddAcscConnector`. Fetch pipeline implemented in `AcscConnector` with relay-aware fallback (`AcscProbeJob` seeds preference), deterministic cursor updates (`preferredEndpoint`, published timestamp per feed), and metadata-deduped documents. Unit tests `AcscConnectorFetchTests` + `AcscHttpClientConfigurationTests` cover direct/relay flows and client wiring.|
|FEEDCONN-ACSC-02-003 Parser & DTO sanitiser|BE-Conn-ACSC|Source.Common|**DONE (2025-10-12)** Added `AcscFeedParser` to sanitise RSS payloads, collapse multi-paragraph summaries, dedupe references, and surface `serialNumber`/`advisoryType` fields as structured metadata + alias candidates. `ParseAsync` now materialises `acsc.feed.v1` DTOs, promotes documents to `pending-map`, and advances cursor state. Covered by `AcscConnectorParseTests`.|
|FEEDCONN-ACSC-02-004 Canonical mapper + range primitives|BE-Conn-ACSC|Models|**DONE (2025-10-12)** Introduced `AcscMapper` and wired `MapAsync` to emit canonical advisories with normalized aliases, source-tagged references, and optional vendor `affectedPackages` derived from “Systems/Products affected” fields. Documents transition to `mapped`, advisories persist via `IAdvisoryStore`, and metrics/logging capture mapped counts. `AcscConnectorParseTests` exercise fetch→parse→map flow.|
|FEEDCONN-ACSC-02-005 Deterministic fixtures & regression tests|QA|Testing|**DONE (2025-10-12)** `AcscConnectorParseTests` now snapshots fetch→parse→map output via `Acsc/Fixtures/acsc-advisories.snapshot.json`; set `UPDATE_ACSC_FIXTURES=1` to regenerate. Tests assert DTO status transitions, advisory persistence, and state cleanup.|
|FEEDCONN-ACSC-02-006 Diagnostics & documentation|DevEx|Docs|**DONE (2025-10-12)** Added module README describing configuration, job schedules, metrics (including new `acsc.map.success` counter), relay behaviour, and fixture workflow. Diagnostics updated to count map successes alongside existing fetch/parse metrics.|
|FEEDCONN-ACSC-02-007 Feed retention & pagination validation|BE-Conn-ACSC|Research|**DONE (2025-10-11)** Relay sampling shows retention ≥ July 2025; need to re-run once direct HTTP/2 path is stable to see if feed caps at ~50 items and whether `?page=` exists. Pending action tracked in shared HTTP downgrade task.|
|FEEDCONN-ACSC-02-008 HTTP client compatibility plan|BE-Conn-ACSC|Source.Common|**DONE (2025-10-11)** Reproduced Akamai resets, drafted downgrade plan (two-stage HTTP/2 retry + relay fallback), and filed `FEEDCONN-SHARED-HTTP2-001`; module README TODO will host the per-environment knob matrix.|