Restructure solution layout by module

This commit is contained in:
master
2025-10-28 15:10:40 +02:00
parent 95daa159c4
commit d870da18ce
4103 changed files with 192899 additions and 187024 deletions

@@ -0,0 +1,19 @@
# TASKS
| Task | Owner(s) | Depends on | Notes |
|---|---|---|---|
|Register source HTTP clients with allowlists and timeouts|BE-Conn-Shared|Source.Common|**DONE** `AddSourceHttpClient` wires named clients with host allowlists/timeouts.|
|Implement retry/backoff with jitter and 429 handling|BE-Conn-Shared|Source.Common|**DONE** `SourceRetryPolicy` retries with 429/5xx handling and exponential backoff.|
|Conditional GET helpers (ETag/Last-Modified)|BE-Conn-Shared|Source.Common|**DONE** `SourceFetchRequest` + fetch result propagate etag/last-modified for NotModified handling.|
|Windowed cursor and pagination utilities|BE-Conn-Shared|Source.Common|**DONE** `TimeWindowCursorPlanner` + `PaginationPlanner` centralize sliding windows and additional page indices.|
|JSON/XML schema validators with rich errors|BE-Conn-Shared, QA|Source.Common|**DONE** `JsonSchemaValidator` surfaces keyword/path/message details + tests.|
|Raw document capture helper|BE-Conn-Shared|Storage.Mongo|**DONE** `SourceFetchService` stores raw payload + headers with sha256 metadata.|
|Canned HTTP test harness|QA|Source.Common|**DONE** Enriched `CannedHttpMessageHandler` with method-aware queues, request capture, fallbacks, and helpers + unit coverage.|
|HTML sanitization and URL normalization utilities|BE-Conn-Shared|Source.Common|**DONE** `HtmlContentSanitizer` + `UrlNormalizer` provide safe fragments and canonical links for connectors.|
|PDF-to-text sandbox helper|BE-Conn-Shared|Source.Common|**DONE** `PdfTextExtractor` uses PdfPig to yield deterministic text with options + tests.|
|PURL and SemVer helper library|BE-Conn-Shared|Models|**DONE** `PackageCoordinateHelper` exposes normalized purl + SemVer parsing utilities backed by normalization.|
|Telemetry wiring (logs/metrics/traces)|BE-Conn-Shared|Observability|**DONE** `SourceDiagnostics` emits Activity/Meter signals integrated into fetch pipeline and WebService OTEL setup.|
|Shared jitter source in retry policy|BE-Conn-Shared|Source.Common|**DONE** `SourceRetryPolicy` now consumes injected `CryptoJitterSource` for thread-safe jitter.|
|Allow per-request Accept header overrides|BE-Conn-Shared|Source.Common|**DONE** `SourceFetchRequest.AcceptHeaders` honored by `SourceFetchService` plus unit tests for overrides.|
|FEEDCONN-SHARED-HTTP2-001 HTTP version fallback policy|BE-Conn-Shared, Source.Common|Source.Common|**DONE (2025-10-11)** `AddSourceHttpClient` now honours per-connector HTTP version/policy, exposes handler customisation, and defaults to downgrade-friendly settings; unit tests cover the handler configuration hook.|
|FEEDCONN-SHARED-TLS-001 Sovereign trust store support|BE-Conn-Shared, Ops|Source.Common|**DONE (2025-10-11)** `SourceHttpClientOptions` now exposes `TrustedRootCertificates`, `ServerCertificateCustomValidation`, and `AllowInvalidServerCertificates`, and `AddSourceHttpClient` runs the shared configuration binder so connectors can pull `concelier:httpClients\|sources:<name>:http` settings (incl. Offline Kit relative PEM paths via `concelier:offline:root`). Tests cover handler wiring. Ops follow-up: package RU trust roots for Offline Kit distribution.|
|FEEDCONN-SHARED-STATE-003 Source state seeding helper|Tools Guild, BE-Conn-MSRC|Tools|**DOING (2025-10-19)** Provide a reusable CLI/utility to seed `pendingDocuments`/`pendingMappings` for connectors (MSRC backfills require scripted CVRF + detail injection). Coordinate with the MSRC team on the expected JSON schema and hand off once the prototype lands. Prerequisites: none (confirmed 2025-10-19).|
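
The retry rows above (429/5xx handling, exponential backoff, injected jitter source) can be illustrated with a minimal, language-neutral sketch. This is not the project's `SourceRetryPolicy`: the names `is_retryable` and `backoff_delay`, the default base and cap values, the "full jitter" strategy, and the treatment of a server-supplied `Retry-After` value are all assumptions made for illustration.

```python
import random
from typing import Callable, Optional

# Hypothetical default jitter source: OS-backed randomness, loosely standing in
# for an injected jitter provider. The real CryptoJitterSource API is not shown
# in the task list, so nothing here mirrors it.
_system_random = random.SystemRandom()


def is_retryable(status: int) -> bool:
    """Treat 429 (Too Many Requests) and any 5xx status as retryable."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(
    attempt: int,
    base: float = 0.5,                      # assumed base delay in seconds
    cap: float = 30.0,                      # assumed upper bound in seconds
    jitter: Callable[[], float] = _system_random.random,
    retry_after: Optional[float] = None,    # seconds parsed from Retry-After, if any
) -> float:
    """Return the delay in seconds before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Exponential growth: base, 2*base, 4*base, ... limited by the cap.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Full jitter: pick a point in [0, ceiling) so clients do not retry in lockstep.
    delay = ceiling * jitter()
    # Assumption: never retry sooner than the server asked, even beyond the cap.
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

Passing the jitter source as a parameter keeps the delay calculation deterministic under test (for example `jitter=lambda: 1.0`), which is one plausible reason to inject it rather than call a global random generator.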