Rename Concelier Source modules to Connector

This commit is contained in:
master
2025-10-18 20:11:18 +03:00
parent 89ede53cc3
commit 052da7a7d0
789 changed files with 1489 additions and 1489 deletions

@@ -0,0 +1,39 @@
# CERT-Bund Security Advisories Connector Notes
## Publication endpoints
- **RSS feed (latest 250 advisories):** `https://wid.cert-bund.de/content/public/securityAdvisory/rss`. The feed refreshes quickly; the current window spans roughly 6 days of activity, so fetch jobs must run often enough that advisories do not roll out of the window before they are fetched.
- **Portal bootstrap:** `https://wid.cert-bund.de/portal/` is requested once per process start to prime the session (`client_config` cookie) before any API calls.
- **Detail API:** `https://wid.cert-bund.de/portal/api/securityadvisory?name=<ID>`. The connector reuses the bootstrapped `SocketsHttpHandler` so cookies and headers match the Angular SPA. Manual reproduction requires the same cookie container; without it, the endpoint responds with the shell HTML document instead of the advisory payload.
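
For manual reproduction, the bootstrap-then-fetch order described above can be sketched as follows. This is an illustrative Python sketch, not the connector's .NET code: the helper names (`build_opener`, `bootstrap`, `fetch_detail`, `looks_like_shell_html`) are hypothetical, and the shell-HTML check is an assumed heuristic based only on the note that an unprimed session receives an HTML document.

```python
import http.cookiejar
import urllib.parse
import urllib.request

PORTAL_URL = "https://wid.cert-bund.de/portal/"
DETAIL_URL = "https://wid.cert-bund.de/portal/api/securityadvisory"


def build_opener(jar=None):
    """Return (opener, jar); every request through the opener shares one cookie jar."""
    jar = jar if jar is not None else http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def detail_url(advisory_id):
    """Build the detail API URL for one advisory ID."""
    return DETAIL_URL + "?" + urllib.parse.urlencode({"name": advisory_id})


def looks_like_shell_html(body):
    """Assumed heuristic: an HTML body means the SPA shell came back, not advisory data."""
    return body.lstrip().lower().startswith(("<!doctype", "<html"))


def bootstrap(opener, timeout=30):
    """Hit the portal once so the session cookie lands in the shared jar."""
    with opener.open(PORTAL_URL, timeout=timeout) as resp:
        resp.read()


def fetch_detail(opener, advisory_id, timeout=30):
    """Fetch one advisory body; raise if the portal answered with the shell page."""
    with opener.open(detail_url(advisory_id), timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    if looks_like_shell_html(body):
        raise RuntimeError("received shell HTML; bootstrap the portal session first")
    return body
```

A manual run would call `build_opener()`, then `bootstrap(opener)` once, then `fetch_detail(opener, "<ID>")` for each advisory, always through the same opener so the cookie jar is shared.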
## Telemetry
The OpenTelemetry meter is `StellaOps.Concelier.Connector.CertBund`. Key instruments:
| Metric | Type | Notes |
| --- | --- | --- |
| `certbund.feed.fetch.attempts` / `.success` / `.failures` | counter | Feed poll lifecycle. |
| `certbund.feed.items.count` | histogram | Items returned per RSS fetch. |
| `certbund.feed.enqueued.count` | histogram | Detail documents queued per cycle (post-dedupe, before truncation). |
| `certbund.feed.coverage.days` | histogram | Rolling window (fetch time minus oldest published entry). Useful to alert when feed depth contracts. |
| `certbund.detail.fetch.*` | counter | Attempts, successes, HTTP 304, and failure counts; failures are tagged by reason (`skipped`, `exception`). |
| `certbund.parse.success` / `.failures` | counter | Parsing outcomes; histograms capture product and CVE counts. |
| `certbund.map.success` / `.failures` | counter | Canonical mapping results; histograms capture affected-package and alias volume. |

Dashboards should chart coverage days and enqueued counts alongside fetch failures: sharp drops indicate the upstream window tightened or parsing stalled.
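
The coverage value is the span between the fetch time and the oldest published entry in the feed. A minimal Python sketch of that arithmetic follows; the function names and the alert floor are hypothetical and not taken from the connector.

```python
from datetime import datetime, timezone


def coverage_days(fetch_time, published):
    """Days between the fetch time and the oldest published entry; None for an empty feed."""
    if not published:
        return None
    oldest = min(published)
    return (fetch_time - oldest).total_seconds() / 86400.0


def coverage_contracted(days, floor_days):
    """True when the window is shorter than an operator-chosen floor (hypothetical threshold)."""
    return days is not None and days < floor_days


fetch_time = datetime(2025, 10, 18, 12, 0, tzinfo=timezone.utc)
published = [
    datetime(2025, 10, 12, 12, 0, tzinfo=timezone.utc),
    datetime(2025, 10, 17, 9, 30, tzinfo=timezone.utc),
]
window = coverage_days(fetch_time, published)  # 6.0 days for this sample data
```

With the sample timestamps the window is exactly six days; a floor of, say, three days would not trigger, while a feed whose oldest entry is only a day old would.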
## Logging signals
- `CERT-Bund fetch cycle: feed items …` summarises each RSS run (enqueued, already-known, HTTP 304, failures, coverage window).
- Parse and map stages log corresponding counts when work remains in the cursor.
- Errors include advisory/document identifiers to simplify replays.
## Historical coverage
- RSS contains the newest **250** items (≈6 days at the current publication rate). The connector prunes the “known advisory” set to 512 IDs to avoid unbounded memory growth while retaining enough headroom for short-term replay.
- Older advisories remain accessible through the same detail API (`WID-SEC-<year>-<sequence>` identifiers). For deep backfills, run a scripted sweep that queues historical IDs in descending order; the connector will persist any payloads that still resolve. Document these batches under source state comments so Merge/Docs can track provenance.
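
Both ideas above, a bounded known-ID set and a descending ID sweep, can be sketched in Python. This is a sketch under stated assumptions, not the connector's implementation: evicting the oldest-seen ID first is an assumed pruning policy, and the four-digit zero-padded sequence is an assumption about the `WID-SEC-<year>-<sequence>` format that should be checked against real identifiers before a sweep.

```python
from collections import OrderedDict


class KnownAdvisorySet:
    """Bounded set of advisory IDs; evicts the oldest-seen ID first (assumed policy)."""

    def __init__(self, capacity=512):
        self._capacity = capacity
        self._ids = OrderedDict()

    def add(self, advisory_id):
        """Record an ID and return True when it was not already known."""
        is_new = advisory_id not in self._ids
        self._ids[advisory_id] = None
        self._ids.move_to_end(advisory_id)  # refresh recency on every sighting
        while len(self._ids) > self._capacity:
            self._ids.popitem(last=False)  # drop the least recently seen ID
        return is_new

    def __contains__(self, advisory_id):
        return advisory_id in self._ids

    def __len__(self):
        return len(self._ids)


def historical_ids(year, highest, lowest=1, width=4):
    """Yield candidate IDs for one year, newest sequence first.

    `width` is the assumed zero-padding of the sequence part.
    """
    for seq in range(highest, lowest - 1, -1):
        yield f"WID-SEC-{year}-{seq:0{width}d}"


known = KnownAdvisorySet(capacity=2)
for advisory_id in ("a", "b", "c"):
    known.add(advisory_id)
sweep = list(historical_ids(2024, 3))
```

Here `known` ends up holding only `"b"` and `"c"`, and `sweep` lists the 2024 candidates from sequence 3 down to 1. IDs that do not resolve upstream would simply be skipped by the sweep.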
## Locale & translation stance
- CERT-Bund publishes advisory titles and summaries **only in German** (language tag `de`). The connector preserves original casing/content and sets `Advisory.Language = "de"`.
- Operator guidance:
  1. Front-line analysts consuming Concelier data should maintain German literacy or rely on approved machine-translation pipelines.
  2. When mirroring advisories into English dashboards, store translations outside the canonical advisory payload to keep determinism. Suggested approach: create an auxiliary collection keyed by advisory ID with timestamped translated snippets.
  3. Offline Kit bundles must document that CERT-Bund content is untranslated to avoid surprise during audits.

The Docs guild will surface the translation policy (retain German source, optionally layer operator-provided translations) in the broader i18n section; this README is the connector-level reference.
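
The auxiliary collection suggested in the operator guidance could look roughly like the Python sketch below. Everything here is hypothetical (the `TranslatedSnippet` fields, the `TranslationStore` class, the in-memory storage, and the sample text); the point is only that translations live beside the canonical advisory, keyed by advisory ID and timestamped, so the German payload itself is never modified.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TranslatedSnippet:
    advisory_id: str       # key back to the canonical advisory
    field: str             # e.g. "title" or "summary"
    target_language: str   # e.g. "en"
    text: str
    translated_at: datetime
    translator: str        # pipeline or operator that produced the text


class TranslationStore:
    """In-memory stand-in for an auxiliary collection keyed by advisory ID."""

    def __init__(self):
        self._by_advisory = {}

    def add(self, snippet):
        self._by_advisory.setdefault(snippet.advisory_id, []).append(snippet)

    def latest(self, advisory_id, field, target_language):
        """Return the most recent snippet for a field and language, or None."""
        candidates = [
            s for s in self._by_advisory.get(advisory_id, [])
            if s.field == field and s.target_language == target_language
        ]
        return max(candidates, key=lambda s: s.translated_at, default=None)


store = TranslationStore()
store.add(TranslatedSnippet(
    "WID-SEC-2025-0001", "title", "en", "Example title (first pass)",
    datetime(2025, 10, 18, 8, 0, tzinfo=timezone.utc), "mt-pipeline",
))
store.add(TranslatedSnippet(
    "WID-SEC-2025-0001", "title", "en", "Example title (reviewed)",
    datetime(2025, 10, 18, 9, 0, tzinfo=timezone.utc), "analyst",
))
current = store.latest("WID-SEC-2025-0001", "title", "en")
```

Keeping every snippet rather than overwriting leaves an audit trail of who translated what and when, while `latest` gives dashboards a single current value.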