Concelier CCCS Connector Operations
This runbook covers day‑to‑day operation of the Canadian Centre for Cyber Security (source:cccs:*) connector, including configuration, telemetry, and historical backfill guidance for English/French advisories.
1. Configuration Checklist
- Network egress (or mirrored cache) for https://www.cyber.gc.ca/ and the JSON API endpoints under /api/cccs/.
- Set the Concelier options before restarting workers. Example concelier.yaml snippet:
concelier:
  sources:
    cccs:
      feeds:
        - language: "en"
          uri: "https://www.cyber.gc.ca/api/cccs/threats/v1/get?lang=en&content_type=cccs_threat"
        - language: "fr"
          uri: "https://www.cyber.gc.ca/api/cccs/threats/v1/get?lang=fr&content_type=cccs_threat"
      maxEntriesPerFetch: 80 # increase temporarily for backfill runs
      maxKnownEntries: 512
      requestTimeout: "00:00:30"
      requestDelay: "00:00:00.250"
      failureBackoff: "00:05:00"
ℹ️ The /api/cccs/threats/v1/get endpoint returns thousands of records per language (≈5 100 rows each as of 2025‑10‑14). The connector honours maxEntriesPerFetch, so leave it low for steady‑state operation and raise it for planned backfills.
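As a rough planning aid, the effect of maxEntriesPerFetch on backfill length can be estimated with simple arithmetic. The sketch below assumes each fetch run ingests at most maxEntriesPerFetch new entries per language feed, which is an interpretation of the option rather than documented behaviour; the function name is hypothetical.

```python
import math


def fetch_runs_needed(total_rows: int, max_entries_per_fetch: int) -> int:
    """Estimate how many fetch runs it takes to walk one language feed.

    Assumption: one run consumes at most `max_entries_per_fetch` entries.
    """
    if max_entries_per_fetch <= 0:
        raise ValueError("max_entries_per_fetch must be positive")
    return math.ceil(total_rows / max_entries_per_fetch)


ROWS_PER_LANGUAGE = 5100  # approximate figure from the note above (2025-10-14)

print(fetch_runs_needed(ROWS_PER_LANGUAGE, 80))   # steady-state setting: 64 runs
print(fetch_runs_needed(ROWS_PER_LANGUAGE, 500))  # backfill setting: 11 runs
```

Under that assumption, raising the value from 80 to 500 cuts a full walk of one feed from about 64 runs to about 11.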
2. Telemetry & Logging
- Metrics (Meter StellaOps.Concelier.Connector.Cccs):
  - cccs.fetch.attempts, cccs.fetch.success, cccs.fetch.failures
  - cccs.fetch.documents, cccs.fetch.unchanged
  - cccs.parse.success, cccs.parse.failures, cccs.parse.quarantine
  - cccs.map.success, cccs.map.failures
- Shared HTTP metrics via SourceDiagnostics:
  - concelier.source.http.requests{concelier.source="cccs"}
  - concelier.source.http.failures{concelier.source="cccs"}
  - concelier.source.http.duration{concelier.source="cccs"}
- Structured logs
  - CCCS fetch completed feeds=… items=… newDocuments=… pendingDocuments=…
  - CCCS parse completed parsed=… failures=…
  - CCCS map completed mapped=… failures=…
  - Warnings fire when GridFS payloads/DTOs go missing or parser sanitisation fails.
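When scanning worker output by hand or in a script, the completion lines can be turned into structured counters. This is a minimal sketch, assuming the fields are rendered as space-separated key=value pairs with integer values, as the log templates above suggest; the sample numbers are invented for illustration and the function name is hypothetical.

```python
import re

_STAGE = re.compile(r"CCCS (fetch|parse|map) completed")
_PAIR = re.compile(r"(\w+)=(\d+)")


def parse_completion_line(line: str):
    """Return (stage, {field: int}) for a completion line, or None otherwise."""
    match = _STAGE.search(line)
    if match is None:
        return None
    # Only look at the text after the "completed" marker.
    fields = {key: int(value) for key, value in _PAIR.findall(line[match.end():])}
    return match.group(1), fields


# Hypothetical sample line; real values will differ.
sample = "CCCS fetch completed feeds=2 items=80 newDocuments=5 pendingDocuments=12"
stage, fields = parse_completion_line(sample)
print(stage, fields["pendingDocuments"])
```

A backfill script could poll these lines and stop once pendingDocuments reaches zero.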
Suggested Grafana alerts:
- increase(cccs.fetch.failures_total[15m]) > 0
- rate(cccs.map.success_total[1h]) == 0 while other connectors are active
- histogram_quantile(0.95, rate(concelier_source_http_duration_bucket{concelier_source="cccs"}[1h])) > 5s
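The alert expressions mix dotted instrument names with Prometheus-style suffixes. If the meters are scraped through an OpenTelemetry Prometheus exporter (an assumption; the export path is not described in this runbook), dots typically become underscores and monotonic counters gain a _total suffix, as the histogram expression already shows. The helper below is a hypothetical illustration of that convention, not part of Concelier, and it ignores unit suffixes that some exporters add.

```python
def prometheus_name(instrument: str, is_counter: bool = True) -> str:
    """Approximate the Prometheus series name for a dotted instrument name."""
    name = instrument.replace(".", "_")
    if is_counter and not name.endswith("_total"):
        name += "_total"
    return name


print(prometheus_name("cccs.fetch.failures"))
print(prometheus_name("concelier.source.http.duration", is_counter=False))
```

Check the names your scrape target actually exposes before pasting an expression into an alert rule.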
3. Historical Backfill Plan
- Snapshot the source – the API accepts page=<n> and lang=<en|fr> query parameters. page=0 returns the full dataset (observed earliest date_created: 2018‑06‑08 for EN, 2018‑06‑08 for FR). Mirror those responses into Offline Kit storage when operating air‑gapped.
- Stage ingestion:
  - Temporarily raise maxEntriesPerFetch (e.g. 500) and restart Concelier workers.
  - Run chained jobs until pendingDocuments drains:
    stella db jobs run source:cccs:fetch --and-then source:cccs:parse --and-then source:cccs:map
  - Monitor cccs.fetch.unchanged growth; once it approaches dataset size the backfill is complete.
- Optional pagination sweep – for incremental mirrors, iterate page=<n> (0…N) while response.Count == 50, persisting JSON to disk. Store each page alongside its metadata (language, page, SHA256) so repeated runs detect drift.
- Language split – keep EN/FR payloads separate to preserve canonical language fields. The connector emits Language directly from the feed entry, so mixed ingestion simply produces parallel advisories keyed by the same serial number.
- Throttle planning – schedule backfills during maintenance windows. The API tolerates burst downloads, but keep the 250 ms request delay, or raise it if mirrored traffic is not available.
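The optional pagination sweep can be sketched as follows. This is a sketch under stated assumptions, not the connector's code: fetch_page is a hypothetical callable that returns the list of entries for one language and page, the file naming is invented, and a page shorter than 50 entries is treated as the last one.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List

PAGE_SIZE = 50  # a full page, per the response.Count == 50 stop condition


def sweep(fetch_page: Callable[[str, int], List[dict]],
          language: str, out_dir: Path) -> List[Dict]:
    """Persist pages 0..N for one language and return a manifest of hashes."""
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest: List[Dict] = []
    page = 0
    while True:
        entries = fetch_page(language, page)
        # Canonical serialisation so identical content hashes identically.
        body = json.dumps(entries, ensure_ascii=False, sort_keys=True).encode("utf-8")
        path = out_dir / f"cccs-{language}-page{page:04d}.json"  # hypothetical naming
        path.write_bytes(body)
        manifest.append({
            "language": language,
            "page": page,
            "sha256": hashlib.sha256(body).hexdigest(),
            "file": path.name,
        })
        if len(entries) != PAGE_SIZE:  # short (or empty) page: nothing further
            break
        page += 1
    manifest_path = out_dir / f"cccs-{language}-manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest


def drifted_pages(previous: List[Dict], current: List[Dict]) -> List[int]:
    """Pages whose hash changed, or which are new, since the previous sweep."""
    old = {(m["language"], m["page"]): m["sha256"] for m in previous}
    return [m["page"] for m in current
            if old.get((m["language"], m["page"])) != m["sha256"]]
```

In practice fetch_page would wrap an HTTP GET against the feed URI with the page and lang parameters, sleeping for the configured request delay between calls. Running one sweep per language into separate directories keeps the EN and FR payloads apart.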
4. Selector & Sanitiser Notes
- CccsHtmlParser now parses the unsanitised DOM (via AngleSharp) and only sanitises when persisting ContentHtml.
- Product extraction walks headings (Affected Products, Produits touchés, Mesures recommandées) and consumes nested lists within div/section/article containers.
- HtmlContentSanitizer allows <h1>…<h6> and <section> so stored HTML keeps headings for UI rendering and downstream summarisation.
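The heading walk can be illustrated outside the connector. The sketch below is a Python approximation built on the standard library's html.parser, not the AngleSharp-based CccsHtmlParser; the class and function names are hypothetical. It collects list items that follow one of the recognised headings, keeps nested items in document order, and stops at the next heading.

```python
from html.parser import HTMLParser

HEADINGS = {"affected products", "produits touchés", "mesures recommandées"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class ProductListExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.items = []            # one slot per <li>, filled when it closes
        self._heading_buf = None   # text parts of the heading being read
        self._collecting = False   # True while inside a recognised section
        self._open_items = []      # stack of (slot index, text parts)

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._heading_buf = []
            self._collecting = False  # any new heading ends the section
        elif tag == "li" and self._collecting:
            self.items.append("")     # reserve a slot to keep document order
            self._open_items.append((len(self.items) - 1, []))

    def handle_data(self, data):
        if self._heading_buf is not None:
            self._heading_buf.append(data)
        elif self._open_items:
            self._open_items[-1][1].append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._heading_buf is not None:
            title = " ".join("".join(self._heading_buf).split()).lower()
            self._collecting = title in HEADINGS
            self._heading_buf = None
        elif tag == "li" and self._open_items:
            slot, parts = self._open_items.pop()
            self.items[slot] = " ".join("".join(parts).split())


def extract_products(html: str) -> list:
    parser = ProductListExtractor()
    parser.feed(html)
    parser.close()
    return [item for item in parser.items if item]
```

Container elements such as div, section, and article need no special handling here because the parser only reacts to headings and list items.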
5. Fixture Maintenance
- Regression fixtures live in src/StellaOps.Concelier.Connector.Cccs.Tests/Fixtures.
- Refresh via UPDATE_CCCS_FIXTURES=1 dotnet test src/StellaOps.Concelier.Connector.Cccs.Tests/StellaOps.Concelier.Connector.Cccs.Tests.csproj.
- Fixtures capture both EN/FR advisories with nested lists to guard against sanitiser regressions; review diffs for heading/list changes before committing.
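The refresh switch follows a familiar snapshot-testing shape: with the variable set, the test rewrites the stored fixture; without it, the test compares against the stored copy. The test project's own implementation is not shown in this runbook, so the Python helper below is a generic, hypothetical sketch of the pattern rather than a port of it.

```python
import os
from pathlib import Path


def check_fixture(path: Path, actual: str,
                  env_var: str = "UPDATE_CCCS_FIXTURES") -> bool:
    """Return True when `actual` matches the fixture, or when it was refreshed."""
    if os.environ.get(env_var) == "1":
        # Refresh mode: overwrite the fixture with the current output.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(actual, encoding="utf-8")
        return True
    # Compare mode: a missing fixture counts as a mismatch.
    return path.exists() and path.read_text(encoding="utf-8") == actual
```

Because refresh mode always passes, a refreshed fixture only protects against regressions once its diff has been reviewed and committed.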