Feedser CCCS Connector Operations

This runbook covers day-to-day operation of the Canadian Centre for Cyber Security (source:cccs:*) connector, including configuration, telemetry, and historical backfill guidance for English/French advisories.

1. Configuration Checklist

  • Allow network egress (or provide a mirrored cache) for https://www.cyber.gc.ca/ and the JSON API endpoints under /api/cccs/.
  • Set the Feedser options before restarting workers. Example feedser.yaml snippet:
feedser:
  sources:
    cccs:
      feeds:
        - language: "en"
          uri: "https://www.cyber.gc.ca/api/cccs/threats/v1/get?lang=en&content_type=cccs_threat"
        - language: "fr"
          uri: "https://www.cyber.gc.ca/api/cccs/threats/v1/get?lang=fr&content_type=cccs_threat"
      maxEntriesPerFetch: 80        # increase temporarily for backfill runs
      maxKnownEntries: 512
      requestTimeout: "00:00:30"
      requestDelay: "00:00:00.250"
      failureBackoff: "00:05:00"

The /api/cccs/threats/v1/get endpoint returns thousands of records per language (≈5100 rows each as of 2025-10-14). The connector honours maxEntriesPerFetch, so leave it low for steady-state operation and raise it for planned backfills.
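
As a rough planning aid, the number of fetch runs needed to drain one language feed can be estimated from the dataset size and maxEntriesPerFetch. The sketch below assumes every run ingests at most maxEntriesPerFetch new entries per feed, which this runbook does not state explicitly; estimate_fetch_runs is a hypothetical helper, not part of Feedser.

```python
import math


def estimate_fetch_runs(dataset_rows: int, max_entries_per_fetch: int) -> int:
    """Estimate how many fetch runs are needed to drain one language feed.

    Assumption: each run ingests up to max_entries_per_fetch new entries.
    """
    if dataset_rows < 0:
        raise ValueError("dataset_rows must be non-negative")
    if max_entries_per_fetch <= 0:
        raise ValueError("max_entries_per_fetch must be positive")
    return math.ceil(dataset_rows / max_entries_per_fetch)


# Approximate per-language dataset size quoted above.
ROWS_PER_LANGUAGE = 5100

steady_state_runs = estimate_fetch_runs(ROWS_PER_LANGUAGE, 80)   # default setting
backfill_runs = estimate_fetch_runs(ROWS_PER_LANGUAGE, 500)      # raised for backfill
print(steady_state_runs, backfill_runs)  # 64 11
```

Under that assumption, the default setting would need dozens of runs per language, which is why raising the limit for a planned backfill is worthwhile.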

2. Telemetry & Logging

  • Metrics (Meter StellaOps.Feedser.Source.Cccs):
    • cccs.fetch.attempts, cccs.fetch.success, cccs.fetch.failures
    • cccs.fetch.documents, cccs.fetch.unchanged
    • cccs.parse.success, cccs.parse.failures, cccs.parse.quarantine
    • cccs.map.success, cccs.map.failures
  • Shared HTTP metrics via SourceDiagnostics:
    • feedser.source.http.requests{feedser.source="cccs"}
    • feedser.source.http.failures{feedser.source="cccs"}
    • feedser.source.http.duration{feedser.source="cccs"}
  • Structured logs:
    • CCCS fetch completed feeds=… items=… newDocuments=… pendingDocuments=…
    • CCCS parse completed parsed=… failures=…
    • CCCS map completed mapped=… failures=…
    • Warnings fire when GridFS payloads/DTOs go missing or parser sanitisation fails.

Suggested Grafana alerts:

  • increase(cccs_fetch_failures_total[15m]) > 0
  • rate(cccs_map_success_total[1h]) == 0 while other connectors are active
  • histogram_quantile(0.95, rate(feedser_source_http_duration_bucket{feedser_source="cccs"}[1h])) > 5s
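
The meter publishes dotted instrument names, whereas PromQL expressions conventionally use underscore-separated names, as in the histogram alert above. A minimal sketch of that translation follows. It assumes the metrics reach Prometheus through an exporter that replaces unsupported characters with underscores and appends _total to counters; the exact naming depends on the exporter in use, and to_prometheus_name is a hypothetical helper.

```python
import re


def to_prometheus_name(instrument: str, is_counter: bool = False) -> str:
    """Map a dotted instrument or label name to a Prometheus-style name.

    Assumption: the exporter replaces characters outside [a-zA-Z0-9_:]
    with underscores and adds a `_total` suffix to counters.
    """
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", instrument)
    if is_counter and not name.endswith("_total"):
        name += "_total"
    return name


print(to_prometheus_name("cccs.fetch.failures", is_counter=True))
print(to_prometheus_name("feedser.source.http.duration"))
print(to_prometheus_name("feedser.source"))  # label keys follow the same rule
```

Checking the names actually exposed on the scrape endpoint before wiring alerts avoids rules that silently match nothing.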

3. Historical Backfill Plan

  1. Snapshot the source: the API accepts page=<n> and lang=<en|fr> query parameters. page=0 returns the full dataset (observed earliest date_created: 2018-06-08 for EN, 2018-06-08 for FR). Mirror those responses into Offline Kit storage when operating air-gapped.
  2. Stage ingestion:
    • Temporarily raise maxEntriesPerFetch (e.g. 500) and restart Feedser workers.
    • Run chained jobs until pendingDocuments drains:
      stella db jobs run source:cccs:fetch --and-then source:cccs:parse --and-then source:cccs:map
    • Monitor cccs.fetch.unchanged growth; once it approaches dataset size the backfill is complete.
  3. Optional pagination sweep: for incremental mirrors, iterate page=<n> (0…N) while response.Count == 50, persisting JSON to disk. Store each page alongside metadata (language, page, SHA-256) so repeated runs detect drift.
  4. Language split: keep EN/FR payloads separate to preserve canonical language fields. The connector emits Language directly from the feed entry, so mixed ingestion simply produces parallel advisories keyed by the same serial number.
  5. Throttle planning: schedule backfills during maintenance windows. The API tolerates burst downloads, but respect the 250 ms request delay, or raise it if mirrored traffic is not available.
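
The pagination sweep and drift check from the plan above can be sketched as follows. This is an illustration under stated assumptions, not connector code: fetch_page is a hypothetical caller-supplied function that returns the decoded list of entries for one language and page, and the file naming and manifest layout are invented for the example.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Callable, List

PAGE_SIZE = 50  # a full page, per the pagination step above


def sweep_pages(
    fetch_page: Callable[[str, int], list],
    language: str,
    out_dir: Path,
    delay_seconds: float = 0.25,
) -> List[dict]:
    """Persist each page as JSON and return one metadata record per page."""
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = []
    page = 0
    while True:
        entries = fetch_page(language, page)
        # Canonical serialisation so identical content hashes identically.
        body = json.dumps(entries, sort_keys=True, ensure_ascii=False).encode("utf-8")
        (out_dir / f"cccs-{language}-page-{page:04d}.json").write_bytes(body)
        manifest.append({
            "language": language,
            "page": page,
            "count": len(entries),
            "sha256": hashlib.sha256(body).hexdigest(),
        })
        if len(entries) != PAGE_SIZE:
            break  # anything other than a full page ends the sweep
        page += 1
        time.sleep(delay_seconds)  # respect the 250 ms request delay
    (out_dir / f"cccs-{language}-manifest.json").write_text(
        json.dumps(manifest, indent=2), encoding="utf-8"
    )
    return manifest


def changed_pages(old: List[dict], new: List[dict]) -> List[dict]:
    """Return records from `new` whose hash differs from, or is absent in, `old`."""
    old_hashes = {(r["language"], r["page"]): r["sha256"] for r in old}
    return [r for r in new if old_hashes.get((r["language"], r["page"])) != r["sha256"]]
```

In practice fetch_page would wrap an HTTP GET against the feed URI with lang and page set. It is left abstract here so the sweep logic can be exercised offline, and EN and FR sweeps should write to separate manifests in line with the language split.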

4. Selector & Sanitiser Notes

  • CccsHtmlParser now parses the unsanitised DOM (via AngleSharp) and only sanitises when persisting ContentHtml.
  • Product extraction walks headings (Affected Products, Produits touchés, Mesures recommandées) and consumes nested lists within div/section/article containers.
  • HtmlContentSanitizer allows <h1>…<h6> and <section> so stored HTML keeps headings for UI rendering and downstream summarisation.
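
The heading walk described above can be illustrated with a small standard-library analogue. CccsHtmlParser itself is C# on AngleSharp, so the sketch below is only an approximation of the idea: it assumes well-formed markup with closed <li> elements, and the class name, function name, and sample vendors are all invented for the example.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
# Heading labels quoted in the notes above, lower-cased for comparison.
PRODUCT_HEADINGS = {"affected products", "produits touchés", "mesures recommandées"}


class ProductListParser(HTMLParser):
    """Collect <li> text that follows a recognised product heading."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.products = []          # slots filled in document order
        self._heading_text = None   # buffer while inside <h1>..<h6>
        self._capturing = False     # True after a matching heading
        self._li_stack = []         # (slot index, text buffer) per open <li>

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._heading_text = []
        elif tag == "li" and self._capturing:
            self.products.append(None)  # reserve a slot to keep document order
            self._li_stack.append((len(self.products) - 1, []))

    def handle_data(self, data):
        if self._heading_text is not None:
            self._heading_text.append(data)
        elif self._li_stack:
            self._li_stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._heading_text is not None:
            label = " ".join("".join(self._heading_text).split()).lower()
            self._capturing = label in PRODUCT_HEADINGS
            self._heading_text = None
        elif tag == "li" and self._li_stack:
            index, buffer = self._li_stack.pop()
            self.products[index] = " ".join("".join(buffer).split())


def extract_products(html: str) -> list:
    parser = ProductListParser()
    parser.feed(html)
    parser.close()
    return [p for p in parser.products if p]


sample = """
<section>
  <h2>Affected Products</h2>
  <div><ul>
    <li>Vendor A<ul><li>Product 1</li><li>Product 2</li></ul></li>
    <li>Vendor B Gateway</li>
  </ul></div>
  <h2>References</h2>
  <ul><li>Not a product</li></ul>
</section>
"""
print(extract_products(sample))
```

Because container tags such as div and section are simply ignored, nested lists inside them are consumed the same way as top-level ones, and any later heading that is not in the recognised set ends the capture.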

5. Fixture Maintenance

  • Regression fixtures live in src/StellaOps.Feedser.Source.Cccs.Tests/Fixtures.
  • Refresh via UPDATE_CCCS_FIXTURES=1 dotnet test src/StellaOps.Feedser.Source.Cccs.Tests/StellaOps.Feedser.Source.Cccs.Tests.csproj.
  • Fixtures capture both EN/FR advisories with nested lists to guard against sanitiser regressions; review diffs for heading/list changes before committing.
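
The refresh switch resembles the common golden-file pattern, sketched below in Python for illustration. The real tests are .NET and may work differently. The sketch assumes the harness compares generated output with stored fixture text and rewrites the fixture only when UPDATE_CCCS_FIXTURES is set to 1. check_fixture is a hypothetical helper.

```python
import os
from pathlib import Path


def check_fixture(path: Path, actual: str, env_var: str = "UPDATE_CCCS_FIXTURES") -> bool:
    """Return True when the fixture matches; rewrite it when env_var is "1"."""
    if os.environ.get(env_var) == "1":
        # Refresh mode: overwrite the stored fixture with the new output.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(actual, encoding="utf-8")
        return True
    # Normal mode: the fixture must exist and match exactly.
    return path.exists() and path.read_text(encoding="utf-8") == actual
```

Since refresh mode always passes, the review of fixture diffs before committing is the only safeguard against recording a regression as the new baseline.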