
Feedser CVE & KEV Connector Operations

This playbook equips operators with the steps required to roll out and monitor the CVE Services and CISA KEV connectors across environments.

1. CVE Services Connector (source:cve:*)

1.1 Prerequisites

  • CVE Services API credentials (organisation ID, user ID, API key) with access to the JSON 5 API.
  • Network egress to https://cveawg.mitre.org (or a mirrored endpoint) from the Feedser workers.
  • Updated feedser.yaml (or the matching environment variables) with the following section:
feedser:
  sources:
    cve:
      baseEndpoint: "https://cveawg.mitre.org/api/"
      apiOrg: "ORG123"
      apiUser: "user@example.org"
      apiKeyFile: "/var/run/secrets/feedser/cve-api-key"
      pageSize: 200
      maxPagesPerFetch: 5
      initialBackfill: "30.00:00:00"
      requestDelay: "00:00:00.250"
      failureBackoff: "00:10:00"

Store the API key outside source control. When using apiKeyFile, mount the secret file into the container/host; alternatively supply apiKey via FEEDSER_SOURCES__CVE__APIKEY.
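
Before restarting Feedser, an operator can check that a key is actually reachable through one of the two routes above. The following is a minimal pre-flight sketch, not Feedser's own loader: the function name is hypothetical, and the file-before-environment precedence is an assumption made only for this check.

```python
import os
from pathlib import Path

ENV_VAR = "FEEDSER_SOURCES__CVE__APIKEY"  # variable name taken from this playbook


def resolve_cve_api_key(api_key_file, env=None):
    """Return the first non-empty key found, or None if neither route yields one.

    Assumption: the mounted file is consulted first, then the environment.
    Feedser's real precedence may differ.
    """
    env = os.environ if env is None else env
    if api_key_file:
        path = Path(api_key_file)
        if path.is_file():
            key = path.read_text(encoding="utf-8").strip()
            if key:
                return key
    key = env.get(ENV_VAR, "").strip()
    return key or None


if __name__ == "__main__":
    found = resolve_cve_api_key("/var/run/secrets/feedser/cve-api-key")
    # Report only presence; never print the key itself.
    print("CVE API key available" if found else "CVE API key missing")
```

Run it on the worker host or inside the container so the check sees the same mounts and environment as the service.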

1.2 Smoke Test (staging)

  1. Deploy the updated configuration and restart the Feedser service so the connector picks up the credentials.
  2. Trigger one end-to-end cycle:
    • Feedser CLI: stella db jobs run source:cve:fetch --and-then source:cve:parse --and-then source:cve:map
    • REST fallback: POST /jobs/run { "kind": "source:cve:fetch", "chain": ["source:cve:parse", "source:cve:map"] }
  3. Observe the following metrics (exported via OTEL meter StellaOps.Feedser.Source.Cve):
    • cve.fetch.attempts, cve.fetch.success, cve.fetch.documents, cve.fetch.failures, cve.fetch.unchanged
    • cve.parse.success, cve.parse.failures, cve.parse.quarantine
    • cve.map.success
  4. Verify Prometheus shows matching feedser.source.http.requests_total{feedser_source="cve"} deltas (list vs detail phases) while feedser.source.http.failures_total{feedser_source="cve"} stays flat.
  5. Confirm the info-level summary log CVEs fetch window … pages=X detailDocuments=Y detailFailures=Z appears once per fetch run and shows detailFailures=0.
  6. Verify the MongoDB advisory store contains fresh CVE advisories (advisoryKey prefix cve/) and that the source cursor (source_states collection) advanced.
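
The two trigger forms in step 2 follow the same fetch, parse, map pattern, so they can be generated rather than typed by hand. This sketch only builds the CLI argument list and the REST body shown above; the helper names are hypothetical, and it does not send anything.

```python
import json

STAGES = ("fetch", "parse", "map")


def job_kinds(source):
    """Return the job kinds for one connector, e.g. source:cve:fetch."""
    return [f"source:{source}:{stage}" for stage in STAGES]


def build_cli_args(source):
    """Argument list matching the `stella db jobs run` form in this playbook."""
    first, *rest = job_kinds(source)
    args = ["stella", "db", "jobs", "run", first]
    for kind in rest:
        args += ["--and-then", kind]
    return args


def build_rest_body(source):
    """JSON body for POST /jobs/run as documented in this playbook."""
    first, *rest = job_kinds(source)
    return {"kind": first, "chain": rest}


if __name__ == "__main__":
    print(" ".join(build_cli_args("cve")))
    print(json.dumps(build_rest_body("cve")))
```

Passing "kev" instead of "cve" yields the equivalent commands for the KEV connector.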

1.3 Production Monitoring

  • Dashboards: Plot rate(cve_fetch_success_total[5m]), rate(cve_fetch_failures_total[5m]), and rate(cve_fetch_documents_total[5m]) alongside feedser_source_http_requests_total{feedser_source="cve"} to confirm HTTP and connector counters stay aligned. Keep feedser.range.primitives{scheme=~"semver|vendor"} on the same board for range coverage. Example alerts:
    • rate(cve_fetch_failures_total[5m]) > 0 for 10 minutes (severity=warning)
    • rate(cve_map_success_total[15m]) == 0 while rate(cve_fetch_success_total[15m]) > 0 (severity=critical)
    • sum_over_time(cve_parse_quarantine_total[1h]) > 0 to catch schema anomalies
  • Logs: Monitor warnings such as Failed fetching CVE record {CveId} and Malformed CVE JSON, and surface the summary info log CVEs fetch window … detailFailures=0 detailUnchanged=0 on dashboards. A non-zero detailFailures usually indicates rate-limit or auth issues on detail requests.
  • Grafana pack: Import docs/ops/feedser-cve-kev-grafana-dashboard.json and filter by panel legend (CVE, KEV) to reuse the canned layout.
  • Backfill window: Operators can tighten or widen initialBackfill / maxPagesPerFetch after validating throughput. Update the config and restart Feedser to apply changes.
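
When tuning the backfill settings, it helps to turn the configured values into a rough per-run budget. The sketch below assumes the duration strings follow the .NET TimeSpan constant format ([d.]hh:mm:ss[.fffffff]) and, as a further assumption, that each listed record costs one detail request separated by requestDelay. Neither point is confirmed by this playbook, so treat the result as an order-of-magnitude estimate.

```python
import re
from datetime import timedelta

# Assumed format: optional "d." prefix, hh:mm:ss, optional fractional seconds.
_TIMESPAN = re.compile(
    r"^(?:(?P<days>\d+)\.)?(?P<h>\d{1,2}):(?P<m>\d{2}):(?P<s>\d{2})(?:\.(?P<frac>\d{1,7}))?$"
)


def parse_timespan(value):
    """Convert a string such as "30.00:00:00" or "00:00:00.250" to a timedelta."""
    match = _TIMESPAN.match(value)
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    frac = match.group("frac") or ""
    # Pad to 7 digits (100 ns ticks), then keep microsecond precision.
    micros = int(frac.ljust(7, "0")[:6]) if frac else 0
    return timedelta(
        days=int(match.group("days") or 0),
        hours=int(match.group("h")),
        minutes=int(match.group("m")),
        seconds=int(match.group("s")),
        microseconds=micros,
    )


def fetch_budget(page_size, max_pages, request_delay):
    """Upper bound on records per run and the delay spent if each needs one request."""
    max_records = page_size * max_pages
    return max_records, request_delay * max_records


if __name__ == "__main__":
    records, delay = fetch_budget(200, 5, parse_timespan("00:00:00.250"))
    print(f"up to {records} records per run, at least {delay} spent in request delays")
    print(f"initial backfill window: {parse_timespan('30.00:00:00').days} days")
```

With the sample configuration from section 1.1 (pageSize 200, maxPagesPerFetch 5), this gives at most 1000 records per run under the stated assumptions.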

2. CISA KEV Connector (source:kev:*)

2.1 Prerequisites

  • Network egress (or mirrored content) for https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json.
  • No credentials are required, but the HTTP allow-list must include www.cisa.gov.
  • Confirm the following snippet in feedser.yaml (defaults shown; tune as needed):
feedser:
  sources:
    kev:
      feedUri: "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
      requestTimeout: "00:01:00"
      failureBackoff: "00:05:00"

2.2 Schema validation & anomaly handling

The connector validates each catalog against Schemas/kev-catalog.schema.json. Failures increment kev.parse.failures_total{reason="schema"} and the document is quarantined (status Failed). Additional failure reasons include download, invalidJson, deserialize, missingPayload, and emptyCatalog. Entry-level anomalies are surfaced through kev.parse.anomalies_total with reasons:

| Reason | Meaning |
| --- | --- |
| missingCveId | Catalog entry omitted cveID; the entry is skipped. |
| countMismatch | Catalog count field disagreed with the actual entry total. |
| nullEntry | Upstream emitted a null entry object (rare upstream defect). |

Treat repeated schema failures or growing anomaly counts as an upstream regression and coordinate with CISA or mirror maintainers.
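
To reproduce an anomaly count against a downloaded catalog, the three entry-level reasons can be approximated offline. This is an illustrative sketch rather than the connector's implementation: it assumes the entries live under a top-level vulnerabilities array next to the count field, and that null entries are included when comparing totals.

```python
from collections import Counter


def classify_anomalies(catalog):
    """Count entry-level anomalies using the reason names from the table above."""
    anomalies = Counter()
    # Assumption: entries are stored under the "vulnerabilities" key.
    entries = catalog.get("vulnerabilities") or []
    for entry in entries:
        if entry is None:
            anomalies["nullEntry"] += 1
        elif not entry.get("cveID"):
            anomalies["missingCveId"] += 1
    declared = catalog.get("count")
    # Assumption: the declared count is compared with the raw entry total.
    if declared is not None and declared != len(entries):
        anomalies["countMismatch"] += 1
    return anomalies


if __name__ == "__main__":
    # Hand-made sample data; the CVE ID is a placeholder, not a real catalog entry.
    sample = {
        "count": 3,
        "vulnerabilities": [{"cveID": "CVE-2024-0001"}, None, {"vendorProject": "x"}, {"cveID": ""}],
    }
    print(dict(classify_anomalies(sample)))
```

Comparing these offline counts with kev.parse.anomalies_total for the same catalog version can help confirm whether a spike originates upstream or in the mirror.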

2.3 Smoke Test (staging)

  1. Deploy the configuration and restart Feedser.
  2. Trigger a pipeline run:
    • CLI: stella db jobs run source:kev:fetch --and-then source:kev:parse --and-then source:kev:map
    • REST: POST /jobs/run { "kind": "source:kev:fetch", "chain": ["source:kev:parse", "source:kev:map"] }
  3. Verify the metrics exposed by meter StellaOps.Feedser.Source.Kev:
    • kev.fetch.attempts, kev.fetch.success, kev.fetch.unchanged, kev.fetch.failures
    • kev.parse.entries (tag catalogVersion), kev.parse.failures, kev.parse.anomalies (tag reason)
    • kev.map.advisories (tag catalogVersion)
  4. Confirm feedser.source.http.requests_total{feedser_source="kev"} increments once per fetch and that the paired feedser.source.http.failures_total stays flat (zero increase).
  5. Inspect the info logs Fetched KEV catalog document … pendingDocuments=… and Parsed KEV catalog document … entries=…; each should appear exactly once per run, and the Mapped X/Y… skipped=0 line should match the kev.map.advisories delta.
  6. Confirm MongoDB documents exist for the catalog JSON (raw_documents & dtos) and that advisories with prefix kev/ are written.
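
Steps 3 and 4 amount to comparing counter values before and after the run. A minimal sketch of that comparison follows. How the snapshots are collected (Prometheus query, OTEL export, or manual reading) is left open, and the helper name is hypothetical.

```python
def check_deltas(before, after, must_increase=(), must_stay_flat=()):
    """Return a list of problems; an empty list means the smoke test expectations held."""
    problems = []
    for name in must_increase:
        delta = after.get(name, 0) - before.get(name, 0)
        if delta <= 0:
            problems.append(f"{name}: expected an increase, saw {delta}")
    for name in must_stay_flat:
        delta = after.get(name, 0) - before.get(name, 0)
        if delta != 0:
            problems.append(f"{name}: expected no change, saw {delta}")
    return problems


if __name__ == "__main__":
    # Illustrative values only; substitute the readings taken around the run.
    before = {"kev.fetch.attempts": 4, "kev.fetch.failures": 1}
    after = {"kev.fetch.attempts": 5, "kev.fetch.failures": 1}
    issues = check_deltas(
        before,
        after,
        must_increase=["kev.fetch.attempts"],
        must_stay_flat=["kev.fetch.failures"],
    )
    print("smoke test counters OK" if not issues else "\n".join(issues))
```

The same helper works for the CVE connector by swapping in the cve.* counter names from section 1.2.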

2.4 Production Monitoring

  • Alert when rate(kev_fetch_success_total[8h]) == 0 during working hours (daily cadence breach) and when increase(kev_fetch_failures_total[1h]) > 0.
  • Page the on-call if increase(kev_parse_failures_total{reason="schema"}[6h]) > 0; this usually signals an upstream payload change. Treat repeated reason="download" spikes as network problems reaching the feed or its mirror.
  • Track anomaly spikes through sum_over_time(kev_parse_anomalies_total{reason="missingCveId"}[24h]). Rising countMismatch trends point to catalog publishing bugs.
  • Surface the fetch/mapping info logs (Fetched KEV catalog document … and Mapped X/Y KEV advisories … skipped=S) on dashboards; absence of those logs while metrics show success typically means schema validation short-circuited the run.
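
If the log pipeline cannot extract fields natively, the mapping summary can be reduced to numbers with a small parser. The sketch below relies only on the Mapped X/Y and skipped=S fragments quoted above and ignores the rest of the line, whose exact wording is not given here. Reading X as advisories mapped and Y as candidates is an assumption.

```python
import re

_MAPPED = re.compile(r"Mapped (\d+)/(\d+)")
_SKIPPED = re.compile(r"\bskipped=(\d+)")


def parse_mapped_line(line):
    """Extract the counts from a mapping summary line, or return None if absent."""
    mapped = _MAPPED.search(line)
    skipped = _SKIPPED.search(line)
    if mapped is None or skipped is None:
        return None
    return {
        "mapped": int(mapped.group(1)),      # assumed meaning of X
        "candidates": int(mapped.group(2)),  # assumed meaning of Y
        "skipped": int(skipped.group(1)),
    }


if __name__ == "__main__":
    # Sample line built from the fragments in this playbook; the middle part is elided.
    print(parse_mapped_line("Mapped 12/12 KEV advisories … skipped=0"))
```

A dashboard rule can then flag runs where skipped is non-zero or where no summary line was parsed at all.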

2.5 Known good dashboard tiles

Add the following panels to the Feedser observability board:

| Metric | Recommended visualisation |
| --- | --- |
| rate(kev_fetch_success_total[30m]) | Single-stat (last 24h) with warning threshold >0 |
| rate(kev_parse_entries_total[1h]) by catalogVersion | Stacked area; highlights daily release size |
| sum_over_time(kev_parse_anomalies_total[1d]) by reason | Table; anomaly breakdown (matches dashboard panel) |
| rate(cve_map_success_total[15m]) vs rate(kev_map_advisories_total[24h]) | Comparative timeseries for advisories emitted |

3. Runbook updates

  • Record staging/production smoke test results (date, catalog version, advisory counts) in your team's change log.
  • Add the CVE/KEV job kinds to the standard maintenance checklist so operators can manually trigger them after planned downtime.
  • Keep this document in sync with future connector changes (for example, new anomaly reasons or additional metrics).
  • Version-control dashboard tweaks alongside docs/ops/feedser-cve-kev-grafana-dashboard.json so operations can re-import the observability pack during restores.