Files
git.stella-ops.org/docs/ops/feedser-cve-kev-operations.md
master b97fc7685a
Some checks failed
Build Test Deploy / authority-container (push) Has been cancelled
Build Test Deploy / docs (push) Has been cancelled
Build Test Deploy / deploy (push) Has been cancelled
Build Test Deploy / build-test (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Initial commit (history squashed)
2025-10-11 23:28:35 +03:00

105 lines
5.8 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Feedser CVE & KEV Connector Operations
This playbook equips operators with the steps required to roll out and monitor the CVE Services and CISA KEV connectors across environments.
## 1. CVE Services Connector (`source:cve:*`)
### 1.1 Prerequisites
- CVE Services API credentials (organisation ID, user ID, API key) with access to the JSON 5 API.
- Network egress to `https://cveawg.mitre.org` (or a mirrored endpoint) from the Feedser workers.
- Updated `feedser.yaml` (or the matching environment variables) with the following section:
```yaml
feedser:
sources:
cve:
baseEndpoint: "https://cveawg.mitre.org/api/"
apiOrg: "ORG123"
apiUser: "user@example.org"
apiKeyFile: "/var/run/secrets/feedser/cve-api-key"
pageSize: 200
maxPagesPerFetch: 5
initialBackfill: "30.00:00:00"
requestDelay: "00:00:00.250"
failureBackoff: "00:10:00"
```
> Store the API key outside source control. When using `apiKeyFile`, mount the secret file into the container/host; alternatively supply `apiKey` via `FEEDSER_SOURCES__CVE__APIKEY`.
### 1.2 Smoke Test (staging)
1. Deploy the updated configuration and restart the Feedser service so the connector picks up the credentials.
2. Trigger one end-to-end cycle:
- Feedser CLI: `stella db jobs run source:cve:fetch --and-then source:cve:parse --and-then source:cve:map`
- REST fallback: `POST /jobs/run { "kind": "source:cve:fetch", "chain": ["source:cve:parse", "source:cve:map"] }`
3. Observe the following metrics (exported via OTEL meter `StellaOps.Feedser.Source.Cve`):
- `cve.fetch.attempts`, `cve.fetch.success`, `cve.fetch.failures`, `cve.fetch.unchanged`
- `cve.parse.success`, `cve.parse.failures`, `cve.parse.quarantine`
- `cve.map.success`
4. Verify the MongoDB advisory store contains fresh CVE advisories (`advisoryKey` prefix `cve/`) and that the source cursor (`source_states` collection) advanced.
### 1.3 Production Monitoring
- **Dashboards** Add the counters above plus `feedser.range.primitives` (filtered by `scheme=semver` or `scheme=vendor`) to the Feedser overview board. Alert when:
- `rate(cve.fetch.failures[5m]) > 0`
- `rate(cve.map.success[15m]) == 0` while fetch attempts continue
- `sum_over_time(cve.parse.quarantine[1h]) > 0`
- **Logs** Watch for `CveConnector` warnings such as `Failed fetching CVE record` or schema validation errors (`Malformed CVE JSON`). These are emitted with the CVE ID and document identifier for triage.
- **Backfill window** operators can tighten or widen the `initialBackfill` / `maxPagesPerFetch` values after validating baseline throughput. Update the config and restart the worker to apply changes.
## 2. CISA KEV Connector (`source:kev:*`)
### 2.1 Prerequisites
- Network egress (or mirrored content) for `https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json`.
- No credentials are required, but the HTTP allow-list must include `www.cisa.gov`.
- Confirm the following snippet in `feedser.yaml` (defaults shown; tune as needed):
```yaml
feedser:
sources:
kev:
feedUri: "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
requestTimeout: "00:01:00"
failureBackoff: "00:05:00"
```
### 2.2 Schema validation & anomaly handling
From this sprint the connector validates the KEV JSON payload against `Schemas/kev-catalog.schema.json`. Malformed documents are quarantined, and entries missing a CVE ID are dropped with a warning (`reason=missingCveId`). Operators should treat repeated schema failures as an upstream regression and coordinate with CISA or mirror maintainers.
### 2.3 Smoke Test (staging)
1. Deploy the configuration and restart Feedser.
2. Trigger a pipeline run:
- CLI: `stella db jobs run source:kev:fetch --and-then source:kev:parse --and-then source:kev:map`
- REST: `POST /jobs/run { "kind": "source:kev:fetch", "chain": ["source:kev:parse", "source:kev:map"] }`
3. Verify the metrics exposed by meter `StellaOps.Feedser.Source.Kev`:
- `kev.fetch.attempts`, `kev.fetch.success`, `kev.fetch.unchanged`, `kev.fetch.failures`
- `kev.parse.entries` (tag `catalogVersion`), `kev.parse.failures`, `kev.parse.anomalies` (tag `reason`)
- `kev.map.advisories` (tag `catalogVersion`)
4. Confirm MongoDB documents exist for the catalog JSON (`raw_documents` & `dtos`) and that advisories with prefix `kev/` are written.
### 2.4 Production Monitoring
- Alert when `kev.fetch.success` goes to zero for longer than the expected daily cadence (default: trigger if `rate(kev.fetch.success[8h]) == 0` during business hours).
- Track anomaly spikes via `kev.parse.anomalies{reason="missingCveId"}`. A sustained non-zero rate means the upstream catalog contains unexpected records.
- The connector logs each validated catalog: `Parsed KEV catalog document … entries=X`. Absence of that log alongside consecutive `kev.fetch.success` counts suggests schema validation failures—correlate with warning-level events in the `StellaOps.Feedser.Source.Kev` logger.
### 2.5 Known good dashboard tiles
Add the following panels to the Feedser observability board:
| Metric | Recommended visualisation |
|--------|---------------------------|
| `kev.fetch.success` | Single-stat (last 24h) with threshold alert |
| `rate(kev.parse.entries[1h])` by `catalogVersion` | Stacked area highlights daily release size |
| `sum_over_time(kev.parse.anomalies[1d])` by `reason` | Table anomaly breakdown |
## 3. Runbook updates
- Record staging/production smoke test results (date, catalog version, advisory counts) in your teams change log.
- Add the CVE/KEV job kinds to the standard maintenance checklist so operators can manually trigger them after planned downtime.
- Keep this document in sync with future connector changes (for example, new anomaly reasons or additional metrics).