Initial commit (history squashed)
Some checks failed
Build Test Deploy / authority-container (push) Has been cancelled
Build Test Deploy / docs (push) Has been cancelled
Build Test Deploy / deploy (push) Has been cancelled
Build Test Deploy / build-test (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled

This commit is contained in:
2025-10-07 10:14:21 +03:00
commit b97fc7685a
1132 changed files with 117842 additions and 0 deletions

View File

@@ -0,0 +1,104 @@
# Feedser CVE & KEV Connector Operations
This playbook equips operators with the steps required to roll out and monitor the CVE Services and CISA KEV connectors across environments.
## 1. CVE Services Connector (`source:cve:*`)
### 1.1 Prerequisites
- CVE Services API credentials (organisation ID, user ID, API key) with access to the JSON 5 API.
- Network egress to `https://cveawg.mitre.org` (or a mirrored endpoint) from the Feedser workers.
- Updated `feedser.yaml` (or the matching environment variables) with the following section:
```yaml
feedser:
sources:
cve:
baseEndpoint: "https://cveawg.mitre.org/api/"
apiOrg: "ORG123"
apiUser: "user@example.org"
apiKeyFile: "/var/run/secrets/feedser/cve-api-key"
pageSize: 200
maxPagesPerFetch: 5
initialBackfill: "30.00:00:00"
requestDelay: "00:00:00.250"
failureBackoff: "00:10:00"
```
> Store the API key outside source control. When using `apiKeyFile`, mount the secret file into the container/host; alternatively supply `apiKey` via `FEEDSER_SOURCES__CVE__APIKEY`.
### 1.2 Smoke Test (staging)
1. Deploy the updated configuration and restart the Feedser service so the connector picks up the credentials.
2. Trigger one end-to-end cycle:
- Feedser CLI: `stella db jobs run source:cve:fetch --and-then source:cve:parse --and-then source:cve:map`
- REST fallback: `POST /jobs/run { "kind": "source:cve:fetch", "chain": ["source:cve:parse", "source:cve:map"] }`
3. Observe the following metrics (exported via OTEL meter `StellaOps.Feedser.Source.Cve`):
- `cve.fetch.attempts`, `cve.fetch.success`, `cve.fetch.failures`, `cve.fetch.unchanged`
- `cve.parse.success`, `cve.parse.failures`, `cve.parse.quarantine`
- `cve.map.success`
4. Verify the MongoDB advisory store contains fresh CVE advisories (`advisoryKey` prefix `cve/`) and that the source cursor (`source_states` collection) advanced.
### 1.3 Production Monitoring
- **Dashboards** Add the counters above plus `feedser.range.primitives` (filtered by `scheme=semver` or `scheme=vendor`) to the Feedser overview board. Alert when:
- `rate(cve.fetch.failures[5m]) > 0`
- `rate(cve.map.success[15m]) == 0` while fetch attempts continue
- `sum_over_time(cve.parse.quarantine[1h]) > 0`
- **Logs** Watch for `CveConnector` warnings such as `Failed fetching CVE record` or schema validation errors (`Malformed CVE JSON`). These are emitted with the CVE ID and document identifier for triage.
- **Backfill window** operators can tighten or widen the `initialBackfill` / `maxPagesPerFetch` values after validating baseline throughput. Update the config and restart the worker to apply changes.
## 2. CISA KEV Connector (`source:kev:*`)
### 2.1 Prerequisites
- Network egress (or mirrored content) for `https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json`.
- No credentials are required, but the HTTP allow-list must include `www.cisa.gov`.
- Confirm the following snippet in `feedser.yaml` (defaults shown; tune as needed):
```yaml
feedser:
sources:
kev:
feedUri: "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
requestTimeout: "00:01:00"
failureBackoff: "00:05:00"
```
### 2.2 Schema validation & anomaly handling
From this sprint the connector validates the KEV JSON payload against `Schemas/kev-catalog.schema.json`. Malformed documents are quarantined, and entries missing a CVE ID are dropped with a warning (`reason=missingCveId`). Operators should treat repeated schema failures as an upstream regression and coordinate with CISA or mirror maintainers.
### 2.3 Smoke Test (staging)
1. Deploy the configuration and restart Feedser.
2. Trigger a pipeline run:
- CLI: `stella db jobs run source:kev:fetch --and-then source:kev:parse --and-then source:kev:map`
- REST: `POST /jobs/run { "kind": "source:kev:fetch", "chain": ["source:kev:parse", "source:kev:map"] }`
3. Verify the metrics exposed by meter `StellaOps.Feedser.Source.Kev`:
- `kev.fetch.attempts`, `kev.fetch.success`, `kev.fetch.unchanged`, `kev.fetch.failures`
- `kev.parse.entries` (tag `catalogVersion`), `kev.parse.failures`, `kev.parse.anomalies` (tag `reason`)
- `kev.map.advisories` (tag `catalogVersion`)
4. Confirm MongoDB documents exist for the catalog JSON (`raw_documents` & `dtos`) and that advisories with prefix `kev/` are written.
### 2.4 Production Monitoring
- Alert when `kev.fetch.success` goes to zero for longer than the expected daily cadence (default: trigger if `rate(kev.fetch.success[8h]) == 0` during business hours).
- Track anomaly spikes via `kev.parse.anomalies{reason="missingCveId"}`. A sustained non-zero rate means the upstream catalog contains unexpected records.
- The connector logs each validated catalog: `Parsed KEV catalog document … entries=X`. Absence of that log alongside consecutive `kev.fetch.success` counts suggests schema validation failures—correlate with warning-level events in the `StellaOps.Feedser.Source.Kev` logger.
### 2.5 Known good dashboard tiles
Add the following panels to the Feedser observability board:
| Metric | Recommended visualisation |
|--------|---------------------------|
| `kev.fetch.success` | Single-stat (last 24h) with threshold alert |
| `rate(kev.parse.entries[1h])` by `catalogVersion` | Stacked area highlights daily release size |
| `sum_over_time(kev.parse.anomalies[1d])` by `reason` | Table anomaly breakdown |
## 3. Runbook updates
- Record staging/production smoke test results (date, catalog version, advisory counts) in your teams change log.
- Add the CVE/KEV job kinds to the standard maintenance checklist so operators can manually trigger them after planned downtime.
- Keep this document in sync with future connector changes (for example, new anomaly reasons or additional metrics).