docs consolidation
@@ -40,7 +40,7 @@ concelier:
- `CCCS fetch completed feeds=… items=… newDocuments=… pendingDocuments=…`
- `CCCS parse completed parsed=… failures=…`
- `CCCS map completed mapped=… failures=…`
-- Warnings fire when GridFS payloads/DTOs go missing or parser sanitisation fails.
+- Warnings fire when document payloads/DTOs go missing or parser sanitisation fails.
Suggested Grafana alerts:
- `increase(cccs.fetch.failures_total[15m]) > 0`
@@ -53,7 +53,7 @@ Suggested Grafana alerts:
2. **Stage ingestion**:
- Temporarily raise `maxEntriesPerFetch` (e.g. 500) and restart Concelier workers.
- Run chained jobs until `pendingDocuments` drains:
-Run `stella db fetch --source cccs --stage fetch`, then `--stage parse`, then `--stage map`.
+Run `stella db fetch --source cccs --stage fetch`, then `--stage parse`, then `--stage map`.
- Monitor `cccs.fetch.unchanged` growth; once it approaches dataset size the backfill is complete.
3. **Optional pagination sweep** – for incremental mirrors, iterate `page=<n>` (0…N) while `response.Count == 50`, persisting JSON to disk. Store alongside metadata (`language`, `page`, SHA256) so repeated runs detect drift.
4. **Language split** – keep EN/FR payloads separate to preserve canonical language fields. The connector emits `Language` directly from the feed entry, so mixed ingestion simply produces parallel advisories keyed by the same serial number.
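
The optional pagination sweep in step 3 could be sketched roughly as follows. This is a minimal illustration and not part of the connector: `fetch_page` is a hypothetical callable supplied by the caller (how it reaches the CCCS API is deliberately left open), and the file names and manifest layout are assumptions. Only the stop condition (continue while a page returns 50 entries) and the stored metadata (`language`, `page`, SHA256) come from the runbook text.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List

PAGE_SIZE = 50  # per the runbook: a full page of 50 entries means more may follow


def sweep(fetch_page: Callable[[str, int], List[dict]],
          language: str, out_dir: Path) -> List[dict]:
    """Persist pages 0..N for one language, stopping after the first short page.

    fetch_page is a hypothetical (language, page) -> list-of-entries callable.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = []
    page = 0
    while True:
        entries = fetch_page(language, page)
        # Serialise with sorted keys so the digest is stable across runs.
        body = json.dumps(entries, sort_keys=True, ensure_ascii=False).encode("utf-8")
        digest = hashlib.sha256(body).hexdigest()
        (out_dir / f"{language}-page-{page:04d}.json").write_bytes(body)
        manifest.append({"language": language, "page": page, "sha256": digest})
        if len(entries) != PAGE_SIZE:
            break  # short (or empty) page: nothing further to fetch
        page += 1
    # Assumed layout: one manifest file per language next to the page files.
    (out_dir / f"{language}-manifest.json").write_text(
        json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest


def drift(previous: List[dict], current: List[dict]) -> List[int]:
    """Return page numbers that are new or whose SHA256 changed between runs."""
    old = {(m["language"], m["page"]): m["sha256"] for m in previous}
    return [m["page"] for m in current
            if old.get((m["language"], m["page"])) != m["sha256"]]
```

Running `sweep` once per language (for example into separate `en` and `fr` output directories) keeps the payloads apart as step 4 recommends, and comparing the manifests of two runs with `drift` shows which pages changed.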