Excititor Statement Backfill Runbook
Last updated: 2025-10-19
Overview
Use this runbook when you need to rebuild the vex.statements collection from historical raw documents. Typical scenarios:
- Upgrading the statement schema (e.g., adding severity/KEV/EPSS signals).
- Recovering from a partial ingest outage where statements were never persisted.
- Seeding a freshly provisioned Excititor deployment from an existing raw archive.
Backfill operates server-side via the Excititor WebService and reuses the same pipeline that powers the /excititor/statements ingestion endpoint. Each raw document is normalized, signed metadata is preserved, and duplicate statements are skipped unless the run is forced.
Prerequisites
- Connectivity to Excititor WebService – the CLI uses the backend URL configured in stellaops.yml or the --backend-url argument.
- Authority credentials – the CLI honours the existing Authority client configuration; ensure the caller has permission to invoke admin endpoints.
- Mongo replica set (recommended) – causal consistency guarantees rely on majority read/write concerns. Standalone deployment works but skips cross-document transactions.
CLI command
stellaops excititor backfill-statements \
  [--retrieved-since <ISO8601>] \
  [--force] \
  [--batch-size <int>] \
  [--max-documents <int>]
| Option | Description | 
|---|---|
| --retrieved-since | Only process raw documents fetched on or after the specified timestamp (UTC by default). | 
| --force | Reprocess documents even if matching statements already exist (useful after schema upgrades). | 
| --batch-size | Number of raw documents pulled per batch (default 100). | 
| --max-documents | Optional hard limit on the number of raw documents to evaluate. | 
Example – replay the last 48 hours of Red Hat ingest while keeping existing statements:
stellaops excititor backfill-statements \
  --retrieved-since "$(date -u -d '48 hours ago' +%Y-%m-%dT%H:%M:%SZ)"
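The date -d '48 hours ago' form in the example is GNU date syntax and is not accepted by BSD or macOS date. On such hosts the cutoff can be computed another way; the following is a minimal Python sketch that produces a timestamp in the same format. The function name retrieved_since is hypothetical and not part of the StellaOps CLI.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retrieved_since(hours_ago: float, now: Optional[datetime] = None) -> str:
    """Return a UTC ISO 8601 timestamp `hours_ago` hours before `now`.

    `now` defaults to the current time; passing it explicitly keeps the
    function deterministic for scripting and tests.
    """
    current = now if now is not None else datetime.now(timezone.utc)
    cutoff = current.astimezone(timezone.utc) - timedelta(hours=hours_ago)
    # Same shape as the shell example: 2025-10-17T12:00:00Z
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # Print a value suitable for --retrieved-since.
    print(retrieved_since(48))
```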
Example – full replay with forced overwrites, capped at 2,000 documents:
stellaops excititor backfill-statements --force --max-documents 2000
The command returns a summary similar to:
Backfill completed: evaluated 450, backfilled 180, claims written 320, skipped 270, failures 0.
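When the backfill runs from automation, it can help to turn the summary line into numbers and fail the job when failures is non-zero. The sketch below assumes the summary keeps exactly the wording shown above; the regular expression and the helper names are illustrative and would need adjusting if the CLI output differs.

```python
import re

# Assumption: the CLI prints the counters in this order and with this wording.
SUMMARY_RE = re.compile(
    r"evaluated (?P<evaluated>\d+), "
    r"backfilled (?P<backfilled>\d+), "
    r"claims written (?P<claims_written>\d+), "
    r"skipped (?P<skipped>\d+), "
    r"failures (?P<failures>\d+)"
)


def parse_backfill_summary(line: str) -> dict:
    """Extract the five counters from a backfill summary line as integers."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


def run_is_clean(counters: dict) -> bool:
    """Treat a run as clean when no document failed."""
    return counters["failures"] == 0
```

In the sample output, evaluated (450) equals backfilled (180) plus skipped (270) plus failures (0). Whether that identity always holds is not stated here, so the sketch does not enforce it.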
Behaviour
- Raw documents are streamed in ascending retrievedAt order.
- Each document is normalized using the registered VEX normalizers (CSAF, CycloneDX, OpenVEX).
- Statements are appended through the same IVexClaimStore.AppendAsync path that powers /excititor/statements.
- Duplicate detection compares Document.Digest; duplicates are skipped unless --force is specified.
- Failures are logged with the offending digest, and processing continues with the next document.
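The control flow described in the list above can be sketched as a small in-memory simulation. This is not the Excititor implementation: RawDocument, BackfillSummary, and backfill are hypothetical names, the normalize callable stands in for the registered VEX normalizers, and a plain set of digests stands in for the duplicate lookup against stored statements.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass(frozen=True)
class RawDocument:
    digest: str
    retrieved_at: str  # ISO 8601 UTC in one fixed format, so string order is time order


@dataclass
class BackfillSummary:
    evaluated: int = 0
    backfilled: int = 0
    claims_written: int = 0
    skipped: int = 0
    failures: int = 0


def backfill(
    documents: Iterable[RawDocument],
    normalize: Callable[[RawDocument], List[str]],
    existing_digests: Set[str],
    force: bool = False,
    max_documents: Optional[int] = None,
    log: Callable[[str], None] = print,
) -> BackfillSummary:
    summary = BackfillSummary()
    # Process in ascending retrievedAt order.
    for doc in sorted(documents, key=lambda d: d.retrieved_at):
        if max_documents is not None and summary.evaluated >= max_documents:
            break
        summary.evaluated += 1
        # Duplicates are skipped unless the run is forced.
        if doc.digest in existing_digests and not force:
            summary.skipped += 1
            continue
        try:
            claims = normalize(doc)
        except Exception as exc:
            # Log the offending digest and carry on with the next document.
            log(f"backfill failed for digest {doc.digest}: {exc!r}")
            summary.failures += 1
            continue
        existing_digests.add(doc.digest)
        summary.backfilled += 1
        summary.claims_written += len(claims)
    return summary
```

The sketch counts a document as backfilled only when normalization succeeds, so a forced run reports duplicates as backfilled instead of skipped.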
Observability
- The CLI logs aggregate counts; the backend logs per-digest warnings and errors.
- Mongo writes carry majority write concern; expect backfill throughput to match ingest baselines (≈5 seconds warm, 30 seconds cold).
- Monitor the excititor.storage.backfill log scope for detailed telemetry.
Post-run verification
- Inspect the vex.statements collection for the targeted window (check InsertedAt).
- Re-run the Excititor storage test suite if possible:
dotnet test src/Excititor/__Tests/StellaOps.Excititor.Storage.Mongo.Tests/StellaOps.Excititor.Storage.Mongo.Tests.csproj
- Optionally, call /excititor/statements/{vulnerabilityId}/{productKey} to confirm the expected statements exist.
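Product keys often contain characters such as ":" and "/", which cannot be dropped into a path segment unescaped. The sketch below builds the verification URL with percent-encoded segments. It assumes the endpoint accepts percent-encoded path parameters, which this runbook does not confirm; the host name and the statements_url helper are illustrative.

```python
from urllib.parse import quote


def statements_url(backend_url: str, vulnerability_id: str, product_key: str) -> str:
    """Build the /excititor/statements/{vulnerabilityId}/{productKey} URL.

    Both path parameters are percent-encoded with no characters treated as
    safe, so a "/" inside a product key stays within its own segment.
    """
    base = backend_url.rstrip("/")
    vuln = quote(vulnerability_id, safe="")
    product = quote(product_key, safe="")
    return f"{base}/excititor/statements/{vuln}/{product}"


if __name__ == "__main__":
    # Hypothetical backend URL and identifiers, for illustration only.
    print(statements_url(
        "https://excititor.example.internal",
        "CVE-2025-0001",
        "pkg:rpm/redhat/openssl@3.0.7",
    ))
```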
Rollback
If a forced run produced incorrect statements, use the standard Mongo rollback procedure:
- Identify the InsertedAt window for the backfill run.
- Delete affected records from vex.statements (and any downstream exports if applicable).
- Rerun the backfill command with corrected parameters.
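As a hedged illustration of the first two steps, the sketch below builds a MongoDB filter for a half-open InsertedAt window. It assumes InsertedAt is stored as a BSON date; the rollback_filter helper is hypothetical, and no database connection is made here.

```python
from datetime import datetime, timezone


def rollback_filter(window_start: datetime, window_end: datetime) -> dict:
    """Return a MongoDB filter matching InsertedAt in [window_start, window_end)."""
    if window_start.tzinfo is None or window_end.tzinfo is None:
        raise ValueError("use timezone-aware datetimes to avoid local-time mistakes")
    if window_start >= window_end:
        raise ValueError("window_start must be earlier than window_end")
    return {"InsertedAt": {"$gte": window_start, "$lt": window_end}}


if __name__ == "__main__":
    # Illustrative window only; take the real bounds from the backfill logs.
    start = datetime(2025, 10, 19, 9, 0, tzinfo=timezone.utc)
    end = datetime(2025, 10, 19, 9, 30, tzinfo=timezone.utc)
    print(rollback_filter(start, end))
```

With a driver such as pymongo, the same filter can be passed to count_documents first and to delete_many only once the count matches the backfilled figure from the run summary. The filter also matches statements written by regular ingestion during the same window, so narrow the bounds as tightly as the logs allow.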