Notifications Digests
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
Digests coalesce multiple matching events into a single notification when rules request batched delivery. They protect responders from alert storms while preserving a deterministic record of every input.
1. Digest lifecycle
- Window selection. Rule actions opt into a digest cadence by setting actions[].digest (instant, 5m, 15m, 1h, 1d); instant skips digest logic entirely.
- Aggregation. When an event matches, the worker appends it to the open digest window (tenantId + actionId + window). Events include the canonical scope, delta counts, and references.
- Flush. When the window expires or hits the worker’s safety cap (configurable), the worker renders a digest template and emits a single delivery with status Digested.
- Audit. The delivery ledger links back to the digest document so operators can inspect individual items and the aggregated summary.
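The lifecycle above can be sketched in a few lines. This is a minimal in-memory illustration, not the worker's actual code: the class names DigestBuffer and DigestWindow, the default cap of 100, and the shape of the delivery record are all assumptions made for the example. Only the cadence names, the (tenantId, actionId, window) key, and the Digested status come from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Cadence names are from the text; durations are the plain reading of each name.
WINDOW_LENGTHS = {
    "5m": timedelta(minutes=5),
    "15m": timedelta(minutes=15),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
}


@dataclass
class DigestWindow:
    opened_at: datetime
    items: list = field(default_factory=list)


class DigestBuffer:
    """Hypothetical in-memory stand-in for the worker's digest state."""

    def __init__(self, item_cap: int = 100):  # cap value is an assumption
        self.item_cap = item_cap
        self.open_windows = {}   # (tenant_id, action_id, window) -> DigestWindow
        self.deliveries = []     # flushed digests, newest last

    def on_event(self, tenant_id, action_id, window, event, now) -> bool:
        """Append a matching event; return False when the caller should deliver instantly."""
        if window == "instant":
            return False  # instant bypasses digest logic entirely
        key = (tenant_id, action_id, window)
        win = self.open_windows.setdefault(key, DigestWindow(opened_at=now))
        win.items.append(event)
        if len(win.items) >= self.item_cap:
            self.flush(key)  # safety cap reached before the timer
        return True

    def tick(self, now):
        """Flush every window whose cadence has elapsed."""
        expired = [
            key for key, win in self.open_windows.items()
            if now >= win.opened_at + WINDOW_LENGTHS[key[2]]
        ]
        for key in expired:
            self.flush(key)

    def flush(self, key):
        """Close the window and emit one delivery covering all of its items."""
        win = self.open_windows.pop(key)
        self.deliveries.append(
            {"key": key, "openedAt": win.opened_at, "status": "Digested", "items": win.items}
        )
```

Because flush removes the window from open_windows, the next matching event opens a fresh window, which is the behaviour the text describes for an early, cap-triggered flush.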
2. Storage model
Digest state lives in Mongo (digests collection) and mirrors the schema described in ARCHITECTURE_NOTIFY.md:
{
  "_id": "tenant-dev:act-email-compliance:1h",
  "tenantId": "tenant-dev",
  "actionKey": "act-email-compliance",
  "window": "1h",
  "openedAt": "2025-10-24T08:00:00Z",
  "status": "open",
  "items": [
    {
      "eventId": "00000000-0000-0000-0000-000000000001",
      "scope": {
        "namespace": "prod-payments",
        "repo": "ghcr.io/acme/api",
        "digest": "sha256:…"
      },
      "delta": {
        "newCritical": 1,
        "kev": 1
      }
    }
  ]
}
- status reflects whether the window is currently collecting (open) or has been completed (closed). Future revisions may introduce flushing for in-progress operations.
- items[].delta captures aggregated counts for reporting (e.g., new critical findings, KEV, quieted).
- Workers use optimistic concurrency on the document ID to avoid duplicate flushes across replicas.
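One way to picture the optimistic-concurrency guard is a conditional status transition: a replica may flush only if it is the one that flips the document from open to closed. The sketch below models that with an in-memory store. InMemoryDigests, close_if_open, and digest_doc_id are hypothetical names, and the lock merely stands in for the per-document atomicity a real store would provide. The text does not say which mechanism Notify actually uses.

```python
import threading


def digest_doc_id(tenant_id: str, action_key: str, window: str) -> str:
    """Build an _id in the shape shown in the sample document."""
    return f"{tenant_id}:{action_key}:{window}"


class InMemoryDigests:
    """Hypothetical stand-in for the digests collection."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()  # models atomic single-document updates

    def insert(self, doc: dict) -> None:
        self._docs[doc["_id"]] = dict(doc)

    def close_if_open(self, doc_id: str):
        """Flip status open -> closed; only the winning caller gets the document back."""
        with self._lock:
            doc = self._docs.get(doc_id)
            if doc is None or doc["status"] != "open":
                return None  # missing, or another replica already claimed the flush
            doc["status"] = "closed"
            return dict(doc)
```

A replica that receives None simply skips the flush, so at most one delivery is produced per window even when several replicas see the same timer expire.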
3. Rendering and templates
- Digest deliveries use the same template engine as instant notifications. Templates receive an additional digest object with window, openedAt, itemCount, and items (findings grouped by namespace/repository when available).
- Provide digest-specific templates (e.g., tmpl-digest-hourly) so the body can enumerate top offenders, summarise totals, and link to detailed dashboards.
- When no template is specified, Notify falls back to channel defaults that emphasise summary counts and redirect to Console for detail.
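As an illustration of what a template might receive, the sketch below derives a digest object from a stored digest document. The four top-level field names come from the text; the per-group shape (namespace, repo, findings) and the helper names build_digest_context and delta_totals are assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_digest_context(doc: dict) -> dict:
    """Group items by (namespace, repo) and expose the fields named in the text."""
    groups = defaultdict(list)
    for item in doc["items"]:
        scope = item.get("scope", {})
        # Scope fields may be absent; such items fall into a (None, None) group.
        groups[(scope.get("namespace"), scope.get("repo"))].append(item)
    return {
        "window": doc["window"],
        "openedAt": doc["openedAt"],
        "itemCount": len(doc["items"]),
        "items": [
            {"namespace": ns, "repo": repo, "findings": findings}
            for (ns, repo), findings in groups.items()
        ],
    }


def delta_totals(doc: dict) -> dict:
    """Sum items[].delta counters so a template can summarise totals."""
    totals = Counter()
    for item in doc["items"]:
        totals.update(item.get("delta", {}))
    return dict(totals)
```

Grouping up front keeps the template itself simple: it can loop over groups to list top offenders and print the totals without doing any aggregation of its own.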
4. API surface
| Endpoint | Description | Notes |
|---|---|---|
| POST /digests | Issues administrative commands (e.g., force flush, reopen) for a specific action/window. | Request body specifies the command target; requires notify.admin. |
| GET /digests/{actionKey} | Returns the currently open window (if any) for the referenced action. | Supports operators/CLI inspecting pending digests; requires notify.read. |
| DELETE /digests/{actionKey} | Drops the open window without notifying (emergency stop). | Emits an audit record; use sparingly. |
All routes honour the tenant header and reuse the standard Notify rate limits.
5. Worker behaviour and safety nets
- Idempotency. Flush operations generate a deterministic digest delivery ID (digest:<tenant>:<actionId>:<window>:<openedAt>). Retries reuse the same ID.
- Throttles. Digest generation respects action throttles; setting an aggressive throttle together with a digest window may result in deliberate skips (logged as Throttled in the delivery ledger).
- Quiet hours. Future sprint work (NOTIFY-SVC-39-004) will integrate quiet-hour calendars. When enabled, flush timers pause during quiet windows and resume afterwards.
- Back-pressure. When the window reaches the configured item cap before the timer, the worker flushes early and starts a new window immediately.
- Crash resilience. Workers rebuild in-flight windows from Mongo on startup; a partially flushed window stays closed if its flush succeeded and is reopened if the flush failed.
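The idempotency rule can be sketched as follows. The ID layout is taken from the text; rendering openedAt as the ISO-8601 string stored in the digest document is an assumption, and DeliveryLedger is a hypothetical in-memory stand-in for the real delivery ledger.

```python
def digest_delivery_id(tenant: str, action_id: str, window: str, opened_at: str) -> str:
    """Deterministic ID: the same window always maps to the same delivery."""
    return f"digest:{tenant}:{action_id}:{window}:{opened_at}"


class DeliveryLedger:
    """Hypothetical ledger that ignores repeated writes for a known ID."""

    def __init__(self):
        self._by_id = {}

    def record(self, delivery_id: str, payload: dict) -> bool:
        """Return True when the delivery is new, False when it is a retry."""
        if delivery_id in self._by_id:
            return False
        self._by_id[delivery_id] = payload
        return True
```

Because the ID is derived only from values fixed when the window opens, a worker that crashes mid-flush and retries produces the same ID, and the second write becomes a no-op rather than a duplicate notification.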
6. Operator guidance
- Choose hourly digests for high-volume compliance events; daily digests suit executive reporting.
- Pair digests with incident-focused instant rules so critical items surface immediately while less urgent noise is summarised.
- Monitor /stats output for openDigestCount to ensure windows are flushing; spikes may indicate downstream connector failures.
- When testing new digest templates, open a small (5m) window, trigger sample events, then call POST /digests/{actionId}/flush to validate rendering before moving to longer cadences.
Imposed rule reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.