# Notifications Digests
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
Digests coalesce multiple matching events into a single notification when rules request batched delivery. They protect responders from alert storms while preserving a deterministic record of every input.
---
## 1. Digest lifecycle
1. **Window selection.** Rule actions opt into a digest cadence by setting `actions[].digest` (`instant`, `5m`, `15m`, `1h`, `1d`). `instant` skips digest logic entirely.
2. **Aggregation.** When an event matches, the worker appends it to the open digest window (`tenantId + actionId + window`). Events include the canonical scope, delta counts, and references.
3. **Flush.** When the window expires or reaches the worker's configurable safety cap, the worker renders a digest template and emits a single delivery with status `Digested`.
4. **Audit.** The delivery ledger links back to the digest document so operators can inspect individual items and the aggregated summary.
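The lifecycle above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the Notify worker itself: `DigestAggregator`, `DigestWindow`, `item_cap`, and the choice to open a window at the time of its first event are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Cadence names come from the document; "instant" bypasses digests.
WINDOW_LENGTHS = {
    "5m": timedelta(minutes=5),
    "15m": timedelta(minutes=15),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
}


@dataclass
class DigestWindow:
    tenant_id: str
    action_id: str
    window: str
    opened_at: datetime
    items: list = field(default_factory=list)


class DigestAggregator:
    """Hypothetical in-memory stand-in for the worker's digest handling."""

    def __init__(self, item_cap: int = 100):
        self.item_cap = item_cap   # safety cap; the real default is not documented
        self.open_windows = {}     # keyed by (tenantId, actionId, window)
        self.deliveries = []       # stand-in for the delivery ledger

    def _expired(self, window, now):
        return now - window.opened_at >= WINDOW_LENGTHS[window.window]

    def on_event(self, tenant_id, action_id, cadence, event, now):
        """Return True if the event was queued into a digest, False for instant."""
        if cadence == "instant":
            return False  # caller delivers immediately; no digest logic
        key = (tenant_id, action_id, cadence)
        window = self.open_windows.get(key)
        if window is not None and self._expired(window, now):
            self.flush(key)
            window = None
        if window is None:
            # Assumption: the window opens when its first event arrives.
            window = DigestWindow(tenant_id, action_id, cadence, opened_at=now)
            self.open_windows[key] = window
        window.items.append(event)
        if len(window.items) >= self.item_cap:
            self.flush(key)  # early flush when the cap is reached
        return True

    def tick(self, now):
        """Flush every window whose timer has expired."""
        for key, window in list(self.open_windows.items()):
            if self._expired(window, now):
                self.flush(key)

    def flush(self, key):
        window = self.open_windows.pop(key)
        self.deliveries.append({
            "status": "Digested",
            "tenantId": window.tenant_id,
            "actionId": window.action_id,
            "window": window.window,
            "openedAt": window.opened_at,
            "items": window.items,  # kept so the audit trail can list each input
        })
```

Calling `tick` from a timer covers the expiry path, while the cap check inside `on_event` covers the early-flush path.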
---
## 2. Storage model
Digest state lives in Mongo (`digests` collection) and mirrors the schema described in [modules/notify/architecture.md](../modules/notify/architecture.md#7-data-model-mongo):
```json
{
  "_id": "tenant-dev:act-email-compliance:1h",
  "tenantId": "tenant-dev",
  "actionKey": "act-email-compliance",
  "window": "1h",
  "openedAt": "2025-10-24T08:00:00Z",
  "status": "open",
  "items": [
    {
      "eventId": "00000000-0000-0000-0000-000000000001",
      "scope": {
        "namespace": "prod-payments",
        "repo": "ghcr.io/acme/api",
        "digest": "sha256:…"
      },
      "delta": {
        "newCritical": 1,
        "kev": 1
      }
    }
  ]
}
```
- `status` reflects whether the window is currently collecting (`open`) or has been completed (`closed`). Future revisions may introduce `flushing` for in-progress operations.
- `items[].delta` captures aggregated counts for reporting (e.g., new critical findings, KEV, quieted).
- Workers use optimistic concurrency on the document ID to avoid duplicate flushes across replicas.
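The document does not spell out how the optimistic concurrency check is implemented. One common approach is a conditional update that succeeds only if the document is still in the expected state, so exactly one replica wins the flush. The sketch below simulates that with an in-memory store; `InMemoryDigestStore`, `close_if_open`, and `digest_id` are hypothetical names, and a real worker would issue the equivalent conditional update against Mongo instead.

```python
import threading


def digest_id(tenant_id, action_key, window):
    """Compose the document ID in the shape shown in the sample document."""
    return f"{tenant_id}:{action_key}:{window}"


class InMemoryDigestStore:
    """Hypothetical stand-in for the `digests` collection."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()  # simulates the database's per-document atomicity

    def insert(self, doc):
        with self._lock:
            self._docs[doc["_id"]] = dict(doc)

    def close_if_open(self, doc_id):
        """Flip status open -> closed atomically.

        Returns a copy of the document to the caller that won the race,
        and None to every other caller (or if the document is missing).
        """
        with self._lock:
            doc = self._docs.get(doc_id)
            if doc is None or doc["status"] != "open":
                return None
            doc["status"] = "closed"
            return dict(doc)
```

A replica that receives `None` simply skips the flush, because another replica has already claimed the window.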
---
## 3. Rendering and templates
- Digest deliveries use the same template engine as instant notifications. Templates receive an additional `digest` object with `window`, `openedAt`, `itemCount`, and `items` (findings grouped by namespace/repository when available).
- Provide digest-specific templates (e.g., `tmpl-digest-hourly`) so the body can enumerate top offenders, summarise totals, and link to detailed dashboards.
- When no template is specified, Notify falls back to channel defaults that emphasise summary counts and redirect to Console for detail.
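As an illustration of the `digest` object handed to templates, the sketch below builds one from a stored digest document. The field names `window`, `openedAt`, `itemCount`, and `items` come from the description above; the shape of each group (`namespace`, `repo`, `findings`) and the `summarise_deltas` helper are assumptions for the example, not the Notify schema.

```python
from collections import defaultdict


def build_digest_context(doc):
    """Build a template context from a digest document (hypothetical shape)."""
    groups = defaultdict(list)
    for item in doc["items"]:
        scope = item.get("scope", {})
        # Group by namespace/repository when available; missing values become None.
        groups[(scope.get("namespace"), scope.get("repo"))].append(item)
    return {
        "window": doc["window"],
        "openedAt": doc["openedAt"],
        "itemCount": len(doc["items"]),
        "items": [
            {"namespace": ns, "repo": repo, "findings": members}
            for (ns, repo), members in groups.items()
        ],
    }


def summarise_deltas(doc):
    """Sum the per-item delta counters so a template can print totals."""
    totals = defaultdict(int)
    for item in doc["items"]:
        for name, count in item.get("delta", {}).items():
            totals[name] += count
    return dict(totals)
```

Groups keep the order in which their first item arrived, so a template that wants "top offenders" would still need to sort them by whichever counter it cares about.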
---
## 4. API surface
| Endpoint | Description | Notes |
|----------|-------------|-------|
| `POST /digests` | Issues administrative commands (e.g., force flush, reopen) for a specific action/window. | Request body specifies the command target; requires `notify.admin`. |
| `GET /digests/{actionKey}` | Returns the currently open window (if any) for the referenced action. | Supports operators/CLI inspecting pending digests; requires `notify.read`. |
| `DELETE /digests/{actionKey}` | Drops the open window without notifying (emergency stop). | Emits an audit record; use sparingly. |
All routes honour the tenant header and reuse the standard Notify rate limits.
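For scripting against the read and delete routes, a request can be assembled as in the sketch below. The base URL and the tenant header name are placeholders, since neither is given here, and authentication is omitted; substitute the values your deployment uses. `POST /digests` is left out because the command body format is not described in this document.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://notify.example.internal"  # hypothetical host
TENANT_HEADER = "X-Tenant-Id"                 # hypothetical header name


def build_digest_request(method, action_key, tenant_id):
    """Build (but do not send) a GET or DELETE request for /digests/{actionKey}."""
    if method not in ("GET", "DELETE"):
        raise ValueError("only GET and DELETE are sketched here")
    path = "/digests/" + urllib.parse.quote(action_key, safe="")
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={TENANT_HEADER: tenant_id, "Accept": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; keeping construction separate makes the request easy to inspect or log before an emergency `DELETE`.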
---
## 5. Worker behaviour and safety nets
- **Idempotency.** Flush operations generate a deterministic digest delivery ID (`digest:<tenant>:<actionId>:<window>:<openedAt>`). Retries reuse the same ID.
- **Throttles.** Digest generation respects action throttles; setting an aggressive throttle together with a digest window may result in deliberate skips (logged as `Throttled` in the delivery ledger).
- **Quiet hours.** Future sprint work (`NOTIFY-SVC-39-004`) will integrate quiet-hour calendars. Once enabled, flush timers will pause during quiet windows and resume afterwards.
- **Back-pressure.** When the window reaches the configured item cap before the timer, the worker flushes early and starts a new window immediately.
- **Crash resilience.** Workers rebuild in-flight windows from Mongo on startup. A partially flushed window stays closed if the flush succeeded and is reopened if the flush failed.
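The idempotency rule can be illustrated with a short sketch. The ID layout follows the pattern quoted above; using the ISO-8601 `openedAt` string as the last component, and the `DeliveryLedger` class itself, are assumptions made for the example.

```python
def digest_delivery_id(tenant, action_id, window, opened_at):
    """Deterministic ID: the same window always yields the same delivery ID."""
    return f"digest:{tenant}:{action_id}:{window}:{opened_at}"


class DeliveryLedger:
    """Hypothetical ledger that ignores repeated writes for a known delivery ID."""

    def __init__(self):
        self._by_id = {}

    def record(self, delivery_id, payload):
        """Return True when newly recorded, False when this is a retry."""
        if delivery_id in self._by_id:
            return False
        self._by_id[delivery_id] = payload
        return True

    def __len__(self):
        return len(self._by_id)
```

Because a retry after a crash recomputes the same ID from the stored window, the second write is a no-op rather than a duplicate notification.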
---
## 6. Operator guidance
- Choose hourly digests for high-volume compliance events; daily digests suit executive reporting.
- Pair digests with incident-focused instant rules so critical items surface immediately while less urgent noise is summarised.
- Monitor `/stats` output for `openDigestCount` to ensure windows are flushing; spikes may indicate downstream connector failures.
- When testing new digest templates, open a small (`5m`) window, trigger sample events, then issue a force-flush command through `POST /digests` (see section 4) to validate rendering before moving to longer cadences.
---
> **Imposed rule reminder:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.