# Notifications Architecture

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

This dossier distils the Notify architecture into implementation-ready guidance for service owners, SREs, and integrators. It complements the high-level overview by detailing process boundaries, persistence models, and extensibility points.

---

## 1. Runtime shape

```
          ┌──────────────────┐
          │ Authority (OpTok)│
          └───────┬──────────┘
                  │
          ┌───────▼──────────┐        ┌───────────────┐
          │ Notify.WebService│◀──────▶│ MongoDB       │
Tenant API│ REST + gRPC WIP  │        │ rules/channels│
          └───────▲──────────┘        │ deliveries    │
                  │                   │ digests       │
  Internal bus    │                   └───────────────┘
(NATS/Redis/etc)  │
                  │
        ┌─────────▼─────────┐      ┌───────────────┐
        │ Notify.Worker     │◀────▶│ Redis / Cache │
        │ rule eval + render│      │throttles/locks│
        └─────────▲─────────┘      └───────▲───────┘
                  │                        │
                  │                        │
           ┌──────┴──────┐       ┌─────────┴────────┐
           │ Connectors  │──────▶│ Slack/Teams/...  │
           │ (plug-ins)  │       │ External targets │
           └─────────────┘       └──────────────────┘
```

- **WebService** hosts REST endpoints (`/channels`, `/rules`, `/templates`, `/deliveries`, `/digests`, `/stats`) and handles schema normalisation, validation, and Authority enforcement.
- **Worker** subscribes to the platform event bus, evaluates rules per tenant, applies throttles/digests, renders payloads, writes ledger entries, and invokes connectors.
- **Plug-ins** live under `plugins/notify/` and are loaded deterministically at service start (`orderedPlugins` list). Each implements connector contracts and optional health/test-preview providers.

Both services share options via `notify.yaml` (see `etc/notify.yaml.sample`). For dev/test scenarios, an in-memory repository exists but production requires Mongo + Redis/NATS for durability and coordination.

---

## 2. Event ingestion and rule evaluation

1. **Subscription.** Workers attach to the internal bus (Redis Streams or NATS JetStream).
   Each partition key is `tenantId|scope.digest|event.kind` to preserve order for a given artefact.
2. **Normalisation.** Incoming events are hydrated into `NotifyEvent` envelopes. Payload JSON is normalised (sorted object keys) to preserve determinism and enable hashing.
3. **Rule snapshot.** Per-tenant rule sets are cached in memory. Change streams from Mongo trigger snapshot refreshes without restart.
4. **Match pipeline.**
   - Tenant check (`rule.tenantId` vs. event tenant).
   - Kind/namespace/repository/digest filters.
   - Severity and KEV gating based on event deltas.
   - VEX gating using `NotifyRuleMatchVex`.
   - Action iteration with throttle/digest decisions.
5. **Idempotency.** Each action computes `hash(ruleId|actionId|event.kind|scope.digest|delta.hash|dayBucket)`; matches within throttle TTL record `status=Throttled` and stop.
6. **Dispatch.** If digest is `instant`, the renderer immediately processes the action. Otherwise the event is appended to the digest window for later flush.

Failures during evaluation are logged with correlation IDs and surfaced through `/stats` and worker metrics (`notify_rule_eval_failures_total`, `notify_digest_flush_errors_total`).

---

## 3. Rendering & connectors

- **Template resolution.** The renderer picks the template in this order: action template → channel default template → locale fallback → built-in minimal template. Locale negotiation reduces `en-US` to `en-us`.
- **Helpers & partials.** Exposed helpers mirror the list in [`notifications/templates.md`](templates.md#3-variables-helpers-and-context). Plug-ins may register additional helpers but must remain deterministic and side-effect free.
- **Rendering output.** `NotifyDeliveryRendered` captures:
  - `channelType`, `format`, `locale`
  - `title`, `body`, optional `summary`, `textBody`
  - `target` (redacted where necessary)
  - `attachments[]` (safe URLs or references)
  - `bodyHash` (lowercase SHA-256) for audit parity
- **Connector contract.** Connectors implement `INotifyConnector` (send + health) and can implement `INotifyChannelTestProvider` for `/channels/{id}/test`. All plug-ins are single-tenant aware; secrets are pulled via references at send time and never persisted in Mongo.
- **Retries.** Workers track attempts and retry with exponential backoff and jitter. On permanent failure, deliveries are marked `Failed` with `statusReason`, and optional DLQ fan-out is slated for Sprint 40.

---

## 4. Persistence model

| Collection | Purpose | Key fields & indexes |
|------------|---------|----------------------|
| `rules` | Tenant rule definitions. | `_id`, `tenantId`, `enabled`; index on `{tenantId, enabled}`. |
| `channels` | Channel metadata + config references. | `_id`, `tenantId`, `type`; index on `{tenantId, type}`. |
| `templates` | Locale-specific render bodies. | `_id`, `tenantId`, `channelType`, `key`; index on `{tenantId, channelType, key}`. |
| `deliveries` | Ledger of rendered notifications. | `_id`, `tenantId`, `sentAt`; compound index on `{tenantId, sentAt:-1}` for history queries. |
| `digests` | Open digest windows per action. | `_id` (`tenantId:actionKey:window`), `status`; index on `{tenantId, actionKey}`. |
| `throttles` | Short-lived throttle tokens (Mongo or Redis). | Key format `idem:` with TTL aligned to throttle duration. |

Documents are stored using the canonical JSON serializer (`NotifyCanonicalJsonSerializer`) to preserve property ordering and casing. Schema migration helpers upgrade stored documents when new versions ship.

---

## 5. Deployment & configuration

- **Configuration sources.** YAML files feed typed options (`NotifyMongoOptions`, `NotifyWorkerOptions`, etc.).
  Environment variables can override connection strings and rate limits for production.
- **Authority integration.** Two OAuth clients (`notify-web`, `notify-web-dev`) with scopes `notify.read` and `notify.admin` are required. Authority enforcement can be disabled for air-gapped dev use by providing `developmentSigningKey`.
- **Plug-in management.** `plugins.baseDirectory` and `orderedPlugins` guarantee deterministic loading. Offline Kits copy the plug-in tree verbatim; operations must keep the order aligned across environments.
- **Observability.** Workers expose structured logs (`ruleId`, `actionId`, `eventId`, `throttleKey`). Metrics include:
  - `notify_rule_matches_total{tenant,eventKind}`
  - `notify_delivery_attempts_total{channelType,status}`
  - `notify_digest_open_windows{window}`
  - Optional OpenTelemetry traces for rule evaluation and connector round-trips.
- **Scaling levers.** Increase worker replicas to cope with bus throughput; adjust `worker.prefetchCount` for Redis Streams or `ackWait` for NATS JetStream. WebService remains stateless and scales horizontally behind the gateway.

---

## 6. Roadmap alignment

| Backlog | Architectural note |
|---------|--------------------|
| `NOTIFY-SVC-38-001` | Standardise event envelope publication (idempotency keys) – ensure bus bindings use the documented key format. |
| `NOTIFY-SVC-38-002..004` | Introduce simulation endpoints and throttle dashboards – expect additional `/internal/notify/simulate` routes and metrics; update once merged. |
| `NOTIFY-SVC-39-001..004` | Correlation engine, digests generator, simulation API, quiet hours – anticipate new Mongo documents (`quietHours`, correlation caches) and connector metadata (quiet mode hints). Review this guide when implementations land. |


Action: schedule a documentation sync with the Notifications Service Guild immediately after `NOTIFY-SVC-39-001..004` merge to confirm schema adjustments (e.g., correlation edge storage, quiet hour calendars) and add any new persistence or API details here.

---

> **Imposed rule reminder:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
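
The normalisation and idempotency steps in section 2 (sorted-key payload JSON, then a hash over `ruleId|actionId|event.kind|scope.digest|delta.hash|dayBucket`) can be sketched in Python as below. This is a minimal illustration, not the service's implementation. The document only says `hash(...)`, so the use of SHA-256, the UTC `YYYYMMDD` day bucket, appending the digest after the `idem:` prefix, and the function names `canonical_json` and `idempotency_key` are all assumptions made for the example.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonical_json(payload: dict) -> str:
    """Serialise a payload with sorted object keys and no extra whitespace.

    Sorting keys (including nested objects) makes the output byte-stable,
    which is what allows the payload to be hashed deterministically.
    """
    return json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def idempotency_key(
    rule_id: str,
    action_id: str,
    event_kind: str,
    scope_digest: str,
    delta_hash: str,
    occurred_at: datetime,
) -> str:
    """Build a throttle key from the fields listed in the Idempotency step.

    Assumptions (not confirmed by the document): SHA-256 as the hash,
    a UTC calendar day as the day bucket, and the digest appended to "idem:".
    `occurred_at` must be timezone-aware.
    """
    day_bucket = occurred_at.astimezone(timezone.utc).strftime("%Y%m%d")
    material = "|".join(
        [rule_id, action_id, event_kind, scope_digest, delta_hash, day_bucket]
    )
    return "idem:" + hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Under these assumptions, two events for the same rule, action, kind, digest, and delta that fall on the same UTC day produce the same key, so the second can be recorded as `Throttled`; an event on the next day produces a new key.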