# StellaOps Notify

Notify (Notifications Studio) converts platform events into tenant-scoped alerts with deterministic delivery, offline parity, and a full audit trail. The service is split between the reusable tooling in `src/Notify/*` and the runtime host in `src/Notifier/*` (decision recorded 2025-11-02) so downstream systems can embed the rules engine without inheriting the Studio UI.

## Latest updates (2025-11-30)
- Sprint tracker `docs/implplan/SPRINT_322_docs_modules_notify.md` and module `TASKS.md` added to mirror status.
- Observability runbook stub and Grafana placeholder added under `operations/` (offline import); finalize after next demo.
- NOTIFY-DOCS-0002 remains blocked pending NOTIFY-SVC-39-001..004 outputs (correlation/digests/simulation/quiet hours).

## Scope & responsibilities
- Apply tenant-scoped rules to events from Scanner, Scheduler, VEX Lens, Attestor, Task Runner, and Zastava.
- Render channel-specific payloads (Slack, Teams, Email, webhook) using deterministic templates with localisation safeguards.
- Enforce throttling, digests, and quiet-hour calendars so bursts stay explainable and recoverable.
- Persist deliveries, attempts, throttles, and DSSE hashes for CLI/UI investigation and compliance export.

## Current capabilities (Sprint 38 foundations)
- **Rules + channels API:** `StellaOps.Notify.WebService` exposes CRUD, previews, and health probes secured by Authority scopes.
- **Worker pipeline:** `StellaOps.Notify.Worker` ingests bus events, evaluates match predicates, applies per-tenant throttles, and dispatches deliveries.
- **Connector plug-ins:** Restart-time plug-ins under `StellaOps.Notify.Connectors.*` (Slack, Teams, Email, generic webhook) with health checks and retry policy hints declared in `notify-plugin.json`.
- **Template engine:** Deterministic rendering with safe helpers, locale bundles, and redaction defaults that keep Offline Kit parity.
- **Delivery ledger:** PostgreSQL-backed ledger storing hashed payloads, attempts, throttled/digested markers, and provenance links for audit + exports.

## In progress / upcoming (Sprint 39 focus)
- `NOTIFY-SVC-39-001` correlation engine with token-bucket throttles, incident lifecycle, and quiet-hours evaluator.
- `NOTIFY-SVC-39-002` digest generator with schedule runner, ledger queries, and distribution across existing channels.
- `NOTIFY-SVC-39-003` simulation API for rule dry-runs against historical events.
- `NOTIFY-SVC-39-004` quiet-hour calendar integration and default throttles with audit logging.

Status for these items is tracked in `src/Notifier/StellaOps.Notifier/TASKS.md` and sprint plans; update this README once tasks merge.

## Key docs & release alignment
- [`overview.md`](overview.md): summary of capabilities, imposed rules, and customer journey.
- [`architecture.md`](architecture.md) / [`architecture-detail.md`](architecture-detail.md): Notifications Studio runtime view.
- [`rules.md`](rules.md): declarative matcher syntax and evaluation order.
- [`digests.md`](digests.md): digest windows, coalescing logic, and delivery samples.
- [`templates.md`](templates.md): template helpers, localisation, and redaction guidelines.
- [`docs/implplan/archived/updates/2025-10-29-notify-docs.md`](../../implplan/archived/updates/2025-10-29-notify-docs.md): latest release note; follow-ups remain to validate connector metadata, quiet-hours semantics, and simulation payloads once Sprint 39 drops land.

## Integrations & dependencies
- **Storage:** PostgreSQL (schema `notify`) for rules, channels, deliveries, digests, and throttles; Valkey for worker coordination.
- **Queues:** Valkey Streams or NATS JetStream for ingestion, throttling, and DLQs (`notify.dlq`).
- **Authority:** OpTok-protected APIs, DPoP-backed CLI/UI scopes (`notify.viewer`, `notify.operator`, `notify.admin`), and secret references for channel credentials.
- **Observability:** Prometheus metrics (`notify.sent_total`, `notify.failed_total`, `notify.digest_coalesced_total`, etc.), OTEL traces, and dashboards documented in `architecture-detail.md`.

## Operational notes
- Schema fixtures live in `./resources/schemas`; event and delivery samples live in `./resources/samples` for contract tests and UI mocks.
- Offline Kit bundles ship plug-ins, default templates, and seed rules; update manifests under `ops/offline-kit/` when connectors change.
- Dashboards and alert references depend on `DEVOPS-NOTIFY-39-002`; coordinate before renaming metrics or labels.
- Observability assets: `operations/observability.md` and `operations/dashboards/notify-observability.json` (offline import).
- When releasing new rule or connector features, update guidance in this directory and related checklists until the follow-ups are closed.

## Epic alignment
- **Epic 11 – Notifications Studio:** notifications workspace, preview tooling, immutable delivery ledger, throttling/digest controls, and forthcoming correlation/simulation features.
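
The per-tenant throttles applied by the worker, and the token-bucket throttles planned under `NOTIFY-SVC-39-001`, can be illustrated with a minimal sketch. This is not the Notify implementation: the names `TokenBucket` and `TenantThrottle`, the parameters `capacity` and `refill_per_second`, and the choice to key buckets by `(tenant, rule_id)` are all assumptions made for illustration. The clock is passed in explicitly so the behaviour stays deterministic and easy to test.

```python
class TokenBucket:
    """Hypothetical token bucket; not the actual Notify type."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained rate
        self.tokens = capacity                      # start with a full bucket
        self.last_refill = now

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantThrottle:
    """Keeps one bucket per (tenant, rule_id) so tenants cannot starve each other."""

    def __init__(self, capacity: float, refill_per_second: float):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._buckets: dict[tuple[str, str], TokenBucket] = {}

    def allow(self, tenant: str, rule_id: str, now: float) -> bool:
        key = (tenant, rule_id)
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_second, now)
            self._buckets[key] = bucket
        return bucket.try_consume(now)
```

In this sketch a delivery for which `allow` returns `False` would be marked as throttled rather than dropped silently, which matches the ledger's throttled/digested markers described above. How the real service records that outcome is not specified here.
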

## Implementation Status

### Delivery Phases
- **Phase 1 – Core rules engine & delivery ledger:** Implement rules/channels schema, event ingestion, rule evaluation, idempotent deliveries, audit logging
- **Phase 2 – Connectors & rendering:** Ship Slack/Teams/Email/Webhook connectors, template rendering, localization, throttling, retries, secret referencing
- **Phase 3 – Console & CLI authoring:** Provide UI/CLI for rule authoring, previews, channel health, delivery browsing, digests, test sends
- **Phase 4 – Governance & observability:** Add approvals, RBAC, tenant quotas, metrics/logs/traces, dashboards, alerts, runbooks
- **Phase 5 – Offline & compliance:** Produce Offline Kit bundles (rules/channels/deploy scripts), signed exports, retention policies, auditing

### Acceptance Criteria
- Rules evaluate deterministically per event; deliveries idempotent with audit trail and DSSE signatures
- Channel connectors support retries, rate limits, health checks, previews; secrets referenced securely
- Console/CLI support rule creation, testing, digests, delivery browsing, export/import workflows
- Observability dashboards track delivery health; alerts fire for sustained failures or backlog; runbooks cover remediation
- Offline Kit bundle contains configs, rules, digests, deployment scripts for air-gapped installs
- Notify respects tenancy and RBAC; governance (approvals, change log) enforced for high-impact rules

### Key Risks & Mitigations
- **Notification storms:** Throttling, digests, dedupe windows, preview/test gating
- **Secret compromise:** Secret references only, rotation workflows, audit logging
- **Connector API changes:** Versioned adapter layer, nightly health checks, fallback channels
- **Noise vs signal:** Simulation previews, metrics, rule scoring, recommended defaults
- **Offline parity:** Export/import of rules, connectors, digests with signed manifests

### Current Phase Progress
- Phase 1: Core rules engine mostly complete; template dispatch/rendering in progress
- Phase 2: Connector and rendering work not yet started; depends on Phase 1 completion
- Phase 3: Console/CLI authoring work not started; depends on Phase 2 completion
- Phase 4: Core observability complete; governance and risk notifications blocked on upstream dependencies
- Phase 5: Offline basics complete; tenancy work blocked on upstream Sprint 0172