docs consolidation

StellaOps Bot
2025-12-24 21:45:46 +02:00
parent 4231305fec
commit 43e2af88f6
76 changed files with 2887 additions and 796 deletions


@@ -37,7 +37,7 @@ Status for these items is tracked in `src/Notifier/StellaOps.Notifier/TASKS.md`
## Integrations & dependencies
- **Storage:** PostgreSQL (schema `notify`) for rules, channels, deliveries, digests, and throttles; Valkey for worker coordination.
- **Queues:** Redis Streams or NATS JetStream for ingestion, throttling, and DLQs (`notify.dlq`).
- **Queues:** Valkey Streams or NATS JetStream for ingestion, throttling, and DLQs (`notify.dlq`).
- **Authority:** OpTok-protected APIs, DPoP-backed CLI/UI scopes (`notify.viewer`, `notify.operator`, `notify.admin`), and secret references for channel credentials.
- **Observability:** Prometheus metrics (`notify.sent_total`, `notify.failed_total`, `notify.digest_coalesced_total`, etc.), OTEL traces, and dashboards documented in `docs/notifications/architecture.md#12-observability-prometheus--otel`.


@@ -26,7 +26,7 @@ src/
├─ StellaOps.Notify.Engine/ # rules engine, templates, idempotency, digests, throttles
├─ StellaOps.Notify.Models/ # DTOs (Rule, Channel, Event, Delivery, Template)
├─ StellaOps.Notify.Storage.Postgres/ # canonical persistence (notify schema)
├─ StellaOps.Notify.Queue/ # bus client (Redis Streams/NATS JetStream)
├─ StellaOps.Notify.Queue/ # bus client (Valkey Streams/NATS JetStream)
└─ StellaOps.Notify.Tests.* # unit/integration/e2e
```
@@ -35,7 +35,7 @@ src/
* **Notify.WebService** (stateless API)
* **Notify.Worker** (horizontal scale)
**Dependencies**: Authority (OpToks; DPoP/mTLS), **PostgreSQL** (notify schema), Redis/NATS (bus), HTTP egress to Slack/Teams/Webhooks, SMTP relay for Email.
**Dependencies**: Authority (OpToks; DPoP/mTLS), **PostgreSQL** (notify schema), Valkey/NATS (bus), HTTP egress to Slack/Teams/Webhooks, SMTP relay for Email.
> **Configuration.** Notify.WebService bootstraps from `notify.yaml` (see `etc/notify.yaml.sample`). Use `storage.driver: postgres` and provide `postgres.notify` options (`connectionString`, `schemaName`, pool sizing, timeouts). Authority settings follow the platform defaults—when running locally without Authority, set `authority.enabled: false` and supply `developmentSigningKey` so JWTs can be validated offline.
>
@@ -277,7 +277,7 @@ Canonical JSON Schemas for rules/channels/events live in `docs/modules/notify/re
* `throttles`
```
{ key:"idem:<hash>", ttlAt } // short-lived, also cached in Redis
{ key:"idem:<hash>", ttlAt } // short-lived, also cached in Valkey
```
**Indexes**: rules by `{tenantId, enabled}`, deliveries by `{tenantId, sentAt desc}`, digests by `{tenantId, actionKey}`.
@@ -346,12 +346,12 @@ Authority signs ack tokens using keys configured under `notifications.ackTokens`
* **Ingestor**: N consumers with per-key ordering (key = tenant|digest|namespace).
* **RuleMatcher**: loads active rules snapshot for tenant into memory; vectorized predicate check.
* **Throttle/Dedupe**: consult Redis + PostgreSQL `throttles`; if hit → record `status=throttled`.
* **Throttle/Dedupe**: consult Valkey + PostgreSQL `throttles`; if hit → record `status=throttled`.
* **DigestCoalescer**: append to open digest window or flush when timer expires.
* **Renderer**: select template (channel+locale), inject variables, enforce length limits, compute `bodyHash`.
* **Connector**: send; handle provider-specific rate limits and backoffs; `maxAttempts` with exponential jitter; overflow → DLQ (dead-letter topic) + UI surfacing.
**Idempotency**: per action **idempotency key** stored in Redis (TTL = `throttle window` or `digest window`). Connectors also respect **provider** idempotency where available (e.g., Slack `client_msg_id`).
**Idempotency**: per action **idempotency key** stored in Valkey (TTL = `throttle window` or `digest window`). Connectors also respect **provider** idempotency where available (e.g., Slack `client_msg_id`).
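
The idempotency check described above can be sketched as a claim-once key with a TTL. This is a minimal illustration, not the Notify implementation: `IdempotencyStore` is a hypothetical in-memory stand-in for Valkey, and the fields hashed into the `idem:<hash>` key are an assumption, since the source does not say what the hash covers.

```python
import hashlib
import time


class IdempotencyStore:
    """In-memory stand-in for the Valkey key store (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}  # key -> expiry timestamp
        self._clock = clock

    def try_claim(self, key: str, ttl_seconds: float) -> bool:
        """Return True if the key was free (send allowed), False if a live key exists."""
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False  # duplicate within the throttle/digest window
        self._expiry[key] = now + ttl_seconds
        return True


def idempotency_key(tenant_id: str, rule_id: str, action_id: str, event_digest: str) -> str:
    """Derive an `idem:<hash>` key; the choice of hashed fields is an assumption."""
    material = "|".join((tenant_id, rule_id, action_id, event_digest))
    return "idem:" + hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Against a real Valkey instance the same claim would typically be one atomic `SET key value NX EX <ttl>` command, which avoids the read-then-write race that this single-process sketch ignores.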
---
@@ -359,7 +359,7 @@ Authority signs ack tokens using keys configured under `notifications.ackTokens`
* **Per-tenant** RPM caps (default 600/min) + **per-channel** concurrency (Slack 1-4, Teams 1-2, Email 8-32 based on relay).
* **Backoff** map: Slack 429 → respect `Retry-After`; SMTP 4xx → retry; 5xx → retry with jitter; permanent rejects → drop with status recorded.
* **DLQ**: NATS/Redis stream `notify.dlq` with `{event, rule, action, error}` for operator inspection; UI shows DLQ items.
* **DLQ**: NATS/Valkey stream `notify.dlq` with `{event, rule, action, error}` for operator inspection; UI shows DLQ items.
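
The backoff map and DLQ overflow above can be sketched as a small decision function. This is an illustration under assumptions: `next_delay` and `plan` are hypothetical names, the `base` and `cap` values are placeholders, and `Retry-After` is assumed to arrive as a number of seconds (the HTTP header may also carry a date, which this sketch does not parse).

```python
import random


def next_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (1-based)."""
    if retry_after is not None:
        # Provider told us when to come back (e.g. Slack 429): honor it as-is.
        return float(retry_after)
    ceiling = min(cap, base * (2 ** (attempt - 1)))  # exponential growth, capped
    return rng() * ceiling  # "full jitter": uniform in [0, ceiling)


def plan(attempts_made, max_attempts, retry_after=None, permanent=False):
    """Return ("retry", delay), ("dlq", None) or ("drop", None)."""
    if permanent:
        return ("drop", None)  # permanent reject: record status, do not retry
    if attempts_made >= max_attempts:
        return ("dlq", None)  # attempts exhausted: hand off to notify.dlq
    return ("retry", next_delay(attempts_made, retry_after))
```

Injecting `rng` keeps the jitter deterministic in tests; production code would leave the default.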
---
@@ -402,7 +402,7 @@ notify:
issuer: "https://authority.internal"
require: "dpop" # or "mtls"
bus:
kind: "redis" # or "nats"
kind: "valkey" # or "nats" (valkey uses redis:// protocol)
streams:
- "scanner.events"
- "scheduler.events"
@@ -455,7 +455,7 @@ notify:
| Invalid channel secret | Mark channel unhealthy; suppress sends; surface in UI |
| Rule explosion (matches everything) | Safety valve: per-tenant RPM caps; auto-pause rule after X drops; UI alert |
| Bus outage | Buffer to local queue (bounded); resume consuming when healthy |
| PostgreSQL slowness | Fall back to Redis throttles; batch write deliveries; shed low-priority notifications |
| PostgreSQL slowness | Fall back to Valkey throttles; batch write deliveries; shed low-priority notifications |
---
@@ -466,7 +466,7 @@ notify:
* **Integration**: synthetic event storm (10k/min), ensure p95 latency & duplicate rate.
* **Security**: DPoP/mTLS on APIs; secretRef resolution; webhook signing & replay windows.
* **i18n**: localized templates render deterministically.
* **Chaos**: Slack/Teams API flaps; SMTP greylisting; Redis hiccups; ensure graceful degradation.
* **Chaos**: Slack/Teams API flaps; SMTP greylisting; Valkey hiccups; ensure graceful degradation.
---
@@ -514,7 +514,7 @@ sequenceDiagram
## 18) Implementation notes
* **Language**: .NET 10; minimal API; `System.Text.Json` with canonical writer for body hashing; Channels for pipelines.
* **Bus**: Redis Streams (**XGROUP** consumers) or NATS JetStream for at-least-once with ack; per-tenant consumer groups to localize backpressure.
* **Bus**: Valkey Streams (**XGROUP** consumers) or NATS JetStream for at-least-once with ack; per-tenant consumer groups to localize backpressure.
* **Templates**: compile and cache per rule+channel+locale; version with rule `updatedAt` to invalidate.
* **Rules**: store raw YAML + parsed AST; validate with schema + static checks (e.g., nonsensical combos).
* **Secrets**: pluggable secret resolver (Authority Secret proxy, K8s, Vault).
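
The template caching note above (compile per rule+channel+locale, invalidate on rule `updatedAt`) can be sketched as follows. `TemplateCache` and its `compile_fn` parameter are hypothetical names for illustration; the actual service is .NET and its cache shape is not shown in the source.

```python
class TemplateCache:
    """Cache compiled templates per (rule, channel, locale), versioned by updatedAt."""

    def __init__(self, compile_fn):
        self._compile = compile_fn  # assumed: turns template source into a compiled form
        self._entries = {}  # (rule_id, channel, locale) -> (updated_at, compiled)

    def get(self, rule_id, channel, locale, updated_at, source):
        key = (rule_id, channel, locale)
        cached = self._entries.get(key)
        if cached is not None and cached[0] == updated_at:
            return cached[1]  # rule unchanged: reuse the compiled template
        compiled = self._compile(source)  # first use, or rule edited: recompile
        self._entries[key] = (updated_at, compiled)
        return compiled
```

Storing `updatedAt` alongside the entry, rather than inside the key, means an edited rule replaces its stale entry instead of leaving it behind.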