docs consolidation work
@@ -1,3 +0,0 @@
# Notify Operations Artefacts

Landing zone for NR4, NR5, and NR8 evidence: quota/backpressure policies, DLQ schema, retry matrix, dashboards, and alert rules. Dashboards live under `operations/dashboards/`, alert configs under `operations/alerts/`.
@@ -1,27 +0,0 @@
groups:
  - name: notify-slo
    rules:
      - alert: NotifyDeliverySuccessSLO
        expr: sum(rate(notify_delivery_success_total[5m])) / sum(rate(notify_delivery_total[5m])) < 0.98
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "Notify delivery success below SLO"
          description: "Success ratio below 98% over 10m"
      - alert: NotifyBacklogDepthHigh
        expr: notify_backlog_depth > 5000
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Notify backlog too high"
          description: "Backlog depth exceeded 5000 messages"
      - alert: NotifyDlqGrowth
        expr: deriv(notify_dlq_depth[10m]) > 50  # deriv(), not rate(): notify_dlq_depth is assumed to be a gauge
        for: 10m
        labels:
          severity: ticket
        annotations:
          summary: "Notify DLQ growth"
          description: "Dead letter queue growing faster than threshold"
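The `NotifyDeliverySuccessSLO` expression is a ratio of two rates compared against 0.98. The sketch below illustrates only that arithmetic, not how Prometheus evaluates rules. The function names and the counter-delta inputs are hypothetical. Treating zero traffic as "no breach" is an assumption that mirrors PromQL, where 0/0 yields NaN and NaN fails the `<` comparison.

```python
import math

SLO_THRESHOLD = 0.98  # threshold taken from the alert expression


def success_ratio(success_delta: float, total_delta: float) -> float:
    """Return success/total for a window, or NaN when there was no traffic."""
    if total_delta == 0:
        return math.nan
    return success_delta / total_delta


def breaches_slo(success_delta: float, total_delta: float) -> bool:
    """True when the success ratio is strictly below the SLO threshold."""
    ratio = success_ratio(success_delta, total_delta)
    # NaN < x evaluates to False, so an idle window never breaches.
    return ratio < SLO_THRESHOLD
```

For example, 970 successes out of 1000 deliveries breaches the SLO, while an idle window with no deliveries does not.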
@@ -1,9 +0,0 @@
{
  "title": "Notify SLO",
  "panels": [
    { "title": "Delivery success", "target": "sum(rate(notify_delivery_success_total[5m])) / sum(rate(notify_delivery_total[5m]))" },
    { "title": "Backlog depth", "target": "notify_backlog_depth" },
    { "title": "DLQ depth", "target": "notify_dlq_depth" },
    { "title": "Latency p95", "target": "histogram_quantile(0.95, rate(notify_delivery_latency_seconds_bucket[5m]))" }
  ]
}
@@ -1,7 +0,0 @@
# Quotas, backpressure, and DLQ (NR4)

- Per-tenant quotas: 500 deliveries/minute by default; channel overrides: webhook 200/min, email 120/min, chat 240/min.
- Burst budget: 2x quota for 60 seconds, then hard clamp.
- Backpressure: reject enqueue when backlog > quota*10 or DLQ growth > 5%/min.
- DLQ schema: `docs/notifications/schemas/dlq-notify.schema.json`; redrive requires an idempotent `delivery_id`/`dedupe_key`.
- Metrics to alert on: backlog depth, DLQ depth, redrive success rate, enqueue reject count.
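The quota, burst, and backpressure rules in this note can be sketched as a few pure functions. This is a minimal illustration under stated assumptions, not the service's implementation: all function and constant names are hypothetical, a channel override is assumed to replace the 500/min default for that channel, `seconds_in_burst` is assumed to count from the moment a tenant first exceeds its quota, and DLQ growth is assumed to be supplied as a fraction per minute.

```python
from typing import Optional

DEFAULT_QUOTA_PER_MIN = 500
CHANNEL_QUOTA_PER_MIN = {"webhook": 200, "email": 120, "chat": 240}
BURST_FACTOR = 2          # 2x quota during a burst
BURST_WINDOW_S = 60       # burst budget lasts 60 seconds
BACKLOG_FACTOR = 10       # reject when backlog > quota * 10
DLQ_GROWTH_LIMIT = 0.05   # reject when DLQ grows faster than 5% per minute


def quota_for(channel: Optional[str] = None) -> int:
    """Deliveries per minute for a channel, falling back to the default."""
    return CHANNEL_QUOTA_PER_MIN.get(channel, DEFAULT_QUOTA_PER_MIN)


def allowed_rate(channel: Optional[str], seconds_in_burst: float) -> int:
    """Permitted deliveries per minute: 2x quota inside the burst window,
    then the plain quota (the hard clamp)."""
    quota = quota_for(channel)
    if seconds_in_burst < BURST_WINDOW_S:
        return quota * BURST_FACTOR
    return quota


def should_reject_enqueue(
    channel: Optional[str], backlog: int, dlq_growth_per_min: float
) -> bool:
    """Backpressure check: either condition alone rejects the enqueue."""
    quota = quota_for(channel)
    return backlog > quota * BACKLOG_FACTOR or dlq_growth_per_min > DLQ_GROWTH_LIMIT
```

With these numbers, the email channel starts rejecting enqueues once its backlog exceeds 1200 messages, and the webhook channel may send up to 400/min during the first 60 seconds of a burst.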
@@ -1,7 +0,0 @@
# Retry and idempotency policy (NR5)

- `delivery_id`: UUIDv7; `dedupe_key`: hash(event_id + rule_id + channel_id).
- Backoff: exponential with jitter; base 2s, factor 2, max 5 attempts, cap of 5 minutes between attempts.
- Connectors must be idempotent; retries reuse the same `dedupe_key` and must not duplicate sends.
- Out-of-order acks are ignored: only a monotonically increasing `attempt` is accepted.
- Record retry outcomes in receipts, including the attempt count and reason.
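The backoff schedule, `dedupe_key` derivation, and ack ordering rule can be sketched as follows. Several details are assumptions because the policy does not specify them: the jitter style (full jitter, uniform between 0 and the computed ceiling), the hash function (SHA-256), the field separator used before hashing, and the reading of "monotonic" as strictly increasing. All function names are hypothetical.

```python
import hashlib
import random
from typing import Optional

BASE_S = 2.0        # base delay
FACTOR = 2.0        # exponential factor
MAX_ATTEMPTS = 5    # total attempts, including the first
CAP_S = 300.0       # at most 5 minutes between attempts


def backoff_ceiling(attempt: int) -> float:
    """Upper bound of the delay after failed attempt `attempt` (1-based)."""
    return min(CAP_S, BASE_S * FACTOR ** (attempt - 1))


def next_delay(attempt: int, rng: Optional[random.Random] = None) -> Optional[float]:
    """Jittered delay before the next attempt, or None when attempts are exhausted."""
    if attempt >= MAX_ATTEMPTS:
        return None
    source = rng or random
    # Full jitter (assumption): pick uniformly between 0 and the ceiling.
    return source.uniform(0, backoff_ceiling(attempt))


def dedupe_key(event_id: str, rule_id: str, channel_id: str) -> str:
    """Stable key reused across retries of the same delivery."""
    # The unit separator (assumption) keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join((event_id, rule_id, channel_id))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def accept_ack(last_attempt: int, ack_attempt: int) -> bool:
    """Accept an ack only if its attempt number moves forward."""
    return ack_attempt > last_attempt
```

Under these parameters the delay ceilings after attempts 1 through 4 are 2s, 4s, 8s, and 16s, so the 5-minute cap only matters if the attempt limit or base is raised.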