docs consolidation work

StellaOps Bot
2025-12-25 10:53:53 +02:00
parent b9f71fc7e9
commit deb82b4f03
117 changed files with 852 additions and 847 deletions

@@ -1,3 +0,0 @@
# Notify Operations Artefacts
Landing zone for NR4, NR5, and NR8 evidence: quota/backpressure policies, DLQ schema, retry matrix, dashboards, and alert rules. Dashboards live under `operations/dashboards/`, alert configs under `operations/alerts/`.

@@ -1,27 +0,0 @@
groups:
  - name: notify-slo
    rules:
      - alert: NotifyDeliverySuccessSLO
        expr: sum(rate(notify_delivery_success_total[5m])) / sum(rate(notify_delivery_total[5m])) < 0.98
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "Notify delivery success below SLO"
          description: "Delivery success ratio (5m rate) below 98% for 10m"
      - alert: NotifyBacklogDepthHigh
        expr: notify_backlog_depth > 5000
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Notify backlog too high"
          description: "Backlog depth exceeded 5000 messages"
      - alert: NotifyDlqGrowth
        # notify_dlq_depth appears to be a gauge; rate() is only meaningful for counters,
        # so deriv() (per-second slope) is used here instead.
        expr: deriv(notify_dlq_depth[10m]) > 50
        for: 10m
        labels:
          severity: ticket
        annotations:
          summary: "Notify DLQ growth"
          description: "Dead letter queue growing faster than threshold"

@@ -1,9 +0,0 @@
{
  "title": "Notify SLO",
  "panels": [
    { "title": "Delivery success", "target": "sum(rate(notify_delivery_success_total[5m])) / sum(rate(notify_delivery_total[5m]))" },
    { "title": "Backlog depth", "target": "notify_backlog_depth" },
    { "title": "DLQ depth", "target": "notify_dlq_depth" },
    { "title": "Latency p95", "target": "histogram_quantile(0.95, rate(notify_delivery_latency_seconds_bucket[5m]))" }
  ]
}

@@ -1,7 +0,0 @@
# Quotas, backpressure, and DLQ (NR4)
- Per-tenant quotas: 500 deliveries/minute default; channel overrides: webhook 200/min, email 120/min, chat 240/min.
- Burst budget: 2x quota for 60 seconds, then hard clamp.
- Backpressure: reject enqueue when backlog > quota*10 or DLQ growth > 5%/min.
- DLQ schema: `docs/notifications/schemas/dlq-notify.schema.json`; redrive requires idempotent `delivery_id`/`dedupe_key`.
- Metrics to alert on: backlog depth, DLQ depth, redrive success rate, enqueue reject count.
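
The quota and backpressure rules above can be sketched roughly as follows. This is a minimal illustration, not the Notify implementation: the function names and the dict layout are hypothetical, "hard clamp" is assumed to mean falling back to the base quota, and the backlog limit is assumed to be measured against the channel's own quota (the policy does not say which quota applies).

```python
# Per-minute quotas taken from the policy above; names and layout are hypothetical.
DEFAULT_QUOTA = 500
CHANNEL_QUOTAS = {"webhook": 200, "email": 120, "chat": 240}
BURST_FACTOR = 2          # 2x quota while bursting
BURST_WINDOW_SECONDS = 60
BACKLOG_FACTOR = 10       # reject when backlog > quota * 10
DLQ_GROWTH_LIMIT = 0.05   # reject when DLQ grows by more than 5% per minute


def quota_for(channel: str) -> int:
    """Return the channel override, or the per-tenant default if none exists."""
    return CHANNEL_QUOTAS.get(channel, DEFAULT_QUOTA)


def allowed_rate(channel: str, burst_seconds_used: float) -> int:
    """Deliveries/minute allowed: 2x quota inside the burst window, base quota after.

    Assumption: "hard clamp" means returning to the base quota.
    """
    quota = quota_for(channel)
    if burst_seconds_used < BURST_WINDOW_SECONDS:
        return quota * BURST_FACTOR
    return quota


def should_reject_enqueue(channel: str, backlog: int, dlq_growth_per_min: float) -> bool:
    """Apply the backpressure rule: backlog too deep or DLQ growing too fast."""
    backlog_limit = quota_for(channel) * BACKLOG_FACTOR
    return backlog > backlog_limit or dlq_growth_per_min > DLQ_GROWTH_LIMIT
```

Under these assumptions an email channel (120/min) starts rejecting enqueues once its backlog exceeds 1200 messages, regardless of DLQ growth.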

@@ -1,7 +0,0 @@
# Retry and idempotency policy (NR5)
- `delivery_id`: UUIDv7; `dedupe_key`: hash(event_id + rule_id + channel_id).
- Backoff: exponential with jitter; base 2s, factor 2, max 5 attempts, cap 5 minutes between attempts.
- Connectors must be idempotent; retries reuse the same `dedupe_key` and must not duplicate sends.
- Out-of-order acks are ignored: only acks with a monotonically increasing `attempt` are accepted.
- Record retry outcomes in receipts, including the attempt count and reason.
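
A minimal sketch of the retry policy above, under stated assumptions: the policy says only "hash" and "jitter", so SHA-256, the `|` separator, and full jitter (uniform between 0 and the backoff ceiling) are illustrative choices, and "monotonic" is read as strictly increasing. All function and class names are hypothetical.

```python
import hashlib
import random
from typing import Dict, Optional

BASE_SECONDS = 2.0
FACTOR = 2.0
MAX_ATTEMPTS = 5
CAP_SECONDS = 300.0  # 5 minutes


def dedupe_key(event_id: str, rule_id: str, channel_id: str) -> str:
    """Stable key reused across retries. SHA-256 and "|" are assumptions."""
    raw = "|".join((event_id, rule_id, channel_id))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def backoff_ceiling(attempt: int) -> float:
    """Upper bound on the delay after failed attempt `attempt` (1-based)."""
    return min(CAP_SECONDS, BASE_SECONDS * FACTOR ** (attempt - 1))


def next_delay(attempt: int, rng: random.Random) -> Optional[float]:
    """Jittered delay before the next attempt, or None once attempts are exhausted."""
    if attempt >= MAX_ATTEMPTS:
        return None
    return rng.uniform(0.0, backoff_ceiling(attempt))


class AckTracker:
    """Accepts an ack only if its attempt number exceeds the last accepted one."""

    def __init__(self) -> None:
        self._last_attempt: Dict[str, int] = {}

    def accept(self, delivery_id: str, attempt: int) -> bool:
        if attempt <= self._last_attempt.get(delivery_id, 0):
            return False  # stale or duplicate ack: ignore it
        self._last_attempt[delivery_id] = attempt
        return True
```

With base 2s, factor 2, and 5 attempts, the ceilings between attempts are 2, 4, 8, and 16 seconds, so the 5-minute cap only comes into play if the attempt limit is raised.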