docs consolidation

StellaOps Bot
2025-12-24 21:45:46 +02:00
parent 4231305fec
commit 43e2af88f6
76 changed files with 2887 additions and 796 deletions

@@ -8,7 +8,7 @@ Operational steps to deploy, monitor, and recover the Notifications service (Web
## Pre-flight
- Secrets stored in Authority: SMTP creds, Slack/Teams hooks, webhook HMAC keys.
- Outbound allowlist updated for target channels.
-- PostgreSQL and Redis reachable; health checks pass.
+- PostgreSQL and Valkey reachable; health checks pass.
- Offline kit loaded: channel manifests, default templates, rule seeds.
## Deploy
@@ -37,7 +37,7 @@ Operational steps to deploy, monitor, and recover the Notifications service (Web
- **Rotate secrets**: update Authority secret, then `POST /api/v1/notify/channels/{id}:refresh-secret`.
## Failure recovery
-- Worker crash loop: check Redis connectivity, template compile errors; run `notify-worker --validate-only` using current config.
+- Worker crash loop: check Valkey connectivity, template compile errors; run `notify-worker --validate-only` using current config.
- PostgreSQL outage: worker backs off with exponential retry; after recovery, replay via `:replay` or digests as needed.
- Channel outage (e.g., Slack 5xx): throttles + retry policy handle transient errors; for extended outages, disable channel or swap to backup policy.
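
The "backs off with exponential retry" behavior in the PostgreSQL outage step can be pictured with a minimal sketch. The runbook does not state the worker's real parameters, so the base delay, growth factor, cap, and jitter below are illustrative assumptions, and `backoff_delays` is a hypothetical helper, not part of the Notifications service.

```python
import random


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6, jitter=0.0, rng=None):
    """Yield the wait time in seconds before each retry attempt.

    All defaults are assumed values for illustration only.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(cap, base * factor ** attempt)
        # Optional jitter spreads retries so workers do not reconnect in lockstep.
        if jitter:
            delay += rng.uniform(0, jitter * delay)
        yield delay


if __name__ == "__main__":
    for n, delay in enumerate(backoff_delays(attempts=8), start=1):
        print(f"retry {n}: wait {delay:.1f}s")
```

With a cap in place, the delay stops growing once it reaches the limit, which keeps recovery time bounded after a long outage.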
@@ -54,5 +54,5 @@ Operational steps to deploy, monitor, and recover the Notifications service (Web
- [ ] Health endpoints green.
- [ ] Delivery failure rate < 0.5% over last hour.
- [ ] Escalation backlog empty or within SLO.
-- [ ] Redis memory < 75% and PostgreSQL primary healthy.
+- [ ] Valkey memory < 75% and PostgreSQL primary healthy.
- [ ] Latest release notes applied and channels validated.
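
The numeric items in the checklist above could be evaluated automatically. The following is a rough sketch under stated assumptions: the `Snapshot` type and `failed_checks` function are hypothetical, and how the values are collected (health endpoints, metrics backend) is not specified by the runbook. Only the thresholds (failure rate below 0.5%, Valkey memory below 75%) come from the checklist.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical point-in-time readings gathered by an operator or script."""
    health_green: bool
    delivery_failure_rate: float   # fraction of failed deliveries over the last hour
    valkey_memory_fraction: float  # used memory / max memory
    postgres_primary_healthy: bool


def failed_checks(s: Snapshot) -> list[str]:
    """Return the checklist items that do not pass for this snapshot."""
    failures = []
    if not s.health_green:
        failures.append("health endpoints not green")
    if s.delivery_failure_rate >= 0.005:   # checklist: < 0.5%
        failures.append("delivery failure rate >= 0.5%")
    if s.valkey_memory_fraction >= 0.75:   # checklist: < 75%
        failures.append("Valkey memory >= 75%")
    if not s.postgres_primary_healthy:
        failures.append("PostgreSQL primary unhealthy")
    return failures


if __name__ == "__main__":
    snapshot = Snapshot(True, 0.001, 0.80, True)
    for item in failed_checks(snapshot):
        print("FAIL:", item)
```

The escalation backlog and release-note items are left out because they depend on SLO definitions and manual validation that the checklist does not quantify.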