docs/notifications/escalations.md
# Escalations & Acknowledgements

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

Last updated: 2025-11-25 (Docs Tasks Md.V · DOCS-NOTIFY-40-001)

## Model

- **Escalation policy**: ordered stages of channels with delays; stored per tenant.
- **Acknowledgement**: DSSE-signed token embedded in messages; the acknowledger must present the token to stop escalation.
- **Suppression**: rules may mark events as non-escalating (informational) while still sending a single notification.

## Policy schema (conceptual)

```json
{
  "id": "uuid",
  "tenant": "string",
  "name": "pager-policy-prod",
  "stages": [
    { "delaySeconds": 0, "channels": ["slack-prod", "email-oncall"] },
    { "delaySeconds": 900, "channels": ["pager-primary"] },
    { "delaySeconds": 1800, "channels": ["pager-management"] }
  ],
  "autoCloseMinutes": 120,
  "retry": { "maxAttempts": 3, "backoffSeconds": 60 }
}
```

- Stages execute sequentially until an **ack** is recorded.
- Deterministic ordering: channels within a stage are sorted lexicographically before dispatch.

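
The ordering rule above can be illustrated with a minimal Python sketch. The `Stage` class and `dispatch_order` helper are hypothetical names introduced here for illustration, and the sketch assumes that "lexicographically" means a plain ordinal string sort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    delay_seconds: int
    channels: tuple[str, ...]


def dispatch_order(stages: list[dict]) -> list[Stage]:
    """Keep stages in declared order; sort channels inside each stage.

    Assumption: lexicographic order is Python's default string ordering.
    """
    return [
        Stage(
            delay_seconds=stage["delaySeconds"],
            channels=tuple(sorted(stage["channels"])),
        )
        for stage in stages
    ]


# Stages copied from the conceptual policy schema above.
policy = {
    "name": "pager-policy-prod",
    "stages": [
        {"delaySeconds": 0, "channels": ["slack-prod", "email-oncall"]},
        {"delaySeconds": 900, "channels": ["pager-primary"]},
        {"delaySeconds": 1800, "channels": ["pager-management"]},
    ],
}

plan = dispatch_order(policy["stages"])
print(plan[0].channels)  # ('email-oncall', 'slack-prod')
```

Sorting at plan time rather than at send time means two runs of the same policy produce the same dispatch sequence regardless of how the channel list was authored.
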
## Ack tokens

- Token payload: `{ tenant, deliveryId, expiresUtc, ruleId, actionHash }`.
- Signed with an Authority-issued DSSE key; the Notify WebService verifies the signature before accepting `POST /acks/{token}`.
- Expiry defaults to 24 hours; tokens are single-use, and acknowledgement is idempotent.

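
One way to read "single-use and idempotent" is sketched below in Python. This is an illustration under stated assumptions, not the Notify implementation: `AckToken` mirrors the payload fields above, `AckRegistry` is a hypothetical in-memory stand-in for the ledger, and DSSE signature verification is assumed to have already succeeded before `accept` is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AckToken:
    tenant: str
    delivery_id: str
    expires_utc: datetime
    rule_id: str
    action_hash: str


class AckRegistry:
    """Hypothetical in-memory ledger stand-in, for illustration only."""

    def __init__(self) -> None:
        self._acked: dict[tuple[str, str], datetime] = {}

    def accept(self, token: AckToken, now: datetime) -> str:
        # Assumption: the signature was verified before this point.
        key = (token.tenant, token.delivery_id)
        if key in self._acked:
            return "already_acknowledged"  # replaying a used token changes nothing
        if now >= token.expires_utc:
            return "expired"
        self._acked[key] = now
        return "acknowledged"


issued = datetime(2025, 11, 25, 12, 0, tzinfo=timezone.utc)
token = AckToken("tenant-a", "delivery-1", issued + timedelta(hours=24), "rule-1", "abc123")
registry = AckRegistry()

first = registry.accept(token, issued + timedelta(minutes=5))
second = registry.accept(token, issued + timedelta(minutes=6))
print(first, second)  # acknowledged already_acknowledged
```

In this reading, the first presentation of a token records the ack and every later presentation returns the same settled state without side effects.
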
## Escalation flow

1) A rule fires → its action references an escalation policy.
2) Stage 0 deliveries are sent; the ledger records the attempts and the ack URL.
3) If no ack is recorded by the time the next stage's `delaySeconds` elapses, that stage dispatches; this repeats until an ack arrives or the final stage has run.
4) On ack, the remaining stages are cancelled; the ledger entry is marked `acknowledged` with timestamp and subject.

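
The steps above can be traced with a small simulation. The sketch assumes that `delaySeconds` is an offset from the start of the escalation (the source does not say whether it is measured from the start or from the previous stage), and `simulate_escalation` is a hypothetical helper, not part of Notify.

```python
from typing import Optional


def simulate_escalation(
    stages: list[dict], ack_at: Optional[int]
) -> list[tuple[int, str, str]]:
    """Return (second, event, detail) entries for one escalation run.

    Assumption: delaySeconds is measured from the start of the escalation.
    """
    events = [(0, "started", "")]
    for index, stage in enumerate(stages):
        fire_at = stage["delaySeconds"]
        # An ack recorded before this stage is due cancels it and all later stages.
        if ack_at is not None and ack_at < fire_at:
            last = len(stages) - 1
            events.append((ack_at, "ack", f"cancelled stages {index}..{last}"))
            return events
        for channel in sorted(stage["channels"]):
            events.append((fire_at, "stage_sent", f"stage {index} -> {channel}"))
    if ack_at is not None:
        events.append((ack_at, "ack", "no stages left to cancel"))
    return events


stages = [
    {"delaySeconds": 0, "channels": ["slack-prod", "email-oncall"]},
    {"delaySeconds": 900, "channels": ["pager-primary"]},
    {"delaySeconds": 1800, "channels": ["pager-management"]},
]

events = simulate_escalation(stages, ack_at=1200)
for event in events:
    print(event)
```

With an ack at 1200 seconds, stages 0 and 1 have already dispatched and only the `pager-management` stage is cancelled.
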
## Quiet hours & throttles

- Quiet hours suppress *new* escalations; in-flight escalations continue.
- Per-policy throttle prevents repeated escalation runs for identical `actionHash` within a configurable window (default 30m).

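
A minimal sketch of both checks follows, assuming the throttle is keyed by policy id plus `actionHash` and that quiet hours are a daily time-of-day window. `EscalationThrottle` and `in_quiet_hours` are hypothetical names; neither check touches escalations that are already in flight.

```python
from datetime import datetime, time, timedelta, timezone


class EscalationThrottle:
    """Allows at most one new run per (policy, actionHash) within the window."""

    def __init__(self, window: timedelta = timedelta(minutes=30)) -> None:
        self.window = window
        self._last_run: dict[tuple[str, str], datetime] = {}

    def allow(self, policy_id: str, action_hash: str, now: datetime) -> bool:
        key = (policy_id, action_hash)
        last = self._last_run.get(key)
        if last is not None and now - last < self.window:
            return False  # throttled; the recorded run time is left unchanged
        self._last_run[key] = now
        return True


def in_quiet_hours(now: datetime, start: time, end: time) -> bool:
    """True if now falls in [start, end); handles windows that cross midnight."""
    current = now.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end


throttle = EscalationThrottle()
t0 = datetime(2025, 11, 25, 23, 0, tzinfo=timezone.utc)

print(throttle.allow("pager-policy-prod", "abc123", t0))                          # True
print(throttle.allow("pager-policy-prod", "abc123", t0 + timedelta(minutes=10)))  # False
print(throttle.allow("pager-policy-prod", "abc123", t0 + timedelta(minutes=31)))  # True
print(in_quiet_hours(t0, time(22, 0), time(6, 0)))                                # True
```

A caller would consult both checks only when deciding whether to start a new escalation run.
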
## Observability

- Counters: `notify.escalation.started`, `notify.escalation.stage_sent`, `notify.escalation.ack`, `notify.escalation.cancelled`, tagged by `tenant`, `policy`, `stage`.
- Logs: structured `escalation.{started|stage_sent|ack|cancelled}` with delivery ids and rationale.

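
To show how the counter names and tags fit together, here is a small in-memory sketch. It uses a plain `collections.Counter` in place of a real metrics backend, and the `record` helper is a hypothetical name; the actual instrumentation library is not specified in this document.

```python
from collections import Counter

# Event suffixes taken from the counter names listed above.
EVENTS = {"started", "stage_sent", "ack", "cancelled"}

counters: Counter = Counter()


def record(event: str, tenant: str, policy: str, stage: int) -> str:
    """Increment notify.escalation.<event> for one tag combination."""
    if event not in EVENTS:
        raise ValueError(f"unknown escalation event: {event}")
    name = f"notify.escalation.{event}"
    counters[(name, tenant, policy, stage)] += 1
    return name


record("started", "tenant-a", "pager-policy-prod", 0)
record("stage_sent", "tenant-a", "pager-policy-prod", 0)
record("stage_sent", "tenant-a", "pager-policy-prod", 1)
record("ack", "tenant-a", "pager-policy-prod", 1)

for key, value in sorted(counters.items()):
    print(key, value)
```
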
## Runbooks

- To update an escalation policy safely, create a new policy id, switch rules to it, then delete the old policy; this avoids mid-flight ambiguity.
- If a stage storms, set the throttle higher or add quiet hours; do not delete the policy mid-flight. Use the `cancelEscalation` endpoint instead.