Escalations & Acknowledgements
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
Last updated: 2025-11-25 (Docs Tasks Md.V · DOCS-NOTIFY-40-001)
Model
- Escalation policy: ordered stages of channels with delays; stored per tenant.
- Acknowledgement: DSSE-signed token embedded in messages; acknowledger must present token to stop escalation.
- Suppression: rules may mark events as non-escalating (informational) while still sending single notifications.
Policy schema (conceptual)
{
  "id": "uuid",
  "tenant": "string",
  "name": "pager-policy-prod",
  "stages": [
    { "delaySeconds": 0, "channels": ["slack-prod", "email-oncall"] },
    { "delaySeconds": 900, "channels": ["pager-primary"] },
    { "delaySeconds": 1800, "channels": ["pager-management"] }
  ],
  "autoCloseMinutes": 120,
  "retry": { "maxAttempts": 3, "backoffSeconds": 60 }
}
- Stages execute sequentially until an ack is recorded.
- Deterministic ordering: channels within a stage are sorted lexicographically before dispatch.
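The ordering rules above can be sketched as follows. This is a minimal illustration, not the Notify implementation: the function name `dispatch_plan` is hypothetical, stages are kept in their stored order, and "lexicographic" is assumed to mean plain code-point ordering as produced by Python's `sorted`.

```python
import json

# The conceptual policy from this section, embedded so the sketch is self-contained.
POLICY_JSON = """
{
  "id": "uuid",
  "tenant": "string",
  "name": "pager-policy-prod",
  "stages": [
    { "delaySeconds": 0, "channels": ["slack-prod", "email-oncall"] },
    { "delaySeconds": 900, "channels": ["pager-primary"] },
    { "delaySeconds": 1800, "channels": ["pager-management"] }
  ],
  "autoCloseMinutes": 120,
  "retry": { "maxAttempts": 3, "backoffSeconds": 60 }
}
"""


def dispatch_plan(policy):
    """Return one (delaySeconds, channels) pair per stage, in stored stage order.

    Channels inside each stage are sorted so that dispatch order is the same
    on every run (assumption: code-point ordering).
    """
    return [
        (stage["delaySeconds"], sorted(stage["channels"]))
        for stage in policy["stages"]
    ]


plan = dispatch_plan(json.loads(POLICY_JSON))
for delay, channels in plan:
    print(delay, channels)
```

In this sketch stage 0 dispatches to `email-oncall` before `slack-prod`, regardless of the order in which the channels were written in the policy.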
Ack tokens
- Token payload: { tenant, deliveryId, expiresUtc, ruleId, actionHash }.
- Signed with Authority-issued DSSE key; verified by Notify WebService before accepting POST /acks/{token}.
- Expiry defaults to 24h; tokens are single-use and idempotent.
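One way to read "single-use and idempotent" is that the first valid ack is recorded and any replay of the same token returns the recorded result without changing state. The sketch below illustrates that reading only. `AckToken`, `issue_token`, and `AckRegistry` are hypothetical names, the registry is in-memory, and DSSE signature verification is assumed to have happened before `accept` is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(hours=24)  # default expiry stated in this section


@dataclass(frozen=True)
class AckToken:
    """Payload fields from this section; the signature envelope is omitted."""
    tenant: str
    delivery_id: str
    expires_utc: datetime
    rule_id: str
    action_hash: str


def issue_token(tenant, delivery_id, rule_id, action_hash, now, ttl=DEFAULT_TTL):
    """Build a token payload that expires ttl after now."""
    return AckToken(tenant, delivery_id, now + ttl, rule_id, action_hash)


class AckRegistry:
    """In-memory record of acknowledged deliveries (illustrative only)."""

    def __init__(self):
        self._acked = {}  # (tenant, delivery_id) -> time of first ack

    def accept(self, token, now):
        key = (token.tenant, token.delivery_id)
        if key in self._acked:
            # Replay of a used token: report the prior ack, change nothing.
            return "already_acknowledged"
        if now >= token.expires_utc:
            return "expired"
        self._acked[key] = now
        return "acknowledged"


issued_at = datetime(2025, 11, 25, 12, 0, tzinfo=timezone.utc)
token = issue_token("tenant-a", "delivery-1", "rule-1", "hash-1", issued_at)
registry = AckRegistry()
print(registry.accept(token, issued_at + timedelta(minutes=5)))
print(registry.accept(token, issued_at + timedelta(minutes=6)))
```

Checking for a prior ack before checking expiry is a design choice in this sketch: it keeps a late retry of an already-accepted ack from turning into an error.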
Escalation flow
- Rule fires → action references an escalation policy.
- Stage 0 deliveries sent; ledger records attempts and ack URL.
- If no ack by delaySeconds, next stage dispatches; repeats until ack or final stage.
- On ack, remaining stages are cancelled; ledger entry marked acknowledged with timestamp and subject.
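The flow above can be simulated in a few lines. This is a sketch under two assumptions that the section does not state explicitly: each `delaySeconds` is an offset from the start of the escalation (not from the previous stage), and an ack that arrives at exactly a stage's delay cancels that stage. The function name `stages_dispatched` is hypothetical.

```python
def stages_dispatched(stage_delays, ack_at_seconds=None):
    """Return the indices of stages that dispatch before an ack arrives.

    stage_delays: delaySeconds per stage, in policy order.
    ack_at_seconds: seconds after escalation start at which the ack is
        recorded, or None if nobody acknowledges.
    """
    sent = []
    for index, delay in enumerate(stage_delays):
        # Stage 0 always dispatches: the ack URL only exists once it is sent.
        acked_in_time = ack_at_seconds is not None and ack_at_seconds <= delay
        if index > 0 and acked_in_time:
            break  # ack recorded; remaining stages are cancelled
        sent.append(index)
    return sent


delays = [0, 900, 1800]
print(stages_dispatched(delays, ack_at_seconds=600))   # [0]
print(stages_dispatched(delays, ack_at_seconds=1000))  # [0, 1]
print(stages_dispatched(delays))                       # [0, 1, 2]
```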
Quiet hours & throttles
- Quiet hours suppress new escalations; in-flight escalations continue.
- Per-policy throttle prevents repeated escalation runs for identical actionHash within a configurable window (default 30m).
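Both gates apply only when a new escalation is about to start. A minimal sketch of how they might be evaluated is below; `in_quiet_hours` and `EscalationThrottle` are hypothetical names, quiet hours are assumed to be a daily UTC window that may wrap past midnight, and throttle state is kept in memory.

```python
from datetime import datetime, time, timedelta, timezone


def in_quiet_hours(now, start, end):
    """True if now falls in the daily window [start, end), which may wrap midnight."""
    current = now.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end


class EscalationThrottle:
    """Allow at most one escalation run per (policy, actionHash) per window."""

    def __init__(self, window=timedelta(minutes=30)):  # default from this section
        self.window = window
        self._last_run = {}  # (policy_id, action_hash) -> start time of last run

    def allow(self, policy_id, action_hash, now):
        key = (policy_id, action_hash)
        last = self._last_run.get(key)
        if last is not None and now - last < self.window:
            return False  # identical actionHash inside the window: skip
        self._last_run[key] = now
        return True


throttle = EscalationThrottle()
t0 = datetime(2025, 11, 25, 23, 30, tzinfo=timezone.utc)
print(in_quiet_hours(t0, time(22, 0), time(6, 0)))
print(throttle.allow("pager-policy-prod", "hash-1", t0))
print(throttle.allow("pager-policy-prod", "hash-1", t0 + timedelta(minutes=10)))
```

A denied attempt does not refresh the window in this sketch, so a steady stream of identical events still produces one escalation per window.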
Observability
- Counters: notify.escalation.started, notify.escalation.stage_sent, notify.escalation.ack, notify.escalation.cancelled, tagged by tenant, policy, stage.
- Logs: structured escalation.{started|stage_sent|ack|cancelled} with delivery ids and rationale.
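To make the tagging concrete, the sketch below keeps the four counters in memory, keyed by name and by the tenant, policy, and stage tags. It is an illustration only: `EscalationMetrics` is a hypothetical class, and a real deployment would presumably emit these through its metrics library instead.

```python
from collections import Counter

# Counter names listed in this section.
COUNTER_NAMES = frozenset({
    "notify.escalation.started",
    "notify.escalation.stage_sent",
    "notify.escalation.ack",
    "notify.escalation.cancelled",
})


class EscalationMetrics:
    """In-memory counters tagged by tenant, policy, and stage."""

    def __init__(self):
        self._counts = Counter()

    def increment(self, name, tenant, policy, stage):
        if name not in COUNTER_NAMES:
            raise ValueError(f"unknown counter: {name}")
        self._counts[(name, tenant, policy, stage)] += 1

    def value(self, name, tenant, policy, stage):
        return self._counts[(name, tenant, policy, stage)]


metrics = EscalationMetrics()
metrics.increment("notify.escalation.started", "tenant-a", "pager-policy-prod", 0)
metrics.increment("notify.escalation.stage_sent", "tenant-a", "pager-policy-prod", 0)
metrics.increment("notify.escalation.stage_sent", "tenant-a", "pager-policy-prod", 1)
print(metrics.value("notify.escalation.stage_sent", "tenant-a", "pager-policy-prod", 1))
```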
Runbooks
- Update escalation policy safely: create new policy id, switch rules, then delete old policy to avoid mid-flight ambiguity.
- If a stage storms, widen the throttle window or add quiet hours; do not delete the policy mid-flight. Use the cancelEscalation endpoint instead.