# Notifications Rules
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

Rules decide which platform events deserve a notification, how aggressively they should be throttled, and which channels/actions should run. They are tenant-scoped contracts that guarantee deterministic routing across Notify.Worker replicas.

---
## 1. Rule lifecycle
1. **Authoring.** Operators create or update rules through the Notify WebService (`POST /rules`, `PATCH /rules/{id}`) or the UI. Payloads are normalised to the current `NotifyRule` schema version.
2. **Evaluation.** Notify.Worker evaluates enabled rules per incoming event. Tenancy is enforced first, followed by match filters, VEX gates, throttles, and digest handling.
3. **Delivery.** Matching actions are enqueued with an idempotency key to prevent storm loops. Throttle rejections and digest coalescing are recorded in the delivery ledger.
4. **Audit.** Every change carries `createdBy`/`updatedBy` plus timestamps; the delivery ledger references `ruleId`/`actionId` for traceability.
---
## 2. Rule schema reference
| Field | Type | Notes |
|-------|------|-------|
| `ruleId` | string | Stable identifier; clients may provide UUID/slug. |
| `tenantId` | string | Must match the tenant header supplied when the rule is created. |
| `name` | string | Display label shown in UI and audits. |
| `description` | string? | Optional operator-facing note. |
| `enabled` | bool | Disabled rules remain stored but skipped during evaluation. |
| `labels` | map<string,string> | Sorted, trimmed key/value tags supporting filtering. |
| `metadata` | map<string,string> | Reserved for automation; stored verbatim (sorted). |
| `match` | [`NotifyRuleMatch`](#3-match-filters) | Declarative filters applied before actions execute. |
| `actions[]` | [`NotifyRuleAction`](#4-actions-throttles-and-digests) | Ordered set of channel dispatchers; minimum one. |
| `createdBy`/`createdAt` | string?, instant | Populated automatically when omitted. |
| `updatedBy`/`updatedAt` | string?, instant | Defaults to creation values when unspecified. |
| `schemaVersion` | string | Auto-upgraded during persistence; use for migrations. |

Rules are stored as immutable snapshots: every update writes the full document, so workers observing change streams can refresh their caches deterministically.

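
The "sorted, trimmed" handling of `labels` and the "stored verbatim (sorted)" handling of `metadata` can be illustrated with a minimal Python sketch. The function name `normalise_map` and its `trim` flag are hypothetical and are not part of the Notify codebase; the sketch only shows why a canonical ordering makes two logically equal rule documents serialise identically.

```python
import json

def normalise_map(tags: dict[str, str], trim: bool = True) -> dict[str, str]:
    """Return the pairs sorted by key; optionally trim whitespace (hypothetical helper)."""
    if trim:
        tags = {key.strip(): value.strip() for key, value in tags.items()}
    return dict(sorted(tags.items(), key=lambda pair: pair[0]))

# Labels are trimmed and sorted; metadata is only sorted (stored verbatim).
labels = normalise_map({" owner ": " soc ", "env": "prod"})
metadata = normalise_map({"revision": "12", "batch": " a "}, trim=False)

# Two inputs that differ only in ordering or padding serialise to the same text.
canonical = json.dumps({"labels": labels, "metadata": metadata})
```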
---
## 3. Match filters
`NotifyRuleMatch` narrows which events trigger the rule. All string collections are trimmed, deduplicated, and sorted to guarantee deterministic evaluation.
| Field | Type | Behaviour |
|-------|------|-----------|
| `eventKinds[]` | string | Lower-cased; supports any canonical Notify event (`scanner.report.ready`, `scheduler.rescan.delta`, `zastava.admission`, etc.). Empty list matches all kinds. |
| `namespaces[]` | string | Exact match against `event.scope.namespace`. Glob-style filters via upstream enrichment are planned but not yet supported. |
| `repositories[]` | string | Matches `event.scope.repo`. |
| `digests[]` | string | Lower-cased; matches `event.scope.digest`. |
| `labels[]` | string | Matches event attributes or delta labels (`kev`, `critical`, `license`, …). |
| `componentPurls[]` | string | Matches component identifiers inside the event payload when provided. |
| `minSeverity` | string? | Lower-cased severity gate (e.g., `medium`, `high`, `critical`). Evaluated on new findings inside event deltas; events lacking severity bypass this gate unless set. |
| `verdicts[]` | string | Accepts scan/report verdicts (`fail`, `warn`, `block`, `escalate`, `deny`). |
| `kevOnly` | bool? | When `true`, only KEV-tagged findings fire. |
| `vex` | object | Additional gating aligned with VEX consensus; see below. |
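
As a rough illustration of how these filters combine, the following Python sketch evaluates a subset of `NotifyRuleMatch` against an event. Several details are assumptions rather than documented behaviour: the event is a plain dict, new findings live under a hypothetical `findings` key with `severity` and `kev` fields, the severity ranking is `low < medium < high < critical`, and an empty list is treated as "match all" for every collection (the table only states this for `eventKinds`).

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}  # assumed ordering

def normalise(values, lower=False):
    """Trim, optionally lower-case, deduplicate, and sort a string collection."""
    cleaned = {v.strip().lower() if lower else v.strip() for v in values}
    cleaned.discard("")
    return sorted(cleaned)

def rule_matches(match, event):
    """Return True when the event passes every configured filter in this sketch."""
    kinds = normalise(match.get("eventKinds", []), lower=True)
    if kinds and event.get("kind", "").lower() not in kinds:
        return False

    scope = event.get("scope", {})
    for field, scope_key in (("namespaces", "namespace"), ("repositories", "repo")):
        allowed = normalise(match.get(field, []))
        if allowed and scope.get(scope_key) not in allowed:
            return False

    findings = event.get("findings", [])  # hypothetical field name
    if match.get("kevOnly"):
        findings = [f for f in findings if f.get("kev")]
        if not findings:
            return False

    min_severity = match.get("minSeverity")
    if min_severity:
        threshold = SEVERITY_RANK[min_severity.lower()]
        rated = [f for f in findings if f.get("severity")]
        # Events that carry no severity information bypass the gate.
        if rated and not any(
            SEVERITY_RANK.get(f["severity"].lower(), -1) >= threshold for f in rated
        ):
            return False
    return True
```

Filters are checked from cheapest to most specific, and the first failing filter short-circuits the evaluation, which keeps the outcome independent of the order in which list entries were authored.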
### 3.1 VEX gates
`NotifyRuleMatchVex` offers fine-grained control when VEX findings accompany events:
| Field | Default | Effect |
|-------|---------|--------|
| `includeAcceptedJustifications` | `true` | Include findings marked `not_affected`/`acceptable` in consensus. |
| `includeRejectedJustifications` | `false` | Surface findings the consensus rejected. |
| `includeUnknownJustifications` | `false` | Allow findings without explicit justification. |
| `justificationKinds[]` | `[]` | Optional allow-list of justification codes (e.g., `exploit_observed`, `component_not_present`). |

If the VEX block filters out every applicable finding, the rule is treated as a non-match and no actions run.

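
The gate can be pictured as a filter over findings followed by an "anything left?" check. The Python sketch below is illustrative only: the finding fields `justification` and `consensusAccepted` are hypothetical, a finding without a justification is bucketed as "unknown", and the `justificationKinds` allow-list is assumed to apply only to findings that carry a justification.

```python
VEX_DEFAULTS = {
    "includeAcceptedJustifications": True,
    "includeRejectedJustifications": False,
    "includeUnknownJustifications": False,
    "justificationKinds": [],
}

def classify(finding):
    """Bucket a finding as 'accepted', 'rejected', or 'unknown' (hypothetical fields)."""
    if not finding.get("justification"):
        return "unknown"
    return "accepted" if finding.get("consensusAccepted") else "rejected"

def apply_vex_gate(vex, findings):
    """Return the findings that survive the VEX block, using the documented defaults."""
    cfg = {**VEX_DEFAULTS, **(vex or {})}
    include = {
        "accepted": cfg["includeAcceptedJustifications"],
        "rejected": cfg["includeRejectedJustifications"],
        "unknown": cfg["includeUnknownJustifications"],
    }
    allowed = set(cfg["justificationKinds"])
    kept = []
    for finding in findings:
        bucket = classify(finding)
        if not include[bucket]:
            continue
        if allowed and bucket != "unknown" and finding["justification"] not in allowed:
            continue
        kept.append(finding)
    return kept

def vex_allows_rule(vex, findings):
    """A rule is a non-match when findings exist but the gate removes all of them."""
    return not findings or bool(apply_vex_gate(vex, findings))
```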
---
## 4. Actions, throttles, and digests
Each rule requires at least one action. Actions are deduplicated and sorted by `actionId`, so prefer deterministic identifiers.
| Field | Type | Notes |
|-------|------|-------|
| `actionId` | string | Stable identifier unique within the rule. |
| `channel` | string | Reference to a channel (`channelId`) configured in `/channels`. |
| `template` | string? | Template key to use for rendering; falls back to channel default when omitted. |
| `digest` | string? | Digest window key (`instant`, `5m`, `15m`, `1h`, `1d`). `instant` bypasses coalescing. |
| `throttle` | ISO8601 duration? | Optional throttle TTL (`PT300S`, `PT1H`). Prevents duplicate deliveries when the same idempotency hash appears before expiry. |
| `locale` | string? | BCP-47 tag (stored lower-case). Template lookup falls back to channel locale then `en-us`. |
| `enabled` | bool | Disabled actions skip rendering but remain stored. |
| `metadata` | map<string,string> | Connector-specific hints (priority, layout, etc.). |
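
To make the `throttle` and `digest` value formats concrete, here is a small Python sketch that converts both into `timedelta` values. It is not the service's parser: it handles only the time part of ISO 8601 durations (`PT…H…M…S`), which covers the examples in the table, and the names `parse_throttle` and `DIGEST_WINDOWS` are hypothetical.

```python
import re
from datetime import timedelta

# Accepts only the time part of an ISO 8601 duration, e.g. PT300S, PT1H, PT1H30M.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")

# Digest window keys listed in the table above, mapped to their lengths.
DIGEST_WINDOWS = {
    "instant": timedelta(0),
    "5m": timedelta(minutes=5),
    "15m": timedelta(minutes=15),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
}

def parse_throttle(value: str) -> timedelta:
    """Parse a throttle TTL such as 'PT300S' into a timedelta."""
    found = _DURATION.match(value)
    if not found or not any(found.groups()):
        raise ValueError(f"unsupported throttle duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in found.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

With this helper, `parse_throttle("PT300S")` and `DIGEST_WINDOWS["5m"]` both describe a five-minute span, which shows that the two fields use different notations for comparable quantities.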
### 4.1 Evaluation order
1. Verify channel exists and is enabled; disabled channels mark the delivery as `Dropped`.
2. Compute the throttle idempotency key, `hash(ruleId|actionId|event.kind|scope.digest|delta.hash|dayBucket)`, and check it against the action's throttle window. Hits are logged as `Throttled`.
3. If the action defines a digest window other than `instant`, append the event to the open window and defer delivery until flush.
4. When delivery proceeds, the renderer resolves the template, locale, and metadata before invoking the connector.
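
The four steps above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the Notify.Worker implementation: SHA-256 stands in for the unspecified `hash`, `dayBucket` is assumed to be the UTC calendar date, the throttle store is an in-memory dict, the action's `throttle` is already a `timedelta`, an omitted `digest` is treated as `instant`, and the `Deferred` and `Sent` statuses are hypothetical (only `Dropped` and `Throttled` appear in this document).

```python
import hashlib
from datetime import datetime, timedelta, timezone

def idempotency_key(rule_id, action_id, event, now):
    """Join the documented fields with '|' and hash them (SHA-256 is an assumption)."""
    day_bucket = now.astimezone(timezone.utc).strftime("%Y-%m-%d")  # assumed bucket format
    parts = [rule_id, action_id, event["kind"],
             event["scope"]["digest"], event["delta"]["hash"], day_bucket]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()

class ThrottleStore:
    """In-memory stand-in for whatever store tracks throttle expiry."""

    def __init__(self):
        self._expiry = {}

    def hit(self, key, ttl, now):
        """Return True if the key is still inside its window; otherwise record it."""
        expires = self._expiry.get(key)
        if expires is not None and now < expires:
            return True
        self._expiry[key] = now + ttl
        return False

def evaluate_action(action, channel, event, rule_id, store, now):
    # Step 1: a missing or disabled channel drops the delivery.
    if channel is None or not channel.get("enabled", False):
        return "Dropped"
    # Step 2: a repeated idempotency key inside the throttle window is throttled.
    ttl = action.get("throttle")
    if ttl is not None:
        key = idempotency_key(rule_id, action["actionId"], event, now)
        if store.hit(key, ttl, now):
            return "Throttled"
    # Step 3: non-instant digest windows defer delivery until the window flushes.
    if action.get("digest", "instant") != "instant":
        return "Deferred"  # hypothetical status name
    # Step 4: rendering and the connector call would happen here.
    return "Sent"  # hypothetical status name
```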
---
## 5. Example rule payload
```json
{
  "ruleId": "rule-critical-soc",
  "tenantId": "tenant-dev",
  "name": "Critical scanner verdicts",
  "description": "Route KEV-tagged critical findings to SOC Slack with zero delay.",
  "enabled": true,
  "match": {
    "eventKinds": ["scanner.report.ready"],
    "labels": ["kev", "critical"],
    "minSeverity": "critical",
    "verdicts": ["fail", "block"],
    "kevOnly": true
  },
  "actions": [
    {
      "actionId": "act-slack-critical",
      "channel": "chn-slack-soc",
      "template": "tmpl-critical",
      "digest": "instant",
      "throttle": "PT300S",
      "locale": "en-us",
      "metadata": {
        "priority": "p1"
      }
    }
  ],
  "labels": {
    "owner": "soc"
  },
  "metadata": {
    "revision": "12"
  }
}
```
Dry-run calls (`POST /rules/{id}/test`) accept the same structure along with a sample Notify event payload to exercise match logic without invoking connectors.
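
Before submitting a payload like the one above, a client could run a few local checks derived from the schema tables. The Python sketch below is a hypothetical client-side helper, not part of the Notify WebService: it checks the tenant match, the "minimum one action" requirement, `actionId` uniqueness (which server-side deduplication would otherwise collapse silently), and the documented digest window keys.

```python
DIGEST_KEYS = {"instant", "5m", "15m", "1h", "1d"}

def validate_rule(rule: dict, tenant_header: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if rule.get("tenantId") != tenant_header:
        problems.append("tenantId does not match the tenant header")

    actions = rule.get("actions") or []
    if not actions:
        problems.append("at least one action is required")

    seen = set()
    for action in actions:
        action_id = action.get("actionId")
        if action_id in seen:
            problems.append(f"duplicate actionId: {action_id}")
        seen.add(action_id)
        digest = action.get("digest")
        if digest is not None and digest not in DIGEST_KEYS:
            problems.append(f"unknown digest window: {digest}")
    return problems

# Trimmed version of the example payload above.
example = {
    "ruleId": "rule-critical-soc",
    "tenantId": "tenant-dev",
    "actions": [{"actionId": "act-slack-critical", "channel": "chn-slack-soc", "digest": "instant"}],
}
problems = validate_rule(example, tenant_header="tenant-dev")
```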
---
## 6. Operational guidance
- Narrow rule scopes by namespace or repository before relying on severity gates; this minimises noise and improves digest summarisation.
- Always configure a throttle window for instant actions to protect against repeated upstream retries.
- Use rule labels to organise dashboards and access control (e.g., `owner:soc`, `env:prod`).
- Prefer tenant-specific rule IDs so Offline Kit exports remain deterministic across environments.
- If a rule depends on derived metadata (e.g., policy verdict tags), list those dependencies in the rule description for audit readiness.