docs consolidation work
@@ -50,3 +50,34 @@ Status for these items is tracked in `src/Notifier/StellaOps.Notifier/TASKS.md`

## Epic alignment
- **Epic 11 – Notifications Studio:** notifications workspace, preview tooling, immutable delivery ledger, throttling/digest controls, and forthcoming correlation/simulation features.

## Implementation Status

### Delivery Phases
- **Phase 1 – Core rules engine & delivery ledger:** Implement rules/channels schema, event ingestion, rule evaluation, idempotent deliveries, audit logging
- **Phase 2 – Connectors & rendering:** Ship Slack/Teams/Email/Webhook connectors, template rendering, localization, throttling, retries, secret referencing
- **Phase 3 – Console & CLI authoring:** Provide UI/CLI for rule authoring, previews, channel health, delivery browsing, digests, test sends
- **Phase 4 – Governance & observability:** Add approvals, RBAC, tenant quotas, metrics/logs/traces, dashboards, alerts, runbooks
- **Phase 5 – Offline & compliance:** Produce Offline Kit bundles (rules/channels/deploy scripts), signed exports, retention policies, auditing

### Acceptance Criteria
- Rules evaluate deterministically per event; deliveries idempotent with audit trail and DSSE signatures
- Channel connectors support retries, rate limits, health checks, previews; secrets referenced securely
- Console/CLI support rule creation, testing, digests, delivery browsing, export/import workflows
- Observability dashboards track delivery health; alerts fire for sustained failures or backlog; runbooks cover remediation
- Offline Kit bundle contains configs, rules, digests, deployment scripts for air-gapped installs
- Notify respects tenancy and RBAC; governance (approvals, change log) enforced for high-impact rules

### Key Risks & Mitigations
- **Notification storms:** Throttling, digests, dedupe windows, preview/test gating
- **Secret compromise:** Secret references only, rotation workflows, audit logging
- **Connector API changes:** Versioned adapter layer, nightly health checks, fallback channels
- **Noise vs signal:** Simulation previews, metrics, rule scoring, recommended defaults
- **Offline parity:** Export/import of rules, connectors, digests with signed manifests

### Current Phase Progress
- Phase 1: Core rules engine mostly complete; template dispatch/rendering in progress
- Phase 2: Connector and rendering work not yet started; depends on Phase 1 completion
- Phase 3: Console/CLI authoring work not started; depends on Phase 2 completion
- Phase 4: Core observability complete; governance and risk notifications blocked on upstream dependencies
- Phase 5: Offline basics complete; tenancy work blocked on upstream Sprint 0172

@@ -1,160 +0,0 @@
# Implementation plan — Notify

## Delivery phases
- **Phase 1 – Core rules engine & delivery ledger**
  Implement rules/channels schema, event ingestion, rule evaluation, idempotent deliveries, and audit logging.
- **Phase 2 – Connectors & rendering**
  Ship Slack/Teams/Email/Webhook connectors, template rendering, localization, throttling, retries, and secret referencing.
- **Phase 3 – Console & CLI authoring**
  Provide UI/CLI for rule authoring, previews, channel health, delivery browsing, digests, and test sends.
- **Phase 4 – Governance & observability**
  Add approvals, RBAC, tenant quotas, Notify metrics/logs/traces, dashboards, Notify-specific alerts, and Notify runbooks.
- **Phase 5 – Offline & compliance**
  Produce Offline Kit bundles (rules/channels/deploy scripts), signed exports, retention policies, and auditing for regulated environments.

## Work breakdown
- **Service & worker**
  - REST API for rules/channels/delivery history, idempotency middleware, digest scheduler.
  - Worker pipelines for event intake, rule matching, template rendering, delivery execution, retries, and throttling.
  - Delivery ledger capturing payload metadata, response, retry state, DSSE signatures.
- **Connectors**
  - Slack/Teams/Email/Webhook plug-ins with configuration validation, rate limiting, error classification.
  - Secrets referenced via Authority/Secret store; no plaintext storage.
- **Console & CLI**
  - Console module for rules builder, condition editor, preview, test send, delivery insights, digests and schedule configuration.
  - CLI (`stella notify rule|channel|delivery`) for automation, export/import.
- **Integrations**
  - Event sources: Concelier, Excititor, Policy Engine, Vuln Explorer, Export Center, Attestor, Zastava, Scheduler.
  - Notify events to Notify (meta) for failure escalations, accepted-risk expiration reminders.
- **Observability & ops**
  - Metrics: delivery success/failure, retry counts, throttle hits, digest generation, channel health.
  - Logs/traces with tenant, rule ID, channel, correlation ID; dashboards and alerts.
  - Runbooks for misconfigured channels, throttling, event backlog, incident digest.
- **Docs & compliance**
  - Update Notifications Studio guides, channel runbooks, security/RBAC docs, Offline Kit instructions.
  - Provide compliance checklist (audit logging, retention, opt-out).

## Acceptance criteria
- Rules evaluate deterministically per event; deliveries idempotent with audit trail and DSSE signatures.
- Channel connectors support retries, rate limits, health checks, previews; secrets referenced securely.
- Console/CLI support rule creation, testing, digests, delivery browsing, and export/import workflows.
- Observability dashboards track delivery health; alerts fire for sustained failures or backlog; runbooks cover remediation.
- Offline Kit bundle contains configs, rules, digests, and deployment scripts for air-gapped installs.
- Notify respects tenancy and RBAC; governance (approvals, change log) enforced for high-impact rules.

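The first criterion (deterministic evaluation, idempotent deliveries) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Rule` fields, the severity predicate, the in-memory `Ledger`, and the choice of tenant, event ID, rule ID, and channel as the idempotency key are hypothetical and are not taken from the Notify codebase.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    rule_id: str
    min_severity: int  # hypothetical predicate: match events at or above this level
    channel: str


@dataclass
class Ledger:
    """In-memory stand-in for the delivery ledger (hypothetical)."""
    entries: dict = field(default_factory=dict)

    def record_once(self, key: str, entry: dict) -> bool:
        """Store the entry unless the key already exists; return True if stored."""
        if key in self.entries:
            return False
        self.entries[key] = entry
        return True


def delivery_key(tenant: str, event_id: str, rule_id: str, channel: str) -> str:
    """Derive a stable idempotency key from the fields that identify a delivery."""
    canonical = json.dumps([tenant, event_id, rule_id, channel], separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def evaluate(event: dict, rules: list[Rule]) -> list[Rule]:
    """Return matching rules in a fixed order so repeated runs agree."""
    matched = [r for r in rules if event["severity"] >= r.min_severity]
    return sorted(matched, key=lambda r: r.rule_id)


def process(event: dict, rules: list[Rule], ledger: Ledger) -> list[str]:
    """Record one ledger entry per (event, rule, channel); skip duplicates."""
    delivered = []
    for rule in evaluate(event, rules):
        key = delivery_key(event["tenant"], event["id"], rule.rule_id, rule.channel)
        if ledger.record_once(key, {"rule": rule.rule_id, "channel": rule.channel}):
            delivered.append(rule.rule_id)
    return delivered
```

Replaying the same event through `process` records nothing new, which is the property the idempotency criterion asks for. A real ledger would also need the key check and the write to be atomic in the backing store.
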
## Risks & mitigations
- **Notification storms:** throttling, digests, dedupe windows, preview/test gating.
- **Secret compromise:** secret references only, rotation workflows, audit logging.
- **Connector API changes:** versioned adapter layer, nightly health checks, fallback channels.
- **Noise vs signal:** simulation previews, metrics, rule scoring, recommended defaults.
- **Offline parity:** export/import of rules, connectors, and digests with signed manifests.

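As a rough illustration of the dedupe-window mitigation for notification storms, the sketch below suppresses repeats of the same key for a fixed period after the last send. The class name, the key format, and the choice to pass the clock in explicitly are assumptions for this example, not the Notify design.

```python
class DedupeWindow:
    """Suppress repeats of a key within `window_seconds` of its last send (hypothetical helper)."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._last_sent: dict[str, float] = {}

    def allow(self, key: str, now: float) -> bool:
        """Return True if a notification for `key` may be sent at time `now`."""
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # still inside the window: suppress
        self._last_sent[key] = now
        return True
```

Taking `now` as a parameter keeps the helper deterministic under test. Suppressed events could be fed into a digest instead of being dropped, which ties this mitigation to the digest controls listed alongside it.
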
## Test strategy
- **Unit:** rule evaluation, template rendering, connector clients, throttling, digests.
- **Integration:** end-to-end events from core services, multi-channel routing, retries, audit logging.
- **Performance:** burst throttling, digest creation, large rule sets.
- **Security:** RBAC tests, tenant isolation, secret reference validation, DSSE signature verification.
- **Offline:** export/import round-trips, Offline Kit deployment, manual delivery replay.

## Definition of done
- Notify service, workers, connectors, Console/CLI, observability, and Offline Kit assets shipped with documentation and runbooks.
- Compliance checklist appended to docs; ./TASKS.md and ../../TASKS.md updated with progress.

## Sprint alignment (2025-11-30)
- Docs sprint: `docs/implplan/SPRINT_322_docs_modules_notify.md`; statuses mirrored in `docs/modules/notify/TASKS.md`.
- Observability evidence stub: `operations/observability.md` and `operations/dashboards/notify-observability.json` (to be populated after next demo outputs).
- NOTIFY-DOCS-0002 remains blocked pending NOTIFY-SVC-39-001..004 (correlation/digests/simulation/quiet hours); keep sprint/TASKS synced when those land.

---

## Sprint readiness tracker

> Last updated: 2025-11-27 (NOTIFY-ENG-0001)

This section maps delivery phases to implementation sprints and tracks readiness checkpoints.

### Phase 1 — Core rules engine & delivery ledger
| Task ID | Status | Sprint | Notes |
|---------|--------|--------|-------|
| NOTIFY-SVC-37-001 | ✅ DONE (2025-11-24) | SPRINT_0172_0001_0002_notifier_ii | Pack approval contract published (OpenAPI schema, payloads). |
| NOTIFY-SVC-37-002 | ✅ DONE (2025-11-24) | SPRINT_0172_0001_0002_notifier_ii | Ingestion endpoint with Mongo persistence, idempotent writes, audit trail. |
| NOTIFY-SVC-37-003 | 🔄 DOING | SPRINT_0172_0001_0002_notifier_ii | Approval/policy templates, routing predicates; dispatch/rendering pending. |
| NOTIFY-SVC-37-004 | ✅ DONE (2025-11-24) | SPRINT_0172_0001_0002_notifier_ii | Acknowledgement API, test harness, metrics. |
| NOTIFY-OAS-61-001 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | OAS with rules/templates/incidents/quiet hours endpoints. |
| NOTIFY-OAS-61-002 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | `/.well-known/openapi` discovery endpoint. |
| NOTIFY-OAS-62-001 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | SDK examples for rule CRUD. |
| NOTIFY-OAS-63-001 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | Deprecation headers and templates. |

**Checkpoint:** Core rules engine mostly complete; template dispatch/rendering in progress.

### Phase 2 — Connectors & rendering
| Task ID | Status | Sprint | Notes |
|---------|--------|--------|-------|
| NOTIFY-SVC-38-002 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Channel adapters (email, chat webhook, generic webhook) with retry policies. |
| NOTIFY-SVC-38-003 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Template service, renderer with redaction and localization. |
| NOTIFY-SVC-38-004 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | REST + WS APIs for rules CRUD, templates preview, incidents. |
| NOTIFY-DOC-70-001 | ✅ DONE (2025-11-02) | SPRINT_0171_0001_0001_notifier_i | Architecture docs for `src/Notify` vs `src/Notifier` split. |

**Checkpoint:** Connector and rendering work not yet started; depends on Phase 1 completion.

### Phase 3 — Console & CLI authoring
| Task ID | Status | Sprint | Notes |
|---------|--------|--------|-------|
| NOTIFY-SVC-39-001 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Correlation engine with throttler, quiet hours, incident lifecycle. |
| NOTIFY-SVC-39-002 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Digest generator with schedule runner. |
| NOTIFY-SVC-39-003 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Simulation engine for dry-run rules against historical events. |
| NOTIFY-SVC-39-004 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Quiet hour calendars with audit logging. |

**Checkpoint:** Console/CLI authoring work not started; depends on Phase 2 completion.

### Phase 4 — Governance & observability
| Task ID | Status | Sprint | Notes |
|---------|--------|--------|-------|
| NOTIFY-SVC-40-001 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Escalations, on-call schedules, PagerDuty/OpsGenie adapters. |
| NOTIFY-SVC-40-002 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Summary storm breaker, localization bundles. |
| NOTIFY-SVC-40-003 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Security hardening (signed ack links, webhook HMAC). |
| NOTIFY-SVC-40-004 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Observability metrics/traces, dead-letter handling, chaos tests. |
| NOTIFY-OBS-51-001 | ✅ DONE (2025-11-22) | SPRINT_0171_0001_0001_notifier_i | SLO evaluator webhooks with templates/routing/suppression. |
| NOTIFY-OBS-55-001 | ✅ DONE (2025-11-22) | SPRINT_0171_0001_0001_notifier_i | Incident mode templates with evidence/trace/retention context. |
| NOTIFY-ATTEST-74-001 | ✅ DONE (2025-11-16) | SPRINT_0171_0001_0001_notifier_i | Templates for verification failures, key revocations, transparency. |
| NOTIFY-ATTEST-74-002 | 📝 TODO | SPRINT_0171_0001_0001_notifier_i | Wire notifications to key rotation/revocation events. |
| NOTIFY-RISK-66-001 | ⏳ BLOCKED | SPRINT_0171_0001_0001_notifier_i | Risk severity escalation triggers; needs POLICY-RISK-40-002. |
| NOTIFY-RISK-67-001 | ⏳ BLOCKED | SPRINT_0171_0001_0001_notifier_i | Risk profile publish/deprecate notifications. |
| NOTIFY-RISK-68-001 | ⏳ BLOCKED | SPRINT_0171_0001_0001_notifier_i | Per-profile routing, quiet hours, dedupe. |

**Checkpoint:** Core observability complete; governance and risk notifications blocked on upstream dependencies.

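NOTIFY-SVC-40-003 lists webhook HMAC as a hardening item. The sketch below shows the general technique using Python's standard `hmac` module. The function names, the choice of SHA-256, and the hex encoding are assumptions for illustration; the tracker does not specify the scheme Notify will adopt.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_payload(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, signature)
```

The receiver should verify against the exact bytes it received, before any JSON parsing or re-serialization, since a change in whitespace alone produces a different signature.
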
### Phase 5 — Offline & compliance
| Task ID | Status | Sprint | Notes |
|---------|--------|--------|-------|
| NOTIFY-AIRGAP-56-002 | ✅ DONE | SPRINT_0171_0001_0001_notifier_i | Bootstrap Pack with deterministic secrets and offline validation. |
| NOTIFY-TEN-48-001 | ⏳ BLOCKED | SPRINT_0173_0001_0003_notifier_iii | Tenant-scope rules/templates; needs Sprint 0172 tenancy model. |

**Checkpoint:** Offline basics complete; tenancy work blocked on upstream Sprint 0172.

---

### Overall readiness summary

| Phase | Status | Blocking items |
|-------|--------|----------------|
| **1 – Core rules engine** | 🔄 In progress | NOTIFY-SVC-37-003 dispatch/rendering |
| **2 – Connectors & rendering** | 📝 Not started | Phase 1 completion |
| **3 – Console & CLI** | 📝 Not started | Phase 2 completion |
| **4 – Governance & observability** | 🔄 Partial | POLICY-RISK-40-002 for risk notifications |
| **5 – Offline & compliance** | 🔄 Partial | Sprint 0172 tenancy model |

### Cross-module dependencies

| Dependency | Required by | Status |
|------------|-------------|--------|
| Attestor payload localization | NOTIFY-ATTEST-74-002 | Freeze pending |
| POLICY-RISK-40-002 export | NOTIFY-RISK-66/67/68 | BLOCKED |
| Sprint 0172 tenancy model | NOTIFY-TEN-48-001 | In progress |
| Telemetry SLO webhook schema | NOTIFY-OBS-51-001 | ✅ Published (`docs/modules/notify/slo-webhook-schema.md`) |

### Next actions
1. Complete NOTIFY-SVC-37-003 dispatch/rendering wiring (Sprint 0172).
2. Start NOTIFY-SVC-38-002 channel adapters once Phase 1 closes.
3. Track POLICY-RISK-40-002 to unblock risk notification tasks.
4. Monitor Sprint 0172 tenancy model for NOTIFY-TEN-48-001.