Implementation plan — Notify
Delivery phases
- Phase 1 – Core rules engine & delivery ledger
  Implement rules/channels schema, event ingestion, rule evaluation, idempotent deliveries, and audit logging.
- Phase 2 – Connectors & rendering
  Ship Slack/Teams/Email/Webhook connectors, template rendering, localization, throttling, retries, and secret referencing.
- Phase 3 – Console & CLI authoring
  Provide UI/CLI for rule authoring, previews, channel health, delivery browsing, digests, and test sends.
- Phase 4 – Governance & observability
  Add approvals, RBAC, tenant quotas, Notify metrics/logs/traces, dashboards, Notify-specific alerts, and Notify runbooks.
- Phase 5 – Offline & compliance
  Produce Offline Kit bundles (rules/channels/deploy scripts), signed exports, retention policies, and auditing for regulated environments.
Work breakdown
- Service & worker
  - REST API for rules/channels/delivery history, idempotency middleware, digest scheduler.
  - Worker pipelines for event intake, rule matching, template rendering, delivery execution, retries, and throttling.
  - Delivery ledger capturing payload metadata, response, retry state, DSSE signatures.
- Connectors
  - Slack/Teams/Email/Webhook plug-ins with configuration validation, rate limiting, error classification.
  - Secrets referenced via Authority/Secret store; no plaintext storage.
- Console & CLI
  - Console module for rules builder, condition editor, preview, test send, delivery insights, digests and schedule configuration.
  - CLI (`stella notify rule|channel|delivery`) for automation, export/import.
- Integrations
  - Event sources: Concelier, Excititor, Policy Engine, Vuln Explorer, Export Center, Attestor, Zastava, Scheduler.
  - Notify events to Notify (meta) for failure escalations, accepted-risk expiration reminders.
- Observability & ops
  - Metrics: delivery success/failure, retry counts, throttle hits, digest generation, channel health.
  - Logs/traces with tenant, rule ID, channel, correlation ID; dashboards and alerts.
  - Runbooks for misconfigured channels, throttling, event backlog, incident digest.
- Docs & compliance
  - Update Notifications Studio guides, channel runbooks, security/RBAC docs, Offline Kit instructions.
  - Provide compliance checklist (audit logging, retention, opt-out).
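
The idempotency middleware and delivery ledger items above could be approached roughly as follows. This is a minimal sketch, not the Notify implementation: the `delivery_key` function, the `InMemoryLedger` class, the choice of key fields, and the entry layout are all hypothetical and only illustrate how a stable key lets a worker record each delivery at most once.

```python
import hashlib
from dataclasses import dataclass, field


def delivery_key(tenant: str, rule_id: str, channel: str, event_id: str) -> str:
    """Derive a stable idempotency key from the fields that identify one delivery."""
    # Hypothetical key layout. The unit separator keeps ("ab", "c") and
    # ("a", "bc") from hashing to the same value.
    material = "\x1f".join([tenant, rule_id, channel, event_id])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


@dataclass
class InMemoryLedger:
    """Toy stand-in for the delivery ledger; a real one would be a database."""

    entries: dict = field(default_factory=dict)

    def record_once(self, key: str, payload_meta: dict) -> bool:
        """Return True if the delivery is new, False if it was already recorded."""
        if key in self.entries:
            return False
        self.entries[key] = {"meta": payload_meta, "attempts": 0, "state": "pending"}
        return True
```

A worker would compute the key before dispatch and skip the send when `record_once` returns `False`, so a redelivered event does not produce a second notification.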
Acceptance criteria
- Rules evaluate deterministically per event; deliveries idempotent with audit trail and DSSE signatures.
- Channel connectors support retries, rate limits, health checks, previews; secrets referenced securely.
- Console/CLI support rule creation, testing, digests, delivery browsing, and export/import workflows.
- Observability dashboards track delivery health; alerts fire for sustained failures or backlog; runbooks cover remediation.
- Offline Kit bundle contains configs, rules, digests, and deployment scripts for air-gapped installs.
- Notify respects tenancy and RBAC; governance (approvals, change log) enforced for high-impact rules.
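
One way to read the "rules evaluate deterministically per event" criterion is that the same rule set and event always yield the same matches in the same order, regardless of how the rules were loaded. The sketch below illustrates that property only; the `Rule` shape and the `evaluate` function are assumptions made for the example, not the Notify rule schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: an ID, a target channel, and a pure predicate."""

    rule_id: str
    channel: str
    predicate: Callable[[dict], bool]


def evaluate(rules: list[Rule], event: dict) -> list[tuple[str, str]]:
    """Return (rule_id, channel) pairs for matching rules in a stable order."""
    matches = [(r.rule_id, r.channel) for r in rules if r.predicate(event)]
    # Sorting removes any dependence on the order the rules were loaded in.
    return sorted(matches)
```

Determinism here also relies on predicates being pure functions of the event, with no clock reads or external lookups inside them.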
Risks & mitigations
- Notification storms: throttling, digests, dedupe windows, preview/test gating.
- Secret compromise: secret references only, rotation workflows, audit logging.
- Connector API changes: versioned adapter layer, nightly health checks, fallback channels.
- Noise vs signal: simulation previews, metrics, rule scoring, recommended defaults.
- Offline parity: export/import of rules, connectors, and digests with signed manifests.
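
The dedupe-window mitigation for notification storms can be sketched as a small suppression table keyed by whatever identifies a "repeat". The `DedupeWindow` class and its parameters are hypothetical, and the caller passes the current time explicitly so that the behaviour stays reproducible in tests.

```python
class DedupeWindow:
    """Suppress repeats of the same key within `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._last_sent: dict[str, float] = {}

    def should_send(self, key: str, now: float) -> bool:
        """Return True and start a new window, or False if still inside one."""
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            # Suppressed sends do not extend the window.
            return False
        self._last_sent[key] = now
        return True
```

Suppressed notifications would typically be counted (the plan lists throttle hits as a metric) or folded into a digest rather than dropped silently.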
Test strategy
- Unit: rule evaluation, template rendering, connector clients, throttling, digests.
- Integration: end-to-end events from core services, multi-channel routing, retries, audit logging.
- Performance: burst throttling, digest creation, large rule sets.
- Security: RBAC tests, tenant isolation, secret reference validation, DSSE signature verification.
- Offline: export/import round-trips, Offline Kit deployment, manual delivery replay.
Definition of done
- Notify service, workers, connectors, Console/CLI, observability, and Offline Kit assets shipped with documentation and runbooks.
- Compliance checklist appended to docs; ./TASKS.md and ../../TASKS.md updated with progress.
Sprint alignment (2025-11-30)
- Docs sprint: `docs/implplan/SPRINT_322_docs_modules_notify.md`; statuses mirrored in `docs/modules/notify/TASKS.md`.
- Observability evidence stub: `operations/observability.md` and `operations/dashboards/notify-observability.json` (to be populated after next demo outputs).
- NOTIFY-DOCS-0002 remains blocked pending NOTIFY-SVC-39-001..004 (correlation/digests/simulation/quiet hours); keep sprint/TASKS synced when those land.
Sprint readiness tracker
Last updated: 2025-11-27 (NOTIFY-ENG-0001)
This section maps delivery phases to implementation sprints and tracks readiness checkpoints.
Phase 1 — Core rules engine & delivery ledger
| Task ID | Status | Sprint | Notes |
|---|---|---|---|
| NOTIFY-SVC-37-001 | ✅ DONE (2025-11-24) | SPRINT_0172_0001_0002_notifier_ii | Pack approval contract published (OpenAPI schema, payloads). |
| NOTIFY-SVC-37-002 | ✅ DONE (2025-11-24) | SPRINT_0172_0001_0002_notifier_ii | Ingestion endpoint with Mongo persistence, idempotent writes, audit trail. |
| NOTIFY-SVC-37-003 | 🔄 DOING | SPRINT_0172_0001_0002_notifier_ii | Approval/policy templates, routing predicates; dispatch/rendering pending. |
| NOTIFY-SVC-37-004 | ✅ DONE (2025-11-24) | SPRINT_0172_0001_0002_notifier_ii | Acknowledgement API, test harness, metrics. |
| NOTIFY-OAS-61-001 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | OAS with rules/templates/incidents/quiet hours endpoints. |
| NOTIFY-OAS-61-002 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | /.well-known/openapi discovery endpoint. |
| NOTIFY-OAS-62-001 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | SDK examples for rule CRUD. |
| NOTIFY-OAS-63-001 | ✅ DONE (2025-11-17) | SPRINT_0171_0001_0001_notifier_i | Deprecation headers and templates. |
Checkpoint: Core rules engine mostly complete; template dispatch/rendering in progress.
Phase 2 — Connectors & rendering
| Task ID | Status | Sprint | Notes |
|---|---|---|---|
| NOTIFY-SVC-38-002 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Channel adapters (email, chat webhook, generic webhook) with retry policies. |
| NOTIFY-SVC-38-003 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Template service, renderer with redaction and localization. |
| NOTIFY-SVC-38-004 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | REST + WS APIs for rules CRUD, templates preview, incidents. |
| NOTIFY-DOC-70-001 | ✅ DONE (2025-11-02) | SPRINT_0171_0001_0001_notifier_i | Architecture docs for src/Notify vs src/Notifier split. |
Checkpoint: Connector and rendering work not yet started; depends on Phase 1 completion.
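
For the channel adapter retry policies in NOTIFY-SVC-38-002, a capped exponential backoff combined with a simple error classification is one common shape. The sketch below is an assumption about what such a policy might look like; the function names, default values, and the decision to treat HTTP 429 and 5xx as retryable are illustrative and do not come from the Notify design.

```python
def backoff_schedule(
    max_attempts: int,
    base_seconds: float = 1.0,
    factor: float = 2.0,
    cap_seconds: float = 60.0,
) -> list[float]:
    """Delays to wait before each retry; the first attempt is immediate."""
    return [
        min(cap_seconds, base_seconds * factor**i)
        for i in range(max_attempts - 1)
    ]


def is_retryable(status_code: int) -> bool:
    """Assumed classification: 429 and 5xx are transient, other codes are not."""
    return status_code == 429 or 500 <= status_code < 600
```

A production policy would usually add random jitter to the delays so that many failing deliveries do not retry in lockstep.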
Phase 3 — Console & CLI authoring
| Task ID | Status | Sprint | Notes |
|---|---|---|---|
| NOTIFY-SVC-39-001 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Correlation engine with throttler, quiet hours, incident lifecycle. |
| NOTIFY-SVC-39-002 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Digest generator with schedule runner. |
| NOTIFY-SVC-39-003 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Simulation engine for dry-run rules against historical events. |
| NOTIFY-SVC-39-004 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Quiet hour calendars with audit logging. |
Checkpoint: Console/CLI authoring work not started; depends on Phase 2 completion.
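
The quiet-hours tasks (NOTIFY-SVC-39-001 and NOTIFY-SVC-39-004) need a window check that handles ranges crossing midnight, which is easy to get wrong. A minimal sketch follows; the `in_quiet_hours` function is hypothetical, works on local wall-clock times only, and ignores calendars, time zones, and per-tenant overrides.

```python
from datetime import time


def in_quiet_hours(now: time, start: time, end: time) -> bool:
    """True when `now` falls in [start, end), including windows crossing midnight."""
    if start == end:
        # Treated as an empty window in this sketch.
        return False
    if start < end:
        # Same-day window, e.g. 12:00 to 14:00.
        return start <= now < end
    # Overnight window, e.g. 22:00 to 06:00.
    return now >= start or now < end
```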
Phase 4 — Governance & observability
| Task ID | Status | Sprint | Notes |
|---|---|---|---|
| NOTIFY-SVC-40-001 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Escalations, on-call schedules, PagerDuty/OpsGenie adapters. |
| NOTIFY-SVC-40-002 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Summary storm breaker, localization bundles. |
| NOTIFY-SVC-40-003 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Security hardening (signed ack links, webhook HMAC). |
| NOTIFY-SVC-40-004 | 📝 TODO | SPRINT_0172_0001_0002_notifier_ii | Observability metrics/traces, dead-letter handling, chaos tests. |
| NOTIFY-OBS-51-001 | ✅ DONE (2025-11-22) | SPRINT_0171_0001_0001_notifier_i | SLO evaluator webhooks with templates/routing/suppression. |
| NOTIFY-OBS-55-001 | ✅ DONE (2025-11-22) | SPRINT_0171_0001_0001_notifier_i | Incident mode templates with evidence/trace/retention context. |
| NOTIFY-ATTEST-74-001 | ✅ DONE (2025-11-16) | SPRINT_0171_0001_0001_notifier_i | Templates for verification failures, key revocations, transparency. |
| NOTIFY-ATTEST-74-002 | 📝 TODO | SPRINT_0171_0001_0001_notifier_i | Wire notifications to key rotation/revocation events. |
| NOTIFY-RISK-66-001 | ⏳ BLOCKED | SPRINT_0171_0001_0001_notifier_i | Risk severity escalation triggers; needs POLICY-RISK-40-002. |
| NOTIFY-RISK-67-001 | ⏳ BLOCKED | SPRINT_0171_0001_0001_notifier_i | Risk profile publish/deprecate notifications. |
| NOTIFY-RISK-68-001 | ⏳ BLOCKED | SPRINT_0171_0001_0001_notifier_i | Per-profile routing, quiet hours, dedupe. |
Checkpoint: Core observability complete; governance and risk notifications blocked on upstream dependencies.
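
The webhook HMAC item under NOTIFY-SVC-40-003 could be approached along these lines. This is a sketch under assumptions: HMAC-SHA256 over the raw request body with a hex-encoded signature is a common convention, but the plan does not specify the algorithm, the header name, or the encoding, and the function names here are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_body(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected, signature_hex)
```

The receiver should verify against the exact bytes it received, before any JSON parsing or re-serialization, since even whitespace changes alter the signature.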
Phase 5 — Offline & compliance
| Task ID | Status | Sprint | Notes |
|---|---|---|---|
| NOTIFY-AIRGAP-56-002 | ✅ DONE | SPRINT_0171_0001_0001_notifier_i | Bootstrap Pack with deterministic secrets and offline validation. |
| NOTIFY-TEN-48-001 | ⏳ BLOCKED | SPRINT_0173_0001_0003_notifier_iii | Tenant-scope rules/templates; needs Sprint 0172 tenancy model. |
Checkpoint: Offline basics complete; tenancy work blocked on upstream Sprint 0172.
Overall readiness summary
| Phase | Status | Blocking items |
|---|---|---|
| 1 – Core rules engine | 🔄 In progress | NOTIFY-SVC-37-003 dispatch/rendering |
| 2 – Connectors & rendering | 📝 Not started | Phase 1 completion |
| 3 – Console & CLI | 📝 Not started | Phase 2 completion |
| 4 – Governance & observability | 🔄 Partial | POLICY-RISK-40-002 for risk notifications |
| 5 – Offline & compliance | 🔄 Partial | Sprint 0172 tenancy model |
Cross-module dependencies
| Dependency | Required by | Status |
|---|---|---|
| Attestor payload localization | NOTIFY-ATTEST-74-002 | Freeze pending |
| POLICY-RISK-40-002 export | NOTIFY-RISK-66/67/68 | BLOCKED |
| Sprint 0172 tenancy model | NOTIFY-TEN-48-001 | In progress |
| Telemetry SLO webhook schema | NOTIFY-OBS-51-001 | ✅ Published (docs/notifications/slo-webhook-schema.md) |
Next actions
- Complete NOTIFY-SVC-37-003 dispatch/rendering wiring (Sprint 0172).
- Start NOTIFY-SVC-38-002 channel adapters once Phase 1 closes.
- Track POLICY-RISK-40-002 to unblock risk notification tasks.
- Monitor Sprint 0172 tenancy model for NOTIFY-TEN-48-001.