Add tenant isolation smoke test for telemetry stack

This commit adds `tenant_isolation_smoke.py`, a smoke test that validates tenant isolation in the telemetry storage stack (Tempo + Loki) with mutual TLS enabled. The script pushes one trace and one log entry under a specific tenant header, then asserts that each item is readable only by the tenant that wrote it, confirming that multi-tenancy is enforced.
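
The shape of that check can be sketched as follows. This is not the contents of `tenant_isolation_smoke.py`: `FakeTenantStore` and `assert_isolated` are hypothetical names, and an in-memory store stands in for Tempo and Loki so the example runs without a network or client certificates. The one detail taken from the real systems is the `X-Scope-OrgID` header, which both Loki and Tempo use to identify the tenant.

```python
from typing import Dict

# Header that Loki and Tempo read to determine the tenant of a request.
TENANT_HEADER = "X-Scope-OrgID"


def tenant_headers(tenant_id: str) -> Dict[str, str]:
    """Build the HTTP headers that scope a request to one tenant."""
    return {TENANT_HEADER: tenant_id}


class FakeTenantStore:
    """Hypothetical in-memory stand-in for a tenant-aware backend."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, dict]] = {}

    def push(self, headers: Dict[str, str], item_id: str, payload: dict) -> None:
        # Store the item under the tenant named in the request headers.
        self._items.setdefault(headers[TENANT_HEADER], {})[item_id] = payload

    def read(self, headers: Dict[str, str], item_id: str) -> bool:
        # An item is visible only to the tenant that pushed it.
        return item_id in self._items.get(headers[TENANT_HEADER], {})


def assert_isolated(store, owner: str, other: str, item_id: str) -> None:
    """Fail unless the owner can read the item and the other tenant cannot."""
    assert store.read(tenant_headers(owner), item_id), "owner cannot read its own item"
    assert not store.read(tenant_headers(other), item_id), "item leaked to another tenant"


if __name__ == "__main__":
    store = FakeTenantStore()
    store.push(tenant_headers("tenant-a"), "trace-1", {"span": "smoke"})
    assert_isolated(store, owner="tenant-a", other="tenant-b", item_id="trace-1")
```

In a real run, `push` and `read` would presumably be HTTP calls over mutual TLS to the Tempo and Loki endpoints, with the same two assertions applied to the responses.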
This commit is contained in:
master
2025-11-05 15:09:54 +02:00
parent 90c244948a
commit c1acd04249
20 changed files with 890 additions and 574 deletions

@@ -1,35 +1,46 @@
# StellaOps Notify
Notify evaluates operator-defined rules against platform events and dispatches channel-specific payloads with full auditability.
## Responsibilities
- Process event streams and apply tenant-scoped routing rules.
- Render connector-specific payloads (email, Slack, Teams, webhook, custom).
- Enforce throttling, digests, and delivery retries.
- Surface delivery/audit data for UI and CLI consumers.
## Key components
- `StellaOps.Notify.WebService` (rules API + preview).
- `StellaOps.Notify.Worker` (delivery engine).
- Connector libraries under `StellaOps.Notify.Connectors.*`.
## Integrations & dependencies
- MongoDB for rule/channel storage.
- Redis/NATS for delivery queues.
- CLI/UI for authoring and monitoring notifications.
## Operational notes
- Schema fixtures in ./resources/schemas & ./resources/samples.
- Connector-specific monitoring dashboards.
- Offline runner guidance inside operations playbook.
## Related resources
- ./resources/schemas
- ./resources/samples
## Backlog references
- NOTIFY-SVC-38..40 (Notify backlog) referenced in `docs/README.md`.
- DOCS-NOTIFY updates tracked in ../../TASKS.md when available.
## Epic alignment
- **Epic 11 Notifications Studio:** deliver notifications workspace, preview tooling, immutable delivery ledger, and tenant-scoped throttling/digest controls.
# StellaOps Notify
Notify (Notifications Studio) converts platform events into tenant-scoped alerts with deterministic delivery, offline parity, and a full audit trail. The service is split between the reusable tooling in `src/Notify/*` and the runtime host in `src/Notifier/*` (decision recorded 2025-11-02) so downstream systems can embed the rules engine without inheriting the Studio UI.
## Scope & responsibilities
- Apply tenant-scoped rules to events from Scanner, Scheduler, VEX Lens, Attestor, Task Runner, and Zastava.
- Render channel-specific payloads (Slack, Teams, Email, webhook) using deterministic templates with localisation safeguards.
- Enforce throttling, digests, and quiet-hour calendars so bursts stay explainable and recoverable.
- Persist deliveries, attempts, throttles, and DSSE hashes for CLI/UI investigation and compliance export.
## Current capabilities (Sprint 38 foundations)
- **Rules + channels API:** `StellaOps.Notify.WebService` exposes CRUD, previews, and health probes secured by Authority scopes.
- **Worker pipeline:** `StellaOps.Notify.Worker` ingests bus events, evaluates match predicates, applies per-tenant throttles, and dispatches deliveries.
- **Connector plug-ins:** Restart-time plug-ins under `StellaOps.Notify.Connectors.*` (Slack, Teams, Email, generic webhook) with health checks and retry policy hints declared in `notify-plugin.json`.
- **Template engine:** Deterministic rendering with safe helpers, locale bundles, and redaction defaults that keep Offline Kit parity.
- **Delivery ledger:** Mongo-backed ledger storing hashed payloads, attempts, throttled/digested markers, and provenance links for audit + exports.
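
To illustrate what storing "hashed payloads" in the delivery ledger could look like, here is a minimal sketch. It assumes SHA-256 over a canonical JSON encoding; the actual hashing scheme, field names, and the `ledger_entry` helper are hypothetical and not taken from the Notify codebase.

```python
import hashlib
import json


def payload_hash(payload: dict) -> str:
    """Hash a payload deterministically.

    Assumption: canonical form is JSON with sorted keys and no extra
    whitespace, so logically equal payloads produce the same digest.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ledger_entry(tenant_id: str, channel: str, payload: dict, attempt: int) -> dict:
    """Build a hypothetical ledger record that keeps the hash, not the payload."""
    return {
        "tenantId": tenant_id,
        "channel": channel,
        "attempt": attempt,
        "payloadSha256": payload_hash(payload),
    }


if __name__ == "__main__":
    entry = ledger_entry("tenant-a", "slack", {"text": "scan finished", "severity": "high"}, attempt=1)
    print(entry["payloadSha256"])
```

Hashing a canonical encoding rather than the raw bytes means key order in the rendered payload does not change the recorded digest, which is one way to keep audit comparisons deterministic.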
## In progress / upcoming (Sprint 39 focus)
- `NOTIFY-SVC-39-001` correlation engine with token-bucket throttles, incident lifecycle, and quiet-hours evaluator.
- `NOTIFY-SVC-39-002` digest generator with schedule runner, ledger queries, and distribution across existing channels.
- `NOTIFY-SVC-39-003` simulation API for rule dry-runs against historical events.
- `NOTIFY-SVC-39-004` quiet-hour calendar integration and default throttles with audit logging.
Status for these items is tracked in `src/Notifier/StellaOps.Notifier/TASKS.md` and sprint plans; update this README once tasks merge.
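
For readers unfamiliar with the token-bucket throttles named in `NOTIFY-SVC-39-001`, the general technique can be sketched as below. This is a generic illustration, not the planned implementation: `TokenBucket`, `TenantThrottle`, and their parameters are hypothetical, and the clock is passed in explicitly so the behaviour stays deterministic.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TokenBucket:
    capacity: float           # maximum burst size
    refill_per_second: float  # sustained rate
    tokens: float             # tokens currently available
    updated_at: float         # timestamp of the last refill, in seconds

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantThrottle:
    """Keeps one bucket per tenant so one tenant's burst cannot starve another."""

    def __init__(self, capacity: float, refill_per_second: float) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str, now: float) -> bool:
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            # New tenants start with a full bucket.
            bucket = TokenBucket(self._capacity, self._refill, self._capacity, now)
            self._buckets[tenant_id] = bucket
        return bucket.try_consume(now)


if __name__ == "__main__":
    throttle = TenantThrottle(capacity=2, refill_per_second=1.0)
    decisions = [throttle.allow("tenant-a", now=0.0) for _ in range(3)]
    print(decisions)  # third delivery in the same instant is throttled
```

Passing `now` as an argument instead of reading the system clock keeps the throttle decision reproducible, which fits the README's emphasis on deterministic delivery.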
## Key docs & release alignment
- [`docs/notifications/overview.md`](../../notifications/overview.md) — summary of capabilities, imposed rules, and customer journey.
- [`docs/notifications/architecture.md`](../../notifications/architecture.md) — Notifications Studio runtime view (published 2025-10-29).
- [`docs/notifications/rules.md`](../../notifications/rules.md) — declarative matcher syntax and evaluation order.
- [`docs/notifications/digests.md`](../../notifications/digests.md) — digest windows, coalescing logic, and delivery samples.
- [`docs/notifications/templates.md`](../../notifications/templates.md) — template helpers, localisation, and redaction guidelines.
- [`docs/updates/2025-10-29-notify-docs.md`](../../updates/2025-10-29-notify-docs.md) — latest release note; follow-ups remain to validate connector metadata, quiet-hours semantics, and simulation payloads once Sprint 39 drops land.
## Integrations & dependencies
- **Storage:** MongoDB (`rules`, `channels`, `deliveries`, `digests`, `throttles`) with change streams for worker snapshots.
- **Queues:** Redis Streams or NATS JetStream for ingestion, throttling, and DLQs (`notify.dlq`).
- **Authority:** OpTok-protected APIs, DPoP-backed CLI/UI scopes (`notify.viewer`, `notify.operator`, `notify.admin`), and secret references for channel credentials.
- **Observability:** Prometheus metrics (`notify.sent_total`, `notify.failed_total`, `notify.digest_coalesced_total`, etc.), OTEL traces, and dashboards documented in `docs/notifications/architecture.md#12-observability-prometheus--otel`.
## Operational notes
- Schema fixtures live in `./resources/schemas`; event and delivery samples live in `./resources/samples` for contract tests and UI mocks.
- Offline Kit bundles ship plug-ins, default templates, and seed rules; update manifests under `ops/offline-kit/` when connectors change.
- Dashboards and alert references depend on `DEVOPS-NOTIFY-39-002`; coordinate before renaming metrics or labels.
- When releasing new rule or connector features, mirror guidance into `docs/notifications/*.md` and checklists in `docs/updates/2025-10-29-notify-docs.md` until the follow-ups are closed.
## Epic alignment
- **Epic 11 Notifications Studio:** notifications workspace, preview tooling, immutable delivery ledger, throttling/digest controls, and forthcoming correlation/simulation features.

@@ -4,6 +4,7 @@
| ID | Status | Owner(s) | Description | Notes |
|----|--------|----------|-------------|-------|
| NOTIFY-DOCS-0001 | DOING (2025-10-29) | Docs Guild | Validate that ./README.md aligns with the latest release notes. | See ./AGENTS.md |
| NOTIFY-OPS-0001 | TODO | Ops Guild | Review runbooks/observability assets after next sprint demo. | Sync outcomes back to ../../TASKS.md |
| NOTIFY-ENG-0001 | TODO | Module Team | Cross-check implementation plan milestones against `/docs/implplan/SPRINT_*.md`. | Update status via ./AGENTS.md workflow |
| NOTIFY-DOCS-0001 | DONE (2025-11-05) | Docs Guild | Validate that ./README.md aligns with the latest release notes. | README refreshed to match 2025-10-29 release note and reference follow-ups. |
| NOTIFY-DOCS-0002 | TODO | Docs Guild | Document correlation engine, digest generator, simulation API, and quiet-hour calendars once NOTIFY-SVC-39-001..004 merge. | Blocked on NOTIFY-SVC-39-001..004 landing; update README + notifications/* docs. |
| NOTIFY-OPS-0001 | TODO | Ops Guild | Review runbooks/observability assets after next sprint demo. | Sync outcomes back to ../../TASKS.md |
| NOTIFY-ENG-0001 | TODO | Module Team | Cross-check implementation plan milestones against `/docs/implplan/SPRINT_*.md`. | Update status via ./AGENTS.md workflow |