Add Authority Advisory AI and API Lifecycle Configuration

- Introduced AuthorityAdvisoryAiOptions and related classes for managing advisory AI configurations, including remote inference options and tenant-specific settings.
- Added AuthorityApiLifecycleOptions to control API lifecycle settings, including legacy OAuth endpoint configurations.
- Implemented validation and normalization methods for both advisory AI and API lifecycle options to ensure proper configuration.
- Created AuthorityNotificationsOptions and its related classes for managing notification settings, including ack tokens, webhooks, and escalation options.
- Developed IssuerDirectoryClient and related models for interacting with the issuer directory service, including caching mechanisms and HTTP client configurations.
- Added support for dependency injection through ServiceCollectionExtensions for the Issuer Directory Client.
- Updated project file to include necessary package references for the new Issuer Directory Client library.
2025-11-02 13:40:38 +02:00
parent 66cb6c4b8a
commit f98cea3bcf
516 changed files with 68157 additions and 24754 deletions


@@ -92,7 +92,7 @@ Documents are stored using the canonical JSON serializer (`NotifyCanonicalJsonSe
## 5. Deployment & configuration
- **Configuration sources.** YAML files feed typed options (`NotifyMongoOptions`, `NotifyWorkerOptions`, etc.). Environment variables can override connection strings and rate limits for production.
- **Authority integration.** Two OAuth clients (`notify-web`, `notify-web-dev`) with scopes `notify.read` and `notify.admin` are required. Authority enforcement can be disabled for air-gapped dev use by providing `developmentSigningKey`.
- **Authority integration.** Two OAuth clients (`notify-web`, `notify-web-dev`) with scopes `notify.viewer`, `notify.operator`, and (for dev/admin flows) `notify.admin` are required. Authority enforcement can be disabled for air-gapped dev use by providing `developmentSigningKey`.
- **Plug-in management.** `plugins.baseDirectory` and `orderedPlugins` guarantee deterministic loading. Offline Kits copy the plug-in tree verbatim; operations must keep the order aligned across environments.
- **Observability.** Workers expose structured logs (`ruleId`, `actionId`, `eventId`, `throttleKey`). Metrics include:
- `notify_rule_matches_total{tenant,eventKind}`


@@ -63,7 +63,7 @@ Digest state lives in Mongo (`digests` collection) and mirrors the schema descri
| Endpoint | Description | Notes |
|----------|-------------|-------|
| `POST /digests` | Issues administrative commands (e.g., force flush, reopen) for a specific action/window. | Request body specifies the command target; requires `notify.admin`. |
| `GET /digests/{actionKey}` | Returns the currently open window (if any) for the referenced action. | Supports operators/CLI inspecting pending digests; requires `notify.read`. |
| `GET /digests/{actionKey}` | Returns the currently open window (if any) for the referenced action. | Supports operators/CLI inspecting pending digests; requires `notify.viewer`. |
| `DELETE /digests/{actionKey}` | Drops the open window without notifying (emergency stop). | Emits an audit record; use sparingly. |
All routes honour the tenant header and reuse the standard Notify rate limits.


@@ -1,76 +1,77 @@
# Notifications Overview
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
Notifications Studio turns raw platform events into concise, tenant-scoped alerts that reach the right responders without overwhelming them. The service is sovereign/offline-first, follows the Aggregation-Only Contract (AOC), and produces deterministic outputs so the same configuration yields identical deliveries across environments.
---
## 1. Mission & value
- **Reduce noise.** Only materially new or high-impact changes reach chat, email, or webhooks thanks to rule filters, throttles, and digest windows.
- **Explainable results.** Every delivery is traceable back to a rule, action, and event payload stored in the delivery ledger; operators can audit what fired and why.
- **Safe by default.** Secrets remain in external stores, templates are sandboxed, quiet hours and throttles prevent storms, and idempotency guarantees protect downstream systems.
- **Offline-aligned.** All configuration, templates, and plug-ins ship with Offline Kits; no external SaaS is required to send notifications.
---
## 2. Core capabilities
| Capability | What it does | Key docs |
|------------|--------------|----------|
| Rules engine | Declarative matchers for event kinds, severities, namespaces, VEX context, KEV flags, and more. | [`notifications/rules.md`](rules.md) |
| Channel catalog | Slack, Teams, Email, Webhook connectors loaded via restart-time plug-ins; metadata stored without secrets. | [`notifications/architecture.md`](architecture.md) |
| Templates | Locale-aware, deterministic rendering via safe helpers; channel defaults plus tenant-specific overrides. | [`notifications/templates.md`](templates.md) |
| Digests | Coalesce bursts into periodic summaries with deterministic IDs and audit trails. | [`notifications/digests.md`](digests.md) |
| Delivery ledger | Tracks rendered payload hashes, attempts, throttles, and outcomes for every action. | [`modules/notify/architecture.md`](../modules/notify/architecture.md#7-data-model-mongo) |
| Ack tokens | DSSE-signed acknowledgement tokens with webhook allowlists and escalation guardrails enforced by Authority. | [`modules/notify/architecture.md`](../modules/notify/architecture.md#81-ack-tokens--escalation-workflows) |
---
## 3. How it fits into StellaOps
1. **Producers emit events.** Scanner, Scheduler, VEX Lens, Attestor, and Zastava publish canonical envelopes (`NotifyEvent`) onto the internal bus.
2. **Notify.Worker evaluates rules.** For each tenant, the worker applies match filters, VEX gates, throttles, and digest policies before rendering the action.
3. **Connectors deliver.** Channel plug-ins send the rendered payload to Slack/Teams/Email/Webhook targets and report back attempts and outcomes.
4. **Consumers investigate.** Operators pivot from message links into Console dashboards, SBOM views, or policy overlays with correlation IDs preserved.
The Notify WebService fronts worker state with REST APIs used by the UI and CLI. Tenants authenticate via StellaOps Authority scopes `notify.viewer`, `notify.operator`, and (for escalated actions) `notify.admin`. All operations require the tenant header (`X-StellaOps-Tenant`) to preserve sovereignty boundaries.
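As a minimal sketch of the request shape described above, the following Python assembles the URL and headers a client might send. The base URL, the bearer-token authorization style, and the helper name `build_notify_request` are assumptions for illustration; only the `X-StellaOps-Tenant` header and the `/api/v1/notify/deliveries` path come from this document.

```python
from urllib.parse import urljoin

TENANT_HEADER = "X-StellaOps-Tenant"  # header name taken from the text above


def build_notify_request(base_url: str, path: str, tenant: str, access_token: str):
    """Return (url, headers) for a tenant-scoped Notify call.

    Hypothetical helper: the token is assumed to be an OAuth bearer token
    issued by Authority with a suitable notify.* scope.
    """
    if not tenant:
        # Every operation requires the tenant header, so fail early.
        raise ValueError("tenant is required")
    headers = {
        "Authorization": f"Bearer {access_token}",
        TENANT_HEADER: tenant,
        "Accept": "application/json",
    }
    return urljoin(base_url, path), headers


# Example values are placeholders, not real endpoints or credentials.
url, headers = build_notify_request(
    "https://notify.example.internal/",
    "api/v1/notify/deliveries",
    "tenant-a",
    "<access-token>",
)
```

Rejecting an empty tenant on the client side is a design choice in this sketch; the service itself is what enforces the boundary.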
---
## 4. Operating model
| Area | Guidance |
|------|----------|
| **Tenancy** | Each rule, channel, template, and delivery belongs to exactly one tenant. Cross-tenant sharing is intentionally unsupported. |
| **Determinism** | Configuration persistence normalises strings and sorts collections. Template rendering produces identical `bodyHash` values when inputs match. |
| **Scaling** | Workers scale horizontally; per-tenant rule snapshots are cached and refreshed from Mongo change streams. Redis (or equivalent) guards throttles and locks. |
| **Offline** | Offline Kits include plug-ins, default templates, and seed rules. Operators can edit YAML/JSON manifests before air-gapped deployment. |
| **Security** | Channel secrets use indirection (`secretRef`), Authority-protected OAuth clients secure API access, and delivery payloads are redacted before storage where required. |
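The determinism guidance in the table above can be illustrated with a small sketch: normalise strings, sort unordered collections, serialise canonically, then hash. This is not the Notify serializer; the choice of SHA-256, NFC normalisation, whitespace trimming, and the function names are all assumptions made for the example.

```python
import hashlib
import json
import unicodedata


def normalise(value):
    """Recursively normalise strings and order unordered collections."""
    if isinstance(value, str):
        # Assumed normalisation: Unicode NFC plus trimmed whitespace.
        return unicodedata.normalize("NFC", value.strip())
    if isinstance(value, dict):
        return {key: normalise(item) for key, item in value.items()}
    if isinstance(value, (set, frozenset)):
        # Sets have no stable order, so sort them (assumes comparable items).
        return sorted(normalise(item) for item in value)
    if isinstance(value, (list, tuple)):
        # Ordered sequences keep their order.
        return [normalise(item) for item in value]
    return value


def body_hash(rendered: dict) -> str:
    """Hash a canonical JSON form so equal inputs give equal hashes."""
    canonical = json.dumps(
        normalise(rendered), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = body_hash({"title": "New finding ", "labels": {"kev", "critical"}})
second = body_hash({"labels": {"critical", "kev"}, "title": "New finding"})
```

Here `first` and `second` are equal because key order, set order, and trailing whitespace are removed before hashing.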
---
## 5. Getting started (first 30 minutes)
| Step | Goal | Reference |
|------|------|-----------|
| 1 | Deploy Notify WebService + Worker with Mongo and Redis | [`modules/notify/architecture.md`](../modules/notify/architecture.md#1-runtime-shape--projects) |
| 2 | Register OAuth clients/scopes in Authority | [`etc/authority.yaml.sample`](../../etc/authority.yaml.sample) |
| 3 | Install channel plug-ins and capture secret references | [`plugins/notify`](../../plugins) |
| 4 | Create a tenant rule and test preview | [`POST /channels/{id}/test`](../modules/notify/architecture.md#8-external-apis-webservice) |
| 5 | Inspect deliveries and digests | `/api/v1/notify/deliveries`, `/api/v1/notify/digests` |
---
## 6. Alignment with implementation work
| Backlog item | Impact on docs | Status |
|--------------|----------------|--------|
| `NOTIFY-SVC-38-001..004` | Foundational correlation, throttling, simulation hooks. | **In progress.** Align behaviour once services publish beta APIs. |
| `NOTIFY-SVC-39-001..004` | Adds correlation engine, digest generator, simulation API, quiet hours. | **Pending.** Revisit rule/digest sections when these tasks merge. |
Action: coordinate with the Notifications Service Guild when `NOTIFY-SVC-39-001..004` land to validate payload fields, quiet-hours semantics, and any new connector metadata that should be documented here and in the channel-specific guides.
---
> **Imposed rule reminder:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.


@@ -1,62 +1,62 @@
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
# Pack Approval Notification Integration — Requirements
## Overview
Task Runner now produces pack plans with explicit approval and policy-gate metadata. The Notifications service must ingest those events, persist their state, and fan out actionable alerts (approvals requested, policy holds, resumptions). This document captures the requirements for the first Notifications sprint dedicated to the Task Runner bridge.
Deliverables feed Sprint 37 tasks (`NOTIFY-SVC-37-00x`) and unblock Task Runner sprint 43 (`TASKRUN-43-001`).
## Functional Requirements
### 1. Approval Event Contract
- Define a canonical schema for **PackApprovalRequested** and **PackApprovalUpdated** events.
- Fields must include `runId`, `approvalId`, tenant context, plan hash, required grants, step identifiers, message template, and resume callback metadata.
- Provide an OpenAPI fragment and x-go/x-cs models for Task Runner and CLI compatibility.
- Document error/acknowledgement semantics (success, retryable failure, validation failure).
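To make the field list above concrete, here is a hedged sketch of what a `PackApprovalRequested` model could look like. The canonical schema is still to be defined, so the Python attribute names, the types, and the `validate` helper are assumptions; only the listed fields come from the requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackApprovalRequested:
    """Illustrative event shape; not the canonical schema."""

    run_id: str
    approval_id: str
    tenant: str
    plan_hash: str
    required_grants: tuple
    step_ids: tuple
    message_template: str
    resume_callback: dict  # callback metadata; its shape is not specified here

    def validate(self) -> list:
        """Return validation errors; an empty list means the event is acceptable."""
        errors = []
        for name in ("run_id", "approval_id", "tenant", "plan_hash"):
            if not getattr(self, name):
                errors.append(f"{name} is required")
        return errors


event = PackApprovalRequested(
    run_id="run-1",
    approval_id="appr-1",
    tenant="tenant-a",
    plan_hash="sha256:abc",
    required_grants=("packs.approve",),
    step_ids=("deploy",),
    message_template="approval-requested",
    resume_callback={"kind": "http"},
)
```

Returning a list of errors rather than raising maps naturally onto the "validation failure" outcome in the acknowledgement semantics.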
### 2. Ingestion & Persistence
- Expose a secure Notifications API endpoint (`POST /notifications/pack-approvals`) receiving Task Runner events.
- Validate the caller's scopes (`packs.approve`, `Notifier.Events:Write`) and confirm that the tenant matches.
- Persist approval state transitions in Mongo (`notifications.pack_approvals`) with indexes on run/approval/tenant.
- Store outbound notification audit records with correlation IDs to support Task Runner resume flow.
### 3. Notification Routing
- Derive recipients from new rule predicates (`event.kind == "pack.approval"`).
- Render approval templates (email + webhook JSON) including plan metadata and approval links (resume token).
- Emit policy gate notifications as “hold” incidents with context (parameters, messages).
- Support localization fallback and redaction of secrets (never ship approval tokens unencrypted).
### 4. Resume & Ack Handshake
- Provide an approval ack endpoint (`POST /notifications/pack-approvals/{runId}/{approvalId}/ack`) that records decision metadata and forwards to Task Runner resume hook (HTTP callback + message bus placeholder).
- Return structured responses with resume token / status for CLI integration.
- Ensure idempotent updates (dedupe by runId + approvalId + decisionHash).
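The dedupe rule above (runId + approvalId + decisionHash) can be sketched as follows. The in-memory dictionary stands in for the Mongo collection, and the way `decision_hash` is computed (SHA-256 over sorted-key JSON) is an assumption, since the document does not define it.

```python
import hashlib
import json


def decision_hash(decision: dict) -> str:
    """Assumed definition: SHA-256 over a sorted-key JSON encoding."""
    canonical = json.dumps(decision, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AckStore:
    """Hypothetical in-memory stand-in for the persisted approval state."""

    def __init__(self):
        self._seen = {}

    def record(self, run_id: str, approval_id: str, decision: dict) -> bool:
        """Store the decision once; return False when it is a duplicate."""
        key = (run_id, approval_id, decision_hash(decision))
        if key in self._seen:
            return False  # duplicate ack: caller should not resume twice
        self._seen[key] = decision
        return True


store = AckStore()
first = store.record("run-1", "appr-1", {"decision": "approved", "by": "alice"})
replay = store.record("run-1", "appr-1", {"by": "alice", "decision": "approved"})
changed = store.record("run-1", "appr-1", {"decision": "rejected", "by": "alice"})
```

In this sketch `first` and `changed` are `True` while `replay` is `False`, because the replayed payload differs only in key order. A real store would likely rely on a unique index over the same three values.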
### 5. Observability & Security
- Emit metrics for approval notifications queued/sent, outstanding approvals, and acknowledgement latency.
- Log audit trail events (`pack.approval.requested`, `pack.approval.acknowledged`, `pack.policy.hold`).
- Enforce HMAC or mTLS for Task Runner -> Notifier ingestion; support configurable IP allowlist.
- Provide chaos-test plan for notification failure modes (channel outage, storage failure).
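For the HMAC option in the list above, a minimal verification sketch might look like this. The use of HMAC-SHA256, the hex encoding, and the idea that the signature covers the raw request body are assumptions; the requirement does not fix an algorithm or a header name.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


secret = b"shared-secret-for-illustration"
body = b'{"runId":"run-1","approvalId":"appr-1"}'
signature = sign(secret, body)

accepted = verify(secret, body, signature)
rejected = verify(secret, body + b" ", signature)
```

`hmac.compare_digest` is used instead of `==` so that comparison time does not leak how many leading characters matched.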
## Non-Functional Requirements
- Deterministic processing: identical approval events lead to identical outbound notifications (idempotent).
- Timeouts: the ingestion endpoint must respond in under 500 ms under nominal load.
- Retry strategy: Task Runner expects 5xx/429 for transient errors; document backoff guidance.
- Data retention: approval records are retained for 90 days; the purge job is tracked under the ops runbook.
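The backoff guidance for the retry strategy above is still to be documented; as one possible shape, the sketch below treats 429 and 5xx as retryable and applies capped exponential backoff with full jitter. The base delay, cap, attempt count, and function names are placeholder assumptions, not agreed values.

```python
import random


def is_retryable(status: int) -> bool:
    """429 and 5xx responses signal transient errors worth retrying."""
    return status == 429 or 500 <= status <= 599


def backoff_delays(max_attempts: int, base: float = 0.5, cap: float = 30.0, rng=None):
    """Yield one delay (in seconds) per retry attempt, using full jitter."""
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick uniformly between zero and the current ceiling.
        yield rng.uniform(0, ceiling)


# A seeded generator keeps the example reproducible.
delays = list(backoff_delays(5, rng=random.Random(0)))
```

Jitter spreads retries from many Task Runner instances over time, which helps avoid synchronized retry bursts after an outage.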
## Sprint 37 Task Mapping
| Task ID | Scope |
| --- | --- |
| **NOTIFY-SVC-37-001** | Author this contract doc, OpenAPI fragment, and schema references. Coordinate with Task Runner/Authority guilds. |
| **NOTIFY-SVC-37-002** | Implement secure ingestion endpoint, Mongo persistence, and audit hooks. Provide integration tests with sample events. |
| **NOTIFY-SVC-37-003** | Build approval/policy notification templates, routing rules, and channel dispatch (email + webhook). |
| **NOTIFY-SVC-37-004** | Ship acknowledgement endpoint + Task Runner callback client, resume token handling, and metrics/dashboards. |
## Open Questions
1. Who owns the approval resume callback (Task Runner Worker vs Orchestrator)? Resolve before NOTIFY-SVC-37-004.
2. Should approvals generate incidents in the existing incident schema or in a dedicated collection? The decision impacts the Mongo design.
3. Should the Authority scopes for approval ingestion/ack reuse `packs.approve`, or should a new `packs.approve:notify` scope be introduced? Coordinate with the Authority team.