381 lines
36 KiB
Markdown
381 lines
36 KiB
Markdown
# StellaOps Authority Service
|
||
|
||
> **Status:** Drafted 2025-10-12 (CORE5B.DOC / DOC1.AUTH) – aligns with Authority revocation store, JWKS rotation, and bootstrap endpoints delivered in Sprint 1.
|
||
|
||
## 1. Purpose
|
||
The **StellaOps Authority** service issues OAuth2/OIDC tokens for every StellaOps module (Concelier, Backend, Agent, Zastava) and exposes the policy controls required in sovereign/offline environments. Authority is built as a minimal ASP.NET host that:
|
||
|
||
- brokers password, client-credentials, and device-code flows through pluggable identity providers;
|
||
- persists access/refresh/device tokens in MongoDB with deterministic schemas for replay analysis and air-gapped audit copies;
|
||
- distributes revocation bundles and JWKS material so downstream services can enforce lockouts without direct database access;
|
||
- offers bootstrap APIs for first-run provisioning and key rotation without redeploying binaries.
|
||
|
||
Authority is deployed alongside Concelier in air-gapped environments and never requires outbound internet access. All trusted metadata (OpenIddict discovery, JWKS, revocation bundles) is cacheable, signed, and reproducible.
|
||
|
||
## 2. Component Architecture
|
||
Authority is composed of five cooperating subsystems:
|
||
|
||
1. **Minimal API host** – configures OpenIddict endpoints (`/token`, `/authorize`, `/revoke`, `/jwks`), publishes the OpenAPI contract at `/.well-known/openapi`, and enables structured logging/telemetry. Rate limiting hooks (`AuthorityRateLimiter`) wrap every request.
|
||
2. **Plugin host** – loads `StellaOps.Authority.Plugin.*.dll` assemblies, applies capability metadata, and exposes password/client provisioning surfaces through dependency injection.
|
||
3. **Mongo storage** – persists tokens, revocations, bootstrap invites, and plugin state in deterministic collections indexed for offline sync (`authority_tokens`, `authority_revocations`, etc.).
|
||
4. **Cryptography layer** – `StellaOps.Cryptography` abstractions manage password hashing, signing keys, JWKS export, and detached JWS generation.
|
||
5. **Offline ops APIs** – internal endpoints under `/internal/*` provide administrative flows (bootstrap users/clients, revocation export) guarded by API keys and deterministic audit events.
|
||
|
||
A high-level sequence for password logins:
|
||
|
||
```
|
||
Client -> /token (password grant)
|
||
-> Rate limiter & audit hooks
|
||
-> Plugin credential store (Argon2id verification)
|
||
-> Token persistence (Mongo authority_tokens)
|
||
-> Response (access/refresh tokens + deterministic claims)
|
||
```
|
||
|
||
## 3. Token Lifecycle & Persistence
|
||
Authority persists every issued token in MongoDB so operators can audit or revoke without scanning distributed caches.
|
||
|
||
- **Collection:** `authority_tokens`
|
||
- **Key fields:**
|
||
- `tokenId`, `type` (`access_token`, `refresh_token`, `device_code`, `authorization_code`)
|
||
- `subjectId`, `clientId`, ordered `scope` array
|
||
- `tenant` (lower-cased tenant hint from the issuing client, omitted for global clients)
|
||
|
||
### Console OIDC client
|
||
|
||
- **Client ID**: `console-web`
|
||
- **Grants**: `authorization_code` (PKCE required), `refresh_token`
|
||
- **Audience**: `console`
|
||
- **Scopes**: `openid`, `profile`, `email`, `advisory:read`, `vex:read`, `aoc:verify`, `findings:read`, `orch:read`, `vuln:read`
|
||
- **Redirect URIs** (defaults): `https://console.stella-ops.local/oidc/callback`
|
||
- **Post-logout redirect**: `https://console.stella-ops.local/`
|
||
- **Tokens**: Access tokens inherit the global 2 minute lifetime; refresh tokens remain short-lived (30 days) and can be exchanged silently via `/token`.
|
||
- **Roles**: Assign Authority role `Orch.Viewer` (exposed to tenants as `role/orch-viewer`) when operators need read-only access to Orchestrator telemetry via Console dashboards. Policy Studio ships dedicated roles (`role/policy-author`, `role/policy-reviewer`, `role/policy-approver`, `role/policy-operator`, `role/policy-auditor`) that align with the new `policy:*` scope family; issue them per tenant so audit trails remain scoped.
|
||
|
||
Configuration sample (`etc/authority.yaml.sample`) seeds the client with a confidential secret so Console can negotiate the code exchange on the backend while browsers execute the PKCE dance.
|
||
|
||
### Console Authority endpoints
|
||
|
||
- `/console/tenants` — Requires `authority:tenants.read`; returns the tenant catalogue for the authenticated principal. Requests lacking the `X-Stella-Tenant` header are rejected (`tenant_header_missing`) and logged.
|
||
- `/console/profile` — Requires `ui.read`; exposes subject metadata (roles, scopes, audiences) and indicates whether the session is within the five-minute fresh-auth window.
|
||
- `/console/token/introspect` — Requires `ui.read`; introspects the active access token so the SPA can prompt for re-authentication before privileged actions.
|
||
|
||
All endpoints demand DPoP-bound tokens and propagate structured audit events (`authority.console.*`). Gateways must forward the `X-Stella-Tenant` header derived from the access token; downstream services rely on the same value for isolation. Keep Console access tokens short-lived (default 15 minutes) and enforce the fresh-auth window for admin actions (`ui.admin`, `authority:*`, `policy:activate`, `exceptions:approve`).
|
||
- `status` (`valid`, `revoked`, `expired`), `createdAt`, optional `expiresAt`
|
||
- `revokedAt`, machine-readable `revokedReason`, optional `revokedReasonDescription`
|
||
- `revokedMetadata` (string dictionary for plugin-specific context)
|
||
- **Persistence flow:** `PersistTokensHandler` stamps missing JWT IDs, normalises scopes, and stores every principal emitted by OpenIddict.
|
||
- **Revocation flow:** `AuthorityTokenStore.UpdateStatusAsync` flips status, records the reason metadata, and is invoked by token revocation handlers and plugin provisioning events (e.g., disabling a user).
|
||
- **Expiry maintenance:** `AuthorityTokenStore.DeleteExpiredAsync` prunes non-revoked tokens past their `expiresAt` timestamp. Operators should schedule this in maintenance windows if large volumes of tokens are issued.
|
||
|
||
### Expectations for resource servers
|
||
Resource servers (Concelier WebService, Backend, Agent) **must not** assume in-memory caches are authoritative. They should:
|
||
|
||
- cache `/jwks` and `/revocations/export` responses within configured lifetimes;
|
||
- honour `revokedReason` metadata when shaping audit trails;
|
||
- treat `status != "valid"` or missing tokens as immediate denial conditions.
|
||
- propagate the `tenant` claim (`X-Stella-Tenant` header in REST calls) and reject requests when the tenant supplied by Authority does not match the resource server's scope; Concelier and Excititor guard endpoints refuse cross-tenant tokens.
|
||
|
||
### Tenant propagation
|
||
|
||
- Client provisioning (bootstrap or plug-in) accepts a `tenant` hint. Authority normalises the value (`trim().ToLowerInvariant()`) and persists it alongside the registration. Clients without an explicit tenant remain global.
|
||
- Issued principals include the `stellaops:tenant` claim. `PersistTokensHandler` mirrors this claim into `authority_tokens.tenant`, enabling per-tenant revocation and reporting.
|
||
- Rate limiter metadata now tags requests with `authority.tenant`, unlocking per-tenant throughput metrics and diagnostic filters. Audit events (`authority.client_credentials.grant`, `authority.password.grant`, bootstrap flows) surface the tenant and login attempt documents index on `{tenant, occurredAt}` for quick queries.
|
||
- Client credentials that request `advisory:ingest`, `advisory:read`, `vex:ingest`, `vex:read`, `signals:read`, `signals:write`, `signals:admin`, or `aoc:verify` now fail fast when the client registration lacks a tenant hint. Issued tokens are re-validated against persisted tenant metadata, and Authority rejects any cross-tenant replay (`invalid_client`/`invalid_token`), ensuring aggregation-only workloads remain tenant-scoped.
|
||
- Client credentials that request `export.viewer`, `export.operator`, or `export.admin` must provide a tenant hint. Requests for `export.admin` also need accompanying `export_reason` and `export_ticket` parameters; Authority returns `invalid_request` when either value is missing and records the denial in token audit events.
|
||
- Policy Studio scopes (`policy:author`, `policy:review`, `policy:approve`, `policy:operate`, `policy:audit`, `policy:simulate`, `policy:run`, `policy:activate`) require a tenant assignment; Authority rejects tokens missing the hint with `invalid_client` and records `scope.invalid` metadata for auditing.
|
||
- **AOC pairing guardrails** – Tokens that request `advisory:read`, `vex:read`, or any `signals:*` scope must also request `aoc:verify`. Authority rejects mismatches with `invalid_scope` (`Scope 'aoc:verify' is required when requesting advisory/vex read scopes.` or `Scope 'aoc:verify' is required when requesting signals scopes.`) so automation surfaces deterministic errors.
|
||
- **Signals ingestion guardrails** – Sensors and services requesting `signals:write`/`signals:admin` must also request `aoc:verify`; Authority records the `authority.aoc_scope_violation` tag when the pairing is missing so operators can trace failing sensors immediately.
|
||
- Password grant flows reuse the client registration's tenant and enforce the configured scope allow-list. Requested scopes outside that list (or mismatched tenants) trigger `invalid_scope`/`invalid_client` failures, ensuring cross-tenant access is denied before token issuance.
|
||
|
||
### Default service scopes
|
||
|
||
| Client ID | Purpose | Scopes granted | Sender constraint | Tenant |
|
||
|----------------------|---------------------------------------|--------------------------------------|-------------------|-----------------|
|
||
| `concelier-ingest` | Concelier raw advisory ingestion | `advisory:ingest`, `advisory:read` | `dpop` | `tenant-default` |
|
||
| `excitor-ingest` | Excititor raw VEX ingestion | `vex:ingest`, `vex:read` | `dpop` | `tenant-default` |
|
||
| `aoc-verifier` | Aggregation-only contract verification | `aoc:verify`, `advisory:read`, `vex:read` | `dpop` | `tenant-default` |
|
||
| `cartographer-service` | Graph snapshot construction | `graph:write`, `graph:read` | `dpop` | `tenant-default` |
|
||
| `graph-api` | Graph Explorer gateway/API | `graph:read`, `graph:export`, `graph:simulate` | `dpop` | `tenant-default` |
|
||
| `export-center-operator` | Export Center operator automation | `export.viewer`, `export.operator` | `dpop` | `tenant-default` |
|
||
| `export-center-admin` | Export Center administrative automation | `export.viewer`, `export.operator`, `export.admin` | `dpop` | `tenant-default` |
|
||
| `vuln-explorer-ui` | Vuln Explorer UI/API | `vuln:read` | `dpop` | `tenant-default` |
|
||
| `signals-uploader` | Reachability sensor ingestion | `signals:write`, `signals:read`, `aoc:verify` | `dpop` | `tenant-default` |
|
||
|
||
> **Secret hygiene (2025‑10‑27):** The repository includes a convenience `etc/authority.yaml` for compose/helm smoke tests. Every entry’s `secretFile` points to `etc/secrets/*.secret`, which ship with `*-change-me` placeholders—replace them with strong values (and wire them through your vault/secret manager) before issuing tokens in CI, staging, or production.
|
||
|
||
For factory provisioning, issue sensors the **SignalsUploader** role template (`signals:write`, `signals:read`, `aoc:verify`). Authority rejects ingestion tokens that omit `aoc:verify`, preserving aggregation-only contract guarantees for reachability signals.
|
||
|
||
These registrations are provided as examples in `etc/authority.yaml.sample`. Clone them per tenant (for example `concelier-tenant-a`, `concelier-tenant-b`) so tokens remain tenant-scoped by construction.
|
||
|
||
Graph Explorer introduces dedicated scopes: `graph:write` for Cartographer build jobs, `graph:read` for query/read operations, `graph:export` for long-running export downloads, and `graph:simulate` for what-if overlays. Assign only the scopes a client actually needs to preserve least privilege—UI-facing clients should typically request read/export access, while background services (Cartographer, Scheduler) require write privileges.
|
||
|
||
#### Least-privilege guidance for graph clients
|
||
|
||
- **Service identities** – The Cartographer worker should request `graph:write` and `graph:read` only; grant `graph:simulate` exclusively to pipeline automation that invokes Policy Engine overlays on demand. Keep `graph:export` scoped to API gateway components responsible for streaming GraphML/JSONL artifacts. Authority enforces this by rejecting `graph:write` tokens that lack `properties.serviceIdentity: cartographer`.
|
||
- **Tenant propagation** – Every client registration must pin a `tenant` hint. Authority normalises the value and stamps it into issued tokens (`stellaops:tenant`) so downstream services (Scheduler, Graph API, Console) can enforce tenant isolation without custom headers. Graph scopes (`graph:read`, `graph:write`, `graph:export`, `graph:simulate`) are denied if the tenant hint is missing.
|
||
- **SDK alignment** – Use the generated `StellaOpsScopes` constants in service code to request graph scopes. Hard-coded strings risk falling out of sync as additional graph capabilities are added.
|
||
- **DPOP for automation** – Maintain sender-constrained (`dpop`) flows for Cartographer and Scheduler to limit reuse of access tokens if a build host is compromised. For UI-facing tokens, pair `graph:read`/`graph:export` with short lifetimes and enforce refresh-token rotation at the gateway.
|
||
|
||
#### Export Center scope guardrails
|
||
|
||
- **Viewer vs operator** – `export.viewer` grants read-only access to export profiles, manifests, and bundles. Automation that schedules or reruns exports should request `export.operator` (and typically `export.viewer`). Tenant hints remain mandatory; Authority refuses tokens without them.
|
||
- **Administrative mutations** – Changes to retention policies, encryption key references, or schedule defaults require `export.admin`. When requesting tokens with this scope, clients must supply `export_reason` and `export_ticket` parameters; Authority persists the values for audit records and rejects missing metadata with `invalid_request`.
|
||
- **Operational hygiene** – Rotate `export.admin` credentials infrequently and run them through fresh-auth workflows where possible. Prefer distributing verification tooling with `export.viewer` tokens for day-to-day bundle validation.
|
||
|
||
#### Vuln Explorer permalinks
|
||
|
||
- **Scope** – `vuln:read` authorises Vuln Explorer to fetch advisory/linkset evidence and issue shareable links. Assign it only to front-end/API clients that must render vulnerability details.
|
||
- **Signed links** – `POST /permalinks/vuln` (requires `vuln:read`) accepts `{ "tenant": "tenant-a", "resourceKind": "vulnerability", "state": { ... }, "expiresInSeconds": 86400 }` and returns a JWT (`token`) plus `issuedAt`/`expiresAt`. The token embeds the tenant, requested state, and `vuln:read` scope and is signed with the same Authority signing keys published via `/jwks`.
|
||
- **Validation** – Resource servers verify the permalink using cached JWKS: check signature, ensure the tenant matches the current request context, honour the expiry, and enforce the contained `vuln:read` scope. The payload’s `resource.state` block is opaque JSON so UIs can round-trip filters/search terms without new schema changes.
|
||
|
||
## 4. Revocation Pipeline
|
||
Authority centralises revocation in `authority_revocations` with deterministic categories:
|
||
|
||
| Category | Meaning | Required fields |
|
||
| --- | --- | --- |
|
||
| `token` | Specific OAuth token revoked early. | `revocationId` (token id), `tokenType`, optional `clientId`, `subjectId` |
|
||
| `subject` | All tokens for a subject disabled. | `revocationId` (= subject id) |
|
||
| `client` | OAuth client registration revoked. | `revocationId` (= client id) |
|
||
| `key` | Signing/JWE key withdrawn. | `revocationId` (= key id) |
|
||
|
||
`RevocationBundleBuilder` flattens Mongo documents into canonical JSON, sorts entries by (`category`, `revocationId`, `revokedAt`), and signs exports using detached JWS (RFC 7797) with cosign-compatible headers.
|
||
|
||
**Export surfaces** (deterministic output, suitable for Offline Kit):
|
||
|
||
- CLI: `stella auth revoke export --output ./out` writes `revocation-bundle.json`, `.jws`, `.sha256`.
|
||
- Verification: `stella auth revoke verify --bundle <path> --signature <path> --key <path>` validates detached JWS signatures before distribution, selecting the crypto provider advertised in the detached header (see `docs/security/revocation-bundle.md`).
|
||
- API: `GET /internal/revocations/export` (requires bootstrap API key) returns the same payload.
|
||
- Verification: `stella auth revoke verify` validates schema, digest, and detached JWS using cached JWKS or offline keys, automatically preferring the hinted provider (libsodium builds honour `provider=libsodium`; other builds fall back to the managed provider).
|
||
|
||
**Consumer guidance:**
|
||
|
||
1. Mirror `revocation-bundle.json*` alongside Concelier exports. Offline agents fetch both over the existing update channel.
|
||
2. Use bundle `sequence` and `bundleId` to detect replay or monotonicity regressions. Ignore bundles with older sequence numbers unless `bundleId` changes and `issuedAt` advances.
|
||
3. Treat `revokedReason` taxonomy as machine-friendly codes (`compromised`, `rotation`, `policy`, `lifecycle`). Translating to human-readable logs is the consumer’s responsibility.
|
||
|
||
## 5. Signing Keys & JWKS Rotation
|
||
Authority signs revocation bundles and publishes JWKS entries via the new signing manager:
|
||
|
||
- **Configuration (`authority.yaml`):**
|
||
```yaml
|
||
signing:
|
||
enabled: true
|
||
algorithm: ES256 # Defaults to ES256
|
||
keySource: file # Loader identifier (file, vault, etc.)
|
||
provider: default # Optional preferred crypto provider
|
||
activeKeyId: authority-signing-dev
|
||
keyPath: "../certificates/authority-signing-dev.pem"
|
||
additionalKeys:
|
||
- keyId: authority-signing-dev-2024
|
||
path: "../certificates/authority-signing-dev-2024.pem"
|
||
source: "file"
|
||
```
|
||
- **Sources:** The default loader supports PEM files relative to the content root; additional loaders can be registered via `IAuthoritySigningKeySource`.
|
||
- **Providers:** Keys are registered against the `ICryptoProviderRegistry`, so alternative implementations (HSM, libsodium) can be plugged in without changing host code.
|
||
- **OpenAPI discovery:** `GET /.well-known/openapi` returns the published authentication contract (JSON by default, YAML when requested). Responses include `X-StellaOps-Service`, `X-StellaOps-Api-Version`, `X-StellaOps-Build-Version`, plus grant and scope headers, and honour conditional requests via `ETag`/`If-None-Match`.
|
||
- **JWKS output:** `GET /jwks` lists every signing key with `status` metadata (`active`, `retired`). Old keys remain until operators remove them from configuration, allowing verification of historical bundles/tokens.
|
||
|
||
### Rotation SOP (no downtime)
|
||
1. Generate a new P-256 private key (PEM) on an offline workstation and place it where the Authority host can read it (e.g., `../certificates/authority-signing-2025.pem`).
|
||
2. Call the authenticated admin API:
|
||
```bash
|
||
curl -sS -X POST https://authority.example.com/internal/signing/rotate \
|
||
-H "x-stellaops-bootstrap-key: ${BOOTSTRAP_KEY}" \
|
||
-H "Content-Type: application/json" \
|
||
-d '{
|
||
"keyId": "authority-signing-2025",
|
||
"location": "../certificates/authority-signing-2025.pem",
|
||
"source": "file"
|
||
}'
|
||
```
|
||
3. Verify the response reports the previous key as retired and fetch `/jwks` to confirm the new `kid` appears with `status: "active"`.
|
||
4. Persist the old key path in `signing.additionalKeys` (the rotation API updates in-memory options; rewrite the YAML to match so restarts remain consistent).
|
||
5. If you prefer automation, trigger the `.gitea/workflows/authority-key-rotation.yml` workflow with the new `keyId`/`keyPath`; it wraps `ops/authority/key-rotation.sh` and reads environment-specific secrets. The older key will be marked `retired` and appended to `signing.additionalKeys`.
|
||
6. Re-run `stella auth revoke export` so revocation bundles are signed with the new key. Downstream caches should refresh JWKS within their configured lifetime (`StellaOpsAuthorityOptions.Signing` + client cache tolerance).
|
||
|
||
The rotation API leverages the same cryptography abstractions as revocation signing; no restart is required and the previous key is marked `retired` but kept available for verification.
|
||
|
||
## 6. Bootstrap & Administrative Endpoints
|
||
Administrative APIs live under `/internal/*` and require the bootstrap API key plus rate-limiter compliance.
|
||
|
||
| Endpoint | Method | Description |
|
||
| --- | --- | --- |
|
||
| `/internal/users` | `POST` | Provision initial administrative accounts through the registered password-capable plug-in. Emits structured audit events. |
|
||
| `/internal/clients` | `POST` | Provision OAuth clients (client credentials / device code). |
|
||
| `/internal/revocations/export` | `GET` | Export revocation bundle + detached JWS + digest. |
|
||
| `/internal/signing/rotate` | `POST` | Promote a new signing key (see SOP above). Request body accepts `keyId`, `location`, optional `source`, `algorithm`, `provider`, and metadata. |
|
||
|
||
All administrative calls emit `AuthEventRecord` entries enriched with correlation IDs, PII tags, and network metadata for offline SOC ingestion.
|
||
|
||
> **Tenant hint:** include a `tenant` entry inside `properties` when bootstrapping clients. Authority normalises the value, stores it on the registration, and stamps future tokens/audit events with the tenant.
|
||
|
||
### Bootstrap client example
|
||
|
||
```jsonc
|
||
POST /internal/clients
|
||
{
|
||
"clientId": "concelier",
|
||
"confidential": true,
|
||
"displayName": "Concelier Backend",
|
||
"allowedGrantTypes": ["client_credentials"],
|
||
"allowedScopes": ["concelier.jobs.trigger", "advisory:ingest", "advisory:read"],
|
||
"properties": {
|
||
"tenant": "tenant-default"
|
||
}
|
||
}
|
||
```
|
||
|
||
For environments with multiple tenants, repeat the call per tenant-specific client (e.g. `concelier-tenant-a`, `concelier-tenant-b`) or append suffixes to the client identifier.
|
||
|
||
### Aggregation-only verification tokens
|
||
|
||
- Issue a dedicated client (e.g. `aoc-verifier`) with the scopes `aoc:verify`, `advisory:read`, and `vex:read` for each tenant that runs guard checks. Authority refuses to mint tokens for these scopes unless the client registration provides a tenant hint.
|
||
- The CLI (`stella aoc verify --tenant <tenant>`) and Console verification panel both call `/aoc/verify` on Concelier and Excititor. Tokens that omit the tenant claim or present a tenant that does not match the stored registration are rejected with `invalid_client`/`invalid_token`.
|
||
- Audit: `authority.client_credentials.grant` entries record `scope.invalid="aoc:verify"` when requests are rejected because the tenant hint is missing or mismatched.
|
||
|
||
### Exception approvals & routing
|
||
|
||
- New scopes `exceptions:read`, `exceptions:write`, and `exceptions:approve` govern access to the exception lifecycle. Map these via tenant roles (`exceptions-service`, `exceptions-approver`) as described in `/docs/security/authority-scopes.md`.
|
||
- Configure approval routing in `authority.yaml` with declarative templates. Each template exposes an `authorityRouteId` for downstream services (Policy Engine, Console) and an optional `requireMfa` flag:
|
||
|
||
```yaml
|
||
exceptions:
|
||
routingTemplates:
|
||
- id: "secops"
|
||
authorityRouteId: "approvals/secops"
|
||
requireMfa: true
|
||
description: "Security Operations approval chain"
|
||
- id: "governance"
|
||
authorityRouteId: "approvals/governance"
|
||
requireMfa: false
|
||
description: "Non-production waiver review"
|
||
```
|
||
|
||
- Clients requesting exception scopes must include a tenant assignment. Authority rejects client-credential flows that request `exceptions:*` with `invalid_client` and logs `scope.invalid="exceptions:write"` (or the requested scope) in `authority.client_credentials.grant` audit events when the tenant hint is missing.
|
||
- When any configured routing template sets `requireMfa: true`, user-facing tokens that contain `exceptions:approve` must be acquired through an MFA-capable identity provider. Password/OIDC flows that lack MFA support are rejected with `authority.password.grant` audit events where `reason="Exception approval scope requires an MFA-capable identity provider."`
|
||
- Update interactive clients (Console) to request `exceptions:read` by default and elevate to `exceptions:approve` only inside fresh-auth workflows for approvers. Documented examples live in `etc/authority.yaml.sample`.
|
||
- Verification responses map guard failures to `ERR_AOC_00x` codes and Authority emits `authority.client_credentials.grant` + `authority.token.validate_access` audit records containing the tenant and scopes so operators can trace who executed a run.
|
||
- For air-gapped or offline replicas, pre-issue verification tokens per tenant and rotate them alongside ingest credentials; the guard endpoints never mutate data and remain safe to expose through the offline kit schedule.
|
||
|
||
## 7. Configuration Reference
|
||
|
||
| Section | Key | Description | Notes |
|
||
| --- | --- | --- | --- |
|
||
| Root | `issuer` | Absolute HTTPS issuer advertised to clients. | Required. Loopback HTTP allowed only for development. |
|
||
| Tokens | `accessTokenLifetime`, `refreshTokenLifetime`, etc. | Lifetimes for each grant (access, refresh, device, authorization code, identity). | Enforced during issuance; persisted on each token document. |
|
||
| Storage | `storage.connectionString` | MongoDB connection string. | Required even for tests; offline kits ship snapshots for seeding. |
|
||
| Signing | `signing.enabled` | Enable JWKS/revocation signing. | Disable only for development. |
|
||
| Signing | `signing.algorithm` | Signing algorithm identifier. | Currently ES256; additional curves can be wired through crypto providers. |
|
||
| Signing | `signing.keySource` | Loader identifier (`file`, `vault`, custom). | Determines which `IAuthoritySigningKeySource` resolves keys. |
|
||
| Signing | `signing.keyPath` | Relative/absolute path understood by the loader. | Stored as-is; rotation request should keep it in sync with filesystem layout. |
|
||
| Signing | `signing.activeKeyId` | Active JWKS / revocation signing key id. | Exposed as `kid` in JWKS and bundles. |
|
||
| Signing | `signing.additionalKeys[].keyId` | Retired key identifier retained for verification. | Manager updates this automatically after rotation; keep YAML aligned. |
|
||
| Signing | `signing.additionalKeys[].source` | Loader identifier per retired key. | Defaults to `signing.keySource` if omitted. |
|
||
| Security | `security.rateLimiting` | Fixed-window limits for `/token`, `/authorize`, `/internal/*`. | See `docs/security/rate-limits.md` for tuning. |
|
||
| Bootstrap | `bootstrap.apiKey` | Shared secret required for `/internal/*`. | Only required when `bootstrap.enabled` is true. |
|
||
|
||
### 7.1 Sender-constrained clients (DPoP & mTLS)
|
||
|
||
Authority now understands two flavours of sender-constrained OAuth clients:
|
||
|
||
- **DPoP proof-of-possession** – clients sign a `DPoP` header for `/token` requests. Authority validates the JWK thumbprint, HTTP method/URI, and replay window, then stamps the resulting access token with `cnf.jkt` so downstream services can verify the same key is reused.
|
||
- Configure under `security.senderConstraints.dpop`. `allowedAlgorithms`, `proofLifetime`, and `replayWindow` are enforced at validation time.
|
||
- `security.senderConstraints.dpop.nonce.enabled` enables nonce challenges for high-value audiences (`requiredAudiences`, normalised to case-insensitive strings). When a nonce is required but missing or expired, `/token` replies with `WWW-Authenticate: DPoP error="use_dpop_nonce"` (and, when available, a fresh `DPoP-Nonce` header). Clients must retry with the issued nonce embedded in the proof.
|
||
- `security.senderConstraints.dpop.nonce.store` selects `memory` (default) or `redis`. When `redis` is configured, set `security.senderConstraints.dpop.nonce.redisConnectionString` so replicas share nonce issuance and high-value clients avoid replay gaps during failover.
|
||
- Example (enabling Redis-backed nonces; adjust audiences per deployment):
|
||
```yaml
|
||
security:
|
||
senderConstraints:
|
||
dpop:
|
||
enabled: true
|
||
proofLifetime: "00:02:00"
|
||
replayWindow: "00:05:00"
|
||
allowedAlgorithms: [ "ES256", "ES384" ]
|
||
nonce:
|
||
enabled: true
|
||
ttl: "00:10:00"
|
||
maxIssuancePerMinute: 120
|
||
store: "redis"
|
||
redisConnectionString: "redis://authority-redis:6379?ssl=false"
|
||
requiredAudiences:
|
||
- "signer"
|
||
- "attestor"
|
||
```
|
||
Operators can override any field via environment variables (e.g. `STELLAOPS_AUTHORITY__SECURITY__SENDERCONSTRAINTS__DPOP__NONCE__STORE=redis`).
|
||
- Declare client `audiences` in bootstrap manifests or plug-in provisioning metadata; Authority now defaults the token `aud` claim and `resource` indicator from this list, which is also used to trigger nonce enforcement for audiences such as `signer` and `attestor`.
|
||
- **Mutual TLS clients** – client registrations may declare an mTLS binding (`senderConstraint: mtls`). When enabled via `security.senderConstraints.mtls`, Authority validates the presented client certificate against stored bindings (`certificateBindings[]`), optional chain verification, and timing windows. Successful requests embed `cnf.x5t#S256` into the access token (and introspection output) so resource servers can enforce the certificate thumbprint.
|
||
- `security.senderConstraints.mtls.enforceForAudiences` forces mTLS whenever the requested `aud`/`resource` (or the client's configured audiences) intersect the configured allow-list (default includes `signer`). Clients configured for different sender constraints are rejected early so operator policy remains consistent.
|
||
- Certificate bindings now act as an allow-list: Authority verifies thumbprint, subject, issuer, serial number, and any declared SAN values against the presented certificate, with rotation grace windows applied to `notBefore/notAfter`. Operators can enforce subject regexes, SAN type allow-lists (`dns`, `uri`, `ip`), trusted certificate authorities, and rotation grace via `security.senderConstraints.mtls.*`.
|
||
|
||
Both modes persist additional metadata in `authority_tokens`: `senderConstraint` records the enforced policy, while `senderKeyThumbprint` stores the DPoP JWK thumbprint or mTLS certificate hash captured at issuance. Downstream services can rely on these fields (and the corresponding `cnf` claim) when auditing offline copies of the token store.
|
||
|
||
### 7.2 Policy Engine clients & scopes
|
||
|
||
Policy Engine v2 introduces dedicated scopes and a service identity that materialises effective findings. Configure Authority as follows when provisioning policy clients:
|
||
|
||
| Client | Scopes | Notes |
|
||
| --- | --- | --- |
|
||
| `policy-engine` (service) | `policy:run`, `findings:read`, `effective:write` | Must include `properties.serviceIdentity: policy-engine` and a tenant. Authority rejects `effective:write` tokens without the marker or tenant. |
|
||
| `policy-cli` / automation | `policy:read`, `policy:author`, `policy:review`, `policy:simulate`, `findings:read` *(optionally add `policy:approve` / `policy:operate` / `policy:activate` for promotion pipelines)* | Keep scopes minimal; reroll CLI/CI tokens issued before 2025‑10‑27 so they drop legacy scope names and adopt the new set. |
|
||
| UI/editor sessions | `policy:read`, `policy:author`, `policy:simulate` (+ reviewer/approver/operator scopes as appropriate) | Issue tenant-specific clients so audit and rate limits remain scoped. |
|
||
|
||
Sample YAML entry:
|
||
|
||
```yaml
|
||
- clientId: "policy-engine"
|
||
displayName: "Policy Engine Service"
|
||
grantTypes: [ "client_credentials" ]
|
||
audiences: [ "api://policy-engine" ]
|
||
scopes: [ "policy:run", "findings:read", "effective:write" ]
|
||
tenant: "tenant-default"
|
||
properties:
|
||
serviceIdentity: "policy-engine"
|
||
senderConstraint: "dpop"
|
||
auth:
|
||
type: "client_secret"
|
||
secretFile: "../secrets/policy-engine.secret"
|
||
```
|
||
|
||
Compliance checklist:
|
||
|
||
- [ ] `policy-engine` client includes `properties.serviceIdentity: policy-engine` and a tenant hint; logins missing either are rejected.
|
||
- [ ] Non-service clients omit `effective:write` and receive only the scopes required for their role (`policy:read`, `policy:author`, `policy:review`, `policy:approve`, `policy:operate`, `policy:simulate`, etc.).
|
||
- [ ] Legacy tokens using `policy:write`/`policy:submit`/`policy:edit` are rotated to the new scope set before Production change freeze (see release migration note below).
|
||
- [ ] Approval/activation workflows use identities distinct from authoring identities; tenants are provisioned per client to keep telemetry segregated.
|
||
- [ ] Operators document reviewer assignments and incident procedures alongside `/docs/security/policy-governance.md` and archive policy evidence bundles (`stella policy bundle export`) with each release.
|
||
|
||
### 7.3 Orchestrator roles & scopes
|
||
|
||
| Role / Client | Scopes | Notes |
|
||
| --- | --- | --- |
|
||
| `Orch.Viewer` role | `orch:read` | Read-only access to Orchestrator dashboards, queues, and telemetry. |
|
||
| `Orch.Operator` role | `orch:read`, `orch:operate` | Issue short-lived tokens for control actions (pause/resume, retry, sync). Token requests **must** include `operator_reason` (≤256 chars) and `operator_ticket` (≤128 chars); Authority rejects requests missing either value and records both in audit events. |
|
||
|
||
Token request example via client credentials:
|
||
|
||
```bash
|
||
curl -u orch-operator:s3cr3t! \
|
||
-d 'grant_type=client_credentials' \
|
||
-d 'scope=orch:operate' \
|
||
-d 'operator_reason=resume source after maintenance' \
|
||
-d 'operator_ticket=INC-2045' \
|
||
https://authority.example.com/token
|
||
```
|
||
|
||
Tokens lacking `operator_reason` or `operator_ticket` receive `invalid_request`; audit events (`authority.client_credentials.grant`) surface the supplied values under `request.reason` and `request.ticket` for downstream review.
|
||
CLI clients set these parameters via `Authority.OperatorReason` / `Authority.OperatorTicket` (environment variables `STELLAOPS_ORCH_REASON` and `STELLAOPS_ORCH_TICKET`).
|
||
|
||
## 8. Offline & Sovereign Operation
|
||
- **No outbound dependencies:** Authority only contacts MongoDB and local plugins. Discovery and JWKS are cached by clients with offline tolerances (`AllowOfflineCacheFallback`, `OfflineCacheTolerance`). Operators should mirror these responses for air-gapped use.
|
||
- **Structured logging:** Every revocation export, signing rotation, bootstrap action, and token issuance emits structured logs with `traceId`, `client_id`, `subjectId`, and `network.remoteIp` where applicable. Mirror logs to your SIEM to retain audit trails without central connectivity.
|
||
- **Determinism:** Sorting rules in token and revocation exports guarantee byte-for-byte identical artefacts given the same datastore state. Hashes and signatures remain stable across machines.
|
||
|
||
## 9. Operational Checklist
|
||
- [ ] Protect the bootstrap API key and disable bootstrap endpoints (`bootstrap.enabled: false`) once initial setup is complete.
|
||
- [ ] Schedule `stella auth revoke export` (or `/internal/revocations/export`) at the same cadence as Concelier exports so bundles remain in lockstep.
|
||
- [ ] Rotate signing keys before expiration; keep at least one retired key until all cached bundles/tokens signed with it have expired.
|
||
- [ ] Monitor `/health` and `/ready` plus rate-limiter metrics to detect plugin outages early.
|
||
- [ ] Ensure downstream services cache JWKS and revocation bundles within tolerances; stale caches risk accepting revoked tokens.
|
||
|
||
For plug-in specific requirements, refer to **[Authority Plug-in Developer Guide](dev/31_AUTHORITY_PLUGIN_DEVELOPER_GUIDE.md)**. For revocation bundle validation workflow, see **[Authority Revocation Bundle](security/revocation-bundle.md)**.
|